Re: [modwsgi] Webfaction mod_wsgi + Django

Graham Dumpleton Mon, 16 Feb 2015 02:05:03 -0800

What version of Django are you running? Older Django versions have a bug in 
them which can cause the Python garbage collector to block and no longer run. 
Memory usage will go up because Python objects are not reclaimed properly. Not 
out of the question that other third party libraries could cause this also and 
can see a correlation between it and the fact that you are hitting queue time 
where request threads are blocking on a thread mutex.


The MaxRequestWorkers warning is because you ran out of capacity in the daemon 
processes due to all your long running requests and/or hung requests. The fact 
you are hitting the request timeout means that the daemon process likely has 
stopped taking more requests and eventually as new requests back up, the Apache 
child worker process eventually complain as they also run out of capacity in 
their capacity of proxying requests.

So it is an outcome of the problems you are having. You still need to work out 
the underlying problem.

Graham

On 16/02/2015, at 8:53 PM, Paul Royik <[email protected]> wrote:

> I also got following error.
> server reached MaxRequestWorkers setting, consider raising the 
> MaxRequestWorkers setting
> 
> On Sunday, February 15, 2015 at 10:42:19 PM UTC+2, Paul Royik wrote:
> As I discovered, memory grows on every request, not only calculator, even in 
> admin. Situation is very close to this: 
> http://stackoverflow.com/questions/2293333/django-memory-usage-going-up-with-every-request
> I hit 3 GB. It is the first time.
> 
> On Sunday, February 15, 2015 at 1:59:06 PM UTC+2, Paul Royik wrote:
> It grows. Below is list between subsequent requests.
> 78988
> 85503
> 92873
> 100237
> 
> On Sunday, February 15, 2015 at 1:36:15 PM UTC+2, Graham Dumpleton wrote:
> It being empty is fine.
> 
> At least is not caused by uncollectable objects.
> 
> Next would be to trying printing out periodically:
> 
>     len(gc.get_objects())
> 
> and see if it grows over time.
> 
> This is not conclusive either as meaning anything, but if it does keep 
> growing, still useful to know.
> 
> Graham
> 
> On 15/02/2015, at 10:32 PM, Paul Royik <[email protected]> wrote:
> 
> Don't know why but gc.garbage is empty list.
> 
> On Sunday, February 15, 2015 at 1:04:15 PM UTC+2, Graham Dumpleton wrote:
> If the objects cannot be reclaimed due to Python objects cycles where objects 
> have __del__() methods, they simply are not able to be reclaimed.
> 
> You can try:
> 
>     import gc
>     gc.collect()
> 
> but will likely have little effect.
> 
> I would suggest you also read:
> 
> http://christian.hofstaedtler.name/blog/2013/01/gc-garbage.html
> 
> Graham
> 
> On 15/02/2015, at 10:00 PM, Paul Royik <[email protected]> wrote:
> 
> Can I force cleaning memory after algorithm has finished?
> 
> On Sunday, February 15, 2015 at 12:53:30 PM UTC+2, Graham Dumpleton wrote:
> 
> On 15/02/2015, at 9:41 PM, Paul Royik <[email protected]> wrote:
> 
> I reduced possibility by restricting very long and complex expressions.
> 
> Code, that I use is simple.
> 
> class TimeLimit(object):
>     def __init__(self, timeout=60):
>         self.timeout = timeout
>         self.end = None
> 
>     def check_timeout(self):
>         if self.end and time.time() > self.end:
>             raise TimeoutException
>         else:
>             self.start()
> 
>     def start(self):
>         if not self.end:
>             self.end = time.time() + self.timeout
> 
> class MainClass(object):
>   def __init__(self):
>        self.timer = TimeLimit()
>         algo(timer)
> 
> 
> def algo(timer):
>     do_something()
>     external()
>     timer.check_timeout()
>     algo(timer)
> 
> Which ultimately is no different to what was being done with the decorator.
> 
> If external() can run a lot longer than the timeout, either way is not going 
> to be much use. It is only going to be of much use if external() is always a 
> short call and on subsequent calls you are simply calling it with different 
> arguments.
> 
> I debugged code thoroughly and it works as expected. And as shown memory is 
> not drastically increased during execution, but later. Is that possible that 
> reasong in apache conf? Or I need to look for problem in code? I have no clue 
> where to search as everything works. Is it possible to see memory usage 
> somehow on localhost (I'm under Windows)?
> 
> It is unlikely to be the Apache configuration. It is more likely that after 
> finishing the execution of the algorithm, the algorithm isn't clearing out 
> any working data structures it creates when it is doing calculations and they 
> are persisting in memory. Subsequent calls simply add more memory each time 
> and so it grows in that way.
> 
> There are various reasons that Python objects cannot be reclaimed and will 
> keep using memory.
> 
> There aren't really great tools for tracking that sort of stuff down.
> 
> Graham
> 
> On Sunday, February 15, 2015 at 12:31:36 PM UTC+2, Graham Dumpleton wrote:
> 
> On 15/02/2015, at 9:26 PM, Paul Royik <[email protected]> wrote:
> 
> Yes. External library could run long (although I reduced this possibility).
> 
> How have you reduced possibility?
> 
> So, what should I do with locking threads? Is it resolvable?
> 
> No idea. It isn't my code. I can only highlight what looks suspicious in the 
> log output.
> 
> I'm just now playing with code on stackoverflow. People suggested to create 
> class instead of decorator and pass its instance in all functions. I did as 
> they proposed and it worked.
> 
> Since you don't provide the code for that, I can't comment.
> 
> However, with the flow of time I see increase in memory.
> So, now nobody access site.
> 
> I ran command ps -u simamura -o pid,rss,command | awk '{print $0}{sum+=$2} 
> END {print "Total", sum/1024, "MB"}' and see 200 MB.
> Then I invoke integral calculator, while it works above command shows 200 MB.
> Calculator finished, I check memory - still 200 MB. In 10-15 minutes 
> (remember nobody access the site) I see 240 MB.
> 
> In result, memory increased from 130 MB (memory immediately after server 
> restart) to 435 MB in 10 hours.
> 
> How can I resolve this issue? I feel that in another 12 hours webfaction will 
> kill the process, because I memomy will be over limit.
> 
> Have you tried adding print() debug statements to your code to track through 
> what is being called and to determine whether your code is doing what is 
> expected?
> 
> Graham
> 
> On Sunday, February 15, 2015 at 12:12:39 PM UTC+2, Graham Dumpleton wrote:
> 
> On 15/02/2015, at 9:03 PM, Paul Royik <[email protected]> wrote:
> 
> How do you know that it runs too long?
> 
> In explained the log output previously and what was happening.
> 
> If you go back far enough in the logs even before what you provided you will 
> see some messages like:
> 
> [Fri Feb 06 12:39:06.254056 2015] [wsgi:info] [pid 9731:tid 140178879313664] 
> mod_wsgi (pid=9731): Daemon process request time limit exceeded, stopping 
> process 'localhost:20241'.
> [Fri Feb 06 12:39:06.254108 2015] [wsgi:info] [pid 9731:tid 140179136616192] 
> mod_wsgi (pid=9731): Shutdown requested 'localhost:20241'.
> [Fri Feb 06 12:39:06.286339 2015] [wsgi:info] [pid 9731:tid 140179136616192] 
> mod_wsgi (pid=9731): Dumping stack trace for active Python threads.
> [Fri Feb 06 12:39:06.286359 2015] [wsgi:info] [pid 9731:tid 140179136616192] 
> mod_wsgi (pid=9731): Thread 140178316318464 executing file 
> "/usr/local/lib/python2.7/threading.py", line 147, in acquire
> [Fri Feb 06 12:39:06.286363 2015] [wsgi:info] [pid 9731:tid 140179136616192] 
> mod_wsgi (pid=9731): called from file 
> "/home/simamura/lib/python2.7/sympy/core/compatibility.py", line 851, in 
> wrapper,
> [Fri Feb 06 12:39:06.286366 2015] [wsgi:info] [pid 9731:tid 140179136616192] 
> mod_wsgi (pid=9731): called from file 
> "/home/simamura/lib/python2.7/sympy/core/cache.py", line 89, in wrapper,
> [Fri Feb 06 12:39:06.286369 2015] [wsgi:info] [pid 9731:tid 140179136616192] 
> mod_wsgi (pid=9731): called from file 
> "/home/simamura/lib/python2.7/sympy/core/operations.py", line 127, in 
> _matches_commutative,
> [Fri Feb 06 12:39:06.286372 2015] [wsgi:info] [pid 9731:tid 140179136616192] 
> mod_wsgi (pid=9731): called from file 
> "/home/simamura/lib/python2.7/sympy/core/mul.py", line 808, in matches,
> [Fri Feb 06 12:39:06.286375 2015] [wsgi:info] [pid 9731:tid 140179136616192] 
> mod_wsgi (pid=9731): called from file 
> "/home/simamura/lib/python2.7/sympy/core/add.py", line 358, in 
> _matches_simple,
> [Fri Feb 06 12:39:06.286378 2015] [wsgi:info] [pid 9731:tid 140179136616192] 
> mod_wsgi (pid=9731): called from file 
> "/home/simamura/lib/python2.7/sympy/core/operations.py", line 127, in 
> _matches_commutative,
> 
> So it says that request time limit was exceeded.
> 
> This is followed by stack dumps for what all the request threads were doing.
> 
> In the part you did show, they both are stuck in:
> 
> [Sat Feb 14 19:58:00.261118 2015] [wsgi:info] [pid 12528:tid 139862016030464] 
> mod_wsgi (pid=12528): Thread 139861715523328 executing file 
> "/usr/local/lib/python2.7/threading.py", line 147, in acquire
> [Sat Feb 14 19:58:00.261120 2015] [wsgi:info] [pid 12528:tid 139862016030464] 
> mod_wsgi (pid=12528): called from file 
> "/home/simamura/lib/python2.7/sympy/core/compatibility.py", line 851, in 
> wrapper,
> [Sat Feb 14 19:58:00.261122 2015] [wsgi:info] [pid 12528:tid 139862016030464] 
> mod_wsgi (pid=12528): called from file 
> "/home/simamura/lib/python2.7/sympy/core/cache.py", line 89, in wrapper,
> [Sat Feb 14 19:58:00.261125 2015] [wsgi:info] [pid 12528:tid 139862016030464] 
> mod_wsgi (pid=12528): called from file 
> "/home/simamura/lib/python2.7/sympy/core/mul.py", line 373, in _gather,
> [Sat Feb 14 19:58:00.261127 2015] [wsgi:info] [pid 12528:tid 139862016030464] 
> mod_wsgi (pid=12528): called from file 
> "/home/simamura/lib/python2.7/sympy/core/mul.py", line 98, in flatten,
> 
> That is, they are both waiting on a thread lock.
> 
> This suggests to me that the code isn't possibly dealing with locking 
> properly, or a lock is held for a very long time while in some certain 
> section and blocking other threads from running and so why they are delayed 
> and take so long.
> 
> No more than 300 seconds as I wrote in conf file (socket-timeout and 
> request-timeout).
> 
> I can not control it thoroughly.
> def algo():
>   do_something()
>   invoke_external_library() # this could run slighlty longer, so timeout is 
> not precise
>   check_timeout()
>   algo()
> 
> So the thing that takes the unpredictable amount of time and you need to 
> interrupt is not even your code?
> 
> Graham
> 
> On Sunday, February 15, 2015 at 11:36:05 AM UTC+2, Graham Dumpleton wrote:
> So we are back the fact that your requests are still running too long and 
> however you have set up the timeout check isn't working.
> 
> Have you added print() debug statements in your code to validate that the die 
> on timeout check is even being run regularly?
> 
> To be able to distinguish each request thread, you can include the output of 
> threading.currentThread() on the print statements.
> 
> Graham
> 
> On 15/02/2015, at 8:19 PM, Paul Royik <[email protected]> wrote:
> 
> [Sat Feb 14 19:58:00.261309 2015] [wsgi:info] [pid 12528:tid 139862016030464] 
> mod_wsgi (pid=12528): called from file 
> "/home/simamura/lib/python2.7/mod_wsgi/server/__init__.py", line 1137, in 
> handle_request.
> [Sat Feb 14 19:58:05.245850 2015] [wsgi:info] [pid 12528:tid 139861541500672] 
> mod_wsgi (pid=12528): Aborting process 'localhost:20241'.
> [Sat Feb 14 19:58:05.245889 2015] [wsgi:info] [pid 12528:tid 139861541500672] 
> mod_wsgi (pid=12528): Exiting process 'localhost:20241'.
> [Sun Feb 15 01:58:05.352861 2015] [wsgi:error] [pid 11272:tid 
> 139862014424832] [client 127.0.0.1:48193] Truncated or oversized response 
> headers received from daemon process 'localhost:20241': 
> /home/simamura/webapps/django_math/express/htdocs/calculators, referer: 
> http://www.emathhelp.net/calculators/calculus-2/integral-calculator/
> [Sun Feb 15 01:58:05.352890 2015] [wsgi:error] [pid 11272:tid 
> 139862015223552] [client 127.0.0.1:48563] Truncated or oversized response 
> headers received from daemon process 'localhost:20241': 
> /home/simamura/webapps/django_math/express/htdocs/calculators, referer: 
> http://www.emathhelp.net/calculators/calculus-2/integral-calculator/
> [Sun Feb 15 01:58:05.352958 2015] [wsgi:error] [pid 11272:tid 
> 139861948524288] [client 127.0.0.1:48240] Truncated or oversized response 
> headers received from daemon process 'localhost:20241': 
> /home/simamura/webapps/django_math/express/htdocs/calculators, referer: 
> http://www.emathhelp.net/calculators/calculus-2/integral-calculator/
> [Sun Feb 15 01:58:05.352992 2015] [wsgi:error] [pid 11272:tid 
> 139862015489792] [client 127.0.0.1:48440] Truncated or oversized response 
> headers received from daemon process 'localhost:20241': 
> /home/simamura/webapps/django_math/express/htdocs/calculators, referer: 
> http://www.emathhelp.net/calculators/calculus-2/integral-calculator/
> [Sun Feb 15 01:58:05.353008 2015] [wsgi:error] [pid 11666:tid 
> 139862014957312] [client 127.0.0.1:48329] Truncated or oversized response 
> headers received from daemon process 'localhost:20241': 
> /home/simamura/webapps/django_math/express/htdocs/calculators, referer: 
> http://www.emathhelp.net/calculators/calculus-2/integral-calculator/
> [Sun Feb 15 01:58:05.353015 2015] [wsgi:error] [pid 11272:tid 
> 139862014691072] [client 127.0.0.1:48620] Truncated or oversized response 
> headers received from daemon process 'localhost:20241': 
> /home/simamura/webapps/django_math/express/htdocs/calculators, referer: 
> http://www.emathhelp.net/calculators/calculus-2/integral-calculator/
> [Sun Feb 15 01:58:05.353081 2015] [wsgi:error] [pid 11272:tid 
> 139861949323008] [client 127.0.0.1:48778] Truncated or oversized response 
> headers received from daemon process 'localhost:20241': 
> /home/simamura/webapps/django_math/express/htdocs/calculators, referer: 
> http://www.emathhelp.net/calculators/calculus-2/integral-calculator/
> [Sun Feb 15 01:58:05.353081 2015] [wsgi:error] [pid 11666:tid 
> 139861948790528] [client 127.0.0.1:48735] Truncated or oversized response 
> headers received from daemon process 'localhost:20241': 
> /home/simamura/webapps/django_math/express/htdocs/calculators, referer: 
> http://www.emathhelp.net/calculators/calculus-2/integral-calculator/
> [Sun Feb 15 01:58:05.353096 2015] [wsgi:error] [pid 11666:tid 
> 139862015223552] [client 127.0.0.1:48792] Truncated or oversized response 
> headers received from daemon process 'localhost:20241': 
> /home/simamura/webapps/django_math/express/htdocs/calculators, referer: 
> http://www.emathhelp.net/calculators/calculus-2/integral-calculator/
> [Sun Feb 15 01:58:05.972993 2015] [wsgi:info] [pid 24792:tid 139862016030464] 
> mod_wsgi (pid=12528): Process 'localhost:20241' has died, deregister and 
> restart it.
> [Sun Feb 15 01:58:06.081051 2015] [wsgi:info] [pid 24792:tid 139862016030464] 
> mod_wsgi (pid=12528): Process 'localhost:20241' has been deregistered and 
> will no longer be monitored.
> [Sun Feb 15 01:58:06.081333 2015] [wsgi:info] [pid 23028:tid 139862016030464] 
> mod_wsgi (pid=23028): Starting process 'localhost:20241' with threads=10.
> [Sun Feb 15 01:58:06.163910 2015] [wsgi:info] [pid 23028:tid 139862016030464] 
> mod_wsgi (pid=23028): Python home /usr/local.
> [Sun Feb 15 01:58:06.163937 2015] [wsgi:info] [pid 23028:tid 139862016030464] 
> mod_wsgi (pid=23028): Initializing Python.
> [Sun Feb 15 01:58:06.534740 2015] [wsgi:info] [pid 23028:tid 139862016030464] 
> mod_wsgi (pid=23028): Attach interpreter ''.
> [Sun Feb 15 01:58:06.549849 2015] [wsgi:info] [pid 23028:tid 139862016030464] 
> mod_wsgi (pid=23028): Imported 'mod_wsgi'.
> [Sun Feb 15 01:58:06.549942 2015] [wsgi:info] [pid 23028:tid 139862016030464] 
> mod_wsgi (pid=23028, process='localhost:20241', application=''): Loading WSGI 
> script '/home/simamura/webapps/django_math/express/handler.wsgi'.
> [Sun Feb 15 02:06:36.476939 2015] [core:info] [pid 11272:tid 139861948524288] 
> [client 127.0.0.1:56156] AH00128: File does not exist: 
> /home/simamura/webapps/math_root/browserconfig.xml
> 
> 
> And that's all.
> 
> On Sunday, February 15, 2015 at 10:11:25 AM UTC+2, Graham Dumpleton wrote:
> Provide me with the logging from a period before the restart until a period 
> after.
> 
> The log can be very informative if you know what to look for.
> 
> Graham
> 
> On 15/02/2015, at 6:43 PM, Paul Royik <[email protected]> wrote:
> 
> I actually have custom error 500 page. This means, that when Django throws 
> error, custom page should be shown.
> But it is not the case. Standard white-background apache page is shown.
> So, I think that problem is not in Django exception. Otherwise Django would 
> show custom page, which is not the case.
> In logs I saw only that server was restarted.
> Can I somehow tune logging, so that real error is written?
> 
> On Sunday, February 15, 2015 at 2:44:50 AM UTC+2, Graham Dumpleton wrote:
> 
> On 15/02/2015, at 10:53 AM, Paul Royik <[email protected]> wrote: 
> 
> > I already tried this approach. 
> > 
> > It seems good. 
> > But when I tested it on server, by simultaneously executing page on two 
> > different computers, it gave me 500 error. 
> > 
> > How this can be explained? Again something with apache? Logs didn't show 
> > anything. But I noticed again serious memory usage. And this happens only 
> > when I use code for limiting time. 
> 
> 
> Some tips for you about ensuring you get the best help. 
> 
> Stop discarding the message content for the message you are replying to. It 
> can be frustrating to have to keep going back to old messages to see what was 
> said originally and what you may be following up to, especially when on a 
> phone. When asking questions on mailing list, let the person you are asking 
> questions of decide what can be discarded from the message chain as they will 
> know better what should be kept in the message to provide easy access to 
> important information for context. 
> 
> When you reply to say you are have a new error, provide the actual code you 
> were using exactly at the time you had the error. Saves the person who is 
> helping you having to go back and ask for it if it is unclear what code you 
> were using. 
> 
> Your messages here and on StackOverflow show that you play around with the 
> example code I am giving you and I have reduced confidence you were running 
> with the code I suggested at the time of the problem you are now relating to. 
> 
> If you are getting 500 errors and nothing is being logged, it is because 
> Django is capturing the exception and converting it to a generic 500 error 
> response page. Configure Django to send you emails with the details of the 
> exceptions. If you are on a local system, then set DEBUG=True in the Django 
> settings so the details of the error are shown in response that goes back to 
> the browser. 
> 
> As to the code, I did test it this time, including under mod_wsgi with 20 
> request threads and hitting it with concurrent and successive requests with a 
> benchmarking tool. I saw no issues and it appeared to behave as I would 
> expect. 
> 
> So in a test2.py file I had: 
> 
> import time 
> import functools 
> import threading 
> 
> def time_limit(seconds): 
>     def decorator(func): 
>         func.info = threading.local() 
>         def check_timeout(): 
>             if time.time() > func.info.end_time: 
>                 raise RuntimeError('timeout') 
> 
>         func.check_timeout = check_timeout 
> 
>         @functools.wraps(func) 
>         def wrapper(*args, **kwargs): 
>             print 'hasattr counter', hasattr(func.info, 'counter') 
>             if not hasattr(func.info, 'counter'): 
>                 print 'init counter to 0' 
>                 func.info.counter = 0 
>             if func.info.counter == 0: 
>                 func.info.end_time = time.time() + seconds 
>             print 'counter', func.info.counter 
>             func.info.counter += 1 
>             try: 
>                 return func(*args, **kwargs) 
>             finally: 
>                 func.info.counter -= 1 
> 
>         return wrapper 
> 
>     return decorator 
> 
> @time_limit(5) 
> def algorithm(limit, nest=0): 
>     algorithm.check_timeout() 
>     print 'sleep' 
>     time.sleep(1.0) 
>     if nest == limit: 
>         print 'return' 
>         return 
>     algorithm(limit, nest+1) 
> 
> In serial2.py I had a straight serialised test: 
> 
> from test2 import algorithm 
> 
> try: 
>     algorithm(3) 
> except RuntimeError: 
>     print 'timeout' 
> 
> try: 
>     algorithm(10) 
> except RuntimeError: 
>     print 'timeout' 
> 
> try: 
>     algorithm(10) 
> except RuntimeError: 
>     print 'timeout' 
> 
> And then in hello2.wsgi I had it used by a WSGI application. 
> 
> from test2 import algorithm 
> 
> def application(environ, start_response): 
>     status = '200 OK' 
>     output = b'Hello World!' 
> 
>     algorithm(1) 
> 
>     response_headers = [('Content-type', 'text/plain'), 
>                         ('Content-Length', str(len(output)))] 
>     start_response(status, response_headers) 
> 
>     return [output] 
> 
> In the latter case I ran mod_wsgi-express against it as: 
> 
> mod_wsgi-express start-server hello2.wsgi --port 8002 --threads=20 
> 
> and then hit it with ab as: 
> 
> ab -n 100 -c 15 http://localhost:8002/ 
> 
> In this case it wasn't designed to timeout anything, but that should not be a 
> concern as the counter initialisation is still being tested. 
> 
> Do note that if you did cut and paste that last code, I did change the 
> exception type. 
> 
> Anyway, the best thing to do is setup Django so that it provides the details 
> of the exception it captured but then effectively discarded because it 
> converted it to a 500 page. 
> 
> 
> -- 
> You received this message because you are subscribed to the Google Groups 
> "modwsgi" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/modwsgi.
> For more options, visit https://groups.google.com/d/optout.
> 
> 
> -- 
> You received this message because you are subscribed to the Google Groups 
> "modwsgi" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/modwsgi.
> For more options, visit https://groups.google.com/d/optout.
> 
> 
> -- 
> You received this message because you are subscribed to the Google Groups 
> "modwsgi" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/modwsgi.
> For more options, visit https://groups.google.com/d/optout.
> 
> 
> -- 
> You received this message because you are subscribed to the Google Grou
> ...
> 
> -- 
> You received this message because you are subscribed to the Google Groups 
> "modwsgi" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/modwsgi.
> For more options, visit https://groups.google.com/d/optout.

-- 
You received this message because you are subscribed to the Google Groups 
"modwsgi" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/modwsgi.
For more options, visit https://groups.google.com/d/optout.

Re: [modwsgi] Webfaction mod_wsgi + Django

Reply via email to