Is there a way to keep Apache working even when such a deadlock occurs? Can a hung process be killed and restarted automatically? I know this is not a solution to the actual problem, which should be solved by eliminating the deadlock, but the goal is to keep the production server working while I debug it. I tried every mod_wsgi option that seemed relevant, but could not achieve a stable Apache configuration. After a while it still got stuck, and stayed stuck for about 5 hours.
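For the debugging side, here is a minimal sketch of how the Python stack of every thread in a process could be dumped, in the spirit of the stack-trace techniques linked in the quoted thread below. It is not taken from those wiki pages: the function name `format_thread_stacks` is my own, and how the dump gets triggered inside the daemon process (for example from a background thread) is left open. It targets a current Python rather than the 2.x interpreter that mod_wsgi 2.8 would have used.

```python
import sys
import threading
import traceback


def format_thread_stacks():
    """Return a text dump of the current Python stack of every thread.

    Hypothetical helper for illustration; it is not part of mod_wsgi.
    """
    # Map thread ids to names so the dump is easier to read.
    names = {t.ident: t.name for t in threading.enumerate()}
    lines = []
    for thread_id, frame in sys._current_frames().items():
        lines.append("Thread %s (%s):" % (thread_id, names.get(thread_id, "unknown")))
        lines.extend(entry.rstrip("\n") for entry in traceback.format_stack(frame))
    return "\n".join(lines)


if __name__ == "__main__":
    # Under mod_wsgi, stderr would typically end up in the Apache error log.
    sys.stderr.write(format_thread_stacks() + "\n")
```

Two limits are worth keeping in mind. The dump only shows Python frames, so a thread stuck inside C code shows up as the Python line that called into it. The dumping thread also needs the GIL itself, so this cannot help if the GIL is never released.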
On Apr 25, 2:51 am, Graham Dumpleton <[email protected]> wrote:
> That many threads was never a good idea.
>
> A possible reason why you are seeing fewer problems with only 5 threads
> in a process is that your code or a third party C extension is not
> thread safe and is perhaps deadlocking.
>
> You really need to ascertain when process threads are starting to hang and
> use:
>
> http://code.google.com/p/modwsgi/wiki/DebuggingTechniques#Extracting_...
> http://code.google.com/p/modwsgi/wiki/DebuggingTechniques#Debugging_C...
>
> to work out what it is doing at that time.
>
> Graham
>
> On 24 April 2011 03:38, Chase <[email protected]> wrote:
>
> > Changed the config from 1 process 50 threads to 3 processes 5 threads.
> > That seems to have solved it, or at least made it much less likely.
> >
> > -Chase
> >
> > On Apr 16, 7:56 am, Chase <[email protected]> wrote:
> >> The problem persists. I have removed our calls to lxml; they were not
> >> critical. We'll see what effect that has going forward.
> >>
> >> -Chase
> >>
> >> On Apr 16, 12:08 am, Graham Dumpleton <[email protected]>
> >> wrote:
> >>
> >> > On 16 April 2011 01:04, Chase <[email protected]> wrote:
> >> >
> >> > > Wow, lots of good info. Thanks guys! I have made the
> >> > > "WSGIApplicationGroup %{GLOBAL}" change for now; we'll see if that
> >> > > clears it up over the next week or so.
> >> > >
> >> > > As for running in prefork, I have not made that change yet. But here
> >> > > is the documentation that led me to believe this was preferred:
> >> > >
> >> > > http://code.google.com/p/modwsgi/wiki/IntegrationWithDjango
> >> > >
> >> > > "Now, traditional wisdom in respect of Django has been that it should
> >> > > preferably only be used on single threaded servers. This would mean
> >> > > for Apache using the single threaded 'prefork' MPM on UNIX systems and
> >> > > avoiding the multithreaded 'worker' MPM."
> >> > >
> >> > > Also, the older modpython docs also advised this:
> >> > >
> >> > > http://docs.djangoproject.com/en/dev/howto/deployment/modpython/?from...
> >> > >
> >> > > "Django requires Apache 2.x and mod_python 3.x, and you should use
> >> > > Apache’s prefork MPM, as opposed to the worker MPM."
> >> > >
> >> > > Can you link to a discussion of the subtle problems reported with
> >> > > prefork? Thanks again,
> >> >
> >> > That section was more relevant when Django 1.0 had only just come out,
> >> > which was the first version of Django for which the core was
> >> > supposedly thread safe.
> >> >
> >> > Anyway, the MPM you use isn't particularly relevant as you are using
> >> > daemon mode and not embedded mode. Which MPM you use is only critical
> >> > if you are using embedded mode.
> >> >
> >> > In daemon mode you have the arbitrary ability to control
> >> > processes/threads based on whether your application is thread safe.
> >> >
> >> > For related reading see:
> >> >
> >> > http://code.google.com/p/modwsgi/wiki/ProcessesAndThreading
> >> > http://blog.dscpl.com.au/2009/03/load-spikes-and-excessive-memory-usa...
> >> >
> >> > BTW, the IntegrationWithDjango page in the wiki is likely to be
> >> > completely removed at some point in the near future and I will stop
> >> > providing details for specific frameworks to cover where frameworks
> >> > don't themselves provide enough information. I have already removed
> >> > the pages for most of the other frameworks already. End result is that
> >> > the frameworks themselves will need to provide decent documentation
> >> > themselves to cover any idiosyncrasies that exist in setting up their
> >> > framework to work with mod_wsgi which are due to issues or design
> >> > decisions related to their framework and which are nothing to do with
> >> > mod_wsgi.
> >> > I have had enough of trying to document these framework
> >> > specific subtleties and framework authors tend to express a belief
> >> > that their own documentation is already more than adequate even though
> >> > from what I have seen people still get tripped up when they follow
> >> > only the documentation provided by the framework. So, I will be
> >> > devoting my time elsewhere now and not worrying about documenting
> >> > stuff related to the frameworks or actively assisting users of
> >> > frameworks on forums related to those frameworks or on general forums
> >> > such as StackOverflow. Instead, if it is a framework specific issue,
> >> > you will need to seek help from the developers or the community for
> >> > that framework.
> >> >
> >> > Graham
> >> >
> >> > > -Chase
> >> > >
> >> > > On Apr 14, 6:30 pm, Graham Dumpleton <[email protected]>
> >> > > wrote:
> >> > >> On 15 April 2011 05:18, Chase <[email protected]> wrote:
> >> > >>
> >> > >> > I have a custom Django app that's becoming unresponsive
> >> > >> > intermittently. About once every couple of days between three
> >> > >> > servers, serving about 10,000 requests a day. When it happens, it
> >> > >> > never recovers. I can leave it there for hours, and it will not
> >> > >> > serve any more requests.
> >> > >> >
> >> > >> > In the apache logs, I see the following:
> >> > >> >
> >> > >> > Apr 13 11:45:07 www3 apache2[27590]: **successful view render here**
> >> > >> > ...
> >> > >> > Apr 13 11:47:11 www3 apache2[24032]: [error] server is within MinSpareThreads of MaxClients, consider raising the MaxClients setting
> >> > >> > Apr 13 11:47:43 www3 apache2[24032]: [error] server reached MaxClients setting, consider raising the MaxClients setting
> >> > >> > ...
> >> > >> > Apr 13 11:50:34 www3 apache2[27617]: [error] [client 10.177.0.204] Script timed out before returning headers: django.wsgi
> >> > >> > (repeated 100 times, exactly)
> >> > >> >
> >> > >> > I am running:
> >> > >> >
> >> > >> > apache version 2.2, using the worker MPM
> >> > >> > wsgi version 2.8
> >> > >> > SELinux NOT installed
> >> > >> > lxml package being used, infrequently
> >> > >> > Ubuntu 10.04
> >> > >> >
> >> > >> > apache config:
> >> > >> >
> >> > >> > WSGIDaemonProcess site-1 user=django group=django threads=50
> >> > >> > WSGIProcessGroup site-1
> >> > >> > WSGIScriptAlias / /somepath/django.wsgi
> >> > >> >
> >> > >> > wsgi config:
> >> > >> >
> >> > >> > import os, sys
> >> > >> > sys.path.append('/home/django')
> >> > >> > os.environ['DJANGO_SETTINGS_MODULE'] = 'myapp.settings'
> >> > >> > import django.core.handlers.wsgi
> >> > >> > application = django.core.handlers.wsgi.WSGIHandler()
> >> > >> >
> >> > >> > When this happens, I can kill the wsgi process and the server will
> >> > >> > recover.
> >> > >> >
> >> > >> > >ps aux|grep django # process is running as user "django"
> >> > >> > django 27590 5.3 17.4 908024 178760 ? Sl Apr12 76:09 /usr/sbin/apache2 -k start
> >> > >> > >kill -9 27590
> >> > >> >
> >> > >> > This leads me to believe that the problem is a known issue:
> >> > >> >
> >> > >> > "(deadlock-timeout) Defines the maximum number of seconds allowed to
> >> > >> > pass before the daemon process is shutdown and restarted after a
> >> > >> > potential deadlock on the Python GIL has been detected. The default
> >> > >> > is 300 seconds. This option exists to combat the problem of a daemon
> >> > >> > process freezing as the result of a rogue Python C extension module
> >> > >> > which doesn't properly release the Python GIL when entering into a
> >> > >> > blocking or long running operation."
> >> > >> >
> >> > >> > However, I'm not sure why this condition is not clearing
> >> > >> > automatically.
> >> > >> > I do see that the script timeout occurs exactly 5
> >> > >> > minutes after the last successful page render, so the
> >> > >> > deadlock-timeout is getting triggered. But it does not actually
> >> > >> > kill the process.
> >> > >>
> >> > >> They likely aren't being killed because there isn't actually a
> >> > >> deadlock of a single thread which hasn't released the GIL.
> >> > >>
> >> > >> In other words, what the deadlock timeout will not protect against is
> >> > >> threads calling into C code, releasing the GIL and then deadlocking in
> >> > >> C code.
> >> > >>
> >> > >> In your case, the problem is going to be the lxml module. This module
> >> > >> is known not to work in Python sub interpreters properly.
> >> > >> Specifically, lxml can release the GIL and then attempt to do a
> >> > >> callback into Python code. To do this, it uses the simplified GIL
> >> > >> state API in Python to reacquire the GIL, but that API is only
> >> > >> supposed to be used if running in the main Python interpreter and not
> >> > >> a sub interpreter. When used in a sub interpreter, the code will
> >> > >> deadlock on trying to reacquire the Python GIL.
> >> > >>
> >> > >> That lxml is a problem is documented in:
> >> > >>
> >> > >> http://code.google.com/p/modwsgi/wiki/ApplicationIssues#Multiple_Pyth...
> >> > >>
> >> > >> The solution, since you are only delegating one application to that
> >> > >> mod_wsgi daemon process group, is to add:
> >> > >>
> >> > >> WSGIApplicationGroup %{GLOBAL}
> >> > >>
> >> > >> This will force the application to run in the main Python interpreter
> >> > >> and avoid the shortcomings of the lxml module.
> >> > >>
> >> > >> As for how you might protect against this sort of deadlock in C code
> >> > >> when the GIL isn't locked, the only way is to use 'inactivity-timeout'.
> >> > >> This will cause a restart when there has been no new requests and/or no
> >> > >> reading of request content or generation of response content for that
> >> > >> timeout period.
> >> > >> So, this could be used as a fail safe, but if your
> >> > >> application is used infrequently, it will also have the effect of
> >> > >> causing your idle process to be restarted after the timeout period as
> >> > >> well.
> >> > >>
> >> > >> BTW, in worst cases, for detecting what process is doing, one can use
> >> > >> either:
> >> > >>
> >> > >> http://code.google.com/p/modwsgi/wiki/DebuggingTechniques#Extracting_...
> >> > >> http://code.google.com/p/modwsgi/wiki/DebuggingTechniques#Debugging_C...
> >> > >>
> >> > >> > I'm thinking of switching to MPM/prefork, but I'm not sure if that
> >> > >> > should have any effect, given that I'm in daemon mode already.
> >> > >>
> >> > >> Prefork for some people has been causing subtle problems and I would
> >> > >> avoid it if you can.
> >> > >>
> >> > >> Graham

--
You received this message because you are subscribed to the Google Groups "modwsgi" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to [email protected].
For more options, visit this group at http://groups.google.com/group/modwsgi?hl=en.
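Pulling together the suggestions quoted in this thread (fewer threads per process, running in the main interpreter, and 'inactivity-timeout' as a fail safe), a daemon-mode configuration might look roughly like the sketch below. The timeout and request-count values are illustrative assumptions, not tested settings, and `maximum-requests` is an extra recycling option that nobody in the thread proposed.

```apache
# Sketch only: the numeric values are assumptions chosen for illustration.
# processes/threads follow the "3 processes, 5 threads" change reported above.
WSGIDaemonProcess site-1 user=django group=django processes=3 threads=5 \
    deadlock-timeout=60 inactivity-timeout=300 maximum-requests=1000
WSGIProcessGroup site-1

# Run the application in the main Python interpreter, as suggested for
# C extensions such as lxml that misbehave in sub interpreters.
WSGIApplicationGroup %{GLOBAL}

WSGIScriptAlias / /somepath/django.wsgi
```

As noted in the quoted reply, 'inactivity-timeout' will also restart a process that is simply idle, so on a lightly used site it trades some reload cost for the fail safe. None of this removes the underlying deadlock; it only limits how long a hung process stays hung.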
