On 20 October 2010 00:54, Patrick Michael Kane <[email protected]> wrote:
> Hey all:
>
> We're running a Django site using Apache+mod_wsgi. The stack is
> pretty typical: Apache 2.2.14, mod_wsgi 3.3, Python 2.6.1. We're
> running single-threaded using DaemonProcess (WSGIDaemonProcess staging
> threads=1 processes=8 maximum-requests=10 python-path=[path]
> display-name=%{GROUP}).
>
> We're seeing an issue where the Apache server will stop responding to
> all requests (even non-WSGI ones) after a fairly large number of
> requests. When we strace the wsgi processes, they are all in poll(),
> waiting to be talked to:
>
> poll([{fd=3, events=POLLIN}], 1, -1...
>
> When we look at the httpds, they are blocking on connect to the
> mod_wsgi unix socket:
>
> connect(69, {sa_family=AF_FILE,
> path="/home/actionkit/releases/stable/apache/logs/.1873.28.7.sock"}, 110
>
> A restart fixes it and then the problem goes away for a few days to a
> week. We're seeing the same behavior across 3 identical webservers.
>
> The problem happens once every few days.
>
> If folks have any ideas on the cause, I'm all ears. Alternatively, if
> there's additional steps we can take to debug that would be helpful,
> let me know.

Can you provide the output of running Apache with the -V option? E.g.:

  /usr/sbin/httpd -V

I want to verify which MPM you are using and some of the compilation options.

As to the strace on the WSGI processes, I really need a gdb stack trace of all threads if you can manage to get one. I need to see whether the actual request handler threads have exited. My latest theory is that they might have. The poll() may be the main thread, which is just waiting for a shutdown indicator to be sent via a socketpair from the signal handler.

Can you also add to your Apache configuration files at global scope:

  WSGIAcceptMutex xxx

Then run:

  /usr/sbin/apachectl -t

This should yield something like:

  Accept mutex lock mechanism 'xxx' is invalid. Valid accept mutex mechanisms for this platform are: default, flock, fcntl, sysvsem, posixsem.

Do the same thing for the directive:

  AcceptMutex xxx

From this and the output of -V I can hopefully determine what type of cross-process mutex lock is being used. If it is sysvsem, then it may be a potential bug in the mod_wsgi code that I uncovered while staring at the code on a bumpy flight from Toronto to New York while on holidays. Let's try and work out whether this might be the case, and I can then tell you the workaround to use and we can see if the problem appears to go away.

As much information as possible, rather than truncated information, would be appreciated, as this is proving a tricky issue to solve.

BTW, any reason you are running maximum-requests so low? One of the observations in the past is that the problem is made worse when WSGI processes are being restarted a lot, as would be the case for a small number of maximum requests.

Graham
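
If it helps when collecting this from several servers, the apachectl -t message above can be picked apart with a few lines of Python. This is only a rough sketch: the sample text is copied from the message format quoted above, the function name parse_mutex_mechanisms is made up for illustration, and the exact wording Apache prints on a given platform may differ.

```python
import re

# Sample text copied from the message format quoted above; real output may differ.
SAMPLE = (
    "Accept mutex lock mechanism 'xxx' is invalid. "
    "Valid accept mutex mechanisms for this platform are: "
    "default, flock, fcntl, sysvsem, posixsem."
)


def parse_mutex_mechanisms(message):
    """Return the mechanism names listed in the error text, or [] if none found."""
    # Capture everything after the colon up to the end of that line.
    match = re.search(r"mechanisms for this platform are:\s*([^\n]*)", message)
    if match is None:
        return []
    # Drop surrounding whitespace and the trailing full stop before splitting.
    names = match.group(1).strip().rstrip(".")
    return [name.strip() for name in names.split(",") if name.strip()]


if __name__ == "__main__":
    mechanisms = parse_mutex_mechanisms(SAMPLE)
    print(mechanisms)
    print("sysvsem listed:", "sysvsem" in mechanisms)
```

Note this only reports which mechanisms are valid on the platform, not which one "default" actually resolves to, so the -V output is still needed alongside it.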
