On Wed, Jan 27, 2010 at 11:23:49AM +1100, Graham Dumpleton wrote: > 2010/1/27 Giel van Schijndel: >> On Tue, Jan 26, 2010 at 01:46:53PM +1100, Graham Dumpleton wrote: >>> 2010/1/26 Giel van Schijndel: >>>> On Tue, Jan 26, 2010 at 01:12:00AM +0100, Giel van Schijndel wrote: >>>>> On Tue, Jan 26, 2010 at 11:00:42AM +1100, Graham Dumpleton wrote: >>>>>> 2010/1/26 Graham Dumpleton: >>>>>>> 2010/1/26 Giel van Schijndel: >>>>>>>> Some time later, and now it seems that the daemon crashed. At least I >>>>>>>> think it must have, considering that it didn't leave anything in the >>>>>>>> logs, except for timeouts around when the daemon must have gone. No >>>>>>>> coredump either. >>>>>>> >>>>>>> I'll give you an updated patch shortly then which includes the other >>>>>>> changes I figured are required to make it more robust on platforms >>>>>>> where conditional wait can actually return even though condition not >>>>>>> satisfied. >>>>>> >>>>>> Revert that prior patch and try this one instead: >>>>> >>>>> No immediate regressions so far. I.e. it functions properly within a few >>>>> minutes after restarting Apache with it. >>>> >>>> It seems that after the daemon times out due to inactivity it gets >>>> killed, but it never seems to be respawned when a request arrives after >>>> that timeout. >>> >>> Ensure LogLevel directive is 'info' and see what messages appear in >>> error log, or if you have them already, from both main Apache error >>> log and any virtual host log which daemon process is within, post them >>> for me. >> >> Logs (last request + everything that follows): >>> [Tue Jan 26 15:18:48 2010] [debug] mod_wsgi.c(12695): mod_wsgi (pid=16220): >>> Request server matched was 'ilpam.il.fontys.nl|0'. >>> [Tue Jan 26 15:27:33 2010] [info] mod_wsgi (pid=16220): Daemon process >>> deadlock timer expired, stopping process 'ilpam'. > > Hmmm, you were talking about inactivity timeouts, yet above is > something completely different. The above specifically happens when > something has locked Python GIL for period of deadlock timeout, > default 300 seconds, and as such no other Python code can run.
Except that the above doesn't look like 300 seconds to me, more like 515. I guess I could test this by setting 'threads=1'? > I can't see from logs, as not enough from before this point, whether > this occurred on inactivity timeout shutdown or whether distinct. My > guess is it is distinct as for inactivity timeout, where shutdown > timeout is 5 seconds by default, there is no way one could get to 300 > seconds and trigger this. > > Can you explain more about how often/when you are seeing this deadlock > timeout in error logs. Something like that is not good as indicates a > badly implemented Python C extension module that doesn't release GIL > when doing blocking operations. I'm not using any Python C extensions, not directly at least. The only things (aside from the Python standard library) I use are CherryPy (3.1.2) and Genshi (0.5.1). It's a rather simple project up until now [1], currently with as only code a 125 line long Python script [2]. As for how often I see this deadlock. I thought it to be related to the inactivity timeout because it appears to happen about 500 (+/- 120) seconds after the last HTTP request gets in. [1] http://git.il.fontys.nl/git/ilpam.git [2] http://git.il.fontys.nl/git/ilpam.git/blob/HEAD:/mgt/web/wsgi.py -- With kind regards, Giel van Schijndel - Interlink <www.il.fontys.nl>
signature.asc
Description: Digital signature
