2010/1/27 Giel van Schijndel <[email protected]>:
> On Tue, Jan 26, 2010 at 01:46:53PM +1100, Graham Dumpleton wrote:
>> 2010/1/26 Giel van Schijndel <[email protected]>:
>>> On Tue, Jan 26, 2010 at 01:12:00AM +0100, Giel van Schijndel wrote:
>>>> On Tue, Jan 26, 2010 at 11:00:42AM +1100, Graham Dumpleton wrote:
>>>>> 2010/1/26 Graham Dumpleton <[email protected]>:
>>>>>> 2010/1/26 Giel van Schijndel <[email protected]>:
>>>>>>> Some time later, and now it seems that the daemon crashed. At least I
>>>>>>> think it must have, considering that it didn't leave anything in the
>>>>>>> logs, except for timeouts around when the daemon must have gone. No
>>>>>>> coredump either.
>>>>>>
>>>>>> I'll give you an updated patch shortly, then, which includes the
>>>>>> other changes I figured are required to make it more robust on
>>>>>> platforms where a condition wait can actually return even though
>>>>>> the condition is not satisfied.
>>>>>
>>>>> Revert that prior patch and try this one instead:
>>>>
>>>> No immediate regressions so far. I.e. it functions properly within a few
>>>> minutes after restarting Apache with it.
>>>
>>> It seems that after the daemon times out due to inactivity it gets
>>> killed, but it never seems to be respawned when a request arrives after
>>> that timeout.
>>
>> Ensure the LogLevel directive is set to 'info' and see what messages
>> appear in the error log. If you already have them, post the messages
>> from both the main Apache error log and the error log of any virtual
>> host the daemon process belongs to.
>
> Logs (last request + everything that follows):
>> [Tue Jan 26 15:18:48 2010] [debug] mod_wsgi.c(12695): mod_wsgi (pid=16220): Request server matched was 'ilpam.il.fontys.nl|0'.
>> [Tue Jan 26 15:27:33 2010] [info] mod_wsgi (pid=16220): Daemon process deadlock timer expired, stopping process 'ilpam'.

Hmmm, you were talking about inactivity timeouts, yet the log above
shows something completely different. That message appears only when
something has held the Python GIL for the whole deadlock timeout
period, 300 seconds by default, so that no other Python code can run.
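
To illustrate the idea only: the sketch below is not the actual
mod_wsgi code, and the class and method names are invented for the
example. It assumes a heartbeat that can only be refreshed by code
which manages to acquire the GIL, and a check that compares the age of
the last heartbeat against the timeout. In a real daemon that check
would have to run outside the GIL, which pure Python cannot show.

```python
import time


class DeadlockTimer:
    """Hypothetical heartbeat-based deadlock detector (illustration only)."""

    def __init__(self, timeout=300.0, clock=time.monotonic):
        self.timeout = timeout      # seconds without a heartbeat before expiry
        self.clock = clock          # injectable so the logic can be tested
        self.last_beat = clock()

    def beat(self):
        # Would be called periodically from Python code. If some thread
        # holds the GIL and never releases it, this can never run.
        self.last_beat = self.clock()

    def expired(self):
        # True once no heartbeat has been recorded for `timeout` seconds.
        return self.clock() - self.last_beat >= self.timeout
```

With the default of 300 seconds, a single blocking call that holds the
GIL for five minutes would be enough for expired() to become true.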

There isn't enough in the logs from before this point for me to tell
whether this occurred during an inactivity timeout shutdown or is a
distinct problem. My guess is that it is distinct: for an inactivity
timeout the shutdown timeout is 5 seconds by default, so there is no
way one could get to 300 seconds and trigger this.

Can you explain more about how often and when you are seeing this
deadlock timeout in the error logs? Something like that is not good, as
it indicates a badly implemented Python C extension module that doesn't
release the GIL when doing blocking operations.
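
A small way to see the effect for yourself, assuming a POSIX system and
a standard CPython build with a GIL: ctypes.PyDLL calls a C function
while keeping the GIL held, whereas ctypes.CDLL releases it around the
call. The helper name ticks_during and the timings are made up for this
sketch. It counts how often a background thread gets to run while the
main thread is blocked in the C library's usleep().

```python
import ctypes
import threading
import time


def ticks_during(libc, micros=300_000):
    """Count background-thread iterations while blocked in libc.usleep()."""
    count = 0
    stop = threading.Event()

    def worker():
        nonlocal count
        while not stop.is_set():
            count += 1
            time.sleep(0.01)    # time.sleep() itself releases the GIL

    thread = threading.Thread(target=worker)
    thread.start()
    time.sleep(0.05)            # let the worker get going first
    before = count
    libc.usleep(micros)         # the blocking C call under test
    after = count
    stop.set()
    thread.join()
    return after - before


if __name__ == "__main__":
    # PyDLL: GIL stays held during the call, so the worker is starved.
    print("GIL held:    ", ticks_during(ctypes.PyDLL(None)))
    # CDLL: GIL is released during the call, so the worker keeps running.
    print("GIL released:", ticks_during(ctypes.CDLL(None)))
```

An extension module that blocks the way the PyDLL case does, but for
minutes rather than a fraction of a second, is the sort of thing that
would trip the deadlock timer.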

Graham

>> [Tue Jan 26 15:27:33 2010] [info] mod_wsgi (pid=16220): Shutdown requested 'ilpam'.
>> [Tue Jan 26 15:27:33 2010] [info] mod_wsgi (pid=16220): Stopping process 'ilpam'.
>> [Tue Jan 26 15:27:33 2010] [info] mod_wsgi (pid=16220): Destroying interpreters.
>> [Tue Jan 26 15:27:33 2010] [debug] mod_wsgi.c(5172): mod_wsgi (pid=16220): Create thread state for thread 0 against interpreter 'ilpam.il.fontys.nl|'.
>> [Tue Jan 26 15:27:33 2010] [info] mod_wsgi (pid=16220): Destroy interpreter 'ilpam.il.fontys.nl|'.
>> [Tue Jan 26 15:27:33 2010] [info] mod_wsgi (pid=16220): Cleanup interpreter ''.
>> [Tue Jan 26 15:27:33 2010] [info] mod_wsgi (pid=16220): Terminating Python.
>> [Tue Jan 26 15:27:33 2010] [info] mod_wsgi (pid=16220): Python has shutdown.
>
> Looking at the process tree of Apache it seems the daemon has gone
> defunct:
>> [g...@hyde:~]$ pstree -p 8103
>> -+= 00001 root /sbin/init --
>>  \-+= 08103 root /usr/local/sbin/httpd -f /beast/config/apache/httpd.conf -DNOHTTPACCEPT
>>    |--- 16220 giel <defunct>
>>    |--- 16221 www /usr/local/sbin/httpd -f /beast/config/apache/httpd.conf -DNOHTTPACCEPT
>>    |--- 16222 www /usr/local/sbin/httpd -f /beast/config/apache/httpd.conf -DNOHTTPACCEPT
>>    |--- 16223 www /usr/local/sbin/httpd -f /beast/config/apache/httpd.conf -DNOHTTPACCEPT
>>    |--- 16224 www /usr/local/sbin/httpd -f /beast/config/apache/httpd.conf -DNOHTTPACCEPT
>>    |--- 16225 www /usr/local/sbin/httpd -f /beast/config/apache/httpd.conf -DNOHTTPACCEPT
>>    |--- 16226 www /usr/local/sbin/httpd -f /beast/config/apache/httpd.conf -DNOHTTPACCEPT
>>    |--- 16227 www /usr/local/sbin/httpd -f /beast/config/apache/httpd.conf -DNOHTTPACCEPT
>>    \--- 16228 www /usr/local/sbin/httpd -f /beast/config/apache/httpd.conf -DNOHTTPACCEPT
>
> Further I had a small script running to look at what's happening to the
> process:
>> while kill -0 16220
>> do
>>       (date ; ps aux | grep -v grep | grep 16220) | tee -a mod_wsgi_daemon.log
>>       sleep 10
>> done
>
> Its output from the time the daemon went defunct:
>> Tue Jan 26 15:27:21 CET 2010
>> giel    16220  0.0  1.1 28592 22216  ??  I     3:11PM   0:00.81 httpd: ILMAP (httpd)
>> Tue Jan 26 15:27:31 CET 2010
>> giel    16220  0.0  1.1 28592 22216  ??  I     3:11PM   0:00.81 httpd: ILMAP (httpd)
>> Tue Jan 26 15:27:41 CET 2010
>> giel    16220  0.1  0.0     0     0  ??  Z     3:11PM   0:00.85 <defunct>
>> Tue Jan 26 15:27:51 CET 2010
>> giel    16220  0.0  0.0     0     0  ??  Z     3:11PM   0:00.85 <defunct>
>
> I'm guessing that the daemon process shouldn't be allowed to remain a
> zombie?
>
> Furthermore, sending SIGHUP (or using 'apachectl graceful') to the
> parent Apache process causes it to be properly cleaned up and respawned.
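
For reference, a <defunct> entry is a process that has already exited
but whose parent has not yet collected its exit status with wait() or
waitpid(). Signal 0 can still be delivered to it, which would explain
why a 'kill -0' loop keeps running after the process has gone defunct.
A minimal POSIX-only sketch of that behaviour follows. The function
names are invented for the example, and it says nothing about how
Apache or mod_wsgi actually reap their children.

```python
import os
import time


def signalable(pid):
    """Return True if signal 0 can be sent, i.e. the PID still exists."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    return True


def zombie_demo():
    """Fork a child that exits at once; probe it before and after reaping."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)             # child: exit immediately
    time.sleep(0.2)             # child has exited but is not reaped yet
    before = signalable(pid)    # zombie still has a process table entry
    os.waitpid(pid, 0)          # parent collects the exit status
    after = signalable(pid)     # entry is gone once the child is reaped
    return before, after


if __name__ == "__main__":
    print(zombie_demo())
```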
>
> --
> With kind regards,
> Giel van Schijndel
> - Interlink <www.il.fontys.nl>
>
