2010/1/27 Giel van Schijndel <[email protected]>:
> On Wed, Jan 27, 2010 at 11:23:49AM +1100, Graham Dumpleton wrote:
>> 2010/1/27 Giel van Schijndel:
>>> On Tue, Jan 26, 2010 at 01:46:53PM +1100, Graham Dumpleton wrote:
>>>> 2010/1/26 Giel van Schijndel:
>>>>> On Tue, Jan 26, 2010 at 01:12:00AM +0100, Giel van Schijndel wrote:
>>>>>> On Tue, Jan 26, 2010 at 11:00:42AM +1100, Graham Dumpleton wrote:
>>>>>>> 2010/1/26 Graham Dumpleton:
>>>>>>>> 2010/1/26 Giel van Schijndel:
>>>>>>>>> Some time later, and now it seems that the daemon crashed. At least I
>>>>>>>>> think it must have, considering that it didn't leave anything in the
>>>>>>>>> logs, except for timeouts around when the daemon must have gone. No
>>>>>>>>> coredump either.
>>>>>>>>
>>>>>>>> I'll give you an updated patch shortly then which includes the other
>>>>>>>> changes I figured are required to make it more robust on platforms
>>>>>>>> where conditional wait can actually return even though condition not
>>>>>>>> satisfied.
>>>>>>>
>>>>>>> Revert that prior patch and try this one instead:
>>>>>>
>>>>>> No immediate regressions so far. I.e. it functions properly within a few
>>>>>> minutes after restarting Apache with it.
>>>>>
>>>>> It seems that after the daemon times out due to inactivity it gets
>>>>> killed, but it never seems to be respawned when a request arrives after
>>>>> that timeout.
>>>>
>>>> Ensure LogLevel directive is 'info' and see what messages appear in
>>>> error log, or if you have them already, from both main Apache error
>>>> log and any virtual host log which daemon process is within, post them
>>>> for me.
>>>
>>> Logs (last request + everything that follows):
>>>> [Tue Jan 26 15:18:48 2010] [debug] mod_wsgi.c(12695): mod_wsgi 
>>>> (pid=16220): Request server matched was 'ilpam.il.fontys.nl|0'.
>>>> [Tue Jan 26 15:27:33 2010] [info] mod_wsgi (pid=16220): Daemon process 
>>>> deadlock timer expired, stopping process 'ilpam'.
>>
>> Hmmm, you were talking about inactivity timeouts, yet above is
>> something completely different. The above specifically happens when
>> something has locked Python GIL for period of deadlock timeout,
>> default 300 seconds, and as such no other Python code can run.
>
> Except that the above doesn't look like 300 seconds to me, more like
> 515.

I wasn't saying it was linked to the prior logged message, just that
the deadlock timeout is 300 seconds.

That said, since you have additional debug turned on, that would have
been the previous request.

Thus, only possibility is that there was a stuck request from long
before, or a background thread, doing stuff in a loop which at some
point finally didn't release the GIL.

Anyway, that is assuming that it is a user code problem. :-)

> I guess I could test this by setting 'threads=1'?
>
>> I can't see from logs, as not enough from before this point, whether
>> this occurred on inactivity timeout shutdown or whether distinct. My
>> guess is it is distinct as for inactivity timeout, where shutdown
>> timeout is 5 seconds by default, there is no way one could get to 300
>> seconds and trigger this.
>>
>> Can you explain more about how often/when you are seeing this deadlock
>> timeout in error logs. Something like that is not good as indicates a
>> badly implemented Python C extension module that doesn't release GIL
>> when doing blocking operations.
>
> I'm not using any Python C extensions, not directly at least. The only
> things (aside from the Python standard library) I use are CherryPy
> (3.1.2) and Genshi (0.5.1). It's a rather simple project up until now
> [1], currently with as only code a 125 line long Python script [2].
>
> As for how often I see this deadlock. I thought it to be related to the
> inactivity timeout because it appears to happen about 500 (+/- 120)
> seconds after the last HTTP request gets in.

But you have extra debug turned on from what I can tell and there is
no message that process is shutting down due to inactivity timeout. If
there was, then there is an outside possibility that during shutdown
of interpreter the deadlock detection code incorrectly detected a
deadlock as code doesn't check to see whether interpreter in process
of being shutdown. But then, GIL locking still should work even if
interpreter is being shutdown or has been shutdown as it is
effectively a separate lock. I'll still have a look over the code and
see if there is potential for false detection during shutdown.

Graham

> [1] http://git.il.fontys.nl/git/ilpam.git
> [2] http://git.il.fontys.nl/git/ilpam.git/blob/HEAD:/mgt/web/wsgi.py
>
> --
> With kind regards,
> Giel van Schijndel
> - Interlink <www.il.fontys.nl>
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.10 (GNU/Linux)
>
> iEYEARECAAYFAktgKvQACgkQZBYm/87l50LJrQCaAg7+z1gtgVmSmEGs+zZ7N0fK
> lDcAoIEaYHqeFsRg7ngpSJ1ssHw4X9TP
> =K8EV
> -----END PGP SIGNATURE-----
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"modwsgi" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/modwsgi?hl=en.

Reply via email to