2009/7/10 Ricky Zhou <[email protected]>:
> On 2009-07-10 03:19:47 PM, Graham Dumpleton wrote:
>> Normally assert() would see details of the assertion written to
>> stderr. Under Apache though, because stderr is buffered, that
>> information may not appear in the Apache error logs unless assert()
>> explicitly flushes stderr after writing. In other words, the details
>> of the assertion may be getting lost. This may not be the case though;
>> I suggest you have a good look through the error logs, including the
>> main Apache error log if this is actually for a virtual host with its
>> own error log, for any line which doesn't have the leading date/time
>> stamp on it.
>
> Hi, I looked a little bit up from the log snippet that Mike posted, and
> indeed, there was a line:
>
>     Fatal Python error: Py_EndInterpreter: not the last thread
>
> within a second of the abort line that he pasted - could that have been
> the missing message from the SIGABRT? We just set threads to 1, so I
> hope that will go away soon - does this look like it might be caused by
> non-threadsafe modules?
>
> I've also seen segfaults in the logs recently as well:
>
> [Fri Jul 10 09:09:48 2009] [notice] child pid 23582 exit signal Segmentation fault (11)
> [Fri Jul 10 09:09:48 2009] [info] mod_wsgi (pid=23582): Process 'fas' has died, restarting.
> [Fri Jul 10 09:09:48 2009] [info] mod_wsgi (pid=24463): Starting process 'fas' with uid=421, gid=421 and threads=2.
>
> These are tough to debug since they're happening pretty sporadically - I
> see 3 in the last 5 hours or so. I guess the next step will be to set up
> Apache to keep core dumps around so that we can look into those segfaults
> more. We'll let you know what we can get out of that.
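[As an aside, the stderr buffering issue described in the quote above can be worked around at the application level by flushing explicitly after every write. A minimal sketch of that idea - the log() helper name is illustrative, not part of mod_wsgi, and it is written in Python 3 syntax unlike the Python 2 code later in this thread:

    import sys

    def log(message):
        # Write directly to stderr and flush immediately, so the message
        # is not lost to buffering if the process later aborts.
        sys.stderr.write(message + "\n")
        sys.stderr.flush()

    log("marker: reached end of WSGI script import")

With a helper like this, diagnostic markers show up in the Apache error log straight away, rather than sitting in a buffer that is discarded when a SIGABRT or SIGSEGV kills the process.]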
That error indicates two things.

The first is that the process was being shut down anyway, as the only time mod_wsgi calls that function is when destroying sub interpreters on shutdown. This could only be occurring if you had maximum-requests or inactivity-timeout as options to WSGIDaemonProcess and those conditions were triggered, you had touched the WSGI script file to trigger a daemon process restart, or you were running a code monitor to trigger an automatic restart. The only other way that function could have been triggered is if a third party C extension module you are using is doing really weird stuff by creating and then destroying sub interpreters during the life of the process. Doing such a thing is decidedly dangerous, as C extension modules often don't like seeing a sub interpreter they were initialised against being destroyed and then being used again in a different sub interpreter.

The other thing it indicates is that there may have been a background thread, created by your application, still running, and for some reason what mod_wsgi does when destroying sub interpreters isn't ensuring that it is cleaned up okay. That, or there is some code creating independent sub interpreters which isn't doing what it should to ensure there are no active threads against a sub interpreter before destroying it.

I have a couple of suggestions at this point. The first is to register an atexit callback which logs an error message. When mod_wsgi is shutting down the process, it will ensure that atexit callbacks are called when interpreters are destroyed, even sub interpreters. Normally Python only does this for the main interpreter, not sub interpreters. If you therefore see the atexit callback logging, then it was a mod_wsgi process shutdown.
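[For reference, the "not the last thread" condition typically arises from a pattern like the following: the WSGI script starts a background thread at import time and nothing ever stops it, so the thread is still active against the sub interpreter when the interpreter is destroyed. One way to make such a thread shut down cleanly is to pair it with an atexit handler. This is an illustrative sketch in Python 3 syntax, not code from the application being discussed:

    import atexit
    import threading

    stop_event = threading.Event()

    def worker():
        # Background task loop: wakes periodically, exits once asked to stop.
        while not stop_event.is_set():
            stop_event.wait(1.0)

    thread = threading.Thread(target=worker)
    thread.daemon = True
    thread.start()

    def shutdown():
        # Signal the worker and wait for it, so no thread is still active
        # against the (sub) interpreter when it is destroyed.
        stop_event.set()
        thread.join(timeout=5.0)

    atexit.register(shutdown)

Since mod_wsgi runs atexit callbacks when destroying sub interpreters, a handler like shutdown() gives the background thread a chance to exit before Py_EndInterpreter is called.]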
So, corresponding to what you will find in:

  http://code.google.com/p/modwsgi/wiki/RegisteringCleanupCode#Cleanup_On_Process_Shutdown

add to the start of the WSGI script file:

  import atexit
  import mod_wsgi
  import sys

  def cleanup():
      print >> sys.stderr, "ATEXIT"
      print >> sys.stderr, "process=%s" % mod_wsgi.process_group
      print >> sys.stderr, "application=%s" % mod_wsgi.application_group

  atexit.register(cleanup)

The other suggestion is to try mod_wsgi 3.0c3 from:

  http://code.google.com/p/modwsgi/downloads/list

In mod_wsgi 3.0, how Python thread states are managed is somewhat different. In particular, rather than a new thread state being created for each request, thread states are cached and reused between requests for the same thread. This ensures that thread local data is preserved. This may not be practical if you are concerned about the readiness of 3.0, but it would be interesting to see whether the change in how thread states are managed makes a difference.

A final thought though: if you are seeing 500 error responses, that would indicate that a request was still in the process of being handled when this problem occurred. This would mean that process shutdown or inactivity-timeout can't really be triggering, unless you actually had a single request that blocked for the timeout period and no other requests arrived in that time. If that did occur, I would also expect to see a 'Premature end of script headers' error message in the logs, and you haven't said anything about that.

This is really starting to look like you are using some C extension module that is creating its own interpreters. What third party C extension modules are you using which are out of the ordinary?

Graham

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "modwsgi" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [email protected]
For more options, visit this group at http://groups.google.com/group/modwsgi?hl=en
-~----------~----~----~----~------~----~------~--~---
