2009/7/10 Ricky Zhou <[email protected]>:
> On 2009-07-10 03:19:47 PM, Graham Dumpleton wrote:
>> Normally assert() would see details of the assertion written to
>> stderr. Under Apache though, because stderr is buffered, that
>> information may not appear in the Apache error logs unless assert()
>> explicitly flushes stderr after writing. In other words, the details
>> of the assertion may be getting lost. This may not be the case though;
>> I suggest you have a good look through the error logs, including the
>> main Apache error log if this is actually for a virtual host with its
>> own error log, for any line which doesn't have the leading date/time
>> stamp on it.
>
> Hi, I looked a little bit up from the log snippet that Mike posted, and
> indeed, there was a line:
>
>     Fatal Python error: Py_EndInterpreter: not the last thread
>
> within a second of the abort line that he pasted - could that have been
> the missing message from the SIGABRT? We just set threads to 1, so I
> hope that will go away soon - does this look like it might be caused by
> non-threadsafe modules?
>
> I've also seen segfaults in the logs recently as well:
>
> [Fri Jul 10 09:09:48 2009] [notice] child pid 23582 exit signal Segmentation fault (11)
> [Fri Jul 10 09:09:48 2009] [info] mod_wsgi (pid=23582): Process 'fas' has died, restarting.
> [Fri Jul 10 09:09:48 2009] [info] mod_wsgi (pid=24463): Starting process 'fas' with uid=421, gid=421 and threads=2.
>
> These are tough to debug since they're happening pretty sporadically - I
> see 3 in the last 5 hours or so. I guess the next step will be to set up
> Apache to keep core dumps around so that we can look into those segfaults
> more. We'll let you know what we can get out of that.
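[As an aside, the stderr buffering issue described in the quote above can be worked around at the application level by flushing explicitly after every write. A minimal sketch of that idea - the log() helper name is illustrative, not part of mod_wsgi, and it is written in Python 3 syntax unlike the Python 2 code later in this thread:

    import sys

    def log(message):
        # Write directly to stderr and flush immediately, so the message
        # is not lost to buffering if the process later aborts.
        sys.stderr.write(message + "\n")
        sys.stderr.flush()

    log("marker: reached end of WSGI script import")

With a helper like this, diagnostic markers show up in the Apache error log straight away, rather than sitting in a buffer that is discarded when a SIGABRT or SIGSEGV kills the process.]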
That error indicates two things.

The first is that the process was being shut down anyway, as the only time mod_wsgi calls that function is when destroying sub interpreters on shutdown. This could only be occurring if you had maximum-requests or inactivity-timeout as options to WSGIDaemonProcess and those conditions were triggered, you had touched the WSGI script file to trigger a daemon process restart, or you were running a code monitor to trigger an automatic restart. The only other way that function could have been triggered is if a third party C extension module you are using is doing really weird stuff by creating and then destroying sub interpreters during the life of the process. Doing such a thing is decidedly dangerous, as C extension modules often don't like seeing a sub interpreter they were initialised against being destroyed and then being used again in a different sub interpreter.

The other thing it indicates is that there may have been a background thread, created by your application, still running, and for some reason what mod_wsgi does when destroying sub interpreters isn't ensuring that it is cleaned up okay. That, or there is some code creating independent sub interpreters which isn't doing what it should to ensure there are no active threads against a sub interpreter before destroying it.

I have a couple of suggestions at this point. The first is to register an atexit callback which logs an error message. When mod_wsgi is shutting down the process, it will ensure that atexit callbacks are called when interpreters are destroyed, even sub interpreters. Normally Python only does this for the main interpreter, not sub interpreters. If you therefore see the atexit callback logging, then it was a mod_wsgi process shutdown.
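[For reference, the "not the last thread" condition typically arises from a pattern like the following: the WSGI script starts a background thread at import time and nothing ever stops it, so the thread is still active against the sub interpreter when the interpreter is destroyed. One way to make such a thread shut down cleanly is to pair it with an atexit handler. This is an illustrative sketch in Python 3 syntax, not code from the application being discussed:

    import atexit
    import threading

    stop_event = threading.Event()

    def worker():
        # Background task loop: wakes periodically, exits once asked to stop.
        while not stop_event.is_set():
            stop_event.wait(1.0)

    thread = threading.Thread(target=worker)
    thread.daemon = True
    thread.start()

    def shutdown():
        # Signal the worker and wait for it, so no thread is still active
        # against the (sub) interpreter when it is destroyed.
        stop_event.set()
        thread.join(timeout=5.0)

    atexit.register(shutdown)

Since mod_wsgi runs atexit callbacks when destroying sub interpreters, a handler like shutdown() gives the background thread a chance to exit before Py_EndInterpreter is called.]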
So, corresponding to what you will find in:

  http://code.google.com/p/modwsgi/wiki/RegisteringCleanupCode#Cleanup_On_Process_Shutdown

add to the start of the WSGI script file:

  import atexit
  import mod_wsgi
  import sys

  def cleanup():
      print >> sys.stderr, "ATEXIT"
      print >> sys.stderr, "process=%s" % mod_wsgi.process_group
      print >> sys.stderr, "application=%s" % mod_wsgi.application_group

  atexit.register(cleanup)

The other suggestion is to try mod_wsgi 3.0c3 from:

  http://code.google.com/p/modwsgi/downloads/list

In mod_wsgi 3.0, how Python thread states are managed is somewhat different. In particular, rather than a new thread state being created for each request, thread states are cached and reused between requests for the same thread. This ensures that thread local data is preserved. This may not be practical if you are concerned about the readiness of 3.0, but it would be interesting to see whether the change in how thread states are managed makes a difference.

A final thought though: if you are seeing 500 error responses, that would indicate that a request was still in the process of being handled when this problem occurred. This would mean that process shutdown or inactivity-timeout can't really be triggering, unless you actually had a single request that blocked for the timeout period and no other requests arrived in that time. If that did occur, I would also expect to see a 'Premature end of script headers' error message in the logs, and you haven't said anything about that.

This is really starting to look like you are using some C extension module that is creating its own interpreters. What third party C extension modules are you using which are out of the ordinary?

Graham

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "modwsgi" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [email protected]
For more options, visit this group at http://groups.google.com/group/modwsgi?hl=en
-~----------~----~----~----~------~----~------~--~---
