On 18 March 2010 07:26, Ben Drees <[email protected]> wrote:
> Hi,
>
> I'm working with the same stack and problem that Alec has been
> describing here. I have a new (but still vague) theory about what's
> happening for your consideration. In trying to understand exactly how
> the daemon process restart mechanism works, I noticed that the switch
> statement in wsgi_manage_process() has no default block. There's at
> least one documented 'reason' code which is not handled:
> APR_OC_REASON_UNWRITABLE. I'm not exactly sure what that code means,
> but I was able to reproduce the problem (declining number of daemon
> children) by adding these lines to the top of wsgi_manage_process():
>
>     if (daemon->process.pid % 2) {
>         reason = APR_OC_REASON_UNWRITABLE;
>     }
>
> This is admittedly contrived, but demonstrates that the problem
> *could* be related to an unexpected condition (unhandled reason) or
> error (transient resource shortage) occurring during the execution of
> wsgi_manage_process(). There seems to be at most one opportunity to
> get the terminating daemon restarted. If there was a
> wsgi_stork_thread() similar to wsgi_reaper_thread(), then perhaps the
> restart could be retried later in the event of transient difficulties.
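For illustration only, a switch with a logging default might look roughly like the sketch below. This is not the actual wsgi_manage_process() code. The function name sketch_manage_process(), its return value, and the use of fprintf() in place of Apache's logging are all assumptions made up for the example, and the constants are stand-ins for the ones that apr_thread_proc.h would normally provide.

```c
#include <stdio.h>

/* Stand-ins for the APR_OC_REASON_* constants; real code would get
 * these from apr_thread_proc.h instead of defining them here. */
#ifndef APR_OC_REASON_DEATH
#define APR_OC_REASON_DEATH      0
#define APR_OC_REASON_UNWRITABLE 1
#define APR_OC_REASON_RESTART    2
#define APR_OC_REASON_UNREGISTER 3
#define APR_OC_REASON_LOST       4
#endif

/* Hypothetical helper, not part of mod_wsgi. Returns 1 when the reason
 * is one the switch knows about, 0 when it falls through to default. */
static int sketch_manage_process(int reason, long pid, FILE *log)
{
    switch (reason) {
    case APR_OC_REASON_DEATH:
    case APR_OC_REASON_LOST:
        /* Real code would try to start a replacement daemon here. */
        return 1;

    case APR_OC_REASON_RESTART:
    case APR_OC_REASON_UNREGISTER:
        /* Real code would clean up without starting a replacement. */
        return 1;

    default:
        /* Unhandled reason (e.g. APR_OC_REASON_UNWRITABLE): at least
         * leave a trace in the log rather than ignoring it silently. */
        fprintf(log, "unexpected reason %d for daemon process (pid=%ld)\n",
                reason, pid);
        return 0;
    }
}
```

With something like this, the contrived test above would at least produce a log line for each daemon process whose reason code was not recognised.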

That particular value is never used in the APR code or in Apache. It more than likely exists so that user code can use it as a reason when signalling that a process has died, but mod_wsgi doesn't do that.

It wouldn't hurt to have a default which at least logs when an unexpected reason arrives. FWIW, mod_cgid doesn't handle that reason code or have a default for its switch statement either, and that is what the code was modelled on.

That other process management in APR is a bit magic though, and there are certain parts of it, and of how it interacts with Apache graceful restarts, that I remember being uncertain about for quite a while. I can't remember if I resolved the issue in my mind, but for a long time I couldn't work out where the signal came from which shut down the other processes on an Apache graceful restart. This was an issue at the time as I wasn't seeing proper shutdown messages from processes and the Python exit code wasn't being called; instead the processes just got killed. One time when I went to investigate again, I couldn't duplicate the issue and all seemed to work properly, even when I went back and tried older versions of Apache.

Even though processes were being killed, they were always replaced.

When I have time, I'll look over that APR code for managing processes again.

Graham
