Whoops, I meant ‘ps auxjwww’. The ‘j’ adds the PPID column, so the parent 
process relationships are visible.
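
If it is easier than eyeballing the listing, here is a rough Python sketch that flags likely orphans (PPID 1) and zombies (state Z) for a given command name. The function names are made up for illustration, and it assumes a Linux procps ‘ps’ that accepts ‘-eo pid,ppid,stat,user,args’, so treat it as a starting point rather than anything definitive.

```python
import subprocess


def parse_ps(text):
    """Parse the output of `ps -eo pid,ppid,stat,user,args` into dicts."""
    rows = []
    for line in text.strip().splitlines()[1:]:  # first line is the header
        parts = line.split(None, 4)  # args may contain spaces, keep it whole
        if len(parts) < 5:
            continue
        pid, ppid, stat, user, args = parts
        rows.append({"pid": int(pid), "ppid": int(ppid),
                     "stat": stat, "user": user, "args": args})
    return rows


def find_suspects(rows, name):
    """Return (orphans, zombies) among processes whose args contain `name`.

    Orphan here means re-parented to pid 1. The Apache root process also
    has PPID 1, so pass the daemon display name, not the httpd binary.
    """
    matching = [r for r in rows if name in r["args"]]
    orphans = [r for r in matching if r["ppid"] == 1]
    zombies = [r for r in matching if r["stat"].startswith("Z")]
    return orphans, zombies


def snapshot():
    """Run ps once and return its raw output (procps syntax assumed)."""
    result = subprocess.run(
        ["ps", "-eo", "pid,ppid,stat,user,args"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Something like find_suspects(parse_ps(snapshot()), "daemon-display-name") would then return the two lists for the daemon processes.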

Also, what LogLevel are you running in Apache? Are you using ‘info’ so you can 
see messages from mod_wsgi about process shutdown?
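
To line up pids between the ps output and the error log, a sketch like the following could group the ‘still did not exit’ warnings by pid. It assumes the Apache 2.2 error log line format shown in the quoted message below, with each entry on a single line (the wrapping there comes from the email). The function name is hypothetical.

```python
import re

# Matches Apache 2.2 style lines such as:
# [Thu Nov 17 22:15:00 2016] [warn] child process 23371 still did not exit, sending a SIGTERM
PATTERN = re.compile(
    r"\[(?P<ts>[^\]]+)\] \[(?P<level>\w+)\] "
    r"child process (?P<pid>\d+) still did not exit, "
    r"sending a (?P<sig>SIG\w+)"
)


def signal_escalations(log_lines):
    """Map each pid to the ordered list of signals Apache reported sending."""
    by_pid = {}
    for line in log_lines:
        match = PATTERN.search(line)
        if match:
            by_pid.setdefault(int(match.group("pid")), []).append(
                match.group("sig")
            )
    return by_pid
```

Calling it on the lines of the error log (for example the result of open(path).readlines(), with whatever path your install uses) gives a dict keyed by pid, which can be compared against the PID and PPID columns from ps.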

Graham

> On 18 Nov 2016, at 11:13 PM, [email protected] wrote:
> 
> Thanks Graham.
> 
> They look pretty normal:
> 
> USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
> root      2673  0.0  0.0  61272  3276 ?        Ss   04:09   0:01 /usr/sbin/httpd.worker
> apache    1201  0.1  0.0 739448  8668 ?        Sl   06:32   0:01 /usr/sbin/httpd.worker
> svcuser  12840  0.0  0.0 455436 22876 ?        Sl   05:19   0:03 daemon-display-name
> svcuser  23339  0.0  0.0 237320  5392 ?        Sl   Nov17   0:00 daemon-display-name  <-- orphan
> 
> Note that we do *not* see the pids of our daemon workers in the apache log 
> when it shuts down.  We only see the pids of non-mod_wsgi workers, which 
> handle server-status et al.  So for the output above, the httpd log would 
> show shutdown problems only for pid 1201.  
> 
> This issue has been around for a while. We have observed it here and there 
> in the past, but recently it has gotten worse and is causing resource 
> exhaustion, so we're trying to answer 'why now' in addition to 'why'.
> 
> 
> Appreciate the help.
> 
> 
> On Thursday, November 17, 2016 at 11:20:12 PM UTC-5, Graham Dumpleton wrote:
> 
> > On 18 Nov 2016, at 2:39 PM, robert...@dealertrack.com wrote: 
> > 
> > Hello, 
> > 
> > We are having an issue using Apache/2.2.15 (Unix) mod_wsgi/3.3 Python/2.7.3 
> > worker MPM/daemon mode, where apache restarts cause daemon processes to 
> > become orphaned (adopt ppid 1 and continue to run app code but not take 
> > http requests).   
> > 
> > Each time the error occurs, we will see something like: 
> > [Thu Nov 17 22:15:00 2016] [warn] child process 23371 still did not exit, 
> > sending a SIGTERM 
> > [Thu Nov 17 22:15:02 2016] [warn] child process 23371 still did not exit, 
> > sending a SIGTERM 
> > [Thu Nov 17 22:15:04 2016] [warn] child process 23371 still did not exit, 
> > sending a SIGTERM 
> > [Thu Nov 17 22:15:06 2016] [error] child process 23371 still did not exit, 
> > sending a SIGKILL 
> > 
> > .. where pid 23371 was an httpd worker. 
> > 
> > This causes me to assume that the root worker (initial process spawned by 
> > httpd and owned by root) sends (TERM, TERM, TERM, KILL) to the worker(s), 
> > which then attempts to kill the daemon processes but can't for some reason 
> > and that causes it to not respond to its parent's requests to die.  
> > However, this does not make sense to me because that worker is run by 
> > low-privilege apache user which does not have ability to kill our daemon 
> > processes (which have a different uid/gid).  We have tried permutations of 
> > different users and privileges and nothing helps. 
> > 
> > We can easily send a TERM to any of the daemon processes manually (orphaned 
> > or not), and they die cleanly in well under the 3 second window that apache 
> > uses.  They die, and mod_wsgi emits something to the httpd log saying they 
> > were aborted.  It just doesn't happen when httpd tries to do it. 
> > 
> > We are using C modules, and we have enabled WSGIApplicationGroup ${GLOBAL} 
> > and as far as we can tell our permissions and vhost configuration are right. 
> >  The application works well at runtime. 
> > 
> > In order to continue to debug this, we were hoping to find out exactly how 
> > the daemons are signaled that they should exit.  Tracing the daemon 
> > processes with sysdig shows nothing about them getting any signals from 
> > httpd to terminate.   
> > 
> > Any ideas or tips on how to put the pieces together? 
> 
> The signals to shut down should be sent by the Apache root process, which runs 
> as root. There is no way the daemon processes should be able to ignore the 
> SIGKILL. The only way the processes should be able to hang around is if they 
> became zombie processes because they were hung on some resource such as an 
> NFS mount. They will not actually be running in this case, only occupying a 
> slot in the process table and nothing more. 
> 
> I really need to see the output of ‘ps auxwww’ so I can see the pids, their 
> relationship to other httpd processes, the process state, and whether it is 
> a zombie (Z). 
> 
> Overall there is not much I can do to help, as you are on an ancient 
> Apache/mod_wsgi version. From memory, I have seen some complaints of 
> something similar before, but they all revolved around the use of Apache 
> 2.2.12-2.2.16. I have never seen anything similar since, so I have always 
> suspected some strange issue with Apache around that version. 
> 
> Graham 
> 
> 
> -- 
> You received this message because you are subscribed to the Google Groups 
> "modwsgi" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at https://groups.google.com/group/modwsgi.
> For more options, visit https://groups.google.com/d/optout.
