On Tue, 6 May 2008, Jeremy Chadwick wrote:

> On Tue, May 06, 2008 at 08:59:14AM -0400, Dan Mahoney, System Admin wrote:
>> On Tue, 6 May 2008, Jeremy Chadwick wrote:
>>
>>> The parent process in your case is httpd, which means that Apache (or a
>>> related Apache module) is not calling wait() when waiting for a child to
>>> finish.
>>>
>>> So, there is a possibility suPHP is responsible for this. The way it
>>> works, without going into suphp.conf semantics:
>>>
>>> 1) httpd loads mod_suphp.so
>>> 2) mod_suphp.so executes /usr/local/bin/suphp
>>> 3) /usr/local/bin/suphp executes /usr/local/bin/php-cgi
>>> 4) /usr/local/bin/php-cgi parses PHP script and outputs data to
>>> stdout, which makes it back to httpd.
>>>
>>> Here's the problem: there's not going to be an easy way to trace this down,
>>> because Apache makes debugging children it forks off fairly difficult.
>>> The problem could be in any of 3 places.
>>
>> Why not?
>>
>> Three (kinda) words for you. mod_log_forensic. Available in every apache
>> since 1.3.30
>>
>> Basically, writes to a logfile whenever a process request *starts* and when
>> it *ends*. Comes with a script to grep the log for lines that don't have
>> an end. Includes the request url.
>>
>> http://httpd.apache.org/docs/1.3/mod/mod_log_forensic.html
>>
>> See if this helps you. It may not, as somehow apache may write the logfile
>> line and assume the suPHP child successfully returned (but I don't *think*
>> so, because apache includes things in logs like how many seconds it took to
>> serve a request).
>
> I can see how this is useful, but it isn't going to tell you which of
> the above 3 pieces isn't calling wait(). The only way one is going to
> find that out is via truss or strace on each child.
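For reference, here is a rough sketch of the "find the requests that never ended" idea. It is not the check_forensic script that ships with Apache, just a small Python stand-in. It assumes the documented forensic log layout, where a request start is written as "+<id>|<request line>|<header:value>|..." and the matching end as "-<id>". The function names and the log path are made up for illustration.

```python
def find_unfinished(lines):
    """Return {forensic_id: request_line} for requests with a start record
    but no matching end record."""
    pending = {}
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("+"):
            # Start record: +<id>|<request line>|<Header:value>|...
            fields = line[1:].split("|")
            request = fields[1] if len(fields) > 1 else ""
            pending[fields[0]] = request
        elif line.startswith("-"):
            # End record: -<id>; the request completed, so forget it.
            pending.pop(line[1:].strip(), None)
    return pending


def report(path):
    """Print every unfinished request found in the forensic log at `path`.
    The path is whatever your ForensicLog directive points at."""
    with open(path, errors="replace") as log:
        for forensic_id, request in find_unfinished(log).items():
            print(forensic_id, request)
```

Calling report() with the forensic log's path (hypothetically, something like /var/log/httpd-forensic.log) lists the ID and request line of anything that started and never finished. That narrows things down to a URL; as noted above, it still won't say which of the three pieces failed to call wait().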
My personal theory is that, in cases like this, it's caused by one of three
things: scripts hitting a DB way, WAY faster than it can handle (such as when
a phpBB is being "wormed"), high system load (due to a high number of
processes, swap thrash, or the like), or a script that is, in general,
written like crap(*).

I'd assert the solution is: once you know which script it is, you can isolate
that behavior. See if there are known issues, check the forums for that
script, etc. Failing that, LART the user. "Dude, you're slowing my system
down, find another script".

Remember, if you're seeing a zombie process, chances are the user didn't get
their request served, either. Which means, in theory, they want this fixed.

And while yes, in theory it may be a PHP bug, you don't see this on EVERY PHP
request, so it's an interoperability issue.

It's probably worth mentioning as well: run the latest version of PHP. Have a
look at the changelog:

http://www.php.net/ChangeLog-5.php#5.2.6

What are the odds you're not going to hit at least ONE of those?

* (okay, okay, you know me, that means anything not perl with strict and
warnings).

-Dan

--

"Goodbye my peoples. I'll miss each one of you. Sniff-Sniff I now know the
true meaning of love. Thank you Sniff-Sniff. You are all in my heart."

-Chris D.

--------Dan Mahoney--------
Techie, Sysadmin, WebGeek
Gushi on efnet/undernet IRC
ICQ: 13735144 AIM: LarpGM
Site: http://www.gushi.org
---------------------------