John,

Sometimes it's difficult to see what the error is because you can't see the request (it doesn't get logged).

To get round this, add:

 * a transhandler which writes a tag (e.g. ST), the request and the PID
   to the error log
 * a cleanuphandler which does the same... with a different tag (e.g. FI)

You can then get a better idea of what is causing the error: the request that causes the seg-fault will have an ST just before the seg-fault but no FI. You will also have a history of all the requests handled by that PID (in case the problem is cumulative).
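Picking the unmatched ST lines out of a busy error log by eye gets tedious, so here is a rough sketch of how the matching could be scripted. It is only an illustration: the line format ("ST <pid> <request>" / "FI <pid> <request>"), the TAG_RE pattern and the scan() function are my own assumptions, not something your handlers will produce unless you write them that way.

```python
import re
from collections import defaultdict

# Hypothetical line format written by the two handlers, e.g.
#   "... ST 2088 GET /index.html HTTP/1.1"
#   "... FI 2088 GET /index.html HTTP/1.1"
# Adjust the pattern to whatever your handlers actually write.
TAG_RE = re.compile(r"\b(ST|FI) (\d+) (.*)$")


def scan(lines):
    """Return (history, open_requests).

    history maps each PID to every request it started, in order;
    open_requests maps a PID to a request that has an ST but no FI.
    """
    history = defaultdict(list)
    open_requests = {}
    for line in lines:
        match = TAG_RE.search(line.rstrip("\n"))
        if match is None:
            continue  # not one of our tagged lines
        tag, pid, request = match.groups()
        if tag == "ST":
            history[pid].append(request)
            open_requests[pid] = request
        else:
            # FI: the request completed, so it is no longer a suspect
            open_requests.pop(pid, None)
    return dict(history), open_requests
```

Feed it the lines of the error log and compare the PIDs left in open_requests with the "child pid ... exit signal Segmentation fault" notices. A PID that is still alive may simply have a request in flight, and PIDs can be reused over a long log, so treat the result as a list of suspects rather than proof.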

Some time ago (about 12 years) we were having errors with apparently random requests (including static images). Doing this, we discovered that the request which died was the one handled straight after a request which talked to a particular Oracle database.

On the live site we just killed the child at the end of these requests... and then went back to diagnose the error...

James

On 03/09/2015 22:21, John Dunlap wrote:
Ever since upgrading from Debian 7 - which shipped with Apache 2.2 - to Debian 8 - which shipped with Apache 2.4 - my user base has been reporting that their browsers randomly tell them "No data received". To date, they have not been able to identify any kind of pattern which triggers it. I've been sifting through the server logs looking for problems and I'm seeing a lot of errors similar to the following:

[Thu Sep 03 21:12:52.382357 2015] [core:notice] [pid 13199:tid 140364918835072] AH00052: child pid 2088 exit signal Segmentation fault (11)
[Thu Sep 03 21:13:03.406215 2015] [core:notice] [pid 13199:tid 140364918835072] AH00052: child pid 2121 exit signal Segmentation fault (11)
[Thu Sep 03 21:13:05.417909 2015] [core:notice] [pid 13199:tid 140364918835072] AH00052: child pid 2165 exit signal Segmentation fault (11)
[Thu Sep 03 21:13:08.433829 2015] [core:notice] [pid 13199:tid 140364918835072] AH00052: child pid 2232 exit signal Segmentation fault (11)
[Thu Sep 03 21:15:53.614351 2015] [core:notice] [pid 13199:tid 140364918835072] AH00052: child pid 2264 exit signal Segmentation fault (11)
[Thu Sep 03 21:16:03.637236 2015] [core:notice] [pid 13199:tid 140364918835072] AH00052: child pid 2539 exit signal Segmentation fault (11)


Can someone give me some tips on how to proceed with troubleshooting this and, possibly, fixing it?

--
John Dunlap
CTO | Lariat