It could be another (not root) httpd process? Do you thing this is not a
"apache" problem?



*Fabio F. Gervasi*
Engineering Consultant


*Web             www.centrometeo.com <http://www.centrometeo.com/>*
*g+                 plus.google.com/centrometeo
<https://plus.google.com/105294368204645075621>*

*Twitter        twitter.com/centrometeo <http://twitter.com/centrometeo>*
*Facebook    facebook.com/centrometeo <http://www.facebook.com/centrometeo>*









2016-10-02 11:07 GMT+02:00 Daniel <dferra...@gmail.com>:

> I really see nothing in that strace output that tells anything is crashing.
>
>
>
> 2016-10-02 10:44 GMT+02:00 Fabio F.Gervasi <fabio.gervas...@gmail.com>:
>
>> Hi James,
>>
>> I do not have the skills to be able to work well in "depth" to the
>> problem as you (rightly) suggest. What I know is that this problem started
>> from the last system update (apache / php / mariadb, ...) from 23 September.
>>
>> The only log info is: /var/log/httpd/error_log, at "crash time": [Fri
>> Sep 30 19:09:03.897325 2016] [mpm_event:trace4] [pid 30339:tid
>> 139796798162688] event.c(930): socket reached timeout in lingering-close
>> state
>>
>> and I don't know if strace (the results of which I have just quoted) is
>> useful..
>>
>> Should I wait for further apache / php updates?
>>
>>
>>
>> *Fabio F. Gervasi*
>> Engineering Consultant
>>
>>
>> *Web             www.centrometeo.com <http://www.centrometeo.com/>*
>> *g+                 plus.google.com/centrometeo
>> <https://plus.google.com/105294368204645075621>*
>>
>> *Twitter        twitter.com/centrometeo <http://twitter.com/centrometeo>*
>> *Facebook    facebook.com/centrometeo
>> <http://www.facebook.com/centrometeo>*
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> 2016-10-02 10:34 GMT+02:00 Dr James Smith <j...@sanger.ac.uk>:
>>
>>> Things you can do are trying to track down if there is any pattern - are
>>> these first request on a child, "nth" request on  child etc... we add
>>> additional variables to access logs which include things such as
>>> PID/request no in PID, memory usage before and after etc. This allows us
>>> sometimes to see patterns in errors which aren't normally easy to see..
>>> Saying that it may not help you here...
>>>
>>> Another thing you can do with ab is check to see if it is OK under lower
>>> load?
>>>
>>> Finally I find "siege" better than "ab" when it comes to testing as it
>>> is more configurable - you can send it a list of URLs - run test for
>>> "n-seconds" rather than "n-requests" which is more useful if you don't know
>>> how long things take to return!
>>>
>>> James
>>>
>>>
>>> On 02/10/2016 09:02, Daniel wrote:
>>>
>>> I would bet on your method of parsing php and the scripts being ran, in
>>> any case to be sure:
>>>
>>> Try the strace approach:
>>> strace -o /tmp/outputfile -s 5000 httpd -X or strace -ff -F -s200 -o
>>> /tmp/strace.out -p PID
>>>
>>> to try to find out what's really going on.
>>>
>>> 2016-09-30 20:19 GMT+02:00 Fabio F.Gervasi <fabio.gervas...@gmail.com>:
>>>
>>>> Hi!
>>>>
>>>> I have additional information:
>>>>
>>>> /var/log/httpd/error_log, at "crash time": [Fri Sep 30 19:09:03.897325
>>>> 2016] [mpm_event:trace4] [pid 30339:tid 139796798162688] event.c(930):
>>>> socket reached timeout in lingering-close state
>>>>
>>>> What do you think?
>>>>
>>>>
>>>>
>>>> 2016-09-30 14:47 GMT+02:00 Fabio F.Gervasi <fabio.gervas...@gmail.com>:
>>>>
>>>>> Hi!
>>>>>
>>>>> Thank you for your reply. I tried the following tests.
>>>>>
>>>>> *1) Using a little text file:*
>>>>> *# ab -k -c 100 -n 2000 localhost/test.txt*
>>>>> This is ApacheBench, Version 2.3 <$Revision: 1748469 $>
>>>>> Copyright 1996 Adam Twiss, Zeus Technology Ltd,
>>>>> http://www.zeustech.net/
>>>>> Licensed to The Apache Software Foundation, http://www.apache.org/
>>>>>
>>>>> Benchmarking localhost (be patient)
>>>>> Completed 200 requests
>>>>> Completed 400 requests
>>>>> Completed 600 requests
>>>>> Completed 800 requests
>>>>> Completed 1000 requests
>>>>> Completed 1200 requests
>>>>> Completed 1400 requests
>>>>> Completed 1600 requests
>>>>> Completed 1800 requests
>>>>> Completed 2000 requests
>>>>> Finished 2000 requests
>>>>>
>>>>>
>>>>> Server Software:        Apache/2.4.23
>>>>> Server Hostname:        localhost
>>>>> Server Port:            80
>>>>>
>>>>> Document Path:          /test.txt
>>>>> Document Length:        52 bytes
>>>>>
>>>>> Concurrency Level:      100
>>>>> Time taken for tests:   2.521 seconds
>>>>> Complete requests:      2000
>>>>> *Failed requests:        16*
>>>>>    (Connect: 0, Receive: 0, Length: 16, Exceptions: 0)
>>>>> Keep-Alive requests:    1984
>>>>> Total transferred:      950336 bytes
>>>>> HTML transferred:       103168 bytes
>>>>> Requests per second:    793.45 [#/sec] (mean)
>>>>> Time per request:       126.032 [ms] (mean)
>>>>> Time per request:       1.260 [ms] (mean, across all concurrent
>>>>> requests)
>>>>> Transfer rate:          368.18 [Kbytes/sec] received
>>>>>
>>>>> Connection Times (ms)
>>>>>               min  mean[+/-sd] median   max
>>>>> Connect:        0    1   3.8      0      21
>>>>> Processing:     1  123 126.5     83     772
>>>>> Waiting:        0  123 126.7     83     772
>>>>> Total:          1  124 126.6     85     772
>>>>>
>>>>> Percentage of the requests served within a certain time (ms)
>>>>>   50%     85
>>>>>   66%    154
>>>>>   75%    201
>>>>>   80%    229
>>>>>   90%    303
>>>>>   95%    373
>>>>>   98%    452
>>>>>   99%    547
>>>>>  100%    772 (longest request)
>>>>>
>>>>> *I obtain 16 failed requests, but if I run more again I obtain a
>>>>> different number.*
>>>>>
>>>>> *2) Using a big gif file:*
>>>>> *# ab -k -c 100 -n 2000 localhost/it-vis-animation.gif*
>>>>> This is ApacheBench, Version 2.3 <$Revision: 1748469 $>
>>>>> Copyright 1996 Adam Twiss, Zeus Technology Ltd,
>>>>> http://www.zeustech.net/
>>>>> Licensed to The Apache Software Foundation, http://www.apache.org/
>>>>>
>>>>> Benchmarking localhost (be patient)
>>>>> Completed 200 requests
>>>>> Completed 400 requests
>>>>> Completed 600 requests
>>>>> Completed 800 requests
>>>>> Completed 1000 requests
>>>>> Completed 1200 requests
>>>>> Completed 1400 requests
>>>>> Completed 1600 requests
>>>>> Completed 1800 requests
>>>>> Completed 2000 requests
>>>>> Finished 2000 requests
>>>>>
>>>>>
>>>>> Server Software:        Apache/2.4.23
>>>>> Server Hostname:        localhost
>>>>> Server Port:            80
>>>>>
>>>>> Document Path:          /it-vis-animation.gif
>>>>> Document Length:        8105309 bytes
>>>>>
>>>>> Concurrency Level:      100
>>>>> Time taken for tests:   26.294 seconds
>>>>> Complete requests:      2000
>>>>> *Failed requests:        0*
>>>>> Keep-Alive requests:    2000
>>>>> Total transferred:      16211484000 bytes
>>>>> HTML transferred:       16210618000 bytes
>>>>> Requests per second:    76.06 [#/sec] (mean)
>>>>> Time per request:       1314.691 [ms] (mean)
>>>>> Time per request:       13.147 [ms] (mean, across all concurrent
>>>>> requests)
>>>>> Transfer rate:          602100.83 [Kbytes/sec] received
>>>>>
>>>>> Connection Times (ms)
>>>>>               min  mean[+/-sd] median   max
>>>>> Connect:        0    1   4.0      0      22
>>>>> Processing:   404 1309  80.0   1298    1635
>>>>> Waiting:        5   22  32.4     10     207
>>>>> Total:        420 1310  80.6   1298    1635
>>>>>
>>>>> Percentage of the requests served within a certain time (ms)
>>>>>   50%   1298
>>>>>   66%   1315
>>>>>   75%   1342
>>>>>   80%   1360
>>>>>   90%   1397
>>>>>   95%   1440
>>>>>   98%   1513
>>>>>   99%   1530
>>>>>  100%   1635 (longest request)
>>>>>
>>>>> *In this case I obtain always 0 failed!*
>>>>>
>>>>> *3) Using a little png file:*
>>>>> *# ab -k -c 100 -n 2000 localhost/apple-touch-icon.png*
>>>>> This is ApacheBench, Version 2.3 <$Revision: 1748469 $>
>>>>> Copyright 1996 Adam Twiss, Zeus Technology Ltd,
>>>>> http://www.zeustech.net/
>>>>> Licensed to The Apache Software Foundation, http://www.apache.org/
>>>>>
>>>>> Benchmarking localhost (be patient)
>>>>> Completed 200 requests
>>>>> Completed 400 requests
>>>>> Completed 600 requests
>>>>> Completed 800 requests
>>>>> Completed 1000 requests
>>>>> Completed 1200 requests
>>>>> Completed 1400 requests
>>>>> Completed 1600 requests
>>>>> Completed 1800 requests
>>>>> Completed 2000 requests
>>>>> Finished 2000 requests
>>>>>
>>>>>
>>>>> Server Software:        Apache/2.4.23
>>>>> Server Hostname:        localhost
>>>>> Server Port:            80
>>>>>
>>>>> Document Path:          /apple-touch-icon.png
>>>>> Document Length:        12364 bytes
>>>>>
>>>>> Concurrency Level:      100
>>>>> Time taken for tests:   2.576 seconds
>>>>> Complete requests:      2000
>>>>> *Failed requests:        22*
>>>>>    (Connect: 0, Receive: 0, Length: 22, Exceptions: 0)
>>>>> Keep-Alive requests:    1978
>>>>> Total transferred:      25304554 bytes
>>>>> HTML transferred:       24455992 bytes
>>>>> Requests per second:    776.52 [#/sec] (mean)
>>>>> Time per request:       128.779 [ms] (mean)
>>>>> Time per request:       1.288 [ms] (mean, across all concurrent
>>>>> requests)
>>>>> Transfer rate:          9594.53 [Kbytes/sec] received
>>>>>
>>>>> Connection Times (ms)
>>>>>               min  mean[+/-sd] median   max
>>>>> Connect:        0    1   3.9      0      20
>>>>> Processing:     0  125 112.0    101     563
>>>>> Waiting:        0  125 112.1    101     563
>>>>> Total:          0  126 112.6    102     563
>>>>>
>>>>> Percentage of the requests served within a certain time (ms)
>>>>>   50%    102
>>>>>   66%    159
>>>>>   75%    196
>>>>>   80%    220
>>>>>   90%    282
>>>>>   95%    354
>>>>>   98%    401
>>>>>   99%    457
>>>>>  100%    563 (longest request)
>>>>>
>>>>> *I get 22 failed requests, but if I run more again I obtain a
>>>>> different number.*
>>>>>
>>>>> What do you thing?
>>>>>
>>>>> Thank you!
>>>>>
>>>>>
>>>>>
>>>>> 2016-09-30 13:50 GMT+02:00 Daniel <dferra...@gmail.com>:
>>>>>
>>>>>> hello,
>>>>>>
>>>>>> the fact that all those 500 error reponses go with "X-Powered-By:
>>>>>> PHP/5.6.26" tell me it is php giving those errors.
>>>>>>
>>>>>> If you are still unsure, you could use the ab tool from Apache HTTPD,
>>>>>> and request a specific static file (text, gif) hundreds of times to see 
>>>>>> if
>>>>>> you get a 500 error from HTTPD.
>>>>>>
>>>>>> 2016-09-30 8:59 GMT+02:00 Fabio F.Gervasi <fabio.gervas...@gmail.com>
>>>>>> :
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I'm not 100% sure that the real source of the problem is apache
>>>>>>> (could be php or other), but I'd like start from here as the only logs 
>>>>>>> that
>>>>>>> provides me with information (with trace4 level) is apache log. No
>>>>>>> information appears from other logs.
>>>>>>>
>>>>>>> I have:
>>>>>>> - Linux 4.7.4-100.fc23.x86_64
>>>>>>> - Server version: Apache/2.4.23 (Fedora)
>>>>>>> - mysql  Ver 15.1 Distrib 10.0.26-MariaDB, for Linux (x86_64) using
>>>>>>> readline 5.1
>>>>>>> - PHP Version => 5.6.25
>>>>>>>
>>>>>>> The problem: random internal server error 500. Random because, for
>>>>>>> example, if a page produces the error and do "reload" then it works. The
>>>>>>> same page may not work as well as work. Sometimes you can use the site 
>>>>>>> for
>>>>>>> several minutes with no problems, then suddenly there is the error.
>>>>>>>
>>>>>>> When error occurs, this is all that I get from the log file (with
>>>>>>> loglevel = trace4):
>>>>>>>
>>>>>>> [...]
>>>>>>> [Fri Sep 30 07:41:39.151052 2016] [rewrite:trace3] [pid 17957:tid
>>>>>>> 140241771534080] mod_rewrite.c(477): [client 151.25.206.200:38368]
>>>>>>> 151.25.206.200 - - [www.centrometeo.com/sid#55c3e
>>>>>>> 3673400][rid#7f8c8000e9d0/subreq
>>>>>>> <http://www.centrometeo.com/sid#55c3e3673400][rid%237f8c8000e9d0/subreq>]
>>>>>>> [perdir /home/web/centrometeo.com/] applying pattern '.*' to uri
>>>>>>> 'index.php', referer: http://www.centrometeo.com/mod
>>>>>>> elli-numerici/modello-wrf-nmm/559-wrf-mslp-6hprec-ita
>>>>>>> [Fri Sep 30 07:41:39.151092 2016] [rewrite:trace1] [pid 17957:tid
>>>>>>> 140241771534080] mod_rewrite.c(477): [client 151.25.206.200:38368]
>>>>>>> 151.25.206.200 - - [www.centrometeo.com/sid#55c3e
>>>>>>> 3673400][rid#7f8c8000e9d0/subreq
>>>>>>> <http://www.centrometeo.com/sid#55c3e3673400][rid%237f8c8000e9d0/subreq>]
>>>>>>> [perdir /home/web/centrometeo.com/] pass through /home/web/
>>>>>>> centrometeo.com/index.php, referer: http://www.centrometeo.com/mod
>>>>>>> elli-numerici/modello-wrf-nmm/559-wrf-mslp-6hprec-ita
>>>>>>> [Fri Sep 30 07:41:39.913003 2016] [core:trace4] [pid 17957:tid
>>>>>>> 140241771534080] util_script.c(567): [client 151.25.206.200:38368]
>>>>>>> Headers from script 'php.fcgi':, referer:
>>>>>>> http://www.centrometeo.com/modelli-numerici/modello-wrf-nmm/
>>>>>>> 559-wrf-mslp-6hprec-ita
>>>>>>> [Fri Sep 30 07:41:39.913097 2016] [core:trace4] [pid 17957:tid
>>>>>>> 140241771534080] util_script.c(568): [client 151.25.206.200:38368]
>>>>>>>   *Status: 500 Internal Server Error*, referer:
>>>>>>> http://www.centrometeo.com/modelli-numerici/modello-wrf-nmm/
>>>>>>> 559-wrf-mslp-6hprec-ita
>>>>>>> [Fri Sep 30 07:41:39.913130 2016] [core:trace1] [pid 17957:tid
>>>>>>> 140241771534080] util_script.c(649): [client 151.25.206.200:38368] 
>>>>>>> *Status
>>>>>>> line from script 'php.fcgi': 500 Internal Server Error*, referer:
>>>>>>> http://www.centrometeo.com/modelli-numerici/modello-wrf-nmm/
>>>>>>> 559-wrf-mslp-6hprec-ita
>>>>>>> [Fri Sep 30 07:41:39.913162 2016] [core:trace4] [pid 17957:tid
>>>>>>> 140241771534080] util_script.c(568): [client 151.25.206.200:38368]
>>>>>>>   X-Powered-By: PHP/5.6.26, referer: http://www.centrometeo.com/mod
>>>>>>> elli-numerici/modello-wrf-nmm/559-wrf-mslp-6hprec-ita
>>>>>>> [Fri Sep 30 07:41:39.913191 2016] [core:trace4] [pid 17957:tid
>>>>>>> 140241771534080] util_script.c(568): [client 151.25.206.200:38368]
>>>>>>>   P3P: CP="NOI ADM DEV PSAi COM NAV OUR OTRo STP IND DEM", referer:
>>>>>>> http://www.centrometeo.com/modelli-numerici/modello-wrf-nmm/
>>>>>>> 559-wrf-mslp-6hprec-ita
>>>>>>> [Fri Sep 30 07:41:39.913218 2016] [core:trace4] [pid 17957:tid
>>>>>>> 140241771534080] util_script.c(568): [client 151.25.206.200:38368]
>>>>>>>   Content-Encoding: gzip, referer: http://www.centrometeo.com/mod
>>>>>>> elli-numerici/modello-wrf-nmm/559-wrf-mslp-6hprec-ita
>>>>>>> [Fri Sep 30 07:41:39.913246 2016] [core:trace4] [pid 17957:tid
>>>>>>> 140241771534080] util_script.c(568): [client 151.25.206.200:38368]
>>>>>>>   Vary: Accept-Encoding, referer: http://www.centrometeo.com/mod
>>>>>>> elli-numerici/modello-wrf-nmm/559-wrf-mslp-6hprec-ita
>>>>>>> [Fri Sep 30 07:41:39.913275 2016] [core:trace4] [pid 17957:tid
>>>>>>> 140241771534080] util_script.c(568): [client 151.25.206.200:38368]
>>>>>>>   Content-type: text/html; charset=UTF-8, referer:
>>>>>>> http://www.centrometeo.com/modelli-numerici/modello-wrf-nmm/
>>>>>>> 559-wrf-mslp-6hprec-ita
>>>>>>> [Fri Sep 30 07:41:39.913395 2016] [http:trace3] [pid 17957:tid
>>>>>>> 140241771534080] http_filters.c(1006): [client 151.25.206.200:38368]
>>>>>>> *Response sent with status 500, headers:*, referer:
>>>>>>> http://www.centrometeo.com/modelli-numerici/modello-wrf-nmm/
>>>>>>> 559-wrf-mslp-6hprec-ita
>>>>>>> [Fri Sep 30 07:41:39.913432 2016] [http:trace4] [pid 17957:tid
>>>>>>> 140241771534080] http_filters.c(835): [client 151.25.206.200:38368]
>>>>>>>   X-Powered-By: PHP/5.6.26, referer: http://www.centrometeo.com/mod
>>>>>>> elli-numerici/modello-wrf-nmm/559-wrf-mslp-6hprec-ita
>>>>>>> [Fri Sep 30 07:41:39.913458 2016] [http:trace4] [pid 17957:tid
>>>>>>> 140241771534080] http_filters.c(835): [client 151.25.206.200:38368]
>>>>>>>   P3P: CP=\\"NOI ADM DEV PSAi COM NAV OUR OTRo STP IND DEM\\", referer:
>>>>>>> http://www.centrometeo.com/modelli-numerici/modello-wrf-nmm/
>>>>>>> 559-wrf-mslp-6hprec-ita
>>>>>>> [Fri Sep 30 07:41:39.913481 2016] [http:trace4] [pid 17957:tid
>>>>>>> 140241771534080] http_filters.c(835): [client 151.25.206.200:38368]
>>>>>>>   Content-Encoding: gzip, referer: http://www.centrometeo.com/mod
>>>>>>> elli-numerici/modello-wrf-nmm/559-wrf-mslp-6hprec-ita
>>>>>>> [...]
>>>>>>>
>>>>>>> Note the use of fcgi. Apache is indeed configured with
>>>>>>> event/php-fpm/FastCGI, but the same thing happens with "normal" prefork 
>>>>>>> /
>>>>>>> php!
>>>>>>> Until a few days ago it did not happen, it started after updating
>>>>>>> the system (kernel, apache, php, mariadb).
>>>>>>> I do not really know what to do.
>>>>>>>
>>>>>>> Thank you very much.
>>>>>>> Fabio.
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> *Daniel Ferradal*
>>>>>> IT Specialist
>>>>>>
>>>>>> email         dferradal at gmail.com
>>>>>> linkedin     es.linkedin.com/in/danielferradal
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> *Daniel Ferradal*
>>> IT Specialist
>>>
>>> email         dferradal at gmail.com
>>> linkedin     es.linkedin.com/in/danielferradal
>>>
>>>
>>>
>>> -- The Wellcome Trust Sanger Institute is operated by Genome Research
>>> Limited, a charity registered in England with number 1021457 and a company
>>> registered in England with number 2742969, whose registered office is 215
>>> Euston Road, London, NW1 2BE.
>>>
>>
>>
>
>
> --
> *Daniel Ferradal*
> IT Specialist
>
> email         dferradal at gmail.com
> linkedin     es.linkedin.com/in/danielferradal
>

Reply via email to