Stefan Seufert wrote:
Hi Jano,
BUT it seems like Apache segfaults when a graceful restart is done:
Graceful restarts have always been a problem with Apache, even with the
normal prefork MPM. I've never seen segfaults there, but processes
hanging forever are not that uncommon.
The code in peruser handling this situation is somewhat complex and to
be honest I'm glad I didn't have to touch it so far. Sean, is this part
specific to peruser or did we inherit that code from some other MPM?
I cannot reproduce this and, unlike the previous problem, I don't even
have an educated guess. My best bet was a race on the pipe_of_death, but
I can't get a segfault even when I run an apache bench with 20
concurrent requests against the server.
About 7-8 segfaults seem to happen shortly (a few seconds or less) after
a graceful restart. Probably all processors which are active at that
moment crash and burn.
o Can you please check the log files and see if those processes have
been active (serving requests) before the graceful restart or if they
were just starting?
It seems they were serving requests before the graceful restart.
o Is the server a multi processor machine?
It's an AMD64 3800+ X2 dual-core, so yes, it's a multiprocessor machine.
o Do these Segfaults always happen? If not
- Does it matter whether the config has been changed or not?
- Does it depend on the load of the server (e.g. only happens if the
server is heavily loaded)?
They do not always happen.
It doesn't seem to matter whether the config file has been changed.
Well, if more requests are being served, more segfaults happen after a
graceful. So it seems like all active processors or whatever will crash.
About 1 out of 5 gracefuls seems to fail. If load is low then 1-2
segfaults occur; during the day 10-20 can occur.
I added log traces for 2 pids at the end of this email.
Gotta wait for the afternoon, then I can do some more debugging and
statistics, because load is higher then.
o Do these segfaults only happen with a graceful restart (SIGUSR1) or do
they happen with a normal restart (SIGHUP), too?
I don't restart much, but so far I don't remember seeing them after or
before a restart.
o Do you send the signal only to the master or to all processes (kill
vs. killall)?
Umm, dunno. I'm using the CentOS 4.3 default httpd init.d script, which
has been modified slightly.
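For reference, the kill vs. killall distinction asked about above can be sketched like this. It is only a minimal sketch in Python, assuming the master's pid is kept in a pid file; the helper names and the injectable `send` parameter are made up for illustration, and the real path depends on the init script. SIGUSR1 is the graceful restart and SIGHUP the normal one, as in the question above.

```python
import os
import signal


def read_master_pid(pidfile):
    """Return the pid stored on the first line of pidfile."""
    with open(pidfile) as fh:
        return int(fh.readline().strip())


def signal_master(pidfile, graceful=True, send=os.kill):
    """Signal only the master process (the `kill` case, not `killall`).

    `send` defaults to os.kill; it is a parameter only so the function
    can be tried without touching a running server.
    """
    sig = signal.SIGUSR1 if graceful else signal.SIGHUP
    pid = read_master_pid(pidfile)
    send(pid, sig)
    return pid, sig
```

A killall-style restart would instead deliver the signal to every httpd process, children included, so it is worth checking which of the two the modified init script actually does.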
This does not seem to be a big problem, but if someone feels bored then
it could be fixed, and if not fixed outright, it could at least fail
with a slightly cleaner notice than a segfault.
You are too kind, but I think segfaults are always problems which should
be fixed ASAP.
Well yeah, but I didn't want to put any pressure on anyone :)
Graceful was done at: 2006-08-12 11:17:45
[Sat Aug 12 11:07:33 2006] [warn] (peruser: pid=2015 uid=99 child=4)
peruser_post_read(): MULTIPLEXER => Determining if request should be
passed. Child Num: 4
[Sat Aug 12 11:07:33 2006] [warn] (peruser: pid=2015 uid=99 child=4)
pass_request(): r->the_request="GET
/memcp.php?action=favorites&favadd=28430 HTTP/1.0" le
[Sat Aug 12 11:07:33 2006] [warn] (peruser: pid=2015 uid=99 child=4)
pass_request(): header_len=221 headers="GET
/memcp.php?action=favorites&favadd=28430 HTTP
[Sat Aug 12 11:07:33 2006] [warn] (peruser: pid=2801 uid=5010 child=65)
receive_from_multiplexer(): header_len=221 headers="GET
/memcp.php?action=favorites&fa
[Sat Aug 12 11:17:41 2006] [warn] (peruser: pid=2843 uid=0 child=9)
child_main(): sock_fd_in=39 sock_fd_out=40
[Sat Aug 12 11:17:41 2006] [warn] (peruser: pid=2843 uid=0 child=9)
child_main(): MULTIPLEXER 9
[Sat Aug 12 11:17:41 2006] [warn] (peruser: pid=2843 uid=0 child=9)
child_main(): updating processor stati
[Sat Aug 12 11:17:41 2006] [warn] (peruser: pid=2843 uid=0 child=9)
listen_add(): function entered
[Sat Aug 12 11:17:43 2006] [warn] (peruser: pid=2890 uid=0 child=58)
child_main(): sock_fd_in=67 sock_fd_out=68
[Sat Aug 12 11:17:43 2006] [warn] (peruser: pid=2890 uid=0 child=58)
child_main(): WORKER 58
[Sat Aug 12 11:17:43 2006] [warn] (peruser: pid=2890 uid=0 child=58)
listen_clear(): function entered
[Sat Aug 12 11:17:43 2006] [warn] (peruser: pid=2890 uid=0 child=58)
listen_add(): function entered
[Sat Aug 12 11:17:43 2006] [warn] (peruser: pid=2890 uid=0 child=58)
listen_add(): function entered
[Sat Aug 12 11:17:43 2006] [warn] (peruser: pid=2890 uid=5011 child=58)
child_main(): input available ... resetting socket.
[Sat Aug 12 11:17:43 2006] [warn] (peruser: pid=2890 uid=5011 child=58)
receive_from_multiplexer(): receiving from sock_fd=67
[Sat Aug 12 11:17:43 2006] [warn] (peruser: pid=2890 uid=5011 child=58)
receive_from_multiplexer(): recvmsg returned 72
[Sat Aug 12 11:17:43 2006] [warn] (peruser: pid=2890 uid=5011 child=58)
receive_from_multiplexer(): trans_sock=169631584 fdx=5 sock_fd=5
[Sat Aug 12 11:17:43 2006] [warn] (peruser: pid=2890 uid=5011 child=58)
receive_from_multiplexer(): header_len=63 headers="GET
/failid/nimepaev.dat HTTP/1.0\r
[Sat Aug 12 11:17:43 2006] [warn] (peruser: pid=2890 uid=5011 child=58)
receive_from_multiplexer(): header_len > 0, we got a request
[Sat Aug 12 11:17:43 2006] [warn] (peruser: pid=2890 uid=5011 child=58)
receive_from_multiplexer(): There is no body
[Sat Aug 12 11:17:43 2006] [warn] (peruser: pid=2890 uid=5011 child=58)
receive_from_multiplexer(): returning 0
[Sat Aug 12 11:17:43 2006] [warn] (peruser: pid=2890 uid=5011 child=58)
child_main(): CHECKING IF WE SHOULD CLONE A CHILD...
[Sat Aug 12 11:17:43 2006] [warn] (peruser: pid=2890 uid=5011 child=58)
child_main(): total_processors = 8, max_processors = 8
[Sat Aug 12 11:17:43 2006] [warn] (peruser: pid=2890 uid=5011 child=58)
child_main(): idle_processors = 1, min_free_processors = 2
[Sat Aug 12 11:17:43 2006] [warn] (peruser: pid=2890 uid=5011 child=58)
child_main(): marked jmpbuffer
[Sat Aug 12 11:17:43 2006] [warn] (peruser: pid=2890 uid=5011 child=58)
child_main(): calling process_socket()
[Sat Aug 12 11:17:43 2006] [warn] (peruser: pid=2890 uid=5011 child=58)
process_socket(): Creating dummy connection to use the vhost lookup api
[Sat Aug 12 11:17:43 2006] [warn] (peruser: pid=2890 uid=5011 child=58)
process_socket(): Looking up the right vhost
[Sat Aug 12 11:17:43 2006] [warn] (peruser: pid=2890 uid=5011 child=58)
process_socket(): Base server is p2.xservu.com, name based vhosts on
[Sat Aug 12 11:17:43 2006] [warn] (peruser: pid=2890 uid=5011 child=58)
process_socket(): child_num=58 sock=169631584 sock_fd=5\n
[Sat Aug 12 11:17:43 2006] [warn] (peruser: pid=2890 uid=5011 child=58)
process_socket(): type=WORKER 58
[Sat Aug 12 11:17:43 2006] [warn] (peruser: pid=2890 uid=5011 child=58)
peruser_process_connection(): function entered
[Sat Aug 12 11:17:43 2006] [warn] (peruser: pid=2890 uid=5011 child=58)
peruser_process_connection(): leaving (DECLINED)
[Sat Aug 12 11:17:43 2006] [warn] (peruser: pid=2890 uid=5011 child=58)
peruser_post_read(): WORKER 58
[Sat Aug 12 11:17:43 2006] [warn] (peruser: pid=2890 uid=5011 child=58)
peruser_post_read(): request for www.romance.pri.ee / (server
romance.pri.ee) seems to
[Sat Aug 12 11:17:43 2006] [warn] (peruser: pid=2890 uid=5011 child=58)
child_main(): returned from process_socket()
[Sat Aug 12 11:17:44 2006] [warn] (peruser: pid=2890 uid=5011 child=58)
child_main(): input available ... resetting socket.
[Sat Aug 12 11:17:44 2006] [warn] (peruser: pid=2890 uid=5011 child=58)
receive_from_multiplexer(): receiving from sock_fd=67
[Sat Aug 12 11:17:44 2006] [warn] (peruser: pid=2843 uid=99 child=9)
child_main(): input available ... resetting socket.
[Sat Aug 12 11:17:44 2006] [warn] (peruser: pid=2890 uid=5011 child=58)
receive_from_multiplexer(): recvmsg returned 72
[Sat Aug 12 11:17:44 2006] [warn] (peruser: pid=2890 uid=5011 child=58)
receive_from_multiplexer(): trans_sock=169631584 fdx=5 sock_fd=5
[Sat Aug 12 11:17:44 2006] [warn] (peruser: pid=2890 uid=5011 child=58)
receive_from_multiplexer(): header_len=63 headers="GET
/failid/nimepaev.dat HTTP/1.0\r
[Sat Aug 12 11:17:44 2006] [warn] (peruser: pid=2890 uid=5011 child=58)
receive_from_multiplexer(): header_len > 0, we got a request
[Sat Aug 12 11:17:44 2006] [warn] (peruser: pid=2890 uid=5011 child=58)
receive_from_multiplexer(): There is no body
[Sat Aug 12 11:17:44 2006] [warn] (peruser: pid=2890 uid=5011 child=58)
receive_from_multiplexer(): returning 0
[Sat Aug 12 11:17:44 2006] [warn] (peruser: pid=2890 uid=5011 child=58)
child_main(): CHECKING IF WE SHOULD CLONE A CHILD...
[Sat Aug 12 11:17:44 2006] [warn] (peruser: pid=2890 uid=5011 child=58)
child_main(): total_processors = 8, max_processors = 8
[Sat Aug 12 11:17:44 2006] [warn] (peruser: pid=2890 uid=5011 child=58)
child_main(): idle_processors = 3, min_free_processors = 2
[Sat Aug 12 11:17:44 2006] [warn] (peruser: pid=2890 uid=5011 child=58)
child_main(): marked jmpbuffer
[Sat Aug 12 11:17:44 2006] [warn] (peruser: pid=2890 uid=5011 child=58)
child_main(): calling process_socket()
[Sat Aug 12 11:17:44 2006] [warn] (peruser: pid=2890 uid=5011 child=58)
process_socket(): Creating dummy connection to use the vhost lookup api
[Sat Aug 12 11:17:44 2006] [warn] (peruser: pid=2890 uid=5011 child=58)
process_socket(): Looking up the right vhost
[Sat Aug 12 11:17:44 2006] [warn] (peruser: pid=2890 uid=5011 child=58)
process_socket(): Base server is p2.xservu.com, name based vhosts on
[Sat Aug 12 11:17:44 2006] [warn] (peruser: pid=2890 uid=5011 child=58)
process_socket(): child_num=58 sock=169631584 sock_fd=5\n
[Sat Aug 12 11:17:44 2006] [warn] (peruser: pid=2890 uid=5011 child=58)
process_socket(): type=WORKER 58
[Sat Aug 12 11:17:44 2006] [warn] (peruser: pid=2890 uid=5011 child=58)
peruser_process_connection(): function entered
[Sat Aug 12 11:17:44 2006] [warn] (peruser: pid=2890 uid=5011 child=58)
peruser_process_connection(): leaving (DECLINED)
[Sat Aug 12 11:17:44 2006] [warn] (peruser: pid=2890 uid=5011 child=58)
peruser_post_read(): WORKER 58
[Sat Aug 12 11:17:44 2006] [warn] (peruser: pid=2890 uid=5011 child=58)
peruser_post_read(): request for www.romance.pri.ee / (server
romance.pri.ee) seems to
[Sat Aug 12 11:17:44 2006] [warn] (peruser: pid=2890 uid=5011 child=58)
child_main(): returned from process_socket()
[Sat Aug 12 11:17:45 2006] [warn] (peruser: pid=2843 uid=99 child=9)
child_main(): input available ... resetting socket.
[Sat Aug 12 11:17:45 2006] [warn] (peruser: pid=2843 uid=99 child=9)
check_pipe_of_death(): WATCH: die_now=0
[Sat Aug 12 11:17:45 2006] [warn] (peruser: pid=2890 uid=5011 child=58)
child_main(): input available ... resetting socket.
[Sat Aug 12 11:17:45 2006] [warn] (peruser: pid=2890 uid=5011 child=58)
check_pipe_of_death(): WATCH: die_now=0
[Sat Aug 12 11:17:45 2006] [warn] (peruser: pid=2890 uid=5011 child=58)
child_main(): CHECKING IF WE SHOULD CLONE A CHILD...
[Sat Aug 12 11:17:45 2006] [warn] (peruser: pid=2890 uid=5011 child=58)
child_main(): total_processors = 8, max_processors = 8
[Sat Aug 12 11:17:45 2006] [warn] (peruser: pid=2890 uid=5011 child=58)
child_main(): idle_processors = 5, min_free_processors = 2
[Sat Aug 12 11:17:45 2006] [warn] (peruser: pid=2890 uid=5011 child=58)
child_main(): marked jmpbuffer
[Sat Aug 12 11:17:45 2006] [warn] (peruser: pid=2890 uid=5011 child=58)
child_main(): calling process_socket()
[Sat Aug 12 11:17:45 2006] [warn] (peruser: pid=2890 uid=5011 child=58)
process_socket(): Creating dummy connection to use the vhost lookup api
[Sat Aug 12 11:17:45 2006] [warn] (peruser: pid=2843 uid=99 child=9)
child_main(): marked jmpbuffer
[Sat Aug 12 11:17:45 2006] [warn] (peruser: pid=2843 uid=99 child=9)
child_main(): calling process_socket()
[Sat Aug 12 11:17:45 2006] [warn] (peruser: pid=2843 uid=99 child=9)
process_socket(): Creating dummy connection to use the vhost lookup api
[Sat Aug 12 11:17:45 2006] [notice] child pid 2843 exit signal
Segmentation fault (11)
[Sat Aug 12 11:17:45 2006] [notice] child pid 2890 exit signal
Segmentation fault (11)
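
Checking whether the crashing processes were already serving requests before the graceful restart can be automated against a log like the one above. This is a minimal sketch in Python, assuming the entries are wrapped onto two lines as in this mail; the function names and regular expressions are my own and not part of peruser.

```python
import re
from collections import defaultdict

# e.g. "[Sat Aug 12 11:17:45 2006] [warn] (peruser: pid=2843 uid=99 child=9) msg"
PERUSER_RE = re.compile(
    r"^\[[^\]]+\] \[\w+\] \(peruser: pid=(?P<pid>\d+) uid=\d+ child=\d+\)\s*(?P<msg>.*)$"
)
# e.g. "[Sat Aug 12 11:17:45 2006] [notice] child pid 2843 exit signal ..."
CRASH_RE = re.compile(r"^\[[^\]]+\] \[notice\] child pid (?P<pid>\d+) exit signal")


def join_wrapped(lines):
    """Re-join log entries that were wrapped onto a following line."""
    entry = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("["):
            if entry is not None:
                yield entry
            entry = line
        elif entry is not None:
            entry += " " + line
    if entry is not None:
        yield entry


def crash_report(lines):
    """Map each segfaulted pid to the peruser messages it logged before dying."""
    history = defaultdict(list)
    report = {}
    for entry in join_wrapped(lines):
        crash = CRASH_RE.match(entry)
        if crash and "Segmentation fault" in entry:
            pid = int(crash.group("pid"))
            report[pid] = list(history[pid])
            continue
        logged = PERUSER_RE.match(entry)
        if logged:
            history[int(logged.group("pid"))].append(logged.group("msg"))
    return report
```

Feeding the error log to `crash_report()` gives, per crashed pid, everything it logged up to the segfault, so the last message shows where it was. In the excerpt above, the last message from both pid 2843 and pid 2890 is the "Creating dummy connection to use the vhost lookup api" line in process_socket().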