@marcelo,

Could you please also add this patch to v16?

On Wed, 26 Jan 2011 10:26:18 +0100, Hannes Landeholm <[email protected]> wrote:
Just wanted to add a note to this.

I installed the long lost children patch yesterday and since then I
haven't seen this problem.

Regards,

Hannes

On 25 January 2011 17:49, Hannes Landeholm  wrote:
 To fix this I suggest changing the line in peruser.c:2111

 ret = apr_poll(pollset, num_listensocks, &n, 1000000);

 so that it uses a timeout of 1000000 microseconds (one second) instead of
 -1. Below it, add a check that the main process is still running, and
 quit if it is not.
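
 A minimal sketch of that idea, assuming plain POSIX poll() in place of
 apr_poll() so that it builds without APR. The helper names
 parent_alive() and poll_while_parent_alive() are hypothetical and do
 not exist in peruser.c. Note that poll() takes its timeout in
 milliseconds, whereas apr_poll() takes microseconds.

```c
#include <errno.h>
#include <poll.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 while our parent is still the process
 * that started us. When the parent dies, the child is re-parented and
 * getppid() no longer matches the pid recorded at startup. */
static int parent_alive(pid_t expected_parent)
{
    return getppid() == expected_parent;
}

/* Hypothetical helper: wait for activity on fds, waking up every
 * timeout_ms to check on the parent.
 * Returns the number of ready descriptors (> 0), 0 if the parent has
 * gone away, or -1 on a poll() error other than EINTR. */
static int poll_while_parent_alive(struct pollfd *fds, nfds_t nfds,
                                   pid_t expected_parent, int timeout_ms)
{
    for (;;) {
        int n = poll(fds, nfds, timeout_ms);

        if (n > 0)
            return n;               /* something to accept or read */
        if (n < 0 && errno != EINTR)
            return -1;              /* real error */

        /* Timed out or interrupted: is the parent still there? */
        if (!parent_alive(expected_parent))
            return 0;               /* caller should clean up and exit */
    }
}
```

 A child would record getppid() once right after the fork, pass it in
 as expected_parent, and exit when the function returns 0.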

 Hannes

 On 25 January 2011 16:56, Hannes Landeholm  wrote:
 > Oh yeah, and it's even worse because my automatic watchdog script
 > doesn't notice that Apache has stopped working: it sees live httpd
 > processes and thinks everything is hunky-dory.
 >
 > Hannes
 >
 > On 25 January 2011 16:55, Hannes Landeholm  wrote:
 >> Hi,
 >>
 >> I see a whole bunch of loose children that are stopped and refuse to
 >> exit even though their parent process died a long time ago. This has
 >> happened multiple times. I think it happens when the parent exits
 >> ungracefully, for example when it has crashed. Can you add a check that
 >> terminates child processes when the parent is killed? This is
 >> exceptionally annoying when multiplexers do this, since they hold the
 >> listen port and block Apache from restarting. Since not even automatic
 >> watchdog scripts can bring Apache back to life when that happens, I'd
 >> say this is a critical/major bug.
 >>
 >> Here's a backtrace for one of the borked kids:
 >>
 >> #0  0x00007f04fd20ef58 in *__GI___poll (fds=0x7fffe4e50740, nfds=2, timeout=<value optimized out>) at ../sysdeps/unix/sysv/linux/poll.c:83
 >> #1  0x00007f04fdd3c230 in apr_poll (aprset=0x1c44670, num=2, nsds=0x7fffe4e508c8, timeout=-1) at poll/unix/poll.c:120
 >> #2  0x000000000046c0da in child_main (child_num_arg=<value optimized out>) at peruser.c:2111
 >> #3  0x000000000046cfe9 in make_child (s=0xd34b38, slot=14) at peruser.c:2534
 >> ...
 >>
 >> It's also probably related to the earlier mutex warning/critical
 >> child error.
 >>
 >> Hannes
 >>
 >




_______________________________________________
Peruser mailing list
[email protected]
http://www.telana.com/mailman/listinfo/peruser
