Hi Stefan,

I'm working on this right now. The v16 patch will incorporate:

- long lost children patch
- PAM patch
- cpu_limit patch

But the PAM patch and cpu_limit will be disabled by default, so you only
enable them if you need them. Everybody will be happy and the patch will
continue to be a single patch.

--
Marcelo Coelho
marcelo at mco2.com.br

On Jan 26, 2011, at 8:12 AM, <[email protected]> <[email protected]> wrote:

> @marcelo,
>
> could you please also add this patch to v16?
>
> On Wed, 26 Jan 2011 10:26:18 +0100, Hannes Landeholm <[email protected]>
> wrote:
>> Just wanted to add a note to this.
>>
>> I installed the long lost children patch yesterday and since then I
>> haven't seen this problem.
>>
>> Regards,
>>
>> Hannes
>>
>> On 25 January 2011 17:49, Hannes Landeholm wrote:
>> To fix this I suggest changing the line in peruser.c:2111
>>
>> ret = apr_poll(pollset, num_listensocks, &n, 1000000);
>>
>> so it gets a time out of 1000000 (one second). And below it do a
>> check if the main process is still running. If not, quit.
>>
>> Hannes
>>
>> On 25 January 2011 16:56, Hannes Landeholm wrote:
>> > Oh yeah, and it's even worse since my automatic watchdog script
>> > doesn't even know apache doesn't work anymore since it sees alive
>> > httpd processes running and thinks everything is hunky dory.
>> >
>> > Hannes
>> >
>> > On 25 January 2011 16:55, Hannes Landeholm wrote:
>> >> Hi,
>> >>
>> >> I see a whole bunch of loose children that are stopped and refuse to
>> >> exit even though their parent process has died a long long time ago.
>> >> This has happened multiple times. I think it happens when the parent
>> >> exits ungracefully like when it's crashed. Can you add a check that
>> >> terminates child processes when the parent is killed? This is
>> >> exceptionally annoying when multiplexers do this since they block
>> >> apache from restarting as they block the listen port. Since not even
>> >> automatic watchdog scripts can bring back apache to life when that
>> >> happens I'd say this is a critical/major bug.
>> >>
>> >> Here's a backtrace for one of the borked kids:
>> >>
>> >> #0 0x00007f04fd20ef58 in *__GI___poll (fds=0x7fffe4e50740, nfds=2,
>> >> timeout=) at ../sysdeps/unix/sysv/linux/poll.c:83
>> >> #1 0x00007f04fdd3c230 in apr_poll (aprset=0x1c44670, num=2,
>> >> nsds=0x7fffe4e508c8, timeout=-1) at poll/unix/poll.c:120
>> >> #2 0x000000000046c0da in child_main (child_num_arg=<value optimized out>)
>> >> at peruser.c:2111
>> >> #3 0x000000000046cfe9 in make_child (s=0xd34b38, slot=14) at
>> >> peruser.c:2534
>> >> ...
>> >>
>> >> It's also probably related to the earlier mutex warning/critical
>> >> child error.
>> >>
>> >> Hannes
>
> _______________________________________________
> Peruser mailing list
> [email protected]
> http://www.telana.com/mailman/listinfo/peruser
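
A note on the fix suggested in the thread: the backtrace shows the child blocked in apr_poll with timeout=-1, and the proposal is to poll with a one-second timeout and then check whether the main process is still running. The following is only a minimal sketch of that idea in plain C, not the peruser patch itself. It uses poll(2) instead of apr_poll, so the timeout is given in milliseconds (1000) where apr_poll takes microseconds (1000000). The names parent_alive and poll_until_ready_or_orphaned are hypothetical, and comparing getppid() with the parent PID recorded at fork time is an assumption about how the "still running" check could be done.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 while the process that forked us is
 * still our parent. Once the parent dies the child is re-parented, so
 * getppid() no longer matches the PID recorded at fork time. */
static int parent_alive(pid_t expected_parent)
{
    return getppid() == expected_parent;
}

/* Hypothetical helper: wait for activity on fds, waking up once per
 * second to check on the parent.
 * Returns >0 (number of ready descriptors), 0 if the parent is gone,
 * or -1 on a poll() error other than EINTR. */
static int poll_until_ready_or_orphaned(struct pollfd *fds, nfds_t nfds,
                                        pid_t expected_parent)
{
    assert(fds != NULL);

    for (;;) {
        int ready = poll(fds, nfds, 1000);  /* 1000 ms = one second */

        if (ready > 0)
            return ready;                   /* something to do */
        if (ready < 0 && errno != EINTR)
            return -1;                      /* real error */

        /* Timed out or interrupted: quit if the parent has died. */
        if (!parent_alive(expected_parent))
            return 0;
    }
}
```

A child would record getppid() right after the fork and pass that value in on every call; a return value of 0 then means "the parent is gone, exit now", which would also release the listen port so apache can be restarted.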
