Hi Taavi,

Thank you for reporting this problem. It is now fixed.

Patch from RC2:
http://opensource.mco2.net/download/apache/peruser/peruser-rc2-to-rc3-v7.patch

Full patch from vanilla Apache 2.2.17:
http://opensource.mco2.net/download/apache/peruser/peruser-rc3-full-v7.patch

Changes (from RC2):
* (v7) Bug fixed: multiplexers can now clone a processor child if all workers are busy.
* (v6) Bug fixed: apachectl graceful now works properly, without "long lost child" errors.
* (v5) Not released to the public.
* (v4) Code cleanup.
* (v4) Performance: children are started in ~25ms, 40 times faster than in RC2 (~1000ms).
* (v4) Bug fixed: now checking if total_processors is 1 (first access) to start StartProcessors.
* (v3) Performance: new child type (CHILD_TYPE_RESERVED) to avoid collisions (2 children trying to get the same free slot).
* (v3) Bug fixed: in RC2, wait_timeout was always 0, so it never slept to wait for new workers.
* (v2) Performance: StartProcessors, a new configuration directive to control the number of child processors per vhost at startup.
* (v2) Performance: children are started in ~50ms, 20 times faster than in RC2 (~1000ms).
* (v1) Performance: faster lookup of free slots (this is important on busy servers with many virtual hosts).
* (v1) Performance: faster processor counting, one single loop counts all processors.
* (v1) Bug fixed: bug when MinSpareProcessors is set to 0 (now all worker processes are killed when idle_timeout is reached).
* (v1) Bug fixed: free up slots when a WORKER or PROCESSOR unexpectedly dies.

--
Marcelo Coelho
marcelo at mco2.com.br

On Jan 5, 2011, at 8:44 AM, Taavi Sannik wrote:

> Hello again!
>
> I see that you have added a special case if MinProcessors is 0, then it will
> allow processor count to be below MinSpareProcessors (if IdleTimeout is
> reached).
> This gets really bad if one of the active workers hangs and all the other
> workers get killed, because no one would accept new connections and no one
> will clone new children.
>
> The steps to reproduce this:
> - use this peruser configuration:
> <IfModule peruser.c>
> ServerLimit 700
> MaxClients 700
> MinSpareProcessors 1
> MaxSpareProcessors 20
> MinProcessors 0
> MaxProcessors 80
> MaxRequestsPerChild 1000
> ExpireTimeout 7200
> IdleTimeout 10
> MinMultiplexers 3
> MaxMultiplexers 40
> MultiplexerIdleTimeout 120
> ProcessorWaitTimeout 5
> </IfModule>
> - create an infinite sleep script (for example in PHP:
> <?php while(true) sleep(1); ?>)
> - start the server and run lynx on this script
> (lynx http://hostname/sleep.php). Lynx will start to wait for the response.
> - if you look at the server-status or ps aux, you can see 2 workers (one of
> them is handling the sleep script, and the second is idle).
> - wait until IdleTimeout kicks in. There is now only the "sleeping" worker
> left and the virtualhost is no longer accessible.
> - run ab -c 100 -n 10000 against the dead virtualhost (you may need to repeat
> it a couple of times as ab times out).
> - the whole server is now not accessible: all multiplexers have been spawned
> and are trying to forward the requests to the dead virtualhost's workers, but
> there is no one to accept them.
>
> There would be 2 ways to fix this:
> - rewrite the child cloning part and make multiplexers able to clone other
> children
> - disallow setting MinSpareProcessors to 0. If MinProcessors is 0 then kill
> the idle workers only if there are no active workers and all the workers have
> reached their IdleTimeout limit.
>
>
> Cheers,
> Taavi
> _______________________________________________
> Peruser mailing list
> [email protected]
> http://www.telana.com/mailman/listinfo/peruser
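
For illustration, the second fix proposed in the quoted report (kill the idle workers only if there are no active workers and all the workers have reached their idle timeout) can be sketched as a small predicate. This is only a rough Python model of the rule for the MinProcessors 0 case, not peruser's actual C code; the Worker type, its fields and the function name may_reap_idle_workers are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Worker:
    # Hypothetical model of one processor child of a virtual host.
    busy: bool            # True while it is handling a request
    idle_seconds: float   # time since it last finished a request


def may_reap_idle_workers(workers: List[Worker], idle_timeout: float) -> bool:
    """Return True only when it is safe to kill the idle workers.

    Assumed rule (from the quoted proposal, MinProcessors 0 case):
    no worker is active, and every worker has been idle for at
    least idle_timeout seconds.
    """
    if not workers:
        return False  # nothing to reap
    if any(w.busy for w in workers):
        return False  # an active (possibly hung) worker exists: keep a spare
    return all(w.idle_seconds >= idle_timeout for w in workers)
```

With the scenario from the report (one worker stuck in the sleep script, one idle worker past IdleTimeout 10), the predicate returns False, so the idle worker would stay alive to accept connections and clone new children.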
