Hi Taavi,

Thank you for reporting this problem. It is now fixed.

Patch from RC2:
http://opensource.mco2.net/download/apache/peruser/peruser-rc2-to-rc3-v7.patch

Full patch from vanilla Apache 2.2.17:
http://opensource.mco2.net/download/apache/peruser/peruser-rc3-full-v7.patch

Changes (from RC2):

* (v7) Bug fixed: multiplexers can now clone a processor child if all workers 
are busy.
* (v6) Bug fixed: apachectl graceful now works properly, without "long lost 
child" errors
* (v5) Not released to public
* (v4) Code cleanup
* (v4) Performance: children are started in ~25ms, 40 times faster than in RC2 
(~1000ms)
* (v4) Bug fixed: the StartProcessors children are now started when 
total_processors is 1 (first access)
* (v3) Performance: new child type (CHILD_TYPE_RESERVED) to avoid collisions (two 
children trying to get the same free slot)
* (v3) Bug fixed: in RC2, wait_timeout was always 0, so the code never slept 
to wait for new workers.
* (v2) Performance: StartProcessors, new configuration directive to control the 
number of child processors per vhost at startup
* (v2) Performance: children are started in ~50ms, 20 times faster than in RC2 
(~1000ms)
* (v1) Performance: faster lookup of free slots (this is important on busy 
servers with many virtual hosts)
* (v1) Performance: faster processor counting; a single loop now counts all 
processors
* (v1) Bug fixed: bug when MinSpareProcessors is set to 0 (all worker 
processes are now killed when idle_timeout is reached)
* (v1) Bug fixed: slots are now freed when a WORKER or PROCESSOR dies 
unexpectedly
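
To illustrate the CHILD_TYPE_RESERVED idea from v3, here is a minimal sketch 
of how a free slot can be claimed so that two children never get the same 
one. This is not the code from the patch: the slot table, its size, the 
other state names and the use of C11 atomics are assumptions made for the 
example (a real MPM would keep the table in shared memory).

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical slot states. Only CHILD_TYPE_RESERVED is named in the
 * changelog; the other names and all values are illustrative. */
enum {
    CHILD_TYPE_UNKNOWN   = 0,  /* free slot */
    CHILD_TYPE_RESERVED  = 1,  /* claimed, child not started yet */
    CHILD_TYPE_PROCESSOR = 2   /* child running in this slot */
};

#define NUM_SLOTS 8  /* illustrative table size */

/* Static storage, so every slot starts out as CHILD_TYPE_UNKNOWN (0). */
static _Atomic int slot_type[NUM_SLOTS];

/* Claim the first free slot by atomically switching it from free to
 * reserved. If two callers race for the same slot, the compare-and-swap
 * succeeds for only one of them; the other moves on to the next slot.
 * Returns the slot index, or -1 if no slot is free. */
static int reserve_free_slot(void)
{
    for (int i = 0; i < NUM_SLOTS; i++) {
        int expected = CHILD_TYPE_UNKNOWN;
        if (atomic_compare_exchange_strong(&slot_type[i], &expected,
                                           CHILD_TYPE_RESERVED))
            return i;
    }
    return -1;
}

/* Give a slot back, e.g. when the child occupying it dies. */
static void release_slot(int i)
{
    assert(i >= 0 && i < NUM_SLOTS);
    atomic_store(&slot_type[i], CHILD_TYPE_UNKNOWN);
}
```

Two consecutive calls to reserve_free_slot() return different indexes, and 
a released slot becomes available to the next caller.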

--
Marcelo Coelho
marcelo at mco2.com.br


On Jan 5, 2011, at 8:44 AM, Taavi Sannik wrote:

> Hello again!
> 
> I see that you have added a special case: if MinProcessors is 0, the 
> processor count is allowed to drop below MinSpareProcessors (once IdleTimeout 
> is reached).
> This gets really bad if one of the active workers hangs and all the other 
> workers get killed, because no one would accept new connections and no one 
> would clone new children.
> 
> The steps to reproduce this:
> - use this peruser configuration:
> <IfModule peruser.c>
>    ServerLimit 700
>    MaxClients 700
>    MinSpareProcessors 1
>    MaxSpareProcessors 20
>    MinProcessors 0
>    MaxProcessors 80
>    MaxRequestsPerChild 1000
>    ExpireTimeout 7200
>    IdleTimeout 10
>    MinMultiplexers 3
>    MaxMultiplexers 40
>    MultiplexerIdleTimeout 120
>    ProcessorWaitTimeout 5
> </IfModule>
> - create an infinite sleep script (for example in PHP: <?php while(true) 
> sleep(1); ?>)
> - start the server and run lynx on this script (lynx 
> http://hostname/sleep.php). Lynx will start to wait for the response.
> - if you look at the server-status or ps aux, you can see 2 workers (one of 
> them is handling the sleep script, and the second is idle).
> - wait until IdleTimeout kicks in. There is now only the "sleeping" worker 
> left and the virtualhost is no longer accessible.
> - run ab -c 100 -n 10000 against the dead virtualhost (you may need to repeat 
> it a couple of times, as ab times out).
> - the whole server is now inaccessible: all multiplexers have been spawned 
> and are trying to forward the requests to the dead virtualhost's workers, but 
> there is no one to accept them.
> 
> There would be 2 ways to fix this:
> - rewrite the child cloning part and make multiplexers able to clone other 
> children
> - disallow setting MinSpareProcessors to 0. If MinProcessors is 0 then kill 
> the idle workers only if there are no active workers and all the workers have 
> reached their IdleTimeout limit.
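
The condition in the second option can be written as a single predicate. 
The sketch below only illustrates that rule and is not taken from the 
peruser source: struct worker, its fields and may_kill_idle_workers() are 
hypothetical names, and times are plain seconds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical per-worker record; not a peruser structure. */
struct worker {
    bool   busy;         /* currently handling a request */
    time_t last_active;  /* when the worker last finished a request */
};

/* Returns true only if no worker is busy and every worker has been idle
 * for at least idle_timeout seconds. A single busy (possibly hung) worker
 * keeps all idle workers alive, so someone can still accept connections. */
static bool may_kill_idle_workers(const struct worker *w, size_t n,
                                  time_t now, time_t idle_timeout)
{
    assert(w != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (w[i].busy)
            return false;  /* an active worker exists */
        if (now - w[i].last_active < idle_timeout)
            return false;  /* this worker has not timed out yet */
    }
    return true;
}
```

In the scenario from the steps above (one worker stuck in the sleep script, 
one idle worker), the predicate stays false, so the idle worker is kept.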
> 
> 
> Cheers,
> Taavi
> _______________________________________________
> Peruser mailing list
> [email protected]
> http://www.telana.com/mailman/listinfo/peruser
> 

