Actually, what I am seeing is that Apache realizes that it needs to start more
worker threads (because the number of available worker threads falls below
min_spare_threads), and while the new processes and threads are being started,
the number of requests processed per second (rps) drops to near zero (the 5-12
range). The newly created threads take on new requests, but existing idle
threads are left waiting until the new processes have created all of their
threads. Moving the creation of the listener until after all of the threads have
been created seems to allow the existing threads to take on new work, while
keeping the newly created threads from taking on requests until all of the
threads have been created.
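
A rough sketch of that ordering, in Python rather than the worker MPM's C. The
names start_child, worker_main and listener_main are hypothetical and are not
taken from worker.c; the only point is that every worker thread has reported in
before the listener thread is created.

```python
import threading


def start_child(num_workers, worker_main, listener_main):
    """Start all worker threads first, then create the listener thread.

    Illustrative only: start_child, worker_main and listener_main are
    assumed names, not functions from the worker MPM.
    """
    up = threading.Semaphore(0)

    def run_worker(i):
        up.release()        # report that this worker thread is running
        worker_main(i)

    workers = [
        threading.Thread(target=run_worker, args=(i,),
                         name="worker-%d" % i, daemon=True)
        for i in range(num_workers)
    ]
    for t in workers:
        t.start()

    # Block until every worker has reported in. No listener exists yet,
    # so this child cannot accept connections during this window.
    for _ in workers:
        up.acquire()

    listener = threading.Thread(target=listener_main,
                                name="listener", daemon=True)
    listener.start()
    return workers, listener
```

With this ordering a half-started child never accepts connections, so the
children that are already running keep picking up the incoming work.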

It may not be a complete solution, but current testing seems to show that it limits
the amount of time during which server throughput drops to near zero.

I am still in the midst of testing this, so the results may not hold up in the
long run...

Paul J. Reder

Aaron Bannert wrote:

> On Thu, Apr 25, 2002 at 11:30:54AM -0400, Bill Stoddard wrote:
> 
>>Would someone care to see if this fixes the worker MPM performance problem reported
>>earlier on the list (requests-per-second dropping when clients exceeded threadsperchild)?
>>This patch defers starting the listener until -all- the workers have started.
>>
> 
> I'm not really sure how this would fix the performance problems, and given
> the current theory it might even exacerbate them. The current hypothesis
> is that when we run out of available workers in all available children,
> and we are waiting for a new child to be spawned, connections continue
> to be accepted and placed in a queue*, and as such aren't able to be
> immediately serviced as soon as the new child is started.
> 
> A simple fix would be to prevent the queue* from accepting more
> connections until there is an idle worker thread available. The reason
> I have hesitated to make this change is because it would alter the
> places where the listener thread may enter blocking calls, and would
> probably break graceful/non-graceful restarts. If I get a little
> time I will try to look into this again this weekend.
> 
> * When I say "queue" I really mean stack. In thinking about this problem
> over the last few days I realized that we should convert back to a true
> FIFO, otherwise it is possible for a request to sit at the bottom of the
> stack for a long time before it is serviced.
> 
> 
> Summary of worker bugs that need to be fixed:
> 
> - convert fd_queue back into a FIFO
> - add a counter that blocks ap_queue_pop() until there are available workers
>   (without breaking restarts/shutdown).
> - add a way to track open socket descriptors; when we get the signal to
>   do a hard shutdown of the server, walk down this set and close the fds
>   so we can halt any long-running requests.
> 
> -aaron
> 
> 
> 
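
For what it is worth, the idea above of not accepting more connections until a
worker thread is idle could be sketched roughly as follows. This is Python, not
the MPM's C, and IdleWorkerGate, worker_idle and claim_worker are made-up names
for illustration. It also ignores the restart/shutdown interaction that Aaron is
worried about, apart from letting the caller pass a timeout.

```python
import threading


class IdleWorkerGate:
    """Counts idle workers; the listener claims one before accepting.

    Hypothetical helper for illustration, not part of the worker MPM.
    """

    def __init__(self):
        self._cond = threading.Condition()
        self._idle = 0

    def worker_idle(self):
        """Called by a worker thread when it is ready for a connection."""
        with self._cond:
            self._idle += 1
            self._cond.notify()

    def claim_worker(self, timeout=None):
        """Called by the listener before it accepts a connection.

        Blocks until a worker is idle. Returns False if the timeout
        expires first, so the caller can check for a pending shutdown.
        """
        with self._cond:
            if not self._cond.wait_for(lambda: self._idle > 0, timeout):
                return False
            self._idle -= 1
            return True
```

The listener would call claim_worker() before each accept, so a connection is
only taken off the socket when some worker in this child can service it.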


-- 
Paul J. Reder
-----------------------------------------------------------
"The strength of the Constitution lies entirely in the determination of each
citizen to defend it.  Only if every single citizen feels duty bound to do
his share in this defense are the constitutional rights secure."
-- Albert Einstein

