I started working on some related ideas in a fork I created (naviserver-queues). The main thing I'm trying to improve is that the state-managing code is currently spread across several functions and different threads.
My approach is to have a separate monitor thread for each server that checks all the threads in each pool: that enough threads are running, that threads die when they have been around too long (i.e., have serviced too many requests), and that threads die when they are no longer needed. The driver thread still just queues the requests, but now there's an extra layer of control and, yes, overhead.

The question of how many threads are needed is an interesting one: is it better to create threads quickly in response to traffic, or to create them more slowly in case the traffic is very bursty? The answer is, of course, it depends. So I'm assuming that the available processing power (the number of threads) should correlate with how busy the server is: a server that is 50% busy should have 50% of its full capacity working. I'm using the wait queue length as a proxy for busyness: if the wait queue is 50% full, then 50% of maxthreads should be running. Since that alone seems to unnecessarily underpower the server, I added an adjustable correction factor to scale it (sketched in code below), so you can tune for eager thread creation and be close to maxthreads when the server is only 20% busy, or you can wait until the server is 80% busy before spawning more than a few threads. The idle end works similarly: if you tune for eager thread creation, threads wait longer before idling out.

I expect this approach to be somewhat self-balancing: if the queue grows, it starts getting serviced faster and stops growing; if it shrinks, it gets serviced more slowly and stops shrinking. There's room for experimentation on exactly how to tune it; in my revision this logic is all in one place, which keeps that simple.

My initial testing shows that the server handles the same throughput (total req/s is about the same) and is a bit more equitable (smaller spread between the slowest and fastest requests), but slightly less responsive (which is expected, since requests inherently spend longer in the wait queue). I'm still cleaning it up and it's definitely not ready for prime time, but I'd be interested to hear what others think.
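To make the thread-count calculation concrete, here is a minimal sketch of the idea (not the actual code in the fork; the function and parameter names are just illustrative):

#include <math.h>

/*
 * Scale the target number of running conn threads with how full the
 * wait queue is.  "eagerness" is the adjustable correction factor:
 * values above 1.0 create threads sooner, values near 1.0 wait longer.
 */
static int
TargetThreads(int queued, int maxqueue,
              int minthreads, int maxthreads, double eagerness)
{
    double fill = (maxqueue > 0) ? (double)queued / (double)maxqueue : 0.0;
    int    want = (int)ceil(fill * eagerness * (double)maxthreads);

    if (want < minthreads) {
        want = minthreads;
    }
    if (want > maxthreads) {
        want = maxthreads;
    }
    return want;
}

With eagerness = 5.0 you would be at maxthreads once the wait queue is 20% full; with eagerness = 1.25 you would not reach it until the queue is 80% full.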
-J

Gustaf Neumann wrote:
> I don't think that a major problem comes from the "racy"
> notification of queuing events to the connection threads.
> This has advantages (makes the OS responsible, which does this
> very efficiently, fewer mutex requirements) and disadvantages
> (little control).
>
> While the current architecture with the cond-broadcast is
> certainly responsible for the problem of simultaneously dying
> threads (the OS determines which thread sees the condition
> first, therefore round robin), a list of the linked connection
> threads does not help to determine how many threads are
> actually needed, how bursty thread creation should be, or how
> to handle short resource quenches (e.g. caused by locks, etc.).
> By having a conn-thread-queue, the threads have to update this
> queue with their status information (being created, warming up,
> free, busy, will-die), which requires some overhead and more
> mutex locks on the driver. The thread-status handling is
> currently done automatically; a "busy" request currently
> ignores the condition, etc.
>
> On the good side, we would have more control over the threads.
> When a dying thread notifies the conn-thread-queue, one can
> control thread creation via this hook the same way as in
> situations where requests are queued.
>
> Another good aspect is that the thread-idle-timeout starts to
> make sense again on busy sites. Currently, the thread reduction
> works via a counter, since unneeded threads die and won't be
> recreated unless the traffic requires it (which works quite
> well in practice). For busy sites, the thread-idle timeout is
> not needed this way.
>
> Currently we have one-way communication from the driver to the
> conn threads. With the conn-thread-list (or array), one has
> two-way communication, ... at least, that is how I understand
> this for now.
>
>> I think this is racy because all conn threads block on a single
>> condition variable. The driver thread and conn threads must cooperate
>> to manage the whole life cycle, and the code to manage the state is
>> spread around.
>>
>> If instead all conn threads were in a queue, each with its own
>> condition variable, the driver thread could have sole responsibility
>> for choosing which conn thread to run by signalling it directly,
>> probably in LIFO order rather than the current semi-round-robin order
>> which tends to cause all conn threads to expire at once. Conn threads
>> would return to the front of the queue, unless wishing to expire, in
>> which case they'd go on the back of the queue, and the driver would
>> signal when it was convenient to do so. Something like that...
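To make the queued-conn-thread idea above a bit more concrete, a rough sketch of such a structure could look like this; the type names and the pthread-based locking are just illustrative, not existing NaviServer structures:

#include <pthread.h>
#include <stddef.h>

/* States a conn thread would report to the queue. */
typedef enum {
    THR_CREATED, THR_WARMUP, THR_FREE, THR_BUSY, THR_WILLDIE
} ThreadStatus;

typedef struct ConnThread {
    pthread_cond_t     cond;    /* signalled directly by the driver */
    ThreadStatus       status;  /* updated under the queue lock     */
    struct ConnThread *next;
} ConnThread;

typedef struct ConnThreadQueue {
    pthread_mutex_t lock;   /* protects the list and the status fields  */
    ConnThread     *head;   /* front: most recently idled thread (LIFO) */
    ConnThread     *tail;   /* back: threads that are waiting to expire */
} ConnThreadQueue;

/*
 * Driver side: pop the most recently idled thread and wake it up.
 * Returns NULL if no idle thread is available, in which case the
 * caller would decide whether to create a new thread or leave the
 * request queued.
 */
static ConnThread *
SignalNextThread(ConnThreadQueue *q)
{
    ConnThread *t;

    pthread_mutex_lock(&q->lock);
    t = q->head;
    if (t != NULL) {
        q->head = t->next;
        if (q->head == NULL) {
            q->tail = NULL;
        }
        t->next = NULL;
        t->status = THR_BUSY;
        pthread_cond_signal(&t->cond);
    }
    pthread_mutex_unlock(&q->lock);
    return t;
}

In this sketch an idle thread would push itself back on the front of the queue before waiting on its own cond, while a thread that wants to expire would push itself on the back, so the driver only wakes it if nothing fresher is available.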