I started working on some related ideas in a fork I created (naviserver-queues). The main thing I'm trying to improve is that the state-managing code is currently spread across several functions and different threads.
My approach is to have a separate monitor thread for each server that checks all the threads in each pool: that enough threads are running, that threads die when they have been around too long (i.e., have serviced too many requests), and that threads die when they are no longer needed. The driver thread still just queues the requests, but now there's an extra layer of control and, yes, overhead.

The question of how many threads are needed is an interesting one: is it better to create threads quickly in response to traffic, or to create them more slowly in case the traffic is very bursty? The answer is, of course, it depends. So I'm assuming that the available processing power (the number of threads) should correlate with how busy the server is: a server that is 50% busy should have 50% of its full capacity working. I'm using the wait queue length as a proxy for busyness: if the wait queue is 50% full, then 50% of maxthreads should be running. Since that alone seems to unnecessarily underpower the server, I added an adjustable correction factor to scale it (sketched in code below), so you can tune for eager thread creation and be close to maxthreads when the server is only 20% busy, or you can wait until the server is 80% busy before spawning more than a few threads. The idle end works similarly: if you tune for eager thread creation, threads wait longer before idling out.

I expect this approach to be somewhat self-balancing: if the queue grows, it starts getting serviced faster and stops growing; if it shrinks, it gets serviced more slowly and stops shrinking. There's room for experimentation on exactly how to tune it; in my revision this logic is all in one place, which keeps that simple.

My initial testing shows that the server handles the same throughput (total req/s is about the same) and is a bit more equitable (smaller spread between the slowest and fastest requests), but slightly less responsive (which is expected, since requests inherently spend longer in the wait queue). I'm still cleaning it up and it's definitely not ready for prime time, but I'd be interested to hear what others think.
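To make the thread-count calculation concrete, here is a minimal sketch of the idea (not the actual code in the fork; the function and parameter names are just illustrative):

#include <math.h>

/*
 * Scale the target number of running conn threads with how full the
 * wait queue is.  "eagerness" is the adjustable correction factor:
 * values above 1.0 create threads sooner, values near 1.0 wait longer.
 */
static int
TargetThreads(int queued, int maxqueue,
              int minthreads, int maxthreads, double eagerness)
{
    double fill = (maxqueue > 0) ? (double)queued / (double)maxqueue : 0.0;
    int    want = (int)ceil(fill * eagerness * (double)maxthreads);

    if (want < minthreads) {
        want = minthreads;
    }
    if (want > maxthreads) {
        want = maxthreads;
    }
    return want;
}

With eagerness = 5.0 you would be at maxthreads once the wait queue is 20% full; with eagerness = 1.25 you would not reach it until the queue is 80% full.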
-J

Gustaf Neumann wrote:
> I don't think that a major problem comes from the "racy"
> notification of queuing events to the connection threads.
> This has advantages (makes the OS responsible, which does this
> very efficiently, fewer mutex requirements) and disadvantages
> (little control).
>
> While the current architecture with the cond-broadcast is
> certainly responsible for the problem of simultaneously dying
> threads (the OS determines which thread sees the condition
> first, therefore round robin), a list of the linked connection
> threads does not help to determine how many threads are
> actually needed, how bursty thread creation should be, or how
> to handle short resource quenches (e.g. caused by locks, etc.).
> By having a conn-thread-queue, the threads have to update this
> queue with their status information (being created, warming up,
> free, busy, will-die), which requires some overhead and more
> mutex locks on the driver. The thread-status handling is
> currently done automatically; a "busy" request currently
> ignores the condition, etc.
>
> On the good side, we would have more control over the threads.
> When a dying thread notifies the conn-thread-queue, one can
> control thread creation via this hook the same way as in
> situations where requests are queued.
>
> Another good aspect is that the thread-idle-timeout starts to
> make sense again on busy sites. Currently, the thread reduction
> works via a counter, since unneeded threads die and won't be
> recreated unless the traffic requires it (which works quite
> well in practice). For busy sites, the thread-idle timeout is
> not needed this way.
>
> Currently we have one-way communication from the driver to the
> conn threads. With the conn-thread-list (or array), one has
> two-way communication, ... at least, that is how I understand
> this for now.
>
>> I think this is racy because all conn threads block on a single
>> condition variable. The driver thread and conn threads must cooperate
>> to manage the whole life cycle, and the code to manage the state is
>> spread around.
>>
>> If instead all conn threads were in a queue, each with its own
>> condition variable, the driver thread could have sole responsibility
>> for choosing which conn thread to run by signalling it directly,
>> probably in LIFO order rather than the current semi-round-robin order
>> which tends to cause all conn threads to expire at once. Conn threads
>> would return to the front of the queue, unless wishing to expire, in
>> which case they'd go on the back of the queue, and the driver would
>> signal when it was convenient to do so. Something like that...
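To make the queued-conn-thread idea above a bit more concrete, a rough sketch of such a structure could look like this; the type names and the pthread-based locking are just illustrative, not existing NaviServer structures:

#include <pthread.h>
#include <stddef.h>

/* States a conn thread would report to the queue. */
typedef enum {
    THR_CREATED, THR_WARMUP, THR_FREE, THR_BUSY, THR_WILLDIE
} ThreadStatus;

typedef struct ConnThread {
    pthread_cond_t     cond;    /* signalled directly by the driver */
    ThreadStatus       status;  /* updated under the queue lock     */
    struct ConnThread *next;
} ConnThread;

typedef struct ConnThreadQueue {
    pthread_mutex_t lock;   /* protects the list and the status fields  */
    ConnThread     *head;   /* front: most recently idled thread (LIFO) */
    ConnThread     *tail;   /* back: threads that are waiting to expire */
} ConnThreadQueue;

/*
 * Driver side: pop the most recently idled thread and wake it up.
 * Returns NULL if no idle thread is available, in which case the
 * caller would decide whether to create a new thread or leave the
 * request queued.
 */
static ConnThread *
SignalNextThread(ConnThreadQueue *q)
{
    ConnThread *t;

    pthread_mutex_lock(&q->lock);
    t = q->head;
    if (t != NULL) {
        q->head = t->next;
        if (q->head == NULL) {
            q->tail = NULL;
        }
        t->next = NULL;
        t->status = THR_BUSY;
        pthread_cond_signal(&t->cond);
    }
    pthread_mutex_unlock(&q->lock);
    return t;
}

In this sketch an idle thread would push itself back on the front of the queue before waiting on its own cond, while a thread that wants to expire would push itself on the back, so the driver only wakes it if nothing fresher is available.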