On Wed, Nov 6, 2013 at 7:29 AM, Pavel Levshin <[email protected]> wrote:

> Hello.
>
> Currently, worker threads are started as needed, but they can virtually
> never be stopped. The algorithm is as follows:
>
> 1. The first thread is started when there is at least 1 message in the
> queue. Additional thread N is started when the queue holds at least
> (N-1)*WorkerThreadMinimumMessages messages.
> 2. A worker thread can stop after it has been sleeping for a while; this
> is controlled via the QueueWorkerTimeoutThreadShutdown parameter.
> 3. When there is some work in the queue, sleeping threads are awoken, one
> at a time. There is no way to select the most appropriate thread to wake.
> Furthermore, threads are awoken even when there are already enough
> workers. Therefore, all threads are awoken in round-robin fashion, and
> they never reach the timeout while there is some traffic in the queue.
>
> Having too many threads is not good for performance, because each thread
> adds overhead. Here is my proposed patch to make the thread pool
> shrinkable. It already works for me on a loaded server. Basically, this
> patch always selects the same threads to wake up, and limits the number of
> running threads to the advised maximum. That way, unneeded threads are
> able to sleep until the timeout. In general, it makes rsyslog behave
> closer to the docs.
>
> This modification has an important consequence: if one thread cannot cope
> with the traffic, the queue almost always holds more than
> (N-1)*WorkerThreadMinimumMessages messages.
> This reduces overhead, because threads are able to fetch more messages in
> each batch. On the other hand, it increases latency. If this is an issue,
> WorkerThreadMinimumMessages can be set to a lower value. Or, maybe, the
> formula for iMaxWorkers could be changed.
>
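For reference, the behaviour described above can be sketched as a toy
model. This is not rsyslog code: the names advised_workers, simulate and
max_workers, and the tick-based timing, are assumptions made purely for
illustration. The first function follows the (N-1)*WorkerThreadMinimumMessages
rule; the second compares round-robin wake-ups with always waking the same
(lowest-numbered) worker.

```python
def advised_workers(queue_size, min_messages, max_workers):
    """Thread N is wanted once the queue holds (N-1)*min_messages entries.

    max_workers is a hypothetical upper bound on the pool size.
    """
    if queue_size <= 0:
        return 0
    return min(queue_size // min_messages + 1, max_workers)


def simulate(policy, ticks, pool_size, timeout):
    """Toy model: one message arrives per tick and one worker is woken.

    A worker that stays asleep for `timeout` consecutive ticks shuts down
    (timeout is assumed to be >= 1). Returns the set of workers that
    shut down during the run.
    """
    idle = [0] * pool_size
    alive = [True] * pool_size
    next_rr = 0
    for _ in range(ticks):
        candidates = [i for i in range(pool_size) if alive[i]]
        if policy == "round_robin":
            # Old behaviour: wake-ups rotate through all sleeping workers.
            woken = candidates[next_rr % len(candidates)]
            next_rr += 1
        else:
            # "fixed": always prefer the lowest-numbered worker.
            woken = candidates[0]
        for i in candidates:
            idle[i] = 0 if i == woken else idle[i] + 1
            if idle[i] >= timeout:
                alive[i] = False
    return {i for i in range(pool_size) if not alive[i]}
```

In this model, simulate("round_robin", 100, 4, 10) keeps all four workers
alive because every worker is woken before its timeout expires, whereas
simulate("fixed", 100, 4, 10) lets workers 1 to 3 time out.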
>
OK, I have begun to merge the patch and I really like it. Unfortunately,
there seems to be one case where it does not work. I could not yet test
this, maybe you can have a look into it.

The problem, I think, occurs with DA queues. If we look into queue.c, the
*same* condition variable is used for regular and DA workers, which reside
in different pools. This is so that when we need to go to disk, the disk
queue is also properly processed. I think (again, could not yet verify)
this can lead to stalls. Note that this will NOT occur with pure disk queues,
just with pure DA queues.
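
To make the suspected failure mode concrete, here is a deliberately
simplified model. It assumes (unverified, as noted above) that the stall
would come from a single wake-up on the shared condition variable landing
on a worker from the wrong pool. The functions signal_one and
signal_per_pool, and the idea of one condition variable per pool, are
hypothetical and not taken from queue.c.

```python
def signal_one(waiters, wanted_pool):
    """Model one wake-up on a condition variable shared by both pools.

    `waiters` is an ordered list of (pool, worker_id) tuples; the first
    entry is the one that gets woken. Returns True if the woken worker
    belongs to the pool that has work, False if the wake-up was wasted.
    """
    if not waiters:
        return False
    pool, _ = waiters[0]
    return pool == wanted_pool


def signal_per_pool(waiters, wanted_pool):
    """Same wake-up with a (hypothetical) separate condition variable
    per pool: it succeeds whenever that pool has any sleeping worker."""
    return any(pool == wanted_pool for pool, _ in waiters)


# Two regular workers sleep ahead of the single DA worker.
waiters = [("regular", 0), ("regular", 1), ("da", 0)]
wasted = not signal_one(waiters, "da")        # wake-up hits a regular worker
delivered = signal_per_pool(waiters, "da")    # DA worker is reachable
```

If wake-ups always pick the same sleeping thread, the wasted case would
repeat on every signal in this model, which is the kind of stall I have in
mind.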

It would be great if you could look into this; if not, I'll do so as soon
as I find a bit of time.

Thanks again,
Rainer

>
> --
> Pavel Levshin
>
>
> _______________________________________________
> rsyslog mailing list
> http://lists.adiscon.net/mailman/listinfo/rsyslog
> http://www.rsyslog.com/professional-services/
> What's up with rsyslog? Follow https://twitter.com/rgerhards
> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad
> of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you
> DON'T LIKE THAT.
>