On Wed, 18 Nov 2015, Otis Gospodnetić wrote:

Hi,

On Wed, Nov 18, 2015 at 9:37 PM, David Lang wrote:

On Wed, 18 Nov 2015, Otis Gospodnetić wrote:

Hi Dave,

Thanks for the answer.  Ouch.  This sounds suboptimal :( and makes me feel
like I have no control :(.  Now that you said this, I have a feeling Radu
asked about this exact same thing at one point and had a similar
reaction.....

.... aha, yes, found it:


http://search-devops.com/m/PamuZftUHylTXGc1&subj=+rsyslog+Can+we+have+a+minimum+bulk+size+for+omelasticsearch+
+ issue from that thread: https://github.com/rsyslog/rsyslog/issues/495


suboptimal in overhead, but optimal in terms of latency.

It's also much simpler and safer (the number of bugs that happen in code
that has to implement timeouts to batch things up is appalling, and
troubleshooting such cases is really nasty). Delaying messages also extends
the window when something going wrong can cause them to be lost.

It does mean that in the absence of contention, usage ramps up much faster
than when messages are delayed, but as contention of any sort appears, it's
adapted to.
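
To make the adaptation concrete, here is a minimal Python sketch of the "take whatever is waiting, up to the batch size" pattern described above. It is only an illustration, not rsyslog's actual queue code; the name next_batch and the max_batch parameter are made up for this example.

```python
import queue


def next_batch(q, max_batch):
    """Return up to max_batch items without ever waiting for a batch to fill.

    Hypothetical helper for illustration; not part of rsyslog.
    """
    # Block only until at least one message is available.
    batch = [q.get()]
    # Then take whatever else is already waiting, without delaying.
    while len(batch) < max_batch:
        try:
            batch.append(q.get_nowait())
        except queue.Empty:
            break
    return batch
```

If the output is fast, each call returns a small batch almost immediately. If the output is slow, more messages pile up between calls, so the next batch is larger, which is the adaptation to contention.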

David Lang


Maybe it's just me, but I wish I could configure rsyslog differently for
different situations.  Maybe I'm OK with some small chance of data loss if
it buys me some performance.  Maybe I don't care so much about latency and
care more about throughput.  But now that you explained the behaviour I'll
stop creating noise - thanks! :)

It's not just you :-) That's why there's now an enhancement request to change this if/when the queue system gets revamped.

But to clarify, it's not a latency vs throughput tradeoff; it's latency vs efficiency, with throughput being the same with both approaches.

At very high traffic levels (when there is a backlog of at least batchsize messages waiting to be delivered), both approaches are essentially identical and will deliver messages at the same rate.

At lower traffic levels, the timed batch approach will sit idle, then have a burst of activity, while the current approach will deliver more, smaller batches. But over the same timeframe (even a few of the timed batches) the number of messages delivered will be the same. If the current approach falls behind at all, the next batch will just be larger to make up for it.
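
The comparison can be checked with a toy simulation. This is a rough sketch under simplifying assumptions (discrete ticks, at most one delivery per tick, a made-up flush_every timer for the timed variant); the function name simulate and its parameters are invented for the illustration and do not correspond to rsyslog settings.

```python
def simulate(arrivals, batch_size, strategy, flush_every=5):
    """Return the list of batch sizes delivered, one entry per delivery.

    arrivals[t] is the number of messages arriving in tick t.
    strategy is "immediate" (current approach) or "timed" (wait for a
    full batch or for the timer). At most one delivery happens per tick.
    """
    waiting = 0
    batches = []
    for tick, arrived in enumerate(arrivals, start=1):
        waiting += arrived
        if strategy == "immediate":
            ready = waiting > 0
        else:
            timer_fired = tick % flush_every == 0
            ready = waiting >= batch_size or (timer_fired and waiting > 0)
        if ready:
            sent = min(waiting, batch_size)
            batches.append(sent)
            waiting -= sent
    return batches


low = [2] * 10    # light traffic: 2 messages per tick
high = [50] * 10  # heavy traffic: backlog always exceeds batch_size

low_immediate = simulate(low, 10, "immediate")
low_timed = simulate(low, 10, "timed")
high_immediate = simulate(high, 10, "immediate")
high_timed = simulate(high, 10, "timed")
```

With the light load, the immediate strategy makes ten deliveries of 2 messages and the timed strategy makes two deliveries of 10, so both deliver 20 messages over the same ten ticks. With the heavy load, both strategies deliver a full batch of 10 on every tick and are indistinguishable.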


Part of the original thinking was that being more efficient would only result in resources sitting idle while waiting for logs to accumulate. (I will admit, this was 2006 or so, before virtualization became a really big thing. Virtualization does change the picture a little bit by making the resources one system doesn't use available to others instead of just sitting idle.)

not arguing, just trying to explain.

David Lang
