Re: [rsyslog] omelasticsearch: send every N seconds even if bulk not full?

David Lang Wed, 18 Nov 2015 19:50:46 -0800

On Wed, 18 Nov 2015, Otis Gospodnetić wrote:

Hi,


On Wed, Nov 18, 2015 at 9:37 PM, David Lang <[email protected]> wrote:

On Wed, 18 Nov 2015, Otis Gospodnetić wrote:

Hi Dave,


Thanks for the answer.  Ouch.  This sounds suboptimal :( and makes me feel
like I have no control :(.  Now that you said this, I have a feeling Radu
asked about this exact same thing at one point and had a similar
reaction.....

.... aha, yes, found it:


http://search-devops.com/m/PamuZftUHylTXGc1&subj=+rsyslog+Can+we+have+a+minimum+bulk+size+for+omelasticsearch+
+ issue from that thread: https://github.com/rsyslog/rsyslog/issues/495


suboptimal in overhead, but optimal in terms of latency.

It's also much simpler and safer (the number of bugs that happen in code
that has to implement timeouts to batch things up is appalling, and
troubleshooting such cases is really nasty). Delaying messages also extends
the window when something going wrong can cause them to be lost.

It does mean that in the absence of contention, usage ramps up much faster
than when messages are delayed, but as contention of any sort appears, it's
adapted to.
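
[Illustration, not from the rsyslog source] A minimal sketch of the
adaptation described above, assuming a toy model in which time moves in
ticks, a fixed number of messages arrives per tick, and the output is
busy for send_cost_ticks after each send. The function and parameter
names are hypothetical. Each send takes whatever has queued up, capped
at batch_size, so a slower output automatically gets larger batches.

```python
def batch_sizes_under_contention(arrivals_per_tick, send_cost_ticks,
                                 batch_size, ticks):
    """Return the size of each batch sent in a toy tick-based model.

    Assumption: the output never waits for a batch to fill. As soon as
    it is free, it sends whatever is queued, up to batch_size.
    """
    queue = 0
    busy_until = 0  # first tick at which the output is free again
    sizes = []
    for tick in range(ticks):
        queue += arrivals_per_tick
        if tick >= busy_until and queue > 0:
            sent = min(queue, batch_size)
            sizes.append(sent)
            queue -= sent
            busy_until = tick + send_cost_ticks
    return sizes


# No contention: the output is free every tick, so batches stay small.
fast_output = batch_sizes_under_contention(2, 1, 100, 9)

# Contention: the output is busy for 3 ticks per send, so batches grow.
slow_output = batch_sizes_under_contention(2, 3, 100, 9)

print(fast_output)
print(slow_output)
```

In this model the fast output sends many batches of 2, while the slow
output sends fewer batches that settle at 6, with no timer involved.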

David Lang



Maybe it's just me, but I wish I could configure rsyslog differently for
different situations.  Maybe I'm OK with some small chance of data loss if
it buys me some performance.  Maybe I don't care so much about latency and
care more about throughput.  But now that you explained the behaviour I'll
stop creating noise - thanks! :)

It's not just you :-) That's why there's now an enhancement request to change this if/when the queue system gets revamped.

But to clarify, it's not a latency vs throughput tradeoff, it's latency vs efficiency, with throughput being the same with both approaches.

At very high traffic levels (when there is a backlog of at least batchsize messages waiting to be delivered), both approaches are essentially identical and will deliver messages at the same rate.

At lower traffic levels, the timed batch approach will sit idle, then have a burst of activity, while the current approach will deliver more, smaller batches. But over the same timeframe (even a few of the timed batches) the number of messages delivered will be the same. If the current approach falls behind at all, the next batch will just be larger to make up for it.
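
[Illustration, not from the rsyslog source] The comparison above can be
sketched with a small simulation. It assumes a toy model with one
possible send per tick; the function names, the interval parameter, and
the arrival numbers are all hypothetical and chosen only to show the
shape of the behaviour.

```python
def immediate_batches(arrivals, batch_size):
    """Current approach: each tick, send whatever is queued, up to batch_size."""
    queue = 0
    batches = []
    for arrived in arrivals:
        queue += arrived
        if queue > 0:
            sent = min(queue, batch_size)
            batches.append(sent)
            queue -= sent
    return batches, queue


def timed_batches(arrivals, batch_size, interval):
    """Timed approach: send when a batch is full, or every `interval` ticks."""
    queue = 0
    batches = []
    for tick, arrived in enumerate(arrivals, start=1):
        queue += arrived
        timer_fired = tick % interval == 0
        if queue >= batch_size or (timer_fired and queue > 0):
            sent = min(queue, batch_size)
            batches.append(sent)
            queue -= sent
    return batches, queue


# Low traffic: 3 messages per tick for 20 ticks, batch size 10.
low = [3] * 20
low_immediate, _ = immediate_batches(low, 10)
low_timed, _ = timed_batches(low, 10, 5)

# High traffic: 25 messages per tick, so a full batch is always waiting.
high = [25] * 10
high_immediate, _ = immediate_batches(high, 10)
high_timed, _ = timed_batches(high, 10, 5)

print(len(low_immediate), sum(low_immediate))
print(len(low_timed), sum(low_timed))
print(high_immediate == high_timed)
```

At the low rate both variants deliver the same total, but the immediate
variant does it in more, smaller batches. At the high rate the two
produce identical batch sequences, since a full batch is always ready.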

Part of the original thinking was that the performance impact boiled down to the idea that being more efficient only resulted in resources sitting idle while waiting for logs to accumulate (I will admit, this was 2006 or so, before virtualization became a really big thing, which does change the picture a little bit by making the resources one system doesn't use available to others instead of just sitting idle).


not arguing, just trying to explain.

David Lang

_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of 
sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE 
THAT.
