Re: [rsyslog] how to force a larger omelasticsearch bulk size?

David Lang Wed, 17 Jun 2015 02:58:34 -0700

Probably a risk, something to keep an eye on (or watch the pstats from rsyslog and tweak the priority if the queue gets too large).

I also believe that the vast majority of searches that are typically done are done wrong (see my dashboards/reports article at https://www.usenix.org/publications/login/feb14/logging-reports-dashboards )


David Lang

On Wed, 17 Jun 2015, Radu Gheorghe wrote:

This sounds interesting, David. I guess it's possible to renice just some
threads from an app and make them "nicer", right? Googling a bit, it seems it
is possible.

The only problem I see with this approach is that searches (and other kinds
of requests from other threadpools
<https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-threadpool.html>)
would automatically have higher priority so, with heavy searches, indexing
might fall behind more than usual. Am I getting it right?

--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/

On Wed, Jun 17, 2015 at 11:53 AM, David Lang <[email protected]> wrote:

Thinking about it, probably the best thing to do is to renice the ES
threads that accept the messages from rsyslog. That way, if nothing else
needs the capacity, everything works at the fastest insert speed (even if
less optimized than if there were larger batches). But if anything else on
the system needs the resources, the indexing threads work slower, which will
result in larger batches.

All self-tuning.

David Lang



On Wed, 17 Jun 2015, Radu Gheorghe wrote:

 Date: Wed, 17 Jun 2015 10:20:46 +0300

From: Radu Gheorghe <[email protected]>
Reply-To: rsyslog-users <[email protected]>
To: rsyslog-users <[email protected]>
Subject: Re: [rsyslog] how to force a larger omelasticsearch bulk size?

Maybe this was overlooked, but David suggested earlier that you can
slow down the queue to let more messages arrive before sending a bulk.
queue.dequeueslowdown
<http://www.rsyslog.com/doc/v8-stable/rainerscript/queue_parameters.html>
is the option, and it's in microseconds.
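To get a feel for how a slowdown given in microseconds translates into batch size, here is a back-of-the-envelope sketch. It is not rsyslog's logic: the function name and the cost model (a fixed per-request overhead plus a per-message cost) are assumptions, and the 900 µs / 10 µs figures are hypothetical values picked so that a 10k msg/s input settles at about 10 messages per bulk without any slowdown.

```python
def steady_state_batch(rate_per_s, overhead_us, per_msg_us, slowdown_us, max_batch):
    """Estimate the batch size at which arrivals and deliveries balance.

    Assumed model: one pass costs overhead_us + per_msg_us * batch + slowdown_us,
    and the next batch is whatever arrived during that pass, so
        batch = rate * (overhead + slowdown + per_msg * batch) / 1e6
    which is solved for batch below.
    """
    busy = rate_per_s * per_msg_us  # microseconds of per-message work per second
    if busy >= 1_000_000:
        # Per-message work alone fills every second: batches hit the cap.
        return float(max_batch)
    batch = rate_per_s * (overhead_us + slowdown_us) / (1_000_000 - busy)
    return min(batch, float(max_batch))


# Hypothetical numbers: 10k msg/s, 900 us overhead, 10 us per message.
print(steady_state_batch(10_000, 900, 10, 0, 1024))        # 10.0
print(steady_state_batch(10_000, 900, 10, 10_000, 1024))   # roughly 121
print(steady_state_batch(10_000, 900, 10, 100_000, 1024))  # 1024.0 (capped)
```

In this model a 10 ms slowdown already lifts the batch from about 10 to about 120 messages, and a 100 ms slowdown runs into the maximum batch size, at which point the queue would start to grow.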

I think you have a valid point in that if batches are too small then
Elasticsearch will do more work than necessary (as indexing in very small
batches is more expensive). Plus, since the refresh rate (i.e. how long it
may take for an indexed doc to be visible to searches, because Searchers
reopen their view in the index at a certain interval) is typically a few
seconds
<http://blog.sematext.com/2013/07/08/elasticsearch-refresh-interval-vs-indexing-performance/>,
waiting a bit before submitting a batch will have no impact on the user
experience.

On the other hand, in my experience you'll be sending small batches if the
indexing rate is low - which means the load on ES is low anyway. So I'm not
sure if optimizing this will actually give significant results. You could
introduce that slowdown, but then rsyslog may have trouble keeping up when
the load is high. You can compensate by raising the limit of maximum worker
threads for the queue (queue.workerthreads) and play with
queue.workerthreadminimummessages and queue.timeoutworkerthreadshutdown to
make rsyslog spawn new threads when there are at least N messages in the
queue (that's what min messages does) and kill them when the queue is
smaller than that for a while (that's the timeout option). If the load is
low, you'd have just one thread that works with that slowdown.
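The thread-scaling behaviour described above can be pictured with a very rough model. This is an assumption based on my reading of the rsyslog documentation for queue.workerthreadminimummessages (which gives the example that a setting of 200 with 201 queued messages leads to a second worker), not rsyslog's actual scheduling code. The function name and all numbers are illustrative, and the shutdown timeout is ignored.

```python
def workers_for(queue_len, min_messages, max_workers):
    """Rough count of worker threads for a given queue length.

    Assumed rule: one worker per started block of min_messages queued
    messages, never fewer than one, never more than max_workers
    (the queue.workerthreads limit).
    """
    if min_messages <= 0 or max_workers <= 0:
        raise ValueError("min_messages and max_workers must be positive")
    wanted = -(-queue_len // min_messages)  # ceiling division
    return max(1, min(max_workers, wanted))


# Hypothetical settings: a second worker above 200 queued messages, 4 at most.
for queued in (0, 200, 201, 5000):
    print(queued, workers_for(queued, 200, 4))
```

With these settings a quiet system keeps a single (slowed-down) worker, while a backlog of a few thousand messages brings in all four.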

I hope this helps.

Best regards,
Radu


On Wed, Jun 17, 2015 at 6:23 AM, chenlin rao <[email protected]>
wrote:

 So how can I define the output queue configuration?

I found that the omelasticsearch action processes 60000/min, and
queue.discarded.nf was 600000.
I ran `tcpdump -i eth1 -s0 -A 'tcp dst port 9200'  | grep Content-Length`
and saw that the length is 1.6k. As my message line size is 0.1k, the bulk
size is only 10. Too small.

Sometimes when I restart rsyslogd, the Content-Length grows to 8MB. Why?
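The Content-Length arithmetic above can be written down as a small helper. One caveat, stated as an assumption: in the Elasticsearch bulk format each document is preceded by an action line, so the per-event size on the wire is larger than the raw log line. The 60-byte action line used below is a hypothetical figure, not something measured in this thread, and the function name is made up for illustration.

```python
def estimate_bulk_events(content_length, doc_bytes, action_line_bytes=0):
    """Estimate how many events a single bulk POST carries.

    content_length    -- Content-Length of the request body, in bytes
    doc_bytes         -- approximate size of one event, in bytes
    action_line_bytes -- assumed size of the per-event bulk action line
    """
    per_event = doc_bytes + action_line_bytes
    if per_event <= 0:
        raise ValueError("per-event size must be positive")
    return content_length // per_event


print(estimate_bulk_events(1600, 100))      # 16: raw line size only
print(estimate_bulk_events(1600, 100, 60))  # 10: with the assumed action line
```

Either way, the estimate lands in the low tens of events per request, far below what a 10k msg/s input could fill.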

2015-05-06 1:39 GMT+08:00 David Lang <[email protected]>:

 On Tue, 5 May 2015, chenlin rao wrote:


I'm using rsyslog-elasticsearch to write nginx access logs into an
Elasticsearch cluster. I found that the documentation says the plugin
would use queue.dequeuesize as the bulk size. But my tcpdump shows that
every POST has only 8-9 events in the bulk body, while my input flow is
nearly 10k per second.

How can I force a larger bulk size?

Rsyslog adapts the size to the number of messages waiting to be delivered,
so if it's keeping up at that size, it won't increase it.

Are you running impstats? If so, please look at the queue size. If it's
staying low, then you just have a nice, fast ES instance that is able to do
1k inserts/sec (which is not unreasonable), so each insert would be <10
messages.

Trying to force a larger bulk size would mean not inserting messages as
fast as we can, and instead pausing and waiting for enough messages to
accumulate to fill the bulk size. We never delay messages intentionally;
each pass through the loop we grab all pending messages, up to the max
dequeue size, and deliver them. If more messages arrive than we deliver,
the next pass through the queue is larger, so we grab more messages (this
quickly stabilizes to inserting messages as fast as they are arriving).
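The loop described above can be sketched as a toy simulation. This is not rsyslog's code: the function name, the fixed per-request overhead (900 µs) and the per-message cost (10 µs) are hypothetical numbers, chosen only so that a 10k msg/s input settles near the batch size reported in this thread.

```python
def simulate_batches(arrival_rate, overhead_us, per_msg_us, max_batch, passes=8):
    """Return the batch sizes of successive passes through a dequeue loop.

    Each pass takes everything pending (up to max_batch) and delivers it.
    Messages that arrive while the pass is running wait for the next one.
    """
    pending = 1  # assume one message is waiting when the loop starts
    credit = 0   # fractional arrivals carried over, in message-microseconds
    sizes = []
    for _ in range(passes):
        batch = min(pending, max_batch)
        pending -= batch
        sizes.append(batch)
        # Assumed cost of delivering this batch.
        cycle_us = overhead_us + per_msg_us * batch
        # Whole messages that arrived during the pass; keep the remainder.
        credit += arrival_rate * cycle_us
        pending += credit // 1_000_000
        credit %= 1_000_000
    return sizes


print(simulate_batches(10_000, 900, 10, 1024))
# [1, 9, 10, 10, 10, 10, 10, 10]
```

Under these assumptions the batch size settles after two passes and then stays put, because each request finishes just as the next ten messages have arrived. Nothing pushes it higher unless delivery gets slower or input gets faster.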

There is a dequeue delay that forces rsyslog to sit and do nothing between
one batch of messages and the next. Its use is discouraged, but delaying
like this would allow more messages to accumulate.

David Lang
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad
of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you
DON'T LIKE THAT.
