This sounds interesting, David. I guess it's possible to renice just some threads of an app to make them "nicer", right? After a bit of Googling, it seems it is.
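
A minimal sketch of per-thread renicing on Linux, where the nice value is a per-thread attribute and setpriority() accepts a thread ID. It assumes the native thread IDs are taken from a HotSpot thread dump (for example the output of jstack), that the dump prints them as a hex nid=0x... field, and that the Elasticsearch indexing threads carry "[bulk]" in their names. The function names and the sample thread names are hypothetical, not part of rsyslog or Elasticsearch.

```python
import os
import re

# One thread header line of a thread dump is assumed to look like:
# "some-name" #42 daemon prio=5 os_prio=0 tid=0x... nid=0x1a2b runnable [...]
# The nid field is assumed to be the native (kernel) thread ID in hex.
THREAD_HEADER = re.compile(r'^"(?P<name>[^"]*)".*\bnid=0x(?P<nid>[0-9a-fA-F]+)')


def native_thread_ids(thread_dump, name_part):
    """Return native thread IDs of threads whose name contains name_part."""
    tids = []
    for line in thread_dump.splitlines():
        match = THREAD_HEADER.match(line)
        if match and name_part in match.group("name"):
            tids.append(int(match.group("nid"), 16))
    return tids


def renice_threads(tids, niceness=10):
    """Set the nice value of individual threads (Linux only).

    On Linux, PRIO_PROCESS combined with a thread ID affects just that
    thread. Raising niceness needs no privileges; lowering it usually does.
    """
    for tid in tids:
        os.setpriority(os.PRIO_PROCESS, tid, niceness)
```

Usage would be along the lines of renice_threads(native_thread_ids(dump_text, "[bulk]")), with dump_text holding a fresh thread dump of the Elasticsearch JVM. Thread pools can create new threads over time, so the step would need repeating.
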
The only problem I see with this approach is that searches (and other kinds of requests from other threadpools <https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-threadpool.html>) would automatically have higher priority, so with heavy searches indexing might fall behind more than usual. Am I getting it right?

--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/

On Wed, Jun 17, 2015 at 11:53 AM, David Lang <[email protected]> wrote:

> Thinking about it, probably the best thing to do is to renice the ES
> threads that accept the messages from rsyslog. That way if nothing else
> needs the capacity, everything works at the fastest insert speed (even if
> less optimized than if there were larger batches). But if anything else on
> the system needs the resources, the indexing threads work slower, which will
> result in larger batches.
>
> all self tuning.
>
> David Lang
>
> On Wed, 17 Jun 2015, Radu Gheorghe wrote:
>
> Date: Wed, 17 Jun 2015 10:20:46 +0300
>> From: Radu Gheorghe <[email protected]>
>> Reply-To: rsyslog-users <[email protected]>
>> To: rsyslog-users <[email protected]>
>> Subject: Re: [rsyslog] how to force a larger omelasticsearch bulk size?
>>
>> Maybe this was overlooked, but David suggested earlier that you can
>> slow down the queue to let more messages arrive before sending a bulk.
>> queue.dequeueslowdown
>> <http://www.rsyslog.com/doc/v8-stable/rainerscript/queue_parameters.html>
>> is the option and it's in microseconds.
>>
>> I think you have a valid point in that if batches are too small then
>> Elasticsearch will do more work than necessary (as indexing in very small
>> batches is more expensive). Plus, since the refresh interval (i.e. how long
>> it may take for an indexed doc to be visible to searches, because Searchers
>> reopen their view in the index at a certain interval) is typically a few
>> seconds
>> <http://blog.sematext.com/2013/07/08/elasticsearch-refresh-interval-vs-indexing-performance/>,
>> waiting a bit before submitting a batch will have no impact on the user
>> experience.
>>
>> On the other hand, in my experience you'll be sending small batches if the
>> indexing rate is low - which means the load on ES is low anyway. So I'm not
>> sure if optimizing this will actually give significant results. You could
>> introduce that slowdown, but then rsyslog may have trouble keeping up when
>> the load is high. You can compensate by raising the limit of maximum worker
>> threads for the queue (queue.workerthreads) and play with
>> queue.workerthreadminimummessages and queue.timeoutworkerthreadshutdown to
>> make rsyslog spawn new threads when there are at least N messages in the
>> queue (that's what min messages does) and kill them when the queue is
>> smaller than that for a while (that's the timeout option). If the load is
>> low, you'd have just one thread that works with that slowdown.
>>
>> I hope this helps.
>>
>> Best regards,
>> Radu
>>
>> --
>> Performance Monitoring * Log Analytics * Search Analytics
>> Solr & Elasticsearch Support * http://sematext.com/
>>
>> On Wed, Jun 17, 2015 at 6:23 AM, chenlin rao <[email protected]> wrote:
>>
>>> So how can I define the output queue configuration?
>>> I found the omelasticsearch action processes 60000/min, and the
>>> queue.discarded.nf was 600000.
>>> I ran `tcpdump -i eth1 -s0 -A 'tcp dst port 9200' | grep Content-Length`
>>> and saw the length is 1.6k. As my msgline size is 0.1k, the bulk size is
>>> only 10. Too small.
>>>
>>> Sometimes when I restart rsyslogd, the Content-Length grows to 8MB. Why?
>>>
>>> 2015-05-06 1:39 GMT+08:00 David Lang <[email protected]>:
>>>
>>>> On Tue, 5 May 2015, chenlin rao wrote:
>>>>
>>>>> I'm using rsyslog-elasticsearch to write nginx access logs into an
>>>>> Elasticsearch cluster. The documentation says that the plugin uses
>>>>> queue.dequeuesize as the bulk size. But my tcpdump shows that every POST
>>>>> only has 8-9 events in the bulk body while my input flow is nearly 10k
>>>>> per second.
>>>>>
>>>>> How can I force a larger bulk size?
>>>>
>>>> Rsyslog adapts the size to the number of messages waiting to be delivered,
>>>> so if it's keeping up at that size, it won't increase it.
>>>>
>>>> are you running impstats? if so, please look at the queue size. If it's
>>>> staying low, then you just have a nice, fast ES instance that is able to do
>>>> 1k inserts/sec (which is not unreasonable), so each insert would be <10
>>>> messages.
>>>>
>>>> Trying to force a larger bulk size would mean not inserting messages as
>>>> fast as we can, and instead pausing and waiting for enough messages to
>>>> accumulate to fill the bulk size. We never delay messages intentionally,
>>>> each pass through the loop we grab all pending messages, up to the max
>>>> dequeue size, and deliver them. If more messages arrive than we deliver,
>>>> the next pass through the queue is larger, so we grab more messages (this
>>>> quickly stabilizes to inserting messages as fast as they are arriving)
>>>>
>>>> there is a dequeue delay that forces rsyslog to sit and do nothing between
>>>> one batch of messages and the next. Its use is discouraged, but delaying
>>>> like this would allow more messages to accumulate.
>>>>
>>>> David Lang
>>>> _______________________________________________
>>>> rsyslog mailing list
>>>> http://lists.adiscon.net/mailman/listinfo/rsyslog
>>>> http://www.rsyslog.com/professional-services/
>>>> What's up with rsyslog? Follow https://twitter.com/rgerhards
>>>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad
>>>> of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you
>>>> DON'T LIKE THAT.
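
The batch sizes discussed in this thread follow from simple arithmetic, sketched below under a deliberately simplified model: every pass through the queue takes one bulk request plus the optional dequeue slowdown, and whatever arrived in the meantime becomes the next batch, capped at the maximum dequeue size. The function and parameter names are hypothetical, and the model ignores worker threads, network jitter and varying insert times.

```python
def steady_state_batch(arrival_rate, request_seconds, slowdown_us=0, max_batch=None):
    """Estimate messages per bulk request once the queue has stabilised.

    arrival_rate:    messages arriving per second
    request_seconds: time one bulk request takes
    slowdown_us:     extra pause between batches, in microseconds
    max_batch:       optional cap on messages dequeued per pass
    """
    cycle_seconds = request_seconds + slowdown_us / 1_000_000
    batch = arrival_rate * cycle_seconds
    if max_batch is not None:
        batch = min(batch, max_batch)
    return batch


# 10k messages/s against an ES that completes about 1k bulk requests/s:
print(steady_state_batch(10_000, 0.001))
# The same load with a 100 ms slowdown between batches:
print(steady_state_batch(10_000, 0.001, slowdown_us=100_000))
```

Under these assumptions the first call gives roughly 10 messages per request, in line with the 8-9 events seen in tcpdump, and the second roughly 1010, which is how a dequeue delay trades latency for larger batches.
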

