That might work, thanks for the feedback and the interesting article!

--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/

On Wed, Jun 17, 2015 at 12:58 PM, David Lang <[email protected]> wrote:

> Probably a risk, something to keep an eye on (or watch the pstats from
> rsyslog and tweak the priority if the queue is too large).
>
> I also believe that the vast majority of searches that are typically done
> are done wrong (see my dashboards/reports article at
> https://www.usenix.org/publications/login/feb14/logging-reports-dashboards).
>
> David Lang
>
> On Wed, 17 Jun 2015, Radu Gheorghe wrote:
>
>> This sounds interesting, David. I guess it's possible to renice just some
>> threads from an app and make it "nicer", right? Googling a bit it seems it
>> is possible.
>>
>> The only problem I see with this approach is that searches (and other kinds
>> of requests from other threadpools
>> <https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-threadpool.html>)
>> would automatically have higher priority so, with heavy searches, indexing
>> might fall behind more than usual. Am I getting it right?
>>
>> --
>> Performance Monitoring * Log Analytics * Search Analytics
>> Solr & Elasticsearch Support * http://sematext.com/
>>
>> On Wed, Jun 17, 2015 at 11:53 AM, David Lang <[email protected]> wrote:
>>
>>> Thinking about it, probably the best thing to do is to renice the ES
>>> threads that accept the messages from rsyslog. That way if nothing else
>>> needs the capacity, everything works at the fastest insert speed (even if
>>> less optimized than if there were larger batches). But if anything else on
>>> the system needs the resources, the indexing threads work slower, which
>>> will result in larger batches.
>>>
>>> All self-tuning.
>>>
>>> David Lang
>>>
>>> On Wed, 17 Jun 2015, Radu Gheorghe wrote:
>>>
>>>> Date: Wed, 17 Jun 2015 10:20:46 +0300
>>>> From: Radu Gheorghe <[email protected]>
>>>> Reply-To: rsyslog-users <[email protected]>
>>>> To: rsyslog-users <[email protected]>
>>>> Subject: Re: [rsyslog] how to force a larger omelasticsearch bulk size?
>>>>
>>>> Maybe this was overlooked, but David suggested earlier that you can
>>>> slow down the queue to let more messages arrive before sending a bulk.
>>>> queue.dequeueslowdown
>>>> <http://www.rsyslog.com/doc/v8-stable/rainerscript/queue_parameters.html>
>>>> is the option and it's in microseconds.
>>>>
>>>> I think you have a valid point in that if batches are too small then
>>>> Elasticsearch will do more work than necessary (as indexing in very small
>>>> batches is more expensive). Plus, since the refresh rate (i.e. how long it
>>>> may take for an indexed doc to be visible to searches, because Searchers
>>>> reopen their view in the index at a certain interval) is typically a few
>>>> seconds
>>>> <http://blog.sematext.com/2013/07/08/elasticsearch-refresh-interval-vs-indexing-performance/>,
>>>> waiting a bit before submitting a batch will have no impact on the user
>>>> experience.
>>>>
>>>> On the other hand, in my experience you'll be sending small batches if the
>>>> indexing rate is low - which means the load on ES is low anyway. So I'm
>>>> not sure if optimizing this will actually give significant results. You
>>>> could introduce that slowdown, but then rsyslog may have trouble keeping
>>>> up when the load is high.
>>>> You can compensate by raising the limit of maximum worker threads for
>>>> the queue (queue.workerthreads) and play with
>>>> queue.workerthreadminimummessages and queue.timeoutworkerthreadshutdown
>>>> to make rsyslog spawn new threads when there are at least N messages in
>>>> the queue (that's what min messages does) and kill them when the queue is
>>>> smaller than that for a while (that's the timeout option). If the load is
>>>> low, you'd have just one thread that works with that slowdown.
>>>>
>>>> I hope this helps.
>>>>
>>>> Best regards,
>>>> Radu
>>>>
>>>> --
>>>> Performance Monitoring * Log Analytics * Search Analytics
>>>> Solr & Elasticsearch Support * http://sematext.com/
>>>>
>>>> On Wed, Jun 17, 2015 at 6:23 AM, chenlin rao <[email protected]> wrote:
>>>>
>>>>> So how can I define the output queue configuration?
>>>>>
>>>>> I found the omelasticsearch action processes 60000/min, and the
>>>>> queue.discarded.nf was 600000.
>>>>> I ran `tcpdump -i eth1 -s0 -A 'tcp dst port 9200' | grep Content-Length`
>>>>> and saw the length is 1.6k. As my msgline size is 0.1k, the bulk size is
>>>>> only 10. Too small.
>>>>>
>>>>> Sometimes when I restart rsyslogd, the Content-Length grows to 8MB. Why?
>>>>>
>>>>> 2015-05-06 1:39 GMT+08:00 David Lang <[email protected]>:
>>>>>
>>>>>> On Tue, 5 May 2015, chenlin rao wrote:
>>>>>>
>>>>>>> I'm using rsyslog-elasticsearch to write nginx access logs into an
>>>>>>> Elasticsearch cluster. I found the documentation says that the plugin
>>>>>>> would use queue.dequeuesize as the bulk size. But my tcpdump shows that
>>>>>>> every POST only has 8-9 events in the bulk body while my input flow is
>>>>>>> nearly 10k per second.
>>>>>>>
>>>>>>> How can I force a larger bulk size?
>>>>>>
>>>>>> Rsyslog adapts the size to the number of messages waiting to be
>>>>>> delivered, so if it's keeping up at that size, it won't increase it.
>>>>>>
>>>>>> Are you running impstats? If so, please look at the queue size. If it's
>>>>>> staying low, then you just have a nice, fast ES instance that is able to
>>>>>> do 1k inserts/sec (which is not unreasonable), so each insert would be
>>>>>> <10 messages.
>>>>>>
>>>>>> Trying to force a larger bulk size would mean not inserting messages as
>>>>>> fast as we can, and instead pausing and waiting for enough messages to
>>>>>> accumulate to fill the bulk size. We never delay messages intentionally;
>>>>>> each pass through the loop we grab all pending messages, up to the max
>>>>>> dequeue size, and deliver them. If more messages arrive than we deliver,
>>>>>> the next pass through the queue is larger, so we grab more messages
>>>>>> (this quickly stabilizes to inserting messages as fast as they are
>>>>>> arriving).
>>>>>>
>>>>>> There is a dequeue delay that forces rsyslog to sit and do nothing
>>>>>> between one batch of messages and the next. Its use is discouraged, but
>>>>>> delaying like this would allow more messages to accumulate.
>>>>>>
>>>>>> David Lang
>>>>>> _______________________________________________
>>>>>> rsyslog mailing list
>>>>>> http://lists.adiscon.net/mailman/listinfo/rsyslog
>>>>>> http://www.rsyslog.com/professional-services/
>>>>>> What's up with rsyslog? Follow https://twitter.com/rgerhards
>>>>>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad
>>>>>> of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you
>>>>>> DON'T LIKE THAT.
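
For anyone who wants to put rough numbers on the batching behaviour discussed in this thread, here is a minimal sketch in Python. It is not rsyslog code; it only models the description given in the thread, where each pass through the queue picks up whatever arrived during the previous pass, up to the maximum dequeue size. The function name, its parameters, and the assumption of a steady arrival rate with a fixed insert round trip are all hypothetical and chosen for illustration.

```python
def expected_batch_size(arrival_rate_per_s, insert_time_us, slowdown_us, max_batch):
    """Estimate messages per bulk request once the queue has stabilized.

    Simplified model (an assumption, not rsyslog's actual logic): one pass
    lasts one insert round trip plus the dequeue slowdown, and the next
    batch contains every message that arrived during that pass, capped at
    max_batch. All times are in microseconds, like queue.dequeueslowdown.
    """
    pass_time_us = insert_time_us + slowdown_us
    # Messages that arrive while one pass is in flight (integer arithmetic).
    pending = arrival_rate_per_s * pass_time_us // 1_000_000
    # A pass always delivers at least one message and never more than the cap.
    return max(1, min(max_batch, pending))
```

With roughly 10k messages per second and a 1 ms insert round trip (the figures estimated in the thread), `expected_batch_size(10_000, 1_000, 0, 1024)` gives 10, which matches the 8-9 events per POST seen in tcpdump. Adding a 50 ms slowdown, `expected_batch_size(10_000, 1_000, 50_000, 1024)`, gives 510, and a 200 ms slowdown hits the assumed cap of 1024. The cap of 1024 is only an example value, not a claim about rsyslog's default.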

