No, it's dequeuebatchsize. Or at least that's what I've known and seen for years. It would be quite a thing if I'm wrong :D
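
A toy model may help show why dequeuebatchsize acts as an upper bound rather than a target. This is a sketch in Python, not rsyslog's code: `simulate`, the tick-based timing and all the numbers are assumptions made up for illustration. It only follows the behavior described in the quoted thread, where each pass takes whatever is pending, up to the maximum.

```python
def simulate(arrivals_per_tick, ticks_per_delivery, max_batch, ticks=1000):
    """Return (batch_sizes, backlog) for a single worker.

    Hypothetical model: every tick `arrivals_per_tick` messages are enqueued;
    one delivery keeps the worker busy for `ticks_per_delivery` ticks; when
    the worker is free it dequeues everything pending, capped at `max_batch`.
    """
    pending = 0
    busy_until = 0
    batches = []
    for tick in range(ticks):
        pending += arrivals_per_tick
        if tick >= busy_until and pending > 0:
            # The cap plays the role of dequeuebatchsize in this model.
            size = min(pending, max_batch)
            pending -= size
            batches.append(size)
            busy_until = tick + ticks_per_delivery
    return batches, pending


if __name__ == "__main__":
    # Fast receiver: nothing accumulates, so batches stay tiny.
    fast, backlog = simulate(10, 1, 40000)
    print(max(fast), backlog)    # 10 0

    # Slower receiver: batches grow on their own until they match the inflow.
    slow, backlog = simulate(10, 5, 40000)
    print(max(slow), backlog)    # 50 40

    # Cap below what the receiver needs: batches hit the cap, backlog grows.
    capped, backlog = simulate(10, 5, 20)
    print(max(capped), backlog)  # 20 6010
```

In this model, small batches together with a full queue (what Chenlin reports) never occur, which is why a full reproduction would be useful.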
-- 
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/

On Wed, Jun 17, 2015 at 11:35 PM, David Lang <[email protected]> wrote:

> I seem to remember seeing that there is a different variable for
> omelasticsearch to set the max bulk size for the ES insert as opposed to
> the batch size used internally by rsyslog. I don't remember what it is.
>
> David Lang
>
> On Wed, 17 Jun 2015, Radu Gheorghe wrote:
>
>> But I think what Chenlin is describing is a bug. He basically ends up
>> with small batches, but the queue is getting full. So rsyslog could
>> build bigger batches (there are messages in the queue) but it doesn't.
>> Am I right? If yes, it's a weird thing, I didn't see this issue before
>> :( Maybe a full reproduction (complete config of rsyslog + ES +
>> versions + OSes) would help?
>>
>> On Wed, Jun 17, 2015 at 6:47 PM, singh.janmejay <[email protected]>
>> wrote:
>>
>>> ES uses worker-pool for indexing (there is a worker-pool for
>>> bulk-indexing too). Prioritizing approach may not be easy, and
>>> possibly a little dangerous too, but sizing that thread-pool is
>>> definitely easy. Just size it to your need and it'll shape the
>>> batch-size optimally when under pressure (like David explained).
>>>
>>> On Wed, Jun 17, 2015 at 6:14 PM, chenlin rao <[email protected]>
>>> wrote:
>>>
>>>> well, there is something I can't understand: if rsyslog uses 10 msgs
>>>> per bulk because Elasticsearch keeps up with the sending speed, why
>>>> has the output queue reached maxsize, with discarded.nf/enqueued =
>>>> 90%?
>>>>
>>>> here is my configuration:
>>>>
>>>> ```
>>>> action (
>>>>     type="omelasticsearch"
>>>>     template="videotmpl"
>>>>     server="10.13.244.214"
>>>>     dynSearchIndex="on"
>>>>     searchIndex="videoIndexName"
>>>>     searchType="videoaccess"
>>>>     bulkmode="on"
>>>>     name="action_videoaccess-es1003"
>>>>     queue.size="1000000"
>>>>     queue.dequeuebatchsize="40000"
>>>>     queue.discardmark="950000"
>>>>     queue.highwatermark="600000"
>>>>     queue.lowwatermark="400000"
>>>>     queue.discardseverity="3"
>>>>     queue.dequeueslowdown="10000"
>>>>     queue.type="linkedlist"
>>>>     queue.maxdiskspace="15G"
>>>>     queue.maxfilesize="500M"
>>>>     queue.filename="action_videoaccess-es1003"
>>>>     queue.checkpointinterval="10000"
>>>>     queue.saveonshutdown="on"
>>>> )
>>>> ```
>>>>
>>>> and pstats.log:
>>>>
>>>> ```
>>>> 2015-06-17T12:17:48.708364+08:00 localhost rsyslogd-pstats: {"name":"action_videoaccess-es1003 queue[DA]","origin":"core.queue","size":27838434,"enqueued":9,"full":735,"discarded.full":9,"discarded.nf":0,"maxqsize":28153530}
>>>> 2015-06-17T12:17:48.708370+08:00 localhost rsyslogd-pstats: {"name":"action_videoaccess-es1003 queue","origin":"core.queue","size":950000,"enqueued":522298,"full":0,"discarded.full":0,"discarded.nf":442298,"maxqsize":950000}
>>>> ```
>>>>
>>>> btw: I tried slowdown settings from 10 to 10000; still 10 msgs per
>>>> bulk.
>>>>
>>>> 2015-06-17 19:54 GMT+08:00 Radu Gheorghe <[email protected]>:
>>>>
>>>>> That might work, thanks for the feedback and the interesting
>>>>> article!
>>>>>
>>>>> On Wed, Jun 17, 2015 at 12:58 PM, David Lang <[email protected]> wrote:
>>>>>
>>>>>> Probably a risk, something to keep an eye on (or watch the pstats
>>>>>> from rsyslog and tweak the priority if the queue gets too large)
>>>>>>
>>>>>> I also believe that the vast majority of searches that are
>>>>>> typically done are done wrong (see my dashboards/reports article at
>>>>>> https://www.usenix.org/publications/login/feb14/logging-reports-dashboards
>>>>>> )
>>>>>>
>>>>>> David Lang
>>>>>>
>>>>>> On Wed, 17 Jun 2015, Radu Gheorghe wrote:
>>>>>>
>>>>>>> This sounds interesting, David. I guess it's possible to renice
>>>>>>> just some threads from an app and make it "nicer", right? Googling
>>>>>>> a bit it seems it is possible.
>>>>>>>
>>>>>>> The only problem I see with this approach is that searches (and
>>>>>>> other kinds of requests from other threadpools
>>>>>>> <https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-threadpool.html>)
>>>>>>> would automatically have higher priority so, with heavy searches,
>>>>>>> indexing might fall behind more than usual. Am I getting it right?
>>>>>>>
>>>>>>> On Wed, Jun 17, 2015 at 11:53 AM, David Lang <[email protected]> wrote:
>>>>>>>
>>>>>>>> Thinking about it, probably the best thing to do is to renice the
>>>>>>>> ES threads that accept the messages from rsyslog. That way if
>>>>>>>> nothing else needs the capacity, everything works at the fastest
>>>>>>>> insert speed (even if less optimized than if there were larger
>>>>>>>> batches). But if anything else on the system needs the resources,
>>>>>>>> the indexing threads work slower, which will result in larger
>>>>>>>> batches.
>>>>>>>>
>>>>>>>> all self tuning.
>>>>>>>>
>>>>>>>> David Lang
>>>>>>>>
>>>>>>>> On Wed, 17 Jun 2015, Radu Gheorghe wrote:
>>>>>>>>
>>>>>>>>> Date: Wed, 17 Jun 2015 10:20:46 +0300
>>>>>>>>> From: Radu Gheorghe <[email protected]>
>>>>>>>>> Reply-To: rsyslog-users <[email protected]>
>>>>>>>>> To: rsyslog-users <[email protected]>
>>>>>>>>> Subject: Re: [rsyslog] how to force a larger omelasticsearch
>>>>>>>>> bulk size?
>>>>>>>>>
>>>>>>>>> Maybe this went overlooked, but David suggested earlier that you
>>>>>>>>> can slow down the queue to let more messages arrive before
>>>>>>>>> sending a bulk. queue.dequeueslowdown
>>>>>>>>> <http://www.rsyslog.com/doc/v8-stable/rainerscript/queue_parameters.html>
>>>>>>>>> is the option and it's in microseconds.
>>>>>>>>>
>>>>>>>>> I think you have a valid point in that if batches are too small
>>>>>>>>> then Elasticsearch will do more work than necessary (as indexing
>>>>>>>>> in very small batches is more expensive). Plus, since the
>>>>>>>>> refresh rate (i.e. how long it may take for an indexed doc to be
>>>>>>>>> visible to searches, because Searchers reopen their view in the
>>>>>>>>> index at a certain interval) is typically a few seconds
>>>>>>>>> <http://blog.sematext.com/2013/07/08/elasticsearch-refresh-interval-vs-indexing-performance/>,
>>>>>>>>> waiting a bit before submitting a batch will have no impact on
>>>>>>>>> the user experience.
>>>>>>>>>
>>>>>>>>> On the other hand, in my experience you'll be sending small
>>>>>>>>> batches if the indexing rate is low - which means the load on ES
>>>>>>>>> is low anyway. So I'm not sure if optimizing this will actually
>>>>>>>>> give significant results. You could introduce that slowdown, but
>>>>>>>>> then rsyslog may have trouble keeping up when the load is high.
>>>>>>>>> You can compensate by raising the limit of maximum worker
>>>>>>>>> threads for the queue (queue.workerthreads) and play with
>>>>>>>>> queue.workerthreadminimummessages and
>>>>>>>>> queue.timeoutworkerthreadshutdown to make rsyslog spawn new
>>>>>>>>> threads when there are at least N messages in the queue (that's
>>>>>>>>> what min messages does) and kill them when the queue is smaller
>>>>>>>>> than that for a while (that's the timeout option). If the load
>>>>>>>>> is low, you'd have just one thread that works with that
>>>>>>>>> slowdown.
>>>>>>>>>
>>>>>>>>> I hope this helps.
>>>>>>>>>
>>>>>>>>> Best regards,
>>>>>>>>> Radu
>>>>>>>>>
>>>>>>>>> On Wed, Jun 17, 2015 at 6:23 AM, chenlin rao
>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> So how can I define the output queue configuration?
>>>>>>>>>>
>>>>>>>>>> I found the omelasticsearch action processes 60000/min, and
>>>>>>>>>> the queue.discarded.nf was 600000.
>>>>>>>>>> I ran `tcpdump -i eth1 -s0 -A 'tcp dst port 9200' | grep
>>>>>>>>>> Content-Length` and saw the length is 1.6k. As my msgline size
>>>>>>>>>> is 0.1k, the bulk size is only 10. Too small.
>>>>>>>>>>
>>>>>>>>>> Sometimes when I restart rsyslogd, the Content-Length grows to
>>>>>>>>>> 8MB. Why~~
>>>>>>>>>>
>>>>>>>>>> 2015-05-06 1:39 GMT+08:00 David Lang <[email protected]>:
>>>>>>>>>>
>>>>>>>>>>> On Tue, 5 May 2015, chenlin rao wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I'm using rsyslog-elasticsearch to write nginx access logs
>>>>>>>>>>>> into an Elasticsearch cluster. The documentation says the
>>>>>>>>>>>> plugin would use queue.dequeuesize as the bulk size. But my
>>>>>>>>>>>> tcpdump shows that every POST only has 8-9 events in the
>>>>>>>>>>>> bulk body while my input flow is nearly 10k per second.
>>>>>>>>>>>>
>>>>>>>>>>>> How can I force a larger bulk size?
>>>>>>>>>>>
>>>>>>>>>>> Rsyslog adapts the size to the number of messages waiting to
>>>>>>>>>>> be delivered, so if it's keeping up at that size, it won't
>>>>>>>>>>> increase it.
>>>>>>>>>>>
>>>>>>>>>>> are you running impstats? if so, please look at the queue
>>>>>>>>>>> size. If it's staying low, then you just have a nice, fast ES
>>>>>>>>>>> instance that is able to do 1k inserts/sec (which is not
>>>>>>>>>>> unreasonable), so each insert would be <10 messages.
>>>>>>>>>>>
>>>>>>>>>>> Trying to force a larger bulk size would mean not inserting
>>>>>>>>>>> messages as fast as we can, and instead pausing and waiting
>>>>>>>>>>> for enough messages to accumulate to fill the bulk size. We
>>>>>>>>>>> never delay messages intentionally, each pass through the
>>>>>>>>>>> loop we grab all pending messages, up to the max dequeue
>>>>>>>>>>> size, and deliver them. If more messages arrive than we
>>>>>>>>>>> deliver, the next pass through the queue is larger, so we
>>>>>>>>>>> grab more messages (this quickly stabilizes to inserting
>>>>>>>>>>> messages as fast as they are arriving)
>>>>>>>>>>>
>>>>>>>>>>> there is a dequeue delay that forces rsyslog to sit and do
>>>>>>>>>>> nothing between one batch of messages and the next. Its use
>>>>>>>>>>> is discouraged, but delaying like this would allow more
>>>>>>>>>>> messages to accumulate.
>>>>>>>>>>>
>>>>>>>>>>> David Lang
>>>
>>> -- 
>>> Regards,
>>> Janmejay
>>> http://codehunk.wordpress.com

_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad
of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you
DON'T LIKE THAT.

