Hello everyone. I have a new finding, so I am reviving this thread.
Yesterday I used iptables to drop incoming TCP traffic on port 514 and watched how the omelasticsearch queue[DA] was consumed. At that time there were 6,000,000 messages in the DA queue, but tcpdump showed Content-Length < 8000, which means only 10+ messages per bulk.

2015-06-18 17:18 GMT+08:00 Rainer Gerhards <[email protected]>:

> 2015-06-18 5:13 GMT+02:00 chenlin rao <[email protected]>:
> > yes. you are right.
> >
> > FYI, This rsyslog server is sending a short msg with 100B size. And I use a
> > special ES template with _source disabled for it.
> >
> > I check another rsyslog server which is sending 600B longmsg and has also a
> > nearfull queue, it surely has a large bulk size(18MB). So I don't know
> > where the problem is.
>
> Can you pls disable the delay, run stats in 1-minute intervals and
> provide the pstats file for a couple of hours -- so that we can see
> how things evolve.
>
> Rainer
>
> > I use rsyslog-8.10.0.ad1-2.el6.x86_64 on CentOS6.2,
> > elasticsearch-1.5.1-1.noarch on CentOS6.5.
> >
> > 2015-06-18 4:21 GMT+08:00 Radu Gheorghe <[email protected]>:
> >
> >> But I think what Chenlin is describing is a bug. He basically ends up with
> >> small batches, but the queue is getting full. So rsyslog could build bigger
> >> batches (there are messages in the queue) but it doesn't. Am I right? If
> >> yes, it's a weird thing, I didn't see this issue before :( Maybe a full
> >> reproduction (complete config of rsyslog + ES + versions + OSes) would
> >> help?
> >>
> >> --
> >> Performance Monitoring * Log Analytics * Search Analytics
> >> Solr & Elasticsearch Support * http://sematext.com/
> >>
> >> On Wed, Jun 17, 2015 at 6:47 PM, singh.janmejay <[email protected]>
> >> wrote:
> >>
> >> > ES uses worker-pool for indexing(there is a worker-pool for
> >> > bulk-indexing too). Prioritizing approach may not be easy, and
> >> > possibly a little dangerous too, but sizing that thread-pool is
> >> > definitely easy.
> >> > Just size it to your need and it'll shape the
> >> > batch-size optimally when under pressure (like David explained).
> >> >
> >> > On Wed, Jun 17, 2015 at 6:14 PM, chenlin rao <[email protected]>
> >> > wrote:
> >> > > well, there is something I can't understand: If rsyslog use 10msg per bulk
> >> > > because Elasticsearch keep up the sending speed, why the output queue has a
> >> > > size reached maxsize and discarded.nf/enqueued = 90%.
> >> > >
> >> > > here is my configuration:
> >> > >
> >> > > ```
> >> > > action (
> >> > >     type="omelasticsearch"
> >> > >     template="videotmpl"
> >> > >     server="10.13.244.214"
> >> > >     dynSearchIndex="on"
> >> > >     searchIndex="videoIndexName"
> >> > >     searchType="videoaccess"
> >> > >     bulkmode="on"
> >> > >     name="action_videoaccess-es1003"
> >> > >     queue.size="1000000"
> >> > >     queue.dequeuebatchsize="40000"
> >> > >     queue.discardmark="950000"
> >> > >     queue.highwatermark="600000"
> >> > >     queue.lowwatermark="400000"
> >> > >     queue.discardseverity="3"
> >> > >     queue.dequeueslowdown="10000"
> >> > >     queue.type="linkedlist"
> >> > >     queue.maxdiskspace="15G"
> >> > >     queue.maxfilesize="500M"
> >> > >     queue.filename="action_videoaccess-es1003"
> >> > >     queue.checkpointinterval="10000"
> >> > >     queue.saveonshutdown="on"
> >> > > )
> >> > > ```
> >> > >
> >> > > and pstats.log:
> >> > >
> >> > > ```
> >> > > 2015-06-17T12:17:48.708364+08:00 localhost rsyslogd-pstats: {"name":"action_videoaccess-es1003 queue[DA]","origin":"core.queue","size":27838434,"enqueued":9,"full":735,"discarded.full":9,"discarded.nf":0,"maxqsize":28153530}
> >> > > 2015-06-17T12:17:48.708370+08:00 localhost rsyslogd-pstats: {"name":"action_videoaccess-es1003 queue","origin":"core.queue","size":950000,"enqueued":522298,"full":0,"discarded.full":0,"discarded.nf":442298,"maxqsize":950000}
> >> > > ```
> >> > >
> >> > > btw: I had try slowdown setting from 10 to
> >> > > 10000, no change to 10 msg per bulk.
> >> > >
> >> > > 2015-06-17 19:54 GMT+08:00 Radu Gheorghe <[email protected]>:
> >> > >
> >> > >> That might work, thanks for the feedback and the interesting article!
> >> > >>
> >> > >> --
> >> > >> Performance Monitoring * Log Analytics * Search Analytics
> >> > >> Solr & Elasticsearch Support * http://sematext.com/
> >> > >>
> >> > >> On Wed, Jun 17, 2015 at 12:58 PM, David Lang <[email protected]> wrote:
> >> > >>
> >> > >> > Probably a risk, something to keep an eye on (or watch the pstats from
> >> > >> > rsyslog and tweak the priority if the queue too large)
> >> > >> >
> >> > >> > I also believe that the vast majority of searches that are typically done
> >> > >> > are done wrong (see my dashboards/reports article at
> >> > >> > https://www.usenix.org/publications/login/feb14/logging-reports-dashboards
> >> > >> > )
> >> > >> >
> >> > >> > David Lang
> >> > >> >
> >> > >> > On Wed, 17 Jun 2015, Radu Gheorghe wrote:
> >> > >> >
> >> > >> >> This sounds interesting, David. I guess it's possible to renice just some
> >> > >> >> threads from an app and make it "nicer", right? Googling a bit it seems it
> >> > >> >> is possible.
> >> > >> >>
> >> > >> >> The only problem I see with this approach is that searches (and other kinds
> >> > >> >> of requests from other threadpools
> >> > >> >> <https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-threadpool.html>)
> >> > >> >> would automatically have higher priority so, with heavy searches, indexing
> >> > >> >> might fall behind more than usual. Am I getting it right?
> >> > >> >>
> >> > >> >> --
> >> > >> >> Performance Monitoring * Log Analytics * Search Analytics
> >> > >> >> Solr & Elasticsearch Support * http://sematext.com/
> >> > >> >>
> >> > >> >> On Wed, Jun 17, 2015 at 11:53 AM, David Lang <[email protected]> wrote:
> >> > >> >>
> >> > >> >>> Thinking about it, probably the best thing to do is to renice the ES
> >> > >> >>> threads that accept the messages from rsyslog. That way if nothing else
> >> > >> >>> needs the capacity, everything works at the fastest insert speed (even if
> >> > >> >>> less optimized than if there were larger batches) But if anything else on
> >> > >> >>> the system need the resources, the indexing threads work slower, which will
> >> > >> >>> result in larger batches.
> >> > >> >>>
> >> > >> >>> all self tuning.
> >> > >> >>>
> >> > >> >>> David Lang
> >> > >> >>>
> >> > >> >>> On Wed, 17 Jun 2015, Radu Gheorghe wrote:
> >> > >> >>>
> >> > >> >>>> Date: Wed, 17 Jun 2015 10:20:46 +0300
> >> > >> >>>> From: Radu Gheorghe <[email protected]>
> >> > >> >>>> Reply-To: rsyslog-users <[email protected]>
> >> > >> >>>> To: rsyslog-users <[email protected]>
> >> > >> >>>> Subject: Re: [rsyslog] how to force a larger omelasticsearch bulk size?
> >> > >> >>>>
> >> > >> >>>> Maybe this went overlooked, but David suggested earlier that you can
> >> > >> >>>> slowdown the queue to let more messages arrive before sending a bulk.
> >> > >> >>>> queue.dequeueslowdown
> >> > >> >>>> <http://www.rsyslog.com/doc/v8-stable/rainerscript/queue_parameters.html>
> >> > >> >>>> is the option and it's in microseconds.
> >> > >> >>>>
> >> > >> >>>> I think you have a valid point in that if batches are too small then
> >> > >> >>>> Elasticsearch will do more work than necessary (as indexing in very small
> >> > >> >>>> batches is more expensive). Plus, since the refresh rate (i.e. how long it
> >> > >> >>>> may take for an indexed doc to be visible to searches, because Searchers
> >> > >> >>>> reopen their view in the index at a certain interval) is typically a few
> >> > >> >>>> seconds
> >> > >> >>>> <http://blog.sematext.com/2013/07/08/elasticsearch-refresh-interval-vs-indexing-performance/>,
> >> > >> >>>> waiting a bit before submitting a batch will have no impact on the user
> >> > >> >>>> experience.
> >> > >> >>>>
> >> > >> >>>> On the other hand, in my experience you'll be sending small batches if the
> >> > >> >>>> indexing rate is low - which means the load on ES is low anyway. So I'm not
> >> > >> >>>> sure if optimizing this will actually give significant results. You could
> >> > >> >>>> introduce that slowdown, but then rsyslog may have trouble keeping up when
> >> > >> >>>> the load is high. You can compensate by raising the limit of maximum worker
> >> > >> >>>> threads for the queue (queue.workerthreads) and play with
> >> > >> >>>> queue.workerthreadminimummessages and queue.timeoutworkerthreadshutdown to
> >> > >> >>>> make rsyslog spawn new threads when there are at least N messages in the
> >> > >> >>>> queue (that's what min messages does) and kill them when the queue is
> >> > >> >>>> smaller than that for a while (that's the timeout option).
> >> > >> >>>> If the load is
> >> > >> >>>> low, you'd have just one thread that works with that slowdown.
> >> > >> >>>>
> >> > >> >>>> I hope this helps.
> >> > >> >>>>
> >> > >> >>>> Best regards,
> >> > >> >>>> Radu
> >> > >> >>>>
> >> > >> >>>> --
> >> > >> >>>> Performance Monitoring * Log Analytics * Search Analytics
> >> > >> >>>> Solr & Elasticsearch Support * http://sematext.com/
> >> > >> >>>>
> >> > >> >>>> On Wed, Jun 17, 2015 at 6:23 AM, chenlin rao <[email protected]>
> >> > >> >>>> wrote:
> >> > >> >>>>
> >> > >> >>>>> So how can I define the output queue configuration?
> >> > >> >>>>>
> >> > >> >>>>> I found the omelasticsearch action process 60000/min, and the
> >> > >> >>>>> queue.discarded.nf was 600000.
> >> > >> >>>>> I run `tcpdump -i eth1 -s0 -A 'tcp dst port 9200' | grep Content-Length`
> >> > >> >>>>> and saw the length is 1.6k. As my msgline size is 0.1k, the bulk size is
> >> > >> >>>>> only 10. Too small.
> >> > >> >>>>>
> >> > >> >>>>> Sometimes when I restart rsyslogd, the Content-Length grows to 8MB.
> >> > >> >>>>> Why~~
> >> > >> >>>>>
> >> > >> >>>>> 2015-05-06 1:39 GMT+08:00 David Lang <[email protected]>:
> >> > >> >>>>>
> >> > >> >>>>>> On Tue, 5 May 2015, chenlin rao wrote:
> >> > >> >>>>>>
> >> > >> >>>>>>> I'm using rsyslog-elasticsearch to writing nginx accesslog into
> >> > >> >>>>>>> Elasticsearch cluster. I found the document told that the plugin would use
> >> > >> >>>>>>> queue.dequeuesize as the bulk size.But my tcpdump show that every POST only
> >> > >> >>>>>>> has 8-9 events in the bulk body while my input flow is nearly 10k per
> >> > >> >>>>>>> second.
> >> > >> >>>>>>>
> >> > >> >>>>>>> How can I force a larger bulk size?
> >> > >> >>>>>>
> >> > >> >>>>>> Rsyslog adapts the size to the number of messages waiting to be delivered,
> >> > >> >>>>>> so if it's keeping up at that size, it won't increase it.
> >> > >> >>>>>>
> >> > >> >>>>>> are you running impstats? if so, please look at the queue size. If it's
> >> > >> >>>>>> staying low, then you just have a nice, fast ES instance that is able to do
> >> > >> >>>>>> 1k inserts/sec (which is not unreasonable), so each insert would be <10
> >> > >> >>>>>> messages.
> >> > >> >>>>>>
> >> > >> >>>>>> Trying to force a larger bulk size would mean not inserting messages as
> >> > >> >>>>>> fast as we can, and instead pausing and waiting for enough messages to
> >> > >> >>>>>> accumulate to fill the bulk size. We never delay messages intentionally,
> >> > >> >>>>>> each pass through the loop we grab all pending messages, up to the max
> >> > >> >>>>>> dequeue size, and deliver them. If more messages arrive than we deliver,
> >> > >> >>>>>> the next pass through the queue is larger, so we grab more messages (this
> >> > >> >>>>>> quickly stabilizes to inserting messages as fast as they are arriving)
> >> > >> >>>>>>
> >> > >> >>>>>> there is a dequeue delay that forces rsyslog to sit and do nothing between
> >> > >> >>>>>> one batch of messages and the next. It's use is discouraged, but delaying
> >> > >> >>>>>> like this would allow more messages to accumulate.
> >> > >> >>>>>>
> >> > >> >>>>>> David Lang
> >> > >> >>>>>> _______________________________________________
> >> > >> >>>>>> rsyslog mailing list
> >> > >> >>>>>> http://lists.adiscon.net/mailman/listinfo/rsyslog
> >> > >> >>>>>> http://www.rsyslog.com/professional-services/
> >> > >> >>>>>> What's up with rsyslog? Follow https://twitter.com/rgerhards
> >> > >> >>>>>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad
> >> > >> >>>>>> of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you
> >> > >> >>>>>> DON'T LIKE THAT.
> >> > --
> >> > Regards,
> >> > Janmejay
> >> > http://codehunk.wordpress.com
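For anyone who wants to reproduce the discarded.nf/enqueued figure from the pstats output quoted in this thread, here is a minimal Python sketch. It assumes impstats is writing JSON after the usual syslog prefix, as in the two lines above. The helper names (`parse_pstats_line`, `discard_ratio`, `summarize`) are made up for illustration and are not part of rsyslog.

```python
import json

# The two impstats lines quoted in this thread, rejoined onto single lines.
PSTATS_LINES = [
    '2015-06-17T12:17:48.708364+08:00 localhost rsyslogd-pstats: '
    '{"name":"action_videoaccess-es1003 queue[DA]","origin":"core.queue",'
    '"size":27838434,"enqueued":9,"full":735,"discarded.full":9,'
    '"discarded.nf":0,"maxqsize":28153530}',
    '2015-06-17T12:17:48.708370+08:00 localhost rsyslogd-pstats: '
    '{"name":"action_videoaccess-es1003 queue","origin":"core.queue",'
    '"size":950000,"enqueued":522298,"full":0,"discarded.full":0,'
    '"discarded.nf":442298,"maxqsize":950000}',
]


def parse_pstats_line(line):
    """Return the JSON payload of one pstats line as a dict.

    Assumption: the JSON object starts at the first '{' in the line.
    """
    return json.loads(line[line.index("{"):])


def discard_ratio(stats):
    """Fraction of enqueued messages that were discarded (full + not-full)."""
    enqueued = stats.get("enqueued", 0)
    if enqueued == 0:
        return 0.0
    dropped = stats.get("discarded.full", 0) + stats.get("discarded.nf", 0)
    return dropped / enqueued


def summarize(lines):
    """Map each queue name to its discard ratio."""
    result = {}
    for line in lines:
        stats = parse_pstats_line(line)
        result[stats["name"]] = discard_ratio(stats)
    return result


if __name__ == "__main__":
    for name, ratio in summarize(PSTATS_LINES).items():
        print(f"{name}: {ratio:.1%} of enqueued messages discarded")
```

With the numbers above, the in-memory queue comes out at 442298 / 522298, about 85%, which is the "90%" mentioned in the thread.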

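The "messages per bulk" numbers in this thread were derived from the Content-Length seen in tcpdump. A rough sketch of that arithmetic follows. It assumes each document in the bulk body costs one action/metadata line plus the message itself, each ending in a newline. The 60-byte action line is a guess for illustration, not a measured value, and `estimate_docs_per_bulk` is a hypothetical helper name.

```python
def estimate_docs_per_bulk(content_length, doc_bytes, action_line_bytes=60):
    """Estimate how many documents fit in one bulk request body.

    content_length    -- Content-Length of the POST, in bytes
    doc_bytes         -- average size of one rendered message, in bytes
    action_line_bytes -- ASSUMED size of the per-document action line
    """
    # Each document contributes its action line, the message, and two newlines.
    per_doc = doc_bytes + action_line_bytes + 2
    return content_length // per_doc


if __name__ == "__main__":
    # 1.6k and just under 8000 bytes are the Content-Length values from the thread;
    # 100 bytes is the message size reported for this server.
    for length in (1600, 8000):
        print(length, "bytes ->", estimate_docs_per_bulk(length, 100), "docs")
```

The result is only an order-of-magnitude check. The point is that a Content-Length of a few kilobytes cannot correspond to anything close to the configured queue.dequeuebatchsize of 40000.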
