This sounds interesting, David. I guess it's possible to renice just some of
an application's threads to make them "nicer", right? A bit of Googling
suggests that it is.

The only problem I see with this approach is that searches (and other kinds
of requests from other threadpools
<https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-threadpool.html>)
would automatically have higher priority so, with heavy searches, indexing
might fall behind more than usual. Am I getting it right?

--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/

On Wed, Jun 17, 2015 at 11:53 AM, David Lang <[email protected]> wrote:

> Thinking about it, probably the best thing to do is to renice the ES
> threads that accept the messages from rsyslog. That way, if nothing else
> needs the capacity, everything works at the fastest insert speed (even if
> that is less optimized than with larger batches). But if anything else on
> the system needs the resources, the indexing threads work slower, which
> will result in larger batches.
>
> All self-tuning.
>
> David Lang
>
> On Wed, 17 Jun 2015, Radu Gheorghe wrote:
>
>> Date: Wed, 17 Jun 2015 10:20:46 +0300
>> From: Radu Gheorghe <[email protected]>
>> Reply-To: rsyslog-users <[email protected]>
>> To: rsyslog-users <[email protected]>
>> Subject: Re: [rsyslog] how to force a larger omelasticsearch bulk size?
>>
>> Maybe this was overlooked, but David suggested earlier that you can slow
>> down the queue to let more messages arrive before sending a bulk.
>> queue.dequeueslowdown
>> <http://www.rsyslog.com/doc/v8-stable/rainerscript/queue_parameters.html>
>> is the option, and its value is in microseconds.
>>
>> I think you have a valid point in that if batches are too small then
>> Elasticsearch will do more work than necessary (as indexing in very small
>> batches is more expensive). Plus, since the refresh rate (i.e. how long it
>> may take for an indexed doc to be visible to searches, because Searchers
>> reopen their view in the index at a certain interval) is typically a few
>> seconds
>> <http://blog.sematext.com/2013/07/08/elasticsearch-refresh-interval-vs-indexing-performance/>,
>> waiting a bit before submitting a batch will have no impact on the user
>> experience.
>>
>> On the other hand, in my experience you'll be sending small batches if the
>> indexing rate is low - which means the load on ES is low anyway. So I'm not
>> sure if optimizing this will actually give significant results. You could
>> introduce that slowdown, but then rsyslog may have trouble keeping up when
>> the load is high. You can compensate by raising the limit of maximum worker
>> threads for the queue (queue.workerthreads) and play with
>> queue.workerthreadminimummessages and queue.timeoutworkerthreadshutdown to
>> make rsyslog spawn new threads when there are at least N messages in the
>> queue (that's what min messages does) and kill them when the queue is
>> smaller than that for a while (that's the timeout option). If the load is
>> low, you'd have just one thread that works with that slowdown.
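>>
>> The options named above might be combined roughly as in the sketch below.
>> It is only an illustration: the server name and every number are
>> placeholder assumptions, not values tested in this thread, and queue.type
>> and queue.dequeuebatchsize are added here only to make the example
>> complete.
>>
>> ```
>> # Assumed values throughout; tune them against your own impstats output.
>> # queue.dequeueslowdown is in microseconds (100000 = 100 ms between batches).
>> # queue.timeoutworkerthreadshutdown is in milliseconds.
>> action(
>>     type="omelasticsearch"
>>     server="localhost"
>>     bulkmode="on"
>>     queue.type="linkedlist"
>>     queue.dequeuebatchsize="1000"
>>     queue.dequeueslowdown="100000"
>>     queue.workerthreads="4"
>>     queue.workerthreadminimummessages="10000"
>>     queue.timeoutworkerthreadshutdown="60000"
>> )
>> ```
>>
>> With these assumed numbers, light traffic would leave a single worker that
>> pauses 100 ms between batches, and extra workers would only start once the
>> queue holds more than 10000 messages.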
>>
>> I hope this helps.
>>
>> Best regards,
>> Radu
>>
>> On Wed, Jun 17, 2015 at 6:23 AM, chenlin rao <[email protected]> wrote:
>>
>>> So how can I define the output queue configuration?
>>> I found that the omelasticsearch action processes 60000/min, and
>>> queue.discarded.nf was 600000.
>>> I ran `tcpdump -i eth1 -s0 -A 'tcp dst port 9200'  | grep Content-Length`
>>> and saw that the length is 1.6k. As my message line size is 0.1k, the bulk
>>> size is only 10. Too small.
>>>
>>> Sometimes when I restart rsyslogd, the Content-Length grows to 8MB. Why?
>>>
>>> 2015-05-06 1:39 GMT+08:00 David Lang <[email protected]>:
>>>
>>>> On Tue, 5 May 2015, chenlin rao wrote:
>>>>
>>>>> I'm using rsyslog-elasticsearch to write nginx access logs into an
>>>>> Elasticsearch cluster. The documentation says that the plugin uses
>>>>> queue.dequeuebatchsize as the bulk size. But my tcpdump shows that every
>>>>> POST has only 8-9 events in the bulk body, while my input flow is nearly
>>>>> 10k per second.
>>>>>
>>>>> How can I force a larger bulk size?
>>>>
>>>> Rsyslog adapts the size to the number of messages waiting to be delivered,
>>>> so if it's keeping up at that size, it won't increase it.
>>>>
>>>> Are you running impstats? If so, please look at the queue size. If it's
>>>> staying low, then you just have a nice, fast ES instance that is able to do
>>>> 1k inserts/sec (which is not unreasonable), so each insert would be <10
>>>> messages.
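>>>>
>>>> For anyone not yet running impstats, a minimal setup might look like the
>>>> sketch below. The interval and the output file path are assumptions
>>>> chosen for illustration, not settings taken from this thread.
>>>>
>>>> ```
>>>> # Write rsyslog's internal counters to a dedicated file every 60 seconds.
>>>> # The file path is a placeholder; pick one that suits your system.
>>>> module(
>>>>     load="impstats"
>>>>     interval="60"
>>>>     severity="7"
>>>>     log.syslog="off"
>>>>     log.file="/var/log/rsyslog-stats.log"
>>>> )
>>>> ```
>>>>
>>>> In the resulting lines, the size and maxqsize counters of the queue in
>>>> front of the omelasticsearch action are the ones to watch.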
>>>>
>>>> Trying to force a larger bulk size would mean not inserting messages as
>>>> fast as we can, and instead pausing and waiting for enough messages to
>>>> accumulate to fill the bulk size. We never delay messages intentionally:
>>>> on each pass through the loop we grab all pending messages, up to the max
>>>> dequeue size, and deliver them. If more messages arrive than we deliver,
>>>> the queue is larger on the next pass, so we grab more messages (this
>>>> quickly stabilizes to inserting messages as fast as they are arriving).
>>>>
>>>> There is a dequeue delay that forces rsyslog to sit and do nothing between
>>>> one batch of messages and the next. Its use is discouraged, but delaying
>>>> like this would allow more messages to accumulate.
>>>>
>>>> David Lang
>>>> _______________________________________________
>>>> rsyslog mailing list
>>>> http://lists.adiscon.net/mailman/listinfo/rsyslog
>>>> http://www.rsyslog.com/professional-services/
>>>> What's up with rsyslog? Follow https://twitter.com/rgerhards
>>>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad
>>>> of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you
>>>> DON'T LIKE THAT.