Re: [rsyslog] Help Optimizing Action Queue

Radu Gheorghe Thu, 18 Sep 2014 00:04:05 -0700

Hi Devin,

My understanding is that a timeoutenqueue of 0 can make you lose messages
while rsyslog is busy: when a message is handed from the main queue to the
action queue and the action queue is full, rsyslog waits for... no time...
for space to free up, and discards the message instead. I was losing tons
of messages during a performance test with that setting. After I changed it
back to the default, I had no more problems.


I'm still not 100% sure what timeoutenqueue does, but I'd leave it at the
default for now on the action queue. Maybe Rainer or someone else who
understands this stuff better can clarify.
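
For illustration, here is a minimal, untested sketch of that suggestion,
combined with the main-queue discard settings mentioned earlier in the
thread. The discard values are assumptions: 800000 echoes the "800K"
example below, and a discardseverity of 4 is only my reading of the docs
(drop warning and anything less severe once the mark is reached).
queue.timeoutenqueue is deliberately left out of the action so that the
default (2000 ms according to the docs) applies.

```
# Sketch only; thresholds are illustrative assumptions, not tested values.
main_queue(
    queue.type="LinkedList"
    queue.size="1000000"
    queue.discardmark="800000"
    queue.discardseverity="4"
)

# No queue.timeoutenqueue here, so the action queue keeps the default.
local1.* action(
    type="omfwd"
    Target="remote.example.com"
    Port="4414"
    Protocol="tcp"
    template="preformatted"
    action.resumeRetryCount="-1"
    action.resumeInterval="15"
    queue.type="LinkedList"
    queue.size="100000"
    queue.filename="fwd_to_remote"
    queue.maxdisksize="1g"
    queue.saveonshutdown="on"
)
```

With something like this, discarding would happen on the main queue by
severity rather than on the action queue by timeout, so please check the
numbers against your own traffic before relying on it.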

Best regards,
Radu

--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/

On Thu, Sep 18, 2014 at 9:52 AM, Devin Christensen <
[email protected]> wrote:

> Thanks for the clarification. In that case, am I right in thinking that
> the following action will discard all messages it receives when its disk
> queue reaches 1 gigabyte, and avoid any throttling?
>
> local1.* action(
>    type="omfwd"
>    Target="remote.example.com"
>    Port="4414"
>    Protocol="tcp"
>    template="preformatted"
>    action.resumeRetryCount="-1"
>    action.resumeInterval="15"
>    queue.type="LinkedList"
>    queue.size="100000"
>    queue.highwatermark="60000"
>    queue.lowwatermark="50000"
>    queue.dequeuebatchsize="1000"
>    queue.timeoutenqueue="0"
>    queue.workerthreads="2"
>    queue.filename="fwd_to_remote"
>    queue.maxdisksize="1g"
>    queue.maxfilesize="16m"
>    queue.saveonshutdown="on"
> )
>
> On 09/18/2014 12:43 AM, Rainer Gerhards wrote:
>
>> 2014-09-18 8:28 GMT+02:00 Devin Christensen <
>> [email protected]>:
>>
>>> Actually, it looks like there may be some conflicting documentation
>>> around "queue.timeoutenqueue". From "Understanding Rsyslog Queues"
>>>
>>>     We can not hold processing infinitely, not even when throttling.
>>>     For example, throttling the local log socket too long would cause
>>>     the system as a whole to come to a standstill. To prevent this,
>>>     rsyslogd times out after a configured period
>>>     ("$<object>QueueTimeoutEnqueue", specified in milliseconds) if no
>>>     space becomes available. As a last resort, it then discards the
>>>     newly arrived message.
>>>
>>>     *If you do not like throttling, set the timeout to 0 - the message
>>>     will then immediately be discarded*. If you use a high timeout, be
>>>     sure you know what you do. If a high main message queue enqueue
>>>     timeout is set, it can lead to something like a complete hang of the
>>>     system. The same problem does not apply to action queues.
>>>
>>>  From "General Queue Parameters"
>>>
>>>     *queue.timeoutenqueue* number number is timeout in ms (1000ms is
>>>     1sec!), default 2000, *0 means indefinite*
>>>
>>>
>> This is wrong, just fixed the doc. 0 is "discard immediately".
>>
>> Thx,
>> Rainer
>>
>>> Guess I won't tinker with that without a bit of clarification.
>>>
>>>
>>> On 09/18/2014 12:15 AM, Devin Christensen wrote:
>>>
>>>> Thanks for the quick response. The other setting that I thought might
>>>> help is "queue.timeoutenqueue", which I was considering setting to 0 on
>>>> the action queue. The documentation leads me to believe this will
>>>> discard any new messages arriving at the action when the disk queue
>>>> reaches its max size. Does that sound right?
>>>>
>>>> If I can isolate the discarded messages to those going to the omfwd
>>>> action that would be ideal. None of the other logs should cause back
>>>> pressure because they're not dependent on a remote host being up. Of
>>>> course, I think I should also add queue.discardmark and
>>>> queue.discardseverity to the main queue for additional reassurance.
>>>>
>>>> On 09/17/2014 11:54 PM, Radu Gheorghe wrote:
>>>>
>>>>> Hi Devin,
>>>>>
>>>>> I'm not 100% sure about this, but it sounds like what you should do is
>>>>> apply queue.discardmark and queue.discardseverity on the main queue.
>>>>> This should allow the action queue to fill up (to that 1GB) and put
>>>>> pressure on the main queue. When the main queue has more than
>>>>> $DISCARDMARK messages, it should begin discarding messages with a
>>>>> severity number higher than $DISCARDSEVERITY.
>>>>>
>>>>> You could go all-or-nothing with this and discard everything
>>>>> (severity=1 or maybe even 0 works?) when you hit 999999 messages, or
>>>>> you can show a bit of mercy and, say, let only errors pass after you
>>>>> have 800K messages in the queue. In the latter case you'd risk putting
>>>>> pressure back on the socket, though.
>>>>>
>>>>> It sounds like you already know about all the queue parameters, but
>>>>> just in case you missed the docs:
>>>>> http://www.rsyslog.com/doc/master/rainerscript/queue_parameters.html
>>>>>
>>>>> Best regards,
>>>>> Radu
>>>>> --
>>>>> Performance Monitoring * Log Analytics * Search Analytics
>>>>> Solr & Elasticsearch Support * http://sematext.com/
>>>>>
>>>>> On Thu, Sep 18, 2014 at 8:41 AM, Devin Christensen <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> I'm trying to configure an action queue so that it will discard all
>>>>>> messages immediately if it fills up its allocated disk space. The log
>>>>>> messages are coming in on the local socket. I just recovered from a
>>>>>> scenario where rsyslog was bringing systems to a halt, presumably
>>>>>> because back pressure is ending up on the local log socket, filling it
>>>>>> up, and letting nothing else write.
>>>>>>
>>>>>> Here is my current configuration for my main queue and the action.
>>>>>>
>>>>>> main_queue(
>>>>>>     queue.type="LinkedList"
>>>>>>     queue.size="1000000"
>>>>>>     queue.dequeuebatchsize="1000"
>>>>>>     queue.workerthreads="5"
>>>>>>     queue.dequeueslowdown="0"
>>>>>> )
>>>>>>
>>>>>> local1.* action(
>>>>>>     type="omfwd"
>>>>>>     Target="remote.example.com"
>>>>>>     Port="4414"
>>>>>>     Protocol="tcp"
>>>>>>     template="preformatted"
>>>>>>     action.resumeRetryCount="-1"
>>>>>>     action.resumeInterval="15"
>>>>>>     queue.type="LinkedList"
>>>>>>     queue.size="100000"
>>>>>>     queue.highwatermark="60000"
>>>>>>     queue.lowwatermark="50000"
>>>>>>     queue.dequeuebatchsize="1000"
>>>>>>     queue.workerthreads="2"
>>>>>>     queue.filename="fwd_preformatted_to_logflume"
>>>>>>     queue.maxdisksize="1g"
>>>>>>     queue.maxfilesize="16m"
>>>>>>     queue.saveonshutdown="on"
>>>>>> )
>>>>>>
>>>>>> In the event that the target (remote.example.com) is unavailable, I
>>>>>> would like logs to spool to disk up to 1 gigabyte, and discard
>>>>>> everything immediately after that. I want to avoid any back pressure
>>>>>> ending up on the local log socket. It's much more valuable for our
>>>>>> systems to continue running than to get all the log data.
>>>>>>
>>>>>> My question is, what am I missing or completely messed up?
>>>>>> _______________________________________________
>>>>>> rsyslog mailing list
>>>>>> http://lists.adiscon.net/mailman/listinfo/rsyslog
>>>>>> http://www.rsyslog.com/professional-services/
>>>>>> What's up with rsyslog? Follow https://twitter.com/rgerhards
>>>>>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a
>>>>>> myriad
>>>>>> of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you
>>>>>> DON'T LIKE THAT.
