Actually, it looks like there may be some conflicting documentation
around "queue.timeoutenqueue". From "Understanding Rsyslog Queues":
We cannot hold processing infinitely, not even when throttling.
For example, throttling the local log socket too long would cause
the system as a whole to come to a standstill. To prevent this, rsyslogd
times out after a configured period ("$<object>QueueTimeoutEnqueue",
specified in milliseconds) if no space becomes available. As a last
resort, it then discards the newly arrived message.
*If you do not like throttling, set the timeout to 0 - the message
will then immediately be discarded*. If you use a high timeout, be
sure you know what you do. If a high main message queue enqueue
timeout is set, it can lead to something like a complete hang of the
system. The same problem does not apply to action queues.
From "General Queue Parameters":
*queue.timeoutenqueue* number number is timeout in ms (1000ms is
1sec!), default 2000, *0 means indefinite*
Guess I won't tinker with that without a bit of clarification.
On 09/18/2014 12:15 AM, Devin Christensen wrote:
Thanks for the quick response. The other setting that I thought might
help is "queue.timeoutenqueue" which I was considering setting to 0 on
the action queue. The documentation leads me to believe this will
discard any new messages arriving to the action when the disk queue
reaches its max size. Does that sound right?
If I can isolate the discarded messages to those going to the omfwd
action that would be ideal. None of the other logs should cause back
pressure because they're not dependent on a remote host being up. Of
course, I think I should also add queue.discardmark and
queue.discardseverity to the main queue for additional reassurance.
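
For reference, the change being considered here could be sketched as below. This is only an illustration built from the action in the original post, trimmed to the queue-related parameters; whether a value of 0 means "discard immediately" or "wait indefinitely" is exactly the point on which the two documentation pages disagree, so it should be confirmed for the rsyslog version in use before relying on it.

```
# Sketch only: the omfwd action from the original post, trimmed,
# with the enqueue timeout under discussion added.
# Assumption: 0 makes the action queue drop new messages once it is
# full, as "Understanding Rsyslog Queues" describes.
local1.* action(
    type="omfwd"
    Target="remote.example.com"
    Port="4414"
    Protocol="tcp"
    queue.type="LinkedList"
    queue.filename="fwd_preformatted_to_logflume"
    queue.maxdisksize="1g"
    queue.timeoutenqueue="0"
)
```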
On 09/17/2014 11:54 PM, Radu Gheorghe wrote:
Hi Devin,
I'm not 100% sure about this, but it sounds like what you should do is
to apply queue.discardmark and queue.discardseverity on the main queue.
This should allow the action queue to fill up (to that 1GB), and put
pressure on the main queue. When the main queue has more than
$DISCARDMARK messages, it should begin discarding messages with a
severity number higher than $DISCARDSEVERITY.
You could go all-or-nothing with this, and discard everything
(severity=1 or maybe even 0 works?) when you hit 999999 messages, or
you can show a bit of mercy and, say, let only errors pass after you
have 800K messages in the queue. In the latter case you'd risk putting
pressure back on the socket, though.
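
The "bit of mercy" variant might look roughly like the sketch below. The 800000 mark comes from the suggestion above; the severity value is an assumption. Syslog severities run from 0 (emerg) to 7 (debug), with 3 being err and 4 being warning, and the sketch assumes messages whose numeric severity is equal to or higher than queue.discardseverity are dropped. If the comparison is strictly "higher than", the value would need to be 3 instead, so the exact rule is worth checking in the queue documentation.

```
# Sketch only: main queue from the original post with discard settings added.
# Assumption: once 800000 messages are queued, messages with numeric
# severity 4 (warning) or above are discarded, so err and more severe
# messages still get through.
main_queue(
    queue.type="LinkedList"
    queue.size="1000000"
    queue.discardmark="800000"
    queue.discardseverity="4"
)
```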
It sounds like you already know about all the queue parameters, but
just in case you missed the docs:
http://www.rsyslog.com/doc/master/rainerscript/queue_parameters.html
Best regards,
Radu
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/
On Thu, Sep 18, 2014 at 8:41 AM, Devin Christensen wrote:
I'm trying to configure an action queue so that it will discard all
messages immediately if it fills up its allocated disk space. The log
messages are coming in on the local socket. I just recovered from a
scenario where rsyslog was bringing systems to a halt, presumably
because back pressure is ending up on the local log socket, filling it
up, and letting nothing else write.
Here is my current configuration for my main queue and the action.
main_queue(
    queue.type="LinkedList"
    queue.size="1000000"
    queue.dequeuebatchsize="1000"
    queue.workerthreads="5"
    queue.dequeueslowdown="0"
)
local1.* action(
    type="omfwd"
    Target="remote.example.com"
    Port="4414"
    Protocol="tcp"
    template="preformatted"
    action.resumeRetryCount="-1"
    action.resumeInterval="15"
    queue.type="LinkedList"
    queue.size="100000"
    queue.highwatermark="60000"
    queue.lowwatermark="50000"
    queue.dequeuebatchsize="1000"
    queue.workerthreads="2"
    queue.filename="fwd_preformatted_to_logflume"
    queue.maxdisksize="1g"
    queue.maxfilesize="16m"
    queue.saveonshutdown="on"
)
In the event that the target (remote.example.com) is unavailable, I
would like logs to spool to disk up to 1 gigabyte, and discard
everything immediately after that. I want to avoid any back pressure
ending up on the local log socket. It's much more valuable for our
systems to continue running than to get all the log data.
My question is, what am I missing or completely messed up?
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a
myriad
of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you
DON'T LIKE THAT.