Re: [rsyslog] ruleset queues filled in start quickly and then smoothly

David Lang Wed, 30 Dec 2015 11:43:39 -0800

$RepeatedMsgReduction on is discouraged. It was useful 30 years ago when logswere looked through by people, but when you have automated tools looking at thelogs, this makes it impossible for the tool to actually know how many times alog message has happened.


don't do this:


if $msg contains 'session opened' then /dev/null
&~

instead do this

if $msg contains 'session opened' then ~

or

if $msg contains 'session opened' then stop

you really want a tag for imfile, the tag is what shows up as the programname inthe resulting log message


David Lang

On Wed, 30 Dec 2015, Muhammad Asif wrote:

Date: Wed, 30 Dec 2015 14:39:12 +0500
From: Muhammad Asif <[email protected]>
Reply-To: rsyslog-users <[email protected]>
To: rsyslog-users <[email protected]>
Subject: Re: [rsyslog] ruleset queues filled in start quickly and then  smoothly

I am really sorry I attached old link of config.  This is actual.
http://pastebin.com/H7JPvvzj

I am grateful to you for your detailed answer.

Thanks

On Wed, Dec 30, 2015 at 1:17 AM, David Lang <[email protected]> wrote:

On Tue, 29 Dec 2015, Muhammad Asif wrote:

Hi Guys,


I have three queues at rulesets level as following.


no, this config shows the queue at the action level, not the ruleset level

at the ruleset level, instead of:

ruleset(name="events-on-tcp"){
 action(type="omfwd" target="127.0.0.1" port="5170" protocol="tcp"
    name="tcp-events-queue" template="msgonly" queue.size="1000000"
    queue.dequeuebatchsize="2000" queue.dequeueslowdown="1000000"
    queue.workerthreads="2" queue.type="LinkedList" )
}

you would have:

ruleset(name="events-on-tcp" queue.size="1000000"
    queue.dequeuebatchsize="2000" queue.dequeueslowdown="1000000"
    queue.workerthreads="2" queue.type="LinkedList" ){
  action(type="omfwd" target="127.0.0.1" port="5170" protocol="tcp"
      name="tcp-events-queue" template="msgonly" )
}


http://pastebin.com/k4EWRwL7


you should try to use a unique name for everything, you use
'Events-on-TCP' for both a ruleset (and therefor the queue for the ruleset)
and the action. Reuse of names can cause stats data to be lost.

Queues stats.


http://pastebin.com/Js5bEqxm

I restart rsyslog service and then observer that ruleset queus get empty ,
then start to fill rapidly and then smootly as shown above stats.

Please help me in understanding this behavior.


so the interesting data for Events-on-TCP queue is:

grep core.queue Js5bEqxm.txt |grep Events-on-TCP |grep size= |cut -f
8,9,13 -d ' '
size=0 enqueued=0 maxqsize=0
size=0 enqueued=0 maxqsize=0
size=48776 enqueued=55520 maxqsize=48776
size=150353 enqueued=119577 maxqsize=150353
size=273953 enqueued=123600 maxqsize=273953
size=398355 enqueued=124402 maxqsize=398355
size=513015 enqueued=126660 maxqsize=513015
size=629848 enqueued=122833 maxqsize=629848
size=700003 enqueued=70156 maxqsize=700003
size=700000 enqueued=11997 maxqsize=700007
size=700010 enqueued=10 maxqsize=700010
size=700000 enqueued=17990 maxqsize=700011
size=700010 enqueued=10 maxqsize=700011
size=700000 enqueued=5990 maxqsize=700019
size=700009 enqueued=2009 maxqsize=700019
size=700006 enqueued=3997 maxqsize=700019
size=700006 enqueued=6000 maxqsize=700019
size=700016 enqueued=10 maxqsize=700019


so you have a flood of messages arriving shortly after you startup, which
is the senders noticing that you are there and sending their backlog. After
that the rate of new messages arriving stabilizes

But the fact that the queue size climbs up to ~70,000 and stays there
means that you are not delivering messages to your destination at a
reasonable rate. Having the queue build initially as you are receiving
hundreds of thousands of logs per sample period, but when the volume of new
logs drops to thousands of new logs per period, you should make headway in
delivering the logs faster than you receive them.

The fact that the events-on-tcp action is processing very few messages,
and only in bursts is saying that you are having problems delivering
messages. this may just be due to your throttling settings, you wait a LONG
time between bursts.

grep -e Events-on-TCP -e imtcp  Js5bEqxm.txt  |cut -f 7,8 -d ' '
origin=core.action processed=0
origin=imtcp submitted=0
origin=core.queue size=0
origin=core.action processed=0
origin=imtcp submitted=0
origin=core.queue size=0
origin=core.action processed=8744
origin=imtcp submitted=55549
origin=core.queue size=48776
origin=core.action processed=18000
origin=imtcp submitted=119548
origin=core.queue size=150353
origin=core.action processed=0
origin=imtcp submitted=123600
origin=core.queue size=273953
origin=core.action processed=0
origin=imtcp submitted=124402
origin=core.queue size=398355
origin=core.action processed=12000
origin=imtcp submitted=126660
origin=core.queue size=513015
origin=core.action processed=6000
origin=imtcp submitted=122833
origin=core.queue size=629848
origin=core.action processed=0
origin=imtcp submitted=70156
origin=core.queue size=700003
origin=core.action processed=12000
origin=imtcp submitted=12643
origin=core.queue size=700000
origin=core.action processed=0
origin=imtcp submitted=0
origin=core.queue size=700010
origin=core.action processed=18000
origin=imtcp submitted=17772
origin=core.queue size=700000
origin=core.action processed=0
origin=imtcp submitted=0
origin=core.queue size=700010
origin=core.action processed=6000
origin=imtcp submitted=5925
origin=core.queue size=700000
origin=core.action processed=2000
origin=imtcp submitted=2221
origin=core.queue size=700009
origin=core.action processed=4000
origin=imtcp submitted=3713
origin=core.queue size=700006
origin=core.action processed=6000
origin=imtcp submitted=5924
origin=core.queue size=700006
origin=core.action processed=0
origin=imtcp submitted=0
origin=core.queue size=700016

so during this time you have received 790946 messages, but only processed
92744 of them, leaving ~70,000 messages pending. the fact that the
submissions drop off may mean that the watermark settings are refusing to
accept more messages (the fact that the queue size is staying at almost
exactly 70k even when you have a wildly varying number of messages
processed is very suspcious.

try setting your batch size to 20 instead of 2000 and your delay to 10000
instead of 1000000 this is the same total limit, but should smooth things
out a lot so that you can see what the receiving processes can really
handle. Right now they are getting a huge batch of 2000 messages every
second and then nothing for the next second instead of a continuing stream
of messages.

David Lang
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad
of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you
DON'T LIKE THAT.

_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of 
sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE 
THAT.

_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of 
sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE 
THAT.

Re: [rsyslog] ruleset queues filled in start quickly and then smoothly

Reply via email to