Re: [rsyslog] Need help with high volume forwarding config

David Lang Sat, 03 Nov 2018 12:57:01 -0700

ok, the config you posted did specify threads

check and see if you are running out of entropy on your system, that could causeencryption to be delayed

Rainer, do you know if the encryption used /dev/random or /dev/urandom? if it'susing /dev/random that could be causing it to run out of entropy and haveproblems.


David Lang

On Sat, 3 Nov 2018, Rory Toma wrote:

I can't use imptcp as I'm using ssl. I'm not using threads because imtcpdoesn't recongnize it.
David Lang wrote on 11/3/18 12:51 PM:
I've run rsyslog up to 300k messages/sec (gig-e wire speed), so somethingis rong here
try using imptcp instead of imtcp.
also try eliminating the threads=3, you should not need more than onethread for this data rate. Rsyslog is very efficient but if you don'tactually have enough traffic to keep those threads busy, forcing morethreads can actually slow things down. Only increase the thread count whenyou see a thread maxing out the cpu.
David Lang


On Sat, 3 Nov 2018, Rory Toma via rsyslog wrote:
Date: Sat, 3 Nov 2018 12:39:01 -0700
From: Rory Toma via rsyslog <[email protected]>
To: rsyslog-users <[email protected]>,
    Rory Toma via rsyslog <[email protected]>
Cc: Rory Toma <[email protected]>
Subject: Re: [rsyslog] Need help with high volume forwarding config
My solution for this now is to set up a stream-based nginx proxy, and run4 different rsyslog servers on the host. The nginx will least_conn balanceto 4 local ports, and I'll restart 1 of the 4 rsyslog processes every 15minutes.
Rory Toma via rsyslog wrote on 11/2/18 9:04 PM:
Latest file made it longer, then gets stuck here: I must be hitting someprocess/OS limit.
Fri Nov  2 21:03:39 2018: global: origin=dynstats
Fri Nov 2 21:03:39 2018: forward1: origin=core.action processed=2104045failed=0 suspended=0 suspended.duration=0 resumed=0Fri Nov 2 21:03:39 2018: forward2: origin=core.action processed=0failed=0 suspended=0 suspended.duration=0 resumed=0
Fri Nov  2 21:03:39 2018: imtcp(110): origin=imtcp submitted=2104045
Fri Nov 2 21:03:39 2018: resource-usage: origin=impstatsutime=5376347690 stime=217621914 maxrss=242456 minflt=66148 majflt=0inblock=8 oublock=1480 nvcsw=1120907 nivcsw=8027 openfiles=4034Fri Nov 2 21:03:39 2018: forward1 queue[DA]: origin=core.queue size=0enqueued=0 full=0 discarded.full=0 discarded.nf=0 maxqsize=0Fri Nov 2 21:03:39 2018: forward1 queue: origin=core.queue size=0enqueued=2104045 full=0 discarded.full=0 discarded.nf=0 maxqsize=781Fri Nov 2 21:03:39 2018: forward2 queue[DA]: origin=core.queue size=0enqueued=0 full=0 discarded.full=0 discarded.nf=0 maxqsize=0Fri Nov 2 21:03:39 2018: forward2 queue: origin=core.queue size=0enqueued=0 full=0 discarded.full=0 discarded.nf=0 maxqsize=0Fri Nov 2 21:03:39 2018: main Q: origin=core.queue size=0enqueued=2104045 full=0 discarded.full=0 discarded.nf=0 maxqsize=575
Rory Toma wrote on 11/2/18 5:45 PM:
BTW, when it stops processing, oublock increases. I can't tell fromcasual inspection of code what it's supposed to be.
Rory Toma via rsyslog wrote on 11/2/18 5:45 PM:
OK, I'll try to convert. Thx.

David Lang wrote on 11/2/18 4:41 PM:
On Fri, 2 Nov 2018, Rory Toma wrote:
Sure.
I'm going to try and turn on TCP Keepalive next.
$DefaultNetstreamDriver gtls

# certificate files
$DefaultNetstreamDriverCAFile /opt/rsyslog/certs/ca.pem
$DefaultNetstreamDriverCertFile /opt/rsyslog/certs/cert.pem
$DefaultNetstreamDriverKeyFile /opt/rsyslog/certs/key.pem
I don't think these have any effect with the new function stylesyntax. I think you have to specify these each time they are needed.In some ways it's ugly, but it makes it so that when you are lookingat a line, you don't have to worry about what the lines before it mayhave set.
$MaxOpenFiles 1000000
see if you can set this in the global() section
module(load="imtcp" threads="3" MaxSessions="1048544"StreamDriver.Mode="1" Stre
amDriver.AuthMode="anon") # load TCP listener
module(load="impstats" interval="60" severity="7"log.file="/export/rsyslog/imps
tats.log" log.syslog="off")
I believe that you want impstats to be loaded in the very first lineto get all your stats properly.
$WorkDirectory /export/rsyslog
also set this in the global() section.
$ActionQueueType LinkedList
$ActionQueueFileName mainmsgq
$ActionQueueSaveOnShutdown on
$ActionQueueHighWaterMark 2000000
$ActionQueueLowWaterMark 1800000
#$ActionQueueSize 4000000
$ActionQueueSize 10000
#$ActionQueueDequeueBatchSize 512
$ActionQueueDequeueBatchSize 100
$ActionQueueMaxDiskSpace 3g
$ActionQueueWorkerThreads 2
$ActionQueueWorkerThreadMinimumMessages 512
all this queue stuff has no effect on any action(), it will onlyaffect the next legacy action line (which you don't have any of, soall this has no effect)
you need to convert this to the new syntax as part of the action()statement, or group the actions into a ruleset and put the queue onthe ruleset.
This explains why we weren't seeing any queue info in the impstatsoutput.
template(name="json_line" type="list" )
{
constant(value="{")
constant(value="\"time\":\"")
property(name="timegenerated" dateFormat="rfc3339" format="json" )
constant(value="\",\"msg\":\"")
property(name="msg" format="json")
constant(value="\",\"host\":\"")
property(name="hostname" format="json")
constant(value="\",\"svr\":\"")
property(name="syslogseverity-text" format="json")
constant(value="\",\"process\":\"")
property(name="programname" format="json")
constant(value="\",\"tag\":\"")
property(name="syslogtag" format="json")
constant(value="\",\"uuid\":\"")
property(name="uuid" format="json")
constant(value="\",\"rsyshost\":\"")
property(name="$myhostname" format="json")
constant(value="\"}")
}

$InputTCPServerRun 110 # Telo Logs
since you use the new style syntax everywhere else, you should changethis to input() syntax
$InputTCPServerKeepAlive on

action(
        type="omfwd"
        name="forward1"
        template="json_line"
        target="cortana-relay-1.corp.ooma.com"
        Port="9092"
        protocol="tcp"
)

action(
        type="omfwd"
        name="forward2"
        template="json_line"
        target="cortana-relay-2.corp.ooma.com"
        Port="9092"
        protocol="tcp"
        action.execOnlyWhenPreviousIsSuspended="on"
)


David Lang wrote on 11/2/18 4:28 PM:
I'm done for today, can you re-post your rsyslog.conf (and anyincluded files)?
David Lang

On Fri, 2 Nov 2018, Rory Toma wrote:
This is 8.39.
tcpdump shows tons of traffic incoming.


David Lang wrote on 11/2/18 4:25 PM:
what version of rsyslog is this? I'm not seeing the queue statsfor the disk queue that you say you have defined (I no longer haveyour original message with your full config)
I do not understand what is happening, main queue size is 0 soimtcp should be able to accept new messages, your outputs are notshowing syspended, the queue says it has never been full
David Lang

On Fri, 2 Nov 2018, Rory Toma wrote:
If I'm reading right, imtcp thinks it has no new messages?
Fri Nov 2 17:12:09 2018: forward1: origin=core.actionprocessed=712779 failed=0 suspended=0 suspended.duration=0resumed=0Fri Nov 2 17:12:09 2018: forward2: origin=core.actionprocessed=0 failed=0 suspended=0 suspended.duration=0 resumed=0Fri Nov 2 17:12:09 2018: imtcp(110): origin=imtcpsubmitted=712779Fri Nov 2 17:12:09 2018: resource-usage: origin=impstatsutime=916388169 stime=142221083 maxrss=90124 minflt=26083majflt=0 inblock=8 oublock=312 nvcsw=321384 nivcsw=1171openfiles=1434Fri Nov 2 17:12:09 2018: main Q: origin=core.queue size=0enqueued=712779 full=0 discarded.full=0 discarded.nf=0maxqsize=3548
Fri Nov  2 17:13:09 2018: global: origin=dynstats
Fri Nov 2 17:13:09 2018: forward1: origin=core.actionprocessed=712779 failed=0 suspended=0 suspended.duration=0resumed=0Fri Nov 2 17:13:09 2018: forward2: origin=core.actionprocessed=0 failed=0 suspended=0 suspended.duration=0 resumed=0Fri Nov 2 17:13:09 2018: imtcp(110): origin=imtcpsubmitted=712779Fri Nov 2 17:13:09 2018: resource-usage: origin=impstatsutime=976430226 stime=142221083 maxrss=90124 minflt=26083majflt=0 inblock=8 oublock=328 nvcsw=321385 nivcsw=1187openfiles=1434Fri Nov 2 17:13:09 2018: main Q: origin=core.queue size=0enqueued=712779 full=0 discarded.full=0 discarded.nf=0maxqsize=3548
Fri Nov  2 17:14:09 2018: global: origin=dynstats
Fri Nov 2 17:14:09 2018: forward1: origin=core.actionprocessed=712779 failed=0 suspended=0 suspended.duration=0resumed=0Fri Nov 2 17:14:09 2018: forward2: origin=core.actionprocessed=0 failed=0 suspended=0 suspended.duration=0 resumed=0Fri Nov 2 17:14:09 2018: imtcp(110): origin=imtcpsubmitted=712779Fri Nov 2 17:14:09 2018: resource-usage: origin=impstatsutime=1036475780 stime=142221083 maxrss=90124 minflt=26083majflt=0 inblock=8 oublock=336 nvcsw=321386 nivcsw=1203openfiles=1434Fri Nov 2 17:14:09 2018: main Q: origin=core.queue size=0enqueued=712779 full=0 discarded.full=0 discarded.nf=0maxqsize=3548
David Lang wrote on 11/2/18 4:05 PM:
On Fri, 2 Nov 2018, Rory Toma wrote:
WHat am I looking for? As far as I can tell, the numberenqueued remains constant for 15 minutes, then starts up again.
you should see the number processed continue to climb, and thequeue sizes drop gradually, when they drop enough, it will startaccepting new messages.
to avoid your impstats logs getting delayed, write them to afile directly (or use a separate ruleset that has it's own queuefor the impstats output)
David Lang
David Lang wrote on 11/2/18 3:53 PM:
On Fri, 2 Nov 2018, Rory Toma wrote:
Interesting. It stopped processing for 15 minutes, thenstarted to spontaneously process incoming packets again,
look at the impstats output, I'll bet that it shows that itstarted processing messages again, allowing for more messagesto be accepted.
David Lang
Rory Toma via rsyslog wrote on 11/2/18 4:18 PM:
OK, so the submitted maxes out for imtcp at submitted=583161
At this point, rsyslog thinks there are no more messages,even though tcpdump shows a ton. It's as if imtcp just stopsworking.
David Lang wrote on 11/2/18 2:58 PM:
On Fri, 2 Nov 2018, Rory Toma wrote:
OK, turned off counter reset. Also, I noticed that when Irestarted, if I waited about 2 minutes for all connectionsto clear out of FIN_WAIT, et al... It seems to log better.
Now we wait an hour...

So now I see:
Fri Nov 2 15:52:49 2018: imtcp(110): origin=imtcpsubmitted=142983Fri Nov 2 15:52:49 2018: resource-usage: origin=impstatsutime=12115102 stime=2320953 maxrss=56332 minflt=16662majflt=0 inblock=8 oublock=16 nvcsw=37439 nivcsw=99openfiles=334Fri Nov 2 15:52:49 2018: main Q: origin=core.queue size=0enqueued=142983 full=0 discarded.full=0 discarded.nf=0maxqsize=58676
Fri Nov  2 15:53:49 2018: global: origin=dynstats
Fri Nov 2 15:53:49 2018: forward1: origin=core.actionprocessed=153749 failed=0 suspended=0 suspended.duration=0resumed=0Fri Nov 2 15:53:49 2018: forward2: origin=core.actionprocessed=0 failed=0 suspended=0 suspended.duration=0resumed=0
so this shows 142983 new messages arriving and 153749processed (do you have a disk queue somewhere to accountfor the extra messages??) but at one point it was 58676messages behind.
let it run a while to where the throughput slows down andlet's see what it shows.
I expect that the number enqueued will continue to climb,but the number processed will taper off.
David Lang
David Lang wrote on 11/2/18 2:47 PM:
On Fri, 2 Nov 2018, Rory Toma wrote:
Date: Fri, 2 Nov 2018 15:44:10 -0700
From: Rory Toma <[email protected]>
To: David Lang <[email protected]>
Cc: Rory Toma via rsyslog <[email protected]>
Subject: Re: [rsyslog] Need help with high volumeforwarding config
Does this mean that the packets are not even beingforwarded?
Fri Nov 2 15:41:59 2018: main Q: origin=core.queuesize=0 enqueued=8059 full=0 discarded.full=0discarded.nf=0 maxqsize=632
so you received 8059 messages
Fri Nov  2 15:42:59 2018: global: origin=dynstats
Fri Nov 2 15:42:59 2018: action-0-builtin:omfwd:origin=core.action processed=2591 failed=0 suspended=0suspended.duration=0 resumed=0
and you sent 2591 messages through action 0 (this is whyit's good to have name='something' in the action to besure you are looking at the right thing
Fri Nov 2 15:42:59 2018: action-1-builtin:omfwd:origin=core.action processed=0 failed=0 suspended=0suspended.duration=0 resumed=0Fri Nov 2 15:42:59 2018: imtcp(110): origin=imtcpsubmitted=2591Fri Nov 2 15:42:59 2018: resource-usage:origin=impstats utime=2954582 stime=383935 maxrss=19192minflt=5924 majflt=0 inblock=8 oublock=32 nvcsw=3462nivcsw=7 openfiles=250Fri Nov 2 15:42:59 2018: main Q: origin=core.queuesize=0 enqueued=2591 full=0 discarded.full=0discarded.nf=0 maxqsize=632
but here it says there were only 2591 messages received,are you resetting the counters each time? if so, it'sprobably best not to do that right now.
David Lang
David Lang wrote on 11/2/18 2:34 PM:
On Fri, 2 Nov 2018, Rory Toma via rsyslog wrote:
We have several rsyslog hosts that forward to alogstash server. It runs great, then after about anhour, data slows down until we get a trickle. I didnot see anything last time I ran impstats, so I'mstuck. Here's my config (centos7, rsyslog 8.39) Anyadvice how to debug this?
Well, logstash has lots of bottlenecks, is it keepingup or is it refusing to accept more data?
what does impstats show? does it show the output tologstash being suspended?
until you know that the recipient is able to receivemore logs, I don't know that it's worth looking at thersyslog config.
David Lang
--
Sent from Postbox
<https://www.postbox-inc.com/?utm_source=email&utm_medium=siglink&utm_campaign=reach>

_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of 
sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE 
THAT.

Re: [rsyslog] Need help with high volume forwarding config

Reply via email to