Hello

We are trying to use mmnormalize to parse syslog from NAT device. This device makes around 150000 messages per second. Now, I’ve trying to make it working with rsyslog 7.5.5 (RHEL 6 x86_64, official RPMs). Then, I have two problems.

First, mmnormalize is unable to handle the load. It appears to me that there is a big lock somewhere in the module, because it is unable to distribute CPU load between threads. Total CPU utilization by the module is around 100% despite of number of threads.

Threads=”3”

  PID USER      PR  NI VIRT  RES  SHR S %CPU %MEM    TIME+ COMMAND

2482 root      20   0 635m 189m 1064 R 60.2  0.8   0:13.07 in:imudp

2486 root      20   0 635m 189m 1064 R 37.7  0.8   0:07.73 rs:action 1 que

2487 root      20   0 635m 189m 1064 S 36.7  0.8   0:07.66 rs:action 1 que

2485 root      20   0 635m 189m 1064 S 36.1  0.8   0:07.47 rs:action 1 que

2484 root      20   0 635m 189m 1064 R 27.5  0.8   0:05.44 rs:main Q:Reg

Threads=”10”

  PID USER      PR  NI VIRT  RES  SHR S %CPU %MEM    TIME+ COMMAND

2515 root      20   0 1218m 195m 1080 R 60.5  0.8   0:56.13 in:imudp

2516 root      20   0 1218m 195m 1080 R 27.8  0.8   0:24.41 rs:main Q:Reg

2519 root      20   0 1218m 195m 1080 R 11.6  0.8   0:09.97 rs:action 1 que

2523 root      20   0 1218m 195m 1080 S 11.6  0.8   0:10.42 rs:action 1 que

2520 root      20   0 1218m 195m 1080 S 11.2  0.8   0:10.40 rs:action 1 que

2525 root      20   0 1218m 195m 1080 S 11.2  0.8   0:10.25 rs:action 1 que

2518 root      20   0 1218m 195m 1080 S 10.9  0.8   0:10.20 rs:action 1 que

2522 root      20   0 1218m 195m 1080 S 10.9  0.8   0:09.99 rs:action 1 que

2524 root      20   0 1218m 195m 1080 S 10.9  0.8   0:10.03 rs:action 1 que

2526 root      20   0 1218m 195m 1080 S 10.9  0.8   0:10.19 rs:action 1 que

2517 root      20   0 1218m 195m 1080 S 10.6  0.8   0:10.05 rs:action 1 que

2521 root      20   0 1218m 195m 1080 S 10.6  0.8   0:10.32 rs:action 1 que

Omstats:

Fri Oct 18 14:13:22 2013: imuxsock: submitted=657 ratelimit.discarded=0 ratelimit.numratelimiters=7

Fri Oct 18 14:13:22 2013: action 1: processed=9385402 failed=0

Fri Oct 18 14:13:22 2013: dynafile cache normFilename: requests=0 level0=0 missed=0 evicted=0 maxused=0

Fri Oct 18 14:13:22 2013: action 2: processed=0 failed=0

Fri Oct 18 14:13:22 2013: action 3: processed=645 failed=0

Fri Oct 18 14:13:22 2013: action 4: processed=3 failed=0

Fri Oct 18 14:13:22 2013: action 5: processed=0 failed=0

Fri Oct 18 14:13:22 2013: action 6: processed=9 failed=0

Fri Oct 18 14:13:22 2013: action 7: processed=1 failed=0

Fri Oct 18 14:13:22 2013: action 8: processed=0 failed=0

Fri Oct 18 14:13:22 2013: action 9: processed=0 failed=0

Fri Oct 18 14:13:22 2013: action 10: processed=0 failed=0

Fri Oct 18 14:13:22 2013: imudp(*:514): submitted=9385432

Fri Oct 18 14:13:22 2013: imudp(*:514): submitted=0

Fri Oct 18 14:13:22 2013: resource-usage: utime=173476627 stime=42114597 maxrss=200500 minflt=1094749 majflt=0 inblock=0 oublock=552 nvcsw=2367078 nivcsw=1441

Fri Oct 18 14:13:22 2013: action 1 queue: size=200000 enqueued=9385401 full=4742604 discarded.full=4742604 discarded.nf=0 maxqsize=200000

Fri Oct 18 14:13:22 2013: main Q: size=31 enqueued=9386090 full=0 discarded.full=0 discarded.nf=0 maxqsize=1122

Fri Oct 18 14:13:22 2013: imudp(w0): called.recvmmsg=10554433 called.recvmsg=0 msgs.received=9385433

99,5% of messages has one format, like this (raw message):

<150>2013-10-17 20:41:59 CGN-ATS-528{sset_SP_000}[FWNAT]: ASP_SFW_CREATE_ACCEPT_FLOW 100.91.5.221:53129 [78.25.120.17:36231] -> 176.194.90.81:6881 (TCP)

Configuration for the module follows:

Ruleset(name="logd"){

   Action(

queue.workerthreads="10"

queue.type="LinkedList"

queue.size="200000"

queue.timeoutenqueue="0"

type="mmnormalize"

        userawmsg="on"

rulebase="/etc/rsyslog.d/rules.rb"

    )

    if $!date != "" then {

        Action(

type="omfile"

DynaFile="normFilename"

template="normTemplate"

        )

    }

}

Rulebase:

prefix=<150>%date:word% %time:word% %cgn:char-to:{%{%sset:char-to:}%}[FWNAT]:

rule=: ASP_SFW_CREATE_ACCEPT_FLOW %ip1:ipv4%:%port1:number% [%ip2:ipv4%:%port2:number%] -> %ip3:ipv4%:%port3:number% (%proto:char-to:)%)

Second, in 7.5.5, this does not work at all. In 7.4.4, mmnormalize produces expected output, though it cannot keep the load and has known memory leak.


On the other hand, in 7.5.5 it produces rare records in the output file (say, 100 per minute). It looks like it cannot match other records, but this is weird, as they are all in the same format and happily match in lognormalize.


Does anyone have this working?

Thanks for reading this,
Pavel

_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of 
sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE 
THAT.

Reply via email to