First, David's suggestion of using a variable like "set $!vendor = field($msg, 124, 2)" worked really well in reducing CPU utilization. My packet loss is now zero :)
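In case it helps anyone searching the archives later, the relevant part of my ruleset now looks roughly like this (a trimmed-down sketch, not my full config; the Cisco/ASA test and port 5000 stand in for the first of my 30+ routes):

--------------xxxxxxxxxxxxx---------------
template(name="cefdynfile" type="string"
         string="/var/log/joe/%$!vendor%/%$!product%/logfile")

ruleset(name="tcpcef") {
    # extract the CEF vendor/product fields once into message variables
    set $!vendor = field($msg, 124, 2);   # field 2 of the '|'-delimited header (ASCII 124)
    set $!product = field($msg, 124, 3);  # field 3

    # write to the per-vendor/product file via the dynafile template
    action(type="omfile" DynaFile="cefdynfile"
           FileOwner="joe" FileGroup="joe" DirOwner="joe" DirGroup="joe"
           DirCreateMode="0755" FileCreateMode="0644")

    # then route on the variables instead of re-running string searches
    if $!vendor == "Cisco" and $!product == "ASA" then {
        action(type="omfwd" target="remotehost" port="5000" protocol="tcp")
    }
}
--------------xxxxxxxxxxxxx---------------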
On Fri, Jun 7, 2013 at 4:59 AM, Rainer Gerhards <[email protected]> wrote:

> On Thu, Jun 6, 2013 at 10:04 PM, David Lang <[email protected]> wrote:
>
> > On Thu, 6 Jun 2013, Xuri Nagarin wrote:
> >
> >> I am trying to translate log flow from syslog-ng to rsyslog.
> >>
> >> I have a few syslog sources that collect logs from all over the
> >> infrastructure and deliver them to central syslog servers that in turn
> >> deliver logs to different consumers.
> >>
> >> At the moment, I have two consumers - one reads flat files and the
> >> other listens on TCP ports. For both, I need to tell them what type of
> >> event I am delivering to them - Linux auth, firewall event, or web log.
> >>
> >> Something like this:
> >>
> >> LogFireHose ====> SyslogNG/RSyslog (parse and redirect events by type)
> >>   ||==> If (Cisco ASA) write to "FS (Cisco/ASA) & TCP DstHost: 5000"
> >>   ||==> If (Apache Access) write to "FS (Apache/Access) & TCP DstHost: 5001"
> >>   ||==> If (DNS logs) write to "FS (Bind/DNS) & TCP DstHost: 5002"
> >>
> >> In syslog-ng, every incoming message (in CEF format) is subject to a
> >> parser that splits the log message into eight fields. Fields 2 and 3,
> >> which are vendor and product type, are used to generate a template like
> >> "/var/log/$vendor/$product/logfile".
> >>
> >> Delivering events, by type, to a specific network destination requires
> >> filters, and I have 30+ different vendor/product combinations. So I end
> >> up with 30+ log() statements, each with its own filter logic:
> >>
> >> ----------xxxxxxxxxxxxxxx---------------
> >> filter f1 (if $product contains "ASA")
> >> filter f2 (if $product contains "ACCESS")
> >> filter f3 (if $product contains "DNS")
> >> ...
> >> filter f35 (if field3 contains "blah")
> >>
> >> log (src=tcp; filter f1; dst=/var/log/$vendor/$product/logfile; dst=remotehost:5000)
> >> log (src=tcp; filter f2; dst=/var/log/$vendor/$product/logfile; dst=remotehost:5001)
> >> ...
> >> log (src=tcp; filter fx; dst=/var/log/$vendor/$product/logfile; dst=remotehost:5030)
> >> ----------xxxxxxxxxxxxxxx---------------
> >>
> >> In rsyslog, I have so far written the logic to write to the filesystem
> >> like this:
> >>
> >> --------------xxxxxxxxxxxxx---------------
> >> template(name="cefdynfile" type="string"
> >>          string="/var/log/joe/%msg:F,124:2%/%msg:F,124:3%/logfile")
> >>
> >> ruleset(name="tcpcef") {
> >>     if $syslogtag == "CEF:" then {
> >>         action(type="omfile" FileOwner="joe" FileGroup="joe"
> >>                DirOwner="joe" DirGroup="joe" DirCreateMode="0755"
> >>                FileCreateMode="0644" DynaFile="cefdynfile")
> >>         stop
> >>     }
> >> }
> >>
> >> module(load="imtcp") # needs to be done just once
> >> input(type="imtcp" port="514" ruleset="tcpcef")
> >> --------------xxxxxxxxxxxxx---------------
> >>
> >> Now I am thinking about how to add rules for delivering events to TCP
> >> destinations.
> >>
> >> I could expand the "tcpcef" ruleset and add more "if condition then
> >> {action()}" statements to it, OR I could write multiple rulesets, one
> >> for each filter, like:
> >>
> >> ruleset f1 { if $msg contains "blah" then action() }
> >> ruleset f2 { if $msg contains "foo" then action() }
> >>
> >> and then call these rulesets from the "tcpcef" ruleset:
> >>
> >> ruleset(name="tcpcef") {
> >>     call f1
> >>     call f2
> >>     ...
> >>     call fx
> >> }
> >>
> >> So two questions: (1) Does this seem like a good way to parse/route
> >> messages?
> >
> > Well, if you are getting your messages in the CEE format, it would seem
> > to me to make more sense to invoke the mmjson parsing module to extract
> > all the fields. That way you aren't doing string searches, you can just
> > do an if on the particular variable that you are looking for. I would
> > expect this to be faster once you have a few different filters.
>
> I never understood why in syslog-ng you need to name an if-condition and
> then use that name. In any case, in rsyslog you should just naturally use
> the conditions as they are needed. Creating extra rulesets and calling
> them (without reuse) is definitely the wrong way to go.

I haven't looked at the code, but from troubleshooting a TCP packet collapse issue extensively with syslog-ng, I came to the conclusion that it associates all actions related to a particular source with one thread. So if, like me, you have one big source and multiple filters written to parse that one big stream, all of that gets associated with one thread/core. That eventually leads to core saturation, and packets are left unreaped in the kernel TCP buffers. I tried to isolate the problem by using a ramdisk and saw essentially the same behavior. Now, this is all conjecture based on several days of monitoring all sorts of OS parameters, and I may have written the syslog-ng config very wrong. But the same logic ported to rsyslog is working really well, with zero packet collapse errors.

To be honest, I used to prefer syslog-ng over rsyslog because (1) I found the config easier to write/understand and (2) the documentation is better. But it is hard to argue with performance :D Now that I've gotten the hang of the config syntax, I am comfortable with rsyslog too, but the way config variables work still leaves a lot to be desired in terms of documentation and example configs.

> >> (2) Which way is better for multi-threading? I read that each ruleset
> >> gets its own queue and thread, so I am thinking defining multiple
> >> rulesets and then calling them from a master ruleset might offer better
> >> performance.
> >
> > There's overhead to rulesets as well.
>
> Actions can have their own queues as well. But with the simple file
> writer, it totally makes no sense to create rulesets. Writing to files is
> so fast, there is no need to de-couple it from main processing (except,
> maybe, if writing to NFS or some other network destination). If you run
> into a CPU bound, it's totally sufficient to increase the max number of
> worker threads for the main queue. This queue will handle all you need.

In my use case, for every inbound event three things happen:

1. It needs to be parsed and classified based on certain fields/content.

2. Based on classification, it gets written to disk in its own unique folder/file structure. Here I have Splunk tailing the log file. For each log file, Splunk needs to be told the type of application it is reading so the corresponding index can be created on the Splunk side.

3. Based on classification, it gets forwarded over TCP to a Hadoop/Flume listener. Each application type has a Flume agent associated with it that writes the event to its own folder/file in HDFS. So I have 30+ application types, and there is an agent/port associated with each app type.

Decoupling each action from the receiver thread and from each filter allows rsyslog to adapt to however a destination behaves. For example, if a Flume agent crashes and stops listening, or slows down, it should not slow rsyslog down in reading events from senders or in applying filters to incoming events. Basically, isolate the different components from each other in case of various I/O failures. Per-action queues seem to be the tool for this; see the sketch below.
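To make that concrete, this is roughly the shape of one forwarding action with its own queue (a trimmed sketch building on the $!product variable set earlier; the queue size is illustrative, and "flumehost"/port 5000 stand in for one of my 30+ agent endpoints):

--------------xxxxxxxxxxxxx---------------
# One per-destination action with a dedicated queue, so a dead or slow
# Flume agent only backs up this queue instead of the receiver thread.
if $!product == "ASA" then {
    action(type="omfwd" target="flumehost" port="5000" protocol="tcp"
           queue.type="LinkedList"        # own in-memory queue + worker thread
           queue.size="100000"            # illustrative size
           queue.filename="fwd_asa"       # spill to disk when full
                                          # (needs $WorkDirectory set elsewhere)
           action.resumeRetryCount="-1")  # retry forever instead of discarding
}
--------------xxxxxxxxxxxxx---------------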
> > Remember that premature optimization is the root of all evil. Set it up
> > the simple way (single ruleset) and only look at changing it if you find
> > that it isn't fast enough.
> >
> > Using the json parser can speed things up, but more importantly, it
> > makes the rules easier to write and cleaner.
>
> As a reply to a later message: I'll see if I can quickly hack together a
> plugin that does field extraction for all fields at once. That would
> offer better performance, BUT only if you actually NEED the MAJORITY of
> fields. In any case, CEF is popular enough to provide such functionality,
> and it has been on my wishlist for quite a while - no promise though!

Actually, we are moving away from CEF because HP ArcSight is such a PITA :-) Today, all our sources log into ArcSight Loggers, the Loggers send a stream to syslog, and syslog distributes to Splunk and Hadoop. Once we kill the Loggers, all sources will feed directly into syslog. Then I will have to parse messages and extract PROGNAME to make routing decisions, so I do not foresee a need to read ALL or a MAJORITY of the fields. I am hoping PROGNAME and maybe one more field will be sufficient. What I will also need is some way to sanity-check each event so I can avoid passing bad messages or events on to my consumers (Splunk and Hadoop). Roughly along the lines of the sketch below.
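For illustration only (the "named" programname and the unclassified fallback file are made-up placeholders, and port 5002 is the DNS port from my earlier diagram):

--------------xxxxxxxxxxxxx---------------
# Sanity check first: divert empty/garbage events instead of
# forwarding them downstream.
if ($programname == "") or (strlen($msg) < 2) then {
    action(type="omfile" file="/var/log/unclassified/logfile")
    stop
}

# Then route purely on PROGNAME.
if $programname == "named" then {
    action(type="omfwd" target="remotehost" port="5002" protocol="tcp")
    stop
}
--------------xxxxxxxxxxxxx---------------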
Again, a big thanks for all your hard work. RSyslog saved the day for me.
