just FYI: I merged the mmfields module yesterday, but the doc still needs to be written. I'll see that I get a bit of this done today ... but more RELP enhancements are demanding my attention, so I need to see how I juggle all that... ;)
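In the meantime, a rough sketch of how it is meant to be used - untested, and the parameter name (separator) and the $!f<n> field names are my assumptions until the doc is written:

--------------xxxxxxxxxxxxx---------------
module(load="mmfields")          # load the field-extraction module once

# inside the ruleset, split the raw message on "|" (the CEF delimiter)
# in a single pass:
action(type="mmfields" separator="|")

# the pieces should then be addressable as $!f1, $!f2, $!f3, ...
set $!vendor  = $!f2;            # assumption: 1-based field numbering
set $!product = $!f3;
--------------xxxxxxxxxxxxx---------------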
Rainer

On Wed, Jun 12, 2013 at 1:04 AM, David Lang <[email protected]> wrote:
> On Tue, 11 Jun 2013, Xuri Nagarin wrote:
>
>> First, David's suggestion of using a variable like "set $!vendor =
>> field($msg, 124, 2)" worked really well in reducing CPU utilization. My
>> packet loss is now zero :)
>>
>> On Fri, Jun 7, 2013 at 4:59 AM, Rainer Gerhards <[email protected]> wrote:
>>
>>> On Thu, Jun 6, 2013 at 10:04 PM, David Lang <[email protected]> wrote:
>>>
>>>> On Thu, 6 Jun 2013, Xuri Nagarin wrote:
>>>>
>>>>> I am trying to translate a log flow from syslog-ng to rsyslog.
>>>>>
>>>>> I have a few syslog sources that collect logs from all over the
>>>>> infrastructure and deliver them to central syslog servers, which in
>>>>> turn deliver the logs to different consumers.
>>>>>
>>>>> At the moment, I have two consumers - one reads flat files and the
>>>>> other listens on TCP ports. For both, I need to tell them what type
>>>>> of event I am delivering - Linux auth, firewall event, or web log.
>>>>>
>>>>> Something like this:
>>>>>
>>>>> LogFireHose ====> SyslogNG/RSyslog ====> (parse and redirect events by type)
>>>>>   ||==> If (Cisco ASA) write to "FS (Cisco/ASA) & TCP DstHost: 5000"
>>>>>   ||==> If (Apache Access) write to "FS (Apache/Access) & TCP DstHost: 5001"
>>>>>   ||==> If (DNS logs) write to "FS (Bind/DNS) & TCP DstHost: 5002"
>>>>>
>>>>> In Syslog-NG, every incoming message (in CEF format) is run through a
>>>>> parser that splits the log message into eight fields. Fields 2 and 3,
>>>>> which are vendor and product type, are used to generate a template
>>>>> like "/var/log/$vendor/$product/logfile".
>>>>>
>>>>> Delivering events, by type, to a specific network destination
>>>>> requires filters, and I have 30+ different vendor/product
>>>>> combinations. So I end up with 30+ log() statements, each with its
>>>>> own filter logic:
>>>>>
>>>>> ----------xxxxxxxxxxxxxxx----------
>>>>> filter f1 (if $product contains "ASA")
>>>>> filter f2 (if $product contains "ACCESS")
>>>>> filter f3 (if $product contains "DNS")
>>>>> ...
>>>>> filter f35 (if field3 contains "blah")
>>>>>
>>>>> log (src=tcp; filter f1; dst=/var/log/$vendor/$product/logfile; dst=remotehost:5000)
>>>>> log (src=tcp; filter f2; dst=/var/log/$vendor/$product/logfile; dst=remotehost:5001)
>>>>> ...
>>>>> log (src=tcp; filter fx; dst=/var/log/$vendor/$product/logfile; dst=remotehost:5030)
>>>>> ----------xxxxxxxxxxxxxxx----------
>>>>>
>>>>> In RSyslog, I have so far written the logic to write to the
>>>>> filesystem like this:
>>>>>
>>>>> --------------xxxxxxxxxxxxx---------------
>>>>> template(name="cefdynfile" type="string"
>>>>>          string="/var/log/joe/%msg:F,124:2%/%msg:F,124:3%/logfile")
>>>>>
>>>>> ruleset(name="tcpcef") {
>>>>>     if $syslogtag == "CEF:" then {
>>>>>         action(type="omfile" FileOwner="joe" FileGroup="joe"
>>>>>                DirOwner="joe" DirGroup="joe" DirCreateMode="0755"
>>>>>                FileCreateMode="0644" DynaFile="cefdynfile")
>>>>>         stop
>>>>>     }
>>>>> }
>>>>>
>>>>> module(load="imtcp")   # needs to be done just once
>>>>> input(type="imtcp" port="514" ruleset="tcpcef")
>>>>> --------------xxxxxxxxxxxxx---------------
>>>>>
>>>>> Now I am thinking about how to add rules for delivering events to the
>>>>> TCP destinations.
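One possible shape for that, sketched against the config above (the ports and the "contains" matches are illustrative only; 124 is the decimal ASCII code for "|", the CEF field delimiter):

--------------xxxxxxxxxxxxx---------------
ruleset(name="tcpcef") {
    if $syslogtag == "CEF:" then {
        # extract vendor/product once into variables, per David's
        # suggestion, instead of re-running field extraction per filter
        set $!vendor  = field($msg, 124, 2);
        set $!product = field($msg, 124, 3);

        action(type="omfile" DynaFile="cefdynfile")

        # route to the per-type TCP consumers
        if $!product contains "ASA" then
            action(type="omfwd" target="remotehost" port="5000" protocol="tcp")
        else if $!product contains "ACCESS" then
            action(type="omfwd" target="remotehost" port="5001" protocol="tcp")
        # ... one branch per vendor/product combination
        stop
    }
}
--------------xxxxxxxxxxxxx---------------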
>>>>> I could expand the "tcpcef" ruleset and add more "if condition then
>>>>> {action()}" statements to it, OR I can write multiple rulesets, one
>>>>> for each filter, like
>>>>>
>>>>> "rule f1 { if $msg contains "blah" then action()}"
>>>>> "rule f2 { if $msg contains "foo" then action()}"
>>>>>
>>>>> and then call these rules from the "tcpcef" ruleset:
>>>>>
>>>>> ruleset(name="tcpcef") {
>>>>>     call f1
>>>>>     call f2
>>>>>     ...
>>>>>     call fx
>>>>> }
>>>>>
>>>>> So, two questions: (1) Does this seem like a good way to parse/route
>>>>> messages?
>>>>
>>>> Well, if you are getting your messages in the CEE format, it would
>>>> seem to me to make more sense to invoke the mmjson parsing module to
>>>> extract all the fields. That way you aren't doing string searches; you
>>>> can just do an if on the particular variable that you are looking for.
>>>> I would expect this to be faster once you have a few different
>>>> filters.
>>>
>>> I never understood why in syslog-ng you need to name an if-condition
>>> and then use that name. In any case, in rsyslog you should just
>>> naturally use the conditions as they are needed. Creating extra
>>> rulesets and calling them (without reuse) is definitely the wrong way
>>> to go.
>>
>> I haven't looked at the code, but from troubleshooting a TCP
>> packet-collapse issue extensively with Syslog-NG, I came to the
>> conclusion that it associates all actions related to a particular source
>> with one thread. So if, like me, you have one BIG source and multiple
>> filters written to parse that one big stream, then all of that gets
>> associated with one thread/core. That eventually leads to core
>> saturation, and packets are left unreaped in the kernel TCP buffers. I
>> tried to isolate the problem by using a ramdisk and essentially saw the
>> same problem. Now, this is all conjecture based on several days of
>> monitoring all sorts of OS parameters, and I may have written the
>> Syslog-NG config very wrong. But the same logic ported to RSyslog is
>> working really well, with zero packet-collapse errors.
>
> rsyslog has a very different architecture.
>
> The receiving thread is only responsible for receiving the message,
> parsing out the standard properties, and putting the resulting message
> into the main queue.
>
> Other threads then do the work of pulling messages out of that queue,
> applying rules to them, formatting the output, and outputting the
> message.
>
> This makes troubleshooting lost messages a bit more interesting, as you
> have to figure out whether the messages are being lost because the input
> thread can't keep up or because the output threads can't keep up.
>
> If it's the output threads, there are lots of things that you can do to
> try and deal with the issue.
>
> One thing is to create additional queues, so that instead of trying to
> handle all the outputs in one thread, you have a thread that just moves
> messages from the main queue to a couple of output (action) queues. Each
> output queue will then have its own threads working to output the
> messages from that queue.
>
> Rsyslog also allows the output threads to process multiple messages at a
> time (batch size), which greatly reduces the locking on the queues.
>
> Reducing the amount of work done to deliver the messages (through better
> rulesets) is always something to look at.
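What such a dedicated action queue could look like (queue sizing and retry settings here are illustrative, not a recommendation):

--------------xxxxxxxxxxxxx---------------
# give this one forwarding action its own in-memory queue, so a slow or
# unreachable receiver stalls only this queue, not main-queue processing
action(type="omfwd" target="remotehost" port="5000" protocol="tcp"
       queue.type="linkedList"         # dedicated async action queue
       queue.size="50000"              # max messages held in memory
       queue.workerThreads="2"         # threads draining this queue
       action.resumeRetryCount="-1")   # retry forever rather than discard
--------------xxxxxxxxxxxxx---------------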
>> To be honest, I used to prefer Syslog-NG over RSyslog because (1) I
>> found the NG config easier to write/understand and (2) the documentation
>> is better. But it is hard to argue with performance :D
>>
>> Now that I've gotten the hang of the config syntax, I am comfortable
>> with RSyslog too, but the way config variables work still leaves a lot
>> to be desired in terms of documentation and example configs.
>
> There are a lot of bits of examples around, but it's hard to find them
> when you don't know roughly what you're looking for. Rainer concentrates
> on getting the code written, and documentation is definitely secondary.
> We can always use more help :-)
>
> I'm writing a series of articles for ;login: on logging and will be doing
> a bunch of examples as I go along. I'll keep everyone posted as things
> get written/published.
>
>>>>> (2) Which way is better for multi-threading? I read that each ruleset
>>>>> gets its own queue and thread, so I am thinking defining multiple
>>>>> rulesets and then calling them from a master ruleset might offer
>>>>> better performance.
>>>>
>>>> There's overhead to rulesets as well.
>>>
>>> Actions can have their own queues as well. But with the simple file
>>> writer, it makes no sense at all to create extra rulesets. Writing to
>>> files is so fast, there is no need to de-couple it from main processing
>>> (except, maybe, if writing to NFS or some other network destination).
>>> If you become CPU-bound, it is totally sufficient to increase the max
>>> number of worker threads for the main queue. That queue will handle all
>>> you need.
>>
>> In my use case, for every inbound event three things happen:
>>
>> 1. It needs to be parsed and classified based on certain fields/content.
>>
>> 2. Based on classification, it gets written to disk in its own unique
>> folder/file structure. Here, I have Splunk tailing the log file. For
>> each log file, Splunk needs to be told the type of application it is
>> reading, so the corresponding index can be created on the Splunk side.
>
> Rather than having Splunk tail the file (large files are still not great
> on ext* filesystems, in spite of the improvements over the years), I use
> one of two approaches:
>
> 1. I write the file and then roll it every minute; when I roll it, I move
> the file to a Splunk sinkhole directory. Splunk reads the file, and once
> it is processed, Splunk deletes it.
>
> 2. I output to a named pipe that Splunk reads as its input. Barring a
> little kernel buffering, once the message is written to the pipe I can
> consider that Splunk has the message (and since Splunk is not crash-safe
> anyway, if it crashes, some logs will be lost; for that matter, rsyslog
> will lose logs if it crashes under normal configurations).
>
> I like to deliver the logs via syslog all the way to the Splunk indexer
> machines and only hand them to Splunk at that point. This also simplifies
> the configuration of my central box, as it can just forward all the logs
> to the Splunk box, and the rsyslog on the Splunk box has a simpler config
> as it can ignore all messages not going into Splunk.
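Approach 2 needs only a couple of lines on the rsyslog side (the pipe path is illustrative, and the fifo has to be created up front, outside rsyslog):

--------------xxxxxxxxxxxxx---------------
# create the fifo once, e.g.:  mkfifo /var/spool/splunk.pipe
# ompipe is a built-in output module, so no module(load=...) is needed
action(type="ompipe" pipe="/var/spool/splunk.pipe")
--------------xxxxxxxxxxxxx---------------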
>> 3. Based on classification, it gets forwarded over TCP to a Hadoop/Flume
>> listener. Each application type has a flume agent associated with it
>> that writes the event to its own folder/file in HDFS. So I have 30+
>> application types, and there is an agent/port associated with each app
>> type.
>>
>> Decoupling each action rule from the event-receiver thread, and each
>> filter rule from the others, allows rsyslog to adapt to however a
>> destination behaves. For example, if a flume agent crashes and stops
>> listening, or slows down, it should not keep rsyslog from reading events
>> from senders or applying filters to incoming events. Basically, isolate
>> the different components from each other in case of various IO failures.
>
> The question is just how finely you should decouple everything. Each
> action queue you create eats more memory. It's a balancing act between
> decoupling everything and resource consumption.
>
>>>> Remember that premature optimization is the root of all evil. Set it
>>>> up the simple way (single ruleset) and only look at changing it if you
>>>> find that it isn't fast enough.
>>>>
>>>> Using the json parser can speed things up, but more importantly, it
>>>> makes the rules easier to write and cleaner.
>>>
>>> As a reply to a later message: I'll see if I can quickly hack together
>>> a plugin that does field extraction for all fields at once. That would
>>> offer better performance, BUT only if you actually NEED the MAJORITY of
>>> the fields. In any case, CEF is popular enough to deserve such
>>> functionality, and it has been on my wishlist for quite a while - no
>>> promise, though!
>>
>> Actually, we are moving away from CEF because HP ArcSight is such a PITA
>> :-) Today, all our sources log into ArcSight Loggers, the Loggers send a
>> stream to Syslog, and Syslog then distributes to Splunk and Hadoop. Once
>> we kill the Loggers, all sources will feed directly into Syslog. Then I
>> will have to parse the messages and extract PROGNAME to make routing
>> decisions, so I do not foresee a need to read ALL or even the MAJORITY
>> of the fields. I am hoping PROGNAME and maybe one more will be
>> sufficient. What I will need is some way to sanity-check each event, so
>> I can avoid passing bad messages or events on to my consumers (Splunk
>> and Hadoop).
>
> Anything that puts messages into some sort of 'standard' is good :-)
>
> CEF seems to have just a handful of fields, but some of the 'fields' then
> contain complex structures of their own. I think you will be using
> many/most of the CEF fields, but not needing to parse out the subfields.
>
> David Lang
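For the PROGNAME-based routing Xuri describes above, a minimal sketch - the program names, ports, and fallback file are illustrative; $programname is rsyslog's standard property for the program part of the syslog tag:

--------------xxxxxxxxxxxxx---------------
if $programname == "apache" then {
    action(type="omfwd" target="remotehost" port="5001" protocol="tcp")
    stop
} else if $programname == "named" then {
    action(type="omfwd" target="remotehost" port="5002" protocol="tcp")
    stop
} else {
    # unknown program name: keep the event locally instead of passing a
    # questionable message on to the consumers
    action(type="omfile" file="/var/log/unclassified.log")
}
--------------xxxxxxxxxxxxx---------------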
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad
of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you
DON'T LIKE THAT.
