> -----Original Message-----
> From: [email protected] [mailto:rsyslog-
> [email protected]] On Behalf Of Risto Vaarandi
> Sent: Wednesday, November 21, 2012 1:25 PM
> To: [email protected]
> Subject: [rsyslog] rsyslog message formatting for elasticsearch
>
> hi all,
>
> I apologize in advance if this question has been asked before. I have
> been playing with omelasticsearch module recently and it works nice for
> me. According to my tests, its performance is clearly superior to some
> Java tools which have been used for Elasticsearch in the past.
> I began experimenting with a configuration given at
> http://wiki.rsyslog.com/index.php/HOWTO:_rsyslog_%2B_elasticsearch
> and tried to elaborate it a more advanced configuration.
>
> My question is about advanced parsing of log messages and extracting
> additional fields from message content. In my environment, I have a lot
> of IDS alarms, e.g.,
>
> Nov 21 12:31:41 myhost snort[17449]: [1:2014527:1] ET CURRENT_EVENTS Exploit Kit Delivering Compressed Flash Content to Client [Classification: Potentially Bad Traffic] [Priority: 2] {TCP} 10.1.1.1:80 -> 10.2.2.2:51601
>
> I would like to extract some fields like signature ID, transport
> protocol, source IP, etc. from each alarm, create a json record, and
> write it into elasticsearch database.
>
> In order to address the problem of data extraction, I have used rsyslog
> property replacers. For example, to extract the main message fields plus
> the signature and protocol ID, this template could be used:
>
> $template SnortTemplate,"{\"timestamp\":\"%timereported:::date-rfc3339%\",\"message\":\"%msg:::json%\",\"host\":\"%HOSTNAME:::json%\",\"sig\":\"%msg:R,ERE,1:\[([0-9]+:[0-9]+):[0-9]+\]--end%\",\"proto\":\"%msg:R,ERE,1:\{([A-Z]+)\} [0-9.]+--end%\"}"
>
> The signature extraction is done with
> %msg:R,ERE,1:\[([0-9]+:[0-9]+):[0-9]+\]--end%
>
> while protocol is extracted with
> %msg:R,ERE,1:\{([A-Z]+)\} [0-9.]+--end%
>
> However, while this approach works for me, it requires a separate
> regular expression match for each additional field. My question is --
> are there any better ways for accomplishing this task?
>

You should look into mmnormalize (based on liblognorm). This is a classic
use case for it. Some docs:

http://www.rsyslog.com/doc/mmnormalize.html
http://www.rsyslog.com/tag/mmnormalize/

> Another question -- for writing into elasticsearch, I've used
> omelasticsearch with 'bulkmode' enabled and queue batch sizes set to
> higher values:
>
> $MainMsgQueueDequeueBatchSize 1024
> $ActionQueueDequeueBatchSize 512
>
> $template SnortIndex,"rsyslog-%timereported:1:10:date-rfc3339%"
>
> if $programname == 'snort' then action(type="omelasticsearch"
> template="SnortPayload" dynSearchIndex="on" searchIndex="SnortIndex"
> server="localhost" bulkmode="on")
>
> Are there any other ways for increasing the throughput?

It depends a bit on the overall workflow, but what I see doesn't look bad.
I'd probably even increase the batch sizes more, if you have a heavily used
system. 10240 for both is not evil.

Rainer

>
> I'd be grateful for any feedback.
>
> with kind regards,
> risto
> _______________________________________________
> rsyslog mailing list
> http://lists.adiscon.net/mailman/listinfo/rsyslog
> http://www.rsyslog.com/professional-services/
> What's up with rsyslog? Follow https://twitter.com/rgerhards
> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a
> myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST
> if you DON'T LIKE THAT.
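
For illustration only: the question above is about avoiding one regular expression match per extracted field. The sketch below is not mmnormalize or liblognorm syntax and is not part of rsyslog. It is a standalone Python example of pulling every field out of the sample Snort alert in a single pass and building one JSON record. The pattern, the field names (gid, sid, rev, src_ip, ...) and the function name are assumptions chosen for this example, and the pattern was written only against the one sample alert quoted in this thread (alerts without ports, such as ICMP, would not match).

```python
import json
import re

# Hypothetical single-pass pattern, written only against the sample alert
# quoted in this thread. Field names are illustrative assumptions.
SNORT_ALERT = re.compile(
    r"\[(?P<gid>\d+):(?P<sid>\d+):(?P<rev>\d+)\] "
    r"(?P<description>.*?) "
    r"\[Classification: (?P<classification>[^\]]+)\] "
    r"\[Priority: (?P<priority>\d+)\] "
    r"\{(?P<proto>[A-Z]+)\} "
    r"(?P<src_ip>[0-9.]+):(?P<src_port>\d+) -> "
    r"(?P<dst_ip>[0-9.]+):(?P<dst_port>\d+)"
)


def extract_fields(msg):
    """Return a dict of all named fields, or None if the message does not match."""
    match = SNORT_ALERT.search(msg)
    return match.groupdict() if match else None


# The message part of the sample alarm from the original question.
sample = (
    "[1:2014527:1] ET CURRENT_EVENTS Exploit Kit Delivering Compressed "
    "Flash Content to Client [Classification: Potentially Bad Traffic] "
    "[Priority: 2] {TCP} 10.1.1.1:80 -> 10.2.2.2:51601"
)

fields = extract_fields(sample)
print(json.dumps(fields, indent=2))
```

All values come back as strings, so numeric fields such as the ports or the priority would still need converting if the index mapping expects integers.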

