Hi Risto,

2012/11/21 Risto Vaarandi <[email protected]>
> hi all,
>
> I apologize in advance if this question has been asked before. I have been
> playing with omelasticsearch module recently and it works nice for me.
> According to my tests, its performance is clearly superior to some Java
> tools which have been used for Elasticsearch in the past.
> I began experimenting with a configuration given at
> http://wiki.rsyslog.com/index.php/HOWTO:_rsyslog_%2B_elasticsearch
> and tried to elaborate it into a more advanced configuration.
>
> My question is about advanced parsing of log messages and extracting
> additional fields from message content. In my environment, I have a lot of
> IDS alarms, e.g.,
>
> Nov 21 12:31:41 myhost snort[17449]: [1:2014527:1] ET CURRENT_EVENTS Exploit Kit Delivering Compressed Flash Content to Client [Classification: Potentially Bad Traffic] [Priority: 2] {TCP} 10.1.1.1:80 -> 10.2.2.2:51601
>
> I would like to extract some fields like signature ID, transport protocol,
> source IP, etc. from each alarm, create a json record, and write it into
> elasticsearch database.
>
> In order to address the problem of data extraction, I have used rsyslog
> property replacers. For example, to extract the main message fields plus
> the signature and protocol ID, this template could be used:
>
> $template SnortTemplate,"{\"timestamp\":\"%timereported:::date-rfc3339%\",\"message\":\"%msg:::json%\",\"host\":\"%HOSTNAME:::json%\",\"sig\":\"%msg:R,ERE,1:\[([0-9]+:[0-9]+):[0-9]+\]--end%\",\"proto\":\"%msg:R,ERE,1:\{([A-Z]+)\} [0-9.]+--end%\"}"
>
> The signature extraction is done with
> %msg:R,ERE,1:\[([0-9]+:[0-9]+):[0-9]+\]--end%
>
> while protocol is extracted with
> %msg:R,ERE,1:\{([A-Z]+)\} [0-9.]+--end%
>
> However, while this approach works for me, it requires a separate regular
> expression match for each additional field. My question is -- are there any
> better ways for accomplishing this task?
>

I've never used it, but it seems like mmnormalize could be your new friend.
Take a look here for a place to start:
http://www.rsyslog.com/normalizer-first-steps-for-mmnormalize/

> Another question -- for writing into elasticsearch, I've used
> omelasticsearch with 'bulkmode' enabled and queue batch sizes set to higher
> values:
>
> $MainMsgQueueDequeueBatchSize 1024
> $ActionQueueDequeueBatchSize 512
>
> $template SnortIndex,"rsyslog-%timereported:1:10:date-rfc3339%"
>
> if $programname == 'snort' then action(type="omelasticsearch"
> template="SnortPayload" dynSearchIndex="on" searchIndex="SnortIndex"
> server="localhost" bulkmode="on")
>
> Are there any other ways for increasing the throughput?
>

Yep, on the rsyslog side, you can look at various tricks on queues and
batch sizes:
http://wiki.rsyslog.com/index.php/Queues_on_v6_with_omelasticsearch

Ultimately, it's about which batch size fits you best, so there's no
getting away from testing here, unfortunately.

Also, there are quite a few tricks that you can use on the Elasticsearch
side. Here's a tutorial I wrote a while ago:
http://www.elasticsearch.org/tutorials/2012/05/19/elasticsearch-for-logging.html

Store-level compression wasn't available when I wrote that tutorial, but if
you want compression, I think you'd want to skip the source-level
compression recommended there and go for store-level compression instead:
http://www.elasticsearch.org/guide/reference/index-modules/store.html

Off-topic tip: if you want a nice GUI for your logs in Elasticsearch, you
might want to look at Kibana:
http://kibana.org/

It expects a field called "@timestamp" containing an RFC 3339 date, which
it sorts on. But other than that, it's pretty configurable and powerful.

Best regards,
Radu
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
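As a rough illustration of the single-pass extraction asked about in the quoted message, here is a minimal Python sketch. It is not rsyslog or mmnormalize configuration, and it is not taken from the thread. It assumes every alarm follows the layout of the one sample line quoted above (alarms without ports, such as ICMP, would not match), and the names `SNORT_RE`, `parse_snort`, `bulk_lines`, the field names, and the `"events"` type are hypothetical choices made for the example. The second function shows the two-line action/document shape used by the Elasticsearch bulk API, with a daily index name built the same way as the `SnortIndex` template.

```python
import json
import re

# One pattern with named groups pulls out every field in a single match.
# The layout is inferred from the one sample alarm in the quoted message.
SNORT_RE = re.compile(
    r"\[(?P<gid>\d+):(?P<sid>\d+):(?P<rev>\d+)\]\s+"
    r"(?P<description>.*?)\s+"
    r"\[Classification: (?P<classification>[^\]]+)\]\s+"
    r"\[Priority: (?P<priority>\d+)\]\s+"
    r"\{(?P<proto>[A-Z]+)\}\s+"
    r"(?P<src_ip>[0-9.]+):(?P<src_port>\d+)\s+->\s+"
    r"(?P<dst_ip>[0-9.]+):(?P<dst_port>\d+)"
)


def parse_snort(msg):
    """Return a dict of alarm fields, or None if msg does not match."""
    m = SNORT_RE.search(msg)
    if m is None:
        return None
    fields = m.groupdict()
    # Same "generator:signature" value the original ERE template extracts.
    fields["sig"] = "{gid}:{sid}".format(**fields)
    for key in ("priority", "src_port", "dst_port"):
        fields[key] = int(fields[key])
    return fields


def bulk_lines(timestamp, host, msg, index_prefix="rsyslog-"):
    """Build one bulk entry: an action line followed by the document line."""
    doc = {"@timestamp": timestamp, "host": host, "message": msg}
    doc.update(parse_snort(msg) or {})
    # First 10 characters of an RFC 3339 timestamp are the date (YYYY-MM-DD).
    action = {"index": {"_index": index_prefix + timestamp[:10],
                        "_type": "events"}}
    return json.dumps(action) + "\n" + json.dumps(doc) + "\n"
```

Messages that do not match are still indexed with just the timestamp, host, and raw message, so a layout change degrades the record instead of dropping it.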

