Hi Risto,

2012/11/21 Risto Vaarandi <[email protected]>

> hi all,
>
> I apologize in advance if this question has been asked before. I have been
> playing with omelasticsearch module recently and it works nice for me.
> According to my tests, its performance is clearly superior to some Java
> tools which have been used for Elasticsearch in the past.
> I began experimenting with a configuration given at
> http://wiki.rsyslog.com/index.php/HOWTO:_rsyslog_%2B_elasticsearch
> and tried to elaborate it into a more advanced configuration.
>
> My question is about advanced parsing of log messages and extracting
> additional fields from message content. In my environment, I have a lot of
> IDS alarms, e.g.,
>
> Nov 21 12:31:41 myhost snort[17449]: [1:2014527:1] ET CURRENT_EVENTS
> Exploit Kit Delivering Compressed Flash Content to Client [Classification:
> Potentially Bad Traffic] [Priority: 2] {TCP} 10.1.1.1:80 -> 10.2.2.2:51601
>
> I would like to extract some fields like signature ID, transport protocol,
> source IP, etc. from each alarm, create a json record, and write it into
> elasticsearch database.
>
> In order to address the problem of data extraction, I have used rsyslog
> property replacers. For example, to extract the main message fields plus
> the signature and protocol ID, this template could be used:
>
> $template SnortTemplate,"{\"timestamp\":\"%timereported:::date-rfc3339%\",\"message\":\"%msg:::json%\",\"host\":\"%HOSTNAME:::json%\",\"sig\":\"%msg:R,ERE,1:\[([0-9]+:[0-9]+):[0-9]+\]--end%\",\"proto\":\"%msg:R,ERE,1:\{([A-Z]+)\} [0-9.]+--end%\"}"
>
> The signature extraction is done with
> %msg:R,ERE,1:\[([0-9]+:[0-9]+):[0-9]+\]--end%
>
> while protocol is extracted with
> %msg:R,ERE,1:\{([A-Z]+)\} [0-9.]+--end%
>
> However, while this approach works for me, it requires a separate regular
> expression match for each additional field. My question is -- are there any
> better ways for accomplishing this task?
>

I've never used it, but it seems like mmnormalize could be your new friend.
Take a look here for a place to start:
http://www.rsyslog.com/normalizer-first-steps-for-mmnormalize/
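
To give an idea of what that could look like, here is an untested sketch.
It assumes the legacy mmnormalize directives and the liblognorm parsers
"number", "char-to" and "ipv4"; the rulebase path and all field names
(gid, sid, proto, ...) are made up for illustration. The point is that one
rule parses the whole alarm in a single pass instead of one regex per field:

```
# /etc/rsyslog.d/snort.rb (hypothetical path), liblognorm rulebase.
# The rule starts with a space because msg usually begins with one.
rule=: [%gid:number%:%sid:number%:%rev:number%] %signame:char-to:[%[Classification: %class:char-to:]%] [Priority: %priority:number%] {%proto:char-to:}%} %srcip:ipv4%:%srcport:number% -> %dstip:ipv4%:%dstport:number%

# rsyslog.conf: load the module, point it at the rulebase, normalize
$ModLoad mmnormalize
$mmnormalizeRuleBase /etc/rsyslog.d/snort.rb
*.* :mmnormalize:

# Parsed fields should then be usable in templates as $!name properties
$template SnortNormalized,"{\"timestamp\":\"%timereported:::date-rfc3339%\",\"host\":\"%HOSTNAME:::json%\",\"sig\":\"%$!gid%:%$!sid%\",\"proto\":\"%$!proto%\",\"src\":\"%$!srcip%\"}"
```

Messages that don't match any rule won't have those fields set, so you'd
probably want to keep the mmnormalize action ahead of the snort filter and
check the output on a few real alarms first.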


>
> Another question -- for writing into elasticsearch, I've used
> omelasticsearch with 'bulkmode' enabled and queue batch sizes set to higher
> values:
>
> $MainMsgQueueDequeueBatchSize 1024
> $ActionQueueDequeueBatchSize 512
>
> $template SnortIndex,"rsyslog-%**timereported:1:10:date-**rfc3339%"
>
> if $programname == 'snort' then        action(type="omelasticsearch"
> template="SnortPayload" dynSearchIndex="on" searchIndex="SnortIndex"
> server="localhost" bulkmode="on")
>
> Are there any other ways for increasing the throughput?
>
>
Yep, on the rsyslog side, you can look at various tricks on queues and
batch sizes:
 http://wiki.rsyslog.com/index.php/Queues_on_v6_with_omelasticsearch

Ultimately, it's about which batch size fits you best, so there's no
getting away from testing here, unfortunately.
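
As a starting point for that testing, here is a sketch of your action with
its own in-memory queue. It assumes your rsyslog version accepts queue.*
parameters inside action(); the numbers are placeholders to vary during
tests, not recommendations:

```
# Give the Elasticsearch action its own queue so a slow bulk request
# doesn't block the main queue. All values below are placeholders.
if $programname == 'snort' then action(type="omelasticsearch"
    template="SnortPayload" dynSearchIndex="on" searchIndex="SnortIndex"
    server="localhost" bulkmode="on"
    queue.type="linkedlist"
    queue.size="10000"
    queue.dequeuebatchsize="512"
    queue.workerthreads="2")
```

I'd change one value at a time (batch size first, then worker threads) and
compare the indexing rate you get for each run.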

Also, there are quite a few tricks that you can use on the Elasticsearch
side. Here's a tutorial I wrote a while ago:
http://www.elasticsearch.org/tutorials/2012/05/19/elasticsearch-for-logging.html

Store-level compression wasn't available when I wrote it, but if you want
compression, I think you'd want to skip the source-level compression
recommended in that tutorial and go for store-level compression instead:
http://www.elasticsearch.org/guide/reference/index-modules/store.html

Off-topic tip: if you want a nice GUI for your logs@Elasticsearch, you
might want to look at Kibana: http://kibana.org/

It expects a field called "@timestamp" containing an RFC 3339 date, which
it uses for sorting. Other than that, it's pretty configurable and powerful.
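
For Kibana, that would mean renaming the "timestamp" key in your template.
A minimal sketch based on the first fields of your SnortTemplate (the name
KibanaTemplate is just an example):

```
$template KibanaTemplate,"{\"@timestamp\":\"%timereported:::date-rfc3339%\",\"message\":\"%msg:::json%\",\"host\":\"%HOSTNAME:::json%\"}"
```

The date-rfc3339 option you already use produces the format it needs, so
only the key name changes.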

Best regards,
Radu
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of 
sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE 
THAT.
