hi all,

I apologize in advance if this question has been asked before. I have been playing with the omelasticsearch module recently, and it works nicely for me. According to my tests, its performance is clearly superior to that of some Java tools which have been used for Elasticsearch in the past. I began experimenting with the configuration given at http://wiki.rsyslog.com/index.php/HOWTO:_rsyslog_%2B_elasticsearch
and tried to develop it into a more advanced configuration.

My question is about advanced parsing of log messages and extracting additional fields from message content. In my environment, I have a lot of IDS alarms, e.g.,

Nov 21 12:31:41 myhost snort[17449]: [1:2014527:1] ET CURRENT_EVENTS Exploit Kit Delivering Compressed Flash Content to Client [Classification: Potentially Bad Traffic] [Priority: 2] {TCP} 10.1.1.1:80 -> 10.2.2.2:51601

I would like to extract fields such as the signature ID, transport protocol, and source IP from each alarm, create a JSON record, and write it into an Elasticsearch database.

To address the problem of data extraction, I have used rsyslog property replacers. For example, to extract the main message fields plus the signature ID and the protocol, this template could be used:

$template SnortTemplate,"{\"timestamp\":\"%timereported:::date-rfc3339%\",\"message\":\"%msg:::json%\",\"host\":\"%HOSTNAME:::json%\",\"sig\":\"%msg:R,ERE,1:\[([0-9]+:[0-9]+):[0-9]+\]--end%\",\"proto\":\"%msg:R,ERE,1:\{([A-Z]+)\} [0-9.]+--end%\"}"

The signature extraction is done with
%msg:R,ERE,1:\[([0-9]+:[0-9]+):[0-9]+\]--end%

while protocol is extracted with
%msg:R,ERE,1:\{([A-Z]+)\} [0-9.]+--end%
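
To make the intent of the two expressions concrete, here is a small Python sketch (not rsyslog syntax) of what they capture from the sample alarm above. Python's re module stands in for the POSIX ERE matching done by rsyslog, which is an assumption, and the names SIG_RE, PROTO_RE and extract are made up for this illustration.

```python
import json
import re

# Message part of the sample alarm (the text after "snort[17449]:").
msg = (" [1:2014527:1] ET CURRENT_EVENTS Exploit Kit Delivering Compressed "
       "Flash Content to Client [Classification: Potentially Bad Traffic] "
       "[Priority: 2] {TCP} 10.1.1.1:80 -> 10.2.2.2:51601")

# Same patterns as in the template; group 1 is the part that is kept.
SIG_RE = re.compile(r"\[([0-9]+:[0-9]+):[0-9]+\]")
PROTO_RE = re.compile(r"\{([A-Z]+)\} [0-9.]+")

def extract(message):
    """Run one regex search per field, as the template does."""
    sig = SIG_RE.search(message)
    proto = PROTO_RE.search(message)
    return {
        "sig": sig.group(1) if sig else "",
        "proto": proto.group(1) if proto else "",
    }

record = extract(msg)
print(json.dumps(record))
```

Each field costs one pass over the message, which is the overhead I am asking about.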

However, while this approach works for me, it requires a separate regular expression match for each additional field. My question is -- is there a better way to accomplish this task?
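
To illustrate what I mean by a better way: outside rsyslog, a single expression with several capture groups could pull out all fields in one match. The Python sketch below shows the idea only. The pattern ALARM_RE, the field names and the function parse_alarm are hypothetical, and the pattern assumes the "{PROTO} ip:port -> ip:port" layout of the sample alarm, so alarms without ports (e.g., ICMP) would not match it.

```python
import re

# Message part of the sample alarm (the text after "snort[17449]:").
sample = (" [1:2014527:1] ET CURRENT_EVENTS Exploit Kit Delivering Compressed "
          "Flash Content to Client [Classification: Potentially Bad Traffic] "
          "[Priority: 2] {TCP} 10.1.1.1:80 -> 10.2.2.2:51601")

# One pattern, one match: every field is a named capture group.
ALARM_RE = re.compile(
    r"\[(?P<sig>[0-9]+:[0-9]+):[0-9]+\].*"
    r"\{(?P<proto>[A-Z]+)\} "
    r"(?P<src_ip>[0-9.]+):(?P<src_port>[0-9]+) -> "
    r"(?P<dst_ip>[0-9.]+):(?P<dst_port>[0-9]+)"
)

def parse_alarm(message):
    """Return a dict of all fields, or None if the layout differs."""
    match = ALARM_RE.search(message)
    return match.groupdict() if match else None

fields = parse_alarm(sample)
print(fields)
```

I am looking for something equivalent inside rsyslog itself.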

Another question -- for writing into Elasticsearch, I've used omelasticsearch with 'bulkmode' enabled and queue batch sizes set to higher values:

$MainMsgQueueDequeueBatchSize 1024
$ActionQueueDequeueBatchSize 512

$template SnortIndex,"rsyslog-%timereported:1:10:date-rfc3339%"

if $programname == 'snort' then action(type="omelasticsearch" template="SnortTemplate" dynSearchIndex="on" searchIndex="SnortIndex" server="localhost" bulkmode="on")
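
For clarity, the SnortIndex template is meant to produce one index per day by keeping the first 10 characters of the RFC 3339 timestamp. A rough Python sketch of that naming follows; the function daily_index is hypothetical, and the year 2012 and the UTC offset are assumptions, since the sample alarm carries neither.

```python
from datetime import datetime, timezone

def daily_index(ts, prefix="rsyslog-"):
    """Mimic "rsyslog-%timereported:1:10:date-rfc3339%" for a datetime."""
    # isoformat() yields an RFC 3339-style string,
    # e.g. 2012-11-21T12:31:41+00:00; the first 10 characters are the date.
    stamp = ts.isoformat()
    return prefix + stamp[:10]

# Timestamp of the sample alarm, with an assumed year and time zone.
reported = datetime(2012, 11, 21, 12, 31, 41, tzinfo=timezone.utc)
index_name = daily_index(reported)
print(index_name)
```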

Are there any other ways to increase the throughput?

I'd be grateful for any feedback.

with kind regards,
risto