> -----Original Message-----
> From: [email protected] [mailto:rsyslog-
> [email protected]] On Behalf Of Risto Vaarandi
> Sent: Wednesday, November 21, 2012 1:25 PM
> To: [email protected]
> Subject: [rsyslog] rsyslog message formatting for elasticsearch
> 
> hi all,
> 
> I apologize in advance if this question has been asked before. I have
> been playing with the omelasticsearch module recently and it works nicely for
> me. According to my tests, its performance is clearly superior to some
> Java tools which have been used for Elasticsearch in the past.
> I began experimenting with a configuration given at
> http://wiki.rsyslog.com/index.php/HOWTO:_rsyslog_%2B_elasticsearch
> and tried to elaborate it into a more advanced configuration.
> 
> My question is about advanced parsing of log messages and extracting
> additional fields from message content. In my environment, I have a lot
> of IDS alarms, e.g.,
> 
> Nov 21 12:31:41 myhost snort[17449]: [1:2014527:1] ET CURRENT_EVENTS
> Exploit Kit Delivering Compressed Flash Content to Client
> [Classification: Potentially Bad Traffic] [Priority: 2] {TCP}
> 10.1.1.1:80 -> 10.2.2.2:51601
> 
> I would like to extract some fields like signature ID, transport
> protocol, source IP, etc. from each alarm, create a json record, and
> write it into elasticsearch database.
> 
> In order to address the problem of data extraction, I have used rsyslog
> property replacers. For example, to extract the main message fields
> plus the signature and protocol ID, this template could be used:
> 
> $template SnortTemplate,"{\"timestamp\":\"%timereported:::date-rfc3339%\",\"message\":\"%msg:::json%\",\"host\":\"%HOSTNAME:::json%\",\"sig\":\"%msg:R,ERE,1:\[([0-9]+:[0-9]+):[0-9]+\]--end%\",\"proto\":\"%msg:R,ERE,1:\{([A-Z]+)\} [0-9.]+--end%\"}"
> 
> The signature extraction is done with
> %msg:R,ERE,1:\[([0-9]+:[0-9]+):[0-9]+\]--end%
> 
> while protocol is extracted with
> %msg:R,ERE,1:\{([A-Z]+)\} [0-9.]+--end%
> 
> However, while this approach works for me, it requires a separate
> regular expression match for each additional field. My question is --
> are there any better ways for accomplishing this task?
> 

You should look into mmnormalize (based on liblognorm). This is a classic use
case for it. Some documentation:

http://www.rsyslog.com/doc/mmnormalize.html
http://www.rsyslog.com/tag/mmnormalize/
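
To illustrate the general idea outside of rsyslog: a normalizer describes the
whole message once and fills all fields in a single pass, instead of running
one regular expression per field. The Python sketch below shows only that
idea. It is not mmnormalize's rulebase syntax, and the pattern, the function
name parse_alert and the field names are assumptions made up for this
example, based on the single sample alarm quoted above. Alarms in other
layouts (for example, without ports) would not match.

```python
import json
import re

# Hypothetical pattern for the Snort alarm layout quoted above.
# All group names are illustrative, not part of rsyslog or liblognorm.
ALERT_RE = re.compile(
    r"\[(?P<gid>\d+):(?P<sid>\d+):(?P<rev>\d+)\]\s+"
    r"(?P<description>.*?)\s+"
    r"\[Classification: (?P<classification>[^\]]+)\]\s+"
    r"\[Priority: (?P<priority>\d+)\]\s+"
    r"\{(?P<proto>[A-Z]+)\}\s+"
    r"(?P<src_ip>[0-9.]+):(?P<src_port>\d+)\s+->\s+"
    r"(?P<dst_ip>[0-9.]+):(?P<dst_port>\d+)"
)


def parse_alert(msg):
    """Return a dict with all fields from one match, or None if no match."""
    m = ALERT_RE.search(msg)
    if m is None:
        return None
    fields = m.groupdict()
    # Same value the "sig" regex in the original template extracts (gid:sid).
    fields["sig"] = "{gid}:{sid}".format(**fields)
    return fields


sample = (
    "[1:2014527:1] ET CURRENT_EVENTS Exploit Kit Delivering Compressed "
    "Flash Content to Client [Classification: Potentially Bad Traffic] "
    "[Priority: 2] {TCP} 10.1.1.1:80 -> 10.2.2.2:51601"
)
record = parse_alert(sample)
print(json.dumps(record, sort_keys=True, indent=2))
```

With mmnormalize, the equivalent description lives in a liblognorm rulebase
and the extracted fields become message properties that a template can
reference; see the documentation linked above for the actual syntax.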

> Another question -- for writing into elasticsearch, I've used
> omelasticsearch with 'bulkmode' enabled and queue batch sizes set to
> higher values:
> 
> $MainMsgQueueDequeueBatchSize 1024
> $ActionQueueDequeueBatchSize 512
> 
> $template SnortIndex,"rsyslog-%timereported:1:10:date-rfc3339%"
> 
> if $programname == 'snort' then action(type="omelasticsearch"
>     template="SnortPayload" dynSearchIndex="on" searchIndex="SnortIndex"
>     server="localhost" bulkmode="on")
> 
> Are there any other ways for increasing the throughput?

It depends a bit on the overall workflow, but what I see looks reasonable. I'd
probably increase the batch sizes even further if the system is under heavy
load; 10240 for both is not excessive.
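
As a rough illustration of why the batch size matters in bulk mode: records
dequeued together can be shipped to Elasticsearch in one bulk request, so a
larger batch means fewer HTTP round trips. The Python sketch below is not
omelasticsearch's code. It only shows the newline-delimited layout of an
Elasticsearch bulk body, with the index name derived the same way as the
SnortIndex template above (first 10 characters of the RFC 3339 timestamp).
The function names and the sample records are assumptions for illustration.

```python
import json


def index_name(rfc3339_timestamp, prefix="rsyslog-"):
    # Mirrors %timereported:1:10:date-rfc3339%: characters 1..10, i.e. YYYY-MM-DD.
    return prefix + rfc3339_timestamp[:10]


def build_bulk_body(records):
    """Build a bulk body: one action line plus one source line per record."""
    lines = []
    for rec in records:
        action = {"index": {"_index": index_name(rec["timestamp"])}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(rec))
    # The bulk body must end with a newline; an empty batch yields an empty body.
    return "\n".join(lines) + "\n" if lines else ""


def batches(items, size):
    """Split a list into consecutive chunks of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Hypothetical records, shaped like the JSON the template above produces.
records = [
    {"timestamp": "2012-11-21T12:31:41+02:00", "host": "myhost", "proto": "TCP"},
    {"timestamp": "2012-11-21T12:31:42+02:00", "host": "myhost", "proto": "UDP"},
    {"timestamp": "2012-11-22T00:00:01+02:00", "host": "myhost", "proto": "TCP"},
]

for batch in batches(records, 2):
    body = build_bulk_body(batch)
    print("one request with %d record(s), %d bytes" % (len(batch), len(body)))
```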

Rainer
> 
> I'd be grateful for any feedback.
> 
> with kind regards,
> risto
> _______________________________________________
> rsyslog mailing list
> http://lists.adiscon.net/mailman/listinfo/rsyslog
> http://www.rsyslog.com/professional-services/
> What's up with rsyslog? Follow https://twitter.com/rgerhards
> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a
> myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST
> if you DON'T LIKE THAT.