Hi all,
I apologize in advance if this question has been asked before. I have
been playing with the omelasticsearch module recently and it works nicely
for me. According to my tests, its performance is clearly superior to some
Java tools which have been used for Elasticsearch in the past.
I began experimenting with the configuration given at
http://wiki.rsyslog.com/index.php/HOWTO:_rsyslog_%2B_elasticsearch
and tried to extend it into a more advanced configuration.
My question is about advanced parsing of log messages and extracting
additional fields from message content. In my environment, I have a lot
of IDS alarms, e.g.,
Nov 21 12:31:41 myhost snort[17449]: [1:2014527:1] ET CURRENT_EVENTS Exploit Kit Delivering Compressed Flash Content to Client [Classification: Potentially Bad Traffic] [Priority: 2] {TCP} 10.1.1.1:80 -> 10.2.2.2:51601
I would like to extract some fields, such as the signature ID, transport
protocol, and source IP, from each alarm, build a JSON record, and
write it to Elasticsearch.
To address the problem of data extraction, I have used the rsyslog
property replacer. For example, to extract the main message fields plus
the signature ID and transport protocol, this template could be used:
$template SnortTemplate,"{\"timestamp\":\"%timereported:::date-rfc3339%\",\"message\":\"%msg:::json%\",\"host\":\"%HOSTNAME:::json%\",\"sig\":\"%msg:R,ERE,1:\[([0-9]+:[0-9]+):[0-9]+\]--end%\",\"proto\":\"%msg:R,ERE,1:\{([A-Z]+)\} [0-9.]+--end%\"}"
The signature extraction is done with
%msg:R,ERE,1:\[([0-9]+:[0-9]+):[0-9]+\]--end%
while protocol is extracted with
%msg:R,ERE,1:\{([A-Z]+)\} [0-9.]+--end%
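To show what these two expressions capture, here is a small Python
sketch that applies the same patterns to the sample alarm. It runs
outside rsyslog and is for illustration only: the helper first_group is
a name I made up, the patterns are matched against the whole log line
rather than the msg property, and the empty-string result on a failed
match is my own choice, not rsyslog's no-match behaviour.

```python
import re

# The sample alarm from above, joined back into a single line.
ALARM = (
    "Nov 21 12:31:41 myhost snort[17449]: [1:2014527:1] ET CURRENT_EVENTS "
    "Exploit Kit Delivering Compressed Flash Content to Client "
    "[Classification: Potentially Bad Traffic] [Priority: 2] {TCP} "
    "10.1.1.1:80 -> 10.2.2.2:51601"
)

# Same patterns as in the template; submatch 1 is what "R,ERE,1" selects.
SIG_RE = re.compile(r"\[([0-9]+:[0-9]+):[0-9]+\]")
PROTO_RE = re.compile(r"\{([A-Z]+)\} [0-9.]+")


def first_group(pattern, text):
    """Return capture group 1 of the first match, or '' if nothing matches.

    Hypothetical helper; rsyslog's handling of a failed match is not modelled.
    """
    match = pattern.search(text)
    return match.group(1) if match else ""


sig = first_group(SIG_RE, ALARM)      # generator ID and signature ID
proto = first_group(PROTO_RE, ALARM)  # transport protocol
print(sig, proto)
```

Note that "[17449]" in the syslog tag does not satisfy the signature
pattern, because the pattern requires two colons inside the brackets.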
However, while this approach works for me, it requires a separate
regular expression match for each additional field. My question is --
is there a better way to accomplish this task?
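To make clear what I mean by extracting everything in one pass, here is
a rough Python sketch of the result I am after. It is not rsyslog
configuration, only an illustration of the desired output. The field
names (text, classification, priority, src_ip and so on) and the
function parse_alarm are my own assumptions, and the pattern only covers
alarms shaped like the sample above (for instance, it expects ports on
both addresses).

```python
import json
import re

# One pattern with named groups, so a single match yields every field.
ALARM_RE = re.compile(
    r"\[(?P<sig>[0-9]+:[0-9]+):[0-9]+\] "
    r"(?P<text>.*?) "
    r"\[Classification: (?P<classification>[^\]]+)\] "
    r"\[Priority: (?P<priority>[0-9]+)\] "
    r"\{(?P<proto>[A-Z]+)\} "
    r"(?P<src_ip>[0-9.]+):(?P<src_port>[0-9]+) -> "
    r"(?P<dst_ip>[0-9.]+):(?P<dst_port>[0-9]+)"
)


def parse_alarm(msg):
    """Return a dict of extracted fields, or None if the alarm does not match."""
    match = ALARM_RE.search(msg)
    return match.groupdict() if match else None


# Message part of the sample alarm (everything after the syslog tag).
MSG = (
    "[1:2014527:1] ET CURRENT_EVENTS Exploit Kit Delivering Compressed "
    "Flash Content to Client [Classification: Potentially Bad Traffic] "
    "[Priority: 2] {TCP} 10.1.1.1:80 -> 10.2.2.2:51601"
)

record = parse_alarm(MSG)
print(json.dumps(record, sort_keys=True, indent=2))
```

All values come out as strings here; converting the priority and the
ports to integers would be a separate step.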
Another question -- for writing into elasticsearch, I've used
omelasticsearch with 'bulkmode' enabled and queue batch sizes set to
higher values:
$MainMsgQueueDequeueBatchSize 1024
$ActionQueueDequeueBatchSize 512
$template SnortIndex,"rsyslog-%timereported:1:10:date-rfc3339%"
if $programname == 'snort' then action(type="omelasticsearch"
    template="SnortTemplate" dynSearchIndex="on" searchIndex="SnortIndex"
    server="localhost" bulkmode="on")
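For reference, this is how I understand the two pieces fit together: the
SnortIndex template keeps characters 1 to 10 of the RFC 3339 timestamp,
which gives one index per day, and bulk mode sends records in batches as
newline-delimited JSON. The Python sketch below only mimics that shape.
The timestamp and the record are made-up examples, the function names
are my own, and the exact metadata that omelasticsearch puts on the
action line may differ from what is shown.

```python
import json


def daily_index(rfc3339_ts, prefix="rsyslog-"):
    """Mimic %timereported:1:10:date-rfc3339%: keep the first 10 characters."""
    return prefix + rfc3339_ts[:10]


def bulk_body(records, index):
    """Build a bulk-style body: an action line followed by a document line."""
    lines = []
    for rec in records:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(rec))
    # The body is newline-delimited and ends with a trailing newline.
    return "\n".join(lines) + "\n"


# Hypothetical timestamp and record, for illustration only.
index = daily_index("2012-11-21T12:31:41+02:00")
body = bulk_body([{"sig": "1:2014527", "proto": "TCP"}], index)
print(index)
print(body, end="")
```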
Are there any other ways to increase the throughput?
I'd be grateful for any feedback.
With kind regards,
risto
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog