Bas van de Lustgraaf created METRON-1496:
--------------------------------------------
Summary: ChainLink Parse to reuse parser code at parserConfig level
Key: METRON-1496
URL: https://issues.apache.org/jira/browse/METRON-1496
Project: Metron
Issue Type: Improvement
Reporter: Bas van de Lustgraaf
While developing some custom parsers, we wrote a couple of classes and
functions that let us reuse code and assemble parsers more quickly at the Java
code level.
We took this idea one step further and created the so-called ChainLinkParser.
This parser gives users without any Java knowledge the opportunity to assemble
parsers at the parser configuration level.
We would like to discuss the code and see whether it can be submitted to the
project. We will create a PR this week to submit the code for review and
discussion.
Below is an example of our parser configuration for Suricata, which uses our
ChainParser.
{noformat}
{
  "parserClassName":"nl.qsight.chainparser.ChainParser",
  "sensorTopic":"suricata",
  "readMetadata":true,
  "mergeMetadata":true,
  "numWorkers":3,
  "numAckers":3,
  "spoutParallelism":6,
  "spoutNumTasks":6,
  "parserParallelism":20,
  "parserNumTasks":20,
  "errorWriterParallelism":1,
  "errorWriterNumTasks":1,
  "spoutConfig":{
    "spout.firstPollOffsetStrategy":"LATEST"
  },
  "stormConfig":{
    "topology.max.spout.pending":2000
  },
  "parserConfig":{
    "chain":[
      "parse_json",
      "parse_username",
      "rename_fields",
      "parse_datetime"
    ],
    "parsers":{
      "parse_json":{
        "class":"nl.qsight.links.io.JSONDecoderLink"
      },
      "parse_username":{
        "class":"nl.qsight.links.io.RegexLink",
        "pattern":"(?i)(user|username|log)[=:](\\w+)",
        "selector":{
          "username":"2"
        },
        "input":"{{payload_printable}}"
      },
      "rename_fields":{
        "class":"nl.qsight.links.fields.RenameLink",
        "rename":{
          "proto":"protocol",
          "dest_ip":"ip_dst_addr",
          "src_ip":"ip_src_addr",
          "dest_port":"ip_dst_port",
          "src_port":"ip_src_port"
        }
      },
      "parse_datetime":{
        "class":"nl.qsight.links.io.TimestampLink",
        "patterns":[
          [
            "([0-9]{4})-([0-9]+)-([0-9]+)T([0-9]+):([0-9]+):([0-9]+).([0-9]+)([+-]{1}[0-9]{1,2}[:]?[0-9]{2})",
            "yyyy MM dd HH mm ss SSSSSS Z",
            "newest"
          ]
        ],
        "input":"{{timestamp}}"
      }
    }
  }
}
{noformat}
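To make the idea behind the "chain" list easier to discuss, here is a minimal Java sketch of how links could be applied in order, each receiving the fields produced by the previous one. This is not the actual nl.qsight code: the names ChainSketch, Link, regexLink, renameLink and runChain are hypothetical, and the sample message is made up. Only the regex pattern, the capture group number and the field names come from the configuration above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the real ChainParser implementation.
class ChainSketch {

    /** One processing step: takes the message fields, returns the updated fields. */
    interface Link {
        Map<String, Object> apply(Map<String, Object> fields);
    }

    /** Copies one regex capture group from an input field into an output field. */
    static Link regexLink(String pattern, String inputField, String outputField, int group) {
        Pattern compiled = Pattern.compile(pattern);
        return fields -> {
            Map<String, Object> out = new LinkedHashMap<>(fields);
            Object input = fields.get(inputField);
            if (input != null) {
                Matcher m = compiled.matcher(input.toString());
                if (m.find()) {
                    out.put(outputField, m.group(group));
                }
            }
            return out;
        };
    }

    /** Renames fields using an old-name to new-name mapping; other fields pass through. */
    static Link renameLink(Map<String, String> renames) {
        return fields -> {
            Map<String, Object> out = new LinkedHashMap<>();
            for (Map.Entry<String, Object> e : fields.entrySet()) {
                out.put(renames.getOrDefault(e.getKey(), e.getKey()), e.getValue());
            }
            return out;
        };
    }

    /** Runs the links in order, feeding each link's output into the next link. */
    static Map<String, Object> runChain(List<Link> chain, Map<String, Object> message) {
        Map<String, Object> current = message;
        for (Link link : chain) {
            current = link.apply(current);
        }
        return current;
    }

    /** Builds a two-link chain and runs it over a small hand-written message. */
    static Map<String, Object> demo() {
        List<Link> chain = new ArrayList<>();
        chain.add(regexLink("(?i)(user|username|log)[=:](\\w+)",
                "payload_printable", "username", 2));
        Map<String, String> renames = new LinkedHashMap<>();
        renames.put("src_ip", "ip_src_addr");
        renames.put("dest_ip", "ip_dst_addr");
        chain.add(renameLink(renames));

        Map<String, Object> message = new LinkedHashMap<>();
        message.put("src_ip", "10.0.0.1");
        message.put("dest_ip", "10.0.0.2");
        message.put("payload_printable", "GET /login?user=alice HTTP/1.1");
        return runChain(chain, message);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this sketch every link returns a new map instead of modifying its input, which keeps each step easy to test on its own. Whether the real links work this way is something the PR review can settle.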