[ 
https://issues.apache.org/jira/browse/METRON-1496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bas van de Lustgraaf updated METRON-1496:
-----------------------------------------
    Description: 
During the development of some custom parsers, we wrote a couple of classes and 
functions that make it possible to reuse code and assemble parsers more quickly 
at the Java coding level.

We took this idea one step further and created the so-called ChainLinkParser.

This parser gives users without any Java knowledge the opportunity to assemble 
parsers at the parser configuration level.
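
As a rough illustration of the chaining idea, the sketch below runs a list of named "links" in order, each one transforming the fields produced by the previous one. It is not the actual ChainParser code: the `Link` interface, the `rename` and `addField` helpers, and the `run` method are hypothetical names chosen only for this example.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical sketch of a configurable chain; not the nl.qsight implementation.
class ChainSketch {

    // A link receives the fields built so far and returns the updated fields.
    interface Link extends UnaryOperator<Map<String, Object>> {}

    // Hypothetical link: renames fields using an old-name -> new-name mapping.
    static Link rename(Map<String, String> mapping) {
        return fields -> {
            Map<String, Object> out = new LinkedHashMap<>();
            fields.forEach((key, value) -> out.put(mapping.getOrDefault(key, key), value));
            return out;
        };
    }

    // Hypothetical link: adds one constant field to the message.
    static Link addField(String name, Object value) {
        return fields -> {
            Map<String, Object> out = new LinkedHashMap<>(fields);
            out.put(name, value);
            return out;
        };
    }

    // Applies the links in the order listed in "chain", looking each one up by name.
    static Map<String, Object> run(List<String> chain,
                                   Map<String, Link> links,
                                   Map<String, Object> input) {
        Map<String, Object> current = new LinkedHashMap<>(input);
        for (String name : chain) {
            Link link = links.get(name);
            if (link == null) {
                throw new IllegalArgumentException("Unknown link: " + name);
            }
            current = link.apply(current);
        }
        return current;
    }

    public static void main(String[] args) {
        Map<String, Link> links = new LinkedHashMap<>();
        links.put("rename_fields", rename(Map.of("src_ip", "ip_src_addr", "dest_ip", "ip_dst_addr")));
        links.put("tag_sensor", addField("sensor", "suricata"));

        Map<String, Object> input = new LinkedHashMap<>();
        input.put("src_ip", "10.0.0.1");
        input.put("dest_ip", "10.0.0.2");

        System.out.println(run(List.of("rename_fields", "tag_sensor"), links, input));
    }
}
```

The point of the design is that the order and the settings of the links live in data (a list of names plus a map of definitions), so a new parser could be assembled by editing configuration rather than Java code.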

We would like to discuss the code and see whether it can be submitted to the 
project. We will create a PR this week to submit the code for review and 
discussion.

Below you'll find an example of our parser configuration for Suricata, which 
uses our ChainParser.

 

{noformat}
{
   "parserClassName":"nl.qsight.chainparser.ChainParser",
   "sensorTopic":"suricata",
   "readMetadata":true,
   "mergeMetadata":true,
   "numWorkers":3,
   "numAckers":3,
   "spoutParallelism":6,
   "spoutNumTasks":6,
   "parserParallelism":20,
   "parserNumTasks":20,
   "errorWriterParallelism":1,
   "errorWriterNumTasks":1,
   "spoutConfig":{
      "spout.firstPollOffsetStrategy":"LATEST"
   },
   "stormConfig":{
      "topology.max.spout.pending":2000
   },
   "parserConfig":{
      "chain":[
         "parse_json",
         "parse_username",
         "rename_fields",
         "parse_datetime"
      ],
      "parsers":{
         "parse_json":{
            "class":"nl.qsight.links.io.JSONDecoderLink"
         },
         "parse_username":{
            "class":"nl.qsight.links.io.RegexLink",
            "pattern":"(?i)(user|username|log)[=:](\\w+)",
            "selector":{
               "username":"2"
            },
            "input":"{{payload_printable}}"
         },
         "rename_fields":{
            "class":"nl.qsight.links.fields.RenameLink",
            "rename":{
               "proto":"protocol",
               "dest_ip":"ip_dst_addr",
               "src_ip":"ip_src_addr",
               "dest_port":"ip_dst_port",
               "src_port":"ip_src_port"
            }
         },
         "parse_datetime":{
            "class":"nl.qsight.links.io.TimestampLink",
            "patterns":[
               [
                  "([0-9]{4})-([0-9]+)-([0-9]+)T([0-9]+):([0-9]+):([0-9]+).([0-9]+)([+-]{1}[0-9]{1,2}[:]?[0-9]{2})",
                  "yyyy MM dd HH mm ss SSSSSS Z",
                  "newest"
               ]
            ],
            "input":"{{timestamp}}"
         }
      }
   }
}
{noformat}
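
To make the `parse_username` entry more concrete, here is a minimal sketch of what a regex-plus-selector step could do, using only `java.util.regex`. It assumes that the selector value `"2"` refers to capture group 2 of the pattern and that the first match wins; the class and method names are hypothetical and are not taken from the RegexLink source.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of a regex link with a group selector.
class RegexSelectorSketch {

    // Returns one output field per selector entry, filled from the first match.
    // Assumption: each selector value is a capture-group index in the pattern.
    static Map<String, String> extract(String pattern,
                                       Map<String, Integer> selector,
                                       String input) {
        Map<String, String> out = new LinkedHashMap<>();
        Matcher matcher = Pattern.compile(pattern).matcher(input);
        if (matcher.find()) {
            selector.forEach((field, group) -> out.put(field, matcher.group(group)));
        }
        return out; // empty when the pattern does not match
    }

    public static void main(String[] args) {
        // Same pattern as in the configuration; the JSON escape \\w is also \\w in Java source.
        String pattern = "(?i)(user|username|log)[=:](\\w+)";
        String payload = "GET /login?user=alice HTTP/1.1";
        System.out.println(extract(pattern, Map.of("username", 2), payload));
    }
}
```

With this payload, `login?` does not match because `log` is not directly followed by `=` or `:`, so the first match is `user=alice` and group 2 yields `alice`.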


  was:
During the development of some custom parsers, we wrote a couple of classes and 
functions that make it possible to reuse code and assemble parsers more quickly 
at the Java coding level.

We took this idea one step further and created the so-called ChainLinkParser.

This parser gives users without any Java knowledge the opportunity to assemble 
parsers at the parser configuration level.

We would like to discuss the code and see whether it can be submitted to the 
project. We will create a PR this week to submit the code for review and 
discussion.

Below you'll find an example of our parser configuration for Suricata, which 
uses our ChainParser.

 
{noformat}
{
  "parserClassName":"nl.qsight.chainparser.ChainParser",
  "sensorTopic":"suricata",
  "readMetadata":true,
  "mergeMetadata":true,
  "numWorkers":3,
  "numAckers":3,
  "spoutParallelism":6,
  "spoutNumTasks":6,
  "parserParallelism":20,
  "parserNumTasks":20,
  "errorWriterParallelism":1,
  "errorWriterNumTasks":1,
  "spoutConfig":{
    "spout.firstPollOffsetStrategy":"LATEST"
  },
  "stormConfig":{
    "topology.max.spout.pending":2000
  },
  "parserConfig":{
    "chain":[
      "parse_json",
      "parse_username",
      "rename_fields",
      "parse_datetime"
    ],
  "parsers":{
    "parse_json":{
      "class":"nl.qsight.links.io.JSONDecoderLink"
    },
    "parse_username":{
      "class":"nl.qsight.links.io.RegexLink",
      "pattern":"(?i)(user|username|log)[=:](\\w+)",
      "selector":{
        "username":"2"
      },
      "input":"{{payload_printable}}"
    },
    "rename_fields":{
      "class":"nl.qsight.links.fields.RenameLink",
      "rename":{
        "proto":"protocol",
        "dest_ip":"ip_dst_addr",
        "src_ip":"ip_src_addr",
        "dest_port":"ip_dst_port",
        "src_port":"ip_src_port"
      }
    },
    "parse_datetime":{
      "class":"nl.qsight.links.io.TimestampLink",
      "patterns":[
        [
          "([0-9]{4})-([0-9]+)-([0-9]+)T([0-9]+):([0-9]+):([0-9]+).([0-9]+)([+-]{1}[0-9]{1,2}[:]?[0-9]{2})",
          "yyyy MM dd HH mm ss SSSSSS Z",
          "newest"
        ]
      ],
      "input":"{{timestamp}}"
    }
  }
 }
}{noformat}
 

 


> ChainLink Parser to reuse parser code at parserConfig level
> -----------------------------------------------------------
>
>                 Key: METRON-1496
>                 URL: https://issues.apache.org/jira/browse/METRON-1496
>             Project: Metron
>          Issue Type: Improvement
>            Reporter: Bas van de Lustgraaf
>            Priority: Minor



