When a message is filtered by the message filtering mechanism, we
explicitly drop the message (and presumably ack it in Storm), as explained
here -
https://github.com/apache/metron/tree/master/metron-platform/metron-parsing#filtered.
When using the REGEX_SELECT field transformation (see here -
https://github.com/apache/metron/tree/master/metron-platform/metron-parsing#fieldtransformation-configuration)
with the kafka.topicField option for parser chaining, it's unclear to me
whether we expect the same behavior (drop the message, then ack it). This
example in the parser-chaining doc
https://github.com/apache/metron/tree/master/use-cases/parser_chaining#the-pix_syslog_router-parser
suggests that the approach we take for filtered messages is the correct
one. However, when I test an example with dropped messages, we appear not
to ack them.

Before creating a fix, I thought it best to summarize and confirm my
expectations for this functionality: messages passing through REGEX_SELECT
that match no pattern, and therefore get no value assigned to their output
topic field, should be dropped and acked.

*Example:*
{
  "parserClassName": "org.apache.metron.parsers.GrokParser",
  "sensorTopic": "myInTopic",
  ...
  "parserConfig": {
    ...,
    "kafka.topicField": "output_topic"
  },
  "fieldTransformations": [
    {
      "input": [
        "message"
      ],
      "output": [
        "output_topic"
      ],
      "transformation": "REGEX_SELECT",
      "config": {
        "world": "^Hello "
      }
    },
    ...
  ]
}

*Input Records:*
"...sshd[32469]: Hello..."
"...sshd[30432]: Bye..."

*Output:*
Kafka topic = "world" (as determined by the REGEX_SELECT pattern match that
sets the "output_topic" property used by kafka.topicField)
1 record present
contents of that record = our record with "Hello" in it
1 record is dropped ("Bye" record) and will not be forwarded any further
through the pipeline.
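
To make the expectation concrete, here is a rough Python sketch of the
behavior I have in mind. It is a simulation only, not Metron or Storm code:
the function names (regex_select, route), the record shape, and the two
sample "message" values are hypothetical, and I am assuming the pattern is
applied with "find" semantics against the already-parsed message field.

```python
import re

# Topic name -> pattern, taken from the "config" block in the example above.
CONFIG = {"world": "^Hello "}


def regex_select(value, config):
    """Return the first key whose pattern is found in value, else None."""
    for topic, pattern in config.items():
        if re.search(pattern, value):
            return topic
    return None


def route(records, config, topic_field="output_topic"):
    """Simulate the expected outcome: forward matches, drop the rest, ack all."""
    forwarded = {}  # topic -> records that would be written to that topic
    dropped = []    # records that never got a topic assigned
    acked = []      # every input record should end up here exactly once
    for record in records:
        topic = regex_select(record["message"], config)
        if topic is None:
            dropped.append(record)
        else:
            out = dict(record)
            out[topic_field] = topic
            forwarded.setdefault(topic, []).append(out)
        # Ack in both branches so a dropped tuple is not left pending.
        acked.append(record)
    return forwarded, dropped, acked


# Hypothetical parsed "message" values standing in for the two input records.
records = [{"message": "Hello there"}, {"message": "Bye there"}]
forwarded, dropped, acked = route(records, CONFIG)
```

The point of the sketch is the single ack at the end of the loop body: the
"Bye" record is dropped, but it is still acked, which is the part that
appears to be missing today.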
