Github user cestella commented on a diff in the pull request:
https://github.com/apache/metron/pull/1099#discussion_r202817242
--- Diff: metron-platform/metron-parsers/README.md ---
@@ -82,6 +82,12 @@ topology in kafka. Errors are collected with the
context of the error
(e.g. stacktrace) and original message causing the error and sent to an
`error` queue. Invalid messages as determined by global validation
functions are also treated as errors and sent to an `error` queue.
+
+Multiple sensors can be aggregated into a single Storm topology. When this
is done, there will be
+multiple Kafka spouts, but only a single parser bolt which will handle
delegating to the correct
--- End diff --
@justinleet can you maybe create a data flow diagram or sequence diagram
that shows a syslog record from the use-case flowing through this topology and
add it to the use-case around parser chaining?
Something like this: given a `cisco-6-302` record, it'll go:
* From NiFi to the `pix_syslog_router` Kafka topic
* From the `pix_syslog_router` Kafka topic to the `pix_syslog_router` spout
in the aggregated Storm topology
* From the `pix_syslog_router` Kafka spout to the parser bolt, which will
run the `pix_syslog_router` Grok parser and write out to the `cisco-6-302`
Kafka topic
* From the `cisco-6-302` Kafka topic to the `cisco-6-302` spout in the
aggregated Storm topology
* From the `cisco-6-302` Kafka spout to the parser bolt, which will run the
`cisco-6-302` Grok parser and write out to the `enrichments` Kafka topic,
where it's picked up by the enrichment topology.
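To make the hops concrete until a diagram exists, here is a rough Python sketch that traces a record through the steps listed above. It is only a toy model, not Metron code: the `ROUTES` table, the `trace` function, and the `tag` field on the record are all hypothetical names made up for illustration, and the only things taken from the comment are the topic and sensor names.

```python
# Toy model of the flow described above. Nothing here is a Metron API.
# ROUTES maps a sensor (its input topic) to a function that picks the
# output topic the parser writes to. Both the table and the "tag" field
# are assumptions made for this sketch.
ROUTES = {
    # The router parser sends each record to a topic named after its tag.
    "pix_syslog_router": lambda record: record["tag"],
    # The specific parser always hands off to the enrichment topology.
    "cisco-6-302": lambda record: "enrichments",
}


def trace(record, entry_topic="pix_syslog_router"):
    """Return the list of hops a record takes through the aggregated topology."""
    hops = [f"NiFi -> topic:{entry_topic}"]
    topic = entry_topic
    # Keep hopping while the current topic belongs to a sensor in the topology.
    while topic in ROUTES:
        hops.append(f"topic:{topic} -> spout:{topic}")
        next_topic = ROUTES[topic](record)
        hops.append(
            f"spout:{topic} -> parser bolt ({topic} parser) -> topic:{next_topic}"
        )
        topic = next_topic
    return hops


if __name__ == "__main__":
    for hop in trace({"tag": "cisco-6-302"}):
        print(hop)
```

Each sensor contributes two hops (topic to spout, then spout through the single parser bolt to the next topic), so the trace lines up one-to-one with the five bullets above.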
Eventually, we should consider making the write to the `cisco-6-302` topic
optional, but even then there may be value in those intermediate Kafka
topics because of how users may want to group sensors (e.g. grouping may be
driven by velocity or scalability requirements rather than logical
connection).
---