Github user justinleet commented on a diff in the pull request:
https://github.com/apache/metron/pull/1099#discussion_r203092801
--- Diff: use-cases/parser_chaining/README.md ---
@@ -233,3 +233,10 @@ cat ~/data.log |
/usr/hdp/current/kafka-broker/bin/kafka-console-producer.sh --b
```
You should see indices created for the `cisco-5-304` and `cisco-6-302`
data with appropriate fields created for each type.
+
+# Aggregated Parsers with Parser Chaining
+Chained parsers can be run as aggregated parsers. These parsers continue
to use the sensor-specific Kafka topics and do not do internal routing to the
appropriate sensor.
+
--- End diff ---
Right now, as noted in the description, there's no UI attached to this.
Even the REST API's update is pretty minimal (it just accepts comma-separated
lists). I didn't want to build that out, because the management UI requires a
decent amount of thought and that will ripple through REST as needed (e.g.
needing/wanting to pass spout num tasks, parallelism, etc.). Right now I see
this as a low-level way to get some of the benefits of this type of
aggregation, with making it more user-friendly as follow-on work, since that
will require nontrivial effort and design. I can go ahead and create
follow-on tickets for that work, if that works for you.
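
As a rough illustration of that minimal REST-level change, starting an
aggregated parser might amount to passing a comma-separated sensor list to
the existing parser start endpoint; the host, port, credentials, endpoint
path, and sensor pairing below are placeholders for illustration, not the
confirmed API surface:

```
# Hypothetical illustration: start an aggregated parser by handing the
# existing parser start endpoint a comma-separated list of sensor names.
# Host, port, credentials, and endpoint path are placeholders.
curl -u user:password -X POST \
  "http://metron-rest-host:8082/api/v1/storm/parser/start/cisco-5-304,cisco-6-302"
```
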
For the default Ambari processors, I'm not particularly inclined to worry
about it, although I could be persuaded that we need to. That feels like
something that can be addressed as this is made more user-friendly (i.e. I
expect people familiar enough with the system to decide to aggregate parsers
right now to also be familiar enough to stop the topologies). I could add a
warning to the docs not to run an aggregated parser containing sensor X
alongside a dedicated topology for sensor X, but I'm not sure that's
necessary.
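
To make the "stop the topologies" point concrete, shutting down a dedicated
parser topology before starting an aggregated one that covers the same sensor
can be done with the standard Storm CLI; the topology name below is a
placeholder, on the convention that dedicated parser topologies are named
after their sensor:

```
# Stop the standalone topology for a sensor before it is folded into an
# aggregated parser, so the same sensor data is not parsed twice.
# "cisco-5-304" is a placeholder topology name for illustration.
storm kill cisco-5-304
```
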
I also went ahead and added the actual command to the chain parsers README,
so the practical example is complete.
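
For readers of this thread, the added command is along these lines; this is a
sketch that assumes the comma-separated sensor syntax on the standard start
script, with `pix_syslog_router` and the environment variables used as
illustrative assumptions, so the README itself remains the authoritative
version:

```
# Sketch: start the chained parsers as one aggregated Storm topology by
# passing a comma-separated sensor list to the standard start script.
# Variable values and the exact sensor list should come from the README.
$METRON_HOME/bin/start_parser_topology.sh \
  -k $BROKERLIST \
  -z $ZOOKEEPER \
  -s "pix_syslog_router,cisco-5-304,cisco-6-302"
```
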
---