I agree that making it easy for the user to "enrich enrichments", as Dima put it, to an arbitrary depth, would be extremely useful for a lot of use cases. We've discussed the use case a little in the past in this thread [1].
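To make "depth" concrete, here is a rough sketch in Python of how the number of phases could be derived from which fields each enrichment consumes and produces. This is purely illustrative and is not Metron code: the `reads`/`writes` spec, the `phase_depths` helper, and the `city_crime_level` output field are all hypothetical, and today's configuration does not declare enrichment outputs this way. The field names are borrowed from Dima's example in the quoted thread.

```python
from typing import Dict, List, Tuple

# Hypothetical spec (not Metron's config format):
# enrichment name -> the fields it reads and the fields it writes.
Spec = Dict[str, Dict[str, List[str]]]


def phase_depths(enrichments: Spec) -> Dict[str, int]:
    """Return the earliest 1-based phase in which each enrichment can run."""
    # Map every output field to the enrichment that produces it.
    producers = {
        field: name
        for name, spec in enrichments.items()
        for field in spec["writes"]
    }
    depths: Dict[str, int] = {}

    def depth(name: str, visiting: Tuple[str, ...] = ()) -> int:
        if name in depths:
            return depths[name]
        if name in visiting:
            raise ValueError(f"circular enrichment dependency at {name!r}")
        # Fields that no enrichment produces come from the raw message,
        # so they add no depth.
        upstream = [producers[f] for f in enrichments[name]["reads"] if f in producers]
        result = 1 + max((depth(u, visiting + (name,)) for u in upstream), default=0)
        depths[name] = result
        return result

    for name in enrichments:
        depth(name)
    return depths


# Dima's case: geo reads a raw field, the HBase lookup reads geo's output.
example: Spec = {
    "geo": {
        "reads": ["destination_ip"],
        "writes": ["enrichments.geo.destination_ip.country"],
    },
    "hbaseEnrichment": {
        "reads": ["enrichments.geo.destination_ip.country"],
        "writes": ["city_crime_level"],  # hypothetical output field name
    },
}

depths = phase_depths(example)
phases_needed = max(depths.values())
```

For this example, `geo` lands in phase 1 and `hbaseEnrichment` in phase 2, so two phases are needed. An enrichment that reads the crime level would push the requirement to three.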
Re-purposing the "threat intel" phase gives us something that is feasible today, but only to a "depth" of 2. We would also need to rename and redocument it so that users understand how they can leverage the two phases. This seems like a minimally viable option if we want to head down this road.

The other extreme might involve inferring the topology needed based on the user's configuration. If the user needs 3 phases, then we build a topology that supports 3 phases. Under the covers, instead of using Flux, we would use Storm's topology builder Java API to grok the configuration and build the topology(ies) that the user needs. I am not sure if we can infer this from the configuration as it exists today or if we would need to redefine the configuration somehow. Like I said, this is "extreme", but it could give the user more expressive and intuitive options.

---

[1] http://mail-archives.apache.org/mod_mbox/incubator-metron-dev/201610.mbox/%3CCAHSJ8NwJUiyp3YO6NVE4tfLoSSkOc6QG%2BMsAJSSDu%2B-wfct_vw%40mail.gmail.com%3E

On Mon, Jan 9, 2017 at 10:56 AM, Casey Stella <[email protected]> wrote:

> I think that would be a good feature to add to have arbitrary number of
> phases, though it might be tricky to code (the way I envisioned it would
> involve a loop in storm, which is possible[1]), might have unintended
> consequences to guarantees (e.g. updating enrichments might not be able to
> be applied in realtime) and could be tricky to reason about
> performance-wise.
>
> As it stands, the number of phases is a consequence of the topology
> itself. We do not currently have an architecture which would allow an
> arbitrary number of phases without changing the flux file itself. What you
> can do, though, in a stellar enrichment is stack enrichments (e.g. depend
> on previous enrichments) because it's just a list of stellar statements.
> The consequence, of course, is that these statements get run within the
> same worker, which is unfortunate, but may be a stopgap workaround.
>
> *1. https://groups.google.com/forum/#!topic/storm-user/EjN1hU58Q_8
>
> On Mon, Jan 9, 2017 at 10:48 AM, Otto Fowler <[email protected]> wrote:
>
> > Maybe the naming of the phases is misleading? What if you could set up an
> > arbitrary number of stages, with defaults?
> >
> > On January 8, 2017 at 16:25:01, Casey Stella ([email protected]) wrote:
> >
> > You could do the geo enrichment normally and do a stellar hbase enrichment
> > in the threat Intel phase.
> >
> > On Sun, Jan 8, 2017 at 16:22 Ryan Merriman <[email protected]> wrote:
> >
> > > Hbase enrichments and geo enrichments are done in parallel so I would not
> > > expect this to work. You could do the Hbase enrichment as a threat Intel
> > > enrichment and that should work because enrichments and threat Intel are
> > > done in series.
> > >
> > > The ideal way would be to chain together Stellar enrichments but I don't
> > > think there is a geo enrichment function created yet. I think that should
> > > be a Jira. I know someone is working on an update to how we do geo
> > > enrichments so I will file a follow on Jira if it's not included in the
> > > scope of that work.
> > >
> > > Ryan
> > >
> > > > On Jan 8, 2017, at 2:31 PM, Dima Kovalyov <[email protected]> wrote:
> > > >
> > > > Is it possible to enrich enrichment?
> > > >
> > > > For example I have IP address, I enrich it with geo and get City name,
> > > > now I want to enrich City name with city crime level (assume I have that
> > > > data). But when I do that it just does not work. I specify enrichment
> > > > like that:
> > > >> {
> > > >>   "index" : "msexchange",
> > > >>   "batchSize" : 5,
> > > >>   "enrichment" : {
> > > >>     "fieldMap" : {
> > > >>       "geo" : [ "destination_ip", "source_ip" ],
> > > >>       "hbaseEnrichment" : [ "enrichments.geo.destination_ip.country" ],
> > > >>       "hbaseEnrichment" : [ "enrichments:geo:destination_ip:country" ],
> > > >>       "hbaseEnrichment" : [ "enrichments.geo.destination_ip:country" ]
> > > >>     },
> > > >>     "fieldToTypeMap" : {
> > > >>       "enrichments.geo.destination_ip.country" : [ "city_crime_level" ],
> > > >>       "enrichments:geo:destination_ip:country" : [ "city_crime_level" ],
> > > >>       "enrichments.geo.destination_ip:country" : [ "city_crime_level" ]
> > > >>     },
> > > >>     "config" : { }
> > > >>   },
> > > >>   "threatIntel" : {
> > > >>     "fieldMap" : { },
> > > >>     "fieldToTypeMap" : { },
> > > >>     "config" : { },
> > > >>     "triageConfig" : {
> > > >>       "riskLevelRules" : { },
> > > >>       "aggregator" : "MAX",
> > > >>       "aggregationConfig" : { }
> > > >>     }
> > > >>   },
> > > >>   "configuration" : { }
> > > >> }
> > > > I tried all the ways how enrichment field can be entered just to be sure
> > > > I do not mistype it.
> > > >
> > > > - Dima

--
Nick Allen <[email protected]>
