Here is a quick example with some hypothetical syntax. Whatever that syntax
might be, it would be very simple, easy to understand, and leverage
high-level concepts specific to Metron.

This flow consumes Bro data, ensures there are valid source/destination
IPs, performs geo-enrichment and asset enrichment, and finally persists the
data in Elasticsearch.

    source("bro")
      -> parser("BasicBroParser")
      -> exists("ip_src_addr")
      -> exists("ip_dst_addr")
      -> geo_ip_src = geo["ip_src_addr"]
      -> geo_ip_dst = geo["ip_dst_addr"]
      -> application = assets["ip_src_addr"].application
      -> owner = assets["ip_src_addr"].owner
      -> elasticsearch("bro-index")
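
To make the semantics of that flow a bit more concrete, here is a rough
Python sketch of how the steps could compose. Nothing here is Metron or
Stellar code: the names exists, enrich and run_flow, and the GEO and ASSETS
lookup tables, are all made up for illustration. The source, parser and
Elasticsearch steps are left out, so the input is a list of already-parsed
messages and the output is just collected in a list.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Optional

Message = Dict[str, object]
# A step returns the (possibly enriched) message, or None to drop it.
Step = Callable[[Message], Optional[Message]]

# Hypothetical lookup tables standing in for real geo and asset enrichment.
GEO = {"10.0.0.1": "US", "10.0.0.2": "DE"}
ASSETS = {"10.0.0.1": {"application": "web-server", "owner": "ops"}}


def exists(field: str) -> Step:
    """Keep only messages that carry the given field."""
    def step(msg: Message) -> Optional[Message]:
        return msg if field in msg else None
    return step


def enrich(target: str, table: dict, key_field: str,
           attr: Optional[str] = None) -> Step:
    """Set msg[target] from table[msg[key_field]], optionally picking one attribute."""
    def step(msg: Message) -> Optional[Message]:
        value = table.get(msg[key_field])
        if attr is not None and value is not None:
            value = value.get(attr)
        return {**msg, target: value}
    return step


def run_flow(messages: Iterable[Message], steps: List[Step]) -> Iterator[Message]:
    """Push each message through the steps in order; skip messages a step drops."""
    for msg in messages:
        current: Optional[Message] = dict(msg)
        for step in steps:
            current = step(current)
            if current is None:
                break
        else:
            yield current


flow = [
    exists("ip_src_addr"),
    exists("ip_dst_addr"),
    enrich("geo_ip_src", GEO, "ip_src_addr"),
    enrich("geo_ip_dst", GEO, "ip_dst_addr"),
    enrich("application", ASSETS, "ip_src_addr", "application"),
    enrich("owner", ASSETS, "ip_src_addr", "owner"),
]

messages = [
    {"ip_src_addr": "10.0.0.1", "ip_dst_addr": "10.0.0.2"},
    {"ip_src_addr": "10.0.0.1"},  # no destination IP, so it is dropped
]
results = list(run_flow(messages, flow))
```

The point of the sketch is only that the flow is an ordered list of small,
independent steps. Reordering, removing or adding a step changes the data
flow without touching any of the other steps.
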
On Thu, Oct 6, 2016 at 12:58 PM, Nick Allen <[email protected]> wrote:
> Chasing this bad idea down even further leads me to something even
> crazier.
>
> Stellar 1.0 can only operate within a single topology and in most cases
> only on a single message. Stellar 2.0 could be the mechanism that allows
> users to define their own data flows and what "useful bits of Metron
> functionality" get plugged in.
>
> Once you have a DSL that allows users to define what they want Metron to
> do, the underlying implementation mechanism (which is currently Storm)
> can also be swapped out. If we have an even faster Storm implementation,
> then we swap in the Storm NG engine. Maybe we want Metron to also run in
> Flink; then we just swap in a Flink engine.
>
> On Thu, Oct 6, 2016 at 12:52 PM, Nick Allen <[email protected]> wrote:
>
>> I totally "bird dogged the previous thread" as Casey likes to call it. :)
>> I am extracting this thought into a separate thread before I start
>> throwing out even more, crazier ideas.
>>
>>> In general, Metron is very opinionated about data flows right now. We
>>> have Parser topologies that feed an Enrichment topology, which then feeds
>>> an Indexing topology. We have useful bits of functionality (think Stellar
>>> transforms, Geo enrichment, etc) that are closely coupled with these
>>> topologies (aka data flows).
>>>
>>> When a user wants to parse heterogeneous data from a single topic, that's
>>> not easy. When a user wants enriched output to land in unique topics by
>>> sensor type, well, that's also not easy. When a user wanted to skip
>>> enrichment of data sources, we actually re-architected the data flow to add
>>> the Indexing topology.
>>>
>>> In an ideal world, a user should be responsible for defining the data
>>> flow, not Metron. Metron should provide the "useful bits of functionality"
>>> that a user can "plug in" wherever they like. Metron itself should not care
>>> how the data is moving or what step in the process it is at.
>>
>>
>>
>>
>> --
>> Nick Allen <[email protected]>
>>
>
>
>
> --
> Nick Allen <[email protected]>
>
--
Nick Allen <[email protected]>