I had a thought after going through this exercise.  Why treat threat intel
any differently from Netflow, Snort or YAF data?  Every input should have the
opportunity to be enriched using the generic tools that Metron provides.
Is there a reason threat intel needs to be handled separately from other
data sources?
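
To make the question concrete, here is a rough Python sketch of what I mean.
None of these names are Metron APIs; the message fields, the lookup table and
the function names are all made up for illustration.  The only point is that
the same enrichment step runs on a sensor message and on a threat intel
record without caring which one it got.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]
Enricher = Callable[[Message], Message]

# Hypothetical lookup table standing in for a GeoIP database.
GEO = {"10.0.0.1": "US", "203.0.113.7": "AU"}


def geo_enricher(msg: Message) -> Message:
    """Add a country field when the message carries a known IP."""
    ip = msg.get("ip")
    if ip in GEO:
        return {**msg, "geo.country": GEO[ip]}
    return msg


def enrich(msg: Message, enrichers: List[Enricher]) -> Message:
    """Run one message through every enricher, regardless of its source."""
    for enricher in enrichers:
        msg = enricher(msg)
    return msg


# Two made-up messages: one from a sensor, one from a threat intel feed.
netflow_msg = {"source": "netflow", "ip": "10.0.0.1"}
threat_msg = {
    "source": "threatintel",
    "ip": "203.0.113.7",
    "indicator": "malicious_ip",
}

enriched = [enrich(m, [geo_enricher]) for m in (netflow_msg, threat_msg)]
```

If threat intel arrived as just another input, it would pick up geoip and
asset context the same way everything else does.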


On Sun, Apr 10, 2016 at 7:36 PM, Nick Allen <[email protected]> wrote:

> It might help to think of our code base as four separate types of
> functionality.  This is primarily meant to give us a framework to think
> about the organization of Metron (and drive more discussion), rather than
> my proposal for a specific structure.
>
>    - Sensor - Anything that captures external, non-streaming data and
>    presents it in a form ready for stream processing.
>    - Input - Responsible for preparing streaming data for enrichment.
>    The existing "parsers" fit neatly into this space.
>    - Enrichment - Responsible for enriching an incoming data feed with,
>    for example, geoip, asset, or threat intel lookups.
>    - Output - Responsible for persisting data that has been processed by
>    Metron, which in practice means search indexers or data stores.
>
>
>
>
>
> On Fri, Apr 8, 2016 at 4:46 PM, Ryan Merriman <[email protected]>
> wrote:
>
>> All,
>>
>> I would like to propose a review and refactor of the current project
>> organization within Metron.  Much of the way the legacy code was organized
>> does not make sense anymore and could be designed so that it is easier to
>> navigate and understand.  Our test coverage has increased substantially so
>> I believe we can do this with confidence.
>>
>> First off, I think we should agree on a naming convention.  I see some
>> projects (YARN and Storm for example) that prepend the sub-project with the
>> name of the top-level project (storm-core for example).  Metron also
>> currently does this (Metron-Common).  I think that's fine, although in the
>> case of Metron, I feel like having "Metron" prepended is redundant.
>> Regardless of whether we decide to stick with that approach, I propose that
>> project names be uniform and lowercase.  For example, under these
>> assumptions "Metron-Common" would change to "common".
>>
>> The first level of organization makes sense to me.  Only change I would
>> make would be to project names:
>>
>>   *   deployment
>>   *   streaming
>>   *   ui
>>
>> Or if we want to keep metron in project names:
>>
>>   *   metron-deployment
>>   *   metron-streaming
>>   *   metron-ui
>>
>> For now I don't see any changes necessary in deployment or ui
>> organization.  I see the streaming project structure primarily driven by 2
>> things:  the Maven dependency tree and deployment targets.  For example,
>> solr and elasticsearch code should be separated (because their dependency
>> on lucene conflicts) but both will depend on common enrichment code.  Also,
>> now that parser, enrichment and pcap topologies are separate, code for
>> those topologies will be deployed as separate jars.  No reason to include
>> parser code in enrichment topologies and vice-versa.  Any other
>> considerations I'm missing?
>>
>> With that being said, here is my initial proposal:
>>
>>   *   common -  Any common code that all topologies depend on
>> (configuration classes, generic writers for example).  No dependencies on
>> other Metron projects.
>>   *   test - Contains utilities for writing unit tests, sample configs
>> and sample data.  Will depend on common.
>>   *   integration-test - Contains utilities and classes needed to run our
>> integration tests (in memory components for example).  Will depend on
>> common and test.
>>   *   dataload - Contains all code related to data loading.  Will also
>> include any property files needed and integration tests.  Will depend on
>> common, test (test scope), and integration-test (test scope).
>>   *   parser - All code specific to the parser topologies.  Would also
>> include scripts, property files, flux files and parser topology integration
>> tests.  This project will depend on common, test (test scope), and
>> integration-test (test scope).
>>   *   enrichment - All code specific to the enrichment topologies (except
>> solr and elasticsearch).  Would also include scripts, property files, flux
>> files and enrichment topology integration tests.  This project will depend
>> on common, test (test scope), and integration-test (test scope).
>>   *   elasticsearch - All Elasticsearch related code.  Will depend on
>> enrichment.
>>   *   solr - All Solr related code.  Will depend on enrichment.
>>   *   pcap - All code specific to the topology dedicated to pcap.  Would
>> also include scripts, property files, flux files and pcap integration
>> tests.  This project will depend on common, test (test scope) and
>> integration-test (test scope).
>>   *   api - This will serve as a generic replacement for
>> Metron-Pcap_Service.  Will contain all code to build a Metron web service
>> middle layer that can expose APIs through REST or other client protocols.
>> Could possibly depend on all other projects, or be separated further if
>> version conflicts arise (separate api projects for solr and elasticsearch
>> for example).
>>
>> Looking forward to hearing everyone's feedback and great ideas.
>>
>> Ryan Merriman
>>
>
>
>
> --
> Nick Allen <[email protected]>
>



-- 
Nick Allen <[email protected]>
