Re: [DISCUSS] Project reorganization

James Sirota Sun, 10 Apr 2016 17:54:03 -0700

I’d be open to an IRC channel.  Does anyone know if Apache allows this?  If 
yes, does anyone know how to set one up?


Thanks,
James 




On 4/10/16, 4:52 PM, "Debojyoti Dutta" <[email protected]> wrote:

>Hi Nick 
>
>I like your suggestions. For the enrichment layer, do you think it would also 
>include any advanced analytics? Otherwise we might want to have an analytics layer. 
>
>It would be good to have an arch which could be extended for new 
>functionality. 
>
>However, Ryan's suggestion of the ui, API and deployer also makes sense. 
>
>Should we have an IRC channel to discuss this, or maybe an Etherpad?
>
>Debo
>
>Sent from my iPhone
>
>> On Apr 10, 2016, at 4:36 PM, Nick Allen <[email protected]> wrote:
>> 
>> It might help to think of our code base as four separate types of
>> functionality.  This is primarily meant to give us a framework to think
>> about the organization of Metron (and drive more discussion), rather than
>> my proposal for a specific structure.
>> 
>>   - Sensor - Anything that captures external, non-streaming data and
>>   presents it in a form ready for stream processing.
>>   - Input - Responsible for preparing streaming data for enrichment.  The
>>   existing "parsers" fit neatly into this space.
>>   - Enrichment - Responsible for enriching an incoming data feed, for
>>   example with geoip, asset enrichment, threat intel lookups, etc.
>>   - Output - Responsible for persisting data that has been processed by
>>   Metron, which means search indexers or data stores.
>> 
>> 
>> 
>> 
>> 
>> On Fri, Apr 8, 2016 at 4:46 PM, Ryan Merriman <[email protected]>
>> wrote:
>> 
>>> All,
>>> 
>>> I would like to propose a review and refactor of the current project
>>> organization within Metron.  Much of the way the legacy code was organized
>>> does not make sense anymore and could be designed so that it is easier to
>>> navigate and understand.  Our test coverage has increased substantially so
>>> I believe we can do this with confidence.
>>> 
>>> First off, I think we should agree on a naming convention.  I see some
>>> projects (YARN and Storm for example) that prepend the sub-project with the
>>> name of the top-level project (storm-core for example).  Metron also
>>> currently does this (Metron-Common).  I think that's fine, although in the
>>> case of Metron, I feel like having "Metron" prepended is redundant.
>>> Regardless of whether we decide to stick with that approach, I propose that
>>> project names be uniform and lowercase.  For example, under these
>>> assumptions "Metron-Common" would change to "common".
>>> 
>>> The first level of organization makes sense to me.  Only change I would
>>> make would be to project names:
>>> 
>>>  *   deployment
>>>  *   streaming
>>>  *   ui
>>> 
>>> Or if we want to keep metron in project names:
>>> 
>>>  *   metron-deployment
>>>  *   metron-streaming
>>>  *   metron-ui
>>> 
>>> For now I don't see any changes necessary in deployment or ui
>>> organization.  I see the streaming project structure primarily driven by 2
>>> things:  the Maven dependency tree and deployment targets.  For example,
>>> solr and elasticsearch code should be separated (because their dependency
>>> on lucene conflicts) but both will depend on common enrichment code.  Also,
>>> now that parser, enrichment and pcap topologies are separate, code for
>>> those topologies will be deployed as separate jars.  No reason to include
>>> parser code in enrichment topologies and vice-versa.  Any other
>>> considerations I'm missing?
>>> 
>>> With that being said, here is my initial proposal:
>>> 
>>>  *   common -  Any common code that all topologies depend on
>>> (configuration classes, generic writers for example).  No dependencies on
>>> other Metron projects.
>>>  *   test - Contains utilities for writing unit tests, sample configs and
>>> sample data.  Will depend on common.
>>>  *   integration-test - Contains utilities and classes needed to run our
>>> integration tests (in memory components for example).  Will depend on
>>> common and test.
>>>  *   dataload - Contains all code related to data loading.  Will also
>>> include any property files needed and integration tests.  Will depend on
>>> common, test (test scope), and integration-test (test scope).
>>>  *   parser - All code specific to the parser topologies.  Would also
>>> include scripts, property files, flux files and parser topology integration
>>> tests.  This project will depend on common, test (test scope), and
>>> integration-test (test scope).
>>>  *   enrichment - All code specific to the enrichment topologies (except
>>> solr and elasticsearch).  Would also include scripts, property files, flux
>>> files and enrichment topology integration tests.  This project will depend
>>> on common, test (test scope), and integration-test (test scope).
>>>  *   elasticsearch - All Elasticsearch related code.  Will depend on
>>> enrichment.
>>>  *   solr - All Solr related code.  Will depend on enrichment.
>>>  *   pcap - All code specific to the topology dedicated to pcap.  Would
>>> also include scripts, property files, flux files and pcap integration
>>> tests.  This project will depend on common, test (test scope) and
>>> integration-test (test scope).
>>>  *   api - This will serve as a generic replacement for
>>> Metron-Pcap_Service.  Will contain all code to build a Metron web service
>>> middle layer that can expose APIs through REST or other client protocols.
>>> Could possibly depend on all other projects or separated further if version
>>> conflicts arise (separate api projects for solr and elasticsearch for
>>> example).
>>> 
>>> Looking forward to hearing everyone's feedback and great ideas.
>>> 
>>> Ryan Merriman
>> 
>> 
>> 
>> -- 
>> Nick Allen <[email protected]>
>
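
As a rough illustration of the dependency structure in Ryan's proposal quoted
above, the module list can be written down as a graph and checked for cycles.
This is only a sketch, not part of the proposal: the dictionary layout and the
helper names build_order and transitive_deps are hypothetical, test-scope
dependencies are treated the same as compile-scope ones, and api is left out
because its dependencies were left open.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each proposed module mapped to the modules it depends on, as listed in the
# proposal. Test-scope and compile-scope dependencies are not distinguished.
DEPENDENCIES = {
    "common": set(),
    "test": {"common"},
    "integration-test": {"common", "test"},
    "dataload": {"common", "test", "integration-test"},
    "parser": {"common", "test", "integration-test"},
    "enrichment": {"common", "test", "integration-test"},
    "elasticsearch": {"enrichment"},
    "solr": {"enrichment"},
    "pcap": {"common", "test", "integration-test"},
}


def build_order(deps):
    """Return one valid build order; raises graphlib.CycleError on a cycle."""
    return list(TopologicalSorter(deps).static_order())


def transitive_deps(deps, module):
    """Return every module reachable from `module` by following dependencies."""
    seen = set()
    stack = list(deps[module])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(deps[current])
    return seen


if __name__ == "__main__":
    print(build_order(DEPENDENCIES))
    # solr and elasticsearch should never end up on each other's classpath.
    print("solr" in transitive_deps(DEPENDENCIES, "elasticsearch"))
```

A check like this makes it easy to confirm that solr and elasticsearch stay
independent of each other while both still pull in the shared enrichment code.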
