Re: [DISCUSS] Project reorganization

Debojyoti Dutta Sun, 10 Apr 2016 18:26:23 -0700

I have set one up for another open source effort in the past, and it was not very 
hard. I am happy to volunteer if needed.


Thx 
Debo

Sent from my iPhone

> On Apr 10, 2016, at 5:53 PM, James Sirota <[email protected]> wrote:
> 
> I’d be open to an IRC channel.  Does anyone know if Apache allows this?  If 
> yes, does anyone know how to set one up?
> 
> Thanks,
> James 
> 
> 
> 
> 
>> On 4/10/16, 4:52 PM, "Debojyoti Dutta" <[email protected]> wrote:
>> 
>> Hi Nick 
>> 
>> I like your suggestions. For the enrichment layer, do you think it would also 
>> include any advanced analytics? If not, we might want to have a separate 
>> analytics layer. 
>> 
>> It would be good to have an architecture that could be extended for new 
>> functionality. 
>> 
>> However, Ryan's suggestion of the ui, api, and deployer also makes sense. 
>> 
>> Should we have an IRC channel to discuss this, or maybe an Etherpad?
>> 
>> Debo
>> 
>> Sent from my iPhone
>> 
>>> On Apr 10, 2016, at 4:36 PM, Nick Allen <[email protected]> wrote:
>>> 
>>> It might help to think of our code base as four separate types of
>>> functionality.  This is primarily meant to give us a framework to think
>>> about the organization of Metron (and drive more discussion), rather than
>>> my proposal for a specific structure.
>>> 
>>>  - Sensor - Anything that captures external, non-streaming data and
>>>  presents it in a form ready for stream processing.
>>>  - Input - Responsible for preparing streaming data for enrichment.  The
>>>  existing "parsers" fit neatly into this space.
>>>  - Enrichment - Responsible for enriching an incoming data feed with, for
>>>  example, geoip, asset, and threat intel lookups.
>>>  - Output - Responsible for persisting data that has been processed by
>>>  Metron, which means search indexers or data stores.
>>> 
>>> 
>>> 
>>> 
>>> 
>>> On Fri, Apr 8, 2016 at 4:46 PM, Ryan Merriman <[email protected]>
>>> wrote:
>>> 
>>>> All,
>>>> 
>>>> I would like to propose a review and refactor of the current project
>>>> organization within Metron.  Much of the way the legacy code was organized
>>>> does not make sense anymore and could be designed so that it is easier to
>>>> navigate and understand.  Our test coverage has increased substantially so
>>>> I believe we can do this with confidence.
>>>> 
>>>> First off, I think we should agree on a naming convention.  I see some
>>>> projects (YARN and Storm for example) that prepend the sub-project with the
>>>> name of the top-level project (storm-core for example).  Metron also
>>>> currently does this (Metron-Common).  I think that's fine, although in the
>>>> case of Metron, I feel like having "Metron" prepended is redundant.
>>>> Regardless of whether we decide to stick with that approach, I propose that
>>>> project names be uniform and lowercase.  For example, under these
>>>> assumptions "Metron-Common" would change to "common".
>>>> 
>>>> The first level of organization makes sense to me.  Only change I would
>>>> make would be to project names:
>>>> 
>>>> *   deployment
>>>> *   streaming
>>>> *   ui
>>>> 
>>>> Or if we want to keep metron in project names:
>>>> 
>>>> *   metron-deployment
>>>> *   metron-streaming
>>>> *   metron-ui
>>>> 
>>>> For now I don't see any changes necessary in deployment or ui
>>>> organization.  I see the streaming project structure primarily driven by 2
>>>> things:  the Maven dependency tree and deployment targets.  For example,
>>>> solr and elasticsearch code should be separated (because their
>>>> dependencies on lucene conflict) but both will depend on common enrichment code.  Also,
>>>> now that parser, enrichment and pcap topologies are separate, code for
>>>> those topologies will be deployed as separate jars.  No reason to include
>>>> parser code in enrichment topologies and vice-versa.  Any other
>>>> considerations I'm missing?
>>>> 
>>>> With that being said, here is my initial proposal:
>>>> 
>>>> *   common -  Any common code that all topologies depend on
>>>> (configuration classes, generic writers for example).  No dependencies on
>>>> other Metron projects.
>>>> *   test - Contains utilities for writing unit tests, sample configs and
>>>> sample data.  Will depend on common.
>>>> *   integration-test - Contains utilities and classes needed to run our
>>>> integration tests (in memory components for example).  Will depend on
>>>> common and test.
>>>> *   dataload - Contains all code related to data loading.  Will also
>>>> include any property files needed and integration tests.  Will depend on
>>>> common, test (test scope), and integration-test (test scope).
>>>> *   parser - All code specific to the parser topologies.  Would also
>>>> include scripts, property files, flux files and parser topology integration
>>>> tests.  This project will depend on common, test (test scope), and
>>>> integration-test (test scope).
>>>> *   enrichment - All code specific to the enrichment topologies (except
>>>> solr and elasticsearch).  Would also include scripts, property files, flux
>>>> files and enrichment topology integration tests.  This project will depend
>>>> on common, test (test scope), and integration-test (test scope).
>>>> *   elasticsearch - All Elasticsearch related code.  Will depend on
>>>> enrichment.
>>>> *   solr - All Solr related code.  Will depend on enrichment.
>>>> *   pcap - All code specific to the topology dedicated to pcap.  Would
>>>> also include scripts, property files, flux files and pcap integration
>>>> tests.  This project will depend on common, test (test scope) and
>>>> integration-test (test scope).
>>>> *   api - This will serve as a generic replacement for
>>>> Metron-Pcap_Service.  Will contain all code to build a Metron web service
>>>> middle layer that can expose APIs through REST or other client protocols.
>>>> Could possibly depend on all other projects or be separated further if version
>>>> conflicts arise (separate api projects for solr and elasticsearch, for
>>>> example).
>>>> 
>>>> Looking forward to hearing everyone's feedback and great ideas.
>>>> 
>>>> Ryan Merriman
>>> 
>>> 
>>> 
>>> -- 
>>> Nick Allen <[email protected]>
>>
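
As an illustration of the module layout in Ryan's proposal quoted above, the
stated dependencies can be written down as a small graph and checked. This is
only a sketch under a few assumptions: the Maven scopes (compile vs. test) are
ignored, "api" is left out because the proposal leaves its dependencies open,
and the helper names (DEPENDS_ON, build_order, transitive_deps) are made up
for this example and are not part of Metron.

```python
from collections import deque

# Direct dependencies as listed in the proposal. Scopes are ignored and
# "api" is omitted because its dependencies are still undecided.
DEPENDS_ON = {
    "common": [],
    "test": ["common"],
    "integration-test": ["common", "test"],
    "dataload": ["common", "test", "integration-test"],
    "parser": ["common", "test", "integration-test"],
    "enrichment": ["common", "test", "integration-test"],
    "elasticsearch": ["enrichment"],
    "solr": ["enrichment"],
    "pcap": ["common", "test", "integration-test"],
}


def build_order(graph):
    """Return modules so that each one appears after its dependencies
    (Kahn's algorithm). Raises ValueError if the graph has a cycle."""
    remaining = {module: set(deps) for module, deps in graph.items()}
    ready = deque(sorted(m for m, deps in remaining.items() if not deps))
    order = []
    while ready:
        module = ready.popleft()
        order.append(module)
        for other in sorted(remaining):
            deps = remaining[other]
            if module in deps:
                deps.remove(module)
                if not deps:
                    ready.append(other)
    if len(order) != len(graph):
        raise ValueError("dependency cycle detected")
    return order


def transitive_deps(graph, module):
    """Return every module that `module` depends on, directly or indirectly."""
    seen = set()
    stack = list(graph[module])
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(graph[dep])
    return seen


if __name__ == "__main__":
    print(" -> ".join(build_order(DEPENDS_ON)))
    # The two search back ends should never pull each other in.
    print("solr" in transitive_deps(DEPENDS_ON, "elasticsearch"))
    # Parser code should not end up in the enrichment topology jar.
    print("parser" in transitive_deps(DEPENDS_ON, "enrichment"))
```

A check like this makes the two constraints from the proposal explicit: solr
and elasticsearch stay independent of each other, and parser code stays out
of the enrichment topology.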
