Re: [DISCUSS] Project reorganization

James Sirota Mon, 11 Apr 2016 12:07:33 -0700

Great, thanks, Debo.  Where can I find instructions on how to get to it?

Thanks,
James





On 4/11/16, 9:41 AM, "Debo Dutta (dedutta)" <[email protected]> wrote:

>Hi James 
>
>OK, it is set up; please ack.
>
>Thx
>
>
>
>
>
>On 4/10/16, 6:31 PM, "James Sirota" <[email protected]> wrote:
>
>>Hi Debo,
>>
>>I think it would be great if you set it up
>>
>>Thanks,
>>James 
>>
>>
>>
>>
>>On 4/10/16, 6:25 PM, "Debojyoti Dutta" <[email protected]> wrote:
>>
>>>I have set it up for another open source effort in the past and it was not 
>>>very hard. Am happy to volunteer if needed. 
>>>
>>>Thx 
>>>Debo
>>>
>>>Sent from my iPhone
>>>
>>>> On Apr 10, 2016, at 5:53 PM, James Sirota <[email protected]> wrote:
>>>> 
>>>> I’d be open to an IRC channel.  Does anyone know if Apache allows this?  
>>>> If yes, does anyone know how to set one up?
>>>> 
>>>> Thanks,
>>>> James 
>>>> 
>>>> 
>>>> 
>>>> 
>>>>> On 4/10/16, 4:52 PM, "Debojyoti Dutta" <[email protected]> wrote:
>>>>> 
>>>>> Hi Nick 
>>>>> 
>>>>> I like your suggestions. For the enrichment layer, do you think it would 
>>>>> also include any advanced analytics? Otherwise we might want to have a 
>>>>> separate analytics layer. 
>>>>> 
>>>>> It would be good to have an architecture which could be extended for new 
>>>>> functionality. 
>>>>> 
>>>>> However, Ryan's suggestion of the UI, API and deployer also makes sense. 
>>>>> 
>>>>> Should we have an IRC channel to discuss this or maybe etherpad?
>>>>> 
>>>>> Debo
>>>>> 
>>>>> Sent from my iPhone
>>>>> 
>>>>>> On Apr 10, 2016, at 4:36 PM, Nick Allen <[email protected]> wrote:
>>>>>> 
>>>>>> It might help to think of our code base as four separate types of
>>>>>> functionality.  This is primarily meant to give us a framework to think
>>>>>> about the organization of Metron (and drive more discussion), rather than
>>>>>> my proposal for a specific structure.
>>>>>> 
>>>>>>  - Sensor - Anything that captures external, non-streaming data and
>>>>>>  presents it in a form ready for stream processing.
>>>>>>  - Input - Responsible for preparing streaming data for enrichment.  The
>>>>>>  existing "parsers" fit neatly into this space.
>>>>>>  - Enrichment - Responsible for enriching an incoming data feed with
>>>>>>  geoip, asset enrichment, threat intel lookups, etc.
>>>>>>  - Output - Responsible for persisting data that has been processed by
>>>>>>  Metron, which means search indexers or data stores.
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Fri, Apr 8, 2016 at 4:46 PM, Ryan Merriman <[email protected]>
>>>>>> wrote:
>>>>>> 
>>>>>>> All,
>>>>>>> 
>>>>>>> I would like to propose a review and refactor of the current project
>>>>>>> organization within Metron.  Much of the way the legacy code was
>>>>>>> organized does not make sense anymore and could be designed so that it
>>>>>>> is easier to navigate and understand.  Our test coverage has increased
>>>>>>> substantially so I believe we can do this with confidence.
>>>>>>> 
>>>>>>> First off, I think we should agree on a naming convention.  I see some
>>>>>>> projects (YARN and Storm for example) that prepend the sub-project with
>>>>>>> the name of the top-level project (storm-core for example).  Metron
>>>>>>> also currently does this (Metron-Common).  I think that's fine,
>>>>>>> although in the case of Metron, I feel like having "Metron" prepended
>>>>>>> is redundant.  Regardless of whether we decide to stick with that
>>>>>>> approach, I propose that project names be uniform and lowercase.  For
>>>>>>> example, under these assumptions "Metron-Common" would change to
>>>>>>> "common".
>>>>>>> 
>>>>>>> The first level of organization makes sense to me.  Only change I would
>>>>>>> make would be to project names:
>>>>>>> 
>>>>>>> *   deployment
>>>>>>> *   streaming
>>>>>>> *   ui
>>>>>>> 
>>>>>>> Or if we want to keep metron in project names:
>>>>>>> 
>>>>>>> *   metron-deployment
>>>>>>> *   metron-streaming
>>>>>>> *   metron-ui
>>>>>>> 
>>>>>>> For now I don't see any changes necessary in deployment or ui
>>>>>>> organization.  I see the streaming project structure primarily driven
>>>>>>> by 2 things:  the Maven dependency tree and deployment targets.  For
>>>>>>> example, solr and elasticsearch code should be separated (because
>>>>>>> their dependency on lucene conflicts) but both will depend on common
>>>>>>> enrichment code.  Also, now that parser, enrichment and pcap
>>>>>>> topologies are separate, code for those topologies will be deployed
>>>>>>> as separate jars.  No reason to include parser code in enrichment
>>>>>>> topologies and vice-versa.  Any other considerations I'm missing?
>>>>>>> 
>>>>>>> With that being said, here is my initial proposal:
>>>>>>> 
>>>>>>> *   common - Any common code that all topologies depend on
>>>>>>> (configuration classes, generic writers for example).  No dependencies
>>>>>>> on other Metron projects.
>>>>>>> *   test - Contains utilities for writing unit tests, sample configs and
>>>>>>> sample data.  Will depend on common.
>>>>>>> *   integration-test - Contains utilities and classes needed to run our
>>>>>>> integration tests (in memory components for example).  Will depend on
>>>>>>> common and test.
>>>>>>> *   dataload - Contains all code related to data loading.  Will also
>>>>>>> include any property files needed and integration tests.  Will depend on
>>>>>>> common, test (test scope), and integration-test (test scope).
>>>>>>> *   parser - All code specific to the parser topologies.  Would also
>>>>>>> include scripts, property files, flux files and parser topology
>>>>>>> integration tests.  This project will depend on common, test (test
>>>>>>> scope), and integration-test (test scope).
>>>>>>> *   enrichment - All code specific to the enrichment topologies (except
>>>>>>> solr and elasticsearch).  Would also include scripts, property files,
>>>>>>> flux files and enrichment topology integration tests.  This project
>>>>>>> will depend on common, test (test scope), and integration-test (test
>>>>>>> scope).
>>>>>>> *   elasticsearch - All Elasticsearch related code.  Will depend on
>>>>>>> enrichment.
>>>>>>> *   solr - All Solr related code.  Will depend on enrichment.
>>>>>>> *   pcap - All code specific to the topology dedicated to pcap.  Would
>>>>>>> also include scripts, property files, flux files and pcap integration
>>>>>>> tests.  This project will depend on common, test (test scope) and
>>>>>>> integration-test (test scope).
>>>>>>> *   api - This will serve as a generic replacement for
>>>>>>> Metron-Pcap_Service.  Will contain all code to build a Metron web
>>>>>>> service middle layer that can expose APIs through REST or other client
>>>>>>> protocols.  Could possibly depend on all other projects or be separated
>>>>>>> further if version conflicts arise (separate api projects for solr and
>>>>>>> elasticsearch for example).
>>>>>>> 
>>>>>>> Looking forward to hearing everyone's feedback and great ideas.
>>>>>>> 
>>>>>>> Ryan Merriman
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> -- 
>>>>>> Nick Allen <[email protected]>
>>>>> 
>>>
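
For anyone following the thread, the module layout in Ryan's proposal can be sketched as a dependency graph. This is only an illustration, not part of any message above and not Metron code: the module names and their direct dependencies are copied from the proposal (test-scope dependencies included), while the `DEPENDS_ON` table and the helpers `build_order` and `transitive_deps` are hypothetical names made up for this sketch. The api module is left out because the proposal leaves its dependencies open. The sketch checks two of the stated goals: the graph has no cycles, and parser and enrichment code never pull each other in.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Direct dependencies per module, as listed in the proposal.
# Test-scope dependencies are included; "api" is omitted because the
# proposal does not pin down what it depends on.
DEPENDS_ON = {
    "common": set(),
    "test": {"common"},
    "integration-test": {"common", "test"},
    "dataload": {"common", "test", "integration-test"},
    "parser": {"common", "test", "integration-test"},
    "enrichment": {"common", "test", "integration-test"},
    "elasticsearch": {"enrichment"},
    "solr": {"enrichment"},
    "pcap": {"common", "test", "integration-test"},
}


def build_order(graph):
    """Return one valid build order; raises graphlib.CycleError on a cycle."""
    return list(TopologicalSorter(graph).static_order())


def transitive_deps(module, graph):
    """Return every module reachable from `module` via dependency edges."""
    seen = set()
    stack = list(graph[module])
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(graph[dep])
    return seen


order = build_order(DEPENDS_ON)
print("one possible build order:", order)
print("parser pulls in:", sorted(transitive_deps("parser", DEPENDS_ON)))
```

Under these assumptions, solr and elasticsearch share enrichment but never reach each other, which is the separation the lucene conflict calls for.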
