Re: [DISCUSS] Project reorganization

Debojyoti Dutta Mon, 11 Apr 2016 14:16:31 -0700

If you load up your IRC client, just type:
/join #apache-metron-dev

Sent from my iPhone


> On Apr 11, 2016, at 12:06 PM, James Sirota <[email protected]> wrote:
> 
> Great, thanks, Debo.  Where can I find instructions on how to get to it?
> 
> Thanks,
> James 
> 
> 
> 
> 
>> On 4/11/16, 9:41 AM, "Debo Dutta (dedutta)" <[email protected]> wrote:
>> 
>> Hi James 
>> 
>> Ok set it up and ack ….. 
>> 
>> Thx
>> 
>> 
>> 
>> 
>> 
>>> On 4/10/16, 6:31 PM, "James Sirota" <[email protected]> wrote:
>>> 
>>> Hi Debo,
>>> 
>>> I think it would be great if you set it up
>>> 
>>> Thanks,
>>> James 
>>> 
>>> 
>>> 
>>> 
>>>> On 4/10/16, 6:25 PM, "Debojyoti Dutta" <[email protected]> wrote:
>>>> 
>>>> I have set it up for another open source effort in the past and it was not 
>>>> very hard. Am happy to volunteer if needed. 
>>>> 
>>>> Thx 
>>>> Debo
>>>> 
>>>> Sent from my iPhone
>>>> 
>>>>> On Apr 10, 2016, at 5:53 PM, James Sirota <[email protected]> wrote:
>>>>> 
>>>>> I’d be open to an IRC channel.  Does anyone know if Apache allows this?  
>>>>> If yes, does anyone know how to set one up?
>>>>> 
>>>>> Thanks,
>>>>> James 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>>> On 4/10/16, 4:52 PM, "Debojyoti Dutta" <[email protected]> wrote:
>>>>>> 
>>>>>> Hi Nick 
>>>>>> 
>>>>>> I like your suggestions. For the enrichment layer, do you think it would 
>>>>>> also include any advanced analytics? Otherwise we might want to have an 
>>>>>> analytics layer. 
>>>>>> 
>>>>>> It would be good to have an arch which could be extended for new 
>>>>>> functionality. 
>>>>>> 
>>>>>> However, Ryan's suggestion of the UI, API and deployer also makes sense. 
>>>>>> 
>>>>>> Should we have an IRC channel to discuss this or maybe etherpad?
>>>>>> 
>>>>>> Debo
>>>>>> 
>>>>>> Sent from my iPhone
>>>>>> 
>>>>>>> On Apr 10, 2016, at 4:36 PM, Nick Allen <[email protected]> wrote:
>>>>>>> 
>>>>>>> It might help to think of our code base as four separate types of
>>>>>>> functionality.  This is primarily meant to give us a framework to think
>>>>>>> about the organization of Metron (and drive more discussion), rather
>>>>>>> than my proposal for a specific structure.
>>>>>>> 
>>>>>>> - Sensor - Anything that captures external, non-streaming data and
>>>>>>> presents it in a form ready for stream processing.
>>>>>>> - Input - Responsible for preparing streaming data for enrichment.  The
>>>>>>> existing "parsers" fit neatly into this space.
>>>>>>> - Enrichment - Responsible for enriching an incoming data feed with
>>>>>>> data such as geoip, asset enrichment, threat intel lookups, etc.
>>>>>>> - Output - Responsible for persisting data that has been processed by
>>>>>>> Metron, which means search indexers or data stores.
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> On Fri, Apr 8, 2016 at 4:46 PM, Ryan Merriman <[email protected]> wrote:
>>>>>>> 
>>>>>>>> All,
>>>>>>>> 
>>>>>>>> I would like to propose a review and refactor of the current project
>>>>>>>> organization within Metron.  Much of the way the legacy code was
>>>>>>>> organized does not make sense anymore and could be designed so that it
>>>>>>>> is easier to navigate and understand.  Our test coverage has increased
>>>>>>>> substantially so I believe we can do this with confidence.
>>>>>>>> 
>>>>>>>> First off, I think we should agree on a naming convention.  I see some
>>>>>>>> projects (YARN and Storm for example) that prepend the sub-project with
>>>>>>>> the name of the top-level project (storm-core for example).  Metron also
>>>>>>>> currently does this (Metron-Common).  I think that's fine, although in
>>>>>>>> the case of Metron, I feel like having "Metron" prepended is redundant.
>>>>>>>> Regardless of whether we decide to stick with that approach, I propose
>>>>>>>> that project names be uniform and lowercase.  For example, under these
>>>>>>>> assumptions "Metron-Common" would change to "common".
>>>>>>>> 
>>>>>>>> The first level of organization makes sense to me.  Only change I would
>>>>>>>> make would be to project names:
>>>>>>>> 
>>>>>>>> *   deployment
>>>>>>>> *   streaming
>>>>>>>> *   ui
>>>>>>>> 
>>>>>>>> Or if we want to keep metron in project names:
>>>>>>>> 
>>>>>>>> *   metron-deployment
>>>>>>>> *   metron-streaming
>>>>>>>> *   metron-ui
>>>>>>>> 
>>>>>>>> For now I don't see any changes necessary in deployment or ui
>>>>>>>> organization.  I see the streaming project structure primarily driven
>>>>>>>> by 2 things:  the Maven dependency tree and deployment targets.  For
>>>>>>>> example, solr and elasticsearch code should be separated (because their
>>>>>>>> dependency on lucene conflicts) but both will depend on common
>>>>>>>> enrichment code.  Also, now that parser, enrichment and pcap topologies
>>>>>>>> are separate, code for those topologies will be deployed as separate
>>>>>>>> jars.  No reason to include parser code in enrichment topologies and
>>>>>>>> vice-versa.  Any other considerations I'm missing?
>>>>>>>> 
>>>>>>>> With that being said, here is my initial proposal:
>>>>>>>> 
>>>>>>>> *   common - Any common code that all topologies depend on
>>>>>>>> (configuration classes, generic writers for example).  No dependencies
>>>>>>>> on other Metron projects.
>>>>>>>> *   test - Contains utilities for writing unit tests, sample configs
>>>>>>>> and sample data.  Will depend on common.
>>>>>>>> *   integration-test - Contains utilities and classes needed to run our
>>>>>>>> integration tests (in memory components for example).  Will depend on
>>>>>>>> common and test.
>>>>>>>> *   dataload - Contains all code related to data loading.  Will also
>>>>>>>> include any property files needed and integration tests.  Will depend
>>>>>>>> on common, test (test scope), and integration-test (test scope).
>>>>>>>> *   parser - All code specific to the parser topologies.  Would also
>>>>>>>> include scripts, property files, flux files and parser topology
>>>>>>>> integration tests.  This project will depend on common, test (test
>>>>>>>> scope), and integration-test (test scope).
>>>>>>>> *   enrichment - All code specific to the enrichment topologies (except
>>>>>>>> solr and elasticsearch).  Would also include scripts, property files,
>>>>>>>> flux files and enrichment topology integration tests.  This project
>>>>>>>> will depend on common, test (test scope), and integration-test (test
>>>>>>>> scope).
>>>>>>>> *   elasticsearch - All Elasticsearch related code.  Will depend on
>>>>>>>> enrichment.
>>>>>>>> *   solr - All Solr related code.  Will depend on enrichment.
>>>>>>>> *   pcap - All code specific to the topology dedicated to pcap.  Would
>>>>>>>> also include scripts, property files, flux files and pcap integration
>>>>>>>> tests.  This project will depend on common, test (test scope) and
>>>>>>>> integration-test (test scope).
>>>>>>>> *   api - This will serve as a generic replacement for
>>>>>>>> Metron-Pcap_Service.  Will contain all code to build a Metron web
>>>>>>>> service middle layer that can expose APIs through REST or other client
>>>>>>>> protocols.  Could possibly depend on all other projects or separated
>>>>>>>> further if version conflicts arise (separate api projects for solr and
>>>>>>>> elasticsearch for example).
>>>>>>>> Looking forward to hearing everyone's feedback and great ideas.
>>>>>>>> 
>>>>>>>> Ryan Merriman
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> -- 
>>>>>>> Nick Allen <[email protected]>
>>>>
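
For reference, the module dependencies listed in Ryan's proposal above can be written down as a small graph and checked mechanically. This is only a sketch in Python (3.9+ for graphlib), not part of the proposal: the DEPS table is transcribed from the bullet list, test-scope and compile-scope dependencies are treated alike, and "api" is left out because the proposal leaves its dependencies open. The helper names build_order and transitive_deps are made up for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Module -> modules it depends on, transcribed from the proposal.
# Assumption: test-scope dependencies are treated like any other edge.
DEPS = {
    "common": set(),
    "test": {"common"},
    "integration-test": {"common", "test"},
    "dataload": {"common", "test", "integration-test"},
    "parser": {"common", "test", "integration-test"},
    "enrichment": {"common", "test", "integration-test"},
    "elasticsearch": {"enrichment"},
    "solr": {"enrichment"},
    "pcap": {"common", "test", "integration-test"},
}


def build_order(deps):
    """Return one valid build order; raises graphlib.CycleError on a cycle."""
    return list(TopologicalSorter(deps).static_order())


def transitive_deps(module, deps):
    """Return every module reachable from `module` through dependency edges."""
    seen = set()
    stack = list(deps[module])
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(deps[dep])
    return seen


if __name__ == "__main__":
    print(build_order(DEPS))
    # solr and elasticsearch should never pull each other in, since their
    # lucene dependencies conflict.
    print("solr" in transitive_deps("elasticsearch", DEPS))
```

With this layout neither of solr and elasticsearch reaches the other, and parser and enrichment stay independent, which matches the goals stated in the proposal (separate jars per topology, no lucene conflict).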
