James brings up a good point.  I propose adding another project under
metron-platform called metron-configuration.  This would be a fairly
lightweight project that would contain anything related to configuration
(property files, json files, flux files, etc).
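
To make the dependency discussion quoted below concrete, here is a minimal
Python sketch that models the proposed layout as a graph and checks it for
circular dependencies.  It is an illustration only: the edges are my reading
of the proposal in this thread, not the contents of any actual pom, and the
assumption that metron-configuration depends on no other Metron project is
mine.  The names DEPENDS_ON and build_order are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical module graph: module -> modules it depends on.
# Edges are assumptions taken from the proposal in this thread.
TEST_DEPS = {"metron-common", "metron-test-utilities", "metron-integration-test"}
DEPENDS_ON = {
    "metron-configuration": set(),  # assumed: no Metron dependencies
    "metron-common": set(),
    "metron-test-utilities": {"metron-common"},
    "metron-integration-test": {"metron-common", "metron-test-utilities"},
    "metron-data-management": set(TEST_DEPS),
    "metron-pcap": set(TEST_DEPS),
    "metron-parsers": set(TEST_DEPS),
    "metron-enrichment": set(TEST_DEPS),
    "metron-solr": {"metron-enrichment"},
    "metron-elasticsearch": {"metron-enrichment"},
}


def build_order(graph):
    """Return modules with dependencies first; raise ValueError on a cycle."""
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError as err:
        # err.args[1] holds the list of modules that form the cycle.
        raise ValueError(f"circular dependency: {err.args[1]}") from err


if __name__ == "__main__":
    print(build_order(DEPENDS_ON))
```

If integration-test were made to depend on parsers while parsers still
depends on integration-test, build_order would raise ValueError, which is
the circular dependency problem described in the quoted message.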

On 4/13/16, 8:56 AM, "James Sirota" <[email protected]> wrote:

>+1 from me.
>
>I would also like to address the configs and make sure the configs are in
>the same place.  Do you have ideas on where we would put those?
>
>Thanks,
>James 
>
>
>
>On 4/13/16, 6:50 AM, "Ryan Merriman" <[email protected]> wrote:
>
>>Thank you for all the feedback everyone.  I will attempt to summarize all
>>the input we've received and update my initial proposal.  We can discuss
>>further if anyone is still unclear and I will volunteer to capture all
>>the details in a document of some kind once we all come to a consensus.
>>
>>Looks like everyone is in agreement for the top level projects.  Nick is
>>working on a task that will require an additional top level project so I am
>>going to add that in as well:
>>
>>metron-deployment
>>metron-platform
>>metron-ui
>>metron-sensors
>>
>>All of these except metron-platform are well understood and don't warrant
>>any more discussion.  For metron-platform there seem to be 2 areas that
>>are not as clear:
>>
>>- whether we need a common project
>>- how do we organize test related code
>>
>>I agree with David and others that a common project will likely get
>>misused and could become unnecessarily bloated.  But I suspect there will
>>be cases where we have common code being used across multiple projects
>>(this is already happening).  In this case we will either need this common
>>project or we will have to keep common code in one of the other projects
>>and have all other projects extend that.  For the latter, an example
>>would be keeping common code in enrichment and having parsers declare
>>enrichment as a dependency.  There are a couple downsides I see with this
>>approach:
>>
>>- parser topology jars now bring along all the enrichment dependencies
>>- since more code from various projects is being packaged together,
>>version conflicts are more likely and poms become more complicated due to
>>all the necessary exclusions
>>
>>My thinking is that any jar file being deployed should only contain what
>>it needs.  Curious what others think here.  My vote would be to maintain
>>a common project (or whatever we want to call it) and be diligent about
>>not letting project-specific code slip in there.
>>
>>I believe Nick was the first person to ask the question about projects
>>related to test code and why we would need separate test and integration
>>test projects.  The reason for this is that our integration-test classes
>>currently depend on other projects (not surprising since they are
>>integration tests).  If there are utilities we want to make available to
>>all projects (mock classes, utilities for reading sample data, etc) then
>>they can't live in integration-test because that will introduce circular
>>dependencies.  If it is possible to refactor our current Metron-Testing
>>project so that it doesn't depend on any other projects, then we can keep
>>utilities there.  Otherwise we need a separate project for testing
>>utilities.  I suspect removing other project dependencies from
>>Metron-Testing will prove more difficult than it's worth so my vote would
>>be to have 2 test related projects.
>>
>>So here is where our metron-platform organization stands:
>>
>>metron-common *
>>metron-integration-test *
>>metron-test-utilities *
>>metron-data-management
>>metron-pcap
>>metron-parsers
>>metron-enrichment
>>      metron-solr
>>      metron-elasticsearch
>>metron-api
>>
>>* may or may not change depending on the outcome of this discussion
>>
>>Thoughts?
>>
>>Ryan Merriman
>>
>>
>>On 4/11/16, 4:15 PM, "Debojyoti Dutta" <[email protected]> wrote:
>>
>>>If you load up your Irc client just type
>>>/join #apache-metron-dev
>>>
>>>Sent from my iPhone
>>>
>>>> On Apr 11, 2016, at 12:06 PM, James Sirota <[email protected]>
>>>>wrote:
>>>> 
>>>> Great, thanks, Debo.  Where can I find instructions on how to get to
>>>>it?
>>>> 
>>>> Thanks,
>>>> James 
>>>> 
>>>> 
>>>> 
>>>> 
>>>>> On 4/11/16, 9:41 AM, "Debo Dutta (dedutta)" <[email protected]>
>>>>>wrote:
>>>>> 
>>>>> Hi James 
>>>>> 
>>>>> Ok set it up and ack ...
>>>>> 
>>>>> Thx
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>>> On 4/10/16, 6:31 PM, "James Sirota" <[email protected]> wrote:
>>>>>> 
>>>>>> Hi Debo,
>>>>>> 
>>>>>> I think it would be great if you set it up
>>>>>> 
>>>>>> Thanks,
>>>>>> James 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>> On 4/10/16, 6:25 PM, "Debojyoti Dutta" <[email protected]> wrote:
>>>>>>> 
>>>>>>> I have set it up for another open source effort in the past and it
>>>>>>>was not very hard. Am happy to volunteer if needed.
>>>>>>> 
>>>>>>> Thx 
>>>>>>> Debo
>>>>>>> 
>>>>>>> Sent from my iPhone
>>>>>>> 
>>>>>>>> On Apr 10, 2016, at 5:53 PM, James Sirota
>>>>>>>><[email protected]>
>>>>>>>>wrote:
>>>>>>>> 
>>>>>>>> I'd be open to an IRC channel.  Does anyone know if Apache allows
>>>>>>>> this?  If yes, does anyone know how to set one up?
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> James 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> On 4/10/16, 4:52 PM, "Debojyoti Dutta" <[email protected]> wrote:
>>>>>>>>> 
>>>>>>>>> Hi Nick 
>>>>>>>>> 
>>>>>>>>> I like your suggestions. For the enrichment layer, do you think it
>>>>>>>>> would also include any advanced analytics? Otherwise we might want
>>>>>>>>> to have an analytics layer.
>>>>>>>>> 
>>>>>>>>> It would be good to have an arch which could be extended for new
>>>>>>>>>functionality.
>>>>>>>>> 
>>>>>>>>> However Ryan's suggestion of the ui API and deployer also makes
>>>>>>>>>sense. 
>>>>>>>>> 
>>>>>>>>> Should we have an IRC channel to discuss this or maybe etherpad?
>>>>>>>>> 
>>>>>>>>> Debo
>>>>>>>>> 
>>>>>>>>> Sent from my iPhone
>>>>>>>>> 
>>>>>>>>>> On Apr 10, 2016, at 4:36 PM, Nick Allen <[email protected]>
>>>>>>>>>>wrote:
>>>>>>>>>> 
>>>>>>>>>> It might help to think of our code base as four separate types of
>>>>>>>>>> functionality.  This is primarily meant to give us a framework to
>>>>>>>>>> think about the organization of Metron (and drive more discussion),
>>>>>>>>>> rather than my proposal for a specific structure.
>>>>>>>>>> 
>>>>>>>>>> - Sensor - Anything that captures external, non-streaming data and
>>>>>>>>>> presents it in a form ready for stream processing.
>>>>>>>>>> - Input - Responsible for preparing streaming data for enrichment.
>>>>>>>>>> The existing "parsers" fit neatly into this space.
>>>>>>>>>> - Enrichment - Responsible for enriching an incoming data feed like
>>>>>>>>>> geoip, asset enrichment, threat intel lookups, etc.
>>>>>>>>>> - Output - Responsible for persisting data that has been processed
>>>>>>>>>> by Metron which obviously means search indexers or data stores.
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On Fri, Apr 8, 2016 at 4:46 PM, Ryan Merriman
>>>>>>>>>><[email protected]>
>>>>>>>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>>> All,
>>>>>>>>>>> 
>>>>>>>>>>> I would like to propose a review and refactor of the current project
>>>>>>>>>>> organization within Metron.  Much of the way the legacy code was
>>>>>>>>>>> organized does not make sense anymore and could be designed so that
>>>>>>>>>>> it is easier to navigate and understand.  Our test coverage has
>>>>>>>>>>> increased substantially so I believe we can do this with confidence.
>>>>>>>>>>> 
>>>>>>>>>>> First off, I think we should agree on a naming convention.  I see
>>>>>>>>>>> some projects (YARN and Storm for example) that prepend the
>>>>>>>>>>> sub-project with the name of the top-level project (storm-core for
>>>>>>>>>>> example).  Metron also currently does this (Metron-Common).  I think
>>>>>>>>>>> that's fine, although in the case of Metron, I feel like having
>>>>>>>>>>> "Metron" prepended is redundant.  Regardless of whether we decide to
>>>>>>>>>>> stick with that approach, I propose that project names be uniform
>>>>>>>>>>> and lowercase.  For example, under these assumptions "Metron-Common"
>>>>>>>>>>> would change to "common".
>>>>>>>>>>> 
>>>>>>>>>>> The first level of organization makes sense to me.  Only change I
>>>>>>>>>>> would make would be to project names:
>>>>>>>>>>> 
>>>>>>>>>>> *   deployment
>>>>>>>>>>> *   streaming
>>>>>>>>>>> *   ui
>>>>>>>>>>> 
>>>>>>>>>>> Or if we want to keep metron in project names:
>>>>>>>>>>> 
>>>>>>>>>>> *   metron-deployment
>>>>>>>>>>> *   metron-streaming
>>>>>>>>>>> *   metron-ui
>>>>>>>>>>> 
>>>>>>>>>>> For now I don't see any changes necessary in deployment or ui
>>>>>>>>>>> organization.  I see the streaming project structure primarily
>>>>>>>>>>> driven by 2 things:  the Maven dependency tree and deployment
>>>>>>>>>>> targets.  For example, solr and elasticsearch code should be
>>>>>>>>>>> separated (because their dependency on lucene conflicts) but both
>>>>>>>>>>> will depend on common enrichment code.  Also, now that parser,
>>>>>>>>>>> enrichment and pcap topologies are separate, code for those
>>>>>>>>>>> topologies will be deployed as separate jars.  No reason to include
>>>>>>>>>>> parser code in enrichment topologies and vice-versa.  Any other
>>>>>>>>>>> considerations I'm missing?
>>>>>>>>>>> 
>>>>>>>>>>> With that being said, here is my initial proposal:
>>>>>>>>>>> 
>>>>>>>>>>> *   common - Any common code that all topologies depend on
>>>>>>>>>>> (configuration classes, generic writers for example).  No
>>>>>>>>>>> dependencies on other Metron projects.
>>>>>>>>>>> *   test - Contains utilities for writing unit tests, sample configs
>>>>>>>>>>> and sample data.  Will depend on common.
>>>>>>>>>>> *   integration-test - Contains utilities and classes needed to run
>>>>>>>>>>> our integration tests (in memory components for example).  Will
>>>>>>>>>>> depend on common and test.
>>>>>>>>>>> *   dataload - Contains all code related to data loading.  Will also
>>>>>>>>>>> include any property files needed and integration tests.  Will
>>>>>>>>>>> depend on common, test (test scope), and integration-test (test
>>>>>>>>>>> scope).
>>>>>>>>>>> *   parser - All code specific to the parser topologies.  Would also
>>>>>>>>>>> include scripts, property files, flux files and parser topology
>>>>>>>>>>> integration tests.  This project will depend on common, test (test
>>>>>>>>>>> scope), and integration-test (test scope).
>>>>>>>>>>> *   enrichment - All code specific to the enrichment topologies
>>>>>>>>>>> (except solr and elasticsearch).  Would also include scripts,
>>>>>>>>>>> property files, flux files and enrichment topology integration
>>>>>>>>>>> tests.  This project will depend on common, test (test scope), and
>>>>>>>>>>> integration-test (test scope).
>>>>>>>>>>> *   elasticsearch - All Elasticsearch related code.  Will depend on
>>>>>>>>>>> enrichment.
>>>>>>>>>>> *   solr - All Solr related code.  Will depend on enrichment.
>>>>>>>>>>> *   pcap - All code specific to the topology dedicated to pcap.
>>>>>>>>>>> Would also include scripts, property files, flux files and pcap
>>>>>>>>>>> integration tests.  This project will depend on common, test (test
>>>>>>>>>>> scope) and integration-test (test scope).
>>>>>>>>>>> *   api - This will serve as a generic replacement for
>>>>>>>>>>> Metron-Pcap_Service.  Will contain all code to build a Metron web
>>>>>>>>>>> service middle layer that can expose APIs through REST or other
>>>>>>>>>>> client protocols.  Could possibly depend on all other projects or
>>>>>>>>>>> be separated further if version conflicts arise (separate api
>>>>>>>>>>> projects for solr and elasticsearch for example).
>>>>>>>>>>> 
>>>>>>>>>>> Looking forward to hearing everyone's feedback and great ideas.
>>>>>>>>>>> 
>>>>>>>>>>> Ryan Merriman
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> -- 
>>>>>>>>>> Nick Allen <[email protected]>
>>>>>>> 
>>>
>>
>>
