Hi Ryan,

This is great.  You should attach this to the Jira when you are ready to commit 
the reorg so we know which parts shifted.

Thanks,
James 




On 4/18/16, 1:30 PM, "Ryan Merriman" <[email protected]> wrote:

>Thanks Frank.  I’ve updated those in the spreadsheet.
>
>On 4/18/16, 3:27 PM, "Frank Lu" <[email protected]> wrote:
>
>>As of now, I think the following classes are not used:
>>
>>Metron-EnrichmentAdapters
>>  org.apache.metron.enrichment.adapters.cif.AbstractCIFAdapter.java
>>  org.apache.metron.enrichment.adapters.cif.CIFHbaseAdapter.java
>>  org.apache.metron.enrichment.adapters.whois.WhoisHBaseAdapter.java
>>
>>Metron-DataLoads
>>  org.apache.metron.dataloads.cif.HBaseTableLoad.java
>>
>>Thanks,
>>Frank Lu
>>
>>
>>
>>
>>On 4/18/16, 3:05 PM, "Ryan Merriman" <[email protected]> wrote:
>>
>>>All,
>>>
>>>I put together a list of all the project's Java assets that details where
>>>they will be moved (or potentially deleted) as part of the project
>>>reorganization.  Feedback welcome.
>>>
>>>Ryan Merriman 
>>>
>>>On 4/13/16, 9:42 AM, "James Sirota" <[email protected]> wrote:
>>>
>>>>I would not have configs as a project but rather as a folder structure that
>>>>other modules can point to
>>>>
>>>>Thanks,
>>>>James 
>>>>
>>>>
>>>>
>>>>
>>>>On 4/13/16, 7:32 AM, "Ryan Merriman" <[email protected]> wrote:
>>>>
>>>>>James brings up a good point.  I propose adding another project under
>>>>>metron-platform called metron-configuration.  This would be a fairly
>>>>>lightweight project that would contain anything related to configuration
>>>>>(property files, json files, flux files, etc).
>>>>>
>>>>>On 4/13/16, 8:56 AM, "James Sirota" <[email protected]> wrote:
>>>>>
>>>>>>+1 from me.
>>>>>>
>>>>>>I would also like to address the configs and make sure the configs are in
>>>>>>the same place.  Do you have ideas on where we would put those?
>>>>>>
>>>>>>Thanks,
>>>>>>James 
>>>>>>
>>>>>>
>>>>>>
>>>>>>On 4/13/16, 6:50 AM, "Ryan Merriman" <[email protected]>
>>>>>>wrote:
>>>>>>
>>>>>>>Thank you for all the feedback everyone.  I will attempt to summarize all
>>>>>>>the input we've received and update my initial proposal.  We can discuss
>>>>>>>further if anyone is still unclear and I will volunteer to capture all the
>>>>>>>details in a document of some kind once we all come to a consensus.
>>>>>>>
>>>>>>>Looks like everyone is in agreement on the top-level projects.  Nick is
>>>>>>>working on a task that will require an additional top-level project so I
>>>>>>>am going to add that in as well:
>>>>>>>
>>>>>>>metron-deployment
>>>>>>>metron-platform
>>>>>>>metron-ui
>>>>>>>metron-sensors
>>>>>>>
>>>>>>>All of these except metron-platform are well understood and don't warrant
>>>>>>>any more discussion.  For metron-platform there seem to be 2 areas that
>>>>>>>are not as clear:
>>>>>>>
>>>>>>>- whether we need a common project
>>>>>>>- how we organize test-related code
>>>>>>>
>>>>>>>I agree with David and others that a common project will likely get
>>>>>>>misused and could become unnecessarily bloated.  But I suspect there will
>>>>>>>be cases where we have common code being used across multiple projects
>>>>>>>(this is already happening).  In this case we will either need this common
>>>>>>>project or we will have to keep common code in one of the other projects
>>>>>>>and have all other projects extend that.  For the latter, an example would
>>>>>>>be keeping common code in enrichment and having parsers declare enrichment
>>>>>>>as a dependency.  There are a couple of downsides I see with this approach:
>>>>>>>
>>>>>>>- parser topology jars now bring along all the enrichment dependencies
>>>>>>>- since more code from various projects is being packaged together,
>>>>>>>version conflicts are more likely and poms become more complicated due to
>>>>>>>all the necessary exclusions
>>>>>>>
>>>>>>>My thinking is that any jar file being deployed should only contain what
>>>>>>>it needs.  Curious what others think here.  My vote would be to maintain a
>>>>>>>common project (or whatever we want to call it) and be diligent about not
>>>>>>>letting project-specific code slip in there.
>>>>>>>
>>>>>>>I believe Nick was the first person to ask the question about projects
>>>>>>>related to test code and why we would need separate test and integration
>>>>>>>test projects.  The reason for this is that our integration-test classes
>>>>>>>currently depend on other projects (not surprising since they are
>>>>>>>integration tests).  If there are utilities we want to make available to
>>>>>>>all projects (mock classes, utilities for reading sample data, etc) then
>>>>>>>they can't live in integration-test because that will introduce circular
>>>>>>>dependencies.  If it is possible to refactor our current Metron-Testing
>>>>>>>project so that it doesn't depend on any other projects, then we can keep
>>>>>>>utilities here.  Otherwise we need a separate project for testing
>>>>>>>utilities.  I suspect removing other project dependencies from
>>>>>>>Metron-Testing will prove more difficult than it's worth so my vote would
>>>>>>>be to have 2 test-related projects.
>>>>>>>
>>>>>>>So here is where our metron-platform organization stands:
>>>>>>>
>>>>>>>metron-common *
>>>>>>>metron-integration-test *
>>>>>>>metron-test-utilities *
>>>>>>>metron-data-management
>>>>>>>metron-pcap
>>>>>>>metron-parsers
>>>>>>>metron-enrichment
>>>>>>> metron-solr
>>>>>>> metron-elasticsearch
>>>>>>>metron-api
>>>>>>>
>>>>>>>* may or may not change depending on the outcome of this discussion
>>>>>>>
>>>>>>>Thoughts?
>>>>>>>
>>>>>>>Ryan Merriman
>>>>>>>
>>>>>>>
>>>>>>>On 4/11/16, 4:15 PM, "Debojyoti Dutta" <[email protected]> wrote:
>>>>>>>
>>>>>>>>If you load up your IRC client just type
>>>>>>>>/join #apache-metron-dev
>>>>>>>>
>>>>>>>>Sent from my iPhone
>>>>>>>>
>>>>>>>>> On Apr 11, 2016, at 12:06 PM, James Sirota
>>>>>>>>><[email protected]>
>>>>>>>>>wrote:
>>>>>>>>> 
>>>>>>>>> Great, thanks, Debo.  Where can I find instructions on how to get to it?
>>>>>>>>> 
>>>>>>>>> Thanks,
>>>>>>>>> James 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>> On 4/11/16, 9:41 AM, "Debo Dutta (dedutta)" <[email protected]>
>>>>>>>>>>wrote:
>>>>>>>>>> 
>>>>>>>>>> Hi James 
>>>>>>>>>> 
>>>>>>>>>> Ok set it up and ack ...
>>>>>>>>>> 
>>>>>>>>>> Thx
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>>> On 4/10/16, 6:31 PM, "James Sirota" <[email protected]>
>>>>>>>>>>>wrote:
>>>>>>>>>>> 
>>>>>>>>>>> Hi Debo,
>>>>>>>>>>> 
>>>>>>>>>>> I think it would be great if you set it up
>>>>>>>>>>> 
>>>>>>>>>>> Thanks,
>>>>>>>>>>> James 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>>> On 4/10/16, 6:25 PM, "Debojyoti Dutta" <[email protected]>
>>>>>>>>>>>>wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> I have set it up for another open source effort in the past and it
>>>>>>>>>>>> was not very hard. Am happy to volunteer if needed.
>>>>>>>>>>>> 
>>>>>>>>>>>> Thx 
>>>>>>>>>>>> Debo
>>>>>>>>>>>> 
>>>>>>>>>>>> Sent from my iPhone
>>>>>>>>>>>> 
>>>>>>>>>>>>> On Apr 10, 2016, at 5:53 PM, James Sirota
>>>>>>>>>>>>><[email protected]>
>>>>>>>>>>>>>wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I'd be open to an IRC channel.  Does anyone know if Apache allows
>>>>>>>>>>>>> this?  If yes, does anyone know how to set one up?
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> James 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On 4/10/16, 4:52 PM, "Debojyoti Dutta" <[email protected]>
>>>>>>>>>>>>>>wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Hi Nick
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I like your suggestions. For the enrichment layer, do you think it
>>>>>>>>>>>>>> would also include any advanced analytics? Otherwise we might want
>>>>>>>>>>>>>> to have an analytics layer.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> It would be good to have an architecture that can be extended for
>>>>>>>>>>>>>> new functionality.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> However Ryan's suggestion of the ui API and deployer also makes sense.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Should we have an IRC channel to discuss this or maybe etherpad?
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Debo
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Sent from my iPhone
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> On Apr 10, 2016, at 4:36 PM, Nick Allen <[email protected]>
>>>>>>>>>>>>>>>wrote:
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> It might help to think of our code base as four separate types of
>>>>>>>>>>>>>>> functionality.  This is primarily meant to give us a framework to think
>>>>>>>>>>>>>>> about the organization of Metron (and drive more discussion), rather than
>>>>>>>>>>>>>>> my proposal for a specific structure.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> - Sensor - Anything that captures external, non-streaming data and
>>>>>>>>>>>>>>> presents it in a form ready for stream processing.
>>>>>>>>>>>>>>> - Input - Responsible for preparing streaming data for enrichment.  The
>>>>>>>>>>>>>>> existing "parsers" fit neatly into this space.
>>>>>>>>>>>>>>> - Enrichment - Responsible for enriching an incoming data feed with
>>>>>>>>>>>>>>> things like geoip, asset enrichment, threat intel lookups, etc.
>>>>>>>>>>>>>>> - Output - Responsible for persisting data that has been processed by
>>>>>>>>>>>>>>> Metron, which obviously means search indexers or data stores.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> On Fri, Apr 8, 2016 at 4:46 PM, Ryan Merriman
>>>>>>>>>>>>>>><[email protected]>
>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> All,
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> I would like to propose a review and refactor of the current project
>>>>>>>>>>>>>>>> organization within Metron.  Much of the way the legacy code was organized
>>>>>>>>>>>>>>>> does not make sense anymore and could be designed so that it is easier to
>>>>>>>>>>>>>>>> navigate and understand.  Our test coverage has increased substantially so
>>>>>>>>>>>>>>>> I believe we can do this with confidence.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> First off, I think we should agree on a naming convention.  I see some
>>>>>>>>>>>>>>>> projects (YARN and Storm for example) that prepend the sub-project with the
>>>>>>>>>>>>>>>> name of the top-level project (storm-core for example).  Metron also
>>>>>>>>>>>>>>>> currently does this (Metron-Common).  I think that's fine, although in the
>>>>>>>>>>>>>>>> case of Metron, I feel like having "Metron" prepended is redundant.
>>>>>>>>>>>>>>>> Regardless of whether we decide to stick with that approach, I propose that
>>>>>>>>>>>>>>>> project names be uniform and lowercase.  For example, under these
>>>>>>>>>>>>>>>> assumptions "Metron-Common" would change to "common".
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> The first level of organization makes sense to me.  The only change I would
>>>>>>>>>>>>>>>> make would be to project names:
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> *   deployment
>>>>>>>>>>>>>>>> *   streaming
>>>>>>>>>>>>>>>> *   ui
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Or if we want to keep metron in project names:
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> *   metron-deployment
>>>>>>>>>>>>>>>> *   metron-streaming
>>>>>>>>>>>>>>>> *   metron-ui
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> For now I don't see any changes necessary in deployment or ui
>>>>>>>>>>>>>>>> organization.  I see the streaming project structure primarily driven by 2
>>>>>>>>>>>>>>>> things:  the Maven dependency tree and deployment targets.  For example,
>>>>>>>>>>>>>>>> solr and elasticsearch code should be separated (because their dependencies
>>>>>>>>>>>>>>>> on lucene conflict) but both will depend on common enrichment code.  Also,
>>>>>>>>>>>>>>>> now that parser, enrichment and pcap topologies are separate, code for
>>>>>>>>>>>>>>>> those topologies will be deployed as separate jars.  No reason to include
>>>>>>>>>>>>>>>> parser code in enrichment topologies and vice-versa.  Any other
>>>>>>>>>>>>>>>> considerations I'm missing?
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> With that being said, here is my initial proposal:
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> *   common - Any common code that all topologies depend on
>>>>>>>>>>>>>>>> (configuration classes, generic writers for example).  No dependencies on
>>>>>>>>>>>>>>>> other Metron projects.
>>>>>>>>>>>>>>>> *   test - Contains utilities for writing unit tests, sample configs and
>>>>>>>>>>>>>>>> sample data.  Will depend on common.
>>>>>>>>>>>>>>>> *   integration-test - Contains utilities and classes needed to run our
>>>>>>>>>>>>>>>> integration tests (in memory components for example).  Will depend on
>>>>>>>>>>>>>>>> common and test.
>>>>>>>>>>>>>>>> *   dataload - Contains all code related to data loading.  Will also
>>>>>>>>>>>>>>>> include any property files needed and integration tests.  Will depend on
>>>>>>>>>>>>>>>> common, test (test scope), and integration-test (test scope).
>>>>>>>>>>>>>>>> *   parser - All code specific to the parser topologies.  Would also
>>>>>>>>>>>>>>>> include scripts, property files, flux files and parser topology integration
>>>>>>>>>>>>>>>> tests.  This project will depend on common, test (test scope), and
>>>>>>>>>>>>>>>> integration-test (test scope).
>>>>>>>>>>>>>>>> *   enrichment - All code specific to the enrichment topologies (except
>>>>>>>>>>>>>>>> solr and elasticsearch).  Would also include scripts, property files, flux
>>>>>>>>>>>>>>>> files and enrichment topology integration tests.  This project will depend
>>>>>>>>>>>>>>>> on common, test (test scope), and integration-test (test scope).
>>>>>>>>>>>>>>>> *   elasticsearch - All Elasticsearch related code.  Will depend on
>>>>>>>>>>>>>>>> enrichment.
>>>>>>>>>>>>>>>> *   solr - All Solr related code.  Will depend on enrichment.
>>>>>>>>>>>>>>>> *   pcap - All code specific to the topology dedicated to pcap.  Would
>>>>>>>>>>>>>>>> also include scripts, property files, flux files and pcap integration
>>>>>>>>>>>>>>>> tests.  This project will depend on common, test (test scope) and
>>>>>>>>>>>>>>>> integration-test (test scope).
>>>>>>>>>>>>>>>> *   api - This will serve as a generic replacement for
>>>>>>>>>>>>>>>> Metron-Pcap_Service.  Will contain all code to build a Metron web service
>>>>>>>>>>>>>>>> middle layer that can expose APIs through REST or other client protocols.
>>>>>>>>>>>>>>>> Could possibly depend on all other projects or separated further if version
>>>>>>>>>>>>>>>> conflicts arise (separate api projects for solr and elasticsearch for
>>>>>>>>>>>>>>>> example).
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Looking forward to hearing everyone's feedback and great
>>>>>>>>>>>>>>>>ideas.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Ryan Merriman
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> -- 
>>>>>>>>>>>>>>> Nick Allen <[email protected]>
>>>>>>>>>>>> 
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>
>
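
For anyone skimming the thread, the dependency layout in the initial proposal and the circular-dependency concern raised later can be sketched as a small graph check. This is only an illustration, not project code: the module names are taken from the proposal, while `PROPOSED`, `build_order`, and the `cyclic` variant are hypothetical names made up for this sketch. Test-scoped dependencies are treated like any other edge, and api is left out because its dependencies were left open. It assumes Python 3.9+ for `graphlib`.

```python
from graphlib import CycleError, TopologicalSorter

# Module -> modules it depends on, transcribed from the initial proposal.
# Test-scope dependencies are included as ordinary edges (an assumption
# of this sketch); "api" is omitted because its dependencies were left open.
PROPOSED = {
    "common": set(),
    "test": {"common"},
    "integration-test": {"common", "test"},
    "dataload": {"common", "test", "integration-test"},
    "parser": {"common", "test", "integration-test"},
    "enrichment": {"common", "test", "integration-test"},
    "elasticsearch": {"enrichment"},
    "solr": {"enrichment"},
    "pcap": {"common", "test", "integration-test"},
}


def build_order(deps):
    """Return one valid build order, or None if the graph contains a cycle."""
    try:
        return list(TopologicalSorter(deps).static_order())
    except CycleError:
        return None


order = build_order(PROPOSED)

# Hypothetical variant of the concern about shared utilities: the
# integration-test module depends on enrichment (it exercises that code),
# while enrichment still depends on integration-test for its utilities.
cyclic = {name: set(d) for name, d in PROPOSED.items()}
cyclic["integration-test"].add("enrichment")

print(order)
print(build_order(cyclic))
```

The first graph sorts cleanly, with common built before everything else. The variant has no valid order, which is the argument for keeping shared test utilities in a project that depends on nothing but common.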
