Re: [DISCUSS] Project reorganization

Casey Stella Mon, 11 Apr 2016 09:02:42 -0700

Also, +1 for more intelligent use of dependencyManagement and having each
sub-module build independently. I think the top-level pom should build
Metron streaming as well as the C component.
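
A minimal sketch of what the dependencyManagement approach could look like
in a top-level pom. Every coordinate and version below is a placeholder
assumed for illustration (the groupId, the 0.1-SNAPSHOT version, the module
name, and the com.example dependency), not taken from the actual Metron
build:

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>

  <!-- Assumed coordinates; master/trunk carries a -SNAPSHOT version. -->
  <groupId>org.apache.metron</groupId>
  <artifactId>metron</artifactId>
  <version>0.1-SNAPSHOT</version>
  <packaging>pom</packaging>

  <!-- Each module listed here is built when mvn runs at the top level. -->
  <modules>
    <module>metron-streaming</module>
  </modules>

  <!-- Versions are pinned once here instead of in global_version-style
       properties. This section declares versions only; it does not add
       the dependency to any module by itself. -->
  <dependencyManagement>
    <dependencies>
      <dependency>
        <!-- Hypothetical third-party library -->
        <groupId>com.example</groupId>
        <artifactId>example-lib</artifactId>
        <version>1.2.3</version>
      </dependency>
    </dependencies>
  </dependencyManagement>
</project>
```

A sub-module that names this pom as its parent would then declare
com.example:example-lib with no version element and inherit 1.2.3, so the
version lives in one place and the sub-module can still be built on its own
once the parent pom is installed or resolvable.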


I also tend to favor smaller projects for common code over these grab-bag
common projects, but I am not strongly opposed to them.

Sorry for typos; commenting from my phone at the airport.

Casey
On Mon, Apr 11, 2016 at 11:57 Casey Stella <[email protected]> wrote:

> In general, I'm in favor of keeping an integration test project only for
> integration test infrastructure (i.e., the in-memory components) and having
> the integration tests live in the projects that contain the components
> being tested.
>
> On Mon, Apr 11, 2016 at 11:36 David Lyle <[email protected]> wrote:
>
>> I think I was thinking along the same lines as James, let me read it back
>> to make sure:
>>
>> Metron
>>   Platform
>>      Common (*)
>>      Integration-Test (*)
>>      DataManagement
>>      PCAP
>>      Parsers
>>      Enrichment
>>        Solr
>>        Elasticsearch
>>   Deployment
>>   Streaming
>>   UI
>>
>> For Common and Integration-Test, I'd be interested in a little more
>> discussion around keeping them. I lean toward not having them. I
>> understand and support the goal of reuse, but I've found these catch-all
>> projects don't always facilitate that aim. We may be better served in the
>> long run by aligning these classes with their initial users. For example,
>> wouldn't all the bolt interfaces and abstract classes be better homed in
>> Enrichment? Configuration classes may be best as a separate project under
>> Platform? The classes in Metron-Testing may have to stick around as a
>> separate project, but perhaps not; they seem to be tightly aligned with
>> enrichment-type integration testing.
>>
>> Also, since we're going to have to refactor the poms as part of this
>> effort, there are some first-order principles that I'd be interested in
>> hearing others' thoughts about:
>>
>> 1) mvn (whatever) should run from the top level and each sub-module.
>> 2) The top level pom should use a dependencyManagement section to avoid
>> global_version type variables.
>> 3) All plugins and dependencies should have a specified version (fwiw, I
>> think we're pretty good here, but it's worth a look)
>> 4) Versioning: master/trunk should be version-SNAPSHOT.
>> 5) Other thoughts?
>>
>>
>> -D...
>>
>>
>> On Sun, Apr 10, 2016 at 8:31 PM, James Sirota <[email protected]>
>> wrote:
>>
>> > Hi Debo,
>> >
>> > I think it would be great if you set it up
>> >
>> > Thanks,
>> > James
>> >
>> >
>> >
>> >
>> > On 4/10/16, 6:25 PM, "Debojyoti Dutta" <[email protected]> wrote:
>> >
>> > >I have set it up for another open source effort in the past and it was
>> > not very hard. Am happy to volunteer if needed.
>> > >
>> > >Thx
>> > >Debo
>> > >
>> > >Sent from my iPhone
>> > >
>> > >> On Apr 10, 2016, at 5:53 PM, James Sirota <[email protected]>
>> > wrote:
>> > >>
>> > >> I’d be open to an IRC channel.  Does anyone know if Apache allows
>> > this?  If yes, does anyone know how to set one up?
>> > >>
>> > >> Thanks,
>> > >> James
>> > >>
>> > >>
>> > >>
>> > >>
>> > >>> On 4/10/16, 4:52 PM, "Debojyoti Dutta" <[email protected]> wrote:
>> > >>>
>> > >>> Hi Nick
>> > >>>
>> > >>> I like your suggestions. For the enrichment layer, do you think it
>> > would also include any advanced analytics? Otherwise we might want to
>> > have an analytics layer.
>> > >>>
>> > >>> It would be good to have an arch which could be extended for new
>> > functionality.
>> > >>>
>> > >>> However Ryan's suggestion of the ui API and deployer also makes
>> sense.
>> > >>>
>> > >>> Should we have an IRC channel to discuss this or maybe etherpad?
>> > >>>
>> > >>> Debo
>> > >>>
>> > >>> Sent from my iPhone
>> > >>>
>> > >>>> On Apr 10, 2016, at 4:36 PM, Nick Allen <[email protected]>
>> wrote:
>> > >>>>
>> > >>>> It might help to think of our code base as four separate types of
>> > >>>> functionality.  This is primarily meant to give us a framework to
>> > think
>> > >>>> about the organization of Metron (and drive more discussion),
>> rather
>> > than
>> > >>>> my proposal for a specific structure.
>> > >>>>
>> > >>>>  - Sensor - Anything that captures external, non-streaming data and
>> > >>>>  presents it in a form ready for stream processing.
>> > >>>>  - Input - Responsible for preparing streaming data for enrichment.
>> > The
>> > >>>>  existing "parsers" fit neatly into this space.
>> > >>>>  - Enrichment - Responsible for enriching an incoming data feed
>> > >>>>  with things like geoip, asset enrichment, threat intel lookups, etc.
>> > >>>>  - Output - Responsible for persisting data that has been
>> processed by
>> > >>>>  Metron which obviously means search indexers or data stores.
>> > >>>>
>> > >>>>
>> > >>>>
>> > >>>>
>> > >>>>
>> > >>>> On Fri, Apr 8, 2016 at 4:46 PM, Ryan Merriman <
>> > [email protected]>
>> > >>>> wrote:
>> > >>>>
>> > >>>>> All,
>> > >>>>>
>> > >>>>> I would like to propose a review and refactor of the current
>> project
>> > >>>>> organization within Metron.  Much of the way the legacy code was
>> > organized
>> > >>>>> does not make sense anymore and could be designed so that it is
>> > easier to
>> > >>>>> navigate and understand.  Our test coverage has increased
>> > substantially so
>> > >>>>> I believe we can do this with confidence.
>> > >>>>>
>> > >>>>> First off, I think we should agree on a naming convention.  I see
>> > some
>> > >>>>> projects (YARN and Storm for example) that prepend the sub-project
>> > with the
>> > >>>>> name of the top-level project (storm-core for example).  Metron
>> also
>> > >>>>> currently does this (Metron-Common).  I think that's fine,
>> although
>> > in the
>> > >>>>> case of Metron, I feel like having "Metron" prepended is
>> redundant.
>> > >>>>> Regardless of whether we decide to stick with that approach, I
>> > propose that
>> > >>>>> project names be uniform and lowercase.  For example, under these
>> > >>>>> assumptions "Metron-Common" would change to "common".
>> > >>>>>
>> > >>>>> The first level of organization makes sense to me.  Only change I
>> > would
>> > >>>>> make would be to project names:
>> > >>>>>
>> > >>>>> *   deployment
>> > >>>>> *   streaming
>> > >>>>> *   ui
>> > >>>>>
>> > >>>>> Or if we want to keep metron in project names:
>> > >>>>>
>> > >>>>> *   metron-deployment
>> > >>>>> *   metron-streaming
>> > >>>>> *   metron-ui
>> > >>>>>
>> > >>>>> For now I don't see any changes necessary in deployment or ui
>> > >>>>> organization.  I see the streaming project structure primarily
>> > driven by 2
>> > >>>>> things:  the Maven dependency tree and deployment targets.  For
>> > example,
>> > >>>>> solr and elasticsearch code should be separated (because their
>> > dependency
>> > >>>>> on lucene conflicts) but both will depend on common enrichment
>> > code.  Also,
>> > >>>>> now that parser, enrichment and pcap topologies are separate, code
>> > for
>> > >>>>> those topologies will be deployed as separate jars.  No reason to
>> > include
>> > >>>>> parser code in enrichment topologies and vice-versa.  Any other
>> > >>>>> considerations I'm missing?
>> > >>>>>
>> > >>>>> With that being said, here is my initial proposal:
>> > >>>>>
>> > >>>>> *   common -  Any common code that all topologies depend on
>> > >>>>> (configuration classes, generic writers for example).  No
>> > dependencies on
>> > >>>>> other Metron projects.
>> > >>>>> *   test - Contains utilities for writing unit tests, sample
>> configs
>> > and
>> > >>>>> sample data.  Will depend on common.
>> > >>>>> *   integration-test - Contains utilities and classes needed to
>> run
>> > our
>> > >>>>> integration tests (in memory components for example).  Will
>> depend on
>> > >>>>> common and test.
>> > >>>>> *   dataload - Contains all code related to data loading.  Will
>> also
>> > >>>>> include any property files needed and integration tests.  Will
>> > depend on
>> > >>>>> common, test (test scope), and integration-test (test scope).
>> > >>>>> *   parser - All code specific to the parser topologies.  Would
>> also
>> > >>>>> include scripts, property files, flux files and parser topology
>> > integration
>> > >>>>> tests.  This project will depend on common, test (test scope), and
>> > >>>>> integration-test (test scope).
>> > >>>>> *   enrichment - All code specific to the enrichment topologies
>> > (except
>> > >>>>> solr and elasticsearch).  Would also include scripts, property
>> > files, flux
>> > >>>>> files and enrichment topology integration tests.  This project
>> will
>> > depend
>> > >>>>> on common, test (test scope), and integration-test (test scope).
>> > >>>>> *   elasticsearch - All Elasticsearch related code.  Will depend
>> on
>> > >>>>> enrichment.
>> > >>>>> *   solr - All Solr related code.  Will depend on enrichment.
>> > >>>>> *   pcap - All code specific to the topology dedicated to pcap.
>> > Would
>> > >>>>> also include scripts, property files, flux files and pcap
>> integration
>> > >>>>> test.  This project will depend on common, test (test scope) and
>> > >>>>> integration-test (test scope).
>> > >>>>> *   api - This will serve as a generic replacement for
>> > >>>>> Metron-Pcap_Service.  Will contain all code to build a Metron web
>> > service
>> > >>>>> middle layer that can expose APIs through REST or other client
>> > protocols.
>> > >>>>> Could possibly depend on all other projects or separated further
>> if
>> > version
>> > >>>>> conflicts arise (separate api projects for solr and elasticsearch
>> for
>> > >>>>> example).
>> > >>>>>
>> > >>>>> Looking forward to hearing everyone's feedback and great ideas.
>> > >>>>>
>> > >>>>> Ryan Merriman
>> > >>>>
>> > >>>>
>> > >>>>
>> > >>>> --
>> > >>>> Nick Allen <[email protected]>
>> > >>>
>> > >
>> >
>>
>
