Also, +1 for more intelligent use of dependencyManagement and having each sub-module build independently. I think the top-level pom should build metron streaming as well as the C component.
I tend to favor smaller projects for common code rather than these grab bag common projects as well, but I do not have a strong opposition necessarily.

Sorry for typos; commenting from my phone at the airport.

Casey

On Mon, Apr 11, 2016 at 11:57 Casey Stella <[email protected]> wrote:

> I'm in general in favor of keeping an integration test project only for
> integration test infrastructure (i.e. the in-memory components) and having
> the integration tests live in the projects that have the components that
> are being tested.
>
> On Mon, Apr 11, 2016 at 11:36 David Lyle <[email protected]> wrote:
>
>> I think I was thinking along the same lines as James, let me read it back
>> to make sure:
>>
>> Metron
>> Platform
>> Common (*)
>> Integration-Test (*)
>> DataManagement
>> PCAP
>> Parsers
>> Enrichment
>> Solr
>> Elasticsearch
>> Deployment
>> Streaming
>> UI
>>
>> For Common and Integration-Test, I'd be interested in a little more
>> discussion around keeping them. I lean toward not having them. I understand
>> and support the goal of reuse, but I've found these catch-all projects
>> don't always facilitate that aim. We may be better served in the long run
>> by aligning these classes with their initial users. For example, wouldn't
>> all the bolt interfaces and abstract classes be better homed in Enrichment?
>> Configuration classes may be best as a separate project under Platform? The
>> classes in Metron-Testing may have to stick around as a separate project-
>> but perhaps not, they seem to be tightly aligned with enrichment type
>> integration testing.
>>
>> Also- since we're going to have to refactor the poms as part of this
>> effort, there are some first order principles that I'd be interested in
>> hearing others' thoughts about:
>>
>> 1) mvn (whatever) should run from the top level and each sub-module.
>> 2) The top level pom should use a dependencyManagement section to avoid
>> global_version type variables.
>> 3) All plugins and dependencies should have a specified version (fwiw, I
>> think we're pretty good here, but it's worth a look)
>> 4) Versioning- master/trunk should be version-SNAPSHOT.
>> 5) Other thoughts?
>>
>> -D...
>>
>> On Sun, Apr 10, 2016 at 8:31 PM, James Sirota <[email protected]> wrote:
>>
>> > Hi Debo,
>> >
>> > I think it would be great if you set it up
>> >
>> > Thanks,
>> > James
>> >
>> > On 4/10/16, 6:25 PM, "Debojyoti Dutta" <[email protected]> wrote:
>> >
>> > >I have set it up for another open source effort in the past and it was
>> > >not very hard. Am happy to volunteer if needed.
>> > >
>> > >Thx
>> > >Debo
>> > >
>> > >Sent from my iPhone
>> > >
>> > >> On Apr 10, 2016, at 5:53 PM, James Sirota <[email protected]> wrote:
>> > >>
>> > >> I’d be open to an IRC channel. Does anyone know if Apache allows
>> > >> this? If yes, does anyone know how to set one up?
>> > >>
>> > >> Thanks,
>> > >> James
>> > >>
>> > >>> On 4/10/16, 4:52 PM, "Debojyoti Dutta" <[email protected]> wrote:
>> > >>>
>> > >>> Hi Nick
>> > >>>
>> > >>> I like your suggestions. For the enrichment layer do you think it
>> > >>> would also include any advanced analytics. Else we might want to have an
>> > >>> analytics layer.
>> > >>>
>> > >>> It would be good to have an arch which could be extended for new
>> > >>> functionality.
>> > >>>
>> > >>> However Ryan's suggestion of the ui API and deployer also makes sense.
>> > >>>
>> > >>> Should we have an IRC channel to discuss this or maybe etherpad?
>> > >>>
>> > >>> Debo
>> > >>>
>> > >>> Sent from my iPhone
>> > >>>
>> > >>>> On Apr 10, 2016, at 4:36 PM, Nick Allen <[email protected]> wrote:
>> > >>>>
>> > >>>> It might help to think of our code base as four separate types of
>> > >>>> functionality.
>> > >>>> This is primarily meant to give us a framework to think
>> > >>>> about the organization of Metron (and drive more discussion), rather than
>> > >>>> my proposal for a specific structure.
>> > >>>>
>> > >>>> - Sensor - Anything that captures external, non-streaming data and
>> > >>>>   presents it in a form ready for stream processing.
>> > >>>> - Input - Responsible for preparing streaming data for enrichment. The
>> > >>>>   existing "parsers" fit neatly into this space.
>> > >>>> - Enrichment - Responsible for enriching an incoming data feed like
>> > >>>>   geoip, asset enrichment, threat intel lookups, etc.
>> > >>>> - Output - Responsible for persisting data that has been processed by
>> > >>>>   Metron which obviously means search indexers or data stores.
>> > >>>>
>> > >>>> On Fri, Apr 8, 2016 at 4:46 PM, Ryan Merriman <[email protected]> wrote:
>> > >>>>
>> > >>>>> All,
>> > >>>>>
>> > >>>>> I would like to propose a review and refactor of the current project
>> > >>>>> organization within Metron. Much of the way the legacy code was organized
>> > >>>>> does not make sense anymore and could be designed so that it is easier to
>> > >>>>> navigate and understand. Our test coverage has increased substantially so
>> > >>>>> I believe we can do this with confidence.
>> > >>>>>
>> > >>>>> First off, I think we should agree on a naming convention. I see some
>> > >>>>> projects (YARN and Storm for example) that prepend the sub-project with the
>> > >>>>> name of the top-level project (storm-core for example). Metron also
>> > >>>>> currently does this (Metron-Common). I think that's fine, although in the
>> > >>>>> case of Metron, I feel like having "Metron" prepended is redundant.
>> > >>>>> Regardless of whether we decide to stick with that approach, I propose that
>> > >>>>> project names be uniform and lowercase.
>> > >>>>> For example, under these assumptions "Metron-Common" would change to
>> > >>>>> "common".
>> > >>>>>
>> > >>>>> The first level of organization makes sense to me. Only change I would
>> > >>>>> make would be to project names:
>> > >>>>>
>> > >>>>> * deployment
>> > >>>>> * streaming
>> > >>>>> * ui
>> > >>>>>
>> > >>>>> Or if we want to keep metron in project names:
>> > >>>>>
>> > >>>>> * metron-deployment
>> > >>>>> * metron-streaming
>> > >>>>> * metron-ui
>> > >>>>>
>> > >>>>> For now I don't see any changes necessary in deployment or ui
>> > >>>>> organization. I see the streaming project structure primarily driven by 2
>> > >>>>> things: the Maven dependency tree and deployment targets. For example,
>> > >>>>> solr and elasticsearch code should be separated (because their dependency
>> > >>>>> on lucene conflicts) but both will depend on common enrichment code. Also,
>> > >>>>> now that parser, enrichment and pcap topologies are separate, code for
>> > >>>>> those topologies will be deployed as separate jars. No reason to include
>> > >>>>> parser code in enrichment topologies and vice-versa. Any other
>> > >>>>> considerations I'm missing?
>> > >>>>>
>> > >>>>> With that being said, here is my initial proposal:
>> > >>>>>
>> > >>>>> * common - Any common code that all topologies depend on
>> > >>>>>   (configuration classes, generic writers for example). No dependencies on
>> > >>>>>   other Metron projects.
>> > >>>>> * test - Contains utilities for writing unit tests, sample configs and
>> > >>>>>   sample data. Will depend on common.
>> > >>>>> * integration-test - Contains utilities and classes needed to run our
>> > >>>>>   integration tests (in memory components for example). Will depend on
>> > >>>>>   common and test.
>> > >>>>> * dataload - Contains all code related to data loading. Will also
>> > >>>>>   include any property files needed and integration tests.
>> > >>>>>   Will depend on common, test (test scope), and integration-test
>> > >>>>>   (test scope).
>> > >>>>> * parser - All code specific to the parser topologies. Would also
>> > >>>>>   include scripts, property files, flux files and parser topology
>> > >>>>>   integration tests. This project will depend on common, test (test
>> > >>>>>   scope), and integration-test (test scope).
>> > >>>>> * enrichment - All code specific to the enrichment topologies (except
>> > >>>>>   solr and elasticsearch). Would also include scripts, property files,
>> > >>>>>   flux files and enrichment topology integration tests. This project will
>> > >>>>>   depend on common, test (test scope), and integration-test (test scope).
>> > >>>>> * elasticsearch - All Elasticsearch related code. Will depend on
>> > >>>>>   enrichment.
>> > >>>>> * solr - All Solr related code. Will depend on enrichment.
>> > >>>>> * pcap - All code specific to the topology dedicated to pcap. Would
>> > >>>>>   also include scripts, property files, flux files and pcap integration
>> > >>>>>   test. This project will depend on common, test (test scope) and
>> > >>>>>   integration-test (test scope).
>> > >>>>> * api - This will serve as a generic replacement for
>> > >>>>>   Metron-Pcap_Service. Will contain all code to build a Metron web service
>> > >>>>>   middle layer that can expose APIs through REST or other client protocols.
>> > >>>>>   Could possibly depend on all other projects or separated further if
>> > >>>>>   version conflicts arise (separate api projects for solr and
>> > >>>>>   elasticsearch for example).
>> > >>>>>
>> > >>>>> Looking forward to hearing everyone's feedback and great ideas.
>> > >>>>>
>> > >>>>> Ryan Merriman
>> > >>>>
>> > >>>> --
>> > >>>> Nick Allen <[email protected]>
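
For reference, the module dependencies listed in Ryan's proposal quoted above can be checked mechanically. What follows is a minimal Python sketch and not something posted in the thread. The names `PROPOSED_DEPS` and `build_order` are hypothetical, test-scoped dependencies are treated as ordinary edges, and `api` is left out because the proposal leaves its dependencies open. It assumes Python 3.9+ for the standard-library `graphlib` module, and it only confirms that the proposed graph is acyclic and yields one valid build order.

```python
from graphlib import TopologicalSorter

# Module -> modules it depends on, transcribed from the proposal.
# Assumption: test-scoped dependencies count as ordinary edges.
# Assumption: "api" is omitted because its dependencies are left open.
PROPOSED_DEPS = {
    "common": set(),
    "test": {"common"},
    "integration-test": {"common", "test"},
    "dataload": {"common", "test", "integration-test"},
    "parser": {"common", "test", "integration-test"},
    "enrichment": {"common", "test", "integration-test"},
    "elasticsearch": {"enrichment"},
    "solr": {"enrichment"},
    "pcap": {"common", "test", "integration-test"},
}


def build_order(deps):
    """Return one valid build order (dependencies first).

    Raises graphlib.CycleError if the modules depend on each other in a cycle.
    """
    return list(TopologicalSorter(deps).static_order())


if __name__ == "__main__":
    # Prints the modules in an order where each one follows its dependencies.
    print(build_order(PROPOSED_DEPS))
```

Because solr and elasticsearch each depend only on enrichment and not on each other, neither pulls in the other's Lucene version, which is the separation the proposal asks for.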
