James brings up a good point. I propose adding another project under metron-platform called metron-configuration. This would be a fairly lightweight project containing everything related to configuration (property files, JSON files, Flux files, etc.).
On 4/13/16, 8:56 AM, "James Sirota" <[email protected]> wrote:
>+1 from me.
>
>I would also like to address the configs and make sure the configs are in the same place. Do you have ideas on where we would put those?
>
>Thanks,
>James
>
>On 4/13/16, 6:50 AM, "Ryan Merriman" <[email protected]> wrote:
>
>>Thank you for all the feedback everyone. I will attempt to summarize all the input we've received and update my initial proposal. We can discuss further if anyone is still unclear and I will volunteer to capture all the details in a document of some kind once we all come to a consensus.
>>
>>Looks like everyone is in agreement for the top level projects. Nick is working on a task that will require an additional top level project so I am going to add that in as well:
>>
>>metron-deployment
>>metron-platform
>>metron-ui
>>metron-sensors
>>
>>All of these except metron-platform are well understood and don't warrant any more discussion. For metron-platform there seem to be 2 areas that are not as clear:
>>
>>- whether we need a common project
>>- how do we organize test related code
>>
>>I agree with David and others that a common project will likely get misused and could become unnecessarily bloated. But I suspect there will be cases where we have common code being used across multiple projects (is already happening). In this case we will either need this common project or we will have to keep common code in one of the other projects and have all other projects extend that. For the latter, an example would be keeping common code in enrichment and having parsers declare enrichment as a dependency.
>>There are a couple downsides I see with this approach:
>>
>>- parser topology jars now bring along all the enrichment dependencies
>>- since more code from various projects is being packaged together, version conflicts are more likely and poms become more complicated due to all the necessary exclusions
>>
>>My thinking is that any jar file being deployed should only contain what it needs. Curious what others think here. My vote would be to maintain a common project (or whatever we want to call it) and be diligent about not letting project-specific code slip in there.
>>
>>I believe Nick was the first person to ask the question about projects related to test code and why we would need separate test and integration test. The reason for this is that our integration-test classes currently depend on other projects (not surprising since they are integration tests). If there are utilities we want to make available to all projects (mock classes, utilities for reading sample data, etc) then it can't live in integration-test because that will introduce circular dependencies. If it is possible to refactor our current Metron-Testing project so that it doesn't depend on any other projects, then we can keep utilities here. Otherwise we need a separate project for testing utilities. I suspect removing other project dependencies from Metron-Testing will prove more difficult than it's worth so my vote would be to have 2 test related projects.
>>
>>So here is where our metron-platform organization stands:
>>
>>metron-common *
>>metron-integration-test *
>>metron-test-utilities *
>>metron-data-management
>>metron-pcap
>>metron-parsers
>>metron-enrichment
>>  metron-solr
>>  metron-elasticsearch
>>metron-api
>>
>>* may or may not change depending on the outcome of this discussion
>>
>>Thoughts?
>>
>>Ryan Merriman
>>
>>On 4/11/16, 4:15 PM, "Debojyoti Dutta" <[email protected]> wrote:
>>
>>>If you load up your IRC client just type
>>>/join #apache-metron-dev
>>>
>>>Sent from my iPhone
>>>
>>>> On Apr 11, 2016, at 12:06 PM, James Sirota <[email protected]> wrote:
>>>>
>>>> Great, thanks, Debo. Where can I find instructions on how to get to it?
>>>>
>>>> Thanks,
>>>> James
>>>>
>>>>> On 4/11/16, 9:41 AM, "Debo Dutta (dedutta)" <[email protected]> wrote:
>>>>>
>>>>> Hi James
>>>>>
>>>>> Ok set it up and ack ...
>>>>>
>>>>> Thx
>>>>>
>>>>>> On 4/10/16, 6:31 PM, "James Sirota" <[email protected]> wrote:
>>>>>>
>>>>>> Hi Debo,
>>>>>>
>>>>>> I think it would be great if you set it up
>>>>>>
>>>>>> Thanks,
>>>>>> James
>>>>>>
>>>>>>> On 4/10/16, 6:25 PM, "Debojyoti Dutta" <[email protected]> wrote:
>>>>>>>
>>>>>>> I have set it up for another open source effort in the past and it was not very hard. Am happy to volunteer if needed.
>>>>>>>
>>>>>>> Thx
>>>>>>> Debo
>>>>>>>
>>>>>>> Sent from my iPhone
>>>>>>>
>>>>>>>> On Apr 10, 2016, at 5:53 PM, James Sirota <[email protected]> wrote:
>>>>>>>>
>>>>>>>> I'd be open to an IRC channel. Does anyone know if Apache allows this? If yes, does anyone know how to set one up?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> James
>>>>>>>>
>>>>>>>>> On 4/10/16, 4:52 PM, "Debojyoti Dutta" <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>> Hi Nick
>>>>>>>>>
>>>>>>>>> I like your suggestions. For the enrichment layer do you think it would also include any advanced analytics? Else we might want to have an analytics layer.
>>>>>>>>>
>>>>>>>>> It would be good to have an arch which could be extended for new functionality.
>>>>>>>>>
>>>>>>>>> However Ryan's suggestion of the ui API and deployer also makes sense.
>>>>>>>>>
>>>>>>>>> Should we have an IRC channel to discuss this or maybe etherpad?
>>>>>>>>>
>>>>>>>>> Debo
>>>>>>>>>
>>>>>>>>> Sent from my iPhone
>>>>>>>>>
>>>>>>>>>> On Apr 10, 2016, at 4:36 PM, Nick Allen <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>> It might help to think of our code base as four separate types of functionality. This is primarily meant to give us a framework to think about the organization of Metron (and drive more discussion), rather than my proposal for a specific structure.
>>>>>>>>>>
>>>>>>>>>> - Sensor - Anything that captures external, non-streaming data and presents it in a form ready for stream processing.
>>>>>>>>>> - Input - Responsible for preparing streaming data for enrichment. The existing "parsers" fit neatly into this space.
>>>>>>>>>> - Enrichment - Responsible for enriching an incoming data feed like geoip, asset enrichment, threat intel lookups, etc.
>>>>>>>>>> - Output - Responsible for persisting data that has been processed by Metron which obviously means search indexers or data stores.
>>>>>>>>>>
>>>>>>>>>> On Fri, Apr 8, 2016 at 4:46 PM, Ryan Merriman <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> All,
>>>>>>>>>>>
>>>>>>>>>>> I would like to propose a review and refactor of the current project organization within Metron. Much of the way the legacy code was organized does not make sense anymore and could be designed so that it is easier to navigate and understand. Our test coverage has increased substantially so I believe we can do this with confidence.
>>>>>>>>>>>
>>>>>>>>>>> First off, I think we should agree on a naming convention.
>>>>>>>>>>> I see some projects (YARN and Storm for example) that prepend the sub-project with the name of the top-level project (storm-core for example). Metron also currently does this (Metron-Common). I think that's fine, although in the case of Metron, I feel like having "Metron" prepended is redundant. Regardless of whether we decide to stick with that approach, I propose that project names be uniform and lowercase. For example, under these assumptions "Metron-Common" would change to "common".
>>>>>>>>>>>
>>>>>>>>>>> The first level of organization makes sense to me. Only change I would make would be to project names:
>>>>>>>>>>>
>>>>>>>>>>> * deployment
>>>>>>>>>>> * streaming
>>>>>>>>>>> * ui
>>>>>>>>>>>
>>>>>>>>>>> Or if we want to keep metron in project names:
>>>>>>>>>>>
>>>>>>>>>>> * metron-deployment
>>>>>>>>>>> * metron-streaming
>>>>>>>>>>> * metron-ui
>>>>>>>>>>>
>>>>>>>>>>> For now I don't see any changes necessary in deployment or ui organization. I see the streaming project structure primarily driven by 2 things: the Maven dependency tree and deployment targets. For example, solr and elasticsearch code should be separated (because their dependency on lucene conflicts) but both will depend on common enrichment code. Also, now that parser, enrichment and pcap topologies are separate, code for those topologies will be deployed as separate jars. No reason to include parser code in enrichment topologies and vice-versa. Any other considerations I'm missing?
>>>>>>>>>>>
>>>>>>>>>>> With that being said, here is my initial proposal:
>>>>>>>>>>>
>>>>>>>>>>> * common - Any common code that all topologies depend on (configuration classes, generic writers for example). No dependencies on other Metron projects.
>>>>>>>>>>> * test - Contains utilities for writing unit tests, sample configs and sample data. Will depend on common.
>>>>>>>>>>> * integration-test - Contains utilities and classes needed to run our integration tests (in memory components for example). Will depend on common and test.
>>>>>>>>>>> * dataload - Contains all code related to data loading. Will also include any property files needed and integration tests. Will depend on common, test (test scope), and integration-test (test scope).
>>>>>>>>>>> * parser - All code specific to the parser topologies. Would also include scripts, property files, flux files and parser topology integration tests. This project will depend on common, test (test scope), and integration-test (test scope).
>>>>>>>>>>> * enrichment - All code specific to the enrichment topologies (except solr and elasticsearch). Would also include scripts, property files, flux files and enrichment topology integration tests. This project will depend on common, test (test scope), and integration-test (test scope).
>>>>>>>>>>> * elasticsearch - All Elasticsearch related code. Will depend on enrichment.
>>>>>>>>>>> * solr - All Solr related code. Will depend on enrichment.
>>>>>>>>>>> * pcap - All code specific to the topology dedicated to pcap. Would also include scripts, property files, flux files and pcap integration test.
>>>>>>>>>>>   This project will depend on common, test (test scope) and integration-test (test scope).
>>>>>>>>>>> * api - This will serve as a generic replacement for Metron-Pcap_Service. Will contain all code to build a Metron web service middle layer that can expose APIs through REST or other client protocols. Could possibly depend on all other projects or be separated further if version conflicts arise (separate api projects for solr and elasticsearch for example).
>>>>>>>>>>>
>>>>>>>>>>> Looking forward to hearing everyone's feedback and great ideas.
>>>>>>>>>>>
>>>>>>>>>>> Ryan Merriman
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Nick Allen <[email protected]>
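
The dependency arguments in this thread can be checked against a small model. What follows is a minimal Python sketch and not part of Metron. The module names and edges are transcribed from the initial proposal in the thread; the helper names (`merge`, `transitive_deps`, `find_cycle`) and the two "what if" variants are illustrative assumptions. It models only dependencies between Metron modules, not third-party libraries, and it leaves out api because the proposal leaves its dependencies open.

```python
from typing import Dict, List, Set

Graph = Dict[str, List[str]]

# Compile-scope edges, transcribed from the initial proposal in the thread.
COMPILE: Graph = {
    "common": [],
    "test": ["common"],
    "integration-test": ["common", "test"],
    "dataload": ["common"],
    "parser": ["common"],
    "enrichment": ["common"],
    "elasticsearch": ["enrichment"],
    "solr": ["enrichment"],
    "pcap": ["common"],
}

# Test-scope edges: needed to build and test a module, not packaged in its jar.
TEST_SCOPE: Graph = {
    name: ["test", "integration-test"]
    for name in ("dataload", "parser", "enrichment", "pcap")
}


def merge(compile_deps: Graph, test_deps: Graph) -> Graph:
    """Combine both scopes into the graph a build has to put in order."""
    return {m: deps + test_deps.get(m, []) for m, deps in compile_deps.items()}


def transitive_deps(graph: Graph, module: str) -> Set[str]:
    """Every module reachable from `module` by following dependency edges."""
    seen: Set[str] = set()
    stack = list(graph.get(module, []))
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(graph.get(dep, []))
    return seen


def find_cycle(graph: Graph) -> List[str]:
    """Return one dependency cycle (first module repeated at the end), or []."""
    done: Set[str] = set()
    path: List[str] = []

    def visit(module: str) -> List[str]:
        if module in path:  # back on the current path: a cycle
            return path[path.index(module):] + [module]
        if module in done:
            return []
        path.append(module)
        for dep in graph.get(module, []):
            cycle = visit(dep)
            if cycle:
                return cycle
        path.pop()
        done.add(module)
        return []

    for module in graph:
        cycle = visit(module)
        if cycle:
            return cycle
    return []


# Variant A (hypothetical): parser reuses shared code kept in enrichment.
no_common = dict(COMPILE, parser=["enrichment"])

# Variant B (hypothetical): shared test utilities live in integration-test,
# which itself depends on parser.
utils_in_it = merge(COMPILE, TEST_SCOPE)
utils_in_it["integration-test"] = ["common", "test", "parser"]

if __name__ == "__main__":
    print(sorted(transitive_deps(COMPILE, "parser")))
    print(sorted(transitive_deps(no_common, "parser")))
    print(find_cycle(merge(COMPILE, TEST_SCOPE)))
    print(find_cycle(utils_in_it))
```

In this model the parser jar pulls in only common under the proposal and additionally picks up enrichment under variant A. The proposal as written has no cycle, while variant B produces the cycle integration-test -> parser -> integration-test, which is the circular dependency the thread warns about.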
