I would not have configs as a project but rather as a folder structure that other modules can point to.
Thanks,
James

On 4/13/16, 7:32 AM, "Ryan Merriman" <[email protected]> wrote:

>James brings up a good point. I propose adding another project under
>metron-platform called metron-configuration. This would be a fairly
>lightweight project that would contain anything related to configuration
>(property files, JSON files, flux files, etc).
>
>On 4/13/16, 8:56 AM, "James Sirota" <[email protected]> wrote:
>
>>+1 from me.
>>
>>I would also like to address the configs and make sure the configs are in
>>the same place. Do you have ideas on where we would put those?
>>
>>Thanks,
>>James
>>
>>On 4/13/16, 6:50 AM, "Ryan Merriman" <[email protected]> wrote:
>>
>>>Thank you for all the feedback, everyone. I will attempt to summarize all
>>>the input we've received and update my initial proposal. We can discuss
>>>further if anyone is still unclear, and I will volunteer to capture all
>>>the details in a document of some kind once we all come to a consensus.
>>>
>>>Looks like everyone is in agreement on the top-level projects. Nick is
>>>working on a task that will require an additional top-level project, so I
>>>am going to add that in as well:
>>>
>>>metron-deployment
>>>metron-platform
>>>metron-ui
>>>metron-sensors
>>>
>>>All of these except metron-platform are well understood and don't warrant
>>>any more discussion. For metron-platform there seem to be 2 areas that
>>>are not as clear:
>>>
>>>- whether we need a common project
>>>- how we organize test-related code
>>>
>>>I agree with David and others that a common project will likely get
>>>misused and could become unnecessarily bloated. But I suspect there will
>>>be cases where we have common code being used across multiple projects
>>>(this is already happening). In this case we will either need this common
>>>project or we will have to keep common code in one of the other projects
>>>and have all other projects extend that.
>>>For the latter, an example would be keeping common code in enrichment
>>>and having parsers declare enrichment as a dependency. There are a
>>>couple of downsides I see with this approach:
>>>
>>>- parser topology jars now bring along all the enrichment dependencies
>>>- since more code from various projects is being packaged together,
>>>  version conflicts are more likely and poms become more complicated due
>>>  to all the necessary exclusions
>>>
>>>My thinking is that any jar file being deployed should only contain what
>>>it needs. Curious what others think here. My vote would be to maintain a
>>>common project (or whatever we want to call it) and be diligent about not
>>>letting project-specific code slip in there.
>>>
>>>I believe Nick was the first person to ask the question about projects
>>>related to test code and why we would need separate test and
>>>integration-test projects. The reason for this is that our
>>>integration-test classes currently depend on other projects (not
>>>surprising since they are integration tests). If there are utilities we
>>>want to make available to all projects (mock classes, utilities for
>>>reading sample data, etc) then they can't live in integration-test
>>>because that would introduce circular dependencies. If it is possible to
>>>refactor our current Metron-Testing project so that it doesn't depend on
>>>any other projects, then we can keep utilities there. Otherwise we need a
>>>separate project for testing utilities. I suspect removing other project
>>>dependencies from Metron-Testing will prove more difficult than it's
>>>worth, so my vote would be to have 2 test-related projects.
>>>
>>>So here is where our metron-platform organization stands:
>>>
>>>metron-common *
>>>metron-integration-test *
>>>metron-test-utilities *
>>>metron-data-management
>>>metron-pcap
>>>metron-parsers
>>>metron-enrichment
>>>  metron-solr
>>>  metron-elasticsearch
>>>metron-api
>>>
>>>* may or may not change depending on the outcome of this discussion
>>>
>>>Thoughts?
>>>
>>>Ryan Merriman
>>>
>>>On 4/11/16, 4:15 PM, "Debojyoti Dutta" <[email protected]> wrote:
>>>
>>>>If you load up your IRC client, just type
>>>>/join #apache-metron-dev
>>>>
>>>>Sent from my iPhone
>>>>
>>>>> On Apr 11, 2016, at 12:06 PM, James Sirota <[email protected]> wrote:
>>>>>
>>>>> Great, thanks, Debo. Where can I find instructions on how to get to it?
>>>>>
>>>>> Thanks,
>>>>> James
>>>>>
>>>>>> On 4/11/16, 9:41 AM, "Debo Dutta (dedutta)" <[email protected]> wrote:
>>>>>>
>>>>>> Hi James
>>>>>>
>>>>>> Ok set it up and ack ...
>>>>>>
>>>>>> Thx
>>>>>>
>>>>>>> On 4/10/16, 6:31 PM, "James Sirota" <[email protected]> wrote:
>>>>>>>
>>>>>>> Hi Debo,
>>>>>>>
>>>>>>> I think it would be great if you set it up
>>>>>>>
>>>>>>> Thanks,
>>>>>>> James
>>>>>>>
>>>>>>>> On 4/10/16, 6:25 PM, "Debojyoti Dutta" <[email protected]> wrote:
>>>>>>>>
>>>>>>>> I have set it up for another open source effort in the past and it
>>>>>>>> was not very hard. Am happy to volunteer if needed.
>>>>>>>>
>>>>>>>> Thx
>>>>>>>> Debo
>>>>>>>>
>>>>>>>> Sent from my iPhone
>>>>>>>>
>>>>>>>>> On Apr 10, 2016, at 5:53 PM, James Sirota <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>> I'd be open to an IRC channel. Does anyone know if Apache allows
>>>>>>>>> this? If yes, does anyone know how to set one up?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> James
>>>>>>>>>
>>>>>>>>>> On 4/10/16, 4:52 PM, "Debojyoti Dutta" <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi Nick
>>>>>>>>>>
>>>>>>>>>> I like your suggestions. For the enrichment layer, do you think it
>>>>>>>>>> would also include any advanced analytics? Otherwise we might want
>>>>>>>>>> to have an analytics layer.
>>>>>>>>>>
>>>>>>>>>> It would be good to have an architecture which could be extended
>>>>>>>>>> for new functionality.
>>>>>>>>>>
>>>>>>>>>> However, Ryan's suggestion of the ui, api and deployer also makes
>>>>>>>>>> sense.
>>>>>>>>>>
>>>>>>>>>> Should we have an IRC channel to discuss this or maybe etherpad?
>>>>>>>>>>
>>>>>>>>>> Debo
>>>>>>>>>>
>>>>>>>>>> Sent from my iPhone
>>>>>>>>>>
>>>>>>>>>>> On Apr 10, 2016, at 4:36 PM, Nick Allen <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>> It might help to think of our code base as four separate types of
>>>>>>>>>>> functionality. This is primarily meant to give us a framework to
>>>>>>>>>>> think about the organization of Metron (and drive more
>>>>>>>>>>> discussion), rather than being my proposal for a specific
>>>>>>>>>>> structure.
>>>>>>>>>>>
>>>>>>>>>>> - Sensor - Anything that captures external, non-streaming data and
>>>>>>>>>>>   presents it in a form ready for stream processing.
>>>>>>>>>>> - Input - Responsible for preparing streaming data for enrichment.
>>>>>>>>>>>   The existing "parsers" fit neatly into this space.
>>>>>>>>>>> - Enrichment - Responsible for enriching an incoming data feed
>>>>>>>>>>>   (geoip, asset enrichment, threat intel lookups, etc).
>>>>>>>>>>> - Output - Responsible for persisting data that has been processed
>>>>>>>>>>>   by Metron, which means search indexers or data stores.
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Apr 8, 2016 at 4:46 PM, Ryan Merriman <[email protected]>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> All,
>>>>>>>>>>>>
>>>>>>>>>>>> I would like to propose a review and refactor of the current
>>>>>>>>>>>> project organization within Metron. Much of the way the legacy
>>>>>>>>>>>> code was organized does not make sense anymore and could be
>>>>>>>>>>>> redesigned so that it is easier to navigate and understand. Our
>>>>>>>>>>>> test coverage has increased substantially so I believe we can do
>>>>>>>>>>>> this with confidence.
>>>>>>>>>>>>
>>>>>>>>>>>> First off, I think we should agree on a naming convention. I see
>>>>>>>>>>>> some projects (YARN and Storm for example) that prepend the
>>>>>>>>>>>> sub-project with the name of the top-level project (storm-core
>>>>>>>>>>>> for example). Metron also currently does this (Metron-Common). I
>>>>>>>>>>>> think that's fine, although in the case of Metron, I feel like
>>>>>>>>>>>> having "Metron" prepended is redundant. Regardless of whether we
>>>>>>>>>>>> decide to stick with that approach, I propose that project names
>>>>>>>>>>>> be uniform and lowercase. For example, under these assumptions
>>>>>>>>>>>> "Metron-Common" would change to "common".
>>>>>>>>>>>>
>>>>>>>>>>>> The first level of organization makes sense to me. The only
>>>>>>>>>>>> change I would make would be to project names:
>>>>>>>>>>>>
>>>>>>>>>>>> * deployment
>>>>>>>>>>>> * streaming
>>>>>>>>>>>> * ui
>>>>>>>>>>>>
>>>>>>>>>>>> Or if we want to keep metron in project names:
>>>>>>>>>>>>
>>>>>>>>>>>> * metron-deployment
>>>>>>>>>>>> * metron-streaming
>>>>>>>>>>>> * metron-ui
>>>>>>>>>>>>
>>>>>>>>>>>> For now I don't see any changes necessary in deployment or ui
>>>>>>>>>>>> organization.
>>>>>>>>>>>> I see the streaming project structure as primarily driven by 2
>>>>>>>>>>>> things: the Maven dependency tree and deployment targets. For
>>>>>>>>>>>> example, solr and elasticsearch code should be separated (because
>>>>>>>>>>>> their dependencies on lucene conflict) but both will depend on
>>>>>>>>>>>> common enrichment code. Also, now that parser, enrichment and
>>>>>>>>>>>> pcap topologies are separate, code for those topologies will be
>>>>>>>>>>>> deployed as separate jars. There is no reason to include parser
>>>>>>>>>>>> code in enrichment topologies and vice versa. Any other
>>>>>>>>>>>> considerations I'm missing?
>>>>>>>>>>>>
>>>>>>>>>>>> With that being said, here is my initial proposal:
>>>>>>>>>>>>
>>>>>>>>>>>> * common - Any common code that all topologies depend on
>>>>>>>>>>>>   (configuration classes, generic writers for example). No
>>>>>>>>>>>>   dependencies on other Metron projects.
>>>>>>>>>>>> * test - Contains utilities for writing unit tests, sample
>>>>>>>>>>>>   configs and sample data. Will depend on common.
>>>>>>>>>>>> * integration-test - Contains utilities and classes needed to run
>>>>>>>>>>>>   our integration tests (in memory components for example). Will
>>>>>>>>>>>>   depend on common and test.
>>>>>>>>>>>> * dataload - Contains all code related to data loading. Will also
>>>>>>>>>>>>   include any property files needed and integration tests. Will
>>>>>>>>>>>>   depend on common, test (test scope), and integration-test (test
>>>>>>>>>>>>   scope).
>>>>>>>>>>>> * parser - All code specific to the parser topologies. Would also
>>>>>>>>>>>>   include scripts, property files, flux files and parser topology
>>>>>>>>>>>>   integration tests. This project will depend on common, test
>>>>>>>>>>>>   (test scope), and integration-test (test scope).
>>>>>>>>>>>> * enrichment - All code specific to the enrichment topologies
>>>>>>>>>>>>   (except solr and elasticsearch). Would also include scripts,
>>>>>>>>>>>>   property files, flux files and enrichment topology integration
>>>>>>>>>>>>   tests. This project will depend on common, test (test scope),
>>>>>>>>>>>>   and integration-test (test scope).
>>>>>>>>>>>>   * elasticsearch - All Elasticsearch related code. Will depend
>>>>>>>>>>>>     on enrichment.
>>>>>>>>>>>>   * solr - All Solr related code. Will depend on enrichment.
>>>>>>>>>>>> * pcap - All code specific to the topology dedicated to pcap.
>>>>>>>>>>>>   Would also include scripts, property files, flux files and pcap
>>>>>>>>>>>>   integration tests. This project will depend on common, test
>>>>>>>>>>>>   (test scope) and integration-test (test scope).
>>>>>>>>>>>> * api - This will serve as a generic replacement for
>>>>>>>>>>>>   Metron-Pcap_Service. Will contain all code to build a Metron
>>>>>>>>>>>>   web service middle layer that can expose APIs through REST or
>>>>>>>>>>>>   other client protocols. Could possibly depend on all other
>>>>>>>>>>>>   projects or be separated further if version conflicts arise
>>>>>>>>>>>>   (separate api projects for solr and elasticsearch for example).
>>>>>>>>>>>>
>>>>>>>>>>>> Looking forward to hearing everyone's feedback and great ideas.
>>>>>>>>>>>>
>>>>>>>>>>>> Ryan Merriman
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Nick Allen <[email protected]>
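
The dependency rules stated in the initial proposal, and the circular dependency concern raised later for the test projects, can be checked with a small script. The following is a minimal sketch in Python, assuming the dependency edges exactly as listed in the 4/8 proposal. The `DEPS` map and the `build_order` function are hypothetical names used only for illustration; they are not Metron code and do not reflect how Maven resolves modules. The api project is left out because the proposal only says it "could possibly" depend on all other projects.

```python
from graphlib import TopologicalSorter, CycleError

# Module -> modules it depends on, as listed in the initial proposal.
# Test-scope and compile-scope dependencies are treated alike (a simplification).
DEPS = {
    "common": set(),
    "test": {"common"},
    "integration-test": {"common", "test"},
    "dataload": {"common", "test", "integration-test"},
    "parser": {"common", "test", "integration-test"},
    "enrichment": {"common", "test", "integration-test"},
    "elasticsearch": {"enrichment"},
    "solr": {"enrichment"},
    "pcap": {"common", "test", "integration-test"},
}


def build_order(deps):
    """Return modules so that each one comes after its dependencies.

    Raises CycleError if the dependency graph contains a cycle.
    """
    return list(TopologicalSorter(deps).static_order())


# The proposal as written has no cycles, so an order exists.
order = build_order(DEPS)
print(order)

# Hypothetical variation: integration-test also depends on enrichment
# (integration tests exercise other projects) while enrichment still uses
# utilities that live in integration-test. This is the circular case.
cyclic = {name: set(needs) for name, needs in DEPS.items()}
cyclic["integration-test"].add("enrichment")
try:
    build_order(cyclic)
    has_cycle = False
except CycleError:
    has_cycle = True
print("cycle detected:", has_cycle)
```

Under these assumptions, keeping shared test utilities in a project that depends only on common avoids the cycle, which is the motivation given in the thread for two test-related projects.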
