Hi Ryan,

This is great. You should attach this to the Jira when you are ready to commit the reorg so we know which parts shifted.

Thanks,
James

On 4/18/16, 1:30 PM, "Ryan Merriman" <[email protected]> wrote:

>Thanks Frank. I’ve updated those in the spreadsheet.
>
>On 4/18/16, 3:27 PM, "Frank Lu" <[email protected]> wrote:
>
>>As of now, I think the following classes are not used:
>>
>>Metron-EnrichmentAdapters
>>org.apache.metron.enrichment.adapters.cif.AbstractCIFAdapter.java
>>org.apache.metron.enrichment.adapters.cif.CIFHbaseAdapter.java
>>org.apache.metron.enrichment.adapters.whois.WhoisHBaseAdapter.java
>>
>>Metron-DataLoads
>>org.apache.metron.dataloads.cif.HBaseTableLoad.java
>>
>>Thanks,
>>Frank Lu
>>
>>On 4/18/16, 3:05 PM, "Ryan Merriman" <[email protected]> wrote:
>>
>>>All,
>>>
>>>I put together a list of all the project Java assets that details where they will be moved (or potentially deleted) as part of the project reorganization. Feedback welcome.
>>>
>>>Ryan Merriman
>>>
>>>On 4/13/16, 9:42 AM, "James Sirota" <[email protected]> wrote:
>>>
>>>>I would not have configs as a project but rather as a folder structure that other modules can point to
>>>>
>>>>Thanks,
>>>>James
>>>>
>>>>On 4/13/16, 7:32 AM, "Ryan Merriman" <[email protected]> wrote:
>>>>
>>>>>James brings up a good point. I propose adding another project under metron-platform called metron-configuration. This would be a fairly lightweight project that would contain anything related to configuration (property files, json files, flux files, etc).
>>>>>
>>>>>On 4/13/16, 8:56 AM, "James Sirota" <[email protected]> wrote:
>>>>>
>>>>>>+1 from me.
>>>>>>
>>>>>>I would also like to address the configs and make sure the configs are in the same place. Do you have ideas on where we would put those?
>>>>>>
>>>>>>Thanks,
>>>>>>James
>>>>>>
>>>>>>On 4/13/16, 6:50 AM, "Ryan Merriman" <[email protected]> wrote:
>>>>>>
>>>>>>>Thank you for all the feedback everyone.
>>>>>>>I will attempt to summarize all the input we've received and update my initial proposal. We can discuss further if anyone is still unclear and I will volunteer to capture all the details in a document of some kind once we all come to a consensus.
>>>>>>>
>>>>>>>Looks like everyone is in agreement for the top level projects. Nick is working on a task that will require an additional top level project so I am going to add that in as well:
>>>>>>>
>>>>>>>metron-deployment
>>>>>>>metron-platform
>>>>>>>metron-ui
>>>>>>>metron-sensors
>>>>>>>
>>>>>>>All of these except metron-platform are well understood and don't warrant any more discussion. For metron-platform there seem to be 2 areas that are not as clear:
>>>>>>>
>>>>>>>- whether we need a common project
>>>>>>>- how do we organize test related code
>>>>>>>
>>>>>>>I agree with David and others that a common project will likely get misused and could become unnecessarily bloated. But I suspect there will be cases where we have common code being used across multiple projects (is already happening). In this case we will either need this common project or we will have to keep common code in one of the other projects and have all other projects extend that. For the latter, an example would be keeping common code in enrichment and having parsers declare enrichment as a dependency.
>>>>>>>There are a couple downsides I see with this approach:
>>>>>>>
>>>>>>>- parser topology jars now bring along all the enrichment dependencies
>>>>>>>- since more code from various projects is being packaged together, version conflicts are more likely and poms become more complicated due to all the necessary exclusions
>>>>>>>
>>>>>>>My thinking is that any jar file being deployed should only contain what it needs. Curious what others think here. My vote would be to maintain a common project (or whatever we want to call it) and be diligent about not letting project-specific code slip in there.
>>>>>>>
>>>>>>>I believe Nick was the first person to ask the question about projects related to test code and why we would need separate test and integration test. The reason for this is that our integration-test classes currently depend on other projects (not surprising since they are integration tests). If there are utilities we want to make available to all projects (mock classes, utilities for reading sample data, etc) then they can't live in integration-test because that will introduce circular dependencies. If it is possible to refactor our current Metron-Testing project so that it doesn't depend on any other projects, then we can keep utilities here. Otherwise we need a separate project for testing utilities. I suspect removing other project dependencies from Metron-Testing will prove more difficult than it's worth so my vote would be to have 2 test related projects.
>>>>>>>
>>>>>>>So here is where our metron-platform organization stands:
>>>>>>>
>>>>>>>metron-common *
>>>>>>>metron-integration-test *
>>>>>>>metron-test-utilities *
>>>>>>>metron-data-management
>>>>>>>metron-pcap
>>>>>>>metron-parsers
>>>>>>>metron-enrichment
>>>>>>>  metron-solr
>>>>>>>  metron-elasticsearch
>>>>>>>metron-api
>>>>>>>
>>>>>>>* may or may not change depending on the outcome of this discussion
>>>>>>>
>>>>>>>Thoughts?
>>>>>>>
>>>>>>>Ryan Merriman
>>>>>>>
>>>>>>>On 4/11/16, 4:15 PM, "Debojyoti Dutta" <[email protected]> wrote:
>>>>>>>
>>>>>>>>If you load up your IRC client just type
>>>>>>>>/join #apache-metron-dev
>>>>>>>>
>>>>>>>>Sent from my iPhone
>>>>>>>>
>>>>>>>>> On Apr 11, 2016, at 12:06 PM, James Sirota <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>> Great, thanks, Debo. Where can I find instructions on how to get to it?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> James
>>>>>>>>>
>>>>>>>>>> On 4/11/16, 9:41 AM, "Debo Dutta (dedutta)" <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi James
>>>>>>>>>>
>>>>>>>>>> Ok set it up and ack ...
>>>>>>>>>>
>>>>>>>>>> Thx
>>>>>>>>>>
>>>>>>>>>>> On 4/10/16, 6:31 PM, "James Sirota" <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi Debo,
>>>>>>>>>>>
>>>>>>>>>>> I think it would be great if you set it up
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> James
>>>>>>>>>>>
>>>>>>>>>>>> On 4/10/16, 6:25 PM, "Debojyoti Dutta" <[email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> I have set it up for another open source effort in the past and it was not very hard. Am happy to volunteer if needed.
>>>>>>>>>>>>
>>>>>>>>>>>> Thx
>>>>>>>>>>>> Debo
>>>>>>>>>>>>
>>>>>>>>>>>> Sent from my iPhone
>>>>>>>>>>>>
>>>>>>>>>>>>> On Apr 10, 2016, at 5:53 PM, James Sirota <[email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'd be open to an IRC channel. Does anyone know if Apache allows this? If yes, does anyone know how to set one up?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> James
>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 4/10/16, 4:52 PM, "Debojyoti Dutta" <[email protected]> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Nick
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I like your suggestions. For the enrichment layer do you think it would also include any advanced analytics? Else we might want to have an analytics layer.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It would be good to have an arch which could be extended for new functionality.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> However Ryan's suggestion of the UI, API and deployer also makes sense.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Should we have an IRC channel to discuss this or maybe etherpad?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Debo
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Sent from my iPhone
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Apr 10, 2016, at 4:36 PM, Nick Allen <[email protected]> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> It might help to think of our code base as four separate types of functionality. This is primarily meant to give us a framework to think about the organization of Metron (and drive more discussion), rather than my proposal for a specific structure.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> - Sensor - Anything that captures external, non-streaming data and presents it in a form ready for stream processing.
>>>>>>>>>>>>>>> - Input - Responsible for preparing streaming data for enrichment. The existing "parsers" fit neatly into this space.
>>>>>>>>>>>>>>> - Enrichment - Responsible for enriching an incoming data feed like geoip, asset enrichment, threat intel lookups, etc.
>>>>>>>>>>>>>>> - Output - Responsible for persisting data that has been processed by Metron which obviously means search indexers or data stores.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Fri, Apr 8, 2016 at 4:46 PM, Ryan Merriman <[email protected]> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> All,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I would like to propose a review and refactor of the current project organization within Metron. Much of the way the legacy code was organized does not make sense anymore and could be designed so that it is easier to navigate and understand. Our test coverage has increased substantially so I believe we can do this with confidence.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> First off, I think we should agree on a naming convention. I see some projects (YARN and Storm for example) that prepend the sub-project with the name of the top-level project (storm-core for example). Metron also currently does this (Metron-Common).
>>>>>>>>>>>>>>>> I think that's fine, although in the case of Metron, I feel like having "Metron" prepended is redundant. Regardless of whether we decide to stick with that approach, I propose that project names be uniform and lowercase. For example, under these assumptions "Metron-Common" would change to "common".
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The first level of organization makes sense to me. Only change I would make would be to project names:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> * deployment
>>>>>>>>>>>>>>>> * streaming
>>>>>>>>>>>>>>>> * ui
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Or if we want to keep metron in project names:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> * metron-deployment
>>>>>>>>>>>>>>>> * metron-streaming
>>>>>>>>>>>>>>>> * metron-ui
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> For now I don't see any changes necessary in deployment or ui organization. I see the streaming project structure primarily driven by 2 things: the Maven dependency tree and deployment targets. For example, solr and elasticsearch code should be separated (because their dependency on lucene conflicts) but both will depend on common enrichment code. Also, now that parser, enrichment and pcap topologies are separate, code for those topologies will be deployed as separate jars. No reason to include parser code in enrichment topologies and vice-versa. Any other considerations I'm missing?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> With that being said, here is my initial proposal:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> * common - Any common code that all topologies depend on (configuration classes, generic writers for example). No dependencies on other Metron projects.
>>>>>>>>>>>>>>>> * test - Contains utilities for writing unit tests, sample configs and sample data. Will depend on common.
>>>>>>>>>>>>>>>> * integration-test - Contains utilities and classes needed to run our integration tests (in memory components for example). Will depend on common and test.
>>>>>>>>>>>>>>>> * dataload - Contains all code related to data loading. Will also include any property files needed and integration tests. Will depend on common, test (test scope), and integration-test (test scope).
>>>>>>>>>>>>>>>> * parser - All code specific to the parser topologies. Would also include scripts, property files, flux files and parser topology integration tests. This project will depend on common, test (test scope), and integration-test (test scope).
>>>>>>>>>>>>>>>> * enrichment - All code specific to the enrichment topologies (except solr and elasticsearch). Would also include scripts, property files, flux files and enrichment topology integration tests. This project will depend on common, test (test scope), and integration-test (test scope).
>>>>>>>>>>>>>>>> * elasticsearch - All Elasticsearch related code. Will depend on enrichment.
>>>>>>>>>>>>>>>> * solr - All Solr related code. Will depend on enrichment.
>>>>>>>>>>>>>>>> * pcap - All code specific to the topology dedicated to pcap. Would also include scripts, property files, flux files and pcap integration tests. This project will depend on common, test (test scope) and integration-test (test scope).
>>>>>>>>>>>>>>>> * api - This will serve as a generic replacement for Metron-Pcap_Service. Will contain all code to build a Metron web service middle layer that can expose APIs through REST or other client protocols. Could possibly depend on all other projects or be separated further if version conflicts arise (separate api projects for solr and elasticsearch for example).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Looking forward to hearing everyone's feedback and great ideas.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Ryan Merriman
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>> Nick Allen <[email protected]>
