Thanks, Frank. I've updated those in the spreadsheet.

On 4/18/16, 3:27 PM, "Frank Lu" <[email protected]> wrote:
>As of now, I think the following classes are not used:
>
>Metron-EnrichmentAdapters
> org.apache.metron.enrichment.adapters.cif.AbstractCIFAdapter.java
> org.apache.metron.enrichment.adapters.cif.CIFHbaseAdapter.java
> org.apache.metron.enrichment.adapters.whois.WhoisHBaseAdapter.java
>
>Metron-DataLoads
> org.apache.metron.dataloads.cif.HBaseTableLoad.java
>
>Thanks,
>Frank Lu
>
>On 4/18/16, 3:05 PM, "Ryan Merriman" <[email protected]> wrote:
>
>>All,
>>
>>I put together a list of all the project Java assets that details where
>>they will be moved (or potentially deleted) as part of the project
>>reorganization. Feedback welcome.
>>
>>Ryan Merriman
>>
>>On 4/13/16, 9:42 AM, "James Sirota" <[email protected]> wrote:
>>
>>>I would not have configs as a project, but rather as a folder structure
>>>that other modules can point to.
>>>
>>>Thanks,
>>>James
>>>
>>>On 4/13/16, 7:32 AM, "Ryan Merriman" <[email protected]> wrote:
>>>
>>>>James brings up a good point. I propose adding another project under
>>>>metron-platform called metron-configuration. This would be a fairly
>>>>lightweight project that would contain anything related to
>>>>configuration (property files, JSON files, flux files, etc).
>>>>
>>>>On 4/13/16, 8:56 AM, "James Sirota" <[email protected]> wrote:
>>>>
>>>>>+1 from me.
>>>>>
>>>>>I would also like to address the configs and make sure the configs are
>>>>>in the same place. Do you have ideas on where we would put those?
>>>>>
>>>>>Thanks,
>>>>>James
>>>>>
>>>>>On 4/13/16, 6:50 AM, "Ryan Merriman" <[email protected]> wrote:
>>>>>
>>>>>>Thank you for all the feedback everyone. I will attempt to summarize
>>>>>>all the input we've received and update my initial proposal. We can
>>>>>>discuss further if anyone is still unclear and I will volunteer to
>>>>>>capture all the details in a document of some kind once we all come
>>>>>>to a consensus.
>>>>>>
>>>>>>Looks like everyone is in agreement on the top-level projects. Nick
>>>>>>is working on a task that will require an additional top-level
>>>>>>project so I am going to add that in as well:
>>>>>>
>>>>>>metron-deployment
>>>>>>metron-platform
>>>>>>metron-ui
>>>>>>metron-sensors
>>>>>>
>>>>>>All of these except metron-platform are well understood and don't
>>>>>>warrant any more discussion. For metron-platform there seem to be 2
>>>>>>areas that are not as clear:
>>>>>>
>>>>>>- whether we need a common project
>>>>>>- how do we organize test related code
>>>>>>
>>>>>>I agree with David and others that a common project will likely get
>>>>>>misused and could become unnecessarily bloated. But I suspect there
>>>>>>will be cases where we have common code being used across multiple
>>>>>>projects (this is already happening). In this case we will either
>>>>>>need this common project or we will have to keep common code in one
>>>>>>of the other projects and have all other projects extend that. For
>>>>>>the latter, an example would be keeping common code in enrichment and
>>>>>>having parsers declare enrichment as a dependency. There are a couple
>>>>>>of downsides I see with this approach:
>>>>>>
>>>>>>- parser topology jars now bring along all the enrichment
>>>>>>  dependencies
>>>>>>- since more code from various projects is being packaged together,
>>>>>>  version conflicts are more likely and poms become more complicated
>>>>>>  due to all the necessary exclusions
>>>>>>
>>>>>>My thinking is that any jar file being deployed should only contain
>>>>>>what it needs. Curious what others think here. My vote would be to
>>>>>>maintain a common project (or whatever we want to call it) and be
>>>>>>diligent about not letting project-specific code slip in there.
>>>>>>
>>>>>>I believe Nick was the first person to ask the question about
>>>>>>projects related to test code and why we would need separate test and
>>>>>>integration test. The reason for this is that our integration-test
>>>>>>classes currently depend on other projects (not surprising since they
>>>>>>are integration tests). If there are utilities we want to make
>>>>>>available to all projects (mock classes, utilities for reading sample
>>>>>>data, etc) then they can't live in integration-test because that will
>>>>>>introduce circular dependencies. If it is possible to refactor our
>>>>>>current Metron-Testing project so that it doesn't depend on any other
>>>>>>projects, then we can keep utilities here. Otherwise we need a
>>>>>>separate project for testing utilities. I suspect removing other
>>>>>>project dependencies from Metron-Testing will prove more difficult
>>>>>>than it's worth so my vote would be to have 2 test related projects.
>>>>>>
>>>>>>So here is where our metron-platform organization stands:
>>>>>>
>>>>>>metron-common *
>>>>>>metron-integration-test *
>>>>>>metron-test-utilities *
>>>>>>metron-data-management
>>>>>>metron-pcap
>>>>>>metron-parsers
>>>>>>metron-enrichment
>>>>>>  metron-solr
>>>>>>  metron-elasticsearch
>>>>>>metron-api
>>>>>>
>>>>>>* may or may not change depending on the outcome of this discussion
>>>>>>
>>>>>>Thoughts?
>>>>>>
>>>>>>Ryan Merriman
>>>>>>
>>>>>>On 4/11/16, 4:15 PM, "Debojyoti Dutta" <[email protected]> wrote:
>>>>>>
>>>>>>>If you load up your IRC client just type
>>>>>>>/join #apache-metron-dev
>>>>>>>
>>>>>>>Sent from my iPhone
>>>>>>>
>>>>>>>> On Apr 11, 2016, at 12:06 PM, James Sirota <[email protected]> wrote:
>>>>>>>>
>>>>>>>> Great, thanks, Debo. Where can I find instructions on how to get
>>>>>>>> to it?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> James
>>>>>>>>
>>>>>>>>> On 4/11/16, 9:41 AM, "Debo Dutta (dedutta)" <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>> Hi James
>>>>>>>>>
>>>>>>>>> Ok set it up and ack ...
>>>>>>>>>
>>>>>>>>> Thx
>>>>>>>>>
>>>>>>>>>> On 4/10/16, 6:31 PM, "James Sirota" <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi Debo,
>>>>>>>>>>
>>>>>>>>>> I think it would be great if you set it up
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> James
>>>>>>>>>>
>>>>>>>>>>> On 4/10/16, 6:25 PM, "Debojyoti Dutta" <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>> I have set it up for another open source effort in the past and
>>>>>>>>>>> it was not very hard. Am happy to volunteer if needed.
>>>>>>>>>>>
>>>>>>>>>>> Thx
>>>>>>>>>>> Debo
>>>>>>>>>>>
>>>>>>>>>>> Sent from my iPhone
>>>>>>>>>>>
>>>>>>>>>>>> On Apr 10, 2016, at 5:53 PM, James Sirota <[email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> I'd be open to an IRC channel. Does anyone know if Apache
>>>>>>>>>>>> allows this? If yes, does anyone know how to set one up?
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> James
>>>>>>>>>>>>
>>>>>>>>>>>>> On 4/10/16, 4:52 PM, "Debojyoti Dutta" <[email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Nick
>>>>>>>>>>>>>
>>>>>>>>>>>>> I like your suggestions. For the enrichment layer do you
>>>>>>>>>>>>> think it would also include any advanced analytics. Else we
>>>>>>>>>>>>> might want to have an analytics layer.
>>>>>>>>>>>>>
>>>>>>>>>>>>> It would be good to have an arch which could be extended for
>>>>>>>>>>>>> new functionality.
>>>>>>>>>>>>>
>>>>>>>>>>>>> However Ryan's suggestion of the ui API and deployer also
>>>>>>>>>>>>> makes sense.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Should we have an IRC channel to discuss this or maybe
>>>>>>>>>>>>> etherpad?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Debo
>>>>>>>>>>>>>
>>>>>>>>>>>>> Sent from my iPhone
>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Apr 10, 2016, at 4:36 PM, Nick Allen <[email protected]> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It might help to think of our code base as four separate
>>>>>>>>>>>>>> types of functionality. This is primarily meant to give us a
>>>>>>>>>>>>>> framework to think about the organization of Metron (and
>>>>>>>>>>>>>> drive more discussion), rather than my proposal for a
>>>>>>>>>>>>>> specific structure.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> - Sensor - Anything that captures external, non-streaming
>>>>>>>>>>>>>>   data and presents it in a form ready for stream processing.
>>>>>>>>>>>>>> - Input - Responsible for preparing streaming data for
>>>>>>>>>>>>>>   enrichment. The existing "parsers" fit neatly into this
>>>>>>>>>>>>>>   space.
>>>>>>>>>>>>>> - Enrichment - Responsible for enriching an incoming data
>>>>>>>>>>>>>>   feed like geoip, asset enrichment, threat intel lookups, etc.
>>>>>>>>>>>>>> - Output - Responsible for persisting data that has been
>>>>>>>>>>>>>>   processed by Metron which obviously means search indexers
>>>>>>>>>>>>>>   or data stores.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, Apr 8, 2016 at 4:46 PM, Ryan Merriman <[email protected]>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> All,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I would like to propose a review and refactor of the
>>>>>>>>>>>>>>> current project organization within Metron.
>>>>>>>>>>>>>>> Much of the way the legacy code was organized does not make
>>>>>>>>>>>>>>> sense anymore and could be designed so that it is easier to
>>>>>>>>>>>>>>> navigate and understand. Our test coverage has increased
>>>>>>>>>>>>>>> substantially so I believe we can do this with confidence.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> First off, I think we should agree on a naming convention.
>>>>>>>>>>>>>>> I see some projects (YARN and Storm for example) that
>>>>>>>>>>>>>>> prepend the sub-project with the name of the top-level
>>>>>>>>>>>>>>> project (storm-core for example). Metron also currently
>>>>>>>>>>>>>>> does this (Metron-Common). I think that's fine, although in
>>>>>>>>>>>>>>> the case of Metron, I feel like having "Metron" prepended
>>>>>>>>>>>>>>> is redundant. Regardless of whether we decide to stick with
>>>>>>>>>>>>>>> that approach, I propose that project names be uniform and
>>>>>>>>>>>>>>> lowercase. For example, under these assumptions
>>>>>>>>>>>>>>> "Metron-Common" would change to "common".
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The first level of organization makes sense to me. Only
>>>>>>>>>>>>>>> change I would make would be to project names:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>   * deployment
>>>>>>>>>>>>>>>   * streaming
>>>>>>>>>>>>>>>   * ui
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Or if we want to keep metron in project names:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>   * metron-deployment
>>>>>>>>>>>>>>>   * metron-streaming
>>>>>>>>>>>>>>>   * metron-ui
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> For now I don't see any changes necessary in deployment or
>>>>>>>>>>>>>>> ui organization.
>>>>>>>>>>>>>>> I see the streaming project structure primarily driven by 2
>>>>>>>>>>>>>>> things: the Maven dependency tree and deployment targets.
>>>>>>>>>>>>>>> For example, solr and elasticsearch code should be
>>>>>>>>>>>>>>> separated (because their dependency on lucene conflicts)
>>>>>>>>>>>>>>> but both will depend on common enrichment code. Also, now
>>>>>>>>>>>>>>> that parser, enrichment and pcap topologies are separate,
>>>>>>>>>>>>>>> code for those topologies will be deployed as separate
>>>>>>>>>>>>>>> jars. No reason to include parser code in enrichment
>>>>>>>>>>>>>>> topologies and vice-versa. Any other considerations I'm
>>>>>>>>>>>>>>> missing?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> With that being said, here is my initial proposal:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>   * common - Any common code that all topologies depend on
>>>>>>>>>>>>>>>     (configuration classes, generic writers for example).
>>>>>>>>>>>>>>>     No dependencies on other Metron projects.
>>>>>>>>>>>>>>>   * test - Contains utilities for writing unit tests,
>>>>>>>>>>>>>>>     sample configs and sample data. Will depend on common.
>>>>>>>>>>>>>>>   * integration-test - Contains utilities and classes
>>>>>>>>>>>>>>>     needed to run our integration tests (in memory
>>>>>>>>>>>>>>>     components for example). Will depend on common and test.
>>>>>>>>>>>>>>>   * dataload - Contains all code related to data loading.
>>>>>>>>>>>>>>>     Will also include any property files needed and
>>>>>>>>>>>>>>>     integration tests. Will depend on common, test (test
>>>>>>>>>>>>>>>     scope), and integration-test (test scope).
>>>>>>>>>>>>>>>   * parser - All code specific to the parser topologies.
>>>>>>>>>>>>>>>     Would also include scripts, property files, flux files
>>>>>>>>>>>>>>>     and parser topology integration tests. This project
>>>>>>>>>>>>>>>     will depend on common, test (test scope), and
>>>>>>>>>>>>>>>     integration-test (test scope).
>>>>>>>>>>>>>>>   * enrichment - All code specific to the enrichment
>>>>>>>>>>>>>>>     topologies (except solr and elasticsearch). Would also
>>>>>>>>>>>>>>>     include scripts, property files, flux files and
>>>>>>>>>>>>>>>     enrichment topology integration tests. This project
>>>>>>>>>>>>>>>     will depend on common, test (test scope), and
>>>>>>>>>>>>>>>     integration-test (test scope).
>>>>>>>>>>>>>>>   * elasticsearch - All Elasticsearch related code. Will
>>>>>>>>>>>>>>>     depend on enrichment.
>>>>>>>>>>>>>>>   * solr - All Solr related code. Will depend on
>>>>>>>>>>>>>>>     enrichment.
>>>>>>>>>>>>>>>   * pcap - All code specific to the topology dedicated to
>>>>>>>>>>>>>>>     pcap. Would also include scripts, property files, flux
>>>>>>>>>>>>>>>     files and pcap integration tests. This project will
>>>>>>>>>>>>>>>     depend on common, test (test scope) and
>>>>>>>>>>>>>>>     integration-test (test scope).
>>>>>>>>>>>>>>>   * api - This will serve as a generic replacement for
>>>>>>>>>>>>>>>     Metron-Pcap_Service. Will contain all code to build a
>>>>>>>>>>>>>>>     Metron web service middle layer that can expose APIs
>>>>>>>>>>>>>>>     through REST or other client protocols. Could possibly
>>>>>>>>>>>>>>>     depend on all other projects or be separated further if
>>>>>>>>>>>>>>>     version conflicts arise (separate api projects for solr
>>>>>>>>>>>>>>>     and elasticsearch for example).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Looking forward to hearing everyone's feedback and great
>>>>>>>>>>>>>>> ideas.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Ryan Merriman
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>> Nick Allen <[email protected]>
ClassInventory.xlsx
