+1 I like it.

On Wed, Apr 13, 2016 at 9:59 AM, Ryan Merriman <[email protected]> wrote:
> To answer a couple of other questions people asked:
>
> Debo, agreed having clear extension points is going to be extremely
> important for us. Currently we have well defined interfaces for parsers
> and enrichment adapters as well as the ability to load data into and drive
> enrichments (threat intels) from HBase tables with well defined key
> structures. Eventually we will want to extend this to models. Maybe an
> analytical project makes sense when we get to that point?
>
> Debo and James, yes my vision for the metron-api project is a standard
> interface for interacting with Metron. This would include everything from
> data access (pcap service) to security and beyond.
>
> David, let’s explore the best way to leverage the dependencyManagement
> section in our top level pom. I think you’re on to something there. Our
> Maven implementation needs a thorough review as well.
>
> Ryan Merriman
>
> On 4/13/16, 8:50 AM, "Ryan Merriman" <[email protected]> wrote:
>
> >Thank you for all the feedback everyone. I will attempt to summarize all
> >the input we've received and update my initial proposal. We can discuss
> >further if anyone is still unclear and I will volunteer to capture all the
> >details in a document of some kind once we all come to a consensus.
> >
> >Looks like everyone is in agreement for the top level projects. Nick is
> >working on a task that will require an additional top level project so I am
> >going to add that in as well:
> >
> >metron-deployment
> >metron-platform
> >metron-ui
> >metron-sensors
> >
> >All of these except metron-platform are well understood and don't warrant
> >any more discussion. For metron-platform there seem to be 2 areas that
> >are not as clear:
> >
> >- whether we need a common project
> >- how we organize test related code
> >
> >I agree with David and others that a common project will likely get
> >misused and could become unnecessarily bloated.
> >But I suspect there will be cases where we have common code being used
> >across multiple projects (this is already happening). In this case we will
> >either need this common project or we will have to keep common code in one
> >of the other projects and have all other projects extend that. For the
> >latter, an example would be keeping common code in enrichment and having
> >parsers declare enrichment as a dependency. There are a couple of downsides
> >I see with this approach:
> >
> >- parser topology jars now bring along all the enrichment dependencies
> >- since more code from various projects is being packaged together,
> >  version conflicts are more likely and poms become more complicated due to
> >  all the necessary exclusions
> >
> >My thinking is that any jar file being deployed should only contain what
> >it needs. Curious what others think here. My vote would be to maintain a
> >common project (or whatever we want to call it) and be diligent about not
> >letting project-specific code slip in there.
> >
> >I believe Nick was the first person to ask the question about projects
> >related to test code and why we would need separate test and integration
> >test. The reason for this is that our integration-test classes currently
> >depend on other projects (not surprising since they are integration
> >tests). If there are utilities we want to make available to all projects
> >(mock classes, utilities for reading sample data, etc) then they can't live
> >in integration-test because that will introduce circular dependencies. If
> >it is possible to refactor our current Metron-Testing project so that it
> >doesn't depend on any other projects, then we can keep utilities here.
> >Otherwise we need a separate project for testing utilities. I suspect
> >removing other project dependencies from Metron-Testing will prove more
> >difficult than it's worth so my vote would be to have 2 test related
> >projects.
> >
> >So here is where our metron-platform organization stands:
> >
> >metron-common *
> >metron-integration-test *
> >metron-test-utilities *
> >metron-data-management
> >metron-pcap
> >metron-parsers
> >metron-enrichment
> >  metron-solr
> >  metron-elasticsearch
> >metron-api
> >
> >* may or may not change depending on the outcome of this discussion
> >
> >Thoughts?
> >
> >Ryan Merriman
> >
> >On 4/11/16, 4:15 PM, "Debojyoti Dutta" <[email protected]> wrote:
> >
> >>If you load up your IRC client just type
> >>/join #apache-metron-dev
> >>
> >>Sent from my iPhone
> >>
> >>> On Apr 11, 2016, at 12:06 PM, James Sirota <[email protected]> wrote:
> >>>
> >>> Great, thanks, Debo. Where can I find instructions on how to get to it?
> >>>
> >>> Thanks,
> >>> James
> >>>
> >>>> On 4/11/16, 9:41 AM, "Debo Dutta (dedutta)" <[email protected]> wrote:
> >>>>
> >>>> Hi James
> >>>>
> >>>> Ok set it up and ack ...
> >>>>
> >>>> Thx
> >>>>
> >>>>> On 4/10/16, 6:31 PM, "James Sirota" <[email protected]> wrote:
> >>>>>
> >>>>> Hi Debo,
> >>>>>
> >>>>> I think it would be great if you set it up
> >>>>>
> >>>>> Thanks,
> >>>>> James
> >>>>>
> >>>>>> On 4/10/16, 6:25 PM, "Debojyoti Dutta" <[email protected]> wrote:
> >>>>>>
> >>>>>> I have set it up for another open source effort in the past and it
> >>>>>> was not very hard. Am happy to volunteer if needed.
> >>>>>>
> >>>>>> Thx
> >>>>>> Debo
> >>>>>>
> >>>>>> Sent from my iPhone
> >>>>>>
> >>>>>>> On Apr 10, 2016, at 5:53 PM, James Sirota <[email protected]> wrote:
> >>>>>>>
> >>>>>>> I'd be open to an IRC channel. Does anyone know if Apache allows
> >>>>>>> this? If yes, does anyone know how to set one up?
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> James
> >>>>>>>
> >>>>>>>> On 4/10/16, 4:52 PM, "Debojyoti Dutta" <[email protected]> wrote:
> >>>>>>>>
> >>>>>>>> Hi Nick
> >>>>>>>>
> >>>>>>>> I like your suggestions. For the enrichment layer do you think it
> >>>>>>>> would also include any advanced analytics? Else we might want to
> >>>>>>>> have an analytics layer.
> >>>>>>>>
> >>>>>>>> It would be good to have an arch which could be extended for new
> >>>>>>>> functionality.
> >>>>>>>>
> >>>>>>>> However Ryan's suggestion of the ui API and deployer also makes
> >>>>>>>> sense.
> >>>>>>>>
> >>>>>>>> Should we have an IRC channel to discuss this or maybe etherpad?
> >>>>>>>>
> >>>>>>>> Debo
> >>>>>>>>
> >>>>>>>> Sent from my iPhone
> >>>>>>>>
> >>>>>>>>> On Apr 10, 2016, at 4:36 PM, Nick Allen <[email protected]> wrote:
> >>>>>>>>>
> >>>>>>>>> It might help to think of our code base as four separate types of
> >>>>>>>>> functionality. This is primarily meant to give us a framework to think
> >>>>>>>>> about the organization of Metron (and drive more discussion), rather than
> >>>>>>>>> my proposal for a specific structure.
> >>>>>>>>>
> >>>>>>>>> - Sensor - Anything that captures external, non-streaming data and
> >>>>>>>>>   presents it in a form ready for stream processing.
> >>>>>>>>> - Input - Responsible for preparing streaming data for enrichment. The
> >>>>>>>>>   existing "parsers" fit neatly into this space.
> >>>>>>>>> - Enrichment - Responsible for enriching an incoming data feed like
> >>>>>>>>>   geoip, asset enrichment, threat intel lookups, etc.
> >>>>>>>>> - Output - Responsible for persisting data that has been processed by
> >>>>>>>>>   Metron which obviously means search indexers or data stores.
> >>>>>>>>>
> >>>>>>>>> On Fri, Apr 8, 2016 at 4:46 PM, Ryan Merriman <[email protected]> wrote:
> >>>>>>>>>
> >>>>>>>>>> All,
> >>>>>>>>>>
> >>>>>>>>>> I would like to propose a review and refactor of the current project
> >>>>>>>>>> organization within Metron. Much of the way the legacy code was organized
> >>>>>>>>>> does not make sense anymore and could be designed so that it is easier to
> >>>>>>>>>> navigate and understand. Our test coverage has increased substantially so
> >>>>>>>>>> I believe we can do this with confidence.
> >>>>>>>>>>
> >>>>>>>>>> First off, I think we should agree on a naming convention. I see some
> >>>>>>>>>> projects (YARN and Storm for example) that prepend the sub-project with the
> >>>>>>>>>> name of the top-level project (storm-core for example). Metron also
> >>>>>>>>>> currently does this (Metron-Common). I think that's fine, although in the
> >>>>>>>>>> case of Metron, I feel like having "Metron" prepended is redundant.
> >>>>>>>>>> Regardless of whether we decide to stick with that approach, I propose that
> >>>>>>>>>> project names be uniform and lowercase. For example, under these
> >>>>>>>>>> assumptions "Metron-Common" would change to "common".
> >>>>>>>>>>
> >>>>>>>>>> The first level of organization makes sense to me. Only change I would
> >>>>>>>>>> make would be to project names:
> >>>>>>>>>>
> >>>>>>>>>> * deployment
> >>>>>>>>>> * streaming
> >>>>>>>>>> * ui
> >>>>>>>>>>
> >>>>>>>>>> Or if we want to keep metron in project names:
> >>>>>>>>>>
> >>>>>>>>>> * metron-deployment
> >>>>>>>>>> * metron-streaming
> >>>>>>>>>> * metron-ui
> >>>>>>>>>>
> >>>>>>>>>> For now I don't see any changes necessary in deployment or ui
> >>>>>>>>>> organization.
> >>>>>>>>>> I see the streaming project structure primarily driven by 2
> >>>>>>>>>> things: the Maven dependency tree and deployment targets. For example,
> >>>>>>>>>> solr and elasticsearch code should be separated (because their dependency
> >>>>>>>>>> on lucene conflicts) but both will depend on common enrichment code. Also,
> >>>>>>>>>> now that parser, enrichment and pcap topologies are separate, code for
> >>>>>>>>>> those topologies will be deployed as separate jars. No reason to include
> >>>>>>>>>> parser code in enrichment topologies and vice-versa. Any other
> >>>>>>>>>> considerations I'm missing?
> >>>>>>>>>>
> >>>>>>>>>> With that being said, here is my initial proposal:
> >>>>>>>>>>
> >>>>>>>>>> * common - Any common code that all topologies depend on
> >>>>>>>>>>   (configuration classes, generic writers for example). No dependencies on
> >>>>>>>>>>   other Metron projects.
> >>>>>>>>>> * test - Contains utilities for writing unit tests, sample configs and
> >>>>>>>>>>   sample data. Will depend on common.
> >>>>>>>>>> * integration-test - Contains utilities and classes needed to run our
> >>>>>>>>>>   integration tests (in memory components for example). Will depend on
> >>>>>>>>>>   common and test.
> >>>>>>>>>> * dataload - Contains all code related to data loading. Will also
> >>>>>>>>>>   include any property files needed and integration tests. Will depend on
> >>>>>>>>>>   common, test (test scope), and integration-test (test scope).
> >>>>>>>>>> * parser - All code specific to the parser topologies. Would also
> >>>>>>>>>>   include scripts, property files, flux files and parser topology integration
> >>>>>>>>>>   tests. This project will depend on common, test (test scope), and
> >>>>>>>>>>   integration-test (test scope).
> >>>>>>>>>> * enrichment - All code specific to the enrichment topologies (except
> >>>>>>>>>>   solr and elasticsearch). Would also include scripts, property files, flux
> >>>>>>>>>>   files and enrichment topology integration tests. This project will depend
> >>>>>>>>>>   on common, test (test scope), and integration-test (test scope).
> >>>>>>>>>> * elasticsearch - All Elasticsearch related code. Will depend on
> >>>>>>>>>>   enrichment.
> >>>>>>>>>> * solr - All Solr related code. Will depend on enrichment.
> >>>>>>>>>> * pcap - All code specific to the topology dedicated to pcap. Would
> >>>>>>>>>>   also include scripts, property files, flux files and pcap integration
> >>>>>>>>>>   tests. This project will depend on common, test (test scope) and
> >>>>>>>>>>   integration-test (test scope).
> >>>>>>>>>> * api - This will serve as a generic replacement for
> >>>>>>>>>>   Metron-Pcap_Service. Will contain all code to build a Metron web service
> >>>>>>>>>>   middle layer that can expose APIs through REST or other client protocols.
> >>>>>>>>>>   Could possibly depend on all other projects or be separated further if
> >>>>>>>>>>   version conflicts arise (separate api projects for solr and elasticsearch
> >>>>>>>>>>   for example).
> >>>>>>>>>>
> >>>>>>>>>> Looking forward to hearing everyone's feedback and great ideas.
> >>>>>>>>>>
> >>>>>>>>>> Ryan Merriman
> >>>>>>>>>
> >>>>>>>>> --
> >>>>>>>>> Nick Allen <[email protected]>

--
Nick Allen <[email protected]>
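
The circular-dependency concern raised in this thread can be checked mechanically. The sketch below is only an illustration, not part of Metron's build: it transcribes the module dependencies from the initial proposal into a Python dict (the "api" module is left out because its dependencies were left open) and uses the standard library's graphlib (Python 3.9+) to confirm the layout is acyclic. The CIRCULAR variant is a hypothetical arrangement in which integration-test depends on enrichment while enrichment still takes its test utilities from integration-test. The names PROPOSED, CIRCULAR, build_order and has_cycle are made up for this example.

```python
from graphlib import CycleError, TopologicalSorter

# Module -> modules it depends on, transcribed from the initial proposal.
# Test-scope dependencies are included; "api" is omitted (dependencies open).
PROPOSED = {
    "common": [],
    "test": ["common"],
    "integration-test": ["common", "test"],
    "dataload": ["common", "test", "integration-test"],
    "parser": ["common", "test", "integration-test"],
    "enrichment": ["common", "test", "integration-test"],
    "elasticsearch": ["enrichment"],
    "solr": ["enrichment"],
    "pcap": ["common", "test", "integration-test"],
}


def build_order(deps):
    """Return one valid build order; raises CycleError if deps are circular."""
    return list(TopologicalSorter(deps).static_order())


def has_cycle(deps):
    """Return True if the dependency graph contains a circular dependency."""
    try:
        build_order(deps)
    except CycleError:
        return True
    return False


# Hypothetical variant: integration-test depends on enrichment, while
# enrichment still depends on integration-test for shared test utilities.
CIRCULAR = dict(PROPOSED)
CIRCULAR["integration-test"] = ["common", "test", "enrichment"]

if __name__ == "__main__":
    print(build_order(PROPOSED))
    print(has_cycle(PROPOSED), has_cycle(CIRCULAR))
```

Keeping shared test utilities in a module with no dependencies on other Metron projects, as suggested with metron-test-utilities, is what keeps a graph like PROPOSED free of cycles.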
