That's a great idea. Here's a link to an editable Google Doc with an initial draft of user stories: https://goo.gl/QAxiH6
Please give it a pass and let's iterate over it.

On Thu, Jun 9, 2016 at 10:44 PM, Casey Stella <[email protected]> wrote:

> +1 on the google doc idea.
>
> I think any solution should include a framework that allows the user to
>
> - Manage the training of their models
> - Manage the deployment of their models without stopping the topologies
>   (i.e. hot loading of models)
> - Application of their models
>
> I'd also very much like to see support for
>
> - both small data ML libraries (e.g. scikit-learn) and big-data ML
>   libraries (e.g. MLlib)
> - The popular non-Java language support (e.g. Python and R)
>
>
> On Thu, Jun 9, 2016 at 3:33 PM, Debo Dutta (dedutta) <[email protected]> wrote:
>
> > Haven't seen one. Hence I started a thread.
> >
> > Metron is a community project so please feel free to start a google doc.
> >
> > And then we can get feedback from the users.
> >
> > Thx
> > Debo
> >
> > Sent from my iPhone
> >
> > > On Jun 9, 2016, at 12:28 PM, Yazan Boshmaf <[email protected]> wrote:
> > >
> > > Do we have a roadmap for ML support in Metron? If not, how can someone
> > > reach out to existing users of Metron and get more input so that we at
> > > least collect functional requirements?
> > >
> > > From my side, I can share some of the nice-to-have features from a
> > > research perspective (i.e., features that would make Metron a better
> > > platform to conduct cybersecurity research).
> > >
> > > All the best,
> > > Yazan
> > >
> > >> On Mon, Jun 6, 2016 at 10:12 AM, Debojyoti Dutta <[email protected]> wrote:
> > >>
> > >> Thx Egon. The idea of labeled data collection is awesome, else we have
> > >> to resort to unsupervised alone. Maybe one of the things the website
> > >> could do is to point to labeled data contributed by users of Metron.
> > >>
> > >>> On Mon, Jun 6, 2016 at 12:03 AM, Egon Kidmose <[email protected]> wrote:
> > >>>
> > >>> Hi all,
> > >>>
> > >>> I'd be interested in joining that discussion.
> > >>>
> > >>> I'm a PhD student applying ML in the security monitoring domain.
> > >>> It is my expectation that I'll be able to contribute with some event
> > >>> correlation and alert filtering methods.
> > >>> (Correlation: Finding events that are relevant to each other.
> > >>> Filtering: Suppressing false alerts from e.g. IDSs, or picking out the
> > >>> relevant ones)
> > >>> You'll see a PR as soon as I have something that is somewhat ready.
> > >>>
> > >>> A particularly interesting issue (to me at least) is the possibility of
> > >>> using a real, running SOC as the "label factory" for labelled data.
> > >>> Getting real data with labels for supervised methods is one of the
> > >>> great challenges, and I see quite some potential for Metron here.
> > >>>
> > >>>
> > >>> Mvh. / BR
> > >>> Egon Kidmose
> > >>>
> > >>>> On Sat, Jun 4, 2016 at 5:02 PM, Yazan Boshmaf <[email protected]> wrote:
> > >>>>
> > >>>> One use case of Apache Metron (or OpenSOC) is to analyze amplification
> > >>>> DDoS attacks
> > >>>> <https://www.internetsociety.org/sites/default/files/01_5.pdf>.
> > >>>>
> > >>>> With honeypots as information sources (e.g., AmpPot
> > >>>> <http://www.christian-rossow.de/publications/amppot-raid2015.pdf>),
> > >>>> you have the typical UDP/IP features (IP addresses, timestamps,
> > >>>> protocols, ports, payload, etc.), which get enriched with reverse IP
> > >>>> data, geolocation, etc. Some of these attributes can be used as
> > >>>> features to identify and characterize types of reflection attacks
> > >>>> (e.g., exploiting NTP, DNS resolvers, or even RIPv1). Also, it is
> > >>>> important to distinguish attackers from scanners, using certain
> > >>>> features like timestamp synchronization across honeypots, as scanners
> > >>>> tend to go through IP blocks, one by one, as compared to actual
> > >>>> attacks.
> > >>>>
> > >>>> These are some of the attributes one might consider for this use case.
> > >>>> It would be nice to have something that does online learning and
> > >>>> analytics, so clustering / classification is done in real-time. Maybe
> > >>>> Apache Spark's MLlib?
> > >>>>
> > >>>> All the best,
> > >>>> Yazan
> > >>>>
> > >>>>> On Sat, Jun 4, 2016 at 4:59 PM, [email protected] <[email protected]> wrote:
> > >>>>>
> > >>>>> I'm in
> > >>>>>
> > >>>>>> On Sat, Jun 4, 2016, 09:53 Yazan Boshmaf <[email protected]> wrote:
> > >>>>>>
> > >>>>>> Me too.
> > >>>>>>
> > >>>>>>> On Sat, Jun 4, 2016 at 9:43 AM, Franck Vervial <[email protected]> wrote:
> > >>>>>>>
> > >>>>>>> hi,
> > >>>>>>>
> > >>>>>>> i am interested.
> > >>>>>>>
> > >>>>>>> regards
> > >>>>>>> On Fri, 3 Jun 2016 at 3:43 PM, Debo Dutta (dedutta) <[email protected]> wrote:
> > >>>>>>>
> > >>>>>>>> Hi
> > >>>>>>>>
> > >>>>>>>> Wondering if anyone is interested in starting a discussion on what
> > >>>>>>>> kind of machine learning based features would be good for Metron ….
> > >>>>>>>> Would love to have the SOC users chime in on the dev list.
> > >>>>>>>>
> > >>>>>>>> The result of the discussion could lead to JIRA items.
> > >>>>>>>>
> > >>>>>>>> thx
> > >>>>>>>> debo
> > >>>>> --
> > >>>>>
> > >>>>> Jon
> > >>
> > >> --
> > >> -Debo~
