Re: ML features for Metron

Egon Kidmose Thu, 16 Jun 2016 03:09:43 -0700

Hi Yazan, others

I've run through and added some of my ideas.
This is my first time with user stories, so please provide any constructive
feedback, whatsoever, and forgive me for breaking any conventions, of which
I know none :)


My input revolves around exploiting the fact that a SOC generates labels for
free while operating, which in turn can be used for training/evaluating ML
models to assist the SOC operation.
My idea is to keep the human in the loop, to enable both supervised and
unsupervised methods, and to aid academics with methods for obtaining labels
for testing, as that is the single greatest problem (from this here tree I'm
sitting in...)

In brief, the stories I added can be described as follows:

S5 and S6: Alerts (security events, such as IDS alerts) and whether they
are true or false
S7 and S8: Link between events and incidents
S9 and S10: Incident management

S5, S7 and S9: Users in the SOC manually labeling data
S6, S8 and S10: ML models providing outputs for users
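
To make the S5/S6 pairing a bit more concrete, here is a minimal sketch in
Python. It is only an illustration and not something taken from the Google
Doc: the Alert class, its fields, the signature names and the functions
train() and score() are all made up for this example, and a real model would
use far richer features than the alert signature alone.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass
class Alert:
    # Hypothetical alert record; the field names are assumptions.
    signature: str                 # e.g. the IDS rule that fired
    label: Optional[bool] = None   # True/False set by an analyst, None = not triaged


def train(alerts: Iterable[Alert]) -> Dict[str, Counter]:
    """S5 side: count analyst verdicts (true/false alert) per signature."""
    counts: Dict[str, Counter] = {}
    for alert in alerts:
        if alert.label is None:
            continue  # untriaged alerts carry no label yet
        counts.setdefault(alert.signature, Counter())[alert.label] += 1
    return counts


def score(counts: Dict[str, Counter], alert: Alert) -> float:
    """S6 side: estimated probability that an alert is a true alert.

    Uses Laplace (add-one) smoothing, so unseen signatures score 0.5.
    """
    c = counts.get(alert.signature, Counter())
    return (c[True] + 1) / (c[True] + c[False] + 2)


# Labels produced "for free" while analysts triage alerts.
history = [
    Alert("ssh-bruteforce", True),
    Alert("ssh-bruteforce", True),
    Alert("dns-noise", False),
    Alert("dns-noise", False),
    Alert("dns-noise", False),
    Alert("dns-noise"),            # not yet triaged, ignored by train()
]
model = train(history)

print(score(model, Alert("ssh-bruteforce")))  # 0.75
print(score(model, Alert("dns-noise")))       # 0.2
print(score(model, Alert("never-seen")))      # 0.5
```

The score would only be shown to the analyst as a hint; the analyst's final
verdict goes back into the history, which is what keeps the human in the
loop.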





Mvh. / BR
Egon Kidmose

On Thu, Jun 9, 2016 at 10:56 PM, Yazan Boshmaf <[email protected]> wrote:

> That's a great idea.
>
> Here's a link to an editable Google Doc with an initial draft of user
> stories: https://goo.gl/QAxiH6
>
> Please give it a pass and let's iterate over it.
>
> On Thu, Jun 9, 2016 at 10:44 PM, Casey Stella <[email protected]> wrote:
>
> > +1 on the google doc idea.
> >
> > I think any solution should include a framework that allows the user to
> >
> >    - Manage the training of their models
> >    - Manage the deployment of their models without stopping the
> topologies
> >    (i.e. hot loading of models)
> >    - Application of their models
> >
> > I'd also very much like to see support for
> >
> >    - both small data ML libraries (e.g. scikit-learn) and big-data ML
> >    libraries (e.g. MLlib)
> >    - The popular non-Java language support (e.g. Python and R)
> >
> >
> > On Thu, Jun 9, 2016 at 3:33 PM, Debo Dutta (dedutta) <[email protected]>
> > wrote:
> >
> > > Haven't seen one. Hence I started a thread.
> > >
> > > Metron is a community project so please feel free to start a google
> doc.
> > >
> > > And then we can get feedback from the users.
> > >
> > > Thx
> > > Debo
> > >
> > > Sent from my iPhone
> > >
> > > > On Jun 9, 2016, at 12:28 PM, Yazan Boshmaf <[email protected]>
> wrote:
> > > >
> > > > > Do we have a roadmap for ML support in Metron? If not, how can someone
> > reach
> > > > out to existing users of Metron and get more input so that we at
> least
> > > > collect functional requirements?
> > > >
> > > > From my side, I can share some of the nice-to-have features from a
> > > research
> > > > > perspective (i.e., features that would make Metron a better platform
> to
> > > > conduct cybersecurity research).
> > > >
> > > > All the best,
> > > > Yazan
> > > >
> > > >> On Mon, Jun 6, 2016 at 10:12 AM, Debojyoti Dutta <[email protected]>
> > > wrote:
> > > >>
> > > >> Thx Egon. The idea of labeled data collection is awesome, else we
> have
> > > to
> > > >> resort to unsupervised alone. Maybe one of the things the website
> > could
> > > do
> > > >> is to point to labeled data contributed by users of Metron.
> > > >>
> > > >>> On Mon, Jun 6, 2016 at 12:03 AM, Egon Kidmose <[email protected]>
> > > wrote:
> > > >>>
> > > >>> Hi all,
> > > >>>
> > > >>> I'd be interested in joining that discussion.
> > > >>>
> > > > >>> I'm a PhD student applying ML in the security monitoring domain.
> > > >>> It is my expectation that I'll be able to contribute with some
> event
> > > >>> correlation and alert filtering methods.
> > > > >>> (Correlation: Finding events that are relevant to each other.
> > Filtering:
> > > >>> Suppressing false alerts from e.g. IDSs, or picking out the
> relevant
> > > >> ones)
> > > >>> You'll see a PR as soon as I have something that is somewhat ready.
> > > >>>
> > > >>> A particularly interesting issue (to me at least) is the
> > possibilities
> > > of
> > > > >>> using a real, running SOC as the "label factory" for labelled
> > data.
> > > >>> Getting real data with labels for supervised methods is one of the
> > > great
> > > >>> challenges, and I see quite some potential for Metron here.
> > > >>>
> > > >>>
> > > >>> Mvh. / BR
> > > >>> Egon Kidmose
> > > >>>
> > > >>>> On Sat, Jun 4, 2016 at 5:02 PM, Yazan Boshmaf <[email protected]
> >
> > > >>> wrote:
> > > >>>
> > > >>>> One use case of Apache Metron (or OpenSOC) is to analyze
> > amplification
> > > >>> DDoS
> > > >>>> attacks <
> > https://www.internetsociety.org/sites/default/files/01_5.pdf
> > > >>> .
> > > >>>>
> > > > >>>> With honeypots as information sources (e.g., AmpPot
> > > >>>> <http://www.christian-rossow.de/publications/amppot-raid2015.pdf
> >),
> > > >> you
> > > >>>> have the typical UDP/IP features (IP addresses, timestamps,
> > protocols,
> > > >>>> ports, payload, etc.), which get enriched with reverse IP data,
> > > >>>> geolocation, etc. Some of these attributes can be used as features
> > to
> > > >>>> identify and characterize types of reflection attacks (e.g.,
> > > exploiting
> > > >>>> NTP, DNS resolvers, or even RIPv1). Also, it is important to
> > > >> distinguish
> > > >>>> attackers from scanners, using certain features like timestamp
> > > > >>>> synchronization across honeypots, as scanners tend to go through IP
> > > >>> blocks,
> > > >>>> one by one, as compared to actual attacks.
> > > >>>>
> > > >>>> These are some of the attributes one might consider for this use
> > case.
> > > >> It
> > > >>>> would be nice to have something that does online learning and
> > > >> analytics,
> > > >>> so
> > > >>>> clustering / classification is done in real-time. Maybe Apache
> > Spark's
> > > >>>> MLlib?
> > > >>>>
> > > >>>> All the best,
> > > >>>> Yazan
> > > >>>>
> > > >>>>> On Sat, Jun 4, 2016 at 4:59 PM, [email protected] <
> [email protected]
> > >
> > > >>>> wrote:
> > > >>>>
> > > >>>>> I'm in
> > > >>>>>
> > > >>>>>> On Sat, Jun 4, 2016, 09:53 Yazan Boshmaf <[email protected]>
> > > wrote:
> > > >>>>>>
> > > >>>>>> Me too.
> > > >>>>>>
> > > >>>>>>> On Sat, Jun 4, 2016 at 9:43 AM, Franck Vervial <
> > [email protected]>
> > > >>>>>> wrote:
> > > >>>>>>
> > > >>>>>>> hi,
> > > >>>>>>>
> > > >>>>>>> i am interested.
> > > >>>>>>>
> > > >>>>>>> regards
> > > >>>>>>> On Fri, 3 Jun 2016 at 3:43 PM, Debo Dutta (dedutta) <
> > > >>>> [email protected]
> > > >>>>>>
> > > >>>>>>> wrote:
> > > >>>>>>>
> > > >>>>>>>> Hi
> > > >>>>>>>>
> > > >>>>>>>> Wondering if anyone is interested in starting a discussion on
> > > >>> what
> > > >>>>> kind
> > > >>>>>>> of
> > > >>>>>>>> machine learning based features would be good for Metron ….
> > > >> Would
> > > >>>>> love
> > > >>>>>> to
> > > >>>>>>>> have the SOC users chime in on the dev list.
> > > >>>>>>>>
> > > >>>>>>>> The result of the discussion could lead to JIRA items.
> > > >>>>>>>>
> > > >>>>>>>> thx
> > > >>>>>>>> debo
> > > >>>>> --
> > > >>>>>
> > > >>>>> Jon
> > > >>
> > > >>
> > > >>
> > > >> --
> > > >> -Debo~
> > > >>
> > >
> >
>
