Re: MaaS + Apache Twill ?

2017-04-18 Thread Casey Stella
One more thought: I am definitely not opposed to making it independent of
the distributed resource manager.  I think what Otto is suggesting isn't a
bad thread to pull on.  Right now, MaaS is inherently tied to YARN and it'd
be nice to make that dependency pluggable.  This would allow us to use
Lambda or Kubernetes or whatever for model deployment, which would be
really neat.
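
As a rough illustration of what a pluggable resource-manager dependency
could look like, here is a minimal Python sketch.  Every name in it
(ModelDeployer, InMemoryDeployer, make_deployer, the endpoint URLs) is
hypothetical and not part of MaaS; the assumption is only that a YARN,
Kubernetes, or Lambda backend could sit behind the same small interface.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class ModelDeployer(ABC):
    """Hypothetical seam between MaaS and whatever actually runs the model."""

    @abstractmethod
    def deploy(self, name: str, version: str, instances: int) -> List[str]:
        """Start `instances` copies of a model and return their endpoints."""

    @abstractmethod
    def undeploy(self, name: str, version: str) -> None:
        """Stop every running copy of a model."""


class InMemoryDeployer(ModelDeployer):
    """Stand-in backend for illustration only; a real backend would talk to
    YARN, Kubernetes, etc. while implementing the same two methods."""

    def __init__(self) -> None:
        self.running: Dict[str, List[str]] = {}

    def deploy(self, name: str, version: str, instances: int) -> List[str]:
        # Fake endpoints; a real backend would return discovered addresses.
        endpoints = [
            f"http://localhost:{9000 + i}/{name}/{version}"
            for i in range(instances)
        ]
        self.running[f"{name}:{version}"] = endpoints
        return endpoints

    def undeploy(self, name: str, version: str) -> None:
        self.running.pop(f"{name}:{version}", None)


# Registry of available backends, keyed by a configuration value.
DEPLOYERS = {"memory": InMemoryDeployer}


def make_deployer(kind: str) -> ModelDeployer:
    """Pick the backend by name so callers never import one directly."""
    return DEPLOYERS[kind]()
```

With a seam like this, the rest of MaaS would only depend on the interface,
and swapping the resource manager becomes a configuration choice.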

On Tue, Apr 18, 2017 at 3:02 PM, Casey Stella <ceste...@gmail.com> wrote:

> Regarding model performance, I've thought about that a bit.  What I'd like
> to see MaaS be able to do is provide an API that the models can communicate
> through, one that sends events to kafka and provides telemetry like any
> other: performance statistics and raw results for downstream analysis.  We
> have a system capable of analyzing telemetry data, so it seems to me that
> MaaS should dogfood the platform in which it runs.  I think we should build
> an API by which the model can communicate with kafka.
>
> One of the things that I very much like about MaaS is that it's light on
> opinions about the language and library.  I think the API could be as
> simple as a log file that is monitored, with the MaaS runner acting as the
> proxy to kafka.  I do think that it should be easier to write models, but
> I think that should be solved through applications that turn model
> collateral into REST APIs if it conforms to certain standards (e.g. PMML,
> Spark MLlib serialized models) while leaving users free to engage with
> MaaS via their own mechanism, in the language of their choice, if their
> situation doesn't conform to our expectations.
>
> On Tue, Apr 18, 2017 at 2:50 PM, Simon Elliston Ball <
> si...@simonellistonball.com> wrote:
>
>> Completely agree. It would be great to get more info from your experience.
>>
>> To an extent what we have at the moment is very much about isolating the
>> implementation of a model from the deployment and discovery mechanism, and
>> to my mind we should very much keep that to enable any kind of model to
>> plug in. The other thing worth discussing here would be how we can wrap
>> around a model while maintaining the loose coupling, to provide things like
>> generalised performance metrics for the models. Any thoughts on that front
>> anyone?
>>
>>
>> > On 18 Apr 2017, at 11:45, Otto Fowler <ottobackwa...@gmail.com> wrote:
>> >
>> > I will have to go back to my notes.  There was a day or so when I went
>> > through the code and was thinking of a couple of things, but that was a
>> > while ago.
>> >
>> > Off the top of my head, I would want something factored enough, or
>> > loosely coupled enough that it was not dependent on twill or maas or
>> > anything else.  This would not impose implementation on the services.
>> > This would have to revolve around a discovery/registration api and a
>> > rest interface contract.
>> >
>> > Does that make sense?
>> >
>> > On April 18, 2017 at 14:34:28, Simon Elliston Ball
>> > (si...@simonellistonball.com) wrote:
>> >
>> >> Right, how about some sort of REST API? Or through Ambari? What would
>> >> you say was the best way to start the service, and of course to submit
>> >> model artefacts?
>> >>
>> >> Simon
>> >>
>> >>> On 18 Apr 2017, at 11:33, Otto Fowler <ottobackwa...@gmail.com>
>> >>> wrote:
>> >>>
>> >>> In my mind, I didn’t want to deploy the service as a bash script or
>> >>> wrapped in one, if I recall correctly.
>> >>>
>> >>> On April 18, 2017 at 14:27:52, Simon Elliston Ball
>> >>> (si...@simonellistonball.com) wrote:
>> >>>
>> >>>> Any particular issues, or things that didn’t work, Otto?
>> >>>>
>> >>>> Simon
>> >>>>
>> >>>> > On 18 Apr 2017, at 11:26, Otto Fowler <ottobackwa...@gmail.com>
>> >>>> > wrote:
>> >>>> >
>> >>>> > I’ll try to take a look. There are a couple of things I wanted to
>> >>>> > do with MaaS but could not figure out because of a couple of
>> >>>> > limitations. I’d like to see if twill offers more flexibility
>> >>>> >
>> >>>> > On April 

Re: MaaS + Apache Twill ?

2017-04-18 Thread Casey Stella
Regarding model performance, I've thought about that a bit.  What I'd like
to see MaaS be able to do is provide an API that the models can communicate
through, one that sends events to kafka and provides telemetry like any
other: performance statistics and raw results for downstream analysis.  We
have a system capable of analyzing telemetry data, so it seems to me that
MaaS should dogfood the platform in which it runs.  I think we should build
an API by which the model can communicate with kafka.

One of the things that I very much like about MaaS is that it's light on
opinions about the language and library.  I think the API could be as
simple as a log file that is monitored, with the MaaS runner acting as the
proxy to kafka.  I do think that it should be easier to write models, but
I think that should be solved through applications that turn model
collateral into REST APIs if it conforms to certain standards (e.g. PMML,
Spark MLlib serialized models) while leaving users free to engage with
MaaS via their own mechanism, in the language of their choice, if their
situation doesn't conform to our expectations.
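
A minimal sketch of the log-file idea, assuming the model writes one JSON
object per line and the runner polls the file periodically.  The names
forward_new_lines and maas_telemetry, and the send callable, are all
assumptions made for illustration; send stands in for whatever Kafka
producer the runner would actually use, so no real Kafka API appears here.

```python
import json
from typing import Callable, Tuple


def forward_new_lines(path: str, offset: int,
                      send: Callable[[str, bytes], None],
                      topic: str = "maas_telemetry") -> Tuple[int, int]:
    """Forward complete JSON lines appended since `offset`.

    Returns (new_offset, number_of_lines_sent).  `send` is a hypothetical
    stand-in for a Kafka producer call taking (topic, payload).
    """
    sent = 0
    with open(path, "rb") as f:
        f.seek(offset)
        while True:
            line = f.readline()
            if not line.endswith(b"\n"):
                # Partial (or no) line: stop and re-read it on the next poll.
                break
            offset = f.tell()
            payload = line.strip()
            if not payload:
                continue
            try:
                json.loads(payload)  # validate only; forward the raw bytes
            except ValueError:
                continue  # skip lines that are not valid JSON
            send(topic, payload)
            sent += 1
    return offset, sent
```

The runner would keep the returned offset and pass it back in on the next
poll, so the model itself only ever needs to append lines to a file, in
whatever language it is written in.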

On Tue, Apr 18, 2017 at 2:50 PM, Simon Elliston Ball <
si...@simonellistonball.com> wrote:

> Completely agree. It would be great to get more info from your experience.
>
> To an extent what we have at the moment is very much about isolating the
> implementation of a model from the deployment and discovery mechanism, and
> to my mind we should very much keep that to enable any kind of model to
> plug in. The other thing worth discussing here would be how we can wrap
> around a model while maintaining the loose coupling, to provide things like
> generalised performance metrics for the models. Any thoughts on that front
> anyone?
>
>
> > On 18 Apr 2017, at 11:45, Otto Fowler wrote:
> >
> > I will have to go back to my notes.  There was a day or so when I went
> > through the code and was thinking of a couple of things, but that was a
> > while ago.
> >
> > Off the top of my head, I would want something factored enough, or
> > loosely coupled enough that it was not dependent on twill or maas or
> > anything else.  This would not impose implementation on the services.
> > This would have to revolve around a discovery/registration api and a
> > rest interface contract.
> >
> > Does that make sense?
> >
> > On April 18, 2017 at 14:34:28, Simon Elliston Ball
> > (si...@simonellistonball.com) wrote:
> >
> >> Right, how about some sort of REST API? Or through Ambari? What would
> >> you say was the best way to start the service, and of course to submit
> >> model artefacts?
> >>
> >> Simon
> >>
> >>> On 18 Apr 2017, at 11:33, Otto Fowler wrote:
> >>>
> >>> In my mind, I didn’t want to deploy the service as a bash script or
> >>> wrapped in one, if I recall correctly.
> >>>
> >>> On April 18, 2017 at 14:27:52, Simon Elliston Ball
> >>> (si...@simonellistonball.com) wrote:
> >>>
> >>>> Any particular issues, or things that didn’t work, Otto?
> >>>>
> >>>> Simon
> >>>>
> >>>> > On 18 Apr 2017, at 11:26, Otto Fowler wrote:
> >>>> >
> >>>> > I’ll try to take a look. There are a couple of things I wanted to
> >>>> > do with MaaS but could not figure out because of a couple of
> >>>> > limitations. I’d like to see if twill offers more flexibility
> >>>> >
> >>>> > On April 18, 2017 at 13:50:39, Nick Allen (n...@nickallen.org)
> >>>> > wrote:
> >>>> >
> >>>> > I ran across the Apache Twill [1] project recently whose goal is to
> >>>> > reduce the complexity of developing distributed applications that
> >>>> > run on YARN. My first thought is that it might offer additional
> >>>> > capabilities and/or simplify our current MaaS implementation.
> >>>> >
> >>>> > Here is a list of features provided by Twill that I think might be
> >>>> > useful for MaaS.
> >>>> >
> >>>> > - Service discovery
> >>>> > - Elastic scaling
> >>>> > - High Availability
> >>>> > - Placement policies - Which rack/host should the model run on?
> >>>> > - Security - Kerberos ticket refresh?
> >>>> >
> >>>> > Just wanted to float the thought in the community and see if anyone
> >>>> > has experience with Twill. I need to do some more research myself.
> >>>> >
> >>>> > [1] http://twill.apache.org/
>


Re: [DISCUSS] MPack components that don't support Kerberos

2017-04-13 Thread Casey Stella
I honestly don't know if we can mock out a KDC for integration tests.  If
we did move the integration tests to running against docker, that might be
an option as we could dockerize a KDC as well.

Long story short, "probably, but not for free. ;)"

On Thu, Apr 13, 2017 at 10:41 AM, Otto Fowler <ottobackwa...@gmail.com>
wrote:

> Can we test kerberized support in integration?
>
>
> On April 13, 2017 at 10:24:43, Casey Stella (ceste...@gmail.com) wrote:
>
> Agreed, +1
>
> On Thu, Apr 13, 2017 at 10:14 AM, Otto Fowler <ottobackwa...@gmail.com>
> wrote:
>
>> This should be in the dev guide and pr template
>>
>>
>> On April 13, 2017 at 09:43:48, Casey Stella (ceste...@gmail.com) wrote:
>>
>> Based on my understanding, we have a few axioms that we're working from:
>>
>> - The installer should install a complete and workable product (i.e.
>> after install, everything should work). After all, that has to be the
>> sensible definition of 'working' for an installer
>> - Metron should support running in a Kerberized environment
>>
>> If we are going to support kerberos and the installer is going to install
>> the product, then I would consider lack of kerberos support for a
>> component
>> to block inclusion into the mpack.
>>
>> Casey
>>
>> On Thu, Apr 13, 2017 at 9:29 AM, Ryan Merriman <merrim...@gmail.com>
>> wrote:
>>
>> > There is a PR up for review (
>> > https://github.com/apache/incubator-metron/pull/518) that updates our
>> > MPack
>> > to support a Kerberized environment. There is also a PR up for review
>> that
>> > adds the REST service to the MPack (
>> > https://github.com/apache/incubator-metron/pull/500).
>> >
>> > However, the REST application currently does not work in a kerberized
>> > environment. That work has already started so it won't be an issue for
>> > long but how should we handle situations like this in the future where
>> we
>> > want to add a service but it's not quite ready for Kerberos? Should
>> > Kerberos support be a prerequisite before it's added to the MPack?
>> Should
>> > we look at ways to make these services optional? Any other thoughts or
>> > ideas?
>> >
>> > Ryan
>> >
>>
>>
>


Re: [DISCUSS] MPack components that don't support Kerberos

2017-04-13 Thread Casey Stella
Agreed, +1

On Thu, Apr 13, 2017 at 10:14 AM, Otto Fowler <ottobackwa...@gmail.com>
wrote:

> This should be in the dev guide and pr template
>
>
> On April 13, 2017 at 09:43:48, Casey Stella (ceste...@gmail.com) wrote:
>
> Based on my understanding, we have a few axioms that we're working from:
>
> - The installer should install a complete and workable product (i.e.
> after install, everything should work). After all, that has to be the
> sensible definition of 'working' for an installer
> - Metron should support running in a Kerberized environment
>
> If we are going to support kerberos and the installer is going to install
> the product, then I would consider lack of kerberos support for a
> component
> to block inclusion into the mpack.
>
> Casey
>
> On Thu, Apr 13, 2017 at 9:29 AM, Ryan Merriman <merrim...@gmail.com>
> wrote:
>
> > There is a PR up for review (
> > https://github.com/apache/incubator-metron/pull/518) that updates our
> > MPack
> > to support a Kerberized environment. There is also a PR up for review
> that
> > adds the REST service to the MPack (
> > https://github.com/apache/incubator-metron/pull/500).
> >
> > However, the REST application currently does not work in a kerberized
> > environment. That work has already started so it won't be an issue for
> > long but how should we handle situations like this in the future where
> we
> > want to add a service but it's not quite ready for Kerberos? Should
> > Kerberos support be a prerequisite before it's added to the MPack?
> Should
> > we look at ways to make these services optional? Any other thoughts or
> > ideas?
> >
> > Ryan
> >
>
>


Re: [DISCUSS] MPack components that don't support Kerberos

2017-04-13 Thread Casey Stella
Based on my understanding, we have a few axioms that we're working from:

   - The installer should install a complete and workable product (i.e.
   after install, everything should work).  After all, that has to be the
   sensible definition of 'working' for an installer
   - Metron should support running in a Kerberized environment

If we are going to support kerberos and the installer is going to install
the product, then I would consider lack of kerberos support for a component
to block inclusion into the mpack.

Casey

On Thu, Apr 13, 2017 at 9:29 AM, Ryan Merriman wrote:

> There is a PR up for review (
> https://github.com/apache/incubator-metron/pull/518) that updates our
> MPack
> to support a Kerberized environment.  There is also a PR up for review that
> adds the REST service to the MPack (
> https://github.com/apache/incubator-metron/pull/500).
>
> However, the REST application currently does not work in a kerberized
> environment.  That work has already started so it won't be an issue for
> long but how should we handle situations like this in the future where we
> want to add a service but it's not quite ready for Kerberos?  Should
> Kerberos support be a prerequisite before it's added to the MPack?  Should
> we look at ways to make these services optional?  Any other thoughts or
> ideas?
>
> Ryan
>


Re: [DISCUSS] Extracting Stellar as a component/module

2017-04-12 Thread Casey Stella
I'm ok with google docs as long as when consensus is reached, it lives in
the wiki.

On Tue, Apr 11, 2017 at 6:35 PM, Matt Foley wrote:

> I’ve copied it to the cwiki, but the thing is that cwiki only allows
> comments at the bottom.  With a long doc like this, that’s not very good.
> I’d much rather keep everyone’s comments in the same system, and local to
> the text they’re commenting on.
>
> Is it okay to leave this in google doc?
>
> If anyone can’t abide logging in to google, the cwiki version is here:
>
> https://cwiki.apache.org/confluence/display/METRON/Extracting+Stellar+into+an+Independent+Module
>
> Thanks,
> --Matt
>
> On 4/11/17, 3:15 PM, "Matt Foley" wrote:
>
> No, actually you’re right.  Will have it moved over shortly.
>
> On 4/11/17, 2:56 PM, "Otto Fowler" wrote:
>
> Never mind
>
> On April 11, 2017 at 17:47:57, Otto Fowler (ottobackwa...@gmail.com) wrote:
>
> Can’t we do this in confluence?
>
> On April 11, 2017 at 17:38:40, Matt Foley (ma...@apache.org) wrote:
>
> Hi all,
> This is a new discussion thread, and if the proposed change is accepted by
> the community, it will be submitted to the next release, not the current
> 0.4.0 branch.
>
> Stellar has 126 verbs today, and seems only likely to continue growing.
> Furthermore, we expect Stellar to be extended by users, and probably grow
> into having one or more Registry/Repositories, etc. All this suggests that
> we should start viewing Stellar itself as a component, and make sure it is
> maintainable and has clean interfaces to the rest of the system. And that
> will be easier if we extract it into its own module, both in the code tree
> and in maven.
>
> I’ve written a combination proposal / discussion about how to extract
> Stellar from its current deep embed in Metron. Comments are welcome, and
> encouraged. Please read:
>
> https://docs.google.com/document/d/1EP7Jt4ePHe2A-_oboLl2QbN1muh7uKeET_kbpIgjcJM/edit#heading=h.4vsrmths49wk
>
> I believe I’ve set access so anyone can read and comment on it. However,
> google docs may still ask you to log in with a google-registered email
> address. If this is a problem for anyone, let me know and I can send you a
> Word document.
>
> Thanks,
> --Matt
>


Re: [DISCUSS] next release proposal

2017-04-11 Thread Casey Stella
+1 to 0.4.0

On Tue, Apr 11, 2017 at 2:03 PM, Otto Fowler wrote:

> +1
>
>
> On April 11, 2017 at 13:59:46, Matt Foley (ma...@apache.org) wrote:
>
> Hi all,
> Looks to me like the vast majority of the material mentioned below has been
> committed. There are still 8 recent PRs that need review and, hopefully,
> commit.
>
> I’m going to go ahead and make a release branch, with the understanding
> that any further commits (especially but not limited to Kerberization,
> Metron-UI, Metron Management UI, or Mpack support), that come in over the
> next 36 hours or so will still be included in the RC.
>
> Does that meet everyone’s needs? I want to get started because it will
> probably take a day or more just to create the branch, an RC build, and
> start the sanity testing.
>
> There’s enough major new stuff here that I’m going to call it 0.4.0. Is
> that also okay with everyone?
>
> Thanks,
> --Matt
>
> On 4/5/17, 6:23 PM, "Ali Nazemian"  wrote:
>
> Dear Metron Devs,
>
> As Metron users/customers, we are very keen to have all high-priority
> features and bug fixes related to security, Metron-UI, and Metron
> Management-UI.
>
> Thanks,
> Ali
>
> On Thu, Apr 6, 2017 at 8:04 AM, Ryan Merriman  wrote:
>
> > We just finished responding to the first round of feedback so I don't
> think
> > we're that far away on METRON-623.
> >
> > On Wed, Apr 5, 2017 at 3:30 PM, Matt Foley  wrote:
> >
> > > Totally agree would be good to have MPack support. Let’s see how it
> > > goes. Wouldn’t want to cut it out for the sake of a day or two.
> > >
> > > On 4/5/17, 1:14 PM, "Justin Leet"  wrote:
> > >
> > > I've made fairly good progress on
> > > https://issues.apache.org/jira/browse/METRON-799 (The MPack should
> > > function
> > > in a kerberized cluster). The PR itself might cut close to the
> > > deadline,
> > > and in particular might be tough to get reviewed in time.
> > >
> > > I'll do a best effort attempt to get it in to make our Kerberos story
> > > more
> > > complete, but I'd say the release can go on without this (and we use
> > > manual
> > > Kerberos in its absence).
> > >
> > > Justin
> > >
> > > On Wed, Apr 5, 2017 at 4:07 PM, Matt Foley  wrote:
> > >
> > > > Sure. To be clear, I wasn’t proposing an exclusive list, just
> > > making the
> > > > argument that there seemed to be enough to proceed with. Any duly
> > > > committed content in the master branch, at the time we create the
> > > first RC
> > > > (ie, some time after METRON-623 goes in, but not before Monday)
> > will
> > > surely
> > > > be included in the RC, unless something has a bug that can’t be
> > > readily
> > > > resolved.
> > > >
> > > > Thanks,
> > > > --Matt
> > > >
> > > > On 4/5/17, 12:56 PM, "David Lyle"  wrote:
> > > >
> > > > I'm working on METRON-826 right now. I'll have a PR up today or
> > > > tomorrow at
> > > > the latest. I'd like to see it go as well.
> > > >
> > > > https://issues.apache.org/jira/browse/METRON-826
> > > >
> > > > -D...
> > > >
> > > >
> > > > On Wed, Apr 5, 2017 at 3:52 PM, Nick Allen  > >
> > > wrote:
> > > >
> > > > > I would like to include #509 with the Fastcapa improvements..
> > > > Already have
> > > > > a +1. I'm just letting it soak giving others some time to
> > > review if
> > > > they
> > > > > feel so inclined.
> > > > >
> > > > > https://github.com/apache/incubator-metron/pull/509
> > > > >
> > > > >
> > > > > On Wed, Apr 5, 2017 at 3:50 PM, James Sirota <
> > > jsir...@apache.org>
> > > > wrote:
> > > > >
> > > > > > I second this. I want to see 623 go in in addition to the
> > > > kerberos work.
> > > > > > When both are in I think it makes sense to do the release
> > > > > >
> > > > > > 04.04.2017, 11:33, "Simon Elliston Ball" <
> > > > si...@simonellistonball.com>:
> > > > > > > I'd really like to see METRON-623 (the ui) get into the
> > > release.
> > > > It
> > > > > > feels like the current PR review is getting close, and that
> > > > getting it in
> > > > > > then focussing on follow on tasks in a separate release
> > > would work
> > > > well.
> > > > > > >
> > > > > > > I would be all for getting a release out if only for the
> > > > Kerberos work.
> > > > > > >
> > > > > > > Simon
> > > > > > >
> > > > > > >> On 4 Apr 2017, at 20:15, zeo...@gmail.com <
> > > zeo...@gmail.com>
> > > > wrote:
> > > > > > >>
> > > > > > >> How far out is the management UI?
> > > > > > >>
> > > > > > >> Jon
> > > > > > >>
> > > > > > >>> On Tue, Apr 4, 2017, 2:09 PM Matt Foley <
> > > ma...@apache.org>
> > > > wrote:
> > > > > > >>>
> > > > > > >>> Hi all,
> > > > > > >>> Although it’s only been a few weeks since the last
> > > release was
> > > > > finally
> > > > > > >>> published, that process started in January :-)
> > > > > > >>> Also, the last commit in 0.3.1 was Feb 23, and there’s
> > > been a
> > > > ton of
> > > > > > >>> really 

Re: Failing build

2017-04-06 Thread Casey Stella
Yeah, this is an intermittent test failure that has to do with the migration
to Storm 1.0.3 and how it handles shutting down in local mode.  The slot
worker refuses to shut down and freezes, and therefore we wait forever.  I
thought I had it fixed by manually clobbering the slots, but alas it
appears that I only made the problem less frequent.  When I found it as
part of METRON-793, I ran the fix 20 times locally as well as at least 10
times on my personal Travis without the failure recurring.

Bottom line: we should correct it.  I'll have to think a bit more about how
to fix it and if anyone else wants to take a crack at it, feel free. :)
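
One generic way to keep a hung shutdown from stalling a build is to bound
the wait.  The sketch below is not Storm's API and not the fix from
METRON-793; shutdown_with_deadline is a hypothetical helper that only
illustrates the pattern, under the assumption that the shutdown call can
be run on a separate thread.

```python
import threading
from typing import Callable


def shutdown_with_deadline(shutdown: Callable[[], None],
                           timeout_s: float) -> bool:
    """Run `shutdown` on a daemon thread and wait at most `timeout_s`.

    Returns True if shutdown finished in time, False if it is still
    running.  Because the thread is a daemon, a stuck shutdown cannot
    keep the process alive once the main thread exits.
    """
    worker = threading.Thread(target=shutdown, daemon=True)
    worker.start()
    worker.join(timeout_s)
    return not worker.is_alive()
```

A test harness could fail fast (or force-kill the worker) when this returns
False, rather than waiting until the CI system reports that no output has
been received.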

On Thu, Apr 6, 2017 at 1:26 PM, zeo...@gmail.com wrote:

> We appear to have a failed build again:
>
> No output has been received in the last 10m0s, this potentially indicates a
> stalled build or something wrong with the build itself.
>
> https://travis-ci.org/apache/incubator-metron/builds/219261745
>
> Jon
> --
>
> Jon
>


Re: [DISCUSS] The bro kafka plugin

2017-04-05 Thread Casey Stella
> >>>>  > > > Jon
> >>>>  > > >
> >>>>  > > > On Fri, Mar 31, 2017 at 4:23 PM zeo...@gmail.com <
> zeo...@gmail.com
> >>>>  >
> >>>>  > > wrote:
> >>>>  > > >
> >>>>  > > > I would be happy to try it again but I attempted to do that
> before
> >>>>  with
> >>>>  > > > bro packages and it failed to be able to handle it. I also
> tried
> >>>>  using
> >>>>  > > > branches of a repo with bro but that similarly failed (and was
> a
> >>>>  pretty
> >>>>  > > bad
> >>>>  > > > idea to start with).
> >>>>  > > >
> >>>>  > > > Jon
> >>>>  > > >
> >>>>  > > > On Fri, Mar 31, 2017, 3:24 PM Matt Foley <ma...@apache.org>
> wrote:
> >>>>  > > >
> >>>>  > > > We should be able to request just one alternate repo from
> INFRA,
> >>>>  and
> >>>>  > put
> >>>>  > > a
> >>>>  > > > top hierarchical level in it that doesn’t include a maven pom.
> As
> >>>>  far
> >>>>  > as
> >>>>  > > > maven and clients are concerned, it
> >>>>  > > >
> >>>>  > > > just increases by 1 the path length to the root of the repo.
> >>>>  > > >
> >>>>  > > > On 3/31/17, 10:30 AM, "zeo...@gmail.com" <zeo...@gmail.com>
> wrote:
> >>>>  > > >
> >>>>  > > > Once we agree on a repo location to host this, I would be
> >>>>  happy to
> >>>>  > > put
> >>>>  > > > together the package and update our environments to use
> >>>>  bro-pkg to
> >>>>  > > > install
> >>>>  > > > the plugin. I have created METRON-813
> >>>>  > > > <https://issues.apache.org/jira/browse/METRON-813> to track
> >>>>  this
> >>>>  > and
> >>>>  > > > changed METRON-348 <
> >>>>  > https://issues.apache.org/jira/browse/METRON-348
> >>>>  > > >
> >>>>  > > > to be
> >>>>  > > > a sub-task.
> >>>>  > > >
> >>>>  > > > Otto - the bro packages model doesn't allow colocation with
> >>>>  > anything
> >>>>  > > > else.
> >>>>  > > > That said, if we have two similar situations, and given the
> >>>>  INFRA
> >>>>  > > > example
> >>>>  > > > <https://issues.apache.org/jira/browse/INFRA-7060> Casey
> >>>>  linked to
> >>>>  > > > before
> >>>>  > > > was requesting 9 repos, perhaps we just request two repos.
> >>>>  Would
> >>>>  > > > someone
> >>>>  > > > else mind putting that request in?
> >>>>  > > >
> >>>>  > > > Jon
> >>>>  > > >
> >>>>  > > > On Fri, Mar 31, 2017 at 12:49 PM Otto Fowler <
> >>>>  > > ottobackwa...@gmail.com>
> >>>>  > > > wrote:
> >>>>  > > >
> >>>>  > > > Could we create a separate repo for more than one thing? like
> >>>>  put …
> >>>>  > > um
> >>>>  > > > let’s say
> >>>>  > > > a maven plugin and the bro plugin?
> >>>>  > > >
> >>>>  > > >
> >>>>  > > >
> >>>>  > > > On March 31, 2017 at 12:30:25, Nick Allen (n...@nickallen.org)
> >>>>  > > wrote:
> >>>>  > > >
> >>>>  > > > I agree with everything that I've read.
> >>>>  > > >
> >>>>  > > > One of the guys from Bro had contacted me a while back,
> >>>>  letting me
> >>>>  > > know
> >>>>  > > > that the packaging mechanism in Bro was ready for public
> >>>>  > > consumption. I
> >>>>  > > > just have not had cycles to do anything with it yet. They are
>

Re: [DISCUSS] The bro kafka plugin

2017-04-05 Thread Casey Stella
> > > the plugin.  I have created METRON-813
> > > > <https://issues.apache.org/jira/browse/METRON-813> to track this
> > and
> > > > changed METRON-348 <
> > https://issues.apache.org/jira/browse/METRON-348
> > > >
> > > > to be
> > > > a sub-task.
> > > >
> > > > Otto - the bro packages model doesn't allow colocation with
> > anything
> > > > else.
> > > > That said, if we have two similar situations, and given the INFRA
> > > > example
> > > > <https://issues.apache.org/jira/browse/INFRA-7060> Casey linked
> to
> > > > before
> > > > was requesting 9 repos, perhaps we just request two repos.  Would
> > > > someone
> > > > else mind putting that request in?
> > > >
> > > > Jon
> > > >
> > > > On Fri, Mar 31, 2017 at 12:49 PM Otto Fowler <
> > > ottobackwa...@gmail.com>
> > > > wrote:
> > > >
> > > > > Could we create a separate repo for more than one thing?  like
> put …
> > > um
> > > > let’s say
> > > > a maven plugin and the bro plugin?
> > > >
> > > >
> > > >
> > > > On March 31, 2017 at 12:30:25, Nick Allen (n...@nickallen.org)
> > > wrote:
> > > >
> > > > I agree with everything that I've read.
> > > >
> > > > One of the guys from Bro had contacted me a while back, letting
> me
> > > know
> > > > that the packaging mechanism in Bro was ready for public
> > > consumption. I
> > > > just have not had cycles to do anything with it yet. They are not
> > > > wanting
> > > > to host any of the plugins.
> > > >
> > > > I thought the package mechanism requires that a package live
> within
> > > > its own
> > > > repo (which Casey confirmed). This put me in a bind on how to
> > tackle
> > > > this. I don't want to personally host the plugin in my own Github
> > > > repo. I
> > > > would prefer that we host it in a community repo; either Bro or
> > > Metron.
> > > > Since Bro is moving away from hosting their own plugins, that
> > leaves
> > > > Metron.
> > > >
> > > > It would be great if we could create a separate repo for the
> > plugin.
> > > > That
> > > > solves the challenge of using the packaging mechanism.
> > > >
> > > > We do need to reconcile what is in bro/bro-plugins and what is in
> > > > Metron.
> > > > There are some enhancements that I and others have made that
> never
> > > > made it
> > > > back into Metron. They never made it back, because the original
> > plan
> > > > was
> > > > just to switch to using the plugin from bro/bro-plugins before
> the
> > > > idea of
> > > > a packaging mechanism hit Bro. Reconciling should be fairly easy
> to
> > > > see by
> > > > just doing a diff.
> > > >
> > > > It would be great if others want to take on any of that work. I
> > would
> > > > be
> > > > glad to offer any support that you need. Thanks, Jon!
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > On Thu, Mar 30, 2017 at 11:20 PM, zeo...@gmail.com <
> > zeo...@gmail.com
> > > >
> > > > wrote:
> > > >
> > > > > Ok, great.
> > > > >
> > > > > I agree, I definitely want to hear from Nick on the topic. My
> > team
> > > is
> > > > > currently looking into enhancing the plugin as well to
> > potentially
> > > > allow
> > > > > sending to multiple clusters, investigating some issues we see
> > when
> > > > our
> > > > bro
> > > > > cluster is under load, turn it into a package, etc.
> > > > >
> > > > > The work you just did was on our to do list as well so I'm very
> > > > excited
> > > > to
> > > > > see it come through.
> > > > >
> > > > > Jon
> > > > >
> > > > > On Thu, Mar 30, 2017, 1

Re: [DISCUSS] next release proposal

2017-04-04 Thread Casey Stella
I'd like to see METRON-820 get in since it's correcting a performance
regression introduced earlier in the release.  I'll have it in by Wednesday
of this week.

On Tue, Apr 4, 2017 at 2:33 PM, Simon Elliston Ball <
si...@simonellistonball.com> wrote:

> I'd really like to see METRON-623 (the ui) get into the release. It feels
> like the current PR review is getting close, and that getting it in then
> focussing on follow on tasks in a separate release would work well.
>
> I would be all for getting a release out if only for the Kerberos work.
>
> Simon
>
> > On 4 Apr 2017, at 20:15, zeo...@gmail.com  wrote:
> >
> > How far out is the management UI?
> >
> > Jon
> >
> >> On Tue, Apr 4, 2017, 2:09 PM Matt Foley  wrote:
> >>
> >> Hi all,
> >> Although it’s only been a few weeks since the last release was finally
> >> published, that process started in January :-)
> >> Also, the last commit in 0.3.1 was Feb 23, and there’s been a ton of
> >> really cool new stuff added since then:
> >>
> >> Biggest items:
> >> - Multiple commits for REST API (base Jira: METRON-503)
> >> - Multiple commits to work with Kerberized (secure) clusters (mult.
> Jiras)
> >>
> >> Other major new features:
> >> - METRON-690: DSL-based sparse time window specification for Profiler
> >> - METRON-733: Remove Geo db from ParserBolt
> >> - METRON-686: Record rule set that fired during Threat Triage
> >> - METRON-743: Sort files when reading results from Pcap
> >> - METRON-701: Triage metrics produced by Profiler
> >> - METRON-744: Stellar external functions loaded from HDFS (and huge
> >> speed-up for function resolution)
> >> - METRON-694: Index errors from Topologies, and
> >> - METRON-745: Create Error dashboards
> >> - METRON-712: Separate eval from parse in Stellar
> >> - METRON-765: Add GUID to messages
> >> - METRON-793: Updated to storm-kafka-client spout
> >>
> >> We’ve also had numerous bug fixes, docs improvements, and improvements
> to
> >> deployment tools (docker, ansible, mpack, quickdev, and fulldev).
> >>
> >> I think the REST API and Kerberization, by themselves, would justify a
> >> release.  Along with the others, I’d like to propose that we make a
> release
> >> soon.  The time frame I had in mind was at the end of this week I could
> cut
> >> a release branch (so on-going work in master doesn’t get blocked) and
> start
> >> the process of generating an RC.
> >>
> >> What do you-all think?
> >> Also, what additional work do you think should be included in this
> >> release, and can it realistically get done by the end of this week?  The
> >> time frame is, of course, flexible at the pleasure of the community –
> but
> >> also, there will be another release in another couple months or so, so
> no
> >> need to rush stuff.
> >>
> >> Thanks,
> >> --Matt
> >>
> >>
> >> --
> >
> > Jon
>


Re: Kerberos changes affected quick-dev and full-dev

2017-04-04 Thread Casey Stella
Thanks David!

On Mon, Apr 3, 2017 at 8:43 PM, David Lyle <dlyle65...@gmail.com> wrote:

> I've pushed a new Vagrant image for Quick Dev. You should be asked to
> update the box the next time you 'vagrant up' Quick Dev.
>
> -D...
>
>
> On Mon, Apr 3, 2017 at 2:33 PM, Casey Stella <ceste...@gmail.com> wrote:
>
> > Thanks Justin,  the packer build is started, but this is going to take
> some
> > time.  Please use full-dev to validate your PRs in the meantime.  I will
> > update this thread once it's uploaded.
> >
> > On Mon, Apr 3, 2017 at 2:13 PM, Justin Leet <justinjl...@gmail.com>
> wrote:
> >
> > > The PR to fix full-dev is in master now.  We still need a new packer
> > build
> > > before we have quick-dev available again.
> > >
> > > Justin
> > >
> > > On Mon, Apr 3, 2017 at 10:53 AM, Justin Leet <justinjl...@gmail.com>
> > > wrote:
> > >
> > > > Btw, here is a workaround for full-dev. In Ambari, add the line
> > > "topology.worker.childopts="
> > > > (no argument) to the elasticsearch.properties template, then restart
> > > > > indexing through Ambari to propagate the change out.
> > > >
> > > > For example, make the Storm section look like:
> > > >
> > > > # Storm #
> > > > indexing.workers=1
> > > > indexing.executors=0
> > > > topology.worker.childopts=
> > > >
> > > > Justin
> > > >
> > > > On Mon, Apr 3, 2017 at 10:46 AM, Casey Stella <ceste...@gmail.com>
> > > wrote:
> > > >
> > > >> Hey guys,
> > > >>
> > > >> Just a quick heads up, the kerberos related changes (797 and 793)
> that
> > > >> went
> > > >> in last week had mpack changes.  This means that a new packer build
> > > needs
> > > >> to be updated for quickdev to work.  Unfortunately, that didn't
> happen
> > > >> *and* there's a follow-on bug (METRON-818) that also involves mpack
> > > >> changes
> > > >> (https://github.com/apache/incubator-metron/pull/506).
> > > >>
> > > >> Also, this change fixes a bug introduced in 797 with full-dev, so
> it's
> > > >> getting high priority attention just right now.  I just wanted to
> send
> > > an
> > > >> update and make sure everyone was aware of what's going on if you
> try
> > > >> full-dev or quick-dev and it fails for you.  I expect 818 to get in
> > > >> quickly
> > > >> (already has 2 +1s, so pretty much as soon as travis returns we'll
> > > >> commit),
> > > >> which should fix full-dev.
> > > >>
> > > >> Casey
> > > >>
> > > >
> > > >
> > >
> >
>


Re: Journey out of the Incubator (update)

2017-04-03 Thread Casey Stella
For reference and to save people the ponymail searches:
* The vote thread was at
https://lists.apache.org/thread.html/0aa75e8ffb0cd5a0446474f82ff4227ddacfbb8f1e84c442934bfabe@%3Cgeneral.incubator.apache.org%3E
* The discussion thread was at
https://lists.apache.org/thread.html/e5d106456b28562bdc947624c6f33e3281297dfd3803aab3d171bbad@%3Cgeneral.incubator.apache.org%3E



On Mon, Apr 3, 2017 at 6:30 PM, Casey Stella <ceste...@gmail.com> wrote:

> Hi All,
>
> For those of you who aren't following the discussion and vote, the
> incubator general, after a vigorous discussion, voted to approve
> recommending Metron to become a top level project with all +1s.  I will
> next submit the resolution to the apache board.  For those following along,
> the process is described at http://incubator.apache.org/
> guides/graduation.html#top-level-board-proposal
>
> The next board meeting is the 19th of April (http://www.apache.org/
> foundation/board/calendar.html) so we should be on the docket.
>
> Congrats everyone!
>
> Best,
>
> Casey
>


Journey out of the Incubator (update)

2017-04-03 Thread Casey Stella
Hi All,

For those of you who aren't following the discussion and vote, the
incubator general, after a vigorous discussion, voted to approve
recommending Metron to become a top level project with all +1s.  I will
next submit the resolution to the apache board.  For those following along,
the process is described at
http://incubator.apache.org/guides/graduation.html#top-level-board-proposal

The next board meeting is the 19th of April (
http://www.apache.org/foundation/board/calendar.html) so we should be on
the docket.

Congrats everyone!

Best,

Casey


Re: Kerberos changes affected quick-dev and full-dev

2017-04-03 Thread Casey Stella
Thanks Justin,  the packer build is started, but this is going to take some
time.  Please use full-dev to validate your PRs in the meantime.  I will
update this thread once it's uploaded.

On Mon, Apr 3, 2017 at 2:13 PM, Justin Leet <justinjl...@gmail.com> wrote:

> The PR to fix full-dev is in master now.  We still need a new packer build
> before we have quick-dev available again.
>
> Justin
>
> On Mon, Apr 3, 2017 at 10:53 AM, Justin Leet <justinjl...@gmail.com>
> wrote:
>
> > Btw, here is a workaround for full-dev. In Ambari, add the line
> "topology.worker.childopts="
> > (no argument) to the elasticsearch.properties template, then restart
> > indexing through Ambari to propagate the change out.
> >
> > For example, make the Storm section look like:
> >
> > # Storm #
> > indexing.workers=1
> > indexing.executors=0
> > topology.worker.childopts=
> >
> > Justin
> >
> > On Mon, Apr 3, 2017 at 10:46 AM, Casey Stella <ceste...@gmail.com>
> wrote:
> >
> >> Hey guys,
> >>
> >> Just a quick heads up, the kerberos related changes (797 and 793) that
> >> went
> >> in last week had mpack changes.  This means that a new packer build
> needs
> >> to be updated for quickdev to work.  Unfortunately, that didn't happen
> >> *and* there's a follow-on bug (METRON-818) that also involves mpack
> >> changes
> >> (https://github.com/apache/incubator-metron/pull/506).
> >>
> >> Also, this change fixes a bug introduced in 797 with full-dev, so it's
> >> getting high priority attention just right now.  I just wanted to send
> an
> >> update and make sure everyone was aware of what's going on if you try
> >> full-dev or quick-dev and it fails for you.  I expect 818 to get in
> >> quickly
> >> (already has 2 +1s, so pretty much as soon as travis returns we'll
> >> commit),
> >> which should fix full-dev.
> >>
> >> Casey
> >>
> >
> >
>
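
The full-dev workaround quoted above amounts to making sure the
elasticsearch.properties template contains an empty
topology.worker.childopts entry. A minimal sketch of that edit follows,
assuming the template is plain key=value text. The helper name
ensure_property is hypothetical, and the sketch appends the key at the end
rather than inside the Storm section, which should not matter for a
properties file.

```python
def ensure_property(text, key, value=""):
    """Return properties text in which `key` is set to `value`.

    An existing `key=...` line is rewritten in place; otherwise a new
    line is appended at the end. Comment lines are left untouched.
    """
    lines = text.splitlines()
    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith("#"):
            continue
        name, sep, _ = stripped.partition("=")
        if sep and name.strip() == key:
            lines[i] = f"{key}={value}"
            break
    else:
        # The key was not found anywhere, so add it.
        lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"


# The Storm section as it looks before the workaround is applied.
storm_section = "# Storm #\nindexing.workers=1\nindexing.executors=0\n"
patched = ensure_property(storm_section, "topology.worker.childopts")
print(patched, end="")
```

Running the helper a second time on its own output leaves the text
unchanged, so it is safe to apply repeatedly. Indexing still has to be
restarted through Ambari for the change to take effect.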


Re: [DISCUSS] The bro kafka plugin

2017-03-30 Thread Casey Stella
I *think* it's possible.  People do ask for mirrors of directories from
time to time (see https://issues.apache.org/jira/browse/INFRA-7060).  If we
think this is a good idea, we can pose it to INFRA as a request.  I'd love
to see us be able to use the bro packaging infrastructure and get more
visibility for the plugin.

I'd be particularly interested in Nick's opinion on this, though.

On Thu, Mar 30, 2017 at 11:12 PM, zeo...@gmail.com <zeo...@gmail.com> wrote:

> You can version packages -
> http://bro-package-manager.readthedocs.io/en/stable/package.html#package-
> versioning
>
> I agree that having a separate repo provided by Apache would be optimal, I
> just don't know the process for that or if it was even reasonable to
> suggest.
>
> Jon
>
> On Thu, Mar 30, 2017, 11:01 PM Casey Stella <ceste...@gmail.com> wrote:
>
> > Looking at the bro packages, it appears that bro is expecting things to
> be
> > its own git repository.  I wonder if we could either request INFRA
> provide
> > another repo for the bro-kafka plugin and integrate it into metron as a
> git
> > submodule *or* if we could request INFRA to create a github mirror of the
> > metron-sensors/bro-kafka-plugin directory.  I'm not sure how viable
> either
> > of those options are, frankly.
> >
> > One thing that I didn't see is how do you specify a particular release of
> > the plugin that you want to install?  For us, we'd want to release the
> > plugin along with the product.  I didn't quite see how you'd push
> releases
> > for bro plugins.
> >
> > On Thu, Mar 30, 2017 at 10:49 PM, Casey Stella <ceste...@gmail.com>
> wrote:
> >
> > > So, I do agree with the concern.  Is there a way to host the package
> > > within Metron?  I definitely would like to see the modifications at
> > > https://github.com/bro/bro-plugins/commit/b9f1f35415cb0db
> > > 065348da0a5043a8353b4a0a8 brought back into Metron and I'd love for us
> to
> > > host the plugin.
> > >
> > > Thoughts?
> > >
> > >
> > > On Thu, Mar 30, 2017 at 9:09 PM, zeo...@gmail.com <zeo...@gmail.com>
> > > wrote:
> > >
> > >> Today I was taking a look at METRON-812
> > >> <https://issues.apache.org/jira/browse/METRON-812>, which made me
> > recall
> > >> some conversations from a while back regarding where the bro kafka
> > plugin
> > >> should ultimately live, and how to update it.
> > >>
> > >> Back in METRON-348 <https://issues.apache.org/jira/browse/METRON-348>
> I
> > >> brought up the fact that some important changes
> > >> <https://github.com/bro/bro-plugins/commit/b9f1f35415cb0db06
> > >> 5348da0a5043a8353b4a0a8>
> > >> were made to the externally hosted version of the kafka plugin, and
> were
> > >> never introduced to Metron's hosted version (i.e. the one we use
> > >> <https://github.com/apache/incubator-metron/blob/master/metr
> > >> on-deployment/roles/bro/tasks/bro-plugin-kafka.yml>
> > >> in vagrant when bro is installed).  The conversation went down the
> route
> > >> of
> > >> discussing whether or not the bro kafka plugin code should continue to
> > >> live
> > >> in Metron in the first place.  Now, with METRON-812, I see us further
> > >> muddying the waters of where to go for the right plugin, as our
> version
> > is
> > >> still missing the public changes but adds some very important new
> > >> functionality.
> > >>
> > >> I'd like to bring up the idea of using bro's packages
> > >> <https://github.com/bro/packages> framework, released in late 2016
> > >> <http://blog.bro.org/2016/10/introducing-bro-package-manager.html>
> > >> (additional
> > >> documentation here <
> > http://bro-package-manager.readthedocs.io/en/stable/
> > >> >),
> > >> as a potential place for this to be hosted/referenced.  This is a
> simple
> > >> and supported method (funded by Mozilla
> > >> <https://blog.mozilla.org/blog/2015/12/10/mozilla-open-sourc
> > >> e-support-first-awards-made/>)
> > >> to install and uninstall bro scripts, plugins, etc., and it also
> allows
> > us
> > >> to continue to have enough control over updates to the plugin so that
> it
> > >> will not slow down Metron development by having it as a dependency
> > >> (resolving both of Casey's concerns noted here
> > >> <https:/

Re: [DISCUSS] The bro kafka plugin

2017-03-30 Thread Casey Stella
Looking at the bro packages, it appears that bro is expecting things to be
its own git repository.  I wonder if we could either request INFRA provide
another repo for the bro-kafka plugin and integrate it into metron as a git
submodule *or* if we could request INFRA to create a github mirror of the
metron-sensors/bro-kafka-plugin directory.  I'm not sure how viable either
of those options are, frankly.

One thing that I didn't see is how do you specify a particular release of
the plugin that you want to install?  For us, we'd want to release the
plugin along with the product.  I didn't quite see how you'd push releases
for bro plugins.

On Thu, Mar 30, 2017 at 10:49 PM, Casey Stella <ceste...@gmail.com> wrote:

> So, I do agree with the concern.  Is there a way to host the package
> within Metron?  I definitely would like to see the modifications at
> https://github.com/bro/bro-plugins/commit/b9f1f35415cb0db
> 065348da0a5043a8353b4a0a8 brought back into Metron and I'd love for us to
> host the plugin.
>
> Thoughts?
>
>
> On Thu, Mar 30, 2017 at 9:09 PM, zeo...@gmail.com <zeo...@gmail.com>
> wrote:
>
>> Today I was taking a look at METRON-812
>> <https://issues.apache.org/jira/browse/METRON-812>, which made me recall
>> some conversations from a while back regarding where the bro kafka plugin
>> should ultimately live, and how to update it.
>>
>> Back in METRON-348 <https://issues.apache.org/jira/browse/METRON-348> I
>> brought up the fact that some important changes
>> <https://github.com/bro/bro-plugins/commit/b9f1f35415cb0db06
>> 5348da0a5043a8353b4a0a8>
>> were made to the externally hosted version of the kafka plugin, and were
>> never introduced to Metron's hosted version (i.e. the one we use
>> <https://github.com/apache/incubator-metron/blob/master/metr
>> on-deployment/roles/bro/tasks/bro-plugin-kafka.yml>
>> in vagrant when bro is installed).  The conversation went down the route
>> of
>> discussing whether or not the bro kafka plugin code should continue to
>> live
>> in Metron in the first place.  Now, with METRON-812, I see us further
>> muddying the waters of where to go for the right plugin, as our version is
>> still missing the public changes but adds some very important new
>> functionality.
>>
>> I'd like to bring up the idea of using bro's packages
>> <https://github.com/bro/packages> framework, released in late 2016
>> <http://blog.bro.org/2016/10/introducing-bro-package-manager.html>
>> (additional
>> documentation here <http://bro-package-manager.readthedocs.io/en/stable/
>> >),
>> as a potential place for this to be hosted/referenced.  This is a simple
>> and supported method (funded by Mozilla
>> <https://blog.mozilla.org/blog/2015/12/10/mozilla-open-sourc
>> e-support-first-awards-made/>)
>> to install and uninstall bro scripts, plugins, etc., and it also allows us
>> to continue to have enough control over updates to the plugin so that it
>> will not slow down Metron development by having it as a dependency
>> (resolving both of Casey's concerns noted here
>> <https://issues.apache.org/jira/browse/METRON-348?focusedCom
>> mentId=15391865=com.atlassian.jira.plugin.system.
>> issuetabpanels:comment-tabpanel#comment-15391865>,
>> and I think this solution is supported by Nick's comments here
>> <https://issues.apache.org/jira/browse/METRON-348?focusedCom
>> mentId=15391872=com.atlassian.jira.plugin.system.
>> issuetabpanels:comment-tabpanel#comment-15391872>
>> as
>> well).
>>
>> The only thing I'm not sure about is where to host the plugin itself - my
>> first thought would be Nick's github <https://github.com/nickwallen>, as
>> he
>> really kicked off this effort, but maybe we can think of something better.
>>
>> Is this approach of interest to anybody?  It is extremely simple to put
>> together - I was able to throw one together
>> <https://github.com/bro/packages/blob/master/jonzeolla/bro-pkg.index> and
>> get it working with a fresh bro 2.5 install when attending the brocon talk
>> <https://www.bro.org/brocon2016/brocon2016_abstracts.html#
>> bro-packagemanager>
>>  (recording <https://www.youtube.com/watch?v=9RFfPJeGkcE>, slides
>> <https://www.bro.org/brocon2016/slides/hall_bpm.pdf>) that introduced
>> this
>> to me in the first place.
>>
>> Jon
>> --
>>
>> Jon
>>
>
>


Re: [DISCUSS] The bro kafka plugin

2017-03-30 Thread Casey Stella
So, I do agree with the concern.  Is there a way to host the package within
Metron?  I definitely would like to see the modifications at
https://github.com/bro/bro-plugins/commit/b9f1f35415cb0db065348da0a5043a8353b4a0a8
brought back into Metron and I'd love for us to host the plugin.


Thoughts?


On Thu, Mar 30, 2017 at 9:09 PM, zeo...@gmail.com <zeo...@gmail.com> wrote:

> Today I was taking a look at METRON-812
> <https://issues.apache.org/jira/browse/METRON-812>, which made me recall
> some conversations from a while back regarding where the bro kafka plugin
> should ultimately live, and how to update it.
>
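> Back in METRON-348 <https://issues.apache.org/jira/browse/METRON-348> I
> brought up the fact that some important changes
> <https://github.com/bro/bro-plugins/commit/b9f1f35415cb0db065348da0a5043a8353b4a0a8>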
> were made to the externally hosted version of the kafka plugin, and were
> never introduced to Metron's hosted version (i.e. the one we use
> <https://github.com/apache/incubator-metron/blob/master/metron-deployment/roles/bro/tasks/bro-plugin-kafka.yml>
> in vagrant when bro is installed).  The conversation went down the route of
> discussing whether or not the bro kafka plugin code should continue to live
> in Metron in the first place.  Now, with METRON-812, I see us further
> muddying the waters of where to go for the right plugin, as our version is
> still missing the public changes but adds some very important new
> functionality.
>
> I'd like to bring up the idea of using bro's packages
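> <https://github.com/bro/packages> framework, released in late 2016
> <http://blog.bro.org/2016/10/introducing-bro-package-manager.html>
> (additional documentation here
> <http://bro-package-manager.readthedocs.io/en/stable/>),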
> as a potential place for this to be hosted/referenced.  This is a simple
> and supported method (funded by Mozilla
> <https://blog.mozilla.org/blog/2015/12/10/mozilla-open-source-support-first-awards-made/>)
> to install and uninstall bro scripts, plugins, etc., and it also allows us
> to continue to have enough control over updates to the plugin so that it
> will not slow down Metron development by having it as a dependency
> (resolving both of Casey's concerns noted here
> <https://issues.apache.org/jira/browse/METRON-348?focusedCommentId=15391865=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15391865>,
> and I think this solution is supported by Nick's comments here
> <https://issues.apache.org/jira/browse/METRON-348?focusedCommentId=15391872=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15391872>
> as
> well).
>
> The only thing I'm not sure about is where to host the plugin itself - my
> first thought would be Nick's github <https://github.com/nickwallen>, as
> he
> really kicked off this effort, but maybe we can think of something better.
>
> Is this approach of interest to anybody?  It is extremely simple to put
> together - I was able to throw one together
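> <https://github.com/bro/packages/blob/master/jonzeolla/bro-pkg.index> and
> get it working with a fresh bro 2.5 install when attending the brocon talk
> <https://www.bro.org/brocon2016/brocon2016_abstracts.html#bro-packagemanager>
> (recording <https://www.youtube.com/watch?v=9RFfPJeGkcE>, slides
> <https://www.bro.org/brocon2016/slides/hall_bpm.pdf>) that introduced this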
> to me in the first place.
>
> Jon
> --
>
> Jon
>


Re: Changes to development guidelines

2017-03-27 Thread Casey Stella
Yes indeed, I have seen that before.  I created a JIRA and a PR to fix it
and forgot to submit it.
https://issues.apache.org/jira/browse/METRON-773

I'll submit the PR.

On Mon, Mar 27, 2017 at 10:06 AM, Otto Fowler wrote:

> I’m working on METRON-806, which is basically a new assembly not setting
> <tarLongFileMode>posix</tarLongFileMode> for the assembly plugin.
>
> I think there should be an entry in the dev guidelines about setting this
> when adding assemblies, but I’m not sure how we do changes there.
>
> Thoughts?
>
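
Until the guidelines cover this, a check like the one below could flag
assemblies that forget the setting. It is only a sketch: the function name
and the sample pom are made up for illustration, it assumes the pom declares
the standard Maven POM namespace, and it treats a tarLongFileMode of posix
anywhere inside a maven-assembly-plugin declaration as sufficient, without
looking at individual executions.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard Maven pom.xml files.
POM_NS = {"m": "http://maven.apache.org/POM/4.0.0"}


def assembly_plugins_missing_posix(pom_xml):
    """Count maven-assembly-plugin declarations lacking tarLongFileMode=posix."""
    root = ET.fromstring(pom_xml)
    missing = 0
    for plugin in root.iterfind(".//m:plugin", POM_NS):
        artifact = plugin.findtext("m:artifactId", default="", namespaces=POM_NS)
        if artifact.strip() != "maven-assembly-plugin":
            continue
        modes = [(el.text or "").strip()
                 for el in plugin.iterfind(".//m:tarLongFileMode", POM_NS)]
        if "posix" not in modes:
            missing += 1
    return missing


# Hypothetical pom with an assembly that does not set tarLongFileMode.
SAMPLE_POM = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <build>
    <plugins>
      <plugin>
        <artifactId>maven-assembly-plugin</artifactId>
        <configuration>
          <descriptor>src/main/assembly/assembly.xml</descriptor>
        </configuration>
      </plugin>
    </plugins>
  </build>
</project>"""

print(assembly_plugins_missing_posix(SAMPLE_POM))
```

A non-zero count would mean at least one assembly needs
<tarLongFileMode>posix</tarLongFileMode> added to its configuration.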


Re: [DISCUSS] Kerberos Support

2017-03-26 Thread Casey Stella
It occurs to me that I did not cite the work that inspired this
appropriately and I should've.  I do apologize for that; it was a
significant enough departure from the original investigatory branch that it
didn't occur to me.  I've adjusted the JIRA and the PR description as
follows to ensure proper credit for the inspiration:


This work was inspired by a portion of the investigatory work done at
https://github.com/dlyle65535/incubator-metron/tree/kerb-testing?files=1 by:
* @dlyle65535
* @merrimanr
* @mmiklavc

This carves out a specific piece of that functionality with the following
differences:
* The mpack work is not included, but the properties are set up to enable
it as a follow-on
* It presumes METRON-793, so it uses storm-kafka-client rather than
storm-kafka
* It adds a flag when starting the parsers to pass the security protocol
and sets up the writers and the spout automatically rather than relying on
the set of extra kafka configs (though both approaches would work here).


Sorry about that!

Casey


On Sat, Mar 25, 2017 at 12:14 PM, Casey Stella <ceste...@gmail.com> wrote:

> Actually, METRON-797 (https://github.com/apache/incubator-metron/pull/490)
> is inspired by that work, Dave.
> Specifically here are the differences:
>
>- The mpack work is not included
>- It's based on 793, so it uses storm-kafka-client rather than
>storm-kafka
>- It adds a flag when starting the parsers to pass the security
>protocol and sets up the writers and the spout automatically
>- The core extension points (flux properties) are set up like in that
>PR, to make the follow-on work easier
>
>
> What's left is to pull the great work on the mpack out of there and get it
> working as a follow-on PR, which I created METRON-799 (
> https://issues.apache.org/jira/browse/METRON-799) to accomplish.
>
> Casey
>
>
> On Sat, Mar 25, 2017 at 10:11 AM, David Lyle <dlyle65...@gmail.com> wrote:
>
>> Sounds good to me. A few of us did some initial exploration on that topic
>> a
>> week or so back. The branch is here:
>> https://github.com/dlyle65535/incubator-metron/tree/kerb-testing?files=1
>>
>> It contains a prototype quality version of everything you've identified
>> except METRON-793 and the probes.
>>
>> If it's a good starting place, it'd probably be short work to get it to a
>> PR-able form.
>>
>> Thoughts?
>>
>> -D...
>>
>> On Fri, Mar 24, 2017 at 23:03 Casey Stella <ceste...@gmail.com> wrote:
>>
>> > Hi All,
>> >
>> > I'd like to talk and start to formulate a plan around supporting running
>> > Metron on a kerberized cluster.  This is a big bundle of work and seems
>> > dauntingly nebulous, so I wanted to have a chat and get a firm
>> direction.
>> > When initially contemplating the issue, it's apparent that there are a
>> few
>> > snags that need to be thought through:
>> >
>> >- The kafka spout (storm-kafka) from apache does not support
>> interacting
>> >with kerberized kafka
>> >- We need to ensure the kafka writers are passing along the security
>> >protocol
>> >- We need to ensure that the credentials are auto-renewed
>> >
>> > Since I knew this was necessary, I wanted to get over this initial
>> barrier
>> > and submitted a couple of PRs associated with the following JIRAs that
>> are
>> > in review now:
>> >
>> >- METRON-793: Migrate to storm-kafka-client, a kafka spout which
>> >supports kerberized kafka
>> >- METRON-797: Pass along security protocols and set up autorenewal
>> >
>> > Between these, we get what I believe is the enrichment, profiler and
>> parser
>> > topologies and interactions with hbase and hdfs from them working along
>> > with the MR jobs.  We still lack a few things:
>> >
>> >- A document describing the process of kerberizing a vagrant
>> environment
>> >to enable testing
>> >- Kerberos support for the librdkafka-based sensors (bro and
>> fastcapa)
>> >- Kerberos support for the sensors that use the console-producer
>> (snort
>> >and yaf)
>> >   - This should be as easy as ensuring to pass the right params to
>> the
>> >   console producer, but still, we need to adjust the sensor stubs
>> > to make the
>> >   right call.
>> >- Kerberos support for the mpack
>> >- Kerberos support for the REST API
>> >- Kerberos support for the pycapa
>> >
>> > I'm tracking these as JIRAs with the label of "kerberos" (see
>> >
>> > https://issues.apache.org/jira/browse/METRON-802?jql=labels%
>> 20%3D%20kerberos%20AND%20project%20%3D%20Metron
>> > )
>> >
>> > Please chime in if I missed something; I'll add it to the list.
>> >
>> > Best,
>> >
>> > Casey
>> >
>>
>
>


Re: [DISCUSS] Kerberos Support

2017-03-25 Thread Casey Stella
Actually, METRON-797 (https://github.com/apache/incubator-metron/pull/490)
is inspired by that work, Dave.
Specifically here are the differences:

   - The mpack work is not included
   - It's based on 793, so it uses storm-kafka-client rather than
   storm-kafka
   - It adds a flag when starting the parsers to pass the security protocol
   and sets up the writers and the spout automatically
   - The core extension points (flux properties) are set up like in that
   PR, to make the follow-on work easier


What's left is to pull the great work on the mpack out of there and get it
working as a follow-on PR, which I created METRON-799 (
https://issues.apache.org/jira/browse/METRON-799) to accomplish.

Casey


On Sat, Mar 25, 2017 at 10:11 AM, David Lyle <dlyle65...@gmail.com> wrote:

> Sounds good to me. A few of us did some initial exploration on that topic a
> week or so back. The branch is here:
> https://github.com/dlyle65535/incubator-metron/tree/kerb-testing?files=1
>
> It contains a prototype quality version of everything you've identified
> except METRON-793 and the probes.
>
> If it's a good starting place, it'd probably be short work to get it to a
> PR-able form.
>
> Thoughts?
>
> -D...
>
> On Fri, Mar 24, 2017 at 23:03 Casey Stella <ceste...@gmail.com> wrote:
>
> > Hi All,
> >
> > I'd like to talk and start to formulate a plan around supporting running
> > Metron on a kerberized cluster.  This is a big bundle of work and seems
> > dauntingly nebulous, so I wanted to have a chat and get a firm direction.
> > When initially contemplating the issue, it's apparent that there are a
> few
> > snags that need to be thought through:
> >
> >- The kafka spout (storm-kafka) from apache does not support
> interacting
> >with kerberized kafka
> >- We need to ensure the kafka writers are passing along the security
> >protocol
> >- We need to ensure that the credentials are auto-renewed
> >
> > Since I knew this was necessary, I wanted to get over this initial
> barrier
> > and submitted a couple of PRs associated with the following JIRAs that
> are
> > in review now:
> >
> >- METRON-793: Migrate to storm-kafka-client, a kafka spout which
> >supports kerberized kafka
> >- METRON-797: Pass along security protocols and set up autorenewal
> >
> > Between these, we get what I believe is the enrichment, profiler and
> parser
> > topologies and interactions with hbase and hdfs from them working along
> > with the MR jobs.  We still lack a few things:
> >
> >- A document describing the process of kerberizing a vagrant
> environment
> >to enable testing
> >- Kerberos support for the librdkafka-based sensors (bro and fastcapa)
> >- Kerberos support for the sensors that use the console-producer
> (snort
> >and yaf)
> >   - This should be as easy as ensuring to pass the right params to
> the
> >   console producer, but still, we need to adjust the sensor stubs
> > to make the
> >   right call.
> >- Kerberos support for the mpack
> >- Kerberos support for the REST API
> >- Kerberos support for the pycapa
> >
> > I'm tracking these as JIRAs with the label of "kerberos" (see
> >
> > https://issues.apache.org/jira/browse/METRON-802?jql=
> labels%20%3D%20kerberos%20AND%20project%20%3D%20Metron
> > )
> >
> > Please chime in if I missed something; I'll add it to the list.
> >
> > Best,
> >
> > Casey
> >
>


[DISCUSS] Kerberos Support

2017-03-24 Thread Casey Stella
Hi All,

I'd like to talk and start to formulate a plan around supporting running
Metron on a kerberized cluster.  This is a big bundle of work and seems
dauntingly nebulous, so I wanted to have a chat and get a firm direction.
When initially contemplating the issue, it's apparent that there are a few
snags that need to be thought through:

   - The kafka spout (storm-kafka) from apache does not support interacting
   with kerberized kafka
   - We need to ensure the kafka writers are passing along the security
   protocol
   - We need to ensure that the credentials are auto-renewed

Since I knew this was necessary, I wanted to get over this initial barrier
and submitted a couple of PRs associated with the following JIRAs that are
in review now:

   - METRON-793: Migrate to storm-kafka-client, a kafka spout which
   supports kerberized kafka
   - METRON-797: Pass along security protocols and set up autorenewal

Between these, we get what I believe is the enrichment, profiler and parser
topologies and interactions with hbase and hdfs from them working along
with the MR jobs.  We still lack a few things:

   - A document describing the process of kerberizing a vagrant environment
   to enable testing
   - Kerberos support for the librdkafka-based sensors (bro and fastcapa)
   - Kerberos support for the sensors that use the console-producer (snort
   and yaf)
      - This should be as easy as passing the right params to the
      console producer, but still, we need to adjust the sensor stubs
      to make the right call.
   - Kerberos support for the mpack
   - Kerberos support for the REST API
   - Kerberos support for the pycapa

I'm tracking these as JIRAs with the label of "kerberos" (see
https://issues.apache.org/jira/browse/METRON-802?jql=labels%20%3D%20kerberos%20AND%20project%20%3D%20Metron
)

Please chime in if I missed something; I'll add it to the list.

Best,

Casey
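
For the console-producer item in the list above, the "right params" are
essentially Kafka client security settings. A minimal sketch of generating
them follows. security.protocol and sasl.kerberos.service.name are standard
Kafka client configuration keys; the function name and the default values
are assumptions for illustration, and the correct protocol value depends on
how the brokers are configured.

```python
def kerberos_client_properties(security_protocol="SASL_PLAINTEXT",
                               service_name="kafka"):
    """Render Kafka client settings for a kerberized broker as properties text."""
    settings = {
        # Assumed default; use whatever protocol the brokers actually expose.
        "security.protocol": security_protocol,
        # Kerberos service name the brokers run as (assumed to be "kafka").
        "sasl.kerberos.service.name": service_name,
    }
    return "".join(f"{key}={value}\n" for key, value in settings.items())


client_props = kerberos_client_properties()
print(client_props, end="")
```

The resulting text could be written to a file and handed to the console
producer through its --producer.config option. The client also needs a JAAS
configuration naming the principal and keytab, which this sketch does not
cover.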


Re: [MENTORS] initiating the TLP vote

2017-03-23 Thread Casey Stella
Sounds good, thanks Taylor.  I'll write up the email and start the
discussion on incubator general.

On Thu, Mar 23, 2017 at 1:58 PM, P. Taylor Goetz  wrote:

> Anyone can do this, it doesn’t need to be a mentor or the proposed VP.
> It’s much like a release vote: Forward a copy of the proposed resolution
> and a link to the PPMC graduation VOTE.
>
> So the procedure is:
>
> 1. Start a DISCUSS thread on general@incubator. Once that discussion dies
> down, and if there seems to be positive consensus…
> 2. Start a VOTE thread for the same.
>
> -Taylor
>
>
> > On Mar 22, 2017, at 12:53 PM, James Sirota  wrote:
> >
> > Mentors,
> >
> > Looks like we have everything we need to initiate the incubator board
> vote to leave the incubator.  Can you provide some guidance as to what is
> the best way to approach this?
> >
> > ---
> > Thank you,
> >
> > James Sirota
> > PPMC- Apache Metron (Incubating)
> > jsirota AT apache DOT org
>
>


Re: [DISCUSS][PROPOSAL] Acceptance Tests

2017-03-22 Thread Casey Stella
Agreed, we need to get the profiler, MaaS and the REST API deployed via the
mpack sooner rather than later.

On Wed, Mar 22, 2017 at 3:05 PM, Ryan Merriman <merrim...@gmail.com> wrote:

> I think we'll have non manual rest/web deployment soon regardless of this
> discussion.
>
> On Wed, Mar 22, 2017 at 2:00 PM, Ryan Merriman <merrim...@gmail.com>
> wrote:
>
> > I don't think a cluster installed by ansible is a prerequisite to using
> > ansible to integration test.  They would be completely separate modules
> > except maybe sharing some property or inventory files.  Just need to run
> > scripts and hit rest endpoints right?  Just an idea, maybe it's overkill.
> > I'm cool with rolling our own.
> >
> > On Wed, Mar 22, 2017 at 1:49 PM, Casey Stella <ceste...@gmail.com>
> wrote:
> >
> >> Maybe, but I'd argue that we would want this to be run against a
> >> non-ansible installed cluster.  For a first pass, I'd recommend just a
> set
> >> of shell scripts utilizing the REPL and the REST API along with shell
> >> commands.  Most of our capabilities are quite scriptable.
> >>
> >> On Wed, Mar 22, 2017 at 2:47 PM, Ryan Merriman <merrim...@gmail.com>
> >> wrote:
> >>
> >> > Bumping this thread.  Looks like we have several +1s so I propose we
> >> move
> >> > to the next step.  I'm anxious to get this done because these tests
> >> would
> >> > have saved me time over the last couple weeks.  The management UI in
> >> > https://github.com/apache/incubator-metron/pull/484 has a set of e2e
> >> tests
> >> > being maintained in another branch so those could also be included in
> >> this
> >> > test suite when the UI makes it into master.
> >> >
> >> > Ideas for an "Acceptance Testing Framework"?  Could Ansible be good
> fit
> >> for
> >> > this since we already have it in our stack?
> >> >
> >> > On Mon, Mar 6, 2017 at 1:01 PM, Michael Miklavcic <
> >> > michael.miklav...@gmail.com> wrote:
> >> >
> >> > > Ok, yes I agree. In my experience with e2e/acceptance tests, they're
> >> best
> >> > > kept general with an emphasis on verifying that all the plumbing
> works
> >> > > together. So yes, there are definite edge cases I think we'll want
> to
> >> > test
> >> > > here, but I say that with the caveat that I think we should ideally
> >> cover
> >> > > as many non-happy-path cases in unit and integration tests as
> >> possible.
> >> > As
> >> > > an example, I don't think it makes sense to cover most of the
> profiler
> >> > > windowing DSL language edge cases in acceptance tests instead of or
> in
> >> > > addition to unit/integration tests unless there is something
> specific
> >> to
> >> > > the integration with a given an environment that we think could be
> >> > > problematic.
> >> > >
> >> > > M
> >> > >
> >> > > On Mon, Mar 6, 2017 at 11:32 AM, Casey Stella <ceste...@gmail.com>
> >> > wrote:
> >> > >
> >> > > > No, I'm saying that they shouldn't be restricted to real-world
> >> > use-cases.
> >> > > > The E2E tests I laid out weren't real-world, but they did exercise
> >> the
> >> > > > components similar to real-world use-cases.  They should also be
> >> > > > able to tread outside of the happy-path for those use-cases.
> >> > > >
> >> > > > On Mon, Mar 6, 2017 at 6:30 PM, Michael Miklavcic <
> >> > > > michael.miklav...@gmail.com> wrote:
> >> > > >
> >> > > > > "I don't think acceptance tests should loosely associate with
> real
> >> > > uses,
> >> > > > > but they should
> >> > > > > be free to delve into weird non-happy-pathways."
> >> > > > >
> >> > > > > Not following - are you saying they should *tightly* associate
> >> with
> >> > > real
> >> > > > > uses and additionally include non-happy-path?
> >> > > > >
> >> > > > > On Fri, Mar 3, 2017 at 12:57 PM, Casey Stella <
> ceste...@gmail.com
> >> >
> >> > > > wrote:

Re: [DISCUSS][PROPOSAL] Acceptance Tests

2017-03-22 Thread Casey Stella
I'd be ok with ansible, but we've run into such trouble with
incompatibilities between versions that I'm still a bit gun-shy about it.
That being said, I'm ok with it.

On Wed, Mar 22, 2017 at 3:00 PM, Ryan Merriman <merrim...@gmail.com> wrote:

> I don't think a cluster installed by ansible is a prerequisite to using
> ansible to integration test.  They would be completely separate modules
> except maybe sharing some property or inventory files.  Just need to run
> scripts and hit rest endpoints right?  Just an idea, maybe it's overkill.
> I'm cool with rolling our own.
>
> On Wed, Mar 22, 2017 at 1:49 PM, Casey Stella <ceste...@gmail.com> wrote:
>
> > Maybe, but I'd argue that we would want this to be run against a
> > non-ansible installed cluster.  For a first pass, I'd recommend just a
> set
> > of shell scripts utilizing the REPL and the REST API along with shell
> > commands.  Most of our capabilities are quite scriptable.
> >
> > On Wed, Mar 22, 2017 at 2:47 PM, Ryan Merriman <merrim...@gmail.com>
> > wrote:
> >
> > > Bumping this thread.  Looks like we have several +1s so I propose we
> move
> > > to the next step.  I'm anxious to get this done because these tests
> would
> > > have saved me time over the last couple weeks.  The management UI in
> > > https://github.com/apache/incubator-metron/pull/484 has a set of e2e
> > tests
> > > being maintained in another branch so those could also be included in
> > this
> > > test suite when the UI makes it into master.
> > >
> > > Ideas for an "Acceptance Testing Framework"?  Could Ansible be good fit
> > for
> > > this since we already have it in our stack?
> > >
> > > On Mon, Mar 6, 2017 at 1:01 PM, Michael Miklavcic <
> > > michael.miklav...@gmail.com> wrote:
> > >
> > > > Ok, yes I agree. In my experience with e2e/acceptance tests, they're
> > best
> > > > kept general with an emphasis on verifying that all the plumbing
> works
> > > > together. So yes, there are definite edge cases I think we'll want to
> > > test
> > > > here, but I say that with the caveat that I think we should ideally
> > cover
> > > > as many non-happy-path cases in unit and integration tests as
> possible.
> > > As
> > > > an example, I don't think it makes sense to cover most of the
> profiler
> > > > windowing DSL language edge cases in acceptance tests instead of or
> in
> > > > addition to unit/integration tests unless there is something specific
> > to
> > > > > the integration with a given environment that we think could be
> > > > problematic.
> > > >
> > > > M
> > > >
> > > > On Mon, Mar 6, 2017 at 11:32 AM, Casey Stella <ceste...@gmail.com>
> > > wrote:
> > > >
> > > > > No, I'm saying that they shouldn't be restricted to real-world
> > > use-cases.
> > > > > The E2E tests I laid out weren't real-world, but they did exercise
> > the
> > > > > > components similar to real-world use-cases.  They should also be
> > > > > > able to tread outside of the happy-path for those use-cases.
> > > > >
> > > > > On Mon, Mar 6, 2017 at 6:30 PM, Michael Miklavcic <
> > > > > michael.miklav...@gmail.com> wrote:
> > > > >
> > > > > > "I don't think acceptance tests should loosely associate with
> real
> > > > uses,
> > > > > > but they should
> > > > > > be free to delve into weird non-happy-pathways."
> > > > > >
> > > > > > Not following - are you saying they should *tightly* associate
> with
> > > > real
> > > > > > > uses and additionally include non-happy-path?
> > > > > >
> > > > > > On Fri, Mar 3, 2017 at 12:57 PM, Casey Stella <
> ceste...@gmail.com>
> > > > > wrote:
> > > > > >
> > > > > > > It is absolutely not a naive question, Matt.  We don't have a
> lot
> > > (or
> > > > > > any)
> > > > > > > docs about our integration tests; it's more of a "follow the
> > lead"
> > > > type
> > > > > > of
> > > > > > > thing at the moment, but that should be rectified.
> > > > > > >
> > > > > > > The integration tests

Re: [DISCUSS][PROPOSAL] Acceptance Tests

2017-03-22 Thread Casey Stella
Maybe, but I'd argue that we would want this to be run against a
non-ansible installed cluster.  For a first pass, I'd recommend just a set
of shell scripts utilizing the REPL and the REST API along with shell
commands.  Most of our capabilities are quite scriptable.
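
As a rough illustration of what such a scripted first pass could look like,
here is a minimal sketch.  It is written in Python rather than shell purely
for illustration.  The helper names (`http_get_json`, `endpoint_is_up`,
`run_checks`) and the example URL and path are hypothetical, not part of
Metron or its REST API; the only real calls used are from the Python
standard library.

```python
import json
import urllib.request


def http_get_json(url, timeout=10):
    """Fetch a URL and decode the body as JSON (standard library only)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.status, json.loads(resp.read().decode("utf-8"))


def endpoint_is_up(base_url, path, fetch=http_get_json):
    """Return True when GET base_url + path answers with HTTP 200.

    `fetch` is injectable so the check can be exercised without a cluster.
    """
    status, _body = fetch(base_url.rstrip("/") + path)
    return status == 200


def run_checks(checks):
    """Run (name, callable) pairs against a cluster.

    A check passes only if it returns a truthy value; any exception
    (connection refused, HTTP error, bad JSON) counts as a failure.
    """
    results = {}
    for name, check in checks:
        try:
            results[name] = bool(check())
        except Exception:
            results[name] = False
    return results


if __name__ == "__main__":
    # Hypothetical host and path: substitute whatever the cluster exposes.
    base = "http://node1:8080"
    report = run_checks([
        ("rest-reachable", lambda: endpoint_is_up(base, "/hypothetical/health")),
    ])
    for name, ok in report.items():
        print(("PASS" if ok else "FAIL"), name)
```

Keeping each check a plain callable means REPL-driven or shell-driven steps
could be added to the same list later without changing the runner.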

On Wed, Mar 22, 2017 at 2:47 PM, Ryan Merriman <merrim...@gmail.com> wrote:

> Bumping this thread.  Looks like we have several +1s so I propose we move
> to the next step.  I'm anxious to get this done because these tests would
> have saved me time over the last couple weeks.  The management UI in
> https://github.com/apache/incubator-metron/pull/484 has a set of e2e tests
> being maintained in another branch so those could also be included in this
> test suite when the UI makes it into master.
>
> Ideas for an "Acceptance Testing Framework"?  Could Ansible be good fit for
> this since we already have it in our stack?
>
> On Mon, Mar 6, 2017 at 1:01 PM, Michael Miklavcic <
> michael.miklav...@gmail.com> wrote:
>
> > Ok, yes I agree. In my experience with e2e/acceptance tests, they're best
> > kept general with an emphasis on verifying that all the plumbing works
> > together. So yes, there are definite edge cases I think we'll want to
> test
> > here, but I say that with the caveat that I think we should ideally cover
> > as many non-happy-path cases in unit and integration tests as possible.
> As
> > an example, I don't think it makes sense to cover most of the profiler
> > windowing DSL language edge cases in acceptance tests instead of or in
> > addition to unit/integration tests unless there is something specific to
> > the integration with a given environment that we think could be
> > problematic.
> >
> > M
> >
> > On Mon, Mar 6, 2017 at 11:32 AM, Casey Stella <ceste...@gmail.com>
> wrote:
> >
> > > No, I'm saying that they shouldn't be restricted to real-world
> use-cases.
> > > The E2E tests I laid out weren't real-world, but they did exercise the
> > > > components similar to real-world use-cases.  They should also be able
> > > > to tread outside of the happy-path for those use-cases.
> > >
> > > On Mon, Mar 6, 2017 at 6:30 PM, Michael Miklavcic <
> > > michael.miklav...@gmail.com> wrote:
> > >
> > > > "I don't think acceptance tests should loosely associate with real
> > uses,
> > > > but they should
> > > > be free to delve into weird non-happy-pathways."
> > > >
> > > > Not following - are you saying they should *tightly* associate with
> > real
> > > > > uses and additionally include non-happy-path?
> > > >
> > > > On Fri, Mar 3, 2017 at 12:57 PM, Casey Stella <ceste...@gmail.com>
> > > wrote:
> > > >
> > > > > It is absolutely not a naive question, Matt.  We don't have a lot
> (or
> > > > any)
> > > > > docs about our integration tests; it's more of a "follow the lead"
> > type
> > > > of
> > > > > thing at the moment, but that should be rectified.
> > > > >
> > > > > The integration tests spin up and down infrastructure in-process,
> > some
> > > of
> > > > > which are real and some of which are mock versions of the services.
> > > > These
> > > > > are good for catching some types of bugs, but often things sneak
> > > through,
> > > > > like:
> > > > >
> > > > >- Hbase and storm can't exist in the same JVM, so HBase is
> mocked
> > in
> > > > >those cases.
> > > > >- The FileSystem that we get for Hadoop is the
> LocalRawFileSystem,
> > > not
> > > > >truly HDFS.  There are differences and we've run into
> > > > them..hilariously
> > > > > at
> > > > >times. ;)
> > > > >- Things done statically in a bolt are shared across all bolts
> > > because
> > > > >they all are threads in the same process
> > > > >
> > > > > It's good, it catches bugs, it lets us debug things easily, it runs
> > > with
> > > > > every single build automatically via travis.
> > > > > It's bad because it's awkward to get the dependencies isolated
> > > > sufficiently
> > > > > for all of these components to get them to play nice in the same
> JVM.
> > > > >
> > > > > Acceptance tests would be run against a real cluster, so

Re: [DISCUSS] Stepping down as release manager

2017-03-21 Thread Casey Stella
Right, Billie is exactly right.  Working with the community to construct
releases that conform to Apache standards and policies is the main duty.
This will (hopefully) be our first set of releases outside of the
incubator, so if I'm allowed to be biased, I'm hoping that someone with
previous release management experience in other projects will volunteer.
We're leaving the nest a bit and having an experienced hand at the tiller
would be advantageous.


On Tue, Mar 21, 2017 at 10:21 AM, Billie Rinaldi <bil...@apache.org> wrote:

> See http://www.apache.org/dev/release-publishing#release_manager and
> http://www.apache.org/legal/release-policy.html for information on the
> tasks that a release manager performs.
>
> On Tue, Mar 21, 2017 at 7:10 AM, Khurram Ahmed <khurramah...@gmail.com>
> wrote:
>
> > Casey it would be helpful if you could outline the responsibilities of a
> > release manager for the Metron project.
> >
> > On Mar 21, 2017 6:57 PM, "Casey Stella" <ceste...@gmail.com> wrote:
> >
> > > I've been extremely honored to spend the last few months as the Metron
> > > Release Manager.  That being said, my watch is ended and it's time for
> > > another release manager to step into my place.
> > >
> > > Who would like to volunteer to be release manager for the next release
> of
> > > Metron?
> > >
> > > Best,
> > >
> > > Casey
> > >
> >
>


[DISCUSS] Stepping down as release manager

2017-03-21 Thread Casey Stella
I've been extremely honored to spend the last few months as the Metron
Release Manager.  That being said, my watch is ended and it's time for
another release manager to step into my place.

Who would like to volunteer to be release manager for the next release of
Metron?

Best,

Casey


Re: [ANNOUNCE] Apache Metron (incubating) 0.3.1 is released

2017-03-17 Thread Casey Stella
ion needs to be updated
closes apache/incubator-metron#407
* METRON-645 Unable to Start Fastcapa Test Environment (nickwallen)
closes apache/incubator-metron#410
* METRON-647 Parser unit test failures due to assumed year values
(justinleet) closes apache/incubator-metron#412
* METRON-639: The Network Stellar functions need to have better unit
testing closes apache/incubator-metron#402
* METRON-631 Broken link on fastcapa README (JonZeolla via nickwallen)
closes apache/incubator-metron#398
* METRON-625: Parser Filters cannot be specified from the sensor
config closes apache/incubator-metron#396
* METRON-587 Integration tests should use common processor
implementations where possible (ottobackwards) closes
apache/incubator-metron#374
* METRON-616: Added support for float and long literals in Stellar
closes apache/incubator-metron#392
* METRON-580: Remove hard-coded Metron version from Ambari MPack code
(mmiklavc) closes apache/incubator-metron#364
* METRON-618 Eliminate Javac Warnings in metron-analytics (justinleet)
closes apache/incubator-metron#391
* METRON-364: Preserve Type for Arithmetic Expressions in Stellar
closes apache/incubator-metron#390
* METRON-585 Pcap Replay Fails to Stop (nickwallen) closes
apache/incubator-metron#369
* METRON-586 STELLAR should have FILL_LEFT and FILL_RIGHT functions
(ottobackwards) closes apache/incubator-metron#370
* METRON-612 Clean up Error Prone generated warnings (justinleet)
closes apache/incubator-metron#389
* METRON-610: OnlineStatisticsProvider serialization is broken at
random in the REPL closes apache/incubator-metron#388
* METRON-596 Eliminate Maven warnings from build (justinleet) closes
apache/incubator-metron#378
* METRON-604 Mpack installs do not work on clean machines due to
missing Elastic Curator repo (justinleet) closes
apache/incubator-metron#385
* METRON-607: Enrichment doc improvement and test cleanup (mmiklavc)
closes apache/incubator-metron#386
* METRON-606 Profiler Overwriting Previously Written Values
(nickwallen) closes apache/incubator-metron#387
* METRON-591: Make the website in compliance with ASF standards closes
apache/incubator-metron#373
* METRON-597 Sporadic Failures of Profiler Integration Tests
(nickwallen) closes apache/incubator-metron#383
* METRON-595 Elasticsearch Writer only uses One IP Address
(JonathanRider via dlyle65535) closes apache/incubator-metron#379
* METRON-593 Enable an automated static analysis tool in the build
(justinleet) closes apache/incubator-metron#376
* METRON-592 Change popup to Warning message in Assign Master page for
Client check (anandsubbu via dlyle65535) closes
apache/incubator-metron#375
* METRON-598 Add Kyle Richardson to committers (kylerichardson) closes
apache/incubator-metron#382
* METRON-565: apps/metron/enrichment/indexed directory path does not
get created for metron cluster deployed via Ambari (mmiklavc) closes
apache/incubator-metron#365
* METRON-588 Remove ASF License Headers from directly copied Chef
Bento Files closes apache/incubator-metron#372
* METRON-576 Stellar function resolution takes too long on running
cluster (nickwallen) closes apache/incubator-metron#366
* METRON-570 Add Profiler Link to README (nickwallen) closes
apache/incubator-metron#371
* METRON-578 Missing error handling bolts for enrichment and threat
intel closes apache/incubator-metron#363
* METRON-522 Need to mandate Client installation on Metron Host closes
apache/incubator-metron#367
* METRON-584 STELLAR should have a TO_LONG Function (ottobackwards)
closes apache/incubator-metron#368
* METRON-575 State from different profiles can be co-mingled
incorrectly (nickwallen) closes apache/incubator-metron#362
* METRON-538 Ensure proper shutdown, exceptions of integration test
components (ottobackwards) closes apache/incubator-metron#344
* METRON-567 Usernames as numerics strings attempted to be parsed and
compared as numbers (ottobackwards) closes apache/incubator-metron#360
* METRON-562: Add rudimentary statistical outlier detection closes
apache/incubator-metron#352
* METRON-557 Create Stellar Functions for Kafka (nickwallen) closes
apache/incubator-metron#354
* METRON-556 Profiler - Refactor 'Group By' Calculation (nickwallen)
closes apache/incubator-metron#348
* METRON-547 MaasIntegrationTests should integrate with the Metron
Integration classes (ottobackwards) closes apache/incubator-metron#350
* METRON-561 ShellEditor tests hang if vim is set as EDITOR in your
profile (ottobackwards) closes apache/incubator-metron#351


On Fri, Mar 17, 2017 at 11:18 AM, Casey Stella <ceste...@gmail.com> wrote:

> I am very proud to announce that the 0.3.1 release bits have been
> released.  You can see this reflected on our website at
> http://metron.apache.org/documentation/#releases  Also, I want to point
> out that our github documentation for the release is currently located at
> http://metron.apache.org/current-book/index.html and linked from the
> release page (Thanks Matt for making that happen!).
>

[ANNOUNCE] Apache Metron (incubating) 0.3.1 is released

2017-03-17 Thread Casey Stella
I am very proud to announce that the 0.3.1 release bits have been
released.  You can see this reflected on our website at
http://metron.apache.org/documentation/#releases  Also, I want to point out
that our github documentation for the release is currently located at
http://metron.apache.org/current-book/index.html and linked from the
release page (Thanks Matt for making that happen!).

I'm particularly proud of this release as it'll be the release on which we
base our exit from the incubator.  I really appreciate all of the
contributions that everyone made to make this possible.  Heartfelt
gratitude goes out to the community, the committers, the contributors and
the mentors for making this happen.  In the best tradition of open source
software, it took a village to build a Metron. :)

Best,

Casey

PS. I still have some JIRA work to do to clean up from this release; I'll
be doing that by the end of the weekend.


Re: [DISCUSS] Apache Rat Exclusions

2017-03-16 Thread Casey Stella
Thanks, Dave!  Much appreciated insight.

On Thu, Mar 16, 2017 at 4:21 PM, David Lyle <dlyle65...@gmail.com> wrote:

> Hi Casey,
>
> I know a couple.
>
>
> - **/dependency-reduced-pom.xml
>- *Does anyone know if we can adjust the dependency-reduced-pom to have
>   a license?*
>
> I don't think that should be committed, it may have gotten in there by
> mistake. Those are generated on each build. I'd remove the exclusion.
>
>  **/ansible.cfg
>- *This is YAML, right?  YAML has comments, IIRC.  If so we should
>   license it.*
>
> Not YAML, more like ini. Supports comments and the copies I looked at had
> licenses. I'd remove the exclusion.
>
>  - *These are bundled source scripts, which falls under the rules
>   around bundling.  What are their licenses and did we include
> these scripts
>   in the LICENSE?*
>
> These come from Bento and are Apache licensed. LICENSED and NOTICED.
>
> Agree with your conclusions above.
>
> -D...
>
>
>
>
> On Thu, Mar 16, 2017 at 3:25 PM, Casey Stella <ceste...@gmail.com> wrote:
>
> > As part of the mentor feedback on our 0.3.1 release, it was noted that we
> > have broad apache rat exclusions in our top level pom.  It was suggested
> > that we distribute those to the individual relevant modules rather than
> > having them in the top level pom.  The concern is that, with broad
> > exclusions from rat, we might be out of compliance with respect to
> > licensing everything which can take a comment.  If we intend on moving on
> > to a top level
> > project, I think this should at the very least be addressed.
> >
> > As a prerequisite to that work (and a stop-gap), I'd like to understand
> the
> > nature of those exclusions so we can at least justify them.  Where we
> > cannot justify the exclusion, we need to correct it.  This ideally should
> > happen before we attempt to go to a top level project, in my opinion.
> >
> > I have listed them here with my comments.  The bolded comments are the
> ones
> > most concerning to me or that I had a pointed question about.
> >
> >
> >- **/*.md
> >- *This seems wrong, markdown can take comments, so we should have an
> >   apache license header, right?*
> >- **/VERSION
> >- *This is coming from the python sensor plugins.  Any suggested
> >   verbiage around this for the comment?*
> >- **/*.json
> >- I think this is ok as JSON can't have comments, right?
> >- **/*.tokens
> >- These are generated data files from antlr and don't appear to take
> >   comments, so I think they're ok
> >- **/*.log
> >- Log files aren't source, so I think this is ok
> >- **/*.template
> >- ES Templates are JSON and can't have comments
> >- **/.*
> >- *Which dotfiles are we explicitly excluding?  Some dotfiles take
> >   comments and could be licensed.*
> >- **/.*/**
> >- ditto
> >- **/*.seed
> >- *I don't see this anywhere, where did this one come from?*
> >- **/*.iml
> >- IDE files
> >- **/ansible.cfg
> >- *This is YAML, right?  YAML has comments, IIRC.  If so we should
> >   license it.*
> >- **/*.rpm
> >- This is generated, binary files, so it should be ok.
> >- site/**
> >- *This seems wrong.  It's jekyll format, so we can't put the license
> at
> >   the top, but we can attach a license in the generated HTML, right?*
> >- **/src/main/resources/patterns/**
> >- *Does Grok allow for comments?  If so, we should license these.*
> >- **/src/main/sample/patterns/**
> >- *Ditto*
> >- **/src/test/resources/**
> >- *This seems overly broad, to me.  We should at least do it via
> >   extension.  This way if someone adds something that should be
> > licensed, we
> >   know about it, right?*
> >- **/src/main/sample/data/**
> >- *This is raw data and ok, but maybe it'd be good to make this
> project
> >   specific.*
> >- **/dependency-reduced-pom.xml
> >- *Does anyone know if we can adjust the dependency-reduced-pom to
> have
> >   a license?*
> >- **/target/**
> >- **/bro-plugin-kafka/build/**
> >- The output of the build, so I think that's ok
> >- **/packer-build/scripts/**
> >   - *These are bundled source scripts, which falls under the rules
> >   around bundling.  What are their licenses and did we include
> > these scripts
> >   in the LICENSE?*
> >- **/packer-build/bin/**
> >- This seems to be binary data, so ok
> >- **/packer_cache/**
> >- This seems to be binary data, so ok
> >- **/hbase/data/**
> >- This seems to be binary data, so ok
> >- **/kafkazk/data/**
> >- This seems to be binary data, so ok
> >- **/wait-for-it.sh
> >- This one is cool and explicitly listed in the LICENSE as MIT
> licensed
> >- **/*.out
> >- This seems to be generated output, so ok.
> >
>


[DISCUSS] Apache Rat Exclusions

2017-03-16 Thread Casey Stella
As part of the mentor feedback on our 0.3.1 release, it was noted that we
have broad apache rat exclusions in our top level pom.  It was suggested
that we distribute those to the individual relevant modules rather than
having them in the top level pom.  The concern is that, with broad exclusions
from rat, we might be out of compliance with respect to licensing everything
which can take a comment.  If we intend on moving on to a top level
project, I think this should at the very least be addressed.

As a prerequisite to that work (and a stop-gap), I'd like to understand the
nature of those exclusions so we can at least justify them.  Where we
cannot justify the exclusion, we need to correct it.  This ideally should
happen before we attempt to go to a top level project, in my opinion.

I have listed them here with my comments.  The bolded comments are the ones
most concerning to me or that I had a pointed question about.


   - **/*.md
      - *This seems wrong, markdown can take comments, so we should have an
        apache license header, right?*
   - **/VERSION
      - *This is coming from the python sensor plugins.  Any suggested
        verbiage around this for the comment?*
   - **/*.json
      - I think this is ok as JSON can't have comments, right?
   - **/*.tokens
      - These are generated data files from antlr and don't appear to take
        comments, so I think they're ok
   - **/*.log
      - Log files aren't source, so I think this is ok
   - **/*.template
      - ES Templates are JSON and can't have comments
   - **/.*
      - *Which dotfiles are we explicitly excluding?  Some dotfiles take
        comments and could be licensed.*
   - **/.*/**
      - ditto
   - **/*.seed
      - *I don't see this anywhere, where did this one come from?*
   - **/*.iml
      - IDE files
   - **/ansible.cfg
      - *This is YAML, right?  YAML has comments, IIRC.  If so we should
        license it.*
   - **/*.rpm
      - These are generated binary files, so they should be ok.
   - site/**
      - *This seems wrong.  It's jekyll format, so we can't put the license at
        the top, but we can attach a license in the generated HTML, right?*
   - **/src/main/resources/patterns/**
      - *Does Grok allow for comments?  If so, we should license these.*
   - **/src/main/sample/patterns/**
      - *Ditto*
   - **/src/test/resources/**
      - *This seems overly broad, to me.  We should at least do it via
        extension.  This way if someone adds something that should be
        licensed, we know about it, right?*
   - **/src/main/sample/data/**
      - *This is raw data and ok, but maybe it'd be good to make this project
        specific.*
   - **/dependency-reduced-pom.xml
      - *Does anyone know if we can adjust the dependency-reduced-pom to have
        a license?*
   - **/target/**
   - **/bro-plugin-kafka/build/**
      - The output of the build, so I think that's ok
   - **/packer-build/scripts/**
      - *These are bundled source scripts, which falls under the rules
        around bundling.  What are their licenses and did we include
        these scripts in the LICENSE?*
   - **/packer-build/bin/**
      - This seems to be binary data, so ok
   - **/packer_cache/**
      - This seems to be binary data, so ok
   - **/hbase/data/**
      - This seems to be binary data, so ok
   - **/kafkazk/data/**
      - This seems to be binary data, so ok
   - **/wait-for-it.sh
      - This one is cool and explicitly listed in the LICENSE as MIT licensed
   - **/*.out
      - This seems to be generated output, so ok.
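
To make the audit above more concrete, here is a minimal sketch of how one
might list files that fall outside a set of exclusions and still lack a
license header.  It is written in Python for illustration and is not how
Apache Rat itself works.  The function names (`ant_glob_to_regex`,
`is_excluded`, `find_unlicensed`), the in-memory `files` mapping and the
simplified Ant-style glob handling are all assumptions made for this
example; the marker string is the opening phrase of the standard ASF source
header.

```python
import re

# Opening phrase of the standard ASF source header.
ASF_MARKER = "Licensed to the Apache Software Foundation"


def ant_glob_to_regex(pattern):
    """Translate a simplified Ant-style glob into a compiled regex.

    Assumed semantics: '**/' matches zero or more directories, a bare '**'
    matches anything, and '*' matches within a single path segment.
    """
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            parts.append("(?:.*/)?")
            i += 3
        elif pattern.startswith("**", i):
            parts.append(".*")
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts) + r"\Z")


def is_excluded(path, exclusions):
    """True if the '/'-separated path matches any exclusion glob."""
    return any(ant_glob_to_regex(p).match(path) for p in exclusions)


def find_unlicensed(files, exclusions, marker=ASF_MARKER):
    """Return sorted paths that are not excluded and lack the marker.

    `files` maps a relative path to its text content; only the first
    2000 characters are inspected, since a header sits at the top.
    """
    return sorted(
        path
        for path, text in files.items()
        if not is_excluded(path, exclusions) and marker not in text[:2000]
    )
```

Running `find_unlicensed` once with the current exclusions and once with a
narrower list shows which files a given exclusion is actually hiding, which
is the question being asked of each entry above.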


Re: [VOTE] Final Board Resolution Draft

2017-03-15 Thread Casey Stella
+1 binding

On Wed, Mar 15, 2017 at 3:24 PM, James Sirota <jsir...@apache.org> wrote:

> Please vote 1 for OK, -1 for not OK, and 0 for neutral.  The vote will be
> open for 72 hours.
>
>
> The incubating Apache Metron community believes it is time to graduate to
> TLP.
>
> Apache Metron entered incubation in December of 2015.  Since then, we've
> overcome technical challenges to remove Category X dependencies, and made 3
> releases.  Our most recent release contains binary convenience artifacts.
> We are a very helpful and engaged community, ready to answer all questions
> and feedback directed to us via the user list.  Through our time in
> incubation we've added a number of committers and promoted some of them to
> PPMC membership.  We are actively pursuing others. While we do still have
> issues to address raised by means of the maturity model, all projects are
> ongoing processes, and we believe we no longer need the incubator to
> continue addressing these issues.
>
> To inform the discussion, here is some basic project information:
>
> Project status:
>   http://incubator.apache.org/projects/metron.html
>
> Project website:
>   https://metron.incubator.apache.org/
>
> Project documentation:
>   https://cwiki.apache.org/confluence/display/METRON/Documentation
>
> Maturity assessment:
>   https://cwiki.apache.org/confluence/display/METRON/Apache+Project+Maturity+Model
>
> DRAFT of the board resolution is at the bottom of this email
>
> Proposed PMC size: 25 members
>
> Total number of committers: 6 members
>
> PMC affiliation (* indicates chair)
> * Hortonworks
> Cisco
> Rackspace
> B23
> Mantech
> Kinetica
>
> Committer affiliation:
> Leidos
> Paychex
> Carnegie Mellon University
> Hortonworks
>
> 516 commits on develop
> 34 contributors across all branches
>
> dev list averaged ~650 msgs/month for the last 3 months
>
>
> Resolution:
>
> Establish the Apache Metron Project
>
> WHEREAS, the Board of Directors deems it to be in the best
> interests of the Foundation and consistent with the
> Foundation's purpose to establish a Project Management
> Committee charged with the creation and maintenance of
> open-source software, for distribution at no charge to the
> public, related to a security analytics platform for big data use cases.
>
> NOW, THEREFORE, BE IT RESOLVED, that a Project Management
> Committee (PMC), to be known as the "Apache Metron Project",
> be and hereby is established pursuant to Bylaws of the
> Foundation; and be it further
>
> RESOLVED, that the Apache Metron Project be and hereby is
> responsible for the creation and maintenance of software
> related to:
> (a) A mechanism to capture, store, and normalize any type of security
> telemetry at extremely high rates.
> (b) Real time processing and application of enrichments
> (c) Efficient information storage
> (d) An interface that gives a security investigator a centralized view
> of data and alerts passed through the system.
>
> RESOLVED, that the office of "Vice President, Apache Metron" be
> and hereby is created, the person holding such office to
> serve at the direction of the Board of Directors as the chair
> of the Apache Metron Project, and to have primary responsibility
> for management of the projects within the scope of
> responsibility of the Apache Metron Project; and be it further
>
> RESOLVED, that the persons listed immediately below be and
> hereby are appointed to serve as the initial members of the
> Apache Metron Project:
>
>
> PPMC:
> Mark Bittmann
> Sheetal Dolas
> Debo Dutta
> Discovery Gerdes
> P. Taylor Goetz
> Andrew Hartnett
> Dave Hirko
> Paul Kehrer
> Brad Kolarov
> Kiran Komaravolu
> Larry McCay
> Ryan Merriman
> Michael Perez
> Charles Porter
> Phillip Rhodes
> Sean Schulte
> James Sirota
> Casey Stella
> Bryan Taylor
> Ray Urciuoli
> Vinod Kumar Vavilapalli
> George Vetticaden
> Oskar Zabik
> David Lyle
> Nick Allen
>
> Committers:
> Otto Fowler
> Kyle Richardson
> Justin Leet
> Michael Miklavcic
> Jon Zeolla
> Matt Foley
>
>
> NOW, THEREFORE, BE IT FURTHER RESOLVED, that Casey Stella
> be appointed to the office of Vice President, Apache Metron, to
> serve in accordance with and subject to the direction of the
> Board of Directors and the Bylaws of the Foundation until
> death, resignation, retirement, removal or disqualification,
> or until a successor is appointed; and be it further
>
> RESOLVED, that the initial Apache Metron PMC be and hereby is
> tasked with the creation of a set of bylaws intended to
> encourage open devel

Re: [MENTORS] POLL: Staying after graduation

2017-03-13 Thread Casey Stella
+1 to that.  Thanks Billie!

On Mon, Mar 13, 2017 at 6:28 PM, James Sirota  wrote:

> Billie,  thank you for your guidance and support. We really appreciate you
> being a part of this journey with us
>
> 13.03.2017, 08:09, "Billie Rinaldi" :
> > James,
> >
> > Thanks for asking. Unfortunately my time is stretched a bit thin at this
> > point, so I won't be able to continue with Metron. I am still very
> excited
> > about Metron's probable graduation!
> >
> > Billie
> >
> > On Sun, Mar 12, 2017 at 8:52 PM, James Sirota 
> wrote:
> >
> >>  As we are getting ready to graduate I wanted to poll the mentors and
> see
> >>  who wanted to stay on with the project after our graduation. You guys
> did
> >>  a great job with us and I personally would love to see you stay on.
> >>
> >>  ---
> >>  Thank you,
> >>
> >>  James Sirota
> >>  PPMC- Apache Metron (Incubating)
> >>  jsirota AT apache DOT org
>
> ---
> Thank you,
>
> James Sirota
> PPMC- Apache Metron (Incubating)
> jsirota AT apache DOT org
>


Re: [VOTE] Metron to graduate to TLP

2017-03-13 Thread Casey Stella
+1 (binding)

On Mon, Mar 13, 2017 at 6:37 PM, James Sirota  wrote:

> +1 (binding)
>
> 13.03.2017, 15:37, "James Sirota" :
> > Do we feel it's time for us to exit the Apache incubator and petition to
> make Metron a TLP?
> >
> > Please vote 1 for yes, -1 for no, 0 for neutral.
> >
> > The vote will be open for 72 hours
> >
> > ---
> > Thank you,
> >
> > James Sirota
> > PPMC- Apache Metron (Incubating)
> > jsirota AT apache DOT org
>
> ---
> Thank you,
>
> James Sirota
> PPMC- Apache Metron (Incubating)
> jsirota AT apache DOT org
>


Re: [DISCUSS] Metron VP

2017-03-13 Thread Casey Stella
Thanks so much for the confidence, James.  I'm happy to serve if selected.

On Mon, Mar 13, 2017 at 12:20 AM, James Sirota <jsir...@apache.org> wrote:

> I would like to propose that Casey Stella be our VP upon graduation.  I
> think he has been the most outspoken proponent of the "Apache way" on our
> project and has made very significant contributions to moving it forward.
>
> ---
> Thank you,
>
> James Sirota
> PPMC- Apache Metron (Incubating)
> jsirota AT apache DOT org
>


Re: [Discuss] SIDELOADING PARSERS: Parsers as components

2017-03-10 Thread Casey Stella
Ok, so, after some thought about this, I am in agreement over Nar.  I do
want to make sure that on the roadmap we retrofit stellar to accept Nar
plugins and build an archetype for it.  We should have a single strategy
for plugins.  Not saying it has to be part of the same PR, but it needs to
be associated and a follow-on task IMO.

On Fri, Mar 10, 2017 at 4:06 PM, Otto Fowler <ottobackwa...@gmail.com>
wrote:

> Also a Nar can depend on ‘one’ other nar, which is interesting
>
>
> On March 10, 2017 at 16:02:18, Otto Fowler (ottobackwa...@gmail.com)
> wrote:
>
> The isolation is just a ‘extra’ in the parser case.
>
> The parts of Nar that *are* more pertinent:
>
> * supporting a deployment artifact with just a jar, or a tar.gz with a jar
> and jar dependencies in it
> * taking a ‘package’ and deploying it for loading ( which will upgrade the
> deployed part if it is newer )
> * setting up the classloader hierarchy between the ‘system’ and provided
> things, and the dependencies of the individual plugin
>
>
>
> On March 10, 2017 at 15:56:08, Casey Stella (ceste...@gmail.com) wrote:
>
> Why would we need classpath isolation here in the case of the parser?
>
> On Fri, Mar 10, 2017 at 3:55 PM, Otto Fowler <ottobackwa...@gmail.com>
> wrote:
>
>> I *would* use the classloader part, extending it with VFS.
>>
>>
>>
>> On March 10, 2017 at 15:53:05, Casey Stella (ceste...@gmail.com) wrote:
>>
>> I'm a bit worried about copying and pasting from the NiFi project their
>> nar infrastructure.  That seems..unclean to me and since we're not using
>> the classloader part of nar for this, does it make more sense to just use
>> jar?
>>
>> On Fri, Mar 10, 2017 at 3:50 PM, Otto Fowler <ottobackwa...@gmail.com>
>> wrote:
>>
>>> Compared to how much time vagrant up takes now, you won’t even notice it
>>> ;)
>>>
>>> That is definitely an option.  I guess what I want to work out is if we
>>> are going to want to
>>> go to NAR, why not just go to NAR.
>>>
>>> In the end, the customer for this - like Jon Zeolla, isn’t going to care
>>> about the intermediate step,
>>> he wants the archetype that builds the ‘metron parser plugin’.
>>>
>>> Which is why I hesitate to put out an archetype that is going to
>>> obsolete so soon.
>>>
>>> Does that make sense?
>>>
>>> On March 10, 2017 at 14:50:55, Casey Stella (ceste...@gmail.com) wrote:
>>>
>>> I'm a little concerned about this increasing the size and length of the
>>> build due to the repeated shading. Should we figure out a way to deploy
>>> jars with provided dependencies on metron-parser-common as suggested in
>>> the
>>> previous JIRAs first?
>>>
>>> On Fri, Mar 10, 2017 at 2:31 PM, Matt Foley <mfo...@hortonworks.com>
>>> wrote:
>>>
>>> > It sounds like:
>>> > - This is a self-contained chunk of work, that can be tested, reviewed,
>>> > and committed on its own, then the other ideas you propose can follow
>>> it.
>>> > - It crosses a lot of lines, and restructures a lot of code, so will
>>> “rot”
>>> > fairly quickly as other people make commits, so if possible you should
>>> get
>>> > a PR out there and we should work through it as soon as possible.
>>> > Are those both true?
>>> >
>>> > How do other people feel about grouping a given sensor’s parser,
>>> enricher,
>>> > indexing logic all together? It seems to have multiple advantages; are
>>> > there also disadvantages?
>>> >
>>> > On 3/10/17, 6:31 AM, "Otto Fowler" <ottobackwa...@gmail.com> wrote:
>>> >
>>> > As previously discussed here, I have been working on side loading of
>>> > parsers. The goals of this work are:
>>> > * Make it possible for developers to create, maintain and deploy parsers
>>> > outside of the Metron code tree and not have to fork
>>> > * Create maven archetype support for developers of parsers
>>> > * Introduce a parser ‘lifecycle’ to support multiple instances and
>>> > configurations, states of being installed, under configuration, and
>>> > deployed
>>> > etc.
>>> >
>>> > I would like to have some discussion based on where I am after rebasing
>>> > onto METRON-671 which revamps deployment to be totally ambari based.
>>> >
>>> >
>>> > Parsers as components:
>>> >
>>> >

Re: [Discuss] SIDELOADING PARSERS: Parsers as components

2017-03-10 Thread Casey Stella
I'm a bit worried about copying and pasting from the NiFi project their nar
infrastructure.  That seems..unclean to me and since we're not using the
classloader part of nar for this, does it make more sense to just use jar?

On Fri, Mar 10, 2017 at 3:50 PM, Otto Fowler <ottobackwa...@gmail.com>
wrote:

> Compared to how much time vagrant up takes now, you won’t even notice it ;)
>
> That is definitely an option.  I guess what I want to work out is if we
> are going to want to
> go to NAR, why not just go to NAR.
>
> In the end, the customer for this - like Jon Zeolla, isn’t going to care
> about the intermediate step,
> he wants the archetype that builds the ‘metron parser plugin’.
>
> Which is why I hesitate to put out an archetype that is going to obsolete
> so soon.
>
> Does that make sense?
>
> On March 10, 2017 at 14:50:55, Casey Stella (ceste...@gmail.com) wrote:
>
> I'm a little concerned about this increasing the size and length of the
> build due to the repeated shading. Should we figure out a way to deploy
> jars with provided dependencies on metron-parser-common as suggested in
> the
> previous JIRAs first?
>
> On Fri, Mar 10, 2017 at 2:31 PM, Matt Foley <mfo...@hortonworks.com>
> wrote:
>
> > It sounds like:
> > - This is a self-contained chunk of work, that can be tested, reviewed,
> > and committed on its own, then the other ideas you propose can follow
> it.
> > - It crosses a lot of lines, and restructures a lot of code, so will
> “rot”
> > fairly quickly as other people make commits, so if possible you should
> get
> > a PR out there and we should work through it as soon as possible.
> > Are those both true?
> >
> > How do other people feel about grouping a given sensor’s parser,
> enricher,
> > indexing logic all together? It seems to have multiple advantages; are
> > there also disadvantages?
> >
> > On 3/10/17, 6:31 AM, "Otto Fowler" <ottobackwa...@gmail.com> wrote:
> >
> > As previously discussed here, I have been working on side loading of
> > parsers. The goals of this work are:
> > * Make it possible for developers to create, maintain and deploy parsers
> > outside of the Metron code tree and not have to fork
> > * Create maven archetype support for developers of parsers
> > * Introduce a parser ‘lifecycle’ to support multiple instances and
> > configurations, states of being installed, under configuration, and
> > deployed
> > etc.
> >
> > I would like to have some discussion based on where I am after rebasing
> > onto METRON-671 which revamps deployment to be totally ambari based.
> >
> >
> > Parsers as components:
> >
> > I have all the parsers broken out into individual packages/rpms/jars.
> > What I have done is taken metron-parsers and broken it out to:
> >
> > * metron-parsers-common
> > * This has all the base classes and interfaces, common testing
> > components
> > etc
> > * metron-parser-base
> > * This has the Grok, CSV, and JsonMap parsers and support
> > * metron-parser-X
> > * A module per parser type which we currently have in the system
> > * Each parser has all the indexing, enrichment and parser
> > configurations
> > for that parser in its package
> >
> > I will go into packaging and deployment issues in another email.
> >
> > I have this all working:
> > * the parsers are built
> > * the parsers are tested
> > * the parsers are integrated into the deployment build such that
> > vagrant up
> > just works as previously in full and quick dev
> > * maven component of rpm docker
> > * the metron.spec file
> > * ambari installation
> > * zookeeper configuration deployment
> > * the ambari parser service code
> > * the Rest interface works
> > * see all installed parser configurations etc
> >
> >
> > So this part of the work, is I think ready for a PR and review/next
> > steps
> > on its own.
> >
> > I think that it sets up the components and is a base for building out
> > the
> > rest of the functionality we want.
> >
> >
> >
>
>


Re: [Discuss] SIDELOADING PARSERS: Parsers as components

2017-03-10 Thread Casey Stella
Why would we need classpath isolation here in the case of the parser?

On Fri, Mar 10, 2017 at 3:55 PM, Otto Fowler <ottobackwa...@gmail.com>
wrote:

> I *would* use the classloader part, extending it with VFS.
>
>
>
> On March 10, 2017 at 15:53:05, Casey Stella (ceste...@gmail.com) wrote:
>
> I'm a bit worried about copying and pasting from the NiFi project their
> nar infrastructure.  That seems..unclean to me and since we're not using
> the classloader part of nar for this, does it make more sense to just use
> jar?
>
> On Fri, Mar 10, 2017 at 3:50 PM, Otto Fowler <ottobackwa...@gmail.com>
> wrote:
>
>> Compared to how much time vagrant up takes now, you won’t even notice it
>> ;)
>>
>> That is definitely an option.  I guess what I want to work out is if we
>> are going to want to
>> go to NAR, why not just go to NAR.
>>
>> In the end, the customer for this - like Jon Zeolla, isn’t going to care
>> about the intermediate step,
>> he wants the archetype that builds the ‘metron parser plugin’.
>>
>> Which is why I hesitate to put out an archetype that is going to obsolete
>> so soon.
>>
>> Does that make sense?
>>
>> On March 10, 2017 at 14:50:55, Casey Stella (ceste...@gmail.com) wrote:
>>
>> I'm a little concerned about this increasing the size and length of the
>> build due to the repeated shading. Should we figure out a way to deploy
>> jars with provided dependencies on metron-parser-common as suggested in
>> the
>> previous JIRAs first?
>>
>> On Fri, Mar 10, 2017 at 2:31 PM, Matt Foley <mfo...@hortonworks.com>
>> wrote:
>>
>> > It sounds like:
>> > - This is a self-contained chunk of work, that can be tested, reviewed,
>> > and committed on its own, then the other ideas you propose can follow
>> it.
>> > - It crosses a lot of lines, and restructures a lot of code, so will
>> “rot”
>> > fairly quickly as other people make commits, so if possible you should
>> get
>> > a PR out there and we should work through it as soon as possible.
>> > Are those both true?
>> >
>> > How do other people feel about grouping a given sensor’s parser,
>> enricher,
>> > indexing logic all together? It seems to have multiple advantages; are
>> > there also disadvantages?
>> >
>> > On 3/10/17, 6:31 AM, "Otto Fowler" <ottobackwa...@gmail.com> wrote:
>> >
>> > As previously discussed here, I have been working on side loading of
>> > parsers. The goals of this work are:
>> > * Make it possible for developers to create, maintain and deploy parsers
>> > outside of the Metron code tree and not have to fork
>> > * Create maven archetype support for developers of parsers
>> > * Introduce a parser ‘lifecycle’ to support multiple instances and
>> > configurations, states of being installed, under configuration, and
>> > deployed
>> > etc.
>> >
>> > I would like to have some discussion based on where I am after rebasing
>> > onto METRON-671 which revamps deployment to be totally ambari based.
>> >
>> >
>> > Parsers as components:
>> >
>> > I have all the parsers broken out into individual packages/rpms/jars.
>> > What I have done is taken metron-parsers and broken it out to:
>> >
>> > * metron-parsers-common
>> > * This has all the base classes and interfaces, common testing
>> > components
>> > etc
>> > * metron-parser-base
>> > * This has the Grok, CSV, and JsonMap parsers and support
>> > * metron-parser-X
>> > * A module per parser type which we currently have in the system
>> > * Each parser has all the indexing, enrichment and parser
>> > configurations
>> > for that parser in its package
>> >
>> > I will go into packaging and deployment issues in another email.
>> >
>> > I have this all working:
>> > * the parsers are built
>> > * the parsers are tested
>> > * the parsers are integrated into the deployment build such that
>> > vagrant up
>> > just works as previously in full and quick dev
>> > * maven component of rpm docker
>> > * the metron.spec file
>> > * ambari installation
>> > * zookeeper configuration deployment
>> > * the ambari parser service code
>> > * the Rest interface works
>> > * see all installed parser configurations etc
>> >
>> >
>> > So this part of the work, is I think ready for a PR and review/next
>> > steps
>> > on its own.
>> >
>> > I think that it sets up the components and is a base for building out
>> > the
>> > rest of the functionality we want.
>> >
>> >
>> >
>>
>>
>


Re: [Discuss] SIDELOADING PARSERS: Parsers as components

2017-03-10 Thread Casey Stella
Also, it's not clear that we're ever going to need the classloader bits
from nar for parsers since they are naturally isolated by storm topology.
I might be wrong there though; do you see a scenario?

On Fri, Mar 10, 2017 at 3:53 PM, Casey Stella <ceste...@gmail.com> wrote:

> I'm a bit worried about copying and pasting from the NiFi project their
> nar infrastructure.  That seems..unclean to me and since we're not using
> the classloader part of nar for this, does it make more sense to just use
> jar?
>
> On Fri, Mar 10, 2017 at 3:50 PM, Otto Fowler <ottobackwa...@gmail.com>
> wrote:
>
>> Compared to how much time vagrant up takes now, you won’t even notice it
>> ;)
>>
>> That is definitely an option.  I guess what I want to work out is if we
>> are going to want to
>> go to NAR, why not just go to NAR.
>>
>> In the end, the customer for this - like Jon Zeolla, isn’t going to care
>> about the intermediate step,
>> he wants the archetype that builds the ‘metron parser plugin’.
>>
>> Which is why I hesitate to put out an archetype that is going to obsolete
>> so soon.
>>
>> Does that make sense?
>>
>> On March 10, 2017 at 14:50:55, Casey Stella (ceste...@gmail.com) wrote:
>>
>> I'm a little concerned about this increasing the size and length of the
>> build due to the repeated shading. Should we figure out a way to deploy
>> jars with provided dependencies on metron-parser-common as suggested in
>> the
>> previous JIRAs first?
>>
>> On Fri, Mar 10, 2017 at 2:31 PM, Matt Foley <mfo...@hortonworks.com>
>> wrote:
>>
>> > It sounds like:
>> > - This is a self-contained chunk of work, that can be tested, reviewed,
>> > and committed on its own, then the other ideas you propose can follow
>> it.
>> > - It crosses a lot of lines, and restructures a lot of code, so will
>> “rot”
>> > fairly quickly as other people make commits, so if possible you should
>> get
>> > a PR out there and we should work through it as soon as possible.
>> > Are those both true?
>> >
>> > How do other people feel about grouping a given sensor’s parser,
>> enricher,
>> > indexing logic all together? It seems to have multiple advantages; are
>> > there also disadvantages?
>> >
>> > On 3/10/17, 6:31 AM, "Otto Fowler" <ottobackwa...@gmail.com> wrote:
>> >
>> > As previously discussed here, I have been working on side loading of
>> > parsers. The goals of this work are:
>> > * Make it possible for developers to create, maintain and deploy parsers
>> > outside of the Metron code tree and not have to fork
>> > * Create maven archetype support for developers of parsers
>> > * Introduce a parser ‘lifecycle’ to support multiple instances and
>> > configurations, states of being installed, under configuration, and
>> > deployed
>> > etc.
>> >
>> > I would like to have some discussion based on where I am after rebasing
>> > onto METRON-671 which revamps deployment to be totally ambari based.
>> >
>> >
>> > Parsers as components:
>> >
>> > I have all the parsers broken out into individual packages/rpms/jars.
>> > What I have done is taken metron-parsers and broken it out to:
>> >
>> > * metron-parsers-common
>> > * This has all the base classes and interfaces, common testing
>> > components
>> > etc
>> > * metron-parser-base
>> > * This has the Grok, CSV, and JsonMap parsers and support
>> > * metron-parser-X
>> > * A module per parser type which we currently have in the system
>> > * Each parser has all the indexing, enrichment and parser
>> > configurations
>> > for that parser in its package
>> >
>> > I will go into packaging and deployment issues in another email.
>> >
>> > I have this all working:
>> > * the parsers are built
>> > * the parsers are tested
>> > * the parsers are integrated into the deployment build such that
>> > vagrant up
>> > just works as previously in full and quick dev
>> > * maven component of rpm docker
>> > * the metron.spec file
>> > * ambari installation
>> > * zookeeper configuration deployment
>> > * the ambari parser service code
>> > * the Rest interface works
>> > * see all installed parser configurations etc
>> >
>> >
>> > So this part of the work, is I think ready for a PR and review/next
>> > steps
>> > on its own.
>> >
>> > I think that it sets up the components and is a base for building out
>> > the
>> > rest of the functionality we want.
>> >
>> >
>> >
>>
>>
>


Re: [DISCUSS] SIDELOADING PARSERS: Packaging and Loading and Extensions [oh.my]

2017-03-10 Thread Casey Stella
I would definitely agree that moving forward we should consider something
like Nar for Stellar.  I'm not seeing the need for parsers exactly.

I don't want to squash the forward thinking aspect here; we should be broad
and think about the end, ideal state.  I just want to make sure we think
through something that we can iterate on as an initial state that still
solves your problem, MVP style.

On Fri, Mar 10, 2017 at 3:39 PM, Otto Fowler <ottobackwa...@gmail.com>
wrote:

> "The Apache NiFi NAR ‘system’ allows for the packaging and loading of java
> resources with classloader isolation.
> Although technically it is the Service Provider api that makes the
> ‘plugins’  part of the system, you can view them
> together, and thus look at the NAR features as a system to create,
> package, load, and execute plugins in a java system
> while maintaining classloader isolation and dependency separation.
>
> While the NiFi problem case ( many plugins possibly executing in the same
> vm ) is not universal, the functionality provided
> by NAR is commonly needed, and is indeed functionality that I am currently
> looking at implementing in the Apache Metron project.”
>
> This is how I put it to Joe.
>
> I think what you are proposing would work.  I think what I have done up
> until now will pretty much work.  What I have been thinking about and
> considering
> is the difference between getting ‘something that works’, and maybe
> something better.
>
> So if you look at nar there is the ‘packaging’ part, and the class loading
> part.
> We are already doing almost the same thing with the assembly of the
> .tar.gz.  The Nar is a next step to this which adds more metadata and the
> dependency repo.
> More of a refinement than a change.
>
> As far as class loading, the Nar is a more refined system for deploying
> and consuming jars and dependencies, and setting up classloader instances.
> It has more
> functionality than we need at the moment in storm, but in other services
> where multiple parsers or plugin types may need to be loaded, it would make
> more sense.
> Rest may be that case.  Stellar may be that case too, if anyone ever
> writes a stellar function with different dependencies than the platform.
>
>
>
> On March 10, 2017 at 14:32:00, Casey Stella (ceste...@gmail.com) wrote:
>
> So, my question is whether we really need nar here. We have a classloading
> mechanism that will allow us to deploy just the parser logic just added
> into master for stellar, should we be considering another one?
>
> I would understand using nar if we needed to have multiple nars around
> that
> needed isolation from one another, but in the parser topology, we get that
> isolation naturally. It seems to me that, at least for a MVP, we should
> use the existing classloader that we just added. That being said, I might
> be missing something, so let me know your thoughts.
>
> Casey
>
> On Fri, Mar 10, 2017 at 2:18 PM, Matt Foley <ma...@apache.org> wrote:
>
> > I like the approach. I think Nar constitutes a production-quality
> > existing solution meeting highly similar needs to Metron’s.
> >
> > Just a ‘btw’ regarding Joe’s input that I transmitted:
> > - Joe made clear that he was only giving his personal opinion, since of
> > course no individual can speak for the community.
> > - Joe also felt that if Metron succeeded in re-using the Nar system
> > without having to change it too much, that that would be a good
> supporting
> > argument for later proposing that it become a separate child project.
> > - Whereas if we or they tried to break it out as a separate project now,
> > we would have to do all the community-building work around it, as well
> as
> > the technical work of adapting it for a different environment from NiFi.
> > - So he recommended to copy and appropriate it for now.
> > - Which I also agree with.
> >
> > Thanks,
> > --Matt
> >
> > On 3/10/17, 7:42 AM, "Otto Fowler" <ottobackwa...@gmail.com> wrote:
> >
> > As previously discussed here, I have been working on side loading of
> > parsers. The goals of this work are:
> > * Make it possible for developers to create, maintain and deploy parsers
> > outside of the Metron code tree and not have to fork
> > * Create maven archetype support for developers of parsers
> > * Introduce a parser ‘lifecycle’ to support multiple instances and
> > configurations, states of being installed, under configuration, and
> > deployed
> > etc.
> >
> > I would like to have some discussion based on where I am after rebasing
> > onto METRON-671 which revamps deployment to be totally ambari based.
> >
> >

Re: [DISCUSS] SIDELOADING PARSERS: REST API

2017-03-10 Thread Casey Stella
This is another reason to consider deploying custom parsers to HDFS and
reusing the classloader: we can adjust the REST API to search the
centralized store of 3rd party parsers (HDFS).

On Fri, Mar 10, 2017 at 10:10 AM, Otto Fowler 
wrote:

> As previously discussed here, I have been working on side loading of
> parsers.  The goals of this work are:
> * Make it possible for developers to create, maintain and deploy parsers
> outside of the Metron code tree and not have to fork
> * Create maven archetype support for developers of parsers
> * Introduce a parser ‘lifecycle’ to support multiple instances and
> configurations, states of being installed, under configuration, and
> deployed
> etc.
>
> I would like to have some discussion based on where I am after rebasing
> onto METRON-671 which revamps deployment to be totally ambari based.
>
>
> REST API:
>
> The rest api works right now from what I can see, as everything is in
> zookeeper correctly.
>
> but,
>
> Because we don’t have an extension ‘loading’ mechanism, the rest service
> explicitly depends on all the parser jars.  And will have to if it wanted
> to
> generically test parsing etc.
>
>
> I think this is OK in the meantime, but it is another thing that points to us
> wanting an extension loading and packaging solution.
>
> I also think we may want to add an add parser endpoint, that will handle
> the deployment… but we have to sort out ambari, and have the rest stuff
> all setup and working by default.
>


Re: [Discuss] SIDELOADING PARSERS: Parsers as components

2017-03-10 Thread Casey Stella
I'm a little concerned about this increasing the size and length of the
build due to the repeated shading.  Should we figure out a way to deploy
jars with provided dependencies on metron-parser-common as suggested in the
previous JIRAs first?

On Fri, Mar 10, 2017 at 2:31 PM, Matt Foley  wrote:

> It sounds like:
> - This is a self-contained chunk of work, that can be tested, reviewed,
> and committed on its own, then the other ideas you propose can follow it.
> - It crosses a lot of lines, and restructures a lot of code, so will “rot”
> fairly quickly as other people make commits, so if possible you should get
> a PR out there and we should work through it as soon as possible.
> Are those both true?
>
> How do other people feel about grouping a given sensor’s parser, enricher,
> indexing logic all together?  It seems to have multiple advantages; are
> there also disadvantages?
>
> On 3/10/17, 6:31 AM, "Otto Fowler"  wrote:
>
> As previously discussed here, I have been working on side loading of
> parsers.  The goals of this work are:
> * Make it possible for developers to create, maintain and deploy parsers
> outside of the Metron code tree and not have to fork
> * Create maven archetype support for developers of parsers
> * Introduce a parser ‘lifecycle’ to support multiple instances and
> configurations, states of being installed, under configuration, and
> deployed
> etc.
>
> I would like to have some discussion based on where I am after rebasing
> onto METRON-671 which revamps deployment to be totally ambari based.
>
>
> Parsers as components:
>
> I have all the parsers broken out into individual packages/rpms/jars.
> What I have done is taken metron-parsers and broken it out to:
>
> * metron-parsers-common
> * This has all the base classes and interfaces, common testing
> components
> etc
> * metron-parser-base
> * This has the Grok, CSV, and JsonMap parsers and support
> * metron-parser-X
> * A module per parser type which we currently have in the system
> * Each parser has all the indexing, enrichment and parser
> configurations
> for that parser in its package
>
> I will go into packaging and deployment issues in another email.
>
> I have this all working:
> * the parsers are built
> * the parsers are tested
> * the parsers are integrated into the deployment build such that
> vagrant up
> just works as previously in full and quick dev
> * maven component of rpm docker
>   * the metron.spec file
> * ambari installation
> * zookeeper configuration deployment
> * the ambari parser service code
> * the Rest interface works
> * see all installed parser configurations etc
>
>
> So this part of the work, is I think ready for a PR and review/next
> steps
> on its own.
>
> I think that it sets up the components and is a base for building out
> the
> rest of the functionality we want.
>
>
>


Re: [DISCUSS] SIDELOADING PARSERS: Packaging and Deployment

2017-03-10 Thread Casey Stella
I really don't like the 2x build size.  I think we can at this point do
something similar to side-loading of stellar functions to remove that
concern. This should be easy now that that's in master.

What I'd like to see as a MVP is:

   - the maven archetype to have a "provided" dependency to metron-parsers
   - Create a JIRA to
      - Add a "parser.paths" field to global config
      - modify the ParserBolt to use the VFSClassloader to instantiate the
        parser

I'd consider that phase 1 as it solves the question of "how do I easily
create parsers without forking Metron" and it doesn't require shaded jars.
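
The lookup proposed above can be sketched roughly as follows. This is a
Python analogy only, not Metron's Java VFSClassloader: it reads a
"parser.paths" entry from a global config, adds those locations to the
module search path, and instantiates a parser by name. The comma-separated
value format and the function name load_parser are assumptions made for
illustration.

```python
import importlib
import sys


def load_parser(global_config: dict, class_name: str):
    """Instantiate class_name ("module.Class") from the configured paths.

    "parser.paths" is the field proposed in the message above; treating its
    value as a comma-separated list of directories is an assumption.
    """
    raw_paths = global_config.get("parser.paths", "")
    paths = [p.strip() for p in raw_paths.split(",") if p.strip()]

    # Make the extra locations visible to the import machinery, much as a
    # classloader would be pointed at additional jar locations.
    for path in paths:
        if path not in sys.path:
            sys.path.append(path)
    importlib.invalidate_caches()

    module_name, _, attr = class_name.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, attr)()
```

The point of the design is that the parser lives outside the core tree and
is found purely through configuration, so no fork and no shaded jar is
needed to add one.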

From there, I wouldn't be opposed to splitting the existing parsers into
separate projects, but I'd consider that a phase 2 activity.  We can trim
down metron-parsers at our leisure.

Thoughts?

On Fri, Mar 10, 2017 at 9:53 AM, Otto Fowler 
wrote:

> As previously discussed here, I have been working on side loading of
> parsers.  The goals of this work are:
> * Make it possible for developers to create, maintain and deploy parsers
> outside of the Metron code tree and not have to fork
> * Create maven archetype support for developers of parsers
> * Introduce a parser ‘lifecycle’ to support multiple instances and
> configurations, states of being installed, under configuration, and
> deployed
> etc.
>
> I would like to have some discussion based on where I am after rebasing
> onto METRON-671 which revamps deployment to be totally ambari based.
>
>
> Packaging and Deployment
>
> I have not changed the packaging methodology that was already there, i.e. all
> the parsers are still shaded uber jars, and all are packaged into tar.gz to
> include /lib /config /patterns
>
> They are all explicitly called out in the copy resources portion of the
> rpm-docker pom, and explicitly configured in the metron.spec for rpm
> generation.
>
> When deployed, they are deployed to a new directory - telemetry under the
> metron home ( /usr/metron/0.3.1/telemetry ).
> Each parser has its own directory, which gives it an isolated environment:
>
> telemetry/asa
> /config
> /lib
> /patterns
>
> I could see adding a version here as well.  Also, this directory structure
> could change, if not by review then by other follow on deployment options
> due to ‘thinning'
>
> All the scripts, ambari services have been changed to account for this, and
> the start parser topology script is changed to find the right parser jar to
> use for the -s option ( as opposed to only loading metron-parsers jar which
> is the root of the issue ).
>
> The packaging issues here:
>
> * 10 or so new Uber jars increases the build size x2
> * Travis needs to be changed from a container build to a vm build to get a
> bigger space to work in
>
> The rpm issues here:
> * explicitly listing like items that *could* be iterated from a list is a
> code smell to me.  With ansible I was able to define a list and use
> with_items to get a nice clean, maintainable flow.  With rpm and maven
> resources we have to have explicit entries.  My rpm and maven foo was not
> good enough to sort this out, so I just bit the bullet and did it.  I think
> we should explore copying using some kind of script or iteration in maven,
> and possibly generating the metron spec from a template ( this too is
> easier in ansible  ).
>
> Options here:
>
> 1)  Accept this as mvp for the PR with improvements to packaging and
> deployment as a follow on
> 2)  Delay the mvp and go for a possibly smaller - optimized deployment
>
>
> Going for a smaller deployment, that is slimming the jars is something I’ll
> talk about in another email :)
>
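
As a rough illustration of the per-parser layout quoted above
(telemetry/<sensor>/{config,lib,patterns} under the Metron home), a start
script could locate the jar for a sensor along these lines. This is a
minimal sketch: the function name find_parser_jar and the rule "exactly one
jar in lib" are assumptions, since the quoted message does not say how the
jar is named or selected.

```python
from pathlib import Path


def find_parser_jar(metron_home: str, sensor: str) -> Path:
    """Return the single jar under <metron_home>/telemetry/<sensor>/lib.

    The one-jar-per-parser rule is an assumption made for this sketch.
    """
    lib_dir = Path(metron_home) / "telemetry" / sensor / "lib"
    jars = sorted(lib_dir.glob("*.jar"))
    if len(jars) != 1:
        raise FileNotFoundError(
            f"expected exactly one jar in {lib_dir}, found {len(jars)}"
        )
    return jars[0]
```

For example, find_parser_jar("/usr/metron/0.3.1", "asa") would look in
/usr/metron/0.3.1/telemetry/asa/lib.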


Re: [DISCUSS] Unique id for messages

2017-03-10 Thread Casey Stella
Yes, we do use a UUID in the enrichment topology; this is our message join
key on the join portion of the split/join enrichment.  The logic being used
is EnrichmentSplitterBolt.java  line 63.

We might bring that out and make it part of the message IMO and be able to
reuse that unique identifier in the enrichment topology.
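
For comparison, the candidate identifiers raised in this thread (a random
UUID, a Kafka-derived key, and a content hash) could be sketched as below.
This is an illustration, not Metron code. The function names are
hypothetical, the partition argument is an addition (Kafka offsets are only
unique within a partition), and the content hash follows the
"SHA-256(len(tuple1) | tuple1 | ...)" idea from the quoted error-indexing
discussion with an assumed 8-byte big-endian length prefix.

```python
import hashlib
import uuid
from typing import Iterable


def random_id() -> str:
    """Option 1: a random (version 4) UUID, independent of message content."""
    return str(uuid.uuid4())


def kafka_id(topic: str, partition: int, timestamp_ms: int, offset: int) -> str:
    """Option 2: derive the id from where and when the message was read.

    The partition is included as an assumption of this sketch, because an
    offset alone is not unique across partitions of a topic.
    """
    return f"{topic}-{partition}-{timestamp_ms}-{offset}"


def content_hash_id(fields: Iterable[bytes]) -> str:
    """Option 3: SHA-256 over the sorted, length-prefixed fields.

    Sorting makes the result independent of field order; the length prefix
    keeps (b"ab", b"c") from colliding with (b"a", b"bc").
    """
    digest = hashlib.sha256()
    for field in sorted(fields):
        digest.update(len(field).to_bytes(8, "big"))
        digest.update(field)
    return digest.hexdigest()
```

The trade-off is roughly that the first two ids differ on every read of the
same data, while the content hash is stable across retries, which is what
makes it usable for counting unique failing messages.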

On Fri, Mar 10, 2017 at 10:51 AM, zeo...@gmail.com  wrote:

> I definitely think that this is a valuable discussion.  I seem to recall
> cstella mentioning at some point in the past that there is a UUID already
> used in storm that we might be able to expose into the message itself, but
> I could be wrong.
>
> For additional context regarding prior discussions, this was also briefly
> discussed in another topic here
>  1e1bd5dae4b2247dda12698805@%3Cdev.metron.apache.org%3E>.
> In that context I was hoping to be able to link messages across all
> indexing destinations (HDFS, ES, Solr, etc.).
>
> On Fri, Mar 10, 2017 at 9:26 AM Raghu Mitra Kandikonda <
> r...@hortonworks.com>
> wrote:
>
> > Hi All,
> >
> > I would like to start a discussion around adding a unique id to all the
> > parsed messages.  I feel there  was  a discussion around a similar topic
> > but I am not sure as a community we agreed on a proposal.
> >
> > We could
> > -use a random number generator like UUID but this might have performance
> > implications
> > -use a kafka topic name + systemtime + Kafka message offset to generate a
> > unique identifier
> > -use the input message to generate a hashcode
> >
> > Any thoughts ?
> >
> > (Attached email that had similar discussion for error indexing)
> >
> > Regards,
> > RaghuM
> >
> >
> >
> > -- Forwarded message --
> > From: "zeo...@gmail.com" 
> > To: "dev@metron.incubator.apache.org" 
> > Cc:
> > Bcc:
> > Date: Wed, 1 Feb 2017 22:18:12 +
> > Subject: Re: [DISCUSS] Error Indexing
> > Simply as a unique identifier of the original information which is
> failing
> > some step, and thus giving you something to key in on and create a count
> of
> > unique events and prioritize issues without the concern of cyclical
> issues
> > (if the issue is with indexing a specific message, and you try to index
> it
> > again, it will just fail in a loop).
> >
> > Jon
> >
> > On Wed, Feb 1, 2017 at 6:59 AM Dima Kovalyov 
> > wrote:
> >
> > > That's a great topic of discussion.
> > >
> > > Throughout the thread the idea of having hash of the message that
> failed
> > > is changed, can someone please explain why do you plan to use this hash
> > > and how?
> > >
> > > - Dima
> > >
> > > On 02/01/2017 06:23 AM, zeo...@gmail.com wrote:
> > > > After thinking on this for a few days I recant my previous suggestion
> > of
> > > > TupleHash256.  It's still a bit early for SHA-3 - no good reference
> > > > implementations/libraries exist (I did some searching and emailing),
> it
> > > is
> > > > optimized for hardware but no hardware implementation is widely
> > > accessible,
> > > > FIPS 140-3 is still not close to finalized, etc.
> > > >
> > > > I think we could simulate the benefits of tuplehash by sorting the
> > > tuples,
> > > > then doing SHA-256(len(tuple1) | tuple1 | ... | len(tuplen) |
> tuplen).
> > > > Happy to entertain opposing thoughts, such as BLAKE2, etc. but with
> the
> > > > likely users of Metron, I think sticking with FIPS 140-2 is a solid
> > > choice.
> > > >
> > > > Jon
> > > >
> > > > On Thu, Jan 26, 2017, 11:23 AM zeo...@gmail.com 
> > > wrote:
> > > >
> > > > So one more thing regarding why I think we should throw an exception
> > on a
> > > > failed enrichment.  If we do make something like username a constant
> > > field,
> > > > in cases where that is used to calculate rawMessage_hash, if it fails
> > to
> > > > enrich, the hash would be different compared to when it succeeds.  Of
> > > > course I think the initial intent of adding username as a constant
> > field
> > > > would be to handle it in the parsers, where that information is
> > provided
> > > in
> > > > the messages themselves, but how would Threat Intel know the
> > difference?
> > > > In my environment I am looking forward to a streaming enrichment that
> > > adds
> > > > the username, where applicable, anywhere I have an IP.
> > > >
> > > > My hesitant suggestion for a hashing algorithm would be to use
> > > > TupleHash256, as it is a NIST-provided implementation of SHA-3 (using
> > > > cSHAKE) for this use case.  Details here
> > > > <
> > > http://nvlpubs.nist.gov/nistpubs/specialpublications/
> nist.sp.800-185.pdf
> > >.
> > > > However, I haven't been able to find a reference implementation of
> this
> > > in
> > > > any language, so that's a bit of a downside.  A more general SHA3-256
> > > > implementation where we handle ordering could work as well, but would
> > be
> > > > significantly less 
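
The sort-then-length-prefix scheme suggested in the quoted message (SHA-256
over len(tuple) | tuple pairs) might look like the following in Python.  This
is only a sketch, not Metron code: the function name tuple_hash, the 8-byte
big-endian length encoding, and sorting on the UTF-8 bytes are all assumptions
made for illustration.  The length prefix is what keeps ("ab", "c") and
("a", "bc") from producing the same digest.

```python
import hashlib
import struct


def tuple_hash(fields):
    """SHA-256 over sorted, length-prefixed fields (hypothetical helper)."""
    digest = hashlib.sha256()
    # Sort the encoded fields so the result does not depend on input order.
    for encoded in sorted(f.encode("utf-8") for f in fields):
        # Assumed encoding: 8-byte big-endian length, then the field bytes.
        digest.update(struct.pack(">Q", len(encoded)))
        digest.update(encoded)
    return digest.hexdigest()


if __name__ == "__main__":
    # Sample field values are made up for the example.
    print(tuple_hash(["ip_src_addr=10.0.0.1", "username=jon"]))
```

Because the fields are sorted before hashing, a message whose fields arrive in
a different order still maps to the same identifier.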

Re: SHELL_EDIT Stellar command and execution context

2017-03-10 Thread Casey Stella
That means that it will attempt to get the capability and if it's not
there, it won't throw an exception, but rather just return
Optional.empty().  Yeah, we probably actually *do* want an exception thrown
there in the case where we're not in a shell context (probably don't want
vim starting inside of a storm bolt..it'll get confused and angry ;)

On Thu, Mar 9, 2017 at 11:07 PM, Otto Fowler 
wrote:

> I was looking at this command and the capabilities functionality in
> Context and I saw that the call for capabilities is:
>
> Optional console =  context.getCapability(CONSOLE, false);
>
>
> This means that if we are NOT in CONSOLE, there will not be an error.
> I don’t think this is correct, is it?
>
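
The difference between the two behaviors discussed here (quietly returning an
empty result versus throwing when the capability is missing) can be sketched
with a toy Python analogue.  None of this is Metron's actual Java API: the
Context class, get_capability, the error_if_missing flag and
MissingCapabilityError are hypothetical names, and None stands in for
Optional.empty().

```python
class MissingCapabilityError(Exception):
    """Raised when a required capability is absent (hypothetical)."""


class Context:
    """Toy stand-in for a Stellar-style context; not Metron's class."""

    def __init__(self, capabilities=None):
        self._capabilities = dict(capabilities or {})

    def get_capability(self, name, error_if_missing=True):
        # Return the capability if present.
        if name in self._capabilities:
            return self._capabilities[name]
        # Strict mode: fail loudly, e.g. when a shell-only function runs in a bolt.
        if error_if_missing:
            raise MissingCapabilityError(f"capability {name!r} is not available")
        # Lenient mode: the analogue of returning Optional.empty().
        return None
```

With the lenient flag, a caller outside a shell gets None back and has to
remember to check it; with the strict flag, the failure surfaces at the call
site, which is the behavior Casey suggests is probably wanted for SHELL_EDIT.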


Re: [MENTORS][REQUEST] Please review the release currently going on in incubator-general

2017-03-09 Thread Casey Stella
Sorry, I missed an email in my search.  We have 1 review (thanks Billie!
:), need 2 more.

On Thu, Mar 9, 2017 at 1:25 PM, Casey Stella <ceste...@gmail.com> wrote:

> Hi Mentors,
>
> We have a release going on in incubator-general for around 10 days now
> with no reviews.  Would a couple of y'all mind reviewing for us?
>
> Thanks,
>
> Casey
>


[MENTORS][REQUEST] Please review the release currently going on in incubator-general

2017-03-09 Thread Casey Stella
Hi Mentors,

We have a release going on in incubator-general for around 10 days now with
no reviews.  Would a couple of y'all mind reviewing for us?

Thanks,

Casey


Re: Metron Rest - where is reflections coming from?

2017-03-07 Thread Casey Stella
It should get it from Metron common and IntelliJ is showing me the same
issue. I'm baffled by it, slightly. Maven builds just fine, so who knows.
It's on my list of oddities to look at post-vacation.
On Tue, Mar 7, 2017 at 12:57 Otto Fowler  wrote:

>
> https://github.com/apache/incubator-metron/blob/master/metron-interface/metron-rest/pom.xml
>
> I don’t see where the dependency for reflections is being set, and my build
> is failing.  Am I missing something post merge?
> Is it an intellij thing?
>


Re: [DISCUSS][PROPOSAL] Acceptance Tests

2017-03-06 Thread Casey Stella
No, I'm saying that they shouldn't be restricted to real-world use-cases.
The E2E tests I laid out weren't real-world, but they did exercise the
components similar to real-world use-cases.  They should also be able to
tread outside of the happy-path for those use-cases.

On Mon, Mar 6, 2017 at 6:30 PM, Michael Miklavcic <
michael.miklav...@gmail.com> wrote:

> "I don't think acceptance tests should loosely associate with real uses,
> but they should
> be free to delve into weird non-happy-pathways."
>
> Not following - are you saying they should *tightly* associate with real
> uses and additionally include non-happy-path?
>
> On Fri, Mar 3, 2017 at 12:57 PM, Casey Stella <ceste...@gmail.com> wrote:
>
> > It is absolutely not a naive question, Matt.  We don't have a lot (or
> any)
> > docs about our integration tests; it's more of a "follow the lead" type
> of
> > thing at the moment, but that should be rectified.
> >
> > The integration tests spin up and down infrastructure in-process, some of
> > which are real and some of which are mock versions of the services.
> These
> > are good for catching some types of bugs, but often things sneak through,
> > like:
> >
> >- Hbase and storm can't exist in the same JVM, so HBase is mocked in
> >those cases.
> >- The FileSystem that we get for Hadoop is the LocalRawFileSystem, not
> >truly HDFS.  There are differences and we've run into
> them..hilariously
> > at
> >times. ;)
> >- Things done statically in a bolt are shared across all bolts because
> >they all are threads in the same process
> >
> > It's good, it catches bugs, it lets us debug things easily, it runs with
> > every single build automatically via travis.
> > It's bad because it's awkward to get the dependencies isolated
> sufficiently
> > for all of these components to get them to play nice in the same JVM.
> >
> > Acceptance tests would be run against a real cluster, so they would:
> >
> >- run against real components, not testing or mock components
> >- run against multiple nodes
> >
> > I can imagine a world where we can unify the two to a certain degree in
> > many cases if we could spin up a docker version of Metron to run as part
> of
> > the build, but I think in the meantime, we should focus on providing
> both.
> >
> > I suspect the reference application is possibly inspiring my suggestions
> > here, but I think the main difference here is that the reference
> > > application is intended to be informational from an end-user perspective:
> > it's detailing a use-case that users will understand.  I don't think
> > acceptance tests should loosely associate with real uses, but they should
> > be free to delve into weird non-happy-pathways.
> >
> > On Fri, Mar 3, 2017 at 2:16 PM, Matt Foley <ma...@apache.org> wrote:
> >
> > > Automating stuff that now has to be done manually gets a big +1.
> > >
> > > But, Casey, could you please clarify the relationship between what you
> > > plan to do and the current “integration test” framework?  Will this be
> in
> > > the form of additional integration tests? Or a different test
> framework?
> > > Can it be done in the integration test framework, rather than creating
> > new
> > > mechanism?
> > >
> > > BTW, if that’s a naïve question, forgive me, but I could find zero
> > > documentation for the existing integration test capability, neither
> wiki
> > > pages nor READMEs nor Jiras.  If there are any docs, please point me at
> > > them.  Or even archived email threads.
> > >
> > > There is also something called the “Reference Application”
> > > https://cwiki.apache.org/confluence/display/METRON/
> > > Metron+Reference+Application which sounds remarkably like what you
> > > propose to automate.  Is there / can there / should there be a
> > relationship?
> > >
> > > Thanks,
> > > --Matt
> > >
> > > On 3/3/17, 7:40 AM, "Otto Fowler" <ottobackwa...@gmail.com> wrote:
> > >
> > > +1
> > >
> > > I agree with Justin’s points.
> > >
> > >
> > > On March 3, 2017 at 08:41:37, Justin Leet (justinjl...@gmail.com)
> > > wrote:
> > >
> > > +1 to both. Having this would especially ease a lot of testing that
> > > hits
> > > multiple areas (which there is a fair amount of, given that we're
> > > building
> > > pretty quickly).

Re: Site Book

2017-03-05 Thread Casey Stella
Not yet, we don't have an official 0.3.1 release yet. When the voting ends
in incubator-general there will be a PR to update our website with the
current one.
On Sun, Mar 5, 2017 at 14:43 Nick Allen  wrote:

> Is the site book for the *official* 0.3.1 release available somewhere on
> the web?  Like we made the site book available for the RCs [1].
>
> [1]
>
> https://dist.apache.org/repos/dist/dev/incubator/metron/0.3.1-RC5-incubating/book-site/index.html
>


Re: [DISCUSS][PROPOSAL] Acceptance Tests

2017-03-03 Thread Casey Stella
It is absolutely not a naive question, Matt.  We don't have a lot (or any)
docs about our integration tests; it's more of a "follow the lead" type of
thing at the moment, but that should be rectified.

The integration tests spin up and down infrastructure in-process, some of
which are real and some of which are mock versions of the services.  These
are good for catching some types of bugs, but often things sneak through,
like:

   - Hbase and storm can't exist in the same JVM, so HBase is mocked in
   those cases.
   - The FileSystem that we get for Hadoop is the LocalRawFileSystem, not
   truly HDFS.  There are differences and we've run into them..hilariously at
   times. ;)
   - Things done statically in a bolt are shared across all bolts because
   they all are threads in the same process

It's good, it catches bugs, it lets us debug things easily, it runs with
every single build automatically via travis.
It's bad because it's awkward to get the dependencies isolated sufficiently
for all of these components to get them to play nice in the same JVM.

Acceptance tests would be run against a real cluster, so they would:

   - run against real components, not testing or mock components
   - run against multiple nodes

I can imagine a world where we can unify the two to a certain degree in
many cases if we could spin up a docker version of Metron to run as part of
the build, but I think in the meantime, we should focus on providing both.

I suspect the reference application is possibly inspiring my suggestions
here, but I think the main difference here is that the reference
application is intended to be informational from an end-user perspective:
it's detailing a use-case that users will understand.  I don't think
acceptance tests should loosely associate with real uses, but they should
be free to delve into weird non-happy-pathways.

On Fri, Mar 3, 2017 at 2:16 PM, Matt Foley <ma...@apache.org> wrote:

> Automating stuff that now has to be done manually gets a big +1.
>
> But, Casey, could you please clarify the relationship between what you
> plan to do and the current “integration test” framework?  Will this be in
> the form of additional integration tests? Or a different test framework?
> Can it be done in the integration test framework, rather than creating new
> mechanism?
>
> BTW, if that’s a naïve question, forgive me, but I could find zero
> documentation for the existing integration test capability, neither wiki
> pages nor READMEs nor Jiras.  If there are any docs, please point me at
> them.  Or even archived email threads.
>
> There is also something called the “Reference Application”
> https://cwiki.apache.org/confluence/display/METRON/
> Metron+Reference+Application which sounds remarkably like what you
> propose to automate.  Is there / can there / should there be a relationship?
>
> Thanks,
> --Matt
>
> On 3/3/17, 7:40 AM, "Otto Fowler" <ottobackwa...@gmail.com> wrote:
>
> +1
>
> I agree with Justin’s points.
>
>
> On March 3, 2017 at 08:41:37, Justin Leet (justinjl...@gmail.com)
> wrote:
>
> +1 to both. Having this would especially ease a lot of testing that
> hits
> multiple areas (which there is a fair amount of, given that we're
> building
> pretty quickly).
>
> I do want to point out that adding this type of thing makes the speed
> of
> our builds and tests more important, because they already take up a
> good
> amount of time. There are obviously tickets to optimize these things,
> but
> I would like to make sure we don't pile too much on to every testing
> cycle
> before a PR. Having said that, I think the testing proposed is
> absolutely
> valuable enough to go forward with.
>
> Justin
>
> On Fri, Mar 3, 2017 at 8:33 AM, Casey Stella <ceste...@gmail.com>
> wrote:
>
> > I also propose, once this is done, that we modify the developer
> bylaws
> and
> > the github PR script to ensure that PR authors:
> >
> > - Update the acceptance tests where appropriate
> > - Run the tests as a smoketest
> >
> >
> >
> > On Fri, Mar 3, 2017 at 8:21 AM, Casey Stella <ceste...@gmail.com>
> wrote:
> >
> > > Hi All,
> > >
> > > After doing METRON-744, where I had to walk through a manual test
> of
> > every
> > > place that Stellar touched, it occurred to me that we should script
> this.
> > > It also occurred to me that some scripts that are run by the PR
> author
> to
> > > ensure no regressions and, eventually maybe, even run on an INFRA
> > instance
> > > of Jenkins would give all of us some peace o

Re: [DISCUSS][PROPOSAL] Acceptance Tests

2017-03-03 Thread Casey Stella
That's a very good point.  I'm hoping that this can take the place of some
of the more rigorous manual testing scripts that we have, so it's less time
at the keyboard for reviewers.

On Fri, Mar 3, 2017 at 8:41 AM, Justin Leet <justinjl...@gmail.com> wrote:

> +1 to both.  Having this would especially ease a lot of testing that hits
> multiple areas (which there is a fair amount of, given that we're building
> pretty quickly).
>
> I do want to point out that adding this type of thing makes the speed of
> our builds and tests more important, because they already take up a good
> amount of time.  There are obviously tickets to optimize these things, but
> I would like to make sure we don't pile too much on to every testing cycle
> before a PR.  Having said that, I think the testing proposed is absolutely
> valuable enough to go forward with.
>
> Justin
>
> On Fri, Mar 3, 2017 at 8:33 AM, Casey Stella <ceste...@gmail.com> wrote:
>
> > I also propose, once this is done, that we modify the developer bylaws
> and
> > the github PR script to ensure that PR authors:
> >
> >- Update the acceptance tests where appropriate
> >- Run the tests as a smoketest
> >
> >
> >
> > On Fri, Mar 3, 2017 at 8:21 AM, Casey Stella <ceste...@gmail.com> wrote:
> >
> > > Hi All,
> > >
> > > After doing METRON-744, where I had to walk through a manual test of
> > every
> > > place that Stellar touched, it occurred to me that we should script
> this.
> > > It also occurred to me that some scripts that are run by the PR author
> to
> > > ensure no regressions and, eventually maybe, even run on an INFRA
> > instance
> > > of Jenkins would give all of us some peace of mind.
> > >
> > > I am certain that this, along with a couple other manual tests from
> other
> > > PRs, could form the basis of a really great regression acceptance-test
> > > suite and I'd like to propose that we do that, as a community.
> > >
> > > What I'd like to see from such a suite has the following
> characteristics:
> > >
> > >- Can be run on any Metron cluster, including but not limited to
> > >   - Vagrant
> > >   - AWS
> > >   - An existing deployment
> > >- Can be *deployed* from ansible, but must be able to be deployed
> > >manually
> > >   - With instructions in the readme
> > >- Tests should be idempotent and independent
> > >   - Tear down what you set up
> > >
> > > I think between the Stellar REPL and the fundamental scriptability of
> the
> > > Hadoop services, we can accomplish these tests with a combination of
> > shell
> > > scripts and python.
> > >
> > > I propose we break this into the following parts:
> > >
> > >- Acceptance Testing Framework with a small smoketest
> > >- Baseline Metron Test
> > >   - Send squid data through the squid topology
> > >   - Add a threat triage alert
> > >   - Ensure it gets through to the other side with alerts preserved
> > >- + Enrichment
> > >   - Add an enrichment in the enrichment pipeline to the above
> > >- + Profiler
> > >   - Add a profile with a tick of 1 minute to count per destination
> > >   address
> > >- Base PCap test
> > >   - Something like the manual test for METRON-743 (
> > >   https://github.com/apache/incubator-metron/pull/467#
> > issue-210285324
> > >   <https://github.com/apache/incubator-metron/pull/467#
> > issue-210285324>
> > >   )
> > >
> > > Thoughts?
> > >
> > >
> > > Best,
> > >
> > > Casey
> > >
> >
>


Re: [DISCUSS][PROPOSAL] Acceptance Tests

2017-03-03 Thread Casey Stella
I also propose, once this is done, that we modify the developer bylaws and
the github PR script to ensure that PR authors:

   - Update the acceptance tests where appropriate
   - Run the tests as a smoketest



On Fri, Mar 3, 2017 at 8:21 AM, Casey Stella <ceste...@gmail.com> wrote:

> Hi All,
>
> After doing METRON-744, where I had to walk through a manual test of every
> place that Stellar touched, it occurred to me that we should script this.
> It also occurred to me that some scripts that are run by the PR author to
> ensure no regressions and, eventually maybe, even run on an INFRA instance
> of Jenkins would give all of us some peace of mind.
>
> I am certain that this, along with a couple other manual tests from other
> PRs, could form the basis of a really great regression acceptance-test
> suite and I'd like to propose that we do that, as a community.
>
> What I'd like to see from such a suite has the following characteristics:
>
>- Can be run on any Metron cluster, including but not limited to
>   - Vagrant
>   - AWS
>   - An existing deployment
>- Can be *deployed* from ansible, but must be able to be deployed
>manually
>   - With instructions in the readme
>- Tests should be idempotent and independent
>   - Tear down what you set up
>
> I think between the Stellar REPL and the fundamental scriptability of the
> Hadoop services, we can accomplish these tests with a combination of shell
> scripts and python.
>
> I propose we break this into the following parts:
>
>- Acceptance Testing Framework with a small smoketest
>- Baseline Metron Test
>   - Send squid data through the squid topology
>   - Add an threat triage alert
>   - Ensure it gets through to the other side with alerts preserved
>- + Enrichment
>   - Add an enrichment in the enrichment pipeline to the above
>- + Profiler
>   - Add a profile with a tick of 1 minute to count per destination
>   address
>- Base PCap test
>   - Something like the manual test for METRON-743 (
>   https://github.com/apache/incubator-metron/pull/467#issue-210285324
>   <https://github.com/apache/incubator-metron/pull/467#issue-210285324>
>   )
>
> Thoughts?
>
>
> Best,
>
> Casey
>


[DISCUSS][PROPOSAL] Acceptance Tests

2017-03-03 Thread Casey Stella
Hi All,

After doing METRON-744, where I had to walk through a manual test of every
place that Stellar touched, it occurred to me that we should script this.
It also occurred to me that some scripts that are run by the PR author to
ensure no regressions and, eventually maybe, even run on an INFRA instance
of Jenkins would give all of us some peace of mind.

I am certain that this, along with a couple other manual tests from other
PRs, could form the basis of a really great regression acceptance-test
suite and I'd like to propose that we do that, as a community.

What I'd like to see from such a suite has the following characteristics:

   - Can be run on any Metron cluster, including but not limited to
      - Vagrant
      - AWS
      - An existing deployment
   - Can be *deployed* from ansible, but must be able to be deployed
   manually
      - With instructions in the readme
   - Tests should be idempotent and independent
      - Tear down what you set up

I think between the Stellar REPL and the fundamental scriptability of the
Hadoop services, we can accomplish these tests with a combination of shell
scripts and python.
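
As a rough idea of what "idempotent and independent" with "tear down what you
set up" could mean in the Python half of such a suite, here is a minimal
sketch.  It runs against an in-memory FakeCluster so it is self-contained; the
class, the temporary_topic helper and the topic name are hypothetical, and a
real test would call out to the actual cluster services instead.

```python
from contextlib import contextmanager


class FakeCluster:
    """In-memory stand-in for a real cluster (illustration only)."""

    def __init__(self):
        self.topics = set()

    def create_topic(self, name):
        self.topics.add(name)

    def delete_topic(self, name):
        self.topics.discard(name)


@contextmanager
def temporary_topic(cluster, name):
    """Create a topic for one test and always remove it afterwards."""
    cluster.create_topic(name)
    try:
        yield name
    finally:
        # Runs even if the test body raises, so reruns start clean.
        cluster.delete_topic(name)


def run_smoketest(cluster):
    """Return True if the topic exists while the test body is running."""
    with temporary_topic(cluster, "acceptance_squid") as topic:
        return topic in cluster.topics
```

Running run_smoketest twice against the same cluster leaves it in the same
state both times, which is the property that would let the suite run on an
existing deployment without leaving debris behind.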

I propose we break this into the following parts:

   - Acceptance Testing Framework with a small smoketest
   - Baseline Metron Test
      - Send squid data through the squid topology
      - Add a threat triage alert
      - Ensure it gets through to the other side with alerts preserved
   - + Enrichment
      - Add an enrichment in the enrichment pipeline to the above
   - + Profiler
      - Add a profile with a tick of 1 minute to count per destination
      address
   - Base PCap test
      - Something like the manual test for METRON-743 (
      https://github.com/apache/incubator-metron/pull/467#issue-210285324)

Thoughts?


Best,

Casey


Re: [PROPOSAL] Reduce Reliance on Ansible for Deployment

2017-03-02 Thread Casey Stella
Just to clarify, your 1 and 2, which you're working on, will give us the
ability with full-dev (not quick-dev) to exercise the RPMs and management
pack on the non-sensor code (i.e. the current state of the management
pack).  As far as I'm concerned, this is huge.  This ensures we have an
easy vehicle to ensure PRs work with the management pack.  I think a couple
things should be ensured going forward:

   - People whose PRs have changes that affect the management pack should
      - notify the reviewers that it was tested on full-dev
      - regenerate quick-dev to ensure things aren't broken.
      Dave, can you remind us all where the instructions are for that?


On Thu, Mar 2, 2017 at 9:47 AM, David Lyle  wrote:

> Just wanted to update this thread:
>
> I've been diligently working to the plan we discussed above:
>
> *1) Refactor existing Ansible deployment to use the Ambari MPack to install
> metron-common, metron-enrichments and metron-parsers. *
> *2) Regenerate quick-dev to leverage the change.*
> 3) Create rpm packages for all deployed components that don't currently
> have them.
>  - Sensor probes
>  - Sensor stubs
> 4) Create MPack service defs for the RPMs in (2).
> 5) Refactor existing Ansible deployment to use the Ambari MPack to install
> all services.
> 6) Regenerate quick-dev to leverage the change.
> 7) Plan iteration 2 to see if there are other opportunities to reduce our
> use of Ansible.
>
> I've completed #1 and will have #2 completed shortly, that will close out
> METRON-671.  If we're all still good with that direction, once I finish
> 671, I'd like to cut JIRAs for #3 and #4.
>
> Thoughts?
>
> -D...
>
>
> On Thu, Jan 19, 2017 at 9:00 AM, David Lyle  wrote:
>
> > For the first increment, I am planning on using the Ambari Metron
> topology
> > service that currently exists in the MPack. Handling the side loading
> isn't
> > in the scope of this proposal. We'll need to design that separately.
> >
> > -D...
> >
> >
> > On Thu, Jan 19, 2017 at 8:48 AM, Otto Fowler 
> > wrote:
> >
> >> So - my prototype was adding a new parser type and completely
> integrating
> >> it with the install, all the way down to monit templates.
> >> All of that work was configuration work in ansible.
> >>
> >> I think I’m asking about changes to something that wasn’t documented
> >> anyways, as I think you are pointing out by reference, so I’ll just say
> >> that this change has an effect on side loading.   You will be building
> an
> >> Ambari Metron Topology service, probably from some template - hopefully
> >> using maven archetypes or something the whole way.
> >>
> >>
> >> On January 19, 2017 at 05:56:42, David Lyle (dlyle65...@gmail.com)
> wrote:
> >>
> >> Looks like my replies were only going to Otto, sorry about that- I'll
> >> gather them here:
> >>
> >> "What does this do for Monit?"
> >>
> >> Monit would be deprecated for components under Ambari management.
> >>
> >> "How would this affect deploying new parsers or parsers not shipped?
> >> When I prototyped this I added a monit entry."
> >>
> >> Hard to say having not seen Otto's prototype, but I suspect no effect.
> >>
> >> Fwiw, there is a jira about sideloading that is meant for deploying
> >> custom
> >> parsers. Last I looked, the management was still up for design.
> >>
> >> I've opened https://issues.apache.org/jira/browse/METRON-667 to track
> >> this
> >> work. I'd like to get started. Thoughts?
> >>
> >> -D...
> >>
> >>
> >> On Wed, Jan 18, 2017 at 1:28 PM, David Lyle 
> >> wrote:
> >>
> >> > Hard to say having not seen your prototype, but I suspect no effect.
> >> >
> >> > Fwiw, there is a jira about sideloading that is meant for deploying
> >> custom
> >> > parsers. Last I looked, the management was still up for design.
> >> >
> >> > -D...
> >> >
> >> > On Wed, Jan 18, 2017 at 13:07 Otto Fowler 
> >> wrote:
> >> >
> >> >> How would this affect deploying new parsers or parsers not shipped?
> >> >> When I prototyped this I added a monit entry.
> >> >>
> >> >>
> >> >> On January 17, 2017 at 10:34:32, David Lyle (dlyle65...@gmail.com)
> >> wrote:
> >> >>
> >> >> In our "Dev Guide and Committer Review Guide additions" discussion, we had
> >> >> a bit of a side discussion about reducing reliance (perhaps to zero) on
> >> >> Ansible for our installation.
> >> >>
> >> >> It seemed there was consensus around that idea (if not, please let me
> >> >> know), so I propose the following steps to get there:
> >> >>
> >> >> 1) Refactor existing Ansible deployment to use the Ambari MPack to install
> >> >> metron-common, metron-enrichments and metron-parsers.
> >> >> 2) Regenerate quick-dev to leverage the change.
> >> >> 3) Create rpm packages for 

Re: [GitHub] incubator-metron issue #468: METRON-744: Allow Stellar functions to be loade...

2017-03-02 Thread Casey Stella
I did not see anything like that in the ansible logs, but I guess I can't
be sure.

On Thu, Mar 2, 2017 at 8:49 AM, David Lyle  wrote:

> I have not. Did HDFS die while quick-dev was coming up?
>
> -D...
>
> On Wed, Mar 1, 2017 at 9:15 PM, cestella  wrote:
>
> > Github user cestella commented on the issue:
> >
> > https://github.com/apache/incubator-metron/pull/468
> >
> > I ran this through from the beginning to the end without issue to
> make
> > sure the stuff I did in the last few hours didn't affect the test.
> >
> > The one issue, and this is off-topic, is that for some reason I
> cannot
> > get the quick-dev `vagrant up` to stage the geo enrichment database in
> HDFS
> > and, as a result, I get an enrichment exception and data doesn't flow
> > through.  Does anyone else have this happen to them?
> >
> >
> >
> > ---
> > If your project is set up for it, you can reply to this email and have
> your
> > reply appear on GitHub as well. If your project does not have this
> feature
> > enabled and wishes so, or if the feature is enabled but not working,
> please
> > contact infrastructure at infrastruct...@apache.org or file a JIRA
> ticket
> > with INFRA.
> > ---
> >
>


Re: [DISCUSS] Making adding new 3rd-party Stellar functions easier

2017-02-28 Thread Casey Stella
I started tinkering with the idea of the classloader to see about how hard
it would be and if it would even be feasible and realized it pretty much
writes itself for an MVP, so I submitted a PR (sans testing plan, which I'll
get to today): METRON-744 (
https://github.com/apache/incubator-metron/pull/468)

This would essentially conform to the second step above.

On Mon, Feb 27, 2017 at 2:52 PM, Matt Foley <ma...@apache.org> wrote:

> Couple thoughts:
>
> 1. I see the Accumulo class loader allows multiple clients with
> potentially conflicting loads, via the “context” mechanism.  That’s good.
> NiFi also used a multi-classloader mechanism to support potentially
> conflicting side-loads of their Processor bundles (“nars”), but I don’t
> think they supported re-loading (altho it’s been a few months since I
> looked at it).
>
> 2. I like the idea of loading from a configured location in HDFS.  This
> gives a far smaller scope of filesystem to be watched and/or searched, and
> of course obviates the deploy-to-many-servers problem.  Altho it costs
> another upload/maintenance tool for the admin to fiddle with.
>
> Thanks,
> --Matt
>
> On 2/27/17, 11:22 AM, "Casey Stella" <ceste...@gmail.com> wrote:
>
> Hi All,
>
> The benefit of Stellar is that adding new functionality is as simple as
> providing a Jar.  This enables people who want to integrate with
> Metron to
> easily add enrichments or other functionality.  The snag currently with
> this
> is that we provide a single jar, so all stellar functions that we have
> available must be dependencies of the main jar that drives the topology
> plus what local directories we can configure via the storm configs.
> This
> makes the process of adding 3rd party jars not as easy as it could be.
>
> What I'm proposing is the following and I'd like to get some community
> feedback on it:
>
>- Split the stellar lang into its own project which does not shade
> its
>dependencies from metron-common
>   - this makes creating your own stellar functions easier as you
> only
>   need depend on a small project
>- Adjust the following to additionally load classes from a
> location
>in HDFS /apps/metron/stellar using something like accumulo (
>https://accumulo.apache.org/blog/2014/05/03/accumulo-
> classloader.html)
>   - Profiler topology
>   - Parser topology
>   - Enrichment topology
>   - Enrichment Flat file loader
>   - Enrichment MR loader
>- Make the classloader reload upon new files
>   - This would necessitate a new Stellar FunctionResolver
>
> I'd like to propose starting with the first two and attempting the
> third
> after we get something stable with the first 2.
>
> What this will give us is the following workflow to enable new stellar
> functions:
>
>- Build your function depending on stellar-lang into a Jar
>- Drop the new jar onto HDFS in /apps/metron/stellar
>- Restart the topology in question (after the 3rd bullet point,
> this is
>no longer required)
>
> Thoughts?
>
>
>
>
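
The "drop a jar in a directory, have the resolver pick it up" workflow
proposed above is a Java classloader concern, but the shape of the third step
(reload upon new files) can be shown with a loose Python analogue.  Everything
here is hypothetical: DirectoryFunctionResolver is not Metron's
FunctionResolver, a local directory of .py files stands in for jars under
/apps/metron/stellar in HDFS, and upper-casing function names is just a nod to
Stellar's naming style.

```python
import importlib.util
from pathlib import Path


class DirectoryFunctionResolver:
    """Loads functions from *.py files in a directory (hypothetical analogue)."""

    def __init__(self, directory):
        self.directory = Path(directory)
        self.functions = {}
        self._seen = set()

    def refresh(self):
        """Load files that appeared since the last call; return their names."""
        new_files = sorted(
            p for p in self.directory.glob("*.py") if p.name not in self._seen
        )
        for path in new_files:
            spec = importlib.util.spec_from_file_location(path.stem, path)
            module = importlib.util.module_from_spec(spec)
            spec.loader.exec_module(module)
            # Register every public callable found in the module namespace.
            for name, value in vars(module).items():
                if callable(value) and not name.startswith("_"):
                    self.functions[name.upper()] = value
            self._seen.add(path.name)
        return [p.name for p in new_files]
```

Calling refresh() on a timer (or on a filesystem notification) is what would
remove the "restart the topology" step; without it, the resolver only sees
whatever was present at startup, which matches the first two bullets.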


Re: [DISCUSS] System time vs. Event Time

2017-02-28 Thread Casey Stella
I think this is a really tricky topic, but necessary.  I've given it a bit
of thought over the last few months and I don't really see a great way to
do it given the Profiler.  Here's what I've come up with so far, though, in
my thinking.


   - Replaying events will compress events in time (e.g. 2 years of data
   may come through in 10 minutes)
   - Replaying events may result in events being out of order temporally
   even if it is written to kafka in order (just by virtue of hitting a
   different kafka partition)

Given both of these, in my mind we should handle replaying of data *not*
within a streaming context so we can control the order and the grouping of
the data.  In my mind, this is essentially the advent of batch Metron.  Off
the top of my head, I'm having trouble thinking about how to parallelize
this, however, in a pretty manner.

Imagine a scenario where telemetry A has an enrichment E1 that depends on
profile P1, and profile P1 depends on the previous 10 minutes of data.  How,
in a batch or streaming context, can we ever hope to ensure that the values
of P1 for the last 10 minutes are in place for every data point as data
flows through?  Now what if the values that P1 depends on are computed from
a profile P2?  Essentially you have a data dependency graph between
enrichments, profiles, and raw data that you need to work through in order.
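
As a rough illustration of "working through the graph in order", the sketch
below runs a topological sort (Kahn's algorithm) over the scenario above.
Nothing here exists in Metron: the class and method names are hypothetical,
and the assumption that P2 is computed directly from telemetry A is mine.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, for illustration only.
final class DependencyOrder {

  /**
   * Orders nodes so that each one appears after everything it depends on.
   * dependsOn maps a node to the nodes it needs.
   */
  static List<String> order(Map<String, List<String>> dependsOn) {
    Map<String, Integer> unmet = new LinkedHashMap<>();
    Map<String, List<String>> dependents = new LinkedHashMap<>();
    for (Map.Entry<String, List<String>> e : dependsOn.entrySet()) {
      unmet.put(e.getKey(), e.getValue().size());
      for (String dep : e.getValue()) {
        unmet.putIfAbsent(dep, 0); // a node that is only ever depended upon
        dependents.computeIfAbsent(dep, k -> new ArrayList<>()).add(e.getKey());
      }
    }

    Deque<String> ready = new ArrayDeque<>();
    for (Map.Entry<String, Integer> e : unmet.entrySet()) {
      if (e.getValue() == 0) {
        ready.add(e.getKey());
      }
    }

    List<String> ordered = new ArrayList<>();
    while (!ready.isEmpty()) {
      String node = ready.remove();
      ordered.add(node);
      for (String dependent : dependents.getOrDefault(node, List.of())) {
        if (unmet.merge(dependent, -1, Integer::sum) == 0) {
          ready.add(dependent);
        }
      }
    }
    if (ordered.size() != unmet.size()) {
      throw new IllegalStateException("dependency cycle detected");
    }
    return ordered;
  }

  public static void main(String[] args) {
    Map<String, List<String>> dependsOn = new LinkedHashMap<>();
    dependsOn.put("E1", List.of("P1")); // enrichment E1 needs profile P1
    dependsOn.put("P1", List.of("P2")); // P1 is computed from profile P2
    dependsOn.put("P2", List.of("A"));  // assumed: P2 is computed from raw telemetry A
    dependsOn.put("A", List.of());
    System.out.println(order(dependsOn)); // prints [A, P2, P1, E1]
  }
}
```

This only gives an order over the kinds of computation.  It says nothing
about how to make sure the right 10-minute window of each profile is ready
for each individual message, which is the hard part.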



On Tue, Feb 28, 2017 at 8:03 AM, Justin Leet wrote:

> There's a couple JIRAs related to the use of system time vs event time.
>
> METRON-590 Enable Use of Event Time in Profiler
> 
> METRON-691 Elastic Writer index partitions on system time, not event time
> 
>
> Is there anything else that needs to be making this distinction, and if so,
> do we need to be able to support both system time and event time for it?
>
> My immediate thought on this is that, once we work on replaying historical
> data, we'll want system time for geo data passing through.  Given that the
> geo files can update, we'd want to know which geo file we actually need to
> be using at the appropriate time.
>
> We'll probably also want to double check anything else that writes out data
> to a location and provides some sort of timestamping on it.
>
> Justin
>


Re: METRON-646 commit attribution

2017-02-27 Thread Casey Stella
+1

On Mon, Feb 27, 2017 at 10:11 PM, Kyle Richardson <kylerichards...@gmail.com
> wrote:

> Just to confirm. Please reply +1 if you're okay for me to commit the revert
> / re-commit of METRON-646 (PR#441).
>
> Thanks,
> Kyle
>
> On Mon, Feb 27, 2017 at 9:25 PM, Casey Stella <ceste...@gmail.com> wrote:
>
> > Yeah, I'd think there does not exist a time in which the email should be
> > null.  I'd rather just error out if you can't find it and the committer
> > doesn't put one in.
> >
> > I do agree that the most sensible way to pull the commit name is to pull it
> > from the repo's commit history.  Here's a 1-liner to use if you don't feel
> > like coming up with it yourself
> >
> > git clone https://github.com/kylerichardson/incubator-metron.git \
> >   --depth=1 --branch METRON-646 --single-branch METRON-646 >& /dev/null \
> >   && cd METRON-646 \
> >   && (git log | grep Author | awk -F: '{print $2}' | sed 's/^ //g') \
> >   && cd ..
> >
> >
> >
> > On Mon, Feb 27, 2017 at 9:15 PM, Nick Allen <n...@nickallen.org> wrote:
> >
> > > Sure.  It could validate the email address before letting you proceed.
> > >
> > > It tries to get the email from the author's Github profile.  If the
> > > author doesn't make one public, it will come back as 'null' and prompt
> > > you to change it.  Of course, it will just use 'null' if you don't
> > > provide an alternative.
> > >
> > > Most of the time I have to enter the email address manually because not
> > > many people make their email public. Even better would be to pull the
> > > email from the author's own commits in the PR.  That would reduce how
> > > often we have to manually input an email.
> > >
> > >
> > > On Mon, Feb 27, 2017 at 8:53 PM, Casey Stella <ceste...@gmail.com>
> > wrote:
> > >
> > > > Nick, what are your thoughts on adjusting the script to error out or
> > > > prompt for an email address if one can't be found?
> > > >
> > > > On Mon, Feb 27, 2017 at 8:51 PM, Nick Allen <n...@nickallen.org>
> > wrote:
> > > >
> > > > > I think revert and commit again is the best way to go.  Not a big
> > > > > deal.
> > > > >
> > > > > On Mon, Feb 27, 2017 at 6:55 PM, Casey Stella <ceste...@gmail.com>
> > > > wrote:
> > > > >
> > > > > > I think it should be changed, but I'm not sure how to change it.  I
> > > > > > think it should be changed because our git history is our legal
> > > > > > trail of attribution.  Mucking with it is relatively serious
> > > > > > business.
> > > > > >
> > > > > > As to how, normally I'd say git commit --amend --author
> > > > > > "kylerichardson <kylerichards...@gmail.com>" if we act before the
> > > > > > next commit and a git rebase otherwise, but it's pushed and
> > > > > > rewriting history for a push'd commit has consequences.  Not the
> > > > > > least of which is the scary force'd push.  The challenge here is
> > > > > > that all forked repos during this period between the wrong commit
> > > > > > and the correction commit will be based on a dead branch.  I guess
> > > > > > I would vote for 1, the revert and then the re-commit.
> > > > > >
> > > > > > I'd like to understand a bit more about how this happened.  Ryan,
> > > > > > can you walk us through how you did the commit so we can avoid it
> > > > > > in the future?
> > > > > >
> > > > > > Casey
> > > > > >
> > > > > >
> > > > > > On Mon, Feb 27, 2017 at 4:04 PM, Kyle Richardson <
> > > > > > kylerichards...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Ok, so here's the story... Ryan was nice enough to commit my
> > > > > > > recent PR and for whatever reason my github username but not my
> > > > > > > email address appears in the commit author (see below).
> > > > > > >
> > > > > > > commit 41fc0ddc9881d9cfdd8bae129c0bb7800a116d4c
> > > > > > > Author: kylerichardson 
> > > > > > > Date:   Mon Feb 27 11:38:55 2017 -0600
> > > > > > >
> > > > > > > METRON-646 Add index templates to metron-docker (kylerichardson via
> > > > > > > merrimanr) closes apache/incubator-metron#441
> > > > > > >
> > > > > > > My question is can it be left as is or does it need to include the
> > > > > > > email address per apache?
> > > > > > >
> > > > > > > If it needs to be changed, what are the acceptable options?
> > > > > > >
> > > > > > > (1) commit a revert and re-commit; maintains a record of everything
> > > > > > > (2) rebase one back, update, and force a push; like it never happened
> > > > > > > (3) another option I haven't considered?
> > > > > > >
> > > > > > > -Kyle
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: Travis CI Changes have broken my heart..... or my builds

2017-02-27 Thread Casey Stella
Nothing from me, sorry!

On Mon, Feb 27, 2017 at 9:31 PM, Otto Fowler <ottobackwa...@gmail.com>
wrote:

> I took out the parallel processing and now I see the error:
>
> ---
>
>  T E S T S
>
> ---
>
> Running org.apache.metron.maas.service.MaasIntegrationTest
>
> 2017-02-28 00:15:17,056 ERROR [main] nodemanager.LocalDirsHandlerService 
> (LocalDirsHandlerService.java:updateDirsAfterTest(356)) - Most of the disks 
> failed. 1/1 local-dirs are bad: 
> /home/travis/build/ottobackwards/incubator-metron/metron-analytics/metron-maas-service/target/MaasIntegrationTest/MaasIntegrationTest-localDir-nm-0_0;
>  1/1 log-dirs are bad: 
> /home/travis/build/ottobackwards/incubator-metron/metron-analytics/metron-maas-service/target/MaasIntegrationTest/MaasIntegrationTest-logDir-nm-0_0
>
> No output has been received in the last 10m0s, this potentially indicates a 
> stalled build or something wrong with the build itself.
>
> Check the details on how to adjust your build configuration on: 
> https://docs.travis-ci.com/user/common-build-problems/#Build-times-out-because-no-output-was-received
>
> The build has been terminated
>
>
>
>
> Ring any bells?
>
> On February 26, 2017 at 10:42:30, Otto Fowler (ottobackwa...@gmail.com)
> wrote:
>
> I think the problem is with overlapping jar analysis and shading
>
>
> On February 26, 2017 at 08:11:30, Otto Fowler (ottobackwa...@gmail.com)
> wrote:
>
> So, master builds fine, there is just something with my branch
>
>
> On February 25, 2017 at 09:31:22, Otto Fowler (ottobackwa...@gmail.com)
> wrote:
>
> Of course the same commands build locally
>
>
>
> On February 25, 2017 at 09:17:50, Otto Fowler (ottobackwa...@gmail.com)
> wrote:
>
> https://s3.amazonaws.com/archive.travis-ci.org/jobs/205276680/log.txt
>
>
> On February 25, 2017 at 08:20:54, Casey Stella (ceste...@gmail.com) wrote:
>
> I have not seen those "no output received for 10m" errors before. Can you
> change the Travis command to not have -q for maven so we can see more
> context?
> On Sat, Feb 25, 2017 at 07:56 Otto Fowler <ottobackwa...@gmail.com> wrote:
>
> > I have not had a build work on Travis CI ( linked to my fork ) in 4 days.
> >
> > https://travis-ci.org/ottobackwards/incubator-metron/builds
> >
> > This pretty much lines up with the last set of changes made to the travis
> > build. Is anyone else having this issue?
> >
>
>


Re: METRON-646 commit attribution

2017-02-27 Thread Casey Stella
Nick, what are your thoughts on adjusting the script to error out or prompt
for an email address if one can't be found?

On Mon, Feb 27, 2017 at 8:51 PM, Nick Allen <n...@nickallen.org> wrote:

> I think revert and commit again is the best way to go.  Not a big deal.
>
> On Mon, Feb 27, 2017 at 6:55 PM, Casey Stella <ceste...@gmail.com> wrote:
>
> > I think it should be changed, but I'm not sure how to change it. I think it
> > should be changed because our git history is our legal trail of
> > attribution.  Mucking with it is relatively serious business.
> >
> > As to how, normally I'd say git commit --amend --author
> > "kylerichardson <kylerichards...@gmail.com>" if we act before the next
> > commit and a git rebase otherwise, but it's pushed and rewriting history
> > for a push'd commit has consequences.  Not the least of which is the scary
> > force'd push.  The challenge here is that all forked repos during this
> > period between the wrong commit and the correction commit will be based on
> > a dead branch.  I guess I would vote for 1, the revert and then the
> > re-commit.
> >
> > I'd like to understand a bit more about how this happened.  Ryan, can you
> > walk us through how you did the commit so we can avoid it in the future?
> >
> > Casey
> >
> >
> > On Mon, Feb 27, 2017 at 4:04 PM, Kyle Richardson <
> > kylerichards...@gmail.com>
> > wrote:
> >
> > > Ok, so here's the story... Ryan was nice enough to commit my recent PR
> > > and for whatever reason my github username but not my email address
> > > appears in the commit author (see below).
> > >
> > > commit 41fc0ddc9881d9cfdd8bae129c0bb7800a116d4c
> > > Author: kylerichardson 
> > > Date:   Mon Feb 27 11:38:55 2017 -0600
> > >
> > > METRON-646 Add index templates to metron-docker (kylerichardson via
> > > merrimanr) closes apache/incubator-metron#441
> > >
> > > My question is can it be left as is or does it need to include the
> > > email address per apache?
> > >
> > > If it needs to be changed, what are the acceptable options?
> > >
> > > (1) commit a revert and re-commit; maintains a record of everything
> > > (2) rebase one back, update, and force a push; like it never happened
> > > (3) another option I haven't considered?
> > >
> > > -Kyle
> > >
> >
>


Re: METRON-646 commit attribution

2017-02-27 Thread Casey Stella
I think it should be changed, but I'm not sure how to change it. I think it
should be changed because our git history is our legal trail of
attribution.  Mucking with it is relatively serious business.

As to how, normally I'd say git commit --amend --author
"kylerichardson <kylerichards...@gmail.com>" if we act before the next
commit and a git rebase otherwise, but it's pushed and rewriting history
for a push'd commit has consequences.  Not the least of which is the scary
force'd push.  The challenge here is that all forked repos during this
period between the wrong commit and the correction commit will be based on
a dead branch.  I guess I would vote for 1, the revert and then the
re-commit.

I'd like to understand a bit more about how this happened.  Ryan, can you
walk us through how you did the commit so we can avoid it in the future?

Casey


On Mon, Feb 27, 2017 at 4:04 PM, Kyle Richardson wrote:

> Ok, so here's the story... Ryan was nice enough to commit my recent PR and
> for whatever reason my github username but not my email address appears in
> the commit author (see below).
>
> commit 41fc0ddc9881d9cfdd8bae129c0bb7800a116d4c
> Author: kylerichardson 
> Date:   Mon Feb 27 11:38:55 2017 -0600
>
> METRON-646 Add index templates to metron-docker (kylerichardson via
> merrimanr) closes apache/incubator-metron#441
>
> My question is can it be left as is or does it need to include the email
> address per apache?
>
> If it needs to be changed, what are the acceptable options?
>
> (1) commit a revert and re-commit; maintains a record of everything
> (2) rebase one back, update, and force a push; like it never happened
> (3) another option I haven't considered?
>
> -Kyle
>


Re: question about shading

2017-02-27 Thread Casey Stella
Metron-common is shaded (by relocating guava) so that we can use a more
recent version of guava, which some of our functionality relies upon but
which does not play nice with the version brought in with HBase.  I,
personally, believe that we should

   - Reduce our dependence on guava until we no longer need it in commons
   - Distribute some of the functionality in commons to smaller, more
   targeted modules
   - Remove the shading and relocating from commons
   - Make the proper exclusions on the leaf projects so that we do not have
   overlaps in dependencies across jars.

Casey

On Mon, Feb 27, 2017 at 2:31 PM, Otto Fowler wrote:

> Is there a reason why we shade ALL of the jars?  For example -
> metron-common is shaded.  But it is never ‘deployed’ to storm or yarn or mr
> as a stand alone…
>
> I would think that only the ‘outward’ facing libs would be shaded.
>


[DISCUSS] Making adding new 3rd-party Stellar functions easier

2017-02-27 Thread Casey Stella
Hi All,

The benefit of Stellar is that adding new functionality is as simple as
providing a Jar.  This enables people who want to integrate with Metron to
easily add enrichments or other functionality.  The current snag is that we
provide a single jar, so all stellar functions that we have available must
be dependencies of the main jar that drives the topology, plus whatever
local directories we can configure via the storm configs.  This makes the
process of adding 3rd party jars not as easy as it could be.

What I'm proposing is the following and I'd like to get some community
feedback on it:

   - Split the stellar lang out of metron-common into its own project, which
   does not shade its dependencies
      - this makes creating your own stellar functions easier as you only
      need depend on a small project
   - Adjust the following to additionally load classes from a location
   in HDFS /apps/metron/stellar using something like the Accumulo classloader
   (https://accumulo.apache.org/blog/2014/05/03/accumulo-classloader.html)
      - Profiler topology
      - Parser topology
      - Enrichment topology
      - Enrichment Flat file loader
      - Enrichment MR loader
   - Make the classloader reload upon new files
      - This would necessitate a new Stellar FunctionResolver

I'd like to propose starting with the first two and attempting the third
after we get something stable with the first 2.

What this will give us is the following workflow to enable new stellar
functions:

   - Build your function depending on stellar-lang into a Jar
   - Drop the new jar onto HDFS in /apps/metron/stellar
   - Restart the topology in question (after the 3rd bullet point, this is
   no longer required)
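
For a feel of the second bullet of the proposal, here is a minimal sketch of
"load classes from every jar in a directory" using only the JDK.  It is an
illustration under assumptions, not the proposed implementation: a local
directory stands in for /apps/metron/stellar on HDFS, the class and method
names are hypothetical, and discovery of the Stellar functions inside the
jars is left out entirely.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, for illustration only.
final class JarDirectoryLoader {

  /** Returns URLs for the *.jar files directly under dir, sorted by path. */
  static List<URL> jarUrls(Path dir) throws IOException {
    List<Path> jars;
    try (Stream<Path> entries = Files.list(dir)) {
      jars = entries
          .filter(p -> p.getFileName().toString().endsWith(".jar"))
          .sorted()
          .collect(Collectors.toList());
    }
    List<URL> urls = new ArrayList<>();
    for (Path jar : jars) {
      urls.add(jar.toUri().toURL());
    }
    return urls;
  }

  /** Builds a classloader over the jars in dir, delegating to parent first. */
  static URLClassLoader loaderFor(Path dir, ClassLoader parent) throws IOException {
    return new URLClassLoader(jarUrls(dir).toArray(new URL[0]), parent);
  }

  public static void main(String[] args) throws IOException {
    Path dir = Files.createTempDirectory("stellar");
    Files.createFile(dir.resolve("my-functions.jar")); // empty placeholder file
    Files.createFile(dir.resolve("notes.txt"));        // ignored: not a jar
    try (URLClassLoader loader =
             loaderFor(dir, JarDirectoryLoader.class.getClassLoader())) {
      System.out.println(loader.getURLs().length); // prints 1
    }
  }
}
```

With something like this in place, "restart the topology" amounts to building
a fresh loader at startup.  Reloading on new files (the third bullet) would
mean rebuilding the loader when the directory listing changes, which is why
it would need its own FunctionResolver.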

Thoughts?


[RESULT] [VOTE] Releasing Apache Metron (incubating) 0.3.1-RC5

2017-02-27 Thread Casey Stella
The release passes:
+1 (binding):

   - James Sirota
   - David Lyle
   - Ryan Merriman
   - Casey Stella

+1 (non-binding):

   - Justin Leet


Re: Odd integration-test failures on Fedora/CentOS for RC5

2017-02-25 Thread Casey Stella
METRON-743 (https://github.com/apache/incubator-metron/pull/467) for
reference.

On Sat, Feb 25, 2017 at 11:51 PM, Casey Stella <ceste...@gmail.com> wrote:

> Hmm, that's a very good catch if it's the issue.  I was able to verify
> that if you botch the sort order of the files that it fails.
>
> Would you mind sorting the files on PcapJob line 199 by filename?
> Something like
> Collections.sort(files, (o1,o2) -> o1.getName().compareTo(o2.getName()));
>
> I'm going to submit a PR regardless because we should own the assumptions
> here, but I suspect that for the HDFS filesystem this works as expected.
> That being said, it's better to be safe than sorry.
>
> Casey
>
> On Sat, Feb 25, 2017 at 11:35 PM, Otto Fowler <ottobackwa...@gmail.com>
> wrote:
>
>> /**
>>  * List the statuses and block locations of the files in the given path.
>>  * Does not guarantee to return the iterator that traverses statuses
>>  * of the files in a sorted order.
>>  * 
>>  * If the path is a directory,
>>  *   if recursive is false, returns files in the directory;
>>  *   if recursive is true, return files in the subtree rooted at the path.
>>  * If the path is a file, return the file's status and block locations.
>>  * 
>>  * @param f is the path
>>  * @param recursive if the subdirectories need to be traversed recursively
>>  *
>>  * @return an iterator that traverses statuses of the files
>>  *
>>  * @throws FileNotFoundException when the path does not exist;
>>  * @throws IOException see specific implementation
>>  */
>> public RemoteIterator<LocatedFileStatus> listFiles(
>>
>>
>> So if we depend on this returning something sorted, it is only working
>> accidentally?
>>
>>
>> On February 25, 2017 at 23:10:59, Otto Fowler (ottobackwa...@gmail.com)
>> wrote:
>>
>> https://issues.apache.org/jira/browse/HADOOP-12009  makes it seem like
>> there is no order
>>
>>
>> On February 25, 2017 at 23:06:37, Otto Fowler (ottobackwa...@gmail.com)
>> wrote:
>>
>> Maybe Hadoop Local FileSystem returns different things from ListFiles() on
>> different platforms?
>> That would be something to check?
>>
>> Sorry that is all I got right now
>>
>>
>>
>> On February 25, 2017 at 22:57:49, Otto Fowler (ottobackwa...@gmail.com)
>> wrote:
>>
>> There are also some if Log.isDebugEnabled() outputs, so maybe try changing
>> the logging level, maybe running just this test?
>>
>>
>>
>> On February 25, 2017 at 22:39:02, Otto Fowler (ottobackwa...@gmail.com)
>> wrote:
>>
>> There are multiple “tests” within the test, with different parameters.  If
>> you look at where this is breaking, it is at
>>
>> {
>>   //make sure I get them all.
>>   Iterable<byte[]> results =
>>   job.query(new Path(outDir.getAbsolutePath())
>>   , new Path(queryDir.getAbsolutePath())
>>   , getTimestamp(0, pcapEntries)
>>   , getTimestamp(pcapEntries.size()-1, pcapEntries) + 1
>>   , 10
>>   , new EnumMap<>(Constants.Fields.class)
>>   , new Configuration()
>>   , FileSystem.get(new Configuration())
>>   , new FixedPcapFilter.Configurator()
>>   );
>>   assertInOrder(results);
>>   Assert.assertEquals(Iterables.size(results), pcapEntries.size());
>>
>>
>>
>> Which is the 7th test job run against the data.  I am not familiar with
>> this test or code, but
>> that has to be significant.
>>
>> Maybe you should enable and print out the information of the results - and
>> we can see a pattern there?
>>
>> On February 25, 2017 at 22:19:00, Kyle Richardson (
>> kylerichards...@gmail.com)
>> wrote:
>>
>> mvn integration-test
>>
>> Although I have also tried...
>> mvn clean install && mvn integration-test
>> mvn clean package && mvn integration-test
>> mvn install && mvn surefire-test@unit-tests && mvn
>> surefire-test@integration-tests
>>
>> -Kyle
>>
>> On Feb 25, 2017, at 8:34 PM, Otto Fowler <ottobackwa...@gmail.com> wrote:
>>
>> What command are you using to build?
>>
>>
>>
>> On February 25, 2017 at 17:40:20, Kyle Richardson (
>> kylerichards...@gmail.com)
>> wrote:
>>
>> Tried with Oracle JDK and got the same result. I went as far as trying to
>> run it through the debugger but am not that familiar with this part 

Re: Odd integration-test failures on Fedora/CentOS for RC5

2017-02-25 Thread Casey Stella
Hmm, that's a very good catch if it's the issue.  I was able to verify that
the test fails if you botch the sort order of the files.

Would you mind sorting the files on PcapJob line 199 by filename?
Something like Collections.sort(files, (o1,o2) ->
o1.getName().compareTo(o2.getName()));

I'm going to submit a PR regardless because we should own the assumptions
here, but I suspect that for the HDFS filesystem this works as expected.
That being said, it's better to be safe than sorry.
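
For anyone following along, here is a self-contained sketch of the idea.  It
is not the PcapJob change itself: java.nio.file.Path stands in for Hadoop's
Path so that it runs without Hadoop on the classpath, and the class name and
file names are made up.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, for illustration only.
final class SortedListing {

  /** Returns a copy of files ordered by file name, ignoring parent directories. */
  static List<Path> sortByFileName(List<Path> files) {
    List<Path> sorted = new ArrayList<>(files);
    sorted.sort(Comparator.comparing((Path p) -> p.getFileName().toString()));
    return sorted;
  }

  public static void main(String[] args) {
    // Pretend this is what the filesystem listing handed back, in no
    // particular order.
    List<Path> listed = List.of(
        Paths.get("out", "pcap_3"),
        Paths.get("out", "pcap_1"),
        Paths.get("out", "pcap_2"));
    for (Path p : sortByFileName(listed)) {
      System.out.println(p.getFileName());
    }
  }
}
```

Sorting by name only gives chronological order if the names sort that way
lexicographically, so this leans on how the files are named.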

Casey

On Sat, Feb 25, 2017 at 11:35 PM, Otto Fowler wrote:

> /**
>  * List the statuses and block locations of the files in the given path.
>  * Does not guarantee to return the iterator that traverses statuses
>  * of the files in a sorted order.
>  * 
>  * If the path is a directory,
>  *   if recursive is false, returns files in the directory;
>  *   if recursive is true, return files in the subtree rooted at the path.
>  * If the path is a file, return the file's status and block locations.
>  * 
>  * @param f is the path
>  * @param recursive if the subdirectories need to be traversed recursively
>  *
>  * @return an iterator that traverses statuses of the files
>  *
>  * @throws FileNotFoundException when the path does not exist;
>  * @throws IOException see specific implementation
>  */
> public RemoteIterator<LocatedFileStatus> listFiles(
>
>
> So if we depend on this returning something sorted, it is only working
> accidentally?
>
>
> On February 25, 2017 at 23:10:59, Otto Fowler (ottobackwa...@gmail.com)
> wrote:
>
> https://issues.apache.org/jira/browse/HADOOP-12009  makes it seem like
> there is no order
>
>
> On February 25, 2017 at 23:06:37, Otto Fowler (ottobackwa...@gmail.com)
> wrote:
>
> Maybe Hadoop Local FileSystem returns different things from ListFiles() on
> different platforms?
> That would be something to check?
>
> Sorry that is all I got right now
>
>
>
> On February 25, 2017 at 22:57:49, Otto Fowler (ottobackwa...@gmail.com)
> wrote:
>
> There are also some if Log.isDebugEnabled() outputs, so maybe try changing
> the logging level, maybe running just this test?
>
>
>
> On February 25, 2017 at 22:39:02, Otto Fowler (ottobackwa...@gmail.com)
> wrote:
>
> There are multiple “tests” within the test, with different parameters.  If
> you look at where this is breaking, it is at
>
> {
>   //make sure I get them all.
>   Iterable<byte[]> results =
>   job.query(new Path(outDir.getAbsolutePath())
>   , new Path(queryDir.getAbsolutePath())
>   , getTimestamp(0, pcapEntries)
>   , getTimestamp(pcapEntries.size()-1, pcapEntries) + 1
>   , 10
>   , new EnumMap<>(Constants.Fields.class)
>   , new Configuration()
>   , FileSystem.get(new Configuration())
>   , new FixedPcapFilter.Configurator()
>   );
>   assertInOrder(results);
>   Assert.assertEquals(Iterables.size(results), pcapEntries.size());
>
>
>
> Which is the 7th test job run against the data.  I am not familiar with
> this test or code, but
> that has to be significant.
>
> Maybe you should enable and print out the information of the results - and
> we can see a pattern there?
>
> On February 25, 2017 at 22:19:00, Kyle Richardson (
> kylerichards...@gmail.com)
> wrote:
>
> mvn integration-test
>
> Although I have also tried...
> mvn clean install && mvn integration-test
> mvn clean package && mvn integration-test
> mvn install && mvn surefire-test@unit-tests && mvn surefire-test@integration-tests
>
> -Kyle
>
> On Feb 25, 2017, at 8:34 PM, Otto Fowler wrote:
>
> What command are you using to build?
>
>
>
> On February 25, 2017 at 17:40:20, Kyle Richardson (
> kylerichards...@gmail.com)
> wrote:
>
> Tried with Oracle JDK and got the same result. I went as far as trying to
> run it through the debugger but am not that familiar with this part of the
> code. The timestamps of the packets are definitely not coming back in the
> expected order, but I'm not sure why. Could it be related to something
> filesystem specific?
>
> Apologies if I'm just being dense but I'd really like to understand why
> this consistently fails on some platforms and not others.
>
> -Kyle
>
> > On Feb 25, 2017, at 9:07 AM, Kyle Richardson wrote:
> >
> > Ok, I've tried this so many times I may be going crazy, so thought I'd
> > ask the community for a sanity check.
> >
> > I'm trying to verify RC5 and I keep running into the same integration
> > test failures but only on my Fedora (24 and 25) and CentOS 7 systems. It
> > passes fine on my Macbook.
> >
> > It always fails on the PcapTopologyIntegrationTest (test results pasted
> > below). Anyone have any ideas? I'm using the exact same version of maven
> > in all cases (v3.3.9). The only difference I can think of is the
> > Fedora/CentOS systems are using OpenJDK whereas the Macbook is running
> > Sun/Oracle JDK.
> >
> > 

Re: Files modified after building

2017-02-25 Thread Casey Stella
Sorry, couldn't wait for Monday.  Fix is submitted for your review and
heckling: https://github.com/apache/incubator-metron/pull/466 :)

On Sat, Feb 25, 2017 at 11:22 PM, Matt Foley <ma...@apache.org> wrote:

> Well, the last three are in a path with “/generated/” in it.
>
>
> On 2/25/17, 7:39 PM, "Casey Stella" <ceste...@gmail.com> wrote:
>
> Crap, those are generated by the new profile selector dsl committed Friday.
> I must've missed them on a commit on the PR. They are generated by the
> build, so it was factored into the tests and such for the PR. Sorry for the
> inconvenience, I'll have to make another PR on Monday to get them in.
> On Sat, Feb 25, 2017 at 22:21 Kyle Richardson <
> kylerichards...@gmail.com>
> wrote:
>
> > That's my guess. I noticed the same thing.
> >
> > -Kyle
> >
> > > On Feb 25, 2017, at 8:31 PM, Otto Fowler <ottobackwa...@gmail.com>
> > wrote:
> > >
> > > Changes not staged for commit:
> > >
> > >  (use "git add <file>..." to update what will be committed)
> > >
> > >  (use "git checkout -- <file>..." to discard changes in working directory)
> > >
> > >
> > > modified:
> > > metron-analytics/metron-profiler-client/src/main/java/Window.tokens
> > >
> > > modified:
> > > metron-analytics/metron-profiler-client/src/main/java/WindowLexer.tokens
> > >
> > > modified:
> > > metron-analytics/metron-profiler-client/src/main/java/org/apache/metron/profiler/client/window/generated/WindowLexer.java
> > >
> > > modified:
> > > metron-analytics/metron-profiler-client/src/main/java/org/apache/metron/profiler/client/window/generated/WindowListener.java
> > >
> > > modified:
> > > metron-analytics/metron-profiler-client/src/main/java/org/apache/metron/profiler/client/window/generated/WindowParser.java
> > >
> > >
> > >
> > >
> > > Are these files changed by the build?
> >
>
>
>
>


Re: Files modified after building

2017-02-25 Thread Casey Stella
Crap, those are generated by the new profile selector dsl committed Friday.
I must've missed them on a commit on the PR. They are generated by the
build, so it was factored into the tests and such for the PR. Sorry for the
inconvenience, I'll have to make another PR on Monday to get them in.
On Sat, Feb 25, 2017 at 22:21 Kyle Richardson wrote:

> That's my guess. I noticed the same thing.
>
> -Kyle
>
> > On Feb 25, 2017, at 8:31 PM, Otto Fowler wrote:
> >
> > Changes not staged for commit:
> >
> >  (use "git add <file>..." to update what will be committed)
> >
> >  (use "git checkout -- <file>..." to discard changes in working directory)
> >
> >
> > modified:
> > metron-analytics/metron-profiler-client/src/main/java/Window.tokens
> >
> > modified:
> > metron-analytics/metron-profiler-client/src/main/java/WindowLexer.tokens
> >
> > modified:
> > metron-analytics/metron-profiler-client/src/main/java/org/apache/metron/profiler/client/window/generated/WindowLexer.java
> >
> > modified:
> > metron-analytics/metron-profiler-client/src/main/java/org/apache/metron/profiler/client/window/generated/WindowListener.java
> >
> > modified:
> > metron-analytics/metron-profiler-client/src/main/java/org/apache/metron/profiler/client/window/generated/WindowParser.java
> >
> >
> >
> >
> > Are these files changed by the build?
>


Re: [VOTE] Releasing Apache Metron (incubating) 0.3.1-RC5

2017-02-25 Thread Casey Stella
What exactly are the errors that you saw, Ryan?
On Sat, Feb 25, 2017 at 07:31 David Lyle <dlyle65...@gmail.com> wrote:

> Is there any reason full dev shouldn't be working?
>
> On Fri, Feb 24, 2017 at 9:19 PM, Casey Stella <ceste...@gmail.com> wrote:
>
> > Sounds like a good idea to me; thanks Ryan!
> > On Fri, Feb 24, 2017 at 21:11 Ryan Merriman <merrim...@gmail.com> wrote:
> >
> > > +1 binding
> > >
> > > Verified the signature
> > > Passed maven tests
> > > Started quick-dev, verified data in ES, kibana, and checked the
> > > topologies for errors (bro topology has parsing errors but I think a
> > > couple bad messages in bro data set is normal)
> > > Tested REPL
> > > RPMs built fine
> > >
> > > The recommended build validation wiki page
> > > (https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds)
> > > has some mistakes.  This did not run successfully in full-dev-platform
> > > and the HDFS paths look like they are old.  I am happy to update the
> > > wiki page if everyone agrees these are legitimate mistakes.
> > >
> > > On Fri, Feb 24, 2017 at 9:22 AM, Justin Leet <justinjl...@gmail.com>
> > > wrote:
> > >
> > > > +1 (non-binding)
> > > >
> > > > Verified signature
> > > > Ran build and tests in maven
> > > > Ran up in quick-dev and saw data flow through topologies into the UI
> > > > Ensured the REPL spun up and performed some basic tasks
> > > > Built rpms
> > > >
> > > > Justin
> > > >
> > > > On Thu, Feb 23, 2017 at 11:18 AM, Casey Stella <ceste...@gmail.com>
> > > wrote:
> > > >
> > > > > This is a call to vote on releasing Apache Metron 0.3.1-RC5 incubating
> > > > >
> > > > > Full list of changes in this release:
> > > > > https://dist.apache.org/repos/dist/dev/incubator/metron/0.3.1-RC5-incubating/CHANGES
> > > > >
> > > > > The tag/commit to be voted upon is apache-metron-0.3.1-rc5-incubating:
> > > > > https://git-wip-us.apache.org/repos/asf?p=incubator-metron.git;a=shortlog;h=refs/tags/apache-metron-0.3.1-rc5-incubating
> > > > >
> > > > > The source archive being voted upon can be found here:
> > > > > https://dist.apache.org/repos/dist/dev/incubator/metron/0.3.1-RC5-incubating/apache-metron-0.3.1-rc5-incubating.tar.gz
> > > > >
> > > > > Other release files, signatures and digests can be found here:
> > > > > https://dist.apache.org/repos/dist/dev/incubator/metron/0.3.1-RC5-incubating/
> > > > >
> > > > > The release artifacts are signed with the following key:
> > > > > https://git-wip-us.apache.org/repos/asf?p=incubator-metron.git;a=blob;f=KEYS;h=8381e96d64c249a0c1b489bc0c234d9c260ba55e;hb=refs/tags/apache-metron-0.3.1-rc5-incubating
> > > > >
> > > > > The book associated with this RC is located at
> > > > > https://dist.apache.org/repos/dist/dev/incubator/metron/0.3.1-RC5-incubating/book-site/index.html
> > > > >
> > > > > Please vote on releasing this package as Apache Metron 0.3.1-RC5
> > > > > incubating
> > > > >
> > > > > When voting, please list the actions taken to verify the release.
> > > > >
> > > > > Recommended build validation and verification instructions are posted
> > > > > here:
> > > > > https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds
> > > > >
> > > > >
> > > > > This vote will be open for at least 72 hours.
> > > > >
> > > > > [ ] +1 Release this package as Apache Metron 0.3.1-RC5 incubating
> > > > >
> > > > > [ ]  0 No opinion
> > > > >
> > > > > [ ] -1 Do not release this package because...
> > > > >
> > > >
> > >
> >
>


Re: Travis CI Changes have broken my heart..... or my builds

2017-02-25 Thread Casey Stella
I have not seen those "no output received for 10m" errors before. Can you
change the Travis command to not have -q for maven so we can see more
context?
On Sat, Feb 25, 2017 at 07:56 Otto Fowler wrote:

> I have not had a build work on Travis CI ( linked to my fork ) in 4 days.
>
> https://travis-ci.org/ottobackwards/incubator-metron/builds
>
> This pretty much lines up with the last set of changes made to the travis
> build.  Is anyone else having this issue?
>


Re: [VOTE] Releasing Apache Metron (incubating) 0.3.1-RC5

2017-02-24 Thread Casey Stella
Sounds like a good idea to me; thanks Ryan!
On Fri, Feb 24, 2017 at 21:11 Ryan Merriman <merrim...@gmail.com> wrote:

> +1 binding
>
> Verified the signature
> Passed maven tests
> Started quick-dev, verified data in ES, kibana, and checked the topologies
> for errors (bro topology has parsing errors but I think a couple bad
> messages in bro data set is normal)
> Tested REPL
> RPMs built fine
>
> The recommended build validation wiki page (https://cwiki.apache.org/
> confluence/display/METRON/Verifying+Builds
> <https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds>)
> has some mistakes.  This did
> not run successfully in full-dev-platform and the HDFS paths look like they
> are old.  I am happy to update the wiki page if everyone agrees these are
> legitimate mistakes.
>
> On Fri, Feb 24, 2017 at 9:22 AM, Justin Leet <justinjl...@gmail.com>
> wrote:
>
> > +1 (non-binding)
> >
> > Verified signature
> > Ran build and tests in maven
> > Ran up in quick-dev and saw data flow through topologies into the UI
> > Ensured the REPL spun up and performed some basic tasks
> > Built rpms
> >
> > Justin
> >
> > On Thu, Feb 23, 2017 at 11:18 AM, Casey Stella <ceste...@gmail.com>
> wrote:
> >
> > > This is a call to vote on releasing Apache Metron 0.3.1-RC5 incubating
> > >
> > > Full list of changes in this release:
> > > https://dist.apache.org/repos/dist/dev/incubator/metron/0.3.
> > > 1-RC5-incubating/CHANGES
> > >
> > > The tag/commit to be voted upon is apache-metron-0.3.1-rc5-incubating:
> > > https://git-wip-us.apache.org/repos/asf?p=incubator-metron.
> > > git;a=shortlog;h=refs/tags/apache-metron-0.3.1-rc5-incubating
> > >
> > > The source archive being voted upon can be found here:
> > > https://dist.apache.org/repos/dist/dev/incubator/metron/0.3.
> > > 1-RC5-incubating/apache-metron-0.3.1-rc5-incubating.tar.gz
> > >
> > > Other release files, signatures and digests can be found here:
> > > https://dist.apache.org/repos/dist/dev/incubator/metron/0.3.
> > > 1-RC5-incubating/
> > >
> > > The release artifacts are signed with the following key:
> > > https://git-wip-us.apache.org/repos/asf?p=incubator-metron.
> > > git;a=blob;f=KEYS;h=8381e96d64c249a0c1b489bc0c234d
> > 9c260ba55e;hb=refs/tags/
> > > apache-metron-0.3.1-rc5-incubating
> > >
> > > The book associated with this RC is located at
> > > https://dist.apache.org/repos/dist/dev/incubator/metron/0.3.
> > > 1-RC5-incubating/book-site/index.html
> > >
> > > Please vote on releasing this package as Apache Metron 0.3.1-RC5
> > incubating
> > >
> > > When voting, please list the actions taken to verify the release.
> > >
> > > Recommended build validation and verification instructions are posted
> > here:
> > > https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds
> > >
> > >
> > > This vote will be open for at least 72 hours.
> > >
> > > [ ] +1 Release this package as Apache Metron 0.3.1-RC5 incubating
> > >
> > > [ ]  0 No opinion
> > >
> > > [ ] -1 Do not release this package because...
> > >
> >
>


Re: JSONMapParser Normalizer aka Flattener

2017-02-24 Thread Casey Stella
I don't know, I think I'm ok with lists, but I might be biased.  I think
it's the nested maps that were the issue.  Flattening lists seems...wrong
to me.  Maybe that's wrong-headed, but there it is. ;)

On Fri, Feb 24, 2017 at 10:12 AM, Nick Allen wrote:

> Per Otto's advice, I am looking to reuse the normalizer/flattener mechanism
> that currently exists in JSONMapParser.  It looks like the mechanism is
> built into the class, so I will have to extract it.  It looks like landing
> it in JSONUtils is a logical place.
>
> It appears that the mechanism only handles maps, not lists.  Is that true?
> I will need to add similar functionality for lists to reuse this for
> METRON-686.
>
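
The behavior discussed above (nested maps are flattened, lists are left
alone) can be sketched as follows. This is a minimal illustration, not the
actual JSONMapParser code: the class name MapFlattener, the method names, and
the "." key separator are all assumptions made for the example.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

public class MapFlattener {

  /** Flattens nested maps into dotted keys; lists and scalars are copied through untouched. */
  public static Map<String, Object> flatten(Map<String, Object> input) {
    Map<String, Object> out = new LinkedHashMap<>();
    unfold("", input, out);
    return out;
  }

  // Walks one level of the map; recurses only when the value is itself a map.
  private static void unfold(String prefix, Map<?, ?> node, Map<String, Object> out) {
    for (Map.Entry<?, ?> entry : node.entrySet()) {
      // Hypothetical convention: join parent and child keys with "."
      String key = prefix.isEmpty()
          ? String.valueOf(entry.getKey())
          : prefix + "." + entry.getKey();
      Object value = entry.getValue();
      if (value instanceof Map) {
        unfold(key, (Map<?, ?>) value, out);
      } else {
        // Lists land here and are stored as-is, i.e. not flattened.
        out.put(key, value);
      }
    }
  }

  public static void main(String[] args) {
    Map<String, Object> src = new LinkedHashMap<>();
    src.put("ip", "10.0.0.1");
    src.put("ports", Arrays.asList(80, 443));

    Map<String, Object> message = new LinkedHashMap<>();
    message.put("src", src);
    message.put("protocol", "tcp");

    System.out.println(flatten(message));
  }
}
```

In this sketch the nested "src" map becomes the keys "src.ip" and "src.ports",
while the list under "src.ports" is kept as a list. Handling lists as well
would mean adding a second branch next to the map check.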


Re: [DISCUSS] Top domains enrichment config/extractor management

2017-02-24 Thread Casey Stella
Late to chime in here, but I feel that we have discussed Ambari's role
before and I think we should probably clarify, as a community, a few things
with regard to Ambari vs a management UI built around the REST PR currently
under review (I promise, I will get to the topic at hand eventually ;):

   - Where functionality should live
   - Who is responsible for what

I will now make a couple of (possibly controversial) statements, some of
which we have actually discussed prior to this on the dev list:


   - I view Ambari as managing the install and the static configuration for
   Metron.  For us, this would include zookeeper configs as well as topology
   configuration.  This would be the persistent store of truth.
   - I view Zookeeper to be our runtime configuration store for the
   topologies.


   - I view a management UI (and the Stellar Shell) as managing
   functionality for interacting with the system.  Where it changes
   configuration, it must go through Ambari.
   - I believe the management UI should be exposed as an ambari view

As such, I see the importation and management of enrichments, which is a
data task, to be squarely in the purview of the management UI, whose job is
the care and feeding of the data.  That being said, any configuration
changes to USE the enrichment should at least be routed through ambari, but
should be managed in the UI.

Now the question becomes, should we have enrichment collateral (I'm
including both hbase as well as geo or anything else we have) loaded at
install-time.  I would argue that we should not.  Rather, we should design
the management UI so that the enrichments can be added easily, with a
wizard to enable the use of the enrichment via Stellar for a sensor.

On that topic, I think we are doing too much as part of our install.  I
would argue that we shouldn't pre-load even the geo data or depend on it
for the default parsers.

Casey



On Tue, Feb 21, 2017 at 6:31 PM, Michael Miklavcic <
michael.miklav...@gmail.com> wrote:

> With the work committed in
> https://github.com/apache/incubator-metron/pull/445 and
> https://github.com/apache/incubator-metron/pull/432, we now have a robust
> and flexible means to import enrichment sources and transform their
> contents as they are inserted into HBase. One of the main motivators for
> this new functionality was to add the ability to load top domain rankings
> from sources such as Alexa. The proposal is to make this type of enrichment
> a top-level feature in Metron by introducing it to the Ambari management UI
> as a configurable set of properties in the MPack install. This comes with
> some options and challenges in how we want to manage the configurations,
> which I will outline below.
>
> *Use cases:*
>
>    - Single load of top domains file
>    - Re-loading top domains file - need to be able to cleanup properly
>    - Cleaning up/deleting old enrichment data (this is a general feature
>    that we currently lack - I think it is worth a separate Jira/PR for
>    creating a MapReduce job that enables cleanup to occur).
>    - Modifying default top domains file source - there are other options
>    besides Alexa. And users may want to load a file from local URI since
>    many data centers do not have direct access to the internet.
>    - Ability to modify the default extractor config JSON and tune the
>    Stellar transformations for both the value and indicator transforms.
>    Allows more flexible handling of data based on other sources.
>    - Loading multiple top domains source enrichments. (Maybe a separate PR
>    for this if we even think it would be useful)
>    - Updating the top domain enrichment - This needs to be an atomic
>    operation in order to prevent incorrect data.
>    - Rolling back to an older version of the top domains enrichment. Also
>    needs to be atomic.
>    - Ability to schedule an enrichment load on schedule - we would like to
>    defer this to an external scheduling mechanism, e.g. cron or Control M.
>    The enrichment loading system should have the necessary features to
>    enable this type of automation without data integrity issues.
>
> *Considerations:*
>
>    - As mentioned above, we want to add this feature to the Ambari MPack.
>    This requires at least 2 parameters to work. We need the ability to
>    specify a URI as well as an extractor config.
>    - How do we want to manage the extractor config? The most obvious
>    solution is to provide a text field in Ambari with a default JSON
>    config. When a load is initiated, Ambari would place a fresh copy of the
>    extractor config in the /tmp/ directory. This is an ephemeral file that
>    isn't needed other than during a load.
>    - It seems easy enough to have the load occur during the initial
>    install, however subsequent loads would require a different workflow.
>    How do folks feel about adding a set of dropdown options in the Ambari
>    UI for loading, updating, and deleting the top

Re: [DISCUSS] Metron Alerts UI

2017-02-24 Thread Casey Stella
Regarding alert ID, it seems like this is the kind of thing which should be
uniform for all the different types of indices: solr and HDFS.  You might
(and probably do) want to be able to join between IDs in HDFS and ES or
Solr, for instance, so it probably shouldn't be tied to the ES ID.  We
might want to make a Metron ID that is baked into the parsers and is a
SHA-2 hash of the data.
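
A minimal sketch of what such an ID could look like, assuming the ID is the
hex-encoded SHA-256 digest of the raw message bytes. The class name MessageId
and the method sha256Hex are hypothetical and exist only for this example;
nothing here is an existing Metron API.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class MessageId {

  /** Returns the lowercase hex SHA-256 digest of the raw message, usable as an index-independent ID. */
  public static String sha256Hex(String rawMessage) {
    try {
      MessageDigest digest = MessageDigest.getInstance("SHA-256");
      byte[] hash = digest.digest(rawMessage.getBytes(StandardCharsets.UTF_8));
      StringBuilder hex = new StringBuilder(hash.length * 2);
      for (byte b : hash) {
        // Two hex characters per byte, zero-padded.
        hex.append(String.format("%02x", b));
      }
      return hex.toString();
    } catch (NoSuchAlgorithmException e) {
      // SHA-256 is required on every Java platform, so this should not happen.
      throw new IllegalStateException("SHA-256 is unavailable", e);
    }
  }

  public static void main(String[] args) {
    // Hypothetical raw message; the field names are illustrative only.
    String raw = "{\"ip_src_addr\":\"10.0.0.1\",\"timestamp\":1487900000000}";
    System.out.println(sha256Hex(raw));
  }
}
```

Because the ID depends only on the message content, the same message gets the
same ID in ES, Solr and HDFS, which is what makes the join possible. The flip
side is that two byte-identical messages would share an ID, so whether that is
acceptable would need to be part of the discussion.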



On Fri, Feb 24, 2017 at 9:29 AM, Ryan Merriman <merrim...@gmail.com> wrote:

> Related to the 'What does "Escalate" do' question, one topic that needs
> some discussion is how we integrate with 3rd party ticketing systems.  How
> should we design this extension point?  Some basic requirements could be
> that a call is made to somewhere with the alert as the payload and some
> kind of ticket or issue id is received as a response.  This is a very
> open-ended question and there are likely several different ways we could
> do it.
>
> As for Casey's other points:
>
> - The most obvious choice for alert id would be the id in elasticsearch.
> Are there other ids we should consider?
> - Configurable display fields makes a lot of sense to me and should not be
> complex to implement.
> - Agreed on offering intuitive ways to filter messages by fields.
>
> Ryan
>
> On Thu, Feb 23, 2017 at 6:42 PM, Casey Stella <ceste...@gmail.com> wrote:
>
> >    - What does "Escalate" do exactly?
> >    - Where does the Alert ID come from?
> >    - Are the fields displayed configurable?
> >    - It'd be nice to be able to select a set of fields for a message and
> >    have the list of messages filter to just those where those fields are
> >    the same as the one viewed.
> >
> >
> > On Thu, Feb 23, 2017 at 3:24 PM, Houshang Livian <
> hliv...@hortonworks.com>
> > wrote:
> >
> > > Hello Metron Community,
> > >
> > > We have mocked up an Alerts UI for Metron for your consideration.
> Please
> > > take a look and share your thoughts.
> > >
> > > Here is a link to our thoughts on this:
> > > http://imgur.com/a/KMTKN
> > >
> > > Does this look like a reasonable place to start?
> > > Is there anything that is an absolute MUST have or MUST NOT have?
> > >
> > > Houshang Livian
> > >
> > >
> > >
> >
>


Re: [DISCUSS] Metron Alerts UI

2017-02-23 Thread Casey Stella
   - What does "Escalate" do exactly?
   - Where does the Alert ID come from?
   - Are the fields displayed configurable?
   - It'd be nice to be able to select a set of fields for a message and
   have the list of messages filter to just those where those fields are the
   same as the one viewed.


On Thu, Feb 23, 2017 at 3:24 PM, Houshang Livian 
wrote:

> Hello Metron Community,
>
> We have mocked up an Alerts UI for Metron for your consideration. Please
> take a look and share your thoughts.
>
> Here is a link to our thoughts on this:
> http://imgur.com/a/KMTKN
>
> Does this look like a reasonable place to start?
> Is there anything that is an absolute MUST have or MUST NOT have?
>
> Houshang Livian
>
>
>


[VOTE] Releasing Apache Metron (incubating) 0.3.1-RC5

2017-02-23 Thread Casey Stella
This is a call to vote on releasing Apache Metron 0.3.1-RC5 incubating

Full list of changes in this release:
https://dist.apache.org/repos/dist/dev/incubator/metron/0.3.1-RC5-incubating/CHANGES

The tag/commit to be voted upon is apache-metron-0.3.1-rc5-incubating:
https://git-wip-us.apache.org/repos/asf?p=incubator-metron.git;a=shortlog;h=refs/tags/apache-metron-0.3.1-rc5-incubating

The source archive being voted upon can be found here:
https://dist.apache.org/repos/dist/dev/incubator/metron/0.3.1-RC5-incubating/apache-metron-0.3.1-rc5-incubating.tar.gz

Other release files, signatures and digests can be found here:
https://dist.apache.org/repos/dist/dev/incubator/metron/0.3.1-RC5-incubating/

The release artifacts are signed with the following key:
https://git-wip-us.apache.org/repos/asf?p=incubator-metron.git;a=blob;f=KEYS;h=8381e96d64c249a0c1b489bc0c234d9c260ba55e;hb=refs/tags/apache-metron-0.3.1-rc5-incubating

The book associated with this RC is located at
https://dist.apache.org/repos/dist/dev/incubator/metron/0.3.1-RC5-incubating/book-site/index.html

Please vote on releasing this package as Apache Metron 0.3.1-RC5 incubating

When voting, please list the actions taken to verify the release.

Recommended build validation and verification instructions are posted here:
https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds


This vote will be open for at least 72 hours.

[ ] +1 Release this package as Apache Metron 0.3.1-RC5 incubating

[ ]  0 No opinion

[ ] -1 Do not release this package because...


Travis is taking a long time to start Metron builds

2017-02-23 Thread Casey Stella
Yesterday we waited all day for the METRON-734 PR to even start (only to
have it fail in a sporadic failure, but there's a PR for that too ;).

On the PR, I moved to allow the travis result to be an indication of a good
build after waiting 3 hours for Travis to start our build.  I would prefer
to not be required to take extreme measures to get a good build in a timely
manner.

As such, I have submitted an Apache INFRA ticket requesting some guidance
on the Travis situation: https://issues.apache.org/jira/browse/INFRA-13571

For those interested, please watch that JIRA and hopefully we can work with
INFRA to make this tolerable again.


Re: [MENTORS] The 0.3.1 RC is up at incubator-general, but we need your assistance

2017-02-22 Thread Casey Stella
Sorry, I was letting discussion happen on that separate thread, but I
should've responded earlier.  No, we no longer need reviews.  I'll be
cancelling the RC in incubator general momentarily.

On Wed, Feb 22, 2017 at 9:58 AM, P. Taylor Goetz <ptgo...@gmail.com> wrote:

> Given it looks like the community supports cancelling that vote and
> issuing a new RC, do you still want reviews?
>
> -Taylor
>
> > On Feb 22, 2017, at 9:01 AM, Casey Stella <ceste...@gmail.com> wrote:
> >
> > We have a release candidate up for a vote at incubator-general but it's
> not
> > gotten much attention (it's been up for more than 72 hours, but it only
> has
> > one vote).  Would any of you with a binding vote be willing to review it
> > for us?
> >
> > Casey
>
>


Re: [DISCUSS] 0.3.1 Release situation

2017-02-22 Thread Casey Stella
I'm in favor of moving 0.3.1 RC5 concurrent with master.  I see a number of
things there that will make the release better:

   - Better docs in the doc-book
   - The CEF parser


Casey

On Wed, Feb 22, 2017 at 7:46 AM, Kyle Richardson <kylerichards...@gmail.com>
wrote:

> +1 on pulling and cutting a new RC. Would we simply patch rc4 with this one
> change or include all of the master commits too?
>
> -Kyle
>
> On Wed, Feb 22, 2017 at 10:29 AM, Nick Allen <n...@nickallen.org> wrote:
>
> > +1 I agree with you Casey.  I think we should re-cut the release.
> >
> > On Wed, Feb 22, 2017 at 10:27 AM, Casey Stella <ceste...@gmail.com>
> wrote:
> >
> > > As you are all aware by now, we have an issue with our maven build.  In
> > > short, we tripped on https://github.com/maxmind/GeoIP2-java/issues/77
> > >
> > > As such, our build no longer works, but also our RC for 0.3.1 no longer
> > > builds.  I am inclined to pull the release candidate from voting on
> > > incubator general and re-cut a new candidate after the fix METRON-734 (
> > > https://github.com/apache/incubator-metron/pull/462) gets in later
> > today.
> > > My reasoning is that the current situation makes the release candidate
> > > un-releasable due to it not being able to be built.
> > >
> > > I would like to bring that decision to the community and get some
> > feedback,
> > > though, before I summarily retract the candidate on incubator general.
> > >
> > > Thoughts?
> > >
> > > Best,
> > >
> > > Casey
> > >
> >
>


[DISCUSS] 0.3.1 Release situation

2017-02-22 Thread Casey Stella
As you are all aware by now, we have an issue with our maven build.  In
short, we tripped on https://github.com/maxmind/GeoIP2-java/issues/77

As such, our build no longer works, but also our RC for 0.3.1 no longer
builds.  I am inclined to pull the release candidate from voting on
incubator general and re-cut a new candidate after the fix METRON-734 (
https://github.com/apache/incubator-metron/pull/462) gets in later today.
My reasoning is that the current situation makes the release candidate
un-releasable due to it not being able to be built.

I would like to bring that decision to the community and get some feedback,
though, before I summarily retract the candidate on incubator general.

Thoughts?

Best,

Casey


Re: [DISCUSS] Sketch Libraries

2017-02-22 Thread Casey Stella
Oh, one thing we are doing in t-digest is that the library can serialize
itself to a bytestream (presumably) in a tighter representation than the
default kryo serialization, which is nice.  Not sure if data streams has
the ability to serialize itself, but I wouldn't be surprised.  Anyway, not
a dealbreaker per se, just a thought.

On Wed, Feb 22, 2017 at 6:11 AM, Casey Stella <ceste...@gmail.com> wrote:

> So looking at it, it seems to fit the bill, with a couple of comments:
>
>    - The quantiles stuff provides a CDF and PMF function, which is
>    sufficient for our purposes.  I haven't seen any real comparison between
>    t-digests and their approach.  A cursory glance at the source code leads me
>    to believe that it's not tree-based, so I'd have to dig into it a bit more
>    to understand the tradeoffs of their approach vs a tree-based approach like
>    in t-digest
>    - The HLL stuff seems to be pure HLL, rather than HLL+, which is what
>    we support.  HLL+ has better accuracy characteristics for small sets, as I
>    recall.  I'll defer to Mike Miklavcic on that as I haven't read the paper
>    in a while.
>
> On the whole, I'd love to integrate with it and maybe swap out the
> t-digest approach for this since it has an active community around it.
>
> Anyway, thanks for bringing it to our attention and if anyone wants to
> take that on, I'd be on board with a +1 ;)
>
> Casey
>
> On Tue, Feb 21, 2017 at 10:22 PM, Matt Foley <ma...@apache.org> wrote:
>
>> Looks interesting.  Any indication whether it supports MAD (median
>> absolute deviation) for outlier detection?
>>
>>
>> On 2/21/17, 8:08 AM, "Nick Allen" <n...@nickallen.org> wrote:
>>
>> We currently use the tdunning/t-digest
>> <https://github.com/tdunning/t-digest> library for generating our
>> STATS_*
>> sketches and then a separate library addthis/stream-lib
>> <https://github.com/addthis/stream-lib> for doing the HLL distinct
>> count.
>>
>> I ran across another library originating from Yahoo that looks quite
>> featureful, well documented and quite active.  On the surface it
>> *seems* to
>> be able to do what we need for both the STATS_* sketches and HLL.
>>
>> https://datasketches.github.io/
>>
>>
>> Has anyone evaluated this library before?  Are there deficiencies as
>> compared to the libraries that we currently use?
>>
>>
>>
>>
>


Re: [DISCUSS] Coding style via checkstyle

2017-02-22 Thread Casey Stella
+1 to longer line lengths, beer on fridays and mother's apple pie.

On Wed, Feb 22, 2017 at 5:01 AM, Kyle Richardson <kylerichards...@gmail.com>
wrote:

> +1 to longer line lengths and blanket reformatting. Personally, I see IDE
> integration as a must for adoption of checkstyle.
>
> -Kyle
>
> On Wed, Feb 22, 2017 at 1:39 AM, Matt Foley <ma...@apache.org> wrote:
>
> > +1, so do I.  Also like the idea of providing the necessary IntelliJ
> > specification.
> >
> > On 2/21/17, 1:25 PM, "Otto Fowler" <ottobackwa...@gmail.com> wrote:
> >
> > +1.  I agree with Michael’s points.
> >
> > On February 21, 2017 at 16:23:21, Michael Miklavcic (
> > michael.miklav...@gmail.com) wrote:
> >
> > +1 to a blanket reformat, failed build for improper formatting, and
> > automated formatting. I strongly prefer to remove "thinking" from my
> > code
> > formatting and it has worked very well for me on large projects in
> the
> > past. There is capability now in IntelliJ to work with Checkstyle as
> > well.
> > https://youtrack.jetbrains.com/issue/IDEA-61520#comment=27-1292600
> > https://plugins.jetbrains.com/idea/plugin/1065-checkstyle-idea
> >
> > A quick search didn't yield any obviously robust tools for automating
> > the
> > formatting other than an older non-maintained project named Jalopy. I
> > think
> > the checkstyle integration with IntelliJ and Eclipse should suffice
> > since
> > the Maven plugin would give devs the ability to run checks locally
> and
> > in
> > Github via Travis.
> >
> >
> > On Tue, Feb 21, 2017 at 12:32 PM, Nick Allen <n...@nickallen.org>
> > wrote:
> >
> > > I would be in favor of a blanket, reformat. Whether that is for the
> > entire
> > > code base or one project at a time. Might be able to conquer and
> > divide
> > > some of the heavy-lifting of testing, if we do a project at a time.
> > But
> > > whichever way you think is easier. I'd be glad to help.
> > >
> > > On Tue, Feb 21, 2017 at 1:57 PM, Justin Leet <
> justinjl...@gmail.com>
> > > wrote:
> > >
> > > > I already tried a blanket, manual reformat the other day, through
> > > > IntelliJ. I did every file matching *.java in the project and it
> > was
> > > > pretty quick. I didn't validate everything looked perfect
> > afterwards,
> > > but I
> > > > did click into a few files and things looked fine. I'm not quite
> > sure
> > > what
> > > > the lifecycle of our autogenerated stuff is, so we'd want to
> regen
> > > > afterwards, but it's a pretty trivial thing to do.
> > > >
> > > > I'm sure there's more nuance (and definitely more testing) than
> > that,
> > but
> > > > off the top of my head I'm not sure what it would be. Either
> way, I
> > don't
> > > > think there's a huge amount of effort to just do the reformat,
> but
> > we'd
> > > > still want to spin everything up and test it and so on. It's
> > probably
> > > more
> > > > work for everybody to rebase onto the (vastly) reformatted code
> > than
> > > > anything else, which will vary pretty significantly.
> > > >
> > > > For (slight) context, the changes are enough to eliminate ~5k
> > checkstyle
> > > > warnings (and there might be more if we have to tweak anything in
> > the
> > > code
> > > > formatting).
> > > >
> > > > On Tue, Feb 21, 2017 at 10:34 AM, Casey Stella <
> ceste...@gmail.com
> > >
> > > wrote:
> > > >
> > > > > Any idea, with those modifications to checkstyle, how much
> > effort it
> > > will
> > > > > take to reformat the code to conform?
> > > > >
> > > > > On Tue, Feb 21, 2017 at 8:23 AM, Justin Leet <
> > justinjl...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > As part of:
> > > > > > https://issues.apache.org/jira/browse/METRON-726
> > > > > > https://github.com/apache/incubator-metron/pull/459
> > > > > >
> > > > > > I integ

Re: Build failures because of transitive dependencies.

2017-02-22 Thread Casey Stella
I'd vote to exclude and pull in ours and then test that it still functions.
I'd say that having the integration tests function should give a decent
idea if it'll work in vagrant, but of course we should run it up in vagrant
as well.

Casey

On Wed, Feb 22, 2017 at 6:07 AM, Justin Leet wrote:

> Long story short, geoip2 lib behaves badly and one of its dependencies
> pulls in an open ended range of jackson-databind which is breaking our
> build as it tries to pull in 2.9.0-SNAPSHOT. See:
> https://github.com/maxmind/GeoIP2-java/issues/77
>
> As noted in that thread, the easy solution is to just exclude jackson and
> pull in ours.  The catch is that geoip2 specifies a slightly higher version
> of jackson than we globally use (2.8.x instead of 2.7.x).  Adding the
> exclusion does allow for compiling, but I haven't tested anything.
>
> I'd like to get thoughts on our preferred way to fix this.  E.g. add the
> exclusion alone and test?  Bump the jackson version in enrichments
> specifically and test? bump the whole version and test?
>
> Ticket to track
> https://issues.apache.org/jira/browse/METRON-734
>
> Thanks, Justin
>


Re: [DISCUSS] Sketch Libraries

2017-02-22 Thread Casey Stella
So looking at it, it seems to fit the bill, with a couple of comments:

   - The quantiles stuff provides a CDF and PMF function, which is
   sufficient for our purposes.  I haven't seen any real comparison between
   t-digests and their approach.  A cursory glance at the source code leads me
   to believe that it's not tree-based, so I'd have to dig into it a bit more
   to understand the tradeoffs of their approach vs a tree-based approach like
   in t-digest
   - The HLL stuff seems to be pure HLL, rather than HLL+, which is what we
   support.  HLL+ has better accuracy characteristics for small sets, as I
   recall.  I'll defer to Mike Miklavcic on that as I haven't read the paper
   in a while.

On the whole, I'd love to integrate with it and maybe swap out the t-digest
approach for this since it has an active community around it.

Anyway, thanks for bringing it to our attention and if anyone wants to take
that on, I'd be on board with a +1 ;)

Casey

On Tue, Feb 21, 2017 at 10:22 PM, Matt Foley wrote:

> Looks interesting.  Any indication whether it supports MAD (median
> absolute deviation) for outlier detection?
>
>
> On 2/21/17, 8:08 AM, "Nick Allen" wrote:
>
> We currently use the tdunning/t-digest
> <https://github.com/tdunning/t-digest> library for generating our
> STATS_*
> sketches and then a separate library addthis/stream-lib
> <https://github.com/addthis/stream-lib> for doing the HLL distinct
> count.
>
> I ran across another library originating from Yahoo that looks quite
> featureful, well documented and quite active.  On the surface it
> *seems* to
> be able to do what we need for both the STATS_* sketches and HLL.
>
> https://datasketches.github.io/
>
>
> Has anyone evaluated this library before?  Are there deficiencies as
> compared to the libraries that we currently use?
>
>
>
>
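
For readers unfamiliar with the MAD question raised above, here is a minimal
sketch of median absolute deviation used as an outlier test. It computes the
exact value over an in-memory array, so it says nothing about whether any of
the sketch libraries discussed here support it. The class name, the method
names and the threshold parameter are assumptions made for the example.

```java
import java.util.Arrays;

public class MedianAbsoluteDeviation {

  /** Exact median of the values; the input array is not modified. */
  static double median(double[] values) {
    double[] sorted = values.clone();
    Arrays.sort(sorted);
    int mid = sorted.length / 2;
    return sorted.length % 2 == 1
        ? sorted[mid]
        : (sorted[mid - 1] + sorted[mid]) / 2.0;
  }

  /** MAD = median(|x_i - median(x)|). */
  public static double mad(double[] values) {
    double center = median(values);
    double[] deviations = new double[values.length];
    for (int i = 0; i < values.length; i++) {
      deviations[i] = Math.abs(values[i] - center);
    }
    return median(deviations);
  }

  /** Flags x when its distance from the median exceeds threshold * MAD. */
  public static boolean isOutlier(double x, double[] values, double threshold) {
    double spread = mad(values);
    // With zero spread every deviation would look infinite, so report no outlier.
    return spread > 0 && Math.abs(x - median(values)) / spread > threshold;
  }

  public static void main(String[] args) {
    double[] values = {1, 2, 3, 4, 100};
    System.out.println("MAD = " + mad(values));
    System.out.println("100 is outlier: " + isOutlier(100, values, 3.5));
  }
}
```

A sketch-based version would replace the exact medians with approximate
quantile queries, which is why the quantile support of the library matters
for this use case.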


[MENTORS] The 0.3.1 RC is up at incubator-general, but we need your assistance

2017-02-22 Thread Casey Stella
We have a release candidate up for a vote at incubator-general but it's not
gotten much attention (it's been up for more than 72 hours, but it only has
one vote).  Would any of you with a binding vote be willing to review it
for us?

Casey


Re: [DISCUSS] Coding style via checkstyle

2017-02-21 Thread Casey Stella
+1 to blanket reformat as well.

On Tue, Feb 21, 2017 at 1:25 PM, Otto Fowler <ottobackwa...@gmail.com>
wrote:

> +1.  I agree with Michael’s points.
>
>
> On February 21, 2017 at 16:23:21, Michael Miklavcic (
> michael.miklav...@gmail.com) wrote:
>
> +1 to a blanket reformat, failed build for improper formatting, and
> automated formatting. I strongly prefer to remove "thinking" from my code
> formatting and it has worked very well for me on large projects in the
> past. There is capability now in IntelliJ to work with Checkstyle as well.
> https://youtrack.jetbrains.com/issue/IDEA-61520#comment=27-1292600
> https://plugins.jetbrains.com/idea/plugin/1065-checkstyle-idea
>
> A quick search didn't yield any obviously robust tools for automating the
> formatting other than an older non-maintained project named Jalopy. I think
> the checkstyle integration with IntelliJ and Eclipse should suffice since
> the Maven plugin would give devs the ability to run checks locally and in
> Github via Travis.
>
>
> On Tue, Feb 21, 2017 at 12:32 PM, Nick Allen <n...@nickallen.org> wrote:
>
> > I would be in favor of a blanket, reformat. Whether that is for the
> entire
> > code base or one project at a time. Might be able to conquer and divide
> > some of the heavy-lifting of testing, if we do a project at a time. But
> > whichever way you think is easier. I'd be glad to help.
> >
> > On Tue, Feb 21, 2017 at 1:57 PM, Justin Leet <justinjl...@gmail.com>
> > wrote:
> >
> > > I already tried a blanket, manual reformat the other day, through
> > > IntelliJ. I did every file matching *.java in the project and it was
> > > pretty quick. I didn't validate everything looked perfect afterwards,
> > but I
> > > did click into a few files and things looked fine. I'm not quite sure
> > what
> > > the lifecycle of our autogenerated stuff is, so we'd want to regen
> > > afterwards, but it's a pretty trivial thing to do.
> > >
> > > I'm sure there's more nuance (and definitely more testing) than that,
> but
> > > off the top of my head I'm not sure what it would be. Either way, I
> don't
> > > think there's a huge amount of effort to just do the reformat, but we'd
> > > still want to spin everything up and test it and so on. It's probably
> > more
> > > work for everybody to rebase onto the (vastly) reformatted code than
> > > anything else, which will vary pretty significantly.
> > >
> > > For (slight) context, the changes are enough to eliminate ~5k
> checkstyle
> > > warnings (and there might be more if we have to tweak anything in the
> > code
> > > formatting).
> > >
> > > On Tue, Feb 21, 2017 at 10:34 AM, Casey Stella <ceste...@gmail.com>
> > wrote:
> > >
> > > > Any idea, with those modifications to checkstyle, how much effort it
> > will
> > > > take to reformat the code to conform?
> > > >
> > > > On Tue, Feb 21, 2017 at 8:23 AM, Justin Leet <justinjl...@gmail.com>
> > > > wrote:
> > > >
> > > > > As part of:
> > > > > https://issues.apache.org/jira/browse/METRON-726
> > > > > https://github.com/apache/incubator-metron/pull/459
> > > > >
> > > > > I integrated checkstyle into the mvn:site command, and have
> > checkstyle
> > > > > reports being run as part of the mvn:site reporting. I expect to be
> > > > > celebrating hitting 25k checkstyle warnings soon.
> > > > >
> > > > > I tested out creating a code formatting setup in IntelliJ, with a
> > > couple
> > > > > slight modifications of the default Sun conventions (extended the
> > > > character
> > > > > limit of a line past 80 and made it two space indents). Given that
> > > > > checkstyle includes it as a default option, it's probably
> reasonably
> > > > close
> > > > > to the Sun conventions. I'm thinking we probably also at least
> create
> > > an
> > > > > Eclipse profile, to open up ease of development.
> > > > >
> > > > > There's probably also a discussion about how exactly we want to
> > enforce
> > > > it.
> > > > > Is it just something we add to the PR checklist and have reviewers
> > > give a
> > > > > glance, do we setup a hook to autoformat code, etc?
> > > > >
> > > > > Justin
> > > > >
> > > >
> > >
> >
>


Re: [DISCUSS] Coding style via checkstyle

2017-02-21 Thread Casey Stella
Any idea, with those modifications to checkstyle, how much effort it will
take to reformat the code to conform?

On Tue, Feb 21, 2017 at 8:23 AM, Justin Leet wrote:

> As part of:
> https://issues.apache.org/jira/browse/METRON-726
> https://github.com/apache/incubator-metron/pull/459
>
> I integrated checkstyle into the mvn:site command, and have checkstyle
> reports being run as part of the mvn:site reporting. I expect to be
> celebrating hitting 25k checkstyle warnings soon.
>
> I tested out creating a code formatting setup in IntelliJ, with a couple
> slight modifications of the default Sun conventions (extended the character
> limit of a line past 80 and made it two space indents). Given that
> checkstyle includes it as a default option, it's probably reasonably close
> to the Sun conventions. I'm thinking we should probably also at least create an
> Eclipse profile, to open up ease of development.
>
> There's probably also a discussion about how exactly we want to enforce it.
> Is it just something we add to the PR checklist and have reviewers give a
> glance, do we set up a hook to autoformat code, etc?
>
> Justin
>


Re: [DISCUSS][PROPOSAL] Side Loading and Installation of telemetry sources [METRON-258]

2017-02-17 Thread Casey Stella
Ok, this is a long one, so don't expect a coherent response just yet, but I
will give some initial impressions:

   - I strongly agree with the premise of this idea.  Making Metron
   extensible is and should be among our top priorities and, at the
   moment, it's painful to develop a new parser.
   - One maven module per parser may be overkill here as the shading is
   costly and I think it may make some sense to group based on characteristics
   in some way (e.g. json and csv may get grouped together).
   - The notion of instance vs parser is a good one
   - Binding ES templates and parsers may not be a good idea.  You can have
   non-indexed parsers (e.g. streaming enrichments).

Can we start small here and then iterate toward the complete vision?  I'd
recommend

   - Splitting the parsers up into some coherent organization with common
   bits separated from the parser itself
   - Having a maven archetype

as the two most valuable and achievable parts of this idea, since they are
the bits required to enable users to create parsers without forking Metron.

On Fri, Feb 17, 2017 at 11:54 AM, Otto Fowler wrote:

> The ability for implementors and developers building on the project to
> ‘side load’, that is to build, maintain, and install, telemetry sources
> into the system without having to actually develop within METRON itself is
> very important.
>
> If done properly it gives developers an easier and more manageable
> proposition for extending METRON to suit their needs in what may be the
> most common extension case.  It also may reduce the necessity to create and
> maintain forks of METRON.
>
> I would like to put forward a proposal on a way to move this forward, and
> ask the community for feedback and assistance in reaching an acceptable
> approach and raising the issues that I have surely missed.
>
> Conceptually what I would like to propose is the following:
>
> * What is currently metron-parsers should be broken apart such that each
> parser is its own individual component
> * Each of these components should be completely self contained ( or produce
> a self contained package )
> * These packages will include the shaded jar for the parser, default
> configurations for the parser and enrichment, default elasticsearch
> template, and a default log-rotate script
> * These packages will be deployed to disk in a new library directory under
> metron
> * Zookeeper should have a new telemetry or source area where all
> ‘installed’ sources exist
> * This area would host the default configurations, rules, templates, and
> scripts and metadata
> * Installed sources can be instantiated as named instances
> * Instantiating an instance will move the default configurations to what is
> currently the enrichment and parser areas for the instance name
> * It will also deploy the elasticsearch template for the instance name
> * It will deploy the log-rotate scripts
> * Installed and instantiated sources can be ‘redeployed’ from disk to upgrade
> * Installed sources are available for selection in ambari
> * question on post selection configuration, but we have that problem
> already
> * Instantiation is exposed through REST
> * the UI can install a new package
> * the UI can allow a workflow to edit the configurations and templates
> before finalizing
> * are there three states here?   Installed | Edited | Instantiated?
> * the UI can edit existing and redeploy
> * possibly re-deploy ES template after adding fields or account for fields
> added by enrichment…. manually or automatically?
> * a script can be made to instantiate a ‘base’ parser ( json, grok, csv )
> with only configuration
> * The installation and instantiation should be exposed through the Stellar
> management console
> * Starting a topology will now start the parser’s shaded jar found through
> the parser type ( which may need to be added to the configurations ) and the
> library
> * A Maven Archetype should be created for a parser | telemetry source
> project that allows the proper setup of a development project outside the
> METRON source tree
> * should be published
> * should have a useful default set
>
> So the developer’s workflow:
>
> * Create a new project from the archetype outside of the metron tree
> * edit the configurations, templates, rules etc in the project
> * code or modify the sample
> * build
> * run the installer script or the ui to upload/deploy the package
> * use the console or ui to create an instance
>
> QUESTIONS:
> * it seems strange to have this as ‘parsers’ when conceptually parsers are
> a part of the whole, should we introduce something like ‘source’ that is
> all of it?
> * should configurations etc be in ZK or on disk? or HDFS? or All of the
> above?
> * did you read this far?  good!
> * I am sure that after hitting send I will think of 10 things that are
> missing from this
>
> I have started a POC of this, and thus far have created
> metron-parsers-common 

Re: dependencies_with_url.csv

2017-02-15 Thread Casey Stella
+1 to that :)

On Wed, Feb 15, 2017 at 8:15 AM, Otto Fowler <ottobackwa...@gmail.com> wrote:

> METRON-722
>
>
> On February 15, 2017 at 10:16:02, Casey Stella (ceste...@gmail.com) wrote:
>
> to verify that nobody has created a transitive
> dependency with a license that we don't know about in the build
>
>


Re: How would one debug MaaS service components during the integration test?

2017-02-15 Thread Casey Stella
Ok, so it's a bit weird, I'll admit, but it's the nature of the beast with
Yarn apps.

Here's the issue:
Code which executes in Yarn containers (e.g.
org.apache.metron.maas.service.runner.Runner, which is what the container
executes and is responsible for spinning up the model shell script and
registering the started model with the service registry) gets run in
separate processes, so a debugger attached to the test process isn't going
to help you.

What I advise here for debugging is to

   - Turn on more verbose logging by dropping a log4j.properties into
   src/main/resources that turns on debug logging or whatever level you think
   is appropriate
   - Run MaaS integration test
   - Monitor the target/MaasIntegrationTest directory.

Run find target/MaasIntegrationTest and you'll see a plethora of log files
for the app master and for the individual containers, etc.

This is what mine looks like during a run (files irrelevant to debugging
not included):

./MaasIntegrationTest-logDir-nm-0_0/application_1487172564786_0001/container_1487172564786_0001_01_01
./MaasIntegrationTest-logDir-nm-0_0/application_1487172564786_0001/container_1487172564786_0001_01_01/AppMaster.stderr
./MaasIntegrationTest-logDir-nm-0_0/application_1487172564786_0001/container_1487172564786_0001_01_01/AppMaster.stdout
./MaasIntegrationTest-logDir-nm-0_0/application_1487172564786_0001/container_1487172564786_0001_01_02/stderr
./MaasIntegrationTest-logDir-nm-0_0/application_1487172564786_0001/container_1487172564786_0001_01_02/stdout
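
As a rough sketch of the log hunt described above (not part of Metron; the
helper names find_container_logs and tail are made up for illustration),
a small Python script along these lines could list the stdout/stderr files
under target/MaasIntegrationTest and print the last few lines of each.  It
assumes only the directory layout shown in the listing above.

```python
import os

# File-name endings seen in the listing above (AppMaster.stderr, stdout, ...).
LOG_SUFFIXES = ("stderr", "stdout")


def find_container_logs(root):
    """Return the sorted paths of all stdout/stderr files below root."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(LOG_SUFFIXES):
                found.append(os.path.join(dirpath, name))
    return sorted(found)


def tail(path, max_lines=20):
    """Return at most the last max_lines lines of a log file."""
    with open(path, errors="replace") as handle:
        return handle.readlines()[-max_lines:]


if __name__ == "__main__":
    # Hypothetical invocation from the module that owns the integration test.
    for log in find_container_logs("target/MaasIntegrationTest"):
        print("==>", log)
        print("".join(tail(log)), end="")
```

Running it while the test is still going shows which container has started
writing output; the equivalent one-off check is the find command mentioned
above.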


I know this is weird and it's inconvenient to have to go through these
unnatural acts to get log files, much less not being able to set breakpoints,
but there are multiple containers and once you leave the process boundary,
your debugger, as it's set up now, can't communicate.  If you figure out a
better way, it's probably worth noting.  At the end of this, I'll summarize
how I suggest debugging in a big fat comment on the Integration Test, so
future parties don't have to figure it out.

Thanks for your patience!

Casey

On Wed, Feb 15, 2017 at 7:04 AM, Otto Fowler wrote:

> I would like to debug the MaaS service components while running the
> MaasIntegrationTest, but none of my breakpoints ever hit.  Has anyone ever
> done this?
>
> Specifically I want to step through how the dummy_rest.sh gets called and
> started.
>


Re: dependencies_with_url.csv

2017-02-15 Thread Casey Stella
It's hand-edited and used to verify that nobody has created a transitive
dependency with a license that we don't know about in the build.  We should
probably document it in the developer guidelines.
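
To make the idea concrete, here is a minimal sketch of that kind of check in
Python.  It is not the build's actual verification step.  It assumes that
the first column of dependencies_with_url.csv is a Maven coordinate and that
the current coordinates have already been collected (for example from the
output of mvn dependency:list); the function names and sample coordinates
are hypothetical.

```python
import csv
import io


def known_coordinates(csv_text):
    """Collect the first column (assumed to be the Maven coordinate) of each row."""
    reader = csv.reader(io.StringIO(csv_text))
    return {row[0].strip() for row in reader if row and row[0].strip()}


def unknown_dependencies(current, csv_text):
    """Return the coordinates in current that the hand-edited CSV does not list."""
    known = known_coordinates(csv_text)
    return sorted(dep for dep in set(current) if dep not in known)
```

Anything this returns would be a dependency whose license nobody has
recorded yet, which is the situation the file exists to catch.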

On Wed, Feb 15, 2017 at 7:14 AM, Otto Fowler wrote:

> Is this generated or hand edited?  I don’t see it mentioned in the
> developer guidelines.
>


Re: [DISCUSS] Update Metron Release Documentation

2017-02-15 Thread Casey Stella
On the subject, we should also document updating the releases page after a
release and figure out how old books are stored/served up.  Anyone have
thoughts on that?

On Wed, Feb 15, 2017 at 9:53 AM, Casey Stella <ceste...@gmail.com> wrote:

> Yeah I agree we need to document it there.
>
> On Tue, Feb 14, 2017 at 09:57 zeo...@gmail.com <zeo...@gmail.com> wrote:
>
>> As a follow-up to METRON-716, I would like to suggest that we update our
>> Metron Release documentation
>> <https://cwiki.apache.org/confluence/display/METRON/Release+Process> to
>> account for the site-book.  Specifically, I think that Step 4 and Step 9
>> need a bit of a refresher.
>>
>> In the most recent build, Casey appears to have handled this by building
>> the site-book and then releasing it to
>> https://dist.apache.org/repos/dist/dev/incubator/metron/0.3.1-RC4-incubating/book-site,
>> documenting that in the VOTE thread.
>>
>> My initial question is, is there a reason to use the "book-site" folder
>> name, as opposed to "site-book"?  I would prefer to pick a standard and
>> stick with it, if possible.
>>
>> Regardless, I am suggesting that under Step 4 we add the following bullet
>> under the "The artifacts for a release" section:
>>
>> "- The site-book documentation, as generated using the most recent
>> documentation under the site-book/README.md."
>>
>> And under Step 9 we add the following:
>>
>> "- Update the Metron site documentation links to point to the
>> documentation
>> for the most recent release."
>>
>> Right now the website points to the wiki
>> <https://cwiki.apache.org/confluence/display/METRON/Documentation>.
>> Thoughts?
>>
>> Jon
>> --
>>
>> Jon
>>
>> Sent from my mobile device
>>
>


[RESULTS][VOTE] Releasing Apache Metron (incubating) 0.3.1-RC4

2017-02-15 Thread Casey Stella
This vote passes.  I will be submitting this to the incubator general.

+1:
Casey Stella (binding)
James Sirota (binding)
Anand Subramanian (non-binding)
Matt Foley (non-binding)


Re: [VOTE] Releasing Apache Metron (incubating) 0.3.1-RC4

2017-02-15 Thread Casey Stella
+1

On Wed, Feb 15, 2017 at 7:03 AM, Casey Stella <ceste...@gmail.com> wrote:

> Done.
>
> On Wed, Feb 15, 2017 at 10:00 AM, Casey Stella <ceste...@gmail.com> wrote:
>
>> That sounds good to me, Matt.  I'll substitute the file and we can note
>> it as a known issue.
>


Re: [VOTE] Releasing Apache Metron (incubating) 0.3.1-RC4

2017-02-15 Thread Casey Stella
Done.

On Wed, Feb 15, 2017 at 10:00 AM, Casey Stella <ceste...@gmail.com> wrote:

> That sounds good to me, Matt.  I'll substitute the file and we can note it
> as a known issue.


Re: [DISCUSS] Update Metron Release Documentation

2017-02-15 Thread Casey Stella
Yeah I agree we need to document it there.

On Tue, Feb 14, 2017 at 09:57 zeo...@gmail.com <zeo...@gmail.com> wrote:

> As a follow-up to METRON-716, I would like to suggest that we update our
> Metron Release documentation
> <https://cwiki.apache.org/confluence/display/METRON/Release+Process> to
> account for the site-book.  Specifically, I think that Step 4 and Step 9
> need a bit of a refresher.
>
> In the most recent build, Casey appears to have handled this by building
> the site-book and then releasing it to
> https://dist.apache.org/repos/dist/dev/incubator/metron/0.3.1-RC4-incubating/book-site,
> documenting that in the VOTE thread.
>
> My initial question is, is there a reason to use the "book-site" folder
> name, as opposed to "site-book"?  I would prefer to pick a standard and
> stick with it, if possible.
>
> Regardless, I am suggesting that under Step 4 we add the following bullet
> under the "The artifacts for a release" section:
>
> "- The site-book documentation, as generated using the most recent
> documentation under the site-book/README.md."
>
> And under Step 9 we add the following:
>
> "- Update the Metron site documentation links to point to the documentation
> for the most recent release."
>
> Right now the website points to the wiki
> <https://cwiki.apache.org/confluence/display/METRON/Documentation>.
> Thoughts?
>
> Jon
> --
>
> Jon
>
> Sent from my mobile device
>


Re: [VOTE] Releasing Apache Metron (incubating) 0.3.1-RC4

2017-02-15 Thread Casey Stella
That sounds good to me, Matt.  I'll substitute the file and we can note it
as a known issue.
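
The formatting problem Matt describes in the quoted message (four back-ticks
where three were intended) is easy to scan for before the site-book is
built.  The following is only a sketch in Python, not an existing Metron
tool; the function name long_fences is made up, and treating any fence of
four or more back-ticks as suspect is an assumption based on his
description.

```python
import re

# A line that opens or closes a fence with four or more back-ticks.
LONG_FENCE = re.compile(r"^\s*`{4,}")


def long_fences(markdown_text):
    """Return the 1-based line numbers of fences made of 4+ back-ticks."""
    hits = []
    for number, line in enumerate(markdown_text.splitlines(), start=1):
        if LONG_FENCE.match(line):
            hits.append(number)
    return hits
```

Pointing it at the text of metron-platform/metron-data-management/README.md
would report the lines to fix; an empty list means no over-long fences were
found.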


On Tue, Feb 14, 2017 at 22:55 Matt Foley <ma...@apache.org> wrote:

> I just found something in the docs.  I noticed the formatting was messed
> up in the generated html for file
> metron-platform/metron-data-management/README.md.
> This is caused by the use of quadruple back-ticks instead of the
> correct triple back-ticks to delimit codeblocks.  Doxia-markdown doesn’t
> like this at all, even though Github-MD for some reason doesn’t have a
> problem.  This in turn interrupted the re-write process on this file, so
> all the markdown dialect issues were unfixed and the bullets got munched
> into paragraphs, etc.  I have documented this in
> https://issues.apache.org/jira/browse/METRON-719
>
> I had previously fixed this problem in this file, but a few instances
> snuck back in during a later edit.  Guilty parties have been informed
> privately :-)
>
> Because this is only a docs issue, I don’t feel it is sufficient to force
> a start-over on the vote.  However, if no one objects I would like Casey to
> substitute the correctly formatted file into the site-book at
> https://dist.apache.org/repos/dist/dev/incubator/metron/0.3.1-RC4-incubating/book-site/metron-platform/metron-data-management/index.html
> We could then document it as a known issue in 0.3.1, and I will submit the
> patch for integration immediately after 0.3.1.
>
> Is that acceptable?
> Thanks,
> --Matt
>
> On 2/13/17, 4:53 PM, "Matt Foley" <mfo...@hortonworks.com> wrote:
>
> +1
>
> Compared contents of release tarball
> https://dist.apache.org/repos/dist/dev/incubator/metron/0.3.1-RC4-incubating/apache-metron-0.3.1-rc4-incubating.tar.gz
> with contents of git tag apache-metron-0.3.1-rc4-incubating.  They match.
>
> Confirmed build and full unit test.
> Build Mpack
> Build RPMs
>
> Install on single-node CentOS7 VM, with Ambari-2.4.2.0 and HDP-2.5.3.0
> stack (with changes from METRON-609 as known needed for single-node
> deployment, especially reduced elasticsearch.master.yml)
>
> Ran bro data through the system and observed proportional emits from
> parser, enrichment, and indexing topologies.
> Did not validate indexing due to human error during installation.
>
> --Matt
>
> On 2/10/17, 12:22 PM, "Casey Stella" <ceste...@gmail.com> wrote:
>
> This is a call to vote on releasing Apache Metron 0.3.1-RC4 incubating
>
> Full list of changes in this release:
> https://dist.apache.org/repos/dist/dev/incubator/metron/0.3.1-RC4-incubating/CHANGES
>
> The tag/commit to be voted upon is apache-metron-0.3.1-rc4-incubating:
> https://git-wip-us.apache.org/repos/asf?p=incubator-metron.git;a=shortlog;h=refs/tags/apache-metron-0.3.1-rc4-incubating
>
> The source archive being voted upon can be found here:
> https://dist.apache.org/repos/dist/dev/incubator/metron/0.3.1-RC4-incubating/apache-metron-0.3.1-rc4-incubating.tar.gz
>
> Other release files, signatures and digests can be found here:
> https://dist.apache.org/repos/dist/dev/incubator/metron/0.3.1-RC4-incubating/
>
> The release artifacts are signed with the following key:
> https://git-wip-us.apache.org/repos/asf?p=incubator-metron.git;a=blob;f=KEYS;h=8381e96d64c249a0c1b489bc0c234d9c260ba55e;hb=refs/tags/apache-metron-0.3.1-rc4-incubating
>
> The book associated with this RC is located at
> https://dist.apache.org/repos/dist/dev/incubator/metron/0.3.1-RC4-incubating/book-site/index.html
>
> Please vote on releasing this package as Apache Metron 0.3.1-RC4 incubating
>
> When voting, please list the actions taken to verify the release.
>
> Recommended build validation and verification instructions are posted here:
> https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds
>
>
> This vote will be open for at least 72 hours.
>
> [ ] +1 Release this package as Apache Metron 0.3.1-RC4 incubating
>
> [ ]  0 No opinion
>
> [ ] -1 Do not release this package because...
>
>
>
>
>
>


Re: Site-Book

2017-02-13 Thread Casey Stella
Yes, definitely.

On Mon, Feb 13, 2017 at 09:01 Otto Fowler wrote:

> Should Site-Book have a README.md describing the contents, how to build
> etc?
>

