Re: [DISCUSS] MPack components that don't support Kerberos

2017-04-13 Thread Kyle Richardson
Just to clarify: we're saying that Kerberos support is required when adding a service 
to the MPack... but we're not saying that installing Metron requires a kerberized 
cluster, correct? I think we should support Kerberos but allow installation and use 
of Metron without it (for testing or other reasons determined by the user).

-Kyle

> On Apr 13, 2017, at 12:55 PM, Otto Fowler  wrote:
> 
> My thought was that if it was a requirement, or blocker for contribution we
> would want
> to provide something to help.
> 
> I am not sure everyone will have a kerberos cluster to test with.  Maybe
> they will.
> Maybe the answer is Docker or Vagrant as Casey suggests and not integration
> testing.
> 
> 
> 
> On April 13, 2017 at 12:02:08, James Sirota (jsir...@apache.org) wrote:
> 
> Hi Guys,
> 
> I don't like further bloating our integration tests so I am not sure I like
> the idea. I think when people add new services they should test on real
> clusters using kerberos. Also, community members can take on end-to-end
> testing in advance of a release on a cluster using kerberos. But I think
> adding this to our integration test framework is just too much.
> 
> 13.04.2017, 08:12, "Casey Stella" :
>> I honestly don't know if we can mock out a KDC for integration tests. If
>> we did move the integration tests to running against docker, that might
> be
>> an option as we could dockerize a KDC as well.
>> 
>> Long story short, "probably, but not for free. ;)"
>> 
>> On Thu, Apr 13, 2017 at 10:41 AM, Otto Fowler 
>> wrote:
>> 
>>> Can we test kerberized support in integration?
>>> 
>>> On April 13, 2017 at 10:24:43, Casey Stella (ceste...@gmail.com) wrote:
>>> 
>>> Agreed, +1
>>> 
>>> On Thu, Apr 13, 2017 at 10:14 AM, Otto Fowler 
>>> wrote:
>>> 
 This should be in the dev guide and pr template
 
 On April 13, 2017 at 09:43:48, Casey Stella (ceste...@gmail.com)
> wrote:
 
 Based on my understanding, we have a few axioms that we're working
> from:
 
 - The installer should install a complete and workable product (i.e.
 after install, everything should work). After all, that has to be the
 sensible definition of 'working' for an installer
 - Metron should support running in a Kerberized environment
 
 If we are going to support kerberos and the installer is going to
> install
 the product, then I would consider lack of kerberos support for a
 component
 to block inclusion into the mpack.
 
 Casey
 
 On Thu, Apr 13, 2017 at 9:29 AM, Ryan Merriman 
 wrote:
 
> There is a PR up for review (
> https://github.com/apache/incubator-metron/pull/518) that updates
> our
> MPack
> to support a Kerberized environment. There is also a PR up for
> review
 that
> adds the REST service to the MPack (
> https://github.com/apache/incubator-metron/pull/500).
> 
> However, the REST application currently does not work in a
> kerberized
> environment. That work has already started so it won't be an issue
> for
> long but how should we handle situations like this in the future
> where
 we
> want to add a service but it's not quite ready for Kerberos? Should
> Kerberos support be a prerequisite before it's added to the MPack?
 Should
> we look at ways to make these services optional? Any other thoughts
> or
> ideas?
> 
> Ryan
> 
> 
> ---
> Thank you,
> 
> James Sirota
> PPMC- Apache Metron (Incubating)
> jsirota AT apache DOT org
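
For reference, the "dockerize a KDC" option Casey mentions above needs nothing beyond
the stock MIT Kerberos tooling. A rough sketch of the setup, run inside a throwaway
CentOS container used only for tests (realm, principal names, and passwords are
placeholders, and a matching /etc/krb5.conf is assumed):

# install the MIT Kerberos server and client tools
yum install -y krb5-server krb5-workstation
# create the KDC database for a disposable test realm
kdb5_util create -s -r EXAMPLE.COM -P changeme
# add an admin principal plus a service principal and keytab for the test cluster
kadmin.local -q "addprinc -pw changeme admin/admin@EXAMPLE.COM"
kadmin.local -q "addprinc -randkey metron/node1@EXAMPLE.COM"
kadmin.local -q "ktadd -k /etc/metron.keytab metron/node1@EXAMPLE.COM"
# run the KDC in the foreground so the container stays up
krb5kdc -n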


[DISCUSS] New Stellar Functions

2017-04-09 Thread Kyle Richardson
I need a new Stellar function to perform string concatenation. I have it
implemented, but I'm curious where new functions should live given the new
capabilities around third-party Stellar function libraries.

So, I guess my question is, should this function live in:
1) metron-common with the other string functions
2) another metron project
3) as a standalone project and not part of the metron source tree

While I'm specifically asking about this case, I think it's also worthwhile
to think about where other new functions should live in the long term.

Thanks!

-Kyle
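
For illustration, below is a minimal sketch of what an annotation-based Stellar
function for this could look like. The class, namespace, and function names are made
up for the example, and the package of the @Stellar annotation and BaseStellarFunction
has moved between Metron versions, so treat the imports as assumptions rather than the
actual implementation referenced above:

import java.util.List;
import java.util.stream.Collectors;
import org.apache.metron.common.dsl.BaseStellarFunction;
import org.apache.metron.common.dsl.Stellar;

// Hypothetical string-concatenation function; names and package are illustrative only.
@Stellar(namespace = "STRING",
         name = "JOIN",
         description = "Concatenates the string form of all arguments",
         params = {"args - the values to concatenate"},
         returns = "A single concatenated string")
public class StringJoin extends BaseStellarFunction {
  @Override
  public Object apply(List<Object> args) {
    // Skip nulls rather than rendering them as the literal string "null"
    return args.stream()
               .filter(a -> a != null)
               .map(Object::toString)
               .collect(Collectors.joining());
  }
}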


Stellar field transformations cannot use input fields with dashes

2017-04-09 Thread Kyle Richardson
So, I learned something the hard way today and thought I would share. I'm
sure most of you already knew this but here goes.

It turns out that, while you can use message field names with dashes in
them (e.g. cs-host), you cannot perform Stellar operations with them. In my
case, this was a field transformation but I'm assuming it would apply to
other uses of Stellar as well. Looking at it after the fact, it makes a lot
more sense, but it took me a while to realize that the dashes in the field
name were being treated as a minus in Stellar, thus returning zero for the
string operations I was trying to perform.

Example field transform config:
"fieldTransformations": [
  {
"transformation": "STELLAR",
"output": ["proto"],
"config": {
  "proto": "TO_UPPER(cs-uri-scheme)"
}
  }
]

Example message:
{
"cs-host": "crl.microsoft.com",
"cs-uri-scheme": "http",
"s-action": "TCP_HIT",
"timestamp": 1491759661030,
"proto": "0"
...
}

My solution: change the field names to not contain dashes, and everything
works as expected :).
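
For example, with the incoming fields renamed to use underscores before the
transformation runs, the same transform behaves as intended (the underscore names are
just illustrative renames of the fields above):

"fieldTransformations": [
  {
    "transformation": "STELLAR",
    "output": ["proto"],
    "config": {
      "proto": "TO_UPPER(cs_uri_scheme)"
    }
  }
]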

Lesson learned: read the docs carefully. It clearly states in the README
that '-' is a reserved keyword. Sharing this to save someone else like me a
little time.

-Kyle


Re: [VOTE] Final Board Resolution Draft V2

2017-03-22 Thread Kyle Richardson
+1 (binding)

On Mon, Mar 20, 2017 at 3:05 AM, James Sirota  wrote:

>
> - Removed affiliations
> - Added apache IDs where possible
> - Removed committers and only left PPMC members
>
> Hope this version holds up.  Please vote +1, -1, or 0 for neutral.  The
> vote will be open for 72 hours
>
>
> The incubating Apache Metron community believes it is time to graduate to
> TLP.
>
> Apache Metron entered incubation in December of 2015. Since then, we've
> overcome technical challenges to remove Category X dependencies, and made 3
> releases. Our most recent release contains binary convenience artifacts. We
> are a very helpful and engaged community, ready to answer all questions and
> feedback directed to us via the user list. Through our time in incubation
> we've added a number of committers and promoted some of them to PPMC
> membership. We are actively pursuing others. While we do still have issues
> to address raised by means of the maturity model, all projects are ongoing
> processes, and we believe we no longer need the incubator to continue
> addressing these issues.
>
> To inform the discussion, here is some basic project information:
>
> Project status:
>   http://incubator.apache.org/projects/metron.html
>
> Project website:
>   https://metron.incubator.apache.org/
>
> Project documentation:
>https://cwiki.apache.org/confluence/display/METRON/Documentation
>
> Maturity assessment:
>https://cwiki.apache.org/confluence/display/METRON/
> Apache+Project+Maturity+Model
>
> DRAFT of the board resolution is at the bottom of this email
>
> Proposed PMC size: 25 members
>
> Total number of committers: 6 members
>
>
> 516 commits on develop
> 34 contributors across all branches
>
> dev list averaged ~650 msgs/month for the last 3 months
>
>
> Resolution:
>
> Establish the Apache Metron Project
>
> WHEREAS, the Board of Directors deems it to be in the best
> interests of the Foundation and consistent with the
> Foundation's purpose to establish a Project Management
> Committee charged with the creation and maintenance of
> open-source software, for distribution at no charge to the
> public, related to a security analytics platform for big data use cases.
>
> NOW, THEREFORE, BE IT RESOLVED, that a Project Management
> Committee (PMC), to be known as the "Apache Metron Project",
> be and hereby is established pursuant to Bylaws of the
> Foundation; and be it further
>
> RESOLVED, that the Apache Metron Project be and hereby is
> responsible for the creation and maintenance of software
> related to:
> (a) A mechanism to capture, store, and normalize any type of security
> telemetry at extremely high rates.
> (b) Real time processing and application of enrichments
> (c) Efficient information storage
> (d) An interface that gives a security investigator a centralized view of
> data and alerts passed through the system.
>
> RESOLVED, that the office of "Vice President, Apache Metron" be
> and hereby is created, the person holding such office to
> serve at the direction of the Board of Directors as the chair
> of the Apache Metron Project, and to have primary responsibility
> for management of the projects within the scope of
> responsibility of the Apache Metron Project; and be it further
>
> RESOLVED, that the persons listed immediately below be and
> hereby are appointed to serve as the initial members of the
> Apache Metron Project:
>
>
> PPMC:
> Mark Bittmann (mbittmann)
> Sheetal Dolas (sheetal_dolas)
> Debo Dutta (ddutta)
> Discovery Gerdes (discovery)
> Andrew Hartnett (dev_warlord)
> Dave Hirko (dbhirko)
> Paul Kehrer (reaperhulk)
> Brad Kolarov (bjkolly)
> Kiran Komaravolu (UKNOWN)
> Larry McCay (lmccay)
> P. Taylor Goetz (ptgoetz)
> Ryan Merriman (rmerriman)
> Michael Perez (mperez)
> Charles Porter (cporter)
> Phillip Rhodes (prhodes)
> Sean Schulte (sirsean)
> James Sirota (jsirota)
> Casey Stella (cstella)
> Bryan Taylor (UKNOWN)
> Ray Urciuoli(UKNOWN)
> Vinod Kumar Vavilapalli (vinodkv)
> George Vetticaden (gvetticaden)
> Oskar Zabik (smogg)
> David Lyle (lyle)
> Nick Allen (nickallen)
>
>
>
> NOW, THEREFORE, BE IT FURTHER RESOLVED, that Casey Stella
> be appointed to the office of Vice President, Apache Metron, to
> serve in accordance with and subject to the direction of the
> Board of Directors and the Bylaws of the Foundation until
> death, resignation, retirement, removal or disqualification,
> or until a successor is appointed; and be it further
>
> RESOLVED, that the initial Apache Metron PMC be and hereby is
> tasked with the creation of a set of bylaws intended to
> encourage open development and increased participation in the
> Apache Metron Project; and be it further
>
> RESOLVED, that the Apache Metron Project be and hereby
> is tasked with the migration and rationalization of the Apache
> Incubator Metron podling; and be it further
>
> RESOLVED, that all responsibilities pertaining to the Apache
> Incubator Metron podling encumbered upon the Apache Incubator
> 

Re: [DISCUSS] Stepping down as release manager

2017-03-22 Thread Kyle Richardson
+1 for Matt

On Wed, Mar 22, 2017 at 10:23 AM, Justin Leet  wrote:

> Right now it's just support, not a vote.  I assume, based on our past
> practices, that there will be a separate [VOTE] thread.
>
> Justin
>
> On Wed, Mar 22, 2017 at 10:06 AM, Otto Fowler 
> wrote:
>
> > +1 but is this explicitly an official vote?
> >
> >
> > On March 21, 2017 at 13:51:16, Justin Leet (justinjl...@gmail.com)
> wrote:
> >
> > +1 for Matt
> >
> > On Tue, Mar 21, 2017 at 12:21 PM, zeo...@gmail.com 
> > wrote:
> >
> > > +1 for mattf
> > >
> > > On Tue, Mar 21, 2017 at 11:04 AM Ryan Merriman 
> > > wrote:
> > >
> > > > +1 for Matt
> > > >
> > > > On Tue, Mar 21, 2017 at 9:44 AM, Matt Foley 
> wrote:
> > > >
> > > > > Casey, you’ve been a great release manager. I know how much detail
> > > > effort
> > > > > goes into this role.
> > > > >
> > > > > I am willing to serve as RM for the next while, if the community
> > would
> > > > > like. I was the RM for Hadoop for about a year, and in fact was RM
> > for
> > > > its
> > > > > 1.0 release. Granted that was a while ago, but overall process
> > doesn’t
> > > > seem
> > > > > to have changed much :-)
> > > > >
> > > > > Cheers,
> > > > > --Matt
> > > > >
> > > > > On 3/21/17, 7:32 AM, "Casey Stella"  wrote:
> > > > >
> > > > > Right, Billie is exactly right. Working with the community to
> > > > > constructing
> > > > > releases that conform to apache standards and policies is the main
> > > > > duty.
> > > > > This will (hopefully) be our first set of releases outside of the
> > > > > incubator, so if I'm allowed to be biased, I'm hoping that someone
> > > > with
> > > > > previous release management experience in other projects will
> > > > > volunteer.
> > > > > We're leaving the nest a bit and having an experienced hand at the
> > > > > tiller
> > > > > would be advantageous.
> > > > >
> > > > >
> > > > > On Tue, Mar 21, 2017 at 10:21 AM, Billie Rinaldi <
> > > bil...@apache.org>
> > > > > wrote:
> > > > >
> > > > > > See http://www.apache.org/dev/release-publishing#release_manager
> > > > and
> > > > > > http://www.apache.org/legal/release-policy.html for information
> > > on
> > > > > the
> > > > > > tasks that a release manager performs.
> > > > > >
> > > > > > On Tue, Mar 21, 2017 at 7:10 AM, Khurram Ahmed <
> > > > > khurramah...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Casey it would be helpful if you could outline the
> > > > > responsibilities of a
> > > > > > > release manager for the Metron project.
> > > > > > >
> > > > > > > On Mar 21, 2017 6:57 PM, "Casey Stella" 
> > > > > wrote:
> > > > > > >
> > > > > > > > I've been extremely honored to spend the last few months as
> > > the
> > > > > Metron
> > > > > > > > Release Manager. That being said, my watch is ended and it's
> > > > > time for
> > > > > > > > another release manager to step into my place.
> > > > > > > >
> > > > > > > > Who would like to volunteer to be release manager for the
> > > next
> > > > > release
> > > > > > of
> > > > > > > > Metron?
> > > > > > > >
> > > > > > > > Best,
> > > > > > > >
> > > > > > > > Casey
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > >
> > > --
> > >
> > > Jon
> > >
> >
> >
>


Re: [ANNOUNCE] Apache Metron (incubating) 0.3.1 is released

2017-03-17 Thread Kyle Richardson
+1 to adding links to the website and downloads page from the top-level
README in git.

-Kyle

On Fri, Mar 17, 2017 at 3:16 PM, Justin Leet  wrote:

> Why don't we just add a link to the downloads page in the main README.md?
> Right now the "Obtaining Metron" section only talks about the code on
> GitHub.
>
> In fact, it really should link to the main site, and it doesn't look
> like it does.  We should make it obvious where things are from the GitHub
> landing page.
>
> On Fri, Mar 17, 2017 at 2:20 PM, Billie Rinaldi  wrote:
>
> > On Fri, Mar 17, 2017 at 9:54 AM, Nick Allen  wrote:
> >
> > > Would it make sense to make the 0.3.1 release visible from the Github
> > > releases page like the other RCs?  I've seen more than a few users go
> to
> > > that page thinking those are the available releases.
> > >
> >
> > People should always be directed to the ASF downloads page to download
> > releases.
> >
> >
> > >
> > > https://github.com/apache/incubator-metron/releases
> > >
> > > On Fri, Mar 17, 2017 at 11:18 AM, Casey Stella 
> > wrote:
> > >
> > > > I am very proud to announce that the 0.3.1 release bits have been
> > > > released.  You can see this reflected on our website at
> > > > http://metron.apache.org/documentation/#releases  Also, I want to
> > point
> > > > out
> > > > that our github documentation for the release is currently located at
> > > > http://metron.apache.org/current-book/index.html and linked from the
> > > > release page (Thanks Matt for making that happen!).
> > > >
> > > > I'm particularly proud of this release as it'll be the release on
> which
> > > we
> > > > base our exit from the incubator.  I really appreciate all of the
> > > > contributions that everyone made to make this possible.  Heartfelt
> > > > gratitude goes out to the community, the committers, the contributors
> > and
> > > > the mentors for making this happen.  In the best tradition of open
> > source
> > > > software, it took a village to build a Metron. :)
> > > >
> > > > Best,
> > > >
> > > > Casey
> > > >
> > > > PS. I still have some JIRA work to do to clean up from this release;
> > I'll
> > > > be doing that by the end of the weekend.
> > > >
> > >
> >
>


Re: [VOTE] Final Board Resolution Draft

2017-03-16 Thread Kyle Richardson
h office to
>>> serve at the direction of the Board of Directors as the chair
>>> of the Apache Metron Project, and to have primary responsibility
>>> for management of the projects within the scope of
>>> responsibility of the Apache Metron Project; and be it further
>>> 
>>> RESOLVED, that the persons listed immediately below be and
>>> hereby are appointed to serve as the initial members of the
>>> Apache Metron Project:
>>> 
>>> 
>>> PPMC:
>>> Mark Bittmann
>>> Sheetal Dolas
>>> Debo Dutta
>>> Discovery Gerdes
>>> P. Taylor Goetz
>>> Andrew Hartnett
>>> Dave Hirko
>>> Paul Kehrer
>>> Brad Kolarov
>>> Kiran Komaravolu
>>> Larry McCay
>>> Ryan Merriman
>>> Michael Perez
>>> Charles Porter
>>> Phillip Rhodes
>>> Sean Schulte
>>> James Sirota
>>> Casey Stella
>>> Bryan Taylor
>>> Ray Urciuoli
>>> Vinod Kumar Vavilapalli
>>> George Vetticaden
>>> Oskar Zabik
>>> David Lyle
>>> Nick Allen
>>> 
>>> Committers:
>>> Otto Fowler
>>> Kyle Richardson
>>> Justin Leet
>>> Michael Miklavcic
>>> Jon Zeolla
>>> Matt Foley
>>> 
>>> 
>>> NOW, THEREFORE, BE IT FURTHER RESOLVED, that Casey Stella
>>> be appointed to the office of Vice President, Apache Metron, to
>>> serve in accordance with and subject to the direction of the
>>> Board of Directors and the Bylaws of the Foundation until
>>> death, resignation, retirement, removal or disqualification,
>>> or until a successor is appointed; and be it further
>>> 
>>> RESOLVED, that the initial Apache Metron PMC be and hereby is
>>> tasked with the creation of a set of bylaws intended to
>>> encourage open development and increased participation in the
>>> Apache Metron Project; and be it further
>>> 
>>> RESOLVED, that the Apache Metron Project be and hereby
>>> is tasked with the migration and rationalization of the Apache
>>> Incubator Metron podling; and be it further
>>> 
>>> RESOLVED, that all responsibilities pertaining to the Apache
>>> Incubator Metron podling encumbered upon the Apache Incubator
>>> Project are hereafter discharged.
>>> 
>>> ---
>>> Thank you,
>>> 
>>> James Sirota
>>> PPMC- Apache Metron (Incubating)
>>> jsirota AT apache DOT org
>> 
>> --
> 
> Jon


Re: new committer: Jon Zeolla

2017-03-15 Thread Kyle Richardson
Welcome Jon and Matt! Well deserved. Glad to have the opportunity to continue 
working with you both.

-Kyle

> On Mar 14, 2017, at 11:57 PM, Otto Fowler  wrote:
> 
> Congratulations Jon
> 
> 
> On March 14, 2017 at 23:40:37, James Sirota (jsir...@apache.org) wrote:
> 
> The Podling Project Management Committee (PPMC) for Apache Metron
> (Incubating)
> has asked Jon Zeolla to become a committer and we are pleased
> to announce that they have accepted.
> 
> 
> Being a committer enables easier contribution to the
> project since there is no need to go via the patch
> submission process. This should enable better productivity.
> Being a PMC member enables assistance with the management
> and to guide the direction of the project.
> ---
> Thank you,
> 
> James Sirota
> PPMC- Apache Metron (Incubating)
> jsirota AT apache DOT org


Re: [DISCUSS] SIDELOADING PARSERS: Archetype

2017-03-14 Thread Kyle Richardson
Could the archetype be deployment-agnostic? It's probably a little
simplistic due to all the configs, but I like the NAR approach of simply
dropping a lib in and restarting services.

On Fri, Mar 10, 2017 at 10:04 AM, Otto Fowler 
wrote:

> As previously discussed here, I have been working on side loading of
> parsers.  The goals of this work are:
> * Make it possible for developers to create, maintain, and deploy parsers
> outside of the Metron code tree and not have to fork
> * Create maven archetype support for developers of parsers
> * Introduce a parser ‘lifecycle’ to support multiple instances and
> configurations, states of being installed, under configuration, and
> deployed
> etc.
>
> I would like to have some discussion based on where I am after rebasing
> onto METRON-671 which revamps deployment to be totally ambari based.
>
> Maven Parser Archetype
>
> I have an archetype for creating a metron parser module project.
> When running the archetype you get:
>
> * a parser project, with all the names etc filled out based on the
> archetype parameters
> * a rudimentary sample parser ( that needs improvement )
> * a metron-deployment ansible project that will deploy the parser
> * a script to run the deployment and deploy it to a locally running quick
> or full dev
> * It is still Monit based - no ambari integration
>
> I am not sure if this is worth landing right now, until other issues are
> sorted out:
>
> * the thinning of the jars and ‘extension’ packaging/loading
> * ambari deployment - will people be deploying their jars into ‘our’ ambari
> deployment or into their own ambari service dependent on ours?
>
> I am just not sure about having an archetype that is going to be obsoleted.
>
> What I would like to do is save the archetype for when that stuff is
> sorted.
>
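
(For anyone new to archetypes: generating a parser project from an archetype like this
would look roughly as follows. The archetype coordinates are hypothetical placeholders,
not the actual artifact IDs.)

mvn archetype:generate \
  -DarchetypeGroupId=org.apache.metron \
  -DarchetypeArtifactId=metron-parser-archetype \
  -DarchetypeVersion=0.3.1 \
  -DgroupId=com.example.parsers \
  -DartifactId=my-sensor-parser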


Re: [DISCUSS] SIDELOADING PARSERS: Packaging and Loading and Extensions [oh.my]

2017-03-14 Thread Kyle Richardson
I like the direction of using NAR. The key benefits I see (there are
others, but I picked my favorites):

1. Move away from the uber jars in Storm. Granted, this should be possible
with modifications from the new Stellar classloader. My hope is that this
would allow us to stop shading the parser project packages.
2. Allow parser development to be somewhat independent of the core code
base. I do think we'll have to address versioning and backwards
compatibility.
3. "Drop in" extensions to Metron. Future expansion beyond parsers.

Questions:
- Is there a way to do 1 using the VFS Classloader and land the archetype
as an MVP? If so, we could avoid shading as part of the archetype and maybe
iterate from there on it. Do you think this would still be too much change
to the archetype after putting it out there?
- How would we adapt NAR from NiFi without frankensteining it? I'm all for
code reuse, but ideally we'd like NiFi's and Metron's versions of NAR to
converge in the end.
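
On the first question, the commons-vfs route boils down to building a classloader over
the parser jar's location and instantiating the parser reflectively. A rough sketch,
not the project's actual loading code (the jar location and class name are placeholders):

import org.apache.commons.vfs2.FileObject;
import org.apache.commons.vfs2.FileSystemManager;
import org.apache.commons.vfs2.VFS;
import org.apache.commons.vfs2.impl.VFSClassLoader;

public class SideloadSketch {
  public static Object loadParser(String jarLocation, String parserClassName) throws Exception {
    // Resolve the parser jar (a local path, or e.g. hdfs:// if an HDFS provider is registered)
    FileSystemManager fsManager = VFS.getManager();
    FileObject parserJar = fsManager.resolveFile(jarLocation);

    // Classes in the jar are loaded through this loader, with the application
    // classloader as parent for shared/provided dependencies.
    VFSClassLoader loader = new VFSClassLoader(new FileObject[]{ parserJar },
                                               fsManager,
                                               SideloadSketch.class.getClassLoader());

    // Instantiate reflectively so the host has no compile-time dependency on the parser
    Class<?> clazz = Class.forName(parserClassName, true, loader);
    return clazz.getConstructor().newInstance();
  }
}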

On Fri, Mar 10, 2017 at 3:43 PM, Casey Stella  wrote:

> I would definitely agree that moving forward we should consider something
> like Nar for Stellar.  I'm not seeing the need for parsers exactly.
>
> I don't want to squash the forward thinking aspect here; we should be broad
> and think about the end, ideal state.  I just want to make sure we think
> through something that we can iterate on as an initial state that still
> solves your problem, MVP style.
>
> On Fri, Mar 10, 2017 at 3:39 PM, Otto Fowler 
> wrote:
>
> > "The Apache NiFi NAR ‘system’ allows for the packaging and loading of
> java
> > resources with classloader isolation.
> > Although technically it is the Service Provider api that makes the
> > ‘plugins’  part of the system, you can view them
> > together, and thus look at the NAR features as a system to create,
> > package, load, and execute plugins in a java system
> > while maintaining classloader isolation and dependency separation.
> >
> > While the NiFi problem case ( many plugins possibly executing in the same
> > vm ) is not universal, the functionality provided
> > by NAR is commonly needed, and is indeed functionality that I am
> currently
> > looking at implementing in the Apache Metron project.”
> >
> > This is how I put it to Joe.
> >
> > I think what you are proposing would work.  I think what I have done up
> > until now will pretty much work.  What I have been thinking about and
> > considering
> > is the difference between getting ‘something that works’, and maybe
> > something better.
> >
> > So if you look at nar there is the ‘packaging’ part, and the class
> loading
> > part.
> > We are already doing almost the same thing with the assembly of the
> > .tag.gz.  The Nar is a next step to this which adds more metadata and the
> > dependency repo.
> > More of a refinement than a change.
> >
> > As far as class loading, the Nar is a more refined system for deploying
> > and consuming jars and dependencies, and setting up classloader
> instances.
> > It has more
> > functionality than we need at the moment in storm, but in other services
> > where multiple parsers or plugin types may need to be loaded, it would
> make
> > more sense.
> > Rest may be that case.  Stellar may be that case too, if anyone ever
> > writes a stellar function with different dependencies than the platform.
> >
> >
> >
> > On March 10, 2017 at 14:32:00, Casey Stella (ceste...@gmail.com) wrote:
> >
> > So, my question is whether we really need nar here. We have a
> classloading
> > mechanism that will allow us to deploy just the parser logic just added
> > into master for stellar, should we be considering another one?
> >
> > I would understand using nar if we needed to have multiple nars around
> > that
> > needed isolation from one another, but in the parser topology, we get
> that
> > isolation naturally. It seems to me that, at least for a MVP, we should
> > use the existing classloader that we just added. That being said, I might
> > be missing something, so let me know your thoughts.
> >
> > Casey
> >
> > On Fri, Mar 10, 2017 at 2:18 PM, Matt Foley  wrote:
> >
> > > I like the approach. I think Nar constitutes a production-quality
> > > existing solution meeting highly similar needs to Metron’s.
> > >
> > > Just a ‘btw’ regarding Joe’s input that I transmitted:
> > > - Joe made clear that he was only giving his personal opinion, since of
> > > course no individual can speak for the community.
> > > - Joe also felt that if Metron succeeded in re-using the Nar system
> > > without having to change it too much, that that would be a good
> > supporting
> > > argument for later proposing that it become a separate child project.
> > > - Whereas if we or they tried to break it out as a separate project
> now,
> > > we would have to do all the community-building work around it, as well
> > as
> > > the technical work of adapting it for a different environment from
> 

Re: [Discuss] SIDELOADING PARSERS: Parsers as components

2017-03-14 Thread Kyle Richardson
Gotcha. Makes sense.

Unrelated, but something I've had in the back of my head for a while... I
think we should try to loosely define what roles would be using Ambari vs
Metron UI vs CLI. For example, I'm thinking there may be a difference
between the cluster admin and the Metron admin in an organization. With
that, I think the idea of using the Metron UI to load parser packages makes
a lot of sense.

On Tue, Mar 14, 2017 at 10:39 AM, Otto Fowler <ottobackwa...@gmail.com>
wrote:

> The RPM’s are only a requirement for installation by ambari.
>
> We could drop in Nars to the system location ( hdfs in the end ).
> I imagine us doing that through the UI.
>
> Nar’s *are* automatically unpacked at execution time.
>
> Nar’s are unpackaged into a ‘working’ area ( if required ), that working
> area is the class path that is
> loaded.
>
>
>
>
> On March 14, 2017 at 09:57:14, Kyle Richardson (kylerichards...@gmail.com)
> wrote:
>
> Solid work, Otto. I'm excited to see us move in this direction. It's an
> important step to making Metron a more user friendly platform.
>
> I agree with Matt. I think a PR for this piece is needed sooner rather
> than
> later so you don't continually have to rework it as we make more commits
> into master.
>
> One question/concern/future thought: is there any way we can get away from
> needing the RPMs for these extensible components? Ideally, I'd like to be
> able to dump my parser package file (jar, nar, etc.) into a directory and
> have it automagically unpacked.
>
> -Kyle
>
> On Mon, Mar 13, 2017 at 10:10 AM, Otto Fowler <ottobackwa...@gmail.com>
> wrote:
>
> > James,
> > Can you clarify your concerns about backward compatibility?
> >
> >
> >
> > On March 12, 2017 at 23:48:17, James Sirota (jsir...@apache.org) wrote:
> >
> > As long as this doesn't break backwards compatibility I think I am ok
> with
> > this approach. I think this is a great idea. We would probably need to
> > follow up with an Ambari view that can allow you to list and
> > deploy/upgrade/delete various parsers
> >
> > 10.03.2017, 15:16, "Casey Stella" <ceste...@gmail.com>:
> > > Ok, so, after some thought about this, I am in agreement over Nar. I
> do
> > > want to make sure that on the roadmap we retrofit stellar to accept
> Nar
> > > plugins and build an archetype for it. We should have a single
> strategy
> > > for plugins. NOt saying it has to be part of the same PR, but it needs
> to
> > > be associated and a follow-on task IMO.
> > >
> > > On Fri, Mar 10, 2017 at 4:06 PM, Otto Fowler <ottobackwa...@gmail.com>
>
> > > wrote:
> > >
> > >> Also a Nar can depend on ‘one’ other nar, which is interesting
> > >>
> > >> On March 10, 2017 at 16:02:18, Otto Fowler (ottobackwa...@gmail.com)
> > >> wrote:
> > >>
> > >> The isolation is just a ‘extra’ in the parser case.
> > >>
> > >> The parts of Nar that *are* more pertinent:
> > >>
> > >> * supporting a deployment artifact with just a jar, or a tar.gz with
> a
> > jar
> > >> and jar dependencies in it
> > >> * taking a ‘package’ and deploying it for loading ( which will
> upgrade
> > the
> > >> deployed part if it is newer )
> > >> * setting up the classloader hierarchy between the ‘system’ and
> > provided
> > >> things, and the dependencies of the individual plugin
> > >>
> > >> On March 10, 2017 at 15:56:08, Casey Stella (ceste...@gmail.com)
> > wrote:
> > >>
> > >> Why would we need classpath isolation here in the case of the parser?
> > >>
> > >> On Fri, Mar 10, 2017 at 3:55 PM, Otto Fowler <ottobackwa...@gmail.com>
>
> > >> wrote:
> > >>
> > >>> I *would* use the classloader part, extending it with VFS.
> > >>>
> > >>> On March 10, 2017 at 15:53:05, Casey Stella (ceste...@gmail.com)
> > wrote:
> > >>>
> > >>> I'm a bit worried about copying and pasting from the NiFi project
> > their
> > >>> nar infrastructure. That seems..unclean to me and since we're not
> > using
> > >>> the classloader part of nar for this, does it make more sense to
> just
> > use
> > >>> jar?
> > >>>
> > >>> On Fri, Mar 10, 2017 at 3:50 PM, Otto Fowler <
> ottobackwa...@gmail.com
> > >
> > >>> wrote:
> > >>>
> > >>>> Compared to ho

Re: [Discuss] SIDELOADING PARSERS: Parsers as components

2017-03-14 Thread Kyle Richardson
Solid work, Otto. I'm excited to see us move in this direction. It's an
important step to making Metron a more user friendly platform.

I agree with Matt. I think a PR for this piece is needed sooner rather than
later so you don't continually have to rework it as we make more commits
into master.

One question/concern/future thought: is there any way we can get away from
needing the RPMs for these extensible components? Ideally, I'd like to be
able to dump my parser package file (jar, nar, etc.) into a directory and
have it automagically unpacked.

-Kyle

On Mon, Mar 13, 2017 at 10:10 AM, Otto Fowler 
wrote:

> James,
> Can you clarify your concerns about backward compatibility?
>
>
>
> On March 12, 2017 at 23:48:17, James Sirota (jsir...@apache.org) wrote:
>
> As long as this doesn't break backwards compatibility I think I am ok with
> this approach. I think this is a great idea. We would probably need to
> follow up with an Ambari view that can allow you to list and
> deploy/upgrade/delete various parsers
>
> 10.03.2017, 15:16, "Casey Stella" :
> > Ok, so, after some thought about this, I am in agreement over Nar. I do
> > want to make sure that on the roadmap we retrofit stellar to accept Nar
> > plugins and build an archetype for it. We should have a single strategy
> > for plugins. NOt saying it has to be part of the same PR, but it needs to
> > be associated and a follow-on task IMO.
> >
> > On Fri, Mar 10, 2017 at 4:06 PM, Otto Fowler 
> > wrote:
> >
> >>  Also a Nar can depend on ‘one’ other nar, which is interesting
> >>
> >>  On March 10, 2017 at 16:02:18, Otto Fowler (ottobackwa...@gmail.com)
> >>  wrote:
> >>
> >>  The isolation is just a ‘extra’ in the parser case.
> >>
> >>  The parts of Nar that *are* more pertinent:
> >>
> >>  * supporting a deployment artifact with just a jar, or a tar.gz with a
> jar
> >>  and jar dependencies in it
> >>  * taking a ‘package’ and deploying it for loading ( which will upgrade
> the
> >>  deployed part if it is newer )
> >>  * setting up the classloader hierarchy between the ‘system’ and
> provided
> >>  things, and the dependencies of the individual plugin
> >>
> >>  On March 10, 2017 at 15:56:08, Casey Stella (ceste...@gmail.com)
> wrote:
> >>
> >>  Why would we need classpath isolation here in the case of the parser?
> >>
> >>  On Fri, Mar 10, 2017 at 3:55 PM, Otto Fowler 
> >>  wrote:
> >>
> >>>  I *would* use the classloader part, extending it with VFS.
> >>>
> >>>  On March 10, 2017 at 15:53:05, Casey Stella (ceste...@gmail.com)
> wrote:
> >>>
> >>>  I'm a bit worried about copying and pasting from the NiFi project
> their
> >>>  nar infrastructure. That seems..unclean to me and since we're not
> using
> >>>  the classloader part of nar for this, does it make more sense to just
> use
> >>>  jar?
> >>>
> >>>  On Fri, Mar 10, 2017 at 3:50 PM, Otto Fowler  >
> >>>  wrote:
> >>>
>   Compared to how much time vagrant up takes now, you won’t even notice
> it
>   ;)
> 
>   That is definitely an option. I guess what I want to work out is if
> we
>   are going to want to
>   go to NAR, why not just go to NAR.
> 
>   In the end, the customer for this - like Jon Zeolla, isn’t going to
> care
>   about the intermediate step,
>   he wants the archetype that builds the ‘metron parser plugin’.
> 
>   Which is why I hesitate to put out an archetype that is going to
>   obsolete so soon.
> 
>   Does that make sense?
> 
>   On March 10, 2017 at 14:50:55, Casey Stella (ceste...@gmail.com)
> wrote:
> 
>   I'm a little concerned about this increasing the size and length of
> the
>   build due to the repeated shading. Should we figure out a way to
> deploy
>   jars with provided dependencies on metron-parser-common as suggested
> in
>   the
>   previous JIRAs first?
> 
>   On Fri, Mar 10, 2017 at 2:31 PM, Matt Foley 
>   wrote:
> 
>   > It sounds like:
>   > - This is a self-contained chunk of work, that can be tested,
> reviewed,
>   > and committed on its own, then the other ideas you propose can
> follow
>   it.
>   > - It crosses a lot of lines, and restructures a lot of code, so
> will
>   “rot”
>   > fairly quickly as other people make commits, so if possible you
> should
>   get
>   > a PR out there and we should work through it as soon as possible.
>   > Are those both true?
>   >
>   > How do other people feel about grouping a given sensor’s parser,
>   enricher,
>   > indexing logic all together? It seems to have multiple advantages
> are
>   > there also disadvantages?
>   >
>   > On 3/10/17, 6:31 AM, "Otto Fowler" 
> wrote:
>   >
>   > As previously discussed here, I have been working on side loading
> of
>   > 

Re: [VOTE] Cesey Stella for Metron VP

2017-03-14 Thread Kyle Richardson
+1 (binding)

On Mon, Mar 13, 2017 at 11:56 PM, Anand Subramanian <
asubraman...@hortonworks.com> wrote:

> +1 (non-binding)
>
>
>
>
> On 3/14/17, 4:03 AM, "James Sirota"  wrote:
>
> >This vote is to make Casey Stella our VP after graduation
> >
> >---
> >Thank you,
> >
> >James Sirota
> >PPMC- Apache Metron (Incubating)
> >jsirota AT apache DOT org
> >
>


Re: METRON-646 commit attribution

2017-02-27 Thread Kyle Richardson
Just to confirm. Please reply +1 if you're okay for me to commit the revert
/ re-commit of METRON-646 (PR#441).

Thanks,
Kyle

On Mon, Feb 27, 2017 at 9:25 PM, Casey Stella <ceste...@gmail.com> wrote:

> Yeah, I'd think there does not exist a time in which the email should be
> null.  I'd rather just error out if you can't find it and the committer
> doesn't put one in.
>
> I do agree that the most sensible way to pull the commit name is to pull it
> from the repo's commit history.  Here's a 1-liner to use if you dont' feel
> like coming up with it yourself
>
> git clone https://github.com/kylerichardson/incubator-metron.git --depth=1
> --branch METRON-646 --single-branch METRON-646 >& /dev/null && cd
> METRON-646 && (git log | grep Author | awk -F: '{print $2}' | sed 's/^
> //g') && cd ..
>
>
>
> On Mon, Feb 27, 2017 at 9:15 PM, Nick Allen <n...@nickallen.org> wrote:
>
> > Sure.  It could validate the email address before letting you proceed.
> >
> > It tries to get the email from the author's Github profile.  If the
> author
> > doesn't make one public, it will come back as 'null' and prompt you to
> > change it.  Of course, it will just use 'null' if you don't provide an
> > alternative.
> >
> > Most of the time I have to enter the email address manually because not
> > many people make their email public. Even better would be to pull the
> email
> > from the author's own commits in the PR.  That would reduce how often we
> > have to manually input an email.
> >
> >
> > On Mon, Feb 27, 2017 at 8:53 PM, Casey Stella <ceste...@gmail.com>
> wrote:
> >
> > > Nick, what are your thoughts on adjusting the script to error out or
> > prompt
> > > for an email address if one can't be found?
> > >
> > > On Mon, Feb 27, 2017 at 8:51 PM, Nick Allen <n...@nickallen.org>
> wrote:
> > >
> > > > I think revert and commit again is the best way to go.  Not a big
> deal.
> > > >
> > > > On Mon, Feb 27, 2017 at 6:55 PM, Casey Stella <ceste...@gmail.com>
> > > wrote:
> > > >
> > > > > I think it should be changed, but I'm not sure how to change it. I
> > > think
> > > > it
> > > > > should be changed because our git history is our legal trail of
> > > > > attribution.  Mucking with it is relatively serious business.
> > > > >
> > > > > As to how, normally I'd say git commit --amend --author
> > > "kylerichardson <
> > > > > kylerichards...@gmail.com>" if we act before the next commit and a
> > git
> > > > > rebase otherwise, but it's pushed and rewriting history for a
> push'd
> > > > commit
> > > > > has consequences.  Not the least of which the scary force'd push.
> > The
> > > > > challenge here is that all forked repos during this period between
> > the
> > > > > wrong commit and the correction commit will be based on a dead
> > > branch.  I
> > > > > guess I would vote for 1, the revert and then the re-commit.
> > > > >
> > > > > I'd like to understand a bit more about how this happened.  Ryan,
> can
> > > you
> > > > > walk it through how you did the commit so we can avoid it in the
> > > future?
> > > > >
> > > > > Casey
> > > > >
> > > > >
> > > > > On Mon, Feb 27, 2017 at 4:04 PM, Kyle Richardson <
> > > > > kylerichards...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Ok, so here's the story... Ryan was nice enough to commit my
> recent
> > > PR
> > > > > and
> > > > > > for whatever reason my github username but not my email address
> > > appears
> > > > > in
> > > > > > the commit author (see below).
> > > > > >
> > > > > > commit 41fc0ddc9881d9cfdd8bae129c0bb7800a116d4c
> > > > > > Author: kylerichardson 
> > > > > > Date:   Mon Feb 27 11:38:55 2017 -0600
> > > > > >
> > > > > > METRON-646 Add index templates to metron-docker
> (kylerichardson
> > > via
> > > > > > merrimanr) closes apache/incubator-metron#441
> > > > > >
> > > > > > My question is can it be left as is or does it need to include
> the
> > > > email
> > > > > > address per apache?
> > > > > >
> > > > > > If it needs to be changed, what are the acceptable options?
> > > > > >
> > > > > > (1) commit a revert and re-commit; maintains a record of
> everything
> > > > > > (2) rebase one back, update, and force a push; like it never
> > happened
> > > > > > (3) another option I haven't considered?
> > > > > >
> > > > > > -Kyle
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: METRON-646 commit attribution

2017-02-27 Thread Kyle Richardson
Sounds like we're in agreement. I did a quick dry run and agree with Nick,
easy fix. I'm ready to push but want to be sure everyone's good with the
history. Here's what it will look like:

commit 8b4fa79d19be992ad270df7d1780db7f4b8ce176
Author: kylerichardson <kylerichards...@gmail.com>
Date:   Mon Feb 27 21:05:46 2017 -0500

METRON-646 Add index templates to metron-docker (kylerichardson) closes
apache/incubator-metron#441

commit 1e1b658b76f5c959bcebdf824372eea0d680d3f0
Author: Kyle Richardson <kylerichard...@apache.org>
Date:   Mon Feb 27 20:57:30 2017 -0500

Revert "METRON-646 Add index templates to metron-docker (kylerichardson
via merrimanr) closes apache/incubator-metron#441"

This reverts commit 41fc0ddc9881d9cfdd8bae129c0bb7800a116d4c. Change
will be re-commited with full author details.

commit b34037202775f78733cd21cc0fb4159991f29cd3
Author: nickwallen <n...@nickallen.org>
Date:   Mon Feb 27 18:02:36 2017 -0500

METRON-686 Record Rule Set that Fired During Threat Triage (nickwallen)
closes apache/incubator-metron#438

commit 41fc0ddc9881d9cfdd8bae129c0bb7800a116d4c
Author: kylerichardson 
Date:   Mon Feb 27 11:38:55 2017 -0600

METRON-646 Add index templates to metron-docker (kylerichardson via
merrimanr) closes apache/incubator-metron#441

commit 164d602d87ee6203514b27897f68bfa9aa1f0377
Author: cstella <ceste...@gmail.com>
Date:   Mon Feb 27 09:13:15 2017 -0500

METRON-742: Generated code for profile window selector DSL did not get
committed as part of METRON-690 closes apache/incubator-metron#466

-Kyle

On Mon, Feb 27, 2017 at 8:53 PM, Casey Stella <ceste...@gmail.com> wrote:

> Nick, what are your thoughts on adjusting the script to error out or prompt
> for an email address if one can't be found?
>
> On Mon, Feb 27, 2017 at 8:51 PM, Nick Allen <n...@nickallen.org> wrote:
>
> > I think revert and commit again is the best way to go.  Not a big deal.
> >
> > On Mon, Feb 27, 2017 at 6:55 PM, Casey Stella <ceste...@gmail.com>
> wrote:
> >
> > > I think it should be changed, but I'm not sure how to change it. I
> think
> > it
> > > should be changed because our git history is our legal trail of
> > > attribution.  Mucking with it is relatively serious business.
> > >
> > > As to how, normally I'd say git commit --amend --author
> "kylerichardson <
> > > kylerichards...@gmail.com>" if we act before the next commit and a git
> > > rebase otherwise, but it's pushed and rewriting history for a push'd
> > commit
> > > has consequences.  Not the least of which the scary force'd push.  The
> > > challenge here is that all forked repos during this period between the
> > > wrong commit and the correction commit will be based on a dead
> branch.  I
> > > guess I would vote for 1, the revert and then the re-commit.
> > >
> > > I'd like to understand a bit more about how this happened.  Ryan, can
> you
> > > walk it through how you did the commit so we can avoid it in the
> future?
> > >
> > > Casey
> > >
> > >
> > > On Mon, Feb 27, 2017 at 4:04 PM, Kyle Richardson <
> > > kylerichards...@gmail.com>
> > > wrote:
> > >
> > > > Ok, so here's the story... Ryan was nice enough to commit my recent
> PR
> > > and
> > > > for whatever reason my github username but not my email address
> appears
> > > in
> > > > the commit author (see below).
> > > >
> > > > commit 41fc0ddc9881d9cfdd8bae129c0bb7800a116d4c
> > > > Author: kylerichardson 
> > > > Date:   Mon Feb 27 11:38:55 2017 -0600
> > > >
> > > > METRON-646 Add index templates to metron-docker (kylerichardson
> via
> > > > merrimanr) closes apache/incubator-metron#441
> > > >
> > > > My question is can it be left as is or does it need to include the
> > email
> > > > address per apache?
> > > >
> > > > If it needs to be changed, what are the acceptable options?
> > > >
> > > > (1) commit a revert and re-commit; maintains a record of everything
> > > > (2) rebase one back, update, and force a push; like it never happened
> > > > (3) another option I haven't considered?
> > > >
> > > > -Kyle
> > > >
> > >
> >
>


METRON-646 commit attribution

2017-02-27 Thread Kyle Richardson
Ok, so here's the story... Ryan was nice enough to commit my recent PR and
for whatever reason my github username but not my email address appears in
the commit author (see below).

commit 41fc0ddc9881d9cfdd8bae129c0bb7800a116d4c
Author: kylerichardson 
Date:   Mon Feb 27 11:38:55 2017 -0600

METRON-646 Add index templates to metron-docker (kylerichardson via
merrimanr) closes apache/incubator-metron#441

My question is: can it be left as-is, or does it need to include the email
address per Apache?

If it needs to be changed, what are the acceptable options?

(1) commit a revert and re-commit; maintains a record of everything
(2) rebase one back, update, and force a push; like it never happened
(3) another option I haven't considered?

-Kyle
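
For reference, option (1) works out to something like the following; a sketch rather
than the exact commands, with the author email left as a placeholder:

# revert the mis-attributed commit, keeping it in history
git revert 41fc0ddc9881d9cfdd8bae129c0bb7800a116d4c
# re-apply the same change without committing, then commit with full author details
git cherry-pick -n 41fc0ddc9881d9cfdd8bae129c0bb7800a116d4c
git commit --author="kylerichardson <email address>" \
  -m "METRON-646 Add index templates to metron-docker (kylerichardson) closes apache/incubator-metron#441"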


Re: Odd integration-test failures on Fedora/CentOS for RC5

2017-02-26 Thread Kyle Richardson
I turned on debug logging; however, it seems to only confirm the query and
files included in the job.

17/02/27 02:25:33 DEBUG mr.PcapJob: Executing query protocol_==_6 on
timerange February 27, 2017 12:00:00 AM UTC to February 27, 2017 2:25:31 AM
UTC
17/02/27 02:25:33 DEBUG mr.PcapJob: Including files
hdfs://node1:8020/apps/metron/pcap/pcap_pcap_1488154851638515000_0_pcap-9-1488153894,hdfs://node1:8020/apps/metron/pcap/pcap_pcap_1488154982405815000_0_pcap-9-1488153894,hdfs://node1:8020/apps/metron/pcap/pcap_pcap_1488155123884215000_0_pcap-9-1488153894,hdfs://node1:8020/apps/metron/pcap/pcap_pcap_1488155239282312000_0_pcap-9-1488153894,hdfs://node1:8020/apps/metron/pcap/pcap_pcap_1488155395170343000_0_pcap-9-1488153894

It feels like I'm missing something stupid.


On Sun, Feb 26, 2017 at 8:53 PM, Kyle Richardson <kylerichards...@gmail.com>
wrote:

> Ok... Seems there is a dark cloud hanging over me. Trying to test
> METRON-743 (PR#467) on quick-dev. Followed Casey's instructions to a tee
> but am getting no results. I did have to stop HBase to free up enough
> resources for the job to run but didn't think that would be an issue. Any
> ideas?
>
> [root@node1 ~]# hadoop fs -ls -R /apps/metron/pcap
> -rw-r--r--   1 storm hadoop 458718 2017-02-27 00:23
> /apps/metron/pcap/pcap_pcap_1488154851638515000_0_pcap-9-1488153894
> -rw-r--r--   1 storm hadoop 472857 2017-02-27 00:25
> /apps/metron/pcap/pcap_pcap_1488154982405815000_0_pcap-9-1488153894
> -rw-r--r--   1 storm hadoop 451711 2017-02-27 00:28
> /apps/metron/pcap/pcap_pcap_1488155123884215000_0_pcap-9-1488153894
> -rw-r--r--   1 storm hadoop 447685 2017-02-27 00:30
> /apps/metron/pcap/pcap_pcap_1488155239282312000_0_pcap-9-1488153894
> -rw-r--r--   1 storm hadoop 432695 2017-02-27 00:49
> /apps/metron/pcap/pcap_pcap_1488155395170343000_0_pcap-9-1488153894
> [root@node1 ~]# /usr/metron/0.3.1/bin/pcap_inspector.sh -i
> /apps/metron/pcap/pcap_pcap_1488155123884215000_0_pcap-9-1488153894 -n 10
> TS: February 27, 2017 12:25:23 AM UTC,ip_src_addr:
> 192.168.1.128,ip_src_port: 54126,ip_dst_addr: 192.168.1.11,ip_dst_port:
> 8080,protocol: 6
> TS: February 27, 2017 12:25:24 AM UTC,ip_src_addr:
> 192.168.1.128,ip_src_port: 53212,ip_dst_addr: 192.168.1.11,ip_dst_port:
> 8080,protocol: 6
> TS: February 27, 2017 12:25:24 AM UTC,ip_src_addr:
> 192.168.1.11,ip_src_port: 8080,ip_dst_addr: 192.168.1.128,ip_dst_port:
> 53212,protocol: 6
> TS: February 27, 2017 12:25:24 AM UTC,ip_src_addr:
> 192.168.1.11,ip_src_port: 8080,ip_dst_addr: 192.168.1.128,ip_dst_port:
> 53212,protocol: 6
> TS: February 27, 2017 12:25:24 AM UTC,ip_src_addr:
> 192.168.1.128,ip_src_port: 53212,ip_dst_addr: 192.168.1.11,ip_dst_port:
> 8080,protocol: 6
> TS: February 27, 2017 12:25:25 AM UTC,ip_src_addr:
> 192.168.1.128,ip_src_port: 53212,ip_dst_addr: 192.168.1.11,ip_dst_port:
> 8080,protocol: 6
> TS: February 27, 2017 12:25:25 AM UTC,ip_src_addr:
> 192.168.1.11,ip_src_port: 8080,ip_dst_addr: 192.168.1.128,ip_dst_port:
> 53212,protocol: 6
> TS: February 27, 2017 12:25:25 AM UTC,ip_src_addr:
> 192.168.1.11,ip_src_port: 8080,ip_dst_addr: 192.168.1.128,ip_dst_port:
> 53212,protocol: 6
> TS: February 27, 2017 12:25:25 AM UTC,ip_src_addr:
> 192.168.1.128,ip_src_port: 53212,ip_dst_addr: 192.168.1.11,ip_dst_port:
> 8080,protocol: 6
> TS: February 27, 2017 12:25:25 AM UTC,ip_src_addr:
> 192.168.1.128,ip_src_port: 53212,ip_dst_addr: 192.168.1.11,ip_dst_port:
> 8080,protocol: 6
> [root@node1 ~]# /usr/metron/0.3.1/bin/pcap_query.sh query -st "20170227"
> -df "MMdd" --query "protocol == '6'" -rpf 500
> 17/02/27 01:42:11 INFO impl.TimelineClientImpl: Timeline service address:
> http://node1:8188/ws/v1/timeline/
> 17/02/27 01:42:11 INFO client.RMProxy: Connecting to ResourceManager at
> node1/127.0.0.1:8050
> 17/02/27 01:42:12 INFO client.AHSProxy: Connecting to Application History
> server at node1/127.0.0.1:10200
> 17/02/27 01:42:14 INFO input.FileInputFormat: Total input paths to process
> : 5
> 17/02/27 01:42:15 INFO mapreduce.JobSubmitter: number of splits:5
> 17/02/27 01:42:16 INFO mapreduce.JobSubmitter: Submitting tokens for job:
> job_1488159654301_0001
> 17/02/27 01:42:17 INFO impl.YarnClientImpl: Submitted application
> application_1488159654301_0001
> 17/02/27 01:42:17 INFO mapreduce.Job: The url to track the job:
> http://node1:8088/proxy/application_1488159654301_0001/
> 17/02/27 01:42:17 INFO mapreduce.Job: Running job: job_1488159654301_0001
> 17/02/27 01:42:26 INFO mapreduce.Job: Job job_1488159654301_0001 running
> in uber mode : false
> 17/02/27 01:42:26 INFO mapreduce.Job:  map 0% reduce 0%
> 17/02/27 01:42:56 INFO mapreduce.Job:  map 40% reduce 0%
> 17/02/27 01:43:10 INFO mapreduce.Job:  map 60% reduce 0%
> 17/02/27

Re: Odd integration-test failures on Fedora/CentOS for RC5

2017-02-26 Thread Kyle Richardson
Thanks! You guys are awesome! It's nice to know I'm not completely crazy ;).

I definitely needed another set of eyes on this. Special thanks to Otto for 
identifying the root cause so quickly.

I can confirm all of the integration tests pass for me with this PR. Working on 
running through the rest of Casey's test plan now.

-Kyle

> On Feb 25, 2017, at 11:57 PM, Casey Stella <ceste...@gmail.com> wrote:
> 
> METRON-743 (https://github.com/apache/incubator-metron/pull/467) for 
> reference.
> 
>> On Sat, Feb 25, 2017 at 11:51 PM, Casey Stella <ceste...@gmail.com> wrote:
>> Hmm, that's a very good catch if it's the issue.  I was able to verify that 
>> if you botch the sort order of the files that it fails.
>> 
>> Would you mind sorting the files on PcapJob line 199 by filename?  Something 
>> like Collections.sort(files, (o1,o2) -> 
>> o1.getName().compareTo(o2.getName()));
>> 
>> I'm going to submit a PR regardless because we should own the assumptions 
>> here, but I suspect that for the HDFS filesystem this works as expected.  
>> That being said, it's better to be safe than sorry.
>> 
>> Casey
>> 
>>> On Sat, Feb 25, 2017 at 11:35 PM, Otto Fowler <ottobackwa...@gmail.com> 
>>> wrote:
>>> /**
>>>  * List the statuses and block locations of the files in the given path.
>>>  * Does not guarantee to return the iterator that traverses statuses
>>>  * of the files in a sorted order.
>>>  * 
>>>  * If the path is a directory,
>>>  *   if recursive is false, returns files in the directory;
>>>  *   if recursive is true, return files in the subtree rooted at the path.
>>>  * If the path is a file, return the file's status and block locations.
>>>  * 
>>>  * @param f is the path
>>>  * @param recursive if the subdirectories need to be traversed recursively
>>>  *
>>>  * @return an iterator that traverses statuses of the files
>>>  *
>>>  * @throws FileNotFoundException when the path does not exist;
>>>  * @throws IOException see specific implementation
>>>  */
>>> public RemoteIterator listFiles(
>>> 
>>> 
>>> So if we depend on this returning something sorted, it is only working
>>> accidentally?
>>> 
>>> 
>>> On February 25, 2017 at 23:10:59, Otto Fowler (ottobackwa...@gmail.com)
>>> wrote:
>>> 
>>> https://issues.apache.org/jira/browse/HADOOP-12009  makes it seem like
>>> there is no order
>>> 
>>> 
>>> On February 25, 2017 at 23:06:37, Otto Fowler (ottobackwa...@gmail.com)
>>> wrote:
>>> 
>>> Maybe Hadoop Local FileSystem returns different things from ListFiles() on
>>> different platforms?
>>> That would be something to check?
>>> 
>>> Sorry that is all I got right now
>>> 
>>> 
>>> 
>>> On February 25, 2017 at 22:57:49, Otto Fowler (ottobackwa...@gmail.com)
>>> wrote:
>>> 
>>> There are also some if Log.isDebugEnabled() outputs, so maybe try changing
>>> the logging level, maybe running just this test?
>>> 
>>> 
>>> 
>>> On February 25, 2017 at 22:39:02, Otto Fowler (ottobackwa...@gmail.com)
>>> wrote:
>>> 
>>> There are multiple “tests” within the test, with different parameters.  If
>>> you look at where this is breaking, it is at
>>> 
>>> {
>>>   //make sure I get them all.
>>>   Iterable<byte[]> results =
>>>   job.query(new Path(outDir.getAbsolutePath())
>>>   , new Path(queryDir.getAbsolutePath())
>>>   , getTimestamp(0, pcapEntries)
>>>   , getTimestamp(pcapEntries.size()-1, pcapEntries) + 1
>>>   , 10
>>>   , new EnumMap<>(Constants.Fields.class)
>>>   , new Configuration()
>>>   , FileSystem.get(new Configuration())
>>>   , new FixedPcapFilter.Configurator()
>>>   );
>>>   assertInOrder(results);
>>>   Assert.assertEquals(Iterables.size(results), pcapEntries.size());
>>> 
>>> 
>>> 
>>> Which is the 7th test job run against the data.  I am not familiar with
>>> this test or code, but
>>> that has to be significant.
>>> 
>>> Maybe you should enable and print out the information of the results - and
>>> we can see a pattern there?
>>> 
>>> On February 25, 2017 at 22:19:00, K

Re: Files modified after building

2017-02-25 Thread Kyle Richardson
That's my guess. I noticed the same thing.

-Kyle

> On Feb 25, 2017, at 8:31 PM, Otto Fowler  wrote:
> 
> Changes not staged for commit:
> 
>  (use "git add ..." to update what will be committed)
> 
>  (use "git checkout -- ..." to discard changes in working directory)
> 
> 
> modified:
> metron-analytics/metron-profiler-client/src/main/java/Window.tokens
> 
> modified:
> metron-analytics/metron-profiler-client/src/main/java/WindowLexer.tokens
> 
> modified:
> metron-analytics/metron-profiler-client/src/main/java/org/apache/metron/profiler/client/window/generated/WindowLexer.java
> 
> modified:
> metron-analytics/metron-profiler-client/src/main/java/org/apache/metron/profiler/client/window/generated/WindowListener.java
> 
> modified:
> metron-analytics/metron-profiler-client/src/main/java/org/apache/metron/profiler/client/window/generated/WindowParser.java
> 
> 
> 
> 
> Are these files changed by the build?


Re: Odd integration-test failures on Fedora/CentOS for RC5

2017-02-25 Thread Kyle Richardson
mvn integration-test

Although I have also tried...
mvn clean install && mvn integration-test
mvn clean package && mvn integration-test
mvn install && mvn surefire-test@unit-tests && mvn 
surefire-test@integration-tests

-Kyle

> On Feb 25, 2017, at 8:34 PM, Otto Fowler <ottobackwa...@gmail.com> wrote:
> 
> What command are you using to build?
> 
> 
> 
>> On February 25, 2017 at 17:40:20, Kyle Richardson 
>> (kylerichards...@gmail.com) wrote:
>> 
>> Tried with Oracle JDK and got the same result. I went as far as trying to 
>> run it through the debugger but am not that familiar with this part of the 
>> code. The timestamps of the packets are definitely not coming back in the 
>> expected order, but I'm not sure why. Could it be related to something 
>> filesystem specific? 
>> 
>> Apologies if I'm just being dense but I'd really like to understand why this 
>> consistently fails on some platforms and not others. 
>> 
>> -Kyle 
>> 
>> > On Feb 25, 2017, at 9:07 AM, Kyle Richardson <kylerichards...@gmail.com> 
>> > wrote: 
>> > 
>> > Ok, I've tried this so many times I may be going crazy, so thought I'd ask 
>> > the community for a sanity check. 
>> > 
>> > I'm trying to verify RC5 and I keep running into the same integration test 
>> > failures but only on my Fedora (24 and 25) and CentOS 7 systems. It passes 
>> > fine on my Macbook. 
>> > 
>> > It always fails on the PcapTopologyIntegrationTest (test results pasted 
>> > below). Anyone have any ideas? I'm using the exact same version of maven 
>> > in all cases (v3.3.9). The only difference I can think of is the 
>> > Fedora/CentOS systems are using OpenJDK whereas the Macbook is running 
>> > Sun/Oracle JDK. 
>> > 
>> > --- 
>> > T E S T S 
>> > --- 
>> > Running org.apache.metron.pcap.integration.PcapTopologyIntegrationTest 
>> > Formatting using clusterid: testClusterID 
>> > Formatting using clusterid: testClusterID 
>> > Sent pcap data: 20 
>> > Wrote 20 to kafka 
>> > Tests run: 2, Failures: 2, Errors: 0, Skipped: 0, Time elapsed: 42.011 sec 
>> > <<< FAILURE! - in 
>> > org.apache.metron.pcap.integration.PcapTopologyIntegrationTest 
>> > testTimestampInPacket(org.apache.metron.pcap.integration.PcapTopologyIntegrationTest)
>> >  Time elapsed: 26.968 sec <<< FAILURE! 
>> > java.lang.AssertionError 
>> > at org.junit.Assert.fail(Assert.java:86) 
>> > at org.junit.Assert.assertTrue(Assert.java:41) 
>> > at org.junit.Assert.assertTrue(Assert.java:52) 
>> > at 
>> > org.apache.metron.pcap.integration.PcapTopologyIntegrationTest.assertInOrder(PcapTopologyIntegrationTest.java:537)
>> >  
>> > at 
>> > org.apache.metron.pcap.integration.PcapTopologyIntegrationTest.testTopology(PcapTopologyIntegrationTest.java:383)
>> >  
>> > at 
>> > org.apache.metron.pcap.integration.PcapTopologyIntegrationTest.testTimestampInPacket(PcapTopologyIntegrationTest.java:135)
>> >  
>> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
>> > at 
>> > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>> >  
>> > at 
>> > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> >  
>> > at java.lang.reflect.Method.invoke(Method.java:498) 
>> > at 
>> > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>> >  
>> > at 
>> > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>> >  
>> > at 
>> > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>> >  
>> > at 
>> > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>> >  
>> > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) 
>> > at 
>> > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>> >  
>> > at 
>> > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>> >  
>> > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) 
>> > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) 
>> > at org.j

Re: Odd integration-test failures on Fedora/CentOS for RC5

2017-02-25 Thread Kyle Richardson
Tried with Oracle JDK and got the same result. I went as far as trying to run 
it through the debugger but am not that familiar with this part of the code. 
The timestamps of the packets are definitely not coming back in the expected 
order, but I'm not sure why. Could it be related to something filesystem 
specific?

Apologies if I'm just being dense but I'd really like to understand why this 
consistently fails on some platforms and not others. 

-Kyle
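
For context, here is a rough sketch of the kind of ordering check an assertion
like assertInOrder presumably performs over the returned packet timestamps. This
is a hypothetical stand-in written for illustration only, not the actual test
code in PcapTopologyIntegrationTest:

    import java.util.Arrays;
    import java.util.List;

    import org.junit.Assert;

    class OrderingCheckSketch {

        // Hypothetical stand-in for assertInOrder: fails as soon as a timestamp
        // is earlier than the one that precedes it.
        static void assertInOrder(List<Long> timestampsMs) {
            Long previous = null;
            for (Long ts : timestampsMs) {
                if (previous != null) {
                    Assert.assertTrue("Out of order: " + previous + " > " + ts,
                            previous <= ts);
                }
                previous = ts;
            }
        }

        public static void main(String[] args) {
            assertInOrder(Arrays.asList(1L, 2L, 3L)); // passes
            assertInOrder(Arrays.asList(3L, 1L, 2L)); // throws AssertionError
        }
    }

If the packets come back in a filesystem- or JDK-dependent order, a check like
this would fail on some platforms and pass on others, which matches the symptom
described above.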

> On Feb 25, 2017, at 9:07 AM, Kyle Richardson <kylerichards...@gmail.com> 
> wrote:
> 
> Ok, I've tried this so many times I may be going crazy, so thought I'd ask 
> the community for a sanity check.
> 
> I'm trying to verify RC5 and I keep running into the same integration test 
> failures but only on my Fedora (24 and 25) and CentOS 7 systems. It passes 
> fine on my Macbook.
> 
> It always fails on the PcapTopologyIntegrationTest (test results pasted 
> below). Anyone have any ideas? I'm using the exact same version of maven in 
> all cases (v3.3.9). The only difference I can think of is the Fedora/CentOS 
> systems are using OpenJDK whereas the Macbook is running Sun/Oracle JDK.
> 
> ---
>  T E S T S
> ---
> Running org.apache.metron.pcap.integration.PcapTopologyIntegrationTest
> Formatting using clusterid: testClusterID
> Formatting using clusterid: testClusterID
> Sent pcap data: 20
> Wrote 20 to kafka
> Tests run: 2, Failures: 2, Errors: 0, Skipped: 0, Time elapsed: 42.011 sec 
> <<< FAILURE! - in 
> org.apache.metron.pcap.integration.PcapTopologyIntegrationTest
> testTimestampInPacket(org.apache.metron.pcap.integration.PcapTopologyIntegrationTest)
>   Time elapsed: 26.968 sec  <<< FAILURE!
> java.lang.AssertionError
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.metron.pcap.integration.PcapTopologyIntegrationTest.assertInOrder(PcapTopologyIntegrationTest.java:537)
>   at 
> org.apache.metron.pcap.integration.PcapTopologyIntegrationTest.testTopology(PcapTopologyIntegrationTest.java:383)
>   at 
> org.apache.metron.pcap.integration.PcapTopologyIntegrationTest.testTimestampInPacket(PcapTopologyIntegrationTest.java:135)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:283)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:173)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:128)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:203)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:155)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
> 
> testTimestampInKey(org.apache.metron.pcap.integration.PcapTopologyIntegrationTest)
>   Time elapsed: 15.038 sec  <<< FAILURE!
> java.lang.AssertionError
>  

Odd integration-test failures on Fedora/CentOS for RC5

2017-02-25 Thread Kyle Richardson
Ok, I've tried this so many times I may be going crazy, so thought I'd ask
the community for a sanity check.

I'm trying to verify RC5 and I keep running into the same integration test
failures but only on my Fedora (24 and 25) and CentOS 7 systems. It passes
fine on my Macbook.

It always fails on the PcapTopologyIntegrationTest (test results pasted
below). Anyone have any ideas? I'm using the exact same version of maven in
all cases (v3.3.9). The only difference I can think of is the Fedora/CentOS
systems are using OpenJDK whereas the Macbook is running Sun/Oracle JDK.

---
 T E S T S
---
Running org.apache.metron.pcap.integration.PcapTopologyIntegrationTest
Formatting using clusterid: testClusterID
Formatting using clusterid: testClusterID
Sent pcap data: 20
Wrote 20 to kafka
Tests run: 2, Failures: 2, Errors: 0, Skipped: 0, Time elapsed: 42.011
sec <<< FAILURE! - in
org.apache.metron.pcap.integration.PcapTopologyIntegrationTest
testTimestampInPacket(org.apache.metron.pcap.integration.PcapTopologyIntegrationTest)
 Time elapsed: 26.968 sec  <<< FAILURE!
java.lang.AssertionError
at org.junit.Assert.fail(Assert.java:86)
at org.junit.Assert.assertTrue(Assert.java:41)
at org.junit.Assert.assertTrue(Assert.java:52)
at 
org.apache.metron.pcap.integration.PcapTopologyIntegrationTest.assertInOrder(PcapTopologyIntegrationTest.java:537)
at 
org.apache.metron.pcap.integration.PcapTopologyIntegrationTest.testTopology(PcapTopologyIntegrationTest.java:383)
at 
org.apache.metron.pcap.integration.PcapTopologyIntegrationTest.testTimestampInPacket(PcapTopologyIntegrationTest.java:135)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:283)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:173)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:128)
at 
org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:203)
at 
org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:155)
at 
org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)

testTimestampInKey(org.apache.metron.pcap.integration.PcapTopologyIntegrationTest)
 Time elapsed: 15.038 sec  <<< FAILURE!
java.lang.AssertionError
at org.junit.Assert.fail(Assert.java:86)
at org.junit.Assert.assertTrue(Assert.java:41)
at org.junit.Assert.assertTrue(Assert.java:52)
at 
org.apache.metron.pcap.integration.PcapTopologyIntegrationTest.assertInOrder(PcapTopologyIntegrationTest.java:537)
at 
org.apache.metron.pcap.integration.PcapTopologyIntegrationTest.testTopology(PcapTopologyIntegrationTest.java:383)
at 
org.apache.metron.pcap.integration.PcapTopologyIntegrationTest.testTimestampInKey(PcapTopologyIntegrationTest.java:152)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 

Re: new committer: Josh Meyer

2017-02-23 Thread Kyle Richardson
Welcome, Josh!

-Kyle

On Thu, Feb 23, 2017 at 3:17 PM, James Sirota <jsir...@apache.org> wrote:

> Oops. Sorry.  Cut and paste error... These Apache templates are hard :)
>
> The Podling Project Management Committee (PPMC) for Apache Metron
> (Incubating)
> has asked Josh Meyer to become a committer and we are pleased
> to announce that they have accepted.
>
>
> Being a committer enables easier contribution to the
> project since there is no need to go via the patch
> submission process. This should enable better productivity.
> Being a PMC member enables assistance with the management
> and to guide the direction of the project.
>
>
> 23.02.2017, 13:14, "James Sirota" <jsir...@apache.org>:
> > The Podling Project Management Committee (PPMC) for Apache Metron
> (Incubating)
> > has asked Kyle Richardson to become a committer and we are pleased
> > to announce that they have accepted.
> >
> > Being a committer enables easier contribution to the
> > project since there is no need to go via the patch
> > submission process. This should enable better productivity.
> > Being a PMC member enables assistance with the management
> > and to guide the direction of the project.
> >
> > ---
> > Thank you,
> >
> > James Sirota
> > PPMC- Apache Metron (Incubating)
> > jsirota AT apache DOT org
>
> ---
> Thank you,
>
> James Sirota
> PPMC- Apache Metron (Incubating)
> jsirota AT apache DOT org
>


Re: [DISCUSS] Coding style via checkstyle

2017-02-22 Thread Kyle Richardson
+1 to longer line lengths and blanket reformatting. Personally, I see IDE
integration as a must for adoption of checkstyle.

-Kyle

On Wed, Feb 22, 2017 at 1:39 AM, Matt Foley  wrote:

> +1, so do I.  Also like the idea of providing the necessary IntelliJ
> specification.
>
> On 2/21/17, 1:25 PM, "Otto Fowler"  wrote:
>
> +1.  I agree with Michael’s points.
>
> On February 21, 2017 at 16:23:21, Michael Miklavcic (
> michael.miklav...@gmail.com) wrote:
>
> +1 to a blanket reformat, failed build for improper formatting, and
> automated formatting. I strongly prefer to remove "thinking" from my
> code
> formatting and it has worked very well for me on large projects in the
> past. There is capability now in IntelliJ to work with Checkstyle as
> well.
> https://youtrack.jetbrains.com/issue/IDEA-61520#comment=27-1292600
> https://plugins.jetbrains.com/idea/plugin/1065-checkstyle-idea
>
> A quick search didn't yield any obviously robust tools for automating
> the
> formatting other than an older non-maintained project named Jalopy. I
> think
> the checkstyle integration with IntelliJ and Eclipse should suffice
> since
> the Maven plugin would give devs the ability to run checks locally and
> in
> Github via Travis.
>
>
> On Tue, Feb 21, 2017 at 12:32 PM, Nick Allen 
> wrote:
>
> > I would be in favor of a blanket, reformat. Whether that is for the
> entire
> > code base or one project at a time. Might be able to conquer and
> divide
> > some of the heavy-lifting of testing, if we do a project at a time.
> But
> > whichever way you think is easier. I'd be glad to help.
> >
> > On Tue, Feb 21, 2017 at 1:57 PM, Justin Leet 
> > wrote:
> >
> > > I already tried a blanket, manual reformat the other day, through
> > > IntelliJ. I did every file matching *.java in the project and it
> was
> > > pretty quick. I didn't validate everything looked perfect
> afterwards,
> > but I
> > > did click into a few files and things looked fine. I'm not quite
> sure
> > what
> > > the lifecycle of our autogenerated stuff is, so we'd want to regen
> > > afterwards, but it's a pretty trivial thing to do.
> > >
> > > I'm sure there's more nuance (and definitely more testing) than
> that,
> but
> > > off the top of my head I'm not sure what it would be. Either way, I
> don't
> > > think there's a huge amount of effort to just do the reformat, but
> we'd
> > > still want to spin everything up and test it and so on. It's
> probably
> > more
> > > work for everybody to rebase onto the (vastly) reformatted code
> than
> > > anything else, which will vary pretty significantly.
> > >
> > > For (slight) context, the changes are enough to eliminate ~5k
> checkstyle
> > > warnings (and there might be more if we have to tweak anything in
> the
> > code
> > > formatting).
> > >
> > > On Tue, Feb 21, 2017 at 10:34 AM, Casey Stella  >
> > wrote:
> > >
> > > > Any idea, with those modifications to checkstyle, how much
> effort it
> > will
> > > > take to reformat the code to conform?
> > > >
> > > > On Tue, Feb 21, 2017 at 8:23 AM, Justin Leet <
> justinjl...@gmail.com>
> > > > wrote:
> > > >
> > > > > As part of:
> > > > > https://issues.apache.org/jira/browse/METRON-726
> > > > > https://github.com/apache/incubator-metron/pull/459
> > > > >
> > > > > I integrated checkstyle into the mvn:site command, and have
> > checkstyle
> > > > > reports being run as part of the mvn:site reporting. I expect
> to be
> > > > > celebrating hitting 25k checkstyle warnings soon.
> > > > >
> > > > > I tested out creating a code formatting setup in IntelliJ,
> with a
> > > couple
> > > > > slight modifications of the default Sun conventions (extended
> the
> > > > character
> > > > > limit of a line past 80 and made it two space indents). Given
> that
> > > > > checkstyle includes it as a default option, it's probably
> reasonably
> > > > close
> > > > > to the Sun conventions. I'm thinking we probably also at least
> create
> > > an
> > > > > Eclipse profile, to open up ease of development.
> > > > >
> > > > > There's probably also a discussion about how exactly we want to
> > enforce
> > > > it.
> > > > > Is it just something we add to the PR checklist and have
> reviewers
> > > give a
> > > > > glance, do we setup a hook to autoformat code, etc?
> > > > >
> > > > > Justin
> > > > >
> > > >
> > >
> >
>
>
>
>


Re: custom date format required for snort, but not working

2017-02-21 Thread Kyle Richardson
You're correct, a ZonedDateTime requires a year. I ran into this when
parsing the RFC3164 syslog timestamps.

Glad he was able to find the config option to enable the year in Snort.

-Kyle
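
For anyone hitting the same parsing issue, below is a minimal sketch of how a
year-less timestamp can be parsed by supplying a default year through
java.time.format.DateTimeFormatterBuilder. The pattern string and the hard-coded
default year are assumptions for illustration only; this is not the Metron
parser code:

    import java.time.LocalDateTime;
    import java.time.ZoneId;
    import java.time.format.DateTimeFormatter;
    import java.time.format.DateTimeFormatterBuilder;
    import java.time.temporal.ChronoField;

    class YearlessTimestampSketch {

        public static void main(String[] args) {
            // Assumed Snort-style pattern with no year and microsecond precision.
            DateTimeFormatter formatter = new DateTimeFormatterBuilder()
                    .appendPattern("MM/dd-HH:mm:ss.SSSSSS")
                    // Supply the missing year; a real parser would more likely
                    // default to the current year than a hard-coded value.
                    .parseDefaulting(ChronoField.YEAR, 2017)
                    .toFormatter();

            LocalDateTime parsed =
                    LocalDateTime.parse("02/18-16:24:46.262884", formatter);
            long epochMillis = parsed.atZone(ZoneId.systemDefault())
                    .toInstant()
                    .toEpochMilli();
            System.out.println(epochMillis);
        }
    }

Enabling show_year in Snort, as Otto notes below, avoids the problem at the
source, but a defaulting formatter is one option when the upstream format cannot
be changed.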

On Tue, Feb 21, 2017 at 7:59 AM, Otto Fowler 
wrote:

> ok -
>
> # Configure Snort to show year in timestamps
> config show_year
>
> looks like it fixed it for him.
> I create a jira to make sure this is in our default
>
> On February 20, 2017 at 16:47:29, Otto Fowler (ottobackwa...@gmail.com)
> wrote:
>
> There is someone on the user list getting errors from snort, and I sent him
> this reply:
>
> -
> 2017-02-20 16:00:14 ERROR BasicSnortParser:179 - Unable to parse message:
> 02/18-16:24:46.262884 ,1,999158,0,"'snort test
> alert'",TCP,192.168.1.85,58472,192.168.1.216,22,34:68:
> 95:01:D1:BB,52:54:00:E0:8F:0D,0x42,***A,0x6756B8AF,
> 0xA5EF764E,,0x5A4,64,16,57034,52,53248
> java.time.format.DateTimeParseException: Text '02/18-16:24:46.262884'
> could
> not be parsed at index 5
>
> We expect a date more like 01/27/16-16:01:04.877970
> So the year is missing.
>
>
> Our default date formatter for snort is defined as
> MM/dd/yy-HH:mm:ss.SS
>
> You can change this by adding “dateFormat”:”your format” to your parser
> configuration
> ——
>
> The issue is, I can’t get this to work.  I don’t think that the
> ZonedDateTime will work if the year is missing.
>
> I tried the following test:
>
> import java.time.ZoneId;
> import java.time.ZonedDateTime;
> import java.time.format.DateTimeFormatter;
>
> class Untitled {
>
>     public static void main(String[] args) {
>         String fmt = "MM/dd-HH:mm:ss.SS";
>         String old = "MM/dd/yy-HH:mm:ss.SS";
>         String dateString = "02/18-16:24:46.262900";
>         String oldString = "02/18/17-16:24:46.262900";
>
>         DateTimeFormatter df = DateTimeFormatter.ofPattern(fmt);
>         df = df.withZone(ZoneId.systemDefault());
>         ZonedDateTime zdt = ZonedDateTime.parse(dateString, df);
>
>         System.out.println(String.format("%d", zdt.toInstant().toEpochMilli()));
>     }
> }
>
>
> Old and oldString work.
>
>
> fmt and dateString don’t with exception:
>
>
> Exception in thread "main" java.time.format.DateTimeParseException: Text
> '02/18-16:24:46.262900' could not be parsed: Unable to obtain ZonedDateTime
> from TemporalAccessor: {MonthOfYear=2, DayOfMonth=18},ISO,America/New_York
> resolved to 16:24:46.262900 of type java.time.format.Parsed
>
> at
> java.time.format.DateTimeFormatter.createError(
> DateTimeFormatter.java:1920)
>
> at java.time.format.DateTimeFormatter.parse(DateTimeFormatter.java:1855)
>
> at java.time.ZonedDateTime.parse(ZonedDateTime.java:597)
>
> at Untitled.main(Untitled 2.java:13)
>
> Caused by: java.time.DateTimeException: Unable to obtain ZonedDateTime from
> TemporalAccessor: {MonthOfYear=2, DayOfMonth=18},ISO,America/New_York
> resolved to 16:24:46.262900 of type java.time.format.Parsed
>
> at java.time.ZonedDateTime.from(ZonedDateTime.java:565)
>
> at java.time.format.Parsed.query(Parsed.java:226)
>
> at java.time.format.DateTimeFormatter.parse(DateTimeFormatter.java:1851)
>
> ... 2 more
>
> Caused by: java.time.DateTimeException: Unable to obtain LocalDate from
> TemporalAccessor: {MonthOfYear=2, DayOfMonth=18},ISO,America/New_York
> resolved to 16:24:46.262900 of type java.time.format.Parsed
>
> at java.time.LocalDate.from(LocalDate.java:368)
>
> at java.time.ZonedDateTime.from(ZonedDateTime.java:559)
>
> ... 4 more
>
>
> The snort parser doesn’t document the dateFormat override ( METRON-729 ).
> I don’t know and have not found a way to modify how snort outputs the date
> string.
>
> Any ideas?
>


Re: [DISCUSS] Management of Elastic and other index schemas

2017-02-17 Thread Kyle Richardson
I personally like the idea of a typed schema per parser that we could
translate to multiple targets. This would allow us a lot more modularity
and extensibility in indexing down the road.

-Kyle

On Fri, Feb 17, 2017 at 1:59 PM, Simon Elliston Ball <
si...@simonellistonball.com> wrote:

> That sounds like a great idea Otto. Do you have any early design on that
> we can look at. Also, rather than just elastic templates do you think we
> should have some sort of typed schema we could translate to multiple
> targets (solr, elastic, ur... other...) or are you thinking of packaging
> specific scheme assets like template json with the parser?
>
> Simon
>
> > On 17 Feb 2017, at 18:42, Otto Fowler  wrote:
> >
> >
> > Not to jump the gun, but I’m crafting a proposal about parsers and one
> of the things I am going to propose relates to having the ES Template for a
> given parser installed or packaged with the parser.  We could load the
> template from there, edit, save and deploy etc.  We can extend that concept
> more and more later (drafts, versioning etc )
> >
> >
> >> On February 17, 2017 at 13:22:45, Simon Elliston Ball (
> si...@simonellistonball.com) wrote:
> >>
> >> A little while ago the issue of managing Elastic templates for new
> sensor configs came up, and we didn’t quite put it to bed.
> >>
> >> When creating new sensors, I almost invariably find the auto-generated
> schemas for elastic pick some incorrect types. I also find I have to
> recreate indexes every time to push in the proper dynamic templates for
> things like geo enrichment fields.
> >>
> >> So, my questions are:
> >> How should we address elastic template for new sensors?
> >> Do we have circumstances where we would need to configure types, or can
> we get away with inferring them?
> >> Should we just add some additional dynamic templates to cover our
> common fields like timestamp (the most common culprit I find for incorrect
> typing)?
> >>
> >> I’d also like to think about ways we can generalise this. Does anyone
> have any thoughts on what sort of additional index schemes we should want
> to infer (solr seems an obvious one, any others?).
> >>
> >> Thoughts on a well typed, schemaed and easily indexed postcard please :)
> >>
> >> Simon
>


Re: [Discuss] Direction of metron-docker

2017-02-06 Thread Kyle Richardson
I like the idea of porting some of the integration tests to metron-docker.
I believe the maven plugin used in the rpm-docker project could be used to
support that goal.

I agree with Ryan in that I see this as more of a toolbox for developers
than a supported deployment method. That is the vein I originally created 
this PR in, actually. I could continue to load the elasticsearch templates 
manually when working with metron-docker but thought it would be worthwhile
to automate with a few lines of code.

I have another PR just about ready to go to include a hadoop/hdfs container
in metron-docker. Would folks see value in including this? The idea was to
provide an easier way to iterate on HDFS indexing options for cold
storage/archive data.

As for maintainability, the minimum would be to keep consistent versions of
storm, hbase, etc between the docker containers and the current supported
HDP stack. The automation pieces are nice to haves (not blockers in my
mind) and will continue to simplify as we move more configs into zookeeper
from the filesystem. I can't think of anything too onerous here but I may
be missing something obvious.

-Kyle

On Mon, Feb 6, 2017 at 2:30 PM, Otto Fowler  wrote:

> Beyond the utility, is the cost of maintaining the docker path.  It is just
> another thing that reviewers and committers have to keep in mind or know
> about when looking at PR’s.  Maybe if there was a better and wider spread
> understanding of the work that is done and how continue it, it would not
> seem so onerous.  It can’t be something that as long as one or two specific
> people keep up with it, it will be OK, or rather it should not be.  Even
> if, or perhaps because it won’t break the build.
>
> There is a lot of utility and value to metron-docker, maybe we just need to
> think through the sustainability and maintaining issues, so it is a how can
> we make it work to the project’s satisfaction.
>
> On February 6, 2017 at 14:11:04, Casey Stella (ceste...@gmail.com) wrote:
>
> So, I'm late chiming in here, but I'll go ahead anyway. :)
>
> There are a couple of questions here that stand out:
>
> *Is the docker infrastructure sufficient to replace vagrant at the moment?*
>
> I do not consider it to be a sufficient environment to acceptance test
> features because it does not install Metron in a realistic manner that
> mimics a user. Vagrant isn't currently where it should be in that regard
> and that is the reason that it is currently getting an overhaul to get
> closer to that ideal.
>
> *Does it scratch an itch?*
>
> Yes, it does, I think. For those who want a limited portion of metron spun
> up to smoke-test features in a targeted way, this works well. That being
> said, in my opinion, you still need to test in vagrant or a cluster. Matt
> brings up a good point as well about integration test infrastructure. I
> think there could be an even bigger itch to scratch there as the cost of
> spinning up and down integration testing components per-test can be time
> consuming and lead to long build times.
>
> *Can we unify them?*
>
> I don't know; I'd like to, honestly. I think that it'd be a good
> discussion to have and it'd be nice to have a path to victory there,
> because I'm not thrilled about having so many avenues to install. If we
> don't unify them, I feel that docker will eventually get so far out of date
> that it will become unusable, frankly.
>
>
> Ultimately, I don't care about the tech stack that we use, docker vs
> vagrant vs vagrant on docker vs whatever; I just want
>
> 1. A way to spin up Metron in an automated way using the same mechanism
> that the user uses to install (management pack)
> 2. It'd be nice to be able to slice and dice capabilities (sensors
> on/off, etc)
> 3. It'd be nice for it to not cause my machine to sound like a jet is
> taking off
>
>
> Anyway, those are my $0.02 and I want to thank Nick for bringing up the
> conversation. I think it's a good one to have, for sure.
>
> Casey
>
> On Mon, Feb 6, 2017 at 1:44 PM, David Lyle  wrote:
>
> > Exactly that, Matt. I think of it as an integration test enabler, so it's
> > squarely pointed at the use case you describe.
> >
> > Ryan - I didn't hear anyone asking for it to be removed, just some
> > clarification about its purpose and future use.
> >
> > Wrt "completely unusable": perhaps, since we require committed code to be
> > run through Quick Dev, we should have a focused community effort to make
> it
> > a bit more usable. Fwiw, with some recent changes from Justin and Nick,
> > it's working better than it had been in recent memory. It will be working
> > even better once I can get my current stuff pushed out. If it's still
> unsat
> > after that, we gotta fix it.
> >
> > -D...
> >
> >
> > On Mon, Feb 6, 2017 at 1:29 PM, Matt Foley  wrote:
> >
> > > There may be another area of application for this. I’m not certain, so
> > > tell me if I’m off base.
> > >
> > > In the 

Re: [Discuss] Direction of metron-docker

2017-02-05 Thread Kyle Richardson
My working assumption has been that metron-docker would provide a lightweight 
development environment without all of the overhead of a full cluster. The 
ability to quickly spin up and down specific components while coding is a big 
win in my book.

Unfortunately, since metron-docker containers are built on the individual 
binaries and not Ambari/HDP, it's not really feasible to reuse the MPack 
deployment. There's potential to reuse some of the Ansible roles but I haven't 
looked into it in detail.

To Nick's point, I'm not sure we want another entire deployment solution. 
What's the right balance? Honestly, I'm not sure. I'd like to think it's enough 
automation to support end-to-end data flow but not so much to become a burden 
to maintain for the community.

I'm very interested to hear others thoughts on this.

-Kyle

> On Feb 5, 2017, at 4:22 PM, Nick Allen  wrote:
> 
> Where is the `metron-docker` code base headed? What do we want that to be?
> How will it work with the other deployment mechanisms?
> 
> So far a lot of work has gone into creating the Ambari MPack and we have
> been moving away from the legacy Ansible deployments.  I have a limited
> understanding of the `metron-docker` stuff, but it seems to introduce a
> third deployment mechanism via the Docker files.
> 
> Is there no way to leverage the existing deployment paths for
> `metron-docker`?
> 
> 
> 
> -- Forwarded message --
> From: nickwallen 
> Date: Sun, Feb 5, 2017 at 4:09 PM
> Subject: [GitHub] incubator-metron issue #441: METRON-646: Add index
> templates to metron-docke...
> To: dev@metron.incubator.apache.org
> 
> 
> Github user nickwallen commented on the issue:
> 
>https://github.com/apache/incubator-metron/pull/441
> 
>Hi @kylerichardson - I don't want to throw cold water on your effort,
> but I am hesitant to create a third deployment code base for
> `metron-docker` (in addition to MPack and Ansible.)  Do you think that is
> what this is or would become?
> 
>Besides just the index templates, we'd have to add and support a lot of
> other functionality too.  Seems like we should have a goal to move towards
> a single deployment mechanism that works across multiple platforms (Docker,
> Metal, etc).
> 
>I don't even know if this is feasible, but it may be worth a community
> discussion.  I'll kick something off.
> 
> 
> ---
> If your project is set up for it, you can reply to this email and have your
> reply appear on GitHub as well. If your project does not have this feature
> enabled and wishes so, or if the feature is enabled but not working, please
> contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
> with INFRA.
> ---


Re: [DISCUSS] Metron Management UI

2017-02-01 Thread Kyle Richardson
This looks awesome! A great starting point. Like Jon, I'm looking forward
to kicking the tires.

-Kyle

On Wed, Feb 1, 2017 at 10:18 AM, Ryan Merriman  wrote:

> Jon, I've done my best to keep this UI in sync with the API PR so you can
> play around with it now if you want.  There are 2 different versions of the
> UI:  one without the Blockly editor (
> https://github.com/merrimanr/incubator-metron/tree/METRON-623) and one
> with
> (https://github.com/merrimanr/incubator-metron/tree/blockly).  The Blockly
> editor feature contains a significant amount of code so we decide to keep
> it in it's own branch so reviewing will be easier.
>
> If you are familiar with the javascript ecosystem it's not hard to start it
> up but some instructions will definitely help.  I will add a README to make
> this easier but here's a quick summary:
>
> Start up the Docker environment in metron-docker
> Set the "docker.host.address" property in
> /incubator-metron/metron-interface/metron-rest/src/
> main/resources/application-docker.yml
> to match the IP address of your Docker machine
> Run org.apache.metron.rest.MetronRestApplication with spring active
> profiles set to "docker,dev"
> Navigate to metron-interface/metron-config and run "npm install"
> Start the UI with "metron-config/scripts/start-dev.sh"
>
> We're still working on a seamless install process but this should work in
> the meantime.  Feel free to reach out if you need help getting it going.
> As for feedback, I wouldn't limit it to what the API currently provides the
> API PR.  We can add services as needed.
>
> Ryan
>
> On Tue, Jan 31, 2017 at 7:49 PM, zeo...@gmail.com 
> wrote:
>
> > First off - this is an awesome first take at a management UI and I'm
> > looking forward to messing around with it.  Other than skimming some of
> the
> > dialogue as it comes in I have not been keeping up with the API PR.
> Should
> > I be able to assume that the UI PR is broken until the API is merged and
> > the UI can be updated for any changes?  I have not spun up the UI based
> on
> > that assumption.
> >
> > Are you looking for feedback that's limited to what can currently be done
> > using the API (a CRUD interface for SensorParserConfigs)?
> >
> > Thanks,
> >
> > Jon
> >
> > On Tue, Jan 31, 2017, 7:06 PM Houshang Livian 
> > wrote:
> >
> > Hello Metron Community,
> >
> > We have constructed a Management Module UI, built on top of METRON-503
> > (REST API) (currently under review). This Module gives users the ability
> to
> > setup and administer much of the product through the UI.
> >
> > Here are some screens to show you what we are thinking. Please take a
> look:
> > http://imgur.com/a/QAQyu?
> >
> > Does this look like a reasonable place to start?
> > Is there anything that is an absolute MUST have or MUST NOT have?
> >
> > Houshang Livian
> > Senior User Experience Designer
> > Hortonworks
> >
> > www.hortonworks.com
> > 
> >
> > Mobile: (831) 521-4176
> > hliv...@hortonworks.com
> > 
> >
> > --
> >
> > Jon
> >
> > Sent from my mobile device
> >
>


Re: [DISCUSS] Contributions with multiple authors and requisite modifications to the development guide

2017-01-31 Thread Kyle Richardson
+1 I think this provides a reasonable set of expectations for the author
and committer.

-Kyle

On Tue, Jan 31, 2017 at 10:46 AM, Nick Allen  wrote:

> I am in agreement with what I am reading.  Something along the lines of the
> following.
>
>- Best effort should be made to create a pull request that can be
>attributed to a single author.
>- If the work of multiple authors cannot be split into separate pull
>requests for justifiable reasons, then it is the author's
> responsibility to
>create a pull request with the minimal number of commits required to
>provide appropriate attribution to multiple authors.
>- Each commit should be attributed to a single author and follow the
>commit message guidelines identifying the issue associated with the
> change.
>- When merging the pull request, the committer will not squash these
>commits so that attribution to each author can be maintained in the
>revision history.
>
>
> On Tue, Jan 31, 2017 at 9:42 AM, Otto Fowler 
> wrote:
>
> > Just to be clear - I’m not pushing for this, I was more curious about how
> > to do it if we wanted to.  My main interest is that the process we agree
> to
> > fosters and encourages collaboration rather than make it such a pain that
> > contributors think twice before helping out on large PR efforts.
> >
> > How did we do it for the Storm efforts?
> >
> >
> > On January 31, 2017 at 09:21:17, Casey Stella (ceste...@gmail.com)
> wrote:
> >
> > To my reading, that script only notes additional authors, it does not
> > create separate commits for those authors.  I feel that we should give
> > authors the ability to construct pull requests in such a way that
> > attribution in the form of github statistics are preserved.
> >
> > Also, and this is for the mentors, does the zookeeper methodology abide
> by
> > the requirement of having the commit log be a place of record for
> > authorization?
> >
> > Casey
> >
> > On Tue, Jan 31, 2017 at 9:16 AM, Otto Fowler 
> > wrote:
> >
> > > https://cwiki.apache.org/confluence/display/ZOOKEEPER/
> > > Merging+Github+Pull+Requests
> > >
> > >
> > > On January 31, 2017 at 09:04:38, Casey Stella (ceste...@gmail.com)
> > wrote:
> > >
> > > The problem is that a single commit cannot have multiple authors and
> it's
> > > difficult to construct a minimal set of commits for multiple authors in
> > the
> > > same PR by the committers. I would be in favor of not automating this
> > and,
> > > rather, insisting that the pull request either be split up into
> multiple
> > > PRs or have the individual PR's commits squashed and named
> appropriately
> > by
> > > the authors.
> > >
> > > For reference, here's how Apache Apex does this. Lines item 7 and 8
> under
> > > "Opening a Pull Request" at https://apex.apache.org/contributing.html
> :
> > >
> > > 1.
> > >
> > > After all review is complete, combine all new commits into one squashed
> > > commit except when there are multiple contributors, and include the
> Jira
> > > number in the commit message. There are several ways to squash
> > > commits, but here
> > > is one explanation from git-scm.com
> > >  > > History#Squashing-Commits>
> > > and
> > > a simple example is illustrated below:
> > >
> > > If tracking upstream/master then run git rebase -i. Else run git rebase
> > > -i upstream/master. This command opens the text editor which lists the
> > > multiple commits:
> > >
> > > pick 67cd79b change1
> > > pick 6f98905 change2
> > >
> > > # Rebase e13748b..3463fbf onto e13748b (2 command(s))
> > > #
> > > # Commands:
> > > # p, pick = use commit
> > > # r, reword = use commit, but edit the commit message
> > > # e, edit = use commit, but stop for amending
> > > # s, squash = use commit, but meld into previous commit
> > > # f, fixup = like "squash", but discard this commit's log message
> > > # x, exec = run command (the rest of the line) using shell
> > > #
> > > # These lines can be re-ordered; they are executed from top to bottom.
> > >
> > > Squash 'change2' to 'change1' and save.
> > >
> > > pick 67cd79b change1
> > > squash 6f98905 change2
> > >
> > > 2. If there are multiple contributors in a pull request preserve
> > > individual attributions. Try to squash the commits to the minimum
> number
> > of
> > > commits required to preserve attribution and the contribution to still
> be
> > > functionally correct.
> > >
> > > Their language is similar to the ones proposed by Dave and I like it.
> > This
> > > is, I consider, a special case and we should avoid trying to automate
> it,
> > > but rather delegate it to meatspace. Thoughts?
> > >
> > > On Tue, Jan 31, 2017 at 5:58 AM, zeo...@gmail.com 
> > > wrote:
> > >
> > > > I thought the idea would be to maintain the commit mapping to
> > individuals
> > > > so things like the GitHub statistics are accurate regarding # of
> > > 

Re: [VOTE] Release Process

2017-01-18 Thread Kyle Richardson
+1 (binding)

-Kyle

On Tue, Jan 17, 2017 at 11:17 PM, James Sirota  wrote:

> I made the revisions based on the discuss thread
>
> The document is available here:
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=66854770
>
> And is also attached for reference to this email.
>
> Please vote +1, -1, or 0 for neutral.  The vote will last 72 hours.
>
> Thanks,
> James
>
> -
> Metron Release Types
> There are two types of Metron releases:
> Feature Release (FR) - this is a release that has a significant step
> forward in feature capability and is denoted by an upgrade of the second
> digit
> Maintenance Release (MR) - this is a set of patches and fixes that are
> issued following the FR and is denoted by an upgrade of the third digit
> Release Naming Convention
> Metron build naming convention is as follows: 0.[FR].[MR].  We keep the 0.
> notation to signify that the project is still under active development and
> we will hold a community vote to go to 1.x at a future time
> Initiating a New Metron Release
> Immediately upon the release of the previous Metron release create two
> branches: FR ++ and MR.  Create the FR++ branch by incrementing the second
> digit like so 0.[FR++].0.  Create the MR branch for the previous Metron
> release by incrementing the second digit of the previous release like so
> 0.[FR].[MR].  All patches to the previous Metron release will be checked in
> under the MR branch and where it makes sense also under the FR branch.  All
> new features will be checked in under the FR branch.
> Creating a Feature Release
> Step 1 - Initiate a discuss thread
> Prior to the release The Release manager should do the following
> (preferably a month before the release):
> Make sure that the list of JIRAs slated for the release accurately
> reflects the pull requests that are currently in master
> Construct an email to the Metron dev board (dev@metron.incubator.apache.
> org) which discusses with the community the desire to do a release. This
> email should contain the following:
> The list of JIRAs slated for the release with descriptions (use the output
> of git log and remove all the JIRAs from the last release’s changelog)
> A solicitation of JIRAs that should be included with the next release.
> Users should rate them as must/need/good to have as well as volunteering.
> A release email template is provided here.
> Step 2 - Monitor and Verify JIRAs
> Once the community votes for additional JIRAs they want included in the
> release verify that the pull requests are in before the release, close
> these JIRAs and tag them with the release name. All pull requests and JIRAs
> that were not slated for this release will go into the next releases.  The
> release manager should continue to monitor the JIRA to ensure that the
> timetable is on track until the release date.  On the release date the
> release manager should message the Metron dev board (
> dev@metron.incubator.apache.org) announcing the code freeze for the
> release.
> Step 3 - Create the Release Branch and Increment Metron version
> Create an branch for the release (from a repo cloned from
> https://git-wip-us.apache.org/repos/asf/incubator-metron.git). (assuming
> the release is 0.[FR++].0 and working from master):
> git checkout -b Metron_0.[FR++].0
> git push --set-upstream origin Metron_0.[FR++].0
> File a JIRA to increment the Metron version to 0.[FR++].0.  Either do it
> yourself or have a community member increment the build version for you.
> You can look at a pull request for a previous build to see how this is
> done.   METRON-533 - Up the version for release DONE
> Also, the release manager should have a couple of things set up:
> A SVN clone of the repo at https://dist.apache.org/repos/
> dist/dev/incubator/metron, We will refer to this as the dev repo.  It
> will hold the release candidate artifacts
> A SVN clone of the repo at https://dist.apache.org/repos/
> dist/release/incubator/metron, We will refer to this as the release
> repo.  It will hold the release artifacts.
> Step 4 - Create the Release Candidate
>
> Now, for each release candidate, we will tag from that branch. Assuming
> that this is RC1:
> git checkout Metron_0.[FR++].0 && git pull
> git tag apache-metron-0.[FR++].0-rc1-incubating
> git push origin —tags
> Now we must create the release candidate tarball. From the apache repo,
> you should run:
>
>  git archive --prefix=apache-metron-0.[FR++].0-rc1-incubating/
>  apache-metron-0.[FR++].0-rc1-incubating | gzip >
>  apache-metron-0.[FR++].0-rc-incubating.tar.gz
>
> We will refer to this as the release candidate tarball. *Note: Per Apache
> policy, the hardware used to create the candidate tarball must be owned by
> the release manager.
> The artifacts for a release (or a release candidate, for that matter) are
> as follows:
> Release (candidate) Tarball
>  MD5 hash of the release tarball (md5 apache-metron-Now, we must grab the
> release candidate 

Re: [DISCUSS] Moving GeoIP management away from MySQL

2017-01-16 Thread Kyle Richardson
+1 Agree with David's order

-Kyle
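
For reference, a minimal sketch of what a direct lookup against the MaxMind
GeoIP2-java client mentioned below might look like. The database path and the
sample IP are assumptions for illustration; this is not Metron code:

    import java.io.File;
    import java.net.InetAddress;

    import com.maxmind.db.CHMCache;
    import com.maxmind.geoip2.DatabaseReader;
    import com.maxmind.geoip2.model.CityResponse;

    class GeoLookupSketch {

        public static void main(String[] args) throws Exception {
            // Assumed location of a MaxMind GeoLite2 City database file.
            File database = new File("/opt/maxmind/GeoLite2-City.mmdb");

            // DatabaseReader reads the binary .mmdb file directly; CHMCache
            // keeps recently looked-up records in memory.
            DatabaseReader reader = new DatabaseReader.Builder(database)
                    .withCache(new CHMCache())
                    .build();

            CityResponse response = reader.city(InetAddress.getByName("8.8.8.8"));
            System.out.println(response.getCountry().getIsoCode());
            System.out.println(response.getCity().getName());
            System.out.println(response.getLocation().getLatitude() + ","
                    + response.getLocation().getLongitude());
        }
    }

If the client handles the range lookup and caching itself, the remaining work
would mostly be distributing the .mmdb file to the workers and refreshing it on
update, as discussed in the thread below.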

On Mon, Jan 16, 2017 at 12:41 PM, David Lyle  wrote:

> Def agree on the parity point.
>
> I'm a little worried about Supervisor relocations for non-HBase solutions,
> but having much of the work done for us by MaxMind changes my preference to
> (in order)
>
> 1) MM API
> 2) HBase Enrichment
> 3) MapDB should the others prove not feasible
>
>
> -D...
>
>
> On Mon, Jan 16, 2017 at 12:15 PM, Justin Leet 
> wrote:
>
> > I definitely agree on checking out the MaxMind API.  I'll take a look at
> > it, but at first glance it looks like it does include everything we use.
> > Great find, JJ.
> >
> > More details on various people's points:
> >
> > As a note to anyone hopping in, Simon's point on the range lookup vs a
> key
> > lookup is why it becomes a Scan in HBase vs a Get.  As an addendum to
> what
> > Simon mentioned, denormalizing is easy enough and turns it into an easy
> > range lookup.
> >
> > To David's point, the MapDB does require a network hop, but it's once per
> > refresh of the data (Got a relevant callback? Grab new data, load it,
> swap
> > out) instead of (up to) once per message.  I would expect the same to be
> > true of the MaxMind db files.
> >
> > I'd also argue MapDB is not really more complex than refreshing the HBase
> > table, because we potentially have to start worrying about things like
> > hashing and/or indices and even just general data representation. It's
> > definitely correct that the file processing has to occur on either path,
> so
> > it really boils down to handling the callback and reloading the file vs
> > handling some of the standard HBasey things.  I don't think either is an
> > enormous amount of work (and both are almost certainly more work than
> > MaxMind's API)
> >
> > Regarding extensibility, I'd argue for parity with what we have first,
> then
> > build what we need from there.  Does anybody have any disagreement with
> > that approach for right now?
> >
> > Justin
> >
> > On Mon, Jan 16, 2017 at 12:04 PM, David Lyle 
> wrote:
> >
> > > It is interesting- save us a ton of effort, and has the right license.
> I
> > > think it's worth at least checking out.
> > >
> > > -D...
> > >
> > >
> > > On Mon, Jan 16, 2017 at 12:00 PM, Simon Elliston Ball <
> > > si...@simonellistonball.com> wrote:
> > >
> > > > I like that approach even more. That way we would only have to worry
> > > about
> > > > distributing the database file in binary format to all the supervisor
> > > nodes
> > > > on update.
> > > >
> > > > It would also make it easier for people to switch to the enterprise
> DB
> > > > potentially if they had the license.
> > > >
> > > > One slight issue with this might be for people who wanted to extend
> the
> > > > database. For example, organisations may want to add geo-enrichment
> to
> > > > their own private network addresses based modified versions of the
> geo
> > > > database. Currently we don’t really allow this, since we hard-code
> > > ignoring
> > > > private network classes into the geo enrichment adapter, but I can
> see
> > a
> > > > case where a global org might want to add their own ranges and
> > locations
> > > to
> > > > the data set. Does that make sense to anyone else?
> > > >
> > > > Simon
> > > >
> > > >
> > > > > On 16 Jan 2017, at 16:50, JJ Meyer  wrote:
> > > > >
> > > > > Hello all,
> > > > >
> > > > > Can we leverage maxmind's Java client (
> > > > > https://github.com/maxmind/GeoIP2-java/tree/master/src/
> > > > main/java/com/maxmind/geoip2)
> > > > > in this case? I believe it can directly read maxmind file. Plus I
> > think
> > > > it
> > > > > also has some support for caching as well.
> > > > >
> > > > > Thanks,
> > > > > JJ
> > > > >
> > > > > On Mon, Jan 16, 2017 at 10:32 AM, Simon Elliston Ball <
> > > > > si...@simonellistonball.com> wrote:
> > > > >
> > > > >> I like the idea of MapDB, since we can essentially pull an
> instance
> > > into
> > > > >> each supervisor, so it makes a lot of sense for relatively small
> > > scale,
> > > > >> relatively static enrichments in general.
> > > > >>
> > > > >> Generally this feels like a caching problem, and would be for a
> > simple
> > > > >> key-value lookup. In that case I would agree with David Lyle on
> > using
> > > > HBase
> > > > >> as a source or truth and relying on caching.
> > > > >>
> > > > >> That said, GeoIP is a different lookup pattern, since it’s a range
> > > > lookup
> > > > >> then a key lookup (or if we denormalize the MaxMind data, just a
> > range
> > > > >> lookup) for that kind of thing MapDB with something like the BTree
> > > > seems a
> > > > >> good fit.
> > > > >>
> > > > >> Simon
> > > > >>
> > > > >>
> > > > >>> On 16 Jan 2017, at 16:28, David Lyle 
> wrote:
> > > > >>>
> > > > >>> I'm +1 on removing the MySQL dependency, BUT - I'd prefer to see
> it
> > > as
> > > > an
> > > > >>> HBase enrichment. If our 

Re: [DISCUSS] Turning off indexing writers feature discussion

2017-01-14 Thread Kyle Richardson
>>>>>>>> indexing
>>>>>>>> config only the writer-level configs.
>>>>>>>> 
>>>>>>>> My concerns about Nick's suggestion were that the default and
>>>> majority
>>>>>>>> case, specifying the index and the batchSize for all writers (th
>>>> eone
>>>>> we
>>>>>>>> support now) would require more configuration.
>>>>>>>> 
>>>>>>>> Nick's concerns about my suggestion were that it was overly
>>> complex
>>>>> and
>>>>>>>> hard to grok and that we could dispense with backwards
>>> compatibility
>>>>> and
>>>>>>>> make people do a bit more work on the default case for the
>>> benefits
>>>>> of a
>>>>>>>> simpler advanced case. (Nick, make sure I don't misstate your
>>>>> position).
>>>>>>>> 
>>>>>>>> Casey
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Fri, Jan 13, 2017 at 10:54 AM, David Lyle <
>>> dlyle65...@gmail.com>
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> Casey,
>>>>>>>>> 
>>>>>>>>> Can you give me a level set of what your thinking is now? I
>>> think
>>>>> it's
>>>>>>>>> global control of all index types + overrides on a per-type
>>> basis.
>>>>>> Fwiw,
>>>>>>>>> I'm totally for that, but I want to make sure I'm not imposing
>>> my
>>>>>>>>> pre-concieved notions on your consensus-driven ones.
>>>>>>>>> 
>>>>>>>>> -D
>>>>>>>>> 
>>>>>>>>> On Fri, Jan 13, 2017 at 10:44 AM, Casey Stella <
>>>> ceste...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>>> I am suggesting that, yes. The configs are essentially the
>>> same
>>>> as
>>>>>>>>> yours,
>>>>>>>>>> except there is an override specified at the top level.
>>> Without
>>>>>>>> that, in
>>>>>>>>>> order to specify both HDFS and ES have batch sizes of 100,
>> you
>>>>> have
>>>>>> to
>>>>>>>>>> explicitly configure each. It's less that I'm trying to have
>>>>>>>> backwards
>>>>>>>>>> compatibility and more that I'm trying to make the majority
>>> case
>>>>>> easy:
>>>>>>>>> both
>>>>>>>>>> writers write everything to a specified index name with a
>>>>> specified
>>>>>>>> batch
>>>>>>>>>> size (which is what we have now). Beyond that, I want to
>> allow
>>>> for
>>>>>>>>>> specifying an override for the config on a writer-by-writer
>>>> basis
>>>>>> for
>>>>>>>>> those
>>>>>>>>>> who need it.
>>>>>>>>>> 
>>>>>>>>>> On Fri, Jan 13, 2017 at 10:39 AM, Nick Allen <
>>>> n...@nickallen.org>
>>>>>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>>> Are you saying we support all of these variants? I realize
>>> you
>>>>> are
>>>>>>>>>> trying
>>>>>>>>>>> to have some backwards compatibility, but this also makes
>> it
>>>>>> harder
>>>>>>>>> for a
>>>>>>>>>>> user to grok (for me at least).
>>>>>>>>>>> 
>>>>>>>>>>> Personally I like my original example as there are fewer
>>>>>>>>> sub-structures,
>>>>>>>>>>> like 'writerConfig', which makes the whole thing simpler
>> and
>>>>>> easier
>>>>>>>> to
>>>>>>>>>>> grok. But maybe others will think your proposal is just as
>>>> easy
>>>>> to
>>>>>>>>&

Re: [DISCUSS] Turning off indexing writers feature discussion

2017-01-12 Thread Kyle Richardson
I'll second my preference for the first option. I think the ability to use
Stellar filters to customize indexing would be a big win.

I'm glad Matt brought up the point about data lake and CEP. I think this is
a really important use case that we need to consider. Take a simple
example... If I have data coming in from 3 different firewall vendors and 2
different web proxy/url filtering vendors and I want to be able to analyze
that data set, I need the data to be indexed all together (likely in HDFS)
and to have a normalized schema such that IP address, URL, and user name
(to take a few) can be easily queried and aggregated. I can also envision
scenarios where I would want to index data based on attributes other than
sensor, business unit or subsidiary for example.

I've been wanting to propose extending our 7 standard fields to include
things like URL and user. Is there community interest/support for moving in
that direction? If so, I'll start a new thread.

Thanks!

-Kyle

On Thu, Jan 12, 2017 at 6:51 PM, Matt Foley  wrote:

> Ah, I see.  If overriding the default index name allows using the same
> name for multiple sensors, then the goal can be achieved.
> Thanks,
> --Matt
>
>
> On 1/12/17, 3:30 PM, "Casey Stella"  wrote:
>
> Oh, you could!  Let's say you have a syslog parser with data from
> sources 1
> 2 and 3.  You'd end up with one kafka queue with 3 parsers attached to
> that
> queue, each picking part the messages from source 1, 2 and 3.  They'd
> go
> through separate enrichment and into the indexing topology.  In the
> indexing topology, you could specify the same index name "syslog" and
> all
> of the messages go into the same index for CEP querying if so desired.
>
> On Thu, Jan 12, 2017 at 6:27 PM, Matt Foley  wrote:
>
> > Syslog is hell on parsers – I know, I worked at LogLogic in a
> previous
> > life.  It makes perfect sense to route different lines from syslog
> through
> > different appropriate parsers.  But a lot of what the parsers do is
> > identify consistent subsets of metadata and annotate it – eg,
> src_ip_addr,
> > event timestamps, etc.  Once those metadata are annotated and
> available
> > with common field names, why doesn’t it make sense to index the
> messages
> > together, for CEP querying?  I think Splunk has illustrated this
> model.
> >
> > On 1/12/17, 3:00 PM, "Casey Stella"  wrote:
> >
> > yeah, I mean, honestly, I think the approach that we've taken for
> > sources
> > which aggregate different types of data is to provide filters at
> the
> > parser
> > level and have multiple parser topologies (with different,
> possibly
> > mutually exclusive filters) running.  This would be a completely
> > separate
> > sensor.  Imagine a syslog data source that aggregates and you
> want to
> > pick
> > apart certain pieces of messages.  This is why the initial
> thought and
> > architecture was one index per sensor.
> >
> > On Thu, Jan 12, 2017 at 5:55 PM, Matt Foley 
> wrote:
> >
> > > I’m thinking that CEP (Complex Event Processing) is contrary
> to the
> > idea
> > > of silo-ing data per sensor.
> > > Now it’s true that some of those sensors are already
> aggregating
> > data from
> > > multiple sources, so maybe I’m wrong here.
> > > But it just seems to me that the “data lake” insights come from
> > being able
> > > to make decisions over the whole mass of data rather than just
> > vertical
> > > slices of it.
> > >
> > > On 1/12/17, 2:15 PM, "Casey Stella" 
> wrote:
> > >
> > > Hey Matt,
> > >
> > > Thanks for the comment!
> > > 1. At the moment, we only have one index name, the default
> of
> > which is
> > > the
> > > sensor name but that's entirely up to the user.  This is
> sensor
> > > specific,
> > > so it'd be a separate config for each sensor.  If we want
> to
> > build
> > > multiple
> > > indices per sensor, we'd have to think carefully about how
> to do
> > that
> > > and
> > > would be a bigger undertaking.  I guess I can see the use,
> though
> > > (redirect
> > > messages to one index vs another based on a predicate for
> a given
> > > sensor).
> > > Anyway, not where I was originally thinking that this
> discussion
> > would
> > > go,
> > > but it's an interesting point.
> > >
> > > 2. I hadn't thought through the implementation quite yet,
> but we
> > don't
> > > actually have a splitter bolt in that topology, just a
> spout
> > that goes
> > > to
> > 

Re: [PROPOSAL] up-to-date versioned documentation

2017-01-12 Thread Kyle Richardson
Matt, thanks for pulling this together. I completely agree that we need to
go all in on either cwiki or the README.md's. I think the wiki is poorly
updated and can cause confusion for new users and devs. My preference is
certainly for the README.md's.

I like your approach but also agree that we shouldn't need to roll our own
here. I really like the Spark documentation that Mike pointed out. Any way
we can duplicate/adapt their approach?

-Kyle

On Thu, Jan 12, 2017 at 7:19 PM, Michael Miklavcic <
michael.miklav...@gmail.com> wrote:

> Casey, Matt - These guys are using doxia
> https://github.com/apache/falcon/tree/master/docs
>
> Honestly, I kind of like Spark's approach -
> https://github.com/apache/spark/tree/master/docs
>
> Mike
>
> On Thu, Jan 12, 2017 at 4:48 PM, Matt Foley  wrote:
>
> > I’m ambivalent; I think we’d end up tied to the doxia processing
> pipeline,
> > which is “yet another arcane toolset” to learn.  Using .md as the input
> > format decreases the dependency, but we’d still be dependent on it.
> >
> > I had anticipated that the web page would be a write-once thing that
> would
> > be only a couple days for an experienced Web developer. But I was going
> to
> > get an estimate from some co-workers before actually trying to get it
> > implemented. And the script is a few hours of work with find and awk.
> >
> > On the other hand, doxia is certainly an expectable solution.  Is setting
> > up that infrastructure less work than developing the web page?  Or is it
> > actually just a matter of a few lines in pom.xml?
> >
> >
> > On 1/12/17, 3:24 PM, "Casey Stella"  wrote:
> >
> > Just a followup thought that's a bit more constructive, maybe we
> could
> > migrate the README.md's into a site directory and use doxia markdown
> > (example here ) to
> > generate the site as part of the build to resolve 1 through 3?
> >
> > On Thu, Jan 12, 2017 at 6:19 PM, Casey Stella 
> > wrote:
> >
> > > So, I do think this would be better than what we currently do.  I
> > like a
> > > few things in particular:
> > >
> > >- I don't like the wiki one bit.
> > >- We have a LOT of documentation in the README.md's and it's
> > sometimes
> > >poorly organized
> > >- I like a documentation preprocessing pipeline to be present.
> > For
> > >instance, a major ask is all of the stellar functions in one
> > place.  That's
> > >solved by updating an index manually in the READMEs and keeping
> > it in sync
> > >with the annotation.  I'd like to make a stellar annotation ->
> > markdown
> > >generator as part of the build and this would be nice for such a
> > task.
> > >
> > > My only concern is that the html generation/viewer seems like a
> fair
> > > amount of engineering.  Are you sure there isn't something easier
> > that we
> > > could conform to?  I'm sure we aren't the only project in the world
> > that
> > > has this particular issue.  Is there something like a maven site
> > plugin or
> > > something?  Just a thought.  I'll come back with more :)
> > >
> > > Great ideas!  Keep them coming!
> > >
> > > Casey
> > >
> > > On Thu, Jan 12, 2017 at 6:05 PM, Matt Foley 
> > wrote:
> > >
> > >> We currently have three forms of documentation, with the following
> > >> advantages and disadvantages:
> > >>
> > >> || Docs || Pro || Con ||
> > >> | CWiki |
> > >>   Easy to edit, no special tools required, don't have to be a
> > >> developer to contribute, google and wiki search |
> > >> Not versioned, no review process, distant from the code, obsolete
> > content
> > >> tends to accumulate |
> > >> | Site |
> > >>   Versioned and reviewed, only committers can edit, google
> > search |
> > >>   Yet another arcane toolset must be learned, only web
> > programmers
> > >> feel comfortable contributing, "asf-site" branch not related to
> code
> > >> versions, distant from the code, tends to go obsolete due to
> > >> non-maintenance |
> > >> | README.md |
> > >>   Versioned and reviewed, only committers can edit, tied to
> code
> > >> versions, highly local to the code being documented |
> > >>   Non-developers don't know about them, may be scared by
> > github, poor
> > >> scoring in google search, no high-level presentation |
> > >>
> > >> Various discussion threads indicate the developer community likes
> > >> README-based docs, and it's easy to see why from the above.  I
> > propose this
> > >> extension to the README-based documentation, to address their
> > disadvantages:
> > >>
> > >> 1. Produce a script that gathers the README.md files from all code
> > >> subdirectories into a hierarchical list.  The script would have an
> > 

Re: Long-term storage for enriched data

2017-01-06 Thread Kyle Richardson
Yep. Exactly.

Ok, cool. I'll file a couple of JIRAs to get the ball rolling.

-Kyle

> On Jan 6, 2017, at 5:21 PM, zeo...@gmail.com <zeo...@gmail.com> wrote:
> 
> I think we can use the ES templates to start off, as Avro (HDFS) and ES
> should be in sync.  There already is some normalization in place for field
> names (ip_src_addr, etc.) so that enrichment can work across any log type.
> This is currently handled in the parsers.
> 
> I think what you're talking about here is simply expanding that to handle
> more fields, right?  I'm all for that - the more normalized the data is the
> more fun I can have with it :)
> 
> Jon
> 
> On Fri, Jan 6, 2017, 4:26 PM Kyle Richardson <kylerichards...@gmail.com>
> wrote:
> 
>> You're right. I don't think it needs to account for everything. As I
>> understand it, one of the big selling features for Avro is the schema
>> evolution.
>> 
>> Do you think we can take the ES templates as a starting point for
>> developing an Avro schema? I do still think we need some type of
>> normalization across the sensors for fields like URL, user name, and
>> disposition. This wouldn't be specific to Avro but would allow us to better
>> search across multiple sensor types in the UI too. Say, for example, if I
>> have two different proxy solutions.
>> 
>> -Kyle
>> 
>>> On Fri, Jan 6, 2017 at 2:28 PM, zeo...@gmail.com <zeo...@gmail.com> wrote:
>>> 
>>> Does it really need to account for all enrichments off the bat?  I'm not
>>> familiar with these options in practice but my research led me to believe
>>> that adding fields to the Avro schema is not a huge issue, changing or
>>> removing them is the true problem.  I have no proof to substantiate my
>>> claim however, just that I heard that question get asked, and I read
>>> responses from people familiar with Avro reply uniformly in that way.
>>> 
>>> My thoughts, based off of my assumption, is that we simply need to handle
>>> out of the box enrichments and document a required schema change in our
>>> guides to creating custom enrichments.
>>> 
>>> In ES we are currently doing one template per sensor which gives us that
>>> overlapping field name (per sensor) flexibility.
>>> 
>>> Jon
>>> 
>>> On Fri, Jan 6, 2017, 12:33 PM Kyle Richardson <kylerichards...@gmail.com
>>> 
>>> wrote:
>>> 
>>>> Thanks, Jon. Really interesting talk.
>>>> 
>>>> For the GitHub data set discussed (which probably most closely mimics
>>>> Metron data due to number of fields and overall diversity), Avro with
>>>> Snappy compression seemed like the best balance of storage size and
>>>> retrieval time. I did find it interesting that he said Parquet was
>>>> originally developed for log data sets but didn't perform as well on
>> the
>>>> GitHub data.
>>>> 
>>>> I think our challenge is going to be on the schema. Would we create a
>>>> schema per sensor type and try to account for all of the possible
>>>> enrichments? Problem there is that similar data may not be mapped to
>> the
>>>> same field names across sensors. We may need to think about expanding
>> our
>>>> base JSON schema beyond these 7 fields (
>>>> https://cwiki.apache.org/confluence/display/METRON/Metron+JSON+Object)
>>> to
>>>> account for normalizing things like URL, user name, and disposition
>> (e.g.
>>>> whether an action was allowed or denied).
>>>> 
>>>> Thoughts?
>>>> 
>>>> -Kyle
>>>> 
>>>> On Tue, Jan 3, 2017 at 11:30 AM, zeo...@gmail.com <zeo...@gmail.com>
>>>> wrote:
>>>> 
>>>>> For those interested, I ended up finding a recording of the talk
>> itself
>>>>> when doing some Avro research - https://www.youtube.com/watch?
>>>>> v=tB28rPTvRiI
>>>>> 
>>>>> Jon
>>>>> 
>>>>>> On Sun, Jan 1, 2017 at 8:41 PM Matt Foley <ma...@apache.org> wrote:
>>>>>> 
>>>>>> I’m not an expert on these things, but my understanding is that
>> Avro
>>>> and
>>>>>> ORC serve many of the same needs.  The biggest difference is that
>> ORC
>>>> is
>>>>>> columnar, and Avro isn’t.  Avro, ORC, and Parquet were compared in
>>>> detail
>>>>>> at last year’s Hadoop Summit; the slideshare prezo

Re: Long-term storage for enriched data

2017-01-06 Thread Kyle Richardson
You're right. I don't think it needs to account for everything. As I
understand it, one of the big selling features for Avro is the schema
evolution.

Do you think we can take the ES templates as a starting point for
developing an Avro schema? I do still think we need some type of
normalization across the sensors for fields like URL, user name, and
disposition. This wouldn't be specific to Avro but would allow us to better
search across multiple sensor types in the UI too. Say, for example, if I
have two different proxy solutions.

-Kyle

On Fri, Jan 6, 2017 at 2:28 PM, zeo...@gmail.com <zeo...@gmail.com> wrote:

> Does it really need to account for all enrichments off the bat?  I'm not
> familiar with these options in practice but my research led me to believe
> that adding fields to the Avro schema is not a huge issue, changing or
> removing them is the true problem.  I have no proof to substantiate my
> claim however, just that I heard that question get asked, and I read
> responses from people familiar with Avro reply uniformly in that way.
>
> My thoughts, based off of my assumption, is that we simply need to handle
> out of the box enrichments and document a required schema change in our
> guides to creating custom enrichments.
>
> In ES we are currently doing one template per sensor which gives us that
> overlapping field name (per sensor) flexibility.
>
> Jon
>
> On Fri, Jan 6, 2017, 12:33 PM Kyle Richardson <kylerichards...@gmail.com>
> wrote:
>
> > Thanks, Jon. Really interesting talk.
> >
> > For the GitHub data set discussed (which probably most closely mimics
> > Metron data due to number of fields and overall diversity), Avro with
> > Snappy compression seemed like the best balance of storage size and
> > retrieval time. I did find it interesting that he said Parquet was
> > originally developed for log data sets but didn't perform as well on the
> > GitHub data.
> >
> > I think our challenge is going to be on the schema. Would we create a
> > schema per sensor type and try to account for all of the possible
> > enrichments? Problem there is that similar data may not be mapped to the
> > same field names across sensors. We may need to think about expanding our
> > base JSON schema beyond these 7 fields (
> > https://cwiki.apache.org/confluence/display/METRON/Metron+JSON+Object)
> to
> > account for normalizing things like URL, user name, and disposition (e.g.
> > whether an action was allowed or denied).
> >
> > Thoughts?
> >
> > -Kyle
> >
> > On Tue, Jan 3, 2017 at 11:30 AM, zeo...@gmail.com <zeo...@gmail.com>
> > wrote:
> >
> > > For those interested, I ended up finding a recording of the talk itself
> > > when doing some Avro research - https://www.youtube.com/watch?
> > > v=tB28rPTvRiI
> > >
> > > Jon
> > >
> > > On Sun, Jan 1, 2017 at 8:41 PM Matt Foley <ma...@apache.org> wrote:
> > >
> > > > I’m not an expert on these things, but my understanding is that Avro
> > and
> > > > ORC serve many of the same needs.  The biggest difference is that ORC
> > is
> > > > columnar, and Avro isn’t.  Avro, ORC, and Parquet were compared in
> > detail
> > > > at last year’s Hadoop Summit; the slideshare prezo is here:
> > > > http://www.slideshare.net/HadoopSummit/file-format-
> > > benchmark-avro-json-orc-parquet
> > > >
> > > > It’s conclusion: “For complex tables with common strings, Avro with
> > > Snappy
> > > > is a good fit.  For other tables [or when applications “just need a
> few
> > > > columns” of the tables], ORC with Zlib is a good fit.”  (The addition
> > in
> > > > square brackets incorporates a quote from another part of the prezo.)
> > > But
> > > > do look at the prezo please, it gives detailed benchmarks showing
> when
> > > each
> > > > one is better.
> > > >
> > > > --Matt
> > > >
> > > > On 1/1/17, 5:18 AM, "zeo...@gmail.com" <zeo...@gmail.com> wrote:
> > > >
> > > > I don't recall a conversation on that product specifically, but
> > I've
> > > > definitely brought up the need to search HDFS from time to time.
> > > > Things
> > > > like Spark SQL, Hive, Oozie have been discussed, but Avro is new
> to
> > > me
> > > > I'll
> > > > have to look into it.  Are you able to summarize it's benefits?
> > > >
> > > > Jon
> > > >
> > > > On Wed, De

Re: Long-term storage for enriched data

2017-01-06 Thread Kyle Richardson
Thanks, Jon. Really interesting talk.

For the GitHub data set discussed (which probably most closely mimics
Metron data due to number of fields and overall diversity), Avro with
Snappy compression seemed like the best balance of storage size and
retrieval time. I did find it interesting that he said Parquet was
originally developed for log data sets but didn't perform as well on the
GitHub data.

I think our challenge is going to be on the schema. Would we create a
schema per sensor type and try to account for all of the possible
enrichments? Problem there is that similar data may not be mapped to the
same field names across sensors. We may need to think about expanding our
base JSON schema beyond these 7 fields (
https://cwiki.apache.org/confluence/display/METRON/Metron+JSON+Object) to
account for normalizing things like URL, user name, and disposition (e.g.
whether an action was allowed or denied).
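
To make that concrete, here is a minimal sketch, using Avro's Java
SchemaBuilder DSL, of what an expanded, normalized record might look like.
The field names are illustrative only (the ip_* names are the ones the
parsers already normalize to; url, username, and disposition are the
additions floated in this thread), not a proposal for the actual schema:

```java
import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;

public class EnrichedMessageSchemaSketch {

  // Hypothetical record layout: a few of the base Metron fields plus the
  // candidate normalized fields (url, username, disposition) discussed here.
  public static Schema build() {
    return SchemaBuilder.record("EnrichedMessage")
        .namespace("org.apache.metron.example") // namespace is made up
        .fields()
        .requiredString("original_string")
        .requiredLong("timestamp")
        .requiredString("source_type") // dots aren't legal in Avro names, so e.g. "source.type" would need renaming
        .optionalString("ip_src_addr")
        .optionalString("ip_dst_addr")
        .optionalInt("ip_src_port")
        .optionalInt("ip_dst_port")
        .optionalString("protocol")
        .optionalString("url")
        .optionalString("username")
        .optionalString("disposition")
        .endRecord();
  }
}
```

Because the added fields are optional (a union with null and a null default),
appending more of them later shouldn't break consumers of older files, which
is a big part of Avro's appeal for this use case.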

Thoughts?

-Kyle

On Tue, Jan 3, 2017 at 11:30 AM, zeo...@gmail.com <zeo...@gmail.com> wrote:

> For those interested, I ended up finding a recording of the talk itself
> when doing some Avro research - https://www.youtube.com/watch?
> v=tB28rPTvRiI
>
> Jon
>
> On Sun, Jan 1, 2017 at 8:41 PM Matt Foley <ma...@apache.org> wrote:
>
> > I’m not an expert on these things, but my understanding is that Avro and
> > ORC serve many of the same needs.  The biggest difference is that ORC is
> > columnar, and Avro isn’t.  Avro, ORC, and Parquet were compared in detail
> > at last year’s Hadoop Summit; the slideshare prezo is here:
> > http://www.slideshare.net/HadoopSummit/file-format-
> benchmark-avro-json-orc-parquet
> >
> > It’s conclusion: “For complex tables with common strings, Avro with
> Snappy
> > is a good fit.  For other tables [or when applications “just need a few
> > columns” of the tables], ORC with Zlib is a good fit.”  (The addition in
> > square brackets incorporates a quote from another part of the prezo.)
> But
> > do look at the prezo please, it gives detailed benchmarks showing when
> each
> > one is better.
> >
> > --Matt
> >
> > On 1/1/17, 5:18 AM, "zeo...@gmail.com" <zeo...@gmail.com> wrote:
> >
> > I don't recall a conversation on that product specifically, but I've
> > definitely brought up the need to search HDFS from time to time.
> > Things
> > like Spark SQL, Hive, Oozie have been discussed, but Avro is new to
> me
> > I'll
> > have to look into it.  Are you able to summarize it's benefits?
> >
> > Jon
> >
> > On Wed, Dec 28, 2016, 14:45 Kyle Richardson <
> kylerichards...@gmail.com
> > >
> > wrote:
> >
> > > This thread got me thinking... there are likely a fair number of
> use
> > cases
> > > for searching and analyzing the output stored in HDFS. Dima's use
> > case is
> > > certainly one. Has there been any discussion on the use of Avro to
> > store
> > > the output in HDFS? This would likely require an expansion of the
> > current
> > > json schema.
> > >
> > > -Kyle
> > >
> > > On Thu, Dec 22, 2016 at 5:53 PM, Casey Stella <ceste...@gmail.com>
> > wrote:
> > >
> > > > Oozie (or something like it) would appear to me to be the correct
> > tool
> > > > here.  You are likely moving files around and pinning up hive
> > tables:
> > > >
> > > >- Moving the data written in HDFS from
> > /apps/metron/enrichment/${
> > > > sensor}
> > > >to another directory in HDFS
> > > >- Running a job in Hive or pig or spark to take the JSON
> blobs,
> > map
> > > them
> > > >to rows and pin it up as an ORC table for downstream analytics
> > > >
> > > > NiFi is mostly about getting data in the cluster, not really for
> > > scheduling
> > > > large-scale batch ETL, I think.
> > > >
> > > > Casey
> > > >
> > > > On Thu, Dec 22, 2016 at 5:18 PM, Dima Kovalyov <
> > dima.koval...@sstech.us>
> > > > wrote:
> > > >
> > > > > Thank you for reply Carolyn,
> > > > >
> > > > > Currently for the test purposes we enrich flow with Geo and
> > ThreatIntel
> > > > > malware IP, but plan to expand this further.
> > > > >
> > > > > Our dev team is working on Oozie job to process this. So
> > meanwhile

Re: METRON-648 GrokWebSphereParserTest and BasicAsaParserTest are not 2017-safe

2017-01-04 Thread Kyle Richardson
+1 Why didn't I think of that? :) Thanks, Justin.

-Kyle

On Wed, Jan 4, 2017 at 10:26 AM, Nick Allen <n...@nickallen.org> wrote:

> +1 We can't merge anything else until we address this.  Thanks for
> volunteering, Justin.
>
> On Wed, Jan 4, 2017 at 10:24 AM, Michael Miklavcic <
> michael.miklav...@gmail.com> wrote:
>
> > +1
> >
> > On Wed, Jan 4, 2017 at 8:04 AM, Justin Leet <justinjl...@gmail.com>
> wrote:
> >
> > > In the short term, we could just generate the timestamp appropriately
> > with
> > > the current year in the test for the test and spin off another JIRA for
> > > actually addressing the question of what we do with this data (Keep in
> > mind
> > > we can eventually have replay use cases, so assuming the past year
> might
> > > not be totally sufficient either.)
> > >
> > > At that point it'll at least be year agnostic, but probably not the
> > actual
> > > output we want. Normally, I'd rather it be handled correctly, but given
> > > that our builds fail, I'd rather have something less broken until we
> get
> > a
> > > more correct solution.
> > >
> > > I can take care of doing that today.  Any objections to that solution?
> > >
> > > Justin
> > >
> > > On Wed, Jan 4, 2017 at 9:34 AM, Kyle Richardson <
> > kylerichards...@gmail.com
> > > >
> > > wrote:
> > >
> > > > Unfortunately, it's not going to be quite as simple as just adding
> the
> > > year
> > > > into the test strings, at least for GrokWebSphereParserTest. (For
> > > > BasicAsaParserTest, updating the test string worked just fine.)
> > > >
> > > > It turns out that that grok pattern being used only expects the month
> > and
> > > > day in the timestamp of the syslog messages. I'm happy to a take a
> stab
> > > and
> > > > making it year safe by reusing some of the code from the
> > BasicAsaParser;
> > > > however, I have limited time today and it will likely take me until
> > > Friday
> > > > to get a PR submitted given the new scope of changes required.
> > > >
> > > > -Kyle
> > > >
> > > > On Wed, Jan 4, 2017 at 12:50 AM, Matt Foley <ma...@apache.org>
> wrote:
> > > >
> > > > > Yes, this is an endemic problem with log processing.  And I agree
> > > adding
> > > > > the year to the testString is the best fix for our short-term
> > problem.
> > > > >
> > > > > For future consideration, we should consider if there should be an
> > > > > assumption/preference in the parser that the logs are in the
> “past”.
> > > > > Granted, if the timezone is also unspecified, there is still a 24
> hr
> > > > period
> > > > > of uncertainty, but it does seem that on Jan 3 2017 the preferred
> > > > > interpretation of “Apr 15” would be Apr 15 2016, not 2017.
> > > > >
> > > > > Cheers,
> > > > > --Matt
> > > > >
> > > > > On 1/3/17, 5:14 PM, "Michael Miklavcic" <
> michael.miklav...@gmail.com
> > >
> > > > > wrote:
> > > > >
> > > > > I also introduced a Clock object and testing mechanism back in
> > > > > METRON-235 -
> > > > > https://github.com/apache/incubator-metron/pull/156
> > > > > Sample test utilizing the Clock object here -
> > > > > https://github.com/apache/incubator-metron/blob/master/
> > > > > metron-platform/metron-pcap-backend/src/test/java/org/
> > > > > apache/metron/pcap/query/PcapCliTest.java
> > > > >
> > > > > That being said, it's probably better to use the new java.time
> > > fixed
> > > > > clock
> > > > > implementation in all places, as referenced by Matt. I'm agreed
> > > with
> > > > > everyone on a quick fix for the build and a follow-on PR to
> > > introduce
> > > > > appropriate dep injection for testing.
> > > > >
> > > > > AFA string dates with no year, we had something similar show up
> > in
> > > > the
> > > > > Snort parser. There ended up being a configuration option in
> > Snort
> > > to
> > > > > enable a year to be printed, but we may want to offer
> > alternat

Re: METRON-648 GrokWebSphereParserTest and BasicAsaParserTest are not 2017-safe

2017-01-04 Thread Kyle Richardson
Unfortunately, it's not going to be quite as simple as just adding the year
into the test strings, at least for GrokWebSphereParserTest. (For
BasicAsaParserTest, updating the test string worked just fine.)

It turns out that the grok pattern being used only expects the month and
day in the timestamp of the syslog messages. I'm happy to take a stab at
making it year-safe by reusing some of the code from the BasicAsaParser;
however, I have limited time today and it will likely take me until Friday
to get a PR submitted given the new scope of changes required.
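
For what it's worth, a rough Java sketch of the general idea (an injectable
java.time.Clock plus the prefer-the-past rule Matt suggests below); the class
and method names are made up for illustration and this is not the actual
BasicAsaParser code:

```java
import java.time.Clock;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

public class YearlessTimestampResolver {

  private final Clock clock; // injectable so tests can pin "now" to a fixed instant

  public YearlessTimestampResolver(Clock clock) {
    this.clock = clock;
  }

  // Resolves a year-less syslog timestamp such as "Apr 15 13:45:05" against
  // the supplied clock, preferring the interpretation that lies in the past.
  public long resolve(String monthDayTime) {
    DateTimeFormatter fmt =
        DateTimeFormatter.ofPattern("MMM dd HH:mm:ss yyyy", Locale.ENGLISH);
    LocalDateTime now = LocalDateTime.now(clock);
    LocalDateTime candidate =
        LocalDateTime.parse(monthDayTime + " " + now.getYear(), fmt);
    if (candidate.isAfter(now)) {
      candidate = candidate.minusYears(1); // "Apr 15" seen in January means last April
    }
    return candidate.toInstant(ZoneOffset.UTC).toEpochMilli();
  }
}
```

A test could then construct the resolver with
Clock.fixed(Instant.parse("2016-01-05T00:00:00Z"), ZoneOffset.UTC) so its
assertions never depend on the wall-clock year.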

-Kyle

On Wed, Jan 4, 2017 at 12:50 AM, Matt Foley <ma...@apache.org> wrote:

> Yes, this is an endemic problem with log processing.  And I agree adding
> the year to the testString is the best fix for our short-term problem.
>
> For future consideration, we should consider if there should be an
> assumption/preference in the parser that the logs are in the “past”.
> Granted, if the timezone is also unspecified, there is still a 24 hr period
> of uncertainty, but it does seem that on Jan 3 2017 the preferred
> interpretation of “Apr 15” would be Apr 15 2016, not 2017.
>
> Cheers,
> --Matt
>
> On 1/3/17, 5:14 PM, "Michael Miklavcic" <michael.miklav...@gmail.com>
> wrote:
>
> I also introduced a Clock object and testing mechanism back in
> METRON-235 -
> https://github.com/apache/incubator-metron/pull/156
> Sample test utilizing the Clock object here -
> https://github.com/apache/incubator-metron/blob/master/
> metron-platform/metron-pcap-backend/src/test/java/org/
> apache/metron/pcap/query/PcapCliTest.java
>
> That being said, it's probably better to use the new java.time fixed
> clock
> implementation in all places, as referenced by Matt. I'm agreed with
> everyone on a quick fix for the build and a follow-on PR to introduce
> appropriate dep injection for testing.
>
> AFA string dates with no year, we had something similar show up in the
> Snort parser. There ended up being a configuration option in Snort to
> enable a year to be printed, but we may want to offer alternatives for
> other parsers. Regardless of how we approach this it gets messy when
> you
> start thinking about potentially different src/dest timezones across a
> new
> year boundary in addition to data replay. I would urge our main goal
> here
> to be idempotency.
>
> Best,
> Mike
>
> On Tue, Jan 3, 2017 at 5:05 PM, Kyle Richardson <
> kylerichards...@gmail.com>
> wrote:
>
> > Agreed. I prefer the quick win to get us back to successful builds.
> >
> > I do think it's worth a general discussion around how we want to
> handle
> > the parsing of string dates with no year. In the long run, Matt's
> > suggestion of incorporating the Clock object is probably the route
> to go;
> > albeit as a separate enhancement PR.
> >
> > I'll start a new discuss thread for that and submit a PR for the
> quick fix.
> >
> > -Kyle
> >
> > > On Jan 3, 2017, at 5:20 PM, David Lyle <dlyle65...@gmail.com>
> wrote:
> > >
> > > I'm not sure I'm an owner, but I have an opinion. :)
> > >
> > > I'd just add "2016". Easy and targeted.
> > >
> > > -D...
> > >
> > >
> > >> On Tue, Jan 3, 2017 at 5:08 PM, Matt Foley <ma...@apache.org>
> wrote:
> > >>
> > >> I’ll subordinate this to METRON-647 since it was evidently filed
> while I
> > >> was writing METRON-648 (I did check before!)
> > >>
> > >> The question below remains valid, however…
> > >>
> > >>
> > >> On 1/3/17, 1:59 PM, "Matt Foley" <ma...@apache.org> wrote:
> > >>
> > >>Hi all,
> > >>As described in https://issues.apache.org/
> jira/browse/METRON-648 ,
> > >> these two test modules are not year-safe, and are suddenly (as of
> 2017)
> > >> giving false Travis errors.
> > >>
> > >>I can fix it quickly, but a question for the “owners” of
> GrokParser:
> > >> Do you have an opinion as to whether the fix should be done by
> adding
> > >> "2016" to the testString values in the GrokWebSphereParserTest
> test
> > module
> > >> (easy, and only affects the test module), vs making GrokParser
> use a
> > Clock
> > >> object set to 2016 (more involved, and affecting core code, but
> allowing
> > >> for more interesting testing)?
> > >>
> > >>For those interested, BasicAsaParserTest::testShortTimestamp()
> > >> illustrates the use of Clock object in the Asa Parser and its test
> > module.
> > >>
> > >>Thanks,
> > >>--Matt
> > >>
> > >>
> > >>
> > >>
> > >>
> > >>
> > >>
> >
>
>
>
>


Re: Tests failing due to new year

2017-01-03 Thread Kyle Richardson
Created METRON-647 for tracking.

-Kyle

On Tue, Jan 3, 2017 at 3:49 PM, Kyle Richardson <kylerichards...@gmail.com>
wrote:

> ** This is causing all new PRs to fail Travis CI **
>
> The rollover to the new year is causing unit test failures for some of our
> parser classes. It looks like the issue in the same in all cases... We have
> hard coded a timestamp assertion but the original message does not contain
> the year and is now parsing as 2017 instead of 2016.
>
> I'm currently investigating the failure for BasicAsaParserTest.
> testIp6Addr:151.
>
> Other failures from the Travis CI log I'm looking at are:
> GrokWebSphereParserTest.testParseLoginLine:60
> GrokWebSphereParserTest.testParseMalformedLoginLine:151
> GrokWebSphereParserTest.tetsParseLogoutLine:84
> GrokWebSphereParserTest.tetsParseMalformedLogoutLine:175
> GrokWebSphereParserTest.tetsParseMalformedOtherLine:220
> GrokWebSphereParserTest.tetsParseMalformedRBMLine:198
> GrokWebSphereParserTest.tetsParseOtherLine:129
> GrokWebSphereParserTest.tetsParseRBMLine:107
>
> -Kyle
>
>


Tests failing due to new year

2017-01-03 Thread Kyle Richardson
** This is causing all new PRs to fail Travis CI **

The rollover to the new year is causing unit test failures for some of our
parser classes. It looks like the issue is the same in all cases... We have
hard-coded a timestamp assertion, but the original message does not contain
the year and is now parsing as 2017 instead of 2016.
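
In the meantime, one way to make an assertion like that year-agnostic is to
compute the expected value from the current year rather than hard-coding it.
A rough sketch (the helper name and usage are illustrative, not the actual
test code):

```java
import java.time.Year;
import java.time.ZonedDateTime;

public class YearAgnosticExpectations {

  // Builds the expected epoch millis for a month/day/time in the *current*
  // year, so a test assertion no longer breaks when the calendar rolls over.
  // Example: expectedMillis("-01-05T14:52:35Z") -> epoch millis for Jan 5 of
  // this year, in UTC.
  public static long expectedMillis(String monthDayTimeWithOffset) {
    return ZonedDateTime.parse(Year.now().getValue() + monthDayTimeWithOffset)
        .toInstant()
        .toEpochMilli();
  }
}
```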

I'm currently investigating the failure for
BasicAsaParserTest.testIp6Addr:151.

Other failures from the Travis CI log I'm looking at are:
GrokWebSphereParserTest.testParseLoginLine:60
GrokWebSphereParserTest.testParseMalformedLoginLine:151
GrokWebSphereParserTest.tetsParseLogoutLine:84
GrokWebSphereParserTest.tetsParseMalformedLogoutLine:175
GrokWebSphereParserTest.tetsParseMalformedOtherLine:220
GrokWebSphereParserTest.tetsParseMalformedRBMLine:198
GrokWebSphereParserTest.tetsParseOtherLine:129
GrokWebSphereParserTest.tetsParseRBMLine:107

-Kyle


Re: Long-term storage for enriched data

2016-12-28 Thread Kyle Richardson
This thread got me thinking... there are likely a fair number of use cases
for searching and analyzing the output stored in HDFS. Dima's use case is
certainly one. Has there been any discussion on the use of Avro to store
the output in HDFS? This would likely require an expansion of the current
JSON schema.

-Kyle

On Thu, Dec 22, 2016 at 5:53 PM, Casey Stella  wrote:

> Oozie (or something like it) would appear to me to be the correct tool
> here.  You are likely moving files around and pinning up hive tables:
>
>- Moving the data written in HDFS from /apps/metron/enrichment/${
> sensor}
>to another directory in HDFS
>- Running a job in Hive or pig or spark to take the JSON blobs, map them
>to rows and pin it up as an ORC table for downstream analytics
>
> NiFi is mostly about getting data in the cluster, not really for scheduling
> large-scale batch ETL, I think.
>
> Casey
>
> On Thu, Dec 22, 2016 at 5:18 PM, Dima Kovalyov 
> wrote:
>
> > Thank you for reply Carolyn,
> >
> > Currently for the test purposes we enrich flow with Geo and ThreatIntel
> > malware IP, but plan to expand this further.
> >
> > Our dev team is working on Oozie job to process this. So meanwhile I
> > wonder if I could use NiFi for this purpose (because we already using it
> > for data ingest and stream).
> >
> > Could you elaborate why it may be overkill? The idea is to have
> > everything in one place instead of hacking into Metron libraries and
> code.
> >
> > - Dima
> >
> > On 12/22/2016 02:26 AM, Carolyn Duby wrote:
> > > Hi Dima -
> > >
> > > What type of analytics are you looking to do?  Is the normalized format
> > not working?  You could use an oozie or spark job to create derivative
> > tables.
> > >
> > > Nifi may be overkill for breaking up the kafka stream.  Spark streaming
> > may be easier.
> > >
> > > Thanks
> > > Carolyn
> > >
> > >
> > >
> > > Sent from my Verizon, Samsung Galaxy smartphone
> > >
> > >
> > >  Original message 
> > > From: Dima Kovalyov 
> > > Date: 12/21/16 6:28 PM (GMT-05:00)
> > > To: dev@metron.incubator.apache.org
> > > Subject: Long-term storage for enriched data
> > >
> > > Hello,
> > >
> > > Currently we are researching fast and resources efficient way to save
> > > enriched data in Hive for further Analytics.
> > >
> > > There are two scenarios that we consider:
> > > a) Use Ozzie Java job that uses Metron enrichment classes to "manually"
> > > enrich each line of the source data that is picked up from the source
> > > dir (the one that we have developed already and using). That is
> > > something that we developed on our own. Downside: custom code that
> built
> > > on top of Metron source code.
> > >
> > > b) Use NiFi to listen for indexing Kafka topic -> split stream by
> source
> > > type -> Put every source type in corresponding Hive table.
> > >
> > > I wonder, if someone was going any of this direction and if there are
> > > best practices for this? Please advise.
> > > Thank you.
> > >
> > > - Dima
> > >
> > >
> >
> >
>


Re: [DISCUSS] Release Process

2016-12-18 Thread Kyle Richardson
I think this thread got commingled with the discussion on Coding
Guidelines. The wiki page on the Release Process is at
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=66854770.

Overall, a really informative document. Thanks for pulling this together.
Two questions:

1) I'm a little confused about how the feature release and maintenance
release branches are going to work. Is the idea that all PRs will be merged
into master and then also be committed to an FR++ or an MR++ branch (or maybe
even both)?

2) Are these steps to be taken by a release manager only or is the
intention that other committers or PMC members rotate through this
responsibility? Just curious. I actually kind of like the idea of shuffling
the duty every now and then to avoid burnout by one person.

-Kyle




On Fri, Dec 16, 2016 at 1:31 PM, James Sirota  wrote:

> fixed the link and made one addition that a qualified reviewer is a
> committer or PPMC member
>
> 16.12.2016, 11:07, "zeo...@gmail.com" :
> > Right, I agree. That change looks good to me.
> >
> > Looks like the Log4j levels links is broken too.
> >
> > For a broken travis - how about "If somehow the tests get into a failing
> > state on master (such as by a backwards incompatible release of a
> > dependency) only pull requests intended to rectify master may be merged,
> > and the removal or disabling of any tests must be +1'd by two reviewers."
> >
> > Also, reading through this, should there should be a delineation between
> a
> > "reviewer" and somebody who has the ability to vote/+1 a PR? Unless I'm
> > missing something, right now it looks open to anybody.
> >
> > Jon
> >
> > On Fri, Dec 16, 2016 at 12:48 PM Nick Allen  wrote:
> >
> > Personally, I don't think it matters who merges the pull request. As long
> > as you meet the requirements for code review, then anyone should be able
> to
> > merge it. In fact, I'd rather have the person who knows most about the
> > change actually merge it into master to ensure that it goes smoothly.
> >
> > On Fri, Dec 16, 2016 at 12:15 PM, James Sirota 
> wrote:
> >
> >>  Jon, for #2 I changed it to: A committer may merge their own pull
> request,
> >>  but only after a second reviewer has given it a +1.
> >>
> >>  16.12.2016, 10:07, "zeo...@gmail.com" :
> >>  > I made some minor changes to the doc - check out the history
> >>  >  viewpreviousversions.action?
> >>  pageId=61332235>
> >>  > if you have any concerns.
> >>  >
> >>  > Regarding the larger doc -
> >>  > 1. Not everybody can assign JIRAs to themselves. I recall I had to
> >>  request
> >>  > this access, so that should probably be mentioned.
> >>  > 2. "A committer may never merge their own pull request, a second
> party
> >>  must
> >>  > merge their changes after it has be properly reviewed."
> >>  > - Is this still true/accurate? I heard both ways.
> >>  > 3. "If somehow the tests get into a failing state on master (such as
> by
> >
> > a
> >>  > backwards incompatible release of a dependency) no pull requests may
> be
> >>  > merged until this is rectified."
> >>  > - Maybe this should get reassessed using the
> >>  >  most
> >>  >  recent
> >>  >  build
> >>  >  failures
> >>  >  as a valuable
> case
> >>  > study.
> >>  >
> >>  > Jon
> >>  >
> >>  > On Fri, Dec 16, 2016 at 11:38 AM James Sirota 
> >>  wrote:
> >>  >
> >>  >> I threw together a draft document for our release process. Would you
> >>  want
> >>  >> to add/change/delete anything?
> >>  >>
> >>  >> ---
> >>  >> Thank you,
> >>  >>
> >>  >> James Sirota
> >>  >> PPMC- Apache Metron (Incubating)
> >>  >> jsirota AT apache DOT org
> >>  > --
> >>  >
> >>  > Jon
> >>  >
> >>  > Sent from my mobile device
> >>
> >>  ---
> >>  Thank you,
> >>
> >>  James Sirota
> >>  PPMC- Apache Metron (Incubating)
> >>  jsirota AT apache DOT org
> >
> > --
> > Nick Allen 
> >
> > --
> >
> > Jon
> >
> > Sent from my mobile device
>
> ---
> Thank you,
>
> James Sirota
> PPMC- Apache Metron (Incubating)
> jsirota AT apache DOT org
>


[DISCUSS] Coding Guidelines

2016-12-18 Thread Kyle Richardson
Couple of questions/comments:

In 2.4, we talk about Javadoc and code comments but not too much about the
user documentation. Should we, possibly in a section 4, give some
recommendations on what should go into the README files versus on the wiki?
This could also help the reviewer know if the change is documented
sufficiently.

In 2.6, we say that 1 qualified reviewer (Apache committer or PPMC member)
other than the author of the PR must have given it a +1. In the case where
the author is not a committer (who could merge their own PR), should we
state that the reviewer will be responsible for the merge?

-Kyle

On Fri, Dec 16, 2016 at 6:39 PM, James Sirota  wrote:

> Lets move this back to the discuss thread since it's still generating that
> many comments.  Please post all your feedback and I will incorporate it and
> put it back to a vote.
>
> Thanks,
> James
>
> 16.12.2016, 16:12, "Matt Foley" :
> > +1
> >
> > In 2.2 (follow Sun guidelines), do you want to add the notation “except
> that indents are 2 spaces instead of 4”, as Hadoop does? Or does the Metron
> community like 4-space indents? I see both in the Metron code.
> >
> > My +1 holds in either case.
> > --Matt
> >
> > On 12/16/16, 9:34 AM, "James Sirota"  wrote:
> >
> > I incorporated the changes to the coding guidelines from our discuss
> thread. I'd like to get them voted on to make them official.
> >
> > https://cwiki.apache.org/confluence/pages/viewpage.
> action?pageId=61332235
> >
> > Please vote +1, -1, 0
> >
> > The vote will be open for 72 hours.
> >
> > ---
> > Thank you,
> >
> > James Sirota
> > PPMC- Apache Metron (Incubating)
> > jsirota AT apache DOT org
>
> ---
> Thank you,
>
> James Sirota
> PPMC- Apache Metron (Incubating)
> jsirota AT apache DOT org
>


Re: [DISCUSS] Metron IRC channel

2016-12-18 Thread Kyle Richardson
I'll second the JIRA and Git integrations. I also like the meeting minutes
feature. Maybe it's something we could think about for the future.

-Kyle

On Fri, Dec 16, 2016 at 3:57 PM, Casey Stella  wrote:

> I'll leave this open til monday and update the INFRA jira with the results.
>
> On Fri, Dec 16, 2016 at 3:46 PM, zeo...@gmail.com 
> wrote:
>
> > What they said ^.
> >
> > On Fri, Dec 16, 2016 at 3:39 PM Michael Miklavcic <
> > michael.miklav...@gmail.com> wrote:
> >
> >> Jira search
> >>
> >> On Fri, Dec 16, 2016 at 11:50 AM, Otto Fowler 
> >> wrote:
> >>
> >> > Start with jira and git?
> >> >
> >> >
> >> >
> >> > On December 16, 2016 at 13:17:06, Casey Stella (ceste...@gmail.com)
> >> wrote:
> >> >
> >> > Hi all,
> >> >
> >> > Any ideas of what features we would like to add? The options are at:
> >> > http://wilderness.apache.org/manual.html
> >> >
> >> > On Wed, Nov 16, 2016 at 3:14 PM, Casey Stella 
> >> wrote:
> >> >
> >> > > Done
> >> > >
> >> > > https://issues.apache.org/jira/browse/INFRA-12931
> >> > >
> >> > >
> >> > > On Wed, Nov 16, 2016 at 12:34 PM, Yohann Lepage  >
> >> > > wrote:
> >> > >
> >> > >> Could an official member of the project fill an issue on ASF INFRA
> as
> >> > >> described on https://reference.apache.org/pmc/github to get the
> bot
> >> on
> >> > >> #apache-metron?
> >> > >>
> >> > >>
> >> > >> 2016-11-04 18:18 GMT+01:00 James Sirota :
> >> > >>
> >> > >> > We tried using slack during the early days of the project and it
> >> was
> >> > >> > frowned upon by Apache. So we abandoned it in favor of IRC and
> >> message
> >> > >> > lists.
> >> > >> >
> >> > >> > 04.11.2016, 04:54, "zeo...@gmail.com" :
> >> > >> > > Is anybody interested in migrating this to slack? I'm
> personally
> >> a
> >> > >> fan of
> >> > >> > > the benefits this provides - just wanted to bring it up and see
> >> if
> >> > >> anyone
> >> > >> > > else was thinking the same thing. If not, no biggie.
> >> > >> > >
> >> > >> > > Jon
> >> > >> > >
> >> > >> > > On Thu, Sep 29, 2016 at 1:52 PM zeo...@gmail.com <
> >> zeo...@gmail.com>
> >> > >> > wrote:
> >> > >> > >
> >> > >> > >> +1 #apache-metron
> >> > >> > >>
> >> > >> > >> On Thu, Sep 29, 2016 at 1:45 PM David Lyle <
> >> dlyle65...@gmail.com>
> >> > >> > wrote:
> >> > >> > >>
> >> > >> > >> ditto.
> >> > >> > >>
> >> > >> > >> On Thu, Sep 29, 2016 at 1:29 PM, Casey Stella <
> >> ceste...@gmail.com>
> >> > >> > wrote:
> >> > >> > >>
> >> > >> > >> > I'd agree; let's focus on #apache-metron
> >> > >> > >> >
> >> > >> > >> > On Thu, Sep 29, 2016 at 11:55 AM, James Sirota <
> >> > >> jsir...@apache.org>
> >> > >> > >> wrote:
> >> > >> > >> >
> >> > >> > >> > > I would just keep #apache-metron and open it up to general
> >> > >> public
> >> > >> > >> > >
> >> > >> > >> > > 29.09.2016, 08:54, "Yohann Lepage" :
> >> > >> > >> > > > Hi everyone,
> >> > >> > >> > > >
> >> > >> > >> > > > There are currently two IRC channels on FreeNode for
> >> Metron:
> >> > >> > >> > > > - #apache-metron
> >> > >> > >> > > > - #apache-metron-dev
> >> > >> > >> > > >
> >> > >> > >> > > > One channel is maybe enough as we are less than 5 users.
> >> > >> > >> > > >
> >> > >> > >> > > > What do you think? Which one to keep ?
> >> > >> > >> > > >
> >> > >> > >> > > > Related issue: https://issues.apache.org/jira
> >> > >> /browse/METRON-337
> >> > >> > -
> >> > >> > >> > > > Invite ASFBot to #apache-metron-dev IRC channel
> >> > >> > >> > > > --
> >> > >> > >> > > > Yohann L.
> >> > >> > >> > >
> >> > >> > >> > > ---
> >> > >> > >> > > Thank you,
> >> > >> > >> > >
> >> > >> > >> > > James Sirota
> >> > >> > >> > > PPMC- Apache Metron (Incubating)
> >> > >> > >> > > jsirota AT apache DOT org
> >> > >> > >> > >
> >> > >> > >> >
> >> > >> > >>
> >> > >> > >> --
> >> > >> > >>
> >> > >> > >> Jon
> >> > >> > > --
> >> > >> > >
> >> > >> > > Jon
> >> > >> >
> >> > >> > ---
> >> > >> > Thank you,
> >> > >> >
> >> > >> > James Sirota
> >> > >> > PPMC- Apache Metron (Incubating)
> >> > >> > jsirota AT apache DOT org
> >> > >> >
> >> > >>
> >> > >>
> >> > >>
> >> > >> --
> >> > >> Yohann L.
> >> > >>
> >> > >
> >> > >
> >> >
> >>
> > --
> >
> > Jon
> >
> > Sent from my mobile device
> >
>


Re: [VOTE] Modify Bylaws

2016-12-18 Thread Kyle Richardson
+1 (binding)

-Kyle

On Sat, Dec 17, 2016 at 10:40 AM, Nick Allen  wrote:

> Oops.  My vote is binding.  +1
>
> On Fri, Dec 16, 2016 at 7:35 PM, Casey Stella  wrote:
>
> > +1 binding
> > On Fri, Dec 16, 2016 at 18:20 Matt Foley  wrote:
> >
> > > Um, should have stated “non-binding”, on both recents.
> > >
> > > On 12/16/16, 3:17 PM, "Matt Foley"  wrote:
> > >
> > > +1
> > >
> > > On 12/16/16, 10:30 AM, "Nick Allen"  wrote:
> > >
> > > I am reading the aggregate effect of these changes as a veto
> only
> > > exists
> > > for a code commit.  For all other votes, there is no such thing
> > as
> > > a veto.
> > >
> > > +1
> > >
> > > On Fri, Dec 16, 2016 at 1:13 PM, James Sirota <
> > jsir...@apache.org>
> > > wrote:
> > >
> > > > Sorry, cut and paste error. Of course the original text
> > > currently says the
> > > > following:
> > > >
> > > > -1 – This is a negative vote. On issues where consensus is
> > > required, this
> > > > vote counts as a veto. All vetoes must contain an explanation
> > of
> > > why the
> > > > veto is appropriate. Vetoes with no explanation are void. It
> > may
> > > also be
> > > > appropriate for a -1 vote to include an alternative course of
> > > action.
> > > >
> > > > 16.12.2016, 10:54, "Nick Allen" :
> > > > > I don't see any changes in your "Change 1". Am I missing
> it?
> > > What
> > > > changed?
> > > > >
> > > > > On Fri, Dec 16, 2016 at 12:01 PM, James Sirota <
> > > jsir...@apache.org>
> > > > wrote:
> > > > >
> > > > >>  Based on the discuss thread I propose the following
> > changes:
> > > > >>
> > > > >>  Change 1 - Replace:
> > > > >>
> > > > >>  -1 – This is a negative vote. On issues where consensus
> is
> > > required,
> > > > this
> > > > >>  vote counts as a veto. Vetoes are only valid for code
> > > commits and must
> > > > >>  include a technical explanation of why the veto is
> > > appropriate. Vetoes
> > > > with
> > > > >>  no or non-technical explanation are void. On issues
> where a
> > > majority is
> > > > >>  required, -1 is simply a vote against. In either case, it
> > > may also be
> > > > >>  appropriate for a -1 vote to include a proposed
> alternative
> > > course of
> > > > >>  action.
> > > > >>
> > > > >>  With
> > > > >>
> > > > >>  -1 – This is a negative vote. On issues where consensus
> is
> > > required,
> > > > this
> > > > >>  vote counts as a veto. Vetoes are only valid for code
> > > commits and must
> > > > >>  include a technical explanation of why the veto is
> > > appropriate. Vetoes
> > > > with
> > > > >>  no or non-technical explanation are void. On issues
> where a
> > > majority is
> > > > >>  required, -1 is simply a vote against. In either case, it
> > > may also be
> > > > >>  appropriate for a -1 vote to include a proposed
> alternative
> > > course of
> > > > >>  action.
> > > > >>
> > > > >>  Change 2 - Replace:
> > > > >>
> > > > >>  A valid, binding veto cannot be overruled. If a veto is
> > > cast, it must
> > > > be
> > > > >>  accompanied by a valid reason explaining the reasons for
> > the
> > > veto. The
> > > > >>  validity of a veto, if challenged, can be confirmed by
> > > anyone who has a
> > > > >>  binding vote. This does not necessarily signify agreement
> > > with the
> > > > veto -
> > > > >>  merely that the veto is valid. If you disagree with a
> valid
> > > veto, you
> > > > must
> > > > >>  lobby the person casting the veto to withdraw their veto.
> > If
> > > a veto is
> > > > not
> > > > >>  withdrawn, any action that has already been taken must be
> > > reversed in a
> > > > >>  timely manner.
> > > > >>
> > > > >>  With:
> > > > >>
> > > > >>  A valid, binding veto regarding a code commit cannot be
> > > overruled. If a
> > > > >>  veto is cast, it must be accompanied by a valid technical
> > > explanation
> > > > >>  giving the reasons for the veto. The technical validity
> of
> > a
> > > veto, if
> > > > >>  challenged, can be confirmed by anyone who has a binding
> > > vote. This
> > > > does
> > > > >>  not necessarily signify agreement with the veto - merely
> > > that the veto
> > > > is
> > > > >>  valid. If you disagree with a valid veto, you must lobby
> > the
> > > person
> > > > casting
> > > > >>  the veto to withdraw their veto. If a veto is not
> > withdrawn,
> > > any action
> > >

Re: Process for closing JIRAs

2016-12-05 Thread Kyle Richardson
Awesome. Thanks, guys!

-Kyle

On Mon, Dec 5, 2016 at 3:59 PM, Casey Stella <ceste...@gmail.com> wrote:

> Yeah, when committed, close it and make it Next + 1.
>
> On Mon, Dec 5, 2016 at 3:58 PM, Nick Allen <n...@nickallen.org> wrote:
>
> > I saw just today that someone added a "Next + 1" version to the JIRA.  I
> > started using that assuming that it was created to solve that problem.
> > Open to correction though.
> >
> > On Mon, Dec 5, 2016 at 3:47 PM, Michael Miklavcic <
> > michael.miklav...@gmail.com> wrote:
> >
> > > To the first question, mark the Jira as DONE. I don't believe we can do
> > fix
> > > version until the community agrees on the next version increment, which
> > we
> > > normally won't know until we're getting close to cutting a release and
> > can
> > > make an assessment of it being major or minor.
> > >
> > > On Mon, Dec 5, 2016 at 11:48 AM, Kyle Richardson <
> > > kylerichards...@gmail.com>
> > > wrote:
> > >
> > > > What's our current process for closing JIRAs once the github PR has
> > been
> > > > merged? Are we setting it to Done with a particular fix version?
> > > >
> > > > If a JIRA is a duplicate, are we marking it any particular way?
> > > >
> > > > Thanks,
> > > > Kyle
> > > >
> > >
> >
> >
> >
> > --
> > Nick Allen <n...@nickallen.org>
> >
>


Process for closing JIRAs

2016-12-05 Thread Kyle Richardson
What's our current process for closing JIRAs once the github PR has been
merged? Are we setting it to Done with a particular fix version?

If a JIRA is a duplicate, are we marking it any particular way?

Thanks,
Kyle


Re: new committer: Kyle Richardson

2016-11-23 Thread Kyle Richardson
Awesome. Thanks, guys. I'll have a PR out this weekend.

-Kyle

On Wed, Nov 23, 2016 at 11:43 AM, Nick Allen <n...@nickallen.org> wrote:

> Otto - Just merged your PR.  Sorry, I did not get notified that you had
> submitted that PR.
>
> On Wed, Nov 23, 2016 at 11:37 AM, Otto Fowler <ottobackwa...@gmail.com>
> wrote:
>
> > If you use Nick’s excellent metron-commit-stuff and use a different
> github
> > user from your apache user, check out the PR I have there, when it
> prepares
> > the commit, it sets the git user info to your apache info.
> >
> >
> > On November 23, 2016 at 11:31:29, Nick Allen (n...@nickallen.org) wrote:
> >
> > Yes, adding yourself as a committer to the web page would be a good first
> > step. Use Otto's PR [1] as a guide.
> >
> > We also have an unofficial tool set [2] for commits. Not sure how many
> are
> > using it, but I personally use it all the time.
> >
> > [1] https://github.com/apache/incubator-metron/pull/330
> > [2] https://github.com/nickwallen/metron-commit-stuff
> >
> >
> >
> > On Wed, Nov 23, 2016 at 11:16 AM, Kyle Richardson <
> > kylerichards...@gmail.com
> > > wrote:
> >
> > > What's the best way to verify my commit access? Updating the Metron
> site
> > > with my name? Are there any other pages I need to update?
> > >
> > > Thanks and Happy Thanksgiving!
> > >
> > > -Kyle
> > >
> > > On Fri, Nov 11, 2016 at 1:11 PM, James Sirota <jsir...@apache.org>
> > wrote:
> > >
> > > > The Podling Project Management Committee (PPMC) for Apache Metron
> > > > (Incubating)
> > > > has asked Kyle Richardson to become a committer and we are pleased
> > > > to announce that they have accepted.
> > > >
> > > >
> > > > Being a committer enables easier contribution to the
> > > > project since there is no need to go via the patch
> > > > submission process. This should enable better productivity.
> > > > Being a PMC member enables assistance with the management
> > > > and to guide the direction of the project.
> > > >
> > > > ---
> > > > Thank you,
> > > >
> > > > James Sirota
> > > > PPMC- Apache Metron (Incubating)
> > > > jsirota AT apache DOT org
> > > >
> > >
> >
> >
> >
> > --
> > Nick Allen <n...@nickallen.org>
> >
> >
>
>
> --
> Nick Allen <n...@nickallen.org>
>


Re: new committer: Kyle Richardson

2016-11-23 Thread Kyle Richardson
What's the best way to verify my commit access? Updating the Metron site
with my name? Are there any other pages I need to update?

Thanks and Happy Thanksgiving!

-Kyle

On Fri, Nov 11, 2016 at 1:11 PM, James Sirota <jsir...@apache.org> wrote:

> The Podling Project Management Committee (PPMC) for Apache Metron
> (Incubating)
> has asked Kyle Richardson to become a committer and we are pleased
> to announce that they have accepted.
>
>
> Being a committer enables easier contribution to the
> project since there is no need to go via the patch
> submission process. This should enable better productivity.
> Being a PMC member enables assistance with the management
> and to guide the direction of the project.
>
> ---
> Thank you,
>
> James Sirota
> PPMC- Apache Metron (Incubating)
> jsirota AT apache DOT org
>


Re: [DISCUSS] Next Release Name

2016-11-09 Thread Kyle Richardson
Makes sense to me. +1

-Kyle

> On Nov 9, 2016, at 5:50 PM, Nick Allen <n...@nickallen.org> wrote:
> 
> Me likey.  +1
> 
>> On Wed, Nov 9, 2016 at 5:15 PM, James Sirota <jsir...@apache.org> wrote:
>> 
>> Guys,
>> 
>> You know, looking at the release I think the changes were significant
>> enough due to the storm & kafka upgrade to justify moving it to a non-point
>> release.  Generally point releases are reserved for patches or maintenance
>> releases.  I think this release is more than just a maintenance release.  I
>> suggest we consider 0.3.0
>> 
>> 04.11.2016, 18:27, "Kyle Richardson" <kylerichards...@gmail.com>:
>>> I'm a little late to the party but thought I would go ahead and throw my
>>> two cents into the mix.
>>> 
>>> I share the concern around an upgrade / migration path. While I would
>> love
>>> to see the BETA dropped sooner than later, to me, this is a game changer
>>> for people implementing Metron. I think there is a silent expectation of
>> no
>>> data loss after dropping the BETA tag.
>>> 
>>> Even if there is not a direct upgrade path for a few releases, is there
>>> documentation that we could provide to ensure a data migration path for
>>> users? I'm not thinking anything automated just some instructions on what
>>> to do.
>>> 
>>> -Kyle
>>> 
>>>> On Fri, Nov 4, 2016 at 9:16 AM, Casey Stella <ceste...@gmail.com> wrote:
>>>> 
>>>> Jon,
>>>> 
>>>> Thank you for your thoughts; they are appreciated and you should keep
>> them
>>>> coming. This kind of discussion is exactly why I sent out this thread.
>> I
>>>> think it's safe to say that the entire community shares your desire for
>>>> Metron to be as easy to use as possible and a "data analysis platform
>> for
>>>> the masses." We should hold ourselves to a high standard, no doubt.
>>>> 
>>>> Casey
>>>> 
>>>> On Fri, Nov 4, 2016 at 6:30 AM, zeo...@gmail.com <zeo...@gmail.com>
>> wrote:
>>>> 
>>>>> Please understand that my points mostly relate to perception and
>> ease of
>>>>> use, not what's technically possible or available. I'm coming at
>> this as
>>>>> Metron should be a data analysis platform for the masses.
>>>>> 
>>>>> METRON-517/542 - While I'm willing to let this one go it depends on
>> your
>>>>> definition of non-issue. I personally believe that data (in every
>>>> location
>>>>> that it exists) needs to be obvious and have ultra high integrity.
>> I'm
>>>> not
>>>>> concerned that the correct data won't exist somewhere in the
>> cluster, I'm
>>>>> focusing on it being easily accessible by an operations team that may
>>>>> consist of entry level analysts. Once 517 is done and merged I would
>>>>> consider that a short term mitigation is in place.
>>>>> 
>>>>> I feel like the project should stick to certain principles and a
>>>> suggestion
>>>>> is that data access is easy, accurate, and obvious. Do we have
>> anything
>>>>> like this that was agreed upon, discussed, or documented? Probably a
>>>>> discussion for a different thread.
>>>>> 
>>>>> METRON-485/470/etc. were mostly to illustrate a consistency issue
>> that
>>>> and
>>>>> resolving them would give a better first impression (assuming that
>> people
>>>>> monitoring the project will start using it more once it's non-BETA
>>>>> software). First impressions are big on my book and could affect
>> initial
>>>>> adoption.
>>>>> 
>>>>> Regarding 485 - Otto may be able to clarify but I thought somebody
>> else
>>>> saw
>>>>> this issue as well. I think the finger is currently being pointed at
>>>> monit
>>>>> timeouts and not storm. It also doesn't happen every single time, I
>> only
>>>>> run into it while the cluster is under load and after dozens of
>> topology
>>>>> restarts that I do when tuning parallelism in storm. I'm going to be
>>>>> updating to storm 1.0.x in order to see if this still exists. Again,
>>>> this
>>>>> relates to ease of use/load testing/tuning.
>>>>> 
>>>>> 

Re: [ANNOUNCE] Metron Apache Community Demo Recording Nov4,2016

2016-11-08 Thread Kyle Richardson
Great use case. Really pulls a lot of the pieces together for me. Thanks
for sharing.

-Kyle

On Fri, Nov 4, 2016 at 8:03 PM, James Sirota  wrote:

> The recording is available at:
> https://youtu.be/vOMZcudmlYg
>
> The meeting was a demonstration of the upcoming build.  No architectural
> decisions about the platform were made at the meeting.  The features that
> were demoed were:
>
> Advanced use cases of using a profiler and statistical functions to triage
> alerts
>
> Thanks,
> James
>


Re: [DISCUSS] Next Release Name

2016-11-05 Thread Kyle Richardson
Thanks, James. Very helpful information. Based on that, I agree the path is
there and I have no issues with it being manual at this point. I would
suggest we add a simple UPGRADING.md outlining the steps you have with a
little more detail to make it easy for the user. I'd be happy to take this
on if folks agree it would be useful.
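
To make that concrete, here is a rough outline of what such an UPGRADING.md
could cover, based purely on the steps James describes below (the file name
and exact wording are just a suggestion):

```
# Upgrading Metron 0.2.1 to 0.2.2

1. Upgrade the underlying HDP stack by following the Hortonworks
   command-line upgrade guide.
2. Back up the existing deployment:
   * Metron configs held in ZooKeeper
   * Elasticsearch index templates
   * Grok statements stored in HDFS
   * NiFi flows
3. Install Metron 0.2.2 via the Ambari management pack.
4. Restore the saved configs to ZooKeeper, copy the ES templates and Grok
   files back into place, and restart the NiFi flows.
```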

-Kyle

On Sat, Nov 5, 2016 at 7:56 AM, Casey Stella <ceste...@gmail.com> wrote:

> I agree. I think the upgrade path is clear however manual right now. Going
> forward we will need to prioritize making it more automated, but I think
> the path is there.
>
> On Sat, Nov 5, 2016 at 00:26 James Sirota <jsir...@apache.org> wrote:
>
> > Hi Kyle,
> >
> > The HDP upgrade guide can be found here:
> >
> > https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.0/
> bk_command-line-upgrade/content/ch_upgrade_2_4.html
> >
> > After executing these instructions you get to HDP 2.5 with no data loss.
> > After that, upgrading Metron is as simple as saving the old configs, ES
> > templates, grok statements from HDFS, and NiFi flows from your 0.2.1
> build,
> > installing 0.2.2 (via Ambari management pack), and putting the configs
> back
> > into zookeeper, copying the ES templates and Grok files back, and
> > restarting your NiFi flows.  I agree that we should automate most of this
> > eventually, and we will, but I don't think this is necessarily a show
> > stopper for dropping BETA.  Would you agree?
> >
> > Thanks,
> > James
> >
> > 04.11.2016, 18:27, "Kyle Richardson" <kylerichards...@gmail.com>:
> > > I'm a little late to the party but thought I would go ahead and throw
> my
> > > two cents into the mix.
> > >
> > > I share the concern around an upgrade / migration path. While I would
> > love
> > > to see the BETA dropped sooner than later, to me, this is a game
> changer
> > > for people implementing Metron. I think there is a silent expectation
> of
> > no
> > > data loss after dropping the BETA tag.
> > >
> > > Even if there is not a direct upgrade path for a few releases, is there
> > > documentation that we could provide to ensure a data migration path for
> > > users? I'm not thinking anything automated just some instructions on
> what
> > > to do.
> > >
> > > -Kyle
> > >
> > > On Fri, Nov 4, 2016 at 9:16 AM, Casey Stella <ceste...@gmail.com>
> wrote:
> > >
> > >>  Jon,
> > >>
> > >>  Thank you for your thoughts; they are appreciated and you should keep
> > them
> > >>  coming. This kind of discussion is exactly why I sent out this
> thread.
> > I
> > >>  think it's safe to say that the entire community shares your desire
> for
> > >>  Metron to be as easy to use as possible and a "data analysis platform
> > for
> > >>  the masses." We should hold ourselves to a high standard, no doubt.
> > >>
> > >>  Casey
> > >>
> > >>  On Fri, Nov 4, 2016 at 6:30 AM, zeo...@gmail.com <zeo...@gmail.com>
> > wrote:
> > >>
> > >>  > Please understand that my points mostly relate to perception and
> > ease of
> > >>  > use, not what's technically possible or available. I'm coming at
> > this as
> > >>  > Metron should be a data analysis platform for the masses.
> > >>  >
> > >>  > METRON-517/542 - While I'm willing to let this one go it depends on
> > your
> > >>  > definition of non-issue. I personally believe that data (in every
> > >>  location
> > >>  > that it exists) needs to be obvious and have ultra high integrity.
> > I'm
> > >>  not
> > >>  > concerned that the correct data won't exist somewhere in the
> > cluster, I'm
> > >>  > focusing on it being easily accessible by an operations team that
> may
> > >>  > consist of entry level analysts. Once 517 is done and merged I
> would
> > >>  > consider that a short term mitigation is in place.
> > >>  >
> > >>  > I feel like the project should stick to certain principles and a
> > >>  suggestion
> > >>  > is that data access is easy, accurate, and obvious. Do we have
> > anything
> > >>  > like this that was agreed upon, discussed, or documented? Probably
> a
> > >>  > discussion for a different thread.
> > >>  >
> > >>  > METRON-485/470/etc. were mostly to illus

Re: HDFS Compression

2016-11-04 Thread Kyle Richardson
Possibly naive question... Has there been past discussion on the use of
avro for the data in HDFS?

-Kyle

On Tue, Oct 11, 2016 at 4:30 PM, Matt Foley  wrote:

> Some of the things that are desirable to do with stored data (including
> those mentioned by others below):
> - Use it to train ML models
>   - This implies that the format of records stored in HDFS and the format of
> records streamed to a “Threat Intel” topology should be readily
> transformable into each other via simple filters – preferably very simple.
> - Reprocess as time series data
> - Aggregation, Summarization
> - Graphs, Pivot charts
> - Ad-hoc queries via Hive and Spark, about almost any aspect of the data
> - Investigation / discovery with Zeppelin, Tableau, or similar tools
> - CEP analysis (not necessarily all in ES)
> - Future integration with other data in a Data Lake
>
> --Matt
>
>
> On 10/11/16, 10:20 AM, "Otto Fowler"  wrote:
>
> And also support the extensibility offered by STELLAR and enrichments,
> such
> that adding new fields using either will not mean having to write
> supporting java code etc.
>
> Or from a higher level : The flexibility for configuration based
> enrichment
> and modification of the data through ingest should not be lost for
> storage
> requirements.
>
> On October 11, 2016 at 13:13:43, Carolyn Duby (cd...@hortonworks.com)
> wrote:
>
> The format should be compatible/optimal with spark and Zeppelin.
> Perhaps
> other interactive BI tools like Tableau.
>
> Thanks
> Carolyn
>
>
>
>
> On 10/11/16, 1:06 PM, "Nick Allen"  wrote:
>
> >Right. The original idea is to do batch analytics. Kind of difficult
> to
> >work with data sitting in an ES index. But if we get a better
> understanding
> >of the type of batch analytics, it might get us closer to the target.
> >
> >On Tue, Oct 11, 2016 at 1:03 PM, zeo...@gmail.com 
> wrote:
> >
> >> I'm somewhat ignorant here, never having used the MaaS stuff yet,
> but
> isn't
> >> that the dataset that the models would run against? I understand
> there
> >> could be additional use cases, I just wanted to be clear.
> >>
> >> Jon
> >>
> >> On Tue, Oct 11, 2016 at 1:01 PM Nick Allen 
> wrote:
> >>
> >> > I don't think we put much thought into how exactly the data
> should be
> >> > landed in HDFS and for what use cases. It just has not been a
> priority.
> >> >
> >> > That being said, this might be a good time to gather everyone's
> thoughts
> >> on
> >> > how they would use that kind of data and for what purposes.
> >> >
> >> >
> >> >
> >> > On Tue, Oct 11, 2016 at 12:11 PM, Owen O'Malley <
> omal...@apache.org>
> >> > wrote:
> >> >
> >> > > Be careful of using compressed JSON, since it isn't splittable.
> JSON
> is
> >> > > also very slow for reading.
> >> > >
> >> > > .. Owen
> >> > >
> >> > > On Tue, Oct 11, 2016 at 4:31 AM, Casey Stella <
> ceste...@gmail.com>
> >> > wrote:
> >> > >
> >> > > > I'd also tack on to this that the configuration for the hdfs
> writer
> >> > > should
> >> > > > be moved to zookeeper rather than done in flux, IMO
> >> > > > On Tue, Oct 11, 2016 at 07:20 Otto Fowler <
> ottobackwa...@gmail.com>
>
> >> > > wrote:
> >> > > >
> >> > > > > The storage format and retrieval from that format should be
> >> > > configurable,
> >> > > > > that is a ‘boundary’ for Metron so to speak.
> >> > > > >
> >> > > > > On October 10, 2016 at 16:15:12, zeo...@gmail.com (
> >> zeo...@gmail.com)
> >> > > > > wrote:
> >> > > > >
> >> > > > > Is there a specific reason why the JSON files stored in
> HDFS are
> >> not
> >> > > > > compressed? I looked for some related JIRAs and mail
> conversations
> >> > but
> >> > > > > couldn't find this already mentioned. I'm wondering if
> there was
> a
> >> > good
> >> > > > > enough of an argument to keep things uncompressed, or if the
> >> subject
> >> > > just
> >> > > > > hadn't been broached yet.
> >> > > > >
> >> > > > > Jon
> >> > > > > --
> >> > > > >
> >> > > > > Jon
> >> > > > >
> >> > > >
> >> > >
> >> >
> >> >
> >> >
> >> > --
> >> > Nick Allen 
> >> >
> >> --
> >>
> >> Jon
> >>
> >
> >
> >
> >--
> >Nick Allen 
>
>
>
>
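
To make the splittability point above concrete, a minimal Spark sketch of the kind of batch job these use cases imply is shown below: read the JSON that the HDFS writer lands and rewrite it into a splittable, columnar format such as Parquet. The paths are placeholder assumptions, not fixed Metron locations, and this is only an illustration of the trade-off, not a format decision for the project.

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class MetronJsonToParquet {
  public static void main(String[] args) {
    // Placeholder paths -- point these at wherever your HDFS writer actually lands data.
    String jsonPath = args.length > 0 ? args[0] : "hdfs:///path/to/metron/indexed/bro";
    String parquetPath = args.length > 1 ? args[1] : "hdfs:///path/to/metron/parquet/bro";

    SparkSession spark = SparkSession.builder()
        .appName("metron-json-to-parquet")
        .getOrCreate();

    // Plain JSON files can be read in parallel; a single gzipped JSON file cannot be
    // split across tasks, which is the concern Owen raises above.
    Dataset<Row> events = spark.read().json(jsonPath);

    // Columnar formats like Parquet (or ORC) are splittable, compressed, and far
    // faster to scan from Hive, Spark, or Zeppelin than raw JSON.
    events.write().mode("overwrite").parquet(parquetPath);

    spark.stop();
  }
}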


Re: [DISCUSS] Next Release Name

2016-11-04 Thread Kyle Richardson
I'm a little late to the party but thought I would go ahead and throw my
two cents into the mix.

I share the concern around an upgrade / migration path. While I would love
to see the BETA dropped sooner than later, to me, this is a game changer
for people implementing Metron. I think there is a silent expectation of no
data loss after dropping the BETA tag.

Even if there is not a direct upgrade path for a few releases, is there
documentation that we could provide to ensure a data migration path for
users? I'm not thinking anything automated just some instructions on what
to do.

-Kyle

On Fri, Nov 4, 2016 at 9:16 AM, Casey Stella  wrote:

> Jon,
>
> Thank you for your thoughts; they are appreciated and you should keep them
> coming.  This kind of discussion is exactly why I sent out this thread.  I
> think it's safe to say that the entire community shares your desire for
> Metron to be as easy to use as possible and a "data analysis platform for
> the masses."  We should hold ourselves to a high standard, no doubt.
>
> Casey
>
> On Fri, Nov 4, 2016 at 6:30 AM, zeo...@gmail.com  wrote:
>
> > Please understand that my points mostly relate to perception and ease of
> > use, not what's technically possible or available.  I'm coming at this as
> > Metron should be a data analysis platform for the masses.
> >
> > METRON-517/542 - While I'm willing to let this one go it depends on your
> > definition of non-issue.  I personally believe that data (in every
> location
> > that it exists) needs to be obvious and have ultra high integrity.  I'm
> not
> > concerned that the correct data won't exist somewhere in the cluster, I'm
> > focusing on it being easily accessible by an operations team that may
> > consist of entry level analysts.  Once 517 is done and merged I would
> > consider that a short term mitigation is in place.
> >
> > I feel like the project should stick to certain principles and a
> suggestion
> > is that data access is easy, accurate, and obvious. Do we have anything
> > like this that was agreed upon, discussed, or documented? Probably a
> > discussion for a different thread.
> >
> > METRON-485/470/etc. were mostly to illustrate a consistency issue and that
> > resolving them would give a better first impression (assuming that people
> > monitoring the project will start using it more once it's non-BETA
> > software).  First impressions are big on my book and could affect initial
> > adoption.
> >
> > Regarding 485 - Otto may be able to clarify but I thought somebody else
> saw
> > this issue as well.  I think the finger is currently being pointed at
> monit
> > timeouts and not storm.  It also doesn't happen every single time, I only
> > run into it while the cluster is under load and after dozens of topology
> > restarts that I do when tuning parallelism in storm.  I'm going to be
> > updating to storm 1.0.x in order to see if this still exists.  Again,
> this
> > relates to ease of use/load testing/tuning.
> >
> > Agree with the upgrade comments - as long as it's supported at some
> defined
> > point (IMHO this is when a project leaves BETA but others are welcome to
> > disagree).
> >
> > Finally, I know this doesn't come across well in email but I'm just
> > mentioning items which I think are important, not attempting to demand
> that
> > they be fixed or that this doesn't leave beta.  Thanks,
> >
> > Jon
> >
> > On Thu, Nov 3, 2016, 16:44 James Sirota  wrote:
> >
> >
> > Hi Jon,
> >
> > Here are my thoughts around your objections.
> >
> > METRON-517/METRON-542
> >
> > I think the mechanism currently exists within Metron to make this a
> > non-issue.  I believe you can solve it with a combination of a Stellar
> > statement and ES templates.  As you mentioned, we can truncate the string
> > and then include the relevant meta data in the message (original length,
> > hash, etc).  Cramming really long strings into ES is generally a bad
> thing,
> > which is why this limitation exists.   The metadata in the indexed
> message
> > along with the timestamp allows you to pull data from HDFS should you
> need
> > to recover the full string.
> >
> > METRON-485
> >
> > We cannot replicate this issue in our environment, but if this is indeed
> an
> > issue this is an issue with Storm.  A Jira should be filed against Storm
> > and not against Metron.  My hunch, though, is that it's probably
> something
> > in your environment.  I just tried stopping all topologies on my AWS
> > cluster and then went to all Storm nodes and didn't see any workers left
> > behind.
> >
> > METRON-470
> >
> > I think this is mainly a consistency issue.  I don't think this impacts
> the
> > stability or function of the software.  I think this is a nice to have,
> > maybe in the next few releases, but I don't think we absolutely have to
> > have this to drop BETA
> >
> > With respect to upgrades, here are my thoughts.  There is really no way
> to
> > upgrade Metron 
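
As an aside on the METRON-517/542 discussion quoted above, the truncate-plus-metadata idea is simple to sketch. The Java helper below is purely illustrative: the field names and the helper itself are assumptions for this sketch, not Metron APIs or Stellar functions, and in practice the same effect would be achieved with a Stellar statement plus an ES template as James describes.

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

public class TruncateWithMetadata {

  // Truncate an oversized value before indexing, keeping enough metadata
  // (original length and a hash) to locate the full value in HDFS later.
  static Map<String, Object> truncate(String field, String value, int maxLen)
      throws NoSuchAlgorithmException {
    Map<String, Object> out = new HashMap<>();
    out.put(field + "_length", value.length());
    out.put(field + "_sha256", sha256Hex(value));
    out.put(field, value.length() > maxLen ? value.substring(0, maxLen) : value);
    return out;
  }

  static String sha256Hex(String value) throws NoSuchAlgorithmException {
    byte[] digest = MessageDigest.getInstance("SHA-256")
        .digest(value.getBytes(StandardCharsets.UTF_8));
    StringBuilder hex = new StringBuilder();
    for (byte b : digest) {
      hex.append(String.format("%02x", b));
    }
    return hex.toString();
  }
}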

Re: [DISCUSS] Next Release Content

2016-11-02 Thread Kyle Richardson
I'd like to selfishly add METRON-363
 into consideration for
the next release. This is complete pending final review and merge.

-Kyle

On Wed, Nov 2, 2016 at 5:00 PM, Casey Stella  wrote:

> Hey Jon,
>
> Regarding the relationship between 463 and 460, agreed.  I adjusted 460 to
> correctly reflect that observation.
> I'll reserve further comment on the others until after I've reflected a bit
> more.  Regardless, thanks for the feedback; very valuable as usual.
>
> Casey
>
>
>
> On Wed, Nov 2, 2016 at 4:53 PM, zeo...@gmail.com  wrote:
>
> > *Proposing and justifying a JIRA from the list of unresolved JIRAs*
> > - Doesn't METRON-463 finish off METRON-460 as well?  460 doesn't appear
> to
> > be slated for the next release.
> > - I think METRON-447 should be in the next release, as it provides
> > continuity for upgrades.  I'd call it critical and I'm willing to do it -
> > I'm looking for feedback per the description.  This is an easy fix.
> > - I would like to see METRON-446 in the next release, but I'll call it
> nice
> > to have.  This causes an error if you follow the older (but "current")
> bare
> > metal install instructions
> >  > Metron+Installation+on+an+Ambari-Managed+Cluster>
> > using master (Step 5 #3).
> > - If ansible upgrades are supported/recommended I'd say that METRON-448
> is
> > critical.  If ansible upgrades aren't considered supported/recommended,
> I'd
> > downgrade to nice to have.
> >
> > *Other comments*
> > - METRON-276 had an interesting comment - "George Vetticaden added a
> > comment - 15/Jul/16 12:22 This needs to be prioritized higher and be
> > completed by the 0.2.2 release".
> > - I think there either needs to be guidance on how to avoid sending IPv6
> > bro traffic into Metron (METRON-348, METRON-293, METRON-285, and
> > METRON-286) or the ability to parse IPv6 traffic.  This could be as
> simple
> > as using the logs-to-kafka2.bro that I have in METRON-348 and updating
> some
> > comments/documentation.
> >
> > Jon
> >
> > On Wed, Nov 2, 2016 at 4:11 PM Casey Stella  wrote:
> >
> > > Hello Everyone,
> > >
> > > It's me, your friendly Metron Release Manager and it's time to start
> > > thinking about the next release.
> > >
> > > *JIRA Housekeeping*
> > >
> > > For those who get email alerts via JIRA on changes, it should be no
> > > surprise that I went through did some JIRA housekeeping in anticipation
> > of
> > > the next release:
> > >
> > >- Ensured that everything committed since the last release was
> closed
> > >and marked 0.2.2BETA
> > >- Ensured that everything in active work (with a PR that was active
> in
> > >the last month on github) had a release version of 0.2.2BETA and "In
> > >Progress"
> > >- With the exception of METRON-533
> > >, which is
> release
> > >housekeeping, everything with a release version of 0.2.2BETA is
> actual
> > > work
> > >that is in progress, rather than aspirational.
> > >- Went through the list of JIRAs that are not done and have no
> version
> > >associated with them and ensured that they weren't duplicates (to
> the
> > > best
> > >of my abilities).
> > >
> > > This may mean that I moved your favorite JIRA around or changed the
> > > release.  I did not do this because it was unimportant or I considered
> it
> > > unfit for the next release, but because I want to begin the exercise of
> > > choosing what makes the release with the community with an accurate
> > picture
> > > of the current state in JIRA.
> > >
> > > *What's made it so far into the next release*
> > >
> > >- METRON-410 mysql_servers MySQL install causes mutually
> assured
> > >destruction when installed on the same machine as the Ambari Hive
> > MySQL
> > >closes apache/incubator-metron#317
> > >- METRON-148 Compress logs with logrotate (ottobackwards) closes
> > >apache/incubator-metron#329
> > >- METRON-536 Fix apache id for Otto Fowler (ottobackwards) closes
> > >apache/incubator-metron#331
> > >- METRON-249: Field Transformation functions fail to handle invalid
> > user
> > >inputs closes apache/incubator-metron#333
> > >- METRON-521: Stellar function documentation needs grammar/clarity
> > fixes
> > >closes apache/incubator-metron#327
> > >- METRON-484 Opentaxi service does not show count for subscribed
> > >services   (nickwallen) closes apache/incubator-metron#306
> > >- METRON-495: Upgrade Storm to 1.0.x (justinleet via mmiklavc)
> closes
> > >apache/incubator-metron#318
> > >- METRON-506 Add Otto Fowler to commiters (ottobackwards) closes
> > >apache/incubator-metron#330
> > >- METRON-515: Stellar IS_EMPTY() function does not work as expected
> > >(merrimanr via mmiklavc) closes apache/incubator-metron#324
> > >- 

Re: Jira Rights

2016-11-02 Thread Kyle Richardson
Sorry, I'm not sure I understand. Below is a screenshot from my profile
page on https://issues.apache.org/jira. Is there another place I need to
sign up?

[image: Inline image 1]

-Kyle

On Wed, Nov 2, 2016 at 11:11 AM, James Sirota <jsir...@apache.org> wrote:

> Its not finding you.  Make sure you sign up for the jira
>
> 02.11.2016, 08:00, "Kyle Richardson" <kylerichards...@gmail.com>:
> > Thanks, James! My JIRA username is kylerichardson.
> >
> > -Kyle
> >
> > On Wed, Nov 2, 2016 at 10:28 AM, James Sirota <jsir...@apache.org>
> wrote:
> >
> >>  Hi Kyle, you can have contributor privileges. Please sign up for the
> Jira
> >>  and I'll add you. Does anyone else want that too?
> >>
> >>  31.10.2016, 18:19, "Kyle Richardson" <kylerichards...@gmail.com>:
> >>  > Any chance I could also get rights to assign myself issues in JIRA?
> If
> >>  it's
> >>  > for PMC and committers only I totally understand.
> >>  >
> >>  > -Kyle
> >>  >
> >>  > On Mon, Oct 31, 2016 at 8:08 PM, James Sirota <jsir...@apache.org>
> >>  wrote:
> >>  >
> >>  >> Please sign up for the Jira and I will assign you the rights
> >>  >>
> >>  >> 28.10.2016, 19:35, "Otto Fowler" <ottobackwa...@gmail.com>:
> >>  >> > How can I get rights in jira to assign issues to myself etc?
> >>  >> >
> >>  >> > Otto
> >>  >>
> >>  >> ---
> >>  >> Thank you,
> >>  >>
> >>  >> James Sirota
> >>  >> PPMC- Apache Metron (Incubating)
> >>  >> jsirota AT apache DOT org
> >>
> >>  ---
> >>  Thank you,
> >>
> >>  James Sirota
> >>  PPMC- Apache Metron (Incubating)
> >>  jsirota AT apache DOT org
>
> ---
> Thank you,
>
> James Sirota
> PPMC- Apache Metron (Incubating)
> jsirota AT apache DOT org
>


Re: Jira Rights

2016-11-02 Thread Kyle Richardson
Thanks, James! My JIRA username is kylerichardson.

-Kyle


On Wed, Nov 2, 2016 at 10:28 AM, James Sirota <jsir...@apache.org> wrote:

> Hi Kyle, you can have contributor privileges.  Please sign up for the Jira
> and I'll add you.  Does anyone else want that too?
>
> 31.10.2016, 18:19, "Kyle Richardson" <kylerichards...@gmail.com>:
> > Any chance I could also get rights to assign myself issues in JIRA? If
> it's
> > for PMC and committers only I totally understand.
> >
> > -Kyle
> >
> > On Mon, Oct 31, 2016 at 8:08 PM, James Sirota <jsir...@apache.org>
> wrote:
> >
> >>  Please sign up for the Jira and I will assign you the rights
> >>
> >>  28.10.2016, 19:35, "Otto Fowler" <ottobackwa...@gmail.com>:
> >>  > How can I get rights in jira to assign issues to myself etc?
> >>  >
> >>  > Otto
> >>
> >>  ---
> >>  Thank you,
> >>
> >>  James Sirota
> >>  PPMC- Apache Metron (Incubating)
> >>  jsirota AT apache DOT org
>
> ---
> Thank you,
>
> James Sirota
> PPMC- Apache Metron (Incubating)
> jsirota AT apache DOT org
>


Re: [GitHub] incubator-metron issue #276: METRON-363 Fix Cisco ASA Parser

2016-11-01 Thread Kyle Richardson
Thanks, Otto! You're a genius.

I'm at a loss for why this broke the integration tests. For me, it seems to
have broken all of the integration tests, which makes me think it broke some
piece of the underlying framework. The big change seems to have been with
the move to Storm 1.x but I can't say for sure it's related.

I had managed to get rid of all of the SLF4J multiple bindings prior to
rebasing so my guess is there was a change in some of the dependencies that
added these back in.

I've added the exclusion you highlighted as well as a couple of others to
get rid of the multiple bindings. I'm running through the unit and
integration tests now and, if successful, I'll push the fix to my PR and
see what Travis comes back with.

Thanks again for your help troubleshooting!

-Kyle

On Tue, Nov 1, 2016 at 4:52 PM, Otto Fowler  wrote:

> Sorry, same test.
>
> I was able to resolve the issue by adding an exclusion for slf4j in the
> metron-parsers pom:
>
> <dependency>
>     <groupId>org.apache.kafka</groupId>
>     <artifactId>kafka_2.10</artifactId>
>     <version>${global_kafka_version}</version>
>     <exclusions>
>         <exclusion>
>             <artifactId>slf4j-log4j12</artifactId>
>             <groupId>org.slf4j</groupId>
>         </exclusion>
>         <exclusion>
>             <artifactId>log4j</artifactId>
>             <groupId>log4j</groupId>
>         </exclusion>
>     </exclusions>
> </dependency>
>
>
> I’m not sure why this would break with the ASA parser though.  Maybe
> someone else has an idea?
>
> On November 1, 2016 at 16:16:27, Otto Fowler (ottobackwa...@gmail.com)
> wrote:
>
> Kyle:
>
> I can reproduce this problem ( but with a different test ) locally.
>
> ---
>
>  T E S T S
>
> ---
>
> Running org.apache.metron.parsers.integration.AsaIntegrationTest
>
> SLF4J: Class path contains multiple SLF4J bindings.
>
> SLF4J: Found binding in
> [jar:file:/Users/ottofowler/.m2/repository/org/slf4j/slf4j-
> log4j12/1.7.21/slf4j-log4j12-1.7.21.jar!/org/slf4j/impl/
> StaticLoggerBinder.class]
>
> SLF4J: Found binding in
> [jar:file:/Users/ottofowler/.m2/repository/org/slf4j/slf4j-
> simple/1.7.7/slf4j-simple-1.7.7.jar!/org/slf4j/impl/
> StaticLoggerBinder.class]
>
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
> explanation.
>
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
>
> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 3.509 sec
> <<< FAILURE! - in org.apache.metron.parsers.integration.AsaIntegrationTest
>
> test(org.apache.metron.parsers.integration.AsaIntegrationTest)  Time
> elapsed: 3.506 sec  <<< ERROR!
>
> java.lang.NoClassDefFoundError: org/slf4j/event/LoggingEvent
>
> I *did* pull master over your pr.
>
> Can you merge/pull master and update your local branch and reproduce?
>
>
>
> On November 1, 2016 at 15:33:55, Otto Fowler (ottobackwa...@gmail.com)
> wrote:
>
> " T E S T S
>
> ---
> Running org.apache.metron.parsers.integration.YafIntegrationTest
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in
> [jar:file:/home/travis/.m2/repository/org/slf4j/slf4j-
> log4j12/1.7.21/slf4j-log4j12-1.7.21.jar!/org/slf4j/impl/
> StaticLoggerBinder.class]
> SLF4J: Found binding in
> [jar:file:/home/travis/.m2/repository/org/slf4j/slf4j-
> simple/1.7.7/slf4j-simple-1.7.7.jar!/org/slf4j/impl/
> StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 8.64
> sec <<< FAILURE! - in
> org.apache.metron.parsers.integration.YafIntegrationTest
> test(org.apache.metron.parsers.integration.YafIntegrationTest)  Time
> elapsed: 8.637 sec  <<< ERROR!
> java.lang.NoClassDefFoundError: org/slf4j/event/LoggingEvent”
>
>
> This error, then an "address already in use" error… then no output -
> Travis kills it.
>
> Maybe this error causes an ungraceful shutdown, which affects the next
> test?
>
>
> I’ll grab your pr clean and try to run mvn test && mvn
> integration-test on it and see here.  I assume that this builds
> locally for you and the test and integration-tests run?
>
>
>
>
> On November 1, 2016 at 13:17:47, kylerichardson (g...@git.apache.org)
> wrote:
>
> Github user kylerichardson commented on the issue:
>
> https://github.com/apache/incubator-metron/pull/276
>
> Ok, need some helping figuring out why the CI build keeps failing...
>
> I get several of these at the end of the log:
> ```
> Running org.apache.metron.parsers.integration.JSONMapIntegrationTest
> 2016-11-01 15:54:52 FATAL KafkaServer:116 - [Kafka Server 0], Fatal error
> during KafkaServer startup. Prepare to shutdown
> kafka.common.KafkaException: Socket server failed to bind to
> localhost:6667: Address already in use.
> ```
>
> and prior to that I see:
> ```
> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 8.64 sec
> <<< FAILURE! - in org.apache.metron.parsers.integration.YafIntegrationTest
> 

Re: Jira Rights

2016-10-31 Thread Kyle Richardson
Any chance I could also get rights to assign myself issues in JIRA? If it's
for PMC and committers only I totally understand.

-Kyle


On Mon, Oct 31, 2016 at 8:08 PM, James Sirota  wrote:

> Please sign up for the Jira and I will assign you the rights
>
> 28.10.2016, 19:35, "Otto Fowler" :
> > How can I get rights in jira to assign issues to myself etc?
> >
> > Otto
>
> ---
> Thank you,
>
> James Sirota
> PPMC- Apache Metron (Incubating)
> jsirota AT apache DOT org
>


Re: [DISCUSS] Improving quick-dev

2016-10-19 Thread Kyle Richardson
Sorry, I'm a bit late to the party on this one :).

+1 on the four builds Nick described. Each would be useful and purpose
built.

I like the idea of using Docker, especially for local development and quick
testing. Has anyone explored this? I'm envisioning very specific containers
so you could spin up only the components you're actively working on.
Certainly something I would be willing to put some cycles into if there is
interest.

-Kyle

On Fri, Oct 14, 2016 at 2:15 PM, Ryan Merriman  wrote:

> +1 I like it.  Just to clarify, the scripts to run Storm topologies locally
> in an IDE should be available independent of the environment running.  No
> need for a separate build/image.
>
> On Fri, Oct 14, 2016 at 9:12 AM, Otto Fowler 
> wrote:
>
> > Going forward, the Demo env and data would have implications for testing
> as
> > well ( gold data sets ) etc.
> >
> > On October 14, 2016 at 09:52:07, Nick Allen (n...@nickallen.org) wrote:
> >
> > I think based on everyone's input so far, we're describing 4 different
> > builds/images/tools that would each be intended to run on a standard
> > Mac/Linux/Windows laptop.
> >
> > Full Dev - A development environment that performs a full end-to-end
> > deployment of Metron. This is intended for developers working with
> > sensors, deployments, or validating how all Metron components interact
> with
> > one another.
> >
> >
> > - Starts from base Linux image
> > - Installs Hadoop-y components
> > - Installs Metron
> > - Installs Sensors
> > - Nothing started by default
> >
> > Quick Dev - An environment intended for the developer focusing on the
> > streaming components of Metron; parsing, enrichment, and indexing.
> >
> >
> > - Starts from base image of Linux + Hadoop-y components
> > - Installs Metron
> > - Installs "data generator" spouts
> > - Does not install sensors
> > - Nothing started by default
> >
> > Demo - An environment intended to introduce new users to Metron. The
> > environment should go from nothing to plenty of data in the Metron
> > dashboard in as little "boot" time as possible.
> >
> >
> > - Starts from a base image including Linux + Hadoop-y + Metron + Data
> > Generator Spouts pre-installed
> > - Pre-load Elasticsearch indices so the user has plenty of data to
> > investigate in the dashboard
> > - Does not install sensors
> > - Everything started by default
> >
> > Storm Local Cluster - Otto suggested some scripts/tooling to make it easy
> > to launch the core topologies on a local Storm cluster running on the
> host
> > OS.
> >
> >
> > I'd be interested to hear if this works for everyone and how this might
> > play into the Ambari mpack + RPM based deployment scheme.
> >
> >
> > On Fri, Oct 14, 2016 at 1:45 AM, Michael Miklavcic <
> > michael.miklav...@gmail.com> wrote:
> >
> > > I think this may have come up in another PR already (have to look for
> > it).
> > > But maybe we could maintain our flexibility in quick-dev by installing
> > the
> > > sensors and not starting them until we need them. I think it's useful
> to
> > > have a quick "genuine" e2e testing environment that doesn't require
> > running
> > > through a full install. I'm also not opposed to extracting the
> > integration
> > > test functionality into general purpose data generators.
> > >
> > > On Thu, Oct 13, 2016 at 8:31 PM, Nick Allen 
> wrote:
> > >
> > > > To Jon's point, I think it would be useful to have a Demo box that
> uses
> > > > generators to produce 3 or 4 types of telemetry that shows up in the
> > > Metron
> > > > Dashboard. This box would be different from Quick-Dev in that
> > everything
> > > > starts automatically, so that a user just has to launch it and the
> > should
> > > > start seeing data in the Metron Dashboard right away. In fact, we
> could
> > > > even pre-load the Elasticsearch indices so that the user has more of
> a
> > > > history to mine when using the Demo box.
> > > >
> > > > On Thu, Oct 13, 2016 at 2:04 PM, zeo...@gmail.com 
> > > > wrote:
> > > >
> > > > > +1 Ryan and Otto's comments.
> > > > >
> > > > > I also strongly think we need to make a demo environment easier,
> but
> > > that
> > > > > should be different than quick-dev.
> > > > >
> > > > > Jon
> > > > >
> > > > > On Thu, Oct 13, 2016 at 1:15 PM Otto Fowler <
> ottobackwa...@gmail.com
> > >
> > > > > wrote:
> > > > >
> > > > > > - create scripts/utilities to easily run a topology locally in an
> > IDE
> > > > > > instead of in the VM
> > > > > >
> > > > > >
> > > > > >  THIS.
> > > > > >
> > > > > >
> > > > > > On October 13, 2016 at 12:36:45, Ryan Merriman (
> > merrim...@gmail.com)
> >
> > > > > > wrote:
> > > > > >
> > > > > > Working with the quick-dev vagrant VM recently left a lot to be
> > > > desired.
> > > > > > All forthcoming comments are made under the assumption that this
> VM
> > > is
> > > > > > intended for development purposes. If that is not true, I think
> we
> > > > 

Re: [MENTORS] Release Maturity

2016-10-19 Thread Kyle Richardson
I'm +1 on a meeting to discuss the backlog and would suggest, to be
considered, each JIRA should have a clear description. I think that 72
hours is good for final adds to an upcoming release.

I personally like the idea of having more visibility on which JIRAs are "up
next" to help me figure out where contributions would be most valuable.

-Kyle

On Mon, Oct 17, 2016 at 1:50 PM, zeo...@gmail.com  wrote:

> That's more aggressive than I would have initially suggested, but I would
> be on board with that sort of a meeting.  Interested to see how others
> feel.
>
> Jon
>
> On Mon, Oct 17, 2016 at 1:40 PM James Sirota  wrote:
>
> > Fair criticism.  Would you like to call a recurring meeting where PPMC
> and
> > community can get together and go through the Jira backlog?  We can then
> > have the opportunity to triage the Jiras.  Do you think once a month
> should
> > be sufficient + 72 hours prior to the release to verify all desired Jiras
> > are in?
> >
> > 17.10.2016, 10:01, "zeo...@gmail.com" :
> > > I think that grooming the JIRA backlog at each release provides a good
> > > method for new users or users less integrated into Metron development
> to
> > > understand the roadmap of the project by perusing the backlog. I feel
> > > somewhat aware of the state of the Metron project but often have
> > questions
> > > about how prioritized certain issues are for development. I think the
> > > easiest way to illustrate this gap are in these
> > > <https://issues.apache.org/jira/browse/METRON-170?jql=project%20%3D%20METRON%20AND%20resolution%20%3D%20Unresolved%20AND%20fixVersion%20%3D%200.2.1BETA%20ORDER%20BY%20priority%20DESC>
> > > two
> > > <https://issues.apache.org/jira/browse/METRON-469?jql=project%20%3D%20METRON%20AND%20resolution%20%3D%20Unresolved%20AND%20fixVersion%20is%20EMPTY%20ORDER%20BY%20priority%20DESC>
> > > links, where Metron currently has 45 unresolved issues slated for
> > 0.2.1BETA
> > > (?!?) and 71 unresolved issues that are unscheduled.
> > >
> > > I'd also like to see a comprehensive update of documentation on the
> wiki
> > >  (or
> > > wherever it lives). Heck, TP2 isn't even on the releases page
> > > , let
> alone
> > > 0.2.1 and other related details/changes.
> > >
> > > I'd be more than happy to help with either of those efforts, as
> > applicable.
> > >
> > > Jon
> > >
> > > On Mon, Oct 17, 2016 at 11:39 AM Casey Stella 
> > wrote:
> > >
> > >>  Hi Everyone,
> > >>
> > >>  I'd like to get a bit more systematic about how we release and I
> wanted
> > >>  some clarification and advice about suggested release process.
> > >>
> > >>  The last release, we
> > >>
> > >> - opened up the release via an announce thread that gave people
> the
> > >> opportunity to object and add JIRAs they felt were important to be
> > >> considered for the release
> > >> - made a release branch in git
> > >> - made a release candidate tag
> > >> - sent out the release candidate for a vote
> > >> - when passed, sent the release candidate for a vote in general
> > >>
> > >>  A couple of questions:
> > >>
> > >> - Is 72 hours sufficient for people to suggest JIRAs that need to
> > get in
> > >> for the release?
> > >> - What we did not do is have the JIRA backlog groomed and have
> JIRAs
> > >> assigned to releases beyond the current release. This would make
> it
> > >>  easier
> > >> for people to find JIRAs that they want in. Is that a sensible
> > >> prerequisite for the release or is that overkill?
> > >> - Are there best practices that successful projects of our
> > >> maturity-level do that we are not doing around release?
> > >>
> > >>  Casey
> > > --
> > >
> > > Jon
> >
> > ---
> > Thank you,
> >
> > James Sirota
> > PPMC- Apache Metron (Incubating)
> > jsirota AT apache DOT org
> >
> --
>
> Jon
>


Re: [ANNOUNCE] Metron Apache Community Demo Recording Oct14,2016

2016-10-17 Thread Kyle Richardson
Great stuff! Very useful information. Thanks for hosting.

-Kyle

On Sat, Oct 15, 2016 at 3:57 AM, Yohann Lepage  wrote:

> Hi James,
>
> Thanks for the recording!
>
> Could you please also update the "Meeting Notes" page on the wiki with
> the link to the recording?
>
> https://cwiki.apache.org/confluence/display/METRON/Meeting+notes
>
> Thanks
>
> 2016-10-14 21:40 GMT+02:00 James Sirota :
> > The recording is available at:
> > https://youtu.be/VAEU4JjbS1o
> >
> > The meeting was a demonstration of the upcoming build.  No architectural
> decisions about the platform were made at the meeting.  The features that
> were demoed were:
> >
> >  - Tutorial on extending Stellar functions (helpful for new and aspiring
> committers)
> >  - Using the profiler in enrichment topology
> >  - Various bug fixes
> >  - Address the incubator build comments
> >
> >
> > Please reply to this thread if you have additional
> questions/comments/ideas about what you have seen today.
> >
> > ---
> > Thank you,
> >
> > James Sirota
> > PPMC- Apache Metron (Incubating)
> > jsirota AT apache DOT org
>
>
>
> --
> Yohann L.
>


Re: Issues with Quick Dev Installation

2016-10-12 Thread Kyle Richardson
I like that idea. It seems much cleaner. Let me give it a try.

On Tue, Oct 11, 2016 at 9:53 PM, Nick Allen <n...@nickallen.org> wrote:

> Does it really need to sudo? Can we do something like "become: false" so it
> doesn't try to sudo?
>
> On Oct 11, 2016 9:33 PM, "Kyle Richardson" <kylerichards...@gmail.com>
> wrote:
>
> > Ok, I think I have the fix for this. With the new local_action logic,
> > ansible is checking the jar path on the host with sudo but doesn't have
> the
> > password. Just need to add the line below to the Vagrantfile.
> >
> > ansible.ask_sudo_pass = true
> >
> > I will test tomorrow and, if successful, open a new PR for the fix.
> >
> >
> > On Tue, Oct 11, 2016 at 8:42 PM, Kyle Richardson <
> > kylerichards...@gmail.com>
> > wrote:
> >
> > > This error could be related to my PR that was merged today (METRON-492
> > > <https://github.com/apache/incubator-metron/pull/302>). I tested this
> > > successfully in a single node vm deployment, but not with vagrant.
> > Perhaps
> > > there is something about vagrant that doesn't like the ansible
> > local_action
> > > logic? I'm trying to reproduce the error now.
> > >
> > > On Tue, Oct 11, 2016 at 7:06 PM, Nick Allen <n...@nickallen.org>
> wrote:
> > >
> > >> Hi Rita -
> > >>
> > >> Yes, I was seeing this same issue today.  I haven't looked into the
> > cause
> > >> yet, but you can comment out that check and proceed with the install.
> > >>
> > >> Doing this from memory, but I believe it is in
> > >> metron-deployment/roles/metron_common/tasks/main.yml.  Comment out or
> > >> delete any task that references the "metron_jar_path" variable.  There
> > are
> > >> two tasks that need commented out.
> > >>
> > >> I will try and get a fix out tomorrow.
> > >>
> > >> On Oct 11, 2016 6:35 PM, "Rita McKissick" <rmckiss...@hortonworks.com
> >
> > >> wrote:
> > >>
> > >> > *** Resending because this message seemed to disappear into the
> ether
> > >> and
> > >> > I didn’t receive a copy of it.
> > >> >
> > >> > I’m having difficulties with the latest Quick Development Platform
> > >> > installation. During deployment I received the following error
> > message:
> > >> >
> > >> > --
> > >> >
> > >> > TASK [metron_common : Check for Metron jar path]
> > >> > ***
> > >> > fatal: [node1 -> localhost]: FAILED! => {"changed": false, "failed":
> > >> true,
> > >> > "module_stderr": "sudo: a password is required\n", "module_stdout":
> > "",
> > >> > "msg": "MODULE FAILURE", "parsed": false}
> > >> >
> > >> > PLAY RECAP **
> > **
> > >> > *
> > >> > node1  : ok=20   changed=2unreachable=0
> > >> failed=1
> > >> >
> > >> > Ansible failed to complete successfully. Any error output should be
> > >> > visible above. Please fix these errors and try again.
> > >> >
> > >> > -
> > >> >
> > >> >
> > >> > Everything up to this point looked fine.
> > >> >
> > >> > By the way, I installed the latest Quick-Dev installation earlier
> this
> > >> > afternoon and it didn’t install/configure any topologies. That’s
> why I
> > >> did
> > >> > a second install … which turned out even worse.
> > >> >
> > >> > Anyone else having this issue? Can someone help me with a fix or
> > >> > workaround?
> > >> >
> > >> > Thank you!
> > >> >
> > >> > Rita
> > >> >
> > >> > Rita McKissick ! Sr. Technical Writer
> > >> >
> > >> >
> > >> >
> > >>
> > >
> > >
> >
>


Re: Issues with Quick Dev Installation

2016-10-11 Thread Kyle Richardson
Ok, I think I have the fix for this. With the new local_action logic,
ansible is checking the jar path on the host with sudo but doesn't have the
password. Just need to add the line below to the Vagrantfile.

ansible.ask_sudo_pass = true

I will test tomorrow and, if successful, open a new PR for the fix.


On Tue, Oct 11, 2016 at 8:42 PM, Kyle Richardson <kylerichards...@gmail.com>
wrote:

> This error could be related to my PR that was merged today (METRON-492
> <https://github.com/apache/incubator-metron/pull/302>). I tested this
> successfully in a single node vm deployment, but not with vagrant. Perhaps
> there is something about vagrant that doesn't like the ansible local_action
> logic? I'm trying to reproduce the error now.
>
> On Tue, Oct 11, 2016 at 7:06 PM, Nick Allen <n...@nickallen.org> wrote:
>
>> Hi Rita -
>>
>> Yes, I was seeing this same issue today.  I haven't looked into the cause
>> yet, but you can comment out that check and proceed with the install.
>>
>> Doing this from memory, but I believe it is in
>> metron-deployment/roles/metron_common/tasks/main.yml.  Comment out or
>> delete any task that references the "metron_jar_path" variable.  There are
>> two tasks that need commented out.
>>
>> I will try and get a fix out tomorrow.
>>
>> On Oct 11, 2016 6:35 PM, "Rita McKissick" <rmckiss...@hortonworks.com>
>> wrote:
>>
>> > *** Resending because this message seemed to disappear into the ether
>> and
>> > I didn’t receive a copy of it.
>> >
>> > I’m having difficulties with the latest Quick Development Platform
>> > installation. During deployment I received the following error message:
>> >
>> > --
>> >
>> > TASK [metron_common : Check for Metron jar path]
>> > ***
>> > fatal: [node1 -> localhost]: FAILED! => {"changed": false, "failed":
>> true,
>> > "module_stderr": "sudo: a password is required\n", "module_stdout": "",
>> > "msg": "MODULE FAILURE", "parsed": false}
>> >
>> > PLAY RECAP 
>> > *
>> > node1  : ok=20   changed=2unreachable=0
>> failed=1
>> >
>> > Ansible failed to complete successfully. Any error output should be
>> > visible above. Please fix these errors and try again.
>> >
>> > -
>> >
>> >
>> > Everything up to this point looked fine.
>> >
>> > By the way, I installed the latest Quick-Dev installation earlier this
>> > afternoon and it didn’t install/configure any topologies. That’s why I
>> did
>> > a second install … which turned out even worse.
>> >
>> > Anyone else having this issue? Can someone help me with a fix or
>> > workaround?
>> >
>> > Thank you!
>> >
>> > Rita
>> >
>> > Rita McKissick ! Sr. Technical Writer
>> >
>> >
>> >
>>
>
>


Re: Issues with Quick Dev Installation

2016-10-11 Thread Kyle Richardson
This error could be related to my PR that was merged today (METRON-492
). I tested this
successfully in a single node vm deployment, but not with vagrant. Perhaps
there is something about vagrant that doesn't like the ansible local_action
logic? I'm trying to reproduce the error now.

On Tue, Oct 11, 2016 at 7:06 PM, Nick Allen  wrote:

> Hi Rita -
>
> Yes, I was seeing this same issue today.  I haven't looked into the cause
> yet, but you can comment out that check and proceed with the install.
>
> Doing this from memory, but I believe it is in
> metron-deployment/roles/metron_common/tasks/main.yml.  Comment out or
> delete any task that references the "metron_jar_path" variable.  There are
> two tasks that need commented out.
>
> I will try and get a fix out tomorrow.
>
> On Oct 11, 2016 6:35 PM, "Rita McKissick" 
> wrote:
>
> > *** Resending because this message seemed to disappear into the ether and
> > I didn’t receive a copy of it.
> >
> > I’m having difficulties with the latest Quick Development Platform
> > installation. During deployment I received the following error message:
> >
> > --
> >
> > TASK [metron_common : Check for Metron jar path]
> > ***
> > fatal: [node1 -> localhost]: FAILED! => {"changed": false, "failed":
> true,
> > "module_stderr": "sudo: a password is required\n", "module_stdout": "",
> > "msg": "MODULE FAILURE", "parsed": false}
> >
> > PLAY RECAP 
> > *
> > node1  : ok=20   changed=2unreachable=0
> failed=1
> >
> > Ansible failed to complete successfully. Any error output should be
> > visible above. Please fix these errors and try again.
> >
> > -
> >
> >
> > Everything up to this point looked fine.
> >
> > By the way, I installed the latest Quick-Dev installation earlier this
> > afternoon and it didn’t install/configure any topologies. That’s why I
> did
> > a second install … which turned out even worse.
> >
> > Anyone else having this issue? Can someone help me with a fix or
> > workaround?
> >
> > Thank you!
> >
> > Rita
> >
> > Rita McKissick ! Sr. Technical Writer
> >
> >
> >
>


Re: [DISCUSS] Recurring community meetings to demo Metron features

2016-09-21 Thread Kyle Richardson
Great idea. +1 on the agenda. Maybe half demo of latest features / half
discussion on upcoming changes and new ideas.

-Kyle

On Wed, Sep 21, 2016 at 4:03 PM, zeo...@gmail.com  wrote:

> I'm in from CMU.  Zoom and WebEx work well.
>
> Only suggestion would be a basic agenda (I.e. feature list) prior to the
> meeting so people can do their homework.  Prior meaning ~24 hours before at
> a minimum IMO.
>
> Jon
>
> On Wed, Sep 21, 2016, 15:53 Tseytlin, Keren  >
> wrote:
>
> > Hi James,
> >
> > Kevin and I (and perhaps a couple others from our team) will join in too.
> > Keep us posted with the meeting details.
> >
> > Best,
> > Keren
> >
> > On 9/21/16, 2:26 PM, "Otto Fowler"  wrote:
> >
> > >Hi James,
> > >
> > >I think this is a great idea.  Zoom seems to work pretty well.
> > >
> > >On September 21, 2016 at 14:18:54, James Sirota (jsir...@apache.org)
> > >wrote:
> > >
> > >I want to set up recurring meetings that run twice a month where we can
> > >demo
> > >latest features of Metron and fixes for Metron . I want to have the
> first
> > >one of these this Friday on Sept. 23, 11AM PST.
> > >
> > >Would anyone object to doing this? If not, what medium do we want to do
> it
> > >in? I would like people to be able to use screen share and not just IRC.
> > >On
> > >my end I can provide a Zoom or a WebEx to facilitate this. Then we can
> > >post
> > >meeting notes onto the Metron boards and post the video up on you tube
> and
> > >link to it. Any other preferences/suggestions on how to do this?
> > >
> > >
> > >
> > >---
> > >Thank you,
> > >
> > >James Sirota
> > >PPMC- Apache Metron (Incubating)
> > >jsirota AT apache DOT org
> >
> > 
> >
> > The information contained in this e-mail is confidential and/or
> > proprietary to Capital One and/or its affiliates and may only be used
> > solely in performance of work or services for Capital One. The
> information
> > transmitted herewith is intended only for use by the individual or entity
> > to which it is addressed. If the reader of this message is not the
> intended
> > recipient, you are hereby notified that any review, retransmission,
> > dissemination, distribution, copying or other use of, or taking of any
> > action in reliance upon this information is strictly prohibited. If you
> > have received this communication in error, please contact the sender and
> > delete the material from your computer.
> >
> > --
>
> Jon
>


Re: [DISCUSS] Parsing messages without IP addresses

2016-09-18 Thread Kyle Richardson
Thanks, Casey. That's the piece I missed somewhere along the way. I was
looking for definitive guidance on the required fields.

That's right. The vast majority of ASA events contain the standard source
and destination address and port information. It's only a very few that don't.

I'll move forward by simply not including those fields on those few message
types.

Thanks again,
Kyle


On Sun, Sep 18, 2016 at 1:10 PM, Casey Stella <ceste...@gmail.com> wrote:

> There are actually very few required fields in our parsers (timestamp and
> original_message), so not having an src and dest IP address only really
> means you won't be able to enrich based on THAT field, but could enrich on
> other fields.
>
> I'd say leave them out if they aren't part of the format. It sounds like
> some ASA events will have them and others won't, right?
> On Sun, Sep 18, 2016 at 13:05 Kyle Richardson <kylerichards...@gmail.com>
> wrote:
>
> > All,
> >
> > I've run into an edge case while working on METRON-363
> > <https://issues.apache.org/jira/browse/METRON-363>. There are some log
> > events which do not contain IP addresses and thus cannot be fully
> > normalized into the standard Metron JSON fields.
> >
> > What are folks' thoughts on how to handle this situation? (Or how have you
> > handled it in other, existing parsers?) We could omit the fields, write
> > them out as nulls, or not continue processing the events at all.
> >
> > I'm interested in your feedback. It seems to me that we would want all
> the
> > events to be indexed/persisted for long term archival; however, currently
> > enrichment relies heavily on IP addresses.
> >
> > What do you think?
> >
> > Thanks,
> > Kyle
> >
>
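
Following Casey's guidance, a parser can simply skip the normalized address fields when an event does not carry them. The sketch below uses a plain Map and an invented, simplified pattern purely for illustration (Metron parsers emit JSON objects, but the conditional logic is the same); ip_src_addr and ip_dst_addr are the usual Metron field names.

import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OptionalAddressFields {

  // Simplified, hypothetical pattern -- real ASA messages vary by message ID.
  private static final Pattern SRC_DST = Pattern.compile(
      "src\\s+(\\d{1,3}(?:\\.\\d{1,3}){3}).*dst\\s+(\\d{1,3}(?:\\.\\d{1,3}){3})");

  static Map<String, Object> toMessage(String original, long timestamp) {
    Map<String, Object> message = new HashMap<>();
    // The timestamp and the original message text are the only fields treated as required.
    message.put("timestamp", timestamp);
    message.put("original_string", original);

    // Populate the normalized address fields only when the event actually has them,
    // rather than writing nulls or dropping the event.
    Matcher m = SRC_DST.matcher(original);
    if (m.find()) {
      message.put("ip_src_addr", m.group(1));
      message.put("ip_dst_addr", m.group(2));
    }
    return message;
  }

  public static void main(String[] args) {
    // Invented sample lines for illustration only.
    System.out.println(toMessage("Teardown TCP connection src 10.1.2.3 dst 192.168.1.5",
        System.currentTimeMillis()));
    System.out.println(toMessage("User authentication succeeded for user admin",
        System.currentTimeMillis()));
  }
}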


[DISCUSS] Parsing messages without IP addresses

2016-09-18 Thread Kyle Richardson
All,

I've run into an edge case while working on METRON-363
. There are some log
events which do not contain IP addresses and thus cannot be fully
normalized into the standard Metron JSON fields.

What are folks' thoughts on how to handle this situation? (Or how have you
handled it in other, existing parsers?) We could omit the fields, write
them out as nulls, or not continue processing the events at all.

I'm interested in your feedback. It seems to me that we would want all the
events to be indexed/persisted for long term archival; however, currently
enrichment relies heavily on IP addresses.

What do you think?

Thanks,
Kyle


Re: log parsers-

2016-09-14 Thread Kyle Richardson
I have working code for the ASA piece (METRON-363). Just finishing up some
edge case testing. I'll submit a PR for it within your 2 week timeframe.

Thanks,
Kyle

> On Sep 14, 2016, at 6:58 PM, Satish Abburi  wrote:
> 
> 
> Thanks, the timeline is 2 weeks from now.
> 
> From: Poornima Ravindra Mulukutla 
> >
> Reply-To: 
> "u...@metron.incubator.apache.org" 
> >
> Date: Wednesday, September 14, 2016 at 3:26 PM
> To: 
> "u...@metron.incubator.apache.org" 
> >
> Cc: "dev@metron.incubator.apache.org" 
> >
> Subject: Re: log parsers-
> 
> Thank you
> 
> I am happy to take up the ASA log file analyser. What is the timeline you are 
> looking for, so that I can plan accordingly?
> 
> In the past I have done a BlueCoat log analyser when I was doing research on 
> the HTTP specification (published a patent that created a big change in HTTP 
> designs), recently adopted for Microsoft IE 11.
> 
> On Wed, Sep 14, 2016 at 6:54 PM, Satish Abburi 
> > wrote:
> 
> Hi, we are trying to build parsers for our Phase 1 demo on the Metron platform. 
> We would like to find out if anyone already has these parsers developed.
> We have already started working on the Windows parser; the rest we plan to start 
> this week. We can leverage anything already available or collaborate appropriately.
> 
> 
>  *   ASA (Firewall) Metron-363
>  *   Windows (Desktop) - METRON-165
>  *   Unix (OS) Metron-175
>  *   Email
>  *   BlueCoat(Proxy) METRON-162
> 
> Thanks for your help!
> Satish
> 


Re: Newbie

2016-07-27 Thread Kyle Richardson
Thanks for the warm welcome! I'm looking forward to working with you all.

-Kyle

On Wed, Jul 27, 2016 at 9:37 AM, Casey Stella <ceste...@gmail.com> wrote:

> It appears that your questions were answered by Jon and Nick, but I wanted
> to say welcome aboard.
> We absolutely love contributions of either JIRAs and/or PRs.  The
> committers try very hard to be as responsive as possible.
> Also, don't feel like things have to be perfect.  We are a very forgiving
> group...if something is amiss, we'll let you know and there are absolutely
> no hard feelings.
>
> Best,
>
> Casey
>
> On Tue, Jul 26, 2016 at 9:25 PM, Kyle Richardson <
> kylerichards...@gmail.com>
> wrote:
>
> > Hi All,
> >
> > I've recently joined both the dev and user mailing lists for Metron and
> > thought I'd give a short intro before I started in with questions.
> >
> > I've been working in the Cyber Security space for a little over 4 years
> now
> > primarily as a SIEM Engineer, SOC Analyst, and SOC Manager. In my current
> > role, I'm primarily responsible for our SIEM system soup to nuts and for
> > investigating "Security Analytics" and "Big Data for Security" tools for
> > possible implementation.
> >
> > I'm in the early stages of getting a functional Metron install up and
> > running, and am really interested in helping contribute to the project
> from
> > both a use case / ideas perspective as well as working code and
> > documentation.
> >
> > The first use case I'm trying to build out is using Cisco ASA firewall
> logs
> > on a single node VM. I've run into a couple of snags but have been
> > reasonably okay at working through them thus far.
> >
> > On to my questions... Can anyone create an account and contribute to the
> > Jira board? If I do find small fixes can I submit them as PRs? If so, are
> > there any special conventions you use that I should know?
> >
> > I think this is a very exciting project and am looking forward to getting
> > involved.
> >
> > Thanks!
> >
> > -Kyle
> >
>


Newbie

2016-07-26 Thread Kyle Richardson
Hi All,

I've recently joined both the dev and user mailing lists for Metron and
thought I'd give a short intro before I started in with questions.

I've been working in the Cyber Security space for a little over 4 years now
primarily as a SIEM Engineer, SOC Analyst, and SOC Manager. In my current
role, I'm primarily responsible for our SIEM system soup to nuts and for
investigating "Security Analytics" and "Big Data for Security" tools for
possible implementation.

I'm in the early stages of getting a functional Metron install up and
running, and am really interested in helping contribute to the project from
both a use case / ideas perspective as well as working code and
documentation.

The first use case I'm trying to build out is using Cisco ASA firewall logs
on a single node VM. I've run into a couple of snags but have been
reasonably okay at working through them thus far.

On to my questions... Can anyone create an account and contribute to the
Jira board? If I do find small fixes can I submit them as PRs? If so, are
there any special conventions you use that I should know?

I think this is a very exciting project and am looking forward to getting
involved.

Thanks!

-Kyle