[GitHub] incubator-metron pull request #345: METRON-532 Define Profile Period When Ca...

2017-01-03 Thread mattf-horton
Github user mattf-horton closed the pull request at:

https://github.com/apache/incubator-metron/pull/345




[GitHub] incubator-metron issue #410: METRON-645 Unable to Start Fastcapa Test Enviro...

2017-01-03 Thread JonZeolla
Github user JonZeolla commented on the issue:

https://github.com/apache/incubator-metron/pull/410
  
I tested successfully without merging METRON-635.  LGTM




[GitHub] incubator-metron issue #407: METRON-643: Stellar function documentation need...

2017-01-03 Thread JonZeolla
Github user JonZeolla commented on the issue:

https://github.com/apache/incubator-metron/pull/407
  
Note:  `BIN` and `STATS_BIN` may be added via 
[METRON-637](https://github.com/apache/incubator-metron/pull/401) but the 
documentation in metron-common was not updated, potentially indicating a 
preference to remove the `STATS_*` documentation from the metron-common 
`README.md`.

I would suggest that at the end of the day we make sure there's a single 
place where people can go for Stellar function documentation, even if it's just 
a page of pointers to the various READMEs.




Re: METRON-648 GrokWebSphereParserTest and BasicAsaParserTest are not 2017-safe

2017-01-03 Thread Michael Miklavcic
I also introduced a Clock object and testing mechanism back in METRON-235 -
https://github.com/apache/incubator-metron/pull/156
Sample test utilizing the Clock object here -
https://github.com/apache/incubator-metron/blob/master/metron-platform/metron-pcap-backend/src/test/java/org/apache/metron/pcap/query/PcapCliTest.java

That being said, it's probably better to use the new java.time fixed clock
implementation in all places, as referenced by Matt. I agree with
everyone on a quick fix for the build and a follow-on PR to introduce
appropriate dependency injection for testing.
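
(As a concrete sketch of that approach: the parser takes a java.time Clock, production
code passes Clock.systemUTC(), and tests pin the clock to 2016 so year-less timestamps
resolve deterministically. The class name and timestamp format below are hypothetical
stand-ins, not the actual Grok/ASA parser code.)

```java
import java.time.Clock;
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;
import java.util.Locale;

public class YearlessTimestampParser {

  private final Clock clock;

  public YearlessTimestampParser() {
    this(Clock.systemUTC());   // production: wall clock
  }

  public YearlessTimestampParser(Clock clock) {
    this.clock = clock;        // tests: inject a fixed clock
  }

  // Parses a timestamp like "Jan 03 21:59:00", defaulting the missing year
  // to the year of the injected clock.
  public long toEpochMillis(String timestamp) {
    DateTimeFormatter formatter = new DateTimeFormatterBuilder()
        .appendPattern("MMM dd HH:mm:ss")
        .parseDefaulting(ChronoField.YEAR, ZonedDateTime.now(clock).getYear())
        .toFormatter(Locale.ENGLISH)
        .withZone(ZoneOffset.UTC);
    return Instant.from(formatter.parse(timestamp)).toEpochMilli();
  }

  public static void main(String[] args) {
    // A test pins the clock to 2016 so the assertion never rolls over with the calendar.
    Clock fixed = Clock.fixed(Instant.parse("2016-12-28T00:00:00Z"), ZoneOffset.UTC);
    System.out.println(new YearlessTimestampParser(fixed).toEpochMillis("Jan 03 21:59:00"));
  }
}
```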

As for string dates with no year, we had something similar show up in the
Snort parser. There ended up being a configuration option in Snort to
enable a year to be printed, but we may want to offer alternatives for
other parsers. Regardless of how we approach this, it gets messy when you
start thinking about potentially different src/dest timezones across a new
year boundary, in addition to data replay. I would urge our main goal here
to be idempotency.

Best,
Mike

On Tue, Jan 3, 2017 at 5:05 PM, Kyle Richardson 
wrote:

> Agreed. I prefer the quick win to get us back to successful builds.
>
> I do think it's worth a general discussion around how we want to handle
> the parsing of string dates with no year. In the long run, Matt's
> suggestion of incorporating the Clock object is probably the route to go;
> albeit as a separate enhancement PR.
>
> I'll start a new discuss thread for that and submit a PR for the quick fix.
>
> -Kyle
>
> > On Jan 3, 2017, at 5:20 PM, David Lyle  wrote:
> >
> > I'm not sure I'm an owner, but I have an opinion. :)
> >
> > I'd just add "2016". Easy and targeted.
> >
> > -D...
> >
> >
> >> On Tue, Jan 3, 2017 at 5:08 PM, Matt Foley  wrote:
> >>
> >> I’ll subordinate this to METRON-647 since it was evidently filed while I
> >> was writing METRON-648 (I did check before!)
> >>
> >> The question below remains valid, however…
> >>
> >>
> >> On 1/3/17, 1:59 PM, "Matt Foley"  wrote:
> >>
> >>Hi all,
> >>As described in https://issues.apache.org/jira/browse/METRON-648 ,
> >> these two test modules are not year-safe, and are suddenly (as of 2017)
> >> giving false Travis errors.
> >>
> >>I can fix it quickly, but a question for the “owners” of GrokParser:
> >> Do you have an opinion as to whether the fix should be done by adding
> >> "2016" to the testString values in the GrokWebSphereParserTest test
> module
> >> (easy, and only affects the test module), vs making GrokParser use a
> Clock
> >> object set to 2016 (more involved, and affecting core code, but allowing
> >> for more interesting testing)?
> >>
> >>For those interested, BasicAsaParserTest::testShortTimestamp()
> >> illustrates the use of Clock object in the Asa Parser and its test
> module.
> >>
> >>Thanks,
> >>--Matt
> >>
> >>
> >>
> >>
> >>
> >>
> >>
>


Re: Tests failing due to new year

2017-01-03 Thread Matt Foley
Heh, darn it, crossed in Jira – see METRON-648.  You win :-)

On 1/3/17, 1:13 PM, "Nick Allen"  wrote:

Thanks, Kyle.  I am seeing the same issue.  Happy Y2K... I mean 2017.

On Tue, Jan 3, 2017 at 4:09 PM, Kyle Richardson 
wrote:

> Created METRON-647 for tracking.
>
> -Kyle
>
> On Tue, Jan 3, 2017 at 3:49 PM, Kyle Richardson  >
> wrote:
>
> > ** This is causing all new PRs to fail Travis CI **
> >
> > The rollover to the new year is causing unit test failures for some of
> our
> > parser classes. It looks like the issue is the same in all cases... We
> have
> > hard coded a timestamp assertion but the original message does not
> contain
> > the year and is now parsing as 2017 instead of 2016.
> >
> > I'm currently investigating the failure for BasicAsaParserTest.
> > testIp6Addr:151.
> >
> > Other failures from the Travis CI log I'm looking at are:
> > GrokWebSphereParserTest.testParseLoginLine:60
> > GrokWebSphereParserTest.testParseMalformedLoginLine:151
> > GrokWebSphereParserTest.tetsParseLogoutLine:84
> > GrokWebSphereParserTest.tetsParseMalformedLogoutLine:175
> > GrokWebSphereParserTest.tetsParseMalformedOtherLine:220
> > GrokWebSphereParserTest.tetsParseMalformedRBMLine:198
> > GrokWebSphereParserTest.tetsParseOtherLine:129
> > GrokWebSphereParserTest.tetsParseRBMLine:107
> >
> > -Kyle
> >
> >
>



-- 
Nick Allen 





[GitHub] incubator-metron issue #408: METRON-608 Mpack to install a single-node test ...

2017-01-03 Thread mattf-horton
Github user mattf-horton commented on the issue:

https://github.com/apache/incubator-metron/pull/408
  
Please note the Travis failures are not relevant.  All nine are due to 
https://issues.apache.org/jira/browse/METRON-648 




[GitHub] incubator-metron issue #410: METRON-645 Unable to Start Fastcapa Test Enviro...

2017-01-03 Thread JonZeolla
Github user JonZeolla commented on the issue:

https://github.com/apache/incubator-metron/pull/410
  
+1 (nonbinding) pending a successful Travis build.  It appears there is a 
general issue that Kyle Richardson is investigating (dev mailing list thread titled 
"Tests failing due to new year").

Ran it up on my CentOS 6.8 host server, followed your suggested testing, 
and received the success message.  Note that in my testing I also merged [my 
changes for 
METRON-635](https://github.com/JonZeolla/incubator-metron/commit/318485547d6a4383a68b88566727aefdc4ff9748)
 but after some brief testing I'm not sure that was necessary.  Can revisit 
again tomorrow to be sure (this lack of testing METRON-635 is why I haven't 
opened a PR yet).




Re: Tests failing due to new year

2017-01-03 Thread Nick Allen
Thanks, Kyle.  I am seeing the same issue.  Happy Y2K... I mean 2017.

On Tue, Jan 3, 2017 at 4:09 PM, Kyle Richardson 
wrote:

> Created METRON-647 for tracking.
>
> -Kyle
>
> On Tue, Jan 3, 2017 at 3:49 PM, Kyle Richardson  >
> wrote:
>
> > ** This is causing all new PRs to fail Travis CI **
> >
> > The rollover to the new year is causing unit test failures for some of
> our
> > parser classes. It looks like the issue is the same in all cases... We
> have
> > hard coded a timestamp assertion but the original message does not
> contain
> > the year and is now parsing as 2017 instead of 2016.
> >
> > I'm currently investigating the failure for BasicAsaParserTest.
> > testIp6Addr:151.
> >
> > Other failures from the Travis CI log I'm looking at are:
> > GrokWebSphereParserTest.testParseLoginLine:60
> > GrokWebSphereParserTest.testParseMalformedLoginLine:151
> > GrokWebSphereParserTest.tetsParseLogoutLine:84
> > GrokWebSphereParserTest.tetsParseMalformedLogoutLine:175
> > GrokWebSphereParserTest.tetsParseMalformedOtherLine:220
> > GrokWebSphereParserTest.tetsParseMalformedRBMLine:198
> > GrokWebSphereParserTest.tetsParseOtherLine:129
> > GrokWebSphereParserTest.tetsParseRBMLine:107
> >
> > -Kyle
> >
> >
>



-- 
Nick Allen 


Re: Tests failing due to new year

2017-01-03 Thread Kyle Richardson
Created METRON-647 for tracking.

-Kyle

On Tue, Jan 3, 2017 at 3:49 PM, Kyle Richardson 
wrote:

> ** This is causing all new PRs to fail Travis CI **
>
> The rollover to the new year is causing unit test failures for some of our
> parser classes. It looks like the issue is the same in all cases... We have
> hard coded a timestamp assertion but the original message does not contain
> the year and is now parsing as 2017 instead of 2016.
>
> I'm currently investigating the failure for BasicAsaParserTest.
> testIp6Addr:151.
>
> Other failures from the Travis CI log I'm looking at are:
> GrokWebSphereParserTest.testParseLoginLine:60
> GrokWebSphereParserTest.testParseMalformedLoginLine:151
> GrokWebSphereParserTest.tetsParseLogoutLine:84
> GrokWebSphereParserTest.tetsParseMalformedLogoutLine:175
> GrokWebSphereParserTest.tetsParseMalformedOtherLine:220
> GrokWebSphereParserTest.tetsParseMalformedRBMLine:198
> GrokWebSphereParserTest.tetsParseOtherLine:129
> GrokWebSphereParserTest.tetsParseRBMLine:107
>
> -Kyle
>
>


Tests failing due to new year

2017-01-03 Thread Kyle Richardson
** This is causing all new PRs to fail Travis CI **

The rollover to the new year is causing unit test failures for some of our
parser classes. It looks like the issue is the same in all cases... We have
hard coded a timestamp assertion but the original message does not contain
the year and is now parsing as 2017 instead of 2016.

I'm currently investigating the failure for
BasicAsaParserTest.testIp6Addr:151.

Other failures from the Travis CI log I'm looking at are:
GrokWebSphereParserTest.testParseLoginLine:60
GrokWebSphereParserTest.testParseMalformedLoginLine:151
GrokWebSphereParserTest.tetsParseLogoutLine:84
GrokWebSphereParserTest.tetsParseMalformedLogoutLine:175
GrokWebSphereParserTest.tetsParseMalformedOtherLine:220
GrokWebSphereParserTest.tetsParseMalformedRBMLine:198
GrokWebSphereParserTest.tetsParseOtherLine:129
GrokWebSphereParserTest.tetsParseRBMLine:107

-Kyle


Re: Custom Storm Topologies

2017-01-03 Thread Carolyn Duby
Also, please consider the security of the scripts and the potential for script-injection attacks.
For example, we should probably restrict file access.

Thanks
Carolyn
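
(To make that concern concrete, below is a rough sketch of what a guarded SHELL_EXEC
Stellar function might look like. The @Stellar annotation and BaseStellarFunction
names are recalled from Casey's tutorial and should be checked against the current
tree; the whitelisted script path and timeout are purely illustrative.)

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.TimeUnit;
import java.util.stream.Collectors;

// Assumed API, per Casey's tutorial: the @Stellar annotation and
// BaseStellarFunction base class in metron-common; verify names/packages
// against the current tree before relying on this.
import org.apache.metron.common.dsl.BaseStellarFunction;
import org.apache.metron.common.dsl.Stellar;

@Stellar(name = "SHELL_EXEC",
         description = "Runs a whitelisted external command and returns its stdout.",
         params = {"command - full path of the command to run"},
         returns = "The command's standard output as a string.")
public class ShellExec extends BaseStellarFunction {

  // Only commands on this list may run; everything else is rejected outright.
  // The path below is purely illustrative.
  private static final Set<String> WHITELIST =
      new HashSet<>(Arrays.asList("/usr/local/bin/score.sh"));
  private static final long TIMEOUT_SECONDS = 5;

  @Override
  public Object apply(List<Object> args) {
    String command = (String) args.get(0);
    if (!WHITELIST.contains(command)) {
      throw new IllegalArgumentException("Command not whitelisted: " + command);
    }
    try {
      Process p = new ProcessBuilder(command).redirectErrorStream(true).start();
      // Assumes the script produces a small amount of output; large output
      // would need to be drained concurrently to avoid filling the pipe buffer.
      if (!p.waitFor(TIMEOUT_SECONDS, TimeUnit.SECONDS)) {
        p.destroyForcibly();
        throw new IllegalStateException("Command timed out: " + command);
      }
      try (BufferedReader out = new BufferedReader(
          new InputStreamReader(p.getInputStream(), StandardCharsets.UTF_8))) {
        return out.lines().collect(Collectors.joining("\n"));
      }
    } catch (Exception e) {
      throw new IllegalStateException("SHELL_EXEC failed: " + command, e);
    }
  }
}
```

The point is that the function refuses anything not explicitly whitelisted and bounds
execution time, which covers the injection and runaway-script concerns raised in this thread.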



On 1/3/17, 3:25 PM, "Otto Fowler"  wrote:

>A script bolt would still allow them to write the script the way they want
>to, but would avoid having to write all the scaffolding.
>The matter then would be how to integrate that script bolt into the
>topologies.
>
>
>On January 3, 2017 at 15:17:59, zeo...@gmail.com (zeo...@gmail.com) wrote:
>
>Right, that definitely is more efficient, but part of the point here is to
>lower the barrier of entry to using Metron.
>
>It makes Metron's triage abilities more flexible and allows a user to reuse
>existing code quickly and easily.  Having this available for PoC,
>prototyping, and low volume environments or situations (only when threat
>score is 100, for instance) is important, as it lowers the barrier to entry
>of migrating a company to a Metron environment.
>
>I see this as a tradeoff where I would prioritize ease of use over
>efficiency.  There's nothing wrong with making both options available, at
>some point, and making their different use cases clear.
>
>Jon
>
>On Tue, Jan 3, 2017 at 1:47 PM Matt Foley  wrote:
>
>Well, yes :-)
>And clearly it should always be more efficient to write a custom bolt in
>Java than to invoke a script and manage it.
>
>--Matt
>
>From: Otto Fowler 
>Date: Tuesday, January 3, 2017 at 7:08 AM
>To: "dev@metron.incubator.apache.org" ,
>Matt Foley 
>Subject: Re: Custom Storm Topologies
>
>Wouldn’t that be a bolt?
>
>
>On January 2, 2017 at 14:39:34, Matt Foley (ma...@apache.org) wrote:
>Should we consider a script calling capability that can launch a streaming
>script and keep it alive and fed, long-term, rather than launching the
>script anew every time the Stellar function is invoked? I’m thinking two
>basic rules: Write a line, read a line; and always have a timeout. Prob
>need a UID of some sort for a cache of running process objects.
>
>--Matt
>
>On 1/2/17, 8:50 AM, "Carolyn Duby"  wrote:
>
>
>Inserting a script inline is ok for low throughput and prototyping but once
>you get higher throughput (millions of events per second), it’s probably
>going to be a bottleneck.
>
>
>For Metron-571 you might want to consider a java based extension plugin
>similar to Eclipse plugins.
>
>Thanks
>Carolyn
>
>On 12/31/16, 5:22 PM, "Tyler Moore"  wrote:
>
>>Thanks Jon,
>>
>>I'll look over the tutorial and put something together for the SHELL_EXEC
>>stellar function.
>>I don't believe I have permissions to assign in Jira if you want to assign
>>to me my username is devopsec.
>>I'll post back details and we can review security issues
>>
>>Regards,
>>
>>Tyler Moore
>>Software Engineer
>>Phone: 248-909-2769 <(248)%20909-2769>
>>Email: moore.ty...@goflyball.com
>>
>>
>>On Sat, Dec 31, 2016 at 9:46 AM, zeo...@gmail.com  wrote:
>>
>>> Casey did a tutorial on how to add your own Stellar function here
>>>  - there is not an existing
>>> function that does this (current functions are listed here
>>> >> metron-platform/metron-common#stellar-core-functions>).
>>> I noticed that some of the Stellar function documentation was a bit dated
>>> so I've opened a PR to update it here
>>> .
>>>
>>> As this is something I need as well, I'd be happy to assist you where I
>>> can. Perhaps you want to self-assign METRON-571
>>> ? I do have some
>>> security concerns with a SHELL_EXEC function because it could result in
>RCE
>>> - if that's the route you go I could probably help with a thorough secure
>>> code review.
>>>
>>> Jon
>>>
>>> On Fri, Dec 30, 2016 at 10:43 PM Tyler Moore 
>wrote:
>>>
>>> Thank you everyone for your suggestions,
>>>
>>> I believe that kicking off the function via stellar would be the optimal
>>> solution. If anyone has an example of calling external code via stellar
>>> that would be very helpful. Thanks!
>>>
>>> Regards,
>>>
>>> Tyler Moore
>>> IT Specialist
>>> tyler.math...@yahoo.com
>>> 248-909-2769 <(248)%20909-2769> <(248)%20909-2769>
>>>
>>> > On Dec 30, 2016, at 17:54, Otto Fowler  wrote:
>>> >
>>> > They are all extension points.
>>> >
>>> >> On December 30, 2016 at 16:34:58, zeo...@gmail.com (zeo...@gmail.com)
>>> wrote:
>>> >>
>>> >> Right but unless I'm missing something, both of those options are more
>>> >> rigid and the MaaS service would have an unnecessary delay as opposed
>to
>>> >> doing it entirely in Stellar. Unless there's a reason to do otherwise
>>> that
>>> >> I'm missing, I would think doing this in Stellar gives you a more
>timely
>>> >> and 

Re: Custom Storm Topologies

2017-01-03 Thread Otto Fowler
A script bolt would still allow them to write the script the way they want
to, but would avoid having to write all the scaffolding.
The matter then would be how to integrate that script bolt into the
topologies.
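
(Below is a minimal sketch of the long-running script wrapper Matt describes further
down in this thread: write a line, read a line, always enforce a timeout, and cache the
running process under a UID. The class names, cache shape, and line protocol are
assumptions for illustration, not existing Metron code; a script bolt or Stellar
function could wrap something like this.)

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Keeps one external script alive per UID and speaks a line-in/line-out protocol.
public class ScriptProcessCache {

  private static final ConcurrentHashMap<String, ScriptProcess> CACHE = new ConcurrentHashMap<>();

  public static ScriptProcess get(String uid, List<String> command) {
    return CACHE.computeIfAbsent(uid, k -> new ScriptProcess(command));
  }

  public static class ScriptProcess {
    private final Process process;
    private final BufferedWriter stdin;
    private final BufferedReader stdout;
    private final ExecutorService executor = Executors.newSingleThreadExecutor();

    ScriptProcess(List<String> command) {
      try {
        process = new ProcessBuilder(command).redirectErrorStream(true).start();
        stdin = new BufferedWriter(
            new OutputStreamWriter(process.getOutputStream(), StandardCharsets.UTF_8));
        stdout = new BufferedReader(
            new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8));
      } catch (IOException e) {
        throw new IllegalStateException("Unable to launch script: " + command, e);
      }
    }

    // Write one line, read one line, and always enforce a timeout so a hung
    // script cannot stall the topology.
    public synchronized String call(String inputLine, long timeoutMs)
        throws IOException, InterruptedException, TimeoutException {
      stdin.write(inputLine);
      stdin.newLine();
      stdin.flush();
      Future<String> reply = executor.submit(stdout::readLine);
      try {
        return reply.get(timeoutMs, TimeUnit.MILLISECONDS);
      } catch (ExecutionException e) {
        throw new IOException("Script read failed", e.getCause());
      } catch (TimeoutException e) {
        reply.cancel(true);
        throw e;
      }
    }
  }
}
```

A bolt's execute() or a Stellar function body would then amount to a single
ScriptProcessCache.get(uid, command).call(jsonLine, timeoutMs) per message, rather than
forking a new process every time.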


On January 3, 2017 at 15:17:59, zeo...@gmail.com (zeo...@gmail.com) wrote:

Right, that definitely is more efficient, but part of the point here is to
lower the barrier of entry to using Metron.

It makes Metron's triage abilities more flexible and allows a user to reuse
existing code quickly and easily.  Having this available for PoC,
prototyping, and low volume environments or situations (only when threat
score is 100, for instance) is important, as it lowers the barrier to entry
of migrating a company to a Metron environment.

I see this as a tradeoff where I would prioritize ease of use over
efficiency.  There's nothing wrong with making both options available, at
some point, and making their different use cases clear.

Jon

On Tue, Jan 3, 2017 at 1:47 PM Matt Foley  wrote:

Well, yes :-)
And clearly it should always be more efficient to write a custom bolt in
Java than to invoke a script and manage it.

--Matt

From: Otto Fowler 
Date: Tuesday, January 3, 2017 at 7:08 AM
To: "dev@metron.incubator.apache.org" ,
Matt Foley 
Subject: Re: Custom Storm Topologies

Wouldn’t that be a bolt?


On January 2, 2017 at 14:39:34, Matt Foley (ma...@apache.org) wrote:
Should we consider a script calling capability that can launch a streaming
script and keep it alive and fed, long-term, rather than launching the
script anew every time the Stellar function is invoked? I’m thinking two
basic rules: Write a line, read a line; and always have a timeout. Prob
need a UID of some sort for a cache of running process objects.

--Matt

On 1/2/17, 8:50 AM, "Carolyn Duby"  wrote:


Inserting a script inline is ok for low throughput and prototyping but once
you get higher throughput (millions of events per second), it’s probably
going to be a bottleneck.


For Metron-571 you might want to consider a java based extension plugin
similar to Eclipse plugins.

Thanks
Carolyn

On 12/31/16, 5:22 PM, "Tyler Moore"  wrote:

>Thanks Jon,
>
>I'll look over the tutorial and put something together for the SHELL_EXEC
>stellar function.
>I don't believe I have permissions to assign in Jira if you want to assign
>to me my username is devopsec.
>I'll post back details and we can review security issues
>
>Regards,
>
>Tyler Moore
>Software Engineer
>Phone: 248-909-2769 <(248)%20909-2769>
>Email: moore.ty...@goflyball.com
>
>
>On Sat, Dec 31, 2016 at 9:46 AM, zeo...@gmail.com  wrote:
>
>> Casey did a tutorial on how to add your own Stellar function here
>>  - there is not an existing
>> function that does this (current functions are listed here
>> > metron-platform/metron-common#stellar-core-functions>).
>> I noticed that some of the Stellar function documentation was a bit dated
>> so I've opened a PR to update it here
>> .
>>
>> As this is something I need as well, I'd be happy to assist you where I
>> can. Perhaps you want to self-assign METRON-571
>> ? I do have some
>> security concerns with a SHELL_EXEC function because it could result in
RCE
>> - if that's the route you go I could probably help with a thorough secure
>> code review.
>>
>> Jon
>>
>> On Fri, Dec 30, 2016 at 10:43 PM Tyler Moore 
wrote:
>>
>> Thank you everyone for your suggestions,
>>
>> I believe that kicking off the function via stellar would be the optimal
>> solution. If anyone has an example of calling external code via stellar
>> that would be very helpful. Thanks!
>>
>> Regards,
>>
>> Tyler Moore
>> IT Specialist
>> tyler.math...@yahoo.com
>> 248-909-2769 <(248)%20909-2769> <(248)%20909-2769>
>>
>> > On Dec 30, 2016, at 17:54, Otto Fowler  wrote:
>> >
>> > They are all extension points.
>> >
>> >> On December 30, 2016 at 16:34:58, zeo...@gmail.com (zeo...@gmail.com)
>> wrote:
>> >>
>> >> Right but unless I'm missing something, both of those options are more
>> >> rigid and the MaaS service would have an unnecessary delay as opposed
to
>> >> doing it entirely in Stellar. Unless there's a reason to do otherwise
>> that
>> >> I'm missing, I would think doing this in Stellar gives you a more
timely
>> >> and (re)configurable end result.
>> >>
>> >> Jon
>> >>
>> >>> On Fri, Dec 30, 2016, 16:22 Otto Fowler 
>> wrote:
>> >>>
>> >>> I think there are a couple of things you can do here. The way to
get
>> >>> something else into the split is to have another adapter to split to,
>> which
>> >>> is what I think you mean. You can also integrate 

Re: Custom Storm Topologies

2017-01-03 Thread zeo...@gmail.com
Right, that definitely is more efficient, but part of the point here is to
lower the barrier of entry to using Metron.

It makes Metron's triage abilities more flexible and allows a user to reuse
existing code quickly and easily.  Having this available for PoC,
prototyping, and low volume environments or situations (only when threat
score is 100, for instance) is important, as it lowers the barrier to entry
of migrating a company to a Metron environment.

I see this as a tradeoff where I would prioritize ease of use over
efficiency.  There's nothing wrong with making both options available, at
some point, and making their different use cases clear.

Jon

On Tue, Jan 3, 2017 at 1:47 PM Matt Foley  wrote:

Well, yes :-)
And clearly it should always be more efficient to write a custom bolt in
Java than to invoke a script and manage it.

--Matt

From: Otto Fowler 
Date: Tuesday, January 3, 2017 at 7:08 AM
To: "dev@metron.incubator.apache.org" ,
Matt Foley 
Subject: Re: Custom Storm Topologies

Wouldn’t that be a bolt?


On January 2, 2017 at 14:39:34, Matt Foley (ma...@apache.org) wrote:
Should we consider a script calling capability that can launch a streaming
script and keep it alive and fed, long-term, rather than launching the
script anew every time the Stellar function is invoked? I’m thinking two
basic rules: Write a line, read a line; and always have a timeout. Prob
need a UID of some sort for a cache of running process objects.

--Matt

On 1/2/17, 8:50 AM, "Carolyn Duby"  wrote:


Inserting a script inline is ok for low throughput and prototyping but once
you get higher throughput (millions of events per second), it’s probably
going to be a bottleneck.


For Metron-571 you might want to consider a java based extension plugin
similar to Eclipse plugins.

Thanks
Carolyn

On 12/31/16, 5:22 PM, "Tyler Moore"  wrote:

>Thanks Jon,
>
>I'll look over the tutorial and put something together for the SHELL_EXEC
>stellar function.
>I don't believe I have permissions to assign in Jira if you want to assign
>to me my username is devopsec.
>I'll post back details and we can review security issues
>
>Regards,
>
>Tyler Moore
>Software Engineer
>Phone: 248-909-2769 <(248)%20909-2769>
>Email: moore.ty...@goflyball.com
>
>
>On Sat, Dec 31, 2016 at 9:46 AM, zeo...@gmail.com  wrote:
>
>> Casey did a tutorial on how to add your own Stellar function here
>>  - there is not an existing
>> function that does this (current functions are listed here
>> > metron-platform/metron-common#stellar-core-functions>).
>> I noticed that some of the Stellar function documentation was a bit dated
>> so I've opened a PR to update it here
>> .
>>
>> As this is something I need as well, I'd be happy to assist you where I
>> can. Perhaps you want to self-assign METRON-571
>> ? I do have some
>> security concerns with a SHELL_EXEC function because it could result in
RCE
>> - if that's the route you go I could probably help with a thorough secure
>> code review.
>>
>> Jon
>>
>> On Fri, Dec 30, 2016 at 10:43 PM Tyler Moore 
wrote:
>>
>> Thank you everyone for your suggestions,
>>
>> I believe that kicking off the function via stellar would be the optimal
>> solution. If anyone has an example of calling external code via stellar
>> that would be very helpful. Thanks!
>>
>> Regards,
>>
>> Tyler Moore
>> IT Specialist
>> tyler.math...@yahoo.com
>> 248-909-2769 <(248)%20909-2769> <(248)%20909-2769>
>>
>> > On Dec 30, 2016, at 17:54, Otto Fowler  wrote:
>> >
>> > They are all extension points.
>> >
>> >> On December 30, 2016 at 16:34:58, zeo...@gmail.com (zeo...@gmail.com)
>> wrote:
>> >>
>> >> Right but unless I'm missing something, both of those options are more
>> >> rigid and the MaaS service would have an unnecessary delay as opposed
to
>> >> doing it entirely in Stellar. Unless there's a reason to do otherwise
>> that
>> >> I'm missing, I would think doing this in Stellar gives you a more
timely
>> >> and (re)configurable end result.
>> >>
>> >> Jon
>> >>
>> >>> On Fri, Dec 30, 2016, 16:22 Otto Fowler 
>> wrote:
>> >>>
>> >>> I think there are a couple of things you can do here. The way to
get
>> >>> something else into the split is to have another adapter to split to,
>> which
>> >>> is what I think you mean. You can also integrate with MaaS and create
>> a
>> >>> service that you can call via STELLAR.
>> >>>
>> >>>
>> >>>
>> >>> On December 30, 2016 at 15:08:48, Otto Fowler (
ottobackwa...@gmail.com
>> )
>> >>> wrote:
>> >>>
>> >>> Or a Maas service?
>> >>>
>> >>>
>> >>> On December 30, 2016 at 13:52:06, 

[GitHub] incubator-metron pull request #410: METRON-645 Unable to Start Fastcapa Test...

2017-01-03 Thread nickwallen
GitHub user nickwallen opened a pull request:

https://github.com/apache/incubator-metron/pull/410

METRON-645 Unable to Start Fastcapa Test Environment

The Fastcapa Test Environment could not be started.  There were three 
problems that needed to be addressed.

1. The version of DPDK being used (2.2.0) does not work with newer versions 
of the Linux kernel.  See [this 
thread](http://dpdk.org/ml/archives/dev/2016-June/040968.html) for more 
information. Updating to the latest DPDK release, 16.11 (three releases after 
2.2.0), resolved this problem.

2. The `install_pcap_replay` variable was not defined.  This variable is 
required and was added sometime after the creation of the Fastcapa test 
environment.

3. In the Fastcapa Ansible deployment scripts, I moved all tasks into a 
separate `tasks` block.  Previously, Ansible was more forgiving and allowed 
tasks to be embedded in the `roles` block.

Testing

The Ansible deployment includes a complete validation process at the tail 
end of deployment.  If the process completes successfully, then you know that 
it is working.  Look for the message "Successfully received a Kafka message 
from fastcapa!"  The message, or absence thereof, is hard to miss.

```
cd incubator-metron/metron-deployment/vagrant/fastcapa-test-platform
vagrant up
```



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/nickwallen/incubator-metron METRON-645

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-metron/pull/410.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #410


commit 88e378e7332f711b54982190d67b647210fcdabf
Author: Nick Allen 
Date:   2017-01-03T19:29:45Z

METRON-645 Unable to Start Fastcapa Test Environment






Re: Custom Storm Topologies

2017-01-03 Thread Matt Foley
Well, yes :-)  
And clearly it should always be more efficient to write a custom bolt in Java 
than to invoke a script and manage it.

--Matt

From: Otto Fowler 
Date: Tuesday, January 3, 2017 at 7:08 AM
To: "dev@metron.incubator.apache.org" , Matt 
Foley 
Subject: Re: Custom Storm Topologies

Wouldn’t that be a bolt?


On January 2, 2017 at 14:39:34, Matt Foley (ma...@apache.org) wrote:
Should we consider a script calling capability that can launch a streaming 
script and keep it alive and fed, long-term, rather than launching the script 
anew every time the Stellar function is invoked? I’m thinking two basic rules: 
Write a line, read a line; and always have a timeout. Prob need a UID of some 
sort for a cache of running process objects. 

--Matt 

On 1/2/17, 8:50 AM, "Carolyn Duby"  wrote: 


Inserting a script inline is ok for low throughput and prototyping but once you 
get higher throughput (millions of events per second), it’s probably going to 
be a bottleneck. 


For Metron-571 you might want to consider a java based extension plugin similar 
to Eclipse plugins. 

Thanks 
Carolyn 

On 12/31/16, 5:22 PM, "Tyler Moore"  wrote: 

>Thanks Jon, 
> 
>I'll look over the tutorial and put something together for the SHELL_EXEC 
>stellar function. 
>I don't believe I have permissions to assign in Jira if you want to assign 
>to me my username is devopsec. 
>I'll post back details and we can review security issues 
> 
>Regards, 
> 
>Tyler Moore 
>Software Engineer 
>Phone: 248-909-2769 
>Email: moore.ty...@goflyball.com 
> 
> 
>On Sat, Dec 31, 2016 at 9:46 AM, zeo...@gmail.com  wrote: 
> 
>> Casey did a tutorial on how to add your own Stellar function here 
>>  - there is not an existing 
>> function that does this (current functions are listed here 
>> > metron-platform/metron-common#stellar-core-functions>). 
>> I noticed that some of the Stellar function documentation was a bit dated 
>> so I've opened a PR to update it here 
>> . 
>> 
>> As this is something I need as well, I'd be happy to assist you where I 
>> can. Perhaps you want to self-assign METRON-571 
>> ? I do have some 
>> security concerns with a SHELL_EXEC function because it could result in RCE 
>> - if that's the route you go I could probably help with a thorough secure 
>> code review. 
>> 
>> Jon 
>> 
>> On Fri, Dec 30, 2016 at 10:43 PM Tyler Moore  wrote: 
>> 
>> Thank you everyone for your suggestions, 
>> 
>> I believe that kicking off the function via stellar would be the optimal 
>> solution. If anyone has an example of calling external code via stellar 
>> that would be very helpful. Thanks! 
>> 
>> Regards, 
>> 
>> Tyler Moore 
>> IT Specialist 
>> tyler.math...@yahoo.com 
>> 248-909-2769 <(248)%20909-2769> 
>> 
>> > On Dec 30, 2016, at 17:54, Otto Fowler  wrote: 
>> > 
>> > They are all extension points. 
>> > 
>> >> On December 30, 2016 at 16:34:58, zeo...@gmail.com (zeo...@gmail.com) 
>> wrote: 
>> >> 
>> >> Right but unless I'm missing something, both of those options are more 
>> >> rigid and the MaaS service would have an unnecessary delay as opposed to 
>> >> doing it entirely in Stellar. Unless there's a reason to do otherwise 
>> that 
>> >> I'm missing, I would think doing this in Stellar gives you a more timely 
>> >> and (re)configurable end result. 
>> >> 
>> >> Jon 
>> >> 
>> >>> On Fri, Dec 30, 2016, 16:22 Otto Fowler  
>> wrote: 
>> >>> 
>> >>> I think there are a couple of things you can do here. The way to get 
>> >>> something else into the split is to have another adapter to split to, 
>> which 
>> >>> is what I think you mean. You can also integrate with MaaS and create 
>> a 
>> >>> service that you can call via STELLAR. 
>> >>> 
>> >>> 
>> >>> 
>> >>> On December 30, 2016 at 15:08:48, Otto Fowler (ottobackwa...@gmail.com 
>> ) 
>> >>> wrote: 
>> >>> 
>> >>> Or a Maas service? 
>> >>> 
>> >>> 
>> >>> On December 30, 2016 at 13:52:06, zeo...@gmail.com (zeo...@gmail.com) 
>> >>> wrote: 
>> >>> 
>> >>> Depending on the details it sounds like a much simpler solution would 
>> be 
>> >>> to 
>> >>> handle this in a Stellar function. 
>> >>> 
>> >>> Jon 
>> >>> 
>>  On Fri, Dec 30, 2016, 13:27 Tyler Moore  wrote: 
>>  
>>  Happy Holidays Metron Devs! 
>>  
>>  Could anyone lend me some guidance on customizing the storm topologies 
>> >>> in 
>>  metron? What I am trying to accomplish: 
>>  
>>  1) Add a method to the threat intel joiner bolt that sends an http 
>> post 
>>  with the score of the threat to a remote rest api. This will 
>> >>> 

[GitHub] incubator-metron issue #393: METRON-622: Create a Metron Docker Compose appl...

2017-01-03 Thread kylerichardson
Github user kylerichardson commented on the issue:

https://github.com/apache/incubator-metron/pull/393
  
I've created METRON-646 for the elasticsearch image customizations. I 
already have a start on those changes.




[GitHub] incubator-metron issue #393: METRON-622: Create a Metron Docker Compose appl...

2017-01-03 Thread ottobackwards
Github user ottobackwards commented on the issue:

https://github.com/apache/incubator-metron/pull/393
  
Maybe we can create jiras for these, and someone else might help?




[GitHub] incubator-metron issue #393: METRON-622: Create a Metron Docker Compose appl...

2017-01-03 Thread kylerichardson
Github user kylerichardson commented on the issue:

https://github.com/apache/incubator-metron/pull/393
  
I've run this up and successfully tested it using the examples provided in 
the README. It works as documented on docker-machine/boot2docker. Nice job.

One showstopper for me: I can't seem to find the topology logs in the storm 
container. I checked /var/log/storm and no topology-specific logs were ever 
written. For debugging new parsers, etc., this will be important to have 
available.

I also want to highlight a few nice-to-haves that I would be perfectly 
happy submitting as separate, follow-on PRs.
- Customize the elasticsearch image to (1) have the elasticsearch-head plugin 
installed as part of the image build, (2) copy the es_templates into the image 
so they are available, and (3) for bonus points, deploy the templates on container 
start
- Load zookeeper config on container start
- Improve kafkazk Dockerfile for local docker-engine (Linux); current 
problem is that DOCKER_HOST in this case defaults to empty string and the 
default argument in the Dockerfile is never hit
- Add an HDFS container to allow for complete testing of the indexing 
topology





[GitHub] incubator-metron issue #393: METRON-622: Create a Metron Docker Compose appl...

2017-01-03 Thread ottobackwards
Github user ottobackwards commented on the issue:

https://github.com/apache/incubator-metron/pull/393
  
ok, I was not using the updated examples in the README to verify the PR.  
It was not clear to me, after starting everything up, what I should expect to 
be working and what should not be working.  I will go back and do that to get the 
context. 




[GitHub] incubator-metron issue #393: METRON-622: Create a Metron Docker Compose appl...

2017-01-03 Thread ottobackwards
Github user ottobackwards commented on the issue:

https://github.com/apache/incubator-metron/pull/393
  
If the indices are not installed, then the topologies are moot.  I am 
sorry, I don't mean to be difficult; I am just not sure whether it is working, and I 
don't understand what the result should be, or how to tell if it did not work.




Re: Long-term storage for enriched data

2017-01-03 Thread zeo...@gmail.com
For those interested, I ended up finding a recording of the talk itself
when doing some Avro research - https://www.youtube.com/watch?v=tB28rPTvRiI

Jon

On Sun, Jan 1, 2017 at 8:41 PM Matt Foley  wrote:

> I’m not an expert on these things, but my understanding is that Avro and
> ORC serve many of the same needs.  The biggest difference is that ORC is
> columnar, and Avro isn’t.  Avro, ORC, and Parquet were compared in detail
> at last year’s Hadoop Summit; the slideshare prezo is here:
> http://www.slideshare.net/HadoopSummit/file-format-benchmark-avro-json-orc-parquet
>
> Its conclusion: “For complex tables with common strings, Avro with Snappy
> is a good fit.  For other tables [or when applications “just need a few
> columns” of the tables], ORC with Zlib is a good fit.”  (The addition in
> square brackets incorporates a quote from another part of the prezo.)  But
> do look at the prezo please, it gives detailed benchmarks showing when each
> one is better.
>
> --Matt
>
> On 1/1/17, 5:18 AM, "zeo...@gmail.com"  wrote:
>
> I don't recall a conversation on that product specifically, but I've
> definitely brought up the need to search HDFS from time to time.
> Things
> like Spark SQL, Hive, Oozie have been discussed, but Avro is new to me
> I'll
> have to look into it.  Are you able to summarize it's benefits?
>
> Jon
>
> On Wed, Dec 28, 2016, 14:45 Kyle Richardson  >
> wrote:
>
> > This thread got me thinking... there are likely a fair number of use
> cases
> > for searching and analyzing the output stored in HDFS. Dima's use
> case is
> > certainly one. Has there been any discussion on the use of Avro to
> store
> > the output in HDFS? This would likely require an expansion of the
> current
> > json schema.
> >
> > -Kyle
> >
> > On Thu, Dec 22, 2016 at 5:53 PM, Casey Stella 
> wrote:
> >
> > > Oozie (or something like it) would appear to me to be the correct
> tool
> > > here.  You are likely moving files around and pinning up hive
> tables:
> > >
> > >- Moving the data written in HDFS from
> /apps/metron/enrichment/${
> > > sensor}
> > >to another directory in HDFS
> > >- Running a job in Hive or pig or spark to take the JSON blobs,
> map
> > them
> > >to rows and pin it up as an ORC table for downstream analytics
> > >
> > > NiFi is mostly about getting data in the cluster, not really for
> > scheduling
> > > large-scale batch ETL, I think.
> > >
> > > Casey
> > >
> > > On Thu, Dec 22, 2016 at 5:18 PM, Dima Kovalyov <
> dima.koval...@sstech.us>
> > > wrote:
> > >
> > > > Thank you for reply Carolyn,
> > > >
> > > > Currently for the test purposes we enrich flow with Geo and
> ThreatIntel
> > > > malware IP, but plan to expand this further.
> > > >
> > > > Our dev team is working on Oozie job to process this. So
> meanwhile I
> > > > wonder if I could use NiFi for this purpose (because we already
> using
> > it
> > > > for data ingest and stream).
> > > >
> > > > Could you elaborate why it may be overkill? The idea is to have
> > > > everything in one place instead of hacking into Metron libraries
> and
> > > code.
> > > >
> > > > - Dima
> > > >
> > > > On 12/22/2016 02:26 AM, Carolyn Duby wrote:
> > > > > Hi Dima -
> > > > >
> > > > > What type of analytics are you looking to do?  Is the
> normalized
> > format
> > > > not working?  You could use an oozie or spark job to create
> derivative
> > > > tables.
> > > > >
> > > > > Nifi may be overkill for breaking up the kafka stream.  Spark
> > streaming
> > > > may be easier.
> > > > >
> > > > > Thanks
> > > > > Carolyn
> > > > >
> > > > >
> > > > >
> > > > > Sent from my Verizon, Samsung Galaxy smartphone
> > > > >
> > > > >
> > > > >  Original message 
> > > > > From: Dima Kovalyov 
> > > > > Date: 12/21/16 6:28 PM (GMT-05:00)
> > > > > To: dev@metron.incubator.apache.org
> > > > > Subject: Long-term storage for enriched data
> > > > >
> > > > > Hello,
> > > > >
> > > > > Currently we are researching fast and resources efficient way
> to save
> > > > > enriched data in Hive for further Analytics.
> > > > >
> > > > > There are two scenarios that we consider:
> > > > > a) Use Ozzie Java job that uses Metron enrichment classes to
> > "manually"
> > > > > enrich each line of the source data that is picked up from the
> source
> > > > > dir (the one that we have developed already and using). That is
> > > > > something that we developed on our own. Downside: custom code
> that
> > > built
> > > > > on top of Metron source code.
> > > > >
> > > > > b) 

[GitHub] incubator-metron issue #393: METRON-622: Create a Metron Docker Compose appl...

2017-01-03 Thread merrimanr
Github user merrimanr commented on the issue:

https://github.com/apache/incubator-metron/pull/393
  
I would consider the topologies installed, just not running.  But yes, no 
data flowing end to end by default.  




Re: Long-term storage for enriched data

2017-01-03 Thread zeo...@gmail.com
Right, I second that.  That was kind of my intent with my initial question
(although I did a bad job of making it clear) - Metron-specific
benefits/details for Avro use.

Sounds like it may make sense for someone to throw together a proposal doc?
Not volunteering though =)

Jon

On Tue, Jan 3, 2017 at 10:07 AM Otto Fowler  wrote:

> What I would like to see is something on using Avro with a non-static
> model, such as Metron would require should new enrichments, threat
> intelligence, Stellar capabilities, or sources change.
>
>
> On January 2, 2017 at 11:41:32, Carolyn Duby (cd...@hortonworks.com)
> wrote:
>
> Avro is a format that contains both the data and the schema. Here is a
> quick summary:
>
> https://avro.apache.org/docs/current/
>
>
> Thanks
> Carolyn
>
>
>
> On 1/1/17, 8:41 PM, "Matt Foley"  wrote:
>
> >I’m not an expert on these things, but my understanding is that Avro and
> ORC serve many of the same needs. The biggest difference is that ORC is
> columnar, and Avro isn’t. Avro, ORC, and Parquet were compared in detail at
> last year’s Hadoop Summit; the slideshare prezo is here:
>
> http://www.slideshare.net/HadoopSummit/file-format-benchmark-avro-json-orc-parquet
> >
> >Its conclusion: “For complex tables with common strings, Avro with Snappy
> is a good fit. For other tables [or when applications “just need a few
> columns” of the tables], ORC with Zlib is a good fit.” (The addition in
> square brackets incorporates a quote from another part of the prezo.) But
> do look at the prezo please, it gives detailed benchmarks showing when each
> one is better.
> >
> >--Matt
> >
> >On 1/1/17, 5:18 AM, "zeo...@gmail.com"  wrote:
> >
> > I don't recall a conversation on that product specifically, but I've
> > definitely brought up the need to search HDFS from time to time. Things
> > like Spark SQL, Hive, Oozie have been discussed, but Avro is new to me
> I'll
> > have to look into it. Are you able to summarize it's benefits?
> >
> > Jon
> >
> > On Wed, Dec 28, 2016, 14:45 Kyle Richardson 
> > wrote:
> >
> > > This thread got me thinking... there are likely a fair number of use
> cases
> > > for searching and analyzing the output stored in HDFS. Dima's use case
> is
> > > certainly one. Has there been any discussion on the use of Avro to
> store
> > > the output in HDFS? This would likely require an expansion of the
> current
> > > json schema.
> > >
> > > -Kyle
> > >
> > > On Thu, Dec 22, 2016 at 5:53 PM, Casey Stella 
> wrote:
> > >
> > > > Oozie (or something like it) would appear to me to be the correct
> tool
> > > > here. You are likely moving files around and pinning up hive tables:
> > > >
> > > > - Moving the data written in HDFS from /apps/metron/enrichment/${
> > > > sensor}
> > > > to another directory in HDFS
> > > > - Running a job in Hive or pig or spark to take the JSON blobs, map
> > > them
> > > > to rows and pin it up as an ORC table for downstream analytics
> > > >
> > > > NiFi is mostly about getting data in the cluster, not really for
> > > scheduling
> > > > large-scale batch ETL, I think.
> > > >
> > > > Casey
> > > >
> > > > On Thu, Dec 22, 2016 at 5:18 PM, Dima Kovalyov <
> dima.koval...@sstech.us>
> > > > wrote:
> > > >
> > > > > Thank you for reply Carolyn,
> > > > >
> > > > > Currently for the test purposes we enrich flow with Geo and
> ThreatIntel
> > > > > malware IP, but plan to expand this further.
> > > > >
> > > > > Our dev team is working on Oozie job to process this. So meanwhile
> I
> > > > > wonder if I could use NiFi for this purpose (because we already
> using
> > > it
> > > > > for data ingest and stream).
> > > > >
> > > > > Could you elaborate why it may be overkill? The idea is to have
> > > > > everything in one place instead of hacking into Metron libraries
> and
> > > > code.
> > > > >
> > > > > - Dima
> > > > >
> > > > > On 12/22/2016 02:26 AM, Carolyn Duby wrote:
> > > > > > Hi Dima -
> > > > > >
> > > > > > What type of analytics are you looking to do? Is the normalized
> > > format
> > > > > not working? You could use an oozie or spark job to create
> derivative
> > > > > tables.
> > > > > >
> > > > > > Nifi may be overkill for breaking up the kafka stream. Spark
> > > streaming
> > > > > may be easier.
> > > > > >
> > > > > > Thanks
> > > > > > Carolyn
> > > > > >
> > > > > >
> > > > > >
> > > > > > Sent from my Verizon, Samsung Galaxy smartphone
> > > > > >
> > > > > >
> > > > > >  Original message 
> > > > > > From: Dima Kovalyov 
> > > > > > Date: 12/21/16 6:28 PM (GMT-05:00)
> > > > > > To: dev@metron.incubator.apache.org
> > > > > > Subject: Long-term storage for enriched data
> > > > > >
> > > > > > Hello,
> > > > > >
> > > > > > Currently we are researching fast and resources efficient way to
> save
> > > > > > enriched data in Hive for further Analytics.
> > > > > >
> > > > > > 

Re: [GitHub] incubator-metron issue #393: METRON-622: Create a Metron Docker Compose appl...

2017-01-03 Thread Ryan Merriman
I would consider the topologies installed, just not running.  But yes, no
data flowing end to end by default.

Ryan

On Thu, Dec 22, 2016 at 11:42 AM, ottobackwards  wrote:

> Github user ottobackwards commented on the issue:
>
> https://github.com/apache/incubator-metron/pull/393
>
> So just to be clear, the end result of this is the cluster deployed,
> but nothing installed?  No topologies, no indices in ES etc?
>


Re: Custom Storm Topologies

2017-01-03 Thread Otto Fowler
Wouldn’t that be a bolt?


On January 2, 2017 at 14:39:34, Matt Foley (ma...@apache.org) wrote:

Should we consider a script calling capability that can launch a streaming
script and keep it alive and fed, long-term, rather than launching the
script anew every time the Stellar function is invoked? I’m thinking two
basic rules: Write a line, read a line; and always have a timeout. Prob
need a UID of some sort for a cache of running process objects.

--Matt

On 1/2/17, 8:50 AM, "Carolyn Duby"  wrote:


Inserting a script inline is ok for low throughput and prototyping but once
you get higher throughput (millions of events per second), it’s probably
going to be a bottleneck.


For Metron-571 you might want to consider a java based extension plugin
similar to Eclipse plugins.

Thanks
Carolyn

On 12/31/16, 5:22 PM, "Tyler Moore"  wrote:

>Thanks Jon,
>
>I'll look over the tutorial and put something together for the SHELL_EXEC
>stellar function.
>I don't believe I have permissions to assign in Jira if you want to assign
>to me my username is devopsec.
>I'll post back details and we can review security issues
>
>Regards,
>
>Tyler Moore
>Software Engineer
>Phone: 248-909-2769
>Email: moore.ty...@goflyball.com
>
>
>On Sat, Dec 31, 2016 at 9:46 AM, zeo...@gmail.com 
wrote:
>
>> Casey did a tutorial on how to add your own Stellar function here
>>  - there is not an existing
>> function that does this (current functions are listed here
>> > metron-platform/metron-common#stellar-core-functions>).
>> I noticed that some of the Stellar function documentation was a bit
dated
>> so I've opened a PR to update it here
>> .
>>
>> As this is something I need as well, I'd be happy to assist you where I
>> can. Perhaps you want to self-assign METRON-571
>> ? I do have some
>> security concerns with a SHELL_EXEC function because it could result in
RCE
>> - if that's the route you go I could probably help with a thorough
secure
>> code review.
>>
>> Jon
>>
>> On Fri, Dec 30, 2016 at 10:43 PM Tyler Moore 
wrote:
>>
>> Thank you everyone for your suggestions,
>>
>> I believe that kicking off the function via stellar would be the optimal
>> solution. If anyone has an example of calling external code via stellar
>> that would be very helpful. Thanks!
>>
>> Regards,
>>
>> Tyler Moore
>> IT Specialist
>> tyler.math...@yahoo.com
>> 248-909-2769 <(248)%20909-2769>
>>
>> > On Dec 30, 2016, at 17:54, Otto Fowler 
wrote:
>> >
>> > They are all extension points.
>> >
>> >> On December 30, 2016 at 16:34:58, zeo...@gmail.com (zeo...@gmail.com)
>> wrote:
>> >>
>> >> Right but unless I'm missing something, both of those options are
more
>> >> rigid and the MaaS service would have an unnecessary delay as opposed
to
>> >> doing it entirely in Stellar. Unless there's a reason to do otherwise
>> that
>> >> I'm missing, I would think doing this in Stellar gives you a more
timely
>> >> and (re)configurable end result.
>> >>
>> >> Jon
>> >>
>> >>> On Fri, Dec 30, 2016, 16:22 Otto Fowler 
>> wrote:
>> >>>
>> >>> I think there are a couple of things you can do here. The way to
get
>> >>> something else into the split is to have another adapter to split
to,
>> which
>> >>> is what I think you mean. You can also integrate with MaaS and
create
>> a
>> >>> service that you can call via STELLAR.
>> >>>
>> >>>
>> >>>
>> >>> On December 30, 2016 at 15:08:48, Otto Fowler (
ottobackwa...@gmail.com
>> )
>> >>> wrote:
>> >>>
>> >>> Or a Maas service?
>> >>>
>> >>>
>> >>> On December 30, 2016 at 13:52:06, zeo...@gmail.com (zeo...@gmail.com)

>> >>> wrote:
>> >>>
>> >>> Depending on the details it sounds like a much simpler solution
would
>> be
>> >>> to
>> >>> handle this in a Stellar function.
>> >>>
>> >>> Jon
>> >>>
>>  On Fri, Dec 30, 2016, 13:27 Tyler Moore 
wrote:
>> 
>>  Happy Holidays Metron Devs!
>> 
>>  Could anyone lend me some guidance on customizing the storm
topologies
>> >>> in
>>  metron? What I am trying to accomplish:
>> 
>>  1) Add a method to the threat intel joiner bolt that sends an http
>> post
>>  with the score of the threat to a remote rest api. This will
>> >>> conditionally
>>  trigger notifications based on user settings in another database
(the
>>  backend processing logic is on another platform).
>>  The score should be available within the JSONObject but I am not an
>> >>> expert
>>  with storm and I am not completely understanding what conditions
>> >>> constitute
>>  when the threat feed is considered an "alert" in metron. Please
>> clarify.
>> 
>>  2) How would I add an external dependency, my http rest java 

Re: Long-term storage for enriched data

2017-01-03 Thread Otto Fowler
What I would like to see is something on using Avro with a non-static
model, such as Metron would require should new enrichments, threat
intelligence, Stellar capabilities, or sources change.
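
(As a concrete illustration of "data plus schema" from the Java side, a small hedged
sketch: the record layout for an enriched message is invented for this example, and the
open-ended map field is one possible way to absorb new enrichments without freezing the
schema.)

```java
import java.io.File;
import java.util.HashMap;
import java.util.Map;

import org.apache.avro.Schema;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;

public class EnrichedMessageAvroWriter {

  // The schema travels with the data, so downstream readers need no side channel.
  // Field names are invented for this example; the map field is one way to hold
  // enrichments that are not known up front.
  private static final Schema SCHEMA = new Schema.Parser().parse(
      "{ \"type\": \"record\", \"name\": \"EnrichedMessage\", \"fields\": ["
    + "  { \"name\": \"source_type\", \"type\": \"string\" },"
    + "  { \"name\": \"timestamp\",   \"type\": \"long\" },"
    + "  { \"name\": \"enrichments\", \"type\": { \"type\": \"map\", \"values\": \"string\" } }"
    + "] }");

  public static void main(String[] args) throws Exception {
    GenericRecord record = new GenericData.Record(SCHEMA);
    record.put("source_type", "bro");
    record.put("timestamp", 1483465140000L);
    Map<String, String> enrichments = new HashMap<>();
    enrichments.put("geo.country", "US");
    record.put("enrichments", enrichments);

    try (DataFileWriter<GenericRecord> writer =
             new DataFileWriter<>(new GenericDatumWriter<GenericRecord>(SCHEMA))) {
      writer.create(SCHEMA, new File("enriched.avro"));  // schema is embedded in the file header
      writer.append(record);
    }
  }
}
```

Because the schema is embedded in the file header, downstream Hive or Spark readers can
pick the files up without a side channel, and optional fields with defaults can be added
later under Avro's schema-resolution rules.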


On January 2, 2017 at 11:41:32, Carolyn Duby (cd...@hortonworks.com) wrote:

Avro is a format that contains both the data and the schema. Here is a
quick summary:

https://avro.apache.org/docs/current/


Thanks
Carolyn



On 1/1/17, 8:41 PM, "Matt Foley"  wrote:

>I’m not an expert on these things, but my understanding is that Avro and
ORC serve many of the same needs. The biggest difference is that ORC is
columnar, and Avro isn’t. Avro, ORC, and Parquet were compared in detail at
last year’s Hadoop Summit; the slideshare prezo is here:
http://www.slideshare.net/HadoopSummit/file-format-benchmark-avro-json-orc-parquet
>
>Its conclusion: “For complex tables with common strings, Avro with Snappy
is a good fit. For other tables [or when applications “just need a few
columns” of the tables], ORC with Zlib is a good fit.” (The addition in
square brackets incorporates a quote from another part of the prezo.) But
do look at the prezo please, it gives detailed benchmarks showing when each
one is better.
>
>--Matt
>
>On 1/1/17, 5:18 AM, "zeo...@gmail.com"  wrote:
>
> I don't recall a conversation on that product specifically, but I've
> definitely brought up the need to search HDFS from time to time. Things
> like Spark SQL, Hive, Oozie have been discussed, but Avro is new to me
I'll
> have to look into it. Are you able to summarize it's benefits?
>
> Jon
>
> On Wed, Dec 28, 2016, 14:45 Kyle Richardson 
> wrote:
>
> > This thread got me thinking... there are likely a fair number of use
cases
> > for searching and analyzing the output stored in HDFS. Dima's use case
is
> > certainly one. Has there been any discussion on the use of Avro to
store
> > the output in HDFS? This would likely require an expansion of the
current
> > json schema.
> >
> > -Kyle
> >
> > On Thu, Dec 22, 2016 at 5:53 PM, Casey Stella 
wrote:
> >
> > > Oozie (or something like it) would appear to me to be the correct
tool
> > > here. You are likely moving files around and pinning up hive tables:
> > >
> > > - Moving the data written in HDFS from /apps/metron/enrichment/${
> > > sensor}
> > > to another directory in HDFS
> > > - Running a job in Hive or pig or spark to take the JSON blobs, map
> > them
> > > to rows and pin it up as an ORC table for downstream analytics
> > >
> > > NiFi is mostly about getting data in the cluster, not really for
> > scheduling
> > > large-scale batch ETL, I think.
> > >
> > > Casey
> > >
> > > On Thu, Dec 22, 2016 at 5:18 PM, Dima Kovalyov <
dima.koval...@sstech.us>
> > > wrote:
> > >
> > > > Thank you for reply Carolyn,
> > > >
> > > > Currently for the test purposes we enrich flow with Geo and
ThreatIntel
> > > > malware IP, but plan to expand this further.
> > > >
> > > > Our dev team is working on Oozie job to process this. So meanwhile
I
> > > > wonder if I could use NiFi for this purpose (because we already
using
> > it
> > > > for data ingest and stream).
> > > >
> > > > Could you elaborate why it may be overkill? The idea is to have
> > > > everything in one place instead of hacking into Metron libraries
and
> > > code.
> > > >
> > > > - Dima
> > > >
> > > > On 12/22/2016 02:26 AM, Carolyn Duby wrote:
> > > > > Hi Dima -
> > > > >
> > > > > What type of analytics are you looking to do? Is the normalized
> > format
> > > > not working? You could use an oozie or spark job to create
derivative
> > > > tables.
> > > > >
> > > > > Nifi may be overkill for breaking up the kafka stream. Spark
> > streaming
> > > > may be easier.
> > > > >
> > > > > Thanks
> > > > > Carolyn
> > > > >
> > > > >
> > > > >
> > > > > Sent from my Verizon, Samsung Galaxy smartphone
> > > > >
> > > > >
> > > > >  Original message 
> > > > > From: Dima Kovalyov 
> > > > > Date: 12/21/16 6:28 PM (GMT-05:00)
> > > > > To: dev@metron.incubator.apache.org
> > > > > Subject: Long-term storage for enriched data
> > > > >
> > > > > Hello,
> > > > >
> > > > > Currently we are researching fast and resources efficient way to
save
> > > > > enriched data in Hive for further Analytics.
> > > > >
> > > > > There are two scenarios that we consider:
> > > > > a) Use Ozzie Java job that uses Metron enrichment classes to
> > "manually"
> > > > > enrich each line of the source data that is picked up from the
source
> > > > > dir (the one that we have developed already and using). That is
> > > > > something that we developed on our own. Downside: custom code
that
> > > built
> > > > > on top of Metron source code.
> > > > >
> > > > > b) Use NiFi to listen for indexing Kafka topic -> split stream by
> > > source
> > > > > type -> Put every source type in corresponding Hive table.
> > > > >
> > > > > I wonder, if someone was going 

[GitHub] incubator-metron issue #397: METRON-627: Add HyperLogLogPlus implementation ...

2017-01-03 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/incubator-metron/pull/397
  
@nickwallen Thanks for the feedback! I'm currently working on finishing 
up some performance metrics (p/sp values vs. accuracy, cardinality, and serialized 
memory consumption) in addition to an example utilizing the profiler. Your example 
above is exactly in line with what I was thinking might make for a good demo of 
this functionality.

Also, agreed on the single-line example - that was meant to make 
cut-and-paste easy. The unit and integration tests are much easier to read. 
I'll provide similar examples here.
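
(For readers following along, a rough sketch of the kind of measurement being
discussed, assuming the stream-lib HyperLogLogPlus implementation; the p/sp values and
item count are illustrative only, not numbers from this PR.)

```java
import com.clearspring.analytics.stream.cardinality.HyperLogLogPlus;

public class HllAccuracyCheck {

  public static void main(String[] args) throws Exception {
    int p = 14;   // precision: 2^p registers
    int sp = 25;  // sparse precision; 0 disables the sparse representation
    HyperLogLogPlus hll = new HyperLogLogPlus(p, sp);

    int trueCardinality = 1_000_000;
    for (int i = 0; i < trueCardinality; i++) {
      hll.offer("value-" + i);   // offer distinct items
    }

    long estimate = hll.cardinality();
    double errorPct = 100.0 * Math.abs(estimate - trueCardinality) / trueCardinality;
    int serializedBytes = hll.getBytes().length;  // rough cost of storing the sketch in a profile

    System.out.printf("estimate=%d error=%.2f%% serialized=%d bytes%n",
        estimate, errorPct, serializedBytes);
  }
}
```

Larger p buys accuracy at the cost of more registers, which is exactly the
accuracy-versus-serialized-size trade-off being measured.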


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: Confluence write access to a space

2017-01-03 Thread Matt Foley
Hi Dima,

I think it would make a lot of sense to make 642 a sub-task of 634.  Certainly 
it’s a needed improvement and there’s no need to wait for the laundry list of 
other stuff in 634.  In fact, shorter is better.  But gathering the pieces as 
subtasks will make it easier to track and avoid duplicating work.

Regarding 641, as I commented in the PR, I think the proposed change is only 
needed if the user is using Python 2.6.  But there’s no harm in making the 
change anyway.  Since it isn’t mentioned in 634, just go ahead and pursue it 
separately.

Thanks for asking.

Regarding elasticsearch_config_path in 
metron-deployment/roles/metron_streaming/defaults/main.yml, I don’t really 
know.  I think it is probably a reference to the location of elastic-env.sh, 
which is itself not actually used for anything as far as I can tell.  In 
METRON-634 I actually propose removing elastic-env.sh and its progenitor file, 
elastic-env.xml.  But other more knowledgeable people haven’t had the chance to 
chime in on that yet :-)

--Matt


On 1/2/17, 11:01 PM, "Dima Kovalyov"  wrote:

Thank you Matt,

I haven't seen 634 before. Should I merge my tickets as sub-tasks
addressing some of the points your bring up there?

Also, I have a note about elasticsearch_config_path which is set in
metron-deployment/roles/metron_streaming/defaults/main.yml. It seems
like it is not used anywhere in the code base.

Please let me know how I should proceed with two tickets I have created
that are relevant to yours.

- Dima

On 01/01/2017 04:43 AM, Matt Foley wrote:
> Hi Dima,
> Great to have the how-to doc in the wiki where it belongs.  Now we have a 
doc to edit as we improve the install process :-)
>
> Did you look at https://issues.apache.org/jira/browse/METRON-634 before 
opening METRON-642?  Please see my comments in the Jira for METRON-642.
> Thanks,
> --Matt
>
>
>
> On 12/30/16, 11:44 PM, "Dima Kovalyov"  wrote:
>
> Hey,
> 
> I wanted to finish what I've started with document for Metron with HDP
> 2.5, so I have migrated document (with minor text fixes and
> clarifications) to here:
> 
https://cwiki.apache.org/confluence/display/METRON/Metron+with+HDP+2.5+bare-metal+install
> Old google doc was replaced with the link to this article.
> 
> I also created a number of pull requests to fix minor bugs here and 
there
> and created these two tickets: METRON-641 and METRON-642.
> Please let me know if I did something out of proper procedure.
> 
> Also, I agree that we should eventually strip HDP related steps from 
the
> document, so in the end it will be like:
> 1. Build Mpack
> 2. Add to Ambari
> 3. Assigned Masters and Slave
> 4. PROFIT
> But since we are where we are, let's leave it like that and fix all 
the
> bugs first.
> 
> p.s. have a happy holidays everyone
> 
> - Dima
> 
> On 12/16/2016 04:21 AM, Matt Foley wrote:
> > I seem to have found the difficulty.  It will NOT show up on any 
system that has /bin/java defined, which may account for why other folks with 
Centos7 test systems aren’t seeing the behavior.
> >
> > On my Centos7 test system, it so happens that /bin/java is not 
defined, even though $JAVA_HOME is correctly defined, and “$JAVA_HOME/bin” is 
in the PATH.  In Centos7, when services launch through the (new in 7) systemctl 
process, it drops all inherited environment variables and starts over fresh.  
Although the systemd launch script 
/usr/lib/systemd/system/elasticsearch.service does read in the 
/etc/sysconfig/elasticsearch as an “EnvironmentFile”, it does not include 
JAVA_HOME.
> >
> > When, eventually, the user-level launcher script at 
/usr/share/elasticsearch/bin/elasticsearch gets invoked, JAVA_HOME is still 
undefined.  But it looks for $JAVA_HOME/bin/java, so if “/bin/java” is linked 
in the file system, then it’s good!  But if not, the launcher script dies.  
Regrettably that launcher script, even though it is fairly complex, does not 
write to any log file, and its stdout was closed long ago by the service-level 
launcher.  So I had to hack it to see what it was doing.
> >
> > The solution is to simply write JAVA_HOME={{java64_home}} into the 
elastic-sysconfig template.
> >
> > BTW, while munging thru code I reached the conclusion that 
elastic-env.sh is basically orphaned.  Does anyone know of scripts that source 
it? (Of course elastic-env.xml is still important, I’m only asking about the 
elastic-env.sh file templated from it.)
> >
> > Thanks,
> > --Matt
> >
> >
> > On 12/14/16, 2:41 PM, "Matt Foley"