from:"Nick Allen"

Re: [VOTE] Move Apache Metron to the Apache Attic and Dissolve PMC

2020-11-16 Thread Nick Allen

+1

On Mon, Nov 16, 2020 at 11:41 AM zeo...@gmail.com  wrote:

> +1
>
> --
> Jon Zeolla
> @jonzeolla
>
> PittSec | BSidesPGH | SteelCityInfoSec
>
> On Mon, Nov 16, 2020, 11:33 AM Casey Stella  wrote:
>
> > +1
> >
> > On Mon, Nov 16, 2020 at 09:01 Justin Leet  wrote:
> >
> > > Hi all,
> > >
> > > This is a vote thread to retire Metron to the Attic, and dissolve the
> > PMC.
> > > This follows a discussion thread on the dev list ([DISCUSS] Retire
> Metron
> > > to the Attic
> > > <
> > >
> >
> https://lists.apache.org/thread.html/reb31f643fac20d3ad09521fd702b19922412b7a4e8e08062968268c5%40%3Cdev.metron.apache.org%3E
> > > >).
> > > More details can be found in that discussion, but the most relevant
> link
> > is
> > > > the specific process at Moving a project to the Attic.
> > >
> > > As noted in the process page, this is a PMC vote. As usual, feel
> > encouraged
> > > to contribute non-binding votes.
> > >
> > > The vote will run 72 hours, until Nov 19th at 9:00 am EST.
> > >
> > > Thank you,
> > > Justin
> > >
> >
>

Re: Development Activity has dropped to effectively 0, what should we do?

2020-04-21 Thread Nick Allen

Hi Tom -

>  Do you or anyone have enough experience to judge if it is possible to
leverage Ansible as a replacement to deploy a working cluster?

Yes, I worked a lot on the Ansible mechanism in the early days of Metron.
This was the primary deployment mechanism before we had the Ambari MPack.

We found it very difficult to use Ansible to create a one-size-fits-all
deployment solution. It's possible, but very difficult to get a solution
that doesn't take close monitoring and manual workarounds when attempting
to use it across environments of different sizes and shapes. In terms of
usability, the Ambari MPack was a big step up in my opinion.


>  perhaps a dedicated docker image that is designed to connect with other
dockerized applications such as Storm, Kafka, etc..?

Yes, I think that would be the way to go for a dev environment. We would be
able to use community supported containers for most of our underlying
platform needs. Unfortunately, this alone would not help anyone deploy
Metron on a cluster.




On Tue, Apr 21, 2020 at 9:08 AM Yerex, Tom  wrote:

> Hi Nick,
>
> I see there is a lot of work done using Ansible in the repository. Do you
> or anyone have enough experience to judge if it is possible to leverage
> Ansible as a replacement to deploy a working cluster?
>
> Now that I am typing this out, I wonder if docker might be a solution that
> would work? I don't have much experience with docker, perhaps a dedicated
> docker image that is designed to connect with other dockerized applications
> such as Storm, Kafka, etc..?
>
> --Tom.
>
> On 2020-04-17, 11:27 AM, "Nick Allen"  wrote:
>
> This is a good discussion and one that I haven't fully grappled with
> in my
> own mind yet. I'll have more to add, but I just want to chime in on the
> topic of Ambari at this point.
>
> ### Ambari and the Paywall
>
> The problem with Ambari is that its installation mechanism requires a
> repository of compiled packages (RPMs, DEBs, etc.). To install the
> underlying platform dependencies (like Kafka, HBase, Storm, Zk, etc) we
> relied on binary packages that were made freely available by
> Cloudera/Hortonworks. As of this past January, those packages are now
> behind a paywall.
>
> Due to the paywall, installing your own HDP cluster with Ambari is now
> effectively dead.  I am not sure if legacy versions of Kafka, HBase,
> Storm,
> etc will continue to be freely available, but even if so, we cannot
> continue to rely on this mechanism if new versions and security updates
> will not be made available.
>
> The Apache Metron project does not publish compiled binaries or
> packages
> either.  We do make the code freely available to allow users to build
> and
> publish their own Metron packages.   But even with this capability,
> unless
> you have a means to install the underlying platform dependencies via
> Ambari, installing Metron with Ambari has little value.
>
> Unfortunately, I don't see a feasible path forward for Metron's Ambari
> MPack.
>
> ### Dev Environment
>
> This not only impacts the users of Apache Metron, this impacts
> contributors
> also. Our primary development environment relies on that Ambari
> MPack.  To
> continue development on any of the components of Apache Metron, we
> would
> need to build an alternative development environment that can function
> despite the paywall.  That could take many shapes, but in my opinion it
> would be a blocker for continuing any development on Apache Metron,
> unfortunately.
>
> Please do let me know if anyone disagrees or can think of an
> alternative
> approach that would allow the current Ambari MPack to remain viable.
>
>
> On Thu, Apr 16, 2020 at 4:34 PM Dima Kovalyov 
> wrote:
>
> >   - Dropping Ambari.
> >
> > I like the progress that Apache did with Ambari in 2.7. And I don't
> know a
> > better installer/manager for all the services (we use other Hadoop
> eco
> > services besides Metron).
> >
> > Sometimes it's buggy, agents get stuck or server needs reboot from time to
> > time, mpacks break some functionality. But overall I feel this is the
> > direction for central management and orchestration.
> >
> > - Dima
> >
> > On Wed, Apr 15, 2020, 12:45 Justin Leet 
> wrote:
> >
> > > This is a bit off the top of my head, but I'd agree with pretty much
> > > all of the points on what's bringing a lot of overhead.  There's probably
> > > also

Re: Development Activity has dropped to effectively 0, what should we do?

2020-04-17 Thread Nick Allen

This is a good discussion and one that I haven't fully grappled with in my
own mind yet. I'll have more to add, but I just want to chime in on the
topic of Ambari at this point.

### Ambari and the Paywall

The problem with Ambari is that its installation mechanism requires a
repository of compiled packages (RPMs, DEBs, etc.). To install the
underlying platform dependencies (like Kafka, HBase, Storm, Zk, etc) we
relied on binary packages that were made freely available by
Cloudera/Hortonworks. As of this past January, those packages are now
behind a paywall.

Due to the paywall, installing your own HDP cluster with Ambari is now
effectively dead.  I am not sure if legacy versions of Kafka, HBase, Storm,
etc will continue to be freely available, but even if so, we cannot
continue to rely on this mechanism if new versions and security updates
will not be made available.

The Apache Metron project does not publish compiled binaries or packages
either.  We do make the code freely available to allow users to build and
publish their own Metron packages.   But even with this capability, unless
you have a means to install the underlying platform dependencies via
Ambari, installing Metron with Ambari has little value.

Unfortunately, I don't see a feasible path forward for Metron's Ambari
MPack.

### Dev Environment

This not only impacts the users of Apache Metron, this impacts contributors
also. Our primary development environment relies on that Ambari MPack.  To
continue development on any of the components of Apache Metron, we would
need to build an alternative development environment that can function
despite the paywall.  That could take many shapes, but in my opinion it
would be a blocker for continuing any development on Apache Metron,
unfortunately.

Please do let me know if anyone disagrees or can think of an alternative
approach that would allow the current Ambari MPack to remain viable.

On Thu, Apr 16, 2020 at 4:34 PM Dima Kovalyov  wrote:

>   - Dropping Ambari.
>
> I like the progress that Apache did with Ambari in 2.7. And I don't know a
> better installer/manager for all the services (we use other Hadoop eco
> services besides Metron).
>
> Sometimes it's buggy, agents get stuck or server needs reboot from time to
> time, mpacks break some functionality. But overall I feel this is the
> direction for central management and orchestration.
>
> - Dima
>
> On Wed, Apr 15, 2020, 12:45 Justin Leet  wrote:
>
> > This is a bit off the top of my head, but I'd agree with pretty much all
> > of the points on what's bringing a lot of overhead.  There's probably also a
> > worthwhile discussion about what value we're shooting for the project to
> > provide to people that influences what stays/goes.
> >
> > Thinking out loud a bit
> >
> >- Dropping Storm and moving to Spark drops the very hard to
> >tune/manage/troubleshoot Storm.
> >- Dropping the UIs (and making SQL the external interface) pretty much
> >implies dropping the REST APIs and ES/Solr.  ES/Solr have been a giant
> >source of dev heartache on the project and they exist primarily for
> the
> >real time use case.  People can build whatever UIs or use existing
> tools
> >against Parquet/Hive/whatever.
> >- Dropping Ambari. It's a complex beast to install because of how many
> >components we have. Dropping the above makes our install much easier
> and
> >should alleviate the need for a complex installer.
> >
> > At that point, we're basically left with
> >
> >- Some Spark for parse -> enrich -> output
> >- The profiler
> >- Stellar
> >- Probably some other misc stuff (sensors, Bro Kafka plugin, etc.)
> >
> > At a glance, that seems almost an order of magnitude smaller than what we
> > currently try to handle.
> >
> > I'm not really sure what an appropriate way to handle the profiler is. I've
> > barely touched the code for it, so anything I say is a vague guess.
> >
> > On Wed, Apr 8, 2020 at 7:38 PM Yerex, Tom  wrote:
> >
> > > To me Metron is big and broad in the scope of technology required to
> get
> > > it running. If things were more modular that would go a long way to
> > > reducing the learning curve or at least putting it into smaller bites
> > (and
> > > it might encourage more people to get involved).
> > >
> > > If the UI were an add-on module in another project, it would have made
> it
> > > easier for me and it could also encourage my hypothetical buddy who is
> a
> > > web developer expert to get involved since he could focus on the web-ui
> > > module instead of trying to tackle all the other pieces that are
> probably
> > > not part of his bailiwick.
> > >
> > > Stellar is very intriguing, maybe that is not unique to Metron? The
> > > architecture of Metron with respect to parsing, enriching, etc., makes
> a
> > > lot of sense to anyone I talk with. These two aspects of Metron seem
> like
> > > standout examples that make for a powerful platform to develop on.
>

Re: Centos6 and Centos7 instructions

2020-04-15 Thread Nick Allen

Hi Tom -

The source for
https://metron.apache.org/current-book/metron-deployment/development/centos6/index.html
is
contained in the README at
`metron/metron-deployment/development/centos6/README.md`.  When a release
happens we have a script that generates the site book from the various
documentation bits we have in READMEs in the code base.

The primary purpose
of `metron/metron-deployment/development/centos6/README.md` is describing
how to use the centos6 development environment. Before we remove the
centos6 page, we would just need to make sure the centos7 development
environment has feature parity.  And we would likely remove everything
under `metron/metron-deployment/development/centos6` all at once.

Now to add a README for the centos7 development environment, you would just
add some markdown at the path
`metron/metron-deployment/development/centos7/README.md`. Once a release
happens, that new README.md would be used to generate the site book at the
URL that you mentioned above.
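
As a rough illustration of the path-to-URL mapping described above, here is a minimal Java sketch. It assumes the site book simply mirrors the repository layout, with each `README.md` becoming an `index.html` under `current-book/`, which is inferred from the centos6 link and not checked against the actual site-book script. The class name `SiteBookUrl` and the method `toSiteBookUrl` are hypothetical.

```java
public class SiteBookUrl {
    // Assumed base URL, inferred from the centos6 link in this thread.
    private static final String BASE = "https://metron.apache.org/current-book/";

    // Hypothetical helper: maps a README path in the repo to its assumed site-book URL.
    static String toSiteBookUrl(String readmePath) {
        String prefix = "metron/";
        String suffix = "README.md";
        if (!readmePath.startsWith(prefix) || !readmePath.endsWith(suffix)) {
            throw new IllegalArgumentException("Not a README under metron/: " + readmePath);
        }
        // Keep the directory portion (with its trailing slash) and swap the file name.
        String dir = readmePath.substring(prefix.length(), readmePath.length() - suffix.length());
        return BASE + dir + "index.html";
    }

    public static void main(String[] args) {
        System.out.println(toSiteBookUrl("metron/metron-deployment/development/centos7/README.md"));
    }
}
```

Under that assumption, the centos7 README would be published next to the existing centos6 page once a release regenerates the site book.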

On Wed, Apr 15, 2020 at 1:24 PM Yerex, Tom  wrote:

> Good morning,
>
> How does one go about updating the documentation at
> https://metron.apache.org/current-book/metron-deployment/development/centos6/index.html
> ?
>
> I would like to add a similar page for Centos7, which I see is in the
> repo. Is there any reason to keep the CentOS 6 page, or should it come down?
>
> Cheers,
>
> Tom.
>

Re: Possible approach to solve GeoIP update?

2020-04-15 Thread Nick Allen

That seems like a viable solution to me.



On Thu, Apr 9, 2020 at 7:58 PM Yerex, Tom  wrote:

> Good afternoon,
>
> Reviewing
> https://issues.apache.org/jira/projects/METRON/issues/METRON-2340 I'm attempting
> to sketch out a rough solution and I would like guidance from more
> experienced minds.
>
> Maxmind releases code that allows you to build your own mmdb database. The
> tests I can see in Metron, besides the download, seem to try a few
> locations like Milton and London. Would it be sufficient to generate geoip
> databases with those (limited) test locations or alternatively generate
> geoip databases with hypothetical locations and then host those locations
> in git for the purpose of the download and testing?
>
> The documentation for the settings in Ambari for Metron GeoIP updates
> could then be slightly updated to add that someone needs to get their own
> key for GeoIP updates?
>
> Cheers,
>
> Tom.
>

Re: [DISCUSS] Next Release - Life After 0.7.1

2019-12-13 Thread Nick Allen

Are we just waiting on the following PRs as release blockers?  Any others?

   - https://github.com/apache/metron/pull/1533
   - https://github.com/apache/metron/pull/1527

Being towards the end of the year, people are going to be on holiday. It
would be great if we could focus on reducing scope and getting a release
cut.


On Sat, Dec 7, 2019 at 10:04 AM Justin Leet  wrote:

> https://github.com/apache/metron/pull/1568 and
> https://github.com/apache/metron/pull/1554 are in master now.
>
> On Fri, Dec 6, 2019 at 7:16 PM Justin Leet  wrote:
>
> > I'd like to throw https://github.com/apache/metron/pull/1552 on the
> pile.
> > Per https://issues.apache.org/jira/browse/LEGAL-491, we should just note
> > the contribution comes from dependabot. Would someone more familiar with
> > the implications of upgrading that be able to review it, or give some
> > advice on what we should be looking for in the review?
> >
> > On Thu, Dec 5, 2019 at 12:06 PM Shane Ardell 
> > wrote:
> >
> >> Speaking on the UI-related PRs that Justin mentioned, I also would like
> to
> >> see both of them merged before a release. At the moment, #1527 does not
> >> address a few "stale data state" message inconsistencies that become
> >> apparent as a result of that PR's work (you can read more about it in
> the
> >> PR comments). That said, I think those inconsistencies can be tracked
> and
> >> addressed separately from the current PR.
> >>
> >> On Thu, Dec 5, 2019 at 11:51 AM Michael Miklavcic <
> >> michael.miklav...@gmail.com> wrote:
> >>
> >> > I think the junit upgrade should go in also. I'm almost finished
> >> reviewing
> >> > that.
> >> >
> >> > On Thu, Dec 5, 2019, 8:50 AM Justin Leet 
> wrote:
> >> >
> >> > >  If we're going to do a bug fix release, I'd like to see some of the
> >> low
> >> > > hanging fix PRs get finished and merged prior to the release. We've
> >> been
> >> > > lax about getting them cleaned up, so I'd like to use a release as
> an
> >> > > opportunity to whittle the PRs down and put out a really solid
> >> release.
> >> > >
> >> > > https://github.com/apache/metron/pull/1568 should be in before
> >> release.
> >> > It
> >> > > addresses an issue with our validation of dependencies_with_url.csv
> >> and
> >> > > its validation.
> >> > > Should https://github.com/apache/metron/pull/1282 be in? Seems like
> >> that
> >> > > should have been merged awhile ago.
> >> > >
> >> > > There's also a couple UI performance / bug fixes PRs (e.g.
> >> > > https://github.com/apache/metron/pull/1533 and
> >> > > https://github.com/apache/metron/pull/1527) that have been sitting
> >> > awhile.
> >> > >
> >> > >
> >> > >
> >> > > On Thu, Dec 5, 2019 at 10:32 AM Nick Allen 
> >> wrote:
> >> > >
> >> > > > Hello Metron'ers -
> >> > > >
> >> > > > I would like to make the case that it is time for us to cut the
> next
> >> > > Apache
> >> > > > Metron release.
> >> > > >
> >> > > >- Our last release was 0.7.1 on May 15th
> >> > > ><
> >> > > >
> >> > >
> >> >
> >>
> https://lists.apache.org/thread.html/e2e532cbb63be757d0875718b082c069a268f57a9087510f196be09b%40%3Cdev.metron.apache.org%3E
> >> > > > >.
> >> > > >It has been *~6 months* since this release.
> >> > > >
> >> > > >
> >> > > >- We have *102 changes* in master since the last release. This
> >> > figure
> >> > > >excludes the two feature branches currently undergoing active
> >> > > > development.
> >> > > >
> >> > > >
> >> > > >- We should cut a release *prior to merging in any other
> >> significant
> >> > > >changes* from either feature branch.  The two active feature
> >> > branches
> >> > > >include ~47 other changes at this point in time.
> >> > > >
> >> > > >
> >> > > >- These 102 changes include some very nice *bug fixes and
> >> usability
> >> > > >improvements*. I would pr

Re: JUnit 5 PR merged into master

2019-12-07 Thread Nick Allen

Thanks for the hard work on that upgrade and this very useful highlight
reel.

On Sat, Dec 7, 2019, 10:17 AM Justin Leet  wrote:

> Hi all,
>
> The JUnit 5 migration PR has been merged to master. From this point
> forward, please use the newer interfaces and methods.  There are plenty of
> examples throughout the code, and for more information, see
> https://junit.org/junit5/.
>
> Brief list of things to be aware of when writing JUnit 5 tests:
>
>- Generally, use org.junit.jupiter.api imports.
>- There are some minor interface changes, e.g. @BeforeAll
>replaces @BeforeClass
>- Failure messages are now at the end of the argument list.
>- Exception checking idiom has been improved with JUnit 5. Please see an
>example at UpdateDaoTest.java#L60
><
> https://github.com/apache/metron/blob/b71ddceeefb4efc677f800c66ca65fae46f4a45a/metron-platform/metron-indexing/metron-indexing-common/src/test/java/org/apache/metron/indexing/dao/UpdateDaoTest.java#L60
> >
>.
>- Parameterized tests function differently now. Please see an example at
>ByteArrayMatchingUtilTest.java#L85
><
> https://github.com/apache/metron/blob/b71ddceeefb4efc677f800c66ca65fae46f4a45a/metron-platform/metron-pcap/src/test/java/org/apache/metron/pcap/pattern/ByteArrayMatchingUtilTest.java#L85
> >
>.
>- No more PowerMock. Please reconsider your code design if you need to
>mock static classes.
>- The replacement for @TemporaryFolder is still experimental, and as
>such has generally stayed. @EnableRuleMigrationSupport at the class
> level
>allows this to continue, as @Rule is generally no longer used.
>
>
> Thanks!
>

[DISCUSS] Next Release - Life After 0.7.1

2019-12-05 Thread Nick Allen

Hello Metron'ers -

I would like to make the case that it is time for us to cut the next Apache
Metron release.

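   - Our last release was 0.7.1 on May 15th
   <https://lists.apache.org/thread.html/e2e532cbb63be757d0875718b082c069a268f57a9087510f196be09b%40%3Cdev.metron.apache.org%3E>.
   It has been *~6 months* since this release.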


   - We have *102 changes* in master since the last release. This figure
   excludes the two feature branches currently undergoing active development.


   - We should cut a release *prior to merging in any other significant
   changes* from either feature branch.  The two active feature branches
   include ~47 other changes at this point in time.


   - These 102 changes include some very nice *bug fixes and usability
   improvements*. I would propose that we treat this as a bug-fix release
   and label it as *Metron 0.7.2*.

Please let me know if you agree or disagree with this call for a release.

For those interested, here are the 102 unreleased changes in master.

METRON-2323 Increase unit test coverage for Alerts List (sardell) closes
apache/metron#1567
METRON-2208 [UI] Increase unit test coverage for Alert Details (sardell)
closes apache/metron#1479
METRON-2316 [UI] Drag drop sorting for the selected fields in the Alerts UI
(ruffle1986 via sardell) closes apache/metron#1560
METRON-2326 Unable to Call ENRICHMENT_GET from Threat Triage Rule Reason
Field (nickwallen via mmiklavc) closes apache/metron#1570
METRON-2285 Batch Profiler Cannot Persist Data Sketches (nickwallen via
mmiklavc) closes apache/metron#1564
METRON-2321 Remove Legacy AWS Deployment Path (nickwallen) closes
apache/metron#1565
METRON-2317 [UI] Delete confirmation dialogue looks visually broken (tiborm
via sardell) closes apache/metron#1562
METRON-2304 Update node and npm version to LTS releases (sardell) closes
apache/metron#1550
METRON-2311 Remove JUnit from all our uber jars (justinleet) closes
apache/metron#1561
METRON-2239 Metron Automated backup and restore (mmiklavc) closes
apache/metron#1546
METRON-2308 Fix 'Degrees' Example Profile (nickwallen) closes
apache/metron#1555
METRON-2310 Remove metron-integration-test from compile dependencies
(justinleet) closes apache/metron#1557
METRON-2284 Metron Profiler for Spark Doesn't Work as Expected (nickwallen)
closes apache/metron#1556
METRON-2290 [UI] Delaying first auto polling request on app start (tiborm
via sardell) closes apache/metron#1534
METRON-2293 Fix some inaccuracies in the MaaS README (mmiklavc) closes
apache/metron#1536
METRON-2302 [UI] Change the default polling interval for Alerts UI to
longer time (tiborm via sardell) closes apache/metron#1547
METRON-2295 [UI] Displaying No Data message in the Alerts UI
screen (subhashjha35 via sardell) closes apache/metron#1543
METRON-2294 [UI] Fixing Stale mode issue in Alert UI Manual Query Mode
(subhashjha35 via sardell) closes apache/metron#1540
METRON-2291 [UI] Fixing and rephrasing warning messages on Alerts UI
(tiborm via sardell) closes apache/metron#1535
METRON-2303 Change Default HDFS Port for Batch Profiler (nickwallen) closes
apache/metron#1548
METRON-2300 Fix Brad Kolarov's Apache ID (billierinaldi via mmiklavc)
closes apache/metron#1541
METRON-2280 PCAP queries no longer work (mmiklavc) closes apache/metron#1537
METRON-2259 [UI] Hide Resolved and Hide Dismissed toggles not works when
filtering is in manual mode (tiborm via sardell) closes apache/metron#1532
METRON-2278 Metron on CentOS 6 Documentation is outdated
(subhashjha35 via sardell) closes apache/metron#1530
METRON-2274 Flatfile loader and summarizer mapreduce mode broken (mmiklavc)
closes apache/metron#1525
METRON-2272 [UI] Performance: Switching manual filtering on and off
multiple times leads slow typing (ruffle1986 via sardell) closes
apache/metron#1524
METRON-2190 [UI] Alerts UI: Indicating loading and preventing parallel
requests (tiborm via sardell) closes apache/metron#1514
METRON-2271 Reorganize Travis Build (nickwallen) closes apache/metron#1522
METRON-2266 REST debug instructions (merrimanr via nickwallen) closes
apache/metron#1520
METRON-2235 Increase server startup timeout (tigerquoll via mmiklavc)
closes apache/metron#1496
METRON-2257 Metron-Alerts GUI testing failing on MacOS builds (sardell)
closes apache/metron#1513
METRON-2247 rpm-docker: Provide an option to bypass running rpmlint
(tigerquoll via nickwallen) closes apache/metron#1503
METRON-2254 Intermittent Test Failure in RestFunctionsIntegrationTest
(nickwallen) closes apache/metron#1510
METRON-2211 [UI] Alerts UI should optionally render timestamp in local time
(sardell) closes apache/metron#1495
METRON-2217 Migrate current HBase client from HTableInterface to Table
(mmiklavc) closes apache/metron#1483
METRON-2201 The description for the IS_IP method default behavior needs to
corrected as per implementation (MohanDV via mmiklavc) closes
apache/metron#1474
METRON-2227 Increase Kafka test harness timeout (tigerquoll via mmiklavc)
closes apache/metron#1493

Re: [DISCUSS] deprecate misleading install methods and docs?

2019-11-19 Thread Nick Allen

FYI - See the following which removes the automated AWS deployment
mechanism.

https://issues.apache.org/jira/browse/METRON-2321
https://github.com/apache/metron/pull/1565


On Tue, Oct 29, 2019 at 10:23 AM Nick Allen  wrote:

> +1  On the remove option.  I think we should *completely remove* the
> automated AWS deployment mechanism because it has been too difficult to
> maintain, deploys an unsecure cluster by default, and is not the preferred
> installation path for AWS.  If a user wants to deploy to AWS, they should
> launch their EC2 nodes, install Ambari, and then use the MPack to deploy
> Metron.  That is the preferred installation path for AWS.
>
> I would gladly volunteer to do this work if we can reach consensus on this
> approach.
>
>
>
> On Tue, Oct 29, 2019 at 9:56 AM Simon Elliston Ball <
> si...@simonellistonball.com> wrote:
>
>> Following many discussions on the user and dev lists in the past, a
>> number of users seem to have problems with the old ansible methods for
>> installing on AWS.
>>
>> I am not aware of anyone who is maintaining this area (please shout if
>> you are willing to take on bringing this up to date) and we have a lot of
>> outdated documentation on both the source tree and the wiki around older,
>> now broken install methods.
>>
>> My proposal is that we consolidate the multitude of deployment methods,
>> and:
>> * remove or
>> * Mark deprecated or
>> * move to contrib
>>  The methods outside of the Ambari Mpack and full-dev methods of install.
>>
>> Does anyone have any thoughts about how we can clean this up and reduce
>> the number of options that seem to be confusing new users coming to the
>> platform? I am happy as long as  the Ambari method currently used by the
>> distributor (who, as you mostly know, I work for, in the interest of full
>> disclosure) remains, and full-dev remains as is to avoid disruption to
>> development process. I have no strong opinions on any of the other
>> deployment methods, other than that their existence seems to be hindering
>> new community members.
>>
>> Thoughts?
>> Simon
>>
>>

Branch Cleanup

2019-10-29 Thread Nick Allen

Heads up... I accidentally pushed a feature branch to Apache called
METRON-2223.  This was my mistake.  I have just deleted the branch.

My apologies

Re: [DISCUSS] deprecate misleading install methods and docs?

2019-10-29 Thread Nick Allen

+1  On the remove option.  I think we should *completely remove* the
automated AWS deployment mechanism because it has been too difficult to
maintain, deploys an unsecure cluster by default, and is not the preferred
installation path for AWS.  If a user wants to deploy to AWS, they should
launch their EC2 nodes, install Ambari, and then use the MPack to deploy
Metron.  That is the preferred installation path for AWS.

I would gladly volunteer to do this work if we can reach consensus on this
approach.



On Tue, Oct 29, 2019 at 9:56 AM Simon Elliston Ball <
si...@simonellistonball.com> wrote:

> Following many discussions on the user and dev lists in the past, a number
> of users seem to have problems with the old ansible methods for installing
> on AWS.
>
> I am not aware of anyone who is maintaining this area (please shout if you
> are willing to take on bringing this up to date) and we have a lot of
> outdated documentation on both the source tree and the wiki around older,
> now broken install methods.
>
> My proposal is that we consolidate the multitude of deployment methods,
> and:
> * remove or
> * Mark deprecated or
> * move to contrib
>  The methods outside of the Ambari Mpack and full-dev methods of install.
>
> Does anyone have any thoughts about how we can clean this up and reduce
> the number of options that seem to be confusing new users coming to the
> platform? I am happy as long as  the Ambari method currently used by the
> distributor (who, as you mostly know, I work for, in the interest of full
> disclosure) remains, and full-dev remains as is to avoid disruption to
> development process. I have no strong opinions on any of the other
> deployment methods, other than that their existence seems to be hindering
> new community members.
>
> Thoughts?
> Simon
>
>

Re: [DISCUSS] Curator client upgrade

2019-09-17 Thread Nick Allen

+1 to making the change on the feature branch.

We don't really know how this might affect master which is still building
against HDP 2.6, nor is it strictly needed there.  Going to Curator 4.0.0
is only needed due to the HDP 3.1 upgrade.   This is also likely to get
more focused testing cycles in the feature branch before it has a chance to
break anything in master.

On Tue, Sep 17, 2019 at 1:13 PM Michael Miklavcic <
michael.miklav...@gmail.com> wrote:

> Hey all,
>
> While working through the feature branch upgrade for HDP 3.1, we came
> across some classpath related issues conflicting with Storm and Guava while
> testing out Kerberos. Initially, this seemed reasonable to roll into the
> Kerberos fix PR, but the scope has expanded a bit because we found
> additional impacts while working through the Hadoop upgrade branch as well.
> We're now looking at splitting this out into a separate PR in the feature
> branch. Curator 4.0.0 is backwards compatible and will work with Zookeeper
> 3.4.6. Strictly speaking, this *could* be done against master first, but
> there may be some duplicate testing effort there, and I'm really not sure it's
> worth it as there is absolutely no issue currently with Curator 2.12.0 in
> master - I call this out purely for transparency/information share with the
> community. I'm leaning towards us making this change against the feature
> branch directly as that is the only place we currently have compatibility
> issues.
>
> Best,
> Mike
>

Re: [DISCUSS] Deprecate Least Recently Used Pruner

2019-08-13 Thread Nick Allen

Sure.  I should have provided some more context.  I can tell you what I do
know about it.  Perhaps others can provide some more color.

   - This is functionality accessed by a user by running the script;
   ${METRON_HOME}/bin/threatintel_bulk_prune.sh


   - If you are using access trackers with your HBase enrichments, it runs
   as an MR job that counts the number of times each Enrichment is used.  I am
   assuming that it then prunes those that are less frequently accessed.


   - It was originally created here;
   https://github.com/apache/metron/pull/22
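
To make the assumed behavior in the second point concrete, here is a minimal Java sketch of access-count-based pruning. It is not the code behind `threatintel_bulk_prune.sh`: it works on an in-memory map instead of running as an MR job over HBase, and the class name, method names, and the `minAccesses` threshold are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class AccessCountPruner {
    // Access counts per enrichment key; a stand-in for what an access tracker might record.
    private final Map<String, Integer> accessCounts = new HashMap<>();

    // Record one lookup of the given enrichment key.
    void recordAccess(String key) {
        accessCounts.merge(key, 1, Integer::sum);
    }

    // Return the keys in 'allKeys' that were accessed fewer than 'minAccesses' times.
    Set<String> keysToPrune(Set<String> allKeys, int minAccesses) {
        Set<String> prune = new TreeSet<>();
        for (String key : allKeys) {
            if (accessCounts.getOrDefault(key, 0) < minAccesses) {
                prune.add(key);
            }
        }
        return prune;
    }

    public static void main(String[] args) {
        AccessCountPruner pruner = new AccessCountPruner();
        pruner.recordAccess("10.0.0.1");
        pruner.recordAccess("10.0.0.1");
        pruner.recordAccess("evil.example.com");
        Set<String> all = new TreeSet<>(Set.of("10.0.0.1", "evil.example.com", "unused.example.com"));
        // Keys seen fewer than twice are candidates for pruning.
        System.out.println(pruner.keysToPrune(all, 2));
    }
}
```

The sketch separates counting from the pruning decision, so the threshold can be chosen after the counts are collected; whether the real pruner works by frequency, recency, or both is not something I have confirmed.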


On Tue, Aug 13, 2019 at 6:11 PM Otto Fowler  wrote:

> Can you summarize what it does? Is it from OpenSOC?
>
>
>
>
> On August 13, 2019 at 17:53:52, Nick Allen (n...@nickallen.org) wrote:
>
> As part of https://github.com/apache/metron/pull/1470, I found it
> difficult
> to update the "Least Recently Used Pruner" to work with HBase 2.0.2. I am
> sure that given more time and effort, I could make it work, but is it worth
> it?
>
> This is a feature that I myself am not familiar with. I do not know of
> anyone using this. I also did not find much documentation on how to use
> this feature. I certainly don't know the entire user community, so please
> let me know if anyone is using this functionality or believes that it
> should be maintained going forward.
>
> Would you support deprecating this feature?
>
> Thanks
>

[DISCUSS] Deprecate Least Recently Used Pruner

2019-08-13 Thread Nick Allen

As part of https://github.com/apache/metron/pull/1470, I found it difficult
to update the "Least Recently Used Pruner" to work with HBase 2.0.2.  I am
sure that given more time and effort, I could make it work, but is it worth
it?

This is a feature that I myself am not familiar with. I do not know of
anyone using this.  I also did not find much documentation on how to use
this feature.  I certainly don't know the entire user community, so please
let me know if anyone is using this functionality or believes that it
should be maintained going forward.

Would you support deprecating this feature?

Thanks

Re: [DISCUSS] Alerts UI: Loading state while fetching data

2019-07-24 Thread Nick Allen

Yes!  I think this is sorely needed.

Would this also include indicating when an error has occurred in the
backend call?  That might also be helpful and somewhat related to
METRON-2190.

On Wed, Jul 24, 2019 at 9:27 AM Tibor Meller  wrote:

> Hi all,
>
> I think it would be great to have a loading state on the Alerts UI which shows
> loading is in progress (spinner) and prevents the user from triggering new fetch
> requests before the response arrives (or times out).
>
> I added more detail here:
> https://issues.apache.org/jira/browse/METRON-2190
>
> Please take a look at it when you have a chance and let me know what you
> think.
>
> Regards,
> Tibor
>

Re: Test failure in master

2019-06-28 Thread Nick Allen

Thanks for fixing that.

On Fri, Jun 28, 2019 at 6:40 PM Michael Miklavcic <
michael.miklav...@gmail.com> wrote:

> Just an fyi, if you run into the FileFilterUtilTest failing, pull latest
> from master. There was a time-sensitive unit test condition that has now
> been fixed in https://issues.apache.org/jira/browse/METRON-2166
>
> Mike
>

Re: Travis failing due to same Maven issue

2019-06-13 Thread Nick Allen

I am not sure if this will actually solve the issue, but here is another
attempt.

https://github.com/apache/metron/pull/1442



On Wed, Jun 12, 2019 at 9:31 PM Michael Miklavcic <
michael.miklav...@gmail.com> wrote:

> I've seen this now with 2 back-to-back commits to master that end with
> Travis failing while attempting to download Maven. We've been seeing this
> off and on for a while now. I don't have an immediate answer, but feel it's
> worth bringing to the dev list's attention again.
>
> https://travis-ci.org/apache/metron/builds/544988758?utm_medium=notification_source=email
>
> Best,
> Mike
>

Re: Storm 1.0.x eol

2019-06-04 Thread Nick Allen

As part of the upgrade to HDP 3.1, we should consider ending all Metron
support for Storm 1.0.x.  We have a few build options and workarounds in
the code base (see storm-kafka-override, Maven profile HDP-1.5.0.0, etc)
that would need to be cleaned up.

On Mon, Jun 3, 2019 at 5:39 PM Otto Fowler  wrote:

> https://storm.apache.org/2019/05/30/storm200-released.html
>

Re: Master build - failed due to Maven download fail

2019-05-24 Thread Nick Allen

https://github.com/apache/metron/pull/1433

On Fri, May 24, 2019 at 12:07 PM Nick Allen  wrote:

> FYI, I just created this to track the issue.
>
> https://issues.apache.org/jira/browse/METRON-2143
>
> On Mon, May 20, 2019 at 6:39 PM Justin Leet  wrote:
>
>> I saw the same thing on https://github.com/apache/metron/pull/1407, but
>> bouncing Travis fixed it. Not sure if it's a connection issue or what, but
>> it seems like we should be able to cache it to avoid downloading every
>> time.
>>
>> On Mon, May 20, 2019 at 6:14 PM Michael Miklavcic <
>> michael.miklav...@gmail.com> wrote:
>>
>> > FYI I kicked off a rebuild for master. For some reason the wget command
>> to
>> > grab Maven 3.3.9 failed.
>> >
>>
>

Re: Master build - failed due to Maven download fail

2019-05-24 Thread Nick Allen

FYI, I just created this to track the issue.

https://issues.apache.org/jira/browse/METRON-2143

On Mon, May 20, 2019 at 6:39 PM Justin Leet  wrote:

> I saw the same thing on https://github.com/apache/metron/pull/1407, but
> bouncing Travis fixed it. Not sure if it's a connection issue or what, but
> it seems like we should be able to cache it to avoid downloading every
> time.
>
> On Mon, May 20, 2019 at 6:14 PM Michael Miklavcic <
> michael.miklav...@gmail.com> wrote:
>
> > FYI I kicked off a rebuild for master. For some reason the wget command
> to
> > grab Maven 3.3.9 failed.
> >
>

Re: [DISCUSS] Travis CI Build Time Limits

2019-05-23 Thread Nick Allen

> (1) The build is really a tree of modules. With our recent de-coupling
   from Storm, this has become all the more exaggerated...

I agree that the approach I followed in my POC suffers from the risk of a
future refactor accidentally causing some tests not to run.  But I just did
it this way because it was the simplest approach to see something work.  I
think we could easily find a way to mitigate or eliminate this risk in a
final solution.


> (2) I actually like the current limits because it has forced contributors and
reviewers to consider the overall build time as they design and add new
tests

I agree with the goal.  I want the tests to run as quickly as possible too. But
spending time here worried about whether the builds will complete after the
cache is invalidated seems like a waste of everyone's time.  In addition,
I've seen times where my own feature branches' CI builds fail because the
time limit is breached.  I think this tends to happen when Travis is under
greater load or when they have some resources down for maintenance or
outages (or so I assume).

Personally, I am in favor of anything that reduces the time for a developer
to get feedback on changes they've made.  I like the idea of
parallelizing/splitting the integration tests for this reason.









On Thu, May 23, 2019 at 11:33 AM Michael Miklavcic <
michael.miklav...@gmail.com> wrote:

> I think this is a neat idea, however I have a couple concerns with this:
>
>1. The build is really a tree of modules. With our recent de-coupling
>from Storm, this has become all the more exaggerated. This is a good
> thing,
>however the benefit of the parent pom approach we currently have is
> that it
>does a pretty good job of guaranteeing all of the leaf modules get
> built as
>well. Specifying the modules individually is going to turn this into a
> very
>manual process any time a new module is added or refactored, and is more
>likely to result in bugs. One other option to aid in solving that
> problem
>is to use aggregator modules, and then have the full local build parent
> pom
>use those aggregators instead of referring to the individual modules the
>way it does now. You still get the top-down tree, but there's less of a
> gap
>between Travis and our primary Maven structure.
>2. I actually like the current limits because it has forced contributors
>and reviewers to consider the overall build time as they design and add
> new
>tests. Especially with the current set of upgrades and proposed
> integration
>test and Docker changes, I think we should be extra vigilant on getting
>those build/test times *down*, rather than enabling them to increase.
> Sure,
>there are other options for solving that problem, but Travis is a simple
>equalizer because it's impartial to whatever local hardware engineers
> are
>using.
>
>
> On Wed, May 22, 2019 at 7:48 AM Nick Allen  wrote:
>
> > FYI - Here is a POC build of this concept running.  This only runs the
> > Parser and Enrichment integration tests, but as separate jobs in Travis.
> >
> > https://travis-ci.org/nickwallen/metron/builds/535791047
> >
> >
> >
> >
> > On Wed, May 22, 2019 at 9:20 AM Nick Allen  wrote:
> >
> > >
> > > > Justin Leet said
> > > <https://github.com/apache/metron/pull/1417#issuecomment-494464795>
> > [1]: Looking
> > > at our build times, I'm actually concerned that if we kill the caches
> our
> > > builds won't complete. The integration tests take >45 minutes and it's
> > very
> > > possible redownloading everything goes over our remaining time.
> > >
> > > To recap, our current Travis CI build is composed of multiple jobs.
> > Travis'
> > > time limit of 50 minutes
> > > <https://docs.travis-ci.com/user/customizing-the-build/#build-timeouts
> >
> > [2]
> > > is per job, rather than the total build time.  While our total build
> time
> > > is on the order of 2.5 hours, it is only our integration-test job which
> > is
> > > coming close to that 50 minute limit.
> > >
> > > We could try splitting our integration tests into multiple jobs.  Each
> of
> > > these would have the opportunity to run in parallel (given whatever
> > > resources Travis can allocate to us), but more importantly each one has
> > its
> > > own 50 minute limit. For example...
> > >
> > >- Parser Integration Tests:
> > >   - `time mvn surefire:test@integration-tests -pl
> > >   "metron-platform/metron-parsing/metron-parsing-storm,
> > >   metron-platform/metron-parsing/metro

Re: [DISCUSS] Build RPM/DEBs in Travis?

2019-05-22 Thread Nick Allen

Yes, running up Full Dev is a manual verification that is required.  And as
a manual verification sometimes that will get missed.  And in this specific
case, it does seem a bit silly that the addition of a parser should require
the contributor to run up Full Dev.

That being said, anything we can do to reduce the amount of manual
verification that is required is a good thing.  We should be pushing
ourselves to an end state where no manual verification is required for any
Metron PR.  I think building the RPMs/DEBs as part of the Travis build is
at least a small step in the right direction.

On Wed, May 22, 2019 at 9:34 AM Justin Leet  wrote:

> Theoretically, we didn't need to before there were both RPMs and DEBs since
> running dev up (which necessitates building those) is part of the build
> process. Since they've been split apart, I agree we probably should be
> building them, because nobody is going to run both unless they've specifically
> done something they'd expect to affect both.
>
> On Wed, May 22, 2019 at 9:30 AM Nick Allen  wrote:
>
> > In light of issues like this https://github.com/apache/metron/pull/1419,
> > has anyone looked into building our RPMs and DEBs in Travis?  This is a
> > very common and easy mistake to make and our CI builds really should be
> > able to catch this.
> >
>

Re: [DISCUSS] Travis CI Build Time Limits

2019-05-22 Thread Nick Allen

FYI - Here is a POC build of this concept running.  This only runs the
Parser and Enrichment integration tests, but as separate jobs in Travis.

https://travis-ci.org/nickwallen/metron/builds/535791047




On Wed, May 22, 2019 at 9:20 AM Nick Allen  wrote:

>
> > Justin Leet said
> <https://github.com/apache/metron/pull/1417#issuecomment-494464795> [1]: 
> Looking
> at our build times, I'm actually concerned that if we kill the caches our
> builds won't complete. The integration tests take >45 minutes and it's very
> possible redownloading everything goes over our remaining time.
>
> To recap, our current Travis CI build is composed of multiple jobs. Travis'
> time limit of 50 minutes
> <https://docs.travis-ci.com/user/customizing-the-build/#build-timeouts> [2]
> is per job, rather than the total build time.  While our total build time
> is on the order of 2.5 hours, it is only our integration-test job which is
> coming close to that 50 minute limit.
>
> We could try splitting our integration tests into multiple jobs.  Each of
> these would have the opportunity to run in parallel (given whatever
> resources Travis can allocate to us), but more importantly each one has its
> own 50 minute limit. For example...
>
>- Parser Integration Tests:
>   - `time mvn surefire:test@integration-tests -pl
>   "metron-platform/metron-parsing/metron-parsing-storm,
>   metron-platform/metron-parsing/metron-parsers,
>   metron-platform/metron-parsing/metron-parsers-common/"`
>- Enrichment Integration Tests
>   - `time mvn surefire:test@integration-tests -pl
>   
> "metron-platform/metron-enrichment/metron-enrichment-common/,metron-platform/metron-enrichment/metron-enrichment-storm"`
>- etc, etc
>
>
> We would just need to determine how to logically split the tests up.  If
> this sounds reasonable, I've already got a start on a POC.
>
> ---
> [1] https://github.com/apache/metron/pull/1417#issuecomment-494464795
> [2] https://docs.travis-ci.com/user/customizing-the-build/#build-timeouts
>

[DISCUSS] Build RPM/DEBs in Travis?

2019-05-22 Thread Nick Allen

In light of issues like this https://github.com/apache/metron/pull/1419,
has anyone looked into building our RPMs and DEBs in Travis?  This is a
very common and easy mistake to make and our CI builds really should be
able to catch this.

[DISCUSS] Travis CI Build Time Limits

2019-05-22 Thread Nick Allen

> Justin Leet said [1]: Looking
at our build times, I'm actually concerned that if we kill the caches our
builds won't complete. The integration tests take >45 minutes and it's very
possible redownloading everything goes over our remaining time.

To recap, our current Travis CI build is composed of multiple jobs. Travis'
time limit of 50 minutes [2] is per job, rather than the total build time.
While our total build time is on the order of 2.5 hours, it is only our
integration-test job which is coming close to that 50 minute limit.

We could try splitting our integration tests into multiple jobs. Each of
these would have the opportunity to run in parallel (given whatever
resources Travis can allocate to us), but more importantly each one has its
own 50 minute limit. For example...

   - Parser Integration Tests:
      - `time mvn surefire:test@integration-tests -pl "metron-platform/metron-parsing/metron-parsing-storm,metron-platform/metron-parsing/metron-parsers,metron-platform/metron-parsing/metron-parsers-common/"`
   - Enrichment Integration Tests
      - `time mvn surefire:test@integration-tests -pl "metron-platform/metron-enrichment/metron-enrichment-common/,metron-platform/metron-enrichment/metron-enrichment-storm"`
   - etc, etc

We would just need to determine how to logically split the tests up. If
this sounds reasonable, I've already got a start on a POC.

---
[1] https://github.com/apache/metron/pull/1417#issuecomment-494464795
[2] https://docs.travis-ci.com/user/customizing-the-build/#build-timeouts

Re: [DISCUSS] JsonMapParser original string functionality

2019-05-10 Thread Nick Allen

>  I suppose we could always allow this to be overridden, also.

I like an on/off switch for the "original string" functionality.  If on,
you get the original string in pristine condition.  If off, no original
string is appended for those who care more about storage space.

I can't think of a reason where one kind of parser would have a different
original string mechanism than the others.  If something like that does
come up, the parser can create its own original string by just naming it
something different and then turning "off" the switch that you described.
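
To make the switch concrete, here is a minimal sketch of how I picture it.
Everything in it is an assumption for illustration: the class, the `finish`
method, the `keepOriginal` flag and the `original_string` field name are all
hypothetical, not existing Metron code or configuration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of the on/off switch; names are illustrative only. */
public class OriginalStringSwitch {

  // Assumed field name, used here only for illustration.
  static final String ORIGINAL_STRING = "original_string";

  /**
   * Returns a copy of the parsed message. When the switch is on, the raw
   * input is added unchanged. When it is off, nothing is appended.
   */
  static Map<String, Object> finish(Map<String, Object> parsed,
                                    String raw,
                                    boolean keepOriginal) {
    Map<String, Object> result = new LinkedHashMap<>(parsed);
    if (keepOriginal) {
      result.put(ORIGINAL_STRING, raw);
    }
    return result;
  }

  public static void main(String[] args) {
    String raw = "id=7   action=deny";

    Map<String, Object> parsed = new LinkedHashMap<>();
    parsed.put("id", "7");
    parsed.put("action", "deny");

    // Switch on: the pristine raw input rides along with the message.
    System.out.println(finish(parsed, raw, true));

    // Switch off: no original string, for those who care about storage.
    System.out.println(finish(parsed, raw, false));

    // A parser with special needs stores its own copy under a different
    // (made-up) field name and runs with the switch off.
    Map<String, Object> custom = new LinkedHashMap<>(parsed);
    custom.put("raw_event", raw.trim());
    System.out.println(finish(custom, raw, false));
  }
}
```

The method copies the map rather than mutating it, so a parser's own output is
never changed behind its back. That is a design choice of the sketch, not a
requirement.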



On Fri, May 10, 2019 at 5:53 PM Michael Miklavcic <
michael.miklav...@gmail.com> wrote:

> I think that's an excellent idea. Can anyone think of a situation where we
> wouldn't want to add this the same way for all parsers? I suppose we could
> always allow this to be overridden, also.
>
> On Fri, May 10, 2019 at 3:43 PM Nick Allen  wrote:
>
> > I think maintaining the integrity of the original data makes a lot of
> sense
> > for any parser. And ideally the original string should be what came out
> of
> > Kafka with only the minimally necessary processing.
> >
> > With that in mind, we could solve this one level up.  Instead of relying
> on
> > each parser to do this right, we could have the ParserRunner and
> > specifically the ParserRunnerImpl [1] handle this round-abouts here
> > <
> >
> https://github.com/apache/metron/blob/1b6ef88c79d60022542cda7e9abbea7e720773cc/metron-platform/metron-parsing/metron-parsers-common/src/main/java/org/apache/metron/parsers/ParserRunnerImpl.java#L149-L158
> > >
> > [1].
> > It has the raw message data and can append the original string to each
> > message it gets back from the parsers.
> >
> > Just another approach to consider.
> >
> > --
> > [1]
> >
> >
> https://github.com/apache/metron/blob/1b6ef88c79d60022542cda7e9abbea7e720773cc/metron-platform/metron-parsing/metron-parsers-common/src/main/java/org/apache/metron/parsers/ParserRunnerImpl.java#L149-L158
> >
> > On Fri, May 10, 2019 at 4:11 PM Otto Fowler 
> > wrote:
> >
> > > +1
> > >
> > >
> > > On May 10, 2019 at 13:57:55, Michael Miklavcic (
> > > michael.miklav...@gmail.com)
> > > wrote:
> > >
> > > When adding the capability for parsing messages in the JsonMapParser
> > using
> > > JSON Path expressions the original behavior for managing original
> strings
> > > was changed.
> > >
> > >
> > >
> >
> https://github.com/apache/metron/blob/master/metron-platform/metron-parsing/metron-parsers-common/src/main/java/org/apache/metron/parsers/json/JSONMapParser.java#L192
> > >
> > > A couple issues have been reported recently regarding this change:
> > >
> > > 1. We're losing the actual original string, which is a legal issue for
> > > data lineage for some customers
> > > 2. Even for the degenerate case with no sub-messages created, the
> > > original sub-message string is modified because of the
> > > serialization/deserialization process with Jackson/JsonSimple. The
> fields
> > > are reordered bc the content is normalized.
> > >
> > > I looked at options for preserving formatting, but am unable to find a
> > > method that allows you to both parse, then query the original message
> and
> > > then also obtain the raw string matches without the normalizing from
> > > ser/deserialization.
> > >
> > > I'd like to propose that we add a configuration option for this parser
> > that
> > > allows the user to toggle which approach they'd like to use. My
> personal
> > > preference based on feedback I've gotten from multiple customers is
> that
> > > the default should be the older approach which takes the raw original
> > > string. It's arguable that this change in contract is a regression, so
> > the
> > > default should be the earlier behavior. Any sub-messages would then
> have
> > a
> > > copy of that raw original string, not just the sub-message original
> > string.
> > > Enabling the flag would enable the current sub-message original string
> > > functionality.
> > >
> > > Mike
> > >
> >
>

Re: [DISCUSS] JsonMapParser original string functionality

2019-05-10 Thread Nick Allen

I think maintaining the integrity of the original data makes a lot of sense
for any parser. And ideally the original string should be what came out of
Kafka with only the minimally necessary processing.

With that in mind, we could solve this one level up.  Instead of relying on
each parser to do this right, we could have the ParserRunner and
specifically the ParserRunnerImpl handle this round about here [1].
It has the raw message data and can append the original string to each
message it gets back from the parsers.

Just another approach to consider.
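
A rough sketch of the idea, under stated assumptions: this is not the actual
ParserRunnerImpl code, the `parseAndStamp` method is made up, and the
`original_string` field name is assumed for illustration. The parser is
modelled as a plain function from raw bytes to a list of messages.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch; not the actual ParserRunnerImpl code. */
public class OriginalStringSketch {

  // Assumed field name, used here only for illustration.
  static final String ORIGINAL_STRING = "original_string";

  /**
   * Runs the parser over the raw bytes, then stamps the untouched raw input
   * onto a copy of every message the parser returned.
   */
  static List<Map<String, Object>> parseAndStamp(
      byte[] rawMessage,
      Function<byte[], List<Map<String, Object>>> parser) {
    // Decode once, before any parser has a chance to normalize the content.
    String original = new String(rawMessage, StandardCharsets.UTF_8);
    List<Map<String, Object>> stamped = new ArrayList<>();
    for (Map<String, Object> message : parser.apply(rawMessage)) {
      Map<String, Object> copy = new LinkedHashMap<>(message);
      copy.put(ORIGINAL_STRING, original);
      stamped.add(copy);
    }
    return stamped;
  }

  public static void main(String[] args) {
    // Whitespace and field order in the raw input are deliberately irregular.
    byte[] raw = "{\"b\": 2,   \"a\": 1}".getBytes(StandardCharsets.UTF_8);

    // Stand-in for a real parser: returns two sub-messages.
    Function<byte[], List<Map<String, Object>>> parser = bytes -> {
      Map<String, Object> first = new LinkedHashMap<>();
      first.put("a", 1);
      Map<String, Object> second = new LinkedHashMap<>();
      second.put("b", 2);
      return Arrays.asList(first, second);
    };

    // Every sub-message carries the same, unmodified raw input.
    for (Map<String, Object> message : parseAndStamp(raw, parser)) {
      System.out.println(message);
    }
  }
}
```

Because the stamping happens outside the parser, no individual parser can
reorder or reformat the original string by accident.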

--
[1]
https://github.com/apache/metron/blob/1b6ef88c79d60022542cda7e9abbea7e720773cc/metron-platform/metron-parsing/metron-parsers-common/src/main/java/org/apache/metron/parsers/ParserRunnerImpl.java#L149-L158

On Fri, May 10, 2019 at 4:11 PM Otto Fowler  wrote:

> +1
>
>
> On May 10, 2019 at 13:57:55, Michael Miklavcic (
> michael.miklav...@gmail.com)
> wrote:
>
> When adding the capability for parsing messages in the JsonMapParser using
> JSON Path expressions the original behavior for managing original strings
> was changed.
>
>
> https://github.com/apache/metron/blob/master/metron-platform/metron-parsing/metron-parsers-common/src/main/java/org/apache/metron/parsers/json/JSONMapParser.java#L192
>
> A couple issues have been reported recently regarding this change:
>
> 1. We're losing the actual original string, which is a legal issue for
> data lineage for some customers
> 2. Even for the degenerate case with no sub-messages created, the
> original sub-message string is modified because of the
> serialization/deserialization process with Jackson/JsonSimple. The fields
> are reordered bc the content is normalized.
>
> I looked at options for preserving formatting, but am unable to find a
> method that allows you to both parse, then query the original message and
> then also obtain the raw string matches without the normalizing from
> ser/deserialization.
>
> I'd like to propose that we add a configuration option for this parser that
> allows the user to toggle which approach they'd like to use. My personal
> preference based on feedback I've gotten from multiple customers is that
> the default should be the older approach which takes the raw original
> string. It's arguable that this change in contract is a regression, so the
> default should be the earlier behavior. Any sub-messages would then have a
> copy of that raw original string, not just the sub-message original string.
> Enabling the flag would enable the current sub-message original string
> functionality.
>
> Mike
>

Re: [VOTE] Metron Release Candidate 0.7.1-RC2

2019-05-10 Thread Nick Allen

I really enjoyed the retro, 3-digit vibe on that one.

On Fri, May 10, 2019 at 4:38 PM Michael Miklavcic <
michael.miklav...@gmail.com> wrote:

> "METRON-685" - wow, that one was a long time coming.
>
> On Thu, May 9, 2019 at 5:54 PM Nick Allen  wrote:
>
> > +1 binding
> >
> > I validated the release tarball, ran the full test suite and validated
> the
> > CentOS 6 development environment.  Everything looks solid.  Let's ship
> it.
> >
> > On Wed, May 8, 2019 at 6:50 PM Justin Leet 
> wrote:
> >
> > > This is a call to vote on releasing Apache Metron 0.7.1
> > >
> > > Full list of changes in this release:
> > > https://dist.apache.org/repos/dist/dev/metron/0.7.1-RC2/CHANGES
> > > The tag to be voted upon is:
> > > apache-metron_0.7.1-rc2
> > >
> > > The source archives being voted upon can be found here:
> > >
> > >
> >
> https://dist.apache.org/repos/dist/dev/metron/0.7.1-RC2/apache-metron_0.7.1-rc2.tar.gz
> > >
> > > Other release files, signatures and digests can be found here:
> > > https://dist.apache.org/repos/dist/dev/metron/0.7.1-RC2/
> > >
> > > The release artifacts are signed with the following key:
> > > https://dist.apache.org/repos/dist/release/metron/KEYS
> > > Please vote on releasing this package as Apache Metron 0.7.1-RC2
> > >
> > > When voting, please list the actions taken to verify the release.
> > >
> > > Recommended build validation and verification instructions are posted
> > > here:
> > > https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds
> > >
> > > > This vote will be open until 7pm EDT on Monday May 13 2019, to
> > account
> > > for the weekend.
> > >
> > > [ ] +1 Release this package as Apache Metron 0.7.1-RC2
> > >
> > > [ ] 0 No opinion
> > >
> > > [ ] -1 Do not release this package because...
> > >
> >
>

Re: [VOTE] Metron Release Candidate 0.7.1-RC2

2019-05-09 Thread Nick Allen

+1 binding

I validated the release tarball, ran the full test suite and validated the
CentOS 6 development environment.  Everything looks solid.  Let's ship it.

On Wed, May 8, 2019 at 6:50 PM Justin Leet  wrote:

> This is a call to vote on releasing Apache Metron 0.7.1
>
> Full list of changes in this release:
> https://dist.apache.org/repos/dist/dev/metron/0.7.1-RC2/CHANGES
> The tag to be voted upon is:
> apache-metron_0.7.1-rc2
>
> The source archives being voted upon can be found here:
>
> https://dist.apache.org/repos/dist/dev/metron/0.7.1-RC2/apache-metron_0.7.1-rc2.tar.gz
>
> Other release files, signatures and digests can be found here:
> https://dist.apache.org/repos/dist/dev/metron/0.7.1-RC2/
>
> The release artifacts are signed with the following key:
> https://dist.apache.org/repos/dist/release/metron/KEYS
> Please vote on releasing this package as Apache Metron 0.7.1-RC2
>
> When voting, please list the actions taken to verify the release.
>
> Recommended build validation and verification instructions are posted
> here:
> https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds
>
> This vote will be open until 7pm EDT on Monday May 13 2019, to account
> for the weekend.
>
> [ ] +1 Release this package as Apache Metron 0.7.1-RC2
>
> [ ] 0 No opinion
>
> [ ] -1 Do not release this package because...
>

Re: [DISCUSS] Parser Aggregation in Management UI

2019-05-06 Thread Nick Allen

Have you considered creating a feature branch for the effort? This would
allow you to break the effort into chunks, where the result of each PR may
not be a fully working "master-ready" result.

I am sure you guys tackled the work in chunks when developing it, so
consider just replaying those chunks onto the feature branch as separate
PRs.



On Mon, May 6, 2019 at 5:24 AM Tibor Meller  wrote:

> I wondered over the weekend how we could split that PR into smaller chunks.
> That PR is the result of almost 2 months of development and I don't see how
> to split it into multiple WORKING parts. As it stands, it is one whole working
> feature. If we split it by packages or files we could provide smaller
> non-functional PRs, but could end up with a broken Management UI after
> the 1st PR part is merged into master. I don't think that would be
> acceptable to the community (or even to me), so I would like to suggest two
> other options to help review PR#1360.
>
> #1 We could extend that PR with our own author comments in Github. That
> would help following which code part belongs to where and why it was
> necessary.
> #2 We can schedule an interactive code walkthrough call with those who are
> interested in reviewing or in the particular changeset.
>
> Please share your thoughts on this! Which version would support you the
> best? Or if you have any other idea let us know.
>
> PS: I think the size of our PRs depends on how small the independently
> deliverable changesets are that we can identify before we start implementing a
> relatively big new feature. Unfortunately, we failed to do that with this
> feature.
>
> On Fri, May 3, 2019 at 1:49 PM Shane Ardell 
> wrote:
>
> > NgRx was only used for the aggregation feature and doesn't go beyond
> that.
> > I think the way I worded that sentence may have caused confusion. I just
> > meant we use it to manage more pieces of state within the aggregation
> > feature than just previous and current state of grouped parsers.
> >
> > On Fri, May 3, 2019 at 1:32 AM Michael Miklavcic <
> > michael.miklav...@gmail.com> wrote:
> >
> > > Shane, thanks for putting this together. The updates on the Jira are
> > useful
> > > as well.
> > >
> > > > (we used it for more than just that in this feature, but that was the
> > > initial reasoning)
> > > What are you using NgRx for in the submitted work that goes beyond the
> > > aggregation feature?
> > >
> > >
> > >
> > > > On Thu, May 2, 2019 at 12:22 PM Shane Ardell
> > > wrote:
> > >
> > > > Hello everyone,
> > > >
> > > > In response to discussions in the 0.7.1 release thread, I wanted to
> > > start a
> > > > thread regarding the parser aggregation work for the Management UI.
> For
> > > > anyone who has not already read and tested the PR locally, I've
> added a
> > > > detailed description of what we did and why to the JIRA ticket here:
> > > > https://issues.apache.org/jira/browse/METRON-1856
> > > >
> > > > I'm wondering what the community thinks about what we've built thus
> > far.
> > > Do
> > > > you see anything missing that must be part of this new feature in the
> > UI?
> > > > Are there any strong objections to how we implemented it?
> > > >
> > > > I’m also looking to see if anyone has any thoughts on how we can
> > possibly
> > > > simplify this PR. Right now it's pretty big, and there are a lot of
> > > commits
> > > > to parse through, but I'm not sure how we could break this work out
> > into
> > > > separate, smaller PRs opened against master. We could try to
> > cherry-pick
> > > > the commits into smaller PRs and then merge them into a feature
> branch,
> > > but
> > > > I'm not sure if that's worth the effort since that will only reduce
> the
> > > > number commits to review, not the lines changed.
> > > >
> > > > As an aside, I also want to give a little background into the
> > > introduction
> > > > of NgRx in this PR. To give a little background on why we chose to do
> > > this,
> > > > you can refer to the discussion thread here:
> > > >
> > > >
> > >
> >
> https://lists.apache.org/thread.html/06a59ea42e8d9a9dea5f90aab4011e44434555f8b7f3cf21297c7c87@%3Cdev.metron.apache.org%3E
> > > >
> > > > We previously discussed introducing a better way to manage
> application
> > > > state in both UIs in that thread. It was decided that NgRx was a
> great
> > > tool
> > > > for many reasons, one of them being that we can piecemeal it into the
> > > > application rather than doing a huge rewrite of all the application
> > state
> > > > at once. The contributors in this PR (myself included) decided this
> > would
> > > > be a perfect opportunity to introduce NgRx into the Management UI
> since
> > > we
> > > > need to manage the previous and current state with the grouping
> feature
> > > so
> > > > that users can undo the changes they've made (we used it for more
> than
> > > just
> > > > that in this feature, but that was the initial reasoning). In
> addition,
> > > we
> > > > greatly benefited from this when it came time to debug our work in

Re: [DISCUSS] Full-dev role in PR testign

2019-05-03 Thread Nick Allen

I'm exploring the use of TestContainers right now as part of the HDP 3.1
effort.  Still exploring feasibility, but it is looking promising.

On Fri, May 3, 2019 at 10:46 AM Justin Leet  wrote:

> I think everything Casey mentioned is a good call-out as things start to
> build into specifics. I definitely agree it's a very nontrivial amount of
> work, but that lowering the barrier of entry to a lot of PRs eases the
> burden on both new and existing contributors by a substantial amount.
>
> @Mike,
> As a heads up, I (super briefly) looked into the Docker stuff a bit, and
> the extension idea may not work with the Docker stuff to the extent we want
> (at least without us doing some additional work ourselves).  It seems like
> at least what I linked earlier and some other stuff actually provide direct
> annotations rather than plugging directly into the same extensions idea.
>
> Before we dive into it too much, it might be worth playing around with it
> more and coming back to the group with a couple options. If you're
> interested in looking into it, I *suspect* it'll boil down to a couple
> options
> * Use Docker with something like the above link or testcontainers.
> It's possible the Docker stuff ends up
> being lightweight / fast enough to use on at least a per class basis like
> we do now, rather than trying to generalize across all tests immediately.
> * Roll our own code to have more fine-grained control over the Docker
> lifecycle and which components need to be spun up for extensions.
> * Figure out how to make the prior options play nice
> * Other
>
> I'll probably dig a bit on my own, but I'm not sure how much focused time
> I'll be able to put into it in the absolute immediate term.  I can probably
> whip up a quick demo of the extensions stuff over the next week or so in a
> one-off project to give a bit of a demo and maybe help out with some of
> experimentation. Feel free to reach out if there's anything in particular
> that would be helpful to look into.
>
> The extensions stuff does have a lot of benefits, but I had fewer components
> to work with and didn't have the same classpath worries that made real
> instances (i.e. Docker) more attractive. It was sufficient for our
> purposes, but there might have to be compromises here. We depend on a lot
> more of the Hadoop stack.
>
> Migrating to JUnit 5 in general *should* be pretty easy. I don't think we
> really use any of the stuff that migrated in odd ways, so it should mostly
> just be updating annotations and imports (@Before to @BeforeEach, etc.).
> I'm sure this glosses over at least a few gotchas, though.
>
> On Fri, May 3, 2019 at 10:09 AM Michael Miklavcic <
> michael.miklav...@gmail.com> wrote:
>
> > I didn't get a chance to say so earlier, but Justin, I also like the
> JUnit
> > 5 extension suggestion. I've gone through some en-masse changes before,
> > e.g. standardizing the log4j construction idiom, and it honestly wasn't
> too
> > bad. Just a thought, it might make sense to kick this off by upgrading
> > overall JUnit 4 to 5 across the code base, and then diving into some of
> the
> > more 5-specific changes you're recommending as needed. I created this
> Jira
> > a bit ago - https://issues.apache.org/jira/browse/METRON-2037. That was
> to
> > upgrade to 4.13, but we might be able to kill 2 birds with one stone if
> we
> > go to JUnit 5. I'm volunteering to look into this and/or see the work
> > through to completion. What do you think?
> >
> > > - debuggability (right now we run the tests in the same JVM and setting
> >breakpoints is trivial, even in the innards of Hadoop.  This is very
> >valuable for figuring out what's going wrong and we'll need SOME
> > solution
> >for it)
> >
> > Yeah Casey, I remember this from the last time we discussed it. That's the
> > most important issue to be sure we have a handle on, imo. We'll need to figure
> > out remote debugging in Docker containers. Not to mention, the execution
> > path becomes a bit more spread out when we're running multiple components
> > as nature intended across multiple processes.
> >
> >
> >
> > On Fri, May 3, 2019 at 7:14 AM Casey Stella  wrote:
> >
> > > I just want to chime in and say I'm STRONGLY in favor of a docker-based
> > > approach to testing (I specifically like the JUnit 5 extensions
> > > suggestion).  I think that forcing a full-dev evaluation for every
> small
> > PR
> > > is a barrier to entry that I'd like to overcome.  I also think that
> this
> > is
> > > going to not be trivial.
> > >
> > > There will be weirdness/drama with:
> > >
> > >- cleanup
> > >- setup in situations where multi-components are used
> > >- debuggability (right now we run the tests in the same JVM and
> > setting
> > >breakpoints is trivial, even in the innards of Hadoop.  This is very
> > >valuable for figuring out what's going wrong and we'll need SOME
> > > solution
> > >for it)
> > >- possible resource

Re: [DISCUSS] Metron Release - 0.7.1 next steps

2019-05-02 Thread Nick Allen

I think any open source project needs to strive to cut releases regularly.
This is healthy for the project and community.  It gets new features and
functionality out to the community so we can get feedback, find what is
working and what is not, iterate and improve.  You probably agree with this.

While releasing this week or next may not matter in the grand scheme, if we
want to cut releases regularly, then we need to bear down and just do it.
Case in point, I opened the initial discussion for this release on March
13th [1] and it is now May 2nd and we have yet to release 7 weeks later.

--
[1]
https://lists.apache.org/thread.html/4f58649139f0aa6276f96febe1d0ecf9e6b3fb5b2b088cba1e3c4d81@%3Cdev.metron.apache.org%3E


On Thu, May 2, 2019 at 11:51 AM Michael Miklavcic <
michael.miklav...@gmail.com> wrote:

> As a more general question, can I ask why we're feeling pressure to push
> out a release in the first place? Again, I'm happy to continue with option
> 2. Let's move forward and get out the release. But is there a reason why we
> think it has to get out now, versus next week, or the week after? Otto
> pointed out a legitimate issue, dev environment or not, and I'm unclear why
> we have an issue with waiting for the fix. There's no pressure on this,
> imho.
>
> On Thu, May 2, 2019, 9:12 AM Otto Fowler  wrote:
>
> > I remember this now, but I’m not sure how I would have related this to a
> > parser aggregation pr honestly.
> >
> >
> > On May 2, 2019 at 07:54:13, Shane Ardell (shane.m.ard...@gmail.com)
> wrote:
> >
> > Here's a link to the ngrx discussion thread from a few months back:
> >
> >
> https://lists.apache.org/thread.html/06a59ea42e8d9a9dea5f90aab4011e44434555f8b7f3cf21297c7c87@%3Cdev.metron.apache.org%3E
> >
> > On Thu, May 2, 2019 at 1:17 PM Otto Fowler 
> > wrote:
> >
> > > If you can find a link in the archives for that thread, it would really
> > > help.
> > >
> > > I don’t think sending them up as one sensor would work…. as something
> > > quick. I think it is an interesting idea from a higher level that would
> > > > need some more thought though ( IE: what if every sensor in the ui was a
> > > > sensor group, and the existing entries were just groups of 1 ).
> > >
> > > As far as I can see, we have brought up the idea of a release
> ourselves,
> > I
> > > don’t see why we don’t just swarm this issue and get it right then
> > release.
> > >
> > >
> > >
> > > On May 2, 2019 at 04:16:31, Tamás Fodor (ftamas.m...@gmail.com) wrote:
> > >
> > > > In PR#1360 we introduced a new state management strategy involving a new
> > > > module called Ngrx. We had a discussion thread on this a few months ago and
> > > > we successfully convinced you of the benefits. This is one of the
> > > > reasons why this PR will still be huge after cleaning up the commit
> > > > history. After you have had a look at the changes and the feature itself,
> > > > you will likely have questions about why certain parts work as they do. The
> > > > thing I'd like to point out is that, yes, it will probably take more time
> > > > to get it in.
> > > >
> > > > In order to be able to release the RC, wouldn't it be an easy and quick
> > > > fix on the backend if it sent the aggregated parsers to the client as if they
> > > > were one sensor? It's just an idea, it might be wrong, but at least we
> > > > shouldn't have to wait until the aforementioned PR gets ready to be merged
> > > > to the master.
> > >
> > > On Wed, May 1, 2019 at 4:16 PM Justin Leet 
> > wrote:
> > >
> > > > Short version: I'm in favor of #2 of 0.7.1 and #1 as a blocker for
> > 0.8.0.
> > > > #3 seems like a total waste of time and effort.
> > > >
> > > > The wall of text version:
> > > > I agree this isn't "just the wrong thing shown", but for completely
> > > > different reasons.
> > > >
> > > > > To be extremely clear about what the problem is: Our "dev" environment
> > > > > (whose very name implies the audience is developers) uses a performance-based
> > > > > advanced feature to ensure that all of our sample flows are regularly run
> > > > > and produce data. This feature has a bare minimal implementation to be
> > > > enabled via Ambari, which it currently is by default. This is because
> > of
> > > > the limited resources available that previously resulted in us
> turning
> > > off
> > > > Yaf, and therefore testing it during regular full dev runs. Right now
> > > > however, this feature is not exposed through the management UI, and
> > > > therefore it isn't obvious what the implications are. Am I missing
> > > anything
> > > > here?
> > > >
> > > > For users actually choosing to use the parser aggregation feature in
> a
> > > > non-full-dev environment, I'd expect substantially more care to be
> > > involved
> > > > given the lack of easy configuration for it (after all, why would you
> > > > bother running the aggregated parser alongside the regular parser?
> This
> > > > could be more explicitly stated, but again that feels like a doc

Re: [DISCUSS] Metron Release - 0.7.1 next steps

2019-05-02 Thread Nick Allen

To echo Justin's comments, I am in favor of #2, which provides a clear,
well-defined path to a release.

   - Why hold back a release, especially a point release containing 89
   improvements, for one issue that will not affect most users?


   - It is one thing to stall a release to address a bug of limited scope,
   where a fix is well understood and ready for review, but it is completely
   another issue to delay for this.


   - I don't see a set of reviewable PRs yet that will push this over the
   finish line.  As has been noted, there were fundamental problems with #1360
   (which has now been closed) that would have prevented adequate review by
   the community.


   - Why drive this issue with the pressure of a stalled release, instead
   of just releasing the fix when it is ready and has been adequately
   reviewed?  Swarming on an issue does not often produce quality results.

For those in favor of #1, can someone please provide a clear outline of
what the fix looks like?  How many PRs will this require?  When are these
PRs likely to be ready?  Who is driving this?  Tamás has already commented
that this is not a quick fix. This path is very murky to me, but maybe I am
just ignorant on this.

I would also urge other committers and users who don't have a binding vote
on the release to share their opinion on the path forward.




On Thu, May 2, 2019 at 7:17 AM Otto Fowler  wrote:

> If you can find a link in the archives for that thread, it would really
> help.
>
> I don’t think sending them up as one sensor would work…. as something
> quick.  I think it is an interesting idea from a higher level that would
> need some more thought though ( IE: what if every sensor in the ui was a
> sensor group, and the existing entries were just groups of 1 ).
>
> As far as I can see, we have brought up the idea of a release ourselves, I
> don’t see why we don’t just swarm this issue and get it right then release.
>
>
>
> On May 2, 2019 at 04:16:31, Tamás Fodor (ftamas.m...@gmail.com) wrote:
>
> In PR#1360 we introduced a new state management strategy involving a new
> module called Ngrx. We had a discussion thread on this a few months ago and
> we successfully convinced you of the benefits. This is one of the
> reasons why this PR will still be huge after cleaning up the commit
> history. After you have had a look at the changes and the feature itself,
> you will likely have questions about why certain parts work as they do. The
> thing I'd like to point out is that, yes, it will probably take more time
> to get it in.
>
> In order to be able to release the RC, wouldn't it be an easy and quick
> fix on the backend if it sent the aggregated parsers to the client as if they
> were one sensor? It's just an idea, it might be wrong, but at least we
> shouldn't have to wait until the aforementioned PR gets ready to be merged
> to the master.
>
> On Wed, May 1, 2019 at 4:16 PM Justin Leet  wrote:
>
> > Short version: I'm in favor of #2 of 0.7.1 and #1 as a blocker for 0.8.0.
> > #3 seems like a total waste of time and effort.
> >
> > The wall of text version:
> > I agree this isn't "just the wrong thing shown", but for completely
> > different reasons.
> >
> > To be extremely clear about what the problem is: Our "dev" environment
> > (whose very name implies the audience is developers) uses a performance-based
> > advanced feature to ensure that all of our sample flows are regularly run
> > and produce data. This feature has a bare minimal implementation to be
> > enabled via Ambari, which it currently is by default. This is because of
> > the limited resources available that previously resulted in us turning
> off
> > Yaf, and therefore testing it during regular full dev runs. Right now
> > however, this feature is not exposed through the management UI, and
> > therefore it isn't obvious what the implications are. Am I missing
> anything
> > here?
> >
> > For users actually choosing to use the parser aggregation feature in a
> > non-full-dev environment, I'd expect substantially more care to be
> involved
> > given the lack of easy configuration for it (after all, why would you
> > bother running the aggregated parser alongside the regular parser? This
> > could be more explicitly stated, but again that feels like a doc problem.
> > Right now I could essentially provide two of the same parser and create
> the
> > same problem, so right now aggregation is only special because it runs on
> > dev by default). This is, in my opinion, primarily a first impression
> > problem and likely one of many areas that could use improved
> documentation.
> >
> > Quite frankly, I think the issue pointed out here could mostly be
> resolved
> > by documenting how the current aggregation is done in dev, and telling
> how
> > to change it. Especially for a 0.x.1 release, which is primarily bug
> > fixes. As can be inferred from my vote, I don't think this problem is a
> > problem that needs solving in a point release. I would

Re: [VOTE] Metron Release Candidate 0.7.1-RC1

2019-04-28 Thread Nick Allen

I agree with Justin.  My +1 stands.

Considering that this is a known gap, we have already released with this
gap, and we have a backlog of numerous improvements that should be released
to the community, I am not in favor of delaying the release.  Metron
provides a wide variety of functionality at varying levels of maturity.
This is to be expected.  If we expect perfection, we will never get a
release out.


On Sat, Apr 27, 2019 at 6:12 PM Justin Leet  wrote:

> Mike is correct, that is because of the combination of full dev
> restrictions and the lack of support in the configuration UI for parser
> aggregation.  This was introduced in
> https://github.com/apache/metron/pull/1207 and also was true of the last
> release. Currently, parser aggregation is an advanced/manual feature whose
> (bare minimum) configuration can be done via Ambari, out of convenience.
>
> I haven't looked into it, but https://github.com/apache/metron/pull/1360 is
> likely the work for this (and needs additional work before merging).
>
> I'm personally letting my binding +1 stand, although I would support either
> ensuring we get that PR cleaned up and in and/or additional documentation
> regarding the current limitations of this feature.
>
>
> On Sat, Apr 27, 2019 at 2:38 PM Anand Subramanian <
> asubraman...@hortonworks.com> wrote:
>
> > I can confirm that I've seen the Mgmt UI show the sensor status correctly
> > when they run as single topologies.
> >
> > -Anand
> >
> > On 4/27/19, 11:37 PM, "Michael Miklavcic" 
> > wrote:
> >
> > I believe that is bc of parser aggregation. The UI does not support
> it
> > currently. IIRC there was a PR to change the bro, snort, and yaf
> > sensors to
> > aggregated bc full dev didn't have enough resources. The upshot is
> > that the
> > UI still works for single sensors, but the feature for enabling
> > aggregated
> > sensors has not yet been completed.
> >
> > On Sat, Apr 27, 2019, 11:33 AM Otto Fowler 
> > wrote:
> >
> > > -1
> > >
> > > Ran the script and ran full dev, all good.
> > > In the configuration ui, the status of the sensors is not correct.
> > It
> > > does not show any running, but they are running in storm and the
> > data was
> > > moved correctly.
> > >
> > >
> > > On April 26, 2019 at 09:58:02, Otto Fowler (
> ottobackwa...@gmail.com)
> > > wrote:
> > >
> > > Curious Anand,
> > > are your steps for bringing up an open stack cluster something we
> > could
> > > script like the AWS stuff?
> > >
> > >
> > > On April 26, 2019 at 09:35:29, Anand Subramanian (
> > > asubraman...@hortonworks.com) wrote:
> > >
> > > +1 (non-binding)
> > >
> > > * Built RPMs and mpacks.
> > > * Brought up Metron stack on 12-node CentOS 7 openstack cluster.
> > > * Ran sensor-stubs and validated events in the Alerts UI for the
> > default
> > > sensors.
> > > * Management UI, Alerts UI and Swagger UI sanity check
> > >
> > > Regards,
> >     > Anand
> > >
> > > On 4/26/19, 5:18 AM, "Nick Allen"  wrote:
> > >
> > > +1 Verified release with all documented steps and ran up Full Dev.
> > >
> > > On Thu, Apr 25, 2019 at 6:10 PM Michael Miklavcic <
> > > michael.miklav...@gmail.com> wrote:
> > >
> > > > Ok cool, just finished the validation and updated the steps in
> the
> > doc to
> > > > reflect the current code base.
> > > >
> > > > On Thu, Apr 25, 2019 at 3:45 PM Nick Allen 
> > wrote:
> > > >
> > > > > No voting required. Those are just docs. Whoever is willing to
> > correct
> > > > > and has access, should be able to. Good catch.
> > > > >
> > > > > On Thu, Apr 25, 2019 at 4:32 PM Michael Miklavcic <
> > > > > michael.miklav...@gmail.com> wrote:
> > > > >
> > > > > > We're also not "incubator-metron" any longer. Do we require
> > any kind
> > > of
> > > > > > voting or +1 on that verification page to make corrections to
> > it?
> > > > > >
> > > > > > On Thu, Apr 25, 2019 at 2:29 PM Michael Miklavcic <

Re: [VOTE] Metron Release Candidate 0.7.1-RC1

2019-04-25 Thread Nick Allen

No, not AWS.

On Thu, Apr 25, 2019, 8:49 PM Michael Miklavcic 
wrote:

> Just curious, did you also do AWS? I haven't run that in a while. Wondered
> if it worked ok.
>
> On Thu, Apr 25, 2019, 5:48 PM Nick Allen  wrote:
>
> > +1 Verified release with all documented steps and ran up Full Dev.
> >
> > On Thu, Apr 25, 2019 at 6:10 PM Michael Miklavcic <
> > michael.miklav...@gmail.com> wrote:
> >
> > > Ok cool, just finished the validation and updated the steps in the doc
> to
> > > reflect the current code base.
> > >
> > > On Thu, Apr 25, 2019 at 3:45 PM Nick Allen  wrote:
> > >
> > > > No voting required.  Those are just docs.  Whoever is willing to
> > correct
> > > > and has access, should be able to.  Good catch.
> > > >
> > > > On Thu, Apr 25, 2019 at 4:32 PM Michael Miklavcic <
> > > > michael.miklav...@gmail.com> wrote:
> > > >
> > > > > We're also not "incubator-metron" any longer. Do we require any
> kind
> > of
> > > > > voting or +1 on that verification page to make corrections to it?
> > > > >
> > > > > On Thu, Apr 25, 2019 at 2:29 PM Michael Miklavcic <
> > > > > michael.miklav...@gmail.com> wrote:
> > > > >
> > > > > > fyi, the steps in this doc have changed slightly per this naming
> > > > > > convention change as well -
> > > > > >
> > https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds.
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Thu, Apr 25, 2019 at 1:25 PM Justin Leet <
> justinjl...@gmail.com
> > >
> > > > > wrote:
> > > > > >
> > > > > >> For everyone taking the time to validate and vote on the RC,
> there
> > > is
> > > > a
> > > > > >> caveat.  The naming conventions for the two repos are now
> aligned
> > > > > >> (_, instead of being '-' in the main repo
> and
> > > '_'
> > > > in
> > > > > >> the plugin repo) along with the location of the KEYS file, I
> have
> > a
> > > PR
> > > > > out
> > > > > >> to update the metron-rc-check script (
> > > > > >> https://github.com/apache/metron/pull/1394).
> > > > > >>
> > > > > >> This accounts for both of these changes, and should allow the
> > script
> > > > to
> > > > > be
> > > > > >> run normally.
> > > > > >>
> > > > > >> On Thu, Apr 25, 2019 at 3:22 PM Justin Leet <
> > justinjl...@gmail.com>
> > > > > >> wrote:
> > > > > >>
> > > > > >> > This is a call to vote on releasing Apache Metron 0.7.1
> > > > > >> >
> > > > > >> > Full list of changes in this release:
> > > > > >> >
> https://dist.apache.org/repos/dist/dev/metron/0.7.1-RC1/CHANGES
> > > > > >> > The tag to be voted upon is:
> > > > > >> > apache-metron_0.7.1-rc1
> > > > > >> >
> > > > > >> > The source archives being voted upon can be found here:
> > > > > >> >
> > > > > >> >
> > > > > >>
> > > > >
> > > >
> > >
> >
> https://dist.apache.org/repos/dist/dev/metron/0.7.1-RC1/apache-metron_0.7.1-rc1.tar.gz
> > > > > >> >
> > > > > >> > Other release files, signatures and digests can be found here:
> > > > > >> > https://dist.apache.org/repos/dist/dev/metron/0.7.1-RC1/
> > > > > >> >
> > > > > >> > The release artifacts are signed with the following key:
> > > > > >> > https://dist.apache.org/repos/dist/release/metron/KEYS
> > > > > >> > Please vote on releasing this package as Apache Metron
> 0.7.1-RC1
> > > > > >> >
> > > > > >> > When voting, please list the actions taken to verify the
> > release.
> > > > > >> >
> > > > > >> > Recommended build validation and verification instructions are
> > > > posted
> > > > > >> > here:
> > > > > >> >
> > > https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds
> > > > > >> >
> > > > > > >> > This vote will be open until 4pm EDT on Tuesday April 30 2019, to
> > > > > > >> > account for the weekend.
> > > > > >> >
> > > > > >> > [ ] +1 Release this package as Apache Metron 0.7.1-RC1
> > > > > >> >
> > > > > >> > [ ] 0 No opinion
> > > > > >> >
> > > > > >> > [ ] -1 Do not release this package because...
> > > > > >> >
> > > > > >>
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: [VOTE] Metron Release Candidate 0.7.1-RC1

2019-04-25 Thread Nick Allen

+1 Verified release with all documented steps and ran up Full Dev.
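
For anyone repeating the verification, the digest part of the check can be sketched roughly as below. This is only an illustration under stated assumptions, not the project's metron-rc-check script: the class name `DigestCheck`, its method names, and the choice of digest algorithm are hypothetical, and the expected hex string would come from whichever digest file is published next to the release archive. It does not cover the signature check against the KEYS file.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: compares a file's digest with a published hex value.
public class DigestCheck {

    // Streams the file through the named algorithm (e.g. "SHA-256", "SHA-512")
    // and returns the digest as lowercase hex.
    static String hexDigest(Path file, String algorithm)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance(algorithm);
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                md.update(buf, 0, n);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // Ignores whitespace and letter case in the expected value, since
    // published digest files are not always formatted the same way.
    static boolean matches(Path file, String algorithm, String expectedHex)
            throws IOException, NoSuchAlgorithmException {
        String normalized = expectedHex.replaceAll("\\s+", "").toLowerCase();
        return hexDigest(file, algorithm).equals(normalized);
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 3) {
            System.err.println("usage: DigestCheck <file> <algorithm> <expected-hex>");
            System.exit(2);
        }
        boolean ok = matches(Paths.get(args[0]), args[1], args[2]);
        System.out.println(ok ? "digest OK" : "digest MISMATCH");
    }
}
```

Running it with the downloaded archive, an algorithm name, and the published hex value prints whether the two agree; a mismatch would be a reason to vote -1 until the artifact is re-checked.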

On Thu, Apr 25, 2019 at 6:10 PM Michael Miklavcic <
michael.miklav...@gmail.com> wrote:

> Ok cool, just finished the validation and updated the steps in the doc to
> reflect the current code base.
>
> On Thu, Apr 25, 2019 at 3:45 PM Nick Allen  wrote:
>
> > No voting required.  Those are just docs.  Whoever is willing to correct
> > and has access, should be able to.  Good catch.
> >
> > On Thu, Apr 25, 2019 at 4:32 PM Michael Miklavcic <
> > michael.miklav...@gmail.com> wrote:
> >
> > > We're also not "incubator-metron" any longer. Do we require any kind of
> > > voting or +1 on that verification page to make corrections to it?
> > >
> > > On Thu, Apr 25, 2019 at 2:29 PM Michael Miklavcic <
> > > michael.miklav...@gmail.com> wrote:
> > >
> > > > fyi, the steps in this doc have changed slightly per this naming
> > > > convention change as well -
> > > > https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds.
> > > >
> > > >
> > > >
> > > > On Thu, Apr 25, 2019 at 1:25 PM Justin Leet 
> > > wrote:
> > > >
> > > >> For everyone taking the time to validate and vote on the RC, there
> is
> > a
> > > >> caveat.  The naming conventions for the two repos are now aligned
> > > >> (_, instead of being '-' in the main repo and
> '_'
> > in
> > > >> the plugin repo) along with the location of the KEYS file, I have a
> PR
> > > out
> > > >> to update the metron-rc-check script (
> > > >> https://github.com/apache/metron/pull/1394).
> > > >>
> > > >> This accounts for both of these changes, and should allow the script
> > to
> > > be
> > > >> run normally.
> > > >>
> > > >> On Thu, Apr 25, 2019 at 3:22 PM Justin Leet 
> > > >> wrote:
> > > >>
> > > >> > This is a call to vote on releasing Apache Metron 0.7.1
> > > >> >
> > > >> > Full list of changes in this release:
> > > >> > https://dist.apache.org/repos/dist/dev/metron/0.7.1-RC1/CHANGES
> > > >> > The tag to be voted upon is:
> > > >> > apache-metron_0.7.1-rc1
> > > >> >
> > > >> > The source archives being voted upon can be found here:
> > > >> >
> > > >> >
> > > >>
> > >
> >
> https://dist.apache.org/repos/dist/dev/metron/0.7.1-RC1/apache-metron_0.7.1-rc1.tar.gz
> > > >> >
> > > >> > Other release files, signatures and digests can be found here:
> > > >> > https://dist.apache.org/repos/dist/dev/metron/0.7.1-RC1/
> > > >> >
> > > >> > The release artifacts are signed with the following key:
> > > >> > https://dist.apache.org/repos/dist/release/metron/KEYS
> > > >> > Please vote on releasing this package as Apache Metron 0.7.1-RC1
> > > >> >
> > > >> > When voting, please list the actions taken to verify the release.
> > > >> >
> > > >> > Recommended build validation and verification instructions are
> > posted
> > > >> > here:
> > > >> >
> https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds
> > > >> >
> > > > >> > This vote will be open until 4pm EDT on Tuesday April 30 2019, to
> > > > >> > account for the weekend.
> > > >> >
> > > >> > [ ] +1 Release this package as Apache Metron 0.7.1-RC1
> > > >> >
> > > >> > [ ] 0 No opinion
> > > >> >
> > > >> > [ ] -1 Do not release this package because...
> > > >> >
> > > >>
> > > >
> > >
> >
>

Re: [VOTE] Metron Release Candidate 0.7.1-RC1

2019-04-25 Thread Nick Allen

No voting required.  Those are just docs.  Whoever is willing to correct
and has access, should be able to.  Good catch.

On Thu, Apr 25, 2019 at 4:32 PM Michael Miklavcic <
michael.miklav...@gmail.com> wrote:

> We're also not "incubator-metron" any longer. Do we require any kind of
> voting or +1 on that verification page to make corrections to it?
>
> On Thu, Apr 25, 2019 at 2:29 PM Michael Miklavcic <
> michael.miklav...@gmail.com> wrote:
>
> > fyi, the steps in this doc have changed slightly per this naming
> > convention change as well -
> > https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds.
> >
> >
> >
> > On Thu, Apr 25, 2019 at 1:25 PM Justin Leet 
> wrote:
> >
> >> For everyone taking the time to validate and vote on the RC, there is a
> >> caveat.  The naming conventions for the two repos are now aligned
> >> (_, instead of being '-' in the main repo and '_' in
> >> the plugin repo) along with the location of the KEYS file, I have a PR
> out
> >> to update the metron-rc-check script (
> >> https://github.com/apache/metron/pull/1394).
> >>
> >> This accounts for both of these changes, and should allow the script to
> be
> >> run normally.
> >>
> >> On Thu, Apr 25, 2019 at 3:22 PM Justin Leet 
> >> wrote:
> >>
> >> > This is a call to vote on releasing Apache Metron 0.7.1
> >> >
> >> > Full list of changes in this release:
> >> > https://dist.apache.org/repos/dist/dev/metron/0.7.1-RC1/CHANGES
> >> > The tag to be voted upon is:
> >> > apache-metron_0.7.1-rc1
> >> >
> >> > The source archives being voted upon can be found here:
> >> >
> >> >
> >>
> https://dist.apache.org/repos/dist/dev/metron/0.7.1-RC1/apache-metron_0.7.1-rc1.tar.gz
> >> >
> >> > Other release files, signatures and digests can be found here:
> >> > https://dist.apache.org/repos/dist/dev/metron/0.7.1-RC1/
> >> >
> >> > The release artifacts are signed with the following key:
> >> > https://dist.apache.org/repos/dist/release/metron/KEYS
> >> > Please vote on releasing this package as Apache Metron 0.7.1-RC1
> >> >
> >> > When voting, please list the actions taken to verify the release.
> >> >
> >> > Recommended build validation and verification instructions are posted
> >> > here:
> >> > https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds
> >> >
> >> > This vote will be open until 4pm EDT on Tuesday April 30 2019, to
> >> > account for the weekend.
> >> >
> >> > [ ] +1 Release this package as Apache Metron 0.7.1-RC1
> >> >
> >> > [ ] 0 No opinion
> >> >
> >> > [ ] -1 Do not release this package because...
> >> >
> >>
> >
>

Re: [DISCUSS] Next Release

2019-04-25 Thread Nick Allen

Justin - I cleaned up the last bit of stragglers in JIRA.  JIRA should be
ready for the release.

$ /dev-utilities/release-utils/validate-jira-for-release --version=0.7.1 --start=tags/apache-metron_0.7.0-release
...
JIRA         STATUS  FIX VERSION  ASSIGNEE           FIX
METRON-2091  Done    0.7.1        Ryan Merriman
METRON-2078  Done    0.7.1        Ryan Merriman
METRON-2065  Done    0.7.1        Ryan Merriman
METRON-2067  Done    0.7.1        Michael Miklavcic
METRON-2074  Done    0.7.1        Michael Miklavcic
METRON-2082  Done    0.7.1        Mohan
METRON-2006  Done    0.7.1        Justin Leet
METRON-2071  Done    0.7.1        Michael Miklavcic
METRON-2014  Done    0.7.1        Ryan Merriman
METRON-2026  Done    0.7.1        Ryan Merriman
METRON-2062  Done    0.7.1        Michael Miklavcic
METRON-2050  Done    0.7.1        Michael Miklavcic
METRON-2060  Done    0.7.1        Michael Miklavcic
METRON-2064  Done    0.7.1        Ryan Merriman
METRON-2066  Done    0.7.1        Michael Miklavcic
METRON-1654  Done    0.7.1        Ryan Merriman
METRON-2053  Done    0.7.1        Michael Miklavcic
METRON-2022  Done    0.7.1        Ryan Merriman
METRON-2056  Done    0.7.1        Nick Allen
METRON-2023  Done    0.7.1        Tibor Meller
METRON-2039  Done    0.7.1        Ryan Merriman
METRON-2052  Done    0.7.1        Tibor Meller
METRON-2051  Done    0.7.1        Michael Miklavcic
METRON-2023  Done    0.7.1        Tibor Meller
METRON-2046  Done    0.7.1        Anand Subramanian
METRON-2029  Done    0.7.1        Shane Ardell
METRON-2032  Done    0.7.1        Anand Subramanian
METRON-2038  Done    0.7.1        Nick Allen
METRON-2035  Done    0.7.1        Nick Allen
METRON-2041  Done    0.7.1        Michael Miklavcic
METRON-2036  Done    0.7.1        Michael Miklavcic
METRON-2030  Done    0.7.1        Ryan Merriman
METRON-2031  Done    0.7.1        Tibor Meller
METRON-2012  Done    0.7.1        Nick Allen
METRON-1971  Done    0.7.1        Shane Ardell
METRON-1940  Done    0.7.1        Mohan
METRON-2019  Done    0.7.1        Ryan Merriman
METRON-2016  Done    0.7.1        Ryan Merriman
METRON-1987  Done    0.7.1        Shane Ardell
METRON-1968  Done    0.7.1        Ryan Merriman
METRON-1778  Done    0.7.1        Nick Allen
METRON-1996  Done    0.7.1        Mohan
METRON-1944  Done    0.7.1        Tamas Fodor
METRON-2010  Done    0.7.1        Nick Allen
METRON-1998  Done    0.7.1        Ryan Merriman
METRON-2009  Done    0.7.1        Justin Leet
METRON-2005  Done    0.7.1        Justin Leet
METRON-2007  Done    0.7.1        Ryan Merriman
METRON-1986  Done    0.7.1        Nick Allen
METRON-1993  Done    0.7.1        Ryan Merriman
METRON-1999  Done    0.7.1        Tibor Meller
METRON-1985  Done    0.7.1        Nick Allen
METRON-1974  Done    0.7.1        Nick Allen
METRON-1970  Done    0.7.1        Nick Allen
METRON-1995  Done    0.7.1        Tamas Fodor
METRON-1973  Done    0.7.1        Tibor Meller
METRON-1948  Done    0.7.1        Ryan Merriman
METRON-1969  Done    0.7.1        Tibor Meller
METRON-1933  Done    0.7.1        Jon Zeolla
METRON-1962  Done    0.7.1        Anand Subramanian

Re: [DISCUSS] Next Release

2019-04-23 Thread Nick Allen

I was able to create the "0.7.1" release version in JIRA.

On Tue, Apr 23, 2019 at 1:41 PM Justin Leet  wrote:

> Absolutely. It'll probably be tomorrow before that gets into full swing.
>
> I don't believe we have a "0.7.1" release in Jira, and I (oddly enough)
> don't believe I have permissions to create it.  We'll need someone to get
> that created (James, maybe?).
>
> Until then, I'd like to ask everyone to make sure Jira is up to date
> (particularly with assignee, release version (or at least next+1 which is
> easy to find), along with status). Typically, I've taken care of getting
> Jira up to date in the last couple releases, but it would be helpful if
> everyone took a few minutes to update their own tickets, so I'm just doing
> touch-ups (and updating my own tickets).  If I can grab some time, I'll try
> to get a list of tickets to update to the appropriate contributors.
>
> Thanks,
> Justin
>
>
>
> On Tue, Apr 23, 2019, 09:54 Nick Allen  wrote:
>
> > Justin - Can you kick-off the release process?
> >
> > On Wed, Apr 17, 2019 at 11:51 AM Michael Miklavcic <
> > michael.miklav...@gmail.com> wrote:
> >
> > > I don't think it should hold up the release. It's *extremely* rare,
> even
> > > though it's come up twice. I wanted to be preemptive about bringing up
> > the
> > > intermittent failure list because we agreed as a community on this
> being
> > a
> > > regular release cycle review step. I think all those voting on the
> > release
> > > should have a full awareness of issues seen and vote accordingly as to
> > > whether we should move forward or address a problem first. Having
> > reviewed
> > > the test-failures list, I'm in favor of moving forward with the
> release.
> > >
> > > On Wed, Apr 17, 2019 at 9:29 AM Nick Allen  wrote:
> > >
> > > > Oh, and that failure from 21 days ago was caused by a downstream
> issue
> > > > brought in by PR #1364, that we then later reverted [1].  So at least
> > > that
> > > > > particular issue has been identified and addressed and should no longer
> > > > > impact builds.
> > > >
> > > > ---
> > > > [1]
> > > >
> > > >
> > >
> >
> https://github.com/apache/metron/commit/cad679a5948ce0ee9866e61bbf75b1f5f255682c
> > > >
> > > > On Wed, Apr 17, 2019 at 11:25 AM Nick Allen 
> > wrote:
> > > >
> > > > > Are you suggesting the release should wait on this?  Personally, I
> > > don't
> > > > > feel that we have any *persistently intermittent* test failures
> that
> > > > would
> > > > > block a release.  The last build failure on master I see was from
> 21
> > > days
> > > > > ago.
> > > > >
> > > > > Otherwise, I'd really like to kick off the next release.
> > > > >
> > > > > On Tue, Apr 16, 2019 at 11:25 AM Michael Miklavcic <
> > > > > michael.miklav...@gmail.com> wrote:
> > > > >
> > > > >> FYI, I just saw one of our reported intermittent test failures pop
> > up
> > > > >> again
> > > > >> today - https://issues.apache.org/jira/browse/METRON-1814
> > > > >>
> > > > >> On Mon, Apr 15, 2019 at 2:22 PM Michael Miklavcic <
> > > > >> michael.miklav...@gmail.com> wrote:
> > > > >>
> > > > >> > I want to thread the needle on this. I just reviewed a PR from
> > Ryan
> > > > that
> > > > >> > addresses this and gave it a +1.
> > > > >> > https://github.com/apache/metron/pull/1381
> > > > >> >
> > > > >> > I think Jon Zeolla and Justin Leet should have an opportunity to
> > > chime
> > > > >> in
> > > > >> > as they also had commented about this request in the original
> PR.
> > > > >> >
> > > > >> > One other thing to note - we had agreed last time around to do a
> > > scan
> > > > >> over
> > > > >> > any intermittent test failures encountered and assess whether or
> > not
> > > > >> they
> > > > >> > were still valid. I'd like to ask the community to take a look
> and
> > > > >> comment
> > > > >> > on whether any of these has appeared for them. From what I can
> > tell

Re: [DISCUSS] Upgrade to HDP 3.1.0

2019-04-23 Thread Nick Allen

FYI - I opened a ticket to serve as an epic for this work and the feature
branch.

https://issues.apache.org/jira/browse/METRON-2088

On Mon, Apr 22, 2019 at 3:32 PM Michael Miklavcic <
michael.miklav...@gmail.com> wrote:

> +1 to starting a feature branch for this.
> +1 to removing our custom implementations if the newest revs are in fact
> stable now.
>
> Regarding the profile option - if it's possible to keep 2.6.5 for a bit and
> not require separate branches or code trees, this is probably OK.
> Otherwise, I'm inclined to take the approach we've taken in the past with
> every other upgrade and only support 1 version. I think we should prepare
> users for the likelihood that if/when we cut over, there will be no more
> updates to 2.6.x.
>
> I talked through this a bit with Nick and Ryan Merriman offline. There are
> a number of major version revs of components from HDP 2.6 to 3.x that are
> likely to have backwards compatibility problems. HBase is a big one that
> comes to mind - I noticed the HTable interface was deprecated while working
> through the coprocessor implementation, and Ryan found that it was removed
> completely in the new version. That affects our integration tests as well
> bc we have a rather large mock implementation of HBase in use that is built
> around the removed API. We will either need to migrate to the new API or
> find an alternative approach to integration testing with HBase.
>
> I'll let Nick add more detail in the Epic/Jira and feature branch plan, but
> here is a sampling of some of what we can expect to require some work to
> upgrade:
>
> - Ambari - the current MPack is incompatible with Ambari 2.7.3, however
>   there isn't a breaking changes document, so we'll have to work through
>   this brute force or hopefully find some help from the Ambari community.
> - MaaS - YARN major change
> - PCAP - HDFS, Kafka
> - Indexing - HDFS, Solr
> - All topologies - Kafka
> - Stellar - HDFS, HBase
> - Enrichment - HBase
> - Enrichment Coprocessor (the enrichments listing) - HBase
> - Integration tests - Kafka and HBase have changed considerably.
> - UI, REST - Solr, HDFS, HBase
> - Knox
> - Kerberos (hopefully this is a kick-the-tires effort, though there is
>   some possible risk if Ambari and the individual components introduce
>   changes here)
>
> Fortunately, Zookeeper appears to have stayed the same across versions. It
> might be worthwhile to get a chart of the versions for each platform added
> to the epic Jira for reference while performing this work.
>
> Best,
> Mike
>
>
> On Mon, Apr 22, 2019, 12:50 PM Nick Allen  wrote:
>
> > We currently support running Metron on an HDP 2.6.5
> > <
> >
> https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.5/bk_release-notes/content/comp_versions.html
> > >
> > cluster.
> > I'd like to get Metron updated to run in an HDP 3.1.0
> > <
> >
> https://docs.hortonworks.com/HDPDocuments/HDP3/HDP-3.1.0/release-notes/content/comp_versions.html
> > >
> > cluster.
> > This provides a number of significant updates to the core platform
> > components that we depend on like Kafka, HBase, Ambari, etc.
> >
> > ### Feature Branch
> >
> > I'd like to create a feature branch in which to do this.  This will take
> a
> > good amount of effort and multiple PRs. To avoid any impact to master as
> we
> > progress through this, a feature branch would make sense.
> >
> > If you have concerns or interest in this effort, please speak up.  Here
> are
> > some relevant discussion points based on what I know so far.
> >
> > ### CentOS 7
> >
> > CentOS 6 RPMs are no longer distributed for HDP 3.1.0, only CentOS 7
> RPMs.
> > Because of this we will likely need to transition Full Dev over to CentOS
> > 7.  I don't see a downside to doing this since 6 is rather old and I
> assume
> > that most users run variants of 7 already anyways.
> >
> > ### HDP 2.6.5
> >
> > I'd like to try and make these changes backwards compatible with HDP
> 2.6.5
> > if possible, but only as long as that does not increase our ongoing
> > development burden.
> >
> > For example, if I can simply define a separate build profile for 3.1.0
> and
> > things are generally backwards compatible, then I'm all for maintaining
> > support for 2.6.5.  On the other hand, I would not want to go as far as
> > maintaining separate master branches for each.  In my mind the ongoing
> cost
> > there is too high.
> >
> > ### HDP 2.5.6
> >
> > There are some workarounds in the code base that were introduced to
> support
> > HDP
> > 2.5.6
> > <
> >
> https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.6/bk_release-notes/content/comp_versions.html
> > >
> > when
> > we moved to HDP 2.6.5. There are some workarounds specifically for older
> > versions of Storm like 1.0.x. Rather than maintaining this going forward,
> > I'd prefer we remove this technical debt and not support anything older
> > than HDP 2.6.5.
> >
> >
> >
> >
> > Best,
> > Nick
> >
>

Re: [DISCUSS] Next Release

2019-04-23 Thread Nick Allen

Justin - Can you kick-off the release process?

On Wed, Apr 17, 2019 at 11:51 AM Michael Miklavcic <
michael.miklav...@gmail.com> wrote:

> I don't think it should hold up the release. It's *extremely* rare, even
> though it's come up twice. I wanted to be preemptive about bringing up the
> intermittent failure list because we agreed as a community on this being a
> regular release cycle review step. I think all those voting on the release
> should have a full awareness of issues seen and vote accordingly as to
> whether we should move forward or address a problem first. Having reviewed
> the test-failures list, I'm in favor of moving forward with the release.
>
> On Wed, Apr 17, 2019 at 9:29 AM Nick Allen  wrote:
>
> > Oh, and that failure from 21 days ago was caused by a downstream issue
> > brought in by PR #1364, that we then later reverted [1].  So at least
> that
> > particular issue has been identified and addressed and should no longer
> > impact builds.
> >
> > ---
> > [1]
> >
> >
> https://github.com/apache/metron/commit/cad679a5948ce0ee9866e61bbf75b1f5f255682c
> >
> > On Wed, Apr 17, 2019 at 11:25 AM Nick Allen  wrote:
> >
> > > Are you suggesting the release should wait on this?  Personally, I
> don't
> > > feel that we have any *persistently intermittent* test failures that
> > would
> > > block a release.  The last build failure on master I see was from 21
> days
> > > ago.
> > >
> > > Otherwise, I'd really like to kick off the next release.
> > >
> > > On Tue, Apr 16, 2019 at 11:25 AM Michael Miklavcic <
> > > michael.miklav...@gmail.com> wrote:
> > >
> > >> FYI, I just saw one of our reported intermittent test failures pop up
> > >> again
> > >> today - https://issues.apache.org/jira/browse/METRON-1814
> > >>
> > >> On Mon, Apr 15, 2019 at 2:22 PM Michael Miklavcic <
> > >> michael.miklav...@gmail.com> wrote:
> > >>
> > >> > I want to thread the needle on this. I just reviewed a PR from Ryan
> > that
> > >> > addresses this and gave it a +1.
> > >> > https://github.com/apache/metron/pull/1381
> > >> >
> > >> > I think Jon Zeolla and Justin Leet should have an opportunity to
> chime
> > >> in
> > >> > as they also had commented about this request in the original PR.
> > >> >
> > >> > One other thing to note - we had agreed last time around to do a
> scan
> > >> over
> > >> > any intermittent test failures encountered and assess whether or not
> > >> they
> > >> > were still valid. I'd like to ask the community to take a look and
> > >> comment
> > >> > on whether any of these has appeared for them. From what I can tell
> > >> after
> > >> > having looked them over, it looks like random Travis issues. I
> haven't
> > >> > experienced any of these locally in quite some time.
> > >> >
> > >>
> >
> https://issues.apache.org/jira/browse/METRON-2077?jql=project%20%3D%20METRON%20AND%20resolution%20%3D%20Unresolved%20AND%20labels%20%3D%20test-failure%20ORDER%20BY%20priority%20DESC%2C%20updated%20DESC
> > >> >
> > >> > Best,
> > >> > Mike
> > >> >
> > >> > On Fri, Apr 5, 2019 at 11:50 AM Ryan Merriman 
> > >> wrote:
> > >> >
> > >> >> Jon is correct.   I am actively working on this and hope to have it
> > >> >> completed soon.   I realize it will hold up the release so it's a
> > >> priority
> > >> >> for me.
> > >> >>
> > >> >> On Sat, Mar 30, 2019 at 6:09 PM zeo...@gmail.com  >
> > >> >> wrote:
> > >> >>
> > >> >> > Isn't the documentation already in progress?
> > >> >> >
> > >> >> >
> https://github.com/apache/metron/pull/1330#issuecomment-466453372
> > >> >> >
> > >> >> > If not I would still consider it important to complete prior to a
> > >> >> release
> > >> >> > and I agree with Justin's comments in
> > >> >> >
> > >> >> >
> > >> >> >
> > >> >>
> > >>
> >
> https://lists.apache.org/thread.html/50b89b919bd8bef3f7fcdef167cbd7e489fa74a1e2da3e4fddb08b13@

[DISCUSS] Upgrade to HDP 3.1.0

2019-04-22 Thread Nick Allen

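We currently support running Metron on an HDP 2.6.5
<https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.5/bk_release-notes/content/comp_versions.html>
cluster. I'd like to get Metron updated to run in an HDP 3.1.0
<https://docs.hortonworks.com/HDPDocuments/HDP3/HDP-3.1.0/release-notes/content/comp_versions.html>
cluster. This provides a number of significant updates to the core platform
components that we depend on like Kafka, HBase, Ambari, etc.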

### Feature Branch

I'd like to create a feature branch in which to do this.  This will take a
good amount of effort and multiple PRs. To avoid any impact to master as we
progress through this, a feature branch would make sense.

If you have concerns or interest in this effort, please speak up.  Here are
some relevant discussion points based on what I know so far.

### CentOS 7

CentOS 6 RPMs are no longer distributed for HDP 3.1.0, only CentOS 7 RPMs.
Because of this we will likely need to transition Full Dev over to CentOS
7.  I don't see a downside to doing this since 6 is rather old and I assume
that most users run variants of 7 already anyways.

### HDP 2.6.5

I'd like to try and make these changes backwards compatible with HDP 2.6.5
if possible, but only as long as that does not increase our ongoing
development burden.

For example, if I can simply define a separate build profile for 3.1.0 and
things are generally backwards compatible, then I'm all for maintaining
support for 2.6.5.  On the other hand, I would not want to go as far as
maintaining separate master branches for each.  In my mind the ongoing cost
there is too high.
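
To make the build-profile idea concrete, here is a rough sketch of how a
wrapper script might pick a Maven profile per HDP version. The profile ids
(HDP-3.1.0, HDP-2.6.5) and both function names are hypothetical; nothing in
the code base defines them yet. Only Maven's standard -P flag is assumed.

```shell
# Map an HDP version to a Maven profile id.
# NOTE: the profile ids below are hypothetical placeholders for illustration.
hdp_profile() {
  case "$1" in
    3.1.*) echo "HDP-3.1.0" ;;
    2.6.*) echo "HDP-2.6.5" ;;
    *) echo "unsupported HDP version: $1" >&2; return 1 ;;
  esac
}

# Print (rather than run) the build command for a given HDP version,
# so the sketch can be tried without a Metron checkout.
build_command() {
  profile="$(hdp_profile "$1")" || return 1
  echo "mvn clean install -P $profile"
}

build_command 3.1.0
```

If something like this worked, both platforms would build from the same
master branch and the only difference would be the profile passed to Maven.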

### HDP 2.5.6

There are some workarounds in the code base that were introduced to support
HDP 2.5.6
<https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.6/bk_release-notes/content/comp_versions.html>
when we moved to HDP 2.6.5. There are some workarounds specifically for older
versions of Storm like 1.0.x. Rather than maintaining this going forward,
I'd prefer we remove this technical debt and not support anything older
than HDP 2.6.5.




Best,
Nick

Re: [DISCUSS] Next Release

2019-04-17 Thread Nick Allen

Oh, and that failure from 21 days ago was caused by a downstream issue
brought in by PR #1364, that we then later reverted [1].  So at least that
particular issue has been identified and addressed and should no longer
impact builds.

---
[1]
https://github.com/apache/metron/commit/cad679a5948ce0ee9866e61bbf75b1f5f255682c

On Wed, Apr 17, 2019 at 11:25 AM Nick Allen  wrote:

> Are you suggesting the release should wait on this?  Personally, I don't
> feel that we have any *persistently intermittent* test failures that would
> block a release.  The last build failure on master I see was from 21 days
> ago.
>
> Otherwise, I'd really like to kick off the next release.
>
> On Tue, Apr 16, 2019 at 11:25 AM Michael Miklavcic <
> michael.miklav...@gmail.com> wrote:
>
>> FYI, I just saw one of our reported intermittent test failures pop up
>> again
>> today - https://issues.apache.org/jira/browse/METRON-1814
>>
>> On Mon, Apr 15, 2019 at 2:22 PM Michael Miklavcic <
>> michael.miklav...@gmail.com> wrote:
>>
>> > I want to thread the needle on this. I just reviewed a PR from Ryan that
>> > addresses this and gave it a +1.
>> > https://github.com/apache/metron/pull/1381
>> >
>> > I think Jon Zeolla and Justin Leet should have an opportunity to chime
>> in
>> > as they also had commented about this request in the original PR.
>> >
>> > One other thing to note - we had agreed last time around to do a scan
>> over
>> > any intermittent test failures encountered and assess whether or not
>> they
>> > were still valid. I'd like to ask the community to take a look and
>> comment
>> > on whether any of these has appeared for them. From what I can tell
>> after
>> > having looked them over, it looks like random Travis issues. I haven't
>> > experienced any of these locally in quite some time.
>> >
>> https://issues.apache.org/jira/browse/METRON-2077?jql=project%20%3D%20METRON%20AND%20resolution%20%3D%20Unresolved%20AND%20labels%20%3D%20test-failure%20ORDER%20BY%20priority%20DESC%2C%20updated%20DESC
>> >
>> > Best,
>> > Mike
>> >
>> > On Fri, Apr 5, 2019 at 11:50 AM Ryan Merriman 
>> wrote:
>> >
>> >> Jon is correct.   I am actively working on this and hope to have it
>> >> completed soon.   I realize it will hold up the release so it's a
>> priority
>> >> for me.
>> >>
>> >> On Sat, Mar 30, 2019 at 6:09 PM zeo...@gmail.com 
>> >> wrote:
>> >>
>> >> > Isn't the documentation already in progress?
>> >> >
>> >> > https://github.com/apache/metron/pull/1330#issuecomment-466453372
>> >> >
>> >> > If not I would still consider it important to complete prior to a
>> >> release
>> >> > and I agree with Justin's comments in
>> >> >
>> >> >
>> >> >
>> >>
>> https://lists.apache.org/thread.html/50b89b919bd8bef3f7fcdef167cbd7e489fa74a1e2da3e4fddb08b13@
>> >> > 
>> >> >
>> >> > Jon Zeolla
>> >> >
>> >> > On Thu, Mar 28, 2019, 2:16 PM Michael Miklavcic <
>> >> > michael.miklav...@gmail.com>
>> >> > wrote:
>> >> >
>> >> > > Jon and Ryan - this was a convo/negotiation between you two at the
>> >> time.
>> >> > > Any thoughts?
>> >> > >
>> >> > > On Thu, Mar 28, 2019 at 9:08 AM Nick Allen 
>> >> wrote:
>> >> > >
>> >> > > > Is anyone volunteering to take this on?  Would be nice to get a
>> >> release
>> >> > > > out.
>> >> > > >
>> >> > > > On Thu, Mar 14, 2019, 4:53 PM zeo...@gmail.com > >
>> >> > wrote:
>> >> > > >
>> >> > > > > We should likely get METRON-2014 in, based on
>> >> > > > >
>> >> > > > >
>> >> > > >
>> >> > >
>> >> >
>> >>
>> https://lists.apache.org/thread.html/13bd0ed5606ad4f3427f24a8e759d6bcb61ace76d4afcc9f48310a00@%3Cdev.metron.apache.org%3E
>> >> > > > >
>> >> > > > > On Thu, Mar 14, 2019 at 4:24 PM Michael Miklavcic <
>> >> > > > > michael.miklav...@gmail.com> wrote:
>> >> > > > >
>> >> > > > > > Ticket is now done and merge

Re: [DISCUSS] Next Release

2019-04-17 Thread Nick Allen

Are you suggesting the release should wait on this?  Personally, I don't
feel that we have any *persistently intermittent* test failures that would
block a release.  The last build failure on master I see was from 21 days
ago.

Otherwise, I'd really like to kick off the next release.

On Tue, Apr 16, 2019 at 11:25 AM Michael Miklavcic <
michael.miklav...@gmail.com> wrote:

> FYI, I just saw one of our reported intermittent test failures pop up again
> today - https://issues.apache.org/jira/browse/METRON-1814
>
> On Mon, Apr 15, 2019 at 2:22 PM Michael Miklavcic <
> michael.miklav...@gmail.com> wrote:
>
> > I want to thread the needle on this. I just reviewed a PR from Ryan that
> > addresses this and gave it a +1.
> > https://github.com/apache/metron/pull/1381
> >
> > I think Jon Zeolla and Justin Leet should have an opportunity to chime in
> > as they also had commented about this request in the original PR.
> >
> > One other thing to note - we had agreed last time around to do a scan
> over
> > any intermittent test failures encountered and assess whether or not they
> > were still valid. I'd like to ask the community to take a look and
> comment
> > on whether any of these has appeared for them. From what I can tell after
> > having looked them over, it looks like random Travis issues. I haven't
> > experienced any of these locally in quite some time.
> >
> https://issues.apache.org/jira/browse/METRON-2077?jql=project%20%3D%20METRON%20AND%20resolution%20%3D%20Unresolved%20AND%20labels%20%3D%20test-failure%20ORDER%20BY%20priority%20DESC%2C%20updated%20DESC
> >
> > Best,
> > Mike
> >
> > On Fri, Apr 5, 2019 at 11:50 AM Ryan Merriman 
> wrote:
> >
> >> Jon is correct.   I am actively working on this and hope to have it
> >> completed soon.   I realize it will hold up the release so it's a
> priority
> >> for me.
> >>
> >> On Sat, Mar 30, 2019 at 6:09 PM zeo...@gmail.com 
> >> wrote:
> >>
> >> > Isn't the documentation already in progress?
> >> >
> >> > https://github.com/apache/metron/pull/1330#issuecomment-466453372
> >> >
> >> > If not I would still consider it important to complete prior to a
> >> release
> >> > and I agree with Justin's comments in
> >> >
> >> >
> >> >
> >>
> https://lists.apache.org/thread.html/50b89b919bd8bef3f7fcdef167cbd7e489fa74a1e2da3e4fddb08b13@
> >> > 
> >> >
> >> > Jon Zeolla
> >> >
> >> > On Thu, Mar 28, 2019, 2:16 PM Michael Miklavcic <
> >> > michael.miklav...@gmail.com>
> >> > wrote:
> >> >
> >> > > Jon and Ryan - this was a convo/negotiation between you two at the
> >> time.
> >> > > Any thoughts?
> >> > >
> >> > > On Thu, Mar 28, 2019 at 9:08 AM Nick Allen 
> >> wrote:
> >> > >
> >> > > > Is anyone volunteering to take this on?  Would be nice to get a
> >> release
> >> > > > out.
> >> > > >
> >> > > > On Thu, Mar 14, 2019, 4:53 PM zeo...@gmail.com 
> >> > wrote:
> >> > > >
> >> > > > > We should likely get METRON-2014 in, based on
> >> > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> https://lists.apache.org/thread.html/13bd0ed5606ad4f3427f24a8e759d6bcb61ace76d4afcc9f48310a00@%3Cdev.metron.apache.org%3E
> >> > > > >
> >> > > > > On Thu, Mar 14, 2019 at 4:24 PM Michael Miklavcic <
> >> > > > > michael.miklav...@gmail.com> wrote:
> >> > > > >
> >> > > > > > Ticket is now done and merged. I'm also good on 0.7.1.
> >> > > > > >
> >> > > > > > On Thu, Mar 14, 2019 at 2:18 PM Justin Leet <
> >> justinjl...@gmail.com
> >> > >
> >> > > > > wrote:
> >> > > > > >
> >> > > > > > > I'm in favor doing a release, pending the ticket Mike
> pointed
> >> out
> >> > > > (and
> >> > > > > > > anything else someone comes up with).
> >> > > > > > >
> >> > > > > > > To the best of my knowledge, I think 0.7.1 is sufficient,
> but
> >> if
> >> > > > > someone
> >> > > > > > >

Re: Problems with Dev deployment.

2019-04-11 Thread Nick Allen

That script (metron-deployment/scripts/platform-info.sh) should be solid.
I was going to suggest that you run that on your computer and send us the
output.  That has often helped us debug issues like this before.  If it is
not finding Node/NPM, then that is a problem that needs to be addressed.
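
In the meantime, a quick independent check of whether node and npm are on
your PATH might look like the sketch below. This is not what
platform-info.sh does internally (I have not reproduced its logic here); the
check_tool function is just an illustrative helper built on the standard
`command -v`.

```shell
# Report whether a tool is resolvable on the PATH of the current shell.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found at $(command -v "$1")"
  else
    echo "$1: NOT found"
  fi
}

# Check the two tools that the pre-req script complained about.
for tool in node npm; do
  check_tool "$tool"
done
```

If this reports both tools as found while the script says otherwise, the
difference is likely in how the script looks them up, which would be worth
including in your report.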



On Wed, Apr 10, 2019 at 9:43 PM Dale Richardson 
wrote:

> Does the pre-req script at:
>
>
>   metron-deployment/scripts/platform-info.sh
>
> need to be updated?  I notice it is complaining that node and npm are NOT
> installed on my machine, yet I am able to build and deploy without any
> problems.  Is docker being used to host a build image for node/npm?
>
> Regards,
> Dale.
> 
> From: Michael Miklavcic 
> Sent: Wednesday, 10 April 2019 4:07 PM
> To: dev@metron.apache.org
> Cc: Dale Richardson
> Subject: Re: Problems with Dev deployment.
>
> Dale - to spin back around to what would best help you, for now you could
> reference docs from latest Metron master. You can either open the READMEs
> directly on github or just build them locally, ie.
> metron$ cd site-book
> metron/site-book$ mvn clean site && open target/site/index.html
>
> I'm not sure what's going on with the main product docs - I think we may
> have some old pages that didn't get cleaned up and are still being cached
> by Google, et al. For example, this shows our latest documentation
> correctly -
> https://metron.apache.org/current-book/metron-deployment/index.html.
> Actually, as long as you're seeing version 0.7.0 in the menu bar, you're
> good to go. No need to traverse the READMEs or build locally. Just follow
> the TOC there.
>
> Best,
> Mike
>
> On Wed, Apr 10, 2019 at 10:00 AM Michael Miklavcic <
> michael.miklav...@gmail.com> wrote:
> So does anyone know, have we not been updating the site-book when we do a
> release? That documentation is way outdated - v0.4.2??? - and should
> either be removed or set up under that version number in the URL as other
> projects do, e.g. Spark - https://spark.apache.org/documentation.html
>
> On Wed, Apr 10, 2019 at 9:34 AM zeo...@gmail.com wrote:
> Wow, I didn't realize that never got finalized/merged.  Looks like there is
> a failure in travis on that PR, if you get that wrapped up I would think we
> should take another look at that and maybe get it merged.  It has been a
> while but I recall I was pretty happy with it after my review cycle with
> you.
>
> - Jon Zeolla
> zeo...@gmail.com
>
>
> On Wed, Apr 10, 2019 at 9:17 AM Otto Fowler  > wrote:
>
> > These issues are the reason https://github.com/apache/metron/pull/1261 was
> > done.  It would be nice if we could get by them.
> >
> >
> > On April 10, 2019 at 08:13:04, Dale Richardson (tigerqu...@outlook.com)
> > wrote:
> >
> > Older pre-req versions are mentioned at:
> >
> > https://metron.apache.org/current-book/metron-deployment/vagrant/codelab-platform/index.html
> >
> > Metron – Developer Image for Apache Metron on Virtualbox
> >
> > Developer Image for Apache Metron on Virtualbox. This image is a fully
> > functional Metron installation that has been pre-loaded with Ambari, HDP
> > and Metron.
> > metron.apache.org

Re: [DISCUSS] Reverting METRON-2023 (PR #1364)

2019-04-03 Thread Nick Allen

+1
Thanks for digging into this.

On Wed, Apr 3, 2019, 6:07 AM Shane Ardell  wrote:

> Hello everyone,
>
> I'm sending out this email to notify the community of my intention to
> revert the commit for METRON-2023 (PR #1364). This commit introduced a bug
> stemming from yauzl, the tool that Cypress uses to unzip itself. This bug
> causes an error in Travis and is intermittent, which is how it passed
> multiple build attempts before being merged into master. The issue is
> currently being discussed with the Cypress team and can be viewed here
> .
>
> If anyone has any strong objection to this, please let me know here.
> Otherwise, I plan on reverting this in the next 24 hours.
>
> Shane
>

Re: [DISCUSS] Next Release

2019-03-28 Thread Nick Allen

Is anyone volunteering to take this on?  Would be nice to get a release
out.

On Thu, Mar 14, 2019, 4:53 PM zeo...@gmail.com  wrote:

> We should likely get METRON-2014 in, based on
>
> https://lists.apache.org/thread.html/13bd0ed5606ad4f3427f24a8e759d6bcb61ace76d4afcc9f48310a00@%3Cdev.metron.apache.org%3E
>
> On Thu, Mar 14, 2019 at 4:24 PM Michael Miklavcic <
> michael.miklav...@gmail.com> wrote:
>
> > Ticket is now done and merged. I'm also good on 0.7.1.
> >
> > On Thu, Mar 14, 2019 at 2:18 PM Justin Leet 
> wrote:
> >
> > > I'm in favor doing a release, pending the ticket Mike pointed out (and
> > > anything else someone comes up with).
> > >
> > > To the best of my knowledge, I think 0.7.1 is sufficient, but if
> someone
> > > comes up with something, it's not hard to pivot.
> > >
> > > On Wed, Mar 13, 2019, 13:08 Michael Miklavcic <
> > michael.miklav...@gmail.com
> > > >
> > > wrote:
> > >
> > > > I'd like to see this fixed for the next release.
> > > > https://issues.apache.org/jira/browse/METRON-2036. Even though it's
> a
> > > > non-prod issue, this is a core part of our infrastructure/development
> > > > lifecycle that is currently broken and fits with our previous
> > agreements
> > > of
> > > > holding a release until all intermittent test failures are addressed.
> > > >
> > > > On Wed, Mar 13, 2019 at 11:33 AM Nick Allen 
> > wrote:
> > > >
> > > > > I would like to open a discussion in regards to the next release.
> Our
> > > > last
> > > > > 0.7.0 release was on Dec 11th.
> > > > >
> > > > > I believe we have a significant number of bug fixes and performance
> > > > > improvements that would make a worthy point release; 0.7.1.
> > Although,
> > > we
> > > > > should review the change log and see if there are any breaking
> > changes
> > > > that
> > > > > would require a bump to the minor version.
> > > > >
> > > > > Thoughts?
> > > > >
> > > > > $ git log --format=%B  tags/apache-metron_0.7.0-release..HEAD |
> grep
> > > > > METRON-
> > > > > Merge remote-tracking branch 'apache/master' into METRON-2035
> > > > > METRON-2035 Allow User to Configure Role Names for Access Control
> > > > > METRON-2030 SensorParserGroupControllerIntegrationTest intermittent
> > > > errors
> > > > > (merrimanr via mmiklavc) closes apache/metron#1352
> > > > > METRON-2031 [UI] Turning off initial search request and polling by
> > > > default
> > > > > on Alerts UI (tiborm via mmiklavc) closes apache/metron#1353
> > > > > METRON-2012 Unable to Execute Stellar Functions Against HBase in
> the
> > > REPL
> > > > > (nickwallen) closes apache/metron#1345
> > > > > METRON-1971 Short timeout value in Cypress may cause build failures
> > > > > (sardell) closes apache/metron#1323
> > > > > METRON-1940 Check if not and install Elastic search templates /
> Solr
> > > > > collections when indexing server is restarted (MohanDV) closes
> > > > > apache/metron#1305
> > > > > METRON-2019 Improve Metron REST Logging (merrimanr) closes
> > > > > apache/metron#1347
> > > > > METRON-2016 Parser aggregate groups should be persisted and
> available
> > > > > through REST (merrimanr) closes apache/metron#1346
> > > > > METRON-1987 Upgrade Alert UI to stable Bootstrap 4 (sardell) closes
> > > > > apache/metron#1336
> > > > > METRON-1968 Messages are lost when a parser produces multiple
> > messages
> > > > and
> > > > > batch size is greater than 1 (merrimanr) closes apache/metron#1330
> > > > > METRON-1778 Out-of-order timestamps may delay flush in Storm
> Profiler
> > > > > (nickwallen) closes apache/metron#1197
> > > > > METRON-1996 Solr search throws NPE for group search if the group
> > > > parameter
> > > > > is null or empty (MohanDV via nickwallen) closes apache/metron#1333
> > > > > METRON-1944 Unable to Delete a Comment in Alerts UI (ruffle1986 via
> > > > > sardell) closes apache/metron#1307
> > > > > METRON-2010 Unable to Build Metron Due to Inaccessible Repository
> > > > > (nickwallen) closes apache/met

[DISCUSS] Next Release

2019-03-13 Thread Nick Allen

I would like to open a discussion in regards to the next release. Our last
0.7.0 release was on Dec 11th.

I believe we have a significant number of bug fixes and performance
improvements that would make a worthy point release, 0.7.1.  However, we
should review the change log and see if there are any breaking changes that
would require a bump to the minor version.
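
To make that review a bit easier, something like the following could reduce
the raw log to a de-duplicated list of ticket IDs. This is only a sketch:
the unique_tickets function name is made up for illustration, and it assumes
every relevant commit message mentions its ticket as METRON-<number>.

```shell
# Extract the unique METRON ticket IDs mentioned in text read from stdin.
unique_tickets() {
  grep -oE 'METRON-[0-9]+' | sort -u
}

# Usage against a real checkout (tag name as used elsewhere in this message):
#   git log --format=%B tags/apache-metron_0.7.0-release..HEAD | unique_tickets
#   git log --format=%B tags/apache-metron_0.7.0-release..HEAD | unique_tickets | wc -l
```

The list is sorted as plain text, so it is meant for looking tickets up in
Jira rather than for reading in merge order.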

Thoughts?

$ git log --format=%B  tags/apache-metron_0.7.0-release..HEAD | grep METRON-
Merge remote-tracking branch 'apache/master' into METRON-2035
METRON-2035 Allow User to Configure Role Names for Access Control
METRON-2030 SensorParserGroupControllerIntegrationTest intermittent errors
(merrimanr via mmiklavc) closes apache/metron#1352
METRON-2031 [UI] Turning off initial search request and polling by default
on Alerts UI (tiborm via mmiklavc) closes apache/metron#1353
METRON-2012 Unable to Execute Stellar Functions Against HBase in the REPL
(nickwallen) closes apache/metron#1345
METRON-1971 Short timeout value in Cypress may cause build failures
(sardell) closes apache/metron#1323
METRON-1940 Check if not and install Elastic search templates / Solr
collections when indexing server is restarted (MohanDV) closes
apache/metron#1305
METRON-2019 Improve Metron REST Logging (merrimanr) closes
apache/metron#1347
METRON-2016 Parser aggregate groups should be persisted and available
through REST (merrimanr) closes apache/metron#1346
METRON-1987 Upgrade Alert UI to stable Bootstrap 4 (sardell) closes
apache/metron#1336
METRON-1968 Messages are lost when a parser produces multiple messages and
batch size is greater than 1 (merrimanr) closes apache/metron#1330
METRON-1778 Out-of-order timestamps may delay flush in Storm Profiler
(nickwallen) closes apache/metron#1197
METRON-1996 Solr search throws NPE for group search if the group parameter
is null or empty (MohanDV via nickwallen) closes apache/metron#1333
METRON-1944 Unable to Delete a Comment in Alerts UI (ruffle1986 via
sardell) closes apache/metron#1307
METRON-2010 Unable to Build Metron Due to Inaccessible Repository
(nickwallen) closes apache/metron#1343
METRON-1998 Only one sensor is flushed by tick tuple (merrimanr) closes
apache/metron#1335
METRON-2009 Address Javadoc checkstyle issues in metron-common (justinleet)
closes apache/metron#1342
METRON-2005 Batch Writer writes 0-byte files to HDFS on rotation
(justinleet) closes apache/metron#1338
METRON-2007 Management UI not loading grok statements correctly (merrimanr)
closes apache/metron#1340
METRON-1986 Batch Profiler Fails to Resolve Stats Stellar Functions
(nickwallen) closes apache/metron#1328
METRON-1993 Stellar REST_GET should handle responses when content length is
less than zero (merrimanr) closes apache/metron#1331
METRON-1999 Adding validation against special characters to parser name
field (tiborm via sardell) closes apache/metron#1337
METRON-1985 Improve Error Handling When Cannot Connect to HBase
(nickwallen) closes apache/metron#1327
METRON-1974 Batch Profiler Should Handle Errant Profiles Better
(nickwallen) closes apache/metron#1326
METRON-1970 Add Metadata to Error Messages Generated During Parsing
(nickwallen) closes apache/metron#1325
METRON-1995 Arrow icon in date range selector moved to a wrong position
(ruffle1986 via sardell) closes apache/metron#1332
METRON-1973 Upgrade Alert UIs webpack-dev-server to 3.1.14 (tiborm
via sardell) closes apache/metron#1324
METRON-1948 Dropped messages from REGEX_SELECT parser field transformation
are not acked in Storm (merrimanr) closes apache/metron#1321
METRON-1969 Adding Cypress documentation to Alert UIs README.md
(tiborm via sardell) closes apache/metron#1322
METRON-1933 Improve build-utils helper scripts (JonZeolla via
ottobackwards) closes apache/metron#1297
METRON-1962 Make entering JDBC details in REST config to be optional
(anandsubbu) closes apache/metron#1318
METRON-1929 Build GET_ASN Stellar function (justinleet) closes
apache/metron#1299
METRON-1956 prepare-commit does not run all the tests it should
(ottobackwards) closes apache/metron#1315
METRON-1965 Knox should work on a multi-node installation (merrimanr)
closes apache/metron#1320
METRON-1939 Update version to 0.7.1 (justinleet via nickwallen) closes
apache/metron#1303
METRON-685 Scores in Threat Triage should be a Stellar Statement
(nickwallen) closes apache/metron#1311
METRON-1963 Remove left over integration test from before refactoring
(ottobackwards) closes apache/metron#1319
METRON-1945 Metron MPack support for Knox SSO setup (merrimanr) closes
apache/metron#1308
METRON-1878 Add Metron as a Knox service (merrimanr) closes
apache/metron#1275
METRON-1958 Optimize Cypress to use best practices (sardell via merrimanr)
closes apache/metron#1317
METRON-1957 5424 and 3164 parser configurations are packaged in wrong place
(ottobackwards) closes apache/metron#1316
METRON-1955 Update metron SPEC file to include syslog 3164 parser
(anandsubbu via ottobackwards) closes apache/metron#1314
METRON-1893 Create a syslog 3164

Re: [DISCUSS] Upgrading HBase and Kafka support

2019-03-08 Thread Nick Allen

+1 for option 3.  I am in favor of using Docker for the integration tests
for all the reasons that you mentioned.

On Fri, Mar 8, 2019 at 9:47 AM Ryan Merriman  wrote:

> I have been researching the effort involved to upgrade to HDP 3.  Along the
> way I've found a couple challenging issues that we will need to solve, both
> involving our integration testing strategy.
>
> The first issue is Kafka.  We are moving from 0.10.0 to 2.0.0 and there
> have been significant changes to the API.  This creates an issue in the
> KafkaComponent class, which we use as an in-memory Kafka server in
> integration tests.  Most of the classes that were previously used have gone
> away, and to the best of my knowledge, were not supported as public APIs.
> I also don't see any publicly documented APIs to replace them.
>
> The second issue is HBase.  We are moving from 1.1.2 to 2.0.2 so another
> significant change.  This creates an issue in the MockHTable class
> because the HTableInterface class has changed to Table, essentially
> requiring that MockHTable be rewritten to conform to the new interface.
> It's my opinion that this class is complicated and difficult to maintain as
> it is anyways.
>
> These 2 issues have the potential to add a significant amount of work to
> upgrading Metron to HDP 3.  I want to take a step back and review our
> options before we move forward.  Here are some initial thoughts I had on
> how to approach this.  For HBase:
>
>1. Update MockHTable to work with the new HBase API.  We would continue
>using a mock server approach for HBase.
>2. Research replacing MockHTable with an in-memory HBase server.
>3. Replace MockHTable with a Docker container running HBase.
>
> For Kafka:
>
>1. Replace KafkaComponent with a mock server implementation.
>2. Update KafkaComponent to work with the new API.  We would probably
>need to leverage some internal Kafka classes.  I do not see a testing
> API
>documented publicly.
>3. Replace KafkaComponent with a Docker container running Kafka.
>
> What other options are there?  Whatever we choose I think we should follow
> a similar approach for both (mock servers, in memory servers, Docker, other
> options I'm not thinking of).
>
> This will not shock anyone but I would be in favor of Docker containers.
> They have the advantage of classpath isolation, easy upgrades, and accurate
> integration testing.  The downside is we will have to adjust our tests and
> travis script to incorporate these Docker containers into our build
> process.  We have discussed this at length in the past and it has generally
> stalled for various reasons.  Maybe if we move a few services at a time it
> might be more palatable?  As for the other 2 approaches, I think if either
> worked well we wouldn't be having this discussion.  Mock servers are hard
> to maintain and I don't see in memory testing classes documented in
> javadocs for either service.
>
> Thoughts?
>

Re: [DISCUSS] Architecture documentation

2019-02-26 Thread Nick Allen

And just to be clear, this is just a continuation of the discussion in this
thread.  This is not in any way a blocker for Ryan's PR, of course.

On Tue, Feb 26, 2019 at 1:44 PM Nick Allen  wrote:

>
> We could also enhance
> "metron/dev-utilities/release-utils/validate-jira-for-release" so that it
> spits out a warning for any JIRAs that are marked "Next + 1", but don't
> have a record in the commit history.  That would at least be a reminder
> that the JIRA needs some follow-up.
>
>
>
>
>
> On Mon, Feb 25, 2019 at 5:25 PM Justin Leet  wrote:
>
>> Re: Labeling in Jira, I'm fine with having it be "Next + 1" from a release
>> management perspective, but I'd still consider at least taking action on
>> followup to be the relevant party's responsibility (implementer or
>> whatever
>> the case may be).  We probably should have a more clear way to tag things
>> like this, but I don't believe we do right now. If there's not a harder
>> dependency than my memory, chances are high it gets
>> overlooked/missed/whatever.
>>
>> On Mon, Feb 25, 2019 at 4:32 PM Otto Fowler 
>> wrote:
>>
>> > I really like the idea of architecture.md -> **/architecture.md.
>> >
>> > We overall do not have javadoc in a lot of areas, and could maybe start
>> > working on it as we go and think about asking for it in reviews.
>> > We are also missing the Parser Programmer’s Guide, how to add a parser
>> to
>> > the metron system/install etc and other things.
>> >
>> >
>> >
>> > On February 25, 2019 at 15:22:47, Ryan Merriman (merrim...@gmail.com)
>> > wrote:
>> >
>> > I feel like the code itself is pretty well documented. I updated
>> existing
>> > javadocs and added javadocs to classes that didn't have them before this
>> > PR. In my opinion the level of documentation for these classes has
>> > increased significantly.
>> >
>> > On Mon, Feb 25, 2019 at 1:52 PM Michael Miklavcic <
>> > michael.miklav...@gmail.com> wrote:
>> >
>> > > Tentatively agreed on further clarification of what we consider
>> in/out of
>> > > scope for documentation re: document something that wasn't documented
>> > > before. Ryan, can you give a quick summary of what you *have*
>> > added/updated
>> > > in documentation on this PR vs what you want to leave out?
>> > >
>> > > My initial concern in punting on docs right now is that part of what
>> made
>> > > this PR/task more challenging in the first place was not having
>> > > documentation. We risk losing context and detail again if we don't do
>> > this
>> > > immediately. Would it be reasonable to split it up as follows?:
>> > >
>> > > 1. Additional overarching documentation feels out of scope - make it a
>> > > follow on (see comments below).
>> > > 2. Adding documentation to our existing README's and java code
>> comments
>> > > that describe the new/modified functionality should be in scope
>> because
>> > > it's part of the unit of work. I expect that a developer should be
>> able
>> > > to
>> > > look at the code, tests, comments, and README's and understand how
>> this
>> > > code functions without having to start from scratch.
>> > >
>> > > The way we've handled follow-on work before, at least as far as
>> feature
>> > > branches are concerned, was to create Jiras and link them to the
>> > > appropriate discussions for context. Maybe we can take that one step
>> > > further and do the release manager a favor by also labeling the
>> > > required/requested release on the Jira as a gating factor. This
>> follows
>> > our
>> > > pattern for intermittent test failure reporting, e.g.
>> > >
>> > >
>> >
>> >
>> https://issues.apache.org/jira/browse/METRON-1946?jql=project%20%3D%20METRON%20AND%20resolution%20%3D%20Unresolved%20AND%20labels%20%3D%20test-failure%20ORDER%20BY%20priority%20DESC%2C%20updated%20DESC
>> > > .
>> > >
>> > > I'm also in favor of continuing to document architecture and technical
>> > > details as part of the code base as Ryan and Jon have suggested. I
>> think
>> > we
>> > > should have an "architecture.md" in metron root that replaces this -
>> > >
>> > >
>> >
>> >
>> https://github.com/apache/metron/blob/d7

Re: [DISCUSS] Architecture documentation

2019-02-26 Thread Nick Allen

We could also enhance
"metron/dev-utilities/release-utils/validate-jira-for-release" so that it
spits out a warning for any JIRAs that are marked "Next + 1", but don't
have a record in the commit history.  That would at least be a reminder
that the JIRA needs some follow-up.
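
To make the idea concrete, here is a rough sketch of the check in Python. It is not the existing validate-jira-for-release script; the function names are made up for illustration, and it assumes the list of flagged JIRA keys has already been pulled from JIRA some other way. The only thing taken from our repo is the commit subject convention of leading with the JIRA key.

```python
import re

# Commit subjects in this project lead with the JIRA key, for example
# "METRON-2010 Unable to Build Metron ... closes apache/metron#1343".
JIRA_KEY = re.compile(r"^(METRON-\d+)\b")


def jiras_missing_commits(flagged_jiras, commit_subjects):
    """Return flagged JIRA keys that never start a commit subject.

    Hypothetical helper: flagged_jiras is assumed to be the list of keys
    that were tagged for follow-up, however that list was obtained.
    """
    committed = set()
    for subject in commit_subjects:
        match = JIRA_KEY.match(subject.strip())
        if match:
            committed.add(match.group(1))
    # Keep the caller's ordering so the warnings are stable between runs.
    return [key for key in flagged_jiras if key not in committed]


def warnings_for(flagged_jiras, commit_subjects):
    """Format one warning line per flagged JIRA without a matching commit."""
    return [
        f"[WARN] {key} is flagged for follow-up but has no commit in this release"
        for key in jiras_missing_commits(flagged_jiras, commit_subjects)
    ]
```

The commit subjects could come from something like `git log --format=%s` over the release range. A warning rather than a hard failure seems right, since the point is only to remind the release manager.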





On Mon, Feb 25, 2019 at 5:25 PM Justin Leet  wrote:

> Re: Labeling in Jira, I'm fine with having it be "Next + 1" from a release
> management perspective, but I'd still consider at least taking action on
> followup to be the relevant party's responsibility (implementer or whatever
> the case may be).  We probably should have a more clear way to tag things
> like this, but I don't believe we do right now. If there's not a harder
> dependency than my memory, chances are high it gets
> overlooked/missed/whatever.
>
> On Mon, Feb 25, 2019 at 4:32 PM Otto Fowler 
> wrote:
>
> > I really like the idea of architecture.md -> **/architecture.md.
> >
> > We overall do not have javadoc in a lot of areas, and could maybe start
> > working on it as we go and think about asking for it in reviews.
> > We are also missing the Parser Programmer’s Guide, how to add a parser to
> > the metron system/install etc and other things.
> >
> >
> >
> > On February 25, 2019 at 15:22:47, Ryan Merriman (merrim...@gmail.com)
> > wrote:
> >
> > I feel like the code itself is pretty well documented. I updated existing
> > javadocs and added javadocs to classes that didn't have them before this
> > PR. In my opinion the level of documentation for these classes has
> > increased significantly.
> >
> > On Mon, Feb 25, 2019 at 1:52 PM Michael Miklavcic <
> > michael.miklav...@gmail.com> wrote:
> >
> > > Tentatively agreed on further clarification of what we consider in/out
> of
> > > scope for documentation re: document something that wasn't documented
> > > before. Ryan, can you give a quick summary of what you *have*
> > added/updated
> > > in documentation on this PR vs what you want to leave out?
> > >
> > > My initial concern in punting on docs right now is that part of what
> made
> > > this PR/task more challenging in the first place was not having
> > > documentation. We risk losing context and detail again if we don't do
> > this
> > > immediately. Would it be reasonable to split it up as follows?:
> > >
> > > 1. Additional overarching documentation feels out of scope - make it a
> > > follow on (see comments below).
> > > 2. Adding documentation to our existing README's and java code comments
> > > that describe the new/modified functionality should be in scope because
> > > it's part of the unit of work. I expect that a developer should be able
> > > to
> > > look at the code, tests, comments, and README's and understand how this
> > > code functions without having to start from scratch.
> > >
> > > The way we've handled follow-on work before, at least as far as feature
> > > branches are concerned, was to create Jiras and link them to the
> > > appropriate discussions for context. Maybe we can take that one step
> > > further and do the release manager a favor by also labeling the
> > > required/requested release on the Jira as a gating factor. This follows
> > our
> > > pattern for intermittent test failure reporting, e.g.
> > >
> > >
> >
> >
> https://issues.apache.org/jira/browse/METRON-1946?jql=project%20%3D%20METRON%20AND%20resolution%20%3D%20Unresolved%20AND%20labels%20%3D%20test-failure%20ORDER%20BY%20priority%20DESC%2C%20updated%20DESC
> > > .
> > >
> > > I'm also in favor of continuing to document architecture and technical
> > > details as part of the code base as Ryan and Jon have suggested. I
> think
> > we
> > > should have an "architecture.md" in metron root that replaces this -
> > >
> > >
> >
> >
> https://github.com/apache/metron/blob/d7d4fd9afb19e2bd2e66babb7e1514a19eae07d0/README.md#navigating-the-architecture
> > > and covers the broad architecture with links to the appropriate modules
> > for
> > > detail. Minimally, it would be nice if we had a simple diagram showing
> > the
> > > basic flow of data in Metron. I think we probably want an updated
> version
> > > of this wiki entry from back in the day -
> > > https://cwiki.apache.org/confluence/display/METRON/Metron+Architecture
> > >
> > > Best,
> > > Mike
> > >
> > >
> > > On Mon, Feb 25, 2019 at 7:18 AM

Re: [DISCUSS] Architecture documentation

2019-02-25 Thread Nick Allen

I don't think we should hold up this work to document something that wasn't
previously documented.  A follow-on is sufficient.

On Mon, Feb 25, 2019 at 8:50 AM Ryan Merriman  wrote:

> Recently I submitted a PR 
> that
> introduces a large number of changes to a critical part of our code base.
> Reviewers feel like it is significant enough to document at an
> architectural level (and I agree).  There are a couple points I would like
> to clarify.
>
> Generally architectural documentation lives in the README of the
> appropriate module.  Do we want to continue documenting architecture here?
> I think it makes sense because it will be versioned along with the code.
> Just wanted to confirm there are no objections to continuing this practice.
>
> A reviewer suggested we could accept the PR as is and leave the
> architectural documentation as a follow on.  I think this makes sense
> because it can be tedious to maintain a large PR as other smaller commits
> are accepted into master.  An important requirement is the documentation
> follow on must be completed in a timely manner, before the next release.
> Are there any objections to doing it this way?
>

Re: [DISCUSS] Architecture documentation

2019-02-25 Thread Nick Allen

> Procedurally, do we have a way to ensure that any follow-on documentation
> happens prior to a release (anything in the wiki, etc.)?

If someone thinks the code base needs X before the next release, then they
can bring up X during the release discussion.  We don't need additional
procedure around this.

On Mon, Feb 25, 2019 at 9:11 AM zeo...@gmail.com  wrote:

> I agree, I think all docs should be kept in the code base.  I
> opened METRON-714 ages ago to get the existing cwiki docs over to READMEs
> as well.
>
> I would also like to see us consider a more general/overview architecture,
> or perhaps write each component's architecture in a way that it can be
> composed into a higher level doc when generating the site-book.  Right now
> the barrier to getting started with Metron is too high, and I think this
> would make it slightly less imposing.  But this is slightly outside of the
> scope of the current conversation.
>
> Procedurally, do we have a way to ensure that any follow-on documentation
> happens prior to a release (anything in the wiki, etc.)?  I'm fine with
> splitting the commits generally.
>
> Jon
>
> On Mon, Feb 25, 2019 at 8:50 AM Ryan Merriman  wrote:
>
> > Recently I submitted a PR 
> > that
> > introduces a large number of changes to a critical part of our code base.
> > Reviewers feel like it is significant enough to document at an
> > architectural level (and I agree).  There are a couple points I would
> like
> > to clarify.
> >
> > Generally architectural documentation lives in the README of the
> > appropriate module.  Do we want to continue documenting architecture
> here?
> > I think it makes sense because it will be versioned along with the code.
> > Just wanted to confirm there are no objections to continuing this
> practice.
> >
> > A reviewer suggested we could accept the PR as is and leave the
> > architectural documentation as a follow on.  I think this makes sense
> > because it can be tedious to maintain a large PR as other smaller commits
> > are accepted into master.  An important requirement is the documentation
> > follow on must be completed in a timely manner, before the next release.
> > Are there any objections to doing it this way?
> >
> --
>
> Jon Zeolla
>

Re: [VOTE] Metron Release Candidate 0.7.0-RC1

2018-12-12 Thread Nick Allen

+1 binding

   - All of the tarballs, checksums, and signatures are correct
   - All of the tests and integration tests ran successfully.
   - The release also spun-up correctly in the development environment.
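
For anyone repeating the checksum part of this by hand, a minimal sketch follows. It assumes a SHA-256 digest file whose first whitespace-separated token is the hex digest (the `sha256sum` layout); the digest files published with the RC may use a different algorithm or layout, so treat the names and format here as illustrative. Signature verification is a separate step done with gpg and is not covered.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large tarballs do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(artifact_path, digest_path):
    """Compare the artifact's SHA-256 with the first token of the digest file.

    Assumption: the digest file looks like "<hex digest>  <file name>".
    """
    with open(digest_path, "r", encoding="utf-8") as handle:
        tokens = handle.read().split()
    if not tokens:
        return False
    return sha256_of(artifact_path) == tokens[0].lower()
```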


FYI - I had to slightly modify the metron-rc-check script for this to
work.  See the patch below.

diff --git a/dev-utilities/release-utils/metron-rc-check
b/dev-utilities/release-utils/metron-rc-check
index 143ba85a2..4552e5568 100755
--- a/dev-utilities/release-utils/metron-rc-check
+++ b/dev-utilities/release-utils/metron-rc-check
@@ -67,8 +67,6 @@ for i in "$@"; do
 #   --bro=0.1.0
 #
 -b=*|--bro=*)
-BRO="${i#*=}"
-shift # past argument=value
 ;;

 #
@@ -157,6 +155,7 @@ fi
 echo "Working directory $WORK"

 KEYS="$METRON_RC_DIST/KEYS"
+KEYS="https://dist.apache.org/repos/dist/release/metron/KEYS"
 METRON_ASSEMBLY="$METRON_RC_DIST/apache-metron-$METRON_VERSION-$RC.tar.gz"
 METRON_ASSEMBLY_SIG="$METRON_ASSEMBLY.asc"


On Tue, Dec 11, 2018 at 2:43 PM Justin Leet  wrote:

> This is a call to vote on releasing Apache Metron 0.7.0
>
> Full list of changes in this release:
> https://dist.apache.org/repos/dist/dev/metron/0.7.0-RC1/CHANGES
> The tag to be voted upon is:
> apache-metron-0.7.0-rc1
>
> The source archives being voted upon can be found here:
>
> https://dist.apache.org/repos/dist/dev/metron/0.7.0-RC1/apache-metron-0.7.0-rc1.tar.gz
>
> Other release files, signatures and digests can be found here:
> https://dist.apache.org/repos/dist/dev/metron/0.7.0-RC1/
>
> The release artifacts are signed with the following key:
> https://dist.apache.org/repos/dist/release/metron/KEYS
> Please vote on releasing this package as Apache Metron 0.7.0-RC1
>
> When voting, please list the actions taken to verify the release.
>
> Recommended build validation and verification instructions are posted
> here:
> https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds
>
> This vote will be open until 3pm EDT on Friday December 14 2018.
>
> [ ] +1 Release this package as Apache Metron 0.7.0-RC1
>
> [ ] 0 No opinion
>
> [ ] -1 Do not release this package because...
>

Re: [VOTE] Metron Release Candidate 0.7.0-RC1

2018-12-11 Thread Nick Allen

I think it is fine.  We can update the metron-rc-check script.  Just wanted
to be sure since it is different from what we did last release.

On Tue, Dec 11, 2018 at 4:58 PM Justin Leet  wrote:

> I haven't seen another project include a KEYS file in the main release
> itself, and since the KEYS file hasn't changed I just linked to the one in
> dist. They either seem to include it at the root level (so
> https://dist.apache.org/repos/dist/dev/metron/ for us) or they don't
> include it at all (except maybe during releases and it just appeared empty
> when I looked).
>
> The management of the KEYS file came up in relation to the plugin repo and
> in the original PR for the RC script, but nobody seems to really have
> strong opinions. Specifically, we can trivially include it with the main
> repo, but not the plugin repo. We'd need to pull the KEYS file from
> somewhere with the plugin and if that's only getting updated on release of
> the main repo, it causes friction if we're only releasing the plugin and
> changing release managers (who need to add the key to the file).
>
> I'm happy to revisit this with a more general solution (e.g. having a
> script to publish just the KEYS file?) . Given that this causes problems
> with the RC check, it seems like we need to update something one way or the
> other. It might just be updating the RC check script in the short term.
>
>
> On Tue, Dec 11, 2018 at 3:20 PM Nick Allen  wrote:
>
> > Should there be a KEYS file with the release candidate at
> > https://dist.apache.org/repos/dist/dev/metron/0.7.0-RC1/KEYS? Or was
> that
> > changed to https://dist.apache.org/repos/dist/release/metron/KEYS ?
> >
> > ```
> > $ ~/Development/metron/dev-utilities/release-utils/metron-rc-check
> > --version=0.7.0 --candidate=1
> > Metron Version 0.7.0
> > Release Candidate rc1
> > Metron RC Distribution Root is
> > https://dist.apache.org/repos/dist/dev/metron/0.7.0-RC1
> > Working directory /Users/nallen/tmp/metron-0.7.0-rc1
> > Downloading https://dist.apache.org/repos/dist/dev/metron/0.7.0-RC1/KEYS
> > --2018-12-11 15:18:49--
> > https://dist.apache.org/repos/dist/dev/metron/0.7.0-RC1/KEYS
> > Resolving dist.apache.org (dist.apache.org)... 209.188.14.144
> > Connecting to dist.apache.org (dist.apache.org)|209.188.14.144|:443...
> > connected.
> > HTTP request sent, awaiting response... 404 Not Found
> > 2018-12-11 15:18:50 ERROR 404: Not Found.
> >
> > [ERROR] Failed to download
> > https://dist.apache.org/repos/dist/dev/metron/0.7.0-RC1/KEYS
> > ```
> >
> > On Tue, Dec 11, 2018 at 2:43 PM Justin Leet  wrote:
> >
> > > This is a call to vote on releasing Apache Metron 0.7.0
> > >
> > > Full list of changes in this release:
> > > https://dist.apache.org/repos/dist/dev/metron/0.7.0-RC1/CHANGES
> > > The tag to be voted upon is:
> > > apache-metron-0.7.0-rc1
> > >
> > > The source archives being voted upon can be found here:
> > >
> > >
> >
> https://dist.apache.org/repos/dist/dev/metron/0.7.0-RC1/apache-metron-0.7.0-rc1.tar.gz
> > >
> > > Other release files, signatures and digests can be found here:
> > > https://dist.apache.org/repos/dist/dev/metron/0.7.0-RC1/
> > >
> > > The release artifacts are signed with the following key:
> > > https://dist.apache.org/repos/dist/release/metron/KEYS
> > > Please vote on releasing this package as Apache Metron 0.7.0-RC1
> > >
> > > When voting, please list the actions taken to verify the release.
> > >
> > > Recommended build validation and verification instructions are posted
> > > here:
> > > https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds
> > >
> > > This vote will be open until 3pm EDT on Friday December 14 2018.
> > >
> > > [ ] +1 Release this package as Apache Metron 0.7.0-RC1
> > >
> > > [ ] 0 No opinion
> > >
> > > [ ] -1 Do not release this package because...
> > >
> >
>

Re: [VOTE] Metron Release Candidate 0.7.0-RC1

2018-12-11 Thread Nick Allen

Should there be a KEYS file with the release candidate at
https://dist.apache.org/repos/dist/dev/metron/0.7.0-RC1/KEYS? Or was that
changed to https://dist.apache.org/repos/dist/release/metron/KEYS ?

```
$ ~/Development/metron/dev-utilities/release-utils/metron-rc-check
--version=0.7.0 --candidate=1
Metron Version 0.7.0
Release Candidate rc1
Metron RC Distribution Root is
https://dist.apache.org/repos/dist/dev/metron/0.7.0-RC1
Working directory /Users/nallen/tmp/metron-0.7.0-rc1
Downloading https://dist.apache.org/repos/dist/dev/metron/0.7.0-RC1/KEYS
--2018-12-11 15:18:49--
https://dist.apache.org/repos/dist/dev/metron/0.7.0-RC1/KEYS
Resolving dist.apache.org (dist.apache.org)... 209.188.14.144
Connecting to dist.apache.org (dist.apache.org)|209.188.14.144|:443...
connected.
HTTP request sent, awaiting response... 404 Not Found
2018-12-11 15:18:50 ERROR 404: Not Found.

[ERROR] Failed to download
https://dist.apache.org/repos/dist/dev/metron/0.7.0-RC1/KEYS
```
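
If it turns out the KEYS file can live in either place, the check could try the RC directory first and fall back to the release location. A sketch of that lookup is below, in Python rather than the script's own bash, with hypothetical function names; the two URLs are the ones from this thread, and the order in which they are tried is my assumption.

```python
from urllib.error import URLError
from urllib.request import urlopen

# Both locations come from this thread; trying the RC directory first is
# an assumption about what the check should prefer.
RC_KEYS = "https://dist.apache.org/repos/dist/dev/metron/{version}-RC{rc}/KEYS"
RELEASE_KEYS = "https://dist.apache.org/repos/dist/release/metron/KEYS"


def http_fetch(url):
    """Return the body at url, or None if it cannot be retrieved (e.g. 404)."""
    try:
        with urlopen(url, timeout=30) as response:
            return response.read()
    except (URLError, OSError):
        return None


def fetch_keys(version, rc, fetch=http_fetch):
    """Return (url, body) for the first KEYS location that can be fetched."""
    candidates = [RC_KEYS.format(version=version, rc=rc), RELEASE_KEYS]
    for url in candidates:
        body = fetch(url)
        if body is not None:
            return url, body
    raise RuntimeError("KEYS not found at any of: " + ", ".join(candidates))
```

The `fetch` parameter is only there so the lookup order can be exercised without touching the network.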

On Tue, Dec 11, 2018 at 2:43 PM Justin Leet  wrote:

> This is a call to vote on releasing Apache Metron 0.7.0
>
> Full list of changes in this release:
> https://dist.apache.org/repos/dist/dev/metron/0.7.0-RC1/CHANGES
> The tag to be voted upon is:
> apache-metron-0.7.0-rc1
>
> The source archives being voted upon can be found here:
>
> https://dist.apache.org/repos/dist/dev/metron/0.7.0-RC1/apache-metron-0.7.0-rc1.tar.gz
>
> Other release files, signatures and digests can be found here:
> https://dist.apache.org/repos/dist/dev/metron/0.7.0-RC1/
>
> The release artifacts are signed with the following key:
> https://dist.apache.org/repos/dist/release/metron/KEYS
> Please vote on releasing this package as Apache Metron 0.7.0-RC1
>
> When voting, please list the actions taken to verify the release.
>
> Recommended build validation and verification instructions are posted
> here:
> https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds
>
> This vote will be open until 3pm EDT on Friday December 14 2018.
>
> [ ] +1 Release this package as Apache Metron 0.7.0-RC1
>
> [ ] 0 No opinion
>
> [ ] -1 Do not release this package because...
>

Re: [DISCUSS] Mandatory relocation of Apache git repositories on git-wip-us.apache.org

2018-12-10 Thread Nick Allen

Another thing to note here is that committers have admin access to the
Github account.  This access is granted once you link your accounts using
the URL that Roy sent.  It took about an hour for mine to sync.

This means we can do things like close abandoned pull requests ourselves.
This also exposes the "merge pull request" button in Github.  I am not
quite sure how that might differ from our current 'prepare-commit' script.
I would suggest we keep using the script until we figure that out.
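
On the ASF remote change that Roy mentions: for anyone with scripts or local clones that still point at git-wip-us, the rewrite amounts to swapping the host. A small sketch, assuming clone URLs of the form `https://git-wip-us.apache.org/repos/asf/<repo>.git` and the same path on gitbox.apache.org; the function name is made up, and any user info or port in the URL is dropped.

```python
from urllib.parse import urlsplit, urlunsplit

OLD_HOST = "git-wip-us.apache.org"
NEW_HOST = "gitbox.apache.org"


def to_gitbox(remote_url):
    """Rewrite a git-wip-us remote URL to gitbox; leave other URLs untouched."""
    parts = urlsplit(remote_url)
    if parts.hostname != OLD_HOST:
        # GitHub remotes and anything else stay exactly as they are.
        return remote_url
    # Assumption: the repository path is identical on the new host.
    return urlunsplit(parts._replace(netloc=NEW_HOST))
```

The result can then be applied to a local clone with `git remote set-url <remote> <new-url>`, where the remote name depends on how the clone was set up.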

On Mon, Dec 10, 2018 at 3:29 PM Roy Lenferink  wrote:

> The repo has just moved to gitbox. Keep in mind the (ASF) remote has
> changed. The GitHub remote remains the same.
> I've opened a pull request to adapt the scripts to the gitbox URLs.
>
> To fully use the GitBox functionality, you need to link your ASF and
> GitHub accounts, which can be done here:
> https://gitbox.apache.org/setup/
>
> If you have any questions, don't hesitate to ask!
>
> - Roy
>
> Op ma 10 dec. 2018 om 20:11 schreef Roy Lenferink :
>
> > I have created the issue with INFRA to move over from git-wip-us to
> > GitBox.
> > METRON-1931 is created for updating the scripts with the new GitBox
> > location. I'll start with this once the repositories are moved.
> >
> > - Roy
> >
> > Op ma 10 dec. 2018 om 15:52 schreef Nick Allen :
> >
> >> +1  Thanks for the heads up.  We will do whatever we need to to help
> with
> >> the transition.
> >>
> >> On Sun, Dec 9, 2018 at 11:03 AM Otto Fowler 
> >> wrote:
> >>
> >>> +1
> >>>
> >>> We will need jiras and PR’s for updating our scripts post move however.
> >>>
> >>
>

Re: [DISCUSS] Mandatory relocation of Apache git repositories on git-wip-us.apache.org

2018-12-10 Thread Nick Allen

+1  Thanks for the heads up.  We will do whatever we need to do to help
with the transition.

On Sun, Dec 9, 2018 at 11:03 AM Otto Fowler  wrote:

> +1
>
> We will need jiras and PR’s for updating our scripts post move however.
>
>
> On December 9, 2018 at 06:32:44, Roy Lenferink (rlenfer...@apache.org)
> wrote:
>
> Hi folks,
>
> Checking the mail-archives I noticed the message below missed the
> dev@metron
> list [1].
> Does anyone have a problem with starting the process to migrate the
> existing Metron
> git-wip-us repos [2][3] to gitbox?
>
> This means integrated access and easy PRs on the repos (write access
> to the GitHub repos).
>
> I can't imagine anyone will say no, but we need to "document support
> for the decision" from a mailing list post, so, here it is.
>
> If there are no objections after 72 hours, I will create a ticket with
> INFRA for moving
> over the metron repositories to GitBox.
>
> - Roy
>
> [1]
> http://mail-archives.apache.org/mod_mbox/metron-dev/201812.mbox/browser
> [2] https://git-wip-us.apache.org/repos/asf?p=metron.git
> [3] https://git-wip-us.apache.org/repos/asf?p=metron-bro-plugin-kafka.git
>
> -- Forwarded message -
> From: Daniel Gruno 
> Date: vr 7 dec. 2018 om 17:53
> Subject: [NOTICE] Mandatory relocation of Apache git repositories on
> git-wip-us.apache.org
> To: us...@infra.apache.org 
>
>
> [IF YOUR PROJECT DOES NOT HAVE GIT REPOSITORIES ON GIT-WIP-US PLEASE
> DISREGARD THIS EMAIL; IT WAS MASS-MAILED TO ALL APACHE PROJECTS]
>
> Hello Apache projects,
>
> I am writing to you because you may have git repositories on the
> git-wip-us server, which is slated to be decommissioned in the coming
> months. All repositories will be moved to the new gitbox service which
> includes direct write access on github as well as the standard ASF
> commit access via gitbox.apache.org.
>
> ## Why this move? ##
> The move comes as a result of retiring the git-wip service, as the
> hardware it runs on is longing for retirement. In lieu of this, we
> have decided to consolidate the two services (git-wip and gitbox), to
> ease the management of our repository systems and future-proof the
> underlying hardware. The move is fully automated, and ideally, nothing
> will change in your workflow other than added features and access to
> GitHub.
>
> ## Timeframe for relocation ##
> Initially, we are asking that projects voluntarily request to move
> their repositories to gitbox, hence this email. The voluntary
> timeframe is between now and January 9th 2019, during which projects
> are free to either move over to gitbox or stay put on git-wip. After
> this phase, we will be requiring the remaining projects to move within
> one month, after which we will move the remaining projects over.
>
> To have your project moved in this initial phase, you will need:
>
> - Consensus in the project (documented via the mailing list)
> - File a JIRA ticket with INFRA to voluntarily move your project repos
> over to gitbox (as stated, this is highly automated and will take
> between a minute and an hour, depending on the size and number of
> your repositories)
>
> To sum up the preliminary timeline;
>
> - December 9th 2018 -> January 9th 2019: Voluntary (coordinated)
> relocation
> - January 9th -> February 6th: Mandated (coordinated) relocation
> - February 7th: All remaining repositories are mass migrated.
>
> This timeline may change to accommodate various scenarios.
>
> ## Using GitHub with ASF repositories ##
> When your project has moved, you are free to use either the ASF
> repository system (gitbox.apache.org) OR GitHub for your development
> and code pushes. To be able to use GitHub, please follow the primer
> at: https://reference.apache.org/committer/github
>
>
> We appreciate your understanding of this issue, and hope that your
> project can coordinate voluntarily moving your repositories in a
> timely manner.
>
> All settings, such as commit mail targets, issue linking, PR
> notification schemes etc will automatically be migrated to gitbox as
> well.
>
> With regards, Daniel on behalf of ASF Infra.
>
> PS: For inquiries, please reply to us...@infra.apache.org, not your
> project's dev list :-).
>

Re: [DISCUSS] Remove `/api/v1/update/replace` endpoint

2018-12-07 Thread Nick Allen

Heads up - I have not received a response from anyone for about 3 days.  I
am going to take that as no one has a problem with this change.  I will
likely merge the PR today.



On Tue, Dec 4, 2018 at 4:05 PM Nick Allen  wrote:

> PR #1284 completely removes the `/api/v1/update/replace` endpoint.  This
> endpoint is not being used by any other services in Metron currently.
>
> As part of the work I am doing to allow Elasticsearch to auto-generate the
> document ID, I would have had to make code changes to this endpoint to
> continue to support it.  Rather than doing that, I opted to remove it, as I
> believe it is dead code.
>
> * I want to make sure that no one in the community is using this endpoint
> for other purposes.  If you are, please respond on this thread.
>
> * Does anyone think we should deprecate this before completely removing
> it?  Personally, I do not think it is necessary, but if anyone disagrees
> please voice your opinion.
>
>
>
>
> [1] https://github.com/apache/metron/pull/1284
>

Re: Metron Release 0.6.1 and/or Plugin release 0.3.0?

2018-12-05 Thread Nick Allen

I would prefer to just go ahead with the release and not wait on the
intermittent, integration-test-related JIRAs.  I just want to see if there
is support for getting an RC out sooner rather than later.

On Tue, Dec 4, 2018 at 4:06 PM zeo...@gmail.com  wrote:

> I agree that we should move forward with the apache/metron 0.7.0 release.
> If 0.3 gets finalized in time we can include it, but otherwise no big deal
> not including it since the dev environment points to 0.1 which didn't have
> the issue.
>
> Jon
>
> On Mon, Dec 3, 2018 at 5:09 PM Michael Miklavcic <
> michael.miklav...@gmail.com> wrote:
>
> > I have one more intermittent failure to add to the list from a timeout in
> > the profiler integration tests.
> > https://issues.apache.org/jira/browse/METRON-1918
> >
> >
> > On Mon, Dec 3, 2018 at 2:54 PM Michael Miklavcic <
> > michael.miklav...@gmail.com> wrote:
> >
> > > fwiw, I have not been able to reproduce the integration test failure
> that
> > > I logged here - https://issues.apache.org/jira/browse/METRON-1851.
> > Unless
> > > anyone else has seen this, either locally or in Travis, I recommend we
> > > close it out as unable to reproduce. If it does ever show up again, the
> > > closed Jira will be out there as a record of it.
> > >
> > > On Mon, Dec 3, 2018 at 2:29 PM Justin Leet 
> > wrote:
> > >
> > > >> I'm inclined to move forward with the core repo release. Having
> said
> > >> that, there's a few test bugs and such I'd like to see addressed,
> either
> > >> "won't fix" or preferably with PRs, before creating an RC (as noted
> > >> earlier
> > >> in the thread).  It's probably a good opportunity to ask again if
> > there's
> > >> anything outstanding we'd like to see in the release. Is there
> anything
> > >> we'd like taken care of that's relatively close to going in?
> > >>
> > >> If the plugin gets fixed before we're set to move forward with a core
> > >> release (or we choose not to fix it, given the current version is
> > >> affected), I'm happy to put out a new RC.
> > >>
> > >> On Mon, Dec 3, 2018 at 4:12 PM Michael Miklavcic <
> > >> michael.miklav...@gmail.com> wrote:
> > >>
> > >> > +1 Nick
> > >> >
> > >> > On Mon, Dec 3, 2018 at 2:04 PM Nick Allen 
> wrote:
> > >> >
> > >> > > OK, well either way, I see no need to hold up Metron 0.6.1.
> > >> > >
> > >> > > On Mon, Dec 3, 2018 at 3:51 PM zeo...@gmail.com  >
> > >> > wrote:
> > >> > >
> > >> > > > I believe that 0.2.0 is impacted by the bug.
> > >> > > >
> > >> > > > Jon
> > >> > > >
> > >> > > > On Mon, Dec 3, 2018 at 3:50 PM Nick Allen 
> > >> wrote:
> > >> > > >
> > >> > > > > In light of this comment [1], I propose that we move forward
> > with
> > >> > > another
> > >> > > > > Metron release and forgo the Metron Bro Plugin 0.3.0 release
> > >> until we
> > >> > > can
> > >> > > > > resolve METRON-1910 [2].  There is no need to rush the fix as
> > the
> > >> > > current
> > >> > > > > 0.2.0 release of the Bro Plugin is not impacted by the bug. We
> > do
> > >> > have
> > >> > > a
> > >> > > > > good amount of other Metron functionality to release though.
> I
> > do
> > >> > not
> > >> > > > see
> > >> > > > > a need to hold-up the release.
> > >> > > > >
> > >> > > > > ---
> > >> > > > >
> > >> > > > > [1]
> > >> > > > >
> > >> > > > >
> > >> > > >
> > >> > >
> > >> >
> > >>
> >
> https://github.com/apache/metron-bro-plugin-kafka/pull/20#issuecomment-443481440
> > >> > > > > [2] https://github.com/apache/metron-bro-plugin-kafka/pull/20
> > >> > > > >
> > >> > > > > On Thu, Nov 29, 2018 at 10:29 AM Justin Leet <
> > >> justinjl...@gmail.com>
> > >> > > > > wrote:
> > >> > > > >
> > >> >

[DISCUSS] Remove `/api/v1/update/replace` endpoint

2018-12-04 Thread Nick Allen

PR #1284 completely removes the `/api/v1/update/replace` endpoint.  This
endpoint is not being used by any other services in Metron currently.

As part of the work I am doing to allow Elasticsearch to auto-generate the
document ID, I would have had to make code changes to this endpoint to
continue to support it.  Rather than doing that, I opted to remove it, as I
believe it is dead code.

* I want to make sure that no one in the community is using this endpoint
for other purposes.  If you are, please respond on this thread.

* Does anyone think we should deprecate this before completely removing
it?  Personally, I do not think it is necessary, but if anyone disagrees
please voice your opinion.




[1] https://github.com/apache/metron/pull/1284

Re: Metron Release 0.6.1 and/or Plugin release 0.3.0?

2018-12-03 Thread Nick Allen

OK, well either way, I see no need to hold up Metron 0.6.1.

On Mon, Dec 3, 2018 at 3:51 PM zeo...@gmail.com  wrote:

> I believe that 0.2.0 is impacted by the bug.
>
> Jon
>
> On Mon, Dec 3, 2018 at 3:50 PM Nick Allen  wrote:
>
> > In light of this comment [1], I propose that we move forward with another
> > Metron release and forgo the Metron Bro Plugin 0.3.0 release until we can
> > resolve METRON-1910 [2].  There is no need to rush the fix as the current
> > 0.2.0 release of the Bro Plugin is not impacted by the bug. We do have a
> > good amount of other Metron functionality to release though.  I do not
> see
> > a need to hold-up the release.
> >
> > ---
> >
> > [1]
> >
> >
> https://github.com/apache/metron-bro-plugin-kafka/pull/20#issuecomment-443481440
> > [2] https://github.com/apache/metron-bro-plugin-kafka/pull/20
> >
> > On Thu, Nov 29, 2018 at 10:29 AM Justin Leet 
> > wrote:
> >
> > > There's a few issues I would like to see at least triaged and
> preferably
> > > addressed prior to the release of the main repo. In Jira, we have a
> > > "test-failures" label, that has a few things attached to it. If we know
> > of
> > > any other Jiras that should have this label attached, please do so and
> > I'd
> > > appreciate it if you replied to the thread.  See test-failures
> > > <
> > >
> >
> https://issues.apache.org/jira/browse/METRON-1851?jql=project%20%3D%20METRON%20AND%20labels%20%3D%20test-failure
> > > >
> > > .
> > >
> > > The Jiras are:
> > > METRON-1810 <https://issues.apache.org/jira/browse/METRON-1810>
> > > METRON-1814 <https://issues.apache.org/jira/browse/METRON-1814>
> > > METRON-1851 <https://issues.apache.org/jira/browse/METRON-1851>
> > >
> > > On Wed, Nov 21, 2018 at 2:20 PM zeo...@gmail.com 
> > wrote:
> > >
> > > > A metron-bro-plugin-kafka 0.3 release is good to go from my side.
> > Thanks
> > > > for all of the reviews Nick
> > > >
> > > > On Wed, Nov 21, 2018 at 11:16 AM Nick Allen 
> > wrote:
> > > >
> > > > > Ha.  Yes, that definitely counts and makes a ton of sense.  Thanks!
> > > > >
> > > > > On Wed, Nov 21, 2018 at 11:00 AM Justin Leet <
> justinjl...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Does "I forgot to pull master fresh before running the command"
> > count
> > > > as
> > > > > a
> > > > > > reason?
> > > > > >
> > > > > > The missing Jiras are:
> > > > > >
> > > > > > METRON-1890 Metron Vagrant should disable audio
> (ottobackwards)
> > > > > closes
> > > > > > apache/metron#1277
> > > > > > METRON-1874 Create a Parser Debugger (nickwallen) closes
> > > > > > apache/metron#1265
> > > > > > METRON-1880 Use Caffeine for Profiler Caching (nickwallen)
> > closes
> > > > > > apache/metron#1270
> > > > > > METRON-1877 Nested IF ELSE statements can cause parse errors
> in
> > > > > Stellar
> > > > > > (justinleet) closes apache/metron#1268
> > > > > > METRON-1872 Move rat plugin away from snapshot version
> > > (justinleet)
> > > > > > closes apache/metron#1264
> > > > > >
> > > > > > On Wed, Nov 21, 2018 at 10:59 AM Nick Allen 
> > > > wrote:
> > > > > >
> > > > > > > Also, I'd like to get this one included in the release.  This
> is
> > > > really
> > > > > > > annoying for people just wanting to try out the Profiler.  And
> > this
> > > > was
> > > > > > > 'broken' after the last release, so there currently is no
> release
> > > > with
> > > > > > this
> > > > > > > problem and I'd like to keep it that way. :)
> > > > > > >
> > > > > > > https://github.com/apache/metron/pull/1276
> > > > > > >
> > > > > > > On Wed, Nov 21, 2018 at 10:11 AM Justin Leet <
> > > justinjl...@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Realized I'd never sent the updated list of Jiras.  I changed
> > the

Re: Metron Release 0.6.1 and/or Plugin release 0.3.0?

2018-12-03 Thread Nick Allen

In light of this comment [1], I propose that we move forward with another
Metron release and forgo the Metron Bro Plugin 0.3.0 release until we can
resolve METRON-1910 [2].  There is no need to rush the fix as the current
0.2.0 release of the Bro Plugin is not impacted by the bug. We do have a
good amount of other Metron functionality to release though.  I do not see
a need to hold-up the release.

---

[1]
https://github.com/apache/metron-bro-plugin-kafka/pull/20#issuecomment-443481440
[2] https://github.com/apache/metron-bro-plugin-kafka/pull/20

On Thu, Nov 29, 2018 at 10:29 AM Justin Leet  wrote:

> There's a few issues I would like to see at least triaged and preferably
> addressed prior to the release of the main repo. In Jira, we have a
> "test-failures" label, that has a few things attached to it. If we know of
> any other Jiras that should have this label attached, please do so and I'd
> appreciate it if you replied to the thread.  See test-failures
> <
> https://issues.apache.org/jira/browse/METRON-1851?jql=project%20%3D%20METRON%20AND%20labels%20%3D%20test-failure
> >
> .
>
> The Jiras are:
> METRON-1810 <https://issues.apache.org/jira/browse/METRON-1810>
> METRON-1814 <https://issues.apache.org/jira/browse/METRON-1814>
> METRON-1851 <https://issues.apache.org/jira/browse/METRON-1851>
>
> On Wed, Nov 21, 2018 at 2:20 PM zeo...@gmail.com  wrote:
>
> > A metron-bro-plugin-kafka 0.3 release is good to go from my side.  Thanks
> > for all of the reviews Nick
> >
> > On Wed, Nov 21, 2018 at 11:16 AM Nick Allen  wrote:
> >
> > > Ha.  Yes, that definitely counts and makes a ton of sense.  Thanks!
> > >
> > > On Wed, Nov 21, 2018 at 11:00 AM Justin Leet 
> > > wrote:
> > >
> > > > Does "I forgot to pull master fresh before running the command" count
> > as
> > > a
> > > > reason?
> > > >
> > > > The missing Jiras are:
> > > >
> > > > METRON-1890 Metron Vagrant should disable audio (ottobackwards)
> > > closes
> > > > apache/metron#1277
> > > > METRON-1874 Create a Parser Debugger (nickwallen) closes
> > > > apache/metron#1265
> > > > METRON-1880 Use Caffeine for Profiler Caching (nickwallen) closes
> > > > apache/metron#1270
> > > > METRON-1877 Nested IF ELSE statements can cause parse errors in
> > > Stellar
> > > > (justinleet) closes apache/metron#1268
> > > > METRON-1872 Move rat plugin away from snapshot version
> (justinleet)
> > > > closes apache/metron#1264
> > > >
> > > > On Wed, Nov 21, 2018 at 10:59 AM Nick Allen 
> > wrote:
> > > >
> > > > > Also, I'd like to get this one included in the release.  This is
> > really
> > > > > annoying for people just wanting to try out the Profiler.  And this
> > was
> > > > > 'broken' after the last release, so there currently is no release
> > with
> > > > this
> > > > > problem and I'd like to keep it that way. :)
> > > > >
> > > > > https://github.com/apache/metron/pull/1276
> > > > >
> > > > > On Wed, Nov 21, 2018 at 10:11 AM Justin Leet <
> justinjl...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Realized I'd never sent the updated list of Jiras.  I changed the
> > > > command
> > > > > > slightly (to remove a clause I thought we'd already removed re:
> > http,
> > > > and
> > > > > > added the awk to remove dupes resulting from multiple commits
> for a
> > > > > single
> > > > > > Jira. I'll do a PR for these changes).
> > > > > >
> > > > > > *apache/metron*
> > > > > > git log "master" "^tags/apache-metron-0.6.0-release" --no-merges
> |
> > > grep
> > > > > -E
> > > > > > "^[[:blank:]]+METRON" | sed 's/\[//g' | sed 's/\]//g' | awk
> > > '!x[$1]++'
> > > > > > METRON-1875 Expose configurable global settings in the Alerts
> > UI
> > > > > > (merrimanr) closes apache/metron#1266
> > > > > > METRON-1834: Migrate Elasticsearch from TransportClient to
> new
> > > Java
> > > > > > REST API (mmiklavc via mmiklavc) closes apache/metron#1242
> > > > > > METRON-1749 Update Angular to latest release in Management UI
> > > > > (sardell

Re: Unzipping Cypress

2018-11-26 Thread Nick Allen

Yes, I have noticed that too.  Even if there is no way to reduce the time, we
should not be logging the unzipping progress percent-by-percent in the Travis
CI builds.

On Sat, Nov 24, 2018 at 9:49 AM Otto Fowler  wrote:

> Anyone else seeing a lot of time taken downloading and unzipping Cypress on
> builds?
> What is up with that?™
>
> ottO
>

Re: Metron Release 0.6.1 and/or Plugin release 0.3.0?

2018-11-21 Thread Nick Allen

Ha.  Yes, that definitely counts and makes a ton of sense.  Thanks!

On Wed, Nov 21, 2018 at 11:00 AM Justin Leet  wrote:

> Does "I forgot to pull master fresh before running the command" count as a
> reason?
>
> The missing Jiras are:
>
> METRON-1890 Metron Vagrant should disable audio (ottobackwards) closes
> apache/metron#1277
> METRON-1874 Create a Parser Debugger (nickwallen) closes
> apache/metron#1265
> METRON-1880 Use Caffeine for Profiler Caching (nickwallen) closes
> apache/metron#1270
> METRON-1877 Nested IF ELSE statements can cause parse errors in Stellar
> (justinleet) closes apache/metron#1268
> METRON-1872 Move rat plugin away from snapshot version (justinleet)
> closes apache/metron#1264
>
> On Wed, Nov 21, 2018 at 10:59 AM Nick Allen  wrote:
>
> > Also, I'd like to get this one included in the release.  This is really
> > annoying for people just wanting to try out the Profiler.  And this was
> > 'broken' after the last release, so there currently is no release with
> this
> > problem and I'd like to keep it that way. :)
> >
> > https://github.com/apache/metron/pull/1276
> >
> > On Wed, Nov 21, 2018 at 10:11 AM Justin Leet 
> > wrote:
> >
> > > Realized I'd never sent the updated list of Jiras.  I changed the
> command
> > > slightly (to remove a clause I thought we'd already removed re: http,
> and
> > > added the awk to remove dupes resulting from multiple commits for a
> > single
> > > Jira. I'll do a PR for these changes).
> > >
> > > *apache/metron*
> > > git log "master" "^tags/apache-metron-0.6.0-release" --no-merges | grep
> > -E
> > > "^[[:blank:]]+METRON" | sed 's/\[//g' | sed 's/\]//g' | awk '!x[$1]++'
> > > METRON-1875 Expose configurable global settings in the Alerts UI
> > > (merrimanr) closes apache/metron#1266
> > > METRON-1834: Migrate Elasticsearch from TransportClient to new Java
> > > REST API (mmiklavc via mmiklavc) closes apache/metron#1242
> > > METRON-1749 Update Angular to latest release in Management UI
> > (sardell
> > > via nickwallen) closes apache/metron#1217
> > > METRON-1870 Intermittent Stellar REST test failures (merrimanr via
> > > nickwallen) closes apache/metron#1263
> > > METRON-1868 metron-committer-common incorrectly checking REPO_NAME
> > > (JonZeolla via jonzeolla) closes apache/metron#1260
> > > METRON-1740 Improve Palo Alto parser to handle CONFIG and SYSTEM
> > syslog
> > > messages (liuy-tnz via nickwallen) closes apache/metron#1171
> > > METRON-1847 Create reusable script with functions from
> prepare-commit
> > > (ottobackwards) closes apache/metron#1248
> > > METRON-1850 Stellar REST function (merrimanr) closes
> > apache/metron#1250
> > > METRON-1858 BasicFireEyeParser check style cleanup and optimization
> > > (ottobackwards) closes apache/metron#1255
> > > METRON-1864 Stellar date format test fails after daylight saving
> > > (ottobackwards) closes apache/metron#1258
> > > METRON-1861 METRON-1861: REST fails to start when LDAP enabled and
> > > 'Active Spring profiles' config is empty (anandsubbu via justinleet)
> > closes
> > > apache/metron#1256
> > > METRON-1853: Add shutdown hook to Stellar BaseFunctionResolver
> > > (mmiklavc via mmiklavc) closes apache/metron#1251
> > > METRON-1857 Fix Metaalert Nested Alert Field Name in Index Template
> > > (nickwallen) closes apache/metron#1253
> > > METRON-1855: Make unified enrichment topology the default and
> > deprecate
> > > split-join (mmiklavc via mmiklavc) closes apache/metron#1252
> > > METRON-1790 Unsubscribe from every observable in the pcap panel UI
> > > component (ruffle via nickwallen) closes apache/metron#1208
> > > METRON-1803: Integrate Cypress with Travis (tiborm via mmiklavc)
> > closes
> > > apache/metron#1226
> > > METRON-1844 Allow for LDAP to be used for authentication and roles
> > > (justinleet) closes apache/metron#1246
> > > METRON-1830 Re-implement Alerts dialog box without jQuery (sardell
> > via
> > > merrimanr) closes apache/metron#1240
> > > METRON-1826 Update librdkafka and devtoolset (JonZeolla via
> > jonzeolla)
> > > closes apache/metron#1238
> > > METRON-1839 Install Elasticsearch MPack Step in Ansible Not
> > Idempotent
> > > (nickwallen) closes apache/metron#1244
> > > METRON-1833: Managemen
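
For readers following the release-notes command quoted above, here is a minimal sketch of what its filtering stages do. It runs on canned input rather than a real `git log`, so it works outside a Metron checkout. The helper names `unique_jiras` and `sample_log` and the "follow-up" commit line are hypothetical and exist only for illustration; the grep, sed, and awk stages are copied from the quoted command.

```shell
#!/usr/bin/env bash
# Sketch of the filtering stages from the quoted release-notes command.
# Helper names and sample data are assumptions, not part of Metron.

# Keep indented lines that start with METRON, strip square brackets,
# then keep only the first line seen for each Jira key (first field).
unique_jiras() {
  grep -E "^[[:blank:]]+METRON" | sed 's/\[//g' | sed 's/\]//g' | awk '!x[$1]++'
}

# Hypothetical input shaped like `git log` output: commit headers at
# column 0, commit subjects indented by four spaces.
sample_log() {
  printf '%s\n' \
    'commit 1111111' \
    '    METRON-1875 Expose configurable global settings in the Alerts UI' \
    'commit 2222222' \
    '    [METRON-1850] Stellar REST function' \
    'commit 3333333' \
    '    METRON-1850 Stellar REST function follow-up'
}

sample_log | unique_jiras
```

In this sketch the two METRON-1850 lines collapse into the first one, because awk prints a line only when its first field has not been seen before. The key is compared as written, so a subject such as "METRON-1834:" (with a trailing colon) would count as a different key from "METRON-1834".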

Re: Metron Release 0.6.1 and/or Plugin release 0.3.0?

2018-11-21 Thread Nick Allen

gin-kafka_0.2.0-release"
> --no-merges | grep -E "^[[:blank:]]+METRON" | sed 's/\[//g' | sed 's/\]//g'
> | awk '!x[$1]++'
> METRON-1827 Update librdkafka in metron-bro-plugin-kafka (JonZeolla via
> jonzeolla) closes apache/metron-bro-plugin-kafka#13
> METRON-1866 Improve metron-bro-plugin-kafka documentation (JonZeolla
> via jonzeolla) closes apache/metron-bro-plugin-kafka#17
> METRON-1304 Allow metron-bro-plugin-kafka to include or exclude logs
> (JonZeolla via nickwallen) closes apache/metron-bro-plugin-kafka#2
> METRON-1865 Fix metron-bro-plugin-kafka tests (JonZeolla via jonzeolla)
> closes apache/metron-bro-plugin-kafka#16
> METRON-1828 Improve bro plugin contributing documentation (JonZeolla)
> closes apache/metron-bro-plugin-kafka#14
> METRON-1818 Remove config_files from bro-pkg.meta (JonZeolla) closes
> apache/metron-bro-plugin-kafka#11
> METRON-1800 Increment metron-bro-plugin-kafka version (JonZeolla via
> jonzeolla) closes apache/metron-bro-plugin-kafka#10
> METRON-1773 Bro plugin docs should refer to Apache Metron project
> (nickwallen) closes apache/metron-bro-plugin-kafka#9
>
> On Sun, Nov 18, 2018 at 7:52 AM zeo...@gmail.com  wrote:
>
> > I'm good with that release schedule, and using version 0.7.0 for the
> > apache/metron release.
> >
> > I opened up METRON-1881 and have a branch ready to PR for the 0.3
> > plugin upgrade.
> >
> > Jon
> >
> > On Fri, Nov 16, 2018 at 10:08 AM Otto Fowler 
> > wrote:
> >
> >> Can you generate the jiras that would be included in the release?
> >>
> >>
> >> On November 16, 2018 at 10:05:50, Justin Leet (justinjl...@gmail.com)
> >> wrote:
> >>
> >> Given that we've had a couple major PRs (the ES client migration along
> >> with
> >> the Angular upgrade stuff, and I'm sure others), I'd be in favor of
> >> releasing both the plugin and the main repo.
> >>
> >> I'd be in favor of doing something like:
> >> metron-bro-plugin-kafka release 0.3.0
> >> PR to update full dev
> >> metron release 0.7.0 (given that things changed a good amount)
> >>
> >> I would also expect this process to most likely be kicked off post
> >> Thanksgiving break (Thursday 22nd and Friday 23rd for anybody not in the
> >> US).
> >>
> >> I can kick out another summarization thread if we want, but basically:
> >> * Nov 26 - Start the Bro plugin release process
> >> * Once that finishes, PR to update full dev with new plugin version
> >> * Once PR is in, start metron release process (hopefully) sometime the
> >> week
> >> of the 3rd?
> >>
> >> Are there any objections to staggering the releases like that? They
> could
> >> also be done together, but it means that we have to update full dev to
> >> match the plugin version post release.
> >>
> >> On Wed, Nov 14, 2018 at 10:29 AM zeo...@gmail.com 
> >> wrote:
> >>
> >> > In my opinion metron-bro-plugin-kafka is ready for a release. Anything
> >> > else people would want to see? Once it gets released, I would like to
> >> > update full dev to use the newest version prior to any future metron
> >> > release (0.6.1 or whatever we choose).
> >> >
> >> > Jon
> >> >
> >> > On Wed, Nov 7, 2018 at 8:07 PM zeo...@gmail.com 
> >> wrote:
> >> >
> >> > > So, about this release, anybody have time to review
> >> > > apache/metron-bro-plugin-kafka#2 and
> >> apache/metron-bro-plugin-kafka#13?
> >> > >
> >> > > Jon
> >> > >
> >> > > On Wed, Oct 17, 2018 at 10:37 AM Michael Miklavcic <
> >> > > michael.miklav...@gmail.com> wrote:
> >> > >
> >> > >> And I do think we will be ready to roll another Metron release in
> the
> >> > near
> >> > >> future as well.
> >> > >>
> >> > >> On Wed, Oct 17, 2018 at 8:26 AM Justin Leet  >
> >> > >> wrote:
> >> > >>
> >> > >> > I tend to agree with Mike. I think we could do a release of the
> >> main
> >> > >> > project and get benefit, but I think skipping this cycle
> >> (especially
> >> > >> since
> >> > >> > we had a release last month) lets us get things like the ES
> client
> >> > >> > migration in and settled and just generally make recommendi

Re: Metron Release 0.6.1 and/or Plugin release 0.3.0?

2018-11-21 Thread Nick Allen

ibrdkafka in metron-bro-plugin-kafka (JonZeolla via
> jonzeolla) closes apache/metron-bro-plugin-kafka#13
> METRON-1866 Improve metron-bro-plugin-kafka documentation (JonZeolla
> via jonzeolla) closes apache/metron-bro-plugin-kafka#17
> METRON-1304 Allow metron-bro-plugin-kafka to include or exclude logs
> (JonZeolla via nickwallen) closes apache/metron-bro-plugin-kafka#2
> METRON-1865 Fix metron-bro-plugin-kafka tests (JonZeolla via jonzeolla)
> closes apache/metron-bro-plugin-kafka#16
> METRON-1828 Improve bro plugin contributing documentation (JonZeolla)
> closes apache/metron-bro-plugin-kafka#14
> METRON-1818 Remove config_files from bro-pkg.meta (JonZeolla) closes
> apache/metron-bro-plugin-kafka#11
> METRON-1800 Increment metron-bro-plugin-kafka version (JonZeolla via
> jonzeolla) closes apache/metron-bro-plugin-kafka#10
> METRON-1773 Bro plugin docs should refer to Apache Metron project
> (nickwallen) closes apache/metron-bro-plugin-kafka#9
>
> On Sun, Nov 18, 2018 at 7:52 AM zeo...@gmail.com  wrote:
>
> > I'm good with that release schedule, and using version 0.7.0 for the
> > apache/metron release.
> >
> > I opened up METRON-1881 and have a branch ready to PR for the 0.3
> > plugin upgrade.
> >
> > Jon
> >
> > On Fri, Nov 16, 2018 at 10:08 AM Otto Fowler 
> > wrote:
> >
> >> Can you generate the jiras that would be included in the release?
> >>
> >>
> >> On November 16, 2018 at 10:05:50, Justin Leet (justinjl...@gmail.com)
> >> wrote:
> >>
> >> Given that we've had a couple major PRs (the ES client migration along
> >> with
> >> the Angular upgrade stuff, and I'm sure others), I'd be in favor of
> >> releasing both the plugin and the main repo.
> >>
> >> I'd be in favor of doing something like:
> >> metron-bro-plugin-kafka release 0.3.0
> >> PR to update full dev
> >> metron release 0.7.0 (given that things changed a good amount)
> >>
> >> I would also expect this process to most likely be kicked off post
> >> Thanksgiving break (Thursday 22nd and Friday 23rd for anybody not in the
> >> US).
> >>
> >> I can kick out another summarization thread if we want, but basically:
> >> * Nov 26 - Start the Bro plugin release process
> >> * Once that finishes, PR to update full dev with new plugin version
> >> * Once PR is in, start metron release process (hopefully) sometime the
> >> week
> >> of the 3rd?
> >>
> >> Are there any objections to staggering the releases like that? They
> could
> >> also be done together, but it means that we have to update full dev to
> >> match the plugin version post release.
> >>
> >> On Wed, Nov 14, 2018 at 10:29 AM zeo...@gmail.com 
> >> wrote:
> >>
> >> > In my opinion metron-bro-plugin-kafka is ready for a release. Anything
> >> > else people would want to see? Once it gets released, I would like to
> >> > update full dev to use the newest version prior to any future metron
> >> > release (0.6.1 or whatever we choose).
> >> >
> >> > Jon
> >> >
> >> > On Wed, Nov 7, 2018 at 8:07 PM zeo...@gmail.com 
> >> wrote:
> >> >
> >> > > So, about this release, anybody have time to review
> >> > > apache/metron-bro-plugin-kafka#2 and
> >> apache/metron-bro-plugin-kafka#13?
> >> > >
> >> > > Jon
> >> > >
> >> > > On Wed, Oct 17, 2018 at 10:37 AM Michael Miklavcic <
> >> > > michael.miklav...@gmail.com> wrote:
> >> > >
> >> > >> And I do think we will be ready to roll another Metron release in
> the
> >> > near
> >> > >> future as well.
> >> > >>
> >> > >> On Wed, Oct 17, 2018 at 8:26 AM Justin Leet  >
> >> > >> wrote:
> >> > >>
> >> > >> > I tend to agree with Mike. I think we could do a release of the
> >> main
> >> > >> > project and get benefit, but I think skipping this cycle
> >> (especially
> >> > >> since
> >> > >> > we had a release last month) lets us get things like the ES
> client
> >> > >> > migration in and settled and just generally make recommending a
> >> newer
> >> > >> > version easier.
> >> > >> >
> >> > >> > The main reason for doing a release for the Bro plu

Re: [DISCUSS] Slack Channel Use

2018-11-12 Thread Nick Allen

+1 to all your points Justin.

On Mon, Nov 12, 2018 at 10:08 AM Justin Leet  wrote:

> I wanted to add back onto this thread after putting some more thought into
> it.
>
> I like Slack for the type of small developer "what's going on here?" type
> discussions.  That's the kind of thing I like being real-time ("Hey, full
> dev is acting weird", "What's the basic layout of this stuff?", "Anybody
> else seen this test failure?", etc.).  I think we've been pretty good about
> keeping our decision type dev discussions to the list (e.g. this exact
> conversation).
>
> We've been doing this more, but I would like to see more of the user and
> troubleshooting move to the list.  I think we've gotten a bit better about
> it as we've settled into Slack, but having that sort of helpful stuff
> exposed and searchable for users who come in afterwards is a big selling
> point of the lists, imo.
>
> To add onto this, I'd probably like to see
>
> https://cwiki.apache.org/confluence/display/METRON/Community+Resources#CommunityResources-ApacheMetronCommunityResources
> (and any other relevant links) updated to emphasize a Slack focus on
> developing Metron itself, and the user lists for configuration,
> troubleshooting, etc.
>
> Essentially, I'm proposing:
> Dev list / Jira / PRs as usual for any actual decisions + concrete feature
> discussion/review.
> Slack for Metron development "Hey, anyone seen this or have insight or a
> starting point?" and "I'm seeing something weird in our tests" type stuff
> User list for usage and troubleshooting questions.  Generally, discussions
> like this in Slack should be redirected to the user list.
>
> Is this a reasonable way to separate our concerns here?
>
> On Wed, Oct 24, 2018 at 11:37 AM Michael Miklavcic <
> michael.miklav...@gmail.com> wrote:
>
> > Yeah, I'm also surprised by that comment about the mailing list activity.
> > Our dev/user list discussions are by far more active than they've ever
> > been. Just have a look at the list of DISCUSS threads that have come up
> in
> > the past few months and it's clear that not only participation has
> > increased, but diversity of topic and participant.
> >
> > On Wed, Oct 24, 2018 at 8:08 AM Casey Stella  wrote:
> >
> > > Not for nothing, but at least according to the last board report that I
> > > submitted, the user@ traffic is up 100% and the dev list traffic is
> flat
> > > as
> > > compared to last quarter.  That's not to say that we couldn't stand
> more
> > > discussion on the lists, but a lot of the dev discussion happens on
> > github
> > > and JIRA and I'm happy to see an uptick in user traffic.
> > >
> > > On Wed, Oct 24, 2018 at 10:05 AM Otto Fowler 
> > > wrote:
> > >
> > > > I wouldn’t be so quick to relate the Slack discussion to perceived
> > > > activity on the list.
> > > > That is more due to the other things that are bigger issues.
> > > >
> > > >
> > > > On October 24, 2018 at 07:15:30, Nick Allen (n...@nickallen.org)
> > wrote:
> > > >
> > > > > I have heard recently people thought Metron is sort of dead just
> > > because
> > > > the mailing list is not so active anymore!
> > > >
> > > > That is exactly my concern.
> > > >
> > > >
> > > > On Wed, Oct 24, 2018, 2:49 AM Ali Nazemian 
> > > wrote:
> > > >
> > > > > I kind of expect to have Slack for more dev related discussions
> > rather
> > > > than
> > > > > user QA. I guess it is quite common to expect mailing list to be
> used
> > > for
> > > > > the purpose of knowledge sharing to make sure it will be accessible
> > by
> > > > > other users as well. Of course, it is a trade-off that most of the
> > > other
> > > > > Apache projects decided to accept the risk of keeping user related
> > > > > discussions out of Slack/IRC. However, it sometimes happens to see
> > the
> > > > > mixture of questions coming to Slack. I have heard recently people
> > > > thought
> > > > > Metron is sort of dead just because the mailing list is not so
> active
> > > > > anymore!
> > > > >
> > > > > Cheers,
> > > > > Ali
> > > > >
> > > > > On Tue, Oct 23, 2018 at 8:23 AM Casey Stella 
> > > wrote:
> > > > >
> > > > > > Agreed, t

Re: [DISCUSS] Deprecate split-join enrichment topology in favor of unified enrichment topology

2018-11-01 Thread Nick Allen

+1

On Thu, Nov 1, 2018, 6:27 PM Justin Leet  wrote:

> +1, I haven't seen any case where the split-join topology isn't made
> obsolete by the unified topology.
>
> On Thu, Nov 1, 2018 at 6:17 PM Michael Miklavcic <
> michael.miklav...@gmail.com> wrote:
>
> > Fellow Metronians,
> >
> > We've had the unified enrichment topology around for a number of months
> > now, it has proved itself stable, and there is yet to be a time that I
> have
> > seen the split-join topology outperform the unified one. Here are some
> > simple reasons to deprecate the split-join topology.
> >
> >1. Unified topology performs better.
> >2. The configuration, especially for performance tuning is much, much
> >simpler in the unified model.
> >3. The footprint within the cluster is smaller.
> >4. One of the first activities for any install is that we spend time
> >instructing users to switch to the unified topology.
> >5. One less moving part to maintain.
> >
> > I'd like to recommend that we deprecate the split-join topology and make
> > the unified enrichment topology the new default.
> >
> > Best,
> > Mike
> >
>

Fwd: [DISCUSS] Day 1 User Experience - Getting Metron Running

2018-10-26 Thread Nick Allen

Right.  I think we would just have to Dockerize whatever bits are needed
for specific "scenarios" as Katakoda calls them. At least, that is what I
am hoping.

On Fri, Oct 26, 2018 at 3:00 PM Otto Fowler  wrote:

> Ok, so we are not talking about dockerizing all of metron for this then.
>
>
> On October 26, 2018 at 14:52:27, Nick Allen (n...@nickallen.org) wrote:
>
> From what I can tell, Katakoda functions mainly through hosting Docker
> containers. So if I were to create a Katakoda demo like "Introduction to
> Stellar REPL", I would need to create a Docker container that hosts the
> Stellar REPL.  As a user works through your demo, Katakoda launches and
> hosts your container for each user session. That is my assumption from
> looking through some of the demos that currently exist.
>
> On Fri, Oct 26, 2018 at 2:42 PM Otto Fowler 
> wrote:
>
>> What is the metron on docker part?
>>
>>
>> On October 26, 2018 at 14:37:48, Nick Allen (n...@nickallen.org) wrote:
>>
>> > Yeah I would +1 katakoda.
>>
>> Has anyone used or have a history with KataKoda? I'd hate to invest time
>> in a hosted solution if the provider isn't going to be around. That's a
>> definite 'con' to taking that approach.
>>
>> Although most of the effort would be invested in "Metron on Docker" which
>> might have value outside of KataKoda. And some level of work has already
>> been done on Docker.
>>
>>
>> > I also think that it would help to start distributing RPMs, DEBs, and
>> the
>> mpacks with the releases..
>>
>> Agreed. I was thinking that whatever solution falls out of this discussion
>> might require RPMs, DEBs, Maven Central, etc as prerequisites. Although
>> each of those have value in their own right.
>>
>>
>>
>> On Fri, Oct 26, 2018 at 1:42 PM zeo...@gmail.com 
>> wrote:
>>
>> > Yeah I would +1 katakoda. I also think that it would help to start
>> > distributing RPMs, DEBs, and the mpacks with the releases, as well as
>> > consider a service like opensuse's build service for nightlies, etc.
>> >
>> > Jon
>> >
>> > On Fri, Oct 26, 2018 at 6:25 AM Anand Subramanian <
>> > asubraman...@hortonworks.com> wrote:
>> >
>> > > Great idea! This will be a HUGE improvement in the user experience for
>> > > first timers to Metron. Katakoda seems very interesting - simple and
>> > > straight-forward. I loved the way you can provide instructions,
>> commands
>> > > (that can be directly clicked!), links, explanation and so on.
>> > >
>> > > Regards,
>> > > Anand
>> > >
>> > > On 10/25/18, 7:49 PM, "Nick Allen"  wrote:
>> > >
>> > > We all know spinning up the development environment is a pain.
>> > > Unfortunately, it is the only way for a new user to get a feel for
>> > > Metron.
>> > > We need a better way to introduce new users to Metron.
>> > >
>> > > I am hoping we can brainstorm ways to improve that experience. Here
>> > > are a
>> > > few thoughts that might help start a discussion.
>> > >
>> > > (1) Create a *KataKoda* [1] based demo. I ran across this after
>> > > finding
>> > > Apache Ozone's demo [2], which I think is great.
>> > >
>> > >
>> > > - A user does not need to download or install anything. It is a
>> > > completely hosted offering.
>> > > - Provides a step-by-step demo experience that could guide
>> > users
>> > > through creating an enrichment, defining a profile, managing
>> > > alerts.
>> > > - Would require a Metron on Docker solution.
>> > >
>> > > (2) Create a *Vagrant Cloud* [3] hosted image of "Full Dev" with
>> > > everything
>> > > installed and ready to rock. A user would just need to install
>> > > Vagrant and
>> > > run:
>> > >
>> > > vagrant init metron/0.6.0
>> > >
>> > > vagrant up
>> > >
>> > >
>> > > - Reduces the number of dependencies needed to get Metron
>> > > up-and-running.
>> > > - Significantly increases the success rate of new users getting
>> > > Metron running.
>> > > - Still results in "Full Dev" Metron which requires too many
>> > > resources for the average computer.
>> > >
>> > > Are these good options? What other approaches could we take?
>> > Hopefully
>> > > some JIRAs might fall out of this discussion.
>> > >
>> > > - Nick
>> > >
>> > >
>> > > --
>> > > [1] https://www.katacoda.com
>> > > [2] https://www.katacoda.com/elek/scenarios/ozone101
>> > > [3] https://app.vagrantup.com/boxes/search
>> > >
>> > >
>> > > --
>> >
>> > Jon Zeolla
>> >
>>
>>

Re: [DISCUSS] Day 1 User Experience - Getting Metron Running

2018-10-26 Thread Nick Allen

From what I can tell, Katacoda functions mainly through hosting Docker
containers. So if I were to create a Katacoda demo like "Introduction to
Stellar REPL", I would need to create a Docker container that hosts the
Stellar REPL.  As a user works through the demo, Katacoda launches and
hosts that container for each user session. That is my assumption from
looking through some of the demos that currently exist.

On Fri, Oct 26, 2018 at 2:42 PM Otto Fowler  wrote:

> What is the metron on docker part?
>
>
> On October 26, 2018 at 14:37:48, Nick Allen (n...@nickallen.org) wrote:
>
> > Yeah I would +1 katakoda.
>
> Has anyone used or have a history with KataKoda? I'd hate to invest time
> in a hosted solution if the provider isn't going to be around. That's a
> definite 'con' to taking that approach.
>
> Although most of the effort would be invested in "Metron on Docker" which
> might have value outside of KataKoda. And some level of work has already
> been done on Docker.
>
>
> > I also think that it would help to start distributing RPMs, DEBs, and
> the
> mpacks with the releases..
>
> Agreed. I was thinking that whatever solution falls out of this discussion
> might require RPMs, DEBs, Maven Central, etc as prerequisites. Although
> each of those have value in their own right.
>
>
>
> On Fri, Oct 26, 2018 at 1:42 PM zeo...@gmail.com 
> wrote:
>
> > Yeah I would +1 katakoda. I also think that it would help to start
> > distributing RPMs, DEBs, and the mpacks with the releases, as well as
> > consider a service like opensuse's build service for nightlies, etc.
> >
> > Jon
> >
> > On Fri, Oct 26, 2018 at 6:25 AM Anand Subramanian <
> > asubraman...@hortonworks.com> wrote:
> >
> > > Great idea! This will be a HUGE improvement in the user experience for
> > > first timers to Metron. Katakoda seems very interesting - simple and
> > > straight-forward. I loved the way you can provide instructions,
> commands
> > > (that can be directly clicked!), links, explanation and so on.
> > >
> > > Regards,
> > > Anand
> > >
> > > On 10/25/18, 7:49 PM, "Nick Allen"  wrote:
> > >
> > > We all know spinning up the development environment is a pain.
> > > Unfortunately, it is the only way for a new user to get a feel for
> > > Metron.
> > > We need a better way to introduce new users to Metron.
> > >
> > > I am hoping we can brainstorm ways to improve that experience. Here
> > > are a
> > > few thoughts that might help start a discussion.
> > >
> > > (1) Create a *KataKoda* [1] based demo. I ran across this after
> > > finding
> > > Apache Ozone's demo [2], which I think is great.
> > >
> > >
> > > - A user does not need to download or install anything. It is a
> > > completely hosted offering.
> > > - Provides a step-by-step demo experience that could guide
> > users
> > > through creating an enrichment, defining a profile, managing
> > > alerts.
> > > - Would require a Metron on Docker solution.
> > >
> > > (2) Create a *Vagrant Cloud* [3] hosted image of "Full Dev" with
> > > everything
> > > installed and ready to rock. A user would just need to install
> > > Vagrant and
> > > run:
> > >
> > > vagrant init metron/0.6.0
> > >
> > > vagrant up
> > >
> > >
> > > - Reduces the number of dependencies needed to get Metron
> > > up-and-running.
> > > - Significantly increases the success rate of new users getting
> > > Metron running.
> > > - Still results in "Full Dev" Metron which requires too many
> > > resources for the average computer.
> > >
> > > Are these good options? What other approaches could we take?
> > Hopefully
> > > some JIRAs might fall out of this discussion.
> > >
> > > - Nick
> > >
> > >
> > > --
> > > [1] https://www.katacoda.com
> > > [2] https://www.katacoda.com/elek/scenarios/ozone101
> > > [3] https://app.vagrantup.com/boxes/search
> > >
> > >
> > > --
> >
> > Jon Zeolla
> >
>
>

Re: [DISCUSS] Day 1 User Experience - Getting Metron Running

2018-10-26 Thread Nick Allen

> Yeah I would +1 katakoda.

Has anyone used Katacoda, or does anyone have a history with it?  I'd hate
to invest time in a hosted solution if the provider isn't going to be
around.  That's a definite 'con' to taking that approach.

That said, most of the effort would be invested in "Metron on Docker", which
might have value outside of Katacoda.  And some level of work has already
been done on Docker.


> I also think that it would help to start distributing RPMs, DEBs, and the
mpacks with the releases..

Agreed. I was thinking that whatever solution falls out of this discussion
might require RPMs, DEBs, Maven Central, etc. as prerequisites, although
each of those has value in its own right.



On Fri, Oct 26, 2018 at 1:42 PM zeo...@gmail.com  wrote:

> Yeah I would +1 katakoda.  I also think that it would help to start
> distributing RPMs, DEBs, and the mpacks with the releases, as well as
> consider a service like opensuse's build service for nightlies, etc.
>
> Jon
>
> On Fri, Oct 26, 2018 at 6:25 AM Anand Subramanian <
> asubraman...@hortonworks.com> wrote:
>
> > Great idea! This will be a HUGE improvement in the user experience for
> > first timers to Metron. Katakoda seems very interesting - simple and
> > straight-forward. I loved the way you can provide instructions, commands
> > (that can be directly clicked!), links, explanation and so on.
> >
> > Regards,
> > Anand
> >
> > On 10/25/18, 7:49 PM, "Nick Allen"  wrote:
> >
> > We all know spinning up the development environment is a pain.
> > Unfortunately, it is the only way for a new user to get a feel for
> > Metron.
> > We need a better way to introduce new users to Metron.
> >
> > I am hoping we can brainstorm ways to improve that experience.  Here
> > are a
> > few thoughts that might help start a discussion.
> >
> > (1) Create a *KataKoda* [1] based demo.  I ran across this after
> > finding
> > Apache Ozone's demo [2], which I think is great.
> >
> >
> >- A user does not need to download or install anything.  It is a
> >   completely hosted offering.
> >   - Provides a step-by-step demo experience that could guide
> users
> >   through creating an enrichment, defining a profile, managing
> > alerts.
> >   - Would require a Metron on Docker solution.
> >
> > (2) Create a *Vagrant Cloud* [3] hosted image of "Full Dev" with
> > everything
> > installed and ready to rock.  A user would just need to install
> > Vagrant and
> > run:
> >
> > vagrant init metron/0.6.0
> >
> > vagrant up
> >
> >
> >- Reduces the number of dependencies needed to get Metron
> > up-and-running.
> >   - Significantly increases the success rate of new users getting
> >   Metron running.
> >   - Still results in "Full Dev" Metron which requires too many
> >   resources for the average computer.
> >
> > Are these good options? What other approaches could we take?
> Hopefully
> > some JIRAs might fall out of this discussion.
> >
> > - Nick
> >
> >
> > --
> > [1] https://www.katacoda.com
> > [2] https://www.katacoda.com/elek/scenarios/ozone101
> > [3] https://app.vagrantup.com/boxes/search
> >
> >
> > --
>
> Jon Zeolla
>

[DISCUSS] Day 1 User Experience - Getting Metron Running

2018-10-25 Thread Nick Allen

We all know spinning up the development environment is a pain.
Unfortunately, it is the only way for a new user to get a feel for Metron.
We need a better way to introduce new users to Metron.

I am hoping we can brainstorm ways to improve that experience.  Here are a
few thoughts that might help start a discussion.

(1) Create a *Katacoda* [1] based demo.  I ran across this after finding
Apache Ozone's demo [2], which I think is great.


   - A user does not need to download or install anything.  It is a
     completely hosted offering.
   - Provides a step-by-step demo experience that could guide users
     through creating an enrichment, defining a profile, managing alerts.
   - Would require a Metron on Docker solution.

(2) Create a *Vagrant Cloud* [3] hosted image of "Full Dev" with everything
installed and ready to rock.  A user would just need to install Vagrant and
run:

vagrant init metron/0.6.0

vagrant up


   - Reduces the number of dependencies needed to get Metron up-and-running.
   - Significantly increases the success rate of new users getting
     Metron running.
   - Still results in "Full Dev" Metron which requires too many
     resources for the average computer.

Are these good options? What other approaches could we take?  Hopefully
some JIRAs might fall out of this discussion.

- Nick


--
[1] https://www.katacoda.com
[2] https://www.katacoda.com/elek/scenarios/ozone101
[3] https://app.vagrantup.com/boxes/search

Re: metron-elasticsearch integration tests failing after merging in master

2018-10-24 Thread Nick Allen

I usually create a JIRA when I see an intermittent failure like this.  But
I do not see a record of this particular issue before.

https://jira.apache.org/jira/issues/?jql=text%20~%20%22Intermittent%22%20AND%20project%20%3D%20Metron%20AND%20resolution%20%3D%20Unresolved


On Wed, Oct 24, 2018 at 12:35 PM Michael Miklavcic <
michael.miklav...@gmail.com> wrote:

> That's a new one on me. Does this run locally for you?
>
> Failed tests:
>   ElasticsearchUpdateIntegrationTest>UpdateIntegrationTest.test:132 Data
> store is not updated!. Actual: 0
>
>
>
> On Wed, Oct 24, 2018 at 10:10 AM Otto Fowler 
> wrote:
>
> > https://travis-ci.org/ottobackwards/metron/jobs/445723343
> > Anyone having ES test problems?  Anyone shed any light on this.
> >
>

Re: [DISCUSS] Slack Channel Use

2018-10-24 Thread Nick Allen

> I have heard recently people thought Metron is sort of dead just because
the mailing list is not so active anymore!

That is exactly my concern.


On Wed, Oct 24, 2018, 2:49 AM Ali Nazemian  wrote:

> I kind of expect to have Slack for more dev related discussions rather than
> user QA. I guess it is quite common to expect mailing list to be used for
> the purpose of knowledge sharing to make sure it will be accessible by
> other users as well. Of course, it is a trade-off that most of the other
> Apache projects decided to accept the risk of keeping user related
> discussions out of Slack/IRC. However, it sometimes happens to see the
> mixture of questions coming to Slack. I have heard recently people thought
> Metron is sort of dead just because the mailing list is not so active
> anymore!
>
> Cheers,
> Ali
>
> On Tue, Oct 23, 2018 at 8:23 AM Casey Stella  wrote:
>
> > Agreed, the benefit of the mailing list is that it’s searchable by
> ponymail
> > and the major search engines.
> > On Mon, Oct 22, 2018 at 17:18 Nick Allen  wrote:
> >
> > > I don't know that it is the same kind of searchable.  Is it being
> indexed
> > > by the major search engines?  I have never used a search engine and
> > > uncovered the answer to my problem in a Slack archive.
> > >
> > > On Mon, Oct 22, 2018 at 5:05 PM Otto Fowler 
> > > wrote:
> > >
> > > > According to Greg Stein, an infra admin on the NiFi slack, the ASF
> > slack
> > > > that metron is in IS the standard plan, not the free one and is
> > > searchable
> > > > past 10,000 messages.
> > > >
> > > >
> > > >
> > > > On October 22, 2018 at 15:35:51, Michael Miklavcic (
> > > > michael.miklav...@gmail.com) wrote:
> > > >
> > > > ...From an archival and broader reach point of view, I do think
> there's
> > > > something to be said about using the mailing list. It's also easier
> to
> > > link
> > > > to Q/A threads from the mailing list archives and do searches...
> > > >
> > > >
> > >
> >
> https://lists.apache.org/thread.html/1aa85bc13d41e04a1f85c3100c2b803abe35d79b54062bbeaab83ace@%3Cdev.metron.apache.org%3E
> > > >
> > > > How very Inception.
> > > >
> > > >
> > > > On Mon, Oct 22, 2018 at 1:32 PM Michael Miklavcic <
> > > > michael.miklav...@gmail.com> wrote:
> > > >
> > > > > I just want to point out that we currently have 32 members in the
> > > Metron
> > > > > Slack channel which I personally think is a great sign. This is
> good
> > > from
> > > > a
> > > > > community perspective and helps foster interactive sessions where
> > > > required.
> > > > > From an archival and broader reach point of view, I do think
> there's
> > > > > something to be said about using the mailing list. It's also easier
> > to
> > > > link
> > > > > to Q/A threads from the mailing list archives and do searches. As
> > > such, I
> > > > > would also go along with Nick's suggestion and urge members to
> prefer
> > > the
> > > > > user/dev list where possible.
> > > > >
> > > > > On Mon, Oct 22, 2018 at 10:51 AM Justin Leet <
> justinjl...@gmail.com>
> > > > > wrote:
> > > > >
> > > > >> If we want to push more discussion to the dev list, my obvious
> > follow
> > > up
> > > > >> question then is "What are we hoping to get out of Slack/irc/other
> > > > >> interactive medium?". What discussion would we even want on there,
> > if
> > > we
> > > > >> can't have decisions and don't want usage/support?
> > > > >>
> > > > >> On Mon, Oct 22, 2018 at 12:44 PM Casey Stella  >
> > > > wrote:
> > > > >>
> > > > >> > I am of 2 minds, but I tend to agree. On the one hand, it's
> > > definitely
> > > > >> the
> > > > >> > preference that we use the mailing lists for the reasons you
> > stated
> > > > (and
> > > > >> > also because not everyone has access to slack generally). On the
> > > other
> > > > >> > hand, I think an interactive medium like Slack has a lot of
> > > advantages
> > > > >> in
> > > > >> > terms of user sat

Re: Revert PR #1218

2018-10-23 Thread Nick Allen

I wouldn't call this complex.  It is much easier to roll it back, so I can
work on a proper fix without impacting the ongoing work of others.

The existing Elasticsearch DAOs do not distinguish between document ID and
Metron GUID as there was no need to before.  So I need to disambiguate
those concepts a bit, which is rather subtle.  In addition, none of the
integration or e2e tests caught the problem because there is a disconnect
between the reader and writer side of the house for Elasticsearch.  I want
to update the tests to ensure this sort of problem is caught.
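
To make the document ID versus GUID distinction concrete, here is a rough
sketch in Python. It is a toy in-memory model, not the Elasticsearch client
API and not Metron's DAO code. The names FakeIndex, naive_update,
lookup_update, and escalate, and the "alert_status" field, are made up for
illustration. It assumes the writer lets the index assign its own document
ID while the update path still treats the GUID as the document ID, and it
shows a pre-query on the GUID as one possible way to avoid the duplicate.

```python
import itertools
from typing import Dict, Optional


class FakeIndex:
    """In-memory stand-in for a search index; documents are keyed by document ID."""

    def __init__(self) -> None:
        self.docs: Dict[str, dict] = {}
        self._next_id = itertools.count(1)

    def index(self, doc: dict, doc_id: Optional[str] = None) -> str:
        """Store a document; generate a document ID when none is supplied."""
        if doc_id is None:
            doc_id = f"auto-{next(self._next_id)}"
        self.docs[doc_id] = dict(doc)
        return doc_id

    def find_id_by_guid(self, guid: str) -> Optional[str]:
        """Pre-query: return the document ID of the document holding this GUID."""
        for doc_id, doc in self.docs.items():
            if doc.get("guid") == guid:
                return doc_id
        return None


def naive_update(index: FakeIndex, alert: dict) -> str:
    """Update path that wrongly assumes document ID == GUID."""
    return index.index(alert, doc_id=alert["guid"])


def lookup_update(index: FakeIndex, alert: dict) -> str:
    """Update path that first resolves the GUID to the real document ID."""
    doc_id = index.find_id_by_guid(alert["guid"])
    return index.index(alert, doc_id=doc_id)


def escalate(alert: dict) -> dict:
    """Return a copy of the alert with a hypothetical escalated status."""
    return {**alert, "alert_status": "ESCALATE"}


if __name__ == "__main__":
    alert = {"guid": "guid-1", "alert_status": "NEW"}

    broken = FakeIndex()
    broken.index(alert)                    # writer: index assigns the document ID
    naive_update(broken, escalate(alert))  # reader side: uses GUID as document ID
    print("naive update, documents in index:", len(broken.docs))

    fixed = FakeIndex()
    fixed.index(alert)
    lookup_update(fixed, escalate(alert))
    print("lookup update, documents in index:", len(fixed.docs))
```

The naive path leaves two documents for one alert, which is the duplicate
seen in the Alerts UI. The lookup path keeps one, at the cost of an extra
query per update.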


On Tue, Oct 23, 2018 at 3:11 PM Simon Elliston Ball <
si...@simonellistonball.com> wrote:

> Would it not make more sense to fix the bug on the DAO side, and roll
> forward? I suspect what we need to do is add a stage in the update
> capability to configure the key field used for update, or worst case have a
> pre-query to lookup the internal ID in the relatively rare scenario where
> we escalate / modify indexed docs. Seems like a simple new ticket, rather
> than a complex roll back and roll forward later. As long as we get the
> follow on in before an Apache release we should be fine, no?
>
> Simon
>
> On Tue, 23 Oct 2018 at 19:58, Nick Allen  wrote:
>
> > Hi Guys -
> >
> > @rmerriman tracked down some problems that were introduced with my PR
> > #1218.  Thanks to him for finding this.  The change was intended to
> improve
> > Elasticsearch write performance by allowing Elasticsearch to set its own
> > document ID.
> >
> > The problem is that if you then go to the Alerts UI and escalate an
> alert,
> > it will create a duplicate alert in the index, rather than updating the
> > existing alert. I've been looking at how to fix the problem and the scope
> > of the fix is larger than I'd like to handle as a follow-on.  There are
> > some prerequisites I'd like to tackle before introducing this change.
> >
> > I am going to revert the change on master, which will introduce an
> > additional commit that is an "undo" of the original commit.  I will then
> > open a separate PR that introduces this new functionality.
> >
> > https://github.com/apache/metron/pull/1218
> >
> > Thanks
> >
>
>
> --
> --
> simon elliston ball
> @sireb
>

Revert PR #1218

2018-10-23 Thread Nick Allen

Hi Guys -

@rmerriman tracked down some problems that were introduced with my PR
#1218.  Thanks to him for finding this.  The change was intended to improve
Elasticsearch write performance by allowing Elasticsearch to set its own
document ID.

The problem is that if you then go to the Alerts UI and escalate an alert,
it will create a duplicate alert in the index, rather than updating the
existing alert. I've been looking at how to fix the problem and the scope
of the fix is larger than I'd like to handle as a follow-on.  There are
some prerequisites I'd like to tackle before introducing this change.

I am going to revert the change on master, which will introduce an
additional commit that is an "undo" of the original commit.  I will then
open a separate PR that introduces this new functionality.

https://github.com/apache/metron/pull/1218

Thanks

Re: [DISCUSS] Slack Channel Use

2018-10-22 Thread Nick Allen

I don't know that it is the same kind of searchable.  Is it being indexed
by the major search engines?  I have never used a search engine and
uncovered the answer to my problem in a Slack archive.

On Mon, Oct 22, 2018 at 5:05 PM Otto Fowler  wrote:

> According to Greg Stein, an infra admin on the NiFi slack, the ASF slack
> that metron is in IS the standard plan, not the free one and is searchable
> past 10,000 messages.
>
>
>
> On October 22, 2018 at 15:35:51, Michael Miklavcic (
> michael.miklav...@gmail.com) wrote:
>
> ...From an archival and broader reach point of view, I do think there's
> something to be said about using the mailing list. It's also easier to link
> to Q/A threads from the mailing list archives and do searches...
>
> https://lists.apache.org/thread.html/1aa85bc13d41e04a1f85c3100c2b803abe35d79b54062bbeaab83ace@%3Cdev.metron.apache.org%3E
>
> How very Inception.
>
>
> On Mon, Oct 22, 2018 at 1:32 PM Michael Miklavcic <
> michael.miklav...@gmail.com> wrote:
>
> > I just want to point out that we currently have 32 members in the Metron
> > Slack channel which I personally think is a great sign. This is good from
> a
> > community perspective and helps foster interactive sessions where
> required.
> > From an archival and broader reach point of view, I do think there's
> > something to be said about using the mailing list. It's also easier to
> link
> > to Q/A threads from the mailing list archives and do searches. As such, I
> > would also go along with Nick's suggestion and urge members to prefer the
> > user/dev list where possible.
> >
> > On Mon, Oct 22, 2018 at 10:51 AM Justin Leet 
> > wrote:
> >
> >> If we want to push more discussion to the dev list, my obvious follow up
> >> question then is "What are we hoping to get out of Slack/irc/other
> >> interactive medium?". What discussion would we even want on there, if we
> >> can't have decisions and don't want usage/support?
> >>
> >> On Mon, Oct 22, 2018 at 12:44 PM Casey Stella 
> wrote:
> >>
> >> > I am of 2 minds, but I tend to agree. On the one hand, it's definitely
> >> the
> >> > preference that we use the mailing lists for the reasons you stated
> (and
> >> > also because not everyone has access to slack generally). On the other
> >> > hand, I think an interactive medium like Slack has a lot of advantages
> >> in
> >> > terms of user satisfaction. Ultimately, though, we may satisfy 1 user
> >> at
> >> > the cost of not persisting the discussion and satisfying many users.
> >> >
> >> > I'll go along with a specific preference to drive more discussion to
> the
> >> > mailing list.
> >> >
> >> > Casey
> >> >
> >> > On Mon, Oct 22, 2018 at 12:18 PM Nick Allen 
> wrote:
> >> >
> >> > > It seems that we are seeing a lot of Metron usage and support
> >> questions
> >> > on
> >> > > the Slack Channel.
> >> > > These are questions that previously would have been directed to the
> >> User
> >> > or
> >> > > Dev mailing lists. Since this is occurring in the Slack Channel, the
> >> > > conversations are not archived.
> >> > >
> >> > > In my opinion, this is not good for the Metron community. Having
> this
> >> > > persisted in a discoverable form (like a mailing list archive) not
> >> only
> >> > > helps support current users, but also helps *potential* users
> >> understand
> >> > > how Metron is being used.
> >> > >
> >> > > Does anyone else agree or disagree? At a minimum, I feel we need to
> >> do
> >> > > something to direct these conversations back to the mailing list.
> >> > >
> >> >
> >>
> >
>

Re: [DISCUSS] Slack Channel Use

2018-10-22 Thread Nick Allen

The help given on the Channel is great.  It is just much more scalable on
an archived mailing list.  Hence the Apache Software Foundation's preference
for mailing lists.




On Mon, Oct 22, 2018 at 12:27 PM Otto Fowler 
wrote:

> These questions also occurred on the IRC channel.  The difference is that
> there are more than Jon and I answering now.
>
>
> On October 22, 2018 at 12:18:08, Nick Allen (n...@nickallen.org) wrote:
>
> It seems that we are seeing a lot of Metron usage and support questions on
> the Slack Channel.
> These are questions that previously would have been directed to the User
> or
> Dev mailing lists. Since this is occurring in the Slack Channel, the
> conversations are not archived.
>
> In my opinion, this is not good for the Metron community. Having this
> persisted in a discoverable form (like a mailing list archive) not only
> helps support current users, but also helps *potential* users understand
> how Metron is being used.
>
> Does anyone else agree or disagree? At a minimum, I feel we need to do
> something to direct these conversations back to the mailing list.
>
>

[DISCUSS] Slack Channel Use

2018-10-22 Thread Nick Allen

It seems that we are seeing a lot of Metron usage and support questions on
the Slack Channel.
These are questions that previously would have been directed to the User or
Dev mailing lists.  Since this is occurring in the Slack Channel, the
conversations are not archived.

In my opinion, this is not good for the Metron community.  Having this
persisted in a discoverable form (like a mailing list archive) not only
helps support current users, but also helps *potential* users understand
how Metron is being used.

Does anyone else agree or disagree?  At a minimum, I feel we need to do
something to direct these conversations back to the mailing list.

[DISCUSS] Recurrent Large Indexing Error Messages

2018-10-19 Thread Nick Allen

I want to discuss solutions for the problem that I have described in
METRON-1832; Recurrent Large Indexing Error Messages. I feel this is a very
easy trap to fall into when using the default settings that currently come
with Metron.


## Problem


https://issues.apache.org/jira/browse/METRON-1832


If any index destination like HDFS, Elasticsearch, or Solr goes down while
the Indexing topology is running, an error message is created and sent back
to the user-defined error topic.  By default, this is defined to also be
the 'indexing' topic.

The Indexing topology then consumes this error message and attempts to
write it again. If the index destination is still down, another error
occurs and another error message is created that encapsulates the original
error message.  That message is then sent to the 'indexing' topic, which is
later consumed, yet again, by the Indexing topology.

These error messages will continue to be recycled and grow larger and
larger as each new error message encapsulates all previous error messages
in the "raw_message" field.

Once the index destination recovers, one giant error message will finally
be written that contains massively duplicated, useless information which
can further negatively impact performance of the index destination.

Also, the escape character '\' compounds with each re-encapsulation, leading
to long strings of '\' characters in the error message.
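
A small simulation of the cycle described above, as a sketch only. It is
plain Python, not the Indexing topology. The "raw_message" field comes from
the description above; the "error" field, the function names, and the
sample telemetry are assumptions made for illustration. Each failed attempt
wraps the previous message as a JSON string inside a new error message.

```python
import json
import re
from typing import List, Tuple


def wrap_error(failed_message: str, reason: str) -> str:
    """Create a new error message that embeds the failed message as a string."""
    return json.dumps({"error": reason, "raw_message": failed_message})


def longest_backslash_run(text: str) -> int:
    """Length of the longest run of consecutive backslash characters."""
    runs = re.findall(r"\\+", text)
    return max((len(run) for run in runs), default=0)


def simulate_outage(original: str, failed_attempts: int) -> List[Tuple[int, int]]:
    """Recycle a message through repeated failures.

    Returns (message size, longest backslash run) after each failed attempt.
    """
    message = original
    history = []
    for _ in range(failed_attempts):
        message = wrap_error(message, "index destination unavailable")
        history.append((len(message), longest_backslash_run(message)))
    return history


telemetry = json.dumps({"guid": "example-guid", "ip_src_addr": "10.0.0.1"})

if __name__ == "__main__":
    for attempt, (size, run) in enumerate(simulate_outage(telemetry, 5), start=1):
        print(f"attempt {attempt}: {size} chars, longest backslash run {run}")
```

Every quote in the innermost message gets escaped again on each pass, so
the backslash runs roughly double per attempt while the message itself
keeps growing.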


## Background

There was some discussion on how to handle this on the original PR #453
https://github.com/apache/metron/pull/453.

## Solutions

(1) The first, easiest option is to just do nothing.  There was already a
discussion around this and this is the solution that we landed on in #453.

Pros: Really easy; do nothing.

Cons: Intermittent problems with ES/Solr can easily create very large error
messages that can significantly impact both search and ingest performance.


(2) Change the default indexing error topic to 'indexing_errors' to avoid
recycling error messages. Nothing will consume from the 'indexing_errors'
topic, thus preventing a cycle.

Pros: Simple, easy change that prevents the cycle.

Cons: Recoverable indexing errors are not visible to users as they will
never be indexed in ES/Solr.

(3) Add logic to limit the number of times a message can be 'recycled'
through the Indexing topology.  This effectively sets a maximum number of
retry attempts.  If a message fails N times, then write the message to a
separate, unrecoverable error topic.

Pros: Recoverable errors are visible to users in ES/Solr.

Cons: More complex.  Users still need to check the unrecoverable error
topic for potential problems.
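
A rough sketch of how the retry limit could work, in Python rather than
topology code. The 'indexing' topic name is the default mentioned above;
the 'indexing_unrecoverable' topic, the "failed_attempts" field, and the
function name are hypothetical.

```python
from typing import Any, Dict, Tuple

MAX_ATTEMPTS = 3  # hypothetical value of N


def route_failed_message(
    message: Dict[str, Any], max_attempts: int = MAX_ATTEMPTS
) -> Tuple[str, Dict[str, Any]]:
    """Decide where a message goes after an indexing failure.

    Returns the destination topic and a copy of the message with its
    failure counter incremented.
    """
    attempts = message.get("failed_attempts", 0) + 1
    updated = {**message, "failed_attempts": attempts}
    if attempts >= max_attempts:
        # Give up: nothing consumes this topic, so the cycle ends here.
        return "indexing_unrecoverable", updated
    # Still under the limit: send it back around for another attempt.
    return "indexing", updated


if __name__ == "__main__":
    message: Dict[str, Any] = {"guid": "example-guid"}
    for _ in range(MAX_ATTEMPTS):
        topic, message = route_failed_message(message)
        print(topic, message["failed_attempts"])
```

The counter travels with the message, so no state has to be kept in the
topology between attempts.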

(4) Do not further encapsulate error messages in the 'raw_message' field.
If an error message fails, don't encapsulate it in another error message.
Just push it to the error topic as-is.  Could add a field that indicates
how many times the message has failed.

Pros: Prevents giant error messages from being created from recoverable
errors.

Cons: Extended outages would still cause the Indexing topology to
repeatedly recycle these error messages, which would ultimately exhaust
resources in Storm.
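
Option (4) could look something like the sketch below. Again this is plain
Python for illustration, not Metron code. "raw_message" is the field named
above; the "is_error" marker, the "failed_count" field, and the function
name are assumptions. If the failed message is already an error message, it
is passed through as-is with only its counter incremented.

```python
import json


def build_error(failed_message: str, reason: str) -> str:
    """Build an error message without re-encapsulating existing error messages."""
    try:
        parsed = json.loads(failed_message)
    except ValueError:
        parsed = None

    if isinstance(parsed, dict) and parsed.get("is_error") is True:
        # Already an error message: keep it as-is, just count the new failure.
        parsed["failed_count"] = parsed.get("failed_count", 1) + 1
        return json.dumps(parsed)

    # First failure: wrap the original message exactly once.
    return json.dumps(
        {
            "is_error": True,
            "error": reason,
            "raw_message": failed_message,
            "failed_count": 1,
        }
    )


if __name__ == "__main__":
    original = json.dumps({"guid": "example-guid", "ip_src_addr": "10.0.0.1"})
    message = original
    for _ in range(5):
        message = build_error(message, "index destination unavailable")
    final = json.loads(message)
    print(final["failed_count"], final["raw_message"] == original)
```

The message size stays essentially flat no matter how long the outage
lasts, though as noted in the cons, the messages would still keep cycling.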



What other ways can we solve this?

Re: Metron Release 0.6.1 and/or Plugin release 0.3.0?

2018-10-16 Thread Nick Allen

I am in favor of a release for both.

There are a lot of really useful bug fixes, management of pcap through
Ambari, more flexibility for configuring JAAS in Ambari, increased
Elasticsearch performance, the Syslog parser, and the Batch Profiler, among
others. I would be happy with calling it a 0.6.1 point release.

Mike - What is outstanding that you would like to see in the release?

On Tue, Oct 16, 2018, 12:21 PM Michael Miklavcic <
michael.miklav...@gmail.com> wrote:

> I'd be +1 on going with just the metron-bro-kafka-plugin release. It seems
> like it's ready to go, and I think there are a few more things I'd like to
> see get into our next Metron release so I'm good with holding off there.
>
> Mike
>
> On Tue, Oct 16, 2018 at 10:26 AM Justin Leet 
> wrote:
>
> > Hi all,
> >
> > As you might recall from a prior discussion about release cadence, we
> were
> > interested in initiating release threads near our board reports to see if
> > we wanted to do releases or not. Additionally, the work is done to do two
> > separate releases, so our options are releasing both, a single one, or
> > neither.
> >
> > Having said that, a metron-bro-kafka-plugin 0.3.0 release came up on this
> > thread
> > <
> >
> https://lists.apache.org/thread.html/3c18c3aba6b436b11032831e7db541d50eb7cb1e3ae54b7423057c88@%3Cdev.metron.apache.org%3E
> > >.
> > In particular, the prospect of a release came up in the context of
> having a
> > version with better (and working) testing.
> >
> > Version Number
> > If we choose to do a core Metron release, I propose 0.6.1. For
> > metron-bro-kafka-plugin, I propose 0.3.0.  Keep in mind, the versioning
> for
> > the plugin is a bit different in that we need x.y for bro-pkg, instead of
> > x.y.z, so we wouldn't release 0.2.1.
> >
> > I would personally be in favor of doing a plugin release, but I'm less
> > inclined to do a core Metron release.
> >
> > For Metron, I believe the main feature is the batch profiler work, but
> > there's a fair amount of fixes and improvements.
> >
> > For the plugin, I believe the main improvements would be around the
> fixing
> > our testing and more maintenance type work.
> >
> > JIRA status
> > *metron*
> > There are 24 open PRs at https://github.com/apache/metron/pulls.  If we
> > do
> > decide to release, we should work on getting anything meriting inclusion
> > closed out.
> >
> > There have been 49 commits since the 0.6.0 release (listed at the end of
> > the message). This assumes a release from master.
> >
> > *metron-bro-kafka-plugin*
> > There are 6 open PRs at
> > https://github.com/apache/metron-bro-plugin-kafka/pulls.  If we do
> decide
> > to release, we should get anything we need closed out.
> >
> > There have been 2 commits since the 0.2.0 release (listed at the end of
> the
> > message).  This assumes a release from master.
> >
> > *Metron changelog*
> > METRON-1801 Allow Customization of Elasticsearch Document ID
> > (nickwallen) closes apache/metron#1218
> > METRON-1799 Remove outdated bylaws from site. (justinleet) closes
> > apache/metron#1216
> > METRON-1769 Script creation of a release candidate (justinleet)
> closes
> > apache/metron#1188
> > METRON-1761 Allow a grok statement to be applied to each line in a
> > file. (ottobackwards) closes apache/metron#1184
> > METRON-1813 Stellar REPL Not Initialized with Client JAAS
> (nickwallen)
> > closes apache/metron#1232
> > METRON-1812: Fix dependencies_with_url.csv (mmiklavc via mmiklavc)
> > closes apache/metron#1230
> > METRON-1811 Alert Search Fails When Sorting by Alert Status
> (merrimanr)
> > closes apache/metron#1231
> > METRON-1809 Support Column Oriented Input with Batch Profiler
> > (nickwallen) closes apache/metron#1229
> > METRON-1806: Upgrade Maven Shade Plugin version (mmiklavc via
> mmiklavc)
> > closes apache/metron#1224
> > METRON-1792 Simplify Profile Definitions in Integration Tests
> > (nickwallen) closes apache/metron#1211
> > METRON-1807 Auto populate the recommended values to some of the
> metron
> > config parameters  (MohanDV via merrimanr) closes apache/metron#1227
> > METRON-1808 Add Ansible created pyc to gitignore (justinleet) closes
> > apache/metron#1228
> > METRON-1695 Expose pcap properties through Ambari (anandsubbu) closes
> > apache/metron#1207
> > METRON-1771 Update REST endpoints to support eventually consistent UI
> > updates (merrimanr) closes apache/metron#1190
> > METRON-1791 Add GUID to Messages Produced by Profiler (nickwallen)
> > closes apache/metron#1210
> > METRON-1804 Update version to 0.6.1 (justinleet) closes
> > apache/metron#1220
> > METRON-1798 Add mpack support for parser aggregation (anandsubbu)
> > closes apache/metron#1215
> > METRON-1750 Create Parser for Syslog RFC 5424 Messages
> (ottobackwards)
> > closes apache/metron#1175
> > METRON-1794 Include User Details When Escalating Alerts (nickwallen)
> > closes apache/metron#1212
> > METRON-1782 Add Kafka

Re: [DISCUSS] Switching to a better alternative of Pikaday.js

2018-10-10 Thread Nick Allen

> Before making a decision on what's next, I'd like to ask you a question. Is it
> really a priority and is it really worth the effort to touch our currently used
> date picker component to get ~15% reduction in the bundle size by removing
> moment?

As an aside, I think there is a greater benefit here too.  We need to make
a conscious effort to identify libraries that we are using that are
deprecated, lack community support, and are unlikely to be maintained and
updated for security vulnerabilities.  We need to actively identify and
replace those.





On Wed, Oct 10, 2018 at 9:33 AM Tamás Fodor  wrote:

> I'd like to open a discussion about switching to a new date picker library
> in the Metron Alerts UI with regard to the following:
>
>
> https://lists.apache.org/thread.html/2e4fafa4256ce14ebcd4433420974e24962884204418ade51f0e3bfb@%3Cdev.metron.apache.org%3E
>
> https://github.com/apache/metron/pull/1219#discussion_r223733562
>
> A week ago, I opened a PR about removing moment.js from the code base to
> decrease the size of the production JavaScript bundle. I achieved a 15%
> reduction in the final bundle size, which is admittedly not a game changer,
> but it still helps. Besides, if we want to rely heavily on date manipulation
> functions in the future, it's better to get rid of moment.js at this early
> stage. Go here to read more about the background and the results. I tried
> to provide as many details as I could.
>
> So far, so good. But then I stumbled upon an obstacle, Pikaday.
>
> Before going further, let me thank Tibor Meller, Michael Miklavcic and
> Nicholas Allen for taking the time to go through my proposal to deal with
> the aforementioned issue. In the end, we agreed not to go with my temporary
> solution, which was intended to work around the related problems with
> Pikaday; we would rather find and switch to a better alternative.
>
> To be fair, Pikaday is a pretty good date picker module, its only problem
> is the moment dependency if it's installed via npm. But other than that, it
> functions perfectly. Zero dependencies, small, etc. Long story short, it's
> good for us unless we want to get rid of moment.js.
>
> Before making a decision on what's next, I'd like to ask you a question. Is it
> really a priority and is it really worth the effort to touch our currently
> used date picker component to get ~15% reduction in the bundle size by
> removing moment?
>
> I'm asking it because if we want to do so, considering that it's a huge
> topic, the following questions might come up:
>
> *A: What component do we want to use instead of Pikaday?*
>
> I'm not satisfied with the alternative individual solutions out there on
> npm. I'd rather pick a component library like the Angular port of Bootstrap
> or the Angular Material library. Both of them have a date picker component
> and many other components to rely on and reuse throughout the Metron app.
>
> *B: What component library do we want to use?*
>
> Introducing a new component library requires a lot of research, and there
> are many things we have to agree on. This is a long-term decision: it would
> be great to use one library consistently instead of picking a new one a few
> months later just because we chose wrongly.
>
> *C: What about the jQuery version of Bootstrap?*
>
> So basically we already have a component library and we still use it but
> we're also planning to replace it with another or the angular port at least
> to get the most out of the angular rendering engine. Since it uses jquery,
> it's much less performant than a port written in Angular.
> And I think it's a bad idea to introduce a new one and use multiple
> component libraries within one project.
> We can also pick the date picker component from the jQuery Bootstrap but,
> again, it's not as efficient as the angular port so it seems to be
> beneficial to replace it with something else.
>
> What do you think, guys?
>
> Thanks,
> Tamas
>

Re: [DISCUSS] Add e2e step to PR checklist

2018-10-10 Thread Nick Allen

The latter.

On Wed, Oct 10, 2018 at 5:48 AM Shane Ardell 
wrote:

> Nick - To be clear, when you say you can never get them all to pass, do you
> mean you can never get all the tests to pass without protractor flake
> re-running the failing tests (ie. eventually all the tests pass in the
> end), or do you mean you still have failing tests even after
> protractor-flake does its work? I want to look into this, but before I do I
> want to make sure I understand you correctly.
>
> On Fri, Oct 5, 2018 at 12:40 PM Casey Stella  wrote:
>
> > This is really good feedback, Nick. I agree, we need them to be reliable
> > enough to not be a source of constant false positives prior to putting
> them
> > into the checklist.
> > On Thu, Oct 4, 2018 at 15:34 Nick Allen  wrote:
> >
> > > I think we still have an issue of reliability.  I can never reliably
> get
> > > them all to pass.  I have no idea which failures are real.  Am I the
> only
> > > one that experiences this?
> > >
> > > We need a reliable pass/fail on these before we talk about adding them
> to
> > > the checklist.  For example, I just tried to run them on METRON-1771.
> I
> > > don't think we have a problem with these changes, but I have not been
> > able
> > > to get one run to fully pass.  See the attached output of those runs.
> > >
> > >
> > >
> > > On Wed, Oct 3, 2018 at 7:36 AM Shane Ardell 
> > > wrote:
> > >
> > >> I ran them locally a handful of times just now, and on average they
> took
> > >> approximately 15 minutes to complete.
> > >>
> > >> On Tue, Oct 2, 2018, 18:22 Michael Miklavcic <
> > michael.miklav...@gmail.com
> > >> >
> > >> wrote:
> > >>
> > >> > @Shane Just how much time are we talking about, on average? I don't
> > >> think
> > >> > many in the community have had much exposure to running the e2e
> tests
> > in
> > >> > their current form. It might still be worth it in the short term.
> > >> >
> > >> > On Tue, Oct 2, 2018 at 10:20 AM Shane Ardell <
> > shane.m.ard...@gmail.com>
> > >> > wrote:
> > >> >
> > >> > > The protractor-flake package should catch and re-run false
> failures,
> > >> so
> > >> > > people shouldn't get failing tests when they are done running. I
> > just
> > >> > meant
> > >> > > that we often re-run flaky tests with protractor-flake, so it can
> > >> take a
> > >> > > while to run and could increase the build time considerably.
> > >> > >
> > >> > > On Tue, Oct 2, 2018, 18:00 Casey Stella 
> wrote:
> > >> > >
> > >> > > > Are the tests so brittle that, even with flaky, people will run
> > upon
> > >> > > false
> > >> > > > failures as part of contributing a PR?  If so, do we have a list
> > of
> > >> the
> > >> > > > brittle ones (and the things that would disambiguate a true
> > failure
> > >> > from
> > >> > > a
> > >> > > > false failure) that we can add to the documentation?
> > >> > > >
> > >> > > > On Tue, Oct 2, 2018 at 11:58 AM Shane Ardell <
> > >> shane.m.ard...@gmail.com
> > >> > >
> > >> > > > wrote:
> > >> > > >
> > >> > > > > I also would like to eventually have these tests automated.
> > There
> > >> > are a
> > >> > > > > couple hurdles to setting up our e2e tests to run with our
> > build.
> > >> I
> > >> > > think
> > >> > > > > the biggest hurdle is setting up a dedicated server with data
> > for
> > >> the
> > >> > > e2e
> > >> > > > > tests to use. I would assume this requires funding,
> engineering
> > >> > > support,
> > >> > > > > obfuscated data, etc. I also think we should migrate our e2e
> > >> tests to
> > >> > > > > Cypress first because Protractor lacks debugging tools that
> > would
> > >> > make
> > >> > > > our
> > >> > > > > life much easier if, for example, we had a failure in our CI
> > build
> > >> > but

Re: [DISCUSS] Add e2e step to PR checklist

2018-10-04 Thread Nick Allen

I think we still have an issue of reliability.  I can never reliably get
them all to pass.  I have no idea which failures are real.  Am I the only
one that experiences this?

We need a reliable pass/fail on these before we talk about adding them to
the checklist.  For example, I just tried to run them on METRON-1771.  I
don't think we have a problem with these changes, but I have not been able
to get one run to fully pass.  See the attached output of those runs.
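
To make the "which failures are real" question concrete, here is a minimal Python sketch of the retry idea that protractor-flake is described as applying in this thread: re-run a failing test a few times and label it flaky only if it eventually passes. This is not protractor-flake's actual implementation; the function name `classify_failures`, the `max_attempts` parameter, and the callable-per-test shape are assumptions made for illustration.

```python
from typing import Callable, Dict


def classify_failures(
    tests: Dict[str, Callable[[], bool]], max_attempts: int = 3
) -> Dict[str, str]:
    """Run each test up to max_attempts times and label the outcome.

    "passed": succeeded on the first attempt
    "flaky":  failed at least once, then succeeded on a retry
    "failed": never succeeded within max_attempts (likely a real failure)
    """
    results: Dict[str, str] = {}
    for name, run in tests.items():
        outcome = "failed"
        for attempt in range(1, max_attempts + 1):
            if run():
                outcome = "passed" if attempt == 1 else "flaky"
                break
        results[name] = outcome
    return results
```

A test that still reports "failed" after every retry is the case described above, where the retries do not help and the run cannot be trusted as a pass/fail signal.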



On Wed, Oct 3, 2018 at 7:36 AM Shane Ardell 
wrote:

> I ran them locally a handful of times just now, and on average they took
> approximately 15 minutes to complete.
>
> On Tue, Oct 2, 2018, 18:22 Michael Miklavcic 
> wrote:
>
> > @Shane Just how much time are we talking about, on average? I don't think
> > many in the community have had much exposure to running the e2e tests in
> > their current form. It might still be worth it in the short term.
> >
> > On Tue, Oct 2, 2018 at 10:20 AM Shane Ardell 
> > wrote:
> >
> > > The protractor-flake package should catch and re-run false failures, so
> > > people shouldn't get failing tests when they are done running. I just
> > meant
> > > that we often re-run flaky tests with protractor-flake, so it can take
> a
> > > while to run and could increase the build time considerably.
> > >
> > > On Tue, Oct 2, 2018, 18:00 Casey Stella  wrote:
> > >
> > > > Are the tests so brittle that, even with flaky, people will run upon
> > > false
> > > > failures as part of contributing a PR?  If so, do we have a list of
> the
> > > > brittle ones (and the things that would disambiguate a true failure
> > from
> > > a
> > > > false failure) that we can add to the documentation?
> > > >
> > > > On Tue, Oct 2, 2018 at 11:58 AM Shane Ardell <
> shane.m.ard...@gmail.com
> > >
> > > > wrote:
> > > >
> > > > > I also would like to eventually have these tests automated. There
> > are a
> > > > > couple hurdles to setting up our e2e tests to run with our build. I
> > > think
> > > > > the biggest hurdle is setting up a dedicated server with data for
> the
> > > e2e
> > > > > tests to use. I would assume this requires funding, engineering
> > > support,
> > > > > obfuscated data, etc. I also think we should migrate our e2e tests
> to
> > > > > Cypress first because Protractor lacks debugging tools that would
> > make
> > > > our
> > > > > life much easier if, for example, we had a failure in our CI build
> > but
> > > > > could not reproduce locally. In addition, our current Protractor
> > tests
> > > > are
> > > > > brittle and extremely slow.
> > > > >
> > > > > All that said, it seems we agree that we could add another PR
> > checklist
> > > > > item in the meantime. Clarifying those e2e test instructions should
> > be
> > > > part
> > > > > of that task.
> > > > >
> > > > > On Mon, Oct 1, 2018 at 2:36 PM Casey Stella 
> > > wrote:
> > > > >
> > > > > > I'd also like to make sure that clear instructions are provided
> (or
> > > > > linked
> > > > > > to) about how to run them.  Also, we need to make sure the
> > > instructions
> > > > > are
> > > > > > rock-solid for running them.
> > > > > > Looking at
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/metron/tree/master/metron-interface/metron-alerts#e2e-tests
> > > > > > ,
> > > > > > would someone who doesn't have much or any knowledge of the UI be
> > > able
> > > > to
> > > > > > run that without assistance?
> > > > > >
> > > > > > For instance, we use full-dev, do we need to stop data from being
> > > > played
> > > > > > into full-dev for the tests to work?
> > > > > >
> > > > > > Casey
> > > > > >
> > > > > > On Mon, Oct 1, 2018 at 8:29 AM Casey Stella 
> > > > wrote:
> > > > > >
> > > > > > > I'm not super keen on expanding the steps to contribute,
> > especially
> > > > in
> > > > > an
> > > > > > > avenue that should be automated.
> > > > > > > That being said, I think that until we get to the point of
> > > automating
> > > > > the
> > > > > > > e2e tests, it's sensible to add them to the checklist.
> > > > > > > So, I would support it, but I would also urge us to move
> forward
> > > the
> > > > > > > efforts of running these tests as part of the CI build.
> > > > > > >
> > > > > > > What is the current gap there?
> > > > > > >
> > > > > > > Casey
> > > > > > >
> > > > > > > On Mon, Oct 1, 2018 at 7:41 AM Shane Ardell <
> > > > shane.m.ard...@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > >> Hello everyone,
> > > > > > >>
> > > > > > >> In another discussion thread from July, I briefly mentioned
> the
> > > idea
> > > > > of
> > > > > > >> adding a step to the pull request checklist asking
> contributors
> > to
> > > > run
> > > > > > the
> > > > > > >> UI end-to-end tests. Since we aren't running e2e tests as part
> > of
> > > > the
> > > > > CI
> > > > > > >> build, it's easy for contributors to unintentionally break
> these
> > > > > tests.
> > > > > > >> Reminding contributors to run these tests will hopefully help

Re: Metron dev environments moving to require Ansible 2.4+

2018-10-01 Thread Nick Allen

I have been able to spin-up a development environment today without a
problem.

On Mon, Oct 1, 2018 at 3:58 PM Michael Miklavcic <
michael.miklav...@gmail.com> wrote:

> Anyone run latest master today with full dev? I'm seeing
> NoClassDefFoundError exceptions on starting enrichments. I upgraded to
> latest Ansible and the provisioning part seemed to work just fine.
> https://gist.github.com/mmiklavc/56205526b4736e859aa7ba52468ff4f3
>
>
> On Mon, Oct 1, 2018 at 6:46 AM zeo...@gmail.com  wrote:
>
>> Hi All,
>>
>> This has been pushed to master.  Please update your Ansible
>> appropriately.  Thanks!
>>
>> Jon
>>
>> On Fri, Sep 28, 2018 at 12:18 PM Otto Fowler 
>> wrote:
>>
>>> Yeah,  I thought we had more but maybe they were removed.
>>> Many places in *.md files referencing Ansible and versions too
>>>
>>>
>>> On September 28, 2018 at 11:45:14, zeo...@gmail.com (zeo...@gmail.com)
>>> wrote:
>>>
>>> Do you mean this?
>>> It was the only reference I could find on the wiki.  All of the READMEs
>>> should be updated as a part of the PR, but feel free to provide your input
>>> if I missed anything.
>>>
>>> Jon
>>>
>>> On Fri, Sep 28, 2018 at 10:15 AM Otto Fowler 
>>> wrote:
>>>
>>>> We should make sure the non-source documentation is updated
>>>>
>>>> On September 28, 2018 at 09:32:52, zeo...@gmail.com (zeo...@gmail.com)
>>>> wrote:
>>>>
>>>> Hi All,
>>>>
>>>> As it currently sits, once METRON-1758 is merged into the code
>>>> base, Ansible 2.4 or later will be required to use any of the Metron
>>>> ansible playbooks.  This is in contrast to the prior version requirements
>>>> outlined in Metron documentation which specifically point to 2.0.0.2 and
>>>> 2.2.0.0 as supported/recommended Ansible versions.  If you install Ansible
>>>> 2.5.0 exactly you should not experience any issues spinning up pre- and
>>>> post-merge versions of Metron.
>>>>
>>>> I am broadcasting this to both the user and dev communities in advance
>>>> of any changes to provide an opportunity to voice any concerns.  Thanks,
>>>>
>>>> Jon
>>>> --
>>>>
>>>> Jon
>>>>
>>>> --
>>>
>>> Jon
>>>
>>> --
>>
>> Jon
>>
>

Re: [DISCUSS] Batch Profiler Feature Branch

2018-09-28 Thread Nick Allen

Thanks for all the reviews and support.  I have merged the feature branch
into master.

On Thu, Sep 27, 2018 at 2:41 PM James Sirota  wrote:

> +1 from me as well. great work
>
> 27.09.2018, 11:15, "Ryan Merriman" :
> > +1 from me. Great work.
> >
> > On Thu, Sep 27, 2018 at 12:41 PM Justin Leet 
> wrote:
> >
> >>  I'm +1 on merging the feature branch into master. There's a lot of good
> >>  work here, and it's definitely been nice to see the couple remaining
> >>  improvements make it in.
> >>
> >>  Thanks a lot for the contribution, this is great stuff!
> >>
> >>  On Wed, Sep 26, 2018 at 6:26 PM Nick Allen  wrote:
> >>
> >>  > Or support to be offered for merging this feature branch into master?
> >>  >
> >>  > On Wed, Sep 26, 2018 at 6:20 PM Nick Allen 
> wrote:
> >>  >
> >>  > > Thanks for the review. With
> >>  https://github.com/apache/metron/pull/1209
> >>  > complete,
> >>  > > I think the feature branch is ready to be merged. Sounds like I
> have
> >>  > > Mike's support. Anyone else have comments, concerns, questions?
> >>  > >
> >>  > > On Tue, Sep 25, 2018 at 10:33 PM Michael Miklavcic <
> >>  > > michael.miklav...@gmail.com> wrote:
> >>  > >
> >>  > >> I just made a couple minor comments on that PR, and I am in
> agreement
> >>  > >> about
> >>  > >> the readiness for merging with master. Good stuff Nick.
> >>  > >>
> >>  > >> On Fri, Sep 21, 2018 at 12:37 PM Nick Allen 
> >>  wrote:
> >>  > >>
> >>  > >> > Here is a PR that adds the input time constraints to the Batch
> >>  > Profiler
> >>  > >> > (METRON-1787); https://github.com/apache/metron/pull/1209.
> >>  > >> >
> >>  > >> > It seems that the consensus is that this is probably the last
> >>  feature
> >>  > we
> >>  > >> > need before merging the FB into master. The other two can wait
> >>  until
> >>  > >> after
> >>  > >> > the feature branch has been merged. Let me know if you disagree.
> >>  > >> >
> >>  > >> > Thanks
> >>  > >> >
> >>  > >> >
> >>  > >> > On Thu, Sep 20, 2018 at 1:55 PM Nick Allen 
> >>  > wrote:
> >>  > >> >
> >>  > >> > > Yeah, agreed. Per use case 3, when deploying to production
> there
> >>  > >> really
> >>  > >> > > wouldn't be a huge overlap like 3 months of already profiled
> data.
> >>  > >> It's
> >>  > >> > day
> >>  > >> > > 1, the profile was just deployed around the same time as you
> are
> >>  > >> running
> >>  > >> > > the Batch Profiler, so the overlap is in minutes, maybe hours.
> >>  But
> >>  > I
> >>  > >> can
> >>  > >> > > definitely see the usefulness of the feature for re-runs, etc
> as
> >>  you
> >>  > >> have
> >>  > >> > > described.
> >>  > >> > >
> >>  > >> > > Based on this discussion, I created a few JIRAs. Thanks all
> for
> >>  the
> >>  > >> > great
> >>  > >> > > feedback and keep it coming.
> >>  > >> > >
> >>  > >> > > [1] METRON-1787 - Input Time Constraints for Batch Profiler
> >>  > >> > > [2] METRON-1788 - Fetch Profile Definitions from Zk for Batch
> >>  > Profiler
> >>  > >> > > [3] METRON-1789 - MPack Should Define Default Input Path for
> Batch
> >>  > >> > > Profiler
> >>  > >> > >
> >>  > >> > >
> >>  > >> > > --
> >>  > >> > > [1] https://issues.apache.org/jira/browse/METRON-1787
> >>  > >> > > [2] https://issues.apache.org/jira/browse/METRON-1788
> >>  > >> > > [3] https://issues.apache.org/jira/browse/METRON-1789
> >>  > >> > >
> >>  > >> > >
> >>  > >> > >
> >>  > >> > >
> >>  > >> > >
> >>  > >> > >

Re: [DISCUSS] Batch Profiler Feature Branch

2018-09-26 Thread Nick Allen

Or support to be offered for merging this feature branch into master?

On Wed, Sep 26, 2018 at 6:20 PM Nick Allen  wrote:

> Thanks for the review.  With  https://github.com/apache/metron/pull/1209 
> complete,
> I think the feature branch is ready to be merged.  Sounds like I have
> Mike's support.  Anyone else have comments, concerns, questions?
>
> On Tue, Sep 25, 2018 at 10:33 PM Michael Miklavcic <
> michael.miklav...@gmail.com> wrote:
>
>> I just made a couple minor comments on that PR, and I am in agreement
>> about
>> the readiness for merging with master. Good stuff Nick.
>>
>> On Fri, Sep 21, 2018 at 12:37 PM Nick Allen  wrote:
>>
>> > Here is a PR that adds the input time constraints to the Batch Profiler
>> > (METRON-1787);  https://github.com/apache/metron/pull/1209.
>> >
>> > It seems that the consensus is that this is probably the last feature we
>> > need before merging the FB into master.  The other two can wait until
>> after
>> > the feature branch has been merged.  Let me know if you disagree.
>> >
>> > Thanks
>> >
>> >
>> > On Thu, Sep 20, 2018 at 1:55 PM Nick Allen  wrote:
>> >
>> > > Yeah, agreed.  Per use case 3, when deploying to production there
>> really
>> > > wouldn't be a huge overlap like 3 months of already profiled data.
>> It's
>> > day
>> > > 1, the profile was just deployed around the same time as you are
>> running
>> > > the Batch Profiler, so the overlap is in minutes, maybe hours.  But I
>> can
>> > > definitely see the usefulness of the feature for re-runs, etc as you
>> have
>> > > described.
>> > >
>> > > Based on this discussion, I created a few JIRAs.  Thanks all for the
>> > great
>> > > feedback and keep it coming.
>> > >
>> > > [1] METRON-1787 - Input Time Constraints for Batch Profiler
>> > > [2] METRON-1788 - Fetch Profile Definitions from Zk for Batch Profiler
>> > > [3] METRON-1789 - MPack Should Define Default Input Path for Batch
>> > > Profiler
>> > >
>> > >
>> > > --
>> > > [1] https://issues.apache.org/jira/browse/METRON-1787
>> > > [2] https://issues.apache.org/jira/browse/METRON-1788
>> > > [3] https://issues.apache.org/jira/browse/METRON-1789
>> > >
>> > >
>> > >
>> > >
>> > >
>> > >
>> > > On Thu, Sep 20, 2018 at 1:34 PM Michael Miklavcic <
>> > > michael.miklav...@gmail.com> wrote:
>> > >
>> > >> I think we might want to allow the flexibility to choose the date
>> range
>> > >> then. I don't yet feel like I have a good enough understanding of all
>> > the
>> > >> ways in which users would want to seed to force them to run the batch
>> > job
>> > >> over all the data. It might also make it easier to deal with
>> > remediation,
>> > >> ie an error doesn't force you to re-run over the entire history. Same
>> > goes
>> > >> for testing out the profile seeding batch job in the first place.
>> > >>
>> > >> On Thu, Sep 20, 2018 at 11:23 AM Nick Allen 
>> wrote:
>> > >>
>> > >> > Assuming you have 9 months of data archived, yes.
>> > >> >
>> > >> > On Thu, Sep 20, 2018 at 1:22 PM Michael Miklavcic <
>> > >> > michael.miklav...@gmail.com> wrote:
>> > >> >
>> > >> > > So in the case of 3 - if you had 6 months of data that hadn't
>> been
>> > >> > profiled
>> > >> > > and another 3 that had been profiled (9 months total data), in
>> its
>> > >> > current
>> > >> > > form the batch job runs over all 9 months?
>> > >> > >
>> > >> > > On Thu, Sep 20, 2018 at 11:13 AM Nick Allen 
>> > >> wrote:
>> > >> > >
>> > >> > > > > How do we establish "tm" from 1.1 above? Any concerns about
>> > >> overlap
>> > >> > or
>> > >> > > > gaps after the seeding is performed?
>> > >> > > >
>> > >> > > > Good point.  Right now, if the Streaming and Batch Profiler
>> > overlap
>> > >> the
>> > >> > > > last write wins.  And presumably the output of the Strea

Re: [DISCUSS] Batch Profiler Feature Branch

2018-09-26 Thread Nick Allen

Thanks for the review.  With
https://github.com/apache/metron/pull/1209 complete,
I think the feature branch is ready to be merged.  Sounds like I have
Mike's support.  Anyone else have comments, concerns, questions?

On Tue, Sep 25, 2018 at 10:33 PM Michael Miklavcic <
michael.miklav...@gmail.com> wrote:

> I just made a couple minor comments on that PR, and I am in agreement about
> the readiness for merging with master. Good stuff Nick.
>
> On Fri, Sep 21, 2018 at 12:37 PM Nick Allen  wrote:
>
> > Here is a PR that adds the input time constraints to the Batch Profiler
> > (METRON-1787);  https://github.com/apache/metron/pull/1209.
> >
> > It seems that the consensus is that this is probably the last feature we
> > need before merging the FB into master.  The other two can wait until
> after
> > the feature branch has been merged.  Let me know if you disagree.
> >
> > Thanks
> >
> >
> > On Thu, Sep 20, 2018 at 1:55 PM Nick Allen  wrote:
> >
> > > Yeah, agreed.  Per use case 3, when deploying to production there
> really
> > > wouldn't be a huge overlap like 3 months of already profiled data.  It's
> > day
> > > 1, the profile was just deployed around the same time as you are
> running
> > > the Batch Profiler, so the overlap is in minutes, maybe hours.  But I
> can
> > > definitely see the usefulness of the feature for re-runs, etc as you
> have
> > > described.
> > >
> > > Based on this discussion, I created a few JIRAs.  Thanks all for the
> > great
> > > feedback and keep it coming.
> > >
> > > [1] METRON-1787 - Input Time Constraints for Batch Profiler
> > > [2] METRON-1788 - Fetch Profile Definitions from Zk for Batch Profiler
> > > [3] METRON-1789 - MPack Should Define Default Input Path for Batch
> > > Profiler
> > >
> > >
> > > --
> > > [1] https://issues.apache.org/jira/browse/METRON-1787
> > > [2] https://issues.apache.org/jira/browse/METRON-1788
> > > [3] https://issues.apache.org/jira/browse/METRON-1789
> > >
> > >
> > >
> > >
> > >
> > >
> > > On Thu, Sep 20, 2018 at 1:34 PM Michael Miklavcic <
> > > michael.miklav...@gmail.com> wrote:
> > >
> > >> I think we might want to allow the flexibility to choose the date
> range
> > >> then. I don't yet feel like I have a good enough understanding of all
> > the
> > >> ways in which users would want to seed to force them to run the batch
> > job
> > >> over all the data. It might also make it easier to deal with
> > remediation,
> > >> ie an error doesn't force you to re-run over the entire history. Same
> > goes
> > >> for testing out the profile seeding batch job in the first place.
> > >>
> > >> On Thu, Sep 20, 2018 at 11:23 AM Nick Allen 
> wrote:
> > >>
> > >> > Assuming you have 9 months of data archived, yes.
> > >> >
> > >> > On Thu, Sep 20, 2018 at 1:22 PM Michael Miklavcic <
> > >> > michael.miklav...@gmail.com> wrote:
> > >> >
> > >> > > So in the case of 3 - if you had 6 months of data that hadn't been
> > >> > profiled
> > >> > > and another 3 that had been profiled (9 months total data), in its
> > >> > current
> > >> > > form the batch job runs over all 9 months?
> > >> > >
> > >> > > On Thu, Sep 20, 2018 at 11:13 AM Nick Allen 
> > >> wrote:
> > >> > >
> > >> > > > > How do we establish "tm" from 1.1 above? Any concerns about
> > >> overlap
> > >> > or
> > >> > > > gaps after the seeding is performed?
> > >> > > >
> > >> > > > Good point.  Right now, if the Streaming and Batch Profiler
> > overlap
> > >> the
> > >> > > > last write wins.  And presumably the output of the Streaming and
> > >> Batch
> > >> > > > Profiler are the same, so no worries, right? :)
> > >> > > >
> > >> > > > So it kind of works, but it is definitely not ideal for use case
> > >> 3.  I
> > >> > > > could add --begin and --end args to constrain the time frame
> over
> > >> which
> > >> > > the
> > >> > > > Batch Profiler runs.  I do not have that in the feature branch.
> > It
> > >>

Re: [DISCUSS] Batch Profiler Feature Branch

2018-09-21 Thread Nick Allen

Here is a PR that adds the input time constraints to the Batch Profiler
(METRON-1787);  https://github.com/apache/metron/pull/1209.
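
As a rough illustration of what an input time constraint does, the following Python sketch filters archived messages to a half-open window [begin, end) before they would be profiled. It is not the code in the PR; the helper names, the ISO-8601 argument format, and the assumption that each message carries an epoch-millisecond `timestamp` field are all hypothetical choices made for this example.

```python
from datetime import datetime, timezone
from typing import Dict, Iterable, Iterator, Optional


def to_epoch_millis(iso: str) -> int:
    """Parse an ISO-8601 string; treat naive values as UTC (an assumption)."""
    dt = datetime.fromisoformat(iso)
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)
    return int(dt.timestamp() * 1000)


def within_window(
    messages: Iterable[Dict],
    begin: Optional[str] = None,
    end: Optional[str] = None,
) -> Iterator[Dict]:
    """Yield only messages whose timestamp falls in [begin, end).

    Either bound may be omitted, which leaves that side unconstrained.
    """
    lo = to_epoch_millis(begin) if begin else None
    hi = to_epoch_millis(end) if end else None
    for message in messages:
        ts = message["timestamp"]  # assumed field name, epoch milliseconds
        if lo is not None and ts < lo:
            continue
        if hi is not None and ts >= hi:
            continue
        yield message
```

With both bounds omitted, every message passes through, which corresponds to the behaviour discussed earlier in the thread where the batch job runs over all archived data.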

It seems that the consensus is that this is probably the last feature we
need before merging the FB into master.  The other two can wait until after
the feature branch has been merged.  Let me know if you disagree.

Thanks


On Thu, Sep 20, 2018 at 1:55 PM Nick Allen  wrote:

> Yeah, agreed.  Per use case 3, when deploying to production there really
> wouldn't be a huge overlap like 3 months of already profiled data.  It's day
> 1, the profile was just deployed around the same time as you are running
> the Batch Profiler, so the overlap is in minutes, maybe hours.  But I can
> definitely see the usefulness of the feature for re-runs, etc as you have
> described.
>
> Based on this discussion, I created a few JIRAs.  Thanks all for the great
> feedback and keep it coming.
>
> [1] METRON-1787 - Input Time Constraints for Batch Profiler
> [2] METRON-1788 - Fetch Profile Definitions from Zk for Batch Profiler
> [3] METRON-1789 - MPack Should Define Default Input Path for Batch
> Profiler
>
>
> --
> [1] https://issues.apache.org/jira/browse/METRON-1787
> [2] https://issues.apache.org/jira/browse/METRON-1788
> [3] https://issues.apache.org/jira/browse/METRON-1789
>
>
>
>
>
>
> On Thu, Sep 20, 2018 at 1:34 PM Michael Miklavcic <
> michael.miklav...@gmail.com> wrote:
>
>> I think we might want to allow the flexibility to choose the date range
>> then. I don't yet feel like I have a good enough understanding of all the
>> ways in which users would want to seed to force them to run the batch job
>> over all the data. It might also make it easier to deal with remediation,
>> ie an error doesn't force you to re-run over the entire history. Same goes
>> for testing out the profile seeding batch job in the first place.
>>
>> On Thu, Sep 20, 2018 at 11:23 AM Nick Allen  wrote:
>>
>> > Assuming you have 9 months of data archived, yes.
>> >
>> > On Thu, Sep 20, 2018 at 1:22 PM Michael Miklavcic <
>> > michael.miklav...@gmail.com> wrote:
>> >
>> > > So in the case of 3 - if you had 6 months of data that hadn't been
>> > profiled
>> > > and another 3 that had been profiled (9 months total data), in its
>> > current
>> > > form the batch job runs over all 9 months?
>> > >
>> > > On Thu, Sep 20, 2018 at 11:13 AM Nick Allen 
>> wrote:
>> > >
>> > > > > How do we establish "tm" from 1.1 above? Any concerns about
>> overlap
>> > or
>> > > > gaps after the seeding is performed?
>> > > >
>> > > > Good point.  Right now, if the Streaming and Batch Profiler overlap
>> the
>> > > > last write wins.  And presumably the output of the Streaming and
>> Batch
>> > > > Profiler are the same, so no worries, right? :)
>> > > >
>> > > > So it kind of works, but it is definitely not ideal for use case
>> 3.  I
>> > > > could add --begin and --end args to constrain the time frame over
>> which
>> > > the
>> > > > Batch Profiler runs.  I do not have that in the feature branch.  It
>> > would
>> > > > be easy enough to add though.
>> > > >
>> > > >
>> > > >
>> > > > On Thu, Sep 20, 2018 at 12:41 PM Michael Miklavcic <
>> > > > michael.miklav...@gmail.com> wrote:
>> > > >
>> > > > > Ok, makes sense. That's sort of what I was thinking as well, Nick.
>> > > > Pulling
>> > > > > at this thread just a bit more...
>> > > > >
>> > > > >1. I have an existing system that's been up a while, and I have
>> > > added
>> > > > k
>> > > > >profiles - assume these are the first profiles I've created.
>> > > > >   1. I would have t0 - tm (where m is the time when the
>> profiles
>> > > were
>> > > > >   first installed) worth of data that has not been profiled
>> yet.
>> > > > >   2. The batch profiler process would be to take that exact
>> > profile
>> > > > >   definition from ZK and run the batch loader with that from
>> the
>> > > CLI.
>> > > > >   3. Profiles are now up to date from t0 - tCurrent
>> > > > >2. I've already done #1 above. Time goes by and now I want to
>> add
>> > a

Re: [DISCUSS] Batch Profiler Feature Branch

2018-09-20 Thread Nick Allen

Yeah, agreed.  Per use case 3, when deploying to production there really
wouldn't be a huge overlap like 3 months of already profiled data.  It's day
1, the profile was just deployed around the same time as you are running
the Batch Profiler, so the overlap is in minutes, maybe hours.  But I can
definitely see the usefulness of the feature for re-runs, etc as you have
described.
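
The "last write wins" behaviour mentioned in this thread for overlapping Streaming and Batch Profiler output can be sketched in Python as follows. This is only an illustration: the in-memory dict stands in for the real profile store, the key layout of (profile, entity, period) is an assumption, and the 15-minute period duration is an assumed value rather than something stated in this thread.

```python
from typing import Dict, Tuple

# Assumed period duration of 15 minutes, expressed in milliseconds.
PERIOD_MS = 15 * 60 * 1000

# Hypothetical key layout: (profile name, entity, period index).
Key = Tuple[str, str, int]


def period_of(ts_ms: int) -> int:
    """Map an epoch-millisecond timestamp to its period index."""
    return ts_ms // PERIOD_MS


def write(store: Dict[Key, float], profile: str, entity: str,
          ts_ms: int, value: float) -> None:
    """Write a profile measurement; a later write to the same
    (profile, entity, period) key simply replaces the earlier one."""
    store[(profile, entity, period_of(ts_ms))] = value
```

If both profilers compute the same value for an overlapping period, the overwrite is harmless, which is why a short overlap of minutes or hours matters little in practice.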

Based on this discussion, I created a few JIRAs.  Thanks all for the great
feedback and keep it coming.

[1] METRON-1787 - Input Time Constraints for Batch Profiler
[2] METRON-1788 - Fetch Profile Definitions from Zk for Batch Profiler
[3] METRON-1789 - MPack Should Define Default Input Path for Batch Profiler


--
[1] https://issues.apache.org/jira/browse/METRON-1787
[2] https://issues.apache.org/jira/browse/METRON-1788
[3] https://issues.apache.org/jira/browse/METRON-1789






On Thu, Sep 20, 2018 at 1:34 PM Michael Miklavcic <
michael.miklav...@gmail.com> wrote:

> I think we might want to allow the flexibility to choose the date range
> then. I don't yet feel like I have a good enough understanding of all the
> ways in which users would want to seed to force them to run the batch job
> over all the data. It might also make it easier to deal with remediation,
> ie an error doesn't force you to re-run over the entire history. Same goes
> for testing out the profile seeding batch job in the first place.
>
> On Thu, Sep 20, 2018 at 11:23 AM Nick Allen  wrote:
>
> > Assuming you have 9 months of data archived, yes.
> >
> > On Thu, Sep 20, 2018 at 1:22 PM Michael Miklavcic <
> > michael.miklav...@gmail.com> wrote:
> >
> > > So in the case of 3 - if you had 6 months of data that hadn't been
> > profiled
> > > and another 3 that had been profiled (9 months total data), in its
> > current
> > > form the batch job runs over all 9 months?
> > >
> > > On Thu, Sep 20, 2018 at 11:13 AM Nick Allen 
> wrote:
> > >
> > > > > How do we establish "tm" from 1.1 above? Any concerns about overlap
> > or
> > > > gaps after the seeding is performed?
> > > >
> > > > Good point.  Right now, if the Streaming and Batch Profiler overlap
> the
> > > > last write wins.  And presumably the output of the Streaming and
> Batch
> > > > Profiler are the same, so no worries, right? :)
> > > >
> > > > So it kind of works, but it is definitely not ideal for use case 3.
> I
> > > > could add --begin and --end args to constrain the time frame over
> which
> > > the
> > > > Batch Profiler runs.  I do not have that in the feature branch.  It
> > would
> > > > be easy enough to add though.
> > > >
> > > >
> > > >
> > > > On Thu, Sep 20, 2018 at 12:41 PM Michael Miklavcic <
> > > > michael.miklav...@gmail.com> wrote:
> > > >
> > > > > Ok, makes sense. That's sort of what I was thinking as well, Nick.
> > > > Pulling
> > > > > at this thread just a bit more...
> > > > >
> > > > >1. I have an existing system that's been up a while, and I have
> > > added
> > > > k
> > > > >profiles - assume these are the first profiles I've created.
> > > > >   1. I would have t0 - tm (where m is the time when the
> profiles
> > > were
> > > > >   first installed) worth of data that has not been profiled
> yet.
> > > > >   2. The batch profiler process would be to take that exact
> > profile
> > > > >   definition from ZK and run the batch loader with that from
> the
> > > CLI.
> > > > >   3. Profiles are now up to date from t0 - tCurrent
> > > > >2. I've already done #1 above. Time goes by and now I want to
> add
> > a
> > > > new
> > > > >profile.
> > > > >   1. Same first step above
> > > > >   2. I would run the batch loader with *only* that new profile
> > > > >   definition to seed?
> > > > >
> > > > > Forgive me if I missed this in PR's and discussion in the FB, but
> how
> > > do
> > > > we
> > > > > establish "tm" from 1.1 above? Any concerns about overlap or gaps
> > after
> > > > the
> > > > > seeding is performed?
> > > > >
> > > > > On Thu, Sep 20, 2018 at 10:26 AM Nick Allen 
> > > wrote:
> > > > >
> > > > > > I think more often than not,

Re: [DISCUSS] Batch Profiler Feature Branch

2018-09-20 Thread Nick Allen

Assuming you have 9 months of data archived, yes.

On Thu, Sep 20, 2018 at 1:22 PM Michael Miklavcic <
michael.miklav...@gmail.com> wrote:

> So in the case of 3 - if you had 6 months of data that hadn't been profiled
> and another 3 that had been profiled (9 months total data), in its current
> form the batch job runs over all 9 months?
>
> On Thu, Sep 20, 2018 at 11:13 AM Nick Allen  wrote:
>
> > > How do we establish "tm" from 1.1 above? Any concerns about overlap or
> > gaps after the seeding is performed?
> >
> > Good point.  Right now, if the Streaming and Batch Profiler overlap the
> > last write wins.  And presumably the output of the Streaming and Batch
> > Profiler are the same, so no worries, right? :)
> >
> > So it kind of works, but it is definitely not ideal for use case 3.  I
> > could add --begin and --end args to constrain the time frame over which
> the
> > Batch Profiler runs.  I do not have that in the feature branch.  It would
> > be easy enough to add though.
> >
> >
> >
> > On Thu, Sep 20, 2018 at 12:41 PM Michael Miklavcic <
> > michael.miklav...@gmail.com> wrote:
> >
> > > Ok, makes sense. That's sort of what I was thinking as well, Nick.
> > Pulling
> > > at this thread just a bit more...
> > >
> > >1. I have an existing system that's been up a while, and I have
> added
> > k
> > >profiles - assume these are the first profiles I've created.
> > >   1. I would have t0 - tm (where m is the time when the profiles
> were
> > >   first installed) worth of data that has not been profiled yet.
> > >   2. The batch profiler process would be to take that exact profile
> > >   definition from ZK and run the batch loader with that from the
> CLI.
> > >   3. Profiles are now up to date from t0 - tCurrent
> > >2. I've already done #1 above. Time goes by and now I want to add a
> > new
> > >profile.
> > >   1. Same first step above
> > >   2. I would run the batch loader with *only* that new profile
> > >   definition to seed?
> > >
> > > Forgive me if I missed this in PR's and discussion in the FB, but how
> do
> > we
> > > establish "tm" from 1.1 above? Any concerns about overlap or gaps after
> > the
> > > seeding is performed?
> > >
> > > On Thu, Sep 20, 2018 at 10:26 AM Nick Allen 
> wrote:
> > >
> > > > I think more often than not, you would want to load your profile
> > > definition
> > > > from a file.  This is why I considered the 'load from Zk' more of a
> > > > nice-to-have.
> > > >
> > > >- In use case 1 and 2, this would definitely be the case.  The
> > > profiles
> > > >I am working with are speculative and I am using the batch
> profiler
> > to
> > > >determine if they are worth keeping.  In this case, my speculative
> > > > profiles
> > > >would not be in Zk (yet).
> > > >- In use case 3, I could see it go either way.  It might be useful
> > to
> > > >load from Zk, but it certainly isn't a blocker.
> > > >
> > > >
> > > > > So if the config does not correctly match the profiler config held
> in
> > > ZK
> > > > and
> > > > the user runs the batch seeding job, what happens?
> > > >
> > > > You would just get a profile that is slightly different over the
> entire
> > > > time span.  This is not a new risk.  If the user changes their
> Profile
> > > > definitions in Zk, the same thing would happen.
> > > >
> > > >
> > > > On Thu, Sep 20, 2018 at 12:15 PM Michael Miklavcic <
> > > > michael.miklav...@gmail.com> wrote:
> > > >
> > > > > I think I'm torn on this, specifically because it's batch and would
> > > > > generally be run as-needed. Justin, can you elaborate on your
> > concerns
> > > > > there? This feels functionally very similar to our flat file
> loaders,
> > > > which
> > > > > all have inputs for config from the CLI only. On the other hand,
> our
> > > flat
> > > > > file loaders are not typically seeding an existing structure. My
> > > concern
> > > > of
> > > > > a local file profiler config stems from this stated goal:
> > > > > > The goal would be to enable “profile seed

Re: [DISCUSS] Batch Profiler Feature Branch

2018-09-20 Thread Nick Allen

> It's just cleaner from a usage/management perspective to say "I want to put
a profile in prod, just use streaming profiler and the batch profiler with
the same setup and they're good to go."

Agreed.  I can add it.  It would be a simple addition.

On Thu, Sep 20, 2018 at 12:49 PM Justin Leet  wrote:

> I think the main difference between this and the flatfile loader is that we
> actively maintain our profiles in ZK for streaming.  Doing this from files
> is likely going to be the main usage, particularly for speculative usage.
>
> For me, the main use case for ZK is definitely use case 3.
>
> I can definitely be persuaded that this isn't a blocker for right now, but
> I think there will be problems in practice from not having the
> functionality. E.g. "We want to refresh everything because of mistake X,
> and nobody refreshed the file/ZK and they've diverged".  While nobody likes
> to refresh prod data (or some subset), I have seen it happen in literally
> every single project I've worked on.  On dev/integration environments this
> is even more likely.  Most people probably aren't going to store these
> files in their version control (even though they probably should) and these
> sort of divergences will happen.
>
>  It's just cleaner from a usage/management perspective to say "I want to
> put a profile in prod, just use streaming profiler and the batch profiler
> with the same setup and they're good to go."
>
> On Thu, Sep 20, 2018 at 12:41 PM Michael Miklavcic <
> michael.miklav...@gmail.com> wrote:
>
> > Ok, makes sense. That's sort of what I was thinking as well, Nick.
> Pulling
> > at this thread just a bit more...
> >
> >1. I have an existing system that's been up a while, and I have added
> k
> >profiles - assume these are the first profiles I've created.
> >   1. I would have t0 - tm (where m is the time when the profiles were
> >   first installed) worth of data that has not been profiled yet.
> >   2. The batch profiler process would be to take that exact profile
> >   definition from ZK and run the batch loader with that from the CLI.
> >   3. Profiles are now up to date from t0 - tCurrent
> >2. I've already done #1 above. Time goes by and now I want to add a
> new
> >profile.
> >   1. Same first step above
> >   2. I would run the batch loader with *only* that new profile
> >   definition to seed?
> >
> > Forgive me if I missed this in PR's and discussion in the FB, but how do
> we
> > establish "tm" from 1.1 above? Any concerns about overlap or gaps after
> the
> > seeding is performed?
> >
> > On Thu, Sep 20, 2018 at 10:26 AM Nick Allen  wrote:
> >
> > > I think more often than not, you would want to load your profile
> > definition
> > > from a file.  This is why I considered the 'load from Zk' more of a
> > > nice-to-have.
> > >
> > >- In use case 1 and 2, this would definitely be the case.  The
> > profiles
> > >I am working with are speculative and I am using the batch profiler
> to
> > >determine if they are worth keeping.  In this case, my speculative
> > > profiles
> > >would not be in Zk (yet).
> > >- In use case 3, I could see it go either way.  It might be useful
> to
> > >load from Zk, but it certainly isn't a blocker.
> > >
> > >
> > > > So if the config does not correctly match the profiler config held in
> > ZK
> > > and
> > > the user runs the batch seeding job, what happens?
> > >
> > > You would just get a profile that is slightly different over the entire
> > > time span.  This is not a new risk.  If the user changes their Profile
> > > definitions in Zk, the same thing would happen.
> > >
> > >
> > > On Thu, Sep 20, 2018 at 12:15 PM Michael Miklavcic <
> > > michael.miklav...@gmail.com> wrote:
> > >
> > > > I think I'm torn on this, specifically because it's batch and would
> > > > generally be run as-needed. Justin, can you elaborate on your
> concerns
> > > > there? This feels functionally very similar to our flat file loaders,
> > > which
> > > > all have inputs for config from the CLI only. On the other hand, our
> > flat
> > > > file loaders are not typically seeding an existing structure. My
> > concern
> > > of
> > > > a local file profiler config stems from this stated goal:
> > > > > The goal would be to enable “profile seeding” which allows profiles
> >

Re: [DISCUSS] Batch Profiler Feature Branch

2018-09-20 Thread Nick Allen

> How do we establish "tm" from 1.1 above? Any concerns about overlap or
gaps after the seeding is performed?

Good point.  Right now, if the Streaming and Batch Profiler overlap the
last write wins.  And presumably the output of the Streaming and Batch
Profiler are the same, so no worries, right? :)

So it kind of works, but it is definitely not ideal for use case 3.  I
could add --begin and --end args to constrain the time frame over which the
Batch Profiler runs.  I do not have that in the feature branch.  It would
be easy enough to add though.
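
To make the overlap behavior concrete, here is a minimal sketch in Python
rather than the actual Java code.  Everything in it is hypothetical: the
function names, the dict standing in for the HBase table, and the
`begin_ms`/`end_ms` parameters, which only mimic what --begin and --end
might do since those args do not exist in the feature branch yet.

```python
def within_window(timestamp_ms, begin_ms=None, end_ms=None):
    """Return True if the timestamp is in [begin_ms, end_ms); None means unbounded."""
    if begin_ms is not None and timestamp_ms < begin_ms:
        return False
    if end_ms is not None and timestamp_ms >= end_ms:
        return False
    return True


def write_measurements(store, measurements, begin_ms=None, end_ms=None):
    """Write (profile, entity, period_start_ms, value) tuples; the last write wins."""
    for profile, entity, period_start_ms, value in measurements:
        if within_window(period_start_ms, begin_ms, end_ms):
            # Same key written twice simply overwrites the earlier value.
            store[(profile, entity, period_start_ms)] = value
    return store


# A plain dict stands in for the HBase table (assumption for illustration).
store = {}
streaming = [("hello-world", "10.0.0.1", 2000, 7)]
batch = [("hello-world", "10.0.0.1", 1000, 3),
         ("hello-world", "10.0.0.1", 2000, 7)]

write_measurements(store, streaming)
# Constrain the batch run so it stops where the Streaming Profiler started.
write_measurements(store, batch, begin_ms=0, end_ms=2000)
```

Without the window the batch run would simply rewrite the period at 2000,
which is harmless only as long as both Profilers produce the same value.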



On Thu, Sep 20, 2018 at 12:41 PM Michael Miklavcic <
michael.miklav...@gmail.com> wrote:

> Ok, makes sense. That's sort of what I was thinking as well, Nick. Pulling
> at this thread just a bit more...
>
>1. I have an existing system that's been up a while, and I have added k
>profiles - assume these are the first profiles I've created.
>   1. I would have t0 - tm (where m is the time when the profiles were
>   first installed) worth of data that has not been profiled yet.
>   2. The batch profiler process would be to take that exact profile
>   definition from ZK and run the batch loader with that from the CLI.
>   3. Profiles are now up to date from t0 - tCurrent
>2. I've already done #1 above. Time goes by and now I want to add a new
>profile.
>   1. Same first step above
>   2. I would run the batch loader with *only* that new profile
>   definition to seed?
>
> Forgive me if I missed this in PR's and discussion in the FB, but how do we
> establish "tm" from 1.1 above? Any concerns about overlap or gaps after the
> seeding is performed?
>
> On Thu, Sep 20, 2018 at 10:26 AM Nick Allen  wrote:
>
> > I think more often than not, you would want to load your profile
> definition
> > from a file.  This is why I considered the 'load from Zk' more of a
> > nice-to-have.
> >
> >- In use case 1 and 2, this would definitely be the case.  The
> profiles
> >I am working with are speculative and I am using the batch profiler to
> >determine if they are worth keeping.  In this case, my speculative
> > profiles
> >would not be in Zk (yet).
> >- In use case 3, I could see it go either way.  It might be useful to
> >load from Zk, but it certainly isn't a blocker.
> >
> >
> > > So if the config does not correctly match the profiler config held in
> ZK
> > and
> > the user runs the batch seeding job, what happens?
> >
> > You would just get a profile that is slightly different over the entire
> > time span.  This is not a new risk.  If the user changes their Profile
> > definitions in Zk, the same thing would happen.
> >
> >
> > On Thu, Sep 20, 2018 at 12:15 PM Michael Miklavcic <
> > michael.miklav...@gmail.com> wrote:
> >
> > > I think I'm torn on this, specifically because it's batch and would
> > > generally be run as-needed. Justin, can you elaborate on your concerns
> > > there? This feels functionally very similar to our flat file loaders,
> > which
> > > all have inputs for config from the CLI only. On the other hand, our
> flat
> > > file loaders are not typically seeding an existing structure. My
> concern
> > of
> > > a local file profiler config stems from this stated goal:
> > > > The goal would be to enable “profile seeding” which allows profiles
> to
> > be
> > > populated from a time before the profile was created.
> > > So if the config does not correctly match the profiler config held in
> ZK
> > > and the user runs the batch seeding job, what happens?
> > >
> > > On Thu, Sep 20, 2018 at 10:06 AM Justin Leet 
> > > wrote:
> > >
> > > > The profile not being able to read from ZK feels like a fairly
> > > substantial,
> > > > if subtle, set of potential problems.  I'd like to see that in either
> > > > before merging or at least pretty soon after merging.  Is it a lot of
> > > work
> > > > to add that functionality based on where things are right now?
> > > >
> > > > On Thu, Sep 20, 2018 at 9:59 AM Nick Allen 
> wrote:
> > > >
> > > > > Here is another limitation that I just thought. It can only read a
> > > > profile
> > > > > definition from a file.  It probably also makes sense to add an
> > option
> > > > that
> > > > > allows it to read the current Profiler configuration from
> Zookeeper.
> > > > >
> > > > >
> > > > > > Is it wo

Re: [DISCUSS] Batch Profiler Feature Branch

2018-09-20 Thread Nick Allen

I think more often than not, you would want to load your profile definition
from a file.  This is why I considered the 'load from Zk' more of a
nice-to-have.

   - In use case 1 and 2, this would definitely be the case.  The profiles
   I am working with are speculative and I am using the batch profiler to
   determine if they are worth keeping.  In this case, my speculative profiles
   would not be in Zk (yet).
   - In use case 3, I could see it go either way.  It might be useful to
   load from Zk, but it certainly isn't a blocker.


> So if the config does not correctly match the profiler config held in ZK and
the user runs the batch seeding job, what happens?

You would just get a profile that is slightly different over the entire
time span.  This is not a new risk.  If the user changes their Profile
definitions in Zk, the same thing would happen.
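
If divergence between the file and Zk is the worry, a check along these
lines could flag it before a seeding run.  This is only a sketch in Python
under stated assumptions: it does not talk to Zookeeper at all, it just
compares two JSON strings, and the helper names and the example profile are
made up for illustration.

```python
import json


def profiles_by_name(config_json):
    """Index the profiles in a Profiler config by their 'profile' name."""
    return {p["profile"]: p for p in json.loads(config_json).get("profiles", [])}


def diverged_profiles(file_json, zk_json):
    """Return sorted names of profiles that differ or exist on only one side."""
    from_file = profiles_by_name(file_json)
    from_zk = profiles_by_name(zk_json)
    names = set(from_file) | set(from_zk)
    return sorted(n for n in names if from_file.get(n) != from_zk.get(n))


# Hypothetical definitions: same profile name, different 'foreach' expression.
file_config = json.dumps({"profiles": [
    {"profile": "p1", "foreach": "ip_src_addr",
     "update": {"count": "count + 1"}, "result": "count"}]})
zk_config = json.dumps({"profiles": [
    {"profile": "p1", "foreach": "ip_dst_addr",
     "update": {"count": "count + 1"}, "result": "count"}]})

print(diverged_profiles(file_config, zk_config))  # prints ['p1']
```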


On Thu, Sep 20, 2018 at 12:15 PM Michael Miklavcic <
michael.miklav...@gmail.com> wrote:

> I think I'm torn on this, specifically because it's batch and would
> generally be run as-needed. Justin, can you elaborate on your concerns
> there? This feels functionally very similar to our flat file loaders, which
> all have inputs for config from the CLI only. On the other hand, our flat
> file loaders are not typically seeding an existing structure. My concern of
> a local file profiler config stems from this stated goal:
> > The goal would be to enable “profile seeding” which allows profiles to be
> populated from a time before the profile was created.
> So if the config does not correctly match the profiler config held in ZK
> and the user runs the batch seeding job, what happens?
>
> On Thu, Sep 20, 2018 at 10:06 AM Justin Leet 
> wrote:
>
> > The profile not being able to read from ZK feels like a fairly
> substantial,
> > if subtle, set of potential problems.  I'd like to see that in either
> > before merging or at least pretty soon after merging.  Is it a lot of
> work
> > to add that functionality based on where things are right now?
> >
> > On Thu, Sep 20, 2018 at 9:59 AM Nick Allen  wrote:
> >
> > > Here is another limitation that I just thought. It can only read a
> > profile
> > > definition from a file.  It probably also makes sense to add an option
> > that
> > > allows it to read the current Profiler configuration from Zookeeper.
> > >
> > >
> > > > Is it worth setting up a default config that pulls from the main
> > indexing
> > > output?
> > >
> > > Yes, I think that makes sense.  We want the Batch Profiler to point to
> > the
> > > right HDFS URL, no matter where/how Metron is deployed.  When Metron
> gets
> > > spun-up on a cluster, I should be able to just run the Batch Profiler
> > > without having to fuss with the input path.
> > >
> > >
> > >
> > >
> > >
> > > On Thu, Sep 20, 2018 at 9:46 AM Justin Leet 
> > wrote:
> > >
> > > > Re:
> > > >
> > > > >  * You do not configure the Batch Profiler in Ambari.  It is
> > configured
> > > > > and executed completely from the command-line.
> > > > >
> > > >
> > > > Is it worth setting up a default config that pulls from the main
> > indexing
> > > > output?  I'm a little on the fence about it, but it seems like making
> > the
> > > > most common case more or less built-in would be nice.
> > > >
> > > > Having said that, I do not consider that a requirement for merging
> the
> > > > feature branch.
> > > >
> > > > On Wed, Sep 19, 2018 at 11:23 AM James Sirota 
> > > wrote:
> > > >
> > > > > I think what you have outlined above is a good initial stab at the
> > > > > feature.  Manual install of spark is not a big deal.  Configuring
> via
> > > > > command line while we mature this feature is ok as well.  Doesn't
> > look
> > > > like
> > > > > configuration steps are too hard.  I think you should merge.
> > > > >
> > > > > James
> > > > >
> > > > > 19.09.2018, 08:15, "Nick Allen" :
> > > > > > I would like to open a discussion to get the Batch Profiler
> feature
> > > > > branch
> > > > > > merged into master as part of METRON-1699 [1] Create Batch
> > Profiler.
> > > > All
> > > > > > of the work that I had in mind for our first draft of the Batch
> > > > Profiler
> > > > > > has been completed. Please take

Re: [DISCUSS] Batch Profiler Feature Branch

2018-09-20 Thread Nick Allen

Here is another limitation that I just thought of. It can only read a profile
definition from a file.  It probably also makes sense to add an option that
allows it to read the current Profiler configuration from Zookeeper.


> Is it worth setting up a default config that pulls from the main indexing
output?

Yes, I think that makes sense.  We want the Batch Profiler to point to the
right HDFS URL, no matter where/how Metron is deployed.  When Metron gets
spun-up on a cluster, I should be able to just run the Batch Profiler
without having to fuss with the input path.





On Thu, Sep 20, 2018 at 9:46 AM Justin Leet  wrote:

> Re:
>
> >  * You do not configure the Batch Profiler in Ambari.  It is configured
> > and executed completely from the command-line.
> >
>
> Is it worth setting up a default config that pulls from the main indexing
> output?  I'm a little on the fence about it, but it seems like making the
> most common case more or less built-in would be nice.
>
> Having said that, I do not consider that a requirement for merging the
> feature branch.
>
> On Wed, Sep 19, 2018 at 11:23 AM James Sirota  wrote:
>
> > I think what you have outlined above is a good initial stab at the
> > feature.  Manual install of spark is not a big deal.  Configuring via
> > command line while we mature this feature is ok as well.  Doesn't look
> like
> > configuration steps are too hard.  I think you should merge.
> >
> > James
> >
> > 19.09.2018, 08:15, "Nick Allen" :
> > > I would like to open a discussion to get the Batch Profiler feature
> > branch
> > > merged into master as part of METRON-1699 [1] Create Batch Profiler.
> All
> > > of the work that I had in mind for our first draft of the Batch
> Profiler
> > > has been completed. Please take a look through what I have and let me
> > know
> > > if there are other features that you think are required *before* we
> > merge.
> > >
> > > Previous list discussions on this topic include [2] and [3].
> > >
> > > (Q) What can I do with the feature branch?
> > >
> > >   * With the Batch Profiler, you can backfill/seed profiles using
> > archived
> > > telemetry. This enables the following types of use cases.
> > >
> > >   1. As a Security Data Scientist, I want to understand the
> > historical
> > > behaviors and trends of a profile that I have created so that I can
> > > determine if I have created a feature set that has predictive value for
> > > model building.
> > >
> > >   2. As a Security Data Scientist, I want to understand the
> > historical
> > > behaviors and trends of a profile that I have created so that I can
> > > determine if I have defined the profile correctly and created a feature
> > set
> > > that matches reality.
> > >
> > >   3. As a Security Platform Engineer, I want to generate a profile
> > > using archived telemetry when I deploy a new model to production so
> that
> > > models depending on that profile can function on day 1.
> > >
> > >   * METRON-1699 [1] includes a more detailed description of the
> feature.
> > >
> > > (Q) What work was completed?
> > >
> > >   * The Batch Profiler runs on Spark and was implemented in Java to
> > remain
> > > consistent with our current Java-heavy code base.
> > >
> > >   * The Batch Profiler is executed from the command-line. It can be
> > > launched using a script or by calling `spark-submit`, which may be
> useful
> > > for advanced users.
> > >
> > >   * Input telemetry can be consumed from multiple sources; for example
> > HDFS
> > > or the local file system.
> > >
> > >   * Input telemetry can be consumed in multiple formats; for example
> JSON
> > > or ORC.
> > >
> > >   * The 'output' profile measurements are persisted in HBase and is
> > > consistent with the Storm Profiler.
> > >
> > >   * It can be run on any underlying engine supported by Spark. I have
> > > tested it both in 'local' mode and on a YARN cluster.
> > >
> > >   * It is installed automatically by the Metron MPack.
> > >
> > >   * A README was added that documents usage instructions.
> > >
> > >   * The existing Profiler code was refactored so that as much code as
> > > possible is shared between the 3 Profiler ports; Storm, the Stellar
> REPL,
> > > and Spark. For example, the logic which determines the timestamp of a
>

[DISCUSS] Batch Profiler Feature Branch

2018-09-19 Thread Nick Allen

I would like to open a discussion to get the Batch Profiler feature branch
merged into master as part of METRON-1699 [1] Create Batch Profiler. All
of the work that I had in mind for our first draft of the Batch Profiler
has been completed. Please take a look through what I have and let me know
if there are other features that you think are required *before* we merge.

Previous list discussions on this topic include [2] and [3].

(Q) What can I do with the feature branch?

* With the Batch Profiler, you can backfill/seed profiles using archived
telemetry. This enables the following types of use cases.

1. As a Security Data Scientist, I want to understand the historical
behaviors and trends of a profile that I have created so that I can
determine if I have created a feature set that has predictive value for
model building.

2. As a Security Data Scientist, I want to understand the historical
behaviors and trends of a profile that I have created so that I can
determine if I have defined the profile correctly and created a feature set
that matches reality.

3. As a Security Platform Engineer, I want to generate a profile
using archived telemetry when I deploy a new model to production so that
models depending on that profile can function on day 1.

* METRON-1699 [1] includes a more detailed description of the feature.

(Q) What work was completed?

* The Batch Profiler runs on Spark and was implemented in Java to remain
consistent with our current Java-heavy code base.

* The Batch Profiler is executed from the command-line. It can be
launched using a script or by calling `spark-submit`, which may be useful
for advanced users.

* Input telemetry can be consumed from multiple sources; for example HDFS
or the local file system.

* Input telemetry can be consumed in multiple formats; for example JSON
or ORC.

* The 'output' profile measurements are persisted in HBase and are
consistent with the Storm Profiler.

* It can be run on any underlying engine supported by Spark. I have
tested it both in 'local' mode and on a YARN cluster.

* It is installed automatically by the Metron MPack.

* A README was added that documents usage instructions.

* The existing Profiler code was refactored so that as much code as
possible is shared between the 3 Profiler ports; Storm, the Stellar REPL,
and Spark. For example, the logic which determines the timestamp of a
message was refactored so that it could be reused by all ports.

* metron-profiler-common: The common Profiler code shared amongst
each port.
* metron-profiler-storm: Profiler on Storm
* metron-profiler-spark: Profiler on Spark
* metron-profiler-repl: Profiler on the Stellar REPL
* metron-profiler-client: The client code for retrieving profile
data; for example PROFILE_GET.

* There are 3 separate RPM and DEB packages now created for the Profiler.

* metron-profiler-storm-*.rpm
* metron-profiler-spark-*.rpm
* metron-profiler-repl-*.rpm

* The Profiler integration tests were enhanced to leverage the Profiler
Client logic to validate the results.

* Review METRON-1699 [1] for a complete break-down of the tasks that have
been completed on the feature branch.
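
As an illustration of the kind of logic that benefits from being shared
across ports, here is a minimal sketch of timestamp selection.  It is
written in Python for brevity and is not the refactored Java code; the
function names, the `timestamp_field` parameter, and the 15 minute period
used as a default are assumptions made for the example.

```python
import time


def message_timestamp(message, timestamp_field=None, now_ms=None):
    """Pick the timestamp (epoch millis) for a telemetry message.

    If timestamp_field is set, use event time taken from that field;
    otherwise fall back to processing (wall-clock) time.
    """
    if timestamp_field is not None:
        if timestamp_field not in message:
            raise KeyError("message is missing field: " + timestamp_field)
        return int(message[timestamp_field])
    return int(time.time() * 1000) if now_ms is None else now_ms


def period_start(timestamp_ms, period_ms=15 * 60 * 1000):
    """Round a timestamp down to the start of the period that contains it."""
    return timestamp_ms - (timestamp_ms % period_ms)
```

Because a batch run replays archived telemetry, it would need event time;
wall-clock time would place every message in the period when the job ran.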

(Q) What limitations exist?

* You must manually install Spark to use the Batch Profiler. The Metron
MPack does not treat Spark as a Metron dependency and so does not install
it automatically.

* You do not configure the Batch Profiler in Ambari. It is configured
and executed completely from the command-line.

* To run the Batch Profiler in 'Full Dev', you have to take the following
manual steps. Some of these are arguably limitations with how Ambari
installs Spark 2 in the version of HDP that we run.

1. Install Spark 2 using Ambari.

2. Tell Spark how to talk with HBase.

SPARK_HOME=/usr/hdp/current/spark2-client
cp /usr/hdp/current/hbase-client/conf/hbase-site.xml $SPARK_HOME/conf/

3. Create the Spark History directory in HDFS.

export HADOOP_USER_NAME=hdfs
hdfs dfs -mkdir /spark2-history

4. Change the default input path to `hdfs://localhost:8020/...` to
match the port defined by HDP, instead of port 9000.

[1] https://issues.apache.org/jira/browse/METRON-1699
[2]
https://lists.apache.org/thread.html/da81c1227ffda3a47eb2e5bb4d0b162dd6d36006241c4ba4b659587b@%3Cdev.metron.apache.org%3E
[3]
https://lists.apache.org/thread.html/d28d18cc9358f5d9c276c7c304ff4ee601041fb47bfc97acb6825083@%3Cdev.metron.apache.org%3E

Re: [DISCUSS] PCAP data for testing and development

2018-09-19 Thread Nick Allen

I would just be worried about resource constraints on the VM.

But Simon's idea of 'do a loop and stop' might be a good solution.  We
could probably orchestrate that with Ansible tags actually.  If you pass
the tag, it will 'do a loop and stop', but by default it keeps the current
behavior.





On Wed, Sep 19, 2018 at 8:12 AM Simon Elliston Ball <
si...@simonellistonball.com> wrote:

> Isn't this what the pcap_replay role is for? We should be able to install
> that role on full-dev and get the example.pcap file we currently ship to
> replay and capture. It's not on by default in full dev because it's heavy
> for most use cases, but should make it easy to load some sample pcap data
> through the pcap topology.
>
> Maybe we should have a method that instead of doing it continuously, has a
> 'do a loop and stop' version to load this data to keep the cpu weight down
> and provide data for testing UI functionality around PCAP.
>
> Simon
>
> On Wed, 19 Sep 2018 at 12:56, Tibor Meller  wrote:
>
> > Hi all,
> >
> > I would like to start a discussion on the possible ways to provide PCAP
> > data for the full dev.
> > The full dev VM after a rebuild contains no PCAP data. Currently,
> > I'm uploading binaries manually. This makes development slower and
> testing
> > problematic as well. I think a more desired outcome would be
> > something similar to what we have in the Alert tab, which is to have some
> > pcap data available right after starting the VM.
> >
> > Do you guys think uploading pcap sample data as part of the
> > ansible provisioning step would be a good approach?
> > Or sensor stubs for pcap would be a better way?
> >
> > I would be curious about your thoughts!
> >
> > Thanks,
> > Tibor
> >
>
>
> --
> --
> simon elliston ball
> @sireb
>

Re: [RESULT][VOTE] Metron Release Candidate 0.6.0-RC1

2018-09-12 Thread Nick Allen

Woop woop!

On Wed, Sep 12, 2018 at 12:15 PM Justin Leet  wrote:

> The vote has passed.  Including my +1, the voting was:
> 3 binding +1’s
> 1 non-binding +1’s
> no 0’s
> no -1’s.
>

Re: [VOTE] Metron Release Candidate 0.6.0-RC1

2018-09-09 Thread Nick Allen

+1 Release this package as Apache Metron 0.6.0

Ran through all of the validation steps.  No problems, except one transient
integration test failure.  Not a blocker for the release.




On Fri, Sep 7, 2018 at 9:45 AM Nick Allen  wrote:

> I think the change you made was valuable clean-up.  I don't think we need
> to cancel the vote.  We can either manually validate the RC or just use the
> script from your branch to do the validation.
>
> I will be testing the RC this morning.
>
> On Fri, Sep 7, 2018 at 9:34 AM Justin Leet  wrote:
>
>> As full disclosure, there's a slight difference in the naming of the RC
>> artifacts for the plugin: the name actually includes the RC number.  This
>> breaks the verification script as-is.  I have an updated version here:
>> https://github.com/apache/metron/pull/1188/files.  If we feel it's
>> necessary to update that file for this release, I can quickly put out a PR,
>> cancel the vote, and pretty easily put out a new RC with the script.
>>
>> Justin
>>
>> On Fri, Sep 7, 2018 at 9:30 AM Justin Leet  wrote:
>>
>> > This is a call to vote on releasing Apache Metron 0.6.0 and the
>> > Bro-Kafka plugin 0.2.0
>> >
>> > Full list of changes in this release:
>> > https://dist.apache.org/repos/dist/dev/metron/0.6.0-RC1/CHANGES
>> > https://dist.apache.org/repos/dist/dev/metron/0.6.0-RC1/CHANGES.bro-plugin
>> >
>> > The tags to be voted upon are:
>> > (apache/metron) apache-metron-0.6.0-rc1
>> > (apache/metron-bro-plugin-kafka) apache-metron-bro-plugin-kafka_0.2.0-rc1
>> >
>> > The source archives being voted upon can be found here:
>> > https://dist.apache.org/repos/dist/dev/metron/0.6.0-RC1/apache-metron-0.6.0-rc1.tar.gz
>> > https://dist.apache.org/repos/dist/dev/metron/0.6.0-RC1/apache-metron-bro-plugin-kafka_0.2.0-rc1.tar.gz
>> >
>> > Other release files, signatures and digests can be found here:
>> > https://dist.apache.org/repos/dist/dev/metron/0.6.0-RC1/
>> >
>> > The release artifacts are signed with the following key:
>> > https://dist.apache.org/repos/dist/dev/metron/0.6.0-RC1/KEYS
>> >
>> > Please vote on releasing this package as Apache Metron 0.6.0-RC1 and the
>> > Bro-Kafka plugin as 0.2.0-RC1
>> >
>> > When voting, please list the actions taken to verify the release.
>> >
>> > Recommended build validation and verification instructions are posted
>> > here:
>> > https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds
>> >
>> > This vote will be open until 10pm EDT on Wednesday, September 12, 2018,
>> > to account for the weekend.
>> >
>> > [ ] +1 Release this package as Apache Metron 0.6.0-RC1
>> >
>> > [ ] 0 No opinion
>> >
>> > [ ] -1 Do not release this package because...
>> >
>>
>
