Re: [RESULT][VOTE] Metron Release Candidate 0.7.1-RC2

2019-05-14 Thread Otto Fowler
I would like the announcement to point readers to the new deployment
README for known issues with the configuration UI.

I was holding my +1 for discussion on this point, and the release notes.


On May 13, 2019 at 22:34:58, Michael Miklavcic (michael.miklav...@gmail.com)
wrote:



On Mon, May 13, 2019, 8:28 PM Justin Leet  wrote:

> The vote has passed. Including my +1, the voting was:
> 3 binding +1’s
> no 0’s
> no -1’s.
>
> I'll work on finishing out the release process tomorrow and will notify the
> dev and user lists.
>


Re: [VOTE] Metron Release Candidate 0.7.1-RC2

2019-05-13 Thread Otto Fowler
We don’t have release notes, do we? Is the release announcement our only
recourse for pointing out things to look out for?


On May 10, 2019 at 20:58:40, Michael Miklavcic (michael.miklav...@gmail.com)
wrote:

+1 binding

Validated same as Nick.

Mike

On Thu, May 9, 2019 at 5:54 PM Nick Allen  wrote:

> +1 binding
>
> I validated the release tarball, ran the full test suite and validated the
> CentOS 6 development environment. Everything looks solid. Let's ship it.
>
> On Wed, May 8, 2019 at 6:50 PM Justin Leet  wrote:
>
> > This is a call to vote on releasing Apache Metron 0.7.1
> >
> > Full list of changes in this release:
> > https://dist.apache.org/repos/dist/dev/metron/0.7.1-RC2/CHANGES
> > The tag to be voted upon is:
> > apache-metron_0.7.1-rc2
> >
> > The source archives being voted upon can be found here:
> > https://dist.apache.org/repos/dist/dev/metron/0.7.1-RC2/apache-metron_0.7.1-rc2.tar.gz
> >
> > Other release files, signatures and digests can be found here:
> > https://dist.apache.org/repos/dist/dev/metron/0.7.1-RC2/
> >
> > The release artifacts are signed with the following key:
> > https://dist.apache.org/repos/dist/release/metron/KEYS
> > Please vote on releasing this package as Apache Metron 0.7.1-RC2
> >
> > When voting, please list the actions taken to verify the release.
> >
> > Recommended build validation and verification instructions are posted
> > here:
> > https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds
> >
> > This vote will be open until 7pm EDT on Monday, May 13, 2019, to account
> > for the weekend.
> >
> > [ ] +1 Release this package as Apache Metron 0.7.1-RC2
> >
> > [ ] 0 No opinion
> >
> > [ ] -1 Do not release this package because...
> >
>


Re: [DISCUSS] JsonMapParser original string functionality

2019-05-10 Thread Otto Fowler
The original string would be the string specified as the message body, thus
each message in the chain produced would just be the bytes passed in, from
a specific field in the incoming message.



On May 10, 2019 at 19:55:28, Simon Elliston Ball (
si...@simonellistonball.com) wrote:

My understanding is that chaining preserves (correctly to my mind) the
original original string.

In other words: unless the message strategy is raw message, the original
string is just passed through. Original string therefore comes from outside
Metron, and is preserved throughout Metron processes, allowing for
recreation of original form for forensics and evidentiary purposes.

Simon
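
To make the pass-through behaviour concrete, here is a minimal sketch in Python (Metron itself is Java, and none of these names come from the Metron code base). It assumes a chain is just a list of functions, that each stage is handed a "body" field from the previous stage, and that the field is called original_string; the two stages are hypothetical.

```python
def run_chain(raw_string, stages):
    """Run stages in order, carrying the first raw input forward unchanged."""
    message = {"original_string": raw_string, "body": raw_string}
    for stage in stages:
        parsed = stage(message["body"])
        # Merge the new fields, but never let a stage overwrite original_string.
        message = {**message, **parsed, "original_string": raw_string}
    return message


def split_host(body):
    """Hypothetical first stage: peel a leading host name off the payload."""
    host, rest = body.split(" ", 1)
    return {"host": host, "body": rest}


def measure(body):
    """Hypothetical second stage: record the length of what it was handed."""
    return {"length": len(body)}


result = run_chain("host1 payload here", [split_host, measure])
```

However many stages run, result["original_string"] is still the string that entered the chain, while "body" reflects only what the last stage was handed.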

> On 11 May 2019, at 00:10, Otto Fowler  wrote:
>
> What about parser chaining? Should the original string be from kafka, or
> the last parsed?
>
>
> On May 10, 2019 at 19:03:39, Simon Elliston Ball (
> si...@simonellistonball.com) wrote:
>
> The only scenario I can think of where a parser might treat original string
> differently, or even need to know about it, would be different encoding
> locales. For example, if the string were encoded in a locale specific
> to the device and the encoding had to be chosen based on metadata or parsed
> content, then that could merit pushing it down. The other edge might be when
> you have binary data that does not go down to an original string well (e.g. a
> netflow parser).
>
> That said, that’s a highly unlikely edge case that could be handled by
> workarounds.
>
> I’m definitely +1 on Nick’s idea of pulling original string up to the
> runner. Right now we’re pretty inconsistent in how it’s done, so that would
> help.
>
> Simon
>
> Sent from my iPhone
>
> On 10 May 2019, at 23:10, Nick Allen  wrote:
>
>>> I suppose we could always allow this to be overridden, also.
>>
>> I like an on/off switch for the "original string" functionality. If on,
>> you get the original string in pristine condition. If off, no original
>> string is appended for those who care more about storage space.
>>
>> I can't think of a reason where one kind of parser would have a different
>> original string mechanism than the others. If something like that does
>> come up, the parser can create its own original string by just naming it
>> something different and then turning "off" the switch that you described.
>>
>>
>>
>> On Fri, May 10, 2019 at 5:53 PM Michael Miklavcic <
>> michael.miklav...@gmail.com> wrote:
>>
>>> I think that's an excellent idea. Can anyone think of a situation where we
>>> wouldn't want to add this the same way for all parsers? I suppose we could
>>> always allow this to be overridden, also.
>>>
>>>> On Fri, May 10, 2019 at 3:43 PM Nick Allen  wrote:
>>>>
>>>> I think maintaining the integrity of the original data makes a lot of sense
>>>> for any parser. And ideally the original string should be what came out of
>>>> Kafka with only the minimally necessary processing.
>>>>
>>>> With that in mind, we could solve this one level up. Instead of relying on
>>>> each parser to do this right, we could have the ParserRunner and
>>>> specifically the ParserRunnerImpl [1] handle this round-abouts here
>>>> <https://github.com/apache/metron/blob/1b6ef88c79d60022542cda7e9abbea7e720773cc/metron-platform/metron-parsing/metron-parsers-common/src/main/java/org/apache/metron/parsers/ParserRunnerImpl.java#L149-L158>
>>>> [1].
>>>> It has the raw message data and can append the original string to each
>>>> message it gets back from the parsers.
>>>>
>>>> Just another approach to consider.
>>>>
>>>> --
>>>> [1]
>>>> https://github.com/apache/metron/blob/1b6ef88c79d60022542cda7e9abbea7e720773cc/metron-platform/metron-parsing/metron-parsers-common/src/main/java/org/apache/metron/parsers/ParserRunnerImpl.java#L149-L158
>>>>
>>>> On Fri, May 10, 2019 at 4:11 PM Otto Fowler 
>>>> wrote:
>>>>
>>>>> +1
>>>>>
>>>>>
>>>>> On May 10, 2019 at 13:57:55, Michael Miklavcic (
>>>>> michael.miklav...@gmail.com)
>>>>> wrote:
>>>>>
>>>>> When adding the capability for parsing messages in the JsonMapParser using
>>>>> JSON Path expressions the original behavior for managing original strings
>>>>> was changed.
>

Re: [DISCUSS] JsonMapParser original string functionality

2019-05-10 Thread Otto Fowler
What about parser chaining?   Should the original string be from kafka, or
the last parsed?


On May 10, 2019 at 19:03:39, Simon Elliston Ball (
si...@simonellistonball.com) wrote:

The only scenario I can think of where a parser might treat original string
differently, or even need to know about it, would be different encoding
locales. For example, if the string were encoded in a locale specific
to the device and the encoding had to be chosen based on metadata or parsed
content, then that could merit pushing it down. The other edge might be when
you have binary data that does not go down to an original string well (e.g. a
netflow parser).

That said, that’s a highly unlikely edge case that could be handled by
workarounds.

I’m definitely +1 on Nick’s idea of pulling original string up to the
runner. Right now we’re pretty inconsistent in how it’s done, so that would
help.

Simon
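
The runner-level idea, together with the on/off switch discussed in the quoted messages below, can be sketched roughly as follows. This is Python rather than Metron's Java and is not the ParserRunnerImpl code: run_parser, keep_original and split_parser are hypothetical names, the field name original_string is assumed, and decoding as UTF-8 is an assumption that sidesteps the encoding edge case raised above.

```python
import json

ORIGINAL_STRING = "original_string"  # field name assumed for illustration


def run_parser(raw_bytes, parser, keep_original=True):
    """Call a parser, then attach the untouched input to every message it returns."""
    messages = parser(raw_bytes)
    if keep_original:
        # Assumption: the raw bytes are UTF-8 text.
        original = raw_bytes.decode("utf-8")
        for message in messages:
            message[ORIGINAL_STRING] = original
    return messages


def split_parser(raw_bytes):
    """Hypothetical parser: emits one message per element of a JSON array."""
    return [dict(item) for item in json.loads(raw_bytes)]


raw = b'[{"a": 1}, {"a": 2}]'
with_original = run_parser(raw, split_parser)
without_original = run_parser(raw, split_parser, keep_original=False)
```

Because the runner does the attaching, every parser gets the same behaviour for free, and a parser that needs something different can write its own field under another name with the switch turned off.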

Sent from my iPhone

On 10 May 2019, at 23:10, Nick Allen  wrote:

>> I suppose we could always allow this to be overridden, also.
>
> I like an on/off switch for the "original string" functionality. If on,
> you get the original string in pristine condition. If off, no original
> string is appended for those who care more about storage space.
>
> I can't think of a reason where one kind of parser would have a different
> original string mechanism than the others. If something like that does
> come up, the parser can create its own original string by just naming it
> something different and then turning "off" the switch that you described.
>
>
>
> On Fri, May 10, 2019 at 5:53 PM Michael Miklavcic <
> michael.miklav...@gmail.com> wrote:
>
>> I think that's an excellent idea. Can anyone think of a situation where we
>> wouldn't want to add this the same way for all parsers? I suppose we could
>> always allow this to be overridden, also.
>>
>>> On Fri, May 10, 2019 at 3:43 PM Nick Allen  wrote:
>>>
>>> I think maintaining the integrity of the original data makes a lot of sense
>>> for any parser. And ideally the original string should be what came out of
>>> Kafka with only the minimally necessary processing.
>>>
>>> With that in mind, we could solve this one level up. Instead of relying on
>>> each parser to do this right, we could have the ParserRunner and
>>> specifically the ParserRunnerImpl [1] handle this round-abouts here
>>> <https://github.com/apache/metron/blob/1b6ef88c79d60022542cda7e9abbea7e720773cc/metron-platform/metron-parsing/metron-parsers-common/src/main/java/org/apache/metron/parsers/ParserRunnerImpl.java#L149-L158>
>>> [1].
>>> It has the raw message data and can append the original string to each
>>> message it gets back from the parsers.
>>>
>>> Just another approach to consider.
>>>
>>> --
>>> [1]
>>> https://github.com/apache/metron/blob/1b6ef88c79d60022542cda7e9abbea7e720773cc/metron-platform/metron-parsing/metron-parsers-common/src/main/java/org/apache/metron/parsers/ParserRunnerImpl.java#L149-L158
>>>
>>> On Fri, May 10, 2019 at 4:11 PM Otto Fowler 
>>> wrote:
>>>
>>>> +1
>>>>
>>>>
>>>> On May 10, 2019 at 13:57:55, Michael Miklavcic (
>>>> michael.miklav...@gmail.com)
>>>> wrote:
>>>>
>>>> When adding the capability for parsing messages in the JsonMapParser using
>>>> JSON Path expressions the original behavior for managing original strings
>>>> was changed.
>>>>
>>>> https://github.com/apache/metron/blob/master/metron-platform/metron-parsing/metron-parsers-common/src/main/java/org/apache/metron/parsers/json/JSONMapParser.java#L192
>>>>
>>>> A couple of issues have been reported recently regarding this change:
>>>>
>>>> 1. We're losing the actual original string, which is a legal issue for
>>>> data lineage for some customers
>>>> 2. Even for the degenerate case with no sub-messages created, the
>>>> original sub-message string is modified because of the
>>>> serialization/deserialization process with Jackson/JsonSimple. The fields
>>>> are reordered because the content is normalized.
>>>>
>>>> I looked at options for preserving formatting, but am unable to find a
>>>> method that allows you to both parse, then query the original message and
>>>> then also obtain the raw string matches without the normalizing from
>>>> ser/deserialization.
>>>>
>>>> I'd like to propose that we add a configuration option for this parser that
>>>> allows the user to toggle which approach they'd like to use. My personal
>>>> preference based on feedback I've gotten from multiple customers is that
>>>> the default should be the older approach which takes the raw original
>>>> string. It's arguable that this change in contract is a regression, so the
>>>> default should be the earlier behavior. Any sub-messages would then have a
>>>> copy of that raw original string, not just the sub-message original string.
>>>> Enabling the flag would enable the current sub-message original string
>>>> functionality.
>>>>
>>>> Mike
>>>>
>>>
>>


Re: [DISCUSS] JsonMapParser original string functionality

2019-05-10 Thread Otto Fowler
+1


On May 10, 2019 at 13:57:55, Michael Miklavcic (michael.miklav...@gmail.com)
wrote:

When adding the capability for parsing messages in the JsonMapParser using
JSON Path expressions the original behavior for managing original strings
was changed.

https://github.com/apache/metron/blob/master/metron-platform/metron-parsing/metron-parsers-common/src/main/java/org/apache/metron/parsers/json/JSONMapParser.java#L192

A couple of issues have been reported recently regarding this change:

1. We're losing the actual original string, which is a legal issue for
data lineage for some customers
2. Even for the degenerate case with no sub-messages created, the
original sub-message string is modified because of the
serialization/deserialization process with Jackson/JsonSimple. The fields
are reordered because the content is normalized.

I looked at options for preserving formatting, but am unable to find a
method that allows you to both parse, then query the original message and
then also obtain the raw string matches without the normalizing from
ser/deserialization.

I'd like to propose that we add a configuration option for this parser that
allows the user to toggle which approach they'd like to use. My personal
preference based on feedback I've gotten from multiple customers is that
the default should be the older approach which takes the raw original
string. It's arguable that this change in contract is a regression, so the
default should be the earlier behavior. Any sub-messages would then have a
copy of that raw original string, not just the sub-message original string.
Enabling the flag would enable the current sub-message original string
functionality.

Mike
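
The second issue and the proposed toggle can be illustrated with a small sketch. It uses Python's standard json module as a stand-in for Jackson/JsonSimple (an assumption; their exact normalization differs), with sort_keys=True standing in for the field reordering. The sample message, the original_string function and the use_raw flag are all hypothetical.

```python
import json

# Hypothetical raw message as it might arrive from Kafka; the spacing and
# key order are deliberate.
raw = '{"b": 1,   "a": {"y": 2, "x": 3}}'

# Parse, then serialize again, as a parser that rebuilds the string would.
parsed = json.loads(raw)
round_tripped = json.dumps(parsed, sort_keys=True)


def original_string(raw_message, rebuilt_message, use_raw=True):
    """Pick the original string according to a hypothetical on/off flag."""
    return raw_message if use_raw else rebuilt_message
```

The round-tripped string carries the same data but is no longer byte-for-byte what arrived, which is why the raw input has to be captured before parsing if lineage matters. With use_raw=True (the proposed default) the untouched input is kept; setting it to False selects the rebuilt string.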


Re: [DISCUSS] Parser Aggregation in Management UI

2019-05-08 Thread Otto Fowler
You need to be a committer; that is all, I think.
I would not use the GitHub UI for it, though. I do it through the CLI.



On May 8, 2019 at 09:45:24, Michael Miklavcic (michael.miklav...@gmail.com)
wrote:

Not that I'm aware of. Nick and Otto, you've created them before, did you
need any special perms?

On Wed, May 8, 2019 at 3:57 AM Shane Ardell 
wrote:

> This morning, we started to break down our work as Michael suggested in
> this thread. However, it looks like I don't have permission to create a new
> branch in the GitHub UI or push a new branch to the apache/metron repo. Is
> this action restricted to PMC members only?
>
> Shane
>
> On Wed, May 8, 2019 at 9:06 AM Tamás Fodor  wrote:
>
> > Here's the process we've gone through in order to implement the feature.
> >
> > At the beginning we had some bootstrap work like creating a mock API
> > (written in NodeJS) because we were a few steps ahead of the backend part.
> > But this is in a totally different repository so it doesn't really count. We
> > also had to wire in NgRx, our chosen third-party library that supports the
> > Flux flow, to get better state management. When we were ready to kick off
> > implementing the business logic, we split up the work by subfeatures like
> > drag and dropping table rows. At this point, we created a POC without NgRx
> > just to let you have the feeling of how it works in real life. Later on,
> > after introducing NgRx, we obviously had to refactor it a little bit to make
> > it compatible with NgRx. There were other subfeatures like creating and
> > editing groups in a floating pane on the right side of the window.
> > When the real backend API was ready we made the necessary changes and
> > tested whether it worked as expected. There were a few differences
> > between how we originally planned the API and the current implementation, so
> > we had to adapt it accordingly. While we were implementing the features, we
> > wrote the unit tests simultaneously. The latest task on the feature was
> > restricting the user from aggregating parsers together.
> >
> > As a first iteration, we've decided to put the restriction in because it
> > requires a bigger effort on the backend to deal with that. In my opinion,
> > we should get rid of the restriction because it's not intuitive and very
> > inconvenient. Instead, we should let users aggregate the running parsers
> > together and handle this edge case on the backend accordingly.
> >
> > What do you think, guys?
> >
> > Hope this helps.
> >
> > Tamas
> >
> > On Tue, May 7, 2019 at 4:34 PM Michael Miklavcic <
> > michael.miklav...@gmail.com> wrote:
> >
> > > This was my expectation as well.
> > >
> > > Shane, Tibor, Tamas - how did you go about breaking this down into chunks
> > > and/or microchunks when you collaborated offline? As Nick mentioned, you
> > > obviously split up work and shared it amongst yourselves. Some explanation
> > > around this process would be helpful for reviewers as well. We might be
> > > able to provide better guidance and examples to future contributors as
> > > well.
> > >
> > > I talked a little bit with Shane about this offline last week. It looks
> > > like you guys effectively ran a local feature branch. Replicating that
> > > process in a feature branch in Apache is probably what you guys should
> > > be doing for a change this size. We don't have hard limits on line change
> > > size, but in the past it's been somewhere around 2k-3k lines and above
> > > being the tipping point for discussing a feature branch. Strictly speaking,
> > > line quantity alone is not the only metric, but it's relevant here. If you
> > > want to make smaller incremental changes locally, there's nothing to keep
> > > you from doing that - I would only advise that you consider squashing those
> > > commits (just ask if you're unclear about how to handle that) into a single
> > > larger commit/chunk when you're ready to publish them as a chunk to the
> > > public feature branch. So it would look something like this:
> > >
> > > Commits by person locally
> > > Shane: 1,2,3 -> squash as A
> > > Tibor: 4,5,6 -> squash as B
> > > Tamas: 7,8,9 -> squash as C
> > >
> > > Commits by person in Apache
> > > Shane: A
> > > Tibor: B
> > > Tamas: C
> > >
> > > We need to maintain a record of attribution. Your real workflow may not be
> > > that cleanly delineated, but you can choose how you want to squash in those
> > > cases. Even in public collaboration, there are plenty of cases where folks
> > > submit PRs against PRs, abstain from accepting attribution, and it all gets
> > > squashed down into one person's final PR commit. There are many options.
> > >
> > > Hope this helps.
> > >
> > > Best,
> > > Mike
> > >
> > > On Mon, May 6, 2019 at 8:19 AM Nick Allen  wrote:
> > >
> > > > > Have you considered creating a feature branch for the effort? This would
> > > > > allow you to break the effort

Re: [DISCUSS] Metron Release - 0.7.1 next steps

2019-05-03 Thread Otto Fowler
Despite the name, we *have* been using it as both for quite some time.
It *is* both dev and demo, and we recommend it as such on the list
all the time.

So there isn’t a decision to be made here as far as the status quo -> we
use full dev as both dev and demo.




On May 2, 2019 at 18:53:37, Michael Miklavcic (michael.miklav...@gmail.com)
wrote:

Whether or not full dev is, first and foremost, "dev", I think your
questions bring up a good point. If not full_dev for introducing new users,
then what? If we want to provide a different env for letting people tinker
and try it out than we do for development, that's completely fine. But we
don't have that right now. So we can treat full_dev as multipurpose, or we
can stop directing non-devs to it, or we can add something new. I honestly
don't have any recommendations here. We've talked about docker instances
for replacing in-memory components, but I'm still not sure that solves this
problem, or adds more complexity. Given the current options on the table,
I'm inclined to go with "full_dev" serves both dev and demo purposes. Otto,
what do you think?

On Thu, May 2, 2019, 4:32 PM Otto Fowler  wrote:

> I’ve commented on the PR, and I won’t repeat it here as well. I will,
> however, ask again if we know and can list all of the usability issues that
> surround this problem, i.e. all the things that can happen or may happen
> for people who are not Metron developers and committers who are using
> full dev, because we keep recommending it.
>
>
>
> On May 2, 2019 at 17:38:30, Michael Miklavcic (michael.miklav...@gmail.com)
> wrote:
>
> PR is up. I added the doc change to the metron-deployment README since this
> serves as the gateway doc for all the VM instances, all of which would be
> affected by the feature gap.
>
> https://github.com/apache/metron/pull/1398
>
> On Thu, May 2, 2019 at 1:37 PM Michael Miklavcic <
> michael.miklav...@gmail.com> wrote:
>
> > Here's the ticket I created to track it, which also references the Jira
> > for the new UI feature.
> > https://issues.apache.org/jira/browse/METRON-2100
> >
> > On Thu, May 2, 2019 at 12:34 PM Michael Miklavcic <
> > michael.miklav...@gmail.com> wrote:
> >
> >> :-)
> >>
> >> I expect to have #2 out sometime today.
> >>
> >> On Thu, May 2, 2019, 12:11 PM Justin Leet 
> wrote:
> >>
> >>> >
> >>> > I personally
> >>> > don't like this feature gap in full dev. It seems Otto agrees, and Casey at
> >>> > the very least sees it as enough of an issue to gate us from 0.8.
> >>> >
> >>>
> >>> +1 on all of this. I don't like it either.
> >>>
> >>>
> >>> > Our vote landed 2-2. We are having a discussion about what to do with the
> >>> > release. This is that discussion.
> >>>
> >>>
> >>> I'm going to be honest, my response was a combination of misreading what
> >>> you said and thinking you were proposing delaying the release more
> >>> seriously and feeling a bit blindsided by a perceived move from the initial
> >>> "take more time than originally anticipated" (which in my head I took as a
> >>> couple days) to "versus next week, or the week after" (where delaying
> >>> things weeks is something I personally would like not buried so far down in
> >>> the thread). Totally my bad, sorry about that.
> >>>
> >>> Other than that, it sounds like we're pretty much in agreement.
> >>>
> >>> Here's my current understanding of the state and consensus as of right now
> >>> (which is subject to change as more discussion happens):
> >>>
> >>> - Most of the people in the thread are in favor of #2 for 0.7.1 and #3
> >>> for 0.8.0.
> >>> - I don't believe I've seen an explicit response from Otto on what he
> >>> thinks about doing this, and from a personal perspective I'd like to
> >>> see what his opinion is as the person who originally brought it up.
> >>> - Mike said he's going to kick out a PR that addresses #2
> >>> - After that undergoes the normal review process and is merged, we
> >>> proceed normally and cut RC2.
> >>>
> >>>
> >>> On Thu, May 2, 2019 at 1:14 PM Michael Miklavcic <
> >>> michael.miklav...@gmail.com> wrote:
> >>>
> >>> > I think your later p

Re: [VOTE] Update dev guidelines with format for sharing architecture source files and rendered images

2019-05-03 Thread Otto Fowler
+1


On May 2, 2019 at 21:18:21, Michael Miklavcic (michael.miklav...@gmail.com)
wrote:

Here's the latest discussion on the subject:
https://lists.apache.org/thread.html/0aa2b0b9ed4a0f0b0d8bb018c618e62de196565f9af71f347e504076@%3Cdev.metron.apache.org%3E

I'd like to propose a vote to change our dev guidelines which will clarify
the tooling we use to produce diagrams and share the source files for those
diagrams. I propose the dev guidelines
https://cwiki.apache.org/confluence/display/METRON/Development+Guidelines
and PR checklist
https://github.com/apache/metron/blob/master/.github/PULL_REQUEST_TEMPLATE.md#for-documentation-related-changes
be changed in the following ways:

1. Under "1.1 Contributing A Code Change"
   1. Change <<"New features and significant bug fixes should be
      documented in the JIRA and appropriate architecture diagrams should be
      attached. Major features may require a vote.">> to <<"New features
      and significant bug fixes should be documented in the JIRA. Appropriate
      architecture diagrams should be created in https://www.draw.io/
      and committed to source control as per section 2.4. Diagrams may be
      requested of PR submitters during review either as documentation or as
      an aid to the reviewer. Major features may also require a vote.">>
2. Under "2.4 Documentation"
   1. New line item <<"Diagrams - We save architecture diagram source
      files in an xml format rendered by draw.io (instructions below). This
      is the free tool of choice that we've agreed to use for exchanging
      diagrams and their source files in Metron.">>
   2. New line item >. This section
      would provide basic instructions for downloading source files from
      draw.io.
3. Add a new checkbox item under PR checklist heading "For documentation
   related changes" with the following text
   1. Have you ensured that any documentation diagrams have been
      updated, along with their source files, using draw.io? See
      https://cwiki.apache.org/confluence/display/METRON/Development+Guidelines
      for instructions.
4. Here is the Jira for migrating/redoing existing diagrams
   1. https://issues.apache.org/jira/browse/METRON-2099


We require a minimum of 72 hours for a vote, not typically including
weekend days. I'd like to leave this vote open until Wednesday 5/8, 12PM
EDT. Please vote +1, -1, or 0 to abstain, and also indicate if your vote is
binding or non-binding.


Re: [DISCUSS] Parser Aggregation in Management UI

2019-05-02 Thread Otto Fowler
I have commented on the Jira.




On May 2, 2019 at 14:22:41, Shane Ardell (shane.m.ard...@gmail.com) wrote:

Hello everyone,

In response to discussions in the 0.7.1 release thread, I wanted to start a
thread regarding the parser aggregation work for the Management UI. For
anyone who has not already read and tested the PR locally, I've added a
detailed description of what we did and why to the JIRA ticket here:
https://issues.apache.org/jira/browse/METRON-1856

I'm wondering what the community thinks about what we've built thus far. Do
you see anything missing that must be part of this new feature in the UI?
Are there any strong objections to how we implemented it?

I’m also looking to see if anyone has any thoughts on how we can possibly
simplify this PR. Right now it's pretty big, and there are a lot of commits
to parse through, but I'm not sure how we could break this work out into
separate, smaller PRs opened against master. We could try to cherry-pick
the commits into smaller PRs and then merge them into a feature branch, but
I'm not sure if that's worth the effort since that will only reduce the
number of commits to review, not the lines changed.

As an aside, I also want to give a little background on the introduction
of NgRx in this PR. For the reasons we chose to do this,
you can refer to the discussion thread here:
https://lists.apache.org/thread.html/06a59ea42e8d9a9dea5f90aab4011e44434555f8b7f3cf21297c7c87@%3Cdev.metron.apache.org%3E

We previously discussed introducing a better way to manage application
state in both UIs in that thread. It was decided that NgRx was a great tool
for many reasons, one of them being that we can piecemeal it into the
application rather than doing a huge rewrite of all the application state
at once. The contributors in this PR (myself included) decided this would
be a perfect opportunity to introduce NgRx into the Management UI since we
need to manage the previous and current state with the grouping feature so
that users can undo the changes they've made (we used it for more than just
that in this feature, but that was the initial reasoning). In addition, we
greatly benefited from this when it came time to debug our work in the UI
(the discussion in the above thread link goes a little more into the
advantages of debugging with NgRx and DevTools). Removing NgRx from this
work would reduce the number of lines changed slightly, but it would still
be a big PR and a lot of that code would just move to the component or
service level in the Angular application.

Shane


Re: [DISCUSS] Metron Release - 0.7.1 next steps

2019-05-02 Thread Otto Fowler
to do with
>>> the
>>> > release. This is that discussion.
>>> >
>>> > On Thu, May 2, 2019, 10:52 AM Justin Leet 
>>> wrote:
>>> >
>>> > > @Mike
>>> > > I have a different question: Why is this enough to consider delaying a
>>> > > release in the first place for a fairly involved fix?
>>> > >
>>> > > There was a discuss thread, where the general agreement was that we had
>>> > > enough value to do a release (Over a month ago. And more things have gone
>>> > > into master since then). There's a good number of fixes, and not just
>>> > > trivial ones either. The general consensus here seems to be that the
>>> > > management UI issue is fairly minor for a point release (after all, there's
>>> > > been multiple people who think option 2 is sufficient), but becomes
>>> > > important if we want to release a minor version. The question I asked
>>> > > myself about this was "Does this issue detract enough value that a release
>>> > > isn't worthwhile?" and my answer was, and still is, "No, we have enough
>>> > > value to do a meaningful release".
>>> > >
>>> > > I'm fine with delaying or cancelling a release because we find issues that
>>> > > are severe enough or we don't think there's enough value anymore, but to be
>>> > > entirely honest, I'm absolutely shocked this issue has blown up so much.
>>> > > However, if you want to have a discuss thread to reevaluate if it's
>>> > > worthwhile to do a release, go for it. The community's calculus on the
>>> > > "Does this issue detract enough value that a release isn't worthwhile?" may
>>> > > be different than mine.
>>> > >
>>> > > Having said all that, to a large extent, I think you're right. It really
>>> > > doesn't matter *that much* if we release next week or the week after or
>>> > > whenever. But at the same time I personally get super frustrated when I go
>>> > > to use a project, find a bug, it's already known and fixed, but it just
>>> > > never puts out a released version. Every cutoff is largely arbitrary, but
>>> > > I think getting our improvements and fixes out there is important. One of
>>> > > the things we've done fairly well is put out releases at a fairly decent
>>> > > cadence for a project this large. I really don't want to set the precedent
>>> > > of just increasingly pushing out point releases for stuff like this.
>>> > >
>>> > > On Thu, May 2, 2019 at 12:52 PM Nick Allen 
>>> wrote:
>>> > >
>>> > > > I think any open source project needs to strive to cut releases regularly.
>>> > > > This is healthy for the project and community. It gets new features and
>>> > > > functionality out to the community so we can get feedback, find what is
>>> > > > working and what is not, iterate and improve. You probably agree with
>>> > > > this.
>>> > > >
>>> > > > While releasing this week or next may not matter in the grand scheme, if we
>>> > > > want to cut releases regularly, then we need to bear down and just do it.
>>> > > > Case in point, I opened the initial discussion for this release on March
>>> > > > 13th [1] and it is now May 2nd and we have yet to release 7 weeks later.
>>> > > >
>>> > > > --
>>> > > > [1]
>>> > > > https://lists.apache.org/thread.html/4f58649139f0aa6276f96febe1d0ecf9e6b3fb5b2b088cba1e3c4d81@%3Cdev.metron.apache.org%3E
>>> > > >
>>> > > >
>>> > > > On Thu, May 2, 2019 at 11:51 AM Michael Miklavcic <
>>> > > > michael.miklav...@gmail.com> wrote:
>

Re: [DISCUSS] Dev guideline changes for architecture diagrams

2019-05-02 Thread Otto Fowler
+1, great job Mike


On May 2, 2019 at 18:06:28, Michael Miklavcic (michael.miklav...@gmail.com)
wrote:

I thought it might be useful as a more general purpose bucket, and it's how
we refer to the image files in the site-book generation. Also, there may be
things that might not qualify as a diagram, but have some value. A few
READMEs use screenshots. Here's an example:

1. https://github.com/apache/metron/blob/master/metron-deployment/Kerberos-ambari-setup.md
2. https://github.com/apache/metron/tree/master/metron-deployment/readme-images


What do you think?

On Thu, May 2, 2019 at 5:19 AM Otto Fowler  wrote:

> I’m all set with this, but for one question, why image and image-source as
> opposed to diagram and diagram source? Isn’t it more descriptive? Is
> there another reason ( like for doc gen )?
>
>
>
> On May 1, 2019 at 20:12:47, Michael Miklavcic (michael.miklav...@gmail.com)
> wrote:
>
> Picking up where things left off in the VOTE thread on the subject,
> I'm presenting a revision to my original proposal below. I'd like to get
> this signed off on before submitting it for another vote. Otto, picking up
> where we left off, let me know if this looks good to you.
>
> I'd like to propose a vote to change our dev guidelines which will clarify
> the tooling we use to produce diagrams and share the source files for those
> diagrams. The original discuss thread is noted at the end of this email. I
> propose the dev guidelines
> https://cwiki.apache.org/confluence/display/METRON/Development+Guidelines
> and PR checklist
> https://github.com/apache/metron/blob/master/.github/PULL_REQUEST_TEMPLATE.md#for-documentation-related-changes
> be changed in the following ways:
>
> 1. Under "1.1 Contributing A Code Change"
>    1. Change <<"New features and significant bug fixes should be
>       documented in the JIRA and appropriate architecture diagrams should be
>       attached. Major features may require a vote.">> to <<"New features
>       and significant bug fixes should be documented in the JIRA. Appropriate
>       architecture diagrams should be created in https://www.draw.io/
>       and committed to source control as per section 2.4. Diagrams may be
>       requested of PR submitters during review either as documentation or as
>       an aid to the reviewer. Major features may also require a vote.">>
> 2. Under "2.4 Documentation"
>    1. New line item <<"Diagrams - We save architecture diagram source
>       files in an xml format rendered by draw.io (instructions below). This
>       is the free tool of choice that we've agreed to use for exchanging
>       diagrams and their source files in Metron.">>
>    2. New line item < "/images-source" and rendered diagrams and images belong in
>       "/images."
>    3. New subsection <<"Creating and Modifying Diagrams">>. This section
>       would provide basic instructions for downloading source files from
>       draw.io.
> 3. Add a new checkbox item under PR checklist heading "For documentation
>    related changes" with the following text
>    1. Have you ensured that any documentation diagrams have been
>       updated, along with their source files, using draw.io? See
>       https://cwiki.apache.org/confluence/display/METRON/Development+Guidelines
>       for instructions.
> 4. Here is the Jira for migrating/redoing existing diagrams
>    1. https://issues.apache.org/jira/browse/METRON-2099
>
>
>
> - Original DISCUSS thread -
> https://lists.apache.org/thread.html/3ae02f1e32044b1a7648899700d44611aefdab6caa09fb3196292425@%3Cdev.metron.apache.org%3E
> - Original VOTE thread -
> https://lists.apache.org/thread.html/c41dca65a46354c161d58caa7c79cfac6758f437b6638d12bd5c5622@%3Cdev.metron.apache.org%3E
>


Re: [DISCUSS] Metron Release - 0.7.1 next steps

2019-05-02 Thread Otto Fowler
I remember this now, but I’m not sure how I would have related this to a
parser aggregation pr honestly.


On May 2, 2019 at 07:54:13, Shane Ardell (shane.m.ard...@gmail.com) wrote:

Here's a link to the ngrx discussion thread from a few months back:
https://lists.apache.org/thread.html/06a59ea42e8d9a9dea5f90aab4011e44434555f8b7f3cf21297c7c87@%3Cdev.metron.apache.org%3E

On Thu, May 2, 2019 at 1:17 PM Otto Fowler  wrote:

> If you can find a link in the archives for that thread, it would really
> help.
>
> I don’t think sending them up as one sensor would work… as something
> quick. I think it is an interesting idea from a higher level that would
> need some more thought though (i.e., what if every sensor in the UI was a
> sensor group, and the existing entries were just groups of 1).
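
To make the "every sensor is a sensor group" idea concrete, here is a minimal Python sketch. It is purely illustrative and does not describe how the Metron management UI or REST layer actually works: `SensorGroup`, `sensor_statuses`, the topology name `aggregated_parsers`, and the standalone `squid` entry are hypothetical names introduced for the example. It assumes one topology per group, so a sensor's status is derived from the topology of the group it belongs to.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Set, Tuple


@dataclass(frozen=True)
class SensorGroup:
    """Hypothetical model: a set of parsers that run together in one topology."""
    topology: str             # assumed name the group's topology runs under
    sensors: Tuple[str, ...]  # a standalone sensor is simply a group of one


def sensor_statuses(groups: Iterable[SensorGroup],
                    running_topologies: Set[str]) -> Dict[str, str]:
    """Derive each sensor's status from the topology of its group."""
    statuses = {}
    for group in groups:
        state = "RUNNING" if group.topology in running_topologies else "STOPPED"
        for sensor in group.sensors:
            statuses[sensor] = state
    return statuses


groups = [
    SensorGroup("aggregated_parsers", ("bro", "snort", "yaf")),  # aggregated
    SensorGroup("squid", ("squid",)),                            # group of one
]
running = {"aggregated_parsers"}

# Looking a sensor up by its own name misses the aggregated topology...
print("bro" in running)
# ...while resolving through its group reports it as running.
print(sensor_statuses(groups, running))
```

Under these assumptions, the per-name lookup is the kind of mismatch that would leave running sensors shown as stopped, while the group-based lookup treats single and aggregated sensors uniformly.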
>
> As far as I can see, we have brought up the idea of a release ourselves, so
> I don’t see why we don’t just swarm this issue, get it right, and then
> release.
>
>
>
> On May 2, 2019 at 04:16:31, Tamás Fodor (ftamas.m...@gmail.com) wrote:
>
> In PR#1360 we introduced a new state management strategy involving a new
> module called Ngrx. We had a discussion thread on this a few months ago and
> we successfully convinced you of the benefits. This is one of the
> reasons why this PR is still going to be huge after cleaning up the commit
> history. After you have had a look at the changes and the feature itself,
> you will likely have questions about why certain parts work as they do. What
> I'd like to point out is that, yes, it will probably take more time
> to get it in.
>
> In order to be able to release the RC, wouldn't it be an easy and quick
> fix on the backend if it sent the aggregated parsers to the client as if they
> were one sensor? It's just an idea and it might be wrong, but at least we
> wouldn't have to wait until the aforementioned PR is ready to be merged
> to master.
>
> On Wed, May 1, 2019 at 4:16 PM Justin Leet  wrote:
>
> > Short version: I'm in favor of #2 for 0.7.1 and #1 as a blocker for 0.8.0.
> > #3 seems like a total waste of time and effort.
> >
> > The wall of text version:
> > I agree this isn't "just the wrong thing shown", but for completely
> > different reasons.
> >
> > To be extremely clear about what the problem is: Our "dev" environment
> > (whose very name implies the audience is developers) uses a performance-based
> > advanced feature to ensure that all of our sample flows are regularly run
> > and produce data. This feature has a bare-minimum implementation to be
> > enabled via Ambari, which it currently is by default. This is because of
> > the limited resources available, which previously resulted in us turning off
> > Yaf, and therefore not testing it during regular full dev runs. Right now
> > however, this feature is not exposed through the management UI, and
> > therefore it isn't obvious what the implications are. Am I missing anything
> > here?
> >
> > For users actually choosing to use the parser aggregation feature in a
> > non-full-dev environment, I'd expect substantially more care to be involved
> > given the lack of easy configuration for it (after all, why would you
> > bother running the aggregated parser alongside the regular parser? This
> > could be more explicitly stated, but again that feels like a doc problem.
> > Right now I could essentially provide two of the same parser and create the
> > same problem, so right now aggregation is only special because it runs on
> > dev by default). This is, in my opinion, primarily a first impression
> > problem and likely one of many areas that could use improved documentation.
> >
> > Quite frankly, I think the issue pointed out here could mostly be resolved
> > by documenting how the current aggregation is done in dev, and explaining how
> > to change it. Especially for a 0.x.1 release, which is primarily bug
> > fixes. As can be inferred from my vote, I don't think this problem is a
> > problem that needs solving in a point release. I would support improving
> > the documentation, both for full-dev and for aggregation in general, for the
> > 0.7.1 point release, while making a 0.8.0 release contingent upon the
> > outstanding PRs to enable it in the management UI.
> >
> > There are a couple deeper issues, imo, that I care substantially more about
> > than this in particular
> > * The dev environment is being used as our intro for users, because it's
> > convenient for us to not maintain more environments (which has been a major
> > pain point in the past). Worse, the dev environment strongly implies it's
> > for M

Re: [DISCUSS] Dev guideline changes for architecture diagrams

2019-05-02 Thread Otto Fowler
I’m all set with this, but for one question,  why image and image-source as
opposed to diagram and diagram source?  Isn’t it more descriptive?  Is
there another reason ( like for doc gen )?



On May 1, 2019 at 20:12:47, Michael Miklavcic (michael.miklav...@gmail.com)
wrote:

Picking up where things left off in the VOTE thread on the subject,
I'm presenting a revision to my original proposal below. I'd like to get
this signed off on before submitting it for another vote. Otto, picking up
where we left off, let me know if this looks good to you.

I'd like to propose a vote to change our dev guidelines which will clarify
the tooling we use to produce diagrams and share the source files for those
diagrams. The original discuss thread is noted at the end of this email. I
propose the dev guidelines
https://cwiki.apache.org/confluence/display/METRON/Development+Guidelines
and PR checklist
https://github.com/apache/metron/blob/master/.github/PULL_REQUEST_TEMPLATE.md#for-documentation-related-changes
be changed in the following ways:

1. Under "1.1 Contributing A Code Change"
   1. Change <<"New features and significant bug fixes should be
      documented in the JIRA and appropriate architecture diagrams should be
      attached. Major features may require a vote.">> to <<"New features
      and significant bug fixes should be documented in the JIRA. Appropriate
      architecture diagrams should be created in https://www.draw.io/
      and committed to source control as per section 2.4. Diagrams may be
      requested of PR submitters during review either as documentation or as
      an aid to the reviewer. Major features may also require a vote.">>
2. Under "2.4 Documentation"
   1. New line item <<"Diagrams - We save architecture diagram source
      files in an xml format rendered by draw.io (instructions below). This
      is the free tool of choice that we've agreed to use for exchanging
      diagrams and their source files in Metron.">>
   2. New line item < "/images-source" and rendered diagrams and images belong in
      "/images."
   3. New subsection <<"Creating and Modifying Diagrams">>. This section
      would provide basic instructions for downloading source files from
      draw.io.
3. Add a new checkbox item under PR checklist heading "For documentation
   related changes" with the following text
   1. Have you ensured that any documentation diagrams have been
      updated, along with their source files, using draw.io? See
      https://cwiki.apache.org/confluence/display/METRON/Development+Guidelines
      for instructions.
4. Here is the Jira for migrating/redoing existing diagrams
   1. https://issues.apache.org/jira/browse/METRON-2099



- Original DISCUSS thread -
https://lists.apache.org/thread.html/3ae02f1e32044b1a7648899700d44611aefdab6caa09fb3196292425@%3Cdev.metron.apache.org%3E
- Original VOTE thread -
https://lists.apache.org/thread.html/c41dca65a46354c161d58caa7c79cfac6758f437b6638d12bd5c5622@%3Cdev.metron.apache.org%3E


Re: [DISCUSS] Metron Release - 0.7.1 next steps

2019-05-02 Thread Otto Fowler
ent UI, we should be able to
> entirely divorce these two overlapping domains. I'd love to see parsers
> ripped out of Ambari, then full-dev manages all the setup via REST. At that
> point, we can easily tell everyone to just use the management UI.
>
> On Wed, May 1, 2019 at 7:23 AM Otto Fowler 
> wrote:
>
> > I think it would help if the full consequences of having the UI show the
> > wrong status were listed.
> >
> > Someone trying Metron will, by default, see the wrong thing in the UI for
> > the ONLY sensors they have that are running and processing data.
> >
> > What happens when they try to start them to make them work? One, two or
> > all?
> > What happens when they edit them or try to add transformations? One, two or
> > all?
> > What other things can you do with the sensors in the ui? What happens?
> >
> > Are we recommending aggregation on the list and elsewhere for users? Are
> > we recommending something that is going to ensure they get into this
> > situation?
> >
> > I think this is more than ‘just the wrong thing shown’ in the ui.
> >
> >
> >
> >
> > On April 30, 2019 at 20:48:10, Michael Miklavcic (
> > michael.miklav...@gmail.com) wrote:
> >
> > The vote for RC1 did not pass and I'd like to kickstart some discussion
> > about what we should do.
> >
> > I started taking a look at PR#1360 and it looks like this isn't quite as
> > close to being able to go in as I had originally expected. I want to talk
> > about options here. It seems to me that we can:
> >
> > 1. Wait for PR#1360 to go in, but this is likely going to take more time
> > than originally anticipated
> > 2. Accept the issue in full dev, but add some notes in the developer
> > docs about the current feature gap and why sensors aren't showing status in
> > the management UI when aggregation is enabled.
> > 3. Find some other workable UI solution.
> > 4. Other option?
> >
> > All things considered, I'm personally leaning towards #2 in the short-term,
> > but I think we should probably talk about this a bit before deciding what
> > RC2 should be.
> >
> > Best,
> > Mike
> >
>


Re: [DISCUSS] Metron Release - 0.7.1 next steps

2019-05-01 Thread Otto Fowler
I think it would help if the full consequences of having the UI show the
wrong status were listed.

Someone trying Metron will, by default, see the wrong thing in the UI for
the ONLY sensors they have that are running and processing data.

What happens when they try to start them to make them work? One, two or all?
What happens when they edit them or try to add transformations? One, two or
all?
What other things can you do with the sensors in the ui?  What happens?

Are we recommending aggregation on the list and elsewhere for users?  Are
we recommending something that is going to ensure they get into this
situation?

I think this is more than ‘just the wrong thing shown’ in the ui.




On April 30, 2019 at 20:48:10, Michael Miklavcic (
michael.miklav...@gmail.com) wrote:

The vote for RC1 did not pass and I'd like to kickstart some discussion
about what we should do.

I started taking a look at PR#1360 and it looks like this isn't quite as
close to being able to go in as I had originally expected. I want to talk
about options here. It seems to me that we can:

1. Wait for PR#1360 to go in, but this is likely going to take more time
than originally anticipated
2. Accept the issue in full dev, but add some notes in the developer
docs about the current feature gap and why sensors aren't showing status in
the management UI when aggregation is enabled.
3. Find some other workable UI solution.
4. Other option?

All things considered, I'm personally leaning towards #2 in the short-term,
but I think we should probably talk about this a bit before deciding what
RC2 should be.

Best,
Mike


Re: [VOTE] Metron Release Candidate 0.7.1-RC1

2019-04-27 Thread Otto Fowler
While we can land a PR and accept the regressions, I do not think we should
do a release with our default sample environment broken.

-1

On April 27, 2019 at 18:11:42, Justin Leet (justinjl...@gmail.com) wrote:

> Mike is correct, that is because of the combination of full dev
> restrictions and the lack of support in the configuration UI for parser
> aggregation. This was introduced in
> https://github.com/apache/metron/pull/1207 and also was true of the last
> release. Currently, parser aggregation is an advanced/manual feature whose
> (bare minimum) configuration can be done via Ambari, out of convenience.
>
> I haven't looked into it, but https://github.com/apache/metron/pull/1360 is
> likely the work for this (and needs additional work before merging).
>
> I'm personally letting my binding +1 stand, although I would support either
> ensuring we get that PR cleaned up and in and/or additional documentation
> regarding the current limitations of this feature.
>
>
> On Sat, Apr 27, 2019 at 2:38 PM Anand Subramanian <
> asubraman...@hortonworks.com> wrote:
>
> I can confirm that I've seen the Mgmt UI show the sensor status correctly
> when they run as single topologies.
>
> -Anand
>
> On 4/27/19, 11:37 PM, "Michael Miklavcic" wrote:
>
> I believe that is bc of parser aggregation. The UI does not support it
> currently. IIRC there was a PR to change the bro, snort, and yaf sensors to
> aggregated bc full dev didn't have enough resources. The upshot is that the
> UI still works for single sensors, but the feature for enabling aggregated
> sensors has not yet been completed.
>
> On Sat, Apr 27, 2019, 11:33 AM Otto Fowler wrote:
>
> -1
>
> Ran the script and ran full dev, all good.
> In the configuration ui, the status of the sensors is not correct. It
> does not show any running, but they are running in storm and the data was
> moved correctly.
>
>
> On April 26, 2019 at 09:58:02, Otto Fowler (ottobackwa...@gmail.com)
> wrote:
>
> Curious Anand,
> are your steps for bringing up an open stack cluster something we could
> script like the AWS stuff?
>
>
> On April 26, 2019 at 09:35:29, Anand Subramanian (
> asubraman...@hortonworks.com) wrote:
>
> +1 (non-binding)
>
> * Built RPMs and mpacks.
> * Brought up Metron stack on 12-node CentOS 7 openstack cluster.
> * Ran sensor-stubs and validated events in the Alerts UI for the default
> sensors.
> * Management UI, Alerts UI and Swagger UI sanity check
>
> Regards,
> Anand
>
> On 4/26/19, 5:18 AM, "Nick Allen"  wrote:
>
> +1 Verified release with all documented steps and ran up Full Dev.
>
> On Thu, Apr 25, 2019 at 6:10 PM Michael Miklavcic <
> michael.miklav...@gmail.com> wrote:
>
> Ok cool, just finished the validation and updated the steps in the doc to
> reflect the current code base.
>
> On Thu, Apr 25, 2019 at 3:45 PM Nick Allen wrote:
>
> No voting required. Those are just docs. Whoever is willing to correct
> and has access, should be able to. Good catch.
>
> On Thu, Apr 25, 2019 at 4:32 PM Michael Miklavcic <
> michael.miklav...@gmail.com> wrote:
>
> We're also not "incubator-metron" any longer. Do we require any kind of
> voting or +1 on that verification page to make corrections to it?
>
>
> On Thu, Apr 25, 2019 at 2:29 PM Michael Miklavcic <
> michael.miklav...@gmail.com> wrote:
>
> fyi, the steps in this doc have changed slightly per this naming
> convention change as well -
> https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds.
>
>
>
>
> On Thu, Apr 25, 2019 at 1:25 PM Justin Leet <justinjl...@gmail.com>
> wrote:
>
> For everyone taking the time to validate and vote on the RC, there is a
> caveat. The naming conventions for the two repos are now aligned
> (_, instead of being '-' in the main repo and '_' in
> the plugin repo), along with the location of the KEYS file. I have a PR out
> to update the metron-rc-check script (
> https://github.com/apache/metron/pull/1394).
>
> This accounts for both of these changes, and should allow the script to be
> run normally.
>
> On Thu, Apr 25, 2019 at 3:22 PM Justin Leet <justinjl...@gmail.com>
> wrote:
>
> This is a call to vote on releasing Apache Metron 0.7.1
>
> Full list of change

Re: [VOTE] Metron Release Candidate 0.7.1-RC1

2019-04-27 Thread Otto Fowler
-1

Ran the script and ran full dev, all good.
In the configuration ui, the status of the sensors is not correct.  It does
not show any running, but they are running in storm and the data was moved
correctly.


On April 26, 2019 at 09:58:02, Otto Fowler (ottobackwa...@gmail.com) wrote:

Curious Anand,
are your steps for bringing up an open stack cluster something we could
script like the AWS stuff?


On April 26, 2019 at 09:35:29, Anand Subramanian (
asubraman...@hortonworks.com) wrote:

+1 (non-binding)

* Built RPMs and mpacks.
* Brought up Metron stack on 12-node CentOS 7 openstack cluster.
* Ran sensor-stubs and validated events in the Alerts UI for the default
sensors.
* Management UI, Alerts UI and Swagger UI sanity check

Regards,
Anand

On 4/26/19, 5:18 AM, "Nick Allen"  wrote:

+1 Verified release with all documented steps and ran up Full Dev.

On Thu, Apr 25, 2019 at 6:10 PM Michael Miklavcic <
michael.miklav...@gmail.com> wrote:

> Ok cool, just finished the validation and updated the steps in the doc to
> reflect the current code base.
>
> On Thu, Apr 25, 2019 at 3:45 PM Nick Allen  wrote:
>
> > No voting required. Those are just docs. Whoever is willing to correct
> > and has access, should be able to. Good catch.
> >
> > On Thu, Apr 25, 2019 at 4:32 PM Michael Miklavcic <
> > michael.miklav...@gmail.com> wrote:
> >
> > > We're also not "incubator-metron" any longer. Do we require any kind of
> > > voting or +1 on that verification page to make corrections to it?
> > >
> > > On Thu, Apr 25, 2019 at 2:29 PM Michael Miklavcic <
> > > michael.miklav...@gmail.com> wrote:
> > >
> > > > fyi, the steps in this doc have changed slightly per this naming
> > > > convention change as well -
> > > > https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds.
> > > >
> > > >
> > > >
> > > > On Thu, Apr 25, 2019 at 1:25 PM Justin Leet wrote:
> > > >
> > > >> For everyone taking the time to validate and vote on the RC, there is a
> > > >> caveat. The naming conventions for the two repos are now aligned
> > > >> (_, instead of being '-' in the main repo and '_' in
> > > >> the plugin repo), along with the location of the KEYS file. I have a PR out
> > > >> to update the metron-rc-check script (
> > > >> https://github.com/apache/metron/pull/1394).
> > > >>
> > > >> This accounts for both of these changes, and should allow the script to be
> > > >> run normally.
> > > >>
> > > >> On Thu, Apr 25, 2019 at 3:22 PM Justin Leet wrote:
> > > >>
> > > >> > This is a call to vote on releasing Apache Metron 0.7.1
> > > >> >
> > > >> > Full list of changes in this release:
> > > >> > https://dist.apache.org/repos/dist/dev/metron/0.7.1-RC1/CHANGES
> > > >> > The tag to be voted upon is:
> > > >> > apache-metron_0.7.1-rc1
> > > >> >
> > > >> > The source archives being voted upon can be found here:
> > > >> >
> > > >> > https://dist.apache.org/repos/dist/dev/metron/0.7.1-RC1/apache-metron_0.7.1-rc1.tar.gz
> > > >> >
> > > >> > Other release files, signatures and digests can be found here:
> > > >> > https://dist.apache.org/repos/dist/dev/metron/0.7.1-RC1/
> > > >> >
> > > >> > The release artifacts are signed with the following key:
> > > >> > https://dist.apache.org/repos/dist/release/metron/KEYS
> > > >> > Please vote on releasing this package as Apache Metron 0.7.1-RC1
> > > >> >
> > > >> > When voting, please list the actions taken to verify the release.
> > > >> >
> > > >> > Recommended build validation and verification instructions are posted
> > > >> > here:
> > > >> > https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds
> > > >> >
> > > >> > This vote will be open until 4pm EDT on Tuesday April 30 2019, to
> > > >> > account for the weekend.
> > > >> >
> > > >> > [ ] +1 Release this package as Apache Metron 0.7.1-RC1
> > > >> >
> > > >> > [ ] 0 No opinion
> > > >> >
> > > >> > [ ] -1 Do not release this package because...
> > > >> >
> > > >>
> > > >
> > >
> >
>
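
The tag, archive, and directory names in the vote email follow a visible pattern, which can be sketched in Python as below. This is only a pattern inferred from the RC1 URLs quoted in this thread; it is not taken from the metron-rc-check script, and the function name `rc_artifacts` and the dictionary keys are made up for illustration.

```python
# Base locations copied from the vote email above.
DEV_DIST = "https://dist.apache.org/repos/dist/dev/metron"
KEYS_URL = "https://dist.apache.org/repos/dist/release/metron/KEYS"


def rc_artifacts(version, rc_number):
    """Build the tag and URLs for a release candidate (hypothetical helper).

    Assumes the layout seen in this thread: an upper-case "RC" in the
    directory name and a lower-case "rc" in the tag and archive name,
    with '_' between the project name and the version.
    """
    rc_dir = f"{DEV_DIST}/{version}-RC{rc_number}"
    tag = f"apache-metron_{version}-rc{rc_number}"
    return {
        "tag": tag,
        "changes": f"{rc_dir}/CHANGES",
        "source_archive": f"{rc_dir}/{tag}.tar.gz",
        "release_files": f"{rc_dir}/",
        "keys": KEYS_URL,
    }


for name, value in rc_artifacts("0.7.1", 1).items():
    print(f"{name}: {value}")
```

If a later candidate changes the layout, the two format strings are the only places that would need adjusting.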


Re: [VOTE] Update dev guidelines with format for sharing architecture source files and rendered images

2019-04-26 Thread Otto Fowler
When I say module, I mean at the root of any module, so each module could
have its own diagrams.
And the project root would have diagrams for the ‘overall’.


On April 26, 2019 at 13:44:30, Michael Miklavcic (
michael.miklav...@gmail.com) wrote:

The convention that seems to have been followed thus far has been to plop
the images in the root of the module they're relevant to. Maybe relocating
them to a central place would make it easier. The site-book image link
rewriting might be simpler then as well. The only downside to this approach
would be that the artifacts are split from their respective modules, but I
honestly don't see that as a problem.

On Fri, Apr 26, 2019, 11:40 AM Otto Fowler  wrote:

> On April 26, 2019 at 13:19:05, Michael Miklavcic (
> michael.miklav...@gmail.com) wrote:
>
> @otto when I get your responses to my Q's inline below I can post another
> revision.
>
> On Thu, Apr 25, 2019 at 11:52 AM Otto Fowler 
> wrote:
>
> > - We need to specify the format I think, and then say that draw.io is the
> > tool for the format and not just specify the tool.
> >
>
> Format for the source files, rendered files, or both? I believe their
> source file format is a proprietary XML format. For rendered images, I
> don't have a strong opinion and am happy to leave that up to the
> implementer. If we want to be more opinionated, i.e. specify png, svg,
> jpeg, etc. I could probably be persuaded. For the source file comment,
> maybe it would help if I did the full write-up for 3.1 wrt instructions for
> how to produce the diagrams and source files from draw.io.
>
>
> I think what you say would be ok then, if draw io only has one source
> format.
>
> I don’t care about the image format either, I’m surprised nobody has a
> strong opinion about it.
>
> Do we want a standard place to put the diagrams?
>
> module/
> - diagrams/
> - foo.xml
> - foo.png
> - pr1234.xml
> - pr1234.jpeg
> - METRON-13244.xml
> - METRON-13244.png
> - EnrichmentArchitecture.xml
> - EnrichmentArchitecture.png
>
>
>
>
>
>
> > - Existing diagrams, in order to be modified, will have to be converted to
> > this format, there should be jiras for that
> >
> > Makes sense - I think I'll create those Jiras in lockstep with this vote
> > getting approval
>
>
> > 2.1 "New features and significant bug fixes should be documented in the
> > JIRA. Appropriate architecture diagrams should be created in
> > https://www.draw.io/ and committed"
> > "New features and significant bug fixes should be documented in the JIRA.
> > Appropriate architecture diagrams should be created in
> > https://www.draw.io/ and
> > committed. Diagrams may be requested of PR submitters during review either
> > as documentation or as an aid to the reviewer"
> > We could/should/can use the diagrams as
> >
> > - documentation
> > - simple aids for understanding PRs and communication ( Nick and I used
> > them for such yesterday to great effect to make sure we were on the same
> >
> > I’m not sure we don’t want to have that blurb in there
> >
> > I'm happy to add this as well, +1 to that.
>
>
> >
> > On April 25, 2019 at 12:57:47, Michael Miklavcic (
> > michael.miklav...@gmail.com) wrote:
> >
> > I'd like to propose a vote to change our dev guidelines which will clarify
> > the tooling we use to produce diagrams and share the source files for those
> > diagrams. The original discuss thread is noted at the end of this email. I
> > propose the dev guidelines
> > https://cwiki.apache.org/confluence/display/METRON/Development+Guidelines
> > and PR checklist
> > https://github.com/apache/metron/blob/master/.github/PULL_REQUEST_TEMPLATE.md#for-documentation-related-changes
> > be changed in the following ways:
> >
> > 1. We specify that draw.io is the free tool of choice for sharing
> >    diagrams in Metron and that the source files will be maintained/shared in
> >    source control.
> > 2. Under "1.1 Contributing A Code Change"
> >    1. Change "New features and significant bug fixes should be
> >       documented in the JIRA and appropriate architecture diagrams should be
> >       attached. Major features may require a vote." to "New features and
> >       significant bug fixes should be documented in the JIRA. Appropriate
> >       architecture diagrams should be created in https://www.draw.io/
> >       and committed to source control with their XML source files and final
> >       rendered image. Major features may require a vote."
> > 3. Un

Re: [VOTE] Update dev guidelines with format for sharing architecture source files and rendered images

2019-04-26 Thread Otto Fowler
On April 26, 2019 at 13:19:05, Michael Miklavcic (
michael.miklav...@gmail.com) wrote:

@otto when I get your responses to my Q's inline below I can post another
revision.

On Thu, Apr 25, 2019 at 11:52 AM Otto Fowler 
wrote:

> - We need to specify the format I think, and then say that draw io is the
> tool for the format and not just specify the tool.
>

Format for the source files, rendered files, or both? I believe their
source file format is a proprietary XML format. For rendered images, I
don't have a strong opinion and am happy to leave that up to the
implementer. If we want to be more opinionated, i.e. specify png, svg,
jpeg, etc. I could probably be persuaded. For the source file comment,
maybe it would help if I did the full write-up for 3.1 wrt instructions for
how to produce the diagrams and source files from draw.io.


I think what you say would be ok then, if draw io only has one source
format.

I don’t care about the image format either, I’m surprised nobody has a
strong opinion about it.

Do we want a standard place to put the diagrams?

module/
- diagrams/
- foo.xml
- foo.png
- pr1234.xml
- pr1234.jpeg
- METRON-13244.xml
- METRON-13244.png
- EnrichmentArchitecture.xml
- EnrichmentArchitecture.png
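
If a layout like the one above were adopted, the PR checklist question about diagrams being updated "along with their source files" could be checked mechanically. The Python sketch below is one possible way to do that, not an agreed convention: the function name `unpaired_diagrams` is hypothetical, it assumes a source file and its rendered image share the same base name, and the set of image extensions (including `.jpg`) is a guess based on the formats mentioned in this thread.

```python
from pathlib import Path

# Rendered-image extensions mentioned in the thread (png, svg, jpeg);
# ".jpg" is an added assumption.
RENDERED_EXTENSIONS = {".png", ".svg", ".jpeg", ".jpg"}


def unpaired_diagrams(diagram_dir):
    """Return (sources_without_image, images_without_source) as sorted name lists.

    A draw.io source is assumed to be "<name>.xml" and its rendered image
    "<name>.<image extension>" in the same directory.
    """
    sources = set()
    images = set()
    for path in Path(diagram_dir).iterdir():
        suffix = path.suffix.lower()
        if suffix == ".xml":
            sources.add(path.stem)
        elif suffix in RENDERED_EXTENSIONS:
            images.add(path.stem)
    return sorted(sources - images), sorted(images - sources)
```

For the example listing above, every `.xml` file has an image with the same base name, so calling `unpaired_diagrams("module/diagrams")` on such a directory would return two empty lists.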






> - Existing diagrams, in order to be modified, will have to be converted to
> this format, there should be jiras for that
>
> Makes sense - I think I'll create those Jiras in lockstep with this vote
> getting approval


> 2.1 "New features and significant bug fixes should be documented in the
> JIRA. Appropriate architecture diagrams should be created in
> https://www.draw.io/ and committed"
> "New features and significant bug fixes should be documented in the JIRA.
> Appropriate architecture diagrams should be created in
> https://www.draw.io/ and
> committed. Diagrams may be requested of PR submitters during review either
> as documentation or as an aid to the reviewer"
>
> We could/should/can use the diagrams as
>
> - documentation
> - simple aids for understanding PRs and communication ( Nick and I used
> them for such yesterday to great effect to make sure we were on the same
> page ).
>
> I’m not sure we don’t want to have that blurb in there
>
> I'm happy to add this as well, +1 to that.


>
> On April 25, 2019 at 12:57:47, Michael Miklavcic (
> michael.miklav...@gmail.com) wrote:
>
> I'd like to propose a vote to change our dev guidelines which will clarify
> the tooling we use to produce diagrams and share the source files for those
> diagrams. The original discuss thread is noted at the end of this email. I
> propose the dev guidelines
> https://cwiki.apache.org/confluence/display/METRON/Development+Guidelines
> and PR checklist
> https://github.com/apache/metron/blob/master/.github/PULL_REQUEST_TEMPLATE.md#for-documentation-related-changes
> be changed in the following ways:
>
> 1. We specify that draw.io is the free tool of choice for sharing
>    diagrams in Metron and that the source files will be maintained/shared in
>    source control.
> 2. Under "1.1 Contributing A Code Change"
>    1. Change "New features and significant bug fixes should be
>       documented in the JIRA and appropriate architecture diagrams should be
>       attached. Major features may require a vote." to "New features and
>       significant bug fixes should be documented in the JIRA. Appropriate
>       architecture diagrams should be created in https://www.draw.io/
>       and committed to source control with their XML source files and final
>       rendered image. Major features may require a vote."
> 3. Under "2.4 Documentation"
>    1. Add a new section with instructions entitled "Creating and Modifying
>       Diagrams". This section would provide basic instructions for
>       downloading source files from draw.io.
> 4. Add a new checkbox item under PR checklist heading "For documentation
>    related changes" with the following text
>    1. Have you ensured that any documentation diagrams have been
>       updated, along with their source files, using draw.io? See
>       https://cwiki.apache.org/confluence/display/METRON/Development+Guidelines
>       for instructions.
>
> We require a minimum of 72 hours for a vote, not typically including
> weekend days. I'd like to leave this vote open until Tuesday, 12PM
> EDT. Please vote +1, -1, or 0 to abstain, and also indicate if your vote is
> binding or non-binding.
>
> https://lists.apache.org/thread.html/3ae02f1e32044b1a7648899700d44611aefdab6caa09fb3196292425@%3Cdev.metron.apache.org%3E
>
> Cheers,
> Mike
>


Re: [VOTE] Metron Release Candidate 0.7.1-RC1

2019-04-26 Thread Otto Fowler
Curious Anand,
are your steps for bringing up an open stack cluster something we could
script like the AWS stuff?


On April 26, 2019 at 09:35:29, Anand Subramanian (
asubraman...@hortonworks.com) wrote:

+1 (non-binding)

* Built RPMs and mpacks.
* Brought up Metron stack on 12-node CentOS 7 openstack cluster.
* Ran sensor-stubs and validated events in the Alerts UI for the default
sensors.
* Management UI, Alerts UI and Swagger UI sanity check

Regards,
Anand

On 4/26/19, 5:18 AM, "Nick Allen"  wrote:

+1 Verified release with all documented steps and ran up Full Dev.

On Thu, Apr 25, 2019 at 6:10 PM Michael Miklavcic <
michael.miklav...@gmail.com> wrote:

> Ok cool, just finished the validation and updated the steps in the doc to
> reflect the current code base.
>
> On Thu, Apr 25, 2019 at 3:45 PM Nick Allen  wrote:
>
> > No voting required. Those are just docs. Whoever is willing to correct
> > and has access, should be able to. Good catch.
> >
> > On Thu, Apr 25, 2019 at 4:32 PM Michael Miklavcic <
> > michael.miklav...@gmail.com> wrote:
> >
> > > > We're also not "incubator-metron" any longer. Do we require any kind of
> > > > voting or +1 on that verification page to make corrections to it?
> > >
> > > On Thu, Apr 25, 2019 at 2:29 PM Michael Miklavcic <
> > > michael.miklav...@gmail.com> wrote:
> > >
> > > > fyi, the steps in this doc have changed slightly per this naming
> > > > convention change as well -
> > > > https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds.

> > > >
> > > >
> > > >
> > > > On Thu, Apr 25, 2019 at 1:25 PM Justin Leet 
> > > wrote:
> > > >
> > > > >> For everyone taking the time to validate and vote on the RC, there is a
> > > > >> caveat. The naming conventions for the two repos are now aligned
> > > > >> (_, instead of being '-' in the main repo and '_' in
> > > > >> the plugin repo) along with the location of the KEYS file, I have a PR out
> > > > >> to update the metron-rc-check script (
> > > > >> https://github.com/apache/metron/pull/1394).
> > > > >>
> > > > >> This accounts for both of these changes, and should allow the script to be
> > > > >> run normally.
> > > >>
> > > >> On Thu, Apr 25, 2019 at 3:22 PM Justin Leet 

> > > >> wrote:
> > > >>
> > > >> > This is a call to vote on releasing Apache Metron 0.7.1
> > > >> >
> > > >> > Full list of changes in this release:
> > > >> > https://dist.apache.org/repos/dist/dev/metron/0.7.1-RC1/CHANGES
> > > >> > The tag to be voted upon is:
> > > >> > apache-metron_0.7.1-rc1
> > > >> >
> > > >> > The source archives being voted upon can be found here:
> > > >> >
> > > >> >
> > > >>
> > >
> >
>
https://dist.apache.org/repos/dist/dev/metron/0.7.1-RC1/apache-metron_0.7.1-rc1.tar.gz
> > > >> >
> > > >> > Other release files, signatures and digests can be found here:
> > > >> > https://dist.apache.org/repos/dist/dev/metron/0.7.1-RC1/
> > > >> >
> > > >> > The release artifacts are signed with the following key:
> > > >> > https://dist.apache.org/repos/dist/release/metron/KEYS
> > > >> > Please vote on releasing this package as Apache Metron 0.7.1-RC1
> > > >> >
> > > > >> > When voting, please list the actions taken to verify the release.
> > > > >> >
> > > > >> > Recommended build validation and verification instructions are posted
> > > > >> > here:
> > > > >> > https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds
> > > > >> >
> > > > >> > This vote will be open until 4pm EDT on Tuesday April 30 2019, to
> > > > >> > account for the weekend.
> > > >> >
> > > >> > [ ] +1 Release this package as Apache Metron 0.7.1-RC1
> > > >> >
> > > >> > [ ] 0 No opinion
> > > >> >
> > > >> > [ ] -1 Do not release this package because...
> > > >> >
> > > >>
> > > >
> > >
> >
>
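
A note on the verification step requested in the vote email above ("Other
release files, signatures and digests can be found here"): the digest check
can be sketched in a few lines. This is only a minimal sketch, assuming the
published digest is a SHA-512 hex string (the thread does not say which
algorithm is used) and that the tarball has already been downloaded. The
function names are hypothetical, and this is not the metron-rc-check script.

```python
import hashlib


def file_digest(path, algorithm="sha512", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in chunks."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large tarballs are not loaded into memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def digest_matches(path, expected_hex, algorithm="sha512"):
    """Compare a file's digest with a published hex digest (assumed format)."""
    return file_digest(path, algorithm) == expected_hex.strip().lower()
```

A digest match only shows the download is intact; checking the signature
against the KEYS file is a separate step and is not covered here.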


Re: [VOTE] Update dev guidelines with format for sharing architecture source files and rendered images

2019-04-25 Thread Otto Fowler
- I think we need to specify the format, and then say that draw.io is the
tool for that format, rather than just specifying the tool.
- Existing diagrams will have to be converted to this format before they can
be modified; there should be Jiras for that.

2.1 "New features and significant bug fixes should be documented in the
JIRA. Appropriate architecture diagrams should be created in
https://www.draw.io/ and committed"
could become:
"New features and significant bug fixes should be documented in the JIRA.
Appropriate architecture diagrams should be created in https://www.draw.io/ and
committed. Diagrams may be requested of PR submitters during review either
as documentation or as an aid to the reviewer."

We could/should/can use the diagrams as

- documentation
- simple aids for understanding PRs and communication ( Nick and I used
them for such yesterday to great effect to make sure we were on the same
page ).

I’m not sure we don’t want to have that blurb in there



On April 25, 2019 at 12:57:47, Michael Miklavcic (
michael.miklav...@gmail.com) wrote:

I'd like to propose a vote to change our dev guidelines which will clarify
the tooling we use to produce diagrams and share the source files for those
diagrams. The original discuss thread is noted at the end of this email. I
propose the dev guidelines
https://cwiki.apache.org/confluence/display/METRON/Development+Guidelines
and
PR checklist
https://github.com/apache/metron/blob/master/.github/PULL_REQUEST_TEMPLATE.md#for-documentation-related-changes
be
changed in the following ways:

1. We specify that draw.io is the free tool of choice for sharing
diagrams in Metron and that the source files will be maintained/shared in
source control.
2. Under "1.1 Contributing A Code Change"
1. Change "New features and significant bug fixes should be
documented in the JIRA and appropriate architecture diagrams should be
attached. Major features may require a vote." to "New features and
significant bug fixes should be documented in the JIRA. Appropriate
architecture diagrams should be created in https://www.draw.io/
and committed
to source control with their XML source files and final rendered image.
Major features may require a vote."
3. Under "2.4 Documentation"
1. Add a new section with instructions entitled "Creating and Modifying
Diagrams". This section would provide basic instructions for downloading
source files from draw.io.
4. Add a new checkbox item under PR checklist heading "For documentation
related changes" with the following text
1. Have you ensured that any documentation diagrams have been
updated, along with their source files, using draw.io? See
https://cwiki.apache.org/confluence/display/METRON/Development+Guidelines
for
instructions.

We require a minimum of 72 hours for a vote, not typically including
weekend days. I'd like to leave this vote open until Tuesday, 12PM
EDT. Please vote +1, -1, or 0 to abstain, and also indicate if your vote is
binding or non-binding.

https://lists.apache.org/thread.html/3ae02f1e32044b1a7648899700d44611aefdab6caa09fb3196292425@%3Cdev.metron.apache.org%3E

Cheers,
Mike
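
The proposed checklist item ("any documentation diagrams have been updated,
along with their source files") could be checked mechanically. The following
is only a rough sketch, under the assumption that a rendered image and its
draw.io XML source share a file stem in the same directory; the thread does
not define such a naming rule, and the function name and suffix list are
hypothetical.

```python
from pathlib import PurePosixPath

# Assumed set of rendered-image suffixes; adjust to what the repo uses.
IMAGE_SUFFIXES = {".png", ".svg", ".jpg"}


def images_missing_source(paths, source_suffix=".xml"):
    """List image paths that have no sibling source file with the same stem."""
    parsed = [PurePosixPath(p) for p in paths]
    # Stems (path minus suffix) of every source file in the change set.
    source_stems = {p.with_suffix("") for p in parsed if p.suffix == source_suffix}
    return sorted(
        str(p)
        for p in parsed
        if p.suffix in IMAGE_SUFFIXES and p.with_suffix("") not in source_stems
    )
```

Not every image in a repository is a diagram, so a check like this would
be a prompt for the reviewer rather than a hard gate.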


Re: [DISCUSS] Format for sharing architecture source files and rendered images

2019-04-17 Thread Otto Fowler
Also, the section should either have a blurb and link for draw.io or a
reference footnote etc.


On April 17, 2019 at 21:36:03, Otto Fowler (ottobackwa...@gmail.com) wrote:

I think we should try draw.io, and should add ( with vote I think ) a new
section to the developer documentation like:
Architecture Diagrams

The project supports the XXX format along with accompanying image
version in  format for architectural diagrams.

Architectural diagrams are important because reasons, and while they may
not be required for a PR with architectural implications, they may be
requested as part of the review process when deemed prudent.

—

Along with that, we can discuss how to store them in git and have them
integrated with the documentation *and* the website.




On April 16, 2019 at 11:57:46, Michael Miklavcic (
michael.miklav...@gmail.com) wrote:

Hi devs,

We've talked about this fairly ad hoc on PRs before, and I thought it worth
bringing up more formally. Architecture diagrams are a strategically useful
addition to the core developer documentation. Examples include:

1.
https://github.com/apache/metron/tree/master/metron-platform/metron-parsing#parser-architecture
2.
https://github.com/apache/metron/tree/master/metron-platform/metron-enrichment/metron-enrichment-storm#enrichment-architecture
3.
https://github.com/apache/metron/tree/master/metron-platform/metron-job#job-state-statechart
4.
https://github.com/apache/metron/tree/master/metron-interface#knox-requestresponse-flow

We currently have a couple formats for maintaining the source diagram files
across the project as well as a few diagrams with no source files. I've
personally used draw.io in the past because it's publicly available and
free to use. I also see Powerpoint.

1. No source file -
https://github.com/apache/metron/blob/master/metron-analytics/metron-maas-service/maas_arch.png
2. draw.io source -
https://github.com/apache/metron/blob/master/metron-platform/metron-job/metron-job_state_statechart_diagram.xml
3. powerpoint -
https://github.com/apache/metron/blob/master/metron-interface/flow_diagrams.pptx

I'm interested in what the developer community has access to and would
prefer. As I mentioned before, I'm partial to draw.io if only because it
does not require a fee to use. PowerPoint is probably easier to use, but it
does limit the contributors who can edit the images. Are there other
options that people have used they would prefer over any of the above? Are
we still interested in maintaining the source files (my personal vote is
yes)?

Best,
Mike


Re: [DISCUSS] Format for sharing architecture source files and rendered images

2019-04-17 Thread Otto Fowler
I think we should try draw.io, and should add ( with vote I think ) a new
section to the developer documentation like:
Architecture Diagrams

The project supports the XXX format along with accompanying image
version in  format for architectural diagrams.

Architectural diagrams are important because reasons, and while they may
not be required for a PR with architectural implications, they may be
requested as part of the review process when deemed prudent.

—

Along with that, we can discuss how to store them in git and have them
integrated with the documentation *and* the website.





On April 16, 2019 at 11:57:46, Michael Miklavcic (
michael.miklav...@gmail.com) wrote:

Hi devs,

We've talked about this fairly ad hoc on PRs before, and I thought it worth
bringing up more formally. Architecture diagrams are a strategically useful
addition to the core developer documentation. Examples include:

1.
https://github.com/apache/metron/tree/master/metron-platform/metron-parsing#parser-architecture
2.
https://github.com/apache/metron/tree/master/metron-platform/metron-enrichment/metron-enrichment-storm#enrichment-architecture
3.
https://github.com/apache/metron/tree/master/metron-platform/metron-job#job-state-statechart
4.
https://github.com/apache/metron/tree/master/metron-interface#knox-requestresponse-flow

We currently have a couple formats for maintaining the source diagram files
across the project as well as a few diagrams with no source files. I've
personally used draw.io in the past because it's publicly available and
free to use. I also see Powerpoint.

1. No source file -
https://github.com/apache/metron/blob/master/metron-analytics/metron-maas-service/maas_arch.png
2. draw.io source -
https://github.com/apache/metron/blob/master/metron-platform/metron-job/metron-job_state_statechart_diagram.xml
3. powerpoint -
https://github.com/apache/metron/blob/master/metron-interface/flow_diagrams.pptx

I'm interested in what the developer community has access to and would
prefer. As I mentioned before, I'm partial to draw.io if only because it
does not require a fee to use. PowerPoint is probably easier to use, but it
does limit the contributors who can edit the images. Are there other
options that people have used they would prefer over any of the above? Are
we still interested in maintaining the source files (my personal vote is
yes)?

Best,
Mike


Re: Problems with Dev deployment.

2019-04-10 Thread Otto Fowler
Sorry, fixed it


On April 10, 2019 at 11:34:33, zeo...@gmail.com (zeo...@gmail.com) wrote:

Wow, I didn't realize that never got finalized/merged. Looks like there is
a failure in travis on that PR, if you get that wrapped up I would think we
should take another look at that and maybe get it merged. It has been a
while but I recall I was pretty happy with it after my review cycle with
you.

- Jon Zeolla
zeo...@gmail.com


On Wed, Apr 10, 2019 at 9:17 AM Otto Fowler 
wrote:

> These issues are the reason https://github.com/apache/metron/pull/1261 was
> done. It would be nice if we could get by them.
>
>
> On April 10, 2019 at 08:13:04, Dale Richardson (tigerqu...@outlook.com)
> wrote:
>
> Older pre-req versions are mentioned at:
>
> https://metron.apache.org/current-book/metron-deployment/vagrant/codelab-platform/index.html
> Metron – Developer Image for Apache Metron on Virtualbox
>
> Developer Image for Apache Metron on Virtualbox. This image is a fully
> functional Metron installation that has been pre-loaded with Ambari, HDP
> and Metron.
>
> https://metron.apache.org/current-book/metron-deployment/vagrant/full-dev-platform/index.html
> Metron – Full Development Platform
>
> Full Development Platform. This project fully automates the provisioning
> and deployment of Apache Metron and all necessary prerequisites on a
> single, virtualized host running on Virtualbox.
>
> https://metron.apache.org/current-book/metron-deployment/vagrant/quick-dev-platform/index.html
> Metron – Quick Development Platform
>
> Quick Development Platform. This project fully automates the provisioning
> and deployment of Apache Metron and all necessary prerequisites on a
> single, virtualized host running on Virtualbox.
>
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=68718548
> Metron Install on Ubuntu/Debian single-node VM with Vagrant and Ambari
>
> Contributed by Umesh Kaushik < umesh.kaus...@bhujang.net > Introduction.
> These instructions are for an Ubuntu host, with occasional comments about
> how to do similar tasks in CentOS.
>
> Which links to
> https://gist.github.com/dpalomar/96b826dac5c2e8b62cbf4c86dbd1c9df
>
>
>
> 
> From: Michael Miklavcic 
> Sent: Wednesday, 10 April 2019 4:15 AM
> To: dev@metron.apache.org
> Subject: Re: Problems with Dev deployment.
>
> Where are you seeing 2.0.0.2 for Ansible? Should be 2.4.0+ now.
> https://github.com/apache/metron/blob/master/metron-deployment/development/centos6/README.md
>
>
>
> On Tue, Apr 9, 2019, 6:59 PM Michael Miklavcic <
> michael.miklav...@gmail.com>
>
> wrote:
>
> > That would be awesome man! Yes, Jira tickets for every change. I don't
> > recall off the top of my head the version - I'll have to look when I get
> > back in front of a computer.
> >
> > On Tue, Apr 9, 2019, 6:56 PM Dale Richardson 
> > wrote:
> >
> >> Hi Michael,
> >> Yep that was the issue, looks like "brew install ansible" configures
> >> ansible for python 3. Happy to patch the documentation with updated
> >> install instructions (I will add the MacOS Mojave Drive permission issue
> >> as well). Do you know if the minimum ansible version requirement has
> >> changed at all? (some documentation lists it as 2.0.0.2 or 2.2.2.0). Do
> >> you guys usually put in a Jira to cover small documentation changes at all?
> >> Regards,
> >> Dale.
> >> 
> >> From: Otto Fowler 
> >> Sent: Sunday, 7 April 2019 1:40 PM
> >> To: dev@metron.apache.org
> >> Subject: Re: Problems with Dev deployment.
> >>
> >> Can you pull down https://github.com/apache/metron/pull/1261
> >> and try?
> >> That should eliminate any env. issues.
> >

Re: Problems with Dev deployment.

2019-04-10 Thread Otto Fowler
These issues are the reason https://github.com/apache/metron/pull/1261 was
done.  It would be nice if we could get by them.


On April 10, 2019 at 08:13:04, Dale Richardson (tigerqu...@outlook.com)
wrote:

Older pre-req versions are mentioned at:

https://metron.apache.org/current-book/metron-deployment/vagrant/codelab-platform/index.html
Metron – Developer Image for Apache Metron on Virtualbox

Developer Image for Apache Metron on Virtualbox. This image is a fully
functional Metron installation that has been pre-loaded with Ambari, HDP
and Metron.

https://metron.apache.org/current-book/metron-deployment/vagrant/full-dev-platform/index.html
Metron – Full Development Platform

Full Development Platform. This project fully automates the provisioning
and deployment of Apache Metron and all necessary prerequisites on a
single, virtualized host running on Virtualbox.

https://metron.apache.org/current-book/metron-deployment/vagrant/quick-dev-platform/index.html
Metron – Quick Development Platform

Quick Development Platform. This project fully automates the provisioning
and deployment of Apache Metron and all necessary prerequisites on a
single, virtualized host running on Virtualbox.

https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=68718548
Metron Install on Ubuntu/Debian single-node VM with Vagrant and Ambari

Contributed by Umesh Kaushik < umesh.kaus...@bhujang.net > Introduction.
These instructions are for an Ubuntu host, with occasional comments about
how to do similar tasks in CentOS.

Which links to
https://gist.github.com/dpalomar/96b826dac5c2e8b62cbf4c86dbd1c9df




From: Michael Miklavcic 
Sent: Wednesday, 10 April 2019 4:15 AM
To: dev@metron.apache.org
Subject: Re: Problems with Dev deployment.

Where are you seeing 2.0.0.2 for Ansible? Should be 2.4.0+ now.
https://github.com/apache/metron/blob/master/metron-deployment/development/centos6/README.md
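
The two conditions that come up in this exchange (an Ansible version of
2.4.0 or later, and an Ansible install that runs on Python 3) can be read off
the `ansible --version` output quoted further down in this thread. The
following is a minimal sketch under those assumptions only; the function
names are hypothetical, the expected output layout is taken from the sample
in this thread, and nothing here comes from the Metron deployment scripts.

```python
import re

# Minimum taken from the "Should be 2.4.0+ now" remark in this thread.
MIN_ANSIBLE = (2, 4, 0)


def parse_version(text):
    """Turn a dotted version string such as '2.7.9' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def check_ansible(version_output):
    """Return warnings for text shaped like the `ansible --version` sample."""
    warnings = []
    ansible = re.search(r"^ansible (\d+(?:\.\d+)+)", version_output, re.MULTILINE)
    python = re.search(r"python version = (\d+(?:\.\d+)+)", version_output)
    if ansible and parse_version(ansible.group(1)) < MIN_ANSIBLE:
        warnings.append("Ansible %s is older than 2.4.0" % ansible.group(1))
    if python and parse_version(python.group(1))[0] >= 3:
        # The traceback in this thread shows Python 2 only syntax being parsed.
        warnings.append("Ansible is running on Python 3 (%s)" % python.group(1))
    return warnings


SAMPLE = """ansible 2.7.9
  python version = 3.7.2 (default, Feb 12 2019, 08:15:36)"""
```

With the sample from this thread, the Ansible version passes the minimum
and only the Python 3 interpreter is flagged, which matches the cause Dale
confirmed.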



On Tue, Apr 9, 2019, 6:59 PM Michael Miklavcic 

wrote:

> That would be awesome man! Yes, Jira tickets for every change. I don't
> recall off the top of my head the version - I'll have to look when I get
> back in front of a computer.
>
> On Tue, Apr 9, 2019, 6:56 PM Dale Richardson 
> wrote:
>
>> Hi Michael,
>> Yep that was the issue, looks like "brew install ansible" configures
>> ansible for python 3. Happy to patch the documentation with updated
>> install instructions (I will add the MacOS Mojave Drive permission issue
>> as well). Do you know if the minimum ansible version requirement has
>> changed at all? (some documentation lists it as 2.0.0.2 or 2.2.2.0). Do
>> you guys usually put in a Jira to cover small documentation changes at all?
>>
>> Regards,
>> Dale.
>> 
>> From: Otto Fowler 
>> Sent: Sunday, 7 April 2019 1:40 PM
>> To: dev@metron.apache.org
>> Subject: Re: Problems with Dev deployment.
>>
>> Can you pull down https://github.com/apache/metron/pull/1261
>> and try?
>> That should eliminate any env. issues.
>>
>> I’ll update it to latest master now.
>>
>>
>>
>> On April 7, 2019 at 08:33:52, Dale Richardson (tigerqu...@outlook.com)
>> wrote:
>>
>> Hi Folks,
>> I've been walking through some of the documentation on building and
>> deploying the Metron development image.
>> I am building Metron version 0.7. The deployment appears to go OK up until
>> the Ambari deployment stage, and then I get the following error:
>>
>>
>> SyntaxError: invalid syntax
>>
>> INFO interface: detail: The full traceback is:
>>
>> Traceback (most recent call last):
>>
>> File
>> "/usr/local/Cellar/ansible/2.7.9/libexec/lib/python3.7/site-packages/ansible/executor/task_executor.py",
>> line 140, in run
>>
>> res = self._execute()
>>
>> File
>> "/usr/local/Cellar/ansible/2.7.9/libexec/lib/python3.7/site-packages/ansible/executor/task_executor.py",
>> lin

Re: Problems with Dev deployment.

2019-04-07 Thread Otto Fowler
Can you pull down https://github.com/apache/metron/pull/1261 and try?
That should eliminate any env. issues.

I’ll update it to latest master now.



On April 7, 2019 at 08:33:52, Dale Richardson (tigerqu...@outlook.com)
wrote:

Hi Folks,
I've been walking through some of the documentation on building and
deploying the Metron development image.
I am building Metron version 0.7. The deployment appears to go OK up until
the Ambari deployment stage, and then I get the following error:


SyntaxError: invalid syntax

INFO interface: detail: The full traceback is:

Traceback (most recent call last):
  File "/usr/local/Cellar/ansible/2.7.9/libexec/lib/python3.7/site-packages/ansible/executor/task_executor.py", line 140, in run
    res = self._execute()
  File "/usr/local/Cellar/ansible/2.7.9/libexec/lib/python3.7/site-packages/ansible/executor/task_executor.py", line 612, in _execute
    result = self._handler.run(task_vars=variables)
  File "/usr/local/Cellar/ansible/2.7.9/libexec/lib/python3.7/site-packages/ansible/plugins/action/normal.py", line 46, in run
    result = merge_hash(result, self._execute_module(task_vars=task_vars, wrap_async=wrap_async))
  File "/usr/local/Cellar/ansible/2.7.9/libexec/lib/python3.7/site-packages/ansible/plugins/action/__init__.py", line 742, in _execute_module
    (module_style, shebang, module_data, module_path) = self._configure_module(module_name=module_name, module_args=module_args, task_vars=task_vars)
  File "/usr/local/Cellar/ansible/2.7.9/libexec/lib/python3.7/site-packages/ansible/plugins/action/__init__.py", line 178, in _configure_module
    environment=final_environment)
  File "/usr/local/Cellar/ansible/2.7.9/libexec/lib/python3.7/site-packages/ansible/executor/module_common.py", line 973, in modify_module
    environment=environment)
  File "/usr/local/Cellar/ansible/2.7.9/libexec/lib/python3.7/site-packages/ansible/executor/module_common.py", line 791, in _find_module_utils
    recursive_finder(module_name, b_module_data, py_module_names, py_module_cache, zf)
  File "/usr/local/Cellar/ansible/2.7.9/libexec/lib/python3.7/site-packages/ansible/executor/module_common.py", line 538, in recursive_finder
    tree = ast.parse(data)
  File "/usr/local/opt/python/Frameworks/Python.framework/Versions/3.7/lib/python3.7/ast.py", line 35, in parse
    return compile(source, filename, mode, PyCF_ONLY_AST)
  File "", line 230
    except requests.ConnectionError, e:
                                   ^

macOS Mojave (Version 10.14.3)
VirtualBox version 5.2.26
Vagrant version 2.2.4

ansible 2.7.9
  config file = /Users/user/work/metron/metron-deployment/development/centos6/ansible.cfg
  ansible python module location = /usr/local/Cellar/ansible/2.7.9/libexec/lib/python3.7/site-packages/ansible
  executable location = /usr/local/bin/ansible
  python version = 3.7.2 (default, Feb 12 2019, 08:15:36) [Clang 10.0.0 (clang-1000.11.45.5)]



Python2 version is 2.7.16

commit 2263983761e77b7eec52f70f5e8f8001bac83125 (HEAD, tag:
apache-metron_0.7.0-release, tag: apache-metron-0.7.0-rc1,
origin/Metron_0.7.0)

Does anybody have any suggestions on how to proceed?

Thanks,
Dale.
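
A note on the traceback above: the last frame shows `ast.parse` rejecting
the line `except requests.ConnectionError, e:`. That comma form is Python 2
syntax; Python 3 only accepts `except ... as e`, so an Ansible install
running on Python 3.7 fails while parsing the module source, before it runs
anything. The behaviour can be reproduced in isolation with the sketch below.
The two snippets are stand-ins written for illustration (using `ValueError`
so no third-party import is needed), not the actual Metron module code.

```python
import ast

# Python 2 style handler, mirroring the form shown in the traceback.
PY2_STYLE = """
try:
    pass
except ValueError, e:
    pass
"""

# The equivalent handler in the form Python 3 accepts.
PY3_STYLE = """
try:
    pass
except ValueError as e:
    pass
"""


def parses(source):
    """Return True if the running interpreter can parse the source text."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False
```

Under a Python 3 interpreter, `parses(PY2_STYLE)` is false and
`parses(PY3_STYLE)` is true, which is consistent with the failure going away
once Ansible runs on Python 2.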


Re: Problems with Dev deployment.

2019-04-07 Thread Otto Fowler
Check the readme for instructions


On April 7, 2019 at 09:40:34, Otto Fowler (ottobackwa...@gmail.com) wrote:

Can you pull down https://github.com/apache/metron/pull/1261 and try?
That should eliminate any env. issues.

I’ll update it to latest master now.



On April 7, 2019 at 08:33:52, Dale Richardson (tigerqu...@outlook.com)
wrote:

Hi Folks,
I've been walking through some of the documentation on building and
deploying the Metron development image.
I am building Metron version 0.7. The deployment appears to go OK up until
the Ambari deployment stage, and then I get the following error:


SyntaxError: invalid syntax

INFO interface: detail: The full traceback is:

Traceback (most recent call last):
  File "/usr/local/Cellar/ansible/2.7.9/libexec/lib/python3.7/site-packages/ansible/executor/task_executor.py", line 140, in run
    res = self._execute()
  File "/usr/local/Cellar/ansible/2.7.9/libexec/lib/python3.7/site-packages/ansible/executor/task_executor.py", line 612, in _execute
    result = self._handler.run(task_vars=variables)
  File "/usr/local/Cellar/ansible/2.7.9/libexec/lib/python3.7/site-packages/ansible/plugins/action/normal.py", line 46, in run
    result = merge_hash(result, self._execute_module(task_vars=task_vars, wrap_async=wrap_async))
  File "/usr/local/Cellar/ansible/2.7.9/libexec/lib/python3.7/site-packages/ansible/plugins/action/__init__.py", line 742, in _execute_module
    (module_style, shebang, module_data, module_path) = self._configure_module(module_name=module_name, module_args=module_args, task_vars=task_vars)
  File "/usr/local/Cellar/ansible/2.7.9/libexec/lib/python3.7/site-packages/ansible/plugins/action/__init__.py", line 178, in _configure_module
    environment=final_environment)
  File "/usr/local/Cellar/ansible/2.7.9/libexec/lib/python3.7/site-packages/ansible/executor/module_common.py", line 973, in modify_module
    environment=environment)
  File "/usr/local/Cellar/ansible/2.7.9/libexec/lib/python3.7/site-packages/ansible/executor/module_common.py", line 791, in _find_module_utils
    recursive_finder(module_name, b_module_data, py_module_names, py_module_cache, zf)
  File "/usr/local/Cellar/ansible/2.7.9/libexec/lib/python3.7/site-packages/ansible/executor/module_common.py", line 538, in recursive_finder
    tree = ast.parse(data)
  File "/usr/local/opt/python/Frameworks/Python.framework/Versions/3.7/lib/python3.7/ast.py", line 35, in parse
    return compile(source, filename, mode, PyCF_ONLY_AST)
  File "", line 230
    except requests.ConnectionError, e:
                                   ^

macOS Mojave (Version 10.14.3)
VirtualBox version 5.2.26
Vagrant version 2.2.4

ansible 2.7.9
  config file = /Users/user/work/metron/metron-deployment/development/centos6/ansible.cfg
  ansible python module location = /usr/local/Cellar/ansible/2.7.9/libexec/lib/python3.7/site-packages/ansible
  executable location = /usr/local/bin/ansible
  python version = 3.7.2 (default, Feb 12 2019, 08:15:36) [Clang 10.0.0 (clang-1000.11.45.5)]



Python2 version is 2.7.16

commit 2263983761e77b7eec52f70f5e8f8001bac83125 (HEAD, tag:
apache-metron_0.7.0-release, tag: apache-metron-0.7.0-rc1,
origin/Metron_0.7.0)

Does anybody have any suggestions on how to proceed?

Thanks,
Dale.


Master is failing RAT check

2019-03-30 Thread Otto Fowler
https://travis-ci.org/apache/metron

Looks like an issue with node.


Re: [DISCUSS] Ambari MPacks, Metron upgrades

2019-03-29 Thread Otto Fowler
The long and the short of it is that I don't think there is likely
to be much more time invested in enhancing, upgrading, or expanding the
Ambari MPack features for Metron from this point on. We have many users
that currently depend on the installation process, and I think the option
should continue to exist, but we should start the process of decoupling our
installation from Ambari.


That seems like just the short of it.  Why don’t you think it is likely
more time will be used on mpack?

Can you elaborate?



On March 29, 2019 at 13:12:53, Michael Miklavcic (
michael.miklav...@gmail.com) wrote:

Hi devs,

I wanted to kickstart a high level discussion on what the future of
development on Ambari MPacks and upgrades will/should look like ongoing for
Metron. The long and the short of it is that I don't think there is likely
to be much more time invested in enhancing, upgrading, or expanding the
Ambari MPack features for Metron from this point on. We have many users
that currently depend on the installation process, and I think the option
should continue to exist, but we should start the process of decoupling our
installation from Ambari.

I propose that we extract and decouple our install python scripts that are
in the MPack. We're already executing on decoupling our core platform code
wrt Storm. The scripts are already pretty well modularized, but we'll want
to replace some of the core Ambari features we currently depend on with our
own implementations. I'll emphasize that this refactoring should not and
cannot break the Ambari install, at least for now - my thought is that
validation would include a successful deployment with fulldev and the
Amazon EC2 cloud deployment script. We can re-evaluate what long term
support for Ambari will look like at a later time. For now, I want to
emphasize that this is just to enable flexibility.

The other nut to crack here is upgrades, and more to the point of why I'm
starting this thread in the first place. We've talked about it a while now,
and I think it's about time we got serious about paving the way to making
upgrades easier. I think we should make this process a pluggable set of
scripts as well. And from there, the community can plug that into whatever
cluster management software they wish.

Does this sound reasonable? I'm eager to hear everyone's thoughts.

Best,
Mike Miklavcic


Re: [DISCUSS] Central Navigation for Alerts and Management UI

2019-03-11 Thread Otto Fowler
Maybe you should post to the users@ list


On March 11, 2019 at 06:25:48, Shane Ardell (shane.m.ard...@gmail.com)
wrote:

Thank you both for explaining the original design choice. That does make
sense.

However, like both of you point out, I am also curious if anyone has direct
user feedback indicating a need for either persona to switch between
screens.

Shane

On Tue, Mar 5, 2019 at 7:25 PM Rita McKissick 
wrote:

> That was my thought, too. The Management UI is meant for the Operations
> persona, and the Alerts UI is meant for the SOC analyst persona. If we see a
> need for either of these personas to use both of the UIs, then the ability to
> switch between the two UIs would be great. Otherwise, I'm not sure that
> ability is necessary.
>
> As an aside, as the tech writer I would love to be able to switch between
> the two UIs, but I'm not really one of our supported personas. Darn!
>
> Rita
>
> Rita McKissick ! Sr. Technical Writer
> rmckiss...@hortonworks.com
> (mobile) 831-234-3676
>
>
> On 3/5/19, 9:50 AM, "Michael Miklavcic" 
> wrote:
>
> The original design was done with the intent to keep the user profiles (soc
> analyst vs ops personnel) separate and enable a microservices-oriented
> architecture. I don't have a strong opinion one way or the other, but I'd
> be interested to hear whether others in the community find this wall
> useful, or if we should come back to a single pane of glass.
>
> Mike
>
> On Tue, Mar 5, 2019 at 9:12 AM Shane Ardell 
> wrote:
>
> > Hello everyone,
> >
> > I recently started experimenting with implementing a navigation bar in both
> > the Alerts and Management UI. It would allow us to navigate between the two
> > UIs through links instead of manually entering a URL or opening separate
> > tabs from Ambari.
> >
> > I'm just wondering what everyone's thoughts are. Is this something we want
> > in Metron?
> >
>
>
>


Re: [DISCUSS] Upgrading HBase and Kafka support

2019-03-08 Thread Otto Fowler
https://github.com/apache/metron-bro-plugin-kafka/tree/master/docker


On March 8, 2019 at 14:28:20, zeo...@gmail.com (zeo...@gmail.com) wrote:

So most importantly I want to make sure to give Otto credit for being the
one who cleaned up the rudimentary testing steps we had for testing the
plugin and turned it into the docker end to end. Right now we manually run
the tests, as there were a few follow-ons we needed to work through before
it's ready for Travis. In my opinion, once METRON-2003 (PR 26) gets in
it'll be ready to have Travis. There isn't any current Maven use.

Jon

On Fri, Mar 8, 2019 at 12:26 PM Otto Fowler 
wrote:

> I believe that Testcontainers allows the IDE case
>
>
> On March 8, 2019 at 11:38:24, Michael Miklavcic (
> michael.miklav...@gmail.com)
> wrote:
>
> I'm -1 on #1 unless there's some desperately compelling reason to go that
> route. It would be a regression in our test coverage, and at that point
> it's really just duplicating our unit tests as opposed to checking our
> integration.
>
> I'm good with 3. Gating factors for a successful implementation would be
> that as a developer I can:
>
> 1. Run it in my IDE without having to do anything extra (the beauty of
> the in-mem component is that @BeforeClass spins it up automatically - we
> should keep doing something along those lines)
> 2. Run it via Maven cli
> 3. Run it in Travis as part of our normal build
>
> It's probably worth looking at Kafka's testing infrastructure straight from
> the source - https://github.com/apache/kafka/blob/trunk/tests/README.md.
> They leverage Docker containers now for system tests.
>
> Best,
> Mike
>
>
> On Fri, Mar 8, 2019 at 7:47 AM Ryan Merriman  wrote:
>
> > I have been researching the effort involved to upgrade to HDP 3. Along the
> > way I've found a couple challenging issues that we will need to solve, both
> > involving our integration testing strategy.
> >
> > The first issue is Kafka. We are moving from 0.10.0 to 2.0.0 and there
> > have been significant changes to the API. This creates an issue in the
> > KafkaComponent class, which we use as an in-memory Kafka server in
> > integration tests. Most of the classes that were previously used have gone
> > away, and to the best of my knowledge, were not supported as public APIs.
> > I also don't see any publicly documented APIs to replace them.
> >
> > The second issue is HBase. We are moving from 1.1.2 to 2.0.2 so another
> > significant change. This creates an issue in the MockHTable class
> > because the HTableInterface class has changed to Table, essentially
> > requiring that MockHTable be rewritten to conform to the new interface.
> > It's my opinion that this class is complicated and difficult to maintain as
> > it is anyways.
> >
> > These 2 issues have the potential to add a significant amount of work to
> > upgrading Metron to HDP 3. I want to take a step back and review our
> > options before we move forward. Here are some initial thoughts I had on
> > how to approach this. For HBase:
> >
> > 1. Update MockHTable to work with the new HBase API. We would continue
> > using a mock server approach for HBase.
> > 2. Research replacing MockHTable with an in-memory HBase server.
> > 3. Replace MockHTable with a Docker container running HBase.
> >
> > For Kafka:
> >
> > 1. Replace KafkaComponent with a mock server implementation.
> > 2. Update KafkaComponent to work with the new API. We would probably
> > need to leverage some internal Kafka classes. I do not see a testing API
> > documented publicly.
> > 3. Replace KafkaComponent with a Docker container running Kafka.
> >
> > What other options are there? Whatever we choose I think we should follow
> > a similar approach for both (mock servers, in memory servers, Docker, other
> > options I'm not thinking of).
> >
> > This will not shock anyone but I would be in favor of Docker containers.
> > They have the advantage of classpath isolation, easy upgrades, and accurate
> > integration testing. The downside is we will have to adjust our tests and
> > travis script to incorporate these Docker containers into our build
> > process. We have discussed this at length in the past and it has generally
> > stalled for various reasons. Maybe if we move a few services at a time it
> > might be more palatable? As for the other 2 approaches, I think if either
> > worked well we wouldn't be having this discussion. Mock servers are hard
> > to maintain and I don't see in memory testing classes documented in
> > javadocs for either service.
> >
> > Thoughts?
> >
>
-- 

Jon Zeolla


Re: [DISCUSS] Upgrading HBase and Kafka support

2019-03-08 Thread Otto Fowler
I believe that Testcontainers allows the IDE case


On March 8, 2019 at 11:38:24, Michael Miklavcic (michael.miklav...@gmail.com)
wrote:

I'm -1 on #1 unless there's some desperately compelling reason to go that
route. It would be a regression in our test coverage, and at that point
it's really just duplicating our unit tests as opposed to checking our
integration.

I'm good with 3. Gating factors for a successful implementation would be
that as a developer I can:

1. Run it in my IDE without having to do anything extra (the beauty of
the in-mem component is that @BeforeClass spins it up automatically - we
should keep doing something along those lines)
2. Run it via Maven cli
3. Run it in Travis as part of our normal build

It's probably worth looking at Kafka's testing infrastructure straight from
the source - https://github.com/apache/kafka/blob/trunk/tests/README.md.
They leverage Docker containers now for system tests.

Best,
Mike


On Fri, Mar 8, 2019 at 7:47 AM Ryan Merriman  wrote:

> I have been researching the effort involved to upgrade to HDP 3. Along the
> way I've found a couple challenging issues that we will need to solve, both
> involving our integration testing strategy.
>
> The first issue is Kafka. We are moving from 0.10.0 to 2.0.0 and there
> have been significant changes to the API. This creates an issue in the
> KafkaComponent class, which we use as an in-memory Kafka server in
> integration tests. Most of the classes that were previously used have gone
> away, and to the best of my knowledge, were not supported as public APIs.
> I also don't see any publicly documented APIs to replace them.
>
> The second issue is HBase. We are moving from 1.1.2 to 2.0.2 so another
> significant change. This creates an issue in the MockHTable class
> because the HTableInterface class has changed to Table, essentially
> requiring that MockHTable be rewritten to conform to the new interface.
> It's my opinion that this class is complicated and difficult to maintain as
> it is anyways.
>
> These 2 issues have the potential to add a significant amount of work to
> upgrading Metron to HDP 3. I want to take a step back and review our
> options before we move forward. Here are some initial thoughts I had on
> how to approach this. For HBase:
>
> 1. Update MockHTable to work with the new HBase API. We would continue
> using a mock server approach for HBase.
> 2. Research replacing MockHTable with an in-memory HBase server.
> 3. Replace MockHTable with a Docker container running HBase.
>
> For Kafka:
>
> 1. Replace KafkaComponent with a mock server implementation.
> 2. Update KafkaComponent to work with the new API. We would probably
> need to leverage some internal Kafka classes. I do not see a testing API
> documented publicly.
> 3. Replace KafkaComponent with a Docker container running Kafka.
>
> What other options are there? Whatever we choose I think we should follow
> a similar approach for both (mock servers, in memory servers, Docker, other
> options I'm not thinking of).
>
> This will not shock anyone but I would be in favor of Docker containers.
> They have the advantage of classpath isolation, easy upgrades, and accurate
> integration testing. The downside is we will have to adjust our tests and
> travis script to incorporate these Docker containers into our build
> process. We have discussed this at length in the past and it has generally
> stalled for various reasons. Maybe if we move a few services at a time it
> might be more palatable? As for the other 2 approaches, I think if either
> worked well we wouldn't be having this discussion. Mock servers are hard
> to maintain and I don't see in memory testing classes documented in
> javadocs for either service.
>
> Thoughts?
>


Re: [DISCUSS] Upgrading HBase and Kafka support

2019-03-08 Thread Otto Fowler
I think I have mentioned it before, but https://www.testcontainers.org could
be a viable approach for this methodology (3).
I would think it would be worth looking at.



On March 8, 2019 at 09:47:54, Ryan Merriman (merrim...@gmail.com) wrote:

I have been researching the effort involved to upgrade to HDP 3. Along the
way I've found a couple challenging issues that we will need to solve, both
involving our integration testing strategy.

The first issue is Kafka. We are moving from 0.10.0 to 2.0.0 and there
have been significant changes to the API. This creates an issue in the
KafkaComponent class, which we use as an in-memory Kafka server in
integration tests. Most of the classes that were previously used have gone
away, and to the best of my knowledge, were not supported as public APIs.
I also don't see any publicly documented APIs to replace them.

The second issue is HBase. We are moving from 1.1.2 to 2.0.2 so another
significant change. This creates an issue in the MockHTable class
because the HTableInterface class has changed to Table, essentially
requiring that MockHTable be rewritten to conform to the new interface.
It's my opinion that this class is complicated and difficult to maintain as
it is anyways.

These 2 issues have the potential to add a significant amount of work to
upgrading Metron to HDP 3. I want to take a step back and review our
options before we move forward. Here are some initial thoughts I had on
how to approach this. For HBase:

1. Update MockHTable to work with the new HBase API. We would continue
using a mock server approach for HBase.
2. Research replacing MockHTable with an in-memory HBase server.
3. Replace MockHTable with a Docker container running HBase.

For Kafka:

1. Replace KafkaComponent with a mock server implementation.
2. Update KafkaComponent to work with the new API. We would probably
need to leverage some internal Kafka classes. I do not see a testing API
documented publicly.
3. Replace KafkaComponent with a Docker container running Kafka.

What other options are there? Whatever we choose I think we should follow
a similar approach for both (mock servers, in memory servers, Docker, other
options I'm not thinking of).

This will not shock anyone but I would be in favor of Docker containers.
They have the advantage of classpath isolation, easy upgrades, and accurate
integration testing. The downside is we will have to adjust our tests and
travis script to incorporate these Docker containers into our build
process. We have discussed this at length in the past and it has generally
stalled for various reasons. Maybe if we move a few services at a time it
might be more palatable? As for the other 2 approaches, I think if either
worked well we wouldn't be having this discussion. Mock servers are hard
to maintain and I don't see in memory testing classes documented in
javadocs for either service.

Thoughts?


Re: [DISCUSS] Architecture documentation

2019-02-25 Thread Otto Fowler
I really like the idea of architecture.md -> **/architecture.md.

We overall do not have javadoc in a lot of areas, and could maybe start
working on it as we go and think about asking for it in reviews.
We are also missing the Parser Programmer’s Guide, how to add a parser to
the metron system/install etc and other things.



On February 25, 2019 at 15:22:47, Ryan Merriman (merrim...@gmail.com) wrote:

I feel like the code itself is pretty well documented. I updated existing
javadocs and added javadocs to classes that didn't have them before this
PR. In my opinion the level of documentation for these classes has
increased significantly.

On Mon, Feb 25, 2019 at 1:52 PM Michael Miklavcic <
michael.miklav...@gmail.com> wrote:

> Tentatively agreed on further clarification of what we consider in/out of
> scope for documentation re: document something that wasn't documented
> before. Ryan, can you give a quick summary of what you *have* added/updated
> in documentation on this PR vs what you want to leave out?
>
> My initial concern in punting on docs right now is that part of what made
> this PR/task more challenging in the first place was not having
> documentation. We risk losing context and detail again if we don't do this
> immediately. Would it be reasonable to split it up as follows?:
>
> 1. Additional overarching documentation feels out of scope - make it a
> follow on (see comments below).
> 2. Adding documentation to our existing README's and java code comments
> that describe the new/modified functionality should be in scope because
> it's part of the unit of work. I expect that a developer should be able to
> look at the code, tests, comments, and README's and understand how this
> code functions without having to start from scratch.
>
> The way we've handled follow-on work before, at least as far as feature
> branches are concerned, was to create Jiras and link them to the
> appropriate discussions for context. Maybe we can take that one step
> further and do the release manager a favor by also labeling the
> required/requested release on the Jira as a gating factor. This follows our
> pattern for intermittent test failure reporting, e.g.
> https://issues.apache.org/jira/browse/METRON-1946?jql=project%20%3D%20METRON%20AND%20resolution%20%3D%20Unresolved%20AND%20labels%20%3D%20test-failure%20ORDER%20BY%20priority%20DESC%2C%20updated%20DESC
>
> I'm also in favor of continuing to document architecture and technical
> details as part of the code base as Ryan and Jon have suggested. I think we
> should have an "architecture.md" in metron root that replaces this -
> https://github.com/apache/metron/blob/d7d4fd9afb19e2bd2e66babb7e1514a19eae07d0/README.md#navigating-the-architecture
> and covers the broad architecture with links to the appropriate modules for
> detail. Minimally, it would be nice if we had a simple diagram showing the
> basic flow of data in Metron. I think we probably want an updated version
> of this wiki entry from back in the day -
> https://cwiki.apache.org/confluence/display/METRON/Metron+Architecture
>
> Best,
> Mike
>
>
> On Mon, Feb 25, 2019 at 7:18 AM Nick Allen  wrote:
>
> > I don't think we should hold up this work to document something that wasn't
> > previously documented. A follow-on is sufficient.
> >
> > On Mon, Feb 25, 2019 at 8:50 AM Ryan Merriman 
> wrote:
> >
> > > Recently I submitted a PR that
> > > introduces a large number of changes to a critical part of our code base.
> > > Reviewers feel like it is significant enough to document at an
> > > architectural level (and I agree). There are a couple points I would like
> > > to clarify.
> > >
> > > Generally architectural documentation lives in the README of the
> > > appropriate module. Do we want to continue documenting architecture here?
> > > I think it makes sense because it will be versioned along with the code.
> > > Just wanted to confirm there are no objections to continuing this practice.
> > >
> > > A reviewer suggested we could accept the PR as is and leave the
> > > architectural documentation as a follow on. I think this makes sense
> > > because it can be tedious to maintain a large PR as other smaller commits
> > > are accepted into master. An important requirement is the documentation
> > > follow on must be completed in a timely manner, before the next release.
> > > Are there any objections to doing it this way?
> > >
> >
>


RE: How to provide hbase-site.xml to Stellar Processor Java API

2019-01-25 Thread Otto Fowler
Please link it to https://issues.apache.org/jira/browse/METRON-1409.
This is a jira around hosting stellar that I created a while ago.


On January 25, 2019 at 07:02:30, Anil Donthireddy (
anil.donthire...@sstech.us) wrote:

Hi Mohan DV,



I appreciate your time for sharing the much useful information.



As per my understanding there seems to be no scope to pass my own built
HBase configuration object to execute Stellar queries in my extended API. I
may need to rewrite a lot of things in my extended API, following the way the
Stellar processor works, to override the HBase configuration.



@Otto Fowler: I will raise a Jira case with the usecase and the issue.



Thanks,

Anil.



*From:* Otto Fowler [mailto:ottobackwa...@gmail.com]
*Sent:* Thursday, January 24, 2019 11:15 PM
*To:* Anil Donthireddy ; u...@metron.apache.org;
dev@metron.apache.org; Mohan Venkateshaiah 
*Cc:* Maxim Dashenko ; Satish Abburi <
satish.abb...@sstech.us>; James Sirota ;
Christopher Berry 
*Subject:* Re: How to provide hbase-site.xml to Stellar Processor Java API



But still file a jira ;)





On January 24, 2019 at 12:19:34, Mohan Venkateshaiah (
mvenkatesha...@hortonworks.com) wrote:

Hi Anil,

I have done something similar to this in the past. In Stellar, to get the
HBase configuration we call HBaseConfiguration.create(); in that call HBase
adds hbase-site and core-site as resources to the config. We probably SHOULD
let people specify a base config.

What I had done was in the global config, I set a property called
hbase.provider.impl, it's the fully qualified class name for a class that
implements the TableProvider interface which has one method:

public HTableInterface getTable(Configuration config, String tableName)
throws IOException

If you implement your own, where you ignore the config argument and resolve
the HBase table with your own injected config, that will work.

Thanks
Mohan DV

On 1/24/19, 8:56 PM, "Otto Fowler"  wrote:

Hi Anil,
Can you create a jira on this with these details and a general overview of
your use case?
It looks like the HbaseConfiguration we use in the HTableConnector
is done using the create() method, which creates from resources.

I think we would need to do some work to support the external file.



On January 24, 2019 at 10:14:46, Anil Donthireddy (
anil.donthire...@sstech.us) wrote:

Hi,



I have written a Java application which uses the Stellar processor and
executes Stellar expressions. The issue I am facing is that I am unable to
connect to HBase unless I place hbase-site.xml in the src/main/resources/
folder of the code. As packaging hbase-site.xml with the jar is not the
proper way, I would like to understand how hbase-site.xml is set on the
classpath while starting the profiler topology.



The ways I tried are

1) Setting the classpath to hbase conf folder using command “java -cp
$CLASSPATH:/etc/hbase/conf:/etc/Hadoop/conf -jar myJar.jar”

2) Adding Hbase conf folder to HADOOP_CLASSPATH. Below is the Hadoop
classpath

/usr/hdp/2.6.1.0-129/hadoop/conf:/usr/hdp/2.6.1.0-129/hadoop/lib/*:/usr/hdp/2.6.1.0-129/hadoop/.//*:/usr/hdp/2.6.1.0-129/hadoop-hdfs/./:/usr/hdp/2.6.1.0-129/hadoop-hdfs/lib/*:/usr/hdp/2.6.1.0-129/hadoop-hdfs/.//*:/usr/hdp/2.6.1.0-129/hadoop-yarn/lib/*:/usr/hdp/2.6.1.0-129/hadoop-yarn/.//*:/usr/hdp/2.6.1.0-129/hadoop-mapreduce/lib/*:/usr/hdp/2.6.1.0-129/hadoop-mapreduce/.//*::mysql-connector-java.jar:postgresql-jdbc2ee.jar:postgresql-jdbc2.jar:postgresql-jdbc3.jar:postgresql-jdbc.jar:/etc/hbase/conf/:/usr/hdp/2.6.1.0-129/tez/*:/usr/hdp/2.6.1.0-129/tez/lib/*:/usr/hdp/2.6.1.0-129/tez/conf




One more step I would like to try to fix the issue is to set property “
*zookeeper.znode.parent*” for configuration object while instantiating
HbaseConnector. But it is in the scope of metron code to try this fix.



I would like to know if anyone has been able to provide hbase-site.xml to a
Java application, or has been able to extend the Metron Stellar Processor and
execute profile definitions successfully.

Please provide any inputs to resolve the issue.



Thanking you.



Thanks,

Anil.


Re: How to provide hbase-site.xml to Stellar Processor Java API

2019-01-24 Thread Otto Fowler
But still file a jira ;)


On January 24, 2019 at 12:19:34, Mohan Venkateshaiah (
mvenkatesha...@hortonworks.com) wrote:

Hi Anil,

I have done something similar to this in the past. In Stellar, to get the
HBase configuration we call HBaseConfiguration.create(); in that call HBase
adds hbase-site and core-site as resources to the config. We probably SHOULD
let people specify a base config.

What I had done was in the global config, I set a property called
hbase.provider.impl, it's the fully qualified class name for a class that
implements the TableProvider interface which has one method:

public HTableInterface getTable(Configuration config, String tableName)
throws IOException

If you implement your own, where you ignore the config argument and resolve
the HBase table with your own injected config, that will work.

Thanks
Mohan DV

On 1/24/19, 8:56 PM, "Otto Fowler"  wrote:

Hi Anil,
Can you create a jira on this with these details and a general overview of
your use case?
It looks like the HbaseConfiguration we use in the HTableConnector
is done using the create() method, which creates from resources.

I think we would need to do some work to support the external file.



On January 24, 2019 at 10:14:46, Anil Donthireddy (
anil.donthire...@sstech.us) wrote:

Hi,



I have written a Java application which uses the Stellar processor and
executes Stellar expressions. The issue I am facing is that I am unable to
connect to HBase unless I place hbase-site.xml in the src/main/resources/
folder of the code. As packaging hbase-site.xml with the jar is not the
proper way, I would like to understand how hbase-site.xml is set on the
classpath while starting the profiler topology.



The ways I tried are

1) Setting the classpath to hbase conf folder using command “java -cp
$CLASSPATH:/etc/hbase/conf:/etc/Hadoop/conf -jar myJar.jar”

2) Adding Hbase conf folder to HADOOP_CLASSPATH. Below is the Hadoop
classpath

/usr/hdp/2.6.1.0-129/hadoop/conf:/usr/hdp/2.6.1.0-129/hadoop/lib/*:/usr/hdp/2.6.1.0-129/hadoop/.//*:/usr/hdp/2.6.1.0-129/hadoop-hdfs/./:/usr/hdp/2.6.1.0-129/hadoop-hdfs/lib/*:/usr/hdp/2.6.1.0-129/hadoop-hdfs/.//*:/usr/hdp/2.6.1.0-129/hadoop-yarn/lib/*:/usr/hdp/2.6.1.0-129/hadoop-yarn/.//*:/usr/hdp/2.6.1.0-129/hadoop-mapreduce/lib/*:/usr/hdp/2.6.1.0-129/hadoop-mapreduce/.//*::mysql-connector-java.jar:postgresql-jdbc2ee.jar:postgresql-jdbc2.jar:postgresql-jdbc3.jar:postgresql-jdbc.jar:/etc/hbase/conf/:/usr/hdp/2.6.1.0-129/tez/*:/usr/hdp/2.6.1.0-129/tez/lib/*:/usr/hdp/2.6.1.0-129/tez/conf




One more step I would like to try to fix the issue is to set property “
*zookeeper.znode.parent*” for configuration object while instantiating
HbaseConnector. But it is in the scope of metron code to try this fix.



I would like to know if anyone has been able to provide hbase-site.xml to a
Java application, or has been able to extend the Metron Stellar Processor and
execute profile definitions successfully.

Please provide any inputs to resolve the issue.



Thanking you.



Thanks,

Anil.


Re: How to provide hbase-site.xml to Stellar Processor Java API

2019-01-24 Thread Otto Fowler
Hi Anil,
Can you create a jira on this with these details and a general overview of
your use case?
It looks like the HbaseConfiguration we use in the HTableConnector
is done using the create() method, which creates from resources.

I think we would need to do some work to support the external file.



On January 24, 2019 at 10:14:46, Anil Donthireddy (
anil.donthire...@sstech.us) wrote:

Hi,



I have written a Java application which uses the Stellar processor and
executes Stellar expressions. The issue I am facing is that I am unable to
connect to HBase unless I place hbase-site.xml in the src/main/resources/
folder of the code. As packaging hbase-site.xml with the jar is not the
proper way, I would like to understand how hbase-site.xml is set on the
classpath while starting the profiler topology.



The ways I tried are

1)  Setting the classpath to hbase conf folder using command “java -cp
$CLASSPATH:/etc/hbase/conf:/etc/Hadoop/conf -jar myJar.jar”

2)  Adding Hbase conf folder to HADOOP_CLASSPATH. Below is the Hadoop
classpath

/usr/hdp/2.6.1.0-129/hadoop/conf:/usr/hdp/2.6.1.0-129/hadoop/lib/*:/usr/hdp/2.6.1.0-129/hadoop/.//*:/usr/hdp/2.6.1.0-129/hadoop-hdfs/./:/usr/hdp/2.6.1.0-129/hadoop-hdfs/lib/*:/usr/hdp/2.6.1.0-129/hadoop-hdfs/.//*:/usr/hdp/2.6.1.0-129/hadoop-yarn/lib/*:/usr/hdp/2.6.1.0-129/hadoop-yarn/.//*:/usr/hdp/2.6.1.0-129/hadoop-mapreduce/lib/*:/usr/hdp/2.6.1.0-129/hadoop-mapreduce/.//*::mysql-connector-java.jar:postgresql-jdbc2ee.jar:postgresql-jdbc2.jar:postgresql-jdbc3.jar:postgresql-jdbc.jar:/etc/hbase/conf/:/usr/hdp/2.6.1.0-129/tez/*:/usr/hdp/2.6.1.0-129/tez/lib/*:/usr/hdp/2.6.1.0-129/tez/conf
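
[Editor's sketch] One possible factor in approach 1, offered as an inference
rather than something confirmed in this thread: the java launcher ignores -cp
and CLASSPATH when -jar is used, so /etc/hbase/conf would never reach the
classpath that way. The replies in this thread note that
HBaseConfiguration.create() builds the config from classpath resources, so a
quick check is to ask the class loader whether it can see hbase-site.xml. The
class name ClasspathResourceCheck below is hypothetical and uses only the JDK.

```java
import java.net.URL;

// Hypothetical diagnostic helper, not part of Metron or HBase.
class ClasspathResourceCheck {

  // Returns the URL of the named resource if the class loader can see it, else null.
  static URL locate(String resourceName) {
    ClassLoader loader = Thread.currentThread().getContextClassLoader();
    if (loader == null) {
      loader = ClassLoader.getSystemClassLoader();
    }
    return loader.getResource(resourceName);
  }

  public static void main(String[] args) {
    // Defaults to hbase-site.xml; pass another resource name as the first argument.
    String name = args.length > 0 ? args[0] : "hbase-site.xml";
    URL url = locate(name);
    if (url == null) {
      System.out.println(name + " is NOT visible on the classpath");
    } else {
      System.out.println(name + " found at " + url);
    }
  }
}
```

If the check reports the file as missing under -jar, launching with an
explicit main class instead (for example "java -cp myJar.jar:/etc/hbase/conf
com.example.Main", where com.example.Main is a placeholder) would be one way
to test whether the classpath is the cause.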



One more step I would like to try to fix the issue is to set property “
*zookeeper.znode.parent*” for configuration object while instantiating
HbaseConnector. But it is in the scope of metron code to try this fix.



I would like to know if anyone has been able to provide hbase-site.xml to a
Java application, or has been able to extend the Metron Stellar Processor and
execute profile definitions successfully.

Please provide any inputs to resolve the issue.



Thanking you.



Thanks,

Anil.


Re: [DISCUSS] Writer class refactor

2019-01-18 Thread Otto Fowler
Agreed


On January 18, 2019 at 14:52:32, Ryan Merriman (merrim...@gmail.com) wrote:

I am on board with that. In that case, I think it's even more important
that we get the Writer interfaces right.

On Fri, Jan 18, 2019 at 1:34 PM Otto Fowler 
wrote:

> I think that the writers should be loaded as, and act as extension points,
> such that it is possible to have 3rd party writers, and would structure
> them as such.
>
>
>
> On January 18, 2019 at 13:55:00, Ryan Merriman (merrim...@gmail.com)
> wrote:
>
> Recently there was a bug reported by a user where a parser that emits
> multiple messages from a single tuple doesn't work correctly:
> https://issues.apache.org/jira/browse/METRON-1968. This has exposed a
> problem with how the writer classes work.
>
> The fundamental issue is this: the writer classes operate under the
> assumption that there is a 1 to 1 mapping between tuples and messages to be
> written. A couple of examples:
>
> KafkaWriter
> <
>
https://github.com/apache/metron/blob/master/metron-platform/metron-writer/src/main/java/org/apache/metron/writer/kafka/KafkaWriter.java#L236>

>
> -
> This class writes messages by iterating through the list of tuples and
> fetching the message with the same index. This is the cause of the Jira
> above. We could iterate through the message list instead but then we don't
> know which tuples have been fully processed. It would be possible for a
> batch to be flushed before all messages from a tuple are passed to the
> writer.
>
> BulkWriterComponent
> <
>
https://github.com/apache/metron/blob/master/metron-platform/metron-writer/src/main/java/org/apache/metron/writer/BulkWriterComponent.java#L250>

>
> - The tuple list size is used to determine when a batch should be flushed.
> While inherently incorrect in my opinion (should be message list size),
> this also causes an issue where only the first message from the last tuple
> in a batch is written.
>
> I do not believe there are easy fixes to these problems. There is no way
> to properly store the relationship between tuples and messages to be
> written with the current BulkMessageWriter interface and BulkWriterResponse
> class. If we did have a way, how should we handle partial failures? If
> multiple messages are parsed from a tuple but only half of them are written
> successfully, what should happen? Should we replay the tuple? Should we
> just report the failed messages and continue on? I think it may be a good
> time to review our writer classes and consider a refactor. Do others
> agree? Are there easy fixes I'm missing?
>
> Assuming there is interest in refactoring, I will throw out some ideas for
> consideration. For those not as familiar with the writer classes, they are
> organized as follows (in order from lowest to highest level):
>
> Writers - These classes do the actual writing and implement the
> BulkMessageWriter or MessageWriter interfaces. There are 6 implementations
> I can see including KafkaWriter, SolrWriter, ElasticsearchWriter,
> HdfsWriter, etc. There is also an implementation that adapts a
> MessageWriter to a BulkMessageWriter (WriterToBulkWriter). The result of a
> writing operation is a BulkWriterResponse containing a list of either
> successful or failed tuples.
>
> Writer Containers - This includes the BulkWriterComponent and WriterHandler
> classes. These are responsible for batching and flushing messages,
> handling errors and acking tuples.
>
> Bolts - This includes ParserBolt, WriterBolt and BulkMessageWriterBolt.
> These classes implement the Storm Bolt interfaces, set up writers/components
> and execute tuples.
>
> I think the first step is to reevaluate the separation of concerns for
> these classes. Here is how I would change from what we currently have:
>
> Writers - These classes should only be concerned with writing messages and
> reporting what happened. They would also manage the lifecycle and
> configuration of the underlying client libraries as they do now. Instead
> of accepting 2 separate lists, they should accept a data structure that
> accurately represents the relationship between tuples and messages.
>
> Writer Containers - These classes would continue to handle batching and
> flushing but would only report the results of a flush rather than actually
> doing the acking or error handling.
>
> Bolts - These would now be responsible for acking and error reporting on
> tuples. They would transform a tuple into something the Writer Containers
> can accept as input.
>
> I think working through this and adjusting the contracts between the
> different layers will be necessary to fix the bugs described above. While
> we're at it I think there are other improvem

Re: [DISCUSS] Writer class refactor

2019-01-18 Thread Otto Fowler
I think that the writers should be loaded as, and act as extension points,
such that it is possible to have 3rd party writers, and would structure
them as such.



On January 18, 2019 at 13:55:00, Ryan Merriman (merrim...@gmail.com) wrote:

Recently there was a bug reported by a user where a parser that emits
multiple messages from a single tuple doesn't work correctly:
https://issues.apache.org/jira/browse/METRON-1968. This has exposed a
problem with how the writer classes work.

The fundamental issue is this: the writer classes operate under the
assumption that there is a 1 to 1 mapping between tuples and messages to be
written. A couple of examples:

KafkaWriter
<
https://github.com/apache/metron/blob/master/metron-platform/metron-writer/src/main/java/org/apache/metron/writer/kafka/KafkaWriter.java#L236>

-
This class writes messages by iterating through the list of tuples and
fetching the message with the same index. This is the cause of the Jira
above. We could iterate through the message list instead but then we don't
know which tuples have been fully processed. It would be possible for a
batch to be flushed before all messages from a tuple are passed to the
writer.

BulkWriterComponent
<https://github.com/apache/metron/blob/master/metron-platform/metron-writer/src/main/java/org/apache/metron/writer/BulkWriterComponent.java#L250>

- The tuple list size is used to determine when a batch should be flushed.
While inherently incorrect in my opinion (should be message list size),
this also causes an issue where only the first message from the last tuple
in a batch is written.
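
The index-pairing problem in the KafkaWriter example can be reproduced in a few lines. This is only a sketch in Python for brevity (the Metron writers are Java); the function name and the sample data are illustrative assumptions, and it assumes the tuple list holds one entry per tuple while the message list holds one entry per parsed message.

```python
def write_by_tuple_index(tuples, messages):
    """Mimic a writer that walks the tuple list and takes the message at the same index."""
    written = []
    for i, _tuple in enumerate(tuples):
        written.append(messages[i])
    return written


# Hypothetical batch: the first tuple parsed into three messages, the second into one.
tuples = ["t1", "t2"]
messages = ["t1-m1", "t1-m2", "t1-m3", "t2-m1"]

written = write_by_tuple_index(tuples, messages)
dropped = [m for m in messages if m not in written]
print("written:", written)
print("dropped:", dropped)
```

Once a single tuple yields more than one message, the two lists no longer line up, so pairing by index silently drops whatever sits past the end of the tuple list.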

I do not believe there are easy fixes to these problems. There is no way
to properly store the relationship between tuples and messages to be
written with the current BulkMessageWriter interface and BulkWriterResponse
class. If we did have a way, how should we handle partial failures? If
multiple messages are parsed from a tuple but only half of them are written
successfully, what should happen? Should we replay the tuple? Should we
just report the failed messages and continue on? I think it may be a good
time to review our writer classes and consider a refactor. Do others
agree? Are there easy fixes I'm missing?

Assuming there is interest in refactoring, I will throw out some ideas for
consideration. For those not as familiar with the writer classes, they are
organized as follows (in order from lowest to highest level):

Writers - These classes do the actual writing and implement the
BulkMessageWriter or MessageWriter interfaces. There are 6 implementations
I can see including KafkaWriter, SolrWriter, ElasticsearchWriter,
HdfsWriter, etc. There is also an implementation that adapts a
MessageWriter to a BulkMessageWriter (WriterToBulkWriter). The result of a
writing operation is a BulkWriterResponse containing a list of either
successful or failed tuples.

Writer Containers - This includes the BulkWriterComponent and WriterHandler
classes. These are responsible for batching and flushing messages,
handling errors and acking tuples.

Bolts - This includes ParserBolt, WriterBolt and BulkMessageWriterBolt.
These classes implement the Storm Bolt interfaces, set up writers/components
and execute tuples.

I think the first step is to reevaluate the separation of concerns for
these classes. Here is how I would change from what we currently have:

Writers - These classes should only be concerned with writing messages and
reporting what happened. They would also manage the lifecycle and
configuration of the underlying client libraries as they do now. Instead
of accepting 2 separate lists, they should accept a data structure that
accurately represents the relationship between tuples and messages.

Writer Containers - These classes would continue to handle batching and
flushing but would only report the results of a flush rather than actually
doing the acking or error handling.

Bolts - These would now be responsible for acking and error reporting on
tuples. They would transform a tuple into something the Writer Containers
can accept as input.
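
One way to picture the proposed split is below: the writer reports results per message, and the bolt keeps the tuple-to-message relationship and decides when to ack. This is a rough Python sketch, not Metron code; `WriteResult`, `TupleTracker`, and the string ids are hypothetical names. It also assumes one possible answer to the partial-failure question (fail the whole tuple if any of its messages fails), which the thread leaves open.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class WriteResult:
    """Outcome of one flush, reported by message id (hypothetical shape)."""
    succeeded: List[str] = field(default_factory=list)
    failed: Dict[str, str] = field(default_factory=dict)  # message id -> error text


class TupleTracker:
    """Bolt-side bookkeeping: message ids still outstanding for each tuple id."""

    def __init__(self):
        self._pending = {}  # tuple id -> set of message ids not yet written

    def register(self, tuple_id, message_ids):
        self._pending[tuple_id] = set(message_ids)

    def apply(self, result):
        """Fold in a flush result and return (tuple ids to ack, tuple ids to fail)."""
        to_ack, to_fail = [], []
        failed_ids = set(result.failed)
        for tuple_id, outstanding in list(self._pending.items()):
            if outstanding & failed_ids:
                # Assumed policy: any failed message fails the whole tuple.
                to_fail.append(tuple_id)
                del self._pending[tuple_id]
                continue
            outstanding -= set(result.succeeded)
            if not outstanding:
                # Every message from this tuple has been written.
                to_ack.append(tuple_id)
                del self._pending[tuple_id]
        return to_ack, to_fail


tracker = TupleTracker()
tracker.register("t1", ["m1", "m2", "m3"])
tracker.register("t2", ["m4"])

# A batch flushes before all of t1's messages have been written.
first = tracker.apply(WriteResult(succeeded=["m1", "m2", "m4"]))
second = tracker.apply(WriteResult(succeeded=["m3"]))

tracker.register("t3", ["m5", "m6"])
third = tracker.apply(WriteResult(succeeded=["m5"], failed={"m6": "timeout"}))
print(first, second, third)
```

Because the tracker only uses opaque ids, the same bookkeeping would work without any Storm types in the writers or writer containers.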

I think working through this and adjusting the contracts between the
different layers will be necessary to fix the bugs described above. While
we're at it I think there are other improvements we could also make:

Decouple Storm - It would be beneficial to remove the dependency on tuples
in our writers and writer containers. We could replace this with a simple
abstraction (an id would probably work fine). This will allow us to more
easily port Metron to other streaming platforms.

Remove MessageWriter Interface - This is not being actively used as far as
I can tell. Is that true? Removing this will make our code simpler and
easier to follow (WriterHandler and WriterToBulkWriter classes can probably
go away). I don't see any reason future writers, even those without bulk
writing capabilities, could not fit into the BulkMessageWriter interface.
A writer could either iterate through messages and write 

Re: Site-book broken in master

2018-12-20 Thread Otto Fowler
https://github.com/oasp/asciidoc-link-checker


On December 20, 2018 at 12:59:39, Otto Fowler (ottobackwa...@gmail.com)
wrote:

+1 for merging a fix.



On December 20, 2018 at 12:43:57, Michael Miklavcic (
michael.miklav...@gmail.com) wrote:

Thanks for the feedback Otto. I'm not too familiar with our infrastructure
as far as the impact of switching to AsciiDoc goes, but I think it's worth
considering if it gives us markup that renders consistently and is easy to use.
Automated link testing is a good idea. I wonder if Cypress could do that
well, or if we should just write our own link tester that spins up embedded
Jetty, or some other option.

*Anyone opposed to me merging this PR in sooner than the 24 hour period?*
This is a broken master as far as I'm concerned, despite what Travis says.

In a separate PR, I'd also like to add a simple mvn clean site run to our
Travis builds, which should catch the bigger issues.

On Wed, Dec 19, 2018 at 8:51 PM Otto Fowler  wrote:

> Maybe we should consider ascii doc, it shows in github, and can generate
> multiple formats.
>
>
>
> On December 19, 2018 at 22:49:26, Otto Fowler (ottobackwa...@gmail.com)
> wrote:
>
> What would help is an automated test that tests all the links as well.
>
>
>
> On December 19, 2018 at 19:55:03, Michael Miklavcic (
> michael.miklav...@gmail.com) wrote:
>
> I'd also like to submit a call for suggestions for making this process
> simpler. I think we've made some excellent progress towards getting an
> html-generated version of our documentation, but it's also proving to be
> somewhat error-prone. Github and Doxia have different expectations as to
> formatting, and most devs are generally looking at Github rendering only
> because we get that for free as part of the Github PR lifecycle. In
> addition, images are managed independently. Missing a reference, getting
> the relative path wrong, or getting the rewrite rules incorrect will result
> in a bad time. I've seen at least a few PR's in recent memory unwittingly
> fall into this trap.
>
> On Wed, Dec 19, 2018 at 5:47 PM Michael Miklavcic <
> michael.miklav...@gmail.com> wrote:
>
> > Fix is out in this PR: https://github.com/apache/metron/pull/1309
> > Jira: https://issues.apache.org/jira/browse/METRON-1950
> >
> > It ended up being a bit bigger than I hoped because we had some remaining
> > incompatibilities between our Github markdown and the Doxia-expected
> > formatting. This caused a cascade of inconsistent results for anyone
> > attempting to generate a page that works in both Github markdown and
> > Doxia.
> > The latest fixes here should address all of this in the parsers
> > documentation. I suspect we may have other markdown pages out there that
> > will have similar issues, so this PR/Jira/notice should serve as a recipe
> > for remedying them in the future.
> >
> > Best,
> > Mike
> >
> >
> > On Wed, Dec 19, 2018 at 3:46 PM Michael Miklavcic <
> > michael.miklav...@gmail.com> wrote:
> >
> >> FYI, site-book is currently broken in master. I think I've found the
> >> source of the issue and will be issuing a fix shortly.
> >>
> >> Best,
> >> Mike
> >>
> >
>
>


Re: Site-book broken in master

2018-12-20 Thread Otto Fowler
+1 for merging a fix.



On December 20, 2018 at 12:43:57, Michael Miklavcic (
michael.miklav...@gmail.com) wrote:

Thanks for the feedback Otto. I'm not too familiar with our infrastructure
as far as the impact of switching to AsciiDoc goes, but I think it's worth
considering if it gives us markup that renders consistently and is easy to use.
Automated link testing is a good idea. I wonder if Cypress could do that
well, or if we should just write our own link tester that spins up embedded
Jetty, or some other option.

*Anyone opposed to me merging this PR in sooner than the 24 hour period?*
This is a broken master as far as I'm concerned, despite what Travis says.

In a separate PR, I'd also like to add a simple mvn clean site run to our
Travis builds, which should catch the bigger issues.

On Wed, Dec 19, 2018 at 8:51 PM Otto Fowler  wrote:

> Maybe we should consider ascii doc, it shows in github, and can generate
> multiple formats.
>
>
>
> On December 19, 2018 at 22:49:26, Otto Fowler (ottobackwa...@gmail.com)
> wrote:
>
> What would help is an automated test that tests all the links as well.
>
>
>
> On December 19, 2018 at 19:55:03, Michael Miklavcic (
> michael.miklav...@gmail.com) wrote:
>
> I'd also like to submit a call for suggestions for making this process
> simpler. I think we've made some excellent progress towards getting an
> html-generated version of our documentation, but it's also proving to be
> somewhat error-prone. Github and Doxia have different expectations as to
> formatting, and most devs are generally looking at Github rendering only
> because we get that for free as part of the Github PR lifecycle. In
> addition, images are managed independently. Missing a reference, getting
> the relative path wrong, or getting the rewrite rules incorrect will result
> in a bad time. I've seen at least a few PR's in recent memory unwittingly
> fall into this trap.
>
> On Wed, Dec 19, 2018 at 5:47 PM Michael Miklavcic <
> michael.miklav...@gmail.com> wrote:
>
> > Fix is out in this PR: https://github.com/apache/metron/pull/1309
> > Jira: https://issues.apache.org/jira/browse/METRON-1950
> >
> > It ended up being a bit bigger than I hoped because we had some remaining
> > incompatibilities between our Github markdown and the Doxia-expected
> > formatting. This caused a cascade of inconsistent results for anyone
> > attempting to generate a page that works in both Github markdown and
> > Doxia.
> > The latest fixes here should address all of this in the parsers
> > documentation. I suspect we may have other markdown pages out there that
> > will have similar issues, so this PR/Jira/notice should serve as a recipe
> > for remedying them in the future.
> >
> > Best,
> > Mike
> >
> >
> > On Wed, Dec 19, 2018 at 3:46 PM Michael Miklavcic <
> > michael.miklav...@gmail.com> wrote:
> >
> >> FYI, site-book is currently broken in master. I think I've found the
> >> source of the issue and will be issuing a fix shortly.
> >>
> >> Best,
> >> Mike
> >>
> >
>
>


Re: Site-book broken in master

2018-12-19 Thread Otto Fowler
What would help is an automated test that tests all the links as well.
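
As a rough idea of what such a test could look like, here is a minimal Python sketch that scans markdown sources for relative links and image references and reports targets that do not exist on disk. The function name and the demo file layout are made up for illustration. It only checks the markdown sources, not the Doxia-generated site or any rewrite rules, so it would catch missing images and wrong relative paths but not rendering differences.

```python
import re
import tempfile
from pathlib import Path

# Matches [text](target) and ![alt](target); captures the target up to ')' or whitespace.
LINK = re.compile(r"!?\[[^\]]*\]\(([^)\s]+)")
# Anything starting with a URL scheme (http:, https:, mailto:, ...) is not a relative link.
SCHEME = re.compile(r"[a-zA-Z][a-zA-Z0-9+.-]*:")


def broken_relative_links(root):
    """Return (markdown file, target) pairs whose relative target is missing."""
    root = Path(root)
    broken = []
    for md in sorted(root.rglob("*.md")):
        for target in LINK.findall(md.read_text(encoding="utf-8")):
            if SCHEME.match(target) or target.startswith("#"):
                continue  # skip absolute URLs and in-page anchors
            path = target.split("#", 1)[0]  # drop any trailing #fragment
            if not (md.parent / path).exists():
                broken.append((md.relative_to(root).as_posix(), target))
    return broken


# Demo on a throwaway directory: one good link, one missing image, one external URL.
with tempfile.TemporaryDirectory() as tmp:
    demo_root = Path(tmp)
    (demo_root / "docs").mkdir()
    (demo_root / "docs" / "parsers.md").write_text("# Parsers\n", encoding="utf-8")
    (demo_root / "README.md").write_text(
        "[parsers](docs/parsers.md) ![arch](images/arch.png) [site](https://metron.apache.org)\n",
        encoding="utf-8",
    )
    result = broken_relative_links(demo_root)

print(result)
```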



On December 19, 2018 at 19:55:03, Michael Miklavcic (
michael.miklav...@gmail.com) wrote:

I'd also like to submit a call for suggestions for making this process
simpler. I think we've made some excellent progress towards getting an
html-generated version of our documentation, but it's also proving to be
somewhat error-prone. Github and Doxia have different expectations as to
formatting, and most devs are generally looking at Github rendering only
because we get that for free as part of the Github PR lifecycle. In
addition, images are managed independently. Missing a reference, getting
the relative path wrong, or getting the rewrite rules incorrect will result
in a bad time. I've seen at least a few PR's in recent memory unwittingly
fall into this trap.

On Wed, Dec 19, 2018 at 5:47 PM Michael Miklavcic <
michael.miklav...@gmail.com> wrote:

> Fix is out in this PR: https://github.com/apache/metron/pull/1309
> Jira: https://issues.apache.org/jira/browse/METRON-1950
>
> It ended up being a bit bigger than I hoped because we had some remaining
> incompatibilities between our Github markdown and the Doxia-expected
> formatting. This caused a cascade of inconsistent results for anyone
> attempting to generate a page that works in both Github markdown and
> Doxia.
> The latest fixes here should address all of this in the parsers
> documentation. I suspect we may have other markdown pages out there that
> will have similar issues, so this PR/Jira/notice should serve as a recipe
> for remedying them in the future.
>
> Best,
> Mike
>
>
> On Wed, Dec 19, 2018 at 3:46 PM Michael Miklavcic <
> michael.miklav...@gmail.com> wrote:
>
>> FYI, site-book is currently broken in master. I think I've found the
>> source of the issue and will be issuing a fix shortly.
>>
>> Best,
>> Mike
>>
>


Re: Site-book broken in master

2018-12-19 Thread Otto Fowler
Maybe we should consider ascii doc, it shows in github, and can generate
multiple formats.



On December 19, 2018 at 22:49:26, Otto Fowler (ottobackwa...@gmail.com)
wrote:

What would help is an automated test that tests all the links as well.



On December 19, 2018 at 19:55:03, Michael Miklavcic (
michael.miklav...@gmail.com) wrote:

I'd also like to submit a call for suggestions for making this process
simpler. I think we've made some excellent progress towards getting an
html-generated version of our documentation, but it's also proving to be
somewhat error-prone. Github and Doxia have different expectations as to
formatting, and most devs are generally looking at Github rendering only
because we get that for free as part of the Github PR lifecycle. In
addition, images are managed independently. Missing a reference, getting
the relative path wrong, or getting the rewrite rules incorrect will result
in a bad time. I've seen at least a few PR's in recent memory unwittingly
fall into this trap.

On Wed, Dec 19, 2018 at 5:47 PM Michael Miklavcic <
michael.miklav...@gmail.com> wrote:

> Fix is out in this PR: https://github.com/apache/metron/pull/1309
> Jira: https://issues.apache.org/jira/browse/METRON-1950
>
> It ended up being a bit bigger than I hoped because we had some remaining
> incompatibilities between our Github markdown and the Doxia-expected
> formatting. This caused a cascade of inconsistent results for anyone
> attempting to generate a page that works in both Github markdown and
> Doxia.
> The latest fixes here should address all of this in the parsers
> documentation. I suspect we may have other markdown pages out there that
> will have similar issues, so this PR/Jira/notice should serve as a recipe
> for remedying them in the future.
>
> Best,
> Mike
>
>
> On Wed, Dec 19, 2018 at 3:46 PM Michael Miklavcic <
> michael.miklav...@gmail.com> wrote:
>
>> FYI, site-book is currently broken in master. I think I've found the
>> source of the issue and will be issuing a fix shortly.
>>
>> Best,
>> Mike
>>
>


Re: [VOTE] Metron Release Candidate 0.7.0-RC1

2018-12-12 Thread Otto Fowler
+1 binding
Ran through script


On December 12, 2018 at 14:10:22, Nick Allen (n...@nickallen.org) wrote:

+1 binding

- All of the tarballs, checksums, and signatures are correct
- All of the tests and integration tests ran successfully.
- The release also spun-up correctly in the development environment.


FYI - I had to slightly modify the metron-rc-check script for this to
work. See the patch below.

diff --git a/dev-utilities/release-utils/metron-rc-check b/dev-utilities/release-utils/metron-rc-check
index 143ba85a2..4552e5568 100755
--- a/dev-utilities/release-utils/metron-rc-check
+++ b/dev-utilities/release-utils/metron-rc-check
@@ -67,8 +67,6 @@ for i in "$@"; do
# --bro=0.1.0
#
-b=*|--bro=*)
- BRO="${i#*=}"
- shift # past argument=value
;;

#
@@ -157,6 +155,7 @@ fi
echo "Working directory $WORK"

KEYS="$METRON_RC_DIST/KEYS"
+KEYS="https://dist.apache.org/repos/dist/release/metron/KEYS"
METRON_ASSEMBLY="$METRON_RC_DIST/apache-metron-$METRON_VERSION-$RC.tar.gz"
METRON_ASSEMBLY_SIG="$METRON_ASSEMBLY.asc"


On Tue, Dec 11, 2018 at 2:43 PM Justin Leet  wrote:

> This is a call to vote on releasing Apache Metron 0.7.0
>
> Full list of changes in this release:
> https://dist.apache.org/repos/dist/dev/metron/0.7.0-RC1/CHANGES
> The tag to be voted upon is:
> apache-metron-0.7.0-rc1
>
> The source archives being voted upon can be found here:
>
>
https://dist.apache.org/repos/dist/dev/metron/0.7.0-RC1/apache-metron-0.7.0-rc1.tar.gz
>
> Other release files, signatures and digests can be found here:
> https://dist.apache.org/repos/dist/dev/metron/0.7.0-RC1/
>
> The release artifacts are signed with the following key:
> https://dist.apache.org/repos/dist/release/metron/KEYS
> Please vote on releasing this package as Apache Metron 0.7.0-RC1
>
> When voting, please list the actions taken to verify the release.
>
> Recommended build validation and verification instructions are posted
> here:
> https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds
>
> This vote will be open until 3pm EDT on Friday December 14 2018.
>
> [ ] +1 Release this package as Apache Metron 0.7.0-RC1
>
> [ ] 0 No opinion
>
> [ ] -1 Do not release this package because...
>


Re: [DISCUSS] Mandatory relocation of Apache git repositories on git-wip-us.apache.org

2018-12-09 Thread Otto Fowler
+1

However, we will need Jiras and PRs for updating our scripts after the move.


On December 9, 2018 at 06:32:44, Roy Lenferink (rlenfer...@apache.org)
wrote:

Hi folks,

Checking the mail-archives I noticed the message below missed the dev@metron
list [1].
Does anyone have a problem with starting the process to migrate the
existing Metron
git-wip-us repos [2][3] to gitbox?

This means integrated access and easy PRs on the repos (write access
to the GitHub repos).

I can't imagine anyone will say no, but we need to "document support
for the decision" from a mailing list post, so, here it is.

If there are no objections after 72 hours, I will create a ticket with
INFRA for moving
over the metron repositories to GitBox.

- Roy

[1] http://mail-archives.apache.org/mod_mbox/metron-dev/201812.mbox/browser
[2] https://git-wip-us.apache.org/repos/asf?p=metron.git
[3] https://git-wip-us.apache.org/repos/asf?p=metron-bro-plugin-kafka.git

-- Forwarded message -
From: Daniel Gruno 
Date: vr 7 dec. 2018 om 17:53
Subject: [NOTICE] Mandatory relocation of Apache git repositories on
git-wip-us.apache.org
To: us...@infra.apache.org 


[IF YOUR PROJECT DOES NOT HAVE GIT REPOSITORIES ON GIT-WIP-US PLEASE
DISREGARD THIS EMAIL; IT WAS MASS-MAILED TO ALL APACHE PROJECTS]

Hello Apache projects,

I am writing to you because you may have git repositories on the
git-wip-us server, which is slated to be decommissioned in the coming
months. All repositories will be moved to the new gitbox service which
includes direct write access on github as well as the standard ASF
commit access via gitbox.apache.org.

## Why this move? ##
The move comes as a result of retiring the git-wip service, as the
hardware it runs on is longing for retirement. In lieu of this, we
have decided to consolidate the two services (git-wip and gitbox), to
ease the management of our repository systems and future-proof the
underlying hardware. The move is fully automated, and ideally, nothing
will change in your workflow other than added features and access to
GitHub.

## Timeframe for relocation ##
Initially, we are asking that projects voluntarily request to move
their repositories to gitbox, hence this email. The voluntary
timeframe is between now and January 9th 2019, during which projects
are free to either move over to gitbox or stay put on git-wip. After
this phase, we will be requiring the remaining projects to move within
one month, after which we will move the remaining projects over.

To have your project moved in this initial phase, you will need:

- Consensus in the project (documented via the mailing list)
- File a JIRA ticket with INFRA to voluntarily move your project repos
over to gitbox (as stated, this is highly automated and will take
between a minute and an hour, depending on the size and number of
your repositories)

To sum up the preliminary timeline;

- December 9th 2018 -> January 9th 2019: Voluntary (coordinated)
relocation
- January 9th -> February 6th: Mandated (coordinated) relocation
- February 7th: All remaining repositories are mass migrated.

This timeline may change to accommodate various scenarios.

## Using GitHub with ASF repositories ##
When your project has moved, you are free to use either the ASF
repository system (gitbox.apache.org) OR GitHub for your development
and code pushes. To be able to use GitHub, please follow the primer
at: https://reference.apache.org/committer/github


We appreciate your understanding of this issue, and hope that your
project can coordinate voluntarily moving your repositories in a
timely manner.

All settings, such as commit mail targets, issue linking, PR
notification schemes etc will automatically be migrated to gitbox as
well.

With regards, Daniel on behalf of ASF Infra.

PS: For inquiries, please reply to us...@infra.apache.org, not your
project's dev list :-).


Re: [DISCUSS] Recurrent Large Indexing Error Messages

2018-12-05 Thread Otto Fowler
Why not have a second indexing topology configured just for errors?
We can load the same code with two different configurations in two
topologies.



On December 5, 2018 at 03:55:59, Ali Nazemian (alinazem...@gmail.com) wrote:

I think if you look at indexing error management, it is much the same as
the parser and enrichment error use cases. It is even more common to
expect something to end up in the error topics. I think a wider, independent
job could take care of error management. We could decide to add a
separate topology later on to manage error logs and create
alerts/notifications separately.
It could even be integrated with log feeder and log search.
Sending a solution's operational logs to the same solution is a
bit weird and not enterprise friendly. Normally the platform operations team
would be a separate team with different objectives, and they probably
have a separate monitoring/notification solution in place already.

I don't think it is the end of the world if this part is left to be managed
by users. So I prefer option 2 as a short-term fix. A long-term solution can
be discussed separately.

Cheers,
Ali

On Sat, 20 Oct. 2018, 05:20 Nick Allen wrote:

> I want to discuss solutions for the problem that I have described in
> METRON-1832; Recurrent Large Indexing Error Messages. I feel this is a very
> easy trap to fall into when using the default settings that currently come
> with Metron.
>
> ## Problem
>
>
> https://issues.apache.org/jira/browse/METRON-1832
>
>
> If any index destination like HDFS, Elasticsearch, or Solr goes down while
> the Indexing topology is running, an error message is created and sent back
> to the user-defined error topic. By default, this is defined to also be
> the 'indexing' topic.
>
> The Indexing topology then consumes this error message and attempts to
> write it again. If the index destination is still down, another error
> occurs and another error message is created that encapsulates the original
> error message. That message is then sent to the 'indexing' topic, which is
> later consumed, yet again, by the Indexing topology.
>
> These error messages will continue to be recycled and grow larger and
> larger as each new error message encapsulates all previous error messages
> in the "raw_message" field.
>
> Once the index destination recovers, one giant error message will finally
> be written that contains massively duplicated, useless information which
> can further negatively impact performance of the index destination.
>
> Also, the escape characters ('\') compound with each re-encapsulation,
> leading to long strings of '\' characters in the error message.
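
The growth described above is easy to see with plain JSON encoding. The following Python sketch is an illustration only: `wrap_error` and the `error_type` field are assumptions, not Metron's actual error format; only the "raw_message" field name comes from the description. Each failed write embeds the previous message as a string, so its quotes and backslashes get escaped again on every pass.

```python
import json


def wrap_error(raw_message):
    """Build a new error message that embeds the failed message as a string."""
    return json.dumps({"error_type": "indexing_error", "raw_message": raw_message})


original = json.dumps({"source": "bro", "msg": "hello"})
message = original
sizes = [len(message)]
for _ in range(4):
    message = wrap_error(message)  # each failed write wraps the previous error
    sizes.append(len(message))

backslashes = message.count("\\")
print("message size after each failure:", sizes)
print("backslashes in the final message:", backslashes)
```

The sizes grow on every round even though no new information is added; the only payload that matters is still the innermost original message.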
>
>
> ## Background
>
> There was some discussion on how to handle this on the original PR #453
> https://github.com/apache/metron/pull/453.
>
> ## Solutions
>
> (1) The first, easiest option is to just do nothing. There was already a
> discussion around this and this is the solution that we landed on in #453.
>
> Pros: Really easy; do nothing.
>
> Cons: Intermittent problems with ES/Solr can easily create very large error
> messages that can significantly impact both search and ingest performance.
>
> (2) Change the default indexing error topic to 'indexing_errors' to avoid
> recycling error messages. Nothing will consume from the 'indexing_errors'
> topic, thus preventing a cycle.
>
> Pros: Simple, easy change that prevents the cycle.
>
> Cons: Recoverable indexing errors are not visible to users as they will
> never be indexed in ES/Solr.
>
> (3) Add logic to limit the number of times a message can be 'recycled' through
> the Indexing topology. This effectively sets a maximum number of retry
> attempts. If a message fails N times, then write the message to a separate
> unrecoverable error topic.
>
> Pros: Recoverable errors are visible to users in ES/Solr.
>
> Cons: More complex. Users still need to check the unrecoverable error
> topic for potential problems.
>
> (4) Do not further encapsulate error messages in the 'raw_message' field.
> If an error message fails, don't encapsulate it in another error message.
> Just push it to the error topic as-is. Could add a field that indicates
> how many times the message has failed.
>
> Pros: Prevents giant error messages from being created from recoverable
> errors.
>
> Cons: Extended outages would still cause the Indexing topology to
> repeatedly recycle these error messages, which would ultimately exhaust
> resources in Storm.
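
The retry-limit and no-re-encapsulation ideas can be combined, which would also bound the recycling that option (4) alone leaves open. The Python sketch below is only an illustration under stated assumptions: the `failed_attempts` field, the `indexing_unrecoverable` topic name, and the default of three attempts are hypothetical, not existing Metron settings.

```python
def route_failed_message(message, max_attempts=3,
                         retry_topic="indexing",
                         dead_topic="indexing_unrecoverable"):
    """Return (topic, message) for a message whose write just failed."""
    updated = dict(message)  # shallow copy; the payload is not wrapped again
    updated["failed_attempts"] = message.get("failed_attempts", 0) + 1
    if updated["failed_attempts"] < max_attempts:
        return retry_topic, updated
    return dead_topic, updated


# Simulate the same message failing three times in a row.
current = {"raw_message": "original payload"}
topics = []
for _ in range(3):
    topic, current = route_failed_message(current)
    topics.append(topic)

print(topics)
print(current)
```

The message stays the same size apart from the counter, and after the configured number of attempts it leaves the retry loop for a topic that nothing consumes automatically.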
>
>
>
> What other ways can we solve this?
>


Re: [DISCUSS] Managing intermittent test failures

2018-11-29 Thread Otto Fowler
+1


On November 29, 2018 at 10:26:53, Michael Miklavcic (
michael.miklav...@gmail.com) wrote:

Every now and then we see intermittent test failures, and rather than
sweeping them under the rug, we should have a simple method to track and
handle them. I started creating Jiras for tests that I've seen fail, but
that don't fail consistently, or even fail more than once. For example,
https://issues.apache.org/jira/browse/METRON-1851.

I think we're all taking steps to varying degrees already, but I want to
call it out formally. I propose we create a ticket and add the label
"test-failure." It might also make sense to send a quick note to the dev
list or Slack channel, so attention can be brought to it and anyone else
that may have run into an issue with the test can chime in. We can clean
them out every few months - maybe do a review going into a release and
close any that have not been reproduced for some time. What do you all
think?

Mike


Re: Unzipping Cypress

2018-11-28 Thread Otto Fowler
OK,

I think what is happening is that in my PR, I’m building metron in Docker
and deploying to vagrant.  I have updated my PR to map the cypress cache
into the Docker container.
Thanks!


On November 28, 2018 at 10:29:25, Shane Ardell (shane.m.ard...@gmail.com)
wrote:

For me, it's at ~/Library/Caches/Cypress, but the path depends on your OS:
https://docs.cypress.io/guides/getting-started/installing-cypress.html#Binary-cache

On Wed, Nov 28, 2018 at 4:19 PM Otto Fowler 
wrote:

> Where is the cache path?
>
>
> On November 28, 2018 at 09:34:18, Shane Ardell (shane.m.ard...@gmail.com)
> wrote:
>
> https://github.com/cypress-io/cypress/issues/1813
>


Re: Unzipping Cypress

2018-11-28 Thread Otto Fowler
Where is the cache path?


On November 28, 2018 at 09:34:18, Shane Ardell (shane.m.ard...@gmail.com)
wrote:

https://github.com/cypress-io/cypress/issues/1813


Re: Unzipping Cypress

2018-11-27 Thread Otto Fowler
I’m sorry for the confusion, but I see this locally.  I’m not looking at
travis.
I have been doing a lot of full dev deployments.



On November 27, 2018 at 11:40:42, Shane Ardell (shane.m.ard...@gmail.com)
wrote:

Otto,

Do you have a Travis log you can share that shows Cypress downloaded vs.
using a cached version? It looks like my latest merge to master uses a
cached version: https://travis-ci.org/apache/metron/jobs/458440883#L7292.

Thanks in advance.

On Mon, Nov 26, 2018 at 6:14 PM Shane Ardell 
wrote:

> It seems we can pretty easily configure the .travis.yml config file to
> cache our npm modules:
>
> https://docs.cypress.io/guides/guides/continuous-integration.html#Caching-the-Cypress-binary
>
> It also looks like we are already trying to cache our npm modules in the
> Travis config, but, obviously, it's not working as intended. I can take a
> look into why tomorrow.
>
> On Mon, Nov 26, 2018 at 5:33 PM Michael Miklavcic <
> michael.miklav...@gmail.com> wrote:
>
>> Shane, Tibor - Can you guys chime in on this?
>>
>> On Mon, Nov 26, 2018 at 9:13 AM Otto Fowler 
>> wrote:
>>
>> > Isn’t there a way we can cache it?
>> >
>> >
>> > On November 26, 2018 at 10:59:20, Nick Allen (n...@nickallen.org)
>> > wrote:
>> >
>> > Yes, I have noticed that too. If there is no way to reduce the time, we should
>> > not be logging the unzipping process percentile-by-percentile in the Travis
>> > CI builds.
>> >
>> > On Sat, Nov 24, 2018 at 9:49 AM Otto Fowler 
>> > wrote:
>> >
>> > > Anyone else seeing a lot of time taken downloading and unzipping Cypress on
>> > > builds?
>> > > What is up with that?™
>> > >
>> > > ottO
>> > >
>> >
>>
>


Re: Unzipping Cypress

2018-11-26 Thread Otto Fowler
Isn’t there a way we can cache it?


On November 26, 2018 at 10:59:20, Nick Allen (n...@nickallen.org) wrote:

Yes, I have noticed that too. If there is no way to reduce the time, we should
not be logging the unzipping process percentile-by-percentile in the Travis
CI builds.

On Sat, Nov 24, 2018 at 9:49 AM Otto Fowler 
wrote:

> Anyone else seeing a lot of time taken downloading and unzipping Cypress on
> builds?
> What is up with that?™
>
> ottO
>


Unzipping Cypress

2018-11-24 Thread Otto Fowler
Anyone else seeing a lot of time taken downloading and unzipping Cypress on
builds?
What is up with that?™

ottO


Re: [DISCUSS] Add ngrx to handle state management in Angular

2018-11-22 Thread Otto Fowler
Can you describe what you mean by “state” in a little more detail?  Not a
complete description, maybe just a crib list.


On November 22, 2018 at 07:21:43, Shane Ardell (shane.m.ard...@gmail.com)
wrote:

As both the Management and Alerts UI grow in size, managing application
state continues to become more and more complex. To help us deal with
managing all of this state and ensuring our application derives state from
a single source of truth, I suggest we start using NgRx, a state management
library based on the Redux pattern but built for Angular. It is by far the
most popular library of this type for Angular. As you can see in the
project's GitHub Insights tab,
it's quite actively worked on and releases are pretty frequent. The project
is licensed under MIT.

As far as an approach to integration, I don't think we necessarily need a
big refactoring right off the bat. I feel something like this can be done
in a piecemeal approach over time. I think we can start by introducing it
into the project the next time we have a new application feature.

What are everyone's thoughts around this?

Cheers,
Shane


Re: [DISCUSS] Deprecating MySQL

2018-11-16 Thread Otto Fowler
I would like to understand the work required to move our JDBC support (or
adapt the current support to the abstraction) to /contrib.
We could default to and only officially support LDAP, but have /contrib (or
/extension_examples) contain a “this is how you would support JDBC for
auth” project.



On November 15, 2018 at 15:01:10, Michael Miklavcic (
michael.miklav...@gmail.com) wrote:

Yes, makes sense. +1 to that.

On Thu, Nov 15, 2018 at 12:54 PM James Sirota  wrote:

> To clarify my position, I don't have a problem with mySql or any other
> projects relying on it. mySql in itself is not an issue. What I don't
> want is for a customer to be presented with a choice between two options
> to configure for authenticating the UI, which I think is
> needless. It adds complexity for not much value. Since LDAP is clearly
> the better way to go, that should be what we support without explicitly
> giving a user an option to switch to JDBC. A user can still do so by
> extending our abstractions if that is what they choose to do, but this would
> not be officially supported by us. We would not be providing a config or
> an mPack to do this. A user would have to do it on their own.
>
> James
>
>
>
> 15.11.2018, 12:15, "Michael Miklavcic" :
> > Incidentally, even without the Metron piece in the picture, what is the
> > answer for Ambari's database dependency, which uses a SQL data store? Does
> > this actually solve the problem of "customers won't install Metron because
> > of the SQL store", or are there other issues we need to address?
> >
> > On Thu, Nov 15, 2018 at 9:30 AM James Sirota wrote:
> >
> >> Hi Guys,
> >>
> >> My opinion on this, as with Knox SSO, is that the code should be
> >> pluggable to support JDBC, but we should not continue to support the
> >> concrete implementation and expose it to users via a setting. This is
> >> a fairly minor feature and the added complexity of supporting
> >> switching between JDBC and LDAP is simply not worth it. We need to
> >> strike a balance between ease of use and capabilities/extensibility.
> >> For features that are worth it, such as analytics and stream
> >> processing, the extra capability is worth the added complexity in
> >> configuration. But for this, it is not. So let's keep JDBC around for
> >> a release to allow users to migrate to LDAP, deprecate it, and move
> >> on.
> >>
> >> Thanks,
> >> James
> >>
> >> 13.11.2018, 16:03, "Simon Elliston Ball"  >:
> >> > We went over the HBase user settings thing in extensive discussions
> >> > at the time. Storing an arbitrary blob of JSON which is only ever
> >> > accessed by a single key (username) was concluded to be a key-value
> >> > problem, not a relational problem. HBase was concluded to be massive
> >> > overkill as a key-value store in this use case, unless it was
> >> > already there and ready to go, which in the case of Metron, it is,
> >> > for enrichments, threat intel and profiles. Hence it ended up in
> >> > HBase, as a conveniently present data store that matched the usage
> >> > patterns. See
> >> > https://lists.apache.org/thread.html/145b3b8ffd8c3aa5bbfc3b93f550fc67e71737819b19bc525a2f2ce2@%3Cdev.metron.apache.org%3E
> >> > and METRON-1337 for discussion.
> >> >
> >> > Simon
> >> >
> >> >> On 13 Nov 2018, at 18:50, Michael Miklavcic <michael.miklav...@gmail.com> wrote:
> >> >>
> >> >> Thanks for the write up Simon. I don't think I see any major
> >> >> problems with deprecating the general SQL store. However, just to
> >> >> clarify, Metron does NOT require any specific backing store. It's
> >> >> 100% JPA, which means anything that can be configured with the
> >> >> Spring properties we expose. I think the most opinionated thing we
> >> >> do there is ship an extremely basic table creation script for H2
> >> >> and MySQL as a simple example for schema. As an example, we simply
> >> >> use H2 in full dev, which is entirely in-memory and spun up
> >> >> automatically from configuration. The recent work by Justin Leet
> >> >> removes the need to use a SQL store at all if you choose LDAP -
> >> >> https://github.com/apache/metron/pull/1246. I'll let him comment
> >> >> further on this, but I think there is one small change that could
> >> >> be made via a toggle in Ambari that would even keep the user from
> >> >> seeing JDBC settings altogether during install if they choose
> >> >> LDAP. Again, I think I'm on board with deprecating the SQL backing
> >> >> store as I pointed this out on the Knox thread as well, but I just
> >> >> wanted to make sure everyone has an accurate picture of the current
> >> >> state.
> >> >>
> >> >> I had to double check on the HBase config you mentioned, but it
> >> >> does appear that we use it for the Alerts UI. I don't think I
> >> >> realized we were storing config there instead of the ZooKeeper
> >> >> store we use for other system configuration. Ironically enough, I
> >> >> think that it probably makes more
> >> 

Re: [DISCUSS] Knox SSO feature branch review and features

2018-11-16 Thread Otto Fowler
That does sound good, Simon. I think I misunderstood that the default LDAP
was standard with Knox/Ambari and not something we would be doing ourselves.


On November 16, 2018 at 10:54:48, Simon Elliston Ball (
si...@simonellistonball.com) wrote:

I think there is a lot to be said for defaulting to Knox on... that way we
also get some 'secure by default' - at least SSL by default. The do-nothing
I think you're proposing would be around the authentication, right? Knox
does ship with a demo LDAP server, for which we could have some defaults
(kinda like we do with the dev Spring profile today), which might be a good
way of achieving a similar effect, and that's configured by default by
Ambari. Would that meet the need, or do you think we should provide a "yeah
I'm sure we don't need to authenticate that connection, let them in"
identity provider for Knox? That way we would have to have them impersonate
a known user for the REST API to work, but you would get the seamless,
no-auth access.

To be honest, I'm a fan of the first option where we give people a nice
simple, no external config and just use some sensible default users in the
demo LDAP instance that Knox owns, and then default to Knox on to give us
our nice single access point.

Simon


On Fri, 16 Nov 2018 at 15:35, Otto Fowler  wrote:

> Those are all valid points. I think it is (was) worth discussing at
> least a little.
>
> WRT Knox and defaults:
>
> I have in the past used “do-nothing” implementations as default
> placeholders for functionality that needed extensive per-customer
> configuration, or configuration outside the responsibility of the
> product.
>
> Would it be simpler if we ALWAYS used Knox, but defaulted to a Knox
> configuration with “do-nothing” providers for auth etc.? The users would
> then configure the providers (based on the provider(s) we support) at a
> later time.
>
> We could write the providers, as everyone has pointed out how extensible
> KNOX is ;)
>
> Would that be a valid way to simplify the issue?
> What would the fallout of that be?
>
>
>
> On November 16, 2018 at 09:20:53, Ryan Merriman (merrim...@gmail.com)
> wrote:
>
> Most of the research I've done around adding Metron as a Knox service is
> based on how other projects do it. The documentation is not easy to
> follow so I learned by reading other service definition files. The
> assumption that we are doing things drastically differently is false.
>
> I completely agree with Simon. Why would we want to be dependent on
> Knox's release cycle? How does that benefit us? It may reduce some
> operational complexity but it makes our install process more complicated
> because we require a certain version of Knox (who knows when that gets
> released). What do we do in the meantime? I would also like to point out
> that Metron is inherently different than other Hadoop stack services. We
> are a full-blown application with multiple UIs so the way we expose
> services through Knox may be a little different.
>
> I think this will be easier to discuss when we can all see what is
> actually involved. I am working on a PR that adds Metron as a Knox
> service and will have that out soon. That should give everyone more
> context.
>
> On Fri, Nov 16, 2018 at 7:39 AM Simon Elliston Ball <
> si...@simonellistonball.com> wrote:
>
> > You could say the same thing about Ambari, but that provides mpacks.
> > Knox is also designed to be extensible through Knox service stacks
> > since they realized they can’t support every project. The challenge is
> > that the docs have not made it as easy as they could for the ecosystem
> > to plug into Knox, which has led to some confusion around this being a
> > recommended pattern (which it is).
> >
> > The danger of trying to get your bits into Knox is that that ties you
> > to their release cycle (a problem Ambari has felt hard, hence their
> > community is moving away from the everything inside model towards
> > everything is an mpack).
> >
> > A number of implementations of Knox also use the approach Ryan is
> > suggesting for their own organization specific end points, so it’s not
> > like this is an uncommon, or anti-pattern, it’s more the way Knox is
> > designed to work in the future, than the legacy of it only being able
> > to handle a subset of Hadoop projects.
> >
> > Knox remains optional in our scenario, but we keep control over the
> > shipping of things like rewrite rules, which allows Metron to control
> > its release destiny should things like url patterns in the ui need to
> > change (with a new release of angular / new module / new rest endpoint
> > etc) instead of making a Metron release dependent on a Knox release.
> >
>

Re: Running MAAS in batch

2018-11-16 Thread Otto Fowler
That may be the best MAAS explanation I’ve seen Simon.


On November 16, 2018 at 10:28:57, Simon Elliston Ball (
si...@simonellistonball.com) wrote:

MaaS is designed to wrap model inference (scoring) an event at a time, via
a REST API. As such, running it in batch doesn't make a lot of sense, since
each message would be processed individually. Most of the models you're
likely to run in MaaS, however, are also likely to be easily batchable, and
are probably better wrapped up in a batch engine like Spark to take
advantage of more efficient "mass" scoring.
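
To make the per-event versus batch distinction concrete, here is a minimal
sketch. It does not use the MaaS REST API or Spark; `ToyModel`, its scoring
rule and the `domain` field are hypothetical stand-ins, and the call counter
only models the fixed overhead each request would carry.

```python
from typing import Dict, List

Event = Dict[str, str]


class ToyModel:
    """Hypothetical stand-in for a deployed model; counts how often it is invoked."""

    def __init__(self) -> None:
        self.calls = 0

    def predict(self, events: List[Event]) -> List[float]:
        # Each invocation stands for one request and its fixed overhead.
        self.calls += 1
        # Made-up rule: flag events whose domain is longer than 20 characters.
        return [1.0 if len(e.get("domain", "")) > 20 else 0.0 for e in events]


def score_per_event(model: ToyModel, events: List[Event]) -> List[float]:
    # One model invocation per message, as an event-at-a-time endpoint would do.
    return [model.predict([event])[0] for event in events]


def score_in_batch(model: ToyModel, events: List[Event]) -> List[float]:
    # One invocation for the whole data set, as a batch engine would do.
    return model.predict(events)
```

Both functions return the same scores; the difference is that the per-event
path pays the invocation overhead once per message, while the batch path
pays it once.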

Simon

On Fri, 16 Nov 2018 at 15:18, deepak kumar  wrote:

> Hi All
> Right now MAAS supports running the model against real time events being
> streamed into the Metron platform.
> Is there any way to run the models deployed in MAAS on the batch events /
> data that have been indexed into HDFS?
> If anyone has tried this batch model, please share some insights.
> Thanks
> Deepak.
>
>

--
simon elliston ball
@sireb


Re: [DISCUSS] Attribution and merging the Elasticsearch client migration

2018-11-16 Thread Otto Fowler
Maybe if we can’t document a concrete solution, we can document or codify a
procedure. Should, for example, the owner of the feature branch always work
with the release manager when there are issues?


On November 16, 2018 at 10:26:11, Michael Miklavcic (
michael.miklav...@gmail.com) wrote:

It's a good suggestion, and I've been thinking about how to best handle
this. Honestly, the right answer might be to do git rebase on master from
the PR branch rather than a merge. That might avoid this situation
altogether. Of course, that also comes with all the obligatory warnings
about rebasing publicly shared branches, and that anyone who has their own
copy will get conflicts. But I think that risk is probably minimal. I'll
put something together that covers both options along with the problems and
solutions for each. Hopefully that will make future collaboration that fits
this pattern easier for people.

On Fri, Nov 16, 2018, 5:42 AM Otto Fowler 
> Maybe this is worth a confluence entry, not as a guide, but just to
> document what you did.
>
> On November 15, 2018 at 19:07:40, Michael Miklavcic (
> michael.miklav...@gmail.com) wrote:
>
> Ok, this is finally merged! Whew!
>
> Here's how I polished up the history at the end. I used other feature
> branch merges as a guideline around commit messaging.
>
> *   fcd644ca 2018-11-15 | METRON-1834: Migrate Elasticsearch from TransportClient to new Java REST API (mmiklavc via mmiklavc) (HEAD -> master, origin/master, origin/HEAD, master-merge) [mmiklavc]
> |\
> | * 8bf3b6ec 2018-11-15 | METRON-1834: Migrate Elasticsearch from TransportClient to new Java REST API (mmiklavc via mmiklavc) closes apache/metron#1242 (stella-es-base2) [mmiklavc]
> | * e7e19fbb 2018-10-08 | METRON-1834: Migrate Elasticsearch from TransportClient to new Java REST API (cstella via mmiklavc) [cstella]
> * | 0c4c622b 2018-11-14 | METRON-1749 Update Angular to latest release in Management UI (sardell via nickwallen) closes apache/metron#1217 [sardell]
>
> On Thu, Nov 15, 2018 at 4:29 PM Michael Miklavcic <
> michael.miklav...@gmail.com> wrote:
>
>> Absolutely, that's part of what I did to validate. This output below also
>> exactly matches the diff I get when I run it from the raw PR branch.
>>
>> git diff master --stat
>>  Upgrading.md | 7 +++
>>  dependencies_with_url.csv | 2 +
>>  metron-deployment/Kerberos-manual-setup.md | 154 ++---
>>  metron-deployment/packaging/ambari/metron-mpack/src/main/resources/common-services/METRON/CURRENT/configuration/metron-env.xml | 9
>>  metron-deployment/packaging/ambari/metron-mpack/src/main/resources/common-services/METRON/CURRENT/package/scripts/metron_service.py | 2 -
>>  metron-deployment/packaging/ambari/metron-mpack/src/main/resources/common-services/METRON/CURRENT/package/scripts/params/params_linux.py | 3 +-
>>  metron-deployment/packaging/ambari/metron-mpack/src/main/resources/common-services/METRON/CURRENT/themes/metron_theme.json | 10
>>  metron-interface/metron-rest/src/main/java/org/apache/metron/rest/service/impl/MetaAlertServiceImpl.java | 2 +-
>>  metron-platform/elasticsearch-shaded/pom.xml | 47 +++-
>>  metron-platform/elasticsearch-shaded/src/main/resources/META-INF/log4j-provider.properties | 18 ---
>>  metron-platform/metron-common/README.md | 48 +
>>  metron-platform/metron-common/src/main/config/zookeeper/global.json | 1 -
>>  metron-platform/metron-common/src/main/java/org/apache/metron/common/configuration/ConfigOption.java | 7 +++
>>  metron-platform/metron-elasticsearch/README.md | 45 +++-
>>  metron-platform/metron-elasticsearch/pom.xml | 32 +--
>>  metron-platform/metron-elasticsearch/src/main/java/org/apache/metron/elasticsearch/client/ElasticsearchClient.java

Re: [DISCUSS] Knox SSO feature branch review and features

2018-11-16 Thread Otto Fowler
Those are all valid points. I think it is (was) worth discussing at
least a little.

WRT Knox and defaults:

I have in the past used “do-nothing” implementations as default
placeholders for functionality that needed extensive per-customer
configuration, or configuration outside the responsibility of the product.

Would it be simpler if we ALWAYS used Knox, but defaulted to a Knox
configuration with “do-nothing” providers for auth etc.? The users would
then configure the providers (based on the provider(s) we support) at a
later time.

We could write the providers, as everyone has pointed out how extensible
Knox is ;)

Would that be a valid way to simplify the issue?
What would the fallout of that be?
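
As a rough illustration of the “do-nothing” default idea, here is a minimal
sketch of the pattern in isolation. It is not Knox's provider API: the names
`AuthProvider`, `DoNothingAuthProvider`, `StaticAuthProvider` and
`handle_login`, and the default user "metron", are all hypothetical.

```python
from typing import Dict, Optional, Protocol


class AuthProvider(Protocol):
    """Hypothetical provider interface: return the authenticated user or None."""

    def authenticate(self, username: str, password: str) -> Optional[str]: ...


class DoNothingAuthProvider:
    """Placeholder default: lets every request in, mapped to one fixed user."""

    def __init__(self, default_user: str = "metron") -> None:
        self.default_user = default_user

    def authenticate(self, username: str, password: str) -> Optional[str]:
        # No check is performed; the caller is treated as the default user.
        return self.default_user


class StaticAuthProvider:
    """A provider an operator might configure later (toy in-memory user list)."""

    def __init__(self, users: Dict[str, str]) -> None:
        self._users = users

    def authenticate(self, username: str, password: str) -> Optional[str]:
        # Plain comparison is for illustration only, not for production use.
        return username if self._users.get(username) == password else None


def handle_login(provider: AuthProvider, username: str, password: str) -> str:
    # The caller never needs to know which provider is plugged in.
    user = provider.authenticate(username, password)
    return f"welcome {user}" if user is not None else "denied"
```

The point of the pattern is that the calling code is identical in both
cases, so swapping the placeholder for a real provider is a configuration
change. The fallout is the obvious one: until it is swapped, everyone is let
in as the default user.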



On November 16, 2018 at 09:20:53, Ryan Merriman (merrim...@gmail.com) wrote:

Most of the research I've done around adding Metron as a Knox service is
based on how other projects do it. The documentation is not easy to follow
so I learned by reading other service definition files. The assumption
that we are doing things drastically differently is false.

I completely agree with Simon. Why would we want to be dependent on Knox's
release cycle? How does that benefit us? It may reduce some operational
complexity but it makes our install process more complicated because we
require a certain version of Knox (who knows when that gets released).
What do we do in the meantime? I would also like to point out that Metron
is inherently different than other Hadoop stack services. We are a
full-blown application with multiple UIs so the way we expose services
through Knox may be a little different.

I think this will be easier to discuss when we can all see what is actually
involved. I am working on a PR that adds Metron as a Knox service and will
have that out soon. That should give everyone more context.

On Fri, Nov 16, 2018 at 7:39 AM Simon Elliston Ball <
si...@simonellistonball.com> wrote:

> You could say the same thing about Ambari, but that provides mpacks. Knox
> is also designed to be extensible through Knox service stacks since they
> realized they can’t support every project. The challenge is that the docs
> have not made it as easy as they could for the ecosystem to plug into
> Knox, which has led to some confusion around this being a recommended
> pattern (which it is).
>
> The danger of trying to get your bits into Knox is that that ties you to
> their release cycle (a problem Ambari has felt hard, hence their
> community is moving away from the everything inside model towards
> everything is an mpack).
>
> A number of implementations of Knox also use the approach Ryan is
> suggesting for their own organization specific end points, so it’s not
> like this is an uncommon, or anti-pattern, it’s more the way Knox is
> designed to work in the future, than the legacy of it only being able to
> handle a subset of Hadoop projects.
>
> Knox remains optional in our scenario, but we keep control over the
> shipping of things like rewrite rules, which allows Metron to control its
> release destiny should things like url patterns in the ui need to change
> (with a new release of angular / new module / new rest endpoint etc)
> instead of making a Metron release dependent on a Knox release.
>
> Imagine how we would have done with the Ambari side if we’d had to wait
> for them to release every time we needed to change something in the
> mpack... we don’t want that happening with Knox.
>
> Simon
>
> > On 16 Nov 2018, at 13:22, Otto Fowler  wrote:
> >
> >
>
https://issues.apache.org/jira/browse/KNOX-841?jql=project%20%3D%20KNOX%20AND%20text%20~%20support
> >
> > Solr is angular for example.
> >
> >
> > On November 16, 2018 at 08:12:55, Otto Fowler (ottobackwa...@gmail.com)
> > wrote:
> >
> > Ok, here is something I don’t understand, but would like to.
> >
> > Knox comes configured with built-in services for a number of other
> > Apache products and UIs.
> > It would seem to me that the best integration with Knox would be to do
> > what these other products have done.
> >
> >
> > 1. Do whatever you have to do to make your own stuff compatible.
> > 2. Create a Knox service definition and provide it or try to get it
> > into Knox itself
> >
> > This would make the Knox integration with Metron optional and pluggable,
> > wouldn’t it?
> >
> > Then Knox with Metron would just be the same as Knox with anything
> > else.
> > Please help me if I am wrong, but we seem to be going our own way here.
> > Why don’t we just do what these other products have done?
> > Why don’t we try to get Apache Metron services accepted to the Knox
> > project? Why don’t we model our Knox integration on how XYZ does it?
> > Have we looked at how others integrate? H

Re: Metron Release 0.6.1 and/or Plugin release 0.3.0?

2018-11-16 Thread Otto Fowler
Can you generate the jiras that would be included in the release?


On November 16, 2018 at 10:05:50, Justin Leet (justinjl...@gmail.com) wrote:

Given that we've had a couple major PRs (the ES client migration along with
the Angular upgrade stuff, and I'm sure others), I'd be in favor of
releasing both the plugin and the main repo.

I'd be in favor of doing something like:
metron-bro-plugin-kafka release 0.3.0
PR to update full dev
metron release 0.7.0 (given that things changed a good amount)

I would also expect this process to most likely be kicked off post
Thanksgiving break (Thursday 22nd and Friday 23rd for anybody not in the
US).

I can kick out another summarization thread if we want, but basically:
* Nov 26 - Start the Bro plugin release process
* Once that finishes, PR to update full dev with new plugin version
* Once PR is in, start metron release process (hopefully) sometime the week
of the 3rd?

Are there any objections to staggering the releases like that? They could
also be done together, but it means that we have to update full dev to
match the plugin version post release.

On Wed, Nov 14, 2018 at 10:29 AM zeo...@gmail.com  wrote:

> In my opinion metron-bro-plugin-kafka is ready for a release. Anything
> else people would want to see? Once it gets released, I would like to
> update full dev to use the newest version prior to any future metron
> release (0.6.1 or whatever we choose).
>
> Jon
>
> On Wed, Nov 7, 2018 at 8:07 PM zeo...@gmail.com  wrote:
>
> > So, about this release, anybody have time to review
> > apache/metron-bro-plugin-kafka#2 and apache/metron-bro-plugin-kafka#13?
> >
> > Jon
> >
> > On Wed, Oct 17, 2018 at 10:37 AM Michael Miklavcic <
> > michael.miklav...@gmail.com> wrote:
> >
> >> And I do think we will be ready to roll another Metron release in the
> >> near future as well.
> >>
> >> On Wed, Oct 17, 2018 at 8:26 AM Justin Leet 
> >> wrote:
> >>
> >> > I tend to agree with Mike. I think we could do a release of the main
> >> > project and get benefit, but I think skipping this cycle (especially
> >> > since we had a release last month) lets us get things like the ES
> >> > client migration in and settled and just generally make recommending
> >> > a newer version easier.
> >> >
> >> > The main reason for doing a release for the Bro plugin is to address
> >> > shortcomings that we discovered and are addressing.
> >> >
> >> > On Tue, Oct 16, 2018 at 4:42 PM Michael Miklavcic <
> >> > michael.miklav...@gmail.com> wrote:
> >> >
> >> > > Migrating the ES client, refactoring parser bolt. There are some
> >> > > interface changes in flight right now that I think would be
> >> > > beneficial to see in the next release.
> >> > >
> >> > > On Tue, Oct 16, 2018 at 2:01 PM Nick Allen wrote:
> >> > >
> >> > > > I am in favor of a release for both.
> >> > > >
> >> > > > There are a lot of really useful bug fixes, management of pcap
> >> > > > through Ambari, more flexibility for configuring JAAS in Ambari,
> >> > > > increased Elasticsearch performance, the Syslog parser, and the
> >> > > > Batch Profiler, among others. I would be happy with calling it a
> >> > > > 0.6.1 point release.
> >> > > >
> >> > > > Mike - What is outstanding that you would like to see in the
> >> > > > release?
> >> > > >
> >> > > > On Tue, Oct 16, 2018, 12:21 PM Michael Miklavcic <
> >> > > > michael.miklav...@gmail.com> wrote:
> >> > > >
> >> > > > > I'd be +1 on going with just the metron-bro-plugin-kafka
> >> > > > > release. It seems like it's ready to go, and I think there are
> >> > > > > a few more things I'd like to see get into our next Metron
> >> > > > > release so I'm good with holding off there.
> >> > > > >
> >> > > > > Mike
> >> > > > >
> >> > > > > On Tue, Oct 16, 2018 at 10:26 AM Justin Leet <justinjl...@gmail.com> wrote:
> >> > > > >
> >> > > > > > Hi all,
> >> > > > > >
> >> > > > > > As you might recall from a prior discussion about release
> >> > > > > > cadence, we were interested in initiating release threads
> >> > > > > > near our board reports to see if we wanted to do releases or
> >> > > > > > not. Additionally, the work is done to do two separate
> >> > > > > > releases, so our options are releasing both, a single one,
> >> > > > > > or neither.
> >> > > > > >
> >> > > > > > Having said that, a metron-bro-plugin-kafka 0.3.0 release
> >> > > > > > came up on this thread
> >> > > > > > <https://lists.apache.org/thread.html/3c18c3aba6b436b11032831e7db541d50eb7cb1e3ae54b7423057c88@%3Cdev.metron.apache.org%3E>.
> >> > > > > > In particular, the prospect of a release came up in the
> >> > > > > > context of having a version with better (and working)
> >> > > > > > testing.
> >> > > > > >
> >> > > > > > Version Number
> 

Re: [DISCUSS] Knox SSO feature branch review and features

2018-11-16 Thread Otto Fowler
https://issues.apache.org/jira/browse/KNOX-841?jql=project%20%3D%20KNOX%20AND%20text%20~%20support

Solr is angular for example.


On November 16, 2018 at 08:12:55, Otto Fowler (ottobackwa...@gmail.com)
wrote:

Ok, here is something I don’t understand, but would like to.

Knox comes configured with built-in services for a number of other Apache
products and UIs.
It would seem to me that the best integration with Knox would be to do
what these other products have done.


1. Do whatever you have to do to make your own stuff compatible.
2. Create a Knox service definition and provide it or try to get it into
Knox itself

This would make the Knox integration with Metron optional and pluggable,
wouldn’t it?

Then Knox with Metron would just be the same as Knox with anything else.
Please help me if I am wrong, but we seem to be going our own way here.
Why don’t we just do what these other products have done?
Why don’t we try to get Apache Metron services accepted to the Knox
project? Why don’t we model our Knox integration on how XYZ does it?
Have we looked at how others integrate? Having all the code and being
able to track stuff is kind of the point of this whole thing, isn’t it?

Maybe this is implied and I’m missing it, if so I apologize.

I think consistency with the rest of the hadoop stack with knox helps us.



On November 15, 2018 at 22:20:00, Ryan Merriman (merrim...@gmail.com) wrote:

1) Sorry I misspoke. I meant to say this is not possible in the Alerts UI
as far as I know. I put up a PR with a proposed solution here:
https://github.com/apache/metron/pull/1266.
2) Yes Knox is a service you can install with Ambari, similar to Ranger or
Spark. There are some things that are specifically configured in Knox and
there are some things specific to Metron. I will put up a PR with the
changes needed so you can see exactly what is involved.
3) I don't understand what you mean here. Is this a question?
4) I think it's a little early to predict the Ambari changes required.
This will depend on how tasks 1-3 go. I imagine it's similar to other
mpack work: expose some parameters in ambari and bind those to config
files. My understanding from this thread so far is that we should focus on
a manual, documented approach to start.

On Thu, Nov 15, 2018 at 7:53 PM Michael Miklavcic <
michael.miklav...@gmail.com> wrote:

> Thanks Ryan,
>
> 1) Can you clarify "not a good way to do this?" Are you saying we don't
> have a way to set this and need to add the config option, or that a
> solution is not obvious and it's unclear what to do? It seems to me you're
> saying the former, but I'd like to be sure.
> 2) Is Knox not a service made available by Ambari similar to Ranger or
> Spark? I'm assuming that similar to Kerberos, there are some things that
> are specifically configured in Knox and others that are app-specific. Some
> explanation of what this looks like would be helpful.
> 3) Sounds like this follows pretty naturally from #1
> 4) Relates to #2. I think we need some guidance on what a manual vs
> MPack/automated install would look like.
>
> Cheers,
> Mike
>
>
> On Thu, Nov 15, 2018 at 4:07 PM Ryan Merriman  wrote:
>
> > Wanted to give an update on the context path issue. I investigated
> > rewriting url links in the outgoing static assets with Knox and it was
> > not trivial. Fortunately I found a simple solution that works with or
> > without Knox. I changed the base tag in index.html from its previous
> > value to <base href="./">, or in other words made the base href
> > relative.
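
To show why a relative base href works both with and without a gateway in
front, here is a small sketch using Python's standard `urljoin` to mimic how
a browser resolves an asset against the base. The host names, port numbers
and gateway path are hypothetical, and the absolute value "/" used for
comparison is an assumption rather than something stated in the message.

```python
from urllib.parse import urljoin


def resolve_asset(page_url: str, base_href: str, asset: str) -> str:
    """Resolve an asset the way <base href> is applied: the base href is
    resolved against the page URL, then the asset against that base."""
    base = urljoin(page_url, base_href)
    return urljoin(base, asset)


# Hypothetical URLs: the UI served directly, and the same UI behind a gateway path.
DIRECT = "http://node1:4201/"
PROXIED = "https://knox.example.com:8443/gateway/default/metron-alerts-ui/"

for page in (DIRECT, PROXIED):
    for href in ("/", "./"):
        print(f"{page} base={href!r} -> {resolve_asset(page, href, 'main.js')}")
```

With the absolute "/", the proxied page requests main.js from the root of
the gateway host, outside the path the UI is served under. With "./" the
request stays under the page's own path in both cases.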
> >
> > I believe I am at the point where I can task this out and provide a high
> > level overview of the changes needed. I think that each task will be a
> > manageable size and can stand alone so I don't think we need a feature
> > branch.
> >
> > The first task involves a general change to the UI code. We need a way
> > to set the path to the REST service with a configuration setting
> > because it is different with and without Knox. Currently there is not
> > a good way to do this in the UI. We can use the environment files but
> > that is a build time setting and is not flexible. I can see this
> > capability being useful for other use cases in the future. I think we
> > could even split this up into 2 separate tasks, one for the alerts UI
> > and one for the management UI.
> >
> > The second task involves adding Knox to our stack either by default as
> > a dependency in the mpack or with a documented approach. We would add
> > our REST service, Alerts UI, and Management UI as services in Knox.
> > Everything would continue to function as it currently does but with all
> > communication going through Knox. LDAP authentication would be required
> > when using Knox and Knox wi

Re: [DISCUSS] Knox SSO feature branch review and features

2018-11-16 Thread Otto Fowler
Ok, here is something I don’t understand, but would like to.

Knox comes configured with built-in services for a number of other Apache
products and UIs.
It would seem to me that the best integration with Knox would be to do
what these other products have done.


1. Do whatever you have to do to make your own stuff compatible.
2. Create a Knox service definition and provide it or try to get it into
Knox itself

This would make the Knox integration with Metron optional and pluggable,
wouldn’t it?

Then Knox with Metron would just be the same as Knox with anything else.
Please help me if I am wrong, but we seem to be going our own way here.
Why don’t we just do what these other products have done?
Why don’t we try to get Apache Metron services accepted to the Knox
project? Why don’t we model our Knox integration on how XYZ does it?
Have we looked at how others integrate? Having all the code and being
able to track stuff is kind of the point of this whole thing, isn’t it?

Maybe this is implied and I’m missing it, if so I apologize.

I think consistency with the rest of the hadoop stack with knox helps us.



On November 15, 2018 at 22:20:00, Ryan Merriman (merrim...@gmail.com) wrote:

1) Sorry I misspoke. I meant to say this is not possible in the Alerts UI
as far as I know. I put up a PR with a proposed solution here:
https://github.com/apache/metron/pull/1266.
2) Yes Knox is a service you can install with Ambari, similar to Ranger or
Spark. There are some things that are specifically configured in Knox and
there are some things specific to Metron. I will put up a PR with the
changes needed so you can see exactly what is involved.
3) I don't understand what you mean here. Is this a question?
4) I think it's a little early to predict the Ambari changes required.
This will depend on how tasks 1-3 go. I imagine it's similar to other
mpack work: expose some parameters in ambari and bind those to config
files. My understanding from this thread so far is that we should focus on
a manual, documented approach to start.

On Thu, Nov 15, 2018 at 7:53 PM Michael Miklavcic <
michael.miklav...@gmail.com> wrote:

> Thanks Ryan,
>
> 1) Can you clarify "not a good way to do this?" Are you saying we don't
> have a way to set this and need to add the config option, or that a
> solution is not obvious and it's unclear what to do? It seems to me
> you're saying the former, but I'd like to be sure.
> 2) Is Knox not a service made available by Ambari similar to Ranger or
> Spark? I'm assuming that similar to Kerberos, there are some things that
> are specifically configured in Knox and others that are app-specific.
> Some explanation of what this looks like would be helpful.
> 3) Sounds like this follows pretty naturally from #1
> 4) Relates to #2. I think we need some guidance on what a manual vs
> MPack/automated install would look like.
>
> Cheers,
> Mike
>
>
> On Thu, Nov 15, 2018 at 4:07 PM Ryan Merriman wrote:
>
> > Wanted to give an update on the context path issue. I investigated
> > rewriting url links in the outgoing static assets with Knox and it was
> > not trivial. Fortunately I found a simple solution that works with or
> > without Knox. I changed the base tag in index.html from its previous
> > value to <base href="./">, or in other words made the base href
> > relative.
> >
> > I believe I am at the point where I can task this out and provide a
high
> > level overview of the changes needed. I think that each task will be a
> > manageable size and can stand alone so I don't think we need a feature
> > branch.
> >
> > The first task involves a general change to the UI code. We need a way
> to
> > set the path to the REST service with a configuration setting because
it
> is
> > different with and without Knox. Currently there is not a good way to
do
> > this in the UI. We can use the environment files but that is a build
> time
> > setting and is not flexible. I can see this capability being useful for
> > other use cases in the future. I think we could even split this up into
> 2
> > separate tasks, one for the alerts UI and one for the management UI.
> >
> > The second task involves adding Knox to our stack either by default as
a
> > dependency in the mpack or with a documented approach. We would add our
> > REST service, Alerts UI, and Management UI as services in Knox.
> Everything
> > would continue to function as it currently does but with all
> communication
> > going through Knox. LDAP authentication would be required when using
> Knox
> > and Knox will authenticate with the REST service by passing along an
> > Authorization header. Enabling Knox would be a manual process that
> > involves deploying assets (Knox descriptor files) and changing
> > configuration. There would be no change to how the UI functions by
> default
> > (without Knox) and either LDAP or JDBC authentication could still be
> used.
> >
> > The third task involves enabling SSO with Knox. We would update the
REST
> > service so that it can authenticate 

Re: [DISCUSS] Knox SSO feature branch review and features

2018-11-16 Thread Otto Fowler
Welcome and thanks!


On November 12, 2018 at 10:12:44, Sandeep Moré (moresand...@gmail.com)
wrote:

Hello Ryan,

I am still catching up on the architecture so let me know if I am
misunderstanding anything.
You could have multiple services deployed in Knox
1. for metron (metron/api/v1)
2. for alerts-ui (metron-alerts-ui/alerts-list)
and have them run in one Knox instance and you could have one service
reference from other (not recommended but possible).
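
As a rough illustration of the layout above, the gateway URLs for the two
services could be composed as follows. This is only a sketch: `knox_url` is a
hypothetical helper (not part of Knox or Metron), and the host, port, and
`default` topology name are borrowed from URLs quoted elsewhere in this thread
and will differ per deployment.

```python
def knox_url(host, service_path, topology="default", port=8443):
    """Compose https://<host>:<port>/gateway/<topology>/<service_path>.

    Hypothetical helper for illustration only; the topology name is a
    parameter because the gateway path is configurable per Knox instance.
    """
    return f"https://{host}:{port}/gateway/{topology}/{service_path.lstrip('/')}"


# The two services listed above, behind one Knox instance.
rest_url = knox_url("node1", "metron/api/v1")
alerts_url = knox_url("node1", "metron-alerts-ui/alerts-list")

print(rest_url)
print(alerts_url)
```

Keeping the topology as an argument reflects the point made elsewhere in the
thread that `gateway/default` could just as well be `gateway/xyz`.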

You can tailor rewrite rules to update the context path for the assets as
well, as pointed out by Simon.

Knox Wiki [1] has some blogs and tutorials that you can look at. This [2]
is a good tutorial on rewriting static assets; there is also a blog [3]
on basics of rewrite rules that should be a good reference.

I would also be glad to look at the service definitions you have and answer
any questions.

[1] https://cwiki.apache.org/confluence/display/KNOX/Index
[2]
https://cwiki.apache.org/confluence/display/KNOX/Proxying+a+UI+using+Knox
[3]
https://cwiki.apache.org/confluence/display/KNOX/Proxying+a+UI+using+Knox

Best,
Sandeep

P.S. I am a Knox committer and new to Metron.

On Mon, Nov 12, 2018 at 9:59 AM Ryan Merriman  wrote:

> I'm just coming up to speed on Knox so maybe rewriting asset links is
> trivial. If anyone has a good example of how to do that or can point to
> some documentation, please share.
>
> On Mon, Nov 12, 2018 at 8:54 AM Simon Elliston Ball <
> si...@simonellistonball.com> wrote:
>
> > Doing the Knox proxy work first certainly does make a lot of sense vs
the
> > SSO first approach, so I'm in favour of this. It bypasses all the
> anti-CORS
> > proxying stuff the other solution needed by being on the same URL
space.
> >
> > Is there a reason we're not re-writing the asset link URLs in Knox?
We
> > should have a reverse content rewrite rule to avoid that problem and
make
> > it entirely transparent whether there is Knox or not. We shouldn't be
> > changing anything about the UI services themselves. If the rewrite
> service
> > is complete, there is no change to base ref in the UI code, Knox would
> > effectively apply it by content filtering. Note also that the gateway
URL
> > is configurable and likely to vary from Knox to Knox, so baking it into
> the
> > ng build will break non-full-dev builds. (e.g. gateway/default could
well
> > be gateway/xyz).
> >
> > I would also like to discuss removing the JDBC auth, because it's a set
> of
> > plaintext passwords in a mysql DB... it introduces a problematic
> dependency
> > (mysql) a ton of java dependencies we could cut out (JPA, eclipselink)
> and
> > opens up a massive security hole. I personally know of several
> > organisations who are blocked from using Metron by the presence of the
> JDBC
> > authentication method in its current form.
> >
> > Simon
> >
> > On Mon, 12 Nov 2018 at 14:36, Ryan Merriman 
wrote:
> >
> > > Let me clarify on exposing both legacy and Knox URLs at the same
time.
> > The
> > > base urls will look something like this:
> > >
> > > Legacy REST - http://node1:8082/api/v1
> > > Legacy Alerts UI - http://node1:4201/alerts-list
> > >
> > > Knox REST - https://node1:8443/gateway/default/metron/api/v1
> > > Knox Alerts UI -
> > > https://node1:8443/gateway/default/metron-alerts-ui/alerts-list
> > >
> > > If Knox were turned on and the alerts UI deployed as is, it would not
> > > work. This is because static assets are referenced with
> > > http://node1:4201/assets/some-asset.js which does not include the
> > correct
> > > context path to the alerts UI in knox. To make it work, you have to
> set
> > > the base ref to "/gateway/default/metron-alerts-ui" so that static
> assets
> > > are referenced at
> > >
> https://node1:8443/gateway/default/metron-alerts-ui/assets/some-asset.js
> > .
> > > When you do that, the legacy alerts UI will no longer work. I guess
> the
> > > point I'm trying to make is that we would have to switch between them
> or
> > > have 2 separate applications running. I imagine most users only need
> one
> > or
> > > the other running so probably not an issue.
> > >
> > > Jon, the primary upgrade consideration I see is with authentication.
> To
> > be
> > > able to use Knox, you would have to upgrade to LDAP-based
> authentication
> > if
> > > you were still using JDBC-based authentication in REST. The urls
would
> > > also change obviously.
> > >
> > > On Sun, Nov 11, 2018 at 6:38 PM zeo...@gmail.com 
> > wrote:
> > >
> > > > Phew, that was quite the thread to catch up on.
> > > >
> > > > I agree that this should be optional/pluggable to start, and I'm
> > > interested
> > > > to hear the issues as they relate to upgrading an existing cluster
> > (given
> > > > the suggested approach) and exposing both legacy and knox URLs at
the
> > > same
> > > > time.
> > > >
> > > > Jon
> > > >
> > > > On Fri, Nov 9, 2018, 4:46 PM Michael Miklavcic <
> > > > michael.miklav...@gmail.com>
> > > > wrote:
> > > >
> > > > > A couple more things, and I think this goes 

Re: [DISCUSS] Attribution and merging the Elasticsearch client migration

2018-11-16 Thread Otto Fowler
g");
>     properties.put("long_field", longType);
>     Map floatType = new HashMap<>();
>     floatType.put("type", "float");
>     properties.put("latitude", floatType);
>     Map doubleType = new HashMap<>();
>     doubleType.put("type", "double");
>     properties.put("score", doubleType);
>   }
>
>   /**
>    * Add test fields to a template with defined types in case they are
>    * not defined in the sensor template shipped with Metron.
>    * This is useful for testing certain cases, for example faceting on
>    * fields of various types.
>    * @param template
>    * @param docType
>    */
>   private static void addTestFieldMappings(JSONObject template, String docType) {
>     Map mappings = (Map) template.get("mappings");
>     Map docTypeJSON = (Map) mappings.get(docType);
>     Map properties = (Map) docTypeJSON.get("properties");
>     Map longType = new HashMap<>();
>     longType.put("type", "long");
>     properties.put("long_field", longType);
>     Map floatType = new HashMap<>();
>     floatType.put("type", "float");
>     properties.put("latitude", floatType);
>     Map doubleType = new HashMap<>();
>     doubleType.put("type", "double");
>     properties.put("score", doubleType);
>   }
>
>
> On Thu, Nov 15, 2018 at 4:21 PM Otto Fowler 
> wrote:
>
>> Can you diff the trees to be sure?
>>
>>
>> On November 15, 2018 at 17:52:40, Michael Miklavcic (
>> michael.miklav...@gmail.com) wrote:
>>
>> So amazingly, this still results in conflicts, but I am able to
>> resolve
>> them manually in a sensible fashion.
>> git merge -X theirs es-rebased
>> CONFLICT (rename/rename): Rename
>>
>> "metron-interface/metron-config/src/app/rxjs-operators.ts"->"metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/ParserRunnerResults.java"
>> in branch "HEAD" rename
>>
>> "metron-interface/metron-config/src/app/rxjs-operators.ts"->"metron-platform/metron-elasticsearch/src/main/java/org/apache/metron/elasticsearch/utils/FieldMapping.java"
>> in "stella-es-base2"
>>
>> So where I landed gives a history like the following:
>> * df1195aa 2018-11-15 | METRON-1834: Migrate Elasticsearch from
>> TransportClient to new Java REST API (mmiklavc via mmiklavc) closes
>> apache/metron#1242 (HEAD -> master-merge) [mmiklavc]
>> |\
>> | * 590b3669 2018-11-15 | METRON-1834: Migrate Elasticsearch from
>> TransportClient to new Java REST API (mmiklavc via mmiklavc) closes
>> apache/metron#1242 (stella-es-base2) [mmiklavc]
>> | * a7c7dc28 2018-10-08 | Casey Stella - elasticsearch rest client
>> migration base work (stella-es-base) [cstella]
>> * | 0c4c622b 2018-11-14 | METRON-1749 Update Angular to latest release in
>> Management UI (sardell via nickwallen) closes apache/metron#1217
>> (origin/master, origin/HEAD, master) [sardell]
>> ...
>>
>> I can modify a7c7dc28 commit message as well and hopefully this will be
>> good for everyone?
>>
>> Cheers,
>> Mike
>>
>>
>>
>>
>> On Thu, Nov 15, 2018 at 2:29 PM Justin Leet 
>> wrote:
>>
>> > I took a look at this with Mike a bit, and it seems like it's pretty
>> > painful and without a clear way to avoid remerging conflicts. If the
>> > latest attempt doesn't work, I'm in favor of getting it in and just
>> getting
>> > it down to as few commits as reasonably possible.
>> >
>> > On Thu, Nov 15, 2018 at 4:12 PM Michael Miklavcic <
>> > michael.miklav...@gmail.com> wrote:
>> >
>> > > I'm attempting 1 more option, which would be to do a "git merge
>> > > --strategy-option theirs" after having done the commit wrangling in
>> the
>> > PR
>> > > branch. Will reply back with results.
>> > >
>> > > On Thu, Nov 15, 2018 at 2:02 PM Michael Miklavcic <
>> > > michael.miklav...@gmail.com> wrote:
>> > >
>> > > > Yes, definitely.
>> > > >
>> > > > On Thu, Nov 15, 2018 at 2:01 PM Casey Stella 
>> > wrote:
>> > > >
>> > > >> Can you at least rename your commits to have METRON-1834 prefixing
>> > them?
>> > 

Re: [DISCUSS] Attribution and merging the Elasticsearch client migration

2018-11-15 Thread Otto Fowler
Can you diff the trees to be sure?
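
One way such a check could be scripted is sketched below. It relies on
`git rev-parse <rev>^{tree}`, which prints the id of the tree object a
revision points at; two revisions with the same tree id have identical file
contents regardless of how their histories differ. The helper names
`tree_id` and `same_tree` are hypothetical, and which two revisions to
compare is left to the caller.

```python
import subprocess


def tree_id(rev, repo="."):
    """Return the id of the tree object that `rev` points at."""
    result = subprocess.run(
        ["git", "-C", repo, "rev-parse", f"{rev}^{{tree}}"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def same_tree(rev_a, rev_b, repo=".", lookup=tree_id):
    """True when both revisions contain exactly the same files and contents.

    `lookup` is injectable so the comparison can be exercised without a
    real repository.
    """
    return lookup(rev_a, repo) == lookup(rev_b, repo)
```

For example, `same_tree("master-merge", "some-other-branch")` would report
whether the merge result matches another candidate branch; when the ids
differ, `git diff <rev_a> <rev_b>` shows where.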


On November 15, 2018 at 17:52:40, Michael Miklavcic (
michael.miklav...@gmail.com) wrote:

So amazingly, this still results in conflicts, but I am able to resolve
them manually in a sensible fashion.
git merge -X theirs es-rebased
CONFLICT (rename/rename): Rename
"metron-interface/metron-config/src/app/rxjs-operators.ts"->"metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/ParserRunnerResults.java"

in branch "HEAD" rename
"metron-interface/metron-config/src/app/rxjs-operators.ts"->"metron-platform/metron-elasticsearch/src/main/java/org/apache/metron/elasticsearch/utils/FieldMapping.java"

in "stella-es-base2"

So where I landed gives a history like the following:
* df1195aa 2018-11-15 | METRON-1834: Migrate Elasticsearch from
TransportClient to new Java REST API (mmiklavc via mmiklavc) closes
apache/metron#1242 (HEAD -> master-merge) [mmiklavc]
|\
| * 590b3669 2018-11-15 | METRON-1834: Migrate Elasticsearch from
TransportClient to new Java REST API (mmiklavc via mmiklavc) closes
apache/metron#1242 (stella-es-base2) [mmiklavc]
| * a7c7dc28 2018-10-08 | Casey Stella - elasticsearch rest client
migration base work (stella-es-base) [cstella]
* | 0c4c622b 2018-11-14 | METRON-1749 Update Angular to latest release in
Management UI (sardell via nickwallen) closes apache/metron#1217
(origin/master, origin/HEAD, master) [sardell]
...

I can modify a7c7dc28 commit message as well and hopefully this will be
good for everyone?

Cheers,
Mike




On Thu, Nov 15, 2018 at 2:29 PM Justin Leet  wrote:

> I took a look at this with Mike a bit, and it seems like it's pretty
> painful and without a clear way to avoid remerging conflicts. If the
> latest attempt doesn't work, I'm in favor of getting it in and just
getting
> it down to as few commits as reasonably possible.
>
> On Thu, Nov 15, 2018 at 4:12 PM Michael Miklavcic <
> michael.miklav...@gmail.com> wrote:
>
> > I'm attempting 1 more option, which would be to do a "git merge
> > --strategy-option theirs" after having done the commit wrangling in the
> PR
> > branch. Will reply back with results.
> >
> > On Thu, Nov 15, 2018 at 2:02 PM Michael Miklavcic <
> > michael.miklav...@gmail.com> wrote:
> >
> > > Yes, definitely.
> > >
> > > On Thu, Nov 15, 2018 at 2:01 PM Casey Stella 
> wrote:
> > >
> > >> Can you at least rename your commits to have METRON-1834 prefixing
> them?
> > >> On Thu, Nov 15, 2018 at 15:19 Michael Miklavcic <
> > >> michael.miklav...@gmail.com>
> > >> wrote:
> > >>
> > >> > https://github.com/apache/metron/pull/1242
> > >> >
> > >> > TL;DR
> > >> > I'd like to discuss the best option to merge METRON-1834 into
> master.
> > I
> > >> > want to propose handling this like a feature branch and merging it
> > >> as-is.
> > >> > ---
> > >> >
> > >> > I'm sure most folks' initial reaction will be some skepticism akin
> to
> > >> "have
> > >> > you tried turning it off again," as this was my initial reaction
as
> > >> well.
> > >> > It does not seem like this should be difficult. And I'm hoping
that
> > this
> > >> > may be some esoteric thing on my system, though I believe this is
a
> > real
> > >> > problem. A rather tedious explanation follows of what I've tried
and
> > the
> > >> > problems encountered along the way. What seemed like a really
simple
> > >> > problem instead appears to be a bit much for Git to handle without
> > >> > requiring redoing merges and another full round of testing. I'd
much
> > >> prefer
> > >> > to avoid that in this instance.
> > >> >
> > >> > This PR is ready to be merged into master. It's recent and very
> close
> > to
> > >> > fully up to date in the branch. Latest master merges cleanly.
There
> is
> > >> an
> > >> > attribution to Casey Stella for the base point of this PR that I
> need
> > to
> > >> > include when getting this into master. When I created my branch, I
> > >> > collapsed his initial set of commits into a single squashed commit
> on
> > >> > master at the time, and I started to work from there. Over time, I
> > made
> > >> a
> > >> > number of additional commits and merges from master. Now for the
> > issues.
> > >> >
> > >> > Originally, my expectation was that I could have 2 commits - the
> > >> original
> > >> > squashed commit from Casey along with all my additional commits
(and
> > the
> > >> > merges with master) right on top. Nice clean history on master.
> Turns
> > >> out,
> > >> > this doesn't work as cleanly as expected because of a combination of
> the
> > >> > multiple merges and the need to keep the original commit with
> > >> attribution
> > >> > to Casey's work. A normal git pull --squash works fine, as
expected,
> > >> but we
> > >> > lose the base commit, and therefore the requisite attribution.
Here
> > are
> > >> > some other things I've tried, to no avail.
> > >> >
> > >> > 1. Git pull --squash after a merge with master. This will squash
> > the
> > >> > entire tree back to the branch point. No good.
> > >> > 2. Git rebase -i 

Re: [DISCUSS] Attribution and merging the Elasticsearch client migration

2018-11-15 Thread Otto Fowler
Proper attribution and the correct code are the most important things, not
the number of commits.


On November 15, 2018 at 16:29:04, Justin Leet (justinjl...@gmail.com) wrote:

I took a look at this with Mike a bit, and it seems like it's pretty
painful and without a clear way to avoid remerging conflicts. If the
latest attempt doesn't work, I'm in favor of getting it in and just getting
it down to as few commits as reasonably possible.

On Thu, Nov 15, 2018 at 4:12 PM Michael Miklavcic <
michael.miklav...@gmail.com> wrote:

> I'm attempting 1 more option, which would be to do a "git merge
> --strategy-option theirs" after having done the commit wrangling in the
PR
> branch. Will reply back with results.
>
> On Thu, Nov 15, 2018 at 2:02 PM Michael Miklavcic <
> michael.miklav...@gmail.com> wrote:
>
> > Yes, definitely.
> >
> > On Thu, Nov 15, 2018 at 2:01 PM Casey Stella 
wrote:
> >
> >> Can you at least rename your commits to have METRON-1834 prefixing
them?
> >> On Thu, Nov 15, 2018 at 15:19 Michael Miklavcic <
> >> michael.miklav...@gmail.com>
> >> wrote:
> >>
> >> > https://github.com/apache/metron/pull/1242
> >> >
> >> > TL;DR
> >> > I'd like to discuss the best option to merge METRON-1834 into
master.
> I
> >> > want to propose handling this like a feature branch and merging it
> >> as-is.
> >> > ---
> >> >
> >> > I'm sure most folks' initial reaction will be some skepticism akin
to
> >> "have
> >> > you tried turning it off again," as this was my initial reaction as
> >> well.
> >> > It does not seem like this should be difficult. And I'm hoping that
> this
> >> > may be some esoteric thing on my system, though I believe this is a
> real
> >> > problem. A rather tedious explanation follows of what I've tried and
> the
> >> > problems encountered along the way. What seemed like a really simple
> >> > problem instead appears to be a bit much for Git to handle without
> >> > requiring redoing merges and another full round of testing. I'd much
> >> prefer
> >> > to avoid that in this instance.
> >> >
> >> > This PR is ready to be merged into master. It's recent and very
close
> to
> >> > fully up to date in the branch. Latest master merges cleanly. There
is
> >> an
> >> > attribution to Casey Stella for the base point of this PR that I
need
> to
> >> > include when getting this into master. When I created my branch, I
> >> > collapsed his initial set of commits into a single squashed commit
on
> >> > master at the time, and I started to work from there. Over time, I
> made
> >> a
> >> > number of additional commits and merges from master. Now for the
> issues.
> >> >
> >> > Originally, my expectation was that I could have 2 commits - the
> >> original
> >> > squashed commit from Casey along with all my additional commits (and
> the
> >> > merges with master) right on top. Nice clean history on master.
Turns
> >> out,
> >> > this doesn't work as cleanly as expected because of a combination of
the
> >> > multiple merges and the need to keep the original commit with
> >> attribution
> >> > to Casey's work. A normal git pull --squash works fine, as expected,
> >> but we
> >> > lose the base commit, and therefore the requisite attribution. Here
> are
> >> > some other things I've tried, to no avail.
> >> >
> >> > 1. Git pull --squash after a merge with master. This will squash
> the
> >> > entire tree back to the branch point. No good.
> >> > 2. Git rebase -i master. Allows you to cleanly apply changes, but
> >> then
> >> > it ends up having problems with a clean rebase and shows
> conflicts. I
> >> > expect this is because of the merge history being necessary.
> >> > 3. Checking out a branch from the base point squashed commit from
> >> Casey,
> >> > and attempt to apply my changes on top. Numerous methods for
> >> > squashing/rebasing my changes on top applies nicely in the branch.
> >> But
> >> > then
> >> > it once again causes merge conflicts when I attempt to get this
> onto
> >> > master. Things I attempted include: manually copying files,
> rebasing
> >> > all my
> >> > commits plus merges on top of the base commit, git merge --squash,
> >> > intimidation.
> >> >
> >> > For one example of the result I'm talking about, this looks "good"
but
> >> it's
> >> > missing a ton of recent commits because they get caught up in the
> rebase
> >> > and get squashed in with my commit. When you attempt to merge this
> onto
> >> > master, it is just plain wrong (see example below with merge
> conflicts).
> >> > * 22c3b3bc 2018-11-15 | METRON-1834: Migrate Elasticsearch from
> >> > TransportClient to new Java REST API (mmiklavc via mmiklavc) closes
> >> > apache/metron#1242 (HEAD -> stella-es-base2) [mmiklavc]
> >> > * 84232e90 2018-10-08 | METRON-1834: Elasticsearch rest client
> migration
> >> > base work starting point for apache/metron#1242 (cstella via
mmiklavc)
> >> > [cstella]
> >> > * 5bfc08c5 2018-10-08 | METRON-1792 Simplify Profile Definitions in
> >> > Integration Tests (nickwallen) closes 

New WIP PR, a new Full Dev option

2018-11-13 Thread Otto Fowler
I have submitted a WIP PR [#1261](https://github.com/apache/metron/pull/1261)
that makes it possible to run / try the metron full dev environment with
only Vagrant, VirtualBox and Docker installed, as opposed to having to have
all the dev tools and ansible at the right version.

I think this would be a big help for new users.

Please give it a look.

I look forward to your feedback.

O


Re: [DISCUSS] Slack Channel Use

2018-11-12 Thread Otto Fowler
What we need is a slackbot where you can:

- start a discuss thread, that will get sent to email list
- listen to emails ( be on the list ) and post any discuss thread replies
not from slack TO slack
- add tagged comments to discuss thread

the list-slack singularity



On November 12, 2018 at 11:49:15, Scott C. Cote (scottcc...@gmail.com)
wrote:

I realize that I’m a “new kid” here, but I think you can have your cake and
eat it too…..

If I can create it, find it, or configure it, perhaps the really best way
is to be able to either:

1) dump public slack conversations to the developer thread - arbitrarily
2) dump public slack conversations to the user thread - IFF the
thread/conversation was tagged #user (or equivalent)

Slack is a wonderful tool for facilitating discussions - I cannot emphasize
how often spam filters and the inherent slowness of email servers have
interfered with rapid conversations. Additionally, the big “ask” of any
resolution on slack - has been “can you put this in the email thread”. Goes
without saying that the even bigger ask has been - can this be contributed
to the documentation.

I strongly recommend that you streamline the flow of information from Slack
to the list archives.

SCott
Scott C. Cote
scottcc...@gmail.com
972.900.1561

twitter: @scottccote



> On Nov 12, 2018, at 9:07 AM, Justin Leet  wrote:
>
> I wanted to add back onto this thread after putting some more thought
into
> it.
>
> I like Slack for the type of small developer "what's going on here?" type
> discussions. That's the kind of thing I like being real-time ("Hey, full
> dev is acting weird", "What's the basic layout of this stuff?", "Anybody
> else seen this test failure?", etc.). I think we've been pretty good
about
> keeping our decision type dev discussions to the list (e.g. this exact
> conversation).
>
> We've been doing this more, but I would like to see more of the user and
> troubleshooting move to the list. I think we've gotten a bit better about
> it as we've settled into Slack, but having that sort of helpful stuff
> exposed and searchable for users who come in afterwards is a big selling
> point of the lists, imo.
>
> To add onto this, I'd probably like to see
>
https://cwiki.apache.org/confluence/display/METRON/Community+Resources#CommunityResources-ApacheMetronCommunityResources
> (and any other relevant links) updated to emphasize a Slack focus on
> developing Metron itself, and the user lists for configuration,
> troubleshooting, etc.
>
> Essentially, I'm proposing:
> Dev list / Jira / PRs as usual for any actual decisions + concrete
feature
> discussion/review.
> Slack for Metron development "Hey, anyone seen this or have insight or a
> starting point?" and "I'm seeing something weird in our tests" type stuff
> User list for usage and troubleshooting questions. Generally, discussions
> like this in Slack should be redirected to the user list.
>
> Is this a reasonable way to separate our concerns here?
>
> On Wed, Oct 24, 2018 at 11:37 AM Michael Miklavcic <
> michael.miklav...@gmail.com> wrote:
>
>> Yeah, I'm also surprised by that comment about the mailing list
activity.
>> Our dev/user list discussions are by far more active than they've ever
>> been. Just have a look at the list of DISCUSS threads that have come up
in
>> the past few months and it's clear that not only participation has
>> increased, but diversity of topic and participant.
>>
>> On Wed, Oct 24, 2018 at 8:08 AM Casey Stella  wrote:
>>
>>> Not for nothing, but at least according to the last board report that I
>>> submitted, the user@ traffic is up 100% and the dev list traffic is
flat
>>> as
>>> compared to last quarter. That's not to say that we couldn't stand more
>>> discussion on the lists, but a lot of the dev discussion happens on
>> github
>>> and JIRA and I'm happy to see an uptick in user traffic.
>>>
>>> On Wed, Oct 24, 2018 at 10:05 AM Otto Fowler 
>>> wrote:
>>>
>>>> I wouldn’t be so quick to relate the slack discussion with perceived
>>>> activity on the list.
>>>> That is more due to the other things that are bigger issues.
>>>>
>>>>
>>>> On October 24, 2018 at 07:15:30, Nick Allen (n...@nickallen.org)
>> wrote:
>>>>
>>>>> I have heard recently people thought Metron is sort of dead just
>>> because
>>>> the mailing list is not so active anymore!
>>>>
>>>> That is exactly my concern.
>>>>
>>>>
>>>> On Wed, Oct 24, 2018, 2:49 AM Ali Nazemian 
>>> wrote:
>>>>
>>>>> I kind of exp

Broken build at the moment

2018-11-08 Thread Otto Fowler
We have a stellar test for date format that is broken because of the
daylight savings change.  Justin and I have been working through it
and I’ll have a PR as soon as my travis build completes.

https://issues.apache.org/jira/browse/METRON-1864

Just a heads up that any new builds ( at least in the US ) will fail



Results :

Failed tests:
  DateFunctionsTest.testDateFormatDefault:239 null


Re: [DISCUSS] Day 1 User Experience - Getting Metron Running

2018-10-26 Thread Otto Fowler
What is the metron on docker part?


On October 26, 2018 at 14:37:48, Nick Allen (n...@nickallen.org) wrote:

> Yeah I would +1 katakoda.

Has anyone used or have a history with KataKoda? I'd hate to invest time
in a hosted solution if the provider isn't going to be around. That's a
definite 'con' to taking that approach.

Although most of the effort would be invested in "Metron on Docker" which
might have value outside of KataKoda. And some level of work has already
been done on Docker.


> I also think that it would help to start distributing RPMs, DEBs, and the
mpacks with the releases..

Agreed. I was thinking that whatever solution falls out of this discussion
might require RPMs, DEBs, Maven Central, etc as prerequisites. Although
each of those have value in their own right.



On Fri, Oct 26, 2018 at 1:42 PM zeo...@gmail.com  wrote:

> Yeah I would +1 katakoda. I also think that it would help to start
> distributing RPMs, DEBs, and the mpacks with the releases, as well as
> consider a service like opensuse's build service for nightlies, etc.
>
> Jon
>
> On Fri, Oct 26, 2018 at 6:25 AM Anand Subramanian <
> asubraman...@hortonworks.com> wrote:
>
> > Great idea! This will be a HUGE improvement in the user experience for
> > first timers to Metron. Katakoda seems very interesting - simple and
> > straight-forward. I loved the way you can provide instructions,
commands
> > (that can be directly clicked!), links, explanation and so on.
> >
> > Regards,
> > Anand
> >
> > On 10/25/18, 7:49 PM, "Nick Allen"  wrote:
> >
> > We all know spinning up the development environment is a pain.
> > Unfortunately, it is the only way for a new user to get a feel for
> > Metron.
> > We need a better way to introduce new users to Metron.
> >
> > I am hoping we can brainstorm ways to improve that experience. Here
> > are a
> > few thoughts that might help start a discussion.
> >
> > (1) Create a *KataKoda* [1] based demo. I ran across this after
> > finding
> > Apache Ozone's demo [2], which I think is great.
> >
> >
> > - A user does not need to download or install anything. It is a
> > completely hosted offering.
> > - Provides a step-by-step demo experience that could guide
> users
> > through creating an enrichment, defining a profile, managing
> > alerts.
> > - Would require a Metron on Docker solution.
> >
> > (2) Create a *Vagrant Cloud* [3] hosted image of "Full Dev" with
> > everything
> > installed and ready to rock. A user would just need to install
> > Vagrant and
> > run:
> >
> > vagrant init metron/0.6.0
> >
> > vagrant up
> >
> >
> > - Reduces the number of dependencies needed to get Metron
> > up-and-running.
> > - Significantly increases the success rate of new users getting
> > Metron running.
> > - Still results in "Full Dev" Metron which requires too many
> > resources for the average computer.
> >
> > Are these good options? What other approaches could we take?
> Hopefully
> > some JIRAs might fall out of this discussion.
> >
> > - Nick
> >
> >
> > --
> > [1] https://www.katacoda.com
> > [2] https://www.katacoda.com/elek/scenarios/ozone101
> > [3] https://app.vagrantup.com/boxes/search
> >
> >
> > --
>
> Jon Zeolla
>


metron-elasticsearch integration tests failing after merging in master

2018-10-24 Thread Otto Fowler
https://travis-ci.org/ottobackwards/metron/jobs/445723343
Anyone having ES test problems?  Can anyone shed any light on this?


Re: [DISCUSS] Slack Channel Use

2018-10-24 Thread Otto Fowler
I wouldn’t be so quick to relate the slack discussion with perceived
activity on the list.
That is more due to the other things that are bigger issues.


On October 24, 2018 at 07:15:30, Nick Allen (n...@nickallen.org) wrote:

> I have heard recently people thought Metron is sort of dead just because
the mailing list is not so active anymore!

That is exactly my concern.


On Wed, Oct 24, 2018, 2:49 AM Ali Nazemian  wrote:

> I kind of expect to have Slack for more dev related discussions rather
than
> user QA. I guess it is quite common to expect mailing list to be used for
> the purpose of knowledge sharing to make sure it will be accessible by
> other users as well. Of course, it is a trade-off that most of the other
> Apache projects decided to accept the risk of keeping user related
> discussions out of Slack/IRC. However, it sometimes happens to see the
> mixture of questions coming to Slack. I have heard recently people
thought
> Metron is sort of dead just because the mailing list is not so active
> anymore!
>
> Cheers,
> Ali
>
> On Tue, Oct 23, 2018 at 8:23 AM Casey Stella  wrote:
>
> > Agreed, the benefit of the mailing list is that it’s searchable by
> ponymail
> > and the major search engines.
> > On Mon, Oct 22, 2018 at 17:18 Nick Allen  wrote:
> >
> > > I don't know that it is the same kind of searchable. Is it being
> indexed
> > > by the major search engines? I have never used a search engine and
> > > uncovered the answer to my problem in a Slack archive.
> > >
> > > On Mon, Oct 22, 2018 at 5:05 PM Otto Fowler 
> > > wrote:
> > >
> > > > According to Greg Stein, an infra admin on the NiFi slack, the ASF
> > slack
> > > > that metron is in IS the standard plan, not the free one and is
> > > searchable
> > > > past 10,000 messages.
> > > >
> > > >
> > > >
> > > > On October 22, 2018 at 15:35:51, Michael Miklavcic (
> > > > michael.miklav...@gmail.com) wrote:
> > > >
> > > > ...From an archival and broader reach point of view, I do think
> there's
> > > > something to be said about using the mailing list. It's also easier
> to
> > > link
> > > > to Q/A threads from the mailing list archives and do searches...
> > > >
> > > >
> > >
> >
>
https://lists.apache.org/thread.html/1aa85bc13d41e04a1f85c3100c2b803abe35d79b54062bbeaab83ace@%3Cdev.metron.apache.org%3E
> > > >
> > > > How very Inception.
> > > >
> > > >
> > > > On Mon, Oct 22, 2018 at 1:32 PM Michael Miklavcic <
> > > > michael.miklav...@gmail.com> wrote:
> > > >
> > > > > I just want to point out that we currently have 32 members in the Metron
> > > > > Slack channel which I personally think is a great sign. This is good from a
> > > > > community perspective and helps foster interactive sessions where required.
> > > > > From an archival and broader reach point of view, I do think there's
> > > > > something to be said about using the mailing list. It's also easier to link
> > > > > to Q/A threads from the mailing list archives and do searches. As such, I
> > > > > would also go along with Nick's suggestion and urge members to prefer the
> > > > > user/dev list where possible.
> > > > >
> > > > > On Mon, Oct 22, 2018 at 10:51 AM Justin Leet <
> justinjl...@gmail.com>
> > > > > wrote:
> > > > >
> > > > >> If we want to push more discussion to the dev list, my obvious
> > follow
> > > up
> > > > >> question then is "What are we hoping to get out of
Slack/irc/other
> > > > >> interactive medium?". What discussion would we even want on
there,
> > if
> > > we
> > > > >> can't have decisions and don't want usage/support?
> > > > >>
> > > > >> On Mon, Oct 22, 2018 at 12:44 PM Casey Stella  >
> > > > wrote:
> > > > >>
> > > > >> > I am of 2 minds, but I tend to agree. On the one hand, it's
> > > definitely
> > > > >> the
> > > > >> > preference that we use the mailing lists for the reasons you
> > stated
> > > > (and
> > > > >> > also because not everyone has access to

Re: Invite to Slack Channel

2018-10-23 Thread Otto Fowler
Done


On October 23, 2018 at 02:19:54, Mustafa Akmal (mustafa.ak...@abcdata.org)
wrote:

Hi,
Please send me an invitation link to the Slack channel as well.
Thanks!

Mustafa Akmal
Big Data Consultant
mustafa.ak...@abcdata.org
+923365257705
NUST (Technology Incubation Centre), Office No. 208, H-12, Islamabad,
Pakistan

http://www.abcdata.org/

On Oct 23 2018, at 1:01 am, Michael Miklavcic 
wrote:
>
> Sent
> On Mon, Oct 22, 2018 at 1:00 PM vpiserc...@gmail.com wrote:
>
> > Hi,
> > can anyone invite me to the Metron Slack channel?
> > Thanks,
> > vito piserchia
> > On 10/22/18 3:31 PM, zeo...@gmail.com wrote:
> > > Invite sent
> > >
> > > On Mon, Oct 22, 2018 at 9:26 AM Muhammed Irshad wrote:
> > >
> > > > Could someone also send me the Slack channel link?
> > > > Thanks,
> > > > Muhammed Irshad
> > > > Q*Burst*
> > > > www.qburst.com
> > > >
> > > >
> > > > On Wed, Oct 17, 2018 at 7:33 PM Michael Miklavcic <
> > > > michael.miklav...@gmail.com> wrote:
> > > >
> > > > > Sent
> > > > > On Wed, Oct 17, 2018 at 7:23 AM Tibor Meller <tibor.mel...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Hi Guys,
> > > > > > Can you add me to the Apache Metron Slack channel?
> > > > > >
> > > > > > Thanks,
> > > > > > On Thu, Oct 4, 2018 at 1:14 PM Otto Fowler <
ottobackwa...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Done
> > > > > > >
> > > > > > > On October 4, 2018 at 05:35:06, Tamás Fodor (
ftamas.m...@gmail.com)
> > > > > > wrote:
> > > > > > >
> > > > > > > Hello,
> > > > > > > Michael, can you add me as well?
> > > > > > > Thank you in advance!
> > > > > > > Tamas
> > > > > > > On Wed, Oct 3, 2018 at 4:27 PM Michael Miklavcic <
> > > > > > > michael.miklav...@gmail.com> wrote:
> > > > > > >
> > > > > > > > Sent
> > > > > > > > On Wed, Oct 3, 2018 at 8:17 AM Shane Ardell <
> > > > > shane.m.ard...@gmail.com>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hello everyone,
> > > > > > > > > Is it possible for someone to send me an invite to the
Metron
> > > > Slack
> > > > > > > > > channel?
> > > > > > > > >
> > > > > > > > > Regards,
> > > > > > > > > Shane
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> >
> >
> > --
> > Vito Piserchia
> > Security and Software Engineer
> >
> > : vito.piserchia[at]dreamlab.net
> > : 4915 8835 2C18 9CAE F14F 2314 613D 51C5 106B 83EA
> > : https://dreamlab.net
> > : +41 31 398 66 66
> > : +41 31 398 66 69
> > -
> >
> > DreamLab Technologies AG
> > Monbijoustrasse 36
> > 3011 Bern, Switzerland
> >
> > -
> > This e-mail may contain confidential and/or privileged information.
> > If you are not the intended recipient (or have received this e-mail
> > in error) please notify the sender immediately and destroy this
> > e-mail. Any unauthorised copying, disclosure or distribution of the
> > material in this e-mail is strictly forbidden.
> >
> > -


Re: [DISCUSS] Slack Channel Use

2018-10-22 Thread Otto Fowler
According to Greg Stein, an infra admin on the NiFi Slack, the ASF Slack
that Metron is in IS on the standard plan, not the free one, and is searchable
past 10,000 messages.



On October 22, 2018 at 15:35:51, Michael Miklavcic (
michael.miklav...@gmail.com) wrote:

...From an archival and broader reach point of view, I do think there's
something to be said about using the mailing list. It's also easier to link
to Q/A threads from the mailing list archives and do searches...
https://lists.apache.org/thread.html/1aa85bc13d41e04a1f85c3100c2b803abe35d79b54062bbeaab83ace@%3Cdev.metron.apache.org%3E

How very Inception.


On Mon, Oct 22, 2018 at 1:32 PM Michael Miklavcic <
michael.miklav...@gmail.com> wrote:

> I just want to point out that we currently have 32 members in the Metron
> Slack channel which I personally think is a great sign. This is good from a
> community perspective and helps foster interactive sessions where required.
> From an archival and broader reach point of view, I do think there's
> something to be said about using the mailing list. It's also easier to link
> to Q/A threads from the mailing list archives and do searches. As such, I
> would also go along with Nick's suggestion and urge members to prefer the
> user/dev list where possible.
>
> On Mon, Oct 22, 2018 at 10:51 AM Justin Leet wrote:
>
>> If we want to push more discussion to the dev list, my obvious follow up
>> question then is "What are we hoping to get out of Slack/irc/other
>> interactive medium?". What discussion would we even want on there, if we
>> can't have decisions and don't want usage/support?
>>
>> On Mon, Oct 22, 2018 at 12:44 PM Casey Stella wrote:
>>
>> > I am of 2 minds, but I tend to agree. On the one hand, it's definitely the
>> > preference that we use the mailing lists for the reasons you stated (and
>> > also because not everyone has access to slack generally). On the other
>> > hand, I think an interactive medium like Slack has a lot of advantages in
>> > terms of user satisfaction. Ultimately, though, we may satisfy 1 user at
>> > the cost of not persisting the discussion and satisfying many users.
>> >
>> > I'll go along with a specific preference to drive more discussion to the
>> > mailing list.
>> >
>> > Casey
>> >
>> > On Mon, Oct 22, 2018 at 12:18 PM Nick Allen wrote:
>> >
>> > > It seems that we are seeing a lot of Metron usage and support questions on
>> > > the Slack Channel.
>> > > These are questions that previously would have been directed to the User or
>> > > Dev mailing lists. Since this is occurring in the Slack Channel, the
>> > > conversations are not archived.
>> > >
>> > > In my opinion, this is not good for the Metron community. Having this
>> > > persisted in a discoverable form (like a mailing list archive) not only
>> > > helps support current users, but also helps *potential* users understand
>> > > how Metron is being used.
>> > >
>> > > Does anyone else agree or disagree? At a minimum, I feel we need to do
>> > > something to direct these conversations back to the mailing list.
>> > >
>> >
>>
>


Re: [DISCUSS] Slack Channel Use

2018-10-22 Thread Otto Fowler
These questions also occurred on the IRC channel.  The difference is that
there are more people than just Jon and me answering now.


On October 22, 2018 at 12:18:08, Nick Allen (n...@nickallen.org) wrote:

It seems that we are seeing a lot of Metron usage and support questions on
the Slack Channel.
These are questions that previously would have been directed to the User or
Dev mailing lists. Since this is occurring in the Slack Channel, the
conversations are not archived.

In my opinion, this is not good for the Metron community. Having this
persisted in a discoverable form (like a mailing list archive) not only
helps support current users, but also helps *potential* users understand
how Metron is being used.

Does anyone else agree or disagree? At a minimum, I feel we need to do
something to direct these conversations back to the mailing list.


Re: [DISCUSS] Stellar REST client

2018-10-19 Thread Otto Fowler
I believe the issue of introducing and supporting higher-latency
enrichments is a systemic one, and should be solved as such,
with the REST and other higher-latency enrichments built on top of that
framework.




On October 19, 2018 at 12:22:28, Ryan Merriman (merrim...@gmail.com) wrote:

Thanks Casey, good questions.

As far as the verbs go, just thinking we might want to support calls other
than GET at some point. For the use case stated (enriching messages from
3rd party services) GET is all we need. Probably a moot point anyways
since every http library will support the different HTTP verbs.
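
For the GET-only enrichment case described above, assembling a request with headers, basic authentication, and a timeout could look roughly like the following. This is a minimal sketch, not a proposed implementation: it uses the JDK's built-in java.net.http API (Java 11+) purely to stay self-contained, rather than any of the libraries under discussion, and the class name, URL, and credentials are hypothetical.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.time.Duration;
import java.util.Base64;

public class RestGetSketch {

    // Builds a GET request with HTTP basic authentication and a timeout.
    // Nothing is sent here; a caller would hand the result to
    // java.net.http.HttpClient to execute it.
    public static HttpRequest buildGet(String url, String user, String password, Duration timeout) {
        String credentials = user + ":" + password;
        String encoded = Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
        return HttpRequest.newBuilder(URI.create(url))
                .timeout(timeout) // bounds how long an enrichment call may take
                .header("Accept", "application/json")
                .header("Authorization", "Basic " + encoded)
                .GET()
                .build();
    }

    public static void main(String[] args) {
        // Hypothetical enrichment endpoint and credentials, for illustration only.
        HttpRequest request = buildGet(
                "https://enrichment.example.com/lookup?ip=10.0.0.1",
                "user", "pass", Duration.ofSeconds(1));
        System.out.println(request.method() + " " + request.uri());
    }
}
```

The explicit timeout is the part most relevant to this thread, since an unbounded call to a third-party service is where the latency risk comes from.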

Agreed on the caching. I will defer to those that are more familiar with
the Stellar internals on what the correct approach is.
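
The caching concern raised in this thread (only reuse a result when every function in the expression is deterministic) can be sketched as below. This is an illustration under stated assumptions, not how Stellar actually caches: the class, the registry of deterministic function names, and the way the functions used by an expression are passed in are all hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

public class DeterministicCacheSketch {

    // Hypothetical registry of function names known to be deterministic.
    private final Set<String> deterministicFunctions;
    private final Map<String, Object> cache = new HashMap<>();
    private int evaluations = 0;

    public DeterministicCacheSketch(Set<String> deterministicFunctions) {
        this.deterministicFunctions = Set.copyOf(deterministicFunctions);
    }

    // Evaluates an expression, using the cache only when every function the
    // expression uses is registered as deterministic.
    public Object evaluate(String expression, List<String> functionsUsed,
                           Function<String, Object> evaluator) {
        boolean cacheable = deterministicFunctions.containsAll(functionsUsed);
        if (cacheable && cache.containsKey(expression)) {
            return cache.get(expression);
        }
        evaluations++;
        Object result = evaluator.apply(expression);
        if (cacheable) {
            cache.put(expression, result);
        }
        return result;
    }

    // Number of times the evaluator was actually invoked.
    public int evaluations() {
        return evaluations;
    }

    public static void main(String[] args) {
        DeterministicCacheSketch stellar = new DeterministicCacheSketch(Set.of("TO_UPPER"));
        // Deterministic: the second call is served from the cache.
        stellar.evaluate("TO_UPPER('a')", List.of("TO_UPPER"), e -> "A");
        stellar.evaluate("TO_UPPER('a')", List.of("TO_UPPER"), e -> "A");
        // Time-dependent: recomputed on every call.
        stellar.evaluate("PROFILE_GET(...)", List.of("PROFILE_GET"), e -> 1);
        stellar.evaluate("PROFILE_GET(...)", List.of("PROFILE_GET"), e -> 2);
        System.out.println("evaluations: " + stellar.evaluations());
    }
}
```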

I was thinking the same thing with regards to the client libraries. Apache
HttpComponents is probably the safest choice but OkHttp looks nice and
could reduce effort and complexity as long as it meets our requirements.

On Fri, Oct 19, 2018 at 10:58 AM Casey Stella  wrote:

> I think it makes a lot of sense. A couple of questions:
>
> - What actions do you see the REST verbs corresponding to? I would
> understand GET (which is in effect "evaluate an expression", right?), but
> I'm not sure about the others.
> - We should probably be careful about caching stellar expressions. Not
> all stellar expressions are deterministic (e.g. PROFILE_GET may not be, as
> the lookback window is bound to current time). Ultimately, I think we
> should probably bake whether a function is deterministic into stellar so
> that *stellar* can cache where appropriate (e.g. if every part of an
> expression is deterministic, then pull from cache otherwise recompute).
> All of this to say, if you're going to make it configurable, IMO we should
> make it a configuration that the user passes in at request time so they
> have the control over whether the expression is safe to cache or otherwise.
>
> Without more compelling reasons to not do so, I'd suggest we use HTTP
> Components as it's another apache project and under active
> development/support. I'd also be ok with OkHttp if it's actively maintained.
>
> On Fri, Oct 19, 2018 at 11:46 AM Ryan Merriman wrote:
>
> > I want to open up discussion around adding a Stellar REST client function.
> > There are services available to enrich security telemetry and they are
> > commonly exposed through a REST interface. The primary purpose of this
> > discuss thread is to collect requirements from the community and agree on a
> > general architectural approach.
> >
> > At a minimum I see a Stellar REST client supporting:
> >
> > - Common HTTP verbs including GET, POST, DELETE, etc
> > - Option to provide headers and request parameters as needed
> > - Support for basic authentication
> > - Proper request and error handling (we can discuss further how this
> > should work)
> > - SSL support
> > - Option to use a proxy server (including authentication)
> > - JSON format
> >
> > In addition to these functional requirements, I would also propose we
> > include these performance requirements:
> >
> > - Provide a configurable caching layer
> > - Provide a mechanism for pooling connections
> > - Provide clear documentation and guidance on how to properly use this
> > feature since there is a significant risk of introducing latency issues
> >
> > What else would you like to see included?
> >
> > I think the primary architectural decision we need to make (based on the
> > agreed upon requirements of course) is an appropriate Java HTTP/REST client
> > library. Ideally we choose a library that supports everything we need
> > OOTB. I think the majority of the work for this feature will involve
> > wrapping this library in a Stellar function and exposing the configuration
> > knobs through Metron's configuration interface (Ambari, Zookeeper, etc). I
> > have done some very light research and here is my initial list:
> >
> > - Apache HttpComponents - https://hc.apache.org/
> >   - Has support for all of the features listed above as far as I can tell
> >   - Doesn't introduce a large number of new dependencies (am I wrong here?)
> >   - Is sort of included already (we will need to upgrade from httpclient)
> >   - Lower level
> > - Google HTTP Client Library for Java -
> >   https://developers.google.com/api-client-library/java/google-http-java-client/
> >   - Higher level API with pluggable components
> >   - Introduces dependencies (we've had issues with Guava in the past)
> > - Netflix Ribbon - https://github.com/Netflix/ribbon
> >   - Has a lot of nice features that may be useful in the future
> >   - Introduces dependencies (including guava)
> >   - Hasn't been committed to in the last 5-6 months
> > - Unirest - https://github.com/Kong/unirest-java
> >   - Lightweight API built on top of HttpComponents
> >   - Pluggable serialization library (jackson is an issue for us so this
> >     is nice)
> >   - Also has not received a commit in a while
> > - OkHttp - 

Re: Bro plugin unit tests failing

2018-10-14 Thread Otto Fowler
It is INFRA, see INFRA-17091 for example.


On October 12, 2018 at 20:47:24, zeo...@gmail.com (zeo...@gmail.com) wrote:

So it seems that the last commit before the 0.2 release of
metron-bro-plugin-kafka broke the one basic unit test that we had. Since
metron 0.6.0 pins to 0.1
<https://github.com/apache/metron/blob/Metron_0.6.0/metron-deployment/ansible/roles/bro/vars/main.yml#L30>
this wouldn't cause an obvious issue if you spun up the sensors in
full-dev, and even if we were to point metron to 0.2 of the plugin we
wouldn't have necessarily seen the issue because we do a --force
<https://github.com/apache/metron/blob/Metron_0.6.0/metron-deployment/ansible/roles/bro/tasks/metron-bro-plugin-kafka.yml#L35>
to remove interaction.

After we get a chance to review/merge a recent PR, we should have
a few more 'actual' unit tests, and I would like to get travis setup for
the repo to actually check those (PR incoming) so it's less likely this
sort of thing would happen again. However, it doesn't appear I have the
right permissions to be able to get travis moving on my own. Does anybody
else have the right access, or is this an asf infrastructure ticket?

I'd also like us to consider doing a 0.3 release for the plugin so we
aren't distributing something that has broken tests. Thoughts on that?

Jon
-- 

Jon


Re: Custom parser using Jackson instead of json-simple

2018-10-05 Thread Otto Fowler
The ParserBolt is written against json-simple, so although the parser
interface is generic, in practice the message type has to be json-simple's
JSONObject.  The answer is no right now.

Feel free to open a jira.
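
Why the custom parser compiles cleanly and only fails once deployed can be shown with a small stand-alone example. This is a sketch under stated assumptions: the Parser interface and the consumer below are hypothetical stand-ins (plain java.util.Map plays the role of json-simple's JSONObject), not Metron's MessageParser or ParserBolt.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Map;

public class ErasureSketch {

    // Hypothetical stand-in for a generic parser interface.
    public interface Parser<T> {
        List<T> parse(byte[] raw);
    }

    // Produces Map messages, the type the consumer below expects.
    public static final Parser<Map<String, String>> MAP_PARSER =
            raw -> List.of(Map.of("original_string", new String(raw, StandardCharsets.UTF_8)));

    // Produces a different message type; it compiles just as well.
    public static final Parser<StringBuilder> OTHER_PARSER =
            raw -> List.of(new StringBuilder(new String(raw, StandardCharsets.UTF_8)));

    // Hypothetical consumer written against one concrete message type. The
    // type parameter is erased at runtime, so a mismatch only surfaces when
    // the cast executes.
    @SuppressWarnings("rawtypes")
    public static boolean accepts(Parser parser, byte[] raw) {
        try {
            for (Object message : parser.parse(raw)) {
                Map<?, ?> asMap = (Map<?, ?>) message; // throws for non-Map messages
                asMap.size();
            }
            return true;
        } catch (ClassCastException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] raw = "hello".getBytes(StandardCharsets.UTF_8);
        System.out.println("map parser accepted: " + accepts(MAP_PARSER, raw));
        System.out.println("other parser accepted: " + accepts(OTHER_PARSER, raw));
    }
}
```

One possible workaround, pending a jira, would be to keep Jackson inside the parser and convert to the expected message type before returning; whether that preserves the performance benefit is not something this sketch addresses.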


On October 5, 2018 at 02:52:37, Muhammed Irshad (irshadkt@gmail.com)
wrote:

Hi All,

Is it not possible to use any JSON library other than json-simple while
writing custom parsers? From the documentation, I can see we should
implement the custom parser interface MessageParser. What if I need to use
Jackson instead of json-simple? I see Jackson performs better than
json-simple in a few aspects in some of the benchmark studies. I tried
writing a custom parser implementing MessageParser with Jackson's JsonNode
as the message type. No compiler errors. I am getting the error below when
I deploy and run this parser in HCP.

2018-10-05 04:42:09.829 o.a.s.d.executor Thread-12-parserBolt-executor[5 5] [ERROR]
java.lang.ClassCastException: org.codehaus.jackson.node.ObjectNode cannot be cast to org.json.simple.JSONObject
at org.apache.metron.parsers.bolt.ParserBolt.execute(ParserBolt.java:187) [stormjar.jar:?]
at org.apache.storm.daemon.executor$fn__10252$tuple_action_fn__10254.invoke(executor.clj:735) [storm-core-1.1.0.2.6.5.0-292.jar:1.1.0.2.6.5.0-292]
at org.apache.storm.daemon.executor$mk_task_receiver$fn__10171.invoke(executor.clj:466) [storm-core-1.1.0.2.6.5.0-292.jar:1.1.0.2.6.5.0-292]
at org.apache.storm.disruptor$clojure_handler$reify__9685.onEvent(disruptor.clj:40) [storm-core-1.1.0.2.6.5.0-292.jar:1.1.0.2.6.5.0-292]
at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:472) [storm-core-1.1.0.2.6.5.0-292.jar:1.1.0.2.6.5.0-292]
at org.apache.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:451) [storm-core-1.1.0.2.6.5.0-292.jar:1.1.0.2.6.5.0-292]
at org.apache.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:73) [storm-core-1.1.0.2.6.5.0-292.jar:1.1.0.2.6.5.0-292]
at org.apache.storm.daemon.executor$fn__10252$fn__10265$fn__10320.invoke(executor.clj:855) [storm-core-1.1.0.2.6.5.0-292.jar:1.1.0.2.6.5.0-292]
at org.apache.storm.util$async_loop$fn__553.invoke(util.clj:484) [storm-core-1.1.0.2.6.5.0-292.jar:1.1.0.2.6.5.0-292]
at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_112]
2018-10-05 04:42:09.850 o.a.s.d.executor Thread-12-parserBolt-executor[5 5] [ERROR]


-- 
Muhammed Irshad K T
Senior Software Engineer
+919447946359
irshadkt@gmail.com
Skype : muhammed.irshad.k.t


Re: Invite to Slack Channel

2018-10-04 Thread Otto Fowler
Done


On October 4, 2018 at 05:35:06, Tamás Fodor (ftamas.m...@gmail.com) wrote:

Hello,

Michael, can you add me as well?

Thank you in advance!

Tamas

On Wed, Oct 3, 2018 at 4:27 PM Michael Miklavcic <
michael.miklav...@gmail.com> wrote:

> Sent
>
> On Wed, Oct 3, 2018 at 8:17 AM Shane Ardell wrote:
>
> > Hello everyone,
> >
> > Is it possible for someone to send me an invite to the Metron Slack
> > channel?
> >
> > Regards,
> > Shane
> >
>


Re: [DISCUSS] Feature Branch guidance

2018-09-29 Thread Otto Fowler
This is all well and good for feature branches, but does nothing for Simon
and the type of work he attempted.
If we agree that features do not have architectural changes, then we also
need to codify how we handle that level of change, assuming
anyone is optimistic enough to attempt such a thing in the future as an
individual, non-hw contributor.




On September 28, 2018 at 16:58:04, Justin Leet (justinjl...@gmail.com)
wrote:

+1 to both points.

I'm in favor of keeping architectural changes limited in a feature branch.
Architectural challenges beyond the scope of the branch should be brought
back to the community for any necessary discussion.

I don't think we've ever formalized what exactly closes out a feature
branch in terms of consensus. Typically we've been getting a few +1's and
calling it a day. It's probably worth it to formalize it while we're
already in the bylaws, assuming there's consensus on changing them.

In terms of wording, maybe something like the following (but better):

Feature Branch
Large feature changes may be made in a speculative feature branch. A
DISCUSS thread should be started on the primary project development mailing
list (dev@metron.apache.org) to propose the feature and outline minimum
architectural changes. Architectural changes are limited based on the
discuss thread unless further discussion occurs. To close the feature
branch, start a DISCUSS thread to outline branch state and solicit overall
feedback and requests. The branch can be committed after .

On Fri, Sep 28, 2018 at 2:26 PM Michael Miklavcic <
michael.miklav...@gmail.com> wrote:

> +1 to those 2 bullet points Casey. And thanks Justin for adding the Jira
> for fixing the website.
>
> I can think of 2 good examples to borrow from recently that were submitted
> by community contributors. Shane Ardell brought up a discussion about
> migrating from Protractor to Cypress, and Tamas Fodor brought up a
> discussion about migrating from momentjs to date-fns. This greases the
> skids by engaging community members and explaining the scope of proposed
> changes. As always, committers are able to -1 something at any time, so I
> would imagine that any contributor would be well advised to get as much buy
> in as possible prior to any large undertaking. And I would expect those
> PR's to reference the original DISCUSS threads when they come to fruition.
>
> Another example comes to mind from Ryan M with his PCAP feature branch. It
> was a lot of work, but Ryan put out a DISCUSS thread back in, I think it
> was May, outlining the intent for the FB. Subsequently, he followed up at
> the end with an accounting of all the requests from the original DISCUSS
> and any deviations from that along the way, and provided a clear
> explanation of what was in, what wasn't, what should be followed up with,
> and why. In fact, at one point I think there were some library changes that
> we saw as orthogonal to the intent of the FB and suggested they be made
> outside of the FB. Imho, this FB worked out well, though take this with a
> grain of salt as I may be biased because I was also involved with a number
> of PRs.
>
> I think Nick Allen also put together a good archetype for a FB with his
> recent work on the batch profiler. I see a couple introductory DISCUSS
> threads about the FB along with some back and forth around introducing
> Spark into the stack. He also followed up at the end to make sure there
> wasn't anything further the community wanted before he pushed to have the
> branch merged into master.
>
> *TL;DR* - We've learned a lot since the earlier days of the project, and
> our subsequent feature branches have gotten much better. We should take the
> lessons learned along the way and formalize them as Casey is recommending
> in our bylaws. I'll be following up with more specific thoughts on
> language.
>
> Best,
> Mike
>
>
> On Fri, Sep 28, 2018 at 10:13 AM Justin Leet wrote:
>
> > Ticket created: https://issues.apache.org/jira/browse/METRON-1799
> >
> > I think that whole '/develop' is orphaned and can be dropped.
> >
> > On Fri, Sep 28, 2018 at 12:12 PM Casey Stella wrote:
> >
> > > I just noticed this, but googling "metron bylaws" yields
> > > http://metron.apache.org/develop/bylaws.html which is not our bylaws.
> > > Our bylaws are on
> > > https://cwiki.apache.org/confluence/display/METRON/Apache+Metron+Bylaws
> > >
> > > We should fix that.
> > >
> > > On Fri, Sep 28, 2018 at 12:02 PM Casey Stella wrote:
> > >
> > > > Hi All,
> > > >
> > > > Given discussions about the current high-profile feature branch (Knox
> > > > SSO), I thought it might be appropriate to have a conversation about what
> > > > constitutes a feature branch and get some of this encoded in the community
> > > > guidelines.
> > > >
> > > > Specifically, there was the request made that we split up the Knox SSO
> > > > feature branch due to the current implementation including a distinct, new
> > > > architectural 

Re: Metron dev environments moving to require Ansible 2.4+

2018-09-28 Thread Otto Fowler
Yeah,  I thought we had more but maybe they were removed.
There are also many places in *.md files that reference Ansible and its
versions.


On September 28, 2018 at 11:45:14, zeo...@gmail.com (zeo...@gmail.com)
wrote:

Do you mean this
<https://cwiki.apache.org/confluence/display/METRON/Downgrade+Ansible>?  It
was the only reference I could find on the wiki.  All of the READMEs should
be updated as a part of the PR, but feel free to provide your input if I
missed anything.

Jon

On Fri, Sep 28, 2018 at 10:15 AM Otto Fowler wrote:

> We should make sure the non-source documentation is updated
>
>
> On September 28, 2018 at 09:32:52, zeo...@gmail.com (zeo...@gmail.com)
> wrote:
>
> Hi All,
>
> As it currently sits, once METRON-1758
> <https://github.com/apache/metron/pull/1179> is merged into the code
> base, Ansible 2.4 or later will be required to use any of the Metron
> ansible playbooks.  This is in contrast to the prior version requirements
> outlined in Metron documentation which specifically point to 2.0.0.2 and
> 2.2.0.0 as supported/recommended Ansible versions.  If you install Ansible
> 2.5.0 exactly you should not experience any issues spinning up pre- and
> post-merge versions of Metron.
>
> I am broadcasting this to both the user and dev communities in advance of
> any changes to provide an opportunity to voice any concerns.  Thanks,
>
> Jon
> --
>
> Jon
>
> --

Jon


Re: Metron dev environments moving to require Ansible 2.4+

2018-09-28 Thread Otto Fowler
We should make sure the non-source documentation is updated


On September 28, 2018 at 09:32:52, zeo...@gmail.com (zeo...@gmail.com)
wrote:

Hi All,

As it currently sits, once METRON-1758
<https://github.com/apache/metron/pull/1179> is merged into the code base,
Ansible 2.4 or later will be required to use any of the Metron ansible
playbooks.  This is in contrast to the prior version requirements outlined
in Metron documentation which specifically point to 2.0.0.2 and 2.2.0.0 as
supported/recommended Ansible versions.  If you install Ansible 2.5.0
exactly you should not experience any issues spinning up pre- and
post-merge versions of Metron.

I am broadcasting this to both the user and dev communities in advance of
any changes to provide an opportunity to voice any concerns.  Thanks,

Jon
--

Jon


Re: [MENTORS][DISCUSS] LICENSE and NOTICE likely outdated

2018-09-12 Thread Otto Fowler
So,  since NiFi does produce binaries, they require NOTICE and LICENSE
updates in two places:

- the ‘package’ itself.  With nifi usually this is the .nar file ( nars are
just jars ).
- the nifi-assembly module which builds the .zip binary distribution.

It is normal and expected during reviews that committers/reviewers check
this.  Joe Witt is the final word on it though.
Here is an example from a pr I did:

https://github.com/apache/nifi/pull/2805#discussion_r196906833




On September 12, 2018 at 14:34:58, Justin Leet (justinjl...@gmail.com)
wrote:

There is a distinction. The dependencies_with_url.csv does manage to make
sure our dependencies (and transitive dependencies) are appropriately
accounted for. What we also need to do is make sure that any changes (if
necessary) to the LICENSE and NOTICE files also make it in there. For
example, certain attributions may be necessary in the NOTICE file. Similar
things can happen with the LICENSE file (e.g. including licenses from
dependencies from bundled code). It's possible that we don't have any
problems, but I also expect that, since we haven't been actively maintaining
it, there might be issues. E.g. if the UI has pulled in anything
bundled in our source, modifications to the LICENSE and NOTICE files may
need to be made.

As far as uberjars go, I believe we're fine. Like you said, we aren't
distributing them, so they aren't bundled.

Otto, you mentioned on Slack that NiFi requires some checking in having PRs
reviewed and in reviewing PRs. Could you share your experience there?

On Wed, Sep 12, 2018 at 1:36 PM Otto Fowler wrote:

> Are you referring to the dependencies check against the csv?
>
>
> On September 12, 2018 at 13:09:48, Michael Miklavcic (
> michael.miklav...@gmail.com) wrote:
>
> I'm not sure I fully understand what is out of date. I know I have
> personally modified our licenses a couple times in the past and used an
> automated script that, I believe, Casey Stella had created for doing the
> check. I even made some improvements to it a long ways back. It rips
> through the maven dependency tree and tells you what isn't in the licenses
> file and fails with a non-zero return code. I thought that was part of our
> Travis build, or at the very least, the release lifecycle. Is that not the
> case, or is there a different context we're talking about here?
>
> I understand that convenience binaries might have some issues with uberjars
> when we go that route for 1.0. But is there any issue with the uberjars as
> things currently stand? I was under the impression we are OK because we
> don't distribute them. It's part of the build, just like tools such as
> JUnit, that we don't actually ship.
>
> Justin - These are the links for guidance that I've found. Is there anything
> else you've found that we should peruse while figuring this out?
>
> - https://www.apache.org/dev/licensing-howto.html
> - http://www.apache.org/legal/release-policy.html#artifacts
>
> Mike
>
>
> On Wed, Sep 12, 2018 at 10:29 AM Justin Leet wrote:
>
> > Hi all,
> >
> > As mentioned on the release voting thread, there was a Slack discussion
> > around our LICENSE and NOTICE file likely being outdated because they
> > haven't been actively kept up to date since graduation. I suggested on the
> > vote thread that we proceed with the current release, but consider it a
> > blocker for the next release.
> >
> > Mentor input on this (and how other projects handle it), would be greatly
> > appreciated.
> >
> > This discussion should result in JIRAs that are brought back to the thread,
> > so we can make sure to track this.
> >
> > For context, in addition to the standard LICENSE and NOTICE management, when
> > we build artifacts we shade a lot of jars into uberjars, thus bundling
> > dependencies. Our current releases are source only, but publishing
> > convenience binaries came up in the 1.0 roadmap thread.
> >
> > I think there are a few things that need to happen to correct our current
> > issue and make this easier in the future.
> > 1) Get the LICENSE and NOTICE files up to date
> > 2) Document the process we went through getting things up to date and (just
> > as importantly) the reasoning behind it.
> > 3) Update the PR checklist to include LICENSE and NOTICE files for new (and
> > transitive) dependencies.
> > 4) Update or add any processes we need to maintain this properly (e.g.
> > release auditing)
> > 5) Possibly build tooling for making some of this auditing easier (or use
> > existing tool if anyone has suggestions)?
> >
> > Are there any other steps I'm missing that need to go into JIRAs?
> > Any other concerns regarding these files that need to be addressed?
> > Any other context I'm missing and that belongs in this discussion?
> >
>


Re: [MENTORS][DISCUSS] LICENSE and NOTICE likely outdated

2018-09-12 Thread Otto Fowler
Are you referring to the dependencies check against the csv?


On September 12, 2018 at 13:09:48, Michael Miklavcic (
michael.miklav...@gmail.com) wrote:

I'm not sure I fully understand what is out of date. I know I have
personally modified our licenses a couple times in the past and used an
automated script that, I believe, Casey Stella had created for doing the
check. I even made some improvements to it a long ways back. It rips
through the maven dependency tree and tells you what isn't in the licenses
file and fails with a non-zero return code. I thought that was part of our
Travis build, or at the very least, the release lifecycle. Is that not the
case, or is there a different context we're talking about here?
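
The check described above (walk the dependency list, report anything missing from the licenses file, fail with a non-zero code) boils down to a set difference. The following is a minimal sketch of that idea only, not the actual Metron script: the class name, the coordinate format, and the sample entries are hypothetical.

```java
import java.util.Set;
import java.util.TreeSet;

public class LicenseCheckSketch {

    // Returns the dependencies that have no entry in the licenses listing,
    // sorted so the report is stable between runs.
    public static Set<String> missing(Set<String> dependencies, Set<String> licensed) {
        Set<String> result = new TreeSet<>(dependencies);
        result.removeAll(licensed);
        return result;
    }

    // Exit code a build step could return: 0 when every dependency is
    // accounted for, 1 otherwise.
    public static int exitCode(Set<String> dependencies, Set<String> licensed) {
        return missing(dependencies, licensed).isEmpty() ? 0 : 1;
    }

    public static void main(String[] args) {
        // Hypothetical inputs; a real check would read these from the Maven
        // dependency tree output and from the licenses file.
        Set<String> dependencies = Set.of(
                "org.example:listed-lib:jar:1.0",
                "org.example:unlisted-lib:jar:2.0");
        Set<String> licensed = Set.of("org.example:listed-lib:jar:1.0");

        for (String dependency : missing(dependencies, licensed)) {
            System.out.println("Missing license entry: " + dependency);
        }
        System.out.println("exit code: " + exitCode(dependencies, licensed));
    }
}
```

A check of this shape only proves that each dependency is listed; it says nothing about whether the LICENSE and NOTICE text itself is correct, which is the distinction drawn elsewhere in this thread.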

I understand that convenience binaries might have some issues with uberjars
when we go that route for 1.0. But is there any issue with the uberjars as
things currently stand? I was under the impression we are OK because we
don't distribute them. It's part of the build, just like tools such as
JUnit, that we don't actually ship.

Justin - These are the links for guidance that I've found. Is there anything
else you've found that we should peruse while figuring this out?

- https://www.apache.org/dev/licensing-howto.html
- http://www.apache.org/legal/release-policy.html#artifacts

Mike


On Wed, Sep 12, 2018 at 10:29 AM Justin Leet  wrote:

> Hi all,
>
> As mentioned on the release voting thread, there was a Slack discussion
> around our LICENSE and NOTICE file likely being outdated because they
> haven't been actively kept up to date since graduation. I suggested on
the
> vote thread that we proceed with the current release, but consider it a
> blocker for the next release.
>
> Mentor input on this (and how other projects handle it), would be greatly
> appreciated.
>
> This discussion should result in JIRAs that are brought back to the thread,
> so we can make sure to track this.
>
> For context, in addition to the standard LICENSE and NOTICE management, when
> we build artifacts we shade a lot of jars into uberjars, thus bundling
> dependencies. However, our current releases are source only, but
> publishing convenience binaries came up in the 1.0 roadmap thread.
>
> I think there are a few things that need to happen to correct our current
> issue and make this easier in the future.
> 1) Get the LICENSE and NOTICE files up to date
> 2) Document the process we went through getting things up to date and (just
> as importantly) the reasoning behind it.
> 3) Update the PR checklist to include LICENSE and NOTICE files for new (and
> transitive) dependencies.
> 4) Update or add any processes we need to maintain this properly (e.g.
> release auditing)
> 5) Possibly build tooling for making some of this auditing easier (or use
> existing tool if anyone has suggestions)?
>
> Are there any other steps I'm missing that need to go into JIRAs?
> Any other concerns regarding these files that need to be addressed?
> Any other context I'm missing and that belongs in this discussion?
>


Re: [DISCUSS] Feature branches post-merge

2018-09-07 Thread Otto Fowler
I would drop them.
I’ve already cleaned up FBs around dead things.



On September 6, 2018 at 13:42:55, Michael Miklavcic (
michael.miklav...@gmail.com) wrote:

What are we doing with feature branches once they're complete and merged
into master? Is our expectation that we'll keep feature branches in
perpetuity, or should we plan to do some house cleaning once they've been
merged? I did a quick check of NiFi and Kafka and don't see much by way of
feature branches in their repos. I see plenty of RC's in both the branches
and tags listings, but nothing FB related. In previous discussions, we
talked quite a bit about us "trailblazing here," so it may be that this is
simply without much precedent and entirely for us to decide. I can
definitely see value in maintaining them for future reference, as it does
offer a nice bucket in which to collect the commits and discussion nicely,
but I wanted to get others' thoughts.

Best,
Mike


Re: IRC Channel -> OPS?

2018-08-29 Thread Otto Fowler
Damn, I was hoping not.  It will never happen now


On August 29, 2018 at 15:49:26, zeo...@gmail.com (zeo...@gmail.com) wrote:

Isn't it Casey?

Jon

On Wed, Aug 29, 2018, 08:41 Otto Fowler  wrote:

> Who has ops in the irc channel?
> Can you pop in and set the topic to something like:
> “There is an ASF slack with an active metron channel, please email
> dev@metron.apache.org and request an invite”
>
-- 

Jon


Re: [DISCUSS] Contributing a General Purpose Regex Parser

2018-08-29 Thread Otto Fowler
I would like to see a PR on this.
Do you have an example of a second type of log where this would be useful?
Besides something syslog-y?

There is a PR out for a Syslog RFC 5424 parser that handles that (
including structured data, which I don’t know if you have in your parser ).

What may be more interesting is if we can define field prefixes and regexes
to match, as opposed to having HEADER RECORD.  IE.  header is
coincidentally configured.

“Foo.header” -> “SOME REGEX” => foo.header.NAME=data

That would be more generic.
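
A minimal sketch of what I mean, assuming a plain mapping of field prefix to regex (the prefixes, patterns and group names here are made up for illustration, not taken from any existing parser):

```python
import re

# Hypothetical prefix -> regex mapping; every named group becomes a field
# called "<prefix>.<group name>".
PREFIXED_REGEXES = {
    "foo.header": r"^<(?P<priority>\d+)>(?P<host>\S+)",
    "foo.body": r"msg=(?P<text>.*)$",
}


def extract(raw, mapping=PREFIXED_REGEXES):
    """Apply each regex to the raw message and prefix the extracted fields."""
    out = {}
    for prefix, pattern in mapping.items():
        match = re.search(pattern, raw)
        if match:
            for name, value in match.groupdict().items():
                out[f"{prefix}.{name}"] = value
    return out
```

With this shape the header is just another configured entry rather than a special case.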

Please excuse me if I am misunderstanding anything.


On August 26, 2018 at 19:28:02, jskar...@gmail.com (jskar...@gmail.com)
wrote:

Hello,



We have implemented a general purpose regex parser for Metron that we are
interested in contributing back to the community.



While the Metron Grok parser provides some regex based capability today,
the intention of this general purpose regex parser is to:

1. Allow for more advanced parsing scenarios (specifically, dealing with
multiple regex lines for devices that contain several log formats within
them)
2. Give users and developers of Metron additional options for parsing
3. With the new parser chaining and regex routing feature available in
Metron, this gives some additional flexibility to logically separate a flow
by:
1. Regex routing to segregate logs at a device level and handle
envelope unwrapping
2. This general purpose regex parser to parse an entire device type
that contains multiple log formats within the single device (for example,
RHEL logs)



At a high level, the control flow is:

1. Identify the record type of the incoming raw message.

2. Find and apply the regular expression of the corresponding record type to
extract the fields (using named groups).

3. Apply the message header regex to extract the fields in the header part
of the message (using named groups).


The parser config uses the following structure:

"recordTypeRegex": "(?(?<=\\s)\\b(kernel|syslog)\\b(?=\\[|:))",
"messageHeaderRegex": "(?(?<=^<)\\d{1,4}(?=>)).*?(?(?<=>)[A-Za-z]{3}\\s{1,2}\\d{1,2}\\s\\d{1,2}:\\d{1,2}:\\d{1,2}(?=\\s)).*?(?(?<=\\s).*?(?=\\s))",
"fields": [
  {
    "recordType": "kernel",
    "regex": ".*(?(?<=\\]|\\w\\:).*?(?=$))"
  },
  {
    "recordType": "syslog",
    "regex": ".*(?(?<=PID\\s=\\s).*?(?=\\sLine)).*(?(?<=64\\s)\/([A-Za-z0-9_-]+\/)+(?=\\w))(?.*?(?=\")).*(?(?<=\").*?(?=$))"
  }
]



Where:

- recordTypeRegex is used to uniquely identify a record type. It takes a
valid regular expression and may also contain named groups, which are
extracted into fields.
- messageHeaderRegex specifies a regular expression that extracts fields
from the part of a message that is common across all messages (e.g.,
syslog fields, standard headers).
- fields: a JSON list of objects containing recordType and regex. The
expression that is evaluated is chosen based on the output of recordTypeRegex.
- Note: recordTypeRegex and messageHeaderRegex can also be specified as
lists (JSON arrays), in which case the list is evaluated in order
until a matching regular expression is found.
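
The three-step control flow above can be sketched roughly as follows. This is an illustration in Python rather than the parser itself: the simplified patterns, the group names (record_type, priority, timestamp, host, event_info) and the parse() helper are all assumptions made for the example.

```python
import re

# Hypothetical config with the same shape as the structure described above.
CONFIG = {
    "recordTypeRegex": r"\s(?P<record_type>kernel|syslog)(?=\[|:)",
    "messageHeaderRegex": (
        r"^<(?P<priority>\d{1,4})>"
        r"(?P<timestamp>[A-Za-z]{3}\s{1,2}\d{1,2}\s\d{1,2}:\d{1,2}:\d{1,2})\s"
        r"(?P<host>\S+)"
    ),
    "fields": [
        {"recordType": "kernel",
         "regex": r"kernel(?:\[\d+\])?:\s*(?P<event_info>.*)$"},
        {"recordType": "syslog",
         "regex": r"syslog(?:\[\d+\])?:\s*(?P<event_info>.*)$"},
    ],
}


def parse(raw, config=CONFIG):
    """Return the extracted fields, or None if no record type matches."""
    # Step 1: identify the record type of the raw message.
    type_match = re.search(config["recordTypeRegex"], raw)
    if type_match is None:
        return None
    fields = dict(type_match.groupdict())
    record_type = type_match.group("record_type")

    # Step 2: apply the regex configured for that record type.
    for entry in config["fields"]:
        if entry["recordType"] == record_type:
            record_match = re.search(entry["regex"], raw)
            if record_match:
                fields.update(record_match.groupdict())
            break

    # Step 3: apply the header regex shared by all record types.
    header_match = re.search(config["messageHeaderRegex"], raw)
    if header_match:
        fields.update(header_match.groupdict())
    return fields
```

Every named group ends up as a field in the result, so adding a new log format for the same device only means adding another entry under "fields".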





If there are no objections to having this type of Parser within Metron, we
will open a JIRA/PR for code review.

*Jagdeep Singh*


IRC Channel -> OPS?

2018-08-29 Thread Otto Fowler
Who has ops in the irc channel?
Can you pop in and set the topic to something like:
“There is an ASF slack with an active metron channel, please email
dev@metron.apache.org and request an invite”


Re: [DISCUSS] Getting to a 1.0 release

2018-08-27 Thread Otto Fowler
> > > > > Personally, I think the state of our docs and web presence is an
> > > > > inhibitor to growing the Metron community. Unless we can offer
> > > > > concise, compelling answers to the basic questions (What can I do
> > > > > with Metron? Who does it help? How do I do that?), potential users
> > > > > and contributors are unable to see the value of Metron.
> > > > >
> > > > >
> > > > >
> > > > > > On Sat, Aug 18, 2018 at 9:42 AM, Nick Allen wrote:
> > > > > >
> > > > > > > I'd like to see us focus on improving our docs before a version 1.0.
> > > > > > > Right now we just stitch together a bunch of READMEs, which is a great
> > > > > > > stride from where we started, but is not ideal.
> > > > > > >
> > > > > > > Our docs should focus on the user and use cases: What can I do with
> > > > > > > Metron? Who does it help? How do I do that?
> > > > > > >
> > > > > > > The docs should be separate from the code base to allow for an
> > > > > > > organization that is focused on the user rather than the implementation.
> > > > > > > This allows the READMEs to focus on the developer and the implementation,
> > > > > > > which should make them more digestible too. The docs should be version
> > > > > > > controlled and maintained through PRs, just like the code. We should take
> > > > > > > just as much pride in our docs as we do in our code.
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Wed, Aug 15, 2018 at 4:35 PM, Simon Elliston Ball <
> > > > > > si...@simonellistonball.com> wrote:
> > > > > >
> > > > > > >> Agreed, should we add TDE by default, and get the ranger policies on by
> > > > > > >> default? That leaves secured in Kafka, which would have to be built into
> > > > > > >> the consumers and producers to encrypt into the on disk Kafka topics. Does
> > > > > > >> that seem necessary to people? It would have performance implications for
> > > > > > >> sure.
> > > > > >>
> > > > > >> Simon
> > > > > >>
> > > > > > >> > On 15 Aug 2018, at 21:26, Otto Fowler <ottobackwa...@gmail.com> wrote:
> > > > > > >> >
> > > > > > >> > Well, I look at it like this.
> > > > > > >> >
> > > > > > >> > The Secure Vault was part of the original metron pitch, and many may
> > > > > > >> > have used that as part of their evaluations.
> > > > > > >> > “Look, it is going to have a security vault type thing, it is on the
> > > > > > >> > roadmap”.
> > > > > > >> >
> > > > > > >> > Regardless of the implementation, conceptually, security of data at
> > > > > > >> > rest is important, and is a major outstanding item of the core metron
> > > > > > >> > proposition.
> > > > > >> >
> > > > > >> >
> > > > > >> >
> > > > > >> >
> > > > > > >> >> On August 15, 2018 at 16:03:19, Simon Elliston Ball (
> > > > > > >> >> si...@simonellistonball.com) wrote:
> > > > > > >> >>
> > > > > > >> >> That’s going back a way. I always saw that concept as being about the
> > > > > > >> >> formats, e.g. Orc, and meta data around it plus the data service api to
> > > > > > >> >> get at it. I’m all for that too, but think it needs more thought than the
> > > > > > >> >> ticket captures.
> > > > > >> >>
> > > > > >> >> Simon
> > > > > >> >>
> > > > > >> >> On 15 Aug 2018, at 20:53, Otto Fowler <
> 

package.lock changes during build?

2018-08-25 Thread Otto Fowler
I just did a PR, and saw that the package.lock file for alerts-ui was
changed, with updated versions.
I did *not* change the file, nor anything in metron-interface. That seems
to imply that this file is changed or updated by
something that happens during building or deploying full dev.

Is this true?  How does this work?  Is this on purpose?

ottO


Re: [DISCUSS] Pcap query branch completion

2018-08-16 Thread Otto Fowler
Looks good, thanks!


On August 15, 2018 at 19:38:12, Ryan Merriman (merrim...@gmail.com) wrote:

Otto, I believe the items you requested are in the feature branch now. Is
there anything outstanding that we missed? The Jiras for the Pcap feature
branch should be up to date:
https://issues.apache.org/jira/browse/METRON-1554

On Mon, Aug 13, 2018 at 5:13 PM, Ryan Merriman  wrote:

> - Date range limits on queries
>
> I will add a warning in the Job cleanup PR. That seems like an
> appropriate place for it (ie. make sure you don't cause health issues in
> your cluster).
>
> - UI should manage a queue/history of jobs
>
> I can add some documentation around killing jobs manually with the YARN
> CLI. However if they haven't set up a YARN queue, I'm not sure how you
> would view only Pcap jobs. I'm also not sure how you would get the
> application id for the job to kill because it's not displayed anywhere in
> the UI. However, I believe we are wired for a job name but REST doesn't
> set this. Maybe we could get a proper job name associated with pcap
> queries and then this would be possible to document?
>
> - Documentation/blueprint for YARN configuration
>
> You make a good point. A YARN tuning guide for Metron does sound useful.
> I will add a follow on Jira.
>
> On Mon, Aug 13, 2018 at 4:53 PM, Otto Fowler 
> wrote:
>
>>
>> - Date range limits on queries
>>
>> I took the point the wrong way apparently, sorry, I withdraw. I thought
>> you meant allow specifying a limit on the query, not the system imposing a
>> limit.
>> This should be documented with a warning or something
>>
>> - UI should manage a queue/history of jobs
>>
>> I was thinking that if there were multiple users/jobs, there should
>> be some thought or documentation + script on how to manage them.
>> “To see all the jobs still running on your cluster, across users and ui
>> instances do X”
>> “If there is an issue with the jobs you can’t resolve in the UI for that
>> user, or you are an admin and want to do something then X"
>>
>> - Documentation/blueprint for YARN configuration
>>
>> I agree with what you are saying. Although, we offer guidance on storm
>> tuning, and that is conceptually the same isn’t it? That is why it comes
>> to mind.
>> Maybe this can be a follow on, in the tuning guide?
>>
>> On August 13, 2018 at 17:36:41, Ryan Merriman (merrim...@gmail.com)
>> wrote:
>>
>> - Date range limits on queries
>>
>> Can you describe what you think is needed here? Each Metron user could
>> have different volumes of pcap data spread out over different time
>> periods. Are you saying we should limit the data range to something either
>> constant or configurable? Are we sure all users would want this? Am I
>> misinterpreting this requirement?
>>
>> - UI should manage a queue/history of jobs
>>
>> What should we document here? Reading that bullet point again, it's sort
>> of vague and not very descriptive. What I am referring to is a design that
>> provides users a way to view and manage jobs in the UI. Currently jobs can
>> only be run 1 at a time and progress is shown with a status bar, so it's
>> somewhat interactive.
>>
>> - Documentation/blueprint for YARN configuration
>>
>>
>


Re: [DISCUSS] Getting to a 1.0 release

2018-08-15 Thread Otto Fowler
Well, I look at it like this.

The Secure Vault was part of the original metron pitch, and many may have
used that as part of their evaluations.
“Look, it is going to have a security vault type thing, it is on the
roadmap”.

Regardless of the implementation, conceptually, security of data at rest is
important, and is a major outstanding item of the core metron proposition.




On August 15, 2018 at 16:03:19, Simon Elliston Ball (
si...@simonellistonball.com) wrote:

That’s going back a way. I always saw that concept as being about the
formats, e.g. Orc, and meta data around it plus the data service api to get
at it. I’m all for that too, but think it needs more thought than the
ticket captures.

Simon

On 15 Aug 2018, at 20:53, Otto Fowler  wrote:

https://issues.apache.org/jira/browse/METRON-343

On August 15, 2018 at 15:47:24, Simon Elliston Ball (
si...@simonellistonball.com) wrote:

What would you see as secure? I’ve seen people use TDE for the HDFS store,
but it’s harder to encrypt storage with solr / es. Something I was thinking
of doing to follow up on the Knox Feature was to add Ranger integration for
securing and auditing configs, and potentially extending to the index
destinations. Do you think that would cover the secure storage concept?

Simon

> On 15 Aug 2018, at 20:39, Otto Fowler  wrote:
>
> Secure storage off the top of my head
>
> On August 15, 2018 at 14:49:26, zeo...@gmail.com (zeo...@gmail.com) wrote:
>
> So, as has been discussed in a few
> <
>
https://lists.apache.org/thread.html/0445cd8f94dfb844cd5a23ac3eeca04c9f44c9d8f269c6ef12cb3598@%3Cdev.metron.apache.org%3E
>
>
> other
> <
>
https://lists.apache.org/thread.html/427a20c22207e84331b94e8ead9a4172a22577d26eb581c0e564d0dc@%3Cdev.metron.apache.org%3E
>
>
> recent dev list threads, I would like to discuss what a Metron 1.0 release
> looks like.
>
> In order to kick off the conversation, I would like to make a few
> suggestions regarding "what 1.0 means to me," but I'm very interested to
> hear everybody else's opinions.
>
> In order to go 1.0 I believe we should have:
> 1. A clear, supported method of upgrading from one version of Metron to the
> next. We have attempted
> <https://github.com/apache/metron/blob/master/Upgrading.md> to make this
> easier in the past, but it is currently not
> <
>
https://github.com/apache/metron/tree/master/metron-deployment/packaging/ambari/metron-mpack#limitations
>
>
> supported
> <
>
https://github.com/apache/metron/tree/master/metron-deployment/packaging/ambari/elasticsearch-mpack#limitations
>
>
> .
> 2. Authentication for all of the UIs and APIs should be secure and support
> SSO. I believe this is in progress via METRON-1663
> <https://issues.apache.org/jira/browse/METRON-1663>.
> 3. Each of our personas
> <
>
https://cwiki.apache.org/confluence/display/METRON/Metron+User+Personas+And+Benefits
>
>
> should
> be well documented, understood, and supported.
> - The current state of documentation is, in my opinion, inadequate and I
> admit I am partially to blame for this. I suggest we define a strict
> approach for documentation, align to it (such as perhaps migrating all
> useful wiki documentation to git), and enforce it.
> - I would consider METRON-1699
> <https://issues.apache.org/jira/browse/METRON-1699> as a critical item for
> a Security Data Scientist, but it is currently not clear to me where the
> line exists between some of the other personas, or that each persona has
> been sufficiently implemented.
> 4. A performance tuning guide should be available for all of the main
> components, whether as an independent document or as a part of a larger
> document.
> 5. Simple data ingest.
> - Similar to the ongoing conversation for NiFi integration
> <
>
https://lists.apache.org/thread.html/d7bb4d32c8c42bd40b2f26973f989bcba16010a672fd8a533a5544bf@%3Cdev.metron.apache.org%3E
>,
>
> we should be able to say that we have broken down the barriers to getting
> data into a Metron cluster in easy and efficient ways. In addition to
> NiFi, having support for other popular tools such as beats
> <https://www.elastic.co/products/beats>, fluentd <https://www.fluentd.org/>,
>
> etc.
> - Parsers should be pluggable, with independent tests and the ability to
> make versioned modifications with roll-backs.
>
> What else? Are any of these items not necessary for a 1.0?
>
> Jon
> --
>
> Jon


Re: [DISCUSS] Getting to a 1.0 release

2018-08-15 Thread Otto Fowler
https://issues.apache.org/jira/browse/METRON-106
At least making sure it is met and closing it



On August 15, 2018 at 15:53:02, Otto Fowler (ottobackwa...@gmail.com) wrote:

https://issues.apache.org/jira/browse/METRON-343

On August 15, 2018 at 15:47:24, Simon Elliston Ball (
si...@simonellistonball.com) wrote:

What would you see as secure? I’ve seen people use TDE for the HDFS store,
but it’s harder to encrypt storage with solr / es. Something I was thinking
of doing to follow up on the Knox Feature was to add Ranger integration for
securing and auditing configs, and potentially extending to the index
destinations. Do you think that would cover the secure storage concept?

Simon

> On 15 Aug 2018, at 20:39, Otto Fowler  wrote:
>
> Secure storage off the top of my head
>
> On August 15, 2018 at 14:49:26, zeo...@gmail.com (zeo...@gmail.com) wrote:
>
> So, as has been discussed in a few
> <
>
https://lists.apache.org/thread.html/0445cd8f94dfb844cd5a23ac3eeca04c9f44c9d8f269c6ef12cb3598@%3Cdev.metron.apache.org%3E
>
>
> other
> <
>
https://lists.apache.org/thread.html/427a20c22207e84331b94e8ead9a4172a22577d26eb581c0e564d0dc@%3Cdev.metron.apache.org%3E
>
>
> recent dev list threads, I would like to discuss what a Metron 1.0 release
> looks like.
>
> In order to kick off the conversation, I would like to make a few
> suggestions regarding "what 1.0 means to me," but I'm very interested to
> hear everybody else's opinions.
>
> In order to go 1.0 I believe we should have:
> 1. A clear, supported method of upgrading from one version of Metron to the
> next. We have attempted
> <https://github.com/apache/metron/blob/master/Upgrading.md> to make this
> easier in the past, but it is currently not
> <
>
https://github.com/apache/metron/tree/master/metron-deployment/packaging/ambari/metron-mpack#limitations
>
>
> supported
> <
>
https://github.com/apache/metron/tree/master/metron-deployment/packaging/ambari/elasticsearch-mpack#limitations
>
>
> .
> 2. Authentication for all of the UIs and APIs should be secure and support
> SSO. I believe this is in progress via METRON-1663
> <https://issues.apache.org/jira/browse/METRON-1663>.
> 3. Each of our personas
> <
>
https://cwiki.apache.org/confluence/display/METRON/Metron+User+Personas+And+Benefits
>
>
> should
> be well documented, understood, and supported.
> - The current state of documentation is, in my opinion, inadequate and I
> admit I am partially to blame for this. I suggest we define a strict
> approach for documentation, align to it (such as perhaps migrating all
> useful wiki documentation to git), and enforce it.
> - I would consider METRON-1699
> <https://issues.apache.org/jira/browse/METRON-1699> as a critical item for
> a Security Data Scientist, but it is currently not clear to me where the
> line exists between some of the other personas, or that each persona has
> been sufficiently implemented.
> 4. A performance tuning guide should be available for all of the main
> components, whether as an independent document or as a part of a larger
> document.
> 5. Simple data ingest.
> - Similar to the ongoing conversation for NiFi integration
> <
>
https://lists.apache.org/thread.html/d7bb4d32c8c42bd40b2f26973f989bcba16010a672fd8a533a5544bf@%3Cdev.metron.apache.org%3E
>,
>
> we should be able to say that we have broken down the barriers to getting
> data into a Metron cluster in easy and efficient ways. In addition to
> NiFi, having support for other popular tools such as beats
> <https://www.elastic.co/products/beats>, fluentd <https://www.fluentd.org/>,
>
> etc.
> - Parsers should be pluggable, with independent tests and the ability to
> make versioned modifications with roll-backs.
>
> What else? Are any of these items not necessary for a 1.0?
>
> Jon
> --
>
> Jon


Re: [DISCUSS] Getting to a 1.0 release

2018-08-15 Thread Otto Fowler
https://issues.apache.org/jira/browse/METRON-343

On August 15, 2018 at 15:47:24, Simon Elliston Ball (
si...@simonellistonball.com) wrote:

What would you see as secure? I’ve seen people use TDE for the HDFS store,
but it’s harder to encrypt storage with solr / es. Something I was thinking
of doing to follow up on the Knox Feature was to add Ranger integration for
securing and auditing configs, and potentially extending to the index
destinations. Do you think that would cover the secure storage concept?

Simon

> On 15 Aug 2018, at 20:39, Otto Fowler  wrote:
>
> Secure storage off the top of my head
>
> On August 15, 2018 at 14:49:26, zeo...@gmail.com (zeo...@gmail.com) wrote:
>
> So, as has been discussed in a few
> <
>
https://lists.apache.org/thread.html/0445cd8f94dfb844cd5a23ac3eeca04c9f44c9d8f269c6ef12cb3598@%3Cdev.metron.apache.org%3E>

>
> other
> <
>
https://lists.apache.org/thread.html/427a20c22207e84331b94e8ead9a4172a22577d26eb581c0e564d0dc@%3Cdev.metron.apache.org%3E>

>
> recent dev list threads, I would like to discuss what a Metron 1.0 release
> looks like.
>
> In order to kick off the conversation, I would like to make a few
> suggestions regarding "what 1.0 means to me," but I'm very interested to
> hear everybody else's opinions.
>
> In order to go 1.0 I believe we should have:
> 1. A clear, supported method of upgrading from one version of Metron to the
> next. We have attempted
> <https://github.com/apache/metron/blob/master/Upgrading.md> to make this
> easier in the past, but it is currently not
> <
>
https://github.com/apache/metron/tree/master/metron-deployment/packaging/ambari/metron-mpack#limitations>

>
> supported
> <
>
https://github.com/apache/metron/tree/master/metron-deployment/packaging/ambari/elasticsearch-mpack#limitations>

>
> .
> 2. Authentication for all of the UIs and APIs should be secure and support
> SSO. I believe this is in progress via METRON-1663
> <https://issues.apache.org/jira/browse/METRON-1663>.
> 3. Each of our personas
> <
>
https://cwiki.apache.org/confluence/display/METRON/Metron+User+Personas+And+Benefits>

>
> should
> be well documented, understood, and supported.
> - The current state of documentation is, in my opinion, inadequate and I
> admit I am partially to blame for this. I suggest we define a strict
> approach for documentation, align to it (such as perhaps migrating all
> useful wiki documentation to git), and enforce it.
> - I would consider METRON-1699
> <https://issues.apache.org/jira/browse/METRON-1699> as a critical item for
> a Security Data Scientist, but it is currently not clear to me where the
> line exists between some of the other personas, or that each persona has
> been sufficiently implemented.
> 4. A performance tuning guide should be available for all of the main
> components, whether as an independent document or as a part of a larger
> document.
> 5. Simple data ingest.
> - Similar to the ongoing conversation for NiFi integration
> <
>
https://lists.apache.org/thread.html/d7bb4d32c8c42bd40b2f26973f989bcba16010a672fd8a533a5544bf@%3Cdev.metron.apache.org%3E>,

>
> we should be able to say that we have broken down the barriers to getting
> data into a Metron cluster in easy and efficient ways. In addition to
> NiFi, having support for other popular tools such as beats
> <https://www.elastic.co/products/beats>, fluentd <https://www.fluentd.org/>,

>
> etc.
> - Parsers should be pluggable, with independent tests and the ability to
> make versioned modifications with roll-backs.
>
> What else? Are any of these items not necessary for a 1.0?
>
> Jon
> --
>
> Jon


Re: [DISCUSS] Getting to a 1.0 release

2018-08-15 Thread Otto Fowler
Secure storage off the top of my head

On August 15, 2018 at 14:49:26, zeo...@gmail.com (zeo...@gmail.com) wrote:

So, as has been discussed in a few
<
https://lists.apache.org/thread.html/0445cd8f94dfb844cd5a23ac3eeca04c9f44c9d8f269c6ef12cb3598@%3Cdev.metron.apache.org%3E>

other
<
https://lists.apache.org/thread.html/427a20c22207e84331b94e8ead9a4172a22577d26eb581c0e564d0dc@%3Cdev.metron.apache.org%3E>

recent dev list threads, I would like to discuss what a Metron 1.0 release
looks like.

In order to kick off the conversation, I would like to make a few
suggestions regarding "what 1.0 means to me," but I'm very interested to
hear everybody else's opinions.

In order to go 1.0 I believe we should have:
1. A clear, supported method of upgrading from one version of Metron to the
next. We have attempted
<https://github.com/apache/metron/blob/master/Upgrading.md> to make this
easier in the past, but it is currently not
<
https://github.com/apache/metron/tree/master/metron-deployment/packaging/ambari/metron-mpack#limitations>

supported
<
https://github.com/apache/metron/tree/master/metron-deployment/packaging/ambari/elasticsearch-mpack#limitations>

.
2. Authentication for all of the UIs and APIs should be secure and support
SSO. I believe this is in progress via METRON-1663
<https://issues.apache.org/jira/browse/METRON-1663>.
3. Each of our personas
<
https://cwiki.apache.org/confluence/display/METRON/Metron+User+Personas+And+Benefits>

should
be well documented, understood, and supported.
- The current state of documentation is, in my opinion, inadequate and I
admit I am partially to blame for this. I suggest we define a strict
approach for documentation, align to it (such as perhaps migrating all
useful wiki documentation to git), and enforce it.
- I would consider METRON-1699
<https://issues.apache.org/jira/browse/METRON-1699> as a critical item for
a Security Data Scientist, but it is currently not clear to me where the
line exists between some of the other personas, or that each persona has
been sufficiently implemented.
4. A performance tuning guide should be available for all of the main
components, whether as an independent document or as a part of a larger
document.
5. Simple data ingest.
- Similar to the ongoing conversation for NiFi integration
<
https://lists.apache.org/thread.html/d7bb4d32c8c42bd40b2f26973f989bcba16010a672fd8a533a5544bf@%3Cdev.metron.apache.org%3E>,

we should be able to say that we have broken down the barriers to getting
data into a Metron cluster in easy and efficient ways. In addition to
NiFi, having support for other popular tools such as beats
<https://www.elastic.co/products/beats>, fluentd <https://www.fluentd.org/>,

etc.
- Parsers should be pluggable, with independent tests and the ability to
make versioned modifications with roll-backs.

What else? Are any of these items not necessary for a 1.0?

Jon
-- 

Jon


Re: [ANNOUNCE] - Apache Metron Slack channel

2018-08-15 Thread Otto Fowler
Done


On August 15, 2018 at 14:22:45, Vets, Laurens (laur...@daemon.be) wrote:

Could I be invited?

On 15-Aug-18 09:48, Michael Miklavcic wrote:
> + Metron user list
>
> On Wed, Aug 15, 2018 at 10:38 AM Michael Miklavcic <
> michael.miklav...@gmail.com> wrote:
>
>> Turns out we are able to invite folks on an ad-hoc basis. See instructions
>> here -
>> https://cwiki.apache.org/confluence/display/METRON/Community+Resources
>>
>>
>> On Wed, Aug 15, 2018 at 9:23 AM Michael Miklavcic <
>> michael.miklav...@gmail.com> wrote:
>>
>>> It's another option with different features. I imagine many people will
>>> use both.
>>>
>>> On Wed, Aug 15, 2018, 9:14 AM Simon Elliston Ball <
>>> si...@simonellistonball.com> wrote:
>>>
>>>> Since this is committers only, would it make more sense to stick to IRC?
>>>> Or is exclusivity the idea?
>>>>
>>>> On 15 August 2018 at 16:09, Nick Allen wrote:

> Thanks for the instructions!
>
> On Wed, Aug 15, 2018 at 10:22 AM, Michael Miklavcic <
> michael.miklav...@gmail.com> wrote:
>
>> The Metron community has a Slack channel available for communication
>> (similar to the existing IRC channel, only on Slack).
>>
>> To join:
>>
>> 1. Go to slack.com.
>> 2. For organization/group, you'll enter "the-asf"
>> 3. Use your Apache email for your login
>> 4. Click "Channels" and look for #metron (Created by ottO June 15, 2018)
>>
>> Best
>> Mike Miklavcic
>>


>>>> --
>>>> simon elliston ball
>>>> @sireb



Re: [DISCUSS] Metron Parsers in Nifi

2018-08-15 Thread Otto Fowler
 bug
> (debatably) which should be fixed on that side.
>
> Simon
>
> On 13 August 2018 at 14:29, Otto Fowler  wrote:
>
>> Also,  If we are doing the record readers, we can have a reader for a
>> parser type and explicitly set the schema, as seen here :
>> https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/syslog/Syslog5424Reader.java
>>
>>
>>
>> On August 13, 2018 at 09:26:50, Otto Fowler (ottobackwa...@gmail.com)
>> wrote:
>>
>> If we can do the record readers ourselves ( with the parsers inside them
>> ) we can handle the returns.
>> I’ll be doing the net flow 5 readers once the net flow 5 processor PR (
>> not mine ) is in.
>>
>> I don’t think having a generic class loading parsers foo and having to
>> manage all that is preferable to having
>> an archetype and explicit parsers.
>>
>> Nifi processors and readers are self documenting, and this approach will
>> make that not possible, as another consideration.
>>
>>
>>
>> On August 13, 2018 at 06:50:09, Simon Elliston Ball (
>> si...@simonellistonball.com) wrote:
>>
>> Maybe the edge use case will clarify the config issue a little. The reason
>> I would want to be able to push Metron parsers into NiFi would be so I can
>> pre-parse and filter on the edge to save bandwidth from remote locations. I
>> would expect to be able to parse at the edge and use NiFi to prioritise or
>> filter on the Metron ready data, then push through to a 'NoOp' parser in
>> Metron. For this to happen, we would absolutely not want to connect to
>> Zookeeper, so I'm +1 on Otto's suggestion that the config be embeddable in
>> NiFi properties. We cannot assume ZK connectivity from NiFi.
>>
>> I can also see a scenario where NiFi might make it easier to chain parsers,
>> which is where it overlaps more with Metron. This is more about the fact
>> that NiFi makes it a lot easier to configure and manage complex multi-step
>> flows than Metron, and is way more user intuitive from a design and
>> monitoring perspective. My main concern around using NiFi in this way is
>> about the load on the content repository. We are looking at a lot of
>> content level transformation here. You could argue that the same load is
>> taken off Kafka in the chaining scenario, but there is still a chance for
>> a
>> user to accidentally create a lot of disk access if they go over the top
>> with NiFi.
>>
>> I see this as potentially a chance to make the Metron Parser interface
>> compatible with NiFi Record Readers. Then both communities could benefit
>> from sharing each other's parsers.
>>
>> In terms of the NAR approach, I would say we have a base bundle of the
>> NiFi
>> bits (https://github.com/simonellistonball/metron/tree/nifi already has
>> this for stellar, enrichments and an opinionated publisher, it also has a
>> readme with some discussion around this
>> https://github.com/simonellistonball/metron/tree/nifi/nifi-metron-bundle
>> ).
>> We can then use other nar dependencies to side load parser classes into
>> the
>> record reader. We would then need to do some fancy property validation in
>> NiFi to ensure the classes were available.
>>
>> Also, Record Readers are much much faster. The only problem I've found
>> with
>> them is that they error on blank output, which was a problem for me
>> writing
>> a netflow 9 reader (template only records need to live in NiFi cache, but
>> not be emitted).
>>
>> In terms of the schema objection, I'm not sure why schema focus is a
>> problem. Our parsers have implicit schema and the output schema formats
>> used in NiFi are very flexible and could be "just a map". That said, we
>> could also take the opportunity to introduce a method to the parser
>> interface to emit traits to contribute the bits of schema that a parser
>> produces. This would ultimately lead to us being able to generate output
>> schemas (ES, Solr, Hive, whatever), which would take a lot of the pain out
>> of setup for sensors.
>>
>> Simon
>>
>> On 9 August 2018 at 16:42, Otto Fowler  wrote:
>>
>> > I would say that
>> >
>> > - For each configuration parameter we want to pull in, it should be
>> > explicitly configured through a property as well as through a controller
>> > service that accesses the metron zk
>> > - Transformations should not be conflated with parsing in 

Re: [DISCUSS] Pcap query branch completion

2018-08-13 Thread Otto Fowler
- Date range limits on queries

I took the point the wrong way apparently, sorry, I withdraw.  I thought
you meant allow specifying a limit on the query, not the system imposing a
limit.
This should be documented with a warning or something.
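
To make the "system imposing a limit" reading concrete, here is a minimal sketch of such a check. The function name, the `max_range` parameter and the seven-day default are hypothetical and are not taken from Metron.

```python
from datetime import datetime, timedelta

# Hypothetical default; a real deployment would presumably make this configurable.
DEFAULT_MAX_RANGE = timedelta(days=7)


def check_query_range(start: datetime, end: datetime, max_range=DEFAULT_MAX_RANGE):
    """Reject a pcap query whose time window exceeds the system-imposed limit."""
    if end < start:
        raise ValueError("end time is before start time")
    if end - start > max_range:
        raise ValueError(
            f"query spans {end - start}, which exceeds the limit of {max_range}"
        )
    return start, end
```

A limit like this would be the thing to document with a warning, since a user would otherwise only discover it when a query is rejected.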

- UI should manage a queue/history of jobs

I was thinking that if there were multiple users/jobs, there should
be some thought or documentation + script on how to manage them.
“To see all the jobs still running on your cluster, across users and ui
instances do X”
“If there is an issue with the jobs you can’t resolve in the UI for that
user, or you are an admin and want to do something then X"

- Documentation/blueprint for YARN configuration

I agree with what you are saying.  Although, we offer guidance on storm
tuning, and that is conceptually the same isn’t it?  That is why it comes
to mind.
Maybe this can be a follow on, in the tuning guide?

On August 13, 2018 at 17:36:41, Ryan Merriman (merrim...@gmail.com) wrote:

- Date range limits on queries

Can you describe what you think is needed here? Each Metron user could
have different volumes of pcap data spread out over different time
periods. Are you saying we should limit the data range to something either
constant or configurable? Are we sure all users would want this? Am I
misinterpreting this requirement?

- UI should manage a queue/history of jobs

What should we document here? Reading that bullet point again, it's sort
of vague and not very descriptive. What I am referring to is a design that
provides users a way to view and manage jobs in the UI. Currently jobs can
only be run 1 at a time and progress is shown with a status bar, so it's
somewhat interactive.

- Documentation/blueprint for YARN configuration


Re: [DISCUSS] Metron Parsers in Nifi

2018-08-13 Thread Otto Fowler
Also, if we are doing the record readers, we can have a reader for a
parser type and explicitly set the schema, as seen here :
https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/syslog/Syslog5424Reader.java



On August 13, 2018 at 09:26:50, Otto Fowler (ottobackwa...@gmail.com) wrote:

If we can do the record readers ourselves ( with the parsers inside them )
we can handle the returns.
I’ll be doing the net flow 5 readers once the net flow 5 processor PR ( not
mine ) is in.

I don’t think having a generic class loading parsers foo and having to
manage all that is preferable to having
an archetype and explicit parsers.

Nifi processors and readers are self documenting, and this approach will
make that not possible, as another consideration.



On August 13, 2018 at 06:50:09, Simon Elliston Ball (
si...@simonellistonball.com) wrote:

Maybe the edge use case will clarify the config issue a little. The reason
I would want to be able to push Metron parsers into NiFi would be so I can
pre-parse and filter on the edge to save bandwidth from remote locations. I
would expect to be able to parse at the edge and use NiFi to prioritise or
filter on the Metron ready data, then push through to a 'NoOp' parser in
Metron. For this to happen, we would absolutely not want to connect to
Zookeeper, so I'm +1 on Otto's suggestion that the config be embeddable in
NiFi properties. We cannot assume ZK connectivity from NiFi.
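
A rough sketch of the "config embeddable in NiFi properties" idea follows. It is written in Python for brevity rather than as NiFi code, and `resolve_parser_config`, `inline_json` and `zk_fetch` are hypothetical names used only for illustration.

```python
import json


def resolve_parser_config(inline_json=None, zk_fetch=None, sensor="bro"):
    """Prefer config embedded in a property; use ZooKeeper only if provided."""
    if inline_json:
        # Edge deployment: no ZooKeeper connectivity is needed at all.
        return json.loads(inline_json)
    if zk_fetch is not None:
        # zk_fetch is a hypothetical callable standing in for a controller
        # service that reads the sensor config from the Metron ZooKeeper.
        return zk_fetch(sensor)
    raise ValueError("no parser configuration source was provided")
```

With this shape, an edge flow would pass only `inline_json`, while a flow running next to a Metron cluster could pass `zk_fetch` instead.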

I can also see a scenario where NiFi might make it easier to chain parsers,
which is where it overlaps more with Metron. This is more about the fact
that NiFi makes it a lot easier to configure and manage complex multi-step
flows than Metron, and is way more user intuitive from a design and
monitoring perspective. My main concern around using NiFi in this way is
about the load on the content repository. We are looking at a lot of
content level transformation here. You could argue that the same load is
taken off Kafka in the chaining scenario, but there is still a chance for a
user to accidentally create a lot of disk access if they go over the top
with NiFi.

I see this as potentially a chance to make the Metron Parser interface
compatible with NiFi Record Readers. Then both communities could benefit
from sharing each other's parsers.

In terms of the NAR approach, I would say we have a base bundle of the NiFi
bits (https://github.com/simonellistonball/metron/tree/nifi already has
this for stellar, enrichments and an opinionated publisher, it also has a
readme with some discussion around this
https://github.com/simonellistonball/metron/tree/nifi/nifi-metron-bundle).
We can then use other nar dependencies to side load parser classes into the
record reader. We would then need to do some fancy property validation in
NiFi to ensure the classes were available.

Also, Record Readers are much much faster. The only problem I've found with
them is that they error on blank output, which was a problem for me writing
a netflow 9 reader (template only records need to live in NiFi cache, but
not be emitted).

In terms of the schema objection, I'm not sure why schema focus is a
problem. Our parsers have implicit schema and the output schema formats
used in NiFi are very flexible and could be "just a map". That said, we
could also take the opportunity to introduce a method to the parser
interface to emit traits to contribute the bits of schema that a parser
produces. This would ultimately lead to us being able to generate output
schemas (ES, Solr, Hive, whatever), which would take a lot of the pain out of
setup for sensors.
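
As an illustration of the "emit traits" idea, here is a minimal sketch in which each parser declares the fields it produces and the declarations are merged into one schema. The class names, the `schema_traits` method and the field lists are hypothetical; nothing like this exists in the Metron parser interface as discussed in this thread.

```python
class SyslogLikeParser:
    """Hypothetical parser that declares the fields it produces."""

    def schema_traits(self):
        return {"timestamp": "long", "hostname": "string", "message": "string"}


class FlowLikeParser:
    """Second hypothetical parser with an overlapping field."""

    def schema_traits(self):
        return {"timestamp": "long", "ip_src_addr": "string", "ip_dst_port": "int"}


def merge_traits(parsers):
    """Combine per-parser traits into one schema, failing on type conflicts."""
    schema = {}
    for parser in parsers:
        for field, field_type in parser.schema_traits().items():
            existing = schema.setdefault(field, field_type)
            if existing != field_type:
                raise ValueError(
                    f"conflicting types for {field}: {existing} vs {field_type}"
                )
    return schema
```

The merged mapping is the kind of input from which an index template or table definition could be generated, which is where the setup pain would be reduced.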

Simon

On 9 August 2018 at 16:42, Otto Fowler  wrote:

> I would say that
>
> - For each configuration parameter we want to pull in, it should be
> explicitly configured through a property as well as through a controller
> service that accesses the metron zk
> - Transformations should not be conflated with parsing in those processors
> or readers
>
> There is no on the fly configuration change in nifi ( You can’t change
> properties once started ).
>
> Wouldn’t the simplest minimal start be to say that we expect either nifi
or
> metron and simplify things? Let nifi nifi, let metron metron.
>
>
> On August 9, 2018 at 10:53:24, Justin Leet (justinjl...@gmail.com) wrote:
>
> That's definitely good info, thanks for reaching out to them about it.
>
> In terms of exposing/sharing, I don't think we have to couple them tightly
> (in fact, I think we should loosen the coupling as much as possible
without
> forcing reimplementation of things). I think there's definitely a way to
do
> that in terms of the general purpose processor I proposed (or in terms of
> RecordReader or another implementation).
>
> It would definitely be easy enough to configure it to either pull from ZK
> or to 

Re: [DISCUSS] Metron Parsers in Nifi

2018-08-13 Thread Otto Fowler
If we can do the record readers ourselves ( with the parsers inside them )
we can handle the returns.
I’ll be doing the net flow 5 readers once the net flow 5 processor PR ( not
mine ) is in.

I don’t think having a generic class loading parsers foo and having to
manage all that is preferable to having
an archetype and explicit parsers.

Nifi processors and readers are self documenting, and this approach will
make that not possible, as another consideration.



On August 13, 2018 at 06:50:09, Simon Elliston Ball (
si...@simonellistonball.com) wrote:

Maybe the edge use case will clarify the config issue a little. The reason
I would want to be able to push Metron parsers into NiFi would be so I can
pre-parse and filter on the edge to save bandwidth from remote locations. I
would expect to be able to parse at the edge and use NiFi to prioritise or
filter on the Metron ready data, then push through to a 'NoOp' parser in
Metron. For this to happen, we would absolutely not want to connect to
Zookeeper, so I'm +1 on Otto's suggestion that the config be embeddable in
NiFi properties. We cannot assume ZK connectivity from NiFi.

I can also see a scenario where NiFi might make it easier to chain parsers,
which is where it overlaps more with Metron. This is more about the fact
that NiFi makes it a lot easier to configure and manage complex multi-step
flows than Metron, and is way more user intuitive from a design and
monitoring perspective. My main concern around using NiFi in this way is
about the load on the content repository. We are looking at a lot of
content level transformation here. You could argue that the same load is
taken off Kafka in the chaining scenario, but there is still a chance for a
user to accidentally create a lot of disk access if they go over the top
with NiFi.

I see this as potentially a chance to make the Metron Parser interface
compatible with NiFi Record Readers. Then both communities could benefit
from sharing each other's parsers.

In terms of the NAR approach, I would say we have a base bundle of the NiFi
bits (https://github.com/simonellistonball/metron/tree/nifi already has
this for stellar, enrichments and an opinionated publisher, it also has a
readme with some discussion around this
https://github.com/simonellistonball/metron/tree/nifi/nifi-metron-bundle).
We can then use other nar dependencies to side load parser classes into the
record reader. We would then need to do some fancy property validation in
NiFi to ensure the classes were available.

Also, Record Readers are much much faster. The only problem I've found with
them is that they error on blank output, which was a problem for me writing
a netflow 9 reader (template only records need to live in NiFi cache, but
not be emitted).

In terms of the schema objection, I'm not sure why schema focus is a
problem. Our parsers have implicit schema and the output schema formats
used in NiFi are very flexible and could be "just a map". That said, we
could also take the opportunity to introduce a method to the parser
interface to emit traits to contribute the bits of schema that a parser
produces. This would ultimately lead to us being able to generate output
schemas (ES, Solr, Hive, whatever), which would take a lot of the pain out of
setup for sensors.

Simon

On 9 August 2018 at 16:42, Otto Fowler  wrote:

> I would say that
>
> - For each configuration parameter we want to pull in, it should be
> explicitly configured through a property as well as through a controller
> service that accesses the metron zk
> - Transformations should not be conflated with parsing in those
processors
> or readers
>
> There is no on the fly configuration change in nifi ( You can’t change
> properties once started ).
>
> Wouldn’t the simplest minimal start be to say that we expect either nifi
or
> metron and simplify things? Let nifi nifi, let metron metron.
>
>
> On August 9, 2018 at 10:53:24, Justin Leet (justinjl...@gmail.com) wrote:
>
> That's definitely good info, thanks for reaching out to them about it.
>
> In terms of exposing/sharing, I don't think we have to couple them
tightly
> (in fact, I think we should loosen the coupling as much as possible
without
> forcing reimplementation of things). I think there's definitely a way to
do
> that in terms of the general purpose processor I proposed (or in terms of
> RecordReader or another implementation).
>
> It would definitely be easy enough to configure it to either pull from ZK
> or to use a parser config json extract as a parameter (to maintain the
same
> formatting and make migration easy). And we can still build specific
> NiFi-oriented parsers as needed (that manage things like Schema via the
> registry and other Nifi mechanisms). This keeps parsers entirely
decoupled
> from a metron installation.
>
> Alternatively, we extract our config handling to a module and scripts we
> can pack

Re: [DISCUSS] Pcap query branch completion

2018-08-13 Thread Otto Fowler
- Job cleanup/TTL

Documented at least, or a helper script to help yourself if you are in a
situation
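
A helper script along these lines could start from something like the sketch below. It assumes a hypothetical mapping of job id to finish time (epoch seconds, `None` while the job is still running); how jobs are actually listed and deleted is left out because it depends on the deployment.

```python
import time


def expired_jobs(jobs, ttl_seconds, now=None):
    """Return ids of finished jobs whose results are older than the TTL."""
    now = time.time() if now is None else now
    return [
        job_id
        for job_id, finished_at in jobs.items()
        # Running jobs have no finish time and are never considered expired.
        if finished_at is not None and now - finished_at > ttl_seconds
    ]
```

The returned ids would then be handed to whatever cleanup step removes the job output.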


- Expose the Query filter (vs Fixed) in the UI

Follow on


- Date range limits on queries

I don’t see how this won’t be immediately required. I would do this for
minimum viable.


- Pcap query as a separate UI

Follow on


- UI should manage a queue/history of jobs

Follow on, but maybe we need documentation


- BPF filtering

This is going to be a PITA, follow on


- Sharing pcap jobs with other users

Follow on


- Provide a way in the UI to populate a pcap query from an alert/metaalert

Follow on


- Documentation/blueprint for YARN configuration

Should have

