Re: [DISCUSS] Knox SSO feature branch review and features

2018-09-27 Thread Michael Miklavcic
Apparently, I hit send on my last email before finishing my synopsis (per
@Otto's Q in Slack). To summarize, based on my current understanding, I
believe that each of the feature branch changes I've outlined above is a
unit of work that is related to the others, yet should be executed
independently. Knox SSO belongs in its own feature branch. Migrating
technologies like NodeJs or migrating the auth DB to LDAP seem like they
belong in their own separate PR's or feature branches.

Thanks,
Mike

On Thu, Sep 27, 2018 at 4:08 PM Casey Stella  wrote:

> I'm coming in late to the game here, but to my mind a feature branch
> should involve the minimum architectural change needed to accomplish a
> given feature. The feature in question is SSO integration.  It seems to me
> that the operative question is whether we can do the feature without
> making the OTHER architectural change (e.g. migrating from expressjs to
> spring boot + zuul).  I would argue that if we WANT to do that, then it
> should be a separate feature branch.
>
> Thus, I leave with a question: is there a way to accomplish this feature
> without ripping out expressjs?
>
>    - If so and it is feasible, I would argue that we should decouple this
>    into a separate feature branch.
>    - If so and it is infeasible, I'd like to hear an argument as to the
>    infeasibility and let's decide given that.
>    - If it is not possible, then I'd argue that we should keep them coupled
>    and move this through as-is.
>
> On a side-note, it feels a bit weird that we're narrowing to a bundled
> proxy, rather than having that be a pluggable thing.  I'm not super
> knowledgeable in this space, so I apologize in advance if this is naive,
> but isn't this a pluggable, external component (e.g. nginx)?
>
> On Thu, Sep 27, 2018 at 5:05 PM Michael Miklavcic <
> michael.miklav...@gmail.com> wrote:
>
> > I've spent some more time reading through Simon's response and the added
> > sequence diagram. This is definitely helpful - thank you Simon.
> >
> > I need to revise my initial list:
> >
> >    1. Node migrated to Spring Boot, expressjs migrated to a
> >    non-JS/non-NodeJs proxying mechanism (ie Zuul in this case)
> >    2. JDBC removed completely in favor of LDAP
> >    3. Knox/SSO
> >
> > I'm a bit conflicted on the best way to move forward and would like some
> > thoughts from other community members on this. I think an argument can be
> > made that 1 and 2 are independent of 3, and should/could really be
> > independent PR's against master.
> >
> > The need for a replacement for expressjs (Zuul in this case) is an artifact
> > of the fact that our request/response cycle for REST calls is a simple matter
> > of forwarding with some additional headers for authentication. There's a
> > JSESSIONID managed by the client browser in our current architecture, for
> > example. You log in to the alerts or the management UI, which forwards a
> > request to REST, which looks up credentials in a backend database and
> > passes the results back up the chain. All browser requests go directly to
> > the specific UI you're working with - this is the CORS problem. You can't
> > make remote calls directly to REST without some effort, either adding
> > other domains to the safe list via headers or disabling the security check
> > for CORS. That's why we proxy. Switching over to Spring Boot leaves a gap,
> > because expressjs handled the proxying and filtering and it's only
> > available to a NodeJs application (it's server-side javascript vs the
> > client-side javascript deployed via our Angular applications). Enter Zuul,
> > which now effectively handles that. At runtime, Zuul is a part of the
> > Spring app that serves up our UIs. It handles the requests via filtering,
> > forwards them to REST, and manages the response back to the client. Very
> > similar to what expressjs was doing, per my current understanding.
> >
> > The sequence diagrams Simon added are useful, and I think some of what was
> > less clear was what we currently do vs what the new changes are doing to
> > the architecture. This is no fault of Simon's - there simply weren't any
> > architecture diagrams/documents around this before. Here's my impression of
> > the very basic current state - someone more familiar with this
> > architecture please advise if I'm incorrect about anything (probably Ryan).
> >
> > https://imgur.com/f8GtSmh
> >
> > Zuul would be replacing the bit about expressjs in the diagram, and instead
> > of node we have spring boot. This covers 1; 2 and 3 are separate issues. I'd
> > like to see a similar exposition of those server processes with knox
> > involved. I imagine in that case we bump up from 3 to 4 server instances
> > for the additional knox endpoint.
> >
> > Mike
> >
> >
> >
> >
> >
> > > On Wed, Sep 19, 2018 at 11:28 AM James Sirota wrote:
> > >
> > > > Thank you, Simon.  The diagrams help a lot
> > > >
> > > > 19.09.2018, 21:27, "Simon Elliston Ball":
> > > > To clarify some of this I've put some documentation 

Re: [DISCUSS] Knox SSO feature branch review and features

2018-09-27 Thread Casey Stella
I'm coming in late to the game here, but to my mind a feature branch
should involve the minimum architectural change needed to accomplish a
given feature. The feature in question is SSO integration.  It seems to me
that the operative question is whether we can do the feature without
making the OTHER architectural change (e.g. migrating from expressjs to
spring boot + zuul).  I would argue that if we WANT to do that, then it
should be a separate feature branch.

Thus, I leave with a question: is there a way to accomplish this feature
without ripping out expressjs?

   - If so and it is feasible, I would argue that we should decouple this
   into a separate feature branch.
   - If so and it is infeasible, I'd like to hear an argument as to the
   infeasibility and let's decide given that.
   - If it is not possible, then I'd argue that we should keep them coupled
   and move this through as-is.

On a side-note, it feels a bit weird that we're narrowing to a bundled
proxy, rather than having that be a pluggable thing.  I'm not super
knowledgeable in this space, so I apologize in advance if this is naive,
but isn't this a pluggable, external component (e.g. nginx)?

On Thu, Sep 27, 2018 at 5:05 PM Michael Miklavcic <
michael.miklav...@gmail.com> wrote:

> I've spent some more time reading through Simon's response and the added
> sequence diagram. This is definitely helpful - thank you Simon.
>
> I need to revise my initial list:
>
>    1. Node migrated to Spring Boot, expressjs migrated to a
>    non-JS/non-NodeJs proxying mechanism (ie Zuul in this case)
>    2. JDBC removed completely in favor of LDAP
>    3. Knox/SSO
>
> I'm a bit conflicted on the best way to move forward and would like some
> thoughts from other community members on this. I think an argument can be
> made that 1 and 2 are independent of 3, and should/could really be
> independent PR's against master.
>
> The need for a replacement for expressjs (Zuul in this case) is an artifact
> of the fact that our request/response cycle for REST calls is a simple matter
> of forwarding with some additional headers for authentication. There's a
> JSESSIONID managed by the client browser in our current architecture, for
> example. You log in to the alerts or the management UI, which forwards a
> request to REST, which looks up credentials in a backend database and
> passes the results back up the chain. All browser requests go directly to
> the specific UI you're working with - this is the CORS problem. You can't
> make remote calls directly to REST without some effort, either adding
> other domains to the safe list via headers or disabling the security check
> for CORS. That's why we proxy. Switching over to Spring Boot leaves a gap,
> because expressjs handled the proxying and filtering and it's only
> available to a NodeJs application (it's server-side javascript vs the
> client-side javascript deployed via our Angular applications). Enter Zuul,
> which now effectively handles that. At runtime, Zuul is a part of the
> Spring app that serves up our UIs. It handles the requests via filtering,
> forwards them to REST, and manages the response back to the client. Very
> similar to what expressjs was doing, per my current understanding.
>
> The sequence diagrams Simon added are useful, and I think some of what was
> less clear was what we currently do vs what the new changes are doing to
> the architecture. This is no fault of Simon's - there simply weren't any
> architecture diagrams/documents around this before. Here's my impression of
> the very basic current state - someone more familiar with this
> architecture please advise if I'm incorrect about anything (probably Ryan).
>
> https://imgur.com/f8GtSmh
>
> Zuul would be replacing the bit about expressjs in the diagram, and instead
> of node we have spring boot. This covers 1; 2 and 3 are separate issues. I'd
> like to see a similar exposition of those server processes with knox
> involved. I imagine in that case we bump up from 3 to 4 server instances
> for the additional knox endpoint.
>
> Mike
>
>
>
>
>
> On Wed, Sep 19, 2018 at 11:28 AM James Sirota  wrote:
>
> > Thank you, Simon.  The diagrams help a lot
> >
> > 19.09.2018, 21:27, "Simon Elliston Ball" :
> > > To clarify some of this I've put some documentation into
> > > https://github.com/apache/metron/pull/1203 under METRON-1755 (
> > > https://issues.apache.org/jira/browse/METRON-1755). Hopefully the diagrams
> > > there should make it clearer.
> > >
> > > Simon
> > >
> > > On Tue, 18 Sep 2018 at 14:17, Simon Elliston Ball <
> > > si...@simonellistonball.com> wrote:
> > >
> > >>  Hi Mike,
> > >>
> > >>  Some good points here which could do with some clarification. I suspect
> > >>  the architecture documentation could be clearer and fill in some of these
> > >>  gaps, and I'll have a look at working on that and providing some diagrams.
> > >>
> > >>  The short version is that the Zuul proxy gateway has been added to replace
> > >>  the Nodejs express 

Re: [DISCUSS] Knox SSO feature branch review and features

2018-09-27 Thread Michael Miklavcic
I've spent some more time reading through Simon's response and the added
sequence diagram. This is definitely helpful - thank you Simon.

I need to revise my initial list:

   1. Node migrated to Spring Boot, expressjs migrated to a
   non-JS/non-NodeJs proxying mechanism (ie Zuul in this case)
   2. JDBC removed completely in favor of LDAP
   3. Knox/SSO

I'm a bit conflicted on the best way to move forward and would like some
thoughts from other community members on this. I think an argument can be
made that 1 and 2 are independent of 3, and should/could really be
independent PR's against master.

The need for a replacement for expressjs (Zuul in this case) is an artifact
of the fact that our request/response cycle for REST calls is a simple matter
of forwarding with some additional headers for authentication. There's a
JSESSIONID managed by the client browser in our current architecture, for
example. You log in to the alerts or the management UI, which forwards a
request to REST, which looks up credentials in a backend database and
passes the results back up the chain. All browser requests go directly to
the specific UI you're working with - this is the CORS problem. You can't
make remote calls directly to REST without some effort, either adding
other domains to the safe list via headers or disabling the security check
for CORS. That's why we proxy. Switching over to Spring Boot leaves a gap,
because expressjs handled the proxying and filtering and it's only
available to a NodeJs application (it's server-side javascript vs the
client-side javascript deployed via our Angular applications). Enter Zuul,
which now effectively handles that. At runtime, Zuul is a part of the
Spring app that serves up our UIs. It handles the requests via filtering,
forwards them to REST, and manages the response back to the client. Very
similar to what expressjs was doing, per my current understanding.

The sequence diagrams Simon added are useful, and I think some of what was
less clear was what we currently do vs what the new changes are doing to
the architecture. This is no fault of Simon's - there simply weren't any
architecture diagrams/documents around this before. Here's my impression of
the very basic current state - someone more familiar with this
architecture please advise if I'm incorrect about anything (probably Ryan).

https://imgur.com/f8GtSmh

Zuul would be replacing the bit about expressjs in the diagram, and instead
of node we have spring boot. This covers 1; 2 and 3 are separate issues. I'd
like to see a similar exposition of those server processes with knox
involved. I imagine in that case we bump up from 3 to 4 server instances
for the additional knox endpoint.
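
To make the proxying described above a bit more concrete, here is a minimal
sketch. It is written in Python purely for brevity; it is not expressjs or
Zuul code and nothing in it comes from the Metron code base. The host names,
ports, the /api/ prefix and the function names are all assumptions made up
for illustration. The point is only that the browser talks to the UI host
alone (same origin, so no CORS check), and the UI host forwards API calls to
REST with the session cookie passed through.

```python
from urllib.parse import urlsplit

# Hypothetical values for illustration only; not Metron configuration.
REST_BASE = "http://rest-host:8082"
API_PREFIX = "/api/"


def origin(url):
    """Return the (scheme, host, port) triple used for same-origin comparison."""
    parts = urlsplit(url)
    default_port = {"http": 80, "https": 443}.get(parts.scheme)
    return (parts.scheme, parts.hostname, parts.port or default_port)


def is_cross_origin(page_url, request_url):
    """True when a browser would treat the request as cross-origin."""
    return origin(page_url) != origin(request_url)


def route(path, headers):
    """Decide what the UI host does with an incoming browser request.

    Paths under API_PREFIX are forwarded to REST with only the session
    cookie passed through (an assumption about which header matters);
    everything else is served as a static UI file.
    """
    if path.startswith(API_PREFIX):
        forwarded = {}
        if "Cookie" in headers:
            forwarded["Cookie"] = headers["Cookie"]  # e.g. the JSESSIONID cookie
        return ("proxy", REST_BASE + path, forwarded)
    return ("static", path, {})
```

With these assumed hosts, a page served from http://ui-host:4201 calling
http://rest-host:8082 directly is cross-origin, while the same call made to
http://ui-host:4201/api/... is not, and route() shows the UI host handing it
on to REST.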

Mike





On Wed, Sep 19, 2018 at 11:28 AM James Sirota  wrote:

> Thank you, Simon.  The diagrams help a lot
>
> 19.09.2018, 21:27, "Simon Elliston Ball" :
> > To clarify some of this I've put some documentation into
> > https://github.com/apache/metron/pull/1203 under METRON-1755 (
> > https://issues.apache.org/jira/browse/METRON-1755). Hopefully the diagrams
> > there should make it clearer.
> >
> > Simon
> >
> > On Tue, 18 Sep 2018 at 14:17, Simon Elliston Ball <
> > si...@simonellistonball.com> wrote:
> >
> >>  Hi Mike,
> >>
> >>  Some good points here which could do with some clarification. I suspect
> >>  the architecture documentation could be clearer and fill in some of these
> >>  gaps, and I'll have a look at working on that and providing some diagrams.
> >>
> >>  The short version is that the Zuul proxy gateway has been added to replace
> >>  the Nodejs express proxy used to gateway the REST api calls in the current
> >>  hosts. This is done in both cases to avoid CORS restrictions by allowing
> >>  the same host that serves the UI files to proxy calls to the API.
> >>
> >>  The choice of Zuul was partly a pragmatic one (it's the one that's there
> >>  in the box as it were with Spring Boot, which we use for the REST API, via
> >>  the Spring Cloud Netflix project which wraps a bunch of related pieces into
> >>  Spring). The choice of Spring Boot to host the UIs themselves was similarly
> >>  for parity with the REST host, to simplify the stack (we remove the
> >>  occasionally problematic need to install nodejs on target servers, which is
> >>  outside of the regular OS and HDP stacks we support).
> >>
> >>  Arguably, the Zuul proxy is not necessary if we force everything through a
> >>  Knox instance, since Knox would provide a single endpoint. We probably
> >>  however don't want to force Knox and SSL, hence using Zuul to keep it
> >>  closer to our current architecture. Zuul does some other nice things, which
> >>  might help us in future, so it's really about laying down some options for
> >>  potentially doing micro-services style things at a later date. I'm not
> >>  saying we have to, or even should go that way, it will just make life
> >>  easier later if we decide to. It will also help us if we want to add HA,
> 

Re: [DISCUSS] Batch Profiler Feature Branch

2018-09-27 Thread James Sirota
+1 from me as well. great work

27.09.2018, 11:15, "Ryan Merriman" :
> +1 from me. Great work.
>
> On Thu, Sep 27, 2018 at 12:41 PM Justin Leet  wrote:
>
>>  I'm +1 on merging the feature branch into master. There's a lot of good
>>  work here, and it's definitely been nice to see the couple remaining
>>  improvements make it in.
>>
>>  Thanks a lot for the contribution, this is great stuff!
>>
>>  On Wed, Sep 26, 2018 at 6:26 PM Nick Allen  wrote:
>>
>>  > Or support to be offered for merging this feature branch into master?
>>  >
>>  > On Wed, Sep 26, 2018 at 6:20 PM Nick Allen  wrote:
>>  >
>>  > > Thanks for the review. With
>>  https://github.com/apache/metron/pull/1209
>>  > complete,
>>  > > I think the feature branch is ready to be merged. Sounds like I have
>>  > > Mike's support. Anyone else have comments, concerns, questions?
>>  > >
>>  > > On Tue, Sep 25, 2018 at 10:33 PM Michael Miklavcic <
>>  > > michael.miklav...@gmail.com> wrote:
>>  > >
>>  > >> I just made a couple minor comments on that PR, and I am in agreement
>>  > >> about
>>  > >> the readiness for merging with master. Good stuff Nick.
>>  > >>
>>  > >> On Fri, Sep 21, 2018 at 12:37 PM Nick Allen 
>>  wrote:
>>  > >>
>>  > >> > Here is a PR that adds the input time constraints to the Batch
>>  > Profiler
>>  > >> > (METRON-1787); https://github.com/apache/metron/pull/1209.
>>  > >> >
>>  > >> > It seems that the consensus is that this is probably the last
>>  feature
>>  > we
>>  > >> > need before merging the FB into master. The other two can wait
>>  until
>>  > >> after
>>  > >> > the feature branch has been merged. Let me know if you disagree.
>>  > >> >
>>  > >> > Thanks
>>  > >> >
>>  > >> >
>>  > >> > On Thu, Sep 20, 2018 at 1:55 PM Nick Allen 
>>  > wrote:
>>  > >> >
>>  > >> > > Yeah, agreed. Per use case 3, when deploying to production there really
>>  > >> > > wouldn't be a huge overlap like 3 months of already profiled data. It's day
>>  > >> > > 1, the profile was just deployed around the same time as you are running
>>  > >> > > the Batch Profiler, so the overlap is in minutes, maybe hours. But I can
>>  > >> > > definitely see the usefulness of the feature for re-runs, etc. as you have
>>  > >> > > described.
>>  > >> > >
>>  > >> > > Based on this discussion, I created a few JIRAs. Thanks all for
>>  the
>>  > >> > great
>>  > >> > > feedback and keep it coming.
>>  > >> > >
>>  > >> > > [1] METRON-1787 - Input Time Constraints for Batch Profiler
>>  > >> > > [2] METRON-1788 - Fetch Profile Definitions from Zk for Batch
>>  > Profiler
>>  > >> > > [3] METRON-1789 - MPack Should Define Default Input Path for Batch
>>  > >> > > Profiler
>>  > >> > >
>>  > >> > >
>>  > >> > > --
>>  > >> > > [1] https://issues.apache.org/jira/browse/METRON-1787
>>  > >> > > [2] https://issues.apache.org/jira/browse/METRON-1788
>>  > >> > > [3] https://issues.apache.org/jira/browse/METRON-1789
>>  > >> > >
>>  > >> > >
>>  > >> > >
>>  > >> > >
>>  > >> > >
>>  > >> > >
>>  > >> > > On Thu, Sep 20, 2018 at 1:34 PM Michael Miklavcic <
>>  > >> > > michael.miklav...@gmail.com> wrote:
>>  > >> > >
>>  > >> > >> I think we might want to allow the flexibility to choose the date range
>>  > >> > >> then. I don't yet feel like I have a good enough understanding of all the
>>  > >> > >> ways in which users would want to seed to force them to run the batch job
>>  > >> > >> over all the data. It might also make it easier to deal with remediation,
>>  > >> > >> i.e. an error doesn't force you to re-run over the entire history. Same goes
>>  > >> > >> for testing out the profile seeding batch job in the first place.
>>  > >> > >>
>>  > >> > >> On Thu, Sep 20, 2018 at 11:23 AM Nick Allen 
>>  > >> wrote:
>>  > >> > >>
>>  > >> > >> > Assuming you have 9 months of data archived, yes.
>>  > >> > >> >
>>  > >> > >> > On Thu, Sep 20, 2018 at 1:22 PM Michael Miklavcic <
>>  > >> > >> > michael.miklav...@gmail.com> wrote:
>>  > >> > >> >
>>  > >> > >> > > So in the case of 3 - if you had 6 months of data that hadn't
>>  > >> been
>>  > >> > >> > profiled
>>  > >> > >> > > and another 3 that had been profiled (9 months total data),
>>  in
>>  > >> its
>>  > >> > >> > current
>>  > >> > >> > > form the batch job runs over all 9 months?
>>  > >> > >> > >
>>  > >> > >> > > On Thu, Sep 20, 2018 at 11:13 AM Nick Allen <
>>  > n...@nickallen.org>
>>  > >> > >> wrote:
>>  > >> > >> > >
>>  > >> > >> > > > > How do we establish "tm" from 1.1 above? Any concerns
>>  about
>>  > >> > >> overlap
>>  > >> > >> > or
>>  > >> > >> > > > gaps after the seeding is performed?
>>  > >> > >> > > >
>>  > >> > >> > > > Good point. Right now, if the Streaming and Batch Profiler
>>  > >> > overlap
>>  > >> > >> the
>>  > >> > >> > > > last write wins. And presumably the output of the
>>  Streaming
>>  > >> and
>>  > >> > >> Batch
>>  > >> > 

Re: [DISCUSS] Batch Profiler Feature Branch

2018-09-27 Thread Ryan Merriman
+1 from me.  Great work.

On Thu, Sep 27, 2018 at 12:41 PM Justin Leet  wrote:

> I'm +1 on merging the feature branch into master. There's a lot of good
> work here, and it's definitely been nice to see the couple remaining
> improvements make it in.
>
> Thanks a lot for the contribution, this is great stuff!
>
> On Wed, Sep 26, 2018 at 6:26 PM Nick Allen  wrote:
>
> > Or support to be offered for merging this feature branch into master?
> >
> > On Wed, Sep 26, 2018 at 6:20 PM Nick Allen  wrote:
> >
> > > Thanks for the review.  With
> https://github.com/apache/metron/pull/1209
> > complete,
> > > I think the feature branch is ready to be merged.  Sounds like I have
> > > Mike's support.  Anyone else have comments, concerns, questions?
> > >
> > > On Tue, Sep 25, 2018 at 10:33 PM Michael Miklavcic <
> > > michael.miklav...@gmail.com> wrote:
> > >
> > >> I just made a couple minor comments on that PR, and I am in agreement
> > >> about
> > >> the readiness for merging with master. Good stuff Nick.
> > >>
> > >> On Fri, Sep 21, 2018 at 12:37 PM Nick Allen 
> wrote:
> > >>
> > >> > Here is a PR that adds the input time constraints to the Batch
> > Profiler
> > >> > (METRON-1787);  https://github.com/apache/metron/pull/1209.
> > >> >
> > >> > It seems that the consensus is that this is probably the last
> feature
> > we
> > >> > need before merging the FB into master.  The other two can wait
> until
> > >> after
> > >> > the feature branch has been merged.  Let me know if you disagree.
> > >> >
> > >> > Thanks
> > >> >
> > >> >
> > >> > On Thu, Sep 20, 2018 at 1:55 PM Nick Allen 
> > wrote:
> > >> >
> > >> > > Yeah, agreed.  Per use case 3, when deploying to production there really
> > >> > > wouldn't be a huge overlap like 3 months of already profiled data.  It's day
> > >> > > 1, the profile was just deployed around the same time as you are running
> > >> > > the Batch Profiler, so the overlap is in minutes, maybe hours.  But I can
> > >> > > definitely see the usefulness of the feature for re-runs, etc. as you have
> > >> > > described.
> > >> > >
> > >> > > Based on this discussion, I created a few JIRAs.  Thanks all for
> the
> > >> > great
> > >> > > feedback and keep it coming.
> > >> > >
> > >> > > [1] METRON-1787 - Input Time Constraints for Batch Profiler
> > >> > > [2] METRON-1788 - Fetch Profile Definitions from Zk for Batch
> > Profiler
> > >> > > [3] METRON-1789 - MPack Should Define Default Input Path for Batch
> > >> > > Profiler
> > >> > >
> > >> > >
> > >> > > --
> > >> > > [1] https://issues.apache.org/jira/browse/METRON-1787
> > >> > > [2] https://issues.apache.org/jira/browse/METRON-1788
> > >> > > [3] https://issues.apache.org/jira/browse/METRON-1789
> > >> > >
> > >> > >
> > >> > >
> > >> > >
> > >> > >
> > >> > >
> > >> > > On Thu, Sep 20, 2018 at 1:34 PM Michael Miklavcic <
> > >> > > michael.miklav...@gmail.com> wrote:
> > >> > >
> > >> > >> I think we might want to allow the flexibility to choose the date range
> > >> > >> then. I don't yet feel like I have a good enough understanding of all the
> > >> > >> ways in which users would want to seed to force them to run the batch job
> > >> > >> over all the data. It might also make it easier to deal with remediation,
> > >> > >> i.e. an error doesn't force you to re-run over the entire history. Same goes
> > >> > >> for testing out the profile seeding batch job in the first place.
> > >> > >>
> > >> > >> On Thu, Sep 20, 2018 at 11:23 AM Nick Allen 
> > >> wrote:
> > >> > >>
> > >> > >> > Assuming you have 9 months of data archived, yes.
> > >> > >> >
> > >> > >> > On Thu, Sep 20, 2018 at 1:22 PM Michael Miklavcic <
> > >> > >> > michael.miklav...@gmail.com> wrote:
> > >> > >> >
> > >> > >> > > So in the case of 3 - if you had 6 months of data that hadn't
> > >> been
> > >> > >> > profiled
> > >> > >> > > and another 3 that had been profiled (9 months total data),
> in
> > >> its
> > >> > >> > current
> > >> > >> > > form the batch job runs over all 9 months?
> > >> > >> > >
> > >> > >> > > On Thu, Sep 20, 2018 at 11:13 AM Nick Allen <
> > n...@nickallen.org>
> > >> > >> wrote:
> > >> > >> > >
> > >> > >> > > > > How do we establish "tm" from 1.1 above? Any concerns
> about
> > >> > >> overlap
> > >> > >> > or
> > >> > >> > > > gaps after the seeding is performed?
> > >> > >> > > >
> > >> > >> > > > Good point.  Right now, if the Streaming and Batch Profiler
> > >> > overlap
> > >> > >> the
> > >> > >> > > > last write wins.  And presumably the output of the
> Streaming
> > >> and
> > >> > >> Batch
> > >> > >> > > > Profiler are the same, so no worries, right? :)
> > >> > >> > > >
> > >> > >> > > > So it kind of works, but it is definitely not ideal for use
> > >> case
> > >> > >> 3.  I
> > >> > >> > > > could add --begin and --end args to constrain the time
> frame
> > >> over
> > >> > >> which
> > >> > >> > > the
> > >> > >> > > > 

Re: [DISCUSS] Batch Profiler Feature Branch

2018-09-27 Thread Justin Leet
I'm +1 on merging the feature branch into master. There's a lot of good
work here, and it's definitely been nice to see the couple remaining
improvements make it in.

Thanks a lot for the contribution, this is great stuff!
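
For anyone skimming, the input time constraint discussed in the quoted
messages below (METRON-1787) and the "last write wins" overlap behaviour can
be sketched roughly as follows. This is an illustration in Python under
stated assumptions, not the Batch Profiler's actual code: the function names,
the "timestamp" field, the inclusive bounds, the simple per-period count and
the dict standing in for the profile store are all made up here. The
--begin/--end names come from the proposal in the thread; the options that
were finally merged are not confirmed in this thread.

```python
def within_window(timestamp_ms, begin_ms=None, end_ms=None):
    """True when the timestamp is inside [begin_ms, end_ms]; a missing bound is open."""
    if begin_ms is not None and timestamp_ms < begin_ms:
        return False
    if end_ms is not None and timestamp_ms > end_ms:
        return False
    return True


def seed_profile(store, messages, period_ms, begin_ms=None, end_ms=None):
    """Count messages per profile period and write the counts into `store`.

    `store` maps a period index to a value. An existing entry for the same
    period is overwritten, which models "last write wins" when batch output
    overlaps with what the streaming profiler already wrote.
    """
    counts = {}
    for message in messages:
        ts = message["timestamp"]  # hypothetical field name
        if within_window(ts, begin_ms, end_ms):
            period = ts // period_ms
            counts[period] = counts.get(period, 0) + 1
    store.update(counts)
    return store
```

Constraining the window means a re-run after an error only touches the
periods inside it; entries outside the window are left as they were.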

On Wed, Sep 26, 2018 at 6:26 PM Nick Allen  wrote:

> Or support to be offered for merging this feature branch into master?
>
> On Wed, Sep 26, 2018 at 6:20 PM Nick Allen  wrote:
>
> > Thanks for the review.  With  https://github.com/apache/metron/pull/1209
> complete,
> > I think the feature branch is ready to be merged.  Sounds like I have
> > Mike's support.  Anyone else have comments, concerns, questions?
> >
> > On Tue, Sep 25, 2018 at 10:33 PM Michael Miklavcic <
> > michael.miklav...@gmail.com> wrote:
> >
> >> I just made a couple minor comments on that PR, and I am in agreement
> >> about
> >> the readiness for merging with master. Good stuff Nick.
> >>
> >> On Fri, Sep 21, 2018 at 12:37 PM Nick Allen  wrote:
> >>
> >> > Here is a PR that adds the input time constraints to the Batch
> Profiler
> >> > (METRON-1787);  https://github.com/apache/metron/pull/1209.
> >> >
> >> > It seems that the consensus is that this is probably the last feature
> we
> >> > need before merging the FB into master.  The other two can wait until
> >> after
> >> > the feature branch has been merged.  Let me know if you disagree.
> >> >
> >> > Thanks
> >> >
> >> >
> >> > On Thu, Sep 20, 2018 at 1:55 PM Nick Allen 
> wrote:
> >> >
> >> > > Yeah, agreed.  Per use case 3, when deploying to production there really
> >> > > wouldn't be a huge overlap like 3 months of already profiled data.  It's day
> >> > > 1, the profile was just deployed around the same time as you are running
> >> > > the Batch Profiler, so the overlap is in minutes, maybe hours.  But I can
> >> > > definitely see the usefulness of the feature for re-runs, etc. as you have
> >> > > described.
> >> > >
> >> > > Based on this discussion, I created a few JIRAs.  Thanks all for the
> >> > great
> >> > > feedback and keep it coming.
> >> > >
> >> > > [1] METRON-1787 - Input Time Constraints for Batch Profiler
> >> > > [2] METRON-1788 - Fetch Profile Definitions from Zk for Batch
> Profiler
> >> > > [3] METRON-1789 - MPack Should Define Default Input Path for Batch
> >> > > Profiler
> >> > >
> >> > >
> >> > > --
> >> > > [1] https://issues.apache.org/jira/browse/METRON-1787
> >> > > [2] https://issues.apache.org/jira/browse/METRON-1788
> >> > > [3] https://issues.apache.org/jira/browse/METRON-1789
> >> > >
> >> > >
> >> > >
> >> > >
> >> > >
> >> > >
> >> > > On Thu, Sep 20, 2018 at 1:34 PM Michael Miklavcic <
> >> > > michael.miklav...@gmail.com> wrote:
> >> > >
> >> > >> I think we might want to allow the flexibility to choose the date range
> >> > >> then. I don't yet feel like I have a good enough understanding of all the
> >> > >> ways in which users would want to seed to force them to run the batch job
> >> > >> over all the data. It might also make it easier to deal with remediation,
> >> > >> i.e. an error doesn't force you to re-run over the entire history. Same goes
> >> > >> for testing out the profile seeding batch job in the first place.
> >> > >>
> >> > >> On Thu, Sep 20, 2018 at 11:23 AM Nick Allen 
> >> wrote:
> >> > >>
> >> > >> > Assuming you have 9 months of data archived, yes.
> >> > >> >
> >> > >> > On Thu, Sep 20, 2018 at 1:22 PM Michael Miklavcic <
> >> > >> > michael.miklav...@gmail.com> wrote:
> >> > >> >
> >> > >> > > So in the case of 3 - if you had 6 months of data that hadn't
> >> been
> >> > >> > profiled
> >> > >> > > and another 3 that had been profiled (9 months total data), in
> >> its
> >> > >> > current
> >> > >> > > form the batch job runs over all 9 months?
> >> > >> > >
> >> > >> > > On Thu, Sep 20, 2018 at 11:13 AM Nick Allen <
> n...@nickallen.org>
> >> > >> wrote:
> >> > >> > >
> >> > >> > > > > How do we establish "tm" from 1.1 above? Any concerns about
> >> > >> overlap
> >> > >> > or
> >> > >> > > > gaps after the seeding is performed?
> >> > >> > > >
> >> > >> > > > Good point.  Right now, if the Streaming and Batch Profiler overlap the
> >> > >> > > > last write wins.  And presumably the output of the Streaming and Batch
> >> > >> > > > Profiler are the same, so no worries, right? :)
> >> > >> > > >
> >> > >> > > > So it kind of works, but it is definitely not ideal for use case 3.  I
> >> > >> > > > could add --begin and --end args to constrain the time frame over which
> >> > >> > > > the Batch Profiler runs.  I do not have that in the feature branch.  It
> >> > >> > > > would be easy enough to add though.
> >> > >> > > >
> >> > >> > > >
> >> > >> > > >
> >> > >> > > > On Thu, Sep 20, 2018 at 12:41 PM Michael Miklavcic <
> >> > >> > > > michael.miklav...@gmail.com> wrote:
> >> > >> > > >
> >> > >> > > > > Ok, makes sense. That's sort 

Re: [DISCUSS] Migrate from Protractor to Cypress

2018-09-27 Thread Tibor Meller
Great, guys! Thanks for the feedback. I'll move forward as discussed.

Thx
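
As a rough illustration of the "mock now, unplug the mocks later" approach
from the proposal quoted below: the check under test depends only on an
injected fetch function, so the same test can run against canned responses
in CI and against the real service once PCAP data is available there. This
is a sketch in Python, not Cypress's JavaScript API, and the endpoint path,
the jobStatus field and all function names are hypothetical.

```python
import json


def real_fetch(url):
    """Placeholder for a real HTTP call; assumed unavailable in CI in this scenario."""
    raise RuntimeError("PCAP service is not reachable from this environment")


def make_mock_fetch(canned):
    """Build a fetch function that answers from canned JSON bodies keyed by URL."""
    def mock_fetch(url):
        return json.loads(canned[url])
    return mock_fetch


def pcap_job_is_finished(fetch, job_id):
    """The check under test; it only ever talks to the injected `fetch`."""
    status = fetch("/api/v1/pcap/" + job_id)   # hypothetical endpoint path
    return status["jobStatus"] == "SUCCEEDED"  # hypothetical field and value
```

In CI one would pass make_mock_fetch({...}) with canned bodies; swapping in
real_fetch (or a working equivalent) later leaves the test logic unchanged.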

On Wed, Sep 26, 2018 at 11:44 PM Michael Miklavcic <
michael.miklav...@gmail.com> wrote:

> I'm good with it. We can see some tests in action (and hopefully running in
> Travis! :-D) and then migrate and deprecate Protractor accordingly if we
> still agree that's the way to go. When you submit the first PR, please link
> to this DISCUSS via permalink from the mailing list archives. Thanks guys.
>
> Cheers,
> Mike
>
> On Wed, Sep 26, 2018 at 7:17 AM Shane Ardell wrote:
>
> > I think Tibor's idea of using PCAP tests as an introduction to Cypress for
> > Metron is a great idea. As he pointed out, PCAP tests can take advantage of
> > Cypress' capability to mock responses, and we can set it up to run in
> > Travis. Once the community is able to see the benefits from an actual set
> > of Cypress tests inside the project and running in Travis, I think any
> > questions about migrating the rest of the existing tests from Protractor to
> > Cypress will be settled. However, if for some reason we run into issues
> > implementing or running the tests, we will have invested a fraction of time
> > vs. migrating all the tests right away.
> >
> > On Wed, Sep 26, 2018 at 2:12 PM Tibor Meller wrote:
> >
> > > Hi Team,
> > >
> > > Many of us agreed that Cypress could be a more capable tool for us to
> > > write high-level UI tests, whether those be e2e, integration or automated
> > > regression tests. If there are no open questions left about Cypress, we
> > > could take it for a test drive. My suggestion is to implement the PCAP UI
> > > tests with Cypress. Some services and PCAP sample data are not yet
> > > available from our CI environment, so Protractor is hardly applicable here.
> > > This would be a great opportunity for Cypress to shine. With Cypress, we
> > > are able to mock out those responses and make it run in Travis.
> > > Whenever we make PCAP data available in Travis, we would be able to unplug
> > > those mocks and run the same tests as integration or e2e tests if we like.
> > >
> > > Because it is relatively easy to migrate between Cypress and Protractor, I
> > > see no major risks here if we decide to stick with Protractor for some
> > > reason.
> > >
> > > What do you think?
> > >
> > > Thanks for your feedback,
> > > Tibor
> > >
> > > On Wed, Sep 19, 2018 at 1:49 PM Shane Ardell wrote:
> > >
> > > > Hello everyone,
> > > >
> > > > Currently, we use Protractor to run our UI "end-to-end" tests. However,
> > > > there are a handful of major advantages we can gain from switching to
> > > > Cypress: https://www.cypress.io/features/.
> > > >
> > > >    - As with most Selenium-based e2e testing frameworks, Protractor suffers
> > > >    from test flakiness. This is because Selenium runs outside of the browser
> > > >    and executes remote commands across the network. To work around this at
> > > >    the moment, we are using protractor-flake to re-run failed tests, but this
> > > >    is more of a crutch than a fix. Cypress executes in the same run loop as
> > > >    the application it's testing, and as a result does not suffer from the
> > > >    same flakiness.
> > > >    - As a result of its architecture, Cypress runs much faster than
> > > >    Protractor. This is especially critical if e2e tests are added to the CI
> > > >    build in the future.
> > > >    - Protractor is incredibly hard to debug. In contrast, Cypress comes
> > > >    with a plethora of debugging features, some of which you can see in action
> > > >    here: https://vimeo.com/242961930#t=264s
> > > >
> > > > Does anyone else have thoughts or opinions on switching to Cypress or
> > > > staying with Protractor?
> > > >
> > > > Cheers,
> > > > Shane
> > > >
> > >
> >
>