Re: [DISCUSS] SEP-2: ApplicationRunner Design

2017-06-21 Thread Chris Pettitt
Now that I look at the whole list again, I see we could adequately address #2 by first addressing #3. If instead of building aggregation directly into the window operator we had aggregation operators like sum then the lambas would be more intuitive. It would probably even be nicer if the first

Re: [DISCUSS] SEP-2: ApplicationRunner Design

2017-06-20 Thread Prateek Maheshwari
1. +1 for not requiring explicit type information unless its unavoidable. This one seems easy to fix, but there are other places we should address too (OutputStream, Window operator). We should probably discuss the Window API separately from this discussion, but to Chris' point: 2. Re: 2

Re: [DISCUSS] SEP-2: ApplicationRunner Design

2017-06-20 Thread Chris Pettitt
Feedback for PageViewCounterStreamSpecExample: https://github.com/nickpan47/samza/blob/new-api-v2/samza-core/src/test/java/org/apache/samza/example/PageViewCounterStreamSpecExample.java#L65: When we set up the input we had the message type, but it looks like we are not propagating it via

Re: [DISCUSS] SEP-2: ApplicationRunner Design

2017-06-20 Thread Chris Pettitt
Yi, What examples should we be looking at for new-api-v2? 1. samza/samza-core/src/test/java/org/apache/samza/example/PageViewCounterStreamSpecExample.java others? - Chris On Mon, Jun 19, 2017 at 5:29 PM, Yi Pan wrote: > Hi, all, > > Here is the promised code examples

Re: [DISCUSS] SEP-2: ApplicationRunner Design

2017-06-19 Thread Yi Pan
Hi, all, Here is the promised code examples for the revised API, and the related change to how we specify serdes in the API: - User example for the new API chagne: https://github.com/nickpan47/samza/tree/new-api-v2 - Prateekā€™s PR for the proposed schema registry change:

Re: [DISCUSS] SEP-2: ApplicationRunner Design

2017-06-06 Thread Yi Pan
Hi, all, Thanks for all the inputs! Finally I got some time to go through the discussion thread and digest most of the points made above. Here is my personal summary: Consensus on requirements: 1. ApplicationRunner needs async APIs. 2. ApplicationRunner can be hidden from user (except

Re: [DISCUSS] SEP-2: ApplicationRunner Design

2017-05-03 Thread Chris Pettitt
Hi Xinyu, I took a second look at the registerStore API. Would it be possible to call register storeDirectly on the app, similar to what we're doing with app.input (possible with the restriction registerStore must be called before we add an operator that uses the store)? Otherwise we'll end up

Re: [DISCUSS] SEP-2: ApplicationRunner Design

2017-05-01 Thread xinyu liu
Looked again at Chris's beam-samza-runner implementation. Seems LocalApplicationRunner.run() should be asynchronous too. Current implementation is actually using a latch to wait for the StreamProcessors to finish, which seems unnecessary. And we can provide a waitUntilFinish() counterpart to the

Re: [DISCUSS] SEP-2: ApplicationRunner Design

2017-04-28 Thread xinyu liu
Right, option #2 seems redundant for defining streams after further discussion here. StreamSpec itself is flexible enough to achieve both static and programmatic specification of the stream. Agree it's not convenient for now (pretty obvious after looking at your bsr beam.runners.samza.wrapper),

Re: [DISCUSS] SEP-2: ApplicationRunner Design

2017-04-28 Thread Chris Pettitt
Your proposal for #1 looks good. I'm not quite how to reconcile the proposals for #1 and #2. In #1 you add the stream spec straight onto the runner while in #2 you do it in a callback. If it is either-or, #1 looks a lot better for my purposes. For #4 what mechanism are you using to distribute

Re: [DISCUSS] SEP-2: ApplicationRunner Design

2017-04-27 Thread xinyu liu
btw, I will get to SAMZA-1246 as soon as possible. Thanks, Xinyu On Thu, Apr 27, 2017 at 9:11 PM, xinyu liu wrote: > Let me try to capture the updated requirements: > > 1. Set up input streams outside StreamGraph, and treat graph building as a > library (*Fluent, Beam*).

Re: [DISCUSS] SEP-2: ApplicationRunner Design

2017-04-27 Thread xinyu liu
Let me try to capture the updated requirements: 1. Set up input streams outside StreamGraph, and treat graph building as a library (*Fluent, Beam*). 2. Improve ease of use for ApplicationRunner to avoid complex configurations such as zkCoordinator, zkCoordinationService. (*Standalone*). Provide

Re: [DISCUSS] SEP-2: ApplicationRunner Design

2017-04-27 Thread Chris Pettitt
That should have been: For #1, Beam doesn't have a hard requirement... On Thu, Apr 27, 2017 at 9:07 AM, Chris Pettitt wrote: > For #1, I doesn't have a hard requirement for any change from Samza. A > very nice to have would be to allow the input systems to be set up at

Re: [DISCUSS] SEP-2: ApplicationRunner Design

2017-04-27 Thread Chris Pettitt
For #1, I doesn't have a hard requirement for any change from Samza. A very nice to have would be to allow the input systems to be set up at the same time as the rest of the StreamGraph. An even nicer to have would be to do away with the callback based approach and treat graph building as a

Re: [DISCUSS] SEP-2: ApplicationRunner Design

2017-04-26 Thread xinyu liu
Let me give a shot to summarize the requirements for ApplicationRunner we have discussed so far: - Support environment for passing in user-defined objects (streams potentially) into ApplicationRunner (*Beam*) - Improve ease of use for ApplicationRunner to avoid complex configurations such as

Re: [DISCUSS] SEP-2: ApplicationRunner Design

2017-04-24 Thread Prateek Maheshwari
Thanks for putting this together Yi! I agree with Jake, it does seem like there are a few too many moving parts here. That said, the problem being solved is pretty broad, so let me try to summarize my current understanding of the requirements. Please correct me if I'm wrong or missing something.

Re: [DISCUSS] SEP-2: ApplicationRunner Design

2017-04-21 Thread Chris Pettitt
I'm playing with ApplicationRunner, so I'll probably have more feedback. For now, in addition to async run we also need async notification of completion or failure. Also, ApplicationStatus should be able to give me the cause of failure (e.g. via an Exception), not just a failure state. On Thu,

Re: [DISCUSS] SEP-2: ApplicationRunner Design

2017-04-21 Thread Navina Ramesh
Hey Yi, Thanks for lot for your work on this document. I know it must have been crazy trying to put-together everything in a single doc :) Here are my comments. Sorry about the delay :( 1. It will be useful to set some background for the benefit of the community members who haven't been

Re: [DISCUSS] SEP-2: ApplicationRunner Design

2017-04-20 Thread Chris Pettitt
It might be worth taking a look at how Beam does test streams. The API is more powerful than just passing in a queue, e.g.: TestStream source = TestStream.create(StringUtf8Coder.of()) .addElements(TimestampedValue.of("this", start)) .addElements(TimestampedValue.of("that", start))

Re: [DISCUSS] SEP-2: ApplicationRunner Design

2017-04-20 Thread Jacob Maes
Thanks for the SEP! +1 on introducing these new components -1 on the current definition of their roles (see Design feedback below) *Design* - If LocalJobRunner and RemoteJobRunner handle the different methods of launching a Job, what additional value do the different types of

Re: [DISCUSS] SEP-2: ApplicationRunner Design

2017-04-17 Thread Yi Pan
Made some updates to clarify the role and functions of RuntimeEnvironment in SEP-2. On Fri, Apr 14, 2017 at 9:30 AM, Yi Pan wrote: > Hi, everyone, > > In light of new features such as fluent API and standalone that introduce > new deployment / application launch models in

[DISCUSS] SEP-2: ApplicationRunner Design

2017-04-14 Thread Yi Pan
Hi, everyone, In light of new features such as fluent API and standalone that introduce new deployment / application launch models in Samza, I created a new SEP-2 to address the new use cases. SEP-2 link: https://cwiki.apache.org/confluence/display/SAMZA/SEP-2%3A+ApplicationRunner+Design Please