Hi Renato,

> Having the big proposals documented on SEPs is really great to have a
> good understanding on the system!
I agree. Our previous design process was not being strictly enforced. We
hope to enforce it going forward, as there are major changes coming in the
next release.

> So this means that inside a container there will be a single processor?
StreamProcessor is nothing more than a Samza container, along with an
instance of JobCoordinator in it. Think of it as a thin wrapper around
a SamzaContainer and a JobCoordinator instance. You can find more details on
this idea here: https://issues.apache.org/jira/browse/SAMZA-1063
Going forward, we want a Samza job to consist of one or more
StreamProcessors, instead of N SamzaContainers and 1 AppMaster.
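
To make the wrapper idea a bit more concrete, here is a rough sketch in
Java. Everything in it is a simplified stand-in that I made up for
illustration (PassThroughJobCoordinator, ContainerSketch and
StreamProcessorSketch are hypothetical names, and the JobCoordinator
interface is cut down to one method); it is not the actual Samza code. It
only shows the shape: one coordinator and one container per processor,
with the container reusing the processorId that the coordinator hands out.

```java
import java.util.Objects;

// Hypothetical, simplified stand-ins for illustration; NOT the real Samza classes.

/** Assumed minimal coordinator contract: it owns the processor's identity. */
interface JobCoordinator {
    String getProcessorId();
}

/** Assumed pass-through coordinator: the environment supplies a unique ID. */
final class PassThroughJobCoordinator implements JobCoordinator {
    private final String processorId;

    PassThroughJobCoordinator(String processorId) {
        this.processorId = Objects.requireNonNull(processorId, "processorId");
    }

    @Override
    public String getProcessorId() {
        return processorId;
    }
}

/** Assumed stand-in for SamzaContainer: it only remembers the ID it runs under. */
final class ContainerSketch {
    private final String containerId;

    ContainerSketch(String containerId) {
        this.containerId = containerId;
    }

    String getContainerId() {
        return containerId;
    }
}

/** Thin wrapper: one coordinator plus one container, sharing a single ID. */
final class StreamProcessorSketch {
    private final JobCoordinator coordinator;
    private ContainerSketch container; // created in start()

    StreamProcessorSketch(JobCoordinator coordinator) {
        this.coordinator = Objects.requireNonNull(coordinator, "coordinator");
    }

    /** Creates the single container, named after the coordinator's processorId. */
    void start() {
        container = new ContainerSketch(coordinator.getProcessorId());
    }

    String getContainerId() {
        if (container == null) {
            throw new IllegalStateException("StreamProcessorSketch not started");
        }
        return container.getContainerId();
    }
}
```

In this sketch the user never passes an ID to the processor itself; the
coordinator is the only place that knows how the ID was obtained.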

>  is this related to SAMZA-1080 somehow?
Yep. SAMZA-1080 introduces StreamProcessor with an almost pass-through
JobCoordinator. In fact, at LinkedIn, one of the teams is already using
this API with the StandaloneJobCoordinator and delegating partition
distribution to the Kafka high-level consumer (since SystemConsumer is
pluggable in Samza, we have some internal wrappers around the high-level
consumer). I believe it has been working really well for stateless
applications.

Cheers!
Navina

On Thu, Mar 30, 2017 at 1:23 PM, Renato Marroquín Mogrovejo <
renatoj.marroq...@gmail.com> wrote:

> Hi Navina,
>
> Thanks for the great proposal! Having the big proposals documented on SEPs
> is really great to have a good understanding on the system!
> I have only a clarification question, the proposal states that every
> containerId is the same as the processorId. So this means that inside a
> container there will be a single processor? is this related to SAMZA-1080
> somehow?
>
>
> Best,
>
> Renato M.
>
> 2017-03-30 20:45 GMT+02:00 Navina Ramesh <nram...@linkedin.com.invalid>:
>
> > Hi Yi,
> > Good question. Three reasons:
> >
> > 1. In SAMZA-881, we came up with a set of responsibilities for the
> > JobCoordinator. One of them was to generate/assign processorId. So, it
> > makes sense to keep getProcessorId() within JobCoordinator interface.
> > 2. StreamProcessor was initially introduced as a user-facing API in
> > SAMZA-1080. ProcessorId was an argument in the StreamProcessor
> > constructor, which pushed the burden of guaranteeing uniqueness among the
> > processors of a job onto the user. This was not favorable.
> > 3. In general, I think we have consensus that the processorIdGenerator is
> > going to be specific to a runtime environment. Hence, it seems more
> > appropriate to move it to a lower abstraction layer that deals with the
> > underlying execution environment.
> >
> > Let me know if you have a different perspective on this.
> >
> > Cheers!
> > Navina
> >
> > On Thu, Mar 30, 2017 at 9:42 AM, Yi Pan <nickpa...@gmail.com> wrote:
> >
> > > @Navina,
> > >
> > > Sorry to chime in late. One question:
> > > 1. Why is it in JobCoordinator, and why not in StreamProcessor class?
> > > Because JobCoordinator provides coordination service across many
> > > processors, an interface getProcessorId() in JobCoordinator is confusing
> > > regarding which processorId we are getting.
> > >
> > > Otherwise, the proposal looks good.
> > >
> > > -Yi
> > >
> > > > On Wed, Mar 29, 2017 at 7:57 PM, Navina Ramesh <nram...@linkedin.com.invalid> wrote:
> > >
> > > > Good to hear from you, Yan. Thanks! :)
> > > >
> > > > > On Wed, Mar 29, 2017 at 7:48 PM, Yan Fang <yanfang...@gmail.com> wrote:
> > > >
> > > > > +1 . Thanks for the proposal, Navina. :)
> > > > >
> > > > > Fang, Yan
> > > > > yanfang...@gmail.com
> > > > >
> > > > > On Thu, Mar 30, 2017 at 4:24 AM, Prateek Maheshwari <
> > > > > pmaheshw...@linkedin.com.invalid> wrote:
> > > > >
> > > > > > +1 (non binding) from me.
> > > > > >
> > > > > > - Prateek
> > > > > >
> > > > > > > On Tue, Mar 28, 2017 at 2:17 PM, Boris S <bor...@gmail.com> wrote:
> > > > > >
> > > > > > > +1 Looks good to me.
> > > > > > >
> > > > > > > > On Tue, Mar 28, 2017 at 2:00 PM, xinyu liu <xinyuliu...@gmail.com> wrote:
> > > > > > >
> > > > > > > > > +1 on my side. Very happy to see this proposal. This is a
> > > > > > > > > blocker for integrating fluent API with StreamProcessor, and
> > > > > > > > > hopefully we can get it resolved soon :).
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Xinyu
> > > > > > > >
> > > > > > > > On Tue, Mar 28, 2017 at 11:28 AM, Navina Ramesh (Apache) <
> > > > > > > > nav...@apache.org>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi everyone,
> > > > > > > > >
> > > > > > > > > > This is a voting thread for SEP-1: Semantics of ProcessorId in Samza.
> > > > > > > > > > For reference, here is the wiki link:
> > > > > > > > > > https://cwiki.apache.org/confluence/display/SAMZA/SEP-1%3A+Semantics+of+ProcessorId+in+Samza
> > > > > > > > >
> > > > > > > > > Link to discussion mail thread:
> > > > > > > > > > http://mail-archives.apache.org/mod_mbox/samza-dev/201703.mbox/%3CCANazzuuHiO%3DvZQyFbTiYU-0Sfh3riK%3Dz4j_AdCicQ8rBO%3DXuYQ%40mail.gmail.com%3E
> > > > > > > > >
> > > > > > > > > Please vote on this SEP asap. :)
> > > > > > > > >
> > > > > > > > > Thanks!
> > > > > > > > > Navina
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Navina R.
> > > >
> > >
> >
> >
> >
> > --
> > Navina R.
> >
>



-- 
Navina R.
