Hi Renato,

> Having the big proposals documented on SEPs is really great to have a good understanding on the system!

I agree. Our previous design process was not being strictly enforced. We hope to enforce it going forward as there are major changes coming into the next release.

> So this means that inside a container there will be a single processor?

StreamProcessor is nothing more than a Samza container, along with an instance of JobCoordinator in it. Think of it as a thin wrapper around a SamzaContainer and a JobCoordinator instance. You can find more details on this idea here - https://issues.apache.org/jira/browse/SAMZA-1063
Going forward, we want a Samza job to consist of one or more StreamProcessors, instead of N SamzaContainers and 1 AppMaster.

> is this related to SAMZA-1080 somehow?

Yep. SAMZA-1080 introduces StreamProcessor with an almost pass-through JobCoordinator. In fact, at LinkedIn, one of the teams is already using this API with the StandaloneJobCoordinator and delegating partition distribution to the Kafka high-level consumer (since SystemConsumer is pluggable in Samza, we have some internal wrappers around the high-level consumer). It has been working really well for stateless applications, I believe.

Cheers!
Navina

On Thu, Mar 30, 2017 at 1:23 PM, Renato Marroquín Mogrovejo <renatoj.marroq...@gmail.com> wrote:

> Hi Navina,
>
> Thanks for the great proposal! Having the big proposals documented on SEPs is really great to have a good understanding on the system!
> I have only a clarification question, the proposal states that every containerId is the same as the processorId. So this means that inside a container there will be a single processor? is this related to SAMZA-1080 somehow?
>
> Best,
>
> Renato M.
>
> 2017-03-30 20:45 GMT+02:00 Navina Ramesh <nram...@linkedin.com.invalid>:
>
> > Hi Yi,
> > Good question. Three reasons:
> >
> > 1. In SAMZA-881, we came up with a set of responsibilities for the JobCoordinator. One of them was to generate/assign processorId. So, it makes sense to keep getProcessorId() within the JobCoordinator interface.
> > 2. StreamProcessor was initially introduced as a user-facing API in SAMZA-1080. ProcessorId was an argument in the StreamProcessor constructor.
> > It was pushing the burden of guaranteeing uniqueness among the processors of a job to the user. This was not favorable.
> > 3. In general, I think we have consensus that the processorIdGenerator is going to be specific to a runtime environment. Hence, it seems more appropriate to move it to a lower abstraction layer that deals with the underlying execution environment.
> >
> > Let me know if you have a different perspective on this.
> >
> > Cheers!
> > Navina
> >
> > On Thu, Mar 30, 2017 at 9:42 AM, Yi Pan <nickpa...@gmail.com> wrote:
> >
> > > @Navina,
> > >
> > > Sorry to chime in late. One question:
> > > 1. Why is it in JobCoordinator, and why not in the StreamProcessor class? Because JobCoordinator provides coordination service across many processors, an interface getProcessorId() in JobCoordinator is confusing regarding which processorId we are getting.
> > >
> > > Otherwise, the proposal looks good.
> > >
> > > -Yi
> > >
> > > On Wed, Mar 29, 2017 at 7:57 PM, Navina Ramesh <nram...@linkedin.com.invalid> wrote:
> > >
> > > > Good to hear from you, Yan. Thanks! :)
> > > >
> > > > On Wed, Mar 29, 2017 at 7:48 PM, Yan Fang <yanfang...@gmail.com> wrote:
> > > >
> > > > > +1 . Thanks for the proposal, Navina. :)
> > > > >
> > > > > Fang, Yan
> > > > > yanfang...@gmail.com
> > > > >
> > > > > On Thu, Mar 30, 2017 at 4:24 AM, Prateek Maheshwari <pmaheshw...@linkedin.com.invalid> wrote:
> > > > >
> > > > > > +1 (non binding) from me.
> > > > > >
> > > > > > - Prateek
> > > > > >
> > > > > > On Tue, Mar 28, 2017 at 2:17 PM, Boris S <bor...@gmail.com> wrote:
> > > > > >
> > > > > > > +1 Looks good to me.
> > > > > > >
> > > > > > > On Tue, Mar 28, 2017 at 2:00 PM, xinyu liu <xinyuliu...@gmail.com> wrote:
> > > > > > >
> > > > > > > > +1 on my side. Very happy to see this proposal.
> > > > > > > > This is a blocker for integrating fluent API with StreamProcessor, and hopefully we can get it resolved soon :).
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Xinyu
> > > > > > > >
> > > > > > > > On Tue, Mar 28, 2017 at 11:28 AM, Navina Ramesh (Apache) <nav...@apache.org> wrote:
> > > > > > > >
> > > > > > > > > Hi everyone,
> > > > > > > > >
> > > > > > > > > This is a voting thread for SEP-1: Semantics of ProcessorId in Samza.
> > > > > > > > > For reference, here is the wiki link:
> > > > > > > > > https://cwiki.apache.org/confluence/display/SAMZA/SEP-1%3A+Semantics+of+ProcessorId+in+Samza
> > > > > > > > >
> > > > > > > > > Link to discussion mail thread:
> > > > > > > > > http://mail-archives.apache.org/mod_mbox/samza-dev/201703.mbox/%3CCANazzuuHiO%3DvZQyFbTiYU-0Sfh3riK%3Dz4j_AdCicQ8rBO%3DXuYQ%40mail.gmail.com%3E
> > > > > > > > >
> > > > > > > > > Please vote on this SEP asap. :)
> > > > > > > > >
> > > > > > > > > Thanks!
> > > > > > > > > Navina
> > > >
> > > > --
> > > > Navina R.
> >
> > --
> > Navina R.

--
Navina R.
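
For readers skimming the thread, the design point under vote (the JobCoordinator, not the user, supplies the processorId, and StreamProcessor is a thin wrapper that holds a coordinator) can be sketched roughly as below. This is only an illustration under stated assumptions: the types are simplified stand-ins, not the actual Samza classes, and the host-plus-PID scheme in HostPidJobCoordinator is a hypothetical example of an environment-specific ID generator, not something proposed in SEP-1.

```java
import java.util.Objects;

/** Illustrative sketch only: simplified stand-ins, not the real Samza classes. */
public class ProcessorIdSketch {

    /** Trimmed-down coordinator: it owns the identity of the processor it lives in. */
    interface JobCoordinator {
        String getProcessorId();
    }

    /**
     * Hypothetical environment-specific coordinator. The host-plus-PID scheme is an
     * assumption for illustration; other runtimes could generate the ID differently.
     */
    static final class HostPidJobCoordinator implements JobCoordinator {
        private final String processorId;

        HostPidJobCoordinator(String host, long pid) {
            this.processorId = Objects.requireNonNull(host, "host") + "-" + pid;
        }

        @Override
        public String getProcessorId() {
            return processorId;
        }
    }

    /**
     * Thin wrapper: the constructor takes no processorId, so the user is not
     * responsible for uniqueness; the ID always comes from the coordinator.
     */
    static final class StreamProcessor {
        private final JobCoordinator coordinator;

        StreamProcessor(JobCoordinator coordinator) {
            this.coordinator = Objects.requireNonNull(coordinator, "coordinator");
        }

        String describe() {
            return "StreamProcessor[" + coordinator.getProcessorId() + "]";
        }
    }

    public static void main(String[] args) {
        StreamProcessor processor =
                new StreamProcessor(new HostPidJobCoordinator("host-a", 4242L));
        System.out.println(processor.describe());
    }
}
```

Swapping in a different JobCoordinator implementation changes how IDs are generated without touching StreamProcessor, which is the gist of reason 3 in Navina's reply to Yi.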