Hi Jagadish, Thanks for sharing the design with the community. I have a couple of questions that were not very clear from the design document.
1. Under mechanism for indicating the end-of-stream to Samza, you mention "The offset in the partition that the message was received from. If this is the last message in the SSP, this field is set to END_OF_STREAM. Such a message is not delivered to the actual StreamTask implementation." Does this mean the last message is not delivered to the Task? How do you identify that it is indeed the last message in the SSP? Does the source provide that info? If that is the case, then it kind of creates contract between any bounded system consumer and samza. Or did you mean to say that we assume end-of-stream has been reached when there is no message returned on poll? Please do clarify. I think what is missing in this document is how to "detect" an end-of-stream from the source. 2. Does this design preclude the possibility of consuming bounded and unbounded stream partitions in the task ? 3. During checkpoint, let's say some of the partitions have reached EOF. Do we write a special offset in the checkpoint message that indicates that it has reached end of stream and don't need to poll anymore? Thanks! Navina On Tue, Aug 30, 2016 at 4:50 PM, Julian Hyde <jh...@apache.org> wrote: > > > On Aug 30, 2016, at 4:44 PM, Xinyu Liu <xinyuliu...@gmail.com> wrote: > > > > It's very exciting that Samza is adding support of bounded input streams. > > +1! > > -- Navina R.