Re: [DISCUSS] End of Stream Feature in Samza

2016-09-07 Thread Jagadish Venkatraman
Hi Navina, *>> after it has exhausted all messages from the source, it will generate a special SENTINEL envelope that will not have any key or message and only contain a special offset. Is that a valid understanding?* The table in the doc already specifies how the sentinel envelope is

Re: [DISCUSS] End of Stream Feature in Samza

2016-09-02 Thread Jagadish Venkatraman
Thank you Renato for the insightful feedback! :-) I don't think we have a complete scoping and design for the "punctuated streams" feature as of yet. However, one way to implement it would be to leverage something similar to what this feature does. We could have a special offset for a

Re: [DISCUSS] End of Stream Feature in Samza

2016-09-01 Thread Renato Marroquín Mogrovejo
Thanks Jagadish! This is great! Can you share some thoughts/opinions on how this feature relates to using punctuations (at some point) in Samza? I mean, do you think that using punctuated streams could be seen as a generalization of this problem? And if so, could this feature be used later on as a

Re: [DISCUSS] End of Stream Feature in Samza

2016-08-31 Thread Navina Ramesh
Hi Jagadish, Thanks for sharing the design with the community. I have a couple of questions that were not very clear from the design document. 1. Under mechanism for indicating the end-of-stream to Samza, you mention "The offset in the partition that the message was received from. If this is the

Re: [DISCUSS] End of Stream Feature in Samza

2016-08-30 Thread Julian Hyde
> On Aug 30, 2016, at 4:44 PM, Xinyu Liu wrote: > > It's very exciting that Samza is adding support of bounded input streams. +1!

Re: [DISCUSS] End of Stream Feature in Samza

2016-08-30 Thread xinyu liu
It's very exciting that Samza is adding support of bounded input streams. Nice write-up of different scenarios and options. Look forward to having this feature work with the upcoming HDFS consumer! Thanks, Xinyu On Tue, Aug 30, 2016 at 12:09 PM, Jagadish Venkatraman < jagadish1...@gmail.com>

[DISCUSS] End of Stream Feature in Samza

2016-08-30 Thread Jagadish Venkatraman
Currently, Samza works with streaming input sources like Kafka topics. This proposal will build a notion of 'end-of-stream' into Samza to support data sources that are bounded, such as HDFS files or snapshots on disk. Proposal: