It's very exciting that Samza is adding support for bounded input streams. Nice write-up of the different scenarios and options. I look forward to having this feature work with the upcoming HDFS consumer!

Thanks,
Xinyu

On Tue, Aug 30, 2016 at 12:09 PM, Jagadish Venkatraman <jagadish1...@gmail.com> wrote:

> Currently, Samza works with streaming input sources like Kafka topics. This
> proposal will build an idea of 'end-of-stream' into Samza to support data
> sources that are bounded - like HDFS files, snapshots on disk etc.
>
> Proposal:
>
> https://issues.apache.org/jira/secure/attachment/12825119/ProposalforEndofStreaminSamza.pdf
>
> This is tracked in SAMZA-974.
>
> --
> Jagadish V,
> Graduate Student,
> Department of Computer Science,
> Stanford University