-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/50143/
-----------------------------------------------------------

Review request for samza, Boris Shkolnik, Chris Pettitt, Fred Ji, Jake Maes, Yi 
Pan (Data Infrastructure), Navina Ramesh, and Xinyu Liu.


Repository: samza


Description
-------

Samza currently works with unbounded data sources. However, for bounded data 
sources like HDFS files, snapshot files which are not infinite, we need a 
notion of 'end-of-stream'.
The following are the logical tasks:
1. SystemConsumer will indicate to Samza that the end of stream has been 
reached for an SSP. (by constructing an envelope with eof set to true)
2. Samza will shut down the task if all SSPs in the task are at end of stream.
3. Samza will provide a callback to the task so that it can perform cleanups/ 
commits once tasks are at end of stream.
4. Samza will shut down the container if all tasks in the container have been 
shut down.
5. Samza will ultimately shut down the job if all containers in the job have 
been shut down.

This is a step towards realizing a 'finite' Samza job that terminates (as 
opposed to an infinite stream job that keeps running) once data processing is 
complete.


=== This RB is an RFC for design feedback ====

TODO:
1. Add more unit tests
2. Verify behavior with multiple containers


Diffs
-----

  samza-api/src/main/java/org/apache/samza/system/IncomingMessageEnvelope.java 
cc860cf7eb4d514736913c1dceaa80534b61d71a 
  samza-api/src/main/java/org/apache/samza/task/EndOfStreamListenerTask.java 
PRE-CREATION 
  samza-core/src/main/scala/org/apache/samza/container/TaskInstance.scala 
d32a92976e43ca24033b48c91851ee706de7de6b 
  samza-core/src/test/scala/org/apache/samza/container/TestRunLoop.scala 
e280daa9626757cb4d17c0c03eed923277230c3e 

Diff: https://reviews.apache.org/r/50143/diff/


Testing
-------

Added an unit test and verified that an End of stream message terminates the 
runloop.


Thanks,

Jagadish Venkatraman

Reply via email to