Repository: samza
Updated Branches:
  refs/heads/master 362658938 -> 202a15809


SAMZA-1046: Docs for checkpointable consumer


Project: http://git-wip-us.apache.org/repos/asf/samza/repo
Commit: http://git-wip-us.apache.org/repos/asf/samza/commit/202a1580
Tree: http://git-wip-us.apache.org/repos/asf/samza/tree/202a1580
Diff: http://git-wip-us.apache.org/repos/asf/samza/diff/202a1580

Branch: refs/heads/master
Commit: 202a15809ee4abc7700af9b02e903f2ebd57a662
Parents: 3626589
Author: Boris Shkolnik <bor...@apache.org>
Authored: Tue Nov 22 14:36:09 2016 -0800
Committer: vjagadish1989 <jvenk...@linkedin.com>
Committed: Tue Nov 22 14:36:14 2016 -0800

----------------------------------------------------------------------
 .../versioned/container/checkpointing.md          | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/samza/blob/202a1580/docs/learn/documentation/versioned/container/checkpointing.md
----------------------------------------------------------------------
diff --git a/docs/learn/documentation/versioned/container/checkpointing.md 
b/docs/learn/documentation/versioned/container/checkpointing.md
index 6f8c6d6..9fb7e6d 100644
--- a/docs/learn/documentation/versioned/container/checkpointing.md
+++ b/docs/learn/documentation/versioned/container/checkpointing.md
@@ -121,4 +121,22 @@ samza-example/target/bin/checkpoint-tool.sh \
 
 Note that Samza only reads checkpoints on container startup. In order for your 
checkpoint change to take effect, you need to first stop the job, then save the 
modified offsets, and then start the job again. If you write a checkpoint while 
the job is running, it will most likely have no effect.
 
+### Checkpoint Callbacks
+Currently Samza takes care of checkpointing for all the systems. But there are 
some use-cases when we may need to inform the Consumer about each checkpoint we 
make.
+Here are few examples:
+
+* Samza cannot do checkpointing correctly or efficiently. One such case is 
when Samza is not doing the partitioning. In this case the container doesn’t 
know which SSPs it is responsible for, and thus cannot checkpoint them. An 
actual example could be a system which relies on an auto-balanced High Level 
Kafka Consumer for partitioning.
+* Systems in which the consumer itself needs to control the checkpointed 
offset. Some systems do not support seek() operation (are not replayable), but 
they rely on ACKs for the delivered messages. Example could be a Kinesis 
consumer. Kinesis library provides a checkpoint callback in the* process() 
*call (push system). This callback needs to be invoked after the records are 
processed. This can only be done by the consumer itself.
+* Systems that use checkpoint/offset information for some maintenance actions. 
This information may be used to implement a smart retention policy (deleting 
all the data after it has been consumed).
+
+In order to use the checkpoint callback a SystemConsumer needs to implement 
the CheckpointListener interface:
+{% highlight java %}
+public interface CheckpointListener {
+  void onCheckpoint(Map<SystemStreamPartition, String> offsets);
+}
+{% endhighlight %}
+For the SystemConsumers which implement this interface Samza will invoke 
onCheckpoint() callback every time OffsetManager checkpoints. Checkpoints are 
done per task, and 'offsets' are all the offsets Samza checkpoints for a task,
+and these are the offsets which will be passed to the consumer on restart.
+Note that the callback will happen after the checkpoint and is **not** atomic.
+
 ## [State Management &raquo;](state-management.html)

Reply via email to