Repository: beam Updated Branches: refs/heads/master ca12be5e6 -> f8ae1185c
Update JavaDoc of checkpoint, and expose `enable.auto.commit` for more visible. Project: http://git-wip-us.apache.org/repos/asf/beam/repo Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/4cdfe36c Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/4cdfe36c Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/4cdfe36c Branch: refs/heads/master Commit: 4cdfe36ce3b423dfb0f26a371a255d5490e7b593 Parents: ca12be5 Author: mingmxu <[email protected]> Authored: Tue May 2 12:01:38 2017 -0700 Committer: Dan Halperin <[email protected]> Committed: Thu May 4 10:26:14 2017 -0700 ---------------------------------------------------------------------- .../java/org/apache/beam/sdk/io/kafka/KafkaIO.java | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-) ---------------------------------------------------------------------- http://git-wip-us.apache.org/repos/asf/beam/blob/4cdfe36c/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java ---------------------------------------------------------------------- diff --git a/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java b/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java index 8f94b8a..8ab33d1 100644 --- a/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java +++ b/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java @@ -204,14 +204,23 @@ import org.slf4j.LoggerFactory; * * <h3>Partition Assignment and Checkpointing</h3> * The Kafka partitions are evenly distributed among splits (workers). - * Checkpointing is fully supported and each split can resume from previous checkpoint. See - * {@link UnboundedKafkaSource#split(int, PipelineOptions)} for more details on + * + * <p>Checkpointing is fully supported and each split can resume from previous checkpoint + * (to the extent supported by runner). + * See {@link UnboundedKafkaSource#split(int, PipelineOptions)} for more details on * splits and checkpoint support. * - * <p>When the pipeline starts for the first time without any checkpoint, the source starts + * <p>When the pipeline starts for the first time, or without any checkpoint, the source starts * consuming from the <em>latest</em> offsets. You can override this behavior to consume from the * beginning by setting appropriate appropriate properties in {@link ConsumerConfig}, through * {@link Read#updateConsumerProperties(Map)}. + * You can also enable offset auto_commit in Kafka to resume from last committed. + * + * <p>In summary, KafkaIO.read follows below sequence to set initial offset:<br> + * 1. {@link KafkaCheckpointMark} provided by runner;<br> + * 2. Consumer offset stored in Kafka when + * {@code ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG = true};<br> + * 3. Start from <em>latest</em> offset by default; * * <h3>Writing to Kafka</h3> *
