Mark Norkin created BEAM-6902:
---------------------------------
Summary: Beam model contract for finalization of CheckpointMark's
Key: BEAM-6902
URL: https://issues.apache.org/jira/browse/BEAM-6902
Project: Beam
Issue Type: Improvement
Components: beam-model, io-java-kafka, runner-core, runner-dataflow,
sdk-java-core
Reporter: Mark Norkin
Question: What is the contract in Beam model for when checkpoint marks shall be
finalized, is there any ?
I'm working on pipeline that reads messages from Kafka using KafkaIO, and I'm
looking at _commitOffsetsInFinalize()_ option, and KafkaCheckpointMark class.
I want to achieve at-least-once message delivery semantics and want to be sure
that offsets committed to Kafka after they are written to some sink.
Looking at interface of
[CheckpointMark|https://beam.apache.org/releases/javadoc/2.9.0/org/apache/beam/sdk/io/UnboundedSource.CheckpointMark.html]
it's not clear when finalization shall be expected to happen.
Is it runner dependent, what to expect when executing on _DataflowRunner_ ?
And reading KafkaIO.Read javadoc on
_[commitOffsetsInFinalize|[https://beam.apache.org/releases/javadoc/2.9.0/org/apache/beam/sdk/io/kafka/KafkaIO.Read.html#commitOffsetsInFinalize--]]_
also doesn't bring clarity to my understanding, particularly the phrase
{quote}But it does not provide *_hard processing guarantees_*
{quote}
What exactly are hard processing guarantees ?
Can I ask, please for documentation improvement in respect of _CheckpointMark_
and _commitOffsetsInFinalize_.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)