[ 
https://issues.apache.org/jira/browse/BEAM-6902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Norkin updated BEAM-6902:
------------------------------
    Description: 
Question: What is the contract in Beam model for when checkpoint marks shall be 
finalized, is there any ? 

I'm working on pipeline that reads messages from Kafka using KafkaIO, and I'm 
looking at _commitOffsetsInFinalize()_ option, and KafkaCheckpointMark class.

I want to achieve at-least-once message delivery semantics and want to be sure 
that offsets committed to Kafka after they are written to some sink.

Looking at interface of 
[CheckpointMark|https://beam.apache.org/releases/javadoc/2.9.0/org/apache/beam/sdk/io/UnboundedSource.CheckpointMark.html]
  it's not clear when finalization shall be expected to happen.

Is it runner dependent, what to expect when executing on _DataflowRunner_ ?

And reading KafkaIO.Read 
[javadoc|[https://beam.apache.org/releases/javadoc/2.9.0/org/apache/beam/sdk/io/kafka/KafkaIO.Read.html#commitOffsetsInFinalize--]]
 on 
_[commitOffsetsInFinalize|[https://beam.apache.org/releases/javadoc/2.9.0/org/apache/beam/sdk/io/kafka/KafkaIO.Read.html#commitOffsetsInFinalize--]]_
 also doesn't bring clarity to my understanding, particularly the phrase 
{quote}But it does not provide *_hard processing guarantees_*
{quote}
What exactly are hard processing guarantees ?

Can I ask, please for documentation improvement in respect of _CheckpointMark_ 
and _commitOffsetsInFinalize_. 

 

  was:
Question: What is the contract in Beam model for when checkpoint marks shall be 
finalized, is there any ? 

I'm working on pipeline that reads messages from Kafka using KafkaIO, and I'm 
looking at _commitOffsetsInFinalize()_ option, and KafkaCheckpointMark class.

I want to achieve at-least-once message delivery semantics and want to be sure 
that offsets committed to Kafka after they are written to some sink.

Looking at interface of 
[CheckpointMark|https://beam.apache.org/releases/javadoc/2.9.0/org/apache/beam/sdk/io/UnboundedSource.CheckpointMark.html]
  it's not clear when finalization shall be expected to happen.

Is it runner dependent, what to expect when executing on _DataflowRunner_ ?

And reading KafkaIO.Read javadoc on _[commitOffsetsInFinalize | 
[https://beam.apache.org/releases/javadoc/2.9.0/org/apache/beam/sdk/io/kafka/KafkaIO.Read.html#commitOffsetsInFinalize--]]_
 also doesn't bring clarity to my understanding, particularly the phrase 
{quote}But it does not provide *_hard processing guarantees_*
{quote}
What exactly are hard processing guarantees ?

Can I ask, please for documentation improvement in respect of _CheckpointMark_ 
and _commitOffsetsInFinalize_. 

 


> Beam model contract for finalization of CheckpointMark's    
> ------------------------------------------------------------
>
>                 Key: BEAM-6902
>                 URL: https://issues.apache.org/jira/browse/BEAM-6902
>             Project: Beam
>          Issue Type: Improvement
>          Components: beam-model, io-java-kafka, runner-core, runner-dataflow, 
> sdk-java-core
>            Reporter: Mark Norkin
>            Priority: Major
>
> Question: What is the contract in Beam model for when checkpoint marks shall 
> be finalized, is there any ? 
> I'm working on pipeline that reads messages from Kafka using KafkaIO, and I'm 
> looking at _commitOffsetsInFinalize()_ option, and KafkaCheckpointMark class.
> I want to achieve at-least-once message delivery semantics and want to be 
> sure that offsets committed to Kafka after they are written to some sink.
> Looking at interface of 
> [CheckpointMark|https://beam.apache.org/releases/javadoc/2.9.0/org/apache/beam/sdk/io/UnboundedSource.CheckpointMark.html]
>   it's not clear when finalization shall be expected to happen.
> Is it runner dependent, what to expect when executing on _DataflowRunner_ ?
> And reading KafkaIO.Read 
> [javadoc|[https://beam.apache.org/releases/javadoc/2.9.0/org/apache/beam/sdk/io/kafka/KafkaIO.Read.html#commitOffsetsInFinalize--]]
>  on 
> _[commitOffsetsInFinalize|[https://beam.apache.org/releases/javadoc/2.9.0/org/apache/beam/sdk/io/kafka/KafkaIO.Read.html#commitOffsetsInFinalize--]]_
>  also doesn't bring clarity to my understanding, particularly the phrase 
> {quote}But it does not provide *_hard processing guarantees_*
> {quote}
> What exactly are hard processing guarantees ?
> Can I ask, please for documentation improvement in respect of 
> _CheckpointMark_ and _commitOffsetsInFinalize_. 
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to