Hey Encho,

The Flink Runner acknowledges messages through PubSubIO's `CheckpointMark#finalizeCheckpoint()` method.

The Flink Runner wraps the PubSubIO source via the UnboundedSourceWrapper. When Flink takes a checkpoint of the running Beam streaming job, the wrapper will retrieve the CheckpointMarks from the PubSubIO source.

When the Checkpoint is completed, there is a callback which informs the wrapper (`notifyCheckpointComplete()`) and calls `finalizeCheckpoint()` on all the generated CheckpointMarks.

Hope that helps debugging your problem. I don't have an explanation why this doesn't work for the last records in your PubSub queue. It shouldn't make a difference for how the Flink Runner does checkpointing.

Best,
Max

On 10.09.18 18:17, Encho Mishinev wrote:
Hello,

I am using Flink runner with Apache Beam 2.6.0. I was wondering if there is information on when exactly the runner acknowledges a pubsub message when reading from PubsubIO?

My problem is that whenever there are a few messages left in a subscription my streaming job never really seems to acknowledge them all. For example is a subscription has 100,000,000 messages in total, the job will go through about 99,990,000 and then keep reading the last few thousand and seemingly never acknowledge them.

Some clarity on when the acknowledgement happens in the pipeline might help me debug this problem.

Thanks!

Reply via email to