[ https://issues.apache.org/jira/browse/FLINK-18675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17170497#comment-17170497 ]
Congxian Qiu(klion26) commented on FLINK-18675:
-----------------------------------------------

[~raviratnakar] I think the problem here is that {{CheckpointRequestDecider}} has a stale value of {{lastCheckpointCompletionRelativeTime}} when checking whether the checkpoint request is too early.

1. We retrieve the value of {{lastCheckpointCompletionRelativeTime}} when calling {{CheckpointRequestDecider#chooseRequestToExecute}} in {{CheckpointCoordinator#triggerCheckpoint}}.
2. A pending checkpoint completes and updates the variables {{pendingCheckpoints}} and {{lastCheckpointCompletionRelativeTime}}.
3. In {{CheckpointRequestDecider#chooseRequestToExecute}} we use the previous {{lastCheckpointCompletionRelativeTime}} to check whether the current checkpoint request is too early.

I think we can solve the problem by reading the value of {{lastCheckpointCompletionRelativeTime}} inside {{CheckpointRequestDecider#chooseRequestToExecute}}.

> Checkpoint not maintaining minimum pause duration between checkpoints
> ---------------------------------------------------------------------
>
>                 Key: FLINK-18675
>                 URL: https://issues.apache.org/jira/browse/FLINK-18675
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Checkpointing
>    Affects Versions: 1.11.0
>         Environment: !image.png!
>            Reporter: Ravi Bhushan Ratnakar
>            Priority: Critical
>         Attachments: image.png
>
>
> I am running a streaming job with Flink 1.11.0 on Kubernetes
> infrastructure. I have configured checkpointing as follows:
> Interval - 3 minutes
> Minimum pause between checkpoints - 3 minutes
> Checkpoint timeout - 10 minutes
> Checkpointing Mode - Exactly Once
> Number of Concurrent Checkpoints - 1
>
> Other configs
> Time Characteristics - Processing Time
>
> I am observing an unusual behaviour. *When a checkpoint completes successfully*
> *and its end-to-end duration is almost equal to or greater than the minimum*
> *pause duration, the next checkpoint gets triggered immediately without*
> *maintaining the minimum pause duration*. Kindly notice this behaviour from
> checkpoint id 194 onward in the attached screenshot.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
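
The three-step race described in the comment above can be illustrated with a minimal, self-contained Java sketch. This is a simplified model and not Flink's actual code: the class {{MinPauseRaceSketch}}, its method names, and the timestamps are all hypothetical, chosen only to show why a value captured before a checkpoint completes can let the next request bypass the minimum pause, and why reading the value at decision time avoids that.

```java
import java.util.function.LongSupplier;

// Hypothetical, simplified model of the race; NOT Flink's CheckpointRequestDecider.
final class MinPauseRaceSketch {

    /** True when less than minPauseMs has elapsed since the last checkpoint completion. */
    static boolean isTooEarly(long nowMs, long lastCompletionMs, long minPauseMs) {
        return nowMs - lastCompletionMs < minPauseMs;
    }

    /** Problematic shape: the caller captured the completion time earlier and passes that snapshot. */
    static boolean decideWithSnapshot(long nowMs, long snapshotLastCompletionMs, long minPauseMs) {
        return !isTooEarly(nowMs, snapshotLastCompletionMs, minPauseMs);
    }

    /** Proposed shape: the completion time is read at the moment the decision is made. */
    static boolean decideWithFreshRead(long nowMs, LongSupplier lastCompletionMs, long minPauseMs) {
        return !isTooEarly(nowMs, lastCompletionMs.getAsLong(), minPauseMs);
    }

    public static void main(String[] args) {
        long minPauseMs = 180_000L;        // 3 minutes, as in the reported configuration
        long[] lastCompletionMs = {0L};    // stand-in for the coordinator's mutable field

        // Step 1: the trigger path captures the value before the decision runs.
        long snapshot = lastCompletionMs[0];

        // Step 2: a pending checkpoint completes and updates the field (assumed time).
        lastCompletionMs[0] = 200_000L;

        // Step 3: the decision runs one second later.
        long nowMs = 201_000L;

        // Stale snapshot: 201_000 - 0 >= 180_000, so the request is allowed immediately.
        System.out.println("stale snapshot triggers: "
                + decideWithSnapshot(nowMs, snapshot, minPauseMs));                        // true

        // Fresh read: 201_000 - 200_000 < 180_000, so the request is held back.
        System.out.println("fresh read triggers: "
                + decideWithFreshRead(nowMs, () -> lastCompletionMs[0], minPauseMs));      // false
    }
}
```

Passing a supplier instead of a plain {{long}} is only one way to defer the read; the point of the sketch is when the value is read, not how the real fix is structured.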