[ https://issues.apache.org/jira/browse/FLINK-19683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Arvid Heise updated FLINK-19683:
--------------------------------
Fix Version/s: 1.13.0
> Actively timeout aligned checkpoints on the output
> --------------------------------------------------
>
> Key: FLINK-19683
> URL: https://issues.apache.org/jira/browse/FLINK-19683
> Project: Flink
> Issue Type: Sub-task
> Components: Runtime / Checkpointing, Runtime / Task
> Affects Versions: 1.12.0
> Reporter: Piotr Nowojski
> Priority: Major
> Fix For: 1.13.0
>
>
> After enqueuing an aligned checkpoint barrier on the output, we could register
> a timeout to check whether it was sent downstream within some threshold. If
> not, we can convert it to an unaligned checkpoint.
> Note that this will significantly complicate how the actual checkpoint is
> executed. Currently, the logic inside `AsyncCheckpointRunnable` is executed as
> soon as the checkpoint is triggered. With the timeout on the outputs, we
> cannot complete the `AsyncCheckpointRunnable` until we know whether the
> timeout happened. We would need to register some listener/CompletableFuture
> tracking whether all of the checkpoint barriers were sent downstream; the
> aligned checkpoint can only complete if those futures complete before the
> timeout. Otherwise, if the timeout fires, we would need to convert the
> aligned checkpoint on the outputs to unaligned.
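
The race described in the issue (all barriers sent vs. timeout) can be
sketched in plain Java, outside of Flink. This is only an illustration under
stated assumptions: the class `AlignedCheckpointTimeoutSketch`, the method
`awaitBarriersOrTimeout`, the `Mode` enum and the `convertToUnaligned`
callback are hypothetical names, not Flink APIs, and a real change would have
to hook into `AsyncCheckpointRunnable` and the task outputs.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch, not Flink code: races "all barriers sent" against a timeout.
public class AlignedCheckpointTimeoutSketch {

    /** Outcome of the race between barrier delivery and the timeout. */
    public enum Mode { ALIGNED, UNALIGNED }

    /**
     * @param barrierSent        one future per output, completed once the barrier left that output
     * @param timeoutMillis      threshold after which the checkpoint is converted
     * @param timer              executor used to schedule the timeout
     * @param convertToUnaligned hypothetical hook that switches the outputs to unaligned
     * @return a future telling the caller in which mode the checkpoint may complete
     */
    public static CompletableFuture<Mode> awaitBarriersOrTimeout(
            List<CompletableFuture<Void>> barrierSent,
            long timeoutMillis,
            ScheduledExecutorService timer,
            Runnable convertToUnaligned) {

        CompletableFuture<Mode> result = new CompletableFuture<>();
        // Guarantees that exactly one side of the race decides the outcome.
        AtomicBoolean decided = new AtomicBoolean(false);

        // Timeout side: if the barriers are still queued, convert to unaligned.
        ScheduledFuture<?> timeout = timer.schedule(() -> {
            if (decided.compareAndSet(false, true)) {
                convertToUnaligned.run();
                result.complete(Mode.UNALIGNED);
            }
        }, timeoutMillis, TimeUnit.MILLISECONDS);

        // Barrier side: only runs if every future completed normally.
        // A failed future leaves the decision to the timeout in this sketch.
        CompletableFuture.allOf(barrierSent.toArray(new CompletableFuture[0]))
            .thenRun(() -> {
                if (decided.compareAndSet(false, true)) {
                    timeout.cancel(false);
                    result.complete(Mode.ALIGNED);
                }
            });

        return result;
    }

    public static void main(String[] args) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        // This barrier is never marked as sent, so the timeout side wins.
        CompletableFuture<Void> barrier = new CompletableFuture<>();
        Mode mode = awaitBarriersOrTimeout(
                Collections.singletonList(barrier), 100, timer,
                () -> System.out.println("converting outputs to unaligned")).join();
        System.out.println("checkpoint mode: " + mode);
        timer.shutdown();
    }
}
```

The returned future stands in for the point at which the asynchronous part of
the checkpoint would be allowed to finish: it cannot complete before the race
is decided, which is the complication the issue points out.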