[ https://issues.apache.org/jira/browse/FLINK-6315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15975203#comment-15975203 ]
Stephan Ewen commented on FLINK-6315:
-------------------------------------
I think you are thinking about it the right way. When checkpoint 2 does not
happen, for whatever reason, checkpoint 3 should be in charge of everything
since the last successful checkpoint.
I see the problem now: When checkpoint 3 starts, you may not yet know whether
checkpoint 2 is actually going to complete. To make it trickier, checkpoint 2
may actually fail (due to a timeout) after checkpoint 3 completes.
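As a rough illustration of "in charge of everything since the last successful
checkpoint", here is a minimal sketch. It is plain Java with no Flink
dependency, and the class PendingCommits and its methods are hypothetical names
made up for this example, not existing Flink APIs. Data is staged per
checkpoint ID, and a completion notification commits that checkpoint together
with every older one that is still pending.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical bookkeeping helper for illustration; not part of Flink. */
class PendingCommits {
    // Data staged at each checkpoint, keyed by checkpoint ID in ascending order.
    private final TreeMap<Long, List<String>> pending = new TreeMap<>();
    private final List<String> committed = new ArrayList<>();

    /** Stage the data written since the previous snapshot under this checkpoint ID. */
    void snapshot(long checkpointId, List<String> staged) {
        pending.put(checkpointId, new ArrayList<>(staged));
    }

    /** Commit this checkpoint and every older checkpoint that is still pending. */
    void notifyCheckpointComplete(long checkpointId) {
        // headMap(id, true) is a live view of all entries with key <= id.
        NavigableMap<Long, List<String>> upTo = pending.headMap(checkpointId, true);
        for (List<String> staged : upTo.values()) {
            committed.addAll(staged);
        }
        upTo.clear(); // also removes the entries from 'pending'
    }

    List<String> committed() { return committed; }

    int pendingCheckpoints() { return pending.size(); }
}
```

Under this scheme, if checkpoint 2 never completes, the completion of
checkpoint 3 also commits what was staged for checkpoint 2. A late timeout of
checkpoint 2 then finds nothing left pending for it. Whether that is acceptable
depends on the external system.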
In the incremental checkpointing code, we have a similar problem. In that case,
we can only re-reference a diff if it is part of a completed checkpoint. If for
example checkpoint 2 is not complete when checkpoint 3 is started, then
checkpoint 3 builds on checkpoint 1, not on checkpoint 2.
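The rule above can be sketched as follows. This is only an illustration of the
idea, not the actual incremental checkpointing code; IncrementalBaseTracker and
its methods are hypothetical names, and -1 is an assumed marker for "no
completed checkpoint yet".

```java
import java.util.TreeSet;

/** Hypothetical tracker for illustration; not Flink's actual implementation. */
class IncrementalBaseTracker {
    // IDs of checkpoints that have been confirmed as completed.
    private final TreeSet<Long> completed = new TreeSet<>();

    /** Record that a checkpoint has completed and may now be referenced. */
    void markCompleted(long checkpointId) {
        completed.add(checkpointId);
    }

    /**
     * Pick the checkpoint a new incremental checkpoint may build on: the latest
     * completed checkpoint with a smaller ID, or -1 if there is none.
     */
    long baseFor(long newCheckpointId) {
        Long base = completed.lower(newCheckpointId);
        return base == null ? -1L : base;
    }
}
```

With only checkpoint 1 marked as completed, baseFor(3) returns 1, even if
checkpoint 2 has been started. Once checkpoint 2 is marked as completed, later
checkpoints can build on it.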
[~aljoscha] How is that handled in the regular bucketing sink?
> Notify on checkpoint timeout
> -----------------------------
>
> Key: FLINK-6315
> URL: https://issues.apache.org/jira/browse/FLINK-6315
> Project: Flink
> Issue Type: New Feature
> Components: Core
> Reporter: Seth Wiesman
> Assignee: Seth Wiesman
>
> A common use case when writing a custom operator that outputs data to some
> third-party location is to partially output on checkpoint and then commit on
> notifyCheckpointComplete. If that external system does not gracefully handle
> rollbacks (such as Amazon S3 not allowing consistent delete operations), then
> that data needs to be handled by the next checkpoint.
> The idea is to add a new interface similar to CheckpointListener that
> provides a callback when the CheckpointCoordinator times out a checkpoint:
> {code:java}
> /**
>  * This interface must be implemented by functions/operations that want to
>  * receive a notification if a checkpoint has been timed out by the
>  * {@link org.apache.flink.runtime.checkpoint.CheckpointCoordinator}.
>  */
> public interface CheckpointTimeoutListener {
>     /**
>      * This method is called as a notification if a distributed checkpoint
>      * has been timed out.
>      *
>      * @param checkpointId The ID of the checkpoint that has been timed out.
>      * @throws Exception
>      */
>     void notifyCheckpointTimeout(long checkpointId) throws Exception;
> }
> {code}