GitHub user tillrohrmann opened a pull request:

    https://github.com/apache/flink/pull/3965

    [FLINK-6328] [chkPts] Don't add savepoints to CompletedCheckpointStore

    The lifecycle of savepoints is not managed by the CheckpointCoordinator; it is
    entirely in the hands of the user. Therefore, the CheckpointCoordinator cannot
    rely on savepoints when trying to recover from failures. For example, a user
    moving a savepoint shortly before a failure could completely break Flink's
    recovery mechanism, because Flink cannot skip failed checkpoints when recovering.
    
    Therefore, until Flink is able to skip failed checkpoints when recovering, we
    should not add savepoints to the CompletedCheckpointStore, which is used to
    retrieve checkpoints for recovery. Savepoints are identified on the basis of
    their CheckpointProperties (CheckpointProperties.STANDARD_SAVEPOINT).
    
    cc @rmetzger 
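
For readers unfamiliar with the change, the idea can be illustrated with a minimal, self-contained sketch. It is not the code in this pull request: the types `Kind`, `Completed`, and `Store` and the method `onCompleted` are hypothetical stand-ins for CheckpointProperties, a completed checkpoint, and the CompletedCheckpointStore. The sketch only shows the rule described above, namely that a completed savepoint is not registered in the store used for recovery.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class SavepointFilterSketch {

    /** Hypothetical stand-in for the checkpoint properties; not Flink's class. */
    enum Kind { STANDARD_CHECKPOINT, STANDARD_SAVEPOINT }

    /** Hypothetical stand-in for a completed checkpoint or savepoint. */
    static final class Completed {
        final long id;
        final Kind kind;

        Completed(long id, Kind kind) {
            this.id = id;
            this.kind = kind;
        }

        boolean isSavepoint() {
            return kind == Kind.STANDARD_SAVEPOINT;
        }
    }

    /** Hypothetical stand-in for the store that recovery reads from. */
    static final class Store {
        private final Deque<Completed> checkpoints = new ArrayDeque<>();

        void add(Completed completed) {
            checkpoints.addLast(completed);
        }

        /** Returns the most recently added entry, or null if the store is empty. */
        Completed latest() {
            return checkpoints.peekLast();
        }

        int size() {
            return checkpoints.size();
        }
    }

    /**
     * Registers a completed checkpoint for recovery unless it is a savepoint.
     * Returns true if the entry was added to the store.
     */
    static boolean onCompleted(Store store, Completed completed) {
        if (completed.isSavepoint()) {
            // The user owns the savepoint and may move or delete it at any time,
            // so recovery must not depend on it.
            return false;
        }
        store.add(completed);
        return true;
    }

    public static void main(String[] args) {
        Store store = new Store();
        onCompleted(store, new Completed(1L, Kind.STANDARD_CHECKPOINT));
        onCompleted(store, new Completed(2L, Kind.STANDARD_SAVEPOINT));
        // The savepoint with id 2 was skipped, so recovery would use checkpoint 1.
        System.out.println("recover from checkpoint " + store.latest().id);
    }
}
```

In this sketch the savepoint still completes normally; it is only kept out of the set of candidates that recovery considers, so moving or deleting it later cannot affect recovery.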

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tillrohrmann/flink fixSavepointHandling

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/3965.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3965
    
----
commit 9c069ad80d66f03a0f90c8ba1a780cbba111896e
Author: Till Rohrmann <[email protected]>
Date:   2017-05-22T15:41:14Z

    [FLINK-6328] [chkPts] Don't add savepoints to CompletedCheckpointStore
    

----

