GitHub user uce opened a pull request:

    https://github.com/apache/flink/pull/2608

    [FLINK-4512] [FLIP-10] Add option to persist periodic checkpoints

    ## Introduction
    
    This is the first part of
    [FLIP-10](https://cwiki.apache.org/confluence/display/FLINK/FLIP-10%3A+Unify+Checkpoints+and+Savepoints),
    allowing users to persist periodic checkpoints.
    
    Persistent checkpoints behave like regular periodic checkpoints, with the
    following differences:
    
    1. They persist their metadata (like savepoints).
    2. They are not discarded when the owning job fails permanently.
       They can also be configured to be retained when the job is cancelled.
    
    This means that if a job fails permanently, the user still has a checkpoint
    available to restore from. Consider the following scenario: a job runs
    smoothly until it hits a bad record that it cannot handle. Currently, the
    job tries to recover, hits the bad record again, and keeps failing. With
    persistent checkpoints, the user can update the program to handle bad
    records and restore from the most recent persistent checkpoint.
    
    ## CheckpointConfig
    
    This adds the following `@PublicEvolving` methods to `CheckpointConfig`:
    
    ```
    enablePersistentCheckpoints(String targetDirectory);
    enablePersistentCheckpoints(String targetDirectory, PersistentCheckpointCleanup cleanup);
    ```
    
    The `PersistentCheckpointCleanup` enum defines how persistent checkpoints
    are cleaned up when the owning job is cancelled. Since most streaming jobs
    are currently stopped via cancellation, the default is to clean up
    persistent checkpoints on cancellation. The user can override this
    behaviour via the enum.
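    
    As a rough illustration of the retention rules described above (this is
    not the code in this pull request), the decision can be modelled as shown
    below. The constant names `DELETE_ON_CANCELLATION` and
    `RETAIN_ON_CANCELLATION`, the `JobOutcome` enum, and `shouldDiscard` are
    hypothetical; the description does not list the actual enum constants.
    
    ```java
    // Hypothetical model of the retention rules in this description; not Flink code.
    final class PersistentCheckpointRetentionSketch {

        // The two terminal outcomes the description talks about (names are assumptions).
        enum JobOutcome { CANCELLED, FAILED }

        // Stand-in for PersistentCheckpointCleanup; constant names are assumptions.
        enum Cleanup { DELETE_ON_CANCELLATION, RETAIN_ON_CANCELLATION }

        // The description states that cleaning up on cancellation is the default.
        static final Cleanup DEFAULT_CLEANUP = Cleanup.DELETE_ON_CANCELLATION;

        // Returns true if the persistent checkpoint should be discarded.
        static boolean shouldDiscard(JobOutcome outcome, Cleanup cleanup) {
            if (outcome == JobOutcome.FAILED) {
                // Persistent checkpoints survive a permanent job failure.
                return false;
            }
            // On cancellation the configured cleanup mode decides.
            return cleanup == Cleanup.DELETE_ON_CANCELLATION;
        }

        public static void main(String[] args) {
            for (JobOutcome outcome : JobOutcome.values()) {
                for (Cleanup cleanup : Cleanup.values()) {
                    System.out.println(outcome + " with " + cleanup
                            + " -> discard=" + shouldDiscard(outcome, cleanup));
                }
            }
        }
    }
    ```
    
    Under this model, only a cancelled job with the default cleanup mode loses
    its persistent checkpoint; every other combination keeps it.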
    
    ## REST API
    
    The REST API exposes the external-path of the most recent persistent
    checkpoint. This path is also displayed in the web UI.
    
    ![screen shot 2016-10-07 at 17 50 44](https://cloud.githubusercontent.com/assets/1756620/19196699/d0d5065a-8cb6-11e6-8b13-c6bacc4ebe19.png)
    
    ## Deprecate savepoint state backends (FLINK-4507)
    
    The savepoint state backends have been removed, and all savepoints are now
    written to files. The corresponding configuration keys have been removed
    or deprecated:
    
    `savepoints.state.backend.fs.dir` has been deprecated in favour of
    `state.savepoints.dir`. `savepoints.state.backend` has been removed.
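    
    One common way to honour a deprecated key is to read the new key first and
    fall back to the old one. The sketch below shows that pattern for the two
    keys named above. Whether Flink resolves the keys in exactly this order is
    an assumption, and `savepointDir` and the example path are hypothetical.
    
    ```java
    import java.util.HashMap;
    import java.util.Map;
    import java.util.Optional;

    // Hypothetical lookup helper; not Flink's configuration code.
    final class SavepointDirConfigSketch {

        static final String KEY = "state.savepoints.dir";
        static final String DEPRECATED_KEY = "savepoints.state.backend.fs.dir";

        // Prefers the new key and falls back to the deprecated one (assumed order).
        static Optional<String> savepointDir(Map<String, String> config) {
            String value = config.get(KEY);
            if (value == null) {
                value = config.get(DEPRECATED_KEY);
            }
            return Optional.ofNullable(value);
        }

        public static void main(String[] args) {
            Map<String, String> legacyConfig = new HashMap<>();
            // Example path only; any file system URI would do.
            legacyConfig.put(DEPRECATED_KEY, "hdfs:///flink/savepoints");
            System.out.println(savepointDir(legacyConfig).orElse("<not configured>"));
        }
    }
    ```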
    
    ## Allow to specify custom savepoint directory (FLINK-4509)
    
    Previously, the target directory for savepoints could only be set in the
    Flink configuration. With this change, it can be overridden per savepoint:
    
    ```
    bin/flink savepoint <jobId> [targetDirectory]
    ```
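    
    The precedence implied by the optional `[targetDirectory]` argument can be
    sketched as follows. This is an illustration only: `resolveTarget` is a
    hypothetical helper, the paths are made up, and failing when neither value
    is set is an assumption rather than behaviour stated in this description.
    
    ```java
    // Hypothetical precedence model for the savepoint target directory; not Flink code.
    final class SavepointTargetSketch {

        // A directory given on the command line wins over the configured default.
        static String resolveTarget(String cliTargetDirectory, String configuredDefault) {
            if (cliTargetDirectory != null && !cliTargetDirectory.isEmpty()) {
                return cliTargetDirectory;
            }
            if (configuredDefault != null && !configuredDefault.isEmpty()) {
                return configuredDefault;
            }
            // Assumption: without any directory the savepoint cannot be triggered.
            throw new IllegalStateException(
                    "No target directory given and no default savepoint directory configured");
        }

        public static void main(String[] args) {
            String configured = "hdfs:///flink/savepoints"; // example default
            System.out.println(resolveTarget("hdfs:///backups/job-a", configured));
            System.out.println(resolveTarget(null, configured));
        }
    }
    ```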


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/uce/flink 4512-persistent_checkpoints

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/2608.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2608
    
----
commit 004ba0b38ac2b75148910660242808c13746c444
Author: Ufuk Celebi <[email protected]>
Date:   2016-10-06T14:43:42Z

    [FLINK-4512] [FLIP-10] Add option to persist periodic checkpoints
    
    [FLINK-4509] [FLIP-10] Specify savepoint directory per savepoint
    [FLINK-4507] [FLIP-10] Deprecate savepoint backend config

----


