[ 
https://issues.apache.org/jira/browse/FLINK-5763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15871874#comment-15871874
 ] 

ASF GitHub Bot commented on FLINK-5763:
---------------------------------------

GitHub user uce opened a pull request:

    https://github.com/apache/flink/pull/3345

    [FLINK-5763] [checkpoints] Minor meta data refactorings

    This PR groups some refactorings I did as part of FLINK-5763 into a
    separate PR.
    
    1) Acknowledge with explicit ID and `CheckpointMetrics`
      Instead of acknowledging checkpoints with `CheckpointMetaData`, I reduce
      this to what's needed: the checkpoint ID and the `CheckpointMetrics`.
    
    2) Move `CheckpointMetrics` out of the `CheckpointMetaData`
      `CheckpointMetaData` was overloaded with the metrics, which are actually
      only needed for the acknowledgement. `CheckpointMetaData` should, in my
      opinion, contain only fixed properties that are relevant for the
      checkpoint.
    
    3) Make `CheckpointProperties` isSavepoint check non-static
      There was no good reason for this check to be a static helper method. We
      should simply call `props.isSavepoint()`.
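
A minimal sketch of how the three refactorings above fit together. The classes below are simplified stand-ins with the same names as in the description, not Flink's actual classes; the fields `timestamp`, `bytesBuffered` and `alignmentDurationNanos`, the `AckReceiver` interface and the `acknowledge` helper are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

public class AckSketch {

    /** Fixed properties of a checkpoint. Metrics no longer live here (point 2). */
    static final class CheckpointMetaData {
        final long checkpointId;
        final long timestamp; // hypothetical fixed property

        CheckpointMetaData(long checkpointId, long timestamp) {
            this.checkpointId = checkpointId;
            this.timestamp = timestamp;
        }
    }

    /** Measurements that are only needed for the acknowledgement (hypothetical fields). */
    static final class CheckpointMetrics {
        final long bytesBuffered;
        final long alignmentDurationNanos;

        CheckpointMetrics(long bytesBuffered, long alignmentDurationNanos) {
            this.bytesBuffered = bytesBuffered;
            this.alignmentDurationNanos = alignmentDurationNanos;
        }
    }

    /** The savepoint check is an instance method instead of a static helper (point 3). */
    static final class CheckpointProperties {
        private final boolean savepoint;

        CheckpointProperties(boolean savepoint) {
            this.savepoint = savepoint;
        }

        boolean isSavepoint() {
            return savepoint;
        }
    }

    /** Hypothetical receiver: acknowledges by explicit ID and metrics (point 1). */
    interface AckReceiver {
        void acknowledgeCheckpoint(long checkpointId, CheckpointMetrics metrics);
    }

    /** Test double that records which checkpoint IDs were acknowledged. */
    static final class RecordingReceiver implements AckReceiver {
        final List<Long> acknowledgedIds = new ArrayList<>();

        @Override
        public void acknowledgeCheckpoint(long checkpointId, CheckpointMetrics metrics) {
            acknowledgedIds.add(checkpointId);
        }
    }

    /** Passes on only what the acknowledgement needs, not the whole meta data object. */
    static void acknowledge(AckReceiver receiver, CheckpointMetaData metaData, CheckpointMetrics metrics) {
        receiver.acknowledgeCheckpoint(metaData.checkpointId, metrics);
    }

    public static void main(String[] args) {
        RecordingReceiver receiver = new RecordingReceiver();
        acknowledge(receiver, new CheckpointMetaData(7L, 0L), new CheckpointMetrics(0L, 0L));
        CheckpointProperties props = new CheckpointProperties(true);
        System.out.println(receiver.acknowledgedIds + " savepoint=" + props.isSavepoint());
    }
}
```

Keeping the metrics in their own object means the meta data can stay immutable, while the metrics are filled in only at acknowledgement time.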


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/uce/flink selfcontained_refactorings

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/3345.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3345
    
----
commit 64e1f8892597dd26489774fe494fd7b211d789a9
Author: Ufuk Celebi <[email protected]>
Date:   2017-02-15T16:52:40Z

    [FLINK-5763] [checkpoints] Acknowledge with explicit ID and CheckpointMetrics
    
    Instead of acknowledging checkpoints with the CheckpointMetaData make
    the acknowledgement explicit by ID and CheckpointMetrics. The rest is
    not needed.

commit 0b8fa02a150ae6d5891e04d1fce6de748a283aaf
Author: Ufuk Celebi <[email protected]>
Date:   2017-02-15T17:16:44Z

    [FLINK-5763] [checkpoints] Move CheckpointMetrics out of CheckpointMetaData

commit d023159d0fef1da24373d7880e11d0a36afbd7ef
Author: Ufuk Celebi <[email protected]>
Date:   2017-02-16T15:52:32Z

    [FLINK-5763] [checkpoints] Add isSavepoint() to CheckpointProperties

----


> Make savepoints self-contained and relocatable
> ----------------------------------------------
>
>                 Key: FLINK-5763
>                 URL: https://issues.apache.org/jira/browse/FLINK-5763
>             Project: Flink
>          Issue Type: Improvement
>          Components: State Backends, Checkpointing
>            Reporter: Ufuk Celebi
>            Assignee: Ufuk Celebi
>
> After a user has triggered a savepoint, a single savepoint file will be 
> returned as a handle to the savepoint. A savepoint to {{<target>}} creates a 
> savepoint file like {{<target>/savepoint-<randomSuffix>}}.
> This file contains the metadata of the corresponding checkpoint, but not the 
> actual program state. While this works well for short-term management 
> (pausing and resuming a job), it makes it hard to manage savepoints over 
> longer periods of time.
> h4. Problems
> h5. Scattered Checkpoint Files
> For file system based checkpoints (FsStateBackend, RocksDBStateBackend), the 
> savepoint references files from the checkpoint directory (usually different 
> from {{<target>}}). For users, it is virtually impossible to 
> tell which checkpoint files belong to a savepoint and which are lingering 
> around. This can easily lead to accidentally invalidating a savepoint by 
> deleting checkpoint files.
> h5. Savepoints Not Relocatable
> Even if a user is able to figure out which checkpoint files belong to a 
> savepoint, moving these files will invalidate the savepoint as well, because 
> the metadata file references absolute file paths.
> h5. Forced to Use CLI for Disposal
> Because of the scattered files, the user is in practice forced to use Flink’s 
> CLI to dispose of a savepoint. Disposal should instead be possible within 
> the user’s environment via a plain file system delete operation.
> h4. Proposal
> In order to solve the described problems, savepoints should contain all their 
> state, both metadata and program state, inside a single directory. 
> Furthermore, the metadata must hold only relative references to the 
> checkpoint files. This makes it obvious which files make up the state of a 
> savepoint and makes it possible to move savepoints by moving the savepoint 
> directory.
> h5. Desired File Layout
> Triggering a savepoint to {{<target>}} creates a directory as follows:
> {code}
> <target>/savepoint-<jobId>-<randomSuffix>
>   +-- _metadata
>   +-- data-<randomSuffix> [1 or more]
> {code}
> We include the JobID in the savepoint directory name in order to give some 
> hints about which job a savepoint belongs to.
> h5. CLI
> - Trigger: When triggering a savepoint to {{<target>}} the savepoint 
> directory will be returned as the handle to the savepoint.
> - Restore: Users can restore by pointing to the directory or the _metadata 
> file. The data files should be required to be in the same directory as the 
> _metadata file.
> - Dispose: The disposal command should be deprecated and eventually removed. 
> While deprecated, disposal can happen by specifying the directory or the 
> _metadata file (same as restore).
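
The restore rule and the relative references from the proposal quoted above can be sketched as follows. This is an illustration under stated assumptions, not the implementation from the issue or the PR: the class `SavepointPathSketch` and its methods are hypothetical, and the references are assumed to be plain relative file names stored in `_metadata`. It only manipulates paths and never touches the file system.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SavepointPathSketch {

    static final String METADATA_FILE_NAME = "_metadata";

    /** Accepts either the savepoint directory or its _metadata file and returns the directory. */
    static Path resolveSavepointDirectory(Path handle) {
        Path fileName = handle.getFileName();
        if (fileName != null && METADATA_FILE_NAME.equals(fileName.toString())) {
            Path parent = handle.getParent();
            if (parent == null) {
                throw new IllegalArgumentException("No parent directory for " + handle);
            }
            return parent;
        }
        return handle;
    }

    /**
     * Resolves relative data file references against the savepoint directory.
     * References that would leave the directory (absolute paths, "..") are rejected,
     * because the data files are required to sit next to the _metadata file.
     */
    static List<Path> resolveDataFiles(Path handle, List<String> relativeReferences) {
        Path directory = resolveSavepointDirectory(handle).normalize();
        List<Path> resolved = new ArrayList<>();
        for (String reference : relativeReferences) {
            Path candidate = directory.resolve(reference).normalize();
            if (!directory.equals(candidate.getParent())) {
                throw new IllegalArgumentException("Not inside savepoint directory: " + reference);
            }
            resolved.add(candidate);
        }
        return resolved;
    }

    public static void main(String[] args) {
        // Hypothetical directory and data file names following the desired layout.
        Path metadata = Paths.get("/target/savepoint-job-abc/_metadata");
        System.out.println(resolveDataFiles(metadata, Arrays.asList("data-1", "data-2")));
    }
}
```

Because only file names are stored, the same references resolve correctly after the whole directory has been moved, which is what makes the savepoint relocatable.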



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
