[
https://issues.apache.org/jira/browse/FLINK-24149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17410904#comment-17410904
]
Feifan Wang commented on FLINK-24149:
-------------------------------------
[~yunta], thanks for the reminder. This feature does invalidate subsequent
checkpoints after restoring from an incremental checkpoint. The key problem is
that an incremental checkpoint reuses the shared files of the previous job
instance. I think there are several ways to solve this:
# Copy the shared files to the new job's shared directory when restoring from
an incremental checkpoint.
# Make the first checkpoint of the new job (job ID changed, not a full restart)
degrade into a full checkpoint. Specifically, we can use an empty
materializedSstFiles when the new job initializes
RocksIncrementalSnapshotStrategy.
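
To illustrate the second option, here is a minimal sketch. It is a simplified model, not Flink's actual RocksIncrementalSnapshotStrategy: the class IncrementalSnapshotSketch, its constructor flag jobIdChanged, and the use of plain file names are all assumptions made for illustration. It only shows that starting from an empty materializedSstFiles forces the first checkpoint of the new job to upload every SST file, while later checkpoints become incremental again.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.SortedMap;
import java.util.TreeMap;

/** Simplified model for illustration only; NOT Flink's RocksIncrementalSnapshotStrategy. */
class IncrementalSnapshotSketch {

    // checkpoint ID -> SST files already uploaded as of that checkpoint
    private final SortedMap<Long, Set<String>> materializedSstFiles;
    private long lastCompletedCheckpointId;

    /** jobIdChanged is a hypothetical flag: true when restoring under a new job ID. */
    IncrementalSnapshotSketch(
            SortedMap<Long, Set<String>> restoredSstFiles,
            long restoredCheckpointId,
            boolean jobIdChanged) {
        if (jobIdChanged) {
            // Option 2: forget the previous job's files, so nothing can be reused.
            this.materializedSstFiles = new TreeMap<>();
            this.lastCompletedCheckpointId = -1L;
        } else {
            this.materializedSstFiles = new TreeMap<>(restoredSstFiles);
            this.lastCompletedCheckpointId = restoredCheckpointId;
        }
    }

    /**
     * Returns the files that must be uploaded for this checkpoint. Files already
     * known from the last completed checkpoint are only referenced, not uploaded.
     * For simplicity, every checkpoint is assumed to complete immediately.
     */
    Set<String> snapshot(long checkpointId, Set<String> currentSstFiles) {
        Set<String> base =
                materializedSstFiles.getOrDefault(
                        lastCompletedCheckpointId, Collections.<String>emptySet());
        Set<String> toUpload = new HashSet<>(currentSstFiles);
        toUpload.removeAll(base);
        materializedSstFiles.put(checkpointId, new HashSet<>(currentSstFiles));
        lastCompletedCheckpointId = checkpointId;
        return toUpload;
    }

    public static void main(String[] args) {
        SortedMap<Long, Set<String>> restored = new TreeMap<>();
        restored.put(5L, new HashSet<>(Arrays.asList("a.sst", "b.sst")));
        Set<String> current = new HashSet<>(Arrays.asList("a.sst", "b.sst", "c.sst"));

        IncrementalSnapshotSketch sameJob = new IncrementalSnapshotSketch(restored, 5L, false);
        System.out.println("same job uploads: " + sameJob.snapshot(6L, current));

        IncrementalSnapshotSketch newJob = new IncrementalSnapshotSketch(restored, 5L, true);
        System.out.println("new job uploads:  " + newJob.snapshot(6L, current));
    }
}
```

In this model the same-job restore uploads only the new file, while the new-job restore uploads all three files once and therefore no longer depends on the previous job's shared directory. The trade-off against the first option is one expensive full checkpoint instead of a file copy at restore time.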
> Make checkpoint relocatable
> ---------------------------
>
> Key: FLINK-24149
> URL: https://issues.apache.org/jira/browse/FLINK-24149
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Checkpointing
> Reporter: Feifan Wang
> Priority: Major
> Labels: pull-request-available
>
> h3. Background
> FLINK-5763 proposed making savepoints relocatable; checkpoints have similar
> requirements. For example, jobs can be migrated to another HDFS cluster
> through a savepoint, but we prefer to use persistent checkpoints, especially
> because RocksDBStateBackend incremental checkpoints perform better than
> savepoints during snapshot and restore.
>
> FLINK-8531 standardized the directory layout:
> {code:java}
> /user-defined-checkpoint-dir
> |
> + 1b080b6e710aabbef8993ab18c6de98b (job's ID)
> |
> + --shared/
> + --taskowned/
> + --chk-00001/
> + --chk-00002/
> + --chk-00003/
> ...
> {code}
> * State backend will create a subdirectory with the job's ID that will
> contain the actual checkpoints, such as:
> user-defined-checkpoint-dir/1b080b6e710aabbef8993ab18c6de98b/
> * Each checkpoint individually will store all its files in a subdirectory
> that includes the checkpoint number, such as:
> user-defined-checkpoint-dir/1b080b6e710aabbef8993ab18c6de98b/chk-00003/
> * Files shared between checkpoints will be stored in the shared/ directory
> in the same parent directory as the separate checkpoint directory, such as:
> user-defined-checkpoint-dir/1b080b6e710aabbef8993ab18c6de98b/shared/
> * Similar to shared files, files owned strictly by tasks will be stored in
> the taskowned/ directory in the same parent directory as the separate
> checkpoint directory, such as:
> user-defined-checkpoint-dir/1b080b6e710aabbef8993ab18c6de98b/taskowned/
> h3. Proposal
> Since an individual checkpoint directory does not contain the complete state
> data, we cannot make it relocatable, but its parent directory can be. The only
> work left is to make the metadata file reference relative file paths.
> I propose making these changes to _*FsCheckpointStateOutputStream*_ :
> * introduce a _*checkpointDirectory*_ field, and remove the
> *_allowRelativePaths_* field
> * introduce an *_entropyInjecting_* field
> * *_closeAndGetHandle()_* returns a _*RelativeFileStateHandle*_ with a relative
> path based on _*checkpointDirectory*_ (except for entropy-injecting file
> systems)
> [~yunta], [~trohrmann], I verified this in our environment and submitted a
> pull request to implement this feature. Please help evaluate whether it is
> appropriate.
>
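
The relative-path idea in the quoted proposal can be sketched as follows. This is not the code from the pull request: the class RelocatableCheckpointSketch, its two helper methods, and the example HDFS URIs are hypothetical, and the sketch assumes that checkpointDirectory means the job-level directory that contains shared/, taskowned/ and the chk-N/ directories. It uses only java.net.URI to show that a reference stored relative to that directory still resolves after the directory is moved to another cluster.

```java
import java.net.URI;

/** Illustration only; a hypothetical stand-in, not Flink's RelativeFileStateHandle. */
class RelocatableCheckpointSketch {

    /** Turns an absolute state file URI into a path relative to the checkpoint directory. */
    static String toRelative(URI checkpointDirectory, URI stateFile) {
        // URI.relativize returns the input unchanged if it is not under the base.
        URI relative = checkpointDirectory.relativize(stateFile);
        if (relative.isAbsolute()) {
            throw new IllegalArgumentException(
                    stateFile + " is not located under " + checkpointDirectory);
        }
        return relative.toString();
    }

    /** Resolves a stored relative path against the (possibly relocated) directory. */
    static URI resolve(URI checkpointDirectory, String relativePath) {
        return checkpointDirectory.resolve(relativePath);
    }

    public static void main(String[] args) {
        // The trailing slash marks the URI as a directory for relativize/resolve.
        URI oldDir = URI.create("hdfs://cluster-a/checkpoints/1b080b6e710aabbef8993ab18c6de98b/");
        URI newDir = URI.create("hdfs://cluster-b/migrated/1b080b6e710aabbef8993ab18c6de98b/");

        URI sharedFile = oldDir.resolve("shared/example-file");
        String stored = toRelative(oldDir, sharedFile);

        System.out.println("stored in metadata: " + stored);
        System.out.println("after relocation:   " + resolve(newDir, stored));
    }
}
```

With entropy injection the physical file path no longer shares a common prefix with the checkpoint directory, which is presumably why the proposal excludes entropy-injecting file systems; in this sketch such a file would hit the IllegalArgumentException branch.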