[
https://issues.apache.org/jira/browse/FLINK-27114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17783164#comment-17783164
]
Zakelly Lan commented on FLINK-27114:
-------------------------------------
Hi [~roman], I'm revisiting this ticket and FLINK-27132. IIUC, these two problems
happen when a job restores in LEGACY mode and restarts afterwards. We lose the
information about the initial checkpoint (because it is subsumed), so it is not
possible to distinguish the initial state from the state created by this job. Is
that right? So the proposed solution is to keep the information in
StateHandleStore. Please correct me if I'm wrong.
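
To make the question concrete, here is a minimal sketch of the decision that the persisted information would enable. It is not Flink code: the class name, `mayDiscard`, and the idea that each shared state artifact records the ID of the checkpoint that created it are assumptions made for illustration, as is the premise that checkpoint IDs keep increasing after a restore.

```java
import java.util.OptionalLong;

final class InitialCheckpointSketch {

    /**
     * Decides whether an unused shared state artifact may be deleted by a job
     * that was restored in LEGACY mode.
     *
     * @param createdByCheckpointId ID of the checkpoint that first wrote the artifact
     *                              (assumed to be tracked per artifact)
     * @param initialCheckpointId   hypothetical persisted ID of the checkpoint the job was
     *                              restored from; empty if that information was lost
     */
    static boolean mayDiscard(long createdByCheckpointId, OptionalLong initialCheckpointId) {
        if (!initialCheckpointId.isPresent()) {
            // Initial state cannot be told apart from the job's own state,
            // so everything is kept (the left-over artifact case).
            return false;
        }
        // Only state written after the restore point belongs to this job.
        return createdByCheckpointId > initialCheckpointId.getAsLong();
    }

    public static void main(String[] args) {
        OptionalLong persisted = OptionalLong.of(1L);
        OptionalLong lost = OptionalLong.empty();
        System.out.println(mayDiscard(1L, persisted)); // state of the initial checkpoint
        System.out.println(mayDiscard(2L, persisted)); // state created by this job
        System.out.println(mayDiscard(2L, lost));      // information lost after restart
    }
}
```

With the ID available, state from the initial checkpoint is kept and state created by this job can be cleaned up; without it, the only safe choice is to keep both.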
> On JM restart, the information about the initial checkpoints can be lost
> ------------------------------------------------------------------------
>
> Key: FLINK-27114
> URL: https://issues.apache.org/jira/browse/FLINK-27114
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Checkpointing
> Affects Versions: 1.14.4, 1.15.0, 1.16.0
> Reporter: Roman Khachatryan
> Priority: Major
>
> Scenario (1.14):
> # A job starts from an existing checkpoint 1, with incremental checkpoints
> enabled
> # Checkpoint 1 is loaded with discardOnSubsume=false by
> CheckpointCoordinator.restoreSavepoint
> # A new checkpoint 2 completes; it reuses some state from the initial
> checkpoint
> # At some point, checkpoint 1 is subsumed, but the state is not discarded
> (thanks to discardOnSubsume=false, ref counts stay 1)
> # JM crashes
> # JM restarts and loads the checkpoints 2..x from ZK (or another store) with
> discardOnSubsume=true (as deserialized from handles)
> # At some point, checkpoint 2 is subsumed and the initial shared state is
> not used anymore; because checkpoint 2 has discardOnSubsume=true, shared
> state will be erroneously discarded
> In 1.15, there were the following changes:
> # RestoreMode was added; only LEGACY mode is affected (in NO_CLAIM mode,
> checkpoint 2 can't reuse any initial state; and in CLAIM mode, it's fine to
> discard the initial state)
> # SharedStateRegistry was changed from refCounts to highest checkpoint ID
> (FLINK-24611)
> # In step (7), state will not be discarded (FLINK-26985); however, because
> it's impossible to distinguish initial state from the state created by this
> job, the latter will not be discarded either, leading to left-over state
> artifacts.
> The proposed solution is to store the initial checkpoint ID (in a store such as
> ZK or in checkpoints) and adjust steps 6 or 7.
> Storing restore information in checkpoints allows handling multiple restore
> modes in the "lineage", e.g.:
> Initial run -> restore in NO_CLAIM -> restore in CLAIM
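
One possible reading of the lineage idea quoted above, as a sketch only: each restore is recorded with the checkpoint it started from and the mode used, and ownership is derived by walking the records backwards. The types `Restore` and `RestoreMode` and the method `ownershipBoundary` are hypothetical and do not correspond to Flink classes; the rule that CLAIM inherits ownership from the previous run while the other modes do not is an assumption based on the description of the modes in the issue.

```java
import java.util.List;

final class RestoreLineageSketch {

    enum RestoreMode { CLAIM, NO_CLAIM, LEGACY }

    /** Hypothetical record of one restore: the checkpoint restored from and the mode used. */
    static final class Restore {
        final long fromCheckpointId;
        final RestoreMode mode;

        Restore(long fromCheckpointId, RestoreMode mode) {
            this.fromCheckpointId = fromCheckpointId;
            this.mode = mode;
        }
    }

    /**
     * Returns the checkpoint ID after which all state is treated as owned by the
     * current run. Walks the lineage from the most recent restore backwards:
     * CLAIM inherits whatever the previous run owned, any other mode ends the walk.
     * Returns -1 if ownership reaches back to the initial run.
     */
    static long ownershipBoundary(List<Restore> lineage) {
        for (int i = lineage.size() - 1; i >= 0; i--) {
            Restore restore = lineage.get(i);
            if (restore.mode != RestoreMode.CLAIM) {
                return restore.fromCheckpointId;
            }
        }
        return -1L;
    }

    /** State may be discarded only if it was created after the ownership boundary. */
    static boolean mayDiscard(long createdByCheckpointId, List<Restore> lineage) {
        return createdByCheckpointId > ownershipBoundary(lineage);
    }
}
```

For the example lineage, with a NO_CLAIM restore from a checkpoint 10 followed by a CLAIM restore from a checkpoint 20 (IDs chosen arbitrarily), the boundary is 10: state created by checkpoints 11 and later may be discarded, state from the initial run may not.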