[
https://issues.apache.org/jira/browse/FLINK-29913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17726084#comment-17726084
]
Congxian Qiu commented on FLINK-29913:
--------------------------------------
Thanks for the discussion above and for the contribution!
Using the UUID/filename as the key solves the problem here, and it also makes
sense because the key and the remote file are then one-to-one. It also solves
some other potential problems. For example, suppose a Flink job management
platform uses the SharedRegistry to maintain the checkpoint lifecycle. As
things stand, if a task has two sst files with the same name, one of the
remote files is deleted by mistake. The situation occurs as follows: job A
generates a checkpoint chk1, then stops; job B resumes from chk1, completes
chk2, then stops; job C then resumes from chk1 and completes chk3. If we then
register chk2 and chk3 in one SharedRegistry, some remote files are deleted by
mistake, because some sst files in chk2 and chk3 have the same name.
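
To make the chk2/chk3 scenario concrete, here is a minimal sketch of the
collision. It is a toy model, not Flink's actual registry code: the class
ToyRegistry, the sst name 000010.sst and the remote paths are all hypothetical,
and the rule "a different file under an existing key deletes the earlier file"
is an assumption chosen only to show the effect (which of the two files gets
dropped is arbitrary here).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SharedRegistrySketch {

    /** Toy registry (hypothetical): one remote file per registration key. */
    static final class ToyRegistry {
        private final Map<String, String> remoteFileByKey = new HashMap<>();
        private final List<String> deleted = new ArrayList<>();

        /**
         * Registers a remote file under a key. Assumption of this sketch: if a
         * different file is already registered under the same key, the earlier
         * file is treated as superseded and deleted.
         */
        void register(String key, String remoteFile) {
            String previous = remoteFileByKey.put(key, remoteFile);
            if (previous != null && !previous.equals(remoteFile)) {
                deleted.add(previous);
            }
        }

        List<String> deletedFiles() {
            return deleted;
        }
    }

    /** chk2 (job B) and chk3 (job C) both contain an sst named 000010.sst. */
    public static List<String> deletedWhenKeyedBySstName() {
        ToyRegistry registry = new ToyRegistry();
        registry.register("000010.sst", "jobB/shared/uuid-b"); // from chk2
        registry.register("000010.sst", "jobC/shared/uuid-c"); // from chk3
        return registry.deletedFiles();
    }

    /** Same two files, but the key is the unique remote file name. */
    public static List<String> deletedWhenKeyedByRemoteFile() {
        ToyRegistry registry = new ToyRegistry();
        registry.register("jobB/shared/uuid-b", "jobB/shared/uuid-b"); // from chk2
        registry.register("jobC/shared/uuid-c", "jobC/shared/uuid-c"); // from chk3
        return registry.deletedFiles();
    }

    public static void main(String[] args) {
        // Keyed by sst name: chk2's file is deleted although chk2 still needs it.
        System.out.println(deletedWhenKeyedBySstName());
        // Keyed by remote file name: the keys never collide, nothing is deleted.
        System.out.println(deletedWhenKeyedByRemoteFile());
    }
}
```

With the remote file name as the key, two different files can never share an
entry, which is the one-to-one property mentioned above.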
> Shared state would be discarded by mistake when maxConcurrentCheckpoint>1
> -------------------------------------------------------------------------
>
> Key: FLINK-29913
> URL: https://issues.apache.org/jira/browse/FLINK-29913
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Checkpointing
> Affects Versions: 1.15.0, 1.16.0, 1.17.0
> Reporter: Yanfei Lei
> Assignee: Feifan Wang
> Priority: Major
> Fix For: 1.16.3, 1.17.2
>
>
> When maxConcurrentCheckpoint>1, the shared state of the incremental RocksDB
> state backend would be discarded when a handle with the same name is
> registered. See
> [https://github.com/apache/flink/pull/21050#discussion_r1011061072]
> cc [~roman]