[
https://issues.apache.org/jira/browse/FLINK-17571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17126428#comment-17126428
]
Congxian Qiu(klion26) commented on FLINK-17571:
-----------------------------------------------
[~wind_ljy] The reason we don't want to add a tool to clean up the orphan
files is that we can't clean them up safely: more than one job may reference
the same file.
But if you can guarantee that no more than one job will ever recover from the
same checkpoint, you can use such a tool to clean up the orphan files. As a
further step, you can add the tool to crontab so that it deletes the orphan
files regularly.
PS: maybe you can first move the orphan files somewhere else and delete them
a day (or some other duration) later, in case the wrong files were selected.
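A minimal sketch of the move-first, delete-later idea, assuming the shared
directory is on a local (or mounted) file system and that the set of file names
still referenced by checkpoints is already known. Flink does not provide that
list today, so `referenced` is an assumed input, and the function names
`quarantine_orphans` and `purge_quarantine` are made up for illustration. An
object store such as S3 would need its own client calls instead of `shutil`.

```python
import os
import shutil
import time
from pathlib import Path


def quarantine_orphans(shared_dir, referenced, quarantine_dir):
    """Move files not named in `referenced` from shared_dir to quarantine_dir.

    `referenced` is an assumed input: the set of file names that some
    checkpoint still uses. Nothing is deleted in this step.
    """
    shared_dir, quarantine_dir = Path(shared_dir), Path(quarantine_dir)
    quarantine_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for path in sorted(shared_dir.iterdir()):
        if path.is_file() and path.name not in referenced:
            target = quarantine_dir / path.name
            shutil.move(str(path), str(target))
            # Reset the modification time so the grace period starts at the
            # moment of the move, not when the file was first written.
            os.utime(target, None)
            moved.append(path.name)
    return moved


def purge_quarantine(quarantine_dir, grace_seconds, now=None):
    """Delete quarantined files that were moved at least grace_seconds ago."""
    now = time.time() if now is None else now
    deleted = []
    for path in sorted(Path(quarantine_dir).iterdir()):
        if path.is_file() and now - path.stat().st_mtime >= grace_seconds:
            path.unlink()
            deleted.append(path.name)
    return deleted
```

Both functions could be run from the same crontab entry: the first moves newly
found orphans aside, the second removes only those that have sat in quarantine
for the full grace period, so a file that was moved by mistake can still be
restored by hand in the meantime.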
> A better way to show the files used in currently checkpoints
> ------------------------------------------------------------
>
> Key: FLINK-17571
> URL: https://issues.apache.org/jira/browse/FLINK-17571
> Project: Flink
> Issue Type: New Feature
> Components: Command Line Client, Runtime / Checkpointing
> Reporter: Congxian Qiu(klion26)
> Priority: Major
>
> Inspired by the
> [userMail|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Shared-Checkpoint-Cleanup-and-S3-Lifecycle-Policy-tt34965.html]
> Currently, there are [three types of
> directory|https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/state/checkpoints.html#directory-structure]
> for a checkpoint. The files in the TASKOWNED and EXCLUSIVE directories can be
> deleted safely, but users can't safely delete the files in the SHARED
> directory (those files may have been created a long time ago).
> I think it would be better to give users a way to know which files are
> currently in use (so that the others are known to be unused).
> Maybe a command-line command such as the one below would be enough to support
> such a feature.
> {{./bin/flink checkpoint list $checkpointDir # list all the files used in
> checkpoint}}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)