[ https://issues.apache.org/jira/browse/FLINK-38842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18047983#comment-18047983 ]
SkylerLin commented on FLINK-38842:
-----------------------------------
I'm a new contributor. I would like to fix this bug if possible.
> FileSink may leave orphaned temporary files in COS after restoring from a
> checkpoint, due to missing cleanup logic for temporary files in the
> checkpointed state.
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: FLINK-38842
> URL: https://issues.apache.org/jira/browse/FLINK-38842
> Project: Flink
> Issue Type: Bug
> Components: Connectors / FileSystem
> Affects Versions: 1.13.0, 1.14.0, 2.0.0, 1.15.0, 1.16.0, 1.17.0, 1.18.0,
> 1.19.0, 1.20.0, 2.1.0, 2.2.0
> Environment: * Flink Version: 1.16.1
> * Storage: Tencent Cloud Object Storage (COS). The issue is storage-agnostic
> and should affect all filesystems used with {{FileSink}}.
> Reporter: SkylerLin
> Priority: Major
>
> When a FileSink job is restored from a checkpoint, the temporary files
> previously written to COS are re-read and processed correctly. However,
> unlike the legacy StreamingFileSink, the current FileSink implementation does
> not mark and delete the temporary files recorded in the checkpointed state.
> As a result, two temporary files appear in COS after the restore, and one of
> them may never be cleaned up, leaving an orphaned file in the storage system
> permanently.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)