[ 
https://issues.apache.org/jira/browse/FLINK-38842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18047983#comment-18047983
 ] 

SkylerLin commented on FLINK-38842:
-----------------------------------

I'm a new contributor. I would like to fix this bug if possible.

> FileSink may leave orphaned temporary files in COS after restoring from a 
> checkpoint, due to missing cleanup logic for temporary files in the 
> checkpointed state.
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-38842
>                 URL: https://issues.apache.org/jira/browse/FLINK-38842
>             Project: Flink
>          Issue Type: Bug
>          Components: Connectors / FileSystem
>    Affects Versions: 1.13.0, 1.14.0, 2.0.0, 1.15.0, 1.16.0, 1.17.0, 1.18.0, 
> 1.19.0, 1.20.0, 2.1.0, 2.2.0
>         Environment: * Flink Version: 1.16.1 
>  * Storage: Tencent Cloud Object Storage (COS). The issue is storage-agnostic 
> and should affect all filesystems used with {{FileSink}}.
>            Reporter: SkylerLin
>            Priority: Major
>
> When restoring a FileSink job from a checkpoint, the temporary files 
> previously written to COS are re-read and processed correctly. However, 
> unlike the legacy StreamFileSink, the current FileSink implementation does 
> not mark and delete temporary files recorded in the checkpointed state. This 
> results in two temporary files appearing in COS after restore, and one of 
> them may never be cleaned up, leaving orphaned files in the storage system 
> permanently.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
