[jira] [Commented] (FLINK-39307) Improve checkpoint deletion speed in CompletedCheckpoint

Gabor Somogyi (Jira) Tue, 24 Mar 2026 02:24:11 -0700


    [ 
https://issues.apache.org/jira/browse/FLINK-39307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18067963#comment-18067963
 ]


Gabor Somogyi commented on FLINK-39307:
---------------------------------------

We're testing async parallel delete with thread pool in our systems and works 
well exactly because we're using incremental 99% of the cases + there are FS 
implementations which don't support directory deletion. The latter can be 
handled so I'm not against dir deletion but for short term that's not an option.

> Improve checkpoint deletion speed in CompletedCheckpoint
> --------------------------------------------------------
>
>                 Key: FLINK-39307
>                 URL: https://issues.apache.org/jira/browse/FLINK-39307
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Checkpointing
>            Reporter: Shihuan Liu
>            Priority: Minor
>
> In flink, due to checkpoints have a retention number, after completed a 
> checkpoint the job will delete an old one. Right now to delete that old 
> checkpoint, the files within the checkpoint are deleted in sequential:
> {code:java}
> discard()                                          // on ioExecutor thread
>  ├─ metadataHandle.discardState()                  // fs.delete(file1,false)  
>  ├─ for each OperatorState:                        // e.g. 5 operators
>  │    └─ for each subtask:                         // e.g. 200 subtasks
>  │         └─ for each state handle:               // e.g. 1-3 handles per 
> subtask
>  │              └─ FileStateHandle.discardState()  // fs.delete(fileN, false) 
>  ~"false" -> file deletion
>  └─ disposeStorageLocation()                       // fs.delete(emptyDir, 
> false) {code}
> In our company (Uber), we found that if the checkpoint has a large number of 
> files, like many thousand files, the deletion speed is slow due to the fact 
> that the job has to delete files one-by-one, and send an rpc call to storage 
> backend server (like OCI/GCS servers) per file. 
> We are thinking of one potential improvement, which is to do a directory 
> deletion instead of file-by-file deletion such that the job just calls 
> fs.delete on the checkpoint directory (like chk-1010 for example). This will 
> allow storage layer's batch deletion api (within GCS/OCI clients for example) 
> to kick in, so that they can fit multiple files deletion requests into a 
> single rpc call. This potentially can significantly improve checkpoint clean 
> up performance. We can do this for full checkpointing mode, as for this mode 
> all checkpoint files are hosted under the chk-N folder.
> Any thoughts/recommendations?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

[jira] [Commented] (FLINK-39307) Improve checkpoint deletion speed in CompletedCheckpoint

Reply via email to