Yi Zhang created FLINK-33090:
--------------------------------

             Summary: CheckpointsCleaner clean individual checkpoint states in 
parallel
                 Key: FLINK-33090
                 URL: https://issues.apache.org/jira/browse/FLINK-33090
             Project: Flink
          Issue Type: Improvement
          Components: Runtime / Checkpointing
    Affects Versions: 1.17.1
            Reporter: Yi Zhang


Currently, CheckpointsCleaner can clean multiple checkpoints in parallel using 
the JobManager's ioExecutor; however, the states of each checkpoint are 
cleaned sequentially. With thousands of StateObjects to clean, this can take a 
long time on some checkpoint storage. If cleanup takes longer than the 
checkpoint interval, it prevents new checkpoints from being taken.

The proposal is to use the same ioExecutor to clean up each checkpoint's 
states in parallel as well. In my local testing, with the default settings for 
the ioExecutor thread pool and xK state files, this reduced cleanup time from 
10 minutes to <1 minute. 
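
A rough sketch of the idea, using only java.util.concurrent and not Flink's 
actual classes: each state object's discard call is submitted to a shared 
executor, and a combined future completes once all of them have finished. The 
names ParallelDiscardSketch, Discardable and discardAllAsync are hypothetical 
and chosen for illustration only; the real change would need to fit 
CheckpointsCleaner's existing code and error handling.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.Executor;

final class ParallelDiscardSketch {

    /** Hypothetical stand-in for a state object that can delete its own data. */
    public interface Discardable {
        void discard() throws Exception;
    }

    /**
     * Submits one discard task per state object to the given executor and
     * returns a future that completes after every task has finished. If any
     * task fails, the returned future completes exceptionally.
     */
    public static CompletableFuture<Void> discardAllAsync(
            Collection<? extends Discardable> states, Executor ioExecutor) {
        List<CompletableFuture<Void>> futures = new ArrayList<>(states.size());
        for (Discardable state : states) {
            futures.add(CompletableFuture.runAsync(() -> {
                try {
                    state.discard();
                } catch (Exception e) {
                    // Wrap checked exceptions so they surface on the future.
                    throw new CompletionException(e);
                }
            }, ioExecutor));
        }
        return CompletableFuture.allOf(futures.toArray(new CompletableFuture[0]));
    }
}
```

With this shape, the degree of parallelism is bounded by the executor's thread 
pool, so per-state cleanup would compete with other work submitted to the same 
pool.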



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
