Yi Zhang created FLINK-33090:
--------------------------------
Summary: CheckpointsCleaner clean individual checkpoint states in
parallel
Key: FLINK-33090
URL: https://issues.apache.org/jira/browse/FLINK-33090
Project: Flink
Issue Type: Improvement
Components: Runtime / Checkpointing
Affects Versions: 1.17.1
Reporter: Yi Zhang
Currently, CheckpointsCleaner can clean multiple checkpoints in parallel using
the JobManager's ioExecutor; however, the states of each individual checkpoint
are cleaned sequentially. With thousands of StateObjects to clean, this can take
a long time on some checkpoint storage. If cleanup takes longer than the
checkpoint interval, it prevents new checkpoints from being taken.
The proposal is to use the same ioExecutor to clean up each checkpoint's states
in parallel as well. In my local testing, with default settings for the
ioExecutor thread pool and xK state files, this can reduce cleanup time from 10
minutes to under 1 minute.
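
To illustrate the idea, here is a minimal sketch of fanning out per-state
discards onto a shared executor. It is not the actual CheckpointsCleaner code:
the class ParallelDiscardSketch, the Discardable interface, and the
discardAllAsync method are hypothetical names used only for this example, and a
fixed thread pool stands in for the JobManager's ioExecutor.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class ParallelDiscardSketch {

    /** Hypothetical stand-in for a state object whose discard may be slow. */
    public interface Discardable {
        void discard() throws Exception;
    }

    /**
     * Submits one discard task per state object to the given executor and
     * returns a future that completes once every task has finished. If any
     * discard fails, the returned future completes exceptionally.
     */
    public static CompletableFuture<Void> discardAllAsync(
            List<? extends Discardable> states, Executor ioExecutor) {
        List<CompletableFuture<Void>> futures = states.stream()
                .map(state -> CompletableFuture.runAsync(() -> {
                    try {
                        state.discard();
                    } catch (Exception e) {
                        // Wrap checked exceptions so they surface via the future.
                        throw new CompletionException(e);
                    }
                }, ioExecutor))
                .collect(Collectors.toList());
        return CompletableFuture.allOf(futures.toArray(new CompletableFuture[0]));
    }

    public static void main(String[] args) throws Exception {
        // Assumption: a small fixed pool stands in for the real ioExecutor.
        ExecutorService ioExecutor = Executors.newFixedThreadPool(4);
        AtomicInteger discarded = new AtomicInteger();

        // 1000 dummy state objects; each discard just bumps a counter.
        List<Discardable> states = IntStream.range(0, 1000)
                .mapToObj(i -> (Discardable) discarded::incrementAndGet)
                .collect(Collectors.toList());

        discardAllAsync(states, ioExecutor).get(); // wait for all discards
        ioExecutor.shutdown();
        System.out.println("discarded " + discarded.get() + " state objects");
    }
}
```

Because the same pool would also serve other I/O work, a real change would
likely need to consider how many discard tasks are queued at once; this sketch
does not address that.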
--
This message was sent by Atlassian Jira
(v8.20.10#820010)