Hi devs,

I would like to start a discussion about FLIP-XXX: Checkpoint supports the Operator to customize asynchronous operation [1].

Some Flink operators perform slow operations, such as file uploads or data flushing, during the synchronous phase of a checkpoint. When the external storage component performs poorly, the synchronous phase can take too long to execute, which significantly reduces the task's throughput.

To address this issue, I propose letting operators define custom asynchronous operations, so that users can move time-consuming work from the synchronous phase to the asynchronous phase of a checkpoint. This minimizes blocking of the main thread and improves task throughput.

For more details, please check the FLIP [1]. There is also a Jira ticket about this [2].

Looking forward to any comments and opinions!

Best Regards,
Jufang He

[1] https://docs.google.com/document/d/1lwxLEQjD6jVhZUBMRGhzQNWKSvdbPbYNQsV265gR4kw
[2] https://issues.apache.org/jira/browse/FLINK-37375