Hi all,

Yes, we already had a discussion under the Jira ticket [1]. Here are my thoughts:

I'd +1 for the motivation, since there might be some long operations for
operators (source or sink) which should be done before checkpoint
materialization. One suggestion regarding the API design:

Instead of introducing a new method `asyncOperation` only for async invoke,
could we just provide a `snapshotStateAsync` which does the synchronous work
and returns a `RunnableFuture<Void>` or `RunnableFuture<Boolean>` for the
asynchronous part that follows?
And I think we should deprecate and eventually remove the original
`snapshotState` since it is subsumed by the new method. @Piotr Nowojski
<pnowoj...@apache.org> WDYT?
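
To make the suggestion a bit more concrete, here is a rough, self-contained
sketch of the shape I have in mind. It is only an illustration: the
`AsyncSnapshotable` interface, the `BufferingSink` class and the `long
checkpointId` parameter are hypothetical names I made up for this mail, not
existing Flink APIs, and the real signature would need to be settled in the
FLIP. The method body does the cheap synchronous part on the task thread and
hands back a `RunnableFuture<Void>` that carries the slow part.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.FutureTask;
import java.util.concurrent.RunnableFuture;

public class SnapshotStateAsyncSketch {

    /** Hypothetical interface for illustration only; not an existing Flink API. */
    interface AsyncSnapshotable {
        /**
         * Runs the synchronous part immediately and returns a future that
         * performs the asynchronous part when someone calls run() on it.
         */
        RunnableFuture<Void> snapshotStateAsync(long checkpointId) throws Exception;
    }

    /** Hypothetical sink that buffers records and "uploads" them at checkpoint time. */
    static class BufferingSink implements AsyncSnapshotable {
        private List<String> buffer = new ArrayList<>();
        final List<String> uploaded = new ArrayList<>();

        void invoke(String record) {
            buffer.add(record);
        }

        @Override
        public RunnableFuture<Void> snapshotStateAsync(long checkpointId) {
            // Synchronous part (task thread): detach the current buffer so
            // that records arriving later go into a fresh one.
            final List<String> toUpload = buffer;
            buffer = new ArrayList<>();

            // Asynchronous part: the slow work, here just a stand-in for a
            // file upload or flush, runs only when the future is executed.
            return new FutureTask<Void>(() -> {
                uploaded.addAll(toUpload);
                return null;
            });
        }
    }

    public static void main(String[] args) throws Exception {
        BufferingSink sink = new BufferingSink();
        sink.invoke("a");
        sink.invoke("b");

        RunnableFuture<Void> asyncPart = sink.snapshotStateAsync(1L);
        sink.invoke("c"); // processing continues before the upload has run

        Thread worker = new Thread(asyncPart); // stand-in for an async checkpoint thread
        worker.start();
        asyncPart.get(); // wait for the asynchronous part to finish

        System.out.println("uploaded=" + sink.uploaded);
    }
}
```

In this sketch the caller decides where the returned future runs, so the
operator itself does not need to know about any executor.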


[1] https://issues.apache.org/jira/browse/FLINK-37375

Best,
Zakelly

On Wed, Mar 26, 2025 at 8:40 PM jufang he <hejufang0...@gmail.com> wrote:

> Hi devs,
>
>
> I would like to start a discussion about FLIP-XXX: Checkpoint supports the
> Operator to customize asynchronous operation [1].
>
>
> In some Flink task operators, slow operations such as file uploads or data
> flushing may be performed during the synchronous phase of Checkpoint. Due
> to performance issues with external storage components, the synchronous
> phase may take too long to execute, significantly impacting the task's
> throughput.
> To address this issue, I propose supporting custom asynchronous operations
> in operators, allowing users to move time-consuming operations from the
> synchronous phase to the asynchronous phase of Checkpoint, thereby
> minimizing the blocking of the main thread and improving task throughput.
>
>
> For more details, please check the FLIP [1]. There is also a Jira about
> this [2].
>
>
> Looking forward to any comments and opinions!
>
>
> Best Regards,
> Jufang He
>
> [1]
> https://docs.google.com/document/d/1lwxLEQjD6jVhZUBMRGhzQNWKSvdbPbYNQsV265gR4kw
>
> [2] https://issues.apache.org/jira/browse/FLINK-37375
>
