[ https://issues.apache.org/jira/browse/FLINK-14344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16950976#comment-16950976 ]
Biao Liu commented on FLINK-14344:
----------------------------------
Hi [~pnowojski],
{quote}I think ideally I would prefer to make a contract that sync master hooks
should be non blocking executed in the main thread. Async hooks could also be
executed by the main thread and user should take care of spawning/re-using his
own thread to actually execute the async work (just as in AsyncWaitOperator).
If user executes blocking code, let him shoot himself in the foot...{quote}
It makes sense to me. So there are three options now.
1. {{MasterState syncSnapshotHook(...)}}
2. {{CompletableFuture<MasterState> asyncSnapshotHook(ioExecutor)}}
3. {{CompletableFuture<MasterState> asyncSnapshotHook()}}
The only difference between option 2 and 3 is whether we provide an IO executor
or not. I tend to choose option 2. But option 3 is also acceptable to me.
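To make the three options concrete, here is a minimal Java sketch of how the signatures could look. It is only an illustration: {{HookState}} is a hypothetical stand-in for {{MasterState}}, the interface and class names are made up for this example, and the parameter lists are left empty apart from the IO executor because the final parameters are not settled here.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;

// Hypothetical stand-in for the state a master hook returns (not Flink's MasterState).
final class HookState {
    final String payload;

    HookState(String payload) {
        this.payload = payload;
    }
}

// Option 1: synchronous hook. It runs in the main thread, so it must not block.
interface SyncSnapshotHook {
    HookState syncSnapshotHook();
}

// Option 2: asynchronous hook. The caller hands over an executor for the IO work.
interface AsyncSnapshotHookWithExecutor {
    CompletableFuture<HookState> asyncSnapshotHook(Executor ioExecutor);
}

// Option 3: asynchronous hook. The hook has to spawn or reuse its own thread.
interface AsyncSnapshotHookOwnThread {
    CompletableFuture<HookState> asyncSnapshotHook();
}

// Example implementations, assumed purely for illustration.
final class SnapshotHookOptions {

    static SyncSnapshotHook option1() {
        // Returns immediately; nothing here may block the main thread.
        return () -> new HookState("sync");
    }

    static AsyncSnapshotHookWithExecutor option2() {
        // The (possibly blocking) work runs on the executor supplied by the caller.
        return io -> CompletableFuture.supplyAsync(() -> new HookState("io"), io);
    }

    static AsyncSnapshotHookOwnThread option3() {
        // No executor is provided, so the hook manages its own thread.
        return () -> {
            CompletableFuture<HookState> result = new CompletableFuture<>();
            Thread worker = new Thread(() -> result.complete(new HookState("own-thread")));
            worker.setDaemon(true);
            worker.start();
            return result;
        };
    }
}
```

With option 2 the coordinator controls which thread pool the IO runs on; with option 3 every hook implementation has to manage a thread itself, which is the main practical difference between the two.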
{quote}unless we schedule periodic actions always in some separate thread,
first thing we do is to execute the hooks in that thread, and only after
execute the hooks, we enqueue follow up work in the main thread?{quote}
I think we should schedule periodic actions in the main thread as planned.
That's one of the main goals of this rework: making it single-threaded and
lock-free (removing the trigger/coordinator-wide lock).
{quote}...we would have to make sure that none of the CheckpointCoordinator
actions will be triggered until that other thread finish its work.{quote}
I think we have to face this problem. The master hook is by design meant to do
some IO operations, and we can't wait for the result synchronously. It should be
done with {{CompletableFuture}} and the main thread executor.
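As a rough sketch of that pattern in plain Java: the IO part runs on an IO executor, and the follow-up is handed back to a single-threaded "main thread" executor without anyone blocking on the result. The class, method, and thread names below are hypothetical, and the string result only stands in for the real master state.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class MainThreadHandoffSketch {

    // Hypothetical helper: a single-threaded executor whose thread has a known name.
    static ExecutorService named(String threadName) {
        return Executors.newSingleThreadExecutor(r -> new Thread(r, threadName));
    }

    // Runs the hook's IO on ioExecutor, then continues on mainThreadExecutor.
    // The caller gets a future back right away and never waits synchronously.
    static CompletableFuture<String> snapshotThenContinue(
            Executor ioExecutor, Executor mainThreadExecutor) {
        return CompletableFuture
                // Stand-in for the blocking IO done by the master hook.
                .supplyAsync(() -> "master-state", ioExecutor)
                // Follow-up work is enqueued on the main thread executor.
                .thenApplyAsync(
                        state -> state + " handled on " + Thread.currentThread().getName(),
                        mainThreadExecutor);
    }

    public static void main(String[] args) {
        ExecutorService io = named("io-thread");
        ExecutorService mainThread = named("coordinator-main-thread");
        try {
            // join() is used only so the demo can print the result before exiting.
            System.out.println(snapshotThenContinue(io, mainThread).join());
        } finally {
            io.shutdown();
            mainThread.shutdown();
        }
    }
}
```

Because the continuation is always submitted to the main thread executor, coordinator state is only ever touched from one thread, which is what lets the lock go away.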
> Snapshot master hook state asynchronously
> -----------------------------------------
>
> Key: FLINK-14344
> URL: https://issues.apache.org/jira/browse/FLINK-14344
> Project: Flink
> Issue Type: Sub-task
> Components: Runtime / Checkpointing
> Reporter: Biao Liu
> Assignee: Biao Liu
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.10.0
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Currently we snapshot the master hook state synchronously. As part of
> reworking the threading model of {{CheckpointCoordinator}}, we have to make
> this non-blocking to satisfy the requirement of running in the main thread.
> The behavior of snapshotting master hook state is similar to task state
> snapshotting. The master state snapshot is taken before the task state
> snapshot, because the master hook might initialize an external system that
> task state snapshotting depends on.