[ 
https://issues.apache.org/jira/browse/FLINK-14344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16950976#comment-16950976
 ] 

Biao Liu commented on FLINK-14344:
----------------------------------

Hi [~pnowojski],
{quote}I think ideally I would prefer to make a contract that sync master hooks 
should be non blocking executed in the main thread. Async hooks could also be 
executed by the main thread and user should take care of spawning/re-using his 
own thread to actually execute the async work (just as in AsyncWaitOperator). 
If user executes blocking code, let him shoot himself in the foot...{quote}
It makes sense to me. So there are three options now.
1. {{MasterState syncSnapshotHook(...)}}
2. {{CompletableFuture<MasterState> asyncSnapshotHook(ioExecutor)}}
3. {{CompletableFuture<MasterState> asyncSnapshotHook()}}

The only difference between option 2 and 3 is whether we provide an IO executor 
or not. I tend to choose option 2. But option 3 is also acceptable to me.
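
To make the three options concrete, here is a minimal sketch of how the signatures might look as Java interfaces. This is not Flink's actual API: the interface names, the {{MasterStateSketch}} placeholder type, the {{checkpointId}} parameter and the {{HookOptions.offload}} helper are all hypothetical, chosen only to illustrate the method shapes listed above.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;

// Hypothetical stand-in for the serialized master hook state.
final class MasterStateSketch {
    final String name;
    final byte[] bytes;

    MasterStateSketch(String name, byte[] bytes) {
        this.name = name;
        this.bytes = bytes;
    }
}

// Option 1: synchronous hook. It runs in the main thread, so it must not block.
interface SyncSnapshotHook {
    MasterStateSketch syncSnapshotHook(long checkpointId);
}

// Option 2: asynchronous hook that is handed an IO executor for blocking work.
interface AsyncSnapshotHookWithExecutor {
    CompletableFuture<MasterStateSketch> asyncSnapshotHook(long checkpointId, Executor ioExecutor);
}

// Option 3: asynchronous hook that spawns or re-uses its own threads.
interface AsyncSnapshotHook {
    CompletableFuture<MasterStateSketch> asyncSnapshotHook(long checkpointId);
}

final class HookOptions {
    // Adapts a blocking body to the option 2 shape by running it on the
    // executor passed in by the caller (assumed here to be an IO executor).
    static AsyncSnapshotHookWithExecutor offload(SyncSnapshotHook blockingBody) {
        return (checkpointId, ioExecutor) ->
                CompletableFuture.supplyAsync(
                        () -> blockingBody.syncSnapshotHook(checkpointId), ioExecutor);
    }
}
```

With option 2 the caller decides where blocking work runs; with option 3 the same {{supplyAsync}} call would have to use an executor owned by the hook itself.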
{quote}unless we schedule periodic actions always in some separate thread, 
first thing we do is to execute the hooks in that thread, and only after 
execute the hooks, we enqueue follow up work in the main thread?{quote}
I think we should schedule periodic actions in the main thread as planned. 
That's one of the main goals of this rework: making it single-threaded and 
lock-free (no trigger/coordinator-wide lock).
{quote}...we would have to make sure that none of the CheckpointCoordinator 
actions will be triggered until that other thread finish its work.{quote}
I think we have to face this problem. The master hook is by design meant to do 
some IO operations, and we can't wait for the result synchronously. It should 
be done with {{CompletableFuture}} and the main thread executor.
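
The approach described above can be sketched roughly as follows, assuming plain {{java.util.concurrent}} executors stand in for the IO executor and the main thread executor. The class name, method name and thread names are hypothetical, and this is not the actual {{CheckpointCoordinator}} code: the blocking hook body runs on the IO executor, and only the follow-up work is enqueued on the main thread executor.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

final class AsyncHookSketch {
    // Runs the (possibly blocking) hook on ioExecutor, then applies the
    // follow-up step on mainThreadExecutor. Nothing waits synchronously.
    static CompletableFuture<String> snapshotThenContinue(
            Supplier<String> blockingHook, Executor ioExecutor, Executor mainThreadExecutor) {
        return CompletableFuture
                .supplyAsync(blockingHook, ioExecutor)
                .thenApplyAsync(
                        state -> state + " handled on " + Thread.currentThread().getName(),
                        mainThreadExecutor);
    }

    public static void main(String[] args) {
        // Single-threaded executors stand in for the IO pool and the main thread.
        ExecutorService io = Executors.newSingleThreadExecutor(r -> new Thread(r, "io"));
        ExecutorService main =
                Executors.newSingleThreadExecutor(r -> new Thread(r, "main-thread"));
        try {
            // join() is only used here so the demo can print the result.
            String result = snapshotThenContinue(() -> "master-state", io, main).join();
            System.out.println(result); // prints "master-state handled on main-thread"
        } finally {
            io.shutdown();
            main.shutdown();
        }
    }
}
```

Because the continuation is always submitted to the main thread executor, any coordinator state it touches is still only accessed from a single thread, which is what keeps the design lock-free.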

> Snapshot master hook state asynchronously
> -----------------------------------------
>
>                 Key: FLINK-14344
>                 URL: https://issues.apache.org/jira/browse/FLINK-14344
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / Checkpointing
>            Reporter: Biao Liu
>            Assignee: Biao Liu
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.10.0
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently we snapshot the master hook state synchronously. As a part of 
> reworking the threading model of {{CheckpointCoordinator}}, we have to make 
> this non-blocking to satisfy the requirement of running in the main thread.
> The behavior of snapshotting master hook state is similar to task state 
> snapshotting. The master state snapshot is taken before the task state 
> snapshot, because the master hook might initialize an external system 
> which task state snapshotting depends on.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
