[
https://issues.apache.org/jira/browse/FLINK-14344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16960847#comment-16960847
]
Piotr Nowojski commented on FLINK-14344:
----------------------------------------
[~SleePy], as far as [~trohrmann] and I are aware, the only user of this
feature is [the Pravega connector|https://github.com/pravega/flink-connectors],
so we can ask them, or check the code directly, to find out which semantics
they use/need.
I think the problem is that, so far, all of the hooks have been de facto
synchronous, blocking the whole {{CheckpointCoordinator}} on any IO. However,
this blocked only the {{CheckpointCoordinator}}. After the refactor, it would
block the whole {{JobManager}}, right? To make things more complicated, even if
we provide an asynchronous hook, there are two possible semantics:
# the hook is triggered asynchronously before the checkpoint starts, but
checkpoint barriers are sent and the checkpoint starts before the async hook
has completed
# the hook is triggered asynchronously, but the checkpoint is not started until
the async action completes
The first one might be preferable for us, but I could imagine that some
systems need the second one, to initialise something before the checkpoint is
started by any of the operators. Again, maybe we can just do some research into
how Pravega is using it?
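To make the difference between the two semantics above concrete, here is a minimal sketch using plain {{CompletableFuture}}. All names in it ({{AsyncHookSemantics}}, {{AsyncHook}}, {{barriersWhileHookRuns}}, {{barriersAfterHook}}) are hypothetical and only for illustration; this is not Flink's hook interface or the {{CheckpointCoordinator}} code, and "sending barriers" is reduced to recording an event in a list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class AsyncHookSemantics {

    /** Hypothetical stand-in for an asynchronous master hook; not Flink's interface. */
    public interface AsyncHook {
        CompletableFuture<String> trigger(long checkpointId);
    }

    /** Semantic 1: barriers go out right away; the hook finishes in the background. */
    public static CompletableFuture<String> barriersWhileHookRuns(
            AsyncHook hook, long checkpointId, List<String> events) {
        CompletableFuture<String> hookState = hook.trigger(checkpointId);
        events.add("barriers-sent"); // not gated on the hook's completion
        return hookState.thenApply(state -> {
            events.add("hook-done");
            return state;
        });
    }

    /** Semantic 2: barriers are held back until the hook has completed. */
    public static CompletableFuture<String> barriersAfterHook(
            AsyncHook hook, long checkpointId, List<String> events) {
        return hook.trigger(checkpointId).thenApply(state -> {
            events.add("hook-done");
            events.add("barriers-sent"); // only reached once the hook is done
            return state;
        });
    }

    public static void main(String[] args) {
        // A manually completed future stands in for slow external IO.
        CompletableFuture<String> pending = new CompletableFuture<>();
        List<String> events = new ArrayList<>();
        CompletableFuture<String> result = barriersAfterHook(id -> pending, 1L, events);
        System.out.println("before hook completes: " + events);
        pending.complete("hook-state");
        System.out.println("after hook completes: " + events + ", state=" + result.join());
    }
}
```

In both variants the calling thread never blocks on the hook; the only difference is whether the barrier step is chained behind the hook's future or issued immediately.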
> Snapshot master hook state asynchronously
> -----------------------------------------
>
> Key: FLINK-14344
> URL: https://issues.apache.org/jira/browse/FLINK-14344
> Project: Flink
> Issue Type: Sub-task
> Components: Runtime / Checkpointing
> Reporter: Biao Liu
> Assignee: Biao Liu
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.10.0
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Currently we snapshot the master hook state synchronously. As part of
> reworking the threading model of {{CheckpointCoordinator}}, we have to make this
> non-blocking to satisfy the requirement of running in the main thread.
> The behavior of snapshotting master hook state is similar to task state
> snapshotting. The master state snapshot is taken before task state
> snapshotting, because the master hook might perform external system
> initialization that task state snapshotting might depend on.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)