[
https://issues.apache.org/jira/browse/FLINK-13905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16964885#comment-16964885
]
Piotr Nowojski commented on FLINK-13905:
----------------------------------------
{quote}
The savepoint can be triggered anytime. We have to somehow queue the savepoint
trigger request if there is a checkpoint or savepoint ongoing. The queuing and
re-checking logic still can't be avoided.
{quote}
Ok, I get it. Thanks for the explanation.
{quote}
Manual triggering seems less meaningful.
{quote}
All things being equal, I would prefer to postpone FLINK-13848 as long as
possible, as that would be another chunk of code to review/maintain. But if you
think it simplifies the code here enough to justify it, then sure, let's add
the dependency between them.
{quote}
Do you have any better idea?
{quote}
No. I think this extra complexity should be worth the effort, as it saves us
from concurrency issues like FLINK-13497. But I agree, chained/dependent
callbacks in asynchronous code are problematic. If there are too many of them,
multithreaded synchronous code might actually be easier to maintain.
In the end, I don't know which of the discussed options I would pick without
trying to write them down and making up my mind as I see the code taking
shape, so I would like to leave the decision up to you. At least now I think I
understand the problem well enough that I won't be too surprised when
reviewing the code :)
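
A minimal sketch of the chained-callback variant discussed above, assuming one
IO executor and one single-threaded "main thread" executor. All names here
(StagedTrigger, Request, initializeStorage) are hypothetical and exist only for
illustration; this is not the actual CheckpointCoordinator code. It shows
trigger requests being queued while another one is in progress, the IO stage
running on the IO pool, and the in-memory bookkeeping running only on the main
thread, so no lock is needed for the queue or the flag.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical illustration only; not Flink's CheckpointCoordinator. */
class StagedTrigger {

    /** One queued checkpoint or savepoint trigger request. */
    private static final class Request {
        final boolean savepoint;
        final CompletableFuture<String> result = new CompletableFuture<>();

        Request(boolean savepoint) {
            this.savepoint = savepoint;
        }
    }

    private final Executor ioExecutor;
    private final Executor mainThread; // assumed to be single-threaded

    // The fields below are only read and written on mainThread.
    private final Queue<Request> pending = new ArrayDeque<>();
    private boolean inProgress;
    private long nextId = 1;

    StagedTrigger(Executor ioExecutor, Executor mainThread) {
        this.ioExecutor = ioExecutor;
        this.mainThread = mainThread;
    }

    /** Can be called from any thread, at any time. */
    CompletableFuture<String> trigger(boolean savepoint) {
        Request request = new Request(savepoint);
        mainThread.execute(() -> {
            pending.add(request);
            startNextIfIdle();
        });
        return request.result;
    }

    /** Runs on mainThread: the "re-checking" step after each completion. */
    private void startNextIfIdle() {
        if (inProgress) {
            return; // stay queued until the ongoing trigger finishes
        }
        Request request = pending.poll();
        if (request == null) {
            return;
        }
        inProgress = true;
        long id = nextId++;
        CompletableFuture
            // Stage 1: heavy IO, executed on the IO pool.
            .supplyAsync(() -> initializeStorage(id, request.savepoint), ioExecutor)
            // Stage 2: in-memory bookkeeping, back on the main thread.
            .handleAsync((location, error) -> {
                inProgress = false;
                if (error != null) {
                    request.result.completeExceptionally(error);
                } else {
                    request.result.complete(location);
                }
                startNextIfIdle();
                return null;
            }, mainThread);
    }

    /** Stand-in for the IO-heavy part of triggering. */
    private static String initializeStorage(long id, boolean savepoint) {
        return (savepoint ? "savepoint-" : "checkpoint-") + id;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService io = Executors.newFixedThreadPool(2);
        ExecutorService main = Executors.newSingleThreadExecutor();
        StagedTrigger trigger = new StagedTrigger(io, main);
        CompletableFuture<String> first = trigger.trigger(false);
        CompletableFuture<String> second = trigger.trigger(true); // queued behind the first
        System.out.println(first.get() + " " + second.get());
        io.shutdown();
        main.shutdown();
    }
}
```

Even in this small form, the control flow is split across three callbacks,
which is the maintainability cost mentioned above; the benefit is that the
queue and the in-progress flag are never accessed concurrently.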
> Separate checkpoint triggering into stages
> ------------------------------------------
>
> Key: FLINK-13905
> URL: https://issues.apache.org/jira/browse/FLINK-13905
> Project: Flink
> Issue Type: Sub-task
> Components: Runtime / Checkpointing
> Reporter: Biao Liu
> Assignee: Biao Liu
> Priority: Major
> Fix For: 1.10.0
>
>
> Currently {{CheckpointCoordinator#triggerCheckpoint}} includes some heavy IO
> operations. We plan to separate the triggering into different stages: the IO
> operations are executed in IO threads, while the other, in-memory operations
> are not.
> This is a preparation for making all in-memory operations of
> {{CheckpointCoordinator}} single-threaded (in the main thread).
> Note that we cannot yet move the in-memory operations of triggering into the
> main thread directly, because some operations still run under a heavy
> (coordinator-wide) lock.