[jira] [Commented] (FLINK-13905) Separate checkpoint triggering into stages

Piotr Nowojski (Jira) Fri, 01 Nov 2019 07:51:23 -0700


    [ 
https://issues.apache.org/jira/browse/FLINK-13905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16964885#comment-16964885
 ]


Piotr Nowojski commented on FLINK-13905:
----------------------------------------

{quote}
The savepoint can be triggered anytime. We have to somehow queue the savepoint 
trigger request if there is a checkpoint or savepoint ongoing. The queuing and 
re-checking logic still can't be avoided.
{quote}
Ok, I get it. Thanks for the explanation.
{quote}
Manual triggering seems less meaningful.
{quote}
All things being equal, I would prefer to postpone FLINK-13848 as long as 
possible, as that would be another chunk of code to review and maintain. But if 
you think it simplifies the code here enough to justify it, then sure, let's 
add the dependency between them.
{quote}
Do you have any better idea?
{quote}
No. I think this extra complexity should be worth the effort, as it saves us 
from concurrency issues like FLINK-13497. But I agree, chained/dependent 
callbacks in asynchronous code are problematic. If there are too many of them, 
multithreaded synchronous code might actually be easier to maintain.

In the end, I don't know which of the discussed options I would pick without 
trying to write them down and making up my mind as I see the code taking 
shape, so I would like to leave the decision up to you. At least now I think I 
understand the problem well enough that I won't be too surprised when 
reviewing the code :)
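
To make the queuing and re-checking logic from the first quote concrete, here 
is a minimal sketch. It is not the actual CheckpointCoordinator code: the class 
{{TriggerRequestQueue}} and its methods are hypothetical names, and it assumes 
that every call comes from a single thread, so no locking is shown.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical sketch, not Flink's CheckpointCoordinator.
// Assumption: all methods are called from one thread, so no locks are needed.
final class TriggerRequestQueue {
    private final Queue<String> pending = new ArrayDeque<>();
    private final List<String> started = new ArrayList<>();
    private String inProgress; // null while no checkpoint/savepoint is running

    /** Starts the request immediately if idle, otherwise queues it. */
    boolean trigger(String request) {
        if (inProgress != null) {
            pending.add(request); // something is ongoing: remember the request
            return false;
        }
        start(request);
        return true;
    }

    /** Called when the in-flight request finishes; re-checks the queue. */
    void onCompleted() {
        inProgress = null;
        String next = pending.poll();
        if (next != null) {
            start(next);
        }
    }

    private void start(String request) {
        inProgress = request;
        started.add(request);
    }

    /** Returns a copy of the requests in the order they were actually started. */
    List<String> startedInOrder() {
        return new ArrayList<>(started);
    }

    int pendingCount() {
        return pending.size();
    }
}
```

In this sketch a savepoint request that arrives while a checkpoint is running 
is parked in {{pending}} and only started from {{onCompleted()}}, which is the 
re-check step that cannot be avoided.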

> Separate checkpoint triggering into stages
> ------------------------------------------
>
>                 Key: FLINK-13905
>                 URL: https://issues.apache.org/jira/browse/FLINK-13905
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / Checkpointing
>            Reporter: Biao Liu
>            Assignee: Biao Liu
>            Priority: Major
>             Fix For: 1.10.0
>
>
> Currently {{CheckpointCoordinator#triggerCheckpoint}} includes some heavy IO 
> operations. We plan to separate the triggering into different stages. The IO 
> operations are executed in IO threads, while the other in-memory operations 
> are not.
> This is a preparation for making all in-memory operations of 
> {{CheckpointCoordinator}} single threaded (in the main thread).
> Note that we cannot yet put the in-memory operations of triggering into the 
> main thread directly, because some operations still run under a heavy 
> (coordinator-wide) lock.
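
The staged triggering described above could be sketched roughly as follows. 
This is only an illustration under assumptions, not the planned 
implementation: the class {{StagedTriggerSketch}}, the stage names and the two 
single-threaded executors (standing in for the main thread and an IO thread) 
are all hypothetical. It chains the stages with {{CompletableFuture}}, which is 
also the kind of chained callback discussed in the comment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: stage names and executors are assumptions for illustration.
final class StagedTriggerSketch {

    /** Runs three stages and returns "<stage>@<thread name>" for each, in order. */
    static List<String> run() throws Exception {
        ExecutorService mainThread =
                Executors.newSingleThreadExecutor(r -> new Thread(r, "main-thread"));
        ExecutorService ioThread =
                Executors.newSingleThreadExecutor(r -> new Thread(r, "io-thread"));
        List<String> trace = new CopyOnWriteArrayList<>();
        try {
            CompletableFuture<Long> done = CompletableFuture
                    // Stage 1: in-memory preparation on the main thread.
                    .supplyAsync(() -> {
                        trace.add("prepare@" + Thread.currentThread().getName());
                        return 1L; // hypothetical checkpoint id
                    }, mainThread)
                    // Stage 2: heavy IO (e.g. storage setup) on the IO thread.
                    .thenApplyAsync(id -> {
                        trace.add("initStorage@" + Thread.currentThread().getName());
                        return id;
                    }, ioThread)
                    // Stage 3: back to the main thread for in-memory bookkeeping.
                    .thenApplyAsync(id -> {
                        trace.add("register@" + Thread.currentThread().getName());
                        return id;
                    }, mainThread);
            done.get(5, TimeUnit.SECONDS);
            return new ArrayList<>(trace);
        } finally {
            mainThread.shutdown();
            ioThread.shutdown();
        }
    }
}
```

Because every in-memory stage is submitted to the same single-threaded 
executor, those stages never run concurrently with each other, which is the 
property that would let the coordinator-wide lock be dropped later.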



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
