[jira] [Commented] (FLINK-9375) Introduce AbortCheckpoint message from JM to TMs

2018-05-17 Thread vinoyang (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-9375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16480081#comment-16480081
 ] 

vinoyang commented on FLINK-9375:
-

[~sihuazhou] I agree with you: this is a duplicate of FLINK-8871. I suggest we 
close this issue and move the discussion to FLINK-8871.

> Introduce AbortCheckpoint message from JM to TMs
> 
>
> Key: FLINK-9375
> URL: https://issues.apache.org/jira/browse/FLINK-9375
> Project: Flink
> Issue Type: Improvement
> Components: State Backends, Checkpointing
> Reporter: Stefan Richter
> Assignee: vinoyang
> Priority: Major
>
> We should introduce an {{AbortCheckpoint}} message that a jobmanager can send 
> to taskmanagers if a checkpoint is canceled so that the operators can eagerly 
> stop their alignment phase and continue to normal processing. This can reduce 
> some backpressure issues in the context of canceled and restarted checkpoints.
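
To make the proposal above concrete, here is a minimal sketch of how a task might react to such a message. It is a toy model and not Flink code: the class names (AbortCheckpoint as a plain Java class, BarrierAligner), the methods onBarrier/onAbort, and the channel bookkeeping are all assumptions made for illustration. The idea it shows is that an abort for the checkpoint currently being aligned unblocks the input channels right away, and that late barriers for an aborted checkpoint are ignored.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch; none of these names are taken from Flink's code base. */
public class AbortCheckpointSketch {

    /** Assumed shape of the message a JobManager could send to TaskManagers. */
    static final class AbortCheckpoint {
        final long checkpointId;

        AbortCheckpoint(long checkpointId) {
            this.checkpointId = checkpointId;
        }
    }

    /** Toy model of barrier alignment for one task with several input channels. */
    static final class BarrierAligner {
        private final int numChannels;
        private final Set<Integer> blockedChannels = new HashSet<>();
        private long currentCheckpointId = -1L;
        private long lastAbortedId = -1L;

        BarrierAligner(int numChannels) {
            this.numChannels = numChannels;
        }

        /** A barrier arrived on a channel: block it until every channel has delivered one. */
        void onBarrier(long checkpointId, int channel) {
            if (checkpointId <= lastAbortedId || checkpointId < currentCheckpointId) {
                return; // barrier belongs to an aborted or superseded checkpoint
            }
            if (checkpointId > currentCheckpointId) {
                currentCheckpointId = checkpointId; // a newer checkpoint starts aligning
                blockedChannels.clear();
            }
            blockedChannels.add(channel);
            if (blockedChannels.size() == numChannels) {
                blockedChannels.clear(); // alignment complete, resume all channels
            }
        }

        /** The abort message arrived: stop aligning eagerly instead of waiting for barriers. */
        void onAbort(AbortCheckpoint msg) {
            lastAbortedId = Math.max(lastAbortedId, msg.checkpointId);
            if (msg.checkpointId == currentCheckpointId) {
                blockedChannels.clear();
            }
        }

        boolean isBlocked(int channel) {
            return blockedChannels.contains(channel);
        }
    }

    public static void main(String[] args) {
        BarrierAligner aligner = new BarrierAligner(2);
        aligner.onBarrier(7L, 0);
        System.out.println(aligner.isBlocked(0)); // true: waiting for channel 1
        aligner.onAbort(new AbortCheckpoint(7L));
        System.out.println(aligner.isBlocked(0)); // false: alignment was dropped
    }
}
```

Without the abort message, channel 0 in this model stays blocked until the barrier for checkpoint 7 arrives on channel 1 or a newer checkpoint supersedes it, which is the backpressure source the description refers to.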



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9375) Introduce AbortCheckpoint message from JM to TMs

2018-05-17 Thread Sihua Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-9375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16480041#comment-16480041
 ] 

Sihua Zhou commented on FLINK-9375:
---

Hi [~srichter] [~yanghua], I think this looks a bit like a duplicate of 
[FLINK-8871|https://issues.apache.org/jira/browse/FLINK-8871], which needs a 
good discussion, as you ([~srichter]) have mentioned. Or should 
[FLINK-8871|https://issues.apache.org/jira/browse/FLINK-8871] be blocked 
by this ticket (which would only cover the RPC-related work)? Maybe we should 
link these two issues together to get a better picture. What do you think?



[jira] [Commented] (FLINK-9375) Introduce AbortCheckpoint message from JM to TMs

2018-05-17 Thread Stefan Richter (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-9375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479298#comment-16479298
 ] 

Stefan Richter commented on FLINK-9375:
---

[~yanghua] this task may be a bit trickier than it sounds. If you still want 
to work on it, I think it would make a lot of sense to briefly outline your 
implementation plan here before you start coding.
