[ 
https://issues.apache.org/jira/browse/FLINK-4714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15891545#comment-15891545
 ] 

Wei-Che Wei commented on FLINK-4714:
------------------------------------

Hi [~till.rohrmann]

I have some ideas about this issue and would like to get your feedback.
As I understand it, the goal is to switch the task state to {{RUNNING}} only 
after the {{StreamTask}} sets {{StreamTask.isRunning}} to true (i.e. all 
restored state and operators have been prepared), so that checkpoints won't be 
aborted.

Here are the possible solutions I have thought of.
1. Run another thread that monitors {{StreamTask.isRunning}} and changes the 
task state to {{RUNNING}}. This is more of a workaround and I don't like it: 
currently the {{Task}} changes its state proactively, whereas this 
implementation would be passive.
2. Add a prepare() method to {{AbstractInvokable}} and override it in 
{{StreamTask}} only. Move all preparation work from invoke() to prepare(), and 
call prepare() before the state transition in {{Task}}.
3. Same as the second approach, but also redefine the invoke() method for all 
classes that extend {{AbstractInvokable}}. The original invoke() method is 
defined to include all operations and setup, such as I/O stream setup.
The second approach is sub-optimal in my opinion, because it mostly just moves 
the initialization from the {{RUNNING}} state to the {{DEPLOYING}} state. 
Therefore, it is better to redefine invoke() for all invokables rather than 
customize it only for {{StreamTask}}.
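
To make the second and third options more concrete, here is a rough sketch of 
the call order I have in mind. It is not the real Flink code: {{Invokable}}, 
{{StreamInvokable}}, {{TaskRunner}} and the {{ExecutionState}} enum below are 
simplified, hypothetical stand-ins for {{AbstractInvokable}}, {{StreamTask}}, 
{{Task}} and the task state, and the actual restore logic is omitted.

```java
import java.util.ArrayList;
import java.util.List;

public class PrepareBeforeRunningSketch {

    // Simplified, hypothetical subset of the task states.
    enum ExecutionState { DEPLOYING, RUNNING, FINISHED }

    // Hypothetical stand-in for AbstractInvokable with the proposed hook.
    static abstract class Invokable {
        // Default is a no-op, so invokables without setup work are unaffected.
        void prepare() {}

        abstract void invoke();
    }

    // Hypothetical stand-in for StreamTask.
    static class StreamInvokable extends Invokable {
        volatile boolean isRunning = false;
        private final List<String> log;

        StreamInvokable(List<String> log) {
            this.log = log;
        }

        @Override
        void prepare() {
            // State restore and operator setup would go here (omitted).
            log.add("prepare");
            isRunning = true;
        }

        @Override
        void invoke() {
            // Only the processing loop remains in invoke().
            log.add("invoke");
            isRunning = false;
        }
    }

    // Hypothetical stand-in for Task.
    static class TaskRunner {
        ExecutionState state = ExecutionState.DEPLOYING;
        private final Invokable invokable;
        private final List<String> log;

        TaskRunner(Invokable invokable, List<String> log) {
            this.invokable = invokable;
            this.log = log;
        }

        void run() {
            // The task is still DEPLOYING while the invokable prepares itself.
            invokable.prepare();
            // Switch to RUNNING only once preparation has finished.
            state = ExecutionState.RUNNING;
            log.add("RUNNING");
            invokable.invoke();
            state = ExecutionState.FINISHED;
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        new TaskRunner(new StreamInvokable(log), log).run();
        System.out.println(log); // prints [prepare, RUNNING, invoke]
    }
}
```

With a no-op default for prepare(), only {{StreamTask}} would need to override 
it (option 2). Option 3 would additionally move the setup work of the other 
invokables into prepare().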

What do you think?

> Set task state to RUNNING after state has been restored
> -------------------------------------------------------
>
>                 Key: FLINK-4714
>                 URL: https://issues.apache.org/jira/browse/FLINK-4714
>             Project: Flink
>          Issue Type: Improvement
>          Components: Distributed Coordination, State Backends, Checkpointing
>    Affects Versions: 1.2.0
>            Reporter: Till Rohrmann
>            Assignee: Wei-Che Wei
>
> The task state is set to {{RUNNING}} as soon as the {{Task}} is executed. 
> That, however, happens before the state of the {{StreamTask}} invokable has 
> been restored. As a result, the {{CheckpointCoordinator}} starts to trigger 
> checkpoints even though the {{StreamTask}} is not ready.
> In order to avoid aborting checkpoints and properly start it, we should 
> switch the task state to {{RUNNING}} after the state has been restored.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
