[
https://issues.apache.org/jira/browse/FLINK-4912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15851002#comment-15851002
]
ASF GitHub Bot commented on FLINK-4912:
---------------------------------------
Github user wangzhijiang999 commented on the issue:
https://github.com/apache/flink/pull/3113
@StephanEwen, thank you for the concrete suggestions. Sorry for the delayed
response; I was away for the Chinese Spring Festival holiday.
I have added some tests that validate the state transitions of the state
machine. They relate to the later processes, so I will submit them together
with the following PRs.
I totally agree with your points about the possible state transitions above,
and I plan to give a detailed explanation of my implementation in another JIRA
soon. The full change is a bit complex, so I am trying to break it down into
small pieces that can be reviewed and merged quickly.
> Introduce RECONCILING state in ExecutionGraph
> ---------------------------------------------
>
> Key: FLINK-4912
> URL: https://issues.apache.org/jira/browse/FLINK-4912
> Project: Flink
> Issue Type: Sub-task
> Components: Distributed Coordination
> Reporter: Stephan Ewen
> Assignee: Zhijiang Wang
>
> This is part of the non-disruptive JobManager failure recovery.
> I suggest adding a JobStatus and ExecutionState {{RECONCILING}}.
> If a job is started on a JobManager for master recovery (tbd how to
> determine that), the {{ExecutionGraph}} and the {{Execution}}s start in the
> reconciling state.
> From {{RECONCILING}}, tasks can go to {{RUNNING}} (execution reconciled with
> TaskManager) or to {{FAILED}}.
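The transitions described in the issue can be sketched roughly as follows. This
is only an illustration, not Flink's actual ExecutionState or ExecutionGraph
code: the class ReconcilingSketch, its State enum, and the helper methods are
hypothetical names, and only the two transitions stated above (RECONCILING to
RUNNING, RECONCILING to FAILED) are modelled.

```java
import java.util.EnumSet;
import java.util.Set;

public class ReconcilingSketch {

    /** Hypothetical subset of states; not Flink's real ExecutionState enum. */
    public enum State { RECONCILING, RUNNING, FAILED }

    /** Targets reachable from a state, limited to what the issue describes. */
    static Set<State> allowedTargets(State from) {
        if (from == State.RECONCILING) {
            // Reconciled with the TaskManager, or reconciliation failed.
            return EnumSet.of(State.RUNNING, State.FAILED);
        }
        // Transitions out of other states are outside the scope of this sketch.
        return EnumSet.noneOf(State.class);
    }

    static boolean canTransition(State from, State to) {
        return allowedTargets(from).contains(to);
    }

    /**
     * Assumed reconciliation outcome: a task whose execution is confirmed by
     * its TaskManager becomes RUNNING, otherwise it becomes FAILED.
     */
    static State reconcile(boolean confirmedByTaskManager) {
        State next = confirmedByTaskManager ? State.RUNNING : State.FAILED;
        if (!canTransition(State.RECONCILING, next)) {
            throw new IllegalStateException("Illegal transition to " + next);
        }
        return next;
    }

    public static void main(String[] args) {
        System.out.println(reconcile(true));
        System.out.println(reconcile(false));
        System.out.println(canTransition(State.RUNNING, State.RECONCILING));
    }
}
```

How a recovering JobManager decides that a task was confirmed is left open in
the issue (tbd), so the boolean parameter stands in for that decision.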
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)