[
https://issues.apache.org/jira/browse/FLINK-14331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated FLINK-14331:
-----------------------------------
Labels: pull-request-available (was: )
> Reset vertices right after they transition to terminated states
> ---------------------------------------------------------------
>
> Key: FLINK-14331
> URL: https://issues.apache.org/jira/browse/FLINK-14331
> Project: Flink
> Issue Type: Sub-task
> Components: Runtime / Coordination
> Affects Versions: 1.10.0
> Reporter: Zhu Zhu
> Assignee: Zhu Zhu
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.10.0
>
>
> Currently in DefaultScheduler, tasks to be restarted remain in a terminated
> state until they are re-scheduled by the SchedulingStrategy.
> This behavior may cause two problems:
> 1. Failed/Canceled tasks may not be able to restart under lazy
> scheduling. E.g. the job A1--pipelined-->B1 fails. Only A1 is
> re-scheduled in restartTasks(), since the inputs of B1 are not ready. B1
> should be scheduled later, on the partition consumable event from the
> restarted A1, but the terminal state of B1 prevents it from being scheduled.
> 2. A task can stay in FAILED/CANCELED state for a long time if its
> inputs take a long time to become ready again. This is also
> unfriendly to users and may cause confusion.
> That's why I'd propose to reset vertices right after they transition to
> terminated states.
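
The first problem above can be illustrated with a small toy model. This is
only a sketch under stated assumptions: the class LazyRestartSketch, its
State enum, and its methods are hypothetical names made up for this example
and are not Flink's actual scheduler classes or API. The model assumes that
B1 is canceled when A1 fails, and that a vertex in a terminal state is
rejected when a partition consumable event tries to schedule it.

```java
import java.util.EnumSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Toy model only; none of these names are Flink's real classes or methods. */
final class LazyRestartSketch {

    enum State { CREATED, SCHEDULED, RUNNING, FAILED, CANCELED }

    private static final Set<State> TERMINAL = EnumSet.of(State.FAILED, State.CANCELED);

    private final Map<String, State> states = new LinkedHashMap<>();
    private final boolean resetOnTermination;

    LazyRestartSketch(boolean resetOnTermination) {
        this.resetOnTermination = resetOnTermination;
        // Job A1 --pipelined--> B1, both running before the failure.
        states.put("A1", State.RUNNING);
        states.put("B1", State.RUNNING);
    }

    /** A1 fails; the model assumes its pipelined consumer B1 is canceled with it. */
    void failJob() {
        terminate("A1", State.FAILED);
        terminate("B1", State.CANCELED);
    }

    private void terminate(String vertex, State terminalState) {
        states.put(vertex, terminalState);
        if (resetOnTermination) {
            // Proposed behavior: reset right after reaching a terminal state.
            states.put(vertex, State.CREATED);
        }
    }

    /** Lazy restart: only A1 is re-scheduled because B1's inputs are not ready. */
    void restartTasks() {
        states.put("A1", State.CREATED);   // reset at re-scheduling time (no-op if already reset)
        states.put("A1", State.SCHEDULED);
    }

    /** Partition consumable event from the restarted A1; terminal vertices are rejected. */
    boolean onPartitionConsumable(String consumer) {
        if (TERMINAL.contains(states.get(consumer))) {
            return false;
        }
        states.put(consumer, State.SCHEDULED);
        return true;
    }

    State stateOf(String vertex) {
        return states.get(vertex);
    }

    public static void main(String[] args) {
        for (boolean reset : new boolean[] {false, true}) {
            LazyRestartSketch job = new LazyRestartSketch(reset);
            job.failJob();
            job.restartTasks();
            boolean scheduled = job.onPartitionConsumable("B1");
            System.out.println("resetOnTermination=" + reset
                    + " -> B1 scheduled: " + scheduled + ", B1 state: " + job.stateOf("B1"));
        }
    }
}
```

In this model, without the early reset B1 stays CANCELED and is never
scheduled. With the reset applied on termination, B1 is back in CREATED by
the time the event from A1 arrives, so it can be scheduled.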