[
https://issues.apache.org/jira/browse/FLINK-29769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Weijie Guo updated FLINK-29769:
-------------------------------
Affects Version/s: (was: 1.17.0)
> Further limit the explosion range of failover in hybrid shuffle mode
> --------------------------------------------------------------------
>
> Key: FLINK-29769
> URL: https://issues.apache.org/jira/browse/FLINK-29769
> Project: Flink
> Issue Type: Sub-task
> Components: Runtime / Coordination
> Reporter: Weijie Guo
> Assignee: Weijie Guo
> Priority: Major
>
> Under the current failover strategy, if a region changes to the failed state,
> all of its downstream regions must be restarted. For ALL_EDGES_BLOCKING jobs,
> this incurs no additional overhead, because they are scheduled stage by stage.
> In hybrid shuffle mode, however, upstream and downstream tasks can run at the
> same time. If an upstream task fails, it should not affect downstream tasks
> that do not consume its output.
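
To make the restart range described above concrete, here is a minimal sketch of how the set of regions to restart could be computed as the transitive downstream closure of a failed region. It is a simplified model and not Flink's actual failover strategy implementation: the class RegionFailoverSketch, the method regionsToRestart, and the region names A to E are all hypothetical, and regions are modelled as plain strings with a producer-to-consumers map.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Simplified, hypothetical model; not Flink's real classes or APIs.
public class RegionFailoverSketch {

    // consumers maps each region to the regions that consume its output.
    // Returns the failed region plus every region reachable downstream of it.
    static Set<String> regionsToRestart(Map<String, List<String>> consumers, String failed) {
        Set<String> restart = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        restart.add(failed);
        queue.add(failed);
        while (!queue.isEmpty()) {
            String region = queue.poll();
            for (String consumer : consumers.getOrDefault(region, List.of())) {
                // add() returns false if the region was already scheduled for restart
                if (restart.add(consumer)) {
                    queue.add(consumer);
                }
            }
        }
        return restart;
    }

    public static void main(String[] args) {
        // A -> C -> E and B -> D are two independent chains.
        Map<String, List<String>> consumers = Map.of(
                "A", List.of("C"),
                "B", List.of("D"),
                "C", List.of("E"));
        System.out.println(regionsToRestart(consumers, "A")); // prints [A, C, E]
    }
}
```

In this toy graph, regions B and D do not consume anything produced by A, so they stay out of the restart set. With stage-by-stage scheduling, C and E would usually not be running yet when A fails; with hybrid shuffle they may already be running, which is why the size of this set matters.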