[
https://issues.apache.org/jira/browse/FLINK-22017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Till Rohrmann updated FLINK-22017:
----------------------------------
Description:
In a topology with cross-region blocking edges, some regions may never be
scheduled. The case is illustrated in the figure below.
!Illustration.jpg!
Let's denote the vertices as v<layer>_<number>. The edge that connects v2_2 and
v3_2 crosses from region 1 to region 2. Since region 1 has no incoming blocking
edges from other regions, it will be scheduled first. When v2_2 is finished,
PipelinedRegionSchedulingStrategy will trigger {{onExecutionStateChange}} for
it.
One would expect region 2 to be scheduled at this point, since all the
partitions it consumes should now be consumable. In fact, region 2 is not
scheduled, because the result partition of v2_2 is not tagged as consumable.
Whether a partition is consumable is determined by its IntermediateDataSet.
However, an IntermediateDataSet is consumable if and only if all the producers
of its IntermediateResultPartitions are finished. This IntermediateDataSet will
never become consumable because v2_3, one of its producers, is never scheduled.
All in all, this forms a deadlock in which a region is never scheduled because
it has not been scheduled.
As a solution, we should let BLOCKING result partitions be consumable
individually. Note that this makes scheduling execution-vertex-wise instead of
stage-wise, with the nice side effect of better resource utilization. The
PipelinedRegionSchedulingStrategy can be simplified along with this change by
getting rid of the correlatedResultPartitions.
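To make the difference between the two consumability rules concrete, here is a
minimal toy model. It is not Flink code: the class and method names are
hypothetical, vertices are plain strings, and it assumes only what is described
above, namely that v2_2 and v2_3 produce partitions of the same
IntermediateDataSet and that v2_3 has not been scheduled.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of the two rules; all names are hypothetical, not Flink APIs. */
class ConsumabilitySketch {

    /** Data-set-level rule: consumable only once every producer of the data set has finished. */
    static boolean consumableByDataSet(List<String> producersOfDataSet, Set<String> finished) {
        return finished.containsAll(producersOfDataSet);
    }

    /** Proposed rule: a BLOCKING partition is consumable once its own producer has finished. */
    static boolean consumableIndividually(String producer, Set<String> finished) {
        return finished.contains(producer);
    }

    public static void main(String[] args) {
        // v2_2 and v2_3 produce partitions of the same IntermediateDataSet.
        List<String> producers = Arrays.asList("v2_2", "v2_3");
        // Region 1 has run, so v2_2 is finished; v2_3 was never scheduled.
        Set<String> finished = new HashSet<>(Arrays.asList("v2_2"));

        // Under the data-set-level rule the partition of v2_2 stays blocked.
        System.out.println("data-set rule:   " + consumableByDataSet(producers, finished));
        // Under the per-partition rule it can be consumed right away.
        System.out.println("individual rule: " + consumableIndividually("v2_2", finished));
    }
}
```

With the data-set-level rule the partition of v2_2 is reported as not
consumable for as long as v2_3 is unfinished, which is the deadlock described
above. With the per-partition rule it becomes consumable as soon as v2_2
finishes.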
was:
For the topology with cross-region blocking edges, there are regions that may
never be scheduled. The case is illustrated in the figure below.
!Illustration.jpg!
Let's denote the vertices with layer_number. It's clear that the edge connects
v2_2 and v3_2 crosses region 1 and region 2. Since region 1 has no blocking
edges connected to other regions, it will be scheduled first. When vertex2_2 is
finished, PipelinedRegionSchedulingStrategy will trigger
{{onExecutionStateChange}} for it.
As expected, region 2 will be scheduled since all its consumer partitions are
consumable. But in fact region 2 won't be scheduled, because the result
partition of vertex2_2 is not tagged as consumable. Whether it is consumable or
not is determined by its IntermediateDataSet.
However, an IntermediateDataSet is consumable if and only if all the producers
of its IntermediateResultPartitions are finished. This IntermediateDataSet will
never be consumable since vertex2_3 is not scheduled. All in all, this forms a
deadlock that a region will never be scheduled because it's not scheduled.
> Regions may never be scheduled when there are cross-region blocking edges
> -------------------------------------------------------------------------
>
> Key: FLINK-22017
> URL: https://issues.apache.org/jira/browse/FLINK-22017
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Coordination
> Affects Versions: 1.11.3, 1.12.2, 1.13.0
> Reporter: Zhilong Hong
> Priority: Major
> Fix For: 1.13.0
>
> Attachments: Illustration.jpg