[
https://issues.apache.org/jira/browse/FLINK-19994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226799#comment-17226799
]
Zhu Zhu commented on FLINK-19994:
---------------------------------
[~azagrebin] {{computeStronglyConnectedComponents}} does not see iteration
feedback edges, so it does not help here. However, iteration edges are always
PIPELINED, so the head and tail tasks are connected through pipelined edges
and therefore end up in the same region.
I have also run a
[CI|https://dev.azure.com/zhuzh/flink-zz/_build/results?buildId=268&view=results]
for the proposed change and it has passed.
The {{buildOneRegionForAllVertices}} behavior has been there for quite some
time; it dates back to the legacy RestartPipelinedRegionStrategy, which has
already been removed. I am not sure what its original purpose was, but from
what I can see, it is no longer needed.
[~trohrmann]
[This|https://github.com/apache/flink/blob/0523ef6451a93da450c6bdf5dd4757c3702f3962/flink-optimizer/src/main/java/org/apache/flink/optimizer/plantranslate/JobGraphGenerator.java#L1188]
is where edges within an iteration are set to PIPELINED for DataSet jobs.
For streaming jobs, edges are always PIPELINED.
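
To illustrate the argument above, here is a minimal sketch of region grouping, assuming a simplified model in which a region is just the set of vertices connected through PIPELINED edges. It is not the code in {{PipelinedRegionComputeUtil}}; the class {{RegionSketch}}, its methods, and the five-vertex job are hypothetical, and the real implementation works on the execution topology rather than on integer ids. It only shows that a head and tail joined by pipelined edges fall into one region without any special co-location handling, while BLOCKING edges still separate the upstream and downstream vertices.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified model of pipelined region grouping (not Flink code). */
public class RegionSketch {

    enum ResultType { PIPELINED, BLOCKING }

    /** A directed edge between two vertices, identified by integer ids. */
    static final class Edge {
        final int from;
        final int to;
        final ResultType type;

        Edge(int from, int to, ResultType type) {
            this.from = from;
            this.to = to;
            this.type = type;
        }
    }

    /** Union-find lookup with path halving. */
    static int find(int[] parent, int v) {
        while (parent[v] != v) {
            parent[v] = parent[parent[v]];
            v = parent[v];
        }
        return v;
    }

    /**
     * Returns a region id per vertex. Only PIPELINED edges merge regions;
     * BLOCKING edges are ignored, so they form region boundaries.
     */
    static int[] computeRegions(int vertexCount, List<Edge> edges) {
        int[] parent = new int[vertexCount];
        for (int v = 0; v < vertexCount; v++) {
            parent[v] = v;
        }
        for (Edge e : edges) {
            if (e.type == ResultType.PIPELINED) {
                int rootFrom = find(parent, e.from);
                int rootTo = find(parent, e.to);
                parent[rootFrom] = rootTo;
            }
        }
        int[] region = new int[vertexCount];
        for (int v = 0; v < vertexCount; v++) {
            region[v] = find(parent, v);
        }
        return region;
    }

    /** Assumed example job: 0=source, 1=head, 2=body, 3=tail, 4=sink. */
    static List<Edge> exampleEdges() {
        List<Edge> edges = new ArrayList<>();
        edges.add(new Edge(0, 1, ResultType.BLOCKING));   // source -> head
        edges.add(new Edge(1, 2, ResultType.PIPELINED));  // head -> body
        edges.add(new Edge(2, 3, ResultType.PIPELINED));  // body -> tail
        edges.add(new Edge(3, 4, ResultType.BLOCKING));   // tail -> sink
        return edges;
    }

    public static void main(String[] args) {
        int[] region = computeRegions(5, exampleEdges());
        System.out.println("head and tail share a region: " + (region[1] == region[3]));
        System.out.println("source and head share a region: " + (region[0] == region[1]));
    }
}
```

In this model the job splits into three regions (source, the iteration, sink), so the iteration would not need to be deployed before the source's blocking result is finished.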
> All vertices in an DataSet iteration job will be eagerly scheduled
> ------------------------------------------------------------------
>
> Key: FLINK-19994
> URL: https://issues.apache.org/jira/browse/FLINK-19994
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Coordination
> Affects Versions: 1.12.0
> Reporter: Zhu Zhu
> Priority: Blocker
> Fix For: 1.12.0
>
>
> After switching to pipelined region scheduling, all vertices in a DataSet
> iteration job are eagerly scheduled. This means that consumers of BLOCKING
> results can be deployed before the result is finished, which wastes
> resources. This happens because all vertices are put into one pipelined
> region if the job contains a {{ColocationConstraint}}, see
> [PipelinedRegionComputeUtil|https://github.com/apache/flink/blob/c0f382f5f0072441ef8933f6993f1c34168004d6/flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/failover/flip1/PipelinedRegionComputeUtil.java#L52].
> IIUC, this {{makeAllOneRegion()}} behavior was introduced to ensure that the
> co-located iteration head and tail are restarted together in pipelined
> region failover. However, given that edges within an iteration are always
> PIPELINED
> ([ref|https://github.com/apache/flink/blob/0523ef6451a93da450c6bdf5dd4757c3702f3962/flink-optimizer/src/main/java/org/apache/flink/optimizer/plantranslate/JobGraphGenerator.java#L1188]),
> the co-located iteration head and tail will always be in the same region. So I
> think we can drop the {{PipelinedRegionComputeUtil#makeAllOneRegion()}} code
> path and build regions in the same way regardless of whether there are
> co-location constraints.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)