Zhu Zhu created FLINK-19994:
-------------------------------

             Summary: All vertices in a DataSet iteration job will be eagerly 
scheduled
                 Key: FLINK-19994
                 URL: https://issues.apache.org/jira/browse/FLINK-19994
             Project: Flink
          Issue Type: Bug
          Components: Runtime / Coordination
    Affects Versions: 1.12.0
            Reporter: Zhu Zhu
             Fix For: 1.12.0


After switching to pipelined region scheduling, all vertices in a DataSet 
iteration job are eagerly scheduled. As a result, consumers of BLOCKING results 
can be deployed before the result is finished, which wastes resources. 
This happens because all vertices are put into one pipelined region if the job 
contains a {{ColocationConstraint}}, see 
[PipelinedRegionComputeUtil|https://github.com/apache/flink/blob/c0f382f5f0072441ef8933f6993f1c34168004d6/flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/failover/flip1/PipelinedRegionComputeUtil.java#L52].

IIUC, this {{makeAllOneRegion()}} behavior was introduced to ensure that a 
co-located iteration head and tail are restarted together in pipelined region failover. 
However, given that edges within an iteration will always be PIPELINED 
([ref|https://github.com/apache/flink/blob/0523ef6451a93da450c6bdf5dd4757c3702f3962/flink-optimizer/src/main/java/org/apache/flink/optimizer/plantranslate/JobGraphGenerator.java#L1188]),
 a co-located iteration head and tail will always be in the same region. So I 
think we can drop the {{PipelinedRegionComputeUtil#makeAllOneRegion()}} code 
path and build regions in the same way regardless of whether there are 
co-location constraints.
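
To illustrate the argument, here is a minimal, self-contained sketch of region building that merges only vertices connected by PIPELINED edges. It does not use Flink's actual classes: {{RegionSketch}}, {{Edge}}, {{ResultType}}, {{computeRegions()}} and the five-vertex topology are all hypothetical names and assumptions made for this example. It is meant to show that a head and tail linked through PIPELINED edges land in the same region without any special handling, while vertices separated by BLOCKING edges stay in their own regions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not Flink code: groups vertices into pipelined regions
// by merging the endpoints of every PIPELINED edge with a union-find structure.
final class RegionSketch {

    enum ResultType { PIPELINED, BLOCKING }

    /** Hypothetical edge between two vertex indices. */
    static final class Edge {
        final int source;
        final int target;
        final ResultType type;

        Edge(int source, int target, ResultType type) {
            this.source = source;
            this.target = target;
            this.type = type;
        }
    }

    // Union-find lookup with path halving.
    private static int find(int[] parent, int v) {
        while (parent[v] != v) {
            parent[v] = parent[parent[v]];
            v = parent[v];
        }
        return v;
    }

    /** Returns one region id per vertex; vertices joined by PIPELINED edges share an id. */
    static int[] computeRegions(int vertexCount, List<Edge> edges) {
        int[] parent = new int[vertexCount];
        for (int i = 0; i < vertexCount; i++) {
            parent[i] = i;
        }
        for (Edge e : edges) {
            // BLOCKING edges never merge regions; only PIPELINED edges do.
            if (e.type == ResultType.PIPELINED) {
                parent[find(parent, e.source)] = find(parent, e.target);
            }
        }
        int[] region = new int[vertexCount];
        for (int i = 0; i < vertexCount; i++) {
            region[i] = find(parent, i);
        }
        return region;
    }

    /**
     * Assumed example topology:
     * 0 = source, 1 = iteration head, 2 = iteration body, 3 = iteration tail, 4 = sink.
     */
    static List<Edge> exampleEdges() {
        List<Edge> edges = new ArrayList<>();
        edges.add(new Edge(0, 1, ResultType.BLOCKING));  // input to the iteration
        edges.add(new Edge(1, 2, ResultType.PIPELINED)); // inside the iteration
        edges.add(new Edge(2, 3, ResultType.PIPELINED)); // inside the iteration
        edges.add(new Edge(1, 4, ResultType.BLOCKING));  // iteration output
        return edges;
    }

    public static void main(String[] args) {
        int[] region = computeRegions(5, exampleEdges());
        System.out.println("region ids: " + Arrays.toString(region));
        System.out.println("head and tail share a region: " + (region[1] == region[3]));
        System.out.println("sink shares the head's region: " + (region[4] == region[1]));
    }
}
```

In this sketch the source and the sink each form their own region, so a region-based scheduler could wait for the BLOCKING results before deploying their consumers.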



--
This message was sent by Atlassian Jira
(v8.3.4#803005)