zhuzhurk commented on a change in pull request #11647: [FLINK-16960][runtime] Add PipelinedRegion interface
URL: https://github.com/apache/flink/pull/11647#discussion_r405984395
 
 

 ##########
 File path: flink-runtime/src/main/java/org/apache/flink/runtime/topology/Topology.java
 ##########
 @@ -39,4 +39,26 @@
         * @return whether the topology contains co-location constraints
         */
        boolean containsCoLocationConstraints();
+
+       /**
+        * Returns all pipelined regions in this topology.
+        *
+        * @return Iterable over pipelined regions in this topology
+        */
+       default Iterable<PipelinedRegion<VID, RID, V, R>> getAllPipelinedRegions() {
 
 Review comment:
   The POC should work, and I think it's fine to maintain two translation 
algorithms, one for logical graphs and one for execution graphs.
   My main concern is the time needed to translate the graph and the GC 
pressure the translation causes, especially for large-scale jobs. 
    - For jobs with hundreds of millions of edges (e.g. a 10000x10000 map 
reduce), it takes tens of seconds to build the ExecutionGraph, and it might 
take tens of seconds more to create the translated graph. 
    - Besides that, hundreds of millions of temporary edge instances must be 
created and held in memory at the same time. This raises the JM memory 
requirement, and an OOM might happen if the JM is not given more memory. It 
may also cause GC issues, since the instances are no longer needed once the 
regions are built.
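
   To put a rough number on the memory concern, here is a back-of-the-envelope 
sketch. The class name `EdgeCostEstimate` and the 24 bytes per temporary edge 
instance are assumptions made for illustration only; they are not taken from 
the PR or measured in Flink. The real size depends on the edge class and the 
JVM settings.

```java
// Rough estimate only: ASSUMED_BYTES_PER_EDGE is a hypothetical figure,
// not a measurement of any Flink class.
final class EdgeCostEstimate {

    // Assumption: object header plus two references, with padding.
    static final long ASSUMED_BYTES_PER_EDGE = 24L;

    /** Number of edges in an all-to-all connection between producers and consumers. */
    static long edgeCount(long producers, long consumers) {
        return Math.multiplyExact(producers, consumers);
    }

    /** Heap needed if every edge is materialized as a temporary object at the same time. */
    static long estimatedBytes(long edges, long bytesPerEdge) {
        return Math.multiplyExact(edges, bytesPerEdge);
    }

    public static void main(String[] args) {
        long edges = edgeCount(10_000, 10_000);
        long bytes = estimatedBytes(edges, ASSUMED_BYTES_PER_EDGE);
        System.out.println("edges=" + edges + ", approx bytes=" + bytes);
    }
}
```

   Under that assumption, the 10000x10000 case alone yields 100,000,000 edges 
and about 2.4 GB of short-lived objects, all of which become garbage once the 
regions are built.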

