damccorm opened a new issue, #20889:
URL: https://github.com/apache/beam/issues/20889

   The current Java DirectRunner doesn't fuse transforms into steps. Instead, 
it executes each transform almost one at a time, which causes memory pressure 
when any transform is high-fanout.
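
   To illustrate the memory concern, here is a minimal toy sketch in plain Java. It is not Beam or DirectRunner code: `FusionSketch`, `expand`, `runUnfused`, and `runFused` are hypothetical names, and the model (materialize the full intermediate output versus stream it per input element) is a simplifying assumption about what "unfused" and "fused" execution look like.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: hypothetical names, not part of Beam or the DirectRunner.
class FusionSketch {

    // Hypothetical high-fanout transform: each input yields `fanout` outputs.
    static List<Integer> expand(int element, int fanout) {
        List<Integer> out = new ArrayList<>(fanout);
        for (int i = 0; i < fanout; i++) {
            out.add(element * fanout + i);
        }
        return out;
    }

    // Unfused: the whole output of `expand` is buffered before the
    // downstream step runs. Returns {result, peak buffered elements}.
    static long[] runUnfused(List<Integer> inputs, int fanout) {
        List<Integer> intermediate = new ArrayList<>();
        for (int in : inputs) {
            intermediate.addAll(expand(in, fanout));
        }
        long peak = intermediate.size();
        long sum = 0;
        for (int v : intermediate) {
            sum += v % 7; // stand-in for the downstream transform
        }
        return new long[] {sum, peak};
    }

    // Fused: the downstream step consumes each input's outputs right away,
    // so only one input's fanout is buffered at a time.
    static long[] runFused(List<Integer> inputs, int fanout) {
        long sum = 0;
        long peak = 0;
        for (int in : inputs) {
            List<Integer> chunk = expand(in, fanout);
            peak = Math.max(peak, chunk.size());
            for (int v : chunk) {
                sum += v % 7;
            }
        }
        return new long[] {sum, peak};
    }

    public static void main(String[] args) {
        List<Integer> inputs = List.of(0, 1, 2, 3);
        long[] unfused = runUnfused(inputs, 3);
        long[] fused = runFused(inputs, 3);
        System.out.println("unfused: result=" + unfused[0] + " peak=" + unfused[1]);
        System.out.println("fused:   result=" + fused[0] + " peak=" + fused[1]);
    }
}
```

   In this toy model both variants compute the same result, but the unfused peak grows with the number of inputs times the fanout, while the fused peak is bounded by the fanout of a single input.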
   
   We already have simple fusion logic in the Java SDK 
(https://github.com/apache/beam/blob/master/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/graph/GreedyPipelineFuser.java).
 The remaining work here might be:
   * Apply this fusion in the DirectRunner
   * Change the DirectRunner to be able to run the fused steps.
   
   I understand that the DirectRunner isn't expected to process large volumes 
of data, and that changing DirectRunner execution might be a fair amount of work.
   
   Imported from Jira 
[BEAM-12335](https://issues.apache.org/jira/browse/BEAM-12335). Original Jira 
may contain additional context.
   Reported by: boyuanz.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
