[ 
https://issues.apache.org/jira/browse/TEZ-3217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15246562#comment-15246562
 ] 

Jonathan Eagles commented on TEZ-3217:
--------------------------------------

[~jlowe] and I are experimenting with a design for enhanced DMEs that could 
reduce the number of messages sent to the downstream task. [~sseth], is there 
an existing effort along these lines? Our use case is to compensate for the 
increased number of messages that an auto-reduce parallelism "reducer" has to 
process.

> Optimize DataMovementEvent routing between AM and downstream vertex
> -------------------------------------------------------------------
>
>                 Key: TEZ-3217
>                 URL: https://issues.apache.org/jira/browse/TEZ-3217
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Ming Ma
>
> Following up on the TEZ-3206 discussion, we might be able to optimize DME 
> routing from the AM to the downstream vertex, mostly for the auto-parallelism case.
> * A DME's empty partition payload lists all empty partitions from a specific 
> mapper. A reducer only cares about the partitions it is responsible for, not 
> partitions belonging to other reducers. Perhaps we can optimize the AM to send 
> only the relevant empty partitions to each reducer.
> * Instead of sending one DME to a given reducer at a time, the AM could batch 
> all DMEs belonging to a given (mapper, reducer) pair with the common empty 
> partition payload, similar to CompositeDataMovementEvent.
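
The two ideas in the description above can be sketched roughly as follows. This 
is only an illustration and not Tez code: `BatchedEvent` and `route` are 
hypothetical names, and the sketch assumes that auto-reduce parallelism assigns 
each reducer a contiguous, near-equal range of the original partitions.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Set


@dataclass(frozen=True)
class BatchedEvent:
    """Hypothetical batched event covering one (mapper, reducer) pair."""
    mapper: int
    reducer: int
    start_partition: int              # first partition owned by this reducer
    count: int                        # number of consecutive partitions covered
    empty_partitions: FrozenSet[int]  # only the empty partitions inside that range


def route(mapper: int, empty_partitions: Set[int],
          num_partitions: int, num_reducers: int) -> List[BatchedEvent]:
    """Build one event per reducer instead of one event per partition.

    Assumption: partitions are split into contiguous ranges, with the first
    (num_partitions % num_reducers) reducers taking one extra partition.
    """
    base, extra = divmod(num_partitions, num_reducers)
    events = []
    start = 0
    for reducer in range(num_reducers):
        count = base + (1 if reducer < extra else 0)
        owned = range(start, start + count)
        # Keep only the empty partitions this reducer is responsible for.
        relevant = frozenset(p for p in empty_partitions if p in owned)
        events.append(BatchedEvent(mapper, reducer, start, count, relevant))
        start += count
    return events
```

For a mapper with 6 partitions and 2 reducers, `route(0, {1, 4, 5}, 6, 2)` 
produces two events rather than six, and each event carries only the empty 
partitions that fall inside its own range.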



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
