[ 
https://issues.apache.org/jira/browse/TEZ-3222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15667599#comment-15667599
 ] 

Jonathan Eagles commented on TEZ-3222:
--------------------------------------

{quote}
return CompositeEventRouteMetadata.create(1, sourceTaskIndex, 0);
{quote}

This call is only called once per source, this reduces the memory footprint 
while having the same CPU footprint.

{quote}
+message CompositeRoutedDataMovementEventProto {
+  optional int32 source_index = 1;
+  optional int32 target_index = 2;
+  optional int32 count = 3;
+  optional bytes user_payload = 4;
+  optional int32 version = 5;
+}
{quote}

[~bikassaha], I looked at adding a CompositeRouteMeta as proto and having 
CompositeRoutedDataMovementEventProto contain this new structure in place of 
meta fields. There were several downsides of using this technique including 
nearly doubling the cpu needed to use this events. If we can leverage the 
benefits of protobuf and keep them in the top level it will be much easier to 
use and faster. The downside of the evolution of the event can be handled using 
protobuf optional fields. When it becomes too unwieldy, perhaps that is the 
time to redesign. If you are ok with that I would prefer to push that change to 
a later date.

> Reduce messaging overhead for auto-reduce parallelism case
> ----------------------------------------------------------
>
>                 Key: TEZ-3222
>                 URL: https://issues.apache.org/jira/browse/TEZ-3222
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Jonathan Eagles
>            Assignee: Jonathan Eagles
>         Attachments: TEZ-3222.1.patch, TEZ-3222.2.patch, TEZ-3222.3.patch, 
> TEZ-3222.4.patch, TEZ-3222.5.patch, TEZ-3222.6.patch
>
>
> A dag with 15k x 1000k vertex may auto-reduce to 15k x 1. And while the data  
> size is appropriate for 1 task attempt, this results in an increase in task 
> attempt message processing of 1000x.
> This jira aims to reduce the message processing in the auto-reduced task 
> while keeping the amount of message processing in the AM the same or less.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to