[ 
https://issues.apache.org/jira/browse/TEZ-776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14001987#comment-14001987
 ] 

Siddharth Seth commented on TEZ-776:
------------------------------------

On pushing logic into plugins to reduce memory utilization: a plugin knows 
best how its events are routed, so it doesn't need to store additional state 
about which tasks an event needs to be routed to. Broadcast, for example, just 
needs to track all events and maintain indices; it doesn't need to apply any 
checks when deciding whether an event goes to a task.
A generic solution, which has to be implemented anyway, will have to store 
information about which tasks an event goes to, and will likely check this 
list each time it needs to decide whether an event goes to a task. It can 
obviously include some optimizations for the case where an event is to be 
routed to all downstream tasks.
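
To make the contrast concrete, here is a rough sketch of the two approaches. 
The class names are hypothetical and are not part of the Tez API, and a 
String stands in for a TezEvent. It only illustrates the extra per-event 
state and the per-lookup check that the generic path carries.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Hypothetical broadcast router: every event goes to every task. */
final class BroadcastRouter {
    private final List<String> events = new ArrayList<>();
    // One index per downstream task: position of the next event to send.
    private final int[] nextIndex;

    BroadcastRouter(int numTasks) {
        nextIndex = new int[numTasks];
    }

    void addEvent(String event) {
        events.add(event);
    }

    /** Returns the events this task has not seen yet; no per-event check. */
    List<String> pendingFor(int task) {
        List<String> out =
            new ArrayList<>(events.subList(nextIndex[task], events.size()));
        nextIndex[task] = events.size();
        return out;
    }
}

/** Hypothetical generic router: stores the target tasks for each event. */
final class GenericRouter {
    private final List<String> events = new ArrayList<>();
    // Extra state: one set of target task indices per stored event.
    private final List<BitSet> targets = new ArrayList<>();
    private final int[] nextIndex;

    GenericRouter(int numTasks) {
        nextIndex = new int[numTasks];
    }

    void addEvent(String event, BitSet targetTasks) {
        events.add(event);
        targets.add((BitSet) targetTasks.clone());
    }

    /** Scans the unseen events and checks each one against its target set. */
    List<String> pendingFor(int task) {
        List<String> out = new ArrayList<>();
        for (int i = nextIndex[task]; i < events.size(); i++) {
            if (targets.get(i).get(task)) {
                out.add(events.get(i));
            }
        }
        nextIndex[task] = events.size();
        return out;
    }
}
```

In this sketch the broadcast case keeps only an int per task, while the 
generic case also keeps a target set per event and consults it on every 
lookup.
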
On storing events in the plugins vs the Vertex itself: it's far easier to 
control obsoletion and transient events, which are required by TEZ-1094, when 
all relevant events are in a single place rather than mixed together, which 
would likely be the case when storing them in the Vertex.

> Reduce AM mem usage caused by storing TezEvents
> -----------------------------------------------
>
>                 Key: TEZ-776
>                 URL: https://issues.apache.org/jira/browse/TEZ-776
>             Project: Apache Tez
>          Issue Type: Sub-task
>            Reporter: Siddharth Seth
>            Assignee: Siddharth Seth
>
> This is open ended at the moment.
> A fair chunk of the AM heap is taken up by TezEvents (specifically 
> DataMovementEvents - 64 bytes per event).
> Depending on the connection pattern - this puts limits on the number of tasks 
> that can be processed.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
