[ 
https://issues.apache.org/jira/browse/TEZ-776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14325123#comment-14325123
 ] 

Siddharth Seth commented on TEZ-776:
------------------------------------

Comments on the options posted in the doc (apologies for the delay on this).
On option 1 (Fixing the event overhead)
Will the new ConfigureEvent be sent from the VertexManager, or from the edge 
itself based on the routing table previously set up by the VertexManager?
Assuming this event will go out irrespective of auto reduce (once per task), 
it'll be better if Inputs get complete information about routing rather than 
inferring it from the taskIndex. Relying on the taskIndex ends up mixing 
routing with task indices instead of target indices (which are fully routed by 
the Edge).
- I had alluded to this earlier in the post about the bitset approach - routing 
CompositeEvents can be tricky, since theoretically they don't need to go to the 
same target index on each task. What comes out of an EdgeManager routing a 
CompositeEvent can be a complicated matrix - which the Edge would then have to 
simplify.
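
To illustrate the point about CompositeEvents, here is a rough sketch of how a 
single composite event could fan out to different target indices on different 
tasks. Everything in it is hypothetical: the class, the route method, and the 
grouping rule (consecutive partitions merged into one destination task, with a 
smaller last group) are assumptions made for illustration, not the actual 
EdgeManager API or routing logic.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Tez code: shows the "matrix" produced by routing
// one composite event that covers all partitions of one source task.
class CompositeRoutingSketch {
    /** One routed copy: which destination task, and which input slot on it. */
    static final class Route {
        final int sourceIndex;
        final int destTask;
        final int targetIndex;

        Route(int sourceIndex, int destTask, int targetIndex) {
            this.sourceIndex = sourceIndex;
            this.destTask = destTask;
            this.targetIndex = targetIndex;
        }
    }

    /**
     * Assumed rule: partition p goes to destination task p / groupSize.
     * The last task may hold fewer partitions, so its slot numbering differs.
     */
    static List<Route> route(int sourceTask, int numPartitions, int groupSize) {
        List<Route> routes = new ArrayList<>();
        for (int p = 0; p < numPartitions; p++) {
            int destTask = p / groupSize;
            // Number of partitions this destination task actually owns.
            int partsInDest = Math.min(groupSize, numPartitions - destTask * groupSize);
            int targetIndex = sourceTask * partsInDest + (p % groupSize);
            routes.add(new Route(p, destTask, targetIndex));
        }
        return routes;
    }

    public static void main(String[] args) {
        for (Route r : route(3, 5, 2)) {
            System.out.println("sourceIndex=" + r.sourceIndex
                    + " -> task " + r.destTask + ", targetIndex " + r.targetIndex);
        }
    }
}
```

For source task 3 with 5 partitions in groups of 2, the first two destination 
tasks receive the event at target indices 6 and 7, while the last task receives 
it at target index 3, so the result cannot be described by a single target 
index.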

On option 2 (Fixing the reference overhead)
Would this involve the same event being routed MxN times, trading memory for 
CPU overhead?
bq. This index must come from the executor
Is the executor tracking the source events or the exploded routed events ?
bq. This index must come from the executor since network errors may lose message 
and only the executor knows what was its last valid value received.
I believe this is handled via heartbeat numbering. It should be possible to 
track this information completely within the AM - which opens up additional 
possibilities.

As mentioned earlier, tracking these within the Edge itself instead of in the 
Vertex has the advantage of being able to handle obsoletion better at a later 
point. Of course, there's the overhead of maintaining multiple indices (one per 
edge per task instead of one per vertex per task), but that isn't very high.
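
A minimal sketch of what per-edge tracking inside the AM could look like is 
below. The EdgeEventTracker class and its methods are hypothetical names, not 
Tez classes. Strings stand in for events, per-task routing of each event is 
omitted, and the sketch assumes lost responses are detected elsewhere (for 
example via heartbeat numbering).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not Tez code: one tracker per edge, one index per task.
class EdgeEventTracker {
    // Events stored once for this edge, in arrival order.
    private final List<String> events = new ArrayList<>();
    // Per destination task: index of the next event not yet sent to it.
    private final Map<Integer, Integer> nextIndexByTask = new HashMap<>();

    void addEvent(String event) {
        events.add(event);
    }

    /** Returns up to maxEvents unsent events for the task and advances its index. */
    List<String> pollEvents(int taskIndex, int maxEvents) {
        int from = nextIndexByTask.getOrDefault(taskIndex, 0);
        int to = Math.min(events.size(), from + maxEvents);
        nextIndexByTask.put(taskIndex, to);
        return new ArrayList<>(events.subList(from, to));
    }

    /** Number of per-task indices this edge currently maintains. */
    int trackedIndices() {
        return nextIndexByTask.size();
    }

    public static void main(String[] args) {
        EdgeEventTracker edge = new EdgeEventTracker();
        edge.addEvent("e0");
        edge.addEvent("e1");
        edge.addEvent("e2");
        System.out.println(edge.pollEvents(0, 2)); // first heartbeat of task 0
        System.out.println(edge.pollEvents(0, 2)); // next heartbeat of task 0
        System.out.println(edge.pollEvents(1, 10)); // task 1 starts from index 0
    }
}
```

The extra cost relative to a per-vertex index is one integer per edge per task, 
while the event list itself is held once per edge.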




> Reduce AM mem usage caused by storing TezEvents
> -----------------------------------------------
>
>                 Key: TEZ-776
>                 URL: https://issues.apache.org/jira/browse/TEZ-776
>             Project: Apache Tez
>          Issue Type: Sub-task
>            Reporter: Siddharth Seth
>            Assignee: Bikas Saha
>         Attachments: events-problem-solutions.txt
>
>
> This is open ended at the moment.
> A fair chunk of the AM heap is taken up by TezEvents (specifically 
> DataMovementEvents - 64 bytes per event).
> Depending on the connection pattern - this puts limits on the number of tasks 
> that can be processed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
