[jira] [Commented] (TEZ-776) Reduce AM mem usage caused by storing TezEvents

Bikas Saha (JIRA) Thu, 05 Mar 2015 20:19:50 -0800

    [ 
https://issues.apache.org/jira/browse/TEZ-776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14349931#comment-14349931
 ]


Bikas Saha commented on TEZ-776:
--------------------------------

Sorry for the delayed response.
MxN for broadcast is a result of not having visibility into the event payload. 
Without that data, it's impossible to avoid. If that data is visible, then 
relevant events can be cached when that makes sense. Broadcast is an example 
where caching is helpful.
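
To illustrate the caching point for broadcast, here is a minimal sketch in 
Java. The names BroadcastEventCache and Event are hypothetical and are not 
Tez APIs. It assumes each of M source tasks produces one event and that every 
one of N destination tasks needs all of them, so M objects are stored and 
shared instead of M x N routed copies.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch, not Tez code: keep one shared copy of each broadcast event.
final class BroadcastEventCache {
    // Hypothetical stand-in for a data movement event.
    static final class Event {
        final int sourceTaskIndex;
        final byte[] payload;

        Event(int sourceTaskIndex, byte[] payload) {
            this.sourceTaskIndex = sourceTaskIndex;
            this.payload = payload;
        }
    }

    private final List<Event> events = new ArrayList<>();

    // Called once per source task output, so M objects are stored in total.
    void add(Event e) {
        events.add(e);
    }

    // The destination index is ignored because broadcast delivers the same
    // events to every destination; callers only track how far they have read.
    List<Event> eventsFor(int destinationTaskIndex, int fromIndex) {
        return Collections.unmodifiableList(events.subList(fromIndex, events.size()));
    }

    int storedCount() {
        return events.size();
    }
}
```

Each destination only needs to remember its own read position (fromIndex), 
which is a single integer per task rather than a list of event copies.
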
I doubt that the CPU overhead of iterating over 1-1 events is going to be 
relevant. Routing over 1-1 may not simply be a single lookup, because attempts 
may fail and get retried, and events need to be iterated over to get to the new 
versions. Unless, of course, some dictionary is being created to look up all 
events generated by a certain task's attempts.
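
The dictionary idea mentioned above could look roughly like the following Java 
sketch. OneToOneEventIndex and Event are hypothetical names, not Tez APIs, and 
the sketch assumes that a higher attempt number always means a newer version 
of a task's output.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not Tez code: index 1-1 events by source task so that
// routing is a single lookup even after attempts are retried.
final class OneToOneEventIndex {
    // Hypothetical stand-in for a data movement event.
    static final class Event {
        final int taskIndex;
        final int attemptNumber;

        Event(int taskIndex, int attemptNumber) {
            this.taskIndex = taskIndex;
            this.attemptNumber = attemptNumber;
        }
    }

    private final Map<Integer, Event> latestByTask = new HashMap<>();

    // Keep only the event from the highest attempt number seen for each task
    // (assumption: a higher attempt number is the newer version).
    void add(Event e) {
        latestByTask.merge(e.taskIndex, e,
                (old, neu) -> neu.attemptNumber >= old.attemptNumber ? neu : old);
    }

    // In a 1-1 edge, destination task i reads from source task i.
    Event lookup(int destinationTaskIndex) {
        return latestByTask.get(destinationTaskIndex);
    }
}
```

The trade-off is the extra map entry per source task, which has to be weighed 
against the cost of iterating over the event list.
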

Past events that haven't yet been routed can be ignored if they are from an 
attempt that has been invalidated via an input-failed event. This can be done by 
the vertex, since it has both the DM events and the input-failed events, which 
can be matched by task attempt ID. There is no need to burden every edge plugin 
writer with this.
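
A minimal Java sketch of that vertex-side matching follows. PendingEventFilter 
and Event are hypothetical names, not Tez APIs, and task attempt IDs are 
modeled as plain strings. Dropping the unrouted events eagerly, as done here, 
is one option; skipping them at routing time would work as well.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch, not Tez code: the vertex discards unrouted events that
// came from an attempt later invalidated by an input-failed event.
final class PendingEventFilter {
    // Hypothetical stand-in for a data movement event.
    static final class Event {
        final String attemptId;

        Event(String attemptId) {
            this.attemptId = attemptId;
        }
    }

    private final Set<String> failedAttemptIds = new HashSet<>();
    private final List<Event> pending = new ArrayList<>();

    // Late events from an attempt already known to be invalid are ignored.
    void addDataMovementEvent(Event e) {
        if (!failedAttemptIds.contains(e.attemptId)) {
            pending.add(e);
        }
    }

    // Match by task attempt ID and drop anything not yet routed.
    void onInputFailed(String attemptId) {
        failedAttemptIds.add(attemptId);
        pending.removeIf(e -> e.attemptId.equals(attemptId));
    }

    // Returns a copy so callers cannot modify the internal list.
    List<Event> pendingEvents() {
        return new ArrayList<>(pending);
    }
}
```
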

Push-based routing needs all those questions answered, and more, which are 
probably orthogonal here.

No. It's test CPU usage, and most of it comes from the central dispatcher under 
load in the simulation. No periodic spikes were observed in the running jobs. 
If there is any other way to measure this, then I am open to suggestions.

The CPU numbers reinforce that the CPU utilization is related to the number 
of events in the inner loop, i.e., if the CPU used in routing is a significant 
fraction of the total CPU in the first place. It's the same code, using old and 
new routing based on the new config.

> Reduce AM mem usage caused by storing TezEvents
> -----------------------------------------------
>
>                 Key: TEZ-776
>                 URL: https://issues.apache.org/jira/browse/TEZ-776
>             Project: Apache Tez
>          Issue Type: Sub-task
>            Reporter: Siddharth Seth
>            Assignee: Bikas Saha
>         Attachments: TEZ-776.ondemand.1.patch, TEZ-776.ondemand.2.patch, 
> TEZ-776.ondemand.3.patch, TEZ-776.ondemand.4.patch, TEZ-776.ondemand.patch, 
> events-problem-solutions.txt
>
>
> This is open ended at the moment.
> A fair chunk of the AM heap is taken up by TezEvents (specifically 
> DataMovementEvents - 64 bytes per event).
> Depending on the connection pattern - this puts limits on the number of tasks 
> that can be processed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
