[ 
https://issues.apache.org/jira/browse/TEZ-776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315964#comment-14315964
 ] 

Siddharth Seth commented on TEZ-776:
------------------------------------

Here are some numbers on current memory usage. They assume that Java treats 
references as pointers and that each one occupies 4 bytes. That's not 
necessarily valid, but the numbers add up nicely to what I found in a heap 
dump when the jira was created: 64 bytes per event (plus some more now for 
ByteBuffer instead of byte[]).
{code}
TezEvent -> Total of 16 references. + self -> 17 references -> 68 bytes
     EventType (Enum) - 1
     Event - 7
     EventMetaData source - 4
     EventMetaData dest - 4

DataMovementEvent - 4 references + ByteBuffer overhead (assume 3) - Total 7
     int sourceIndex
     int targetIndex
     ByteBuffer payload
     int version

EventMetaData     - 4 references
     EventProducerConsumerType - enum
     String taskVertexName
     String edgeVertexName
     TezTaskAttemptId taskAttemptId
{code}
Let M be the number of source tasks and N the number of destination tasks.
For the ScatterGather case, there are additional overheads of N lists and the 
M events themselves, each of which has a ~100 byte payload. 

For a 10K * 1K shuffle, there are M x N unique events. The events themselves 
add up to 648MB, not considering overheads.
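
As a rough check of the figure above, here is a minimal sketch of the 
arithmetic. It assumes the 4 bytes per reference and the 17 references per 
TezEvent from the breakdown; the class and method names are made up for 
illustration and are not part of Tez.

```java
public class TezEventMemoryEstimate {
    // Assumption taken from the comment: every reference costs 4 bytes.
    static final long BYTES_PER_REFERENCE = 4;
    // 16 references counted for a TezEvent, plus 1 for the object itself.
    static final long REFERENCES_PER_TEZ_EVENT = 16 + 1;

    /** Estimated size of one TezEvent under the assumptions above. */
    public static long bytesPerEvent() {
        return REFERENCES_PER_TEZ_EVENT * BYTES_PER_REFERENCE;
    }

    /** Estimated size of M x N unique events, ignoring lists and payloads. */
    public static long totalEventBytes(long sourceTasks, long destinationTasks) {
        return sourceTasks * destinationTasks * bytesPerEvent();
    }

    /** Whole mebibytes (1024 * 1024 bytes), rounded down. */
    public static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        long total = totalEventBytes(10_000, 1_000);
        System.out.println(bytesPerEvent() + " bytes per event");
        System.out.println(total + " bytes total, about " + toMiB(total) + " MiB");
    }
}
```

With M = 10K and N = 1K this gives 68 bytes per event and 680,000,000 bytes in 
total, which is about 648 when expressed in units of 1024 * 1024 bytes.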

For Broadcast, the number of events stored continues to be N x M. However, it's 
far more efficient since the number of unique events is only M. The overhead 
from storing at the task level remains the same, though.
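
To put the two connection patterns side by side, here is a small sketch under 
the same 4-bytes-per-reference and 68-bytes-per-TezEvent assumptions. It 
further assumes that storing an event at the task level costs one reference 
slot per stored event; that is a simplification for illustration, not 
something measured. All names are hypothetical.

```java
public class ConnectionPatternEstimate {
    // Assumptions: 4 bytes per reference, 17 references (68 bytes) per TezEvent.
    static final long BYTES_PER_REFERENCE = 4;
    static final long BYTES_PER_TEZ_EVENT = 68;

    /** Unique event objects: M for Broadcast, M x N for ScatterGather. */
    public static long uniqueEvents(boolean broadcast, long m, long n) {
        return broadcast ? m : m * n;
    }

    /** Bytes held by the unique event objects alone. */
    public static long uniqueEventBytes(boolean broadcast, long m, long n) {
        return uniqueEvents(broadcast, m, n) * BYTES_PER_TEZ_EVENT;
    }

    /**
     * Task-level storage, assumed here to be one reference slot per stored
     * event. Both patterns store M x N events, so this part is identical.
     */
    public static long taskLevelReferenceBytes(long m, long n) {
        return m * n * BYTES_PER_REFERENCE;
    }

    public static void main(String[] args) {
        long m = 10_000, n = 1_000;
        System.out.println("ScatterGather unique event bytes: "
                + uniqueEventBytes(false, m, n));
        System.out.println("Broadcast unique event bytes:     "
                + uniqueEventBytes(true, m, n));
        System.out.println("Task-level reference bytes (both): "
                + taskLevelReferenceBytes(m, n));
    }
}
```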

Shuffle forces a new DataMovementEvent to be created to update the targetIndex. 
It forces a new TezEvent to be created to hold this new DataMovementEvent.
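
The copying described above can be pictured roughly as follows. This is only 
a sketch: the two nested classes are simplified, hypothetical stand-ins and 
not the real Tez classes, and sharing the payload buffer between copies is an 
assumption made for the example.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

public class ShuffleRoutingSketch {
    /** Hypothetical stand-in for DataMovementEvent with the four listed fields. */
    public static final class SketchDataMovementEvent {
        public final int sourceIndex;
        public final int targetIndex;
        public final ByteBuffer payload;
        public final int version;

        public SketchDataMovementEvent(int sourceIndex, int targetIndex,
                                       ByteBuffer payload, int version) {
            this.sourceIndex = sourceIndex;
            this.targetIndex = targetIndex;
            this.payload = payload;
            this.version = version;
        }
    }

    /** Hypothetical stand-in for the TezEvent wrapper (metadata omitted). */
    public static final class SketchTezEvent {
        public final SketchDataMovementEvent event;

        public SketchTezEvent(SketchDataMovementEvent event) {
            this.event = event;
        }
    }

    /**
     * Routes one source event to every destination. Because targetIndex
     * differs per destination, each destination gets a new inner event and a
     * new wrapper; only the payload buffer is shared (an assumption here).
     */
    public static List<SketchTezEvent> routeToAllDestinations(
            SketchDataMovementEvent original, int numDestinations) {
        List<SketchTezEvent> routed = new ArrayList<>(numDestinations);
        for (int target = 0; target < numDestinations; target++) {
            SketchDataMovementEvent copy = new SketchDataMovementEvent(
                    original.sourceIndex, target, original.payload, original.version);
            routed.add(new SketchTezEvent(copy));
        }
        return routed;
    }

    public static void main(String[] args) {
        SketchDataMovementEvent original =
                new SketchDataMovementEvent(0, -1, ByteBuffer.allocate(100), 0);
        List<SketchTezEvent> routed = routeToAllDestinations(original, 3);
        System.out.println("Wrapper objects created: " + routed.size());
    }
}
```

One source event routed to N destinations therefore yields N wrapper objects 
and N inner events, which is where the M x N unique events come from.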



> Reduce AM mem usage caused by storing TezEvents
> -----------------------------------------------
>
>                 Key: TEZ-776
>                 URL: https://issues.apache.org/jira/browse/TEZ-776
>             Project: Apache Tez
>          Issue Type: Sub-task
>            Reporter: Siddharth Seth
>            Assignee: Bikas Saha
>         Attachments: events-problem-solutions.txt
>
>
> This is open ended at the moment.
> A fair chunk of the AM heap is taken up by TezEvents (specifically 
> DataMovementEvents - 64 bytes per event).
> Depending on the connection pattern - this puts limits on the number of tasks 
> that can be processed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
