[ 
https://issues.apache.org/jira/browse/TEZ-776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14316052#comment-14316052
 ] 

Siddharth Seth commented on TEZ-776:
------------------------------------

Elaborating more on what I had mentioned in the first comment about using 
BitSets (Point 1, now Option 3). (Needs validation)

A - events generated by a task (theoretically infinite)
M -Tasks running for a Vertex
N - downstream consumer tasks
K - target indices for each event (theoretically infinite)

Each of the A events can, again in theory :), be routed to any number of input 
indices on any number of downstream tasks (numPhysicalInputs determined by edge 
plugin).

h6. Routing possibilities (Assuming a single edge)
Track A X M events
Each of these events can go to any of the N tasks
For each of the N tasks, the events can go to (infinite) - (in practice M) 
input indices.

This can be visualized as a cube
     - X axis representing individual events
     - Y axis representing targetTasks for the specific event
     - Z axis representing targetIndices within each targetTask for a specific 
event.

That's (M X A) points on X,  N points on Y,  and  K points on Z  - M X A X N X 
K bits would be the theoretical memory overhead of storing this cube.
When implementing this however, there's additional issues like minimal 
addressable word sizes, references, K is not known up front and can vary per 
event; Events are received out of order; so the storage is not that simple.

Examples
Shuffle: A=1 (CompositeEvent), M, N are task counts, K=1 since each event is 
routed to exactly 1 index on each downstream task. #UniqueRoutedEvents = MXN
Broadcast: A=1 (Non-composite), M, N are task counts, K=M - each event routed 
to all downstream tasks.

With a fairly basic implementation, the math on memory usage for a 10K * 1K 
shuffle comes to about 175MB of storage. Am sure this could be optimized 
further, and compressible bitsets may further bring this down (I believe 
RoaringBitmaps is supposed to be pretty good).

This will require edges to have the capability to route CompositeEvents without 
exploding them (which could be problematic) - and runtime expansion of 
CompositeEvents into regular events when a task requests events. Lookups would 
be more expensive then they are today, but not terribly so. The amount of 
routing required remains the same - without having to do a dynamic route for 
each request from a task.


> Reduce AM mem usage caused by storing TezEvents
> -----------------------------------------------
>
>                 Key: TEZ-776
>                 URL: https://issues.apache.org/jira/browse/TEZ-776
>             Project: Apache Tez
>          Issue Type: Sub-task
>            Reporter: Siddharth Seth
>            Assignee: Bikas Saha
>         Attachments: events-problem-solutions.txt
>
>
> This is open ended at the moment.
> A fair chunk of the AM heap is taken up by TezEvents (specifically 
> DataMovementEvents - 64 bytes per event).
> Depending on the connection pattern - this puts limits on the number of tasks 
> that can be processed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to