[
https://issues.apache.org/jira/browse/TEZ-776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14316052#comment-14316052
]
Siddharth Seth commented on TEZ-776:
------------------------------------
Elaborating more on what I had mentioned in the first comment about using
BitSets (Point 1, now Option 3). (Needs validation)
A - events generated by a task (theoretically infinite)
M -Tasks running for a Vertex
N - downstream consumer tasks
K - target indices for each event (theoretically infinite)
Each of the A events can, again in theory :), be routed to any number of input
indices on any number of downstream tasks (numPhysicalInputs determined by edge
plugin).
h6. Routing possibilities (Assuming a single edge)
Track A X M events
Each of these events can go to any of the N tasks
For each of the N tasks, the events can go to (infinite) - (in practice M)
input indices.
This can be visualized as a cube
- X axis representing individual events
- Y axis representing targetTasks for the specific event
- Z axis representing targetIndices within each targetTask for a specific
event.
That's (M X A) points on X, N points on Y, and K points on Z - M X A X N X
K bits would be the theoretical memory overhead of storing this cube.
When implementing this however, there's additional issues like minimal
addressable word sizes, references, K is not known up front and can vary per
event; Events are received out of order; so the storage is not that simple.
Examples
Shuffle: A=1 (CompositeEvent), M, N are task counts, K=1 since each event is
routed to exactly 1 index on each downstream task. #UniqueRoutedEvents = MXN
Broadcast: A=1 (Non-composite), M, N are task counts, K=M - each event routed
to all downstream tasks.
With a fairly basic implementation, the math on memory usage for a 10K * 1K
shuffle comes to about 175MB of storage. Am sure this could be optimized
further, and compressible bitsets may further bring this down (I believe
RoaringBitmaps is supposed to be pretty good).
This will require edges to have the capability to route CompositeEvents without
exploding them (which could be problematic) - and runtime expansion of
CompositeEvents into regular events when a task requests events. Lookups would
be more expensive then they are today, but not terribly so. The amount of
routing required remains the same - without having to do a dynamic route for
each request from a task.
> Reduce AM mem usage caused by storing TezEvents
> -----------------------------------------------
>
> Key: TEZ-776
> URL: https://issues.apache.org/jira/browse/TEZ-776
> Project: Apache Tez
> Issue Type: Sub-task
> Reporter: Siddharth Seth
> Assignee: Bikas Saha
> Attachments: events-problem-solutions.txt
>
>
> This is open ended at the moment.
> A fair chunk of the AM heap is taken up by TezEvents (specifically
> DataMovementEvents - 64 bytes per event).
> Depending on the connection pattern - this puts limits on the number of tasks
> that can be processed.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)