[ 
https://issues.apache.org/jira/browse/TEZ-776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14344654#comment-14344654
 ] 

Bikas Saha commented on TEZ-776:
--------------------------------

The edge manager implementations are precisely that. Impls of interfaces so 
that routing logic can be captured in code rather than as routing tables in 
memory. And thats what all impls to date do. They have logic in code to 
generate the routing data on the fly instead of encoding it in some memory 
efficient manner. We are probably unnecessarily going around in circles about 
this point of efficiently storing routing tables by smart edge manager plugins. 
Thats already the case.
Lets look at option 4 and on demand routing (ODR in the attached patch). 
Neither stores the exploded data movement events. In option 4. edge plugins are 
made responsible for event storage and implement an API (say. getNextEvents()) 
to get exploded events for a task while maintaining an index for that tasks 
last read index into the event array. In ODR, the vertex stores the events and 
gets routing data from the edge plugins when a task asks for events. In both 
cases, the exploded events are not stored (to reduce memory pressure) and so 
they must be routed to the requesting task. Each of the MxN DM events have 
unique source+target indices and thus there is no caching/reuse possible. The 
CPU needed would be similar in both cases because the inner loop (that consumes 
CPU) consists of creating these events and setting the routing info on them. 
You can see that from the yourkit profiles. The simulation in the patch 
essentially captures the essence of 4 since there is only 1 edge and hence all 
events are iterated through and exploded by the edge plugin. So the difference 
essentially lies in whether the burden of event storage etc. lies in the 
framework or if pushed to the user. ODR is keeping it within the framework 
because that is essentially the frameworks responsibility.
Another important correctness issue came up during the review of the tez paper 
submitted to sigmod. User payloads capture user metadata (and potentially data) 
and thus from a security correctness point of view its essential to limit who 
has access to the payloads. Inputs and outputs communicate via the payloads and 
need access for correctness. But nothing else should. Edge plugins are user 
code. Some written by Tez developers, some by others. Using third party plugins 
to route events should not mean that they have access to the user payloads of 
those events. That would be a potential security hole.
InputFailedEvents need routing info from the edge plugin mainly for the case 
when DM events have been sent to the consumers and they need to be told which 
one of them may now be bad. However, plugins are not needed to not send dm 
events to future consumers after the receipt of an input failed event. An input 
failed event essentially captures the lost task-attempt. The framework can skip 
dm events generated by failed task-attempts (for future consumers) after 
receiving failed task-attempt info via an input failed event. We can make that 
change independently today. Again, its best to make the framework support this 
for all instead of making it a burden for every edge plugin implementation.
Routing events as soon as they arrive would be covered by on-demand-routing 
since essentially it is on demand routing of events upon arrival. So the APIs 
in the edge plugins support that use case. How the AM would end up actually 
sending the event onward based on the routing info would depend on the 
implementation of push based routing and is probably orthogonal to being able 
to capture the routing info on a fine grained basis.

> Reduce AM mem usage caused by storing TezEvents
> -----------------------------------------------
>
>                 Key: TEZ-776
>                 URL: https://issues.apache.org/jira/browse/TEZ-776
>             Project: Apache Tez
>          Issue Type: Sub-task
>            Reporter: Siddharth Seth
>            Assignee: Bikas Saha
>         Attachments: TEZ-776.ondemand.1.patch, TEZ-776.ondemand.patch, 
> events-problem-solutions.txt
>
>
> This is open ended at the moment.
> A fair chunk of the AM heap is taken up by TezEvents (specifically 
> DataMovementEvents - 64 bytes per event).
> Depending on the connection pattern - this puts limits on the number of tasks 
> that can be processed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to