[ 
https://issues.apache.org/jira/browse/TEZ-946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated TEZ-946:
----------------------------
    Target Version/s: 0.8.0  (was: 0.7.0)

> Tez loses buffer-cache performance by running interleaved vertexes
> ------------------------------------------------------------------
>
>                 Key: TEZ-946
>                 URL: https://issues.apache.org/jira/browse/TEZ-946
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Gopal V
>         Attachments: union-10.svg
>
>
> For a task which has multiple reduce vertexes running to generate UNION ops, 
> the current Tez behaviour causes poor buffer-cache performance as well as poor 
> hit rates in the object registry.
> When I ran a large query in which multiple reducers pulled data off different 
> shuffle edges at the same time, the map spill files were repeatedly paged in 
> and out of the buffer cache.
> Along with this, whenever a map-join vertex is interleaved with a reducer 
> vertex, the map-join hashtable gets dropped in the transition.
> It would be beneficial to schedule the vertexes at the same level with some 
> priority so that we finish them faster through better buffer-cache hit-rate 
> and object-registry hit-rate.
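
The scheduling idea in the last paragraph can be illustrated with a small
sketch. It is not Tez code and does not reflect how the Tez scheduler is
implemented; the function names, the vertex names, and the tie-break rule
(declaration order) are all assumptions made for illustration. The sketch
computes each vertex's level in the DAG, gives every vertex a distinct
priority so that vertexes at the same level no longer tie, and then orders
pending tasks so that one vertex's tasks run back to back instead of
interleaving with a sibling's.

```python
from collections import defaultdict


def vertex_levels(edges, vertices):
    """Return {vertex: level}; level is the longest path from any source vertex."""
    parents = defaultdict(list)
    for src, dst in edges:
        parents[dst].append(src)

    levels = {}

    def depth(v):
        # Source vertexes have no parents and end up at level 0.
        if v not in levels:
            levels[v] = 1 + max((depth(p) for p in parents[v]), default=-1)
        return levels[v]

    for v in vertices:
        depth(v)
    return levels


def assign_priorities(edges, vertices):
    """Give every vertex a distinct priority (lower runs first).

    Vertexes are ordered by level, then by declaration order. The tie-break
    is an assumption of this sketch; any stable rule would do.
    """
    levels = vertex_levels(edges, vertices)
    ordered = sorted(vertices, key=lambda v: (levels[v], vertices.index(v)))
    return {v: rank for rank, v in enumerate(ordered)}


def schedule(tasks, priorities):
    """Order (vertex, task_id) pairs so each vertex's tasks run contiguously."""
    return sorted(tasks, key=lambda t: (priorities[t[0]], t[1]))


# Hypothetical DAG: two map/reduce branches feeding one union vertex.
vertices = ["Map 1", "Map 2", "Reducer 3", "Reducer 4", "Union 5"]
edges = [
    ("Map 1", "Reducer 3"),
    ("Map 2", "Reducer 4"),
    ("Reducer 3", "Union 5"),
    ("Reducer 4", "Union 5"),
]
priorities = assign_priorities(edges, vertices)

# Tasks arriving interleaved, as in the behaviour described above.
interleaved = [("Reducer 4", 0), ("Reducer 3", 0), ("Reducer 4", 1), ("Reducer 3", 1)]
ordered_tasks = schedule(interleaved, priorities)
```

With this ordering, all "Reducer 3" tasks are placed before any "Reducer 4"
task, so only one shuffle edge's spill files would compete for the buffer
cache at a time.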



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
