[
https://issues.apache.org/jira/browse/TEZ-946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Siddharth Seth updated TEZ-946:
-------------------------------
Target Version/s: 0.7.0
> Tez loses buffer-cache performance by running interleaved vertexes
> ------------------------------------------------------------------
>
> Key: TEZ-946
> URL: https://issues.apache.org/jira/browse/TEZ-946
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Gopal V
> Attachments: union-10.svg
>
>
> For a task that has multiple reduce vertexes running to generate UNION ops,
> the current Tez behaviour causes poor buffer-cache performance as well as poor
> performance with the object registry.
> When I ran a large query with multiple reducers pulling data off different
> shuffle edges at the same time, the map spill files were paged in and out of
> the cache.
> Along with this, whenever a map-join vertex is interleaved with a reducer
> vertex, the map-join hashtable gets dropped in the transition.
> It would be beneficial to schedule the vertexes at the same level with
> distinct priorities, so that each one finishes faster thanks to better
> buffer-cache and object-registry hit rates.
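
To illustrate the ordering proposed in the description, here is a minimal sketch. It is not Tez code and does not use any Tez API: the `Task` record, the `schedule` function, and the vertex names are all hypothetical, and ranking vertexes by first appearance is just one assumed way to break ties among vertexes at the same level. The idea shown is only that giving same-level vertexes different priorities makes their tasks run back to back rather than interleaved.

```python
from collections import namedtuple

# Hypothetical task record; none of these names come from Tez itself.
Task = namedtuple("Task", ["vertex", "level", "task_id"])


def priority_key(task, vertex_rank):
    # Lower tuples run first: DAG level, then a per-vertex rank, then task id.
    return (task.level, vertex_rank[task.vertex], task.task_id)


def schedule(tasks):
    # Assumption: rank each vertex by first appearance, so that vertexes at
    # the same level end up with distinct priorities.
    vertex_rank = {}
    for t in tasks:
        vertex_rank.setdefault(t.vertex, len(vertex_rank))
    return sorted(tasks, key=lambda t: priority_key(t, vertex_rank))


# Two reducer vertexes at the same level, submitted interleaved.
interleaved = [
    Task("Reducer 2", 1, 0),
    Task("Reducer 3", 1, 0),
    Task("Reducer 2", 1, 1),
    Task("Reducer 3", 1, 1),
]
ordered = schedule(interleaved)
print([(t.vertex, t.task_id) for t in ordered])
```

With this ordering, all tasks of the first vertex run before any task of the second, so a vertex's spill files and any shared in-memory objects are used in one stretch instead of being evicted between alternating tasks.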