[ https://issues.apache.org/jira/browse/TEZ-946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hitesh Shah updated TEZ-946:
----------------------------
    Target Version/s: 0.8.0  (was: 0.7.0)

> Tez loses buffer-cache performance by running interleaved vertexes
> ------------------------------------------------------------------
>
>                 Key: TEZ-946
>                 URL: https://issues.apache.org/jira/browse/TEZ-946
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Gopal V
>         Attachments: union-10.svg
>
>
> For a task which has multiple reduce vertexes running to generate UNION ops,
> the current Tez behaviour causes poor buffer-cache performance as well as
> poor performance with the object registry.
> When I ran a large query in which multiple reducers pulled data off
> different shuffle edges at the same time, the map spill files were paged in
> and out of the cache.
> Along with this, whenever a map-join vertex is interleaved with a reducer
> vertex, the map-join hashtable gets dropped in the transition.
> It would be beneficial to schedule the vertexes at the same level with some
> priority so that we finish them faster through better buffer-cache hit-rate
> and object-registry hit-rate.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
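
The scheduling idea proposed in the issue description (run vertexes that sit at the same DAG level one after another instead of interleaving them) can be illustrated with a small sketch. This is not Tez code: `LevelPrioritySketch`, `VertexInfo`, and `scheduleOrder` are hypothetical names, and breaking ties within a level by vertex name is an assumption made only to get a deterministic order, since the issue does not say how the per-level priority should be chosen.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class LevelPrioritySketch {

    /** Hypothetical stand-in for a vertex; this is not a Tez class. */
    static final class VertexInfo {
        final String name;
        final int level; // distance from the DAG root (0 = root vertexes)

        VertexInfo(String name, int level) {
            this.name = name;
            this.level = level;
        }
    }

    /**
     * Returns vertex names in a strict order: lower levels first, and within
     * one level a fixed order (here: by name, an arbitrary assumption).
     * A scheduler that drained vertexes in this order would finish one
     * same-level vertex before starting the next instead of interleaving them.
     */
    static List<String> scheduleOrder(List<VertexInfo> vertices) {
        List<VertexInfo> sorted = new ArrayList<>(vertices);
        sorted.sort(Comparator
                .comparingInt((VertexInfo v) -> v.level)
                .thenComparing(v -> v.name));

        List<String> names = new ArrayList<>();
        for (VertexInfo v : sorted) {
            names.add(v.name);
        }
        return names;
    }

    public static void main(String[] args) {
        // Made-up DAG shape, loosely modelled on a UNION with two reducers.
        List<VertexInfo> dag = Arrays.asList(
                new VertexInfo("Reducer 4", 1),
                new VertexInfo("Map 2", 0),
                new VertexInfo("Reducer 3", 1),
                new VertexInfo("Map 1", 0),
                new VertexInfo("Reducer 5", 2));
        System.out.println(scheduleOrder(dag));
    }
}
```

In this sketch "Reducer 3" and "Reducer 4" share a level but are never given the same rank, which is the property the issue asks for. Whether a real scheduler should order by name, by input size, or by something else is left open.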