[jira] [Updated] (TEZ-946) Tez loses buffer-cache performance by running interleaved vertexes
[ https://issues.apache.org/jira/browse/TEZ-946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hitesh Shah updated TEZ-946:
---------------------------
    Target Version/s: 0.8.0  (was: 0.7.0)

Tez loses buffer-cache performance by running interleaved vertexes
------------------------------------------------------------------

                Key: TEZ-946
                URL: https://issues.apache.org/jira/browse/TEZ-946
            Project: Apache Tez
         Issue Type: Bug
           Reporter: Gopal V
        Attachments: union-10.svg

For a task which has multiple reduce vertexes running to generate UNION ops, the current Tez behaviour causes poor buffer-cache performance as well as poor performance with the object registry.

When I ran a large query in which multiple reducers pulled data off different shuffle edges at the same time, the map spill files were paged in and out of the cache.

In addition, whenever a map-join vertex is interleaved with a reducer vertex, the map-join hashtable gets dropped in the transition.

It would be beneficial to schedule the vertexes at the same level with some priority, so that they finish faster through a better buffer-cache hit rate and object-registry hit rate.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (TEZ-946) Tez loses buffer-cache performance by running interleaved vertexes
[ https://issues.apache.org/jira/browse/TEZ-946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siddharth Seth updated TEZ-946:
------------------------------
    Target Version/s: 0.7.0
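
The change proposed in the TEZ-946 description (scheduling vertexes at the same level with some priority, rather than interleaving them) could be sketched roughly as below. This is only an illustration under stated assumptions, not Tez code: the `Vertex` class, the `assignPriorities` helper, the use of vertex name as a tie-breaker, and the "lower number runs first" convention are all hypothetical and do not come from the issue or from the Tez API.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class LevelPriority {

    /** Hypothetical stand-in for a DAG vertex: a name plus its depth (level) in the DAG. */
    static final class Vertex {
        final String name;
        final int level;

        Vertex(String name, int level) {
            this.name = name;
            this.level = level;
        }
    }

    /**
     * Gives every vertex a distinct priority (lower number = scheduled first).
     * Vertexes are ordered by level, and vertexes that share a level are
     * ordered by name, so a scheduler honouring these priorities would drain
     * one vertex before starting its sibling instead of interleaving them.
     * The name-based tie-breaker is an arbitrary assumption for illustration.
     */
    static Map<String, Integer> assignPriorities(List<Vertex> vertices) {
        List<Vertex> ordered = new ArrayList<>(vertices);
        ordered.sort(Comparator
                .comparingInt((Vertex v) -> v.level)
                .thenComparing(v -> v.name));

        Map<String, Integer> priorities = new LinkedHashMap<>();
        int next = 0;
        for (Vertex v : ordered) {
            priorities.put(v.name, next++);
        }
        return priorities;
    }

    public static void main(String[] args) {
        List<Vertex> vertices = List.of(
                new Vertex("Reducer 4", 1),
                new Vertex("Map 2", 0),
                new Vertex("Reducer 3", 1),
                new Vertex("Map 1", 0));

        // Prints {Map 1=0, Map 2=1, Reducer 3=2, Reducer 4=3}
        System.out.println(assignPriorities(vertices));
    }
}
```

With distinct priorities, "Reducer 3" and "Reducer 4" no longer compete as equals for the same containers. The issue suggests this is what would improve the buffer-cache and object-registry hit rates; the sketch itself makes no claim about measured gains.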