[jira] [Updated] (TEZ-946) Tez loses buffer-cache performance by running interleaved vertexes

2015-04-28 Thread Hitesh Shah (JIRA)

 [ https://issues.apache.org/jira/browse/TEZ-946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hitesh Shah updated TEZ-946:

Target Version/s: 0.8.0  (was: 0.7.0)

 Tez loses buffer-cache performance by running interleaved vertexes
 --

 Key: TEZ-946
 URL: https://issues.apache.org/jira/browse/TEZ-946
 Project: Apache Tez
  Issue Type: Bug
Reporter: Gopal V
 Attachments: union-10.svg


 For a task which has multiple reduce vertexes running to generate UNION ops, 
 the current Tez behaviour causes bad buffer-cache performance as well as bad 
 performance with the object registry.
 When I was running a large query which had multiple reducers pulling data off 
 different shuffle edges at the same time, the map spill files got paged in 
 and out of cache.
 Along with this, whenever a map-join vertex is interleaved with a reducer 
 vertex, the map-join hashtable gets dropped in the transition.
 It would be beneficial to schedule the vertexes at the same level with some 
 priority so that we finish them faster through better buffer-cache hit-rate 
 and object-registry hit-rate.
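
 The scheduling idea in the last paragraph can be illustrated with a small
 sketch. This is not Tez code and not a proposed patch: the function names,
 the vertex names, and the "count vertex switches" metric are all hypothetical,
 chosen only to show how giving same-level vertexes distinct priorities keeps
 their tasks from interleaving.

```python
def assign_priorities(vertex_levels):
    """Map each vertex name to a unique priority (lower runs first).

    vertex_levels: dict of vertex name -> DAG depth (hypothetical input).
    Vertexes at the same depth get consecutive, distinct priorities,
    ordered by name so the result is deterministic.
    """
    ordered = sorted(vertex_levels, key=lambda v: (vertex_levels[v], v))
    return {vertex: priority for priority, vertex in enumerate(ordered)}


def schedule(tasks, priorities):
    """Order (vertex, task_id) pairs by vertex priority, then by task id."""
    return sorted(tasks, key=lambda t: (priorities[t[0]], t[1]))


def vertex_switches(run_order):
    """Count how often consecutive tasks belong to different vertexes.

    Used here as a rough stand-in for the transitions in which cached
    spill data or a map-join hashtable could be dropped.
    """
    return sum(1 for a, b in zip(run_order, run_order[1:]) if a[0] != b[0])


# Hypothetical DAG: one map vertex feeding two reducers at the same level.
levels = {"Map 1": 0, "Reducer 2": 1, "Reducer 3": 1}

# Interleaved arrival order, as in the behaviour described above.
tasks = [("Reducer 2", 0), ("Reducer 3", 0), ("Reducer 2", 1), ("Reducer 3", 1)]

priorities = assign_priorities(levels)
ordered = schedule(tasks, priorities)

print("switches when interleaved:", vertex_switches(tasks))
print("switches with priorities: ", vertex_switches(ordered))
```

 In this toy run the prioritized order finishes all of one reducer's tasks
 before starting the other's, so there are fewer vertex transitions than in
 the interleaved order.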



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (TEZ-946) Tez loses buffer-cache performance by running interleaved vertexes

2015-01-07 Thread Siddharth Seth (JIRA)

 [ https://issues.apache.org/jira/browse/TEZ-946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siddharth Seth updated TEZ-946:
---
Target Version/s: 0.7.0
