[
https://issues.apache.org/jira/browse/TEZ-978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14087566#comment-14087566
]
Rajesh Balamohan edited comment on TEZ-978 at 8/6/14 11:31 AM:
---------------------------------------------------------------
For testing, I manually set "mapred.reduce.tasks=200" for the job and let Tez's
auto-reduce-parallelism kick in later. This test was run at 200 GB scale.
Changes (tez-978.4.wip.patch):
========
1. OnFileSortedOutput sends additional VertexManagerEvents with the incremental
data written by the running task.
2. ShuffleVertexManager computes the reducer count based on completed-task
information + empty-partition details + data written so far in running tasks.
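As a rough illustration of how folding in-flight data into the estimate changes the outcome, the following is a minimal sketch. This is not the actual ShuffleVertexManager code from tez-978.4.wip.patch; the method, parameter names, and the bytes-per-reducer heuristic are all illustrative assumptions:

```java
// Minimal sketch of the estimate described in (1) and (2) above.
// NOT the real ShuffleVertexManager code; names are illustrative.
public class ParallelismSketch {

    /**
     * Estimate the reducer count from the total expected shuffle input:
     * output already reported by completed tasks plus the incremental
     * output reported (via VertexManagerEvents) by still-running tasks.
     */
    static int estimateReducers(long completedBytes, long inFlightBytes,
                                long desiredBytesPerReducer, int maxReducers) {
        long total = completedBytes + inFlightBytes;
        // Round up so any non-empty input gets at least one reducer.
        int reducers = (int) Math.max(1,
                (total + desiredBytesPerReducer - 1) / desiredBytesPerReducer);
        return Math.min(reducers, maxReducers);
    }

    public static void main(String[] args) {
        long mb256 = 256L * 1024 * 1024;
        // Completed tasks alone wrote almost nothing (empty outputs), so a
        // naive estimate collapses the vertex to a single reducer:
        System.out.println(estimateReducers(10_000, 0, mb256, 200));  // -> 1
        // Counting ~20 GB of in-flight data from running tasks keeps the
        // estimate realistic instead:
        System.out.println(estimateReducers(10_000,
                20L * 1024 * 1024 * 1024, mb256, 200));               // -> 81
    }
}
```

The point of the sketch is the second call: without in-flight data, the early empty outputs drag the estimate down to one reducer, which is the collapse seen in the "without patch" run below.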
With tez-978.4.wip.patch (with auto reduce on):
=====================================
Map 1: 1(+160)/171 Map 5: 1(+29)/170 Map 7: 1/1 Map 8: 1/1
Reducer 2: 0/200 Reducer 3: 0/200 Reducer 4: 0/1 Reducer 6: 0/200
...
Map 1: 171/171 Map 5: 169(+1)/170 Map 7: 1/1 Map 8: 1/1 Reducer 2: 40/40 Reducer 3: 0(+23)/23 Reducer 4: 0/1 Reducer 6: 0(+40)/40 <======= auto-reduce
Time taken: 88.703 seconds
Without patch (with auto reduce on):
=============================
Map 1: 0(+7)/174 Map 5: 0(+1)/170 Map 7: 1/1 Map 8: 1/1
Reducer 2: 0/200 Reducer 3: 0/200 Reducer 4: 0/1 Reducer 6: 0/200
...
Map 1: 177/177 Map 5: 169/169 Map 7: 1/1 Map 8: 1/1 Reducer 2: 0(+1)/1 Reducer 3: 0/200 Reducer 4: 0/1 Reducer 6: 0(+1)/1 <========= auto-reduce
The job did not complete (I killed it after some time).
> Enhance auto parallelism tuning for queries having empty outputs or data
> skewness
> ---------------------------------------------------------------------------------
>
> Key: TEZ-978
> URL: https://issues.apache.org/jira/browse/TEZ-978
> Project: Apache Tez
> Issue Type: Bug
> Affects Versions: 0.4.0
> Reporter: Rajesh Balamohan
> Assignee: Rajesh Balamohan
> Attachments: TEZ-978-v1.patch, TEZ-978-v2.patch, TEZ-978.3.patch,
> TEZ-978.4.wip.patch
>
>
> Running tpcds (query-92) with auto-tuning
> ("tez.am.shuffle-vertex-manager.enable.auto-parallel") degraded performance
> compared to the original run.
> The query has lots of empty outputs, and those tasks tend to complete much
> faster than the others. Tez computes the parallelism from the information
> available at that point (wherein most of the output is empty) and sets the
> reducer count to 1. When the remaining tasks complete, the single reducer has
> to do all the heavy lifting, which causes the performance degradation.
> Map 1: 2/181 Map 5: 16/179 Map 7: 1/1 Map 8: 1/1 Reducer 2:
> 0/109 Reducer 3: 0/137 Reducer 4: 0/1 Reducer 6: 0/166
> Map 1: 2/181 Map 5: 22/179 Map 7: 1/1 Map 8: 1/1 Reducer 2:
> 0/109 Reducer 3: 0/137 Reducer 4: 0/1 Reducer 6: 0/166
> Map 1: 2/181 Map 5: 25/179 Map 7: 1/1 Map 8: 1/1 Reducer 2:
> 0/109 Reducer 3: 0/137 Reducer 4: 0/1 Reducer 6: 0/166
> Map 1: 2/181 Map 5: 30/179 Map 7: 1/1 Map 8: 1/1 Reducer 2:
> 0/109 Reducer 3: 0/137 Reducer 4: 0/1 Reducer 6: 0/166
> Map 1: 2/181 Map 5: 35/179 Map 7: 1/1 Map 8: 1/1 Reducer 2:
> 0/109 Reducer 3: 0/137 Reducer 4: 0/1 Reducer 6: 0/166
> Map 1: 2/181 Map 5: 36/179 Map 7: 1/1 Map 8: 1/1 Reducer 2:
> 0/109 Reducer 3: 0/137 Reducer 4: 0/1 Reducer 6: 0/166
> Map 1: 2/181 Map 5: 39/179 Map 7: 1/1 Map 8: 1/1 Reducer 2:
> 0/109 Reducer 3: 0/137 Reducer 4: 0/1 Reducer 6: 0/166
> Map 1: 3/181 Map 5: 43/179 Map 7: 1/1 Map 8: 1/1 Reducer 2:
> 0/109 Reducer 3: 0/137 Reducer 4: 0/1 Reducer 6: 0/166
> Map 1: 5/181 Map 5: 46/179 Map 7: 1/1 Map 8: 1/1 Reducer 2:
> 0/109 Reducer 3: 0/137 Reducer 4: 0/1 Reducer 6: 0/1 <===
> ShuffleVertexManager changing parallelism
> Map 1: 5/181 Map 5: 63/179 Map 7: 1/1 Map 8: 1/1 Reducer 2:
> 0/109 Reducer 3: 0/137 Reducer 4: 0/1 Reducer 6: 0/1
> Map 1: 7/181 Map 5: 72/179 Map 7: 1/1 Map 8: 1/1 Reducer 2:
> 0/109 Reducer 3: 0/137 Reducer 4: 0/1 Reducer 6: 0/1
> Map 1: 7/181 Map 5: 83/179 Map 7: 1/1 Map 8: 1/1 Reducer 2:
> 0/109 Reducer 3: 0/137 Reducer 4: 0/1 Reducer 6: 0/1
> Map 1: 8/181 Map 5: 95/179 Map 7: 1/1 Map 8: 1/1 Reducer 2:
> 0/109 Reducer 3: 0/137 Reducer 4: 0/1 Reducer 6: 0/1
> Map 1: 8/181 Map 5: 104/179 Map 7: 1/1 Map 8: 1/1 Reducer 2:
> 0/109 Reducer 3: 0/137 Reducer 4: 0/1 Reducer 6: 0/1
> Map 1: 9/181 Map 5: 116/179 Map 7: 1/1 Map 8: 1/1 Reducer 2:
> 0/109 Reducer 3: 0/137 Reducer 4: 0/1 Reducer 6: 0/1
> Map 1: 12/181 Map 5: 123/179 Map 7: 1/1 Map 8: 1/1 Reducer 2:
> 0/109 Reducer 3: 0/137 Reducer 4: 0/1 Reducer 6: 0/1
> Map 1: 13/181 Map 5: 127/179 Map 7: 1/1 Map 8: 1/1 Reducer 2:
> 0/109 Reducer 3: 0/137 Reducer 4: 0/1 Reducer 6: 0/1
> Map 1: 16/181 Map 5: 127/179 Map 7: 1/1 Map 8: 1/1 Reducer 2:
> 0/109 Reducer 3: 0/137 Reducer 4: 0/1 Reducer 6: 0/1
> Map 1: 17/181 Map 5: 128/179 Map 7: 1/1 Map 8: 1/1 Reducer 2:
> 0/109 Reducer 3: 0/137 Reducer 4: 0/1 Reducer 6: 0/1
> Map 1: 18/181 Map 5: 131/179 Map 7: 1/1 Map 8: 1/1 Reducer 2:
> 0/109 Reducer 3: 0/137 Reducer 4: 0/1 Reducer 6: 0/1
> Map 1: 19/181 Map 5: 131/179 Map 7: 1/1 Map 8: 1/1 Reducer 2:
> 0/109 Reducer 3: 0/137 Reducer 4: 0/1 Reducer 6: 0/1
> Map 1: 25/181 Map 5: 132/179 Map 7: 1/1 Map 8: 1/1 Reducer 2:
> 0/109 Reducer 3: 0/137 Reducer 4: 0/1 Reducer 6: 0/1
> Map 1: 33/181 Map 5: 132/179 Map 7: 1/1 Map 8: 1/1 Reducer 2:
> 0/109 Reducer 3: 0/137 Reducer 4: 0/1 Reducer 6: 0/1
> Map 1: 42/181 Map 5: 134/179 Map 7: 1/1 Map 8: 1/1 Reducer 2:
> 0/109 Reducer 3: 0/137 Reducer 4: 0/1 Reducer 6: 0/1
> <=== ShuffleVertexManager changing parallelism
> Map 1: 51/181 Map 5: 135/179 Map 7: 1/1 Map 8: 1/1 Reducer 2:
> 0/1 Reducer 3: 0/137 Reducer 4: 0/1 Reducer 6: 0/1
> Map 1: 58/181 Map 5: 136/179 Map 7: 1/1 Map 8: 1/1 Reducer 2:
> 0/1 Reducer 3: 0/137 Reducer 4: 0/1 Reducer 6: 0/1
> Map 1: 63/181 Map 5: 136/179 Map 7: 1/1 Map 8: 1/1 Reducer 2:
> 0/1 Reducer 3: 0/137 Reducer 4: 0/1 Reducer 6: 0/1
> Map 1: 70/181 Map 5: 136/179 Map 7: 1/1 Map 8: 1/1 Reducer 2:
> 0/1 Reducer 3: 0/137 Reducer 4: 0/1 Reducer 6: 0/1
> The suggestion is to include:
> 1. Empty-output information when computing auto-parallelism.
> 2. A configurable threshold for the average output size from the source
> (e.g. a minimum of 1 MB of output from each source task). If the average task
> output size does not meet this criterion (which means all the completed tasks
> are small tasks), we can defer the computation of auto-parallelism until more
> tasks have completed.
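A minimal sketch of suggestion (2). The 1 MB figure comes from the example above, but the class, method, and constant names are hypothetical, not actual Tez configuration keys:

```java
// Sketch of the deferral criterion from suggestion (2). Names are
// illustrative; the real configuration key and wiring in Tez would differ.
public class DeferSketch {

    // Configurable minimum average output per completed source task
    // (the 1 MB example from the suggestion above).
    static final long MIN_AVG_OUTPUT_BYTES = 1L * 1024 * 1024;

    /**
     * Returns true when the completed tasks so far have produced too little
     * data on average to be representative, so the auto-parallelism
     * decision should wait for more task completions.
     */
    static boolean shouldDefer(long totalCompletedOutputBytes,
                               int completedTasks) {
        if (completedTasks == 0) {
            return true; // nothing to base a decision on yet
        }
        return (totalCompletedOutputBytes / completedTasks)
                < MIN_AVG_OUTPUT_BYTES;
    }

    public static void main(String[] args) {
        // 50 completed tasks that wrote ~1 KB each (the empty-output case):
        System.out.println(shouldDefer(50L * 1024, 50));            // -> true
        // 50 completed tasks averaging 4 MB each: safe to decide now.
        System.out.println(shouldDefer(50L * 4 * 1024 * 1024, 50)); // -> false
    }
}
```

Deferring in the first case would have prevented the early collapse to a single reducer shown in the progress log above.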
--
This message was sent by Atlassian JIRA
(v6.2#6252)