[ 
https://issues.apache.org/jira/browse/TEZ-1248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14238790#comment-14238790
 ] 

Gopal V commented on TEZ-1248:
------------------------------

bq. More details would help.

For a case with 100 mappers and 1 reducer, the reducer does not start shuffling 
until 75 of the mappers have exited.

Hive ORDER BY queries suffer from this badly: the single reducer sits idle for 
several seconds when it could already be merging the intermediate data.
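
The arithmetic behind the 100-mapper example can be sketched roughly as below. 
This is only an illustration: the class, the method and the 0.75 start 
fraction are assumptions inferred from the "75 of 100" figure above, not the 
actual Tez scheduling code.

```java
public class SlowStartSketch {

    /**
     * Hypothetical helper (not a Tez API): number of source tasks that must
     * complete before a slow-started consumer is allowed to begin shuffling.
     */
    static int completedSourcesNeeded(int totalSources, double startFraction) {
        return (int) Math.ceil(totalSources * startFraction);
    }

    public static void main(String[] args) {
        int mappers = 100;
        double startFraction = 0.75; // assumed value, inferred from the example

        int needed = completedSourcesNeeded(mappers, startFraction);
        // Until this many mappers finish, the lone reducer fetches nothing.
        System.out.println("reducer starts after " + needed + " of "
                + mappers + " mappers");
    }
}
```

With a start fraction of 0 the same helper returns 0, i.e. the reducer could 
begin fetching as soon as the first map output exists.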

> Reduce slow-start should special case 1 reducer runs
> ----------------------------------------------------
>
>                 Key: TEZ-1248
>                 URL: https://issues.apache.org/jira/browse/TEZ-1248
>             Project: Apache Tez
>          Issue Type: Improvement
>    Affects Versions: 0.5.0
>         Environment: 20 node cluster running tez
>            Reporter: Gopal V
>            Priority: Minor
>
> Reducer slow-start has a performance problem in small cases where there is 
> just 1 reducer and the mappers run in a single wave.
> Tez knows the split count and wave count, so it can determine whether the 
> cluster has enough spare capacity to run the reducer earlier, giving lower 
> latency in an N-mapper -> 1 reducer case.
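
One possible shape for the special case described in the issue is sketched 
below. All names are hypothetical, and "spare capacity" is simplified to a 
count of task slots; this is an assumption-laden sketch, not the proposed 
patch or an existing Tez API.

```java
public class SingleReducerSketch {

    /**
     * Hypothetical check (not a Tez API): should the reducer skip slow-start?
     *
     * @param mapperCount  number of map tasks (the split count)
     * @param reducerCount number of reduce tasks
     * @param clusterSlots task slots the cluster can run concurrently (assumed
     *                     to be known to the scheduler)
     */
    static boolean canStartReducerEarly(int mapperCount, int reducerCount,
                                        int clusterSlots) {
        boolean singleReducer = reducerCount == 1;
        // Single wave with room to spare: every mapper plus the reducer fits
        // in the cluster at once, so the reducer takes no slot from a mapper.
        boolean fitsWithSpare = (long) mapperCount + reducerCount <= clusterSlots;
        return singleReducer && fitsWithSpare;
    }

    public static void main(String[] args) {
        // 100 mappers, 1 reducer, 160 slots: reducer can start right away.
        System.out.println(canStartReducerEarly(100, 1, 160));
        // 100 mappers fill all 100 slots: keep the normal slow-start.
        System.out.println(canStartReducerEarly(100, 1, 100));
    }
}
```

The check is deliberately conservative: when the mappers need more than one 
wave, or there are several reducers, it falls back to the regular slow-start 
behaviour.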



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
