[ https://issues.apache.org/jira/browse/HADOOP-5160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12671054#action_12671054 ]
Arun C Murthy commented on HADOOP-5160:
---------------------------------------
As of hadoop-0.18, the Map-Reduce scheduler assigns only 1 reducer per
heartbeat and has the necessary smarts to ensure that it loads each machine
only up to ceil(loadfactor) on each heartbeat. I suspect that rounding up
with ceil(loadfactor) causes some machines to get overloaded, which is an
unfortunate side-effect that is hard to fix. I'm assuming you don't want to
reduce #reduceslots to 1 per box?
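
To illustrate the suspected effect, here is a minimal Python sketch of a
scheduler that hands out at most 1 reducer per heartbeat and caps each machine
at a rounded-up share of the load. It is a toy model, not the actual Hadoop
code: the function name, the cap formula min(ceil(loadfactor * slots), slots),
and the fixed heartbeat order are all assumptions made for illustration, and
the load factor is held constant (no reducer finishes during the run).

```python
import math


def simulate_reduce_assignment(num_machines, slots_per_machine,
                               num_reduces, heartbeat_order):
    """Toy model (hypothetical, not Hadoop's implementation).

    Each heartbeat assigns at most one reducer, and only while the
    machine is below a cap derived from the rounded-up load factor.
    Returns (cap, reducers_running_per_machine).
    """
    total_slots = num_machines * slots_per_machine
    # Assumed definition: outstanding reduces divided by cluster reduce slots.
    load_factor = num_reduces / total_slots
    # Assumed cap: per-machine share rounded up, never above its slot count.
    cap = min(math.ceil(load_factor * slots_per_machine), slots_per_machine)

    running = [0] * num_machines
    pending = num_reduces
    for machine in heartbeat_order:
        if pending == 0:
            break
        if running[machine] < cap:
            running[machine] += 1  # at most 1 reducer per heartbeat
            pending -= 1
    return cap, running


# Machine 0 and machine 1 each heartbeat twice before the others report in.
order = [0, 0, 1, 1, 2, 2, 3, 3]

# 4 reduces on 4 machines x 2 slots: the share is exactly 1, so the cap is 1.
print(simulate_reduce_assignment(4, 2, 4, order))  # (1, [1, 1, 1, 1])

# 5 reduces: the share is 1.25, ceil gives a cap of 2, and the machines that
# heartbeat first fill up while the last one stays idle.
print(simulate_reduce_assignment(4, 2, 5, order))  # (2, [2, 2, 1, 0])
```

In this model the rounding only bites when the per-machine share is not a
whole number; whether a real cluster hits that case depends on the other jobs
contributing to the load and on heartbeat timing, neither of which is modeled
here.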
> Hadoop reduce scheduler sometimes leaves machines idle
> ------------------------------------------------------
>
> Key: HADOOP-5160
> URL: https://issues.apache.org/jira/browse/HADOOP-5160
> Project: Hadoop Core
> Issue Type: Bug
> Components: mapred
> Reporter: Nathan Marz
>
> I have a MapReduce application with number of reducers equal to the number of
> machines in the cluster (and with speculative execution turned off). However,
> Hadoop schedules multiple reduces to run on single machines and leaves other
> machines idle. This causes contention and seriously slows down the job.
> Hadoop should employ the simple heuristic of utilizing as many machines as
> possible when scheduling reduces.