0.20: avoid a busy-loop in ReduceTask scheduling
------------------------------------------------
Key: MAPREDUCE-3278
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3278
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: mrv1, performance, task
Affects Versions: 0.20.205.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Profiling results made it clear that the ReduceTask contains the following
busy-loop, which causes it to suck up 100% of a CPU during the fetch phase in
some configurations:
- the number of reduce fetcher threads is configured to be greater than the
number of hosts
- therefore {{busyEnough()}} never returns true
- the "scheduling" portion of the code can't schedule any new fetches, since
all of the pending fetches in the {{mapLocations}} buffer correspond to hosts
that are already being fetched from (the hosts are in the {{uniqueHosts}} map)
- {{getCopyResult()}} immediately returns null, since there are no completed
maps.
Hence ReduceTask spins back and forth between trying to schedule fetches (and
failing) and trying to grab completed results (of which there are none),
never waiting in between.
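
The spin described above, and one possible way to avoid it, can be sketched in
isolation. This is not the ReduceTask code and not necessarily the fix that
was committed for this issue: the class FetchLoopSketch, the CopyResults
helper and its methods, and the 50 ms timeout are all hypothetical names and
values chosen for illustration. The sketch assumes that blocking for a bounded
time on the completed-results queue is an acceptable replacement for the
immediate null return.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class FetchLoopSketch {

    /** Hypothetical stand-in for the list of completed copy results. */
    public static final class CopyResults {
        private final BlockingQueue<String> done = new LinkedBlockingQueue<>();

        /** Called by a fetcher thread when a copy finishes. */
        public void complete(String mapId) {
            done.add(mapId);
        }

        /** Non-blocking: returns null immediately when nothing is done. */
        public String pollNow() {
            return done.poll();
        }

        /** Blocks up to timeoutMs for a result; returns null on timeout. */
        public String pollOrWait(long timeoutMs) throws InterruptedException {
            return done.poll(timeoutMs, TimeUnit.MILLISECONDS);
        }
    }

    /**
     * Runs a simplified schedule/collect loop for at most windowMs and
     * returns how many times the loop body executed. The scheduling step is
     * left empty on purpose: it models the case where every pending fetch
     * belongs to a host that is already being fetched from.
     */
    public static long countIterations(CopyResults results, boolean block,
                                       long windowMs)
            throws InterruptedException {
        long deadline = System.nanoTime()
                + TimeUnit.MILLISECONDS.toNanos(windowMs);
        long iterations = 0;
        while (System.nanoTime() < deadline) {
            iterations++;
            // (scheduling step: nothing can be scheduled in this scenario)
            String result = block
                    ? results.pollOrWait(50)   // assumed timeout, in ms
                    : results.pollNow();       // returns null right away
            if (result != null) {
                break;                         // a copy finished; stop here
            }
        }
        return iterations;
    }

    public static void main(String[] args) throws InterruptedException {
        CopyResults results = new CopyResults();
        System.out.println("iterations when spinning: "
                + countIterations(results, false, 200));
        System.out.println("iterations when waiting:  "
                + countIterations(results, true, 200));
    }
}
```

With the non-blocking poll the loop body runs as fast as the CPU allows for
the whole window; with the bounded wait it runs only a handful of times, and
it still returns promptly once a fetcher thread calls complete().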