[ https://issues.apache.org/jira/browse/MAPREDUCE-4487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tom White updated MAPREDUCE-4487:
---------------------------------

    Attachment: MAPREDUCE-4487.patch

Here's a patch which removes sleeps (or improves their usage) in three places:
* In ReduceTask's fetchOutputs() if there are no map outputs in flight or scheduled, then it sleeps for five seconds. Replacing this condition with a wait that is notified when new map outputs become available is an improvement.
* In ReduceTask's fetchOutputs() when all the output has been fetched there is a join on GetMapEventsThread, which may be sleeping (for 1s). Replacing this with a wait/notify removes the sleep overhead.
* In Child's main loop while waiting for tasks from the parent tasktracker, the thread sleeps for 0.5s initially then 1.5s if there haven't been any tasks for a while. Replacing this with a more fine-grained exponential backoff helps responsiveness.

I ran some tests to investigate the effect of these changes. I ran a sleep job that sleeps for 1ms ({{bin/hadoop jar hadoop-*examples*jar sleep -m 1 -r 1 -mt 1 -rt 1}}) and measured the job execution time (on a single node cluster). Without the patch the mean time was 12.97s (over 10 runs, sd 0.53), and with the patch it was 9.109s (sd 1.0) - a significant improvement.

> Reduce job latency by removing hardcoded sleep statements
> ---------------------------------------------------------
>
>                 Key: MAPREDUCE-4487
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4487
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: mrv1, performance
>    Affects Versions: 1.0.3
>            Reporter: Tom White
>            Assignee: Tom White
>         Attachments: MAPREDUCE-4487.patch
>
>
> There are a few places in MapReduce where there are hardcoded sleep
> statements. By replacing them with wait/notify or similar it's possible to
> reduce latency for short-running jobs.

--
This message is automatically generated by JIRA.
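
For readers unfamiliar with the two techniques named in the update above (a wait that is notified instead of a fixed sleep, and an exponential backoff instead of a fixed poll interval), here is a minimal standalone sketch. It is not taken from MAPREDUCE-4487.patch: the class and method names (SleepFreeWaitSketch, PendingOutputs, nextBackoffMs) and the 10ms/500ms constants are hypothetical, chosen only for illustration, and the real changes in ReduceTask and Child may look quite different.

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class SleepFreeWaitSketch {

    /** Hypothetical stand-in for the set of map outputs that are ready to fetch. */
    static class PendingOutputs {
        private final Queue<String> ready = new ArrayDeque<>();

        /** Producer side: publish an output and wake any waiting fetcher. */
        synchronized void add(String mapOutput) {
            ready.add(mapOutput);
            notifyAll();
        }

        /** Consumer side: block until an output arrives or timeoutMs elapses; returns null on timeout. */
        synchronized String take(long timeoutMs) throws InterruptedException {
            long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
            while (ready.isEmpty()) {
                long remainingMs = (deadline - System.nanoTime()) / 1_000_000L;
                if (remainingMs <= 0) {
                    return null; // timed out; also avoids wait(0), which would block forever
                }
                wait(remainingMs); // the loop re-checks the condition after spurious wakeups
            }
            return ready.remove();
        }
    }

    /** Doubles the delay up to a cap (assumed policy; the patch's constants may differ). */
    static long nextBackoffMs(long currentMs, long maxMs) {
        return Math.min(currentMs * 2, maxMs);
    }

    public static void main(String[] args) throws InterruptedException {
        // Wait/notify: the consumer returns as soon as the producer publishes,
        // rather than finishing out a fixed sleep interval.
        PendingOutputs outputs = new PendingOutputs();
        Thread producer = new Thread(() -> outputs.add("map_0"));
        producer.start();
        System.out.println("fetched: " + outputs.take(5000));
        producer.join();

        // Backoff: short delays first, growing towards the cap while idle.
        long delay = 10;
        for (int i = 0; i < 8; i++) {
            System.out.print(delay + " ");
            delay = nextBackoffMs(delay, 500);
        }
        System.out.println();
    }
}
```

The point of the wait/notify form is that the waiting thread's latency is bounded by how quickly the producer publishes, not by the sleep interval; the timeout is kept only as a safety net. The backoff keeps early polls cheap in latency while still limiting how often an idle thread wakes up.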