[ https://issues.apache.org/jira/browse/MAPREDUCE-5462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13741418#comment-13741418 ]
Sandy Ryza commented on MAPREDUCE-5462:
---------------------------------------
Submitting a patch based on half of Todd's patch from MAPREDUCE-3235.
I benchmarked the change with the LocalJobRunner, using a WordCount job with a
single map task on 64 MB of data. I did five runs with and without the patch.
In all runs, the rest of the job after the map task finished took less than a
second. I measured cache misses using the perf command.
Average cache misses without the change: 165,083,881 (stddev 986,099)
Average job run time without the change: 14.46 seconds (stddev 1.24)
Average cache misses with the change: 83,130,729 (stddev 342,826)
Average job run time with the change: 12.018 seconds (stddev 1.95)
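
For readers unfamiliar with the idea, here is a rough sketch of the two sorting strategies this issue compares. It is not the code from the patch or from MapTask: the class name, the four-int entry layout, the PARTITION and KEY slots, and the insertion sort are all assumptions made for illustration. Variant A sorts a separate array of entry numbers, so each comparison has to follow an index to a metadata entry that may sit anywhere in the buffer. Variant B swaps whole entries, so comparisons during the sort read adjacent memory, which is the likely source of the lower cache-miss counts.

```java
import java.util.Arrays;

// Illustrative sketch only; names and layout are hypothetical, not Hadoop's.
final class MetaSwapSketch {
    static final int NMETA = 4;      // ints per metadata entry (assumed)
    static final int PARTITION = 0;  // assumed slot: partition number
    static final int KEY = 1;        // assumed slot: stand-in for the key itself

    // Order entries a and b by partition, then by key.
    static int compareEntries(int[] meta, int a, int b) {
        int c = Integer.compare(meta[a * NMETA + PARTITION], meta[b * NMETA + PARTITION]);
        return c != 0 ? c : Integer.compare(meta[a * NMETA + KEY], meta[b * NMETA + KEY]);
    }

    // Variant A: leave meta untouched and sort an array of entry numbers.
    // Every comparison dereferences index[...] to reach the entry.
    static int[] sortByIndex(int[] meta) {
        int n = meta.length / NMETA;
        int[] index = new int[n];
        for (int i = 0; i < n; i++) {
            index[i] = i;
        }
        for (int i = 1; i < n; i++) {
            for (int j = i; j > 0 && compareEntries(meta, index[j - 1], index[j]) > 0; j--) {
                int tmp = index[j];
                index[j] = index[j - 1];
                index[j - 1] = tmp;
            }
        }
        return index;
    }

    // Variant B: sort meta in place by swapping whole entries.
    // Each swap moves NMETA ints, but comparisons touch neighbouring entries.
    static void sortEntries(int[] meta) {
        int n = meta.length / NMETA;
        for (int i = 1; i < n; i++) {
            for (int j = i; j > 0 && compareEntries(meta, j - 1, j) > 0; j--) {
                swapEntries(meta, j - 1, j);
            }
        }
    }

    // Exchange all NMETA ints of entries a and b.
    static void swapEntries(int[] meta, int a, int b) {
        int ao = a * NMETA;
        int bo = b * NMETA;
        for (int k = 0; k < NMETA; k++) {
            int tmp = meta[ao + k];
            meta[ao + k] = meta[bo + k];
            meta[bo + k] = tmp;
        }
    }

    public static void main(String[] args) {
        // Four entries: {partition, key, payload, payload}
        int[] meta = {1, 30, 0, 5,  0, 20, 1, 6,  1, 10, 2, 7,  0, 40, 3, 8};
        int[] index = sortByIndex(meta);
        int[] sorted = meta.clone();
        sortEntries(sorted);
        System.out.println("index order:   " + Arrays.toString(index));
        System.out.println("entries moved: " + Arrays.toString(sorted));
    }
}
```

Both variants yield the same record order. The trade-off is that Variant B copies more data per swap in exchange for better locality on each comparison.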
> In map-side sort, swap entire meta entries instead of indexes for better
> cache performance
> -------------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-5462
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5462
> Project: Hadoop Map/Reduce
> Issue Type: Sub-task
> Components: performance, task
> Affects Versions: 2.1.0-beta
> Reporter: Sandy Ryza
> Assignee: Sandy Ryza
>