[
https://issues.apache.org/jira/browse/MAPREDUCE-5462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13741484#comment-13741484
]
Todd Lipcon commented on MAPREDUCE-5462:
----------------------------------------
Makes sense to me -- there is basically no incremental cost of copying 16 bytes
instead of 4, since either way it's a single cache line -- probably just two
"mov" instructions instead of one. The win of cutting the number of cache
misses by a factor of two is much, much larger.
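
To make the trade-off concrete, here is a rough sketch of the two layouts. It is
not the MapTask code or the attached patch: the class name, the 4-ints-per-entry
layout, and the choice of slot 0 as the partition field are all assumptions made
for illustration. With an index array, every comparison first loads an index and
then follows it into the metadata (a second, dependent load that may miss). When
whole entries are swapped, the sorted array already holds the data.

```java
import java.util.Arrays;

final class MetaSwapSketch {
    // Assumption: each record's metadata is 4 ints (16 bytes), stored contiguously.
    static final int INTS_PER_ENTRY = 4;
    // Assumption: slot 0 of an entry holds the partition number.
    static final int PARTITION = 0;

    /** Swap two whole 16-byte entries in place. */
    static void swapEntries(int[] meta, int i, int j) {
        int a = i * INTS_PER_ENTRY;
        int b = j * INTS_PER_ENTRY;
        for (int k = 0; k < INTS_PER_ENTRY; k++) {
            int tmp = meta[a + k];
            meta[a + k] = meta[b + k];
            meta[b + k] = tmp;
        }
    }

    /** Index-only variant: swap two 4-byte slots in a separate index array. */
    static void swapIndexes(int[] index, int i, int j) {
        int tmp = index[i];
        index[i] = index[j];
        index[j] = tmp;
    }

    /** Read via the index array: two dependent loads (index, then entry). */
    static int partitionViaIndex(int[] meta, int[] index, int i) {
        return meta[index[i] * INTS_PER_ENTRY + PARTITION];
    }

    /** Read directly from the entry array: one load. */
    static int partitionDirect(int[] meta, int i) {
        return meta[i * INTS_PER_ENTRY + PARTITION];
    }

    public static void main(String[] args) {
        int[] meta = {10, 11, 12, 13, 20, 21, 22, 23};
        swapEntries(meta, 0, 1);
        // prints [20, 21, 22, 23, 10, 11, 12, 13]
        System.out.println(Arrays.toString(meta));
    }
}
```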
The Intel guide says that an L3 cache hit on unshared data (which this is
likely to be) costs ~40 cycles, so 80M L2 misses (L3 hits) come to about 3.2G
cycles, or about 1.6 seconds at 2 GHz. This seems to roughly line up with the
wall-clock savings you're seeing. (In fact you see a bit more savings, because
some of the accesses probably miss L3 and hit main memory.)
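
The back-of-the-envelope estimate above can be written out as follows. The class
and method names are made up for this sketch. The inputs (80M misses, ~40 cycles
each, 2 GHz) are the figures quoted in this comment, and the model ignores any
overlap between outstanding misses.

```java
final class CacheMissEstimate {
    /** Estimated stall time in seconds: misses * cycles per miss / clock rate. */
    static double stallSeconds(long misses, long cyclesPerMiss, double clockHz) {
        return (misses * cyclesPerMiss) / clockHz;
    }

    public static void main(String[] args) {
        // 80M L2 misses served from L3 at ~40 cycles each, on a 2 GHz core.
        double seconds = stallSeconds(80_000_000L, 40L, 2.0e9);
        System.out.printf("%.1f s%n", seconds); // prints 1.6 s
    }
}
```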
+1 pending Jenkins. Would be cool to see impact on a workload like terasort too
if you have a chance, but doesn't need to block the patch.
> In map-side sort, swap entire meta entries instead of indexes for better
> cache performance
> -------------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-5462
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5462
> Project: Hadoop Map/Reduce
> Issue Type: Sub-task
> Components: performance, task
> Affects Versions: 2.1.0-beta
> Reporter: Sandy Ryza
> Assignee: Sandy Ryza
> Attachments: MAPREDUCE-5462.patch
>
>