[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13741484#comment-13741484
 ] 

Todd Lipcon commented on MAPREDUCE-5462:
----------------------------------------

Makes sense to me -- there is basically no incremental cost to copying 16 bytes 
instead of 4, since either way it's a single cache line -- probably just two 
"mov" instructions instead of one. The win from cutting the number of cache 
misses by a factor of two is much, much larger.
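
To make the trade-off concrete, here is a minimal sketch of the two swap 
styles. It is not the code from the attached patch: the class name, the flat 
int[] layout, and the constant INTS_PER_ENTRY (4 ints = 16 bytes per record) 
are assumptions chosen for illustration only.

```java
import java.util.Arrays;

// Hypothetical sketch, not the MAPREDUCE-5462 patch itself.
class MetaSwapSketch {
    // Assumed layout: each record's metadata is 4 ints (16 bytes), stored contiguously.
    static final int INTS_PER_ENTRY = 4;

    // Index-based approach: swap two 4-byte slots in a separate index array.
    // Every later comparison must follow index[i] to reach the metadata.
    static void swapIndexes(int[] index, int i, int j) {
        int tmp = index[i];
        index[i] = index[j];
        index[j] = tmp;
    }

    // Entry-based approach: swap the two full 16-byte entries in place,
    // so the metadata for position i always lives at position i.
    static void swapEntries(int[] meta, int i, int j) {
        int a = i * INTS_PER_ENTRY;
        int b = j * INTS_PER_ENTRY;
        for (int k = 0; k < INTS_PER_ENTRY; k++) {
            int tmp = meta[a + k];
            meta[a + k] = meta[b + k];
            meta[b + k] = tmp;
        }
    }

    public static void main(String[] args) {
        // Two entries of four ints each; swap entry 0 with entry 1.
        int[] meta = {10, 11, 12, 13, 20, 21, 22, 23};
        swapEntries(meta, 0, 1);
        System.out.println(Arrays.toString(meta));
    }
}
```

The entry swap moves four times as many bytes, but a sorter built this way 
reads metadata directly at the position it is comparing and skips the extra 
hop through an index array.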

The Intel guide says that an L3 cache hit on unshared data (which this is 
likely to be) costs ~40 cycles, so 80M L2 misses (L3 hits) cost about 3.2G 
cycles, or about 1.6 seconds at 2 GHz. This seems to roughly line up with the 
wall-clock savings you're seeing. (In fact, you see a bit more savings because 
some of the accesses probably miss L3 and hit main memory.)
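
The estimate can be reproduced with a few lines of arithmetic. The sketch 
below uses only the figures quoted in this comment (80M misses, ~40 cycles 
per L3 hit, a 2 GHz clock); the class and method names are hypothetical, and 
it assumes the misses are fully serialized with no overlap, which is a 
simplification.

```java
// Hypothetical back-of-the-envelope helper; names are illustrative only.
class CacheMissCost {
    // Seconds spent stalled, assuming each miss costs a fixed number of
    // cycles and misses do not overlap with each other or with other work.
    static double stallSeconds(long misses, long cyclesPerMiss, double clockHz) {
        return (misses * cyclesPerMiss) / clockHz;
    }

    public static void main(String[] args) {
        long misses = 80_000_000L;   // L2 misses that hit in L3
        long cyclesPerMiss = 40L;    // approximate L3 hit latency, unshared line
        double clockHz = 2.0e9;      // 2 GHz
        System.out.println(stallSeconds(misses, cyclesPerMiss, clockHz) + " s");
    }
}
```

Accesses that go to main memory would cost more per miss than the 40 cycles 
used here, which is consistent with the measured savings being somewhat 
larger than this figure.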

+1 pending Jenkins. Would be cool to see impact on a workload like terasort too 
if you have a chance, but doesn't need to block the patch.
                
> In map-side sort, swap entire meta entries instead of indexes for better 
> cache performance 
> -------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5462
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5462
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>          Components: performance, task
>    Affects Versions: 2.1.0-beta
>            Reporter: Sandy Ryza
>            Assignee: Sandy Ryza
>         Attachments: MAPREDUCE-5462.patch
>
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
