[
https://issues.apache.org/jira/browse/MAPREDUCE-3235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13492599#comment-13492599
]
Gopal V commented on MAPREDUCE-3235:
------------------------------------
For a sort set of 100 x 99171 unique strings (/usr/share/dict/words on Ubuntu)
||trunk||+hashcomparator||
| 275143706 | 74887595 |
Comparisons per-item went from 27.74 per item to 7.55 per item. But this did
not translate into any massive savings in execution time as most of it went
into the spill & subsequent merge.
> Improve CPU cache behavior in map side sort
> -------------------------------------------
>
> Key: MAPREDUCE-3235
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3235
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: performance, task
> Affects Versions: 0.23.0
> Reporter: Todd Lipcon
> Assignee: Todd Lipcon
> Attachments: hashed-sort-MAPREDUCE-3235.patch, map_sort_perf.diff,
> mr-3235-poc.txt
>
>
> When running oprofile on a terasort workload, I noticed that a large amount
> of CPU usage was going to MapTask$MapOutputBuffer.compare. Upon disassembling
> this and looking at cycle counters, most of the cycles were going to memory
> loads dereferencing into the array of key-value data -- implying expensive
> cache misses. This can be avoided as follows:
> - rather than simply swapping indexes into the kv array, swap the entire meta
> entries in the meta array. Swapping 16 bytes is only negligibly slower than
> swapping 4 bytes. This requires adding the value-length into the meta array,
> since we used to rely on the previous-in-the-array meta entry to determine
> this. So we replace INDEX with VALUELEN and avoid one layer of indirection.
> - introduce an interface which allows key types to provide a 4-byte
> comparison proxy. For string keys, this can simply be the first 4 bytes of
> the string. The idea is that, if stringCompare(key1.proxy(), key2.proxy()) !=
> 0, then compare(key1, key2) should have the same result. If the proxies are
> equal, the normal comparison method is used. We then include the 4-byte proxy
> as part of the metadata entry, so that for many cases the indirection into
> the data buffer can be avoided.
> On a terasort benchmark, these optimizations plus an optimization to
> WritableComparator.compareBytes dropped the aggregate mapside CPU millis by
> 40%, and the compare() routine mostly dropped off the oprofile results.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira