[
https://issues.apache.org/jira/browse/HADOOP-12720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15109053#comment-15109053
]
Alan Burlison commented on HADOOP-12720:
----------------------------------------
The issue here is that although there is code to check for SPARC and use a
pure-Java comparison in that case, it only checks for a 32-bit JVM and not a
64-bit one:
{code:java}
static Comparer<byte[]> getBestComparer() {
if (System.getProperty("os.arch").equals("sparc")) {
{code}
That code needs to check for {{sparcv9}} as well. However there are bigger
issues here:
* While Intel supports misaligned accesses, other platforms such as SPARC and
ARM64 don't.
* It is only on recent Intel CPUs that there isn't a performance hit for doing
misaligned accesses.
* Even on recent Intel CPUs it seems there may still be a penalty for
misaligned accesses that straddle cache lines.
* The last comments in HADOOP-7761 say that the code is actually slower on
small arrays and only slightly faster even on large arrays.
* The use of sun.misc.Unsafe is problematic in general because it is going to
be deprecated in Java9.
* The hand-rolled code in {{UnsafeComparer.compareTo}} means that any JVM
improvements to byte comparison operations (e.g. by using CPU block memory
instructions) will not be used by Hadoop.
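As a rough illustration of the short-term fix, the architecture check could accept both the 32-bit and 64-bit SPARC values of {{os.arch}} and fall back to a byte-at-a-time loop that makes no alignment assumptions. This is only a sketch: the class {{ComparerChoice}} and the helper {{isSparc}} are hypothetical names, not existing Hadoop code, and the comparison loop is a generic unsigned lexicographic compare rather than the actual pure-Java comparer in {{FastByteComparisons}}.

```java
// Hypothetical sketch; names are illustrative and not part of Hadoop.
final class ComparerChoice {
    private ComparerChoice() {}

    /** True if the os.arch value names a SPARC JVM, 32-bit or 64-bit. */
    static boolean isSparc(String osArch) {
        // "sparc" is what the existing check looks for; "sparcv9" is the
        // 64-bit value that it misses.
        return "sparc".equals(osArch) || "sparcv9".equals(osArch);
    }

    /**
     * Lexicographic comparison of two byte ranges, treating bytes as
     * unsigned. Reads one byte at a time, so alignment does not matter.
     */
    static int compareBytes(byte[] b1, int s1, int l1,
                            byte[] b2, int s2, int l2) {
        int n = Math.min(l1, l2);
        for (int i = 0; i < n; i++) {
            int a = b1[s1 + i] & 0xff;
            int b = b2[s2 + i] & 0xff;
            if (a != b) {
                return a - b;
            }
        }
        // All shared bytes are equal, so the shorter range sorts first.
        return l1 - l2;
    }
}
```

A caller would evaluate {{ComparerChoice.isSparc(System.getProperty("os.arch"))}} once and choose the safe comparer when it returns true. Other strict-alignment platforms would need their own entries, which is part of why a list of architecture names is fragile.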
I agree that optimising byte array comparisons is a worthwhile aim, but the
current implementation is platform-specific, relies on sun.misc.Unsafe and
appears to be slower in practice anyway. I suggest some more investigation is
required in this area. And longer-term, Java9 will provide what is needed via
static methods on the java.util.Arrays class, see
https://bugs.openjdk.java.net/browse/JDK-8033148
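For the longer-term option, a sketch of what delegating to the JDK could look like, assuming a Java 9 or later runtime where {{java.util.Arrays}} offers a range-based {{compareUnsigned}} for byte arrays. The wrapper class {{Jdk9ByteCompare}} is a hypothetical name; its parameters mirror the (buffer, start, length) style seen in the {{compareTo([BII[BII)I}} signature in the stack traces below.

```java
import java.util.Arrays;

// Hypothetical wrapper; requires Java 9 or later.
final class Jdk9ByteCompare {
    private Jdk9ByteCompare() {}

    /**
     * Compares two byte ranges lexicographically as unsigned bytes by
     * delegating to the JDK, so any JVM-level optimisation of that
     * method is picked up without hand-rolled Unsafe code.
     */
    static int compareBytes(byte[] b1, int s1, int l1,
                            byte[] b2, int s2, int l2) {
        // Arrays.compareUnsigned takes [from, to) index pairs, not lengths.
        return Arrays.compareUnsigned(b1, s1, s1 + l1, b2, s2, s2 + l2);
    }
}
```

Only the sign of the result should be relied on, as with any comparator.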
> Misuse of sun.misc.Unsafe by
> org.apache.hadoop.io.FastByteComparisons$LexicographicalComparerHolder$UnsafeComparer.compareTo
> causes misaligned memory access coredumps
> ----------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: HADOOP-12720
> URL: https://issues.apache.org/jira/browse/HADOOP-12720
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: io
> Affects Versions: 2.7.1
> Environment: Solaris SPARC
> Reporter: Alan Burlison
> Labels: Solaris
>
> Core dump details below:
> {noformat}
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (24.85-b07 mixed mode
> solaris-sparc compressed oops)
> # Problematic frame:
> # J 86 C2
> org.apache.hadoop.io.FastByteComparisons$LexicographicalComparerHolder$UnsafeComparer.compareTo([BII[BII)I
> (273 bytes) @ 0xffffffff6fc9b150 [0xffffffff6fc9b0e0+0x70]
> Stack: [0xffffffff7e200000,0xffffffff7e300000], sp=0xffffffff7e2fce50, free
> space=1011k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native
> code)
> J 86 C2
> org.apache.hadoop.io.FastByteComparisons$LexicographicalComparerHolder$UnsafeComparer.compareTo([BII[BII)I
> (273 bytes) @ 0xffffffff6fc9b150 [0xffffffff6fc9b0e0+0x70]
> j
> org.apache.hadoop.io.FastByteComparisons$LexicographicalComparerHolder$UnsafeComparer.compareTo(Ljava/lang/Object;IILjava/lang/Object;II)I+16
> j org.apache.hadoop.io.FastByteComparisons.compareTo([BII[BII)I+11
> j org.apache.hadoop.io.WritableComparator.compareBytes([BII[BII)I+8
> j org.apache.hadoop.io.Text$Comparator.compare([BII[BII)I+39
> j org.apache.hadoop.io.TestText.testCompare()V+167
> v ~StubRoutines::call_stub
> {noformat}
> {noformat}
> # Problematic frame:
> # V [libjvm.so+0xc7fa40] Unsafe_GetLong+0x158
> Stack: [0xffffffff7e200000,0xffffffff7e300000], sp=0xffffffff7e2fc9b0, free
> space=1010k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native
> code)
> V [libjvm.so+0xc7fa40] Unsafe_GetLong+0x158
> j sun.misc.Unsafe.getLong(Ljava/lang/Object;J)J+-292148
> j sun.misc.Unsafe.getLong(Ljava/lang/Object;J)J+0
> j
> org.apache.hadoop.io.FastByteComparisons$LexicographicalComparerHolder$UnsafeComparer.compareTo([BII[BII)I+91
> j
> org.apache.hadoop.io.FastByteComparisons$LexicographicalComparerHolder$UnsafeComparer.compareTo(Ljava/lang/Object;IILjava/lang/Object;II)I+16
> j org.apache.hadoop.io.FastByteComparisons.compareTo([BII[BII)I+11
> j org.apache.hadoop.io.WritableComparator.compareBytes([BII[BII)I+8
> j
> org.apache.hadoop.mapred.gridmix.GridmixRecord$Comparator.compare([BII[BII)I+61
> j
> org.apache.hadoop.mapred.gridmix.TestGridmixRecord.binSortTest(Lorg/apache/hadoop/mapred/gridmix/GridmixRecord;Lorg/apache/hadoop/mapred/gridmix/GridmixRecord;IILorg/apache/hadoop/io/WritableComparator;)V+280
> j org.apache.hadoop.mapred.gridmix.TestGridmixRecord.testBaseRecord()V+57
> v ~StubRoutines::call_stub
> {noformat}