Looking at http://hg.openjdk.java.net/jdk7/jdk7/jdk/file/9b8c96f96a0f/src/share/classes/sun/misc/Unsafe.java, looks like Unsafe is provided by openjdk as well.
I guess this issue, though disturbing, wouldn't show up. On Mon, Apr 1, 2013 at 10:04 AM, Alan Chaney <[email protected]> wrote: > > On 4/1/2013 9:42 AM, Stack wrote: > >> That is an interesting (disturbing) find Alan. Hopefully the fallback is >> rare. Did you have a technique for making the compare fallback to pure >> java compare? >> >> Thank you, >> St.Ack >> > > I agree its disturbing! I based my findings on reading the source code for > 0.92.1 (the CDH4.1.2 distro). > > It seems to me that, from org.apache.hadoop.hbase.**KeyValue$KVComparator > the KeyComparator calls KeyComparator.compareRows which in turn calls > > Bytes.compareTo(left, loffset, llength, righ, roffset, rlength) which in > turn calls Bytes.compareTo which calls LexicographicalCompareHolder.** > BEST_COMPARER > > which appears to be implemented thus: > > static class LexicographicalComparerHolder { > static final String UNSAFE_COMPARER_NAME = > LexicographicalComparerHolder.**class.getName() + > "$UnsafeComparer"; > > static final Comparer<byte[]> BEST_COMPARER = getBestComparer(); > /** > * Returns the Unsafe-using Comparer, or falls back to the pure-Java > * implementation if unable to do so. > */ > static Comparer<byte[]> getBestComparer() { > try { > Class<?> theClass = Class.forName(UNSAFE_COMPARER_**NAME); > ... > } > > enum PureJavaComparer implements Comparer<byte[]> { > INSTANCE; > > @Override > public int compareTo(byte[] buffer1, int offset1, int length1, > ... > } > } > > So, it looks like to me that Unsafe is the default. However, its not > really very easy to debug this, except by invoking the > KeyValue.KeyComparator and seeing what you get, which is what I did. Either > I'm doing something very stupid (extremely plausible) or there is a bit of > an issue here. I was hoping that someone would point out my error! > > I've got some unit tests that appear to show the difference. > > Thanks > > Alan > > > > >> >> On Mon, Apr 1, 2013 at 7:54 AM, Alan Chaney <[email protected]> >> wrote: >> >> Hi >>> >>> I need to write some code that sorts row keys identically to HBase. >>> >>> I looked at the KeyValue.KeyComparator code, and it seems that, by >>> default, HBase elects to use the 'Unsafe' comparator as the basis of its >>> comparison, with a fall-back to to the PureJavaComparer should Unsafe not >>> be available (for example, in tests.) >>> >>> However, I'm finding that the sort order from a call to >>> KeyValue.KeyComparator appears to be inconsistent between the two forms. >>> >>> As an example, comparing: >>> >>> (first param) (second param) >>> 0000000000000000ffffffffffffff****ffffffffffffffffff616c1b to >>> 0000000000000000ffffffffffffff****ffffffffffffffffff61741b >>> >>> gives 1 for the default (presumably, Unsafe) call, and -1 using the >>> PureJavaComparator. >>> >>> I would actually expect it to be a -ve number, based on the difference of >>> 6c to 74 in the 3rd from last byte above. >>> >>> Similarly >>> >>> 000000000000000000000000000000****000000000000000000616c1b to >>> 000000000000000000000000000000****0000000000000000061741b >>> >>> gives > 0 instead of < 0. The PureJavaComparator does a byte-by-byte >>> comparison by >>> >>> Is this expected? From the definition of lexicographical compare that I >>> found, I don't think so. There's no issue of signed comparison here, >>> because 0x6c and 0x74 are still +ve byte values. >>> >>> Regards >>> >>> Alan >>> >>> >>> >>> >
