ImmutableBytesWritable comparator not lexicographic?

Chase Bradford Wed, 16 Jun 2010 21:20:13 -0700

Hi Everyone,

I've been trying to track down a problem with sorting ImmutableBytesWritables
(IBWs) using their comparator, and it seems the comparator doesn't order keys
lexicographically as I expected.


The problem seems to be that IBW.Comparator extends WritableComparator, but
only overrides the raw-bytes compare(byte[], int, int, byte[], int, int).
The object-level WritableComparator.compare(a, b) falls back to IBW.compareTo,
which compares by length first and then by contents, as if aiming for a
big-endian numerical comparison.  It's not quite a numerical comparison
either, because it doesn't account for leading 0 bytes: {0x0f} sorts before
{0x00, 0x00}.
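
To make the difference concrete, here is a minimal sketch that uses plain
byte arrays instead of the HBase classes. The names CompareSketch,
lengthFirst and lexicographic are made up for illustration, and lengthFirst
only mirrors the behaviour described above; it is not the HBase source.

```java
public class CompareSketch {
    // Length first, then contents: an approximation of the ordering
    // described above, not the actual HBase implementation.
    static int lengthFirst(byte[] a, byte[] b) {
        if (a.length != b.length) {
            return Integer.compare(a.length, b.length);
        }
        return lexicographic(a, b);
    }

    // Unsigned byte-by-byte comparison; a strict prefix sorts before
    // the longer array.
    static int lexicographic(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        return Integer.compare(a.length, b.length);
    }

    public static void main(String[] args) {
        byte[] k1 = {0x0f};
        byte[] k2 = {0x00, 0x00};
        // Negative: the shorter key wins regardless of its contents.
        System.out.println("length-first:  " + Integer.signum(lengthFirst(k1, k2)));
        // Positive: 0x0f > 0x00 decides on the first byte.
        System.out.println("lexicographic: " + Integer.signum(lexicographic(k1, k2)));
    }
}
```

The two orderings disagree on exactly the pair used in my test case.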

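I stumbled on this while trying to use the TotalOrderPartitioner with a
partition file that is sorted lexicographically but contains values of
varying lengths.  The partitioner uses the Comparator's compare() method, so
it sees the split points in a different order than the one they were written in.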
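
A rough illustration of why this matters for a partition file: the sketch
below runs a binary search over lexicographically sorted split points with
each ordering. It is not TotalOrderPartitioner code; SplitLookupSketch,
insertionPoint and the two comparators are hypothetical names, and plain
byte arrays stand in for IBWs.

```java
import java.util.Arrays;
import java.util.Comparator;

public class SplitLookupSketch {
    // Unsigned byte-by-byte order; a strict prefix sorts before the longer array.
    static final Comparator<byte[]> LEXICOGRAPHIC = (a, b) -> {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    };

    // Length first, then contents (an approximation of the ordering
    // described in this message, not the HBase source).
    static final Comparator<byte[]> LENGTH_FIRST = (a, b) ->
        a.length != b.length ? a.length - b.length : LEXICOGRAPHIC.compare(a, b);

    // Where Arrays.binarySearch places key among splits under comparator c.
    static int insertionPoint(byte[][] splits, byte[] key, Comparator<byte[]> c) {
        int pos = Arrays.binarySearch(splits, key, c);
        return pos < 0 ? -pos - 1 : pos;
    }

    public static void main(String[] args) {
        // Sorted lexicographically, with varying lengths.
        byte[][] splits = { {0x00, 0x00}, {0x0f}, {0x10, 0x00} };
        byte[] key = {0x05};
        System.out.println("lexicographic: " + insertionPoint(splits, key, LEXICOGRAPHIC));
        System.out.println("length-first:  " + insertionPoint(splits, key, LENGTH_FIRST));
    }
}
```

With these inputs the lexicographic search places the key at index 1 and the
length-first search at index 0, so the same key would land in a different
partition depending on which ordering the lookup uses.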

Can someone explain why IBW.compareTo is implemented this way?

Thanks,
Chase

My test case:

import org.apache.hadoop.hbase.io.ImmutableBytesWritable;

public class Test
{
   public static void main(String[] args){
      // One byte, 0x0f: shorter than ibw2, but lexicographically greater.
      ImmutableBytesWritable ibw1 = new ImmutableBytesWritable( new byte[]{0x0f} );
      // Two bytes, 0x00 0x00.
      ImmutableBytesWritable ibw2 = new ImmutableBytesWritable( new byte[]{0x00, 0x00} );
      ImmutableBytesWritable.Comparator c = new ImmutableBytesWritable.Comparator();

      // A lexicographic comparator would never take this branch, since 0x0f > 0x00.
      if( c.compare( ibw1, ibw2 ) < 0 )
         System.err.println( "ibw1 < ibw2" );

      System.exit(0);
   }
}
