The backing array of a Text object typically contains more bytes
than are actually in use: getBytes() returns the whole array, and
only the first getLength() bytes are valid.
If you use the alternate constructor for ImmutableBytesWritable:
new ImmutableBytesWritable(input.getBytes(), 0, input.getLength());
the test will pass.
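
To illustrate the difference between the backing array and the valid
region, here is a minimal standalone sketch. It does not use Hadoop
or HBase: PaddedBytes is a hypothetical stand-in for Text that
deliberately over-allocates by one byte, and validBytes is an
assumed helper that copies only the bytes in use.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class BackingArrayDemo {

    /** Hypothetical stand-in for Text: a buffer with spare capacity plus a length. */
    public static final class PaddedBytes {
        private final byte[] buf;
        private final int length;

        public PaddedBytes(String s) {
            byte[] encoded = s.getBytes(StandardCharsets.UTF_8);
            // Deliberately allocate one spare byte to mimic unused capacity.
            this.buf = Arrays.copyOf(encoded, encoded.length + 1);
            this.length = encoded.length;
        }

        /** Returns the whole backing array, including the unused tail. */
        public byte[] getBytes() { return buf; }

        /** Returns the number of bytes that are actually in use. */
        public int getLength() { return length; }
    }

    /** Copies only the valid region, analogous to passing (bytes, 0, length). */
    public static byte[] validBytes(PaddedBytes p) {
        return Arrays.copyOfRange(p.getBytes(), 0, p.getLength());
    }

    public static void main(String[] args) {
        PaddedBytes input = new PaddedBytes("this is a test.");
        byte[] whole = input.getBytes();
        byte[] valid = validBytes(input);
        System.out.println("backing array: " + whole.length + " bytes");
        System.out.println("valid region:  " + valid.length + " bytes");
        System.out.println(new String(valid, StandardCharsets.UTF_8));
    }
}
```

Wrapping the whole backing array carries the unused trailing byte
along with it, which is why a length computed from the array differs
from the original length.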
One more note: relying on the default encoding for Strings may
work on any single machine, but if one machine's default encoding
is, say, ISO-8859-1 and another's is UTF-8, passing an
ImmutableBytesWritable from one machine to the other can result in
the String being decoded incorrectly. For this reason, we always
specify an encoding in String.getBytes and in the String constructor:
new ImmutableBytesWritable("this is a string".getBytes("UTF-8"))
and
new String(ibw.get(), "UTF-8")
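
As a rough standalone sketch of the same idea (no HBase classes
involved), the example below encodes and decodes with an explicit
charset. It assumes Java 7 or later for StandardCharsets; the class
and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

public class ExplicitCharsetDemo {

    /** Encodes with an explicit charset instead of the platform default. */
    public static byte[] encode(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    /** Decodes with the same explicit charset used for encoding. */
    public static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "caf\u00e9"; // contains one non-ASCII character
        byte[] wire = encode(original);

        // Same charset on both sides: the round trip is lossless.
        System.out.println(decode(wire).equals(original));

        // Mismatched charset on the receiving side: the text is corrupted.
        String wrong = new String(wire, StandardCharsets.ISO_8859_1);
        System.out.println(wrong.equals(original));
    }
}
```

Note that a mismatched charset usually does not throw an exception;
it quietly yields different characters, which makes the problem easy
to miss with ASCII-only test data.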
---
Jim Kellerman, Senior Engineer; Powerset
> -----Original Message-----
> From: Jason Grey [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, November 21, 2007 8:27 AM
> To: [email protected]
> Subject: Text and/or ImmutableBytesWritable issue?
>
> Can anyone explain why "testTextToBytes" doesn't assert and
> "testStringToBytes" does?
>
>
> import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
> import org.apache.hadoop.io.Text;
>
> import junit.framework.TestCase;
>
> public class TestImmutableBytesWritable extends TestCase {
>
>     public void testTextToBytes() {
>         Text input = new Text("this is a test.");
>
>         ImmutableBytesWritable bytes =
>             new ImmutableBytesWritable(input.getBytes());
>
>         Text output = new Text(bytes.get());
>
>         assertEquals(input, output);
>     }
>
>     public void testStringToBytes() {
>         String input = "this is a test.";
>
>         ImmutableBytesWritable bytes =
>             new ImmutableBytesWritable(input.getBytes());
>
>         String output = new String(bytes.get());
>
>         assertEquals(input, output);
>     }
> }
>
>
> If I inspect the objects during debugging at the point of the
> assert I see the following:
>
> * input
> bytes = [116, 104, 105, 115, 32, 105
> , 115, 32, 97, 32, 116, 101
> , 115, 116, 46, 0]
> length = 15
>
> * bytes = [116, 104, 105, 115, 32, 105
> , 115, 32, 97, 32, 116, 101
> , 115, 116, 46, 0]
>
> * output
> bytes = [116, 104, 105, 115, 32, 105
> , 115, 32, 97, 32, 116, 101
> , 115, 116, 46, 0]
> length = 16
>
> The length property appears to be off between the two Text
> objects, but all the data is correct... any help would be
> greatly appreciated.
>
> Thanks
>
> -jg-
>
>