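Well, it depends on how the Text object is initialized. If it is initialized
with a String, it sets its internal length to the number of bytes in the
UTF-8 encoding of the string (15 here), even when the backing array is
larger. If it is initialized with a byte array, it takes the length of the
whole array (16 here), so the two Text objects compare as unequal.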
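
To illustrate the mechanism without Hadoop on the classpath, here is a
minimal JDK-only sketch. BackingArrayDemo, validBytes and roundTrip are
hypothetical names made up for this example, and this is not Hadoop's actual
code. It assumes that, as with Text.getBytes() and Text.getLength(), a
backing array may be longer than the number of bytes in use, and shows that
copying only the bytes in use makes the round trip come out equal.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical stand-in: a ByteBuffer's limit plays the role of
// Text.getLength(), and its backing array the role of Text.getBytes().
final class BackingArrayDemo {

    // Copy only the bytes in use, ignoring any spare capacity at the end
    // of the backing array.
    public static byte[] validBytes(ByteBuffer encoded) {
        int start = encoded.arrayOffset() + encoded.position();
        int end = encoded.arrayOffset() + encoded.limit();
        return Arrays.copyOfRange(encoded.array(), start, end);
    }

    // Encode to UTF-8, keep only the valid bytes, and decode again.
    public static String roundTrip(String s) {
        ByteBuffer encoded = StandardCharsets.UTF_8.encode(s);
        return new String(validBytes(encoded), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String input = "this is a test.";
        ByteBuffer encoded = StandardCharsets.UTF_8.encode(input);

        // The backing array may be larger than the number of bytes in use.
        System.out.println("bytes in use:  " + encoded.remaining());
        System.out.println("backing array: " + encoded.array().length);

        // Trimming to the valid length restores the original string.
        System.out.println("round trip ok: " + roundTrip(input).equals(input));
    }
}
```

In the test quoted below, the equivalent change would presumably be to pass
only the first input.getLength() bytes of input.getBytes() along, instead of
the whole array.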

---
Jim Kellerman, Senior Engineer; Powerset


> -----Original Message-----
> From: stack [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, November 21, 2007 9:04 AM
> To: hadoop-user@lucene.apache.org
> Subject: Re: Text and/or ImmutableBytesWritable issue?
>
> What Jim just said, but it looks to me like Text is doing the
> wrong thing.  When you ask it its length, it returns the byte
> buffer capacity rather than how many bytes are in use.  It
> says length is 16 but there are only 15 characters in your
> test string, UTF-8'd or not.
>
> St.Ack
>
>
> Jason Grey wrote:
> > Can anyone explain why "testTextToBytes" doesn't assert and
> > "testStringToBytes" does?
> >
> >
> > import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
> > import org.apache.hadoop.io.Text;
> >
> > import junit.framework.TestCase;
> >
> > public class TestImmutableBytesWritable extends TestCase {
> >
> >       public void testTextToBytes(){
> >
> >               Text input = new Text("this is a test.");
> >
> >               ImmutableBytesWritable bytes =
> >                       new ImmutableBytesWritable( input.getBytes() );
> >
> >               Text output = new Text( bytes.get() );
> >
> >               assertEquals(input, output);
> >
> >       }
> >
> >       public void testStringToBytes(){
> >
> >               String input = "this is a test.";
> >
> >               ImmutableBytesWritable bytes =
> >                       new ImmutableBytesWritable( input.getBytes() );
> >
> >               String output = new String( bytes.get() );
> >
> >               assertEquals(input, output);
> >
> >       }
> > }
> >
> >
> > If I inspect the objects during debugging at the point of the assert I
> > see the following:
> >
> > * input
> >       bytes = [116, 104, 105, 115, 32, 105
> >               , 115, 32, 97, 32, 116, 101
> >               , 115, 116, 46, 0]
> >       length = 15
> >
> > * bytes =     [116, 104, 105, 115, 32, 105
> >               , 115, 32, 97, 32, 116, 101
> >               , 115, 116, 46, 0]
> >
> > * output
> >       bytes = [116, 104, 105, 115, 32, 105
> >               , 115, 32, 97, 32, 116, 101
> >               , 115, 116, 46, 0]
> >       length = 16
> >
> > The length property appears to be off between the two Text objects,
> > but all the data is correct... any help would be greatly appreciated.
> >
> > Thanks
> >
> > -jg-
> >
> >
>
>
