Hi, I set an org.apache.hadoop.io.Text object to the Chinese string "測試" (six bytes in UTF-8). When I call its getBytes() method, the returned array is 11 bytes long, although only the first six bytes are actual data. I know the class also has a getLength() method. My question is: why doesn't getBytes() return only the valid bytes?
Why isn't it implemented like this:
public byte[] getBytes() {
    // return a copy holding only the valid bytes, not the whole backing array
    return Arrays.copyOf(bytes, getLength());
}
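To illustrate what I mean without depending on Hadoop, here is a minimal plain-Java sketch. The class name ValidBytesDemo and the helper validBytes are hypothetical, and the 11-byte buffer only simulates an over-allocated backing array; this is not Text's actual implementation.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ValidBytesDemo {
    // Hypothetical helper: copy only the first `length` bytes of a backing buffer.
    static byte[] validBytes(byte[] buffer, int length) {
        return Arrays.copyOf(buffer, length);
    }

    public static void main(String[] args) {
        // The two Chinese characters from the question, written as Unicode escapes.
        byte[] encoded = "\u6e2c\u8a66".getBytes(StandardCharsets.UTF_8);

        // Assumption: simulate a backing array larger than the data (zero-padded to 11 bytes).
        byte[] buffer = Arrays.copyOf(encoded, 11);
        int length = encoded.length; // plays the role of getLength()

        byte[] trimmed = validBytes(buffer, length);
        System.out.println(buffer.length + " " + trimmed.length); // prints "11 6"
    }
}
```

With a real Text object t, the same trimming on the caller side would be Arrays.copyOf(t.getBytes(), t.getLength()).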
Thanks in advance,
Shen
