what is the code for WritableComparator.readVInt and WritableUtils.decodeVIntSize doing?

Jane Wayne Fri, 30 Mar 2012 21:38:36 -0700

in tom white's book, Hadoop: The Definitive Guide (second edition, page 99),
he shows how to compare the raw bytes of keys that contain Text fields. his
example looks like the following.


int firstL1 = WritableUtils.decodeVIntSize(b1[s1]) + readVInt(b1, s1);
int firstL2 = WritableUtils.decodeVIntSize(b2[s2]) + readVInt(b2, s2);

his explanation is that firstL1 is the length of the first String/Text in
b1, and firstL2 is the length of the first String/Text in b2. but i'm
unsure of what the code is actually doing.

what is WritableUtils.decodeVIntSize(...) doing?
what is WritableComparator.readVInt(...) doing?
why do we have to add the outputs of these 2 methods to get the length of
the String/Text?

could someone please explain in plain terms what's happening here? it seems
WritableComparator.readVInt(...) is already getting the length of the
byte[] corresponding to the string. it seems
WritableUtils.decodeVIntSize(...) is also doing the same thing (from
reading the javadoc).

when i look at WritableUtils.writeString(...), two things happen. the
length of the byte[] is written, followed by writing the byte[] itself. why
can't we simply do something like the following to get the length?

int firstL1 = readInt(b1, s1);
int firstL2 = readInt(b2, s2);
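
to make the question concrete, here is a minimal standalone sketch of how i currently read the byte layout. it assumes a Text field is serialized as a variable-length integer (vint) holding the byte count, followed by the UTF-8 bytes. it does not call hadoop: the class VIntSketch and the helpers serializeText and fieldLength are hypothetical names of my own, and the bodies of decodeVIntSize, readVInt and writeVInt are my re-implementation of what i believe the hadoop methods do. if that reading is right, decodeVIntSize would return how many bytes the length prefix occupies, readVInt would return the number of string bytes after the prefix, and the sum would be the total number of bytes the field takes up, which is what you need to skip to the next field.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// standalone sketch: re-implements the vint rules instead of calling hadoop
class VIntSketch {

    // how many bytes the vint occupies, decided from its first byte alone
    static int decodeVIntSize(byte first) {
        if (first >= -112) {
            return 1;              // small value stored directly in one byte
        } else if (first < -120) {
            return -119 - first;   // negative value: marker byte + payload bytes
        }
        return -111 - first;       // positive value: marker byte + payload bytes
    }

    // the integer value encoded by the vint that starts at bytes[start]
    static int readVInt(byte[] bytes, int start) {
        byte first = bytes[start];
        if (first >= -112) {
            return first;          // one-byte case: the byte is the value
        }
        boolean negative = first < -120;
        int payload = decodeVIntSize(first) - 1;
        long value = 0;
        for (int i = 0; i < payload; i++) {
            value = (value << 8) | (bytes[start + 1 + i] & 0xFF);
        }
        return (int) (negative ? ~value : value);
    }

    // writes a non-negative length as a vint (negative input is not handled)
    static void writeVInt(ByteArrayOutputStream out, int n) {
        if (n < 0) {
            throw new IllegalArgumentException("this sketch only encodes lengths");
        }
        if (n <= 127) {
            out.write(n);
            return;
        }
        int payload = 0;
        for (int tmp = n; tmp != 0; tmp >>>= 8) {
            payload++;
        }
        out.write(-112 - payload); // marker byte says how many bytes follow
        for (int idx = payload; idx != 0; idx--) {
            out.write((n >>> ((idx - 1) * 8)) & 0xFF);
        }
    }

    // assumed Text layout: vint byte count, then the UTF-8 bytes
    static byte[] serializeText(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVInt(out, utf8.length);
        out.write(utf8, 0, utf8.length);
        return out.toByteArray();
    }

    // same arithmetic as the book's firstL1: prefix width + payload length
    static int fieldLength(byte[] b, int s) {
        return decodeVIntSize(b[s]) + readVInt(b, s);
    }

    public static void main(String[] args) {
        byte[] b1 = serializeText("hadoop");
        System.out.println(decodeVIntSize(b1[0])); // 1 (prefix is one byte)
        System.out.println(readVInt(b1, 0));       // 6 (string bytes)
        System.out.println(fieldLength(b1, 0));    // 7 (whole field)
    }
}
```

under the same assumption, a 200-byte string would get a 2-byte prefix, so the field would span 202 bytes. that would also explain why readInt(b1, s1) cannot stand in for the sum: it reads a fixed 4 bytes, while the prefix here is 1 to 5 bytes wide, and the prefix's own width has to be counted when skipping past the field.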
