Nishant Khurana wrote:
Hi,
I was writing a mapreduce class to read from a text file and write the
entries to a table. My Map function reads each line and outputs a key and a
MapWritable as value. I was wondering while writing reduce using
TableReduce, how to convert the key (IntWritable) to ImmutableBytesWritable
and Mapwritable object to BatchUpdate so that my outputcollector doesn't
complain in reduce function. It seems to enforce the signature where it
collects the above two datatypes only.
For the key, would something like the below work for you:
// Let 'key' be the IntWritable passed to the reduce; key.get() returns an int.
// Bytes has a bunch of overrides for different types, each returning byte [].
ImmutableBytesWritable ibw =
    new ImmutableBytesWritable(Bytes.toBytes(key.get()));
For the MapWritable to BatchUpdate, how about:
// Again, let 'key' be the passed IntWritable key. To make a byte array
// of it, use Bytes.toBytes.
BatchUpdate bu = new BatchUpdate(Bytes.toBytes(key.get()));
// Let 'v' be the Iterator over the MapWritable values passed to this reduce.
while (v.hasNext()) {
  HbaseMapWritable<SomeWritable, SomeWritable> hmw = v.next();
  for (Entry<SomeWritable, SomeWritable> e: hmw.entrySet()) {
    // Use the Bytes.toBytes override that matches whatever your Writables hold.
    bu.put(Bytes.toBytes(e.getKey()), Bytes.toBytes(e.getValue()));
  }
}
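With the key and BatchUpdate converted, the collect your OutputCollector's
signature is insisting on would then be along the lines of:

  output.collect(ibw, bu);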
For 0.19.0 hbase, there is an example that does something similar to what
you are up to under src/examples/mapred, though I think it might depend on
a recent fix to HbaseMapWritable that allowed it to take a byte array as
value, not just Writables.
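If it helps, here is a rough, untested sketch of how the whole reduce might
hang together. The class name is made up, and it assumes your MapWritable
keys are "family:qualifier" column names and that keys and values are
stringable Writables such as Text; adjust to what your map actually emits:

  import java.io.IOException;
  import java.util.Iterator;
  import java.util.Map;

  import org.apache.hadoop.hbase.io.BatchUpdate;
  import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
  import org.apache.hadoop.hbase.mapred.TableReduce;
  import org.apache.hadoop.hbase.util.Bytes;
  import org.apache.hadoop.io.IntWritable;
  import org.apache.hadoop.io.MapWritable;
  import org.apache.hadoop.io.Writable;
  import org.apache.hadoop.mapred.MapReduceBase;
  import org.apache.hadoop.mapred.OutputCollector;
  import org.apache.hadoop.mapred.Reporter;

  public class MyTableReducer extends MapReduceBase
      implements TableReduce<IntWritable, MapWritable> {
    public void reduce(IntWritable key, Iterator<MapWritable> values,
        OutputCollector<ImmutableBytesWritable, BatchUpdate> output,
        Reporter reporter)
    throws IOException {
      // Row key is the int key serialized as bytes.
      byte [] row = Bytes.toBytes(key.get());
      BatchUpdate bu = new BatchUpdate(row);
      while (values.hasNext()) {
        MapWritable mw = values.next();
        for (Map.Entry<Writable, Writable> e: mw.entrySet()) {
          // Assumes entry keys are column names and values are stringable.
          bu.put(Bytes.toBytes(e.getKey().toString()),
              Bytes.toBytes(e.getValue().toString()));
        }
      }
      output.collect(new ImmutableBytesWritable(row), bu);
    }
  }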
Also, I believe I can only use the above two datatypes while using table
reduce, but I couldn't understand them very well. How can I convert any
datatype to the above two in order to write them to the tables?
Please say more. I don't think I follow exactly (and would like to fix
this for 0.19.0 if it's what I think you are saying).
St.Ack