Nishant Khurana wrote:
Hi,
I was writing a mapreduce class to read from a text file and write the
entries to a table. My Map function reads each line and outputs a key and a
MapWritable as value. I was wondering while writing reduce using
TableReduce, how to convert the key (IntWritable) to ImmutableBytesWritable
and Mapwritable object to BatchUpdate so that my outputcollector doesn't
complain in reduce function. It seems to enforce the signature where it
collects the above two datatypes only.
For the key, would something like the below work for you:
// Let 'key' be the IntWritable passed to the reduce; key.get() returns an int.
// Bytes has a bunch of overrides for different types, each returning byte [].
ImmutableBytesWritable ibw =
    new ImmutableBytesWritable(Bytes.toBytes(key.get()));
For the MapWritable to BatchUpdate, how about:
// Again, let 'key' be the passed IntWritable key. To make a byte array
// of it, use Bytes.toBytes.
BatchUpdate bu = new BatchUpdate(Bytes.toBytes(key.get()));
// Let 'v' be the Iterator over the MapWritable values passed to this reduce.
while (v.hasNext()) {
  HbaseMapWritable<SomeWritable, SomeWritable> hmw = v.next();
  for (Entry<SomeWritable, SomeWritable> e: hmw.entrySet()) {
    // Use the Bytes.toBytes override that matches whatever your Writables hold.
    bu.put(Bytes.toBytes(e.getKey()), Bytes.toBytes(e.getValue()));
  }
}
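With the key and BatchUpdate converted, the collect your OutputCollector's
signature is insisting on would then be along the lines of:

  output.collect(ibw, bu);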
For 0.19.0 hbase, there is an example that does something similar to what
you are up to under src/examples/mapred, though I think it might depend on
a recent fix to HbaseMapWritable that allowed it to take a byte array as
value, not just Writables.
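If it helps, here is a rough, untested sketch of how the whole reduce might
hang together. The class name is made up, and it assumes your MapWritable
keys are "family:qualifier" column names and that keys and values are
stringable Writables such as Text; adjust to what your map actually emits:

  import java.io.IOException;
  import java.util.Iterator;
  import java.util.Map;

  import org.apache.hadoop.hbase.io.BatchUpdate;
  import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
  import org.apache.hadoop.hbase.mapred.TableReduce;
  import org.apache.hadoop.hbase.util.Bytes;
  import org.apache.hadoop.io.IntWritable;
  import org.apache.hadoop.io.MapWritable;
  import org.apache.hadoop.io.Writable;
  import org.apache.hadoop.mapred.MapReduceBase;
  import org.apache.hadoop.mapred.OutputCollector;
  import org.apache.hadoop.mapred.Reporter;

  public class MyTableReducer extends MapReduceBase
      implements TableReduce<IntWritable, MapWritable> {
    public void reduce(IntWritable key, Iterator<MapWritable> values,
        OutputCollector<ImmutableBytesWritable, BatchUpdate> output,
        Reporter reporter)
    throws IOException {
      // Row key is the int key serialized as bytes.
      byte [] row = Bytes.toBytes(key.get());
      BatchUpdate bu = new BatchUpdate(row);
      while (values.hasNext()) {
        MapWritable mw = values.next();
        for (Map.Entry<Writable, Writable> e: mw.entrySet()) {
          // Assumes entry keys are column names and values are stringable.
          bu.put(Bytes.toBytes(e.getKey().toString()),
              Bytes.toBytes(e.getValue().toString()));
        }
      }
      output.collect(new ImmutableBytesWritable(row), bu);
    }
  }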
Also, I believe I can only use the above two datatypes while using table
reduce, but I couldn't understand them very well. How can I convert any
datatype to the above two in order to write them to the tables?
Please say more. I don't think I follow exactly (and would like to fix
this for 0.19.0 if it's what I think you are saying).
St.Ack