Nishant Khurana wrote:
Hi,
I think that works for me. What I meant below was that the Output Collector
of TableReduce only collects ImmutableBytesWritable and BatchUpdate type of
key and value.

Yeah. These are the types needed by HTable doing its commit so it would seem to make sense that this is what should come out of the reduce step.

Looking at the code, it doesn't look like this would be easy to change; TableReduce is just an interface, but the bulk of the insert work is done in TableOutputFormat (TOF). If you want to work on making TOF take other types, just say so and we can try to work it through together (TOF would need to be made more generic).

So I was asking how can I use other datatypes while writing
back to Hbase tables using TableReduce. But it seems either my mapper or
reducer has to convert my datatypes into the above-mentioned types using the
method you suggested and then pass them on to the output collector of
TableReduce. Let me know if I am missing something.

Yeah, IBW and BU are what you need to produce to do your hbase insert.

Also, how can I store multiple values for the same column in Hbase? Like a
movie id containing 5 genres all coming under column genre. My mapper was
extracting a comma separated list of genres from a text file for a movie id
and separating it to produce id, genre pairs. Then I passed them to the reducer
to add them to the table, but BatchUpdate seems to overwrite the previous
entries with the last one. Can I store all values in the same column?

Well, as long as they are all emitted from the mapper with the same key, they should all show up in the reduce grouped together under that key, with the values available through an Iterator. How are you doing your map emissions?

For example, if you emit from your map with a key of movieid (say as Text or as IBW) and then each of the genres as values (again as either IBW or Text), then your reducer should be passed a key of movieid and then an Iterator over the Text or IBW genres.

You'd then, in your reducer, create a BatchUpdate and do BU.put("genre:genretype", genrevalue) for each genre, convert the key to IBW if it is not one already, and emit this from your reduce.
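
To make the overwrite issue concrete, here is a minimal sketch in plain Java with no HBase classes. It assumes the overwriting comes from every put going to the same column name, which the thread does not confirm. The Map is only a stand-in for a BatchUpdate, and GenreColumns and buildRow are names made up for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

class GenreColumns {
    // Stand-in for a BatchUpdate: one row's pending column -> value puts.
    // As with any map, a later put to the same column name replaces the
    // earlier one (assumed here to mirror the behaviour described above).
    static Map<String, byte[]> buildRow(Iterator<String> genres) {
        Map<String, byte[]> row = new LinkedHashMap<>();
        while (genres.hasNext()) {
            String genre = genres.next().trim();
            // One qualifier per genre ("genre:Comedy", "genre:Drama", ...),
            // so each genre lands in its own column under the genre family.
            row.put("genre:" + genre, genre.getBytes(StandardCharsets.UTF_8));
        }
        return row;
    }

    public static void main(String[] args) {
        Iterator<String> genres = Arrays.asList("Comedy", "Drama", "Romance").iterator();
        // Prints the column names that would be written for this movie id.
        System.out.println(buildRow(genres).keySet());
    }
}
```

In a real reducer the same loop would call put on the BatchUpdate rather than on a Map, with the row key taken from the movie id.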

Pardon me if I'm stating what you already know.

St.Ack


Thanks

On Mon, Nov 24, 2008 at 1:50 PM, stack <[EMAIL PROTECTED]> wrote:

Nishant Khurana wrote:

Hi,
I was writing a mapreduce class to read from a text file and write the
entries to a table. My Map function reads each line and outputs a key and a
MapWritable as value. I was wondering while writing reduce using
TableReduce, how to convert the key (IntWritable) to ImmutableBytesWritable
and the MapWritable object to BatchUpdate so that my outputcollector doesn't
complain in the reduce function. It seems to enforce the signature where it
collects the above two datatypes only.


For the key, would something like the below work for you:

// Let 'key' be the IntWritable passed to the reduce. key.get() returns an int.
// Bytes has a bunch of overloads for different types, each returning byte [].
ImmutableBytesWritable ibw = new ImmutableBytesWritable(Bytes.toBytes(key.get()));
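
As a rough illustration of what the row key bytes look like, the sketch below does the int to byte [] conversion with plain JDK classes instead of HBase's Bytes. It assumes Bytes.toBytes(int) yields the 4-byte big-endian form, which is what ByteBuffer.putInt produces by default. IntKeyBytes, toBytes and toInt are hypothetical names used only here.

```java
import java.nio.ByteBuffer;

class IntKeyBytes {
    // Encode an int as 4 bytes, most significant byte first
    // (big-endian is ByteBuffer's default byte order).
    static byte[] toBytes(int value) {
        return ByteBuffer.allocate(4).putInt(value).array();
    }

    // Decode 4 big-endian bytes back into the original int.
    static int toInt(byte[] bytes) {
        return ByteBuffer.wrap(bytes).getInt();
    }

    public static void main(String[] args) {
        byte[] rowKey = toBytes(42);
        System.out.println(rowKey.length + " bytes, round trip = " + toInt(rowKey));
    }
}
```

The point is only that the key ends up as a fixed-width byte array, so anything reading the row back has to apply the matching decode.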

For the MapWritable to BatchUpdate, how about:

      // Again, let 'key' be the passed IntWritable key.  To make a byte
      // array of it, use Bytes.toBytes.
      BatchUpdate bu = new BatchUpdate(Bytes.toBytes(key.get()));
      // Let 'v' be the Iterator over the map values passed to this reduce.
      while (v.hasNext()) {
        HbaseMapWritable<SomeWritable, SomeWritable> hmw = v.next();
        for (Entry<SomeWritable, SomeWritable> e: hmw.entrySet()) {
          // SomeWritable is a placeholder: convert the entry's key (the
          // column name) and its value to byte [] as suits your actual types.
          bu.put(Bytes.toBytes(e.getKey()), Bytes.toBytes(e.getValue()));
        }
      }

For 0.19.0 hbase, there is an example that does something similar to what
you are up to under src/examples/mapred, though I think it might depend on a
recent fix to HbaseMapWritable that allowed it to take a byte array as value,
not just Writables.

Also I believe I can only use the above two datatypes while using TableReduce
but couldn't understand them very well. How can I convert any datatype to
the above two to write them to the tables?



Please say more.  I don't think I follow exactly (and would like to fix
this for 0.19.0 if it's what I think you are saying).

St.Ack




