Leo,

Maybe HBaseHUT can help, although you say "append", not "update" or "combine"...
See: http://blog.sematext.com/2010/12/16/deferring-processing-updates-to-increase-hbase-write-performance/

Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/

----- Original Message ----
> From: Leo Alekseyev <[email protected]>
> To: [email protected]
> Sent: Mon, November 1, 2010 5:28:31 AM
> Subject: Best strategy for row updates
>
> We are populating some HBase tables from daily data streams that are
> stored in Hive. When we see a row key that's already in the table,
> the data should be appended to that row's record. What is the best
> way to achieve this?.. Should we be using the Java API?.. Rely on
> HBase cell timestamping?.. Create compound keys (row_id+date) and
> periodically run a separate MR job to coalesce all the data belonging
> to the same row_id?..
>
> Any pointers greatly appreciated!
>
> --Leo
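
For what it's worth, the compound-key option from the question (row_id+date keys, coalesced later by a separate job) can be sketched in a few lines. This is only an in-memory illustration under stated assumptions: a plain Python dict stands in for the HBase table, no HBase or MapReduce API is called, and the names `SEP`, `compound_key` and `coalesce` are hypothetical, not part of HBase or HBaseHUT.

```python
SEP = "|"  # hypothetical separator; assumed never to occur inside a date


def compound_key(row_id, date):
    """Build a row_id+date key. ISO dates (YYYY-MM-DD) sort chronologically as strings."""
    return f"{row_id}{SEP}{date}"


def coalesce(table):
    """Merge every compound-key entry into one record per row_id.

    `table` is a dict standing in for the HBase table:
    compound key -> list of values written on that date.
    Returns a dict: row_id -> all values for that row, in date order.
    """
    merged = {}
    # Sorting the keys puts entries of the same row_id in date order.
    for key in sorted(table):
        row_id, _date = key.rsplit(SEP, 1)
        merged.setdefault(row_id, []).extend(table[key])
    return merged


if __name__ == "__main__":
    table = {}
    # Daily loads write new keys instead of rewriting existing rows.
    table[compound_key("user1", "2010-11-01")] = ["click"]
    table[compound_key("user1", "2010-10-31")] = ["view"]
    table[compound_key("user2", "2010-11-01")] = ["buy"]
    print(coalesce(table))
```

The point of the design is that each daily write is a plain insert under a new key, so the load never has to read the existing row first; the cost is moved to the periodic coalescing pass, which is roughly the idea of deferring update processing.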
