Leo,

Maybe HBaseHUT can help, although you say "append", not "update" or "combine"...

See:
http://blog.sematext.com/2010/12/16/deferring-processing-updates-to-increase-hbase-write-performance/


Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



----- Original Message ----
> From: Leo Alekseyev <[email protected]>
> To: [email protected]
> Sent: Mon, November 1, 2010 5:28:31 AM
> Subject: Best strategy for row updates
> 
> We are populating some HBase tables from daily data streams that are
> stored in Hive. When we see a row key that's already in the table,
> the data should be appended to that row's record. What is the best
> way to achieve this? Should we be using the Java API? Rely on
> HBase cell timestamping? Create compound keys (row_id+date) and
> periodically run a separate MR job to coalesce all the data belonging
> to the same row_id?
> 
> Any pointers greatly appreciated!
> 
> --Leo
> 
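
For illustration, here is a minimal sketch of the compound-key option raised in the question above (row_id+date keys, coalesced later). It does not use the HBase client: a plain Python dict stands in for the table, and the "|" separator, the function names, and the sample values are all hypothetical. It assumes ISO-formatted dates, so that sorting keys as strings also orders them chronologically, much as an HBase scan returns rows in key order.

```python
from collections import defaultdict

SEP = "|"  # hypothetical separator; assumed never to occur in a row id


def compound_key(row_id, date):
    """Build a row key such as 'user42|2010-11-01'."""
    return f"{row_id}{SEP}{date}"


def coalesce(table):
    """Group compound-key rows by row_id, oldest first.

    This stands in for the periodic MR job: it walks the keys in
    sorted order and concatenates the values that share a row_id.
    """
    merged = defaultdict(list)
    for key in sorted(table):
        row_id, _date = key.rsplit(SEP, 1)
        merged[row_id].extend(table[key])
    return dict(merged)


# Each daily load writes to fresh keys, so no read-modify-write is needed.
table = {}
table[compound_key("user42", "2010-11-01")] = ["click:a"]
table[compound_key("user7", "2010-11-01")] = ["click:x"]
table[compound_key("user42", "2010-10-31")] = ["click:b"]

merged = coalesce(table)
```

The trade-off in this sketch is that writes stay cheap (each load is a blind insert), while readers either scan all keys sharing a row_id prefix or wait for the coalescing pass.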
