Hey Michael,

You could probably use Cascading to migrate data between HBase clusters:
http://wiki.apache.org/hadoop/Hbase/Cascading
The code doesn't currently support multiple HBase cluster clients in a single JVM, but I'm sure that can be coded in quickly (the code is hosted on GitHub, so it is easily cloned and patched).
A benefit of using Cascading would be the ability to add quality checks or filter data very easily.
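To make the "quality checks or filter" idea concrete, here is a minimal sketch of a sequential scan-and-copy with a pluggable row check. It is not the Cascading or HBase API: the two clusters are stood in for by plain in-memory dicts, and `scan`, `copy_table`, `has_required_columns`, and the `info:name` column are all hypothetical names chosen for illustration.

```python
from typing import Callable, Dict, Iterator, Tuple

# Hypothetical stand-ins: a "table" is a dict of row key -> {column: value}.
Row = Dict[str, bytes]
Table = Dict[bytes, Row]


def scan(table: Table) -> Iterator[Tuple[bytes, Row]]:
    """Yield rows in sorted key order, roughly as a table scan would."""
    for key in sorted(table):
        yield key, table[key]


def copy_table(
    source: Table,
    sink: Table,
    keep: Callable[[bytes, Row], bool] = lambda key, row: True,
) -> Tuple[int, int]:
    """Copy rows that pass `keep` from source to sink.

    Returns (copied, skipped) so the caller can sanity-check the run.
    """
    copied = skipped = 0
    for key, row in scan(source):
        if keep(key, row):
            sink[key] = dict(row)  # copy the row so the sink owns its data
            copied += 1
        else:
            skipped += 1
    return copied, skipped


def has_required_columns(key: bytes, row: Row) -> bool:
    """Example quality check: require a non-empty 'info:name' cell."""
    return len(row.get("info:name", b"")) > 0


if __name__ == "__main__":
    production: Table = {
        b"row1": {"info:name": b"alice"},
        b"row2": {"info:name": b""},        # fails the check (empty value)
        b"row3": {"info:email": b"x@y"},    # fails the check (column missing)
    }
    backup: Table = {}
    print(copy_table(production, backup, has_required_columns))
```

In a real migration the dicts would be replaced by whatever source and sink the job reads from and writes to; the point is only that the check sits in one place between the scan and the write.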
I've already heard of users starting to migrate from HBase to an RDBMS using Cascading, and also between Hypertable and Aster Data. Hopefully those adapters will leak out for the rest of us to use.
ckw

On Feb 4, 2009, at 8:21 AM, Michael Dagaev wrote:
Hi, all

I read HBASE-974 and HBASE-643, which were mentioned on the list, but what do you think about copying tables from the production to a backup HBase cluster? I guess we do need a big iron for such a backup cluster.

I understand that the copy can be implemented with MR, but for now we can implement it as a simple sequential script that scans the tables of the production HBase and writes the data to the backup HBase. Does it make sense?

Thank you for your cooperation,
M.
--
Chris K Wensel
[email protected]
http://www.cascading.org/
http://www.scaleunlimited.com/
