Hi all,

TL;DR Could table snapshots taken in hbase 1.0 be used in hbase 2.1?

We have an existing production hbase 1.0 cluster (CDH 5.4), and we're
setting up a new cluster with hbase 2.1 (CDH 6.3). Let's call the old
cluster C1 and new one C2.

To migrate the existing data from C1 to C2, we plan to use the "snapshot +
replication" approach (snapshot would capture the existing part, and
replication would do the incremental part). However, when I was testing the
feasibility of this approach locally, I found that the snapshot could be
successfully exported to C2, but the restored table on C2 has no data.

Here is a minimal reproducible example:

1. on C1: take the snapshot and export it to C2

hbase shell:
    create "t1", {"NAME"=>"f1", "REPLICATION_SCOPE" => 1}
    put "t1", "r1", "f1:c1", "value"
    snapshot 't1', 't1s1'

sudo -u hdfs hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot \
            -snapshot t1s1 \
            -copy-to hdfs://c2:8020/hbase -mappers 1

2. Then on C2 restore the table

hbase shell:
    create "t1", {"NAME"=>"f1", "REPLICATION_SCOPE" => 1}
    disable "t1"
    restore_snapshot "t1s1"
    enable "t1"
    scan "t1"

All these steps succeed, except that the final "scan" command shows no
data at all. Also worth noting: the master web UI on C2 shows that table
t1 has two regions, one of which is not assigned. It should obviously have
only one region.
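
In case it helps with diagnosis, below is a rough sketch of how the exported
snapshot could be inspected on C2 before restoring it. It assumes the hbase
root dir on C2 is /hbase (matching the -copy-to target above) and that the
commands run as a user who can read it. The DRY_RUN switch and the
run/check_snapshot helpers are made up for illustration only; by default the
script just prints the commands instead of executing them.

```shell
#!/usr/bin/env bash
# Sketch: inspect the exported snapshot on C2 before restoring it.
# Assumptions: hbase root dir is /hbase; helper names are hypothetical.
SNAPSHOT="${SNAPSHOT:-t1s1}"
HBASE_ROOT="${HBASE_ROOT:-/hbase}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = run them

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

check_snapshot() {
  # Snapshot metadata as stored on HDFS under the hbase root dir
  run hdfs dfs -ls -R "$HBASE_ROOT/.hbase-snapshot/$SNAPSHOT"
  # Files referenced by the snapshot, plus size statistics
  run hbase org.apache.hadoop.hbase.snapshot.SnapshotInfo \
      -snapshot "$SNAPSHOT" -files -stats
}

check_snapshot
```

Running it with DRY_RUN=0 on a C2 node would execute the two commands; if
SnapshotInfo lists the expected store file for f1, the data was copied and
the problem is more likely in the restore step.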

So my question is: Could table snapshots taken in hbase 1.0 be used in
hbase 2.1?
- If yes, anything I'm doing wrong here?
- If no, is there any workaround? (e.g. performing some preprocessing on
the snapshot data on hbase 2.1 side before restoring it?)

If this can't work, the only alternative way to migrate the data is to
install hbase 1.0 on C2 (so it could use the snapshot from C1), and upgrade
it to hbase 2.1 after restoring the snapshot. I'd like to avoid going
this way as much as possible because it would be too cumbersome.

Any information would be much appreciated, thx!
