We have a use case that requires transferring data from one cluster to another. 
We are currently using CopyTable, but it puts load on the region servers and 
takes a long time to complete the transfer.

So we are exploring the HBase ExportSnapshot feature and plan to proceed with 
the steps below.


  1.  Take a snapshot of the table on the source cluster.
  2.  Run the ExportSnapshot job to send the snapshot to the destination cluster.
  3.  Restore the snapshot on the destination cluster.
  4.  At this point we are able to access the data on the destination.
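
The steps above can be sketched roughly as follows. This is only an 
illustration, not the exact commands we ran: the table name, snapshot name, 
destination URI, and mapper count are placeholders, and the script only prints 
the commands rather than executing them. Note that restore_snapshot expects an 
existing table to be disabled first.

```python
# Sketch only: builds the commands for the snapshot-export workflow without
# running them. All names below are hypothetical placeholders.
import shlex

TABLE = "my_table"                        # hypothetical table name
SNAPSHOT = "my_table_snap"                # hypothetical snapshot name
DEST_ROOT = "hdfs://dest-nn:8020/hbase"   # hypothetical destination HBase root dir


def export_snapshot_cmd(snapshot, dest_root, mappers=16):
    """Command line for the ExportSnapshot MapReduce job (run on the source)."""
    return [
        "hbase", "org.apache.hadoop.hbase.snapshot.ExportSnapshot",
        "-snapshot", snapshot,
        "-copy-to", dest_root,
        "-mappers", str(mappers),
    ]


def workflow(table, snapshot, dest_root):
    """Return (where to run, command) pairs for steps 1 to 3."""
    return [
        # Step 1: take the snapshot in the HBase shell on the source cluster.
        ("source hbase shell", f"snapshot '{table}', '{snapshot}'"),
        # Step 2: copy the snapshot files to the destination cluster.
        ("source command line",
         shlex.join(export_snapshot_cmd(snapshot, dest_root))),
        # Step 3: restore in the HBase shell on the destination cluster
        # (an existing table must be disabled before restoring).
        ("destination hbase shell", f"restore_snapshot '{snapshot}'"),
    ]


if __name__ == "__main__":
    for where, cmd in workflow(TABLE, SNAPSHOT, DEST_ROOT):
        print(f"[{where}] {cmd}")
```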

We want to understand how the data is handled on the destination after the 
snapshot is restored, because we can still see the data under the 
/hbase/archive/data directory in HDFS, while /hbase/data/ holds only references.

Can someone help us understand the following?

  1.  When will the data under /hbase/archive/data be removed?
  2.  When new data is inserted into the table, where will it be stored: in 
/hbase/archive/data or in /hbase/data?
  3.  I tried deleting the snapshot and running major_compact on the table, and 
the data moved from /hbase/archive/data to /hbase/data. So is a major 
compaction always required after restoring a snapshot to move the data to its 
proper location?
  4.  I can see data being stored in the archive even when there is no snapshot. 
In what other scenarios is data stored in /hbase/archive/data/?
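
For reference, the check described in question 3 can be sketched as below. 
This is a rough illustration rather than an exact record of what was run: the 
table and snapshot names are placeholders, the "default" namespace and the 
/hbase root directory are assumptions, and the script only prints the commands. 
Since major_compact only requests the compaction, the directory sizes may need 
to be checked again after it finishes.

```python
# Sketch only: lists the commands for the check in question 3 without
# running them. Names and paths are hypothetical placeholders.

def archive_check_cmds(table, snapshot, namespace="default", root="/hbase"):
    """Return (where to run, command) pairs for the snapshot/compaction check."""
    archive_dir = f"{root}/archive/data/{namespace}/{table}"
    data_dir = f"{root}/data/{namespace}/{table}"
    return [
        # Drop the snapshot that still references the archived files.
        ("hbase shell", f"delete_snapshot '{snapshot}'"),
        # Request a major compaction, which rewrites the store files.
        ("hbase shell", f"major_compact '{table}'"),
        # Compare space used under the archive and the live data directory.
        ("command line", f"hdfs dfs -du -s -h {archive_dir}"),
        ("command line", f"hdfs dfs -du -s -h {data_dir}"),
    ]


if __name__ == "__main__":
    for where, cmd in archive_check_cmds("my_table", "my_table_snap"):
        print(f"[{where}] {cmd}")
```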

Regards,
Subash Kunjupillai
