For #1, major compaction writes new data files under /hbase/data, after which the archived data is no longer referenced and can be released.
For #2, new data is written under /hbase/data.
For #3, you can access your data before major compaction is performed. You should follow best practice for major compaction on the restored table.

On Wed, Dec 20, 2017 at 8:36 PM, Subash K <[email protected]> wrote:

> We have a use case to transfer data from one cluster to another cluster.
> As of now we are using CopyTable, but it is having impact on region server
> and it is taking lot of time to complete data transfer from one to another.
>
> So we are exploring on HBase Export Snapshot feature and we have planned
> to go ahead with the below steps.
>
>
> 1. Take snapshot of a table in Source
> 2. Execute ExportSnapshot job and send the snapshot to the destination
> 3. Restore the snapshot sent from source.
> 4. Now we are able to access the data.
>
> We want to understand how the data is handled in destination after
> restoring the snapshot. Because we can still see the data under
> /hbase/archive/data directory in HDFS and only reference data is being
> maintained in /hbase/data/
>
> Can someone help us to understand
>
> 1. When the data under /hbase/archive/data will be removed?
> 2. When new data is inserted into the table, where the data will be
> stored either in /hbase/archive/data or /hbase/data?
> 3. I tried to delete the snapshot and run major_compaction for the
> table, the data got moved from /hbase/archive/data to /hbase/data. So, is
> major_compaction required always after restoring snapshot to move the data
> to its respective data location?
> 4. I'm able to see that data is being stored in archive even if there
> is no snapshot. Under what other scenario data will be stored in
> /hbase/archive/data/ ?
>
> Regards,
> Subash Kunjupillai
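
For illustration, the four steps in the quoted message, followed by the major compaction discussed in the answer to #1, could be scripted roughly as below. This is only a sketch: the table name, snapshot name, destination HDFS URL and mapper count are hypothetical placeholders, and the script only prints the commands unless DRY_RUN=0 is set. It uses clone_snapshot on the destination on the assumption that the table does not exist there yet; restore_snapshot would be the alternative when the table already exists and is disabled.

```shell
#!/bin/sh
# Sketch of the snapshot transfer workflow. All names below are hypothetical.
TABLE="my_table"
SNAPSHOT="my_table_snap"
DEST_HBASE_ROOT="hdfs://dest-namenode:8020/hbase"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Print the command line in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Send one statement to the HBase shell in non-interactive mode.
hbase_shell() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "hbase shell: $1"
  else
    echo "$1" | hbase shell -n
  fi
}

# Steps 1 and 2: run on the source cluster.
source_steps() {
  hbase_shell "snapshot '$TABLE', '$SNAPSHOT'"
  run hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot \
    -snapshot "$SNAPSHOT" -copy-to "$DEST_HBASE_ROOT" -mappers 4
}

# Step 3 plus the major compaction: run on the destination cluster.
destination_steps() {
  hbase_shell "clone_snapshot '$SNAPSHOT', '$TABLE'"
  hbase_shell "major_compact '$TABLE'"
}

source_steps
destination_steps
```

With the default DRY_RUN=1 the script just lists the commands in order, so each half can be reviewed before it is run on the matching cluster.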
