[
https://issues.apache.org/jira/browse/HBASE-29346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Duo Zhang resolved HBASE-29346.
-------------------------------
Fix Version/s: 2.7.0
3.0.0-beta-2
2.6.3
2.5.12
Hadoop Flags: Reviewed
Resolution: Fixed
Pushed to all active branches.
Thanks [~prathyu6] for contributing!
> Multiple Snapshot restores on same restoreDir ends up in Dataloss
> -----------------------------------------------------------------
>
> Key: HBASE-29346
> URL: https://issues.apache.org/jira/browse/HBASE-29346
> Project: HBase
> Issue Type: Bug
> Components: snapshots
> Reporter: Prathyusha
> Assignee: Prathyusha
> Priority: Critical
> Labels: pull-request-available
> Fix For: 2.7.0, 3.0.0-beta-2, 2.6.3, 2.5.12
>
>
> We restore snapshots to a temporary directory for snapshot reads.
> When multiple SnapshotManifests (both created on the same table at t1 and
> t2, with t2 > t1) are restored to the same temp dir, the restore deletes the
> merge parent regions from /hbase/data/ instead of from the temp restore
> folder, as part of the region restore step of
> [RestoreSnapshotHelper|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/snapshot/RestoreSnapshotHelper.java#L416]
> Steps to reproduce:
> # Create a Snapshot on a table
> # Restore that snapshot on a temporary restoreDirectory instead of the same
> table
> # Delete that snapshot from shell
> # Disable compactions and trigger Merge
> # Create another snapshot
> # Restore that snapshot on to the same restoreDirectory from Step-2
> # The restore archives the closed parent regions from /hbase/data/ of the
> actual table instead of from the temporary restoreDirectory, leaving dangling
> references in the daughter region, which results in data loss
> # Restart the regionserver holding the merged daughter region; the region
> will end up in FAILED_OPEN state because of the dangling reference files,
> since the parent store files are already archived
> Proposed immediate fix:
> RestoreSnapshotHelper performs {{restore, add, remove}} operations on regions.
> The restore and add operations use the {{tableDir}} of RestoreSnapshotHelper
> (which is constructed from {{restoreDir}}) to construct {{RegionDir}} paths.
> We should use the same strategy in the removeRegions path as well.
> Currently,
> [RestoreSnapshotHelper.removeHdfsRegion|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/snapshot/RestoreSnapshotHelper.java#L416]
> uses
> [HFileArchiver.archiveRegion|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/backup/HFileArchiver.java#L104]
> which constructs the table directory from rootDir instead of restoreDir.
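
The path mismatch described above can be illustrated with a minimal sketch. It does not use the HBase API: paths are modeled as plain strings, the class and method names (RestoreDirPathSketch, tableDir, regionDir) are hypothetical, and the <base>/data/<namespace>/<table>/<region> layout and the sample directory names are assumptions made for illustration only.

```java
// Illustrative sketch only; not the HBase implementation.
// Assumed layout: <base>/data/<namespace>/<table>/<encodedRegionName>
class RestoreDirPathSketch {

    // Builds a table directory under an arbitrary base (rootDir or restoreDir).
    static String tableDir(String baseDir, String namespace, String table) {
        return String.join("/", baseDir, "data", namespace, table);
    }

    // Builds a region directory from an already resolved table directory.
    static String regionDir(String tableDir, String encodedRegionName) {
        return tableDir + "/" + encodedRegionName;
    }

    public static void main(String[] args) {
        String rootDir = "/hbase";          // assumed live cluster root
        String restoreDir = "/tmp/restore"; // assumed temporary restore directory
        String region = "abc123";           // hypothetical encoded region name

        // Behaviour described in the report: the remove path resolves the
        // region under rootDir, so it touches the live table's data.
        String fromRootDir = regionDir(tableDir(rootDir, "default", "t1"), region);

        // Proposed behaviour: reuse the tableDir derived from restoreDir,
        // as the restore and add paths already do.
        String fromRestoreDir = regionDir(tableDir(restoreDir, "default", "t1"), region);

        System.out.println("archived today:      " + fromRootDir);
        System.out.println("should be archived:  " + fromRestoreDir);
    }
}
```

With these sample values the first path falls under /hbase/data/ and the second under /tmp/restore/data/, which is the difference the proposed fix is meant to close.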
--
This message was sent by Atlassian Jira
(v8.20.10#820010)