[ 
https://issues.apache.org/jira/browse/HBASE-23995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17060757#comment-17060757
 ] 

Szabolcs Bukros edited comment on HBASE-23995 at 3/17/20, 9:19 AM:
-------------------------------------------------------------------

After the split, the daughter regions only have hfile links to the storefiles 
in the parent. Even the CatalogJanitor leaves these parents alone, deleting 
them only after the compaction is done and they are no longer referenced from 
the daughters.

The title might not be precise or well chosen. What I wanted to say is that 
snapshotting a state where the split was done but compaction was not (this is 
what I clumsily called "splitting") saves a structure in which the daughter 
regions have no data, just links to a parent, and that structure can be 
exported. However, not all the necessary information is exported with it (the 
daughter references from the parent are missing). As a result, in the cloned 
table the parent region, which actually contains the data, is archived and 
then deleted within minutes after the cloning is done, and the exported data 
is lost.
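
To make the sequence easier to follow, here is a toy model of it in Python. 
This is only a sketch of the behaviour described in this issue, not HBase 
code: Region, clone_table, janitor_may_delete, run_janitor and dangling_links 
are hypothetical names, and the cleanup rule is my simplified reading of what 
the CatalogJanitor does. The keep_daughters flag stands in for the idea of 
propagating the daughter references with the snapshot.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Region:
    name: str
    split: bool = False                # parent is marked as split
    daughters: Tuple[str, ...] = ()    # stands in for the splitA/splitB references in meta
    links_to: Optional[str] = None     # region whose store files this region only links to


def clone_table(table: Dict[str, Region], keep_daughters: bool = False) -> Dict[str, Region]:
    """Copy a table the way the report describes a snapshot clone.

    Assumption: the split flag survives, the daughter references do not,
    unless keep_daughters is set (the hypothetical fix).
    """
    if keep_daughters:
        return dict(table)
    return {name: replace(region, daughters=()) for name, region in table.items()}


def janitor_may_delete(parent: Region, table: Dict[str, Region]) -> bool:
    """Toy cleanup rule: a split parent goes away once no recorded daughter links to it."""
    if not parent.split:
        return False
    return not any(
        table[d].links_to == parent.name for d in parent.daughters if d in table
    )


def run_janitor(table: Dict[str, Region]) -> Dict[str, Region]:
    """Drop every region the toy rule considers safe to clean up."""
    doomed = {name for name, region in table.items() if janitor_may_delete(region, table)}
    return {name: region for name, region in table.items() if name not in doomed}


def dangling_links(table: Dict[str, Region]) -> List[str]:
    """Regions whose link target no longer exists, i.e. whose data is gone."""
    return sorted(
        name for name, region in table.items()
        if region.links_to is not None and region.links_to not in table
    )


# State right after the split, before any compaction has rewritten the daughters.
source = {
    "parent": Region("parent", split=True, daughters=("daughterA", "daughterB")),
    "daughterA": Region("daughterA", links_to="parent"),
    "daughterB": Region("daughterB", links_to="parent"),
}

print(dangling_links(run_janitor(source)))                                    # []
print(dangling_links(run_janitor(clone_table(source))))                       # ['daughterA', 'daughterB']
print(dangling_links(run_janitor(clone_table(source, keep_daughters=True))))  # []
```

In the source table the parent survives because its daughters are still 
recorded. In the clone the references are gone, so the parent is dropped and 
both daughters are left pointing at nothing.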



> Snapshoting a splitting region results in corrupted snapshot
> ------------------------------------------------------------
>
>                 Key: HBASE-23995
>                 URL: https://issues.apache.org/jira/browse/HBASE-23995
>             Project: HBase
>          Issue Type: Bug
>          Components: snapshots
>    Affects Versions: 2.0.2
>            Reporter: Szabolcs Bukros
>            Priority: Major
>
> The problem seems to originate from the fact that while the region split 
> itself runs under a lock, the compactions following it run in separate 
> threads. Alternatively, the use of space quota policies can prevent 
> compaction after a split and lead to the same issue.
> In both cases the resulting snapshot keeps the split status of the parent 
> region but does not keep the references to the daughter regions, because 
> they (the splitA and splitB qualifiers) are stored separately in the meta 
> table and do not propagate with the snapshot.
> This is important because in the freshly cloned table the CatalogJanitor 
> finds the parent region and sees that it is in split state, but because it 
> cannot find the daughter region references (they have not propagated), it 
> assumes the parent can be cleaned up and deletes it. The archived region 
> used in the snapshot only has a back reference to the now also archived 
> parent region, and if the snapshot is deleted they both get cleaned up. 
> Unfortunately the daughter regions only contain hfile links, so at this 
> point the data is lost.
> How to reproduce:
> {code:java}
> hbase shell <<EOF
> create 'test', 'cf'
> (0...2000).each{|i| put "test", "row#{i}", "cf:col", "val"}
> flush 'test'
> split 'test'
> snapshot 'test', 'testshot'
> EOF
> {code}
> This should make sure the snapshot is taken before the compaction can 
> finish, even with a small amount of data.
> {code:java}
> sudo -u hbase hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot 
> testshot -copy-to hdfs://target:8020/apps/hbase/data/
> {code}
> I export the snapshot to make the use case cleaner, but deleting both the 
> snapshot and the original table after the cloning should have the same effect.
> {code:java}
> clone_snapshot 'testshot', 'test2'
> delete_snapshot "testshot"
> {code}
> I'm not sure what the best way to fix this would be. Preventing snapshots 
> while a region is in split state would make snapshot creation problematic. 
> Forcing compaction to run as part of the split thread would make the split 
> rather slow. Propagating the daughter region references could prevent the 
> deletion of the cloned parent region, and the data would no longer be 
> broken, but I'm not sure we have logic in place that could pick up the 
> pieces and finish the split process.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
