[jira] [Commented] (HBASE-23995) Snapshoting a splitting region results in corrupted snapshot

2020-03-26 Thread Szabolcs Bukros (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17067867#comment-17067867
 ] 

Szabolcs Bukros commented on HBASE-23995:
-

The logs are from 2.0.

I'm reasonably certain. In the 2.0 logs I can see that the manifest is created 
while the compaction is still running, before it could have finished writing the 
temporary hfile. Because of this the manifest refers to the hfile references, 
while in 2.2, where the snapshot runs after the compaction, it refers to the 
freshly created storefiles.
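
If it helps to cross-check, one way to see exactly which files a snapshot's 
manifest references is the SnapshotInfo tool (shown here with the snapshot name 
from the reproduction steps below; adjust as needed):
{code:java}
hbase org.apache.hadoop.hbase.snapshot.SnapshotInfo -snapshot testshot -files
{code}
On the broken 2.0 snapshot I would expect this to list the reference files 
pointing back at the parent region instead of real storefiles.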

> Snapshoting a splitting region results in corrupted snapshot
> 
>
> Key: HBASE-23995
> URL: https://issues.apache.org/jira/browse/HBASE-23995
> Project: HBase
>  Issue Type: Bug
>  Components: snapshots
>Affects Versions: 2.0.2
>Reporter: Szabolcs Bukros
>Priority: Major
>
> The problem seems to originate from the fact that while the region split 
> itself runs in a lock, the compactions following it run in separate threads. 
> Alternatively, the use of space quota policies can prevent the compaction after 
> a split, which leads to the same issue.
> In both cases the resulting snapshot will keep the split status of the parent 
> region, but does not keep the references to the daughter regions, because they 
> (the splitA, splitB qualifiers) are stored separately in the meta table and are 
> not propagated with the snapshot.
> This is important because in the freshly cloned table the CatalogJanitor will 
> find the parent region, realize it is in split state, but because it can not 
> find the daughter region references (they were not propagated) it assumes the 
> parent can be cleaned up and deletes it. The archived region used in the 
> snapshot only has a back reference to the now also archived parent region, and 
> if the snapshot is deleted they both get cleaned up. Unfortunately the daughter 
> regions only contain hfile links, so at this point the data is lost.
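> A quick way to check whether the daughter references made it into meta is to 
> look for the splitA/splitB qualifiers on the parent's row. This is only a 
> sketch and assumes the cloned table is named 'test2' as in the steps below:
> {code:java}
> scan 'hbase:meta', {COLUMNS => ['info:regioninfo', 'info:splitA', 'info:splitB'], ROWPREFIXFILTER => 'test2,'}
> {code}
> For the cloned table the splitA/splitB columns are missing from the parent's 
> row, which is why the CatalogJanitor considers it safe to clean up.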
> How to reproduce:
> {code:java}
> hbase shell <<EOF
> create 'test', 'cf'
> (0...2000).each{|i| put "test", "row#{i}", "cf:col", "val"}
> flush 'test'
> split 'test'
> snapshot 'test', 'testshot'
> EOF
> {code}
> This should make sure the snapshot is taken before the compaction can finish, 
> even with a small amount of data.
> {code:java}
> sudo -u hbase hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot 
> testshot -copy-to hdfs://target:8020/apps/hbase/data/
> {code}
> I export the snapshot to make the use case cleaner, but deleting both the 
> snapshot and the original table after the cloning should have the same effect.
> {code:java}
> clone_snapshot 'testshot', 'test2'
> delete_snapshot "testshot"
> {code}
> I'm not sure what the best way to fix this would be. Preventing snapshots 
> while a region is in split state would make snapshot creation problematic. 
> Forcing the compaction to run as part of the split thread would make the split 
> rather slow. Propagating the daughter region references could prevent the 
> deletion of the cloned parent region, so the data would no longer be broken, 
> but I'm not sure we have logic in place that could pick up the pieces and 
> finish the split process.
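> For illustration, a minimal client-side sketch of what "preventing snapshots 
> while a region is in split state" could look like. It assumes the split parent 
> is still visible through Admin#getRegions; offlined parents may be filtered 
> out, in which case hbase:meta would have to be scanned directly:
> {code:java}
> import java.util.List;
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.hbase.HBaseConfiguration;
> import org.apache.hadoop.hbase.TableName;
> import org.apache.hadoop.hbase.client.Admin;
> import org.apache.hadoop.hbase.client.Connection;
> import org.apache.hadoop.hbase.client.ConnectionFactory;
> import org.apache.hadoop.hbase.client.RegionInfo;
> 
> public class SafeSnapshot {
>   public static void main(String[] args) throws Exception {
>     Configuration conf = HBaseConfiguration.create();
>     TableName table = TableName.valueOf("test");
>     try (Connection conn = ConnectionFactory.createConnection(conf);
>          Admin admin = conn.getAdmin()) {
>       List<RegionInfo> regions = admin.getRegions(table);
>       // A region reporting isSplit() is a parent whose daughters may still only
>       // hold references; snapshotting at this point reproduces the issue above.
>       boolean splitting = regions.stream().anyMatch(RegionInfo::isSplit);
>       if (splitting) {
>         System.out.println("Split still in progress, not snapshotting " + table);
>         return;
>       }
>       admin.snapshot("testshot", table);
>     }
>   }
> }
> {code}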



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-23995) Snapshoting a splitting region results in corrupted snapshot

2020-03-26 Thread Josh Elser (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17067796#comment-17067796
 ] 

Josh Elser commented on HBASE-23995:


Are these logs from 2.0 or 2.2? The locking in your first snippet looks correct 
(the LockProcedure only moves to RUNNABLE after the STRP finishes).

Maybe STRP has changed to hold its lock longer, if the compaction (at 
14:32:32,088) is truly a part of the STRP.

Are we certain that the corruption in the snapshot is due to an execution 
pattern like this?
 * Split starts
 * Split's compaction starts
 * Snapshot tries to start
 * Split finishes
 * Snapshot manifest created
 * Split compaction finishes
 * Rest of snapshot operation finishes

Meaning, we create an invalid snapshot manifest because the compaction 
post-split is not yet finished?




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-23995) Snapshoting a splitting region results in corrupted snapshot

2020-03-26 Thread Szabolcs Bukros (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17067784#comment-17067784
 ] 

Szabolcs Bukros commented on HBASE-23995:
-

As Josh mentioned, both Split and Snapshot use PV2, so it should work. And since 
it does work in 2.2, I started to check the commits missing from the old branch. 
HBASE-21375 looked promising: while it does not target this behavior, it looked 
like a general improvement to the locking logic. I quickly backported and 
re-tested it, but unfortunately it does not solve the issue.

Now that I know what to look for, I could find the point in the log where the 
lock is passed from Split to Snapshot (hbase-master.log).

 
{code:java}
2020-03-26 14:32:31,945 INFO  [PEWorker-8] procedure2.ProcedureExecutor: 
Finished pid=28, state=SUCCESS; SplitTableRegionProcedure table=tab2, 
parent=11544264d3485f5ff700562ca6b62acb, daughterA
=dcf89acf08c55f494fd93ceedd3f3445, daughterB=bf84f2e23131d9488d9c56117d374187 
in 1.0010sec
2020-03-26 14:32:31,946 DEBUG [PEWorker-8] locking.LockProcedure: LOCKED 
pid=30, state=RUNNABLE; org.apache.hadoop.hbase.master.locking.LockProcedure, 
tableName=tab2, type=EXCLUSIVE
2020-03-26 14:32:31,948 INFO  [PEWorker-8] procedure2.TimeoutExecutorThread: 
ADDED pid=30, state=WAITING_TIMEOUT, locked=true; 
org.apache.hadoop.hbase.master.locking.LockProcedure, tableName=ta
b2, type=EXCLUSIVE; timeout=60, timestamp=1585233751948
2020-03-26 14:32:31,948 DEBUG 
[RpcServer.default.FPBQ.Fifo.handler=28,queue=1,port=16000] 
snapshot.SnapshotManager: Started snapshot: { ss=tabshot2 table=tab2 type=FLUSH 
}
{code}
Curiously, in the RS log I can see PostOpenDeployTasks and compactions starting 
to run while SplitTableRegionProcedure still holds the lock:
{code:java}
2020-03-26 14:32:31,918 INFO  
[PostOpenDeployTasks:dcf89acf08c55f494fd93ceedd3f3445] 
regionserver.HRegionServer: Post open deploy tasks for 
tab2,,1585233150936.dcf89acf08c55f494fd93ceedd3f3445.
2020-03-26 14:32:31,919 DEBUG 
[PostOpenDeployTasks:dcf89acf08c55f494fd93ceedd3f3445] 
regionserver.CompactSplit: Small Compaction requested: system; Because: Opening 
Region; compactionQueue=(longCompactions=0:shortCompactions=0), splitQueue=0
2020-03-26 14:32:31,921 DEBUG 
[regionserver/c2504-node4:16020-longCompactions-1585218367783] 
compactions.SortedCompactionPolicy: Selecting compaction from 1 store files, 0 
compacting, 1 eligible, 100 blocking
2020-03-26 14:32:31,922 DEBUG 
[regionserver/c2504-node4:16020-longCompactions-1585218367783] 
regionserver.HStore: dcf89acf08c55f494fd93ceedd3f3445 - cf: Initiating minor 
compaction (all files)
{code}
And it only finishes at around the same time the snapshot is finishing:
{code:java}
  2020-03-26 14:32:32,088 INFO  
[regionserver/c2504-node4:16020-longCompactions-1585218367783] 
regionserver.CompactSplit: Completed compaction 
region=tab2,,1585233150936.dcf89acf08c55f494fd93ceedd3f3445., storeName=cf, 
priority=99, startTime=1585233151918; duration=0sec
2020-03-26 14:32:32,091 DEBUG 
[regionserver/c2504-node4:16020-longCompactions-1585218367783] 
regionserver.CompactSplit: Status 
compactionQueue=(longCompactions=0:shortCompactions=0), 
splitQueue=0233150936.bf84f2e23131d9488d9c56117d374187.
2020-03-26 14:32:32,101 DEBUG 
[rs(c2504-node4.coelab.cloudera.com,16020,1585218362034)-snapshot-pool6-thread-1]
 snapshot.FlushSnapshotSubprocedure: ... Flush Snapshotting region 
tab2,,1585233150936.dcf89acf08c55f494fd93ceedd3f3445. completed.
2020-03-26 14:32:32,101 DEBUG 
[rs(c2504-node4.coelab.cloudera.com,16020,1585218362034)-snapshot-pool6-thread-1]
 snapshot.FlushSnapshotSubprocedure: Closing snapshot operation on 
tab2,,1585233150936.dcf89acf08c55f494fd93ceedd3f3445.
2020-03-26 14:32:32,102 DEBUG [member: 
'c2504-node4.coelab.cloudera.com,16020,1585218362034' 
subprocedure-pool2-thread-1] snapshot.RegionServerSnapshotManager: Completed 
1/2 local region snapshots.
2020-03-26 14:32:32,102 DEBUG [member: 
'c2504-node4.coelab.cloudera.com,16020,1585218362034' 
subprocedure-pool2-thread-1] snapshot.RegionServerSnapshotManager: Completed 
2/2 local region snapshots.
{code}
 

 

 

[jira] [Commented] (HBASE-23995) Snapshoting a splitting region results in corrupted snapshot

2020-03-25 Thread Josh Elser (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17067182#comment-17067182
 ] 

Josh Elser commented on HBASE-23995:


{quote}it looks like we grab a lock in the Master before taking a snapshot (via 
LockManager/MasterLock), rather than a lock in PV2 which is what the 
split/merge code would be grabbing
{quote}
Szabolcs was talking with me again this morning, which had me looking at the 
code again. I definitely did not look far enough :). One more layer deep, I 
would've seen the LockProcedure which does this. I think Szabolcs missed copying 
an update here, but he was able to confirm that HBase 2.2 wasn't suffering from 
this problem: the locks were working correctly.

Seems like something is not happening quite right with the PV2 table locks in 
the 2.0 version. Likely something that's already been fixed :)




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-23995) Snapshoting a splitting region results in corrupted snapshot

2020-03-23 Thread Josh Elser (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17065182#comment-17065182
 ] 

Josh Elser commented on HBASE-23995:


bq. And when snapshoting, we will hold the procedure lock on the table, and 
when splitting, we will hold the procedure lock on the region thus also on the 
table too, which will prevent them running together.

Glancing at branch-2, it looks like we grab a lock in the Master before taking 
a snapshot (via LockManager/MasterLock), rather than a lock in PV2 which is 
what the split/merge code would be grabbing. At least on the surface, it does 
look like we have two competing "locking" solutions which are unaware of each 
other, but I'm only looking on the surface. All of the snapshot "procedures" 
are the old procedure(1) code.

Do we push Snapshots into PV2 and then rely on PV2 locking the whole way down? 
Sounds like a good idea to dog-food this, but I'm wondering if there is a 
reason we didn't port snapshots right away. Maybe you remember, Duo?

I know you've thought through some potential scenarios for a "less invasive" 
fix, Szabolcs. Maybe share them here, along with their downsides? E.g. detect 
in-progress splits in prepareSnapshot, snapshot only on disabled tables, and 
others.




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-23995) Snapshoting a splitting region results in corrupted snapshot

2020-03-17 Thread Szabolcs Bukros (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17060757#comment-17060757
 ] 

Szabolcs Bukros commented on HBASE-23995:
-

After the split the daughter regions only have hfile links to the storefiles in 
the parent. Even the CatalogJanitor leaves these parents alone, only deleting 
them after the compaction is done and they are no longer referenced from the 
daughters.

The title might not be precise or well chosen. What I wanted to say is that 
snapshotting a state where the split was done but the compaction was not (this 
is what I clumsily called "splitting") means that a structure where the daughter 
regions have no data, just links to a parent, is saved and can be exported. 
However, not all the necessary information is exported with it (the daughter 
references from the parent are missing), and this leads to an issue where, in 
the cloned table, the parent that actually contains the data is archived and 
then deleted within minutes after the cloning is done, resulting in losing the 
exported data.




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-23995) Snapshoting a splitting region results in corrupted snapshot

2020-03-17 Thread Duo Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17060728#comment-17060728
 ] 

Duo Zhang commented on HBASE-23995:
---

If the split succeeded, you do not need to snapshot the parent region, right? So 
whether the compaction leads to a removal of the parent region does not matter.

Or is your title misleading? You just say 'Snapshoting a splitting region', and 
I would say that in newer versions of HBase this is impossible.

Maybe there are other problems which make the snapshot broken, but at least it 
is not 'Snapshoting a splitting region'.

Thanks.




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-23995) Snapshoting a splitting region results in corrupted snapshot

2020-03-17 Thread Szabolcs Bukros (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17060717#comment-17060717
 ] 

Szabolcs Bukros commented on HBASE-23995:
-

Hi [~zhangduo], thanks for your reply!

I tested and reproduced the issue on 2.0.2, but based on a quick comparison with 
master I would say not much has changed and the issue should be present there 
too.

If I understand correctly, Procedure locks do not help because the compaction 
runs in separate threads. SplitTableRegionProcedure does the splitting, creates 
a ThreadPoolExecutor for the compactions, and releases the locks while the 
compactions run in the background, making the snapshot possible.
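
To make the timing concrete, here is a small self-contained illustration of 
that pattern (plain Java, not actual HBase code): the "split" holds a lock only 
for the split itself and hands the compaction to a thread pool, so a "snapshot" 
can take the same lock while the compaction is still running.
{code:java}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class SplitCompactionRace {
  public static void main(String[] args) throws Exception {
    ReentrantLock tableLock = new ReentrantLock();
    ExecutorService compactionPool = Executors.newSingleThreadExecutor();

    // "SplitTableRegionProcedure": the lock is held only for the split itself.
    tableLock.lock();
    try {
      System.out.println("split: daughters created, compactions scheduled");
      compactionPool.submit(() -> {
        sleep(200); // daughters still only hold references to the parent's files
        System.out.println("compaction: daughters now have real storefiles");
      });
    } finally {
      tableLock.unlock(); // released before the compactions finish
    }

    // "Snapshot": takes the lock immediately and builds its manifest while the
    // compaction above is still in flight.
    tableLock.lock();
    try {
      System.out.println("snapshot: manifest written, may reference parent files");
    } finally {
      tableLock.unlock();
    }

    compactionPool.shutdown();
    compactionPool.awaitTermination(5, TimeUnit.SECONDS);
  }

  private static void sleep(long ms) {
    try {
      Thread.sleep(ms);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }
}
{code}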




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-23995) Snapshoting a splitting region results in corrupted snapshot

2020-03-16 Thread Duo Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17060545#comment-17060545
 ] 

Duo Zhang commented on HBASE-23995:
---

So the version is 2.0.2?

IIRC, now we will skip splitting the region if we are snapshotting the table. 
And when snapshotting, we will hold the procedure lock on the table, and when 
splitting, we will hold the procedure lock on the region and thus also on the 
table, which will prevent them from running together.

Can this solve your problem?

Thanks.




--
This message was sent by Atlassian Jira
(v8.3.4#803005)