[ 
https://issues.apache.org/jira/browse/HBASE-8760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13687286#comment-13687286
 ] 

Jerry He commented on HBASE-8760:
---------------------------------

[~mbertozzi]
Yes, restore/clone snapshot works well if the parent hfile has not been 
deleted. I am amazed to see that the part of the code that figures out the 
Links and half References works well.  Below are some region server log dumps.

If the parent hfile has been deleted before the restore/clone, opening the 
cloned region fails with the error below.
table1 is the original table that was snapshotted. table1_clone is the table 
created by clone_snapshot.
--------------------------------------------------------------------------------------------------------
$ hadoop fs -lsr /hbase/.hbase-snapshot
/hbase/.hbase-snapshot/.tmp
/hbase/.hbase-snapshot/my_table1_snapshot
/hbase/.hbase-snapshot/my_table1_snapshot/.snapshotinfo
/hbase/.hbase-snapshot/my_table1_snapshot/.tableinfo.0000000001
/hbase/.hbase-snapshot/my_table1_snapshot/.tmp
/hbase/.hbase-snapshot/my_table1_snapshot/399a750df7646a7fb38d35780ca5254f
/hbase/.hbase-snapshot/my_table1_snapshot/399a750df7646a7fb38d35780ca5254f/.regioninfo
/hbase/.hbase-snapshot/my_table1_snapshot/399a750df7646a7fb38d35780ca5254f/.tmp
/hbase/.hbase-snapshot/my_table1_snapshot/399a750df7646a7fb38d35780ca5254f/family1
/hbase/.hbase-snapshot/my_table1_snapshot/399a750df7646a7fb38d35780ca5254f/family1/c272990ce92c409d8cdebd6afcb8cc14.3e96bb19fb20e4edd27949f894878714
/hbase/.hbase-snapshot/my_table1_snapshot/f3b8401f06dc4cbe2043f26df42e1b0e
/hbase/.hbase-snapshot/my_table1_snapshot/f3b8401f06dc4cbe2043f26df42e1b0e/.regioninfo
/hbase/.hbase-snapshot/my_table1_snapshot/f3b8401f06dc4cbe2043f26df42e1b0e/.tmp
/hbase/.hbase-snapshot/my_table1_snapshot/f3b8401f06dc4cbe2043f26df42e1b0e/family1
/hbase/.hbase-snapshot/my_table1_snapshot/f3b8401f06dc4cbe2043f26df42e1b0e/family1/c272990ce92c409d8cdebd6afcb8cc14.3e96bb19fb20e4edd27949f894878714


2013-06-14 22:40:03,065 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Received request to open region: table1_clone,,1371270778458.3ab8becbaddb796fc8a036762dbd9493.
2013-06-14 22:40:03,065 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x13f2c26b6106505 Attempting to transition node 3ab8becbaddb796fc8a036762dbd9493 from M_ZK_REGION_OFFLINE to RS_ZK_REGION_OPENING
2013-06-14 22:40:03,067 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x13f2c26b6106505 Successfully transitioned node 3ab8becbaddb796fc8a036762dbd9493 from M_ZK_REGION_OFFLINE to RS_ZK_REGION_OPENING
2013-06-14 22:40:03,067 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Opening region: {NAME => 'table1_clone,,1371270778458.3ab8becbaddb796fc8a036762dbd9493.', STARTKEY => '', ENDKEY => 'user1959958463', ENCODED => 3ab8becbaddb796fc8a036762dbd9493,}
2013-06-14 22:40:03,067 INFO org.apache.hadoop.hbase.regionserver.HRegion: Setting up tabledescriptor config now ...
2013-06-14 22:40:03,067 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Instantiated table1_clone,,1371270778458.3ab8becbaddb796fc8a036762dbd9493.
2013-06-14 22:40:03,069 INFO org.apache.hadoop.hbase.regionserver.Store: time to purge deletes set to 0ms in store family1
2013-06-14 22:40:03,069 INFO org.apache.hadoop.hbase.regionserver.Store: hbase.hstore.compaction.min = 3
2013-06-14 22:40:03,070 DEBUG org.apache.hadoop.hbase.regionserver.StoreFile: reference 'hdfs://hdtest009:9000/hbase/table1_clone/3ab8becbaddb796fc8a036762dbd9493/family1/table1=3e96bb19fb20e4edd27949f894878714-c272990ce92c409d8cdebd6afcb8cc14.3e96bb19fb20e4edd27949f894878714' to region=3e96bb19fb20e4edd27949f894878714 hfile=table1=3e96bb19fb20e4edd27949f894878714-c272990ce92c409d8cdebd6afcb8cc14
2013-06-14 22:40:03,071 DEBUG org.apache.hadoop.hbase.regionserver.StoreFile: Store file hdfs://hdtest009:9000/hbase/table1_clone/3ab8becbaddb796fc8a036762dbd9493/family1/table1=3e96bb19fb20e4edd27949f894878714-c272990ce92c409d8cdebd6afcb8cc14.3e96bb19fb20e4edd27949f894878714 is a bottom reference to hdfs://hdtest009:9000/hbase/table1_clone/3e96bb19fb20e4edd27949f894878714/family1/table1=3e96bb19fb20e4edd27949f894878714-c272990ce92c409d8cdebd6afcb8cc14
2013-06-14 22:40:03,072 ERROR org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Failed open of region=table1_clone,,1371270778458.3ab8becbaddb796fc8a036762dbd9493., starting to roll back the global memstore size.
java.io.IOException: java.io.IOException: java.io.FileNotFoundException: Unable to open link: org.apache.hadoop.hbase.io.HFileLink locations=[hdfs://hdtest009:9000/hbase/table1/3e96bb19fb20e4edd27949f894878714/family1/c272990ce92c409d8cdebd6afcb8cc14, hdfs://hdtest009:9000/hbase/.tmp/table1/3e96bb19fb20e4edd27949f894878714/family1/c272990ce92c409d8cdebd6afcb8cc14, hdfs://hdtest009:9000/hbase/.archive/table1/3e96bb19fb20e4edd27949f894878714/family1/c272990ce92c409d8cdebd6afcb8cc14]
        at org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:631)
        at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:544)
        at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:4372)
        at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:4320)
        at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:330)
        at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:101)
        at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:175)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:897)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:919)
        at java.lang.Thread.run(Thread.java:738)
Caused by: java.io.IOException: java.io.FileNotFoundException: Unable to open link: org.apache.hadoop.hbase.io.HFileLink locations=[hdfs://hdtest009:9000/hbase/table1/3e96bb19fb20e4edd27949f894878714/family1/c272990ce92c409d8cdebd6afcb8cc14, hdfs://hdtest009:9000/hbase/.tmp/table1/3e96bb19fb20e4edd27949f894878714/family1/c272990ce92c409d8cdebd6afcb8cc14, hdfs://hdtest009:9000/hbase/.archive/table1/3e96bb19fb20e4edd27949f894878714/family1/c272990ce92c409d8cdebd6afcb8cc14]
        at org.apache.hadoop.hbase.regionserver.Store.loadStoreFiles(Store.java:481)
        at org.apache.hadoop.hbase.regionserver.Store.<init>(Store.java:258)
        at org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:3322)
        at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:606)
        at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:604)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:314)
        at java.util.concurrent.FutureTask.run(FutureTask.java:149)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:452)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:314)
        at java.util.concurrent.FutureTask.run(FutureTask.java:149)
        ... 3 more
Caused by: java.io.FileNotFoundException: Unable to open link: org.apache.hadoop.hbase.io.HFileLink locations=[hdfs://hdtest009:9000/hbase/table1/3e96bb19fb20e4edd27949f894878714/family1/c272990ce92c409d8cdebd6afcb8cc14, hdfs://hdtest009:9000/hbase/.tmp/table1/3e96bb19fb20e4edd27949f894878714/family1/c272990ce92c409d8cdebd6afcb8cc14, hdfs://hdtest009:9000/hbase/.archive/table1/3e96bb19fb20e4edd27949f894878714/family1/c272990ce92c409d8cdebd6afcb8cc14]
        at org.apache.hadoop.hbase.io.FileLink.getFileStatus(FileLink.java:375)
        at org.apache.hadoop.hbase.io.HalfStoreFileReader.<init>(HalfStoreFileReader.java:97)
        at org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:537)
        at org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:639)
        at org.apache.hadoop.hbase.regionserver.Store$1.call(Store.java:457)
        at org.apache.hadoop.hbase.regionserver.Store$1.call(Store.java:452)
        ... 8 more
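
The three locations listed in the exception suggest that the link is resolved by trying the live table directory, then .tmp, then .archive, and that the open fails only when none of them holds the parent hfile. A minimal Python sketch of that lookup order follows. It is an illustration based only on the paths in the log above, not the actual HFileLink code; the function names and the `exists` callback are hypothetical.

```python
from typing import Callable, List


def candidate_locations(root: str, table: str, region: str,
                        family: str, hfile: str) -> List[str]:
    """Build the candidate paths in the order they appear in the exception."""
    suffix = f"{table}/{region}/{family}/{hfile}"
    return [
        f"{root}/{suffix}",           # live table directory
        f"{root}/.tmp/{suffix}",      # temporary directory
        f"{root}/.archive/{suffix}",  # archive directory
    ]


def resolve_link(locations: List[str], exists: Callable[[str], bool]) -> str:
    """Return the first location that exists (hypothetical helper).

    `exists` stands in for a file system check; raising FileNotFoundError
    mirrors the "Unable to open link" failure seen in the log.
    """
    for path in locations:
        if exists(path):
            return path
    raise FileNotFoundError(f"Unable to open link: locations={locations}")
```

With the parent hfile removed from all three directories, `resolve_link` raises, which corresponds to the failed region open above.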

If the parent hfile is still present, everything works ok.
----------------------------------------------------------------------------

2013-06-18 14:58:50,026 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Received request to open 1 region(s)
2013-06-18 14:58:50,026 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Received request to open region: table1_clone,,1371578484233.6dab7a8a16b0e195785d52ad7b15bd09.
2013-06-18 14:58:50,031 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x13f58635d150013 Attempting to transition node 6dab7a8a16b0e195785d52ad7b15bd09 from M_ZK_REGION_OFFLINE to RS_ZK_REGION_OPENING
2013-06-18 14:58:50,033 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x13f58635d150013 Successfully transitioned node 6dab7a8a16b0e195785d52ad7b15bd09 from M_ZK_REGION_OFFLINE to RS_ZK_REGION_OPENING
2013-06-18 14:58:50,034 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Opening region: {NAME => 'table1_clone,,1371578484233.6dab7a8a16b0e195785d52ad7b15bd09.', STARTKEY => '', ENDKEY => 'user1959958463', ENCODED => 6dab7a8a16b0e195785d52ad7b15bd09,}
2013-06-18 14:58:50,035 INFO org.apache.hadoop.hbase.regionserver.HRegion: Setting up tabledescriptor config now ...
2013-06-18 14:58:50,035 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Instantiated table1_clone,,1371578484233.6dab7a8a16b0e195785d52ad7b15bd09.
2013-06-18 14:58:50,039 INFO org.apache.hadoop.hbase.regionserver.Store: time to purge deletes set to 0ms in store family1
2013-06-18 14:58:50,039 INFO org.apache.hadoop.hbase.regionserver.Store: hbase.hstore.compaction.min = 3
2013-06-18 14:58:50,045 DEBUG org.apache.hadoop.hbase.regionserver.StoreFile: reference 'hdfs://hdtest009:9000/hbase/table1_clone/6dab7a8a16b0e195785d52ad7b15bd09/family1/table1=352470e8ef4d15b034ab1165b07e35e3-9014c0eed1c0418daf3d42882baecf24.352470e8ef4d15b034ab1165b07e35e3' to region=352470e8ef4d15b034ab1165b07e35e3 hfile=table1=352470e8ef4d15b034ab1165b07e35e3-9014c0eed1c0418daf3d42882baecf24
2013-06-18 14:58:50,049 DEBUG org.apache.hadoop.hbase.regionserver.StoreFile: Store file hdfs://hdtest009:9000/hbase/table1_clone/6dab7a8a16b0e195785d52ad7b15bd09/family1/table1=352470e8ef4d15b034ab1165b07e35e3-9014c0eed1c0418daf3d42882baecf24.352470e8ef4d15b034ab1165b07e35e3 is a bottom reference to hdfs://hdtest009:9000/hbase/table1_clone/352470e8ef4d15b034ab1165b07e35e3/family1/table1=352470e8ef4d15b034ab1165b07e35e3-9014c0eed1c0418daf3d42882baecf24
2013-06-18 14:58:50,062 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded hdfs://hdtest009:9000/hbase/table1_clone/6dab7a8a16b0e195785d52ad7b15bd09/family1/table1=352470e8ef4d15b034ab1165b07e35e3-9014c0eed1c0418daf3d42882baecf24.352470e8ef4d15b034ab1165b07e35e3, isReference=true, isBulkLoadResult=false, seqid=32549, majorCompaction=false
2013-06-18 14:58:50,064 INFO org.apache.hadoop.hbase.regionserver.HRegion: Onlined table1_clone,,1371578484233.6dab7a8a16b0e195785d52ad7b15bd09.; next sequenceid=32550
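
In both logs the store file in the cloned region is named `<table>=<region>-<hfile>.<parent region>`, and the StoreFile DEBUG line splits it into `region=` and `hfile=` parts. The Python sketch below shows how such a name can be taken apart. The pattern is inferred only from the names in these logs, and `LinkReference` and `parse_link_reference` are hypothetical names, not HBase APIs.

```python
import re
from typing import NamedTuple

# <table>=<32-hex region>-<hfile>, followed by .<32-hex parent region>.
# This pattern is an assumption inferred from the log lines above.
_REF_TO_LINK = re.compile(
    r"^(?P<table>[^=]+)=(?P<region>[0-9a-f]{32})-(?P<hfile>[0-9a-f]+)"
    r"\.(?P<parent>[0-9a-f]{32})$"
)


class LinkReference(NamedTuple):
    table: str          # table the link points back to
    region: str         # region that owns the linked hfile
    hfile: str          # hfile name inside that region
    parent_region: str  # parent region named by the reference suffix


def parse_link_reference(name: str) -> LinkReference:
    """Split a cloned reference file name into its components."""
    m = _REF_TO_LINK.match(name)
    if m is None:
        raise ValueError(f"not a reference to a link: {name!r}")
    return LinkReference(m["table"], m["region"], m["hfile"], m["parent"])
```

Applied to the file name in the second log, this yields table `table1`, region `352470e8ef4d15b034ab1165b07e35e3`, and hfile `9014c0eed1c0418daf3d42882baecf24`.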

                
> possible loss of data in snapshot taken after region split
> ----------------------------------------------------------
>
>                 Key: HBASE-8760
>                 URL: https://issues.apache.org/jira/browse/HBASE-8760
>             Project: HBase
>          Issue Type: Bug
>          Components: snapshots
>    Affects Versions: 0.94.8
>            Reporter: Jerry He
>            Assignee: Jerry He
>             Fix For: 0.94.8
>
>         Attachments: HBase-8760-0.94.8.patch
>
>
> Right after a region split but before the daughter regions are compacted, we 
> have two daughter regions containing Reference files to the parent hfiles.
> If we take a snapshot right at that moment, the snapshot will succeed, but it 
> will only contain the daughter Reference files. Since there is no hold on the 
> parent hfiles, they will be deleted by the HFile Cleaner soon after they are 
> no longer needed by the daughter regions.
> At a minimum, we need to keep these parent hfiles from being deleted. 
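
As an illustration of the dependency described in the issue, the Python sketch below scans a snapshot listing such as the `hadoop fs -lsr` output earlier in this message and reports which parent hfiles the snapshot relies on. It assumes that a Reference file is named `<hfile>.<encoded parent region>`, as in that listing; `parent_hfiles_needed` is a hypothetical helper and not part of HBase.

```python
import re
from typing import Iterable, List, Tuple

# Assumed Reference file name: <hfile>.<32-hex encoded parent region>.
_REFERENCE = re.compile(r"^(?P<hfile>[0-9a-f]+)\.(?P<parent>[0-9a-f]{32})$")


def parent_hfiles_needed(snapshot_paths: Iterable[str]) -> List[Tuple[str, str, str]]:
    """Return sorted (parent_region, family, hfile) tuples that the
    Reference files in the snapshot point to."""
    needed = set()
    for path in snapshot_paths:
        parts = path.rstrip("/").split("/")
        if len(parts) < 2:
            continue
        m = _REFERENCE.match(parts[-1])
        if m:
            # parts[-2] is the column family directory holding the file.
            needed.add((m["parent"], parts[-2], m["hfile"]))
    return sorted(needed)
```

For the listing above, both daughter regions point at the same parent hfile, so the result has a single entry: that is the file that has to survive for the snapshot to stay restorable.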
