[jira] [Comment Edited] (HBASE-19893) restore_snapshot is broken in master branch when region splits

Toshihiro Suzuki (JIRA) Wed, 04 Apr 2018 02:45:33 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-19893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16425254#comment-16425254
 ]


Toshihiro Suzuki edited comment on HBASE-19893 at 4/4/18 9:44 AM:
------------------------------------------------------------------

Sorry for the late reply [~ram_krish],

{quote}
So this process of restore snapshot procs adds the in-memory info that the 
procedures have to the META. So when the table is enabled after restore 
snapshot, this META info is not taken as the source of truth, is it? Ya, I think 
we may not know, after disabling and when we enable, whether the enable is 
from the snapshot or from somewhere else. In that sense this fix LGTM.
So the change in TestRestoreSnapshotFromClient, if run without the fix, would 
fail and now it would pass, I believe.
{quote}
Yes, the META info should be the source of truth.
Currently, when restoring a snapshot, the restore snapshot procedure changes 
only the META info and doesn't change the in-memory states.
That's why this issue happens.
The fix in the patch adds logic to update the in-memory states as well.
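
To make the mismatch concrete, here is a toy model in plain Ruby. It is not HBase code: the class ToyMaster, its methods, and the region names are all hypothetical, and I am assuming for illustration that the second split divided one of the two snapshot regions into two daughters. It only sketches how rewriting META alone can leave the enable step working from stale region information.

```ruby
# Toy model only. None of these names exist in HBase.
class ToyMaster
  attr_reader :meta, :in_memory

  def initialize(regions)
    @meta = regions.dup       # stands in for the META table
    @in_memory = regions.dup  # stands in for the Master's in-memory region states
  end

  # Behaviour described above as the problem: only META is rewritten.
  def restore_meta_only(snapshot_regions)
    @meta = snapshot_regions.dup
  end

  # Behaviour described above as the fix: both copies are updated.
  def restore_and_sync(snapshot_regions)
    @meta = snapshot_regions.dup
    @in_memory = snapshot_regions.dup
  end

  # In this toy, enabling the table brings online whatever the
  # in-memory state lists.
  def enable
    @in_memory.sort
  end
end

# Assumed layout: two regions at snapshot time, three after the second split.
SNAPSHOT_REGIONS = %w[region_a region_b].freeze
REGIONS_AFTER_SECOND_SPLIT = %w[region_a region_c region_d].freeze

buggy = ToyMaster.new(REGIONS_AFTER_SECOND_SPLIT)
buggy.restore_meta_only(SNAPSHOT_REGIONS)
puts "meta-only: META=#{buggy.meta.inspect} online=#{buggy.enable.inspect}"

fixed = ToyMaster.new(REGIONS_AFTER_SECOND_SPLIT)
fixed.restore_and_sync(SNAPSHOT_REGIONS)
puts "synced:    META=#{fixed.meta.inspect} online=#{fixed.enable.inspect}"
```

In the meta-only case the toy brings three regions online even though META lists two, which has the same shape as the symptom in the issue description.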

{quote}
If the Master crashes and gets started again just after the restore snapshot 
procedure is run and then you enable the table, what happens? At least at that 
time, do we read from META?
{quote}
Yes. I think that even if the Master crashes, it can recover the in-memory 
states from the META table and retry restoring the snapshot.
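
As a rough sketch of that recovery idea (again a toy in plain Ruby, with a hypothetical function name and row layout, not the Master's actual startup code): a freshly started process has no in-memory state, so it derives it from the persisted rows.

```ruby
# Toy model only. The row layout and function name are hypothetical.
# Rebuild a table => regions map from persisted META-like rows.
def rebuild_in_memory_state(meta_rows)
  meta_rows.each_with_object(Hash.new { |h, k| h[k] = [] }) do |row, state|
    state[row[:table]] << row[:region]
  end
end

# Assumed contents of the persisted rows after the restore.
toy_meta_rows = [
  { table: "test", region: "region_a" },
  { table: "test", region: "region_b" }
]

restarted_state = rebuild_in_memory_state(toy_meta_rows)
puts restarted_state["test"].inspect
```

Because the rebuilt state comes only from the persisted rows, it cannot disagree with them right after a restart in this toy.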


I have also attached a v3 patch. In the previous patch, all region replica 
infos in the in-memory states were removed when restoring a snapshot.
However, I don't think that is correct, and in the v3 patch I think the region 
replica infos in the in-memory states are handled correctly.

Could you please review this patch? [[email protected]] [~ram_krish]


> restore_snapshot is broken in master branch when region splits
> --------------------------------------------------------------
>
>                 Key: HBASE-19893
>                 URL: https://issues.apache.org/jira/browse/HBASE-19893
>             Project: HBase
>          Issue Type: Bug
>          Components: snapshots
>            Reporter: Toshihiro Suzuki
>            Assignee: Toshihiro Suzuki
>            Priority: Critical
>         Attachments: HBASE-19893.master.001.patch, 
> HBASE-19893.master.002.patch, HBASE-19893.master.003.patch
>
>
> When I was investigating HBASE-19850, I found that restore_snapshot didn't 
> work in the master branch.
>  
> Steps to reproduce are as follows:
> 1. Create a table
> {code:java}
> create "test", "cf"
> {code}
> 2. Load data (2000 rows) to the table
> {code:java}
> (0...2000).each{|i| put "test", "row#{i}", "cf:col", "val"}
> {code}
> 3. Split the table
> {code:java}
> split "test"
> {code}
> 4. Take a snapshot
> {code:java}
> snapshot "test", "snap"
> {code}
> 5. Load more data (2000 rows) into the table and split the table again
> {code:java}
> (2000...4000).each{|i| put "test", "row#{i}", "cf:col", "val"}
> split "test"
> {code}
> 6. Restore the table from the snapshot 
> {code:java}
> disable "test"
> restore_snapshot "snap"
> enable "test"
> {code}
> 7. Scan the table
> {code:java}
> scan "test"
> {code}
> However, this scan returns only 244 rows (it should return 2000 rows) like 
> the following:
> {code:java}
> hbase(main):038:0> scan "test"
> ROW COLUMN+CELL
>  row78 column=cf:col, timestamp=1517298307049, value=val
> ....
>   row999 column=cf:col, timestamp=1517298307608, value=val
> 244 row(s)
> Took 0.1500 seconds
> {code}
>  
> Also, the restored table should have 2 online regions but it has 3 online 
> regions.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
