[jira] [Commented] (HBASE-21035) Meta Table should be able to online even if all procedures are lost

stack (JIRA) Wed, 12 Sep 2018 12:36:59 -0700


    [ 
https://issues.apache.org/jira/browse/HBASE-21035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16612645#comment-16612645
 ]


stack commented on HBASE-21035:
-------------------------------

I put up a patch on HBASE-21191 that steals [~allan163]'s patch from here. It 
has the master go into a holding pattern, complaining that hbase:meta is not 
online and asking for operator intervention. Ditto for the namespace table. I 
will work on a doc describing what operators need to do to effect a repair 
once I've backfilled more of the hbck2 stuff. A test over in HBASE-21191 shows 
that scheduling an hbase:meta assign seems to do the job of getting the 
cluster up again (what is missing is how to do this from the outside, from a 
client; working on that next).

> Meta Table should be able to online even if all procedures are lost
> -------------------------------------------------------------------
>
>                 Key: HBASE-21035
>                 URL: https://issues.apache.org/jira/browse/HBASE-21035
>             Project: HBase
>          Issue Type: Sub-task
>    Affects Versions: 2.1.0
>            Reporter: Allan Yang
>            Assignee: Allan Yang
>            Priority: Major
>         Attachments: HBASE-21035.branch-2.0.001.patch, 
> HBASE-21035.branch-2.1.001.patch
>
>
> After HBASE-20708, we changed the way the master initializes after it 
> starts. It now only checks the WAL dirs and compares them to the ZooKeeper 
> RS nodes to decide which servers need to be expired. For servers whose dir 
> ends with 'SPLITTING', we ensure that there will be an SCP for it.
> But if the server hosting the meta region crashed before the master 
> restarts, and if all the procedure WALs are lost (due to a bug, manual 
> deletion, or whatever), the newly restarted master will be stuck while 
> initializing, since no one will bring the meta region online.
> Although this is an anomalous case, I think that no matter what happens, we 
> need to bring the meta region online. Otherwise, we are sitting ducks; 
> nothing can be done.
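
The startup check described in the issue (compare WAL dirs against the 
ZooKeeper RS nodes, and treat 'SPLITTING' dirs specially) can be sketched 
roughly as follows. This is an illustrative Python model, not HBase code: the 
function classify_servers, the directory-name format, and the default 
"-splitting" suffix are assumptions made for the example.

```python
from typing import Iterable, Set, Tuple


def classify_servers(wal_dirs: Iterable[str],
                     live_servers: Set[str],
                     splitting_suffix: str = "-splitting"
                     ) -> Tuple[Set[str], Set[str]]:
    """Return (servers_to_expire, servers_needing_scp).

    Hypothetical model of the check described in the issue:
    - a WAL dir whose server has no live ZooKeeper node marks a server
      that should be expired;
    - a WAL dir ending with the splitting suffix marks a server for
      which a ServerCrashProcedure (SCP) must exist.
    """
    if not splitting_suffix:
        raise ValueError("splitting_suffix must be non-empty")
    to_expire: Set[str] = set()
    needs_scp: Set[str] = set()
    for name in wal_dirs:
        if name.endswith(splitting_suffix):
            # Strip the suffix to recover the server name.
            needs_scp.add(name[:-len(splitting_suffix)])
        elif name not in live_servers:
            to_expire.add(name)
    return to_expire, needs_scp
```

Under this model, the failure in the issue arises when the server that held 
hbase:meta has already crashed and the procedure WALs that would have carried 
its SCP are gone, so nothing is left to schedule the meta assign.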



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
