[ 
https://issues.apache.org/jira/browse/HBASE-21191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16612714#comment-16612714
 ] 

stack commented on HBASE-21191:
-------------------------------

Playing w/ this patch... I removed the master proc WALs content. Had to remove 
WALs for any old servers too, else the meta assign was getting scheduled; i.e. 
it's hard to manufacture a case where there is no assign for hbase:meta.

I tried to use straight shell assign to do the hbase:meta assign but it came 
back with:

{code}
ERROR: org.apache.hadoop.hbase.PleaseHoldException: Master is initializing
        at org.apache.hadoop.hbase.master.HMaster.checkInitialized(HMaster.java:2966)
        at org.apache.hadoop.hbase.master.MasterRpcServices.assignRegion(MasterRpcServices.java:558)
        at org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:413)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
{code}

So, ok, this should be an hbck2 task. Working on it. 
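
The trace suggests why the shell assign cannot work here: assignRegion calls 
checkInitialized first, and the master cannot finish initializing until 
hbase:meta is online. A minimal sketch of that guard follows. MasterSketch and 
the simplified, unchecked PleaseHoldException below are hypothetical stand-ins 
for illustration, not the actual HMaster code:

```java
public class InitGuardSketch {

    /** Hypothetical, simplified stand-in for HBase's PleaseHoldException. */
    static class PleaseHoldException extends RuntimeException {
        PleaseHoldException(String message) {
            super(message);
        }
    }

    /** Hypothetical stand-in for the master; only models the init guard. */
    static class MasterSketch {
        // Assumption: flips to true only after hbase:meta is online.
        volatile boolean initialized = false;

        void checkInitialized() {
            if (!initialized) {
                throw new PleaseHoldException("Master is initializing");
            }
        }

        String assignRegion(String regionName) {
            // The guard runs before any assign is scheduled, so a client
            // cannot use this call to break the startup deadlock.
            checkInitialized();
            return "assign scheduled for " + regionName;
        }
    }

    public static void main(String[] args) {
        MasterSketch master = new MasterSketch();
        try {
            master.assignRegion("hbase:meta");
        } catch (PleaseHoldException e) {
            System.out.println("ERROR: " + e.getMessage());
        }
    }
}
```

So an assign that has to work before initialization completes needs a path 
that does not pass through this guard.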

> Add a holding-pattern if no assign for meta or namespace (Can happen if 
> masterprocwals have been cleared).
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-21191
>                 URL: https://issues.apache.org/jira/browse/HBASE-21191
>             Project: HBase
>          Issue Type: Sub-task
>          Components: amv2
>            Reporter: stack
>            Assignee: stack
>            Priority: Major
>             Fix For: 2.1.1
>
>         Attachments: HBASE-21191.branch-2.1.001.patch
>
>
> If the masterprocwals have been removed -- operator error, hdfs dataloss, or 
> because we have gotten ourselves into a pathological state where we have 
> hundreds of masterprocwals to process and it is taking too long so we just 
> want to start over -- then master startup will have a dilemma. Master startup 
> needs hbase:meta to be online. If the masterprocwals have been removed, there 
> may be no outstanding assign or a servercrashprocedure with coverage for 
> hbase:meta (I ran into this issue repeatedly in internal testing purging 
> masterprocwals on a large test cluster). Worse, when master startup cannot 
> find an online hbase:meta, it exits after exhausting the RPC retries.
> So, we need a holding-pattern for master startup when hbase:meta is not 
> online, if only so an operator can schedule an assign for meta or so they 
> can assign fixup procedures (HBASE-20786 has discussion on why we cannot just 
> auto-schedule an assign of meta).
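
A rough sketch of what such a holding-pattern could look like, assuming the 
master can poll whether hbase:meta is online. This is not the attached patch; 
the class name, the waitForMeta method and its parameters are all hypothetical:

```java
import java.util.function.BooleanSupplier;

public class MetaHoldingPattern {

    /**
     * Hypothetical helper: instead of exiting after a fixed number of
     * retries, keep polling until meta is online or a stop is requested.
     *
     * @return true if meta came online, false if stopped or interrupted
     */
    public static boolean waitForMeta(BooleanSupplier metaOnline,
                                      BooleanSupplier stopRequested,
                                      long pollMillis) {
        while (!metaOnline.getAsBoolean()) {
            if (stopRequested.getAsBoolean()) {
                return false;
            }
            // Tell the operator why startup is parked.
            System.out.println("hbase:meta is not online; holding until it is assigned");
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Simulate an operator getting meta assigned after a few polls.
        final int[] polls = {0};
        boolean online = waitForMeta(() -> ++polls[0] > 3, () -> false, 10L);
        System.out.println("meta online: " + online);
    }
}
```

The point of the loop is only that startup waits rather than exits, which 
leaves time for an operator to schedule the meta assign.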



