[ https://issues.apache.org/jira/browse/HBASE-21191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16612714#comment-16612714 ]
stack commented on HBASE-21191:
-------------------------------

Playing w/ this patch... I removed the master wal procs content. I had to remove the WALs for any old servers too, else the meta assign was getting scheduled; i.e. it's hard to manufacture the case where there is no assign for hbase:meta.

I tried to use a straight shell assign to do the hbase:meta assign, but it came back with:

{code}
ERROR: org.apache.hadoop.hbase.PleaseHoldException: Master is initializing
    at org.apache.hadoop.hbase.master.HMaster.checkInitialized(HMaster.java:2966)
    at org.apache.hadoop.hbase.master.MasterRpcServices.assignRegion(MasterRpcServices.java:558)
    at org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:413)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
{code}

So, ok, this should be an hbck2 task. Working on it.

> Add a holding-pattern if no assign for meta or namespace (Can happen if
> masterprocwals have been cleared).
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-21191
>                 URL: https://issues.apache.org/jira/browse/HBASE-21191
>             Project: HBase
>          Issue Type: Sub-task
>          Components: amv2
>            Reporter: stack
>            Assignee: stack
>            Priority: Major
>             Fix For: 2.1.1
>
>         Attachments: HBASE-21191.branch-2.1.001.patch
>
>
> If the masterprocwals have been removed -- operator error, hdfs data loss, or
> because we have gotten ourselves into a pathological state where we have
> hundreds of masterprocwals to process and it is taking too long so we just
> want to start over -- then master startup will have a dilemma. Master startup
> needs hbase:meta to be online. If the masterprocwals have been removed, there
> may be no outstanding assign or servercrashprocedure with coverage for
> hbase:meta (I ran into this issue repeatedly in internal testing when purging
> masterprocwals on a large test cluster). Worse, when master startup cannot
> find an online hbase:meta, it exits after exhausting the RPC retries.
> So, we need a holding-pattern for master startup if hbase:meta is not online,
> if only so an operator can schedule an assign for meta or so they can assign
> fixup procedures (HBASE-20786 has discussion on why we cannot just
> auto-schedule an assign of meta).

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)