[
https://issues.apache.org/jira/browse/HBASE-20589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16476896#comment-16476896
]
Guanghao Zhang commented on HBASE-20589:
----------------------------------------
2018-05-16 08:31:44,077 INFO [Thread-409] master.ServerManager(887): Finished
waiting on RegionServer count=3; waited=1261ms, expected min=3 server(s), max=3
server(s), master is running
2018-05-16 08:31:44,384 DEBUG [Thread-409] procedure2.ProcedureExecutor(884):
Stored pid=29, state=RUNNABLE:RECOVER_META_PREPARE; RecoverMetaProcedure
failedMetaServer=null, splitWal=true
2018-05-16 08:31:44,476 INFO [PEWorker-3]
procedure.RecoverMetaProcedure(125): Start pid=29,
state=RUNNABLE:RECOVER_META_SPLIT_LOGS; RecoverMetaProcedure
failedMetaServer=null, splitWal=true
2018-05-16 08:31:44,569 INFO [PEWorker-3] procedure.RecoverMetaProcedure(157):
pid=29, state=RUNNABLE:RECOVER_META_ASSIGN_REGIONS; RecoverMetaProcedure
failedMetaServer=null, splitWal=true; Retaining meta assignment to
server=hao-optiplex-7050,41617,1526430656886
2018-05-16 08:31:44,569 INFO [PEWorker-3] procedure2.ProcedureExecutor(1515):
Initialized subprocedures=[{pid=30, ppid=29,
state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure table=hbase:meta,
region=1588230740, target=hao-optiplex-7050,41617,1526430656886}]
2018-05-16 08:31:44,713 INFO [PEWorker-15]
procedure.MasterProcedureScheduler(640): pid=30, ppid=29,
state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure table=hbase:meta,
region=1588230740, target=hao-optiplex-7050,41617,1526430656886 checking lock
on 1588230740
{color:#FF0000}2018-05-16 08:31:44,713 INFO [PEWorker-15]
assignment.AssignProcedure(218): Starting pid=30, ppid=29,
state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure table=hbase:meta,
region=1588230740, target=hao-optiplex-7050,41617,1526430656886; rit=OFFLINE,
location=hao-optiplex-7050,41617,1526430656886; forceNewPlan=false,
retain=true{color}
The log entry in red shows the problem. AssignProcedure thinks meta is offline
(rit=OFFLINE) and will assign meta to a new region server. I think the right
behavior is to return false directly, with no need to assign meta...
{code:java}
protected boolean startTransition(final MasterProcedureEnv env,
    final RegionStateNode regionNode) throws IOException {
  // If the region is already open we can't do much...
  if (regionNode.isInState(State.OPEN) && isServerOnline(env, regionNode)) {
    LOG.info("Assigned, not reassigning; " + this + "; "
        + regionNode.toShortString());
    return false;
  }
  ......
}
{code}
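To make the failure mode concrete, here is a minimal standalone sketch of the guard quoted above. It is not HBase code: AssignSketch, RegionNode, and the server name are hypothetical stand-ins, and the final "return true" only stands in for the elided remainder of startTransition. It assumes, as the issue description says, that a newly created region state node starts in state OFFLINE.

```java
import java.util.Collections;
import java.util.Set;

public class AssignSketch {
    // Only the two states relevant to this issue; the real State enum has more.
    enum State { OFFLINE, OPEN }

    // Hypothetical stand-in for RegionStateNode: a state plus the last known location.
    static final class RegionNode {
        // Assumption taken from the issue description: a fresh node starts OFFLINE.
        State state = State.OFFLINE;
        final String location;

        RegionNode(String location) {
            this.location = location;
        }
    }

    // Mirrors the quoted guard: false means "already assigned, nothing to do".
    static boolean startTransition(RegionNode node, Set<String> onlineServers) {
        if (node.state == State.OPEN && onlineServers.contains(node.location)) {
            return false;
        }
        // Placeholder for the elided remainder: the assignment goes ahead.
        return true;
    }

    public static void main(String[] args) {
        String server = "rs-1.example,16020,1"; // hypothetical server name
        Set<String> online = Collections.singleton(server);

        // Node as rebuilt by a newly active master: location is known, state is OFFLINE.
        RegionNode meta = new RegionNode(server);
        System.out.println("fresh node reassigns: " + startTransition(meta, online));

        // Same node once its state reflects that the region is really open.
        meta.state = State.OPEN;
        System.out.println("open node reassigns: " + startTransition(meta, online));
    }
}
```

In this sketch the guard only short-circuits once the node's state is OPEN, which matches the red log entry: the location is already known and online, but rit=OFFLINE lets the assignment go ahead.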
> Don't need to assign meta to a new RS when standby master become active
> -----------------------------------------------------------------------
>
> Key: HBASE-20589
> URL: https://issues.apache.org/jira/browse/HBASE-20589
> Project: HBase
> Issue Type: Bug
> Reporter: Guanghao Zhang
> Priority: Major
> Attachments: HBASE-20589.master.001.patch
>
>
> I found this problem while writing a unit test for HBASE-20569. The master's
> finishActiveMasterInitialization now introduces a new
> RecoverMetaProcedure (HBASE-18261), which has a subprocedure, AssignProcedure.
> AssignProcedure skips assigning a region when the region's state is OPEN and
> its server is online. But the new region state node is created with state
> OFFLINE, so it assigns meta to a new RS, and the old RS is killed when it
> reports to the master. This makes master initialization take a long
> time. I will attach a unit test to show this. FYI [~stack]