[ 
https://issues.apache.org/jira/browse/HBASE-20589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16476896#comment-16476896
 ] 

Guanghao Zhang commented on HBASE-20589:
----------------------------------------

2018-05-16 08:31:44,077 INFO  [Thread-409] master.ServerManager(887): Finished 
waiting on RegionServer count=3; waited=1261ms, expected min=3 server(s), max=3 
server(s), master is running 
 2018-05-16 08:31:44,384 DEBUG [Thread-409] procedure2.ProcedureExecutor(884): 
Stored pid=29, state=RUNNABLE:RECOVER_META_PREPARE; RecoverMetaProcedure 
failedMetaServer=null, splitWal=true 
 2018-05-16 08:31:44,476 INFO  [PEWorker-3] 
procedure.RecoverMetaProcedure(125): Start pid=29, 
state=RUNNABLE:RECOVER_META_SPLIT_LOGS; RecoverMetaProcedure 
failedMetaServer=null, splitWal=true
 2018-05-16 08:31:44,569 INFO [PEWorker-3] procedure.RecoverMetaProcedure(157): 
pid=29, state=RUNNABLE:RECOVER_META_ASSIGN_REGIONS; RecoverMetaProcedure 
failedMetaServer=null, splitWal=true; Retaining meta assignment to 
server=hao-optiplex-7050,41617,1526430656886 
 2018-05-16 08:31:44,569 INFO [PEWorker-3] procedure2.ProcedureExecutor(1515): 
Initialized subprocedures=[{pid=30, ppid=29, 
state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure table=hbase:meta, 
region=1588230740, target=hao-optiplex-7050,41617,1526430656886}] 
 2018-05-16 08:31:44,713 INFO [PEWorker-15] 
procedure.MasterProcedureScheduler(640): pid=30, ppid=29, 
state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure table=hbase:meta, 
region=1588230740, target=hao-optiplex-7050,41617,1526430656886 checking lock 
on 1588230740 
 {color:#FF0000}2018-05-16 08:31:44,713 INFO [PEWorker-15] 
assignment.AssignProcedure(218): Starting pid=30, ppid=29, 
state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure table=hbase:meta, 
region=1588230740, target=hao-optiplex-7050,41617,1526430656886; rit=OFFLINE, 
location=hao-optiplex-7050,41617,1526430656886; forceNewPlan=false, 
retain=true{color}

 

The log line in red is the problem. The AssignProcedure thinks meta is offline 
(rit=OFFLINE) and will assign meta to a new region server. I think the right 
behavior is to return false directly from the check below, with no need to 
assign meta:
{code:java}
protected boolean startTransition(final MasterProcedureEnv env,
    final RegionStateNode regionNode) throws IOException {
  // If the region is already open we can't do much...
  if (regionNode.isInState(State.OPEN) && isServerOnline(env, regionNode)) {
    LOG.info("Assigned, not reassigning; " + this + "; "
        + regionNode.toShortString());
    return false;
  }
  ......
}{code}
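
To make the failure mode concrete, here is a minimal, self-contained sketch of the guard. It is not the real HBase code: AssignGuardSketch, RegionNode, and the simplified startTransition signature are hypothetical stand-ins, and the final "return true" is an assumption about what the elided part of the method amounts to. It only illustrates why a node that starts OFFLINE never takes the early "return false" path, even when its recorded location is an online server.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical, simplified model for illustration; not the real HBase classes.
class AssignGuardSketch {
  enum State { OFFLINE, OPEN }

  // Stand-in for RegionStateNode: a freshly created node starts OFFLINE.
  static final class RegionNode {
    State state = State.OFFLINE;
    final String location;

    RegionNode(String location) {
      this.location = location;
    }
  }

  // Same shape as the guard quoted above: false means "no assign needed".
  static boolean startTransition(RegionNode node, Set<String> onlineServers) {
    if (node.state == State.OPEN && onlineServers.contains(node.location)) {
      return false; // already assigned to a live server, nothing to do
    }
    return true; // assumption: the elided code goes on to assign the region
  }

  public static void main(String[] args) {
    String server = "hao-optiplex-7050,41617,1526430656886";
    Set<String> online = new HashSet<>();
    online.add(server);

    // New active master: the node is created fresh, so its state is OFFLINE.
    RegionNode meta = new RegionNode(server);
    System.out.println("fresh node, assign again? " + startTransition(meta, online));

    // If the in-memory state reflected reality, the guard would skip the assign.
    meta.state = State.OPEN;
    System.out.println("open node, assign again? " + startTransition(meta, online));
  }
}
```

In this model the location already points at the live server, so only the in-memory state stops the guard from short-circuiting, which matches "rit=OFFLINE, location=hao-optiplex-7050,41617,1526430656886" in the red log line.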

> Don't need to assign meta to a new RS when standby master become active
> -----------------------------------------------------------------------
>
>                 Key: HBASE-20589
>                 URL: https://issues.apache.org/jira/browse/HBASE-20589
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Guanghao Zhang
>            Priority: Major
>         Attachments: HBASE-20589.master.001.patch
>
>
> I found this problem while writing a unit test for HBASE-20569. The master's 
> finishActiveMasterInitialization now runs a new RecoverMetaProcedure 
> (HBASE-18261), which has a sub-procedure, AssignProcedure. AssignProcedure 
> skips assigning a region when the region's state is OPEN and its server is 
> online. But the new region state node is created with state OFFLINE, so the 
> procedure assigns meta to a new RS, and the old RS is killed when it reports 
> to the master. This makes master initialization take a long time. I will 
> attach a unit test to show this. FYI [~stack]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
