[ 
https://issues.apache.org/jira/browse/HBASE-20589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16480613#comment-16480613
 ] 

Guanghao Zhang commented on HBASE-20589:
----------------------------------------

2018-05-16 08:31:44,713 INFO [PEWorker-15] 
procedure.MasterProcedureScheduler(640): pid=30, ppid=29, 
state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure table=hbase:meta, 
region=1588230740, target=hao-optiplex-7050,41617,1526430656886 checking lock 
on 1588230740
 2018-05-16 08:31:44,713 INFO [PEWorker-15] assignment.AssignProcedure(218): 
Starting pid=30, ppid=29, state=RUNNABLE:REGION_TRANSITION_QUEUE; 
AssignProcedure table=hbase:meta, region=1588230740,{color:#FF0000} 
target=hao-optiplex-7050,41617,1526430656886; rit=OFFLINE, 
location=hao-optiplex-7050,41617,1526430656886{color}; forceNewPlan=false, 
retain=true 
 2018-05-16 08:31:44,717 INFO [IPC Server handler 3 on 41717] 
blockmanagement.BlockManager(1169): BLOCK* addToInvalidates: 
blk_1073741842_1018 127.0.0.1:44841 127.0.0.1:45905 127.0.0.1:44416 
 2018-05-16 08:31:44,871 INFO [master/hao-OptiPlex-7050:0] 
balancer.BaseLoadBalancer(1489): Reassigned 1 regions. 0 retained the 
pre-restart assignment.{color:#FF0000} 1 regions were assigned to random 
hosts{color}, since the old hosts for these regions are no longer present in 
the cluster. These hosts were: 
 hao-OptiPlex-7050
 2018-05-16 08:31:44,876 INFO [PEWorker-13] zookeeper.MetaTableLocator(452): 
Setting hbase:meta (replicaId=0) location in ZooKeeper as 
{color:#FF0000}hao-optiplex-7050,44270,1526430656935{color}
 2018-05-16 08:31:44,980 INFO [PEWorker-13] 
assignment.RegionTransitionProcedure(251): Dispatch pid=30, ppid=29, 
state=RUNNABLE:REGION_TRANSITION_DISPATCH; AssignProcedure table=hbase:meta, 
region=1588230740, target=hao-optiplex-7050,41617,1526430656886; rit=OPENING, 
location=hao-optiplex-7050,44270,1526430656935

 

New findings: the key problem is that the meta region was assigned to a new, 
random region server. Even if the assign procedure treats meta as offline, 
assigning it to the old target RS (port 41617 in the log above) would be 
correct. But the log shows that meta was instead assigned to a new random RS 
(port 44270), which is wrong.
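
For illustration, the skip condition from the issue description quoted below 
can be sketched as follows. This is a minimal sketch, not the actual HBase 
code: the class AssignSkipSketch, the RegionNode type, the State enum, and 
shouldSkipAssign are all hypothetical names, and the server names are made up. 
It only shows why a region state node that starts as OFFLINE never takes the 
"skip assign" path, even when the old RS is still online.

```java
import java.util.Set;

// Hypothetical sketch; none of these names are HBase APIs.
public class AssignSkipSketch {
    enum State { OFFLINE, OPENING, OPEN }

    // Stand-in for a region state node held by the master.
    static final class RegionNode {
        // Assumption taken from the issue description: a newly created
        // node starts in state OFFLINE.
        State state = State.OFFLINE;
        final String location;

        RegionNode(String location) {
            this.location = location;
        }
    }

    // Skip the assign only when the region is OPEN and its server is online.
    static boolean shouldSkipAssign(RegionNode node, Set<String> onlineServers) {
        return node.state == State.OPEN
            && node.location != null
            && onlineServers.contains(node.location);
    }

    public static void main(String[] args) {
        Set<String> online = Set.of("rs-old,41617,1");

        // Newly active master: the node was just created, so it is OFFLINE.
        RegionNode meta = new RegionNode("rs-old,41617,1");
        System.out.println(shouldSkipAssign(meta, online)); // false

        // If the node reflected the real state, the assign would be skipped.
        meta.state = State.OPEN;
        System.out.println(shouldSkipAssign(meta, online)); // true
    }
}
```

In this sketch the first check returns false only because of the initial 
state, not because the old RS is gone, which matches the behaviour described 
in the issue.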

> Don't need to assign meta to a new RS when standby master become active
> -----------------------------------------------------------------------
>
>                 Key: HBASE-20589
>                 URL: https://issues.apache.org/jira/browse/HBASE-20589
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Guanghao Zhang
>            Assignee: Guanghao Zhang
>            Priority: Major
>         Attachments: HBASE-20589.master.001.patch, 
> HBASE-20589.master.002.patch, HBASE-20589.master.003.patch, 
> HBASE-20589.master.003.patch, HBASE-20589.master.004.patch
>
>
> I found this problem while writing a unit test for HBASE-20569. The master's 
> finishActiveMasterInitialization now introduces a new 
> RecoverMetaProcedure (HBASE-18261), which has a sub-procedure, 
> AssignProcedure. AssignProcedure skips assigning a region when the region's 
> state is OPEN and its server is online. But the new region state node is 
> created with state OFFLINE, so it assigns meta to a new RS and then kills the 
> old RS when the old RS reports to the master. This makes master 
> initialization take a long time. I will attach a unit test to show this. 
> FYI [~stack]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)