[
https://issues.apache.org/jira/browse/HBASE-20589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16480613#comment-16480613
]
Guanghao Zhang commented on HBASE-20589:
----------------------------------------
2018-05-16 08:31:44,713 INFO [PEWorker-15]
procedure.MasterProcedureScheduler(640): pid=30, ppid=29,
state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure table=hbase:meta,
region=1588230740, target=hao-optiplex-7050,41617,1526430656886 checking lock
on 1588230740
2018-05-16 08:31:44,713 INFO [PEWorker-15] assignment.AssignProcedure(218):
Starting pid=30, ppid=29, state=RUNNABLE:REGION_TRANSITION_QUEUE;
AssignProcedure table=hbase:meta, region=1588230740,{color:#FF0000}
target=hao-optiplex-7050,41617,1526430656886; rit=OFFLINE,
location=hao-optiplex-7050,41617,1526430656886{color}; forceNewPlan=false,
retain=true
2018-05-16 08:31:44,717 INFO [IPC Server handler 3 on 41717]
blockmanagement.BlockManager(1169): BLOCK* addToInvalidates:
blk_1073741842_1018 127.0.0.1:44841 127.0.0.1:45905 127.0.0.1:44416
2018-05-16 08:31:44,871 INFO [master/hao-OptiPlex-7050:0]
balancer.BaseLoadBalancer(1489): Reassigned 1 regions. 0 retained the
pre-restart assignment.{color:#FF0000} 1 regions were assigned to random
hosts{color}, since the old hosts for these regions are no longer present in
the cluster. These hosts were:
hao-OptiPlex-7050
2018-05-16 08:31:44,876 INFO [PEWorker-13] zookeeper.MetaTableLocator(452):
Setting hbase:meta (replicaId=0) location in ZooKeeper as
{color:#FF0000}hao-optiplex-7050,44270,1526430656935{color}
2018-05-16 08:31:44,980 INFO [PEWorker-13]
assignment.RegionTransitionProcedure(251): Dispatch pid=30, ppid=29,
state=RUNNABLE:REGION_TRANSITION_DISPATCH; AssignProcedure table=hbase:meta,
region=1588230740, target=hao-optiplex-7050,41617,1526430656886; rit=OPENING,
location=hao-optiplex-7050,44270,1526430656935
New findings: the key problem is that the meta region was assigned to a new,
random region server. Even if the assign procedure considers meta offline and
assigns it to the old target RS, that is correct. But the log shows that it
assigned meta to a new random RS, which is wrong.
> Don't need to assign meta to a new RS when standby master become active
> -----------------------------------------------------------------------
>
> Key: HBASE-20589
> URL: https://issues.apache.org/jira/browse/HBASE-20589
> Project: HBase
> Issue Type: Bug
> Reporter: Guanghao Zhang
> Assignee: Guanghao Zhang
> Priority: Major
> Attachments: HBASE-20589.master.001.patch,
> HBASE-20589.master.002.patch, HBASE-20589.master.003.patch,
> HBASE-20589.master.003.patch, HBASE-20589.master.004.patch
>
>
> I found this problem while writing a unit test for HBASE-20569. The master's
> finishActiveMasterInitialization now introduces a new
> RecoverMetaProcedure (HBASE-18261), which has a sub-procedure AssignProcedure.
> AssignProcedure skips assigning a region when the region's state is OPEN and
> its server is online. But the new region state node is created with state
> OFFLINE, so it assigns meta to a new RS and then kills the old RS when the
> old RS reports to the master. This makes master initialization take a long
> time. I will attach a unit test to show this. FYI [~stack]
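
The skip rule described above can be sketched roughly as follows. This is only an illustration, not the actual HBase code: the class AssignSkipSketch, the State enum, the RegionNode class, the canSkipAssign method, and the server names are all hypothetical stand-ins. It shows why a freshly created state node in OFFLINE never satisfies the skip condition, even when the old RS is still online.

```java
import java.util.Collections;
import java.util.Set;

public class AssignSkipSketch {
    // Hypothetical stand-in for a region's state; not HBase's real enum.
    public enum State { OFFLINE, OPENING, OPEN }

    // Hypothetical stand-in for the master's in-memory region state node.
    public static final class RegionNode {
        final State state;
        final String location; // last known server, may be null

        public RegionNode(State state, String location) {
            this.state = state;
            this.location = location;
        }
    }

    // Assumed rule from the issue description: the assign can be skipped
    // only if the region is OPEN and its server is currently online.
    public static boolean canSkipAssign(RegionNode node, Set<String> onlineServers) {
        return node.state == State.OPEN
            && node.location != null
            && onlineServers.contains(node.location);
    }

    public static void main(String[] args) {
        // Made-up server name in the usual host,port,startcode form.
        String oldRs = "rs-old,41617,1";
        Set<String> online = Collections.singleton(oldRs);

        // State known to a master that tracked the region as OPEN.
        RegionNode tracked = new RegionNode(State.OPEN, oldRs);
        // State node newly created on the new active master.
        RegionNode fresh = new RegionNode(State.OFFLINE, oldRs);

        System.out.println("tracked node skips assign: " + canSkipAssign(tracked, online));
        System.out.println("fresh node skips assign:   " + canSkipAssign(fresh, online));
    }
}
```

Under these assumptions the fresh node falls through to a full assign although the old RS is still serving meta, which matches the behavior reported in the description.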