[
https://issues.apache.org/jira/browse/HBASE-18694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16487494#comment-16487494
]
Pankaj Kumar commented on HBASE-18694:
--------------------------------------
If the active master goes down in the middle of the assignment process (after
an RS has already started opening the regions), the new master takes over
region assignment by processing the regions that were in transition during the
previous master's operation.
The steps are as follows:
1. During failover, the new active master sets the region state to OFFLINE and
starts processing the transition znodes.
2. While the transition znodes are being processed:
- The ZK worker may receive a notification for an RS_ZK update
(RS_ZK_REGION_OPENED, because the RS opened the region as requested by the old
master), which may mislead the HM.
- Because the region state/server name no longer matches (the active master
thread would have reset it to OFFLINE/OPEN), the HM, to avoid a
double-assignment problem, processes those regions as CLOSE without a ZK
transition.
As a result, those regions are never opened again and remain offline.
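The sequence above can be sketched as a toy state model. This is only an
illustration of the race, not the actual AssignmentManager code: the class
FailoverRaceSketch, its methods, and the Action enum are hypothetical names
made up for this example, and the real master tracks far more state.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the failover race described above; NOT HBase's real code. */
public class FailoverRaceSketch {

    /** Simplified region states (assumption: only the ones needed here). */
    public enum State { OFFLINE, PENDING_OPEN, OPENING, OPEN }

    /** Hypothetical outcome of handling an RS_ZK_REGION_OPENED event. */
    public enum Action { MARK_OPEN, CLOSE_WITHOUT_ZK }

    /** In-memory region states as seen by the currently active master. */
    private final Map<String, State> regionStates = new HashMap<>();

    /** Normal path: the master asked an RS to open the region. */
    public void startOpening(String region) {
        regionStates.put(region, State.OPENING);
    }

    /** Step 1: after failover the new master resets the region to OFFLINE. */
    public void resetOnFailover(String region) {
        regionStates.put(region, State.OFFLINE);
    }

    /** Step 2: the ZK worker delivers RS_ZK_REGION_OPENED for the region. */
    public Action onRegionOpened(String region) {
        State current = regionStates.get(region);
        if (current == State.PENDING_OPEN || current == State.OPENING) {
            regionStates.put(region, State.OPEN);
            return Action.MARK_OPEN;
        }
        // State mismatch: the region is closed to avoid double assignment,
        // and the in-memory state is left untouched, so it stays offline.
        return Action.CLOSE_WITHOUT_ZK;
    }

    public State stateOf(String region) {
        return regionStates.get(region);
    }

    public static void main(String[] args) {
        FailoverRaceSketch newMaster = new FailoverRaceSketch();
        newMaster.resetOnFailover("region-1");
        // The RS finishes the open that the old master had requested.
        System.out.println(newMaster.onRegionOpened("region-1"));
        System.out.println(newMaster.stateOf("region-1"));
    }
}
```

In this model nothing re-queues the region after CLOSE_WITHOUT_ZK, which
corresponds to the regions staying offline in the scenario described above.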
> The master switch of ha mode causes region not online
> -----------------------------------------------------
>
> Key: HBASE-18694
> URL: https://issues.apache.org/jira/browse/HBASE-18694
> Project: HBase
> Issue Type: Bug
> Components: master, Zookeeper
> Affects Versions: 0.98.10, 1.3.0
> Reporter: Bo Cui
> Assignee: Pankaj Kumar
> Priority: Major
> Attachments: Sequence Diagram.jpg, hbase-18694-0.98.10.patch
>
>
> HA mode has two masters, firstMaster and secondMaster; at the beginning,
> secondMaster is active and firstMaster is standby.
> After secondMaster sends the openRegion request to the RS, the RS begins to
> open the regions, and secondMaster happens to abort; firstMaster
> becomes active and receives the ZK message (node changed, opened), but the
> region state in firstMaster's memory is not OPENING, so firstMaster sends a
> closeRegion request to the RS, and the region never comes online.
> Other versions probably have the same problem.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)