[
https://issues.apache.org/jira/browse/HBASE-14931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andrew Purtell resolved HBASE-14931.
------------------------------------
Resolution: Duplicate
Fix Version/s: (was: 0.98.18)
> Active master switches may cause region close forever
> -----------------------------------------------------
>
> Key: HBASE-14931
> URL: https://issues.apache.org/jira/browse/HBASE-14931
> Project: HBase
> Issue Type: Bug
> Components: master
> Affects Versions: 0.98.10
> Reporter: Shuaifeng Zhou
> Priority: Critical
>
> The master web page (port 60010) shows that a region is online on one RS, but
> accessing data in that region throws NotServingRegionException. After looking
> through the source code and logs, we found that this happens because the
> active master switched while the region was opening:
> 1, master1 opens region 'region1': it sends an open region request to the rs
> and creates the node in zk
> 2, master1 stops
> 3, master2 becomes the active master
> 4, master2 obtains the status of all regions; the status of 'region1' is offline
> 5, the rs opens 'region1', changes the node in zk to opened, and sends a
> message to master2
> 6, master2 receives RS_ZK_REGION_OPENED, but the region status is not
> PENDING_OPEN or OPENING, so it sends an unassign to the rs and 'region1' is closed
> {code:title=AssignmentManager.java|borderStyle=solid}
> case RS_ZK_REGION_OPENED:
>   // Should see OPENED after OPENING but possible after PENDING_OPEN.
>   if (regionState == null
>       || !regionState.isPendingOpenOrOpeningOnServer(sn)) {
>     LOG.warn("Received OPENED for " + prettyPrintedRegionName
>         + " from " + sn + " but the region isn't PENDING_OPEN/OPENING here: "
>         + regionStates.getRegionState(encodedName));
>     if (regionState != null) {
>       // Close it without updating the internal region states,
>       // so as not to create double assignments in unlucky scenarios
>       // mentioned in OpenRegionHandler#process
>       unassign(regionState.getRegion(), null, -1, null, false, sn);
>     }
>     return;
>   }
> {code}
> 7, master2 continues processing the region info left over from when master1
> stopped, finds that the status of 'region1' in zk is opened, and updates the
> in-memory status to opened.
> 8, at this point, 'region1' is shown as opened on the master status webpage,
> but it is not open on any regionserver.
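
The sequence above can be replayed with a small stand-alone model. This is only a sketch to make the ordering problem concrete: it does not use any HBase classes, and every name in it (MasterSwitchRaceSketch, masterView, zkNode, rsActual, and the method names) is hypothetical. It also assumes that the unassign in step 6 leaves the zk node untouched, which is what step 7 implies.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the reported race. Hypothetical names, not HBase APIs. */
final class MasterSwitchRaceSketch {

    enum State { OFFLINE, PENDING_OPEN, OPENING, OPEN, CLOSED }

    // Step 4: master2 loaded its region states and sees region1 as offline.
    State masterView = State.OFFLINE;
    // Steps 1-2: master1 created the zk node and asked the rs to open region1.
    State zkNode = State.OPENING;
    State rsActual = State.OPENING;
    final List<String> log = new ArrayList<>();

    /** Step 5: the rs finishes the open, updates zk, and master2 is notified. */
    void regionServerFinishesOpen() {
        rsActual = State.OPEN;
        zkNode = State.OPEN;
        onOpenedEvent();
    }

    /** Step 6: mirrors the quoted RS_ZK_REGION_OPENED branch in simplified form. */
    void onOpenedEvent() {
        boolean expected =
            masterView == State.PENDING_OPEN || masterView == State.OPENING;
        if (!expected) {
            log.add("OPENED received while master state is " + masterView
                + "; sending unassign without updating master state");
            // The rs closes the region. masterView is deliberately left as-is,
            // and (assumption) the zk node still says OPEN.
            rsActual = State.CLOSED;
            return;
        }
        masterView = State.OPEN;
    }

    /** Step 7: master2 later processes the leftover zk node from master1. */
    void processLeftoverZkNode() {
        if (zkNode == State.OPEN) {
            log.add("zk node says OPEN; marking region OPEN in memory");
            masterView = State.OPEN;
        }
    }

    /** Runs steps 5 to 7 in the order described in the report. */
    static MasterSwitchRaceSketch simulate() {
        MasterSwitchRaceSketch s = new MasterSwitchRaceSketch();
        s.regionServerFinishesOpen();
        s.processLeftoverZkNode();
        return s;
    }

    public static void main(String[] args) {
        MasterSwitchRaceSketch s = simulate();
        s.log.forEach(System.out::println);
        System.out.println("master view: " + s.masterView
            + ", region server: " + s.rsActual);
    }
}
```

In this model the final state is a master view of OPEN with the region CLOSED on the rs, which matches step 8: the web page reports the region as opened while no regionserver is serving it.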
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)