[
https://issues.apache.org/jira/browse/HBASE-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
stack resolved HBASE-3937.
--------------------------
Resolution: Invalid
Resolving as invalid. A bunch of work has been done in AM (AssignmentManager)
since 0.90. If this issue still exists, it will have a new form.
> Region PENDING-OPEN timeout with un-expected ZK node state leads to an
> endless loop
> -----------------------------------------------------------------------------------
>
> Key: HBASE-3937
> URL: https://issues.apache.org/jira/browse/HBASE-3937
> Project: HBase
> Issue Type: Bug
> Components: master
> Affects Versions: 0.90.3
> Reporter: Jieshan Bean
> Assignee: Jieshan Bean
> Fix For: 0.95.1
>
>
> The problem occurs in the following scenario:
> 1. HMaster assigned region A to RS1, so the RegionState was set to
> PENDING_OPEN.
> 2. Because there were too many open requests, the open process on RS1 was
> blocked.
> 3. Some time later, TimeoutMonitor found that the assignment of A had timed
> out. Because the RegionState was PENDING_OPEN, it went into the following
> handler, which just puts the region into a set of regions waiting to be
> assigned:
>     case PENDING_OPEN:
>       LOG.info("Region has been PENDING_OPEN for too " +
>           "long, reassigning region=" +
>           regionInfo.getRegionNameAsString());
>       assigns.put(regionState.getRegion(), Boolean.TRUE);
>       break;
> So in this case we assume that the ZK node state is still OFFLINE. In the
> normal flow, that assumption holds.
> 4. But before the actual reassignment started, RS1 processed its pending
> request. That interfered with the new assignment, because RS1 updated the ZK
> node state from OFFLINE to OPENING.
> 5. The new assignment started and sent the region to RS2 to open. While
> opening, RS2 must update the ZK node state from OFFLINE to OPENING. Because
> the current state was already OPENING, this operation failed.
> So this region could never be opened successfully.
> So I think, to avoid this problem, in the PENDING_OPEN case of
> TimeoutMonitor, we should transition the ZK node state to OFFLINE first.
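
The race in steps 1-5 can be replayed with a small toy model. This is only a sketch and not HBase code: PendingOpenRace, ZkNode, ZkState, transition, forceOffline and rs2CanOpen are hypothetical names invented for illustration, and the znode is modeled as an in-memory compare-and-set cell rather than a real ZooKeeper node. It also assumes the proposed reset to OFFLINE runs immediately before the region is handed to RS2.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Toy model of the scenario above. None of these names are HBase APIs. */
final class PendingOpenRace {

    /** Simplified subset of the znode states mentioned in the description. */
    enum ZkState { OFFLINE, OPENING }

    /** Hypothetical stand-in for the region's znode; transitions are compare-and-set. */
    static final class ZkNode {
        private final AtomicReference<ZkState> state =
                new AtomicReference<>(ZkState.OFFLINE);

        /** Succeeds only if the node is currently in the expected state. */
        boolean transition(ZkState expected, ZkState next) {
            return state.compareAndSet(expected, next);
        }

        /** Unconditional reset, modeling the fix proposed in the description. */
        void forceOffline() {
            state.set(ZkState.OFFLINE);
        }
    }

    /**
     * Replays steps 1-5 and reports whether RS2 can move the node
     * from OFFLINE to OPENING.
     */
    static boolean rs2CanOpen(boolean forceOfflineBeforeReassign) {
        // Step 1: master assigns region A to RS1; the node starts as OFFLINE.
        ZkNode node = new ZkNode();

        // Steps 2-3: RS1 is blocked, the timeout fires, and the region is
        // queued for reassignment without touching the node.

        // Step 4: RS1 finally handles the stale open request.
        node.transition(ZkState.OFFLINE, ZkState.OPENING);

        // Proposed fix (assumed timing): reset the node before reassigning.
        if (forceOfflineBeforeReassign) {
            node.forceOffline();
        }

        // Step 5: RS2 expects OFFLINE and tries to move the node to OPENING.
        return node.transition(ZkState.OFFLINE, ZkState.OPENING);
    }

    public static void main(String[] args) {
        System.out.println("without reset, RS2 opens: " + rs2CanOpen(false));
        System.out.println("with reset, RS2 opens: " + rs2CanOpen(true));
    }
}
```

Without the reset, RS2's transition fails because the node is already OPENING. With the reset it succeeds. The model deliberately ignores what happens to RS1's own in-flight open after the reset, which a real fix would also have to handle.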