Hi all,
We encountered a strange scenario in our HBase cluster (based on the 1.0 branch).
The scenario is as follows.
There is a table (t1) in disabled state with a region (r1). Region r1 was last
assigned to a Region Server (RS1). For some time, network communication
between the HMaster (HM1) and the RegionServer (RS1) was broken.
During this period, a user tried to enable table t1 and it failed, because
region r1 could not be assigned to any of the live RSs. The assignment was
skipped in the forceRegionStateToOffline() method of AssignmentManager
because of the check below:
    if (useZKForAssignment
        && regionStates.isServerDeadAndNotProcessed(sn)
        && wasRegionOnDeadServerByMeta(region, sn)) {
    }
We found that regionStates.isServerDeadAndNotProcessed(sn) puts RS1 in its
deadServers map and then waits for SSH (ServerShutdownHandler) to process RS1,
which never happens because the session between RS1 and ZK is still fine.
    synchronized boolean isServerDeadAndNotProcessed(ServerName server) {
      -----
          if (serverManager.isServerReachable(server)) {
            return false;
          }
          // The size of deadServers won't grow unbounded.
          deadServers.put(hostAndPort, Long.valueOf(startCode));
        }
        // Watch out! If the server is not dead, the region could
        // remain unassigned. That's why ServerManager#isServerReachable
        // should use some retry.
      -----
Even though the network recovered after some time, the table still could not
be enabled. This is due to:
a) deadServers never removes the entry for RS1.
b) Even if the entry is removed from deadServers, or RS1 is aborted, the table
cannot be enabled because it is already in ENABLING state. The table gets
enabled only when a Master failover happens.
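To illustrate point (a), here is a minimal stand-alone sketch of the caching
behaviour as we understand it. It is a simplified model and not the HBase
source: the class name, the string-based server key, the processedServers set,
and the rule that the reachability probe is skipped once an entry is cached
are all assumptions made for the illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

// Simplified model, NOT the HBase source. All names here are hypothetical.
class DeadServerCacheModel {
    // host:port -> start code of the server instance believed to be dead
    private final Map<String, Long> deadServers = new HashMap<>();
    // Servers whose shutdown handling (SSH) has completed
    private final Set<String> processedServers = new HashSet<>();
    // Stand-in for ServerManager#isServerReachable
    private final Predicate<String> reachable;

    DeadServerCacheModel(Predicate<String> reachable) {
        this.reachable = reachable;
    }

    synchronized boolean isServerDeadAndNotProcessed(String hostAndPort, long startCode) {
        Long deadCode = deadServers.get(hostAndPort);
        // Assumption: the probe runs only while no entry is cached for this instance.
        if (deadCode == null || startCode > deadCode) {
            if (reachable.test(hostAndPort)) {
                return false;
            }
            // Cached here and never removed by this method.
            deadServers.put(hostAndPort, startCode);
        }
        return !processedServers.contains(hostAndPort);
    }

    // Models SSH finishing its work for the server.
    synchronized void markProcessed(String hostAndPort) {
        processedServers.add(hostAndPort);
    }

    public static void main(String[] args) {
        boolean[] networkUp = {false};
        DeadServerCacheModel model = new DeadServerCacheModel(s -> networkUp[0]);
        // Network down: the probe fails and the server is cached as dead.
        System.out.println(model.isServerDeadAndNotProcessed("rs1:16020", 100L)); // prints true
        // Network back: the cached entry short-circuits the probe.
        networkUp[0] = true;
        System.out.println(model.isServerDeadAndNotProcessed("rs1:16020", 100L)); // prints true
    }
}
```

In this model the second call still reports RS1 as dead and not processed even
though the network is back, which matches what we observe: only SSH processing
(markProcessed in the sketch) would change the answer.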
A similar scenario is also discussed in the JIRAs below:
https://issues.apache.org/jira/browse/HBASE-9514?focusedCommentId=13769761&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13769761
https://issues.apache.org/jira/browse/HBASE-6469
Please let us know how to handle this scenario, or whether there is another
mechanism to recover from it.
Thanks
Bhupendra