stack created HBASE-19515:
-----------------------------
Summary: Region server left in online servers list forever if it
went down after registering to master and before creating ephemeral node
Key: HBASE-19515
URL: https://issues.apache.org/jira/browse/HBASE-19515
Project: HBase
Issue Type: Bug
Components: Region Assignment
Reporter: stack
Priority: Critical
Fix For: 2.0.0
This one is interesting. It was supposedly fixed a long time ago in
HBASE-9593 (that issue has the same subject as this one), but a problem w/
the fix was reported later, post-commit, long after the issue was closed. The 'fix'
was to register the ephemeral node in ZK BEFORE reporting in to the Master for the
first time. The problem w/ this approach is that the Master tells the RS what
name it should use when reporting in. If we register in ZK before we talk to the
Master, the name in ZK and the one the RS ends up using could differ.
In hbase2, we do the right thing and register the ephemeral node after we report
to the Master. So the issue reported in HBASE-9593 is back: an RS that dies
between reporting to the Master and registering in ZK stays registered at the
Master forever, and we'll keep trying to assign it regions. It's a real
problem.
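The window described above can be sketched as a toy model. This is not HBase code: the class, its methods, and the two in-memory sets standing in for the Master's online-servers list and the ZK ephemeral nodes are all hypothetical, and the sketch assumes the Master only drops a server when that server's ephemeral node disappears.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model of the startup race; every name here is hypothetical, not an HBase API. */
class StartupRaceSketch {
    /** Servers the toy Master believes are online. */
    final Set<String> onlineServers = new HashSet<>();
    /** Ephemeral nodes currently present in the toy ZK. */
    final Set<String> ephemeralNodes = new HashSet<>();

    /** Step 1: the RS reports in; the Master picks the name and records the server as online. */
    String reportForDuty(String host, int port, long startCode) {
        String serverName = host + "," + port + "," + startCode;
        onlineServers.add(serverName);
        return serverName;
    }

    /** Step 2: the RS creates its ephemeral node under the name the Master handed back. */
    void createEphemeralNode(String serverName) {
        ephemeralNodes.add(serverName);
    }

    /** The node vanishing is the only signal that makes the toy Master expire a server. */
    void sessionExpired(String serverName) {
        if (ephemeralNodes.remove(serverName)) {
            onlineServers.remove(serverName);
        }
    }

    /** Servers the Master considers online that never created an ephemeral node. */
    Set<String> stuckServers() {
        Set<String> stuck = new HashSet<>(onlineServers);
        stuck.removeAll(ephemeralNodes);
        return stuck;
    }
}
```

A server that dies between step 1 and step 2 has no node to lose, so sessionExpired never fires for it and it stays in onlineServers.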
That hbase2 has this issue has been masked up until now. The test that was
written for HBASE-9593, TestRSKilledWhenInitializing, is a good test but a
little sloppy. It puts up two RSs and aborts only one of them, after it has
registered at the Master but before it posts to ZK. That leaves one healthy
server up, and it is hosting hbase:meta. This is enough for the test to bluster
through. The only assign it does is of the namespace table, which goes to the
hbase:meta server. If the test created a new table and did round robin, it'd fail.
After HBASE-18946, where we do round robin on table create -- a desirable
attribute -- via the balancer so all is kosher, the test
TestRSKilledWhenInitializing now fails because we choose the hobbled
server most of the time.
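Why round robin exposes the problem can be shown with a deliberately simplified sketch. The class and method below are hypothetical and are not the balancer's actual algorithm; they only assume that regions are dealt out in turn across whatever the Master lists as online.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified stand-in for round-robin assignment; not the real balancer. */
class RoundRobinSketch {
    /** Deals regions out to servers in turn and returns the server chosen for each region. */
    static List<String> roundRobin(List<String> servers, int regionCount) {
        List<String> plan = new ArrayList<>();
        for (int i = 0; i < regionCount; i++) {
            plan.add(servers.get(i % servers.size()));
        }
        return plan;
    }
}
```

With two listed servers, one of them the hobbled one, any plan covering two or more regions includes the hobbled server, whereas a single assign that happens to land on the healthy server does not.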
So, this issue is about fixing the original issue properly for hbase2. We don't
have a timeout on assign in AMv2, not yet; that might be the fix. Or perhaps a
double report before we online a server, with the second report coming in after
the ZK node goes up (or we stop doing ephemeral nodes for RSs in ZK and just rely
on heartbeats...).
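One way the double-report idea might look, as a sketch only. Nothing here is an existing HBase API: the class, the pending/online split, and the method names are assumptions made for illustration.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch of a two-step check-in; not how the Master is implemented. */
class DoubleReportSketch {
    /** Servers that have a name but have not yet confirmed their ZK node. */
    private final Set<String> pending = new HashSet<>();
    /** Servers eligible to receive region assignments. */
    private final Set<String> online = new HashSet<>();

    /** First report: hand out the name, but do not treat the server as assignable yet. */
    String firstReport(String host, int port, long startCode) {
        String name = host + "," + port + "," + startCode;
        pending.add(name);
        return name;
    }

    /** Second report, sent once the ephemeral node exists: only now is the server onlined. */
    boolean secondReport(String name) {
        if (pending.remove(name)) {
            online.add(name);
            return true;
        }
        return false; // unknown or already confirmed
    }

    /** True only for servers that completed both reports. */
    boolean isAssignable(String name) {
        return online.contains(name);
    }
}
```

A server that dies after the first report never becomes assignable in this model. Its pending entry would still need to be cleaned up somehow, which the sketch leaves out.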
Making this a critical issue.