Master can fail if ZooKeeper session expires
--------------------------------------------
Key: HBASE-5549
URL: https://issues.apache.org/jira/browse/HBASE-5549
Project: HBase
Issue Type: Bug
Components: master, zookeeper
Affects Versions: 0.96.0
Environment: all
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
There is a retry mechanism in RecoverableZooKeeper, but when the session
expires, the whole ZooKeeperWatcher is recreated, hence the retry mechanism
does not work in this case. This is why a sleep is needed in
TestZooKeeper#testMasterSessionExpired: we need to wait for ZooKeeperWatcher to
be recreated before using the connection.
This can also happen in real life, in the following sequence:
- master and ZooKeeper start
- the ZooKeeper connection is cut
- the master enters the retry loop
- in the meantime, the session expires
- the network comes back and the session is recreated
- the retries continue, but on the wrong object, and hence fail.
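
The sequence above can be sketched with a small self-contained simulation. This is only an illustration of the stale-reference problem, not HBase code: FakeWatcher, retry and runScenario are hypothetical names, and FakeWatcher merely stands in for a ZooKeeperWatcher-like object that is replaced after a session expiry. A retry loop that keeps the reference it captured before the expiry keeps failing, while one that looks up the current object on each attempt succeeds once the object has been recreated.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

public class StaleWatcherRetryDemo {

    /** Hypothetical stand-in for a watcher bound to one session. */
    static final class FakeWatcher {
        private final int generation;
        private volatile boolean expired = false;

        FakeWatcher(int generation) { this.generation = generation; }

        void expire() { expired = true; }

        String read() {
            if (expired) {
                throw new IllegalStateException(
                    "session expired (generation " + generation + ")");
            }
            return "data@" + generation;
        }
    }

    /** Retries read() on whatever object the supplier returns for each attempt. */
    static String retry(Supplier<FakeWatcher> source, int maxAttempts,
                        Runnable betweenAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        IllegalStateException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return source.get().read();
            } catch (IllegalStateException e) {
                last = e;
                betweenAttempts.run(); // simulates time passing between retries
            }
        }
        return "failed: " + last.getMessage();
    }

    /** Runs the scenario with either a captured or a re-fetched reference. */
    public static String runScenario(boolean refetchEachAttempt) {
        AtomicReference<FakeWatcher> current =
            new AtomicReference<>(new FakeWatcher(1));
        FakeWatcher captured = current.get(); // reference held by the retry loop
        captured.expire();                    // session expires during the retries

        // Assumption: recovery swaps in a new object rather than repairing the old one.
        Runnable recovery = () -> current.compareAndSet(captured, new FakeWatcher(2));

        Supplier<FakeWatcher> source =
            refetchEachAttempt ? current::get : () -> captured;
        return retry(source, 3, recovery);
    }

    public static void main(String[] args) {
        System.out.println("captured reference:   " + runScenario(false));
        System.out.println("re-fetched reference: " + runScenario(true));
    }
}
```

In this sketch the only difference between the two runs is where the retry loop gets its object from. Whether the real fix should re-fetch the watcher, or wait until it has been recreated, is left open here.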