[
https://issues.apache.org/jira/browse/HADOOP-8212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Todd Lipcon updated HADOOP-8212:
--------------------------------
Attachment: hadoop-8212.txt
The attached patch changes the behavior so that notifyFatalError is not called
when the session expires. The existing code already handled rejoining.
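
As a rough illustration of the intended behavior (not the code in the patch),
the decision can be pictured as a mapping from session state to action. The
enum names and the actionFor method below are hypothetical stand-ins rather
than ActiveStandbyElector's real API, and treating AUTH_FAILED as the fatal
case is an assumption made only so there is something to contrast with:

```java
// Hypothetical sketch: session expiration leads to rejoining the election
// instead of reporting a fatal error to the application.
final class SessionEventSketch {
    // Stand-in for the ZooKeeper session states relevant here (assumed names).
    enum SessionState { EXPIRED, AUTH_FAILED }

    // What the elector would do in response (assumed names).
    enum Action { REJOIN_ELECTION, FATAL_ERROR }

    static Action actionFor(SessionState state) {
        switch (state) {
            case EXPIRED:
                // Recoverable: create a new client and rejoin the election.
                return Action.REJOIN_ELECTION;
            default:
                // Assumption: anything else in this sketch is unrecoverable.
                return Action.FATAL_ERROR;
        }
    }
}
```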
I also fixed a race bug I turned up where, after rejoining with a new zkClient,
some old notifications from the previous zkClient could end up getting through.
The watchers and callbacks now pass along the zkClient used to set them, and
then in the callback, we check to make sure it is still current.
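
A minimal sketch of that staleness check, for anyone following along. This is
not the patch itself: the Client class stands in for the real ZooKeeper handle,
and StaleClientGuard, reconnect and processEvent are hypothetical names chosen
for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: each callback carries the client that registered it,
// and events from any client other than the current one are dropped.
final class StaleClientGuard {
    // Stand-in for a ZooKeeper client handle (assumption, not the real class).
    static final class Client {
        final String name;
        Client(String name) { this.name = name; }
    }

    private Client current;
    final List<String> handled = new ArrayList<>();

    // Simulates rejoining: the old client is replaced by a fresh one.
    synchronized Client reconnect(String name) {
        current = new Client(name);
        return current;
    }

    // Called from a watcher or callback, which passes along the client it
    // was registered with. Returns false if the event was ignored as stale.
    synchronized boolean processEvent(Client source, String event) {
        if (source != current) {
            return false; // notification from a previous client
        }
        handled.add(event);
        return true;
    }
}
```

The identity comparison is deliberate: a notification is only trusted if it
came from the exact client instance that is current at the time it is
processed, so anything still queued from before a reconnect is discarded.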
I also simplified the test case to no longer be multi-threaded, since it's much
easier to follow as a linear progression, and the threads didn't buy us
anything. I added test coverage around session expiration to cover the new code.
> Improve ActiveStandbyElector's behavior when session expires
> ------------------------------------------------------------
>
> Key: HADOOP-8212
> URL: https://issues.apache.org/jira/browse/HADOOP-8212
> Project: Hadoop Common
> Issue Type: Improvement
> Affects Versions: 0.23.3, 0.24.0
> Reporter: Todd Lipcon
> Assignee: Todd Lipcon
> Attachments: hadoop-8212.txt
>
>
> Currently when the ZK session expires, it results in a fatal error being sent
> to the application callback. This is not the best behavior -- for example, in
> the case of HA, if ZK goes down, we would like the current state to be
> maintained, rather than causing either NN to abort. When the ZK clients are
> able to reconnect, they should sort out the correct leader based on the
> normal locking schemes.