When I wrote this code over a year ago, my understanding of proper error handling was to suspend the leaders, locks, etc. when the connection was SUSPENDED and to rebuild the leaders, locks, etc. if the connection had been LOST. I believe I have been getting connection LOST when the session was really still alive. When my code then created a new LeaderSelector upon RECONNECT, this caused a new zNode to be added (queued) to the leader path. Clearly, this is not the correct error handling.
Today, I am upgrading to Curator 3.x and ZooKeeper 3.5. You imply that I should not be closing the LeaderSelector on LOST. What is the correct handling, assuming I am using the 3.x branch of Curator?

Thank you,
Curtis

From: Jordan Zimmerman [mailto:[email protected]]
Sent: Wednesday, July 13, 2016 4:26 PM
To: [email protected]
Subject: Re: Problem with LeaderSelector 2.7.1

I quickly looked at your code and don’t understand why you close the leader selector on connection LOST. Does your network partition often? Also, are you really creating a new Curator instance for every leader selector? You should create one Curator instance for your entire application.

-JZ

On Jul 13, 2016, at 1:41 PM, Cantrell, Curtis <[email protected]> wrote:

It looks like there may be two fixes that affect my problem: CURATOR-264 and CURATOR-247. Has CURATOR-247 been merged into the 2.x branch, or do I need to update my ZooKeeper to 3.5 in order to get the fix?

Leader election: Duplicate ephemeral nodes with same owner id
https://issues.apache.org/jira/browse/CURATOR-264
We sometimes experience failure in our leader-election functionality when we have network issues. When this situation occurs we see that there are two ephemeral nodes in the zookeeper cluster for the same session but there is no active leader.

Extend Curator's connection state to support SESSION_LOST
https://issues.apache.org/jira/browse/CURATOR-247
Curator has a connection state for LOST that confuses users. It does not mean that the session is lost. Instead it means that the retry policy has given up retrying.
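
For readers following the thread, here is a minimal sketch of one way to read the advice above: create the selector once, give up leadership when the connection is SUSPENDED or LOST, and do not close or recreate the selector on reconnect. This is a plain-Java model, not Curator code. LeaderStateSketch, SelectorModel, and the State enum are hypothetical names used only for illustration, and the choice of which states drop leadership is an assumption. In Curator itself, LeaderSelectorListenerAdapter and LeaderSelector.autoRequeue() appear to cover the same ground, but check the documentation for your version.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical stand-alone model: none of these types are Curator classes.
class LeaderStateSketch {
    // Names mirror the connection states mentioned in this thread.
    enum State { CONNECTED, SUSPENDED, RECONNECTED, LOST }

    // Assumption: leadership is given up in these states.
    static final Set<State> ERROR_STATES = EnumSet.of(State.SUSPENDED, State.LOST);

    // Stands in for one long-lived leader selector.
    static final class SelectorModel {
        private boolean open = true;
        private boolean leading = false;

        void takeLeadership() { leading = true; }

        // Drop leadership on an error state, but keep the selector open
        // so no second participant node is queued after a reconnect.
        void stateChanged(State newState) {
            if (ERROR_STATES.contains(newState)) {
                leading = false;
            }
        }

        // Intended for application shutdown only, not for LOST.
        void close() { open = false; leading = false; }

        boolean isOpen() { return open; }
        boolean isLeading() { return leading; }
    }

    public static void main(String[] args) {
        SelectorModel selector = new SelectorModel(); // created once
        selector.takeLeadership();
        State[] events = { State.SUSPENDED, State.LOST, State.RECONNECTED };
        for (State s : events) {
            selector.stateChanged(s);
            System.out.println(s + ": open=" + selector.isOpen()
                    + ", leading=" + selector.isLeading());
        }
        selector.close();
    }
}
```

The point of the model is that the selector outlives every connection state change. Only leadership is interrupted, and regaining it is left to re-entering the election rather than to building a new selector.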
