Right, that’s what I assumed after I actually read the code for the LeaderLatch class.
We’re not using await(), but we have a number of LeaderLatches, and we currently cache the last response of getLeader() for each LeaderLatch. We add watches for the election paths and update the cache when we get a NodeChildrenChanged notification.

When we get a LOST event followed by a RECONNECTED event we clear the cache and start over, as we have no clue who’s responsible for what. If we get a SUSPENDED event we don’t permit reads from the cache until we get a RECONNECTED event (or rather, we return null, as we cannot be sure who’s leader). Perhaps we should clear the cache when we get a SUSPENDED event as well, to be on the safe side.

But in conclusion, there’s no need to actually close and re-create LeaderLatches in case of a connection loss, which is really what I was wondering about.

Best regards,
Mathias

On Tue, May 27, 2014 at 9:41 PM, Jordan Zimmerman < [email protected]> wrote:

> The documentation probably needs updating as this has been refined over
> time.
>
> - The LeaderLatch installs its own connection state listener
> - If the connection drops (SUSPENDED or LOST), the LeaderLatch changes
> its internal state to “leader == false”
> - If the connection goes to RECONNECTED, the LeaderLatch will attempt
> to regain leadership
>
> This has implications for users of LeaderLatch. If, for example you've
> called await() on the LeaderLatch your code will assume that it is the
> leader. However, if the connection drops you may no longer be the leader.
> So, clients should install their own ConnectionStateListener and notice
> that the connection has dropped. Also, you can examine
> LeaderLatch.hasLeadership() before your client code does anything where it
> assumes it is the leader and then periodically re-check it.
>
> I hope this helps.
>
> -JZ
>
>
> From: Mathias Söderberg [email protected]
> Reply: [email protected] [email protected]
> Date: May 27, 2014 at 2:33:21 PM
> To: [email protected] [email protected]
> Subject: LeaderLatch recipe and error handling
>
> Good evening,
>
> I’m currently working on a project where we’re utilising Curator and more
> specifically (quite heavily) the LeaderLatch recipe.
>
> The documentation for error handling in “general” states the following for
> a LOST notification:
>
> The connection is confirmed to be lost. Close any locks, leaders, etc.
> and attempt to re-create them. NOTE: it is possible to get a RECONNECTED
> state after this but you should still consider any locks, etc. as
> dirty/unstable.
>
> And the documentation for the LeaderLatch recipe states the following:
>
> LeaderLatch instances add a ConnectionStateListener to watch for
> connection problems. If SUSPENDED or LOST is reported, the LeaderLatch that
> is the leader will report that it is no longer the leader (i.e. there will
> not be a leader until the connection is re-established). If a LOST
> connection is RECONNECTED, the LeaderLatch will delete its previous ZNode
> and create a new one.
>
> Users of LeaderLatch must take account that connection issues can cause
> leadership to be lost. i.e. hasLeadership() returns true but some time
> later the connection is SUSPENDED or LOST. At that point hasLeadership()
> will return false. It is highly recommended that LeaderLatch users register
> a ConnectionStateListener.
>
> My conclusion from reading these two sections is that we’re supposed to
> add a ConnectionStateListener and when we’re notified of a LOST event
> followed by a RECONNECTED event, we’re supposed to close the current
> LeaderLatches that we’re holding and re-create them?
>
> However, looking through the actual code for the LeaderLatch, it appears
> that this is actually already handled, i.e. it appears to create a new
> znode when it encounters a RECONNECTED event, or am I reading this wrong?
> (The documentation also states this as a fact).
>
> My question is really: do we have to take any particular precaution
> regarding the LeaderLatch recipe and connection loss scenarios? i.e. do we
> have to close and re-create the LeaderLatches? Or can we be calm and just
> carry on with our business as Curator handles this?
>
> If anything is unclear, let me know.
>
> Best regards,
>
> Mathias Söderberg
> Software Developer, Burt
>
> www.burtcorp.com
> Cell: + 46 762 79 57 55 | Skype: mthssdrbrg
> http://twitter.com/mthssdrbrg | http://twitter.com/burtcorp
> –––––––––––––––––––––––––––––––––––––––––––
>
> The Analytics Platform for Online Media
>
>

--
Mathias Söderberg
Software Developer, Burt

www.burtcorp.com
Cell: + 46 762 79 57 55 | Skype: mthssdrbrg
http://twitter.com/mthssdrbrg | http://twitter.com/burtcorp
–––––––––––––––––––––––––––––––––––––––––––

The Analytics Platform for Online Media
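
For reference, the leader cache described in the reply at the top of this thread could be sketched roughly as below. This is only an illustration, not the actual code in use: LeaderCache, ConnState, update() and leaderFor() are made-up names, and ConnState is a stand-in for Curator's ConnectionState so the sketch has no dependencies. In real code onStateChanged() would presumably be driven from ConnectionStateListener.stateChanged(CuratorFramework, ConnectionState). The sketch takes the "safe side" variant and clears the cache on SUSPENDED as well as on LOST.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical cache of the last known leader per election path.
class LeaderCache {
    // Stand-in for Curator's ConnectionState values (assumption, see lead-in).
    enum ConnState { CONNECTED, SUSPENDED, RECONNECTED, LOST }

    private final Map<String, String> leaders = new ConcurrentHashMap<>();
    private volatile boolean readable = true;

    // Called after a getLeader() lookup or a NodeChildrenChanged notification
    // has produced a fresh leader id for the given election path.
    void update(String electionPath, String leaderId) {
        leaders.put(electionPath, leaderId);
    }

    // Returns the cached leader id, or null when the connection state means
    // we cannot be sure who the leader is (or nothing has been cached yet).
    String leaderFor(String electionPath) {
        return readable ? leaders.get(electionPath) : null;
    }

    void onStateChanged(ConnState newState) {
        switch (newState) {
            case SUSPENDED:
            case LOST:
                // Block reads first, then drop everything we knew.
                readable = false;
                leaders.clear();
                break;
            case CONNECTED:
            case RECONNECTED:
                // Reads are allowed again; entries are repopulated via update().
                readable = true;
                break;
        }
    }
}
```

After a RECONNECTED event the cache is empty, so leaderFor() keeps returning null for a path until update() has been called for it again, which matches the "start over" behaviour described in the reply.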
