Re: LeaderLatch recipe and error handling

Mathias Söderberg Tue, 27 May 2014 13:13:39 -0700

Right, that’s what I assumed after I actually read the code for the
LeaderLatch class.


We’re not using await(), but have a number of LeaderLatches and currently
we’re caching the last response of getLeader() (for each LeaderLatch),
and we add watches for the election paths and update the cache if we get a
NodeChildrenChanged notification.
When we get a LOST event followed by a RECONNECTED event, we clear the cache
and start over, as we have no clue who’s responsible for what. If we get a
SUSPENDED event, we don’t permit reads from the cache until we get a
RECONNECTED event (or rather, we return null as we cannot be sure who’s
leader).

Perhaps we should clear the cache when we get a SUSPENDED event as well, to
be on the safe side.
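
A minimal sketch of the "safe side" variant of that cache, in which any
connection state change clears the cache and reads return null until the
connection is back. To stay runnable without Curator on the classpath it
uses a hypothetical State enum that mirrors a subset of Curator's
ConnectionState values. LeaderCache and its method names are made up for
illustration and are not part of Curator.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical cache of "who is leader" per election path (not a Curator class).
class LeaderCache {
    // Stand-in for a subset of Curator's ConnectionState values (assumption).
    enum State { CONNECTED, SUSPENDED, RECONNECTED, LOST }

    private final Map<String, String> leaders = new ConcurrentHashMap<>();
    private volatile boolean readable = true;

    // Store the result of a fresh getLeader() lookup for one election path.
    void put(String electionPath, String leaderId) {
        leaders.put(electionPath, leaderId);
    }

    // Returns null while disconnected, or when nothing is cached for the path.
    String get(String electionPath) {
        return readable ? leaders.get(electionPath) : null;
    }

    // Intended to be called from a connection state listener.
    void onStateChanged(State state) {
        // Any state change invalidates what we believed about leadership.
        leaders.clear();
        readable = (state == State.CONNECTED || state == State.RECONNECTED);
    }
}
```

With real Curator, onStateChanged would be driven by a
ConnectionStateListener, and the cache would be repopulated from
getLeader() after RECONNECTED. The sketch does not try to close the race
between a late put() and a concurrent state change.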

But in conclusion there’s no need to actually close and re-create
LeaderLatches in case of a connection loss, which is really what I was
wondering about.
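
For code that does act as leader, the advice quoted below (check
hasLeadership() before doing leader-only work and re-check it
periodically) could look roughly like this. LeaderGuard is a hypothetical
helper, not part of Curator. It takes a BooleanSupplier so the sketch
runs without Curator; with a real latch one would pass
latch::hasLeadership.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper: runs leader-only work in small steps and re-checks
// leadership before each step.
class LeaderGuard {
    private final BooleanSupplier hasLeadership;

    LeaderGuard(BooleanSupplier hasLeadership) {
        this.hasLeadership = hasLeadership;
    }

    // Runs up to maxSteps units of work and returns how many completed
    // before leadership was reported as lost.
    int runWhileLeader(int maxSteps, Runnable unitOfWork) {
        int done = 0;
        for (int i = 0; i < maxSteps; i++) {
            if (!hasLeadership.getAsBoolean()) {
                break; // no longer leader (or connection dropped): stop
            }
            unitOfWork.run();
            done++;
        }
        return done;
    }
}
```

This is only best effort: leadership can still be lost between the check
and the unit of work, so each unit should be small or safe to repeat.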

Best regards,
Mathias


On Tue, May 27, 2014 at 9:41 PM, Jordan Zimmerman <
[email protected]> wrote:

> The documentation probably needs updating as this has been refined over
> time.
>
>    - The LeaderLatch installs its own connection state listener
>    - If the connection drops (SUSPENDED or LOST), the LeaderLatch changes
>    its internal state to “leader == false”
>    - If the connection goes to RECONNECTED, the LeaderLatch will attempt
>    to regain leadership
>
> This has implications for users of LeaderLatch. If, for example, you've
> called await() on the LeaderLatch your code will assume that it is the
> leader. However, if the connection drops you may no longer be the leader.
> So, clients should install their own ConnectionStateListener and notice
> that the connection has dropped. Also, you can examine
> LeaderLatch.hasLeadership() before your client code does anything where it
> assumes it is the leader and then periodically re-check it.
>
> I hope this helps.
>
> -JZ
>
>
> From: Mathias Söderberg [email protected]
> Reply: [email protected] [email protected]
> Date: May 27, 2014 at 2:33:21 PM
> To: [email protected] [email protected]
> Subject:  LeaderLatch recipe and error handling
>
>  Good evening,
>
> I’m currently working on a project where we’re utilising Curator and more
> specifically (quite heavily) the LeaderLatch recipe.
>
> The documentation for error handling in “general” states the following for
> a LOST notification:
>
>  The connection is confirmed to be lost. Close any locks, leaders, etc.
> and attempt to re-create them. NOTE: it is possible to get a RECONNECTED
> state after this but you should still consider any locks, etc. as
> dirty/unstable.
>
>  And the documentation for the LeaderLatch recipe states the following:
>
>  LeaderLatch instances add a ConnectionStateListener to watch for
> connection problems. If SUSPENDED or LOST is reported, the LeaderLatch that
> is the leader will report that it is no longer the leader (i.e. there will
> not be a leader until the connection is re-established). If a LOST
> connection is RECONNECTED, the LeaderLatch will delete its previous ZNode
> and create a new one.
>
> Users of LeaderLatch must take account that connection issues can cause
> leadership to be lost. i.e. hasLeadership() returns true but some time
> later the connection is SUSPENDED or LOST. At that point hasLeadership()
> will return false. It is highly recommended that LeaderLatch users register
> a ConnectionStateListener.
>
>  My conclusion from reading these two sections is that we’re supposed to
> add a ConnectionStateListener and when we’re notified of a LOST event
> followed by a RECONNECTED event, we’re supposed to close the current
> LeaderLatches that we’re holding and re-create them?
>
> However, looking through the actual code for the LeaderLatch, it appears
> that this is actually already handled, i.e. it appears to create a new
> znode when it encounters a RECONNECTED event, or am I reading this wrong?
> (The documentation also states this as a fact).
>
> My question is really: do we have to take any particular precaution
> regarding the LeaderLatch recipe and connection loss scenarios? i.e. do we
> have to close and re-create the LeaderLatches? Or can we be calm and just
> carry on with our business as Curator handles this?
>
> If anything is unclear, let me know.
>
> Best regards,
>
> Mathias Söderberg
> Software Developer, Burt
>
> www.burtcorp.com
> Cell: + 46 762 79 57 55 | Skype: mthssdrbrg
> http://twitter.com/mthssdrbrg | http://twitter.com/burtcorp
> –––––––––––––––––––––––––––––––––––––––––––
>
> The Analytics Platform for Online Media
>
>

