I can see that what you describe is what is happening, but this is not what I 
was expecting.
I had assumed that the session timeout is the key timeout value which prevents 
other clients - which may indeed be connected to another host of the zookeeper 
ensemble - from grabbing leadership.
As apparently happens today, the old leader is ready to give up leadership as 
soon as it detects a suspend - but why would the zookeeper ensemble do the same 
before the session timeout? I think the zookeeper ensemble does not do this 
until the session times out, but the curator leader latch is apparently queuing 
up the release of its leader latch node, which it pushes to zookeeper once it 
reconnects.
This leads to the confusing behaviour that the old leader remains the leader 
during the 5-10 seconds it takes to reconnect to the zookeeper cluster (running 
across multiple AZs in AWS), but loses leadership soon after the reconnect.
(And of course I did not realise this until recently, while wondering what was 
going wrong :)

Do you think this is the behaviour people would expect? What do other users 
expect from the leader latch? Could the documentation clarify this a bit 
better either way?

I will be experimenting a bit more with a more power-hungry “dictatorLatch” 
which hangs on to its leadership until the last moment - the session timeout.
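
Roughly, the connection-state handling I have in mind for that would look like 
this sketch - DictatorLatchListener, isSafeToAct and stepDownAsLeader are my own 
illustrative names, not Curator API; only the ConnectionStateListener interface 
is Curator's:

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.state.ConnectionState;
import org.apache.curator.framework.state.ConnectionStateListener;

// Sketch of the "dictatorLatch" idea: keep leadership through SUSPENDED and
// only step down once the session is really LOST (the session timeout expired).
public class DictatorLatchListener implements ConnectionStateListener
{
    private volatile boolean suspended = false;

    @Override
    public void stateChanged(CuratorFramework client, ConnectionState newState)
    {
        switch ( newState )
        {
            case SUSPENDED:
                // Connection is in doubt: hold off persistent operations,
                // but do NOT release the latch node yet.
                suspended = true;
                break;

            case RECONNECTED:
                // The session survived the break: resume as leader.
                suspended = false;
                break;

            case LOST:
                // The session timed out: leadership really is gone now.
                suspended = false;
                stepDownAsLeader();
                break;

            default:
                break;
        }
    }

    // Leader-only work should check this before doing anything persistent.
    public boolean isSafeToAct()
    {
        return !suspended;
    }

    // Placeholder for application-specific clean-up and re-election.
    private void stepDownAsLeader()
    {
    }
}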


On 19 Mar 2014, at 15:00, Matt Brown <[email protected]> wrote:

> My assumption and desired behaviour is that the user should suspend 
> operations - which implies to me that its leadership status is uncertain. (I 
> am holding off all persistent operations, for example.)
> But -I think- this also implies that no one else can become leader yet - we 
> either have the old leader still be leader, and no one else, or the old 
> leader disappeared and we are in effect leaderless for some time.

I think the second part of this is incorrect – if client 1 has lost its 
zookeeper connection, that doesn't imply that other clients have also lost 
their zookeeper connections.

So it would be correct for the former leader, which now has a suspended 
connection, to cease its leader activities – but another client that is still 
connected to the ensemble may have become the leader due to the suspension of 
client 1's connection.

If client 1 still acted as if it might be the leader while its connection is 
suspended, then you would have two leaders – client 1 and whichever client with 
a healthy ZK connection grabbed the latch.

From the perspective of the zookeeper ensemble, it can't know if your client 
is suffering from a "short connection break" or if it has died altogether – so 
the client's leader role should be treated as lost in either case.
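
To make that concrete, here is a minimal sketch of the pattern I mean - 
pauseLeaderWork/resumeLeaderWork stand in for whatever leader-only activity 
your application performs (they are not Curator API). You would register it via 
client.getConnectionStateListenable().addListener(...):

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.recipes.leader.LeaderLatch;
import org.apache.curator.framework.state.ConnectionState;
import org.apache.curator.framework.state.ConnectionStateListener;

public class SuspendAwareLeader implements ConnectionStateListener
{
    private final LeaderLatch latch;

    public SuspendAwareLeader(LeaderLatch latch)
    {
        this.latch = latch;
    }

    @Override
    public void stateChanged(CuratorFramework client, ConnectionState newState)
    {
        if ( (newState == ConnectionState.SUSPENDED) || (newState == ConnectionState.LOST) )
        {
            // We can no longer be sure we hold the latch - another client
            // with a healthy connection may have become leader by now.
            pauseLeaderWork();
        }
        else if ( newState == ConnectionState.RECONNECTED )
        {
            // Only resume if this instance still holds (or has re-acquired) the latch.
            if ( latch.hasLeadership() )
            {
                resumeLeaderWork();
            }
        }
    }

    private void pauseLeaderWork()  { /* stop leader-only tasks */ }
    private void resumeLeaderWork() { /* restart leader-only tasks */ }
}

The important point is that SUSPENDED is treated as "possibly not the leader 
any more", not as "still the leader until the session expires".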

From: Robert Kamphuis <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Wednesday, March 19, 2014 at 6:18 AM
To: "[email protected]" <[email protected]>
Cc: Robert Kamphuis <[email protected]>
Subject: Confused about the LeaderLatch - what should happen on 
ConnectionState.SUSPENDED and ConnectionState.LOST ?


Hi,

I have been working on changing our application to work with Zookeeper and 
Curator for a while now, and am occasionally getting wrong behaviour out of 
my system.
The symptom I'm getting is that two servers conclude that they are the leader 
of a particular task/leader latch at the same time, breaking everything in my 
application.
This does not happen too often - but often enough, and it is bad enough for my 
application. I can get it to occur pretty consistently by restarting the 
servers in our 5-server zookeeper ensemble one at a time, while having 
multiple servers queuing up for the same leader latch.

My key question is the following:
- WHAT should a user of a leaderLatch do when the connectionState goes to 
suspended?

My assumption and desired behaviour is that the user should suspend operations 
- which implies to me that its leadership status is uncertain. (I am holding 
off all persistent operations, for example.)
But -I think- this also implies that no one else can become leader yet - we 
either have the old leader still be leader, and no one else, or the old leader 
disappeared and we are in effect leaderless for some time.
This will then be followed by either
a) a reconnect - in which case the old leader can continue its work (and 
optionally double-check its leadership status), or
b) a lost - in which case the old leader has lost its leadership and should 
release all its power, and then try again or do something else. Someone else 
has likely become leader in my application by then.
Whether a) or b) happens is controlled by the session timeout negotiated 
between the curator/zookeeper client and the zookeeper ensemble.

Is my thinking correct here?
And if so, why does the curator's LeaderLatch.handleStateChange(ConnectionState 
newState) handle both states in the same way: setLeadership(false)?
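
For reference, as I read the 2.4.0 source, the handling boils down to roughly 
this (paraphrased from memory, not the exact code):

// Paraphrased from my reading of LeaderLatch in Curator 2.4.0 - not the exact source.
private void handleStateChange(ConnectionState newState)
{
    switch ( newState )
    {
        case SUSPENDED:
        case LOST:
            // A short connection break and a real session loss are treated
            // identically: this instance stops being leader right away.
            setLeadership(false);
            break;

        default:
            break;
    }
}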

In my application, a leadership change is a pretty big event, due to the amount 
of work the code does, and I really want leadership to survive short connection 
breaks - e.g. one of the zookeeper servers crashing. Leadership should only be 
swapped on a session timeout - e.g. a broken application node, or a long 
network break between the server and the zookeeper servers. I am thinking of 
using 90 seconds as the session timeout (to survive e.g. longer GC pauses and 
the like without a leadership change) - maybe even longer.
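
For what it is worth, the client would be created along these lines - the 
connect string is just a placeholder, the long session timeout is the point 
(and the ensemble's maxSessionTimeout, 20 * tickTime by default, has to allow 
it):

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.retry.ExponentialBackoffRetry;

public class ClientSetup
{
    public static CuratorFramework create()
    {
        // Placeholder connect string for our 5-server ensemble.
        // Note: the ensemble caps the negotiated session timeout at
        // maxSessionTimeout (20 * tickTime by default), so the server
        // configuration must allow 90 seconds as well.
        CuratorFramework client = CuratorFrameworkFactory.newClient(
                "zk1:2181,zk2:2181,zk3:2181,zk4:2181,zk5:2181",
                90000,                                   // session timeout: 90 seconds
                15000,                                   // connection timeout
                new ExponentialBackoffRetry(1000, 3));
        client.start();
        return client;
    }
}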

Is this a bug in the leader latch, or should I use something other than the 
leader latch, or implement my desired behaviour as a new recipe?

kind regards,
Robert Kamphuis

PS. Using ZooKeeper 3.4.5 and Curator 2.4.0.

