On 20 Mar 2014, at 02:49 am, Jordan Zimmerman <[email protected]> wrote:

Curator’s approach is safety first. While it might be possible to time a 
network outage against the session timeout, I believe that this is not good 
distributed design. What if another client has negotiated a different session 
timeout than yours? What if there is clock drift? So, Curator uses the most 
conservative method when there’s a network event.

I am setting things up with identical configurations using the same AWS images 
for my client apps and for the ZooKeeper ensemble. Clock drift is kept in check 
with ntp.
By using a long session timeout - 90 seconds or more - I hope to survive a 
crashing ZooKeeper server without some 20% of my client servers losing 
leadership and, in effect, shutting down and restarting their task election. I am 
going to have a couple of hundred servers - losing 20% is too big a hit for our 
application logic.
I guess I am using the leader latch in a slightly different manner - for 
selecting a worker that gets “elected for life”, instead of the parliamentary 
style in some countries where there is a re-election a couple of times a year.

Do you agree that there are use cases like mine where this election-for-life is 
the desired behaviour?
I will be experimenting by building a “dictatorLatch” and see how that works 
out.



That said, it might be possible to have a pluggable ConnectionStateListener 
strategy for Curator Recipes. Instead of each recipe assuming the worst when 
there is a network event, there could be something like a 
ConnectionStateListener wrapper that suppresses SUSPENDED until the session 
timeout elapses. I haven’t totally thought this through though.
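A minimal sketch of what such a suppressing wrapper might look like - plain Java rather than Curator's actual ConnectionStateListener interface, with the class name, state enum, and tick-based clock all purely illustrative assumptions. It swallows SUSPENDED, clears it on a reconnect, and only emits LOST once the session timeout has elapsed with no reconnect:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Curator API: suppress SUSPENDED until the
// session timeout elapses without a reconnect, then deliver LOST.
public class SuspendGraceFilter {
    public enum State { CONNECTED, SUSPENDED, RECONNECTED, LOST }

    private final long sessionTimeoutMs;
    private final List<State> delivered = new ArrayList<>();
    private long suspendedAtMs = -1;   // -1 means: no suspend pending

    public SuspendGraceFilter(long sessionTimeoutMs) {
        this.sessionTimeoutMs = sessionTimeoutMs;
    }

    // Feed a raw connection event with its timestamp.
    public void onEvent(State s, long nowMs) {
        switch (s) {
            case SUSPENDED:
                suspendedAtMs = nowMs;   // start grace period, deliver nothing yet
                break;
            case RECONNECTED:
                suspendedAtMs = -1;      // reconnected in time: swallow the blip
                break;
            default:
                delivered.add(s);
        }
    }

    // Call periodically: promotes a stale SUSPENDED to LOST once the
    // session timeout has elapsed without a reconnect.
    public void tick(long nowMs) {
        if (suspendedAtMs >= 0 && nowMs - suspendedAtMs >= sessionTimeoutMs) {
            delivered.add(State.LOST);
            suspendedAtMs = -1;
        }
    }

    public List<State> delivered() { return delivered; }

    public static void main(String[] args) {
        SuspendGraceFilter f = new SuspendGraceFilter(90_000);
        f.onEvent(State.SUSPENDED, 0);
        f.onEvent(State.RECONNECTED, 5_000);  // short blip: nothing delivered
        f.tick(10_000);
        System.out.println(f.delivered());    // []
        f.onEvent(State.SUSPENDED, 20_000);
        f.tick(120_000);                      // timeout elapsed: LOST delivered
        System.out.println(f.delivered());    // [LOST]
    }
}
```

A real version would of course drive the timeout from the negotiated session timeout and a scheduler rather than explicit ticks; the explicit clock here is only to keep the state machine easy to follow.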


From my experiments so far (copy-pasting the leader latch and modifying the 
behaviour on suspend) it looks as though, on reconnect, the leader latch 
replaces its node in ZooKeeper, and when other servers were in the election 
during the leader's suspend period, the replacement node ends up further down 
the election list - thus losing the leadership. I will continue staring at 
this - I need some more tracing to isolate what happens.
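The mechanics behind that observation can be illustrated with a tiny demo. ZooKeeper sequential znodes end in a monotonically increasing 10-digit counter and the lowest one wins the election; the node names below are hypothetical, assumed only for the illustration:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class LatchOrderingDemo {
    public static void main(String[] args) {
        // Before the suspend, the leader held the lowest node, say
        // latch-0000000005.  During the suspend two other clients joined,
        // and on reconnect the old leader's node is recreated with a
        // fresh (higher) sequence number.
        List<String> children = new ArrayList<>(List.of(
            "latch-0000000006",    // joined during suspend
            "latch-0000000007",    // joined during suspend
            "latch-0000000008"));  // old leader's replacement node
        Collections.sort(children);
        // The lowest node wins the election - and it is no longer
        // the old leader's node.
        System.out.println(children.get(0));  // latch-0000000006
    }
}
```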

thanks for your time!

Robert



-JZ


From: Robert Kamphuis <[email protected]>
Reply: [email protected]
Date: March 19, 2014 at 6:23:01 AM
To: [email protected]
Cc: Robert Kamphuis <[email protected]>
Subject: Confused about the LeaderLatch - what should happen on 
ConnectionState.SUSPENDED and ConnectionState.LOST?


Hi,

I have been working on changing our application to work with ZooKeeper and 
Curator for a while now, and am occasionally getting wrong behaviour out of 
my system.
The symptom I’m getting is that two servers conclude that they are the 
leader of a particular task/leader latch at the same time, breaking everything in 
my application.
This does not happen too often - but often enough, and it is bad enough for my 
application. I can get it occurring pretty consistently by restarting the 
servers in our 5-server ZooKeeper ensemble in turns,
while having multiple servers queuing up for the same leader latch.

My key question is the following:
- WHAT should a user of a leaderLatch do when the connectionState goes to 
suspended?

My assumption, and my desired behaviour, is that the user should suspend 
operations - SUSPENDED implies to me that its leadership status is uncertain. 
(I am holding off all persistent operations, for example.)
But - I think - this also implies that no one else can become leader yet: 
either the old leader is still the leader, and no one else, or the old leader 
has disappeared and we are in effect leaderless for some time.
This will then be followed by
a) a reconnect - in which case the old leader can continue its stuff (and 
optionally double check its leadership status) or
b) a lost - in which case the old leader lost its leadership and should release 
all its power etc and try again or do something else. Someone else likely 
became leader in my application by then.
Which of a) or b) happens is controlled by the session timeout negotiated 
between the Curator/ZooKeeper client and the ZooKeeper ensemble.
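The "double check its leadership status" step in option a) could, for instance, compare the latch's own node against the sorted children of the latch path. A sketch with hypothetical method and node names (this is not Curator's API - the real latch would fetch the children from ZooKeeper):

```java
import java.util.Comparator;
import java.util.List;

public class LeadershipCheck {
    // ZooKeeper sequential nodes end in a 10-digit counter,
    // e.g. "latch-0000000042".
    static long sequenceOf(String node) {
        return Long.parseLong(node.substring(node.length() - 10));
    }

    // True if ourNode has the lowest sequence number among the children,
    // i.e. we would still be the leader after the reconnect.
    static boolean stillLeader(List<String> children, String ourNode) {
        return children.stream()
                .min(Comparator.comparingLong(LeadershipCheck::sequenceOf))
                .map(ourNode::equals)
                .orElse(false);
    }

    public static void main(String[] args) {
        List<String> children = List.of("latch-0000000003", "latch-0000000007");
        System.out.println(stillLeader(children, "latch-0000000003")); // true
        System.out.println(stillLeader(children, "latch-0000000007")); // false
    }
}
```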

Is my thinking correct here?
And if so, why does Curator’s LeaderLatch.handleStateChange(ConnectionState 
newState) handle both in the same way: setLeadership(false)?

In my application, a leadership change is a pretty big event, due to the amount 
of work the code does, and I really want leadership to survive short 
connection breaks - e.g. one of the ZooKeeper servers crashing. Leadership should 
only be swapped on a session timeout - e.g. a broken application node, or a long 
network break between the server and the ZooKeeper servers. I am thinking of 
using 90 seconds as the session timeout (so as to survive e.g. longer GC pauses 
and similar without a leadership change) - maybe even longer.

Is this a bug in the leader latch, or should I use something other than the 
leader latch, or implement my desired behaviour in a new recipe?

kind regards,
Robert Kamphuis

PS. using ZooKeeper 3.4.5 and Curator 2.4.0
