It's strange that sync doesn't run through agreement; I had always assumed that it does, for exactly the reason you give: you may trust your leader, but I may have a different leader, and your leader may not have detected that yet and may still think it's the leader.
This seems like a bug to me. Similarly to Paxos, ZooKeeper's safety guarantees don't (or shouldn't) depend on timing assumptions. Only progress guarantees depend on time.

Alex

On Wed, Sep 26, 2012 at 4:41 PM, John Carrino <[email protected]> wrote:

> I have some pretty strong requirements in terms of consistency, where reading from followers that may be behind in terms of updates isn't OK for my use case.
>
> One error case that worries me is if a follower and leader are partitioned off from the network. A new leader is elected, but the follower and old leader don't know about it.
>
> Normally I think sync was made for this purpose, but I looked at the sync code, and if there aren't any outstanding proposals the leader sends the sync right back to the client without first verifying that it still has quorum, so this won't work for my use case.
>
> At the core of the issue, all I really need is a call that will make its way to the leader, ping its followers, ensure it still has a quorum, and return success.
>
> Basically a getCurrentLeaderEpoch() method that will be forwarded to the leader; the leader will ensure it still has quorum and return its epoch. I can use this primitive to implement all the other properties I want to verify (assuming that my client will never connect to an older epoch after this call returns). Also, the nice thing about this method is that it will not have to hit disk, and the latency should just be a round trip to the followers.
>
> Most of the guarantees offered by ZooKeeper are time-based and rely on clocks and expiring timers, but I'm hoping to offer some guarantees in spite of busted clocks, horrible GC perf, VM suspends, and any other way time is broken.
>
> Also, if people are interested, I can go into more detail about what I am trying to write.
>
> -jc
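
A minimal sketch of the quorum-checked epoch read proposed in the quoted message, to make the idea concrete. Nothing here is ZooKeeper code or a ZooKeeper API: the class `QuorumEpochCheck`, the helper `followerAt`, and the modelling of each follower as a predicate that acknowledges a ping are all assumptions made for illustration. The only point it shows is that the leader answers only after a majority of the ensemble has acknowledged its epoch, so a partitioned old leader cannot answer.

```java
import java.util.List;
import java.util.OptionalLong;
import java.util.function.LongPredicate;

/** Hypothetical sketch; none of these names exist in ZooKeeper. */
public class QuorumEpochCheck {

    /**
     * Returns the leader's epoch only if a majority of the ensemble
     * (the leader plus the acknowledging followers) accepts that epoch.
     *
     * @param leaderEpoch the epoch the leader believes it owns
     * @param followers   one predicate per follower; it returns true when the
     *                    follower acknowledges a ping for the given epoch
     */
    public static OptionalLong getCurrentLeaderEpoch(long leaderEpoch,
                                                     List<LongPredicate> followers) {
        int ensembleSize = followers.size() + 1; // followers plus the leader
        int acks = 1;                            // the leader counts itself
        for (LongPredicate follower : followers) {
            if (follower.test(leaderEpoch)) {
                acks++;
            }
        }
        boolean hasQuorum = acks > ensembleSize / 2;
        return hasQuorum ? OptionalLong.of(leaderEpoch) : OptionalLong.empty();
    }

    /** Models a follower that has seen epoch {@code seen} and rejects older epochs. */
    public static LongPredicate followerAt(long seen) {
        return epoch -> epoch >= seen;
    }

    public static void main(String[] args) {
        // Five-node ensemble: one follower is still on epoch 3, three moved to epoch 4.
        List<LongPredicate> followers =
                List.of(followerAt(3), followerAt(4), followerAt(4), followerAt(4));

        // The old leader (epoch 3) cannot gather a majority, so it gets no answer.
        System.out.println(getCurrentLeaderEpoch(3, followers).isPresent());
        // The new leader (epoch 4) is acknowledged by every follower.
        System.out.println(getCurrentLeaderEpoch(4, followers).isPresent());
    }
}
```

In a real ensemble an unreachable follower would simply never acknowledge, which this sketch treats the same as a rejection. Note that the check needs no timers and no disk write, only one round trip to the followers, which matches the latency argument in the proposal.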
