Looking at the code, it looks like we don't need a synced quorum to accept a 
new client session, just a quorum in the process of syncing, so I don't think 
the session handling will solve this. I suppose it's a warning that correctness 
for N=3 doesn't extend to all possible cluster sizes N.
Definitely worth opening a JIRA.

C

From: Flavio Junqueira [mailto:[email protected]]
Sent: Monday, March 28, 2011 11:49 AM
To: [email protected]
Subject: Re: send UPTODATE to follower until a quorum of servers synced with 
leader

Hi Jiangwen, Good catch. I followed the code and it does sound like this 
scenario can happen, ignoring how sessions are handled. I checked that a 
follower takes a snapshot and starts a zookeeper server right after receiving 
an UPTODATE message. I'm not clear, though, if it is possible for a client to 
revalidate a session while the leader hasn't started. I was discussing with Ben 
offline and it sounds like we do not necessarily wait for a leader to come up 
to revalidate sessions. I'm not so familiar with the session handling part of 
the code, so I'll let perhaps Ben or someone else add to this discussion.

In any case, you might want to open a JIRA to track this discussion so that we 
don't miss important comments. I also wanted to point out that we have been 
observing a few corner cases like the one you raised, and we have been 
designing changes to the implementation that take care of such problems. If I'm 
not mistaken, the scenario you point out wouldn't happen under our changes 
because followers would wait for a commit message (wait for a quorum to ack) 
before starting a server, as you point out. The latest notes on the design are 
under Zab1.0 in the ZooKeeper wiki.

Thanks,
-Flavio


On Mar 28, 2011, at 10:24 AM, jiangwen w wrote:


1. current process
When the leader fails, a new leader is elected and the followers sync with the
new leader.
After a follower has synced, the leader sends UPTODATE to it.

2. a corner case
But there is a corner case where things go wrong.
Suppose message M exists only on the leader. After a follower syncs with the
leader, a client connected to that follower will see M.
But M exists on only two servers, not on a quorum of servers. If the new
leader and the follower both fail, message M is lost, even though a client has
already seen it.
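
To make the arithmetic behind this corner case concrete, here is a minimal sketch assuming simple majority quorums. QuorumMath and its methods are hypothetical names used for illustration only; they are not part of the ZooKeeper code base. It shows that two servers holding M (the leader plus one follower) form a quorum in a 3-server ensemble but not in larger ones.

```java
// Hypothetical helper for illustration; not ZooKeeper code.
// Assumes simple majority quorums.
final class QuorumMath {
    // Smallest number of servers that forms a strict majority of n servers.
    static int majority(int n) {
        return n / 2 + 1;
    }

    // True if 'holders' servers storing a message form a majority of n servers.
    static boolean isQuorum(int holders, int n) {
        return holders >= majority(n);
    }

    public static void main(String[] args) {
        // The leader plus one synced follower means two servers hold M.
        for (int n : new int[] {3, 5, 7}) {
            System.out.println("n=" + n
                + " majority=" + majority(n)
                + " leader+1 follower is a quorum: " + isQuorum(2, n));
        }
    }
}
```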

3. one solution
So I think UPTODATE should be sent to a follower only after a quorum of servers
has synced with the leader.
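
The proposal above could be sketched roughly as follows. This is only an illustration under stated assumptions: UpToDateGate and onFollowerSynced are hypothetical names, majority quorums are assumed, and the leader is counted as already synced with itself. It is not how the ZooKeeper leader code is actually structured.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch for illustration; not ZooKeeper's leader code.
final class UpToDateGate {
    private final int ensembleSize;
    // Ids of servers that have finished syncing (the leader counts itself).
    private final Set<Long> synced = new HashSet<>();
    // Followers that have synced but have not yet been released.
    private final List<Long> waiting = new ArrayList<>();

    UpToDateGate(int ensembleSize, long leaderId) {
        this.ensembleSize = ensembleSize;
        synced.add(leaderId);
    }

    // Called when a follower reports that it has synced with the leader.
    // Returns the followers that should be sent UPTODATE now: an empty
    // list until a majority has synced, then every follower held back so far.
    synchronized List<Long> onFollowerSynced(long followerId) {
        if (synced.add(followerId)) {
            waiting.add(followerId);
        }
        if (synced.size() < ensembleSize / 2 + 1) {
            return new ArrayList<>(); // no quorum yet: hold UPTODATE back
        }
        List<Long> release = new ArrayList<>(waiting);
        waiting.clear();
        return release;
    }
}
```

With a 5-server ensemble, the first follower to sync gets nothing back; once the second follower syncs (three servers including the leader), both are released together, and later followers are released as soon as they sync.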

Sincerely

