Hello Wayne,

No, this cannot happen. ZooKeeper guarantees that a client session will never "go back in time" and read state older than state that the session has already read. This is accomplished by exchanging a transaction ID, called the "zxid", between the server and the client. Every response from the server includes the server's last known committed zxid, and the client remembers it. If a client ever needs to reestablish its session, ZooKeeper guarantees that the session is reestablished only with a server whose zxid is equal to or greater than the client's last observed zxid.
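To make the reconnect rule concrete, here is a minimal Python sketch. It is a toy model under stated assumptions, not ZooKeeper's actual code: the `Server` and `Client` classes and their methods are hypothetical names invented for illustration, and a lagging server here simply refuses the session rather than catching up first.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Server:
    """Hypothetical stand-in for a ZooKeeper server; tracks only its last zxid."""
    name: str
    last_zxid: int

    def respond(self) -> int:
        # Every response (heartbeats included) carries the server's last zxid.
        return self.last_zxid

    def accept_session(self, client_last_zxid: int) -> bool:
        # A server that is behind the client's view must not serve that client.
        return self.last_zxid >= client_last_zxid


class Client:
    """Hypothetical client that remembers the highest zxid it has observed."""

    def __init__(self) -> None:
        self.last_seen_zxid = 0
        self.server: Optional[Server] = None

    def observe(self, zxid: int) -> None:
        # The remembered zxid never moves backwards.
        self.last_seen_zxid = max(self.last_seen_zxid, zxid)

    def connect(self, servers: List[Server]) -> Optional[Server]:
        # Try servers in order; keep the first one that is recent enough.
        for candidate in servers:
            if candidate.accept_session(self.last_seen_zxid):
                self.server = candidate
                self.observe(candidate.respond())
                return candidate
        return None


if __name__ == "__main__":
    s1, s2, s3 = Server("s1", 7), Server("s2", 7), Server("s3", 5)
    c = Client()
    c.connect([s1])               # the client now remembers zxid 7
    chosen = c.connect([s3, s2])  # s3 is behind the client, so it is skipped
    print(chosen.name if chosen else "no server is recent enough")
```

In this model the client that has already observed zxid 7 cannot land on a server that is still at zxid 5, which is the "never go back in time" property described above.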
For more details, please see pages 9-10 of "ZooKeeper: Wait-free coordination for Internet-scale systems".

http://static.cs.brown.edu/courses/cs227/archives/2012/papers/replication/hunt.pdf

Quoting that paper:

ZooKeeper servers process requests from clients in FIFO order. Responses include the zxid that the response is relative to. Even heartbeat messages during intervals of no activity include the last zxid seen by the server that the client is connected to. If the client connects to a new server, that new server ensures that its view of the ZooKeeper data is at least as recent as the view of the client by checking the last zxid of the client against its last zxid. If the client has a more recent view than the server, the server does not reestablish the session with the client until the server has caught up. The client is guaranteed to be able to find another server that has a recent view of the system since the client only sees changes that have been replicated to a majority of the ZooKeeper servers. This behavior is important to guarantee durability.

--Chris Nauroth

On 2/27/16, 4:34 PM, "wayne" <[email protected]> wrote:

>Thanks German for the awesome answer!
>
>I have an interesting followup question. My question is how does Zookeeper guarantee that within a client session, a subsequent read always observes a previous write?
>
>More specifically, consider the following set up (to simplify the discussion assuming every operation is synchronous):
>
>I have Zookeeper server s1, s2 and s3. s1 is the ZAB master. I have a client c, which connects to s3 (its local replica) and establish a session with s3.
>
>(1) c issues a write, which goes to s3, s3 identifies that it is not the master, s3 forwards the request to s1. Then a network partition happens between s3 and s1, so s1 is only able to replicate the write to itself and s2. (The write succeeds because the majority agree to commit the write). Then the network partition heals itself, so s1 returns success to s3, which in turn returns success to the client.
>
>(2) c issues a read, the reads goes to s3, s3 serves the read locally, which does not reflect the write in (1).
>
>Could this ever happen? Thanks!
>
>--
>View this message in context: http://zookeeper-user.578899.n2.nabble.com/zookeeper-client-session-write-read-consistency-tp7579330p7582086.html
>Sent from the zookeeper-user mailing list archive at Nabble.com.
