Hi,

In your case only A and E have committed the latest transaction; let's call it txid=1000. Servers B, C and D are down at this time and don't have the changes of txid=1000. Also, when B, C and D are restarted, servers A and E are not available. Now the newly elected Leader sees at most txid=999, and when A and E rejoin the quorum they will 'truncate' themselves by deleting txid=1000. As you said, the write operation performed will be lost in this case.
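To put numbers on the failure counts in this scenario, here is a small sketch in Python. It is not ZooKeeper code: the function names and the plain majority rule are assumptions made only for illustration. With 5 servers a majority is 3, so at most 2 servers can be down at once, while here B, C and D (3 servers) are down together.

```python
# Illustrative sketch only; not ZooKeeper code.
# Assumption: the ensemble uses a plain majority quorum.

def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """Largest number of servers that can be down while a majority remains."""
    return ensemble_size - quorum_size(ensemble_size)


ensemble = ["A", "B", "C", "D", "E"]   # E is the leader in the example
down = {"B", "C", "D"}                 # servers that went down together

live = [s for s in ensemble if s not in down]
has_quorum = len(live) >= quorum_size(len(ensemble))

print("quorum size:", quorum_size(len(ensemble)))
print("tolerated failures:", tolerated_failures(len(ensemble)))
print("failed servers:", len(down))
print("live servers still form a quorum:", has_quorum)
```

The point of the sketch is only the arithmetic: three simultaneous failures is one more than a 5-server ensemble is sized for.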
I can see this is a rather tricky case of double or multiple failures, but I agree it can happen. My point is that if a user wants to maintain a reliable cluster, he should keep in mind that more failures than the tolerated number may lead to unexpected results like this.

Best Regards,
Rakesh

-----Original Message-----
From: bit1...@163.com [mailto:bit1...@163.com]
Sent: 05 January 2015 15:56
To: user@zookeeper.apache.org
Subject: Re: Question about the two-phrase commit

Could someone help on this question? Thanks.

bit1...@163.com

From: bit1...@163.com
Date: 2015-01-05 15:05
To: user@zookeeper.apache.org
Subject: Question about the two-phrase commit

Hi, Zookeepers,

I have a question about the two-phase commit in ZooKeeper. When a write operation happens:
1. The Leader proposes the change to all the followers (proposal/vote phase).
2. Followers ack the proposal and write the change to disk (but not persisted yet?).
3. When the Leader receives acks from a majority of the followers, the Leader asks the followers to commit the change.
4. When each follower receives the commit request, it commits the change (persists the change forever?).

In the above process, something rare could happen:
a. Say there are 5 nodes in the quorum (1 leader E, 4 followers A, B, C, D).
b. The write operation is issued by a client that connects to follower A.
c. A commits the change and responds to the client that the write succeeded.
d. Assume that by the time the response from A reaches the client, telling it that the write was successful, the other followers (B, C, D) haven't even received the commit request, and B, C, D go down without getting a chance to commit the change.

Then shut down A and E. Restart B, C, D, making sure that they elect a leader, and start A later (A's latest transactions will be lost, because A will sync with the Leader). When this is done, is the write operation done before lost? Is there anything I am missing in the above process?

Thanks.

bit1...@163.com