Hi Rakesh,

In Zab, before the end of the synchronization phase, the new leader will not commit any proposals in its transaction log that have not received a majority of ACKs from the previous ensemble (that is what you are saying).
I think what Zab does is this: before the end of the synchronization phase, within the new quorum (L and F2), L (the prospective leader) syncs its own state to F2 as the initial state. In my scenario, zxid=10 is part of that initial state, so it will be delivered in the new quorum (L and F2) before any new proposals of the new epoch are processed. You can read this thread for more info: http://zookeeper-user.578899.n2.nabble.com/Zab-Failure-scenario-td7581583.html

What do you think? Does anyone have any questions or concerns about such a (small) optimization?

Ibrahim

From: Rakesh Radhakrishnan [mailto:[email protected]]
Sent: Thursday, October 01, 2015 06:15 PM
To: Ibrahim El-sanosi (PGR)
Subject: Re: 3-server Zab cluster

>>> (***) Ok, I thought when F2 forms a quorum with L and before serving
>>> clients, L synchronizes its state with F2, resulting in zxid=10 being
>>> committed in L and F2 as well. I also thought this process is the same
>>> as Zab, isn't it?

Since L didn't receive any ACK responses from F1 or F2 before leaving the Leader status previously, L won't commit transaction zxid=10. IIUC, after re-forming the new quorum, L will not have any mechanism to re-initiate the proposal (active messaging phase) for the previous zxid=10.

-Rakesh

On Thu, Oct 1, 2015 at 10:19 PM, Ibrahim El-sanosi (PGR) <[email protected]> wrote:

Thank you Rakesh.

>>> In your case, zk client sees a successful response from F1. Then assume F2
>>> joins quorum first and L becomes the leader again. But the newly formed
>>> quorum will not have the zxid=10 transaction. This will make the cluster
>>> inconsistent, isn't it?

(***) Ok, I thought when F2 forms a quorum with L and before serving clients, L synchronizes its state with F2, resulting in zxid=10 being committed in L and F2 as well. I also thought this process is the same as Zab, isn't it?

>>> Apart from the above case I'm not seeing any other problems with a 3-node
>>> cluster.
>>> The above data loss case can be avoided by assuming
>>> that more than the tolerated number of server failures may affect
>>> cluster consistency and result in data loss.

Yes, if the solution above (***) is not correct, your assumption makes sense.

Ibrahim

From: Rakesh Radhakrishnan [mailto:[email protected]]
Sent: 01 October 2015 17:26
To: [email protected]; Ibrahim El-sanosi (PGR)
Subject: Re: 3-server Zab cluster

Hi Ibrahim,

The example below is taken from your older mail thread.

>>>>> 1. leader (L) sends a proposal p with zxid=10 to F1 and F2.
>>>>> 2. F1 logs, sends an ACK, commits, replies to clients and crashes. F2
>>>>> crashes before receiving P10. L has not received any ACKs

My thoughts on the above scenario:

In your case, the zk client sees a successful response from F1. Then assume F2 joins the quorum first and L becomes the leader again. But the newly formed quorum will not have the zxid=10 transaction. This will make the cluster inconsistent, isn't it?

Apart from the above case I'm not seeing any other problems with a 3-node cluster. The above data loss case can be avoided by assuming that more than the tolerated number of server failures may affect cluster consistency and result in data loss.

But I feel this optimization would have more problem cases if we scale the cluster beyond 3 servers. For now, I'm not thinking in that direction, as your case is limited to a 3-node cluster.

Regards,
Rakesh

On Tue, Sep 29, 2015 at 2:28 PM, Ibrahim El-sanosi (PGR) <[email protected]> wrote:

Yes Alex, in my post I mentioned that this (small) optimization can only work with a 3-server cluster. Who could confirm that the optimization can work?
Ibrahim

-----Original Message-----
From: Alexander Shraer [mailto:[email protected]]
Sent: Tuesday, September 29, 2015 12:11 AM
To: [email protected]
Subject: Re: 3-server Zab cluster

I'm not 100% sure whether operations that were pending on the leader are sent out during sync when this leader loses quorum and is re-elected. If so, then maybe you're right. But in any case, this would not work for 5 or more servers...

On Mon, Sep 28, 2015 at 3:51 PM, Ibrahim El-sanosi (PGR) <[email protected]> wrote:

> Thank you Alex for replying.
>
> When you said "the leader gets re-elected and the operation is
> truncated from logs at other servers", I thought the new leader would
> sync its logs with the other followers (synchronization phase),
> resulting in the operation being committed by the new quorum. Let me
> lay out the scenario as steps:
>
> 1. leader (L) sends a proposal p with zxid=10 to F1 and F2.
> 2. F1 logs, sends an ACK, commits, replies to clients and crashes. F2
> crashes before receiving P10. L has not received any ACKs
>
> Possible solution (1)
> The leader will move to the LOOKING phase as there is no quorum supporting
> its leadership. Now assume F2 wakes up. F2 forms a quorum with L
> (the previous leader); L becomes the new leader again as it has the
> latest zxid (10) in its log.
> L syncs its state with F2; as a result L, F1 (before crashing) and F2
> commit P10. Is that correct?
>
> Possible solution (2)
> The leader will move to the LOOKING phase as there is no quorum supporting
> its leadership. Now assume F1 (with zxid=10 committed) wakes up. I
> am not sure who should be the leader: F1 with zxid=10 committed, or L
> (the previous leader) with zxid=10 logged. I think F1 becomes the new
> leader as it has zxid=10 committed. F1 forms a quorum with L (the previous
> leader), and F1 becomes the new leader as it has the latest zxid (10).
> F1 (new leader) syncs its state with L (the previous leader, now a
> follower); as a result zxid=10 is committed by the new quorum. Is that correct?
>
> What do you think?
>
> Ibrahim
>
> -----Original Message-----
> From: Alexander Shraer [mailto:[email protected]]
> Sent: Monday, September 28, 2015 07:27 PM
> To: [email protected]
> Cc: [email protected]
> Subject: Re: 3-server Zab cluster
>
> Committing locally when sending an ACK at a server would lead to loss
> of consistency - it is possible that this is the only server that
> acks, e.g., this server is temporarily disconnected from the leader,
> the leader gets re-elected and the operation is truncated from logs at
> other servers. It's OK to ACK it but it's not OK to commit, since this
> exposes it to users as a committed operation that they can see.
>
> On Mon, Sep 28, 2015 at 4:19 AM, Ibrahim El-sanosi (PGR) <
> [email protected]> wrote:
>
> > In Zab, assume we have a cluster consisting of 3 servers. To deliver a
> > write request, it must run 3 communication steps: proposal,
> > acknowledgement and commit.
> > As Zab uses reliable FIFO, it is possible to remove the commit round. As
> > soon as a follower receives a proposal, it logs it, sends an ACK and
> > commits locally. Upon receiving an ACK from any follower, the leader
> > commits the proposal locally; no COMMIT message needs to be sent to
> > followers. In this case, all servers commit a proposal in two
> > communication steps, reducing latency, particularly at followers.
> >
> > Note that this optimization can only work in a 3-server cluster
> > (a follower reaches a majority as soon as it acks).
> > Does anyone see any problems with such a (small) optimization?
> > Ibrahim
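
The quorum arithmetic behind the "3 servers only" remark in the original post, and behind Alex's point about 5 or more servers, can be sketched as below. This is only an illustration of vote counting: the function names are hypothetical and not part of ZooKeeper, and the sketch does not model what Zab does during recovery, which is the open question in this thread.

```python
def majority(n: int) -> int:
    """Smallest number of servers that forms a quorum in an n-server ensemble."""
    return n // 2 + 1


def single_ack_is_quorum(n: int) -> bool:
    """True if the leader plus one acknowledging follower already form a majority.

    Assumption for this sketch: the leader counts as having accepted its own
    proposal, so a follower that acks knows of two acceptors (itself and L).
    """
    leader_plus_one_follower = 2
    return leader_plus_one_follower >= majority(n)


if __name__ == "__main__":
    for n in (3, 5, 7):
        print(n, majority(n), single_ack_is_quorum(n))
```

With 3 servers the majority is 2, so a single follower ACK completes a quorum together with the leader. With 5 servers the majority is 3, so a follower that commits on its own ACK cannot know that a quorum has accepted the proposal.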
