Indeed, I meant to say quorum.

-Flavio

On 5 Oct 2015 6:30 pm, "Ibrahim El-sanosi (PGR)" <[email protected]> wrote:
> Hi Flavio,
>
> > That's not accurate. Being recorded by a quorum guarantees that a txn will be in the initial state of future epochs, but a prospective leader might have txns in its log that haven't been recorded in *a log*. The prospective leader needs to make sure that such txns are recorded in a quorum before establishing a new epoch, though.
>
> I guess you meant a quorum, not a log, where you wrote *log* above!
>
> Thank you
>
> Ibrahim
>
> -----Original Message-----
> From: Flavio Junqueira [mailto:[email protected]]
> Sent: Monday, October 05, 2015 06:23 PM
> To: [email protected]
> Subject: Re: 3-server Zab cluster
>
> > On 05 Oct 2015, at 18:13, Ibrahim El-sanosi (PGR) <[email protected]> wrote:
> >
> > Hi Rakesh,
> >
> > In Zab, before the end of the synchronization phase, the new leader will not commit any proposals in transaction logs that have not got a majority of acks from the previous ensemble (that is what you are saying).
>
> That's not accurate. Being recorded by a quorum guarantees that a txn will be in the initial state of future epochs, but a prospective leader might have txns in its log that haven't been recorded in a log. The prospective leader needs to make sure that such txns are recorded in a quorum before establishing a new epoch, though.
>
> > I think what Zab does is that before the end of the synchronization phase, in L and F2 (the new quorum), L (a prospective leader) will sync its own state with F2 as the initial state. Referring to my scenario, zxid=10 is part of the initial state and as a result it will be delivered in the new quorum (L and F2) before processing new proposals of the new epoch.
>
> Yes, this is right.
>
> > You can read this thread for more info:
> > http://zookeeper-user.578899.n2.nabble.com/Zab-Failure-scenario-td7581583.html
> >
> > What do you think?
> > Does anyone have any questions or concerns about such a (small) optimization?
>
> I'm not entirely sure what the optimization is and if you are proposing a change or what. Are you looking for a blessing from this community? I'd like to understand what you're trying to achieve.
>
> -Flavio
>
> > Ibrahim
> >
> > From: Rakesh Radhakrishnan [mailto:[email protected]]
> > Sent: Thursday, October 01, 2015 06:15 PM
> > To: Ibrahim El-sanosi (PGR)
> > Subject: Re: 3-server Zab cluster
> >
> > >>> (***) Ok, I thought when F2 forms a quorum with L and before serving clients, L synchronizes its state with F2, resulting in zxid=10 being committed in L and F2 as well. I also thought this process is the same as Zab, isn't it?
> >
> > Since L didn't receive any ACK responses from F1 or F2 before leaving the Leader status previously, L won't commit transaction zxid=10. IIUC after re-forming the new quorum L will not have any mechanism to re-initiate the proposal (active messaging phase) for the previous zxid=10.
> >
> > -Rakesh
> >
> > On Thu, Oct 1, 2015 at 10:19 PM, Ibrahim El-sanosi (PGR) <[email protected]> wrote:
> > Thank you Rakesh.
> >
> > >>> In your case, the zk client sees a successful response from F1. Then assume F2 joins the quorum first and L becomes the leader again. But the newly formed quorum will not have the zxid=10 transaction. This will make the cluster inconsistent, won't it?
> >
> > (***) Ok, I thought when F2 forms a quorum with L and before serving clients, L synchronizes its state with F2, resulting in zxid=10 being committed in L and F2 as well. I also thought this process is the same as Zab, isn't it?
> >
> > >>> Apart from the above case I'm not seeing any other problems with a 3 node cluster.
> > >>> The above data loss case can be avoided by putting an assumption that more than a tolerated number of server failures may affect the cluster consistency and result in data loss.
> >
> > Yes, if the solution above (***) is not correct, your assumption makes sense.
> >
> > Ibrahim
> >
> > From: Rakesh Radhakrishnan [mailto:[email protected]]
> > Sent: 01 October 2015 17:26
> > To: [email protected]; Ibrahim El-sanosi (PGR)
> > Subject: Re: 3-server Zab cluster
> >
> > Hi Ibrahim,
> >
> > Below example taken from your older mail thread.
> >
> > >>> 1. leader (L) sends a proposal p with zxid=10 to F1 and F2.
> > >>> 2. F1 logs, sends an ACK, commits, replies to clients and crashes. F2 crashes before receiving P10. L has not received any ACKs.
> >
> > My thoughts for the above scenario are:
> >
> > In your case, the zk client sees a successful response from F1. Then assume F2 joins the quorum first and L becomes the leader again. But the newly formed quorum will not have the zxid=10 transaction. This will make the cluster inconsistent, won't it?
> >
> > Apart from the above case I'm not seeing any other problems with a 3 node cluster. The above data loss case can be avoided by putting an assumption that more than a tolerated number of server failures may affect the cluster consistency and result in data loss. But I feel this optimization would have more cases if we scale up the cluster size beyond 3 servers. For now, I'm not thinking in that direction as your case is limited to a 3 node cluster.
> >
> > Regards,
> > Rakesh
> >
> > On Tue, Sep 29, 2015 at 2:28 PM, Ibrahim El-sanosi (PGR) <[email protected]> wrote:
> > Yes Alex, in my post I mentioned that this (small) optimization can only work with a 3-server cluster.
> >
> > Who could confirm the optimization can work?
> >
> > Ibrahim
> >
> > -----Original Message-----
> > From: Alexander Shraer [mailto:[email protected]]
> > Sent: Tuesday, September 29, 2015 12:11 AM
> > To: [email protected]
> > Subject: Re: 3-server Zab cluster
> >
> > I'm not 100% sure whether operations that were pending on the leader are sent out during sync when this leader loses quorum and is re-elected. If so, then maybe you're right. But in any case, this would not work for 5 or more servers...
> >
> > On Mon, Sep 28, 2015 at 3:51 PM, Ibrahim El-sanosi (PGR) <[email protected]> wrote:
> >
> >> Thank you Alex for replying.
> >>
> >> When you said "the leader gets re-elected and the operation is truncated from logs at other servers", I thought the new leader would sync its logs with the other followers (synchronization phase), resulting in the operation being committed by the new quorum. Let me lay out the scenario as steps:
> >>
> >> 1. leader (L) sends a proposal p with zxid=10 to F1 and F2.
> >> 2. F1 logs, sends an ACK, commits, replies to clients and crashes. F2 crashes before receiving P10. L has not received any ACKs.
> >>
> >> Possible solution (1)
> >> The leader will move to the LOOKING phase as there is no quorum supporting its leadership. Now assume F2 wakes up.
> >> F2 forms a quorum with L (the previous leader), and L becomes the new leader again as it has the latest zxid (10) in its log. L syncs its state with F2; as a result L, F1 (before crashing) and F2 commit P10. Is that correct?
> >>
> >> Possible solution (2)
> >> The leader will move to the LOOKING phase as there is no quorum supporting its leadership. Now assume F1 (with zxid=10 committed) wakes up. I am not sure who should be the leader (F1 with zxid=10 committed, or L (the previous leader) with zxid=10 logged); I think F1 becomes the new leader as it has zxid=10 committed. F1 forms a quorum with L (the previous leader), and F1 becomes the new leader as it has the latest zxid (10). F1 (the new leader) syncs its state with L (the previous leader, now a follower); as a result zxid=10 is committed by the new quorum. Is that correct?
> >>
> >> What do you think?
> >>
> >> Ibrahim
> >>
> >> -----Original Message-----
> >> From: Alexander Shraer [mailto:[email protected]]
> >> Sent: Monday, September 28, 2015 07:27 PM
> >> To: [email protected]
> >> Cc: [email protected]
> >> Subject: Re: 3-server Zab cluster
> >>
> >> Committing locally when sending an ACK at a server would lead to loss of consistency - it is possible that this is the only server that acks, e.g., this server is temporarily disconnected from the leader, the leader gets re-elected and the operation is truncated from logs at other servers. It's OK to ACK it but it's not OK to commit, since this exposes it to users as a committed operation that they can see.
> >>
> >> On Mon, Sep 28, 2015 at 4:19 AM, Ibrahim El-sanosi (PGR) <[email protected]> wrote:
> >>
> >>> In Zab, assume we have a cluster consisting of 3 servers. To deliver a write request, it must run 3 communication steps: proposal, acknowledgement and commit.
> >>> As Zab uses reliable FIFO channels, it is possible to remove the commit round. As soon as a follower receives a proposal, it logs it, sends an ACK and commits locally. Upon receiving an ACK from any follower, the leader commits the proposal locally; no COMMIT message needs to be sent to followers. In this case, all servers commit a proposal in two communication steps, reducing latency, particularly at followers.
> >>>
> >>> Note that this optimization can only work in a 3-server cluster (a follower reaches a majority as soon as it acks).
> >>> Does anyone see any problems with such a (small) optimization?
> >>> Ibrahim
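
The counting argument behind "a follower reaches a majority as soon as it acks" can be checked with a minimal sketch. It assumes plain majority quorums, and the function names `majority` and `follower_ack_completes_quorum` are hypothetical helpers for illustration, not part of ZooKeeper. It covers only the arithmetic; it says nothing about the consistency problem raised in the replies (a follower committing locally before the leader knows a quorum has logged the proposal).

```python
def majority(n: int) -> int:
    """Smallest number of servers that forms a majority of an n-server ensemble."""
    return n // 2 + 1


def follower_ack_completes_quorum(n: int) -> bool:
    """True if the leader plus one acking follower (2 servers) is already a majority."""
    return 2 >= majority(n)


if __name__ == "__main__":
    for n in (3, 5, 7):
        # Prints the ensemble size, its majority size, and whether one ACK suffices.
        print(n, majority(n), follower_ack_completes_quorum(n))
```

With 3 servers the leader and one follower already make up the majority of 2, which is why the proposal is restricted to that size; with 5 or more servers a single ACK no longer implies a quorum, matching the remark that the idea "would not work for 5 or more servers".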
