RE: Zab Failure scenario
Thank you Flavio, it makes sense.

Ibrahim

-----Original Message-----
From: Flavio P JUNQUEIRA [mailto:f...@apache.org]
Sent: Sunday, October 04, 2015 05:43 PM
To: user@zookeeper.apache.org
Subject: RE: Zab Failure scenario

Acks aren't logged and neither are commits. A prospective leader commits
the initial state of the epoch using its own state as the proposed
initial state. In your scenario, txn 10 is part of the initial proposed
state.

-Flavio
RE: Zab Failure scenario
Does zxid = 10 commit because there are acknowledgments from a quorum of
the previous epoch, or does the prospective leader need to commit any
proposals in its transaction log (regardless of whether they have a
quorum of ACKs from the previous epoch)? (Remember, we are talking about
proposals that have committed yet and are located in the transaction
logs.)

The following scenario applies to the above question:

Assume we have a 3-server cluster: leader (L), follower1 (F1) and
follower2 (F2). The scenario is as follows:

1. Leader sends a proposal with zxid = 10.
2. F2 crashes before receiving P10. F1 logs, sends an ACK and crashes.

As there is no quorum supporting L, L moves to leader election (Fast
Leader Election, FLE) to find a quorum and elect a new leader. After
some time F2 wakes up and forms a quorum with L. In FLE, the process
that has the most recent zxid becomes the prospective leader, so L (the
previous leader) becomes the prospective leader because it has
zxid = 10.

What happens then? Is zxid = 10 eventually committed before the end of
the synchronization phase, or is it discarded?

Regards,

Ibrahim

-----Original Message-----
From: Flavio P JUNQUEIRA [mailto:f...@apache.org]
Sent: Sunday, September 27, 2015 05:21 PM
To: user@zookeeper.apache.org
Subject: Re: Zab Failure scenario

In 3, it is not exactly a pending proposal, but if the leader has 10 in
its log, then it will make sure 10 is committed by the end of the
synchronisation phase and before it becomes established.

I'm not sure why you are assuming 3.4.6, though. Why is it relevant for
this question?

-Flavio
RE: Zab Failure scenario
Acks aren't logged and neither are commits. A prospective leader commits
the initial state of the epoch using its own state as the proposed
initial state. In your scenario, txn 10 is part of the initial proposed
state.

-Flavio

On 4 Oct 2015 4:28 pm, "Ibrahim El-sanosi (PGR)" <
i.s.el-san...@newcastle.ac.uk> wrote:

> Sorry, there is a typo in my previous email.
>
> Does zxid = 10 commit because there are acknowledgments from a quorum
> of the previous epoch, or does the prospective leader need to commit
> any proposals in its transaction log (regardless of whether they have
> a quorum of ACKs from the previous epoch)? (Remember, we are talking
> about proposals that have NOT been committed yet and are located in
> the transaction logs.)
>
> The following scenario applies to the above question:
>
> Assume we have a 3-server cluster: leader (L), follower1 (F1) and
> follower2 (F2). The scenario is as follows:
>
> 1. Leader sends a proposal with zxid = 10.
> 2. F2 crashes before receiving P10. F1 logs, sends an ACK and crashes.
>
> Assume L has not received F1's ACK. As there is no quorum supporting
> L, L moves to leader election (Fast Leader Election, FLE) to find a
> quorum and elect a new leader. After some time F2 wakes up and forms a
> quorum with L. In FLE, the process that has the most recent zxid
> becomes the prospective leader, so L (the previous leader) becomes the
> prospective leader because it has zxid = 10.
>
> What happens then? Is zxid = 10 eventually committed before the end of
> the synchronization phase, or is it discarded? (Remember, zxid = 10
> did not get a quorum of ACKs in the previous epoch.)
>
> Regards,
>
> Ibrahim
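The point above (no ACK or COMMIT records are consulted; the prospective leader's own log is the proposed initial state) can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not ZooKeeper code: the function `synchronise`, its parameters, and the representation of a log as a plain list of zxids are all hypothetical, and real synchronisation transfers diffs, truncations or snapshots rather than whole logs.

```python
def synchronise(leader_log, follower_logs, ensemble_size):
    """Toy model of the synchronisation phase.

    leader_log:     list of zxids in the prospective leader's log
    follower_logs:  dict of follower name -> list of zxids (followers that are up)
    ensemble_size:  total number of servers in the ensemble

    Returns (committed_history, synced_logs); committed_history is None
    when the leader is not supported by a quorum.
    """
    quorum = ensemble_size // 2 + 1

    # The proposed initial state is simply the leader's own log.
    # Note that no ACKs from the previous epoch are looked at.
    initial_history = list(leader_log)

    # Each follower that is up is brought in line with the leader's history.
    synced = {name: list(initial_history) for name in follower_logs}

    supporters = 1 + len(synced)  # the leader counts itself
    if supporters < quorum:
        return None, synced

    # With a quorum in sync, everything in the initial history is committed.
    return initial_history, synced


# Scenario from this thread: L has 10 in its log, F2 never received it.
committed, synced = synchronise([8, 9, 10], {"F2": [8, 9]}, ensemble_size=3)
print(committed)      # history committed at the end of synchronisation
print(synced["F2"])   # F2's log after synchronisation
```

In this model txn 10 ends up committed even though it never gathered a quorum of ACKs in the previous epoch, which matches the answer given in the message above.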
RE: Zab Failure scenario
Sorry, there is a typo in my previous email.

Does zxid = 10 commit because there are acknowledgments from a quorum of
the previous epoch, or does the prospective leader need to commit any
proposals in its transaction log (regardless of whether they have a
quorum of ACKs from the previous epoch)? (Remember, we are talking about
proposals that have NOT been committed yet and are located in the
transaction logs.)

The following scenario applies to the above question:

Assume we have a 3-server cluster: leader (L), follower1 (F1) and
follower2 (F2). The scenario is as follows:

1. Leader sends a proposal with zxid = 10.
2. F2 crashes before receiving P10. F1 logs, sends an ACK and crashes.

Assume L has not received F1's ACK. As there is no quorum supporting L,
L moves to leader election (Fast Leader Election, FLE) to find a quorum
and elect a new leader. After some time F2 wakes up and forms a quorum
with L. In FLE, the process that has the most recent zxid becomes the
prospective leader, so L (the previous leader) becomes the prospective
leader because it has zxid = 10.

What happens then? Is zxid = 10 eventually committed before the end of
the synchronization phase, or is it discarded? (Remember, zxid = 10 did
not get a quorum of ACKs in the previous epoch.)

Regards,

Ibrahim

-----Original Message-----
From: Ibrahim El-sanosi (PGR)
Sent: Sunday, October 04, 2015 03:48 PM
To: user@zookeeper.apache.org
Subject: RE: Zab Failure scenario

Does zxid = 10 commit because there are acknowledgments from a quorum of
the previous epoch, or does the prospective leader need to commit any
proposals in its transaction log (regardless of whether they have a
quorum of ACKs from the previous epoch)? (Remember, we are talking about
proposals that have committed yet and are located in the transaction
logs.)

The following scenario applies to the above question:

Assume we have a 3-server cluster: leader (L), follower1 (F1) and
follower2 (F2). The scenario is as follows:

1. Leader sends a proposal with zxid = 10.
2. F2 crashes before receiving P10. F1 logs, sends an ACK and crashes.

As there is no quorum supporting L, L moves to leader election (Fast
Leader Election, FLE) to find a quorum and elect a new leader. After
some time F2 wakes up and forms a quorum with L. In FLE, the process
that has the most recent zxid becomes the prospective leader, so L (the
previous leader) becomes the prospective leader because it has
zxid = 10.

What happens then? Is zxid = 10 eventually committed before the end of
the synchronization phase, or is it discarded?

Regards,

Ibrahim
Re: Zab Failure scenario
A reconfiguration is treated similarly to other proposals for recovery
purposes (of course, the commit is different in that it changes the
configuration). You can see the paper
<https://www.usenix.org/system/files/conference/atc12/atc12-final74.pdf>
for details on how recovery works in principle, and if you have a
specific question please feel free to ask.

On Mon, Sep 28, 2015 at 10:54 AM, Ibrahim El-sanosi (PGR) <
i.s.el-san...@newcastle.ac.uk> wrote:

> Yes, I am thinking of mixing an in-flight reconfiguration request with
> the crashing servers example that you gave, not about how proposals,
> acks, commits (i.e. ZAB proper) work.
>
> Thank you
RE: Zab Failure scenario
Yes, I am thinking of mixing an in-flight reconfiguration request with
the crashing servers example that you gave, not about how proposals,
acks, commits (i.e. ZAB proper) work.

Thank you

-----Original Message-----
From: Raúl Gutiérrez Segalés [mailto:r...@itevenworks.net]
Sent: Monday, September 28, 2015 02:56 AM
To: user@zookeeper.apache.org
Subject: Re: Zab Failure scenario

On 27 September 2015 at 10:12, Ibrahim El-sanosi (PGR) <
i.s.el-san...@newcastle.ac.uk> wrote:

> Thank you Flavio for the explanation. It really makes sense to me.
>
> > I'm not sure why you are assuming 3.4.6, though. Why is it relevant
> > for this question?
>
> I am assuming 3.4.6 because, first, I use this version and, second, I
> do not know about dynamic reconfiguration in 3.5.0, which may have a
> different solution for the mentioned scenario.

I don't think dynamic reconfiguration changes anything about how
proposals, acks, commits (i.e. ZAB proper) work. Unless you are thinking
of mixing an in-flight reconfiguration request with the crashing servers
example that you gave.

-rgs
Re: Zab Failure scenario
In 3, it is not exactly a pending proposal, but if the leader has 10 in
its log, then it will make sure 10 is committed by the end of the
synchronisation phase and before it becomes established.

I'm not sure why you are assuming 3.4.6, though. Why is it relevant for
this question?

-Flavio

On 27 Sep 2015 4:51 pm, "Ibrahim El-sanosi (PGR)" <
i.s.el-san...@newcastle.ac.uk> wrote:

> Assume we use ZooKeeper 3.4.6 and we have a 3-server cluster: leader
> (L), follower1 (F1) and follower2 (F2). The scenario is as follows:
>
> 1. Leader sends a proposal with zxid = 10.
>
> 2. F2 crashes before receiving P10. F1 logs, sends an ACK and crashes.
>
> As there is no quorum supporting L, L moves to the LOOKING phase to
> find a quorum and elect a new leader. After some time F1 wakes up and
> forms a quorum with L. Both F1 and L (the previous leader) have the
> same state (zxid = 10 in their logs). Therefore the process which has
> the larger myid will be the leader; assume L (the previous leader) has
> the larger myid. So:
>
> 3. L sends a pending proposal with zxid = 10 to F1.
>
> 4. F1 logs and sends an ACK.
>
> 5. Upon receiving the ACK, L commits P10 and sends an ACK.
>
> Is this true or false?
>
> Regards,
> Ibrahim
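The election step in the quoted scenario (most recent zxid wins, and myid breaks a tie) can be sketched as follows. This is a minimal illustration based on a common reading of Fast Leader Election, where votes are ordered by epoch, then zxid, then server id. The `Vote` class and the function names are hypothetical, and the zxid is treated as a plain integer as in the scenario above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    epoch: int  # epoch of the proposed leader
    zxid: int   # last zxid in the proposed leader's log
    myid: int   # server id of the proposed leader


def is_better(new: Vote, current: Vote) -> bool:
    """True if `new` should replace `current`.

    Tuples compare element by element, so this orders votes by
    epoch first, then zxid, then myid.
    """
    return (new.epoch, new.zxid, new.myid) > (current.epoch, current.zxid, current.myid)


def elect(votes):
    """Pick the winning vote among the servers that form the quorum."""
    winner = votes[0]
    for vote in votes[1:]:
        if is_better(vote, winner):
            winner = vote
    return winner


# L and F1 both have zxid 10 in their logs; L has the larger myid.
old_leader = Vote(epoch=1, zxid=10, myid=3)
follower1 = Vote(epoch=1, zxid=10, myid=1)
print(elect([follower1, old_leader]).myid)  # id of the elected server
```

If instead the quorum is formed with a server whose log stops before 10, the zxid comparison already decides the election and myid is never consulted.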
RE: Zab Failure scenario
Thank you Flavio for the explanation. It really makes sense to me.

> I'm not sure why you are assuming 3.4.6, though. Why is it relevant
> for this question?

I am assuming 3.4.6 because, first, I use this version and, second, I do
not know about dynamic reconfiguration in 3.5.0, which may have a
different solution for the mentioned scenario.

Ibrahim

-----Original Message-----
From: Flavio P JUNQUEIRA [mailto:f...@apache.org]
Sent: Sunday, September 27, 2015 05:21 PM
To: user@zookeeper.apache.org
Subject: Re: Zab Failure scenario

In 3, it is not exactly a pending proposal, but if the leader has 10 in
its log, then it will make sure 10 is committed by the end of the
synchronisation phase and before it becomes established.

I'm not sure why you are assuming 3.4.6, though. Why is it relevant for
this question?

-Flavio