Hi Ted,

In your scenario there is no problem that I can see. The problem is in another scenario I described in the JIRA: there, C has seen more proposals than B, but B has seen more commits than C. When leader election happens (assuming the servers don't restart beforehand), B will be elected leader rather than C. That is a problem because C's suffix of transactions, which were acked by both A and C (a quorum), will be truncated.
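To make that scenario concrete, here is a minimal sketch in Python. It is a toy model, not ZooKeeper's actual election code: the `Server` class, the `last_logged` / `last_committed` fields, the `elect` helper and the zxid values 1-5 are all made up for illustration, and it ignores epochs and server-id tie-breaks. It only contrasts ranking candidates by the last committed transaction with ranking them by the last logged proposal.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    last_logged: int      # highest zxid this server has written to its log (hypothetical field)
    last_committed: int   # highest zxid for which it has received a commit (hypothetical field)


def elect(servers, key):
    """Pick the candidate that ranks highest under the given ordering."""
    return max(servers, key=key)


# Illustrative numbers only: A proposed transactions 1..5 and then went away.
# B logged 1-3 and received commits for 1-3.
# C logged 1-5 (so A and C both acked 4 and 5) but received commits only for 1-2.
b = Server("B", last_logged=3, last_committed=3)
c = Server("C", last_logged=5, last_committed=2)

# Election without a restart, modelled as "only committed transactions count".
by_committed = elect([b, c], key=lambda s: s.last_committed)

# Election if every logged proposal counted instead.
by_logged = elect([b, c], key=lambda s: s.last_logged)

# Transactions C holds beyond the elected leader's log; these would be truncated.
lost = list(range(by_committed.last_logged + 1, c.last_logged + 1))

print("leader when ranking by commits:", by_committed.name)
print("leader when ranking by logged proposals:", by_logged.name)
print("transactions truncated on C:", lost)
```

In this model B wins when only commits count, and transactions 4 and 5 are dropped from C even though a quorum (A and C) had acked them. Ranking by logged proposals would pick C and keep them.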

Alex

> -----Original Message-----
> From: Ted Dunning [mailto:[email protected]]
> Sent: Thursday, July 21, 2011 1:25 PM
> To: [email protected]
> Cc: Yang
> Subject: Re: what would happen with this case ? (ZAB protocol question)
>
> Alex,
>
> Are you sure that this is a bug.
>
> Take the case of three servers A, B and C with A being leader.
>
> If transactions 1, 2 and 3 are committed, then a majority of the nodes,
> including at least A, must have seen these transactions. Moreover,
> transactions cannot be committed on a node unless all previous transactions
> have been seen on that node as well. Thus, by symmetry, we can consider
> cases where B alone committed these transactions or where B and C committed
> them. Only the first case is problematic.
>
> Now, assume further that transaction 4 has arrived at B and been forwarded
> to A but neither B nor C have committed to it.
>
> The situation now is that in this first epoch, A has seen 1-4, B has seen
> 1-3 and C has seen nothing. At least two nodes know the current epoch
> because we obviously have a quorum and we know that B knows the current
> epoch because it has seen transactions from this epoch. Thus the collection
> of machines that know the current epoch can be A+B or A+B+C.
>
> IF all three nodes now die simultaneously and B and C come back up, the
> question is what will happen. We know that the two nodes will agree on the
> epoch because at least B has the last epoch. Node B will be elected leader
> because it has seen later transactions than C. C will now get the
> transactions and we have a quorum in a new epoch.
>
> If A returns at this point, it will know about transactions 1, 2, 3 and 4.
> Further, it will know that 1, 2, and 3 have been committed in the first
> epoch and that 4 was proposed, but never committed. As it joins, it will
> find that a new epoch has started and will recognize B as master. B will
> tell it to truncate the log by deleting 4, but 4 was never committed anyway.
>
> Where is the problem?
>
> On Thu, Jul 21, 2011 at 1:11 PM, Alexander Shraer <shralex@yahoo-inc.com> wrote:
>
> > The problem is in leader election - if the server doesn't reboot before
> > running leader election (the usual case) then only the transactions for
> > which it received a commit count and it might not be elected leader, even if
> > it has seen more transactions than the others. This may lead to transactions
> > being dropped.
> >
> > I opened a JIRA for this.
> >
