It is OK to have two servers that both think they are leaders, as long as only one of them is able to commit txns at a time by having a quorum of supporters. Each server is going to follow a single leader, so with the information you provided I don't see a problem in your scenario. Now, if you tell me that as you keep sending new transactions to those leaders, both of them keep committing them indefinitely, then we have a problem. I don't see how this can happen, though.
Also, one of the leaders should eventually time out and go back to leader election.

-Flavio

> -----Original Message-----
> From: Jordan Zimmerman [mailto:[email protected]]
> Sent: 14 June 2013 21:10
> To: [email protected]
> Subject: Re: Rolling config change considered harmful?
>
> More on this.
>
> I just did some testing with wholly contrived scenarios and I was able to get a
> cluster in a state where it had two leaders. NOTE: all of this was done with
> Curator's TestingCluster
>
> * Create a 5 node ensemble
> * Save the list of instances, directories etc.
> * Wait for quorum
> * Shut down the cluster
> * Restart the ensemble with the same ports and directories. However, this
>   time, give different server lists to each instance:
>     * Instance A -> A D E
>     * Instance B -> A B C
>     * Instance C -> A B C
>     * Instance D -> A D E
>     * Instance E -> A D E
>
> There is at least one common server amongst all of them. When I restart the
> cluster with this configuration I ended up with two leaders. This state stays
> consistent after leader election (i.e. it doesn't try to re-elect).
>
> A: following
> B: leading
> C: following
> D: leading
> E: following
>
> This may be the correct behavior. i.e. it may be that ZooKeeper cannot
> realistically run in this scenario. What it means to me is that rolling config
> changes, if too lax, can create chaos.
>
> -Jordan
>
> On Jun 14, 2013, at 12:27 PM, "FPJ" <[email protected]> wrote:
>
> > In the case I described, the txn is not reflected in the zookeeper state.
> > Say T is a create txn. Once C is elected, it determines the initial
> > history of txns for the new epoch that is starting and this initial
> > history is not going to include T.
> >
> > In the example below, I was ignoring the client that triggered T, but
> > since it has been acked by a quorum, the client might as well have
> > received the confirmation of the operation and think that the znode has
> > been created.
> >
> > -Flavio
> >
> >> -----Original Message-----
> >> From: Jordan Zimmerman [mailto:[email protected]]
> >> Sent: 14 June 2013 20:16
> >> To: [email protected]
> >> Subject: Re: Rolling config change considered harmful?
> >>
> >> Yes - save that I'm not sure what happens with a client when a
> >> transaction is lost. What is the error to the client? Or are you
> >> referring to internal transactions as part of the leader election?
> >>
> >> -JZ
> >>
> >> On Jun 14, 2013, at 12:07 PM, "FPJ" <[email protected]> wrote:
> >>
> >>> Not sure if this helps but here is an example:
> >>>
> >>> - Txn T is acknowledged by A and B (ensemble is {A, B, C})
> >>> - Ensemble changes to {B, C, D}
> >>> - C and D form a quorum and elect C because it has the highest zxid.
> >>>
> >>> C won't have T, so the txn gets lost.
> >>>
> >>> Does it make sense?
> >>>
> >>> -Flavio
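
For reference, the two scenarios quoted above can be checked with a small Python sketch. It is not ZooKeeper code: it assumes plain majority quorums, and it assumes each server evaluates a quorum against its own configured server list. The names majority, is_quorum, server_lists and supporters are made up for illustration. In the two-leader case it further assumes that C follows B and E follows D, since those are the only leaders in their respective lists; the thread does not say which leader A follows, so A is left out of the count.

```python
def majority(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def is_quorum(supporters: set, ensemble: set) -> bool:
    """True if the supporters that belong to `ensemble` form a majority of it."""
    return len(supporters & ensemble) >= majority(len(ensemble))


# Scenario 1: the lost-txn example ({A, B, C} changes to {B, C, D}).
old_ensemble = {"A", "B", "C"}
new_ensemble = {"B", "C", "D"}
acked_t = {"A", "B"}    # servers that acknowledged txn T
electors = {"C", "D"}   # servers that elect C in the new ensemble

txn_acked = is_quorum(acked_t, old_ensemble)      # T had a quorum in the old view
election_ok = is_quorum(electors, new_ensemble)   # C has a quorum in the new view
overlap = acked_t & electors                      # servers that saw T and also voted

# Scenario 2: five instances restarted with different server lists.
server_lists = {
    "A": {"A", "D", "E"},
    "B": {"A", "B", "C"},
    "C": {"A", "B", "C"},
    "D": {"A", "D", "E"},
    "E": {"A", "D", "E"},
}
# Assumption (not stated in the thread): C follows B, E follows D.
# A's leader is unknown, so A is not counted for either side.
supporters = {"B": {"B", "C"}, "D": {"D", "E"}}

leaders_with_quorum = sorted(
    leader
    for leader, backing in supporters.items()
    if is_quorum(backing, server_lists[leader])
)

print("T acked by a quorum of the old ensemble:", txn_acked)
print("C elected by a quorum of the new ensemble:", election_ok)
print("servers in both quorums:", overlap)
print("leaders with a majority of their own list:", leaders_with_quorum)
```

Under these assumptions the two quorums in the first scenario share no server, which is why the newly elected leader never sees T. In the second scenario each leader counts a majority of its own list even without A, so whether both could actually commit depends on details this simplified model does not capture.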
