+1. Thank you for looking into this, Hongchao.

On Mon, Mar 9, 2015 at 8:31 PM, Hongchao Deng <hd...@cloudera.com> wrote:

> Hi all,
>
> I recently worked on fixing flaky test -- testPortChange(), which is
> related to ZOOKEEPER-2000.
>
> This is what I have figured out:
>
> * Server (1) and (2) were followers, (3) was the leader.
> * client connected to (1), did a reconfig().
> * (1) and (2) formed a quorum, reconfig was successful, and returned.
> * (3) still thinks he's the leader, so using LeaderZooKeeperServer.
> * client connected to (3) did a sync(), and the sync didn't go through a
>   quorum. THE CLIENT WHO DID SYNC() GETS WRONG BEHAVIOR. There's a split
>   brain here for sync().
> * Then (3) gradually moves to the new quorum config.
>
> I'm proposing to change sync() to need quorum acks. I've privately talked
> with my friend Xiang Li who's working on etcd. He previously had similar
> experience and finally changed sync to go through quorum.
>
> Since this change affects the behavior of sync(), I'm asking in public if
> there's any concern/assumption? Let's discuss it here.
>
> Best,
> --
> *- Hongchao Deng*
> *Software Engineer*
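
To make the scenario in the quoted message concrete, here is a toy Python model of it. This is only a sketch and not ZooKeeper code: the Server class, the config_version and last_zxid fields, and both sync functions are hypothetical names invented for illustration, and the rule that a follower refuses to ack a leader with an older config is an assumption about how a quorum-acked sync() might behave.

```python
from dataclasses import dataclass


@dataclass
class Server:
    """Toy ensemble member (hypothetical, not a ZooKeeper class)."""
    sid: int
    config_version: int = 0  # version of the ensemble config this server knows
    last_zxid: int = 0       # last committed transaction this server has seen

    def ack_sync(self, leader_config_version):
        # Assumption: a peer only acks a leader whose config is not older than its own.
        return leader_config_version >= self.config_version


def sync_local(leader):
    """sync() answered by the leader alone, without contacting any peer."""
    return leader.last_zxid


def sync_with_quorum(leader, ensemble):
    """sync() answered only after a majority of the ensemble acks the leader."""
    acks = 1  # the leader counts itself
    for peer in ensemble:
        if peer is not leader and peer.ack_sync(leader.config_version):
            acks += 1
    if acks <= len(ensemble) // 2:
        raise RuntimeError("sync() failed: no quorum acked this leader")
    return leader.last_zxid


s1, s2, s3 = Server(1), Server(2), Server(3)
ensemble = [s1, s2, s3]

# reconfig() commits through the quorum {1, 2}; the old leader (3) misses it.
for s in (s1, s2):
    s.config_version = 1
    s.last_zxid = 1

# Without quorum acks, the stale leader answers from its own outdated state.
stale_result = sync_local(s3)

# With quorum acks, the stale leader cannot gather a majority and must fail.
try:
    sync_with_quorum(s3, ensemble)
    quorum_result = "ok"
except RuntimeError:
    quorum_result = "rejected"

# A server that has the new config can still complete sync() through a quorum.
fresh_result = sync_with_quorum(s1, ensemble)
```

In this model the local sync() on (3) returns a position that is behind the committed reconfig, while the quorum-acked variant fails on (3) and succeeds on (1), which is the behavior change being proposed.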