2.) It appears that the leader closes connections to the affected followers after a “transaction timeout” occurs. Where would I find out what this timeout is? Is this the same thing as a session timeout (e.g. the default of 20 * tickTime)?
https://github.com/apache/zookeeper/blob/trunk/src/java/main/org/apache/zookeeper/server/quorum/LearnerHandler.java#L496

a. So the Leader closes connections to Followers and Observers after syncLimit * tickTime milliseconds?
b. So what purpose does syncLimit serve in Followers and in Observers?
c. If I needed the Observer to stay connected to the ZK ensemble for a longer time, in case of network partitions, do I increase the syncLimit at the Leader or at the Observer?

> Date: Fri, 26 Jun 2015 18:10:45 -0700
> Subject: Re: Tracking down possible network partition
> From: [email protected]
> To: [email protected]
>
> On 25 June 2015 at 07:28, Round, Mark <[email protected]> wrote:
>
> > I have a 5-node Zookeeper 3.4.6 cluster across 3 data centres (2
> > zookeepers in each “main” DC, and a 5th in a 3rd DC for quorum). I see that
> > the two nodes in one DC have regular “issues” where they get kicked out of
> > the cluster and the ZooKeeperServer process stops for a few minutes until
> > the node rejoins. I’d like to know a couple of things, if someone could
> > please point me in the direction of the relevant docs I’d greatly
> > appreciate it.
> >
> > 1.) Is it expected behaviour that when a node is kicked from the cluster,
> > it will not be allowed to re-join for a period ? From the logs below I can
> > see that re-establishing a valid cluster took around 15 minutes.
> >
>
> I don't think so.
>
> > 2.) It appears that the leader closes connections to the affected followers
> > after a “transaction timeout” occurs. Where would I find out what this
> > timeout is ? Is this the same thing as a session timeout (e.g. The default
> > of 20 * tickTime) ?
> >
>
> https://github.com/apache/zookeeper/blob/trunk/src/java/main/org/apache/zookeeper/server/quorum/LearnerHandler.java#L496
>
> > 3.) Where can I find the definition of the different fields in the
> > election log messages (I.e. What are “n.round”, “n.zxid”, “n.state” and so
> > on) ?
> >
>
> Not sure if there's a better source than the source:
> https://github.com/apache/zookeeper/blob/trunk/src/java/main/org/apache/zookeeper/server/quorum/FastLeaderElection.java#L687
>
>
> -rgs
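
To make question (a) concrete, here is a sketch of the arithmetic, assuming the timeout at the linked LearnerHandler line really is syncLimit * tickTime. The values are illustrative only and are not taken from the cluster discussed in this thread:

```
# zoo.cfg fragment (illustrative values, not from the cluster above)

# Length of one tick, in milliseconds.
tickTime=2000

# Ticks allowed for a learner to connect to the leader and do its initial sync.
initLimit=10

# Ticks allowed between a learner and the leader once synced.
syncLimit=5

# Under the assumption in (a): syncLimit * tickTime = 5 * 2000 = 10000 ms
```

With these values, raising syncLimit to 10 would double that window to 20000 ms. Whether the change has to be made at the Leader, the Observer, or both is exactly what question (c) asks.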
