Please consider committing ZK-1810 as well! :-)

-Flavio

On 31 May 2014, at 04:29, Michi Mutsuzaki <[email protected]> wrote:

> Thanks Flavio. I guess it's time for me to upgrade to 3.4.6 :)
>
> Regarding the socket timeout, I see that the timeout is set to
> self.tickTime * self.initLimit in connectToLeader(). I'm using the
> default values for both tickTime and initLimit, so it should have
> timed out sooner. I'll double check these settings and the time the
> leader got killed.
>
> Thanks!
> --Michi
>
>
> On Fri, May 30, 2014 at 1:30 AM, FPJ <[email protected]> wrote:
>> Hi Michi,
>>
>> 1) The follower stops following the leader when it gets an exception on the
>> socket (Follower.followLeader):
>>     ...
>>     while (self.isRunning()) {
>>         readPacket(qp);
>>         processPacket(qp);
>>     }
>> } catch (Exception e) {
>>     ...
>>
>> I believe we are setting the timeout like this: self.tickTime *
>> self.initLimit. Check connectToLeader().
>>
>> 2) I believe we fixed this bug in 3.4.6 and the change is pending for trunk.
>> Check ZK-1808 for 3.4.6 and ZK-1810 for trunk.
>>
>> -Flavio
>>
>>
>>> -----Original Message-----
>>> From: [email protected] [mailto:[email protected]] On Behalf Of
>>> Michi Mutsuzaki
>>> Sent: 30 May 2014 06:30
>>> To: [email protected]
>>> Subject: leader election doesn't settle
>>>
>>> I have a 3 server cluster using ZooKeeper 3.4.5 with server IDs 61, 150, and
>>> 228, and 150 is the leader. I shut down 150. I have 2 questions.
>>>
>>> 1) Both 61 and 228 take about 5 minutes to detect that the leader died. Is
>>> there a TCP setting I need to tune to make this quicker?
>>>
>>> https://paste.apache.org/4AFR?action=download
>>>
>>> 2) Leader election between 61 and 228 never settles. 61 doesn't seem to
>>> receive notifications from 228, and 228 keeps receiving notifications from
>>> 61 for the previous epoch. I restarted 61 and the leader election settled.
>>> Have you guys seen this behavior?
>>>
>>> https://paste.apache.org/37vU?action=download
>>
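
For reference, the timeout arithmetic discussed in this thread can be sketched as below. This is only an illustration, not ZooKeeper code: the class and method names are hypothetical, and the values tickTime=2000 ms and initLimit=10 are assumed from the sample configuration shipped with ZooKeeper, not taken from the cluster described above. Under those assumptions, self.tickTime * self.initLimit comes to 20 seconds, far below the roughly 5 minutes reported.

```java
// Hypothetical illustration; not part of the ZooKeeper code base.
public class FollowerTimeoutSketch {

    // Mirrors the expression quoted in the thread: self.tickTime * self.initLimit.
    // tickTimeMs is in milliseconds; initLimitTicks is a number of ticks.
    static int readTimeoutMillis(int tickTimeMs, int initLimitTicks) {
        // multiplyExact throws ArithmeticException on int overflow instead of wrapping.
        return Math.multiplyExact(tickTimeMs, initLimitTicks);
    }

    public static void main(String[] args) {
        int tickTimeMs = 2000;   // assumed value (tickTime in the sample config)
        int initLimitTicks = 10; // assumed value (initLimit in the sample config)
        int timeoutMs = readTimeoutMillis(tickTimeMs, initLimitTicks);
        System.out.println("socket read timeout: " + timeoutMs + " ms");
    }
}
```

If the configured values really are in this range, a 5 minute detection delay would suggest the follower was not blocked on a read governed by this timeout, which is consistent with double checking the settings and the time the leader was killed.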
