[ https://issues.apache.org/jira/browse/ZOOKEEPER-1330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16950685#comment-16950685 ]
MachelCheng edited comment on ZOOKEEPER-1330 at 10/14/19 3:13 AM:
------------------------------------------------------------------

Hi [~hanm] [~fpj],

In my project, I just encountered this problem. Leader election took 25 seconds, but the follower took 374 seconds.

The log is as follows:

2019-10-12 23:00:20,354 [myid:3] - INFO [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:Leader@380] - LEADING - LEADER ELECTION TOOK - 25507
2019-10-12 23:06:54,692 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@65] - FOLLOWING - LEADER ELECTION TOOK - 374511

During this period, clients could not connect to the ZooKeeper server, and there were many log entries like this:

WARN [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362] - Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running

I have three questions. Can anyone help explain?

1. Why does the follower take so long to finish its election, and why is the difference from the leader's election time so large?
2. When does ZooKeeper start serving requests normally? Do the followers need to finish their election process, or is it enough that the leader election is done?
3. How can this problem be solved? This is the most important question, and the one I most want answered.

> Zookeeper server not serving the client request even after completion of Leader election
> ----------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-1330
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1330
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.4.0
>         Environment: 3 zk quorum
>            Reporter: amith
>            Priority: Minor
>             Fix For: 3.6.0, 3.5.7
>
>
> Have a cluster of 3 ZooKeeper servers
> 90 clients are connected to the server
> The leader got killed and was started again
> The other 2 ZooKeeper servers started FLE and a leader was elected
> But it takes nearly 10 seconds for this server to serve requests, and it keeps logging the "ZooKeeperServer not running" message.
> Why is the server still NOT RUNNING even after leader election?
> 2011-12-19 16:12:29,732 [myid:2] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2182:NIOServerCnxn@354] - Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
> 2011-12-19 16:12:29,733 [myid:2] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2182:NIOServerCnxn@1000] - Closed socket connection for client /10.18.47.148:51965 (no session established for client)
> 2011-12-19 16:12:29,753 [myid:2] - INFO [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2182:QuorumPeer@747] - LEADING
> 2011-12-19 16:12:29,762 [myid:2] - INFO [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2182:Leader@58] - TCP NoDelay set to: true
> 2011-12-19 16:12:29,765 [myid:2] - INFO [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2182:ZooKeeperServer@168] - Created server with tickTime 2000 minSessionTimeout 4000 maxSessionTimeout 40000 datadir ../dataDir/version-2 snapdir ../dataDir/version-2
> 2011-12-19 16:12:29,766 [myid:2] - INFO [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2182:Leader@294] - LEADING - LEADER ELECTION TOOK - 4663
> 2011-12-19 16:12:29,776 [myid:2] - INFO [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2182:FileSnap@83] - Reading snapshot ../dataDir/version-2/snapshot.100013661
> 2011-12-19 16:12:29,831 [myid:2] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2182:NIOServerCnxnFactory@213] - Accepted socket connection from /10.18.47.148:51982
> 2011-12-19 16:12:29,831 [myid:2] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2182:NIOServerCnxn@354] - Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
> 2011-12-19 16:12:29,832 [myid:2] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2182:NIOServerCnxn@1000] - Closed socket connection for client /10.18.47.148:51982 (no session established for client)
> 2011-12-19 16:12:29,884 [myid:2] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2182:NIOServerCnxnFactory@213] - Accepted socket connection from /10.18.47.148:51989
> 2011-12-19 16:12:29,884 [myid:2] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2182:NIOServerCnxn@354] - Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running

--
This message was sent by Atlassian Jira (v8.3.4#803005)
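
For anyone comparing timings across servers, here is a minimal sketch (not part of ZooKeeper) that pulls the "LEADER ELECTION TOOK" entries out of server logs and reports how long after the leader each peer finished. The function name `election_events` and the regular expression are assumptions for illustration; the pattern is based only on the two log lines quoted in the comment above and may need adjusting for other ZooKeeper versions or log layouts.

```python
import re
from datetime import datetime

# Assumed pattern, derived only from the two log lines quoted in this thread.
ELECTION_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"\[myid:(?P<myid>\d+)\].*?"
    r"(?P<role>LEADING|FOLLOWING) - LEADER ELECTION TOOK - (?P<ms>\d+)"
)


def election_events(lines):
    """Return one dict per 'LEADER ELECTION TOOK' line found in `lines`."""
    events = []
    for line in lines:
        m = ELECTION_RE.search(line)
        if m is None:
            continue  # not an election-summary line
        events.append({
            "time": datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S,%f"),
            "myid": int(m.group("myid")),
            "role": m.group("role"),
            "took_ms": int(m.group("ms")),
        })
    return events


# The two entries quoted in the comment above.
SAMPLE = [
    "2019-10-12 23:00:20,354 [myid:3] - INFO [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:Leader@380] - LEADING - LEADER ELECTION TOOK - 25507",
    "2019-10-12 23:06:54,692 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@65] - FOLLOWING - LEADER ELECTION TOOK - 374511",
]

events = election_events(SAMPLE)
leader = next(e for e in events if e["role"] == "LEADING")
for e in events:
    # Seconds between the leader's election log entry and this peer's entry.
    lag = (e["time"] - leader["time"]).total_seconds()
    print(f"myid={e['myid']} {e['role']} took {e['took_ms']} ms, "
          f"{lag:.3f} s after the leader")
```

On the two sample lines, the follower's entry appears 394.338 seconds after the leader's. In practice you would feed the function the concatenated log files from every server in the quorum rather than the hard-coded `SAMPLE` list.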