Sorry for any confusion but these messages are coming from a running Zookeeper cluster with 5 nodes. Can someone confirm that shutdown of ZK/ZookeeperServer not running is due to network issues as below is what i see in ZK logs.
... ... 2018-10-22 06:03:56,022 [myid:2] - INFO [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:2182:NIOServerCnxn@1044] - Closed socket connection for client /192.3.101.219:55704 which had sessionid 0x5665c67cb0d0000 2018-10-22 06:03:56,022 [myid:2] - INFO [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:2182:Learner@108] - Revalidating client: 0x5665c67cb0d0000 2018-10-22 06:03:56,265 [myid:2] - WARN [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2182:Follower@89] - Exception when following the leader java.net.SocketTimeoutException: Read timed out at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) at java.net.SocketInputStream.read(SocketInputStream.java:170) at java.net.SocketInputStream.read(SocketInputStream.java:141) at java.io.BufferedInputStream.fill(BufferedInputStream.java:246) at java.io.BufferedInputStream.read(BufferedInputStream.java:265) at java.io.DataInputStream.readInt(DataInputStream.java:387) at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63) at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83) at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:99) at org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:153) at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85) at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:937) 2018-10-22 06:03:56,266 [myid:2] - INFO [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2182:Follower@166] - shutdown called java.lang.Exception: shutdown Follower at org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166) at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:941) 2018-10-22 06:03:56,266 [myid:2] - INFO [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2182:NIOServerCnxn@1044] - Closed socket connection for client /192.72.5.213:57834 which had sessionid 0x46591d67d0c0024 2018-10-22 06:03:56,266 [myid:2] - INFO [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2182:NIOServerCnxn@1044] - Closed socket connection for client /192.3.95.181:55192 which had sessionid 0x3665c676caf0004 2018-10-22 06:03:56,266 [myid:2] - INFO [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2182:NIOServerCnxn@1044] - Closed socket connection for client /192.3.224.15:38712 which had sessionid 0x2668d42319e0012 ... ... On Tue, Oct 23, 2018 at 8:32 AM Norbert Kalmar <nkal...@cloudera.com.invalid> wrote: > Hi, > > Looks like to me ZooKeeper quorum simply didn't complete leader election > yet. You can see server state as "LOOKING", and after the errors client > timeouting, you get the server created and FOLLOWING log message. After > that, I assume there is no more error message. > > So probably ZooKeeper should be started earlier a bit if you don't want > these error messages. > > Regards, > Norbert > > On Tue, Oct 23, 2018 at 3:35 AM Susheel Kumar <susheel2...@gmail.com> > wrote: > > > Hello, > > > > I am seeing "ZookeeperServer not running" WARM messages in zookeeper logs > > which is causing the Solr client connections to timeout... > > > > What could be the problem? > > > > ZK: 3.4.10 > > > > Zookeeper.out > > == > > 2018-10-22 06:04:51,071 [myid:2] - INFO > > [WorkerReceiver[myid=2]:FastLeaderElection@600] - Notification: 1 > (message > > format version), 5 (n.leader), 0xf00000461 (n.zxid), 0x10 (n.round), > > FOLLOWING (n.state), 4 (n.sid), 0xf (n.peerEpoch) LOOKING (my state) > > 2018-10-22 06:04:51,093 [myid:2] - INFO [NIOServerCxn.Factory: > > 0.0.0.0/0.0.0.0:2182:NIOServerCnxnFactory@192] - Accepted socket > > connection > > from /192.72.25.177:39514 > > 2018-10-22 06:04:51,094 [myid:2] - WARN [NIOServerCxn.Factory: > > 0.0.0.0/0.0.0.0:2182:NIOServerCnxn@373] - Exception causing close of > > session 0x0 due to java.io.IOException: ZooKeeperServer not running > > 2018-10-22 06:04:51,094 [myid:2] - INFO [NIOServerCxn.Factory: > > 0.0.0.0/0.0.0.0:2182:NIOServerCnxn@1044] - Closed socket connection for > > client /192.72.25.177:39514 (no session established for client) > > 2018-10-22 06:04:51,138 [myid:2] - INFO [NIOServerCxn.Factory: > > 0.0.0.0/0.0.0.0:2182:NIOServerCnxnFactory@192] - Accepted socket > > connection > > from /192.3.101.219:56298 > > 2018-10-22 06:04:51,138 [myid:2] - WARN [NIOServerCxn.Factory: > > 0.0.0.0/0.0.0.0:2182:NIOServerCnxn@373] - Exception causing close of > > session 0x0 due to java.io.IOException: ZooKeeperServer not running > > 2018-10-22 06:04:51,139 [myid:2] - INFO [NIOServerCxn.Factory: > > 0.0.0.0/0.0.0.0:2182:NIOServerCnxn@1044] - Closed socket connection for > > client /192.3.101.219:56298 (no session established for client) > > 2018-10-22 06:04:51,250 [myid:2] - INFO [NIOServerCxn.Factory: > > 0.0.0.0/0.0.0.0:2182:NIOServerCnxnFactory@192] - Accepted socket > > connection > > from /192.72.27.181:46414 > > 2018-10-22 06:04:51,250 [myid:2] - WARN [NIOServerCxn.Factory: > > 0.0.0.0/0.0.0.0:2182:NIOServerCnxn@373] - Exception causing close of > > session 0x0 due to java.io.IOException: ZooKeeperServer not running > > 2018-10-22 06:04:51,250 [myid:2] - INFO [NIOServerCxn.Factory: > > 0.0.0.0/0.0.0.0:2182:NIOServerCnxn@1044] - Closed socket connection for > > client /192.72.27.181:46414 (no session established for client) > > 2018-10-22 06:04:51,275 [myid:2] - INFO > > [WorkerReceiver[myid=2]:FastLeaderElection@600] - Notification: 1 > (message > > format version), 4 (n.leader), 0xf00000461 (n.zxid), 0x192 (n.round), > > LOOKING (n.state), 4 (n.sid), 0xf (n.peerEpoch) LOOKING (my state) > > 2018-10-22 06:04:51,275 [myid:2] - INFO > > [WorkerReceiver[myid=2]:FastLeaderElection@600] - Notification: 1 > (message > > format version), 4 (n.leader), 0xf00000461 (n.zxid), 0x192 (n.round), > > LOOKING (n.state), 2 (n.sid), 0xf (n.peerEpoch) LOOKING (my state) > > 2018-10-22 06:04:51,275 [myid:2] - INFO > > [WorkerReceiver[myid=2]:FastLeaderElection@600] - Notification: 1 > (message > > format version), 4 (n.leader), 0xf00000461 (n.zxid), 0x192 (n.round), > > LOOKING (n.state), 1 (n.sid), 0xf (n.peerEpoch) LOOKING (my state) > > 2018-10-22 06:04:51,309 [myid:2] - INFO [NIOServerCxn.Factory: > > 0.0.0.0/0.0.0.0:2182:NIOServerCnxnFactory@192] - Accepted socket > > connection > > from /192.72.5.212:38944 > > 2018-10-22 06:04:51,309 [myid:2] - WARN [NIOServerCxn.Factory: > > 0.0.0.0/0.0.0.0:2182:NIOServerCnxn@373] - Exception causing close of > > session 0x0 due to java.io.IOException: ZooKeeperServer not running > > 2018-10-22 06:04:51,309 [myid:2] - INFO [NIOServerCxn.Factory: > > 0.0.0.0/0.0.0.0:2182:NIOServerCnxn@1044] - Closed socket connection for > > client /192.72.5.212:38944 (no session established for client) > > 2018-10-22 06:04:51,356 [myid:2] - INFO [NIOServerCxn.Factory: > > 0.0.0.0/0.0.0.0:2182:NIOServerCnxnFactory@192] - Accepted socket > > connection > > from /192.72.7.201:59310 > > 2018-10-22 06:04:51,356 [myid:2] - WARN [NIOServerCxn.Factory: > > 0.0.0.0/0.0.0.0:2182:NIOServerCnxn@373] - Exception causing close of > > session 0x0 due to java.io.IOException: ZooKeeperServer not running > > 2018-10-22 06:04:51,356 [myid:2] - INFO [NIOServerCxn.Factory: > > 0.0.0.0/0.0.0.0:2182:NIOServerCnxn@1044] - Closed socket connection for > > client /192.72.7.201:59310 (no session established for client) > > 2018-10-22 06:04:51,402 [myid:2] - INFO [NIOServerCxn.Factory: > > 0.0.0.0/0.0.0.0:2182:NIOServerCnxnFactory@192] - Accepted socket > > connection > > from /192.3.101.219:56302 > > 2018-10-22 06:04:51,402 [myid:2] - WARN [NIOServerCxn.Factory: > > 0.0.0.0/0.0.0.0:2182:NIOServerCnxn@373] - Exception causing close of > > session 0x0 due to java.io.IOException: ZooKeeperServer not running > > 2018-10-22 06:04:51,402 [myid:2] - INFO [NIOServerCxn.Factory: > > 0.0.0.0/0.0.0.0:2182:NIOServerCnxn@1044] - Closed socket connection for > > client /192.3.101.219:56302 (no session established for client) > > 2018-10-22 06:04:51,467 [myid:2] - INFO [NIOServerCxn.Factory: > > 0.0.0.0/0.0.0.0:2182:NIOServerCnxnFactory@192] - Accepted socket > > connection > > from /192.72.7.205:46694 > > 2018-10-22 06:04:51,468 [myid:2] - WARN [NIOServerCxn.Factory: > > 0.0.0.0/0.0.0.0:2182:NIOServerCnxn@373] - Exception causing close of > > session 0x0 due to java.io.IOException: ZooKeeperServer not running > > 2018-10-22 06:04:51,468 [myid:2] - INFO [NIOServerCxn.Factory: > > 0.0.0.0/0.0.0.0:2182:NIOServerCnxn@1044] - Closed socket connection for > > client /192.72.7.205:46694 (no session established for client) > > 2018-10-22 06:04:51,476 [myid:2] - INFO > > [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2182:QuorumPeer@935] - FOLLOWING > > 2018-10-22 06:04:51,476 [myid:2] - INFO > > [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2182:ZooKeeperServer@173] - Created > > server with tickTime 2000 minSessionTimeout 4000 maxSessionTimeout 40000 > > datadir /app/zookeeper-3.4.10/log/version-2 snapdir > > /app/zookeeper-3.4.10/data/version-2 > > 2018-10-22 06:04:51,476 [myid:2] - INFO > > [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2182:Follower@64] - FOLLOWING - > LEADER > > ELECTION TOOK - 1008 > > 2018-10-22 06:04:51,477 [myid:2] - INFO > > [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2182:QuorumPeer$QuorumServer@167] - > > Resolved hostname: srch0120.whc.dc02.us.adp to address: srch0120/ > > 192.3.35.243 > > > > > > solr.log > > 2018-10-22 10:02:21.466 WARN (main-SendThread(srch0118:2182)) [ ] > > o.a.z.ClientCnxn Client session timed out, have not heard from server in > > 26675ms for sessionid 0x5665c67cb0d000b > > 2018-10-22 10:02:48.891 WARN (main-SendThread(srch0119:2182)) [ ] > > o.a.z.ClientCnxn Client session timed out, have not heard from server in > > 26667ms for sessionid 0x5665c67cb0d000b > > 2018-10-22 10:02:57.720 WARN (main-SendThread(srch0117:2182)) [ ] > > o.a.z.ClientCnxn Client session timed out, have not heard from server in > > 8007ms for sessionid 0x5665c67cb0d000b > > 2018-10-22 10:03:06.776 WARN (main-SendThread(srch0120:2182)) [ ] > > o.a.z.ClientCnxn Client session timed out, have not heard from server in > > 8006ms for sessionid 0x5665c67cb0d000b > > 2018-10-22 10:03:15.734 WARN (main-SendThread(srch0121:2182)) [ ] > > o.a.z.ClientCnxn Client session timed out, have not heard from server in > > 8004ms for sessionid 0x5665c67cb0d000b > > 2018-10-22 10:03:24.568 WARN (main-SendThread(srch0118:2182)) [ ] > > o.a.z.ClientCnxn Client session timed out, have not heard from server in > > 8003ms for sessionid 0x5665c67cb0d000b > > 2018-10-22 10:03:33.844 WARN (main-SendThread(srch0119:2182)) [ ] > > o.a.z.ClientCnxn Client session timed out, have not heard from server in > > 8001ms for sessionid 0x5665c67cb0d000b > > 2018-10-22 10:03:42.377 WARN (main-SendThread(srch0117:2182)) [ ] > > o.a.z.ClientCnxn Client session timed out, have not heard from server in > > 8008ms for sessionid 0x5665c67cb0d000b > > 2018-10-22 10:03:51.138 WARN (main-SendThread(srch0120:2182)) [ ] > > o.a.z.ClientCnxn Client session timed out, have not heard from server in > > 8004ms for sessionid 0x5665c67cb0d000b > > 2018-10-22 10:04:10.229 WARN (main-SendThread(srch0119:2182)) [ ] > > o.a.z.ClientCnxn Client session timed out, have not heard from server in > > 8008ms for sessionid 0x5665c67cb0d000b > > 2018-10-22 10:04:20.669 WARN (main-SendThread(srch0119:2182)) [ ] > > o.a.z.ClientCnxn Client session timed out, have not heard from server in > > 9272ms for sessionid 0x5665c67cb0d000b > > 2018-10-22 10:04:34.076 WARN (main-SendThread(srch0119:2182)) [ ] > > o.a.z.ClientCnxn Client session timed out, have not heard from server in > > 9873ms for sessionid 0x5665c67cb0d000b > > >