I just had another one of these events today. Is there anyone who has any insight here? Should I just file a JIRA? Thanks!
On Wed, Aug 14, 2013 at 2:12 PM, Stephen Tyree <[email protected]> wrote:

> Hello all,
>
> Recently one of our production zookeeper nodes in a cluster of 5 suddenly
> dropped all of its connections (about 3000-4000), shortly thereafter
> accepting connections once more. Looking at the logs, I saw the following
> event occur:
>
> 2013-08-12 04:50:32,670 - WARN [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:Follower@89] - Exception when following the leader
> java.io.EOFException
>         at java.io.DataInputStream.readInt(DataInputStream.java:375)
>         at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
>         at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
>         at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
>         at org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:152)
>         at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)
>         at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:740)
> 2013-08-12 04:50:32,670 - INFO [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:Follower@166] - shutdown called
> java.lang.Exception: shutdown Follower
>         at org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166)
>         at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:744)
>
> It seems that Java gets an exception when it hits end of file when reading
> from the leader socket, and as a result the Follower immediately shuts
> down. Is this expected behavior? Why would all of the connections need to
> be closed? Why can't the follower just reconnect to the leader?
>
> Thanks,
> Steve Tyree
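
For anyone reading the trace above: the top frame is plain JDK behavior. DataInputStream.readInt() throws EOFException when the underlying stream ends before four bytes are available. On a socket stream that typically means the other side closed the connection. Below is a minimal sketch of just that behavior. It is not ZooKeeper code: the class and method names are made up for illustration, and a byte array stands in for the leader socket.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

// Hypothetical demo class, not part of ZooKeeper.
public class EofOnReadInt {

    // Returns true if readInt() reached end of stream before it could
    // read a full 4-byte int, false if the int was read successfully.
    static boolean readIntHitsEof(byte[] bytes) throws IOException {
        // The byte array is a stand-in for the socket's input stream.
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(bytes))) {
            in.readInt();
            return false;
        } catch (EOFException e) {
            // Same exception type as the top frame of the quoted trace.
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        // Empty stream: behaves like a connection closed by the peer.
        System.out.println(readIntHitsEof(new byte[0]));
        // Four bytes available: the int is read normally.
        System.out.println(readIntHitsEof(new byte[] {0, 0, 0, 1}));
    }
}
```

This only shows where the exception comes from. It says nothing about why the leader side of the connection went away, or why the follower drops its client connections afterwards, which is still the open question.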
