zechao zheng created ZOOKEEPER-3634:
---------------------------------------

             Summary: Why does a huge ZooKeeper snapshot cause a waitForEpochAck timeout?
                 Key: ZOOKEEPER-3634
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3634
             Project: ZooKeeper
          Issue Type: Bug
            Reporter: zechao zheng


h4. Question

After a large number of znodes are created, the servers in the ZooKeeper 
cluster fail and can neither recover automatically nor be restarted.

Logs of the follower:
2016-06-23 08:00:18,763 | WARN  | QuorumPeer[myid=26](plain=/10.16.9.138:24002)(secure=disabled) | Exception when following the leader | org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:93)
java.net.SocketTimeoutException: Read timed out
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
    at java.net.SocketInputStream.read(SocketInputStream.java:170)
    at java.net.SocketInputStream.read(SocketInputStream.java:141)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
    at java.io.DataInputStream.readInt(DataInputStream.java:387)
    at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
    at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
    at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:99)
    at org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:156)
    at org.apache.zookeeper.server.quorum.Learner.registerWithLeader(Learner.java:276)
    at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:75)
    at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1094)
Logs of the leader:
2016-06-23 07:30:57,481 | WARN  | QuorumPeer[myid=25](plain=/10.16.9.136:24002)(secure=disabled) | Unexpected exception | org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1108)
java.lang.InterruptedException: Timeout while waiting for epoch to be acked by quorum
    at org.apache.zookeeper.server.quorum.Leader.waitForEpochAck(Leader.java:1221)
    at org.apache.zookeeper.server.quorum.Leader.lead(Leader.java:487)
    at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1105)
2016-06-23 07:30:57,482 | INFO  | QuorumPeer[myid=25](plain=/10.16.9.136:24002)(secure=disabled) | Shutdown called | org.apache.zookeeper.server.quorum.Leader.shutdown(Leader.java:623)
java.lang.Exception: shutdown Leader! reason: Forcing shutdown
    at org.apache.zookeeper.server.quorum.Leader.shutdown(Leader.java:623)
    at org.apache.zookeeper.server.quorum.QuorumPeer.stopLeader(QuorumPeer.java:1149)
    at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1110)
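
For context on the two timeouts above: in the ZooKeeper source, both the leader's wait in waitForEpochAck and the learner's socket read timeout are bounded by initLimit * tickTime milliseconds. The sketch below only works through that arithmetic. The class EpochAckWindow, its methods, and all sample values are hypothetical and are not taken from this cluster's configuration. Whether snapshot loading is what actually exceeds the window here is an assumption that the logs alone do not confirm.

```java
/** Illustrative helper, not part of ZooKeeper: arithmetic for the initLimit * tickTime window. */
public final class EpochAckWindow {
    private EpochAckWindow() {}

    /** Window in milliseconds: initLimit ticks of tickTimeMs milliseconds each. */
    public static long windowMillis(int tickTimeMs, int initLimit) {
        return (long) tickTimeMs * initLimit;
    }

    /** Smallest initLimit whose window is at least neededMs (ceiling division). */
    public static int minInitLimit(int tickTimeMs, long neededMs) {
        return (int) ((neededMs + tickTimeMs - 1) / tickTimeMs);
    }

    public static void main(String[] args) {
        int tickTimeMs = 2000;      // assumed sample value, not the reporter's setting
        int initLimit = 10;         // assumed sample value, not the reporter's setting
        long snapshotLoadMs = 90_000; // hypothetical time to load a large snapshot

        System.out.println("window (ms): " + windowMillis(tickTimeMs, initLimit));
        System.out.println("initLimit needed: " + minInitLimit(tickTimeMs, snapshotLoadMs));
    }
}
```

If startup work such as loading a large snapshot takes longer than this window, raising initLimit in zoo.cfg is one setting worth checking.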



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
