[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Damien Diederen updated ZOOKEEPER-3634:
---------------------------------------
    Priority: Critical  (was: Blocker)

> why zookeeper huge snapshot cause waitEpochAck timeout?
> -------------------------------------------------------
>
>                 Key: ZOOKEEPER-3634
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3634
>             Project: ZooKeeper
>          Issue Type: Bug
>    Affects Versions: 3.4.5
>            Reporter: zechao zheng
>            Priority: Critical
>
> h4. Question
> After a large number of znodes are created, the servers in the ZooKeeper 
> cluster become faulty and cannot recover or restart automatically.
> Logs of the follower:
> 2016-06-23 08:00:18,763 | WARN  | 
> QuorumPeer[myid=26](plain=/10.16.9.138:24002)(secure=disabled) | Exception 
> when following the leader | 
> org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:93)
> java.net.SocketTimeoutException: Read timed out
>     at java.net.SocketInputStream.socketRead0(Native Method)
>     at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
>     at java.net.SocketInputStream.read(SocketInputStream.java:170)
>     at java.net.SocketInputStream.read(SocketInputStream.java:141)
>     at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
>     at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
>     at java.io.DataInputStream.readInt(DataInputStream.java:387)
>     at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
>     at 
> org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
>     at 
> org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:99)
>     at org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:156)
>     at 
> org.apache.zookeeper.server.quorum.Learner.registerWithLeader(Learner.java:276)
>     at 
> org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:75)
>     at 
> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1094)
> Logs of the leader:
> 2016-06-23 07:30:57,481 | WARN  | 
> QuorumPeer[myid=25](plain=/10.16.9.136:24002)(secure=disabled) | Unexpected 
> exception | 
> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1108)
> java.lang.InterruptedException: Timeout while waiting for epoch to be acked 
> by quorum
>     at 
> org.apache.zookeeper.server.quorum.Leader.waitForEpochAck(Leader.java:1221)
>     at org.apache.zookeeper.server.quorum.Leader.lead(Leader.java:487)
>     at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1105)
> 2016-06-23 07:30:57,482 | INFO  | 
> QuorumPeer[myid=25](plain=/10.16.9.136:24002)(secure=disabled) | Shutdown 
> called | org.apache.zookeeper.server.quorum.Leader.shutdown(Leader.java:623)
> java.lang.Exception: shutdown Leader! reason: Forcing shutdown
>     at org.apache.zookeeper.server.quorum.Leader.shutdown(Leader.java:623)
>     at 
> org.apache.zookeeper.server.quorum.QuorumPeer.stopLeader(QuorumPeer.java:1149)
>     at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1110)
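
Note on the two traces above: the leader gives up in waitForEpochAck and the follower's read times out inside registerWithLeader, so both sides appear to run out of time during the initial handshake. As far as I understand, ZooKeeper bounds that phase by initLimit ticks of tickTime milliseconds each, so one plausible reading is that handling a very large snapshot takes longer than that budget. The sketch below only works through the arithmetic. The class HandshakeBudget, its method names, and the sample numbers are hypothetical and are not taken from ZooKeeper or from this report.

```java
// Hypothetical helper, not part of ZooKeeper: works out the time budget
// implied by the tickTime and initLimit settings in zoo.cfg.
final class HandshakeBudget {

    /** Budget in milliseconds, assuming the limit is initLimit ticks of tickTime ms. */
    static long budgetMillis(int tickTimeMs, int initLimit) {
        return (long) tickTimeMs * initLimit;
    }

    /** Smallest initLimit whose budget covers an observed duration in milliseconds. */
    static int minInitLimit(int tickTimeMs, long observedMillis) {
        if (tickTimeMs <= 0) {
            throw new IllegalArgumentException("tickTime must be positive");
        }
        // Ceiling division: round up to a whole number of ticks.
        return (int) ((observedMillis + tickTimeMs - 1) / tickTimeMs);
    }

    public static void main(String[] args) {
        int tickTimeMs = 2000;       // example value, not from the report
        int initLimit = 10;          // example value, not from the report
        long observedMillis = 45000; // assumed time a peer needs, for illustration only

        System.out.println("budget ms: " + budgetMillis(tickTimeMs, initLimit));
        System.out.println("initLimit needed: " + minInitLimit(tickTimeMs, observedMillis));
    }
}
```

If the measured time a peer needs to get through the handshake exceeds the computed budget, raising initLimit in zoo.cfg (or reducing the amount of data held in znodes) would be the first thing to try. Whether that is the cause here would need to be confirmed against the actual configuration.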



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
