[
https://issues.apache.org/jira/browse/ZOOKEEPER-3634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
zechao zheng updated ZOOKEEPER-3634:
------------------------------------
Affects Version/s: 3.4.5
> Why does a huge ZooKeeper snapshot cause a waitForEpochAck timeout?
> -------------------------------------------------------------------
>
> Key: ZOOKEEPER-3634
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3634
> Project: ZooKeeper
> Issue Type: Bug
> Affects Versions: 3.4.5
> Reporter: zechao zheng
> Priority: Blocker
>
> h4. Question
> After a large number of znodes are created, the servers in the ZooKeeper
> cluster become faulty and cannot recover or restart automatically.
> Logs of the follower:
> 2016-06-23 08:00:18,763 | WARN | QuorumPeer[myid=26](plain=/10.16.9.138:24002)(secure=disabled) | Exception when following the leader | org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:93)
> java.net.SocketTimeoutException: Read timed out
> at java.net.SocketInputStream.socketRead0(Native Method)
> at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
> at java.net.SocketInputStream.read(SocketInputStream.java:170)
> at java.net.SocketInputStream.read(SocketInputStream.java:141)
> at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
> at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
> at java.io.DataInputStream.readInt(DataInputStream.java:387)
> at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
> at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
> at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:99)
> at org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:156)
> at org.apache.zookeeper.server.quorum.Learner.registerWithLeader(Learner.java:276)
> at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:75)
> at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1094)
> Logs of the leader:
> 2016-06-23 07:30:57,481 | WARN | QuorumPeer[myid=25](plain=/10.16.9.136:24002)(secure=disabled) | Unexpected exception | org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1108)
> java.lang.InterruptedException: Timeout while waiting for epoch to be acked by quorum
> at org.apache.zookeeper.server.quorum.Leader.waitForEpochAck(Leader.java:1221)
> at org.apache.zookeeper.server.quorum.Leader.lead(Leader.java:487)
> at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1105)
> 2016-06-23 07:30:57,482 | INFO | QuorumPeer[myid=25](plain=/10.16.9.136:24002)(secure=disabled) | Shutdown called | org.apache.zookeeper.server.quorum.Leader.shutdown(Leader.java:623)
> java.lang.Exception: shutdown Leader! reason: Forcing shutdown
> at org.apache.zookeeper.server.quorum.Leader.shutdown(Leader.java:623)
> at org.apache.zookeeper.server.quorum.QuorumPeer.stopLeader(QuorumPeer.java:1149)
> at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1110)
--
This message was sent by Atlassian Jira
(v8.3.4#803005)