[
https://issues.apache.org/jira/browse/ZOOKEEPER-1115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13530788#comment-13530788
]
Patrick Hunt commented on ZOOKEEPER-1115:
-----------------------------------------
Is this addressed in ZOOKEEPER-1521 ? i.e. you can set the initlimit higher to
allow for longer initialization while not having to modify synclimit.
> follower can not sync with leader
> ---------------------------------
>
> Key: ZOOKEEPER-1115
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1115
> Project: ZooKeeper
> Issue Type: Bug
> Components: quorum
> Affects Versions: 3.3.0, 3.3.3
> Environment: linux rhel 4, x64, java version 1.4.2
> Reporter: helei
> Priority: Critical
>
> there are 5 members in the quorum. one follower can not sync with leader
> after restart. it seems leader has close the data connection with this
> follower because of read timeout. here is the key log in follower:
> {noformat}
> 2011-06-30 22:14:45,069 - WARN [Thread-17:QuorumCnxManager$RecvWorker@658] -
> Connection broken:
> java.nio.channels.ClosedChannelException
> at
> sun.nio.ch.SocketChannelImpl.ensureReadOpen(SocketChannelImpl.java:113)
> at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:156)
> at
> org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:629)
> 2011-06-30 22:14:45,069 - INFO
> [QuorumPeer:/0.0.0.0:2181:FastLeaderElection@689] - Notification: 3,
> 17198470148, 3, 3, LOOKING, LOOKING, 3
> 2011-06-30 22:14:45,070 - ERROR [Thread-16:QuorumCnxManager$SendWorker@559] -
> Failed to send last message. Shutting down thread.
> java.nio.channels.ClosedChannelException
> at
> sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:126)
> at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:324)
> at
> org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.send(QuorumCnxManager.java:548)
> at
> org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:557)
> 2011-06-30 22:14:45,082 - INFO [QuorumPeer:/0.0.0.0:2181:Learner@282] -
> Getting a diff from the leader 0x4011bd462
> 2011-06-30 22:14:45,083 - WARN [Thread-18:QuorumCnxManager$SendWorker@589] -
> Send worker leaving thread
> 2011-06-30 22:14:45,085 - WARN [QuorumPeer:/0.0.0.0:2181:Follower@116] - Got
> zxid 0x4011bd405 expected 0x1
> 2011-06-30 22:14:45,090 - INFO [QuorumPeer:/0.0.0.0:2181:FileTxnSnapLog@208]
> - Snapshotting: 4011bd462
> 2011-06-30 22:14:53,397 - WARN [SyncThread:3:SendAckRequestProcessor@63] -
> Closing connection to leader, exception during packet send
> java.net.SocketException: Broken pipe
> at java.net.SocketOutputStream.socketWrite0(Native Method)
> at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
> at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
> at
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
> at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
> at
> org.apache.zookeeper.server.quorum.Learner.writePacket(Learner.java:126)
> at
> org.apache.zookeeper.server.quorum.SendAckRequestProcessor.flush(SendAckRequestProcessor.java:61)
> at
> org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:164)
> at
> org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:98)
> 2011-06-30 22:14:53,398 - WARN [QuorumPeer:/0.0.0.0:2181:Follower@82] -
> Exception when following the leader
> java.net.SocketException: Socket closed
> at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:99)
> at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
> at
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
> at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
> at
> org.apache.zookeeper.server.quorum.Learner.writePacket(Learner.java:126)
> at org.apache.zookeeper.server.quorum.Learner.ping(Learner.java:358)
> at
> org.apache.zookeeper.server.quorum.Follower.processPacket(Follower.java:108)
> at
> org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:79)
> at
> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:634)
> 2011-06-30 22:14:53,398 - WARN [SyncThread:3:SendAckRequestProcessor@63] -
> Closing connection to leader, exception during packet send
> java.net.SocketException: Socket closed
> at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:99)
> at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
> at
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
> at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
> at
> org.apache.zookeeper.server.quorum.Learner.writePacket(Learner.java:126)
> at
> org.apache.zookeeper.server.quorum.SendAckRequestProcessor.flush(SendAckRequestProcessor.java:61)
> at
> org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:164)
> at
> org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:98)
> 2011-06-30 22:14:53,399 - INFO [QuorumPeer:/0.0.0.0:2181:Follower@166] -
> shutdown called
> java.lang.Exception: shutdown Follower
> at
> org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166)
> at
> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:638)
> and these are the leader's:
> 2011-06-30 22:14:35,943 - ERROR
> [LearnerHandler-/10.23.247.163:14975:LearnerHandler@444] - Unexpected
> exception causing shutdown while sock still open
> java.net.SocketTimeoutException: Read timed out
> at java.net.SocketInputStream.socketRead0(Native Method)
> at java.net.SocketInputStream.read(SocketInputStream.java:129)
> at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
> at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
> at java.io.DataInputStream.readInt(DataInputStream.java:370)
> at
> org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
> at
> org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:84)
> at
> org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
> at
> org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:358)
> 2011-06-30 22:14:35,943 - WARN
> [LearnerHandler-/10.23.247.163:14975:LearnerHandler@457] - ******* GOODBYE
> /10.23.247.163:14975 ********
> 2011-06-30 22:14:48,943 - ERROR [CommitProcessor:4:NIOServerCnxn@422] -
> Unexpected Exception:
> java.nio.channels.CancelledKeyException
> at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55)
> at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:59)
> at
> org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:395)
> at
> org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1360)
> at
> org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:367)
> at
> org.apache.zookeeper.server.quorum.Leader$ToBeAppliedRequestProcessor.processRequest(Leader.java:535)
> at
> org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:73)
> 2011-06-30 22:14:49,084 - ERROR
> [LearnerHandler-/10.23.247.163:14998:LearnerHandler@444] - Unexpected
> exception causing shutdown while sock still open
> java.net.SocketTimeoutException: Read timed out
> at java.net.SocketInputStream.socketRead0(Native Method)
> at java.net.SocketInputStream.read(SocketInputStream.java:129)
> at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
> at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
> at java.io.DataInputStream.readInt(DataInputStream.java:370)
> at
> org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
> at
> org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:84)
> at
> org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
> at
> org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:358)
> 2011-06-30 22:14:49,084 - WARN
> [LearnerHandler-/10.23.247.163:14998:LearnerHandler@457] - ******* GOODBYE
> /10.23.247.163:14998 ********
> {noformat}
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira