follower can not sync with leader
---------------------------------
Key: ZOOKEEPER-1115
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1115
Project: ZooKeeper
Issue Type: Bug
Components: quorum
Affects Versions: 3.3.3, 3.3.0
Environment: linux rhel 4, x64, java version 1.4.2
Reporter: helei
Priority: Critical
there are 5 members in the quorum. one follower can not sync with leader after
restart. it seems leader has close the data connection with this follower
because of read timeout. here is the key log in follower:
2011-06-30 22:14:45,069 - WARN [Thread-17:QuorumCnxManager$RecvWorker@658] -
Connection broken:
java.nio.channels.ClosedChannelException
at
sun.nio.ch.SocketChannelImpl.ensureReadOpen(SocketChannelImpl.java:113)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:156)
at
org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:629)
2011-06-30 22:14:45,069 - INFO
[QuorumPeer:/0.0.0.0:2181:FastLeaderElection@689] - Notification: 3,
17198470148, 3, 3, LOOKING, LOOKING, 3
2011-06-30 22:14:45,070 - ERROR [Thread-16:QuorumCnxManager$SendWorker@559] -
Failed to send last message. Shutting down thread.
java.nio.channels.ClosedChannelException
at
sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:126)
at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:324)
at
org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.send(QuorumCnxManager.java:548)
at
org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:557)
2011-06-30 22:14:45,082 - INFO [QuorumPeer:/0.0.0.0:2181:Learner@282] -
Getting a diff from the leader 0x4011bd462
2011-06-30 22:14:45,083 - WARN [Thread-18:QuorumCnxManager$SendWorker@589] -
Send worker leaving thread
2011-06-30 22:14:45,085 - WARN [QuorumPeer:/0.0.0.0:2181:Follower@116] - Got
zxid 0x4011bd405 expected 0x1
2011-06-30 22:14:45,090 - INFO [QuorumPeer:/0.0.0.0:2181:FileTxnSnapLog@208] -
Snapshotting: 4011bd462
2011-06-30 22:14:53,397 - WARN [SyncThread:3:SendAckRequestProcessor@63] -
Closing connection to leader, exception during packet send
java.net.SocketException: Broken pipe
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
at
java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
at
org.apache.zookeeper.server.quorum.Learner.writePacket(Learner.java:126)
at
org.apache.zookeeper.server.quorum.SendAckRequestProcessor.flush(SendAckRequestProcessor.java:61)
at
org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:164)
at
org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:98)
2011-06-30 22:14:53,398 - WARN [QuorumPeer:/0.0.0.0:2181:Follower@82] -
Exception when following the leader
java.net.SocketException: Socket closed
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:99)
at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
at
java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
at
org.apache.zookeeper.server.quorum.Learner.writePacket(Learner.java:126)
at org.apache.zookeeper.server.quorum.Learner.ping(Learner.java:358)
at
org.apache.zookeeper.server.quorum.Follower.processPacket(Follower.java:108)
at
org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:79)
at
org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:634)
2011-06-30 22:14:53,398 - WARN [SyncThread:3:SendAckRequestProcessor@63] -
Closing connection to leader, exception during packet send
java.net.SocketException: Socket closed
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:99)
at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
at
java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
at
org.apache.zookeeper.server.quorum.Learner.writePacket(Learner.java:126)
at
org.apache.zookeeper.server.quorum.SendAckRequestProcessor.flush(SendAckRequestProcessor.java:61)
at
org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:164)
at
org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:98)
2011-06-30 22:14:53,399 - INFO [QuorumPeer:/0.0.0.0:2181:Follower@166] -
shutdown called
java.lang.Exception: shutdown Follower
at
org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166)
at
org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:638)
and these are the leader's:
2011-06-30 22:14:35,943 - ERROR
[LearnerHandler-/10.23.247.163:14975:LearnerHandler@444] - Unexpected exception
causing shutdown while sock still open
java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:129)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
at java.io.DataInputStream.readInt(DataInputStream.java:370)
at
org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
at
org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:84)
at
org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
at
org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:358)
2011-06-30 22:14:35,943 - WARN
[LearnerHandler-/10.23.247.163:14975:LearnerHandler@457] - ******* GOODBYE
/10.23.247.163:14975 ********
2011-06-30 22:14:48,943 - ERROR [CommitProcessor:4:NIOServerCnxn@422] -
Unexpected Exception:
java.nio.channels.CancelledKeyException
at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55)
at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:59)
at
org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:395)
at
org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1360)
at
org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:367)
at
org.apache.zookeeper.server.quorum.Leader$ToBeAppliedRequestProcessor.processRequest(Leader.java:535)
at
org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:73)
2011-06-30 22:14:49,084 - ERROR
[LearnerHandler-/10.23.247.163:14998:LearnerHandler@444] - Unexpected exception
causing shutdown while sock still open
java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:129)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
at java.io.DataInputStream.readInt(DataInputStream.java:370)
at
org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
at
org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:84)
at
org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
at
org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:358)
2011-06-30 22:14:49,084 - WARN
[LearnerHandler-/10.23.247.163:14998:LearnerHandler@457] - ******* GOODBYE
/10.23.247.163:14998 ********
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira