[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13755573#comment-13755573
 ] 

Flavio Junqueira commented on ZOOKEEPER-1599:
---------------------------------------------

Thanks for the feedback, Pat. I don't mind adding the release note, but let me 
go over a couple of things. The multi-op jira only got into the 3.4 branch, so 
to make the servers compatible, we would need to backport ZOOKEEPER-965. The 
other jira that [~skye] claims to be a problem is ZOOKEEPER-882. I had a look 
at ZOOKEEPER-882 and it should have got into 3.3.3 according to the jira 
header, but I only see one commit (from me, actually) and not the second, so I 
suspect the 3.3 branch ended up not having that patch applied.

I'll check what happened with ZOOKEEPER-882 and, if it really didn't get in, 
whether the patch still applies. If it does, would it make sense to have 
another release for the 3.3 branch? I don't think it completely solves the 
problem of this jira, because the multi-op jira is not in, and it might be some 
work to get it in. 
                
> 3.3 server cannot join 3.4 quorum
> ---------------------------------
>
>                 Key: ZOOKEEPER-1599
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1599
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.3.6, 3.4.5
>            Reporter: Skye Wanderman-Milne
>            Assignee: Skye Wanderman-Milne
>            Priority: Blocker
>             Fix For: 3.4.6
>
>         Attachments: ZOOKEEPER-1599.patch
>
>
> When a 3.3 server attempts to join an existing quorum led by a 3.4 server, 
> the 3.3 server is disconnected while trying to download the leader's 
> snapshot. The 3.3 server restarts and starts the process over again, but is 
> never able to join the quorum.
> 3.3 server log:
> {code}
> 2012-12-07 10:44:34,582 - INFO  
> [QuorumPeer:/0:0:0:0:0:0:0:0:2183:Learner@294] - Getting a snapshot from 
> leader
> 2012-12-07 10:44:34,582 - INFO  
> [QuorumPeer:/0:0:0:0:0:0:0:0:2183:Learner@325] - Setting leader epoch 12
> 2012-12-07 10:44:54,604 - WARN  
> [QuorumPeer:/0:0:0:0:0:0:0:0:2183:Follower@82] - Exception when following the 
> leader
> java.io.EOFException
>         at java.io.DataInputStream.readInt(DataInputStream.java:392)
>         at 
> org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
>         at 
> org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:84)
>         at 
> org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
>         at 
> org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:148)
>         at 
> org.apache.zookeeper.server.quorum.Learner.syncWithLeader(Learner.java:332)
>         at 
> org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:75)
>         at 
> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:645)
> 2012-12-07 10:44:54,605 - INFO  
> [QuorumPeer:/0:0:0:0:0:0:0:0:2183:Follower@165] - shutdown called
> java.lang.Exception: shutdown Follower
>         at 
> org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:165)
>         at 
> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:649)
> {code}
> 3.4 leader log:
> {code}
> 2012-12-07 10:51:35,178 [myid:2] - INFO  
> [WorkerReceiver[myid=2]:FastLeaderElection$Messenger$WorkerReceiver@273] - 
> Backward compatibility mode, server id=3
> 2012-12-07 10:51:35,178 [myid:2] - INFO  
> [WorkerReceiver[myid=2]:FastLeaderElection@542] - Notification: 3 (n.leader), 
> 0x1100000000 (n.zxid), 0x2 (n.round), LOOKING (n.state), 3 (n.sid), 0x11 
> (n.peerEPoch), LEADING (my state)
> 2012-12-07 10:51:35,182 [myid:2] - INFO  
> [LearnerHandler-/127.0.0.1:37654:LearnerHandler@263] - Follower sid: 3 : info 
> : org.apache.zookeeper.server.quorum.QuorumPeer$QuorumServer@262f4873
> 2012-12-07 10:51:35,182 [myid:2] - INFO  
> [LearnerHandler-/127.0.0.1:37654:LearnerHandler@318] - Synchronizing with 
> Follower sid: 3 maxCommittedLog=0x0 minCommittedLog=0x0 
> peerLastZxid=0x1100000000
> 2012-12-07 10:51:35,182 [myid:2] - INFO  
> [LearnerHandler-/127.0.0.1:37654:LearnerHandler@395] - Sending SNAP
> 2012-12-07 10:51:35,183 [myid:2] - INFO  
> [LearnerHandler-/127.0.0.1:37654:LearnerHandler@419] - Sending snapshot last 
> zxid of peer is 0x1100000000  zxid of leader is 0x1200000000sent zxid of db 
> as 0x1200000000
> 2012-12-07 10:51:55,204 [myid:2] - ERROR 
> [LearnerHandler-/127.0.0.1:37654:LearnerHandler@562] - Unexpected exception 
> causing shutdown while sock still open
> java.net.SocketTimeoutException: Read timed out
>         at java.net.SocketInputStream.socketRead0(Native Method)
>         at java.net.SocketInputStream.read(SocketInputStream.java:150)
>         at java.net.SocketInputStream.read(SocketInputStream.java:121)
>         at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
>         at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
>         at java.io.DataInputStream.readInt(DataInputStream.java:387)
>         at 
> org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
>         at 
> org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
>         at 
> org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
>         at 
> org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:450)
> 2012-12-07 10:51:55,205 [myid:2] - WARN  
> [LearnerHandler-/127.0.0.1:37654:LearnerHandler@575] - ******* GOODBYE 
> /127.0.0.1:37654 ********
> {code}

