[ https://issues.apache.org/jira/browse/ZOOKEEPER-1732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13807102#comment-13807102 ]
Raul Gutierrez Segales commented on ZOOKEEPER-1732:
---------------------------------------------------
[~fpj], [~abranzyck]: did you guys test this patch when joining a cluster of
servers running without it (i.e., trunk, only without this patch)?
After rolling the first 2 followers in a 5-member ensemble, the 3rd follower
fails to join with this:
{noformat}
2013-10-28 18:43:18,134 - INFO [WorkerReceiver[myid=4]] - Notification: 4
(n.leader), 0x8900000415 (n.zxid), 0x6 (n.round), LOOKING (n.state), 4 (n.sid),
0x89 (n.peerEPoch), LOOKING (my state)0 (n.config version)
2013-10-28 18:43:18,134 - INFO [WorkerReceiver[myid=4]] - Notification: 2
(n.leader), 0x880000002c (n.zxid), 0xffffffffffffffff (n.round), FOLLOWING
(n.state), 0 (n.sid), 0x89 (n.peerEPoch), LOOKING (my state)0 (n.config version)
2013-10-28 18:43:18,135 - INFO [WorkerReceiver[myid=4]] - Notification: 2
(n.leader), 0x880000002c (n.zxid), 0x6 (n.round), LEADING (n.state), 2 (n.sid),
0x88 (n.peerEPoch), LOOKING (my state)0 (n.config version)
2013-10-28 18:43:18,135 - INFO [WorkerReceiver[myid=4]] - Notification: 2
(n.leader), 0x880000002c (n.zxid), 0x6 (n.round), FOLLOWING (n.state), 3
(n.sid), 0x88 (n.peerEPoch), LOOKING (my state)0 (n.config version)
2013-10-28 18:43:18,136 - INFO [WorkerReceiver[myid=4]] - Notification: 2
(n.leader), 0x880000002c (n.zxid), 0xffffffffffffffff (n.round), FOLLOWING
(n.state), 1 (n.sid), 0x89 (n.peerEPoch), LOOKING (my state)0 (n.config version)
{noformat}
I am guessing that IGNOREVALUE (0xffffffffffffffff) as the round value is
causing the issue. What is the expected behavior here (i.e., when dealing with
cluster members that don't have this patch during an upgrade)?
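To make the guess concrete, here is a minimal sketch (not the actual
FastLeaderElection code) of how the round values from the log above compare.
The class and method names are hypothetical. It assumes only that the round is
carried as a signed 64-bit long and that a notification counts toward the
current election when its round equals the local one, which may not match what
the patch really does.

```java
import java.math.BigInteger;

public class RoundValueCheck {
    // Parses a round value as printed in the log (e.g. "0x6") into a signed long.
    // BigInteger.longValue() keeps the low 64 bits, so 0xffffffffffffffff becomes -1.
    public static long parseRound(String hex) {
        String digits = hex.startsWith("0x") ? hex.substring(2) : hex;
        return new BigInteger(digits, 16).longValue();
    }

    // Hypothetical check (an assumption, not the real election logic):
    // a notification belongs to our election only if the rounds are equal.
    public static boolean sameRound(long notificationRound, long localRound) {
        return notificationRound == localRound;
    }

    public static void main(String[] args) {
        long localRound = parseRound("0x6");                  // n.round sent by myid=4
        long ignoreRound = parseRound("0xffffffffffffffff");  // n.round from sid 0 and sid 1

        System.out.println(ignoreRound);                                // -1
        System.out.println(sameRound(ignoreRound, localRound));         // false
        System.out.println(sameRound(parseRound("0x6"), localRound));   // true
    }
}
```

Under that assumption, the two FOLLOWING notifications from the already rolled
servers (sid 0 and sid 1) would never match round 0x6, while the ones from
sid 2 and sid 3 would.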
> ZooKeeper server unable to join established ensemble
> ----------------------------------------------------
>
> Key: ZOOKEEPER-1732
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1732
> Project: ZooKeeper
> Issue Type: Bug
> Components: leaderElection
> Affects Versions: 3.4.5
> Environment: Windows 7, Java 1.7
> Reporter: Germán Blanco
> Assignee: Germán Blanco
> Priority: Blocker
> Fix For: 3.4.6, 3.5.0
>
> Attachments: CREATE_INCONSISTENCIES_patch.txt, zklog.tar.gz,
> ZOOKEEPER-1732-3.4.patch, ZOOKEEPER-1732-3.4.patch, ZOOKEEPER-1732-3.4.patch,
> ZOOKEEPER-1732-3.4.patch, ZOOKEEPER-1732-b3.4.patch,
> ZOOKEEPER-1732-b3.4.patch, ZOOKEEPER-1732.patch, ZOOKEEPER-1732.patch,
> ZOOKEEPER-1732.patch, ZOOKEEPER-1732.patch, ZOOKEEPER-1732.patch
>
>
> I have a test in which I do a rolling restart of three ZooKeeper servers and
> it was failing from time to time.
> I ran the tests in a loop until the failure came out and it seems that at
> some point one of the servers is unable to join the ensemble formed by the
> other two.