[ https://issues.apache.org/jira/browse/ZOOKEEPER-1805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13812455#comment-13812455 ]
Germán Blanco commented on ZOOKEEPER-1805:
------------------------------------------

As far as I can see, there is never a mix of messages with and without don't care values. The don't care values never get sent over the network ... or at least that was not intentional.

I have noticed that the current value (-1) happens to be the same value that some of the incomplete constructors in Vote.java used by default, and this is why the value appears for the epoch in the traces sent by Raúl (note that the epoch was not set to the don't care value in this case). But it has nothing to do with the patch for ZOOKEEPER-1732. You can see that, for example, zxid does not have a don't care value in these traces.

What your change does is this: if there is a don't care value, it checks whether the epoch is greater or equal between the vote with the don't care value and the other vote. All votes in the outofelection collection have don't care values, so the result is that the comparison ignores the value of the epoch in all cases. When both votes being compared have don't care values, the comparison succeeds whether the epoch is greater or equal or smaller or equal.

The same result would have been achieved by setting the epoch to the don't care value when inserting the vote in the outofelection collection (and in the call to termPredicate) and not making any changes at all to the comparisons in Vote.java. In that case, the changes in Learner.java, Leader.java and QuorumPeer.java would also no longer serve any purpose, since all they do is set the epoch to a common value in Learners and Leader, and that value is going to be ignored. That is the approach I would take to implement your proposal.

For a test case, it would be enough to modify the test case added in ZOOKEEPER-1732 and just set the peerEpoch to any value, so that it is clear that this value is also ignored in the comparison.
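
To illustrate the alternative described above (set the epoch to the don't care value when the vote is inserted, and leave the comparison itself untouched), here is a minimal sketch. It is not the ZooKeeper code: SketchVote, OutOfElectionSketch, record() and countSupporters() are hypothetical names, the only sentinel assumed is the -1 mentioned in this comment, and only the peer epoch is normalized; which other fields carry don't care values in the real patch is not modeled.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical, simplified stand-in for a vote; not ZooKeeper's Vote class. */
final class SketchVote {
    // Assumed sentinel: the comment states that the current don't care value is -1.
    static final long DONT_CARE = -1L;

    final long leaderId;
    final long zxid;
    final long electionEpoch;
    final long peerEpoch;

    SketchVote(long leaderId, long zxid, long electionEpoch, long peerEpoch) {
        this.leaderId = leaderId;
        this.zxid = zxid;
        this.electionEpoch = electionEpoch;
        this.peerEpoch = peerEpoch;
    }

    /** Returns a copy whose peer epoch is replaced by the don't care value. */
    SketchVote withoutPeerEpoch() {
        return new SketchVote(leaderId, zxid, electionEpoch, DONT_CARE);
    }

    // Plain field-by-field equality: no special handling of the sentinel here.
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof SketchVote)) {
            return false;
        }
        SketchVote other = (SketchVote) o;
        return leaderId == other.leaderId
                && zxid == other.zxid
                && electionEpoch == other.electionEpoch
                && peerEpoch == other.peerEpoch;
    }

    @Override
    public int hashCode() {
        return Objects.hash(leaderId, zxid, electionEpoch, peerEpoch);
    }
}

/** Hypothetical holder for votes received from servers that are not electing. */
final class OutOfElectionSketch {
    private final Map<Long, SketchVote> outofelection = new HashMap<>();

    /** Stores the vote with its peer epoch already set to the don't care value. */
    void record(long serverId, SketchVote vote) {
        outofelection.put(serverId, vote.withoutPeerEpoch());
    }

    /** Counts stored votes equal to the candidate once its epoch is normalized too. */
    int countSupporters(SketchVote candidate) {
        SketchVote normalized = candidate.withoutPeerEpoch();
        int count = 0;
        for (SketchVote stored : outofelection.values()) {
            if (stored.equals(normalized)) {
                count++;
            }
        }
        return count;
    }
}
```

Because both sides of the comparison carry the same sentinel, votes that differ only in peer epoch match, which is the "epoch is ignored in all cases" behaviour. A test in the spirit of the one suggested above would record votes with arbitrary peer epochs and check that they are still counted together.
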
But as far as I can see, the current patch has the same behaviour, and the final decision on how to code the behaviour is yours, so both solutions to this problem are fine for me.

If the decision were mine, I would go for setting the epoch to newEpoch-1. That might be (arguably) a bit hacky, but the hackery only covers the upgrade case and has no effect in other cases. Ignoring the epoch applies to all cases in which a new server joins an established ensemble, and it has (at least) the problem that votes from ensembles established with different epochs may be counted as if they belonged to the same ensemble. I don't like that too much, but failures don't seem likely and they might not cause problems, since even if the new server joins the wrong leader, this leader will not process any transaction unless it has acks from sufficient followers. So the potential problem seems to be only a small possibility of a delay in joining the right ensemble. That means both options (newEpoch-1 and ignoring the epoch) look to me like working solutions.

Sorry if that was too long, but I think it summarises all corners of my personal view of this issue. The short summary is "I am ok with this solution". If you want a patch with my alternative implementation of the option of ignoring the epoch, I can also prepare that.

> "Don't care" value in ZooKeeper election breaks rolling upgrades
> ----------------------------------------------------------------
>
>                 Key: ZOOKEEPER-1805
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1805
>             Project: ZooKeeper
>          Issue Type: Bug
>            Reporter: Flavio Junqueira
>            Assignee: Flavio Junqueira
>            Priority: Blocker
>             Fix For: 3.4.6, 3.5.0
>
>         Attachments: ZOOKEEPER-1805-b3.4.patch, ZOOKEEPER-1805.patch,
> ZOOKEEPER-1805.patch, ZOOKEEPER-1805.patch, ZOOKEEPER-1805.patch,
> ZOOKEEPER-1805.patch, ZOOKEEPER-1805.patch, ZOOKEEPER-1805.patch
>
>
> This is an issue that has been originally reported in ZOOKEEPER-1732.