[
https://issues.apache.org/jira/browse/CASSANDRA-5102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Brandon Williams updated CASSANDRA-5102:
----------------------------------------
Attachment: 5102.txt
Here is a sad story of how multiple release cycles ended up causing a
regression.
The cause of these exceptions is CASSANDRA-4576. There, we added checks
against VERSION_11 to prevent using the compatible mode with newer nodes that
didn't need it. VERSION_11 has an actual value of 4. We closed the ticket on
Sept 18, and that was that.
Fast forward to November, when we closed CASSANDRA-4880. That fix needed a
protocol version bump, so we created VERSION_117, which has an actual value
of 5. Unfortunately, the checks from CASSANDRA-4576 used <= comparisons, and
we had now created a version higher than VERSION_11 that still needed the
compatibility mode, so we got our original bug back.
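To make the comparison trap concrete, here is a minimal sketch; the constant
values come from this description, but the function and its name are a
paraphrase for illustration, not Cassandra's actual MessagingService code:

```python
# Protocol version constants as described above.
VERSION_11 = 4   # 1.1.x wire protocol
VERSION_117 = 5  # bump added for CASSANDRA-4880 in 1.1.7

def uses_compatible_mode(peer_version):
    # Hypothetical stand-in for the CASSANDRA-4576 check: only peers
    # at or below VERSION_11 get the compatible mode.
    return peer_version <= VERSION_11

# A 1.1.7 peer still needs the compatible mode, but the <= check
# excludes it, reintroducing the original bug.
print(uses_compatible_mode(VERSION_117))  # False
print(uses_compatible_mode(VERSION_11))   # True
```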
The effect of this is that if you upgrade from nodes on 1.1.7 or later to
1.2.0, the 1.2.0 nodes won't be able to gossip with the 1.1.7 nodes, and the
1.1.7 nodes won't be visible in ring output on the 1.2.0 node until they too
are on 1.2.0. The 1.1.7 nodes will still know about the 1.2.0 node, but they
won't be able to successfully gossip with it and will keep it marked down.
Patch attached to go ahead and compare more explicitly against VERSION_12 to
fix this, but I think it highlights a deeper problem, which is that if we ever
do need to do another protocol bump in a minor, stable branch, we're out of
luck because there's no space between VERSION_117 and VERSION_12.
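The direction of the attached patch can be sketched like so; note that
VERSION_12's value of 6 is an assumption here (the ticket only gives the
values 4 and 5), and the function is an illustrative stand-in, not the
actual patched code:

```python
VERSION_11 = 4   # 1.1.x wire protocol
VERSION_117 = 5  # 1.1.7 bump (CASSANDRA-4880)
VERSION_12 = 6   # assumed value for the 1.2 protocol

def uses_compatible_mode(peer_version):
    # Compare explicitly against VERSION_12: every pre-1.2 peer,
    # including VERSION_117, still gets the compatible mode.
    return peer_version < VERSION_12

print(uses_compatible_mode(VERSION_117))  # True
print(uses_compatible_mode(VERSION_12))   # False
```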
> upgrading from 1.1.7 to 1.2.0 caused upgraded nodes to only know about other
> 1.2.0 nodes
> ----------------------------------------------------------------------------------------
>
> Key: CASSANDRA-5102
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5102
> Project: Cassandra
> Issue Type: Bug
> Affects Versions: 1.2.0
> Reporter: Michael Kjellman
> Assignee: Brandon Williams
> Priority: Blocker
> Attachments: 5102.txt
>
>
> I upgraded as I have since 0.86 and things didn't go very smoothly.
> I did a nodetool drain on my 1.1.7 node and changed my puppet config to use
> the new merged config. When it came back up (without any errors in the log),
> nodetool ring only showed the node itself. I upgraded another node, and sure
> enough, nodetool ring then showed two nodes.
> I tried resetting the local schema. The upgraded node happily grabbed the
> schema again but still only 1.2 nodes were visible in the ring to any
> upgraded nodes.
> "Interesting" Log Lines:
> INFO 14:43:41,997 Using saved token [42535295865117307932921825928971026436]
> ....
> WARN 23:04:03,361 No host ID found, created 5cef7f51-688d-46c3-9fe4-6c82bde4bb98
> (Note: This should happen exactly once per node).