[
https://issues.apache.org/jira/browse/CASSANDRA-19221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17799528#comment-17799528
]
Sam Tunnicliffe commented on CASSANDRA-19221:
---------------------------------------------
Thanks, that makes it a bit clearer what's going on here. The issue seems to be
that when changing address, we read the local node id from the wrong place at
startup. This only matters if the new address is already in use by another
node: in that case, the metadata update that would record the new addresses is
never submitted, so the nodes silently swap endpoints, which certainly does
lead to the symptoms you've described. If the new address isn't already in use,
this doesn't happen and things work as expected.
Thanks for this report; it's definitely a bug to be fixed, and once it is, this
kind of address switching should be a safe operation.
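To make the failure mode concrete, here is a minimal, self-contained Java sketch of
the bad lookup; the class and variable names are purely illustrative and are not the
actual TCM code:

import java.net.InetSocketAddress;
import java.util.Map;
import java.util.UUID;

// Hypothetical sketch of the failure mode described above. None of these names
// are Cassandra's real TCM classes; they only illustrate why resolving the local
// node id *by address* picks the wrong id once the addresses have been swapped.
public class AddressSwapSketch
{
    public static void main(String[] args)
    {
        UUID node2 = UUID.fromString("6d194555-f6eb-41d0-c000-000000000002");
        UUID node3 = UUID.fromString("6d194555-f6eb-41d0-c000-000000000003");

        // Directory as recorded in cluster metadata before the restart:
        // node2 owned 127.0.0.2, node3 owned 127.0.0.3.
        Map<UUID, InetSocketAddress> directory = Map.of(
            node2, new InetSocketAddress("127.0.0.2", 7000),
            node3, new InetSocketAddress("127.0.0.3", 7000));

        // node3 restarts with the swapped listen address 127.0.0.2, while its
        // local system tables still hold its own node id.
        InetSocketAddress configured = new InetSocketAddress("127.0.0.2", 7000);
        UUID storedLocalId = node3;

        // Wrong place to read the local id: looking it up in the directory by
        // the configured address finds node2, because 127.0.0.2 is already
        // registered to node2.
        UUID resolvedByAddress = directory.entrySet().stream()
            .filter(e -> e.getValue().equals(configured))
            .map(Map.Entry::getKey)
            .findFirst()
            .orElse(null);

        // The resolved id already appears to own the configured address, so no
        // address-change update would be submitted and the endpoints silently swap.
        boolean addressChangeNeeded = !directory.get(resolvedByAddress).equals(configured);

        System.out.println("stored local id:       " + storedLocalId);       // node3 (correct)
        System.out.println("resolved by address:   " + resolvedByAddress);   // node2 (wrong)
        System.out.println("address change needed: " + addressChangeNeeded); // false
    }
}

Running the sketch shows the by-address lookup returning node2's id, so the restarted
node never notices that its own directory entry is stale; reading the stored local id
instead is the kind of fix the comment above describes.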
> CMS: Nodes can restart with new ipaddress already defined in the cluster
> ------------------------------------------------------------------------
>
> Key: CASSANDRA-19221
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19221
> Project: Cassandra
> Issue Type: Bug
> Components: Transactional Cluster Metadata
> Reporter: Paul Chandler
> Priority: Normal
> Fix For: 5.1-alpha1
>
>
> I am simulating running a cluster in Kubernetes and testing what happens when
> several pods go down and IP addresses are swapped between nodes. In 4.0 this
> is blocked and the node cannot be restarted.
> To simulate this, I create a 3-node cluster on a local machine using 3
> loopback addresses:
> 127.0.0.1
> 127.0.0.2
> 127.0.0.3
> The nodes are created correctly and the first node is assigned as a CMS node
> as shown:
> bin/nodetool -p 7199 describecms
> Cluster Metadata Service:
> Members: /127.0.0.1:7000
> Is Member: true
> Service State: LOCAL
> At this point I bring down nodes 127.0.0.2 and 127.0.0.3 and swap the IP
> addresses for the rpc_address and listen_address.
>
> The nodes come back up as normal, but the node ID has now been swapped against
> the IP address:
> Before:
> Datacenter: datacenter1
> =======================
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address    Load       Tokens  Owns (effective)  Host ID                               Rack
> UN  127.0.0.3  75.2 KiB   16      76.0%             6d194555-f6eb-41d0-c000-000000000003  rack1
> UN  127.0.0.2  86.77 KiB  16      59.3%             6d194555-f6eb-41d0-c000-000000000002  rack1
> UN  127.0.0.1  80.88 KiB  16      64.7%             6d194555-f6eb-41d0-c000-000000000001  rack1
>
> After:
> Datacenter: datacenter1
> =======================
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address    Load        Tokens  Owns (effective)  Host ID                               Rack
> UN  127.0.0.3  149.62 KiB  16      76.0%             6d194555-f6eb-41d0-c000-000000000003  rack1
> UN  127.0.0.2  155.48 KiB  16      59.3%             6d194555-f6eb-41d0-c000-000000000002  rack1
> UN  127.0.0.1  75.74 KiB   16      64.7%             6d194555-f6eb-41d0-c000-000000000001  rack1
> In previous runs of this test, I created a table with a replication factor of
> 1 and inserted some data before the swap. After the swap, the data on nodes 2
> and 3 is missing.
> One theory I have is that I am using different port numbers for the different
> nodes, and I am only swapping the IP addresses and not the port numbers, so
> the IP:port combination still looks unique:
> i.e. 127.0.0.2:9043 becomes 127.0.0.2:9044
> and 127.0.0.3:9044 becomes 127.0.0.3:9043
>