Paul Chandler created CASSANDRA-19221:
-----------------------------------------
Summary: CMS: Nodes can restart with new ipaddress already defined
in the cluster
Key: CASSANDRA-19221
URL: https://issues.apache.org/jira/browse/CASSANDRA-19221
Project: Cassandra
Issue Type: Bug
Components: Transactional Cluster Metadata
Reporter: Paul Chandler
I am simulating a cluster running in Kubernetes and testing what happens when
several pods go down and IP addresses are swapped between nodes. In 4.0 this
is blocked and the node cannot be restarted.
To simulate this I create a 3-node cluster on a local machine using 3 loopback
addresses:
127.0.0.1
127.0.0.2
127.0.0.3
The nodes are created correctly and the first node is assigned as a CMS node as
shown:
bin/nodetool -p 7199 describecms
Cluster Metadata Service:
Members: /127.0.0.1:7000
Is Member: true
Service State: LOCAL
At this point I bring down nodes 127.0.0.2 and 127.0.0.3 and swap the IP
addresses used for their rpc_address and listen_address settings.
The nodes come back as normal, but the host ID has now been swapped relative
to the IP address:
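The swap step above can be sketched as a small script. This is only an illustration: it assumes each node's cassandra.yaml has listen_address and rpc_address as plain top-level "key: value" lines, and the helper names (get_setting, set_setting, swap_addresses) are hypothetical, not part of any Cassandra tooling.

```python
import re

# Settings named in the report. Treating them as top-level "key: value"
# lines in cassandra.yaml is an assumption of this sketch.
ADDRESS_KEYS = ("listen_address", "rpc_address")


def get_setting(yaml_text, key):
    """Return the value of a top-level `key: value` line, or None if absent."""
    match = re.search(rf"^{re.escape(key)}:[ \t]*(\S+)", yaml_text, flags=re.MULTILINE)
    return match.group(1) if match else None


def set_setting(yaml_text, key, value):
    """Replace the value of a top-level `key: value` line; other lines are untouched."""
    return re.sub(
        rf"^({re.escape(key)}:[ \t]*)\S+",
        lambda m: m.group(1) + value,  # a function avoids backreference clashes with digits
        yaml_text,
        flags=re.MULTILINE,
    )


def swap_addresses(yaml_a, yaml_b):
    """Swap listen_address and rpc_address between two config texts."""
    for key in ADDRESS_KEYS:
        a, b = get_setting(yaml_a, key), get_setting(yaml_b, key)
        if a is None or b is None:
            raise ValueError(f"{key} missing from one of the configs")
        yaml_a = set_setting(yaml_a, key, b)
        yaml_b = set_setting(yaml_b, key, a)
    return yaml_a, yaml_b
```

Only the two address settings are exchanged; port settings stay with their original node, which matches the situation described later in this report.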
Before:
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens  Owns (effective)  Host ID                               Rack
UN  127.0.0.3  75.2 KiB   16      76.0%             6d194555-f6eb-41d0-c000-000000000003  rack1
UN  127.0.0.2  86.77 KiB  16      59.3%             6d194555-f6eb-41d0-c000-000000000002  rack1
UN  127.0.0.1  80.88 KiB  16      64.7%             6d194555-f6eb-41d0-c000-000000000001  rack1
After:
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load        Tokens  Owns (effective)  Host ID                               Rack
UN  127.0.0.3  149.62 KiB  16      76.0%             6d194555-f6eb-41d0-c000-000000000003  rack1
UN  127.0.0.2  155.48 KiB  16      59.3%             6d194555-f6eb-41d0-c000-000000000002  rack1
UN  127.0.0.1  75.74 KiB   16      64.7%             6d194555-f6eb-41d0-c000-000000000001  rack1
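To compare two status listings like the ones above mechanically, something along these lines could be used. It is a rough sketch: the function names are hypothetical, and the regular expression only assumes that each row starts with a two-letter state code followed by the address, with the host ID UUID appearing later in the row (possibly on a wrapped line).

```python
import re

# One status row: state code, address, then the host ID UUID, which may
# sit on a wrapped continuation line in pasted output.
ROW = re.compile(
    r"^[UD][NLJM]\s+(?P<address>\S+)"
    r".*?(?P<host_id>[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})",
    flags=re.MULTILINE | re.DOTALL,
)


def host_ids_by_address(status_text):
    """Map each node address to the host ID listed on its row."""
    return {m.group("address"): m.group("host_id") for m in ROW.finditer(status_text)}


def changed_mappings(before_text, after_text):
    """Return {address: (old_host_id, new_host_id)} for addresses whose host ID changed."""
    before = host_ids_by_address(before_text)
    after = host_ids_by_address(after_text)
    return {
        address: (before[address], after[address])
        for address in before.keys() & after.keys()
        if before[address] != after[address]
    }
```

Passing the "Before" and "After" listings as two strings returns only the addresses whose host ID differs between them; an empty result means the address-to-host-ID mapping is unchanged.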
In previous runs of this test I created a table with a replication factor of
1 and inserted some data before the swap. After the swap, the data on nodes 2
and 3 was missing.
One theory I have is that, because I am using different port numbers for the
different nodes and am swapping only the IP addresses and not the port numbers,
each ip:port combination still looks unique.
For example, 127.0.0.2:9043 becomes 127.0.0.2:9044
and 127.0.0.3:9044 becomes 127.0.0.3:9043.
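The theory can be illustrated with a toy uniqueness check. This is not Cassandra's actual validation logic, which I have not inspected; the function owner_conflicts and the labels "node2" and "node3" are hypothetical and exist only to show how keying identity on ip:port rather than on IP alone could hide the swap.

```python
from typing import Callable, Dict, List, Tuple

Endpoint = Tuple[str, int]  # (ip, port)


def owner_conflicts(
    registered: Dict[str, Endpoint],
    host_id: str,
    endpoint: Endpoint,
    key: Callable[[Endpoint], object],
) -> List[str]:
    """Host IDs other than `host_id` already registered under the same identity key."""
    return sorted(
        other
        for other, known in registered.items()
        if other != host_id and key(known) == key(endpoint)
    )


# Endpoints as registered before the swap (ports taken from the report).
registered = {"node2": ("127.0.0.2", 9043), "node3": ("127.0.0.3", 9044)}

# node2 restarts with node3's IP address but keeps its own port.
restarting = ("127.0.0.3", 9043)

# Identity keyed on the full ip:port pair: the pair is new, so nothing collides.
by_ip_port = owner_conflicts(registered, "node2", restarting, key=lambda ep: ep)

# Identity keyed on the IP alone: the address is already owned by node3.
by_ip_only = owner_conflicts(registered, "node2", restarting, key=lambda ep: ep[0])
```

Under these assumptions by_ip_port is empty while by_ip_only contains "node3", which is the behaviour the theory predicts if the restart check compares full ip:port endpoints.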