[jira] [Updated] (CASSANDRA-19221) CMS: Nodes can restart with new ipaddress already defined in the cluster
[ https://issues.apache.org/jira/browse/CASSANDRA-19221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Petrov updated CASSANDRA-19221: Since Version: 5.0-alpha1 Source Control Link: https://github.com/apache/cassandra/commit/38512a469cef06770384423d0b30e3e85b511258 Resolution: Fixed Status: Resolved (was: Ready to Commit) > CMS: Nodes can restart with new ipaddress already defined in the cluster > > > Key: CASSANDRA-19221 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19221 > Project: Cassandra > Issue Type: Bug > Components: Transactional Cluster Metadata >Reporter: Paul Chandler >Assignee: Alex Petrov >Priority: Normal > Fix For: 5.1-alpha1 > > Attachments: ci_summary-1.html, ci_summary.html > > > I am simulating running a cluster in Kubernetes and testing what happens when > several pods go down and ip addresses are swapped between nodes. In 4.0 this > is blocked and the node cannot be restarted. > To simulate this I create a 3 node cluster on a local machine using 3 > loopback addresses > {code} > 127.0.0.1 > 127.0.0.2 > 127.0.0.3 > {code} > The nodes are created correctly and the first node is assigned as a CMS node > as shown: > {code} > bin/nodetool -p 7199 describecms > {code} > Cluster Metadata Service: > {code} > Members: /127.0.0.1:7000 > Is Member: true > Service State: LOCAL > {code} > At this point I bring down the nodes 127.0.0.2 and 127.0.0.3 and swap the ip > addresses for the rpc_address and listen_address > > The nodes come back as normal, but the nodeid has now been swapped against > the ip address: > Before: > {code} > Datacenter: datacenter1 > === > Status=Up/Down > |/ State=Normal/Leaving/Joining/Moving > -- Address Load Tokens Owns (effective) Host ID > Rack > UN 127.0.0.3 75.2 KiB 16 76.0% > 6d194555-f6eb-41d0-c000-0003 rack1 > UN 127.0.0.2 86.77 KiB 16 59.3% > 6d194555-f6eb-41d0-c000-0002 rack1 > UN 127.0.0.1 80.88 KiB 16 64.7% > 6d194555-f6eb-41d0-c000-0001 rack1 > {code} > After: > {code} > Datacenter: datacenter1 > === > Status=Up/Down > |/ State=Normal/Leaving/Joining/Moving > -- Address Load Tokens Owns (effective) Host ID > Rack > UN 127.0.0.3 149.62 KiB 16 76.0% > 6d194555-f6eb-41d0-c000-0003 rack1 > UN 127.0.0.2 155.48 KiB 16 59.3% > 6d194555-f6eb-41d0-c000-0002 rack1 > UN 127.0.0.1 75.74 KiB 16 64.7% > 6d194555-f6eb-41d0-c000-0001 rack1 > {code} > On previous tests of this I have created a table with a replication factor of > 1, inserted some data before the swap. After the swap the data on nodes 2 > and 3 is now missing. > One theory I have is that I am using different port numbers for the different > nodes, and I am only swapping the ip addresses and not the port numbers, so > the ip:port still looks unique > i.e. 127.0.0.2:9043 becomes 127.0.0.2:9044 > and 127.0.0.3:9044 becomes 127.0.0.3:9043 > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-19221) CMS: Nodes can restart with new ipaddress already defined in the cluster
[ https://issues.apache.org/jira/browse/CASSANDRA-19221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sam Tunnicliffe updated CASSANDRA-19221: Status: Ready to Commit (was: Review In Progress) +1. Agreed about the splitting of {{MetadataChangeSimulationTest}}, CASSANDRA-19344 added another dimension to the tests, so it's not super surprising if that sometimes pushes run time over the timeout. > CMS: Nodes can restart with new ipaddress already defined in the cluster > > > Key: CASSANDRA-19221 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19221 > Project: Cassandra > Issue Type: Bug > Components: Transactional Cluster Metadata >Reporter: Paul Chandler >Assignee: Alex Petrov >Priority: Normal > Fix For: 5.1-alpha1 > > Attachments: ci_summary-1.html, ci_summary.html > > > I am simulating running a cluster in Kubernetes and testing what happens when > several pods go down and ip addresses are swapped between nodes. In 4.0 this > is blocked and the node cannot be restarted. > To simulate this I create a 3 node cluster on a local machine using 3 > loopback addresses > {code} > 127.0.0.1 > 127.0.0.2 > 127.0.0.3 > {code} > The nodes are created correctly and the first node is assigned as a CMS node > as shown: > {code} > bin/nodetool -p 7199 describecms > {code} > Cluster Metadata Service: > {code} > Members: /127.0.0.1:7000 > Is Member: true > Service State: LOCAL > {code} > At this point I bring down the nodes 127.0.0.2 and 127.0.0.3 and swap the ip > addresses for the rpc_address and listen_address > > The nodes come back as normal, but the nodeid has now been swapped against > the ip address: > Before: > {code} > Datacenter: datacenter1 > === > Status=Up/Down > |/ State=Normal/Leaving/Joining/Moving > -- Address Load Tokens Owns (effective) Host ID > Rack > UN 127.0.0.3 75.2 KiB 16 76.0% > 6d194555-f6eb-41d0-c000-0003 rack1 > UN 127.0.0.2 86.77 KiB 16 59.3% > 6d194555-f6eb-41d0-c000-0002 rack1 > UN 127.0.0.1 80.88 KiB 16 64.7% > 6d194555-f6eb-41d0-c000-0001 rack1 > {code} > After: > {code} > Datacenter: datacenter1 > === > Status=Up/Down > |/ State=Normal/Leaving/Joining/Moving > -- Address Load Tokens Owns (effective) Host ID > Rack > UN 127.0.0.3 149.62 KiB 16 76.0% > 6d194555-f6eb-41d0-c000-0003 rack1 > UN 127.0.0.2 155.48 KiB 16 59.3% > 6d194555-f6eb-41d0-c000-0002 rack1 > UN 127.0.0.1 75.74 KiB 16 64.7% > 6d194555-f6eb-41d0-c000-0001 rack1 > {code} > On previous tests of this I have created a table with a replication factor of > 1, inserted some data before the swap. After the swap the data on nodes 2 > and 3 is now missing. > One theory I have is that I am using different port numbers for the different > nodes, and I am only swapping the ip addresses and not the port numbers, so > the ip:port still looks unique > i.e. 127.0.0.2:9043 becomes 127.0.0.2:9044 > and 127.0.0.3:9044 becomes 127.0.0.3:9043 > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-19221) CMS: Nodes can restart with new ipaddress already defined in the cluster
[ https://issues.apache.org/jira/browse/CASSANDRA-19221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Petrov updated CASSANDRA-19221: Attachment: ci_summary-1.html > CMS: Nodes can restart with new ipaddress already defined in the cluster > > > Key: CASSANDRA-19221 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19221 > Project: Cassandra > Issue Type: Bug > Components: Transactional Cluster Metadata >Reporter: Paul Chandler >Assignee: Alex Petrov >Priority: Normal > Fix For: 5.1-alpha1 > > Attachments: ci_summary-1.html, ci_summary.html > > > I am simulating running a cluster in Kubernetes and testing what happens when > several pods go down and ip addresses are swapped between nodes. In 4.0 this > is blocked and the node cannot be restarted. > To simulate this I create a 3 node cluster on a local machine using 3 > loopback addresses > {code} > 127.0.0.1 > 127.0.0.2 > 127.0.0.3 > {code} > The nodes are created correctly and the first node is assigned as a CMS node > as shown: > {code} > bin/nodetool -p 7199 describecms > {code} > Cluster Metadata Service: > {code} > Members: /127.0.0.1:7000 > Is Member: true > Service State: LOCAL > {code} > At this point I bring down the nodes 127.0.0.2 and 127.0.0.3 and swap the ip > addresses for the rpc_address and listen_address > > The nodes come back as normal, but the nodeid has now been swapped against > the ip address: > Before: > {code} > Datacenter: datacenter1 > === > Status=Up/Down > |/ State=Normal/Leaving/Joining/Moving > -- Address Load Tokens Owns (effective) Host ID > Rack > UN 127.0.0.3 75.2 KiB 16 76.0% > 6d194555-f6eb-41d0-c000-0003 rack1 > UN 127.0.0.2 86.77 KiB 16 59.3% > 6d194555-f6eb-41d0-c000-0002 rack1 > UN 127.0.0.1 80.88 KiB 16 64.7% > 6d194555-f6eb-41d0-c000-0001 rack1 > {code} > After: > {code} > Datacenter: datacenter1 > === > Status=Up/Down > |/ State=Normal/Leaving/Joining/Moving > -- Address Load Tokens Owns (effective) Host ID > Rack > UN 127.0.0.3 149.62 KiB 16 76.0% > 6d194555-f6eb-41d0-c000-0003 rack1 > UN 127.0.0.2 155.48 KiB 16 59.3% > 6d194555-f6eb-41d0-c000-0002 rack1 > UN 127.0.0.1 75.74 KiB 16 64.7% > 6d194555-f6eb-41d0-c000-0001 rack1 > {code} > On previous tests of this I have created a table with a replication factor of > 1, inserted some data before the swap. After the swap the data on nodes 2 > and 3 is now missing. > One theory I have is that I am using different port numbers for the different > nodes, and I am only swapping the ip addresses and not the port numbers, so > the ip:port still looks unique > i.e. 127.0.0.2:9043 becomes 127.0.0.2:9044 > and 127.0.0.3:9044 becomes 127.0.0.3:9043 > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-19221) CMS: Nodes can restart with new ipaddress already defined in the cluster
[ https://issues.apache.org/jira/browse/CASSANDRA-19221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sam Tunnicliffe updated CASSANDRA-19221: Reviewers: Sam Tunnicliffe, Sam Tunnicliffe Sam Tunnicliffe, Sam Tunnicliffe (was: Sam Tunnicliffe) Status: Review In Progress (was: Patch Available) > CMS: Nodes can restart with new ipaddress already defined in the cluster > > > Key: CASSANDRA-19221 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19221 > Project: Cassandra > Issue Type: Bug > Components: Transactional Cluster Metadata >Reporter: Paul Chandler >Assignee: Alex Petrov >Priority: Normal > Fix For: 5.1-alpha1 > > Attachments: ci_summary.html > > > I am simulating running a cluster in Kubernetes and testing what happens when > several pods go down and ip addresses are swapped between nodes. In 4.0 this > is blocked and the node cannot be restarted. > To simulate this I create a 3 node cluster on a local machine using 3 > loopback addresses > {code} > 127.0.0.1 > 127.0.0.2 > 127.0.0.3 > {code} > The nodes are created correctly and the first node is assigned as a CMS node > as shown: > {code} > bin/nodetool -p 7199 describecms > {code} > Cluster Metadata Service: > {code} > Members: /127.0.0.1:7000 > Is Member: true > Service State: LOCAL > {code} > At this point I bring down the nodes 127.0.0.2 and 127.0.0.3 and swap the ip > addresses for the rpc_address and listen_address > > The nodes come back as normal, but the nodeid has now been swapped against > the ip address: > Before: > {code} > Datacenter: datacenter1 > === > Status=Up/Down > |/ State=Normal/Leaving/Joining/Moving > -- Address Load Tokens Owns (effective) Host ID > Rack > UN 127.0.0.3 75.2 KiB 16 76.0% > 6d194555-f6eb-41d0-c000-0003 rack1 > UN 127.0.0.2 86.77 KiB 16 59.3% > 6d194555-f6eb-41d0-c000-0002 rack1 > UN 127.0.0.1 80.88 KiB 16 64.7% > 6d194555-f6eb-41d0-c000-0001 rack1 > {code} > After: > {code} > Datacenter: datacenter1 > === > Status=Up/Down > |/ State=Normal/Leaving/Joining/Moving > -- Address Load Tokens Owns (effective) Host ID > Rack > UN 127.0.0.3 149.62 KiB 16 76.0% > 6d194555-f6eb-41d0-c000-0003 rack1 > UN 127.0.0.2 155.48 KiB 16 59.3% > 6d194555-f6eb-41d0-c000-0002 rack1 > UN 127.0.0.1 75.74 KiB 16 64.7% > 6d194555-f6eb-41d0-c000-0001 rack1 > {code} > On previous tests of this I have created a table with a replication factor of > 1, inserted some data before the swap. After the swap the data on nodes 2 > and 3 is now missing. > One theory I have is that I am using different port numbers for the different > nodes, and I am only swapping the ip addresses and not the port numbers, so > the ip:port still looks unique > i.e. 127.0.0.2:9043 becomes 127.0.0.2:9044 > and 127.0.0.3:9044 becomes 127.0.0.3:9043 > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-19221) CMS: Nodes can restart with new ipaddress already defined in the cluster
[ https://issues.apache.org/jira/browse/CASSANDRA-19221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Petrov updated CASSANDRA-19221: Test and Documentation Plan: Includes a test Status: Patch Available (was: Open) > CMS: Nodes can restart with new ipaddress already defined in the cluster > > > Key: CASSANDRA-19221 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19221 > Project: Cassandra > Issue Type: Bug > Components: Transactional Cluster Metadata >Reporter: Paul Chandler >Assignee: Alex Petrov >Priority: Normal > Fix For: 5.1-alpha1 > > Attachments: ci_summary.html > > > I am simulating running a cluster in Kubernetes and testing what happens when > several pods go down and ip addresses are swapped between nodes. In 4.0 this > is blocked and the node cannot be restarted. > To simulate this I create a 3 node cluster on a local machine using 3 > loopback addresses > {code} > 127.0.0.1 > 127.0.0.2 > 127.0.0.3 > {code} > The nodes are created correctly and the first node is assigned as a CMS node > as shown: > {code} > bin/nodetool -p 7199 describecms > {code} > Cluster Metadata Service: > {code} > Members: /127.0.0.1:7000 > Is Member: true > Service State: LOCAL > {code} > At this point I bring down the nodes 127.0.0.2 and 127.0.0.3 and swap the ip > addresses for the rpc_address and listen_address > > The nodes come back as normal, but the nodeid has now been swapped against > the ip address: > Before: > {code} > Datacenter: datacenter1 > === > Status=Up/Down > |/ State=Normal/Leaving/Joining/Moving > -- Address Load Tokens Owns (effective) Host ID > Rack > UN 127.0.0.3 75.2 KiB 16 76.0% > 6d194555-f6eb-41d0-c000-0003 rack1 > UN 127.0.0.2 86.77 KiB 16 59.3% > 6d194555-f6eb-41d0-c000-0002 rack1 > UN 127.0.0.1 80.88 KiB 16 64.7% > 6d194555-f6eb-41d0-c000-0001 rack1 > {code} > After: > {code} > Datacenter: datacenter1 > === > Status=Up/Down > |/ State=Normal/Leaving/Joining/Moving > -- Address Load Tokens Owns (effective) Host ID > Rack > UN 127.0.0.3 149.62 KiB 16 76.0% > 6d194555-f6eb-41d0-c000-0003 rack1 > UN 127.0.0.2 155.48 KiB 16 59.3% > 6d194555-f6eb-41d0-c000-0002 rack1 > UN 127.0.0.1 75.74 KiB 16 64.7% > 6d194555-f6eb-41d0-c000-0001 rack1 > {code} > On previous tests of this I have created a table with a replication factor of > 1, inserted some data before the swap. After the swap the data on nodes 2 > and 3 is now missing. > One theory I have is that I am using different port numbers for the different > nodes, and I am only swapping the ip addresses and not the port numbers, so > the ip:port still looks unique > i.e. 127.0.0.2:9043 becomes 127.0.0.2:9044 > and 127.0.0.3:9044 becomes 127.0.0.3:9043 > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-19221) CMS: Nodes can restart with new ipaddress already defined in the cluster
[ https://issues.apache.org/jira/browse/CASSANDRA-19221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Petrov updated CASSANDRA-19221: Attachment: ci_summary.html > CMS: Nodes can restart with new ipaddress already defined in the cluster > > > Key: CASSANDRA-19221 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19221 > Project: Cassandra > Issue Type: Bug > Components: Transactional Cluster Metadata >Reporter: Paul Chandler >Assignee: Alex Petrov >Priority: Normal > Fix For: 5.1-alpha1 > > Attachments: ci_summary.html > > > I am simulating running a cluster in Kubernetes and testing what happens when > several pods go down and ip addresses are swapped between nodes. In 4.0 this > is blocked and the node cannot be restarted. > To simulate this I create a 3 node cluster on a local machine using 3 > loopback addresses > {code} > 127.0.0.1 > 127.0.0.2 > 127.0.0.3 > {code} > The nodes are created correctly and the first node is assigned as a CMS node > as shown: > {code} > bin/nodetool -p 7199 describecms > {code} > Cluster Metadata Service: > {code} > Members: /127.0.0.1:7000 > Is Member: true > Service State: LOCAL > {code} > At this point I bring down the nodes 127.0.0.2 and 127.0.0.3 and swap the ip > addresses for the rpc_address and listen_address > > The nodes come back as normal, but the nodeid has now been swapped against > the ip address: > Before: > {code} > Datacenter: datacenter1 > === > Status=Up/Down > |/ State=Normal/Leaving/Joining/Moving > -- Address Load Tokens Owns (effective) Host ID > Rack > UN 127.0.0.3 75.2 KiB 16 76.0% > 6d194555-f6eb-41d0-c000-0003 rack1 > UN 127.0.0.2 86.77 KiB 16 59.3% > 6d194555-f6eb-41d0-c000-0002 rack1 > UN 127.0.0.1 80.88 KiB 16 64.7% > 6d194555-f6eb-41d0-c000-0001 rack1 > {code} > After: > {code} > Datacenter: datacenter1 > === > Status=Up/Down > |/ State=Normal/Leaving/Joining/Moving > -- Address Load Tokens Owns (effective) Host ID > Rack > UN 127.0.0.3 149.62 KiB 16 76.0% > 6d194555-f6eb-41d0-c000-0003 rack1 > UN 127.0.0.2 155.48 KiB 16 59.3% > 6d194555-f6eb-41d0-c000-0002 rack1 > UN 127.0.0.1 75.74 KiB 16 64.7% > 6d194555-f6eb-41d0-c000-0001 rack1 > {code} > On previous tests of this I have created a table with a replication factor of > 1, inserted some data before the swap. After the swap the data on nodes 2 > and 3 is now missing. > One theory I have is that I am using different port numbers for the different > nodes, and I am only swapping the ip addresses and not the port numbers, so > the ip:port still looks unique > i.e. 127.0.0.2:9043 becomes 127.0.0.2:9044 > and 127.0.0.3:9044 becomes 127.0.0.3:9043 > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-19221) CMS: Nodes can restart with new ipaddress already defined in the cluster
[ https://issues.apache.org/jira/browse/CASSANDRA-19221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Petrov updated CASSANDRA-19221: Description: I am simulating running a cluster in Kubernetes and testing what happens when several pods go down and ip addresses are swapped between nodes. In 4.0 this is blocked and the node cannot be restarted. To simulate this I create a 3 node cluster on a local machine using 3 loopback addresses {code} 127.0.0.1 127.0.0.2 127.0.0.3 {code} The nodes are created correctly and the first node is assigned as a CMS node as shown: {code} bin/nodetool -p 7199 describecms {code} Cluster Metadata Service: {code} Members: /127.0.0.1:7000 Is Member: true Service State: LOCAL {code} At this point I bring down the nodes 127.0.0.2 and 127.0.0.3 and swap the ip addresses for the rpc_address and listen_address The nodes come back as normal, but the nodeid has now been swapped against the ip address: Before: {code} Datacenter: datacenter1 === Status=Up/Down |/ State=Normal/Leaving/Joining/Moving -- Address Load Tokens Owns (effective) Host ID Rack UN 127.0.0.3 75.2 KiB 16 76.0% 6d194555-f6eb-41d0-c000-0003 rack1 UN 127.0.0.2 86.77 KiB 16 59.3% 6d194555-f6eb-41d0-c000-0002 rack1 UN 127.0.0.1 80.88 KiB 16 64.7% 6d194555-f6eb-41d0-c000-0001 rack1 {code} After: {code} Datacenter: datacenter1 === Status=Up/Down |/ State=Normal/Leaving/Joining/Moving -- Address Load Tokens Owns (effective) Host ID Rack UN 127.0.0.3 149.62 KiB 16 76.0% 6d194555-f6eb-41d0-c000-0003 rack1 UN 127.0.0.2 155.48 KiB 16 59.3% 6d194555-f6eb-41d0-c000-0002 rack1 UN 127.0.0.1 75.74 KiB 16 64.7% 6d194555-f6eb-41d0-c000-0001 rack1 {code} On previous tests of this I have created a table with a replication factor of 1, inserted some data before the swap. After the swap the data on nodes 2 and 3 is now missing. One theory I have is that I am using different port numbers for the different nodes, and I am only swapping the ip addresses and not the port numbers, so the ip:port still looks unique i.e. 127.0.0.2:9043 becomes 127.0.0.2:9044 and 127.0.0.3:9044 becomes 127.0.0.3:9043 was: I am simulating running a cluster in Kubernetes and testing what happens when several pods go down and ip addresses are swapped between nodes. In 4.0 this is blocked and the node cannot be restarted. To simulate this I create a 3 node cluster on a local machine using 3 loopback addresses 127.0.0.1 127.0.0.2 127.0.0.3 The nodes are created correctly and the first node is assigned as a CMS node as shown: bin/nodetool -p 7199 describecms Cluster Metadata Service: Members: /127.0.0.1:7000 Is Member: true Service State: LOCAL At this point I bring down the nodes 127.0.0.2 and 127.0.0.3 and swap the ip addresses for the rpc_address and listen_address The nodes come back as normal, but the nodeid has now been swapped against the ip address: Before: Datacenter: datacenter1 === Status=Up/Down |/ State=Normal/Leaving/Joining/Moving -- Address Load Tokens Owns (effective) Host ID Rack UN 127.0.0.3 75.2 KiB 16 76.0% 6d194555-f6eb-41d0-c000-0003 rack1 UN 127.0.0.2 86.77 KiB 16 59.3% 6d194555-f6eb-41d0-c000-0002 rack1 UN 127.0.0.1 80.88 KiB 16 64.7% 6d194555-f6eb-41d0-c000-0001 rack1 After: Datacenter: datacenter1 === Status=Up/Down |/ State=Normal/Leaving/Joining/Moving -- Address Load Tokens Owns (effective) Host ID Rack UN 127.0.0.3 149.62 KiB 16 76.0% 6d194555-f6eb-41d0-c000-0003 rack1 UN 127.0.0.2 155.48 KiB 16 59.3% 6d194555-f6eb-41d0-c000-0002 rack1 UN 127.0.0.1 75.74 KiB 16 64.7% 6d194555-f6eb-41d0-c000-0001 rack1 On previous tests of this I have created a table with a replication factor of 1, inserted some data before the swap. After the swap the data on nodes 2 and 3 is now missing. One theory I have is that I am using different port numbers for the different nodes, and I am only swapping the ip addresses and not the port numbers, so the ip:port still looks unique i.e. 127.0.0.2:9043 becomes 127.0.0.2:9044 and 127.0.0.3:9044 becomes 127.0.0.3:9043 > CMS: Nodes can restart with new ipaddress already defined in the cluster > > > Key: CASSANDRA-19221 > URL:
[jira] [Updated] (CASSANDRA-19221) CMS: Nodes can restart with new ipaddress already defined in the cluster
[ https://issues.apache.org/jira/browse/CASSANDRA-19221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sam Tunnicliffe updated CASSANDRA-19221: Bug Category: Parent values: Correctness(12982)Level 1 values: Unrecoverable Corruption / Loss(13161) Complexity: Normal Discovered By: Adhoc Test Fix Version/s: 5.1-alpha1 Severity: Normal Status: Open (was: Triage Needed) > CMS: Nodes can restart with new ipaddress already defined in the cluster > > > Key: CASSANDRA-19221 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19221 > Project: Cassandra > Issue Type: Bug > Components: Transactional Cluster Metadata >Reporter: Paul Chandler >Priority: Normal > Fix For: 5.1-alpha1 > > > I am simulating running a cluster in Kubernetes and testing what happens when > several pods go down and ip addresses are swapped between nodes. In 4.0 this > is blocked and the node cannot be restarted. > To simulate this I create a 3 node cluster on a local machine using 3 > loopback addresses > 127.0.0.1 > 127.0.0.2 > 127.0.0.3 > The nodes are created correctly and the first node is assigned as a CMS node > as shown: > bin/nodetool -p 7199 describecms > Cluster Metadata Service: > Members: /127.0.0.1:7000 > Is Member: true > Service State: LOCAL > At this point I bring down the nodes 127.0.0.2 and 127.0.0.3 and swap the ip > addresses for the rpc_address and listen_address > > The nodes come back as normal, but the nodeid has now been swapped against > the ip address: > Before: > Datacenter: datacenter1 > === > Status=Up/Down > |/ State=Normal/Leaving/Joining/Moving > -- Address Load Tokens Owns (effective) Host ID > Rack > UN 127.0.0.3 75.2 KiB 16 76.0% > 6d194555-f6eb-41d0-c000-0003 rack1 > UN 127.0.0.2 86.77 KiB 16 59.3% > 6d194555-f6eb-41d0-c000-0002 rack1 > UN 127.0.0.1 80.88 KiB 16 64.7% > 6d194555-f6eb-41d0-c000-0001 rack1 > > After: > Datacenter: datacenter1 > === > Status=Up/Down > |/ State=Normal/Leaving/Joining/Moving > -- Address Load Tokens Owns (effective) Host ID > Rack > UN 127.0.0.3 149.62 KiB 16 76.0% > 6d194555-f6eb-41d0-c000-0003 rack1 > UN 127.0.0.2 155.48 KiB 16 59.3% > 6d194555-f6eb-41d0-c000-0002 rack1 > UN 127.0.0.1 75.74 KiB 16 64.7% > 6d194555-f6eb-41d0-c000-0001 rack1 > On previous tests of this I have created a table with a replication factor of > 1, inserted some data before the swap. After the swap the data on nodes 2 > and 3 is now missing. > One theory I have is that I am using different port numbers for the different > nodes, and I am only swapping the ip addresses and not the port numbers, so > the ip:port still looks unique > i.e. 127.0.0.2:9043 becomes 127.0.0.2:9044 > and 127.0.0.3:9044 becomes 127.0.0.3:9043 > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org