[jira] [Updated] (CASSANDRA-14559) Check for endpoint collision with hibernating nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-14559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Williams updated CASSANDRA-14559:
-----------------------------------------
          Fix Version/s: (was: 3.11.x)
                         3.11.8
                         3.0.22
          Since Version: 1.0.0
    Source Control Link: https://github.com/apache/cassandra/commit/c94ececec0fcd87459858370396d6cd586853787
             Resolution: Fixed
                 Status: Resolved (was: Ready to Commit)

Committed.

> Check for endpoint collision with hibernating nodes
> ---------------------------------------------------
>
>                 Key: CASSANDRA-14559
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14559
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Bootstrap and Decommission
>            Reporter: Vincent White
>            Assignee: Stefan Miklosovic
>            Priority: Normal
>             Fix For: 3.0.22, 3.11.8, 4.0-beta2
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> I ran across an edge case when replacing a node with the same address. This
> issue results in the node (and its tokens) being unsafely removed from gossip.
> Steps to replicate:
> 1. Create a 3-node cluster.
> 2. Stop a node.
> 3. Replace the stopped node with a node using the same address, using the
> replace_address flag.
> 4. Stop the node before it finishes bootstrapping.
> 5. Remove the replace_address flag and restart the node to resume
> bootstrapping (if the data dir is also cleared at this point, the node will
> also generate new tokens when it starts).
> 6. Stop the node again before it finishes bootstrapping.
> 7. 30 seconds later the node will be removed from gossip because it now
> matches the check for a FatClient.
> I think this is only an issue when replacing a node with the same address
> because other replacements now use STATUS_BOOTSTRAPPING_REPLACE and leave the
> dead node unchanged.
> I believe the simplest fix for this is to add a check that prevents a
> non-bootstrapped node (without the replace_address flag) from starting if
> there is a gossip entry for the same address in the hibernate state.
> [3.11 PoC|https://github.com/apache/cassandra/compare/trunk...vincewhite:check_for_hibernate_on_start]

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
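
The fix proposed in the description above (refuse to start a non-bootstrapped
node that has no replace_address flag when gossip already holds a hibernate
entry for its address) can be sketched roughly as follows. This is a minimal
illustration under stated assumptions, not the committed patch: the class
HibernateCollisionCheck, the Status enum, and the methods record and
checkStartup are hypothetical names, and the gossip view is modelled as a
plain map rather than Cassandra's actual gossip implementation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration only; none of these names come from Cassandra.
final class HibernateCollisionCheck {

    // Simplified stand-in for the gossip status values mentioned in the ticket.
    enum Status { NORMAL, BOOTSTRAPPING_REPLACE, HIBERNATE }

    // Assumption: the gossip view is reduced to address -> last known status.
    private final Map<String, Status> gossipState = new HashMap<>();

    // Records the status that gossip currently reports for an address.
    void record(String address, Status status) {
        gossipState.put(address, status);
    }

    // Returns a reason to refuse startup, or empty if startup may proceed.
    Optional<String> checkStartup(String localAddress, boolean bootstrapped, boolean replacing) {
        // Already-bootstrapped nodes and explicit replacements are not blocked.
        if (bootstrapped || replacing) {
            return Optional.empty();
        }
        // A hibernate entry for our own address suggests an unfinished
        // same-address replacement, so starting as a plain new node is unsafe.
        if (gossipState.get(localAddress) == Status.HIBERNATE) {
            return Optional.of("A node with address " + localAddress
                    + " is in the hibernate state; restart with the replace_address flag");
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        HibernateCollisionCheck check = new HibernateCollisionCheck();
        check.record("10.0.0.2", Status.HIBERNATE);
        // Step 5 of the reproduction: flag removed, node not yet bootstrapped.
        System.out.println(check.checkStartup("10.0.0.2", false, false).isPresent());
        // Same restart, but with the replace flag still set.
        System.out.println(check.checkStartup("10.0.0.2", false, true).isPresent());
    }
}
```

In this sketch the restart in step 5 of the reproduction would be rejected,
while a restart that keeps the replace_address flag would still be allowed.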
[jira] [Updated] (CASSANDRA-14559) Check for endpoint collision with hibernating nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-14559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Williams updated CASSANDRA-14559:
-----------------------------------------
    Status: Ready to Commit (was: Review In Progress)

> Check for endpoint collision with hibernating nodes
> ---------------------------------------------------
>
>                 Key: CASSANDRA-14559
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14559
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Bootstrap and Decommission
>            Reporter: Vincent White
>            Assignee: Stefan Miklosovic
>            Priority: Normal
>             Fix For: 3.11.x, 4.0-beta2
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
[jira] [Updated] (CASSANDRA-14559) Check for endpoint collision with hibernating nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-14559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Williams updated CASSANDRA-14559:
-----------------------------------------
    Reviewers: Brandon Williams, Brandon Williams (was: Brandon Williams)
               Brandon Williams, Brandon Williams
       Status: Review In Progress (was: Patch Available)

> Check for endpoint collision with hibernating nodes
> ---------------------------------------------------
>
>                 Key: CASSANDRA-14559
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14559
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Bootstrap and Decommission
>            Reporter: Vincent White
>            Assignee: Stefan Miklosovic
>            Priority: Normal
>             Fix For: 3.11.x, 4.0-beta2
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
[jira] [Updated] (CASSANDRA-14559) Check for endpoint collision with hibernating nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-14559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stefan Miklosovic updated CASSANDRA-14559:
------------------------------------------
       Complexity: Normal
      Component/s: (was: Legacy/Distributed Metadata)
                   Consistency/Bootstrap and Decommission
    Discovered By: User Report
    Fix Version/s: 4.0-beta2
                   3.11.x
         Platform: All

> Check for endpoint collision with hibernating nodes
> ---------------------------------------------------
>
>                 Key: CASSANDRA-14559
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14559
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Bootstrap and Decommission
>            Reporter: Vincent White
>            Assignee: Stefan Miklosovic
>            Priority: Normal
>             Fix For: 3.11.x, 4.0-beta2
[jira] [Updated] (CASSANDRA-14559) Check for endpoint collision with hibernating nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-14559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

C. Scott Andreas updated CASSANDRA-14559:
-----------------------------------------
    Component/s: Distributed Metadata

> Check for endpoint collision with hibernating nodes
> ---------------------------------------------------
>
>                 Key: CASSANDRA-14559
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14559
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Distributed Metadata
>            Reporter: Vincent White
>            Assignee: Vincent White
>            Priority: Major
[jira] [Updated] (CASSANDRA-14559) Check for endpoint collision with hibernating nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-14559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kurt Greaves updated CASSANDRA-14559:
-------------------------------------
    Assignee: Vincent White
      Status: Patch Available (was: Open)

> Check for endpoint collision with hibernating nodes
> ---------------------------------------------------
>
>                 Key: CASSANDRA-14559
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14559
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Vincent White
>            Assignee: Vincent White
>            Priority: Major