[jira] [Updated] (CASSANDRA-5571) Reject bootstrapping endpoints that are already in the ring with different gossip data
[ https://issues.apache.org/jira/browse/CASSANDRA-5571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Williams updated CASSANDRA-5571:
Fix Version/s: (was: 2.0.3) 2.0.2

Reject bootstrapping endpoints that are already in the ring with different gossip data
--
Key: CASSANDRA-5571
URL: https://issues.apache.org/jira/browse/CASSANDRA-5571
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Rick Branson
Assignee: Tyler Hobbs
Fix For: 2.0.2
Attachments: 5571-2.0-v1.patch, 5571-2.0-v2.patch, 5571-2.0-v3.patch

The ring can be silently broken by improperly bootstrapping an endpoint that has an existing entry in the gossip table. In the case where a node attempts to bootstrap with the same IP address as an existing ring member, the old token metadata is dropped without warning, resulting in range shifts for the cluster. This isn't so bad for non-vnode cases where, in general, tokens are explicitly assigned, and a bootstrap on the same token would result in no range shifts.

For vnode cases, the convention is to just let nodes come up by selecting their own tokens, and a bootstrap will override the existing tokens for that endpoint. While there are some other issues open for adding an explicit rebootstrap feature for vnode cases, given the changes in operator habits for vnode rings, it seems a bit too easy to make this happen. Even more undesirable is the fact that it's basically silent.

This is a proposal for checking for this exact case: bootstraps on endpoints with existing ring entries that have different hostIDs and/or tokens should be rejected with an error message describing what happened and how to override the safety check. It looks like the override can be supported using the existing nodetool removenode -force. I can work up a patch for this.

--
This message was sent by Atlassian JIRA (v6.1#6144)
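The check proposed in the description can be sketched roughly as follows. This is a minimal illustration, not the attached patch: the class BootstrapCollisionCheck, the RingEntry type, the check method and the wording of the error message are all hypothetical, and tokens are simplified to plain long values. It only shows the rule as stated in the issue: reject a bootstrap when the address already has a ring entry with a different host ID and/or different tokens.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Optional;
import java.util.Set;
import java.util.UUID;

/** Illustrative only: none of these types or methods are Cassandra's own. */
final class BootstrapCollisionCheck {

    /** Simplified stand-in for the gossip state known for one endpoint. */
    static final class RingEntry {
        final String address;
        final UUID hostId;
        final Set<Long> tokens;

        RingEntry(String address, UUID hostId, Set<Long> tokens) {
            this.address = address;
            this.hostId = hostId;
            this.tokens = tokens;
        }
    }

    /**
     * Returns a rejection message when the joining node reuses the address of an
     * existing ring entry but carries a different host ID or token set.
     * Returns empty when the bootstrap may proceed.
     */
    static Optional<String> check(RingEntry existing, RingEntry joining) {
        if (existing == null) {
            return Optional.empty(); // address is unknown to the ring
        }
        boolean sameHost = existing.hostId.equals(joining.hostId);
        boolean sameTokens = existing.tokens.equals(joining.tokens);
        if (sameHost && sameTokens) {
            return Optional.empty(); // identical gossip data, so no range shift
        }
        return Optional.of(String.format(
            "Endpoint %s is already in the ring with different gossip data; "
                + "remove the existing entry before bootstrapping.",
            joining.address));
    }

    public static void main(String[] args) {
        Set<Long> oldTokens = new HashSet<>(Arrays.asList(100L, 200L));
        Set<Long> newTokens = new HashSet<>(Arrays.asList(150L, 250L));
        RingEntry existing = new RingEntry("10.0.0.5", UUID.randomUUID(), oldTokens);
        RingEntry joining = new RingEntry("10.0.0.5", UUID.randomUUID(), newTokens);

        // Same address, different host ID and tokens: rejected.
        System.out.println(check(existing, joining).orElse("bootstrap allowed"));
        // Address not yet in the ring: allowed.
        System.out.println(check(null, joining).orElse("bootstrap allowed"));
    }
}
```

Returning a message rather than silently proceeding mirrors the point of the issue: the operator should be told what happened and how to override the check (the description suggests nodetool removenode -force for that).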
[jira] [Updated] (CASSANDRA-5571) Reject bootstrapping endpoints that are already in the ring with different gossip data
[ https://issues.apache.org/jira/browse/CASSANDRA-5571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tyler Hobbs updated CASSANDRA-5571:
---
Attachment: 5571-2.0-v3.patch

5571-2.0-v3.patch tolerates dead nodes when checking for endpoint collisions on bootstrap.

Reject bootstrapping endpoints that are already in the ring with different gossip data
--
Key: CASSANDRA-5571
URL: https://issues.apache.org/jira/browse/CASSANDRA-5571
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Rick Branson
Assignee: Tyler Hobbs
Attachments: 5571-2.0-v1.patch, 5571-2.0-v2.patch, 5571-2.0-v3.patch
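One possible reading of "tolerates dead nodes" is that an existing gossip entry only blocks the bootstrap while the cluster still considers that entry alive. The sketch below illustrates that reading only; the message does not say how the v3 patch decides this, and the EntryState enum, the class name and the check method are assumptions made for the example, not Cassandra's API.

```java
import java.util.Optional;

/** Illustrative only: the states and names below are assumptions, not Cassandra's API. */
final class DeadTolerantCollisionCheck {

    /** Hypothetical view of what gossip reports for the joining node's address. */
    enum EntryState { ABSENT, ALIVE, DEAD }

    /**
     * Rejects the bootstrap only when a live entry exists for the address.
     * An absent entry or an entry already considered dead does not block the join.
     */
    static Optional<String> check(String address, EntryState existing) {
        if (existing == EntryState.ALIVE) {
            return Optional.of("A node with address " + address
                + " already exists in the ring; refusing to bootstrap.");
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        for (EntryState state : EntryState.values()) {
            String outcome = check("10.0.0.5", state).orElse("bootstrap allowed");
            System.out.println(state + ": " + outcome);
        }
    }
}
```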
[jira] [Updated] (CASSANDRA-5571) Reject bootstrapping endpoints that are already in the ring with different gossip data
[ https://issues.apache.org/jira/browse/CASSANDRA-5571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tyler Hobbs updated CASSANDRA-5571:
---
Attachment: 5571-2.0-v1.patch

5571-2.0-v1.patch (and [branch|https://github.com/thobbs/cassandra/tree/CASSANDRA-5571]) uses a shadow gossip round to check for endpoint collisions.

Reject bootstrapping endpoints that are already in the ring with different gossip data
--
Key: CASSANDRA-5571
URL: https://issues.apache.org/jira/browse/CASSANDRA-5571
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Rick Branson
Assignee: Tyler Hobbs
Attachments: 5571-2.0-v1.patch
[jira] [Updated] (CASSANDRA-5571) Reject bootstrapping endpoints that are already in the ring with different gossip data
[ https://issues.apache.org/jira/browse/CASSANDRA-5571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tyler Hobbs updated CASSANDRA-5571:
---
Attachment: 5571-2.0-v2.patch

Bah, not sure how I missed that. 5571-2.0-v2.patch (the branch is also updated) only checks for collisions when bootstrapping.

Reject bootstrapping endpoints that are already in the ring with different gossip data
--
Key: CASSANDRA-5571
URL: https://issues.apache.org/jira/browse/CASSANDRA-5571
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Rick Branson
Assignee: Tyler Hobbs
Attachments: 5571-2.0-v1.patch, 5571-2.0-v2.patch
[jira] [Updated] (CASSANDRA-5571) Reject bootstrapping endpoints that are already in the ring with different gossip data
[ https://issues.apache.org/jira/browse/CASSANDRA-5571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rick Branson updated CASSANDRA-5571:
Description:
The ring can be silently broken by improperly bootstrapping an endpoint that has an existing entry in the gossip table. In the case where a node attempts to bootstrap with the same IP address as an existing ring member, the old token metadata is dropped without warning, resulting in range shifts for the cluster. This isn't so bad for non-vnode cases where, in general, tokens are explicitly assigned, and a bootstrap on the same token would result in no range shifts.

For vnode cases, the convention is to just let nodes come up by selecting their own tokens, and a bootstrap will override the existing tokens for that endpoint. While there are some other issues open for adding an explicit rebootstrap feature for vnode cases, given the changes in operator habits for vnode rings, it seems a bit too easy to make this happen. Even more undesirable is the fact that it's basically silent.

This is a proposal for checking for this exact case: bootstraps on endpoints with existing ring entries that have different hostIDs and/or tokens should be rejected with an error message describing what happened and how to override the safety check. It looks like the override can be supported using the existing nodetool removenode -force. I can work up a patch for this.

was:
The ring can be silently broken by improperly bootstrapping an endpoint that has existing entries in the gossip tables. In the case where a node attempts to bootstrap with the same IP address as an existing ring member, the old token metadata is dropped without warning, resulting in range shifts for the cluster. This isn't so bad for non-vnode cases where, in general, tokens are explicitly assigned, and a bootstrap on the same token would result in no range shifts.

For vnode cases, the convention is to just let nodes come up by selecting their own tokens, and a bootstrap will override the existing tokens for that endpoint. While there are some other issues open for adding an explicit rebootstrap feature for vnode cases, given the changes in operator habits for vnode rings, it seems a bit too easy to make this happen. Even more undesirable is the fact that it's basically silent.

This is a proposal for checking for this exact case: bootstraps on endpoints with existing ring entries that have different hostIDs and/or tokens should be rejected with an error message describing what happened and how to override the safety check. It looks like the override can be supported using the existing nodetool removenode -force. I can work up a patch for this.

Reject bootstrapping endpoints that are already in the ring with different gossip data
--
Key: CASSANDRA-5571
URL: https://issues.apache.org/jira/browse/CASSANDRA-5571
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Rick Branson
Assignee: Rick Branson

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-5571) Reject bootstrapping endpoints that are already in the ring with different gossip data
[ https://issues.apache.org/jira/browse/CASSANDRA-5571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Williams updated CASSANDRA-5571:
Reviewer: brandon.williams

Reject bootstrapping endpoints that are already in the ring with different gossip data
--
Key: CASSANDRA-5571
URL: https://issues.apache.org/jira/browse/CASSANDRA-5571
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Rick Branson
Assignee: Rick Branson