[ 
https://issues.apache.org/jira/browse/CASSANDRA-5571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14058239#comment-14058239
 ] 

Giampiero Recco commented on CASSANDRA-5571:
--------------------------------------------

I noticed that checkForEndpointCollision() is called before Gossiper is 
started (Gossiper is started immediately after this check). 
This causes a problem the first time I bootstrap a cluster: all nodes call 
checkForEndpointCollision(), but none of them has started Gossiper yet, so none 
answers the gossip messages. As a result, every node times out and dies with 
an "Unable to gossip with any seeds" RuntimeException.

This is especially an issue when running Cassandra with Priam 
(https://github.com/Netflix/Priam), where all nodes start automatically at the 
very same time with the same configuration. 
Unfortunately, working around the problem in Priam is fairly complicated, since 
it would require synchronizing the whole cluster to bootstrap in a specific 
order with different configurations. 

The question then is: could we move the checkForEndpointCollision() call to 
after Gossiper is started (about ten lines later in StorageService)?
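
To make the ordering problem concrete, here is a rough toy model of the 
cold start described above. It does not use any Cassandra classes: Node, 
answersSyn() and nodesWithReply() are names made up for illustration, and it 
assumes every node treats every other node as a seed and that all nodes run 
the check at the same moment.

```java
import java.util.ArrayList;
import java.util.List;

final class ColdStartSketch {
    /** Toy stand-in for a node; not Cassandra's real classes. */
    static final class Node {
        boolean gossipEnabled = false;

        /** Models the described behaviour: a SYN is only answered once gossip is enabled. */
        boolean answersSyn() {
            return gossipEnabled;
        }
    }

    /**
     * Counts how many nodes receive at least one reply from another node
     * while running their collision check.
     *
     * @param startGossipBeforeCheck false models the current order (check, then
     *        start gossip); true models the proposed order (start gossip, then check)
     */
    public static int nodesWithReply(int clusterSize, boolean startGossipBeforeCheck) {
        List<Node> nodes = new ArrayList<>();
        for (int i = 0; i < clusterSize; i++) {
            nodes.add(new Node());
        }
        if (startGossipBeforeCheck) {
            for (Node n : nodes) {
                n.gossipEnabled = true;
            }
        }
        int replied = 0;
        for (Node asker : nodes) {
            boolean gotReply = false;
            for (Node seed : nodes) {
                if (seed != asker && seed.answersSyn()) {
                    gotReply = true;
                }
            }
            if (gotReply) {
                replied++;
            }
        }
        return replied;
    }

    public static void main(String[] args) {
        System.out.println("check first:  " + nodesWithReply(3, false));
        System.out.println("gossip first: " + nodesWithReply(3, true));
    }
}
```

In this simplified model, no node gets a reply when the check runs first, 
while every node gets one when gossip is started first. Whether the real 
check is safe to run after Gossiper has started is exactly the open question.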

Alternatively, if this check needs to happen before Gossiper is started, 
another option could be to allow GossipDigestSynVerbHandler.doVerb() to respond 
even if Gossiper is not yet enabled (right now it checks 
Gossiper.instance.isEnabled() and silently discards the request if gossip is 
not enabled).
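
A minimal sketch of that second option, again as a toy rather than the real 
handler: onSyn(), Reply and the answerWhileDisabled switch are hypothetical 
names, and the gossipEnabled argument merely stands in for the 
Gossiper.instance.isEnabled() check mentioned above.

```java
import java.util.Optional;

final class SynGuardSketch {
    /** Hypothetical reply type; the real handler sends an ACK over the network. */
    enum Reply { ACK }

    /**
     * @param gossipEnabled       stands in for Gossiper.instance.isEnabled()
     * @param answerWhileDisabled hypothetical switch modelling the proposed relaxation
     * @return the reply, or empty if the request is silently dropped
     */
    public static Optional<Reply> onSyn(boolean gossipEnabled, boolean answerWhileDisabled) {
        if (!gossipEnabled && !answerWhileDisabled) {
            // Behaviour as described above: the request is discarded without a reply.
            return Optional.empty();
        }
        return Optional.of(Reply.ACK);
    }

    public static void main(String[] args) {
        System.out.println("current guard, gossip off: " + onSyn(false, false));
        System.out.println("relaxed guard, gossip off: " + onSyn(false, true));
    }
}
```

With the relaxed guard, a node that has not yet enabled gossip would still 
answer a SYN, so the simultaneous checks would no longer all time out.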



> Reject bootstrapping endpoints that are already in the ring with different 
> gossip data
> --------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-5571
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5571
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Rick Branson
>            Assignee: Tyler Hobbs
>             Fix For: 2.0.2
>
>         Attachments: 5571-2.0-v1.patch, 5571-2.0-v2.patch, 5571-2.0-v3.patch
>
>
> The ring can be silently broken by improperly bootstrapping an endpoint that 
> has an existing entry in the gossip table. In the case where a node attempts 
> to bootstrap with the same IP address as an existing ring member, the old 
> token metadata is dropped without warning, resulting in range shifts for the 
> cluster.
> This isn't so bad for non-vnode cases where, in general, tokens are 
> explicitly assigned, and a bootstrap on the same token would result in no 
> range shifts. For vnode cases, the convention is to just let nodes come up by 
> selecting their own tokens, and a bootstrap will override the existing tokens 
> for that endpoint.
> While there are some other issues open for adding an explicit rebootstrap 
> feature for vnode cases, given the changes in operator habits for vnode 
> rings, it seems a bit too easy to make this happen. Even more undesirable is 
> the fact that it's basically silent.
> This is a proposal for checking for this exact case: bootstraps on endpoints 
> with existing ring entries that have different hostIDs and/or tokens should 
> be rejected with an error message describing what happened and how to 
> override the safety check. It looks like the override can be supported using 
> the existing "nodetool removenode -force".
> I can work up a patch for this.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
