[jira] [Commented] (CASSANDRA-10134) Always require replace_address to replace existing address

Joel Knighton (JIRA) Wed, 20 Apr 2016 10:35:52 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-10134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250341#comment-15250341
 ]


Joel Knighton commented on CASSANDRA-10134:
-------------------------------------------

This looks good to me for the most part. CI is clean as expected, and I like 
the implementation choices.

A few very small nits/suggestions:
* In {{GossipDigestSynVerbHandler}}, the log "Ignoring GossipDigestSynMessage 
because currently in gossip shadow round" would be more descriptive as 
"Ignoring non-empty GossipDigestSynMessage..."
* In {{GossipDigestSynVerbHandler}}, the log "Received a shadow round syn from 
{}" never includes the source as a second argument.
* In {{StorageService.prepareToJoin}}, we call 
{{MessagingService.instance().listen()}} down either branch of {{if 
(replacing)}} (in either {{prepareReplacementInfo}} or 
{{checkForEndpointCollision}}, since both use a shadow round). This makes the 
call at the end of {{prepareToJoin}} always redundant - it seems clearer to me 
to move this further up in {{prepareToJoin}} and remove the call to 
{{MessagingService.instance().listen()}} from {{Gossiper.doShadowRound}}.
* Minor whitespace fix in {{StorageService.prepareReplacementInfo}} at 
{{(replaceAddress)== null}}.

I'm not sure yet about the MV change. In {{CassandraDaemon}}, skipping 
{{view.build()}} and {{view.updateDefinition}} seems safe to me, since the 
definitions will be up-to-date on construction and we submit a build of all 
views after {{StorageService}} initialization. It seems to me that MV builds 
that were to be submitted while gossip is stopped (via JMX or nodetool) will 
never have them submitted, which feels like it might surprise someone down the 
road. Am I missing something here?  It might be worthwhile to just reload all 
ViewManagers in {{StorageService.startGossiping}}. At the same time, I have a 
hard time thinking of a scenario in which this will actually affect someone.

> Always require replace_address to replace existing address
> ----------------------------------------------------------
>
>                 Key: CASSANDRA-10134
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10134
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Distributed Metadata
>            Reporter: Tyler Hobbs
>            Assignee: Sam Tunnicliffe
>             Fix For: 3.x
>
>
> Normally, when a node is started from a clean state with the same address as 
> an existing down node, it will fail to start with an error like this:
> {noformat}
> ERROR [main] 2015-08-19 15:07:51,577 CassandraDaemon.java:554 - Exception 
> encountered during startup
> java.lang.RuntimeException: A node with address /127.0.0.3 already exists, 
> cancelling join. Use cassandra.replace_address if you want to replace this 
> node.
>       at 
> org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:543)
>  ~[main/:na]
>       at 
> org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:783)
>  ~[main/:na]
>       at 
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:720)
>  ~[main/:na]
>       at 
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:611)
>  ~[main/:na]
>       at 
> org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:378) 
> [main/:na]
>       at 
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:537)
>  [main/:na]
>       at 
> org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:626) 
> [main/:na]
> {noformat}
> However, if {{auto_bootstrap}} is set to false or the node is in its own seed 
> list, it will not throw this error and will start normally.  The new node 
> then takes over the host ID of the old node (even if the tokens are 
> different), and the only message you will see is a warning in the other 
> nodes' logs:
> {noformat}
> logger.warn("Changing {}'s host ID from {} to {}", endpoint, storedId, 
> hostId);
> {noformat}
> This could cause an operator to accidentally wipe out the token information 
> for a down node without replacing it.  To fix this, we should check for an 
> endpoint collision even if {{auto_bootstrap}} is false or the node is a seed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (CASSANDRA-10134) Always require replace_address to replace existing address

Reply via email to