[
https://issues.apache.org/jira/browse/CASSANDRA-5916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Brandon Williams updated CASSANDRA-5916:
----------------------------------------
Attachment: 5916.txt
Here's my first (working) attempt at solving this. This patch disables
replace_[token,node] and adds a new replace_address. In some ways
replace_address seems more intuitive, but really we have to do it this way
because we're going to pull everything else we need out of gossip, and
endpoints are keyed by address.
We use a special gossip operation I'm calling 'shadow gossip', where we use a
generation of zero and do only a single half-round. This means we send an
empty SYN with our own blank digest to a seed, accept one ACK, and then stop the
gossip round there, so as not to perturb any existing state.
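To make the half-round concrete, here is a minimal toy sketch in Python. It is not the patch itself: the Digest, EndpointState, Seed and shadow_round names are hypothetical, and the assumption that a seed answers a generation-zero SYN with everything it knows is mine, not something stated above.

```python
from dataclasses import dataclass, field

SHADOW_GENERATION = 0  # generation zero marks the request as shadow gossip


@dataclass
class Digest:
    address: str
    generation: int
    version: int = 0


@dataclass
class EndpointState:
    generation: int
    app_states: dict = field(default_factory=dict)


class Seed:
    """Toy seed node; gossip state is keyed by endpoint address."""

    def __init__(self, states):
        self.states = states

    def handle_syn(self, digests):
        # Assumption: a shadow SYN is answered with all known state,
        # and the seed records nothing about the sender.
        if all(d.generation == SHADOW_GENERATION for d in digests):
            return dict(self.states)
        raise NotImplementedError("normal gossip rounds are out of scope")


def shadow_round(own_address, seed):
    """Send one blank digest, accept one ACK, then stop (no ACK2)."""
    syn = [Digest(own_address, SHADOW_GENERATION)]
    ack = seed.handle_syn(syn)
    return ack
```

Because the round stops after the ACK, nothing the replacing node knows is ever merged into the seed's view.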
From there we extract the original HOST_ID and tokens, and use those for the
replacement process. A catch here, though, is that once our gossiper actually
starts, we'll knock both the TOKENS state and the existing STATUS state (for
single token replacements) out with our newer, real generation, so if the
replace fails past this point, we can't retry. It may be possible to stay in
shadow gossip mode through all of the process to get around that (and just
remove the hibernate state), but I haven't tried this.
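The retry problem can be illustrated with a simplified toy model in Python. It assumes that an endpoint state with a higher generation replaces the older one wholesale; the merge function, the dictionary layout and the generation values are hypothetical and only approximate what the gossiper really does.

```python
def merge(local, remote):
    """Return the winning endpoint state: a higher generation replaces
    the older state wholesale (simplified; versions are ignored)."""
    if local is None or remote["generation"] > local["generation"]:
        return remote
    return local


# State the cluster holds for the address being replaced (made-up values).
old = {
    "generation": 1377000000,
    "states": {"HOST_ID": "h", "TOKENS": [10, 20], "STATUS": "NORMAL"},
}

# State announced once the replacing node's real gossiper starts.
new = {
    "generation": 1377100000,
    "states": {"HOST_ID": "h", "STATUS": "hibernate"},
}

cluster_view = merge(old, new)
# The original TOKENS are no longer in the cluster's view, so a later
# shadow round would have nothing to extract them from.
```

A generation-zero shadow state never wins this comparison, which is why staying in shadow mode would leave the original state intact.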
> gossip and tokenMetadata get hostId out of sync on failed replace_node with
> the same IP address
> -----------------------------------------------------------------------------------------------
>
> Key: CASSANDRA-5916
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5916
> Project: Cassandra
> Issue Type: Bug
> Reporter: Brandon Williams
> Assignee: Brandon Williams
> Fix For: 1.2.11
>
> Attachments: 5916.txt
>
>
> If you try to replace_node an existing, live hostId, it will error out.
> However, if you're using an existing IP to do this (as in, you chose the wrong
> uuid to replace by accident), then the newly generated hostId wipes out the
> old one in TMD, and when you do try to replace it, replace_node will complain
> that it does not exist. Examination of gossipinfo still shows the old hostId,
> but now you can't replace it either.