[jira] Updated: (CASSANDRA-572) handle old gossip properly

Jaakko Laine (JIRA) Thu, 03 Dec 2009 00:42:47 -0800

     [ 
https://issues.apache.org/jira/browse/CASSANDRA-572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Jaakko Laine updated CASSANDRA-572:
-----------------------------------

    Attachment: use-same-APstate-for-all-node-state-gossip.patch

OK, here's a patch that uses the same state name (NODE_STATE) to gossip all 
movement information. The format is (BOOTSTRAPPING|NORMAL|LEAVING|LEFT)|token.
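
To make the format above concrete, here is a minimal sketch of parsing and 
serializing such a value. It is not taken from the attached patch: the class 
name NodeStateValue, the Movement enum, and passing the delimiter as a 
parameter are all assumptions made for illustration, since the message does 
not say which character actually separates the state name from the token.

```java
import java.util.Objects;

// Hypothetical holder for one parsed NODE_STATE value; not Cassandra's actual class.
final class NodeStateValue {
    enum Movement { BOOTSTRAPPING, NORMAL, LEAVING, LEFT }

    final Movement movement;
    final String token;

    NodeStateValue(Movement movement, String token) {
        this.movement = Objects.requireNonNull(movement);
        this.token = Objects.requireNonNull(token);
    }

    // Splits "<MOVEMENT><delimiter><token>" at the first delimiter.
    // The delimiter is a parameter because the real separator is an assumption here.
    static NodeStateValue parse(String value, char delimiter) {
        int idx = value.indexOf(delimiter);
        if (idx <= 0 || idx == value.length() - 1) {
            throw new IllegalArgumentException("malformed node state: " + value);
        }
        // Movement.valueOf throws IllegalArgumentException for an unknown state name.
        Movement movement = Movement.valueOf(value.substring(0, idx));
        return new NodeStateValue(movement, value.substring(idx + 1));
    }

    // Inverse of parse: state name, delimiter, then the token.
    String serialize(char delimiter) {
        return movement.name() + delimiter + token;
    }
}
```

With a single state name carrying both the movement and the token, a receiver 
only has to look at one gossip entry to decide which handler to run.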

The main changes to the state machine caused by this modification are:
(1) When a node is bootstrapping, we should clear pending ranges for this 
endpoint, as well as remove it from token metadata. These checks are not 
strictly necessary (I think), but they help the transition from LEAVING -> 
BOOTSTRAPPING in case we missed LEFT due to a network partition.
(2) In handleStateLeaving and handleStateLeft, remove pending ranges for this 
endpoint before doing anything else. If we missed NORMAL, there might be 
obsolete pending ranges from BOOTSTRAPPING. A distant possibility, but a 
possibility nonetheless.
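
The two cleanup rules above can be sketched roughly as follows. This is a 
simplified illustration, not the patch itself: RingStateSketch, its maps, and 
the string-based endpoints, tokens, and ranges are hypothetical stand-ins for 
the real token metadata and pending-range structures. Only the handler names 
handleStateLeaving and handleStateLeft come from the message; 
handleStateBootstrapping is assumed by analogy.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified ring bookkeeping; names do not match Cassandra's classes.
final class RingStateSketch {
    final Map<String, String> tokenOf = new HashMap<>();            // endpoint -> token
    final Map<String, Set<String>> pendingRanges = new HashMap<>(); // endpoint -> pending ranges
    final Set<String> leaving = new HashSet<>();                    // endpoints known to be leaving

    void handleStateBootstrapping(String endpoint, String token) {
        // Rule (1): drop anything we still hold for this endpoint, in case LEFT was missed.
        pendingRanges.remove(endpoint);
        tokenOf.remove(endpoint);
        leaving.remove(endpoint);
        // Record a placeholder pending range for the token being bootstrapped.
        pendingRanges.computeIfAbsent(endpoint, e -> new HashSet<>()).add("pending:" + token);
    }

    void handleStateLeaving(String endpoint, String token) {
        // Rule (2): stale pending ranges from a bootstrap go first, in case NORMAL was missed.
        pendingRanges.remove(endpoint);
        tokenOf.putIfAbsent(endpoint, token);
        leaving.add(endpoint);
    }

    void handleStateLeft(String endpoint) {
        // Rule (2) again, then forget the endpoint entirely.
        pendingRanges.remove(endpoint);
        tokenOf.remove(endpoint);
        leaving.remove(endpoint);
    }
}
```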

The following additional check is not directly related to the gossip format 
change; the problem could occur even with the current model. It is a very 
unlikely event, but in a large (say, 200+ nodes) multi-DC cluster with lots of 
node movement, it could very well happen even with a relatively short DC-to-DC 
network outage:
(1) Added a check to handleStateLeaving and handleStateLeft for the case where 
a node has gone through a NORMAL -> LEAVING -> LEFT -> BOOTSTRAPPING -> NORMAL 
-> LEAVING [-> LEFT] movement cycle without us seeing the intermediate stages. 
In this case we have information for the old token, and now the node is leaving 
its _new_ token. We cannot simply assert that the tokens match, as this 
situation can legitimately occur.
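
One possible shape for such a check is sketched below. What the attached patch 
actually does on a mismatch is not shown in this message, so replacing the 
stale token is an assumption, as are the names StaleTokenCheck and 
reconcileOnLeaving.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: tolerate a token mismatch on LEAVING/LEFT instead of asserting.
final class StaleTokenCheck {
    final Map<String, String> tokenOf = new HashMap<>(); // endpoint -> last token we saw

    // Returns true if we held a different (stale) token for this endpoint and replaced it.
    boolean reconcileOnLeaving(String endpoint, String gossipedToken) {
        String known = tokenOf.get(endpoint);
        if (known != null && known.equals(gossipedToken)) {
            return false; // our view is current, nothing to fix
        }
        // Either we never saw the endpoint or we missed a full movement cycle:
        // trust the gossiped token rather than failing an assertion.
        tokenOf.put(endpoint, gossipedToken);
        return known != null;
    }
}
```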

Of course, this already touches on the question of which conditions we must 
handle ourselves and which should be left to operators. Some of these checks 
(like removing all references to the endpoint before continuing to handle 
bootstrapping) are questionable and might relax safety precautions, but if we 
do not do that, a modest 30s network outage might cause us to miss STATE_LEFT, 
and we'd end up with strange pending ranges.

I don't expect this patch to be included as is, but let's see what people 
think of this gossip change and then discuss which checks should be made :)


> handle old gossip properly
> --------------------------
>
>                 Key: CASSANDRA-572
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-572
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.5
>            Reporter: Jaakko Laine
>             Fix For: 0.5
>
>         Attachments: 572-handle-old-gossip.patch, 
> use-same-APstate-for-all-node-state-gossip.patch
>
>
> (1) If a node has been moving in the ring, further bootstraps by other nodes 
> will cause errors, as they handle STATE_LEAVING gossip without having such a 
> member in token metadata.
> (2) When a node bootstraps, it handles all endpoint states in the order they 
> happen to arrive. If the first one to arrive has moved in the past (that is, 
> it has STATE_LEAVING in its endpoint state), getNaturalEndpoint will throw an 
> ArrayIndexOutOfBounds exception, as sortedTokens.size() == 0.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
