[ 
https://issues.apache.org/jira/browse/CASSANDRA-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12781719#action_12781719
 ] 

Jaakko Laine commented on CASSANDRA-572:
----------------------------------------

I probably described the problem a bit vaguely, so here is another try:

Suppose all nodes in the cluster are running normally and none of them have 
moved. Their EP state includes STATE_BOOTSTRAPPING (if they were bootstrapped 
to the ring) and STATE_NORMAL in this order. Suppose there is nodeA, which gets 
loadbalanced. It goes through leaving, left and bootstrapping back to normal. 
After this its EP state includes (in this order!) LEAVING, LEFT, BOOTSTRAPPING, 
NORMAL. The important thing is that EP state can hold only one entry for each 
state, and other nodes will handle the entries in the order they were 
originally added. This is fine for nodes that were already in the ring, as 
they have seen the _old_ NORMAL state. However, if we ever want to bootstrap 
another node to the ring, it will cause errors, as the new node will start 
handling nodeA's states from LEAVING. It has no knowledge of nodeA before it 
handles NORMAL, so we must handle LEAVING and LEFT properly. That is, we must 
do nothing if we do not have knowledge of the node.
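
A minimal sketch of this rule in Java. The class RingView, its method names 
and the string-valued states are hypothetical simplifications for 
illustration; this is not Cassandra's actual code or the attached patch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model of one node's view of the ring.
class RingView {
    // endpoint -> token, for endpoints this node knows as ring members
    private final Map<String, Long> tokenMetadata = new HashMap<>();
    // record of state changes that were skipped as old state information
    private final List<String> ignored = new ArrayList<>();

    // Called once per state, in the order the states were originally added.
    void onStateChange(String endpoint, String state, long token) {
        switch (state) {
            case "NORMAL":
                tokenMetadata.put(endpoint, token);
                break;
            case "LEAVING":
            case "LEFT":
                if (!tokenMetadata.containsKey(endpoint)) {
                    // We have never seen this endpoint as a member, so this
                    // is old state information: do nothing.
                    ignored.add(state + ":" + endpoint);
                    return;
                }
                if (state.equals("LEFT")) {
                    tokenMetadata.remove(endpoint);
                }
                break;
            case "BOOTSTRAPPING":
                // Pending-range bookkeeping is omitted from this sketch.
                break;
            default:
                throw new IllegalArgumentException("unknown state: " + state);
        }
    }

    boolean isMember(String endpoint) {
        return tokenMetadata.containsKey(endpoint);
    }

    List<String> ignoredStates() {
        return ignored;
    }
}
```

A freshly bootstrapping node that replays nodeA's LEAVING, LEFT, 
BOOTSTRAPPING, NORMAL sequence would skip the first two states and only add 
nodeA to its ring view once NORMAL is handled.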

So this is not related to the new node serving requests, only to handling 
state gossip from other nodes properly. My term "old gossip" was obviously 
badly chosen; perhaps "old state information" would be more appropriate.


> handle old gossip properly
> --------------------------
>
>                 Key: CASSANDRA-572
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-572
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.5
>            Reporter: Jaakko Laine
>             Fix For: 0.5
>
>         Attachments: 572-handle-old-gossip.patch
>
>
> (1) If a node has been moving in the ring, further bootstraps by other nodes 
> will cause errors, as they are handling STATE_LEAVING gossip without having 
> such a member in token metadata.
> (2) When a node bootstraps, it handles all ep states in the order they happen 
> to arrive. If the first one to arrive has moved in the past (that is, it has 
> STATE_LEAVING in its ep state), getNaturalEndpoint will throw an 
> ArrayIndexOutOfBounds exception as sortedTokens.size() == 0.
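
To illustrate point (2), here is a rough Java sketch of a ring lookup over a 
sorted token list, with a guard for the empty case. TokenRing, addToken and 
primaryTokenFor are hypothetical names chosen for this example; the real 
getNaturalEndpoint code is not shown in this message and may differ.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical, simplified token ring; not Cassandra's implementation.
class TokenRing {
    private final List<Long> sortedTokens = new ArrayList<>();

    // Insert a token while keeping the list sorted and free of duplicates.
    void addToken(long token) {
        int i = Collections.binarySearch(sortedTokens, token);
        if (i < 0) {
            sortedTokens.add(-i - 1, token);
        }
    }

    // Returns the first token >= key, wrapping around to the smallest token.
    // Returns null when no tokens are known yet; without this guard the
    // lookup below would index into an empty list and throw.
    Long primaryTokenFor(long key) {
        if (sortedTokens.isEmpty()) {
            return null;
        }
        int i = Collections.binarySearch(sortedTokens, key);
        if (i < 0) {
            i = -i - 1; // insertion point: index of first token > key
        }
        if (i == sortedTokens.size()) {
            i = 0; // wrap around the ring
        }
        return sortedTokens.get(i);
    }
}
```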

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
