[ 
https://issues.apache.org/jira/browse/CASSANDRA-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785155#action_12785155
 ] 

Jaakko Laine commented on CASSANDRA-572:
----------------------------------------

Funny thing, I was just thinking about the same thing during breakfast. Have to 
eat more often :)

The problem with this is that handling state changes will become somewhat more 
complex as we must be prepared to handle transitions between any two states in 
any order. The current gossip model leaves a trace of what the node has done, and 
even in the face of network partitions we can "play back" the transitions when 
they eventually arrive. That is, if a node moves, we will still see LEAVING, 
LEFT, BOOTSTRAPPING and NORMAL and construct token metadata according to that. 
If we only have one value to represent a node's current state, we might go from, 
say, NORMAL to NORMAL, or even LEFT to LEAVING, without seeing any of the 
intermediate steps. Of course this can be done, but it needs extra care. I don't 
know how much, though. It might very well be that in the end this would be 
better than the current way.
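
As a rough sketch of what the single-value approach could look like (all names 
here, such as SingleStateRing and onState, are hypothetical and not Cassandra's 
actual classes): each update is treated as the whole truth about the node, so 
the handler first drops whatever it believed before and then rebuilds from the 
new state alone, without assuming anything about the previous state.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch only; not the actual gossip or token metadata code.
class SingleStateRing {
    enum State { BOOTSTRAPPING, NORMAL, LEAVING, LEFT }

    final Map<String, String> normalTokens = new HashMap<>();    // endpoint -> token
    final Map<String, String> bootstrapTokens = new HashMap<>(); // endpoint -> token
    final Set<String> leaving = new HashSet<>();

    // Apply the latest known state of an endpoint. The previous state may be
    // anything (or unknown), so nothing here depends on it.
    void onState(String endpoint, State state, String token) {
        // Clear transient bookkeeping from whatever state we saw earlier.
        bootstrapTokens.remove(endpoint);
        leaving.remove(endpoint);

        switch (state) {
            case BOOTSTRAPPING:
                normalTokens.remove(endpoint);
                bootstrapTokens.put(endpoint, token);
                break;
            case NORMAL:
                // Covers NORMAL -> NORMAL: the token is simply overwritten.
                normalTokens.put(endpoint, token);
                break;
            case LEAVING:
                // Only meaningful for a known ring member; otherwise ignore.
                if (normalTokens.containsKey(endpoint)) {
                    leaving.add(endpoint);
                }
                break;
            case LEFT:
                normalTokens.remove(endpoint);
                break;
        }
    }
}
```

With this shape, NORMAL to NORMAL is just a token update, and a LEAVING that 
arrives for an endpoint we do not know as a member (for example after LEFT) is 
dropped instead of causing an error.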

But even this would not remove the need to handle old application state 
correctly. If a node enters the ring when another node is just LEAVING or LEFT, 
that state will be the first one to be seen, and it must be ignored since there 
is nothing that can be done if NORMAL has not been seen. I think the root cause 
remains in either case, so we can't avoid fixing the symptoms that come with it.
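
One of those symptoms is the lookup failing when no tokens are known yet (the 
sortedTokens.size() == 0 case in the issue description). A minimal sketch of a 
defensive ring lookup follows; TokenLookup and successor are hypothetical names 
for illustration, and tokens are simplified to long values, which is not how 
the real code represents them.

```java
import java.util.Collections;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch only; not the actual getNaturalEndpoint implementation.
class TokenLookup {
    // Returns the first token >= key, wrapping around to the smallest token
    // at the end of the ring. Returns empty when no tokens are known yet,
    // instead of indexing into an empty list.
    static Optional<Long> successor(List<Long> sortedTokens, long key) {
        if (sortedTokens.isEmpty()) {
            return Optional.empty();
        }
        int i = Collections.binarySearch(sortedTokens, key);
        if (i < 0) {
            i = -i - 1; // binarySearch returns -(insertion point) - 1 on a miss
        }
        if (i == sortedTokens.size()) {
            i = 0; // wrap around the ring
        }
        return Optional.of(sortedTokens.get(i));
    }
}
```

The caller then has to decide what an empty result means, for example skipping 
the state change until NORMAL has been seen for at least one node.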

I'll try this out now that I'm working on the gossiping part anyway so we'll 
have some more insight on what it would look like.


> handle old gossip properly
> --------------------------
>
>                 Key: CASSANDRA-572
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-572
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.5
>            Reporter: Jaakko Laine
>             Fix For: 0.5
>
>         Attachments: 572-handle-old-gossip.patch
>
>
> (1) If a node has been moving in the ring, further bootstraps by other nodes 
> will cause errors, as they are handling STATE_LEAVING gossip without having 
> such a member in token metadata.
> (2) When a node bootstraps, it handles all ep states in the order they happen 
> to arrive. If the first one to arrive has moved in the past (that is, it has 
> STATE_LEAVING in its ep state), getNaturalEndpoint will throw an 
> ArrayIndexOutOfBoundsException, as sortedTokens.size() == 0.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
