[ 
https://issues.apache.org/jira/browse/CASSANDRA-15120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-15120:
---------------------------------
    Test and Documentation Plan: regression test included in patch 
                         Status: Patch Available  (was: Open)

An initial patch for 3.0 is available 
[here|https://github.com/belliottsmith/cassandra/tree/15120-3.0].  Some 
work is still needed to support switching the in-jvm dtest behaviour to 
disable/enable gossip and networking, as well as to port this capability to 
other versions of the dtests.

Since this work touches gossip, we need to take a great deal of care even 
though the patch appears simple.  I have attempted to verify the state transitions that 
may precede one of these presently broken events, so that I have confidence the 
patch does not degrade gossip's correctness, but this can only be a best effort 
without a significant investment in evaluating the correctness of gossip more 
holistically.  I also intend to do another round of analysis before we commit 
the patch.

> Nodes that join the ring while another node is MOVING build an invalid view 
> of the token ring
> ---------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-15120
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15120
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Cluster/Gossip, Cluster/Membership
>            Reporter: Benedict
>            Assignee: Benedict
>            Priority: Normal
>
> Gossip only updates the token metadata for nodes in the NORMAL, SHUTDOWN or 
> LEAVING* statuses.  MOVING and REMOVING_TOKEN nodes do not have their ring 
> information updated (nor do others, but these other states _should_ only be 
> taken by nodes that are not members of the ring).  
> If a node missed the most recent token-modifying events because it was not 
> a member of the ring when they happened (or because Gossip was delayed in 
> reaching it), it will retain an invalid view of the ring until the affected 
> node enters one of the NORMAL, SHUTDOWN or LEAVING states.
> *LEAVING is populated differently, however, and in a probably unsafe manner 
> that this work will also address.
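
The behaviour described above can be illustrated with a small, self-contained toy model. This is only a sketch under stated assumptions, not Cassandra's gossip code: the class `RingViewSketch`, its methods `onGossip` and `tokenOf`, the endpoint addresses and the token values are all hypothetical, and LEAVING is treated like NORMAL for simplicity even though the description notes it is populated differently.

```java
import java.util.EnumSet;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Toy model (hypothetical, NOT Cassandra's implementation) of a node's view
// of the token ring, refreshed only for certain gossip statuses.
public class RingViewSketch {
    enum Status { NORMAL, SHUTDOWN, LEAVING, MOVING, REMOVING_TOKEN }

    // Statuses for which this model updates token metadata, per the issue text.
    static final Set<Status> UPDATING =
        EnumSet.of(Status.NORMAL, Status.SHUTDOWN, Status.LEAVING);

    // endpoint -> token, as seen by the local node
    private final Map<String, Long> tokens = new HashMap<>();

    // Apply one gossip state change; MOVING and REMOVING_TOKEN are ignored.
    void onGossip(String endpoint, Status status, long token) {
        if (UPDATING.contains(status)) {
            tokens.put(endpoint, token);
        }
    }

    // Returns the known token for the endpoint, or null if none was recorded.
    Long tokenOf(String endpoint) {
        return tokens.get(endpoint);
    }

    public static void main(String[] args) {
        // A node that joins while 10.0.0.2 is MOVING.
        RingViewSketch joiner = new RingViewSketch();
        joiner.onGossip("10.0.0.1", Status.NORMAL, 100L);
        joiner.onGossip("10.0.0.2", Status.MOVING, 200L);

        System.out.println(joiner.tokenOf("10.0.0.1")); // 100
        System.out.println(joiner.tokenOf("10.0.0.2")); // null: ring view is incomplete

        // Only once the moving node reports NORMAL is the view repaired.
        joiner.onGossip("10.0.0.2", Status.NORMAL, 250L);
        System.out.println(joiner.tokenOf("10.0.0.2")); // 250
    }
}
```

In this model the joining node has no token recorded for the MOVING endpoint until that endpoint later reports a status in the updating set, which mirrors the window in which the real node's view of the ring would be invalid.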


