[ 
https://issues.apache.org/jira/browse/CASSANDRA-8336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-8336:
----------------------------------------
    Attachment: 8336-v2.txt

This patch helps, but the problem with this approach is that the node can 
still flap, given a disjoint enough (gossip state-wise) cluster.  There are a 
few ways we can solve this:

* Quarantine after shutdown.  This has the consequence of not being able to 
restart a node until the quarantine expires.

* Sleep for ring_delay or some interval after setting the shutdown state before 
sending the rpc shutdown.  I'm not 100% sure this would prevent the flapping, 
and sleeping that long on shutdown sucks just as much as not being able to 
reboot until the quarantine expires.

* Offline, Richard suggested a third way to me, which I'll discuss below.

With this method, when node X receives a shutdown event from Y, it updates 
its local state for Y to version Integer.MAX_VALUE, and thus no updates for 
the same generation will be accepted, since they will always have a lower 
version.  When Y restarts it will have a new generation and everything will 
work normally.
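
As a rough illustration of that rule (not the code in the attached patch), 
the sketch below models node X's view of Y.  The class and method names 
(ShutdownVersionSketch, applyRemoteState, onShutdown) are hypothetical, and 
the acceptance check is an assumed simplification: higher generation wins, 
and within a generation only a strictly higher version is applied.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of one node's local gossip state table.
// It is not Cassandra's Gossiper; it only illustrates the MAX_VALUE idea.
final class ShutdownVersionSketch {

    // Immutable (generation, version, alive) triple kept per endpoint.
    static final class EndpointState {
        final int generation;
        final int version;
        final boolean alive;

        EndpointState(int generation, int version, boolean alive) {
            this.generation = generation;
            this.version = version;
            this.alive = alive;
        }
    }

    private final Map<String, EndpointState> states = new HashMap<>();

    // Applies remote state only if it is newer than what we hold locally.
    // Returns true when the update was accepted (endpoint marked alive).
    boolean applyRemoteState(String endpoint, int generation, int version) {
        EndpointState local = states.get(endpoint);
        if (local != null) {
            if (generation < local.generation) {
                return false; // older generation: stale
            }
            if (generation == local.generation && version <= local.version) {
                return false; // same generation, not a newer version
            }
        }
        states.put(endpoint, new EndpointState(generation, version, true));
        return true;
    }

    // On a shutdown event, pin the version at Integer.MAX_VALUE so nothing
    // from the same generation can ever compare as newer.
    void onShutdown(String endpoint) {
        EndpointState local = states.get(endpoint);
        if (local == null) {
            return;
        }
        states.put(endpoint,
                new EndpointState(local.generation, Integer.MAX_VALUE, false));
    }

    boolean isAlive(String endpoint) {
        EndpointState local = states.get(endpoint);
        return local != null && local.alive;
    }
}
```

In this model, a late update for Y from a third node with the same 
generation is rejected after the shutdown, while Y's first message after a 
restart (new generation) is accepted.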

There is one consequence to this method: gossipdisable/enable now has to 
generate a new generation, which triggers the "has restarted, now UP" message 
on other nodes, but this seems like a fairly minor thing.
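
To see why re-enabling gossip needs a new generation, here is a minimal 
sketch of the local side.  The names (LocalGossipSketch, enableGossip, 
disableGossip) and the way the generation is derived from a clock in seconds 
are assumptions for illustration, not the patch itself.  Peers that pinned 
this node's version at Integer.MAX_VALUE would ignore anything sent under the 
old generation, so the generation has to move forward.

```java
import java.util.function.LongSupplier;

// Hypothetical model of a node's own heartbeat state across
// gossip disable/enable.
final class LocalGossipSketch {
    private final LongSupplier clockSeconds; // injected so the sketch is testable
    private int generation;
    private int version;
    private boolean gossipEnabled = true;

    LocalGossipSketch(LongSupplier clockSeconds) {
        this.clockSeconds = clockSeconds;
        this.generation = (int) clockSeconds.getAsLong();
        this.version = 0;
    }

    // Each gossip round bumps the version within the current generation.
    void heartbeat() {
        if (gossipEnabled) {
            version++;
        }
    }

    // Disabling gossip is where the shutdown state would be announced.
    void disableGossip() {
        gossipEnabled = false;
    }

    // Re-enabling starts a strictly newer generation and resets the version,
    // so peers treat the node as restarted rather than as stale.
    void enableGossip() {
        generation = Math.max(generation + 1, (int) clockSeconds.getAsLong());
        version = 0;
        gossipEnabled = true;
    }

    int generation() {
        return generation;
    }

    int version() {
        return version;
    }
}
```

Taking the maximum of the clock and generation + 1 is a design choice in this 
sketch so that a quick disable/enable within the same second still yields a 
higher generation.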

On the surface, it may seem easier to have Y just send with a version of 
MAX_VALUE.  However, that would only apply to nodes that receive it via 
gossip, not the ones that receive it via rpc, which is likely the bulk of 
them.  It wouldn't be an optimization anyway, since we only sleep for one 
gossip round, and the node(s) we gossip to will set the version before 
propagating it to the rest of the cluster.

v2 does this.

> Quarantine nodes after receiving the gossip shutdown message
> ------------------------------------------------------------
>
>                 Key: CASSANDRA-8336
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8336
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Brandon Williams
>            Assignee: Brandon Williams
>             Fix For: 2.0.13
>
>         Attachments: 8336-v2.txt, 8336.txt
>
>
> In CASSANDRA-3936 we added a gossip shutdown announcement.  The problem here 
> is that this isn't sufficient; you can still get TOEs and have to wait on the 
> FD to figure things out.  This happens due to gossip propagation time and 
> variance; if node X shuts down and sends the message to Y, but Z has a 
> greater gossip version than Y for X and has not yet received the message, it 
> can initiate gossip with Y and thus mark X alive again.  I propose 
> quarantining to solve this, however I feel it should be a -D parameter you 
> have to specify, so as not to destroy current dev and test practices, since 
> this will mean a node that shuts down will not be able to restart until the 
> quarantine expires.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
