Brandon Williams created CASSANDRA-8336:
-------------------------------------------
Summary: Quarantine nodes after receiving the gossip shutdown message
Key: CASSANDRA-8336
URL: https://issues.apache.org/jira/browse/CASSANDRA-8336
Project: Cassandra
Issue Type: Bug
Components: Core
Reporter: Brandon Williams
Assignee: Brandon Williams
Fix For: 2.0.12
In CASSANDRA-3936 we added a gossip shutdown announcement. The problem is that
this isn't sufficient: you can still get TimedOutExceptions (TOEs) and have to
wait on the failure detector (FD) to figure things out. This happens due to
gossip propagation time and variance: if node X shuts down and sends the
message to Y, but Z has a greater gossip version than Y for X and has not yet
received the message, Z can initiate gossip with Y and thus mark X alive
again. I propose quarantining to solve this. However, I feel it should be a -D
parameter you have to specify, so as not to destroy current dev and test
practices, since this will mean a node that shuts down will not be able to
restart until the quarantine expires.
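
To make the race concrete, here is a minimal sketch of the scenario described
above. It is a toy model, not Cassandra's Gossiper: the Node and EndpointState
classes, the quarantine_secs knob, and the version numbers are all hypothetical
and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class EndpointState:
    version: int   # gossip version a node holds for some endpoint
    alive: bool    # whether that endpoint is believed to be up


@dataclass
class Node:
    name: str
    quarantine_secs: float = 0.0  # hypothetical knob; 0 disables quarantine
    states: dict = field(default_factory=dict)             # endpoint -> EndpointState
    quarantined_until: dict = field(default_factory=dict)  # endpoint -> expiry time

    def on_shutdown_message(self, endpoint: str, now: float) -> None:
        """A peer announced shutdown: mark it dead, optionally quarantine it."""
        self.states[endpoint].alive = False
        if self.quarantine_secs > 0:
            self.quarantined_until[endpoint] = now + self.quarantine_secs

    def on_gossip(self, endpoint: str, remote: EndpointState, now: float) -> None:
        """Apply another node's view of `endpoint` if it is newer than ours."""
        if now < self.quarantined_until.get(endpoint, 0.0):
            return  # quarantined: ignore state that would resurrect the endpoint
        if remote.version > self.states[endpoint].version:
            self.states[endpoint] = EndpointState(remote.version, remote.alive)


def simulate(quarantine_secs: float) -> bool:
    """Return whether Y believes X is alive after Z gossips its stale view."""
    y = Node("Y", quarantine_secs)
    y.states["X"] = EndpointState(version=5, alive=True)
    # Z holds a greater version for X but has not seen the shutdown message.
    z_view_of_x = EndpointState(version=7, alive=True)
    y.on_shutdown_message("X", now=0.0)
    y.on_gossip("X", z_view_of_x, now=1.0)
    return y.states["X"].alive
```

In this model, simulate(0.0) shows Y marking X alive again, while
simulate(30.0) keeps X down. The same check that protects Y would also reject
X's own gossip if X restarted before the quarantine expired, which is the
restart cost noted above.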
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)