John Knost created CASSANDRA-14577:
--------------------------------------
Summary: gc_grace_seconds should include UJ (Up/Joining) status period
Key: CASSANDRA-14577
URL: https://issues.apache.org/jira/browse/CASSANDRA-14577
Project: Cassandra
Issue Type: Improvement
Components: Coordination
Environment: Issue observed in environment running Contrail 3.0.3.3-22/Cassandra 2.1.13
Reporter: John Knost
Partial network connectivity (e.g. an MTU mismatch that blackholes jumbo
frames) can leave a node stuck indefinitely in UJ (Up/Joining) status, as
reported by nodetool. The node can remain in this state for an extended period
of time. Once the network is repaired and the isolated node rejoins, it can
cause extensive data loss on the healthy nodes.
If the node were completely isolated, gc_grace_seconds would prevent it from
joining after the specified period. Time spent in UJ should be treated the same
way, and other corner cases besides "DN" should be covered if applicable.
Reference:
JTAC 2018-0303-0029
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)