[jira] [Created] (CASSANDRA-16182) A replacement node, although completed bootstrap and joined ring according to itself, may be stuck in Joining state as per the peers

Sumanth Pasupuleti (Jira) Mon, 05 Oct 2020 09:17:17 -0700

Sumanth Pasupuleti created CASSANDRA-16182:
----------------------------------------------


             Summary: A replacement node, although completed bootstrap and 
joined ring according to itself, may be stuck in Joining state as per the peers
                 Key: CASSANDRA-16182
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16182
             Project: Cassandra
          Issue Type: Bug
          Components: Cluster/Gossip
            Reporter: Sumanth Pasupuleti


This issue occurred in a production 3.0.21 cluster.

Here is what happened:
# We had, say, a three-node Cassandra cluster with nodes A, B and C.
# C got "terminated" due to a health check failure, and a replacement node C' 
was launched.
# C' started bootstrapping data from its neighbors.
# Network flaw: nodes A and B were still able to communicate with the 
terminated node C and consequently still considered C alive.
# The replacement node C' learnt about C through gossip but was unable to 
communicate with C, so it marked C as DOWN.
# C' completed bootstrapping successfully, and both C' and its peers logged the 
statement "Node C' will complete replacement of C for tokens 
[-7686143363672898397]".
# C' logged the statement "Nodes C' and C have the same token 
-7686143363672898397. C' is the new owner".
# C' started listening for Thrift and CQL clients.
# Peer nodes A and B logged "Node C' cannot complete replacement of alive node 
C".
# A few seconds later, A and B marked C' as DOWN

C' continued to log the lines below:

{code:java}
Node C is now part of the cluster
Nodes () and C' have the same token C.  Ignoring -7686143363672898397 (Needs a 
log statement fix)
FatClient C has been silent for 30000ms, removing from gossip
{code}


My reasoning of what happened: by the time the replacement node (C') finished 
bootstrapping and announced its state as Normal, A and B were still able to 
communicate with the replaced node C (while C' was not), and hence rejected C' 
replacing C. C' does not know this and does not attempt to re-announce its 
"Normal" state to the rest of the cluster. (Worth noting that A and B marked C 
as DOWN soon after.)
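
To make the peer-side reasoning concrete, here is a minimal toy model of how a peer such as A or B might handle the replacement announcement. This is not Cassandra's actual code: the class PeerViewSketch, the method onNormalAnnouncement and the alive/ring fields are all hypothetical names invented for illustration, and the only behaviour assumed is what the log lines above suggest (a replacement of a node the peer still sees as alive is rejected).

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model only; none of these names exist in Cassandra.
final class PeerViewSketch {
    enum Status { JOINING, NORMAL }

    /** Hypothetical stand-in for the view held by a peer such as A or B. */
    static final class Peer {
        final Set<String> alive = new HashSet<>();        // endpoints this peer believes are up
        final Map<String, Status> ring = new HashMap<>(); // this peer's view of ring membership

        /**
         * Handles "replacement announces Normal while replacing replaced".
         * Returns true if the peer accepts the replacement.
         */
        boolean onNormalAnnouncement(String replacement, String replaced) {
            ring.putIfAbsent(replacement, Status.JOINING);
            if (alive.contains(replaced)) {
                // Mirrors "cannot complete replacement of alive node":
                // the replacement stays in JOINING from this peer's view.
                return false;
            }
            ring.remove(replaced);
            ring.put(replacement, Status.NORMAL);
            return true;
        }
    }

    public static void main(String[] args) {
        Peer a = new Peer();
        a.alive.add("C");
        a.ring.put("C", Status.NORMAL);

        // First announcement arrives while A still considers C alive.
        System.out.println("accepted: " + a.onNormalAnnouncement("C'", "C"));
        System.out.println("C' status on A: " + a.ring.get("C'"));

        // A marks C as DOWN shortly afterwards, but nothing is re-announced,
        // so in this model C' would remain JOINING unless it announces again.
        a.alive.remove("C");
        System.out.println("C' status on A: " + a.ring.get("C'"));
    }
}
```

In this model the peer's view only changes when an announcement arrives, which is why a single rejected announcement is enough to leave C' in Joining indefinitely.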

Gossip keeps telling C' to add C to its metadata, and each time C' eventually 
removes C again based on the FailureDetector.

Proposed fix:
When C' is notified through gossip about C, given that both own the same token 
and that C' has finished bootstrapping, C' can emit its Normal state again. In 
my opinion this should fix the issue, so long as A and B have marked C as DOWN 
(which they eventually did).
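
A rough sketch of the proposed condition, written as a self-contained toy model and not as a patch against Cassandra. The names ReannounceSketch, ReplacementNode, onGossipAbout and announceNormal are hypothetical, and the announcements list merely stands in for whatever mechanism actually publishes the Normal state through gossip.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only; none of these names exist in Cassandra.
final class ReannounceSketch {

    /** Hypothetical stand-in for the replacement node C'. */
    static final class ReplacementNode {
        final String name;
        final long token;
        boolean bootstrapComplete = false;
        // Records every Normal announcement this node has emitted.
        final List<String> announcements = new ArrayList<>();

        ReplacementNode(String name, long token) {
            this.name = name;
            this.token = token;
        }

        void finishBootstrap() {
            bootstrapComplete = true;
            announceNormal();
        }

        void announceNormal() {
            announcements.add(name + " NORMAL " + token);
        }

        /**
         * Called when gossip reports another endpoint together with its token.
         * Re-announces Normal only if this node has finished bootstrapping and
         * the reported endpoint claims the same token. Returns true if it did.
         */
        boolean onGossipAbout(String endpoint, long endpointToken) {
            boolean otherEndpoint = !endpoint.equals(name);
            boolean sameToken = endpointToken == token;
            if (bootstrapComplete && otherEndpoint && sameToken) {
                announceNormal();
                return true;
            }
            return false;
        }
    }

    public static void main(String[] args) {
        long token = -7686143363672898397L;
        ReplacementNode cPrime = new ReplacementNode("C'", token);
        cPrime.finishBootstrap();

        // Gossip tells C' about the old node C, which owns the same token.
        boolean reannounced = cPrime.onGossipAbout("C", token);
        System.out.println("re-announced: " + reannounced);
        System.out.println("announcements: " + cPrime.announcements);
    }
}
```

Because gossip keeps re-introducing C to C', this condition would fire repeatedly until the peers accept the replacement, so a real change would probably need some form of rate limiting; that part is not covered by the sketch.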



