Joel Knighton created CASSANDRA-13700:
-----------------------------------------

             Summary: Heartbeats can cause gossip information to go permanently 
missing on certain nodes
                 Key: CASSANDRA-13700
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13700
             Project: Cassandra
          Issue Type: Bug
          Components: Distributed Metadata
            Reporter: Joel Knighton
            Assignee: Joel Knighton
            Priority: Critical


In {{Gossiper.getStateForVersionBiggerThan}}, we add the {{HeartBeatState}} 
from the corresponding {{EndpointState}} to the {{EndpointState}} to send. When 
we're getting state for ourselves, this means that we add a reference to the 
live local {{HeartBeatState}} rather than a copy. Then, once we've built a 
message (an Ack in the Syn handler, or an Ack2 in the Ack handler), we send it 
through the {{MessagingService}}. If the {{MessagingService}} is sufficiently 
slow, the {{GossipTask}} may run before the Ack or Ack2 is serialized. When the 
{{GossipTask}} acquires the gossip {{taskLock}}, it may increment the 
{{HeartBeatState}} version of the local node as stored in the endpoint state 
map. Then, when we finally serialize the Ack or Ack2, we'll follow the 
reference to the {{HeartBeatState}} and serialize it with a higher version than 
we saw when constructing the message.

Consider the case where we see {{HeartBeatState}} with version 4 when 
constructing an Ack and send it through the {{MessagingService}}. Then, we add 
some piece of state with version 5 to our local {{EndpointState}}. If 
{{GossipTask}} runs and increases the {{HeartBeatState}} version to 6 before 
the {{MessageOut}} containing the Ack is serialized, the node receiving the Ack 
will believe it is current to version 6, despite the fact that it has never 
received a message containing the {{ApplicationState}} tagged with version 5.
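
The version 4/5/6 scenario above can be sketched in isolation. This is a 
minimal, single-threaded illustration, not Cassandra code: {{HeartBeatSketch}} 
and {{GossipAliasingSketch}} are hypothetical stand-ins, and the "copy before 
send" branch is only one possible way to avoid the aliasing, not necessarily 
the fix adopted for this ticket.

```java
// Hypothetical, simplified stand-in for a heartbeat holder; not Cassandra's class.
final class HeartBeatSketch {
    private volatile int version;

    HeartBeatSketch(int version) { this.version = version; }

    int getVersion() { return version; }

    void setVersion(int version) { this.version = version; }

    // Snapshot of the current version, detached from later updates.
    HeartBeatSketch copy() { return new HeartBeatSketch(version); }
}

final class GossipAliasingSketch {
    // Returns the heartbeat version that ends up "on the wire" when
    // serialization happens after the local heartbeat has been bumped.
    static int serializedVersion(boolean copyBeforeSend) {
        // Local heartbeat is at version 4 when the Ack is built.
        HeartBeatSketch local = new HeartBeatSketch(4);

        // Building the message: either alias the live object (the behaviour
        // described in this report) or take a snapshot (assumed mitigation).
        HeartBeatSketch inMessage = copyBeforeSend ? local.copy() : local;

        // Meanwhile, application state tagged with version 5 is added locally
        // (not modelled here), and the periodic task bumps the heartbeat to 6.
        local.setVersion(6);

        // Deferred serialization reads whatever the message references now.
        return inMessage.getVersion();
    }

    public static void main(String[] args) {
        System.out.println("aliased reference serializes version " + serializedVersion(false));
        System.out.println("snapshot copy serializes version " + serializedVersion(true));
    }
}
```

With the aliased reference, the receiver is told version 6 even though the 
message was built against version 4, so it never asks for the state tagged 
with version 5. With the snapshot, the serialized version matches what was 
observed at construction time.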

I've reproduced this in several versions; so far, I believe it is possible 
in all versions.


