[ 
https://issues.apache.org/jira/browse/CASSANDRA-13700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16094744#comment-16094744
 ] 

Jason Brown commented on CASSANDRA-13700:
-----------------------------------------

[~jkni] Fantastic debugging here, Joel. We have seen this problem, as well, 
with missing STATUS and TOKENS entries.

I followed this through, and I believe you are correct. Just to point out 
(because I had to dig and reason through it), the key problem is (as Joel 
points out) the shared mutable state of {{HeartBeatState}}. In 
{{Gossiper.getStateForVersionBiggerThan}}, the local node builds up the 
{{Map<ApplicationState, VersionedValue> states}} about itself. If any states 
are added after the function returns *and* the heartbeat is incremented before 
serialization, the peer will get the updated heartbeat value but not the 
updated states (as the set of states for the local node that we're sending 
over was already constructed prior to serialization).
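
To make the interleaving concrete, here is a minimal single-threaded sketch 
that replays the steps in the problematic order. The classes {{Heartbeat}} and 
{{Outbound}} are simplified, hypothetical stand-ins, not Cassandra's actual 
classes, and the version numbers are only illustrative.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-ins for the gossip classes; all names here are hypothetical.
public class SharedHeartbeatSketch {
    // Mutable heartbeat, playing the role of the shared HeartBeatState.
    static final class Heartbeat {
        int version;
        Heartbeat(int version) { this.version = version; }
    }

    // What a queued message holds: a heartbeat reference plus a states snapshot.
    static final class Outbound {
        final Heartbeat heartbeat;          // a reference, NOT a copy
        final Map<String, Integer> states;  // state name -> version, copied at build time
        Outbound(Heartbeat heartbeat, Map<String, Integer> states) {
            this.heartbeat = heartbeat;
            this.states = states;
        }
        // "Serialization" reads the heartbeat at send time, not at build time.
        String serialize() {
            return "hb=" + heartbeat.version + " states=" + states;
        }
    }

    static String demo() {
        Heartbeat local = new Heartbeat(4);
        Map<String, Integer> localStates = new HashMap<>();

        // 1. Build the message: the states are copied, the heartbeat is shared.
        Outbound msg = new Outbound(local, new HashMap<>(localStates));

        // 2. A new state is added locally after the message was built.
        localStates.put("STATUS", 5);

        // 3. The periodic gossip task bumps the heartbeat before serialization.
        local.version = 6;

        // 4. The peer is told heartbeat 6 but never receives STATUS at version 5.
        return msg.serialize();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "hb=6 states={}"
    }
}
```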

Off the top of my head, I think there are at least two possible ways to fix 
this:

- clone the {{HeartBeatState}} when constructing the {{EndpointState}} to 
return from {{Gossiper.getStateForVersionBiggerThan}}. That way it's not 
referencing mutable heartbeat state.
- execute the {{GossipTask}} on the same thread on which we receive the gossip 
syn/ack/ack2 messages (the {{Stage.GOSSIP}} thread). That way we force 
(almost) all references to gossip's shared mutable state onto one thread.
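
A rough sketch of the first option, assuming a copy method on the heartbeat. 
{{Heartbeat.copy()}}, {{EndpointSnapshot}} and {{snapshotNewerThan}} are 
hypothetical names used for illustration; this is not the actual patch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins; not Cassandra's real classes or method signatures.
public class HeartbeatCopySketch {
    static final class Heartbeat {
        final int generation;
        int version;
        Heartbeat(int generation, int version) {
            this.generation = generation;
            this.version = version;
        }
        // Detached copy: later increments of the live heartbeat do not affect it.
        Heartbeat copy() { return new Heartbeat(generation, version); }
    }

    static final class EndpointSnapshot {
        final Heartbeat heartbeat;
        final Map<String, Integer> states; // state name -> version
        EndpointSnapshot(Heartbeat heartbeat, Map<String, Integer> states) {
            this.heartbeat = heartbeat;
            this.states = states;
        }
        String serialize() {
            return "hb=" + heartbeat.version + " states=" + states;
        }
    }

    // Loosely mirrors the idea of getStateForVersionBiggerThan for the local node:
    // collect states newer than `version` and attach a COPY of the heartbeat.
    static EndpointSnapshot snapshotNewerThan(Heartbeat live,
                                              Map<String, Integer> liveStates,
                                              int version) {
        Map<String, Integer> newer = new HashMap<>();
        for (Map.Entry<String, Integer> e : liveStates.entrySet()) {
            if (e.getValue() > version) {
                newer.put(e.getKey(), e.getValue());
            }
        }
        return new EndpointSnapshot(live.copy(), newer);
    }

    static String demo() {
        Heartbeat live = new Heartbeat(1, 4);
        Map<String, Integer> liveStates = new HashMap<>();

        EndpointSnapshot snap = snapshotNewerThan(live, liveStates, 0);

        liveStates.put("STATUS", 5); // state added after the snapshot
        live.version = 6;            // heartbeat bumped before serialization

        // The snapshot still reports heartbeat 4, consistent with its states.
        return snap.serialize();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "hb=4 states={}"
    }
}
```

With the copy, the peer is told heartbeat 4 and will ask again for anything 
newer, so the state at version 5 can still be delivered in a later round.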

The first option is simpler, smaller in scope, and certainly safer.
The second option has performance implications: if the 
{{GossipTask}} takes a while to execute, we could start backing up the 
tasks on the stage. This option, though, has the possibility of eliminating 
more of the state race bugs that we seem to continually uncover as time goes 
on. (Side note: there are still some updates to local Gossip state from the 
main thread (via {{StorageService}}) at startup, and the response to the 
{{EchoMessage}} is on the wrong thread, as well.)
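
For illustration, the second option amounts to funnelling both the periodic 
task and the message handling through one single-threaded executor. The 
sketch below uses a plain {{java.util.concurrent}} executor as a stand-in for 
the gossip stage; the class and method names are hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: one thread owns all gossip state, so no locking is needed.
public class SingleThreadGossipSketch {
    // Stand-in for the gossip stage: a single worker thread with a FIFO queue.
    private final ExecutorService gossipStage = Executors.newSingleThreadExecutor();

    // Only ever read or written on the gossipStage thread.
    private int heartbeatVersion = 0;

    // Stand-in for the periodic gossip task: bump the heartbeat on the stage.
    void scheduleHeartbeat() {
        gossipStage.execute(() -> heartbeatVersion++);
    }

    // Stand-in for building a reply: the read also happens on the stage,
    // so it cannot interleave with a heartbeat increment.
    Future<Integer> readHeartbeatOnStage() {
        return gossipStage.submit(() -> heartbeatVersion);
    }

    static int demo() {
        SingleThreadGossipSketch gossip = new SingleThreadGossipSketch();
        try {
            gossip.scheduleHeartbeat();
            gossip.scheduleHeartbeat();
            gossip.scheduleHeartbeat();
            // Queued after the three increments, so it observes all of them.
            return gossip.readHeartbeatOnStage().get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            gossip.gossipStage.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 3
    }
}
```

The cost mentioned above is visible here too: a slow task occupies the only 
worker thread, and everything queued behind it has to wait.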

Joel, can you share how you were able to reproduce this?

> Heartbeats can cause gossip information to go permanently missing on certain 
> nodes
> ----------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-13700
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13700
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Distributed Metadata
>            Reporter: Joel Knighton
>            Assignee: Joel Knighton
>            Priority: Critical
>
> In {{Gossiper.getStateForVersionBiggerThan}}, we add the {{HeartBeatState}} 
> from the corresponding {{EndpointState}} to the {{EndpointState}} to send. 
> When we're getting state for ourselves, this means that we add a reference to 
> the local {{HeartBeatState}}. Then, once we've built a message (in either the 
> Syn or Ack handler), we send it through the {{MessagingService}}. In the case 
> that the {{MessagingService}} is sufficiently slow, the {{GossipTask}} may 
> run before serialization of the Syn or Ack. This means that when the 
> {{GossipTask}} acquires the gossip {{taskLock}}, it may increment the 
> {{HeartBeatState}} version of the local node as stored in the endpoint state 
> map. Then, when we finally serialize the Syn or Ack, we'll follow the 
> reference to the {{HeartBeatState}} and serialize it with a higher version 
> than we saw when constructing the Ack or Ack2.
> Consider the case where we see {{HeartBeatState}} with version 4 when 
> constructing an Ack and send it through the {{MessagingService}}. Then, we 
> add some piece of state with version 5 to our local {{EndpointState}}. If 
> {{GossipTask}} runs and increases the {{HeartBeatState}} version to 6 before 
> the {{MessageOut}} containing the Ack is serialized, the node receiving the 
> Ack will believe it is current to version 6, despite the fact that it has 
> never received a message containing the {{ApplicationState}} tagged with 
> version 5.
> I've reproduced this in several versions; so far, I believe this is 
> possible in all versions.
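
The receiving side of the version 4/5/6 example above can be sketched as 
follows, assuming (as a simplification) that a node only sends states whose 
version is greater than the maximum version the peer reports. 
{{statesNewerThan}} and {{delta}} are hypothetical helpers, not Cassandra's API.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of version-based delta selection between two nodes.
public class MissingStateSketch {
    // Return only the states the peer should not have seen yet.
    static Map<String, Integer> statesNewerThan(Map<String, Integer> local,
                                                int peerMaxVersion) {
        Map<String, Integer> newer = new TreeMap<>();
        for (Map.Entry<String, Integer> e : local.entrySet()) {
            if (e.getValue() > peerMaxVersion) {
                newer.put(e.getKey(), e.getValue());
            }
        }
        return newer;
    }

    // Local node holds one piece of state tagged with version 5.
    static Map<String, Integer> delta(int peerMaxVersion) {
        Map<String, Integer> local = new TreeMap<>();
        local.put("STATUS", 5);
        return statesNewerThan(local, peerMaxVersion);
    }

    public static void main(String[] args) {
        // Peer correctly believes it is current to version 4: STATUS is sent.
        System.out.println(delta(4)); // prints "{STATUS=5}"
        // Peer wrongly believes it is current to version 6: nothing is sent.
        System.out.println(delta(6)); // prints "{}"
    }
}
```

In this sketch, once the peer reports version 6, the state tagged with 
version 5 is never selected for sending again.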


