James Baker created CASSANDRA-15355:
---------------------------------------

             Summary: Schema push/pull race on continuous data changes
                 Key: CASSANDRA-15355
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15355
             Project: Cassandra
          Issue Type: Bug
            Reporter: James Baker


In https://issues.apache.org/jira/browse/CASSANDRA-5025, pull-based schema 
updates were scheduled to run 1 minute after a schema change first became 
visible, so as to prefer the push codepath as much as possible.

Unfortunately, this does not handle the case where many schema changes happen 
in quick succession. Imagine a scenario where we create a table every 5 seconds 
for 2 minutes: the first pull tasks execute 60 seconds in, while later changes 
are still being made, so the schemas may well be out of sync between nodes at 
that point.


In this case, there is some chance that when the task runs, the schemas are out 
of sync because a subsequent schema update has occurred in the meantime, and so 
the same push/pull race that the delay was meant to avoid happens anyway.

A fix is to modify the codepath such that the scheduled task is only run if the 
other node's schema version is the same as when the task was scheduled. A 
different (later scheduled) task should run otherwise.
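
The proposed guard could be sketched roughly as follows. This is a minimal 
illustration, not the actual Cassandra code: the class GuardedSchemaPull, its 
method onRemoteVersion, and the pullFrom callback are hypothetical names, and 
the check against the local schema version is omitted for brevity.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

/** Hypothetical sketch of the proposed guard; not Cassandra's real classes. */
final class GuardedSchemaPull {
    // Most recent schema version advertised by each endpoint.
    private final Map<String, UUID> latestRemoteVersion = new ConcurrentHashMap<>();
    // Assumed callback that performs the actual schema pull from an endpoint.
    private final Consumer<String> pullFrom;

    GuardedSchemaPull(Consumer<String> pullFrom) {
        this.pullFrom = pullFrom;
    }

    /**
     * Records the version an endpoint now advertises and returns the task
     * that the caller is expected to schedule with a delay.
     */
    Runnable onRemoteVersion(String endpoint, UUID version) {
        latestRemoteVersion.put(endpoint, version);
        return () -> {
            // Pull only if the endpoint still advertises the version this task
            // was created for. Otherwise a later task owns the pull.
            if (version.equals(latestRemoteVersion.get(endpoint))) {
                pullFrom.accept(endpoint);
            }
        };
    }
}
```

A caller would hand the returned Runnable to something like 
ScheduledExecutorService.schedule(task, 1, TimeUnit.MINUTES). With a schema 
change every 5 seconds, every task except the one created for the final 
version becomes a no-op, so the pull only happens once the remote version has 
been stable for the full delay.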

In our clusters, we see that when there is a reasonably large number of 
changes, a few schema changes can cause our nodes to run out of memory and 
crash. This change stops that.



