[ 
https://issues.apache.org/jira/browse/CASSANDRA-13441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15966726#comment-15966726
 ] 

mck commented on CASSANDRA-13441:
---------------------------------

I suspect we'll see a number of people doing 2.1.x and 2.2.x upgrades to 3.11.x 
(especially the bigger clusters after a few patch releases on 3.11), long 
before we see many upgrading to 4.0.x.

Why not slate this for 3.11.x?

> Schema version uses built-in digest which includes timestamps, causing 
> migration storms
> ---------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-13441
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13441
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Schema
>            Reporter: Jeff Jirsa
>            Assignee: Jeff Jirsa
>             Fix For: 4.x
>
>
> In versions < 3.0, schema was essentially deterministic - a given schema 
> always hashed to the same version, so during a rolling upgrade (say 2.0 -> 
> 2.1), the first node to upgrade to 2.1 would add the new tables, setting the 
> new 2.1 version ID, and subsequently upgraded hosts would settle on that 
> version.
> In 3.0, we delegate the digest calculation to the post-8099 data structures, 
> which are the same digest calculators used in the read path for digest 
> match/mismatch - which means it includes timestamps (and ttls).
> Since schema will never use TTL, we don't care about TTL fields. Similarly, 
> when a 3.0 node upgrades and writes its own new-in-3.0 system tables, it'll 
> write the same tables that exist in the schema with brand new timestamps. As 
> written, this will cause all nodes in the cluster to change schema (to the 
> version with the newest timestamp), and then change a second time as the 
> non-system schema is propagated to the newly upgraded nodes.
> On a sufficiently large cluster with a non-trivial schema, this could cause 
> (literally) millions of migration tasks to needlessly bounce across the 
> cluster.
> Up for discussion: if we fix this in 3.0 (say 3.0.X where X >= 14), then any 
> 3.0 node below this will always mismatch, and cause the ping-ponging 
> described in CASSANDRA-11050. However, if we don't fix it, we create a 
> situation that's potentially an outage on rolling upgrade. I'm leaning 
> towards a strong warning in NEWS about the right way to upgrade, and fixing 
> it in 4.x, but wouldn't mind hearing opinions from [~slebresne] and 
> [~iamaleksey] and [~amorton] since you three already talked about this on 
> CASSANDRA-11050.
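
To make the mechanism in the description concrete, here is a minimal sketch 
(not Cassandra's actual code) of why a digest that folds in write timestamps 
stops being deterministic. The row model, the schema_version helper and the 
table name are all hypothetical; the only assumption carried over from the 
description is that the version is a hash of the schema rows.

```python
import hashlib
import uuid


def schema_version(rows, include_timestamps):
    """Hash schema rows into a version id (illustrative model only).

    rows: iterable of (table_name, definition, write_timestamp) tuples.
    """
    md5 = hashlib.md5()
    for name, definition, write_ts in sorted(rows):
        md5.update(name.encode("utf-8"))
        md5.update(definition.encode("utf-8"))
        if include_timestamps:
            # Post-3.0 behaviour per the description: timestamps feed the digest.
            md5.update(str(write_ts).encode("utf-8"))
    return uuid.UUID(bytes=md5.digest())


# Two nodes write the same (hypothetical) table definition at different times.
node_a = [("ks.new_system_table", "id int PRIMARY KEY", 1000)]
node_b = [("ks.new_system_table", "id int PRIMARY KEY", 2000)]

same_without_ts = schema_version(node_a, False) == schema_version(node_b, False)
same_with_ts = schema_version(node_a, True) == schema_version(node_b, True)
print(same_without_ts, same_with_ts)
```

With timestamps excluded, both nodes agree on the version; with timestamps 
included, identical schemas hash differently, which is the mismatch that 
would trigger the migration tasks described above.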



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
