[
https://issues.apache.org/jira/browse/CASSANDRA-20118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17903509#comment-17903509
]
Stefan Miklosovic edited comment on CASSANDRA-20118 at 12/5/24 11:48 PM:
-------------------------------------------------------------------------
The term "migration" is a little misleading here, in the sense that until
18061 was done, all the migrator did was populate the "v2" tables with values.
If you compare the migrator in 4.1 and 5.0, everything is the same; the only
thing that was added is compaction history.
The reason all the other migrations are the same is that these tables are the
same in 4.1 and 5.0. I think that if somebody upgrades from 3.0 to 5.0 in an
offline fashion, the v2 versions of these tables did not exist in 3.0, so when
they were added in 5.0, something has to populate them. The population of the
v2 tables between 4.1 and 5.0 is basically a no-op (minus compaction history).
Hence, if we deferred this migration until we are out of CASSANDRA_4, skipping
the invocation of the "migrateCompactionHistory" method would not be enough by
itself. We would also need to skip the addition of the new column to that
table. If we just skipped the population, the new column would still be added
and we would still end up with a different schema version.
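To illustrate the point about the column versus the population, here is a minimal sketch. It assumes the schema version behaves like a digest over table definitions; the class, the method and the "extra_column" name are hypothetical and are not Cassandra's actual code.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.UUID;

public class SchemaVersionSketch {
    // Hypothetical stand-in for a schema version: a name-based UUID computed
    // from the table's column definitions only, never from the rows it holds.
    static UUID schemaVersion(List<String> columnDefinitions) {
        String definition = String.join(",", columnDefinitions);
        return UUID.nameUUIDFromBytes(definition.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        // Illustrative column lists; "extra_column" stands for the added column.
        List<String> oldTable = List.of("id uuid", "compacted_at timestamp");
        List<String> newTable = List.of("id uuid", "compacted_at timestamp", "extra_column text");

        // Populating or not populating rows never enters the computation,
        // so skipping the population alone cannot keep the versions equal.
        System.out.println("same definition, same version: "
                + schemaVersion(oldTable).equals(schemaVersion(List.copyOf(oldTable))));
        System.out.println("column added, same version: "
                + schemaVersion(oldTable).equals(schemaVersion(newTable)));
    }
}
```

Under that assumption, only the change to the table definition moves the version, which is why the column itself would have to be held back as well.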
If we indeed want to do this, all we would achieve is that the schema version
stays the same as long as the cluster is in CASSANDRA_4 mode. But as soon as
the whole cluster is on 5.0 in CASSANDRA_4 mode and we want to add that column,
we would do a rolling restart and have a schema mismatch again. So until the
whole cluster is bounced, we would still hit this "hints not delivered" issue,
no?
> Hints ignored during Upgrade from C*4 to C*5
> --------------------------------------------
>
> Key: CASSANDRA-20118
> URL: https://issues.apache.org/jira/browse/CASSANDRA-20118
> Project: Apache Cassandra
> Issue Type: Bug
> Components: Consistency/Hints
> Reporter: Paul Chandler
> Assignee: Brandon Williams
> Priority: Normal
> Fix For: 5.0.x
>
> Attachments: 4-1system-third-node.log, 4-1system.log, 5-0system.log
>
>
> I have discovered that some hints were not being processed after nodes come
> back up when a cluster is in mixed mode, with some Cassandra 4 nodes and
> some Cassandra 5 nodes (the latter with storage compatibility mode
> CASSANDRA_4).
>
> When in this mode there is a schema mismatch after the first node has been
> upgraded, which continues until the last node has been upgraded.
> It seems that the hints are blocked from being sent if there is a schema
> mismatch between the 2 nodes, as can be seen at this line:
> [cassandra/src/java/org/apache/cassandra/hints/HintsDispatchTrigger.java at
> cassandra-5.0 ·
> apache/cassandra|https://github.com/apache/cassandra/blob/cassandra-5.0/src/java/org/apache/cassandra/hints/HintsDispatchTrigger.java#L65]
> I have tested removing this line, and that does allow the hint to be
> transferred normally. However, I am not sure of the implications of doing
> that if the hint is for the part of the schema where the actual mismatch
> occurs.
>
> This creates a problem: when a node is being upgraded and is currently down,
> hint files will be created for it on both the new Cassandra 5 nodes and the
> old Cassandra 4 nodes, but the hint files on the old Cassandra 4 nodes will
> not be processed due to the schema mismatch, leading to potential data loss.
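The behaviour described in the report can be sketched roughly as follows. This is a toy model under the assumption that dispatch is gated on the peer's schema version being equal to the local one; the class name, method name and node labels are hypothetical and do not reproduce HintsDispatchTrigger.

```java
import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.stream.Collectors;

public class HintDispatchGateSketch {
    // Returns the peers whose hints would be scheduled for delivery: only
    // those reporting the same schema version as the local node. A peer with
    // an unknown version (null) is skipped too, since equals(null) is false.
    static List<String> dispatchable(UUID localVersion,
                                     Map<String, UUID> peerVersions,
                                     List<String> peersWithHints) {
        return peersWithHints.stream()
                .filter(peer -> localVersion.equals(peerVersions.get(peer)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        UUID v4Schema = UUID.randomUUID(); // version seen by not yet upgraded nodes
        UUID v5Schema = UUID.randomUUID(); // version seen by upgraded nodes
        Map<String, UUID> peers = Map.of("node-a", v4Schema, "node-b", v5Schema);

        // A node on the 4.x schema holding hints for both peers only
        // schedules the one that agrees with it on the schema version.
        System.out.println(dispatchable(v4Schema, peers, List.of("node-a", "node-b")));
    }
}
```

In this model the hints for "node-b" are not dropped, they simply never get scheduled while the versions differ, which matches the symptom of hint files sitting unprocessed during the mixed-version window.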