[
https://issues.apache.org/jira/browse/CASSANDRA-20118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17905075#comment-17905075
]
Stefan Miklosovic edited comment on CASSANDRA-20118 at 12/12/24 9:03 AM:
-------------------------------------------------------------------------
I was thinking about this more, and the only way out I see so far is to not
include system tables in the schema version computation, an idea / question I
already raised earlier. Think about it: if we only cared about user tables (or
the ones which are system but distributed, e.g. system_auth) and ignored
everything else system-wise, and if we also included the calculation method
Brandon did to avoid the mismatch caused by added table parameters, then there
would be no mismatch from 4.1 to CASSANDRA_4, and not even from CASSANDRA_4 to
UPGRADING. It is questionable at what exact point we would start to compute the
schema version with the added table parameters in mind, but if we did it in
NONE as the last step, isn't it true that the schema version would be the same
for 5.0 nodes all the way from CASSANDRA_4 through UPGRADING?
I still don't know why we include the internal / local system schema in the
schema version computation. These tables will never leave the node. Why does
it matter for another node to know / see that some other node has different
internal tables? There is no use in knowing that.
EDIT: well, the thing is that we did add new tables in system_auth -
CIDRPermissions, CIDRGroups, IdentityToRoles, ah ... so because these are
distributed, they would still be part of the computation, hence a mismatch
again. Duh.
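
To make the idea and the caveat from the EDIT concrete, here is a toy model of
a schema version digest. It is only a sketch and not how Cassandra computes its
schema version: the function schema_version, the LOCAL_SYSTEM_KEYSPACES set,
the table new_local_table and the "v1" definitions are all made up for
illustration. It shows that skipping node-local keyspaces hides a change in
them, while a table added to a distributed keyspace such as system_auth still
changes the digest.

```python
import hashlib
import uuid

# Assumption: which keyspaces count as node-local. Illustrative only.
LOCAL_SYSTEM_KEYSPACES = {"system", "system_schema"}


def schema_version(schema, include_local_system=True):
    """Toy digest over {keyspace: {table: definition}}.

    Not Cassandra's real algorithm; it only models "hash everything"
    versus "hash everything except node-local keyspaces".
    """
    digest = hashlib.md5()
    for keyspace in sorted(schema):
        if not include_local_system and keyspace in LOCAL_SYSTEM_KEYSPACES:
            continue  # skip tables that never leave the node
        for table in sorted(schema[keyspace]):
            entry = f"{keyspace}.{table}:{schema[keyspace][table]}"
            digest.update(entry.encode("utf-8"))
    return uuid.UUID(bytes=digest.digest())


# A node on the old version.
old_node = {
    "system": {"local": "v1"},
    "system_auth": {"roles": "v1"},
    "app": {"users": "v1"},
}

# An upgraded node that only gained a (hypothetical) node-local table.
new_node_local_change = {
    "system": {"local": "v1", "new_local_table": "v1"},
    "system_auth": {"roles": "v1"},
    "app": {"users": "v1"},
}

# An upgraded node that gained a table in the distributed system_auth keyspace.
new_node_auth_change = {
    "system": {"local": "v1"},
    "system_auth": {"roles": "v1", "CIDRGroups": "v1"},
    "app": {"users": "v1"},
}

# True: the local-only change is invisible once local keyspaces are skipped.
local_change_hidden = (
    schema_version(old_node, include_local_system=False)
    == schema_version(new_node_local_change, include_local_system=False)
)

# True: the system_auth change still produces a different digest.
auth_change_visible = (
    schema_version(old_node, include_local_system=False)
    != schema_version(new_node_auth_change, include_local_system=False)
)
```

In this model, excluding local system keyspaces would only help if nothing
changed in the user or distributed keyspaces between versions, which is the
problem the EDIT points out.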
> Hints ignored during Upgrade from C*4 to C*5
> --------------------------------------------
>
> Key: CASSANDRA-20118
> URL: https://issues.apache.org/jira/browse/CASSANDRA-20118
> Project: Apache Cassandra
> Issue Type: Bug
> Components: Consistency/Hints
> Reporter: Paul Chandler
> Assignee: Brandon Williams
> Priority: Normal
> Fix For: 5.0.x
>
> Attachments: 4-1debug-third-node.log, 4-1debug.log,
> 4-1system-third-node.log, 4-1system.log, 5-0debug.log, 5-0system.log,
> image-2024-12-11-17-44-56-585.png, system-20118.log
>
>
> I have discovered that some hints were not being processed after nodes come
> back up when a cluster is in mixed mode, with some Cassandra 4 nodes and
> some Cassandra 5 nodes (the latter with storage compatibility mode
> CASSANDRA_4).
>
> When in this mode, there is a schema mismatch after the first node has been
> upgraded, which continues until the last node has been upgraded.
> It seems that hints are blocked from being sent if there is a schema
> mismatch between the 2 nodes, as can be seen at this line:
> [cassandra/src/java/org/apache/cassandra/hints/HintsDispatchTrigger.java at
> cassandra-5.0 ·
> apache/cassandra|https://github.com/apache/cassandra/blob/cassandra-5.0/src/java/org/apache/cassandra/hints/HintsDispatchTrigger.java#L65]
> I have tested removing this line, and that does allow the hint to be
> transferred normally. However, I am not sure of the implications of doing
> that if the hint is for the part of the schema where the actual mismatch
> occurs.
>
> This creates a problem: when a node is being upgraded and is currently down,
> hint files will be created for it on both the new Cassandra 5 nodes and the
> old Cassandra 4 nodes, but the hint files on the old Cassandra 4 nodes will
> not be processed due to the schema mismatch, leading to potential data loss.