[jira] [Commented] (CASSANDRA-17601) IllegalStateException with prepared queries selecting static columns in mixed 3.0.x/4.x clusters

Benjamin Lerer (Jira) Wed, 13 Jul 2022 06:52:06 -0700


    [ 
https://issues.apache.org/jira/browse/CASSANDRA-17601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17566323#comment-17566323
 ]


Benjamin Lerer commented on CASSANDRA-17601:
--------------------------------------------

[~jonmeredith] I am still trying to wrap my mind around the problem.
If the problem lies in the pre-computation, why do we not simply switch to 
{{OnRequestColumnFilterFactory}} for all scenarios?
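
To make the question concrete, here is a minimal sketch of the difference between a filter built once at prepare time and one built on each request. It is not the Cassandra code: {{FilterFactory}}, {{Variant}}, {{build}}, and the string-based version comparison are hypothetical stand-ins, and the supplier only imitates the memoized gossip version lookup described in the ticket.

```java
import java.util.function.Supplier;

public class ColumnFilterFactorySketch {
    /** Hypothetical stand-in for the filter variants; not the real fetching strategies. */
    enum Variant { FETCH_ALL_STATICS, FETCH_QUERIED_STATICS }

    /** Hypothetical stand-in for the factory held by a prepared statement. */
    interface FilterFactory {
        Variant newInstance();
    }

    /** Picks a variant from the lowest cluster version this node currently knows about. */
    static Variant build(Supplier<String> minClusterVersion) {
        // Assumption for illustration: versions compare as plain strings, "3.0" < "4.0".
        boolean mixed = minClusterVersion.get().compareTo("4.0") < 0;
        return mixed ? Variant.FETCH_ALL_STATICS : Variant.FETCH_QUERIED_STATICS;
    }

    /** Pre-computed: the version is sampled once, when the statement is prepared. */
    static FilterFactory precomputed(Supplier<String> minClusterVersion) {
        Variant built = build(minClusterVersion);
        return () -> built;
    }

    /** On-request: the version is sampled again for every execution. */
    static FilterFactory onRequest(Supplier<String> minClusterVersion) {
        return () -> build(minClusterVersion);
    }

    public static void main(String[] args) {
        // Before gossip starts, the node only knows its own version.
        String[] knownMinVersion = { "4.0" };
        Supplier<String> version = () -> knownMinVersion[0];

        FilterFactory pre = precomputed(version);
        FilterFactory lazy = onRequest(version);

        // Gossip starts and reports a 3.0 node in the cluster.
        knownMinVersion[0] = "3.0";

        System.out.println(pre.newInstance());  // FETCH_QUERIED_STATICS (stale)
        System.out.println(lazy.newInstance()); // FETCH_ALL_STATICS
    }
}
```

In this toy model the pre-computed factory keeps the decision taken before gossip was available, while the on-request factory follows the current view. Whether that holds for the real classes is exactly the open question.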
  


> IllegalStateException with prepared queries selecting static columns in mixed 
> 3.0.x/4.x clusters
> ------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-17601
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17601
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Cluster/Gossip, Consistency/Coordination
>            Reporter: Jon Meredith
>            Assignee: Jon Meredith
>            Priority: Normal
>             Fix For: 4.0.x, 4.1-beta, 4.x
>
>
> In clusters where statements that select only some of a table's static 
> columns were prepared before the upgrade, those statements fail when 
> coordinated from the 4.x nodes until the upgrade completes.
> h2. Reproduction
> Setup (before upgrade)
> {code:java}
> CREATE KEYSPACE ks1 WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor':3};
> CREATE TABLE ks1.tbl1 (pk1 int,
> ck2 int,
> s3 int static,
> s4 int static,
> c5 int,
> PRIMARY KEY (pk1, ck2));
> INSERT INTO ks1.tbl1 (pk1, ck2, s3, s4, c5) VALUES (1, 2, 3, 4, 5);
> {code}
> Prepared Statement (prepare before upgrade)
> {code:java}
> SELECT c5, s3 FROM ks1.tbl1 WHERE pk1 = ? AND ck2 = ?;
> {code}
> Exception on 4.0.x nodes (when executing prepared statement after upgrade)
> {code:java}
> java.lang.IllegalStateException: [s3, s4] is not a subset of [s3]
> at org.apache.cassandra.db.Columns$Serializer.encodeBitmap(Columns.java:566)
> at org.apache.cassandra.db.Columns$Serializer.serializeSubset(Columns.java:498)
> at org.apache.cassandra.db.rows.UnfilteredSerializer.serializeRowBody(UnfilteredSerializer.java:235)
> at org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:209)
> at org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:141)
> at org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:129)
> at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:140)
> at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:95)
> at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:80)
> at org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:308)
> at org.apache.cassandra.db.ReadResponse$LocalDataResponse.build(ReadResponse.java:191)
> at org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:181)
> at org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:177)
> at org.apache.cassandra.db.ReadResponse.createDataResponse(ReadResponse.java:48)
> at org.apache.cassandra.db.ReadCommand.createResponse(ReadCommand.java:335)
> at org.apache.cassandra.db.ReadCommandVerbHandler.doVerb(ReadCommandVerbHandler.java:91)
> at org.apache.cassandra.net.InboundSink.lambda$new$0(InboundSink.java:77)
> at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:93)
> at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:44)
> at org.apache.cassandra.net.InboundMessageHandler$ProcessMessage.run(InboundMessageHandler.java:433)
> at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
> at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162)
> at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:134)
> at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:119)
> at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> at java.base/java.lang.Thread.run(Thread.java:834)
> {code}
> Exception on 3.0.x nodes (when executing prepared statement after upgrade)
> {code:java}
> java.lang.IllegalStateException: [ColumnDefinition{name=s3, 
> type=org.apache.cassandra.db.marshal.IntType, kind=STATIC, position=-1},
> ColumnDefinition{name=s4, type=org.apache.cassandra.db.marshal.IntType, 
> kind=STATIC, position=-1}] is not a subset of [s3]
> at org.apache.cassandra.db.Columns$Serializer.encodeBitmap(Columns.java:555)
> at org.apache.cassandra.db.Columns$Serializer.serializeSubset(Columns.java:487)
> at org.apache.cassandra.db.rows.UnfilteredSerializer.serializeRowBody(UnfilteredSerializer.java:216)
> at org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:190)
> at org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:121)
> at org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:109)
> at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:140)
> at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:94)
> at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:79)
> at org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:326)
> at org.apache.cassandra.db.ReadResponse$LocalDataResponse.build(ReadResponse.java:186)
> at org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:179)
> at org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:175)
> at org.apache.cassandra.db.ReadResponse.createDataResponse(ReadResponse.java:75)
> at org.apache.cassandra.db.ReadCommand.createResponse(ReadCommand.java:499)
> at org.apache.cassandra.db.ReadCommandVerbHandler.doVerb(ReadCommandVerbHandler.java:91)
> at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:67)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.runUnsafe(AbstractLocalAwareExecutorService.java:194)
> at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.runUnsafe(AbstractLocalAwareExecutorService.java:137)
> at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:167)
> at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:122)
> at java.lang.Thread.run(Thread.java:748)
> {code}
> The root cause is that CASSANDRA-16686 changed ColumnFilters to build and 
> deserialize based on which versions the coordinating node thinks are running 
> in the cluster. That knowledge is always incorrect when statements are 
> re-prepared on startup, and may be incorrect as nodes reach their final 
> version.
> h2. Sequence of events:
> Prepared statements are persisted in {{system.prepared_statements}} to be 
> re-prepared on future startup.
> When the 4.x node starts up after upgrade, in 
> {{org.apache.cassandra.service.CassandraDaemon#setup}} it calls 
> {{QueryProcessor.instance.preloadPreparedStatements}} *before* the 
> {{Gossiper}} is started by a call to {{StorageService.instance.initServer()}} 
> later in {{setup}}.
> As part of preparing statements, when possible a {{ColumnFilterFactory}} is 
> created that returns a {{ColumnFilter}} built at the time the query is 
> prepared.
> After the changes from CASSANDRA-16686, the {{ColumnFilter}} builder 
> constructs different column filter variants depending on the lowest version 
> reported in gossip by checking 
> {{org.apache.cassandra.gms.Gossiper#upgradeFromVersionMemoized}}. If this 
> runs before the Gossiper is enabled, it yields 
> {{SystemKeyspace.CURRENT_VERSION}}, causing the {{ColumnFilter}} builder to 
> create a column filter as if the cluster were fully upgraded.
> For the query above, the ColumnFilter creates an 
> ALL_REGULARS_AND_QUERIED_STATICS_COLUMNS filter.
> The participating 3.0.x nodes do not understand the new flag and create a 
> {{ColumnFilter}} equivalent to a {{WildcardColumnFilter}}. The participating 
> 4.x nodes do understand the new flag, but their deserializer takes the 
> pre-3.4 path because 3.0 nodes are known to be in the cluster, and it also 
> creates a {{WildcardColumnFilter}}.
> The fetchedColumns sent by the ALL_REGULARS_AND_QUERIED_STATICS_COLUMNS 
> filter contain only the queried static columns, whereas the pre-3.4 sstable 
> iterator returns all regular and static columns, causing an 
> IllegalStateException when the response is serialized to be sent back.
> The ISE clears once all nodes in the cluster think they are upgraded to the 
> current version and behave as the originally prepared query intended.
> h2. Related Problems
> _Non-deterministic behavior of 4.0.x/4.1.x nodes_
> If the prepared statements are cleared and/or freshly prepared when the 
> cluster is in mixed 3.0/4.0 mode, the pre-built ColumnFilter will remain in 
> the mixed mode version until re-prepared on a restart or cache clear/eviction.
> As upgradeFromVersionMemoized times out and is recalculated after the upgrade 
> reaches a single version, individual nodes will make a local decision on 
> column filter building and deserializing.
> Nodes that update upgradeFromVersionMemoized early and then coordinate 
> requests may cause the same ISE on nodes that respond to the read command 
> while still holding the previous value.
> _Digest Mismatches_
> If {{ALL_REGULARS_AND_QUERIED_STATICS_COLUMNS}} {{ColumnFilter}} instances 
> are incorrectly sent to 3.0.x nodes, those nodes ignore the list of included 
> columns and compute a different digest than the one computed locally on a 
> 4.0.x coordinator.
> h1. Proposed fix
> In discussion with [~ifesdjeen], he suggested that one way to resolve this 
> is to deprecate (or just remove) the 
> {{ALL_REGULARS_AND_QUERIED_STATICS_COLUMNS}} filter and no longer build it, 
> always selecting all static columns.
> This would just leave {{WildCardColumnFilter}} and {{SelectionColumnFilter}} 
> with {{ALL_COLUMNS}} or {{ONLY_QUERIED_COLUMNS}}.
> This is a potential performance regression for unusual schemas with very 
> large numbers of static columns, but seems unlikely in practice.
> /cc: [~blerer] 
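
The subset failure in the quoted report can be reproduced in miniature. The sketch below is loosely modelled on the check named in the stack traces and is not the actual {{Columns.Serializer.encodeBitmap}} implementation: the class name, the use of plain strings for columns, and the bit convention are assumptions made for illustration.

```java
import java.util.List;

public class SubsetBitmapSketch {
    /**
     * Encodes which entries of superset are absent from subset as a bitmap,
     * and fails if subset holds a column the superset does not declare.
     * Assumes no duplicate names and fewer than 64 superset columns.
     */
    static long encodeBitmap(List<String> subset, List<String> superset) {
        long bitmap = 0L;
        int matched = 0;
        for (int i = 0; i < superset.size(); i++) {
            if (subset.contains(superset.get(i)))
                matched++;
            else
                bitmap |= 1L << i; // bit set = column absent from this row
        }
        if (matched != subset.size())
            throw new IllegalStateException(subset + " is not a subset of " + superset);
        return bitmap;
    }

    public static void main(String[] args) {
        // Header columns come from the filter's fetched statics: only s3 was queried.
        List<String> header = List.of("s3");
        // The pre-3.4 read path returns every static column of the table.
        List<String> row = List.of("s3", "s4");
        try {
            encodeBitmap(row, header);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // [s3, s4] is not a subset of [s3]
        }
    }
}
```

With the proposed fix, the header would always list all static columns, so a row carrying both s3 and s4 would pass this kind of check.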



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

