[
https://issues.apache.org/jira/browse/CASSANDRA-17601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17566323#comment-17566323
]
Benjamin Lerer commented on CASSANDRA-17601:
--------------------------------------------
[~jonmeredith] I am still trying to wrap my mind around the problem.
If the problem is with the pre-computation, why do we not simply switch to
{{OnRequestColumnFilterFactory}} for all scenarios?
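
To make the question concrete, here is a minimal sketch of the difference between a filter computed once at prepare time and one computed on every request. All names in it ({{ColumnFilterFactorySketch}}, {{Variant}}, {{precomputed}}, {{onRequest}}, the {{anyNodeBelow34}} supplier) are hypothetical stand-ins, not the real Cassandra classes, and the rule that picks the variant is a simplification of what the description below reports.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical, simplified stand-ins; these are not the real Cassandra classes. */
final class ColumnFilterFactorySketch {
    /** Filter variant a coordinator would build for a query such as "SELECT c5, s3 ...". */
    enum Variant { ALL_COLUMNS, ALL_REGULARS_AND_QUERIED_STATICS }

    interface Factory { Variant newInstance(); }

    /** Assumed rule: fall back to fetching everything while any pre-3.4 node is believed present. */
    static Variant build(boolean anyNodeBelow34) {
        return anyNodeBelow34 ? Variant.ALL_COLUMNS : Variant.ALL_REGULARS_AND_QUERIED_STATICS;
    }

    /** Precomputed: the cluster view is read once, when the statement is prepared. */
    static Factory precomputed(BooleanSupplier anyNodeBelow34) {
        Variant fixed = build(anyNodeBelow34.getAsBoolean());
        return () -> fixed;
    }

    /** On request: the cluster view is read again on every execution. */
    static Factory onRequest(BooleanSupplier anyNodeBelow34) {
        return () -> build(anyNodeBelow34.getAsBoolean());
    }

    public static void main(String[] args) {
        // Before gossip starts, the node only knows its own (current) version.
        boolean[] mixedCluster = {false};
        Factory pre = precomputed(() -> mixedCluster[0]);
        Factory req = onRequest(() -> mixedCluster[0]);

        // Gossip starts and reveals 3.0.x peers.
        mixedCluster[0] = true;

        System.out.println("precomputed: " + pre.newInstance()); // still the fully-upgraded variant
        System.out.println("on request:  " + req.newInstance()); // follows the current cluster view
    }
}
```

In this sketch the precomputed factory keeps the stale decision, while the on-request one follows the cluster view, at the cost of rebuilding the filter on each execution.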
> IllegalStateException with prepared queries selecting static columns in mixed
> 3.0.x/4.x clusters
> ------------------------------------------------------------------------------------------------
>
> Key: CASSANDRA-17601
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17601
> Project: Cassandra
> Issue Type: Bug
> Components: Cluster/Gossip, Consistency/Coordination
> Reporter: Jon Meredith
> Assignee: Jon Meredith
> Priority: Normal
> Fix For: 4.0.x, 4.1-beta, 4.x
>
>
> Clusters that contain prepared statements that partially select static
> columns before the upgrade will fail to execute those statements coordinated
> from the 4.x nodes until the upgrade completes.
> h2. Reproduction
> Setup (before upgrade)
> {code:java}
> CREATE KEYSPACE ks1 WITH replication = {'class': 'SimpleStrategy',
> 'replication_factor':3};
> CREATE TABLE ks1.tbl1 (pk1 int,
> ck2 int,
> s3 int static,
> s4 int static,
> c5 int,
> PRIMARY KEY (pk1, ck2));
> INSERT INTO ks1.tbl1 (pk1, ck2, s3, s4, c5) VALUES (1, 2, 3, 4, 5);
> {code}
> Prepared Statement (prepare before upgrade)
> {code:java}
> SELECT c5, s3 FROM ks1.tbl1 WHERE pk1 = ? AND ck2 = ?;
> {code}
> Exception on 3.0.x nodes (when executing prepared statement after upgrade)
> {code:java}
> java.lang.IllegalStateException: [s3, s4] is not a subset of [s3]
> at org.apache.cassandra.db.Columns$Serializer.encodeBitmap(Columns.java:566)
> at org.apache.cassandra.db.Columns$Serializer.serializeSubset(Columns.java:498)
> at org.apache.cassandra.db.rows.UnfilteredSerializer.serializeRowBody(UnfilteredSerializer.java:235)
> at org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:209)
> at org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:141)
> at org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:129)
> at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:140)
> at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:95)
> at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:80)
> at org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:308)
> at org.apache.cassandra.db.ReadResponse$LocalDataResponse.build(ReadResponse.java:191)
> at org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:181)
> at org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:177)
> at org.apache.cassandra.db.ReadResponse.createDataResponse(ReadResponse.java:48)
> at org.apache.cassandra.db.ReadCommand.createResponse(ReadCommand.java:335)
> at org.apache.cassandra.db.ReadCommandVerbHandler.doVerb(ReadCommandVerbHandler.java:91)
> at org.apache.cassandra.net.InboundSink.lambda$new$0(InboundSink.java:77)
> at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:93)
> at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:44)
> at org.apache.cassandra.net.InboundMessageHandler$ProcessMessage.run(InboundMessageHandler.java:433)
> at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
> at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162)
> at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:134)
> at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:119)
> at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> at java.base/java.lang.Thread.run(Thread.java:834)
> {code}
> Exception on 4.0.x nodes (when executing prepared statement after upgrade)
> {code:java}
> java.lang.IllegalStateException: [ColumnDefinition{name=s3, type=org.apache.cassandra.db.marshal.IntType, kind=STATIC, position=-1}, ColumnDefinition{name=s4, type=org.apache.cassandra.db.marshal.IntType, kind=STATIC, position=-1}] is not a subset of [s3]
> at org.apache.cassandra.db.Columns$Serializer.encodeBitmap(Columns.java:555)
> at org.apache.cassandra.db.Columns$Serializer.serializeSubset(Columns.java:487)
> at org.apache.cassandra.db.rows.UnfilteredSerializer.serializeRowBody(UnfilteredSerializer.java:216)
> at org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:190)
> at org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:121)
> at org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:109)
> at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:140)
> at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:94)
> at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:79)
> at org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:326)
> at org.apache.cassandra.db.ReadResponse$LocalDataResponse.build(ReadResponse.java:186)
> at org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:179)
> at org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:175)
> at org.apache.cassandra.db.ReadResponse.createDataResponse(ReadResponse.java:75)
> at org.apache.cassandra.db.ReadCommand.createResponse(ReadCommand.java:499)
> at org.apache.cassandra.db.ReadCommandVerbHandler.doVerb(ReadCommandVerbHandler.java:91)
> at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:67)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.runUnsafe(AbstractLocalAwareExecutorService.java:194)
> at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.runUnsafe(AbstractLocalAwareExecutorService.java:137)
> at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:167)
> at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:122)
> at java.lang.Thread.run(Thread.java:748)
> {code}
> The root cause is that CASSANDRA-16686 changed ColumnFilters to build and
> deserialize based on which versions the coordinating node thinks are running
> in the cluster. That knowledge is always incorrect when statements are
> re-prepared on startup, and it may be incorrect while nodes are still
> reaching their final version.
> h2. Sequence of events:
> Prepared statements are persisted in {{system.prepared_statements}} to be
> re-prepared on future startup.
> When the 4.x node starts up after upgrade, in
> {{org.apache.cassandra.service.CassandraDaemon#setup}} it calls
> {{QueryProcessor.instance.preloadPreparedStatements}} *before* the
> {{Gossiper}} is started by a call to {{StorageService.instance.initServer()}}
> later in {{setup}}.
> As part of preparing statements, when possible a {{ColumnFilterFactory}} is
> created that returns a {{ColumnFilter}} built at the time the query is
> prepared.
> After the changes from CASSANDRA-16686, the {{ColumnFilter}} builder
> constructs different column filter variants depending on the lowest version
> reported in gossip by checking
> {{org.apache.cassandra.gms.Gossiper#upgradeFromVersionMemoized}}. If this
> runs before the Gossiper is enabled, the lowest version resolves to
> {{SystemKeyspace.CURRENT_VERSION}}, causing the {{ColumnFilter}} builder to
> create a column filter as if the cluster were fully upgraded.
> For the query above, the ColumnFilter creates an
> ALL_REGULARS_AND_QUERIED_STATICS_COLUMNS filter.
> The participating 3.0.x nodes do not understand the new flag and create a
> {{ColumnFilter}} equivalent to a {{WildCardColumnFilter}}. The participating
> 4.x nodes do understand the new flag; however, the deserializer takes the
> lower-than-3.4 path because other 3.0 nodes are known, and it creates
> a {{WildCardColumnFilter}}.
> The fetchedColumns sent by the ALL_REGULARS_AND_QUERIED_STATICS_COLUMNS
> filter contain only the queried static columns; however, the pre-3.4 sstable
> iterator returns all regular and static columns, causing an
> IllegalStateException when the response is serialized to be sent back.
> The ISE clears once all nodes in the cluster think they are upgraded to the
> current version and behave as the originally prepared query intended.
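
The failing check can be illustrated with a small sketch. It assumes, as a simplification, that a row's columns are encoded as a bitmap relative to the columns the filter said would be fetched, with a set bit marking an absent column. {{SubsetBitmapSketch}} and its {{encodeBitmap}} are hypothetical; they use column names as plain strings, are limited to 64 columns, and are not the real {{Columns$Serializer}} code.

```java
import java.util.List;

/** Hypothetical simplification of a "row columns must be a subset of fetched columns" check. */
final class SubsetBitmapSketch {
    /**
     * Returns a bitmap in which bit i is set when superset.get(i) is missing from subset.
     * Assumes no duplicate names and at most 64 columns in the superset.
     */
    static long encodeBitmap(List<String> subset, List<String> superset) {
        long bitmap = 0L;
        int found = 0;
        for (int i = 0; i < superset.size(); i++) {
            if (subset.contains(superset.get(i)))
                found++;
            else
                bitmap |= 1L << i;
        }
        // Any subset element that was never matched means the precondition is violated.
        if (found != subset.size())
            throw new IllegalStateException(subset + " is not a subset of " + superset);
        return bitmap;
    }

    public static void main(String[] args) {
        // Row holds only s3, filter announced s3 and s4: fine, s4 is marked absent.
        System.out.println(encodeBitmap(List.of("s3"), List.of("s3", "s4")));

        // Row holds s3 and s4 (iterator returned all statics), filter announced only s3.
        try {
            encodeBitmap(List.of("s3", "s4"), List.of("s3"));
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The second call mirrors the situation in the traces above: the replica read more static columns than the coordinator's filter announced.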
> h2. Related Problems
> _Non-deterministic behavior of 4.0.x/4.1.x nodes_
> If the prepared statements are cleared and/or freshly prepared when the
> cluster is in mixed 3.0/4.0 mode, the pre-built ColumnFilter will remain in
> the mixed mode version until re-prepared on a restart or cache clear/eviction.
> As upgradeFromVersionMemoized times out and is recalculated after the upgrade
> reaches a single version, individual nodes will make a local decision on
> column filter building and deserializing.
> Nodes that update upgradeFromVersionMemoized early and coordinate requests
> may cause the same ISE on nodes responding to the read command that still
> have the previous version.
> _Digest Mismatches_
> If {{ALL_REGULARS_AND_QUERIED_STATICS_COLUMNS}} {{ColumnFilter}}s are
> incorrectly sent to 3.0.x nodes, those nodes ignore the list of included
> columns and compute a different digest than the one computed by local
> execution on a 4.0.x coordinator.
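
A rough illustration of why the digests diverge: hashing the same row over two different column sets cannot be expected to agree. The sketch below uses the example row from the reproduction; {{DigestMismatchSketch}} and its {{digest}} helper are hypothetical, and the use of MD5 and the way names and values are fed into the hash are assumptions for illustration only, not Cassandra's digest format.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: the same row hashed over different column sets. */
final class DigestMismatchSketch {
    /** Hashes the name and value of each listed column, in order (illustrative format only). */
    static byte[] digest(Map<String, Integer> row, List<String> columns) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            for (String column : columns) {
                md.update(column.getBytes(StandardCharsets.UTF_8));
                md.update(String.valueOf(row.get(column)).getBytes(StandardCharsets.UTF_8));
            }
            return md.digest();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> row = Map.of("s3", 3, "s4", 4, "c5", 5);

        // A node that ignores the column list hashes every column it read.
        byte[] allColumns = digest(row, List.of("s3", "s4", "c5"));
        // A node that honours the filter hashes only the queried static plus the regulars.
        byte[] filtered = digest(row, List.of("s3", "c5"));

        System.out.println("digests match: " + Arrays.equals(allColumns, filtered));
    }
}
```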
> h1. Proposed fix
> In discussion with [~ifesdjeen], he suggested that one way to resolve
> this is to deprecate (or just remove) the
> {{ALL_REGULARS_AND_QUERIED_STATICS_COLUMNS}} filter and no longer build it,
> always selecting all static columns.
> This would just leave {{WildCardColumnFilter}} and {{SelectionColumnFilter}}
> with {{ALL_COLUMNS}} or {{ONLY_QUERIED_COLUMNS}}.
> This is a potential performance regression for unusual schemas with very
> large numbers of static columns, but seems unlikely in practice.
> /cc: [~blerer]