This is an automated email from the ASF dual-hosted git repository.

maedhroz pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git

commit 4a80daf32eb4226d9870b914779a1fc007479da6
Merge: 1dbce2a8c5 953ab6cf64
Author: Caleb Rackliffe <[email protected]>
AuthorDate: Thu Feb 6 13:49:55 2025 -0600

    Merge branch 'cassandra-5.0' into trunk
    
    * cassandra-5.0:
      Avoid possible consistency violations for SAI intersection queries over repaired index matches and multiple non-indexed column matches

 CHANGES.txt                                        |   1 +
 .../org/apache/cassandra/db/filter/RowFilter.java  |  14 +-
 .../cassandra/index/sai/plan/FilterTree.java       |  74 ++++++---
 .../apache/cassandra/index/sai/plan/Operation.java |  87 +++++++++--
 .../cassandra/index/sai/plan/QueryController.java  |   2 +-
 .../cassandra/index/sai/utils/IndexTermType.java   |   5 +
 .../distributed/test/sai/StrictFilteringTest.java  |  57 +++++++
 .../cassandra/fuzz/sai/MultiNodeSAITestBase.java   |   8 +-
 .../cassandra/fuzz/sai/SingleNodeSAITestBase.java  | 174 ++++++++++++++-------
 .../cassandra/harry/dsl/HistoryBuilderHelper.java  |  43 +++--
 .../apache/cassandra/cql3/ast/CreateIndexDDL.java  |  13 +-
 .../cassandra/index/sai/plan/OperationTest.java    |   5 +-
 12 files changed, 361 insertions(+), 122 deletions(-)
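
The fix addresses a case where replica-local AND (intersection) semantics can drop rows that only match once results are reconciled at the coordinator. Below is a self-contained sketch of the failure mode in plain Java rather than Cassandra code (all names here are hypothetical; the data layout mirrors the shouldDegradeToUnionOnSingleStatic test further down):

    // Hypothetical illustration, not Cassandra code: why a replica-local
    // intersection (AND) of a static-column predicate with any other predicate
    // can drop rows that only match after coordinator-side reconciliation.
    import java.util.HashSet;
    import java.util.Map;
    import java.util.Objects;
    import java.util.Set;

    public class StaticIntersectionSketch
    {
        // One replica's view of partition pk0 = 0: a static column s0 plus rows keyed by ck0 -> v0.
        record Replica(Integer s0, Map<Integer, Integer> rows)
        {
            // Strict (intersection) semantics: a row qualifies only if both
            // predicates match against this replica's local, possibly stale view.
            Set<Integer> strictMatches(int s0Term, int ck0Term)
            {
                Set<Integer> result = new HashSet<>();
                if (Objects.equals(s0, s0Term) && rows.containsKey(ck0Term))
                    result.add(ck0Term);
                return result;
            }

            // Degraded (union) semantics: return anything matching either
            // predicate; the coordinator post-filters after merging replicas.
            Set<Integer> unionMatches(int s0Term, int ck0Term)
            {
                Set<Integer> result = new HashSet<>();
                if (Objects.equals(s0, s0Term))
                    result.addAll(rows.keySet()); // a static match covers the whole partition
                if (rows.containsKey(ck0Term))
                    result.add(ck0Term);
                return result;
            }
        }

        public static void main(String[] args)
        {
            // Mirrors shouldDegradeToUnionOnSingleStatic: node 2 holds the static
            // match (s0 = 3), node 1 holds the clustering-key match (ck0 = 6).
            Replica node1 = new Replica(null, Map.of(6, 5));
            Replica node2 = new Replica(3, Map.of(1, 4));

            // SELECT ... WHERE s0 = 3 AND ck0 = 6
            Set<Integer> strict = new HashSet<>(node1.strictMatches(3, 6));
            strict.addAll(node2.strictMatches(3, 6));
            System.out.println("strict intersection at replicas: " + strict); // [] -> row silently lost

            Set<Integer> merged = new HashSet<>(node1.unionMatches(3, 6));
            merged.addAll(node2.unionMatches(3, 6));
            // After reconciliation the partition has s0 = 3, so the coordinator's
            // post-filter keeps only rows that also satisfy ck0 = 6.
            merged.removeIf(ck0 -> ck0 != 6);
            System.out.println("union + coordinator post-filter: " + merged); // [6]
        }
    }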

diff --cc CHANGES.txt
index 19432f6782,13ae3fddc2..de1173b382
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,134 -1,5 +1,135 @@@
 -5.0.4
 +5.1
 + * Add system_views.partition_key_statistics for querying SSTable metadata (CASSANDRA-20161)
 + * CEP-42 - Add Constraints Framework (CASSANDRA-19947)
 + * Add table metric PurgeableTombstoneScannedHistogram and a tracing event for scanned purgeable tombstones (CASSANDRA-20132)
 + * Make sure we can parse the expanded CQL before writing it to the log or sending it to replicas (CASSANDRA-20218)
 + * Add format_bytes and format_time functions (CASSANDRA-19546)
 + * Fix error when trying to assign a tuple to target type not being a tuple (CASSANDRA-20237)
 + * Fail CREATE TABLE LIKE statement if UDTs in target keyspace do not exist or they have different structure from ones in source keyspace (CASSANDRA-19966)
 + * Support octet_length and length functions (CASSANDRA-20102)
 + * Make JsonUtils serialize Instant always with the same format (CASSANDRA-20209)
 + * Port Harry v2 to trunk (CASSANDRA-20200)
 + * Enable filtering of snapshots on keyspace, table and snapshot name in nodetool listsnapshots (CASSANDRA-20151)
 + * Create manifest upon loading where it does not exist or enrich it (CASSANDRA-20150)
 + * Propagate true size of snapshot in SnapshotDetailsTabularData to not call JMX twice in nodetool listsnapshots (CASSANDRA-20149)
 + * Implementation of CEP-43 - copying a table via CQL by CREATE TABLE LIKE (CASSANDRA-19964)
 + * Periodically disconnect roles that are revoked or have LOGIN=FALSE set (CASSANDRA-19385)
 + * AST library for CQL-based fuzz tests (CASSANDRA-20198)
 + * Support audit logging for JMX operations (CASSANDRA-20128)
 + * Enable sorting of nodetool status output (CASSANDRA-20104)
 + * Support downgrading after CMS is initialized (CASSANDRA-20145)
 + * Deprecate IEndpointSnitch (CASSANDRA-19488)
 + * Check presence of a snapshot in a case-insensitive manner on macOS platform to prevent hardlinking failures (CASSANDRA-20146)
 + * Enable JMX server configuration to be in cassandra.yaml (CASSANDRA-11695)
 + * Parallelized UCS compactions (CASSANDRA-18802)
 + * Avoid prepared statement invalidation race when committing schema changes (CASSANDRA-20116)
 + * Restore optimization in MultiCBuilder around building one clustering (CASSANDRA-20129)
 + * Consolidate all snapshot management to SnapshotManager and introduce SnapshotManagerMBean (CASSANDRA-18111)
 + * Fix RequestFailureReason constants codes (CASSANDRA-20126)
 + * Introduce SSTableSimpleScanner for compaction (CASSANDRA-20092)
 + * Include column drop timestamp in alter table transformation (CASSANDRA-18961)
 + * Make JMX SSL configurable in cassandra.yaml (CASSANDRA-18508)
 + * Fix cqlsh CAPTURE command to save query results without trace details when TRACING is ON (CASSANDRA-19105)
 + * Optionally prevent tombstone purging during repair (CASSANDRA-20071)
 + * Add post-filtering support for the IN operator in SAI queries (CASSANDRA-20025)
 + * Don’t finish ongoing decommission and move operations during startup (CASSANDRA-20040)
 + * Nodetool reconfigure cms has correct return code when streaming fails (CASSANDRA-19972)
 + * Reintroduce RestrictionSet#iterator() optimization around multi-column restrictions (CASSANDRA-20034)
 + * Explicitly localize strings to Locale.US for internal implementation (CASSANDRA-19953)
 + * Add -H option for human-friendly output in nodetool compactionhistory (CASSANDRA-20015)
 + * Fix type check for referenced duration type for nested types (CASSANDRA-19890)
 + * In simulation tests, correctly set the tokens of replacement nodes (CASSANDRA-19997)
 + * During TCM upgrade, retain all properties of existing system tables (CASSANDRA-19992)
 + * Properly cancel in-flight futures and reject requests in EpochAwareDebounce during shutdown (CASSANDRA-19848)
 + * Provide clearer exception message on failing commitlog_disk_access_mode combinations (CASSANDRA-19812)
 + * Add total space used for a keyspace to nodetool tablestats (CASSANDRA-19671)
 + * Ensure Relation#toRestriction() handles ReversedType properly (CASSANDRA-19950)
 + * Add JSON and YAML output option to nodetool gcstats (CASSANDRA-19771)
 + * Introduce metadata serialization version V4 (CASSANDRA-19970)
 + * Allow CMS reconfiguration to work around DOWN nodes (CASSANDRA-19943)
 + * Make TableParams.Serializer set allowAutoSnapshots and incrementalBackups (CASSANDRA-19954)
 + * Make sstabledump possible to show tombstones only (CASSANDRA-19939)
 + * Ensure that RFP queries potentially stale replicas even with only key columns in the row filter (CASSANDRA-19938)
 + * Allow nodes to change IP address while upgrading to TCM (CASSANDRA-19921)
 + * Retain existing keyspace params on system tables after upgrade (CASSANDRA-19916)
 + * Deprecate use of gossip state for paxos electorate verification (CASSANDRA-19904)
 + * Update dtest-api to 0.0.17 to fix jvm17 crash in jvm-dtests (CASSANDRA-19239)
 + * Add resource leak test and Update Netty to 4.1.113.Final to fix leak (CASSANDRA-19783)
 + * Fix incorrect nodetool suggestion when gossip mode is running (CASSANDRA-19905)
 + * SAI support for BETWEEN operator (CASSANDRA-19688)
 + * Fix BETWEEN filtering for reversed clustering columns (CASSANDRA-19878)
 + * Retry if node leaves CMS while committing a transformation (CASSANDRA-19872)
 + * Add support for NOT operators in WHERE clauses. Fixed Three Valued Logic (CASSANDRA-18584)
 + * Allow getendpoints for system tables and make sure getNaturalReplicas work for MetaStrategy (CASSANDRA-19846)
 + * On upgrade, handle pre-existing tables with unexpected table ids (CASSANDRA-19845)
 + * Reconfigure CMS before assassinate (CASSANDRA-19768)
 + * Warn about unqualified prepared statement only if it is select or modification statement (CASSANDRA-18322)
 + * Update legacy peers tables during node replacement (CASSANDRA-19782)
 + * Refactor ColumnCondition (CASSANDRA-19620)
 + * Allow configuring log format for Audit Logs (CASSANDRA-19792)
 + * Support for noboolean rpm (centos7 compatible) packages removed (CASSANDRA-19787)
 + * Allow threads waiting for the metadata log follower to be interrupted (CASSANDRA-19761)
 + * Support dictionary lookup for CassandraPasswordValidator (CASSANDRA-19762)
 + * Disallow denylisting keys in system_cluster_metadata (CASSANDRA-19713)
 + * Fix gossip status after replacement (CASSANDRA-19712)
 + * Ignore repair requests for system_cluster_metadata (CASSANDRA-19711)
 + * Avoid ClassCastException when verifying tables with reversed partitioner (CASSANDRA-19710)
 + * Always repair the full range when repairing system_cluster_metadata (CASSANDRA-19709)
 + * Use table-specific partitioners during Paxos repair (CASSANDRA-19714)
 + * Expose current compaction throughput in nodetool (CASSANDRA-13890)
 + * CEP-24 Password validation / generation (CASSANDRA-17457)
 + * Reconfigure CMS after replacement, bootstrap and move operations (CASSANDRA-19705)
 + * Support querying LocalStrategy tables with any partitioner (CASSANDRA-19692)
 + * Relax slow_query_log_timeout for MultiNodeSAITest (CASSANDRA-19693)
 + * Audit Log entries are missing identity for mTLS connections (CASSANDRA-19669)
 + * Add support for the BETWEEN operator in WHERE clauses (CASSANDRA-19604)
 + * Replace Stream iteration with for-loop for SimpleRestriction::bindAndGetClusteringElements (CASSANDRA-19679)
 + * Consolidate logging on trace level (CASSANDRA-19632)
 + * Expand DDL statements on coordinator before submission to the CMS (CASSANDRA-19592)
 + * Fix number of arguments of String.format() in various classes (CASSANDRA-19645)
 + * Remove unused fields from config (CASSANDRA-19599)
 + * Refactor Relation and Restriction hierarchies (CASSANDRA-19341)
 + * Raise priority of TCM internode messages during critical operations (CASSANDRA-19517)
 + * Add nodetool command to unregister LEFT nodes (CASSANDRA-19581)
 + * Add cluster metadata id to gossip syn messages (CASSANDRA-19613)
 + * Reduce heap usage occupied by the metrics (CASSANDRA-19567)
 + * Improve handling of transient replicas during range movements (CASSANDRA-19344)
 + * Enable debounced internode log requests to be cancelled at shutdown (CASSANDRA-19514)
 + * Correctly set last modified epoch when combining multistep operations into a single step (CASSANDRA-19538)
 + * Add new TriggersPolicy configuration to allow operators to disable triggers (CASSANDRA-19532)
 + * Use Transformation.Kind.id in local and distributed log tables (CASSANDRA-19516)
 + * Remove period field from ClusterMetadata and metadata log tables (CASSANDRA-19482)
 + * Enrich system_views.pending_hints vtable with hints sizes (CASSANDRA-19486)
 + * Expose all dropwizard metrics in virtual tables (CASSANDRA-14572)
 + * Ensured that PropertyFileSnitchTest does not overwrite cassandra-topology.properties (CASSANDRA-19502)
 + * Add option for MutualTlsAuthenticator to restrict the certificate validity period (CASSANDRA-18951)
 + * Fix StorageService::constructRangeToEndpointMap for non-distributed keyspaces (CASSANDRA-19255)
 + * Group nodetool cms commands into single command group (CASSANDRA-19393)
 + * Register the measurements of the bootstrap process as Dropwizard metrics (CASSANDRA-19447)
 + * Add LIST SUPERUSERS CQL statement (CASSANDRA-19417)
 + * Modernize CQLSH datetime conversions (CASSANDRA-18879)
 + * Harry model and in-JVM tests for partition-restricted 2i queries (CASSANDRA-18275)
 + * Refactor cqlshmain global constants (CASSANDRA-19201)
 + * Remove native_transport_port_ssl (CASSANDRA-19397)
 + * Make nodetool reconfigurecms sync by default and add --cancel to be able to cancel ongoing reconfigurations (CASSANDRA-19216)
 + * Expose auth mode in system_views.clients, nodetool clientstats, metrics (CASSANDRA-19366)
 + * Remove sealed_periods and last_sealed_period tables (CASSANDRA-19189)
 + * Improve setup and initialisation of LocalLog/LogSpec (CASSANDRA-19271)
 + * Refactor structure of caching metrics and expose auth cache metrics via JMX (CASSANDRA-17062)
 + * Allow CQL client certificate authentication to work without sending an AUTHENTICATE request (CASSANDRA-18857)
 + * Extend nodetool tpstats and system_views.thread_pools with detailed pool parameters (CASSANDRA-19289)
 + * Remove dependency on Sigar in favor of OSHI (CASSANDRA-16565)
 + * Simplify the bind marker and Term logic (CASSANDRA-18813)
 + * Limit cassandra startup to supported JDKs, allow higher JDKs by setting CASSANDRA_JDK_UNSUPPORTED (CASSANDRA-18688)
 + * Standardize nodetool tablestats formatting of data units (CASSANDRA-19104)
 + * Make nodetool tablestats use number of significant digits for time and average values consistently (CASSANDRA-19015)
 + * Upgrade jackson to 2.15.3 and snakeyaml to 2.1 (CASSANDRA-18875)
 + * Transactional Cluster Metadata [CEP-21] (CASSANDRA-18330)
 + * Add ELAPSED command to cqlsh (CASSANDRA-18861)
 + * Add the ability to disable bulk loading of SSTables (CASSANDRA-18781)
 + * Clean up obsolete functions and simplify cql_version handling in cqlsh (CASSANDRA-18787)
 +Merged from 5.0:
+  * Avoid possible consistency violations for SAI intersection queries over repaired index matches and multiple non-indexed column matches (CASSANDRA-20189)
   * Skip check for DirectIO when initializing tools (CASSANDRA-20289)
   * Avoid under-skipping during intersections when an iterator has mixed STATIC and WIDE keys (CASSANDRA-20258)
   * Correct the default behavior of compareTo() when comparing WIDE and STATIC PrimaryKeys (CASSANDRA-20238)
diff --cc test/distributed/org/apache/cassandra/distributed/test/sai/StrictFilteringTest.java
index 83bfdcf948,cd753d6e01..60646c6630
--- a/test/distributed/org/apache/cassandra/distributed/test/sai/StrictFilteringTest.java
+++ b/test/distributed/org/apache/cassandra/distributed/test/sai/StrictFilteringTest.java
@@@ -51,8 -54,28 +51,28 @@@ public class StrictFilteringTest extend
          CLUSTER = init(Cluster.build(2).withConfig(config -> config.set("hinted_handoff_enabled", false).with(GOSSIP).with(NETWORK)).start());
      }
  
+     @Test
+     public void shouldDegradeToUnionOnSingleStatic()
+     {
+         CLUSTER.schemaChange(withKeyspace("CREATE TABLE %s.single_static (pk0 int, ck0 int, ck1 int, s0 int static, v0 int, PRIMARY KEY (pk0, ck0, ck1)) " +
+                                           "WITH read_repair = 'NONE' AND CLUSTERING ORDER BY (ck0 ASC, ck1 DESC)"));
+         CLUSTER.schemaChange(withKeyspace("CREATE INDEX ON %s.single_static(ck0) USING 'sai'"));
+         CLUSTER.schemaChange(withKeyspace("CREATE INDEX ON %s.single_static(s0) USING 'sai'"));
+         SAIUtil.waitForIndexQueryable(CLUSTER, KEYSPACE);
+ 
+         // To present the coordinator with enough data to find a row match, both replicas must degrade to OR at query
+         // time. The static column match from node 2 and the clustering key match from node 1 must be merged.
+         CLUSTER.get(2).executeInternal(withKeyspace("INSERT INTO %s.single_static (pk0, ck0, ck1, s0, v0) VALUES (0, 1, 2, 3, 4)"));
+         CLUSTER.get(1).executeInternal(withKeyspace("UPDATE %s.single_static SET v0 = 5 WHERE pk0 = 0 AND ck0 = 6 AND ck1 = 7"));
+ 
+         // A static column predicate and ANY other predicate make strict filtering impossible, as the static match
+         // applies to the entire partition.
+         String select = withKeyspace("SELECT * FROM %s.single_static WHERE s0 = 3 AND ck0 = 6");
+         assertRows(CLUSTER.coordinator(1).execute(select, ConsistencyLevel.ALL), row(0, 6, 7, 3, 5));
+     }
+ 
      @Test
 -    public void shouldRejectNonStrictIN()
 +    public void shouldPostFilterNonStrictIN()
      {
          CLUSTER.schemaChange(withKeyspace("CREATE TABLE %s.reject_in (k int PRIMARY KEY, a int, b int) WITH read_repair = 'NONE'"));
          CLUSTER.schemaChange(withKeyspace("CREATE INDEX ON %s.reject_in(a) USING 'sai'"));
diff --cc test/distributed/org/apache/cassandra/fuzz/sai/MultiNodeSAITestBase.java
index 045cfc81d9,0000000000..860e779970
mode 100644,000000..100644
--- a/test/distributed/org/apache/cassandra/fuzz/sai/MultiNodeSAITestBase.java
+++ b/test/distributed/org/apache/cassandra/fuzz/sai/MultiNodeSAITestBase.java
@@@ -1,85 -1,0 +1,83 @@@
 +/*
 + * Licensed to the Apache Software Foundation (ASF) under one
 + * or more contributor license agreements.  See the NOTICE file
 + * distributed with this work for additional information
 + * regarding copyright ownership.  The ASF licenses this file
 + * to you under the Apache License, Version 2.0 (the
 + * "License"); you may not use this file except in compliance
 + * with the License.  You may obtain a copy of the License at
 + *
 + *     http://www.apache.org/licenses/LICENSE-2.0
 + *
 + * Unless required by applicable law or agreed to in writing, software
 + * distributed under the License is distributed on an "AS IS" BASIS,
 + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 + * See the License for the specific language governing permissions and
 + * limitations under the License.
 + */
 +
 +package org.apache.cassandra.fuzz.sai;
 +
- import org.junit.Before;
 +import org.junit.BeforeClass;
 +
 +import org.apache.cassandra.distributed.Cluster;
 +import org.apache.cassandra.distributed.test.sai.SAIUtil;
 +import org.apache.cassandra.harry.SchemaSpec;
 +
 +import static org.apache.cassandra.distributed.api.Feature.GOSSIP;
 +import static org.apache.cassandra.distributed.api.Feature.NETWORK;
 +
 +public abstract class MultiNodeSAITestBase extends SingleNodeSAITestBase
 +{
 +    public MultiNodeSAITestBase()
 +    {
 +        super();
 +    }
 +
 +    @BeforeClass
 +    public static void before() throws Throwable
 +    {
 +        cluster = Cluster.build()
 +                         .withNodes(2)
 +                         // At lower fetch sizes, queries w/ hundreds or thousands of matches can take a very long time.
 +                         .withConfig(defaultConfig().andThen(c -> c.set("range_request_timeout", "180s")
 +                                                                   .set("read_request_timeout", "180s")
 +                                                                   .set("write_request_timeout", "180s")
 +                                                                   .set("native_transport_timeout", "180s")
 +                                                                   .set("slow_query_log_timeout", "180s")
 +                                                                   .with(GOSSIP).with(NETWORK)))
 +                         .createWithoutStarting();
 +        cluster.startup();
 +        cluster = init(cluster);
 +    }
 +
-     @Before
-     public void beforeEach()
++    @Override
++    protected int rf()
 +    {
-         cluster.schemaChange("DROP KEYSPACE IF EXISTS harry");
-         cluster.schemaChange("CREATE KEYSPACE harry WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 2};");
++        return 2;
 +    }
 +
 +    @Override
 +    protected void flush(SchemaSpec schema)
 +    {
 +        cluster.forEach(i -> i.nodetool("flush", schema.keyspace));
 +    }
 +
 +    @Override
 +    protected void repair(SchemaSpec schema)
 +    {
 +        cluster.get(1).nodetool("repair", schema.keyspace);
 +    }
 +
 +    @Override
 +    protected void compact(SchemaSpec schema)
 +    {
 +        cluster.forEach(i -> i.nodetool("compact", schema.keyspace));
 +    }
 +
 +    @Override
 +    protected void waitForIndexesQueryable(SchemaSpec schema)
 +    {
 +        SAIUtil.waitForIndexQueryable(cluster, schema.keyspace);
 +    }
 +}
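
The MultiNodeSAITestBase changes replace the per-class @Before keyspace DDL with an rf() override that the base class consumes when recreating the keyspace. Distilled to its shape (a sketch with hypothetical names, not the actual test classes):

    // Minimal sketch of the template-method pattern used above: the base class
    // owns keyspace recreation and asks subclasses for the replication factor.
    abstract class SaiTestSketch
    {
        protected int rf() { return 1; } // single-node default

        final String keyspaceDdl()
        {
            return "CREATE KEYSPACE harry WITH replication = " +
                   "{'class': 'SimpleStrategy', 'replication_factor': " + rf() + "};";
        }
    }

    class MultiNodeSketch extends SaiTestSketch
    {
        @Override
        protected int rf() { return 2; } // two replicas for the two-node cluster
    }

This keeps the DROP/CREATE sequence in one place and leaves the subclass with nothing but the replication factor to declare.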
diff --cc test/distributed/org/apache/cassandra/fuzz/sai/SingleNodeSAITestBase.java
index b14fa6908a,0000000000..afdb52e716
mode 100644,000000..100644
--- a/test/distributed/org/apache/cassandra/fuzz/sai/SingleNodeSAITestBase.java
+++ b/test/distributed/org/apache/cassandra/fuzz/sai/SingleNodeSAITestBase.java
@@@ -1,284 -1,0 +1,350 @@@
 +/*
 + * Licensed to the Apache Software Foundation (ASF) under one
 + * or more contributor license agreements.  See the NOTICE file
 + * distributed with this work for additional information
 + * regarding copyright ownership.  The ASF licenses this file
 + * to you under the Apache License, Version 2.0 (the
 + * "License"); you may not use this file except in compliance
 + * with the License.  You may obtain a copy of the License at
 + *
 + *     http://www.apache.org/licenses/LICENSE-2.0
 + *
 + * Unless required by applicable law or agreed to in writing, software
 + * distributed under the License is distributed on an "AS IS" BASIS,
 + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 + * See the License for the specific language governing permissions and
 + * limitations under the License.
 + */
 +
 +package org.apache.cassandra.fuzz.sai;
 +
 +
- import java.util.ArrayList;
 +import java.util.HashSet;
 +import java.util.List;
 +import java.util.Set;
 +import java.util.function.Consumer;
++import java.util.function.Supplier;
 +
 +import com.google.common.collect.Streams;
 +import org.junit.AfterClass;
 +import org.junit.Before;
 +import org.junit.BeforeClass;
 +import org.junit.Test;
 +import org.slf4j.Logger;
 +import org.slf4j.LoggerFactory;
 +
 +import org.apache.cassandra.config.CassandraRelevantProperties;
 +import org.apache.cassandra.distributed.Cluster;
 +import org.apache.cassandra.distributed.api.ConsistencyLevel;
 +import org.apache.cassandra.distributed.api.IInstanceConfig;
 +import org.apache.cassandra.distributed.test.TestBaseImpl;
 +import org.apache.cassandra.harry.SchemaSpec;
 +import org.apache.cassandra.harry.dsl.HistoryBuilder;
 +import org.apache.cassandra.harry.dsl.HistoryBuilderHelper;
 +import org.apache.cassandra.harry.dsl.ReplayingHistoryBuilder;
 +import org.apache.cassandra.harry.execution.InJvmDTestVisitExecutor;
 +import org.apache.cassandra.harry.gen.EntropySource;
 +import org.apache.cassandra.harry.gen.Generator;
 +import org.apache.cassandra.harry.gen.Generators;
 +import org.apache.cassandra.harry.gen.SchemaGenerators;
++import org.apache.cassandra.index.sai.utils.IndexTermType;
 +
 +import static org.apache.cassandra.distributed.api.Feature.GOSSIP;
 +import static org.apache.cassandra.distributed.api.Feature.NETWORK;
 +import static org.apache.cassandra.harry.checker.TestHelper.withRandom;
 +import static org.apache.cassandra.harry.dsl.SingleOperationBuilder.IdxRelation;
 +
 +// TODO: "WITH OPTIONS = {'case_sensitive': 'false', 'normalize': 'true', 'ascii': 'true'};",
 +public abstract class SingleNodeSAITestBase extends TestBaseImpl
 +{
-     private static final int OPERATIONS_PER_RUN = 30_000;
-     private static final int REPAIR_SKIP = OPERATIONS_PER_RUN / 2;
-     private static final int FLUSH_SKIP = OPERATIONS_PER_RUN / 7;
-     private static final int COMPACTION_SKIP = OPERATIONS_PER_RUN / 11;
++    private static final int ITERATIONS = 5;
++
++    private static final int VALIDATION_SKIP = 739;
++    private static final int QUERIES_PER_VALIDATION = 8;
++
++    private static final int FLUSH_SKIP = 1499;
++    private static final int COMPACTION_SKIP = 1503;
++    private static final int DEFAULT_REPAIR_SKIP = 6101;
++
++    private static final int OPERATIONS_PER_RUN = DEFAULT_REPAIR_SKIP * 5;
++
++    private static final int NUM_PARTITIONS = 64;
++    private static final int NUM_VISITED_PARTITIONS = 16;
++    private static final int MAX_PARTITION_SIZE = 2000;
 +
-     private static final int NUM_PARTITIONS = OPERATIONS_PER_RUN / 1000;
-     protected static final int MAX_PARTITION_SIZE = 10_000;
 +    private static final int UNIQUE_CELL_VALUES = 5;
 +
 +    protected static final Logger logger = LoggerFactory.getLogger(SingleNodeSAITest.class);
 +    protected static Cluster cluster;
 +
-     protected SingleNodeSAITestBase()
-     {
-     }
++    protected SingleNodeSAITestBase() {}
 +
 +    @BeforeClass
 +    public static void before() throws Throwable
 +    {
 +        init(1,
 +             // At lower fetch sizes, queries w/ hundreds or thousands of matches can take a very long time.
 +             defaultConfig().andThen(c -> c.set("range_request_timeout", "180s")
 +                                           .set("read_request_timeout", "180s")
 +                                           .set("write_request_timeout", "180s")
 +                                           .set("native_transport_timeout", "180s")
 +                                           .set("slow_query_log_timeout", "180s")
 +                                           .with(GOSSIP).with(NETWORK))
 +        );
 +    }
 +
 +    protected static void init(int nodes, Consumer<IInstanceConfig> cfg) throws Throwable
 +    {
 +        cluster = Cluster.build()
 +                         .withNodes(nodes)
 +                         .withConfig(cfg)
 +                         .createWithoutStarting();
 +        cluster.startup();
 +        cluster = init(cluster);
 +    }
 +    @AfterClass
 +    public static void afterClass()
 +    {
 +        cluster.close();
 +    }
 +
 +    @Before
 +    public void beforeEach()
 +    {
-         cluster.schemaChange("DROP KEYSPACE IF EXISTS harry");
-         cluster.schemaChange("CREATE KEYSPACE harry WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};");
++        cluster.schemaChange("DROP KEYSPACE IF EXISTS " + KEYSPACE);
++        cluster.schemaChange("CREATE KEYSPACE " + KEYSPACE + " WITH replication = {'class': 'SimpleStrategy', 'replication_factor': " + rf() + "};");
++    }
++
++    protected int rf()
++    {
++        return 1;
 +    }
 +
 +    @Test
 +    public void simplifiedSaiTest()
 +    {
-         withRandom(rng -> basicSaiTest(rng, SchemaGenerators.trivialSchema("harry", "simplified", 1000).generate(rng)));
++        withRandom(rng -> saiTest(rng,
++                                  SchemaGenerators.trivialSchema(KEYSPACE, "simplified", 1000).generate(rng),
++                                  () -> true,
++                                  DEFAULT_REPAIR_SKIP));
++    }
++
++    @Test
++    public void indexOnlySaiTest()
++    {
++        for (int i = 0; i < ITERATIONS; i++)
++        {
++            logger.info("Starting iteration {}...", i);
++            withRandom(rng -> saiTest(rng,
++                                      schemaGenerator(rng.nextBoolean()).generate(rng),
++                                      () -> true,
++                                      rng.nextBoolean() ? DEFAULT_REPAIR_SKIP : Integer.MAX_VALUE));
++        }
 +    }
 +
 +    @Test
-     public void basicSaiTest()
++    public void mixedFilteringSaiTest()
 +    {
-         Generator<SchemaSpec> schemaGen = schemaGenerator();
-         withRandom(rng -> {
-             basicSaiTest(rng, schemaGen.generate(rng));
-         });
++        for (int i = 0; i < ITERATIONS; i++)
++        {
++            logger.info("Starting iteration {}...", i);
++            withRandom(rng -> saiTest(rng,
++                                      schemaGenerator(rng.nextBoolean()).generate(rng),
++                                      () -> rng.nextFloat() < 0.7f,
++                                      rng.nextBoolean() ? DEFAULT_REPAIR_SKIP : Integer.MAX_VALUE));
++        }
 +    }
 +
-     private void basicSaiTest(EntropySource rng, SchemaSpec schema)
++    private void saiTest(EntropySource rng, SchemaSpec schema, Supplier<Boolean> createIndex, int repairSkip)
 +    {
-         Set<Integer> usedPartitions = new HashSet<>();
 +        logger.info(schema.compile());
 +
 +        Generator<Integer> globalPkGen = Generators.int32(0, Math.min(NUM_PARTITIONS, schema.valueGenerators.pkPopulation()));
 +        Generator<Integer> ckGen = Generators.int32(0, schema.valueGenerators.ckPopulation());
 +
 +        CassandraRelevantProperties.SAI_INTERSECTION_CLAUSE_LIMIT.setInt(100);
 +        beforeEach();
 +        cluster.forEach(i -> i.nodetool("disableautocompaction"));
 +
 +        cluster.schemaChange(schema.compile());
-         cluster.schemaChange(schema.compile().replace(schema.keyspace + "." + schema.table,
-                                                       schema.keyspace + ".debug_table"));
-         Streams.concat(schema.clusteringKeys.stream(),
-                        schema.regularColumns.stream(),
-                        schema.staticColumns.stream())
++        cluster.schemaChange(schema.compile().replace(schema.keyspace + '.' + schema.table, schema.keyspace + ".debug_table"));
++
++        Streams.concat(schema.clusteringKeys.stream(), schema.regularColumns.stream(), schema.staticColumns.stream())
 +               .forEach(column -> {
++                   if (createIndex.get())
++                   {
++                   if (createIndex.get())
++                   {
++                       logger.info("Adding index to column {}...", column.name);
 +                       cluster.schemaChange(String.format("CREATE INDEX %s_sai_idx ON %s.%s (%s) USING 'sai' ",
-                                                           column.name,
-                                                           schema.keyspace,
-                                                           schema.table,
-                                                           column.name));
++                               column.name, schema.keyspace, schema.table, column.name));
++                   }
++                   else
++                   {
++                       logger.info("Leaving column {} unindexed...", column.name);
++                   }
 +               });
 +
 +        waitForIndexesQueryable(schema);
 +
 +        HistoryBuilder history = new ReplayingHistoryBuilder(schema.valueGenerators,
 +                                                             (hb) -> InJvmDTestVisitExecutor.builder()
 +                                                                                            .pageSizeSelector(pageSizeSelector(rng))
 +                                                                                            .consistencyLevel(consistencyLevelSelector())
 +                                                                                            .doubleWriting(schema, hb, cluster, "debug_table"));
-         List<Integer> partitions = new ArrayList<>();
-         for (int j = 0; j < 5; j++)
++        Set<Integer> partitions = new HashSet<>();
++        int attempts = 0;
++        while (partitions.size() < NUM_VISITED_PARTITIONS && attempts < NUM_VISITED_PARTITIONS * 10)
 +        {
-             int picked = globalPkGen.generate(rng);
-             if (usedPartitions.contains(picked))
-                 continue;
-             partitions.add(picked);
++            partitions.add(globalPkGen.generate(rng));
++            attempts++;
 +        }
 +
-         usedPartitions.addAll(partitions);
-         if (partitions.isEmpty())
-             return;
++        if (partitions.size() < NUM_VISITED_PARTITIONS)
++            logger.warn("Unable to generate {} partitions to visit. Continuing with {}...", NUM_VISITED_PARTITIONS, partitions.size());
++
++        Generator<Integer> pkGen = Generators.pick(List.copyOf(partitions));
++
++        // Ensure that we don't attempt to use range queries against SAI indexes that don't support them:
++        Set<Integer> eqOnlyRegularColumns = new HashSet<>();
++        for (int i = 0; i < schema.regularColumns.size(); i++)
++            if (IndexTermType.isEqOnlyType(schema.regularColumns.get(i).type.asServerType()))
++                eqOnlyRegularColumns.add(i);
++
++        Set<Integer> eqOnlyStaticColumns = new HashSet<>();
++        for (int i = 0; i < schema.staticColumns.size(); i++)
++            if (IndexTermType.isEqOnlyType(schema.staticColumns.get(i).type.asServerType()))
++                eqOnlyStaticColumns.add(i);
++
++        Set<Integer> eqOnlyClusteringColumns = new HashSet<>();
++        for (int i = 0; i < schema.clusteringKeys.size(); i++)
++            if (IndexTermType.isEqOnlyType(schema.clusteringKeys.get(i).type.asServerType()))
++                eqOnlyClusteringColumns.add(i);
 +
-         Generator<Integer> pkGen = Generators.pick(partitions);
 +        for (int i = 0; i < OPERATIONS_PER_RUN; i++)
 +        {
 +            int partitionIndex = pkGen.generate(rng);
 +            HistoryBuilderHelper.insertRandomData(schema, partitionIndex, ckGen.generate(rng), rng, 0.5d, history);
 +
 +            if (rng.nextFloat() > 0.99f)
 +            {
 +                int row1 = ckGen.generate(rng);
 +                int row2 = ckGen.generate(rng);
 +                history.deleteRowRange(partitionIndex,
 +                                       Math.min(row1, row2),
 +                                       Math.max(row1, row2),
 +                                       rng.nextInt(schema.clusteringKeys.size()),
 +                                       rng.nextBoolean(),
 +                                       rng.nextBoolean());
 +            }
 +
 +            if (rng.nextFloat() > 0.995f)
 +                HistoryBuilderHelper.deleteRandomColumns(schema, partitionIndex, ckGen.generate(rng), rng, history);
 +
 +            if (rng.nextFloat() > 0.9995f)
 +                history.deletePartition(partitionIndex);
 +
-             if (i % REPAIR_SKIP == 0)
-                 history.custom(() -> repair(schema), "Repair");
-             else if (i % FLUSH_SKIP == 0)
++            if (i % FLUSH_SKIP == 0)
 +                history.custom(() -> flush(schema), "Flush");
 +            else if (i % COMPACTION_SKIP == 0)
 +                history.custom(() -> compact(schema), "Compact");
++            else if (i % repairSkip == 0)
++                history.custom(() -> repair(schema), "Repair");
 +
-             if (i > 0 && i % 1000 == 0)
++            if (i > 0 && i % VALIDATION_SKIP == 0)
 +            {
-                 for (int j = 0; j < 5; j++)
++                for (int j = 0; j < QUERIES_PER_VALIDATION; j++)
 +                {
-                     List<IdxRelation> regularRelations = HistoryBuilderHelper.generateValueRelations(rng, schema.regularColumns.size(),
-                                                                                                      column -> Math.min(schema.valueGenerators.regularPopulation(column), UNIQUE_CELL_VALUES));
-                     List<IdxRelation> staticRelations = HistoryBuilderHelper.generateValueRelations(rng, schema.staticColumns.size(),
-                                                                                                     column -> Math.min(schema.valueGenerators.staticPopulation(column), UNIQUE_CELL_VALUES));
-                     history.select(pkGen.generate(rng),
-                                    HistoryBuilderHelper.generateClusteringRelations(rng, schema.clusteringKeys.size(), ckGen).toArray(new IdxRelation[0]),
-                                    regularRelations.toArray(new IdxRelation[regularRelations.size()]),
-                                    staticRelations.toArray(new IdxRelation[staticRelations.size()]));
++                    List<IdxRelation> regularRelations =
++                            HistoryBuilderHelper.generateValueRelations(rng,
++                                                                        schema.regularColumns.size(),
++                                                                        column -> Math.min(schema.valueGenerators.regularPopulation(column), UNIQUE_CELL_VALUES),
++                                                                        eqOnlyRegularColumns::contains);
++
++                    List<IdxRelation> staticRelations =
++                            HistoryBuilderHelper.generateValueRelations(rng,
++                                                                        schema.staticColumns.size(),
++                                                                        column -> Math.min(schema.valueGenerators.staticPopulation(column), UNIQUE_CELL_VALUES),
++                                                                        eqOnlyStaticColumns::contains);
++
++                    Integer pk = pkGen.generate(rng);
++
++                    IdxRelation[] ckRelations =
++                            HistoryBuilderHelper.generateClusteringRelations(rng,
++                                                                             schema.clusteringKeys.size(),
++                                                                             ckGen,
++                                                                             eqOnlyClusteringColumns).toArray(new IdxRelation[0]);
++
++                    IdxRelation[] regularRelationsArray = regularRelations.toArray(new IdxRelation[regularRelations.size()]);
++                    IdxRelation[] staticRelationsArray = staticRelations.toArray(new IdxRelation[staticRelations.size()]);
++
++                    history.select(pk, ckRelations, regularRelationsArray, staticRelationsArray);
 +                }
 +            }
 +        }
 +    }
 +
-     protected Generator<SchemaSpec> schemaGenerator()
++    protected Generator<SchemaSpec> schemaGenerator(boolean disableReadRepair)
 +    {
-         SchemaSpec.OptionsBuilder builder = SchemaSpec.optionsBuilder().compactionStrategy("LeveledCompactionStrategy");
++        SchemaSpec.OptionsBuilder builder = SchemaSpec.optionsBuilder().disableReadRepair(disableReadRepair)
++                                                                       .compactionStrategy("LeveledCompactionStrategy");
 +        return SchemaGenerators.schemaSpecGen(KEYSPACE, "basic_sai", MAX_PARTITION_SIZE, builder);
 +    }
 +
 +    protected void flush(SchemaSpec schema)
 +    {
 +        cluster.get(1).nodetool("flush", schema.keyspace, schema.table);
 +    }
 +
 +    protected void compact(SchemaSpec schema)
 +    {
 +        cluster.get(1).nodetool("compact", schema.keyspace);
 +    }
 +
 +    protected void repair(SchemaSpec schema)
 +    {
 +        // Repair is nonsensical for a single node, but a repair does flush first, so do that at least.
 +        cluster.get(1).nodetool("flush", schema.keyspace, schema.table);
 +    }
 +
 +    protected void waitForIndexesQueryable(SchemaSpec schema) {}
 +
 +    public static Consumer<IInstanceConfig> defaultConfig()
 +    {
 +        return (cfg) -> {
 +            cfg.set("row_cache_size", "50MiB")
 +               .set("index_summary_capacity", "50MiB")
 +               .set("counter_cache_size", "50MiB")
 +               .set("key_cache_size", "50MiB")
 +               .set("file_cache_size", "50MiB")
 +               .set("index_summary_capacity", "50MiB")
 +               .set("memtable_heap_space", "128MiB")
 +               .set("memtable_offheap_space", "128MiB")
 +               .set("memtable_flush_writers", 1)
 +               .set("concurrent_compactors", 1)
 +               .set("concurrent_reads", 5)
 +               .set("concurrent_writes", 5)
 +               .set("compaction_throughput_mb_per_sec", 10)
 +               .set("hinted_handoff_enabled", false);
 +        };
 +    }
 +
 +    protected InJvmDTestVisitExecutor.ConsistencyLevelSelector consistencyLevelSelector()
 +    {
 +        return visit -> {
 +            if (visit.selectOnly)
 +                return ConsistencyLevel.ALL;
 +
 +            // The goal here is to make replicas as out of date as possible, modulo the efforts of repair
 +            // and read-repair in the test itself.
 +            return ConsistencyLevel.NODE_LOCAL;
 +
 +        };
 +    }
 +
 +    protected InJvmDTestVisitExecutor.PageSizeSelector pageSizeSelector(EntropySource rng)
 +    {
 +        // Choosing a fetch size has implications for how well this test will exercise paging, short-read protection, and
 +        // other important parts of the distributed query apparatus. This should be set low enough to ensure that a significant
 +        // number of queries page during validation, but not so low that more expensive queries time out and fail the test.
 +        return lts -> rng.nextInt(1, 20);
 +    }
 +}
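
The three eq-only collection loops in saiTest() above share one pattern; if they grow, a small generic helper could condense them. A sketch only, with hypothetical names (the predicate would wrap IndexTermType.isEqOnlyType):

    import java.util.HashSet;
    import java.util.List;
    import java.util.Set;
    import java.util.function.Predicate;

    // Hypothetical helper: collect the indexes of columns matching a predicate,
    // e.g. columns whose SAI index supports only equality queries.
    final class EqOnlyColumnsSketch
    {
        static <T> Set<Integer> indexesWhere(List<T> columns, Predicate<T> isEqOnly)
        {
            Set<Integer> indexes = new HashSet<>();
            for (int i = 0; i < columns.size(); i++)
                if (isEqOnly.test(columns.get(i)))
                    indexes.add(i);
            return indexes;
        }
    }

Each loop would then read as a one-liner, e.g. indexesWhere(schema.regularColumns, c -> IndexTermType.isEqOnlyType(c.type.asServerType())).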
diff --cc test/harry/main/org/apache/cassandra/harry/dsl/HistoryBuilderHelper.java
index 281f62cc87,0000000000..21893a0352
mode 100644,000000..100644
--- a/test/harry/main/org/apache/cassandra/harry/dsl/HistoryBuilderHelper.java
+++ b/test/harry/main/org/apache/cassandra/harry/dsl/HistoryBuilderHelper.java
@@@ -1,195 -1,0 +1,216 @@@
 +/*
 + * Licensed to the Apache Software Foundation (ASF) under one
 + * or more contributor license agreements.  See the NOTICE file
 + * distributed with this work for additional information
 + * regarding copyright ownership.  The ASF licenses this file
 + * to you under the Apache License, Version 2.0 (the
 + * "License"); you may not use this file except in compliance
 + * with the License.  You may obtain a copy of the License at
 + *
 + *     http://www.apache.org/licenses/LICENSE-2.0
 + *
 + * Unless required by applicable law or agreed to in writing, software
 + * distributed under the License is distributed on an "AS IS" BASIS,
 + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 + * See the License for the specific language governing permissions and
 + * limitations under the License.
 + */
 +
 +package org.apache.cassandra.harry.dsl;
 +
 +import java.util.ArrayList;
++import java.util.Collections;
++import java.util.EnumSet;
 +import java.util.HashMap;
- import java.util.HashSet;
 +import java.util.List;
 +import java.util.Map;
 +import java.util.Set;
 +import java.util.function.Function;
++import java.util.function.Predicate;
 +
- import org.apache.cassandra.harry.gen.EntropySource;
- import org.apache.cassandra.harry.gen.Generator;
- import org.apache.cassandra.harry.gen.Generators;
 +import org.apache.cassandra.harry.MagicConstants;
 +import org.apache.cassandra.harry.Relations;
 +import org.apache.cassandra.harry.SchemaSpec;
++import org.apache.cassandra.harry.gen.EntropySource;
++import org.apache.cassandra.harry.gen.Generator;
++import org.apache.cassandra.harry.gen.Generators;
 +import org.apache.cassandra.harry.util.BitSet;
 +
 +import static org.apache.cassandra.harry.Relations.RelationKind.EQ;
 +import static org.apache.cassandra.harry.Relations.RelationKind.GT;
 +import static org.apache.cassandra.harry.Relations.RelationKind.GTE;
 +import static org.apache.cassandra.harry.Relations.RelationKind.LT;
 +
 +/**
 + * Things that seemed like a good idea, but ultimately were not a good fit for the HistoryBuilder API
 + */
 +public class HistoryBuilderHelper
 +{
 +    /**
 +     * Perform a random insert to any row
 +     */
 +    public static void insertRandomData(SchemaSpec schema, Generator<Integer> pkGen, Generator<Integer> ckGen, EntropySource rng, HistoryBuilder history)
 +    {
 +        insertRandomData(schema, pkGen.generate(rng), ckGen.generate(rng), rng, history);
 +    }
 +
 +    public static void insertRandomData(SchemaSpec schema, int partitionIdx, int rowIdx, EntropySource rng, HistoryBuilder history)
 +    {
 +        int[] vIdxs = new int[schema.regularColumns.size()];
 +        for (int i = 0; i < schema.regularColumns.size(); i++)
 +            vIdxs[i] = rng.nextInt(history.valueGenerators().regularPopulation(i));
 +        int[] sIdxs = new int[schema.staticColumns.size()];
 +        for (int i = 0; i < schema.staticColumns.size(); i++)
 +            sIdxs[i] = rng.nextInt(history.valueGenerators().staticPopulation(i));
 +        history.insert(partitionIdx, rowIdx, vIdxs, sIdxs);
 +    }
 +
 +    public static void insertRandomData(SchemaSpec schema, int pkIdx, EntropySource rng, HistoryBuilder history)
 +    {
 +        insertRandomData(schema,
 +                         pkIdx,
 +                         rng.nextInt(0, history.valueGenerators().ckPopulation()),
 +                         rng,
 +                         0,
 +                         history);
 +    }
 +
 +    public static void insertRandomData(SchemaSpec schema, int partitionIdx, int rowIdx, EntropySource rng, double chanceOfUnset, HistoryBuilder history)
 +    {
 +        int[] vIdxs = new int[schema.regularColumns.size()];
 +        for (int i = 0; i < schema.regularColumns.size(); i++)
 +            vIdxs[i] = rng.nextDouble() <= chanceOfUnset ? MagicConstants.UNSET_IDX : rng.nextInt(history.valueGenerators().regularPopulation(i));
 +        int[] sIdxs = new int[schema.staticColumns.size()];
 +        for (int i = 0; i < schema.staticColumns.size(); i++)
 +            sIdxs[i] = rng.nextDouble() <= chanceOfUnset ? MagicConstants.UNSET_IDX : rng.nextInt(history.valueGenerators().staticPopulation(i));
 +        history.insert(partitionIdx, rowIdx, vIdxs, sIdxs);
 +    }
 +
- 
 +    public static void deleteRandomColumns(SchemaSpec schema, int partitionIdx, int rowIdx, EntropySource rng, SingleOperationBuilder history)
 +    {
 +        Generator<BitSet> regularMask = Generators.bitSet(schema.regularColumns.size());
 +        Generator<BitSet> staticMask = Generators.bitSet(schema.staticColumns.size());
 +
 +        history.deleteColumns(partitionIdx,
 +                              rowIdx,
 +                              regularMask.generate(rng),
 +                              staticMask.generate(rng));
 +    }
 +
 +    private static final Generator<Relations.RelationKind> relationKindGen = Generators.pick(LT, GT, EQ);
 +    private static final Set<Relations.RelationKind> lowBoundRelations = Set.of(GT, GTE, EQ);
 +
++    public static List<SingleOperationBuilder.IdxRelation> generateValueRelations(EntropySource rng, int numColumns, Function<Integer, Integer> population)
++    {
++        return generateValueRelations(rng, numColumns, population, c -> false);
++    }
++
 +    /**
 +     * Generates random relations for regular and static columns for FILTERING and SAI queries.
 +     *
 +     * Will generate at most 2 relations per column:
 +     *   * generates a random relation
 +     *     * if this relation is EQ, that's the only relation that will lock this column
 +     *     * if relation is GT, next bound, if generated, will be LT
 +     *     * if relation is LT, next bound, if generated, will be GT
 +     *
 +     * @param rng - random number generator
 +     * @param numColumns - number of columns in the generated set of relationships
 +     * @param population - expected population / number of possible values for a given column
 +     * @return a list of relations
 +     */
-     public static List<SingleOperationBuilder.IdxRelation> generateValueRelations(EntropySource rng, int numColumns, Function<Integer, Integer> population)
++    public static List<SingleOperationBuilder.IdxRelation> generateValueRelations(EntropySource rng, int numColumns, Function<Integer, Integer> population, Predicate<Integer> eqOnlyColumns)
 +    {
 +        List<SingleOperationBuilder.IdxRelation> relations = new ArrayList<>();
 +        Map<Integer, Set<Relations.RelationKind>> kindsMap = new HashMap<>();
 +        int remainingColumns = numColumns;
 +        while (remainingColumns > 0)
 +        {
 +            int column = rng.nextInt(numColumns);
-             Set<Relations.RelationKind> kinds = kindsMap.computeIfAbsent(column, c -> new HashSet<>());
++            Set<Relations.RelationKind> kinds = kindsMap.computeIfAbsent(column, c -> EnumSet.noneOf(Relations.RelationKind.class));
 +            if (kinds.size() > 1 || kinds.contains(EQ))
 +                continue;
 +            Relations.RelationKind kind;
-             if (kinds.size() == 1)
++            
++            if (eqOnlyColumns.test(column))
++            {
++                kind = EQ;
++            }
++            else if (kinds.size() == 1)
 +            {
 +                if (kinds.contains(LT)) kind = GT;
 +                else kind = LT;
 +                remainingColumns--;
 +            }
 +            else
 +                // TODO: weights per relation?
 +                kind = relationKindGen.generate(rng);
 +
 +            if (kind == EQ)
 +                remainingColumns--;
 +
 +            kinds.add(kind);
 +
 +            int regularIdx = rng.nextInt(population.apply(column));
 +            relations.add(new SingleOperationBuilder.IdxRelation(kind, regularIdx, column));
 +            if (rng.nextBoolean())
 +                break;
 +        }
 +        return relations;
 +    }
 +
++    public static List<SingleOperationBuilder.IdxRelation> generateClusteringRelations(EntropySource rng, int numColumns, Generator<Integer> ckGen)
++    {
++        return generateClusteringRelations(rng, numColumns, ckGen, Collections.emptySet());
++    }
++
 +    /**
 +     * Generates random relations for clustering columns for FILTERING and SAI queries.
 +     *
 +     * Will generate at most 2 relations per column. Low bound will always use values from low bound clustering,
 +     * high bound will always use values from high bound.
 +     *
 +     * @param rng - random number generator
 +     * @param numColumns - number of columns in the generated set of relationships
 +     * @return a list of relations
 +     */
-     public static List<SingleOperationBuilder.IdxRelation> generateClusteringRelations(EntropySource rng, int numColumns, Generator<Integer> ckGen)
++    public static List<SingleOperationBuilder.IdxRelation> generateClusteringRelations(EntropySource rng, int numColumns, Generator<Integer> ckGen, Set<Integer> eqOnlyColumns)
 +    {
 +        List<SingleOperationBuilder.IdxRelation> relations = new ArrayList<>();
 +        Map<Integer, Set<Relations.RelationKind>> kindsMap = new HashMap<>();
 +        int remainingColumns = numColumns;
 +        int lowBoundIdx = ckGen.generate(rng);
 +        int highBoundIdx = ckGen.generate(rng);
 +        while (remainingColumns > 0)
 +        {
 +            int column = rng.nextInt(numColumns);
-             Set<Relations.RelationKind> kinds = kindsMap.computeIfAbsent(column, c -> new HashSet<>());
++            Set<Relations.RelationKind> kinds = kindsMap.computeIfAbsent(column, c -> EnumSet.noneOf(Relations.RelationKind.class));
 +            if (kinds.size() > 1 || kinds.contains(EQ))
 +                continue;
 +            Relations.RelationKind kind;
-             if (kinds.size() == 1)
++
++            if (eqOnlyColumns.contains(column))
++            {
++                kind = EQ;
++            }
++            else if (kinds.size() == 1)
 +            {
 +                if (kinds.contains(LT)) kind = GT;
 +                else kind = LT;
 +                remainingColumns--;
 +            }
 +            else
 +                kind = relationKindGen.generate(rng);
 +
 +            if (kind == EQ)
 +                remainingColumns--;
 +
 +            kinds.add(kind);
 +
 +            relations.add(new SingleOperationBuilder.IdxRelation(kind, lowBoundRelations.contains(kind) ? lowBoundIdx : highBoundIdx, column));
 +            if (rng.nextBoolean())
 +                break;
 +        }
 +        return relations;
 +    }
 +}
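
Both generators above apply the same relation-kind rules: EQ locks a column, an existing LT is paired with GT (and vice versa) to close a range, and eq-only columns are forced straight to EQ. Those rules in isolation (a hypothetical distillation; GTE is omitted for brevity):

    import java.util.Set;

    // Standalone sketch of the relation-kind selection rules enforced above.
    final class RelationRulesSketch
    {
        enum Kind { EQ, LT, GT }

        static Kind next(Set<Kind> existing, boolean eqOnly, Kind randomPick)
        {
            if (eqOnly)
                return Kind.EQ;              // eq-only types never get range bounds
            if (existing.contains(Kind.LT))
                return Kind.GT;              // second bound closes the range from below
            if (existing.contains(Kind.GT))
                return Kind.LT;              // second bound closes the range from above
            return randomPick;               // first relation for the column: any kind
        }
    }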
diff --cc test/unit/org/apache/cassandra/cql3/ast/CreateIndexDDL.java
index 252dd99bc9,0000000000..0984ee4620
mode 100644,000000..100644
--- a/test/unit/org/apache/cassandra/cql3/ast/CreateIndexDDL.java
+++ b/test/unit/org/apache/cassandra/cql3/ast/CreateIndexDDL.java
@@@ -1,271 -1,0 +1,262 @@@
 +/*
 + * Licensed to the Apache Software Foundation (ASF) under one
 + * or more contributor license agreements.  See the NOTICE file
 + * distributed with this work for additional information
 + * regarding copyright ownership.  The ASF licenses this file
 + * to you under the Apache License, Version 2.0 (the
 + * "License"); you may not use this file except in compliance
 + * with the License.  You may obtain a copy of the License at
 + *
 + *     http://www.apache.org/licenses/LICENSE-2.0
 + *
 + * Unless required by applicable law or agreed to in writing, software
 + * distributed under the License is distributed on an "AS IS" BASIS,
 + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 + * See the License for the specific language governing permissions and
 + * limitations under the License.
 + */
 +
 +package org.apache.cassandra.cql3.ast;
 +
 +import java.util.Arrays;
 +import java.util.EnumSet;
 +import java.util.List;
 +import java.util.Map;
 +import java.util.Optional;
- import java.util.Set;
 +import java.util.stream.Stream;
 +
- import com.google.common.collect.ImmutableSet;
- 
 +import org.apache.cassandra.cql3.ast.Symbol.UnquotedSymbol;
 +import org.apache.cassandra.db.marshal.AbstractType;
- import org.apache.cassandra.db.marshal.AsciiType;
- import org.apache.cassandra.db.marshal.BooleanType;
 +import org.apache.cassandra.db.marshal.CompositeType;
 +import org.apache.cassandra.db.marshal.UTF8Type;
- import org.apache.cassandra.db.marshal.UUIDType;
 +import org.apache.cassandra.index.sai.StorageAttachedIndex;
++import org.apache.cassandra.index.sai.utils.IndexTermType;
 +import org.apache.cassandra.schema.ColumnMetadata;
 +import org.apache.cassandra.schema.TableMetadata;
 +
 +//TODO (now): replace with IndexMetadata?  Rather than create a custom DDL type can just leverage the existing metadata like Table/Keyspace
 +public class CreateIndexDDL implements Element
 +{
 +    public enum Version { V1, V2 }
 +    public enum QueryType { Eq, Range }
 +
 +    public interface Indexer
 +    {
 +        enum Kind { legacy, sai }
 +        Kind kind();
 +        default boolean isCustom()
 +        {
 +            return kind() != Kind.legacy;
 +        }
 +        UnquotedSymbol name();
 +        default UnquotedSymbol longName()
 +        {
 +            return name();
 +        }
 +        boolean supported(TableMetadata table, ColumnMetadata column);
 +        default EnumSet<QueryType> supportedQueries(AbstractType<?> type)
 +        {
 +            return EnumSet.of(QueryType.Eq);
 +        }
 +    }
 +
 +    public static List<Indexer> supportedIndexers()
 +    {
 +        return Arrays.asList(LEGACY, SAI);
 +    }
 +
 +    private static boolean standardSupported(TableMetadata metadata, ColumnMetadata col)
 +    {
 +        if (metadata.partitionKeyColumns().size() == 1 && col.isPartitionKey()) return false;
 +        AbstractType<?> type = col.type.unwrap();
 +        if (type.isUDT() && type.isMultiCell()) return false; // non-frozen UDTs are not supported
 +        if (type.referencesDuration()) return false; // Duration is not allowed!  See org.apache.cassandra.cql3.statements.schema.CreateIndexStatement.validateIndexTarget
 +        return true;
 +    }
 +
 +    private static boolean isFrozen(AbstractType<?> type)
 +    {
 +        return !type.subTypes().isEmpty() && !type.isMultiCell();
 +    }
 +
 +    public static final Indexer LEGACY = new Indexer()
 +    {
 +
 +        @Override
 +        public Kind kind()
 +        {
 +            return Kind.legacy;
 +        }
 +
 +        @Override
 +        public UnquotedSymbol name()
 +        {
 +            return new UnquotedSymbol("legacy_local_table", UTF8Type.instance);
 +        }
 +
 +        @Override
 +        public boolean supported(TableMetadata table, ColumnMetadata column)
 +        {
 +            return standardSupported(table, column);
 +        }
 +    };
 +
-     private static final Set<AbstractType<?>> SAI_EQ_ONLY = ImmutableSet.of(UTF8Type.instance, AsciiType.instance,
-                                                                             BooleanType.instance,
-                                                                             UUIDType.instance);
- 
 +    public static final Indexer SAI = new Indexer()
 +    {
 +        @Override
 +        public Kind kind()
 +        {
 +            return Kind.sai;
 +        }
 +
 +        @Override
 +        public UnquotedSymbol name()
 +        {
 +            return new UnquotedSymbol("SAI", UTF8Type.instance);
 +        }
 +
 +        @Override
 +        public UnquotedSymbol longName()
 +        {
 +            return new UnquotedSymbol("StorageAttachedIndex", UTF8Type.instance);
 +        }
 +
 +        @Override
 +        public boolean supported(TableMetadata table, ColumnMetadata col)
 +        {
 +            if (!standardSupported(table, col)) return false;
 +            AbstractType<?> type = col.type.unwrap();
 +            if (type instanceof CompositeType)
 +            {
 +                // each element must be SUPPORTED_TYPES only...
 +                if (type.subTypes().stream().allMatch(StorageAttachedIndex.SUPPORTED_TYPES::contains))
 +                    return true;
 +            }
 +            else if (((isFrozen(type) && !type.isVector()) || StorageAttachedIndex.SUPPORTED_TYPES.contains(type.asCQL3Type())))
 +                return true;
 +            return false;
 +        }
 +
 +        @Override
 +        public EnumSet<QueryType> supportedQueries(AbstractType<?> type)
 +        {
 +            type = type.unwrap();
-             if (SAI_EQ_ONLY.contains(type))
++            if (IndexTermType.isEqOnlyType(type))
 +                return EnumSet.of(QueryType.Eq);
 +            return EnumSet.allOf(QueryType.class);
 +        }
 +    };
 +
 +    public final Version version;
 +    public final Indexer indexer;
 +    public final Optional<Symbol> name;
 +    public final TableReference on;
 +    public final List<ReferenceExpression> references;
 +    public final Map<String, String> options;
 +
 +    public CreateIndexDDL(Version version, Indexer indexer, Optional<Symbol> name, TableReference on, List<ReferenceExpression> references, Map<String, String> options)
 +    {
 +        this.version = version;
 +        this.indexer = indexer;
 +        this.name = name;
 +        this.on = on;
 +        this.references = references;
 +        this.options = options;
 +    }
 +
 +    @Override
 +    public void toCQL(StringBuilder sb, CQLFormatter formatter)
 +    {
 +        switch (version)
 +        {
 +            case V1:
 +                if (indexer.isCustom()) sb.append("CREATE CUSTOM INDEX");
 +                else                    sb.append("CREATE INDEX");
 +                break;
 +            case V2:
 +                sb.append("CREATE INDEX");
 +                break;
 +            default:
 +                throw new UnsupportedOperationException(version.name());
 +        }
 +        if (name.isPresent())
 +        {
 +            sb.append(' ');
 +            name.get().toCQL(sb, formatter);
 +        }
 +        formatter.section(sb);
 +        sb.append("ON ");
 +        on.toCQL(sb, formatter);
 +        sb.append('(');
 +        for (ReferenceExpression ref : references)
 +        {
 +            ref.toCQL(sb, formatter);
 +            sb.append(", ");
 +        }
 +        sb.setLength(sb.length() - 2); // remove last ", "
 +        sb.append(')');
 +        UnquotedSymbol indexerName = null;
 +        switch (version)
 +        {
 +            case V1:
 +                if (indexer.isCustom())
 +                    indexerName = indexer.longName();
 +                break;
 +            case V2:
 +                indexerName = indexer.name();
 +                break;
 +            default:
 +                throw new UnsupportedOperationException(version.name());
 +        }
 +        if (indexerName != null)
 +        {
 +            formatter.section(sb);
 +            sb.append("USING '");
 +            indexerName.toCQL(sb, formatter);
 +            sb.append("'");
 +        }
 +        if (!options.isEmpty())
 +        {
 +            formatter.section(sb);
 +            sb.append("WITH OPTIONS = {");
 +            for (Map.Entry<String, String> e : options.entrySet())
 +                sb.append("'").append(e.getKey()).append("': '").append(e.getValue()).append("', ");
 +            sb.setLength(sb.length() - 2); // remove ", "
 +            sb.append('}');
 +        }
 +    }
 +
 +    public static class CollectionReference implements ReferenceExpression
 +    {
 +        public enum Kind { FULL, KEYS, ENTRIES }
 +
 +        public final Kind kind;
 +        public final ReferenceExpression column;
 +
 +        public CollectionReference(Kind kind, ReferenceExpression column)
 +        {
 +            this.kind = kind;
 +            this.column = column;
 +        }
 +
 +        @Override
 +        public AbstractType<?> type()
 +        {
 +            return column.type();
 +        }
 +
 +        @Override
 +        public void toCQL(StringBuilder sb, CQLFormatter formatter)
 +        {
 +            sb.append(kind.name()).append('(');
 +            column.toCQL(sb, formatter);
 +            sb.append(')');
 +        }
 +
 +        @Override
 +        public Stream<? extends Element> stream()
 +        {
 +            return Stream.of(column);
 +        }
 +    }
 +}
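
For reference, the toCQL() logic above encodes two syntax generations: V1 emits CREATE CUSTOM INDEX ... USING '<long name>' for custom indexers, while V2 always emits CREATE INDEX ... USING '<short name>'. The prefix rule reduced to a standalone sketch (hypothetical class, not part of the patch):

    // Sketch of the V1/V2 CREATE INDEX prefix rule from CreateIndexDDL.toCQL.
    final class IndexSyntaxSketch
    {
        enum Version { V1, V2 }

        static String prefix(Version version, boolean customIndexer)
        {
            // Only the legacy (V1) syntax distinguishes custom indexers up front;
            // V2 defers the indexer choice entirely to the USING clause.
            return (version == Version.V1 && customIndexer) ? "CREATE CUSTOM INDEX"
                                                            : "CREATE INDEX";
        }
    }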


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
