Michael Marshall created CASSANDRA-21118:
--------------------------------------------
Summary: SAI query on indexed static column reads full partition
Key: CASSANDRA-21118
URL: https://issues.apache.org/jira/browse/CASSANDRA-21118
Project: Apache Cassandra
Issue Type: Bug
Reporter: Michael Marshall
The `ResultRetriever` in SAI materializes `matches` eagerly rather than
iteratively. As a result, when a static primary key is used to create the
partition iterator, we iterate the full partition regardless of the `limit`
value. Here is a test that demonstrates the problem. The test passes either
way, so you'll need to add logging or attach a debugger to observe the
full-partition read.
{code:java}
@Test
public void staticIndexOnlyMaterializesLimitRowsFromPartition() throws Throwable
{
    createTable("CREATE TABLE %s (pk int, ck int, val1 int static, val2 int, PRIMARY KEY(pk, ck))");
    disableCompaction(KEYSPACE);
    createIndex("CREATE INDEX ON %s(val1) USING 'sai'");
    execute("INSERT INTO %s(pk, ck, val1, val2) VALUES(?, ?, ?, ?)", 1, 1, 2, 1);
    for (int i = 2; i < 10000; i++)
        execute("INSERT INTO %s(pk, ck, val2) VALUES(?, ?, ?)", 1, i, i);
    beforeAndAfterFlush(() -> assertRows(execute("SELECT pk, ck, val1, val2 FROM %s WHERE val1 = 2 LIMIT 3"),
                                         row(1, 1, 2, 1), row(1, 2, 2, 2), row(1, 3, 2, 3)));
}
{code}
The proper solution is to apply an iterator-based filter so that rows are
filtered lazily. It might be worth reviewing the git history to see whether it
was implemented that way initially.
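
To illustrate the difference between eager and lazy filtering in general terms,
here is a minimal, self-contained sketch. It does not use Cassandra's actual
`ResultRetriever` or any SAI types; `LazyFilterSketch`, `eagerFilter`,
`lazyFilter`, and `rowsPulled` are hypothetical names invented for this
example, and integers stand in for rows.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;

// Hypothetical illustration only; not Cassandra code.
final class LazyFilterSketch
{
    /** Eager: drains the whole source before the caller sees any match. */
    static <T> Iterator<T> eagerFilter(Iterator<T> source, Predicate<T> predicate)
    {
        List<T> matches = new ArrayList<>();
        while (source.hasNext())
        {
            T candidate = source.next();
            if (predicate.test(candidate))
                matches.add(candidate);
        }
        return matches.iterator();
    }

    /** Lazy: pulls from the source only when the caller asks for another match. */
    static <T> Iterator<T> lazyFilter(Iterator<T> source, Predicate<T> predicate)
    {
        return new Iterator<T>()
        {
            private T pending;
            private boolean hasPending;

            @Override
            public boolean hasNext()
            {
                // Advance only until the next match is found, then stop.
                while (!hasPending && source.hasNext())
                {
                    T candidate = source.next();
                    if (predicate.test(candidate))
                    {
                        pending = candidate;
                        hasPending = true;
                    }
                }
                return hasPending;
            }

            @Override
            public T next()
            {
                if (!hasNext())
                    throw new NoSuchElementException();
                hasPending = false;
                T result = pending;
                pending = null;
                return result;
            }
        };
    }

    /** Counts how many source rows are pulled to return at most `limit` matches. */
    static int rowsPulled(boolean lazy, int partitionSize, int limit)
    {
        AtomicInteger pulled = new AtomicInteger();
        Iterator<Integer> source = new Iterator<Integer>()
        {
            private int next = 1;

            @Override
            public boolean hasNext() { return next <= partitionSize; }

            @Override
            public Integer next()
            {
                if (!hasNext())
                    throw new NoSuchElementException();
                pulled.incrementAndGet();
                return next++;
            }
        };
        // Every row matches, as when a static column value applies to the whole partition.
        Predicate<Integer> matchesAll = row -> true;
        Iterator<Integer> filtered = lazy ? lazyFilter(source, matchesAll)
                                          : eagerFilter(source, matchesAll);
        for (int i = 0; i < limit && filtered.hasNext(); i++)
            filtered.next();
        return pulled.get();
    }
}
```

With a 9999-row source and a limit of 3, `rowsPulled(false, 9999, 3)` pulls all
9999 rows, while `rowsPulled(true, 9999, 3)` pulls only 3. The real fix would
need to work with SAI's own iterator and row types, which this sketch does not
model.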