[ https://issues.apache.org/jira/browse/CASSANDRA-21118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Caleb Rackliffe updated CASSANDRA-21118:
----------------------------------------
     Bug Category: Parent values: Degradation(12984) Level 1 values: Performance Bug/Regression(12997)
       Complexity: Normal
      Component/s: Feature/SAI
    Discovered By: Code Inspection
         Severity: Normal
           Status: Open  (was: Triage Needed)

> SAI query on indexed static column reads full partition
> -------------------------------------------------------
>
>                 Key: CASSANDRA-21118
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-21118
>             Project: Apache Cassandra
>          Issue Type: Bug
>          Components: Feature/SAI
>            Reporter: Michael Marshall
>            Assignee: Caleb Rackliffe
>            Priority: Normal
>
> The `ResultRetriever` in SAI materializes `matches` eagerly instead of 
> iteratively, and as a result, when a static primary key is used to create the 
> partition iterator, we iterate the full partition, independent of the `limit` 
> value. Here is a test that demonstrates the problem (it doesn't fail, so 
> you'll need to add logging or attach a debugger).
> {code:java}
>     @Test
>     public void staticIndexOnlyMaterializesLimitRowsFromPartition() throws Throwable
>     {
>         createTable("CREATE TABLE %s (pk int, ck int, val1 int static, val2 int, PRIMARY KEY(pk, ck))");
>         disableCompaction(KEYSPACE);
>         createIndex("CREATE INDEX ON %s(val1) USING 'sai'");
>         execute("INSERT INTO %s(pk, ck, val1, val2) VALUES(?, ?, ?, ?)", 1, 1, 2, 1);
>         for (int i = 2; i < 10000; i++)
>             execute("INSERT INTO %s(pk, ck,       val2) VALUES(?, ?,    ?)", 1, i,    i);
>         beforeAndAfterFlush(() -> assertRows(execute("SELECT pk, ck, val1, val2 FROM %s WHERE val1 = 2 LIMIT 3"),
>                                              row(1, 1, 2, 1), row(1, 2, 2, 2), row(1, 3, 2, 3)));
>     }
> {code}
> The proper solution is to apply an iterator-based filter so that rows are 
> filtered lazily. It might be worth reviewing the git history to see whether 
> it was implemented that way initially.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)