Caleb Rackliffe created CASSANDRA-20189:
-------------------------------------------

             Summary: Avoid possible consistency violations for SAI 
intersection queries over repaired index matches and multiple non-indexed 
column matches
                 Key: CASSANDRA-20189
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20189
             Project: Apache Cassandra
          Issue Type: Bug
          Components: Consistency/Coordination, Feature/SAI
            Reporter: Caleb Rackliffe
            Assignee: Caleb Rackliffe


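{{FilterTree}}, which is responsible for SAI's post-filtering, is too 
aggressive about automatically using strict filtering when (a) only repaired 
matches are returned from the indexed columns and (b) there are still multiple 
non-indexed columns that must be post-filtered. If those non-indexed columns 
were updated on different replicas, no single replica satisfies every clause 
locally, even though the reconciled row does. The following test illustrates 
this: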

{noformat}
@Test
public void testPartialUpdatesOnNonIndexedColumnsAfterRepair()
{
   CLUSTER.schemaChange(withKeyspace("CREATE TABLE %s.partial_updates (k int PRIMARY KEY, a int, b int, c int) WITH read_repair = 'NONE'"));
   CLUSTER.schemaChange(withKeyspace("CREATE INDEX ON %s.partial_updates(a) USING 'sai'"));
   SAIUtil.waitForIndexQueryable(CLUSTER, KEYSPACE);

   // write the indexed column everywhere and repair, so the index match on a is repaired
   CLUSTER.coordinator(1).execute(withKeyspace("INSERT INTO %s.partial_updates(k, a) VALUES (0, 1) USING TIMESTAMP 1"), ConsistencyLevel.ALL);
   CLUSTER.get(1).nodetoolResult("repair", KEYSPACE).asserts().success();

   // insert a split row: b exists only on node 1 and c only on node 2
   CLUSTER.get(1).executeInternal(withKeyspace("INSERT INTO %s.partial_updates(k, b) VALUES (0, 2) USING TIMESTAMP 2"));
   CLUSTER.get(2).executeInternal(withKeyspace("INSERT INTO %s.partial_updates(k, c) VALUES (0, 3) USING TIMESTAMP 3"));

   // the reconciled row satisfies all three clauses, so it should be returned
   String select = withKeyspace("SELECT * FROM %s.partial_updates WHERE a = 1 AND b = 2 AND c = 3 ALLOW FILTERING");
   Object[][] initialRows = CLUSTER.coordinator(1).execute(select, ConsistencyLevel.ALL);
   assertRows(initialRows, row(0, 1, 2, 3));
}
{noformat}

This should be easy to fix without adversely affecting performance too much, 
given that the selectivity of the clauses on the indexed columns still 
determines the upper bound on how many rows can be returned to the 
coordinator for coordinator-side filtering.
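
To make the failure mode concrete, here is a minimal, self-contained toy 
model of the split row from the test. It is only a sketch: 
SplitRowFiltering, Cell, and every method in it are hypothetical names 
invented for illustration, not Cassandra classes, and it does not reproduce 
the actual FilterTree logic. It assumes each replica holds one local version 
of the row and that the coordinator reconciles by keeping the newest cell per 
column.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: these names are hypothetical and are not Cassandra classes.
final class SplitRowFiltering
{
    // A column value together with its write timestamp.
    static final class Cell
    {
        final int value;
        final long timestamp;

        Cell(int value, long timestamp)
        {
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    // Merge replica-local versions of one row, keeping the newest cell per column.
    static Map<String, Cell> reconcile(List<Map<String, Cell>> versions)
    {
        Map<String, Cell> merged = new HashMap<>();
        for (Map<String, Cell> version : versions)
            for (Map.Entry<String, Cell> e : version.entrySet())
                merged.merge(e.getKey(), e.getValue(), (x, y) -> x.timestamp >= y.timestamp ? x : y);
        return merged;
    }

    // True if every "column = value" predicate holds; a missing column never matches.
    static boolean matches(Map<String, Cell> row, Map<String, Integer> predicates)
    {
        for (Map.Entry<String, Integer> p : predicates.entrySet())
        {
            Cell cell = row.get(p.getKey());
            if (cell == null || cell.value != p.getValue())
                return false;
        }
        return true;
    }

    // Each replica applies replicaFilter locally; the coordinator reconciles whatever
    // was sent and then applies the full set of predicates.
    static int rowsReturned(List<Map<String, Cell>> replicas,
                            Map<String, Integer> replicaFilter,
                            Map<String, Integer> allPredicates)
    {
        List<Map<String, Cell>> sent = new ArrayList<>();
        for (Map<String, Cell> local : replicas)
            if (matches(local, replicaFilter))
                sent.add(local);
        return (!sent.isEmpty() && matches(reconcile(sent), allPredicates)) ? 1 : 0;
    }

    // The split row from the test: a is on both nodes, b and c live on one node each.
    static List<Map<String, Cell>> splitRow()
    {
        Map<String, Cell> node1 = new HashMap<>();
        node1.put("a", new Cell(1, 1));
        node1.put("b", new Cell(2, 2));
        Map<String, Cell> node2 = new HashMap<>();
        node2.put("a", new Cell(1, 1));
        node2.put("c", new Cell(3, 3));
        List<Map<String, Cell>> replicas = new ArrayList<>();
        replicas.add(node1);
        replicas.add(node2);
        return replicas;
    }

    // WHERE a = 1 AND b = 2 AND c = 3, or only the indexed clause a = 1.
    static Map<String, Integer> predicates(boolean indexedOnly)
    {
        Map<String, Integer> p = new HashMap<>();
        p.put("a", 1);
        if (!indexedOnly)
        {
            p.put("b", 2);
            p.put("c", 3);
        }
        return p;
    }

    // Strict: every replica applies all three clauses locally.
    static int strictRows()
    {
        return rowsReturned(splitRow(), predicates(false), predicates(false));
    }

    // Replicas apply only the indexed clause; b and c are checked after reconciliation.
    static int coordinatorFilteredRows()
    {
        return rowsReturned(splitRow(), predicates(true), predicates(false));
    }

    public static void main(String[] args)
    {
        System.out.println("strict replica-side filtering: " + strictRows());              // 0
        System.out.println("coordinator-side filtering:    " + coordinatorFilteredRows()); // 1
    }
}
```

In this model, strict filtering drops the row on both nodes because neither 
one holds b and c together, while filtering b and c after reconciliation 
returns it. The number of rows each replica sends in the second case is 
bounded by the matches on the indexed clause alone, which is the bound 
described above.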


