[ 
https://issues.apache.org/jira/browse/CASSANDRA-11223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15195854#comment-15195854
 ] 

Benjamin Lerer commented on CASSANDRA-11223:
--------------------------------------------

My initial idea was to filter out, earlier in the read path, the partitions 
containing only static columns in the cases where they should not be returned. 
Unfortunately, that was the wrong approach. The filtering cannot be done before 
we have reconciled the data and removed the tombstoned rows, because until that 
point we do not know whether a partition contains any rows. This means that 
we can end up with fewer rows than requested, as the limit has been applied on 
the replicas with the static rows taken into account.
I now think that this problem should probably be solved at the paging level. In 
the case where partitions without rows should not be returned, the static 
rows should not be counted in {{DataLimits}}.
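
To make the counting problem concrete, here is a minimal sketch in plain Java. 
It is only an illustration under simplifying assumptions, not the actual 
{{DataLimits}} code: the class {{StaticRowLimitSketch}}, the {{Partition}} 
holder and the {{applyLimit}} method are hypothetical names, and each partition 
is reduced to its key plus the clustering values that match the filter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the limit counting discussed above; not Cassandra code.
class StaticRowLimitSketch
{
    // One partition after filtering: its key and the clustering values of the matching rows.
    static final class Partition
    {
        final int key;
        final List<Integer> matchingRows;

        Partition(int key, Integer... matchingRows)
        {
            this.key = key;
            this.matchingRows = Arrays.asList(matchingRows);
        }
    }

    // Returns the rows that survive the limit. When countStaticOnly is true, a
    // partition left with only its static row still consumes one slot of the limit.
    static List<String> applyLimit(List<Partition> partitions, int limit, boolean countStaticOnly)
    {
        List<String> returned = new ArrayList<>();
        int counted = 0;
        for (Partition p : partitions)
        {
            if (counted >= limit)
                break;
            if (p.matchingRows.isEmpty())
            {
                // Only the static row is left. Counting it uses up a slot even
                // though the partition is dropped from the final result.
                if (countStaticOnly)
                    counted++;
                continue;
            }
            for (int b : p.matchingRows)
            {
                if (counted >= limit)
                    break;
                returned.add("(a=" + p.key + ", b=" + b + ")");
                counted++;
            }
        }
        return returned;
    }

    public static void main(String[] args)
    {
        // Partitions 1, 0 and 2 from the unit test, filtered on b = 1.
        List<Partition> partitions = Arrays.asList(new Partition(1, 1), new Partition(0), new Partition(2, 1));
        System.out.println(applyLimit(partitions, 2, true));  // [(a=1, b=1)]
        System.out.println(applyLimit(partitions, 2, false)); // [(a=1, b=1), (a=2, b=1)]
    }
}
```

With the static row of partition 0 counted, the limit of 2 is reached before 
partition 2 is read, so only one row comes back. Skipping static-only 
partitions in the count gives the two expected rows.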


> Queries with LIMIT filtering on clustering columns can return less row than 
> expected
> ------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-11223
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11223
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local Write-Read Paths
>            Reporter: Benjamin Lerer
>            Assignee: Benjamin Lerer
>
> A query like {{SELECT * FROM %s WHERE b = 1 LIMIT 2 ALLOW FILTERING}} can 
> return fewer rows than expected if the table has some static columns and some 
> of the partitions have no rows matching b = 1.
> The problem can be reproduced with the following unit test:
> {code}
>     public void testFilteringOnClusteringColumnsWithLimitAndStaticColumns() throws Throwable
>     {
>         createTable("CREATE TABLE %s (a int, b int, s int static, c int, primary key (a, b))");
>         for (int i = 0; i < 3; i++)
>         {
>             execute("INSERT INTO %s (a, s) VALUES (?, ?)", i, i);
>             for (int j = 0; j < 3; j++)
>                 if (!(i == 0 && j == 1))
>                     execute("INSERT INTO %s (a, b, c) VALUES (?, ?, ?)", i, j, i + j);
>         }
>         assertRows(execute("SELECT * FROM %s"),
>                    row(1, 0, 1, 1),
>                    row(1, 1, 1, 2),
>                    row(1, 2, 1, 3),
>                    row(0, 0, 0, 0),
>                    row(0, 2, 0, 2),
>                    row(2, 0, 2, 2),
>                    row(2, 1, 2, 3),
>                    row(2, 2, 2, 4));
>         assertRows(execute("SELECT * FROM %s WHERE b = 1 ALLOW FILTERING"),
>                    row(1, 1, 1, 2),
>                    row(2, 1, 2, 3));
>         // FAIL: the query below returns only one row because the static row of
>         // partition 0 is counted and then filtered out in the SELECT statement
>         assertRows(execute("SELECT * FROM %s WHERE b = 1 LIMIT 2 ALLOW FILTERING"),
>                    row(1, 1, 1, 2),
>                    row(2, 1, 2, 3));
>     }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
