[
https://issues.apache.org/jira/browse/CASSANDRA-11223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16073309#comment-16073309
]
Branimir Lambov edited comment on CASSANDRA-11223 at 7/4/17 8:54 AM:
---------------------------------------------------------------------
I do have some worries about the behaviour of {{CQLFilter}}, namely that we
throw away partitions that are left with only a static row (no row survives the
filter) before we have had a chance to merge data from multiple replicas. We are
also somewhat inconsistent in removing deletions, letting them survive only if
they accompany live data.
I am not sure I understand all the implications, but I think we are
over-filtering in {{CQLFilter}}. AFAICS this patch already removes the main
reason for doing so (counting static-only partitions in {{DataLimits}}), thus I
would prefer to change {{CQLFilter}} to only remove data that does not satisfy
the expressions and leave deletions and static rows in. This may impact the
size of the data we have to send to the coordinator, though; I don't know if
that's acceptable.
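To make the concern concrete, here is a toy sketch in plain Java. It is not
Cassandra code: {{Partition}}, {{filter}} and {{merge}} are hypothetical
stand-ins, and the two-replica scenario (replica A missed a newer static write,
replica B missed the row with b = 1) is an assumption chosen for illustration.
It only models static rows, not deletions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.IntPredicate;

final class StaticRowFilterSketch {
    /** Toy replica response for one partition (hypothetical, not a Cassandra class). */
    static final class Partition {
        final int staticValue;       // value of the static column s
        final long staticTimestamp;  // write timestamp used to reconcile s
        final List<Integer> rows;    // clustering values (column b) of the rows

        Partition(int staticValue, long staticTimestamp, List<Integer> rows) {
            this.staticValue = staticValue;
            this.staticTimestamp = staticTimestamp;
            this.rows = rows;
        }
    }

    /**
     * Replica-side filter. In strict mode a partition with no surviving row is
     * dropped entirely (null); otherwise the static row is always kept.
     */
    static Partition filter(Partition p, IntPredicate expr, boolean strict) {
        List<Integer> kept = new ArrayList<>();
        for (int b : p.rows)
            if (expr.test(b))
                kept.add(b);
        if (strict && kept.isEmpty())
            return null;
        return new Partition(p.staticValue, p.staticTimestamp, kept);
    }

    /** Coordinator-side merge: union of rows, newest static row wins. */
    static Partition merge(Partition x, Partition y) {
        if (x == null) return y;
        if (y == null) return x;
        Partition newest = x.staticTimestamp >= y.staticTimestamp ? x : y;
        List<Integer> rows = new ArrayList<>(x.rows);
        for (Integer b : y.rows)
            if (!rows.contains(b))
                rows.add(b);
        return new Partition(newest.staticValue, newest.staticTimestamp, rows);
    }

    /** Static value the coordinator ends up with for WHERE b = 1. */
    static int staticValueSeenByCoordinator(boolean strict) {
        Partition replicaA = new Partition(1, 1L, Arrays.asList(0, 1)); // older s, has b = 1
        Partition replicaB = new Partition(2, 2L, Arrays.asList(0));    // newer s, no b = 1
        IntPredicate expr = b -> b == 1;
        return merge(filter(replicaA, expr, strict),
                     filter(replicaB, expr, strict)).staticValue;
    }

    public static void main(String[] args) {
        System.out.println("strict:  s=" + staticValueSeenByCoordinator(true));
        System.out.println("lenient: s=" + staticValueSeenByCoordinator(false));
    }
}
```

In this toy model the strict filter makes replica B send nothing, so the
coordinator only ever sees replica A's older static value; the lenient filter
lets the newer static row through at the cost of a larger response.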
was (Author: blambov):
I do have some worries about the behaviour of {{CQLFilter}}, which I've
commented on
[here|https://github.com/blambov/riptanodb/blob/tpc-nopp/src/java/org/apache/cassandra/db/filter/RowFilter.java#L298]
and
[here|https://github.com/blambov/riptanodb/blob/tpc-nopp/src/java/org/apache/cassandra/db/filter/RowFilter.java#L279]
in in-progress work on the TPC branch.
I am not sure I understand all the implications, but I think we are
over-filtering in {{CQLFilter}}. AFAICS this patch already removes the main
reason for doing so (counting static-only partitions in {{DataLimits}}), thus I
would prefer to change {{CQLFilter}} to only remove data that does not satisfy
the expressions and leave deletions and static rows in. This may impact the
size of the data we have to send to the coordinator, though; I don't know if
that's acceptable.
> Queries with LIMIT filtering on clustering columns can return less rows than
> expected
> -------------------------------------------------------------------------------------
>
> Key: CASSANDRA-11223
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11223
> Project: Cassandra
> Issue Type: Bug
> Components: Local Write-Read Paths
> Reporter: Benjamin Lerer
> Assignee: Benjamin Lerer
>
> A query like {{SELECT * FROM %s WHERE b = 1 LIMIT 2 ALLOW FILTERING}} can
> return fewer rows than expected if the table has some static columns and some
> of the partitions have no rows matching b = 1.
> The problem can be reproduced with the following unit test:
> {code}
> public void testFilteringOnClusteringColumnsWithLimitAndStaticColumns() throws Throwable
> {
>     createTable("CREATE TABLE %s (a int, b int, s int static, c int, primary key (a, b))");
>     for (int i = 0; i < 3; i++)
>     {
>         execute("INSERT INTO %s (a, s) VALUES (?, ?)", i, i);
>         for (int j = 0; j < 3; j++)
>             if (!(i == 0 && j == 1))
>                 execute("INSERT INTO %s (a, b, c) VALUES (?, ?, ?)", i, j, i + j);
>     }
>     assertRows(execute("SELECT * FROM %s"),
>                row(1, 0, 1, 1),
>                row(1, 1, 1, 2),
>                row(1, 2, 1, 3),
>                row(0, 0, 0, 0),
>                row(0, 2, 0, 2),
>                row(2, 0, 2, 2),
>                row(2, 1, 2, 3),
>                row(2, 2, 2, 4));
>     assertRows(execute("SELECT * FROM %s WHERE b = 1 ALLOW FILTERING"),
>                row(1, 1, 1, 2),
>                row(2, 1, 2, 3));
>     // FAILS: it returns only one row, because the static row of partition 0
>     // is counted and filtered out in the SELECT statement.
>     assertRows(execute("SELECT * FROM %s WHERE b = 1 LIMIT 2 ALLOW FILTERING"),
>                row(1, 1, 1, 2),
>                row(2, 1, 2, 3));
> }
> {code}