[
https://issues.apache.org/jira/browse/CASSANDRA-11223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16094408#comment-16094408
]
Stefania commented on CASSANDRA-11223:
--------------------------------------
I don't think it's correct to always return false in
[ClusteringIndexNamesFilter.selectsAllPartition()|https://github.com/stef1927/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/db/filter/ClusteringIndexNamesFilter.java#L75].
It's existing code, but with this patch applied we are no longer able to count
rows for tables of the form {{CREATE TABLE %s (k int, v int, PRIMARY KEY (k) )
WITH COMPACT STORAGE}}. We don't notice in the tests because we trim the
results in {{SelectStatement}}, but it does mean that we return too much data
on the replica side in this case. I noticed because of timeouts with large
range queries on tables created by cassandra-stress.
Here is a
[test|https://github.com/apache/cassandra/compare/trunk...stef1927:11223-3.0]
for 3.0 that reproduces the problem:
{code}
@Test
public void testLimitInStaticTable() throws Throwable
{
    createTable("CREATE TABLE %s (k int, v int, PRIMARY KEY (k) ) WITH COMPACT STORAGE ");

    for (int i = 0; i < 10; i++)
        execute("INSERT INTO %s(k, v) VALUES (?, ?)", i, i);

    assertRows(execute("SELECT * FROM %s LIMIT 5"),
               row(0, 0),
               row(1, 1),
               row(2, 2),
               row(3, 3),
               row(4, 4));
}
{code}
If we temporarily comment out {{cqlRows.trim(userLimit);}} in
{{SelectStatement.process()}}, then the test only passes if we return
{{clusterings.isEmpty()}} from
{{ClusteringIndexNamesFilter.selectsAllPartition}}. However, note that I am not
100% sure this approach is correct.
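To make the two behaviours concrete, here is a minimal stand-alone sketch. It is not the real {{ClusteringIndexNamesFilter}}: the class {{NamesFilterSketch}}, both method names and the use of plain strings as clustering names are assumptions made for illustration only, and it says nothing about whether the change is actually correct.

```java
import java.util.NavigableSet;
import java.util.TreeSet;

// Hypothetical toy model of a names filter. NOT the Cassandra class;
// clusterings are plain strings here purely for illustration.
final class NamesFilterSketch
{
    private final NavigableSet<String> clusterings;

    NamesFilterSketch(NavigableSet<String> clusterings)
    {
        this.clusterings = clusterings;
    }

    // Behaviour described above as existing code: always answers false.
    boolean selectsAllPartitionCurrent()
    {
        return false;
    }

    // Behaviour tried in the experiment above: answers true only when
    // the filter names no clusterings at all.
    boolean selectsAllPartitionCandidate()
    {
        return clusterings.isEmpty();
    }
}
```

With an empty set of clusterings the two methods disagree (false versus true); with at least one clustering named, both return false.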
Once you are back from holiday, could you take a look [~blerer]?
> Queries with LIMIT filtering on clustering columns can return less rows than
> expected
> -------------------------------------------------------------------------------------
>
> Key: CASSANDRA-11223
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11223
> Project: Cassandra
> Issue Type: Bug
> Components: Local Write-Read Paths
> Reporter: Benjamin Lerer
> Assignee: Benjamin Lerer
> Fix For: 2.2.11, 3.0.15, 3.11.1, 4.0
>
>
> A query like {{SELECT * FROM %s WHERE b = 1 LIMIT 2 ALLOW FILTERING}} can
> return fewer rows than expected if the table has some static columns and some
> of the partitions have no rows matching b = 1.
> The problem can be reproduced with the following unit test:
> {code}
> public void testFilteringOnClusteringColumnsWithLimitAndStaticColumns() throws Throwable
> {
>     createTable("CREATE TABLE %s (a int, b int, s int static, c int, primary key (a, b))");
>
>     for (int i = 0; i < 3; i++)
>     {
>         execute("INSERT INTO %s (a, s) VALUES (?, ?)", i, i);
>         for (int j = 0; j < 3; j++)
>             if (!(i == 0 && j == 1))
>                 execute("INSERT INTO %s (a, b, c) VALUES (?, ?, ?)", i, j, i + j);
>     }
>
>     assertRows(execute("SELECT * FROM %s"),
>                row(1, 0, 1, 1),
>                row(1, 1, 1, 2),
>                row(1, 2, 1, 3),
>                row(0, 0, 0, 0),
>                row(0, 2, 0, 2),
>                row(2, 0, 2, 2),
>                row(2, 1, 2, 3),
>                row(2, 2, 2, 4));
>
>     assertRows(execute("SELECT * FROM %s WHERE b = 1 ALLOW FILTERING"),
>                row(1, 1, 1, 2),
>                row(2, 1, 2, 3));
>
>     // FAIL: the query below returns only one row because the static row of
>     // partition 0 is counted and filtered out in SELECT statement
>     assertRows(execute("SELECT * FROM %s WHERE b = 1 LIMIT 2 ALLOW FILTERING"),
>                row(1, 1, 1, 2),
>                row(2, 1, 2, 3));
> }
> {code}