[
https://issues.apache.org/jira/browse/CASSANDRA-11223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16085685#comment-16085685
]
Benjamin Lerer commented on CASSANDRA-11223:
--------------------------------------------
{quote}I missed it last time sorry, but we should not count the static row in
{{GroupByPrefixReversed.count()}} when {{countPartitionsWithOnlyStaticData}} is
false.{quote}
I left it unchanged on purpose.
The problem only affects range queries and multi-partition queries. Range
queries do not accept an {{ORDER BY}} clause. Multi-partition queries only
accept an {{ORDER BY}} clause when paging is off, and in that case the limit is
applied only once the rows with only static data have already been discarded.
So, in practice, changing {{GroupByPrefixReversed.count()}} has no effect.
I will add a test to prove that the behavior is correct.
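For reference, the counting rule under discussion can be modelled roughly as below. This is only a simplified sketch and not the actual Cassandra code: the class {{StaticRowCountingSketch}} and the method {{countedRows}} are hypothetical names, and only the flag name {{countPartitionsWithOnlyStaticData}} comes from the patch.

```java
/**
 * Simplified model of how many rows one partition contributes to a limit.
 * Hypothetical helper for illustration only; it is not part of Cassandra.
 */
final class StaticRowCountingSketch {

    private StaticRowCountingSketch() {}

    /**
     * @param regularRows number of live regular (non-static) rows in the partition
     * @param hasStaticRow whether the partition has live static data
     * @param countPartitionsWithOnlyStaticData whether a partition holding only
     *        static data should be counted as one row
     * @return the number of rows counted against the limit for this partition
     */
    static int countedRows(int regularRows,
                           boolean hasStaticRow,
                           boolean countPartitionsWithOnlyStaticData) {
        // The static data is merged into the regular rows, so it is never
        // counted on top of them.
        if (regularRows > 0)
            return regularRows;

        // A partition with only static data counts as one row, but only when
        // the flag asks for it.
        return (hasStaticRow && countPartitionsWithOnlyStaticData) ? 1 : 0;
    }
}
```

With the flag set to false, a partition that holds only static data contributes nothing to the limit, which is the behavior requested in the quoted remark.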
{quote}Are we sure it's fine to always count the static row in
{{ColumnFamily.liveCQL3RowCount()}}?{quote}
{{ColumnFamily.liveCQL3RowCount()}} is only used by
{{ColumnFamilyStore::isFilterFullyCoveredBy}} to check if the whole partition
is cached. Whether or not we count the static row, the answer will be correct.
We could argue about {{ColumnFamily.liveCQL3RowCount()}} independently of its
current use, but I am in favor of minimizing the changes on {{2.2}} (taking into
account that everything is different in {{3.0}}). What do you think?
{quote}Should we count static rows for the limits used by
{{RowCacheSerializer.deserialize()}}?{quote}
I think it is fine. The limit is only used to cap the number of rows
stored in the cache for each partition, and we will only count the static row if
the partition does not have any rows.
{quote}I haven't checked the 3.11 and trunk patches yet, did they apply or do
they need a full review?{quote}
{{3.11}} is different due to the {{GROUP BY}} functionality.
I pushed the updated branches.
> Queries with LIMIT filtering on clustering columns can return less rows than
> expected
> -------------------------------------------------------------------------------------
>
> Key: CASSANDRA-11223
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11223
> Project: Cassandra
> Issue Type: Bug
> Components: Local Write-Read Paths
> Reporter: Benjamin Lerer
> Assignee: Benjamin Lerer
>
> A query like {{SELECT * FROM %s WHERE b = 1 LIMIT 2 ALLOW FILTERING}} can
> return fewer rows than expected if the table has some static columns and some
> of the partitions have no rows matching b = 1.
> The problem can be reproduced with the following unit test:
> {code}
> public void testFilteringOnClusteringColumnsWithLimitAndStaticColumns() throws Throwable
> {
>     createTable("CREATE TABLE %s (a int, b int, s int static, c int, primary key (a, b))");
>
>     for (int i = 0; i < 3; i++)
>     {
>         execute("INSERT INTO %s (a, s) VALUES (?, ?)", i, i);
>         for (int j = 0; j < 3; j++)
>             if (!(i == 0 && j == 1))
>                 execute("INSERT INTO %s (a, b, c) VALUES (?, ?, ?)", i, j, i + j);
>     }
>
>     assertRows(execute("SELECT * FROM %s"),
>                row(1, 0, 1, 1),
>                row(1, 1, 1, 2),
>                row(1, 2, 1, 3),
>                row(0, 0, 0, 0),
>                row(0, 2, 0, 2),
>                row(2, 0, 2, 2),
>                row(2, 1, 2, 3),
>                row(2, 2, 2, 4));
>
>     assertRows(execute("SELECT * FROM %s WHERE b = 1 ALLOW FILTERING"),
>                row(1, 1, 1, 2),
>                row(2, 1, 2, 3));
>
>     // <-------- FAIL: it returns only one row because the static row of
>     // partition 0 is counted and filtered out in the SELECT statement
>     assertRows(execute("SELECT * FROM %s WHERE b = 1 LIMIT 2 ALLOW FILTERING"),
>                row(1, 1, 1, 2),
>                row(2, 1, 2, 3));
> }
> {code}
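
The failure described above can be illustrated with a small standalone model. It is only a sketch under simplifying assumptions, not the Cassandra read path: {{LimitWithStaticRowsSketch}} and {{rowsReturned}} are hypothetical names, each partition is assumed to have static data, and a partition is reduced to the number of its rows matching {{b = 1}}.

```java
/**
 * Toy model of a LIMIT applied across partitions that all have static data.
 * Hypothetical code for illustration only; it is not part of Cassandra.
 */
final class LimitWithStaticRowsSketch {

    private LimitWithStaticRowsSketch() {}

    /**
     * @param matchingRowsPerPartition for each partition, in scan order, the
     *        number of rows that match the filter
     * @param limit the LIMIT of the query
     * @param countStaticOnlyPartitions whether a partition with no matching row
     *        (only its static row is left) is charged against the limit
     * @return the number of rows that the client finally receives
     */
    static int rowsReturned(int[] matchingRowsPerPartition,
                            int limit,
                            boolean countStaticOnlyPartitions) {
        int counted = 0;   // rows charged against the limit
        int returned = 0;  // rows that survive the final filtering

        for (int matching : matchingRowsPerPartition) {
            if (counted >= limit)
                break; // limit reached, stop scanning

            if (matching == 0) {
                // Only the static row is left: it is discarded later, but it
                // may already have consumed one slot of the limit.
                if (countStaticOnlyPartitions)
                    counted++;
                continue;
            }

            int take = Math.min(matching, limit - counted);
            counted += take;
            returned += take;
        }
        return returned;
    }
}
```

In the unit test, partitions 1, 0 and 2 have 1, 0 and 1 rows matching {{b = 1}}. Charging the static row of partition 0 against {{LIMIT 2}} stops the scan before partition 2 is read, so only one row is returned. Without that charge, both matching rows are returned.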