[
https://issues.apache.org/jira/browse/CASSANDRA-11223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16073035#comment-16073035
]
Stefania commented on CASSANDRA-11223:
--------------------------------------
The 3.0 patch LGTM except for one concern, which regards how we count static
rows when we create a cached partition and when we check if it has enough rows.
To be on the safe side, we should not count static rows when creating cached
partitions ({{SinglePartitionReadCommand}} line 439), and we should actually
check whether the query needs to count them in
{{DataLimits.hasEnoughLiveData()}}, WDYT?
There is also some existing but related code that [~blambov] and I were
debating a while ago. In {{RowFilter}} at [line
273->288|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/db/filter/RowFilter.java#L273],
is it correct to suppress a static row when filtering on non static rows
replica side? If the static row was updated only on this replica, wouldn't the
combined view be incorrect then? Now that we don't count static rows if
filtering on reg. or clustering columns, isn't it more correct to send the
static row to the coordinator? However, would this have implications on the
digest calculations?
The 3.0 patch also has some typos:
* {{ReadQuery.selectsFullPartitions()}} javadoc, consider something like:
{code}
* Checks if this {@code ReadQuery} selects full partitions, that is it has no
filtering on clustering or regular columns.
* @return {@code true} if this {@code ReadQuery} selects full partitions,
{@code false} otherwise.
{code}
* perhaps {{ReadQuery.selectsFullPartitions()}} should be singular,
{{ReadQuery.selectsFullPartition()}} (this would also
make it consistent with the the equivalent methods in {{DataRange}} and
{{ClusteringIndexFilter}}).
* {{RowFilter.hasExpressionOnClusteringOrRegularColumns()}} javadoc:
{code}
* Checks if some expressions apply to clustering or regular columns.
* @return {@code true} if at least one expression applies to clustering or
regular columns, {@code false} otherwise.
{code}
* {{DataLimits.newCounter}}, the parameter name in the javadoc should change
from {{countsPartitionWithOnlyStaticData}} to
{{countPartitionsWithOnlyStaticData}}
--
The 2.2 patch LGTM, I found no problems or typos. I just have the same concerns
regarding how we count static rows when dealing with cached partitions.
Similarly to what discussed above, but probably not worth addressing it,
_partitions with only static data are never returned for index queries_, I am
not entirely sure that this is correct.
> Queries with LIMIT filtering on clustering columns can return less rows than
> expected
> -------------------------------------------------------------------------------------
>
> Key: CASSANDRA-11223
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11223
> Project: Cassandra
> Issue Type: Bug
> Components: Local Write-Read Paths
> Reporter: Benjamin Lerer
> Assignee: Benjamin Lerer
>
> A query like {{SELECT * FROM %s WHERE b = 1 LIMIT 2 ALLOW FILTERING}} can
> return less row than expected if the table has some static columns and some
> of the partition have no rows matching b = 1.
> The problem can be reproduced with the following unit test:
> {code}
> public void testFilteringOnClusteringColumnsWithLimitAndStaticColumns()
> throws Throwable
> {
> createTable("CREATE TABLE %s (a int, b int, s int static, c int,
> primary key (a, b))");
> for (int i = 0; i < 3; i++)
> {
> execute("INSERT INTO %s (a, s) VALUES (?, ?)", i, i);
> for (int j = 0; j < 3; j++)
> if (!(i == 0 && j == 1))
> execute("INSERT INTO %s (a, b, c) VALUES (?, ?, ?)",
> i, j, i + j);
> }
> assertRows(execute("SELECT * FROM %s"),
> row(1, 0, 1, 1),
> row(1, 1, 1, 2),
> row(1, 2, 1, 3),
> row(0, 0, 0, 0),
> row(0, 2, 0, 2),
> row(2, 0, 2, 2),
> row(2, 1, 2, 3),
> row(2, 2, 2, 4));
> assertRows(execute("SELECT * FROM %s WHERE b = 1 ALLOW FILTERING"),
> row(1, 1, 1, 2),
> row(2, 1, 2, 3));
> assertRows(execute("SELECT * FROM %s WHERE b = 1 LIMIT 2 ALLOW
> FILTERING"),
> row(1, 1, 1, 2),
> row(2, 1, 2, 3)); // <-------- FAIL It returns only one
> row because the static row of partition 0 is counted and filtered out in
> SELECT statement
> }
> {code}
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]