[ 
https://issues.apache.org/jira/browse/CASSANDRA-11223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16073035#comment-16073035
 ] 

Stefania commented on CASSANDRA-11223:
--------------------------------------

The 3.0 patch LGTM except for one concern, which regards how we count static 
rows when we create a cached partition and when we check if it has enough rows. 
To be on the safe side, we should not count static rows when creating cached 
partitions ({{SinglePartitionReadCommand}} line 439), and we should actually 
check whether the query needs to count them in 
{{DataLimits.hasEnoughLiveData()}}, WDYT?

There is also some existing but related code that [~blambov] and I were 
debating a while ago. In {{RowFilter}} at [line 
273->288|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/db/filter/RowFilter.java#L273],
 is it correct to suppress a static row when filtering on non static rows 
replica side? If the static row was updated only on this replica, wouldn't the 
combined view be incorrect then? Now that we don't count static rows if 
filtering on reg. or clustering columns, isn't it more correct to send the 
static row to the coordinator? However, would this have implications on the 
digest calculations?

The 3.0 patch also has some typos:

* {{ReadQuery.selectsFullPartitions()}} javadoc, consider something like:
{code}
* Checks if this {@code ReadQuery} selects full partitions, that is it has no 
filtering on clustering or regular columns.
* @return {@code true} if this {@code ReadQuery} selects full partitions, 
{@code false} otherwise.
{code}

* perhaps {{ReadQuery.selectsFullPartitions()}} should be singular, 
{{ReadQuery.selectsFullPartition()}} (this would also
  make it consistent with the the equivalent methods in {{DataRange}} and 
{{ClusteringIndexFilter}}).

* {{RowFilter.hasExpressionOnClusteringOrRegularColumns()}} javadoc:
{code}
* Checks if some expressions apply to clustering or regular columns.
* @return {@code true} if at least one expression applies to clustering or 
regular columns, {@code false} otherwise.
{code}

* {{DataLimits.newCounter}}, the parameter name in the javadoc should change 
from {{countsPartitionWithOnlyStaticData}} to 
{{countPartitionsWithOnlyStaticData}}

--

The 2.2 patch LGTM, I found no problems or typos. I just have the same concerns 
regarding how we count static rows when dealing with cached partitions. 
Similarly to what discussed above, but probably not worth addressing it, 
_partitions with only static data are never returned for index queries_, I am 
not entirely sure that this is correct.


> Queries with LIMIT filtering on clustering columns can return less rows than 
> expected
> -------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-11223
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11223
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local Write-Read Paths
>            Reporter: Benjamin Lerer
>            Assignee: Benjamin Lerer
>
> A query like {{SELECT * FROM %s WHERE b = 1 LIMIT 2 ALLOW FILTERING}} can 
> return less row than expected if the table has some static columns and some 
> of the partition have no rows matching b = 1.
> The problem can be reproduced with the following unit test:
> {code}
>     public void testFilteringOnClusteringColumnsWithLimitAndStaticColumns() 
> throws Throwable
>     {
>         createTable("CREATE TABLE %s (a int, b int, s int static, c int, 
> primary key (a, b))");
>         for (int i = 0; i < 3; i++)
>         {
>             execute("INSERT INTO %s (a, s) VALUES (?, ?)", i, i);
>                 for (int j = 0; j < 3; j++)
>                     if (!(i == 0 && j == 1))
>                         execute("INSERT INTO %s (a, b, c) VALUES (?, ?, ?)", 
> i, j, i + j);
>         }
>         assertRows(execute("SELECT * FROM %s"),
>                    row(1, 0, 1, 1),
>                    row(1, 1, 1, 2),
>                    row(1, 2, 1, 3),
>                    row(0, 0, 0, 0),
>                    row(0, 2, 0, 2),
>                    row(2, 0, 2, 2),
>                    row(2, 1, 2, 3),
>                    row(2, 2, 2, 4));
>         assertRows(execute("SELECT * FROM %s WHERE b = 1 ALLOW FILTERING"),
>                    row(1, 1, 1, 2),
>                    row(2, 1, 2, 3));
>         assertRows(execute("SELECT * FROM %s WHERE b = 1 LIMIT 2 ALLOW 
> FILTERING"),
>                    row(1, 1, 1, 2),
>                    row(2, 1, 2, 3)); // <-------- FAIL It returns only one 
> row because the static row of partition 0 is counted and filtered out in 
> SELECT statement
>     }
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to