[jira] [Commented] (CASSANDRA-16307) GROUP BY queries with paging can return deleted data

Alex Petrov (Jira) Thu, 11 Feb 2021 05:19:05 -0800


    [ 
https://issues.apache.org/jira/browse/CASSANDRA-16307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17283026#comment-17283026
 ]


Alex Petrov commented on CASSANDRA-16307:
-----------------------------------------

[~samt] I've tried out a slightly different approach, suggested by [~blerer], 
which is to use {{command.isExhausted}}. It checks whether the counter is 
exhausted (in other words, whether we've reached _either_ the row limit or the 
group limit), so it works for this case. It works for rows, since we count them 
as we see them, and for groups, since we only count a group once the iterator 
is closed or we encounter the next group. One advantage of this approach is 
that we avoid an extra page request when the group limit coincides with the 
row limit. Would you be able to take another look?
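
To make the counting rule above concrete, here is a minimal sketch of a counter that treats rows and groups differently. It is not the code from the patch: the class {{GroupCounterSketch}} and its methods {{onRow}}, {{onClose}} and {{isExhausted}} are hypothetical names chosen for illustration, and the model assumes that a group only becomes countable once the next group starts or the iterator is closed.

```java
import java.util.Objects;

// Hypothetical, simplified model of a paging counter. Names and structure are
// assumptions for illustration and do not mirror Cassandra's internal classes.
final class GroupCounterSketch {
    private final int rowLimit;
    private final int groupLimit;
    private int rowsCounted;
    private int groupsCounted;
    private Object openGroupKey;
    private boolean hasOpenGroup;

    GroupCounterSketch(int rowLimit, int groupLimit) {
        this.rowLimit = rowLimit;
        this.groupLimit = groupLimit;
    }

    // Rows are counted as soon as they are seen. A group is counted only when
    // a row from a different group arrives, which closes the previous group.
    void onRow(Object groupKey) {
        if (hasOpenGroup && !Objects.equals(openGroupKey, groupKey)) {
            groupsCounted++;
        }
        openGroupKey = groupKey;
        hasOpenGroup = true;
        rowsCounted++;
    }

    // Closing the iterator completes whatever group is still open.
    void onClose() {
        if (hasOpenGroup) {
            groupsCounted++;
            hasOpenGroup = false;
        }
    }

    // Exhausted when either the row limit or the group limit has been reached.
    boolean isExhausted() {
        return rowsCounted >= rowLimit || groupsCounted >= groupLimit;
    }

    public static void main(String[] args) {
        GroupCounterSketch counter = new GroupCounterSketch(10, 1);
        counter.onRow(0);
        System.out.println("after first row: " + counter.isExhausted());
        counter.onRow(1);
        System.out.println("after next group starts: " + counter.isExhausted());
    }
}
```

In this model a single check covers both limits, which is why no separate follow-up request is needed when the two limits are reached at the same time.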

> GROUP BY queries with paging can return deleted data
> ----------------------------------------------------
>
>                 Key: CASSANDRA-16307
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16307
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Coordination
>            Reporter: Andres de la Peña
>            Assignee: Alex Petrov
>            Priority: Normal
>             Fix For: 3.11.x, 4.0-beta
>
>
> {{GROUP BY}} queries using paging and CL>ONE/LOCAL_ONE can return deleted 
> data. This dtest reproduces the problem:
> {code:java}
> try (Cluster cluster = init(Cluster.create(2)))
> {
>     cluster.schemaChange(withKeyspace("CREATE TABLE %s.t (pk int, ck int, PRIMARY KEY (pk, ck))"));
>     ICoordinator coordinator = cluster.coordinator(1);
>     coordinator.execute(withKeyspace("INSERT INTO %s.t (pk, ck) VALUES (0, 0)"), ConsistencyLevel.ALL);
>     coordinator.execute(withKeyspace("INSERT INTO %s.t (pk, ck) VALUES (1, 1)"), ConsistencyLevel.ALL);
>
>     cluster.get(1).executeInternal(withKeyspace("DELETE FROM %s.t WHERE pk=0 AND ck=0"));
>     cluster.get(2).executeInternal(withKeyspace("DELETE FROM %s.t WHERE pk=1 AND ck=1"));
>     String query = withKeyspace("SELECT * FROM %s.t GROUP BY pk");
>     Iterator<Object[]> rows = coordinator.executeWithPaging(query, ConsistencyLevel.ALL, 1);
>     assertRows(Iterators.toArray(rows, Object[].class));
> }
> {code}
> Using a 2-node cluster and RF=2, the test inserts two partitions on both 
> nodes. It then deletes each row locally on a different node, so each node sees 
> a different partition as live, but reconciliation should produce no live 
> partitions. However, a {{GROUP BY}} query using a page size of 1 wrongly 
> returns one of the rows.
> This was detected during CASSANDRA-16180 and is probably related to 
> CASSANDRA-15459, which solved a similar problem for group-by queries with 
> a limit instead of paging.
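
The expected outcome of the reproduction can be illustrated with a small, self-contained model of reconciliation. This is only a sketch under stated assumptions: {{ReconcileSketch}}, {{Version}}, {{merge}} and {{liveKeys}} are hypothetical names, each row is reduced to a timestamp plus a tombstone flag, the newest version wins, and a tombstone wins on a timestamp tie. It does not use Cassandra's actual read path.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model: each replica maps a partition key to its newest version.
final class ReconcileSketch {
    static final class Version {
        final long timestamp;
        final boolean deleted; // true when this version is a tombstone

        Version(long timestamp, boolean deleted) {
            this.timestamp = timestamp;
            this.deleted = deleted;
        }
    }

    // Pick the newer version; on a timestamp tie the tombstone wins (assumed rule).
    static Version newer(Version x, Version y) {
        if (x.timestamp != y.timestamp) {
            return x.timestamp > y.timestamp ? x : y;
        }
        return x.deleted ? x : y;
    }

    // Merge two replica views key by key, keeping the newer version of each.
    static Map<Integer, Version> merge(Map<Integer, Version> a, Map<Integer, Version> b) {
        Map<Integer, Version> out = new TreeMap<>(a);
        for (Map.Entry<Integer, Version> e : b.entrySet()) {
            out.merge(e.getKey(), e.getValue(), ReconcileSketch::newer);
        }
        return out;
    }

    // Keys whose surviving version is not a tombstone.
    static List<Integer> liveKeys(Map<Integer, Version> view) {
        List<Integer> live = new ArrayList<>();
        for (Map.Entry<Integer, Version> e : view.entrySet()) {
            if (!e.getValue().deleted) {
                live.add(e.getKey());
            }
        }
        return live;
    }

    public static void main(String[] args) {
        Map<Integer, Version> node1 = new HashMap<>();
        node1.put(0, new Version(2, true));  // pk=0 deleted locally on node 1
        node1.put(1, new Version(1, false)); // pk=1 still live on node 1

        Map<Integer, Version> node2 = new HashMap<>();
        node2.put(0, new Version(1, false)); // pk=0 still live on node 2
        node2.put(1, new Version(2, true));  // pk=1 deleted locally on node 2

        System.out.println("node1 live: " + liveKeys(node1));
        System.out.println("node2 live: " + liveKeys(node2));
        System.out.println("merged live: " + liveKeys(merge(node1, node2)));
    }
}
```

Each replica on its own reports one live partition, while the merged view reports none, which is the result the dtest asserts with an empty {{assertRows}}.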


