[
https://issues.apache.org/jira/browse/CASSANDRA-15459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17138356#comment-17138356
]
Andres de la Peña commented on CASSANDRA-15459:
-----------------------------------------------
I think that the way "row counted" is used to calculate limits is not the cause
of the problem here. The comments about this usage in
[{{ShortReadPartitionsProtection#counted}}|https://github.com/apache/cassandra/blob/4f50a6712ada5c4298ec860836015ea15049cbda/src/java/org/apache/cassandra/service/DataResolver.java#L762-L768]
and
[{{ShortReadPartitionsProtection.ShortReadRowsProtection#countedInCurrentPartition}}|https://github.com/apache/cassandra/blob/4f50a6712ada5c4298ec860836015ea15049cbda/src/java/org/apache/cassandra/service/DataResolver.java#L907-L913]
do not seem to describe what the method is actually doing. These comments
were added during CASSANDRA-10707, replacing the original comments added
during CASSANDRA-13794
([here|https://github.com/apache/cassandra/commit/2bae4ca907ac4d2ab53c899e5cf5c9e4de631f52]).
I'm restoring the original description of those methods.
It seems to me that the cause of the error is
[here|https://github.com/apache/cassandra/blob/4f50a6712ada5c4298ec860836015ea15049cbda/src/java/org/apache/cassandra/db/filter/DataLimits.java#L966].
Implementations of {{DataLimits.Counter}} are meant both to count results and
to limit them, being a {{StoppingTransformation}}. The method
{{DataLimits.Counter#onlyCount}} makes it possible to disable their
result-limiting behaviour, so they only count results without transforming
them. The counter
[{{singleResultCounter}}|https://github.com/apache/cassandra/blob/4f50a6712ada5c4298ec860836015ea15049cbda/src/java/org/apache/cassandra/service/DataResolver.java#L630-L631]
in short read protection uses this read-only behaviour, so it counts the
replica results without truncating them, in case more replica results are
needed after reconciliation. However, the method
{{GroupByAwareCounter#applyToRow}} unconditionally returns a {{null}} row when
the read partition has more rows than specified by the limit, which violates
the count-only behaviour. Something similar happens in
{{GroupByAwareCounter#applyToStatic}}. The proposed PR simply takes
{{Counter.enforceLimits}} into account to prevent this filtering.
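To illustrate the distinction between enforcing and count-only modes, here is a minimal self-contained sketch. The names ({{CountingFilter}}, {{applyToRow}}, {{enforceLimits}}) are modelled on the classes discussed above but are hypothetical; this is not Cassandra's actual implementation, just the shape of the proposed guard.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of a counter that can either enforce a row limit or only
// count rows without filtering them (analogous to Counter#onlyCount).
public class CountingFilter {
    private final int limit;
    private final boolean enforceLimits; // false == count-only mode
    private int counted = 0;

    CountingFilter(int limit, boolean enforceLimits) {
        this.limit = limit;
        this.enforceLimits = enforceLimits;
    }

    // Analogous to applyToRow: the bug was dropping rows (returning
    // null) past the limit even in count-only mode; the fix checks
    // the enforceLimits flag before filtering.
    String applyToRow(String row) {
        counted++;
        if (enforceLimits && counted > limit)
            return null; // drop rows beyond the limit only when enforcing
        return row;      // count-only mode: never transform the results
    }

    List<String> filter(List<String> rows) {
        List<String> out = new ArrayList<>();
        for (String r : rows) {
            String kept = applyToRow(r);
            if (kept != null)
                out.add(kept);
        }
        return out;
    }

    int counted() { return counted; }

    public static void main(String[] args) {
        List<String> rows = List.of("r1", "r2", "r3");
        System.out.println(new CountingFilter(2, true).filter(rows));  // [r1, r2]
        CountingFilter countOnly = new CountingFilter(2, false);
        System.out.println(countOnly.filter(rows)); // [r1, r2, r3]
        System.out.println(countOnly.counted());    // 3
    }
}
```

In short read protection the coordinator needs the full replica results after counting, which is why the count-only path must pass every row through unchanged.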
The dtest PR just adds the excellent test provided by [~jasonstack] in the
description, with a minimal change to disable read-repair in 3.11 with Byteman,
because we don't have the {{NONE}} repair strategy available in that version.
I'm also excluding 3.0 because {{GROUP BY}} was added in 3.10.
CI results:
||branch||ci-cassandra utest||ci-cassandra dtest||CircleCI j8||CircleCI j11||
|[3.11|https://github.com/apache/cassandra/pull/635]|[126|https://ci-cassandra.apache.org/view/all/job/Cassandra-devbranch-test/126/]|[163|https://ci-cassandra.apache.org/view/all/job/Cassandra-devbranch-dtest/163/]|[link|https://app.circleci.com/pipelines/github/adelapena/cassandra/53/workflows/ce4d2cad-a811-43af-a215-b4ea71260d0e]||
|[trunk|https://github.com/apache/cassandra/pull/636]|[127|https://ci-cassandra.apache.org/view/all/job/Cassandra-devbranch-test/127/]|[164|https://ci-cassandra.apache.org/view/all/job/Cassandra-devbranch-dtest/163/]|[link|https://app.circleci.com/pipelines/github/adelapena/cassandra/55/workflows/9f73dadc-a963-43b6-8792-ea5e0c0e17c8]|[link|https://app.circleci.com/pipelines/github/adelapena/cassandra/55/workflows/56a025b0-4c18-4eae-9f77-2c261b1d2cc5]|
CC [~blerer]
> Short read protection doesn't work on group-by queries
> ------------------------------------------------------
>
> Key: CASSANDRA-15459
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15459
> Project: Cassandra
> Issue Type: Bug
> Components: Legacy/Coordination
> Reporter: ZhaoYang
> Assignee: Andres de la Peña
> Priority: Normal
> Labels: correctness
> Fix For: 3.11.7, 4.0-beta
>
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> [DTest to
> reproduce|https://github.com/apache/cassandra-dtest/compare/master...jasonstack:srp_group_by_trunk?expand=1]:
> It affects all versions.
> {code}
> In a two-node cluster with RF = 2
> Execute only on Node1:
> * Insert pk=1 and ck=1 with timestamp 9
> * Delete pk=0 and ck=0 with timestamp 10
> * Insert pk=2 and ck=2 with timestamp 9
> Execute only on Node2:
> * Delete pk=1 and ck=1 with timestamp 10
> * Insert pk=0 and ck=0 with timestamp 9
> * Delete pk=2 and ck=2 with timestamp 10
> Query: "SELECT pk, c FROM %s GROUP BY pk LIMIT 1"
> * Expect no live data, but got [0, 0]
> {code}
> Note: for group-by queries, SRP should use "group counted" to calculate
> limits used for SRP query, rather than "row counted".
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]