[
https://issues.apache.org/jira/browse/CASSANDRA-15992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189647#comment-17189647
]
Adam Holmberg commented on CASSANDRA-15992:
-------------------------------------------
I have an environment that reproduces the error roughly once every 100 runs,
though not deterministically. The assertion fails because we sometimes never
reach
[here|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/reads/ShortReadPartitionsProtection.java#L151]
for one of the two SRP iterators created for this query (one per node). The
only way I can see for this to happen is for results iteration to stop before
exhausting both.

My current working theory is that this is possible, and okay, due to
[concurrency|https://github.com/apache/cassandra/blob/614d7d06f4964f03681e9e90d98ddf3562c47598/src/java/org/apache/cassandra/service/StorageProxy.java#L2126-L2141],
[potential
variation|https://github.com/apache/cassandra/blob/614d7d06f4964f03681e9e90d98ddf3562c47598/src/java/org/apache/cassandra/locator/ReplicaPlans.java#L616-L635]
in replica iteration, and the potential to
[stop|https://github.com/apache/cassandra/blob/614d7d06f4964f03681e9e90d98ddf3562c47598/src/java/org/apache/cassandra/db/transform/BasePartitions.java#L93]
inner iteration early when [limits are in
play|https://github.com/apache/cassandra-dtest/blob/b00d0c310ff61d3f39c116daeccdf43aa63f2b25/consistency_test.py#L1281].
Sadly, I have never been able to reproduce this once any instrumentation is
added to the code.
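To make the theory concrete, here is a toy Python model of the early-stop effect. It is not Cassandra code: `node_rows`, `read_with_limit`, and the row values are hypothetical stand-ins, and it models only the limit, not the concurrency or replica ordering. Each per-node generator records itself only once it has been fully drained, which stands in for the step that is sometimes never reached.

```python
from itertools import chain, islice


def node_rows(name, rows, exhausted):
    """Yield one node's rows; record the node only after it is fully drained."""
    for row in rows:
        yield row
    # Hypothetical stand-in for the step that is only reached on exhaustion.
    exhausted.append(name)


def read_with_limit(limit):
    """Consume two per-node iterators, stopping once `limit` rows are seen."""
    exhausted = []
    node1 = node_rows("node1", [1, 2, 3], exhausted)
    node2 = node_rows("node2", [4, 5, 6], exhausted)
    rows = list(islice(chain(node1, node2), limit))
    return rows, exhausted
```

With a limit of 4 the consumer stops partway through the second iterator, so only `node1` is ever recorded; with a limit larger than the total row count both are recorded.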
With empirical observation stalled, I wanted to float this static analysis for
discussion. If it holds water, the change would simply be to update the test to
require only that the SRP counter be at least the value of one of the node
ranges (4 or 5).
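A minimal sketch of what the relaxed check might look like, assuming "at least the value of one of the node ranges" means the smaller of the two (4). The helper name `srp_count_is_acceptable` and its default argument are hypothetical and do not come from the dtest itself.

```python
def srp_count_is_acceptable(count, node_range_counts=(4, 5)):
    """Return True if the SRP counter covers at least one node's range.

    Assumption: the lower bound is the smaller per-node value, since either
    node's iterator may be the one that is fully driven.
    """
    return count >= min(node_range_counts)


# In the test, the strict equality could then become something like:
#   assert srp_count_is_acceptable(jmx.read_attribute(srp, 'Count'))
```

Both the fully counted value of 9 and the observed value of 5 pass this check, while a counter below 4 would still fail.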
> Fix flaky python dtest test_13595 - consistency_test.TestConsistency
> --------------------------------------------------------------------
>
> Key: CASSANDRA-15992
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15992
> Project: Cassandra
> Issue Type: Bug
> Components: Test/dtest/python
> Reporter: David Capwell
> Assignee: Adam Holmberg
> Priority: Normal
> Fix For: 4.0-beta
>
>
> https://app.circleci.com/pipelines/github/dcapwell/cassandra/355/workflows/7b8df61d-706f-4094-a206-7cdc6b4e0451/jobs/1818
> {code}
> > assert 9 == jmx.read_attribute(srp, 'Count')
> E AssertionError: assert 9 == 5
> {code}