[
https://issues.apache.org/jira/browse/CASSANDRA-15650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17066854#comment-17066854
]
David Capwell commented on CASSANDRA-15650:
-------------------------------------------
To also help with the review: a few things changed, but only a handful solve
the problem.
1) There is a bug in nodetool repair (client side) where the error notification
isn't seen (expected behavior, since notifications are lossy) but the complete
notification is. In this case the return code is 0 even though the message says
there was a failure. I changed it so we double check for errors.
2) The slow tests take 6m10s on my laptop with only 2 cores running; the dtest
timeout is 6m. I split the tests into 2 (which results in 3 new implementations:
preview, IR, full), which brings them down to 4m30s under the same settings.
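To illustrate the double check in item 1, here is a minimal sketch in Java. It
is not the patch itself: the class name, the method, and the statusLookup
parameter are all hypothetical, and the sketch only assumes that the client has
some second way to ask whether the repair failed once it sees the complete
notification.

```java
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical sketch; none of these names come from the Cassandra code base.
final class RepairExitCodeSketch {
    /**
     * Decide the exit code once the "complete" notification has arrived.
     *
     * @param sawErrorNotification true if an error notification reached the client
     * @param statusLookup         assumed second source of truth, queried on completion;
     *                             returns a failure message if the repair failed
     */
    static int exitCode(boolean sawErrorNotification, Supplier<Optional<String>> statusLookup) {
        if (sawErrorNotification) {
            return 1;
        }
        // Notifications are lossy, so a missing error notification proves nothing.
        // Double check before reporting success.
        return statusLookup.get().isPresent() ? 1 : 0;
    }

    public static void main(String[] args) {
        // The error notification was lost, but the lookup still reports the failure.
        System.out.println(exitCode(false, () -> Optional.of("repair failed")));
        // No error was seen and the lookup reports none.
        System.out.println(exitCode(false, Optional::empty));
    }
}
```

The point of the sketch is only that success is never inferred from the absence
of an error notification alone.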
The other changes are
1) Better assert messages. These were useful for seeing issue #1, but they also
generally offer more detail than before.
2) Thread renaming. This is an operational improvement, as it lets us see which
threads are running which repair. It ties in well with the improvements to #2,
as it will show which repair timed out rather than just "there exists a
repair".
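As a rough illustration of the thread renaming idea (again a sketch, not the
code in the patch; the helper name and the example thread name are made up),
a task can temporarily rename the thread it runs on so that a thread dump shows
which repair it belongs to:

```java
import java.util.function.Supplier;

// Hypothetical sketch; the helper and the naming scheme are assumptions.
final class ThreadNamingSketch {
    /** Run the task with the current thread renamed, then restore the old name. */
    static <T> T callNamed(String name, Supplier<T> task) {
        Thread current = Thread.currentThread();
        String original = current.getName();
        current.setName(name);
        try {
            return task.get();
        } finally {
            // Restore the name so pooled threads do not keep a stale label.
            current.setName(original);
        }
    }

    public static void main(String[] args) {
        // The name below is illustrative only.
        String seen = callNamed("Repair-example-id", () -> Thread.currentThread().getName());
        System.out.println(seen);
    }
}
```

With names like this, a timed-out test's thread dump can point at a specific
repair instead of an anonymous pool thread.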
> Fix flaky test
> org.apache.cassandra.distributed.test.*RepairCoordinatorFastTest
> -------------------------------------------------------------------------------
>
> Key: CASSANDRA-15650
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15650
> Project: Cassandra
> Issue Type: Bug
> Components: Test/dtest
> Reporter: David Capwell
> Assignee: David Capwell
> Priority: Normal
> Labels: pull-request-available
> Fix For: 4.0-alpha
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Test failure:
> https://app.circleci.com/pipelines/github/dcapwell/cassandra/177/workflows/3dff37a5-9bf4-40e2-8d5b-f127b416dc79/jobs/862
> {code}
> [junit-timeout] Testcase:
> onlyCoordinator[SEQUENTIAL/true](org.apache.cassandra.distributed.test.FullRepairCoordinatorFastTest):
> FAILED
> [junit-timeout] nodetool command repair was successful but not expected to
> be. Actual: 0
> [junit-timeout] junit.framework.AssertionFailedError: nodetool command repair
> was successful but not expected to be. Actual: 0
> [junit-timeout] at
> org.apache.cassandra.distributed.api.NodeToolResult$Asserts.failure(NodeToolResult.java:76)
> [junit-timeout] at
> org.apache.cassandra.distributed.test.RepairCoordinatorFast.onlyCoordinator(RepairCoordinatorFast.java:255)
> [junit-timeout] at
> java.util.concurrent.FutureTask.run(FutureTask.java:266)
> [junit-timeout] at java.lang.Thread.run(Thread.java:748)
> {code}
> [Circle CI
> LOWER|https://circleci.com/gh/dcapwell/cassandra/tree/bug%2FrepairCoordinatorTestFlaky]