[ https://issues.apache.org/jira/browse/CASSANDRA-15650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17066854#comment-17066854 ]
David Capwell commented on CASSANDRA-15650:
-------------------------------------------

To also help with the review: a few things changed, but only a handful solve the problem.

1) There is a bug in nodetool repair (client side) where the error notification isn't seen (expected behavior, since notifications are lossy) but the complete notification is. In this case the return code is 0 even though the message says there was a failure. I changed it so we double check for errors.
2) The slow tests take 6m10s on my laptop with only 2 cores running; the dtest timeout is 6m. I split the tests into 2 (which causes 3 new implementations: preview, IR, full), which now gets it to 4m30s under the same settings.

The other changes are:

1) Better assert messages. These are useful for seeing issue #1, but generally offer more detail than before.
2) Thread renaming. This is an operational improvement, as it makes it easier to see which threads are running which repair. This ties in well with the improvements to #2, as it will show which repair timed out rather than "there exists a repair".

> Fix flaky test org.apache.cassandra.distributed.test.*RepairCoordinatorFastTest
> -------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-15650
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15650
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Test/dtest
>            Reporter: David Capwell
>            Assignee: David Capwell
>            Priority: Normal
>              Labels: pull-request-available
>             Fix For: 4.0-alpha
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Test failure:
> https://app.circleci.com/pipelines/github/dcapwell/cassandra/177/workflows/3dff37a5-9bf4-40e2-8d5b-f127b416dc79/jobs/862
> {code}
> [junit-timeout] Testcase: onlyCoordinator[SEQUENTIAL/true](org.apache.cassandra.distributed.test.FullRepairCoordinatorFastTest): FAILED
> [junit-timeout] nodetool command repair was successful but not expected to be. Actual: 0
> [junit-timeout] junit.framework.AssertionFailedError: nodetool command repair was successful but not expected to be. Actual: 0
> [junit-timeout]     at org.apache.cassandra.distributed.api.NodeToolResult$Asserts.failure(NodeToolResult.java:76)
> [junit-timeout]     at org.apache.cassandra.distributed.test.RepairCoordinatorFast.onlyCoordinator(RepairCoordinatorFast.java:255)
> [junit-timeout]     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> [junit-timeout]     at java.lang.Thread.run(Thread.java:748)
> {code}
> [Circle CI LOWER|https://circleci.com/gh/dcapwell/cassandra/tree/bug%2FrepairCoordinatorTestFlaky]

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
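
For reviewers, here is a minimal sketch of the "double check for errors" idea from item 1 of the comment. It is not the code in the patch. Every name in it (RepairExitCodeSketch, Event, FinalStatus, StatusSource, exitCode) is hypothetical and chosen only for illustration; the sketch assumes the client can ask the server for the final repair status after the complete notification arrives, instead of trusting the notification stream alone.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative only: none of these types exist in Cassandra.
final class RepairExitCodeSketch {
    // Notifications the client may receive; any of them can be dropped in transit.
    enum Event { PROGRESS, ERROR, COMPLETE }

    // Hypothetical final state that the server reports when asked directly.
    enum FinalStatus { COMPLETED, FAILED }

    // Hypothetical direct query; unlike a notification, it cannot be silently lost.
    interface StatusSource {
        FinalStatus finalStatus();
    }

    // Returns 1 on failure and 0 on success.
    static int exitCode(List<Event> received, StatusSource source) {
        // An error notification that did arrive is enough to fail.
        if (received.contains(Event.ERROR)) {
            return 1;
        }
        // The error notification may have been lost, so COMPLETE alone is not
        // proof of success: double check with the server before returning 0.
        return source.finalStatus() == FinalStatus.FAILED ? 1 : 0;
    }

    public static void main(String[] args) {
        // The stream from the failing test: ERROR was dropped, COMPLETE arrived.
        List<Event> lossy = Arrays.asList(Event.PROGRESS, Event.COMPLETE);
        System.out.println(exitCode(lossy, () -> FinalStatus.FAILED));
        System.out.println(exitCode(lossy, () -> FinalStatus.COMPLETED));
    }
}
```

Without the second check, the lossy stream above would produce exit code 0 for a failed repair, which matches the "Actual: 0" assertion failure quoted in the issue.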