jaydeepkumar1984 opened a new pull request, #4139: URL: https://github.com/apache/cassandra/pull/4139
The AutoRepair InJvm dtest has been found to be flaky. The issue is hard to reproduce, but here is the theory:

**Problem:** The InJvm dtest checks whether the _nodeRepairTimeInSec_ metric is > 0. In most cases the repair takes some time, so the metric is "> 0". But there is a corner case: if the repair finishes very quickly, say in 900 ms, the metric remains 0.

**Fix:**
1. AutoRepair already uses SLEEP_IF_REPAIR_FINISHES_QUICKLY for such cases, but the metrics are calculated before this sleep interval. This PR sleeps for SLEEP_IF_REPAIR_FINISHES_QUICKLY first and then calculates the metrics.
2. Make the InJvm dtest more aggressive by reducing min_repair_interval and increasing the concurrency.
3. Add the node's broadcast address to the _Assert_ message so that, if the test fails again, it is clear exactly which node failed.

Jira: [CASSANDRA-20620](https://issues.apache.org/jira/projects/CASSANDRA/issues/CASSANDRA-20620)

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
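
The ordering change in fix 1 can be illustrated with a minimal, self-contained Java sketch. It is not the actual AutoRepair code: the class, the method names, the fake clock, and the 2-second value used for SLEEP_IF_REPAIR_FINISHES_QUICKLY are all hypothetical, and it assumes the metric is derived by truncating elapsed milliseconds to whole seconds, which would explain why a 900 ms repair reports 0. For simplicity the sleep is applied unconditionally.

```java
import java.util.concurrent.TimeUnit;

public class RepairTimingSketch {
    // Hypothetical value; the PR text does not state the real duration.
    public static final long SLEEP_IF_REPAIR_FINISHES_QUICKLY_MS = 2_000;

    /** Manually advanced clock so the example runs instantly (illustration only). */
    public static final class FakeClock {
        private long nowMs = 0;
        public long nowMs() { return nowMs; }
        public void advance(long ms) { nowMs += ms; }
    }

    /** Old order: record the metric first, then sleep. */
    public static long metricRecordedBeforeSleep(FakeClock clock, long repairMs) {
        long start = clock.nowMs();
        clock.advance(repairMs); // stands in for the repair itself
        // Assumed conversion: whole seconds, fractional part dropped.
        long metricSec = TimeUnit.MILLISECONDS.toSeconds(clock.nowMs() - start);
        clock.advance(SLEEP_IF_REPAIR_FINISHES_QUICKLY_MS); // sleep comes too late to count
        return metricSec;
    }

    /** New order: sleep first, then record the metric. */
    public static long metricRecordedAfterSleep(FakeClock clock, long repairMs) {
        long start = clock.nowMs();
        clock.advance(repairMs); // stands in for the repair itself
        clock.advance(SLEEP_IF_REPAIR_FINISHES_QUICKLY_MS); // sleep is now part of the measured time
        return TimeUnit.MILLISECONDS.toSeconds(clock.nowMs() - start);
    }

    public static void main(String[] args) {
        System.out.println(metricRecordedBeforeSleep(new FakeClock(), 900)); // prints 0
        System.out.println(metricRecordedAfterSleep(new FakeClock(), 900));  // prints 2
    }
}
```

Under these assumptions, a check of the form "metric > 0" fails for the 900 ms repair in the old order and passes in the new one, which matches the flakiness theory described in the pull request.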