janhoy opened a new pull request, #3957: URL: https://github.com/apache/solr/pull/3957
This test fails in only about 1% of runs, but that is often enough for a failure to show up almost every day; see [develocity](https://develocity.apache.org/scans/tests?search.buildToolType=gradle&search.relativeStartTime=P28D&search.rootProjectNames=solr-root&search.tasks=test&search.timeZoneId=Europe%2FOslo&tests.container=org.apache.solr.cloud.LeaderElectionTest&tests.sortField=FLAKY&tests.test=testStressElection). I used Claude to analyze the likely cause of the failure, and this PR proposes a hardening of the test.

The log during a test failure typically shows:

> java.lang.RuntimeException: Could not get leader props for collection1 shard1

This happens in `getLeaderUrl()`, either when a 30s retry loop is exhausted or when an unexpected exception is thrown inside the retry loop.

The fix is to wrap the entire assert with `RetryUtils`, which retries the entire assert once more on this exception. After all the stress havoc of killed threads and ZK exceptions (visible in the logs), this makes the test more robust while still proving that leader election eventually succeeds, which is the intent of the test.

https://issues.apache.org/jira/browse/SOLR-17890
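
To illustrate the retry-once idea described above, here is a minimal sketch in plain Java. It does not use the actual `RetryUtils` API or the code in this PR: the class `RetryOnceSketch`, the helper `retryOnce`, and the placeholder return value are all hypothetical names chosen for the example. It only shows the general shape of "run the check, and if it throws the expected exception, run it exactly one more time".

```java
import java.util.concurrent.Callable;

// Hypothetical illustration only; not the helper used in the PR.
final class RetryOnceSketch {

  /**
   * Runs the action. If it throws an exception of the given type,
   * runs it exactly once more and lets any second failure propagate.
   * Exceptions of other types are rethrown immediately.
   */
  static <T> T retryOnce(Class<? extends Exception> retryOn, Callable<T> action)
      throws Exception {
    try {
      return action.call();
    } catch (Exception e) {
      if (!retryOn.isInstance(e)) {
        throw e; // not the failure we want to tolerate
      }
      return action.call(); // single retry
    }
  }

  public static void main(String[] args) throws Exception {
    int[] attempts = {0};
    // Simulated check: fails on the first attempt, succeeds on the second.
    String leader = retryOnce(RuntimeException.class, () -> {
      if (attempts[0]++ == 0) {
        throw new RuntimeException("Could not get leader props for collection1 shard1");
      }
      return "leader-url-placeholder"; // hypothetical value
    });
    System.out.println(leader + " after " + attempts[0] + " attempts");
  }
}
```

Limiting the retry to a single extra attempt and to one exception type keeps the test strict: a persistent failure, or a failure of a different kind, still fails the test.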
