-1 to simply remove that test. It showed we found and fixed a critical performance problem that could easily re-appear.
I imagine a straightforward solution is to set the system property "solr.solrj.cache.timeout.sec" to something like 60 in the test.

Ideally, that new live node cache refresh mechanism will be improved to a different design that doesn't ping Solr all day long. Staleness of collection state isn't handled that way.

~ David

On Mon, Oct 6, 2025 at 6:39 PM James Dyer <[email protected]> wrote:

> I am looking at the recent failures with
> CloudHttp2SolrClientTest#testHttpCspPerf. This test was added with
> SOLR-14985, which fixed a performance regression because CSC wasn't
> caching its state. This test counts the number of logged requests for
> CLUSTERSTATUS. However, with the recent commit on 9/22 for SOLR-17921
> ("BaseHttpClusterStateProvider should prefetch refreshes of
> liveNodes"), we now have a background task periodically calling
> CLUSTERSTATUS for us. Although the test failure does not reproduce for
> me without modifications, I can easily make it happen by adding a
> Thread.sleep to the middle of the test class.
>
> Perhaps SOLR-17921 does not make the worries about performance
> regression here entirely go away, but it for sure makes it hard to
> verify we are not calling CLUSTERSTATUS too many times! My inclination
> here is to entirely remove the test method "testHttpCspPerf" as it
> seems less relevant and less verifiable. Unless someone here has a
> good idea of how we can still verify this without the test being
> flaky?
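
A rough sketch of the system-property idea from the top of this reply, in plain Java. Only the property name "solr.solrj.cache.timeout.sec" and the value 60 come from this thread; the class name, the helper withCacheTimeout, and the assumption that the client reads the property when it is built inside the callback are hypothetical and would need checking against the actual SolrJ code.

```java
public class CacheTimeoutPropertySketch {

    // Property name as quoted in this thread.
    static final String CACHE_TIMEOUT_PROP = "solr.solrj.cache.timeout.sec";

    /**
     * Hypothetical helper: sets the cache timeout property for the duration
     * of the given body, then restores whatever value was there before so
     * other tests in the same JVM are not affected.
     */
    static void withCacheTimeout(int seconds, Runnable body) {
        String previous = System.getProperty(CACHE_TIMEOUT_PROP);
        System.setProperty(CACHE_TIMEOUT_PROP, Integer.toString(seconds));
        try {
            body.run();
        } finally {
            if (previous == null) {
                System.clearProperty(CACHE_TIMEOUT_PROP);
            } else {
                System.setProperty(CACHE_TIMEOUT_PROP, previous);
            }
        }
    }

    public static void main(String[] args) {
        withCacheTimeout(60, () -> {
            // Assumption: the client under test would be created and
            // exercised here, after the property has been set.
            System.out.println(CACHE_TIMEOUT_PROP + "="
                    + System.getProperty(CACHE_TIMEOUT_PROP));
        });
    }
}
```

In the real test, the body of the callback would hold the existing request-counting logic; restoring the previous value in the finally block keeps the longer timeout from leaking into unrelated tests.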
