-1 to simply remove that test.  It showed we found and fixed a critical
performance problem that could easily re-appear.

I imagine a straightforward solution is to set the system property
"solr.solrj.cache.timeout.sec" to something like 60 in the test.
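
A minimal sketch of that idea in plain Java, not the actual test code. The
property name is the one above; the helper class and method names are
hypothetical, and I am assuming the client reads the property when it is
constructed inside the wrapped body.

```java
import java.util.function.Supplier;

public class CacheTimeoutPropertySketch {
    // Property name taken from this thread; how SolrJ consumes it is assumed.
    static final String PROP = "solr.solrj.cache.timeout.sec";

    // Hypothetical helper: run `body` with the property set, then restore
    // whatever value (or absence of a value) was there before.
    static <T> T withCacheTimeout(String seconds, Supplier<T> body) {
        String previous = System.getProperty(PROP);
        System.setProperty(PROP, seconds);
        try {
            return body.get();
        } finally {
            if (previous == null) {
                System.clearProperty(PROP);
            } else {
                System.setProperty(PROP, previous);
            }
        }
    }

    public static void main(String[] args) {
        // In the real test, the body would build the client and run the
        // CLUSTERSTATUS-counting assertions instead of reading the property.
        String seen = withCacheTimeout("60", () -> System.getProperty(PROP));
        System.out.println("timeout seen inside body: " + seen);
    }
}
```

Restoring the previous value in a finally block keeps the setting from
leaking into other tests that run in the same JVM.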

Ideally that new live node cache refresh mechanism will be improved to a
different design that doesn't ping Solr all day long. Staleness of
collection state isn't handled that way.

~ David

On Mon, Oct 6, 2025 at 6:39 PM James Dyer <[email protected]> wrote:

> I am looking at the recent failures with
> CloudHttp2SolrClientTest#testHttpCspPerf.  This test was added with
> SOLR-14985, which fixed a performance regression because CSC wasn't
> caching its state.  This test counts the number of logged requests for
> CLUSTERSTATUS.  However, with the recent commit on 9/22 for SOLR-17921
> ("BaseHttpClusterStateProvider should prefetch refreshes of
> liveNodes"), we now have a background task periodically calling
> CLUSTERSTATUS for us.  Although the test failure does not reproduce for
> me without modifications, I can easily make it happen by adding a
> Thread.sleep to the middle of the test class.
>
> Perhaps SOLR-17921 does not make the worries about performance
> regression here entirely go away, but it for sure makes it hard to
> verify we are not calling CLUSTERSTATUS too many times!  My inclination
> here is to entirely remove the test method "testHttpCspPerf" as it
> seems less relevant and less verifiable.  Unless someone here has a
> good idea of how we can still verify this without the test being
> flaky?
