My immediate thought was HTTP/2, but I see you are running with HTTP/1 (although, interestingly, some changes made for the sake of HTTP/2 may have contributed, since there is shared code).

I am somewhat skeptical of the third finding. The application of the idle timeout as the default request timeout isn't *that* old. I remember researching this because of an issue with the index fetcher (which, incidentally, should *not* have this behavior):

https://issues.apache.org/jira/browse/SOLR-17711

The thought of a bunch of requests trickling little bits of data for arbitrarily long, just enough to reset the idle timeout, seems unlikely at first blush.

From: [email protected]
At: 03/18/26 07:39:49 UTC-4:00
To: [email protected]
Subject: Deadlock observed for distributed search in Solr 9.10.1

We recently upgraded some Solr clusters from version 9.7 to 9.10.1. The collections have multiple shards and run distributed requests continuously. After a few days, distributed requests would start timing out and all clients would fail, requiring a full Solr cluster restart to recover. There was no sign of overload. Downgrading back to Solr 9.7 fixed the issue. This has been observed in several different environments.

Has anyone else seen similar behavior in their own clusters?

As there are no errors in the Solr logs, no exceptions, no high load, and no scary Grafana graphs for GC or otherwise, we have spent several days investigating and trying to reproduce the problem, with limited luck. The best I have is an LLM analysis of the issue and a theory of what might cause it. I think the analysis is interesting; the suspect is leaking semaphores in Http2SolrClient.AsyncTracker, which would eventually cause a full stop.

The analysis is here: https://cwiki.apache.org/confluence/x/AZM8G
It contains a description, an executive summary, technical details, and some questions for committers. You may comment inline in Confluence if you have an account, or here in this thread.

I have not yet filed a bug in JIRA, as I want to discuss it here first and still hope to reproduce the issue in a pristine environment.

Jan
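To illustrate the suspected failure mode in general terms, here is a minimal sketch of how a permit leak can slowly turn into a full stop. It is not Solr code: `PermitLeakSketch`, `Tracker`, `begin`, `complete`, and `run` are hypothetical names, and the only assumption is that a tracker hands out one semaphore permit per in-flight request and returns it in a completion callback. If that callback is occasionally skipped, each miss removes one permit for good, with no error logged.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class PermitLeakSketch {

    /** Hypothetical tracker: one permit per in-flight request. */
    static final class Tracker {
        private final Semaphore permits;

        Tracker(int maxInFlight) {
            this.permits = new Semaphore(maxInFlight);
        }

        /** Returns false if no permit became free within the wait. */
        boolean begin(long waitMillis) throws InterruptedException {
            return permits.tryAcquire(waitMillis, TimeUnit.MILLISECONDS);
        }

        /** Must run exactly once per successful begin(), or the permit is lost. */
        void complete() {
            permits.release();
        }
    }

    /**
     * Issues requests one after another. Every leakEvery-th request never
     * gets its completion callback (the assumed bug). Returns how many
     * requests could not obtain a permit.
     */
    static int run(int maxInFlight, int requests, int leakEvery)
            throws InterruptedException {
        Tracker tracker = new Tracker(maxInFlight);
        int rejected = 0;
        for (int i = 1; i <= requests; i++) {
            if (!tracker.begin(1)) {
                rejected++;          // what a client would see as a timeout
                continue;
            }
            boolean callbackFires = (i % leakEvery != 0);
            if (callbackFires) {
                tracker.complete();  // normal path: permit returned
            }                        // leaked path: permit never returned
        }
        return rejected;
    }

    public static void main(String[] args) throws InterruptedException {
        // 4 permits, 1 in 10 requests leaks: permits run out after request 40.
        System.out.println("rejected = " + run(4, 100, 10)); // rejected = 60
    }
}
```

With 4 permits and a leak on every 10th request, the first 40 requests succeed and every later one is rejected. In a real cluster the leak rate would be far lower, which would fit the "fine for a few days, then everything times out" pattern, but whether AsyncTracker actually behaves this way is exactly what still needs to be confirmed.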
