These problems are really hard to track down. I would start by eliminating 
possible causes, because that will improve the search performance overall.

* Use a GC that has limits on pauses.
* Check for major GCs correlated with the slow queries.
* Check for CPU spikes correlated with the slow queries (this is harder to fix).
* Switch to the latest AMD64 instances if you haven’t already.
* Make sure you are doing batch updates, not committing after each document.
* Check whether the slow queries come right after a commit, when the caches 
are empty.
* Autowarm with a small set (20? 100?) of the most common query words.
* Have enough RAM so that indexes fit in OS file buffers.
* Put indexes on SSD volumes that are not used by anything else.
* Check whether the slow queries are using expensive operations like grouping 
or faceting.
* Make sure you have the right schema field types to support grouping or 
faceting.
* Check whether the slow queries are deep paging. If they are looking for page 
200, it is probably a spammer and you can ignore them. I’ve usually cut off 
paging at page 50; a request for any later page gets page 50.
* Use the 95th percentile metric to report performance; it is less sensitive 
to outliers than averages.
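
To illustrate the last point, here is a minimal sketch comparing the mean with the 95th percentile. The sample data is made up (99 queries at 50 ms plus one 120-second outlier, loosely modeled on the numbers in the original question), and `percentile` is a hypothetical helper using the nearest-rank method, not anything from Solr.

```python
import math
import statistics


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


# Hypothetical sample: 99 fast queries and a single 120-second outlier.
qtimes_ms = [50] * 99 + [120_000]

mean_ms = statistics.mean(qtimes_ms)
p95_ms = percentile(qtimes_ms, 95)

print(f"mean = {mean_ms:.1f} ms, p95 = {p95_ms} ms")
# prints: mean = 1249.5 ms, p95 = 50 ms
```

A single outlier moves the mean by more than a second while the 95th percentile stays at 50 ms. To see the rare slow queries themselves, track the maximum or the 99.9th percentile as a separate number.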

I’m sure there are other things; that is just off the top of my head.
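
The paging cutoff from the list above can be sketched in a few lines in whatever front end builds the Solr request. The function name `clamp_paging` and the default of 10 rows per page are assumptions for illustration; `start` and `rows` are the standard Solr paging parameters.

```python
MAX_PAGE = 50  # cutoff mentioned above; tune for your own traffic


def clamp_paging(page, rows=10, max_page=MAX_PAGE):
    """Return (start, rows) for a 1-based page number.

    Requests past max_page are served max_page instead, so nobody
    can force Solr to collect and sort a very deep result window.
    """
    page = max(1, min(page, max_page))
    return (page - 1) * rows, rows


start, rows = clamp_paging(200)
print(start, rows)
# prints: 490 10
```

A request for page 200 is answered with the same `start` as page 50, so the deepest window Solr ever has to build is 500 documents with these defaults.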

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Aug 11, 2025, at 12:49 AM, Schmidt, Mihael <mschm...@bauformat.de> wrote:
> 
> Hi,
> 
> I have had the same experience, and it does not only happen in a cluster. We 
> had two standalone installations and it happened there too. We then switched 
> to a 3-node cluster and can still see some spikes in the response time, but 
> they are low enough to be negligible for us.
> 
> Best regards
> 
> Mihael
> 
> 
> Mihael Schmidt
> Software Engineering
>  
> Bauformat Küchen GmbH & Co. KG
> Kattwinkel 1 | 32584 Löhne | Deutschland
>  
> Phone: +495732 102-379
> Fax: +495732 102-300
> Mail: mschm...@bauformat.de
> Web: www.bauformat.de
>  
> From: Antonio Nunziante <nunzia...@light-sf.com>
> Sent: Friday, August 8, 2025 10:27
> To: users@solr.apache.org
> Subject: Occasional very slow queries (120s) despite low average latency
> 
> Hello,
>  
> we are experiencing an occasional performance issue with our Solr cluster.
> 
> Normally, the average search time is around 50 ms, but from time to time we 
> see a query taking up to 120 seconds to complete.
> This behavior is rare but concerning, especially because it sometimes happens 
> during periods of very low traffic.
>  
> Technical environment:
> Apache Solr 8.11.1 deployed in a 3-node cluster
> Red Hat, Inc. OpenJDK 64-Bit Server VM 11.0.25 (build 11.0.25+9-LTS)
>  
> We would like to understand:
> What could cause such a sudden slowdown in otherwise fast searches?
> Which Solr metrics, logs, or debugging tools should we check to pinpoint the 
> root cause?
> Are there recommended practices for capturing detailed query execution traces 
> when these rare slowdowns occur?
>  
> Any guidance or suggestions on how to monitor and diagnose this type of 
> sporadic latency would be greatly appreciated.
>  
> Thank you,
> Antonio
