We operate a generic search tool whose primary use case involves collapsing and expanding documents based on user-provided keywords and filter queries. Since requests use Solr cursors, iterating through large result sets is a common pattern. We do not control how the user-provided keywords and filter queries narrow the dataset, which means many requests end up applying collapse to millions of documents. This was acceptable in Solr 8, but has become a serious issue after migrating to Solr 9, where the same queries are 2–3x slower.
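To make the request pattern concrete, here is a minimal sketch of how one page of such a request might be assembled. It uses only the JDK, not SolrJ. The collapse field `group_key`, the filter `tenant_id:42`, the page size, and the assumption that `document_id` is the uniqueKey are all illustrative and not taken from our actual schema. The collapse sort mirrors the date+string variant discussed in this thread.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

class CollapseCursorRequest {

    // Builds the request parameters as ordered (name, value) pairs.
    // userQuery and userFilter stand in for the user-provided keywords and filter query.
    static List<String[]> params(String userQuery, String userFilter, String cursorMark) {
        List<String[]> p = new ArrayList<>();
        p.add(new String[] {"q", userQuery});
        p.add(new String[] {"fq", userFilter});
        // Collapse with a date sort and a string tiebreaker (field names are hypothetical).
        p.add(new String[] {"fq",
            "{!collapse field=group_key sort='modified_date desc, document_id asc'}"});
        p.add(new String[] {"expand", "true"});
        // Cursor paging needs a sort that includes the uniqueKey field
        // (assumed here to be document_id).
        p.add(new String[] {"sort", "modified_date desc, document_id asc"});
        p.add(new String[] {"rows", "100"});
        p.add(new String[] {"cursorMark", cursorMark});
        return p;
    }

    // URL-encodes the pairs into a query string suitable for a /select request.
    static String toQueryString(List<String[]> params) {
        return params.stream()
            .map(kv -> URLEncoder.encode(kv[0], StandardCharsets.UTF_8) + "="
                     + URLEncoder.encode(kv[1], StandardCharsets.UTF_8))
            .collect(Collectors.joining("&"));
    }

    public static void main(String[] args) {
        // "*" starts a new cursor; later pages pass the nextCursorMark from the response.
        System.out.println(toQueryString(params("report", "tenant_id:42", "*")));
    }
}
```

Each subsequent page repeats the same parameters with `cursorMark` set to the `nextCursorMark` value returned by the previous response, so the collapse work is repeated for every page of a large result set.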
We have considered two potential workarounds. First, splitting the index into two: one for raw documents and one storing pre-collapsed results. Second, replacing the string sort field (document id) in collapse with a numeric field (a hash of the string document id). Both options introduce breaking changes: the index split requires a significant redesign, and changing the collapse sort field type would invalidate all existing cursors stored by clients of the generic tool and require reindexing. Neither is a viable short-term fix.

*The goal of this message is to understand what options exist in the current situation without committing to a large engineering effort.*

We would also like to raise a broader concern: introducing mandatory LZ4 compression for SortedDocValues term dictionaries in Lucene 9 reduced index size on disk, but introduced a significant performance regression for workloads that rely heavily on string sort fields in collapse queries. Whether this trade-off was intentional, and whether there is a plan to make compression configurable, would be valuable to know.

Kind regards,
Bartosz

On Wed, Jun 24, 2026 at 10:32 PM Rob Audenaerde <[email protected]> wrote:

> I don't know a direct answer to your questions, but some context on why
> you are running a collapse query on 7M documents could help provide
> insight. What are you trying to achieve? Are the results to be paged in a
> UI? Is it an analytics workload?
>
> On Wed, Jun 24, 2026, 21:46 Bartosz Fidrysiak <[email protected]>
> wrote:
>
>> We identified a 2–3x performance regression in Solr 9.10.1 compared to
>> Solr 8.11.2 for collapse queries that use a string field as a collapse
>> sort field.
>>
>>
>> Test setup
>> ----------
>>
>> To measure the regression under real production conditions, we configured
>> both clusters to receive identical traffic simultaneously: every Solr
>> request is sent to both instances at the same time, making the comparison
>> direct and unbiased. Both clusters have the same number of nodes,
>> documents, shards, and shard ranges. The data is sharded by tenant ID, so
>> each request is served by a single shard with no cross-shard overhead. The
>> Solr schema is the same for both clusters.
>>
>> We tested six query variants covering different combinations of collapse
>> sort fields: no collapse, collapse with date sort, date+long sort,
>> date+string sort, and string-only sort (see attachments). The results show
>> that queries with a string field in the collapse sort are consistently and
>> significantly slower in Solr 9, while queries using only numeric or date
>> sort fields show no regression. Notably, the string field used in the
>> collapse sort has very high cardinality, and the worst-case queries process
>> millions of documents.
>>
>>
>> [image: image.png]
>> [image: image.png]
>> [image: image.png]
>>
>> Root cause
>> ----------
>>
>> JFR profiling of the worst-case query (sort="modified_date desc,
>> document_id asc", ~7M documents) confirmed the root cause.
>> [image: image.png]
>>
>> Lucene 9 changed the internal format for SortedDocValues
>> (Lucene90DocValuesProducer). The term dictionary (TermsDict) now stores
>> string values in LZ4-compressed blocks. In Lucene 8, the same data was held
>> uncompressed in direct memory, so reads were instant. In Lucene 9, every
>> time the collapse logic needs to materialize a string value for comparison
>> or to record a new group winner, it must decompress an LZ4 block. For ~7M
>> documents, this decompression is triggered on nearly every document via the
>> following call chain:
>>
>>   SortFieldsCompare
>>     -> TermOrdValLeafComparator.copy()
>>       -> lookupOrd()
>>         -> TermsDict.decompressBlock()
>>           -> LZ4.decompress()
>>
>> LZ4 decompression accounts for almost 40% of CPU time in the
>> query-serving thread in Solr 9, versus near zero in Solr 8.
>>
>> Similar concerns were raised in
>> https://github.com/apache/lucene/issues/11485
>>
>> Questions
>> ---------
>>
>> Q1: What are your recommendations for improving the performance of
>> collapse queries that use a string field as a sort tiebreaker in Solr 9?
>>
>> Q2: Is it possible to disable LZ4 compression for SortedDocValues term
>> dictionaries, either via a configuration property or a docValuesFormat
>> option, or is this something that could be planned for a future release?
>>
>> Q3: Would it be feasible to lazily materialize string field values in
>> CollapsingQParserPlugin for group winners, so that lookupOrd() is only
>> called when a cross-segment comparison is actually needed? This could
>> improve performance for queries where most groups contain only one document.
>>
>> Kind regards,
>> Bartosz
