We operate a generic search tool whose primary use case involves collapsing
and expanding documents based on user-provided keywords and filter queries.
Since requests use Solr cursors, iterating through large result sets is a
common pattern. We do not control how the user-provided keywords and filter
queries narrow the dataset, which means many requests end up applying
collapse to millions of documents. This was acceptable in Solr 8, but has
become a serious issue after migrating to Solr 9, where the same queries
are 2–3x slower.

We have considered two potential workarounds. The first is to split the
index into two: one for raw documents and one storing pre-collapsed
results. The second is to replace the string sort field (document id) in
collapse with a numeric field (a hash of the string document id). Both
options introduce breaking changes: the index split requires a significant
redesign, and changing the collapse sort field type would invalidate all
existing cursors stored by clients of the generic tool and require
reindexing. Neither is a viable short-term fix.
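
To make the second workaround concrete, here is a minimal sketch of deriving
a numeric sort key from a string document id. The function name
doc_id_to_long and the choice of BLAKE2b are illustrative assumptions, not
something we run today. The sketch also shows why existing cursors would
break: ordering by hash generally differs from ordering by the original
string, and a 64-bit hash can in principle collide.

```python
import hashlib
import struct


def doc_id_to_long(doc_id: str) -> int:
    """Map a string document id to a signed 64-bit integer (hypothetical helper)."""
    # 8-byte digest so the value fits a signed 64-bit (long) field.
    digest = hashlib.blake2b(doc_id.encode("utf-8"), digest_size=8).digest()
    # ">q" unpacks the 8 bytes as a big-endian signed 64-bit integer.
    return struct.unpack(">q", digest)[0]


ids = ["doc-b", "doc-a", "doc-c"]
by_string = sorted(ids)                      # current tiebreak order
by_hash = sorted(ids, key=doc_id_to_long)    # order after the change
print(by_string)
print(by_hash)  # same ids, but generally in a different order
```

The hash is deterministic, so it could be computed at index time, but every
document would have to be reindexed to populate the new field.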

*The goal of this message is to understand what options exist in the
current situation without committing to a large engineering effort.*

We would also like to raise a broader concern: introducing mandatory LZ4
compression for SortedDocValues term dictionaries in Lucene 9 reduced index
size on disk, but introduced a significant performance regression for
workloads that rely heavily on string sort fields in collapse queries.
Whether this trade-off was intentional, and whether there is a plan to make
compression configurable, would be valuable to know.
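
As an illustration of the cost pattern behind this concern, the toy model
below mimics a block-compressed term dictionary and counts decompressions.
It is not Lucene code: zlib stands in for LZ4, the block size of 64 and all
class and function names are assumptions, and any block caching is ignored.
It relies only on the fact that, within a single segment, ordinals of a
sorted doc-values field follow term order, so ordinals can be compared
without materializing strings.

```python
import zlib

BLOCK_SIZE = 64  # assumed number of terms per compressed block


class BlockCompressedTermsDict:
    """Toy term dictionary: sorted terms stored in compressed blocks."""

    def __init__(self, sorted_terms):
        self.blocks = []
        for i in range(0, len(sorted_terms), BLOCK_SIZE):
            chunk = "\n".join(sorted_terms[i:i + BLOCK_SIZE]).encode("utf-8")
            self.blocks.append(zlib.compress(chunk))
        self.decompressions = 0

    def lookup_ord(self, ord_):
        """Materialize the term for an ordinal; decompresses one block."""
        self.decompressions += 1
        block = zlib.decompress(self.blocks[ord_ // BLOCK_SIZE])
        return block.decode("utf-8").split("\n")[ord_ % BLOCK_SIZE]


def pick_winner_eager(terms_dict, ords):
    """Materialize every candidate's string, then compare strings."""
    best = None
    for o in ords:
        value = terms_dict.lookup_ord(o)
        if best is None or value < best:
            best = value
    return best


def pick_winner_lazy(terms_dict, ords):
    """Compare ordinals within the segment; materialize only the winner."""
    return terms_dict.lookup_ord(min(ords))


# Zero-padded ids so that lexicographic order matches ordinal order.
terms = [f"id-{i:06d}" for i in range(1000)]
candidates = [900, 17, 512, 301]

eager_dict = BlockCompressedTermsDict(terms)
lazy_dict = BlockCompressedTermsDict(terms)
eager_winner = pick_winner_eager(eager_dict, candidates)
lazy_winner = pick_winner_lazy(lazy_dict, candidates)

print(eager_winner, eager_dict.decompressions)  # id-000017 4
print(lazy_winner, lazy_dict.decompressions)    # id-000017 1
```

Both strategies pick the same winner, but the eager one pays one
decompression per candidate document, which is the pattern that dominates
when candidates are spread across many blocks of a high-cardinality field.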

Kind regards,
Bartosz

On Wed, Jun 24, 2026 at 10:32 PM Rob Audenaerde <[email protected]>
wrote:

> I don't know a direct answer to your questions, but some context on why
> you are running a collapse query on 7M documents could help provide
> insight. What are you trying to achieve? Are the results to be paged in a
> UI? Is it an analytics workload?
>
>
>
> On Wed, Jun 24, 2026, 21:46 Bartosz Fidrysiak <[email protected]>
> wrote:
>
>> We identified a 2–3x performance regression in Solr 9.10.1 compared to
>> Solr 8.11.2 for collapse queries that use a string field as a collapse
>> sort field.
>>
>>
>> Test setup
>> ----------
>>
>> To measure the regression under real production conditions, we configured
>> both clusters to receive identical traffic simultaneously — every Solr
>> request is sent to both instances at the same time, making the comparison
>> direct and unbiased. Both clusters have the same number of nodes,
>> documents, shards, and shard ranges. The data is sharded by tenant ID, so
>> each request is served by a single shard with no cross-shard overhead. Solr
>> schema is the same for both clusters.
>>
>> We tested six query variants covering different combinations of collapse
>> sort fields: no collapse, collapse with date sort, date+long sort,
>> date+string sort, and string-only sort (see attachments). The results show
>> that queries with a string field in the collapse sort are consistently and
>> significantly slower in Solr 9, while queries using only numeric or date
>> sort fields show no regression. Notably, the string field used in the
>> collapse sort has very high cardinality, and the worst-case queries process
>> millions of documents.
>>
>>
>> [image: image.png]
>> [image: image.png]
>> [image: image.png]
>>
>> Root cause
>> ----------
>>
>> JFR profiling of the worst-case query (sort="modified_date desc,
>> document_id asc", ~7M documents) confirmed the root cause.
>> [image: image.png]
>>
>> Lucene 9 changed the internal format for SortedDocValues
>> (Lucene90DocValuesProducer). The term dictionary (TermsDict) now stores
>> string values in LZ4-compressed blocks. In Lucene 8, the same data was held
>> uncompressed in direct memory — reads were instant. In Lucene 9, every time
>> the collapse logic needs to materialize a string value for comparison or to
>> record a new group winner, it must decompress an LZ4 block. For ~7M
>> documents, this decompression is triggered on nearly every document via the
>> following call chain:
>>
>>   SortFieldsCompare
>>     -> TermOrdValLeafComparator.copy()
>>     -> lookupOrd()
>>     -> TermsDict.decompressBlock()
>>     -> LZ4.decompress()
>>
>> LZ4 decompression accounts for almost 40% of CPU time in the
>> query-serving thread in Solr 9, versus near zero in Solr 8.
>>
>> Similar concerns were raised in
>> https://github.com/apache/lucene/issues/11485
>>
>> Questions
>> ---------
>>
>> Q1: What are your recommendations for improving the performance of
>> collapse queries that use a string field as a sort tiebreaker in Solr 9?
>>
>> Q2: Is it possible to disable LZ4 compression for SortedDocValues term
>> dictionaries — either via a configuration property or a docValuesFormat
>> option — or is this something that could be planned for a future release?
>>
>> Q3: Would it be feasible to lazily materialize string field values in
>> CollapsingQParserPlugin for group winners, so that lookupOrd() is only
>> called when a cross-segment comparison is actually needed? This could
>> improve performance for queries where most groups contain only one document.
>>
>> Kind regards,
>> Bartosz
>
>
