Representative filtering of very large result sets

Jeremy Buckley - IQ-C Wed, 23 Mar 2022 09:30:00 -0700

We are using the collapse query parser for consolidating results based on a
field value, and are also faceting on a number of other fields.  The
collapse field and the facet fields all have docValues=true.  For very large
result sets (millions of documents), heap usage gets out of hand, and the
resulting GC is problematic.  I am trying to figure out how
to reduce the number of documents that are being faceted over, and still
display facets that are "representative" of the entire result set.
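
A request of the shape described above can be sketched as follows. The
field names (group_id, category, author) and the query text are
placeholders, not taken from the actual schema; the sketch only assumes
the standard {!collapse} filter query and the usual facet parameters.

```python
from urllib.parse import urlencode


def build_collapse_facet_params(query, collapse_field, facet_fields, rows=10):
    """Build Solr request parameters for a collapsed, faceted search.

    All field names passed in are hypothetical placeholders.
    """
    params = [
        ("q", query),
        # Collapse the result set to one document per collapse_field value.
        ("fq", "{!collapse field=%s}" % collapse_field),
        ("rows", str(rows)),
        ("facet", "true"),
    ]
    # One facet.field parameter per faceted field.
    params.extend(("facet.field", name) for name in facet_fields)
    return params


# Placeholder query and field names, for illustration only.
params = build_collapse_facet_params(
    "text:example", "group_id", ["category", "author"]
)
query_string = urlencode(params)
print(query_string)
```

With this setup the facet counts are computed over every document that
survives the collapse, which is where the cost comes from when the result
set runs to millions of documents.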


Some sort of filter query seems to be the obvious answer, but what? I don't
want to accidentally exclude my most relevant results.

How can I facet over only the top N results?

Thanks for any tips.

-- 
Jeremy Buckley
