We are using the collapse query parser to consolidate results based on a field value, and we are also faceting on a number of other fields. The collapse field and the facet fields all have docValues=true. For very large result sets (millions of documents), heap usage gets out of hand and the resulting GC activity is problematic. I am trying to figure out how to reduce the number of documents being faceted over while still displaying facets that are "representative" of the entire result set.
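For context, here is a rough sketch of the kind of request involved, written in Python only to show how the parameters fit together. The field names `group_id`, `brand`, and `category`, the query text, and the helper name `build_collapse_facet_params` are placeholders for illustration, not the actual schema or client code.

```python
from urllib.parse import urlencode


def build_collapse_facet_params(query, collapse_field, facet_fields, rows=10):
    """Build Solr request parameters that collapse on one field and facet on others.

    Returns a list of (name, value) pairs so that repeated parameters
    such as facet.field are preserved.
    """
    params = [
        ("q", query),
        # Collapse results to one document per value of collapse_field.
        ("fq", "{!collapse field=%s}" % collapse_field),
        ("rows", str(rows)),
        ("facet", "true"),
    ]
    # One facet.field parameter per faceted field.
    params.extend(("facet.field", field) for field in facet_fields)
    return params


# Hypothetical field names, for illustration only.
params = build_collapse_facet_params("laptop", "group_id", ["brand", "category"])
query_string = urlencode(params)
```

Facet counts in a request like this are computed over every document left after collapsing, not just the `rows` returned, which is where the cost comes from on very large result sets.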
Some sort of filter query seems to be the obvious answer, but which one? I don't want to accidentally exclude my most relevant results. How can I facet over only the top N results?

Thanks for any tips.

-- Jeremy Buckley
