Hi Thomas,

Your question suggests that you are creating a huge BooleanQuery to identify these documents. A TermInSetQuery should perform better.
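
As a rough sketch of what that could look like (not a definitive implementation): the class name, method names and the id field are hypothetical, and it assumes the identifiers are indexed as single untokenized terms, for example with a StringField. The TermInSetQuery is built once and kept around, then attached to each runtime query as a non-scoring FILTER clause.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermInSetQuery;
import org.apache.lucene.util.BytesRef;

/** Hypothetical helper, for illustration only. */
public final class SubsetSearch {

  private SubsetSearch() {}

  /**
   * Builds a query matching every document whose idField term is in ids.
   * Assumption: idField holds one untokenized term per document.
   */
  static Query subsetFilter(String idField, Collection<String> ids) {
    List<BytesRef> terms = new ArrayList<>(ids.size());
    for (String id : ids) {
      terms.add(new BytesRef(id));
    }
    return new TermInSetQuery(idField, terms);
  }

  /**
   * Restricts userQuery to the subset. The FILTER clause must match
   * but does not contribute to the score.
   */
  static Query withinSubset(Query userQuery, Query subsetFilter) {
    return new BooleanQuery.Builder()
        .add(userQuery, BooleanClause.Occur.MUST)
        .add(subsetFilter, BooleanClause.Occur.FILTER)
        .build();
  }
}
```

You would call subsetFilter once for your ~100k identifiers, hold on to the result, and pass it to withinSubset for each search.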

Doing better would require a better understanding of what you are trying to achieve. For instance, if you end up with such a large list of terms because you're trying to evaluate a join, you may want to look at Lucene's support for query-time joins:
https://lucene.apache.org/core/10_1_0/join/org/apache/lucene/search/join/package-summary.html#query-time-joins-heading

On Tue, Aug 5, 2025, 05:48, Thomas Barr <[email protected]> wrote:

> I have a medium-sized (~10m) Lucene index and I frequently want to
> repeatedly search within a subset of around ~100k documents. I can increase
> MaxClauseCount and build up a huge TermQuery, keep that around, then build
> a BooleanQuery out of the result at runtime, but the resulting query is
> quite slow. The now deprecated Filter would have been a good option with a
> BitSet, but that’s deprecated.
>
> Any thoughts on the best way to do this?
>
> Thanks!
> -twb

Adrien
