Optimizing unordered queries

Nigel Thu, 25 Jun 2009 19:12:24 -0700

I recently posted some questions about performance problems with large
indexes.  One key thing about our situation is that we don't need sorted
results (either by relevance or any other key).  I've been looking into our
memory usage and tracing through some code, which in combination with the
recent posts about setTermInfosIndexDivisor got me thinking about the best
way to do a query where the order of results doesn't matter.


Currently we're (perhaps naively) doing the equivalent of
query.weight(searcher).scorer(reader).score(collector).  Obviously there's a
certain amount of unnecessary calculation that results from this if you
don't care about sorting.  Are there any general recommendations for
unordered searching?  (We already omit norms.)

(More details: Of particular interest are things that access the TermInfos,
since that's the major source of RAM usage: if a smaller number of TermInfos
were needed then we could perhaps use an aggressive index divisor setting to
save RAM without a performance penalty.  For example, I was thinking about a
custom Similarity implementation that skipped the idf calculations, since
those have to hit the TermInfos.)

Thanks,
Chris

Optimizing unordered queries

Reply via email to