On Thu, Jun 25, 2009 at 10:11 PM, Nigel<nigelspl...@gmail.com> wrote:
> Currently we're (perhaps naively) doing the equivalent of > query.weight(searcher).scorer(reader).score(collector). Obviously there's a > certain amount of unnecessary calculation that results from this if you > don't care about sorting. Are there any general recommendations for > unordered searching? (We already omit norms.) As of 2.9, scoring is optional with the new Collector. Ie, if your Collector doesn't call scorer.score() then the score won't be computed. Also, 2.9 will enable out-of-docID-order scoring if your Collector.acceptsDocsOutOfOrder returns true, but currently it's disjunction only (plus up to 32 prohibited terms) that will see a speedup from this. What kind of queries are you running? I assume your Collector simply aggregates the list of docIDs hit? Ie no sorting, priority queue, etc., in it. > (More details: Of particular interest are things that access the TermInfos, > since that's the major source of RAM usage: if a smaller number of TermInfos > were needed then we could perhaps use an aggressive index divisor setting to > save RAM without a performance penalty. For example, I was thinking about a > custom Similarity implementation that skipped the idf calculations, since > those have to hit the TermInfos.) Unfortunately the TermInfos must still be hit to look up the freq/proxOffset in the postings files. Mike --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org