Hmm, the tradeoff is an added per-hit check (doc != NO_MORE_DOCS), vs the one-time cost at the end of calling advance(NO_MORE_DOCS) for each sub-clause? I think in general this isn't a good tradeoff?

I.e., what about the case where we AND high-freq, and similarly freq'd, terms together? Then the per-hit check will at some point dominate?

It's valid to pass NO_MORE_DOCS to DocsEnum.advance.

Mike McCandless

http://blog.mikemccandless.com

On Thu, Mar 1, 2012 at 7:22 AM, mark harwood <[email protected]> wrote:
> I got round to some benchmarking of this change on Wikipedia content which
> shows a small improvement: http://goo.gl/60wJG
>
> Aside from the small performance gain to be had, it just feels more logical
> if ConjunctionScorer does not issue sub scorers with a request to advance to
> "NO_MORE_DOCS".
>
> ----- Original Message -----
> From: mark harwood <[email protected]>
> To: "[email protected]" <[email protected]>
> Cc:
> Sent: Thursday, 1 March 2012, 9:39
> Subject: ConjunctionScorer.doNext() overstays?
>
> Due to the odd behaviour of a custom Scorer of mine I discovered
> ConjunctionScorer.doNext() could loop indefinitely.
> It does not bail out as soon as any scorer.advance() call it makes reports
> back "NO_MORE_DOCS". Is there not a performance optimisation to be gained in
> exiting as soon as this happens?
> At this stage I cannot see any point in continuing to advance other scorers -
> a quick look at TermScorer suggests that any questionable calls made by
> ConjunctionScorer to advance to NO_MORE_DOCS receive no special treatment
> and disk will be hit as a consequence.
> I added an extra condition to the while loop on the 3.5 source:
>
> while ((doc != NO_MORE_DOCS) && ((firstScorer = scorers[first]).docID() < doc)) {
>
> and JUnit tests passed. I haven't been able to benchmark performance
> improvements but it looks like it would be sensible to make the change anyway.
>
> Cheers,
> Mark
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
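
To make the tradeoff discussed in this thread concrete, here is a rough, self-contained sketch. It is not the Lucene source: ArrayDocs is a hypothetical stand-in for a sub-scorer backed by a sorted int array, and doNext() is only modeled on the while loop quoted above. The earlyExit flag switches between the original loop and the variant with the extra (doc != NO_MORE_DOCS) check, and the advance() call counter is an assumption added purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class ConjunctionSketch {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Hypothetical stand-in for a sub-scorer over a sorted array of doc IDs. */
    static final class ArrayDocs {
        private final int[] docs;
        private int idx = -1;      // -1 means "not positioned yet"
        int advanceCalls = 0;      // illustration only: counts advance() calls

        ArrayDocs(int... docs) { this.docs = docs; }

        int docID() {
            if (idx < 0) return -1;
            return idx < docs.length ? docs[idx] : NO_MORE_DOCS;
        }

        /** Moves past the current doc to the first doc >= target. */
        int advance(int target) {
            advanceCalls++;
            if (idx < docs.length) idx++;
            while (idx < docs.length && docs[idx] < target) idx++;
            return docID();
        }
    }

    /** Round-robin leapfrog; returns the next common doc or NO_MORE_DOCS. */
    static int doNext(ArrayDocs[] scorers, boolean earlyExit) {
        int first = 0;
        int doc = scorers[scorers.length - 1].docID();
        while (true) {
            // The extra check proposed in the thread; it runs on every iteration.
            if (earlyExit && doc == NO_MORE_DOCS) break;
            ArrayDocs s = scorers[first];
            if (s.docID() >= doc) break;
            // Without earlyExit this may be advance(NO_MORE_DOCS) at the very end.
            doc = s.advance(doc);
            first = (first == scorers.length - 1) ? 0 : first + 1;
        }
        return doc;
    }

    /** Collects every doc present in all scorers. */
    static List<Integer> intersect(ArrayDocs[] scorers, boolean earlyExit) {
        List<Integer> hits = new ArrayList<>();
        ArrayDocs last = scorers[scorers.length - 1];
        last.advance(0); // position the last scorer; the others start at -1
        int doc;
        while ((doc = doNext(scorers, earlyExit)) != NO_MORE_DOCS) {
            hits.add(doc);
            last.advance(doc + 1);
        }
        return hits;
    }

    static int totalAdvances(ArrayDocs[] scorers) {
        int total = 0;
        for (ArrayDocs s : scorers) total += s.advanceCalls;
        return total;
    }

    static ArrayDocs[] sample() {
        return new ArrayDocs[] {
            new ArrayDocs(1, 3, 5, 7, 9),
            new ArrayDocs(3, 9, 11),
            new ArrayDocs(3, 4, 9)
        };
    }

    public static void main(String[] args) {
        for (boolean earlyExit : new boolean[] {false, true}) {
            ArrayDocs[] scorers = sample();
            List<Integer> hits = intersect(scorers, earlyExit);
            System.out.println("earlyExit=" + earlyExit + " hits=" + hits
                    + " advanceCalls=" + totalAdvances(scorers));
        }
    }
}
```

Both variants return the same hits. The early-exit variant skips the final advance(NO_MORE_DOCS) calls on the remaining sub-scorers, but it pays for the extra comparison on every loop iteration, which is the cost Mike is weighing against that one-time saving. The sketch only counts calls; it says nothing about real timings or disk access.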
