On Friday 10 December 2004 21:35, Doug Cutting wrote:
> Christoph Goller wrote:
> > I think we should change BooleanScorer. An easy way would be to sort the
> > bucket list before it is used. Do you think that would affect performance
> > dramatically?
>
> I think it would make it slower.
>
> > Otherwise we should reimplement BooleanScorer. I haven't looked into the
> > DisjunctionScorer patch in Bugzilla yet. Maybe it's a good starting point.
>
> I think we should incorporate Paul's code into CVS. This algorithm may
> be slower in some cases, but it may also be faster in some cases. We
> should add a static method to switch back to the old implementation, and
> encourage folks to benchmark their code. If it proves no slower then we
> could remove the old implementation altogether.
>

There may be an alternative to this: adding skipTo() to the current
BooleanScorer. Before I wrote the alternative boolean scorer, I briefly
investigated this possibility, but I did not see how skipTo() could be
added easily. Nonetheless, it might be possible.

Here is some background on the alternative boolean scorer. More
information is in the Bugzilla posting and in the javadocs:
http://issues.apache.org/bugzilla/show_bug.cgi?id=31785

The core of the DisjunctionScorer is based on a simplification of
SpanOrQuery. In particular, the class DisjunctionScorer.ScorerQueue is a
simplified version of SpanOrQuery.SpanQueue: it only needs to use
document numbers, not term positions.

The existing ConjunctionScorer needed to be slightly extended to
implement NrMatchersScorer, which is a Scorer that also provides the
number of matching subscorers. The number of matchers is needed to pass
the coordination factor back up to the level of the BooleanQuery through
any nested scorers. If the code of the alternative boolean scorer is
added to CVS, merging the nrMatchers() method into the current Scorer
could be considered.

To complete the alternative boolean scorer, I added scorers for
combining with prohibited scorers and for combining with optional
scorers. These combining scorers were already available from an
extension of the Surround query language that I posted in April this
year.

Mapping the required, optional and prohibited scorers of a BooleanQuery
to a nesting of these combining scorers, DisjunctionScorer and
ConjunctionScorer was straightforward, but a bit tedious. It is done by
the make...SumScorer methods.

Regards,
Paul Elschot
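
For readers who have not looked at the patch, the idea of a disjunction
driven by a queue ordered on document numbers, which also reports the
number of matching subscorers, can be sketched roughly as follows. This
is only an illustration in present-day Java and not the code from the
Bugzilla entry: the names DisjunctionSketch, DocCursor, Match and
disjunction are hypothetical, each subscorer is assumed to be a plain
array of strictly increasing document numbers, and scoring and skipTo()
are left out entirely.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical sketch; not the Bugzilla patch and not Lucene's API. */
public class DisjunctionSketch {

    /** Cursor over one subscorer's strictly increasing document numbers (assumed input shape). */
    public static final class DocCursor {
        private final int[] docs;
        private int pos = 0;

        public DocCursor(int[] sortedDocs) { this.docs = sortedDocs; }

        public boolean exhausted() { return pos >= docs.length; }
        public int doc() { return docs[pos]; }
        public void next() { pos++; }
    }

    /** One match: a document number and how many subscorers matched it. */
    public static final class Match {
        public final int doc;
        public final int nrMatchers;

        public Match(int doc, int nrMatchers) {
            this.doc = doc;
            this.nrMatchers = nrMatchers;
        }
    }

    /** Merges the subscorers in document order, counting matchers per document. */
    public static List<Match> disjunction(int[][] subscorerDocs) {
        // The queue is ordered on the current document number only;
        // no term positions are involved.
        PriorityQueue<DocCursor> queue =
            new PriorityQueue<>((a, b) -> Integer.compare(a.doc(), b.doc()));
        for (int[] docs : subscorerDocs) {
            DocCursor cursor = new DocCursor(docs);
            if (!cursor.exhausted()) {
                queue.add(cursor);
            }
        }

        List<Match> matches = new ArrayList<>();
        while (!queue.isEmpty()) {
            int current = queue.peek().doc();
            int nrMatchers = 0;
            // Pop every cursor positioned on the current document,
            // advance it, and re-insert it if it has more documents.
            while (!queue.isEmpty() && queue.peek().doc() == current) {
                DocCursor cursor = queue.poll();
                nrMatchers++;
                cursor.next();
                if (!cursor.exhausted()) {
                    queue.add(cursor);
                }
            }
            matches.add(new Match(current, nrMatchers));
        }
        return matches;
    }
}
```

Calling disjunction() with one array per subscorer returns the matching
documents in increasing order, each with its matcher count. That count
is the kind of value an enclosing scorer would need in order to compute
a coordination factor.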