Which scorer to use for disjunctions?

Paul Elschot Wed, 25 May 2005 12:25:23 -0700

Dear readers,

At the moment it's not clear to me which code
is best for scoring disjunctions:


There is a specialised priority queue for DisjunctionScorer:
http://issues.apache.org/bugzilla/show_bug.cgi?id=34193
This also contains:
- a btree implementation of BooleanScorer by Karl Wright
  that is probably good for a small number of subscorers.
- performance measurement code in TestDisjunctionPerf1

There is also BooleanScorer1:
http://issues.apache.org/bugzilla/show_bug.cgi?id=33019

I extended TestDisjunctionPerf1 to also exercise the btree
scorer, and the measurements are inconclusive: performance
of one scorer depends on the presence of others, which probably
means that the JIT is working irregularly, even with -server and
-Xbatch as JVM options.
Also the relative order of the various scorers depends on the
number of subscorers.

TestDisjunctionScorer1 uses a set of test scorers like this:
  /** A scorer that matches all docs having a document number
   * that is a positive multiple of a given interval, up to a maximum.
   */
The interval is normally chosen as a prime number and the test
starts from an array of these numbers, adding a test scorer 
for each interval in the array.
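
To make the test setup above concrete, here is a minimal sketch of such an
interval-based test scorer, plus a naive disjunction over an array of
intervals using java.util.PriorityQueue. It is only an illustration: the
names IntervalDisjunctionSketch, IntervalIterator and disjunction are
hypothetical, the code does not use the Lucene Scorer API, it is not the code
from the Bugzilla attachments, and it assumes a Java version with lambdas.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

public class IntervalDisjunctionSketch {

    /** Hypothetical stand-in for a test scorer: matches interval, 2*interval, ... up to maxDoc. */
    public static final class IntervalIterator {
        private final int interval;
        private final int maxDoc;
        private int doc = 0; // 0 means "not started"; matched doc numbers are positive

        public IntervalIterator(int interval, int maxDoc) {
            if (interval <= 0) {
                throw new IllegalArgumentException("interval must be positive");
            }
            this.interval = interval;
            this.maxDoc = maxDoc;
        }

        /** Advances to the next matching doc; returns false when there is none. */
        public boolean next() {
            if (doc > maxDoc - interval) {
                return false; // the next multiple would exceed maxDoc
            }
            doc += interval;
            return true;
        }

        public int doc() {
            return doc;
        }
    }

    /** Naive disjunction: every doc matched by at least one interval, in increasing order. */
    public static List<Integer> disjunction(int[] intervals, int maxDoc) {
        // Queue ordered by current doc number, smallest first.
        PriorityQueue<IntervalIterator> queue =
            new PriorityQueue<>((a, b) -> Integer.compare(a.doc(), b.doc()));
        for (int interval : intervals) {
            IntervalIterator it = new IntervalIterator(interval, maxDoc);
            if (it.next()) {
                queue.add(it);
            }
        }
        List<Integer> result = new ArrayList<>();
        while (!queue.isEmpty()) {
            IntervalIterator top = queue.poll();
            int doc = top.doc();
            // Several iterators can sit on the same doc; report it once.
            if (result.isEmpty() || result.get(result.size() - 1) != doc) {
                result.add(doc);
            }
            if (top.next()) {
                queue.add(top);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Intervals 2, 3 and 5 with maximum doc number 10.
        System.out.println(disjunction(new int[] {2, 3, 5}, 10));
        // prints [2, 3, 4, 5, 6, 8, 9, 10]
    }
}
```

A general-purpose queue like this is the simplest baseline; the specialised
priority queue and the btree variant mentioned above are alternatives to this
kind of per-document poll/add cycle.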

Could someone indicate a few typical cases to use for selecting
the best disjunction scorer?


Regards,
Paul Elschot


P.S.
I also tried getting this to work under gcj, but I'm having problems
with class loading from shared libraries. I got gcj/gij to work for
another project, so I'm trying to find the difference in the build files
that causes this. Is there perhaps someone else who has gcj/gij
working on the Lucene test cases?

