Adrien Grand created LUCENE-8759:
------------------------------------
Summary: BlockMaxConjunctionScorer's simplified way of computing
max scores hurts performance
Key: LUCENE-8759
URL: https://issues.apache.org/jira/browse/LUCENE-8759
Project: Lucene - Core
Issue Type: Improvement
Reporter: Adrien Grand
BlockMaxConjunctionScorer computes the minimum value that the score should have
after each scorer in order to be able to interrupt scorer as soon as possible.
For instance say scorers A, B and C produce maximum scores that are equal to 4,
2 and 1. If the minimum competitive score is X, then the score after scoring A,
B and C must be at least X, the score after scoring A and B must be at least
X-1 and the score after scoring A must be at least X-1-2.
However this is made a bit more complex than that due to floating-point numbers
and the fact that intermediate score values are doubles which only get casted
to a float after all values have been summed up. In order to keep things
simple, BlockMaxConjunctionScore has the following comment and code
{code}
// Also compute the minimum required scores for a hit to be competitive
// A double that is less than 'score' might still be converted to
'score'
// when casted to a float, so we go to the previous float to avoid this
issue
minScores[minScores.length - 1] = minScore > 0 ?
Math.nextDown(minScore) : 0;
{code}
It simplifies the problem by calling Math.nextDown(minScore). However this is
problematic because it defeats the fact that TopScoreDocCollector calls
setMinCompetitiveScore on the float value that is immediately greater than the
k-th greatest hit so far.
nextDown(minScore) is not the value that we need. The value that we need is the
smallest double that converts to minScore when casted to a float, which would
be half-way between nextDown(minScore) and minScore. In some cases this would
help get better performance out of conjunctions, especially if some clauses
produce constant scores.
MaxScoreSumPropagator#setMinCompetitiveScore has the same issue.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]