Jonathan Hoag created LUCENE-5343:
-------------------------------------

             Summary: Potential floating point precision error in 
ConjunctionScorer.score()
                 Key: LUCENE-5343
                 URL: https://issues.apache.org/jira/browse/LUCENE-5343
             Project: Lucene - Core
          Issue Type: Bug
          Components: core/query/scoring
    Affects Versions: 4.5, 3.6.1, 3.6.3, 4.5.1
            Reporter: Jonathan Hoag


I have been investigating an issue with document scoring and found that
ConjunctionScorer implements score() in a way that is sensitive to floating
point rounding.  In some of my test cases, documents that have not been
merged/optimized (I'm not sure of the correct terminology; they have a docNum
of 0) have their scorers added in a different order than optimized documents.
Because the sum of scores is accumulated in a float, and float addition is not
associative, the result depends on the order in which the scorers are summed.
As a result, the score returned by ConjunctionScorer can differ between merged
and unmerged documents that should have identical scores.

Example: 

float sum1 = 0.0061859353f + 0.0061859353f + 0.0030929677f + 0.0030929677f + 
0.0030929677f + 0.5010608f + 0.0061859353f; 

float sum2 =  0.0061859353f + 0.0061859353f + 0.0061859353f + 0.0030929677f + 
0.0030929677f + 0.0030929677f + 0.5010608f; 

// sum1 == 0.5288975f   (incorrect: one ulp above the correctly rounded sum)
// sum2 == 0.52889746f  (correct: the exact sum rounded to float)

I also noticed a comment in the 4.5.1 version of Lucene to the effect of:

// TODO: sum into a double and cast to float if we ever send required
// clauses to BS1

Is there a reason that this has not been implemented yet?  For example:

        public float score() throws IOException { 
                double sum = 0.0d; 
                for (int i = 0; i < scorers.length; i++) { 
                        sum += scorers[i].score(); 
                } 
                return (float)sum; 
        } 
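
A minimal, self-contained sketch for checking the proposed change against the
example values above.  The class name ScoreSumSketch and the helpers
sumAsFloat and sumAsDouble are hypothetical and not part of Lucene; they only
mimic the float accumulation described in this report and the double
accumulation proposed in the snippet above.

```java
public class ScoreSumSketch {

    // The two scorer orderings from the example in this report.
    static final float[] ORDER_1 = {
        0.0061859353f, 0.0061859353f, 0.0030929677f, 0.0030929677f,
        0.0030929677f, 0.5010608f, 0.0061859353f
    };
    static final float[] ORDER_2 = {
        0.0061859353f, 0.0061859353f, 0.0061859353f, 0.0030929677f,
        0.0030929677f, 0.0030929677f, 0.5010608f
    };

    // Accumulate in a float: every addition rounds to float precision,
    // so the final value can depend on the order of the operands.
    static float sumAsFloat(float[] scores) {
        float sum = 0.0f;
        for (float s : scores) {
            sum += s;
        }
        return sum;
    }

    // Accumulate in a double and round to float once, at the end.
    static float sumAsDouble(float[] scores) {
        double sum = 0.0d;
        for (float s : scores) {
            sum += s;
        }
        return (float) sum;
    }

    public static void main(String[] args) {
        System.out.println("float,  order 1: " + sumAsFloat(ORDER_1));
        System.out.println("float,  order 2: " + sumAsFloat(ORDER_2));
        System.out.println("double, order 1: " + sumAsDouble(ORDER_1));
        System.out.println("double, order 2: " + sumAsDouble(ORDER_2));
    }
}
```

For these seven values every partial sum is exactly representable in a double,
so the double-based result is the same for both orderings and matches sum2.
In general a double accumulator reduces order dependence but does not remove
it entirely for arbitrary inputs.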





