The disadvantage of B is that we lose test coverage for the case where someone boosts a document by zero (and we do nothing to prevent anyone from doing that).

Again, this sim is well-behaved here: its explain is always EXACTLY what the score returns. You can even add this assert to IBSimilarity:

    assert expl.getValue() == score(stats, freq, docLen);

The floating point inaccuracy comes from some query's explain(); in this case it's -6.3329935E-8 versus -4.221996E-8.

Previously this was never an issue because we always used a 'fixed' epsilon, but I changed this to be 'relative' in https://issues.apache.org/jira/browse/LUCENE-3478.

Another option would be to keep the relative epsilon (important for DefaultSimilarity if queryNorm is disabled), but always floor it to some tiny threshold.

On Tue, Oct 4, 2011 at 10:23 AM, Chris Male <[email protected]> wrote:
> +1 for B.
> Some of these explanation test queries are insane.
>
> On Wed, Oct 5, 2011 at 3:19 AM, Robert Muir <[email protected]> wrote:
>>
>> This is not a sim issue, its a problem with the explain() impl in some
>> query (disjunction max or bq).
>>
>> The test is asking for trouble by boosting a document with a boost of
>> '0' which makes it look like an infinitely long document, returning a
>> tiny tiny score (in my opinion this similarity is completely correct
>> here).
>>
>> This means that we either:
>> A. fix queries to try to perform floating point operations in their
>> explains in the same order as their scorers
>> or
>> B. don't boost a document by 0 in the explanations tests.
>>
>> On Tue, Oct 4, 2011 at 6:15 AM, Apache Jenkins Server
>> <[email protected]> wrote:
>> > Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/10733/
>> >
>> > 2 tests failed.
>> > FAILED: org.apache.lucene.search.TestSimpleExplanations.testDMQ7
>> >
>> > Error Message:
>> > ((-field:yy field:w3) | field:w2)~0.5: score(doc=0)=-6.3329935E-8 !=
>> > explanationScore=-4.221996E-8 Explanation: -4.221996E-8 = (MATCH) max
>> > plus 0.5 times others of: -4.221996E-8 = (MATCH) sum of: -4.221996E-8 =
>> > (MATCH) weight(field:w3 in 0) [IBSimilarity], result of: -4.221996E-8 =
>> > score(IBSimilarity, doc=0, freq=1.0), computed from: 1.0 = termFreq=1
>> > 0.0 = NormalizationH2, computed from: 1.0 = tf 1.0 = avgFieldLength
>> > Infinity = len 0.29411766 = LambdaDF, computed from: 4.0 = docFreq
>> > 16.0 = numberOfDocuments -4.221996E-8 = DistributionSPL -4.221996E-8 =
>> > (MATCH) weight(field:w2 in 0) [IBSimilarity], result of: -4.221996E-8 =
>> > score(IBSimilarity, doc=0, freq=1.0), computed from: 1.0 = termFreq=1
>> > 0.0 = NormalizationH2, computed from: 1.0 = tf 1.0 = avgFieldLength
>> > Infinity = len 0.29411766 = LambdaDF, computed from: 4.0 = docFreq
>> > 16.0 = numberOfDocuments -4.221996E-8 = DistributionSPL
>> > expected:<-6.3329935E-8> but was:<-4.221996E-8>
>> >
>> > Stack Trace:
>> > junit.framework.AssertionFailedError: ((-field:yy field:w3) | field:w2)~0.5: score(doc=0)=-6.3329935E-8 != explanationScore=-4.221996E-8
>> > Explanation: -4.221996E-8 = (MATCH) max plus 0.5 times others of:
>> >   -4.221996E-8 = (MATCH) sum of:
>> >     -4.221996E-8 = (MATCH) weight(field:w3 in 0) [IBSimilarity], result of:
>> >       -4.221996E-8 = score(IBSimilarity, doc=0, freq=1.0), computed from:
>> >         1.0 = termFreq=1
>> >         0.0 = NormalizationH2, computed from:
>> >           1.0 = tf
>> >           1.0 = avgFieldLength
>> >           Infinity = len
>> >         0.29411766 = LambdaDF, computed from:
>> >           4.0 = docFreq
>> >           16.0 = numberOfDocuments
>> >         -4.221996E-8 = DistributionSPL
>> >   -4.221996E-8 = (MATCH) weight(field:w2 in 0) [IBSimilarity], result of:
>> >     -4.221996E-8 = score(IBSimilarity, doc=0, freq=1.0), computed from:
>> >       1.0 = termFreq=1
>> >       0.0 = NormalizationH2, computed from:
>> >         1.0 = tf
>> >         1.0 = avgFieldLength
>> >         Infinity = len
>> >       0.29411766 = LambdaDF, computed from:
>> >         4.0 = docFreq
>> >         16.0 = numberOfDocuments
>> >       -4.221996E-8 = DistributionSPL
>> > expected:<-6.3329935E-8> but was:<-4.221996E-8>
>> >     at org.apache.lucene.search.CheckHits.verifyExplanation(CheckHits.java:328)
>> >     at org.apache.lucene.search.CheckHits$ExplanationAsserter.collect(CheckHits.java:498)
>> >     at org.apache.lucene.search.Scorer.score(Scorer.java:60)
>> >     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:577)
>> >     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:362)
>> >     at org.apache.lucene.search.CheckHits.checkExplanations(CheckHits.java:302)
>> >     at org.apache.lucene.search.QueryUtils.checkExplanations(QueryUtils.java:92)
>> >     at org.apache.lucene.search.QueryUtils.check(QueryUtils.java:126)
>> >     at org.apache.lucene.search.QueryUtils.check(QueryUtils.java:119)
>> >     at org.apache.lucene.search.QueryUtils.check(QueryUtils.java:106)
>> >     at org.apache.lucene.search.CheckHits.checkHitCollector(CheckHits.java:89)
>> >     at org.apache.lucene.search.TestExplanations.qtest(TestExplanations.java:99)
>> >     at org.apache.lucene.search.TestSimpleExplanations.testDMQ7(TestSimpleExplanations.java:207)
>> >     at org.apache.lucene.util.LuceneTestCase$2$1.evaluate(LuceneTestCase.java:608)
>> >     at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:148)
>> >     at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:50)
>> >
>> >
>> > FAILED: org.apache.lucene.search.TestSimpleExplanations.testDMQ6
>> >
>> > Error Message:
>> > ((-field:yy field:w3) | field:xx)~0.5: score(doc=0)=-4.221996E-8 !=
>> > explanationScore=-2.110998E-8 Explanation: -2.110998E-8 = (MATCH) max
>> > plus 0.5 times others of: -4.221996E-8 = (MATCH) sum of: -4.221996E-8 =
>> > (MATCH) weight(field:w3 in 0) [IBSimilarity], result of: -4.221996E-8 =
>> > score(IBSimilarity, doc=0, freq=1.0), computed from: 1.0 = termFreq=1
>> > 0.0 = NormalizationH2, computed from: 1.0 = tf 1.0 = avgFieldLength
>> > Infinity = len 0.29411766 = LambdaDF, computed from: 4.0 = docFreq
>> > 16.0 = numberOfDocuments -4.221996E-8 = DistributionSPL
>> > expected:<-4.221996E-8> but was:<-2.110998E-8>
>> >
>> > Stack Trace:
>> > junit.framework.AssertionFailedError: ((-field:yy field:w3) | field:xx)~0.5: score(doc=0)=-4.221996E-8 != explanationScore=-2.110998E-8
>> > Explanation: -2.110998E-8 = (MATCH) max plus 0.5 times others of:
>> >   -4.221996E-8 = (MATCH) sum of:
>> >     -4.221996E-8 = (MATCH) weight(field:w3 in 0) [IBSimilarity], result of:
>> >       -4.221996E-8 = score(IBSimilarity, doc=0, freq=1.0), computed from:
>> >         1.0 = termFreq=1
>> >         0.0 = NormalizationH2, computed from:
>> >           1.0 = tf
>> >           1.0 = avgFieldLength
>> >           Infinity = len
>> >         0.29411766 = LambdaDF, computed from:
>> >           4.0 = docFreq
>> >           16.0 = numberOfDocuments
>> >         -4.221996E-8 = DistributionSPL
>> > expected:<-4.221996E-8> but was:<-2.110998E-8>
>> >     at org.apache.lucene.search.CheckHits.verifyExplanation(CheckHits.java:328)
>> >     at org.apache.lucene.search.CheckHits$ExplanationAsserter.collect(CheckHits.java:498)
>> >     at org.apache.lucene.search.Scorer.score(Scorer.java:60)
>> >     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:577)
>> >     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:362)
>> >     at org.apache.lucene.search.CheckHits.checkExplanations(CheckHits.java:302)
>> >     at org.apache.lucene.search.QueryUtils.checkExplanations(QueryUtils.java:92)
>> >     at org.apache.lucene.search.QueryUtils.check(QueryUtils.java:126)
>> >     at org.apache.lucene.search.QueryUtils.check(QueryUtils.java:119)
>> >     at org.apache.lucene.search.QueryUtils.check(QueryUtils.java:106)
>> >     at org.apache.lucene.search.CheckHits.checkHitCollector(CheckHits.java:89)
>> >     at org.apache.lucene.search.TestExplanations.qtest(TestExplanations.java:99)
>> >     at org.apache.lucene.search.TestSimpleExplanations.testDMQ6(TestSimpleExplanations.java:196)
>> >     at org.apache.lucene.util.LuceneTestCase$2$1.evaluate(LuceneTestCase.java:608)
>> >     at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:148)
>> >     at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:50)
>> >
>> >
>> > Build Log (for compile errors):
>> > [...truncated 1391 lines...]
>> >
>> > ---------------------------------------------------------------------
>> > To unsubscribe, e-mail: [email protected]
>> > For additional commands, e-mail: [email protected]
>>
>> --
>> lucidimagination.com
>
> --
> Chris Male | Software Developer | JTeam BV.| www.jteam.nl

--
lucidimagination.com
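
To make the "relative epsilon, floored to a tiny threshold" option from the top of this mail concrete, here is a rough sketch. It is not the actual CheckHits code: the class name, method names, and both constants are made up for illustration, and the real values would need tuning against the explanation tests.

```java
// Hypothetical helper, not part of Lucene: compares a score with its
// explanation value using a relative tolerance that never drops below
// a small absolute floor.
final class ExplainTolerance {
    // Assumed values, chosen only for this sketch.
    static final float RELATIVE_DELTA = 1e-5f; // tolerance as a fraction of the larger magnitude
    static final float MIN_DELTA = 1e-6f;      // absolute floor for scores very close to zero

    // Relative epsilon, floored to MIN_DELTA.
    static float delta(float score, float explained) {
        float relative = Math.max(Math.abs(score), Math.abs(explained)) * RELATIVE_DELTA;
        return Math.max(relative, MIN_DELTA);
    }

    static boolean closeEnough(float score, float explained) {
        return Math.abs(score - explained) <= delta(score, explained);
    }

    public static void main(String[] args) {
        // The two values from testDMQ7: both are within the floor of each other.
        System.out.println(closeEnough(-6.3329935E-8f, -4.221996E-8f));
        // A real mismatch on normally sized scores is still rejected.
        System.out.println(closeEnough(1.0f, 1.1f));
    }
}
```

With a floor like this, the purely relative check still applies to normally sized scores (the case that matters for DefaultSimilarity with queryNorm disabled), while near-zero scores such as the ones produced by a zero boost would no longer fail on differences far below anything a user could observe.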
