On Wed, Oct 5, 2011 at 3:33 AM, Robert Muir <[email protected]> wrote:
> the disadvantage is it sucks to lose test coverage in case someone
> boosts a document by zero (we do nothing to prevent someone from doing
> such a thing).
>
> again this sim is well-behaved here, its explain is always EXACTLY
> what the score returns, you can even add this assert to IBSimilarity:
>
> assert expl.getValue() == score(stats, freq, docLen);
>
> But the floating point inaccuracy comes from some query's explain(),
> in this case its -6.3329935E-8 versus -4.221996E-8.
>
> Previously this was never an issue, because we used a 'fixed' epsilon
> always, but i changed this to be 'relative' in
> https://issues.apache.org/jira/browse/LUCENE-3478.
>
> Another option would be to keep the relative epsilon (important for
> DefaultSimilarity if queryNorm is disabled), but always floor it to
> some tiny threshold.
>

This last option sounds much better I think. I still don't especially like
that we have to deal with boosting by 0, but it'd be a huge pain to prevent.

>
> On Tue, Oct 4, 2011 at 10:23 AM, Chris Male <[email protected]> wrote:
> > +1 for B.
> > Some of these explanation test queries are insane.
> >
> > On Wed, Oct 5, 2011 at 3:19 AM, Robert Muir <[email protected]> wrote:
> >>
> >> This is not a sim issue, its a problem with the explain() impl in some
> >> query (disjunction max or bq).
> >>
> >> The test is asking for trouble by boosting a document with a boost of
> >> '0' which makes it look like an infinitely long document, returning a
> >> tiny tiny score (in my opinion this similarity is completely correct
> >> here).
> >>
> >> This means that we either:
> >> A. fix queries to try to perform floating point operations in their
> >> explains in the same order as their scorers
> >> or
> >> B. don't boost a document by 0 in the explanations tests.
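
For reference, the "relative epsilon floored to some tiny threshold" option discussed above could look roughly like the sketch below. The class name, method names and both constants are made up for illustration (they are not taken from CheckHits or LUCENE-3478); the only values from this thread are the two scores from the testDMQ7 failure.

```java
// Minimal sketch of a relative tolerance with an absolute floor.
// All names and constants here are hypothetical, chosen for illustration only.
final class ExplainTolerance {
    // Hypothetical relative factor: allowed difference as a fraction of the larger score.
    static final float RELATIVE_DELTA = 1e-5f;
    // Hypothetical absolute floor: the tolerance never shrinks below this.
    static final float MINIMUM_DELTA = 1e-7f;

    // Purely relative tolerance: shrinks towards zero together with the scores.
    static float relativeOnly(float score, float explained) {
        return Math.max(Math.abs(score), Math.abs(explained)) * RELATIVE_DELTA;
    }

    // Relative tolerance, floored so that tiny scores still get a usable epsilon.
    static float floored(float score, float explained) {
        return Math.max(relativeOnly(score, explained), MINIMUM_DELTA);
    }

    static boolean closeEnough(float score, float explained, float epsilon) {
        return Math.abs(score - explained) <= epsilon;
    }

    public static void main(String[] args) {
        float score = -6.3329935E-8f;     // score(doc=0) from the testDMQ7 failure
        float explained = -4.221996E-8f;  // explanationScore from the same failure

        // Relative-only: the epsilon is far smaller than the difference, so the check fails.
        System.out.println(closeEnough(score, explained, relativeOnly(score, explained)));
        // Floored: the difference is below the floor, so the check passes.
        System.out.println(closeEnough(score, explained, floored(score, explained)));
    }
}
```

With these made-up constants the floor only matters for scores around 1e-2 and smaller; for larger scores the relative part dominates, which is the behaviour that matters when queryNorm is disabled.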
> >>
> >> On Tue, Oct 4, 2011 at 6:15 AM, Apache Jenkins Server
> >> <[email protected]> wrote:
> >> > Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/10733/
> >> >
> >> > 2 tests failed.
> >> > FAILED: org.apache.lucene.search.TestSimpleExplanations.testDMQ7
> >> >
> >> > Error Message:
> >> > ((-field:yy field:w3) | field:w2)~0.5: score(doc=0)=-6.3329935E-8 !=
> >> > explanationScore=-4.221996E-8 Explanation: -4.221996E-8 = (MATCH) max plus
> >> > 0.5 times others of: -4.221996E-8 = (MATCH) sum of: -4.221996E-8 =
> >> > (MATCH) weight(field:w3 in 0) [IBSimilarity], result of: -4.221996E-8
> >> > = score(IBSimilarity, doc=0, freq=1.0), computed from: 1.0 =
> >> > termFreq=1 0.0 = NormalizationH2, computed from: 1.0 = tf
> >> > 1.0 = avgFieldLength Infinity = len 0.29411766 =
> >> > LambdaDF, computed from: 4.0 = docFreq 16.0 =
> >> > numberOfDocuments -4.221996E-8 = DistributionSPL -4.221996E-8 =
> >> > (MATCH) weight(field:w2 in 0) [IBSimilarity], result of: -4.221996E-8 =
> >> > score(IBSimilarity, doc=0, freq=1.0), computed from: 1.0 = termFreq=1
> >> > 0.0 = NormalizationH2, computed from: 1.0 = tf 1.0 =
> >> > avgFieldLength Infinity = len 0.29411766 = LambdaDF, computed
> >> > from: 4.0 = docFreq 16.0 = numberOfDocuments
> >> > -4.221996E-8 = DistributionSPL expected:<-6.3329935E-8> but
> >> > was:<-4.221996E-8>
> >> >
> >> > Stack Trace:
> >> > junit.framework.AssertionFailedError: ((-field:yy field:w3) |
> >> > field:w2)~0.5: score(doc=0)=-6.3329935E-8 != explanationScore=-4.221996E-8
> >> > Explanation: -4.221996E-8 = (MATCH) max plus 0.5 times others of:
> >> >   -4.221996E-8 = (MATCH) sum of:
> >> >     -4.221996E-8 = (MATCH) weight(field:w3 in 0) [IBSimilarity], result of:
> >> >       -4.221996E-8 = score(IBSimilarity, doc=0, freq=1.0), computed from:
> >> >         1.0 = termFreq=1
> >> >         0.0 = NormalizationH2, computed from:
> >> >           1.0 = tf
> >> >           1.0 = avgFieldLength
> >> >           Infinity = len
> >> >         0.29411766 = LambdaDF, computed from:
> >> >           4.0 = docFreq
> >> >           16.0 = numberOfDocuments
> >> >         -4.221996E-8 = DistributionSPL
> >> >   -4.221996E-8 = (MATCH) weight(field:w2 in 0) [IBSimilarity], result of:
> >> >     -4.221996E-8 = score(IBSimilarity, doc=0, freq=1.0), computed from:
> >> >       1.0 = termFreq=1
> >> >       0.0 = NormalizationH2, computed from:
> >> >         1.0 = tf
> >> >         1.0 = avgFieldLength
> >> >         Infinity = len
> >> >       0.29411766 = LambdaDF, computed from:
> >> >         4.0 = docFreq
> >> >         16.0 = numberOfDocuments
> >> >       -4.221996E-8 = DistributionSPL
> >> > expected:<-6.3329935E-8> but was:<-4.221996E-8>
> >> >     at org.apache.lucene.search.CheckHits.verifyExplanation(CheckHits.java:328)
> >> >     at org.apache.lucene.search.CheckHits$ExplanationAsserter.collect(CheckHits.java:498)
> >> >     at org.apache.lucene.search.Scorer.score(Scorer.java:60)
> >> >     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:577)
> >> >     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:362)
> >> >     at org.apache.lucene.search.CheckHits.checkExplanations(CheckHits.java:302)
> >> >     at org.apache.lucene.search.QueryUtils.checkExplanations(QueryUtils.java:92)
> >> >     at org.apache.lucene.search.QueryUtils.check(QueryUtils.java:126)
> >> >     at org.apache.lucene.search.QueryUtils.check(QueryUtils.java:119)
> >> >     at org.apache.lucene.search.QueryUtils.check(QueryUtils.java:106)
> >> >     at org.apache.lucene.search.CheckHits.checkHitCollector(CheckHits.java:89)
> >> >     at org.apache.lucene.search.TestExplanations.qtest(TestExplanations.java:99)
> >> >     at org.apache.lucene.search.TestSimpleExplanations.testDMQ7(TestSimpleExplanations.java:207)
> >> >     at org.apache.lucene.util.LuceneTestCase$2$1.evaluate(LuceneTestCase.java:608)
> >> >     at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:148)
> >> >     at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:50)
> >> >
> >> >
> >> > FAILED: org.apache.lucene.search.TestSimpleExplanations.testDMQ6
> >> >
> >> > Error Message:
> >> > ((-field:yy field:w3) | field:xx)~0.5: score(doc=0)=-4.221996E-8 !=
> >> > explanationScore=-2.110998E-8 Explanation: -2.110998E-8 = (MATCH) max plus
> >> > 0.5 times others of: -4.221996E-8 = (MATCH) sum of: -4.221996E-8 =
> >> > (MATCH) weight(field:w3 in 0) [IBSimilarity], result of: -4.221996E-8
> >> > = score(IBSimilarity, doc=0, freq=1.0), computed from: 1.0 =
> >> > termFreq=1 0.0 = NormalizationH2, computed from: 1.0 = tf
> >> > 1.0 = avgFieldLength Infinity = len 0.29411766 =
> >> > LambdaDF, computed from: 4.0 = docFreq 16.0 =
> >> > numberOfDocuments -4.221996E-8 = DistributionSPL
> >> > expected:<-4.221996E-8> but was:<-2.110998E-8>
> >> >
> >> > Stack Trace:
> >> > junit.framework.AssertionFailedError: ((-field:yy field:w3) |
> >> > field:xx)~0.5: score(doc=0)=-4.221996E-8 != explanationScore=-2.110998E-8
> >> > Explanation: -2.110998E-8 = (MATCH) max plus 0.5 times others of:
> >> >   -4.221996E-8 = (MATCH) sum of:
> >> >     -4.221996E-8 = (MATCH) weight(field:w3 in 0) [IBSimilarity], result of:
> >> >       -4.221996E-8 = score(IBSimilarity, doc=0, freq=1.0), computed from:
> >> >         1.0 = termFreq=1
> >> >         0.0 = NormalizationH2, computed from:
> >> >           1.0 = tf
> >> >           1.0 = avgFieldLength
> >> >           Infinity = len
> >> >         0.29411766 = LambdaDF, computed from:
> >> >           4.0 = docFreq
> >> >           16.0 = numberOfDocuments
> >> >         -4.221996E-8 = DistributionSPL
> >> > expected:<-4.221996E-8> but was:<-2.110998E-8>
> >> >     at org.apache.lucene.search.CheckHits.verifyExplanation(CheckHits.java:328)
> >> >     at org.apache.lucene.search.CheckHits$ExplanationAsserter.collect(CheckHits.java:498)
> >> >     at org.apache.lucene.search.Scorer.score(Scorer.java:60)
> >> >     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:577)
> >> >     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:362)
> >> >     at org.apache.lucene.search.CheckHits.checkExplanations(CheckHits.java:302)
> >> >     at org.apache.lucene.search.QueryUtils.checkExplanations(QueryUtils.java:92)
> >> >     at org.apache.lucene.search.QueryUtils.check(QueryUtils.java:126)
> >> >     at org.apache.lucene.search.QueryUtils.check(QueryUtils.java:119)
> >> >     at org.apache.lucene.search.QueryUtils.check(QueryUtils.java:106)
> >> >     at org.apache.lucene.search.CheckHits.checkHitCollector(CheckHits.java:89)
> >> >     at org.apache.lucene.search.TestExplanations.qtest(TestExplanations.java:99)
> >> >     at org.apache.lucene.search.TestSimpleExplanations.testDMQ6(TestSimpleExplanations.java:196)
> >> >     at org.apache.lucene.util.LuceneTestCase$2$1.evaluate(LuceneTestCase.java:608)
> >> >     at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:148)
> >> >     at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:50)
> >> >
> >> >
> >> > Build Log (for compile errors):
> >> > [...truncated 1391 lines...]
> >> >
> >> > ---------------------------------------------------------------------
> >> > To unsubscribe, e-mail: [email protected]
> >> > For additional commands, e-mail: [email protected]
> >> >
> >>
> >> --
> >> lucidimagination.com
> >
> > --
> > Chris Male | Software Developer | JTeam BV.| www.jteam.nl
>
> --
> lucidimagination.com

--
Chris Male | Software Developer | JTeam BV.| www.jteam.nl
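
P.S. The figures in the two Jenkins failures can be recomputed from the "max plus 0.5 times others" formula printed in the explanations. The sketch below is only arithmetic on the numbers quoted in this thread. The `combine` helper and its `initialMax` parameter are hypothetical and are not a claim about how any Lucene scorer or explain() is implemented; the zero-initialised variant is included purely because it happens to reproduce the reported explanationScore values.

```java
// Worked arithmetic for the "max plus 0.5 times others" numbers in the failures.
// Hypothetical helper for illustration; not taken from Lucene.
final class DisMaxArithmetic {
    // Combines sub-scores as: max + (sum - max) * tieBreaker.
    // initialMax is the starting value of the running maximum (an assumption
    // introduced here so that two variants can be compared).
    static float combine(float[] subScores, float tieBreaker, float initialMax) {
        float max = initialMax;
        float sum = 0.0f;
        for (float s : subScores) {
            sum += s;
            max = Math.max(max, s);
        }
        return max + (sum - max) * tieBreaker;
    }

    public static void main(String[] args) {
        float s = -4.221996E-8f;   // per-clause score reported by IBSimilarity
        float[] dmq7 = {s, s};     // testDMQ7: both clauses match doc 0
        float[] dmq6 = {s};        // testDMQ6: only field:w3 matches doc 0

        // Running max started at negative infinity: comparable to score(doc=0).
        System.out.println(combine(dmq7, 0.5f, Float.NEGATIVE_INFINITY));
        System.out.println(combine(dmq6, 0.5f, Float.NEGATIVE_INFINITY));

        // Running max started at 0: never updated when all sub-scores are negative,
        // and comparable to the reported explanationScore.
        System.out.println(combine(dmq7, 0.5f, 0.0f));
        System.out.println(combine(dmq6, 0.5f, 0.0f));
    }
}
```

If that reading is right, the gap between score and explanation in these two tests is a factor of 1.5 and 2 respectively, which is worth keeping in mind when picking a floor for the epsilon.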
