[ https://issues.apache.org/jira/browse/LUCENE-3430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Muir resolved LUCENE-3430.
---------------------------------

    Resolution: Fixed
      Assignee: Robert Muir

> TestParser.testSpanTermXML fails with some sims
> -----------------------------------------------
>
>                 Key: LUCENE-3430
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3430
>             Project: Lucene - Java
>          Issue Type: Bug
>    Affects Versions: 4.0
>            Reporter: Robert Muir
>            Assignee: Robert Muir
>             Fix For: 4.0
>
>         Attachments: LUCENE-3430.patch
>
>
> Here is why this test sometimes fails (my explanation in the test I wrote):
> {noformat}
> /** make sure all sims work with spanOR(termX, termY) where termY does not exist */
> public void testCrazySpans() throws Exception {
>   // The problem: "normal" Lucene queries create scorers, returning null if terms don't exist.
>   // This means they never score a term that does not exist.
>   // However, with spans there is only one scorer for the whole hierarchy:
>   // inner queries are not real queries, their boosts are ignored, etc.
> {noformat}
> Basically, SpanQueries aren't really queries: you just get one scorer. It
> calls extractTerms on the whole hierarchy and computes weights (e.g. IDF) on
> the whole bag of terms, even if they don't exist.
> This is fine in itself. We already have tests that the sims won't bug out in
> computeStats() here, but the sims don't expect to actually score documents
> based on terms that don't exist. That is exactly what happens in Spans,
> because it doesn't use sub-scorers.
> Lucene's sim avoids this with the (docFreq + 1)
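
To make the (docFreq + 1) point concrete, here is a minimal sketch of why a term that does not exist (docFreq = 0) is harmless for one IDF formula and not for another. The class and method names (IdfSketch, naiveIdf, smoothedIdf) are hypothetical and are not Lucene APIs. The smoothed form follows the 1 + log(numDocs / (docFreq + 1)) shape that, as far as I know, Lucene's default TF-IDF similarity uses. The naive form stands in for any similarity that divides by docFreq directly and is an assumption for illustration, not one of the sims from this issue.

```java
public class IdfSketch {

    /**
     * Hypothetical unsmoothed IDF: log(numDocs / docFreq).
     * With docFreq == 0 the division yields +Infinity, and so does the log.
     */
    public static double naiveIdf(long docFreq, long numDocs) {
        return Math.log((double) numDocs / (double) docFreq);
    }

    /**
     * IDF with the (docFreq + 1) term: 1 + log(numDocs / (docFreq + 1)).
     * The denominator is never zero, so the result stays finite
     * even for a term that appears in no document.
     */
    public static double smoothedIdf(long docFreq, long numDocs) {
        return 1.0 + Math.log((double) numDocs / (double) (docFreq + 1));
    }

    public static void main(String[] args) {
        long numDocs = 100; // illustrative collection size
        long docFreq = 0;   // termY does not exist in the index

        // The naive weight is infinite, which would poison any score built from it.
        System.out.println("naive:    " + naiveIdf(docFreq, numDocs));

        // The smoothed weight is an ordinary finite number.
        System.out.println("smoothed: " + smoothedIdf(docFreq, numDocs));
    }
}
```

Because a span scorer computes weights over the whole bag of extracted terms, a weight like the naive one above would feed into actual document scores, whereas a regular query would return a null scorer for the missing term and never score it.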