[
https://issues.apache.org/jira/browse/LUCENE-3430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Robert Muir resolved LUCENE-3430.
---------------------------------
Resolution: Fixed
Assignee: Robert Muir
> TestParser.testSpanTermXML fails with some sims
> -----------------------------------------------
>
> Key: LUCENE-3430
> URL: https://issues.apache.org/jira/browse/LUCENE-3430
> Project: Lucene - Java
> Issue Type: Bug
> Affects Versions: 4.0
> Reporter: Robert Muir
> Assignee: Robert Muir
> Fix For: 4.0
>
> Attachments: LUCENE-3430.patch
>
>
> Here is why this test sometimes fails (my explanation, from the test I wrote):
> {noformat}
> /** Make sure all sims work with spanOR(termX, termY) where termY does not exist. */
> public void testCrazySpans() throws Exception {
> // The problem: "normal" Lucene queries create scorers, returning null if terms don't exist.
> // This means they never score a term that does not exist.
> // However, with spans there is only one scorer for the whole hierarchy:
> // inner queries are not real queries, their boosts are ignored, etc.
> {noformat}
> Basically, SpanQueries aren't really queries: you get just one scorer. It
> calls extractTerms on the whole hierarchy and computes weights (e.g. IDF) over
> the whole bag of terms, even those that don't exist.
> This is fine in itself: we already have tests that sims won't bug out in
> computeStats() here. However, they don't expect to actually score documents
> based on terms that don't exist, and that is exactly what happens in Spans,
> because it doesn't use sub-scorers.
> Lucene's sim avoids this with the (docFreq + 1)
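
To make the (docFreq + 1) point concrete, here is a minimal standalone sketch of the arithmetic. It is not code from the patch. The class and method names (IdfSketch, smoothedIdf, naiveIdf) are made up for illustration. smoothedIdf follows the classic Lucene idf shape, 1 + ln(numDocs / (docFreq + 1)), while naiveIdf is a hypothetical unsmoothed variant showing what a sim without that guard could hit when it is asked to score a term with docFreq = 0.

```java
// Illustrative only: plain Java, no Lucene dependency.
class IdfSketch {

    /** Classic Lucene-style idf; the +1 keeps the denominator non-zero. */
    static double smoothedIdf(long docFreq, long numDocs) {
        return Math.log(numDocs / (double) (docFreq + 1)) + 1.0;
    }

    /** Hypothetical unsmoothed idf, used here only to show the failure mode. */
    static double naiveIdf(long docFreq, long numDocs) {
        return Math.log(numDocs / (double) docFreq);
    }

    public static void main(String[] args) {
        long numDocs = 100; // assumed index size
        long docFreq = 0;   // a term that does not exist in the index

        // Stays finite because of the +1 in the denominator.
        System.out.println("smoothed idf: " + smoothedIdf(docFreq, numDocs));

        // 100 / 0.0 is Infinity in floating point, and log(Infinity) is Infinity.
        System.out.println("naive idf:    " + naiveIdf(docFreq, numDocs));
    }
}
```

With docFreq = 0 the smoothed value stays finite, while the unsmoothed one is Infinity, which would then propagate into any score computed from it.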
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira