[ 
https://issues.apache.org/jira/browse/LUCENE-7580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15740402#comment-15740402
 ] 

Paul Elschot commented on LUCENE-7580:
--------------------------------------

This adds a nonMatchSlop attribute to SpanNearQuery,
and drops the nonMatchSlopFactor argument from SpansTreeQuery.

nonMatchSlop is the distance used to determine the slop factor for
non-matching occurrences of a SpanNearQuery.
Smaller values for this distance increase the score contribution of
non-matching occurrences via SimScorer.computeSlopFactor().
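
As a rough illustration of this effect, the sketch below assumes the
1/(distance + 1) slop factor used by Lucene's default similarities; other
Similarity implementations may compute it differently. The class and method
names are hypothetical and are not part of the patch.

```java
// Hypothetical sketch, not code from the patch.
final class SlopFactorSketch {

    // Assumption: same shape as the default 1/(distance + 1) slop factor.
    static float slopFactor(int distance) {
        return 1.0f / (distance + 1);
    }

    public static void main(String[] args) {
        // Illustrative nonMatchSlop values only; smaller values give larger factors.
        int[] nonMatchSlops = {5, 20, 100};
        for (int nonMatchSlop : nonMatchSlops) {
            System.out.println("nonMatchSlop=" + nonMatchSlop
                    + " slopFactor=" + slopFactor(nonMatchSlop));
        }
    }
}
```

Under this assumption the factor shrinks as nonMatchSlop grows, so a smaller
nonMatchSlop gives non-matching occurrences more weight.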

But smaller values for this distance, i.e. a higher score contribution of
non-matching occurrences, may lead to a scoring inconsistency between two
span near queries that differ only in the allowed slop.
For example, consider query A with a smaller allowed slop and query B with a
larger one.
Query B can have more matches, and these should increase the score of B
compared to the score of A.
So for each extra match of B, the non-matching score for query A should be
lower than the matching score for query B.
This may not be the case when the non-matching score contribution is too high.

To have consistent scoring between two such queries,
choose a non-matching slop that is larger than the largest allowed match slop,
and provide that non-matching slop to both queries.
In case this consistency is not needed, nonMatchSlop can be chosen to be
somewhat larger than the maximum allowed match slop.
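
The consistency condition can be checked with a small sketch. It is a
simplification: it compares slop factors only, assumes the 1/(distance + 1)
slop factor of Lucene's default similarities, and ignores any other weights
that go into the score. All names and slop values are hypothetical.

```java
// Hypothetical sketch, not code from the patch.
final class NonMatchSlopConsistencySketch {

    // Assumption: same shape as the default 1/(distance + 1) slop factor.
    static float slopFactor(int distance) {
        return 1.0f / (distance + 1);
    }

    // True when a non-matching occurrence is weighted lower than the weakest
    // possible match, i.e. a match at the largest allowed slop. Since the
    // factor decreases with distance, checking the largest slop is enough.
    static boolean consistent(int nonMatchSlop, int largestAllowedSlop) {
        return slopFactor(nonMatchSlop) < slopFactor(largestAllowedSlop);
    }

    public static void main(String[] args) {
        int slopA = 2; // query A: smaller allowed slop
        int slopB = 5; // query B: larger allowed slop
        int largest = Math.max(slopA, slopB);

        // nonMatchSlop between the two allowed slops: an occurrence that is
        // an extra match for B can get a higher non-matching weight in A.
        System.out.println("nonMatchSlop=3 consistent: " + consistent(3, largest));

        // nonMatchSlop above the largest allowed slop.
        System.out.println("nonMatchSlop=6 consistent: " + consistent(6, largest));
    }
}
```

With these example values, a nonMatchSlop of 3 fails the check and a
nonMatchSlop of 6 passes it.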

This nonMatchSlop is used in SpansTreeWeight to compute a minimal nested slop
factor from the maximum possible slops that can occur in a SpanQuery,
for the nested SpanNearQueries and for nested SpanOrQueries with distance.
Finally, this minimal nested slop factor is used as the weight for scoring
non-matching terms.

The default nonMatchSlop for SpanNearQuery is large: Integer.MAX_VALUE/2.
Therefore, by default, non-matching occurrences make a negligible score
contribution.
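
To get a feel for the size of the default, the sketch below evaluates the
slop factor at Integer.MAX_VALUE/2, again assuming the 1/(distance + 1)
formula of Lucene's default similarities. The class, constant and method
names are hypothetical.

```java
// Hypothetical sketch, not code from the patch.
final class DefaultNonMatchSlopSketch {

    // Default stated in the comment above: Integer.MAX_VALUE/2.
    static final int DEFAULT_NON_MATCH_SLOP = Integer.MAX_VALUE / 2;

    // Assumption: same shape as the default 1/(distance + 1) slop factor.
    // distance + 1 does not overflow for the default value used here.
    static float slopFactor(int distance) {
        return 1.0f / (distance + 1);
    }

    public static void main(String[] args) {
        System.out.println("default nonMatchSlop = " + DEFAULT_NON_MATCH_SLOP);
        System.out.println("slop factor at default = "
                + slopFactor(DEFAULT_NON_MATCH_SLOP));
        // For comparison, a match at an illustrative slop of 10.
        System.out.println("slop factor at slop 10 = " + slopFactor(10));
    }
}
```

Under this assumption the default factor is below 1e-9, many orders of
magnitude smaller than the factor of any realistic match.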


> Spans tree scoring
> ------------------
>
>                 Key: LUCENE-7580
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7580
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/search
>    Affects Versions: master (7.0)
>            Reporter: Paul Elschot
>            Priority: Minor
>             Fix For: 6.x
>
>         Attachments: LUCENE-7580.patch, LUCENE-7580.patch
>
>
> Recurse the spans tree to compose a score based on the type of subqueries and 
> what matched


