David Wendt created LUCENE-7151:
-----------------------------------
Summary: Nested spanNear scoring error when inner clauses overlap
positions
Key: LUCENE-7151
URL: https://issues.apache.org/jira/browse/LUCENE-7151
Project: Lucene - Core
Issue Type: Bug
Components: core/query/scoring
Affects Versions: 5.5, 5.3.1
Environment: Windows, Linux
Reporter: David Wendt
For spanNear([spanNear([contents:word1, contents:word3], 2, true),
spanNear([contents:word2, contents:word3], 2, true)], 2, false)
Scores for the following two documents should be the same but are not.
doc1: [----- word1 word2 ----- word2 word3 ----- word1 word2 word3 -----]
doc2: [----- word2 word3 ----- word1 word3 ----- word1 word2 word3 -----]
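To make the two documents concrete, here is a minimal sketch that tokenizes them and locates the final three-term run. It assumes each "-----" stands for exactly one filler token, which the report does not state. The class and method names (DocPositions, phraseStarts) are made up for illustration and are not part of Lucene. It only checks positions and does not reproduce Lucene's span matching or scoring.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not a Lucene class.
final class DocPositions {
    // Assumption: every "-----" in the report is a single filler token.
    static final String DOC1 =
        "----- word1 word2 ----- word2 word3 ----- word1 word2 word3 -----";
    static final String DOC2 =
        "----- word2 word3 ----- word1 word3 ----- word1 word2 word3 -----";

    // Returns the token positions at which the given terms occur consecutively.
    static List<Integer> phraseStarts(String doc, String... phrase) {
        List<String> tokens = Arrays.asList(doc.trim().split("\\s+"));
        List<String> wanted = Arrays.asList(phrase);
        List<Integer> starts = new ArrayList<Integer>();
        for (int i = 0; i + wanted.size() <= tokens.size(); i++) {
            if (tokens.subList(i, i + wanted.size()).equals(wanted)) {
                starts.add(i);
            }
        }
        return starts;
    }

    public static void main(String[] args) {
        System.out.println("doc1: " + phraseStarts(DOC1, "word1", "word2", "word3"));
        System.out.println("doc2: " + phraseStarts(DOC2, "word1", "word2", "word3"));
    }
}
```

Under that assumption both documents have the same number of tokens and contain the run "word1 word2 word3" at the same position, so they differ only in the partial matches that come before it.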
The positions of the inner clauses affect the scoring of the final
3-term phrase. This appears to be a side effect of the span-scoring rewrite in
5.2(?).
SpansCell.adjustMax() uses end-position values to decide maxEndPositionCell,
while the SpanPositionQueue uses start-position and end-position values to sort
the SpansCells. This means that maxEndPositionCell will be incorrectly set, or
not set at all, depending on previous positions.
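The mismatch between the two criteria can be sketched without Lucene. The following is a simplified model, not the NearSpansUnordered code: Cell and SpanOrderSketch are hypothetical names, and the two spans are invented values chosen so that one contains the other, as can happen when the sub-spans are themselves multi-term spanNear matches.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for one sub-span match; not a Lucene class.
final class Cell {
    final String name;
    final int start; // inclusive start position
    final int end;   // exclusive end position

    Cell(String name, int start, int end) {
        this.name = name;
        this.start = start;
        this.end = end;
    }
}

final class SpanOrderSketch {
    // Orders by start position, breaking ties on end position.
    static final Comparator<Cell> BY_START_THEN_END = new Comparator<Cell>() {
        @Override
        public int compare(Cell a, Cell b) {
            return a.start != b.start
                ? Integer.compare(a.start, b.start)
                : Integer.compare(a.end, b.end);
        }
    };

    // The cell that sorts last under the start-then-end ordering.
    static Cell lastInQueueOrder(List<Cell> cells) {
        return Collections.max(cells, BY_START_THEN_END);
    }

    // The cell with the largest end position, compared on end only.
    static Cell maxByEnd(List<Cell> cells) {
        Cell best = cells.get(0);
        for (Cell c : cells) {
            if (c.end > best.end) {
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<Cell> cells = new ArrayList<Cell>();
        cells.add(new Cell("wide", 2, 9));   // starts earlier, ends later
        cells.add(new Cell("narrow", 4, 6)); // contained in "wide"
        System.out.println("last in queue order:  " + lastInQueueOrder(cells).name);
        System.out.println("largest end position: " + maxByEnd(cells).name);
    }
}
```

In this model the two criteria pick different cells ("narrow" versus "wide"), so any bookkeeping that assumes they agree would depend on which spans were seen earlier.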
I can provide example code illustrating the score error.