Alan Woodward created LUCENE-8633:
-------------------------------------
Summary: Remove term weighting from interval scoring
Key: LUCENE-8633
URL: https://issues.apache.org/jira/browse/LUCENE-8633
Project: Lucene - Core
Issue Type: Improvement
Reporter: Alan Woodward
Assignee: Alan Woodward
Attachments: LUCENE-8633.patch
IntervalScorer currently uses the same scoring mechanism as SpanScorer, summing
the IDF of all possibly matching terms from its parent IntervalsSource and
using that in conjunction with a sloppy frequency to produce a similarity-based
score. This doesn't make sense, however: terms that don't appear in a document
can still contribute to its score, and scores from interval queries look
comparable with scores from term or phrase queries when they really aren't.
I'd like to explore a different scoring mechanism for intervals, based purely
on sloppy frequency and ignoring term weighting. This should make the scores
easier to reason about, as well as making them useful for things like proximity
boosting on boolean queries.
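
As a rough illustration of the idea (this is not the attached patch), the
sketch below computes a score from sloppy frequency alone. Everything in it is
an assumption made for illustration: the class name IntervalScoringSketch, the
1/width contribution per matching interval, and the saturation function
boost * freq / (freq + pivot) are hypothetical choices, not Lucene APIs.

```java
// Hypothetical sketch: scoring from sloppy frequency only, with no IDF or
// other term statistics. None of these names come from Lucene itself.
final class IntervalScoringSketch {

    private IntervalScoringSketch() {}

    /**
     * Sloppy frequency of the matches in one document. Each match is a
     * {start, end} pair of inclusive positions. Assumption: a match
     * contributes 1/width, so narrower intervals count for more.
     */
    static double sloppyFreq(int[][] matches) {
        double freq = 0.0;
        for (int[] m : matches) {
            int width = m[1] - m[0] + 1;
            freq += 1.0 / width;
        }
        return freq;
    }

    /**
     * Assumed saturation function: 0 when nothing matched, approaching
     * (but never reaching) boost as freq grows. pivot is the frequency
     * at which the score equals boost / 2.
     */
    static double score(double boost, double freq, double pivot) {
        if (freq <= 0.0) {
            return 0.0;
        }
        return boost * freq / (freq + pivot);
    }

    public static void main(String[] args) {
        double tight = sloppyFreq(new int[][] {{3, 4}});   // one narrow match
        double wide = sloppyFreq(new int[][] {{3, 12}});   // one wide match
        System.out.println("tight: " + score(1.0, tight, 1.0));
        System.out.println("wide:  " + score(1.0, wide, 1.0));
    }
}
```

Because the score depends only on what actually matched in the document and is
bounded by the boost, it could be added to a boolean query's score as a
proximity boost without being mistaken for a term or phrase score.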