[
https://issues.apache.org/jira/browse/LUCENE-7993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16213858#comment-16213858
]
Robert Muir commented on LUCENE-7993:
-------------------------------------
I see. I wonder if we could try a simple "degraded" form of the optimization at
first, where we use the maximum tf value rather than the minimum one. In other
words, if freq(a)=4 and freq(b)=2, we'd test score(4) for sloppyPhrase instead
of score(2) like we do for exactPhrase.
I realize this is not very good and makes the optimization significantly less
potent, but perhaps it still avoids reading a lot of positions, and is safe and
easy as a start?
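
To make the freq(a)=4, freq(b)=2 example concrete, here is a minimal Java sketch of how the two bounds could be chosen. PhraseFreqBound and upperBound are hypothetical names for illustration only; they are not part of Lucene or of the attached patch.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration; not a Lucene class. */
final class PhraseFreqBound {
    private PhraseFreqBound() {}

    /**
     * Returns the term frequency to feed into the score upper bound.
     * Exact phrases use the minimum term frequency; the "degraded" form
     * suggested for sloppy phrases uses the maximum, which is looser.
     */
    static int upperBound(int[] termFreqs, boolean sloppy) {
        if (termFreqs.length == 0) {
            throw new IllegalArgumentException("a phrase needs at least one term");
        }
        return sloppy
            ? Arrays.stream(termFreqs).max().getAsInt()
            : Arrays.stream(termFreqs).min().getAsInt();
    }
}
```

With term frequencies {4, 2}, this selects 4 for the sloppy case and 2 for the exact case, matching score(4) and score(2) above.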
> Speed up phrase queries when total hit count is not needed
> ----------------------------------------------------------
>
> Key: LUCENE-7993
> URL: https://issues.apache.org/jira/browse/LUCENE-7993
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Adrien Grand
> Priority: Minor
> Attachments: LUCENE-7993.patch
>
>
> Follow-up of LUCENE-4100: When thinking about the API that we needed to
> introduce to support MAXSCORE, I wondered whether the same API could support
> other optimizations. The idea is that when running phrase queries, before we
> start reading positions, we already have access to the term frequency of each
> term. And the frequency of the phrase is bounded by the minimum term
> frequency of the involved terms. So if the score for that minimum term
> frequency is not competitive then it means that the score for the phrase is
> not competitive either if we can assume that the score increases (or
> stagnates) when the term freq increases, which sounds like an ok requirement
> for a sane Similarity?
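
The check described in the issue could be sketched roughly as follows. This is only an illustration under stated assumptions: PhraseSkipCheck, canSkipPositions and saturatingScore are hypothetical names, and the tf / (tf + 1.2) function merely stands in for any Similarity whose score does not decrease as the term frequency increases.

```java
import java.util.function.DoubleUnaryOperator;

/** Hypothetical sketch for illustration; not a Lucene class. */
final class PhraseSkipCheck {
    private PhraseSkipCheck() {}

    /** Stand-in score: tf / (tf + 1.2), which is non-decreasing in tf. */
    static double saturatingScore(double tf) {
        return tf / (tf + 1.2);
    }

    /**
     * Returns true when the document cannot be competitive as an exact
     * phrase match, so its positions need not be read. Assumes that
     * score is non-decreasing in the frequency it is given.
     */
    static boolean canSkipPositions(int[] termFreqs,
                                    DoubleUnaryOperator score,
                                    double minCompetitiveScore) {
        if (termFreqs.length == 0) {
            throw new IllegalArgumentException("a phrase needs at least one term");
        }
        // The phrase frequency is bounded by the minimum term frequency.
        int minFreq = Integer.MAX_VALUE;
        for (int freq : termFreqs) {
            minFreq = Math.min(minFreq, freq);
        }
        // If even this upper bound is not competitive, the phrase is not either.
        return score.applyAsDouble(minFreq) < minCompetitiveScore;
    }
}
```

For term frequencies {4, 2} the bound is score(2) = 2 / 3.2 = 0.625, so the document would be skipped when the minimum competitive score is 0.7 but not when it is 0.6.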