[
https://issues.apache.org/jira/browse/LUCENE-7993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16213772#comment-16213772
]
Adrien Grand commented on LUCENE-7993:
--------------------------------------
Yes, I think it would be possible. I started with exact phrases, which are
easier to reason about, but we should definitely think about sloppy phrases too.
It just needs a bit more thought, given that a single term occurrence can count
more than once in a sloppy phrase frequency. E.g. if you search for "a b"~3 and
your document contains "a b a", there are two matches even though the freq of b
is 1.
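
To make the "a b a" case concrete, here is a minimal sketch in Python. It is a
simplified model of two-term sloppy matching, not Lucene's actual sloppy phrase
scorer: the function name and the distance rule (how far the second term would
have to move to sit right after the first) are assumptions made for
illustration.

```python
def sloppy_matches(doc_tokens, first, second, slop):
    """Count (first, second) position pairs within `slop` moves of the exact phrase.

    Hypothetical helper: it enumerates every pair of positions, which is a
    simplification of what a real phrase scorer does.
    """
    first_pos = [i for i, t in enumerate(doc_tokens) if t == first]
    second_pos = [i for i, t in enumerate(doc_tokens) if t == second]
    matches = 0
    for pa in first_pos:
        for pb in second_pos:
            # In an exact phrase, `second` sits at pa + 1. The distance is how
            # far `second` would have to move to get there.
            distance = abs(pb - (pa + 1))
            if distance <= slop:
                matches += 1
    return matches


doc = ["a", "b", "a"]
min_term_freq = min(doc.count("a"), doc.count("b"))  # freq of b, i.e. 1
print(sloppy_matches(doc, "a", "b", slop=0), min_term_freq)
print(sloppy_matches(doc, "a", "b", slop=3), min_term_freq)
```

Under this model the exact phrase (slop 0) matches once, but with a slop of 3
the single b pairs with both occurrences of a, so the match count exceeds the
minimum term frequency. That is why the bound used for exact phrases does not
carry over as-is.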
> Speed up phrase queries when total hit count is not needed
> ----------------------------------------------------------
>
> Key: LUCENE-7993
> URL: https://issues.apache.org/jira/browse/LUCENE-7993
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Adrien Grand
> Priority: Minor
> Attachments: LUCENE-7993.patch
>
>
> Follow-up of LUCENE-4100: When thinking about the API that we needed to
> introduce to support MAXSCORE, I wondered whether the same API could support
> other optimizations. The idea is that when running phrase queries, before we
> start reading positions, we already have access to the term frequency of each
> term, and the frequency of the phrase is bounded by the minimum term
> frequency of the involved terms. So if the score for that minimum term
> frequency is not competitive, the score for the phrase is not competitive
> either, provided that the score increases (or stagnates) when the term freq
> increases. That sounds like an ok requirement for a sane Similarity?
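
The bound described in the issue can be sketched as follows. This is only an
illustration under stated assumptions, not the attached patch: `tf_score` is a
toy BM25-style saturation function standing in for a Similarity, and
`can_skip_positions` and `min_competitive_score` are hypothetical names.

```python
def tf_score(freq, k1=1.2):
    """Toy score that never decreases as freq grows (BM25-style saturation,
    with length normalization left out). Stands in for a real Similarity."""
    return freq * (k1 + 1) / (freq + k1)


def can_skip_positions(term_freqs, min_competitive_score):
    """Return True if the phrase cannot be competitive for this document.

    The phrase frequency of an exact phrase is at most the smallest term
    frequency, so scoring that minimum gives an upper bound on the phrase
    score without reading any positions.
    """
    upper_bound_freq = min(term_freqs)
    return tf_score(upper_bound_freq) < min_competitive_score


# Term freqs of the phrase terms in two candidate documents.
print(can_skip_positions([5, 1, 7], min_competitive_score=1.2))
print(can_skip_positions([5, 3, 7], min_competitive_score=1.2))
```

The check is only safe because `tf_score` is non-decreasing in the frequency.
With a score that could drop as the frequency grows, the score of the minimum
term frequency would no longer be an upper bound for the phrase.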