[ 
https://issues.apache.org/jira/browse/LUCENE-7993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16213772#comment-16213772
 ] 

Adrien Grand commented on LUCENE-7993:
--------------------------------------

Yes, I think it would be possible. I started with exact phrases, which are
easier to reason about, but we should definitely think about sloppy phrases too.
I think it just needs a bit more thought, given that a term can count twice in
the frequency of a sloppy phrase: e.g. if you search for "a b"~3 and your
document contains "a b a", there are two matches in spite of a freq of 1 for b.
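
To make the "a b a" case concrete, here is a rough sketch of the counting. It
uses a simplified model (an assumption for illustration, not Lucene's actual
sloppy phrase matching or its distance weighting): every pair of positions
counts as a match when the second term lies within `slop` moves of where an
exact match would put it. The function name is hypothetical.

```python
def sloppy_two_term_matches(tokens, first, second, slop):
    """Count (first, second) position pairs that fit within the given slop.

    Simplified model: in an exact match, `second` sits right after `first`,
    so the slop a pair needs is its distance from that ideal position.
    """
    first_positions = [i for i, t in enumerate(tokens) if t == first]
    second_positions = [i for i, t in enumerate(tokens) if t == second]
    matches = 0
    for p in first_positions:
        for q in second_positions:
            if abs(q - (p + 1)) <= slop:
                matches += 1
    return matches


doc = "a b a".split()
freq_b = doc.count("b")
matches = sloppy_two_term_matches(doc, "a", "b", slop=3)
print(freq_b, matches)  # prints: 1 2
```

Under this model the single "b" takes part in two matches, one with each "a",
so the minimum term frequency (1) no longer bounds the number of matches (2).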

> Speed up phrase queries when total hit count is not needed
> ----------------------------------------------------------
>
>                 Key: LUCENE-7993
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7993
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Priority: Minor
>         Attachments: LUCENE-7993.patch
>
>
> Follow-up of LUCENE-4100: When thinking about the API that we needed to 
> introduce to support MAXSCORE, I wondered whether the same API could support 
> other optimizations. The idea is that when running phrase queries, we already 
> have access to the term frequency of each term before we start reading 
> positions, and the frequency of the phrase is bounded by the minimum term 
> frequency of the involved terms. So if the score for that minimum term 
> frequency is not competitive, then the score for the phrase is not 
> competitive either, provided we can assume that the score increases (or 
> stagnates) when the term freq increases, which sounds like an ok requirement 
> for a sane Similarity?
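
The check described above could be sketched roughly as follows. This is only
an illustration under stated assumptions: `tf_score` is a stand-in BM25-like
saturation function (idf and length normalization omitted), and
`can_skip_positions` and `min_competitive_score` are hypothetical names, not
part of the Lucene API or the attached patch.

```python
def tf_score(freq, k1=1.2):
    """Stand-in scoring function: non-decreasing in freq (BM25-like saturation)."""
    return freq * (k1 + 1) / (freq + k1)


def can_skip_positions(term_freqs, min_competitive_score):
    """Return True if the phrase cannot be competitive on this document.

    For an exact phrase, the phrase frequency is at most the smallest term
    frequency, so scoring that upper bound gives an upper bound on the score
    as long as the score never decreases when the frequency increases.
    """
    max_phrase_freq = min(term_freqs)
    return tf_score(max_phrase_freq) < min_competitive_score


# Hypothetical documents: per-term frequencies for a two-term phrase.
print(can_skip_positions([5, 1], min_competitive_score=1.5))  # prints: True
print(can_skip_positions([5, 3], min_competitive_score=1.5))  # prints: False
```

In the first case the best possible phrase frequency is 1, whose score of 1.0
is below the threshold, so positions would not need to be read at all. In the
second case the bound is inconclusive and positions still have to be checked.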



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

