[ https://issues.apache.org/jira/browse/LUCENE-10317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17460158#comment-17460158 ]
Dawid Weiss commented on LUCENE-10317:
--------------------------------------
If you open your document in Luke and look at the positions of the indexed
terms, it is simpler to work out the required number of operations (the slop).
Yes, terms dropped by an analyzer, or position gaps an analyzer introduces,
can make phrase queries confusing.
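
To make the position arithmetic concrete, here is a minimal sketch in plain
Java (no Lucene dependency). It assumes a simplified model of sloppy phrase
matching: shift each term's document position by its offset in the query, and
take the spread (max minus min) of the shifted values as the smallest slop
that matches. The class name SlopSketch, the method requiredSlop, and the
position values are hypothetical and chosen for illustration; the positions
your analyzer actually produces should be read off in Luke.

```java
// Hypothetical helper, not part of Lucene: a simplified model of the
// smallest slop needed for a phrase to match at the given term positions.
final class SlopSketch {

    /**
     * @param queryOffsets position of each term inside the phrase query
     * @param docPositions position of the same term in the indexed document
     * @return spread of (docPosition - queryOffset) over all terms
     */
    static int requiredSlop(int[] queryOffsets, int[] docPositions) {
        if (queryOffsets.length == 0 || queryOffsets.length != docPositions.length) {
            throw new IllegalArgumentException("need one document position per query term");
        }
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        for (int i = 0; i < queryOffsets.length; i++) {
            // Align the document position with the term's place in the phrase.
            int shifted = docPositions[i] - queryOffsets[i];
            min = Math.min(min, shifted);
            max = Math.max(max, shifted);
        }
        return max - min;
    }

    public static void main(String[] args) {
        int[] query = {0, 1}; // "quick fox": quick at offset 0, fox at offset 1

        // Assumed positions for "the fox is quick" when "is" occupies a
        // position (or its gap is kept): the=0, fox=1, is=2, quick=3.
        System.out.println(requiredSlop(query, new int[] {3, 1})); // prints 3

        // Assumed positions if "is" leaves no gap: fox=1, quick=2.
        System.out.println(requiredSlop(query, new int[] {2, 1})); // prints 2
    }
}
```

Under this model, the difference between a slop of 3 and a slop of 2 comes
down to whether "is" still occupies a position between "fox" and "quick".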
> In PhraseQuery API, the explanation of getSlop is not inexact but could be
> more clear
> -------------------------------------------------------------------------------------
>
> Key: LUCENE-10317
> URL: https://issues.apache.org/jira/browse/LUCENE-10317
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/search
> Affects Versions: 5.2.1
> Reporter: Claude Lepère
> Priority: Trivial
>
> The explanation says that searching for "quick fox" will match the document
> "the fox is quick" with a slop of 3.
> That's true if the stop word "is" is not removed by the analyzer at indexing
> but, with the standard stop word list of Lucene which includes "is", a slop
> of 2 is enough.
> As I understand the comment in the PhraseQuery source, switching the order of
> two words requires two moves (the first places the words atop one another)
> and the slop is 2, but, if "is" is not removed, a third "move" is needed to
> add "is" itself and the slop is 3. I am not sure of this explanation. I would
> be happy to have it confirmed ... or not.
> I tested both cases in Lucene 5.2.1 but the text is the same in PhraseQuery
> API 8_0_0.
>