Claude Lepère created LUCENE-10317:
--------------------------------------
Summary: In PhraseQuery API, the explanation of getSlop is not
inexact but could be more clear
Key: LUCENE-10317
URL: https://issues.apache.org/jira/browse/LUCENE-10317
Project: Lucene - Core
Issue Type: Improvement
Components: core/search
Affects Versions: 5.2.1
Reporter: Claude Lepère
The explanation says that searching for "quick fox" will match the document
"the fox is quick" with a slop of 3.
That is true only if the analyzer does not remove the stop word "is" at
indexing time. With Lucene's standard stop word list, which includes "is", a
slop of 2 is enough.
As I understand the comment in the PhraseQuery source, switching the order of
two words requires two moves (the first move places the words atop one
another), so the slop is 2. If "is" is not removed, a third "move" is needed to
account for "is" itself, so the slop is 3. I am not sure of this explanation
and would be happy to have it confirmed, or not.
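To make the reasoning above concrete, here is a minimal sketch of how the two
slop values could arise. It is a simplified model and not Lucene's actual
matching code: the class SlopSketch and the method requiredSlop are
hypothetical names, and the second case assumes that removed stop words leave
no position gaps in the index, which may not hold for every analyzer
configuration.

```java
// Simplified model of sloppy phrase matching; NOT Lucene's implementation.
// SlopSketch and requiredSlop are hypothetical names used for illustration.
final class SlopSketch {

    // queryPos[i]: position of term i inside the phrase query.
    // docPos[i]:   position of the same term in the indexed document.
    // Returns the smallest slop for which this model reports a match:
    // the spread between the largest and smallest per-term shift.
    static int requiredSlop(int[] queryPos, int[] docPos) {
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        for (int i = 0; i < queryPos.length; i++) {
            int shift = docPos[i] - queryPos[i]; // distance of term i from its query slot
            min = Math.min(min, shift);
            max = Math.max(max, shift);
        }
        return max - min;
    }

    public static void main(String[] args) {
        int[] query = {0, 1}; // "quick fox": quick=0, fox=1

        // "the fox is quick" with every token indexed: quick=3, fox=1
        System.out.println(requiredSlop(query, new int[] {3, 1})); // prints 3

        // Assumed layout when "the" and "is" are dropped without gaps: quick=1, fox=0
        System.out.println(requiredSlop(query, new int[] {1, 0})); // prints 2
    }
}
```

Under these assumptions the model reproduces both values reported here: 3 when
"is" keeps its position and 2 when the remaining terms become adjacent.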
I tested both cases in Lucene 5.2.1; the text is the same in the PhraseQuery
API for 8_0_0.
--
This message was sent by Atlassian Jira
(v8.20.1#820001)