Hello.

The explanation of PhraseQuery.getSlop() at
https://lucene.apache.org/core/8_0_0/core/org/apache/lucene/search/PhraseQuery.html#getSlop--
states that, for the query "quick fox", the text "the fox is quick" would be
at an edit distance of 3;
this seems inaccurate to me.

I don't know if the edit distance used by Lucene is the Levenshtein
distance (insertion, deletion, substitution, all of weight 1), a standard
in information retrieval, but when I test a PhraseQuery for "quick fox" with a
slop of 2, it matches the text "the fox is quick" (1 deletion + 1 insertion), so
the slop does not have to be 3.

I wonder if I'm right.


Claude Lepère, Belgium

claudelep...@gmail.com


