[
https://issues.apache.org/jira/browse/LUCENE-7439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16148358#comment-16148358
]
Tim Allison commented on LUCENE-7439:
-------------------------------------
Talk about slow on the uptake (no pun intended)...
[faf3bc3|https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=faf3bc3]
removed SlowFuzzyQuery, and I realize it has been deprecated for a long, long
time. Is there any way to raise maxEdits/MAXIMUM_SUPPORTED_DISTANCE above 2?
> Should FuzzyQuery match short terms too?
> ----------------------------------------
>
> Key: LUCENE-7439
> URL: https://issues.apache.org/jira/browse/LUCENE-7439
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Michael McCandless
> Assignee: Michael McCandless
> Fix For: 7.0, 6.3
>
> Attachments: LUCENE-7439.patch, LUCENE-7439.patch, LUCENE-7439.patch
>
>
> Today, if you ask {{FuzzyQuery}} to match {{abcd}} with edit distance 2, it
> will fail to match the term {{ab}} even though it's 2 edits away.
> Its javadocs explain this:
> {noformat}
> * <p>NOTE: terms of length 1 or 2 will sometimes not match because of how the scaled
> * distance between two terms is computed. For a term to match, the edit distance between
> * the terms must be less than the minimum length term (either the input term, or
> * the candidate term). For example, FuzzyQuery on term "abcd" with maxEdits=2 will
> * not match an indexed term "ab", and FuzzyQuery on term "a" with maxEdits=2 will not
> * match an indexed term "abc".
> {noformat}
> On the one hand, I can see that this behavior is sort of justified in that
> 50% of the characters are different and so this is a very "weak" match, but
> on the other hand, it's quite unexpected: edit distance is an exact measure,
> so the terms should have matched.
> It seems like the behavior is caused by internal implementation details about
> how the relative (floating point) score is computed. I think we should fix
> it, so that edit distance 2 does in fact match all terms with edit distance
> <= 2.
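
For illustration only, here is a minimal Java sketch of the rule quoted in the
javadoc note above, next to the plain "distance <= maxEdits" check the issue
asks for. It is not Lucene's implementation: the class and method names
(FuzzyRuleSketch, editDistance, matchesQuotedRule, matchesPlainRule) are
hypothetical, and it assumes plain Levenshtein distance (insert, delete,
substitute) with no transpositions.

```java
class FuzzyRuleSketch {
    // Classic dynamic-programming Levenshtein distance, keeping two rows.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev; // swap rows so prev always holds the last full row
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Rule as worded in the quoted javadoc note (an interpretation, not Lucene code):
    // the distance must also be less than the shorter term's length.
    static boolean matchesQuotedRule(String query, String candidate, int maxEdits) {
        int d = editDistance(query, candidate);
        return d <= maxEdits && d < Math.min(query.length(), candidate.length());
    }

    // Behavior the issue proposes: match every term within maxEdits.
    static boolean matchesPlainRule(String query, String candidate, int maxEdits) {
        return editDistance(query, candidate) <= maxEdits;
    }

    public static void main(String[] args) {
        System.out.println(matchesQuotedRule("abcd", "ab", 2));
        System.out.println(matchesPlainRule("abcd", "ab", 2));
    }
}
```

With these definitions, "abcd" vs. "ab" and "a" vs. "abc" are both at distance
2, so the quoted rule rejects them (2 is not less than the shorter length)
while the plain rule accepts them.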