[ 
https://issues.apache.org/jira/browse/LUCENE-2667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir reopened LUCENE-2667:
---------------------------------


I missed some things in contrib/queryparser when doing this.

> Fix FuzzyQuery's defaults, so it's fast.
> ----------------------------------------
>
>                 Key: LUCENE-2667
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2667
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>    Affects Versions: 4.0
>            Reporter: Robert Muir
>            Assignee: Robert Muir
>             Fix For: 4.0
>
>         Attachments: LUCENE-2667.patch, LUCENE-2667.patch
>
>
> We worked a lot on FuzzyQuery, but you need to be a rocket scientist to 
> ensure good results.
> The main problem is that the default distance is 0.5f, which doesn't take 
> into account the length of the string.
> To add insult to injury, the default number of expansions is 1024 
> (traditionally taken from BooleanQuery's maxClauseCount).
> I propose:
> * The syntax of FuzzyQuery is enhanced so that you can specify raw edits 
> too, such as foobar~2 (all terms within 2 Levenshtein edits of foobar). 
> Previously, if you specified any amount >=1, you got IllegalArgumentException, 
> so this won't break anyone. You can still use foobar~0.5, and it works just 
> as before.
> * The default for minimumSimilarity then becomes 
> LevenshteinAutomata.MAXIMUM_SUPPORTED_DISTANCE, which is 2. This way, if you 
> just do foobar~, it's always fast.
> * The size of the priority queue is reduced by default from 1024 to a much 
> more reasonable value: 50. This is what FuzzyLikeThis uses.
> I think it's best to just change the defaults for this query, since it was so 
> awful before. We can add notes in migrate.txt that if you care about using 
> the old values, then you should provide them explicitly, and you will get the 
> same results!
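
As a rough illustration of the proposal above, the sketch below shows one way
the number after '~' could be interpreted. It is not the Lucene code: the class
and method names are hypothetical, and the legacy rule (allowed edits are about
(1 - similarity) * term length) is an assumed simplification of the old
similarity scoring. It is only meant to show why a fixed 0.5f admits more edits
as terms get longer, while a raw edit count does not.

```java
// Hypothetical sketch, not part of Lucene.
class FuzzySuffixSketch {

    /**
     * Interprets the value written after '~' in a fuzzy term.
     * Values >= 1 are treated as a raw edit count (foobar~2).
     * Values < 1 are treated as a legacy minimum similarity (foobar~0.5);
     * ASSUMPTION: allowed edits are approximated as (1 - similarity) * termLength.
     */
    static int maxEdits(float fuzzyValue, int termLength) {
        if (fuzzyValue >= 1f) {
            // Raw edit count: independent of the term length.
            return (int) fuzzyValue;
        }
        // Legacy similarity: the edit budget grows with the term length.
        return (int) ((1.0 - fuzzyValue) * termLength);
    }

    public static void main(String[] args) {
        String[] terms = {"foobar", "internationalization"};
        for (String term : terms) {
            System.out.println(term + "~0.5 allows about "
                    + maxEdits(0.5f, term.length()) + " edits");
            System.out.println(term + "~2 allows "
                    + maxEdits(2f, term.length()) + " edits");
        }
    }
}
```

Under these assumptions, a six-character term with ~0.5 already allows more
edits than the supported maximum of 2 mentioned above, and longer terms allow
more still, whereas ~2 stays at 2 regardless of length.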

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

