Do you, or any other Solr member, know of a good fuzzy string matching
library to recommend?

On Thu, May 19, 2011 at 9:39 AM, Michael McCandless <
luc...@mikemccandless.com> wrote:

> Well.... the good news is FuzzyQuery is indeed much faster in Lucene/Solr
> 4.0.
>
> But the bad news is... FuzzyQuery won't do what you need here.  You
> need some sort of FuzzyPhraseQuery, which is able to replace terms
> similar to one another (comp/company/corporation) by some metric.  I
> don't know of such a query in Lucene/Solr... but it'd be a nice
> addition.  Others have asked about this before.
>
> FuzzyQuery finds terms "close" to other terms, when measured by edit
> distance, eg fuzzy/wuzzy/muzzy are all edit distance one from each
> other.
>
> Mike
>
> http://blog.mikemccandless.com
>
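
To make the edit-distance metric described above concrete, here is a minimal sketch of the classic Levenshtein distance in plain Java. It is not FuzzyQuery's implementation; the class and method names are hypothetical and exist only for illustration.

```java
// Hypothetical helper for illustration only; not part of Lucene or Solr.
final class EditDistanceSketch {

    // Classic Levenshtein distance: the minimum number of single-character
    // insertions, deletions, or substitutions needed to turn a into b.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // building the first j chars of b from "" takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // reducing the first i chars of a to "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertOrDelete = Math.min(curr[j - 1] + 1, prev[j] + 1);
                curr[j] = Math.min(insertOrDelete, prev[j - 1] + cost);
            }
            int[] tmp = prev; // reuse the two rows instead of allocating a full matrix
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("fuzzy", "wuzzy"));  // 1
        System.out.println(levenshtein("wuzzy", "muzzy"));  // 1
        System.out.println(levenshtein("comp", "company")); // 3
    }
}
```

Note that "comp" and "company" are already three edits apart, which illustrates why a per-term edit-distance metric is a poor fit for matching abbreviations such as comp/company/corporation.
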
> On Wed, May 18, 2011 at 8:03 PM, Guilherme Aiolfi <grad...@gmail.com>
> wrote:
> > Hi,
> >
> > I want to do a fuzzy search that compares a phrase to a field in Solr. For
> > example:
> >
> > "abc company ltda" will be compared to "abc comp", "abc corporation",
> > "def company ltda", "nothing to match here".
> >
> > The thing is that it always has to return documents sorted by score.
> >
> > I've found some good algorithms to do that, like StrikeAMatch[1] and
> > JaroWinkler.
> >
> > Using JaroWinkler with strdist() I can do exactly that. But I would rather
> > use StrikeAMatch, which had a patch in the Lucene JIRA that was never
> > committed.
> >
> > So I contacted the author of that patch, and he told me that I should use
> > Solr 4.0, which now has some pretty good new fuzzy search enhancements
> > that make StrikeAMatch look like a toy for kids.
> >
> > Does anyone know how I can achieve that using Solr 4.0?
> >
> > [1] http://www.catalysoft.com/articles/StrikeAMatch.html
> >
>
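
For readers unfamiliar with the StrikeAMatch article [1] cited in the original message, the following is a rough sketch of letter-pair similarity as I understand that article: split each string into words, collect the adjacent letter pairs of every word, and score two strings as twice the number of shared pairs divided by the total number of pairs. The class and method names are hypothetical, and this is neither the uncommitted Lucene patch nor anything shipped with Solr.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

// Hypothetical helper for illustration only; not part of Lucene or Solr.
final class LetterPairSimilaritySketch {

    // Adjacent letter pairs of every whitespace-separated word, upper-cased.
    // Example: "abc comp" -> [AB, BC, CO, OM, MP]
    static List<String> wordLetterPairs(String s) {
        List<String> pairs = new ArrayList<>();
        for (String word : s.trim().toUpperCase(Locale.ROOT).split("\\s+")) {
            for (int i = 0; i + 1 < word.length(); i++) {
                pairs.add(word.substring(i, i + 2));
            }
        }
        return pairs;
    }

    // Similarity in [0, 1]: 2 * |shared pairs| / (|pairs of a| + |pairs of b|).
    static double similarity(String a, String b) {
        List<String> pairsA = wordLetterPairs(a);
        List<String> pairsB = wordLetterPairs(b);
        int total = pairsA.size() + pairsB.size();
        if (total == 0) {
            return 0.0; // neither string contains a letter pair
        }
        int shared = 0;
        for (String pair : pairsA) {
            // Remove each matched pair so repeated pairs are counted only once.
            if (pairsB.remove(pair)) {
                shared++;
            }
        }
        return 2.0 * shared / total;
    }

    public static void main(String[] args) {
        String query = "abc company ltda";
        List<String> candidates = new ArrayList<>(Arrays.asList(
                "abc comp", "abc corporation", "def company ltda",
                "nothing to match here"));
        // Sort candidates by descending similarity to the query.
        candidates.sort(Comparator.comparingDouble(
                (String c) -> similarity(query, c)).reversed());
        for (String c : candidates) {
            System.out.printf("%.3f  %s%n", similarity(query, c), c);
        }
    }
}
```

With the example phrases from the thread, this sketch ranks "def company ltda" first (9 shared pairs out of 22), then "abc comp" (5 shared pairs out of 16), then "abc corporation", with "nothing to match here" scoring zero. Whether that ordering is the desired one is a judgment call: letter-pair overlap rewards long shared words more than a shared leading word.
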
