Hi Shivani,
As far as I understand, Jaro-Winkler doesn't quite fit Lucene's
structure. Calculating the Jaro-Winkler distance between the query and
every word in the index is computationally expensive.
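To make the cost concrete, here is a rough sketch of the comparison in
plain Java. It is not Lucene code: the class JaroWinklerSketch and the
helper bestMatch are hypothetical names, and the prefix scale of 0.1 and
the 4-character prefix cap are the usual Winkler defaults, assumed here
rather than taken from your project. The point is only that bestMatch
has to score the query against every term, one by one.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only; names are hypothetical, not part of Lucene.
class JaroWinklerSketch {

    // Plain Jaro similarity in [0, 1].
    static double jaro(String s1, String s2) {
        if (s1.equals(s2)) return 1.0;
        int len1 = s1.length(), len2 = s2.length();
        if (len1 == 0 || len2 == 0) return 0.0;

        // Characters match only if they are within this distance of each other.
        int window = Math.max(0, Math.max(len1, len2) / 2 - 1);
        boolean[] matched1 = new boolean[len1];
        boolean[] matched2 = new boolean[len2];
        int matches = 0;
        for (int i = 0; i < len1; i++) {
            int lo = Math.max(0, i - window);
            int hi = Math.min(len2 - 1, i + window);
            for (int j = lo; j <= hi; j++) {
                if (!matched2[j] && s1.charAt(i) == s2.charAt(j)) {
                    matched1[i] = true;
                    matched2[j] = true;
                    matches++;
                    break;
                }
            }
        }
        if (matches == 0) return 0.0;

        // Count matched characters that appear in a different order.
        int k = 0, outOfOrder = 0;
        for (int i = 0; i < len1; i++) {
            if (!matched1[i]) continue;
            while (!matched2[k]) k++;
            if (s1.charAt(i) != s2.charAt(k)) outOfOrder++;
            k++;
        }
        double m = matches;
        double transpositions = outOfOrder / 2.0;
        return (m / len1 + m / len2 + (m - transpositions) / m) / 3.0;
    }

    // Jaro-Winkler: boosts the Jaro score for a shared prefix.
    // Assumed defaults: scale 0.1, prefix capped at 4 characters.
    static double similarity(String s1, String s2) {
        double j = jaro(s1, s2);
        int max = Math.min(4, Math.min(s1.length(), s2.length()));
        int prefix = 0;
        while (prefix < max && s1.charAt(prefix) == s2.charAt(prefix)) prefix++;
        return j + prefix * 0.1 * (1.0 - j);
    }

    // Brute-force scan: one similarity computation per term.
    static String bestMatch(String query, List<String> terms) {
        String best = null;
        double bestScore = -1.0;
        for (String term : terms) {
            double score = similarity(query, term);
            if (score > bestScore) {
                bestScore = score;
                best = term;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(similarity("MARTHA", "MARHTA"));
        System.out.println(bestMatch("shivani", Arrays.asList("shivam", "shivani", "john")));
    }
}
```

With a large index, that loop in bestMatch runs once per distinct
term for every query, which is where the time goes.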
A possible alternative is to use Soundex, Metaphone, Double Metaphone,
etc. instead. For each word
FuzzyQuery uses edit distance, so you could probably create a
JaroWinklerQuery that mimics FuzzyQuery but calculates the Jaro-Winkler
score instead of the edit distance. Dealing with phrases would be a bit
more complex, but you may be able to use PhraseQuery
as an example and then
Hi All,
I am using Jaro-Winkler scoring in my current project for matching names.
The database of names against which the input value has to be matched is
huge, so we are facing performance issues.
We now want Lucene to help us here; we want Lucene's speed for handling huge
data