Re: Using Lucene with Jarowinkler

2008-01-07 Thread Chris Lu
Hi Shivani, As I understand it, Jaro-Winkler doesn't quite fit Lucene's structure. Calculating the Jaro-Winkler distance between the query and each word in the index is computationally intensive. It may be possible to use Soundex, Metaphone, Double Metaphone, etc. instead. For each word
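
To illustrate the phonetic-key idea mentioned in this reply, here is a minimal sketch of American Soundex in plain Java. The class name SimpleSoundex and its encode method are hypothetical names chosen for this example, not part of Lucene, and the sketch does not show how the keys would be indexed. The point is only that names which sound alike collapse to the same short key, so matching can become an exact term lookup rather than a pairwise distance computation.

```java
import java.util.Locale;

// Hypothetical helper for illustration only; not a Lucene class.
final class SimpleSoundex {
    private SimpleSoundex() {}

    // One entry per letter A..Z: '0' marks vowels and Y (not coded),
    // '-' marks H and W (ignored, and they do not separate equal codes).
    private static final String CODES = "0123012-02245501262301-202";

    /** Returns the 4-character Soundex key, or "0000" if the input has no ASCII letters. */
    static String encode(String word) {
        StringBuilder out = new StringBuilder();
        char last = 0;
        for (char raw : word.toUpperCase(Locale.ROOT).toCharArray()) {
            if (raw < 'A' || raw > 'Z') continue;      // skip non-letters
            char code = CODES.charAt(raw - 'A');
            if (out.length() == 0) {                   // first letter is kept verbatim
                out.append(raw);
                last = code;
                continue;
            }
            if (code == '-') continue;                 // H/W: keep the previous code
            if (code != '0' && code != last) out.append(code);
            last = code;                               // a vowel resets the previous code
            if (out.length() == 4) break;
        }
        while (out.length() < 4) out.append('0');      // pad short keys with zeros
        return out.toString();
    }
}
```

With this sketch, "Robert" and "Rupert" both encode to R163, so a lookup on the key would retrieve either spelling.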

Re: Using Lucene with Jarowinkler

2008-01-07 Thread Grant Ingersoll
FuzzyQuery uses edit distance; you could probably create a JaroWinklerQuery that mimics FuzzyQuery but calculates the Jaro-Winkler score instead of the edit distance. As for dealing with phrases, that would get a bit more complex, but you may be able to use PhraseQuery as an example and then
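
For reference, the score such a query would compute per candidate term can be sketched in plain Java as follows. This is only an illustration of the standard Jaro-Winkler similarity, using the commonly cited defaults (prefix scale 0.1, prefix capped at 4 characters). The class and method names are hypothetical, and the wiring into a FuzzyQuery-style term enumeration is not shown.

```java
// Hypothetical helper for illustration only; not a Lucene class.
final class JaroWinkler {
    private JaroWinkler() {}

    /** Jaro similarity in [0, 1]; 1.0 means the strings are identical. */
    static double jaro(String a, String b) {
        if (a.equals(b)) return 1.0;
        int la = a.length(), lb = b.length();
        if (la == 0 || lb == 0) return 0.0;

        // Characters count as matching only within this distance of each other.
        int window = Math.max(0, Math.max(la, lb) / 2 - 1);
        boolean[] matchedA = new boolean[la];
        boolean[] matchedB = new boolean[lb];
        int matches = 0;
        for (int i = 0; i < la; i++) {
            int lo = Math.max(0, i - window);
            int hi = Math.min(lb - 1, i + window);
            for (int j = lo; j <= hi; j++) {
                if (!matchedB[j] && a.charAt(i) == b.charAt(j)) {
                    matchedA[i] = true;
                    matchedB[j] = true;
                    matches++;
                    break;
                }
            }
        }
        if (matches == 0) return 0.0;

        // Walk the matched characters of both strings in order and count
        // positions where they differ; two such positions make one transposition.
        int k = 0, outOfOrder = 0;
        for (int i = 0; i < la; i++) {
            if (!matchedA[i]) continue;
            while (!matchedB[k]) k++;
            if (a.charAt(i) != b.charAt(k)) outOfOrder++;
            k++;
        }
        double m = matches;
        double t = outOfOrder / 2.0;
        return (m / la + m / lb + (m - t) / m) / 3.0;
    }

    /** Jaro-Winkler similarity: the Jaro score boosted for a shared prefix of up to 4 characters. */
    static double similarity(String a, String b) {
        double j = jaro(a, b);
        int max = Math.min(4, Math.min(a.length(), b.length()));
        int prefix = 0;
        while (prefix < max && a.charAt(prefix) == b.charAt(prefix)) prefix++;
        return j + prefix * 0.1 * (1.0 - j);
    }
}
```

For the textbook pair "MARTHA" and "MARHTA", this gives a Jaro score of about 0.944 and a Jaro-Winkler score of about 0.961. Computing this against every term in a large index is the cost the other reply warns about.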

Using Lucene with Jarowinkler

2008-01-07 Thread Shivani Sawhney
Hi All, I am using Jaro-Winkler scoring in my current project for matching names. The database of names against which the input value has to be matched is huge, so we are facing performance issues. We now want Lucene to help us here; we want Lucene's speed for handling huge data