Re: Fuzzy searching, tildes and solr

Walter Lewis Fri, 26 Jan 2007 07:46:07 -0800

Yonik Seeley wrote:

+(+text:jame +text:sutherland) +searchSet:testSet

+(+text:james~0.75 +text:sutherland~0.75) +searchSet:testSet


I can tell from the first that this is a stemmed field... "james" is
transformed to "jame"

"James" being the plural of "Jame" according to the stemmer. I guess my mind hadn't run in that direction. :)

I guess I wasn't expecting the fuzzy query logic to bypass the stemming. Would it be correct that if I were to add "james" to the protwords.txt file, this *specific* problem would go away? Obviously there are a significant number of proper names where this would have an impact, so a more generic solution is preferable.

So, you could
- index the field twice using copyField, and then do fuzzy queries on
the non-stemmed version. [plus two other good suggestions]

As I look at the field types in the example schema, would you recommend something like text_lu without the EnglishPorterFilterFactory, or are there other issues I'm overlooking?
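
For illustration, the copyField approach might look roughly like the schema.xml fragment below. This is only a sketch under assumptions: the names text_nostem and text_exact are hypothetical, the source field is assumed to be called "text" (as in the queries above), and the analyzer chain shown is a guess that should mirror whatever the stemmed type uses, minus the stemmer.

```xml
<!-- Hypothetical field type: same analysis as the stemmed type,
     but with no EnglishPorterFilterFactory in the chain.
     The filters listed are an assumption; copy the real chain from
     the stemmed type and drop only the stemmer. -->
<fieldtype name="text_nostem" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldtype>

<!-- Hypothetical unstemmed copy of the "text" field, used only for
     fuzzy queries, so it is indexed but not stored. -->
<field name="text_exact" type="text_nostem" indexed="true" stored="false"/>

<!-- Index the same content a second time into the unstemmed field. -->
<copyField source="text" dest="text_exact"/>
```

With something like that in place (and after reindexing), the fuzzy form of the query would target the unstemmed field, e.g. +(+text_exact:james~0.75 +text_exact:sutherland~0.75) +searchSet:testSet, while ordinary queries continue to use the stemmed "text" field.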


Walter Lewis
(aka Walt Lewi apparently)
