I should have Googled better. It seems that my question has been asked
and answered already, and not just once:
http://www.nabble.com/Using-wildcard-with-accented-words-tf4673239.html
http://groups.google.com/group/acts_as_solr/browse_thread/thread/42920dc2dcc5fa88
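
For anyone who lands on this thread later: as far as I can tell, the short version is that wildcard terms are not run through the field's query analyzer, so LowerCaseFilterFactory never sees Deck*, while every term in the index has been lowercased. One workaround is to lowercase wildcard terms in the client before the query is sent. Below is a rough sketch in Python. Note that lowercase_wildcard_terms is a made-up helper, not part of Solr or any Solr client library, and the regex is deliberately simple (it does not handle backslash escapes and will also lowercase a wildcard that appears inside a quoted phrase).

```python
import re

# Hypothetical helper, not part of Solr or any client library.
# Matches a bare term containing a wildcard (* or ?), optionally
# preceded by a "field:" prefix. The field name itself is left alone.
_WILDCARD_TERM = re.compile(
    r'(?P<field>\w+:)?(?P<term>[^\s:()"]*[*?][^\s:()"]*)'
)


def lowercase_wildcard_terms(query: str) -> str:
    """Lowercase only the terms that contain a wildcard.

    Terms without wildcards are left untouched, on the assumption that
    the field's analyzer lowercases them at query time anyway.
    """
    def fix(match: re.Match) -> str:
        field = match.group("field") or ""
        return field + match.group("term").lower()

    return _WILDCARD_TERM.sub(fix, query)


if __name__ == "__main__":
    print(lowercase_wildcard_terms("description:Deck*"))  # description:deck*
```

This only makes sense for fields whose index-time analysis lowercases everything, as textTightUnstemmed does. It does not address the accented-character case discussed in the first link.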
On Nov 28, 2007 9:42 AM, Charles Hornberger
[EMAIL PROTECTED] wrote:
I'm confused by some behavior I'm seeing in Solr (I'm using 1.2.0). I
have a field named description, declared with the following
fieldType:
  <fieldType name="textTightUnstemmed" class="solr.TextField"
             positionIncrementGap="100">
    <analyzer>
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.SynonymFilterFactory"
              synonyms="synonyms.txt" ignoreCase="true" expand="false"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true"
              words="stopwords.txt"/>
      <filter class="solr.WordDelimiterFilterFactory"
              generateWordParts="0" generateNumberParts="0" catenateWords="1"
              catenateNumbers="1" catenateAll="0"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    </analyzer>
  </fieldType>
The problem I'm having is that when I search for description:deck*, I
get the results I expect; when I search for description:Deck*, I get
nothing. I want both queries to return the same result set. (I'm using
the standard request handler.)
Interestingly, when I search for description:Deck from the web
interface, the debug output shows that the query term is converted to
lowercase:
  <str name="rawquerystring">description:Deck</str>
  <str name="querystring">description:Deck</str>
  <str name="parsedquery">description:deck</str>
  <str name="parsedquery_toString">description:deck</str>
... but when I search for description:Deck*, it shows that it is not:
  <str name="rawquerystring">description:Deck*</str>
  <str name="querystring">description:Deck*</str>
  <str name="parsedquery">description:Deck*</str>
  <str name="parsedquery_toString">description:Deck*</str>
What am I doing wrong here?
Also, when I use the Field Analysis tool for description:Deck*, it
shows the following (sorry for the bad copy/paste):
Query Analyzer

org.apache.solr.analysis.WhitespaceTokenizerFactory {}
  term position     1
  term text         Deck*
  term type         word
  source start,end  0,5

org.apache.solr.analysis.SynonymFilterFactory {synonyms=synonyms.txt,
expand=false, ignoreCase=true}
  term position     1
  term text         Deck*
  term type         word
  source start,end  0,5

org.apache.solr.analysis.StopFilterFactory {words=stopwords.txt,
ignoreCase=true}
  term position     1
  term text         Deck*
  term type         word
  source start,end  0,5

org.apache.solr.analysis.WordDelimiterFilterFactory
{generateNumberParts=0, catenateWords=1, generateWordParts=0,
catenateAll=0, catenateNumbers=1}
  term position     1
  term text         Deck
  term type         word
  source start,end  0,4

org.apache.solr.analysis.LowerCaseFilterFactory {}
  term position     1
  term text         deck
  term type         word
  source start,end  0,4

org.apache.solr.analysis.RemoveDuplicatesTokenFilterFactory {}
  term position     1
  term text         deck
  term type         word
  source start,end  0,4

Thanks,
Charlie