I'm having a problem when users enter stopwords in their query. I'm using a
dismax request handler against a field set up like this:

<fieldType name="simpleText" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory"
            ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.LengthFilterFactory" min="2" max="20"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory"
            ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.SynonymFilterFactory"
            synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.LengthFilterFactory" min="2" max="20"/>
  </analyzer>
</fieldType>

The problem is that when a user enters a query like 'meet the president', zero
results are returned. I imagine it has something to do with 'the' being
stripped out, so that only 2 of the 3 terms can match. As a temporary
workaround I set the dismax minimum-should-match parameter (mm) to 1, so I do
get results. That causes other problems, though, such as 'the' never being
highlighted in the results. Am I doing something totally wrong?
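
To make that guess concrete, here is a rough Python sketch of what I suspect is going on. It is a toy model, not Solr's actual dismax logic. The stopword set, the function names `analyze` and `matches`, and the sample document are all made up for illustration. It also ignores the synonym filter and uses whitespace tokenization on both sides, even though the index analyzer above uses StandardTokenizerFactory.

```python
# Toy model of the suspected behaviour; NOT how Solr implements dismax.
STOPWORDS = {"the", "a", "an"}  # assumed contents of stopwords.txt


def analyze(text):
    """Roughly mimic the query analyzer: split, stop filter, lowercase, length 2..20."""
    tokens = text.split()
    tokens = [t for t in tokens if t.lower() not in STOPWORDS]  # ignoreCase=true
    tokens = [t.lower() for t in tokens]
    return [t for t in tokens if 2 <= len(t) <= 20]


def matches(query, doc, min_should_match):
    """True if at least min_should_match analyzed query terms occur in the doc."""
    doc_terms = set(analyze(doc))
    hits = sum(1 for t in analyze(query) if t in doc_terms)
    return hits >= min_should_match


query = "meet the president"
doc = "Meet the President of the club"   # hypothetical indexed text
typed_terms = len(query.split())         # 3 terms as the user typed them

print(analyze(query))                    # the stopword 'the' is gone
print(matches(query, doc, typed_terms))  # all 3 typed terms required
print(matches(query, doc, 1))            # the mm=1 workaround
```

In this model, requiring all three typed terms can never succeed because 'the' is removed before matching, while requiring only one term succeeds. Whether Solr really counts the stripped term toward the required total is exactly the part I am unsure about.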
Thanks,
Kallin Nagelberg