Hi everyone, I think the subject line says it all. Here is the schema I'm using:

    <fieldType name="my_text" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="true">
      <analyzer>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt"/>
        <filter class="solr.WordDelimiterFilterFactory"
                generateWordParts="1" generateNumberParts="1"
                catenateWords="1" catenateNumbers="1" catenateAll="1"
                splitOnCaseChange="0" splitOnNumerics="1"
                stemEnglishPossessive="1" preserveOriginal="1"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
        <filter class="solr.PorterStemFilterFactory"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
    </fieldType>

I'm guessing this is due to how solr.WhitespaceTokenizerFactory works, and that the characters it is not indexing are removed because they are considered whitespace? If so, how can I add %, &, etc. to this non-indexed list?

I would rather have none of these characters indexed than have some indexed and some not, which confuses my users.

Thanks,
Steve