The fieldType definition is a tad on the longer side:

<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"
            catenateWords="1" catenateNumbers="1" generateNumberParts="1"
            splitOnCaseChange="1" generateWordParts="1" catenateAll="0"
            preserveOriginal="1" splitOnNumerics="0" />
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SynonymFilterFactory"
            synonyms="german/synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.DictionaryCompoundWordTokenFilterFactory"
            dictionary="german/german-common-nouns.txt" minWordSize="5"
            minSubwordSize="4" maxSubwordSize="15" onlyLongestMatch="true" />
    <filter class="solr.StopFilterFactory"
            words="german/stopwords.txt" ignoreCase="true"
            enablePositionIncrements="true"/>
    <filter class="solr.SnowballPorterFilterFactory"
            language="German2" protected="german/protwords.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"
            catenateWords="0" catenateNumbers="0" generateWordParts="1"
            splitOnCaseChange="1" generateNumberParts="1" catenateAll="0"
            preserveOriginal="1" splitOnNumerics="0" />
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory"
            words="german/stopwords.txt" ignoreCase="true"
            enablePositionIncrements="true"/>
    <filter class="solr.SnowballPorterFilterFactory"
            language="German2" protected="german/protwords.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>

Thank you for taking a look.

2014-01-29 Jack Krupansky <j...@basetechnology.com>

> What field type and analyzer/tokenizer are you using?
>
> -- Jack Krupansky
>
> -----Original Message-----
> From: Thomas Michael Engelke
> Sent: Wednesday, January 29, 2014 10:45 AM
> To: solr-user@lucene.apache.org
> Subject: Not finding part of fulltext field when word ends in dot
>
> Hello everybody,
>
> we have a legacy solr installation in version 3.6.0.1.
> One of the indices defines a field named "content" as a fulltext field
> where a product description will reside. One of the records indexed
> contains the following data (excerpt):
>
>   z. B. in der Serie 26KA.
>
> I had the problem that searching for the value "26KA" didn't find
> anything. Using the analysis page of the administrative interface, with
> the full text as the field value and "26KA" as the query string, I can see
> how the search string is transformed by the filter factories in use. The
> WordDelimiterFilterFactory transforms the "26KA." into "26KA", which is
> displayed like this (excerpt):
>
>   73    74    75      76
>   in    der   Serie   26KA.
>                       26KA
>
> It seems that it stripped the dot from "26KA.". Using the option to
> highlight matches, an analysis search for "26KA" shows that the lower of
> the two entries matches (after reaching the LowerCaseFilterFactory).
> However, querying the index using the query interface doesn't show any
> matches.
>
> I discovered that adding an asterisk to the search seems to work, as does
> adding the dot. I am puzzled by this, as I thought that the second added
> entry was the word actually indexed. I've tried looking up the definition
> of the administrative interface, but the documentation only specifies this
> for the latest version, where the display is different and (at least in
> the sample) doesn't show such "duplication".
>
> Can anybody shed some light onto this?
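
For what it's worth, the token display quoted above can be approximated with a small toy model. This is only a sketch, not Lucene's WordDelimiterFilter: the function name analyze and its parameters are made up for illustration, and it models just three steps of the chain (whitespace tokenizing, stripping leading/trailing punctuation while keeping the original as preserveOriginal="1" does, and lowercasing). Synonyms, compound splitting, stop words and stemming are deliberately left out.

```python
import string


def analyze(text, preserve_original=True, start_position=1):
    """Toy analysis chain (hypothetical helper, not a Solr API).

    Returns a list of (position, token) pairs. A token whose edges carry
    punctuation is emitted twice at the same position when
    preserve_original is set: once as-is and once stripped.
    """
    tokens = []
    # Whitespace tokenizing: every chunk gets its own position.
    for offset, raw in enumerate(text.split()):
        position = start_position + offset
        core = raw.strip(string.punctuation)

        variants = []
        if core == raw or preserve_original:
            variants.append(raw)       # unchanged or preserved original
        if core and core != raw:
            variants.append(core)      # punctuation-stripped part

        # Lowercasing happens after the delimiter step, as in the schema.
        for variant in variants:
            tokens.append((position, variant.lower()))
    return tokens


# Index side: the excerpt from the record, numbered like the admin display.
index_tokens = analyze("in der Serie 26KA.", start_position=73)
# Query side: the bare search term.
query_tokens = analyze("26KA")

index_terms = {term for _, term in index_tokens}
print(index_tokens)
print(query_tokens)
print(query_tokens[0][1] in index_terms)
```

In this toy model "26ka." and "26ka" both sit at position 76 and the query term "26ka" is among the indexed terms, which agrees with what the analysis page highlights. The model therefore does not explain the missing hit in the query interface; it only illustrates why two entries appear under one position.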