Hey Dmitry,

We had a similar issue reported, and it has already been fixed: https://issues.apache.org/jira/browse/SOLR-5800

I suspect that this patch fixes your issue too. I would like to hear back from you if that's the case :)

-Stefan

On Saturday, March 15, 2014 at 6:58 PM, Dmitry Kan wrote:

> Hello,
>
> The following type does not get analyzed properly on the solr 4.7.0
> analysis page:
>
> <fieldType name="text_en_splitting" class="solr.TextField"
>            positionIncrementGap="100" autoGeneratePhraseQueries="true">
>   <analyzer type="index">
>     <charFilter class="solr.HTMLStripCharFilterFactory"/>
>     <!-- <tokenizer class="solr.WhitespaceTokenizerFactory"/> -->
>     <tokenizer class="solr.StandardTokenizerFactory" />
>     <filter class="solr.StopFilterFactory"
>             ignoreCase="true"
>             words="lang/stopwords_en.txt"
>     />
>     <filter class="solr.WordDelimiterFilterFactory"
>             generateWordParts="1" generateNumberParts="1" catenateWords="1"
>             catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.KeywordMarkerFilterFactory"
>             protected="protwords.txt"/>
>     <filter class="solr.PorterStemFilterFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.StandardTokenizerFactory" />
>     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>             ignoreCase="true" expand="true"/>
>     <filter class="solr.StopFilterFactory"
>             ignoreCase="true"
>             words="lang/stopwords_en.txt"
>     />
>     <filter class="solr.WordDelimiterFilterFactory"
>             generateWordParts="1" generateNumberParts="1" catenateWords="0"
>             catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.KeywordMarkerFilterFactory"
>             protected="protwords.txt"/>
>     <filter class="solr.PorterStemFilterFactory"/>
>   </analyzer>
> </fieldType>
>
> Example text:
> fox jumps
>
> Screenshot:
> http://pbrd.co/1lEVEIa
>
> This works fine in solr 4.6.1.
>
> --
> Dmitry
> Blog: http://dmitrykan.blogspot.com
> Twitter: http://twitter.com/dmitrykan