Re: [solr 4.7.0] analysis page: issue with HTMLStripCharFilterFactory

Stefan Matheis Sun, 16 Mar 2014 13:39:07 -0700

Hey Dmitry 

We had a similar issue reported and already fixed: 
https://issues.apache.org/jira/browse/SOLR-5800
I suspect this patch fixes your issue too. I would like to hear back from
you if that's the case :)
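
In case it helps to narrow things down first, a stripped-down field type that
keeps only the char filter and the tokenizer from your index analyzer might
show whether HTMLStripCharFilterFactory alone is enough to trigger the problem
on the analysis page. This is only a sketch: the name "text_html_min" is made
up for illustration, and I am assuming the filters after the tokenizer are not
involved.

```xml
<!-- Hypothetical reduced test case; "text_html_min" is an invented name. -->
<fieldType name="text_html_min" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Same char filter and tokenizer as the index analyzer in the report -->
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
  </analyzer>
</fieldType>
```

If the analysis page misbehaves with this type as well, and behaves once the
charFilter line is removed, that would point at the char filter handling in
the UI rather than at the rest of the chain.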


-Stefan 


On Saturday, March 15, 2014 at 6:58 PM, Dmitry Kan wrote:

> Hello,
> 
> The following type does not get analyzed properly on the solr 4.7.0
> analysis page:
> 
> <fieldType name="text_en_splitting" class="solr.TextField"
>            positionIncrementGap="100" autoGeneratePhraseQueries="true">
>   <analyzer type="index">
>     <charFilter class="solr.HTMLStripCharFilterFactory"/>
>     <!-- <tokenizer class="solr.WhitespaceTokenizerFactory"/> -->
>     <tokenizer class="solr.StandardTokenizerFactory"/>
>     <filter class="solr.StopFilterFactory"
>             ignoreCase="true"
>             words="lang/stopwords_en.txt"/>
>     <filter class="solr.WordDelimiterFilterFactory"
>             generateWordParts="1" generateNumberParts="1" catenateWords="1"
>             catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.KeywordMarkerFilterFactory"
>             protected="protwords.txt"/>
>     <filter class="solr.PorterStemFilterFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.StandardTokenizerFactory"/>
>     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>             ignoreCase="true" expand="true"/>
>     <filter class="solr.StopFilterFactory"
>             ignoreCase="true"
>             words="lang/stopwords_en.txt"/>
>     <filter class="solr.WordDelimiterFilterFactory"
>             generateWordParts="1" generateNumberParts="1" catenateWords="0"
>             catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.KeywordMarkerFilterFactory"
>             protected="protwords.txt"/>
>     <filter class="solr.PorterStemFilterFactory"/>
>   </analyzer>
> </fieldType>
> 
> Example text:
> fox jumps
> 
> Screenshot:
> http://pbrd.co/1lEVEIa
> 
> This works fine in solr 4.6.1.
> 
> -- 
> Dmitry
> Blog: http://dmitrykan.blogspot.com
> Twitter: http://twitter.com/dmitrykan
> 
>
