Hi,
What is the effect of the format attribute of StopFilterFactory, e.g.
format="snowball"?
Solr ships with a schema.xml that contains a lot of good examples. The file is in
example/solr/conf/schema.xml and defines a <fieldType> for German text:
<!-- German -->
<fieldType name="text_de" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="lang/stopwords_de.txt" format="snowball"
            enablePositionIncrements="true"/>
    <filter class="solr.GermanNormalizationFilterFactory"/>
    <filter class="solr.GermanLightStemFilterFactory"/>
    <!-- less aggressive: <filter class="solr.GermanMinimalStemFilterFactory"/> -->
    <!-- more aggressive: <filter class="solr.SnowballPorterFilterFactory" language="German2"/> -->
  </analyzer>
</fieldType>
The StopFilterFactory is configured with format="snowball". What is this
attribute for?
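My guess is that it refers to the layout of the stopword file itself: lang/stopwords_de.txt looks like a Snowball-style list, where "|" starts a comment and a line may hold several words. Below is a minimal sketch of how such a file might be read, assuming that layout. The class and method names are made up for illustration and are not Solr or Lucene API.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class SnowballStopwordSketch {

    // Hypothetical helper, not part of Solr or Lucene.
    // Assumes: '|' starts a comment, words are separated by whitespace,
    // and a single line may contain more than one word.
    static Set<String> parse(List<String> lines) {
        Set<String> words = new LinkedHashSet<>();
        for (String line : lines) {
            int bar = line.indexOf('|');
            if (bar >= 0) {
                line = line.substring(0, bar); // drop the trailing comment
            }
            for (String w : line.trim().split("\\s+")) {
                if (!w.isEmpty()) {
                    words.add(w);
                }
            }
        }
        return words;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
            " | German stop words (sample)",
            "aber           |  but",
            "alle allem",
            "");
        System.out.println(parse(sample));
    }
}
```

If that guess is right, a plain one-word-per-line reader would treat "aber           |  but" as a single entry, which would explain why the attribute is needed for this file.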
I grabbed the Solr 4.0-BETA source with Maven and had a look at the classes
StopFilter and StopFilterFactory:
<dependency>
  <groupId>org.apache.solr</groupId>
  <artifactId>solr</artifactId>
  <version>4.0.0-BETA</version>
  <type>java-source</type>
</dependency>
But I cannot find the format attribute being handled anywhere. Am I missing something here?