I have a stopwords.txt file with 1200+ words. I did not understand the point about stemming: do you mean my stopwords are somehow being ignored because of stemming, or something else?

On Sun, Jan 3, 2010 at 3:53 PM, Grant Ingersoll <[email protected]> wrote:

> Are you sure you have stopwords and it is not the result of stemming some
> other word?
>
> On Jan 3, 2010, at 7:57 AM, Bogdan Vatkov wrote:
>
> > my Solr config is like the default one:
> >
> > <field name="msg_body" type="text" termVectors="true" indexed="true"
> >        stored="true"/>
> >
> > <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
> >   <analyzer type="index">
> >     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >     <filter class="solr.StopFilterFactory"
> >             ignoreCase="true"
> >             words="stopwords.txt"
> >             enablePositionIncrements="true"
> >             />
> >     <filter class="solr.WordDelimiterFilterFactory"
> >             generateWordParts="1" generateNumberParts="1" catenateWords="1"
> >             catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
> >     <filter class="solr.LowerCaseFilterFactory"/>
> >     <filter class="solr.SnowballPorterFilterFactory" language="English"
> >             protected="protwords.txt"/>
> >   </analyzer>
> >   <analyzer type="query">
> >     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
> >             ignoreCase="true" expand="true"/>
> >     <filter class="solr.StopFilterFactory"
> >             ignoreCase="true"
> >             words="stopwords.txt"
> >             enablePositionIncrements="true"
> >             />
> >     <filter class="solr.WordDelimiterFilterFactory"
> >             generateWordParts="1" generateNumberParts="1" catenateWords="0"
> >             catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
> >     <filter class="solr.LowerCaseFilterFactory"/>
> >     <filter class="solr.SnowballPorterFilterFactory" language="English"
> >             protected="protwords.txt"/>
> >   </analyzer>
> > </fieldType>

--
Best regards,
Bogdan
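
A minimal sketch of what the stemming question may be pointing at, under loud assumptions: it uses a toy suffix-stripping function in place of SnowballPorterFilterFactory and skips WordDelimiterFilterFactory entirely. In the index analyzer quoted above, StopFilterFactory runs before the stemmer, so a word that is not in stopwords.txt (here "testing") can pass the stop filter and only afterwards be reduced to a term that is in the list ("test"). The names toy_stem, analyze, and STOPWORDS are hypothetical and are not Solr APIs.

```python
def toy_stem(token):
    """Toy stand-in for a stemmer: strips a trailing 'ing' or 's'.

    This is NOT the Snowball algorithm; it only illustrates the idea.
    """
    for suffix in ("ing", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


def analyze(text, stopwords):
    """Mimic the filter ORDER of the quoted index analyzer:
    whitespace tokenizer -> stop filter (ignoreCase) -> lowercase -> stemmer.
    """
    stop = {w.lower() for w in stopwords}
    tokens = text.split()                                   # whitespace tokenizer
    tokens = [t for t in tokens if t.lower() not in stop]   # stop filter first
    tokens = [t.lower() for t in tokens]                    # lowercase filter
    return [toy_stem(t) for t in tokens]                    # stemmer runs last


# Hypothetical stopword list; the real stopwords.txt is not shown in the thread.
STOPWORDS = {"the", "and", "test"}

# "test" itself is removed, but "testing" survives the stop filter and is
# then stemmed down to "test", so the stopword still appears in the output.
print(analyze("The test and testing", STOPWORDS))
```

If this is the cause, the terms showing up in the index would be stems of other words rather than the stopwords themselves. Running a sample input through the field analysis page in the Solr admin interface is one way to check which filter produces a given term.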
