Hi Harry,

I re-discovered this thread last week and have made some minor changes to the code (removing deprecation warnings) so that it compiles against trunk. I think it would be quite useful to get this stemmer into Solr once all the legal / licensing issues are resolved. If there are no objections, I'll open a JIRA ticket and upload my changes so we can make sure we're all working with the same code.
cheers,
Piete

On 11/09/2007, Wagner,Harry <[EMAIL PROTECTED]> wrote:
>
> Bill,
> Currently it is a plug-in. Put the lower case filter ahead of kstem,
> just as for porter (example below). You can use it with porter, but I
> can't imagine why you would want to. At least not in the same analyzer.
> Hope this helps.
>
> <fieldtype name="text_kstem" class="solr.TextField">
>   <analyzer type="index">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>             words="stopwords.txt"/>
>     <filter class="solr.WordDelimiterFilterFactory"
>             generateWordParts="1" generateNumberParts="1" catenateWords="1"
>             catenateNumbers="1" catenateAll="0"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="org.oclc.solr.analysis.KStemFilterFactory"
>             cacheSize="20000"/>
>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.SynonymFilterFactory"
>             synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>             words="stopwords.txt"/>
>     <filter class="solr.WordDelimiterFilterFactory"
>             generateWordParts="1" generateNumberParts="1" catenateWords="0"
>             catenateNumbers="0" catenateAll="0"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="org.oclc.solr.analysis.KStemFilterFactory"
>             cacheSize="20000"/>
>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>   </analyzer>
> </fieldtype>
>
> Cheers... harry