Thanks for the info! Is the way to approach it just to call
PorterStemFilter in NutchDocumentAnalysis.java? Something
like this:

 /** Analyzer used to index textual content. */
 private static class ContentAnalyzer extends Analyzer {
   /** Constructs a {@link NutchDocumentTokenizer}. */
   public TokenStream tokenStream(String field, Reader reader) {
     TokenStream ts = CommonGrams.getFilter(new NutchDocumentTokenizer(reader), field);
     return new PorterStemFilter(ts);
   }
 }

Am I completely off-base?
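
To make the goal concrete, here is a toy, self-contained sketch of the
behaviour I'm after. It uses no Lucene or Nutch classes: StemSketch,
toyStem and matches are made-up names, and the "strip a trailing s" rule
is only a stand-in, not the Porter algorithm. It also assumes the same
stemming is applied to the query term as to the indexed tokens, which is
my guess at what would be needed for "kitten" to find "kittens".

```java
// Toy illustration only: not the Porter algorithm, not Nutch or Lucene code.
class StemSketch {

    /** Hypothetical stand-in stemmer: lowercases and strips one trailing "s". */
    static String toyStem(String token) {
        String t = token.toLowerCase();
        // Leave short words and words ending in "ss" (e.g. "class") alone.
        if (t.length() > 3 && t.endsWith("s") && !t.endsWith("ss")) {
            return t.substring(0, t.length() - 1);
        }
        return t;
    }

    /**
     * Returns true if any whitespace-separated token of the document equals
     * the query term. When stem is true, both sides go through toyStem.
     */
    static boolean matches(String document, String query, boolean stem) {
        String q = stem ? toyStem(query) : query.toLowerCase();
        for (String token : document.split("\\s+")) {
            String t = stem ? toyStem(token) : token.toLowerCase();
            if (t.equals(q)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String doc = "two kittens asleep";
        System.out.println(matches(doc, "kitten", false)); // no stemming
        System.out.println(matches(doc, "kitten", true));  // stemming on both sides
    }
}
```

If only the indexed tokens were stemmed, a query containing an inflected
form such as "kittens" would be compared against the stemmed "kitten"
and miss, which is why the sketch runs both sides through the same function.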

Howie

From: Andy Liu <[EMAIL PROTECTED]>

There are a couple that have been developed for Lucene.  You'd have to
modify the Nutch code to use your new stemming analyzer.

On 6/8/05, Jérôme Charron <[EMAIL PROTECTED]> wrote:
> > It seems that stemming is not working for me in Nutch. If a document
> > has the word "kittens" in it, when I search for "kitten" it is not
> > being returned. Is there something I need to do to enable or install
> > support for stemming in English?
>
> As far as I know, the Nutch Analyzer does not perform stemming.
> I planned for the next release to write a proposal for integrating
> multi-language analyzers in Nutch (like in Lucene).
> But for now, as far as I know, nothing has been done in this area.
>
> Jerome
>
> --
> http://motrech.free.fr/
> http://frutch.free.fr/
>
>

