Thanks for the info! Would the right approach be to wrap the token
stream in a PorterStemFilter in NutchDocumentAnalyzer.java? Something
like this:
  /** Analyzer used to index textual content. */
  private static class ContentAnalyzer extends Analyzer {
    /** Constructs a {@link NutchDocumentTokenizer}. */
    public TokenStream tokenStream(String field, Reader reader) {
      TokenStream ts =
        CommonGrams.getFilter(new NutchDocumentTokenizer(reader), field);
      // Wrap the existing token stream so each token is stemmed.
      return new PorterStemFilter(ts);
    }
  }
Am I completely off-base?
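One thing worth keeping in mind with this approach: whatever stemming is
applied at index time would presumably also have to be applied to the query
terms, or a search for "kittens" would miss a document indexed under
"kitten". Below is a minimal, self-contained sketch of that point. It does
not use Nutch or Lucene at all; StemSketch, stem, analyze and matches are
hypothetical names, and the one-rule plural stripper is only a stand-in for a
real stemmer, not the Porter algorithm.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Toy illustration only: NOT the Porter algorithm and NOT Nutch or Lucene code. */
public class StemSketch {

    /** Hypothetical stand-in for a stemmer: strips one trailing "s" from longer words. */
    public static String stem(String token) {
        String t = token.toLowerCase(Locale.ROOT);
        if (t.length() > 3 && t.endsWith("s") && !t.endsWith("ss")) {
            return t.substring(0, t.length() - 1);
        }
        return t;
    }

    /** Splits on non-letters, lowercases, and optionally stems each token. */
    public static List<String> analyze(String text, boolean stemming) {
        List<String> out = new ArrayList<>();
        for (String raw : text.split("[^A-Za-z]+")) {
            if (raw.isEmpty()) {
                continue; // split() can yield an empty leading element
            }
            out.add(stemming ? stem(raw) : raw.toLowerCase(Locale.ROOT));
        }
        return out;
    }

    /** True if every analyzed query term occurs among the analyzed document terms. */
    public static boolean matches(String doc, String query,
                                  boolean stemIndex, boolean stemQuery) {
        Set<String> indexed = new HashSet<>(analyze(doc, stemIndex));
        return indexed.containsAll(analyze(query, stemQuery));
    }

    public static void main(String[] args) {
        String doc = "Two kittens sleeping";
        // No stemming on either side: "kitten" does not match "kittens".
        System.out.println(matches(doc, "kitten", false, false));
        // Stemming on both sides: both reduce to "kitten".
        System.out.println(matches(doc, "kitten", true, true));
        // Stemming at index time only: the query "kittens" is left unstemmed.
        System.out.println(matches(doc, "kittens", true, false));
    }
}
```

If that holds for Nutch as well, the query side would need a matching
change in addition to ContentAnalyzer.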
Howie
From: Andy Liu <[EMAIL PROTECTED]>
There are a couple of stemming analyzers that have been developed for
Lucene. You'd have to modify the Nutch code to use your new stemming
analyzer.
On 6/8/05, Jérôme Charron <[EMAIL PROTECTED]> wrote:
> > It seems that stemming is not working for me in nutch. If a document
> > has the word "kittens" in it, when I search for "kitten" it is not
> > being returned. Is there something I need to do to enable or install
> > support for stemming in English?
>
> As far as I know, the Nutch Analyzer does not perform stemming.
> I planned to write a proposal for the next release on integrating
> multi-language analyzers into Nutch (as in Lucene).
> But for now, as far as I know, nothing has been done in this area.
>
> Jerome
>
> --
> http://motrech.free.fr/
> http://frutch.free.fr/