On 9/19/06, Gonçalo Gaiolas <[EMAIL PROTECTED]> wrote:
Hi everyone!

I'm using version 7.2 of Nutch and I'm very happy with it. I want to send a
big thumbs up to you guys behind it!

Welcome, our honoured guest from the future! :) 7.2 probably includes
natural language processing and spawns a great deal of controversy as
to whether it can be considered "intelligent" or just very good at
small talk. :)

Having said that, I'd like to make my users' search experience as good as
possible. To do that, I need to solve two little "problems":

- Stemming – in my index I have lots of plurals and verb forms that
sometimes prevent my users from finding the right results. I've been
looking around and it seems that the only stemming implementation available
for Nutch is described in the wiki and requires extensive changes to the Nutch
code, something I'd like to avoid. Can somebody help me?

- Synonyms – OK, I don't really need synonyms. What I need is a way
to specify that "Image Converter" should be equal to "ImageConverter", or
that "WebBlock" should be the same as "web block". How can I do this? This one is
really impacting the search quality :-)

I guess you need a different Analyzer for both problems. The known
implementations are listed at
http://lucene.apache.org/java/docs/api/org/apache/lucene/analysis/Analyzer.html
You could also write your own to best represent the data you have.
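
To make the "write your own" idea a bit more concrete, here is a rough
plain-Java sketch of the kind of normalization such an analyzer would have to
apply, both when indexing and when parsing queries. It deliberately does not
use the Lucene API: TermNormalizer, tokens and stem are made-up names, and the
plural stripping is only a toy stand-in for a real stemmer (Porter's, for
instance), so treat it as an illustration rather than something to drop in.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of Nutch or Lucene: it only illustrates the
// token normalization a custom analyzer could apply at index and query time.
final class TermNormalizer {

    // Split on non-alphanumerics and on lower-to-upper case changes, then
    // lowercase, so "ImageConverter" and "Image Converter" give equal tokens.
    static List<String> tokens(String text) {
        List<String> out = new ArrayList<>();
        // Insert a space at every lowercase/uppercase boundary (camel case).
        String spaced = text.replaceAll("(?<=[a-z])(?=[A-Z])", " ");
        for (String raw : spaced.split("[^A-Za-z0-9]+")) {
            if (!raw.isEmpty()) {
                out.add(stem(raw.toLowerCase(Locale.ROOT)));
            }
        }
        return out;
    }

    // Toy plural stripping only (an assumption for this sketch); a real
    // stemmer covers verb forms and many more suffixes.
    static String stem(String word) {
        if (word.length() > 3 && word.endsWith("ies")) {
            return word.substring(0, word.length() - 3) + "y";
        }
        if (word.length() > 3 && word.endsWith("s") && !word.endsWith("ss")) {
            return word.substring(0, word.length() - 1);
        }
        return word;
    }
}
```

With this, TermNormalizer.tokens("WebBlock") and
TermNormalizer.tokens("web block") produce the same list, and a plural such as
"converters" is reduced to "converter". The important design point is that the
same normalization runs on both documents and queries; otherwise the two sides
will not match.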

Cheers,
t.n.a.
