I don't think it should be 7.2 before we get some natural language
processing, especially if there is public collaboration with the Nutch
community and the folks at
http://opennlp.sourceforge.net/
:-0

Tomi NA wrote:
> On 9/19/06, Gonçalo Gaiolas <[EMAIL PROTECTED]> wrote:
>> Hi everyone!
>>
>>
>>
>> I'm using version 7.2 of Nutch and I'm very happy with it. I want to
>> send a big thumbs up to you guys behind it!
>
> Welcome, our honoured guest from the future! :) 7.2 probably includes
> natural language processing and spawns a great deal of controversy as
> to whether it can be considered "intelligent" or just very good at
> small talk. :)
>
>> Having said that, I'd like to make my users' search experience as good as
>> possible. To do that, I need to solve two little "problems":
>>
>> - Stemming – in my index I have lots of plurals and verb forms
>> that sometimes prevent my users from finding the right results. I've
>> been looking around and it seems that the only stemming implementation
>> available for Nutch is described in the wiki and requires extensive
>> changes to the Nutch code, something I'd like to avoid. Can somebody
>> help me?
>>
>> - Synonyms – OK, I don't really need synonyms. What I need is a way
>> to specify that Image Converter should be equal to ImageConverter, or
>> WebBlock should be the same as web block. How can I do this? This one is
>> really impacting the search quality :-)
>
> I guess you need a different Analyzer. There's a list at
> http://lucene.apache.org/java/docs/api/org/apache/lucene/analysis/Analyzer.html
>
> You could also write your own to best represent the data you have.
>
> Cheers,
> t.n.a.
>
>
>
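
To make the "write your own Analyzer" suggestion above a bit more
concrete, here is a rough plain-Java sketch of the kind of term
normalization such an analyzer might apply. It deliberately does not use
the Lucene or Nutch API: the class and method names (CompoundKeySketch,
words, compoundKey, naiveStem) are made up for illustration, and the
plural stripping is a toy stand-in for a real stemmer, not the approach
described in the wiki. In a real setup the same normalization would have
to be applied at both index time and query time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of Lucene or Nutch.
class CompoundKeySketch {

    // Splits on whitespace and on lower-to-upper case boundaries, then
    // lowercases: "ImageConverter" and "Image Converter" both give
    // [image, converter].
    static List<String> words(String text) {
        List<String> out = new ArrayList<>();
        String spaced = text.replaceAll("(?<=[a-z])(?=[A-Z])", " ");
        for (String w : spaced.trim().split("\\s+")) {
            if (!w.isEmpty()) {
                out.add(w.toLowerCase(Locale.ROOT));
            }
        }
        return out;
    }

    // Joins the words back together so that spaced and CamelCase
    // spellings of the same name produce the same key.
    static String compoundKey(String text) {
        return String.join("", words(text));
    }

    // Assumption: a deliberately naive plural stripper, only to show
    // where stemming would plug in. It is not a real stemming algorithm.
    static String naiveStem(String word) {
        if (word.length() > 3 && word.endsWith("s") && !word.endsWith("ss")) {
            return word.substring(0, word.length() - 1);
        }
        return word;
    }

    public static void main(String[] args) {
        System.out.println(compoundKey("Image Converter")
                .equals(compoundKey("ImageConverter")));
        System.out.println(compoundKey("WebBlock")
                .equals(compoundKey("web block")));
        System.out.println(naiveStem("converters"));
    }
}
```

Indexing both the individual words and the joined key would let a query
for either spelling match, at the cost of a somewhat larger index.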

