On Thu, 2007-09-20 at 11:13 -0700, Lance Norskog wrote:
> English and French are messy, so heuristic methods are the only option.
> Spanish is rigorously clean, and stemming should be done from the declension
> rules and irregular conjugation tables. This involves large (fast) tables in
> RAM rather than small (slow) string-shuffling.
> 

Interesting. Do you have a link to some documentation on how to implement this?
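
To make the question concrete, here is a rough sketch of how I read the table-driven idea quoted above: look the word up in an in-memory table of irregular forms first, and only fall back to regular ending rules when the table has no entry. Everything in it is an assumption for illustration: the function name stem_es, the handful of table entries, the ending list and the minimum stem length are made up, and none of it comes from Solr, Lucene or Snowball.

```python
# Hypothetical sketch of table-first stemming; not a real Spanish stemmer.

# Irregular verb forms mapped straight to the stem that the regular forms
# of the same verb produce. A real table would hold thousands of entries.
IRREGULAR_FORMS = {
    "tengo": "ten",   # tener
    "tuvo": "ten",    # tener
    "hago": "hac",    # hacer
    "hizo": "hac",    # hacer
}

# Regular endings, checked in order, longer endings first.
REGULAR_ENDINGS = (
    "iendo", "amos", "emos", "imos", "ando",
    "ar", "er", "ir", "as", "es", "an", "en",
    "o", "a", "e",
)

MIN_STEM_LENGTH = 3  # assumed guard so short words are left alone


def stem_es(word: str) -> str:
    """Return a stem for `word`: table lookup first, ending rules second."""
    w = word.lower()
    # 1. Constant-time lookup in the irregular table.
    if w in IRREGULAR_FORMS:
        return IRREGULAR_FORMS[w]
    # 2. Fall back to stripping the first matching regular ending.
    for ending in REGULAR_ENDINGS:
        if w.endswith(ending) and len(w) - len(ending) >= MIN_STEM_LENGTH:
            return w[: -len(ending)]
    # 3. Nothing matched: leave the word unchanged.
    return w


if __name__ == "__main__":
    for form in ("tengo", "tuvo", "tenemos", "tener", "hizo", "hacer"):
        print(form, "->", stem_es(form))
```

The point of the design is that irregular forms such as "tuvo" never touch the string rules at all, so they land on the same index term as "tener" and "tenemos" without any special-case suffix logic.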

salu2

> Lance Norskog
> 
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of
> Bertrand Delacretaz
> Sent: Thursday, September 20, 2007 8:11 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Strange behavior when searching with accents
> 
> On 9/20/07, Thorsten Scherler <[EMAIL PROTECTED]>
> wrote:
> > ...Bertrand, does the French Snowball work fine?...
> 
> I've seen some weirdnesses, like "tennis" and "tenir" (means to hold) both
> stemmed to "ten", but in all of our (simple) tests it was ok.
> 
> The application where we're using it does not require high precision though,
> so it looked good enough and we didn't create very extensive tests for
> it.
> 
> -Bertrand
> 
-- 
Thorsten Scherler                                 thorsten.at.apache.org
Open Source Java                      consulting, training and solutions
