On Thu, 2007-09-20 at 11:13 -0700, Lance Norskog wrote:
> English and French are messy, so heuristic methods are the only possible
> approach. Spanish is rigorously clean, and stemming should be done from the
> declension rules and irregular conjugation tables. This involves large (fast)
> tables in RAM rather than small (slow) string-shuffling.
>
Interesting, do you have a link to some documentation on how to implement this?

salu2

> Lance Norskog
>
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of
> Bertrand Delacretaz
> Sent: Thursday, September 20, 2007 8:11 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Strange behavior when searching with accents
>
> On 9/20/07, Thorsten Scherler <[EMAIL PROTECTED]>
> wrote:
> > ...Bertrand, does the French Snowball work fine?...
>
> I've seen some weirdnesses, like "tennis" and "tenir" (means to hold) both
> stemmed to "ten", but in all of our (simple) tests it was ok.
>
> The application where we're using it does not require high precision though,
> so it looked good enough and we didn't create very extensive tests for
> it.
>
> -Bertrand
>
--
Thorsten Scherler thorsten.at.apache.org
Open Source Java consulting, training and solutions
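
For reference, here is a rough sketch of the table-driven idea Lance describes: look the word up in an in-memory table of irregular forms first, and only fall back to suffix rules for regular words. Everything in it is an assumption made for illustration: the names IRREGULAR_FORMS, REGULAR_SUFFIXES, MIN_STEM_LEN and stem() are hypothetical, the tables hold only a handful of entries, and the suffix list covers just a few -ar verb endings. It is not the Snowball algorithm and not anything shipped with Solr or Lucene.

```python
# Minimal sketch of table-first stemming for Spanish (illustrative only).
# All names and table contents below are hypothetical; a real system would
# load complete conjugation tables and declension rules from a lexicon.

# Irregular forms map directly to a canonical form (here, the infinitive).
IRREGULAR_FORMS = {
    "tengo": "tener", "tienes": "tener", "tiene": "tener", "tienen": "tener",
    "voy": "ir", "vas": "ir", "va": "ir", "vamos": "ir", "van": "ir",
    "soy": "ser", "eres": "ser", "somos": "ser", "son": "ser",
}

# A few regular -ar endings, tried longest first so "amos" wins over "os"-like
# shorter matches. This is nowhere near a complete rule set.
REGULAR_SUFFIXES = sorted(
    ["amos", "ando", "aron", "aba", "an", "as", "ar", "a", "o"],
    key=len,
    reverse=True,
)

# Do not strip a suffix if fewer than this many characters would remain.
MIN_STEM_LEN = 3


def stem(word):
    """Return a normalized token: table hit first, suffix rules second."""
    w = word.lower()
    # 1. Constant-time dictionary lookup for irregular forms.
    if w in IRREGULAR_FORMS:
        return IRREGULAR_FORMS[w]
    # 2. Fall back to regular suffix stripping.
    for suffix in REGULAR_SUFFIXES:
        if w.endswith(suffix) and len(w) - len(suffix) >= MIN_STEM_LEN:
            return w[: -len(suffix)]
    # 3. Nothing matched: leave the word unchanged.
    return w


if __name__ == "__main__":
    for w in ["tengo", "Tienen", "hablamos", "hablar", "hablo", "voy"]:
        print(w, "->", stem(w))
```

The trade-off is the one from the quoted mail: the dictionary lookup costs memory but is a single hash probe, while the suffix loop is small but does repeated string comparisons. Note that the table returns an infinitive and the suffix rules return a bare stem; for indexing, what matters is only that all forms of one word end up as the same token.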