I have no links, but it can all be done with synonym tables.

I'm sure somewhere on the net there are full lists of the Spanish regular
and irregular verbs (irregular verbs being those which do not follow the
conjugation rules). Then, using basic text processing, you could generate
all of the conjugated forms for the most common regular verbs.
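
As a rough illustration of that text-processing step, here is a minimal
Python sketch. It assumes only the present indicative of regular verbs, and
the function names (conjugate_present, synonym_line) are made up for this
example. The output uses the "a, b => c" explicit-mapping syntax of a Solr
synonyms file; a real table would need every tense plus the irregular verbs
taken from a published list.

```python
# Present-indicative endings for regular Spanish verbs, keyed by the
# infinitive ending. Other tenses would need their own ending tables.
PRESENT_ENDINGS = {
    "ar": ["o", "as", "a", "amos", "áis", "an"],
    "er": ["o", "es", "e", "emos", "éis", "en"],
    "ir": ["o", "es", "e", "imos", "ís", "en"],
}


def conjugate_present(infinitive):
    """Return the six present-indicative forms of a regular verb."""
    stem, ending = infinitive[:-2], infinitive[-2:]
    if ending not in PRESENT_ENDINGS:
        raise ValueError("not a Spanish infinitive: " + infinitive)
    return [stem + suffix for suffix in PRESENT_ENDINGS[ending]]


def synonym_line(infinitive):
    """Build one synonyms-file line mapping each form to the infinitive."""
    forms = conjugate_present(infinitive)
    return ", ".join(forms) + " => " + infinitive


if __name__ == "__main__":
    for verb in ["hablar", "comer", "vivir"]:
        print(synonym_line(verb))
```

Running it over a list of regular infinitives gives one mapping line per
verb, which can be appended to the synonym table.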

And then a custom stemmer would do the basics like adjective-mente ->
adjective.
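
A sketch of what the -mente rule might look like, again in Python. To stay
with the table-driven idea, it strips the suffix only when the remainder is
in a table of known adjectives, so words like "clemente" are left alone. The
adjective set and the name strip_mente are hypothetical; a real table would
be loaded from a word list.

```python
# Tiny sample table of adjective forms (hypothetical; a real one would be
# loaded from a full word list). Adverbs in -mente are built on the
# feminine form, so that is what is stored here.
KNOWN_ADJECTIVES = {"rápida", "lenta", "clara", "fácil"}


def strip_mente(word, adjectives=KNOWN_ADJECTIVES):
    """Return the adjective behind an adverb in -mente, or the word unchanged."""
    suffix = "mente"
    if word.endswith(suffix):
        candidate = word[: -len(suffix)]
        # Only strip when the remainder is a known adjective.
        if candidate in adjectives:
            return candidate
    return word


if __name__ == "__main__":
    for w in ["rápidamente", "fácilmente", "clemente"]:
        print(w, "->", strip_mente(w))
```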

Lance

-----Original Message-----
From: Thorsten Scherler [mailto:[EMAIL PROTECTED] 
Sent: Friday, September 21, 2007 12:08 AM
To: solr-user@lucene.apache.org
Subject: RE: Strange behavior when searching with accents

On Thu, 2007-09-20 at 11:13 -0700, Lance Norskog wrote:
> English and French are messy, so heuristic methods are the only option.
> Spanish is rigorously clean, and stemming should be done from the 
> declension rules and irregular conjugation tables. This involves large 
> (fast) tables in RAM rather than small (slow) string-shuffling.
> 

Interesting, do you have a link to some documentation on how to implement this?

salu2

> Lance Norskog
> 
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf 
> Of Bertrand Delacretaz
> Sent: Thursday, September 20, 2007 8:11 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Strange behavior when searching with accents
> 
> On 9/20/07, Thorsten Scherler 
> <[EMAIL PROTECTED]>
> wrote:
> > ...Bertrand, does the French Snowball work fine?...
> 
> I've seen some weirdnesses, like "tennis" and "tenir" (means to hold) 
> both stemmed to "ten", but in all of our (simple) tests it was ok.
> 
> The application where we're using it does not require high precision 
> though, so it looked good enough and we didn't create very 
> extensive tests for it.
> 
> -Bertrand
> 
-- 
Thorsten Scherler                                 thorsten.at.apache.org
Open Source Java                      consulting, training and solutions
