On Friday 28 March 2008 21:44:29 Leonardo Santagada wrote:
> Well his examples are in Brazilian Portuguese and not Spanish and the
> biggest problem is that a Spanish stemmer is not going to work. I
> haven't found a pt_BR stemmer, have I overlooked something?

Try the Snowball Porter filter factory (solr.SnowballPorterFilterFactory).
The algorithm is specified in plain text files, so adding new stemmers to
the codebase is pretty easy. The hard part is finding a good specification
of the algorithm for Brazilian Portuguese.

A Google search reveals some references to Brazilian Portuguese versions of 
the Porter algorithm. Maybe one of these is suitably unencumbered for 
implementation and distribution as free software.

As a last resort, there already is a Snowball Porter stemmer for Portuguese in 
the Solr codebase. However, I do not know how suitable it would be for 
adaptation to Brazilian Portuguese, as I know zilch about the variant spoken 
in Portugal.
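
If you want to try the existing Portuguese stemmer, a field type along these
lines in schema.xml should be enough to experiment with. This is only a
sketch: the field type name "text_pt" and the choice of tokenizer and
lower-case filter are my assumptions for illustration, and I have not tested
how well the result handles Brazilian Portuguese text. The relevant line is
the SnowballPorterFilterFactory filter with language="Portuguese".

```xml
<!-- Hypothetical field type; the name "text_pt" is made up for this sketch. -->
<fieldType name="text_pt" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Tokenizer and lower-casing are assumed defaults, adjust as needed. -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Selects the Snowball stemmer for (European) Portuguese. -->
    <filter class="solr.SnowballPorterFilterFactory" language="Portuguese"/>
  </analyzer>
</fieldType>
```

Running a few Brazilian Portuguese terms through the analysis page of the
Solr admin interface with this field type should show quickly whether the
stems it produces are acceptable.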

Best regards
- Christian
