On Friday 28 March 2008 21:44:29 Leonardo Santagada wrote:
> Well, his examples are in Brazilian Portuguese and not Spanish, and the
> biggest problem is that a Spanish stemmer is not going to work. I
> haven't found a pt_BR stemmer; have I overlooked something?

Try the Snowball Porter filter factory. Each stemming algorithm is specified in a plain text file, so adding a new stemmer to the codebase is fairly easy. The hard part is finding a good specification of the algorithm for Brazilian Portuguese.

A Google search turns up some references to Brazilian Portuguese versions of the Porter algorithm. Perhaps one of them is licensed freely enough to be implemented and distributed as free software.

As a last resort, there is already a Snowball Porter stemmer for Portuguese in the Solr codebase. However, I do not know how well it would adapt to Brazilian Portuguese, as I know zilch about the variant spoken in Portugal.

Best regards,
Christian
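
P.S. For the last-resort option, a field type in schema.xml might look roughly like the sketch below. The field type name "text_pt" and the choice of tokenizer and lowercase filter are placeholders for illustration only, not taken from your setup; the language attribute is what selects the Snowball stemmer.

```xml
<!-- Hypothetical field type; the name "text_pt" is an assumption. -->
<fieldType name="text_pt" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Stock Portuguese Snowball stemmer; not tuned for Brazilian Portuguese. -->
    <filter class="solr.SnowballPorterFilterFactory" language="Portuguese"/>
  </analyzer>
</fieldType>
```

It would be worth running a handful of Brazilian Portuguese words through the analyzer before relying on it, since I cannot say how well the stock stemmer handles them.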