Actually, my gut feeling is that the hunspell stemmer does not behave that
way. In the tests I ran, the stemmer ($ hunspell -s) identifies a lemma only
when the word carries a single suffix; it does not identify it when suffixes
are stacked recursively, as happens with enclitic pronouns:

amabas
amabas amar

amábachellela
amábachellela

trouxo
trouxo traer

tróuxocho
tróuxocho

pedras
pedras pedra

It gets nominal and verbal inflection right, but it fails on enclitic pronouns.
Maybe in future versions of hunspell...
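
One possible workaround, sketched here only as an illustration: peel the
enclitic pronouns off the end of the word first, and only then ask the
stemmer. The Python below is a toy under loud assumptions. ENCLITICS is a
hypothetical, incomplete list; TOY_STEMS is a small table standing in for
the `hunspell -s` results shown above (the "amaba" entry is assumed, it is
not in my tests); and dropping every acute accent is a simplification of
the real accentuation rules. None of this is how hunspell works internally.

```python
import unicodedata

# Hypothetical, incomplete list of Galician enclitic forms (assumption).
ENCLITICS = ["cho", "lle", "che", "la", "lo", "me", "te", "o", "a"]

# Toy stand-in for single-suffix stemming, mirroring the tests above.
TOY_STEMS = {
    "amabas": "amar",
    "trouxo": "traer",
    "pedras": "pedra",
    "amaba": "amar",  # assumed entry, not part of the tests above
}


def strip_acute(word):
    """Remove acute accents, e.g. the one added when enclitics are attached."""
    decomposed = unicodedata.normalize("NFD", word)
    return unicodedata.normalize("NFC", decomposed.replace("\u0301", ""))


def candidates(word):
    """Yield the word itself, then every form left after peeling enclitics."""
    yield word
    for clitic in ENCLITICS:
        # Keep at least three characters so short words are not eaten up.
        if word.endswith(clitic) and len(word) > len(clitic) + 2:
            yield from candidates(word[: -len(clitic)])


def stem(word):
    """Return a lemma from the toy table, or None if nothing matches."""
    for form in candidates(word):
        for key in (form, strip_acute(form)):
            if key in TOY_STEMS:
                return TOY_STEMS[key]
    return None


if __name__ == "__main__":
    for w in ["amabas", "amábachellela", "trouxo", "tróuxocho", "pedras"]:
        print(w, stem(w))
```

With a real dictionary, blind stripping like this would also produce false
matches (many ordinary words end in "o", "a" or "la"), so it would need some
way to filter the candidates.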


2010/8/24 mvillarino <mvillar...@gmail.com>

> >> I've got a question: does it make sense to use Spanish stemmer for
> >> Catalan?  How about Portuguese for Galician?
> >
> > I don't really know about this. I'll forward this message to the
> > galician free software localization mailing list.
>
> I have never tried to stem Galician with the Portuguese stemmer;
> however, it should fail for verbs with enclitic particles
> (amabachellela -> amava-che-lhe-la)
>
> Regarding stemmers and CAT tools, Lokalize from trunk (I have not yet
> tested KDE SC 4.5) stems prior to searching the glossary; that is, it
> stems the source text before searching for matches in the stemmed
> version of the glossary (because not everybody uses the glossary for
> terminology only). Although Mikola initially considered using
> snowball, it has finally been done through Hunspell's stemmer, thus
> supporting a wider set of languages.
> The results? Well, it is so fresh that most of Lokalize's users have
> possibly not yet noticed this functionality. I did some testing on a
> pre-release checkout of the sources and I personally like the result,
> but...
> ...but please take into account that by "stemming" hunspell means
> doing a "reverse spellchecking": for each word it offers as stems
> whatever words in the dictionary can be derived into the word in the
> text, so please expect a lot of "false matches". By the way, I find
> them very useful, both to check the quality of the glossary and to
> pray for this process to use additional information from a POS
> tagger some day in the near future.
>
> All the best,
> Marce Villarino
> _______________________________________________
> Proxecto mailing list
> Proxecto@trasno.net
> http://listas.trasno.net/listinfo/proxecto
>