>> I've got a question: does it make sense to use the Spanish stemmer
>> for Catalan?  How about the Portuguese one for Galician?
>
> I don't really know about this. I'll forward this message to the
> Galician free software localization mailing list.

I have never tried to stem Galician with the Portuguese stemmer, but I
would expect it to fail for verbs with enclitic particles
(amabachellela -> amava-che-lhe-la).

Regarding stemmers and CAT tools: Lokalize from trunk (I have not yet
tested KDE SC 4.5) stems before searching the glossary. That is, it
stems the source text and then looks for matches in a stemmed version
of the glossary (because not everybody uses the glossary for
terminology only). Although Mikola initially considered using
Snowball, it was finally done with Hunspell's stemmer, which supports
a wider set of languages.
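
To make the idea concrete, here is a minimal sketch of a stemmed
glossary lookup. It is not Lokalize's code: toy_stem is a hypothetical
stand-in for a real stemmer, and the glossary entries are made up for
illustration.

```python
from collections import defaultdict


def toy_stem(word):
    """Hypothetical stand-in for a real stemmer: strips two English suffixes."""
    word = word.lower()
    for suffix in ("ing", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_stemmed_index(glossary):
    """Map each glossary term, stemmed word by word, to its original entries."""
    index = defaultdict(list)
    for term, translation in glossary.items():
        key = tuple(toy_stem(w) for w in term.split())
        index[key].append((term, translation))
    return index


def find_matches(source_text, index):
    """Stem the source text, then look for stemmed glossary terms inside it."""
    stems = [toy_stem(w) for w in source_text.split()]
    matches = []
    for key, entries in index.items():
        n = len(key)
        for i in range(len(stems) - n + 1):
            if tuple(stems[i:i + n]) == key:
                matches.extend(entries)
                break
    return matches


# Made-up glossary, for illustration only.
glossary = {"file": "ficheiro", "search window": "xanela de busca"}
index = build_stemmed_index(glossary)
print(find_matches("Searching windows for files", index))
```

Because both sides are stemmed, "Searching windows" still finds the
entry "search window" even though neither word matches literally.
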
The results? Well, it is so new that most Lokalize users have probably
not noticed this functionality yet. I did some testing on a
pre-release checkout of the sources and I personally like the result,
but...
...but please take into account that by "stemming" Hunspell means
doing a "reverse spellchecking": for each word in the text it offers
as stems whatever dictionary words can be derived into that word, so
please expect a lot of "false matches". By the way, I find them very
useful, both to check the quality of the glossary and as a reason to
hope that this process will use additional information from a POS
tagger some day in the near future.
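
A toy illustration of this "reverse spellchecking", and of why it
yields more than one stem for a word. This is only a sketch: the rules
and the dictionary below are invented for the example, and they are
neither Hunspell's real affix format nor its API.

```python
# Toy affix rules: flag -> (text stripped from the stem, text added in its place).
RULES = {
    "S": ("", "s"),      # leave -> leaves
    "V": ("f", "ves"),   # leaf  -> leaves
}

# Toy dictionary: stem -> set of rule flags that may be applied to it.
DICTIONARY = {"leaf": {"V"}, "leave": {"S"}, "file": {"S"}}


def candidate_stems(word):
    """Return every dictionary word from which `word` can be derived."""
    word = word.lower()
    stems = set()
    if word in DICTIONARY:
        stems.add(word)
    for flag, (strip, add) in RULES.items():
        if not word.endswith(add):
            continue
        # Undo the rule: remove what it added, restore what it stripped.
        base = word[: len(word) - len(add)] + strip
        if flag in DICTIONARY.get(base, set()):
            stems.add(base)
    return sorted(stems)


print(candidate_stems("leaves"))  # ['leaf', 'leave']
print(candidate_stems("files"))   # ['file']
```

Without part-of-speech information there is no way to tell whether
"leaves" is the noun or the verb, so both stems are offered, and one
of them will be a false match in any given sentence.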

All the best,
Marce Villarino
_______________________________________________
Proxecto mailing list
Proxecto@trasno.net
http://listas.trasno.net/listinfo/proxecto
