Re: [open-tran] Stemming
I've got a question: does it make sense to use the Spanish stemmer for Catalan? How about Portuguese for Galician?

I don't really know about this. I'll forward this message to the Galician free software localization mailing list.

I have never tried to stem Galician with the Portuguese stemmer; however, it should fail for verbs with enclitic particles (amabachellela - amava-che-lhe-la).

Regarding stemmers and CAT tools, Lokalize from trunk (I have not yet tested KDE SC 4.5) stems prior to searching the glossary; that is, it stems the source text before searching for matches in the stemmed version of the glossary (because not everybody uses the glossary for terminology only). Although Mikola initially considered using Snowball, it was finally done through Hunspell's stemmer, thus supporting a wider set of languages.

The results? Well, it is so fresh that possibly most of Lokalize's users have not yet noticed this functionality. I did some testing on a pre-release checkout of the sources and I personally like the result, but...

...but please take into account that by "stemming" Hunspell means doing a reverse spellcheck: for each word, it offers as stems whatever words in the dictionary can be derived into the word in the text, so please expect a lot of false matches. By the way, I find them very useful both to check the quality of the glossary and to pray for this process to use additional information from a POS tagger some day in the near future.

All the best,
Marce Villarino
___
Proxecto mailing list
Proxecto@trasno.net
http://listas.trasno.net/listinfo/proxecto
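The glossary lookup described in this message can be sketched roughly as follows. This is only a toy model under stated assumptions: the lexicon, the suffix rules, the glossary entries and the function names (`stems`, `build_index`, `lookup`) are all hypothetical, and nothing here is Lokalize's or Hunspell's actual code. It stems both the glossary terms and the source text, matches on the stems, and shows how "reverse spellchecking" can offer more than one stem for a word.

```python
import re
from collections import defaultdict

# Hypothetical toy data, not taken from any real Hunspell dictionary.
LEXICON = {"file", "save", "window", "number", "numb"}
SUFFIX_RULES = [("s", ""), ("ing", ""), ("ing", "e"), ("er", ""), ("ed", "")]
GLOSSARY = {"file": "ficheiro", "save": "gardar", "window": "xanela"}


def stems(word):
    """Return every lexicon entry the word could be derived from."""
    word = word.lower()
    found = {word} if word in LEXICON else set()
    for suffix, replacement in SUFFIX_RULES:
        if word.endswith(suffix) and len(word) > len(suffix):
            base = word[: -len(suffix)] + replacement
            if base in LEXICON:
                found.add(base)
    # Fall back to the surface form when nothing in the lexicon matches.
    return sorted(found) or [word]


def build_index(glossary):
    """Map each stem of each glossary term to the terms that produce it."""
    index = defaultdict(set)
    for term in glossary:
        for stem in stems(term):
            index[stem].add(term)
    return index


def lookup(text, glossary, index):
    """Stem each word of the source text and collect matching glossary entries."""
    hits = set()
    for token in re.findall(r"[^\W\d_]+", text):
        for stem in stems(token):
            hits.update(index.get(stem, ()))
    return sorted((term, glossary[term]) for term in hits)


INDEX = build_index(GLOSSARY)
print(lookup("Saving files", GLOSSARY, INDEX))
print(stems("number"))
```

In this toy setup "Saving files" matches the glossary entries for "save" and "file", while "number" gets both "number" and "numb" as candidate stems, which is the kind of false match mentioned above.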
Re: [open-tran] Stemming
Actually, my gut feeling is that Hunspell's stemmer does not behave like that. In the tests I ran, the stemmer ($ hunspell -s) identifies a lemma only when the word has a single suffix (and fails to identify it when suffixes are applied recursively in two levels, as is the case with enclitic pronouns):

  amabas
  amabas amar
  amábachellela
  amábachellela
  trouxo
  trouxo traer
  tróuxocho
  tróuxocho
  pedras
  pedras pedra

It gets nominal and verbal inflection right, but it misses enclitic pronouns. Maybe in future versions of Hunspell...
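The difference between stripping one suffix and stripping two levels of suffixes can be illustrated with a toy model. Everything in it is an assumption made for illustration: the lexicon, the suffix rules, the list of enclitic endings (copied from the example words in this thread) and the function names are hypothetical, and it is neither Hunspell's algorithm nor a real description of Galician morphology.

```python
import unicodedata

# Hypothetical toy data for illustration only; not real Hunspell affix rules.
LEXICON = {"amar", "traer", "pedra"}
IRREGULAR = {"trouxo": "traer"}          # irregular forms listed explicitly
SUFFIX_RULES = [("abas", "ar"), ("aba", "ar"), ("s", "")]
CLITIC_ENDINGS = ["chellela", "cho"]     # taken from the example words


def strip_acute(word):
    """Remove acute accents, which these forms gain when enclitics attach."""
    decomposed = unicodedata.normalize("NFD", word)
    kept = "".join(c for c in decomposed if c != "\u0301")
    return unicodedata.normalize("NFC", kept)


def stem_once(word):
    """Single-level stemming: strip at most one inflectional suffix."""
    if word in IRREGULAR:
        return IRREGULAR[word]
    for suffix, replacement in SUFFIX_RULES:
        if word.endswith(suffix):
            base = word[: -len(suffix)] + replacement
            if base in LEXICON:
                return base
    return None


def stem_two_levels(word):
    """Try single-level stemming, then retry after removing an enclitic ending."""
    direct = stem_once(word)
    if direct is not None:
        return direct
    for ending in CLITIC_ENDINGS:
        if word.endswith(ending):
            host = strip_acute(word[: -len(ending)])
            stem = stem_once(host)
            if stem is not None:
                return stem
    return None


for form in ["amabas", "amábachellela", "trouxo", "tróuxocho", "pedras"]:
    print(form, stem_once(form), stem_two_levels(form))
```

With these toy rules the single-level function finds a lemma for amabas, trouxo and pedras but returns nothing for the two forms with enclitics, while the two-level variant recovers amar and traer for them as well.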
Incomplete KDE translations in Ubuntu
Hi guys,

Looking through Ubuntu, I see that there are a lot of KDE packages that are not complete:

https://translations.edge.launchpad.net/ubuntu/maverick/+lang/gl/+index?start=300batch=50

(and the previous and following pages), and I wonder whether there is some problem that keeps them from being imported correctly, or whether your group simply cannot complete them.

In GNOME I also have quite a few that have not been imported even though they are translated upstream. In any case, I expect that they will all be imported once the official, final version of GNOME 2.32 is released. Is the same thing happening in KDE?

Regards
Re: [open-tran] Stemming
2010/8/24, Miguel Solla brado...@gmail.com:
> Actually, my gut feeling is that Hunspell's stemmer does not behave like that.

Quite possibly.

> Maybe in future versions of Hunspell...

Not unless someone files an RFE.