> > Just because we "can" add information, does not mean we "should".
>

Yes, I agree. But I think the "material" example that Hèctor raised
(*for instance, as a rule, Catalan preposition "de" is translated as
"de" in French, but if the following word is a material, "en" must be
selected (de fusta > en bois)*) is a good one, where the transfer (an
improved one, for sure) would also benefit from having that information
available.

Message from Francis Tyers <fty...@prompsit.com> on Mon, 15 June 2020 at 18:45:

> On 2020-06-15 17:38, Hèctor Alòs i Font wrote:
> > Here come several practical examples. I tried to select them for
> > their variety. The result is more a wish list than something
> > structured.
> >
> > Let's begin with "je la baise". Depending on the context this may be
> > "I kiss her" or "I fuck her". The context can tell us whether the
> > register is formal or colloquial. In this case anaphora resolution
> > can also help: if the pronoun refers to "hand", it can only be
> > "kiss"; if it refers to a person, the ambiguity remains.
> >
> > Another kind of problem is the Arpitan words "chamô" ("camel"; plural
> > "camels") and "chamôs" ("chamois"; unchanged in plural). So,
> > translating into French, yesterday I got "chamois" in a Bible text
> > from Exodus xD. I solved it by deciding in a CG rule that all
> > "chamôs" (with nothing around them in the singular) are camels.
> > (Similar cases in French: fil/fils, foi/fois, cour/cours)
> >
> > In French there are plenty of words with different meanings depending
> > on the gender: livre, page, tour, etc. The problem is that often the
> > immediate surrounding context does not disambiguate: des livres, les
> > pages, de tour, etc. A similar but slightly different case is the
> > word pairs homicide mf/homicide m, féminicide mf/féminicide m,
> > parricide mf/parricide m, etc.: the one with the gender "mf" is a
> > person and the other is the action.
> >
> > Other problems come in lexical selection. For instance, as a rule,
> > Catalan preposition "de" is translated as "de" in French, but if the
> > following word is a material, "en" must be selected (de fusta > en
> > bois). So in the Catalan2French lrx file we have a list of materials,
> > as we have a list of countries, a list of musical instruments, a list
> > of animals, etc. I dream about a monolingual dictionary where we
> > could get this kind of information. It makes no sense to repeat these
> > lists for every language pair that uses Catalan. This information
> > should be in apertium-cat and not in every apertium-cat-xxx lrx file.
> >
> > Moreover, if we had words not only with different kinds of semantic
> > labels, but also marked as synonyms, maybe it'd be possible to give a
> > translation using a word labeled as a synonym (if it has a
> > translation) instead of "unknown".
> >
>
> These are excellent examples, I'm just about to go out, but will
> address them when I get back. Thanks for the ideas.
>
> Note that my suggestion was to include this information
> in the monolingual packages.
>
> Fran
>

--
< Xavi Ivars >
< http://xavi.ivars.me >
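
For reference, the "material" case discussed above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Apertium implements lexical selection: the dictionary `CAT_SEMANTIC_LABELS`, its entries other than "fusta", and the function `select_french_preposition` are all hypothetical names made up for this sketch. The point is that the semantic labels live in one monolingual table that any Catalan language pair could consult, instead of being repeated in each pair's rule file.

```python
# Hypothetical monolingual resource: semantic labels for Catalan lemmas.
# In the proposal above this would live in the monolingual package, so
# every pair that uses Catalan could share it. Only "fusta" comes from
# the thread; the other entries are illustrative assumptions.
CAT_SEMANTIC_LABELS = {
    "fusta": {"material"},   # wood
    "ferro": {"material"},   # iron
    "vidre": {"material"},   # glass
}


def select_french_preposition(next_lemma, labels=CAT_SEMANTIC_LABELS):
    """Choose the French translation of Catalan "de".

    Default rule: "de" stays "de". If the following lemma carries the
    "material" label, "en" is selected instead (de fusta > en bois).
    """
    if "material" in labels.get(next_lemma, set()):
        return "en"
    return "de"


if __name__ == "__main__":
    for lemma in ("fusta", "casa"):
        print(lemma, "->", select_french_preposition(lemma))
```

A lemma that is missing from the table simply falls back to the default "de", so an incomplete label list degrades to the current general rule rather than failing.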
_______________________________________________
Apertium-stuff mailing list
Apertium-stuff@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/apertium-stuff