On 2020-06-15 17:38, Hèctor Alòs i Font wrote:
Here come several practical examples. I tried to select them for their
variety. The result is more a wish list than something structured.

Let's begin with "je la baise". Depending on the context this may be
"I kiss her" or "I fuck her". The context can tell us if we are in a
formal or colloquial register. Anaphora resolution can also help in
this case: if the pronoun's referent is "hand", it can only be "kiss";
if it is a person, the doubt persists.

Another kind of problem is the Arpitan words "chamô" ("camel"; plural
"chamôs") and "chamôs" ("chamois"; unchanged in the plural). So,
translating into French, yesterday I got chamois in a Bible text from
Exodus xD  I solved it by deciding in a CG rule that every "chamôs"
(with nothing singular around it) is a camel. (Similar cases in
French: fil/fils, foi/fois, cour/cours)

In French there are plenty of words with different meanings depending
on their gender: livre, page, tour, etc. The problem is that the
immediate surrounding context often does not disambiguate: des livres,
les pages, de tour, etc. A similar but slightly different case is the
word pairs homicide mf/homicide m, féminicide mf/féminicide m,
parricide mf/parricide m, etc.: the one with the gender "mf" is a
person and the other is the action.

Other problems come up in lexical selection. For instance, as a rule,
the Catalan preposition "de" is translated as "de" in French, but if
the following word is a material, "en" must be selected (de fusta > en
bois). So in the Catalan2French lrx file we have a list of materials,
as we have a list of countries, a list of musical instruments, a list
of animals, etc. I dream of a monolingual dictionary where we could
get this kind of information. It makes no sense to repeat these lists
in each of the many language pairs that use Catalan. This information
should be in apertium-cat and not in every apertium-cat-xxx lrx file.
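
As a rough illustration of the idea (a sketch only; the class table
and the function names are hypothetical and are not part of
apertium-cat or of the lrx format), the materials list could live in
one place on the Catalan side and any pair could query it:

```python
# Hypothetical semantic classes that a monolingual package could export.
# The entries are illustrative and are not taken from apertium-cat.
CAT_SEMANTIC_CLASSES = {
    "material": {"fusta", "ferro", "vidre"},
}


def has_class(lemma, sem_class):
    """Return True if the Catalan lemma carries the given semantic class."""
    return lemma in CAT_SEMANTIC_CLASSES.get(sem_class, set())


def select_de(next_lemma):
    """Pick the French translation of Catalan 'de' from the following lemma."""
    # Before a material, French uses "en" (de fusta > en bois);
    # otherwise the default translation "de" is kept.
    return "en" if has_class(next_lemma, "material") else "de"


if __name__ == "__main__":
    print(select_de("fusta"))  # a material
    print(select_de("casa"))   # not a material
```

The point is only that the pair-specific rule shrinks to a class
lookup, while the list itself is maintained once.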

Moreover, if we had words not only with different kinds of semantic
labels, but also marked as synonyms, maybe it would be possible to
give a translation using a word labelled as a synonym (if it has a
translation) instead of "unknown".
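
A minimal sketch of that fallback, assuming a hypothetical synonym
table on the monolingual side and a toy bilingual dictionary (neither
structure nor the sample entries come from any Apertium package; the
leading asterisk just stands for an unknown word):

```python
# Hypothetical monolingual synonym table (Catalan), for illustration only.
SYNONYMS = {
    "automòbil": ["cotxe"],
}

# Toy Catalan-French bilingual dictionary, also illustrative.
BIDIX = {
    "cotxe": "voiture",
}


def translate(lemma):
    """Translate a lemma, falling back to a synonym that has a translation."""
    if lemma in BIDIX:
        return BIDIX[lemma]
    # No direct entry: try each listed synonym in order.
    for synonym in SYNONYMS.get(lemma, []):
        if synonym in BIDIX:
            return BIDIX[synonym]
    # Nothing found: mark the word as unknown.
    return "*" + lemma


if __name__ == "__main__":
    print(translate("automòbil"))
    print(translate("bicicleta"))
```

Here "automòbil" has no entry of its own but is translated through
"cotxe", while "bicicleta" stays marked as unknown.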


These are excellent examples. I'm just about to go out, but will
address them when I get back. Thanks for the ideas.

Note that my suggestion was to include this information
in the monolingual packages.

Fran


_______________________________________________
Apertium-stuff mailing list
Apertium-stuff@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
