Hi,
I am developing a method to learn Apertium shallow-transfer rules from
translations of small chunks provided by non-expert users. To generalise
the learned rules, I consult the bilingual dictionary which, as far as I
know, should encode only the lemma and the tags that change when
translating from the source language to the target language. However, I
have found the following entries in the Spanish-Catalan bilingual dictionary:
<e> <i>el<s n="det"/><s n="def"/><s n="f"/><s n="pl"/></i></e>
<e> <i>el<s n="det"/><s n="def"/><s n="f"/><s n="sg"/></i></e>
<e> <i>el<s n="det"/><s n="def"/><s n="m"/><s n="pl"/></i></e>
<e> <i>el<s n="det"/><s n="def"/><s n="m"/><s n="sg"/></i></e>
I think that, since the gender and the number don't change in translation,
these four entries could be written as a single one:
<e> <i>el<s n="det"/><s n="def"/></i></e>
It would be very useful for the approach I am developing to remove this
redundancy from the bilingual dictionary. But, before making a commit, I
want to be sure that I'm not breaking anything. Do you think the proposed
change is correct?
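
To make the kind of redundancy I mean concrete, here is a minimal sketch (not
part of the method itself) that lists identity entries sharing a lemma and
their first tags, and differing only in the trailing ones. The names
identity_entries and collapse_candidates, the prefix_len parameter, and the
inline sample (including the <p> entry) are my own assumptions for
illustration. It only handles plain-text lemmas, and it does not check whether
collapsing is safe for the transfer rules.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Illustrative sample only: the four entries quoted above plus one
# made-up <p> entry to show that non-identity entries are skipped.
SAMPLE = """<section>
<e> <i>el<s n="det"/><s n="def"/><s n="f"/><s n="pl"/></i></e>
<e> <i>el<s n="det"/><s n="def"/><s n="f"/><s n="sg"/></i></e>
<e> <i>el<s n="det"/><s n="def"/><s n="m"/><s n="pl"/></i></e>
<e> <i>el<s n="det"/><s n="def"/><s n="m"/><s n="sg"/></i></e>
<e><p><l>x<s n="n"/><s n="f"/></l><r>y<s n="n"/><s n="m"/></r></p></e>
</section>"""


def identity_entries(root):
    """Yield (lemma, tags) for every <e> whose only child is an <i>."""
    for entry in root.iter("e"):
        children = list(entry)
        if len(children) != 1 or children[0].tag != "i":
            continue  # skip <p> pairs and anything more complex
        ident = children[0]
        lemma = ident.text or ""  # text before the first <s/>
        tags = tuple(s.get("n") for s in ident.findall("s"))
        yield lemma, tags


def collapse_candidates(entries, prefix_len=2):
    """Group entries by lemma and their first prefix_len tags.

    Returns only the groups with more than one distinct tag tail,
    i.e. the ones a single shorter entry might replace.
    """
    groups = defaultdict(set)
    for lemma, tags in entries:
        if len(tags) > prefix_len:
            groups[(lemma, tags[:prefix_len])].add(tags[prefix_len:])
    return {key: sorted(tails) for key, tails in groups.items() if len(tails) > 1}


root = ET.fromstring(SAMPLE)
candidates = collapse_candidates(identity_entries(root))
for (lemma, prefix), tails in candidates.items():
    print(lemma, prefix, tails)
```

Here prefix_len=2 matches the det/def case; for a real dictionary the script
would have to read the .dix file with ET.parse and would also need to look at
<p> entries with the same lemma before anything is removed.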
Regards,
Víctor M. Sánchez-Cartagena