Hey all!
I'm not sure if this is a bug or a feature, but when I try to make
accented characters in Cyrillic (e.g. with combining characters: о́ ы́
у́ etc.) equivalent with their non-combined variants (e.g. о ы у) in an
ACX file,[1] the unaccented characters are not treated as the
accented ones.
[1]
<char value="ы́">
<equiv-char value="ы"/>
</char>
Looking at the transducer, it seems that this makes some kind of sense
from an encoding point of view because о and combining ´ are treated as
separate 'characters', but linguistically it might be less clear.
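To make the encoding point concrete, here is a small Python sketch (not lttoolbox code, just an illustration) that lists the code points behind ы́. The helper names code_points and strip_stress are made up for this example, and stripping U+0301 from the input before analysis is only one possible workaround that I am assuming here, not something lttoolbox does.

```python
import unicodedata

# U+0301 is the combining acute accent used to mark stress.
COMBINING_ACUTE = "\u0301"


def code_points(text):
    """Describe each code point in text as 'U+XXXX NAME'."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


def strip_stress(text):
    """Remove combining acute accents only (hypothetical preprocessing step).

    Other combining marks and precomposed letters are left untouched.
    """
    return text.replace(COMBINING_ACUTE, "")


# What looks like one accented letter is a base letter plus a combining mark.
accented = "\u044b" + COMBINING_ACUTE

for line in code_points(accented):
    print(line)

print(len(accented))
print(strip_stress(accented) == "\u044b")
```

Because the string holds two code points, anything that works code point by code point sees the base letter and the accent as two separate symbols.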
I suspect that this cannot be dealt with in a clean way (without
having some special code in lttoolbox to deal with combining
characters). But I thought it merited an email to the list in case
anyone else comes up against this.
Fran
_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff