Dear Folks,
A scheme called ITRANS uses sequences of one to three printable ASCII characters
to represent, unambiguously, each character of an Indic script or of IAST, a
romanization scheme that uses Latin letters with diacritics. Since all of these
characters have Unicode code points, it should be possible to automate the
translation from words in the ASCII source text to the desired Unicode output
text.
I am trying to write a Perl script to do this and would appreciate advice on how
best to proceed before I start.
To give a better picture of what I am trying to do, here are some examples of
the ASCII-to-IAST mapping:
--------
1. Between one and three printable ASCII characters are transliterated to one
Unicode character.
2. Many characters are unchanged by the transliteration.
3. Some transliteration examples are shown below:
   ITRANS  IAST  Code point  Unicode name
   a       a     U+0061      LATIN SMALL LETTER A
   aa      ā     U+0101      LATIN SMALL LETTER A WITH MACRON
   A       ā     U+0101      LATIN SMALL LETTER A WITH MACRON
   .a      '     U+0027      APOSTROPHE
   ~N      ṅ     U+1E45      LATIN SMALL LETTER N WITH DOT ABOVE
   RRI     ṝ     U+1E5D      LATIN SMALL LETTER R WITH DOT BELOW AND MACRON
   R^I     ṝ     U+1E5D      LATIN SMALL LETTER R WITH DOT BELOW AND MACRON
--------
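For what it is worth, here is a rough sketch of the approach I have in mind, so
that you can tell me whether it is sensible. It assumes a lookup hash keyed on
the ITRANS tokens and a regular expression that tries the longest tokens first,
so that "aa" wins over "a". The names %iast_for and itrans_to_iast are my own
inventions, and the table holds only the examples listed above, not the full
ITRANS scheme.

```perl
#!/usr/bin/perl
use strict;
use warnings;

binmode(STDOUT, ':encoding(UTF-8)');   # write the output as UTF-8

# Partial ITRANS-to-IAST table (assumption: only the examples from this
# message; a complete table would need every ITRANS token).
# Characters that are unchanged, such as plain "a", are simply left out.
my %iast_for = (
    'aa'  => "\x{0101}",   # a with macron
    'A'   => "\x{0101}",   # a with macron
    '.a'  => "'",          # apostrophe
    '~N'  => "\x{1E45}",   # n with dot above
    'RRI' => "\x{1E5D}",   # r with dot below and macron
    'R^I' => "\x{1E5D}",   # r with dot below and macron
);

# Build one alternation with the longest tokens first, escaping
# characters such as "." and "^" that are special in a regex.
my $token_re = join '|',
    map  { quotemeta }
    sort { length($b) <=> length($a) || $a cmp $b }
    keys %iast_for;

# Replace every ITRANS token found in the text; anything that is not
# in the table passes through unchanged.
sub itrans_to_iast {
    my ($text) = @_;
    $text =~ s/($token_re)/$iast_for{$1}/g;
    return $text;
}

print itrans_to_iast('raama'), "\n";   # prints "rāma"
```

Sorting by length is what I am least sure about: it seems to be needed because
the regex engine takes the first alternative that matches, not the longest one.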
Many thanks.
Chandra