Dear Folks,

A scheme called ITRANS uses sequences of one to three printable ASCII characters to represent, unambiguously, the characters of Indic scripts or of a Romanized script called IAST. Since the characters in these scripts have Unicode code points, it should be possible to automate the translation of words in the ASCII source text into the desired Unicode output text.

I am trying to write a Perl script to do this and would appreciate advice on how best to proceed before I start.

To give a better picture of what I am trying to do, here are some examples of ASCII-to-IAST transliteration:

--------
1. Between one and three printable ASCII characters are transliterated to a single Unicode character.

2. Many characters are unchanged by the transliteration.

3. Some transliteration examples are shown below:

ITRANS  IAST  Code     Unicode name
a       a   U+0061   LATIN SMALL LETTER A
aa      ā   U+0101   LATIN SMALL LETTER A WITH MACRON
A       ā   U+0101   LATIN SMALL LETTER A WITH MACRON
.a      '   U+0027   APOSTROPHE
~N      ṅ   U+1E45   LATIN SMALL LETTER N WITH DOT ABOVE
RRI     ṝ   U+1E5D   LATIN SMALL LETTER R WITH DOT BELOW AND MACRON
R^I     ṝ   U+1E5D   LATIN SMALL LETTER R WITH DOT BELOW AND MACRON
--------
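
One possible way to approach this, as a minimal sketch rather than a finished solution: keep the mappings in a hash and build a single regex alternation from its keys, sorted longest first so that "aa" is tried before "a". The names %itrans_to_iast and itrans_to_iast() are made up for illustration, the table holds only the seven examples above (a real ITRANS table is much larger), and the longest-match-first rule is my assumption about how ambiguous input should be resolved.

```perl
use strict;
use warnings;

# Illustrative partial table: only the seven examples listed above.
# Values use \x{...} escapes so the script itself stays pure ASCII.
my %itrans_to_iast = (
    'a'   => 'a',
    'aa'  => "\x{0101}",   # a with macron
    'A'   => "\x{0101}",   # a with macron
    '.a'  => "'",          # apostrophe
    '~N'  => "\x{1E45}",   # n with dot above
    'RRI' => "\x{1E5D}",   # r with dot below and macron
    'R^I' => "\x{1E5D}",   # r with dot below and macron
);

# Build one alternation, longest keys first, so that a three-character
# sequence is tried before any shorter sequence starting at the same place.
# quotemeta protects regex metacharacters such as '.', '^' and '~'.
my $alternation = join '|',
    map  { quotemeta }
    sort { length($b) <=> length($a) || $a cmp $b }
    keys %itrans_to_iast;
my $token_re = qr/($alternation)/;

# Replace every known ITRANS sequence; anything else passes through unchanged.
sub itrans_to_iast {
    my ($text) = @_;
    (my $out = $text) =~ s/$token_re/$itrans_to_iast{$1}/g;
    return $out;
}

# Encode output as UTF-8 to avoid "Wide character" warnings.
binmode STDOUT, ':encoding(UTF-8)';
print itrans_to_iast('aa A .a ~N RRI R^I'), "\n";
```

Run as is, this should print "ā ā ' ṅ ṝ ṝ". Characters that do not appear in the table (here, the spaces) are left untouched, which matches point 2 above.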

Many thanks.

Chandra
