Re: Advice on how to approach character translation

Chas. Owens Wed, 23 Apr 2008 08:31:11 -0700

On Wed, Apr 23, 2008 at 5:34 AM, R (Chandra) Chandrasekhar
<[EMAIL PROTECTED]> wrote:
> Dear Folks,
>
>  A scheme called ITRANS uses the ASCII printing character set and between
> one and  three printing characters to unambiguously represent characters in
> Indic scripts or a Romanized script called IAST. Since characters in these
> scripts have Unicode code points, it should be possible to automate the
> translation between words in the ASCII source text and the desired Unicoded
> output text.
>
>  I am trying to write a Perl script to do this and would appreciate advice
> on how best to proceed before I start.
>
>  To give a better picture of what I am trying to do, I have given some
> examples below for ASCII to IAST characters:
>
>  --------
>  1. Transliteration of between one and three ASCII printing characters to
> one Unicode character.
>
>  2. Many characters are unchanged by the transliteration.
>
>  3. Some transliteration examples are shown below:
>
>  a       a   U+0061   LATIN SMALL LETTER A
>  aa      ā   U+0101   LATIN SMALL LETTER A WITH MACRON
>  A       ā   U+0101   LATIN SMALL LETTER A WITH MACRON
>  .a      '   U+0027   APOSTROPHE
>  ~N      ṅ   U+1E45   LATIN SMALL LETTER N WITH DOT ABOVE
>  RRI     ṝ   U+1E5D   LATIN SMALL LETTER R WITH DOT BELOW AND MACRON
>  R^I     ṝ   U+1E5D   LATIN SMALL LETTER R WITH DOT BELOW AND MACRON
>  --------
>
>  Many thanks.
>
>  Chandra


The easiest way I can think of is to build a UTF-8 file named
itrans2unicode.table, with one mapping per line, that looks like this:

a  => a
aa => ā
~N => ṅ
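
As a rough sketch, reading such a table into a hash could look like the
following. It assumes the "key => value" layout shown above, with no
whitespace inside keys or values, and it assumes blank lines may be skipped.
The in-memory filehandle is only a stand-in so the example runs on its own;
a real script would open itrans2unicode.table instead.

```perl
use strict;
use warnings;
use utf8;                       # the sample data below contains UTF-8 literals
use Encode qw(encode);

# Stand-in for the contents of itrans2unicode.table (first three rows only).
my $sample = "a   => a\naa => ā\n~N => ṅ\n";

# In-memory handles need bytes, so encode first; a real script would instead do
#   open my $fh, '<:encoding(UTF-8)', 'itrans2unicode.table' or die ...;
my $bytes = encode('UTF-8', $sample);
open my $fh, '<:encoding(UTF-8)', \$bytes or die "Cannot open table: $!";

my %table;
while (my $row = <$fh>) {
    chomp $row;
    next if $row =~ /^\s*$/;    # skip blank lines (assumed to be allowed)
    my ($from, $to) = $row =~ /^\s*(\S+)\s*=>\s*(\S+)\s*$/
        or die "Malformed table line $.: $row\n";
    $table{$from} = $to;
}
close $fh;
```

Dying on a malformed line is a design choice here: a silently skipped row
would show up later as a character that mysteriously fails to transliterate.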

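Then read that file into a hash at startup, build a pattern that
alternates over the keys with the longest keys first (so that "aa" is
tried before "a"), and process the input line by line using a regex like

$line =~ s/($pattern)/$table{$1}/g;

where $pattern is something like
join '|', map { quotemeta($_) } sort { length $b <=> length $a } keys %table.
A single-character match such as (.) would never see multi-character
keys like aa or ~N, and it would blank out any character that has no
entry in the table.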
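
A minimal sketch of the hash-plus-substitution idea, using only the rows
from the original question. Because the keys are between one and three
characters long, it matches on an alternation of the keys, longest first,
rather than on a single character. The names %table, $pattern and
transliterate are illustrative, not from any module, and the hash is written
inline here instead of being read from itrans2unicode.table.

```perl
use strict;
use warnings;
use utf8;                                  # UTF-8 literals in this source
binmode STDOUT, ':encoding(UTF-8)';

# Only the example rows from the question; a full ITRANS table is much larger.
my %table = (
    'a'   => 'a',
    'aa'  => 'ā',
    'A'   => 'ā',
    '.a'  => "'",
    '~N'  => 'ṅ',
    'RRI' => 'ṝ',
    'R^I' => 'ṝ',
);

# Longest keys first so 'aa' wins over 'a'; quotemeta escapes '.', '^', etc.
my $alternation = join '|',
    map  { quotemeta($_) }
    sort { length $b <=> length $a || $a cmp $b }
    keys %table;
my $pattern = qr/($alternation)/;

# Replace every known ITRANS sequence; anything not in the table is left alone.
sub transliterate {
    my ($line) = @_;
    $line =~ s/$pattern/$table{$1}/g;
    return $line;
}

print transliterate("aa ~Na RRI k"), "\n";   # prints "ā ṅa ṝ k"
```

Characters with no table entry (the space and the k above) pass through
unchanged, which matches the point in the question that many characters
are unaffected by the transliteration.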

There is supposedly a full table at
http://www.aczoom.com/itrans/#itransencoding but I was unable to load
that page.

-- 
Chas. Owens
wonkden.net
The most important skill a programmer can have is the ability to read.
