Re: Re: Regd- ISCII to Unicode Converter!

Ram Viswanadha Wed, 03 Apr 2002 16:42:35 -0800


The INV (0xd9) code is used in ISCII for special display purposes, in
situations where forming a composite character requires a consonantal base
but the consonant itself is invisible. INV cannot be accurately represented
in Unicode, so we fall back to ZWJ, and fallbacks by definition cannot be
round-tripped.
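
To illustrate why the fallback cannot round-trip, here is a minimal Python
sketch. It is a toy model, not the actual ucnvisci.c logic: the tiny tables,
the function names, and the choice to drop a lone ZWJ on the way back are all
assumptions made for illustration. It only assumes that INV falls back to ZWJ
and that U+094D followed by ZWJ is read back as a soft halant (0xe8 0xe9).

```python
# Toy model of the INV fallback; NOT ICU's actual converter code.
ZWJ = "\u200d"
VIRAMA = "\u094d"  # Devanagari virama, the Unicode counterpart of HALANT (0xe8)

# Hypothetical, deliberately tiny ISCII -> Unicode table (Devanagari only).
TO_UNICODE = {
    0xB3: "\u0915",  # KA
    0xCF: "\u0930",  # RA
    0xE8: VIRAMA,    # HALANT
    0xE9: "\u093c",  # NUKTA
    0xD9: ZWJ,       # INV: one-way fallback, no exact Unicode equivalent
}
# The reverse table omits INV because ZWJ is only a fallback target.
FROM_UNICODE = {u: b for b, u in TO_UNICODE.items() if b != 0xD9}


def iscii_to_unicode(data: bytes) -> str:
    out = []
    i = 0
    while i < len(data):
        if data[i:i + 2] == b"\xe8\xe9":  # soft halant: HALANT + NUKTA
            out.append(VIRAMA + ZWJ)
            i += 2
        else:
            out.append(TO_UNICODE[data[i]])
            i += 1
    return "".join(out)


def unicode_to_iscii(text: str) -> bytes:
    out = bytearray()
    i = 0
    while i < len(text):
        if text[i:i + 2] == VIRAMA + ZWJ:  # read back as a soft halant
            out += b"\xe8\xe9"
            i += 2
        elif text[i] == ZWJ:  # lone ZWJ: dropped in this sketch (assumption)
            i += 1
        else:
            out.append(FROM_UNICODE[text[i]])
            i += 1
    return bytes(out)


original = b"\xb3\xe8\xd9"                    # KA + HALANT + INV
as_unicode = iscii_to_unicode(original)       # KA + virama + ZWJ
round_tripped = unicode_to_iscii(as_unicode)  # KA + HALANT + NUKTA
print(round_tripped == original)
```

In this model the INV byte is lost on the way back, because the virama + ZWJ
pair is indistinguishable from a converted soft halant.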


      ISCII
      =====
1)   KA+HALANT+INV                      => HALF KA (all half characters can be
                                           represented by CONSONANT+HALANT+ZWJ
                                           in Unicode)
2)   RA+HALANT+INV                      => RAsup (RA+HALANT+ZWJ is treated as
                                           eyelash RA, which may not be the
                                           desired effect)

The stand-alone forms of vowel signs below cannot be accurately represented
in Unicode.

3)   INV+VOWEL SIGN I+NUKTA             => VOWEL SIGN VOCALIC L
4)   INV+HALANT+RA                      => RAsub
5)   INV+VOWEL SIGN VOCALIC R+NUKTA     => VOWEL SIGN VOCALIC RR
6)   INV+VOWEL SIGN II+NUKTA            => VOWEL SIGN VOCALIC LL
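
The six sequences listed above can be written down as a small lookup table.
The Python sketch below works on symbolic token names rather than real ISCII
bytes; the table layout and the function name find_inv_forms are hypothetical
and only mirror the list in this message, not any converter's implementation.

```python
# Symbolic restatement of the INV sequences listed above (illustration only).
INV_SEQUENCES = {
    ("KA", "HALANT", "INV"): "HALF KA",
    ("RA", "HALANT", "INV"): "RAsup",
    ("INV", "VOWEL SIGN I", "NUKTA"): "VOWEL SIGN VOCALIC L",
    ("INV", "HALANT", "RA"): "RAsub",
    ("INV", "VOWEL SIGN VOCALIC R", "NUKTA"): "VOWEL SIGN VOCALIC RR",
    ("INV", "VOWEL SIGN II", "NUKTA"): "VOWEL SIGN VOCALIC LL",
}


def find_inv_forms(tokens):
    """Return (index, display form) for each listed INV sequence found."""
    found = []
    i = 0
    while i < len(tokens):
        key = tuple(tokens[i:i + 3])  # every listed sequence has three tokens
        if key in INV_SEQUENCES:
            found.append((i, INV_SEQUENCES[key]))
            i += 3
        else:
            i += 1
    return found


sample = ["KA", "HALANT", "INV", "INV", "HALANT", "RA"]
forms = find_inv_forms(sample)
print(forms)
```

A real converter would have to recognize these patterns in the byte stream
before applying any fallback, since the INV code itself carries the display
intent.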

Apple's mapping tables map INV to LRM. I think they use it during rendering:
if an LRM appears in the middle of an Indic code point stream and the
sequence follows these rules, the renderer does something special. But I am
not sure; maybe someone from Apple can correct me.


Regards,

Ram
---------------------------------------------------
Ram Viswanadha
International Components For Unicode
GCoC San Jose
IBM

----- Original Message -----
From: "Markus Scherer" <[EMAIL PROTECTED]>
To: "Ram Viswanadha" <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: Tuesday, April 02, 2002 6:17 PM
Subject: Fwd: Re: Regd- ISCII to Unicode Converter!


> Ram, could you please respond to this to the [EMAIL PROTECTED] ?
> Thanks,
> markus
> -------- Original Message --------
> Date: Wed, 27 Mar 2002 17:17:13 +0800
> From: Federic Zhang <[EMAIL PROTECTED]>
> To: Markus Scherer <[EMAIL PROTECTED]>
> CC: unicode <[EMAIL PROTECTED]>
> Subject: Re: Regd- ISCII to Unicode Converter!
>
>
> Hi Markus,
>
>  From the ucnvisci.c code, it seems that the sequence "Halant INV"
> (0xe8 0xd9) in Devanagari script would be converted to "U+094D ZWJ" in
> Unicode and becomes "Halant Nukta" (0xe8 0xe9) if converted back from
> Unicode to ISCII, since "U+094D ZWJ" would be treated as a soft halant.
> Is this correct behavior?
>
> Regards,
> Federic
>
> Markus Scherer wrote:
>
>  > ICU supports ISCII, except for the font-style attributes (like "bold")
>  > which are not expressible in plain text.
>  >
>  > http://oss.software.ibm.com/cvs/icu/~checkout~/icu/source/data/mappings/convrtrs.txt
>  > http://oss.software.ibm.com/icu/
>  >
>  > ISCII is algorithmic. The mapping part to/from Unicode is fairly
>  > straightforward because Unicode's encoding of Indic scripts is based on
>  > an earlier version of ISCII.
>  >
>  > For details take a look at the source code:
>  > http://oss.software.ibm.com/cvs/icu/~checkout~/icu/source/common/ucnvisci.c
>  >
>  > markus
>  >
>  > Rajesh Chandrakar wrote:
>  >
>  > > Sorry to say that I lost the web address someone posted on the forum
>  > > concerning ISCII to Unicode conversion. I would highly appreciate it
>  > > if someone could provide it. I wanted to check how well the
>  > > conversion works.
>
>
>
>
>
