Keyur Shroff wrote:
> In the FAQ
>    http://www.unicode.org/faq/indic.html#16
> 
> It is mentioned that following are equivalent
> 
> ISCII                 Unicode
> KA halant INV         KA virama ZWJ
> RA halant INV         RAsup (i.e., repha)

The last line is really bizarre! I would agree that it is plain wrong...

What is supposed to appear in the "Unicode" column is the Unicode *encoding*
equivalent to the <RA halant INV> in the "ISCII" column. But "RAsup (i.e.,
repha)" is the description of a *glyph*.
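
To make the encoding/glyph distinction concrete, here is a minimal Python
sketch of what an entry in the "Unicode" column should look like: a sequence
of code points. It assumes Devanagari (the FAQ table does not name a script),
and the helper name describe() is only illustrative.

```python
import unicodedata

# Assumption: Devanagari code points; the quoted table names no script.
KA = "\u0915"      # DEVANAGARI LETTER KA
VIRAMA = "\u094D"  # DEVANAGARI SIGN VIRAMA (ISCII "halant")
ZWJ = "\u200D"     # ZERO WIDTH JOINER


def describe(seq):
    """Return one 'U+XXXX NAME' string per code point in seq."""
    return [f"U+{ord(c):04X} {unicodedata.name(c)}" for c in seq]


# The first FAQ row, <KA virama ZWJ>, written as an encoding.
ka_virama_zwj = KA + VIRAMA + ZWJ
for line in describe(ka_virama_zwj):
    print(line)
```

An entry like "RAsup (i.e., repha)" cannot be written this way, because it
names a rendered shape rather than a code point sequence.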

> In fact there is no way in Unicode to produce RAsup directly, 
> i.e., without using base consonant. [...]

I agree. This issue has been raised several times, and several viable
solutions have been proposed, but I don't remember Unicode "officials"
ever even acknowledging the problem.

But it has probably been noted and discussed. I hope to see an
official solution in TUS 4.0.

> SUGGESTION-3:
> 
> Use of SPACE character as consonant may create problem for 
> state machine which finds language/syllable boundary.
> In fact we need a codepoint for one invisible consonant
> (similar to INV in ISCII) in Unicode which can solve
> this problem with Unicode.
> 
> After inclusion of INV character the following can be recommended.
> 
> ISCII                 Unicode
> KA halant INV         KA virama INV
> RA halant INV         RA virama INV (i.e., repha)
> INV halant RA         INV virama RA (RAsub)

Why not represent INV with a double ZWJ? E.g.:

        ISCII                 Unicode
        KA halant INV         KA virama ZWJ ZWJ
        RA halant INV         RA virama ZWJ ZWJ (i.e., repha)
        INV halant RA         ZWJ ZWJ virama RA (RAsub)
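
Spelled out as code points, the three rows might look like the following
Python sketch. It assumes Devanagari, and the names INV, proposed and
code_points are only illustrative; the double-ZWJ reading of INV is the
convention suggested in this message, not something defined by Unicode.

```python
# Assumption: Devanagari code points.
KA, RA = "\u0915", "\u0930"
VIRAMA = "\u094D"  # ISCII "halant"
ZWJ = "\u200D"

# Suggested convention only: two ZWJs stand in for ISCII's INV.
INV = ZWJ + ZWJ

proposed = {
    "KA halant INV": KA + VIRAMA + INV,
    "RA halant INV": RA + VIRAMA + INV,   # intended rendering: repha
    "INV halant RA": INV + VIRAMA + RA,   # intended rendering: RAsub
}


def code_points(s):
    """Format a string as space-separated U+XXXX values."""
    return " ".join(f"U+{ord(c):04X}" for c in s)


for iscii, uni in proposed.items():
    print(f"{iscii:16} {code_points(uni)}")
```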

This has the advantage that the most common sequences will also work on
old display engines implemented *before* the double-ZWJ convention was
introduced.

E.g., the sequence "KA virama ZWJ ZWJ" also works well on an old engine, for
the simple reason that the first ZWJ is enough to do the work, and the second
ZWJ is invisible.

Of course, an old engine will still display a <RA[eyelash]> for <RA virama
ZWJ ZWJ>, but that is no worse than displaying <RA+virama> followed by a
white box, which is what would happen with your new INV character.
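
The fallback argument can be modelled with a small Python sketch. This is
only a toy model under stated assumptions: it treats an "old engine" as one
for which any extra ZWJ after the first is invisible, so a run of ZWJs
behaves like a single one. The function name old_engine_view is hypothetical
and does not correspond to any real rendering API.

```python
import re

# Assumption: Devanagari code points.
KA, RA = "\u0915", "\u0930"
VIRAMA = "\u094D"
ZWJ = "\u200D"


def old_engine_view(text):
    """Toy model: collapse each run of ZWJs to one, as if extras were invisible."""
    return re.sub("\u200D{2,}", ZWJ, text)


# <KA virama ZWJ ZWJ> is handled like <KA virama ZWJ>: the intended result.
print(old_engine_view(KA + VIRAMA + ZWJ + ZWJ) == KA + VIRAMA + ZWJ)

# <RA virama ZWJ ZWJ> is handled like <RA virama ZWJ>: the sequence that,
# as described above, an old engine shows as eyelash RA rather than repha.
print(old_engine_view(RA + VIRAMA + ZWJ + ZWJ) == RA + VIRAMA + ZWJ)
```

Both comparisons print True in this model; the difference is only that the
first fallback gives the intended shape and the second gives a wrong but
still readable one.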

_ Marco
