Dear List Members,

I understand that characters of different scripts that
look identical are disunified and have different
Unicode code points; Latin E (U+0045) vs. Greek U+0395
vs. Cyrillic U+0415 is a typical example.
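
For anyone who wants to check the three look-alike capital
letters, here is a minimal sketch using Python's standard
unicodedata module. The helper name describe() is only
illustrative and not part of any library.

```python
import unicodedata

# Latin E, Greek Epsilon and Cyrillic IE: same shape, three scripts.
lookalikes = ["\u0045", "\u0395", "\u0415"]

def describe(ch):
    """Return 'U+XXXX NAME' for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

for ch in lookalikes:
    print(describe(ch))
```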

I also understand that characters of one script which
have identical shapes only in some fonts, e.g. 0 and O,
are clearly not candidates for sharing a Unicode code point.

Now I'm wondering about Tamil LLA (U+0BB3) and
Tamil AU Length Mark (U+0BD7). They do not merely
happen to have the same shape in the font used for
preparing the Unicode charts; I am told they are also
indistinguishable in handwritten Tamil, typewritten Tamil, etc.

So for all practical purposes:

U+0B95 U+0BCC, which is canonically equivalent to
U+0B95 U+0BC6 U+0BD7,

looks exactly the same as

U+0B95 U+0BC6 U+0BB3

Isn't that a bit odd? 
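
The canonical equivalence can be checked with Python's
standard unicodedata module. This is only a sketch: the
variable names are mine, and the "lookalike" string is built
by swapping the last code point of the decomposed form for
Tamil LLA, on the assumption that this is the sequence that
renders identically.

```python
import unicodedata

def codepoints(s):
    """Format a string as a space-separated list of U+XXXX values."""
    return " ".join(f"U+{ord(c):04X}" for c in s)

precomposed = "\u0b95\u0bcc"  # TAMIL LETTER KA + TAMIL VOWEL SIGN AU
decomposed = unicodedata.normalize("NFD", precomposed)

# Assumed look-alike: same sequence, but ending in TAMIL LETTER LLA
# instead of TAMIL AU LENGTH MARK.
lookalike = decomposed[:-1] + "\u0bb3"

print(codepoints(decomposed))
print(codepoints(lookalike))

# Normalization folds the decomposed form back into the precomposed one,
# but it leaves the look-alike sequence untouched.
print(unicodedata.normalize("NFC", decomposed) == precomposed)
print(unicodedata.normalize("NFC", lookalike) == precomposed)
```

So the two sequences stay distinct under normalization, even
though they may be displayed identically.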

To give an analogy from the Latin script:
it is as if Latin y (U+0079) were mapped
to two different Unicode code points,
depending on whether it is used as a
vowel or as a consonant.

Regards,
Peter Jacobi

 
