This whole attempt to make digitizing Indic script some esoteric, 'abstract', 'semantic representation' and so on seems to me is an attempt to make Unicode the realm of the some super humans.

The purpose of writing is to represent speech. It is not some secret that demi-gods created that we are trying to explain with 'modern' linguistic gymnastics. sound => letter that is the basis for writing. English writing was massacred when printing was brought in from Europe. A similar thing is happening to Indic by all this mumbo-jumbo.

I call out to NATIVE users of Indic to explain what apparently Europeans or Americans are discussing here.


On 5/1/2017 10:47 AM, Philippe Verdy wrote:


2017-04-29 21:21 GMT+02:00 Naena Guru via Unicode <unicode@unicode.org <mailto:unicode@unicode.org>>:

    Just about the name paluta:
    In Sanskrit, the length of vowels are measured in maaþra (a
    cognate of the word 'meter'). It is the spoken length of a short
    vowel. In Latin it is termed mora. Usually, you have only single
    and double length vowels. A paluþa length is like when you call
    out somebody from a distance. Pluta is a careless use of spelling.
    Virama and Halanta are two other terms loosely used.

    Anyway, Unicode is only about DISPLAYING a script: There's a shape
    here; Let's find how to get it by assembling other shapes or by
    creating a code point for it. What is short, long or longer in
    speech is no concern for Unicode.


Wrong. Unicode is absolutely not about how to "display" any script (except symbols and notational symbols). Unicode does not encode glyphs. Unicode encodes "abstract characters" according to their semantics, in order to assign them properties allowing meaningful transformations of text and in order to allow perfoirming searches (with collation algorithms). What is important is their properties (something that ISO 10646 does not care when it started the UCS in a separate project, ignoring how it would be used, focusing too much on apparent glytphs (and introducing lot of "compatiblity characters" that would not have been encoded otherwise, and creating some havoc in logical processing.

Anyway Unciode makes some exceptions to the logical model only for roundtrip comptaibility with other standards that used another encoding model widely used, notably in Thai: these are the exception where there are "prepended" letters. There was some havoc also for some scripts in India because of roundtrip compatiblity with an Indian standard (criticized by many users of Tamil and some other Southern Indic scripts that don't follow directly the paradigm created for getting some limited transliteration with Devanagari: that initial desire was abandoned but the legacy Indic scripts in India were imported as is to Unicode)

Reply via email to