On 2011-09-10 00:53, "delex r" <[email protected]> wrote:

> I figure out that Unicode has not addressed the sovereignty issues of a
> language

Which, I daresay, is irrelevant from a *character* encoding perspective.

> while trying to devise an ASCII like encoding system for almost all
> the characters and symbols used on earth. I am continuing with my observation
> of the glaring mistake done by Unicode by naming a South Asian Script as
> "Bengali". Here I would like to give certain information that I think will be
> of some help for Unicode in its endeavour to faithfully represent a Universal
> Character encoding standard truer to even micro-facts.
>
> India is believed to have at least 1652 mother tongues out of which only 22

One list of languages in India is given in
http://www.ethnologue.com/show_country.asp?name=IN
(I did not count the number of entries.)

> are recognized by the Indian Constitution as official languages for
> administrative communication among local governments and to the citizens. And
> the constitution has not explicitly recognized any official script. As Unicode
> has listed the languages and scripts, the Indian Constitution has also listed

Unicode does not list any languages at all. Ok, the CLDR subproject copies
a list of language codes from the IANA language subtag registry, which
(in a complex manner) takes its language codes from (among others) the
ISO 639-3 registry, which is largely in sync with Ethnologue (as in the
list above); but I guess that is not what you referred to.

> the official languages ( In its 8th schedule). The first entry in that list is
> the Assamese language. Assamese is a sovereign language with its own grammar

Which I don't think is in dispute at all.

> and "script" that contains some unique characters that you will not find in
> any of the scripts so far discovered by Unicode. At least 30 million people

Unicode (at this stage) does not do any "discovery". Unicode and
ISO/IEC 10646 are driven by applications (proposals) to encode characters
(and define properties of characters).

> call it the "Assamese Script" and if provided with computers and internet

If you want to disunify the Bengali script (and characters) from Assamese,
you need to show, in a proposal document, that they really are different
scripts, and should not be unified as just different uses of the same
script.

> connection can bomb the Unicode e-mail address with confirmations. These

Hmm, an email bombing threat... I'm sure Sarasvati can find a way to block
those (or we may all simply file them away as spam).

> characters are, I repeat, the one that is given a Hexcode 09F0 and the other
> with 09F1 by this universal character encoding system but unfortunately
> has described both as "Bengali" Ra etc. etc. I don't know who has advised
> Unicode to use the tag "Bengali" to name the block that includes these two
> characters.
>
> If you are not an Indian then just google an image of an Indian Currency note.
> There on one side of the note you will find a box inside which the value of
> the currency note is written in words in at least 15 scripts of official
> Indian languages.( I don't know why it is not 22). At the top , the script is
> Assamese as Assamese is the first officially recognized language (script?) .
> Next below it you will find almost similar shapes. That is in Bengali. India
> officially recognises the distinction between these two scripts which although
> shaped similar but sounds very different at many points. And the standard

Minor font differences are not a reason for disunification. Different
pronunciations of the same letters are not a reason for disunification
either. Just think of how many different ways Latin letters (and letter
combinations) are pronounced in different languages (x, j, h, v, w, f, ...;
even "a" gets different pronunciation in British English vs. US English,
and that is within the same language...; and most orthographies aren't
very accurately phonetic anyway, with quite a bit of varying (contextual
and dialectal) pronunciation for the letters).

> assamese alphabet set has extra characters which are never bengali just like
> London is never in Germany.

There are 8 Londons in the USA, two in Canada, one in Kiribati, ... ;-)
(http://en.wikipedia.org/wiki/London_(disambiguation))

> Coming again to the Hexcodes 09F0 (Raw) and 09F1 (wabo). Both have nothing
> Bengali in them and interestingly 09F1 ( sounds WO or WA when used within
> words) has even nothing Ra' sound in it. Thus you know, with actual Bengali
> alphabet set one can't write anything to produce the sound "Watt" as in James
> Watt and instead need to combine three alphabets but even then only to sound
> like " OOYAT " in Bengali itself.

Yes, English has a rather peculiar pronunciation for the letter W... ;-)
Several languages will pronounce Watt (without changing the spelling)
as Vatt, and regard that as a normal pronunciation of Watt.

> Therefore Unicode must consider terming the block range as "Assamese" which
> will faithfully describe the block range with 09F0 and 09F1 in it and replace
> all tags " Bengali" with "Assamese" in the code descriptions and vice versa .
> London is in England and Berlin is in Germany. You just can't bring London
> into Germany and then say England is in Germany. You can't live with a lie or
> wrong too long.

See above re. London. ;-) As for Berlin: see
http://en.wikipedia.org/wiki/Berlin_(disambiguation)...
(I still fail to see how this would be analogous in any way whatsoever
to your quest.)

Yes, I have responded with a quite large dose of irony. Drier and
to-the-point responses by others seem to have passed unnoticed.

/Kent K

