J. Cliff Dyer wrote:
"...UCS-2, for example, is a fixed width, 2-byte encoding that can handle any unicode code point up to 0xffff, but cannot handle the 3 and 4 byte extension sets."
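
To make the quoted claim concrete, here is a minimal Python 3 sketch. It is an illustration only, not code from the quoted post, and the helper names utf16_units and fits_ucs2 are made up for this example. It shows that a BMP character takes one 16-bit unit, while a character above U+FFFF needs a surrogate pair, which UTF-16 has and UCS-2 does not.

```python
def utf16_units(ch):
    """Return the 16-bit code units that UTF-16 uses for one character."""
    data = ch.encode("utf-16-be")  # big-endian, no byte order mark
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def fits_ucs2(ch):
    """True if the character is a single BMP code point, i.e. UCS-2 can hold it.

    Surrogate code points (U+D800..U+DFFF) are excluded: they are not characters.
    """
    cp = ord(ch)
    return cp <= 0xFFFF and not (0xD800 <= cp <= 0xDFFF)


bmp = "\u767e"          # 百 (U+767E), inside the BMP
astral = "\U00020000"   # U+20000, first code point of CJK Extension B

print([hex(u) for u in utf16_units(bmp)])      # ['0x767e']
print([hex(u) for u in utf16_units(astral)])   # ['0xd840', '0xdc00']
print(fits_ucs2(bmp), fits_ucs2(astral))       # True False
```

The second character is the kind of thing UCS-2 cannot represent: UTF-16 spends two code units on it, and a fixed-width 2-byte encoding has nowhere to put the second one.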
I was going to reply to say that this is a good point. But on my way I looked up Wikipedia, http://en.wikipedia.org/wiki/UTF-16/UCS-2

Quote:

"In computing, UTF-16 (16-bit Unicode Transformation Format) is a variable-length character encoding for Unicode, capable of encoding the entire Unicode repertoire."

and

"UCS-2 (2-byte Universal Character Set) is an obsolete character encoding which is a predecessor to UTF-16. The UCS-2 encoding form is nearly identical to that of UTF-16, except that it does not support surrogate pairs and therefore can only encode characters in the BMP range U+0000 through U+FFFF."

So, the matter isn't simple. (i.e. it is not decisive to say I'm incorrect in my original criticism of that article's statement on UTF-8.)

------------

Btw, I think I should mention that I read the Unicode 3 specification from cover to cover in 2002. (One heavy, thick, large, deep blue colored book.) Another resource that contributed to my understanding of Unicode is the book "CJKV Information Processing" by Ken Lunde, which I read in the same year.

Also of interest: I learned about a year ago that the Chinese encoding http://en.wikipedia.org/wiki/GB_18030 , which all computers sold in China are required by law to support, is actually a Unicode encoding. Specifically, it encompasses all the chars in Unicode.

Also relevant to our discussion: recently I was looking at alexa.com's web ranking, http://alexa.com/site/ds/top_sites?ts_mode=global&lang=none , and noticed that several pure Chinese-language websites are among the top 100. Baidu.com (百度) is at number 8 today, followed by 腾讯网 (http://www.qq.com) at 12, and 新浪 sina.com.cn at 19, etc. It is somewhat amazing in the context of computing and languages. No other non-English language comes close. (Note here also that Chinese, as measured by number of speakers, is roughly 4 times that of English.
http://en.wikipedia.org/wiki/Ethnologue_list_of_most_spoken_languages This fact, coupled with the development and commercialization of China in the past decade, is the reason for the above web ranking result.)

Not relevant to our discussion, but I happened to also notice a site named youporn.com (it was ranked 69 a few weeks ago). youporn.com is basically like youtube.com, but with porn vids. It has long been my thought that the progress of humanity in a society can be measured by its popularity and acceptance of porn. (In fact I recall seeing some academic (or not) report about this a few months ago... I can't remember where now.) Society as a whole has improved dramatically since the communication revolution, in particular the part that started with the web. (See Xah's Porn Outspeak: http://xahlee.org/PageTwo_dir/Personal_dir/porn_movies.html )

For more info about youporn.com, see: http://en.wikipedia.org/wiki/Youporn The curious might also check out http://en.wikipedia.org/wiki/Youtube which is a major phenomenon and, in my opinion, has contributed to the progress of humanity far more than, say, any university or educational institution.

(My thesis in general in this direction is that communication, the main medium of knowledge, is the utmost factor in the human animal's progress with respect to what's generally considered humanitarianism. More important than, say, the need to decry war, have laws, maintain peace, spread gospels, aid the poor, ... etc. (And in fact, in this thesis, I consider that what are commonly considered good activities, such as aiding the poor, or any moral attitude and activities about the good of humanity (such as OpenSource), are in fact criminal in their effects and almost in their intention too ...))

PS: for some reason, messages posted thru the google groups service for the past week or so are having the unicode double angle bracket chars (U+00AB and U+00BB) stripped off. For that reason, in this msg I've also used double curly quotes "" wherever I would have double angle brackets.

  Xah
  [EMAIL PROTECTED]
∑ http://xahlee.org/