On Tue, 15 Oct 2002, "Stefan Persson" wrote:

> That font also includes some characters mapped to the PUA: A � sign, and
> several 漢 character, many of which look like radicals. Why? Is that
> something that's also required by that law?

It's my experience that many fonts include gunk in the Private Use Area. A quick check of some of the CJK glyphs in the PUA of SimSun-18030 shows that they are not unique: they are also mapped to codepoints in, for example, the CJK Radicals Supplement and CJK-A blocks. I believe that the intention is to maintain a one-to-one correspondence between the GB18030 standard and Unicode, so there should be no need for any supplementary glyphs in the PUA.

The new PRC law is, as you hint, overly restrictive and prescriptive, and is, I think, a serious setback for the popularisation of Unicode on the Web. The intent is that GB18030 should replace GB2312 and Big5, so that instead of the current mishmash of GB2312 (SC) and Big5 (TC) websites, in the future Traditional and Simplified Chinese sites (at least those hosted in China) will use the same GB18030 encoding.

Where does this leave websites written in Unicode Chinese? Out in the cold! At present, web pages written in Unicode Chinese (some of mine, for example) are not being indexed by Google, and are ignored by both Yahoo China (SC) and Chinese Yahoo (TC). The situation will certainly not be improved by the replacement of GB2312 and Big5 with GB18030.

Andrew
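
A footnote on the one-to-one correspondence mentioned above: the following is only a minimal sketch, assuming Python 3 and its bundled gb18030 codec, neither of which is part of the original discussion. The helper names is_pua and roundtrips_gb18030 and the sample characters are hypothetical choices for illustration. It shows two things: how to tell whether a codepoint lies in the BMP Private Use Area, and that ordinary CJK codepoints (including ones from the radicals and Extension A blocks) survive a trip through GB18030 and back unchanged.

```python
# Illustrative sketch only: helper names and sample characters are assumptions.
PUA_START, PUA_END = 0xE000, 0xF8FF  # BMP Private Use Area range


def is_pua(ch: str) -> bool:
    """Return True if the single character lies in the BMP Private Use Area."""
    return PUA_START <= ord(ch) <= PUA_END


def roundtrips_gb18030(text: str) -> bool:
    """Return True if text is unchanged after encoding to GB18030 and decoding back."""
    return text.encode("gb18030").decode("gb18030") == text


# U+6F22 (a common ideograph), U+2E80 (CJK Radicals Supplement),
# U+3400 (CJK Unified Ideographs Extension A)
samples = ["\u6f22", "\u2e80", "\u3400"]

for ch in samples:
    encoded = ch.encode("gb18030")
    print(
        f"U+{ord(ch):04X} pua={is_pua(ch)} "
        f"gb18030={encoded.hex()} roundtrip={roundtrips_gb18030(ch)}"
    )
```

None of the sample codepoints is in the PUA, which is the point: a glyph that already has a regular codepoint with a GB18030 mapping has no need of a second, private-use mapping in the font.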

