S***, it seems I made a mistake. Font selection in Windows 2000 is not at all as flexible as in Java; it's more like Linux. It's just that the default font in the Simplified Chinese version is still Tahoma instead of Song Ti.
Jungshik must be right that I could change the default font in the zh_CN locale to make ASCII characters appear nicer. The only problem is that the standard locale for Simplified Chinese in Red Hat 8.0 (which I use) is zh_CN.GB18030. I was told that it was possible to change that to zh_CN.UTF-8, but I have not found the motivation or time to do that.

Regarding the 'A' APIs in Windows: do you mean that there should be some API to change the interpretation of strings in the 'A' APIs (especially regarding file names, etc.)? If that were the case, the OS would have to speak Unicode in some form internally. In my previous message I interpreted your talk about UTF-8 in the 'A' APIs as meaning that everything would be encoded in UTF-8 (instead of the language-specific encodings), which I thought could not have been acceptable at the time of Windows 95.
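For illustration, here is a rough sketch (not taken from any Windows documentation; the function name and fixed-size buffer are placeholders) of how an 'A' entry point can sit on top of the corresponding 'W' entry point on NT-class Windows. The char*-based interface itself does not care which codepage the bytes are in; only the conversion step does, so the same interface could in principle carry UTF-8 simply by converting with CP_UTF8 instead of the ANSI codepage.

    #include <windows.h>

    /* Sketch of an 'A'-style wrapper over a 'W' API (made-up name). */
    HANDLE OpenForReadingA(const char *name_a)
    {
        wchar_t name_w[MAX_PATH];

        /* CP_ACP is the process's ANSI codepage (936/GBK on Simplified
           Chinese Windows); substituting CP_UTF8 would let the same
           char* parameter carry UTF-8. Error checks omitted for brevity. */
        MultiByteToWideChar(CP_ACP, 0, name_a, -1, name_w, MAX_PATH);

        return CreateFileW(name_w, GENERIC_READ, FILE_SHARE_READ, NULL,
                           OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    }
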
When talking about the file system, I really like NTFS much better. The POSIX file system is *too* simple. I hate the fact that when I switch from en_US.UTF-8 to zh_CN.GB18030, file names with characters beyond ASCII become corrupt. If the file is on a Windows partition, it is possible to remount the partition with the appropriate encoding; if it is on an EXT2/3 partition or on a CD-ROM, then I am out of luck. Maybe the mount tool should do something to handle this? :-)

Best regards,

Wu Yongwei

--- Original Message from Jungshik Shin ---

On Thu, 10 Jul 2003, Wu Yongwei wrote:

> Jungshik Shin wrote:
>
> > I think it's not so much due to defects in programs as due to the lack of
> > high-quality fonts. These days, most Linux distributions come with free
> > truetype fonts for zh, ja, ko, th and other Asian scripts. However,
> > the number and the quality of fonts for the Linux desktop are still
> > inferior to those for Windows.
>
> The problem is mainly not the font itself, but the font combination. I really
> cannot bear the display of ASCII characters in Song Ti, which is simply ugly
> (and fixed width).

Why don't you specify a variable-width font as the system default? I understand you still don't like Latin glyphs in Chinese fonts. I hate Latin glyphs in Korean fonts, too.

> locale Linux seems to be able to do so, but in the Chinese locale all is in
> the Chinese font, which is not suitable at all for Latin characters.

I don't think there's any difference between English and Chinese locales, provided that you meant en_*.UTF-8 and zh_*.UTF-8. You may get the impression that it works under en_US.UTF-8 because the 'system default font' for en_US.UTF-8 does not cover Chinese characters, so the automatic font selection mechanism picks up a Chinese font for Chinese characters while using the default font for Latin letters. On the other hand, in zh_*.UTF-8 the system default font covers Latin letters as well as Chinese characters, so both Latin and Chinese are rendered with the default font. A way to work around this is to specify your favorite Latin font ahead of your Chinese font wherever a CSS-style font list can be used.

> Beginning with Windows 2000, Windows could choose the
> font to use based on the Unicode range (Java does this too). In the English

This is a good feature to have, although a CSS-style font list works most of the time. Almost everything we need for this is already in place (fontconfig, pango). BTW, I haven't seen this available in Win2k. How can I do that? (Not that I don't believe you, but I'm curious.)

> I used a Windows Gtk application, which used Tahoma (a good sans-serif
> font) at first. But after an upgrade it automatically chose to use the
> system default font, which is the Chinese Song Ti. It took me several hours
> to "correct" the ugly and corrupt (yes, because dialogue dimensions are
> different) display.

Again, I haven't run Gtk programs under Win32, so I don't know how they select fonts. Do they use fontconfig? fontconfig can make a big difference.

> >> There seems little sense now arguing the virtues of UTF-8 and UTF-16.
> >> Technically they both have advantages and disadvantages. I suppose we
>
> > If MS had decided to use UTF-8 (instead of coming up with a whole new
> > set of APIs for UTF-16) with 'A' APIs, Mozilla developers' headache (and ....
> > UTF-8/'A' APIs vs UTF-16/'W' APIs and there are many other things to
> > consider in case of Win32.
>
> It seems impossible because there are so many legacy applications. On the
> Simplified Chinese versions of Windows, 'A' always implies GB2312/GBK.
> Switching ALL to UTF-8 seems too radical an idea around 1994. At the time

Using 'A' APIs with UTF-8 does not mean that 'A' APIs are made to work ONLY with UTF-8. As you know well, 'A' APIs are basically APIs that deal with 'char *'. As such, in theory, they can be used with any single-byte or multibyte encoding, including Windows 932, 936, 949, 950 and 6xxxx (I forgot the codepage designation for UTF-8). As Unix (e.g. Solaris and AIX, and to a lesser degree Linux) demonstrated, a single application (written to support multibyte encodings) can work well both under legacy-encoding-based locales and under UTF-8 locales.
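To make that concrete, here is a minimal sketch of such locale-agnostic code (my own illustration, assuming a POSIX-ish C library; the command-line argument is just an example). It counts the characters in its argument using only the current locale's multibyte conversion, so the same source works unchanged under zh_CN.GB18030, other legacy-encoding locales, and UTF-8 locales.

    #include <locale.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        size_t n;

        /* Adopt the user's locale, e.g. zh_CN.GB18030 or zh_CN.UTF-8. */
        setlocale(LC_ALL, "");

        if (argc < 2) {
            fprintf(stderr, "usage: %s <text in the locale encoding>\n", argv[0]);
            return 1;
        }

        /* Count characters without knowing which multibyte encoding the
           locale uses; a NULL destination only asks for the length. */
        n = mbstowcs(NULL, argv[1], 0);
        if (n == (size_t)-1) {
            perror("mbstowcs");
            return 1;
        }
        printf("%lu characters\n", (unsigned long)n);
        return 0;
    }

The only per-encoding knowledge lives in the C library's locale machinery, which is exactly why one binary can serve both kinds of locales.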
> Microsoft adopted Unicode, people might truly believe UCS-2 is enough for
> most applications, and Microsoft did not have the file-name compatibility
> burden that exists in Unix

Well, this is an orthogonal issue. The POSIX file system is so 'simple' (which is a virtue in some respects) that it doesn't have an inherent notion of codeset/encoding/charset. However, Windows doesn't use the POSIX file system, and using 'A' APIs does NOT mean that they couldn't use VFAT or NTFS, where filenames are stored in a form of Unicode.

> (I suppose you all know that the long file names in Windows are in
> UTF-16).

Actually, VFAT documentation is so hard to come by that we can only speculate that it's UTF-16 (it could well be just UCS-2 in Windows 95).

> I would not blame Microsoft for this.

I wouldn't either, and I didn't mean to. I believe they weighed all the pros and cons of the different options and decided to go with their two-tiered API approach. In my previous message, I just gave a downside to that approach, aggregating all the other arguments into the single phrase 'there are many other things to consider.....'

> Also consider the following
> fact: Windows 95 emerged at a time when many people had only 8MB of RAM.
> Yah, I don't think AT THAT TIME we could tolerate a 50% growth in memory
> occupation.

Windows 95/98/ME are not Unicode-enabled in many senses, while Win 2k/XP (NT4 to a lesser degree) are [1]. Therefore, it was not an issue for Win95 in 1994/95 simply because Win95 still used legacy encodings.

[1] Win 9x/ME is rather like a POSIX system running under locales with legacy encodings, whereas Win 2k/XP is similar to a POSIX system running under UTF-8 locales.

Jungshik

--
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/