Hi,

At Fri, 11 Jan 2002 22:19:40 -0500, Glenn Maynard wrote:

> You have to assume that most Japanese systems will display \ as a Yen symbol,
> because they will.

Japanese Windows systems always display \ (0x5C in CP932, which most
people call "Shift JIS") and U+005C as a yen symbol. However, most
Linux/BSD/UNIX systems display \ (0x5C in EUC-JP, the most popular
encoding on Linux/BSD/UNIX systems) and U+005C as a backslash, even in
Japan.

> Now, translation tables for CP932 on these systems could translate
> backslash and the yen symbol both to the yen symbol;

What is "both"? I think you mean both the backslash and the yen symbol.
However, which code points do you think they have in CP932?

Answer: CP932 has the following yen signs and backslash:

CP932 (Shift JIS)                  Unicode (mapped by CP932 table)
------------------------------     -------------------------------
0x5C      (yen sign)               U+005C (yen sign glyph in Windows)
0x81 0x5F (fullwidth backslash)    U+FF3C (fullwidth backslash)
0x81 0x8F (fullwidth yen sign)     U+FFE5 (fullwidth yen sign)

Note that CP932 0x5C (yen sign) is derived from JIS X 0201, while
CP932 0x81 0x5F and CP932 0x81 0x8F are derived from JIS X 0208.
Thus, if you modify the CP932 table to map 0x5C -> U+00A5, it does not
break round-trip compatibility with CP932, because no other CP932 code
is mapped to U+00A5.

In the case of Ogg, I think this can be a solution, because the strings
are never parsed as filenames. However, it cannot be a general solution.

---
Tomohiro KUBOTA <[EMAIL PROTECTED]>
http://www.debian.or.jp/~kubota/
"Introduction to I18N" http://www.debian.org/doc/manuals/intro-i18n/

--
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/
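
The round-trip claim about remapping CP932 0x5C to U+00A5 can be checked
with a small sketch. It assumes Python's built-in "cp932" codec as a
stand-in for the CP932 table, and the two helper functions
(decode_cp932_yen, encode_cp932_yen) are hypothetical names invented for
this illustration, not part of any library.

```python
BACKSLASH = "\u005c"  # what the stock cp932 codec yields for byte 0x5C
YEN = "\u00a5"        # proposed target for byte 0x5C


def decode_cp932_yen(data: bytes) -> str:
    """Decode CP932, mapping single-byte 0x5C to U+00A5 (hypothetical helper)."""
    # After decoding, a 0x5C trail byte inside a double-byte character has
    # already been consumed, so only a real single-byte 0x5C is replaced.
    return data.decode("cp932").replace(BACKSLASH, YEN)


def encode_cp932_yen(text: str) -> bytes:
    """Inverse of decode_cp932_yen: U+00A5 goes back to byte 0x5C."""
    return text.replace(YEN, BACKSLASH).encode("cp932")


if __name__ == "__main__":
    # The three codes from the table, plus 0x83 0x5C (a double-byte
    # character whose trail byte is 0x5C).
    for raw in (b"\x5c", b"\x81\x5f", b"\x81\x8f", b"\x83\x5c"):
        text = decode_cp932_yen(raw)
        back = encode_cp932_yen(text)
        print(raw.hex(), [f"U+{ord(ch):04X}" for ch in text], back == raw)
```

This only checks the direction CP932 -> Unicode -> CP932. Text that
already contains U+005C on the Unicode side would also be encoded as
0x5C and come back as U+00A5, which is one reason the remapping is not
a general solution.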
