From: "Kazuhiro Kazama" <[EMAIL PROTECTED]>
> From: Jane Liu <[EMAIL PROTECTED]>
> Subject: Shift-JIS/Unicode mapping in JAVA
> Date: Wed, 28 May 2003 12:36:39 -0700 (PDT)
> Message-ID: <[EMAIL PROTECTED]>
>
> > I am running a JAVA program on Japanese Windows 2000 system, looking
> > at the Unicode conversion of the following four characters from
> > Shift-JIS encoding (MS-CP932) in both JRE 1.3.1 and JRE 1.4.1, and
> > noticed some interesting changes:
>
> I guess that you used the charset name "Shift_JIS". Would you try to
> use "Windows-31J"?

I think the canonical name of this encoding should be used, as "Windows-31J" is very uncommon. It seems better to designate the encoding as "CP932" or "windows-932", which Windows and Internet Explorer (and probably many other browsers) also prefer.

It is true that MS-CP932 is NOT Shift-JIS, even though it is mostly compatible with it. It was created a long time ago as an extension of an *old* version of the JIS standard, and it includes characters that were later integrated into Shift_JIS. The current version of Shift_JIS now has more characters than Microsoft codepage 932, but MS-CP932 also includes some characters that are defined in all Microsoft codepages and are still missing from Shift_JIS. These won't be added, now that Shift_JIS has been deprecated in favor of a newer version that includes support for all UniHan and Unicode/ISO 10646 characters.
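To check this on a given JRE, a minimal sketch along the following lines may help. It decodes the same two bytes under each charset name and prints the resulting code points, and it also reports which of the names discussed above the local runtime accepts. The class name `Cp932Demo`, the helper `decodeToCodePoints`, and the sample bytes 0x81 0x60 are assumptions chosen for illustration (the four characters from the original question are not reproduced here). It uses `new String(byte[], Charset)` and `String.format`, so it needs a JRE newer than 1.4.1.

```java
import java.nio.charset.Charset;

class Cp932Demo {
    // Decode the bytes with the named charset and render each UTF-16
    // code unit of the result as U+XXXX, separated by spaces.
    static String decodeToCodePoints(byte[] bytes, String charsetName) {
        String decoded = new String(bytes, Charset.forName(charsetName));
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < decoded.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) decoded.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Sample double-byte sequence (an assumption for illustration);
        // substitute the bytes you are actually investigating.
        byte[] sample = { (byte) 0x81, (byte) 0x60 };

        // Names mentioned in this thread; not every runtime accepts all of them.
        String[] names = { "Shift_JIS", "Windows-31J", "MS932", "windows-932", "CP932" };

        for (String name : names) {
            if (!Charset.isSupported(name)) {
                System.out.println(name + ": not supported by this runtime");
                continue;
            }
            Charset cs = Charset.forName(name);
            System.out.println(name + " (canonical: " + cs.name() + ") -> "
                    + decodeToCodePoints(sample, name));
        }
    }
}
```

If "Shift_JIS" and "Windows-31J" print different code points for the same bytes, the runtime treats them as two distinct charsets, which would be consistent with the change observed between JRE 1.3.1 and JRE 1.4.1. The "canonical" column shows which names are only aliases on that runtime.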

