Tomohiro KUBOTA wrote on 2002-04-03 07:51 UTC:
> > The compatibility characters are there in Unicode to allow you to chose
> > to either use the unification rules of JIS or the unification rules of
> > the IRG, at your choice.
>
> For example, IBM's table says that:
>
>   SJIS   UCS
>   8A43   6D77 (in CJK UNIFIED IDEOGRAPHS region)
>   EBA2   FA45 (in CJK COMPATIBILITY IDEOGRAPHS region)
>
> Usage of SJIS 8A43 means that the character is *not* SJIS EBA2.

What is the exact difference between these two? Are these two ideographs
distinguished in any major Japanese dictionary? Unihan suggests that this
is not the case. Where do these two ideographs come from? What do you need
SJIS EBA2 for and when/why was it added?

Unicode has two compatibility ideographs for U+6D77, namely U+FA45 and
U+2F901, both of which are mapped to U+6D77 in Unicode Normalization
Form KC.

  FA45;CJK COMPATIBILITY IDEOGRAPH-FA45;Lo;0;L;6D77;;;;N;;;;;
  2F901;CJK COMPATIBILITY IDEOGRAPH-2F901;Lo;0;L;6D77;;;;N;;;;;

  http://www.unicode.org/unicode/reports/tr15/

Aren't these two compatibility ideographs enough to unambiguously
preserve the SJIS glyph information that you worry about?

> However, UCS 6D77 is a unification of SJIS 8A43 and SJIS EBA2,
> and thus, usage of UCS 6D77 cannot specify SJIS 8A43's glyph.

Would changing IBM's mapping table to map

  SJIS 8A43 -> U+2F901

fix your concern? It is another way of preserving round-trip
compatibility, and after normalization (for those users who don't care
about round-trip compatibility to SJIS), you end up with the exact same
Unicode text.

The example glyph for U+2F901 used in ISO 10646-2:2001 on page 370 looks
more similar to the glyph for U+6D77 used on page 926 of the Unicode 3.0
book than the glyph used on page 666 of Unicode 3.0. Unfortunately,
http://charts.unicode.org/unihan.html seems to be down today, so I can't
have a look at all the various glyph alternatives right now.

> Accidentally, Unicode Consortium's sample glyph for UCS 6D77 is
> same as SJIS EBA2 (and different with SJIS 8A43). Thus, glyphs
> of UCS FA45 and UCS 6D77 in the page 843 of
> http://unicode.org/charts/PDF/UF900.pdf are same, which may have
> confused you.

Is there something wrong with the glyph for U+6D77 used on page 926 of
the Unicode 3.0 book?

Where is the SJIS EBA2 officially defined? The Unicode 3.0 Shift-JIS
index ends at EAA4.
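To make the round-trip idea concrete, here is a minimal Python sketch. It
assumes a Python whose unicodedata module covers Unicode 3.2 or later (so
that U+FA45 is known). The table SJIS_TO_UCS is purely hypothetical: it
encodes the mapping proposed above, not IBM's actual table, and the helper
name fold() is likewise made up for illustration.

```python
import unicodedata

UNIFIED = "\u6d77"         # U+6D77, CJK unified ideograph
COMPAT_BMP = "\ufa45"      # U+FA45, CJK compatibility ideograph
COMPAT_SIP = "\U0002f901"  # U+2F901, CJK compatibility ideograph

# Hypothetical round-trip table (an assumption for illustration only,
# NOT IBM's published mapping): each SJIS code gets its own code point.
SJIS_TO_UCS = {0x8A43: COMPAT_SIP, 0xEBA2: COMPAT_BMP}
UCS_TO_SJIS = {ucs: sjis for sjis, ucs in SJIS_TO_UCS.items()}


def fold(text: str) -> str:
    """Apply Normalization Form KC, discarding the compatibility distinction."""
    return unicodedata.normalize("NFKC", text)


for sjis, ucs in SJIS_TO_UCS.items():
    # Round trip: the SJIS code can be recovered from the Unicode side.
    assert UCS_TO_SJIS[ucs] == sjis
    # After normalization both variants become the unified ideograph.
    assert fold(ucs) == UNIFIED
    print(f"SJIS {sjis:04X} -> U+{ord(ucs):04X} -> NFKC U+{ord(fold(ucs)):04X}")
```

As long as the text is left unnormalized, the two SJIS codes stay
distinguishable; once it is normalized, both collapse to U+6D77 and the
distinction is gone, which is the trade-off described above.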
Attached: ISO 10646-2 page with U+2F901 glyph.

Markus

--
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org,  WWW: <http://www.cl.cam.ac.uk/~mgk25/>
Attachment: U+2F901.gif
