On Saturday, March 30, 2002, at 04:44 , Jarkko Hietaniemi wrote:
> Gentlemen, you may want to read Unicode 3.2
> ( http://www.unicode.org/unicode/reports/tr28/ ) It does say something
> about Han, Katakana, and Hangul (sections 10.1, 10.3, and 10.4). (No,
> I don't know what happened to 10.2).  What I'm after is whether the
> said CJK changes affect Encode?

   For Japanese, I doubt it, at least for the time being.  JIS X 
0213:2000, as you see, is only two years old, and the encodings that 
support it are not popular yet.
   Support will take the form of ADDITION, not MODIFICATION, at least 
as far as JIS X 0213 is concerned.

   But let me post a summary of the (proposed) encodings for JIS X 0213 
for the record.

(See also http://www.asahi-net.or.jp/~wq6k-yn/code/enc-x0213.html if 
your browser supports Japanese)

JIS X 0213
==========

In short: a tidied-up (JIS X 0208 + JIS X 0212).  It consists of two 
94x94 planes.  Plane 1 corresponds to 0208 and plane 2 to 0212, but some 
of the code points are rearranged, so 0213-1 != 0208 and 0213-2 != 0212.

EUC-JISX0213
============

The encoding scheme is the same as EUC-JP.  Here is the layout:

        G0      US-ASCII
        G1      JISX0213-1
        (G2     JISX0201-kana (deprecated))
        G3      JISX0213-2

The technical difficulty is minimal.  All I need is a table.  I may make 
a UCM out of the Unihan DB and post it as something like 
Encode::JPExtra.
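
As a rough illustration of the G0-G3 layout above, here is a minimal 
Python sketch that splits an EUC byte string into code-set tokens.  It 
assumes the usual EUC conventions (SS2 = 0x8E introduces one G2 byte, 
SS3 = 0x8F introduces two G3 bytes); split_euc and the labels are made-up 
names for this example, not part of Encode or any library.

```python
# Sketch only: split an EUC-style byte string into (code set, bytes) tokens.
# split_euc is a hypothetical helper, not an existing API.
SS2 = 0x8E  # single shift 2: the next byte comes from G2
SS3 = 0x8F  # single shift 3: the next two bytes come from G3


def split_euc(data: bytes):
    """Return a list of (code_set, raw_bytes) tuples for an EUC byte string."""
    tokens = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            kind, size = "G0", 1          # US-ASCII
        elif b == SS2:
            kind, size = "G2", 2          # SS2 + one kana byte
        elif b == SS3:
            kind, size = "G3", 3          # SS3 + two bytes (plane 2)
        elif 0xA1 <= b <= 0xFE:
            kind, size = "G1", 2          # two bytes (plane 1)
        else:
            raise ValueError(f"unexpected byte 0x{b:02x} at offset {i}")
        chunk = data[i:i + size]
        if len(chunk) < size:
            raise ValueError(f"truncated sequence at offset {i}")
        # Bytes carrying the character itself must lie in 0xA1..0xFE.
        payload = chunk if kind == "G1" else chunk[1:]
        if kind != "G0" and any(not 0xA1 <= c <= 0xFE for c in payload):
            raise ValueError(f"bad trailing byte at offset {i}")
        tokens.append((kind, chunk))
        i += size
    return tokens


if __name__ == "__main__":
    for kind, raw in split_euc(b"A\xa4\xa2\x8e\xb1\x8f\xa1\xa1"):
        print(kind, raw.hex())
```

The tokenizer only looks at byte structure.  Which character a G1 or G3 
token stands for still depends on the table, which is why the same code 
works unchanged for EUC-JP and EUC-JISX0213.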

When in use, this encoding supersedes EUC-JP, because you can't tell the 
two apart by looking at a given string.  You must explicitly set your 
encoding to either this or "classical" EUC-JP.
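
To make the ambiguity concrete, here is a small Python sketch (Python 
rather than Perl simply because its standard library happens to ship 
both an euc_jp and an euc_jisx0213 codec).  decode_both is a made-up 
helper name.  The point is only that ordinary text is valid under both 
labels, so the bytes alone cannot tell you which one was meant.

```python
# Sketch only: decode the same bytes under both EUC labels.
# decode_both is a hypothetical helper; the codec names are Python's own.
def decode_both(data: bytes):
    """Decode data as EUC-JP and as EUC-JISX0213; None marks a failure."""
    results = {}
    for name in ("euc_jp", "euc_jisx0213"):
        try:
            results[name] = data.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results


if __name__ == "__main__":
    sample = "日本語".encode("euc_jp")
    outcome = decode_both(sample)
    # Both decoders accept the bytes, so a detector has nothing to go on.
    print(outcome["euc_jp"] == outcome["euc_jisx0213"])
```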

ISO-2022-JP-3
=============

Basically, this one is ISO-2022-JP with new escape sequences.

Esc. Seq.               Charset
------------------------
ESC $(O                 JISX0213-1
ESC $(P                 JISX0213-2

This one is easy, too.

Unlike EUC-JISX0213, this one EXTENDS ISO-2022-JP, so the old 0208/0212 
and the new 0213 can coexist, thanks to the escape sequences.
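
A minimal Python sketch of why coexistence works: the escape sequences 
label each run of bytes with its charset, so a reader only has to track 
the current designation.  split_iso2022 is a made-up name, and the table 
is an assumption that covers only the two sequences listed above plus 
the well-known ISO-2022-JP / ISO-2022-JP-1 ones.

```python
# Sketch only: split an ISO-2022-JP style byte stream into charset runs.
# split_iso2022 is a hypothetical helper, not an existing API.
ESC = 0x1B

# Bytes following ESC, and the charset they designate.
DESIGNATIONS = {
    b"(B": "US-ASCII",
    b"(J": "JISX0201-Roman",
    b"$@": "JISX0208-1978",
    b"$B": "JISX0208-1983",
    b"$(D": "JISX0212",      # from ISO-2022-JP-1
    b"$(O": "JISX0213-1",    # new in ISO-2022-JP-3
    b"$(P": "JISX0213-2",    # new in ISO-2022-JP-3
}


def split_iso2022(data: bytes):
    """Return a list of (charset, raw_bytes) runs; the stream starts in ASCII."""
    segments = []
    charset = "US-ASCII"
    start = 0
    i = 0
    while i < len(data):
        if data[i] != ESC:
            i += 1
            continue
        if i > start:
            segments.append((charset, data[start:i]))
        for seq, name in DESIGNATIONS.items():
            if data.startswith(seq, i + 1):
                charset = name
                i += 1 + len(seq)
                break
        else:
            raise ValueError(f"unknown escape sequence at offset {i}")
        start = i
    if start < len(data):
        segments.append((charset, data[start:]))
    return segments


if __name__ == "__main__":
    stream = b'abc\x1b$B$"\x1b$(O!!\x1b(Bxyz'
    for name, raw in split_iso2022(stream):
        print(name, raw)
```

Old 0208 text and new 0213 text end up in separate, explicitly labelled 
runs, which is exactly what EUC-JISX0213 cannot offer.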

Shift_JISX0213
==============

And the most controversial one.  This one squeezes the new characters 
into the code space that Shift_JIS left unused.  Shift_JIS was already 
acrobatic and this one is a nightmare.  However, it is also 2 bytes max, 
so supporting it is not that hard.  But unlike the cases above, I need a 
UTF-8 => Shift_JISX0213 mapping rather than one for vanilla JISX0213, 
and I am not sure whether one is available.  I'll look into it.
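
For a feel of the acrobatics, here is a Python sketch of the classic 
Shift_JIS arithmetic that folds a 94x94 row/cell pair into two bytes, 
plus a rough map of the lead-byte space.  The function names are made 
up.  The formula covers only the 0208-style plane, and the 0xF0-0xFC 
range for plane 2 is stated as my understanding of the proposal rather 
than a checked fact.  The plane 2 row mapping is irregular and is not 
attempted here.

```python
# Sketch only: classic Shift_JIS folding of a 94x94 (row, cell) pair.
# sjis_from_kuten and lead_byte_region are hypothetical helper names.
def sjis_from_kuten(row: int, cell: int) -> bytes:
    """Convert a 1-based (row, cell) pair to two Shift_JIS bytes."""
    if not (1 <= row <= 94 and 1 <= cell <= 94):
        raise ValueError("row and cell must be in 1..94")
    # Two rows share one lead byte; the lead range is split in two pieces.
    lead = (row + 0x101) // 2 if row <= 62 else (row + 0x181) // 2
    if row % 2:
        # Odd rows use trail bytes 0x40..0x9E, skipping 0x7F.
        trail = cell + 0x3F if cell <= 63 else cell + 0x40
    else:
        # Even rows use trail bytes 0x9F..0xFC.
        trail = cell + 0x9E
    return bytes([lead, trail])


def lead_byte_region(b: int) -> str:
    """Rough classification of a first byte (assumed layout, see above)."""
    if 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xEF:
        return "double byte, plane 1 area (already used by Shift_JIS)"
    if 0xF0 <= b <= 0xFC:
        return "double byte, area left unused by Shift_JIS"
    if 0xA1 <= b <= 0xDF:
        return "single byte, JIS X 0201 kana"
    return "single byte"


if __name__ == "__main__":
    # Row 4, cell 2 is HIRAGANA LETTER A in JIS X 0208.
    print(sjis_from_kuten(4, 2).hex(), "あ".encode("shift_jis").hex())
```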

As for Hangul, I'll let the experts like Jungshik review the impact.

Dan the Man with Even more Encodings
