Frank Atanassow <[EMAIL PROTECTED]> wrote,
> George Russell writes:
> > Marcin 'Qrczak' Kowalczyk wrote:
> > > As for the language standard: I hope that Char will be allowed or
> > > required to have >=30 bits instead of current 16; but never more than
> > > Int, to be able to use ord and chr safely.
> > Er, does it have to? The Java Virtual Machine implements Unicode with
> > 16 bits. (OK, so I suppose that means it can't cope
> > with Korean or Chinese.)
>
> Just to set the record straight:
>
> Many CJK (Chinese-Japanese-Korean) characters are
> encodable in 16 bits. I am not so familiar with the
> Chinese or Korean situations, but in Japan there is a
> nationally standardized subset of about 2000 characters
> called the Jyouyou ("often-used") kanji, which newspapers
> and most printed books are supposed to
> respect. These are all strictly contained in the 16-bit
> space. One only needs the additional 16 bits for foreign
> characters (say, Chinese), older literary works and
> suchlike. Even then, since Japanese has two phonetic
> alphabets as well, you can usually substitute
> phonetic characters in place of non-Jyouyou
> kanji. In fact, since these kanji are considered
> difficult, one often _does_ supplement the ideographic
> representation with a phonetic one. Of course, using only
> phonetic characters in such cases would look
> unprofessional in some contexts, and it forces the reader
> to guess at which word was meant...
The problem with restricting yourself to the Jouyou-Kanji is
that you have a hard time with names (of persons and
places). Many exotic and otherwise unused Kanji are used in
names (for historical reasons), and since the Kanji
representation of a name is the official identifier, it is
rather bad form to write a person's name in Kana (the
phonetic alphabets).
Cheers,
Manuel