Although I appreciate the great effort people are putting into this
communication, my resources are too limited to take part in it. Please let
me change the subject for this off-topic discussion...

--

Dear Jim,

I haven't had any occasion to poke around at 21-bit Unicode
codepoints. The JIS standards only have 303 kanji with them; all added
in the JIS X 0213 standard introduced in 2000.

I understand that your background is the academic study of the Japanese
language, but is there any special reason to mention JIS X 0213 in a
discussion of UTF-8, which is a general-purpose encoding scheme?

In Japan, many systems in operation are still restricted to JIS X 0208,
especially in the public sector. Also, customers of Japanese printing
factories are often expected to prepare their data with JIS X 0208 plus PUA
glyphs, instead of the full repertoire of CJK Unified Ideographs.

On the other hand, young people in Japan post many non-JIS characters to
their SNS accounts, such as so-called "emoji", or Indic or Arabic characters
used to compose their favorite face symbols.

I think the prevalence of "21-bit Unicode codepoints" in Japanese text
depends heavily on the category of the text.
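
One rough way to check this for a given body of text is to count the code
points above U+FFFF, that is, the ones that need the 4-byte UTF-8 form. The
following is only a minimal sketch in Python; the function name
supplementary_ratio is hypothetical, and it measures nothing beyond the
share of such code points in a string.

```python
def supplementary_ratio(text: str) -> float:
    """Return the share of code points in `text` that lie above U+FFFF.

    Hypothetical helper for illustration: it only counts code points,
    and does not check membership in any JIS standard.
    """
    if not text:
        return 0.0
    beyond_bmp = sum(1 for ch in text if ord(ch) > 0xFFFF)
    return beyond_bmp / len(text)


if __name__ == "__main__":
    samples = {
        "BMP kanji only": "\u6f22\u5b57",             # two BMP kanji
        "Plane 2 kanji": "\U00020B9F\u308b",          # one Plane 2 kanji + hiragana
        "emoji": "\U0001F600\U0001F600",              # two emoji from Plane 1
    }
    for label, text in samples.items():
        print(f"{label}: {supplementary_ratio(text):.2f}")
```

Note that Indic and Arabic letters sit inside the BMP, so this count would
not see them even though they are outside the JIS repertoire.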

Regards,
mpsuzuki


On 2024/11/08 15:04, Jim Breen via Unicode wrote:
On Fri, 8 Nov 2024 at 11:37, Markus Scherer <[email protected]> wrote:
On Thu, Nov 7, 2024 at 3:03 PM Jim Breen via Unicode <[email protected]> 
wrote:

On rare occasions, I need to dig into UTF-8 at the bit level. I have a
note pinned near my desk as an aide memoire. It has 3 lines:

UTF-8
zzzzyyyyyyxxxxxx
1110zzzz 10yyyyyy 10xxxxxx

11110nnn 10zzzzzz 10yyyyyy 10xxxxxx
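
As a minimal sketch of the two layouts in the note above, the bit fields can
be packed by hand in Python as follows. The function names encode_3byte and
encode_4byte are hypothetical (not from any library), and the letters n, z,
y, x in the comments follow the note.

```python
def encode_3byte(cp: int) -> bytes:
    """Pack U+0800..U+FFFF as 1110zzzz 10yyyyyy 10xxxxxx."""
    if not (0x0800 <= cp <= 0xFFFF) or (0xD800 <= cp <= 0xDFFF):
        raise ValueError("not a 3-byte UTF-8 scalar value")
    return bytes([
        0xE0 | (cp >> 12),          # 1110zzzz: top 4 bits
        0x80 | ((cp >> 6) & 0x3F),  # 10yyyyyy: middle 6 bits
        0x80 | (cp & 0x3F),         # 10xxxxxx: low 6 bits
    ])


def encode_4byte(cp: int) -> bytes:
    """Pack U+10000..U+10FFFF as 11110nnn 10zzzzzz 10yyyyyy 10xxxxxx."""
    if not (0x10000 <= cp <= 0x10FFFF):
        raise ValueError("not a 4-byte UTF-8 scalar value")
    return bytes([
        0xF0 | (cp >> 18),           # 11110nnn: top 3 bits
        0x80 | ((cp >> 12) & 0x3F),  # 10zzzzzz
        0x80 | ((cp >> 6) & 0x3F),   # 10yyyyyy
        0x80 | (cp & 0x3F),          # 10xxxxxx
    ])


if __name__ == "__main__":
    # Cross-check against Python's own UTF-8 encoder.
    for cp in (0x6F22, 0x20B9F):
        manual = encode_3byte(cp) if cp <= 0xFFFF else encode_4byte(cp)
        print(f"U+{cp:04X}", manual.hex(" "), manual == chr(cp).encode("utf-8"))
```

The 3 + 6 + 6 + 6 payload bits of the 4-byte form are where the "21-bit"
figure comes from.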

I haven't had any occasion to poke around at 21-bit Unicode
codepoints. The JIS standards only have 303 kanji with them; all added
in the JIS X 0213 standard introduced in 2000.

[As I wrote in my "A Brief History of Japanese Character Set
Standards" (https://www.edrdg.org/~jwb/paperdir/kanjicomp.html) "the
main lasting impact of the JIS X 0213 standard will probably be the
additional 303 kanji it contributed to Unicode."]

Jim

