Eli Zaretskii <[EMAIL PROTECTED]> writes:
> On Sat, 27 Oct 2001, David Starner wrote:
>> On Fri, Oct 26, 2001 at 01:35:26PM +0200, Oliver Doepner wrote:
>> > Emacs-Unicode-990824
>> > ----------------------------------------------------------------------
>> > Internal Character code:
>> >
>> > 00 0000 xxxxxxxx xxxxxxxx    Unicode U+0000 - U+FFFF
>> > 00 xxxx xxxxxxxx xxxxxxxx    Unicode 20bit (via surrogate pair)
>> > 01 0000 xxxxxxxx xxxxxxxx    Unicode 20bit (via surrogate pair)
>>
>> Why are astral characters going to be supported by surrogate pairs?
>> That's just ugly, especially if elisp coders have to deal with
>> surrogates. Considering the binary transparency demands, you also need
>> to round-trip surrogate pairs in UTF-8 back to surrogate pairs, not
>> astral characters.
>>
>> On second glance, it doesn't look like you're using surrogate pairs at
>> all. Then why do you mention them? They're just an encoding trick; if
>> you aren't using UTF-16, you can forget about them. The characters above
>> U+FFFF are U+10000, U+10001, etc., not U+D800 U+DC00, U+D800 U+DC01,
>> etc.

> Handa-san, could you please comment on that? David is one of a few
> people who replied to my (quite desperate ;-) message posted to
> gnu.emacs.bug a few days ago.

Please ignore the text "(via surrogate pair)". It means nothing, and
I don't remember why I wrote that part. :-(

Florian Weimer <[EMAIL PROTECTED]> writes:
> What does 'via surrogate pair' mean? I guess the second line should
> read:

>> 00 xxxx xxxxxxxx xxxxxxxx    Unicode 20bit (U+10000 - U+FFFFF)

Yes. That's correct, and the third line should read as below:

   01 0000 xxxxxxxx xxxxxxxx    Unicode 20bit (U+100000 - U+10FFFF)

>> 01 0ppp xxxxxxxx xxxxxxxx    7 64kByte planes reserved for Emacs
>> 01 1ppp xxxxxxxx xxxxxxxx    8 64kByte planes for private use
>> 1x xxxx xxxxxxxx xxxxxxxx    for private use, CNS 3-16, and CCCII
>>
>> Private area is 180000h - 3087FFh

> These are the characters from
>      1 1000 00000000 00000000
> to
>     11 0000 10000111 11111111 .
> Is this range intentional? It looks rather strange.

I don't remember well. :-( Perhaps to fill code-space for CNS 3-16
and CCCII from the tail. They require #xF7800 code points
(== (96*96*14) + (96*96*96)), and #x3087FF == #x3FFFFF - #xF7800.

> Anyway, what does 'private use' mean? Reserved for GNU Emacs, for Lisp
> packages, for the end user?

It seems that we have not yet discussed it in detail.

---
Ken'ichi HANDA
[EMAIL PROTECTED]
-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
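
For anyone who wants to check the arithmetic in the reply above, here is a
minimal Python sketch. It is not taken from the Emacs-Unicode sources. The
names (TAIL_SIZE, REGIONS, classify) are made up for illustration, and the
idea that CNS 3-16 and CCCII occupy exactly the tail above #x3087FF is only
the guess stated in the message, not a confirmed design.

```python
# Sketch of the 22-bit internal code layout discussed in this thread.
# All names here are hypothetical; only the numbers come from the message.

CODE_SPACE_MAX = 0x3FFFFF  # largest value that fits in 22 bits

# Code points said to be needed for CNS 3-16 and CCCII together.
TAIL_SIZE = (96 * 96 * 14) + (96 * 96 * 96)

# End of the private area, if the tail is filled from the top downwards.
PRIVATE_END = CODE_SPACE_MAX - TAIL_SIZE

# (first, last, label), following the corrected table in the message.
REGIONS = [
    (0x000000, 0x00FFFF, "Unicode U+0000 - U+FFFF"),
    (0x010000, 0x0FFFFF, "Unicode U+10000 - U+FFFFF"),
    (0x100000, 0x10FFFF, "Unicode U+100000 - U+10FFFF"),
    (0x110000, 0x17FFFF, "reserved for Emacs"),  # 7 planes of 64k
    (0x180000, PRIVATE_END, "private use"),
    # Assumption: everything above the private area is the CNS/CCCII tail.
    (PRIVATE_END + 1, CODE_SPACE_MAX, "CNS 3-16 / CCCII (assumed tail)"),
]


def classify(code: int) -> str:
    """Return the label of the region that contains the 22-bit code."""
    for first, last, label in REGIONS:
        if first <= code <= last:
            return label
    raise ValueError(f"{code:#x} is outside the 22-bit code space")
```

With these definitions, TAIL_SIZE comes out as #xF7800 and PRIVATE_END as
#x3087FF, matching the figures quoted above. The two bit patterns in
Florian's question are the same bounds written in binary: #x180000 and
#x3087FF.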
