Eli Zaretskii <[EMAIL PROTECTED]> writes:
> On Sat, 27 Oct 2001, David Starner wrote:
>>  On Fri, Oct 26, 2001 at 01:35:26PM +0200, Oliver Doepner wrote:
>>  >   Emacs-Unicode-990824
>>  > ----------------------------------------------------------------------
>>  > Internal Character code:
>>  > 
>>  >   00 0000 xxxxxxxx xxxxxxxx   Unicode U+0000 - U+FFFF
>>  >   00 xxxx xxxxxxxx xxxxxxxx   Unicode 20bit (via surrogate pair)
>>  >   01 0000 xxxxxxxx xxxxxxxx   Unicode 20bit (via surrogate pair)
>>  
>>  Why are astral characters going to be supported by surrogate pairs?
>>  That's just ugly, especially if elisp coders have to deal with
>>  surrogates. Considering the binary transparency demands, you also need
>>  to round-trip surrogate pairs in UTF-8 back to surrogate pairs, not
>>  astral characters.
>>  
>>  On second glance, it doesn't look like you're using surrogate pairs at
>>  all. Then why do you mention them? They're just an encoding trick; if
>>  you aren't using UTF-16, you can forget about them. The characters above
>>  U+FFFF are U+10000, U+10001, etc., not U+D800 U+DC00, U+D800 U+DC01,
>>  etc.
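
To illustrate David's point, here is a minimal sketch of the UTF-16 surrogate arithmetic. The function names are hypothetical and exist only for this example; nothing here is taken from the Emacs sources. It shows that a surrogate pair is just a way of serializing a code point above U+FFFF, not a separate character.

```python
def to_surrogates(cp):
    """Split a code point above U+FFFF into a UTF-16 (high, low) surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("only U+10000..U+10FFFF are encoded with surrogates")
    v = cp - 0x10000                     # 20-bit offset into the astral range
    return 0xD800 + (v >> 10), 0xDC00 + (v & 0x3FF)


def from_surrogates(high, low):
    """Recombine a UTF-16 surrogate pair into the code point it encodes."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


# U+10000 and U+10001 are single characters; the pairs appear only in UTF-16.
print([hex(u) for u in to_surrogates(0x10000)])
print([hex(u) for u in to_surrogates(0x10001)])
```

An internal representation that stores the code point itself never needs either function; they matter only when reading or writing UTF-16.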

> Handa-san, could you please comment on that?  David is one of a few
> people who replied to my (quite desperate ;-) message posted to 
> gnu.emacs.bug a few days ago.

Please ignore the text "(via surrogate pair)".  It means
nothing, and I don't remember why I wrote that part.  :-(

Florian Weimer <[EMAIL PROTECTED]> writes:
> What does 'via surrogate pair' mean?  I guess the second line should
> read:

>>    00 xxxx xxxxxxxx xxxxxxxx   Unicode 20bit (U+10000 - U+FFFFF)

Yes.  That's correct, and the third line should read as below:

   01 0000 xxxxxxxx xxxxxxxx   Unicode 20bit (U+100000 - U+10FFFF)

>>    01 0ppp xxxxxxxx xxxxxxxx   7 64kByte planes reserved for Emacs
>>    01 1ppp xxxxxxxx xxxxxxxx   8 64kByte planes for private use
>>    1x xxxx xxxxxxxx xxxxxxxx   for private use, CNS 3-16, and CCCII
>>  
>>      Private area is 180000h - 3087FFh

> These are the characters from
>      1 1000 00000000 00000000
> to
>     11 0000 10000111 11111111 .
> Is this range intentional?  It looks rather strange.
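
The bit patterns above can be checked against the hexadecimal bounds with a short sketch. The table is read off the 22-bit layout quoted in this thread, with the corrected Unicode ranges. The hexadecimal boundaries of the "01 0ppp" and "01 1ppp" rows are my own reading of the bit patterns, and `LAYOUT`, `bits` and `classify` are hypothetical names used only for illustration.

```python
# (first, last, description) for each row of the 22-bit internal code layout.
# Assumption: "01 0ppp" covers ppp = 001..111, since ppp = 000 is Unicode.
LAYOUT = [
    (0x000000, 0x00FFFF, "Unicode U+0000 - U+FFFF"),
    (0x010000, 0x0FFFFF, "Unicode U+10000 - U+FFFFF"),
    (0x100000, 0x10FFFF, "Unicode U+100000 - U+10FFFF"),
    (0x110000, 0x17FFFF, "7 planes reserved for Emacs"),
    (0x180000, 0x1FFFFF, "8 planes for private use"),
    (0x200000, 0x3FFFFF, "private use, CNS 3-16, and CCCII"),
]


def bits(pattern):
    """Convert a spaced binary pattern such as '1 1000 ...' to an integer."""
    return int(pattern.replace(" ", ""), 2)


def classify(code):
    """Return the description of the layout row that contains `code`."""
    for first, last, description in LAYOUT:
        if first <= code <= last:
            return description
    raise ValueError("code does not fit in 22 bits")


print(hex(bits("1 1000 00000000 00000000")))   # lower bound of the private area
print(hex(bits("11 0000 10000111 11111111")))  # upper bound of the private area
```

The two patterns come out as the 180000h and 3087FFh bounds given for the private area, so the area starts on a row boundary but ends partway through the last row.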

I don't remember well.  :-(  Perhaps the intent was to allocate the
code space for CNS 3-16 and CCCII from the tail.  They require #xF7800
code points (== (96*96*14) + (96*96*96)), and #x3087FF ==
#x3FFFFF - #xF7800.
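
The arithmetic can be verified with a few lines. This is only a check of the numbers quoted above; the variable names are hypothetical, and the reading that the reserved block therefore starts at #x308800 is an inference from those numbers, not something stated in the original design notes.

```python
CODE_SPACE_END = 0x3FFFFF            # largest 22-bit internal code

# Code points required for CNS 3-16 and CCCII, as given above.
needed = 96 * 96 * 14 + 96 * 96 * 96

# Reserving `needed` codes from the tail leaves this as the last private code.
private_end = CODE_SPACE_END - needed
reserved_start = private_end + 1

print(hex(needed), hex(private_end), hex(reserved_start))
```

Under that reading, the block from `reserved_start` through `CODE_SPACE_END` holds exactly `needed` codes, which would explain why the private area stops at the odd-looking value 3087FFh.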

> Anyway, what does 'private use' mean? Reserved for GNU Emacs, for Lisp
> packages, for the end user?

It seems that we have not yet discussed it in detail.

---
Ken'ichi HANDA
[EMAIL PROTECTED]
-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
