Re: Unicode in Emacs again

David Starner Sat, 27 Oct 2001 12:36:09 -0700

On Fri, Oct 26, 2001 at 01:35:26PM +0200, Oliver Doepner wrote:
>       Emacs-Unicode-990824
> ----------------------------------------------------------------------
> Internal Character code:
> 
>   00 0000 xxxxxxxx xxxxxxxx   Unicode U+0000 - U+FFFF
>   00 xxxx xxxxxxxx xxxxxxxx   Unicode 20bit (via surrogate pair)
>   01 0000 xxxxxxxx xxxxxxxx   Unicode 20bit (via surrogate pair)


Why are astral characters (those above U+FFFF) going to be supported by
surrogate pairs? That's just ugly, especially if elisp coders have to
deal with surrogates. Given the binary transparency demands, you would
also need to round-trip surrogate pairs found in UTF-8 back to surrogate
pairs, not convert them to astral characters.
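
To make the round-trip concern concrete, here is a minimal sketch in
Python (the language and the variable names are my choice for
illustration, nothing from the Emacs design). It assumes Python's
"surrogatepass" error handler, which allows surrogate code points to be
written as three-byte UTF-8 sequences even though strict UTF-8 forbids
them. The point is only that the two byte strings differ, so a
binary-transparent reader cannot fold one into the other.

```python
# Illustration only: compare the UTF-8 form of an astral character with
# the bytes produced by encoding each half of a surrogate pair separately.

astral = "\U00010000"      # one code point, U+10000
pair = "\ud800\udc00"      # two code points, U+D800 followed by U+DC00

# Regular UTF-8: a single four-byte sequence.
astral_bytes = astral.encode("utf-8")

# Surrogates encoded one by one: two three-byte sequences.
# Strict UTF-8 rejects these; "surrogatepass" lets them through.
pair_bytes = pair.encode("utf-8", "surrogatepass")

# Decoding the six-byte form the same way yields the two surrogates
# again, not U+10000, so the original bytes can be regenerated.
roundtrip = pair_bytes.decode("utf-8", "surrogatepass")

print(astral_bytes.hex(" "))
print(pair_bytes.hex(" "))
print(len(astral), len(roundtrip))
```

If the reader merged the pair into U+10000 on input, writing the buffer
back out would give the four-byte form and the file would have changed.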

On second glance, it doesn't look like you're using surrogate pairs at
all. Then why do you mention them? They're just an encoding trick; if
you aren't using UTF-16, you can forget about them. The characters above
U+FFFF are U+10000, U+10001, etc., not U+D800 U+DC00, U+D800 U+DC01,
etc.
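
For reference, the UTF-16 arithmetic behind that correspondence can be
sketched as follows. This is a minimal Python illustration; the function
names to_surrogate_pair and from_surrogate_pair are hypothetical and
exist only for this example. It shows that the pair is purely a
serialization of the code point and carries no extra information.

```python
# Illustration only: the UTF-16 mapping between a code point above
# U+FFFF and its surrogate pair.

def to_surrogate_pair(cp):
    """Return the (high, low) UTF-16 surrogates for an astral code point."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("not an astral code point: %#x" % cp)
    v = cp - 0x10000                      # 20-bit offset
    return 0xD800 + (v >> 10), 0xDC00 + (v & 0x3FF)


def from_surrogate_pair(high, low):
    """Return the code point encoded by a (high, low) surrogate pair."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


for cp in (0x10000, 0x10001, 0x10FFFF):
    high, low = to_surrogate_pair(cp)
    print("U+%X <-> U+%X U+%X" % (cp, high, low))
```

An internal representation that stores the code point directly never
needs either function; they matter only at a UTF-16 boundary.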

-- 
David Starner - [EMAIL PROTECTED]
Pointless website: http://dvdeug.dhis.org
"I saw a daemon stare into my face, and an angel touch my breast; each 
one softly calls my name . . . the daemon scares me less."
- "Disciple", Stuart Davis
-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
