On Fri, Oct 26, 2001 at 01:35:26PM +0200, Oliver Doepner wrote:
> Emacs-Unicode-990824
> ----------------------------------------------------------------------
> Internal Character code:
>
>   00 0000 xxxxxxxx xxxxxxxx   Unicode U+0000 - U+FFFF
>   00 xxxx xxxxxxxx xxxxxxxx   Unicode 20bit (via surrogate pair)
>   01 0000 xxxxxxxx xxxxxxxx   Unicode 20bit (via surrogate pair)

Why are astral characters going to be supported by surrogate pairs?
That's just ugly, especially if elisp coders have to deal with
surrogates. Considering the binary transparency demands, you also need
to round-trip surrogate pairs in UTF-8 back to surrogate pairs, not
astral characters.

On second glance, it doesn't look like you're using surrogate pairs at
all. Then why do you mention them? They're just an encoding trick; if
you aren't using UTF-16, you can forget about them. The characters above
U+FFFF are U+10000, U+10001, etc., not U+D800 U+DC00, U+D800 U+DC01,
etc.

-- 
David Starner - [EMAIL PROTECTED]
Pointless website: http://dvdeug.dhis.org
"I saw a daemon stare into my face, and an angel touch my breast; each
one softly calls my name . . . the daemon scares me less."
- "Disciple", Stuart Davis
-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
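
The point that surrogate pairs are only a UTF-16 encoding device, and that the characters themselves are U+10000, U+10001, and so on, can be illustrated with a minimal Python sketch. The helper names `to_surrogate_pair` and `from_surrogate_pair` are hypothetical and chosen for illustration; they are not part of Emacs or of any proposal quoted above.

```python
def to_surrogate_pair(cp):
    """Split a code point in U+10000..U+10FFFF into a UTF-16 surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("only U+10000..U+10FFFF are encoded with surrogates")
    offset = cp - 0x10000                 # 20-bit value
    high = 0xD800 + (offset >> 10)        # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)       # bottom 10 bits
    return high, low


def from_surrogate_pair(high, low):
    """Recombine a UTF-16 surrogate pair into the code point it encodes."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid high/low surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


astral = 0x10001

# UTF-16 is the only encoding form that needs the pair D800 DC01.
pair = to_surrogate_pair(astral)
utf16 = chr(astral).encode("utf-16-be")   # bytes D8 00 DC 01

# UTF-8 encodes the code point U+10001 directly as four bytes;
# no surrogate code units are involved.
utf8 = chr(astral).encode("utf-8")        # bytes F0 90 80 81
```

The 20 bits carried by a surrogate pair are the offset from U+10000, which is why an internal representation that stores code points directly has no need to mention surrogates.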
