On Sat, 27 Oct 2001, David Starner wrote:
> On Fri, Oct 26, 2001 at 01:35:26PM +0200, Oliver Doepner wrote:
> > Emacs-Unicode-990824
> > ----------------------------------------------------------------------
> > Internal Character code:
> >
> > 00 0000 xxxxxxxx xxxxxxxx   Unicode U+0000 - U+FFFF
> > 00 xxxx xxxxxxxx xxxxxxxx   Unicode 20bit (via surrogate pair)
> > 01 0000 xxxxxxxx xxxxxxxx   Unicode 20bit (via surrogate pair)
>
> Why are astral characters going to be supported by surrogate pairs?
> That's just ugly, especially if elisp coders have to deal with
> surrogates. Considering the binary transparency demands, you also need
> to round-trip surrogate pairs in UTF-8 back to surrogate pairs, not
> astral characters.
>
> On second glance, it doesn't look like you're using surrogate pairs at
> all. Then why do you mention them? They're just an encoding trick; if
> you aren't using UTF-16, you can forget about them. The characters above
> U+FFFF are U+10000, U+10001, etc., not U+D800 U+DC00, U+D800 U+DC01,
> etc.

Handa-san, could you please comment on that?

David is one of a few people who replied to my (quite desperate ;-)
message posted to gnu.emacs.bug a few days ago.

In any case, let's continue discussing this on [EMAIL PROTECTED] (CC'ed).

-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
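
To illustrate the distinction drawn in the quoted reply, here is a minimal Python sketch of the standard UTF-16 surrogate arithmetic. It is not taken from the Emacs-Unicode code, and the function names `to_surrogate_pair` and `from_surrogate_pair` are made up for this example. It shows that U+10000 is a single character whose UTF-16 form happens to be the code unit pair D800 DC00, while UTF-8 encodes the code point directly without surrogates.

```python
def to_surrogate_pair(cp):
    """Split a code point above U+FFFF into its UTF-16 (high, low) surrogates."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("not a supplementary (astral) code point")
    v = cp - 0x10000                      # 20-bit value
    return 0xD800 + (v >> 10), 0xDC00 + (v & 0x3FF)


def from_surrogate_pair(high, low):
    """Combine a UTF-16 surrogate pair back into one code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


# The character is U+10000; the pair only exists in the UTF-16 encoding.
print([hex(u) for u in to_surrogate_pair(0x10000)])   # ['0xd800', '0xdc00']
print(hex(from_surrogate_pair(0xD800, 0xDC01)))       # 0x10001

# UTF-8 encodes the code point itself, with no surrogates involved.
print(chr(0x10000).encode("utf-8").hex())             # f0908080
```

An internal representation that stores the 20-bit (or 21-bit) code point directly never needs either function; they matter only at the boundary where UTF-16 data is read or written.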