Re: Unicode in Emacs again

Eli Zaretskii Sun, 28 Oct 2001 01:07:36 -0700


On Sat, 27 Oct 2001, David Starner wrote:


> On Fri, Oct 26, 2001 at 01:35:26PM +0200, Oliver Doepner wrote:
> >     Emacs-Unicode-990824
> > ----------------------------------------------------------------------
> > Internal Character code:
> > 
> >   00 0000 xxxxxxxx xxxxxxxx   Unicode U+0000 - U+FFFF
> >   00 xxxx xxxxxxxx xxxxxxxx   Unicode 20bit (via surrogate pair)
> >   01 0000 xxxxxxxx xxxxxxxx   Unicode 20bit (via surrogate pair)
> 
> Why are astral characters going to be supported by surrogate pairs?
> That's just ugly, especially if elisp coders have to deal with
> surrogates. Considering the binary transparency demands, you also need
> to round-trip surrogate pairs in UTF-8 back to surrogate pairs, not
> astral characters.
> 
> On second glance, it doesn't look like you're using surrogate pairs at
> all. Then why do you mention them? They're just an encoding trick; if
> you aren't using UTF-16, you can forget about them. The characters above
> U+FFFF are U+10000, U+10001, etc., not U+D800 U+DC00, U+D800 U+DC01,
> etc.
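
To make David's point concrete, here is a minimal sketch of the UTF-16 surrogate arithmetic. It is not Emacs code, and the function names `to_surrogates` and `from_surrogates` are hypothetical, chosen only for this illustration. It shows that U+10000 is a single character which UTF-16 happens to write as the pair D800 DC00, while UTF-8 encodes the same code point directly in four bytes, with no surrogates involved.

```python
def to_surrogates(cp):
    """Split a code point above U+FFFF into a UTF-16 (high, low) surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("not a supplementary-plane code point")
    v = cp - 0x10000                      # 20-bit offset into the astral range
    return 0xD800 + (v >> 10), 0xDC00 + (v & 0x3FF)


def from_surrogates(hi, lo):
    """Combine a UTF-16 surrogate pair back into one code point."""
    if not (0xD800 <= hi <= 0xDBFF and 0xDC00 <= lo <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((hi - 0xD800) << 10) + (lo - 0xDC00)


if __name__ == "__main__":
    for cp in (0x10000, 0x10001, 0x10FFFF):
        hi, lo = to_surrogates(cp)
        # The pair only exists in the UTF-16 form; UTF-8 encodes cp itself.
        print("U+%X -> %04X %04X, UTF-8: %s"
              % (cp, hi, lo, chr(cp).encode("utf-8").hex()))
```

An internal representation that stores the code point itself (U+10000, U+10001, ...) needs this conversion only at the boundary where UTF-16 is read or written.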

Handa-san, could you please comment on that?  David is one of the few
people who replied to my (quite desperate ;-) message posted to
gnu.emacs.bug a few days ago.

In any case, let's continue discussing this on [EMAIL PROTECTED] 
(CC'ed).
-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
