Re: Encode, take five

Philip Newton Wed, 13 Sep 2000 18:32:46 -0700
On Wed, 13 Sep 2000, Matt Sergeant wrote:

> Until someone extends the Unicode character set beyond the current range,

This has "already" happened. Have a look at
http://www.unicode.org/unicode/alloc/Pipeline.html , the Unicode
allocation pipeline of proposed new characters and scripts. It lists
quite a few scripts beyond U+FFFF, several of which are "Accepted" by the
Unicode Technical Committee (some as much as three years ago) and some in
various stages of the ISO pipeline. While it may take a while before these
become canon (and some may get thrown out along the way), it's not as if
everything after U+FFFF is empty as far as the eye can see, with nothing
on the horizon.

> UCS-2 and UTF-16 currently have a one-to-one mapping. I assume that's the
> point being made. An excerpt from the book I'm currently tech reviewing:
> 
>   Nonetheless Unicode does provide a means of representing code points
>   beyond 65,535 by recognizing certain two-byte sequences as half of a
>   surrogate pair. A Unicode document that uses UCS-2 plus surrogate
>   pairs is said to be in the UTF-16 encoding. Since no software
>   currently supports or produces surrogate pairs, and since no scripts

I'll grant you that, especially considering the sort of things proposed
for Plane 1 (Deseret Alphabet, Musical Symbols, etc.). And, of course, not
many people will be taking advantage of these code points as long as
they're not finalised.

>   are encoded in Unicode with code points above 65,535 the
>   distinction between UCS-2 and UTF-16 is mostly academic at this
>   point in time.

At this point in time, yes. I suppose I just wanted to point out that this
*may* change, at some unspecified (and maybe even distant) point in the
future.
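
For anyone who wants to see what the surrogate mechanism in the excerpt
amounts to, here is a minimal sketch in Python. The function names are my
own invention for illustration, and 0x10400 is just an arbitrary Plane 1
value picked for the example; the arithmetic is the standard UTF-16 split
of a code point above U+FFFF into two 16-bit units.

```python
def to_surrogates(cp):
    """Split a code point above U+FFFF into a (high, low) surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("code point is outside the supplementary range")
    offset = cp - 0x10000               # leaves a 20-bit value
    high = 0xD800 + (offset >> 10)      # top 10 bits -> high surrogate
    low = 0xDC00 + (offset & 0x3FF)     # bottom 10 bits -> low surrogate
    return high, low


def from_surrogates(high, low):
    """Recombine a (high, low) surrogate pair into a single code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


high, low = to_surrogates(0x10400)
print(f"{high:04X} {low:04X}")          # D801 DC00
```

Anything at or below U+FFFF needs no such split, which is why UCS-2 and
UTF-16 produce identical output as long as no supplementary code points
are in use.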

Cheers,
Philip
-- 
Philip Newton <[EMAIL PROTECTED]>