On Wed, 13 Sep 2000, Matt Sergeant wrote:
> Until someone extends the Unicode character set beyond the current range,
This has "already" happened. Have a look at
http://www.unicode.org/unicode/alloc/Pipeline.html , the Unicode
allocation pipeline of proposed new characters and scripts. It lists
quite a few scripts beyond U+FFFF, several of which are "Accepted" by the
Unicode Technical Committee (some as much as three years ago) and some in
various stages of the ISO pipeline. While it may take a while before these
become canon (and some may get thrown out along the way), it's not as if
everything after U+FFFF is empty as far as the eye can see, with nothing
on the horizon.
> UCS-2 and UTF-16 currently have a one to one mapping. I assume that's the
> point being made. An excerpt from the book I'm currently tech reviewing:
>
> Nonetheless Unicode does provide a means of representing code points
> beyond 65,535 by recognizing certain two-byte sequences as half of a
> surrogate pair. A Unicode document that uses UCS-2 plus surrogate
> pairs is said to be in the UTF-16 encoding. Since no software
> currently supports or produces surrogate pairs, and since no scripts
I'll grant you that, especially considering the sort of things proposed
for Plane 1 (Deseret Alphabet, Musical Symbols, etc.). And, of course, not
many people will be taking advantage of these code points while they're
not yet finalised.
> are encoded in Unicode with code points above 65,535 the
> distinction between UCS-2 and UTF-16 is mostly academic at this
> point in time.
At this point in time, yes. I suppose I just wanted to point out that this
*may* change, at some unspecified (and maybe even distant) point in the
future.
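
For anyone who wants to see what a surrogate pair actually looks like,
here is a minimal sketch in Python of the UTF-16 arithmetic as I understand
it. The function names are my own invention, and U+10400 is just an
arbitrary example value above U+FFFF, not a claim about what will end up
being assigned there.

```python
def to_surrogate_pair(code_point):
    """Split a code point in U+10000..U+10FFFF into two 16-bit surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point does not need a surrogate pair")
    offset = code_point - 0x10000        # leaves a 20-bit value
    high = 0xD800 + (offset >> 10)       # top 10 bits -> high surrogate
    low = 0xDC00 + (offset & 0x3FF)      # bottom 10 bits -> low surrogate
    return high, low


def from_surrogate_pair(high, low):
    """Recombine a high/low surrogate pair into a single code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


high, low = to_surrogate_pair(0x10400)
print("U+10400 -> %04X %04X" % (high, low))
```

Anything at or below U+FFFF is stored as a single 16-bit unit, identical
to UCS-2, which is why the two encodings only differ once code points
above U+FFFF are actually in use.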
Cheers,
Philip
--
Philip Newton <[EMAIL PROTECTED]>