In a message dated 2001-02-16 0:19:01 Pacific Standard Time, [EMAIL PROTECTED] 
writes:

>   Because of the widespread belief that Unicode stops at U+FFFF,
>   many fonts and applications that claim to support Unicode can
>   only handle basic characters, not supplementary characters.
>  
>  Right.  (Is it really a widespread belief?  That's something I've
>  been wondering.)

Well, [EMAIL PROTECTED] seems to think so:

>  > Many descriptions on the Web erroneously claim that Unicode contains
>  > only the first 64K characters of ISO 10646.
>
>  Well, AFAICT it's true.
>
>  At some point in the future I suppose it will cease to be true, but if you
>  say "is" you should be talking about the present.

Unicode has been defined as ranging from U+0000 to U+10FFFF for several years 
now.  The fact that no characters were assigned beyond U+FFFF before 
Unicode 3.1 (which is still in beta) does not change this.
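
To make the range concrete, here is a minimal sketch in Python of a code point check that accepts everything up to U+10FFFF rather than stopping at U+FFFF.  The names classify, MAX_CODE_POINT, and BASIC_LIMIT are my own, chosen for illustration; they do not come from the standard or from any library.

```python
# Hypothetical helper names, for illustration only.
MAX_CODE_POINT = 0x10FFFF  # upper end of the Unicode code space
BASIC_LIMIT = 0xFFFF       # last "basic" code point


def classify(cp: int) -> str:
    """Return 'basic' or 'supplementary' for a code point value."""
    if not 0 <= cp <= MAX_CODE_POINT:
        raise ValueError(f"not a Unicode code point: {cp:#x}")
    return "basic" if cp <= BASIC_LIMIT else "supplementary"
```

Code that rejects anything above BASIC_LIMIT is checking against the wrong upper bound, whether or not characters have been assigned there yet.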

>  > Because of the widespread belief that Unicode stops at U+FFFF, many
>  > fonts and applications that claim to support Unicode can only handle
>  > basic characters, not supplementary characters.
>
>  The code I wrote is like that, and it'll remain like that for as long as
>  that's all that can be tested and used in real life.

You can already test private-use characters in the U+Fxxxx and U+10xxxx 
ranges.  Saying that your code shouldn't have to work with characters beyond 
U+FFFF because no such characters have been assigned yet is like saying it 
shouldn't have to support U+20B0 through U+20CF.  You know characters will be 
assigned to that range some day, possibly sooner than you think.
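
As a rough illustration of such a test, the following Python sketch converts a supplementary code point to its UTF-16 surrogate pair and back, using private-use values from the U+Fxxxx and U+10xxxx ranges as test data.  The function names to_surrogates and from_surrogates are hypothetical, and this is only one way to exercise the arithmetic, not a complete test of font or application support.

```python
def to_surrogates(cp: int) -> tuple[int, int]:
    """Split a supplementary code point into (high, low) UTF-16 surrogates."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError(f"not a supplementary code point: {cp:#x}")
    offset = cp - 0x10000                # 20-bit value
    high = 0xD800 + (offset >> 10)       # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)      # bottom 10 bits
    return high, low


def from_surrogates(high: int, low: int) -> int:
    """Recombine a (high, low) surrogate pair into a code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


# Private-use code points available for testing today.
for cp in (0xF0000, 0xFFFFD, 0x100000, 0x10FFFD):
    assert from_surrogates(*to_surrogates(cp)) == cp
```

If your code survives a round trip with these values, it is in reasonable shape for whatever is eventually assigned above U+FFFF.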

Back to [EMAIL PROTECTED]:

>  So using the plain English term "basic" to describe that subset
>  of Unicode is misleading.
>  
>  I agree with you that the language in the standard needs updating.

I think that has been tried already, and 'basic' was the best anyone could 
do.  Terms involving 'planes', such as 'BMP' and 'supplementary planes', are 
discouraged because planes per se are not part of Unicode, only ISO/IEC 10646.

I personally don't like 'basic' and 'supplementary' because they seem to 
imply that the first 64K code points are better in some way, but the most 
important thing is that the terminology remain consistent, even if flawed.

-Doug Ewell
 Fullerton, California
