Are you planning to add an explicit statement to the Unicode standard that the valid range for scalar values is 0..10FFFF? (Or is such a statement there, and I've just missed it?)
In the absence of such a statement, I think it's very easy for people to get the idea that the range of scalar values is unbounded above, and that any limit is a property of a particular encoding. In particular, as the use of 32-bit variables to hold Unicode characters becomes more common (apparently most unices make wchar_t 32 bits wide), many will imagine that such a variable represents a 32-bit encoding of Unicode, with range 0..FFFFFFFF, where it just happens that every value above 10FFFF is unassigned. I was one such person (but no longer!).

Of course, the Unicode Standard 3.0 doesn't even mention a 32-bit encoding - but that's not stopping uniphiles from storing Unicode data in their wchar_t's!

Thanks
- rick cameron

-----Original Message-----
From: Asmus Freytag [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, 18 December 2001 13:53
To: Rick Cameron; [EMAIL PROTECTED]
Subject: RE: Astral planes (was: RE: Plane One use, was Re: HTML Validation)

At 10:38 AM 12/18/01 -0800, Rick Cameron wrote:
>It looks like UCS-2 and UCS-4 are defined in ISO 10646. Does that
>standard restrict the valid range of UCS-4 to 0..10FFFF?

It will with AMD1 to ISO/IEC 10646-1:2000, which is expected to pass final balloting and head for publication in 2002.

>If not, does this represent
>a significant divergence between Unicode and ISO 10646?

No. Just one more area where the committees are working together to make sure that the formal statements of both standards are completely synchronized, despite starting from a different framework and approach to standardization and somewhat different terminology as well. Getting the last few wrinkles out may take another amendment or so, but the will is there to see it through.

A./

Technical Vice President
The Unicode Consortium
Liaison to ISO/IEC JTC1/SC2/WG2
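To make the range discussed in this thread concrete, here is a minimal sketch in C of how a program holding characters in a 32-bit variable might check a value against the 0..10FFFF limit. The names `UNICODE_MAX`, `in_unicode_range` and `is_scalar_value` are hypothetical and come from no standard library. The second function rests on an assumption beyond what the thread says: later versions of the Unicode Standard define scalar values as excluding the surrogate code points D800..DFFF.

```c
#include <stdbool.h>
#include <stdint.h>

/* Upper bound of the range discussed in the thread (0..10FFFF). */
#define UNICODE_MAX 0x10FFFFu

/* True if v lies in 0..10FFFF. A 32-bit variable can hold larger
 * values, but they are outside the range, not merely unassigned. */
bool in_unicode_range(uint32_t v)
{
    return v <= UNICODE_MAX;
}

/* Stricter check (assumption: the later definition of scalar value):
 * also rejects the surrogate code points D800..DFFF. */
bool is_scalar_value(uint32_t v)
{
    return v <= UNICODE_MAX && !(v >= 0xD800u && v <= 0xDFFFu);
}
```

With this sketch, `in_unicode_range(0x10FFFF)` is true while `in_unicode_range(0x110000)` and `in_unicode_range(0xFFFFFFFF)` are false, even though all three fit in a 32-bit wchar_t.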

