OK, so it is there in 3.0. But in the section on Surrogates? And on Transformations? A little obscure.
I expected to find it in section 2.3, for example, where the major encoding forms are being described; or even earlier - say in 1.1 Coverage. Surely the range of valid scalar values is an important aspect of coverage!

I hope this aspect of the standard will be front and centre in 4.0.

Thanks
- rick cameron

-----Original Message-----
From: Kenneth Whistler [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, 18 December 2001 16:35
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: RE: Astral planes (was: RE: Plane One use, was Re: HTML Validation)

Rick Cameron asked:

> Are you planning to add an explicit statement to the Unicode standard
> that the valid range for scalar values is 0..10FFFF? (Or is such a
> statement there, and I've just missed it?)

Unicode 3.0, p. 45, D28:

Unicode scalar value: a number N from 0 to 10FFFF<sub>16</sub>...

and p. 46, D29, second bullet:

* Any sequence of code values that would correspond to a scalar
  value greater than 10FFFF<sub>16</sub> is illegal.

>
> In the absence of such a statement, I think it's very easy for people
> to get the idea that the range of scalar values is unbounded above,
> and that any limit is a property of a particular encoding.
>
> In particular, as the use of 32-bit variables to hold Unicode
> characters becomes more common (apparently most unices make wchar_t 32
> bits wide), many will imagine that such a variable represents a 32-bit
> encoding of Unicode, with range 0..FFFFFFFF, where it just happens
> that every value above 10FFFF is unassigned.
>
> I am one such person (but no longer!)
>
> Of course, the Unicode Standard 3.0 doesn't even mention a 32-bit
> encoding - but that's not stopping uniphiles from storing Unicode data
> in their wchar_t's!

It's the Unicode Standard 3.1 that you need to be referring to.
UTF-32 was incorporated into the standard at that point. See

http://www.unicode.org/unicode/reports/tr27/

--Ken
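
To make the point of the thread concrete: a value held in a 32-bit variable is only usable as Unicode if it falls in 0..10FFFF, whatever the width of the variable. The following is a minimal sketch of such a range check, not anything taken from the standard's text. The function and constant names are hypothetical, and the exclusion of the surrogate range D800..DFFF follows the definition of "scalar value" in later versions of the standard, so treat that part as an assumption relative to the 3.0 wording quoted above.

```python
# Illustrative sketch only. All names here are hypothetical,
# chosen for this example; they do not come from the Unicode Standard.

MAX_CODE_POINT = 0x10FFFF   # upper bound cited in the thread (D28, D29)
SURROGATE_FIRST = 0xD800    # start of the surrogate range
SURROGATE_LAST = 0xDFFF     # end of the surrogate range


def in_unicode_range(n: int) -> bool:
    """True if n lies in 0..10FFFF, the bound discussed in the thread."""
    return 0 <= n <= MAX_CODE_POINT


def is_scalar_value(n: int) -> bool:
    """True if n is in range and is not a surrogate code point.

    Assumption: surrogates are excluded, as in later versions of the
    standard; the 3.0 text quoted above is elided and may be worded
    differently.
    """
    return in_unicode_range(n) and not (SURROGATE_FIRST <= n <= SURROGATE_LAST)
```

With these definitions, `in_unicode_range(0x10FFFF)` holds while `in_unicode_range(0x110000)` does not, even though both values fit comfortably in a 32-bit wchar_t.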

