From: "Doug Ewell" <[EMAIL PROTECTED]>
> Philippe Verdy wrote (in rich text):
>
> > Due to that, an application needs to specify whether it will support
> > and comply with the full ISO/IEC 10646-1:2000 character set or to the
> > Unicode subset.
>
> ISO/IEC 10646 has reduced its range to match Unicode's, so this
> distinction is obsolete.

It is not obsolete: Corrigendum #1 for UTF-8 (as published in Unicode 4.0) refers to ISO/IEC 10646-1:2000, not to ISO/IEC 10646:2003, which is the edition whose character repertoire corresponds to Unicode 4.0. So there is a reference error in the version of the now-normative corrigendum published in Unicode 4.0. Does it need another corrigendum to correct this reference in the Corrigendum?

Well, I still doubt that ISO/IEC 10646 has reduced its character set. It has just agreed to limit its repertoire of _standardized_ and _interchangeable_ characters to the first 17 planes, so that _these_ characters can remain in sync and be encoded identically, with the same code points, in the Unicode repertoire. All the other planes are still present in ISO/IEC 10646, and some of them are still allocated to PUAs that have no equivalent in Unicode. Sequences encoding them are still valid within UTF-8 encoded data and still conform to ISO/IEC 10646: even if they are illegal for use in Unicode 4.0, these sequences are not ill-formed in the way that non-shortest forms, now forbidden in both standards, are.
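
To make the distinction above concrete, here is a minimal sketch in Python. It assumes the original 1-to-6-byte UTF-8 layout (code points up to 0x7FFFFFFF) as described for the 2000 edition. The function name `classify` and the category labels are hypothetical, chosen for illustration only; they are not terms from either standard. It separates non-shortest forms from sequences that are structurally fine but encode a code point past the first 17 planes.

```python
# Illustrative sketch only: names and labels are assumptions, not taken
# from ISO/IEC 10646 or the Unicode Standard.

# Smallest code point that legitimately needs each sequence length,
# under the original 31-bit (1..6 byte) UTF-8 layout.
MIN_FOR_LENGTH = {1: 0x0, 2: 0x80, 3: 0x800, 4: 0x10000,
                  5: 0x200000, 6: 0x4000000}


def classify(seq: bytes) -> str:
    """Classify a single UTF-8 sequence (exactly one encoded code point)."""
    if not seq:
        return "malformed"
    lead = seq[0]
    if lead < 0x80:
        length, value = 1, lead
    elif 0xC0 <= lead < 0xE0:
        length, value = 2, lead & 0x1F
    elif 0xE0 <= lead < 0xF0:
        length, value = 3, lead & 0x0F
    elif 0xF0 <= lead < 0xF8:
        length, value = 4, lead & 0x07
    elif 0xF8 <= lead < 0xFC:
        length, value = 5, lead & 0x03
    elif 0xFC <= lead < 0xFE:
        length, value = 6, lead & 0x01
    else:
        return "malformed"  # stray trail byte, or 0xFE / 0xFF
    if len(seq) != length:
        return "malformed"
    for b in seq[1:]:
        if b & 0xC0 != 0x80:  # every trail byte must look like 10xxxxxx
            return "malformed"
        value = (value << 6) | (b & 0x3F)
    if value < MIN_FOR_LENGTH[length]:
        return "non-shortest"  # ill-formed: a shorter encoding exists
    if 0xD800 <= value <= 0xDFFF:
        return "surrogate"  # also excluded from UTF-8; not discussed above
    if value > 0x10FFFF:
        return "beyond-17-planes"  # well-structured, but outside Unicode's range
    return "unicode-range"


if __name__ == "__main__":
    for sample in (b"\x41", b"\xC0\x80", b"\xF4\x8F\xBF\xBF",
                   b"\xF4\x90\x80\x80", b"\xF8\x88\x80\x80\x80"):
        print(sample.hex(" "), "->", classify(sample))
```

Under these assumptions, `C0 80` is rejected as a non-shortest form, while `F4 90 80 80` (0x110000) and the five-byte `F8 88 80 80 80` (0x200000) land in the separate "beyond-17-planes" category, which is the case I am arguing should not be lumped together with ill-formed sequences.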

