On Friday, June 27, 2003 4:40 PM, John Cowan <[EMAIL PROTECTED]> wrote:
> Not so. Sometimes stability is more important than correctness.
Very well answered. I don't see why we need to sacrifice stability when correcting something. As the error is not in ISO 10646, it is definitely not reasonable to have ISO 10646 endorse the error made by Unicode because of its stability pact.

For now, the only good solution is to use existing Unicode-only resources that will affect neither the normalization pact nor the ISO 10646 unification work. If this requires defining additional Unicode semantics or properties for some language-significant markup characters, it can be done with variants (if ISO 10646 accepts them), or with a request to ISO 10646 for a dedicated new *invisible* diacritic in the Hebrew block.

Maybe Unicode should be more prudent with normalization forms: when new characters are added, their combining classes should be documented as informative until there is consensus and experimentation. This would not break the stability pact with XML, which would simply not accept the new characters before they are stabilized by Unicode. The characters could then be standardized by Unicode and ISO 10646, but used with caution in XML, which could restrict the set of supported characters to those whose canonicalization is finished.

Why not then document these critical normative properties so as to mark them clearly as informative where needed? For example, informative canonical decompositions could be noted with <canon> (and thus only recognized by compatibility decompositions until further notice), and proposed combining classes could be flagged with an additional symbol in the CC column of the UCD (for example a "?"). This would prevent using the character in XML-compliant applications, but it would allow faster development of fonts, renderers, and layout engines, and allow experimentation with encoding actual new documents, with some safeguards regarding the final character properties. It would signal a "warning" to the IETF and W3C: this character has an informative combining class or decomposition.
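To see why a provisional combining class is so consequential, here is a small illustration (using Python's standard unicodedata module, not anything specific to the proposal above) of how a character's canonical combining class silently determines the order of marks after normalization:

```python
import unicodedata

# The canonical combining class (ccc) of each mark decides where it
# ends up after normalization: marks with different non-zero classes
# are sorted by class. This is exactly the property that, per the
# proposal, should stay informative until it is stable.
acute = "\u0301"    # COMBINING ACUTE ACCENT, ccc = 230
cedilla = "\u0327"  # COMBINING CEDILLA, ccc = 202

print(unicodedata.combining(acute))    # 230
print(unicodedata.combining(cedilla))  # 202

# NFD puts the cedilla (lower class) before the acute, regardless of
# the order in which the author typed the marks:
typed = "e" + acute + cedilla
print(unicodedata.normalize("NFD", typed) == "e" + cedilla + acute)  # True
```

If a character's class were later changed from, say, 230 to 202, every already-normalized document containing it would retroactively become non-normalized, which is why the stability pact exists in the first place.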
Normalization at this stage is dangerous, and documents should be considered already normalized for those characters. These potentially unstable Unicode-encoded documents would then be labelled with the Unicode version, since a future revision may require verifying whether the informative properties have become enforceable. If a property changes, existing documents can be tested to see whether they still respect the proposed normalization, and corrected. If there is no change after, say, one year, a revision annex publishes these properties as normative and an incremental version of Unicode is released, allowing interchange and conservation of the encoded documents without an explicit Unicode version label.
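The re-verification step proposed here could be sketched roughly as follows; this is only an illustration under assumed conventions (the function name check_document and the version-label format are hypothetical), using Python's unicodedata module, whose tables are tied to one Unicode version:

```python
import unicodedata

def check_document(text: str, labelled_version: str) -> bool:
    """Hypothetical re-verification: if the runtime's Unicode tables
    are newer than the version the document was labelled with, re-test
    whether the text is still NFC-normalized instead of trusting the
    old label."""
    current = unicodedata.unidata_version  # version of the bundled UCD
    if current != labelled_version:
        # Properties may have changed since the document was produced.
        return unicodedata.is_normalized("NFC", text)
    return True

# A precomposed "café" labelled under Unicode 3.2 still passes today:
print(check_document("caf\u00e9", "3.2.0"))   # True
# A decomposed form fails the NFC re-check and needs correction:
print(check_document("cafe\u0301", "3.2.0"))  # False
```

A real implementation would also need the "warning" metadata from the UCD (the informative-property flags suggested above) to limit the re-check to the affected characters rather than renormalizing everything.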

