On 2/11/2013 12:49 AM, Richard Wordingham wrote:
The problem sequence is <U+003E GREATER-THAN SIGN, U+0338 COMBINING LONG SOLIDUS OVERLAY> which is canonically equivalent to <U+226F NOT GREATER-THAN>.
This demonstrates that NFC applied to the serialization of an XML infoset is not the same as NFC applied to the text nodes and attribute values of that infoset.
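
To make the difference concrete, here is a minimal sketch in Python. The one-element document and the variable names are made up for illustration; only the standard unicodedata and xml.etree.ElementTree modules are used.

```python
import unicodedata
import xml.etree.ElementTree as ET

# Hypothetical document: the text node starts with U+0338
# COMBINING LONG SOLIDUS OVERLAY.
text_node = "\u0338x"
serialized = "<a>" + text_node + "</a>"

# The original serialization parses, and the text node round-trips.
assert ET.fromstring(serialized).text == text_node

# NFC on the text node alone leaves it unchanged: inside the node,
# U+0338 has no preceding base character to compose with.
nfc_text_node = unicodedata.normalize("NFC", text_node)

# NFC on the whole serialization composes the ">" that closes the start
# tag with U+0338 into U+226F, so the start tag is never closed.
nfc_serialized = unicodedata.normalize("NFC", serialized)

try:
    ET.fromstring(nfc_serialized)
    still_well_formed = True
except ET.ParseError:
    still_well_formed = False
```

Normalizing the text node is harmless here, while normalizing the serialization destroys the markup.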
The short answer is that XML shall not do canonical equivalence, at least, not on data; so doing would corrupt some of the CLDR definitions,
That case is different: the question there is whether a particular use of text strings (CLDR in this case) can be indifferent to normalization. There are other such cases, e.g. the regular expressions that validate some of Unihan's properties: they should not be normalized, and they assume that the data to be validated is in NFD.
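
A rough illustration of the NFD assumption, in Python. The pattern below is an assumption of mine, loosely modeled on a pinyin-style syllable check (ASCII letters plus a few combining tone marks); it is not copied from the Unihan documentation, and the sample value is made up.

```python
import re
import unicodedata

# Illustrative pattern (an assumption, not the actual Unihan regex):
# lowercase ASCII letters plus combining marks U+0300..U+0302, U+0304,
# U+0308 and U+030C.
SYLLABLE = re.compile("[a-z\u0300-\u0302\u0304\u0308\u030c]+")

# Made-up sample value: "h", U+01CE (a with caron, precomposed), "o".
nfc_value = "h\u01ceo"
# Its NFD form spells the vowel as "a" followed by U+030C.
nfd_value = unicodedata.normalize("NFD", nfc_value)

# The pattern accepts the decomposed data ...
matches_nfd = SYLLABLE.fullmatch(nfd_value) is not None
# ... and rejects the precomposed form, because U+01CE is not in the class.
matches_nfc = SYLLABLE.fullmatch(nfc_value) is not None
```

A pattern written this way only validates correctly if the data is already in NFD, so the pattern and the data are not interchangeable with their normalized counterparts.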
Eric.

