Peter Kirk <peterkirk at qaya dot org> wrote:

> Surely ignoring Composition Exclusions is not unilaterally extending
> C10. The excluded precomposed characters are still canonically
> equivalent to the decomposed (and normalised) forms. And so composing
> a text with them, for compression or any other purpose, still conforms
> to C10, which explicitly allows "replacement of character sequences by
> their canonical-equivalent sequences" - not only when the resulting
> sequence is NFC or NFD.
Ignoring the composition exclusions does still respect canonical
equivalence, but does not preserve a canonical normalization form
(using the language of UAX #15). So although it is not a violation of
C10, it does seem to run afoul of Mark's recommendation:

"In practice, if a compressor does not produce codepoint-identical
text, it should produce NFC (not just any canonically equivalent text),
and should document that it does so."

-Doug Ewell
 Fullerton, California
 http://users.adelphia.net/~dewell/
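
P.S. To make the distinction concrete, here is a minimal sketch in Python using the standard unicodedata module. It takes U+0958 DEVANAGARI LETTER QA, which is listed in the Composition Exclusions, as a sample character. The helper names canonically_equivalent and is_nfc are my own, chosen for illustration; they are not part of any Unicode API.

```python
import unicodedata

# U+0958 DEVANAGARI LETTER QA is a composition-excluded character.
# Its canonical decomposition is U+0915 (KA) followed by U+093C (NUKTA).
precomposed = "\u0958"
decomposed = "\u0915\u093C"


def canonically_equivalent(a: str, b: str) -> bool:
    """Illustrative helper: two strings are canonically equivalent
    when their NFD forms are identical."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


def is_nfc(s: str) -> bool:
    """Illustrative helper: a string is in NFC when normalizing it
    to NFC leaves it unchanged."""
    return unicodedata.normalize("NFC", s) == s


# The excluded precomposed character is still canonically equivalent
# to its decomposed sequence, so the substitution is permitted by C10.
print(canonically_equivalent(precomposed, decomposed))  # True

# But the precomposed character is not NFC; the decomposed sequence is.
print(is_nfc(precomposed))  # False
print(is_nfc(decomposed))   # True
```

So a compressor that recomposes to U+0958 emits text that is canonically equivalent to its input but is neither NFC nor NFD, which is the case the recommendation is meant to rule out.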

