Peter Kirk <peterkirk at qaya dot org> wrote:

> Surely ignoring Composition Exclusions is not unilaterally extending
> C10. The excluded precomposed characters are still canonically
> equivalent to the decomposed (and normalised) forms. And so composing
> a text with them, for compression or any other purpose, still conforms
> to C10, which explicitly allows "replacement of character sequences by
> their canonical-equivalent sequences" - not only when the resulting
> sequence is NFC or NFD.

Ignoring the composition exclusions does still respect canonical
equivalence, but does not preserve a canonical normalization form (using
the language of UAX #15).  So although it is not a violation of C10, it
does seem to run afoul of Mark's recommendation:

"In practice, if a compressor does not produce codepoint-identical text,
it should produce NFC (not just any canonically equivalent text), and
should document that it does so."
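
As a rough illustration of the distinction, here is a small Python sketch
using the standard unicodedata module. The helper names are my own, and
U+0958 DEVANAGARI LETTER QA is just one example of a character listed in
the Composition Exclusions; treat this as a sketch, not as anything taken
from UAX #15 itself.

```python
import unicodedata

def canonically_equivalent(a: str, b: str) -> bool:
    # Two strings are canonically equivalent if their NFD forms match.
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)

def is_nfc(s: str) -> bool:
    # A string is in NFC if normalizing it to NFC leaves it unchanged.
    return unicodedata.normalize("NFC", s) == s

# U+0958 is a composition exclusion: it decomposes canonically to
# U+0915 U+093C, but NFC never recomposes that pair.
precomposed = "\u0958"
decomposed = "\u0915\u093c"

# An ordinary (non-excluded) pair for contrast: e + combining acute.
e_acute_decomposed = "e\u0301"
e_acute_nfc = unicodedata.normalize("NFC", e_acute_decomposed)
```

Here canonically_equivalent(precomposed, decomposed) holds, so a
compressor that emits U+0958 stays within C10, yet is_nfc(precomposed) is
false, so its output is not NFC. The non-excluded pair, by contrast,
composes to U+00E9 under NFC.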

-Doug Ewell
 Fullerton, California
 http://users.adelphia.net/~dewell/

