> OK. So it's Mark, not me, who is unilaterally extending C10.

Where on earth do you get that? I did say that, in practice, NFC should be produced, but that is simply a practical guideline, independent of C10.

Mark
__________________________________
http://www.macchiato.com

----- Original Message -----
From: "Peter Kirk" <[EMAIL PROTECTED]>
To: "Doug Ewell" <[EMAIL PROTECTED]>
Cc: "Unicode Mailing List" <[EMAIL PROTECTED]>
Sent: Fri, 2003 Dec 05 02:51
Subject: Re: Compression through normalization

> On 05/12/2003 00:34, Doug Ewell wrote:
>
> >Peter Kirk <peterkirk at qaya dot org> wrote:
> >
> >>Surely ignoring Composition Exclusions is not unilaterally extending
> >>C10. The excluded precomposed characters are still canonically
> >>equivalent to the decomposed (and normalised) forms. And so composing
> >>a text with them, for compression or any other purpose, still conforms
> >>to C10, which explicitly allows "replacement of character sequences by
> >>their canonical-equivalent sequences" - not only when the resulting
> >>sequence is NFC or NFD.
> >
> >Ignoring the composition exclusions does still respect canonical
> >equivalence, but does not preserve a canonical normalization form (using
> >the language of UAX #15). So although it is not a violation of C10, it
> >does seem to run afoul of Mark's recommendation:
> >
> >"In practice, if a compressor does not produce codepoint-identical text,
> >it should produce NFC (not just any canonically equivalent text), and
> >should document that it does so."
>
> OK. So it's Mark, not me, who is unilaterally extending C10. Well, Ken
> said much the same, so it's bilateral; and I agree it is a sensible
> extension.
>
> But, as Ken also pointed out, it is quite permissible to use any
> encoding for the intermediate e.g. compressed form of the text, as long
> as it is possible to recover from this the normalised form of the
> original text. My suggestion of composing the text using composition
> exclusions meets this test, in a way not met by some of the other
> suggestions, e.g. composing Korean characters into precomposed forms
> which are (sadly) not canonically equivalent.
>
> --
> Peter Kirk
> [EMAIL PROTECTED] (personal)
> [EMAIL PROTECTED] (work)
> http://www.qaya.org/
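
For readers following the distinction drawn in this thread, here is a minimal sketch of it, using Python's standard `unicodedata` module. It is only an illustration, not anything proposed by the participants. The example character U+0958 (DEVANAGARI LETTER QA) is my own choice of a composition exclusion, and the helper names `canonically_equivalent` and `is_nfc` are hypothetical. The sketch shows that an excluded precomposed character is canonically equivalent to its decomposition, that it is nevertheless not in NFC, and that normalising it recovers the NFC form of the original text.

```python
import unicodedata

# Assumption for illustration: U+0958 DEVANAGARI LETTER QA is one of the
# characters listed in CompositionExclusions.txt. Its canonical
# decomposition is U+0915 (KA) followed by U+093C (NUKTA).
PRECOMPOSED = "\u0958"
DECOMPOSED = "\u0915\u093C"


def canonically_equivalent(a: str, b: str) -> bool:
    """Two strings are canonically equivalent if their NFD forms match."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


def is_nfc(s: str) -> bool:
    """A string is in NFC if normalising it to NFC leaves it unchanged."""
    return unicodedata.normalize("NFC", s) == s


# Replacing the sequence by the excluded character keeps canonical equivalence.
equivalent = canonically_equivalent(PRECOMPOSED, DECOMPOSED)

# The excluded precomposed character is not NFC; the decomposed sequence is.
precomposed_is_nfc = is_nfc(PRECOMPOSED)
decomposed_is_nfc = is_nfc(DECOMPOSED)

# Normalising the "over-composed" intermediate form recovers the NFC text.
recovered = unicodedata.normalize("NFC", PRECOMPOSED)

print(equivalent, precomposed_is_nfc, decomposed_is_nfc, recovered == DECOMPOSED)
```

Under these assumptions, an intermediate form that ignores the exclusions stays canonically equivalent to the input, while a single NFC pass on decompression yields the form Mark's guideline asks for.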

