Re: Compression through normalization

Peter Kirk Thu, 04 Dec 2003 12:13:34 -0800

On 04/12/2003 08:39, Doug Ewell wrote:

...

(2)  I am NOT interested in inventing a new normalization form, or any
variants on existing forms.  Any approach that involves compatibility
equivalences, ignores the Composition Exclusions table, or creates
equivalences that do not exist in the Unicode Character Database (such
as "U+1109 + U+1109 = U+110A") is NOT of interest.  That amounts to
unilaterally extending C10, which may already be too liberal to be
applied to compression.

Surely ignoring Composition Exclusions is not unilaterally extending C10. The excluded precomposed characters are still canonically equivalent to the decomposed (and normalised) forms. And so composing a text with them, for compression or any other purpose, still conforms to C10, which explicitly allows "replacement of character sequences by their canonical-equivalent sequences" - not only when the resulting sequence is NFC or NFD.
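
The point can be checked mechanically. What follows is a minimal sketch, assuming Python's standard unicodedata module and using U+0958 DEVANAGARI LETTER QA as one example of a character listed in the Composition Exclusions table; the helper name canonically_equivalent is only illustrative and is not taken from the standard. It treats two strings as canonically equivalent when their NFD forms are identical.

```python
import unicodedata


def canonically_equivalent(a: str, b: str) -> bool:
    """Return True if a and b have identical NFD forms (hypothetical helper)."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


# U+0958 is a composition exclusion; its canonical decomposition is
# U+0915 DEVANAGARI LETTER KA + U+093C DEVANAGARI SIGN NUKTA.
precomposed = "\u0958"
decomposed = "\u0915\u093C"

# Canonically equivalent, even though NFC never produces U+0958.
print(canonically_equivalent(precomposed, decomposed))
print(unicodedata.normalize("NFC", decomposed) == decomposed)

# By contrast, U+110A has no canonical decomposition in the UCD, so
# replacing U+1109 U+1109 with it is not a canonical-equivalent substitution.
print(canonically_equivalent("\u1109\u1109", "\u110A"))
```

The first two checks should report that the pair is equivalent and that NFC leaves the decomposed sequence alone; the last shows that the jamo case Doug mentions really is a different matter, since no canonical equivalence exists there.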

--
Peter Kirk
[EMAIL PROTECTED] (personal)
[EMAIL PROTECTED] (work)
http://www.qaya.org/
