Philippe Verdy <verdy underscore p at wanadoo dot fr> wrote:

> So SCSU and BOCU-* formats are NOT general purpose compressors. As
> they are defined only in terms of stream of Unicode code points, they
> are assumed to follow the conformance clauses of Unicode. As they
> recognize their input as Unicode text, they can recognize canonical
> equivalence, and thus this creates an opportunity for them to consider
> if a (de)normalization or de/re-composition would result in higher
> compression (interestingly, the composition exclusion could be
> reconsidered in the case of BOCU-1 and SCSU compressed streams,
> provided that the decompression to code points will redecompose the
> excluded compositions).
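To make the quoted idea concrete, here is a rough Python sketch of what "recognizing canonical equivalence" and choosing the shorter form could look like. The helper names (canonically_equivalent, shortest_canonical_form) are made up for illustration, and the code-point count is only a crude stand-in for compressed size: a real comparison would need an actual SCSU or BOCU-1 encoder, which the Python standard library does not provide.

```python
import unicodedata


def canonically_equivalent(a: str, b: str) -> bool:
    """Two strings are canonically equivalent if their NFD forms match."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


def shortest_canonical_form(text: str) -> str:
    """Pick the candidate with the fewest code points among the original,
    NFC, and NFD forms. Code-point count is an assumed proxy for the size
    an SCSU/BOCU-1 encoder would produce, not a measurement of it."""
    candidates = [
        text,
        unicodedata.normalize("NFC", text),
        unicodedata.normalize("NFD", text),
    ]
    return min(candidates, key=len)


# Ordinary case: e + COMBINING ACUTE ACCENT recomposes to a single code point.
decomposed = "e\u0301"
composed = shortest_canonical_form(decomposed)
print(len(decomposed), len(composed), canonically_equivalent(decomposed, composed))

# Composition exclusion case: U+0958 DEVANAGARI LETTER QA is excluded from
# composition, so NFC itself expands it to U+0915 U+093C. A compressor that
# kept the single code point would have to rely on the decompressor to
# redecompose it, which is the condition stated in the quoted text.
excluded = "\u0958"
nfc_form = unicodedata.normalize("NFC", excluded)
print(len(excluded), len(nfc_form), canonically_equivalent(excluded, nfc_form))
```

Note that the output of such a scheme is only canonically equivalent to the input, not code-point-for-code-point identical, so it round-trips exactly only for input that was already in the form the decompressor restores.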
I have to say, if there's a flaw in Philippe's logic here, I don't see it. Anyone?

-Doug Ewell
 Fullerton, California
 http://users.adelphia.net/~dewell/

