Doug Ewell writes:
> Philippe Verdy <verdy underscore p at wanadoo dot fr> wrote:
> 
> > I've tried to experiment a collation algorithm to implement UCA by the
> > same system as used in UCD decompositions, but with added (and
> > sometimes modified) decompositions. This system creates new "code
> > points" needed to represent only <font> compatibility differences,
> > ligatures, or alternate forms, as a decomposition of the existing
> > compatibility character, into more basic characters exposed with
> > primary differences in UCA, plus these new characters given "variable"
> > collation weights, which may be ignorable in applications which ignore
> > extra levels. This encoding uses a 31 bit code space, which is still
> > highly compressible, but still representable with the UTF-8 TES (but
> > they are not containing Unicode code points) or similar ad-hoc
> > representation.
> 
> Please don't use UTF-8 to encode anything other than Unicode code
> points.

As long as I use it internally for intermediate processing, I can do what
I want. For now it is just a convenient way to represent variable size
integers up to 31 bits (in fact I use it to represent 32 bit signed
integers, but the two highest bits are equal).
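
To make this concrete, here is a minimal sketch of that kind of internal
use. It assumes the original 1-to-6-byte UTF-8 bit layout (the one that
reaches 31 bits), and it assumes that "the two highest bits are equal"
means the values lie in -2^30 .. 2^30-1, so that bit 31 can be dropped and
later restored from bit 30. The function names are hypothetical and this
is not my actual code.

```python
# (limit, lead-byte marker) for the 2- to 6-byte forms of the
# original UTF-8 layout; the sequence length is index + 2.
_FORMS = ((0x800, 0xC0), (0x10000, 0xE0), (0x200000, 0xF0),
          (0x4000000, 0xF8), (0x80000000, 0xFC))


def encode_u31(n: int) -> bytes:
    """Serialize an integer in 0 .. 2^31-1 with the UTF-8 bit layout."""
    if not 0 <= n <= 0x7FFFFFFF:
        raise ValueError("value does not fit in 31 bits")
    if n < 0x80:
        return bytes([n])
    for index, (limit, lead) in enumerate(_FORMS):
        if n < limit:
            length = index + 2
            break
    tail = []
    for _ in range(length - 1):
        tail.append(0x80 | (n & 0x3F))  # 6 payload bits per trail byte
        n >>= 6
    return bytes([lead | n] + tail[::-1])


def decode_u31(data: bytes) -> int:
    """Decode exactly one sequence (overlong forms are not rejected)."""
    if not data:
        raise ValueError("empty input")
    first = data[0]
    if first < 0x80:
        length, n = 1, first
    else:
        length = 8 - (first ^ 0xFF).bit_length()  # count of leading 1 bits
        if not 2 <= length <= 6:
            raise ValueError("invalid lead byte")
        n = first & (0x7F >> length)
    if len(data) != length:
        raise ValueError("wrong sequence length")
    for b in data[1:]:
        if b & 0xC0 != 0x80:
            raise ValueError("invalid trail byte")
        n = (n << 6) | (b & 0x3F)
    return n


def pack_signed(v: int) -> bytes:
    """Assumed mapping: drop bit 31, which equals bit 30 in this range."""
    if not -(1 << 30) <= v < (1 << 30):
        raise ValueError("the two highest bits would differ")
    return encode_u31(v & 0x7FFFFFFF)


def unpack_signed(data: bytes) -> int:
    u = decode_u31(data)
    return u - (1 << 31) if u & (1 << 30) else u  # sign-extend from bit 30
```

Under this assumed mapping every negative value has bit 30 set and so
always takes the full six bytes, while small non-negative values stay
short.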

Of course, if I ever use it to represent something other than code points
in published data or text, I will rename it and won't keep the same
charset label. But it is highly probable that it would not be the most
efficient representation (because of its byte-oriented splitting), and a
more compact or easier-to-process serialization could require an
alternate encoding scheme (or transfer syntax).


__________________________________________________________________
<< ella for Spam Control >> has removed Spam messages and set aside
Newsletters for me
You can use it too - and it's FREE!  http://www.ellaforspam.com

