Re: An attempt to focus the PUA discussion [long]

Peter Kirk Sat, 01 May 2004 11:40:13 -0700

On 29/04/2004 16:56, Kenneth Whistler wrote:

> Peter Kirk wrote, in response to Ernest Cline:
>>> ... It simply is impossible
>>> to simulate non-zero canonical combining class characters in Unicode
>>> with anything other than a character with the appropriate canonical
>>> combining class. ...
>> True. But fortunately Unicode doesn't really need to worry about normalisation of PUA data, as this is surely out of its scope.
> Not quite. PUA code points are subject to the Unicode normalization algorithm, as well as any other. Their behavior in NFC or NFD, for example, is rigidly defined, if trivial: a PUA code point normalizes to itself.

Indeed. Perhaps I should have referred to any transformations of PUA data for normalisation. Unicode rightly does not transform it.
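
As a quick check of the "normalizes to itself" behaviour, here is a minimal sketch, assuming Python's standard unicodedata module. U+E000 is just an arbitrary PUA code point picked for the example, and the variable names are mine.

```python
import unicodedata

# U+E000 is the first code point of the BMP Private Use Area;
# any other PUA code point should behave the same way.
pua = "\ue000"

# Each normalization form should hand the PUA code point back unchanged.
unchanged = {
    form: unicodedata.normalize(form, pua) == pua
    for form in ("NFC", "NFD", "NFKC", "NFKD")
}

# The properties that make this trivial: general category Co (private use)
# and canonical combining class 0.
category = unicodedata.category(pua)
combining_class = unicodedata.combining(pua)

print(unchanged, category, combining_class)
```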

I was actually thinking more of logical normalisation, i.e. that it is not up to Unicode to decide whether <ELMTREE SYMBOL, COMBINING CHIPMUNK, COMBINING SQUIRREL> is semantically equivalent to <ELMTREE SYMBOL, COMBINING SQUIRREL, COMBINING CHIPMUNK> or, if they are, to provide a mechanism whereby one of these is normalised to the other. If in fact they are equivalent (e.g. the squirrel is on the ground, but the chipmunk is in the tree), then it is up to the PUA user to ensure that the data is ordered consistently or to provide private non-standard ordering mechanisms. Do you agree? If this is true, then there is no point in allocating the combining PUA characters to any class other than zero.
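
To make the ordering point concrete, here is a rough sketch, again assuming Python's unicodedata module. The three PUA assignments (U+E000 to U+E002 for the elm tree, chipmunk and squirrel) are purely hypothetical. For contrast it uses two standard combining marks with different non-zero classes, U+0301 (class 230) and U+0323 (class 220).

```python
import unicodedata

# Hypothetical private-use assignments, invented for this illustration only.
ELMTREE_SYMBOL = "\ue000"
COMBINING_CHIPMUNK = "\ue001"
COMBINING_SQUIRREL = "\ue002"

seq_a = ELMTREE_SYMBOL + COMBINING_CHIPMUNK + COMBINING_SQUIRREL
seq_b = ELMTREE_SYMBOL + COMBINING_SQUIRREL + COMBINING_CHIPMUNK

# PUA code points have combining class 0, so canonical reordering never
# touches them: the two orders stay distinct after normalization.
pua_orders_unified = (
    unicodedata.normalize("NFD", seq_a) == unicodedata.normalize("NFD", seq_b)
)

# Standard marks with different non-zero classes are reordered, so both
# input orders end up as the same normalized sequence.
std_a = "a\u0301\u0323"  # acute above (230), then dot below (220)
std_b = "a\u0323\u0301"  # dot below (220), then acute above (230)
std_orders_unified = (
    unicodedata.normalize("NFD", std_a) == unicodedata.normalize("NFD", std_b)
)

print(pua_orders_unified, std_orders_unified)
```

So any consistent ordering of the private marks has to be imposed by the PUA user's own tools before the data is compared.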

--
Peter Kirk
[EMAIL PROTECTED] (personal)
[EMAIL PROTECTED] (work)
http://www.qaya.org/
