----- Original Message -----
From: "Ernest Cline" <[EMAIL PROTECTED]>
To: "Kenneth Whistler" <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
Sent: Saturday, May 01, 2004 1:42 AM
Subject: Re: An attempt to focus the PUA discussion [long]

> > [Original Message]
> > From: Kenneth Whistler <[EMAIL PROTECTED]>
> >
> > On the other hand, I could not expect any software doing
> > Unicode normalization to pay any attention to *my* interpretation
> > of those equivalences, and if I really wanted to process data
> > using such equivalences, it would be up to me to write the
> > software to do so.
>
> Decompositions and canonical combining classes are the
> two things that affect normalization, and are why Unicode
> limits changes to these two to be made only in an upwardly
> compatible manner. This is what makes assigning those
> properties to private use characters so tricky.

As far as I know, the stability of normalization matters only for the interchange of data that uses and assumes the same standard Unicode conventions. This is not fundamental for PUAs, which are used under private conventions: users who agree on such a convention can also agree on their own normalization. Stability of PUAs is guaranteed only for applications that don't handle PUAs, or that treat them with the Unicode default properties.

If someone needs to assign new diacritics, new decomposable characters, or new precomposed characters in PUAs, and handle them with their own normalization, this should be OK. After all, this is what many fonts do every day: they internally assign codes to create ligatures or recognize variant forms, and these new private "characters" are internally mapped to PUAs, using their own normalizations. As the resulting string of reordered and rearranged glyphs is not interchanged but only used locally to render a text graphically, this already falls within what is allowed in PUAs.

These fonts (and the text layout engines that use them) don't care about the normative default properties of PUAs, because they use them with whatever properties they want: joining types, case mappings for special styles like SmallCaps, mirrored characters, bidirectional properties, and so on are freely changed from the default assignments in Unicode. GSUB tables can also be viewed as a normalization step performed by renderers, translating a sequence of standard Unicode code points into a string of glyph IDs whose values generally match either the standard code point they represent or a PUA code point.

The default combining class 0 of PUAs is necessary in Unicode so that an application that does not know their contextual semantics will not attempt to reorder them through the standard normalization algorithm. But I don't think this limits applications that use PUAs contextually, with other combining classes. So for me, all PUAs can be decomposable and reorderable within the private convention that defines private semantics for them; it is not the responsibility of Unicode to forbid this, and Unicode does not need to inspect what is in such a private convention.
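
To make the combining-class point concrete, here is a minimal Python sketch. It assumes a purely hypothetical private agreement in which U+E000 and U+E001 are combining marks with classes 230 and 220; the table PRIVATE_CCC and the helpers private_ccc and private_reorder are invented for illustration and are not part of Unicode or of any library. The standard unicodedata module reports class 0 for both code points and leaves them alone, while the private pass reorders them. Only the reordering step is sketched; private decompositions are not covered.

```python
import unicodedata

# Hypothetical private agreement: these PUA code points are treated as
# combining marks. The class values are invented for this example and are
# not defined by Unicode.
PRIVATE_CCC = {
    "\uE000": 230,  # hypothetical private "above" diacritic
    "\uE001": 220,  # hypothetical private "below" diacritic
}


def private_ccc(ch: str) -> int:
    """Combining class under the private convention, else the standard one."""
    return PRIVATE_CCC.get(ch, unicodedata.combining(ch))


def private_reorder(text: str) -> str:
    """Canonical-ordering-style pass that uses the private combining classes.

    Runs of characters with a non-zero class are stably sorted by class;
    characters with class 0 act as barriers, as in standard normalization.
    """
    out = []
    run = []
    for ch in text:
        if private_ccc(ch) == 0:
            out.extend(sorted(run, key=private_ccc))
            run = []
            out.append(ch)
        else:
            run.append(ch)
    out.extend(sorted(run, key=private_ccc))
    return "".join(out)


sample = "a\uE000\uE001"

# Standard view: PUA code points have combining class 0 and no decomposition,
# so standard normalization leaves their order untouched.
standard_classes = [unicodedata.combining(c) for c in sample]
standard_nfd = unicodedata.normalize("NFD", sample)

# Private view: the same string is reordered by the private classes.
private_result = private_reorder(sample)
```

An application that does not know the private convention keeps seeing the original order, which is exactly the stability the default class 0 provides; only software that has opted into the agreement applies private_reorder.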

