From: "Michael Everson" <[EMAIL PROTECTED]>
> At 17:02 -0800 2004-03-30, Mike Ayers wrote:
> >I feel obligated to take this one step further - these folks are
> >forgetting that "P" stands for "private". Their use of this space
> >is their own problem, in all senses. It does not seem reasonable to
> >me that *any* standard behavior could be expected of PUA code
> >points, from operating systems or applications, as such may have
> >chosen to, or may yet choose to, use those code points to
> >encapsulate very un-font-rendering-like behavior, and such a
> >decision, made past, present or future, is a perfectly valid private
> >use.
>
> Which I assume means: "it's wrong for Unicode to make ANY property
> pronouncements for ANY PUA characters, since that defines them, and
> removes the P from the Use."

Do you mean that any properties Unicode currently defines for PUA code points should be deprecated as normative values and left to implementers, so that no application could be called non-conforming for implementing other defaults? That might require some adjustments to the normative wording on Unicode conformance...

Likewise, variation selectors used with PUA code points should not be constrained: the current restrictions on variation selector usage should not apply to PUAs, given that a VSn should remain fully ignorable, including after PUA code points, which have no normative semantics in Unicode. The combination PUA+VSn therefore has no normative semantics in Unicode either.

Leave that to implementations, and maybe we will ease the development of new scripts by allowing other groups to work on interchangeable formats based on PUAs. Those scripts could later be integrated into Unicode after an easier phase of experimentation. This would ease the adoption of a later consensus and would give developers and researchers a great tool, since they could safely base their work on Unicode encoding conventions.

It would also be a good indicator that specialized 8-bit code sets are no longer necessary; IANA could then close its 8-bit encodings registry in favor of PUA-based encodings defined by conventional rules, which could then become a standard, open extension mechanism...

This would have the advantage of avoiding pressure on Unicode to standardize new scripts too quickly, and longer open experimentation would avoid many future errors in newly standardized scripts.

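To make concrete which "current defaults" are at stake, here is a small sketch that inspects what a Unicode-aware library reports for private-use code points. It uses Python's standard `unicodedata` module; the helper names `is_private_use` and `default_properties` are mine, for illustration only, and the values returned depend on the Unicode data shipped with the interpreter.

```python
import unicodedata

# The three private-use ranges defined by Unicode.
PUA_RANGES = [
    (0xE000, 0xF8FF),      # Private Use Area in the BMP
    (0xF0000, 0xFFFFD),    # Supplementary Private Use Area-A (plane 15)
    (0x100000, 0x10FFFD),  # Supplementary Private Use Area-B (plane 16)
]


def is_private_use(cp: int) -> bool:
    """Return True if the code point lies in one of the PUA ranges."""
    return any(lo <= cp <= hi for lo, hi in PUA_RANGES)


def default_properties(cp: int) -> dict:
    """Report the properties the library's Unicode data assigns by default."""
    ch = chr(cp)
    return {
        "general_category": unicodedata.category(ch),
        "bidi_class": unicodedata.bidirectional(ch),
        "combining_class": unicodedata.combining(ch),
    }


if __name__ == "__main__":
    for cp in (0xE000, 0xF0000, 0x10FFFD):
        print(f"U+{cp:04X}", is_private_use(cp), default_properties(cp))
```

A private agreement that wanted, say, a combining mark at U+E000 would have to override these values in the application itself, which is exactly the case where the conformance question arises.
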
The CSUR registry is one approach to defining new scripts, and SIL.org has its own, but so far I see little effort to allow these properties to be specified in a partially interchangeable format. One reason may be that Unicode has placed too many restrictions on the usage of PUAs, so that developers fear that protocols which need them would become non-conforming. I do think there must be a way to use PUAs safely, without ambiguity or risk of collision, using extension mechanisms similar to namespaces in XML, together with some normative declarations and possibly a registry of PUA sets (why not the IANA charsets registry, if it can reference the associated properties through a URL pointing to a script definition schema?).
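As a rough sketch of what I mean by a namespace-like mechanism: each private agreement is identified by a URI, and the same PUA code point can carry different properties under different agreements without colliding. Everything here is hypothetical (the class names, the `urn:example:` identifiers, the character names, and the choice of properties); it only illustrates the idea and is not an existing registry format.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class PuaProperties:
    """Properties a private agreement assigns to one PUA code point."""
    name: str
    general_category: str = "Co"
    bidi_class: str = "L"


def _is_private_use(cp: int) -> bool:
    return (0xE000 <= cp <= 0xF8FF
            or 0xF0000 <= cp <= 0xFFFFD
            or 0x100000 <= cp <= 0x10FFFD)


@dataclass
class PuaRegistry:
    """Maps a namespace URI to its own table of code point -> properties."""
    agreements: Dict[str, Dict[int, PuaProperties]] = field(default_factory=dict)

    def declare(self, namespace: str, cp: int, props: PuaProperties) -> None:
        # Only private-use code points may be redefined by an agreement.
        if not _is_private_use(cp):
            raise ValueError(f"U+{cp:04X} is not a private-use code point")
        self.agreements.setdefault(namespace, {})[cp] = props

    def lookup(self, namespace: str, cp: int) -> Optional[PuaProperties]:
        # Returns None when the agreement says nothing about this code point.
        return self.agreements.get(namespace, {}).get(cp)


if __name__ == "__main__":
    registry = PuaRegistry()
    # Two hypothetical agreements reuse U+E000 with different meanings.
    registry.declare("urn:example:script-a", 0xE000,
                     PuaProperties("SCRIPT-A LETTER ONE", "Lo", "L"))
    registry.declare("urn:example:script-b", 0xE000,
                     PuaProperties("SCRIPT-B COMBINING MARK", "Mn", "NSM"))
    print(registry.lookup("urn:example:script-a", 0xE000))
    print(registry.lookup("urn:example:script-b", 0xE000))
```

A document or protocol would then only need to declare which namespace it uses, much as an XML document declares its namespaces, and a receiver could fetch the associated property table from wherever the registry points.
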

