From: "Ernest Cline" <[EMAIL PROTECTED]> > I'd have to take the time to list them, but a quick glance convinces > me that there are at most several hundred combinations that would > need to be supported if we limit things to just those combinations > already in use. (it might take more, if for example all 256 potential > combining classes were supported instead of the 26 listed in > UCD.html), At 128 characters per combination plus more for a > few that might need them, it should prove possible to handle this > in 1 or 2 planes.
This seems highly excessive. We already have plenty of PUA space. All we need is a standard way (a file format? a protocol?) to transport PUA character properties, and possibly to encode a reference (a URI?) to the definition file or service; a rough sketch of what I mean is given at the end of this message. If Unicode does not want to do this job, it could at least participate in such an independent development by commenting on the protocol/format used to encode these properties (notably to make sure that the system remains extensible and can encode new properties that may be added later).

This would work in step with the evolution of the Unicode standard itself (versioning), which could be handled correctly (if less efficiently) through a sort of emulation layer that would "mimic" the behavior of newly standardized characters and properties. I don't expect that every application will be able to interpret this protocol or implement the emulation layer, but at least it becomes possible to create less ambiguous, interoperable solutions based on other existing standards. (That's why I think that, if such a separate development is created, it should be based on the most advanced interoperability technologies of today, notably XML with its schemas and namespaces.)

You think this is overkill? Well, in the near future I think it will be difficult for applications to follow the evolution of the Unicode standard, and version differences will soon cause a nightmare unless there is a more formal way to specify what is implicitly part of a given Unicode version (and so needs no complex protocol negotiation), clearly identified by an identifier resolvable by online services, and what can be supported as completely as possible by an emulation layer. XML schemas, because they are versionable, can really help here (notably because modern XML parsers can use local caches for definition data, including local built-in implementations, which are the most efficient).

So I don't like the idea of adding more PUAs with other defaults. I much prefer more freedom in the use of PUAs, and a way to turn what looks like a deviation from the standard today into a conforming solution. This will become more important with the remaining scripts to encode, simply because we really lack the resources to produce any standard for them. What this means is that the evolution of Unicode will soon become impossible without experimentation and gradual integration with some interoperable services. With the current standard stability policy, this need is even more pressing, because further correction of past errors will become nearly impossible (and so will stop any attempt to make significant evolutions to the standard itself).

It's clear that there are needs for PUAs today, simply because Unicode is becoming a universal standard for more and more applications. If this universal standard blocks evolution, then others will want to develop independent standards, and there will be a risk of splits caused by the OS vendors themselves. (See what happened 15 years ago to Unix, and how difficult it is today to reunify what was initially a single standard. Thankfully, GNU and Linux have been the motors of such reunification, because the other proprietary *nix versions are now converging toward interoperability with Linux; but this unification is probably still 15 to 20 years away, unless *nix vendors decide to preemptively abandon some "dead" branches, keeping only those that users want and are ready to learn and support themselves.)
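To make the idea concrete, here is a minimal sketch in Python of what such a transport format and emulation-layer lookup could look like. The XML vocabulary (the namespace URI, the puaprops and char elements, and their attribute names) is entirely hypothetical, invented for illustration; only the property values themselves follow the UCD conventions (general category, canonical combining class, bidi category).

    import unicodedata
    import xml.etree.ElementTree as ET

    # Hypothetical definition file for a PUA block. The namespace URI,
    # element names, and attributes are all assumptions, not a proposal.
    PUA_DEFINITIONS = """\
    <puaprops xmlns="http://example.org/2004/pua-properties"
              unicodeVersion="4.0.0">
      <char cp="E000" gc="Lo" ccc="0"   bidi="L"   name="MY SCRIPT LETTER KA"/>
      <char cp="E001" gc="Mn" ccc="230" bidi="NSM" name="MY SCRIPT SIGN TONE"/>
    </puaprops>
    """

    NS = "{http://example.org/2004/pua-properties}"

    def load_pua_properties(xml_text):
        """Parse a definition file into a dict keyed by code point."""
        root = ET.fromstring(xml_text)
        props = {}
        for char in root.iter(NS + "char"):
            cp = int(char.get("cp"), 16)
            props[cp] = {
                "gc": char.get("gc"),         # general category
                "ccc": int(char.get("ccc")),  # canonical combining class
                "bidi": char.get("bidi"),     # bidirectional category
                "name": char.get("name"),
            }
        return props

    def combining_class(ch, pua_props):
        """Emulation-layer lookup: prefer the transported PUA
        properties, fall back to the host's Unicode Character Database."""
        cp = ord(ch)
        if cp in pua_props:
            return pua_props[cp]["ccc"]
        return unicodedata.combining(ch)

    pua = load_pua_properties(PUA_DEFINITIONS)
    print(combining_class("\ue001", pua))   # 230, from the definition file
    print(combining_class("\u0301", pua))   # 230, from the host's UCD

An application that understands the format gets the intended combining behavior for the PUA characters; one that doesn't simply falls back to the ordinary PUA defaults. That graceful degradation is exactly what I would want from such a protocol.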

