Back to Patrick's original question. Warning: this post contains nothing about Klingon, or even Tengwar.
Patrick Rourke <[EMAIL PROTECTED]> wrote:

> One effect of the Unicode Consortium's rigorous proposal/review
> policy is that while a particular script or group of characters may
> not be adopted into Unicode for a couple of years after it is
> proposed, font makers usually don't get around to creating the fonts
> for those scripts until after they have been officially approved for
> Unicode.

There's no reason it has to be that way. Proposed glyphs are posted on
the Unicode Web site months in advance of their "go live" date, even
before the beta period, largely for this reason. I'm sure Unicode-aware
type designers like John Hudson don't wait until a version of Unicode
is formally released before they start designing glyphs.

> Would it be a misuse of the PUA to come up with a private agreement
> within a community to assign certain codepoints in the PUA to
> characters that have been proposed to the Unicode Consortium, but not
> yet approved, so that font designers and others in that community
> could get to work on establishing support for these characters, and
> so that content providers can begin the process of incorporating
> these characters into their content?

As some have already said, this is exactly what the PUA is for. But the
size and scope of the "community" may impose limits on the utility of
these PUA assignments. Certainly not all font designers and content
providers for a given non-Unicode script, worldwide, can be expected to
comply -- and if they do, it may cause another set of problems, as we
will see.

One important point to remember is that any use or proposed use of the
PUA, such as ConScript, is strictly up to private organizations, not
the Unicode Consortium. To be sure, ConScript is the domain of two guys
who are quite influential in Unicode, but they do not maintain
ConScript in any official capacity as representatives of Unicode.

> Would it be useful/practical for such an agreement to stipulate a
> versioning system whereby the font creators &c. and content providers
> in that community who wish to use the PUA mapping in question would
> have to release new versions of their products with the characters
> remapped to the approved codepoints upon the acceptance of the
> characters in Unicode (and with the PUA codepoints being obsolesced,
> and eventually removed, in subsequent versions of the agreement
> assignments, until all characters were assigned by the Unicode
> Consortium)?

I would think you could simply use the version number of the Unicode
Standard. For example, the use of Tagalog would have been conformant to
this proposed PUA registry until Unicode version 3.2, at which time it
would have to be removed from the registry because of its introduction
into Unicode. (A sketch of what such a versioned registry entry might
look like appears below.)

> This would I think considerably shorten the amount of time it would
> take for characters to become usable to a community after they had
> been accepted into Unicode, and would also provide a mechanism for
> the gradual introduction of "new" characters, while the versioning
> system would (I'd hope) prevent PUA code points from being used long
> after perfectly good permanent code points have been assigned.

Conformance to this registry, especially over a period of time, is up
to the user community. The presence of a standard is no guarantee that
it will be followed, or even noticed.
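To make the versioning idea concrete, here is a rough sketch, in
Python, of what an entry in such a registry might carry. This is just
one possible shape for the data, not anything Patrick actually
specified: the field names are invented, and the Tagalog PUA range
shown is hypothetical; only the fact that Tagalog was encoded in
Unicode 3.2 is real.

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class RegistryEntry:
        script: str
        pua_first: int   # first PUA code point assigned by the registry
        pua_last: int    # last PUA code point assigned by the registry
        unicode_version: Optional[str] = None  # Unicode version that
                                               # encoded the script, if any

        @property
        def deprecated(self) -> bool:
            # Once the script is in Unicode proper, its PUA assignment
            # is obsolete and should be withdrawn from the registry.
            return self.unicode_version is not None

    # Tagalog, per the example above. The PUA range is made up, but the
    # script really was encoded in Unicode version 3.2.
    tagalog = RegistryEntry("Tagalog", 0xE800, 0xE81F,
                            unicode_version="3.2")
    assert tagalog.deprecated

The point of recording the Unicode version in the entry itself is that
fonts and content tools can tell mechanically which assignments are
still live and which are overdue for remapping.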
Here's an example of a potential pitfall of widespread PUA
quasi-standardization.

John Jenkins has probably done more than anyone to get the Deseret
Alphabet encoded in Unicode (although it is never wise to overlook
Michael Everson's influence). John has a series of Web pages describing
Unicode and the DA. To this day, the main page at
<http://homepage.mac.com/jenkins/Deseret/Unicode.html> still includes
the following quote, in large bold italics:

"It is strongly recommended that any implementations of the Deseret
Alphabet conform to the ConScript encoding, if possible."

Now, I don't bring this up to point out that John isn't keeping his Web
pages up to date, but to show that this is and will continue to be a
widespread problem, on the Web and elsewhere, even among the most
diligent supporters of a script and of Unicode.

Suppose Old Persian Cuneiform is encoded in Patrick's PUA registry next
week, and that encoding achieves some popularity. Then suppose at some
later date it is encoded in Unicode, say in version 4.1. This will
necessarily cause the encoding in Patrick's registry to be withdrawn,
or at least deprecated. How many people will switch immediately to the
sanctioned Unicode encoding? How quickly will existing software and
data be converted? Probably not right away, and the chances of a timely
conversion are slimmer if the private-use encoding is particularly
successful, whether or not there are conversion scripts available to
help people make the change.

I provided a "Format A" conversion table to map Deseret characters from
the old ConScript encoding to the code positions introduced in Unicode
3.1, and another to map Shavian to its proposed Unicode code points.
You can see them at the ConScript site,
<http://www.evertype.com/standards/csur/index.html>. Whether anyone has
ever used these tables, or will ever notice them, is another matter
entirely. (A sketch of this kind of remapping appears at the end of
this message.)

> The main issue I can think of is the matter of rejected characters:
> what does one do if a character is rejected by the Unicode Consortium
> for valid reasons? Delete it from the agreement, and have to remove a
> distinction from the character data of the content providers? Leave
> it there, and so perpetuate some final version of the agreement for
> all time, as a kind of extension to Unicode?

This is exactly the reason for the "rigorous proposal/review policy"
mentioned earlier, and perhaps the biggest drawback to the concept of a
widespread PUA encoding for future Unicode scripts. It usually does
take a while to get characters encoded in Unicode, not just because
committees are big and slow and bureaucratic, but because there are
real decisions to be made that can take a lot of time and research.
Rushing these characters into use before Unicode and WG2 have finished
making these decisions could subvert the process and create the
dilemmas Patrick mentioned.
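For anyone wondering what such a conversion amounts to in practice,
here is a minimal sketch in Python. The official Deseret block really
does begin at U+10400 in Unicode 3.1, but the ConScript starting point
and block length below are stand-ins, not quoted from the actual
tables -- check the conversion tables at the ConScript site before
relying on specific values.

    # Hypothetical ConScript (PUA) starting point for Deseret; the real
    # value is in the registry's conversion table.
    CSUR_FIRST = 0xE830
    # First code point of the official Deseret block in Unicode 3.1.
    UNICODE_FIRST = 0x10400
    BLOCK_LENGTH = 0x50  # width of the official Deseret block

    # Build an ordinal-to-ordinal translation table, then let
    # str.translate() do the remapping in one pass.
    PUA_TO_UNICODE = {CSUR_FIRST + i: UNICODE_FIRST + i
                      for i in range(BLOCK_LENGTH)}

    def convert_deseret(text: str) -> str:
        """Remap old ConScript Deseret code points to their Unicode 3.1
        positions; all other characters pass through unchanged."""
        return text.translate(PUA_TO_UNICODE)

A conversion table is essentially a listing of such code point pairs;
wrapping one in a function like this is about all a conversion script
really needs to do.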
Sorry to be so negative.

-Doug Ewell
 Fullerton, California
