Thank you for your reply.

>I do feel I need to comment in regard to your two messages, totaling 19
>KB, which are largely focused on the Private Use Area and quasi-official
>codifications of its usage.
Well, the ideas are not intended to be quasi-official. I am just one end user of the Unicode system, seeking to use the Private Use Area to good effect and putting forward ideas to other end users who might like to consider using some of the facilities suggested.

>Many of the issues relating to the PUA have been discussed numerous
>times on this list. We all know, as you state in one of your posts,
>that Unicode is committed to leaving the PUA free and available to all
>users, to the point that they will not sanction any "semi-official"
>mappings of characters to the PUA nor any indexing mechanism to
>reference such mappings. I used to wonder why there was no link on the
>Unicode Web site to ConScript, since it seemed to me like a creative use
>of Unicode. Now I understand that such a link might be misinterpreted
>as an official endorsement of ConScript.

Well, I have found that if I don't mention that Private Use Area allocations will not be endorsed, then someone will usually point that out as if doing so refutes my suggestion. Also, I was replying to someone who may possibly not have known about the matter.

It is a pity that ConScript cannot be linked from the Unicode website. I would have thought that mentioning that various people are using the Private Use Area in various ways and, after having stated the non-endorsement rule, providing a few links would have been all right. Still, it's not my website, and maybe there are legal implications of which I am unaware should the Unicode Consortium provide such a link.

>1. There is *strong* opposition to encoding additional presentation
>forms for alphabetic characters. Ligatures are presentation forms.
>Beginning with version 3.1, Unicode has stated that alphabetic ligatures
>may be formed with the help of U+200D ZERO WIDTH JOINER, or
>automatically by the font without explicit encoding.
>(As Michael Everson pointed out in his "zero-width ligator" papers, the
>automatic-formation approach requires hairy contextual analysis in cases
>such as Fraktur.) But the whole reason for inventing these solutions is
>that additional Latin ligatures are EXTREMELY unlikely to be encoded.

Well, I had a search for Mr Everson's papers and found three of them, n2141.pdf, n2147.pdf and n2317.pdf, on his http://www.evertype.com website. I have had a look through them and hope to have a further read later.

Now, the fact is that Michael suggested a character named ZERO WIDTH LIGATOR specifically for the purpose of ligation. It appears that the suggestion was not accepted and that a shared solution was decided upon instead, using a code point (U+200D ZERO WIDTH JOINER) that can also mean something else. I do not know the details of all of this and I certainly hope to study the matter more. Yet, as someone who is not a linguist as such but an inventor and programmer, I have a concern that using one code point for two types of meaning, rather than one code point for each type of meaning, is what I call a software unicorn. The concept of a software unicorn can be read about on http://www.users.globalnet.co.uk/~ngo/euto0008.htm if anyone is interested.

Certainly, if I were encoding ligation as an operator, I feel that I would perhaps tend to introduce two new code points, namely PLEASE LIGATE THE NEXT TWO CHARACTERS and PLEASE LIGATE THE NEXT THREE CHARACTERS, rather than a ZERO WIDTH LIGATOR, which is only reached after the first character of the ligature has been reached. Yet I would also try to make sure, as Michael has done, that ligation was carefully separated from other operations. I feel that perhaps the matter needs to be looked at again and Michael's suggestion for the ZERO WIDTH LIGATOR reviewed, with the possibility that it become accepted after all.

I am wondering whether to add ligation facilities into the U+F3..
block of my usage of the Private Use Area so as to provide a safety net in case those software unicorns start galloping, for, as the song goes, a castle of software can fall to the ground, if over its drawbridge their golden hooves pound!

Yet I do not regard my desire to formally encode some more ligatures into the U+FB.. block as in any way contradictory to the use of a ZERO WIDTH LIGATOR code point. There is room for both methods to be available, and the end user can make his or her choice depending upon the application and upon what facilities are available to him or her. Someone compiling a dictionary might well use a ligator approach. If someone wishes to set Fraktur, or to set in an 18th Century English style, using TrueType founts with a word processing package or a multimedia authoring package, then using ligature characters might be the best approach.

Certainly my background in setting metal type perhaps influences me to want to have the ligature characters as such, yet metal type was used for centuries and texts set in it may well be transcribed onto computers. I accept that there can be problems with an ever expanding set of ligatures, yet the glyph design has got to be available for the ligature somehow, and the "one code point gives one ligature character" approach is certainly effective with word processing software and multimedia authoring packages.

As to strong opposition to encoding additional presentation forms for alphabetic characters, well, we live in a democratic society, and if some people who would like to produce quality printing feel that using a TrueType fount with some ligature characters does what they want and harms no one else, what exactly is the objection? Certainly, if it had to be one method or the other, then the ligator operator would be best. Yet it does not have to be one method or the other: there is scope to encode the ligatures for Fraktur and for 18th Century English books and also have Michael's ZERO WIDTH LIGATOR.
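
A minimal sketch of the two representations discussed above, assuming Python and its standard unicodedata module. The helper names with_joiner and plain_text are illustrative only and do not come from any proposal. It encodes the same "fi" ligature once as a joiner sequence using U+200D ZERO WIDTH JOINER and once as the existing ligature character U+FB01, and shows that both reduce to the same plain letters for searching or comparison.

```python
import unicodedata

ZWJ = "\u200d"  # U+200D ZERO WIDTH JOINER


def with_joiner(letters: str) -> str:
    """Request a ligature by placing ZWJ between successive letters."""
    return ZWJ.join(letters)


def plain_text(s: str) -> str:
    """Reduce either representation to its plain letters.

    NFKC replaces compatibility ligature characters such as U+FB01
    with their component letters; the joiner is then removed.
    """
    return unicodedata.normalize("NFKC", s).replace(ZWJ, "")


joined = with_joiner("fi")   # 'f' + U+200D + 'i', three code points
precomposed = "\ufb01"       # LATIN SMALL LIGATURE FI, one code point

print([f"U+{ord(c):04X}" for c in joined])
print(unicodedata.name(precomposed))
print(plain_text(joined) == plain_text(precomposed))
```

Whether a ligature glyph is actually displayed for either form still depends on the fount and on the rendering software; the sketch only shows how the text is stored.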
Maybe there needs to be a note about which method is considered most appropriate for the major uses, yet there is, I suggest, scope for both methods to be encoded into regular Unicode. I hope that that will happen at some time in the future. For the moment, I am trying to have a discussion about it with a view to producing a list of Private Use Area code points by the weekend if possible. Then anyone who uses that list, perhaps to produce a TrueType fount, can at least have a set of code points to use.

>But don't expect that action to have any bearing on what
>UTC or WG2 does. They want formal proposals, and they have an official
>form. If I decide there is enough support for "lock" and "unlock" to
>warrant a proposal (the grass-roots vote so far is 3 for, 1 against), I
>will fill out the form. Doing a PUA implementation is fine, but has
>nothing to do with formal proposals.

Certainly they want proposals to be formal and on an official form. Whether a Private Use Area implementation has nothing to do with formal proposals is not, I feel, so clear-cut. Certainly, I do not expect the fact that I have suggested four particular code points for various padlocks in the Private Use Area to influence a formal decision. Yet if various people at various organizations are, without making any public announcement, trying out a fount with two or four padlock symbols in it, then maybe, just maybe, they will use the code points that I suggested in my posting. If they do, then any test applications they make that use the padlock symbols expressed as Unicode code points may be interoperable with test applications made by other researchers. That might be of benefit at some stage in the future, if perhaps various people make test founts with padlock symbols in them available for trials.
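
A minimal sketch of how a shared Private Use Area list could make test applications interoperable, assuming Python and its standard unicodedata module. The code points U+E000 and U+E001 and their labels are hypothetical placeholders chosen for illustration; they are not the allocations suggested in any posting, and the names SHARED_LIST, is_private_use and describe are likewise invented here.

```python
import unicodedata

# Hypothetical shared list: Private Use Area code point -> agreed meaning.
# These allocations are placeholders for illustration only.
SHARED_LIST = {
    0xE000: "example ligature fj",
    0xE001: "example ligature ft",
}


def is_private_use(cp: int) -> bool:
    """True if the code point has general category Co (private use)."""
    return unicodedata.category(chr(cp)) == "Co"


def describe(text: str, allocations: dict) -> list:
    """Label each private use character according to the shared list."""
    result = []
    for ch in text:
        cp = ord(ch)
        if is_private_use(cp):
            # Two applications agree on a meaning only if both use the same list.
            result.append(allocations.get(cp, f"unassigned private use U+{cp:04X}"))
        else:
            result.append(ch)
    return result


print(describe("a\ue000\ue005", SHARED_LIST))
```

A code point missing from the list is reported as unassigned rather than guessed at, since private use characters carry no meaning beyond what the users of a list have agreed.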
As for the ligatures, I feel that having a list of code points available is worthwhile, so that anyone who does want to make a fount with ligature characters in it has a list to use. I am adding various characters to my list. A gentleman emailed to suggest fj as a ligature, as in the word fjord. Also, in Michael's documents there are some additional characters, including an ft in Fraktur. It is interesting. I have certainly learned a lot by following up your mention of Michael's papers.

>
>Sorry for chewing up so much additional bandwidth.

Well, I enjoyed reading what you wrote. Thank you for replying so fully.

William Overington

22 May 2002

