Ken,

In the good old days of 8-bit, if one wanted to make a Thaana font that worked, one used 8-bit Arabic code points for the letters and Arabic code points for the vowel signs. It was a hack, but it worked. It worked because the OS (Mac, PC, whatever) treated the characters appropriately to their DEFINED meanings. Of course, the font-hacker was cheating, but in terms of processing, what he did worked.

Now we've got the PUA and we say "use it". Let's look at non-standardized Tengwar then. Tengwar has letters and combining marks, and is encoded in the CSUR. (Revision is long overdue, but never mind that.) Let's pretend that no one will ever encode Tengwar. Which is preferable:

All PUA characters are defined as whatever they are, and there you go.

or

PUA characters can be defined, locally and privately, according to some protocol which will WORK if people write software to do what they want.

I am not a programmer. But if the first scenario is all that people can have, won't they just start substituting Tengwar consonants for Unicode U+00xx characters, and substituting Tengwar vowels for Unicode U+03xx characters?

What I see people asking for is some sort of protocol that, locally and privately, between agreeing parties, they can use to say what a PUA character's properties are.
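
To make the idea concrete, here is a minimal sketch of what such a private agreement might look like to software. It is only an illustration, not an existing protocol: the PrivateProps type, the AGREEMENT table, the split of the range into "letters" and "marks", and the combining class 230 are all assumptions invented for the example. The only real calls are Python's standard unicodedata functions, which supply the default properties when the agreement says nothing.

```python
import unicodedata
from dataclasses import dataclass


@dataclass(frozen=True)
class PrivateProps:
    category: str   # General_Category, e.g. "Lo" (letter) or "Mn" (nonspacing mark)
    bidi: str       # Bidi_Class, e.g. "L" or "NSM"
    combining: int  # Canonical_Combining_Class


# Hypothetical private agreement between two parties. The ranges and values
# below are illustrative assumptions, not taken from the CSUR or any standard.
AGREEMENT = [
    (range(0xE000, 0xE040), PrivateProps("Lo", "L", 0)),       # treated as letters
    (range(0xE040, 0xE060), PrivateProps("Mn", "NSM", 230)),   # treated as marks above
]


def is_private_use(ch: str) -> bool:
    """True if the standard itself classifies ch as private use (Co)."""
    return unicodedata.category(ch) == "Co"


def props(ch: str, agreement=AGREEMENT) -> PrivateProps:
    """Return the agreed properties for a PUA character, else the defaults."""
    if is_private_use(ch):
        cp = ord(ch)
        for block, private in agreement:
            if cp in block:
                return private
    # Outside the PUA, or not covered by the agreement: use the standard values.
    return PrivateProps(
        unicodedata.category(ch),
        unicodedata.bidirectional(ch),
        unicodedata.combining(ch),
    )
```

With this table, props("\ue040") reports a nonspacing mark, while a PUA character the agreement does not mention, or any ordinary character, keeps whatever the standard says. The override applies only inside the PUA, so the agreement cannot redefine standardized characters.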

Is that something we can give them? For surely in the absence of such a protocol, it's back to the world of "ASCII hacks" with U+ encoding.

In other words, it's fine for the UTC to say "PUA characters are all LTR spacing characters" but for people who need to do something else, the means to do so should be made available to them.

If I have understood the question.
--
Michael Everson * * Everson Typography *  * http://www.evertype.com


