Doug Ewell posted on my suggestion of using IPA characters to encode Shavian:

It's not a 1-to-1 match; Shavian includes letters for "affricatives,"
diphthongs, and other compound sounds.

The phonetic chart for Shavian at http://www.unicode.org/pending/shavian/shavian.html indicates two affricates, both of which are coded as single characters in Unicode, as U+02A4 and U+02A7.
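As a quick check of those two code points, here is a minimal Python sketch that looks up their names in the standard `unicodedata` module. The helper name `describe` is only illustrative, and the sketch does not claim which Shavian letter pairs with which character.

```python
import unicodedata

# The two IPA affricate characters cited above.
AFFRICATES = ["\u02A4", "\u02A7"]


def describe(ch):
    """Return 'U+XXXX NAME' for a single character (illustrative helper)."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for ch in AFFRICATES:
    # Each affricate is one code point, not a two-letter sequence.
    print(describe(ch), "| code points:", len(ch))
```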


The diphthongs are the problem.

For the letter OUT probably U+0223 (LATIN SMALL LETTER OU) would do.

For the others, one might adopt some wild interpretation: an acute accent means a following _i_-type sound, a grave accent means a preceding _i_/_j_-type sound, and the Vietnamese hook indicates a following _r_. One would then use the closest precomposed forms.
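A rough Python sketch of how such a scheme could be tested, assuming the accents are the combining acute (U+0301), combining grave (U+0300), and combining hook above (U+0309). The base vowels used here are arbitrary placeholders, not a proposed mapping for any particular Shavian letter, and `compose` and `has_precomposed` are hypothetical helper names.

```python
import unicodedata

# Assumed combining marks for the scheme described above.
ACUTE = "\u0301"       # following i-type sound
GRAVE = "\u0300"       # preceding i/j-type sound
HOOK_ABOVE = "\u0309"  # following r (the Vietnamese hook)


def compose(base, mark):
    """Return the NFC form of base + mark (precomposed if one exists)."""
    return unicodedata.normalize("NFC", base + mark)


def has_precomposed(base, mark):
    """True if base + mark collapses to a single code point under NFC."""
    return len(compose(base, mark)) == 1


# Placeholder vowels only; the real choice of base letters is left open.
for base, mark in [("a", ACUTE), ("u", GRAVE), ("o", HOOK_ABOVE), ("\u0254", ACUTE)]:
    result = compose(base, mark)
    codes = " ".join(f"U+{ord(c):04X}" for c in result)
    print(f"U+{ord(base):04X} + U+{ord(mark):04X} -> {codes}",
          "(precomposed)" if has_precomposed(base, mark) else "(no precomposed form)")
```

Plain Latin vowels compose to single characters, while an IPA base such as U+0254 does not, which is why the scheme falls back on the closest precomposed forms.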

A hideous kludge, but no more unintuitive than some actual Latin transliteration schemes I have seen for non-Latin scripts, and to my mind better than the PUA, which also requires non-standard fonts.

Both methods are kludges, but the cypher method allows the data to be read, though with some difficulty, even without a proper font.

The danger is that if you leave a non-standard cypher font active, a browser or word processor may pick up its characters in lieu of the standard glyphs from other fonts whenever your current font lacks one of those characters.

Of course Shavian has escaped the PUA ghetto.

But there are other alphabets that have gained far less acceptance and probably should not be encoded in Unicode, at least for the foreseeable future.




Jim Allan











