[Feed another to the shubnet . . .]

I have a copy of Shellabear's Practical Malay Grammar that I'm preparing 
to transcribe for Project Gutenberg. Unfortunately, he represents the 
Malaysian alphabet in a Latin transliteration that includes ng as a 
single ligatured form, and I don't know how to transcribe it in Unicode. 
Some ideas: 

(1) Use a private use character. Not feasible, because it needs to be 
readable by the average person, not just someone who has the patience to 
set up their computer for this one file.

(2) Use a ZWJ between n and g. If I'm not mistaken, most current systems 
will show the ZWJ as a little black box, and very few systems will 
actually display the ng ligature any time soon. Still, a good Unicode 
system will elide the ZWJ, displaying an acceptable ng while the real 
information stays in the file.

(3) Petition Unicode for a new character. Right. I'd be arguing 
for a character used in two books (that I know of) that bears an 
annoying similarity to the ng (non-ligatured) flame wars, and 
in the best of cases I'd wait a couple of years for it to be accepted.

(4) Resort to ASCII trickery to distinguish between ng (ligatured) and 
ng (non-ligatured). Marking the ng (ligatured) would be ugly; marking 
the unligatured would also be ugly, although a lot rarer. I don't know 
whether Malay (in this transliteration) uses ng (non-ligatured) at all. 

(5) Just use ng. A simple, ASCII-only solution. I don't know whether it 
preserves the information, though.
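
To make (2), (4) and (5) concrete, here is a rough Python sketch of how 
the three forms could be converted into one another. The "n^g" marker is 
purely a made-up ASCII convention for illustration, the function names 
are hypothetical, and the sample word is only an example, not a claim 
about how the book spells anything.

```python
ZWJ = "\u200d"  # U+200D ZERO WIDTH JOINER

# Assumption: "n^g" is an invented ASCII marker for the ligatured ng (option 4).
ASCII_MARK = "n^g"


def ascii_to_zwj(text: str) -> str:
    """Option (2): turn the ASCII marker into n + ZWJ + g."""
    return text.replace(ASCII_MARK, "n" + ZWJ + "g")


def zwj_to_ascii(text: str) -> str:
    """Go back from the ZWJ form to the ASCII marker (no information lost)."""
    return text.replace("n" + ZWJ + "g", ASCII_MARK)


def to_plain(text: str) -> str:
    """Option (5): drop the ZWJ, leaving plain ng. This direction is lossy."""
    return text.replace(ZWJ, "")


marked = "oran^g"               # illustrative input only
joined = ascii_to_zwj(marked)   # n and g now separated by an invisible ZWJ
plain = to_plain(joined)        # what a ZWJ-eliding display would show
```

The point of the sketch is that the ZWJ and ASCII-marker forms can be 
converted back and forth mechanically, while plain ng cannot be converted 
back unless ng (non-ligatured) never occurs in the text.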

Any suggestions?

-- 
David Starner - [EMAIL PROTECTED]
Gutenberg stuff - http://dvdeug.dhis.org/guten/ (down for the week)
