On 31/07/2003 11:31, Jony Rosenne wrote:

This argumentation applies equally well to th (which should be at least two
Unicodes in English), gh (how many?), etc.

Jony



-----Original Message-----
From: Ted Hopp [mailto:[EMAIL PROTECTED]]
Sent: Thursday, July 31, 2003 4:58 PM
To: Peter Kirk
Cc: Jony Rosenne; [EMAIL PROTECTED]
Subject: Re: Hebrew Vav Holam





...


I think of holam male as an indivisible glyph that happens to look like a vav with a dot centered above it (or above its stem, if you will, but that's just how it might vary from font to font). It's much the same as a lower-case 'i' not being a dotless i glyph with a combining dot. (Sometimes an 'i' is just an 'i'.) I wouldn't call the dot anything but a dot, certainly not a holam male.
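The 'i' comparison can be checked against the Unicode character data. The sketch below uses Python's standard `unicodedata` module; the constant names are illustrative only and come from neither message. It shows only that 'i' has no canonical decomposition and that dotless i followed by a combining dot above does not normalize to 'i'. It says nothing about how holam male ought to be encoded.

```python
import unicodedata

DOTLESS_I = "\u0131"      # LATIN SMALL LETTER DOTLESS I
COMBINING_DOT = "\u0307"  # COMBINING DOT ABOVE

# An ordinary 'i' has no canonical decomposition: the dot is part of the letter.
i_decomposition = unicodedata.decomposition("i")

# Dotless i plus a combining dot stays a two-code-point sequence under NFC;
# normalization does not turn it into a plain 'i'.
composed = unicodedata.normalize("NFC", DOTLESS_I + COMBINING_DOT)

print(repr(i_decomposition))
print(len(composed), composed == "i")
```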

Let's encode Hebrew, not dots. It may mean changes to what SIL, UniScribe, and others are doing, but there's no free lunch here.




As a native speaker of English, I certainly think of th and gh as sequences of two glyphs, not as indivisible combinations, so that is the difference here.

But a better example might be French e, e acute and e grave. These are three separate letters, which need three different encodings. Whether the accented versions are encoded as one character or two is unimportant as long as they are distinct. Similarly, we have three letters: vav on its own, vav with right holam and vav with left holam. So we need three ways of encoding them.
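As a rough illustration of the point about distinctness, here is a minimal sketch using Python's standard `unicodedata` module. The helper `nfc` and the variable names are made up for this example. It checks that the one-character and two-character spellings of each accented e normalize to the same string, while the three letters stay distinct from one another. The last lines only list the code points in the sequence vav + holam; they do not show how the two holam readings could be told apart, which is the open question in this thread.

```python
import unicodedata

def nfc(text):
    """Return the NFC-normalized form of text."""
    return unicodedata.normalize("NFC", text)

# Each letter is listed with its possible spellings (precomposed, decomposed).
spellings = {
    "e": ["e"],
    "e acute": ["\u00e9", "e\u0301"],
    "e grave": ["\u00e8", "e\u0300"],
}

# After normalization, every spelling of a given letter collapses to one form.
canonical = {name: {nfc(s) for s in forms} for name, forms in spellings.items()}

# The three letters still have three different canonical forms.
distinct_forms = {next(iter(forms)) for forms in canonical.values()}
print(len(distinct_forms))

# Vav followed by holam is a single code point sequence.
vav_holam = "\u05d5\u05b9"
vav_holam_names = [unicodedata.name(ch) for ch in vav_holam]
print(vav_holam_names)
```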

As for the character names, I am forced to consider them entirely meaningless except for being unique and stable, as UTC has refused to correct demonstrable mistakes in these names, including that of at least one Hebrew accent. So I would actually prefer to use a meaningless random string of characters, because at least that is more or less guaranteed not to be misleading.

--
Peter Kirk
[EMAIL PROTECTED]
http://web.onetel.net.uk/~peterkirk/
