Re: Yerushala(y)im - or Biblical Hebrew

Peter Kirk Sat, 26 Jul 2003 04:09:19 -0700

On 25/07/2003 17:39, Kenneth Whistler wrote:

...In Unicode 4.0, CGJ has been stripped of all interpretation
except as an invisible mark which can be used to tailor
collation (and searching), so as to distinguish digraphic units
from sequences of the same characters.

Thank you, Ken, for the long and helpful explanation of which this is an extract.

One question arises. If CGJ is used as proposed, so that we have sequences such as patah CGJ hiriq and perhaps meteg CGJ vowel, does this imply that these sequences will necessarily be treated in collation as distinct from the simple patah hiriq and meteg vowel sequences (the simple sequences would of course be reordered by normalisation)? I'm not yet sure whether this would be desirable or not.

It would probably be better for meteg CGJ vowel to be collated the same as vowel meteg, as the distinction here is graphical but not semantic. As for patah CGJ hiriq, an advantage of collating this sequence the same as hiriq patah would be that existing texts which do not have CGJ here would be collated together with ones which do, and perhaps that users doing searches would not have to type the CGJ. But is this perhaps something for which specific collation rules can be tailored?
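To make the normalisation point concrete, here is a minimal sketch in Python using the standard unicodedata module. It shows that a plain patah hiriq (or meteg vowel) sequence is reordered by normalisation, while the same sequence with CGJ between the marks is left alone. The helper fold_cgj is purely hypothetical: it illustrates one way a search or collation key might ignore CGJ, and is not how any actual collation tailoring is specified.

```python
import unicodedata

LAMED = "\u05DC"  # base letter, used here only as a carrier for the points
PATAH = "\u05B7"  # canonical combining class 17
HIRIQ = "\u05B4"  # canonical combining class 14
METEG = "\u05BD"  # canonical combining class 22
CGJ = "\u034F"    # COMBINING GRAPHEME JOINER, combining class 0


def nfc(text: str) -> str:
    """Return the NFC form of text (includes canonical reordering)."""
    return unicodedata.normalize("NFC", text)


def fold_cgj(text: str) -> str:
    """Hypothetical search key: drop CGJ, then normalise.

    This is an assumption for illustration, not a rule from any standard.
    """
    return nfc(text.replace(CGJ, ""))


plain = LAMED + PATAH + HIRIQ           # patah hiriq, no CGJ
joined = LAMED + PATAH + CGJ + HIRIQ    # patah CGJ hiriq
meteg_plain = LAMED + METEG + PATAH     # meteg vowel, no CGJ
meteg_joined = LAMED + METEG + CGJ + PATAH

# Without CGJ the marks are sorted by combining class.
print(nfc(plain) == LAMED + HIRIQ + PATAH)
print(nfc(meteg_plain) == LAMED + PATAH + METEG)

# With CGJ the original order survives normalisation.
print(nfc(joined) == joined)
print(nfc(meteg_joined) == meteg_joined)

# Under the hypothetical key, texts with and without CGJ compare equal.
print(fold_cgj(joined) == fold_cgj(plain))
print(fold_cgj(meteg_joined) == fold_cgj(meteg_plain))
```

Under such a key, patah CGJ hiriq would match hiriq patah and meteg CGJ vowel would match vowel meteg, which is the behaviour suggested above as possibly desirable. Whether a real collation tailoring can or should do this is exactly the open question.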

--
Peter Kirk
[EMAIL PROTECTED]
http://web.onetel.net.uk/~peterkirk/
