Re: Yerushala(y)im - or Biblical Hebrew

Peter Kirk Tue, 08 Jul 2003 03:07:30 -0700

On 07/07/2003 19:23, John Hudson wrote:

At 08:51 07/07/2003, Ted Hopp wrote:

Editing would also be an "interesting" experience. Could one search for lamed-patah and find it as part of lamed-<patah+hiriq>? Or would the proposal be to use these new codes only as part of bookend processing around normalization (i.e., automatically recognize the sequences and substitute, normalize, and then automatically substitute back)?

I suppose the latter is feasible. I am very keen that *any* solution should be invisible to the user.
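As a rough illustration of the bookend processing Ted Hopp describes, here is a minimal sketch using Python's standard unicodedata module. It covers only the patah-hiriq pair. The placeholder U+E000 and the function name normalize_preserving_order are assumptions made for this example, not the codes proposed in this thread; a private-use code point is used simply because it has combining class 0 and no decomposition, so normalization leaves it alone.

```python
import unicodedata

PATAH = "\u05B7"  # HEBREW POINT PATAH, canonical combining class 17
HIRIQ = "\u05B4"  # HEBREW POINT HIRIQ, canonical combining class 14

# Hypothetical stand-in for a "new code": a private-use code point.
PLACEHOLDER = "\uE000"


def normalize_preserving_order(text: str, form: str = "NFC") -> str:
    """Substitute, normalize, then substitute back (the 'bookends')."""
    if PLACEHOLDER in text:
        raise ValueError("placeholder already occurs in the text")
    # Bookend 1: hide the order-sensitive sequence behind one code point.
    protected = text.replace(PATAH + HIRIQ, PLACEHOLDER)
    # Normalization cannot reorder what it no longer sees as two marks.
    normalized = unicodedata.normalize(form, protected)
    # Bookend 2: restore the original mark order.
    return normalized.replace(PLACEHOLDER, PATAH + HIRIQ)


# lamed + patah + hiriq + final mem
word = "\u05DC" + PATAH + HIRIQ + "\u05DD"
plain = unicodedata.normalize("NFC", word)
kept = normalize_preserving_order(word)
```

Here plain comes back with hiriq before patah, while kept still has patah first. Note that the restored text is deliberately no longer in normalized form, so any later normalization pass that skips the bookends would reorder the marks again.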

Would it work to define a new character, for example, for patah-hiriq which has a canonical decomposition into patah plus hiriq, or even into hiriq plus patah? Would normalisation compose a patah-hiriq sequence into this character and so get round the reordering problem? Remember that the reverse sequence is actually not attested, as far as I can tell, for any of the sequences in question.
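For reference, the reordering problem itself can be reproduced with Python's unicodedata module. This is only a sketch of current behaviour for the patah-hiriq pair; the helper name combining_classes is an assumption for illustration. Patah has canonical combining class 17 and hiriq has 14, so canonical ordering puts hiriq first, and both typing orders end up as the same normalized string.

```python
import unicodedata

LAMED = "\u05DC"  # HEBREW LETTER LAMED (a base letter, class 0)
PATAH = "\u05B7"  # HEBREW POINT PATAH
HIRIQ = "\u05B4"  # HEBREW POINT HIRIQ


def combining_classes(text: str) -> list[int]:
    """Canonical combining class of each code point in text."""
    return [unicodedata.combining(ch) for ch in text]


attested = LAMED + PATAH + HIRIQ  # order found in the biblical text
reverse = LAMED + HIRIQ + PATAH   # order that is not attested

# Adjacent marks with different non-zero classes are sorted by class,
# so normalization rewrites the attested order into the reverse one.
nfc_attested = unicodedata.normalize("NFC", attested)
nfc_reverse = unicodedata.normalize("NFC", reverse)

print(combining_classes(attested))     # classes before normalization
print(combining_classes(nfc_attested)) # classes after normalization
print(nfc_attested == nfc_reverse)     # the two orders are now identical
print((LAMED + PATAH) in nfc_attested) # plain search for lamed-patah
```

The last line bears on the editing question raised earlier in the thread: once the text has been normalized, a naive substring search for lamed followed by patah no longer matches, because hiriq now sits between them.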

I think we need to keep Peter Constable's point in mind that current usage should not define the limits of Unicode functionality. Since the principle is that all sequences of character codes are permitted (2.10), it seems wrong to supply a fix for only "the small number of attested sequences".

But I agree here. The kind of solution I have just proposed is in danger of escalating in the way in which the number of Latin characters escalated until a decision was made not to add any more.

--
Peter Kirk
[EMAIL PROTECTED]
http://web.onetel.net.uk/~peterkirk/
