Just a reminder that the statement of the problem has not been agreed to. I
don't see a vowel sequence in Yerushala(y)im.

Jony

> -----Original Message-----
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Peter Kirk
> Sent: Tuesday, July 08, 2003 3:19 PM
> To: [EMAIL PROTECTED]
> Subject: SPAM: Re: Yerushala(y)im - or Biblical Hebrew
> 
> 
> On 08/07/2003 02:23, Peter Kirk wrote:
> 
> >
> > Would it work to define a new character, for example, for patah-hiriq
> > which has a canonical decomposition into patah plus hiriq, or even
> > into hiriq plus patah? Would normalisation compose a patah-hiriq
> > sequence into this character and so get round the reordering problem?
> > Remember that the reverse sequence is actually not attested, as far as
> > I can tell, for any of the sequences in question.
> >
> A couple of off-list comments have made it clear to me that this
> proposal needs some clarification and adjustment. But I think it can
> still be made to work. It is a nasty kludge, but then, as someone
> pointed out, any solution to this problem is bound to be a nasty
> kludge. In some ways it is less nasty than others that have been
> suggested, and it doesn't have some of the disadvantages that have
> been mentioned. It also has the advantage that no recoding of existing
> text is required. That doesn't make it my preferred solution (the CGJ
> solution is still that), but it is at least worth considering.
> 
> This solution requires adding a new character for each vowel sequence
> found in Hebrew texts. Currently six such sequences have been
> identified in the WTS Bible text - though one of these (sheva-hiriq) is
> already in canonical order and so not a problem. So this is fewer new
> characters than the earlier proposal - but there may be other sequences
> in other texts. This relies on the fact that none of these sequences
> are found in reverse, although we cannot guarantee that this is true
> for all texts. I will use the patah-hiriq sequence as an example; all
> other sequences would be solved separately in the same way.
> 
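The point about canonical order can be checked against the combining
classes that Python's standard unicodedata module reports. This is only a
minimal sketch: the helper name in_canonical_order and the choice of lamed
as the base letter are assumptions made for illustration, not part of the
proposal.

```python
import unicodedata

# Hebrew points named in the message, plus a base letter for the demo.
SHEVA, HIRIQ, PATAH = "\u05B0", "\u05B4", "\u05B7"
LAMED = "\u05DC"  # arbitrary base letter (assumption, for illustration only)


def in_canonical_order(marks: str) -> bool:
    """True if the combining classes of the marks never decrease.

    Only meaningful for a run of marks whose classes are all non-zero,
    which holds for the Hebrew points used here.
    """
    classes = [unicodedata.combining(m) for m in marks]
    return all(a <= b for a, b in zip(classes, classes[1:]))


# sheva-hiriq is already in canonical order; patah-hiriq is not.
sheva_hiriq_ok = in_canonical_order(SHEVA + HIRIQ)
patah_hiriq_ok = in_canonical_order(PATAH + HIRIQ)

# Normalisation therefore swaps patah-hiriq but leaves sheva-hiriq alone.
typed = LAMED + PATAH + HIRIQ
normalised = unicodedata.normalize("NFC", typed)
```

Here normalised holds lamed, hiriq, patah: the order as typed is lost as
soon as the text passes through any normalisation form.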
> The solution for this sequence is as follows: Define a new combining
> character, something like HEBREW LIGATURE PATAH HIRIQ, with a canonical
> decomposition of hiriq - patah (yes, that way round) and a glyph with a
> hiriq to the left of a patah. How does this help? Well, it will not
> affect users who type patah then hiriq, in non-canonical order, into an
> application which does not immediately normalise the text, as the
> renderer will still render hiriq to the left of patah as typed. But
> when this text is normalised into NFC, the sequence will first be
> reordered as hiriq - patah, and then this combination will be composed
> into the new ligature. That is correct, isn't it? So an application
> which renders the NFC text will see the new character and should render
> it according to its glyph. In NFD text, the hiriq - patah sequence
> remains, but it is, I think, customary if not required for the renderer
> to combine the glyphs into the defined ligature before rendering. So in
> every case the end user sees hiriq to the left of patah, although in
> fact the underlying encoding is reversed.
> 
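The intended behaviour can be modelled in a few lines of Python. This is a
sketch under stated assumptions, not real Unicode behaviour: the proposed
ligature does not exist, so the private-use code point U+E000 stands in
for it, and the two function names are hypothetical. The final lines look
at an existing character with a similar two-mark decomposition, U+0344,
which bears on the "That is correct, isn't it?" question.

```python
import unicodedata

HIRIQ, PATAH = "\u05B4", "\u05B7"
LAMED = "\u05DC"      # arbitrary base letter (assumption)
LIGATURE = "\uE000"   # private-use stand-in for the proposed character


def nfc_as_proposed(text: str) -> str:
    """Model of the hoped-for NFC: real canonical reordering, then a
    simulated composition of hiriq + patah into the stand-in ligature."""
    reordered = unicodedata.normalize("NFC", text)
    return reordered.replace(HIRIQ + PATAH, LIGATURE)


def nfd_as_proposed(text: str) -> str:
    """Model of NFD: expand the stand-in ligature, then normalise."""
    expanded = text.replace(LIGATURE, HIRIQ + PATAH)
    return unicodedata.normalize("NFD", expanded)


# Either typing order ends up as base letter + ligature in the model.
from_typed_order = nfc_as_proposed(LAMED + PATAH + HIRIQ)
from_canonical_order = nfc_as_proposed(LAMED + HIRIQ + PATAH)

# Existing analogue: U+0344 decomposes into two combining marks.
analogue_decomposition = unicodedata.decomposition("\u0344")
analogue_recomposed = unicodedata.normalize("NFC", "\u0308\u0301")
```

In current Unicode data the analogue is not recomposed by NFC
(analogue_recomposed is still the two separate marks), so the composition
step assumed in nfc_as_proposed may not follow automatically from simply
giving the new character a decomposition.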
> Have I missed anything vital here? I know that more study may be needed
> of interaction with cantillation marks, some of which can appear
> between the patah and the hiriq.
> 
> Of course we could simply store the reversed order without defining a
> new character. But renderers would then need clear instruction
> somewhere in the Unicode text that, as an exception to the normal rules
> for rendering multiple diacritics, the hiriq should be positioned to
> the left of the patah, and similarly for the other attested sequences.
> 
> -- 
> Peter Kirk
> [EMAIL PROTECTED]
> http://web.onetel.net.uk/~peterkirk/
> 
> 
> 
> 

