Correction:

05C3 (not 05C0) is a punctuation mark often used in unpointed religious
books to mark the end of a sentence, similar to a full stop.

05BE is the Hebrew hyphen.

Neither should be folded in the general case.
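
As a rough illustration of the point above, here is a minimal Python
sketch of a "remove points" fold. It is not taken from any specification.
The function name strip_points and the flag fold_accent_punctuation are
hypothetical, and the sketch assumes the input uses plain Hebrew letters
followed by combining marks (no precomposed presentation forms). By
default it drops only the combining marks and leaves 05BE and 05C3
untouched; the optional flag applies the alternative suggested in the
quoted message below (delete 05C0, replace 05BE with a space).

```python
import unicodedata

MAQAF = "\u05BE"      # Hebrew hyphen
PASEQ = "\u05C0"      # paseq/legarmeh
SOF_PASUQ = "\u05C3"  # end-of-sentence punctuation; never touched here


def strip_points(text: str, fold_accent_punctuation: bool = False) -> str:
    """Drop Hebrew combining marks (points and accents) from text.

    Assumption: "points" means characters in U+0591..U+05C7 whose
    general category is Mn. Maqaf, paseq and sof pasuq are punctuation,
    not Mn, so they survive unless the optional flag is set.
    """
    out = []
    for ch in text:
        if "\u0591" <= ch <= "\u05C7" and unicodedata.category(ch) == "Mn":
            continue  # vowel point or cantillation accent
        if fold_accent_punctuation:
            if ch == PASEQ:
                continue  # delete the annotation
            if ch == MAQAF:
                ch = " "  # replace with a word-dividing space
        out.append(ch)
    return "".join(out)
```

With the default arguments, a pointed word joined by maqaf and closed by
sof pasuq keeps both of those characters and loses only its points.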

Jony

> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Peter Kirk
> Sent: Monday, July 19, 2004 8:53 PM
> To: Mark E. Shoulson
> Cc: Jony Rosenne; 'Unicode List'
> Subject: Re: Folding algorithm and canonical equivalence
> 
> 
> On 19/07/2004 03:20, Mark E. Shoulson wrote:
> 
> > ...
> >
> > Jony's right: when it's down to brass tacks in Hebrew, it's consonants
> > and whitespace (and punctuation, I guess).
> >
> Agreed. But then there are a few characters which are not combining
> marks but which are really part of the accent system and so should
> perhaps be stripped when points are removed: 05C0 paseq/legarmeh, which
> should be deleted; and 05BE maqaf, which should be replaced by a (word
> dividing) space. For 05C0 is an annotation which certainly has no place
> in an unpointed text; and in an accented text whether two words are
> separated by maqaf or space depends on their accentuation, and space is
> always used in unaccented texts.
> 
> Within the biblical text it would also be logical to delete 05C3 sof
> pasuq, but its use elsewhere as punctuation suggests otherwise.
> 
> --
> Peter Kirk
> [EMAIL PROTECTED] (personal)
> [EMAIL PROTECTED] (work)
> http://www.qaya.org/

