On 07/07/2003 8:52 AM, Peter Kirk wrote:
> On 06/07/2003 17:22, John Hudson wrote:
> > ... Given the small number of attested sequences that would be
> > adversely affected by normalisation re-ordering, I'm beginning to
> > favour the idea of encoding these sequences as individual characters.
> > We'd probably only need three or four, plus a right meteg, to solve
> > the problem, and rendering would work fine with existing font and
> > layout engine technologies.
>
> This sounds like a sensible alternative.

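For readers who have not seen the re-ordering under discussion, here is a minimal Python sketch of it. It uses only the standard `unicodedata` module. The choice of lamed with patah and hiriq is taken from the example in this thread, and the helper name `describe` is made up for illustration. The sketch assumes the combining classes published in the Unicode Character Database (patah 17, hiriq 14), which is why canonical ordering places hiriq before patah regardless of the order typed.

```python
import unicodedata

LAMED = "\u05DC"  # HEBREW LETTER LAMED
PATAH = "\u05B7"  # HEBREW POINT PATAH
HIRIQ = "\u05B4"  # HEBREW POINT HIRIQ


def describe(text):
    """Return (character name, canonical combining class) for each code point."""
    return [(unicodedata.name(ch), unicodedata.combining(ch)) for ch in text]


# The order a user might type: lamed, then patah, then hiriq.
typed = LAMED + PATAH + HIRIQ

# Normalization sorts adjacent combining marks by combining class,
# so the two points swap places.
normalized = unicodedata.normalize("NFC", typed)

for label, text in (("typed", typed), ("normalized", normalized)):
    print(label)
    for name, ccc in describe(text):
        print(f"  {name} (combining class {ccc})")
```

Because the two points carry different non-zero combining classes, the typed order cannot survive normalization, which is the loss of information the quoted proposal is trying to work around.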
This would make data entry difficult for users. Nobody thinks of these
character sequences as single characters. Editing would also be an
"interesting" experience. Could one search for lamed-patah and find it as
part of lamed-<patah+hiriq>? Or would the proposal be to use these new
codes only as part of bookend processing around normalization (i.e.,
automatically recognize the sequences and substitute, normalize, and then
automatically substitute back)?

I think we need to keep Peter Constable's point in mind that current usage
should not define the limits of Unicode functionality. Since the principle
is that all sequences of character codes are permitted (2.10), it seems
wrong to supply a fix for only "the small number of attested sequences".

In view of this principle, the current combining class values are at odds
with definition D46 (combining class; section 3.11) as well as with the
discussion in 2.10 on multiple combining characters. That is what should
be fixed.

Ted

Ted Hopp, Ph.D.
ZigZag, Inc.
[EMAIL PROTECTED]
+1-301-990-7453

newSLATE is your personal learning workspace
...on the web at http://www.newSLATE.com/

