On Thursday, July 31, 2003 4:56 PM, John Cowan wrote:

> Unicode allows any combining character to be attached to any base character
> whatsoever. However, putting a dagesh into a DEVANAGARI KA, or placing a
> circumflex over an ARABIC MEEM, is pretty certain to cause bad rendering, and
> may screw up other text processes such as syllabication.

From Unicode 3.2, Chapter 8 [regarding shin and sin dot]: "The two dots are
mutually exclusive. The base letter shin can also have dagesh, a vowel, and
other diacritics. Use of the two dots with any other base character is an
error."

Sometimes, doing something that's allowed can still be an error.

> > Would FB4B continue to decompose into 05D5 05B9?
>
> Yes. Normalization stability requires it.

That's what I thought.

> > It seems to me that either I'm misinterpreting things, or most people in
> > this discussion would prefer a new combining character to a new base
> > character. If this is so, I'd appreciate an explanation of why, because I
> > don't understand it.
>
> Assertions of the form "Mark X is only used with base form Y" have proven to
> be false too often in the past.

All the more reason to avoid introducing more marks.

Ted

Ted Hopp, Ph.D.
ZigZag, Inc.
[EMAIL PROTECTED]
+1-301-990-7453

newSLATE is your personal learning workspace
   ...on the web at http://www.newSLATE.com/
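
For readers who want to poke at the two points above, here is a minimal
sketch in Python using only the standard unicodedata module. The helper
name misplaced_shin_sin_dots is hypothetical, and treating every character
with combining class 0 as a new base is a simplifying assumption; this is
an illustration of the Chapter 8 rule, not a validator defined by Unicode.

```python
import unicodedata

SHIN = "\u05E9"      # HEBREW LETTER SHIN
SHIN_DOT = "\u05C1"  # HEBREW POINT SHIN DOT
SIN_DOT = "\u05C2"   # HEBREW POINT SIN DOT
VAV_HOLAM = "\uFB4B" # HEBREW LETTER VAV WITH HOLAM


def canonical_decomposition(ch):
    """Return the NFD form of a single character."""
    return unicodedata.normalize("NFD", ch)


def misplaced_shin_sin_dots(text):
    """List (index, code point) for shin/sin dots that break the quoted rule.

    A dot is reported when its base is not shin, or when it is the second
    dot on the same base (the two dots are mutually exclusive).
    Assumption: any character with combining class 0 starts a new base.
    """
    problems = []
    base = None
    dots_on_base = 0
    for i, ch in enumerate(text):
        if unicodedata.combining(ch) == 0:
            base = ch
            dots_on_base = 0
            continue
        if ch in (SHIN_DOT, SIN_DOT):
            dots_on_base += 1
            if base != SHIN or dots_on_base > 1:
                problems.append((i, f"U+{ord(ch):04X}"))
    return problems


if __name__ == "__main__":
    # FB4B still decomposes to vav (05D5) + holam (05B9).
    print([f"{ord(c):04X}" for c in canonical_decomposition(VAV_HOLAM)])
    # Shin + dagesh + shin dot is fine; a shin dot on vav is flagged.
    print(misplaced_shin_sin_dots("\u05E9\u05BC\u05C1"))
    print(misplaced_shin_sin_dots("\u05D5\u05C1"))
```

Note that the checker says nothing about a dagesh on DEVANAGARI KA: the
text quoted from Chapter 8 only calls out the two dots, so that is all the
sketch tests for.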

