On 07/22/2002 03:38:50 PM Kenneth Whistler wrote:
>Abstract character
>
>   that which is encoded; an element of the repertoire (existing
>   independent of the character encoding standard, and often
>   identifiable in other character encoding standards, as well
>   as the Unicode Standard); the implicit basis of transcodings.

[snip]

>> - do <U+00C5> (Å) and <U+0041, U+030A> (A followed by combining ring
>> above) represent the same abstract character?
>
>Yes. That is the implicit claim behind a specification of canonical
>equivalence.

This brings to mind another question: what's the relationship between
character sequences and abstract characters? Does < 0041, 030A > represent
a single abstract character or a sequence of abstract characters? Ken's
answer above suggests a single abstract character.

Actually, the question that's really bothering me is the next one. Moving
one step further (perhaps you already guessed where I was going), what of
< 1000, 102D, 102F >? Whether we consider it a single abstract character,
or a sequence of abstract characters, the more important question to me is
whether it is the same abstract character (sequence) as
< 1000, 102F, 102D >. The only thing that makes sense is that they are the
same abstract character sequences. But, they are not canonically
equivalent!

Is the contrapositive to your statement true? I.e. is it true that lack of
canonical equivalence implies a distinction in abstract character
(sequences)?


- Peter


---------------------------------------------------------------------------
Peter Constable

Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485
E-mail: <[EMAIL PROTECTED]>
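
For anyone who wants to check the two cases discussed above, here is a minimal sketch using Python's standard `unicodedata` module. It assumes the usual test for canonical equivalence (two strings are canonically equivalent if their NFD forms are identical). The helper name `canonically_equivalent` and the variable names are illustrative only, and the results depend on the Unicode data shipped with the Python build in use.

```python
import unicodedata


def canonically_equivalent(a: str, b: str) -> bool:
    """Hypothetical helper: compare the NFD forms of two strings."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


# Latin case: precomposed A-ring vs. A followed by combining ring above.
precomposed = "\u00C5"
decomposed = "\u0041\u030A"

# Myanmar case: KA with vowel signs I and U in the two possible orders.
ka_i_u = "\u1000\u102D\u102F"
ka_u_i = "\u1000\u102F\u102D"

latin_equivalent = canonically_equivalent(precomposed, decomposed)
myanmar_equivalent = canonically_equivalent(ka_i_u, ka_u_i)

# Canonical combining classes of each code point in the Myanmar sequence.
# Canonical reordering only permutes marks with non-zero classes, so a
# class of 0 means the mark is never moved during normalization.
myanmar_classes = [unicodedata.combining(ch) for ch in ka_i_u]

print(latin_equivalent)    # True
print(myanmar_equivalent)  # False
print(myanmar_classes)     # [0, 0, 0]
```

Both vowel signs carry canonical combining class 0, so normalization leaves their order untouched. That is why the two Myanmar sequences stay distinct under every normalization form, even though they would be expected to display identically.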

