On 09/12/2000 02:59:38 PM Kenneth Whistler wrote:

[snip]

I think Ken's comments on planes are good.


>3. The term "surrogate character" should be eschewed altogether, because
>   of the confusion it causes. "Surrogate code point" can continue to
>   be used as it currently is, and the term "surrogate pair" is also
>   useful. But the other terminology related to characters...

The other terminology Ken discussed had to do with the plane in which a
character is found. What I think is still open is how d800 - dfff get
referred to. Ken indicated that "surrogate code point" can continue in use
as is; I don't recall exactly how TUS 3.0 uses it. (Would have made for a
rather challenging trivia question :-) ) My biggest concern here is that
people should not be referring to U+d800 - U+dfff as characters. (I'd be
willing to accept code point, provided there is a clear statement as to
what is meant by a code point.) For that matter, I'd be inclined to say
that the U+ notation should not be used here - U+ should be reserved for
use to refer to encoded characters in terms of their Unicode scalar values.
So, 0xd800 is OK, but U+d800 would be wrong.
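
To make the distinction concrete, here is a minimal Python sketch of the convention I'm suggesting. The function names are hypothetical, and the formatting rule reflects only my proposal above (U+ for Unicode scalar values, plain hex for d800 - dfff), not anything I'm claiming TUS 3.0 requires.

```python
# Sketch only: names below are illustrative, not from any standard API.
SURROGATE_FIRST = 0xD800   # first value in the surrogate range
SURROGATE_LAST = 0xDFFF    # last value in the surrogate range
MAX_CODE_POINT = 0x10FFFF  # upper end of the Unicode codespace


def is_surrogate_code_point(value: int) -> bool:
    """True if value falls in the range d800 - dfff."""
    return SURROGATE_FIRST <= value <= SURROGATE_LAST


def is_unicode_scalar_value(value: int) -> bool:
    """True if value is in the codespace and is not a surrogate."""
    return 0 <= value <= MAX_CODE_POINT and not is_surrogate_code_point(value)


def format_value(value: int) -> str:
    """Use U+ notation only for scalar values; plain hex for surrogates."""
    if not 0 <= value <= MAX_CODE_POINT:
        raise ValueError("value is outside the Unicode codespace")
    if is_unicode_scalar_value(value):
        return f"U+{value:04X}"
    return f"0x{value:04X}"
```

Under this convention, format_value(0x0041) gives a U+ form, while format_value(0xD800) falls back to plain hex because d800 is not a scalar value.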



- Peter


---------------------------------------------------------------------------
Peter Constable

Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485
E-mail: <[EMAIL PROTECTED]>

