David Starner, normally <[EMAIL PROTECTED]> but on this occasion
<[EMAIL PROTECTED]>, wrote:

> I was having some problems with a test of my SCSU decoder recently,
> and I discovered it was due to my decoder rejecting 10FFFF as a valid
> Unicode value (because it ends in FFFF.) The fourth test pattern,
> Section 9.4 of Tech Report 6 (SCSU) uses DBFF DFFF as a surrogate
> pair, which is 10FFFF. Is this wrong, or is there something I'm
> overlooking?

Good question.  Unicode scalar values ending in FFFE and FFFF do not
represent valid characters, but by definition D29 (recently clarified
for me) a UTF must encode and decode these values.  SCSU is not a UTF,
but my guess is that this requirement should apply to SCSU as well.

I think the SCSU decoder should go ahead and decode the 0B BF FF and
subsequent 15 FF as U+10FFFF, and leave the job of deciding which values
are valid or invalid to the higher-level process that interprets them.
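
To make the arithmetic concrete, here is a rough Python sketch, not a
real decoder.  The function names are mine and purely illustrative.  It
assumes my reading of the SCSU report is right: 0B is SDX (define
extended window) followed by two bytes, 15 is SC5 (select window 5),
and a data byte 80..FF maps to the window offset plus (byte - 80).  The
non-character check covers only the FFFE/FFFF case discussed here and
stands in for whatever the higher-level process would do.

```python
def surrogate_pair_to_scalar(high, low):
    """Combine a UTF-16 surrogate pair into a Unicode scalar value."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


def sdx_window(hbyte, lbyte):
    """Window index and offset from the two bytes after SDX (0x0B).

    Assumed layout: top 3 bits of hbyte are the window index, the
    remaining 13 bits (with lbyte) scale by 0x80 above 0x10000.
    """
    index = hbyte >> 5
    offset = 0x10000 + 0x80 * (((hbyte & 0x1F) << 8) | lbyte)
    return index, offset


def ends_in_fffe_or_ffff(scalar):
    """Hypothetical higher-level check, kept out of the decoder itself."""
    return (scalar & 0xFFFE) == 0xFFFE


# The surrogate pair from the test pattern.
pair_value = surrogate_pair_to_scalar(0xDBFF, 0xDFFF)

# 0B BF FF defines the window; a data byte FF in that window (reached
# again later via 15 FF) then decodes to offset + (0xFF - 0x80).
window_index, window_offset = sdx_window(0xBF, 0xFF)
decoded = window_offset + (0xFF - 0x80)

print(hex(pair_value), window_index, hex(window_offset), hex(decoded))
```

The decoder part hands back 10FFFF without complaint; only
ends_in_fffe_or_ffff(), which belongs to the caller, has an opinion
about whether that value is a character.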

-Doug Ewell
 Fullerton, California
