In a message dated 2001-07-17 2:24:44 Pacific Daylight Time, 
[EMAIL PROTECTED] writes:

>  The document character set used by HTML
>  is Unicode, but some characters have been disallowed, and may not
>  appear in documents, whether directly or by reference. These are
>
>   U+0000 - U+0009
>   U+000B - U+000C
>   U+000E - U+0019
>   U+007F - U+009F
>   U+D800 - U+DFFF 

This list, and others like it, needs to be updated to include the 
non-characters (0xFDD0 through 0xFDEF, plus all code points whose low-order 
16 bits are 0xFFFE or 0xFFFF).
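
For anyone updating such a list, here is a minimal sketch of the test in
Python. The function name is_noncharacter is my own for illustration and
is not taken from any spec or library.

```python
def is_noncharacter(cp: int) -> bool:
    """Return True if code point cp is a Unicode non-character."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("not a Unicode code point: %r" % (cp,))
    # The contiguous block 0xFDD0 through 0xFDEF (32 code points).
    if 0xFDD0 <= cp <= 0xFDEF:
        return True
    # The last two code points of each of the 17 planes:
    # low-order 16 bits equal to 0xFFFE or 0xFFFF.
    return (cp & 0xFFFF) in (0xFFFE, 0xFFFF)


# Collect every non-character in the full code point range.
NONCHARACTERS = [cp for cp in range(0x110000) if is_noncharacter(cp)]
```

That gives 32 code points from the 0xFDD0 block plus 2 for each of the 17
planes, so NONCHARACTERS should hold 66 entries.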

I was just looking through the XML spec today, and the only non-characters 
it excludes are 0xFFFE and 0xFFFF (the surrogates are excluded as well).

-Doug Ewell
 Fullerton, California
