Marcin 'Qrczak' Kowalczyk scripsit:
> http://www.w3.org/TR/2000/REC-xml-20001006#charsets
> implies that the appropriate level for parsing XML is code points.
You are reading the XML Recommendation incorrectly. It is not defined
in terms of code points (8-bit, 16-bit, or 32-bit) but in terms of
characters. XML processors are required to process UTF-8 and UTF-16,
and may or may not process other character encodings. But the internal
model is that of characters. Thus surrogate code points are not
allowed.
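
As a rough illustration of the point, here is a minimal Python sketch of
the Char production from section 2.2 of the Recommendation cited above.
The function name is_xml_char is hypothetical, chosen only for this
example; the ranges are those of the XML 1.0 Char production. It shows
that a surrogate value on its own is not a character, while a UTF-16
surrogate pair decodes to a single character that is allowed.

```python
def is_xml_char(cp: int) -> bool:
    """Return True if the integer cp falls within the XML 1.0 Char production.

    Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD]
             | [#x10000-#x10FFFF]
    (is_xml_char is a hypothetical helper name, not part of any library.)
    """
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


# Two UTF-16 code units (a surrogate pair) decode to one character, U+10000.
decoded = b"\xd8\x00\xdc\x00".decode("utf-16-be")
pair_length = len(decoded)
pair_value = ord(decoded)

# The decoded character is allowed; the individual surrogate values are not.
pair_allowed = is_xml_char(pair_value)
high_surrogate_allowed = is_xml_char(0xD800)
low_surrogate_allowed = is_xml_char(0xDC00)
```

The check works on the decoded character, not on the 8-bit or 16-bit
units of whatever encoding the document arrived in, which is why the
surrogate range D800-DFFF is simply missing from the production.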
--
John Cowan www.reutershealth.com www.ccil.org/~cowan [EMAIL PROTECTED]
Arise, you prisoners of Windows / Arise, you slaves of Redmond, Wash,
The day and hour soon are coming / When all the IT folks say "Gosh!"
It isn't from a clever lawsuit / That Windowsland will finally fall,
But thousands writing open source code / Like mice who nibble through a wall.
--The Linux-nationale by Greg Baker