Re: [REPOST, LONG] XML and tags (LONG) - SCSU for XML

John Cowan Fri, 21 Feb 2003 19:41:52 -0800

Markus Scherer scripsit:

> Yes. Any reasonable SCSU encoder will stay in the ASCII-compatible
> single-byte mode until it sees a character from beyond Latin-1. Thus
> the encoding declaration will be ASCII-readable.


Indeed, there is no requirement that the encoding declaration be
ASCII-readable.  A parser can perfectly well handle EBCDIC or other
non-ASCII-compatible encodings, provided a proper declaration expressed
in that encoding is present.
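
A rough sketch of how such a parser might bootstrap itself, in Python.
The function name and the choice of cp037 as the bootstrap codec are
assumptions made for illustration; the two byte patterns are the ones
listed for "<?xm" in Appendix F of the XML 1.0 Recommendation.  The idea
is to guess the encoding family from the first four bytes, decode just
enough to read the declaration, and then switch to the declared encoding.

```python
import re

# Byte patterns for "<?xm" in two encoding families (XML 1.0, Appendix F).
ASCII_FAMILY = b"\x3c\x3f\x78\x6d"
EBCDIC_FAMILY = b"\x4c\x6f\xa7\x94"

_ENCODING_ATTR = re.compile(
    r'encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def sniff_declared_encoding(data: bytes):
    """Return the encoding name from the XML declaration, or None.

    Hypothetical helper: handles only the two families above.
    """
    if data.startswith(EBCDIC_FAMILY):
        # Assumption: cp037 is close enough to read the declaration,
        # whatever EBCDIC flavor the document really uses.
        bootstrap = "cp037"
    elif data.startswith(ASCII_FAMILY):
        bootstrap = "ascii"
    else:
        return None
    # Decode only the head of the document with the bootstrap codec.
    head = data[:200].decode(bootstrap, errors="replace")
    match = _ENCODING_ATTR.search(head)
    return match.group(1) if match else None


# An EBCDIC document whose declaration is itself written in EBCDIC.
doc = (
    '<?xml version="1.0" encoding="cp037"?>'
    "<greeting>hello</greeting>"
).encode("cp037")

name = sniff_declared_encoding(doc)   # read the declaration
text = doc.decode(name)               # then decode the whole document
```

Nothing in the first bytes of doc is readable as ASCII, yet the
declaration is recovered because the family can be guessed first.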

To be sure, some encodings, like US-BSCII, are problematic.  US-BSCII is
the same as US-ASCII except that 0x41 is B and 0x42 is A; the trouble
being of course that the string "US-ASCII" encoded in US-ASCII uses the
same bytes as the string "US-BSCII" encoded in US-BSCII.  But such a thing
is not likely to happen except through perversity such as this.
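
The collision can be checked mechanically.  The sketch below models
US-BSCII in Python as US-ASCII with bytes 0x41 and 0x42 exchanged; the
helper names are made up for this illustration, since no real codec of
that name exists.

```python
# US-BSCII modeled as US-ASCII with 0x41 and 0x42 swapped.
_SWAP = bytes.maketrans(b"AB", b"BA")


def bscii_encode(text: str) -> bytes:
    """Encode text in the (fictional) US-BSCII encoding."""
    return text.encode("ascii").translate(_SWAP)


def bscii_decode(data: bytes) -> str:
    """Decode US-BSCII bytes back to text."""
    return data.translate(_SWAP).decode("ascii")


as_ascii = "US-ASCII".encode("ascii")   # the name, in US-ASCII
as_bscii = bscii_encode("US-BSCII")     # the other name, in US-BSCII

# Both declarations produce identical bytes, so a parser reading the
# encoding name cannot tell which of the two encodings was meant.
same_bytes = as_ascii == as_bscii       # True
```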

-- 
John Cowan           http://www.ccil.org/~cowan              [EMAIL PROTECTED]
To say that Bilbo's breath was taken away is no description at all.  There
are no words left to express his staggerment, since Men changed the language
that they learned of elves in the days when all the world was wonderful.
        --_The Hobbit_
