[android-developers] Re: SAXParser throws exception for bad character in CDATA block, bug???

Bob Kerns Thu, 21 Apr 2011 17:46:52 -0700

Oh, one other model that may help put all this in context -- and make it 
plain that there really is no other way for it to be.


The confusion here comes from conflating bytes with characters. XML works with 
characters.

The encoding (UTF-8, ISO-8859-1) of an XML document refers to how a stream 
of characters is encoded into a stream of bytes -- and, on input, bytes into 
characters.
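
To make the byte/character distinction concrete, here is a minimal sketch in Java. The class name, method names, and the sample string are made up for illustration and are not from the thread. It encodes the same four characters under both encodings and shows that the resulting byte streams differ.

```java
import java.nio.charset.StandardCharsets;

public class EncodingDemo {
    // Encode the characters of s as ISO-8859-1 bytes (one byte per character).
    static byte[] latin1Bytes(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }

    // Encode the same characters as UTF-8 bytes (non-ASCII characters take 2+ bytes).
    static byte[] utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String text = "caf\u00e9"; // four characters: c, a, f, e-acute

        // Same characters, different byte streams.
        System.out.println("ISO-8859-1 length: " + latin1Bytes(text).length); // 4
        System.out.println("UTF-8 length:      " + utf8Bytes(text).length);   // 5
    }
}
```

The e-acute is the single byte 0xE9 in ISO-8859-1 but the two bytes 0xC3 0xA9 in UTF-8, so a reader has to know which encoding was used before it can recover the characters.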

XML is then defined in terms of a series of characters.

CDATA then says how those characters are treated.

The failure here happens at the level of interpreting the bytes. The bytes 
that your byte stream is presenting cannot be converted into ISO-8859-1 
characters.

FAIL. It hasn't even gotten to the XML parser. Depending on the buffering 
going on, in fact, the XML parser may not yet even have seen the start of 
the CDATA section.
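
A rough sketch of a failure at the decoding layer, with no XML parser involved at all. This is an illustration under assumptions, not the setup from the thread: it uses a strict UTF-8 decoder as a stand-in for whatever decoder sits in front of the parser, because that case is easy to reproduce, and the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class StrictDecodeDemo {
    // Returns the decoded text, or null if the bytes are not valid in the given charset.
    static String decodeStrict(byte[] bytes, Charset charset) {
        try {
            // A fresh CharsetDecoder reports malformed input by throwing,
            // rather than silently substituting a replacement character.
            return charset.newDecoder().decode(ByteBuffer.wrap(bytes)).toString();
        } catch (CharacterCodingException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // A CDATA section whose content was written out as ISO-8859-1:
        // the e-acute is the single byte 0xE9.
        byte[] bytes = "<![CDATA[\u00e9]]>".getBytes(StandardCharsets.ISO_8859_1);

        // Decoding under the wrong (assumed) charset fails before any
        // XML-level rule, CDATA included, gets a chance to apply.
        System.out.println(decodeStrict(bytes, StandardCharsets.UTF_8)); // null
    }
}
```

Nothing in this code knows what CDATA means. The byte 0xE9 followed by `]` is simply not a legal UTF-8 sequence, so the decoder gives up on its own.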

The failure here is simply that a supposed ISO-8859-1 stream -- is not. It 
really has nothing to do with CDATA or XML at all.

Jens' approach operates at this level -- it turns a byte stream that is not 
a valid ISO-8859-1-encoded byte stream into one that at least *appears* to 
be an ISO-8859-1-encoded byte stream.

If it happens to be a UTF-8 byte stream (ISO-10646 encoded via UTF-8), the 
result will be gibberish -- even many characters that exist in ISO-8859-1 
will come out as misinterpreted bytes, rather than as the characters they 
started out life as.
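
A small sketch of that gibberish, assuming a UTF-8 byte stream that gets read as if it were ISO-8859-1. The class name, method name, and sample string are hypothetical, chosen only for illustration.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    // Encode as UTF-8, then decode those same bytes as if they were ISO-8859-1.
    static String misreadUtf8AsLatin1(String original) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String original = "caf\u00e9"; // e-acute exists in ISO-8859-1
        String garbled = misreadUtf8AsLatin1(original);

        // The two UTF-8 bytes 0xC3 0xA9 are each read as a separate
        // ISO-8859-1 character: U+00C3 (A-tilde) and U+00A9 (copyright sign).
        System.out.println(garbled.equals("caf\u00c3\u00a9")); // true
        System.out.println(garbled.length());                  // 5, not 4
    }
}
```

No exception is thrown here, which is the worrying part: the stream *appears* valid, but the e-acute has silently become two unrelated characters.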

That's why I don't like Jens' solution. (I doubt Jens likes it either -- 
this is a "hold your nose" sort of situation). It really has the potential 
to make things much worse.

I hope that makes it a bit more clear.

