Unfortunately, it appears he's parsing a vBulletin RSS feed, i.e. it's
most likely something he has little or no control over. If the host he
is scraping ever upgrades to a version that fixes this odd behavior
(i.e. mixing character encodings), his app should not break, provided
he supplies a Reader to the "emergency fix" that actually uses the
character set declared by the server.
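
To make the Reader idea concrete, here is a rough sketch. It assumes the
declared character set arrives in the HTTP Content-Type header, and the
class and method names (DeclaredCharsetReader, charsetFrom, open) are made
up for illustration; they are not from his code or from any library.

```java
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;

// Hypothetical helper: builds a Reader that decodes the response body with
// the charset named in a Content-Type value such as
// "text/xml; charset=ISO-8859-1".
final class DeclaredCharsetReader {

    // Returns the charset named by the "charset=" parameter, or the
    // fallback when the parameter is missing, malformed, or unsupported.
    static Charset charsetFrom(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                String name = p.substring(8).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Covers both illegal and unsupported charset names.
                    return fallback;
                }
            }
        }
        return fallback;
    }

    // Wraps the raw byte stream so that everything downstream sees
    // characters decoded with the declared charset.
    static Reader open(InputStream body, String contentType, Charset fallback) {
        return new InputStreamReader(body, charsetFrom(contentType, fallback));
    }
}
```

If the host later starts sending what it declares, a Reader built this way
keeps working without any change on his side.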

Fortunately, it appears he's parsing a vBulletin RSS feed, i.e.
transient events that hopefully aren't super important in anything but
the "near scope".

On 22 Apr, 02:46, Bob Kerns <r...@acm.org> wrote:
> Oh, one other model that may help put all this in context -- and make it
> plain that there really is no other way for it to be.
>
> The confusion here stems from confusing bytes and characters. XML works with
> characters.
>
> The encoding (UTF-8, ISO-8859-1) of an XML document refers to how a stream
> of characters is encoded into a stream of bytes -- and, on input, bytes into
> characters.
>
> XML is then defined in terms of a series of characters.
>
> CDATA then says how those characters are treated.
>
> The failure here happens at the level of interpreting the bytes. The bytes
> that your byte stream is presenting cannot be converted into ISO-8859-1
> characters.
>
> FAIL. It hasn't even gotten to the XML parser. Depending on the buffering
> going on, in fact, the XML parser may not yet even have seen the start of
> the CDATA section.
>
> The failure here is simply that a supposed ISO-8859-1 stream -- is not. It
> really has nothing to do with CDATA or XML at all.
>
> Jens' approach operates at this level -- it turns a byte stream that is not
> a valid ISO-8859-1-encoded byte stream, into one that at least *appears* to
> be an ISO-8859-1-encoded byte stream.
>
> If it happens to be a UTF-8 byte stream (ISO-10646 encoded via UTF-8), it
> will be gibberish -- even many valid ISO-8859-1 characters will turn into
> mis-interpreted, gibberish bytes, rather than the ISO-8859-1 characters
> they started out life as.
>
> That's why I don't like Jens' solution. (I doubt Jens likes it either --
> this is a "hold your nose" sort of situation). It really has the potential
> to make things much worse.
>
> I hope that makes it a bit more clear.
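
The bytes-versus-characters point above can be seen in a few lines of Java.
This is only a sketch: the class and method names are invented, and since
it is not stated here which way round the mismatch ran in the actual feed,
the strict check uses UTF-8, where a stray ISO-8859-1 byte is reliably
rejected.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;

// Hypothetical demo of what happens when bytes are decoded with the
// wrong charset, before any XML parser is involved.
final class EncodingMismatchDemo {
    static final Charset UTF8 = Charset.forName("UTF-8");
    static final Charset LATIN1 = Charset.forName("ISO-8859-1");

    // Strict decode: throws on malformed input instead of silently
    // substituting replacement characters.
    static String decodeStrict(byte[] bytes, Charset cs)
            throws CharacterCodingException {
        return cs.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(bytes))
                .toString();
    }

    // Encodes text as UTF-8, then reads those bytes back as ISO-8859-1.
    // Each multi-byte UTF-8 sequence comes out as several wrong characters.
    static String utf8ReadAsLatin1(String text) {
        return new String(text.getBytes(UTF8), LATIN1);
    }

    // True only if the bytes form a well-formed UTF-8 sequence.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            decodeStrict(bytes, UTF8);
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }
}
```

For example, utf8ReadAsLatin1("\u00e9") yields the two characters U+00C3
U+00A9 instead of a single e-acute, and isValidUtf8(new byte[] {(byte) 0xE9})
is false because a lone ISO-8859-1 e-acute byte is not valid UTF-8. Neither
result depends on CDATA or on anything else in the XML.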
