Unfortunately, it appears he's parsing a vBulletin RSS feed, i.e. it's most likely something he has little or no control over. And if the host he is scraping ever upgrades to a version that fixes this odd behavior (i.e. mixing character encodings), his app should not break, provided he supplies the "emergency fix" with a Reader that actually uses the character set declared by the server.
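To make the "Reader that uses the declared character set" idea concrete, here is a rough sketch. It assumes the server announces its encoding in the HTTP Content-Type header; the class name, both helper methods, and the UTF-8 fallback are made up for illustration and are not from his code or from any library.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

public class DeclaredCharsetReader {

    // Hypothetical helper: pull the charset parameter out of a Content-Type
    // value such as "text/xml; charset=UTF-8". Returns the fallback when the
    // header is missing, has no charset parameter, or names an unknown charset.
    static Charset charsetFromContentType(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = p.substring("charset=".length()).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Covers both illegal and unsupported charset names.
                    return fallback;
                }
            }
        }
        return fallback;
    }

    // Hypothetical helper: wrap the raw response bytes in a Reader that
    // decodes them with the declared charset (UTF-8 fallback is an assumption).
    static Reader openReader(InputStream body, String contentType) {
        Charset declared = charsetFromContentType(contentType, StandardCharsets.UTF_8);
        return new InputStreamReader(body, declared);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a response body; a real app would use the HTTP stream.
        byte[] bytes = "<title>caf\u00e9</title>".getBytes(StandardCharsets.UTF_8);
        Reader reader = openReader(new ByteArrayInputStream(bytes), "text/xml; charset=UTF-8");
        StringBuilder text = new StringBuilder();
        for (int c; (c = reader.read()) != -1; ) {
            text.append((char) c);
        }
        System.out.println(text);
    }
}
```

The point is only that the bytes-to-characters step happens in the Reader, before the XML parser sees anything, so a feed that starts behaving correctly keeps working without touching the workaround.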
Fortunately, it appears he's parsing a vBulletin RSS feed, i.e. transient events that hopefully aren't super important in anything but the "near scope".

On 22 Apr, 02:46, Bob Kerns <r...@acm.org> wrote:
> Oh, one other model that may help put all this in context -- and make it
> plain that there really is no other way for it to be.
>
> The confusion here stems from confusing bytes and characters. XML works with
> characters.
>
> The encoding (UTF-8, ISO-8859-1) of an XML document refers to how a stream
> of characters is encoded into a stream of bytes -- and, on input, bytes into
> characters.
>
> XML is then defined in terms of a series of characters.
>
> CDATA then says how those characters are treated.
>
> The failure here, happens at the level of interpreting the bytes. The bytes
> that your byte stream is presenting, cannot be converted into ISO-8859-1
> characters.
>
> FAIL. It hasn't even gotten to the XML parser. Depending on the buffering
> going on, in fact, the XML parser may not yet even have seen the start of
> the CDATA section.
>
> The failure here is simply that a supposed ISO-8859-1 stream -- is not. It
> really has nothing to do with CDATA or XML at all.
>
> Jens' approach operates at this level -- it turns a byte stream that is not
> a valid ISO-8859-1-encoded byte stream, into one that at least *appears* to
> be an ISO-8859-1-encoded byte stream.
>
> If it happens to be a UTF-8 byte stream (ISO-10646 encoded via UTF-8), it
> will be gibberish -- even many valid ISO-8859-1 characters will become
> gibberish bytes mis-interpreted, rather than the ISO-8859-1 characters they
> started out life as.
>
> That's why I don't like Jens' solution. (I doubt Jens likes it either --
> this is a "hold your nose" sort of situation). It really has the potential
> to make things much worse.
>
> I hope that makes it a bit more clear.