Hi,

After banging my head on this for weeks, I found the problem:
The issue was not device dependent but network access dependent. For some reason, when the pages are fetched through my mobile operator's network, their encoding is changed to UTF-8, as shown in the Content-Type HTTP header (Content-Type: text/html; Charset=UTF-8), whereas the HTML content still specifies ISO-8859-1 in its meta tag.

So the final solution is to:
1) grab the encoding from the HTTP Content-Type header, if any
2) if so, set the feature http://cyberneko.org/html/features/scanner/ignore-specified-charset to false
3) in the XMLInputSource constructor, pass "ISO-8859-1" by default, or the charset found in the Content-Type header, if any
4) in the filter's characters function, do no further decoding, encoding, getBytes or any other charset change; XMLString.toString() directly gives the correct string :)

Hope this will one day help another charset newbie ;)

Thierry.

2010/2/13 Thierry Legras <[email protected]>

> Thanks for your reply.
>
> Yes, this is a java.lang.String. Indeed, all I want to do is to correctly
> display the string in some View.
>
> OK, I got the point about Java Strings being 16 bits. If so, and as it is
> not displayed correctly, I guess this means it was not properly created in
> the first place.
>
> Maybe this issue is more related to my (bad) use of Xerces when I
> initialize the Xerces XMLDocumentFilter object.
>
> XMLParserConfiguration parser = new HTMLConfiguration();
> parser.setDocumentHandler(filter); // filter is an XMLDocumentFilter
> XMLInputSource source = new XMLInputSource(null, null, null,
>         myHttpResponse.getEntity().getContent(), "iso-8859-1");
> parser.parse(source);
>
> I will check the Xerces resources in more detail; this is probably not an
> Android-related topic after all.
>
> Thierry.
>
> 2010/2/13 Frank Weiss <[email protected]>
>
>> First, some clarifications. Locale has nothing to do with character
>> encoding. Java stores all character data internally as 16-bit Unicode,
>> regardless of locale.
>>
>> I suspect that myString.getBytes("iso-8859-1") is erroneous.
>> I'm assuming that myString is of type java.lang.String. What are you doing
>> with the result, and why do you want to encode a sequence of Unicode
>> characters back to ISO-8859-1 (Latin1)?

--
Thierry.

--
You received this message because you are subscribed to the Google Groups "Android Developers" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [email protected]
For more options, visit this group at http://groups.google.com/group/android-developers?hl=en
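
For anyone landing here later, steps 1 and 3 from the top of this message could look roughly like the sketch below. It only covers picking the charset name out of a Content-Type header value and falling back to ISO-8859-1. The class name ContentTypeCharset and the method charsetFromContentType are made up for illustration and are not part of Xerces, NekoHTML, or Android. Treating an unsupported or malformed charset name like a missing one is also an assumption, not something the original fix is known to do.

```java
import java.nio.charset.Charset;
import java.util.Locale;

// Hypothetical helper, not part of any library mentioned in this thread.
final class ContentTypeCharset {

    // Fallback taken from step 3 of the message: use ISO-8859-1 by default.
    static final String DEFAULT_CHARSET = "ISO-8859-1";

    // Returns the charset named in a Content-Type header value such as
    // "text/html; Charset=UTF-8". Falls back to DEFAULT_CHARSET when the
    // header is null, has no charset parameter, or names a charset this
    // JVM does not support (that last rule is an assumption of this sketch).
    static String charsetFromContentType(String contentType) {
        if (contentType == null) {
            return DEFAULT_CHARSET;
        }
        String[] parts = contentType.split(";");
        // parts[0] is the media type (e.g. "text/html"); parameters follow.
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            int eq = param.indexOf('=');
            if (eq < 0) {
                continue;
            }
            // Parameter names are case-insensitive ("Charset", "charset", ...).
            String name = param.substring(0, eq).trim().toLowerCase(Locale.ROOT);
            if (!name.equals("charset")) {
                continue;
            }
            String value = param.substring(eq + 1).trim();
            // Strip optional surrounding double quotes: charset="utf-8".
            if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                value = value.substring(1, value.length() - 1).trim();
            }
            if (value.isEmpty()) {
                continue;
            }
            try {
                if (Charset.isSupported(value)) {
                    return value;
                }
            } catch (IllegalArgumentException e) {
                // Malformed charset name: ignore it and keep looking.
            }
        }
        return DEFAULT_CHARSET;
    }
}
```

The returned name would then go in as the last argument of the XMLInputSource constructor, in place of the hard-coded "iso-8859-1" in the quoted snippet. Step 2 (the NekoHTML feature) and step 4 (leaving XMLString.toString() alone) are not covered by this sketch.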

