Good catch, Bob.
You are right, of course - I missed that the second call has no
encoding parameter.
The docs say this:
http://developer.android.com/reference/java/lang/String.html#String(byte[])
Converts the byte array to a string using the default encoding as
specified by the file.encoding system property. If the system property
is not defined, the default encoding is ISO8859_1 (ISO-Latin-1). If
8859-1 is not available, an ASCII encoding is used.
It looks like the default encoding is quite likely to be single-byte;
this is implied by the choice of ASCII (not UTF-8) as the fallback.
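
To illustrate what a single-byte default would do to UTF-8 data, here
is a minimal sketch. It assumes the default behaves like ISO-8859-1
(which is what the docs above suggest, not something I have verified on
a device). The class and method names are made up for the example, and
StandardCharsets comes from newer Java versions than the ones discussed
in this thread.

```java
import java.nio.charset.StandardCharsets;

public class SingleByteDecodeDemo {
    // Hypothetical helper: decodes bytes as ISO-8859-1, standing in for
    // a single-byte platform default (an assumption, see above).
    static String decodeAsLatin1(byte[] data) {
        return new String(data, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String original = "march\u00e9";                  // "marché"
        byte[] raw = original.getBytes(StandardCharsets.UTF_8);

        // In UTF-8 the e-acute takes two bytes (0xC3 0xA9), so 6 chars
        // become 7 bytes.
        System.out.println("bytes: " + raw.length);

        // A single-byte decoder turns every byte into one char, so the
        // two bytes of the e-acute come back as two separate characters.
        String wrong = decodeAsLatin1(raw);
        System.out.println("chars after single-byte decode: " + wrong.length());
        System.out.println("equal to original: " + wrong.equals(original));
    }
}
```

The decoded string ends up one character longer than the original,
which is the kind of damage you would expect from a one-argument
new String(rawData) call on such a platform.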
Then there is the deprecated "public String (byte[] data, int high)",
which I see as a failed attempt to fix things in a simple but incorrect
way.
Overall, it looks like the designers of Java did not push for proper and
consistent use of character encodings across the board when the language
was still young.
Over time, though, the Java standard library evolved to make consistent
use of encodings, because only that is guaranteed to give correct results.
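
In practice that means naming the encoding on both sides of the
conversion. A minimal sketch follows; the class and helper names are
hypothetical, and it uses the Charset-based overloads and
StandardCharsets, which may not be available on older platform versions
(the string form "UTF-8" from Simon's code works there instead).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ExplicitCharsetDemo {
    // Hypothetical helpers: always name the charset, never rely on the
    // platform default.
    static byte[] encode(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    static String decode(byte[] data) {
        return new String(data, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String myData = "hockey,march\u00e9,football";

        // Shows what the one-argument constructor would have used.
        System.out.println("Platform default: " + Charset.defaultCharset());

        // Same charset on both sides, so the round trip is lossless
        // regardless of the platform default.
        String roundTrip = decode(encode(myData));
        System.out.println("Round trip intact: " + roundTrip.equals(myData));
    }
}
```

With the charset spelled out in both calls, the result no longer
depends on file.encoding or on whatever the device happens to default
to.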
-- Kostya
10.11.2010 20:36, Bob Kerns wrote:
It's clearly not logcat that's the issue here, because the two strings
output differently. He's expecting them to be the same for some
reason.
It just now occurs to me that he may be assuming that the one-argument
version defaults to UTF-8; it defaults to *something*, but something
ill-specified that is probably never UTF-8. That's not how it's worded,
of course, but that's the effect.
I couldn't begin to tell you how many bugs I've tracked down and fixed
in people's code due to this.
On Nov 8, 12:19 pm, Kostya Vasilyev<[email protected]> wrote:
I wouldn't count on logcat output to be always correct with respect to
localization.
What do you get if you use decoded strings in a TextView (for example)?
-- Kostya
08.11.2010 19:46, Simon MacDonald wrote:
Hi all,
I'm wondering if I found a bug in Android. When I run this code on my
laptop:
String myData = "hockey,marché,football";
byte[] rawData;
rawData = myData.getBytes("UTF-8");
System.out.println("UTF-8 decoded: "+new String(rawData,"UTF-8"));
System.out.println("Default decoded: "+new String(rawData));
I get the output:
*UTF-8 decoded: hockey,marché,football*
*Default decoded: hockey,marché,football*
However, when I run the same code in an Android application and view
the output in "adb logcat" I get:
*D/FileUtils( 485): UTF-8 decoded: hockey,march�,football*
*D/FileUtils( 485): Default decoded: hockey,march�,football*
I get the same issue if I change the locale of my phone to French
(Canada) as well. It doesn't seem like French characters are getting
encoded properly.
Any thoughts?
Simon Mac Donald
http://hi.im/simonmacdonald
--
You received this message because you are subscribed to the Google
Groups "Android Developers" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/android-developers?hl=en
--
Kostya Vasilyev -- WiFi Manager + pretty widget -- http://kmansoft.wordpress.com