The current fetchJson implementation uses "new
String(results.getByteArray())" to convert the response bytes to a
string for inclusion in the JSON reply to the gadget.  According to
the Javadoc, the behavior of new String(byte[]) is unspecified "when
the given bytes are not valid in the default charset".

The default charset depends on the platform and locale the JVM runs
on, and the bytes returned from the remote server could be in any
encoding.  This is likely to cause problems (data corruption) for
gadgets fetching data from non-English web sites.
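
To make the failure mode concrete, here is a minimal sketch of what
happens when the charset used for decoding differs from the one the
remote server used.  The class and method names are hypothetical, and
explicit charsets stand in for "server encoding" and "default
charset" so the result does not depend on the machine running it.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the fetchJson code.
class CharsetMismatchDemo {

    // Encodes text as UTF-8 (standing in for the remote server) and then
    // decodes it as ISO-8859-1 (standing in for a mismatched default charset).
    static String decodeWithWrongCharset(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String original = "caf\u00e9"; // "cafe" with an acute accent on the e
        String decoded = decodeWithWrongCharset(original);
        // The single accented character becomes two unrelated characters.
        System.out.println("same text: " + original.equals(decoded));
        System.out.println("original length: " + original.length()
                + ", decoded length: " + decoded.length());
    }
}
```

ASCII-only responses survive this kind of mismatch, which is why the
problem tends to show up only for non-English content.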

I'll open up a JIRA issue for this, but I wanted to see whether anyone
had proposals for a solution.  The fix will probably involve using
CharsetDecoder, so we at least have well-defined behavior.  How we
pick the CharsetDecoder to use is an open question.  What to do when
decoding fails is another issue.  I'm tempted to put in a
quick fix that specifies UTF-8 for the character set.  That will
prevent anyone from depending on the current undefined behavior while
we work out what should happen.
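
As a rough sketch of the two options mentioned above (not the actual
fetchJson change), the following shows a strict CharsetDecoder that
reports invalid input, next to a UTF-8 quick fix.  The class and
method names are hypothetical.  The quick fix uses the
String(byte[], Charset) constructor, which is documented to replace
malformed input with the charset's replacement string, so its
behavior is well defined.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of the existing code base.
class ResponseDecoding {

    // Strict decoding: throws CharacterCodingException if the bytes are
    // not valid in the given charset, so the caller decides what to do.
    static String decodeStrict(byte[] bytes, Charset charset)
            throws CharacterCodingException {
        CharsetDecoder decoder = charset.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        return decoder.decode(ByteBuffer.wrap(bytes)).toString();
    }

    // Quick fix: always decode as UTF-8. Invalid sequences are replaced
    // with U+FFFD instead of producing unspecified results.
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

The strict variant leaves both open questions (which charset to
pass in, and how to react to a failure) to the caller, which is
where that policy would have to be decided anyway.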

Cheers,
Brian
