http://codereview.appspot.com/186278/diff/1/2
File
java/gadgets/src/main/java/org/apache/shindig/gadgets/servlet/MakeRequestHandler.java
(right):

http://codereview.appspot.com/186278/diff/1/2#newcode119
java/gadgets/src/main/java/org/apache/shindig/gadgets/servlet/MakeRequestHandler.java:119:
encoding = "UTF-8";
I love browser quirks.  I think that when browsers submit an HTML
form, they use the character encoding of the parent page.  But
XMLHttpRequest is different.

Safari, IE, Chrome -
   Send the raw bytes specified by the client.
   Send the raw Content-Type header specified by the client.
   No guarantee that the Content-Type header has a charset, or that the
   charset matches the data.

Firefox -
   Send the raw bytes specified by the client.
   Always set charset=UTF-8 in the Content-Type header.
   No guarantee that the charset matches the data.

I think that's a bug in FF.  I didn't test across operating systems or
browser versions, because I am lazy.  I also didn't test the impact of
OS locale settings.
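
To make the header-versus-data mismatch above concrete, here is a minimal Java sketch of what a server sees when it trusts a charset label that does not match the bytes. The class and method names are hypothetical and are not part of Shindig; the sample text is the same five characters used in the makeRequest experiment.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Shindig.
class CharsetMismatchDemo {
    // Decode a request body the way a server would if it trusted the declared charset.
    static String decode(byte[] body, Charset declared) {
        return new String(body, declared);
    }

    public static void main(String[] args) {
        // a-umlaut, e-umlaut, i-umlaut, o-umlaut, u-umlaut
        String text = "\u00e4\u00eb\u00ef\u00f6\u00fc";
        byte[] latin1Body = text.getBytes(StandardCharsets.ISO_8859_1);

        // Label and data agree: the text round-trips.
        System.out.println(text.equals(decode(latin1Body, StandardCharsets.ISO_8859_1)));

        // Label says UTF-8 but the bytes are ISO-8859-1: the malformed
        // sequences are replaced with U+FFFD and the original text is lost.
        System.out.println(decode(latin1Body, StandardCharsets.UTF_8).contains("\uFFFD"));
    }
}
```

This is the Firefox case: raw ISO-8859-1 bytes sent with a charset=UTF-8 label.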

Test cases for browser behavior are here:

http://sandblower.net/xhr-iso-8859-1-script-has-charset.html
http://sandblower.net/xhr-iso-8859-1.html
http://sandblower.net/xhr-utf8-script-has-charset.html
http://sandblower.net/xhr-utf8.html

I also experimented a bit with gadgets.io.makeRequest, and I do not
understand the results.  As far as I can tell, JavaScript magically
decides that the string 'test=\xe4\xeb\xef\xf6\xfc' is Latin-1, and then
converts it to UTF-8 before submission.  The submitted data (verified
with a packet capture) is:

postData=test%3D%C3%A4%C3%AB%C3%AF%C3%B6%C3%BC

I'm completely mystified by how that happens, but it does happen on
Safari, IE, FF, and Chrome.
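
A possible explanation, offered as an assumption rather than something checked against the makeRequest source: a JavaScript string holds Unicode characters rather than bytes, so '\xe4' is simply U+00E4, and URL-encoding functions such as encodeURIComponent always emit UTF-8. If makeRequest encodes its post data that way, the result would be the same in every browser. The Java sketch below (hypothetical class name, standard java.net.URLEncoder) applies the same kind of encoding to the same string and prints the value seen in the packet capture.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical demo class; a Java analogue, not the makeRequest code itself.
class FormEncodeDemo {
    // Percent-encode a value using its UTF-8 byte representation.
    static String encodeUtf8(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is required on every JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Same code points as the JavaScript literal 'test=\xe4\xeb\xef\xf6\xfc'.
        String value = "test=\u00e4\u00eb\u00ef\u00f6\u00fc";
        // Prints postData=test%3D%C3%A4%C3%AB%C3%AF%C3%B6%C3%BC
        System.out.println("postData=" + encodeUtf8(value));
    }
}
```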

If we can rely on that behavior, we're golden: the server can always
assume the data is UTF-8 encoded.
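
If the UTF-8 assumption ever needs a safety net, a strict decode can flag bodies that are not valid UTF-8 instead of silently substituting replacement characters. This is only a sketch with hypothetical names (Utf8BodyCheck, isValidUtf8, decodeUtf8), and it applies to raw body bytes rather than to an already percent-encoded string; it is not what MakeRequestHandler currently does.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; not part of Shindig.
class Utf8BodyCheck {
    // Returns true only if the bytes form a well-formed UTF-8 sequence.
    static boolean isValidUtf8(byte[] body) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(body));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Lenient decode: always treats the body as UTF-8, whatever the header says.
    static String decodeUtf8(byte[] body) {
        return new String(body, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String text = "test=\u00e4\u00eb\u00ef\u00f6\u00fc";
        System.out.println(isValidUtf8(text.getBytes(StandardCharsets.UTF_8)));
        System.out.println(isValidUtf8(text.getBytes(StandardCharsets.ISO_8859_1)));
    }
}
```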

http://codereview.appspot.com/186278/show
