[
https://issues.apache.org/jira/browse/SHINDIG-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matthieu Huguet updated SHINDIG-1229:
-------------------------------------
Attachment: decodeUtf8.diff
Here is a patch to fix this issue.
The regex was modified to follow the JSON RFC
(http://tools.ietf.org/html/rfc4627#section-2.5):
"Any character may be escaped. If the character is in the Basic
Multilingual Plane (U+0000 through U+FFFF), then it may be
represented as a six-character sequence: a reverse solidus, followed
by the lowercase letter u, followed by four hexadecimal digits that
encode the character's code point. The hexadecimal letters A though
F can be upper or lowercase. So, for example, a string containing
only a single reverse solidus character may be represented as
"\u005C"."
I hope it doesn't break anything...
Note that both before and after this patch, UTF-8 decoding is limited to the
Basic Multilingual Plane (U+0000 to U+FFFF).
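To illustrate the RFC rule quoted above, here is a minimal sketch of a decoder that accepts hex digits in either case. It is not the contents of decodeUtf8.diff: the function name decodeJsonUnicodeEscapes is hypothetical, the closure assumes PHP 5.3 or later, and mbstring is assumed to be loaded. Like the patch, it only covers the Basic Multilingual Plane, and it does not special-case an escaped backslash in front of the "u".

```php
<?php
// Hypothetical helper, NOT the code from decodeUtf8.diff: replaces JSON
// \uXXXX escapes with the UTF-8 encoding of the code point they name.
function decodeJsonUnicodeEscapes($content) {
  return preg_replace_callback(
    // A reverse solidus, a lowercase "u", then exactly four hex digits;
    // A-F may be upper or lower case (RFC 4627, section 2.5).
    '/\\\\u([0-9a-fA-F]{4})/',
    function ($m) {
      // Pack the code point as one big-endian 16-bit unit and transcode it.
      // Valid for BMP code points only; surrogate pairs are not combined.
      return mb_convert_encoding(pack('n', hexdec($m[1])), 'UTF-8', 'UTF-16BE');
    },
    $content
  );
}

// Lowercase and uppercase hex digits should decode to the same text.
echo decodeJsonUnicodeEscapes('{"test":"D\u00e9sol\u00e9"}'), "\n";
echo decodeJsonUnicodeEscapes('{"test":"D\u00E9sol\u00E9"}'), "\n";
```

Both calls should print {"test":"Désolé"}, with the unescaped letters left in place.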
> MakeRequest::decodeUtf8() seems to be broken in some cases
> ----------------------------------------------------------
>
> Key: SHINDIG-1229
> URL: https://issues.apache.org/jira/browse/SHINDIG-1229
> Project: Shindig
> Issue Type: Bug
> Components: PHP
> Affects Versions: 1.1-BETA5
> Environment: PHP Shindig (r881567) / PHP 5.2.4
> Reporter: Matthieu Huguet
> Attachments: decodeUtf8.diff, json-response.txt
>
>
> I have a gadget that fetches JSON data from a remote PHP script with
> makeRequest:
> Client code:
> -----------------
> [...]
> var params = {};
> params[gadgets.io.RequestParameters.AUTHORIZATION] =
> gadgets.io.AuthorizationType.SIGNED;
> params[gadgets.io.RequestParameters.CONTENT_TYPE] =
> gadgets.io.ContentType.JSON;
> params['OWNER_SIGNED'] = true;
> params['VIEWER_SIGNED'] = true;
> gadgets.io.makeRequest(url, callback, params);
> [...]
> JSON response:
> ----------------------
> The JSON data contains some special characters (in UTF-8) and is encoded
> with json_encode().
> In some cases, some characters are filtered out by MakeRequest::decodeUtf8().
> Here is an example:
> * The remote PHP script is returning :
> json_encode(array("test" => "Désolé"));
> (See the full http response in json-response.txt attachment.)
> * In MakeRequest::decodeUtf8(), here is how $content is transformed :
> 1 (original) : {"test":"D\u00e9sol\u00e9"}
> 2 (after the second preg_replace; the first one is not executed) :
> {"test":"Déé"}
> 3 (after mb_decode_numericentity) : {"test":"Déé"}
> The weird thing is that only non-special characters are filtered out.
> Is something wrong with my JSON-encoded data?
> I have no problem decoding it with the json_decode() function.
> I've tried adding charset=UTF-8 to my Content-Type response header, but it
> changes nothing.
> Any help would be really appreciated! Thanks.