[
https://issues.apache.org/jira/browse/THRIFT-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13885577#comment-13885577
]
Alexander Steshenko commented on THRIFT-2336:
---------------------------------------------
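Apparently TJsonProtocol does know how to handle {{\u}}-escaped characters,
but it expects exactly two zeros after {{\u}}, so any escape above {{\u00ff}} is rejected.
For {{\u0420}} the second expected zero is actually {{4}}, which matches the
"Unexpected character:4" error. The relevant code is in
{{private TByteArrayOutputStream readJSONString(boolean skipContext)}}: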
{code}
if (ch == ESCSEQ[0]) {
  ch = reader_.read();
  if (ch == ESCSEQ[1]) {
    readJSONSyntaxChar(ZERO);
    readJSONSyntaxChar(ZERO);
    trans_.readAll(tmpbuf_, 0, 2);
{code}
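For comparison, here is a minimal sketch of what accepting all four hex digits after {{\u}} could look like. It is not a patch against Thrift: the class {{UnicodeEscapeSketch}} and the method {{decodeUnicodeEscapes}} are hypothetical names, it works on a plain String rather than on the transport, and it handles only this one escape form (not the other JSON escapes).

```java
import java.nio.charset.StandardCharsets;

public class UnicodeEscapeSketch {

    /**
     * Decodes every backslash-u escape (four hex digits) in a JSON string body.
     * Hypothetical helper for illustration only; not part of the Thrift API.
     */
    public static String decodeUnicodeEscapes(String body) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < body.length()) {
            char ch = body.charAt(i);
            boolean isEscape = ch == '\\'
                    && i + 1 < body.length()
                    && body.charAt(i + 1) == 'u';
            if (!isEscape) {
                out.append(ch);
                i++;
                continue;
            }
            if (i + 6 > body.length()) {
                throw new IllegalArgumentException("Truncated escape at index " + i);
            }
            // Read all four hex digits instead of requiring the first two to be '0'.
            int codeUnit = 0;
            for (int k = i + 2; k < i + 6; k++) {
                int digit = Character.digit(body.charAt(k), 16);
                if (digit < 0) {
                    throw new IllegalArgumentException("Bad hex digit at index " + k);
                }
                codeUnit = codeUnit * 16 + digit;
            }
            // Appending UTF-16 code units one by one also keeps surrogate pairs intact.
            out.append((char) codeUnit);
            i += 6;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The "str" value from the payload in the issue description.
        String wire = "\\u0420\\u0443\\u0441\\u0441\\u043a\\u043e\\u0435 "
                + "\\u041d\\u0430\\u0437\\u0432\\u0430\\u043d\\u0438\\u0435";
        String decoded = decodeUnicodeEscapes(wire);
        byte[] utf8 = decoded.getBytes(StandardCharsets.UTF_8);
        System.out.println(decoded + " (" + utf8.length + " bytes as UTF-8)");
    }
}
```

A real fix inside {{readJSONString}} would presumably need to read four bytes from the transport in place of the two {{readJSONSyntaxChar(ZERO)}} calls and then write the UTF-8 encoding of the result into the output buffer; that part is an assumption about how the surrounding code is structured.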
> UTF-8 sent by PHP as JSON is not understood by JAVA's TJsonProtocol
> -------------------------------------------------------------------
>
> Key: THRIFT-2336
> URL: https://issues.apache.org/jira/browse/THRIFT-2336
> Project: Thrift
> Issue Type: Bug
> Reporter: Alexander Steshenko
>
> This is similar to THRIFT-2285.
> Whenever I have our Thrift-For-Php send non-latin utf-8 characters, e.g.
> "Русское Название" (Russian), I get this:
> {noformat}
> {"3":{"str":"\u0420\u0443\u0441\u0441\u043a\u043e\u0435 \u041d\u0430\u0437\u0432\u0430\u043d\u0438\u0435"},"6":{"tf":0}}
> {noformat}
> which is perfectly valid JSON, and I don't mind it being encoded like that.
> Java fails with
> {noformat}
> Caused by: ! org.apache.thrift.protocol.TProtocolException: Unexpected character:4
> {noformat}