[ 
https://issues.apache.org/jira/browse/THRIFT-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889924#comment-13889924
 ] 

Jens Geyer edited comment on THRIFT-2336 at 2/3/14 10:33 PM:
-------------------------------------------------------------

> Do you know what are these readJSONSyntaxChar(ZERO); for ?

Indeed, it looks strange. Someone probably lived only in 8-bit-ASCII-land, and 
everyone else copied it.

> but the bug is in Java, don't you agree? 

The bad news: Not only there. Copy and paste is a very efficient way to spread 
bugs. (I'm including myself here). 

I found it in 
 * Java
 * Javame
 * C#
 * Delphi
 * C++
 * D
 * Python

---

I think I got it. The reader expects \u00 because of the matching write code, 
which escapes only a few control characters, since all other Unicode characters 
are valid in JSON as they are. That is fine on the write side, but the 
assumption made on the read side (that every \u escape starts with 00) does not 
hold, as we see here. 
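
To illustrate the read side, here is a minimal Java sketch (not the actual 
TJSONProtocol code) that decodes a \u escape by parsing all four hex digits 
instead of insisting on two leading zeros. The class and method names 
(UnicodeEscapeSketch, decodeUnicodeEscapes) are made up for this example, and 
it assumes the input is the body of a JSON string with the surrounding quotes 
already removed.

```java
// Hypothetical helper for illustration only; none of these names exist in Thrift.
final class UnicodeEscapeSketch {
    private UnicodeEscapeSketch() {}

    // Replaces every backslash-u escape (four hex digits) with the UTF-16
    // code unit it denotes. All other escapes are copied through unchanged,
    // so an escaped backslash followed by 'u' is not misread as an escape.
    static String decodeUnicodeEscapes(String body) {
        StringBuilder out = new StringBuilder(body.length());
        int i = 0;
        while (i < body.length()) {
            char c = body.charAt(i);
            if (c != '\\' || i + 1 >= body.length()) {
                out.append(c);
                i++;
                continue;
            }
            char next = body.charAt(i + 1);
            if (next != 'u') {
                // Some other two-character escape: keep both characters as they are.
                out.append(c).append(next);
                i += 2;
                continue;
            }
            if (i + 6 > body.length()) {
                throw new IllegalArgumentException("Truncated unicode escape at index " + i);
            }
            // Parse all four hex digits; no assumption that the first two are zero.
            int codeUnit = 0;
            for (int k = i + 2; k < i + 6; k++) {
                codeUnit = (codeUnit << 4) | hexValue(body.charAt(k));
            }
            // Surrogate pairs arrive as two consecutive escapes and are simply
            // appended as two UTF-16 code units, which is how Java stores them.
            out.append((char) codeUnit);
            i += 6;
        }
        return out.toString();
    }

    // Returns the value of one ASCII hex digit or throws if it is not one.
    private static int hexValue(char ch) {
        if (ch >= '0' && ch <= '9') return ch - '0';
        if (ch >= 'a' && ch <= 'f') return ch - 'a' + 10;
        if (ch >= 'A' && ch <= 'F') return ch - 'A' + 10;
        throw new IllegalArgumentException("Not a hex digit: " + ch);
    }
}
```

With this approach the string from the issue description decodes to 
"Русское Название". A reader that demands 00 after \u would presumably stop at 
the 4 in \u0420, which would match the "Unexpected character:4" in the report.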


was (Author: jensg):
> Do you know what are these readJSONSyntaxChar(ZERO); for ?

Indeed. Looks strange. Someone probably lives only in 8-bit-ASCII-land, and all 
others made a copy.

> but the bug is in Java, don't you agree? 

The bad news: Not only there. Copy and paste is a very efficient way to spread 
bugs. (I'm including myself here). 

I found it in 
 * Java
 * Javame
 * C#
 * Delphi
 * C++
 * D
 * Python

EDIT: As these two lines seem to be the only purpose of the ZERO char in the 
JSON code, we should start there and eliminate it.


> UTF-8 sent by PHP as JSON is not understood by JAVA's TJsonProtocol
> -------------------------------------------------------------------
>
>                 Key: THRIFT-2336
>                 URL: https://issues.apache.org/jira/browse/THRIFT-2336
>             Project: Thrift
>          Issue Type: Bug
>          Components: C# - Library, C++ - Library, D - Library, Delphi - 
> Library, Java - Library, JavaME - Library, Python - Library
>            Reporter: Alexander Steshenko
>
> This is similar to THRIFT-2285.
> Whenever I have our Thrift-For-Php send non-latin utf-8 characters, e.g. 
> "Русское Название" (Russian), I get this:
> {noformat}
> {"3":{"str":"\u0420\u0443\u0441\u0441\u043a\u043e\u0435 
> \u041d\u0430\u0437\u0432\u0430\u043d\u0438\u0435"},"6":{"tf":0}}
> {noformat}
> which is a perfectly valid JSON, and I don't mind it being encoded like that. 
> Java fails with 
> {noformat}
> Caused by: ! org.apache.thrift.protocol.TProtocolException: Unexpected 
> character:4
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
