Sure.

For example, I defined the following message in the .proto file:
message Person 
{
 string first_name = 1;
 string last_name = 2;
}


When I set the first_name field to "Ron" both binary serialization and JSON 
serialization work fine.


But when I set it to "רון" (as UTF-8), the serialization to binary is still 
correct (shown here as base64):

*CgbXqNeV158=*
... but when using *BinaryToJsonString* to get the JSON representation, the 
value is mishandled and is ultimately replaced with an empty string:
{ "firstName": "" }


This example will probably work correctly with compilers that treat char 
as unsigned by default, but with compilers that treat char as signed 
(such as Microsoft's), I think you should get the same (incorrect) result 
I pasted above.



On Tuesday, November 24, 2015 at 10:51:55 PM UTC+2, Feng Xiao wrote:
>
>
>
> On Tue, Nov 24, 2015 at 11:42 AM, Ron <[email protected]> wrote:
>
>> Hi,
>>
>> When using *BinaryToJsonString* or *BinaryToJsonStream*, I seem to 
>> encounter a problem whenever there's a message containing a string 
>> containing multibyte characters.
>> After some debugging, it seems the place where things start to go wrong 
>> is in *ReadCodePoint* (in json_escaping.cc) when the first byte of the 
>> multibyte character is being read from the string (as char) and assigned 
>> into a variable of type uint32. This casting directly from a signed 1-byte 
>> value to an unsigned 4-byte value seems to produce values that are 
>> different than intended and different than expected a little later on by 
>> some *if-else* statements trying to look at that value to determine the 
>> correct length of the multibyte character. From there things go wrong and 
>> the string isn't serialized and just gets dropped...
>>
>> For now as a temporary solution I added a cast of the value returned by 
>> StringPiece's *operator[]* to uint8 before the assignment into uint32, 
>> but any advice or a more permanent solution will be appreciated.
>>
> Could you provide a sample input that will fail for this reason?
>  
>
>>
>> Thanks,
>> Ron
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "Protocol Buffers" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected].
>> To post to this group, send email to [email protected].
>> Visit this group at http://groups.google.com/group/protobuf.
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>
