On Wed, Nov 25, 2015 at 12:47 AM Ron <ronnnnnnnnnn...@gmail.com> wrote:

> Sure.
>
> For example, I defined the below message in the proto file:
> message Person
> {
>  string first_name = 1;
>  string last_name = 2;
> }
>
>
> When I set the first_name field to "Ron" both binary serialization and
> JSON serialization work fine.
>
>
> But when I set it to "רון" (as UTF-8), the serialization to binary is
> correct (shown here as base64):
>
> *CgbXqNeV158=*
> ... but when using *BinaryToJsonString* to get the JSON representation, the
> value is mishandled and is ultimately replaced with an empty string:
> { "firstName": "" }
>
>
> This example will probably work correctly with compilers that define char
> as unsigned by default, but with compilers that define char as signed
> (such as Microsoft's), I think you should get the same (incorrect) result
> I pasted above.
>
Thanks for the explanation. Could you help file a bug for this on the
protobuf GitHub site? If you know of a solution to this, you are also welcome
to send us a pull request.
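
For anyone who wants to reproduce the effect described in the quoted message without building protobuf, here is a minimal sketch. It is not the code from json_escaping.cc: the helper names (LeadByteSignExtended, LeadByteFixed, Utf8SequenceLength) are hypothetical, and the range-based length check is only an assumption about how such a classifier might look. It uses signed char explicitly so the behavior does not depend on the compiler's default for plain char.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helpers for illustration only; these are NOT protobuf functions.

// Mimics the problematic pattern: the byte passes through a signed 8-bit type
// before being widened, so a byte with the high bit set is sign-extended.
inline std::uint32_t LeadByteSignExtended(const std::string& s) {
  assert(!s.empty());
  signed char c = static_cast<signed char>(s[0]);
  return static_cast<std::uint32_t>(c);  // 0xD7 becomes 0xFFFFFFD7
}

// Mimics the workaround from the thread: narrow to an unsigned 8-bit value
// first, then widen.
inline std::uint32_t LeadByteFixed(const std::string& s) {
  assert(!s.empty());
  return static_cast<std::uint8_t>(s[0]);  // 0xD7 stays 0x000000D7
}

// Assumed classifier: UTF-8 sequence length from the lead byte, or 0 if the
// value is not a recognized lead byte. This is not a strict UTF-8 validator.
inline int Utf8SequenceLength(std::uint32_t lead) {
  if (lead < 0x80) return 1;
  if (lead >= 0xC0 && lead <= 0xDF) return 2;
  if (lead >= 0xE0 && lead <= 0xEF) return 3;
  if (lead >= 0xF0 && lead <= 0xF7) return 4;
  return 0;
}
```

With the UTF-8 bytes of "רון" (D7 A8 D7 95 D7 9F), the sign-extended lead byte falls outside every range and is reported as length 0, while the version that casts to an unsigned 8-bit type first is classified as a 2-byte sequence.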


>
>
>
> On Tuesday, November 24, 2015 at 10:51:55 PM UTC+2, Feng Xiao wrote:
>>
>>
>>
>> On Tue, Nov 24, 2015 at 11:42 AM, Ron <ronnnnn...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> When using *BinaryToJsonString* or *BinaryToJsonStream*, I seem to
>>> encounter a problem whenever a message contains a string with
>>> multibyte characters.
>>> After some debugging, it seems things start to go wrong in
>>> *ReadCodePoint* (in json_escaping.cc), when the first byte of the
>>> multibyte character is read from the string (as char) and assigned to a
>>> variable of type uint32. Converting directly from a signed 1-byte value to
>>> an unsigned 4-byte value sign-extends it (for example, 0xD7 becomes
>>> 0xFFFFFFD7), so the result differs from what the *if-else* statements a
>>> little later expect when they inspect that value to determine the length
>>> of the multibyte character. From there things go wrong and the string
>>> isn't serialized and just gets dropped...
>>>
>>> For now, as a temporary solution, I added a cast of the value returned by
>>> StringPiece's *operator[]* to uint8 before the assignment into uint32,
>>> but any advice or a more permanent solution would be appreciated.
>>>
>> Could you provide a sample input that will fail for this reason?
>>
>>
>
>>> Thanks,
>>> Ron
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "Protocol Buffers" group.
>>> To unsubscribe from this group and stop receiving emails from it, send an
>>> email to protobuf+u...@googlegroups.com.
>>> To post to this group, send email to prot...@googlegroups.com.
>>> Visit this group at http://groups.google.com/group/protobuf.
>>> For more options, visit https://groups.google.com/d/optout.
>>>
