Hi Tom,
I'm not sure I understand the issue correctly, but I can at least say
that the wire format of a string shall be UTF-8. Anything else is
suspicious. See also https://issues.apache.org/jira/browse/THRIFT-414 for a
discussion of this.
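To illustrate what "suspicious" can look like: the garbled text you quote
has the shape of UTF-8 bytes that were treated as Latin-1 characters and
then encoded to UTF-8 a second time. Below is a minimal C++ sketch of that
effect. The function names are made up for illustration and are not part
of Thrift, and I am only assuming that this is what happens on your side.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of Thrift.
// Treats every byte of `bytes` as a Latin-1 code point and encodes it as
// UTF-8. Feeding it text that is already UTF-8 produces the
// double-encoded form.
std::string latin1_to_utf8(const std::string& bytes) {
    std::string out;
    for (unsigned char c : bytes) {
        if (c < 0x80) {
            out += static_cast<char>(c);                 // ASCII is unchanged
        } else {
            out += static_cast<char>(0xC0 | (c >> 6));   // lead byte: 0xC2 or 0xC3
            out += static_cast<char>(0x80 | (c & 0x3F)); // continuation byte
        }
    }
    return out;
}

// Hypothetical diagnostic helper, not part of Thrift.
// Collapses each two-byte UTF-8 sequence for U+0080..U+00FF back into a
// single byte, which undoes one round of latin1_to_utf8. All other bytes
// are copied through unchanged.
std::string undo_double_encoding(const std::string& utf8) {
    std::string out;
    for (std::size_t i = 0; i < utf8.size(); ++i) {
        unsigned char c = utf8[i];
        if ((c == 0xC2 || c == 0xC3) && i + 1 < utf8.size() &&
            (static_cast<unsigned char>(utf8[i + 1]) & 0xC0) == 0x80) {
            unsigned char next = utf8[i + 1];
            out += static_cast<char>(((c & 0x03) << 6) | (next & 0x3F));
            ++i;                                         // consumed two bytes
        } else {
            out += static_cast<char>(c);
        }
    }
    return out;
}
```

For example, "á" is the two bytes C3 A1 in UTF-8. Passing them through
latin1_to_utf8 gives C3 83 C2 A1, which a UTF-8 terminal shows as "Ã¡".
If undo_double_encoding turns the string your server received back into
the expected text, the data was encoded twice somewhere before it reached
the wire.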
Does that help you any further?
Have fun,
JensG
-----Original Message-----
From: Tom Hesp
Sent: Tuesday, January 20, 2015 10:19 AM
To: [email protected]
Subject: Diacritics get garbled when sent from Perl client.
Hi,
This question may have been asked before on this list but I have not
been able to find anything about it.
I am using Thrift version 0.9.1 and have a C++ Thrift server maintaining
user records in a database.
When I send user information containing diacritics (like á, ö, è, etc.)
to it from a C++ or PHP client everything is fine.
However, when I do the same from a Perl client, the diacritics become
garbled. The example characters above are received by the server as
something like this: áöè
I am using the BinaryProtocol so I checked the BinaryProtocol.pm and saw
the following construct in writeString:
    if( utf8::is_utf8($value) ){
        $value = Encode::encode_utf8($value);
    }
Which means that the string is encoded to Perl's internal format.
I also checked the C++ libraries at the receiving (server) end but I do
not see the string being decoded again!
I even tried this with a little Perl server, but the results are the
same: the data gets encoded but is never decoded.
Am I missing something? Do I need to define something in the IDL so the
server knows it may have to decode the string?
Thanks for your time.
Kind regards,
Tom Hesp