On 08.04.2014 at 10:14, Jean-Marc Choulet wrote:
>
> Tommi,
>
> I use tntnet for my web services. I also have Perl clients. When tntnet
> returns the json data, all accents are encoded. For example, "é" becomes
> \u00c3\u00a9. There is no problem with javascript. But with Perl, how
> can I decode these?
>
> Jean-Marc
Hi,
the problem is actually in your code. What is happening here? You have
something like:
std::string foo = "é";
si.addMember("foo") <<= foo;
After the assignment, the std::string foo actually holds 2 bytes. Your
source code is almost certainly utf-8 encoded, and the character é is
not an ascii character: it needs 2 bytes in utf-8. The assignment is
effectively the same as:
std::string foo = "\xc3\xa9";
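To check what a std::string really contains, you can dump its bytes. The
following is a minimal sketch in plain C++ without cxxtools; hexDump is a
hypothetical helper name I made up for this mail, not part of any library.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: renders every byte of s as two lowercase hex
// digits, separated by single spaces.
std::string hexDump(const std::string& s)
{
    std::string out;
    for (unsigned char byte : s)   // unsigned char avoids sign extension
    {
        char buf[4];
        std::snprintf(buf, sizeof buf, "%02x", static_cast<unsigned>(byte));
        if (!out.empty())
            out += ' ';
        out += buf;
    }
    return out;
}
```

Calling hexDump("\xc3\xa9") yields the two bytes c3 and a9, so size() of
that string is 2 although it holds only one visible character.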
The next step is to put foo into a cxxtools::SerializationInfo. The si
does its best to interpret the string. It assumes that the string has
2 characters: '\xc3' and '\xa9'. Each byte is interpreted as a unicode
code point, so what we get are 2 unicode characters with the values
U+00C3 and U+00A9. And when those are encoded into json, we get exactly
those values as escapes: \u00c3\u00a9.
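The effect can be reproduced without cxxtools. The sketch below is only
an illustration of the byte-per-character interpretation, not the actual
cxxtools code; jsonEscape and escapeBytewise are hypothetical names.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: writes one code point below U+10000 as a JSON
// \uXXXX escape with lowercase hex digits.
std::string jsonEscape(unsigned codePoint)
{
    char buf[8];
    std::snprintf(buf, sizeof buf, "\\u%04x", codePoint);
    return buf;
}

// Treats every single byte as its own unicode code point. This is the
// misinterpretation described above, applied to all bytes of the input.
std::string escapeBytewise(const std::string& bytes)
{
    std::string out;
    for (unsigned char byte : bytes)
        out += jsonEscape(byte);
    return out;
}
```

Feeding the two bytes "\xc3\xa9" into escapeBytewise produces
\u00c3\u00a9, which is what your Perl client receives.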
As a side note, I recommend reading
http://www.joelonsoftware.com/articles/Unicode.html if you are not
familiar with utf-8 and unicode.
So what is the solution? You have to make sure that the std::string is
interpreted correctly. Cxxtools has a utf-8 codec, which is able to
convert a utf-8 string to unicode and back to utf-8. So change the call
to the serialization operator like this:
si.addMember("foo") <<= cxxtools::Utf8Codec::decode(foo);
And the deserialization:
cxxtools::String unicodeFoo;
si.getMember("foo") >>= unicodeFoo;
foo = cxxtools::Utf8Codec::encode(unicodeFoo);
After fixing that, you will get the json string "\u00e9", which is the
correct json notation for the letter 'é'.
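To see why decoding first gives a single escape, here is a minimal
sketch in plain C++. It is not the cxxtools::Utf8Codec implementation;
decodeUtf8 and toJsonEscapes are hypothetical names, the decoder only
handles 1- to 3-byte sequences, and the output is not a complete json
encoder (quotes and control characters are not escaped).

```cpp
#include <cstddef>
#include <cstdio>
#include <stdexcept>
#include <string>

// Decodes utf-8 bytes to unicode code points. Simplified: only 1- to
// 3-byte sequences (U+0000..U+FFFF); overlong forms are not rejected.
std::u32string decodeUtf8(const std::string& bytes)
{
    std::u32string out;
    std::size_t i = 0;
    while (i < bytes.size())
    {
        unsigned char lead = static_cast<unsigned char>(bytes[i]);
        char32_t cp;
        std::size_t extra;   // number of continuation bytes expected
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xe0) == 0xc0) { cp = lead & 0x1f; extra = 1; }
        else if ((lead & 0xf0) == 0xe0) { cp = lead & 0x0f; extra = 2; }
        else throw std::runtime_error("unsupported or invalid lead byte");

        if (i + extra >= bytes.size())
            throw std::runtime_error("truncated utf-8 sequence");

        for (std::size_t k = 1; k <= extra; ++k)
        {
            unsigned char cont = static_cast<unsigned char>(bytes[i + k]);
            if ((cont & 0xc0) != 0x80)
                throw std::runtime_error("invalid continuation byte");
            cp = (cp << 6) | (cont & 0x3f);   // append 6 payload bits
        }
        out += cp;
        i += extra + 1;
    }
    return out;
}

// Writes non-ascii code points as \uXXXX escapes and copies ascii
// characters unchanged.
std::string toJsonEscapes(const std::u32string& text)
{
    std::string out;
    for (char32_t cp : text)
    {
        if (cp < 0x80)
        {
            out += static_cast<char>(cp);
            continue;
        }
        char buf[8];
        std::snprintf(buf, sizeof buf, "\\u%04x", static_cast<unsigned>(cp));
        out += buf;
    }
    return out;
}
```

With this, the bytes "\xc3\xa9" decode to the single code point U+00E9
(0x03 from the lead byte shifted left by 6, combined with 0x29 from the
continuation byte), and toJsonEscapes turns that into \u00e9.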
As an alternative you may want to change the type of foo (or whatever
your data is called) from std::string to cxxtools::String. Then you have
a true unicode string.
And since this all looks so ugly, I decided to write a small wrapper,
which makes it much nicer. When I'm done, you may do:
si.addMember("foo") <<= cxxtools::Utf8(foo);
si.getMember("foo") >>= cxxtools::Utf8(foo);
At least I hope to be able to implement it that way. Do not take this
as final reference documentation. Details may change.
Tommi
_______________________________________________
Tntnet-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/tntnet-general