On Mon, Aug 1, 2011 at 9:42 AM, Sage Weil <[email protected]> wrote:
> On Mon, 1 Aug 2011, Tommi Virtanen wrote:
>> On Mon, Aug 1, 2011 at 09:24, Tommi Virtanen
>> <[email protected]> wrote:
>> > We've talked about generating/parsing JSON a few times, and how we've
>> > run into edge cases whenever we've rolled our own functions for that.
>> > I've mentioned this C library a few times, but I'm not sure if I've
>> > actually sent the link to anyone.. Here's a C library for
>> > generating/parsing JSON, written by an ex-cow-orker of mine.
>>
>> D'oh!
>>
>> https://github.com/akheron/jansson
>
> Ah, yeah.  The problem with the current code is it's assuming the
> string to dump is ASCII.  We need to do this:
>
> https://github.com/akheron/jansson/blob/master/src/dump.c#L67
>

That piece of code only comes into effect if you pass in 1 for
"ascii". Basically, that activates a special mode in the Jansson
library where every non-ASCII character in the UTF-8 input is written
out as an ASCII escape sequence.

However, you do not need to activate this mode, and you probably
shouldn't, because as RFC4627 says (
http://www.ietf.org/rfc/rfc4627.txt ):
"JSON text SHALL be encoded in Unicode.  The default encoding is UTF-8."

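JSON only gives you a \uXXXX escape sequence with four hex digits (16
bits). A single such escape cannot represent a Unicode code point
outside the Basic Multilingual Plane; RFC 4627 has you write those as
a UTF-16 surrogate pair, i.e. two escapes. This is unfortunate, but it
doesn't really matter, because the only characters you NEED to escape
are the control characters (U+0000 through U+001F), the double quote,
and the backslash.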
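
To make the escape-length limit concrete, here is a rough sketch of how
a code point above U+FFFF ends up as two \u escapes (a UTF-16 surrogate
pair, as RFC 4627 describes). The function name json_u_escape is made
up for illustration; it is not part of Jansson or of the Ceph tree.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: write the JSON \u escape(s) for code point cp
 * into out. Returns the number of characters written (excluding the
 * terminating NUL), or -1 if cp is not a valid Unicode scalar value.
 * A buffer of 13 bytes is always enough. */
int json_u_escape(uint32_t cp, char *out, size_t outlen)
{
    /* Reject values past U+10FFFF and lone surrogates. */
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return -1;

    /* Basic Multilingual Plane: one escape is enough. */
    if (cp < 0x10000)
        return snprintf(out, outlen, "\\u%04X", (unsigned)cp);

    /* Above U+FFFF: split the 20-bit offset into two 10-bit halves. */
    uint32_t v = cp - 0x10000;
    unsigned hi = 0xD800 + (unsigned)(v >> 10);   /* high surrogate */
    unsigned lo = 0xDC00 + (unsigned)(v & 0x3FF); /* low surrogate  */
    return snprintf(out, outlen, "\\u%04X\\u%04X", hi, lo);
}
```

For U+1D11E (the G clef used as the example in RFC 4627) this produces
\uD834\uDD1E. None of this is needed if you simply emit the UTF-8 bytes
unescaped.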

See this parse table for json: http://www.json.org/string.gif

So basically, there is no problem with using utf8 in JSON. We already
have a function to do the necessary escaping of slash, quote, and so
on, in rgw_escape.c. I wrote it when fixing bug #939:
http://tracker.newdream.net/issues/939. It lives here:

http://ceph.newdream.net/git/?p=ceph.git;a=blob;f=src/rgw/rgw_escape.c;h=aa19720f43e75e4ebb6db4f88993e8c4830421f8;hb=master

And yes, it has unit tests. :)
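
For readers who don't want to open the link, the general idea of
minimal escaping looks roughly like the sketch below. This is NOT the
code from rgw_escape.c; json_escape_min is a hypothetical name, and the
sketch assumes the input is already valid UTF-8. It escapes only the
double quote, the backslash, and control characters below 0x20, and
copies every other byte (including multi-byte UTF-8 sequences) through
unchanged.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: escape a NUL-terminated UTF-8 string for use
 * inside a JSON string literal. Writes at most outlen bytes (always
 * NUL-terminated when outlen > 0) and returns the length the full
 * result needs, excluding the NUL, in the style of snprintf. */
size_t json_escape_min(const char *in, char *out, size_t outlen)
{
    size_t n = 0;

    for (const unsigned char *p = (const unsigned char *)in; *p; p++) {
        char tmp[7];
        const char *piece = tmp;
        size_t len;

        if (*p == '"') {
            piece = "\\\"";
            len = 2;
        } else if (*p == '\\') {
            piece = "\\\\";
            len = 2;
        } else if (*p < 0x20) {
            /* Control characters must be escaped. */
            len = (size_t)snprintf(tmp, sizeof tmp, "\\u%04x", (unsigned)*p);
        } else {
            /* Everything else, including UTF-8 bytes >= 0x80, is
             * passed through untouched. */
            tmp[0] = (char)*p;
            len = 1;
        }

        /* Copy as much as fits, but keep counting the full length. */
        for (size_t i = 0; i < len; i++, n++)
            if (out && n + 1 < outlen)
                out[n] = piece[i];
    }

    if (out && outlen > 0)
        out[n < outlen ? n : outlen - 1] = '\0';
    return n;
}
```

A caller would compare the return value against the buffer size to
detect truncation, the same way one does with snprintf.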

cheers,
Colin