Re: [Qemu-devel] [PATCH 24/56] json: Accept overlong \xC0\x80 as U+0000 ("modified UTF-8")

Eric Blake Fri, 10 Aug 2018 09:09:45 -0700

On 08/10/2018 10:48 AM, Eric Blake wrote:

   * Note:
- * - Input must be encoded in UTF-8.
+ * - Input must be encoded in modified UTF-8.
Worth documenting this in the QMP doc as an explicit extension? In general, our QMP interfaces that take binary input do so via base64 encoding, rather than via a modified UTF-8 string - and I don't know how yajl or jansson would feel about an extension for producing modified UTF-8 for QMP to consume if we really did want to pass NUL bytes without the overhead of UTF-8; what's more, even if you can pass NUL, you still have to worry about all other byte sequences being valid (so base64 is still better for true binary data - it's hard to argue that we'd ever have an interface where we want UTF-8 including embedded NUL rather than true binary). I guess it can also be argued that outputting modified UTF-8 is a violation of JSON, so the fact that we can round-trip NUL doesn't help if the client can't read it.
So having typed all that, I guess the answer is no, we don't want to document it; for now, the fact that we accept \xc0\x80 on input and produce it on output is only for the testsuite, and unlikely to matter to any real client of QMP.
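
For readers unfamiliar with the term, here is a minimal sketch of what "accepting \xc0\x80 on input" amounts to. It is not QEMU's actual decoder: the function name mod_utf8_decode_sketch is made up for illustration, and it only handles sequences of up to three bytes. The point is that every overlong encoding is still rejected except the single pair \xC0\x80, which decodes to U+0000.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not QEMU code): decode one code point from
 * modified UTF-8.  On success, return the code point and store the
 * number of bytes consumed in *len.  Return -1 on invalid input.
 * Only 1- to 3-byte sequences are handled to keep the sketch short.
 */
int32_t mod_utf8_decode_sketch(const unsigned char *s, size_t n, size_t *len)
{
    int32_t cp;

    if (n == 0 || s[0] == 0x00) {
        /* Modified UTF-8 never contains a raw 0x00 byte inside a string. */
        return -1;
    }
    if (s[0] < 0x80) {
        *len = 1;
        return s[0];
    }
    if ((s[0] & 0xE0) == 0xC0) {
        if (n < 2 || (s[1] & 0xC0) != 0x80) {
            return -1;
        }
        cp = ((s[0] & 0x1F) << 6) | (s[1] & 0x3F);
        /* Overlong forms are invalid, with \xC0\x80 as the one exception. */
        if (cp < 0x80 && !(s[0] == 0xC0 && s[1] == 0x80)) {
            return -1;
        }
        *len = 2;
        return cp;
    }
    if ((s[0] & 0xF0) == 0xE0) {
        if (n < 3 || (s[1] & 0xC0) != 0x80 || (s[2] & 0xC0) != 0x80) {
            return -1;
        }
        cp = ((s[0] & 0x0F) << 12) | ((s[1] & 0x3F) << 6) | (s[2] & 0x3F);
        if (cp < 0x800 || (cp >= 0xD800 && cp <= 0xDFFF)) {
            return -1;      /* overlong form or surrogate */
        }
        *len = 3;
        return cp;
    }
    return -1;              /* 4-byte sequences omitted from this sketch */
}
```

With this, the two bytes \xC0\x80 yield code point 0 with a length of 2, while another overlong pair such as \xC1\x81 is still an error.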

Actually, I guess we never output \xc0\x80; but would output the C string "\\u0000" (since any byte above 0x1f is passed through our UTF decoder back into a codepoint then output with \u). So it's really only a question of whether our input engine can pass "\x00" vs. "\\u0000" when we NEED an input NUL, and except for the testsuite, our QAPI schema never really needs an input NUL.
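
To illustrate the output side, here is a rough sketch of escaping an already-decoded code point with \u so that NUL comes out as the six characters \u0000 rather than as raw bytes. This is an assumption-laden example, not the code in the tree: json_escape_cp_sketch is a made-up name, and the exact set of characters it passes through unescaped (printable ASCII here) is my choice for the sketch.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (not QEMU code): write code point cp to buf as it
 * could appear inside a JSON string.  Printable ASCII is copied as is,
 * '"' and '\\' get a backslash, everything else becomes \uXXXX (using a
 * surrogate pair above U+FFFF).  Returns the number of characters
 * written, or -1 if cp is out of range or buf is too small.
 */
int json_escape_cp_sketch(uint32_t cp, char *buf, size_t size)
{
    int n;

    if (cp == '"' || cp == '\\') {
        n = snprintf(buf, size, "\\%c", (int)cp);
    } else if (cp >= 0x20 && cp < 0x7F) {
        n = snprintf(buf, size, "%c", (int)cp);
    } else if (cp <= 0xFFFF) {
        /* Covers U+0000: emitted as text, never as a raw NUL byte. */
        n = snprintf(buf, size, "\\u%04X", (unsigned)cp);
    } else if (cp <= 0x10FFFF) {
        cp -= 0x10000;
        n = snprintf(buf, size, "\\u%04X\\u%04X",
                     (unsigned)(0xD800 + (cp >> 10)),
                     (unsigned)(0xDC00 + (cp & 0x3FF)));
    } else {
        return -1;
    }
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

Calling it with cp set to 0 fills buf with the C string "\\u0000", which any conforming JSON client can read back.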


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org
