Re: [Qemu-devel] [PATCH 17/56] json: Reject unescaped control characters
Eric Blake writes: > On 08/08/2018 07:02 AM, Markus Armbruster wrote: >> Fix the lexer to reject unescaped control characters in JSON strings, >> in accordance with RFC 7159. > > Question - can this break existing QMP clients that were relying on > this extension working? In theory, yes. The "extension" is undocumented. That makes it a bug. I'm not aware of clients relying on it. > Libvirt used to use libyajl, now it uses libjansson. So I'll check > both of those libraries: > > yajl: https://github.com/lloyd/yajl/blob/master/src/yajl_encode.c#L32 > > default: > if ((unsigned char) str[end] < 32) { > CharToHex(str[end], hexBuf + 4); > escaped = hexBuf; > > jansson: https://github.com/akheron/jansson/blob/master/src/dump.c#L101 > > /* mandatory escape or control char */ > if(codepoint == '\\' || codepoint == '"' || codepoint < 0x20) > > Okay, both libraries appear to always send control characters encoded, > and thus were not relying on this accidental QMP extension. > > Are we worried about other clients? Breakage seems unlikely to me. >> Bonus: we now recover more nicely from unclosed strings. E.g. >> >> {"one: 1}\n{"two": 2} >> >> now recovers cleanly after the newline, where before the lexer >> remained confused until the next unpaired double quote or lexical >> error. > > On that grounds alone, I could live with this patch, even if we end up > having to revert it later if some client was actually depending on > sending raw control characters as part of a string. Having to revert the patch to stay bug-compatible wouldn't be exactly terrible. > Reviewed-by: Eric Blake Thanks!
Re: [Qemu-devel] [PATCH 17/56] json: Reject unescaped control characters
On 08/08/2018 07:02 AM, Markus Armbruster wrote: Fix the lexer to reject unescaped control characters in JSON strings, in accordance with RFC 7159. Question - can this break existing QMP clients that were relying on this extension working? Libvirt used to use libyajl, now it uses libjansson. So I'll check both of those libraries: yajl: https://github.com/lloyd/yajl/blob/master/src/yajl_encode.c#L32 default: if ((unsigned char) str[end] < 32) { CharToHex(str[end], hexBuf + 4); escaped = hexBuf; jansson: https://github.com/akheron/jansson/blob/master/src/dump.c#L101 /* mandatory escape or control char */ if(codepoint == '\\' || codepoint == '"' || codepoint < 0x20) Okay, both libraries appear to always send control characters encoded, and thus were not relying on this accidental QMP extension. Are we worried about other clients? Bonus: we now recover more nicely from unclosed strings. E.g. {"one: 1}\n{"two": 2} now recovers cleanly after the newline, where before the lexer remained confused until the next unpaired double quote or lexical error. On that grounds alone, I could live with this patch, even if we end up having to revert it later if some client was actually depending on sending raw control characters as part of a string. Reviewed-by: Eric Blake -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org