Re: [Qemu-devel] [PATCH 17/56] json: Reject unescaped control characters

2018-08-10 Thread Markus Armbruster
Eric Blake  writes:

> On 08/08/2018 07:02 AM, Markus Armbruster wrote:
>> Fix the lexer to reject unescaped control characters in JSON strings,
>> in accordance with RFC 7159.
>
> Question - can this break existing QMP clients that were relying on
> this extension working?

In theory, yes.

The "extension" is undocumented.  That makes it a bug.

I'm not aware of clients relying on it.

> Libvirt used to use libyajl, now it uses libjansson. So I'll check
> both of those libraries:
>
> yajl: https://github.com/lloyd/yajl/blob/master/src/yajl_encode.c#L32
>
> default:
> if ((unsigned char) str[end] < 32) {
> CharToHex(str[end], hexBuf + 4);
> escaped = hexBuf;
>
> jansson: https://github.com/akheron/jansson/blob/master/src/dump.c#L101
>
> /* mandatory escape or control char */
> if(codepoint == '\\' || codepoint == '"' || codepoint < 0x20)
>
> Okay, both libraries appear to always send control characters encoded,
> and thus were not relying on this accidental QMP extension.
>
> Are we worried about other clients?

Breakage seems unlikely to me.

>> Bonus: we now recover more nicely from unclosed strings.  E.g.
>>
>>  {"one: 1}\n{"two": 2}
>>
>> now recovers cleanly after the newline, where before the lexer
>> remained confused until the next unpaired double quote or lexical
>> error.
>
> On that grounds alone, I could live with this patch, even if we end up
> having to revert it later if some client was actually depending on
> sending raw control characters as part of a string.

Having to revert the patch to stay bug-compatible wouldn't be exactly
terrible.

> Reviewed-by: Eric Blake 

Thanks!



Re: [Qemu-devel] [PATCH 17/56] json: Reject unescaped control characters

2018-08-09 Thread Eric Blake

On 08/08/2018 07:02 AM, Markus Armbruster wrote:

Fix the lexer to reject unescaped control characters in JSON strings,
in accordance with RFC 7159.


Question - can this break existing QMP clients that were relying on this 
extension working?


Libvirt used to use libyajl, now it uses libjansson. So I'll check both 
of those libraries:


yajl: https://github.com/lloyd/yajl/blob/master/src/yajl_encode.c#L32

default:
if ((unsigned char) str[end] < 32) {
CharToHex(str[end], hexBuf + 4);
escaped = hexBuf;

jansson: https://github.com/akheron/jansson/blob/master/src/dump.c#L101

/* mandatory escape or control char */
if(codepoint == '\\' || codepoint == '"' || codepoint < 0x20)

Okay, both libraries appear to always send control characters encoded, 
and thus were not relying on this accidental QMP extension.


Are we worried about other clients?



Bonus: we now recover more nicely from unclosed strings.  E.g.

 {"one: 1}\n{"two": 2}

now recovers cleanly after the newline, where before the lexer
remained confused until the next unpaired double quote or lexical
error.


On that grounds alone, I could live with this patch, even if we end up 
having to revert it later if some client was actually depending on 
sending raw control characters as part of a string.


Reviewed-by: Eric Blake 

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org