Re: [Qemu-devel] [PATCH v4 06/22] qobject: Perform %% interpolation in qobject_from_jsonf()

2017-08-09 Thread Eric Blake
On 08/09/2017 04:06 AM, Markus Armbruster wrote:
> Eric Blake  writes:
> 
>> We want -Wformat to catch blatant programming errors in format
>> strings that we pass to qobject_from_jsonf().  But if someone
>> were to pass a JSON string "'%s'" as the format string, gcc would
>> insist that it be paired with a char* argument, even though our
>> lexer does not recognize % sequences inside a JSON string.  You can
>> bypass -Wformat checking by passing the Unicode escape \u0025 for
>> %, but who wants to remember to type that?  So the solution is that
>> anywhere we are relying on -Wformat checking, the caller should
>> pass the usual printf %% escape sequence where a literal % is
>> wanted on output.
>>

>> +bool double_quote = *ptr++ == '"';
>>
>>  str = qstring_new();
>> -while (*ptr && 
>> -   ((double_quote && *ptr != '"') || (!double_quote && *ptr != 
>> '\''))) {
>> +while (*ptr && *ptr != "'\""[double_quote]) {
> 
> Simpler:
> 
>bool quote = *ptr++;
> 
> and then
> 
>while (*ptr && *ptr != quote) {

Well, 'char quote' rather than 'bool quote', but yes, I like it.

> 
> Have you considered splitting the patch into one to simplify, and one to
> implement %%?

Will split.

>> @@ -455,13 +452,15 @@ static QObject *parse_escape(JSONParserContext *ctxt, 
>> va_list *ap)
>>  {
>>  JSONToken *token;
>>
>> -if (ap == NULL) {
>> -return NULL;
>> -}
>> -
>>  token = parser_context_pop_token(ctxt);
>>  assert(token && token->type == JSON_ESCAPE);
>>
>> +if (ap == NULL) {
>> +parse_error(ctxt, token, "escape parsing for '%s' not requested",
>> +token->str);
>> +return NULL;
>> +}
>> +
> 
> When I manage to fat-finger a '%' into my QMP input, I now get this
> error message instead of "Invalid JSON syntax".  Makes me go "what is
> escape parsing, and how do I request it?"  Not an improvement, I'm
> afraid.

Pre-patch, I see:

$ qemu-kvm -nodefaults -nographic -qmp stdio
{"QMP": {"version": {"qemu": {"micro": 91, "minor": 9, "major": 2},
"package": "(qemu-2.10.0-0.1.rc1.fc26)"}, "capabilities": []}}
{'execute':%s}
{"error": {"class": "GenericError", "desc": "JSON parse error, Missing
value in dict"}}
{'execute':%%}
{"error": {"class": "GenericError", "desc": "Invalid JSON syntax"}}
{"error": {"class": "GenericError", "desc": "JSON parse error, expecting
value"}}

I find it odd that NOT calling parse_error() but still returning NULL
changes the behavior on what error message eventually gets emitted; but
I also agree that the QMP case should drive what error message (if any)
is needed in parse_escape().  I'll play with it some more (the parser's
error handling is weird).


>> +/* In vararg form, %% must occur in strings */
>> +/* qobject_from_jsonf("%%"); */
>> +/* qobject_from_jsonf("{%%}"); */
>> +
>> +/* In vararg form, strings must not use % except for %% */
>> +/* qobject_from_jsonf("'%s'", "unpaired"); */
> 
> Could use g_test_trap_subprocess().  Not sure it's worth the bother.

I don't know - this is one case where proving we abort on invalid usage
might actually be a good validation of the contract.

> 
> I hate code in comments.  Better:
> 
>/* The following intentionally cause assertion failures */
>#if 0
>/* In vararg form, %% must occur in strings */
>qobject_from_jsonf("%%");

If I don't use the g_test_trap_subprocess() trick, then I can override
checkpatch's complaints about #if 0.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH v4 06/22] qobject: Perform %% interpolation in qobject_from_jsonf()

2017-08-09 Thread Markus Armbruster
Eric Blake  writes:

> We want -Wformat to catch blatant programming errors in format
> strings that we pass to qobject_from_jsonf().  But if someone
> were to pass a JSON string "'%s'" as the format string, gcc would
> insist that it be paired with a char* argument, even though our
> lexer does not recognize % sequences inside a JSON string.  You can
> bypass -Wformat checking by passing the Unicode escape \u0025 for
> %, but who wants to remember to type that?  So the solution is that
> anywhere we are relying on -Wformat checking, the caller should
> pass the usual printf %% escape sequence where a literal % is
> wanted on output.
>
> Note that since % can only appear in JSON inside a string, we don't
> have to teach the lexer how to parse any new % sequences, but instead
> only have to add code to the parser.  Likewise, the parser is still
> where we reject any attempt at mid-string % interpolation other than
> %% (this is only a runtime failure, rather than compile-time), but
> since we already document that qobject_from_jsonf() asserts on invalid
> usage, presumably anyone that is adding a new usage will have tested
> that their usage doesn't always fail.
>
> Simplify qstring_from_escaped_str() while touching it, by using
> bool, a more compact conditional, and qstring_append_chr().
> Likewise, improve the error message when parse_escape() is reached
> without interpolation (for example, if a client sends garbage
> rather than JSON over a QMP connection).
>
> The testsuite additions pass under valgrind, proving that we are
> indeed passing the reference of anything given through %p to the
> returned containing object, even when more than one object is
> interpolated.
>
> Signed-off-by: Eric Blake 
> ---
>  qobject/json-lexer.c  |  6 --
>  qobject/json-parser.c | 49 -
>  qobject/qjson.c   |  4 ++--
>  tests/check-qjson.c   | 50 ++
>  4 files changed, 80 insertions(+), 29 deletions(-)
>
> diff --git a/qobject/json-lexer.c b/qobject/json-lexer.c
> index b846d2852d..599b7446b7 100644
> --- a/qobject/json-lexer.c
> +++ b/qobject/json-lexer.c
> @@ -32,9 +32,11 @@
>   * Extension for vararg handling in JSON construction, when using
>   * qobject_from_jsonf() instead of qobject_from_json() (this lexer
>   * actually accepts multiple forms of PRId64, but parse_escape() later
> - * filters to only parse the current platform's spelling):
> + * filters to only parse the current platform's spelling; meanwhile,
> + * JSON only allows % inside strings, where the parser handles %%, so
> + * we do not need to lex it here):

The parenthesis is becoming unwieldy.  Turn it into a note...

>   *
> - * %(PRI[du]64|(l|ll)?[ud]|[ipsf])
> + * %(PRI[du]64|(l|ll)?[ud]|[ipsf%])
>   *

... here?

>   */
>
> diff --git a/qobject/json-parser.c b/qobject/json-parser.c
> index 388aa7a386..daafe77a0c 100644
> --- a/qobject/json-parser.c
> +++ b/qobject/json-parser.c
> @@ -120,25 +120,21 @@ static int hex2decimal(char ch)
>   *  \n
>   *  \r
>   *  \t
> - *  \u four-hex-digits 
> + *  \u four-hex-digits
> + *
> + * Additionally, if @percent is true, all % in @token must be doubled,
> + * replaced by a single % will be in the result; this allows -Wformat
> + * processing of qobject_from_jsonf().
>   */
>  static QString *qstring_from_escaped_str(JSONParserContext *ctxt,
> - JSONToken *token)
> + JSONToken *token, bool percent)
>  {
>  const char *ptr = token->str;
>  QString *str;
> -int double_quote = 1;
> -
> -if (*ptr == '"') {
> -double_quote = 1;
> -} else {
> -double_quote = 0;
> -}
> -ptr++;
> +bool double_quote = *ptr++ == '"';
>
>  str = qstring_new();
> -while (*ptr && 
> -   ((double_quote && *ptr != '"') || (!double_quote && *ptr != 
> '\''))) {
> +while (*ptr && *ptr != "'\""[double_quote]) {

Simpler:

   bool quote = *ptr++;

and then

   while (*ptr && *ptr != quote) {

Have you considered splitting the patch into one to simplify, and one to
implement %%?

>  if (*ptr == '\\') {
>  ptr++;
>
> @@ -205,12 +201,13 @@ static QString 
> *qstring_from_escaped_str(JSONParserContext *ctxt,
>  goto out;
>  }
>  } else {
> -char dummy[2];
> -
> -dummy[0] = *ptr++;
> -dummy[1] = 0;
> -
> -qstring_append(str, dummy);
> +if (*ptr == '%' && percent) {
> +if (*++ptr != '%') {
> +parse_error(ctxt, token, "invalid %% sequence in 
> string");
> +goto out;
> +}
> +}
> +qstring_append_chr(str, *ptr++);
>  }
>  }
>
> @@ -455,13 +452,15 @@ static QObject *parse_escape(JSONParserContext *ctxt, 
> va_list *ap)