Re: [Qemu-devel] [PATCH v2 2/4] qjson: do not save/restore contexts
Paolo Bonziniwrites: > JSON is LL(1) and our parser indeed needs only 1 token lookahead. > Saving the parser context is mostly unnecessary; "Mit Kanonen auf Spatzen" (shooting cannons at sparrows). > we can replace it > with peeking at the next token, or remove it altogether when the > restore only happens on errors. The token list is destroyed anyway > on errors. > > The only interesting thing is that parse_keyword always eats > a TOKEN_KEYWORD, even if it is invalid, so it must come last in > parse_value (otherwise, NULL is returned, parse_literal is invoked > and it tries to peek beyond end of input). This is caught by > /errors/unterminated/literal, which actually checks for an unterminated > keyword. ಠ_ಠ > > Signed-off-by: Paolo Bonzini > --- > qobject/json-parser.c | 59 > ++- > 1 file changed, 21 insertions(+), 38 deletions(-) > > diff --git a/qobject/json-parser.c b/qobject/json-parser.c > index ac991ba..7a287ea 100644 > --- a/qobject/json-parser.c > +++ b/qobject/json-parser.c > @@ -296,23 +296,6 @@ static QObject > *parser_context_peek_token(JSONParserContext *ctxt) > return token; > } > > -static JSONParserContext parser_context_save(JSONParserContext *ctxt) > -{ > -JSONParserContext saved_ctxt = {0}; > -saved_ctxt.tokens.pos = ctxt->tokens.pos; > -saved_ctxt.tokens.count = ctxt->tokens.count; > -saved_ctxt.tokens.buf = ctxt->tokens.buf; > -return saved_ctxt; > -} > - > -static void parser_context_restore(JSONParserContext *ctxt, > - JSONParserContext saved_ctxt) > -{ > -ctxt->tokens.pos = saved_ctxt.tokens.pos; > -ctxt->tokens.count = saved_ctxt.tokens.count; > -ctxt->tokens.buf = saved_ctxt.tokens.buf; > -} > - This saves and restores tokens, which is an array @buf of @count tokens with a cursor @pos. @buf and count only ever change in parser_context_new(). Saving and restoring them has always been pointless. What this actually does is saving and restoring the cursor. > static void tokens_append_from_iter(QObject *obj, void *opaque) > { > JSONParserContext *ctxt = opaque; > @@ -364,7 +347,6 @@ static void parser_context_free(JSONParserContext *ctxt) > static int parse_pair(JSONParserContext *ctxt, QDict *dict, va_list *ap) > { > QObject *key = NULL, *token = NULL, *value, *peek; > -JSONParserContext saved_ctxt = parser_context_save(ctxt); > > peek = parser_context_peek_token(ctxt); > if (peek == NULL) { > @@ -402,7 +384,6 @@ static int parse_pair(JSONParserContext *ctxt, QDict > *dict, va_list *ap) > return 0; > > out: > -parser_context_restore(ctxt, saved_ctxt); > qobject_decref(key); > > return -1; Before your patch: on error, we rewind the cursor to where it was on entry. Useful in a backtracking parser, but this shouldn't be one. Now: we leave it wherever we error out. Fine, because the sole caller parse_object() won't actually use it on error. > @@ -412,9 +393,8 @@ static QObject *parse_object(JSONParserContext *ctxt, > va_list *ap) > { > QDict *dict = NULL; > QObject *token, *peek; > -JSONParserContext saved_ctxt = parser_context_save(ctxt); > > -token = parser_context_pop_token(ctxt); > +token = parser_context_peek_token(ctxt); > if (token == NULL) { > goto out; > } > @@ -425,6 +405,7 @@ static QObject *parse_object(JSONParserContext *ctxt, > va_list *ap) > > dict = qdict_new(); > > +parser_context_pop_token(ctxt); > peek = parser_context_peek_token(ctxt); > if (peek == NULL) { > parse_error(ctxt, NULL, "premature EOI"); > @@ -465,7 +446,6 @@ static QObject *parse_object(JSONParserContext *ctxt, > va_list *ap) > return QOBJECT(dict); > > out: > -parser_context_restore(ctxt, saved_ctxt); > QDECREF(dict); > return NULL; > } I'm not 100% when exactly parser_context_peek_token() returns null, but I think can see what your patch does anyway. Before: if we can parse an object, advance cursor behind the object and return it, else leave the cursor alone and return null. The sole caller parse_value() relies on this behavior to try alternatives until one succeeds. After: if we can parse an object, same as before, else advance the cursor to right before the offending token and return null. Consider QMP input { 1 [ 2 ] }. json_parser_parse_err(): parse_value() @ pos=0: triy parse_object() @ pos=0: consume '{' parse_pair() @ pos=1: parse_value() @ pos=1: try ..., consume 1, now pos=2 fail with "key is not a string in object" propagate failure try parse_array() @ pos=2: consume [ 2 ], return the QList, now pos=5 return the QList
Re: [Qemu-devel] [PATCH v2 2/4] qjson: do not save/restore contexts
Hi, > Is it accepted practice to put UTF-8 in commit messages? (Or, actually, > anywhere in patches, except maybe the notes section?) Sure. We don't want limit people names (in signed-off etc) to us-ascii. See "eb8934b vnc: fix memory corruption (CVE-2015-5225)" for a name written in kanji. > I'd recommend o_O. Heh, it's 2015, not 1995 ... cheers, Gerd
Re: [Qemu-devel] [PATCH v2 2/4] qjson: do not save/restore contexts
On 11/24/15 09:03, Gerd Hoffmann wrote: > Hi, > >> Is it accepted practice to put UTF-8 in commit messages? (Or, actually, >> anywhere in patches, except maybe the notes section?) > > Sure. We don't want limit people names (in signed-off etc) to us-ascii. > > See "eb8934b vnc: fix memory corruption (CVE-2015-5225)" for a name > written in kanji. I'm very sorry, but this is something I expressly disagree with. (I didn't want to bring this up on my own, but since you did...) International engineering / science / research etc. are being done in English. We use English because we expect people to learn one common language (native English speakers have it easy, but that's a side point), so that everyone not have to learn every possible language. The same point applies *much more* to writing systems / alphabets. You (the generic you) can't expect me (the generic me) to read Kanji, Sanskrit, Thai script, Cyrillic script, and so on, even if your name is written in that language natively. You come up with an approximation in Latin script, and use that. Is your purpose to feel pleased about the faithful representation of your name in the commit message (that the international community is unable to read, not even approximately), or is your goal to allow the community to read your (approximate) name? I bet everyone who is involved in international development, and travels occasionally, has business cards in Latin script *too*. I bet whoever does research and publishes papers in English puts their name (or at least an official approximation of it) on the front page in Latin script *too*. Specifically about the commit you mention, the email of the reporter is: zuozhi@alibaba-inc.com I'm absolutely sure that "zuozhi" is the official Pinyin transliteration of the reporter's name (or a part of it). Now, while Pinyin has its own separate pronunciation rules, I *can* (and occasionally do) look up those rules. So Pinyin allows me to *work* with the name with relative safety, and it even gives me a fleeting chance at getting the pronunciation right, should we meet. My name is László Érsek. I've dropped the accents for the purpose of international exchange in advance; I just write Laszlo Ersek, even when I sign physical documents that are in English. Even that way, I've seen the larger community abuse my name endlessly; in particular I've seen all permutations (= reordering) and variations (= missing characters) of the substring "szl" in "Laszlo". If the larger community fails to get such a simple ASCII name right -- and yes I'm at fault too, I have occasionally left off the second "n" of your last name --, then why am I (or anyone else) expected to struggle with names written in non-Latin script? They are much harder, and have exactly zero value, as far as international collaboration is concerned. The development is being done in English, and the script of English is Latin. Stick with it. >> I'd recommend o_O. > > Heh, it's 2015, not 1995 ... Sure, and the Internet Standards are still being written in pure ASCII. https://en.wikipedia.org/wiki/Request_for_Comments#Obtaining_RFCs Laszlo
Re: [Qemu-devel] [PATCH v2 2/4] qjson: do not save/restore contexts
Hi, > The same point applies *much more* to writing systems / alphabets. You > (the generic you) can't expect me (the generic me) to read Kanji, > Sanskrit, Thai script, Cyrillic script, and so on, even if your name is > written in that language natively. You come up with an approximation in > Latin script, and use that. > Is your purpose to feel pleased about the faithful representation of > your name in the commit message (that the international community is > unable to read, not even approximately), or is your goal to allow the > community to read your (approximate) name? IMO that is for the people in question to decide. Usually I just cut +paste (or let stefans great patches script collect) what they are using them-self and are apparently comfortable with. Having Kanji only as in that specific case is unusual indeed, in most cases there is latin transcript, either in place of or additionally to the native version, like this: "latin (native) ". The latter looks a bit silly for latin-family of alphabets, for others (kanji/cyrillic/...) it makes more sense and seems to be more common. But even when they use native only I still think their name and not only the email address should appear in the commit message when giving them the credit they deserve. cheers, Gerd
Re: [Qemu-devel] [PATCH v2 2/4] qjson: do not save/restore contexts
On 24/11/2015 11:50, Laszlo Ersek wrote: > You > (the generic you) can't expect me (the generic me) to read Kanji, > Sanskrit, Thai script, Cyrillic script, and so on, even if your name is > written in that language natively. You come up with an approximation in > Latin script, and use that. > > Is your purpose to feel pleased about the faithful representation of > your name in the commit message (that the international community is > unable to read, not even approximately), or is your goal to allow the > community to read your (approximate) name? > > Specifically about the commit you mention, the email of the reporter is: > > zuozhi@alibaba-inc.com > > I'm absolutely sure that "zuozhi" is the official Pinyin transliteration > of the reporter's name (or a part of it). Now, while Pinyin has its own > separate pronunciation rules, I *can* (and occasionally do) look up > those rules. So Pinyin allows me to *work* with the name with relative > safety, and it even gives me a fleeting chance at getting the > pronunciation right, should we meet. I think this is getting into dangerous territory. :) Chinese language and names are very different from ours and it's possible that a person for some reason is very attached to the particular hanzi (kanji is Japanese :)) that are used for their name. (Fam, correct me if I'm wrong). I've also seen people who have adopted an alternative English name and still want to include the Chinese name somewhere, at which point it's understandable that they write the latter with hanzi. Certainly it's better if you get something like these: Signed-off-by: Gong Li (巩俐)Signed-off-by: 巩俐 (Gong Li) but if the email is understandable I have no problem with Signed-off-by: 巩俐 or in the case of an English name any of Signed-off-by: Jane Li (巩俐) Signed-off-by: 巩俐 (Jane Li) Signed-off-by: 巩俐 (Jane Li) where in the last case the email doesn't give the full Chinese name, but there is an alternative for people who cannot read Chinese characters. For some reason this almost never happens with Japanese developers, but https://en.wikipedia.org/wiki/Japanese_name#Difficulty_of_reading_names suggests that they may also appreciate using kanji for their names in commit messages. Thanks, Paolo
Re: [Qemu-devel] [PATCH v2 2/4] qjson: do not save/restore contexts
On 11/24/15 11:50, Laszlo Ersek wrote: > My name is László Érsek. OMG, my native name is actually "Érsek László". 95% of my written communication is in English, and this very discussion is occurring in English. So in this context I forgot that in Hungarian the family name goes first. (Just goes on to show that I'm willing to mangle my name for collaboration's sake. Others should be too.)
Re: [Qemu-devel] [PATCH v2 2/4] qjson: do not save/restore contexts
On 24/11/2015 13:44, Fam Zheng wrote: > All in all, in my opinion, only something like > > Signed-off-by: 巩俐> > , from which you can't easily infer the pronunciation, is apparently > inconsiderate in an English context. Otherwise I think we should tolerate the > usage of non-latin characters even if it means we have to copy FWIW, I totally agree. Paolo
Re: [Qemu-devel] [PATCH v2 2/4] qjson: do not save/restore contexts
On Tue, 11/24 12:18, Paolo Bonzini wrote: > > > On 24/11/2015 11:50, Laszlo Ersek wrote: > > You > > (the generic you) can't expect me (the generic me) to read Kanji, > > Sanskrit, Thai script, Cyrillic script, and so on, even if your name is > > written in that language natively. You come up with an approximation in > > Latin script, and use that. > > > > Is your purpose to feel pleased about the faithful representation of > > your name in the commit message (that the international community is > > unable to read, not even approximately), or is your goal to allow the > > community to read your (approximate) name? > > > > Specifically about the commit you mention, the email of the reporter is: > > > > zuozhi@alibaba-inc.com > > > > I'm absolutely sure that "zuozhi" is the official Pinyin transliteration > > of the reporter's name (or a part of it). Now, while Pinyin has its own Yes, it's the Pinyin of 祚至, his given name. It's a peculiar and literary choice of name. :) > > separate pronunciation rules, I *can* (and occasionally do) look up > > those rules. So Pinyin allows me to *work* with the name with relative > > safety, and it even gives me a fleeting chance at getting the > > pronunciation right, should we meet. > > I think this is getting into dangerous territory. :) > > Chinese language and names are very different from ours and it's > possible that a person for some reason is very attached to the > particular hanzi (kanji is Japanese :)) that are used for their name. > (Fam, correct me if I'm wrong). > > I've also seen people who have adopted an alternative English name and > still want to include the Chinese name somewhere, at which point it's > understandable that they write the latter with hanzi. > > Certainly it's better if you get something like these: > >Signed-off-by: Gong Li (巩俐)>Signed-off-by: 巩俐 (Gong Li) > > but if the email is understandable I have no problem with > >Signed-off-by: 巩俐 > > or in the case of an English name any of > >Signed-off-by: Jane Li (巩俐) >Signed-off-by: 巩俐 (Jane Li) >Signed-off-by: 巩俐 (Jane Li) > > where in the last case the email doesn't give the full Chinese name, but > there is an alternative for people who cannot read Chinese characters. > > For some reason this almost never happens with Japanese developers, but > https://en.wikipedia.org/wiki/Japanese_name#Difficulty_of_reading_names > suggests that they may also appreciate using kanji for their names in > commit messages. The problem for using Pinyin is, even I as a Chinese can't tell which is the last name out of a two-syllable name such as "Gong Li", because of obvious reasons. Having hanzi together with Pinyin helps a lot, in that it tells me how to call it, should we meet. Not just in "which is the family name", but also which tone to use. In Chinese, you're not calling one's name if not using the right tone. For example "zuozhi" can be "左之" or "祚至" or one of other 100 combinations, which vary in tones. Personally I don't feel very comfortable when a Mandarin speaker calls me with a wrong tone. And I know that will happen from time to time if I stick to Pinyin rather than Fam. Thinking from the other way round, that may be a reason why they use hanzi. All in all, in my opinion, only something like Signed-off-by: 巩俐 , from which you can't easily infer the pronunciation, is apparently inconsiderate in an English context. Otherwise I think we should tolerate the usage of non-latin characters even if it means we have to copy Fam
Re: [Qemu-devel] [PATCH v2 2/4] qjson: do not save/restore contexts
Paolo Bonziniwrites: > On 24/11/2015 13:44, Fam Zheng wrote: >> All in all, in my opinion, only something like >> >> Signed-off-by: 巩俐 >> >> , from which you can't easily infer the pronunciation, is apparently >> inconsiderate in an English context. Otherwise I think we should tolerate the >> usage of non-latin characters even if it means we have to copy > > FWIW, I totally agree. Me too.
Re: [Qemu-devel] [PATCH v2 2/4] qjson: do not save/restore contexts
On 11/23/15 19:09, Paolo Bonzini wrote: > > > On 23/11/2015 18:59, Laszlo Ersek wrote: The only interesting thing is that parse_keyword always eats a TOKEN_KEYWORD, even if it is invalid, so it must come last in parse_value (otherwise, NULL is returned, parse_literal is invoked and it tries to peek beyond end of input). This is caught by /errors/unterminated/literal, which actually checks for an unterminated keyword. ಠ_ಠ >> Is it accepted practice to put UTF-8 in commit messages? (Or, actually, >> anywhere in patches, except maybe the notes section?) >> >> I'd recommend o_O. > > Perhaps not Kannada, but Unicode 0xA1 to 0x1FF is pretty common in > commit messages. That seems to be a subset of the union of the following three: https://en.wikipedia.org/wiki/Latin-1_Supplement_%28Unicode_block%29 https://en.wikipedia.org/wiki/Latin_Extended-A https://en.wikipedia.org/wiki/Latin_Extended-B Thanks for the info, Laszlo
Re: [Qemu-devel] [PATCH v2 2/4] qjson: do not save/restore contexts
On 23/11/2015 18:59, Laszlo Ersek wrote: >> > The only interesting thing is that parse_keyword always eats >> > a TOKEN_KEYWORD, even if it is invalid, so it must come last in >> > parse_value (otherwise, NULL is returned, parse_literal is invoked >> > and it tries to peek beyond end of input). This is caught by >> > /errors/unterminated/literal, which actually checks for an unterminated >> > keyword. ಠ_ಠ > Is it accepted practice to put UTF-8 in commit messages? (Or, actually, > anywhere in patches, except maybe the notes section?) > > I'd recommend o_O. Perhaps not Kannada, but Unicode 0xA1 to 0x1FF is pretty common in commit messages. Paolo
Re: [Qemu-devel] [PATCH v2 2/4] qjson: do not save/restore contexts
On 11/23/2015 10:59 AM, Laszlo Ersek wrote: > On 11/23/15 18:44, Paolo Bonzini wrote: >> JSON is LL(1) and our parser indeed needs only 1 token lookahead. >> Saving the parser context is mostly unnecessary; we can replace it >> with peeking at the next token, or remove it altogether when the >> restore only happens on errors. The token list is destroyed anyway >> on errors. >> >> The only interesting thing is that parse_keyword always eats >> a TOKEN_KEYWORD, even if it is invalid, so it must come last in >> parse_value (otherwise, NULL is returned, parse_literal is invoked >> and it tries to peek beyond end of input). This is caught by >> /errors/unterminated/literal, which actually checks for an unterminated >> keyword. ಠ_ಠ > > Is it accepted practice to put UTF-8 in commit messages? (Or, actually, > anywhere in patches, except maybe the notes section?) > Git handles UTF-8 just fine (and for any other encoding, properly transmitted in the email, git transcodes to UTF-8 before writing it into the repository). -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH v2 2/4] qjson: do not save/restore contexts
On 11/23/15 21:05, Eric Blake wrote: > On 11/23/2015 10:59 AM, Laszlo Ersek wrote: >> On 11/23/15 18:44, Paolo Bonzini wrote: >>> JSON is LL(1) and our parser indeed needs only 1 token lookahead. >>> Saving the parser context is mostly unnecessary; we can replace it >>> with peeking at the next token, or remove it altogether when the >>> restore only happens on errors. The token list is destroyed anyway >>> on errors. >>> >>> The only interesting thing is that parse_keyword always eats >>> a TOKEN_KEYWORD, even if it is invalid, so it must come last in >>> parse_value (otherwise, NULL is returned, parse_literal is invoked >>> and it tries to peek beyond end of input). This is caught by >>> /errors/unterminated/literal, which actually checks for an unterminated >>> keyword. ಠ_ಠ >> >> Is it accepted practice to put UTF-8 in commit messages? (Or, actually, >> anywhere in patches, except maybe the notes section?) >> > > Git handles UTF-8 just fine (and for any other encoding, properly > transmitted in the email, git transcodes to UTF-8 before writing it into > the repository). > Yes, I know. I use latin2: $ locale LANG= LC_CTYPE=hu_HU.ISO8859-2 LC_NUMERIC="POSIX" LC_TIME="POSIX" LC_COLLATE="POSIX" LC_MONETARY="POSIX" LC_MESSAGES="POSIX" LC_PAPER="POSIX" LC_NAME="POSIX" LC_ADDRESS="POSIX" LC_TELEPHONE="POSIX" LC_MEASUREMENT="POSIX" LC_IDENTIFICATION="POSIX" LC_ALL= and from my git config: [i18n] logOutputEncoding = latin2 commitencoding = latin2 This works very well -- as long as it doesn't choke on something outside of latin2 --, both the glibc locale support and git are doing their jobs perfectly fine; my question concerned any other users who decided to stay with single-byte encodings (with an ASCII subset). (I believe that RFCs stick with ASCII to this day, and I also think that our source code and docs/ should stick with ASCII; but I know I can't plausibly argue for the same in commit messages, assuming I'm alone with that anyway. BTW I should have written "non-ASCII Unicode code points" in my original question, rather than "UTF-8".) Thanks! Laszlo
[Qemu-devel] [PATCH v2 2/4] qjson: do not save/restore contexts
JSON is LL(1) and our parser indeed needs only 1 token lookahead. Saving the parser context is mostly unnecessary; we can replace it with peeking at the next token, or remove it altogether when the restore only happens on errors. The token list is destroyed anyway on errors. The only interesting thing is that parse_keyword always eats a TOKEN_KEYWORD, even if it is invalid, so it must come last in parse_value (otherwise, NULL is returned, parse_literal is invoked and it tries to peek beyond end of input). This is caught by /errors/unterminated/literal, which actually checks for an unterminated keyword. ಠ_ಠ Signed-off-by: Paolo Bonzini--- qobject/json-parser.c | 59 ++- 1 file changed, 21 insertions(+), 38 deletions(-) diff --git a/qobject/json-parser.c b/qobject/json-parser.c index ac991ba..7a287ea 100644 --- a/qobject/json-parser.c +++ b/qobject/json-parser.c @@ -296,23 +296,6 @@ static QObject *parser_context_peek_token(JSONParserContext *ctxt) return token; } -static JSONParserContext parser_context_save(JSONParserContext *ctxt) -{ -JSONParserContext saved_ctxt = {0}; -saved_ctxt.tokens.pos = ctxt->tokens.pos; -saved_ctxt.tokens.count = ctxt->tokens.count; -saved_ctxt.tokens.buf = ctxt->tokens.buf; -return saved_ctxt; -} - -static void parser_context_restore(JSONParserContext *ctxt, - JSONParserContext saved_ctxt) -{ -ctxt->tokens.pos = saved_ctxt.tokens.pos; -ctxt->tokens.count = saved_ctxt.tokens.count; -ctxt->tokens.buf = saved_ctxt.tokens.buf; -} - static void tokens_append_from_iter(QObject *obj, void *opaque) { JSONParserContext *ctxt = opaque; @@ -364,7 +347,6 @@ static void parser_context_free(JSONParserContext *ctxt) static int parse_pair(JSONParserContext *ctxt, QDict *dict, va_list *ap) { QObject *key = NULL, *token = NULL, *value, *peek; -JSONParserContext saved_ctxt = parser_context_save(ctxt); peek = parser_context_peek_token(ctxt); if (peek == NULL) { @@ -402,7 +384,6 @@ static int parse_pair(JSONParserContext *ctxt, QDict *dict, va_list *ap) return 0; out: -parser_context_restore(ctxt, saved_ctxt); qobject_decref(key); return -1; @@ -412,9 +393,8 @@ static QObject *parse_object(JSONParserContext *ctxt, va_list *ap) { QDict *dict = NULL; QObject *token, *peek; -JSONParserContext saved_ctxt = parser_context_save(ctxt); -token = parser_context_pop_token(ctxt); +token = parser_context_peek_token(ctxt); if (token == NULL) { goto out; } @@ -425,6 +405,7 @@ static QObject *parse_object(JSONParserContext *ctxt, va_list *ap) dict = qdict_new(); +parser_context_pop_token(ctxt); peek = parser_context_peek_token(ctxt); if (peek == NULL) { parse_error(ctxt, NULL, "premature EOI"); @@ -465,7 +446,6 @@ static QObject *parse_object(JSONParserContext *ctxt, va_list *ap) return QOBJECT(dict); out: -parser_context_restore(ctxt, saved_ctxt); QDECREF(dict); return NULL; } @@ -474,9 +454,8 @@ static QObject *parse_array(JSONParserContext *ctxt, va_list *ap) { QList *list = NULL; QObject *token, *peek; -JSONParserContext saved_ctxt = parser_context_save(ctxt); -token = parser_context_pop_token(ctxt); +token = parser_context_peek_token(ctxt); if (token == NULL) { goto out; } @@ -487,6 +466,7 @@ static QObject *parse_array(JSONParserContext *ctxt, va_list *ap) list = qlist_new(); +parser_context_pop_token(ctxt); peek = parser_context_peek_token(ctxt); if (peek == NULL) { parse_error(ctxt, NULL, "premature EOI"); @@ -537,7 +517,6 @@ static QObject *parse_array(JSONParserContext *ctxt, va_list *ap) return QOBJECT(list); out: -parser_context_restore(ctxt, saved_ctxt); QDECREF(list); return NULL; } @@ -545,9 +524,8 @@ out: static QObject *parse_keyword(JSONParserContext *ctxt) { QObject *token, *ret; -JSONParserContext saved_ctxt = parser_context_save(ctxt); -token = parser_context_pop_token(ctxt); +token = parser_context_peek_token(ctxt); if (token == NULL) { goto out; } @@ -556,6 +534,7 @@ static QObject *parse_keyword(JSONParserContext *ctxt) goto out; } +parser_context_pop_token(ctxt); if (token_is_keyword(token, "true")) { ret = QOBJECT(qbool_from_bool(true)); } else if (token_is_keyword(token, "false")) { @@ -570,7 +549,6 @@ static QObject *parse_keyword(JSONParserContext *ctxt) return ret; out: -parser_context_restore(ctxt, saved_ctxt); return NULL; } @@ -578,17 +556,21 @@ out: static QObject *parse_escape(JSONParserContext *ctxt, va_list *ap) { QObject *token = NULL, *obj; -JSONParserContext saved_ctxt = parser_context_save(ctxt); if (ap == NULL) { goto out; } -token =
Re: [Qemu-devel] [PATCH v2 2/4] qjson: do not save/restore contexts
On 11/23/15 18:44, Paolo Bonzini wrote: > JSON is LL(1) and our parser indeed needs only 1 token lookahead. > Saving the parser context is mostly unnecessary; we can replace it > with peeking at the next token, or remove it altogether when the > restore only happens on errors. The token list is destroyed anyway > on errors. > > The only interesting thing is that parse_keyword always eats > a TOKEN_KEYWORD, even if it is invalid, so it must come last in > parse_value (otherwise, NULL is returned, parse_literal is invoked > and it tries to peek beyond end of input). This is caught by > /errors/unterminated/literal, which actually checks for an unterminated > keyword. ಠ_ಠ Is it accepted practice to put UTF-8 in commit messages? (Or, actually, anywhere in patches, except maybe the notes section?) I'd recommend o_O. Thanks Laszlo > > Signed-off-by: Paolo Bonzini> --- > qobject/json-parser.c | 59 > ++- > 1 file changed, 21 insertions(+), 38 deletions(-) > > diff --git a/qobject/json-parser.c b/qobject/json-parser.c > index ac991ba..7a287ea 100644 > --- a/qobject/json-parser.c > +++ b/qobject/json-parser.c > @@ -296,23 +296,6 @@ static QObject > *parser_context_peek_token(JSONParserContext *ctxt) > return token; > } > > -static JSONParserContext parser_context_save(JSONParserContext *ctxt) > -{ > -JSONParserContext saved_ctxt = {0}; > -saved_ctxt.tokens.pos = ctxt->tokens.pos; > -saved_ctxt.tokens.count = ctxt->tokens.count; > -saved_ctxt.tokens.buf = ctxt->tokens.buf; > -return saved_ctxt; > -} > - > -static void parser_context_restore(JSONParserContext *ctxt, > - JSONParserContext saved_ctxt) > -{ > -ctxt->tokens.pos = saved_ctxt.tokens.pos; > -ctxt->tokens.count = saved_ctxt.tokens.count; > -ctxt->tokens.buf = saved_ctxt.tokens.buf; > -} > - > static void tokens_append_from_iter(QObject *obj, void *opaque) > { > JSONParserContext *ctxt = opaque; > @@ -364,7 +347,6 @@ static void parser_context_free(JSONParserContext *ctxt) > static int parse_pair(JSONParserContext *ctxt, QDict *dict, va_list *ap) > { > QObject *key = NULL, *token = NULL, *value, *peek; > -JSONParserContext saved_ctxt = parser_context_save(ctxt); > > peek = parser_context_peek_token(ctxt); > if (peek == NULL) { > @@ -402,7 +384,6 @@ static int parse_pair(JSONParserContext *ctxt, QDict > *dict, va_list *ap) > return 0; > > out: > -parser_context_restore(ctxt, saved_ctxt); > qobject_decref(key); > > return -1; > @@ -412,9 +393,8 @@ static QObject *parse_object(JSONParserContext *ctxt, > va_list *ap) > { > QDict *dict = NULL; > QObject *token, *peek; > -JSONParserContext saved_ctxt = parser_context_save(ctxt); > > -token = parser_context_pop_token(ctxt); > +token = parser_context_peek_token(ctxt); > if (token == NULL) { > goto out; > } > @@ -425,6 +405,7 @@ static QObject *parse_object(JSONParserContext *ctxt, > va_list *ap) > > dict = qdict_new(); > > +parser_context_pop_token(ctxt); > peek = parser_context_peek_token(ctxt); > if (peek == NULL) { > parse_error(ctxt, NULL, "premature EOI"); > @@ -465,7 +446,6 @@ static QObject *parse_object(JSONParserContext *ctxt, > va_list *ap) > return QOBJECT(dict); > > out: > -parser_context_restore(ctxt, saved_ctxt); > QDECREF(dict); > return NULL; > } > @@ -474,9 +454,8 @@ static QObject *parse_array(JSONParserContext *ctxt, > va_list *ap) > { > QList *list = NULL; > QObject *token, *peek; > -JSONParserContext saved_ctxt = parser_context_save(ctxt); > > -token = parser_context_pop_token(ctxt); > +token = parser_context_peek_token(ctxt); > if (token == NULL) { > goto out; > } > @@ -487,6 +466,7 @@ static QObject *parse_array(JSONParserContext *ctxt, > va_list *ap) > > list = qlist_new(); > > +parser_context_pop_token(ctxt); > peek = parser_context_peek_token(ctxt); > if (peek == NULL) { > parse_error(ctxt, NULL, "premature EOI"); > @@ -537,7 +517,6 @@ static QObject *parse_array(JSONParserContext *ctxt, > va_list *ap) > return QOBJECT(list); > > out: > -parser_context_restore(ctxt, saved_ctxt); > QDECREF(list); > return NULL; > } > @@ -545,9 +524,8 @@ out: > static QObject *parse_keyword(JSONParserContext *ctxt) > { > QObject *token, *ret; > -JSONParserContext saved_ctxt = parser_context_save(ctxt); > > -token = parser_context_pop_token(ctxt); > +token = parser_context_peek_token(ctxt); > if (token == NULL) { > goto out; > } > @@ -556,6 +534,7 @@ static QObject *parse_keyword(JSONParserContext *ctxt) > goto out; > } > > +parser_context_pop_token(ctxt); > if (token_is_keyword(token, "true")) { > ret =