Re: [Qemu-devel] [PATCH v2 01/11] qapi: add qapi2texi script
Hi On Thu, Nov 3, 2016 at 5:46 PM Markus Armbrusterwrote: > Marc-André Lureau writes: > > > If we want to have a very strict grammar, there is even more work needed > to > > cleanup the documentation. I don't fancy doing that work, it gives only > low > > benefits. > > Since I have pretty specific ideas on how I want this part of your much > larger work done, I think it's only fair if I give it a try myself: > propose a grammar, create a parser to show it works, clean up existing > comments to comply with it. > > Thank you, I gladly accept the help :) However, could you wait until the next iteration? I am trying to improve the parsing and simplifying a bit the code (although probably not reaching your goal) -- Marc-André Lureau
Re: [Qemu-devel] [PATCH v2 01/11] qapi: add qapi2texi script
Marc-André Lureauwrites: > Hi > > On Fri, Oct 28, 2016 at 7:45 PM Markus Armbruster wrote: > >> Marc-André Lureau writes: >> >> > As the name suggests, the qapi2texi script converts JSON QAPI >> > description into a standalone texi file suitable for different target >> > formats. >> > >> > It parses the following kind of blocks with some little variations: >> > >> > ## >> > # = Section >> > # == Subsection >> > # >> > # Some text foo with *emphasis* >> > # 1. with a list >> > # 2. like that >> > # >> > # And some code: >> > # | $ echo foo >> > # | <- do this >> > # | -> get that >> > # >> > ## >> > >> > ## >> > # @symbol >> > # >> > # Symbol body ditto ergo sum. Foo bar >> > # baz ding. >> > # >> > # @arg: foo >> > # @arg: #optional foo >> > # >> > # Returns: returns bla bla >> > # >> > # Or bla blah >> > # >> > # Since: version >> > # Notes: notes, comments can have >> > #- itemized list >> > #- like this >> > # >> > #and continue... >> > # >> > # Example: >> > # >> > # <- { "execute": "quit" } >> > # -> { "return": {} } >> > # >> > ## >> > >> > Thanks to the json declaration, it's able to give extra information >> > about the type of arguments and return value expected. >> > >> > Signed-off-by: Marc-André Lureau >> > --- >> > scripts/qapi.py| 100 +++- >> > scripts/qapi2texi.py | 314 >> + >> > docs/qapi-code-gen.txt | 44 +-- >> > 3 files changed, 446 insertions(+), 12 deletions(-) >> > create mode 100755 scripts/qapi2texi.py >> >> My review will be easier to follow if you skip ahead qapi-code-gen.txt, >> then go to class QAPISchemaParser, then come back here. >> >> > diff --git a/scripts/qapi.py b/scripts/qapi.py >> > index 21bc32f..4efc7e7 100644 >> > --- a/scripts/qapi.py >> > +++ b/scripts/qapi.py >> > @@ -122,6 +122,79 @@ class QAPIExprError(Exception): >> > "%s:%d: %s" % (self.info['file'], self.info['line'], >> self.msg) >> > >> > >> > +class QAPIDoc: >> >> Old-style class. Make this >> >>class QAPIDoc(object): >> > > ok > > >> >> > +def __init__(self, comment): >> > +self.symbol = None >> > +self.comment = "" # the main symbol comment >> >> PEP8 wants at least two spaces before #, and I prefer the # in colum 32. >> > > let's just remove that comment > > >> >> > +self.args = OrderedDict() >> > +# meta is for Since:, Notes:, Examples:, Returns:... >> > +self.meta = OrderedDict() >> > +# the current section to populate, array of [dict, key, >> > comment...] >> > +self.section = None >> > + >> > +for line in comment.split('\n'): >> >> Makes me wonder whether the caller should pass a list of lines instead >> of a string that needs to be split into lines. >> >> > +# remove multiple spaces >> > +sline = ' '.join(line.split()) >> >> Actually, this doesn't "remove multiple spaces", it squeezes sequences >> of whitespace into a single space character. Rather inefficiently, I >> suspect, but let's not worry about that now. >> >> > +# take the first word out >> > +split = sline.split(' ', 1) >> > +key = split[0] >> > + >> > +if key.startswith("@"): >> > +key = key[1:].rstrip(':') >> >> Treats the colon as optional. I understand you're trying to be >> unintrusive, but since we're parsing comments already, we can make the >> parser do double duty and enforce consistent comment formatting. But >> this isn't the place to discuss the format we want, review of >> qapi-code-gen.txt is. >> > > Indeed, I was being permissive: my initial plan was rather to have > something quickly last year. It's simply comments, it does not have to be > rigorous as code to me. Now that we are taking >1.5y to get it in, we may > as well enforce more stuff from the start. My aim is to avoid silent garbage-in, garbage out in the long run, where "garbage" input looks innocent at least to casual reviewers, and to assist developers in writing non-garbage input. A quick solution that doesn't accomplish that may still get us closer. Or it may be an avoidable detour. Depends. >Obviously that kind of feedback > would have been nice earlier. You're absolutely right. >(needless to say my motivation is lower than > when I started) Who wouldn't be frustrated by such delays! You're the victim of unlucky timing and perhaps questionable judgement calls. Your work arrived in the middle of the huge effort to merge Eric's work. For better or worse, we decided to stay focused on the task that was in progress. Maybe we should've paused at least for some high level review, and to see whether we
Re: [Qemu-devel] [PATCH v2 01/11] qapi: add qapi2texi script
Hi On Fri, Oct 28, 2016 at 7:45 PM Markus Armbrusterwrote: > Marc-André Lureau writes: > > > As the name suggests, the qapi2texi script converts JSON QAPI > > description into a standalone texi file suitable for different target > > formats. > > > > It parses the following kind of blocks with some little variations: > > > > ## > > # = Section > > # == Subsection > > # > > # Some text foo with *emphasis* > > # 1. with a list > > # 2. like that > > # > > # And some code: > > # | $ echo foo > > # | <- do this > > # | -> get that > > # > > ## > > > > ## > > # @symbol > > # > > # Symbol body ditto ergo sum. Foo bar > > # baz ding. > > # > > # @arg: foo > > # @arg: #optional foo > > # > > # Returns: returns bla bla > > # > > # Or bla blah > > # > > # Since: version > > # Notes: notes, comments can have > > #- itemized list > > #- like this > > # > > #and continue... > > # > > # Example: > > # > > # <- { "execute": "quit" } > > # -> { "return": {} } > > # > > ## > > > > Thanks to the json declaration, it's able to give extra information > > about the type of arguments and return value expected. > > > > Signed-off-by: Marc-André Lureau > > --- > > scripts/qapi.py| 100 +++- > > scripts/qapi2texi.py | 314 > + > > docs/qapi-code-gen.txt | 44 +-- > > 3 files changed, 446 insertions(+), 12 deletions(-) > > create mode 100755 scripts/qapi2texi.py > > My review will be easier to follow if you skip ahead qapi-code-gen.txt, > then go to class QAPISchemaParser, then come back here. > > > diff --git a/scripts/qapi.py b/scripts/qapi.py > > index 21bc32f..4efc7e7 100644 > > --- a/scripts/qapi.py > > +++ b/scripts/qapi.py > > @@ -122,6 +122,79 @@ class QAPIExprError(Exception): > > "%s:%d: %s" % (self.info['file'], self.info['line'], > self.msg) > > > > > > +class QAPIDoc: > > Old-style class. Make this > >class QAPIDoc(object): > ok > > > +def __init__(self, comment): > > +self.symbol = None > > +self.comment = "" # the main symbol comment > > PEP8 wants at least two spaces before #, and I prefer the # in colum 32. > let's just remove that comment > > > +self.args = OrderedDict() > > +# meta is for Since:, Notes:, Examples:, Returns:... > > +self.meta = OrderedDict() > > +# the current section to populate, array of [dict, key, > comment...] > > +self.section = None > > + > > +for line in comment.split('\n'): > > Makes me wonder whether the caller should pass a list of lines instead > of a string that needs to be split into lines. > > > +# remove multiple spaces > > +sline = ' '.join(line.split()) > > Actually, this doesn't "remove multiple spaces", it squeezes sequences > of whitespace into a single space character. Rather inefficiently, I > suspect, but let's not worry about that now. > > > +# take the first word out > > +split = sline.split(' ', 1) > > +key = split[0] > > + > > +if key.startswith("@"): > > +key = key[1:].rstrip(':') > > Treats the colon as optional. I understand you're trying to be > unintrusive, but since we're parsing comments already, we can make the > parser do double duty and enforce consistent comment formatting. But > this isn't the place to discuss the format we want, review of > qapi-code-gen.txt is. > Indeed, I was being permissive: my initial plan was rather to have something quickly last year. It's simply comments, it does not have to be rigorous as code to me. Now that we are taking >1.5y to get it in, we may as well enforce more stuff from the start. Obviously that kind of feedback would have been nice earlier. (needless to say my motivation is lower than when I started) > > > +sline = split[1] if len(split) > 1 else "" > > +if self.symbol is None: > > +# the first is the section symbol > > +self.symbol = key > > Let's not say "the section symbol". It's the symbol this APIDoc object > documents. The APIDoc object has sections. > ok > > > +else: > > +# else an arg > > +self.start_section(self.args, key) > > +elif self.symbol and \ > > +key in ("Returns:", > > +# special case for Since often used without: > > +"Since:", "Since", > > +# those are often singular or plural > > +"Note:", "Notes:", > > +"Example:", "Examples:"): > > +sline = split[1] if len(split) > 1 else "" > > +line = None > > +#
Re: [Qemu-devel] [PATCH v2 01/11] qapi: add qapi2texi script
On 10/28/2016 11:44 AM, Markus Armbruster wrote: > Marc-André Lureauwrites: > >> As the name suggests, the qapi2texi script converts JSON QAPI >> description into a standalone texi file suitable for different target >> formats. >> >> It parses the following kind of blocks with some little variations: >> >> ## >> # = Section >> # == Subsection >> # >> # Some text foo with *emphasis* >> # 1. with a list >> # 2. like that >> # >> # And some code: >> # | $ echo foo >> # | <- do this >> # | -> get that Backwards. Pre-patch, qmp-commands.txt states: > Also, the following notation is used to denote data flow: > > -> data issued by the Client > <- Server data response so you want -> do this <- get that >> # >> ## How much effort is it to get rid of the "little variations" and flag documentation that does not match consistent patterns? A tighter parser that requires specific format, but also diagnoses where comments don't meet that format, is probably going to be easier to maintain in the long run (fewer special cases) even if it is more painful up front (more tweaks to bring things into a consistent layout). >> >> ## >> # @symbol >> # >> # Symbol body ditto ergo sum. Foo bar >> # baz ding. >> # >> # @arg: foo >> # @arg: #optional foo >> # >> # Returns: returns bla bla >> # >> # Or bla blah >> # >> # Since: version >> # Notes: notes, comments can have >> #- itemized list >> #- like this >> # >> #and continue... >> # >> # Example: >> # >> # <- { "execute": "quit" } >> # -> { "return": {} } Again, backwards. >> # >> ## >> >> Thanks to the json declaration, it's able to give extra information >> about the type of arguments and return value expected. >> >> Signed-off-by: Marc-André Lureau >> --- >> scripts/qapi.py| 100 +++- >> scripts/qapi2texi.py | 314 >> + >> docs/qapi-code-gen.txt | 44 +-- >> 3 files changed, 446 insertions(+), 12 deletions(-) >> create mode 100755 scripts/qapi2texi.py > > My review will be easier to follow if you skip ahead qapi-code-gen.txt, > then go to class QAPISchemaParser, then come back here. Commenting on just a few points: >> +if key.startswith("@"): >> +key = key[1:].rstrip(':') > > Treats the colon as optional. I understand you're trying to be > unintrusive, but since we're parsing comments already, we can make the > parser do double duty and enforce consistent comment formatting. But > this isn't the place to discuss the format we want, review of > qapi-code-gen.txt is. Makes sense - since we are writing a parser, we can make it be strict and pick one approved format, rather than allowing two separate formats; the parser will point out what needs to be fixed, and we actually make those fixes then rearrange the series so the fixes come first. > > Taken together: > > * The QAPI parser is hacked to accumulate and exfiltrate some comments > to the QAPIDoc parser. I haven't closely looked at the code yet, just the review comments at this point. During the QAPI parse of the .json file, are we passing the entire ##...## comment (of which there should be at most one per top-level element of the .json file), and then letting QAPIDoc further parse that comment block? Or are you trying to combine both levels of parsing into the same parser object? > General remarks: > > * I feel the only way to keep the API documentation in order as we > develop is to have the parser reject malformed API comments. Agreed. Forcing docs to be well-written up front will pay benefits in ease of maintenance and uniformity of style (it's a lot easier to copy-and-paste a working style if that is the only style to copy from), once we have converted existing documentation to be well-formed. > > * The only way to create a robust and maintainable parser is to specify > the language *first* and *rigorously*. > > * Stacking parsers is not a good idea. Figuring out one parser's > accepted language is hard enough. We do have the nice aspect that there is (should be) at most one ##...## comment body per QAPI (pseudo-JSON) object, so the QAPI parse just needs to find that comment block and associate it with everything else it parsed. Further parsing the documentation comment into sections for output, as well as double-checking sanity aspects (such as no missing or extra parameters) may require more coordination between the parsed QAPI object and the code that parses the comments. >> +++ b/docs/qapi-code-gen.txt >> @@ -45,16 +45,13 @@ QAPI parser does not). At present, there is no place >> where a QAPI >> schema requires the use of JSON numbers or null. >> >> Comments are allowed; anything between an unquoted # and the following >> -newline is ignored. Although there is not yet a documentation >>
Re: [Qemu-devel] [PATCH v2 01/11] qapi: add qapi2texi script
Marc-André Lureauwrites: > As the name suggests, the qapi2texi script converts JSON QAPI > description into a standalone texi file suitable for different target > formats. > > It parses the following kind of blocks with some little variations: > > ## > # = Section > # == Subsection > # > # Some text foo with *emphasis* > # 1. with a list > # 2. like that > # > # And some code: > # | $ echo foo > # | <- do this > # | -> get that > # > ## > > ## > # @symbol > # > # Symbol body ditto ergo sum. Foo bar > # baz ding. > # > # @arg: foo > # @arg: #optional foo > # > # Returns: returns bla bla > # > # Or bla blah > # > # Since: version > # Notes: notes, comments can have > #- itemized list > #- like this > # > #and continue... > # > # Example: > # > # <- { "execute": "quit" } > # -> { "return": {} } > # > ## > > Thanks to the json declaration, it's able to give extra information > about the type of arguments and return value expected. > > Signed-off-by: Marc-André Lureau > --- > scripts/qapi.py| 100 +++- > scripts/qapi2texi.py | 314 > + > docs/qapi-code-gen.txt | 44 +-- > 3 files changed, 446 insertions(+), 12 deletions(-) > create mode 100755 scripts/qapi2texi.py My review will be easier to follow if you skip ahead qapi-code-gen.txt, then go to class QAPISchemaParser, then come back here. > diff --git a/scripts/qapi.py b/scripts/qapi.py > index 21bc32f..4efc7e7 100644 > --- a/scripts/qapi.py > +++ b/scripts/qapi.py > @@ -122,6 +122,79 @@ class QAPIExprError(Exception): > "%s:%d: %s" % (self.info['file'], self.info['line'], self.msg) > > > +class QAPIDoc: Old-style class. Make this class QAPIDoc(object): > +def __init__(self, comment): > +self.symbol = None > +self.comment = "" # the main symbol comment PEP8 wants at least two spaces before #, and I prefer the # in colum 32. > +self.args = OrderedDict() > +# meta is for Since:, Notes:, Examples:, Returns:... > +self.meta = OrderedDict() > +# the current section to populate, array of [dict, key, comment...] > +self.section = None > + > +for line in comment.split('\n'): Makes me wonder whether the caller should pass a list of lines instead of a string that needs to be split into lines. > +# remove multiple spaces > +sline = ' '.join(line.split()) Actually, this doesn't "remove multiple spaces", it squeezes sequences of whitespace into a single space character. Rather inefficiently, I suspect, but let's not worry about that now. > +# take the first word out > +split = sline.split(' ', 1) > +key = split[0] > + > +if key.startswith("@"): > +key = key[1:].rstrip(':') Treats the colon as optional. I understand you're trying to be unintrusive, but since we're parsing comments already, we can make the parser do double duty and enforce consistent comment formatting. But this isn't the place to discuss the format we want, review of qapi-code-gen.txt is. > +sline = split[1] if len(split) > 1 else "" > +if self.symbol is None: > +# the first is the section symbol > +self.symbol = key Let's not say "the section symbol". It's the symbol this APIDoc object documents. The APIDoc object has sections. > +else: > +# else an arg > +self.start_section(self.args, key) > +elif self.symbol and \ > +key in ("Returns:", > +# special case for Since often used without: > +"Since:", "Since", > +# those are often singular or plural > +"Note:", "Notes:", > +"Example:", "Examples:"): > +sline = split[1] if len(split) > 1 else "" > +line = None > +# new "meta" section > +self.start_section(self.meta, key.rstrip(':')) > + > +if self.section and self.section[1].startswith("Example"): > +# example is verbatim > +self.append_comment(line) > +else: > +self.append_comment(sline) Sections other than "Example" and "Examples" have whitespace squeezed. I believe this is the true reason for the squeezing. There are a few other places that rely on it, but they could do just as well without. > + > +self.end_section() > + > +def append_comment(self, line): > +"""Adds a comment to the current section, or the symbol comment""" > +if line is None: > +return > +if self.section is not None: >
[Qemu-devel] [PATCH v2 01/11] qapi: add qapi2texi script
As the name suggests, the qapi2texi script converts JSON QAPI description into a standalone texi file suitable for different target formats. It parses the following kind of blocks with some little variations: ## # = Section # == Subsection # # Some text foo with *emphasis* # 1. with a list # 2. like that # # And some code: # | $ echo foo # | <- do this # | -> get that # ## ## # @symbol # # Symbol body ditto ergo sum. Foo bar # baz ding. # # @arg: foo # @arg: #optional foo # # Returns: returns bla bla # # Or bla blah # # Since: version # Notes: notes, comments can have #- itemized list #- like this # #and continue... # # Example: # # <- { "execute": "quit" } # -> { "return": {} } # ## Thanks to the json declaration, it's able to give extra information about the type of arguments and return value expected. Signed-off-by: Marc-André Lureau--- scripts/qapi.py| 100 +++- scripts/qapi2texi.py | 314 + docs/qapi-code-gen.txt | 44 +-- 3 files changed, 446 insertions(+), 12 deletions(-) create mode 100755 scripts/qapi2texi.py diff --git a/scripts/qapi.py b/scripts/qapi.py index 21bc32f..4efc7e7 100644 --- a/scripts/qapi.py +++ b/scripts/qapi.py @@ -122,6 +122,79 @@ class QAPIExprError(Exception): "%s:%d: %s" % (self.info['file'], self.info['line'], self.msg) +class QAPIDoc: +def __init__(self, comment): +self.symbol = None +self.comment = "" # the main symbol comment +self.args = OrderedDict() +# meta is for Since:, Notes:, Examples:, Returns:... +self.meta = OrderedDict() +# the current section to populate, array of [dict, key, comment...] +self.section = None + +for line in comment.split('\n'): +# remove multiple spaces +sline = ' '.join(line.split()) +# take the first word out +split = sline.split(' ', 1) +key = split[0] + +if key.startswith("@"): +key = key[1:].rstrip(':') +sline = split[1] if len(split) > 1 else "" +if self.symbol is None: +# the first is the section symbol +self.symbol = key +else: +# else an arg +self.start_section(self.args, key) +elif self.symbol and \ +key in ("Returns:", +# special case for Since often used without: +"Since:", "Since", +# those are often singular or plural +"Note:", "Notes:", +"Example:", "Examples:"): +sline = split[1] if len(split) > 1 else "" +line = None +# new "meta" section +self.start_section(self.meta, key.rstrip(':')) + +if self.section and self.section[1].startswith("Example"): +# example is verbatim +self.append_comment(line) +else: +self.append_comment(sline) + +self.end_section() + +def append_comment(self, line): +"""Adds a comment to the current section, or the symbol comment""" +if line is None: +return +if self.section is not None: +if self.section[-1] == "" and line == "": +self.end_section() +else: +self.section.append(line) +elif self.comment == "": +self.comment = line +else: +self.comment += "\n" + line + +def end_section(self): +if self.section is not None: +dic = self.section[0] +key = self.section[1] +doc = "\n".join(self.section[2:]) +dic[key] = doc +self.section = None + +def start_section(self, dic, key): +self.end_section() +self.section = [dic, key] # .. remaining elems will be the doc + + class QAPISchemaParser(object): def __init__(self, fp, previously_included=[], incl_info=None): @@ -137,11 +210,14 @@ class QAPISchemaParser(object): self.line = 1 self.line_pos = 0 self.exprs = [] +self.comment = None +self.apidoc = incl_info['doc'] if incl_info else [] self.accept() while self.tok is not None: expr_info = {'file': fname, 'line': self.line, - 'parent': self.incl_info} + 'parent': self.incl_info, 'doc': self.apidoc} +self.apidoc = [] expr = self.get_expr(False) if isinstance(expr, dict) and "include" in expr: if len(expr) != 1: @@ -162,6 +238,8 @@ class QAPISchemaParser(object):