Hi ----- Original Message ----- > Marc-André Lureau <mlur...@redhat.com> writes: > > > Hi > > > > ----- Original Message ----- > >> Marc-André Lureau <marcandre.lur...@redhat.com> writes: > >> > >> > As the name suggests, the qapi2texi script converts JSON QAPI > >> > description into a texi file suitable for different target > >> > formats (info/man/txt/pdf/html...). > >> > > >> > It parses the following kind of blocks: > >> > > >> > Free-form: > >> > > >> > ## > >> > # = Section > >> > # == Subsection > >> > # > >> > # Some text foo with *emphasis* > >> > # 1. with a list > >> > # 2. like that > >> > # > >> > # And some code: > >> > # | $ echo foo > >> > # | -> do this > >> > # | <- get that > >> > # > >> > ## > >> > > >> > Symbol description: > >> > > >> > ## > >> > # @symbol: > >> > # > >> > # Symbol body ditto ergo sum. Foo bar > >> > # baz ding. > >> > # > >> > # @param1: the frob to frobnicate > >> > # @param2: #optional how hard to frobnicate > >> > # > >> > # Returns: the frobnicated frob. > >> > # If frob isn't frobnicatable, GenericError. > >> > # > >> > # Since: version > >> > # Notes: notes, comments can have > >> > # - itemized list > >> > # - like this > >> > # > >> > # Example: > >> > # > >> > # -> { "execute": "quit" } > >> > # <- { "return": {} } > >> > # > >> > ## > >> > > >> > That's roughly following the following EBNF grammar: > >> > > >> > api_comment = "##\n" comment "##\n" > >> > comment = freeform_comment | symbol_comment > >> > freeform_comment = { "# " text "\n" | "#\n" } > >> > symbol_comment = "# @" name ":\n" { member | tag_section | > >> > freeform_comment > >> > } > >> > member = "# @" name ':' [ text ] "\n" freeform_comment > >> > tag_section = "# " ( "Returns:", "Since:", "Note:", "Notes:", > >> > "Example:", > >> > "Examples:" ) [ text ] "\n" freeform_comment > >> > text = free text with markup > >> > > >> > Note that the grammar is ambiguous: a line "# @foo:\n" can be parsed > >> > both as freeform_comment and as symbol_comment. The actual parser > >> > recognizes symbol_comment. > >> > > >> > See docs/qapi-code-gen.txt for more details. > >> > > >> > Deficiencies: > >> > >> Perhaps "Deficiencies and limitations". > > > > ok > > > >> > >> > - the generated QMP documentation includes internal types > >> > - union-type support is lacking > >> > >> "union type" (no dash). > >> > >> > - type information is lacking in generated documentation > >> > - doc comment error message positions are imprecise, they point > >> > to the beginning of the comment. > >> > - see other TODO/FIXME in this commit > >> > >> Suggest: - a few minor issues, all marked TODO/FIXME in the code > >> > > > > ok > > > >> > > >> > Signed-off-by: Marc-André Lureau <marcandre.lur...@redhat.com> > >> > --- > >> [diffstat snipped...] > >> > diff --git a/scripts/qapi.py b/scripts/qapi.py > >> > index 3d5f9e1eaf..a92a86f428 100644 > >> > --- a/scripts/qapi.py > >> > +++ b/scripts/qapi.py > >> > @@ -125,6 +125,122 @@ class QAPISemError(QAPIError): > >> > info['parent'], msg) > >> > > >> > > >> > +class QAPIDoc(object): > >> > + class Section(object): > >> > + def __init__(self, name=None): > >> > + # optional section name (argument/member or section name) > >> > + self.name = name > >> > + # the list of lines for this section > >> > + self.content = [] > >> > + > >> > + def append(self, line): > >> > + self.content.append(line) > >> > + > >> > + def __repr__(self): > >> > + return "\n".join(self.content).strip() > >> > + > >> > + class ArgSection(Section): > >> > + pass > >> > + > >> > + def __init__(self, parser, info): > >> > + # self.parser is used to report errors with QAPIParseError. > >> > The > >> > + # resulting error position depends on the state of the parser. > >> > + # It happens to be the beginning of the comment. More or less > >> > + # servicable, but action at a distance. > >> > + self.parser = parser > >> > + self.info = info > >> > + self.symbol = None > >> > + self.body = QAPIDoc.Section() > >> > + # dict mapping parameter name to ArgSection > >> > + self.args = OrderedDict() > >> > + # a list of Section > >> > + self.sections = [] > >> > + # the current section > >> > + self.section = self.body > >> > + # associated expression (to be set by expression parser) > >> > + self.expr = None > >> > + > >> > + def has_section(self, name): > >> > + """Return True if we have a section with this name.""" > >> > + for i in self.sections: > >> > + if i.name == name: > >> > + return True > >> > + return False > >> > + > >> > + def append(self, line): > >> > + """Parse a comment line and add it to the documentation.""" > >> > + line = line[1:] > >> > + if not line: > >> > + self._append_freeform(line) > >> > + return > >> > + > >> > + if line[0] != ' ': > >> > + raise QAPIParseError(self.parser, "Missing space after #") > >> > + line = line[1:] > >> > + > >> > + # FIXME not nice: things like '# @foo:' and '# @foo: ' aren't > >> > + # recognized, and get silently treated as ordinary text > >> > + if self.symbol: > >> > + self._append_symbol_line(line) > >> > + elif not self.body.content and line.startswith("@"): > >> > + if not line.endswith(":"): > >> > + raise QAPIParseError(self.parser, "Line should end with > >> > :") > >> > + self.symbol = line[1:-1] > >> > + # FIXME invalid names other than the empty string aren't > >> > flagged > >> > + if not self.symbol: > >> > + raise QAPIParseError(self.parser, "Invalid name") > >> > + else: > >> > + self._append_freeform(line) > >> > + > >> > + def _append_symbol_line(self, line): > >> > + name = line.split(' ', 1)[0] > >> > + > >> > + if name.startswith("@") and name.endswith(":"): > >> > + line = line[len(name)+1:] > >> > + self._start_args_section(name[1:-1]) > >> > + elif name in ("Returns:", "Since:", > >> > + # those are often singular or plural > >> > + "Note:", "Notes:", > >> > + "Example:", "Examples:", > >> > + "TODO:"): > >> > + line = line[len(name)+1:] > >> > + self._start_section(name[:-1]) > >> > + > >> > + self._append_freeform(line) > >> > + > >> > + def _start_args_section(self, name): > >> > + # FIXME invalid names other than the empty string aren't > >> > flagged > >> > + if not name: > >> > + raise QAPIParseError(self.parser, "Invalid parameter name") > >> > + if name in self.args: > >> > + raise QAPIParseError(self.parser, > >> > + "'%s' parameter name duplicated" % > >> > name) > >> > + if self.sections: > >> > + raise QAPIParseError(self.parser, > >> > + "'%s' parameter documentation is > >> > interleaved " > >> > + "with other sections" % name) > >> > >> This error message is rather confusing. Ideally, we'd point to the the > >> source code that made us add to self.sections, but that's not in the > >> cards right now. Here's my best try: > >> > >> raise QAPIParseError(self.parser, > >> "'@%s:' can't follow '%s' section" > >> % (name, self.sections[0].name)) > >> > > > > works for me too > > > >> > + self.section = QAPIDoc.ArgSection(name) > >> > + self.args[name] = self.section > >> > + > >> > + def _start_section(self, name=""): > >> > >> Note: @name changed to default to "" since v6 for the benefit of > >> _append_freeform() below. > >> > >> > + if name in ("Returns", "Since") and self.has_section(name): > >> > + raise QAPIParseError(self.parser, > >> > + "Duplicated '%s' section" % name) > >> > + self.section = QAPIDoc.Section(name) > >> > + self.sections.append(self.section) > >> > + > >> > + def _append_freeform(self, line): > >> > + in_arg = isinstance(self.section, QAPIDoc.ArgSection) > >> > + if (in_arg and self.section.content and > >> > + not self.section.content[-1] and > >> > + line and not line[0].isspace()): > >> > >> PEP8 recommends breaking lines before operators in new code. > > > > ok > > > >> > >> > + self._start_section() > >> > >> Changed from v6's > >> > >> raise QAPIParseError(self.parser, "Invalid section > >> indentation") > >> > >> May well be an improvement (I'm certainly no fan of this error myself), > >> but it throws my review off the fast track: now I have to figure out how > >> the new code works. Can you give me a hint on what this change does? > > > > In v6, we rejected unindented sections after arguments (see > > tests/qapi-schema/doc-bad-section.json): > > > > # @arg: blah bla > > # zing bar > > # > > # This should be rejected > > > > And a patch in v6 reordered the json comments. But you asked for the > > doc to not be reordered without careful review (both source and > > output). So those additional paragraphs are added as untitled > > sections, and the doc generator keep section order. The output will be > > in the same order as the source. > > Thanks. > > >> Or should we take a shortcut and revert to v6 here? We can always > >> improve on top. > > > > Probably not needed to revert. > > I played with it a bit, and it looks like it's working fine. > > >> > + if (in_arg or not self.section.name or > >> > + not self.section.name.startswith("Example")): > >> > >> Again, break before the operator. > > > > ok > > > >> > >> > + line = line.strip() > >> > + self.section.append(line) > >> > + > >> > + > >> > class QAPISchemaParser(object): > >> > > >> > def __init__(self, fp, previously_included=[], incl_info=None): > >> > @@ -140,11 +256,17 @@ class QAPISchemaParser(object): > >> > self.line = 1 > >> > self.line_pos = 0 > >> > self.exprs = [] > >> > + self.docs = [] > >> > self.accept() > >> > > >> > while self.tok is not None: > >> > info = {'file': fname, 'line': self.line, > >> > 'parent': self.incl_info} > >> > + if self.tok == '#': > >> > + doc = self.get_doc(info) > >> > + self.docs.append(doc) > >> > + continue > >> > + > >> > expr = self.get_expr(False) > >> > if isinstance(expr, dict) and "include" in expr: > >> > if len(expr) != 1: > >> > @@ -162,6 +284,7 @@ class QAPISchemaParser(object): > >> > raise QAPISemError(info, "Inclusion loop for > >> > %s" > >> > % include) > >> > inf = inf['parent'] > >> > + > >> > # skip multiple include of the same file > >> > if incl_abs_fname in previously_included: > >> > continue > >> > @@ -172,12 +295,19 @@ class QAPISchemaParser(object): > >> > exprs_include = QAPISchemaParser(fobj, > >> > previously_included, > >> > info) > >> > self.exprs.extend(exprs_include.exprs) > >> > + self.docs.extend(exprs_include.docs) > >> > else: > >> > expr_elem = {'expr': expr, > >> > 'info': info} > >> > + if (self.docs and > >> > + self.docs[-1].info['file'] == fname and > >> > + not self.docs[-1].expr): > >> > >> Again, break before the operator. > > > > ok > > > >> > >> > + self.docs[-1].expr = expr > >> > + expr_elem['doc'] = self.docs[-1] > >> > + > >> > self.exprs.append(expr_elem) > >> > > >> > - def accept(self): > >> > + def accept(self, skip_comment=True): > >> > while True: > >> > self.tok = self.src[self.cursor] > >> > self.pos = self.cursor > >> > @@ -185,7 +315,13 @@ class QAPISchemaParser(object): > >> > self.val = None > >> > > >> > if self.tok == '#': > >> > + if self.src[self.cursor] == '#': > >> > + # Start of doc comment > >> > + skip_comment = False > >> > self.cursor = self.src.find('\n', self.cursor) > >> > + if not skip_comment: > >> > + self.val = self.src[self.pos:self.cursor] > >> > + return > >> > elif self.tok in "{}:,[]": > >> > return > >> > elif self.tok == "'": > >> > >> Copied from review of v3, so I don't forget: > >> > >> Comment tokens are thrown away as before, except when the parser asks > >> for them by passing skip_comment=False, or when the comment token starts > >> with ##. The parser asks while parsing a doc comment, in get_doc(). > >> > >> This is a backchannel from the parser to the lexer. I'd rather avoid > >> such lexer hacks, but I guess we can address that on top. > >> > >> A comment starting with ## inside an expression is now a syntax error. > >> For instance, input > >> > >> { > >> ## > >> > >> yields > >> > >> /dev/stdin:2:1: Expected string or "}" > >> > >> Rather unfriendly error message, but we can fix that on top. > >> > >> > @@ -319,6 +455,28 @@ class QAPISchemaParser(object): > >> > raise QAPIParseError(self, 'Expected "{", "[" or string') > >> > return expr > >> > > >> > + def get_doc(self, info): > >> > + if self.val != '##': > >> > + raise QAPIParseError(self, "Junk after '##' at start of " > >> > + "documentation comment") > >> > + > >> > + doc = QAPIDoc(self, info) > >> > + self.accept(False) > >> > + while self.tok == '#': > >> > + if self.val.startswith('##'): > >> > + # End of doc comment > >> > + if self.val != '##': > >> > + raise QAPIParseError(self, "Junk after '##' at end > >> > of > >> > " > >> > + "documentation comment") > >> > + self.accept() > >> > + return doc > >> > + else: > >> > + doc.append(self.val) > >> > + self.accept(False) > >> > + > >> > + raise QAPIParseError(self, "Documentation comment must end with > >> > '##'") > >> > + > >> > + > >> > # > >> > # Semantic analysis of schema expressions > >> > # TODO fold into QAPISchema > >> > @@ -703,6 +861,11 @@ def check_exprs(exprs): > >> > for expr_elem in exprs: > >> > expr = expr_elem['expr'] > >> > info = expr_elem['info'] > >> > + > >> > + if 'doc' not in expr_elem: > >> > + raise QAPISemError(info, > >> > + "Expression missing documentation > >> > comment") > >> > + > >> > if 'enum' in expr: > >> > check_keys(expr_elem, 'enum', ['data'], ['prefix']) > >> > add_enum(expr['enum'], info, expr['data']) > >> > @@ -761,6 +924,84 @@ def check_exprs(exprs): > >> > return exprs > >> > > >> > > >> > +def check_freeform_doc(doc): > >> > + if doc.symbol: > >> > + raise QAPISemError(doc.info, > >> > + "Documention for '%s' is not followed" > >> > + " by the definition" % doc.symbol) > >> > + > >> > + body = str(doc.body) > >> > + if re.search(r'@\S+:', body, re.MULTILINE): > >> > + raise QAPISemError(doc.info, > >> > + "Free-form documentation block must not > >> > contain" > >> > + " @NAME: sections") > >> > + > >> > + > >> > +def check_definition_doc(doc, expr, info): > >> > >> Lots of churn in this function. It can certainly use improvement, as > >> pointed out in prior reviews, but it triggers still more review. Based > >> on the tests, I guess it's for finding documented parameters that don't > >> actually exist, and to diagnose optional mismatch. Anything else? > >> > > > > implicitely adds #optional (since it's no longer in type info, it better be > > in the description), code simplification, message improvement too. > > > > I'd rather keep this too, and improve on top. > > Okay. Review below. > > >> Or should we take a shortcut and revert to v6 here? We can always > >> improve on top. I think v6 would need a # TODO Should ensure #optional > >> matches the schema. > >> > >> > + for i in ('enum', 'union', 'alternate', 'struct', 'command', > >> > 'event'): > >> > + if i in expr: > >> > + meta = i > >> > + break > >> > + > >> > + name = expr[meta] > >> > + if doc.symbol != name: > >> > + raise QAPISemError(info, "Definition of '%s' follows > >> > documentation" > >> > + " for '%s'" % (name, doc.symbol)) > >> > + if doc.has_section('Returns') and 'command' not in expr: > >> > + raise QAPISemError(info, "'Returns:' is only valid for > >> > commands") > >> > >> Copied from review of v5, so I don't forget: > >> > >> We accept 'Returns:' even when the command doesn't return anything, > >> because we currently use it to document errors, too. Can't say I like > >> that, but it's out of scope here. > >> > >> > + > >> > + if meta == 'union': > >> > + args = expr.get('base', []) > > Note: args is either string or dictionary. > > >> > + else: > >> > + args = expr.get('data', []) > > Note: args is either string (only if command or event), dictionary (only > if struct, alternate, command, event) or list (only if enum) > > >> > + if isinstance(args, dict): > >> > + args = args.keys() > > Now args is either string (only if command or event) or list. > > >> > + > >> > + if meta == 'alternate' or \ > >> > + (meta == 'union' and not expr.get('discriminator')): > > PEP8 prefers parenthesis over backslash, and recommends breaking lines > before instead of after binary operators:
ok > > if (meta == 'alternate' > or (meta == 'union' and not expr.get('discriminator'))): > > Note: args can only be list here. > > >> > + args.append('type') > > Works because we know it's a list. It's less than obvious, though. > > >> > + > >> > + if isinstance(args, list): > >> > + for arg in args: > >> > + if arg[0] == '*': > >> > + opt = True > >> > + desc = doc.args.get(arg[1:]) > >> > + else: > >> > + opt = False > >> > + desc = doc.args.get(arg) > >> > + if not desc: > >> > + continue > >> > + desc_opt = "#optional" in str(desc) > >> > + if desc_opt and not opt: > >> > + raise QAPISemError(info, "Description has #optional, " > >> > + "but the declaration doesn't") > >> > + if not desc_opt and opt: > >> > + # silently fix the doc > >> > + desc.append("#optional") > > #optional is redundant information. But as long as we use it, it better > be correct, to avoid misleading readers of the schema. Two sides of > correctness: it's there when arg is optional, and it's not there when > it's not. You enforce the latter, but you silently fix the former. I > figure you do that because #optional is currently missing in quite a few > places. > > Let's add > # TODO either fix the schema and make this an error, > # or drop #optional entirely > ok > >> > + > >> > + doc_args = set(doc.args.keys()) > >> > + args = set([name.strip('*') for name in args]) > > Now args morphs from string or dict to set. I'd use a new variable. > > If args is a string, it becomes a set of characters, which is not what > you want. For instance, qapi-schema.json's QCryptoBlockOpenOptions has > 'base': 'QCryptoBlockOptionsBase', and args changes from > 'QCryptoBlockOptionsBase' to set(['O', 'a', 'C', 'B', 'e', 'i', 'c', > 'k', 'l', 'o', 'n', 'Q', 'p', 's', 'r', 't', 'y']) here. > > >> > + if not doc_args.issubset(args): > >> > + raise QAPISemError(info, "The following documented members are > >> > not in " > >> > + "the declaration: %s" % ", ".join(doc_args - > >> > args)) > > If args is originally a string, then we happily accept documentation for > its characters. Quite unlikely to bite in practice, but it's wrong > anyways. > > I suspect you could save yourself some grief by returning early when > args is string, say: > > if meta == 'union': > args = expr.get('base', []) > else: > args = expr.get('data', []) > if isinstance(args, str): > return > if isinstance(args, dict): > args = args.keys() > assert isinstance(args, list) > > if (meta == 'alternate' > or (meta == 'union' and not expr.get('discriminator'))): > args.append('type') > > for arg in args: > ... ok > > >> Copied from review of v5, so I don't forget: > >> > >> As explained in review of v3, this is only a subset of the real set of > >> members. Computing the exact set is impractical when working with the > >> abstract syntax tree. I believe we'll eventually have to rewrite this > >> code to work with the QAPISchemaEntity instead. > >> > >> > + > >> > + > >> > +def check_docs(docs): > >> > + for doc in docs: > >> > + for section in doc.args.values() + doc.sections: > >> > + content = str(section) > >> > + if not content or content.isspace(): > >> > + raise QAPISemError(doc.info, > >> > + "Empty doc section '%s'" % > >> > section.name) > >> > + > >> > + if not doc.expr: > >> > + check_freeform_doc(doc) > >> > + else: > >> > + check_definition_doc(doc, doc.expr, doc.info) > >> > + > >> > + return docs > >> > + > >> > + > >> > # > >> > # Schema compiler frontend > >> > # > >> > @@ -1229,7 +1470,9 @@ class QAPISchemaEvent(QAPISchemaEntity): > >> > class QAPISchema(object): > >> > def __init__(self, fname): > >> > try: > >> > - self.exprs = check_exprs(QAPISchemaParser(open(fname, > >> > "r")).exprs) > >> > + parser = QAPISchemaParser(open(fname, "r")) > >> > + self.exprs = check_exprs(parser.exprs) > >> > + self.docs = check_docs(parser.docs) > >> > self._entity_dict = {} > >> > self._predefining = True > >> > self._def_predefineds() > >> > diff --git a/scripts/qapi2texi.py b/scripts/qapi2texi.py > >> > new file mode 100755 > >> > index 0000000000..436749ec7b > >> > --- /dev/null > >> > +++ b/scripts/qapi2texi.py > >> > @@ -0,0 +1,266 @@ > >> > +#!/usr/bin/env python > >> > +# QAPI texi generator > >> > +# > >> > +# This work is licensed under the terms of the GNU LGPL, version 2+. > >> > +# See the COPYING file in the top-level directory. > >> > +"""This script produces the documentation of a qapi schema in texinfo > >> > format""" > >> > +import re > >> > +import sys > >> > + > >> > +import qapi > >> > + > >> > +COMMAND_FMT = """ > >> > +@deftypefn {type} {{}} {name} > >> > + > >> > +{body} > >> > + > >> > +@end deftypefn > >> > + > >> > +""".format > >> > + > >> > +ENUM_FMT = """ > >> > +@deftp Enum {name} > >> > + > >> > +{body} > >> > + > >> > +@end deftp > >> > + > >> > +""".format > >> > + > >> > +STRUCT_FMT = """ > >> > +@deftp {{{type}}} {name} > >> > + > >> > +{body} > >> > + > >> > +@end deftp > >> > + > >> > +""".format > >> > + > >> > +EXAMPLE_FMT = """@example > >> > +{code} > >> > +@end example > >> > +""".format > >> > + > >> > + > >> > +def subst_strong(doc): > >> > + """Replaces *foo* by @strong{foo}""" > >> > + return re.sub(r'\*([^*\n]+)\*', r'@emph{\1}', doc) > >> > + > >> > + > >> > +def subst_emph(doc): > >> > + """Replaces _foo_ by @emph{foo}""" > >> > + return re.sub(r'\b_([^_\n]+)_\b', r' @emph{\1} ', doc) > >> > + > >> > + > >> > +def subst_vars(doc): > >> > + """Replaces @var by @code{var}""" > >> > + return re.sub(r'@([\w-]+)', r'@code{\1}', doc) > >> > + > >> > + > >> > +def subst_braces(doc): > >> > + """Replaces {} with @{ @}""" > >> > + return doc.replace("{", "@{").replace("}", "@}") > >> > + > >> > + > >> > +def texi_example(doc): > >> > + """Format @example""" > >> > + # TODO: Neglects to escape @ characters. > >> > + # We should probably escape them in subst_braces(), and rename the > >> > + # function to subst_special() or subs_texi_special(). If we do > >> > that, > >> > we > >> > + # need to delay it until after subst_vars() in texi_format(). > >> > + doc = subst_braces(doc).strip('\n') > >> > + return EXAMPLE_FMT(code=doc) > >> > + > >> > + > >> > +def texi_format(doc): > >> > + """ > >> > + Format documentation > >> > + > >> > + Lines starting with: > >> > + - |: generates an @example > >> > + - =: generates @section > >> > + - ==: generates @subsection > >> > + - 1. or 1): generates an @enumerate @item > >> > + - */-: generates an @itemize list > >> > + """ > >> > + lines = [] > >> > + doc = subst_braces(doc) > >> > + doc = subst_vars(doc) > >> > + doc = subst_emph(doc) > >> > + doc = subst_strong(doc) > >> > + inlist = "" > >> > + lastempty = False > >> > + for line in doc.split('\n'): > >> > + empty = line == "" > >> > + > >> > + # FIXME: Doing this in a single if / elif chain is > >> > + # problematic. For instance, a line without markup terminates > >> > + # a list if it follows a blank line (reaches the final elif), > >> > + # but a line with some *other* markup, such as a = title > >> > + # doesn't. > >> > + if line.startswith("| "): > >> > + line = EXAMPLE_FMT(code=line[2:]) > >> > + elif line.startswith("= "): > >> > + line = "@section " + line[2:] > >> > + elif line.startswith("== "): > >> > + line = "@subsection " + line[3:] > >> > + elif re.match("^([0-9]*[.]) ", line): > >> > >> [.] is an unusual way to say \. > > > > Right, forgot to change that when removing ) from v6. > > > >> > >> Perhaps we should consistently use r"..." for strings used as regular > >> expressions. > >> > > > > No idea, could be changed on top. > > Yes. But if you respin anyway, adding some r wouldn't hurt. ok > > >> > + if not inlist: > >> > + lines.append("@enumerate") > >> > + inlist = "enumerate" > >> > + line = line[line.find(" ")+1:] > >> > + lines.append("@item") > >> > + elif re.match("^[*-] ", line): > >> > + if not inlist: > >> > + lines.append("@itemize %s" % {'*': "@bullet", > >> > + '-': "@minus"}[line[0]]) > >> > + inlist = "itemize" > >> > + lines.append("@item") > >> > + line = line[2:] > >> > + elif lastempty and inlist: > >> > + lines.append("@end %s\n" % inlist) > >> > + inlist = "" > >> > + > >> > + lastempty = empty > >> > + lines.append(line) > >> > + > >> > + if inlist: > >> > + lines.append("@end %s\n" % inlist) > >> > + return "\n".join(lines) > >> > + > >> > + > >> > +def texi_body(doc): > >> > + """ > >> > + Format the body of a symbol documentation: > >> > + - main body > >> > + - table of arguments > >> > + - followed by "Returns/Notes/Since/Example" sections > >> > + """ > >> > + body = texi_format(str(doc.body)) + "\n" > >> > + if doc.args: > >> > + body += "@table @asis\n" > >> > + for arg, section in doc.args.iteritems(): > >> > + desc = str(section) > >> > + opt = '' > >> > + if "#optional" in desc: > >> > + desc = desc.replace("#optional", "") > >> > + opt = ' (optional)' > >> > + body += "@item @code{'%s'}%s\n%s\n" % (arg, opt, > >> > + texi_format(desc)) > >> > + body += "@end table\n" > >> > + > >> > + for section in doc.sections: > >> > + name, doc = (section.name, str(section)) > >> > + func = texi_format > >> > + if name.startswith("Example"): > >> > + func = texi_example > >> > + > >> > + if name: > >> > >> @quotation produces confusing .txt and .html output, as discussed in > >> review of v6. Either drop it, or record the problem in a comment, e.g. > >> > >> # FIXME the indentation produced by @quotation in .txt and > >> # .html output is confusing > >> > > > > added > > Thanks. > > >> > + body += "\n@quotation %s\n%s\n@end quotation" % \ > >> > + (name, func(doc)) > >> > + else: > >> > + body += func(doc) > >> > + > >> > + return body > >> > + > >> > + > >> > +def texi_alternate(expr, doc): > >> > + """Format an alternate to texi""" > >> > + body = texi_body(doc) > >> > + return STRUCT_FMT(type="Alternate", > >> > + name=doc.symbol, > >> > + body=body) > >> > + > >> > + > >> > +def texi_union(expr, doc): > >> > + """Format a union to texi""" > >> > + discriminator = expr.get("discriminator") > >> > + if discriminator: > >> > + union = "Flat Union" > >> > + else: > >> > + union = "Simple Union" > >> > >> Not sure flat vs. simple matters for users of generated documentation, > >> but let's not worry about that now. > >> > >> > + > >> > + body = texi_body(doc) > >> > + return STRUCT_FMT(type=union, > >> > + name=doc.symbol, > >> > + body=body) > >> > + > >> > + > >> > +def texi_enum(expr, doc): > >> > + """Format an enum to texi""" > >> > + for i in expr['data']: > >> > + if i not in doc.args: > >> > + doc.args[i] = '' > >> > + body = texi_body(doc) > >> > + return ENUM_FMT(name=doc.symbol, > >> > + body=body) > >> > + > >> > + > >> > +def texi_struct(expr, doc): > >> > + """Format a struct to texi""" > >> > + body = texi_body(doc) > >> > + return STRUCT_FMT(type="Struct", > >> > + name=doc.symbol, > >> > + body=body) > >> > + > >> > + > >> > +def texi_command(expr, doc): > >> > + """Format a command to texi""" > >> > + body = texi_body(doc) > >> > + return COMMAND_FMT(type="Command", > >> > + name=doc.symbol, > >> > + body=body) > >> > + > >> > + > >> > +def texi_event(expr, doc): > >> > + """Format an event to texi""" > >> > + body = texi_body(doc) > >> > + return COMMAND_FMT(type="Event", > >> > + name=doc.symbol, > >> > + body=body) > >> > + > >> > + > >> > +def texi_expr(expr, doc): > >> > + """Format an expr to texi""" > >> > + (kind, _) = expr.items()[0] > >> > + > >> > + fmt = {"command": texi_command, > >> > + "struct": texi_struct, > >> > + "enum": texi_enum, > >> > + "union": texi_union, > >> > + "alternate": texi_alternate, > >> > + "event": texi_event}[kind] > >> > + > >> > + return fmt(expr, doc) > >> > + > >> > + > >> > +def texi(docs): > >> > + """Convert QAPI schema expressions to texi documentation""" > >> > + res = [] > >> > + for doc in docs: > >> > + expr = doc.expr > >> > + if not expr: > >> > + res.append(texi_body(doc)) > >> > + continue > >> > + try: > >> > + doc = texi_expr(expr, doc) > >> > + res.append(doc) > >> > + except: > >> > + print >>sys.stderr, "error at @%s" % doc.info > >> > + raise > >> > + > >> > + return '\n'.join(res) > >> > + > >> > + > >> > +def main(argv): > >> > + """Takes schema argument, prints result to stdout""" > >> > + if len(argv) != 2: > >> > + print >>sys.stderr, "%s: need exactly 1 argument: SCHEMA" % > >> > argv[0] > >> > + sys.exit(1) > >> > + > >> > + schema = qapi.QAPISchema(argv[1]) > >> > + print texi(schema.docs) > >> > + > >> > + > >> > +if __name__ == "__main__": > >> > + main(sys.argv) > >> > diff --git a/docs/qapi-code-gen.txt b/docs/qapi-code-gen.txt > >> > index 2841c5144a..b29f996fdc 100644 > >> > --- a/docs/qapi-code-gen.txt > >> > +++ b/docs/qapi-code-gen.txt > >> > @@ -44,40 +44,148 @@ Input must be ASCII (although QMP supports full > >> > Unicode strings, the > >> > QAPI parser does not). At present, there is no place where a QAPI > >> > schema requires the use of JSON numbers or null. > >> > > >> > + > >> > +=== Comments === > >> > + > >> > Comments are allowed; anything between an unquoted # and the following > >> > -newline is ignored. Although there is not yet a documentation > >> > -generator, a form of stylized comments has developed for consistently > >> > -documenting details about an expression and when it was added to the > >> > -schema. The documentation is delimited between two lines of ##, then > >> > -the first line names the expression, an optional overview is provided, > >> > -then individual documentation about each member of 'data' is provided, > >> > -and finally, a 'Since: x.y.z' tag lists the release that introduced > >> > -the expression. Optional members are tagged with the phrase > >> > -'#optional', often with their default value; and extensions added > >> > -after the expression was first released are also given a '(since > >> > -x.y.z)' comment. For example: > >> > - > >> > - ## > >> > - # @BlockStats: > >> > - # > >> > - # Statistics of a virtual block device or a block backing device. > >> > - # > >> > - # @device: #optional If the stats are for a virtual block device, > >> > the > >> > name > >> > - # corresponding to the virtual block device. > >> > - # > >> > - # @stats: A @BlockDeviceStats for the device. > >> > - # > >> > - # @parent: #optional This describes the file block device if it has > >> > one. > >> > - # > >> > - # @backing: #optional This describes the backing block device if it > >> > has one. > >> > - # (Since 2.0) > >> > - # > >> > - # Since: 0.14.0 > >> > - ## > >> > - { 'struct': 'BlockStats', > >> > - 'data': {'*device': 'str', 'stats': 'BlockDeviceStats', > >> > - '*parent': 'BlockStats', > >> > - '*backing': 'BlockStats'} } > >> > +newline is ignored. > >> > + > >> > +A multi-line comment that starts and ends with a '##' line is a > >> > +documentation comment. These are parsed by the documentation > >> > +generator, which recognizes certain markup detailed below. > >> > + > >> > + > >> > +==== Documentation markup ==== > >> > + > >> > +Comment text starting with '=' is a section title: > >> > + > >> > + # = Section title > >> > + > >> > +Double the '=' for a subsection title: > >> > + > >> > + # == Subection title > >> > + > >> > +'|' denotes examples: > >> > + > >> > + # | Text of the example, may span > >> > + # | multiple lines > >> > + > >> > +'*' starts an itemized list: > >> > + > >> > + # * First item, may span > >> > + # multiple lines > >> > + # * Second item > >> > + > >> > +You can also use '-' instead of '*'. > >> > + > >> > +A decimal number followed by '.' starts a numbered list: > >> > + > >> > + # 1. First item, may span > >> > + # multiple lines > > Should the second line be indented? ok > > >> > + # 2. Second item > >> > + > >> > +The actual number doesn't matter. You could even use '*' instead of > >> > +'2.' for the second item. > >> > + > >> > +FIXME what exactly ends a list > >> > >> Can you say offhand what ends a list? If yes, can we resolve this FIXME > >> now rather than later? > > > > well, the code is too permissive, there is a FIXME there already. > > > > In general, I think we want to end a list after a blank line and a > > "non-item" line. Is the wording ok? > > Blank line is pretty restrictive, because list items consisting of > multiple paragraphs aren't possible then. > > I think writing something like this is fairly natural: > > * First item, may span > multiple lines. > > Even multiple paragaphs are okay. > > * Second item > > - This is a sublist > - Second item of sublist > > * Third item > > This sentence is not part of the list. > > We don't support sublists, and I'm not saying we should, I'm just > showing how they would fit in. > > My point is that indentation is a natural way to delimit list items. > > I don't expect you to implement this now. All I want for this series is > reasonable documentation. As far as I can tell, texi_format() ends the > list at a blank line, except when it doesn't (see its FIXME). > > What about this: replace "FIXME what exactly ends a list" by > > Lists can't be nested. Blank lines are currently not supported > within lists. > > and add to texi_format()'s FIXME something like > > Make sure to update section "Documentation markup" in > docs/qapi-code-gen.txt when fixing this. > > Okay? ok > > >> > + > >> > +Additional whitespace between the initial '#' and the comment text is > >> > +permitted. > >> > + > >> > +*foo* and _foo_ are for strong and emphasis styles respectively (they > >> > +do not work over multiple lines). @foo is used to reference a name in > >> > +the schema. > >> > + > >> > +Example: > >> > + > >> > +## > >> > +# = Section > >> > +# == Subsection > >> > +# > >> > +# Some text foo with *strong* and _emphasis_ > >> > +# 1. with a list > >> > +# 2. like that > >> > +# > >> > +# And some code: > >> > +# | $ echo foo > >> > +# | -> do this > >> > +# | <- get that > >> > +# > >> > +## > >> > + > >> > + > >> > +==== Expression documentation ==== > >> > + > >> > +Each expression that isn't an include directive must be preceded by a > >> > +documentation block. Such blocks are called expression documentation > >> > +blocks. > >> > + > >> > +The first line of the documentation names the expression, then the > >> > +documentation body is provided, then individual documentation about > >> > +each member of 'data' is provided. Finally, several tagged sections > >> > +can be added. > >> > >> In my review of v6, I wrote: > >> > >> I'm afraid this is more aspiration than specification: the parser > >> accepts these things in almost any order. > >> > >> Could be filed under "deficiencies" for now. > >> > >> To file under "deficiencies", we need to add a FIXME or TODO either > >> here, or in the parser. Assuming you haven't tightened the parser > >> meanwhile (hope you haven't; we need to control the churn). > > > > ok > > > >> > >> Member of 'data' won't remain accurate. Consider: > >> > >> { 'union': 'GlusterServer', > >> 'base': { 'type': 'GlusterTransport' }, > >> 'discriminator': 'type', > >> 'data': { 'unix': 'UnixSocketAddress', > >> 'tcp': 'InetSocketAddress' } } > >> > >> Its doc comment currently doesn't contain argument sections. It > >> should > >> eventually contain @type:, @unix: and @tcp:. Only the latter two are > >> members of 'data'. > >> > >> I should propose something better, but I'm getting perilously close to > >> the christmas break already. Later. > >> > >> Also: passive voice is meh (not your idea; you adapted the existing > >> text). > >> > >> What about: > >> > >> The documentation block consists of a first line naming the > >> expression, an optional overview, a description of each argument (for > >> commands and events) or member (for structs, unions and alternates), > >> and optional tagged sections. > >> > >> I still don't like the use of "section" for both the = things and the > >> tagged sections, but it's not important enough to spend time on it now. > >> > > > > ok > > > >> > + > >> > +Optional members are tagged with the phrase '#optional', often with > >> > >> If we adopt my text, we should say "arguments / members" here. > >> > > > > ok > > > >> > +their default value; and extensions added after the expression was > >> > +first released are also given a '(since x.y.z)' comment. > >> > + > >> > +A tagged section starts with one of the following words: > >> > +"Note:"/"Notes:", "Since:", "Example"/"Examples", "Returns:", "TODO:". > >> > >> What ends such a section? The start of the next one / end of the > >> comment block? > >> > > > > yes > > > >> I suspect we could avoid PATCH 03 by recognizing "TODO" in addition to > >> "TODO:". Might also be more robust. But let's ignore this now. > >> > >> > + > >> > +A 'Since: x.y.z' tag lists the release that introduced the expression. > >> > >> Nitpick: 'Since: x.y.z' is a tagged section, not a tag. > >> > > > > ok > > > >> > + > >> > +For example: > >> > + > >> > +## > >> > +# @BlockStats: > >> > +# > >> > +# Statistics of a virtual block device or a block backing device. > >> > +# > >> > +# @device: #optional If the stats are for a virtual block device, the > >> > name > >> > +# corresponding to the virtual block device. > >> > +# > >> > +# @node-name: #optional The node name of the device. (since 2.3) > >> > +# > >> > +# ... more members ... > >> > +# > >> > +# Since: 0.14.0 > >> > +## > >> > +{ 'struct': 'BlockStats', > >> > + 'data': {'*device': 'str', '*node-name': 'str', > >> > + ... more members ... } } > >> > + > >> > +## > >> > +# @query-blockstats: > >> > +# > >> > +# Query the @BlockStats for all virtual block devices. > >> > +# > >> > +# @query-nodes: #optional If true, the command will query all the > >> > +# block nodes ... explain, explain ... (since 2.3) > >> > +# > >> > +# Returns: A list of @BlockStats for each virtual block devices. > >> > +# > >> > +# Since: 0.14.0 > >> > +# > >> > +# Example: > >> > +# > >> > +# -> { "execute": "query-blockstats" } > >> > +# <- { > >> > +# ... lots of output ... > >> > +# } > >> > +# > >> > +## > >> > +{ 'command': 'query-blockstats', > >> > + 'data': { '*query-nodes': 'bool' }, > >> > + 'returns': ['BlockStats'] } > >> > + > >> > +==== Free-form documentation ==== > >> > + > >> > +A documentation block that isn't an expression documentation block is > >> > +a free-form documentation block. These may be used to provide > >> > +additional text and structuring content. > >> > + > >> > + > >> > +=== Schema overview === > >> > > >> > The schema sets up a series of types, as well as commands and events > >> > that will use those types. Forward references are allowed: the parser > >> [Tests look okay, snipped...] > >> >