Re: [Qemu-devel] [PATCH v7 15/21] qapi: add qapi2texi script

Marc-André Lureau Thu, 12 Jan 2017 09:47:15 -0800

Hi

----- Original Message -----
> Marc-André Lureau <mlur...@redhat.com> writes:
> 
> > Hi
> >
> > ----- Original Message -----
> >> Marc-André Lureau <marcandre.lur...@redhat.com> writes:
> >> 
> >> > As the name suggests, the qapi2texi script converts JSON QAPI
> >> > description into a texi file suitable for different target
> >> > formats (info/man/txt/pdf/html...).
> >> >
> >> > It parses the following kind of blocks:
> >> >
> >> > Free-form:
> >> >
> >> >   ##
> >> >   # = Section
> >> >   # == Subsection
> >> >   #
> >> >   # Some text foo with *emphasis*
> >> >   # 1. with a list
> >> >   # 2. like that
> >> >   #
> >> >   # And some code:
> >> >   # | $ echo foo
> >> >   # | -> do this
> >> >   # | <- get that
> >> >   #
> >> >   ##
> >> >
> >> > Symbol description:
> >> >
> >> >   ##
> >> >   # @symbol:
> >> >   #
> >> >   # Symbol body ditto ergo sum. Foo bar
> >> >   # baz ding.
> >> >   #
> >> >   # @param1: the frob to frobnicate
> >> >   # @param2: #optional how hard to frobnicate
> >> >   #
> >> >   # Returns: the frobnicated frob.
> >> >   #          If frob isn't frobnicatable, GenericError.
> >> >   #
> >> >   # Since: version
> >> >   # Notes: notes, comments can have
> >> >   #        - itemized list
> >> >   #        - like this
> >> >   #
> >> >   # Example:
> >> >   #
> >> >   # -> { "execute": "quit" }
> >> >   # <- { "return": {} }
> >> >   #
> >> >   ##
> >> >
> >> > That's roughly following the following EBNF grammar:
> >> >
> >> > api_comment = "##\n" comment "##\n"
> >> > comment = freeform_comment | symbol_comment
> >> > freeform_comment = { "# " text "\n" | "#\n" }
> >> > symbol_comment = "# @" name ":\n" { member | tag_section |
> >> > freeform_comment
> >> > }
> >> > member = "# @" name ':' [ text ] "\n" freeform_comment
> >> > tag_section = "# " ( "Returns:", "Since:", "Note:", "Notes:",
> >> > "Example:",
> >> > "Examples:" ) [ text ]  "\n" freeform_comment
> >> > text = free text with markup
> >> >
> >> > Note that the grammar is ambiguous: a line "# @foo:\n" can be parsed
> >> > both as freeform_comment and as symbol_comment.  The actual parser
> >> > recognizes symbol_comment.
> >> >
> >> > See docs/qapi-code-gen.txt for more details.
> >> >
> >> > Deficiencies:
> >> 
> >> Perhaps "Deficiencies and limitations".
> >
> > ok
> >
> >> 
> >> > - the generated QMP documentation includes internal types
> >> > - union-type support is lacking
> >> 
> >> "union type" (no dash).
> >> 
> >> > - type information is lacking in generated documentation
> >> > - doc comment error message positions are imprecise, they point
> >> >   to the beginning of the comment.
> >> > - see other TODO/FIXME in this commit
> >> 
> >> Suggest: - a few minor issues, all marked TODO/FIXME in the code
> >> 
> >
> > ok
> >
> >> >
> >> > Signed-off-by: Marc-André Lureau <marcandre.lur...@redhat.com>
> >> > ---
> >> [diffstat snipped...]
> >> > diff --git a/scripts/qapi.py b/scripts/qapi.py
> >> > index 3d5f9e1eaf..a92a86f428 100644
> >> > --- a/scripts/qapi.py
> >> > +++ b/scripts/qapi.py
> >> > @@ -125,6 +125,122 @@ class QAPISemError(QAPIError):
> >> >                             info['parent'], msg)
> >> >  
> >> >  
> >> > +class QAPIDoc(object):
> >> > +    class Section(object):
> >> > +        def __init__(self, name=None):
> >> > +            # optional section name (argument/member or section name)
> >> > +            self.name = name
> >> > +            # the list of lines for this section
> >> > +            self.content = []
> >> > +
> >> > +        def append(self, line):
> >> > +            self.content.append(line)
> >> > +
> >> > +        def __repr__(self):
> >> > +            return "\n".join(self.content).strip()
> >> > +
> >> > +    class ArgSection(Section):
> >> > +        pass
> >> > +
> >> > +    def __init__(self, parser, info):
> >> > +        # self.parser is used to report errors with QAPIParseError.
> >> > The
> >> > +        # resulting error position depends on the state of the parser.
> >> > +        # It happens to be the beginning of the comment.  More or less
> >> > +        # servicable, but action at a distance.
> >> > +        self.parser = parser
> >> > +        self.info = info
> >> > +        self.symbol = None
> >> > +        self.body = QAPIDoc.Section()
> >> > +        # dict mapping parameter name to ArgSection
> >> > +        self.args = OrderedDict()
> >> > +        # a list of Section
> >> > +        self.sections = []
> >> > +        # the current section
> >> > +        self.section = self.body
> >> > +        # associated expression (to be set by expression parser)
> >> > +        self.expr = None
> >> > +
> >> > +    def has_section(self, name):
> >> > +        """Return True if we have a section with this name."""
> >> > +        for i in self.sections:
> >> > +            if i.name == name:
> >> > +                return True
> >> > +        return False
> >> > +
> >> > +    def append(self, line):
> >> > +        """Parse a comment line and add it to the documentation."""
> >> > +        line = line[1:]
> >> > +        if not line:
> >> > +            self._append_freeform(line)
> >> > +            return
> >> > +
> >> > +        if line[0] != ' ':
> >> > +            raise QAPIParseError(self.parser, "Missing space after #")
> >> > +        line = line[1:]
> >> > +
> >> > +        # FIXME not nice: things like '#  @foo:' and '# @foo: ' aren't
> >> > +        # recognized, and get silently treated as ordinary text
> >> > +        if self.symbol:
> >> > +            self._append_symbol_line(line)
> >> > +        elif not self.body.content and line.startswith("@"):
> >> > +            if not line.endswith(":"):
> >> > +                raise QAPIParseError(self.parser, "Line should end with
> >> > :")
> >> > +            self.symbol = line[1:-1]
> >> > +            # FIXME invalid names other than the empty string aren't
> >> > flagged
> >> > +            if not self.symbol:
> >> > +                raise QAPIParseError(self.parser, "Invalid name")
> >> > +        else:
> >> > +            self._append_freeform(line)
> >> > +
> >> > +    def _append_symbol_line(self, line):
> >> > +        name = line.split(' ', 1)[0]
> >> > +
> >> > +        if name.startswith("@") and name.endswith(":"):
> >> > +            line = line[len(name)+1:]
> >> > +            self._start_args_section(name[1:-1])
> >> > +        elif name in ("Returns:", "Since:",
> >> > +                      # those are often singular or plural
> >> > +                      "Note:", "Notes:",
> >> > +                      "Example:", "Examples:",
> >> > +                      "TODO:"):
> >> > +            line = line[len(name)+1:]
> >> > +            self._start_section(name[:-1])
> >> > +
> >> > +        self._append_freeform(line)
> >> > +
> >> > +    def _start_args_section(self, name):
> >> > +        # FIXME invalid names other than the empty string aren't
> >> > flagged
> >> > +        if not name:
> >> > +            raise QAPIParseError(self.parser, "Invalid parameter name")
> >> > +        if name in self.args:
> >> > +            raise QAPIParseError(self.parser,
> >> > +                                 "'%s' parameter name duplicated" %
> >> > name)
> >> > +        if self.sections:
> >> > +            raise QAPIParseError(self.parser,
> >> > +                                 "'%s' parameter documentation is
> >> > interleaved "
> >> > +                                 "with other sections" % name)
> >> 
> >> This error message is rather confusing.  Ideally, we'd point to the the
> >> source code that made us add to self.sections, but that's not in the
> >> cards right now.  Here's my best try:
> >> 
> >>                raise QAPIParseError(self.parser,
> >>                                     "'@%s:' can't follow '%s' section"
> >>                                     % (name, self.sections[0].name))
> >> 
> >
> > works for me too
> >
> >> > +        self.section = QAPIDoc.ArgSection(name)
> >> > +        self.args[name] = self.section
> >> > +
> >> > +    def _start_section(self, name=""):
> >> 
> >> Note: @name changed to default to "" since v6 for the benefit of
> >> _append_freeform() below.
> >> 
> >> > +        if name in ("Returns", "Since") and self.has_section(name):
> >> > +            raise QAPIParseError(self.parser,
> >> > +                                 "Duplicated '%s' section" % name)
> >> > +        self.section = QAPIDoc.Section(name)
> >> > +        self.sections.append(self.section)
> >> > +
> >> > +    def _append_freeform(self, line):
> >> > +        in_arg = isinstance(self.section, QAPIDoc.ArgSection)
> >> > +        if (in_arg and self.section.content and
> >> > +                not self.section.content[-1] and
> >> > +                line and not line[0].isspace()):
> >> 
> >> PEP8 recommends breaking lines before operators in new code.
> >
> > ok
> >
> >> 
> >> > +            self._start_section()
> >> 
> >> Changed from v6's
> >> 
> >>                raise QAPIParseError(self.parser, "Invalid section
> >>                indentation")
> >> 
> >> May well be an improvement (I'm certainly no fan of this error myself),
> >> but it throws my review off the fast track: now I have to figure out how
> >> the new code works.  Can you give me a hint on what this change does?
> >
> > In v6, we rejected unindented sections after arguments (see
> > tests/qapi-schema/doc-bad-section.json):
> >
> > # @arg: blah bla
> > #       zing bar
> > #
> > # This should be rejected
> >
> > And a patch in v6 reordered the json comments. But you asked for the
> > doc to not be reordered without careful review (both source and
> > output). So those additional paragraphs are added as untitled
> > sections, and the doc generator keep section order. The output will be
> > in the same order as the source.
> 
> Thanks.
> 
> >> Or should we take a shortcut and revert to v6 here?  We can always
> >> improve on top.
> >
> > Probably not needed to revert.
> 
> I played with it a bit, and it looks like it's working fine.
> 
> >> > +        if (in_arg or not self.section.name or
> >> > +                not self.section.name.startswith("Example")):
> >> 
> >> Again, break before the operator.
> >
> > ok
> >
> >> 
> >> > +            line = line.strip()
> >> > +        self.section.append(line)
> >> > +
> >> > +
> >> >  class QAPISchemaParser(object):
> >> >  
> >> >      def __init__(self, fp, previously_included=[], incl_info=None):
> >> > @@ -140,11 +256,17 @@ class QAPISchemaParser(object):
> >> >          self.line = 1
> >> >          self.line_pos = 0
> >> >          self.exprs = []
> >> > +        self.docs = []
> >> >          self.accept()
> >> >  
> >> >          while self.tok is not None:
> >> >              info = {'file': fname, 'line': self.line,
> >> >                      'parent': self.incl_info}
> >> > +            if self.tok == '#':
> >> > +                doc = self.get_doc(info)
> >> > +                self.docs.append(doc)
> >> > +                continue
> >> > +
> >> >              expr = self.get_expr(False)
> >> >              if isinstance(expr, dict) and "include" in expr:
> >> >                  if len(expr) != 1:
> >> > @@ -162,6 +284,7 @@ class QAPISchemaParser(object):
> >> >                          raise QAPISemError(info, "Inclusion loop for
> >> >                          %s"
> >> >                                             % include)
> >> >                      inf = inf['parent']
> >> > +
> >> >                  # skip multiple include of the same file
> >> >                  if incl_abs_fname in previously_included:
> >> >                      continue
> >> > @@ -172,12 +295,19 @@ class QAPISchemaParser(object):
> >> >                  exprs_include = QAPISchemaParser(fobj,
> >> >                  previously_included,
> >> >                                                   info)
> >> >                  self.exprs.extend(exprs_include.exprs)
> >> > +                self.docs.extend(exprs_include.docs)
> >> >              else:
> >> >                  expr_elem = {'expr': expr,
> >> >                               'info': info}
> >> > +                if (self.docs and
> >> > +                        self.docs[-1].info['file'] == fname and
> >> > +                        not self.docs[-1].expr):
> >> 
> >> Again, break before the operator.
> >
> > ok
> >
> >> 
> >> > +                    self.docs[-1].expr = expr
> >> > +                    expr_elem['doc'] = self.docs[-1]
> >> > +
> >> >                  self.exprs.append(expr_elem)
> >> >  
> >> > -    def accept(self):
> >> > +    def accept(self, skip_comment=True):
> >> >          while True:
> >> >              self.tok = self.src[self.cursor]
> >> >              self.pos = self.cursor
> >> > @@ -185,7 +315,13 @@ class QAPISchemaParser(object):
> >> >              self.val = None
> >> >  
> >> >              if self.tok == '#':
> >> > +                if self.src[self.cursor] == '#':
> >> > +                    # Start of doc comment
> >> > +                    skip_comment = False
> >> >                  self.cursor = self.src.find('\n', self.cursor)
> >> > +                if not skip_comment:
> >> > +                    self.val = self.src[self.pos:self.cursor]
> >> > +                    return
> >> >              elif self.tok in "{}:,[]":
> >> >                  return
> >> >              elif self.tok == "'":
> >> 
> >> Copied from review of v3, so I don't forget:
> >> 
> >> Comment tokens are thrown away as before, except when the parser asks
> >> for them by passing skip_comment=False, or when the comment token starts
> >> with ##.  The parser asks while parsing a doc comment, in get_doc().
> >> 
> >> This is a backchannel from the parser to the lexer.  I'd rather avoid
> >> such lexer hacks, but I guess we can address that on top.
> >> 
> >> A comment starting with ## inside an expression is now a syntax error.
> >> For instance, input
> >> 
> >>     {
> >>     ##
> >> 
> >> yields
> >> 
> >>     /dev/stdin:2:1: Expected string or "}"
> >> 
> >> Rather unfriendly error message, but we can fix that on top.
> >> 
> >> > @@ -319,6 +455,28 @@ class QAPISchemaParser(object):
> >> >              raise QAPIParseError(self, 'Expected "{", "[" or string')
> >> >          return expr
> >> >  
> >> > +    def get_doc(self, info):
> >> > +        if self.val != '##':
> >> > +            raise QAPIParseError(self, "Junk after '##' at start of "
> >> > +                                 "documentation comment")
> >> > +
> >> > +        doc = QAPIDoc(self, info)
> >> > +        self.accept(False)
> >> > +        while self.tok == '#':
> >> > +            if self.val.startswith('##'):
> >> > +                # End of doc comment
> >> > +                if self.val != '##':
> >> > +                    raise QAPIParseError(self, "Junk after '##' at end
> >> > of
> >> > "
> >> > +                                         "documentation comment")
> >> > +                self.accept()
> >> > +                return doc
> >> > +            else:
> >> > +                doc.append(self.val)
> >> > +            self.accept(False)
> >> > +
> >> > +        raise QAPIParseError(self, "Documentation comment must end with
> >> > '##'")
> >> > +
> >> > +
> >> >  #
> >> >  # Semantic analysis of schema expressions
> >> >  # TODO fold into QAPISchema
> >> > @@ -703,6 +861,11 @@ def check_exprs(exprs):
> >> >      for expr_elem in exprs:
> >> >          expr = expr_elem['expr']
> >> >          info = expr_elem['info']
> >> > +
> >> > +        if 'doc' not in expr_elem:
> >> > +            raise QAPISemError(info,
> >> > +                               "Expression missing documentation
> >> > comment")
> >> > +
> >> >          if 'enum' in expr:
> >> >              check_keys(expr_elem, 'enum', ['data'], ['prefix'])
> >> >              add_enum(expr['enum'], info, expr['data'])
> >> > @@ -761,6 +924,84 @@ def check_exprs(exprs):
> >> >      return exprs
> >> >  
> >> >  
> >> > +def check_freeform_doc(doc):
> >> > +    if doc.symbol:
> >> > +        raise QAPISemError(doc.info,
> >> > +                           "Documention for '%s' is not followed"
> >> > +                           " by the definition" % doc.symbol)
> >> > +
> >> > +    body = str(doc.body)
> >> > +    if re.search(r'@\S+:', body, re.MULTILINE):
> >> > +        raise QAPISemError(doc.info,
> >> > +                           "Free-form documentation block must not
> >> > contain"
> >> > +                           " @NAME: sections")
> >> > +
> >> > +
> >> > +def check_definition_doc(doc, expr, info):
> >> 
> >> Lots of churn in this function.  It can certainly use improvement, as
> >> pointed out in prior reviews, but it triggers still more review.  Based
> >> on the tests, I guess it's for finding documented parameters that don't
> >> actually exist, and to diagnose optional mismatch.  Anything else?
> >> 
> >
> > implicitely adds #optional (since it's no longer in type info, it better be
> > in the description), code simplification, message improvement too.
> >
> > I'd rather keep this too, and improve on top.
> 
> Okay.  Review below.
> 
> >> Or should we take a shortcut and revert to v6 here?  We can always
> >> improve on top.  I think v6 would need a # TODO Should ensure #optional
> >> matches the schema.
> >> 
> >> > +    for i in ('enum', 'union', 'alternate', 'struct', 'command',
> >> > 'event'):
> >> > +        if i in expr:
> >> > +            meta = i
> >> > +            break
> >> > +
> >> > +    name = expr[meta]
> >> > +    if doc.symbol != name:
> >> > +        raise QAPISemError(info, "Definition of '%s' follows
> >> > documentation"
> >> > +                           " for '%s'" % (name, doc.symbol))
> >> > +    if doc.has_section('Returns') and 'command' not in expr:
> >> > +        raise QAPISemError(info, "'Returns:' is only valid for
> >> > commands")
> >> 
> >> Copied from review of v5, so I don't forget:
> >> 
> >> We accept 'Returns:' even when the command doesn't return anything,
> >> because we currently use it to document errors, too.  Can't say I like
> >> that, but it's out of scope here.
> >> 
> >> > +
> >> > +    if meta == 'union':
> >> > +        args = expr.get('base', [])
> 
> Note: args is either string or dictionary.
> 
> >> > +    else:
> >> > +        args = expr.get('data', [])
> 
> Note: args is either string (only if command or event), dictionary (only
> if struct, alternate, command, event) or list (only if enum)
> 
> >> > +    if isinstance(args, dict):
> >> > +        args = args.keys()
> 
> Now args is either string (only if command or event) or list.
> 
> >> > +
> >> > +    if meta == 'alternate' or \
> >> > +       (meta == 'union' and not expr.get('discriminator')):
> 
> PEP8 prefers parenthesis over backslash, and recommends breaking lines
> before instead of after binary operators:


ok

> 
>           if (meta == 'alternate'
>               or (meta == 'union' and not expr.get('discriminator'))):
> 
> Note: args can only be list here.
> 
> >> > +        args.append('type')
> 
> Works because we know it's a list.  It's less than obvious, though.
> 
> >> > +
> >> > +    if isinstance(args, list):
> >> > +        for arg in args:
> >> > +            if arg[0] == '*':
> >> > +                opt = True
> >> > +                desc = doc.args.get(arg[1:])
> >> > +            else:
> >> > +                opt = False
> >> > +                desc = doc.args.get(arg)
> >> > +            if not desc:
> >> > +                continue
> >> > +            desc_opt = "#optional" in str(desc)
> >> > +            if desc_opt and not opt:
> >> > +                raise QAPISemError(info, "Description has #optional, "
> >> > +                                   "but the declaration doesn't")
> >> > +            if not desc_opt and opt:
> >> > +                # silently fix the doc
> >> > +                desc.append("#optional")
> 
> #optional is redundant information.  But as long as we use it, it better
> be correct, to avoid misleading readers of the schema.  Two sides of
> correctness: it's there when arg is optional, and it's not there when
> it's not.  You enforce the latter, but you silently fix the former.  I
> figure you do that because #optional is currently missing in quite a few
> places.
> 
> Let's add
>                       # TODO either fix the schema and make this an error,
>                       # or drop #optional entirely
> 

ok

> >> > +
> >> > +    doc_args = set(doc.args.keys())
> >> > +    args = set([name.strip('*') for name in args])
> 
> Now args morphs from string or dict to set.  I'd use a new variable.
> 
> If args is a string, it becomes a set of characters, which is not what
> you want.  For instance, qapi-schema.json's QCryptoBlockOpenOptions has
> 'base': 'QCryptoBlockOptionsBase', and args changes from
> 'QCryptoBlockOptionsBase' to set(['O', 'a', 'C', 'B', 'e', 'i', 'c',
> 'k', 'l', 'o', 'n', 'Q', 'p', 's', 'r', 't', 'y']) here.
> 
> >> > +    if not doc_args.issubset(args):
> >> > +        raise QAPISemError(info, "The following documented members are
> >> > not in "
> >> > +                           "the declaration: %s" % ", ".join(doc_args -
> >> > args))
> 
> If args is originally a string, then we happily accept documentation for
> its characters.  Quite unlikely to bite in practice, but it's wrong
> anyways.
> 
> I suspect you could save yourself some grief by returning early when
> args is string, say:
> 
>     if meta == 'union':
>         args = expr.get('base', [])
>     else:
>         args = expr.get('data', [])
>     if isinstance(args, str):
>         return
>     if isinstance(args, dict):
>         args = args.keys()
>     assert isinstance(args, list)
> 
>     if (meta == 'alternate'
>             or (meta == 'union' and not expr.get('discriminator'))):
>         args.append('type')
> 
>     for arg in args:
>         ...

ok

> 
> >> Copied from review of v5, so I don't forget:
> >> 
> >> As explained in review of v3, this is only a subset of the real set of
> >> members.  Computing the exact set is impractical when working with the
> >> abstract syntax tree.  I believe we'll eventually have to rewrite this
> >> code to work with the QAPISchemaEntity instead.
> >> 
> >> > +
> >> > +
> >> > +def check_docs(docs):
> >> > +    for doc in docs:
> >> > +        for section in doc.args.values() + doc.sections:
> >> > +            content = str(section)
> >> > +            if not content or content.isspace():
> >> > +                raise QAPISemError(doc.info,
> >> > +                                   "Empty doc section '%s'" %
> >> > section.name)
> >> > +
> >> > +        if not doc.expr:
> >> > +            check_freeform_doc(doc)
> >> > +        else:
> >> > +            check_definition_doc(doc, doc.expr, doc.info)
> >> > +
> >> > +    return docs
> >> > +
> >> > +
> >> >  #
> >> >  # Schema compiler frontend
> >> >  #
> >> > @@ -1229,7 +1470,9 @@ class QAPISchemaEvent(QAPISchemaEntity):
> >> >  class QAPISchema(object):
> >> >      def __init__(self, fname):
> >> >          try:
> >> > -            self.exprs = check_exprs(QAPISchemaParser(open(fname,
> >> > "r")).exprs)
> >> > +            parser = QAPISchemaParser(open(fname, "r"))
> >> > +            self.exprs = check_exprs(parser.exprs)
> >> > +            self.docs = check_docs(parser.docs)
> >> >              self._entity_dict = {}
> >> >              self._predefining = True
> >> >              self._def_predefineds()
> >> > diff --git a/scripts/qapi2texi.py b/scripts/qapi2texi.py
> >> > new file mode 100755
> >> > index 0000000000..436749ec7b
> >> > --- /dev/null
> >> > +++ b/scripts/qapi2texi.py
> >> > @@ -0,0 +1,266 @@
> >> > +#!/usr/bin/env python
> >> > +# QAPI texi generator
> >> > +#
> >> > +# This work is licensed under the terms of the GNU LGPL, version 2+.
> >> > +# See the COPYING file in the top-level directory.
> >> > +"""This script produces the documentation of a qapi schema in texinfo
> >> > format"""
> >> > +import re
> >> > +import sys
> >> > +
> >> > +import qapi
> >> > +
> >> > +COMMAND_FMT = """
> >> > +@deftypefn {type} {{}} {name}
> >> > +
> >> > +{body}
> >> > +
> >> > +@end deftypefn
> >> > +
> >> > +""".format
> >> > +
> >> > +ENUM_FMT = """
> >> > +@deftp Enum {name}
> >> > +
> >> > +{body}
> >> > +
> >> > +@end deftp
> >> > +
> >> > +""".format
> >> > +
> >> > +STRUCT_FMT = """
> >> > +@deftp {{{type}}} {name}
> >> > +
> >> > +{body}
> >> > +
> >> > +@end deftp
> >> > +
> >> > +""".format
> >> > +
> >> > +EXAMPLE_FMT = """@example
> >> > +{code}
> >> > +@end example
> >> > +""".format
> >> > +
> >> > +
> >> > +def subst_strong(doc):
> >> > +    """Replaces *foo* by @strong{foo}"""
> >> > +    return re.sub(r'\*([^*\n]+)\*', r'@emph{\1}', doc)
> >> > +
> >> > +
> >> > +def subst_emph(doc):
> >> > +    """Replaces _foo_ by @emph{foo}"""
> >> > +    return re.sub(r'\b_([^_\n]+)_\b', r' @emph{\1} ', doc)
> >> > +
> >> > +
> >> > +def subst_vars(doc):
> >> > +    """Replaces @var by @code{var}"""
> >> > +    return re.sub(r'@([\w-]+)', r'@code{\1}', doc)
> >> > +
> >> > +
> >> > +def subst_braces(doc):
> >> > +    """Replaces {} with @{ @}"""
> >> > +    return doc.replace("{", "@{").replace("}", "@}")
> >> > +
> >> > +
> >> > +def texi_example(doc):
> >> > +    """Format @example"""
> >> > +    # TODO: Neglects to escape @ characters.
> >> > +    # We should probably escape them in subst_braces(), and rename the
> >> > +    # function to subst_special() or subs_texi_special().  If we do
> >> > that,
> >> > we
> >> > +    # need to delay it until after subst_vars() in texi_format().
> >> > +    doc = subst_braces(doc).strip('\n')
> >> > +    return EXAMPLE_FMT(code=doc)
> >> > +
> >> > +
> >> > +def texi_format(doc):
> >> > +    """
> >> > +    Format documentation
> >> > +
> >> > +    Lines starting with:
> >> > +    - |: generates an @example
> >> > +    - =: generates @section
> >> > +    - ==: generates @subsection
> >> > +    - 1. or 1): generates an @enumerate @item
> >> > +    - */-: generates an @itemize list
> >> > +    """
> >> > +    lines = []
> >> > +    doc = subst_braces(doc)
> >> > +    doc = subst_vars(doc)
> >> > +    doc = subst_emph(doc)
> >> > +    doc = subst_strong(doc)
> >> > +    inlist = ""
> >> > +    lastempty = False
> >> > +    for line in doc.split('\n'):
> >> > +        empty = line == ""
> >> > +
> >> > +        # FIXME: Doing this in a single if / elif chain is
> >> > +        # problematic.  For instance, a line without markup terminates
> >> > +        # a list if it follows a blank line (reaches the final elif),
> >> > +        # but a line with some *other* markup, such as a = title
> >> > +        # doesn't.
> >> > +        if line.startswith("| "):
> >> > +            line = EXAMPLE_FMT(code=line[2:])
> >> > +        elif line.startswith("= "):
> >> > +            line = "@section " + line[2:]
> >> > +        elif line.startswith("== "):
> >> > +            line = "@subsection " + line[3:]
> >> > +        elif re.match("^([0-9]*[.]) ", line):
> >> 
> >> [.] is an unusual way to say \.
> >
> > Right, forgot to change that when removing ) from v6.
> >
> >> 
> >> Perhaps we should consistently use r"..." for strings used as regular
> >> expressions.
> >> 
> >
> > No idea, could be changed on top.
> 
> Yes.  But if you respin anyway, adding some r wouldn't hurt.

ok

> 
> >> > +            if not inlist:
> >> > +                lines.append("@enumerate")
> >> > +                inlist = "enumerate"
> >> > +            line = line[line.find(" ")+1:]
> >> > +            lines.append("@item")
> >> > +        elif re.match("^[*-] ", line):
> >> > +            if not inlist:
> >> > +                lines.append("@itemize %s" % {'*': "@bullet",
> >> > +                                              '-': "@minus"}[line[0]])
> >> > +                inlist = "itemize"
> >> > +            lines.append("@item")
> >> > +            line = line[2:]
> >> > +        elif lastempty and inlist:
> >> > +            lines.append("@end %s\n" % inlist)
> >> > +            inlist = ""
> >> > +
> >> > +        lastempty = empty
> >> > +        lines.append(line)
> >> > +
> >> > +    if inlist:
> >> > +        lines.append("@end %s\n" % inlist)
> >> > +    return "\n".join(lines)
> >> > +
> >> > +
> >> > +def texi_body(doc):
> >> > +    """
> >> > +    Format the body of a symbol documentation:
> >> > +    - main body
> >> > +    - table of arguments
> >> > +    - followed by "Returns/Notes/Since/Example" sections
> >> > +    """
> >> > +    body = texi_format(str(doc.body)) + "\n"
> >> > +    if doc.args:
> >> > +        body += "@table @asis\n"
> >> > +        for arg, section in doc.args.iteritems():
> >> > +            desc = str(section)
> >> > +            opt = ''
> >> > +            if "#optional" in desc:
> >> > +                desc = desc.replace("#optional", "")
> >> > +                opt = ' (optional)'
> >> > +            body += "@item @code{'%s'}%s\n%s\n" % (arg, opt,
> >> > +                                                   texi_format(desc))
> >> > +        body += "@end table\n"
> >> > +
> >> > +    for section in doc.sections:
> >> > +        name, doc = (section.name, str(section))
> >> > +        func = texi_format
> >> > +        if name.startswith("Example"):
> >> > +            func = texi_example
> >> > +
> >> > +        if name:
> >> 
> >> @quotation produces confusing .txt and .html output, as discussed in
> >> review of v6.  Either drop it, or record the problem in a comment, e.g.
> >> 
> >>                # FIXME the indentation produced by @quotation in .txt and
> >>                # .html output is confusing
> >> 
> >
> > added
> 
> Thanks.
> 
> >> > +            body += "\n@quotation %s\n%s\n@end quotation" % \
> >> > +                    (name, func(doc))
> >> > +        else:
> >> > +            body += func(doc)
> >> > +
> >> > +    return body
> >> > +
> >> > +
> >> > +def texi_alternate(expr, doc):
> >> > +    """Format an alternate to texi"""
> >> > +    body = texi_body(doc)
> >> > +    return STRUCT_FMT(type="Alternate",
> >> > +                      name=doc.symbol,
> >> > +                      body=body)
> >> > +
> >> > +
> >> > +def texi_union(expr, doc):
> >> > +    """Format a union to texi"""
> >> > +    discriminator = expr.get("discriminator")
> >> > +    if discriminator:
> >> > +        union = "Flat Union"
> >> > +    else:
> >> > +        union = "Simple Union"
> >> 
> >> Not sure flat vs. simple matters for users of generated documentation,
> >> but let's not worry about that now.
> >> 
> >> > +
> >> > +    body = texi_body(doc)
> >> > +    return STRUCT_FMT(type=union,
> >> > +                      name=doc.symbol,
> >> > +                      body=body)
> >> > +
> >> > +
> >> > +def texi_enum(expr, doc):
> >> > +    """Format an enum to texi"""
> >> > +    for i in expr['data']:
> >> > +        if i not in doc.args:
> >> > +            doc.args[i] = ''
> >> > +    body = texi_body(doc)
> >> > +    return ENUM_FMT(name=doc.symbol,
> >> > +                    body=body)
> >> > +
> >> > +
> >> > +def texi_struct(expr, doc):
> >> > +    """Format a struct to texi"""
> >> > +    body = texi_body(doc)
> >> > +    return STRUCT_FMT(type="Struct",
> >> > +                      name=doc.symbol,
> >> > +                      body=body)
> >> > +
> >> > +
> >> > +def texi_command(expr, doc):
> >> > +    """Format a command to texi"""
> >> > +    body = texi_body(doc)
> >> > +    return COMMAND_FMT(type="Command",
> >> > +                       name=doc.symbol,
> >> > +                       body=body)
> >> > +
> >> > +
> >> > +def texi_event(expr, doc):
> >> > +    """Format an event to texi"""
> >> > +    body = texi_body(doc)
> >> > +    return COMMAND_FMT(type="Event",
> >> > +                       name=doc.symbol,
> >> > +                       body=body)
> >> > +
> >> > +
> >> > +def texi_expr(expr, doc):
> >> > +    """Format an expr to texi"""
> >> > +    (kind, _) = expr.items()[0]
> >> > +
> >> > +    fmt = {"command": texi_command,
> >> > +           "struct": texi_struct,
> >> > +           "enum": texi_enum,
> >> > +           "union": texi_union,
> >> > +           "alternate": texi_alternate,
> >> > +           "event": texi_event}[kind]
> >> > +
> >> > +    return fmt(expr, doc)
> >> > +
> >> > +
> >> > +def texi(docs):
> >> > +    """Convert QAPI schema expressions to texi documentation"""
> >> > +    res = []
> >> > +    for doc in docs:
> >> > +        expr = doc.expr
> >> > +        if not expr:
> >> > +            res.append(texi_body(doc))
> >> > +            continue
> >> > +        try:
> >> > +            doc = texi_expr(expr, doc)
> >> > +            res.append(doc)
> >> > +        except:
> >> > +            print >>sys.stderr, "error at @%s" % doc.info
> >> > +            raise
> >> > +
> >> > +    return '\n'.join(res)
> >> > +
> >> > +
> >> > +def main(argv):
> >> > +    """Takes schema argument, prints result to stdout"""
> >> > +    if len(argv) != 2:
> >> > +        print >>sys.stderr, "%s: need exactly 1 argument: SCHEMA" %
> >> > argv[0]
> >> > +        sys.exit(1)
> >> > +
> >> > +    schema = qapi.QAPISchema(argv[1])
> >> > +    print texi(schema.docs)
> >> > +
> >> > +
> >> > +if __name__ == "__main__":
> >> > +    main(sys.argv)
> >> > diff --git a/docs/qapi-code-gen.txt b/docs/qapi-code-gen.txt
> >> > index 2841c5144a..b29f996fdc 100644
> >> > --- a/docs/qapi-code-gen.txt
> >> > +++ b/docs/qapi-code-gen.txt
> >> > @@ -44,40 +44,148 @@ Input must be ASCII (although QMP supports full
> >> > Unicode strings, the
> >> >  QAPI parser does not).  At present, there is no place where a QAPI
> >> >  schema requires the use of JSON numbers or null.
> >> >  
> >> > +
> >> > +=== Comments ===
> >> > +
> >> >  Comments are allowed; anything between an unquoted # and the following
> >> > -newline is ignored.  Although there is not yet a documentation
> >> > -generator, a form of stylized comments has developed for consistently
> >> > -documenting details about an expression and when it was added to the
> >> > -schema.  The documentation is delimited between two lines of ##, then
> >> > -the first line names the expression, an optional overview is provided,
> >> > -then individual documentation about each member of 'data' is provided,
> >> > -and finally, a 'Since: x.y.z' tag lists the release that introduced
> >> > -the expression.  Optional members are tagged with the phrase
> >> > -'#optional', often with their default value; and extensions added
> >> > -after the expression was first released are also given a '(since
> >> > -x.y.z)' comment.  For example:
> >> > -
> >> > -    ##
> >> > -    # @BlockStats:
> >> > -    #
> >> > -    # Statistics of a virtual block device or a block backing device.
> >> > -    #
> >> > -    # @device: #optional If the stats are for a virtual block device,
> >> > the
> >> > name
> >> > -    #          corresponding to the virtual block device.
> >> > -    #
> >> > -    # @stats:  A @BlockDeviceStats for the device.
> >> > -    #
> >> > -    # @parent: #optional This describes the file block device if it has
> >> > one.
> >> > -    #
> >> > -    # @backing: #optional This describes the backing block device if it
> >> > has one.
> >> > -    #           (Since 2.0)
> >> > -    #
> >> > -    # Since: 0.14.0
> >> > -    ##
> >> > -    { 'struct': 'BlockStats',
> >> > -      'data': {'*device': 'str', 'stats': 'BlockDeviceStats',
> >> > -               '*parent': 'BlockStats',
> >> > -               '*backing': 'BlockStats'} }
> >> > +newline is ignored.
> >> > +
> >> > +A multi-line comment that starts and ends with a '##' line is a
> >> > +documentation comment.  These are parsed by the documentation
> >> > +generator, which recognizes certain markup detailed below.
> >> > +
> >> > +
> >> > +==== Documentation markup ====
> >> > +
> >> > +Comment text starting with '=' is a section title:
> >> > +
> >> > +    # = Section title
> >> > +
> >> > +Double the '=' for a subsection title:
> >> > +
> >> > +    # == Subection title
> >> > +
> >> > +'|' denotes examples:
> >> > +
> >> > +    # | Text of the example, may span
> >> > +    # | multiple lines
> >> > +
> >> > +'*' starts an itemized list:
> >> > +
> >> > +    # * First item, may span
> >> > +    #   multiple lines
> >> > +    # * Second item
> >> > +
> >> > +You can also use '-' instead of '*'.
> >> > +
> >> > +A decimal number followed by '.' starts a numbered list:
> >> > +
> >> > +    # 1. First item, may span
> >> > +    # multiple lines
> 
> Should the second line be indented?

ok

> 
> >> > +    # 2. Second item
> >> > +
> >> > +The actual number doesn't matter.  You could even use '*' instead of
> >> > +'2.' for the second item.
> >> > +
> >> > +FIXME what exactly ends a list
> >> 
> >> Can you say offhand what ends a list?  If yes, can we resolve this FIXME
> >> now rather than later?
> >
> > well, the code is too permissive, there is a FIXME there already.
> >
> > In general, I think we want to end a list after a blank line and a
> > "non-item" line. Is the wording ok?
> 
> Blank line is pretty restrictive, because list items consisting of
> multiple paragraphs aren't possible then.
> 
> I think writing something like this is fairly natural:
> 
>     * First item, may span
>       multiple lines.
> 
>       Even multiple paragaphs are okay.
> 
>     * Second item
> 
>       - This is a sublist
>       - Second item of sublist
> 
>     * Third item
> 
>     This sentence is not part of the list.
> 
> We don't support sublists, and I'm not saying we should, I'm just
> showing how they would fit in.
> 
> My point is that indentation is a natural way to delimit list items.
> 
> I don't expect you to implement this now.  All I want for this series is
> reasonable documentation.  As far as I can tell, texi_format() ends the
> list at a blank line, except when it doesn't (see its FIXME).
> 
> What about this: replace "FIXME what exactly ends a list" by
> 
>     Lists can't be nested.  Blank lines are currently not supported
>     within lists.
> 
> and add to texi_format()'s FIXME something like
> 
>     Make sure to update section "Documentation markup" in
>     docs/qapi-code-gen.txt when fixing this.
> 
> Okay?

ok

> 
> >> > +
> >> > +Additional whitespace between the initial '#' and the comment text is
> >> > +permitted.
> >> > +
> >> > +*foo* and _foo_ are for strong and emphasis styles respectively (they
> >> > +do not work over multiple lines). @foo is used to reference a name in
> >> > +the schema.
> >> > +
> >> > +Example:
> >> > +
> >> > +##
> >> > +# = Section
> >> > +# == Subsection
> >> > +#
> >> > +# Some text foo with *strong* and _emphasis_
> >> > +# 1. with a list
> >> > +# 2. like that
> >> > +#
> >> > +# And some code:
> >> > +# | $ echo foo
> >> > +# | -> do this
> >> > +# | <- get that
> >> > +#
> >> > +##
> >> > +
> >> > +
> >> > +==== Expression documentation ====
> >> > +
> >> > +Each expression that isn't an include directive must be preceded by a
> >> > +documentation block.  Such blocks are called expression documentation
> >> > +blocks.
> >> > +
> >> > +The first line of the documentation names the expression, then the
> >> > +documentation body is provided, then individual documentation about
> >> > +each member of 'data' is provided. Finally, several tagged sections
> >> > +can be added.
> >> 
> >> In my review of v6, I wrote:
> >> 
> >>     I'm afraid this is more aspiration than specification: the parser
> >>     accepts these things in almost any order.
> >> 
> >>     Could be filed under "deficiencies" for now.
> >> 
> >> To file under "deficiencies", we need to add a FIXME or TODO either
> >> here, or in the parser.  Assuming you haven't tightened the parser
> >> meanwhile (hope you haven't; we need to control the churn).
> >
> > ok
> >
> >> 
> >>     Member of 'data' won't remain accurate.  Consider:
> >> 
> >>         { 'union': 'GlusterServer',
> >>           'base': { 'type': 'GlusterTransport' },
> >>           'discriminator': 'type',
> >>           'data': { 'unix': 'UnixSocketAddress',
> >>                     'tcp': 'InetSocketAddress' } }
> >> 
> >>     Its doc comment currently doesn't contain argument sections.  It
> >>     should
> >>     eventually contain @type:, @unix: and @tcp:.  Only the latter two are
> >>     members of 'data'.
> >> 
> >>     I should propose something better, but I'm getting perilously close to
> >>     the christmas break already.  Later.
> >> 
> >> Also: passive voice is meh (not your idea; you adapted the existing
> >> text).
> >> 
> >> What about:
> >> 
> >>    The documentation block consists of a first line naming the
> >>    expression, an optional overview, a description of each argument (for
> >>    commands and events) or member (for structs, unions and alternates),
> >>    and optional tagged sections.
> >> 
> >> I still don't like the use of "section" for both the = things and the
> >> tagged sections, but it's not important enough to spend time on it now.
> >> 
> >
> > ok
> >
> >> > +
> >> > +Optional members are tagged with the phrase '#optional', often with
> >> 
> >> If we adopt my text, we should say "arguments / members" here.
> >> 
> >
> > ok
> >
> >> > +their default value; and extensions added after the expression was
> >> > +first released are also given a '(since x.y.z)' comment.
> >> > +
> >> > +A tagged section starts with one of the following words:
> >> > +"Note:"/"Notes:", "Since:", "Example"/"Examples", "Returns:", "TODO:".
> >> 
> >> What ends such a section?  The start of the next one / end of the
> >> comment block?
> >> 
> >
> > yes
> >
> >> I suspect we could avoid PATCH 03 by recognizing "TODO" in addition to
> >> "TODO:".  Might also be more robust.  But let's ignore this now.
> >> 
> >> > +
> >> > +A 'Since: x.y.z' tag lists the release that introduced the expression.
> >> 
> >> Nitpick: 'Since: x.y.z' is a tagged section, not a tag.
> >> 
> >
> > ok
> >
> >> > +
> >> > +For example:
> >> > +
> >> > +##
> >> > +# @BlockStats:
> >> > +#
> >> > +# Statistics of a virtual block device or a block backing device.
> >> > +#
> >> > +# @device: #optional If the stats are for a virtual block device, the
> >> > name
> >> > +#          corresponding to the virtual block device.
> >> > +#
> >> > +# @node-name: #optional The node name of the device. (since 2.3)
> >> > +#
> >> > +# ... more members ...
> >> > +#
> >> > +# Since: 0.14.0
> >> > +##
> >> > +{ 'struct': 'BlockStats',
> >> > +  'data': {'*device': 'str', '*node-name': 'str',
> >> > +           ... more members ... } }
> >> > +
> >> > +##
> >> > +# @query-blockstats:
> >> > +#
> >> > +# Query the @BlockStats for all virtual block devices.
> >> > +#
> >> > +# @query-nodes: #optional If true, the command will query all the
> >> > +#               block nodes ... explain, explain ...  (since 2.3)
> >> > +#
> >> > +# Returns: A list of @BlockStats for each virtual block devices.
> >> > +#
> >> > +# Since: 0.14.0
> >> > +#
> >> > +# Example:
> >> > +#
> >> > +# -> { "execute": "query-blockstats" }
> >> > +# <- {
> >> > +#      ... lots of output ...
> >> > +#    }
> >> > +#
> >> > +##
> >> > +{ 'command': 'query-blockstats',
> >> > +  'data': { '*query-nodes': 'bool' },
> >> > +  'returns': ['BlockStats'] }
> >> > +
> >> > +==== Free-form documentation ====
> >> > +
> >> > +A documentation block that isn't an expression documentation block is
> >> > +a free-form documentation block.  These may be used to provide
> >> > +additional text and structuring content.
> >> > +
> >> > +
> >> > +=== Schema overview ===
> >> >  
> >> >  The schema sets up a series of types, as well as commands and events
> >> >  that will use those types.  Forward references are allowed: the parser
> >> [Tests look okay, snipped...]
> >> 
>

Re: [Qemu-devel] [PATCH v7 15/21] qapi: add qapi2texi script

Reply via email to