Re: [Qemu-devel] [PATCH v5 13/17] qapi: add qapi2texi script

Marc-André Lureau Tue, 06 Dec 2016 05:11:35 -0800

Hi

On Tue, Dec 6, 2016 at 2:50 PM Markus Armbruster <arm...@redhat.com> wrote:


> I had to resort to diff to find your replies, and massage the text
> manually to produce a readable reply myself.  Please quote the usual
> way.
>
>
I'd have to switch to something else than gmail (which bothers me for
various reasons, let's not discuss the merits of various mail clients
please ;) In general, I don't have problems, but this mail is rather big,
sorry for the inconvenience..

> Markus Armbruster <arm...@redhat.com> writes:
> >
> > > Marc-André Lureau <marcandre.lur...@redhat.com> writes:
> > >
> > >> As the name suggests, the qapi2texi script converts JSON QAPI
> > >> description into a texi file suitable for different target
> > >> formats (info/man/txt/pdf/html...).
> > >>
> > >> It parses the following kind of blocks:
> > >>
> > >> Free-form:
> > >>
> > >>   ##
> > >>   # = Section
> > >>   # == Subsection
> > >>   #
> > >>   # Some text foo with *emphasis*
> > >>   # 1. with a list
> > >>   # 2. like that
> > >>   #
> > >>   # And some code:
> > >>   # | $ echo foo
> > >>   # | -> do this
> > >>   # | <- get that
> > >>   #
> > >>   ##
> > >>
> > >> Symbol:
> > >>
> > >>   ##
> > >>   # @symbol:
> > >>   #
> > >>   # Symbol body ditto ergo sum. Foo bar
> > >>   # baz ding.
> > >>   #
> > >>   # @arg: foo
> > >>   # @arg: #optional foo
> > >
> > > Let's not use @arg twice.
> > >
> > > Terminology: I prefer to use "parameter" for formal parameters, and
> > > "argument" for actual arguments.  This matches how The Python Language
> > > Reference uses the terms.
> > >
> > > What about
> > >
> > >     # @param1: the frob to frobnicate
> > >     # @param2: #optional how hard to frobnicate
> >
> > ok
> >
> > >>   #
> > >>   # Returns: returns bla bla
> > >>   #          Or bla blah
> > >
> > > Repeating "returns" is awkward, and we don't do that in our schemas.
> > >
> > > We need a period before "Or", or spell it "or".
> > >
> > > What about
> > >
> > >     # Returns: the frobnicated frob.
> > >     #          If frob isn't frobnicatable, GenericError.
> >
> > ok
> >
> > >>   #
> > >>   # Since: version
> > >>   # Notes: notes, comments can have
> > >>   #        - itemized list
> > >>   #        - like this
> > >>   #
> > >>   # Example:
> > >>   #
> > >>   # -> { "execute": "quit" }
> > >>   # <- { "return": {} }
> > >>   #
> > >>   ##
> > >>
> > >> That's roughly following the following EBNF grammar:
> > >>
> > >> api_comment = "##\n" comment "##\n"
> > >> comment = freeform_comment | symbol_comment
> > >> freeform_comment = { "# " text "\n" | "#\n" }
> > >> symbol_comment = "# @" name ":\n" { member | meta | freeform_comment }
> > >
> > > Rejects non-empty comments where "#" is not followed by space.  Such
> > > usage is accepted outside doc comments.  Hmm.
> > >
> > >> member = "# @" name ':' [ text ] freeform_comment
> > >
> > > Are you missing a "\n" before freeform_comment?
> >
> > yes
> >
> > >> meta = "# " ( "Returns:", "Since:", "Note:", "Notes:", "Example:",
> "Examples:" ) [ text ] freeform_comment
> > >
> > > Likewise.
> >
> > ok
> >
> > >> text = free-text markdown-like, "#optional" for members
> > >
> > > The grammar is ambiguous: a line "# @foo:\n" can be parsed both as
> > > freeform_comment and as symbol_comment.  Since your intent is obvious
> > > enough, it can still serve as documentation.  It's not a suitable
> > > foundation for parsing, though.  Okay for a commit message.
> > >
> > >> Thanks to the following json expressions, the documentation is
> enhanced
> > >> with extra information about the type of arguments and return value
> > >> expected.
> > >
> > > I guess you want to say that we enrich the documentation we extract
> from
> > > comments with information from the actual schema.  Correct?
> >
> > yes
> >
> > > Missing: a brief discussion of deficiencies.  These include:
> > >
> > > * The generated QMP documentation includes internal types
> > >
> > >   We use qapi-schema.json both for defining the external QMP interface
> > >   and for defining internal types.  qmp-introspect.py carefully
> > >   separates the two, to not expose internal types.  qapi2texi.py
> happily
> > >   exposes everything.
> > >
> > >   Currently, about a fifth of the types in the generated docs are
> > >   internal:
> > >
> > >       AcpiTableOptions
> > >       BiosAtaTranslation
> > >       BlockDeviceMapEntry
> > >       COLOMessage
> > >       COLOMode
> > >       DummyForceArrays
> > >       FailoverStatus
> > >       FloppyDriveType
> > >       ImageCheck
> > >       LostTickPolicy
> > >       MapEntry
> > >       MigrationParameter
> > >       NetClientDriver
> > >       NetFilterDirection
> > >       NetLegacy
> > >       NetLegacyNicOptions
> > >       NetLegacyOptions
> > >       NetLegacyOptionsKind
> > >       Netdev
> > >       NetdevBridgeOptions
> > >       NetdevDumpOptions
> > >       NetdevHubPortOptions
> > >       NetdevL2TPv3Options
> > >       NetdevNetmapOptions
> > >       NetdevNoneOptions
> > >       NetdevSocketOptions
> > >       NetdevTapOptions
> > >       NetdevUserOptions
> > >       NetdevVdeOptions
> > >       NetdevVhostUserOptions
> > >       NumaNodeOptions
> > >       NumaOptions
> > >       NumaOptionsKind
> > >       OnOffAuto
> > >       OnOffSplit
> > >       PreallocMode
> > >       QCryptoBlockCreateOptions
> > >       QCryptoBlockCreateOptionsLUKS
> > >       QCryptoBlockFormat
> > >       QCryptoBlockInfo
> > >       QCryptoBlockInfoBase
> > >       QCryptoBlockInfoQCow
> > >       QCryptoBlockOpenOptions
> > >       QCryptoBlockOptionsBase
> > >       QCryptoBlockOptionsLUKS
> > >       QCryptoBlockOptionsQCow
> > >       QCryptoSecretFormat
> > >       QCryptoTLSCredsEndpoint
> > >       QapiErrorClass
> > >       ReplayMode
> > >       X86CPUFeatureWordInfo
> > >       X86CPURegister32
> > >
> > >   Generating documentation for internal types might be useful, but
> > >   letting them pollute QMP interface documentation isn't.  Needs fixing
> > >   before we release.  Until then, needs a FIXME comment in
> qapi2texi.py.
> > >
> > > * Union support is lacking
> > >
> > >   The doc string language is adequate for documenting commands, events,
> > >   and non-union types.  It doesn't really handle union variants.
> Hardly
> > >   surprising, as you fitted the language do existing practice, and
> > >   existing (mal-)practice is neglecting to document union variant
> > >   members.
> > >
> > > * Documentation is lacking
> > >
> > >   See review of qapi-code-gen.txt below.
> > >
> > > * Doc comment error message positions are imprecise
> > >
> > >   They always point to the beginning of the comment.
> > >
> > > * Probably more
> > >
> > >   We should update this with noteworthy findings during review.  I
> > >   tried, but I suspect I missed some.
> >
> > ok
> >
> > >> Signed-off-by: Marc-André Lureau <marcandre.lur...@redhat.com>
> > > [Lengthy diffstat snipped...]
> > >>
> > >> diff --git a/tests/Makefile.include b/tests/Makefile.include
> > >> index e98d3b6..f16764c 100644
> > >> --- a/tests/Makefile.include
> > >> +++ b/tests/Makefile.include
> > >> @@ -350,6 +350,21 @@ qapi-schema += base-cycle-direct.json
> > >>  qapi-schema += base-cycle-indirect.json
> > >>  qapi-schema += command-int.json
> > >>  qapi-schema += comments.json
> > >> +qapi-schema += doc-bad-args.json
> > >> +qapi-schema += doc-bad-symbol.json
> > >> +qapi-schema += doc-duplicated-arg.json
> > >> +qapi-schema += doc-duplicated-return.json
> > >> +qapi-schema += doc-duplicated-since.json
> > >> +qapi-schema += doc-empty-arg.json
> > >> +qapi-schema += doc-empty-section.json
> > >> +qapi-schema += doc-empty-symbol.json
> > >> +qapi-schema += doc-invalid-end.json
> > >> +qapi-schema += doc-invalid-end2.json
> > >> +qapi-schema += doc-invalid-return.json
> > >> +qapi-schema += doc-invalid-section.json
> > >> +qapi-schema += doc-invalid-start.json
> > >> +qapi-schema += doc-missing-expr.json
> > >> +qapi-schema += doc-missing-space.json
> > >>  qapi-schema += double-data.json
> > >>  qapi-schema += double-type.json
> > >>  qapi-schema += duplicate-key.json
> > >> @@ -443,6 +458,8 @@ qapi-schema += union-optional-branch.json
> > >>  qapi-schema += union-unknown.json
> > >>  qapi-schema += unknown-escape.json
> > >>  qapi-schema += unknown-expr-key.json
> > >> +
> > >> +
> > >>  check-qapi-schema-y := $(addprefix tests/qapi-schema/,
> $(qapi-schema))
> > >>
> > >>  GENERATED_HEADERS += tests/test-qapi-types.h tests/test-qapi-visit.h
> \
> > >> diff --git a/scripts/qapi.py b/scripts/qapi.py
> > >> index 4d1b0e4..1b456b4 100644
> > >> --- a/scripts/qapi.py
> > >> +++ b/scripts/qapi.py
> > >> @@ -122,6 +122,109 @@ class QAPILineError(Exception):
> > >>              "%s:%d: %s" % (self.info['file'], self.info['line'],
> self.msg)
> > >>
> > >>
> > >> +class QAPIDoc(object):
> > >> +    class Section(object):
> > >> +        def __init__(self, name=""):
> > >
> > > name=None feels more natural than "" for an absent optional name.
> >
> > ok
> >
> > >> +            # optional section name (argument/member or section name)
> > >> +            self.name = name
> > >> +            # the list of strings for this section
> > >
> > > Would "list of lines" be more accurate, or less?
> >
> > more accurate, ok
> >
> > >> +            self.content = []
> > >> +
> > >> +        def append(self, line):
> > >> +            self.content.append(line)
> > >> +
> > >> +        def __repr__(self):
> > >> +            return "\n".join(self.content).strip()
> > >> +
> > >> +    class ArgSection(Section):
> > >> +        pass
> > >
> > > Begs the question what makes ArgSection differ from Section.  I think
> > > it's the semantics of self.name: an ArgSection's name is the parameter
> > > name, a non-ArgSection can either be a "meta" section (name is one of
> > > "Returns"", "Since", "Note", "Notes", "Example", "Examples") or an
> > > anonymous section (name is "").
> >
> > yes, and the section type is checked in  _append_freeform()
> >
> > >> +
> > >> +    def __init__(self, parser):
> > >> +        self.parser = parser
> > >> +        self.symbol = None
> > >> +        self.body = QAPIDoc.Section()
> > >> +        # a dict {'arg': ArgSection, ...}
> > >
> > > Works.  Alternatevely:
> > >
> > >            # dict mapping parameter name to ArgSection
> >
> > ok
> >
> > >> +        self.args = OrderedDict()
> > >> +        # a list of Section
> > >> +        self.meta = []
> > >> +        # the current section
> > >> +        self.section = self.body
> > >> +        # associated expression and info (set by expression parser)
> > >
> > > "will be set by expression parser", or "to be set by expression
> parser"?
> >
> > ok
> >
> > >> +        self.expr = None
> > >> +        self.info = None
> > >
> > > Clearer now.  The one remark I still have is on the use of "meta" isn't
> > > obvious here.  I think a comment further up explaining the different
> > > kinds of sections would help.
> > >
> > > To be honest, I don't get what's "meta" about the "meta" sections :)
> >
> > Let's drop that distinction, it isn't necessary anymore
> >
> > >> +
> > >> +    def has_meta(self, name):
> > >> +        """Returns True if the doc has a meta section 'name'"""
> > >
> > > PEP257[1] demands imperative mood, and punctuation:
> > >
> > >            """Return True if the doc has a meta section 'name'."""
> > >
> > > Unfortunately, there seems to be no established convention for marking
> > > parameter names.  You put this one in single quotes, which works here,
> > > but could be confusing in other cases.  I'd sidestep it:
> > >
> > >            """Return True if we have a meta section with this name."""
> > >
> > > or
> > >
> > >            """Does a meta section with this name exist?"""
> >
> > ok
> >
> > >> +        for i in self.meta:
> > >> +            if i.name == name:
> > >> +                return True
> > >> +        return False
> > >> +
> > >> +    def append(self, line):
> > >> +        """Adds a # comment line, to be parsed and added to current
> section"""
> > >
> > > Imperative mood:
> > >
> > >     """Add a comment line, to be parsed and added to the current
> section."""
> > >
> > > However, we're not always adding to the current section, we can also
> > > start a new one.  The following avoids suggesting anything about the
> > > current section:
> > >
> > >     """Parse a comment line and add it to the documentation."""
> > >
> > > When the function name is a verb, and its doc string starts with a
> > > different verb, the name might be suboptimal.  Use your judgement.
> > >
> > > If this one-liner is too terse, we can try a multi-line doc string:
> > >
> > >     """Parse a comment line and add it to the documentation.
> > >
> > >     TODO should we tell more about how we parse?
> > >
> > >     Args:
> > >         line: the comment starting with '#', with the newline stripped.
> > >
> > >     Raises:
> > >         QAPISchemaError: TODO explain error conditions
> > >     """
> > >
> > > Format stolen from the Google Python Style Guide[2], because PEP257 is
> > > of not much help there.
> > >
> > > Once you start with such elaborate doc strings, pressure will likely
> > > mount to do the same elsewhere.  Could be a distraction right now.  Use
> > > your judgement.
> >
> > ok, thanks
> >
> > >> +        line = line[1:]
> > >> +        if not line:
> > >> +            self._append_freeform(line)
> > >> +            return
> > >> +
> > >> +        if line[0] != ' ':
> > >> +            raise QAPISchemaError(self.parser, "missing space after
> #")
> > >> +        line = line[1:]
> > >
> > > QAPISchemaError takes the error position from its QAPISchemaParser
> > > argument.  In other words, the error position depends on the state of
> > > the parser when it calls this function.  Turns out the position is the
> > > beginning of the comment.  Servicable here.  Action at a distance,
> > > though.
> > >
> > > Less strict:
> > >
> > >            # strip leading # and space
> > >            line = line[1:]
> > >            if line[0] == ' ':
> > >                line = line[1:]
> > >
> > > Also avoids special-casing empty comments.
> >
> > That would raise "IndexError: string index out of range" on empty comment
> > lines ("#")
>
> Fixable:
>                if line.startswith(' '):
>
> > Also I'd like to keep the error in case a space is missing (it caught
> some
> > already).
>
> On the one hand, I share your distaste for lack of space between # and
> comment text.  On the other hand, I dislike enforcing spacing
> inconsistently: enforce in doc comments, but not in other comments.
>
>
It's only enforced for doc comments, which I think is fair.


> > > Perhaps we should treat all leading whitespace the same (I'm not sure):
> > >
> > >            # strip leading # and whitespace
> > >            line = line[1:]
> > >            line = line.lstrip()
> >
> > That would break indentation in Example sections
>
> True; scratch the idea.
>
> > >> +
> > >> +        if self.symbol:
> > >> +            self._append_symbol_line(line)
> > >> +        elif (not self.body.content and
> > >> +              line.startswith("@") and line.endswith(":")):
> > >> +            self.symbol = line[1:-1]
> > >> +            if not self.symbol:
> > >> +                raise QAPISchemaError(self.parser, "Invalid symbol")
> > >
> > > Doesn't recognize the symbol when there's anything other than a single
> > > space between # and @.  Pathological case: '#\t@foo:' starting in
> column
> > > 6 looks exactly like '# @foo:', but doesn't work.  Fortunately, your
> > > code rejects the tab there.  Assigning meaning to (leading) whitespace
> > > may make sense, but it must be documented clearly.
> >
> > I think it's simpler to only accept a simple space (# + space + @ +
> symbol
> > + :), it will also lead to more uniform doc.
>
> Simpler is always an argument worth considering, but simpler isn't
> always better.  Experience with other languages has taught me to be
> carefule when making the kind or amount of whitespace significant.
>
> Kind of whitespace is not a usability issue with your proposal, because
> your patch rejects anything but space.
>
> Amount of whitespace is, because getting it wrong turns a symbol comment
> block into a free-form comment block, or a parameter section into
> free-form lines belonging to whatever section precedes it.
>
> As long as we reject definitions without a symbol comment, errors of the
> former kind will still be caught, although the error message could be
> confusing.
>
> As is, we don't reject definitions whose symbol comment fails to
> document all parameters.  Errors of the latter kind will therefore go
> undetected.  Trap for the unwary.  See also "@param2 is True" below.
>
> If we can't find a solution without such usability issues now, then I'm
> willing to settle for documentation warning unwary users, or even for a
> FIXME comment.
>

ok


>
> > > Doesn't recognize the symbol when there's crap after ":".  Confusing
> > > when it's invisible crap, i.e. trailing whitespace.  In general, it's
> > > best to parse left to right, like this:
> > >
> > >            elif not self.body.content and line.startswith('@'):
> > >                match = valid_name.match(line[1:]);
> > >                if not match:
> > >                    raise QAPISchemaError(self.parser, 'Invalid name')
> > >                if line.startswith(':', match.end(0)):
> > >                    raise QAPISchemaError(self.parser, 'Expected ":")
> >
> > Easier is simply to check :
> > if not line.endswith(":"):
> >                 raise QAPIParseError(self.parser, "Line should end with
> :")
> >
> > test added
>
> There's a reason we have a mountain of theory on parsing, built in
> decades of practice.  What starts "easier" than that generally ends up
> rewritten from scratch.  Anyway, let's continue.
>
> > > Here, the error position is clearly suboptimal: it's the beginning of
> > > the comment instead of where the actual error is, namely the beginning
> /
> > > the end of the name.  More of the same below.  The root cause is
> > > stacking parsers.  I'm not asking you to do anything about it at this
> > > time.
> > >
> > > Note my use of "name" rather than "symbol" in the error message.
> > > qapi-code-gen.txt talks about names, not symbols.  Consistent use of
> > > terminology matters.
> >
> > Agree, isn't "symbol" more appropriate for documentation though?
>
> Possibly.  But if it's an improvement for generated documentation, it
> surely is an improvement for hand-written documentation such as
> qapi-code-gen.txt and the comments in qapi*.py, too.  And the Python
> identifiers.  I have to admit that cools my enthusiasm for making the
> improvement right now :)
>
> > >> +        else:
> > >> +            self._append_freeform(line)
> > >> +
> > >> +    def _append_symbol_line(self, line):
> > >> +        name = line.split(' ', 1)[0]
> > >> +
> > >> +        if name.startswith("@") and name.endswith(":"):
> > >> +            line = line[len(name)+1:]
> > >> +            self._start_args_section(name[1:-1])
> > >
> > > Similar issue.  Left to right:
> > >
> > >            if re.match(r'\s*@', line):
> > >                match = valid_name.match(line[1:]);
> > >                if not match:
> > >                    raise QAPISchemaError(self.parser, 'Invalid
> parameter name')
> > >                line = line[match.end(0):]
> > >                if line.startswith(':'):
> > >                    raise QAPISchemaError(self.parser, 'Expected ":")
> > >                line = line[1:]
> > >                self._start_args_section(match.group(0))
> > >
> > > Even better would be factoring out the common code to parse '@foo:'.
> >
> > It's a valid case to have a section freeform comment starting with @foo,
> ex:
> >
> > @param: do this and than if
> >                @param2 is True
>
> Good point, but it's a language issue, not a parsing issue.
>
> The language issue is that we need to recognize a line "# @param: ..."
> as the start of a parameter section, while still permitting free-form
> lines like "# @param2 is True".
>
> This is a pretty convincing argument for requiring the ':'.  It can also
> be used as an argument for requiring exactly one space between # and @.
>
> How the language is parsed is completely orthogonal.
>
> >  I don't think it's worth to factor out the code to parse @foo vs @foo:,
> yet
> >
> > >> +        elif name in ("Returns:", "Since:",
> > >> +                      # those are often singular or plural
> > >> +                      "Note:", "Notes:",
> > >> +                      "Example:", "Examples:"):
> > >> +            line = line[len(name)+1:]
> > >> +            self._start_meta_section(name[:-1])
> > >
> > > Since we're re.match()ing already, here's how do to it that way:
> >
> > I don't follow your suggestion for the reason above,so I still have the
> > first word 'name'
> >
> > >            else
> > >                match =
> re.match(r'\s*(Returns|Since|Notes?|Examples?):', line)
> > >                if match:
> > >                    line = line[match.end(0):]
> > >                    self._start_meta_section(match.group(1))
> >
> > not really much nicer to me (especially because no match yet), I'll keep
> > the current code for now
> >
> > >> +
> > >> +        self._append_freeform(line)
> > >> +
> > >> +    def _start_args_section(self, name):
> > >> +        if not name:
> > >> +            raise QAPISchemaError(self.parser, "Invalid argument
> name")
> > >
> > > parameter name
> >
> > ok
> >
> > >> +        if name in self.args:
> > >> +            raise QAPISchemaError(self.parser, "'%s' arg duplicated"
> % name)
> > >
> > > Duplicate parameter name
> >
> > ok
> >
> > >> +        self.section = QAPIDoc.ArgSection(name)
> > >> +        self.args[name] = self.section
> > >> +
> > >> +    def _start_meta_section(self, name):
> > >> +        if name in ("Returns", "Since") and self.has_meta(name):
> > >> +            raise QAPISchemaError(self.parser,
> > >> +                                  "Duplicated '%s' section" % name)
> > >> +        self.section = QAPIDoc.Section(name)
> > >> +        self.meta.append(self.section)
> > >> +
> > >> +    def _append_freeform(self, line):
> > >> +        in_arg = isinstance(self.section, QAPIDoc.ArgSection)
> > >> +        if in_arg and self.section.content and not
> self.section.content[-1] \
> > >> +           and line and not line[0].isspace():
> > >
> > > PEP8[3] prefers parenthesises over backslash:
> > >
> > >            if (in_arg and self.section.content and not
> self.section.content[-1]
> > >                and line and not line[0].isspace()):
> >
> > yes, thanks
> >
> > >> +            # an empty line followed by a non-indented
> > >> +            # comment ends the argument section
> > >> +            self.section = self.body
> > >> +            self._append_freeform(line)
> > >> +            return
> > >
> > > Switching back to the self.body section like this reorders the
> > > documentation text.  I still think this is a terrible idea.  A dumb
> > > script is exceedingly unlikely to improve human-written doc comment
> text
> > > by reordering.  In the rare case it does, the doc comment source should
> > > be reordered.
> > >
> > > Here's an example where the doc generator happily creates
> unintelligible
> > > garbage if I format CpuDefinitionInfo's doc comment in a slightly off
> > > way:
> > >
> > >     ##
> > >     # @CpuDefinitionInfo:
> > >     #
> > >     # Virtual CPU definition.
> > >     #
> > >     # @name: the name of the CPU definition
> > >     #
> > >     # @migration-safe: #optional whether a CPU definition can be safely
> > >     # used for migration in combination with a QEMU compatibility
> > >     # machine when migrating between different QMU versions and between
> > >     # hosts with different sets of (hardware or software)
> > >     # capabilities.
> > >     #
> > >     # If not provided, information is not available and callers should
> > >     # not assume the CPU definition to be migration-safe. (since 2.8)
> > >     #
> > >     # @static: whether a CPU definition is static and will not change
> > >     # depending on QEMU version, machine type, machine options and
> > >     # accelerator options.  A static model is always
> > >     # migration-safe. (since 2.8)
> > >     #
> > >     # @unavailable-features: #optional List of properties that prevent
> > >     # the CPU model from running in the current host. (since 2.8)
> > >     #
> > >     # @unavailable-features is a list of QOM property names that
> > >     # represent CPU model attributes that prevent the CPU from running.
> > >     # If the QOM property is read-only, that means there's no known
> > >     # way to make the CPU model run in the current host.
> Implementations
> > >     # that choose not to provide specific information return the
> > >     # property name "type".
> > >     #
> > >     # If the property is read-write, it means that it MAY be possible
> > >     # to run the CPU model in the current host if that property is
> > >     # changed. Management software can use it as hints to suggest or
> > >     # choose an alternative for the user, or just to generate
> meaningful
> > >     # error messages explaining why the CPU model can't be used.
> > >     # If @unavailable-features is an empty list, the CPU model is
> > >     # runnable using the current host and machine-type.
> > >     # If @unavailable-features is not present, runnability
> > >     # information for the CPU is not available.
> > >     #
> > >     # Since: 1.2.0
> > >     ##
> > >     { 'struct': 'CpuDefinitionInfo',
> > >       'data': { 'name': 'str', '*migration-safe': 'bool', 'static':
> 'bool',
> > >                 '*unavailable-features': [ 'str' ] } }
> > >
> > > To detect the problem, you have to read the generated docs attentively.
> > > You know my opinion on our chances for that to happen during
> > > development.
> > >
> > > My point is not that we should support this doc comment format.  My
> > > point is that people could conceivably write something like it, and not
> > > get caught in patch review.
> > >
> > > I can see three ways out of this swamp:
> > >
> > > 1. Let sections continue until another one begins.
> >
> > but we have interleaved  sections, and no explicit "body" section tag,
> ex:
> >
> > ##
> > # @input-send-event:
> > #
> > # Send input event(s) to guest.
> > #
> > # @device: #optional display device to send event(s) to.
> > # @head: #optional head to send event(s) to, in case the
> > #        display device supports multiple scanouts.
> > # @events: List of InputEvent union.
> > #
> > # Returns: Nothing on success.
> > #
> > # The @device and @head parameters can be used to send the input event
> > # to specific input devices in case (a) multiple input devices of the
> > # same kind are added to the virtual machine and (b) you have
> > # configured input routing (see docs/multiseat.txt) for those input
> > # ....
> >
> > > 2. Make text between sections an error.
> > >
> > > 3. When a section ends, start a new anonymous section.
> >
> > It's not clear how to recognize when a section ends and append to comment
> > body.
> >
> > 2. Make text between sections an error.
> >
> > Sound like the best option to me. I'll fix the doc and add a check & test
> > for this common pattern though:
> >
> > ##
> > # @TestStruct:
> > #
> > # body with @var
> > #
> > # @integer: foo
> > #           blah
> > #
> > #           bao
> > #
> > # Let's catch this bad comment.
> > ##
>
> Go ahead.
>
> > > Can't say offhand which one will work best.
> > >
> > >> +        if in_arg or not self.section.name.startswith("Example"):
> > >> +            line = line.strip()
> > >
> > > Stripping whitespace is not "Markdown-like", because intendation
> carries
> > > meaning in Markdown.  Example:
> > >
> > > * First item in itemized list
> > > * Second item
> > >     * Sub-item of second item
> > >     * Another sub-item
> > > * Third item
> > >
> > > Stripping whitespace destroys the list structure.  If that's what you
> > > want, you get to document where your "Markdown-like" markup is unlike
> > > Markdown :)
> >
> > let's use "comment annotations" instead
>
> Okay.
>
> > > Is there a technical reason for stripping whitespace?
> > >
> > > See also discussion of space after # above.
> > >
> > >> +        self.section.append(line)
> > >> +
> > >> +
> > >>  class QAPISchemaParser(object):
> > >>
> > >>      def __init__(self, fp, previously_included=[], incl_info=None):
> > >> @@ -137,11 +240,18 @@ class QAPISchemaParser(object):
> > >>          self.line = 1
> > >>          self.line_pos = 0
> > >>          self.exprs = []
> > >> +        self.docs = []
> > >>          self.accept()
> > >>
> > >>          while self.tok is not None:
> > >>              info = {'file': fname, 'line': self.line,
> > >>                      'parent': self.incl_info}
> > >> +            if self.tok == '#' and self.val.startswith('##'):
> > >
> > > How can self.val *not* start with '##' here?
> >
> > you are right, unnecessary condition since we break get_expr() only in
> > this case now
> >
> > >> +                doc = self.get_doc()
> > >> +                doc.info = info
> > >
> > > Let's pass info as argument to get_doc(), so we don't have to dot into
> > > doc here.  get_doc() can avoid dotting into doc by passing info to the
> > > constructor.
> >
> > ok
> >
> > >> +                self.docs.append(doc)
> > >> +                continue
> > >> +
> > >>              expr = self.get_expr(False)
> > >>              if isinstance(expr, dict) and "include" in expr:
> > >>                  if len(expr) != 1:
> > >> @@ -160,6 +270,7 @@ class QAPISchemaParser(object):
> > >>                          raise QAPILineError(info, "Inclusion loop
> for %s"
> > >>                                              % include)
> > >>                      inf = inf['parent']
> > >> +
> > >>                  # skip multiple include of the same file
> > >>                  if incl_abs_fname in previously_included:
> > >>                      continue
> > >> @@ -171,12 +282,38 @@ class QAPISchemaParser(object):
> > >>                  exprs_include = QAPISchemaParser(fobj,
> previously_included,
> > >>                                                   info)
> > >>                  self.exprs.extend(exprs_include.exprs)
> > >> +                self.docs.extend(exprs_include.docs)
> > >>              else:
> > >>                  expr_elem = {'expr': expr,
> > >>                               'info': info}
> > >> +                if self.docs and not self.docs[-1].expr:
> > >> +                    self.docs[-1].expr = expr
> > >> +                    expr_elem['doc'] = self.docs[-1]
> > >> +
> > >
> > > Attaches the expression to the last doc comment that doesn't already
> > > have one.  A bit sloppy, because there could be non-doc comments in
> > > between, or the doc comment could be in another file.  It'll do, at
> > > least for now.
> >
> > I extended the condition to check it attaches the doc from the same file
> >
> > >>                  self.exprs.append(expr_elem)
> > >>
> > >> -    def accept(self):
> > >> +    def get_doc(self):
> > >> +        if self.val != '##':
> > >> +            raise QAPISchemaError(self, "Junk after '##' at start of
> "
> > >> +                                  "documentation comment")
> > >> +
> > >> +        doc = QAPIDoc(self)
> > >> +        self.accept(False)
> > >> +        while self.tok == '#':
> > >> +            if self.val.startswith('##'):
> > >> +                # End of doc comment
> > >> +                if self.val != '##':
> > >> +                    raise QAPISchemaError(self, "Junk after '##' at
> end of "
> > >> +                                          "documentation comment")
> > >> +                self.accept()
> > >> +                return doc
> > >> +            else:
> > >> +                doc.append(self.val)
> > >> +            self.accept(False)
> > >> +
> > >> +        raise QAPISchemaError(self, "Documentation comment must end
> with '##'")
> > >
> > > Let's put this after accept, next to the other get_FOO().
> >
> > ok
> >
> > >> +
> > >> +    def accept(self, skip_comment=True):
> > >>          while True:
> > >>              self.tok = self.src[self.cursor]
> > >>              self.pos = self.cursor
> > >> @@ -184,7 +321,13 @@ class QAPISchemaParser(object):
> > >>              self.val = None
> > >>
> > >>              if self.tok == '#':
> > >> +                if self.src[self.cursor] == '#':
> > >> +                    # Start of doc comment
> > >> +                    skip_comment = False
> > >>                  self.cursor = self.src.find('\n', self.cursor)
> > >> +                if not skip_comment:
> > >> +                    self.val = self.src[self.pos:self.cursor]
> > >> +                    return
> > >>              elif self.tok in "{}:,[]":
> > >>                  return
> > >>              elif self.tok == "'":
> > >
> > > Copied from review of v3, so I don't forget:
> > >
> > > Comment tokens are thrown away as before, except when the parser asks
> > > for them by passing skip_comment=False, or when the comment token
> starts
> > > with ##.  The parser asks while parsing a doc comment, in get_doc().
> > >
> > > This is a backchannel from the parser to the lexer.  I'd rather avoid
> > > such lexer hacks, but I guess we can address that on top.
> >
> > I don't have a good solution to that
>
> I have an idea which may or may not work.  I can explore it on top.
>
>
thanks


> > > A comment starting with ## inside an expression is now a syntax error.
> > > For instance, input
> > >
> > >     {
> > >     ##
> > >
> > > yields
> > >
> > >     /dev/stdin:2:1: Expected string or "}"
> > >
> > > Rather unfriendly error message, but we can fix that on top.
> > >
> > >> @@ -713,7 +856,7 @@ def check_keys(expr_elem, meta, required,
> optional=[]):
> > >>                                  % (key, meta, name))
> > >>
> > >>
> > >> -def check_exprs(exprs):
> > >> +def check_exprs(exprs, strict_doc):
> > >
> > > Note: strict_doc=False unless this is qapi2texi.py.
> >
> > For tests reasons: we may want to fix the test instead, or have a flag
> > QAPI_CHECK/NOSTRICT_DOC=1 or an option.
> >
> > >>      global all_names
> > >>
> > >>      # Learn the types and check for valid expression keys
> > >> @@ -722,6 +865,11 @@ def check_exprs(exprs):
> > >>      for expr_elem in exprs:
> > >>          expr = expr_elem['expr']
> > >>          info = expr_elem['info']
> > >> +
> > >> +        if strict_doc and 'doc' not in expr_elem:
> > >> +            raise QAPILineError(info,
> > >> +                                "Expression missing documentation
> comment")
> > >> +
> > >
> > > Why should we supress this error in programs other than qapi2texi.py?
> >
> > because the tests don't have comments
>
> Good point :)
>
> Can we come up with a brief comment explaining this?
>
> I fixed the tests instead (after all, we have complete control on what to
accept or not. that was boring but now it's done)


> > > Can't see a test for this one.
> >
> > because tests aren't strict :)
> >
> > I guess you'll tell me to fix the tests instead, so I did that in the
> next
> > series.
> >
> > >>          if 'enum' in expr:
> > >>              check_keys(expr_elem, 'enum', ['data'], ['prefix'])
> > >>              add_enum(expr['enum'], info, expr['data'])
> > >> @@ -780,6 +928,63 @@ def check_exprs(exprs):
> > >>      return exprs
> > >>
> > >>
> > >> +def check_simple_doc(doc):
> > >
> > > You call this "free-form" elsewhere.  Pick one name and stick to it.
> > > I think free-form is more descriptive than simple.
> >
> > ok
> >
> > >> +    if doc.symbol:
> > >> +        raise QAPILineError(doc.info,
> > >> +                            "'%s' documention is not followed by the
> definition"
> > >> +                            % doc.symbol)
> > >
> > > "Documentation for %s is not ..."
> >
> > ok
> >
> > >> +
> > >> +    body = str(doc.body)
> > >> +    if re.search(r'@\S+:', body, re.MULTILINE):
> > >> +        raise QAPILineError(doc.info,
> > >> +                            "Document body cannot contain @NAME:
> sections")
> > >> +
> > >> +
> > >> +def check_expr_doc(doc, expr, info):
> > >
> > > You call this "symbol" elsewhere.  I think "definition" would be better
> > > than either.
> >
> > ok
> >
> > >> +    for i in ('enum', 'union', 'alternate', 'struct', 'command',
> 'event'):
> > >> +        if i in expr:
> > >> +            meta = i
> > >> +            break
> > >> +
> > >> +    name = expr[meta]
> > >> +    if doc.symbol != name:
> > >> +        raise QAPILineError(info, "Definition of '%s' follows
> documentation"
> > >> +                            " for '%s'" % (name, doc.symbol))
> > >> +    if doc.has_meta('Returns') and 'command' not in expr:
> > >> +        raise QAPILineError(info, "Invalid return documentation")
> > >
> > > Suggest something like "'Returns:' is only valid for commands".
> > >
> > > We accept 'Returns:' even when the command doesn't return anything,
> > > because we currently use it to document errors, too.  Can't say I like
> > > that, but it's out of scope here.
> >
> > ok
> >
> > >> +
> > >> +    doc_args = set(doc.args.keys())
> > >> +    if meta == 'union':
> > >> +        data = expr.get('base', [])
> > >> +    else:
> > >> +        data = expr.get('data', [])
> > >> +    if isinstance(data, dict):
> > >> +        data = data.keys()
> > >> +    args = set([name.strip('*') for name in data])
> > >
> > > Not fixed since v3:
> > >
> > > * Flat union where 'base' is a string, e.g. union UserDefFlatUnion in
> > >   qapi-schema-test.json has base 'UserDefUnionBase', args is set(['a',
> > >   'B', 'e', 'D', 'f', 'i', 'o', 'n', 's', 'r', 'U'])
> > >
> > > * Command where 'data' is a string, e.g. user_def_cmd0 in
> > >   qapi-schema-test.json has data 'Empty2', args is set(['E', 'm', 'p',
> > >   '2', 't', 'y'])
> > >
> > > * Event where 'data' is a string, no test case handy (hole in test
> > >   coverage)
> >
> > ok, I changed it that way, that fixes it:
> > +    if isinstance(data, list):
> > +        args = set([name.strip('*') for name in data])
> > +    else:
> > +        args = set()
>
> Uh, sure this does the right thing when data is a dict?
>

yes, the line above takes care of extracting the keys in data.


>
> > >> +    if meta == 'alternate' or \
> > >> +       (meta == 'union' and not expr.get('discriminator')):
> > >> +        args.add('type')
> > >
> > > As explained in review of v3, this is only a subset of the real set of
> > > members.  Computing the exact set is impractical when working with the
> > > abstract syntax tree.  I believe we'll eventually have to rewrite this
> > > code to work with the QAPISchemaEntity instead.
> >
> > I don't think we want to list all the members as this would lead to
> > duplicated documentation. Instead it should document only the members of
> > the expr being defined.
>
> You're sticking to established practice, which makes lots of sense.
> However, established practice is a lot more clear for simple cases than
> for things like members inherited from base types, variant members and
> such.  At some point, we'll have to figure out how we want not so simple
> cases documented.  Until then, we can't really decide how to check
> documentation completeness.
>
> >                         In which case, it looks like this check is good
> > enough, no?
>
> For now, yes.  Later on, maybe.
>
> Let's document the limitation in a comment and the commit message, and
> move on.
>

I'd appreciate your help to list the limitations in the commit message, as
I have not as thorough understanding as you, and even worse I often don't
use the same name for the various concepts.


>
> > >> +    if not doc_args.issubset(args):
> > >> +        raise QAPILineError(info, "Members documentation is not a
> subset of"
> > >> +                            " API %r > %r" % (list(doc_args),
> list(args)))
> > >> +
> > >> +
> > >> +def check_docs(docs):
> > >> +    for doc in docs:
> > >> +        for section in doc.args.values() + doc.meta:
> > >> +            content = str(section)
> > >> +            if not content or content.isspace():
> > >> +                raise QAPILineError(doc.info,
> > >> +                                    "Empty doc section '%s'" %
> section.name)
> > >> +
> > >> +        if not doc.expr:
> > >> +            check_simple_doc(doc)
> > >> +        else:
> > >> +            check_expr_doc(doc, doc.expr, doc.info)
> > >> +
> > >> +    return docs
> > >> +
> > >> +
> > >>  #
> > >>  # Schema compiler frontend
> > >>  #
> > >> @@ -1249,9 +1454,11 @@ class QAPISchemaEvent(QAPISchemaEntity):
> > >>
> > >>
> > >>  class QAPISchema(object):
> > >> -    def __init__(self, fname):
> > >> +    def __init__(self, fname, strict_doc=False):
> > >>          try:
> > >> -            self.exprs = check_exprs(QAPISchemaParser(open(fname,
> "r")).exprs)
> > >> +            parser = QAPISchemaParser(open(fname, "r"))
> > >> +            self.exprs = check_exprs(parser.exprs, strict_doc)
> > >> +            self.docs = check_docs(parser.docs)
> > >>              self._entity_dict = {}
> > >>              self._predefining = True
> > >>              self._def_predefineds()
> > >> diff --git a/scripts/qapi2texi.py b/scripts/qapi2texi.py
> > >> new file mode 100755
> > >> index 0000000..0cec43a
> > >> --- /dev/null
> > >> +++ b/scripts/qapi2texi.py
> > >
> > > Still only skimming this one.
> > >
> > >> @@ -0,0 +1,331 @@
> > >> +#!/usr/bin/env python
> > >> +# QAPI texi generator
> > >> +#
> > >> +# This work is licensed under the terms of the GNU LGPL, version 2+.
> > >> +# See the COPYING file in the top-level directory.
> > >> +"""This script produces the documentation of a qapi schema in
> texinfo format"""
> > >> +import re
> > >> +import sys
> > >> +
> > >> +import qapi
> > >> +
> > >> +COMMAND_FMT = """
> > >> +@deftypefn {type} {{{ret}}} {name} @
> > >> +{{{args}}}
> > >> +
> > >> +{body}
> > >> +
> > >> +@end deftypefn
> > >> +
> > >> +""".format
> > >> +
> > >> +ENUM_FMT = """
> > >> +@deftp Enum {name}
> > >> +
> > >> +{body}
> > >> +
> > >> +@end deftp
> > >> +
> > >> +""".format
> > >> +
> > >> +STRUCT_FMT = """
> > >> +@deftp {{{type}}} {name} @
> > >> +{{{attrs}}}
> > >> +
> > >> +{body}
> > >> +
> > >> +@end deftp
> > >> +
> > >> +""".format
> > >> +
> > >> +EXAMPLE_FMT = """@example
> > >> +{code}
> > >> +@end example
> > >> +""".format
> > >> +
> > >> +
> > >> +def subst_strong(doc):
> > >> +    """Replaces *foo* by @strong{foo}"""
> > >> +    return re.sub(r'\*([^_\n]+)\*', r'@emph{\1}', doc)
> > >> +
> > >> +
> > >> +def subst_emph(doc):
> > >> +    """Replaces _foo_ by @emph{foo}"""
> > >> +    return re.sub(r'\s_([^_\n]+)_\s', r' @emph{\1} ', doc)
> > >> +
> > >> +
> > >> +def subst_vars(doc):
> > >> +    """Replaces @var by @code{var}"""
> > >> +    return re.sub(r'@([\w-]+)', r'@code{\1}', doc)
> > >> +
> > >> +
> > >> +def subst_braces(doc):
> > >> +    """Replaces {} with @{ @}"""
> > >> +    return doc.replace("{", "@{").replace("}", "@}")
> > >> +
> > >> +
> > >> +def texi_example(doc):
> > >> +    """Format @example"""
> > >> +    doc = subst_braces(doc).strip('\n')
> > >> +    return EXAMPLE_FMT(code=doc)
> > >> +
> > >> +
> > >> +def texi_comment(doc):
> > >> +    """
> > >> +    Format a comment
> > >> +
> > >> +    Lines starting with:
> > >> +    - |: generates an @example
> > >> +    - =: generates @section
> > >> +    - ==: generates @subsection
> > >> +    - 1. or 1): generates an @enumerate @item
> > >> +    - o/*/-: generates an @itemize list
> > >> +    """
> > >> +    lines = []
> > >> +    doc = subst_braces(doc)
> > >> +    doc = subst_vars(doc)
> > >> +    doc = subst_emph(doc)
> > >> +    doc = subst_strong(doc)
> > >> +    inlist = ""
> > >> +    lastempty = False
> > >> +    for line in doc.split('\n'):
> > >> +        empty = line == ""
> > >> +
> > >> +        if line.startswith("| "):
> > >> +            line = EXAMPLE_FMT(code=line[2:])
> > >> +        elif line.startswith("= "):
> > >> +            line = "@section " + line[2:]
> > >> +        elif line.startswith("== "):
> > >> +            line = "@subsection " + line[3:]
> > >> +        elif re.match("^([0-9]*[.)]) ", line):
> > >> +            if not inlist:
> > >> +                lines.append("@enumerate")
> > >> +                inlist = "enumerate"
> > >> +            line = line[line.find(" ")+1:]
> > >> +            lines.append("@item")
> > >> +        elif re.match("^[o*-] ", line):
> > >> +            if not inlist:
> > >> +                lines.append("@itemize %s" % {'o': "@bullet",
> > >> +                                              '*': "@minus",
> > >> +                                              '-': ""}[line[0]])
> > >> +                inlist = "itemize"
> > >> +            lines.append("@item")
> > >> +            line = line[2:]
> > >> +        elif lastempty and inlist:
> > >> +            lines.append("@end %s\n" % inlist)
> > >> +            inlist = ""
> > >> +
> > >> +        lastempty = empty
> > >> +        lines.append(line)
> > >> +
> > >> +    if inlist:
> > >> +        lines.append("@end %s\n" % inlist)
> > >> +    return "\n".join(lines)
> > >> +
> > >> +
> > >> +def texi_args(expr, key="data"):
> > >> +    """
> > >> +    Format the functions/structure/events.. arguments/members
> > >> +    """
> > >> +    if key not in expr:
> > >> +        return ""
> > >> +
> > >> +    args = expr[key]
> > >> +    arg_list = []
> > >> +    if isinstance(args, str):
> > >> +        arg_list.append(args)
> > >> +    else:
> > >> +        for name, typ in args.iteritems():
> > >> +            # optional arg
> > >> +            if name.startswith("*"):
> > >> +                name = name[1:]
> > >> +                arg_list.append("['%s': @var{%s}]" % (name, typ))
> > >> +            # regular arg
> > >> +            else:
> > >> +                arg_list.append("'%s': @var{%s}" % (name, typ))
> > >
> > > Inappropriate use of @var.  @var is for metasyntactic variables,
> > > i.e. something that stands for another piece of text.  typ isn't, it's
> > > the name of a specific QAPI type.  I think you should use @code.
> >
> > yes
> >
> > > This is the reason why the type names in qemu-qmp-ref.txt are often
> > > mangled, e.g.
> > >
> > >  -- Struct: VersionInfo { 'qemu': VERSIONTRIPLE, 'package': STR }
> >
> > Right, we have format issue here. If we change it for @code, we get
> > additional quotes in the text format. The simpler is to use no format or
> > @t{}
>
> Pick something you like.
>
> > >> +
> > >> +    return ", ".join(arg_list)
> > >> +
> > >> +
> > >> +def texi_body(doc):
> > >> +    """
> > >> +    Format the body of a symbol documentation:
> > >> +    - a table of arguments
> > >> +    - followed by "Returns/Notes/Since/Example" sections
> > >> +    """
> > >> +    def _section_order(section):
> > >> +        return {"Returns": 0,
> > >> +                "Note": 1,
> > >> +                "Notes": 1,
> > >> +                "Since": 2,
> > >> +                "Example": 3,
> > >> +                "Examples": 3}[section]
> > >> +
> > >> +    body = "@table @asis\n"
> > >> +    for arg, section in doc.args.iteritems():
> > >> +        desc = str(section)
> > >> +        opt = ''
> > >> +        if desc.startswith("#optional"):
> > >> +            desc = desc[10:]
> > >> +            opt = ' *'
> > >> +        elif desc.endswith("#optional"):
> > >> +            desc = desc[:-10]
> > >> +            opt = ' *'
> > >> +        body += "@item @code{'%s'}%s\n%s\n" % (arg, opt,
> texi_comment(desc))
> > >> +    body += "@end table\n"
> > >> +    body += texi_comment(str(doc.body))
> > >> +
> > >> +    meta = sorted(doc.meta, key=lambda i: _section_order(i.name))
> > >> +    for section in meta:
> > >> +        name, doc = (section.name, str(section))
> > >> +        func = texi_comment
> > >> +        if name.startswith("Example"):
> > >> +            func = texi_example
> > >> +
> > >> +        body += "\n@quotation %s\n%s\n@end quotation" % \
> > >> +                (name, func(doc))
> > >> +    return body
> > >> +
> > >> +
> > >> +def texi_alternate(expr, doc):
> > >> +    """
> > >> +    Format an alternate to texi
> > >> +    """
> > >> +    args = texi_args(expr)
> > >> +    body = texi_body(doc)
> > >> +    return STRUCT_FMT(type="Alternate",
> > >> +                      name=doc.symbol,
> > >> +                      attrs="[ " + args + " ]",
> > >> +                      body=body)
> > >> +
> > >> +
> > >> +def texi_union(expr, doc):
> > >> +    """
> > >> +    Format an union to texi
> > >
> > > I think it's "a union".
> >
> > ok
> >
> > >> +    """
> > >> +    attrs = "@{ " + texi_args(expr, "base") + " @}"
> > >> +    discriminator = expr.get("discriminator")
> > >> +    if not discriminator:
> > >> +        union = "Flat Union"
> > >> +        discriminator = "type"
> > >> +    else:
> > >> +        union = "Union"
> > >
> > > Condition is backwards.
> >
> > fixed
> >
> > >> +    attrs += " + '%s' = [ " % discriminator
> > >> +    attrs += texi_args(expr, "data") + " ]"
> > >> +    body = texi_body(doc)
> > >> +
> > >> +    return STRUCT_FMT(type=union,
> > >> +                      name=doc.symbol,
> > >> +                      attrs=attrs,
> > >> +                      body=body)
> > >
> > > You're inventing syntax here.  Example output:
> > >
> > >  -- Union: QCryptoBlockOpenOptions { QCryptoBlockOptionsBase } +
> > >           'format' = [ 'qcow': QCRYPTOBLOCKOPTIONSQCOW, 'luks':
> > >           QCRYPTOBLOCKOPTIONSLUKS ]
> > >
> > >      The options that are available for all encryption formats when
> > >      opening an existing volume
> > >           Since: 2.6
> > >
> > >  -- Flat Union: ImageInfoSpecific { } + 'type' = [ 'qcow2':
> > >           IMAGEINFOSPECIFICQCOW2, 'vmdk': IMAGEINFOSPECIFICVMDK,
> 'luks':
> > >           QCRYPTOBLOCKINFOLUKS ]
> > >
> > >      A discriminated record of image format specific information
> > >      structures.
> > >           Since: 1.7
> > >
> > > Note that QCryptoBlockOpenOptions is actually a flat union, and
> > > ImageInfoSpecific a simple union.  As I said, the condition is
> > > backwards.
> > >
> > > The meaning of the attrs part is unobvious.  Familiarity with schema
> > > syntax doesn't really help.
> > >
> > > Could we simply use schema syntax here?
> >
> > Union: QCryptoBlockOpenOptions {
> > 'base': QCryptoBlockOptionsBase,
> > 'discriminator': 'format',
> > 'data':  { 'qcow': QCryptoBlockOptionsQCow, 'luks':
> > QCryptoBlockCreateOptionsLUKS }
> >
> > }
> >
> > Doesn't look obvious either and pollute the documentation with
> > schema-specific parameters.
>
> The schema syntax for flat unions is pretty horrid :)  I hope to improve
> it, but there's more urgent fish to fry.
>
> > > If not: whatever format you use, you first need to explain it.
> >
> > I am not sure my solution is the best and will remain, but ok let's try
> to
> > document it for now.
>
> Before you pour time into documenting what you have, we should probably
> discuss what we need.
>

Can this be marked as limitation and improved laster? The series is big,
patch is getting bigger over time, it's quite hard to handle all your
requirements in one go.

> >> +
> > >> +
> > >> +def texi_enum(expr, doc):
> > >> +    """
> > >> +    Format an enum to texi
> > >> +    """
> > >> +    for i in expr['data']:
> > >> +        if i not in doc.args:
> > >> +            doc.args[i] = ''
> > >> +    body = texi_body(doc)
> > >> +    return ENUM_FMT(name=doc.symbol,
> > >> +                    body=body)
> > >> +
> > >> +
> > >> +def texi_struct(expr, doc):
> > >> +    """
> > >> +    Format a struct to texi
> > >> +    """
> > >> +    args = texi_args(expr)
> > >> +    body = texi_body(doc)
> > >> +    attrs = "@{ " + args + " @}"
> > >> +    base = expr.get("base")
> > >> +    if base:
> > >> +        attrs += " + %s" % base
> > >> +    return STRUCT_FMT(type="Struct",
> > >> +                      name=doc.symbol,
> > >> +                      attrs=attrs,
> > >> +                      body=body)
> > >
> > > More syntax invention.  Example output:
> > >
> > >  -- Struct: BlockdevOptionsReplication { 'mode': REPLICATIONMODE,
> > >           ['top-id': STR] } + BlockdevOptionsGenericFormat
> > >
> > >      ''mode''
> > >           the replication mode
> > >      ''top-id'' *
> > >           In secondary mode, node name or device ID of the root node
> who
> > >           owns the replication node chain.  Must not be given in
> primary
> > >           mode.
> > >      Driver specific block device options for replication
> > >           Since: 2.8
> > >
> > > Meaning of the attrs part is perhaps more guessable here, but it's
> still
> > > guesswork.
> > >
> > > The meaning of * after ''top-id'' is also unobvious.
> > >
> > > Note the redundancy between the attrs part and the body: both state
> > > member names and optionalness.  The body doesn't state member types and
> > > base type.  If we fixed that, we could drop the attrs part and save us
> > > the trouble of defining and explaining a syntax for it.
> > >
> > > Let me take a step back.  This document is about the QMP wire format.
> > > There are no such things as simple and flat unions on the wire, only
> > > JSON objects.  QMP introspection duly describes a type's JSON objects,
> > > not how it's defined in QAPI.  I think QMP documentation should ideally
> > > do the same.
> > >
> > > QMP introspection uses a common structure for struct, simple and flat
> > > union: common members, variants, and if variants, then the common
> member
> > > that serves as tag.  See introspect.json for details.
> > >
> > > Base types are flattened away.  Appropriate for introspection, but
> > > documentation shouldn't do that.
> > >
> > > I wrote "ideally" because it's probably too big a step.  I'm willing to
> > > settle for something less than ideal.
> >
> > I don't have clear idea what to do here, so if we can leave that for
> later,
> > that would be nice for me (I am already spending more time than I
> imagined
> > I would on doc stuff)
>
> Me too, if that's any consolation...
>
> Perhaps I can find a bit of time to think so I can propose what we could
> do.
>
> > >> +
> > >> +
> > >> +def texi_command(expr, doc):
> > >> +    """
> > >> +    Format a command to texi
> > >> +    """
> > >> +    args = texi_args(expr)
> > >> +    ret = expr["returns"] if "returns" in expr else ""
> > >> +    body = texi_body(doc)
> > >> +    return COMMAND_FMT(type="Command",
> > >> +                       name=doc.symbol,
> > >> +                       ret=ret,
> > >> +                       args="(" + args + ")",
> > >> +                       body=body)
> > >> +
> > >> +
> > >> +def texi_event(expr, doc):
> > >> +    """
> > >> +    Format an event to texi
> > >> +    """
> > >> +    args = texi_args(expr)
> > >> +    body = texi_body(doc)
> > >> +    return COMMAND_FMT(type="Event",
> > >> +                       name=doc.symbol,
> > >> +                       ret="",
> > >> +                       args="(" + args + ")",
> > >> +                       body=body)
> > >> +
> > >> +
> > >> +def texi_expr(expr, doc):
> > >> +    """
> > >> +    Format an expr to texi
> > >> +    """
> > >> +    (kind, _) = expr.items()[0]
> > >> +
> > >> +    fmt = {"command": texi_command,
> > >> +           "struct": texi_struct,
> > >> +           "enum": texi_enum,
> > >> +           "union": texi_union,
> > >> +           "alternate": texi_alternate,
> > >> +           "event": texi_event}
> > >> +    try:
> > >> +        fmt = fmt[kind]
> > >> +    except KeyError:
> > >> +        raise ValueError("Unknown expression kind '%s'" % kind)
> > >
> > > The try / except converts one kind of error into another.  What does
> > > that buy us?  As far as I can tell, this shouldn't ever happen.
> >
> > dropped
> >
> > >> +
> > >> +    return fmt(expr, doc)
> > >> +
> > >> +
> > >> +def texi(docs):
> > >> +    """
> > >> +    Convert QAPI schema expressions to texi documentation
> > >> +    """
> > >> +    res = []
> > >> +    for doc in docs:
> > >> +        expr = doc.expr
> > >> +        if not expr:
> > >> +            res.append(texi_body(doc))
> > >> +            continue
> > >> +        try:
> > >> +            doc = texi_expr(expr, doc)
> > >> +            res.append(doc)
> > >> +        except:
> > >> +            print >>sys.stderr, "error at @%s" % doc.info
> > >> +            raise
> > >> +
> > >> +    return '\n'.join(res)
> > >> +
> > >> +
> > >> +def main(argv):
> > >> +    """
> > >> +    Takes schema argument, prints result to stdout
> > >> +    """
> > >> +    if len(argv) != 2:
> > >> +        print >>sys.stderr, "%s: need exactly 1 argument: SCHEMA" %
> argv[0]
> > >> +        sys.exit(1)
> > >> +
> > >> +    schema = qapi.QAPISchema(argv[1], strict_doc=True)
> > >> +    print texi(schema.docs)
> > >> +
> > >> +
> > >> +if __name__ == "__main__":
> > >> +    main(sys.argv)
> > >> diff --git a/docs/qapi-code-gen.txt b/docs/qapi-code-gen.txt
> > >> index 2841c51..8bc963e 100644
> > >> --- a/docs/qapi-code-gen.txt
> > >> +++ b/docs/qapi-code-gen.txt
> > >> @@ -45,16 +45,13 @@ QAPI parser does not).  At present, there is no
> place where a QAPI
> > >>  schema requires the use of JSON numbers or null.
> > >>
> > >>  Comments are allowed; anything between an unquoted # and the
> following
> > >> -newline is ignored.  Although there is not yet a documentation
> > >> -generator, a form of stylized comments has developed for consistently
> > >> -documenting details about an expression and when it was added to the
> > >> -schema.  The documentation is delimited between two lines of ##, then
> > >> -the first line names the expression, an optional overview is
> provided,
> > >> -then individual documentation about each member of 'data' is
> provided,
> > >> -and finally, a 'Since: x.y.z' tag lists the release that introduced
> > >> -the expression.  Optional members are tagged with the phrase
> > >> -'#optional', often with their default value; and extensions added
> > >> -after the expression was first released are also given a '(since
> > >> +newline is ignored.  The documentation is delimited between two lines
> > >> +of ##, then the first line names the expression, an optional overview
> > >> +is provided, then individual documentation about each member of
> 'data'
> > >> +is provided, and finally, a 'Since: x.y.z' tag lists the release that
> > >> +introduced the expression.  Optional members are tagged with the
> > >> +phrase '#optional', often with their default value; and extensions
> > >> +added after the expression was first released are also given a
> '(since
> > >>  x.y.z)' comment.  For example:
> > >>
> > >>      ##
> > >> @@ -73,12 +70,49 @@ x.y.z)' comment.  For example:
> > >>      #           (Since 2.0)
> > >>      #
> > >>      # Since: 0.14.0
> > >> +    #
> > >> +    # Notes: You can also make a list:
> > >> +    #        - with items
> > >> +    #        - like this
> > >> +    #
> > >> +    # Example:
> > >> +    #
> > >> +    # -> { "execute": ... }
> > >> +    # <- { "return": ... }
> > >> +    #
> > >>      ##
> > >>      { 'struct': 'BlockStats',
> > >>        'data': {'*device': 'str', 'stats': 'BlockDeviceStats',
> > >>                 '*parent': 'BlockStats',
> > >>                 '*backing': 'BlockStats'} }
> > >
> > > This example gives an idea of how to document struct types.  But the
> > > reader is left guessing how to document other kinds of definitions.  In
> > > particular, there's no mention of "Returns", "Note" and "Examples"
> > > sections anywhere in this file.  As is, this could do for a tutorial,
> > > but this file is a *reference*, not a tutorial.
> > >
> > > For a reference, we need to be more thorough.  A doc comments section
> on
> > > its own seems advisable.
> > >
> > > I guess doc comment examples are best added to the schema code examples
> > > in sections === Struct types ===, ..., === Events ===.
> > >
> > > Careful review for completeness is advised.
> >
> > So far we did quite fine with the generic example. I am not convinced we
> > need to have doc comments for all kinds of types, it looks redundant to
> me.
>
> Well, so far we didn't require people to write well-formed doc comments.
>
> Without such comments in qapi-code-gen.txt's examples, they become
> incomplete.  Pity, because I like to feed them to the QAPI generators;
> extracted like this:
>
>     $ sed '1,/\$ cat example-schema.json/d;/^[^ ]/,$d;s/^    //'
> ../docs/qapi-code-gen.txt >example-schema.json
>
> > >> +It's also possible to create documentation sections, such as:
> > >> +
> > >> +    ##
> > >> +    # = Section
> > >> +    # == Subsection
> > >> +    #
> > >> +    # Some text foo with *strong* and _emphasis_
> > >> +    # 1. with a list
> > >> +    # 2. like that
> > >> +    #
> > >> +    # And some code:
> > >> +    # | $ echo foo
> > >> +    # | -> do this
> > >> +    # | <- get that
> > >> +    #
> > >> +    ##
> > >> +
> > >> +Text *foo* and _foo_ are for "strong" and "emphasis" styles (they do
> > >> +not work over multiple lines). @foo is used to reference a symbol.
> > >> +
> > >> +Lines starting with the following characters and a space:
> > >> +- | are examples
> > >> +- = are top section
> > >> +- == are subsection
> > >> +- X. or X) are enumerations (X is any number)
> > >> +- o/*/- are itemized list
> > >> +
> > >
> > > Is this doc markup documentation complete?
> >
> > I think so
> >
> > > Is it related to any existing text markup language?  Hmm, the commit
> > > message calls it "Markdown-like".  Should we mention that here?
> >
> > I'll use "annotations" instead in the doc
> >
> > > Is all markup valid in all contexts?  For instance, is = Section valid
> > > in symbol blocks?  What does | (or any markup for that matter) do
> within
> > > an Example section?
> >
> > Example is verbatim<

-- 
Marc-André Lureau

Re: [Qemu-devel] [PATCH v5 13/17] qapi: add qapi2texi script

Reply via email to