On Thu, 14 Aug 2025 18:13:20 +0100
Peter Maydell <peter.mayd...@linaro.org> wrote:

> This commit makes the equivalent changes to the Python script that we
> had for the old Perl script in commit 4cf41794411f ("docs: tweak
> kernel-doc for QEMU coding standards").  To repeat the rationale from
> that commit:
> 
>     Surprisingly, QEMU does have a pretty consistent doc comment style and
>     it is not very different from the Linux kernel's.  Of the documentation
>     "sigils", only "#" separates the QEMU doc comment style from Linux's,
>     and it has 200+ instances vs. 6 for the kernel's '&struct foo' (all in
>     accel/tcg/translate-all.c), so it's clear that the two standards are
>     different in this respect.  In addition, our structs are typedefed and
>     recognized by CamelCase names.
> 
> Note that in 4cf41794411f we used '(?!)' as our type_fallback regex;
> this is strictly not quite a replacement for the upstream
> '\&([_\w]+)', because the latter includes a group that can later be
> matched with \1, and the former does not.  The old perl script did
> not care about this, but the python version does, so we must include
> the extra set of brackets to ensure we have a group.
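
To illustrate the point about the missing group, here is a minimal sketch using Python's plain re module instead of KernRe (that substitution is an assumption made to keep the example self-contained; the sample text and variable names are hypothetical). It shows that '(?!)' never matches, and that a replacement template using \1 is rejected when the pattern has no group, even though no match would ever occur:

```python
import re

# '(?!)' is an empty negative lookahead: it can never match anything.
no_group = re.compile(r"(?!)")
# Same never-matching pattern, wrapped in brackets so that group 1 exists.
with_group = re.compile(r"((?!))")

text = "see &foo and #Bar"  # hypothetical sample input

# The replacement template refers to \1. Plain 're' validates the template
# against the pattern's groups up front, so the group-less pattern fails.
try:
    no_group.sub(r":c:type:`\1`", text)
    no_group_error = None
except re.error as exc:
    no_group_error = str(exc)

# With the extra brackets the template is valid, and since the pattern
# never matches, the text comes back unchanged.
result = with_group.sub(r":c:type:`\1`", text)

print(no_group_error)
print(result)
```

This only demonstrates the behaviour of the standard re module; exactly where the Python kernel-doc code depends on the group is not shown here.
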
> 
> This commit does not include all the same changes that 4cf41794411f
> did.  Of the missing pieces, some had already gone in an earlier
> kernel-doc update; the parts we still had but do not include here are:
> 
>     @@ -2057,7 +2060,7 @@
>          }
>          elsif (/$doc_decl/o) {
>             $identifier = $1;
>     -       if (/\s*([\w\s]+?)(\(\))?\s*-/) {
>     +       if (/\s*([\w\s]+?)(\s*-|:)/) {
>                 $identifier = $1;
>             }
> 
>     @@ -2067,7 +2070,7 @@
>             $contents = "";
>             $section = $section_default;
>             $new_start_line = $. + 1;
>     -       if (/-(.*)/) {
>     +       if (/[-:](.*)/) {
>                 # strip leading/trailing/multiple spaces
>                 $descr= $1;
>                 $descr =~ s/^\s*//;
> 
> The second of these is already in the upstream version: the line r =
> KernRe("[-:](.*)") in process_name() matches the regex we have. 

Yes. If I recall correctly, we added this one to handle a couple of
files that used ":" as the separator throughout. They violate what
is documented as valid kernel-doc markup, but it didn't hurt to add
support for that variant.
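
As a rough sketch of what accepting both separators means in practice, the snippet below applies the same "[-:](.*)" pattern with Python's plain re module (used here instead of KernRe as an assumption, to keep the example standalone). The description() helper and the sample comment lines are hypothetical:

```python
import re

# Same pattern as quoted above: either '-' or ':' introduces the description.
name_sep = re.compile(r"[-:](.*)")

def description(line):
    """Return the text after the first '-' or ':', stripped, or None."""
    m = name_sep.search(line)
    return m.group(1).strip() if m else None

# Documented kernel-doc form, using '-' as the separator.
dash = description(" * my_func() - do the thing")
# Variant with ':' as the separator.
colon = description(" * my_func: do the thing")
# No separator at all: nothing to extract.
missing = description(" * my_func")

print(dash, colon, missing)
```
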

> The
> first change has been refactored into the doc_begin_data and
> doc_begin_func changes.  Since the output HTML for QEMU's
> documentation has no relevant changes with the new kerneldoc, we
> assume that this too has been handled upstream.
> 
> Signed-off-by: Peter Maydell <peter.mayd...@linaro.org>

LGTM, but see my notes below.

Anyway:

Reviewed-by: Mauro Carvalho Chehab <mchehab+hua...@kernel.org>

> ---
>  scripts/lib/kdoc/kdoc_output.py | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/scripts/lib/kdoc/kdoc_output.py b/scripts/lib/kdoc/kdoc_output.py
> index ea8914537ba..39fa872dfca 100644
> --- a/scripts/lib/kdoc/kdoc_output.py
> +++ b/scripts/lib/kdoc/kdoc_output.py
> @@ -38,12 +38,12 @@
>  type_fp_param2 = KernRe(r"\@(\w+->\S+)\(\)", cache=False)
>  
>  type_env = KernRe(r"(\$\w+)", cache=False)
> -type_enum = KernRe(r"\&(enum\s*([_\w]+))", cache=False)
> -type_struct = KernRe(r"\&(struct\s*([_\w]+))", cache=False)
> -type_typedef = KernRe(r"\&(typedef\s*([_\w]+))", cache=False)
> -type_union = KernRe(r"\&(union\s*([_\w]+))", cache=False)
> -type_member = KernRe(r"\&([_\w]+)(\.|->)([_\w]+)", cache=False)
> -type_fallback = KernRe(r"\&([_\w]+)", cache=False)

> +type_enum = KernRe(r"#(enum\s*([_\w]+))", cache=False)
> +type_struct = KernRe(r"#(struct\s*([_\w]+))", cache=False)
> +type_typedef = KernRe(r"#(([A-Z][_\w]*))", cache=False)
> +type_union = KernRe(r"#(union\s*([_\w]+))", cache=False)
> +type_member = KernRe(r"#([_\w]+)(\.|->)([_\w]+)", cache=False)
> +type_fallback = KernRe(r"((?!))", cache=False) # this never matches
>  type_member_func = type_member + KernRe(r"\(\)", cache=False)

This looks like something a class override would handle better.

Basically, you can do something like:


        type_enum = KernRe(r"#(enum\s*([_\w]+))", cache=False)
        type_struct = KernRe(r"#(struct\s*([_\w]+))", cache=False)
        type_typedef = KernRe(r"#(([A-Z][_\w]*))", cache=False)
        type_union = KernRe(r"#(union\s*([_\w]+))", cache=False)
        type_member = KernRe(r"#([_\w]+)(\.|->)([_\w]+)", cache=False)
        type_fallback = KernRe(r"((?!))", cache=False) # this never matches
        ...

        (either keep the other types or add an __init__ that would append
         or replace only the above elements)

        class QemuRestFormat(RestFormatOutput):
             highlights = [
                (type_constant, r"``\1``"),
                (type_constant2, r"``\1``"),

                # Note: need to escape () to avoid func matching later
                (type_member_func, r":c:type:`\1\2\3\\(\\) <\1>`"),
                (type_member, r":c:type:`\1\2\3 <\1>`"),
                (type_fp_param, r"**\1\\(\\)**"),
                (type_fp_param2, r"**\1\\(\\)**"),
                (type_func, r"\1()"),
                (type_enum, r":c:type:`\1 <\2>`"),
                (type_struct, r":c:type:`\1 <\2>`"),
                (type_typedef, r":c:type:`\1 <\2>`"),
                (type_union, r":c:type:`\1 <\2>`"),
        
                # in rst this can refer to any type
                (type_fallback, r":c:type:`\1`"),
                (type_param_ref, r"**\1\2**")
            ]

where the regexes above are the QEMU-specific ones.

Then, when creating the KernelFiles() instance in the kerneldoc.py Sphinx
extension:

        def setup_kfiles(app):
            global kfiles

            out_style = QemuRestFormat()
            kfiles = KernelFiles(out_style=out_style, logger=logger)

keeping the rest of the code from the kernel version of kerneldoc.py.
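
To show the override idea in isolation, here is a self-contained sketch. The RestFormatOutput below is a simplified stand-in written for this example, not the real class from kdoc_output.py, and its highlight() method, the reduced highlights tables, and the sample strings are all assumptions. The only point is that a subclass can swap the table of (regex, replacement) pairs without touching the base class:

```python
import re

# Simplified stand-in for the upstream output class (NOT the real
# kdoc_output.RestFormatOutput): it applies a table of substitutions.
class RestFormatOutput:
    # Kernel-style sigils: '&struct foo' and the '&foo' fallback.
    highlights = [
        (re.compile(r"&(struct\s*([_\w]+))"), r":c:type:`\1 <\2>`"),
        (re.compile(r"&([_\w]+)"), r":c:type:`\1`"),
    ]

    def highlight(self, text):
        # Apply each (pattern, replacement) pair in order.
        for pattern, repl in self.highlights:
            text = pattern.sub(repl, text)
        return text


# QEMU flavour: only the class attribute is overridden.
class QemuRestFormat(RestFormatOutput):
    highlights = [
        (re.compile(r"#(struct\s*([_\w]+))"), r":c:type:`\1 <\2>`"),
        # CamelCase typedef names, e.g. '#MemoryRegion'.
        (re.compile(r"#(([A-Z][_\w]*))"), r":c:type:`\1 <\2>`"),
        # Fallback that never matches, kept with a group for '\1'.
        (re.compile(r"((?!))"), r":c:type:`\1`"),
    ]


kernel_out = RestFormatOutput().highlight("see &struct foo")
qemu_out = QemuRestFormat().highlight(
    "see #struct foo and #MemoryRegion, not &bar")

print(kernel_out)
print(qemu_out)
```

With this shape the kernel's tables stay untouched, and the QEMU-specific regexes live in one subclass that is selected where the output style is instantiated.
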

Thanks,
Mauro
