On Tue, 03 Mar 2026 10:34:48 -0700
Jonathan Corbet <[email protected]> wrote:

> Mauro Carvalho Chehab <[email protected]> writes:
> 
> > The regular expression currently expects a single word for the
> > type, but it may be something like  "struct foo".
> >
> > Add support for it.
> >
> > Signed-off-by: Mauro Carvalho Chehab <[email protected]>
> > Acked-by: Randy Dunlap <[email protected]>
> > Tested-by: Randy Dunlap <[email protected]>
> > Reviewed-by: Aleksandr Loktionov <[email protected]>
> > ---
> >  tools/lib/python/kdoc/kdoc_parser.py | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/tools/lib/python/kdoc/kdoc_parser.py b/tools/lib/python/kdoc/kdoc_parser.py
> > index 39ff27d421eb..22a820d33dc8 100644
> > --- a/tools/lib/python/kdoc/kdoc_parser.py
> > +++ b/tools/lib/python/kdoc/kdoc_parser.py
> > @@ -1018,14 +1018,14 @@ class KernelDoc:
> >  
> >          default_val = None
> >  
> > -        r= KernRe(OPTIONAL_VAR_ATTR + r"[\w_]*\s+(?:\*+)?([\w_]+)\s*[\d\]\[]*\s*(=.*)?")
> > +        r= KernRe(OPTIONAL_VAR_ATTR + r"\s*[\w_\s]*\s+(?:\*+)?([\w_]+)\s*[\d\]\[]*\s*(=.*)?")  
> 
> Just for future reference...I *really* think that the code is improved
> by breaking up and commenting gnarly regexes like this.  They are really
> unreadable in this form.  (And yes, I know the code has been full of
> these forever, but we can always try to make it better :)

Heh, you're right: this could be better.
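
Just to illustrate the point, something like re.VERBOSE would already let
us split up and comment the pattern from this patch. A rough, untested
sketch follows; the OPTIONAL_VAR_ATTR stand-in below is made up for the
example, the real definition lives in kdoc_parser.py:

```python
import re

# Made-up stand-in for kdoc_parser's OPTIONAL_VAR_ATTR, just so the
# example is self-contained; the real pattern differs.
OPTIONAL_VAR_ATTR = r"(?:__attribute__\s*\(\([^)]*\)\)\s*)?"

# Same pattern as in the patch, split up and commented via re.VERBOSE.
VAR_DECL = re.compile(
    OPTIONAL_VAR_ATTR +
    r"""
    \s* [\w\s]*         # type, possibly multi-word ("struct foo")
    \s+                 # at least one space before the declarator
    (?:\*+)?            # optional pointer stars
    (\w+)               # group 1: the variable name
    \s* [\d\]\[]*       # optional array subscripts
    \s* (=.*)?          # group 2: optional default value
    """,
    re.VERBOSE,
)

m = VAR_DECL.match("struct foo *bar[4] = {0}")
```

(Note that whitespace inside character classes is still significant
under re.VERBOSE, so the classes stay unbroken.)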

> Anyway, just grumbling.

Heh, if we start using code like the tokenizer I'm experimenting
with here:

        https://lore.kernel.org/linux-doc/20260303155310.5235b367@localhost/

we could probably get rid of the regexes in the future, using instead
a loop that picks up "ID" tokens; basically, we would have something
similar to this completely untested code snippet:

        self.tokenizer = CTokenizer()

        ...

        ids = []
        get_default = False

        tokens = self.tokenizer(proto)

        for kind, value in tokens:
                if kind == "ID":
                        ids.append(value)

                elif kind == "OP" and value == "=":
                        get_default = True
                        break

        if get_default:
                # keep consuming the same iterator, right after the "="
                for kind, value in tokens:
                        if kind in ["CHAR", "STRING", "NUMBER"]:
                                default_val = value
                                break

        declaration_name = ids[-1]
        
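For reference, a minimal self-contained sketch of what such a tokenizer
could look like, together with the loop above. This CTokenizer is a
made-up stand-in for the example, not the one from the linked patch:

```python
import re

# Made-up minimal tokenizer yielding (kind, value) pairs; the real
# CTokenizer from the experiment linked above will differ.
class CTokenizer:
    TOKEN_RE = re.compile(r"""
        (?P<STRING> "(?:\\.|[^"\\])*" )     # string literal
      | (?P<CHAR>   '(?:\\.|[^'\\])*' )     # character constant
      | (?P<NUMBER> \d[\w.]* )              # numeric constant
      | (?P<ID>     [A-Za-z_]\w* )          # identifier or keyword
      | (?P<OP>     [*=\[\](){},;] )        # single-character operators
      | (?P<SKIP>   \s+ )                   # whitespace, discarded
    """, re.VERBOSE)

    def __call__(self, text):
        for m in self.TOKEN_RE.finditer(text):
            if m.lastgroup != "SKIP":
                yield m.lastgroup, m.group()

tokenizer = CTokenizer()

proto = "static int foo = 42;"
ids = []
default_val = None
get_default = False

tokens = tokenizer(proto)
for kind, value in tokens:
    if kind == "ID":
        ids.append(value)
    elif kind == "OP" and value == "=":
        get_default = True
        break

if get_default:
    # keep consuming the same iterator, right after the "="
    for kind, value in tokens:
        if kind in ("CHAR", "STRING", "NUMBER"):
            default_val = value
            break

declaration_name = ids[-1]
```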

Thanks,
Mauro
