On Wed, 04 Mar 2026 12:07:45 +0200 Jani Nikula <[email protected]> wrote:

> On Mon, 23 Feb 2026, Jonathan Corbet <[email protected]> wrote:
> > Jani Nikula <[email protected]> writes:
> >
> >> There's always the question, if you're putting a lot of effort into
> >> making kernel-doc closer to an actual C parser, why not put all that
> >> effort into using and adapting to, you know, an actual C parser?
> >
> > Not speaking to the current effort but ... in the past, when I have
> > contemplated this (using, say, tree-sitter), the real problem is that
> > those parsers simply strip out the comments. Kerneldoc without comments
> > ... doesn't work very well. If there were a parser without those
> > problems, and which could be made to do the right thing with all of our
> > weird macro usage, it would certainly be worth considering.
>
> I think e.g. libclang and its Python bindings can be made to work. The
> main problems with that are passing proper compiler options (because
> it'll need to include stuff to know about types etc. because it is a
> proper parser), preprocessing everything is going to take time, you need
> to invest a bunch into it to know how slow exactly compared to the
> current thing and whether it's prohibitive, and it introduces an extra
> dependency.

It is not just that. Assume we're parsing something like this:

	static __always_inline int _raw_read_trylock(rwlock_t *lock)
		__cond_acquires_shared(true, lock);

using a cpp (or libclang). We would need to define/undefine 3 symbols:

	#if defined(WARN_CONTEXT_ANALYSIS) && !defined(__CHECKER__) && !defined(__GENKSYMS__)

(In this particular case, the default is OK, but in others it may not
be.)

This is far more complex than just writing logic that converts the
above into:

	static int _raw_read_trylock(rwlock_t *lock);

which is the current kernel-doc approach.

Also, when using a C preprocessor, we might end up with a very big
prototype, and even have arch-specific defines affecting it, as some
includes may be inside arch/*/include.
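
As a rough illustration of the conversion described above, here is a
minimal Python sketch. It is not the actual kernel-doc code: the
DROP_WORDS and DROP_CALLS tables and the strip_annotations() helper are
hypothetical names made up for this example, and a real implementation
would need a much longer list of annotations.

```python
import re

# Hypothetical tables, for illustration only; not kernel-doc's real lists.
DROP_WORDS = {"__always_inline"}          # bare attribute-like macros
DROP_CALLS = {"__cond_acquires_shared"}   # function-like annotation macros

# Identifiers, numbers, or any single non-space character.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")

NO_SPACE_BEFORE = {"(", ")", ",", ";"}
NO_SPACE_AFTER = {"(", "*"}


def join_tokens(tokens):
    """Re-assemble tokens using simple prototype-style spacing."""
    text = ""
    for tok in tokens:
        if text and tok not in NO_SPACE_BEFORE and text[-1] not in NO_SPACE_AFTER:
            text += " "
        text += tok
    return text


def strip_annotations(proto):
    """Drop known annotation macros from a C prototype string."""
    tokens = TOKEN_RE.findall(proto)
    out = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok in DROP_WORDS:
            i += 1
            continue
        if tok in DROP_CALLS and i + 1 < len(tokens) and tokens[i + 1] == "(":
            # Skip the macro name, then its balanced argument list.
            depth = 0
            i += 1
            while i < len(tokens):
                if tokens[i] == "(":
                    depth += 1
                elif tokens[i] == ")":
                    depth -= 1
                    if depth == 0:
                        i += 1
                        break
                i += 1
            continue
        out.append(tok)
        i += 1
    return join_tokens(out)


proto = ("static __always_inline int _raw_read_trylock(rwlock_t *lock) "
         "__cond_acquires_shared(true, lock);")
print(strip_annotations(proto))
# prints: static int _raw_read_trylock(rwlock_t *lock);
```

No include paths or defines are needed here, since nothing gets
expanded: the annotations are simply recognized by name and dropped.
The price is that the tables have to be kept up to date by hand.
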

So, with a preprocessor-based parser, we would need a kernel-doc
".config" file with a set of defines, which can be hard to maintain.

> So yeah, there are definitely tradeoffs there. But it's not like this
> constant patching of kernel-doc is exactly burden free either. I don't
> know, is it just me, but I'd like to think as a profession we'd be past
> writing ad hoc C parsers by now.

I'd say that the binding logic and the ".config" kernel-doc defines will
be complex to maintain. Maybe more complex than kernel-doc patching and
a simple C parser, like the one in my test.

> > On Mon, 23 Feb 2026 15:47:00 +0200
> > Jani Nikula <[email protected]> wrote:
> >> There's always the question, if you're putting a lot of effort into
> >> making kernel-doc closer to an actual C parser, why not put all that
> >> effort into using and adapting to, you know, an actual C parser?
> >
> > Playing with this idea, it is not that hard to write an actual C
> > parser - or at least a tokenizer.
>
> Just for the record, I suggested using an existing parser, not going all
> NIH and writing your own.

I know. Still, I suspect that a simple tokenizer similar to my example
might do the job without any major impact. But yeah, tests are needed.

-- 
Thanks,
Mauro

