Re: [RFC PATCH v1 2/6] kernel-doc: replace kernel-doc perl parser with a pure python one (WIP)

2017-01-27 Thread Markus Heiser

On 26.01.2017 at 20:26, Jani Nikula wrote:

> On Thu, 26 Jan 2017, Jonathan Corbet  wrote:
>> Give me a new kerneldoc that passes those tests, and I'll happily
>> merge it.  (I have some sympathy with the idea that we should look
>> into other parsers, but I would not hold up a new kerneldoc that
>> passed those tests on this basis alone.)
> 
> I'll just note in passing that having another parser that actually works
> for our needs might be a pink unicorn pony. It might exist, it might
> not, and someone would have to put in the hours to try to find it, tame
> it, and bring it to the kernel. But it would be awesome to
> have. Switching to a homebrew Python parser first does not preclude a
> unicorn hunt later.

Here is my experience with parsing C code and kernel-doc comments.

The regular expressions divide into two parts:

a.) those parsing "C sources", catching function prototypes, structs etc. and

b.) those parsing "kernel-doc comments", catching attribute descriptions,
   cross references etc.

When I developed the py-version in my POC I realized that the regular
expressions parsing C sources (a.) aren't so bad. They have a long history and
are well tested against the kernel sources (as far as I remember, I added only
one more regexp to match function prototypes).
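
To illustrate the kind of regular expression meant in (a.), here is a minimal
Python sketch. The pattern and the helper name parse_prototype are assumptions
made up for this example; they are not taken from scripts/kernel-doc or from
the POC, and they only cover simple one-line prototypes.

```python
import re

# Simplified pattern for a one-line C function prototype:
#   <return type> <name>(<parameter list>)
# Hypothetical pattern for illustration, not the one used by kernel-doc.
PROTOTYPE = re.compile(
    r"^\s*(?P<ret>[\w\s\*]+?)\s*"   # return type, may contain "*" and qualifiers
    r"\b(?P<name>\w+)\s*"           # function name
    r"\((?P<args>[^)]*)\)"          # raw parameter list up to the first ")"
)


def parse_prototype(line):
    """Return (return_type, name, [parameters]) or None if nothing matches."""
    m = PROTOTYPE.match(line)
    if m is None:
        return None
    args = [a.strip() for a in m.group("args").split(",") if a.strip()]
    return m.group("ret").strip(), m.group("name"), args


print(parse_prototype("static int foo_init(struct foo *f, int flags);"))
print(parse_prototype("int x = 5;"))
```

Function pointers as parameters, macros and multi-line declarations would need
more patterns, which is where the long history of the existing regexps pays off.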

 This was the time when I looked at some other parsing tools, and
 after a day I threw away the idea of using an external parser
 tool, for now.

Most of the problems I had were with parsing the kernel-doc markup itself, e.g.
the attribute markup "* @foo: lorem" and its cross-ref "@foo". The latter
syntax is ambiguous; it fails mostly on newlines and with strings like
"m...@foo.bar".

When I looked at the whole sources, I also realized that we have two flavors of
kernel-doc markup:

b.1) those from traditional DocBook, where whitespace isn't markup, and

b.2) those which have been rewritten with reST markup, where whitespace is
part of the reST.

But this was only half the truth about b.2): the 'new' markup does not
consist only of pure reST markup. For convenience it is a mix of kernel-doc
markup and reST markup (e.g. remember the cross-ref mentioned above).
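
The cross-ref ambiguity mentioned above can be shown with a small Python
sketch. Both patterns and the sample address someone@foo.bar are assumptions
for illustration; the stricter pattern is one possible way to avoid e-mail
addresses, not the expression used by kernel-doc.

```python
import re

# Naive pattern: any "@" followed by an identifier counts as a cross-ref.
NAIVE_REF = re.compile(r"@(\w+)")

# Stricter pattern (an assumption, not kernel-doc's own): the "@" must not
# follow a word character or a dot, which rules out e-mail addresses.
STRICT_REF = re.compile(r"(?<![\w.])@(\w+)")

# Attribute description line such as " * @foo: lorem".
PARAM_LINE = re.compile(r"^\s*\*\s*@(\w+)\s*:\s*(.*)$")


def find_refs(text, pattern=STRICT_REF):
    """Return the names referenced as "@name" in a comment text."""
    return pattern.findall(text)


comment = "Set @foo before calling; mail someone@foo.bar for details."
print(find_refs(comment, NAIVE_REF))   # ['foo', 'foo']
print(find_refs(comment, STRICT_REF))  # ['foo']
print(PARAM_LINE.match(" * @foo: lorem").groups())  # ('foo', 'lorem')
```

The naive pattern reports the e-mail domain as a second reference to "foo";
the lookbehind removes that false hit but does nothing for the newline cases.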

I suppose that we will never completely get rid of the traditional flavor
(b.1), since that would mean changing the whole kernel source ;)

At that time I wanted to implement a parser which is able to handle both
flavors: an (undocumented) 'vintage' mode and the user-documented 'reST' mode.
But what is the criterion for switching from one mode to the other?  For this I
made a primitive assumption: every C source file which is used in a
".. kernel-doc::" directive has to be marked up with the modern reST flavor.
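
A rough Python sketch of that switching rule, assuming the set of referenced
files is collected by scanning the reST documents. The function names and the
path lib/foo.c are hypothetical and only serve the example.

```python
import re

# Matches a directive line ".. kernel-doc:: <source file>" in a reST document.
DIRECTIVE = re.compile(r"^\s*\.\.\s+kernel-doc::\s+(\S+)\s*$", re.MULTILINE)

SAMPLE_RST = """
Driver API
==========

.. kernel-doc:: drivers/base/core.c
   :export:
"""


def files_in_directives(rst_texts):
    """Collect the source paths named by kernel-doc directives."""
    found = set()
    for text in rst_texts:
        found.update(DIRECTIVE.findall(text))
    return found


def select_mode(source_path, referenced):
    """'reST' for files used in a directive, 'vintage' for all others."""
    return "reST" if source_path in referenced else "vintage"


referenced = files_in_directives([SAMPLE_RST])
print(select_mode("drivers/base/core.c", referenced))  # reST
print(select_mode("lib/foo.c", referenced))            # vintage
```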

ATM, the py-version of kernel-doc implements the same state machine as the perl
one, and the modes are implemented in that same state machine (not perfect, but
it worked for me as a first step; I suppose we can make it better).
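
To make the idea of one state machine with two modes concrete, here is a
heavily simplified Python sketch. The states, the function parse_comment and
the sample comment are assumptions for illustration; the real parsers have
more states and handle far more syntax.

```python
import re
from enum import Enum, auto


class State(Enum):
    NORMAL = auto()  # outside any kernel-doc comment
    NAME = auto()    # just saw "/**", expecting "name - short description"
    BODY = auto()    # inside the comment body


SAMPLE = [
    "/**",
    " * foo_init - set up a foo",
    " * @flags: option bits",
    " *",
    " *   indented   text",
    " */",
]


def parse_comment(lines, mode="reST"):
    """Parse one kernel-doc comment; mode is 'reST' or 'vintage'."""
    state = State.NORMAL
    doc = {"name": None, "params": {}, "body": []}
    for line in lines:
        if state is State.NORMAL:
            if line.strip() == "/**":
                state = State.NAME
        elif state is State.NAME:
            m = re.match(r"^\s*\*\s*(\w+)", line)
            if m:
                doc["name"] = m.group(1)
                state = State.BODY
        elif state is State.BODY:
            if line.strip() == "*/":
                break
            # Drop the leading " *" and at most one following space.
            text = re.sub(r"^\s*\*\s?", "", line.rstrip("\n"))
            m = re.match(r"@(\w+)\s*:\s*(.*)", text)
            if m:
                doc["params"][m.group(1)] = m.group(2)
            elif mode == "vintage":
                # DocBook flavor: whitespace carries no meaning.
                doc["body"].append(" ".join(text.split()))
            else:
                # reST flavor: indentation is part of the markup.
                doc["body"].append(text)
    return doc


print(parse_comment(SAMPLE, "reST")["body"])     # ['', '  indented   text']
print(parse_comment(SAMPLE, "vintage")["body"])  # ['', 'indented text']
```

The only place where the two modes differ in this sketch is the whitespace
handling of body lines, which matches the distinction between b.1) and b.2).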

I remember a very early discussion we had about those modes, and I know
that the idea didn't find friends in the community (at that time).  Maybe today
we have more experience and new ideas.

  I would really like to see (and to work on) a parser with which we can
  parse the whole kernel source and generate reST from it.

What do you think, is it a bloody idea?

Thanks!

-- Markus --



Re: [RFC PATCH v1 2/6] kernel-doc: replace kernel-doc perl parser with a pure python one (WIP)

2017-01-26 Thread Jani Nikula
On Thu, 26 Jan 2017, Jonathan Corbet  wrote:
> Give me a new kerneldoc that passes those tests, and I'll happily
> merge it.  (I have some sympathy with the idea that we should look
> into other parsers, but I would not hold up a new kerneldoc that
> passed those tests on this basis alone.)

I'll just note in passing that having another parser that actually works
for our needs might be a pink unicorn pony. It might exist, it might
not, and someone would have to put in the hours to try to find it, tame
it, and bring it to the kernel. But it would be awesome to
have. Switching to a homebrew Python parser first does not preclude a
unicorn hunt later.

BR,
Jani.


-- 
Jani Nikula, Intel Open Source Technology Center


Re: [RFC PATCH v1 2/6] kernel-doc: replace kernel-doc perl parser with a pure python one (WIP)

2017-01-26 Thread Jonathan Corbet
On Wed, 25 Jan 2017 20:07:47 +0100
Markus Heiser  wrote:

> So, what I mean is, the new parser has to generate a completely different reST
> output, and that's why we can't compare the perl parser with the python one on a
> reST basis ... and if reST is different, HTML is different :(
> 
> So we do not have any chance to track regressions when switching from
> the old to the new parser.
> 
> Those are my thoughts on this topic; maybe you have a solution for this?

The solution, I think, is as has been described by others in the thread.
I'll make a try at it now :)

The objectives in a patch set are something like this:

 - Replace the kernel-doc utility with one that is easier to maintain and
   enhance.

 - Add various enhancements (man pages, linting, better output, better
   parsing) to the docs build system.

What everybody is complaining about here is that all of that stuff is
being thrown in together into a single patch set.  We don't do things that
way because long experience says we'll create a mess that takes a long
time to straighten out again.

As I said before, I'm very much amenable to the idea of replacing
kernel-doc with one that is easier to work with.  I haven't yet had the
time to look closely enough at yours to have an opinion on whether it does
that or not.  But, assuming it does, the proper way to make this change is
to provide a new kerneldoc that behaves as closely to the old one as
possible, with an absolute minimum of output changes.

Doing it that way probably seems like a pretty annoying request.  But it
lets us validate its basic mechanics and be confident that we won't break
the docs build in weird ways.  It also lets us evaluate the question of
whether the replacement has merit in its own right, independent of any
other change we want to make.  Give me a new kerneldoc that passes those
tests, and I'll happily merge it.  (I have some sympathy with the idea
that we should look into other parsers, but I would not hold up a new
kerneldoc that passed those tests on this basis alone.)

*Then* we can start adding the other stuff, which, from a first look,
appears to be stuff that we very much want to have.  Each one of those,
too, should stand alone and pass muster on its own merits.  Changes
presented in this way could be merged in the same development cycle if
they are ready, but we need to be able to evaluate each one separately.

Does this make sense?  We all really appreciate the work you're doing
here, we're just asking that it be done in an evolutionary manner so we
can evaluate it properly.

Thanks,

jon


Re: [RFC PATCH v1 2/6] kernel-doc: replace kernel-doc perl parser with a pure python one (WIP)

2017-01-26 Thread Jani Nikula
On Thu, 26 Jan 2017, Markus Heiser  wrote:
> On 25.01.2017 at 21:59, Jani Nikula wrote:
>
>>> But the problem I see here is, that the perl script generates a
>>> reST output which I can't use. As an example we can take a look at
>>> the man-page builder I shipped in the series.
>> 
>> Sorry, I still don't understand *why* you can't use the same rst. Your
>> explanation seems to relate to man pages, but man pages come
>> *afterwards*, and are a separate improvement. I know you talk about lack
>> of proper structure and all that, but *why* can it strictly not be used,
>> if the *current* rst clearly can be used?
>
> "afterwards" is the word that lets me slowly realize that I have to
> stop solving the world's problems with one patch. Now I can guess what my
> next patch series has to look like. Thanks! ... for being patient with
> me.

Indeed, we change the world, one small incremental patch at a time. ;)

> Before I start, I want to hear your thoughts about the parsing
> aspect ...
>
>>>> That said, perhaps having an elegant parser (perhaps based on a
>>>> compiler plugin) is incompatible with the idea of making it a
>>>> bug-for-bug drop-in replacement of the old one, and it's something
>>>> we need to think about.
>
> Did you have any suggestions?

The perfect is the enemy of the good... If we see that the current Perl
parser just rewritten in Python really is an improvement, we should
consider it. But as I wrote, there are still issues there, like
performance, that we need to understand. I'll mostly defer to Jon on
this.

But before we plunge on with this, I would like to see at least some
research into reusing existing parsers which I would expect are
plentiful. We may end up deciding regexps are the way to go after all,
but I'd like it to be based on a decision rather than a lack of one. And
we might decide to look at this as a later improvement instead as well.

I've looked at python-clang myself, but it's a huge dependency, and it's
not trivial to cover all the things that the current one does with
that. I'd dismiss that.


BR,
Jani.


-- 
Jani Nikula, Intel Open Source Technology Center


Re: [RFC PATCH v1 2/6] kernel-doc: replace kernel-doc perl parser with a pure python one (WIP)

2017-01-26 Thread Markus Heiser

On 25.01.2017 at 21:59, Jani Nikula wrote:

>> But the problem I see here is, that the perl script generates a
>> reST output which I can't use. As an example we can take a look at
>> the man-page builder I shipped in the series.
> 
> Sorry, I still don't understand *why* you can't use the same rst. Your
> explanation seems to relate to man pages, but man pages come
> *afterwards*, and are a separate improvement. I know you talk about lack
> of proper structure and all that, but *why* can it strictly not be used,
> if the *current* rst clearly can be used?

"afterwards" is the word, that lets me slowly realize, that I have to
stop solving the world's problems with one patch. Now I guess how my
next patch series has to look like. Thanks! ... for being patient with
me.

Before I start, I want to hear your thoughts about the parsing
aspect ...

>>> That said, perhaps having an elegant parser (perhaps based on a compiler
>>> plugin) is incompatible with the idea of making it a bug-for-bug drop-in
>>> replacement of the old one, and it's something we need to think about.

Did you have any suggestions?

-- Markus --



Re: [RFC PATCH v1 2/6] kernel-doc: replace kernel-doc perl parser with a pure python one (WIP)

2017-01-25 Thread Jani Nikula
On Wed, 25 Jan 2017, Markus Heiser  wrote:
> On 25.01.2017 at 11:24, Jani Nikula wrote:
>
>> Markus, thanks for your work on this.
>
> Thanks for your comments!
>
>> Excuse me for my bluntness, but I think changing everything in a single
>> commit, or even a few commits, is strictly not acceptable.
>
> OK, I understand.
>
>> When I changed *small* things in scripts/kernel-doc, I would make
>> htmldocs before and after the change, and recursively diff the produced
>> output to ensure there were no surprises. We already have enough
>> documentation that a manual eyeballing of the output is simply not
>> sufficient to ensure things don't break.
>
>> The diff in output between before and after this series? 160k lines of
>> unified diff without context ('diff -u0 -r old new | wc -l').
>> 
>> Many of the changes are improvements on the result, such as using proper
>>  tags for function parameter lists etc., but clearly changing the
>> output should be independent of changing the parser, so we have some
>> chance of validating the parser.
>
>
> Hmm ... let me try to sort out my thoughts on this:
>
> Both parsers generate reST output. We have tested the reST output,
> so it should be enough to compare the reST from the old one with the
> new one ... at least in theory.
>
> But the problem I see here is, that the perl script generates a
> reST output which I can't use. As an example we can take a look at
> the man-page builder I shipped in the series.

Sorry, I still don't understand *why* you can't use the same rst. Your
explanation seems to relate to man pages, but man pages come
*afterwards*, and are a separate improvement. I know you talk about lack
of proper structure and all that, but *why* can it strictly not be used,
if the *current* rst clearly can be used?

BR,
Jani.


>
>  https://www.mail-archive.com/linux-doc@vger.kernel.org/msg09017.html
>
> In the commit message there is a small example:
>
>   
> ...
> 
>   
> int
> ...
>
> You see that it has a  tag with children ,
> and so on. Builders use this structured markup: they navigate through
> the structured tree, picking up nodes, and spit out a man page, HTML, or
> whatever that builder produces.
>
> ATM the perl parser generates reST output which does not have such
> a structured tree, so the builder has nothing to navigate in.
>
> So, what I mean is, the new parser has to generate a completely different reST
> output, and that's why we can't compare the perl parser with the python one on a
> reST basis ... and if reST is different, HTML is different :(
>
> So we do not have any chance to track regressions when switching from
> the old to the new parser.
>
> Those are my thoughts on this topic; maybe you have a solution for this?
>
>> 
>>>   Ideally at the time of merging, we would be able to build the docs with
>>>   *either* kerneldoc.
>> 
>> I'd be fine with switching over in a single commit that doesn't
>> drastically change the output.
>
> One solution might be to improve the reST output of the perl script
> first, so that it produces something which has a structure and which we
> can all agree on (in short: the reST output is the reference, and ATM the
> reference needs some improvements).
>
> If this is the way we would like to go, I can send a patch for the perl script,
> so that we can commit to a reST reference.
>
>> A drop-in replacement. But that's not the
>> case here.
>> 
>>> - I'll have to try it out to see how noisy it is.  I'm not opposed to
>>>   stricter checks; indeed, they could be a good thing.  But we might want
>>>   to have an option so we can cut back on the noise by default.
>> 
>> The increase in 'make htmldocs' build log was from 1521 to 2791 lines in
>> my tree. Arguably there was useful extra diagnosis, but some of it was
>> the printouts of long lists of definitions that were not found, one per
>> line. So it could be condensed without losing info too.
>
> Yes, this was just a 1:1 merge from my POC; there are a lot of things
> which could be condensed. ATM, it is important for me to get feedback
> on the functionality and concepts of the kernel-doc apps (RFC).
>
>> On to performance. With the default build options the new system was
>> noticeably slower than the current one, with a 50% increase on my
>> machine. But what really caught me by surprise was that passing
>> SPHINXOPTS=-j5 to parallelize worked better on the current system,
>> making the new one a whopping 70% slower. Of course, the argument is
>> that the proposed parser does more and is better, but due to the
>> monolithic change it's impossible to pinpoint the culprit or do a proper
>> cost/benefit analysis on this. Again, this calls for a more broken down
>> series of patches to make the changes.
>
> Oops, I have to look closer ... I thought the py-solution was faster,
> since it does not fork processes and does some caching.
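
A percentage like the ones quoted above can be reproduced with a small timing
helper. This is a Python sketch under assumptions: time_command and
slowdown_percent are hypothetical names, and for a docs build each run would
have to start from a clean output directory so that cached results do not
distort the numbers.

```python
import subprocess
import time


def time_command(cmd, repeat=3):
    """Return the best wall-clock time in seconds over `repeat` runs of cmd."""
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        subprocess.run(cmd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        best = min(best, time.perf_counter() - start)
    return best


def slowdown_percent(old_seconds, new_seconds):
    """Relative slowdown of the new build compared to the old one, in percent."""
    return (new_seconds - old_seconds) / old_seconds * 100.0


# Example with made-up timings: 100 s before, 150 s after.
print(slowdown_percent(100.0, 150.0))  # 50.0
```

Timing each patch of a broken-down series this way would show which change
introduces the cost, which is not possible with one monolithic patch.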
>
>> Finally, while I'd love to see scripts/kernel-doc go, I do have to ask
>> if changing roughly 3k lines of Perl to roughly 3k lines of Python 

Re: [RFC PATCH v1 2/6] kernel-doc: replace kernel-doc perl parser with a pure python one (WIP)

2017-01-25 Thread Markus Heiser

On 25.01.2017 at 11:24, Jani Nikula wrote:

> Markus, thanks for your work on this.

Thanks for your comments!

> Excuse me for my bluntness, but I think changing everything in a single
> commit, or even a few commits, is strictly not acceptable.

OK, I understand.

> When I changed *small* things in scripts/kernel-doc, I would make
> htmldocs before and after the change, and recursively diff the produced
> output to ensure there were no surprises. We already have enough
> documentation that a manual eyeballing of the output is simply not
> sufficient to ensure things don't break.

> The diff in output between before and after this series? 160k lines of
> unified diff without context ('diff -u0 -r old new | wc -l').
> 
> Many of the changes are improvements on the result, such as using proper
>  tags for function parameter lists etc., but clearly changing the
> output should be independent of changing the parser, so we have some
> chance of validating the parser.


Hmm ... let me try to sort out my thoughts on this:

Both parsers generate reST output. We have tested the reST output,
so it should be enough to compare the reST from the old parser with
the reST from the new one ... at least in theory.

But the problem I see here is that the Perl script generates reST
output which I can't use. As an example, take a look at the man-page
builder I shipped in the series.

 https://www.mail-archive.com/linux-doc@vger.kernel.org/msg09017.html

In the commit message there is a small example:

  
...

  
int
...

You see that it has an outer tag with child tags, and so on. Builders use
this structured markup: they navigate through the structured tree, pick
up nodes and spit out a man page, HTML, or whatever that builder produces.

ATM the Perl parser generates reST output which does not have such
a structured tree, so a builder has nothing to navigate.
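
To illustrate what "navigating a structured tree" means here, a minimal
sketch in Python follows. It is only an illustration under assumptions: it
uses the standard library's xml.etree instead of docutils, the sample
document is hand-written and is not output of either parser, and the
function name man_synopsis is made up. The element names (desc,
desc_signature, desc_type, desc_name, desc_content) follow the names of
Sphinx's description nodes.

```python
import xml.etree.ElementTree as ET

# Hand-written stand-in for a doctree dump; NOT output of either parser.
SAMPLE = """
<document>
  <desc desctype="function">
    <desc_signature>
      <desc_type>int</desc_type>
      <desc_name>foo_init</desc_name>
    </desc_signature>
    <desc_content>
      <paragraph>Initialise the foo device.</paragraph>
    </desc_content>
  </desc>
</document>
"""


def man_synopsis(xml_text):
    """Walk the tree as a builder might and return one line per function."""
    root = ET.fromstring(xml_text)
    lines = []
    for desc in root.iter("desc"):
        if desc.get("desctype") != "function":
            continue
        sig = desc.find("desc_signature")
        if sig is None:
            continue
        # Each piece of the signature is its own node, so no re-parsing
        # of free text is needed to find the return type or the name.
        ret_type = sig.findtext("desc_type", default="").strip()
        name = sig.findtext("desc_name", default="").strip()
        summary = desc.findtext("desc_content/paragraph", default="").strip()
        lines.append(f"{ret_type} {name}() - {summary}")
    return lines


print(man_synopsis(SAMPLE))
```

If the same function were described only by a flat paragraph, the loop
would find no desc node and return an empty list, which is the situation
described above for a builder working on unstructured output.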

So, what I mean is: the new parser has to generate completely different
reST output, and that's why we can't compare the Perl parser with the
Python one on a reST basis ... and if the reST is different, the HTML is
different :(

So we do not have any chance to track regressions when switching from
the old to the new parser.

Those are my thoughts on this topic; maybe you have a solution for this?

> 
>>   Ideally at the time of merging, we would be able to build the docs with
>>   *either* kerneldoc.
> 
> I'd be fine with switching over in a single commit that doesn't
> drastically change the output.

One solution might be to improve the reST output of the Perl script
first, so that it produces something which has a structure and which we
can all agree on (in short: the reST output is the reference, and ATM
the reference needs some improvements).

If this is the way we want to go, I can send a patch for the Perl script,
so that we can settle on a reST reference.

> A drop-in replacement. But that's not the
> case here.
> 
>> - I'll have to try it out to see how noisy it is.  I'm not opposed to
>>   stricter checks; indeed, they could be a good thing.  But we might want
>>   to have an option so we can cut back on the noise by default.
> 
> The increase in 'make htmldocs' build log was from 1521 to 2791 lines in
> my tree. Arguably there was useful extra diagnosis, but some of it was
> the printouts of long lists of definitions that were not found, one per
> line. So it could be condensed without losing info too.

Yes, this was just a 1:1 merge from my POC; there are a lot of things
which could be boiled down. ATM, what is important for me is to get
feedback on the functionality and concepts of the kernel-doc apps (RFC).

> On to performance. With the default build options the new system was
> noticeably slower than the current one, with a 50% increase on my
> machine. But what really caught me by surprise was that passing
> SPHINXOPTS=-j5 to parallelize worked better on the current system,
> making the new one a whopping 70% slower. Of course, the argument is
> that the proposed parser does more and is better, but due to the
> monolithic change it's impossible to pinpoint the culprit or do a proper
> cost/benefit analysis on this. Again, this calls for a more broken down
> series of patches to make the changes.

Oops, I have to look closer ... I thought the py-solution would be faster,
since it does not fork processes and does some caching.

> Finally, while I'd love to see scripts/kernel-doc go, I do have to ask
> if changing roughly 3k lines of Perl to roughly 3k lines of Python (*)
> really makes everything better? They both still parse everything using a
> large pile of regular expressions and a clunky state machine. When I
> look at the code, I'm afraid I do not get that liberating feeling of
> throwing out old junk in favor of something small or elegant or even
> obviously more maintainable than the old one. The new one offers more
> features, but repeatedly we face the problem that it's all lumped in
> together with the parser change. We should be able to look at the parser

Re: [RFC PATCH v1 2/6] kernel-doc: replace kernel-doc perl parser with a pure python one (WIP)

2017-01-25 Thread Daniel Vetter
On Wed, Jan 25, 2017 at 12:24:31PM +0200, Jani Nikula wrote:
> Finally, while I'd love to see scripts/kernel-doc go, I do have to ask
> if changing roughly 3k lines of Perl to roughly 3k lines of Python (*)
> really makes everything better? They both still parse everything using a
> large pile of regular expressions and a clunky state machine. When I
> look at the code, I'm afraid I do not get that liberating feeling of
> throwing out old junk in favor of something small or elegant or even
> obviously more maintainable than the old one. The new one offers more
> features, but repeatedly we face the problem that it's all lumped in
> together with the parser change. We should be able to look at the parser
> change and the other improvements separately.

I share this concern a lot. The kernel-doc perl is a horror show, but it's
a horror show that 3-4 people now somewhat understand. Simply translating
the entire script into python leaves us with the same horror show, but in
a different language. And personally I'm not versed at all in either of
them (and I think that applies to many kernel hackers), so seems a wash.

If the new script would implement the state machinery in some
parser-combinator library to make it much easier to maintain, while still
being bug-for-bug compatible, then I'd be much, much more in favour of
doing this. And once we go to that amount of effort, then rewriting it in
python for more consistency with sphinx is definitely a good idea.
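
As a rough illustration of the parser-combinator idea, here is a small
sketch in Python. It is hand-rolled rather than based on any particular
library, it handles only a single comment line of the form "* @name: text"
or "* text", and it makes no attempt at bug-for-bug compatibility with
scripts/kernel-doc. All names in it (regex, seq, alt, fmap, parse_line) are
made up for this example.

```python
import re

# A parser is a function (text, pos) -> (value, new_pos), or None on failure.


def regex(pattern):
    """Parser that matches a regular expression at the current position."""
    compiled = re.compile(pattern)

    def parse(text, pos):
        m = compiled.match(text, pos)
        return (m.group(0), m.end()) if m else None
    return parse


def seq(*parsers):
    """Run parsers one after another; succeed only if all of them succeed."""
    def parse(text, pos):
        values = []
        for p in parsers:
            result = p(text, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return values, pos
    return parse


def alt(*parsers):
    """Try each parser in turn and return the first success."""
    def parse(text, pos):
        for p in parsers:
            result = p(text, pos)
            if result is not None:
                return result
        return None
    return parse


def fmap(func, parser):
    """Transform the value of a successful parse."""
    def parse(text, pos):
        result = parser(text, pos)
        if result is None:
            return None
        value, pos = result
        return func(value), pos
    return parse


# Grammar for one comment line, built from the pieces above. No globals
# are mutated while parsing; every parser returns its result.
star = regex(r"\s*\*[ \t]?")
param = fmap(lambda v: ("param", v[2], v[4]),
             seq(star, regex(r"@"), regex(r"\w+"), regex(r":[ \t]*"),
                 regex(r".*")))
plain = fmap(lambda v: ("text", v[1]), seq(star, regex(r".*")))
line = alt(param, plain)


def parse_line(text):
    """Return a tagged tuple for a comment line, or None if it is not one."""
    result = line(text, 0)
    return result[0] if result else None
```

With these definitions, parse_line(" * @foo: lorem") yields a "param"
tuple, while a line that merely mentions @foo in running text falls through
to the "text" alternative. The grammar is stated in one place instead of
being spread over the branches of a state machine.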

> That said, perhaps having an elegant parser (perhaps based on a compiler
> plugin) is incompatible with the idea of making it a bug-for-bug drop-in
> replacement of the old one, and it's something we need to think about.

Yeah, I fear we'll always need our own parser to avoid breaking the world.
But there's definitely better ways out there to write parsers than
cobbling together regexes in a state machine that uses globals :-)
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [RFC PATCH v1 2/6] kernel-doc: replace kernel-doc perl parser with a pure python one (WIP)

2017-01-25 Thread Jani Nikula
On Wed, 25 Jan 2017, Jonathan Corbet  wrote:
> On Tue, 24 Jan 2017 20:52:40 +0100
> Markus Heiser  wrote:
>
>> This patch is the initial merge of a pure python implementation
>> to parse kernel-doc comments and generate reST from.
>> 
>> It consist mainly of to parts, the parser module (kerneldoc.py) and the
>> sphinx-doc extension (rstKernelDoc.py). For the command line, there is
>> also a 'scripts/kerneldoc' added.::
>> 
>>scripts/kerneldoc --help
>> 
>> The main two parts are merged 1:1 from
>> 
>>   https://github.com/return42/linuxdoc  commit 3991d3c
>> 
>> Take this as a starting point, there is a lot of work to do (WIP).
>> Since it is merged 1:1, you will also notice it's CodingStyle is (ATM)
>> not kernel compliant and it lacks a user doc ('Documentation/doc-guide').
>> 
>> I will send patches for this when the community agreed about
>> functionalities. I guess there are a lot of topics we have to agree
>> about. E.g. the py-implementation is more strict the perl one.  When you
>> build doc with the py-module you will see a lot of additional errors and
>> warnings compared to the sloppy perl one.

Markus, thanks for your work on this.

> Again, quick comments...
>
>  - I would *much* rather evolve our existing Sphinx extension in the
>direction we want it to go than to just replace it wholesale.
>Replacement is the wrong approach for a few reasons, including the need
>to minimize change and preserve credit for Jani's work.  Can we work on
>that basis, please?

I would grossly downplay the role of preserving credit for what I've
done, and put much more emphasis on the need to create a patch series
that gradually, step by step, evolves the current approach into
something better.

Excuse me for my bluntness, but I think changing everything in a single
commit, or even a few commits, is strictly not acceptable.

When I changed *small* things in scripts/kernel-doc, I would make
htmldocs before and after the change, and recursively diff the produced
output to ensure there were no surprises. We already have enough
documentation that a manual eyeballing of the output is simply not
sufficient to ensure things don't break.

The diff in output between before and after this series? 160k lines of
unified diff without context ('diff -u0 -r old new | wc -l').
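
The before/after check described above can also be scripted. The following
is a minimal sketch in Python, assuming the two builds have been copied to
two directories; the function name changed_files and the directory layout
are assumptions for illustration. It lists which files differ rather than
counting diff lines, so it complements the diff command above and does not
reproduce its numbers.

```python
import filecmp
import os


def changed_files(old_dir, new_dir):
    """Return sorted relative paths that differ or exist on one side only."""
    def listing(root):
        found = set()
        for dirpath, _dirs, files in os.walk(root):
            for name in files:
                found.add(os.path.relpath(os.path.join(dirpath, name), root))
        return found

    old, new = listing(old_dir), listing(new_dir)
    # Files present in only one of the two trees always count as changed.
    changed = old ^ new
    for rel in old & new:
        # shallow=False forces a byte-by-byte comparison instead of
        # trusting file size and modification time.
        if not filecmp.cmp(os.path.join(old_dir, rel),
                           os.path.join(new_dir, rel), shallow=False):
            changed.add(rel)
    return sorted(changed)
```

An empty result after a parser-only change would be the "no surprises"
outcome; a non-empty one gives the list of pages to look at.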

Many of the changes are improvements on the result, such as using proper
 tags for function parameter lists etc., but clearly changing the
output should be independent of changing the parser, so we have some
chance of validating the parser.

>Ideally at the time of merging, we would be able to build the docs with
>*either* kerneldoc.

I'd be fine with switching over in a single commit that doesn't
drastically change the output. A drop-in replacement. But that's not the
case here.

>  - I'll have to try it out to see how noisy it is.  I'm not opposed to
>stricter checks; indeed, they could be a good thing.  But we might want
>to have an option so we can cut back on the noise by default.

The increase in 'make htmldocs' build log was from 1521 to 2791 lines in
my tree. Arguably there was useful extra diagnosis, but some of it was
the printouts of long lists of definitions that were not found, one per
line. So it could be condensed without losing info too.
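
A sketch of the kind of condensing meant here, in Python. It assumes the
missing definitions are available as (source file, symbol) pairs; that
input shape, the function name condense_missing and the wording of the
summary line are all assumptions for illustration and do not reflect the
actual messages of either tool.

```python
from collections import defaultdict


def condense_missing(missing):
    """Group (source_file, symbol) pairs into one summary line per file."""
    by_file = defaultdict(list)
    for source_file, symbol in missing:
        by_file[source_file].append(symbol)
    return [
        f"{source_file}: not found ({len(symbols)}): "
        f"{', '.join(sorted(symbols))}"
        for source_file, symbols in sorted(by_file.items())
    ]
```

Every symbol is still reported, so no information is lost; only the number
of log lines drops from one per symbol to one per file.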

On to performance. With the default build options the new system was
noticeably slower than the current one, with a 50% increase on my
machine. But what really caught me by surprise was that passing
SPHINXOPTS=-j5 to parallelize worked better on the current system,
making the new one a whopping 70% slower. Of course, the argument is
that the proposed parser does more and is better, but due to the
monolithic change it's impossible to pinpoint the culprit or do a proper
cost/benefit analysis on this. Again, this calls for a more broken down
series of patches to make the changes.

Finally, while I'd love to see scripts/kernel-doc go, I do have to ask
if changing roughly 3k lines of Perl to roughly 3k lines of Python (*)
really makes everything better? They both still parse everything using a
large pile of regular expressions and a clunky state machine. When I
look at the code, I'm afraid I do not get that liberating feeling of
throwing out old junk in favor of something small or elegant or even
obviously more maintainable than the old one. The new one offers more
features, but repeatedly we face the problem that it's all lumped in
together with the parser change. We should be able to look at the parser
change and the other improvements separately.

That said, perhaps having an elegant parser (perhaps based on a compiler
plugin) is incompatible with the idea of making it a bug-for-bug drop-in
replacement of the old one, and it's something we need to think about.

All in all I think the message should be clear: this needs to be split
into small, incremental changes. Just like we do everything in the
kernel.


BR,
Jani.


(*) Please 

Re: [RFC PATCH v1 2/6] kernel-doc: replace kernel-doc perl parser with a pure python one (WIP)

2017-01-24 Thread Markus Heiser
Hi Jon, hi Daniel !

Am 25.01.2017 um 07:37 schrieb Daniel Vetter :

>> Again, quick comments...
>> 
>> - I would *much* rather evolve our existing Sphinx extension in the
>>   direction we want it to go than to just replace it wholesale.
>>   Replacement is the wrong approach for a few reasons, including the need
>>   to minimize change and preserve credit for Jani's work.  Can we work on
>>   that basis, please?

Sure. But I fear I haven't understood you correctly. Your last post was:

> Markus, would you consider sending out a new patch set for review?  What I
> would like to do see is something adding the new script for the Sphinx
> toolchain, while leaving the DocBook build unchanged, using the old
> script.  We could then delete it once the last template file has moved
> over. 

talking about DocBook and now I read ...

>>   Ideally at the time of merging, we would be able to build the docs with
>>   *either* kerneldoc.

Now I'm totally confused ... it's not about you, but I do not understand
you clearly ... can you help a conceptual man?

> Seconded, I think renaming the extension string like this is just fairly
> pointless busy-work.

Hi Daniel, please help me: what did you mean by "renaming" the extension
string and "busy-work"?

The module is renamed, but there should be no work outside this
patch ...

> Kernel-doc isn't interacting perfectly with rst, but
> now we already have a sizeable amount of stuff converted and going through
> all that once more needs imo som really clear benefits.

From an author's POV nothing has changed.

> I think bug-for-bug compatibility would be much better. Later on we could do
> changes, on a change-by-change basis.

Both sphinx-extensions (the one we have and the one in the series) are
adapters to a "parser backend".

1. Documentation/sphinx/kerneldoc.py    <--> scripts/kerneldoc -rst
2. Documentation/sphinx/rstKernelDoc.py <--> import module Documentation/sphinx/kernel_doc.py

Maintaining two adapters for the two backends is possible. But one adapter
for two completely different backends ... is this what you mean?

>> - I'll have to try it out to see how noisy it is.  I'm not opposed to
>>   stricter checks; indeed, they could be a good thing.  But we might want
>>   to have an option so we can cut back on the noise by default.

As I said, I'm willing to go the community's way; it just seems to be a
communication problem (on my side) to understand what this way would be.

Let me try to sum up my guess ... e.g. to build the output as usual with (1.)

  $ make htmldocs

to build with the py-parser and its sphinx-extension (see 2. above)::

  $ USE_PY_PARSER=1 make htmldocs

this should be easy and I can realize it in v2, but is this what you want?
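
One way such a switch might be wired up is sketched below in Python. It is
a guess at the mechanism, not part of the posted series: the helper name
kerneldoc_extensions is made up, and the extension names are assumed to be
the module names of the two adapters listed above, with
Documentation/sphinx on the import path.

```python
import os


def kerneldoc_extensions(environ=None):
    """Pick the kernel-doc Sphinx extension based on USE_PY_PARSER."""
    env = os.environ if environ is None else environ
    if env.get("USE_PY_PARSER", "0") == "1":
        # Adapter (2.): imports the Python parser module directly.
        return ["rstKernelDoc"]
    # Adapter (1.): runs the existing script as the parser backend.
    return ["kerneldoc"]


# In a Sphinx conf.py this might read (assumption):
# extensions = kerneldoc_extensions()
```

Because the Makefile passes the environment through to sphinx-build, the
default build would stay on the existing backend and the new one would be
strictly opt-in.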

Please give me some more hints / Thanks a lot!

--Markus--









Re: [RFC PATCH v1 2/6] kernel-doc: replace kernel-doc perl parser with a pure python one (WIP)

2017-01-24 Thread Daniel Vetter
On Tue, Jan 24, 2017 at 05:13:14PM -0700, Jonathan Corbet wrote:
> On Tue, 24 Jan 2017 20:52:40 +0100
> Markus Heiser  wrote:
> 
> > This patch is the initial merge of a pure python implementation
> > to parse kernel-doc comments and generate reST from.
> > 
> > It consist mainly of to parts, the parser module (kerneldoc.py) and the
> > sphinx-doc extension (rstKernelDoc.py). For the command line, there is
> > also a 'scripts/kerneldoc' added.::
> > 
> >scripts/kerneldoc --help
> > 
> > The main two parts are merged 1:1 from
> > 
> >   https://github.com/return42/linuxdoc  commit 3991d3c
> > 
> > Take this as a starting point, there is a lot of work to do (WIP).
> > Since it is merged 1:1, you will also notice it's CodingStyle is (ATM)
> > not kernel compliant and it lacks a user doc ('Documentation/doc-guide').
> > 
> > I will send patches for this when the community agreed about
> > functionalities. I guess there are a lot of topics we have to agree
> > about. E.g. the py-implementation is more strict the perl one.  When you
> > build doc with the py-module you will see a lot of additional errors and
> > warnings compared to the sloppy perl one.
> 
> Again, quick comments...
> 
>  - I would *much* rather evolve our existing Sphinx extension in the
>direction we want it to go than to just replace it wholesale.
>Replacement is the wrong approach for a few reasons, including the need
>to minimize change and preserve credit for Jani's work.  Can we work on
>that basis, please?
> 
>Ideally at the time of merging, we would be able to build the docs with
>*either* kerneldoc.

Seconded, I think renaming the extension string like this is just fairly
pointless busy-work. Kernel-doc isn't interacting perfectly with rst, but
now we already have a sizeable amount of stuff converted and going through
all that once more needs imo some really clear benefits. I think
bug-for-bug compatibility would be much better. Later on we could do
changes, on a change-by-change basis.
-Daniel


>  - I'll have to try it out to see how noisy it is.  I'm not opposed to
>stricter checks; indeed, they could be a good thing.  But we might want
>to have an option so we can cut back on the noise by default.


> 
> Thanks,
> 
> jon

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [RFC PATCH v1 2/6] kernel-doc: replace kernel-doc perl parser with a pure python one (WIP)

2017-01-24 Thread Jonathan Corbet
On Tue, 24 Jan 2017 20:52:40 +0100
Markus Heiser  wrote:

> This patch is the initial merge of a pure python implementation
> to parse kernel-doc comments and generate reST from them.
> 
> It consists mainly of two parts, the parser module (kerneldoc.py) and the
> sphinx-doc extension (rstKernelDoc.py). For the command line, there is
> also a 'scripts/kerneldoc' added.::
> 
>    scripts/kerneldoc --help
> 
> The main two parts are merged 1:1 from
> 
>   https://github.com/return42/linuxdoc  commit 3991d3c
> 
> Take this as a starting point, there is a lot of work to do (WIP).
> Since it is merged 1:1, you will also notice its CodingStyle is (ATM)
> not kernel compliant and it lacks a user doc ('Documentation/doc-guide').
> 
> I will send patches for this once the community has agreed on the
> functionality. I guess there are a lot of topics we have to agree
> about. E.g. the py-implementation is stricter than the perl one.  When you
> build doc with the py-module you will see a lot of additional errors and
> warnings compared to the sloppy perl one.

Again, quick comments...

 - I would *much* rather evolve our existing Sphinx extension in the
   direction we want it to go than to just replace it wholesale.
   Replacement is the wrong approach for a few reasons, including the need
   to minimize change and preserve credit for Jani's work.  Can we work on
   that basis, please?

   Ideally at the time of merging, we would be able to build the docs with
   *either* kerneldoc.

 - I'll have to try it out to see how noisy it is.  I'm not opposed to
   stricter checks; indeed, they could be a good thing.  But we might want
   to have an option so we can cut back on the noise by default.

Thanks,

jon

