Re: mandoc style warning about text lines > 80 bytes

2021-06-27 Thread Ingo Schwarze
Hi Jason,

Jason McIntyre wrote on Sat, Jun 26, 2021 at 06:55:44PM +0100:

> i count mkhybrid as 3rd party. the man page is still old style, and has
> only 2 commits in 20 years. if it is ours, it needs considerable work.

Fair enough, you are probably right about that.

> i have no strong opinions. it actually doesn;t really touch anything i
> do - i never check pages for long lines, though i do shorten them if i'm
> doing other stuff on a page. i guess the question is more if others
> would find it handy to have it flagged, or disruptive.

It appears some consider it potentially useful.

> on balance, i'd be fine with such an addition.

Thanks to all three of you for checking, it is now committed.

Yours,
  Ingo



Re: mandoc style warning about text lines > 80 bytes

2021-06-27 Thread Klemens Nanni
On Sat, Jun 26, 2021 at 08:06:58PM +0200, Theo Buehler wrote:
> On Sat, Jun 26, 2021 at 07:20:52PM +0200, Ingo Schwarze wrote:
> > Hi Jason and Theo,
> > 
> > Jason McIntyre wrote on Tue, Jun 22, 2021 at 06:37:27AM +0100:
> > > On Tue, Jun 22, 2021 at 04:48:39AM +0200, Theo Buehler wrote:
> > 
> > >> You have two overlong lines as indicated below. I would have thought
> > >> that mandoc -Tlint complains about that, but apparently it doesn't have
> > >> such a warning... With those wrapped,
> >  
> > > yes, there is no feedback on long lines. although we try to keep the
> > > source less than 80 width, there are some places where it is not
> > > possible.
> > > 
> > > i'm not sure whether adding a warning would be helpful or disruptive.
> > 
> > Here is a patch implementing such a style warning, leaning very
> > heavily into the direction of never producing false positives, that
> > is, not warning about long lines
> > 
> >  - in no-fill mode (e.g., .Bd -literal, .EX, .nf and the like)
> >  - that start with a dot (normal macro and request lines)
> >or with a non-standard control character
> >  - that start with a space character or with an escape sequence
> >  - in tbl(7) context
> >  - in eqn(7) context
> >  - that do not contain a blank character before column 80
> 
> This sounds perfectly reasonable. While I don't think I'm likely to
> introduce overlong lines myself, I believe it's a useful warning for
> development. Since you've already done the work, I am in favor of
> committing this. It's not expensive at all in terms of runtime and code
> complexity. I agree that it won't be disruptive.
> 
> The code reads fine, thanks!

Agreed.

OK kn



Re: mandoc style warning about text lines > 80 bytes

2021-06-26 Thread Theo Buehler
On Sat, Jun 26, 2021 at 07:20:52PM +0200, Ingo Schwarze wrote:
> Hi Jason and Theo,
> 
> Jason McIntyre wrote on Tue, Jun 22, 2021 at 06:37:27AM +0100:
> > On Tue, Jun 22, 2021 at 04:48:39AM +0200, Theo Buehler wrote:
> 
> >> You have two overlong lines as indicated below. I would have thought
> >> that mandoc -Tlint complains about that, but apparently it doesn't have
> >> such a warning... With those wrapped,
>  
> > yes, there is no feedback on long lines. although we try to keep the
> > source less than 80 width, there are some places where it is not
> > possible.
> > 
> > i'm not sure whether adding a warning would be helpful or disruptive.
> 
> Here is a patch implementing such a style warning, leaning very
> heavily into the direction of never producing false positives, that
> is, not warning about long lines
> 
>  - in no-fill mode (e.g., .Bd -literal, .EX, .nf and the like)
>  - that start with a dot (normal macro and request lines)
>or with a non-standard control character
>  - that start with a space character or with an escape sequence
>  - in tbl(7) context
>  - in eqn(7) context
>  - that do not contain a blank character before column 80

This sounds perfectly reasonable. While I don't think I'm likely to
introduce overlong lines myself, I believe it's a useful warning for
development. Since you've already done the work, I am in favor of
committing this. It's not expensive at all in terms of runtime and code
complexity. I agree that it won't be disruptive.

The code reads fine, thanks!



Re: mandoc style warning about text lines > 80 bytes

2021-06-26 Thread Jason McIntyre
On Sat, Jun 26, 2021 at 07:20:52PM +0200, Ingo Schwarze wrote:
> Hi Jason and Theo,
> 
> Jason McIntyre wrote on Tue, Jun 22, 2021 at 06:37:27AM +0100:
> > On Tue, Jun 22, 2021 at 04:48:39AM +0200, Theo Buehler wrote:
> 
> >> You have two overlong lines as indicated below. I would have thought
> >> that mandoc -Tlint complains about that, but apparently it doesn't have
> >> such a warning... With those wrapped,
>  
> > yes, there is no feedback on long lines. although we try to keep the
> > source less than 80 width, there are some places where it is not
> > possible.
> > 
> > i'm not sure whether adding a warning would be helpful or disruptive.
> 
> Here is a patch implementing such a style warning, leaning very
> heavily into the direction of never producing false positives, that
> is, not warning about long lines
> 
>  - in no-fill mode (e.g., .Bd -literal, .EX, .nf and the like)
>  - that start with a dot (normal macro and request lines)
>or with a non-standard control character
>  - that start with a space character or with an escape sequence
>  - in tbl(7) context
>  - in eqn(7) context
>  - that do not contain a blank character before column 80
> 
> So, this certainly does not find all long lines that can be improved,
> but everything it finds ought to be trivial and worthwhile to fix.
> 
> There are less than twenty-five offenders below /usr/src/share/man/,
> and none of those are false positives.
> 
> However, i see over twenty-three thousand offending lines
> below /usr/share/man/, the vast majority in Perl manual pages
> and considerable amounts in other third-party stuff like GCC
> and binutils.
> The worst offenders we maintain ourselves are tmux(1) with 15
> offending lines, terminfo(5) with 11, mkhybrid(8) with 6, tic(1),
> magic(5), sysctl(2), cdio(1), and the rest with three or less
> offending lines each, maybe a few dozen offending files all told.
> 

i count mkhybrid as 3rd party. the man page is still old style, and has
only 2 commits in 20 years. if it is ours, it needs considerable work.

> I don't think such a patch would be disruptive.  I expect most of the
> third-party manuals it flags already have many other warnings.
> Our own stuff does not have large amounts of issues due to Jason's
> diligent manual work.
> 
> Does it help?  Maybe it could slightly reduce one aspect of Jason's
> workload and help developers who want to find their own glitches in
> this respect, in particular those usually working on terminals wider
> than 80 columns.
> 
> There is no need to chase all of these down, but the style warning
> might help when editing a page for other reasons.
> 
> What do you think?
>   Ingo

i have no strong opinions. it actually doesn;t really touch anything i
do - i never check pages for long lines, though i do shorten them if i'm
doing other stuff on a page. i guess the question is more if others
would find it handy to have it flagged, or disruptive.

on balance, i'd be fine with such an addition.

jmc

> 
> 
> P.S.
> The following script makes it easy to count violations:
> 
> mandoc -T lint */*.[1-9]* */*/*.[1-9]* | \
>   grep 'longer than' | \
>   cut -d : -f 2 | \
>   uniq -c | \
>   sort -nr
> 
> 
> Index: libmandoc.h
> ===
> RCS file: /cvs/src/usr.bin/mandoc/libmandoc.h,v
> retrieving revision 1.64
> diff -u -p -r1.64 libmandoc.h
> --- libmandoc.h   3 Apr 2020 11:34:19 -   1.64
> +++ libmandoc.h   26 Jun 2021 16:19:03 -
> @@ -73,7 +73,7 @@ void roff_reset(struct roff *);
>  void  roff_man_free(struct roff_man *);
>  struct roff_man  *roff_man_alloc(struct roff *, const char *, int);
>  void  roff_man_reset(struct roff_man *);
> -int   roff_parseln(struct roff *, int, struct buf *, int *);
> +int   roff_parseln(struct roff *, int, struct buf *, int *, size_t);
>  void  roff_userret(struct roff *);
>  void  roff_endparse(struct roff *);
>  void  roff_setreg(struct roff *, const char *, int, char);
> Index: mandoc.1
> ===
> RCS file: /cvs/src/usr.bin/mandoc/mandoc.1,v
> retrieving revision 1.175
> diff -u -p -r1.175 mandoc.1
> --- mandoc.1  2 Jun 2021 18:27:36 -   1.175
> +++ mandoc.1  26 Jun 2021 16:19:04 -
> @@ -1066,6 +1066,9 @@ An
>  request occurs even though the document already switched to no-fill mode
>  and did not switch back to fill mode yet.
>  It has no effect.
> +.It Sy "input text line longer than 80 bytes"
> +Consider breaking the input text line
> +at one of the blank characters before column 80.
>  .It Sy "verbatim \(dq--\(dq, maybe consider using \e(em"
>  .Pq mdoc
>  Even though the ASCII output device renders an em-dash as
> Index: mandoc.h
> ===
> RCS file: /cvs/src/usr.bin/mandoc/mandoc.h,v
> retrieving revision 1.212
> diff -u -p -r1.212 mandoc.h
> -

mandoc style warning about text lines > 80 bytes

2021-06-26 Thread Ingo Schwarze
Hi Jason and Theo,

Jason McIntyre wrote on Tue, Jun 22, 2021 at 06:37:27AM +0100:
> On Tue, Jun 22, 2021 at 04:48:39AM +0200, Theo Buehler wrote:

>> You have two overlong lines as indicated below. I would have thought
>> that mandoc -Tlint complains about that, but apparently it doesn't have
>> such a warning... With those wrapped,
 
> yes, there is no feedback on long lines. although we try to keep the
> source less than 80 width, there are some places where it is not
> possible.
> 
> i'm not sure whether adding a warning would be helpful or disruptive.

Here is a patch implementing such a style warning, leaning very
heavily into the direction of never producing false positives, that
is, not warning about long lines

 - in no-fill mode (e.g., .Bd -literal, .EX, .nf and the like)
 - that start with a dot (normal macro and request lines)
   or with a non-standard control character
 - that start with a space character or with an escape sequence
 - in tbl(7) context
 - in eqn(7) context
 - that do not contain a blank character before column 80

So, this certainly does not find all long lines that can be improved,
but everything it finds ought to be trivial and worthwhile to fix.

There are less than twenty-five offenders below /usr/src/share/man/,
and none of those are false positives.

However, i see over twenty-three thousand offending lines
below /usr/share/man/, the vast majority in Perl manual pages
and considerable amounts in other third-party stuff like GCC
and binutils.
The worst offenders we maintain ourselves are tmux(1) with 15
offending lines, terminfo(5) with 11, mkhybrid(8) with 6, tic(1),
magic(5), sysctl(2), cdio(1), and the rest with three or less
offending lines each, maybe a few dozen offending files all told.

I don't think such a patch would be disruptive.  I expect most of the
third-party manuals it flags already have many other warnings.
Our own stuff does not have large amounts of issues due to Jason's
diligent manual work.

Does it help?  Maybe it could slightly reduce one aspect of Jason's
workload and help developers who want to find their own glitches in
this respect, in particular those usually working on terminals wider
than 80 columns.

There is no need to chase all of these down, but the style warning
might help when editing a page for other reasons.

What do you think?
  Ingo


P.S.
The following script makes it easy to count violations:

mandoc -T lint */*.[1-9]* */*/*.[1-9]* | \
  grep 'longer than' | \
  cut -d : -f 2 | \
  uniq -c | \
  sort -nr


Index: libmandoc.h
===
RCS file: /cvs/src/usr.bin/mandoc/libmandoc.h,v
retrieving revision 1.64
diff -u -p -r1.64 libmandoc.h
--- libmandoc.h 3 Apr 2020 11:34:19 -   1.64
+++ libmandoc.h 26 Jun 2021 16:19:03 -
@@ -73,7 +73,7 @@ void   roff_reset(struct roff *);
 voidroff_man_free(struct roff_man *);
 struct roff_man*roff_man_alloc(struct roff *, const char *, int);
 voidroff_man_reset(struct roff_man *);
-int roff_parseln(struct roff *, int, struct buf *, int *);
+int roff_parseln(struct roff *, int, struct buf *, int *, size_t);
 voidroff_userret(struct roff *);
 voidroff_endparse(struct roff *);
 voidroff_setreg(struct roff *, const char *, int, char);
Index: mandoc.1
===
RCS file: /cvs/src/usr.bin/mandoc/mandoc.1,v
retrieving revision 1.175
diff -u -p -r1.175 mandoc.1
--- mandoc.12 Jun 2021 18:27:36 -   1.175
+++ mandoc.126 Jun 2021 16:19:04 -
@@ -1066,6 +1066,9 @@ An
 request occurs even though the document already switched to no-fill mode
 and did not switch back to fill mode yet.
 It has no effect.
+.It Sy "input text line longer than 80 bytes"
+Consider breaking the input text line
+at one of the blank characters before column 80.
 .It Sy "verbatim \(dq--\(dq, maybe consider using \e(em"
 .Pq mdoc
 Even though the ASCII output device renders an em-dash as
Index: mandoc.h
===
RCS file: /cvs/src/usr.bin/mandoc/mandoc.h,v
retrieving revision 1.212
diff -u -p -r1.212 mandoc.h
--- mandoc.h2 Jun 2021 18:27:37 -   1.212
+++ mandoc.h26 Jun 2021 16:19:04 -
@@ -72,6 +72,7 @@ enum  mandocerr {
MANDOCERR_DELIM_NB, /* no blank before trailing delimiter: macro ... */
MANDOCERR_FI_SKIP, /* fill mode already enabled, skipping: fi */
MANDOCERR_NF_SKIP, /* fill mode already disabled, skipping: nf */
+   MANDOCERR_TEXT_LONG, /* input text line longer than 80 bytes */
MANDOCERR_DASHDASH, /* verbatim "--", maybe consider using \(em */
MANDOCERR_FUNC, /* function name without markup: name() */
MANDOCERR_SPACE_EOL, /* whitespace at end of input line */
Index: mandoc_msg.c
===
RCS file: /cvs/s