On Sat, Jun 26, 2021 at 07:20:52PM +0200, Ingo Schwarze wrote:
> Hi Jason and Theo,
> 
> Jason McIntyre wrote on Tue, Jun 22, 2021 at 06:37:27AM +0100:
> > On Tue, Jun 22, 2021 at 04:48:39AM +0200, Theo Buehler wrote:
> 
> >> You have two overlong lines as indicated below. I would have thought
> >> that mandoc -Tlint complains about that, but apparently it doesn't have
> >> such a warning... With those wrapped,
>  
> > yes, there is no feedback on long lines. although we try to keep the
> > source less than 80 width, there are some places where it is not
> > possible.
> > 
> > i'm not sure whether adding a warning would be helpful or disruptive.
> 
> Here is a patch implementing such a style warning, leaning very
> heavily into the direction of never producing false positives, that
> is, not warning about long lines
> 
>  - in no-fill mode (e.g., .Bd -literal, .EX, .nf and the like)
>  - that start with a dot (normal macro and request lines)
>    or with a non-standard control character
>  - that start with a space character or with an escape sequence
>  - in tbl(7) context
>  - in eqn(7) context
>  - that do not contain a blank character before column 80
> 
> So, this certainly does not find all long lines that can be improved,
> but everything it finds ought to be trivial and worthwhile to fix.
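
for anyone who wants the exemption list above in one place, here is a
minimal standalone sketch of the check. it is not the patch itself:
text_line_too_long(), LONG_LINE_LIMIT and the nofill/in_tbl/in_eqn
flags are hypothetical stand-ins for state that the real code reads
from struct roff, and len is assumed to be the line length in bytes
without the trailing newline, which may be off by one from what
read.c actually passes.

```c
#include <stddef.h>
#include <string.h>

#define LONG_LINE_LIMIT 80	/* hypothetical name; the patch hardcodes 80 */

/*
 * Return 1 if a "line too long" style warning would be warranted
 * for this input line, 0 otherwise.  The three flags and the control
 * character are assumptions standing in for parser state.
 */
int
text_line_too_long(const char *line, size_t len, int nofill,
    int in_tbl, int in_eqn, char control)
{
	/* Short enough: nothing to report. */
	if (len <= LONG_LINE_LIMIT)
		return 0;

	/* No-fill mode, tbl(7) and eqn(7) context are exempt. */
	if (nofill || in_tbl || in_eqn)
		return 0;

	/* Exempt lines starting with a space, a dot, or an escape. */
	if (line[0] == ' ' || line[0] == '.' || line[0] == '\\')
		return 0;

	/* Exempt lines starting with a non-standard control character. */
	if (line[0] == control)
		return 0;

	/* Without a blank before column 80, the line cannot be wrapped. */
	if (strcspn(line, " ") >= LONG_LINE_LIMIT)
		return 0;

	return 1;
}
```

the order of the tests does not matter for the result; the cheap
length test comes first so that ordinary short lines return early.
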
> 
> There are fewer than twenty-five offenders below /usr/src/share/man/,
> and none of those are false positives.
> 
> However, i see over twenty-three thousand offending lines
> below /usr/share/man/, the vast majority in Perl manual pages
> and considerable amounts in other third-party stuff like GCC
> and binutils.
> The worst offenders we maintain ourselves are tmux(1) with 15
> offending lines, terminfo(5) with 11, mkhybrid(8) with 6, tic(1),
> magic(5), sysctl(2), cdio(1), and the rest with three or fewer
> offending lines each, maybe a few dozen offending files all told.
> 

i count mkhybrid as 3rd party. the man page is still old style, and has
only 2 commits in 20 years. if it is ours, it needs considerable work.

> I don't think such a patch would be disruptive.  I expect most of the
> third-party manuals it flags already have many other warnings.
> Our own stuff does not have large amounts of issues due to Jason's
> diligent manual work.
> 
> Does it help?  Maybe it could slightly reduce one aspect of Jason's
> workload and help developers who want to find their own glitches in
> this respect, in particular those usually working on terminals wider
> than 80 columns.
> 
> There is no need to chase all of these down, but the style warning
> might help when editing a page for other reasons.
> 
> What do you think?
>   Ingo

i have no strong opinions. it actually doesn't really touch anything i
do - i never check pages for long lines, though i do shorten them if i'm
doing other stuff on a page. i guess the question is more if others
would find it handy to have it flagged, or disruptive.

on balance, i'd be fine with such an addition.

jmc

> 
> 
> P.S.
> The following script makes it easy to count violations:
> 
> mandoc -T lint */*.[1-9]* */*/*.[1-9]* | \
>   grep 'longer than' | \
>   cut -d : -f 2 | \
>   uniq -c | \
>   sort -nr
> 
> 
> Index: libmandoc.h
> ===================================================================
> RCS file: /cvs/src/usr.bin/mandoc/libmandoc.h,v
> retrieving revision 1.64
> diff -u -p -r1.64 libmandoc.h
> --- libmandoc.h       3 Apr 2020 11:34:19 -0000       1.64
> +++ libmandoc.h       26 Jun 2021 16:19:03 -0000
> @@ -73,7 +73,7 @@ void                 roff_reset(struct roff *);
>  void          roff_man_free(struct roff_man *);
>  struct roff_man      *roff_man_alloc(struct roff *, const char *, int);
>  void          roff_man_reset(struct roff_man *);
> -int           roff_parseln(struct roff *, int, struct buf *, int *);
> +int           roff_parseln(struct roff *, int, struct buf *, int *, size_t);
>  void          roff_userret(struct roff *);
>  void          roff_endparse(struct roff *);
>  void          roff_setreg(struct roff *, const char *, int, char);
> Index: mandoc.1
> ===================================================================
> RCS file: /cvs/src/usr.bin/mandoc/mandoc.1,v
> retrieving revision 1.175
> diff -u -p -r1.175 mandoc.1
> --- mandoc.1  2 Jun 2021 18:27:36 -0000       1.175
> +++ mandoc.1  26 Jun 2021 16:19:04 -0000
> @@ -1066,6 +1066,9 @@ An
>  request occurs even though the document already switched to no-fill mode
>  and did not switch back to fill mode yet.
>  It has no effect.
> +.It Sy "input text line longer than 80 bytes"
> +Consider breaking the input text line
> +at one of the blank characters before column 80.
>  .It Sy "verbatim \(dq--\(dq, maybe consider using \e(em"
>  .Pq mdoc
>  Even though the ASCII output device renders an em-dash as
> Index: mandoc.h
> ===================================================================
> RCS file: /cvs/src/usr.bin/mandoc/mandoc.h,v
> retrieving revision 1.212
> diff -u -p -r1.212 mandoc.h
> --- mandoc.h  2 Jun 2021 18:27:37 -0000       1.212
> +++ mandoc.h  26 Jun 2021 16:19:04 -0000
> @@ -72,6 +72,7 @@ enum        mandocerr {
>       MANDOCERR_DELIM_NB, /* no blank before trailing delimiter: macro ... */
>       MANDOCERR_FI_SKIP, /* fill mode already enabled, skipping: fi */
>       MANDOCERR_NF_SKIP, /* fill mode already disabled, skipping: nf */
> +     MANDOCERR_TEXT_LONG, /* input text line longer than 80 bytes */
>       MANDOCERR_DASHDASH, /* verbatim "--", maybe consider using \(em */
>       MANDOCERR_FUNC, /* function name without markup: name() */
>       MANDOCERR_SPACE_EOL, /* whitespace at end of input line */
> Index: mandoc_msg.c
> ===================================================================
> RCS file: /cvs/src/usr.bin/mandoc/mandoc_msg.c,v
> retrieving revision 1.11
> diff -u -p -r1.11 mandoc_msg.c
> --- mandoc_msg.c      2 Jun 2021 18:27:37 -0000       1.11
> +++ mandoc_msg.c      26 Jun 2021 16:19:04 -0000
> @@ -71,6 +71,7 @@ static      const char *const type_message[MA
>       "no blank before trailing delimiter",
>       "fill mode already enabled, skipping",
>       "fill mode already disabled, skipping",
> +     "input text line longer than 80 bytes",
>       "verbatim \"--\", maybe consider using \\(em",
>       "function name without markup",
>       "whitespace at end of input line",
> Index: read.c
> ===================================================================
> RCS file: /cvs/src/usr.bin/mandoc/read.c,v
> retrieving revision 1.190
> diff -u -p -r1.190 read.c
> --- read.c    24 Apr 2020 11:58:02 -0000      1.190
> +++ read.c    26 Jun 2021 16:19:04 -0000
> @@ -152,6 +152,7 @@ mparse_buf_r(struct mparse *curp, struct
>       struct buf      *firstln, *lastln, *thisln, *loop;
>       char            *cp;
>       size_t           pos; /* byte number in the ln buffer */
> +     size_t           spos; /* at the start of the current line parse */
>       int              line_result, result;
>       int              of;
>       int              lnn; /* line number in the real file */
> @@ -178,6 +179,7 @@ mparse_buf_r(struct mparse *curp, struct
>                           curp->filenc & MPARSE_LATIN1)
>                               curp->filenc = preconv_cue(&blk, i);
>               }
> +             spos = pos;
>  
>               while (i < blk.sz && (start || blk.buf[i] != '\0')) {
>  
> @@ -277,7 +279,8 @@ mparse_buf_r(struct mparse *curp, struct
>  
>               of = 0;
>  rerun:
> -             line_result = roff_parseln(curp->roff, curp->line, &ln, &of);
> +             line_result = roff_parseln(curp->roff, curp->line,
> +                 &ln, &of, start && spos == 0 ? pos : 0);
>  
>               /* Process options. */
>  
> Index: roff.c
> ===================================================================
> RCS file: /cvs/src/usr.bin/mandoc/roff.c,v
> retrieving revision 1.248
> diff -u -p -r1.248 roff.c
> --- roff.c    27 Aug 2020 12:58:00 -0000      1.248
> +++ roff.c    26 Jun 2021 16:19:04 -0000
> @@ -1821,7 +1821,7 @@ roff_parsetext(struct roff *r, struct bu
>  }
>  
>  int
> -roff_parseln(struct roff *r, int ln, struct buf *buf, int *offs)
> +roff_parseln(struct roff *r, int ln, struct buf *buf, int *offs, size_t len)
>  {
>       enum roff_tok    t;
>       int              e;
> @@ -1831,6 +1831,14 @@ roff_parseln(struct roff *r, int ln, str
>       int              ctl;   /* macro line (boolean) */
>  
>       ppos = pos = *offs;
> +
> +     if (len > 80 && r->tbl == NULL && r->eqn == NULL &&
> +         (r->man->flags & ROFF_NOFILL) == 0 &&
> +         strchr(" .\\", buf->buf[pos]) == NULL &&
> +         buf->buf[pos] != r->control &&
> +         strcspn(buf->buf, " ") < 80)
> +             mandoc_msg(MANDOCERR_TEXT_LONG, ln, (int)len - 1,
> +                 "%.20s...", buf->buf + pos);
>  
>       /* Handle in-line equation delimiters. */
>  
