On Sat, Jun 26, 2021 at 07:20:52PM +0200, Ingo Schwarze wrote:
> Hi Jason and Theo,
>
> Jason McIntyre wrote on Tue, Jun 22, 2021 at 06:37:27AM +0100:
> > On Tue, Jun 22, 2021 at 04:48:39AM +0200, Theo Buehler wrote:
>
> >> You have two overlong lines as indicated below. I would have thought
> >> that mandoc -Tlint complains about that, but apparently it doesn't have
> >> such a warning... With those wrapped,
>
> > yes, there is no feedback on long lines. although we try to keep the
> > source less than 80 width, there are some places where it is not
> > possible.
> >
> > i'm not sure whether adding a warning would be helpful or disruptive.
>
> Here is a patch implementing such a style warning, leaning very
> heavily into the direction of never producing false positives, that
> is, not warning about long lines
>
> - in no-fill mode (e.g., .Bd -literal, .EX, .nf and the like)
> - that start with a dot (normal macro and request lines)
>   or with a non-standard control character
> - that start with a space character or with an escape sequence
> - in tbl(7) context
> - in eqn(7) context
> - that do not contain a blank character before column 80
>
> So, this certainly does not find all long lines that can be improved,
> but everything it finds ought to be trivial and worthwhile to fix.
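
for anyone reading along without the diff open, the exclusion rules
above boil down to something like the following. this is only a
standalone sketch, not the patch itself: struct line_ctx and
long_line_warn() are made-up names standing in for the parser state
that roff_parseln() consults, and the line length is simply taken
with strlen(), which is an assumption on my part.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the parser state the patch looks at. */
struct line_ctx {
	int	nofill;		/* nonzero while in no-fill mode */
	int	in_tbl;		/* nonzero inside tbl(7) input */
	int	in_eqn;		/* nonzero inside eqn(7) input */
	char	control;	/* current control character, normally '.' */
};

/* Return 1 if the input line would be flagged as too long, else 0. */
static int
long_line_warn(const struct line_ctx *ctx, const char *line)
{
	/* Only lines longer than 80 bytes are candidates. */
	if (strlen(line) <= 80)
		return 0;

	/* Never warn in no-fill mode or inside tables and equations. */
	if (ctx->nofill || ctx->in_tbl || ctx->in_eqn)
		return 0;

	/* Skip lines starting with a blank, a dot, or an escape. */
	if (line[0] == ' ' || line[0] == '.' || line[0] == '\\')
		return 0;

	/* Skip lines starting with a non-standard control character. */
	if (line[0] == ctx->control)
		return 0;

	/* Without a blank in the first 80 bytes there is no place to break. */
	if (strcspn(line, " ") >= 80)
		return 0;

	return 1;
}
```

so a long text line only gets flagged when every one of the
exclusions fails, which is why the check should rarely, if ever,
produce a false positive.
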
>
> There are less than twenty-five offenders below /usr/src/share/man/,
> and none of those are false positives.
>
> However, i see over twenty-three thousand offending lines
> below /usr/share/man/, the vast majority in Perl manual pages
> and considerable amounts in other third-party stuff like GCC
> and binutils.
> The worst offenders we maintain ourselves are tmux(1) with 15
> offending lines, terminfo(5) with 11, mkhybrid(8) with 6, tic(1),
> magic(5), sysctl(2), cdio(1), and the rest with three or less
> offending lines each, maybe a few dozen offending files all told.
>

i count mkhybrid as 3rd party. the man page is still old style, and has
only 2 commits in 20 years. if it is ours, it needs considerable work.

> I don't think such a patch would be disruptive. I expect most of the
> third-party manuals it flags already have many other warnings.
> Our own stuff does not have large amounts of issues due to Jason's
> diligent manual work.
>
> Does it help? Maybe it could slightly reduce one aspect of Jason's
> workload and help developers who want to find their own glitches in
> this respect, in particular those usually working on terminals wider
> than 80 columns.
>
> There is no need to chase all of these down, but the style warning
> might help when editing a page for other reasons.
>
> What do you think?
> Ingo

i have no strong opinions. it actually doesn't really touch anything i
do - i never check pages for long lines, though i do shorten them if i'm
doing other stuff on a page. i guess the question is more if others
would find it handy to have it flagged, or disruptive.

on balance, i'd be fine with such an addition.

jmc

>
>
> P.S.
> The following script makes it easy to count violations:
>
>    mandoc -T lint */*.[1-9]* */*/*.[1-9]* | \
>      grep 'longer than' | \
>      cut -d : -f 2 | \
>      uniq -c | \
>      sort -nr
>
>
> Index: libmandoc.h
> ===================================================================
> RCS file: /cvs/src/usr.bin/mandoc/libmandoc.h,v
> retrieving revision 1.64
> diff -u -p -r1.64 libmandoc.h
> --- libmandoc.h 3 Apr 2020 11:34:19 -0000 1.64
> +++ libmandoc.h 26 Jun 2021 16:19:03 -0000
> @@ -73,7 +73,7 @@ void roff_reset(struct roff *);
> void roff_man_free(struct roff_man *);
> struct roff_man *roff_man_alloc(struct roff *, const char *, int);
> void roff_man_reset(struct roff_man *);
> -int roff_parseln(struct roff *, int, struct buf *, int *);
> +int roff_parseln(struct roff *, int, struct buf *, int *, size_t);
> void roff_userret(struct roff *);
> void roff_endparse(struct roff *);
> void roff_setreg(struct roff *, const char *, int, char);
> Index: mandoc.1
> ===================================================================
> RCS file: /cvs/src/usr.bin/mandoc/mandoc.1,v
> retrieving revision 1.175
> diff -u -p -r1.175 mandoc.1
> --- mandoc.1 2 Jun 2021 18:27:36 -0000 1.175
> +++ mandoc.1 26 Jun 2021 16:19:04 -0000
> @@ -1066,6 +1066,9 @@ An
> request occurs even though the document already switched to no-fill mode
> and did not switch back to fill mode yet.
> It has no effect.
> +.It Sy "input text line longer than 80 bytes"
> +Consider breaking the input text line
> +at one of the blank characters before column 80.
> .It Sy "verbatim \(dq--\(dq, maybe consider using \e(em"
> .Pq mdoc
> Even though the ASCII output device renders an em-dash as
> Index: mandoc.h
> ===================================================================
> RCS file: /cvs/src/usr.bin/mandoc/mandoc.h,v
> retrieving revision 1.212
> diff -u -p -r1.212 mandoc.h
> --- mandoc.h 2 Jun 2021 18:27:37 -0000 1.212
> +++ mandoc.h 26 Jun 2021 16:19:04 -0000
> @@ -72,6 +72,7 @@ enum mandocerr {
> MANDOCERR_DELIM_NB, /* no blank before trailing delimiter: macro ... */
> MANDOCERR_FI_SKIP, /* fill mode already enabled, skipping: fi */
> MANDOCERR_NF_SKIP, /* fill mode already disabled, skipping: nf */
> + MANDOCERR_TEXT_LONG, /* input text line longer than 80 bytes */
> MANDOCERR_DASHDASH, /* verbatim "--", maybe consider using \(em */
> MANDOCERR_FUNC, /* function name without markup: name() */
> MANDOCERR_SPACE_EOL, /* whitespace at end of input line */
> Index: mandoc_msg.c
> ===================================================================
> RCS file: /cvs/src/usr.bin/mandoc/mandoc_msg.c,v
> retrieving revision 1.11
> diff -u -p -r1.11 mandoc_msg.c
> --- mandoc_msg.c 2 Jun 2021 18:27:37 -0000 1.11
> +++ mandoc_msg.c 26 Jun 2021 16:19:04 -0000
> @@ -71,6 +71,7 @@ static const char *const type_message[MA
> "no blank before trailing delimiter",
> "fill mode already enabled, skipping",
> "fill mode already disabled, skipping",
> + "input text line longer than 80 bytes",
> "verbatim \"--\", maybe consider using \\(em",
> "function name without markup",
> "whitespace at end of input line",
> Index: read.c
> ===================================================================
> RCS file: /cvs/src/usr.bin/mandoc/read.c,v
> retrieving revision 1.190
> diff -u -p -r1.190 read.c
> --- read.c 24 Apr 2020 11:58:02 -0000 1.190
> +++ read.c 26 Jun 2021 16:19:04 -0000
> @@ -152,6 +152,7 @@ mparse_buf_r(struct mparse *curp, struct
> struct buf *firstln, *lastln, *thisln, *loop;
> char *cp;
> size_t pos; /* byte number in the ln buffer */
> + size_t spos; /* at the start of the current line parse */
> int line_result, result;
> int of;
> int lnn; /* line number in the real file */
> @@ -178,6 +179,7 @@ mparse_buf_r(struct mparse *curp, struct
> curp->filenc & MPARSE_LATIN1)
> curp->filenc = preconv_cue(&blk, i);
> }
> + spos = pos;
>
> while (i < blk.sz && (start || blk.buf[i] != '\0')) {
>
> @@ -277,7 +279,8 @@ mparse_buf_r(struct mparse *curp, struct
>
> of = 0;
> rerun:
> - line_result = roff_parseln(curp->roff, curp->line, &ln, &of);
> + line_result = roff_parseln(curp->roff, curp->line,
> + &ln, &of, start && spos == 0 ? pos : 0);
>
> /* Process options. */
>
> Index: roff.c
> ===================================================================
> RCS file: /cvs/src/usr.bin/mandoc/roff.c,v
> retrieving revision 1.248
> diff -u -p -r1.248 roff.c
> --- roff.c 27 Aug 2020 12:58:00 -0000 1.248
> +++ roff.c 26 Jun 2021 16:19:04 -0000
> @@ -1821,7 +1821,7 @@ roff_parsetext(struct roff *r, struct bu
> }
>
> int
> -roff_parseln(struct roff *r, int ln, struct buf *buf, int *offs)
> +roff_parseln(struct roff *r, int ln, struct buf *buf, int *offs, size_t len)
> {
> enum roff_tok t;
> int e;
> @@ -1831,6 +1831,14 @@ roff_parseln(struct roff *r, int ln, str
> int ctl; /* macro line (boolean) */
>
> ppos = pos = *offs;
> +
> + if (len > 80 && r->tbl == NULL && r->eqn == NULL &&
> + (r->man->flags & ROFF_NOFILL) == 0 &&
> + strchr(" .\\", buf->buf[pos]) == NULL &&
> + buf->buf[pos] != r->control &&
> + strcspn(buf->buf, " ") < 80)
> + mandoc_msg(MANDOCERR_TEXT_LONG, ln, (int)len - 1,
> + "%.20s...", buf->buf + pos);
>
> /* Handle in-line equation delimiters. */
>