Hi Jason and Theo,
Jason McIntyre wrote on Tue, Jun 22, 2021 at 06:37:27AM +0100:
> On Tue, Jun 22, 2021 at 04:48:39AM +0200, Theo Buehler wrote:
>> You have two overlong lines as indicated below. I would have thought
>> that mandoc -Tlint complains about that, but apparently it doesn't have
>> such a warning... With those wrapped,
> yes, there is no feedback on long lines. although we try to keep the
> source less than 80 width, there are some places where it is not
> possible.
>
> i'm not sure whether adding a warning would be helpful or disruptive.
Here is a patch implementing such a style warning, leaning very
heavily into the direction of never producing false positives, that
is, not warning about long lines
- in no-fill mode (e.g., .Bd -literal, .EX, .nf and the like)
- that start with a dot (normal macro and request lines)
  or with a non-standard control character
- that start with a space character or with an escape sequence
- in tbl(7) context
- in eqn(7) context
- that do not contain a blank character before column 80
So, this certainly does not find all long lines that can be improved,
but everything it finds ought to be trivial and worthwhile to fix.
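
To make the conditions above concrete, here is a rough stand-alone
sketch of the test in C.  It is not the code from the patch below:
the function name long_text_line() and all of its parameters are made
up for illustration, the parser state (no-fill, tbl, eqn, control
character) is passed in as plain arguments, and the length is simply
taken from strlen(3), which may be off by one compared to what read.c
passes in the patch.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-alone version of the long-line check.
 * Returns 1 if the line would be flagged, 0 otherwise.
 * None of these parameter names exist in mandoc itself.
 */
static int
long_text_line(const char *line, int nofill, int in_tbl, int in_eqn,
    char control)
{
	/* Assumption: length in bytes, without a trailing newline. */
	size_t len = strlen(line);

	/* Short lines are never flagged. */
	if (len <= 80)
		return 0;

	/* Skip no-fill mode and tbl(7) and eqn(7) context. */
	if (nofill || in_tbl || in_eqn)
		return 0;

	/* Skip lines starting with a blank, a dot, or a backslash. */
	if (strchr(" .\\", line[0]) != NULL)
		return 0;

	/* Skip lines starting with a non-standard control character. */
	if (line[0] == control)
		return 0;

	/* Flag only if the first blank occurs before column 80. */
	return strcspn(line, " ") < 80;
}
```

For example, a 100-byte text line with a blank at byte 10 is flagged,
while the same line with its only blank at byte 90 is not, because
there is no obvious place to break it.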
There are fewer than twenty-five offenders below /usr/src/share/man/,
and none of those are false positives.
However, i see over twenty-three thousand offending lines
below /usr/share/man/, the vast majority in Perl manual pages
and considerable amounts in other third-party stuff like GCC
and binutils.
The worst offenders we maintain ourselves are tmux(1) with 15
offending lines, terminfo(5) with 11, and mkhybrid(8) with 6, followed
by tic(1), magic(5), sysctl(2), cdio(1), and the rest with three or
fewer offending lines each, maybe a few dozen offending files all told.
I don't think such a patch would be disruptive. I expect most of the
third-party manuals it flags already have many other warnings.
Our own stuff does not have many such issues, thanks to Jason's
diligent manual work.
Does it help? Maybe it could slightly reduce one aspect of Jason's
workload and help developers who want to find their own glitches in
this respect, in particular those usually working on terminals wider
than 80 columns.
There is no need to chase all of these down, but the style warning
might help when editing a page for other reasons.
What do you think?
Ingo
P.S.
The following script makes it easy to count violations:
    mandoc -T lint */*.[1-9]* */*/*.[1-9]* | \
        grep 'longer than' | \
        cut -d : -f 2 | \
        uniq -c | \
        sort -nr
Index: libmandoc.h
===================================================================
RCS file: /cvs/src/usr.bin/mandoc/libmandoc.h,v
retrieving revision 1.64
diff -u -p -r1.64 libmandoc.h
--- libmandoc.h 3 Apr 2020 11:34:19 -0000 1.64
+++ libmandoc.h 26 Jun 2021 16:19:03 -0000
@@ -73,7 +73,7 @@ void roff_reset(struct roff *);
void roff_man_free(struct roff_man *);
struct roff_man *roff_man_alloc(struct roff *, const char *, int);
void roff_man_reset(struct roff_man *);
-int roff_parseln(struct roff *, int, struct buf *, int *);
+int roff_parseln(struct roff *, int, struct buf *, int *, size_t);
void roff_userret(struct roff *);
void roff_endparse(struct roff *);
void roff_setreg(struct roff *, const char *, int, char);
Index: mandoc.1
===================================================================
RCS file: /cvs/src/usr.bin/mandoc/mandoc.1,v
retrieving revision 1.175
diff -u -p -r1.175 mandoc.1
--- mandoc.1 2 Jun 2021 18:27:36 -0000 1.175
+++ mandoc.1 26 Jun 2021 16:19:04 -0000
@@ -1066,6 +1066,9 @@ An
request occurs even though the document already switched to no-fill mode
and did not switch back to fill mode yet.
It has no effect.
+.It Sy "input text line longer than 80 bytes"
+Consider breaking the input text line
+at one of the blank characters before column 80.
.It Sy "verbatim \(dq--\(dq, maybe consider using \e(em"
.Pq mdoc
Even though the ASCII output device renders an em-dash as
Index: mandoc.h
===================================================================
RCS file: /cvs/src/usr.bin/mandoc/mandoc.h,v
retrieving revision 1.212
diff -u -p -r1.212 mandoc.h
--- mandoc.h 2 Jun 2021 18:27:37 -0000 1.212
+++ mandoc.h 26 Jun 2021 16:19:04 -0000
@@ -72,6 +72,7 @@ enum mandocerr {
MANDOCERR_DELIM_NB, /* no blank before trailing delimiter: macro ... */
MANDOCERR_FI_SKIP, /* fill mode already enabled, skipping: fi */
MANDOCERR_NF_SKIP, /* fill mode already disabled, skipping: nf */
+ MANDOCERR_TEXT_LONG, /* input text line longer than 80 bytes */
MANDOCERR_DASHDASH, /* verbatim "--", maybe consider using \(em */
MANDOCERR_FUNC, /* function name without markup: name() */
MANDOCERR_SPACE_EOL, /* whitespace at end of input line */
Index: mandoc_msg.c
===================================================================
RCS file: /cvs/src/usr.bin/mandoc/mandoc_msg.c,v
retrieving revision 1.11
diff -u -p -r1.11 mandoc_msg.c
--- mandoc_msg.c 2 Jun 2021 18:27:37 -0000 1.11
+++ mandoc_msg.c 26 Jun 2021 16:19:04 -0000
@@ -71,6 +71,7 @@ static const char *const type_message[MA
"no blank before trailing delimiter",
"fill mode already enabled, skipping",
"fill mode already disabled, skipping",
+ "input text line longer than 80 bytes",
"verbatim \"--\", maybe consider using \\(em",
"function name without markup",
"whitespace at end of input line",
Index: read.c
===================================================================
RCS file: /cvs/src/usr.bin/mandoc/read.c,v
retrieving revision 1.190
diff -u -p -r1.190 read.c
--- read.c 24 Apr 2020 11:58:02 -0000 1.190
+++ read.c 26 Jun 2021 16:19:04 -0000
@@ -152,6 +152,7 @@ mparse_buf_r(struct mparse *curp, struct
struct buf *firstln, *lastln, *thisln, *loop;
char *cp;
size_t pos; /* byte number in the ln buffer */
+ size_t spos; /* at the start of the current line parse */
int line_result, result;
int of;
int lnn; /* line number in the real file */
@@ -178,6 +179,7 @@ mparse_buf_r(struct mparse *curp, struct
curp->filenc & MPARSE_LATIN1)
curp->filenc = preconv_cue(&blk, i);
}
+ spos = pos;
while (i < blk.sz && (start || blk.buf[i] != '\0')) {
@@ -277,7 +279,8 @@ mparse_buf_r(struct mparse *curp, struct
of = 0;
rerun:
- line_result = roff_parseln(curp->roff, curp->line, &ln, &of);
+ line_result = roff_parseln(curp->roff, curp->line,
+ &ln, &of, start && spos == 0 ? pos : 0);
/* Process options. */
Index: roff.c
===================================================================
RCS file: /cvs/src/usr.bin/mandoc/roff.c,v
retrieving revision 1.248
diff -u -p -r1.248 roff.c
--- roff.c 27 Aug 2020 12:58:00 -0000 1.248
+++ roff.c 26 Jun 2021 16:19:04 -0000
@@ -1821,7 +1821,7 @@ roff_parsetext(struct roff *r, struct bu
}
int
-roff_parseln(struct roff *r, int ln, struct buf *buf, int *offs)
+roff_parseln(struct roff *r, int ln, struct buf *buf, int *offs, size_t len)
{
enum roff_tok t;
int e;
@@ -1831,6 +1831,14 @@ roff_parseln(struct roff *r, int ln, str
int ctl; /* macro line (boolean) */
ppos = pos = *offs;
+
+ if (len > 80 && r->tbl == NULL && r->eqn == NULL &&
+ (r->man->flags & ROFF_NOFILL) == 0 &&
+ strchr(" .\\", buf->buf[pos]) == NULL &&
+ buf->buf[pos] != r->control &&
+ strcspn(buf->buf, " ") < 80)
+ mandoc_msg(MANDOCERR_TEXT_LONG, ln, (int)len - 1,
+ "%.20s...", buf->buf + pos);
/* Handle in-line equation delimiters. */