deroff chokes when given lines > 2048 bytes, and produces non-deterministic
output on little-endian archs.

Reproducer:

$ jot -s '' -b blah 513 > /tmp/blah
$ for i in 1 2 3 4 ; do deroff /tmp/blah | md5 ; done
2d8f4eebd61ed2a07101419169fc9997
ae19be78a09e6b371787003bf76f5937
82b4bcea8510605bea2723ffc70c99b4
0ea7b0ddc76d2a280dd30cff6a69574e

This happens because regline() writes one byte past the end of line[], and
typically this will be the first byte of lp itself (the pointer, not the
character it points to).

On little-endian archs this makes lp jump a few bytes backwards or forwards,
so the terminating null is written in slightly the wrong place. The usual
result is some garbage characters at the end of each output line (long lines
> LINE_MAX are supposed to be split at LINE_MAX in the output).

Big-endian archs will almost always just crash immediately: there the
overwritten byte is the most significant one, so the pointer is completely
trashed.
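
To illustrate the difference, here is a minimal sketch, not taken from
deroff.c. It assumes a 64-bit integer standing in for lp and uses two
hypothetical helpers, clobber_first_byte() and host_is_little_endian(); the
real placement of line[] and lp in memory is up to the compiler and linker.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not part of deroff: overwrite the lowest-addressed
 * byte of a pointer-sized integer, as a one-byte overflow into an adjacent
 * pointer would, and return the resulting value.
 */
static uint64_t
clobber_first_byte(uint64_t value, unsigned char byte)
{
	unsigned char raw[sizeof(value)];

	memcpy(raw, &value, sizeof(raw));
	raw[0] = byte;		/* the stray write lands here */
	memcpy(&value, raw, sizeof(raw));
	return value;
}

/* Return 1 on a little-endian host, 0 otherwise. */
static int
host_is_little_endian(void)
{
	uint16_t probe = 1;
	unsigned char first;

	memcpy(&first, &probe, 1);
	return first == 1;
}
```

On a little-endian host only the low 8 bits of the value change, so the
"pointer" moves by at most 255 bytes. On a big-endian host the top 8 bits
change, which moves it to a completely different part of the address space.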

Fix below.

Note that on each iteration of the loop, the next character has already been
read from the input stream at the end of the previous iteration. We therefore
have to increase lp by one before breaking out; otherwise the null terminator
would overwrite the last character of any line that is formed by splitting an
overly long one.

Further note that since the null terminator is stored in line[], we can only
read up to LINE_MAX - 1, so long input lines will be split at 2047 characters.
If this is an issue, then the size of line[] can always be bumped to
LINE_MAX + 1.
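
The splitting behaviour can be sketched in isolation as follows. This is a
simplified stand-in, not the real regline(): read_chunk() is a hypothetical
function that reads from a string instead of the input stream and has none of
the table or macro handling. It only shows that a buffer of size N holds at
most N - 1 characters plus the terminator.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the read loop, not the real regline():
 * copy characters from *src into buf until a newline, the end of the
 * input, or size - 1 characters, whichever comes first, then
 * NUL-terminate. Advances *src past what was consumed and returns the
 * number of characters stored (excluding the terminator).
 */
static size_t
read_chunk(const char **src, char *buf, size_t size)
{
	size_t n = 0;

	assert(size > 0);	/* need room for the terminator */
	while (**src != '\0' && **src != '\n' && n < size - 1)
		buf[n++] = *(*src)++;
	if (**src == '\n')
		(*src)++;	/* consume the newline itself */
	buf[n] = '\0';
	return n;
}
```

With a 4-byte buffer, the input "abcdefghij\nxy" comes back as "abc", "def",
"ghi", "j" and "xy". With a 2048-byte buffer the same logic gives chunks of
at most 2047 characters.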

Whilst here, fix a compiler warning when FULLDEBUG is defined.
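
For context on the warning: the difference of two pointers has type
ptrdiff_t, which %d does not match. %ld matches wherever ptrdiff_t is long.
C99 also provides the 't' length modifier for exactly this type. A minimal
sketch, with format_span() being a hypothetical helper rather than anything in
deroff.c:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper, not from deroff.c: format the distance between
 * two pointers into the same array. The difference has type ptrdiff_t,
 * which the C99 't' length modifier is meant for.
 */
static int
format_span(char *out, size_t outsize, int argno, const char *cp,
    const char *np)
{
	ptrdiff_t len;

	assert(np >= cp);	/* both must point into the same array */
	len = np - cp;
	return snprintf(out, outsize, "[%d,%td: ", argno, len);
}
```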

Oh, and since deroff is called by /usr/bin/spell, this can cause unexpected
behaviour when running spell against files with long lines too.

--- deroff.c.dist       Wed Mar  8 01:43:10 2023
+++ deroff.c    Tue Sep 19 09:31:29 2023
@@ -484,6 +484,10 @@
                }
                if (c == '\n')
                        break;
+               if (lp - line == sizeof(line) - 2) {
+                       lp++;
+                       break;
+               }
                if (intable && c == 'T') {
                        *++lp = C;
                        if (c == '{' || c == '}') {
@@ -968,7 +972,7 @@
 #ifdef FULLDEBUG
                {
                        char    *p;
-                       printf("[%d,%d: ", argno, np - cp);
+                       printf("[%d,%ld: ", argno, np - cp);
                        for (p = cp; p < np; p++) {
                                putchar(*p);
                        }
