deroff chokes when given lines > 2048 bytes, and produces non-deterministic output on little-endian archs.

Reproducer:

$ jot -s '' -b blah 513 > /tmp/blah
$ for i in 1 2 3 4 ; do deroff /tmp/blah | md5 ; done
2d8f4eebd61ed2a07101419169fc9997
ae19be78a09e6b371787003bf76f5937
82b4bcea8510605bea2723ffc70c99b4
0ea7b0ddc76d2a280dd30cff6a69574e

This happens because regline() writes one byte past the end of line[], and typically this will be the first byte of *lp.

On little-endian archs this makes lp jump a few bytes backwards or forwards, meaning that we write the terminating null in slightly the wrong place, usually resulting in some garbage characters at the end of each output line (long lines > LINE_MAX are supposed to be split at LINE_MAX in the output).

Big-endian will almost always just crash immediately, as the pointer is completely trashed.

Fix below.

Note that on each iteration of the loop, we've already read the next character from the input stream at the end of the previous iteration, so we have to increase lp by one before breaking out, to avoid overwriting the last character of any line that is formed by splitting an overly long one with the null terminator.

Further note that since the null terminator is stored in line[], we can only read up to LINE_MAX - 1, so long input lines will be split at 2047 characters. If this is an issue, then the size of line[] can always be bumped to LINE_MAX + 1.

Whilst here, fix a compiler warning when FULLDEBUG is defined.

Oh, and since deroff is called by /usr/bin/spell, this can cause unexpected behaviour when running spell against files with long lines too.

--- deroff.c.dist	Wed Mar  8 01:43:10 2023
+++ deroff.c	Tue Sep 19 09:31:29 2023
@@ -484,6 +484,10 @@
 		}
 		if (c == '\n')
 			break;
+		if (lp - line == sizeof(line) - 2) {
+			lp++;
+			break;
+		}
 		if (intable && c == 'T') {
 			*++lp = C;
 			if (c == '{' || c == '}') {
@@ -968,7 +972,7 @@
 #ifdef FULLDEBUG
 	{
 		char *p;
-		printf("[%d,%d: ", argno, np - cp);
+		printf("[%d,%ld: ", argno, np - cp);
 		for (p = cp; p < np; p++) {
 			putchar(*p);
 		}
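
For anyone who wants to check the lp++ reasoning without building deroff, here is a rough standalone model of the patched loop. It is only a sketch and not code from deroff.c: read_chunk(), next_char() and BUFSZ are made-up names, the input is a string rather than a stream, and the buffer is shrunk to 8 bytes so the split is easy to see.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUFSZ 8			/* hypothetical stand-in for LINE_MAX (2048) */

static char line[BUFSZ];	/* plays the role of line[] in deroff.c */

/* Hypothetical stand-in for the C macro: fetch the next input character.
 * End of string is reported as a newline and is not consumed. */
static int
next_char(const char **src)
{
	return **src != '\0' ? (unsigned char)*(*src)++ : '\n';
}

/*
 * Model of the patched loop.  Stores one chunk of input in line[],
 * advances *src past what was consumed, and returns the chunk length.
 */
static size_t
read_chunk(const char **src)
{
	char *lp = line;
	int c;

	c = next_char(src);	/* first character is already in hand */
	*lp = (char)c;
	for (;;) {
		if (c == '\n')
			break;	/* lp points at the newline; overwrite it */
		if ((size_t)(lp - line) == sizeof(line) - 2) {
			lp++;	/* *lp holds real data; step past it */
			break;
		}
		c = next_char(src);	/* same shape as "*++lp = C" */
		*++lp = (char)c;
	}
	assert(lp < line + sizeof(line));	/* terminator stays in bounds */
	*lp = '\0';
	return (size_t)(lp - line);
}
```

Feeding it "abcdefghij\n" gives the chunks "abcdefg" and "hij", i.e. the split happens at BUFSZ - 1 characters and nothing is lost. Dropping the lp++ in this model makes the first chunk "abcdef" and the 'g' disappears, which is the overwrite described above.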