Hi folks,

An Arch Linux user has reported a performance regression in man page rendering in groff 1.24. For "small" man pages (or collections thereof), it's not noticeable, but the reported degradation is quadratic in large inputs: roughly twice as bad as one would expect for 25 copies in a row of the gcc man page.
I'm not able to reproduce the problem. Since I don't have the gcc man page handy (it's not DFSG-free and my system is Debian-based), I used up to 50 copies of the bash 5.3 man page, which uncompressed is...

$ wc $(man -w bash)
  13495  64023 392678 /home/branden/share/man/man1/bash.1

...so 50 copies is almost 20 megabytes of input.

Interestingly, the reporter has narrowed the problem down to a single commit, and can recover groff 1.23's performance by simply pointing groff 1.24 at 1.23's macro files. That suggests a humdinger of a macro programming problem...but when I look at the identified regressing commit, it's hard for me to see how.

https://cgit.git.savannah.gnu.org/cgit/groff.git/commit/?id=732b07d4998bec1cc942481e7cf4e7287050c40b

One thing you'll notice about this commit is that it substantially _reduces_ the amount of macro code being interpreted. Other things being equal, you'd expect that to _improve_ performance. Obviously other things aren't equal. But there's no change in cyclomatic complexity--meaning, no loops are added or removed, and, more to the point for a macro-oriented language like *roff, there's no change to recursive macro calls. (I don't think our man(7) package has _any_ recursive macro calls, and I don't see any in this diff.)

I have only one guess about a culprit here, and I don't think it's a very good one. See this bit at the end?

+.\" In continuous rendering mode, make page breaks less potent and the
+.\" page length "infinite".
+.if \n[cR] \{\
+.  rn bp an*real-bp
+.  rn an*bp bp
+.  pl \n[.R]u/1v
+.\}

That division operation gave me pause for a moment, because `.R` is guaranteed to have a honking large value. This seems like an unpromising site for exploration, though, for two reasons.

1. GNU troff doesn't implement its own divider. It translates its own arithmetic language to C++ and relies on the language runtime to perform the actual operations. This of course gets compiled to assembly language and handed off to the CPU.
   The reporter didn't mention what machine architecture they're using, but I'm guessing it's one with a hardware divider.

2. This computation is done only once per load of the macro package. That means, if you use "-mandoc", it will happen once at every switch from man(7) to mdoc(7) and vice versa. If no such switch occurs, or if all your documents use the same macro package, then it happens once, period. A constant-order factor cannot create a quadratic performance degradation.

It's certainly possible that `.R` is still implicated somehow--it, and the prerequisite change to adopt saturating arithmetic in GNU troff, is the "deepest" change I've ever made to the formatter. Possibly the arithmetic is fine but there's something unfortunate elsewhere in the formatter that creates needless churn when the distance to the next vertical position trap is gigantic. (If that's true, then when we fix it, we'll be fixing it for many more applications than just man pages.) But right now I don't see any _evidence_ of that.

So, can anyone reproduce this problem and supply the necessary evidence? The Savannah ticket has scripts for performing the stress test with the bash man page, as well as auxiliary scripts for producing a gnuplot diagram comparing the performance of the groff 1.23 and 1.24 macro files, if you're interested in doing that.

Thanks in advance.

Regards,
Branden
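For anyone who wants to experiment before fetching the ticket's scripts, here's a rough stand-alone sketch of the same kind of stress test. This is my own illustration, not the Savannah scripts: it defaults to `cat` as a placeholder formatter and a synthetic input file so it runs anywhere; for the real measurement, substitute something like FORMATTER='groff -man -Tutf8' and point SRC at an uncompressed copy of the bash man page (e.g. from `man -w bash`). Note that `date +%s%N` assumes GNU date.

```shell
# Time a formatter on N concatenated copies of a source document and
# print how the cost scales.  FORMATTER and SRC are placeholders here.
FORMATTER=${FORMATTER:-cat}            # stand-in; use 'groff -man' for real
SRC=${SRC:-/tmp/stress-src.txt}
seq 1 10000 > "$SRC"                   # synthetic input for the sketch

for n in 1 5 10 25; do
    in=/tmp/stress-$n.txt
    : > "$in"                          # truncate any previous run
    i=0
    while [ "$i" -lt "$n" ]; do        # build N-copy input
        cat "$SRC" >> "$in"
        i=$((i + 1))
    done
    t0=$(date +%s%N)
    $FORMATTER < "$in" > /dev/null
    t1=$(date +%s%N)
    printf '%3d copies: %d ms\n' "$n" $(( (t1 - t0) / 1000000 ))
done
```

If rendering time is linear in input size, 25 copies should cost roughly 25 times one copy; a quadratic degradation shows the per-copy cost itself growing with N.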
