Follow-up Comment #17, bug #68145 (group groff):
On Tuesday, 17 March 2026 17:54:06 GMT G. Branden Robinson wrote:
> Follow-up Comment #15, bug #68145 (group groff):
>
> [comment #13:]
>
>> I don't know if this is helpful.
>
> I think it is!
>
>> With unlimited page length the value of 'vpos' just keeps increasing.
>
> Yes. That's by design; it's the nature of the continuous rendering beast.
> At least in the "new" approach.
>
>> At the bottom of the multi bash file it is set to
>>
>> V16440280
>
> That sounds about right. It's 56 times more than the vertical drawing
> position at the end of _one_ copy of the bash man page on my system.
>
>
> $ groff -man -T utf8 -Z $(man -w bash) | grep '^V' | tail -n 1
> V289880
> $ echo '16440280/289880' | bc -l
> 56.71408858838139919966
>
>
> So it's in the ballpark for 50 (or 64) copies of the document.
>
>> In that case this code in tty.cpp looks expensive:-
>>
>>   if (vpos > nlines) {
>>     tty_glyph **old_lines = lines;
>>     lines = new tty_glyph *[vpos + 1];
>>     memcpy(lines, old_lines, nlines * sizeof(tty_glyph *));
>>     for (int i = nlines; i <= vpos; i++)
>>       lines[i] = 0;
>>     delete[] old_lines;
>>     nlines = vpos + 1;
>>   }
>>
>>
>> Under 1.23.0 the max value for vpos is 9960. This code is called in
>> add_char so I assume it is called for every character, and vpos is
>> incremented for every line output.
>
> Yes and no. add_char() _is_ called for every output character but this
> new/memcpy/delete sequence that demands a lot of the language runtime's
> dynamic memory allocator doesn't run for every character added, because
> there are two `if` guards, one of which you quoted.
>
>
> if (v == cached_v && cached_v != 0)
> ...
> else
> ...
> if (vpos > nlines) {
> ...
>
>
> So I think the allocation dance happens only when `v` differs from
> `cached_v` (or `cached_v` is zero), _and_ when `vpos` exceeds `nlines`.
>
> That should happen, at worst, with every line written by _grotty_.
>
> That said, for 50 copies of _bash_(1):
>
>
> $ groff -man -T utf8 $(man -w bash) | wc -l
> 7247
> $ echo '7247 50 * p' | dc
> 362350
>
For 64 copies of bash.1, grepping for unique ^V values (so that vpos won't be
equal to its cached value), I find 369087, so the same ballpark. This could be
slightly underreported if there were any relative vertical motions (^v), but
there aren't any in bash.1.

> ...which is a lot of _memcpy_(3) calls.
>
>> Pretty sure my analysis is /wrong/incomplete/unhelpful/ as usual. 😄
>
> No, I think it's worth exploring. I think the next thing to do is
> instrument this code to count the number of times grotty reallocates its
> character cell array. (That's what this `lines` thing is.)
>
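To put a rough number on that before anyone instruments the real driver, here
is a minimal sketch that replays the quoted growth policy on a plain counter.
It is not grotty code: `growth_stats` and `simulate_exact_fit` are hypothetical
names, and I am assuming that `vpos` advances by one per output line and that
`nlines` starts at the 66 mentioned in comment #15.

```cpp
#include <cstddef>

// Hypothetical tally of what the quoted growth policy costs.
struct growth_stats {
  std::size_t reallocations;    // new[]/memcpy()/delete[] rounds
  std::size_t pointers_copied;  // total elements moved by memcpy()
};

// Replay the policy from the quoted tty.cpp fragment: whenever vpos
// exceeds nlines, allocate vpos + 1 slots and copy the old nlines
// pointers across.  Assumes vpos grows by one per output line.
growth_stats simulate_exact_fit(std::size_t final_vpos)
{
  growth_stats s = { 0, 0 };
  std::size_t nlines = 66;  // assumed initial size, per comment #15
  for (std::size_t vpos = 1; vpos <= final_vpos; vpos++) {
    if (vpos > nlines) {
      s.reallocations++;
      s.pointers_copied += nlines;  // cost of the memcpy()
      nlines = vpos + 1;
    }
  }
  return s;
}
```

Under those assumptions the array is regrown on roughly every second line, and
the number of pointers copied grows roughly with the square of the line count.
Whether the real driver matches this is exactly what the instrumentation would
tell us.
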
> We should find out if the Arch Linux users suffering the performance hit get
> the same number.
>
> If they don't, we definitely want to find out why.
>
> If they do, then we can ask them to take up the quadratic performance
> degradation with the vendor of their C++ runtime.
>
> Either way, there _might_ be something we can do in _grotty_ about this.
>
> A. We could support a command-line option that pre-sizes `nlines` to a
> specified value, or to something gigantic. The variable is dereferenced in
> only a few places. It's initialized to "66", which is bog-standard 12-point
> spacing on an 11-inch-tall U.S. letter piece of paper. We already have
> satisfactory experiences with the page length being shortened below that
> just before the document ends.
>
> B. We could have the _man_(7) (and _mdoc_(7)) packages use a device
> extension command to transmit a hint to _grotty_ that continuous rendering
> is in use, and therefore `nlines` should be huge.
>
> There are tradeoffs here.
>
> 1. Having _grotty_ demand more memory than it's going to actually use is
> discourteous to other processes on the system.
>
> 2. The reported problem arises only in pathological cases. While the
> original report claims a performance degradation for inputs of all sizes, I
> observe from the results in comment #8 that a document doesn't render twice
> as slowly as before until it's eight times the size of the Bash man page.
> That's a 3 megabyte input document. The new approach to continuous
> rendering solves real-world problems, like misdrawn vertical rules in the
> Linux man-pages _ascii_(7) document--which happens to be **much** shorter
> than the Bash man page, let alone multiple copies thereof. A performance
> regression that affects only extreme outliers of possible inputs might not
> be worth solving.
>
> And the real issue might ultimately not be ours anyway. Maybe somebody's
> Standard C++ Library needs to use a different heap management strategy, or
> support configurable hints available on a per-process basis for selecting
> among several.
>
> One approach that I saw used in C with the X Window System was, once the
> dynamic storage allocated to some variable (often an array) was almost full,
> to `realloc()` it as double the size. Repeat as needed. That would work
> well with _grotty_(1)'s use case. (And we could actually handle `nlines`
> this way ourselves.) Where it's not so good is if your storage requirement
> doesn't monotonically increase but bounces around. There was a case like
> this in X. I added code to some piece of it to `realloc()` the desired
> space _smaller_ once it was vacated down to one-quarter of its previous
> size. Why one-quarter? Because one-half would take you back to where you
> were at the last doubling, and if you have the misfortune to be servicing
> requests that repeatedly take you just over the limit and then duck back
> below it, you'll thrash the allocator. With the double-and-one-quarter
> approach, only **big** swings in a memory region's utilization prompt
> reallocation.
>
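For reference, the double-and-one-quarter idea described above might look
something like the following. This is only a sketch under my own assumptions,
not a patch: `line_table`, `reserve_index` and `release_above` are hypothetical
names, the slots are `void *` rather than `tty_glyph *`, the floor of 66 is
borrowed from the current initial value of `nlines`, and I have guessed that
shrinking means halving.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for grotty's array of per-line pointers.
struct line_table {
  void **slots;
  std::size_t capacity;
  std::size_t reallocations;

  line_table() : slots(new void *[66]()), capacity(66), reallocations(0) {}
  ~line_table() { delete[] slots; }
  line_table(const line_table &) = delete;
  line_table &operator=(const line_table &) = delete;

  // Move the first `keep` pointers into a fresh, zeroed array.
  void resize(std::size_t new_capacity, std::size_t keep)
  {
    void **old = slots;
    slots = new void *[new_capacity]();  // value-initialized to null
    std::memcpy(slots, old, keep * sizeof(void *));
    delete[] old;
    capacity = new_capacity;
    reallocations++;
  }

  // Make `index` addressable, doubling the capacity until it fits.
  void reserve_index(std::size_t index)
  {
    if (index < capacity)
      return;
    std::size_t n = capacity;
    while (n <= index)
      n *= 2;
    resize(n, capacity);
  }

  // Halve the capacity once no more than a quarter of it is in use,
  // so that hovering around a doubling boundary does not thrash.
  void release_above(std::size_t used)
  {
    if (capacity > 66 && used <= capacity / 4) {
      std::size_t n = capacity / 2;
      resize(n < 66 ? 66 : n, used);
    }
  }
};
```

With this policy the number of reallocations grows with the logarithm of the
line count rather than linearly. For grotty's monotonically increasing `vpos`
only the growing half would matter.
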
_______________________________________________________
Reply to this item at:
<https://savannah.gnu.org/bugs/?68145>
_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/
