Follow-up Comment #7, bug #67509 (group groff):

On Thursday, 18 September 2025 23:16:25 BST G. Branden Robinson wrote:
> Follow-up Comment #6, bug #67509 (group groff):
>
> At 2025-09-18T17:40:00-0400, Deri James wrote:
[...]
>>
>> I think the only bit left is '.strhex'. So the only thing preventing
>> branch deletion is waiting for "I must do something about that" when I
>> told you about the significant slow down in specific workloads since
>> the introduction of the looping lookup.
>
> Right.  Is that reproducible with the master branch today?  If so, with
> what input document?
>
> A reproducer that is easy for me to throw at the code would help
> immensely.

Hi Branden,

The instructions for testing were here:-

<https://lists.gnu.org/archive/html/groff/2024-12/msg00168.html>

Just grab the file referred to, DON'T do the git checkout, DON'T apply the
patch, DO run:-

time (./test-pdfmom --roff -Tpdf -man -petk LMB.prep > dj.pdf )

It runs in about 02:45 on my rig. This includes my "speedup" code; if you want
to see the state it was in after you inserted your code, follow the
instructions in the quoted list message, or take my word that it took over 10
mins. This compares to 16 secs before your code was added.

>> I was hoping, and you have mentioned it in passing, using c++ maps in
>> groff, would result in a new .map/.lookup feature for groff which
>> would have many user applications such as possibly indexing, as well
>> as in pdf.tmac.
>
> I'd prefer to avoid adding major new features at this point in this
> release cycle, but if there's a slowdown that is unbearable, we'll have
> to resolve it somehow.

Although this job (on bleeding edge) has a 10x slowdown, the cost depends very
much on the size of the lookup table and the number of lookups which fail, and
this job is not typical. That said, in the noughties I was putting millions
of pages through groff (for a UK bank) and being paid £25 per 1000 pages (I
paid the sales and helpdesk staff, who worked for a software company, a
percentage). The reason I got the job is that groff was lightning fast compared
to the commercial report writing software running on Windows, which required
multiple servers and licenses. I would not be happy at a slowdown if I hadn't
retired.

So I think we can live with it for this release, and you can take the flak if
someone else is using groff for monster jobs.
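
To illustrate why the cost tracks the table size and the number of failed
lookups, here is a minimal C++ sketch. It is not groff or pdf.tmac code; the
names (Entry, scan_lookup, make_table, map_lookup) are hypothetical and only
model the general shape of a looping lookup versus a std::map lookup.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a table of named entries; not groff's own data structure.
using Entry = std::pair<std::string, std::string>;

struct ScanResult {
    bool found;
    std::size_t comparisons;  // string comparisons performed by the scan
};

// Looping lookup: walk the table until the key matches or the table ends.
ScanResult scan_lookup(const std::vector<Entry>& table, const std::string& key) {
    std::size_t comparisons = 0;
    for (const Entry& e : table) {
        ++comparisons;
        if (e.first == key)
            return {true, comparisons};
    }
    return {false, comparisons};  // a miss always costs table.size() comparisons
}

// Build a table of n entries named key0 .. key<n-1>.
std::vector<Entry> make_table(std::size_t n) {
    assert(n > 0);
    std::vector<Entry> table;
    table.reserve(n);
    for (std::size_t i = 0; i < n; ++i)
        table.emplace_back("key" + std::to_string(i), "value" + std::to_string(i));
    return table;
}

// Map-based lookup: std::map::find needs O(log n) comparisons, hit or miss.
bool map_lookup(const std::map<std::string, std::string>& index, const std::string& key) {
    return index.find(key) != index.end();
}
```

In this model a failed scan_lookup touches every entry, so a job with a large
table and many misses pays for the full table on each miss, whereas the map
lookup grows only logarithmically with the table size.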

>> However with your herculean effort with .asciify I think I can see a
>> way of replacing the loop and asciify the lookup name (instead of the
>> device control string which is no longer done) to avoid the parser
>> barfing. Not sure, an asciified string must still contain \[uXXXX]
>> entities, may not be accepted as part of a .ds name (but should be if
>> we welcome non-english users).
>
> Hmm, well, any printable character is legal in a string identifier
> except the escape character, which is configurable.
>
> So you might do something underhanded like this.
>
>
> .\" Define a new key string.
> .eo
> .ds pdf*key*\[u1234]\[u2345]*whatever value-\[u1234]\[u2345]
> .ec
> .
> .\" ...much later...
> .
> .\" Look it up.
> .eo
> .als pdf*bookmark-content pdf*key*\[u1234]\[u2345]*whatever
> .ec
> .
> .\" Use it.
> \X'pdf:bookmark \*[pdf*bookmark-content]'
>
I think I have very similar code in one of my stashes. I know I hit a wall,
but can't remember why; it may have been before we added \A'' protection on
loops using parameters.

Cheers

Deri

> Just spitballing.  I won't be surprised if this technique doesn't
> survive a real-world attempt at application.
>
> Regards,
> Branden
>


    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?67509>

_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/
