Follow-up Comment #7, bug #67509 (group groff):

On Thursday, 18 September 2025 23:16:25 BST G. Branden Robinson wrote:
> Follow-up Comment #6, bug #67509 (group groff):
>
> At 2025-09-18T17:40:00-0400, Deri James wrote:
[...]
>>
>> I think the only bit left is '.strhex'. So the only thing preventing
>> branch deletion is waiting for "I must do something about that" when I
>> told you about the significant slowdown in specific workloads since
>> the introduction of the looping lookup.
>
> Right. Is that reproducible with the master branch today? If so, with
> what input document?
>
> A reproducer that is easy for me to throw at the code would help
> immensely.
Hi Branden,

The instructions for testing were here:-

<https://lists.gnu.org/archive/html/groff/2024-12/msg00168.html>

Just grab the file referred to, DON'T do the git checkout, DON'T apply
the patch, DO run:-

time (./test-pdfmom --roff -Tpdf -man -petk LMB.prep > dj.pdf )

It runs in about 02:45 on my rig. This includes my "speedup" code; if
you want to discover the state it was in after you inserted your code,
follow the instructions in the quoted list message, or take my word
that it was over 10 minutes. This compares to 16 seconds before your
code was added.

>> I was hoping, and you have mentioned it in passing, using C++ maps in
>> groff, would result in a new .map/.lookup feature for groff which
>> would have many user applications such as possibly indexing, as well
>> as in pdf.tmac.
>
> I'd prefer to avoid adding major new features at this point in this
> release cycle, but if there's a slowdown that is unbearable, we'll have
> to resolve it somehow.

Although this job (on bleeding edge) has a 10x slowdown, the slowdown
depends very much on the size of the lookup table and the number of
lookups which fail. This job is not typical, although in the noughties
I was doing millions of pages through groff (for a UK bank) and being
paid £25 per 1000 pages (I paid the sales and helpdesk people, who
worked for a software company, a percentage). The reason I got the job
is that groff was lightning fast compared to the commercial report
writing software running on Windows, which required multiple servers
and licenses. I would not be happy at a slowdown if I hadn't retired.

So I think we can live with it for this release, and you can take the
flak if someone else is using groff for monster jobs.

>> However with your herculean effort with .asciify I think I can see a
>> way of replacing the loop and asciify the lookup name (instead of the
>> device control string which is no longer done) to avoid the parser
>> barfing.
>> Not sure, an asciified string must still contain \[uXXXX]
>> entities, may not be accepted as part of a .ds name (but should be if
>> we welcome non-English users).
>
> Hmm, well, any printable character is legal in a string identifier
> except the escape character, which is configurable.
>
> So you might do something underhanded like this.
>
>
> .\" Define a new key string.
> .eo
> .ds pdf*key*\[u1234]\[u2345]*whatever value-\[u1234]\[u2345]
> .ec
> .
> .\" ...much later...
> .
> .\" Look it up.
> .eo
> .als pdf*bookmark-content pdf*key*\[u1234]\[u2345]*whatever
> .ec
> .
> .\" Use it.
> \X'pdf:bookmark \*[pdf*bookmark-content]'
>

I think I have very similar code in one of my stashes. I know I hit a
wall, though I can't remember the details; it may have been before we
added \A'' protection on loops using parameters.

Cheers

Deri

> Just spitballing. I won't be surprised if this technique doesn't
> survive a real-world attempt at application.
>
> Regards,
> Branden
>

_______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?67509>

_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/