URL:
  <https://savannah.gnu.org/bugs/?67718>

                 Summary: [troff] `class` request overpopulates `ranges` property
                   Group: GNU roff
               Submitter: gbranden
               Submitted: Tue 18 Nov 2025 05:57:18 PM UTC
                Category: Core
                Severity: 2 - Minor
              Item Group: Lint
                  Status: None
                 Privacy: Public
             Assigned to: None
             Open/Closed: Open
         Discussion Lock: Any
         Planned Release: None


    _______________________________________________________

Follow-up Comments:


-------------------------------------------------------
Date: Tue 18 Nov 2025 05:57:18 PM UTC By: G. Branden Robinson <gbranden>
This issue has become observable only with recent `pchar` request work.

When populating range-based character classes, the end point of each range is
added to the range vector a second time, as a separate single-element entry.
This is harmless but wasteful.

I instrumented "input.cpp" to illustrate that this is a real bug in how the
STL vector object is populated, and not in my dumper function.


$ git diff
diff --git a/src/roff/troff/input.cpp b/src/roff/troff/input.cpp
index dc6c4e77a..b52cb9cdd 100644
--- a/src/roff/troff/input.cpp
+++ b/src/roff/troff/input.cpp
@@ -10723,6 +10723,7 @@ void charinfo::dump()
     errprint("  defined at: ");
     mac->dump();
     fflush(stderr);
+    errprint("  DEBUG: class has %1 ranges in it\n", int(ranges.size()));
     errprint("  contains ranges: ");
     const size_t buflen = sizeof "U+10FFFF";
     int range_begin = 0;
$ printf '.pchar \\C"[CJKnormal]"\n' | ./build/test-groff -m ja -T utf8
character class '[CJKnormal]'
  defined at: file name: "/home/branden/src/GIT/groff/build/../tmac/ja.tmac", line number: 44
  DEBUG: class has 6 ranges in it
  contains ranges: U+3041-U+3096 U+3096 U+30A0-U+30FF U+30FF U+4E00-U+9FFF U+9FFF
  contains nested classes: (none)
$ sed -n '44,+2p' tmac/ja.tmac 
.class [CJKnormal] \
  \[u3041]-\[u3096] \[u30A0]-\[u30FF] \[u4E00]-\[u9FFF]
.


The `[CJKnormal]` class is populated with 6 ranges; it should contain 3, one
for each range given in the `class` request.
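
To illustrate why the extra entries are harmless but wasteful, here is a
minimal sketch that models the range list as a vector of inclusive
(first, last) pairs.  The type `range_list` and the helper functions are
hypothetical names chosen for this illustration; they are not taken from
"input.cpp", and the sketch does not attempt to reproduce the code path that
causes the duplication.  It only compares the list written in "ja.tmac" with
the list shown in the dump above.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a class's range list: inclusive
// (first, last) pairs of code points.  Not groff's actual type.
typedef std::vector<std::pair<int, int> > range_list;

// Return true if code point c lies within any stored range.
static bool contains(const range_list &ranges, int c)
{
  for (std::size_t i = 0; i < ranges.size(); i++)
    if (c >= ranges[i].first && c <= ranges[i].second)
      return true;
  return false;
}

// The three ranges written in the `class` request in ja.tmac.
static range_list expected_ranges()
{
  range_list r;
  r.push_back(std::make_pair(0x3041, 0x3096));
  r.push_back(std::make_pair(0x30A0, 0x30FF));
  r.push_back(std::make_pair(0x4E00, 0x9FFF));
  return r;
}

// The six entries shown in the dump: each range is followed by its
// end point again, modeled here as a one-element range.
static range_list observed_ranges()
{
  range_list r;
  r.push_back(std::make_pair(0x3041, 0x3096));
  r.push_back(std::make_pair(0x3096, 0x3096));
  r.push_back(std::make_pair(0x30A0, 0x30FF));
  r.push_back(std::make_pair(0x30FF, 0x30FF));
  r.push_back(std::make_pair(0x4E00, 0x9FFF));
  r.push_back(std::make_pair(0x9FFF, 0x9FFF));
  return r;
}

// Return true if both lists agree on membership for every code point
// in [lo, hi].
static bool same_membership(const range_list &a, const range_list &b,
                            int lo, int hi)
{
  for (int c = lo; c <= hi; c++)
    if (contains(a, c) != contains(b, c))
      return false;
  return true;
}
```

In this model, both lists answer every membership query identically, since
each extra entry is already covered by the range that precedes it; the only
cost is that the vector holds twice as many entries and any linear scan over
it does twice the work.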






