[bug #67978] [troff] SEGV in font_info::get_tfont()

2026-03-25 Thread G. Branden Robinson
Follow-up Comment #17, bug #67978 (group groff):

[comment #12 comment #12:]

> But as the "Font Families" node of our Texinfo manual explains, it is
> possible to manipulate oneself into a situation where the current mounting
> position (tracked in register `.t`) is unavailable.

That statement was a thinko.  The currently selected font mounting position is
tracked in register `.f`, not register `.t`.

https://www.gnu.org/software/groff/manual/groff.html.node/Font-Positions.html


___

Reply to this item at:

  

___
Message sent via Savannah
https://savannah.gnu.org/


signature.asc
Description: PGP signature


[bug #67978] [troff] SEGV in font_info::get_tfont()

2026-03-14 Thread G. Branden Robinson
Follow-up Comment #16, bug #67978 (group groff):


[comment #15 comment #15:]
> Thanks for fixing this! It is more complicated than I had imagined.

There are some similar spots that might require similar patch-ups, but as yet
I don't know how to get input to produce a SEGV or assertion failure.  I'll
likely be trickling in patches for these cases over the coming days/weeks.

Thanks for the report!

Also, another improvement spawned off of this one.


commit 97c22ed30d3b624b89f243bb9335519b1dbf8f53
Author: G. Branden Robinson 
Date:   Wed Jan 28 14:48:19 2026 -0600

[troff]: Improve diagnostic message contents.

* src/roff/troff/input.cpp (token::description): Refer to C0 control
  characters (that are valid as GNU troff input) and Basic Latin DEL
  (all unprintable) by their decimal code points, not by writing them
  out literally.

Noted while working on Savannah #67978.
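
As a rough illustration of the change described in the commit message above,
here is a minimal C++ sketch.  The name `describe_char` and the message wording
are hypothetical and are not taken from input.cpp.  The sketch also treats
every C0 control alike, whereas the commit concerns only those that are valid
as GNU troff input.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, NOT groff's actual token::description().
// Describes an input character for use in a diagnostic message.
std::string describe_char(unsigned char c)
{
  // C0 controls (0-31) and DEL (127) are unprintable, so refer to them
  // by decimal code point instead of writing them out literally.
  if (c < 32 || c == 127)
    return "character code " + std::to_string(static_cast<int>(c));
  // Anything else is written out literally, in quotation marks.
  return std::string("character '") + static_cast<char>(c) + "'";
}
```

With this sketch, a BEL character (code 7) would be reported as "character
code 7" rather than being sent raw to the terminal.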






[bug #67978] [troff] SEGV in font_info::get_tfont()

2026-03-14 Thread Bruno Haible
Follow-up Comment #15, bug #67978 (group groff):

Thanks for fixing this! It is more complicated than I had imagined.




[bug #67978] [troff] SEGV in font_info::get_tfont()

2026-03-13 Thread G. Branden Robinson
Update of bug #67978 (group groff):

          Status: In Progress => Fixed
     Open/Closed: Open => Closed
 Planned Release: None => 1.25.0

___

Follow-up Comment #14:


commit c8f533436e357ab576b060cc23c1a523d40571fc
Author: G. Branden Robinson 
AuthorDate: Mon Mar 9 23:02:39 2026 -0500

[groff]: Regression-test Savannah #67978.

* src/roff/groff/tests/do-not-crash-on-backslash-X-if-font-invalid.sh:
  Do it.

* src/roff/groff/groff.am (groff_TESTS): Run test.

commit 40fe18446e794be206e52c8d2ddc3af111020cd0
Author: G. Branden Robinson 
AuthorDate: Thu Mar 5 08:36:51 2026 -0600

[troff]: Fix Savannah #67978.

* src/roff/troff/input.cpp (do_device_extension): Emit error diagnostic
  instead of accessing invalid memory if an attempt is made to output a
  device extension command while the currently selected font is invalid.
  In the future we might not need the current font to be valid to format
  a device extension node, but for now we do because it "dirties", and
  therefore has to subsequently restore, several bits of font-related
  state.  We might be able to make this unnecessary with a parameterized
  extension to the `fl` request.  See Savannah #66187.

Fixes .
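
The general shape of such a guard can be sketched in C++ as follows.  This is
only an illustration under stated assumptions: `font_info` here is a stand-in
type, `is_valid_font` and `emit_device_extension` are hypothetical names, and
the diagnostic text is invented; none of it is copied from the actual commit.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>
#include <vector>

// Simplified stand-in for illustration; NOT groff's real font_info.
struct font_info {
  int get_tfont() const { return 1; } // placeholder for the real lookup
};

// True only if `fontno` indexes a slot that holds an actual font.
bool is_valid_font(const std::vector<font_info *> &font_table, int fontno)
{
  return fontno >= 0
    && static_cast<std::size_t>(fontno) < font_table.size()
    && font_table[static_cast<std::size_t>(fontno)] != nullptr;
}

// Hypothetical analogue of the fix: diagnose and bail out instead of
// indexing the table with an invalid mounting position.
bool emit_device_extension(const std::vector<font_info *> &font_table,
                           int fontno)
{
  if (!is_valid_font(font_table, fontno)) {
    std::cerr << "error: cannot output device extension command:"
                 " current font is invalid\n";
    return false;
  }
  // Safe: the index and the pointer were both checked above.
  (void) font_table[static_cast<std::size_t>(fontno)]->get_tfont();
  return true;
}
```

In the backtrace from the original report, `fontno` was -1 when
`font_table[fontno]` was dereferenced; a check of this kind turns that case
into an error message.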






[bug #67978] [troff] SEGV in font_info::get_tfont()

2026-03-05 Thread G. Branden Robinson
Follow-up Comment #13, bug #67978 (group groff):

[comment #12 comment #12:]
> But as the "Font Families" node of our Texinfo manual explains, it is
> possible to manipulate oneself into a situation where the current mounting
> position (tracked in register `.t`) is unavailable.  I'd hyperlink to that,
> but because today's weekday name ends in "y", www.gnu.org is timing out.

It's back up.  See
[https://www.gnu.org/software/groff/manual/groff.html.node/Font-Families.html
the description of the `sty` request.]




[bug #67978] [troff] SEGV in font_info::get_tfont()

2026-03-05 Thread G. Branden Robinson
Update of bug #67978 (group groff):

          Status: Confirmed => In Progress
     Assigned to: None => gbranden

___

Follow-up Comment #12:

Non-fuzzy reproducer:


$ cat ./ATTIC/67978alt3
.nr BarPos \n[.fp]
.sty \n[.fp] Bar
.fam Foo
.ft \n[BarPos]
.tm .f=\n[.f]
\X'baz'
$ groff --version | head -n 1
GNU groff version 1.24.0
$ groff ./ATTIC/67978alt3
troff:./ATTIC/67978alt3:3: error: no font family named 'Foo' exists
.f=41
groff: error: troff: Segmentation fault (core dumped)


The problem occurs because the `device_extension_node` class constructor
assumes that the current font mounting position is valid--meaning, there's an
"actual font" mounted there.

But that might not be the case: the current font mounting position might be an
abstract style.

It's somewhat difficult to get into that situation because all of the requests
and escape sequences for selecting fonts and styles go to some trouble to make
sure that the current mounting position is a "real" font, not an abstract
style.

But as the "Font Families" node of our Texinfo manual explains, it is possible
to manipulate oneself into a situation where the current mounting position
(tracked in register `.t`) is unavailable.  I'd hyperlink to that, but because
today's weekday name ends in "y", www.gnu.org is timing out.
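
To make the distinction between an "actual font" and an abstract style
concrete, here is a toy C++ model.  It is a sketch only: `mounting_position`
and `resolve_font` are hypothetical names, and the formatter's real resolution
logic is more involved.  The one fact it relies on is that a style is combined
with the current family name to form a font name (family "T" with style "B"
gives "TB").

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Toy model for illustration only; these names do not come from groff.
struct mounting_position {
  std::string name; // a font name, or a style name if is_style is true
  bool is_style;    // true for an abstract style such as "Bar"
};

// Try to resolve position `pos` to the name of an actual font.
// Returns false if nothing is mounted there, or if the position is a
// style and no font named family+style is available.
bool resolve_font(const std::map<int, mounting_position> &positions,
                  const std::set<std::string> &available_fonts,
                  const std::string &family, int pos, std::string &out)
{
  std::map<int, mounting_position>::const_iterator it = positions.find(pos);
  if (it == positions.end())
    return false;                          // nothing mounted here
  std::string font = it->second.is_style
    ? family + it->second.name             // style: combine with family
    : it->second.name;                     // already a real font
  if (available_fonts.count(font) == 0)
    return false;                          // no actual font behind it
  out = font;
  return true;
}
```

In this model, a position holding the style "Bar" resolves to nothing when no
font named family-plus-"Bar" exists, which is roughly the state the reproducer
leaves the formatter in before `\X'baz'` is read.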




[bug #67978] [troff] SEGV in font_info::get_tfont()

2026-02-06 Thread G. Branden Robinson
Update of bug #67978 (group groff):

     Assigned to: gbranden => None

___

Follow-up Comment #11:

Unassigning from self.  This is descheduled for _groff_ 1.24.0 since it is a
long-standing bug, not a regression at any point in the past ten years.




[bug #67978] [troff] SEGV in font_info::get_tfont()

2026-01-28 Thread G. Branden Robinson
Follow-up Comment #10, bug #67978 (group groff):

[comment #6 comment #6:]
> I wouldn't delay the 1.24.0 release for this,

[https://lists.gnu.org/archive/html/groff/2026-01/msg00150.html Collin Funk
has taken a contrary position.]  I'm working on this problem in a private
branch while I wait for a consensus to form.

I did want to note one point:

> * Complete handling of such inputs would take several weeks. When I did input
> fuzzing on the 'xgettext' program, it took me two weeks to fix the various
> findings. And for groff, Ingo Schwarze estimates it to be "at least a month
> of full-time work", see
> https://lists.nongnu.org/archive/html/groff/2019-12/msg00078.html


"Complete handling of such inputs" is _not_ within the scope of this ticket,
unless you want to lobby for it to become thus.

This is a known SEGV.  We can fix it without claiming or implying that we have
fixed all possible SEGVs in GNU _troff_, be they arisen from failed input
validation or otherwise.

(I wavered in my subjunctive construction there.  I was tempted to say, "arise
they from failed input validation", but became uncomfortable because while the
subjunctive mood normally employs the infinitive, I seldom see it with verbs
other than "to be".  So I wussed out and went with a passive subjunctive, which
bothers me for style reasons.)




[bug #67978] [troff] SEGV in font_info::get_tfont()

2026-01-28 Thread G. Branden Robinson
Follow-up Comment #9, bug #67978 (group groff):

[https://lists.gnu.org/archive/html/groff/2026-01/msg00149.html Feedback
solicited.]




[bug #67978] [troff] SEGV in font_info::get_tfont()

2026-01-28 Thread G. Branden Robinson
Update of bug #67978 (group groff):

 Planned Release: 1.24.0 => None

___

Follow-up Comment #8:

Clearing Planned Release.  Comment #6 and comment #7 make a sufficient case
for not busting the C/C++ code freeze.

However, I'll be soliciting feedback from the _groff_ community via the
discussion list, as I'm personally conflicted over this decision, and have
ambivalent/competing biases, a situation that makes objective decision-making
especially difficult.




[bug #67978] [troff] SEGV in font_info::get_tfont()

2026-01-28 Thread G. Branden Robinson
Follow-up Comment #7, bug #67978 (group groff):

Not a new defect.


$ ~/groff-1.23.0/bin/groff -Ww -z ~/src/sed-4.8.tar.xz 2>&1 | tail -n1
/home/branden/groff-1.23.0/bin/groff: error: troff: Segmentation fault (core
dumped)
$ ~/groff-1.22.4/bin/groff -Ww -z ~/src/sed-4.8.tar.xz 2>&1 | tail -n1
/home/branden/groff-1.22.4/bin/groff: troff: Signal 11 (core dumped)
$ ~/groff-1.22.3/bin/groff -Ww -z ~/src/sed-4.8.tar.xz 2>&1 | tail -n1
/home/branden/groff-1.22.3/bin/groff: troff: Signal 11 (core dumped)






[bug #67978] [troff] SEGV in font_info::get_tfont()

2026-01-28 Thread Bruno Haible
Follow-up Comment #6, bug #67978 (group groff):


> Updating Planned Release field.  Planning to bust the C/C++ code freeze for
> this, given that it's a crasher.

I wouldn't delay the 1.24.0 release for this, because

* It's an absurd, unrealistic input.
* Complete handling of such inputs would take several weeks. When I did input
fuzzing on the 'xgettext' program, it took me two weeks to fix the various
findings. And for groff, Ingo Schwarze estimates it to be "at least a month of
full-time work", see
https://lists.nongnu.org/archive/html/groff/2019-12/msg00078.html
* You have 15 pages of NEWS accumulated for this release. Get the new features
out to the users!





[bug #67978] [troff] SEGV in font_info::get_tfont()

2026-01-28 Thread G. Branden Robinson
Follow-up Comment #5, bug #67978 (group groff):

[comment #4 comment #4:]
> Another diagnostic is similar, but not necessarily indicative of a problem.
> 

> troff::517: error: invalid argument '�' to output
> suppression escape sequence


> 
> The argument to the output suppression escape sequence is a single decimal
> digit, not an identifier. ...

No bug here.  In a Latin-1-encoded terminal we see a Latin-1 character...


troff::517: error: invalid argument 'ö' to output suppression
escape sequence


...which is fine.  GNU _troff_ is deliberately not in the business of
transcoding invalid character codes to emit them in diagnostic messages.

In the future, once GNU _troff_ natively accepts UTF-8 input, we might simply
report such codes' integer values, and do so in hexadecimal since that's the
universally employed number base for identifying UTF-8 code points.  This
might be a point worth adding to bug #40720.
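
A minimal C++ sketch of what hexadecimal reporting could look like follows.
The function name `describe_invalid_code` and the output format are
assumptions made for illustration; nothing here reflects an actual or planned
GNU troff interface.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical formatter: report an invalid input code by its integer
// value in hexadecimal rather than echoing the raw byte.
std::string describe_invalid_code(unsigned int code)
{
  char buf[32];
  std::snprintf(buf, sizeof buf, "character code 0x%02X", code);
  return std::string(buf);
}
```

Under this sketch the Latin-1 byte shown above as 'ö' (decimal 246) would be
reported as "character code 0xF6" regardless of the terminal's encoding.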




[bug #67978] [troff] SEGV in font_info::get_tfont()

2026-01-28 Thread G. Branden Robinson
Update of bug #67978 (group groff):

 Planned Release: None => 1.24.0

___

Follow-up Comment #4:

Updating Planned Release field.  Planning to bust the C/C++ code freeze for
this, given that it's a crasher.

Two observations:

= 1 =

The "minimal" chunk of GNU _sed_'s 4.8 distribution archive necessary to
reproduce this is still pretty big.


$ sed -n '3121,4026p' ~/src/sed-4.8.tar.xz | ./build/test-groff -Ww -z 2>&1
/dev/null && echo success
success
$ sed -n '3120,4026p' ~/src/sed-4.8.tar.xz | ./build/test-groff -Ww -z 2>&1 &&
echo success
...
/home/branden/src/GIT/groff/build/groff: error: troff: Segmentation fault
(core dumped)
$ sed -n '3120,4026p' ~/src/sed-4.8.tar.xz | wc -c
247575


We'll need a much smaller reproducing input for an automated regression test
script.

= 2 =

I noticed the following error diagnostics in my UTF-8-encoded terminal
session.


$ sed -n '3121,4026p' ~/src/sed-4.8.tar.xz | ./build/test-groff -Ww -z && echo
success
...
troff::554: error: no font family named '�' exists
...
troff::690: error: invalid base character 'k�C£"f=P9¨`' in
composite character name
...



I thought I had banned C1 controls and Latin-1 supplement characters from use
in identifiers (bug #67734).

So either (a) we're reading from uninitialized memory in these diagnostics,
which is bad--we should be zeroing out these identifiers before populating
them, something I thought I had done reasonably comprehensively already;
and/or (b) this wacked-out input is managing to create GNU _troff_ objects
with identifiers, and the formatter is not preventing injection of banned
character codes into these identifiers.

Another diagnostic is similar, but not necessarily indicative of a problem.


troff::517: error: invalid argument '�' to output
suppression escape sequence


The argument to the output suppression escape sequence is a single decimal
digit, not an identifier.  (Hedge: the syntax of "\O5" permits a file name
specification, which is, I feel, a design wart.
[https://www.gnu.org/software/groff/manual/groff.html.node/Suppressing-Output.html#Suppressing-Output
See the GNU troff manual.]  It, too, should not be allowed to use C1 controls
or Latin-1 Supplement characters in its argument.  We don't yet properly
support accessing non-ASCII file names from GNU _troff_, but bug #65108 plans
to do so.)




[bug #67978] [troff] SEGV in font_info::get_tfont()

2026-01-28 Thread G. Branden Robinson
Follow-up Comment #3, bug #67978 (group groff):

Getting a minimal reproducer is going to require some digging; the problem
appears to be state dependent, and the formatter is sure to be in a pretty
weird state given the absurd input.


$ sed -n '4026p' ~/src/sed-4.8.tar.xz | ./build/test-groff -Ww -z && echo
success
troff::1: error: ignoring invalid numeric expression starting
with character code 225
success






[bug #67978] [troff] SEGV in font_info::get_tfont()

2026-01-28 Thread G. Branden Robinson
Update of bug #67978 (group groff):

          Status: None => Confirmed
     Assigned to: None => gbranden

___

Follow-up Comment #2:


[comment #0 original submission:]
> In the 1990s it was well known that you could crash many programs by
> giving them garbage (a.k.a. "random") input. This technique uncovered
> many bugs that would also occur on valid (but rarely seen) input.
> Later this technique evolved into "fuzzing".

The technique may be even older; I recall Doug McIlroy mentioning it as being
employed at the Bell Labs CSRC.  I believe he said so either on the _groff_
list or Warren Toomey's TUHS list.

I used it myself as recently as November.

https://cgit.git.savannah.gnu.org/cgit/groff.git/commit/?id=158b0d973e6e802558b7a4611fc56c52bd8864e9

Can reproduce:


$ ./build/test-groff -Ww -z ~/src/sed-4.8.tar.xz 2>&1 | tail
troff:/home/branden/src/sed-4.8.tar.xz:4026: error: cannot format glyph: no
current font
troff:/home/branden/src/sed-4.8.tar.xz:4026: error: cannot format glyph: no
current font
troff:/home/branden/src/sed-4.8.tar.xz:4026: error: cannot format glyph: no
current font
troff:/home/branden/src/sed-4.8.tar.xz:4026: error: cannot format glyph: no
current font
troff:/home/branden/src/sed-4.8.tar.xz:4026: error: cannot format glyph: no
current font
troff:/home/branden/src/sed-4.8.tar.xz:4026: error: cannot format glyph: no
current font
troff:/home/branden/src/sed-4.8.tar.xz:4026: error: cannot format glyph: no
current font
troff:/home/branden/src/sed-4.8.tar.xz:4026: error: cannot format glyph: no
current font
troff:/home/branden/src/sed-4.8.tar.xz:4026: error: ignoring invalid numeric
expression starting with character code 225
/home/branden/src/GIT/groff/build/groff: error: troff: Segmentation fault
(core dumped)


Will investigate.




[bug #67978] [troff] SEGV in font_info::get_tfont()

2026-01-28 Thread G. Branden Robinson
Follow-up Comment #1, bug #67978 (group groff):

Adding Bruno to CC list.




[bug #67978] [troff] SEGV in font_info::get_tfont()

2026-01-28 Thread G. Branden Robinson
URL:
  

         Summary: [troff] SEGV in font_info::get_tfont()
           Group: GNU roff
       Submitter: gbranden
       Submitted: Wed 28 Jan 2026 05:29:47 PM UTC
        Category: Core
        Severity: 4 - Important
      Item Group: Crash/Unresponsive
          Status: None
         Privacy: Public
     Assigned to: None
     Open/Closed: Open
 Discussion Lock: Unlocked
 Planned Release: None


___

Follow-up Comments:


---
Date: Wed 28 Jan 2026 05:29:47 PM UTC By: G. Branden Robinson 
[https://lists.gnu.org/archive/html/bug-groff/2026-01/msg00339.html Bruno
Haible reported the following to the bug-groff list.]

***

In the 1990s it was well known that you could crash many programs by
giving them garbage (a.k.a. "random") input. This technique uncovered
many bugs that would also occur on valid (but rarely seen) input.
Later this technique evolved into "fuzzing".

It applies to groff too. See:


$ groff --version
GNU groff version 1.24.0.rc2
...
GNU grops (groff) version 1.24.0.rc2
GNU troff (groff) version 1.24.0.rc2

$ wget https://ftp.gnu.org/gnu/sed/sed-4.8.tar.xz

$ groff -Tpdf sed-4.8.tar.xz  > /dev/null
...
troff:sed-4.8.tar.xz:4026: error: cannot format glyph: no current font
troff:sed-4.8.tar.xz:4026: error: ignoring invalid numeric expression starting
with character code 225
groff: error: troff: Segmentation fault (core dumped)


Oops: troff dumped core! Let's see where:


$ gdb /arch/x86_64-linux-gnu/gnu-inst-groff/1.24.0-rc2/bin/troff 
/var/lib/apport/coredump/core._arch_x86_64-linux-gnu_gnu-inst-groff_1_24_0-rc2_bin_troff.1000.1da7a7b2-b09b-458f-8578-69b57283288c.1409721.176848446

Core was generated by `troff -Tpdf sed-4.8.tar.xz'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x0044704d in font_info::get_tfont (this=0x191, fs=..., height=0, slant=0, fontno=-1) at src/roff/troff/node.cpp:312
312   if (0 /* nullptr */ == last_tfont
(gdb) where
#0  0x0044704d in font_info::get_tfont (this=0x191, fs=..., height=0, slant=0, fontno=-1) at src/roff/troff/node.cpp:312
#1  0x00452377 in device_extension_node::device_extension_node (this=0x14ded180, m=..., b=false) at src/roff/troff/node.cpp:4369
#2  0x0042fcb2 in do_device_extension () at src/roff/troff/input.cpp:6787
#3  0x00423635 in token::next (this=0x4b4b40 ) at src/roff/troff/input.cpp:2675
#4  0x00425d28 in process_input_stack () at src/roff/troff/input.cpp:3556
#5  0x004376d3 in process_input_file (name=0x7ffde98a36d3 "sed-4.8.tar.xz") at src/roff/troff/input.cpp:9731
#6  0x004389ba in main (argc=3, argv=0x7ffde98a2ce8) at src/roff/troff/input.cpp:10103
(gdb) print this
$4 = (font_info * const) 0x191
(gdb) up
#1  0x00452377 in device_extension_node::device_extension_node (this=0x14ded180, m=..., b=false) at src/roff/troff/node.cpp:4369
4369  tf = font_table[fontno]->get_tfont(fs, char_height, char_slant,
(gdb) print font_table
$5 = (font_info **) 0x14cef9c0
(gdb) print fontno
$6 = -1
(gdb) print this
$7 = (device_extension_node * const) 0x14ded180
(gdb) print *this
$8 = { = {_vptr.node = 0x47dc80 , 
next = 0x0, last = 0x0, state = 0x0, push_state = 0x0, 
div_nest_level = 0, is_special = true}, mac = { = 
{ = {_vptr.object = 0x479218 , 
refcount = 0}, }, filename = 0x14d3d490 
"sed-4.8.tar.xz", lineno = 4026, len = 233, is_empty_macro = false, 
is_a_diversion = false, is_a_string = true, p = 0x14de14e0}, tf = 
0x59eeabe12b567b76, gcol = 0x367076eedc716e23, fcol = 0x34673043c6e9ad4c, 
  lacks_command_prefix = false}
(gdb) print curenv
$9 = (environment *) 0x14b01e10
(gdb) print *curenv
$10 = {is_dummy_env = false, prev_line_length = {n = 468000}, line_length = {n = 468000},
  prev_title_length = {n = 468000}, title_length = {n = 468000}, prev_size = {p = 1},
  size = {p = 4000}, requested_size = 4000, prev_requested_size = 1, char_height = 0,
  char_slant = 0, prev_fontno = 24, fontno = 2, prev_family = 0x14dd5170, 
family = 0x14dd1190, space_size = 12, sentence_space_size = 12, 
  adjust_mode = 1, is_filling = true, was_line_interrupted = false, 
was_previous_line_interrupted = 0, centered_line_count = 0, 
  right_aligned_line_count = 0, prev_vertical_spacing = {n = 12000}, 
vertical_spacing = {n = 12000}, prev_post_vertical_spacing = {n = 0}, 
  post_vertical_spacing = {n = 0}, prev_line_spacing = 1, line_spacing = 1, 
prev_indent = {n = 0}, indent = {n = 0}, temporary_indent = {
n = 0}, have_temporary_indent = false, saved_indent = {n = 0}, 
target_text_length = {n = 468000}, pre_underline_fontno = 0, 
  underlined_line_count = 0, underline_spaces = false, input_trap = {s = 0x0},
  input_trap_count = -1, continued_input_trap = false,
  line = 0x14