Issue 185067
Summary [clang] `-fcoverage-mapping` produces corrupt coverage data when `#include` inside a function body resolves via `-isystem`
Labels clang
Assignees
Reporter dotnwat
When a non-system source file textually `#include`s another file **inside a function body**, and that included file is found via `-isystem`, clang produces a corrupt `__llvm_covmap` section. `llvm-cov export` then fails with **"truncated coverage data"**.

The same compilation succeeds when `-isystem` is replaced with `-I`.

## Minimal reproducer

### Setup

```bash
mkdir -p sysinclude

cat > sysinclude/body.c <<'EOF'
result = 42;
EOF

cat > main.c <<'EOF'
int func(void) {
    int result = 0;
#include "body.c"
    return result;
}
EOF

# Build a valid profdata (needed for llvm-cov export)
cat > trivial.c <<'EOF'
int main(void) { return 0; }
EOF
clang -fprofile-instr-generate -fcoverage-mapping -o trivial trivial.c
LLVM_PROFILE_FILE=default.profraw ./trivial
llvm-profdata merge -o default.profdata default.profraw
```

### BUG: `-isystem` produces truncated coverage data

```bash
clang -c -fprofile-instr-generate -fcoverage-mapping -isystem sysinclude -o main.o main.c
llvm-cov export -format=lcov -object main.o -instr-profile default.profdata
```

Output:

```
error: failed to load coverage: 'main.o': truncated coverage data
```

### OK: `-I` works correctly

```bash
clang -c -fprofile-instr-generate -fcoverage-mapping -I sysinclude -o main.o main.c
llvm-cov export -format=lcov -object main.o -instr-profile default.profdata
```

Output:

```
SF:main.c
FN:1,func
...
end_of_record
SF:sysinclude/body.c
...
end_of_record
```

## Analysis

Dumping the `__llvm_covmap` and `__llvm_covfun` sections reveals the cause.

With `-isystem`, `body.c` is treated as a system header and is excluded from the `__llvm_covmap` filename table. However, clang still emits a `__llvm_covfun` record for `func()` whose coverage regions reference source locations in `body.c`. Since `body.c` is not in the filename table, the file-ID reference is out of bounds, and `llvm-cov` reports "truncated coverage data" when it tries to resolve it.

With `-I`, `body.c` is not a system header, so it appears in the filename table and the file-ID references are valid.

### Section comparison

```bash
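# -isystem: 1 covfun record, small filename table (missing body.c)
clang -c -fprofile-instr-generate -fcoverage-mapping -isystem sysinclude -o main_isystem.o main.c
llvm-objdump -h main_isystem.o | grep __llvm_cov
#  __llvm_covfun  <small>
#  __llvm_covmap  <small>

# -I: 1 covfun record, filename table includes body.c
clang -c -fprofile-instr-generate -fcoverage-mapping -I sysinclude -o main_I.o main.c
llvm-objdump -h main_I.o | grep __llvm_cov
#  __llvm_covfun  <small>
#  __llvm_covmap  <larger, includes body.c filename>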
```

## More realistic example

This pattern appears in real-world C codebases. For example, [libpg_query](https://github.com/pganalyze/libpg_query) (PostgreSQL's parser extracted as a library) uses textual `#include` of `.c` files inside function bodies for code generation:

```c
// pg_query_fingerprint.c
int fingerprint(int tag, ...) {
    switch (tag) {
#include "pg_query_fingerprint_conds.c"   // 258 switch cases
    }
}
```

When built with `-isystem` for the include paths (as Bazel does for external dependencies), four `.o` files produce corrupt coverage data, which causes `llvm-cov export` to abort entirely and produce empty coverage reports.

## Environment

- Ubuntu clang 20.1.8 (`apt` package `clang-20` on Ubuntu 25.10)
- Also reproduced with the corresponding `llvm-cov-20` / `llvm-profdata-20`
- Linux x86_64
_______________________________________________
llvm-bugs mailing list
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs