michaelselehov wrote:

## Background

A compressed offload bundle (CCOB) stores one or more code objects in a
zstd/zlib-compressed payload behind a small header. Several places in the tree
read these bundles back, and they all support **multiple concatenated bundles**
in one buffer (e.g. when a linker merges `.hip_fatbin` input sections from
several translation units into one section). To walk that sequence, each reader
needs to know where one bundle ends and the next begins.

## The bug

All three readers locate that boundary by scanning the buffer for the literal
4-byte magic `"CCOB"`:

```cpp
// llvm/lib/Object/OffloadBundle.cpp  (extractOffloadBundle)
// clang/lib/Driver/OffloadBundler.cpp  (ListBundleIDsInFile, UnbundleFiles)
NextBundleStart = Buffer.find("CCOB", 4);
...
MemoryBuffer::getMemBuffer(Buffer.take_front(NextBundleStart), ...); // -> decompress
```

The magic is `0x43 0x43 0x4F 0x42`. Nothing prevents those four bytes from
occurring **by chance inside a compressed payload**. When they do, the scan
stops in the middle of the current bundle, `take_front` truncates it, and the
truncated buffer is handed to the decompressor. zstd then sees a source size
that is smaller than the frame it is told to decode and bails out:

```
Failed to decompress input: Could not decompress embedded file contents: Src size is incorrect
```
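
To make the truncation concrete, here is a minimal standalone sketch of the old scan using `std::string_view` in place of `StringRef`. The helpers `naiveBundleLength` and `makeToyBundle` are hypothetical names for illustration only, and the "bundle" is just the magic followed by opaque bytes, not the real CCOB header.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Mimics the old splitter: the current bundle is assumed to end at the next
// "CCOB" found after the bundle's own 4-byte magic. Returns how many bytes
// would be handed to the decompressor.
std::size_t naiveBundleLength(std::string_view Buffer) {
  std::size_t Next = Buffer.find("CCOB", 4);
  // No further magic: the whole buffer is treated as one bundle.
  return Next == std::string_view::npos ? Buffer.size() : Next;
}

// Hypothetical stand-in for a bundle: magic followed by opaque payload bytes.
std::string makeToyBundle(std::string_view Payload) {
  std::string Bundle = "CCOB";
  Bundle.append(Payload);
  return Bundle;
}
```

With a payload that happens to contain the magic at payload offset 2, `naiveBundleLength` returns 6 for an 11-byte bundle, so the last 5 bytes never reach the decompressor.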

At runtime this path is reached through comgr's `AMD_COMGR_ACTION_UNBUNDLE`, so
the user-visible symptom is `hipErrorInvalidImage` — *"device kernel image is
invalid"* — when loading the affected code object, followed by *"named symbol
not found"* for the kernels it should have contained.

A few properties that made this confusing to triage:

* **The bundle is not corrupt.** Decompressing the full payload with the system
  `zstd` (or the in-tree decompressor) succeeds and yields the expected bytes.
  Only the *outer splitter* truncates the buffer; `decompress()` itself already
  uses the header's `FileSize` to size the frame.
* **It is data-dependent.** Only bundles whose compressed bytes happen to
  contain `"CCOB"` are affected. Two objects built from the same sources with
  slightly different contents can differ in whether they trip the bug, so it
  looks intermittent and "object-specific", and it cannot be traced to a
  single causing commit.
* It became reachable once compressed bundles became the common case; before
  that the scan rarely had enough compressed data to collide with the magic.
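
As a rough back-of-the-envelope estimate of how payload size drives the collision rate, the sketch below assumes the compressed bytes behave like independent, uniformly random bytes. That is only an approximation (real zstd output has structure), and `magicCollisionProbability` is a hypothetical helper, not something in the tree.

```cpp
#include <cmath>
#include <cstddef>

// Probability that NumBytes of payload contain a given 4-byte magic at least
// once, ASSUMING independent, uniformly random bytes (an approximation for
// compressed data, not a measured property).
double magicCollisionProbability(std::size_t NumBytes) {
  if (NumBytes < 4)
    return 0.0;
  // NumBytes - 3 possible start positions, each matching with chance 2^-32.
  double Expected = static_cast<double>(NumBytes - 3) / 4294967296.0;
  // Poisson approximation: P(at least one match) = 1 - exp(-Expected).
  return -std::expm1(-Expected);
}
```

Under that assumption a 1 MiB payload collides with probability of roughly 0.024%, while a 1 GiB payload collides with probability of roughly 22%, which is consistent with the bug surfacing only once large compressed bundles became common.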

## The fix

Stop using the magic to find where a bundle *ends*. The compressed header
already records the bundle's on-disk size (`FileSize`, present for header
versions V2/V3 via `CompressedBundleHeader::tryParse`), which is authoritative.
Use it to delimit the current bundle, and only search for the next bundle's
`"CCOB"` magic *past* the end of the current one:

```cpp
if (std::optional<size_t> Size = getCompressedBundleSize(Buffer)) {
  CurBundleEnd    = *Size;                        // exact end of this bundle
  NextBundleStart = Buffer.find("CCOB", *Size);   // next bundle, if any
} else {
  // Legacy V1 header without a size field: keep the previous behaviour.
  NextBundleStart = Buffer.find("CCOB", 4);
  CurBundleEnd    = NextBundleStart;
}
```

This keeps the multi-bundle iteration intact — searching from `*Size` rather
than from offset 4 also tolerates any alignment padding between concatenated
bundles — while a magic-byte collision inside a payload is no longer able to
truncate anything. The change is applied identically in all three readers:

* `clang/lib/Driver/OffloadBundler.cpp` — `ListBundleIDsInFile`
* `clang/lib/Driver/OffloadBundler.cpp` — `UnbundleFiles`
* `llvm/lib/Object/OffloadBundle.cpp` — `extractOffloadBundle`

## Testing

`clang/test/Driver/clang-offload-bundler-magic-collision.c` is a regression test
for exactly this collision. Manufacturing a payload whose *compressed* bytes
contain `"CCOB"` by tuning the input would be brittle, so the input is built
deterministically instead: a real compressed bundle gets a **zstd skippable
frame** spliced into its compressed region, and that frame's body is the bytes
`"CCOB"`. A skippable frame is defined by the zstd format to be ignored by the
decompressor, so the bundle still decompresses correctly — but the old
`find("CCOB", 4)` scan stops inside it. The committed `.co` and the small Python
generator used to produce it both live under `Inputs/`.
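
For reference, a skippable frame of the kind described above is simple to build by hand. The sketch below is not the committed Python generator; it only shows the frame layout from the zstd format (a little-endian magic in the range `0x184D2A50` to `0x184D2A5F`, a 4-byte little-endian length, then the user data). `makeSkippableFrame` is a hypothetical helper name, and where the generator splices the frame into the bundle is not shown here.

```cpp
#include <cstdint>
#include <string>
#include <string_view>

// Builds a zstd skippable frame around UserData:
//   4-byte little-endian magic (0x184D2A50..0x184D2A5F),
//   4-byte little-endian length of the user data,
//   the user data itself.
std::string makeSkippableFrame(std::string_view UserData) {
  auto AppendLE32 = [](std::string &Out, std::uint32_t V) {
    for (int I = 0; I < 4; ++I)
      Out.push_back(static_cast<char>((V >> (8 * I)) & 0xFF));
  };
  std::string Frame;
  AppendLE32(Frame, 0x184D2A50u);
  AppendLE32(Frame, static_cast<std::uint32_t>(UserData.size()));
  Frame.append(UserData);
  return Frame;
}
```

`makeSkippableFrame("CCOB")` yields 12 bytes: `50 2A 4D 18`, `04 00 00 00`, then `43 43 4F 42`. A conforming decoder skips all 12, while a byte scan for the magic finds it at offset 8 of the frame.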

* Without the fix, `--list`/`--unbundle` on this input fail with
  `Src size is incorrect`.
* With the fix, both succeed and enumerate/extract the targets.

The existing `clang/test/Driver/clang-offload-bundler-multi-compress.c`
(multiple concatenated CCOB blobs) continues to pass, confirming the
multi-bundle path is unaffected.

---

*Assisted-by: Claude Opus (Cursor).*


https://github.com/llvm/llvm-project/pull/205587