Issue 108946
Summary llvm-objcopy does not seem to handle `-O binary` correctly, adds PE header to the output
Labels tools:llvm-objcopy/strip
Assignees
Reporter mgorny
In Gentoo, we use `objcopy` to extract the Linux kernel image from a UKI image.

Reproducer:

```
wget https://distfiles.gentoo.org/distfiles/1d/gentoo-kernel-6.10.9-1.amd64.gpkg.tar
tar -xf gentoo-kernel-6.10.9-1.amd64.gpkg.tar gentoo-kernel-6.10.9-1/image.tar.xz
tar -xf gentoo-kernel-6.10.9-1/image.tar.xz image/usr/src/linux-6.10.9-gentoo-dist/arch/x86/boot/uki.efi
objcopy -O binary -j.linux image/usr/src/linux-6.10.9-gentoo-dist/arch/x86/boot/uki.efi bzImage
```

Comparing the files created by GNU objcopy and LLVM objcopy:
```
-rwxr-xr-x 1 mgorny mgorny  19606512 09-17 11:15 bzImage.gnu
-rwxr-xr-x 1 mgorny mgorny  19607040 09-17 11:15 bzImage.llvm
```
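
The two sizes differ by 19607040 - 19606512 = 528 bytes. A small sketch for recomputing that from the files themselves (the `size_delta` helper name is made up for illustration and is not part of any tool):

```shell
# size_delta A B: print how many bytes larger B is than A.
# Hypothetical helper, only for illustrating the comparison above.
size_delta() {
    a=$(wc -c < "$1")
    b=$(wc -c < "$2")
    echo $((b - a))
}

# Example, using the file names from the listing above:
#   size_delta bzImage.gnu bzImage.llvm
```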

The LLVM output has an additional 512 bytes at the front:

```
00000000  4d 5a 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |MZ..............|
00000010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000030  00 00 00 00 00 00 00 00  00 00 00 00 40 00 00 00  |............@...|
00000040  50 45 00 00 64 86 01 00  de 9a d0 66 00 00 00 00  |PE..d......f....|
00000050  00 00 00 00 f0 00 2e 02  0b 02 00 00 ae 87 00 00  |................|
00000060  00 00 00 00 00 00 00 00  e0 96 00 00 00 10 00 00  |................|
00000070  00 00 f9 4d 01 00 00 00  00 10 00 00 00 02 00 00  |...M............|
00000080  00 00 00 00 00 01 05 00  01 00 01 00 00 00 00 00  |................|
00000090  00 c0 dd 05 00 02 00 00  00 00 00 00 0a 00 60 01  |..............`.|
000000a0  00 00 10 00 00 00 00 00  00 10 00 00 00 00 00 00  |................|
*
000000c0  00 00 00 00 10 00 00 00  00 00 00 00 00 00 00 00  |................|
000000d0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000000e0  00 00 00 00 00 00 00 00  00 28 dd 05 f0 09 00 00  |.........(......|
000000f0  00 10 01 00 84 00 00 00  00 00 00 00 00 00 00 00  |................|
00000100  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000140  00 00 00 00 00 00 00 00  2e 6c 69 6e 75 78 00 00  |.........linux..|
00000150  f0 2b 2b 01 00 90 b2 04  00 2c 2b 01 00 02 00 00  |.++......,+.....|
00000160  00 00 00 00 00 00 00 00  00 00 00 00 20 00 00 40  |............ ..@|
00000170  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
```

And 16 bytes (of padding?) at the end:

```
012b2df0  cc cc cc cc cc cc cc cc  cc cc cc cc cc cc cc cc  |................|
```
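
If the only differences really are the 512-byte header and the 16 trailing bytes, then skipping the header and truncating to the GNU file's length should reproduce the GNU output exactly. A rough sketch of that check follows. The `payload_matches` name is hypothetical, and the 512-byte default is just the offset observed above, not something guaranteed for other inputs:

```shell
# payload_matches LLVM_OUT GNU_OUT [HEADER_BYTES]
# Skip HEADER_BYTES (default 512, the value observed in this report) at the
# front of LLVM_OUT, keep exactly as many bytes as GNU_OUT contains, and
# compare the result with GNU_OUT. Exit status is 0 only if they are identical.
payload_matches() {
    llvm_out=$1
    gnu_out=$2
    header=${3:-512}
    # Arithmetic expansion strips any whitespace that wc may print.
    want=$(( $(wc -c < "$gnu_out") ))
    tail -c +"$((header + 1))" "$llvm_out" | head -c "$want" | cmp -s - "$gnu_out"
}

# Example, using the file names from this report:
#   payload_matches bzImage.llvm bzImage.gnu && echo "payload identical"
```

A non-zero exit status would mean the two outputs differ in more than the leading header and trailing padding.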

In fact, if I run GNU objcopy without `-O binary`, I get roughly the same output as LLVM produces. This leads me to conclude that LLVM objcopy does not implement `-O binary` correctly and instead writes PE output, the same format as the original file.

```
$ file bzImage.*
bzImage.gnu:                  Linux kernel x86 boot executable bzImage, version 6.10.9-gentoo-dist (root@devbox) #1 SMP PREEMPT_DYNAMIC Sun Sep  8 11:45:05 -00 2024, RO-rootFS, swap_dev 0X12, Normal VGA
bzImage.gnu-without-O-binary: PE32+ executable (EFI application) x86-64 (stripped to external PDB), for MS Windows
bzImage.llvm: PE32+ executable (EFI application) x86-64 (stripped to external PDB), for MS Windows
```