On Mon, 2 Jan 2023 at 07:17, Borislav Petkov <b...@alien8.de> wrote: > > On Mon, Jan 02, 2023 at 07:01:50AM +0100, Borislav Petkov wrote: > > On Sat, Dec 31, 2022 at 07:31:21PM -0800, H. Peter Anvin wrote: > > > It would probably be a good idea to add a "maximum physical address for > > > initrd/setup_data/cmdline" field to struct kernel_info, though. It appears > > > right now that those fields are being identity-mapped in the decompressor, > > > and that means that if 48-bit addressing is used, physical memory may > > > extend > > > past the addressable range. > > > > Yeah, we will probably need that too. > > > > Btw, looka here - it can't get any more obvious than that after dumping > > setup_data too: > > > > early console in setup code > > early console in extract_kernel > > input_data: 0x00000000040f92bf > > input_len: 0x0000000000f1c325 > > output: 0x0000000001000000 > > output_len: 0x0000000003c5e7d8 > > kernel_total_size: 0x0000000004428000 > > needed_size: 0x0000000004600000 > > boot_params->hdr.setup_data: 0x00000000010203b0 > > trampoline_32bit: 0x000000000009d000 > > > > Decompressing Linux... Parsing ELF... done. > > Booting the kernel. > > <EOF> > > > > Aligning them vertically: > > > > output: 0x0000000001000000 > > output_len: 0x0000000003c5e7d8 > > kernel_total_size: 0x0000000004428000 > > needed_size: 0x0000000004600000 > > boot_params->hdr.setup_data: 0x00000000010203b0 > > Ok, waait a minute: > > ============ ============ > Field name: pref_address > Type: read (reloc) > Offset/size: 0x258/8 > Protocol: 2.10+ > ============ ============ > > This field, if nonzero, represents a preferred load address for the > kernel. A relocating bootloader should attempt to load at this > address if possible. > > A non-relocatable kernel will unconditionally move itself and to run > at this address. > > so a kernel loader (qemu in this case) already knows where the kernel goes: > > boot_params->hdr.setup_data: 0x0000000001020450 > boot_params->hdr.pref_address: 0x0000000001000000 > ^^^^^^^^^^^^^^^^^ > > now, considering that same kernel loader (qemu) knows how big that kernel is: > > kernel_total_size: 0x0000000004428000 > > should that loader *not* put anything that the kernel will use in the range > > pref_addr + kernel_total_size >
This seems to be related to another issue that was discussed in the context of this change, but affecting EFI boot not legacy BIOS boot [0]. So, in a nutshell, we have the following pieces: - QEMU, which manages a directory of files and other data blobs, and exposes them via its fw_cfg interface. - SeaBIOS, which invokes the fw_cfg interface to load the 'kernel' blob at its preferred address - The boot code in the kernel, which interprets the various fields in the setup header to figure out where the compressed image lives etc So the problem here, which applies to SETUP_DTB as well as SETUP_RNG_SEED, is that the internal file representation of the kernel blob (which does not have an absolute address at this point, it's just a file in the fw_cfg filesystem) is augmented with: 1) setup_data linked-list entries carrying absolute addresses that are assumed to be valid once SeaBIOS loads the file to memory 2) DTB and/or RNG seed blobs appended to the compressed 'kernel' blob, but without updating that file's internal metadata Issue 1) is what broke EFI boot, given that EFI interprets the kernel blob as a PE/COFF image and hands it to the Loadimage() boot service, which has no awareness of boot_params or setup_data and so just ignores it and loads the image at an arbitrary address, resulting in setup_data absolute address values pointing to bogus places. It seems that now, we have another issue 2), where the fw_cfg view of the file size goes out of sync with the compressed image's own view of its size. As a fix for issue 1), we explored another solution, which was to allocate fixed areas in memory for the RNG seed, so that the absolute address added to setup_data is guaranteed to be correct regardless of where the compressed image is loaded, but that was shot down for other reasons, and we ended up enabling this feature only for legacy BIOS boot. But apparently, this approach has other issues so perhaps it is better to revisit that solution again. So instead of appending data to the compressed image and assuming that it will stay in place, create or extend a memory reservation elsewhere, and refer to its absolute address in setup_data. -- Ard. [0] https://lore.kernel.org/all/camj1kxfr6bv4_g0-wctu4fp_icrg060nhjx_j2dbnyifjky...@mail.gmail.com/