On Aug 25, 2025, at 11:49 AM, Brian Mak <[email protected]> wrote: > On Aug 21, 2025, at 8:33 PM, Baoquan He <[email protected]> wrote: > >> Yeah, this is a good catch and great fix. Without this fix, >> kexec_file_load syscall will failed and return '-EINVAL' when >> KEXEC_FILE_NO_CMA is specified just as below code shows. So, for this >> patch, >> >> Acked-by: Baoquan He <[email protected]> > > Hi Baoquan, > > Thanks for the ACK! > >> And, by the way, has the user space kexec-tools got the change merged >> to allow KEXEC_FILE_NO_CMA specified? > > I don't see any recent commits to kexec-tools to support > KEXEC_FILE_NO_CMA. > >>> From: Brian Mak <[email protected]> >>> Subject: x86/kexec: carry forward the boot DTB on kexec >>> Date: Tue, 5 Aug 2025 14:15:27 -0700 >>> >>> Currently, the kexec_file_load syscall on x86 does not support passing a >>> device tree blob to the new kernel. Some embedded x86 systems use device >>> trees. On these systems, failing to pass a device tree to the new kernel >>> causes a boot failure. >>> >>> To add support for this, we copy the behavior of ARM64 and PowerPC and >>> copy the current boot's device tree blob for use in the new kernel. We do >>> this on x86 by passing the device tree blob as a setup_data entry in >>> accordance with the x86 boot protocol. >>> >>> This behavior is gated behind the KEXEC_FILE_FORCE_DTB flag. >>> >>> Link: >>> https://urldefense.com/v3/__https://lkml.kernel.org/r/[email protected]__;!!NEt6yMaO-gk!EbJyF8xO2E51MyYdN3_zqCBBMj0JKoiKoPuG_8vEctQMk9uCyjX0LdSEH_FGkPDV8egxzc7w$ >>> Signed-off-by: Brian Mak <[email protected]> >>> Cc: Alexander Graf <[email protected]> >>> Cc: Baoquan He <[email protected]> >>> Cc: Borislav Betkov <[email protected]> >>> Cc: Dave Young <[email protected]> >>> Cc: "H. Peter Anvin" <[email protected]> >>> Cc: Ingo Molnar <[email protected]> >>> Cc: Rob Herring <[email protected]> >>> Cc: Saravana Kannan <[email protected]> >>> Cc: Thomas Gleinxer <[email protected]> >>> Signed-off-by: Andrew Morton <[email protected]> >>> --- >>> >>> arch/x86/kernel/kexec-bzimage64.c | 47 ++++++++++++++++++++++++++-- >>> include/linux/kexec.h | 5 ++ >>> include/uapi/linux/kexec.h | 4 ++ >>> kernel/kexec_file.c | 1 >>> 4 files changed, 53 insertions(+), 4 deletions(-) >>> >>> --- >>> a/arch/x86/kernel/kexec-bzimage64.c~x86-kexec-carry-forward-the-boot-dtb-on-kexec >>> +++ a/arch/x86/kernel/kexec-bzimage64.c >>> @@ -16,6 +16,8 @@ >>> #include <linux/kexec.h> >>> #include <linux/kernel.h> >>> #include <linux/mm.h> >>> +#include <linux/libfdt.h> >>> +#include <linux/of_fdt.h> >>> #include <linux/efi.h> >>> #include <linux/random.h> >>> >>> @@ -212,6 +214,28 @@ setup_efi_state(struct boot_params *para >>> } >>> #endif /* CONFIG_EFI */ >>> >>> +#ifdef CONFIG_OF_FLATTREE >>> +static void setup_dtb(struct boot_params *params, >>> + unsigned long params_load_addr, >>> + unsigned int dtb_setup_data_offset) >>> +{ >>> + struct setup_data *sd = (void *)params + dtb_setup_data_offset; >>> + unsigned long setup_data_phys, dtb_len; >>> + >>> + dtb_len = fdt_totalsize(initial_boot_params); >>> + sd->type = SETUP_DTB; >>> + sd->len = dtb_len; >>> + >>> + /* Carry over current boot DTB with setup_data */ >>> + memcpy(sd->data, initial_boot_params, dtb_len); >>> + >>> + /* Add setup data */ >>> + setup_data_phys = params_load_addr + dtb_setup_data_offset; >>> + sd->next = params->hdr.setup_data; >>> + params->hdr.setup_data = setup_data_phys; >>> +} >>> +#endif /* CONFIG_OF_FLATTREE */ >>> + >>> static void >>> setup_ima_state(const struct kimage *image, struct boot_params *params, >>> unsigned long params_load_addr, >>> @@ -336,6 +360,17 @@ setup_boot_parameters(struct kimage *ima >>> sizeof(struct efi_setup_data); >>> #endif >>> >>> +#ifdef CONFIG_OF_FLATTREE >>> + if (image->force_dtb && initial_boot_params) { >>> + setup_dtb(params, params_load_addr, setup_data_offset); >>> + setup_data_offset += sizeof(struct setup_data) + >>> + fdt_totalsize(initial_boot_params); >>> + } else { >>> + pr_debug("Not carrying over DTB, force_dtb = %d\n", >>> + image->force_dtb); >>> + } >>> +#endif >>> + >>> if (IS_ENABLED(CONFIG_IMA_KEXEC)) { >>> /* Setup IMA log buffer state */ >>> setup_ima_state(image, params, params_load_addr, >>> @@ -529,6 +564,12 @@ static void *bzImage64_load(struct kimag >>> sizeof(struct setup_data) + >>> RNG_SEED_LENGTH; >>> >>> +#ifdef CONFIG_OF_FLATTREE >>> + if (image->force_dtb && initial_boot_params) >>> + kbuf.bufsz += sizeof(struct setup_data) + >>> + fdt_totalsize(initial_boot_params); >>> +#endif >>> + >>> if (IS_ENABLED(CONFIG_IMA_KEXEC)) >>> kbuf.bufsz += sizeof(struct setup_data) + >>> sizeof(struct ima_setup_data); >>> @@ -537,7 +578,7 @@ static void *bzImage64_load(struct kimag >>> kbuf.bufsz += sizeof(struct setup_data) + >>> sizeof(struct kho_data); >>> >>> - params = kzalloc(kbuf.bufsz, GFP_KERNEL); >>> + params = kvzalloc(kbuf.bufsz, GFP_KERNEL); >> >> Wondering how big the dtb blob is, can you explain a little bit about >> the kvzalloc usage here? >> >> Except of this, I have no other concern about this patch. >> >> And what's your plan about the user space kexec-tool change? > > When I tested this earlier on x86, the DTB was allowed to be up to just > under 64 pages large before the DTB failed to load. This is because it > has to fit into an early_memremap() mapping (relevant code snippet at > the bottom). Since the allocation can be many pages, I changed the > kzalloc to a kvzalloc. > > For the kexec-tools change, I have a draft change that I've already > shared on this thread for testing purposes. I believe you said you were > going to test it, but I haven't heard anything back from that yet. I'll > raise that change for review properly once this kernel commit is in > mainline. > > --------- > > void __init x86_flattree_get_config(void) > { > #ifdef CONFIG_OF_EARLY_FLATTREE > u32 size, map_len; > void *dt; > > if (initial_dtb) { > map_len = max(PAGE_SIZE - (initial_dtb & ~PAGE_MASK), (u64)128); > > dt = early_memremap(initial_dtb, map_len); > size = fdt_totalsize(dt); > if (map_len < size) { > early_memunmap(dt, map_len); > dt = early_memremap(initial_dtb, size); > map_len = size; > } > > early_init_dt_verify(dt, __pa(dt)); > } > > unflatten_and_copy_device_tree(); > > if (initial_dtb) > early_memunmap(dt, map_len); > #endif > if (acpi_disabled && of_have_populated_dt()) > x86_init.mpparse.parse_smp_cfg = x86_dtb_parse_smp_config; > } > > --------- > > Thanks, > Brian
Hi Baoquan, Just wanted to check back on this. Did you have any further concerns? If not, would you be able to provide an ACK? Thanks, Brian
