[PATCH v2] powerpc/kexec_file: Restore FDT size estimation for kdump kernel
Commit 2377c92e37fe ("powerpc/kexec_file: fix FDT size estimation for kdump kernel") fixed how elf64_load() estimates the FDT size needed by the crashdump kernel. At the same time, commit 130b2d59cec0 ("powerpc: Use common of_kexec_alloc_and_setup_fdt()") changed the same code to use the generic function of_kexec_alloc_and_setup_fdt() to calculate the FDT size. That change made the code overestimate it a bit by counting twice the space required for the kernel command line and /chosen properties. Therefore change kexec_fdt_totalsize_ppc64() to calculate just the extra space needed by the kdump kernel, and change the function name so that it better reflects what the function is now doing. Signed-off-by: Thiago Jung Bauermann Reviewed-by: Lakshmi Ramasubramanian --- arch/powerpc/include/asm/kexec.h | 2 +- arch/powerpc/kexec/elf_64.c | 2 +- arch/powerpc/kexec/file_load_64.c | 26 -- 3 files changed, 10 insertions(+), 20 deletions(-) Applies on top of next-20210219. Changes since v1: - Adjusted comment describing kexec_extra_fdt_size_ppc64() as suggested by Lakshmi. diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h index baab158e215c..5a11cc8d2350 100644 --- a/arch/powerpc/include/asm/kexec.h +++ b/arch/powerpc/include/asm/kexec.h @@ -128,7 +128,7 @@ int load_crashdump_segments_ppc64(struct kimage *image, int setup_purgatory_ppc64(struct kimage *image, const void *slave_code, const void *fdt, unsigned long kernel_load_addr, unsigned long fdt_load_addr); -unsigned int kexec_fdt_totalsize_ppc64(struct kimage *image); +unsigned int kexec_extra_fdt_size_ppc64(struct kimage *image); int setup_new_fdt_ppc64(const struct kimage *image, void *fdt, unsigned long initrd_load_addr, unsigned long initrd_len, const char *cmdline); diff --git a/arch/powerpc/kexec/elf_64.c b/arch/powerpc/kexec/elf_64.c index 0492ca6003f3..5a569bb51349 100644 --- a/arch/powerpc/kexec/elf_64.c +++ b/arch/powerpc/kexec/elf_64.c @@ -104,7 +104,7 @@ static void *elf64_load(struct kimage *image, char *kernel_buf, fdt = of_kexec_alloc_and_setup_fdt(image, initrd_load_addr, initrd_len, cmdline, - kexec_fdt_totalsize_ppc64(image)); + kexec_extra_fdt_size_ppc64(image)); if (!fdt) { pr_err("Error setting up the new device tree.\n"); ret = -EINVAL; diff --git a/arch/powerpc/kexec/file_load_64.c b/arch/powerpc/kexec/file_load_64.c index 3609de30a170..297f73795a1f 100644 --- a/arch/powerpc/kexec/file_load_64.c +++ b/arch/powerpc/kexec/file_load_64.c @@ -927,37 +927,27 @@ int setup_purgatory_ppc64(struct kimage *image, const void *slave_code, } /** - * kexec_fdt_totalsize_ppc64 - Return the estimated size needed to setup FDT - * for kexec/kdump kernel. - * @image: kexec image being loaded. + * kexec_extra_fdt_size_ppc64 - Return the estimated additional size needed to + * setup FDT for kexec/kdump kernel. + * @image: kexec image being loaded. * - * Returns the estimated size needed for kexec/kdump kernel FDT. + * Returns the estimated extra size needed for kexec/kdump kernel FDT. */ -unsigned int kexec_fdt_totalsize_ppc64(struct kimage *image) +unsigned int kexec_extra_fdt_size_ppc64(struct kimage *image) { - unsigned int fdt_size; u64 usm_entries; - /* -* The below estimate more than accounts for a typical kexec case where -* the additional space is to accommodate things like kexec cmdline, -* chosen node with properties for initrd start & end addresses and -* a property to indicate kexec boot.. -*/ - fdt_size = fdt_totalsize(initial_boot_params) + (2 * COMMAND_LINE_SIZE); if (image->type != KEXEC_TYPE_CRASH) - return fdt_size; + return 0; /* -* For kdump kernel, also account for linux,usable-memory and +* For kdump kernel, account for linux,usable-memory and * linux,drconf-usable-memory properties. Get an approximate on the * number of usable memory entries and use for FDT size estimation. */ usm_entries = ((memblock_end_of_DRAM() / drmem_lmb_size()) + (2 * (resource_size(_res) / drmem_lmb_size(; - fdt_size += (unsigned int)(usm_entries * sizeof(u64)); - - return fdt_size; + return (unsigned int)(usm_entries * sizeof(u64)); } /** ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH] powerpc/kexec_file: Restore FDT size estimation for kdump kernel
Lakshmi Ramasubramanian writes: > On 2/19/21 6:25 AM, Thiago Jung Bauermann wrote: > > One small nit in the function header (please see below), but otherwise the > change looks good. > > Reviewed-by: Lakshmi Ramasubramanian Thanks for your review. I incorporated your suggestion and will send v2 shortly. >> --- a/arch/powerpc/kexec/file_load_64.c >> +++ b/arch/powerpc/kexec/file_load_64.c >> @@ -927,37 +927,27 @@ int setup_purgatory_ppc64(struct kimage *image, const >> void *slave_code, >> } >> /** >> - * kexec_fdt_totalsize_ppc64 - Return the estimated size needed to setup FDT >> - * for kexec/kdump kernel. >> - * @image: kexec image being loaded. >> + * kexec_extra_fdt_size_ppc63 - Return the estimated size needed to setup >> FDT > > Perhaps change to > > "Return the estimated additional size needed to setup FDT for kexec/kdump > kernel"? That's better indeed. I also hadn't noticed that I changed ppc64 to ppc63. Fixed as well. -- Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
[PATCH] powerpc/kexec_file: Restore FDT size estimation for kdump kernel
Commit 2377c92e37fe ("powerpc/kexec_file: fix FDT size estimation for kdump kernel") fixed how elf64_load() estimates the FDT size needed by the crashdump kernel. At the same time, commit 130b2d59cec0 ("powerpc: Use common of_kexec_alloc_and_setup_fdt()") changed the same code to use the generic function of_kexec_alloc_and_setup_fdt() to calculate the FDT size. That change made the code overestimate it a bit by counting twice the space required for the kernel command line and /chosen properties. Therefore change kexec_fdt_totalsize_ppc64() to calculate just the extra space needed by the kdump kernel, and change the function name so that it better reflects what the function is now doing. Signed-off-by: Thiago Jung Bauermann --- arch/powerpc/include/asm/kexec.h | 2 +- arch/powerpc/kexec/elf_64.c | 2 +- arch/powerpc/kexec/file_load_64.c | 26 -- 3 files changed, 10 insertions(+), 20 deletions(-) Applies on top of next-20210219. diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h index baab158e215c..5a11cc8d2350 100644 --- a/arch/powerpc/include/asm/kexec.h +++ b/arch/powerpc/include/asm/kexec.h @@ -128,7 +128,7 @@ int load_crashdump_segments_ppc64(struct kimage *image, int setup_purgatory_ppc64(struct kimage *image, const void *slave_code, const void *fdt, unsigned long kernel_load_addr, unsigned long fdt_load_addr); -unsigned int kexec_fdt_totalsize_ppc64(struct kimage *image); +unsigned int kexec_extra_fdt_size_ppc64(struct kimage *image); int setup_new_fdt_ppc64(const struct kimage *image, void *fdt, unsigned long initrd_load_addr, unsigned long initrd_len, const char *cmdline); diff --git a/arch/powerpc/kexec/elf_64.c b/arch/powerpc/kexec/elf_64.c index 0492ca6003f3..5a569bb51349 100644 --- a/arch/powerpc/kexec/elf_64.c +++ b/arch/powerpc/kexec/elf_64.c @@ -104,7 +104,7 @@ static void *elf64_load(struct kimage *image, char *kernel_buf, fdt = of_kexec_alloc_and_setup_fdt(image, initrd_load_addr, initrd_len, cmdline, - kexec_fdt_totalsize_ppc64(image)); + kexec_extra_fdt_size_ppc64(image)); if (!fdt) { pr_err("Error setting up the new device tree.\n"); ret = -EINVAL; diff --git a/arch/powerpc/kexec/file_load_64.c b/arch/powerpc/kexec/file_load_64.c index 3609de30a170..8541ba731908 100644 --- a/arch/powerpc/kexec/file_load_64.c +++ b/arch/powerpc/kexec/file_load_64.c @@ -927,37 +927,27 @@ int setup_purgatory_ppc64(struct kimage *image, const void *slave_code, } /** - * kexec_fdt_totalsize_ppc64 - Return the estimated size needed to setup FDT - * for kexec/kdump kernel. - * @image: kexec image being loaded. + * kexec_extra_fdt_size_ppc63 - Return the estimated size needed to setup FDT + * for kexec/kdump kernel. + * @image: kexec image being loaded. * - * Returns the estimated size needed for kexec/kdump kernel FDT. + * Returns the estimated extra size needed for kexec/kdump kernel FDT. */ -unsigned int kexec_fdt_totalsize_ppc64(struct kimage *image) +unsigned int kexec_extra_fdt_size_ppc64(struct kimage *image) { - unsigned int fdt_size; u64 usm_entries; - /* -* The below estimate more than accounts for a typical kexec case where -* the additional space is to accommodate things like kexec cmdline, -* chosen node with properties for initrd start & end addresses and -* a property to indicate kexec boot.. -*/ - fdt_size = fdt_totalsize(initial_boot_params) + (2 * COMMAND_LINE_SIZE); if (image->type != KEXEC_TYPE_CRASH) - return fdt_size; + return 0; /* -* For kdump kernel, also account for linux,usable-memory and +* For kdump kernel, account for linux,usable-memory and * linux,drconf-usable-memory properties. Get an approximate on the * number of usable memory entries and use for FDT size estimation. */ usm_entries = ((memblock_end_of_DRAM() / drmem_lmb_size()) + (2 * (resource_size(_res) / drmem_lmb_size(; - fdt_size += (unsigned int)(usm_entries * sizeof(u64)); - - return fdt_size; + return (unsigned int)(usm_entries * sizeof(u64)); } /** ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [RESEND PATCH v5 08/11] ppc64/kexec_file: setup backup region for kdump kernel
Hari Bathini writes: > Though kdump kernel boots from loaded address, the first 64KB of it is > copied down to real 0. So, setup a backup region and let purgatory > copy the first 64KB of crashed kernel into this backup region before > booting into kdump kernel. Update reserve map with backup region and > crashed kernel's memory to avoid kdump kernel from accidentially using > that memory. > > Signed-off-by: Hari Bathini Reviewed-by: Thiago Jung Bauermann -- Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [RESEND PATCH v5 07/11] ppc64/kexec_file: enable early kernel's OPAL calls
Hari Bathini writes: > Kernel built with CONFIG_PPC_EARLY_DEBUG_OPAL enabled expects r8 & r9 > to be filled with OPAL base & entry addresses respectively. Setting > these registers allows the kernel to perform OPAL calls before the > device tree is parsed. > > Signed-off-by: Hari Bathini Reviewed-by: Thiago Jung Bauermann -- Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [RESEND PATCH v5 06/11] ppc64/kexec_file: restrict memory usage of kdump kernel
Hari Bathini writes: > Kdump kernel, used for capturing the kernel core image, is supposed > to use only specific memory regions to avoid corrupting the image to > be captured. The regions are crashkernel range - the memory reserved > explicitly for kdump kernel, memory used for the tce-table, the OPAL > region and RTAS region as applicable. Restrict kdump kernel memory > to use only these regions by setting up usable-memory DT property. > Also, tell the kdump kernel to run at the loaded address by setting > the magic word at 0x5c. > > Signed-off-by: Hari Bathini > Tested-by: Pingfan Liu I liked the new versions of get_node_path_size() and get_node_path(). Reviewed-by: Thiago Jung Bauermann -- Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v4 06/12] ppc64/kexec_file: restrict memory usage of kdump kernel
Hari Bathini writes: > On 24/07/20 5:36 am, Thiago Jung Bauermann wrote: >> >> Hari Bathini writes: >> >>> Kdump kernel, used for capturing the kernel core image, is supposed >>> to use only specific memory regions to avoid corrupting the image to >>> be captured. The regions are crashkernel range - the memory reserved >>> explicitly for kdump kernel, memory used for the tce-table, the OPAL >>> region and RTAS region as applicable. Restrict kdump kernel memory >>> to use only these regions by setting up usable-memory DT property. >>> Also, tell the kdump kernel to run at the loaded address by setting >>> the magic word at 0x5c. >>> >>> Signed-off-by: Hari Bathini >>> Tested-by: Pingfan Liu >>> --- >>> >>> v3 -> v4: >>> * Updated get_node_path() to be an iterative function instead of a >>> recursive one. >>> * Added comment explaining why low memory is added to kdump kernel's >>> usable memory ranges though it doesn't fall in crashkernel region. >>> * For correctness, added fdt_add_mem_rsv() for the low memory being >>> added to kdump kernel's usable memory ranges. >> >> Good idea. >> >>> * Fixed prop pointer update in add_usable_mem_property() and changed >>> duple to tuple as suggested by Thiago. >> >> >> >>> +/** >>> + * get_node_pathlen - Get the full path length of the given node. >>> + * @dn: Node. >>> + * >>> + * Also, counts '/' at the end of the path. >>> + * For example, /memory@0 will be "/memory@0/\0" => 11 bytes. >> >> Wouldn't this function return 10 in the case of /memory@0? > > Actually, it does return 11. +1 while returning is for counting %NUL. > On top of that we count an extra '/' for root node.. so, it ends up as 11. > ('/'memory@0'/''\0'). Note the extra '/' before '\0'. Let me handle root node > separately. That should avoid the confusion. Ah, that is true. I forgot to count the iteration for the root node. Sorry about that. -- Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v4 10/12] ppc64/kexec_file: prepare elfcore header for crashing kernel
Hari Bathini writes: > Prepare elf headers for the crashing kernel's core file using > crash_prepare_elf64_headers() and pass on this info to kdump > kernel by updating its command line with elfcorehdr parameter. > Also, add elfcorehdr location to reserve map to avoid it from > being stomped on while booting. > > Signed-off-by: Hari Bathini > Tested-by: Pingfan Liu Reviewed-by: Thiago Jung Bauermann -- Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v4 09/12] ppc64/kexec_file: setup backup region for kdump kernel
Hari Bathini writes: > Though kdump kernel boots from loaded address, the first 64K bytes > of it is copied down to real 0. So, setup a backup region to copy > the first 64K bytes of crashed kernel, in purgatory, before booting > into kdump kernel. Also, update reserve map with backup region and > crashed kernel's memory to avoid kdump kernel from accidentially > using that memory. > > Reported-by: kernel test robot > [lkp: In v1, purgatory() declaration was missing] > Signed-off-by: Hari Bathini Reviewed-by: Thiago Jung Bauermann Just one minor comment below: > @@ -1047,13 +1120,26 @@ int setup_new_fdt_ppc64(const struct kimage *image, > void *fdt, > goto out; > } > > - /* Ensure we don't touch crashed kernel's memory */ > - ret = fdt_add_mem_rsv(fdt, 0, crashk_res.start); > + /* > + * Ensure we don't touch crashed kernel's memory except the > + * first 64K of RAM, which will be backed up. > + */ > + ret = fdt_add_mem_rsv(fdt, BACKUP_SRC_SIZE, I know BACKUP_SRC_START is 0, but please forgive my pedantry when I say that I think it's clearer if the start address above is changed to BACKUP_SRC_START + BACKUP_SRC_SIZE... > + crashk_res.start - BACKUP_SRC_SIZE); > if (ret) { > pr_err("Error reserving crash memory: %s\n", > fdt_strerror(ret)); > goto out; > } > + > + /* Ensure backup region is not used by kdump/capture kernel */ > + ret = fdt_add_mem_rsv(fdt, image->arch.backup_start, > + BACKUP_SRC_SIZE); > + if (ret) { > + pr_err("Error reserving memory for backup: %s\n", > + fdt_strerror(ret)); > + goto out; > + } > } > > out: -- Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v4 06/12] ppc64/kexec_file: restrict memory usage of kdump kernel
Hari Bathini writes: > Kdump kernel, used for capturing the kernel core image, is supposed > to use only specific memory regions to avoid corrupting the image to > be captured. The regions are crashkernel range - the memory reserved > explicitly for kdump kernel, memory used for the tce-table, the OPAL > region and RTAS region as applicable. Restrict kdump kernel memory > to use only these regions by setting up usable-memory DT property. > Also, tell the kdump kernel to run at the loaded address by setting > the magic word at 0x5c. > > Signed-off-by: Hari Bathini > Tested-by: Pingfan Liu > --- > > v3 -> v4: > * Updated get_node_path() to be an iterative function instead of a > recursive one. > * Added comment explaining why low memory is added to kdump kernel's > usable memory ranges though it doesn't fall in crashkernel region. > * For correctness, added fdt_add_mem_rsv() for the low memory being > added to kdump kernel's usable memory ranges. Good idea. > * Fixed prop pointer update in add_usable_mem_property() and changed > duple to tuple as suggested by Thiago. > +/** > + * get_node_pathlen - Get the full path length of the given node. > + * @dn: Node. > + * > + * Also, counts '/' at the end of the path. > + * For example, /memory@0 will be "/memory@0/\0" => 11 bytes. Wouldn't this function return 10 in the case of /memory@0? Are you saying that it should count the \0 at the end too? it's not doing that, AFAICS. > + * > + * Returns the string length of the node's full path. > + */ Maybe it's me (by analogy with strlen()), but I would expect "string length" to not include the terminating \0. I suggest renaming the function to something like get_node_path_size() and do s/length/size/ in the comment above if it's supposed to count the terminating \0. > +static int get_node_pathlen(struct device_node *dn) > +{ > + int len = 0; > + > + if (!dn) > + return 0; > + > + while (dn) { > + len += strlen(dn->full_name) + 1; > + dn = dn->parent; > + } > + > + return len + 1; > +} > + > +/** > + * get_node_path - Get the full path of the given node. > + * @node: Device node. > + * > + * Allocates buffer for node path. The caller must free the buffer > + * after use. > + * > + * Returns buffer with path on success, NULL otherwise. > + */ > +static char *get_node_path(struct device_node *node) > +{ > + struct device_node *dn; > + int len, idx, nlen; > + char *path = NULL; > + char end_char; > + > + if (!node) > + goto err; > + > + /* > + * Get the path length first and use it to iteratively build the path > + * from node to root. > + */ > + len = get_node_pathlen(node); > + > + /* Allocate memory for node path */ > + path = kzalloc(ALIGN(len, 8), GFP_KERNEL); > + if (!path) > + goto err; > + > + /* > + * Iteratively update path from node to root by decrementing > + * index appropriately. > + * > + * Also, add %NUL at the end of node & '/' at the end of all its > + * parent nodes. > + */ > + dn = node; > + path[0] = '/'; > + idx = len - 1; Here, idx is pointing to the supposed '/' at the end of the node path ... > + end_char = '\0'; > + while (dn->parent) { > + path[--idx] = end_char; .. and in the first ireation, this is writing '\0' at a place which will be overwritten by the memcpy() below with the last character of dn->full_name. You need to start idx with len, not len - 1. > + end_char = '/'; > + > + nlen = strlen(dn->full_name); > + idx -= nlen; > + memcpy(path + idx, dn->full_name, nlen); > + > + dn = dn->parent; > + } > + > + return path; > +err: > + kfree(path); > + return NULL; > +} -- Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v4 04/12] ppc64/kexec_file: avoid stomping memory used by special regions
Hari Bathini writes: > crashkernel region could have an overlap with special memory regions > like opal, rtas, tce-table & such. These regions are referred to as > exclude memory ranges. Setup this ranges during image probe in order > to avoid them while finding the buffer for different kdump segments. > Override arch_kexec_locate_mem_hole() to locate a memory hole taking > these ranges into account. > > Signed-off-by: Hari Bathini Reviewed-by: Thiago Jung Bauermann -- Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v4 03/12] powerpc/kexec_file: add helper functions for getting memory ranges
Hari Bathini writes: > In kexec case, the kernel to be loaded uses the same memory layout as > the running kernel. So, passing on the DT of the running kernel would > be good enough. > > But in case of kdump, different memory ranges are needed to manage > loading the kdump kernel, booting into it and exporting the elfcore > of the crashing kernel. The ranges are exclude memory ranges, usable > memory ranges, reserved memory ranges and crash memory ranges. > > Exclude memory ranges specify the list of memory ranges to avoid while > loading kdump segments. Usable memory ranges list the memory ranges > that could be used for booting kdump kernel. Reserved memory ranges > list the memory regions for the loading kernel's reserve map. Crash > memory ranges list the memory ranges to be exported as the crashing > kernel's elfcore. > > Add helper functions for setting up the above mentioned memory ranges. > This helpers facilitate in understanding the subsequent changes better > and make it easy to setup the different memory ranges listed above, as > and when appropriate. > > Signed-off-by: Hari Bathini > Tested-by: Pingfan Liu Just one comment below, but regardless: Reviewed-by: Thiago Jung Bauermann > +/** > + * add_htab_mem_range - Adds htab range to the given memory ranges list, > + * if it exists > + * @mem_ranges: Range list to add the memory range to. > + * > + * Returns 0 on success, negative errno on error. > + */ > +int add_htab_mem_range(struct crash_mem **mem_ranges) > +{ > + if (!htab_address) > + return 0; > + > + return add_mem_range(mem_ranges, __pa(htab_address), htab_size_bytes); > +} I believe you need to surround this function with `#ifdef CONFIG_PPC_BOOK3S_64` and `#endif` to match what is done in . -- Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v3 07/12] ppc64/kexec_file: add support to relocate purgatory
Hari Bathini writes: > On 16/07/20 5:50 am, Thiago Jung Bauermann wrote: >> >> Hari Bathini writes: >> >>> So, add support to relocate purgatory in kexec_file_load system call >>> by setting up TOC pointer and applying RELA relocations as needed. >> >> If we do want to use a C purgatory, Michael Ellerman had suggested >> building it as a Position Independent Executable, which greatly reduces >> the number and types of relocations that are needed. See patches 4 and 9 >> here: >> >> https://lore.kernel.org/linuxppc-dev/1478748449-3894-1-git-send-email-bauer...@linux.vnet.ibm.com/ >> >> In the series above I hadn't converted x86 to PIE. If I had done that, >> possibly Dave Young's opinion would have been different. :-) >> >> If that's still not desirable, he suggested in that discussion lifting >> some code from x86 to generic code, which I implemented and would >> simplify this patch as well: >> >> https://lore.kernel.org/linuxppc-dev/5009580.5GxAkTrMYA@morokweng/ >> > > Agreed. But I prefer to work on PIE and/or moving common relocation_add code > for x86 & s390 to generic code later when I try to build on these purgatory > changes. So, a separate series later to rework purgatory with the things you > mentioned above sounds ok? Sounds ok to me. Let's see what the maintainers think, then. -- Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v3 09/12] ppc64/kexec_file: setup backup region for kdump kernel
Hari Bathini writes: > On 16/07/20 7:08 am, Thiago Jung Bauermann wrote: >> >> Hari Bathini writes: >> >>> @@ -968,7 +1040,7 @@ int setup_new_fdt_ppc64(const struct kimage *image, >>> void *fdt, >>> >>> /* >>> * Restrict memory usage for kdump kernel by setting up >>> -* usable memory ranges. >>> +* usable memory ranges and memory reserve map. >>> */ >>> if (image->type == KEXEC_TYPE_CRASH) { >>> ret = get_usable_memory_ranges(); >>> @@ -980,6 +1052,24 @@ int setup_new_fdt_ppc64(const struct kimage *image, >>> void *fdt, >>> pr_err("Error setting up usable-memory property for >>> kdump kernel\n"); >>> goto out; >>> } >>> + >>> + ret = fdt_add_mem_rsv(fdt, BACKUP_SRC_START + BACKUP_SRC_SIZE, >>> + crashk_res.start - BACKUP_SRC_SIZE); >> >> I believe this answers my question from the other email about how the >> crashkernel is prevented from stomping in the crashed kernel's memory, >> right? I needed to think for a bit to understand what the above >> reservation was protecting. I think it's worth adding a comment. > > Right. The reason to add it in the first place is, prom presses the panic > button if > it can't find low memory. Marking it reserved seems to keep it quiet though. > so.. > > Will add comment mentioning that.. Ah, makes sense. Thanks for the explanation. >>> +void purgatory(void) >>> +{ >>> + void *dest, *src; >>> + >>> + src = (void *)BACKUP_SRC_START; >>> + if (backup_start) { >>> + dest = (void *)backup_start; >>> + __memcpy(dest, src, BACKUP_SRC_SIZE); >>> + } >>> +} >> >> In general I'm in favor of using C code over assembly, but having to >> bring in that relocation support just for the above makes me wonder if >> it's worth it in this case. > > I am planning to build on purgatory later with "I'm in purgatory" print > support > for pseries at least and also, sha256 digest check. Ok. In that case, my preference would be to convert both the powerpc and x86 purgatories to PIE since this greatly reduces the types of relocations that are emitted, but better ask Dave Young what he thinks before going down that route. -- Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v3 06/12] ppc64/kexec_file: restrict memory usage of kdump kernel
Hari Bathini writes: > On 16/07/20 4:22 am, Thiago Jung Bauermann wrote: >> >> Hari Bathini writes: >> > > > >>> +/** >>> + * get_node_path - Get the full path of the given node. >>> + * @dn:Node. >>> + * @path: Updated with the full path of the node. >>> + * >>> + * Returns nothing. >>> + */ >>> +static void get_node_path(struct device_node *dn, char *path) >>> +{ >>> + if (!dn) >>> + return; >>> + >>> + get_node_path(dn->parent, path); >> >> Is it ok to do recursion in the kernel? In this case I believe it's not >> problematic since the maximum call depth will be the maximum depth of a >> device tree node which shouldn't be too much. Also, there are no local >> variables in this function. But I thought it was worth mentioning. > > You are right. We are better off avoiding the recursion here. Will > change it to an iterative version instead. Ok. >>> +* each representing a memory range. >>> +*/ >>> + ranges = (len >> 2) / (n_mem_addr_cells + n_mem_size_cells); >>> + >>> + for (i = 0; i < ranges; i++) { >>> + base = of_read_number(prop, n_mem_addr_cells); >>> + prop += n_mem_addr_cells; >>> + end = base + of_read_number(prop, n_mem_size_cells) - 1; > > prop is not used after the above. > >> You need to `prop += n_mem_size_cells` here. > > But yeah, adding it would make it look complete in some sense.. Isn't it used in the next iteration of the loop? -- Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v3 05/12] powerpc/drmem: make lmb walk a bit more flexible
Hari Bathini writes: > On 15/07/20 9:20 am, Thiago Jung Bauermann wrote: >> >> Hari Bathini writes: >> >>> @@ -534,7 +537,7 @@ static int __init >>> early_init_dt_scan_memory_ppc(unsigned long node, >>> #ifdef CONFIG_PPC_PSERIES >>> if (depth == 1 && >>> strcmp(uname, "ibm,dynamic-reconfiguration-memory") == 0) { >>> - walk_drmem_lmbs_early(node, early_init_drmem_lmb); >>> + walk_drmem_lmbs_early(node, NULL, early_init_drmem_lmb); >> >> walk_drmem_lmbs_early() can now fail. Should this failure be propagated >> as a return value of early_init_dt_scan_memory_ppc()? > >> >>> return 0; >>> } >>> #endif >> >> >>> @@ -787,7 +790,7 @@ static int __init parse_numa_properties(void) >>> */ >>> memory = of_find_node_by_path("/ibm,dynamic-reconfiguration-memory"); >>> if (memory) { >>> - walk_drmem_lmbs(memory, numa_setup_drmem_lmb); >>> + walk_drmem_lmbs(memory, NULL, numa_setup_drmem_lmb); >> >> Similarly here. Now that this call can fail, should >> parse_numa_properties() handle or propagate the failure? > > They would still not fail unless the callbacks early_init_drmem_lmb() & > numa_setup_drmem_lmb() > are updated to have failure scenarios. Also, these call sites always ignored > failure scenarios > even before walk_drmem_lmbs() was introduced. So, I prefer to keep them the > way they are? Ok, makes sense. In this case: Reviewed-by: Thiago Jung Bauermann -- Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v3 04/12] ppc64/kexec_file: avoid stomping memory used by special regions
Hari Bathini writes: > On 15/07/20 8:09 am, Thiago Jung Bauermann wrote: >> >> Hari Bathini writes: >> > > > >>> +/** >>> + * __locate_mem_hole_top_down - Looks top down for a large enough memory >>> hole >>> + * in the memory regions between buf_min & >>> buf_max >>> + * for the buffer. If found, sets kbuf->mem. >>> + * @kbuf: Buffer contents and memory parameters. >>> + * @buf_min:Minimum address for the buffer. >>> + * @buf_max:Maximum address for the buffer. >>> + * >>> + * Returns 0 on success, negative errno on error. >>> + */ >>> +static int __locate_mem_hole_top_down(struct kexec_buf *kbuf, >>> + u64 buf_min, u64 buf_max) >>> +{ >>> + int ret = -EADDRNOTAVAIL; >>> + phys_addr_t start, end; >>> + u64 i; >>> + >>> + for_each_mem_range_rev(i, , NULL, NUMA_NO_NODE, >>> + MEMBLOCK_NONE, , , NULL) { >>> + if (start > buf_max) >>> + continue; >>> + >>> + /* Memory hole not found */ >>> + if (end < buf_min) >>> + break; >>> + >>> + /* Adjust memory region based on the given range */ >>> + if (start < buf_min) >>> + start = buf_min; >>> + if (end > buf_max) >>> + end = buf_max; >>> + >>> + start = ALIGN(start, kbuf->buf_align); >>> + if (start < end && (end - start + 1) >= kbuf->memsz) { >> >> This is why I dislike using start and end to express address ranges: >> >> While struct resource seems to use the [address, end] convention, my > > struct crash_mem also uses [address, end] convention. > This off-by-one error did not cause any issues as the hole start and size we > try to find > are at least page aligned. > > Nonetheless, I think fixing 'end' early in the loop with "end -= 1" would > ensure > correctness while continuing to use the same convention for structs crash_mem > & resource. Sounds good. -- Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v3 10/12] ppc64/kexec_file: prepare elfcore header for crashing kernel
Hari Bathini writes: > On 16/07/20 7:52 am, Thiago Jung Bauermann wrote: >> >> Hari Bathini writes: >> >>> /** >>> + * get_crash_memory_ranges - Get crash memory ranges. This list includes >>> + * first/crashing kernel's memory regions that >>> + * would be exported via an elfcore. >>> + * @mem_ranges: Range list to add the memory ranges to. >>> + * >>> + * Returns 0 on success, negative errno on error. >>> + */ >>> +static int get_crash_memory_ranges(struct crash_mem **mem_ranges) >>> +{ >>> + struct memblock_region *reg; >>> + struct crash_mem *tmem; >>> + int ret; >>> + >>> + for_each_memblock(memory, reg) { >>> + u64 base, size; >>> + >>> + base = (u64)reg->base; >>> + size = (u64)reg->size; >>> + >>> + /* Skip backup memory region, which needs a separate entry */ >>> + if (base == BACKUP_SRC_START) { >>> + if (size > BACKUP_SRC_SIZE) { >>> + base = BACKUP_SRC_END + 1; >>> + size -= BACKUP_SRC_SIZE; >>> + } else >>> + continue; >>> + } >>> + >>> + ret = add_mem_range(mem_ranges, base, size); >>> + if (ret) >>> + goto out; >>> + >>> + /* Try merging adjacent ranges before reallocation attempt */ >>> + if ((*mem_ranges)->nr_ranges == (*mem_ranges)->max_nr_ranges) >>> + sort_memory_ranges(*mem_ranges, true); >>> + } >>> + >>> + /* Reallocate memory ranges if there is no space to split ranges */ >>> + tmem = *mem_ranges; >>> + if (tmem && (tmem->nr_ranges == tmem->max_nr_ranges)) { >>> + tmem = realloc_mem_ranges(mem_ranges); >>> + if (!tmem) >>> + goto out; >>> + } >>> + >>> + /* Exclude crashkernel region */ >>> + ret = crash_exclude_mem_range(tmem, crashk_res.start, crashk_res.end); >>> + if (ret) >>> + goto out; >>> + >>> + ret = add_rtas_mem_range(mem_ranges); >>> + if (ret) >>> + goto out; >>> + >>> + ret = add_opal_mem_range(mem_ranges); >>> + if (ret) >>> + goto out; >> >> Maybe I'm confused, but don't you add the RTAS and OPAL regions as >> usable memory for the crashkernel? In that case they shouldn't show up >> in the core file. > > kexec-tools does the same thing. I am not endorsing it but I was trying to > stay > in parity to avoid breaking any userspace tools/commands. But as you rightly > pointed, this is NOT right. The right thing to do, to get the rtas/opal data > at > the time of crash, is to have a backup region for them just like we have for > the first 64K memory. I was hoping to do that later. > > Will check how userspace tools respond to dropping these regions. If that > makes > the tools unhappy, will retain the regions with a FIXME. Sorry about the > confusion. No problem, thanks for the clarification. -- Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v3 04/12] ppc64/kexec_file: avoid stomping memory used by special regions
Thiago Jung Bauermann writes: > Hari Bathini writes: > >> diff --git a/arch/powerpc/include/asm/crashdump-ppc64.h >> b/arch/powerpc/include/asm/crashdump-ppc64.h >> new file mode 100644 >> index 000..90deb46 >> --- /dev/null >> +++ b/arch/powerpc/include/asm/crashdump-ppc64.h >> @@ -0,0 +1,10 @@ >> +/* SPDX-License-Identifier: GPL-2.0-only */ >> +#ifndef _ASM_POWERPC_CRASHDUMP_PPC64_H >> +#define _ASM_POWERPC_CRASHDUMP_PPC64_H >> + >> +/* min & max addresses for kdump load segments */ >> +#define KDUMP_BUF_MIN (crashk_res.start) >> +#define KDUMP_BUF_MAX ((crashk_res.end < ppc64_rma_size) ? \ >> + crashk_res.end : (ppc64_rma_size - 1)) >> + >> +#endif /* __ASM_POWERPC_CRASHDUMP_PPC64_H */ >> diff --git a/arch/powerpc/include/asm/kexec.h >> b/arch/powerpc/include/asm/kexec.h >> index 7008ea1..bf47a01 100644 >> --- a/arch/powerpc/include/asm/kexec.h >> +++ b/arch/powerpc/include/asm/kexec.h >> @@ -100,14 +100,16 @@ void relocate_new_kernel(unsigned long >> indirection_page, unsigned long reboot_co >> #ifdef CONFIG_KEXEC_FILE >> extern const struct kexec_file_ops kexec_elf64_ops; >> >> -#ifdef CONFIG_IMA_KEXEC >> #define ARCH_HAS_KIMAGE_ARCH >> >> struct kimage_arch { >> +struct crash_mem *exclude_ranges; >> + >> +#ifdef CONFIG_IMA_KEXEC >> phys_addr_t ima_buffer_addr; >> size_t ima_buffer_size; >> -}; >> #endif >> +}; >> >> int setup_purgatory(struct kimage *image, const void *slave_code, >> const void *fdt, unsigned long kernel_load_addr, >> @@ -125,6 +127,7 @@ int setup_new_fdt_ppc64(const struct kimage *image, void >> *fdt, >> unsigned long initrd_load_addr, >> unsigned long initrd_len, const char *cmdline); >> #endif /* CONFIG_PPC64 */ >> + >> #endif /* CONFIG_KEXEC_FILE */ >> >> #else /* !CONFIG_KEXEC_CORE */ >> diff --git a/arch/powerpc/kexec/elf_64.c b/arch/powerpc/kexec/elf_64.c >> index 23ad04c..c695f94 100644 >> --- a/arch/powerpc/kexec/elf_64.c >> +++ b/arch/powerpc/kexec/elf_64.c >> @@ -22,6 +22,7 @@ >> #include >> #include >> #include >> +#include >> >> static void *elf64_load(struct kimage *image, char *kernel_buf, >> unsigned long kernel_len, char *initrd, >> @@ -46,6 +47,12 @@ static void *elf64_load(struct kimage *image, char >> *kernel_buf, >> if (ret) >> goto out; >> >> +if (image->type == KEXEC_TYPE_CRASH) { >> +/* min & max buffer values for kdump case */ >> +kbuf.buf_min = pbuf.buf_min = KDUMP_BUF_MIN; >> +kbuf.buf_max = pbuf.buf_max = KDUMP_BUF_MAX; > > This is only my personal opinion and an actual maintainer may disagree, > but just looking at the lines above, I would assume that KDUMP_BUF_MIN > and KDUMP_BUF_MAX were constants, when in fact they aren't. > > I suggest using static inline macros in , for > example: > > static inline resource_size_t get_kdump_buf_min(void) > { > return crashk_res.start; > } > > static inline resource_size_t get_kdump_buf_max(void) > { > return (crashk_res.end < ppc64_rma_size) ? \ >crashk_res.end : (ppc64_rma_size - 1) > } I later noticed that KDUMP_BUF_MIN and KDUMP_BUF_MAX are only used here. In this case, I think the best option is to avoid the macros and inline functions and just use the actual expressions in the code. -- Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v3 12/12] ppc64/kexec_file: fix kexec load failure with lack of memory hole
Hari Bathini writes: > The kexec purgatory has to run in real mode. Only the first memory > block maybe accessible in real mode. And, unlike the case with panic > kernel, no memory is set aside for regular kexec load. Another thing > to note is, the memory for crashkernel is reserved at an offset of > 128MB. So, when crashkernel memory is reserved, the memory ranges to > load kexec segments shrink further as the generic code only looks for > memblock free memory ranges and in all likelihood only a tiny bit of > memory from 0 to 128MB would be available to load kexec segments. > > With kdump being used by default in general, kexec file load is likely > to fail almost always. Ah. I wasn't aware of this problem. > This can be fixed by changing the memory hole > lookup logic for regular kexec to use the same method as kdump. Right. It doesn't make that much sense to use memblock to find free memory areas for the kexec kernel, because memblock tracks which memory areas are free for the currently running kernel. But that's not what matters for the kernel that will be kexec'd into. In this case, regions which may be reserved for the current OS instance may well be free for a freshly started kernel. The kdump method is better at knowing which memory regions are actually reserved by the firmware/hardware. > This > would mean that most kexec segments will overlap with crashkernel > memory region. That should still be ok as the pages, whose destination > address isn't available while loading, are placed in an intermediate > location till a flush to the actual destination address happens during > kexec boot sequence. Yes, since the kdump kernel and the "regular" kexec kernel can't be both booted at the same time, it's not a problem if both plan to use the same region of memory. > > Signed-off-by: Hari Bathini > Tested-by: Pingfan Liu Reviewed-by: Thiago Jung Bauermann > --- > > v2 -> v3: > * Unchanged. Added Tested-by tag from Pingfan. > > v1 -> v2: > * New patch to fix locating memory hole for kexec_file_load (kexec -s -l) > when memory is reserved for crashkernel. > > > arch/powerpc/kexec/file_load_64.c | 33 ++--- > 1 file changed, 14 insertions(+), 19 deletions(-) -- Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v3 11/12] ppc64/kexec_file: add appropriate regions for memory reserve map
Hari Bathini writes: > While initrd, elfcorehdr and backup regions are already added to the > reserve map, there are a few missing regions that need to be added to > the memory reserve map. Add them here. And now that all the changes > to load panic kernel are in place, claim likewise. > > Signed-off-by: Hari Bathini > Tested-by: Pingfan Liu Reviewed-by: Thiago Jung Bauermann Just one oinor nit below. > --- > > v2 -> v3: > * Unchanged. Added Tested-by tag from Pingfan. > > v1 -> v2: > * Updated add_rtas_mem_range() & add_opal_mem_range() callsites based on > the new prototype for these functions. > > > arch/powerpc/kexec/file_load_64.c | 58 > ++--- > 1 file changed, 53 insertions(+), 5 deletions(-) > > diff --git a/arch/powerpc/kexec/file_load_64.c > b/arch/powerpc/kexec/file_load_64.c > index 2531bb5..29e5d11 100644 > --- a/arch/powerpc/kexec/file_load_64.c > +++ b/arch/powerpc/kexec/file_load_64.c > @@ -193,6 +193,34 @@ static int get_crash_memory_ranges(struct crash_mem > **mem_ranges) > } > > /** > + * get_reserved_memory_ranges - Get reserve memory ranges. This list includes > + * memory regions that should be added to the > + * memory reserve map to ensure the region is > + * protected from any mischeif. s/mischeif/mischief/ > + * @mem_ranges: Range list to add the memory ranges to. > + * > + * Returns 0 on success, negative errno on error. > + */ > +static int get_reserved_memory_ranges(struct crash_mem **mem_ranges) > +{ > + int ret; > + > + ret = add_rtas_mem_range(mem_ranges); > + if (ret) > + goto out; > + > + ret = add_tce_mem_ranges(mem_ranges); > + if (ret) > + goto out; > + > + ret = add_reserved_ranges(mem_ranges); > +out: > + if (ret) > + pr_err("Failed to setup reserved memory ranges\n"); > + return ret; > +} -- Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v3 10/12] ppc64/kexec_file: prepare elfcore header for crashing kernel
f->buffer = headers; > + kbuf->mem = KEXEC_BUF_MEM_UNKNOWN; > + kbuf->bufsz = kbuf->memsz = headers_sz; > + kbuf->top_down = false; > + > + ret = kexec_add_buffer(kbuf); > + if (ret) { > + vfree(headers); > + goto out; > + } > + > + image->arch.elfcorehdr_addr = kbuf->mem; > + image->arch.elf_headers_sz = headers_sz; > + image->arch.elf_headers = headers; > +out: > + kfree(cmem); > + return ret; > +} -- Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v3 02/12] powerpc/kexec_file: mark PPC64 specific code
I didn't forget about this patch. I just wanted to see more of the changes before comenting on it. Hari Bathini writes: > Some of the kexec_file_load code isn't PPC64 specific. Move PPC64 > specific code from kexec/file_load.c to kexec/file_load_64.c. Also, > rename purgatory/trampoline.S to purgatory/trampoline_64.S in the > same spirit. There's only a 64 bit implementation of kexec_file_load() so this is a somewhat theoretical exercise, but there's no harm in getting the code organized, so: Reviewed-by: Thiago Jung Bauermann I have just one question below. > Signed-off-by: Hari Bathini > Tested-by: Pingfan Liu > --- > > v2 -> v3: > * Unchanged. Added Tested-by tag from Pingfan. > > v1 -> v2: > * No changes. > > > arch/powerpc/include/asm/kexec.h | 11 +++ > arch/powerpc/kexec/Makefile|2 - > arch/powerpc/kexec/elf_64.c|7 +- > arch/powerpc/kexec/file_load.c | 37 ++ > arch/powerpc/kexec/file_load_64.c | 108 ++ > arch/powerpc/purgatory/Makefile|4 + > arch/powerpc/purgatory/trampoline.S| 117 > > arch/powerpc/purgatory/trampoline_64.S | 117 > > 8 files changed, 248 insertions(+), 155 deletions(-) > create mode 100644 arch/powerpc/kexec/file_load_64.c > delete mode 100644 arch/powerpc/purgatory/trampoline.S > create mode 100644 arch/powerpc/purgatory/trampoline_64.S > diff --git a/arch/powerpc/kexec/file_load_64.c > b/arch/powerpc/kexec/file_load_64.c > new file mode 100644 > index 000..e6bff960 > --- /dev/null > +++ b/arch/powerpc/kexec/file_load_64.c > @@ -0,0 +1,108 @@ > +// SPDX-License-Identifier: GPL-2.0-only > +/* > + * ppc64 code to implement the kexec_file_load syscall > + * > + * Copyright (C) 2004 Adam Litke (a...@us.ibm.com) > + * Copyright (C) 2004 IBM Corp. > + * Copyright (C) 2004,2005 Milton D Miller II, IBM Corporation > + * Copyright (C) 2005 R Sharada (shar...@in.ibm.com) > + * Copyright (C) 2006 Mohan Kumar M (mo...@in.ibm.com) > + * Copyright (C) 2020 IBM Corporation > + * > + * Based on kexec-tools' kexec-ppc64.c, kexec-elf-rel-ppc64.c, fs2dt.c. > + * Heavily modified for the kernel by > + * Hari Bathini . > + */ > + > +#include > +#include > +#include > + > +const struct kexec_file_ops * const kexec_file_loaders[] = { > + _elf64_ops, > + NULL > +}; > + > +/** > + * setup_purgatory_ppc64 - initialize PPC64 specific purgatory's global > + * variables and call setup_purgatory() to initialize > + * common global variable. > + * @image: kexec image. > + * @slave_code:Slave code for the purgatory. > + * @fdt: Flattened device tree for the next kernel. > + * @kernel_load_addr: Address where the kernel is loaded. > + * @fdt_load_addr: Address where the flattened device tree is loaded. > + * > + * Returns 0 on success, negative errno on error. > + */ > +int setup_purgatory_ppc64(struct kimage *image, const void *slave_code, > + const void *fdt, unsigned long kernel_load_addr, > + unsigned long fdt_load_addr) > +{ > + int ret; > + > + ret = setup_purgatory(image, slave_code, fdt, kernel_load_addr, > + fdt_load_addr); > + if (ret) > + pr_err("Failed to setup purgatory symbols"); > + return ret; > +} > + > +/** > + * setup_new_fdt_ppc64 - Update the flattend device-tree of the kernel > + * being loaded. > + * @image: kexec image being loaded. > + * @fdt: Flattened device tree for the next kernel. > + * @initrd_load_addr:Address where the next initrd will be loaded. > + * @initrd_len: Size of the next initrd, or 0 if there will be none. > + * @cmdline: Command line for the next kernel, or NULL if there > will > + * be none. > + * > + * Returns 0 on success, negative errno on error. > + */ > +int setup_new_fdt_ppc64(const struct kimage *image, void *fdt, > + unsigned long initrd_load_addr, > + unsigned long initrd_len, const char *cmdline) > +{ > + int chosen_node, ret; > + > + /* Remove memory reservation for the current device tree. */ > + ret = delete_fdt_mem_rsv(fdt, __pa(initial_boot_params), > + fdt_totalsize(initial_boot_params)); > + if (ret == 0) > + pr_debug("Removed old device tree reservation.\n"); > + else if (re
Re: [PATCH v3 08/12] ppc64/kexec_file: setup the stack for purgatory
Sorry, forgot to send one comment for this patch: Hari Bathini writes: > @@ -898,10 +900,37 @@ int setup_purgatory_ppc64(struct kimage *image, const > void *slave_code, > goto out; > } > > + /* Setup the stack top */ > + stack_buf = kexec_purgatory_get_symbol_addr(image, "stack_buf"); > + if (!stack_buf) > + goto out; > + > + val = (u64)stack_buf + KEXEC_PURGATORY_STACK_SIZE; > + ret = kexec_purgatory_get_set_symbol(image, "stack", , sizeof(val), > + false); > + if (ret) > + goto out; > + > /* Setup the TOC pointer */ > val = get_toc_ptr(&(image->purgatory_info)); > ret = kexec_purgatory_get_set_symbol(image, "my_toc", , sizeof(val), >false); > + if (ret) > + goto out; > + > + /* Setup OPAL base & entry values */ > + dn = of_find_node_by_path("/ibm,opal"); > + if (dn) { > + of_property_read_u64(dn, "opal-base-address", ); > + ret = kexec_purgatory_get_set_symbol(image, "opal_base", , > + sizeof(val), false); > + if (ret) > + goto out; > + > + of_property_read_u64(dn, "opal-entry-address", ); > + ret = kexec_purgatory_get_set_symbol(image, "opal_entry", , > + sizeof(val), false); You need to call of_node_put(dn) here and in the if (ret) case above. > + } > out: > if (ret) > pr_err("Failed to setup purgatory symbols"); -- Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v3 09/12] ppc64/kexec_file: setup backup region for kdump kernel
Hari Bathini writes: > @@ -968,7 +1040,7 @@ int setup_new_fdt_ppc64(const struct kimage *image, void > *fdt, > > /* >* Restrict memory usage for kdump kernel by setting up > - * usable memory ranges. > + * usable memory ranges and memory reserve map. >*/ > if (image->type == KEXEC_TYPE_CRASH) { > ret = get_usable_memory_ranges(); > @@ -980,6 +1052,24 @@ int setup_new_fdt_ppc64(const struct kimage *image, > void *fdt, > pr_err("Error setting up usable-memory property for > kdump kernel\n"); > goto out; > } > + > + ret = fdt_add_mem_rsv(fdt, BACKUP_SRC_START + BACKUP_SRC_SIZE, > + crashk_res.start - BACKUP_SRC_SIZE); I believe this answers my question from the other email about how the crashkernel is prevented from stomping in the crashed kernel's memory, right? I needed to think for a bit to understand what the above reservation was protecting. I think it's worth adding a comment. > + if (ret) { > + pr_err("Error reserving crash memory: %s\n", > +fdt_strerror(ret)); > + goto out; > + } > + } > + > + if (image->arch.backup_start) { > + ret = fdt_add_mem_rsv(fdt, image->arch.backup_start, > + BACKUP_SRC_SIZE); > + if (ret) { > + pr_err("Error reserving memory for backup: %s\n", > +fdt_strerror(ret)); > + goto out; > + } > } This is only true for KEXEC_TYPE_CRASH, if I'm following the code correctly. I think it would be clearer to put the if above inside the if for KEXEC_TYPE_CRASH to make it clearer. > > ret = setup_new_fdt(image, fdt, initrd_load_addr, initrd_len, > diff --git a/arch/powerpc/purgatory/purgatory_64.c > b/arch/powerpc/purgatory/purgatory_64.c > new file mode 100644 > index 000..1eca74c > --- /dev/null > +++ b/arch/powerpc/purgatory/purgatory_64.c > @@ -0,0 +1,36 @@ > +// SPDX-License-Identifier: GPL-2.0-only > +/* > + * purgatory: Runs between two kernels > + * > + * Copyright 2020, Hari Bathini, IBM Corporation. > + */ > + > +#include > +#include > + > +extern unsigned long backup_start; > + > +static void *__memcpy(void *dest, const void *src, unsigned long n) > +{ > + unsigned long i; > + unsigned char *d; > + const unsigned char *s; > + > + d = dest; > + s = src; > + for (i = 0; i < n; i++) > + d[i] = s[i]; > + > + return dest; > +} > + > +void purgatory(void) > +{ > + void *dest, *src; > + > + src = (void *)BACKUP_SRC_START; > + if (backup_start) { > + dest = (void *)backup_start; > + __memcpy(dest, src, BACKUP_SRC_SIZE); > + } > +} In general I'm in favor of using C code over assembly, but having to bring in that relocation support just for the above makes me wonder if it's worth it in this case. -- Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v3 08/12] ppc64/kexec_file: setup the stack for purgatory
Hari Bathini writes: > To avoid any weird errors, the purgatory should run with its own > stack. Set one up by adding the stack buffer to .data section of > the purgatory. Also, setup opal base & entry values in r8 & r9 > registers to help early OPAL debugging. > > Signed-off-by: Hari Bathini > Tested-by: Pingfan Liu Reviewed-by: Thiago Jung Bauermann > --- > > v2 -> v3: > * Unchanged. Added Tested-by tag from Pingfan. > > v1 -> v2: > * Setting up opal base & entry values in r8 & r9 for early OPAL debug. > > > arch/powerpc/include/asm/kexec.h |4 > arch/powerpc/kexec/file_load_64.c | 29 + > arch/powerpc/purgatory/trampoline_64.S | 32 > ++++ > 3 files changed, 65 insertions(+) > -- Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v3 07/12] ppc64/kexec_file: add support to relocate purgatory
Hari Bathini writes: > Right now purgatory implementation is only minimal. But if purgatory > code is to be enhanced to copy memory to the backup region and verify Can't the memcpy be done in asm? We have arch/powerpc/lib/memcpy_64.S for example, perhaps it could be linked in with the purgatory? > sha256 digest, relocations may have to be applied to the purgatory. Do we want to do the sha256 verification? My original patch series for kexec_file_load() had a purgatory in C from kexec-tools which did the sha256 verification but Michael Ellerman thought it was unnecessary and decided to use the simpler purgatory in asm from kexec-lite. As a result, this relocation processing became unnecessary. > So, add support to relocate purgatory in kexec_file_load system call > by setting up TOC pointer and applying RELA relocations as needed. If we do want to use a C purgatory, Michael Ellerman had suggested building it as a Position Independent Executable, which greatly reduces the number and types of relocations that are needed. See patches 4 and 9 here: https://lore.kernel.org/linuxppc-dev/1478748449-3894-1-git-send-email-bauer...@linux.vnet.ibm.com/ In the series above I hadn't converted x86 to PIE. If I had done that, possibly Dave Young's opinion would have been different. :-) If that's still not desirable, he suggested in that discussion lifting some code from x86 to generic code, which I implemented and would simplify this patch as well: https://lore.kernel.org/linuxppc-dev/5009580.5GxAkTrMYA@morokweng/ > Reported-by: kernel test robot > [lkp: In v1, 'struct mem_sym' was declared in parameter list] > Signed-off-by: Hari Bathini > --- > > v2 -> v3: > * Fixed get_toc_section() to return the section info that had relocations > applied, to calculate the correct toc pointer. > * Fixed how relocation value is converted to relative while applying > R_PPC64_REL64 & R_PPC64_REL32 relocations. > > v1 -> v2: > * Fixed wrong use of 'struct mem_sym' in local_entry_offset() as > reported by lkp. lkp report for reference: > - https://lore.kernel.org/patchwork/patch/1264421/ > > > arch/powerpc/kexec/file_load_64.c | 337 > > arch/powerpc/purgatory/trampoline_64.S |8 + > 2 files changed, 345 insertions(+) -- Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v3 06/12] ppc64/kexec_file: restrict memory usage of kdump kernel
+ > + /* Get the full path of the memory node */ > + get_node_path(dn, pathname); > + pr_debug("Memory node path: %s\n", pathname); > + > + /* Now that we know the path, find its offset in kdump kernel's fdt */ > + node = fdt_path_offset(fdt, pathname); > + if (node < 0) { > + pr_err("Malformed device tree: error reading %s\n", > +pathname); > + ret = -EINVAL; > + goto out; > + } > + > + /* Get the address & size cells */ > + n_mem_addr_cells = of_n_addr_cells(dn); > + n_mem_size_cells = of_n_size_cells(dn); > + pr_debug("address cells: %d, size cells: %d\n", n_mem_addr_cells, > + n_mem_size_cells); > + > + um_info->idx = 0; > + buf = check_realloc_usable_mem(um_info, 2); > + if (!buf) { > + ret = -ENOMEM; > + goto out; > + } > + > + um_info->buf = buf; > + > + prop = of_get_property(dn, "reg", ); > + if (!prop || len <= 0) { > + ret = 0; > + goto out; > + } > + > + /* > + * "reg" property represents sequence of (addr,size) duples s/duples/tuples/ ? > + * each representing a memory range. > + */ > + ranges = (len >> 2) / (n_mem_addr_cells + n_mem_size_cells); > + > + for (i = 0; i < ranges; i++) { > + base = of_read_number(prop, n_mem_addr_cells); > + prop += n_mem_addr_cells; > + end = base + of_read_number(prop, n_mem_size_cells) - 1; You need to `prop += n_mem_size_cells` here. > + > + ret = add_usable_mem(um_info, base, end, ); > + if (ret) { > + ret = ret; > + goto out; > + } > + } > + > + /* > + * No kdump kernel usable memory found in this memory node. > + * Write (0,0) duple in linux,usable-memory property for s/duple/tuple/ ? > + * this region to be ignored. > + */ > + if (um_info->idx == 0) { > + um_info->buf[0] = 0; > + um_info->buf[1] = 0; > + um_info->idx = 2; > + } > + > + ret = fdt_setprop(fdt, node, "linux,usable-memory", um_info->buf, > + (um_info->idx * sizeof(*(um_info->buf; > + > +out: > + kfree(pathname); > + return ret; > +} -- Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v3 05/12] powerpc/drmem: make lmb walk a bit more flexible
Hari Bathini writes: > @@ -534,7 +537,7 @@ static int __init early_init_dt_scan_memory_ppc(unsigned > long node, > #ifdef CONFIG_PPC_PSERIES > if (depth == 1 && > strcmp(uname, "ibm,dynamic-reconfiguration-memory") == 0) { > - walk_drmem_lmbs_early(node, early_init_drmem_lmb); > + walk_drmem_lmbs_early(node, NULL, early_init_drmem_lmb); walk_drmem_lmbs_early() can now fail. Should this failure be propagated as a return value of early_init_dt_scan_memory_ppc()? > return 0; > } > #endif > @@ -787,7 +790,7 @@ static int __init parse_numa_properties(void) >*/ > memory = of_find_node_by_path("/ibm,dynamic-reconfiguration-memory"); > if (memory) { > - walk_drmem_lmbs(memory, numa_setup_drmem_lmb); > + walk_drmem_lmbs(memory, NULL, numa_setup_drmem_lmb); Similarly here. Now that this call can fail, should parse_numa_properties() handle or propagate the failure? > of_node_put(memory); > } > -- Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v3 04/12] ppc64/kexec_file: avoid stomping memory used by special regions
his is why I dislike using start and end to express address ranges: While struct resource seems to use the [address, end] convention, my reading of memblock code is that it uses [addres, end). This is guaranteed to lead to bugs. So the above has an off-by-one error. To calculate the size of the current range, you need to use `end - start`. > + /* Suitable memory range found. Set kbuf->mem */ > + kbuf->mem = ALIGN_DOWN(end - kbuf->memsz + 1, Similarly, I believe the `+ 1` here is wrong. > +kbuf->buf_align); > + ret = 0; > + break; > + } > + } > + > + return ret; > +} > + > +/** > + * locate_mem_hole_top_down_ppc64 - Skip special memory regions to find a > + * suitable buffer with top down approach. > + * @kbuf: Buffer contents and memory parameters. > + * @buf_min:Minimum address for the buffer. > + * @buf_max:Maximum address for the buffer. > + * @emem: Exclude memory ranges. > + * > + * Returns 0 on success, negative errno on error. > + */ > +static int locate_mem_hole_top_down_ppc64(struct kexec_buf *kbuf, > + u64 buf_min, u64 buf_max, > + const struct crash_mem *emem) > +{ > + int i, ret = 0, err = -EADDRNOTAVAIL; > + u64 start, end, tmin, tmax; > + > + tmax = buf_max; > + for (i = (emem->nr_ranges - 1); i >= 0; i--) { > + start = emem->ranges[i].start; > + end = emem->ranges[i].end; > + > + if (start > tmax) > + continue; > + > + if (end < tmax) { > + tmin = (end < buf_min ? buf_min : end + 1); > + ret = __locate_mem_hole_top_down(kbuf, tmin, tmax); > + if (!ret) > + return 0; > + } > + > + tmax = start - 1; > + > + if (tmax < buf_min) { > + ret = err; > + break; > + } > + ret = 0; > + } > + > + if (!ret) { > + tmin = buf_min; > + ret = __locate_mem_hole_top_down(kbuf, tmin, tmax); > + } > + return ret; > +} > + > +/** > + * __locate_mem_hole_bottom_up - Looks bottom up for a large enough memory > hole > + * in the memory regions between buf_min & > buf_max > + * for the buffer. If found, sets kbuf->mem. > + * @kbuf:Buffer contents and memory parameters. > + * @buf_min: Minimum address for the buffer. > + * @buf_max: Maximum address for the buffer. > + * > + * Returns 0 on success, negative errno on error. > + */ > +static int __locate_mem_hole_bottom_up(struct kexec_buf *kbuf, > +u64 buf_min, u64 buf_max) > +{ > + int ret = -EADDRNOTAVAIL; > + phys_addr_t start, end; > + u64 i; > + > + for_each_mem_range(i, , NULL, NUMA_NO_NODE, > +MEMBLOCK_NONE, , , NULL) { > + if (end < buf_min) > + continue; > + > + /* Memory hole not found */ > + if (start > buf_max) > + break; > + > + /* Adjust memory region based on the given range */ > + if (start < buf_min) > + start = buf_min; > + if (end > buf_max) > + end = buf_max; buf_max is an inclusive end address, right? Then this should read `end = buf_max + 1`. Same thing in the top-down version above. > + > + start = ALIGN(start, kbuf->buf_align); > + if (start < end && (end - start + 1) >= kbuf->memsz) { Same off-by-one problem. There shouldn't be a `+ 1` here. > + /* Suitable memory range found. Set kbuf->mem */ > + kbuf->mem = start; > + ret = 0; > + break; > + } > + } > + > + return ret; > +} -- Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v3 03/12] powerpc/kexec_file: add helper functions for getting memory ranges
tab_mem_range(struct crash_mem **mem_ranges); #else static inline int add_htab_mem_range(struct crash_mem **mem_ranges) { return 0; } #endif And in ranges.c just surround the add_htab_mem_range() definition with #ifdef CONFIG_PPC_BOOK3S_64 and #endif Also, there's no need for the ret variable. You can just `return add_mem_range(...)` directly. > + > +/** > + * add_kernel_mem_range - Adds kernel text region to the given > + *memory ranges list. > + * @mem_ranges: Range list to add the memory range to. > + * > + * Returns 0 on success, negative errno on error. > + */ > +int add_kernel_mem_range(struct crash_mem **mem_ranges) > +{ > + int ret; > + > + ret = add_mem_range(mem_ranges, 0, __pa(_end)); > + return ret; > +} No need for the ret variable here, just `return add_mem_range()` directly. > + > +/** > + * add_rtas_mem_range - Adds RTAS region to the given memory ranges list. > + * @mem_ranges: Range list to add the memory range to. > + * > + * Returns 0 on success, negative errno on error. > + */ > +int add_rtas_mem_range(struct crash_mem **mem_ranges) > +{ > + struct device_node *dn; > + int ret = 0; > + > + dn = of_find_node_by_path("/rtas"); > + if (dn) { > + u32 base, size; > + > + ret = of_property_read_u32(dn, "linux,rtas-base", ); > + ret |= of_property_read_u32(dn, "rtas-size", ); > + if (ret) > + return ret; > + > + ret = add_mem_range(mem_ranges, base, size); You're missing an of_node_put(dn) here (also in the early return in the line above). > + } > + return ret; > +} > + > +/** > + * add_opal_mem_range - Adds OPAL region to the given memory ranges list. > + * @mem_ranges: Range list to add the memory range to. > + * > + * Returns 0 on success, negative errno on error. > + */ > +int add_opal_mem_range(struct crash_mem **mem_ranges) > +{ > + struct device_node *dn; > + int ret = 0; > + > + dn = of_find_node_by_path("/ibm,opal"); > + if (dn) { > + u64 base, size; > + > + ret = of_property_read_u64(dn, "opal-base-address", ); > + ret |= of_property_read_u64(dn, "opal-runtime-size", ); > + if (ret) > + return ret; > + > + ret = add_mem_range(mem_ranges, base, size); You're missing an of_node_put(dn) here (also in the early return in the line above). > + } > + return ret; > +} > + > +/** > + * add_reserved_ranges - Adds "/reserved-ranges" regions exported by f/w > + * to the given memory ranges list. > + * @mem_ranges: Range list to add the memory ranges to. > + * > + * Returns 0 on success, negative errno on error. > + */ > +int add_reserved_ranges(struct crash_mem **mem_ranges) > +{ > + int i, len, ret = 0; > + const __be32 *prop; > + > + prop = of_get_property(of_root, "reserved-ranges", ); > + if (!prop) > + return 0; > + > + /* > + * Each reserved range is an (address,size) pair, 2 cells each, > + * totalling 4 cells per range. Can you assume that, or do you need to check the #address-cells and #size-cells properties of the root node? > + */ > + for (i = 0; i < len / (sizeof(*prop) * 4); i++) { > + u64 base, size; > + > + base = of_read_number(prop + (i * 4) + 0, 2); > + size = of_read_number(prop + (i * 4) + 2, 2); > + > + ret = add_mem_range(mem_ranges, base, size); > + if (ret) > + break; > + } > + > + return ret; > +} > + > +/** > + * sort_memory_ranges - Sorts the given memory ranges list. > + * @mem_ranges: Range list to sort. > + * @merge: If true, merge the list after sorting. > + * > + * Returns nothing. > + */ > +void sort_memory_ranges(struct crash_mem *mrngs, bool merge) > +{ > + struct crash_mem_range *rngs; > + struct crash_mem_range rng; > + int i, j, idx; > + > + if (!mrngs) > + return; > + > + /* Sort the ranges in-place */ > + rngs = >ranges[0]; > + for (i = 0; i < mrngs->nr_ranges; i++) { > + idx = i; > + for (j = (i + 1); j < mrngs->nr_ranges; j++) { > + if (rngs[idx].start > rngs[j].start) > + idx = j; > + } > + if (idx != i) { > + rng = rngs[idx]; > + rngs[idx] = rngs[i]; > + rngs[i] = rng; > + } > + } Would it work using sort() from lib/sort.c here? > + > + if (merge) > + __merge_memory_ranges(mrngs); > +} -- Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v3 01/12] kexec_file: allow archs to handle special regions while locating memory hole
Hari Bathini writes: > Some architectures may have special memory regions, within the given > memory range, which can't be used for the buffer in a kexec segment. > Implement weak arch_kexec_locate_mem_hole() definition which arch code > may override, to take care of special regions, while trying to locate > a memory hole. > > Also, add the missing declarations for arch overridable functions and > and drop the __weak descriptors in the declarations to avoid non-weak > definitions from becoming weak. > > Reported-by: kernel test robot > [lkp: In v1, arch_kimage_file_post_load_cleanup() declaration was missing] > Signed-off-by: Hari Bathini > Acked-by: Dave Young > Tested-by: Pingfan Liu Reviewed-by: Thiago Jung Bauermann -- Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [RFC PATCH v1 1/1] Add support for arm64 to carry ima measurement log in kexec_file_load
Hello, prsriva writes: > On 9/19/19 8:07 PM, Thiago Jung Bauermann wrote: >> Hello Prakhar, >> >> Prakhar Srivastava writes: >> >>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig >>> index 3adcec05b1f6..f39b12dbf9e8 100644 >>> --- a/arch/arm64/Kconfig >>> +++ b/arch/arm64/Kconfig >>> @@ -976,6 +976,13 @@ config KEXEC_VERIFY_SIG >>> verification for the corresponding kernel image type being >>> loaded in order for this to work. >>> >>> +config HAVE_IMA_KEXEC >>> + bool "Carry over IMA measurement log during kexec_file_load() syscall" >>> + depends on KEXEC_FILE >>> + help >>> + Select this option to carry over IMA measurement log during >>> + kexec_file_load. >>> + >>> config KEXEC_IMAGE_VERIFY_SIG >>> bool "Enable Image signature verification support" >>> default y >> This is not right. As it stands, HAVE_IMA_KEXEC is essentially a synonym >> for IMA_KEXEC. >> >> It's not meant to be user-visible in the config process. Instead, it's >> meant to be selected by the arch Kconfig (probably by the ARM64 config >> symbol) to signal to IMA's Kconfig that it can offer the IMA_KEXEC >> option. >> >> I also mentioned in my previous review that config HAVE_IMA_KEXEC should >> be defined in arch/Kconfig, not separately in both arch/arm64/Kconfig >> and arch/powerpc/Kconfig. > > I see the entry exists in arch/Kconfig and is overwritten. > I will remove entries both from powerpc and arm64. > > How do i cross-compile for powerpc? There are some instructions here: https://github.com/linuxppc/wiki/wiki/Building-powerpc-kernels >>> diff --git a/arch/arm64/include/asm/ima.h b/arch/arm64/include/asm/ima.h >>> new file mode 100644 >>> index ..e23cee84729f >>> --- /dev/null >>> +++ b/arch/arm64/include/asm/ima.h >>> @@ -0,0 +1,29 @@ >>> +/* SPDX-License-Identifier: GPL-2.0 */ >>> +#ifndef _ASM_ARM64_IMA_H >>> +#define _ASM_ARM64_IMA_H >>> + >>> +struct kimage; >>> + >>> +int ima_get_kexec_buffer(void **addr, size_t *size); >>> +int ima_free_kexec_buffer(void); >>> + >>> +#ifdef CONFIG_IMA >>> +void remove_ima_buffer(void *fdt, int chosen_node); >>> +#else >>> +static inline void remove_ima_buffer(void *fdt, int chosen_node) {} >>> +#endif >> I mentioned in my previous review that remove_ima_buffer() should exist >> even if CONFIG_IMA isn't set. Did you arrive at a different conclusion? > > I made the needed changed in makefile, missed removing the > > configs here. Thanks for pointing this out. Thanks. -- Thiago Jung Bauermann IBM Linux Technology Center
Re: [RFC PATCH v1 1/1] Add support for arm64 to carry ima measurement log in kexec_file_load
Hello Prakhar, Prakhar Srivastava writes: > During kexec_file_load, carrying forward the ima measurement log allows > a verifying party to get the entire runtime event log since the last > full reboot since that is when PCRs were last reset. > > Signed-off-by: Prakhar Srivastava > --- > arch/arm64/Kconfig | 7 + > arch/arm64/include/asm/ima.h | 29 > arch/arm64/include/asm/kexec.h | 5 + > arch/arm64/kernel/Makefile | 3 +- > arch/arm64/kernel/ima_kexec.c | 213 + > arch/arm64/kernel/machine_kexec_file.c | 6 + > 6 files changed, 262 insertions(+), 1 deletion(-) > create mode 100644 arch/arm64/include/asm/ima.h > create mode 100644 arch/arm64/kernel/ima_kexec.c > > diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig > index 3adcec05b1f6..f39b12dbf9e8 100644 > --- a/arch/arm64/Kconfig > +++ b/arch/arm64/Kconfig > @@ -976,6 +976,13 @@ config KEXEC_VERIFY_SIG > verification for the corresponding kernel image type being > loaded in order for this to work. > > +config HAVE_IMA_KEXEC > + bool "Carry over IMA measurement log during kexec_file_load() syscall" > + depends on KEXEC_FILE > + help > + Select this option to carry over IMA measurement log during > + kexec_file_load. > + > config KEXEC_IMAGE_VERIFY_SIG > bool "Enable Image signature verification support" > default y This is not right. As it stands, HAVE_IMA_KEXEC is essentially a synonym for IMA_KEXEC. It's not meant to be user-visible in the config process. Instead, it's meant to be selected by the arch Kconfig (probably by the ARM64 config symbol) to signal to IMA's Kconfig that it can offer the IMA_KEXEC option. I also mentioned in my previous review that config HAVE_IMA_KEXEC should be defined in arch/Kconfig, not separately in both arch/arm64/Kconfig and arch/powerpc/Kconfig. > diff --git a/arch/arm64/include/asm/ima.h b/arch/arm64/include/asm/ima.h > new file mode 100644 > index ..e23cee84729f > --- /dev/null > +++ b/arch/arm64/include/asm/ima.h > @@ -0,0 +1,29 @@ > +/* SPDX-License-Identifier: GPL-2.0 */ > +#ifndef _ASM_ARM64_IMA_H > +#define _ASM_ARM64_IMA_H > + > +struct kimage; > + > +int ima_get_kexec_buffer(void **addr, size_t *size); > +int ima_free_kexec_buffer(void); > + > +#ifdef CONFIG_IMA > +void remove_ima_buffer(void *fdt, int chosen_node); > +#else > +static inline void remove_ima_buffer(void *fdt, int chosen_node) {} > +#endif I mentioned in my previous review that remove_ima_buffer() should exist even if CONFIG_IMA isn't set. Did you arrive at a different conclusion? > + > +#ifdef CONFIG_IMA_KEXEC > +int arch_ima_add_kexec_buffer(struct kimage *image, unsigned long load_addr, > + size_t size); > + > +int setup_ima_buffer(const struct kimage *image, void *fdt, int chosen_node); > +#else > +static inline int setup_ima_buffer(const struct kimage *image, void *fdt, > +int chosen_node) > +{ > + remove_ima_buffer(fdt, chosen_node); > + return 0; > +} > +#endif /* CONFIG_IMA_KEXEC */ > +#endif /* _ASM_ARM64_IMA_H */ > diff --git a/arch/arm64/kernel/ima_kexec.c b/arch/arm64/kernel/ima_kexec.c > new file mode 100644 > index ..b14326d541f3 > --- /dev/null > +++ b/arch/arm64/kernel/ima_kexec.c In the previous patch, you took the powerpc file and made a few modifications to fit your needs. This file is now somewhat different than the powerpc version, but I don't understand to what purpose. It's not different in any significant way. Based on review comments from your previous patch, I was expecting to see code from the powerpc file moved to an arch-independent part of the the kernel and possibly adapted so that both arm64 and powerpc could use it. Can you explain why you chose this approach instead? What is the advantage of having superficially different but basically equivalent code in the two architectures? Actually, there's one change that is significant: instead of a single linux,ima-kexec-buffer property holding the start address and size of the buffer, ARM64 is now using two properties (linux,ima-kexec-buffer and linux,ima-kexec-buffer-end) for the start and end addresses. In my opinion, unless there's a good reason for it Linux should be consistent accross architectures when possible. -- Thiago Jung Bauermann IBM Linux Technology Center
Re: [RFC PATCH v1 1/1] Add support for arm64 to carry ima measurement log in kexec_file_load
Mimi Zohar writes: > On Wed, 2019-09-18 at 10:15 -0400, Mimi Zohar wrote: > >> > + uint64_t tmp_start, tmp_end; >> > + >> > + propStart = of_find_property(of_chosen, "linux,ima-kexec-buffer", >> > + NULL); >> > + if (propStart) { >> > + tmp_start = fdt64_to_cpu(*((const fdt64_t *) propStart)); >> > + ret = of_remove_property(of_chosen, propStart); >> > + if (!ret) { >> > + return ret; >> > + } >> > + >> > + propEnd = of_find_property(of_chosen, >> > + "linux,ima-kexec-buffer-end", NULL); >> > + if (!propEnd) { >> > + return -EINVAL; >> > + } >> > + >> > + tmp_end = fdt64_to_cpu(*((const fdt64_t *) propEnd)); >> > + >> > + ret = of_remove_property(of_chosen, propEnd); >> > + if (!ret) { >> > + return ret; >> > + } >> >> There seems to be quite a bit of code duplication in this function and >> in ima_get_kexec_buffer(). It could probably be cleaned up with some >> refactoring. > > Sorry, my mistake. One calls of_get_property(), while the other calls > of_find_property(). of_get_property() is a thin wrapper around of_find_property(), so if that's the only difference I think they can still be merged. -- Thiago Jung Bauermann IBM Linux Technology Center
Re: [PATCH v5 0/7] kexec: add generic support for elf kernel images
Helge Deller writes: > On 06.09.19 23:47, Thiago Jung Bauermann wrote: >> Helge Deller writes: >>> This kexec patch series is the groundwork for kexec on the parisc >>> architecture. >>> Since we want kexec on parisc, I've applied it to my for-next-kexec tree >>> [1], >>> and can push it to Linus in the next merge window through the parisc tree >>> [2]. >> >> I just had a look at this version and it looks fine to me. Identical to >> the version I reviewed before except for the changes I suggested. >> Thanks, Sven! >> >>> If someone has any objections, or if you prefer to take it through >>> a kexec or powerpc tree, please let me know. >>> >>> Helge >>> >>> [1] >>> https://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux.git/log/?h=for-next-kexec >>> [2] >>> https://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux.git/log/?h=for-next >> >> I noticed that the first patch is the only one that doesn't have my >> Reviewed-by. If you want, you can add it: >> >> Reviewed-by: Thiago Jung Bauermann > > Thanks for reviewing again! > I added your Reviewed-by to the patches in the for-next tree. Thanks! -- Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v5 0/7] kexec: add generic support for elf kernel images
Helge Deller writes: > Hi all, > > This kexec patch series is the groundwork for kexec on the parisc > architecture. > Since we want kexec on parisc, I've applied it to my for-next-kexec tree [1], > and can push it to Linus in the next merge window through the parisc tree [2]. I just had a look at this version and it looks fine to me. Identical to the version I reviewed before except for the changes I suggested. Thanks, Sven! > If someone has any objections, or if you prefer to take it through > a kexec or powerpc tree, please let me know. > > Helge > > [1] > https://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux.git/log/?h=for-next-kexec > [2] > https://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux.git/log/?h=for-next I noticed that the first patch is the only one that doesn't have my Reviewed-by. If you want, you can add it: Reviewed-by: Thiago Jung Bauermann If it's inconvenient to add it now since it's already applied, that's fine too of course. > On 23.08.19 21:49, Sven Schnelle wrote: >> Changes to v4: >> - rebase on current powerpc/merge tree >> - fix syscall name in commit message >> - remove a few unused #defines in arch/powerpc/kernel/kexec_elf_64.c >>... >> arch/Kconfig | 3 + >> arch/powerpc/Kconfig | 1 + >> arch/powerpc/kernel/kexec_elf_64.c| 545 +- >> include/linux/kexec.h | 23 + >> kernel/Makefile | 1 + >> .../kexec_elf_64.c => kernel/kexec_elf.c | 394 +++-- >> 6 files changed, 115 insertions(+), 852 deletions(-) >> copy arch/powerpc/kernel/kexec_elf_64.c => kernel/kexec_elf.c (50%) -- Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v4 1/7] kexec: add KEXEC_ELF
Thiago Jung Bauermann writes: >> diff --git a/arch/powerpc/kernel/kexec_elf_64.c b/kernel/kexec_elf.c >> similarity index 71% >> copy from arch/powerpc/kernel/kexec_elf_64.c >> copy to kernel/kexec_elf.c >> index ba4f18a43ee8..6e9f52171ede 100644 >> --- a/arch/powerpc/kernel/kexec_elf_64.c >> +++ b/kernel/kexec_elf.c >> @@ -1,33 +1,10 @@ >> -/* >> - * Load ELF vmlinux file for the kexec_file_load syscall. >> - * >> - * Copyright (C) 2004 Adam Litke (a...@us.ibm.com) >> - * Copyright (C) 2004 IBM Corp. >> - * Copyright (C) 2005 R Sharada (shar...@in.ibm.com) >> - * Copyright (C) 2006 Mohan Kumar M (mo...@in.ibm.com) >> - * Copyright (C) 2016 IBM Corporation >> - * >> - * Based on kexec-tools' kexec-elf-exec.c and kexec-elf-ppc64.c. >> - * Heavily modified for the kernel by >> - * Thiago Jung Bauermann . >> - * >> - * This program is free software; you can redistribute it and/or modify >> - * it under the terms of the GNU General Public License as published by >> - * the Free Software Foundation (version 2 of the License). >> - * >> - * This program is distributed in the hope that it will be useful, >> - * but WITHOUT ANY WARRANTY; without even the implied warranty of >> - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the >> - * GNU General Public License for more details. >> - */ >> +// SPDX-License-Identifier: GPL-2.0-only > > I may be wrong, but my understanding of the SPDX license identifier is > that it substitutes the license text (i.e., the last two paragraphs > above), but not the copyright statements. Is it ok to have a file with a > SPDX license identifier but no copyright statement? Answering my own question: I just came accross commit b24413180f56 ("License cleanup: add SPDX GPL-2.0 license identifier to files with no license") which adds SPDX license identifiers to a lot of files without any copyright statement so I conclude that it is indeed ok to not have copyright statements in a file. In this instance the new file is heavily based on the old one though, so IMHO it makes sense for it to inherit the copyright statements from the original file. -- Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v4 4/7] kexec_elf: remove PURGATORY_STACK_SIZE
Thiago Jung Bauermann writes: > Sven Schnelle writes: > >> It's not used anywhere so just drop it. >> >> Signed-off-by: Sven Schnelle >> --- >> kernel/kexec_elf.c | 2 -- >> 1 file changed, 2 deletions(-) >> >> diff --git a/kernel/kexec_elf.c b/kernel/kexec_elf.c >> index effe9dc0b055..70d31b8feeae 100644 >> --- a/kernel/kexec_elf.c >> +++ b/kernel/kexec_elf.c >> @@ -8,8 +8,6 @@ >> #include >> #include >> >> -#define PURGATORY_STACK_SIZE(16 * 1024) >> - >> #define elf_addr_to_cpu elf64_to_cpu >> >> #ifndef Elf_Rel > > Can you remove it from the file in arch/powerpc as well? Sorry, forgot to add: Reviewed-by: Thiago Jung Bauermann -- Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v4 0/7] kexec: add generic support for elf kernel images
Sven Schnelle writes: > Changes to v3: > - add support for 32-bit ELF files > > Changes to v2: > - use git format-patch -C > > Changes to v1: > - split up patch into smaller pieces > - rebase onto powerpc/next > - remove unused variable in kexec_elf_load() > > Changes to RFC version: > - remove unused Elf_Rel macro > - remove section header parsing > - remove PURGATORY_STACK_SIZE > - change order of elf_*_to_cpu() functions > - remove elf_addr_to_cpu macro > > Sven Schnelle (7): > kexec: add KEXEC_ELF > kexec_elf: change order of elf_*_to_cpu() functions > kexec_elf: remove parsing of section headers > kexec_elf: remove PURGATORY_STACK_SIZE > kexec_elf: remove Elf_Rel macro > kexec_elf: remove unused variable in kexec_elf_load() > kexec_elf: support 32 bit ELF files > > arch/Kconfig | 3 + > arch/powerpc/Kconfig | 1 + > arch/powerpc/kernel/kexec_elf_64.c | 551 + > include/linux/kexec.h | 23 ++ > kernel/Makefile| 1 + > kernel/kexec_elf.c | 418 ++ > 6 files changed, 456 insertions(+), 541 deletions(-) > create mode 100644 kernel/kexec_elf.c The series applies on v5.1 but not newer kernels, so it needs to be rebased. I tested with v5.1 in ppc64le kexecing to both little-endian and big-endian kernels, and also in ppc64 kexecing to both big-endian and little-endian kernels so: Tested-by: Thiago Jung Bauermann -- Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v4 1/7] kexec: add KEXEC_ELF
Hello Sven, Just a few small comments below. Regardless of them: Reviewed-by: Thiago Jung Bauermann Sven Schnelle writes: > Right now powerpc provides an implementation to read elf files > with the kexec_file() syscall. Make that available as a public Nit: the syscall is kexec_file_load() > kexec interface so it can be re-used on other architectures. > > Signed-off-by: Sven Schnelle > --- > arch/Kconfig | 3 + > arch/powerpc/Kconfig | 1 + > arch/powerpc/kernel/kexec_elf_64.c| 551 +- > include/linux/kexec.h | 24 + > kernel/Makefile | 1 + > .../kexec_elf_64.c => kernel/kexec_elf.c | 199 ++- > 6 files changed, 75 insertions(+), 704 deletions(-) > copy arch/powerpc/kernel/kexec_elf_64.c => kernel/kexec_elf.c (71%) > diff --git a/arch/powerpc/kernel/kexec_elf_64.c > b/arch/powerpc/kernel/kexec_elf_64.c > index ba4f18a43ee8..30bd57a93c17 100644 > --- a/arch/powerpc/kernel/kexec_elf_64.c > +++ b/arch/powerpc/kernel/kexec_elf_64.c > @@ -1,3 +1,4 @@ > +// SPDX-License-Identifier: GPL-2.0-only > /* > * Load ELF vmlinux file for the kexec_file_load syscall. > * > @@ -10,15 +11,6 @@ > * Based on kexec-tools' kexec-elf-exec.c and kexec-elf-ppc64.c. > * Heavily modified for the kernel by > * Thiago Jung Bauermann . > - * > - * This program is free software; you can redistribute it and/or modify > - * it under the terms of the GNU General Public License as published by > - * the Free Software Foundation (version 2 of the License). > - * > - * This program is distributed in the hope that it will be useful, > - * but WITHOUT ANY WARRANTY; without even the implied warranty of > - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > - * GNU General Public License for more details. > */ > > #define pr_fmt(fmt) "kexec_elf: " fmt > @@ -39,532 +31,6 @@ > #define Elf_Rel Elf64_Rel > #endif /* Elf_Rel */ Perhaps this patch could remove the #define for elf_addr_to_cpu since it's not used anymore in this file? > diff --git a/arch/powerpc/kernel/kexec_elf_64.c b/kernel/kexec_elf.c > similarity index 71% > copy from arch/powerpc/kernel/kexec_elf_64.c > copy to kernel/kexec_elf.c > index ba4f18a43ee8..6e9f52171ede 100644 > --- a/arch/powerpc/kernel/kexec_elf_64.c > +++ b/kernel/kexec_elf.c > @@ -1,33 +1,10 @@ > -/* > - * Load ELF vmlinux file for the kexec_file_load syscall. > - * > - * Copyright (C) 2004 Adam Litke (a...@us.ibm.com) > - * Copyright (C) 2004 IBM Corp. > - * Copyright (C) 2005 R Sharada (shar...@in.ibm.com) > - * Copyright (C) 2006 Mohan Kumar M (mo...@in.ibm.com) > - * Copyright (C) 2016 IBM Corporation > - * > - * Based on kexec-tools' kexec-elf-exec.c and kexec-elf-ppc64.c. > - * Heavily modified for the kernel by > - * Thiago Jung Bauermann . > - * > - * This program is free software; you can redistribute it and/or modify > - * it under the terms of the GNU General Public License as published by > - * the Free Software Foundation (version 2 of the License). > - * > - * This program is distributed in the hope that it will be useful, > - * but WITHOUT ANY WARRANTY; without even the implied warranty of > - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > - * GNU General Public License for more details. > - */ > +// SPDX-License-Identifier: GPL-2.0-only I may be wrong, but my understanding of the SPDX license identifier is that it substitutes the license text (i.e., the last two paragraphs above), but not the copyright statements. Is it ok to have a file with a SPDX license identifier but no copyright statement? -- Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v4 7/7] kexec_elf: support 32 bit ELF files
Sven Schnelle writes: > The powerpc version only supported 64 bit. Add some > code to switch decoding of fields during runtime so > we can kexec a 32 bit kernel from a 64 bit kernel and > vice versa. > > Signed-off-by: Sven Schnelle Reviewed-by: Thiago Jung Bauermann -- Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v4 6/7] kexec_elf: remove unused variable in kexec_elf_load()
Sven Schnelle writes: > base was never assigned, so we can remove it. > > Reviewed-by: Christophe Leroy > Signed-off-by: Sven Schnelle > --- > kernel/kexec_elf.c | 7 ++- > 1 file changed, 2 insertions(+), 5 deletions(-) > > diff --git a/kernel/kexec_elf.c b/kernel/kexec_elf.c > index e346659af324..9421eebbacf0 100644 > --- a/kernel/kexec_elf.c > +++ b/kernel/kexec_elf.c > @@ -350,7 +350,7 @@ int kexec_elf_load(struct kimage *image, struct elfhdr > *ehdr, >struct kexec_buf *kbuf, >unsigned long *lowest_load_addr) > { > - unsigned long base = 0, lowest_addr = UINT_MAX; > + unsigned long lowest_addr = UINT_MAX; > int ret; > size_t i; > > @@ -372,7 +372,7 @@ int kexec_elf_load(struct kimage *image, struct elfhdr > *ehdr, > kbuf->bufsz = size; > kbuf->memsz = phdr->p_memsz; > kbuf->buf_align = phdr->p_align; > - kbuf->buf_min = phdr->p_paddr + base; > + kbuf->buf_min = phdr->p_paddr; > ret = kexec_add_buffer(kbuf); > if (ret) > goto out; > @@ -382,9 +382,6 @@ int kexec_elf_load(struct kimage *image, struct elfhdr > *ehdr, > lowest_addr = load_addr; > } > > - /* Update entry point to reflect new load address. */ > - ehdr->e_entry += base; > - > *lowest_load_addr = lowest_addr; > ret = 0; > out: Reviewed-by: Thiago Jung Bauermann -- Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v4 5/7] kexec_elf: remove Elf_Rel macro
Sven Schnelle writes: > It wasn't used anywhere, so lets drop it. > > Reviewed-by: Christophe Leroy > Signed-off-by: Sven Schnelle > --- > kernel/kexec_elf.c | 4 > 1 file changed, 4 deletions(-) > > diff --git a/kernel/kexec_elf.c b/kernel/kexec_elf.c > index 70d31b8feeae..e346659af324 100644 > --- a/kernel/kexec_elf.c > +++ b/kernel/kexec_elf.c > @@ -10,10 +10,6 @@ > > #define elf_addr_to_cpu elf64_to_cpu > > -#ifndef Elf_Rel > -#define Elf_Rel Elf64_Rel > -#endif /* Elf_Rel */ > - > static inline bool elf_is_elf_file(const struct elfhdr *ehdr) > { > return memcmp(ehdr->e_ident, ELFMAG, SELFMAG) == 0; Could you remove this one from the file in arch/powerpc as well? Perhaps this and the previous patch could be placed before patch 1, so that this change can be done only once. In any case: Reviewed-by: Thiago Jung Bauermann -- Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v4 3/7] kexec_elf: remove parsing of section headers
Sven Schnelle writes: > We're not using them, so we can drop the parsing. > > Signed-off-by: Sven Schnelle Reviewed-by: Thiago Jung Bauermann -- Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v4 2/7] kexec_elf: change order of elf_*_to_cpu() functions
Sven Schnelle writes: > Change the order to have a 64/32/16 order, no functional change. > > Signed-off-by: Sven Schnelle Reviewed-by: Thiago Jung Bauermann -- Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH] powerpc: Fix loading of kernel + initramfs with kexec_file_load()
Michael Ellerman writes: > On Wed, 2019-05-22 at 22:01:58 UTC, Thiago Jung Bauermann wrote: >> Commit b6664ba42f14 ("s390, kexec_file: drop arch_kexec_mem_walk()") >> changed kexec_add_buffer() to skip searching for a memory location if >> kexec_buf.mem is already set, and use the address that is there. >> >> In powerpc code we reuse a kexec_buf variable for loading both the kernel >> and the initramfs by resetting some of the fields between those uses, but >> not mem. This causes kexec_add_buffer() to try to load the kernel at the >> same address where initramfs will be loaded, which is naturally rejected: >> >> # kexec -s -l --initrd initramfs vmlinuz >> kexec_file_load failed: Invalid argument >> >> Setting the mem field before every call to kexec_add_buffer() fixes this >> regression. >> >> Fixes: b6664ba42f14 ("s390, kexec_file: drop arch_kexec_mem_walk()") >> Signed-off-by: Thiago Jung Bauermann >> Reviewed-by: Dave Young > > Applied to powerpc fixes, thanks. > > https://git.kernel.org/powerpc/c/8b909e3548706cbebc0a676067b81aad Thanks!! -- Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH] powerpc: Fix loading of kernel + initramfs with kexec_file_load()
Dave Young writes: > On 05/22/19 at 07:01pm, Thiago Jung Bauermann wrote: >> Commit b6664ba42f14 ("s390, kexec_file: drop arch_kexec_mem_walk()") >> changed kexec_add_buffer() to skip searching for a memory location if >> kexec_buf.mem is already set, and use the address that is there. >> >> In powerpc code we reuse a kexec_buf variable for loading both the kernel >> and the initramfs by resetting some of the fields between those uses, but >> not mem. This causes kexec_add_buffer() to try to load the kernel at the >> same address where initramfs will be loaded, which is naturally rejected: >> >> # kexec -s -l --initrd initramfs vmlinuz >> kexec_file_load failed: Invalid argument >> >> Setting the mem field before every call to kexec_add_buffer() fixes this >> regression. >> >> Fixes: b6664ba42f14 ("s390, kexec_file: drop arch_kexec_mem_walk()") >> Signed-off-by: Thiago Jung Bauermann >> --- >> arch/powerpc/kernel/kexec_elf_64.c | 6 +- >> 1 file changed, 5 insertions(+), 1 deletion(-) > > Reviewed-by: Dave Young Thanks! -- Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
[PATCH] powerpc: Fix loading of kernel + initramfs with kexec_file_load()
Commit b6664ba42f14 ("s390, kexec_file: drop arch_kexec_mem_walk()") changed kexec_add_buffer() to skip searching for a memory location if kexec_buf.mem is already set, and use the address that is there. In powerpc code we reuse a kexec_buf variable for loading both the kernel and the initramfs by resetting some of the fields between those uses, but not mem. This causes kexec_add_buffer() to try to load the kernel at the same address where initramfs will be loaded, which is naturally rejected: # kexec -s -l --initrd initramfs vmlinuz kexec_file_load failed: Invalid argument Setting the mem field before every call to kexec_add_buffer() fixes this regression. Fixes: b6664ba42f14 ("s390, kexec_file: drop arch_kexec_mem_walk()") Signed-off-by: Thiago Jung Bauermann --- arch/powerpc/kernel/kexec_elf_64.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/kernel/kexec_elf_64.c b/arch/powerpc/kernel/kexec_elf_64.c index ba4f18a43ee8..52a29fc73730 100644 --- a/arch/powerpc/kernel/kexec_elf_64.c +++ b/arch/powerpc/kernel/kexec_elf_64.c @@ -547,6 +547,7 @@ static int elf_exec_load(struct kimage *image, struct elfhdr *ehdr, kbuf.memsz = phdr->p_memsz; kbuf.buf_align = phdr->p_align; kbuf.buf_min = phdr->p_paddr + base; + kbuf.mem = KEXEC_BUF_MEM_UNKNOWN; ret = kexec_add_buffer(); if (ret) goto out; @@ -581,7 +582,8 @@ static void *elf64_load(struct kimage *image, char *kernel_buf, struct kexec_buf kbuf = { .image = image, .buf_min = 0, .buf_max = ppc64_rma_size }; struct kexec_buf pbuf = { .image = image, .buf_min = 0, - .buf_max = ppc64_rma_size, .top_down = true }; + .buf_max = ppc64_rma_size, .top_down = true, + .mem = KEXEC_BUF_MEM_UNKNOWN }; ret = build_elf_exec_info(kernel_buf, kernel_len, , _info); if (ret) @@ -606,6 +608,7 @@ static void *elf64_load(struct kimage *image, char *kernel_buf, kbuf.bufsz = kbuf.memsz = initrd_len; kbuf.buf_align = PAGE_SIZE; kbuf.top_down = false; + kbuf.mem = KEXEC_BUF_MEM_UNKNOWN; ret = kexec_add_buffer(); if (ret) goto out; @@ -638,6 +641,7 @@ static void *elf64_load(struct kimage *image, char *kernel_buf, kbuf.bufsz = kbuf.memsz = fdt_size; kbuf.buf_align = PAGE_SIZE; kbuf.top_down = true; + kbuf.mem = KEXEC_BUF_MEM_UNKNOWN; ret = kexec_add_buffer(); if (ret) goto out;
Re: [PATCH v2 7/7] ima: Support platform keyring for kernel appraisal
Nayna Jain writes: > On secure boot enabled systems, the bootloader verifies the kernel > image and possibly the initramfs signatures based on a set of keys. A > soft reboot(kexec) of the system, with the same kernel image and > initramfs, requires access to the original keys to verify the > signatures. > > This patch allows IMA-appraisal access to those original keys, now > loaded on the platform keyring, needed for verifying the kernel image > and initramfs signatures. > > Signed-off-by: Nayna Jain > Reviewed-by: Mimi Zohar > Acked-by: Serge Hallyn > - replace 'rc' with 'xattr_len' when calling integrity_digsig_verify() > with INTEGRITY_KEYRING_IMA for readability > Suggested-by: Serge Hallyn > --- > Changelog: > > v2: > - replace 'rc' with 'xattr_len' when calling integrity_digsig_verify() > with INTEGRITY_KEYRING_IMA for readability > > security/integrity/ima/ima_appraise.c | 13 +++-- > 1 file changed, 11 insertions(+), 2 deletions(-) With the change to only access the platform keyring when it is enabled: Reviewed-by: Thiago Jung Bauermann -- Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v2 2/7] integrity: Load certs to the platform keyring
Nayna Jain writes: > The patch refactors integrity_load_x509(), making it a wrapper for a new > function named integrity_add_key(). This patch also defines a new > function named integrity_load_cert() for loading the platform keys. > > Signed-off-by: Nayna Jain > Reviewed-by: Mimi Zohar > Acked-by: Serge Hallyn > --- > security/integrity/digsig.c| 71 > ++ > security/integrity/integrity.h | 20 ++ > .../integrity/platform_certs/platform_keyring.c| 23 +++ > 3 files changed, 90 insertions(+), 24 deletions(-) Reviewed-by: Thiago Jung Bauermann -- Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v2 1/7] integrity: Define a trusted platform keyring
Nayna Jain writes: > On secure boot enabled systems, a verified kernel may need to kexec > additional kernels. For example, it may be used as a bootloader needing > to kexec a target kernel or it may need to kexec a crashdump kernel. In > such cases, it may want to verify the signature of the next kernel > image. > > It is further possible that the kernel image is signed with third party > keys which are stored as platform or firmware keys in the 'db' variable. > The kernel, however, can not directly verify these platform keys, and an > administrator may therefore not want to trust them for arbitrary usage. > In order to differentiate platform keys from other keys and provide the > necessary separation of trust, the kernel needs an additional keyring to > store platform keys. > > This patch creates the new keyring called ".platform" to isolate keys > provided by platform from keys by kernel. These keys are used to > facilitate signature verification during kexec. Since the scope of this > keyring is only the platform/firmware keys, it cannot be updated from > userspace. > > This keyring can be enabled by setting CONFIG_INTEGRITY_PLATFORM_KEYRING. > > Signed-off-by: Nayna Jain > Reviewed-by: Mimi Zohar > Acked-by: Serge Hallyn > --- > security/integrity/Kconfig | 11 + > security/integrity/Makefile| 1 + > security/integrity/digsig.c| 48 > +++--- > security/integrity/integrity.h | 3 +- > .../integrity/platform_certs/platform_keyring.c| 35 > 5 files changed, 83 insertions(+), 15 deletions(-) > create mode 100644 security/integrity/platform_certs/platform_keyring.c Reviewed-by: Thiago Jung Bauermann -- Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v2 7/7] ima: Support platform keyring for kernel appraisal
Hello, Nayna Jain writes: > On secure boot enabled systems, the bootloader verifies the kernel > image and possibly the initramfs signatures based on a set of keys. A > soft reboot(kexec) of the system, with the same kernel image and > initramfs, requires access to the original keys to verify the > signatures. > > This patch allows IMA-appraisal access to those original keys, now > loaded on the platform keyring, needed for verifying the kernel image > and initramfs signatures. > > Signed-off-by: Nayna Jain > Reviewed-by: Mimi Zohar > Acked-by: Serge Hallyn > - replace 'rc' with 'xattr_len' when calling integrity_digsig_verify() > with INTEGRITY_KEYRING_IMA for readability > Suggested-by: Serge Hallyn > --- > Changelog: > > v2: > - replace 'rc' with 'xattr_len' when calling integrity_digsig_verify() > with INTEGRITY_KEYRING_IMA for readability > > security/integrity/ima/ima_appraise.c | 13 +++-- > 1 file changed, 11 insertions(+), 2 deletions(-) > > diff --git a/security/integrity/ima/ima_appraise.c > b/security/integrity/ima/ima_appraise.c > index deec1804a00a..e8f520450895 100644 > --- a/security/integrity/ima/ima_appraise.c > +++ b/security/integrity/ima/ima_appraise.c > @@ -289,12 +289,21 @@ int ima_appraise_measurement(enum ima_hooks func, > case EVM_IMA_XATTR_DIGSIG: > set_bit(IMA_DIGSIG, >atomic_flags); > rc = integrity_digsig_verify(INTEGRITY_KEYRING_IMA, > - (const char *)xattr_value, rc, > + (const char *)xattr_value, > + xattr_len, >iint->ima_hash->digest, >iint->ima_hash->length); > if (rc == -EOPNOTSUPP) { > status = INTEGRITY_UNKNOWN; > - } else if (rc) { > + break; > + } > + if (rc && func == KEXEC_KERNEL_CHECK) > + rc = integrity_digsig_verify(INTEGRITY_KEYRING_PLATFORM, > + (const char *)xattr_value, > + xattr_len, > + iint->ima_hash->digest, > + iint->ima_hash->length); If CONFIG_INTEGRITY_PLATFORM_KEYRING=n the second call to integrity_digsig_verify() above will always fail, and the audit message of failed signature verifications for KEXEC_KERNEL will always log the same rc value, which is whatever request_key() returns when asked to look for an inexistent keyring. Here is a patch which only performs the second try if the platform keyring is enabled. >From d5fb94ab9eb13f6294f8dc44d1344cb85dfa41b8 Mon Sep 17 00:00:00 2001 From: Thiago Jung Bauermann Date: Wed, 12 Dec 2018 16:02:09 -0200 Subject: [PATCH] ima: Only use the platform keyring if it's enabled Signed-off-by: Thiago Jung Bauermann --- security/integrity/ima/ima_appraise.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/security/integrity/ima/ima_appraise.c b/security/integrity/ima/ima_appraise.c index e8f520450895..f6ac405daabb 100644 --- a/security/integrity/ima/ima_appraise.c +++ b/security/integrity/ima/ima_appraise.c @@ -297,7 +297,8 @@ int ima_appraise_measurement(enum ima_hooks func, status = INTEGRITY_UNKNOWN; break; } - if (rc && func == KEXEC_KERNEL_CHECK) + if (IS_ENABLED(CONFIG_INTEGRITY_PLATFORM_KEYRING) && rc && + func == KEXEC_KERNEL_CHECK) rc = integrity_digsig_verify(INTEGRITY_KEYRING_PLATFORM, (const char *)xattr_value, xattr_len, ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
[PATCH] powerpc: kexec_file: Fix error code when trying to load kdump kernel
kexec_file_load() on powerpc doesn't support kdump kernels yet, so it returns -ENOTSUPP in that case. I've recently learned that this errno is internal to the kernel and isn't supposed to be exposed to userspace. Therefore, change to -EOPNOTSUPP which is defined in an uapi header. This does indeed make kexec-tools happier. Before the patch, on ppc64le: # ~bauermann/src/kexec-tools/build/sbin/kexec -s -p /boot/vmlinuz kexec_file_load failed: Unknown error 524 After the patch: # ~bauermann/src/kexec-tools/build/sbin/kexec -s -p /boot/vmlinuz kexec_file_load failed: Operation not supported Fixes: a0458284f062 ("powerpc: Add support code for kexec_file_load()") Reported-by: Dave Young <dyo...@redhat.com> Signed-off-by: Thiago Jung Bauermann <bauer...@linux.vnet.ibm.com> --- arch/powerpc/kernel/machine_kexec_file_64.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) This is a minor issue, but since it's a simple patch it might be worth applying it to stable branches. This is the kexec-tools thread where this problem was brought up: https://lists.infradead.org/pipermail/kexec/2018-March/020346.html And this is an instance of a similar fix being applied elsewhere in the kernel, for the same reasons: https://patchwork.kernel.org/patch/8490791/ The test shown in the commit log was made using Hari Bathini's patch adding kexec_file_load() support to kexec-tools in ppc64. diff --git a/arch/powerpc/kernel/machine_kexec_file_64.c b/arch/powerpc/kernel/machine_kexec_file_64.c index e4395f937d63..45e0b7d5f200 100644 --- a/arch/powerpc/kernel/machine_kexec_file_64.c +++ b/arch/powerpc/kernel/machine_kexec_file_64.c @@ -43,7 +43,7 @@ int arch_kexec_kernel_image_probe(struct kimage *image, void *buf, /* We don't support crash kernels yet. */ if (image->type == KEXEC_TYPE_CRASH) - return -ENOTSUPP; + return -EOPNOTSUPP; for (i = 0; i < ARRAY_SIZE(kexec_file_loaders); i++) { fops = kexec_file_loaders[i]; ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v5 4/5] kexec: Add option to fall back to KEXEC_LOAD when KEXEC_FILE_LOAD is not supported
else >> > > +if (do_kexec_fallback) switch (result) { >> > > +/* >> > > + * Something failed with >> > > signature verification. >> > > + * Reject the image. >> > > + */ >> > > +case -ELIBBAD: >> > > +case -EKEYREJECTED: >> > > +case -ENOPKG: >> > > +case -ENOKEY: >> > > +case -EBADMSG: >> > > +case -EMSGSIZE: >> > > +/* >> > > + * By default reject or >> > > do nothing if >> > > + * succeded >> > > + */ >> > > +default: break; >> > > +case -ENOSYS: /* not implemented >> > > */ >> > > +/* >> > > + * Parsing image or >> > > other options failed >> > > + * The image may be >> > > invalid or image >> > > + * type may not >> > > supported by kernel so >> > > + * retry parsing in >> > > kexec-tools. >> > > + */ >> > > +case -EINVAL: >> > > +case -ENOEXEC: >> > > + /* >> > > + * ENOTSUPP can be >> > > unsupported image >> > > + * type or unsupported >> > > PE signature >> > > + * wrapper type, duh >> > > + */ >> > > +case -ENOTSUP: >> > >> > Hmm, this is still used in latest version. kernel does not return >> > such error number, I might not say clearly previously. Please >> > check the kernel code, the only one place I know is because no >> > kdump support in power kexec_file: >> > arch/powerpc/kernel/machine_kexec_file_64.c >> > >> > /* We don't support crash kernels yet. */ >> > if (image->type == KEXEC_TYPE_CRASH) >> > return -ENOTSUPP; >> > >> > So I suggest not checking this as well since -ENOTSUPP is not >> > populated in userspace headers, and -ENOTSUP is not used at all. >> > >> > Also as I mentioned in another reply -EINVAL and -ENOEXEC is also >> > not ncessary. >> > >> > For -ENOTSUP, maybe someone can submit a patch to switch to >> > -ENOTSUPP so that userspace can check it. >> > Ccing Thiago and Hari for the -ENOTSUPP errno issue. >> >> Oops for the hurry reply, I means -ENOTSUPP might be able to replaced >> with -EOPNOTSUPP, a similar change like this: >> https://patchwork.kernel.org/patch/8490791/ > > Thanks for catching this. In Linux ENOTSUPP with extra P is different > from EOPNOTSUPP and ENOTSUP (single P). Since we are talking to the > kernel and it returns the double P ENOTSUPP we need to define it in > kexec as well. And we should check ENOTSUP with single P in case > somebody some day thinks that returning undefined error codes to > userspace is not nice like in the patch above. I wasn't aware that ENOTSUPP was an in-kernel only errno. Should I submit a patch for the kernel so that powerpc returns -EOPNOTSUPP in case of trying to load kdump kernel with kexec_file_load()? -- Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH] kexec/ppc64: leverage kexec_file_load support
Hello Hari, Hari Bathini <hbath...@linux.vnet.ibm.com> writes: > PPC64 kernel now supports kexec_file_load system call. Leverage it by > enabling that support here. Note that loading crash kernel with this > system call is not yet supported in the kernel and trying to load one > will fail with '-ENOTSUPP' error. > > Signed-off-by: Hari Bathini <hbath...@linux.vnet.ibm.com> > --- > kexec/arch/ppc64/kexec-elf-ppc64.c | 84 > > kexec/kexec-syscall.h |3 + > 2 files changed, 87 insertions(+) Thanks for implementing this! Looks good to me, just one nit below. Regardless of that: Reviewed-by: Thiago Jung Bauermann <bauer...@linux.vnet.ibm.com> > diff --git a/kexec/arch/ppc64/kexec-elf-ppc64.c > b/kexec/arch/ppc64/kexec-elf-ppc64.c > index ddd3de8..2742cd6 100644 > --- a/kexec/arch/ppc64/kexec-elf-ppc64.c > +++ b/kexec/arch/ppc64/kexec-elf-ppc64.c > @@ -117,6 +196,9 @@ int elf_ppc64_load(int argc, char **argv, const char > *buf, off_t len, > uint32_t my_run_at_load; > unsigned int slave_code[256/sizeof (unsigned int)], master_entry; > > + if (info->file_mode) > + return elf_ppc64_load_file(argc, argv, info); > + > /* See options.h -- add any more there, too. */ > static const struct option options[] = { > KEXEC_ARCH_OPTIONS This is placing executable code between variable declarations. It may be fine for gcc but it's more idiomatic C to put it after all variable declarations. But perhaps the kexec-tools style is fine with it? -- Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v2 2/7] kexec_file, x86, powerpc: factor out kexec_file_ops functions
Dave Young <dyo...@redhat.com> writes: > On 03/06/18 at 07:22pm, AKASHI Takahiro wrote: >> As arch_kexec_kernel_image_{probe,load}(), >> arch_kimage_file_post_load_cleanup() and arch_kexec_kernel_verify_sig() >> are almost duplicated among architectures, they can be commonalized with >> an architecture-defined kexec_file_ops array. So let's factor them out. >> >> Signed-off-by: AKASHI Takahiro <takahiro.aka...@linaro.org> >> Cc: Dave Young <dyo...@redhat.com> >> Cc: Vivek Goyal <vgo...@redhat.com> >> Cc: Baoquan He <b...@redhat.com> >> Cc: Michael Ellerman <m...@ellerman.id.au> >> Cc: Thiago Jung Bauermann <bauer...@linux.vnet.ibm.com> >> --- >> arch/powerpc/include/asm/kexec.h| 2 +- >> arch/powerpc/kernel/kexec_elf_64.c | 2 +- >> arch/powerpc/kernel/machine_kexec_file_64.c | 39 ++- >> arch/x86/include/asm/kexec-bzimage64.h | 2 +- >> arch/x86/kernel/kexec-bzimage64.c | 2 +- >> arch/x86/kernel/machine_kexec_64.c | 45 +- >> include/linux/kexec.h | 13 +++ >> kernel/kexec_file.c | 60 >> +++++++-- >> 8 files changed, 71 insertions(+), 94 deletions(-) >> > > For this patch it also needs some review from powerpc people. FWIW: Reviewed-by: Thiago Jung Bauermann <bauer...@linux.vnet.ibm.com> Also, tested on a ppc64le KVM guest: Tested-by: Thiago Jung Bauermann <bauer...@linux.vnet.ibm.com> -- Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH 14/14] arm64: kexec_file: add vmlinux format support
Mark Rutland <mark.rutl...@arm.com> writes: > On Thu, Aug 24, 2017 at 06:30:50PM +0100, Mark Rutland wrote: >> On Thu, Aug 24, 2017 at 05:18:11PM +0900, AKASHI Takahiro wrote: >> > The first PT_LOAD segment, which is assumed to be "text" code, in vmlinux >> > will be loaded at the offset of TEXT_OFFSET from the begining of system >> > memory. The other PT_LOAD segments are placed relative to the first one. >> >> I really don't like assuming things about the vmlinux ELF file. >> >> > Regarding kernel verification, since there is no standard way to contain >> > a signature within elf binary, we follow PowerPC's (not yet upstreamed) >> > approach, that is, appending a signature right after the kernel binary >> > itself like module signing. >> >> I also *really* don't like this. It's a bizarre in-band mechanism, >> without explcit information. It's not a nice ABI. >> >> If we can load an Image, why do we need to be able to load a vmlinux? > > So IIUC, the whole point of this is to be able to kexec_file_load() a > vmlinux + signature bundle, for !CONFIG_EFI kernels. > > For that, I think that we actually need a new kexec_file_load${N} > syscall, where we can pass the signature for the kernel as a separate > file. Ideally also with a flags argument and perhaps the ability to sign > the initrd too. > > That way we don't ahve to come up with a magic vmlinux+signature format, > as we can just pass a regular image and a signature for that image > separately. That should work for PPC and others, too. powerpc uses the same format that is used for signed kernel modules, which is a signature appended at the end of the file. It doesn't need to be passed separately since it's embedded in the file itself. The kernel already has a mechanism to verify signatures that aren't embedded in the file: it's possible to use IMA via the LSM hook in kernel_read_file_from_fd (which is called in kimage_file_prepare_segments) to verify a signature stored in an extended attribute by using an IMA policy rule such as: appraise func=KEXEC_KERNEL_CHECK appraise_type=imasig Of course, that only works if the kernel image is stored in a filesystem which supports extended attributes. But that is the case of most filesystems nowadays, with the notable exception of FAT-based filesystems. evmctl, the IMA userspace tool, also support signatures stored in a separate file as well ("sidecar" signatures), but the kernel can only verify them if they are copied into an xattr (which I believe the userspace tool can do). -- Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH 08/14] arm64: kexec_file: create purgatory
Mark Rutland <mark.rutl...@arm.com> writes: > On Fri, Aug 25, 2017 at 10:00:59AM +0900, AKASHI Takahiro wrote: >> On Thu, Aug 24, 2017 at 05:56:17PM +0100, Mark Rutland wrote: >> > On Thu, Aug 24, 2017 at 05:18:05PM +0900, AKASHI Takahiro wrote: >> > > This is a basic purgtory, or a kind of glue code between the two kernel, >> > > for arm64. We will later add a feature of verifying a digest check >> > > against >> > > loaded memory segments. >> > > >> > > arch_kexec_apply_relocations_add() is responsible for re-linking any >> > > relative symbols in purgatory. Please note that the purgatory is not >> > > an executable, but a non-linked archive of binaries so relative symbols >> > > contained here must be resolved at kexec load time. >> > > Despite that arm64_kernel_start and arm64_dtb_addr are only such global >> > > variables now, arch_kexec_apply_relocations_add() can manage more various >> > > types of relocations. >> > >> > Why does the purgatory code need to be so complex? >> > >> > Why is it not possible to write this as position-independent asm? >> >> I don't get your point, but please note that these values are also >> re-written by the 1st kernel when it loads the 2nd kernel and so >> they must appear as globals. > > My fear about complexity is that we must "re-link" the purgatory. > > I don't understand why that has to be necessary. Surely we can have the > purgatory code be position independent, and store those globals in a > single struct purgatory_info that we can fill in from the host? > > i.e. similar to what we do for values shared with the VDSO, where we > just poke vdso_data->field, no re-linking required. Right. I'm not sure why it is a partially linked object. I believe that the purgatory could be linked at build time into a PIE executable with exported symbols for the variables that need to be filled in from the host. On some architectures (e.g., powerpc), this would greatly reduce the number of relocation types that the kernel needs to know how to process. On x86 it make less of a difference because the partially linked object already has just a handful of relocation types. > Otherwise, why can't the purgatory code be written in assembly? AFAICT, > the only complex part is the hashing code, which I don't beleive is > strictly necessary. When I posted a similar series for powerpc with similar changes to handle a partially linked purgatory in the kernel, Michael Ellerman preferred to go for a purgatory written in assembly, partially based on the one from kexec-lite. That purgatory doesn't do the checksum verification of the segments. -- Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH] kexec: allocate buffer in top-down, if specified, correctly
Am Freitag, 28. April 2017, 09:51:39 BRT schrieb AKASHI Takahiro: > On Thu, Apr 27, 2017 at 07:00:04PM -0300, Thiago Jung Bauermann wrote: > > Hello, > > > > Am Mittwoch, 26. April 2017, 17:22:09 BRT schrieb AKASHI Takahiro: > > > The current kexec_locate_mem_hole(kbuf.top_down == 1) stops searching at > > > the first memory region that has enough space for requested size even if > > > some of higher regions may also have. > > > > kexec_locate_mem_hole expects arch_kexec_walk_mem to walk memory from top > > to bottom if top_down is true. That is what powerpc's version does. > > Ah, I haven't noticed that, but x86 doesn't have arch_kexec_walk_mem and > how can it work for x86? Looking at v4.9's kexec_add_buffer, the logic has been this way before I factored kexec_locate_mem_hole out of it. So x86 has been behaving this way for a while. > > Isn't it possible to walk resources from top to bottom? > > Yes, it will be, but it seems to me that such a behavior is not intuitive > and even confusing if it doesn't come with explicit explanation. Yes, I should have put a comment pointing out that assumption. -- Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v10 04/10] kexec_file: Add support for purgatory built as PIE.
Am Mittwoch, 23. November 2016, 09:32:58 BRST schrieb Dave Young: > On 11/22/16 at 11:44am, Thiago Jung Bauermann wrote: > > Am Dienstag, 22. November 2016, 17:01:10 BRST schrieb Michael Ellerman: > > > Thiago Jung Bauermann <bauer...@linux.vnet.ibm.com> writes: > > > > Am Sonntag, 20. November 2016, 10:45:46 BRST schrieb Dave Young: > > > >> On 11/10/16 at 01:27am, Thiago Jung Bauermann wrote: > > > >> > powerpc's purgatory.ro has 12 relocation types when built as > > > >> > a relocatable object. To implement support for them requires > > > >> > arch_kexec_apply_relocations_add to duplicate a lot of code with > > > >> > module_64.c:apply_relocate_add. > > > >> > > > > >> > When built as a Position Independent Executable there are only 4 > > > >> > relocation types in purgatory.ro, so it becomes practical for the > > > >> > powerpc > > > >> > implementation of kexec_file to have its own relocation > > > >> > implementation. > > > >> > > > > >> > Also, the purgatory is an executable and not an intermediary output > > > >> > from > > > >> > the compiler so it makes sense conceptually that it is easier to > > > >> > build > > > >> > it as a PIE than as a partially linked object. > > > >> > > > > >> > Apart from the greatly reduced number of relocations, there are two > > > >> > differences between a relocatable object and a PIE: > > > >> > > > > >> > 1. __kexec_load_purgatory needs to use the program headers rather > > > >> > than > > > >> > the > > > >> > > > > >> >section headers to figure out how to load the binary. > > > >> > > > > >> > 2. Symbol values are absolute addresses instead of relative to the > > > >> > > > > >> >start of the section. > > > >> > > > > >> > This patch adds the support needed in generic code for the > > > >> > differences > > > >> > above and allows powerpc to load and relocate a position > > > >> > independent > > > >> > purgatory. > > > >> > > > >> [snip] > > > >> > > > >> The kexec-tools machine_apply_elf_rel is pretty simple for ppc64, it > > > >> is > > > >> not that complex. So could you look into simplify your kexec_file > > > >> implementation? > > > > > > > > I can try, but there is one fundamental issue here: powerpc > > > > position-dependent code relies more on relocations than x86 > > > > position-dependent code does, so there's a limit to how simple it can > > > > be > > > > made without switching to position- independent code. And it will > > > > always > > > > be more involved than it is on x86. > > > > > > I think we need to go back to the drawing board on this one. > > > > > > My hope was that building purgatory as PIE would reduce the amount of > > > complexity, but instead it's just added more. Sorry for sending you in > > > that direction. > > > > It added complexity because in my series powerpc was using a PIE purgatory > > but x86 kept using a partially-linked object (because of the problem I > > mentioned I had when trying out a PIE x86 purgatory), so generic code > > needed two purgatory loaders. > > > > I'll see if I can make the PIE x86 purgatory to work so that generic code > > can have only one loader implementation. Then it will indeed be simpler. > Do we really need the PIE purgatory, after moving generic code out of > x86, there will be no much benefit, no? It still makes a big difference on powerpc, even after moving out the generic code. I just got the PIE purgatory working on x86 and it also simplifies the code there, so it's a win for both architectures. I'll clean up the code and post tomorrow so that you can see what you think. > Anyway, the first step should be > making the purgatory code more generic so that it can be easier for > other arches to support kexec_file in the future. I'll try putting sha256.c in lib/purgatory/ as you suggested. -- Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v10 04/10] kexec_file: Add support for purgatory built as PIE.
Am Dienstag, 22. November 2016, 17:01:10 BRST schrieb Michael Ellerman: > Thiago Jung Bauermann <bauer...@linux.vnet.ibm.com> writes: > > Am Sonntag, 20. November 2016, 10:45:46 BRST schrieb Dave Young: > >> On 11/10/16 at 01:27am, Thiago Jung Bauermann wrote: > >> > powerpc's purgatory.ro has 12 relocation types when built as > >> > a relocatable object. To implement support for them requires > >> > arch_kexec_apply_relocations_add to duplicate a lot of code with > >> > module_64.c:apply_relocate_add. > >> > > >> > When built as a Position Independent Executable there are only 4 > >> > relocation types in purgatory.ro, so it becomes practical for the > >> > powerpc > >> > implementation of kexec_file to have its own relocation implementation. > >> > > >> > Also, the purgatory is an executable and not an intermediary output > >> > from > >> > the compiler so it makes sense conceptually that it is easier to build > >> > it as a PIE than as a partially linked object. > >> > > >> > Apart from the greatly reduced number of relocations, there are two > >> > differences between a relocatable object and a PIE: > >> > > >> > 1. __kexec_load_purgatory needs to use the program headers rather than > >> > the > >> > > >> >section headers to figure out how to load the binary. > >> > > >> > 2. Symbol values are absolute addresses instead of relative to the > >> > > >> >start of the section. > >> > > >> > This patch adds the support needed in generic code for the differences > >> > above and allows powerpc to load and relocate a position independent > >> > purgatory. > >> > >> [snip] > >> > >> The kexec-tools machine_apply_elf_rel is pretty simple for ppc64, it is > >> not that complex. So could you look into simplify your kexec_file > >> implementation? > > > > I can try, but there is one fundamental issue here: powerpc > > position-dependent code relies more on relocations than x86 > > position-dependent code does, so there's a limit to how simple it can be > > made without switching to position- independent code. And it will always > > be more involved than it is on x86. > I think we need to go back to the drawing board on this one. > > My hope was that building purgatory as PIE would reduce the amount of > complexity, but instead it's just added more. Sorry for sending you in > that direction. It added complexity because in my series powerpc was using a PIE purgatory but x86 kept using a partially-linked object (because of the problem I mentioned I had when trying out a PIE x86 purgatory), so generic code needed two purgatory loaders. I'll see if I can make the PIE x86 purgatory to work so that generic code can have only one loader implementation. Then it will indeed be simpler. Am Dienstag, 22. November 2016, 14:16:22 BRST schrieb Dave Young: > Hi Michael > > On 11/22/16 at 05:01pm, Michael Ellerman wrote: > > In general I dislike the level of complexity of the kexec-tools > > purgatory, and in particular I'm not comfortable with things like: > > > > diff --git a/arch/powerpc/purgatory/sha256.c > > b/arch/powerpc/purgatory/sha256.c new file mode 100644 > > index ..6abee1877d56 > > --- /dev/null > > +++ b/arch/powerpc/purgatory/sha256.c > > @@ -0,0 +1,6 @@ > > +#include "../boot/string.h" > > + > > +/* Avoid including x86's boot/string.h in sha256.c. */ > > +#define BOOT_STRING_H > > + > > +#include "../../x86/purgatory/sha256.c" > > Agreed, include x86 code in powerpc looks bad > > > I think the best way to get this over the line would be to take the > > kexec-lite purgatory implementation and use that to begin with. I know > > it doesn't have all the features of the kexec-tools version, but it > > should work, and we can look at adding the extra features later. > > Instead of adding other implementation, moving the purgatory sha256 code > out of x86 sounds better so that we can reuse them cleanly.. Do you have a suggestion of where that code can live so that it can be shared between purgatories for different arches? Do we need a purgatory with generic and arch-specific code like in kexec- tools? -- Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v10 04/10] kexec_file: Add support for purgatory built as PIE.
Hello Dave, Thanks for your review. Am Sonntag, 20. November 2016, 10:45:46 BRST schrieb Dave Young: > On 11/10/16 at 01:27am, Thiago Jung Bauermann wrote: > > powerpc's purgatory.ro has 12 relocation types when built as > > a relocatable object. To implement support for them requires > > arch_kexec_apply_relocations_add to duplicate a lot of code with > > module_64.c:apply_relocate_add. > > > > When built as a Position Independent Executable there are only 4 > > relocation types in purgatory.ro, so it becomes practical for the powerpc > > implementation of kexec_file to have its own relocation implementation. > > > > Also, the purgatory is an executable and not an intermediary output from > > the compiler so it makes sense conceptually that it is easier to build > > it as a PIE than as a partially linked object. > > > > Apart from the greatly reduced number of relocations, there are two > > differences between a relocatable object and a PIE: > > > > 1. __kexec_load_purgatory needs to use the program headers rather than the > > > >section headers to figure out how to load the binary. > > > > 2. Symbol values are absolute addresses instead of relative to the > > > >start of the section. > > > > This patch adds the support needed in generic code for the differences > > above and allows powerpc to load and relocate a position independent > > purgatory. > > [snip] > > The kexec-tools machine_apply_elf_rel is pretty simple for ppc64, it is > not that complex. So could you look into simplify your kexec_file > implementation? I can try, but there is one fundamental issue here: powerpc position-dependent code relies more on relocations than x86 position-dependent code does, so there's a limit to how simple it can be made without switching to position- independent code. And it will always be more involved than it is on x86. BTW, building x86's purgatory as PIE results in it not having any relocation at all, so it's an advantage even in that architecture. Unfortunately, the machine locks up during reboot and I didn't have time to try to figure out what's going on. > kernel/kexec_file.c kexec_apply_relocations only do limited things > and some of the logic is in arch/x86, so move general code out of arch > code, then I guess the arch code will be simpler I agree that is a good idea. Is the patch below what you had in mind? > and then we probably do not need this PIE stuff anymore. If you are ok with the patch below I can post a new version of the series based on it and we can see if Michael Ellerman thinks it is enough. > BTW, __kexec_really_load_purgatory looks worse than > ___kexec_load_purgatory ;) Really? I find the special handling of bss makes the section-based loader a bit more confusing. -- Thiago Jung Bauermann IBM Linux Technology Center Subject: [PATCH] kexec_file: Move generic relocation code from arch/x86 to kernel/kexec_file.c The check for undefined symbols stays in arch-specific code because powerpc needs to allow TOC symbols to be processed even though they're undefined. There is no functional change. Suggested-by: Dave Young <dyo...@redhat.com> Signed-off-by: Thiago Jung Bauermann <bauer...@linux.vnet.ibm.com> --- arch/x86/kernel/machine_kexec_64.c | 160 +++-- include/linux/kexec.h | 9 ++- kernel/kexec_file.c| 120 +++- 3 files changed, 154 insertions(+), 135 deletions(-) diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c index 8c1f218926d7..f4860c408ece 100644 --- a/arch/x86/kernel/machine_kexec_64.c +++ b/arch/x86/kernel/machine_kexec_64.c @@ -401,143 +401,45 @@ int arch_kexec_kernel_verify_sig(struct kimage *image, void *kernel, } #endif -/* - * Apply purgatory relocations. - * - * ehdr: Pointer to elf headers - * sechdrs: Pointer to section headers. - * relsec: section index of SHT_RELA section. - * - * TODO: Some of the code belongs to generic code. Move that in kexec.c. - */ -int arch_kexec_apply_relocations_add(const Elf64_Ehdr *ehdr, -Elf64_Shdr *sechdrs, unsigned int relsec) +int arch_kexec_apply_relocation_add(const Elf_Ehdr *ehdr, Elf_Shdr *sechdrs, + unsigned int reltype, Elf_Sym *sym, + const char *name, unsigned long *location, + unsigned long address, unsigned long value) { - unsigned int i; - Elf64_Rela *rel; - Elf64_Sym *sym; - void *location; - Elf64_Shdr *section, *symtabsec; - unsigned long address, sec_base, value; - const char *strtab, *name, *shstrtab; - - /* -* ->sh_offset has been modified
[PATCH v10 06/10] powerpc: Implement kexec_file_load.
Add arch-specific functions needed by the generic kexec_file code. Signed-off-by: Josh Sklar <sk...@linux.vnet.ibm.com> Signed-off-by: Thiago Jung Bauermann <bauer...@linux.vnet.ibm.com> --- arch/powerpc/Kconfig| 14 ++ arch/powerpc/include/asm/systbl.h | 1 + arch/powerpc/include/asm/unistd.h | 2 +- arch/powerpc/include/uapi/asm/unistd.h | 1 + arch/powerpc/kernel/Makefile| 1 + arch/powerpc/kernel/machine_kexec_file_64.c | 301 6 files changed, 319 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 6cb59c6e5ba4..a5a7bcf30c05 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -455,6 +455,20 @@ config KEXEC interface is strongly in flux, so no good recommendation can be made. +config KEXEC_FILE + bool "kexec file based system call" + select KEXEC_CORE + select HAVE_KEXEC_FILE_PIE_PURGATORY + select BUILD_BIN2C + depends on PPC64 + depends on CRYPTO=y + depends on CRYPTO_SHA256=y + help + This is a new version of the kexec system call. This call is + file based and takes in file descriptors as system call arguments + for kernel and initramfs as opposed to a list of segments as is the + case for the older kexec call. + config RELOCATABLE bool "Build a relocatable kernel" depends on (PPC64 && !COMPILE_TEST) || (FLATMEM && (44x || FSL_BOOKE)) diff --git a/arch/powerpc/include/asm/systbl.h b/arch/powerpc/include/asm/systbl.h index 2fc5d4db503c..4b369d83fe9c 100644 --- a/arch/powerpc/include/asm/systbl.h +++ b/arch/powerpc/include/asm/systbl.h @@ -386,3 +386,4 @@ SYSCALL(mlock2) SYSCALL(copy_file_range) COMPAT_SYS_SPU(preadv2) COMPAT_SYS_SPU(pwritev2) +SYSCALL(kexec_file_load) diff --git a/arch/powerpc/include/asm/unistd.h b/arch/powerpc/include/asm/unistd.h index cf12c580f6b2..a01e97d3f305 100644 --- a/arch/powerpc/include/asm/unistd.h +++ b/arch/powerpc/include/asm/unistd.h @@ -12,7 +12,7 @@ #include -#define NR_syscalls382 +#define NR_syscalls383 #define __NR__exit __NR_exit diff --git a/arch/powerpc/include/uapi/asm/unistd.h b/arch/powerpc/include/uapi/asm/unistd.h index e9f5f41aa55a..2f26335a3c42 100644 --- a/arch/powerpc/include/uapi/asm/unistd.h +++ b/arch/powerpc/include/uapi/asm/unistd.h @@ -392,5 +392,6 @@ #define __NR_copy_file_range 379 #define __NR_preadv2 380 #define __NR_pwritev2 381 +#define __NR_kexec_file_load 382 #endif /* _UAPI_ASM_POWERPC_UNISTD_H_ */ diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile index 22534a56c914..6de731d90bff 100644 --- a/arch/powerpc/kernel/Makefile +++ b/arch/powerpc/kernel/Makefile @@ -109,6 +109,7 @@ obj-$(CONFIG_PCI) += pci_$(BITS).o $(pci64-y) \ obj-$(CONFIG_PCI_MSI) += msi.o obj-$(CONFIG_KEXEC_CORE) += machine_kexec.o crash.o \ machine_kexec_$(BITS).o +obj-$(CONFIG_KEXEC_FILE) += machine_kexec_file_$(BITS).o obj-$(CONFIG_AUDIT)+= audit.o obj64-$(CONFIG_AUDIT) += compat_audit.o diff --git a/arch/powerpc/kernel/machine_kexec_file_64.c b/arch/powerpc/kernel/machine_kexec_file_64.c new file mode 100644 index ..172f6f736987 --- /dev/null +++ b/arch/powerpc/kernel/machine_kexec_file_64.c @@ -0,0 +1,301 @@ +/* + * ppc64 code to implement the kexec_file_load syscall + * + * Copyright (C) 2004 Adam Litke (a...@us.ibm.com) + * Copyright (C) 2004 IBM Corp. + * Copyright (C) 2005 R Sharada (shar...@in.ibm.com) + * Copyright (C) 2006 Mohan Kumar M (mo...@in.ibm.com) + * Copyright (C) 2016 IBM Corporation + * + * Based on kexec-tools' kexec-elf-ppc64.c. + * Heavily modified for the kernel by + * Thiago Jung Bauermann <bauer...@linux.vnet.ibm.com>. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation (version 2 of the License). + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + */ + +#include +#include +#include +#include + +#define SLAVE_CODE_SIZE256 + +static struct kexec_file_ops *kexec_file_loaders[] = { }; + +int arch_kexec_kernel_image_probe(struct kimage *image, void *buf, + unsigned long buf_len) +{ + int i, ret = -ENOEXEC; + struct kexec_file_ops *fops; + + /* We don't support crash kernels yet. */ + if (image->type == KEXEC_TYPE_CRASH) + return -ENOTSUPP; + + for (i = 0; i < ARRAY_SIZE(kexec_file_
[PATCH v10 08/10] powerpc: Add support for loading ELF kernels with kexec_file_load.
This uses all the infrastructure built up by the previous patches in the series to load an ELF vmlinux file and an initrd. It uses the flattened device tree at initial_boot_params as a base and adjusts memory reservations and its /chosen node for the next kernel. [a...@linux-foundation.org: coding-style fixes] Signed-off-by: Thiago Jung Bauermann <bauer...@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <a...@linux-foundation.org> --- arch/powerpc/include/asm/kexec.h| 12 ++ arch/powerpc/kernel/Makefile| 3 +- arch/powerpc/kernel/kexec_elf_64.c | 279 +++ arch/powerpc/kernel/machine_kexec_file_64.c | 281 +++- 4 files changed, 572 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h index eca2f975bf44..6b8cbcf42466 100644 --- a/arch/powerpc/include/asm/kexec.h +++ b/arch/powerpc/include/asm/kexec.h @@ -91,6 +91,18 @@ static inline bool kdump_in_progress(void) return crashing_cpu >= 0; } +#ifdef CONFIG_KEXEC_FILE +#define PURGATORY_ELF_TYPE ET_EXEC + +extern struct kexec_file_ops kexec_elf64_ops; + +int setup_purgatory(struct kimage *image, const void *slave_code, + const void *fdt, unsigned long kernel_load_addr, + unsigned long fdt_load_addr, unsigned long stack_top); +int setup_new_fdt(void *fdt, unsigned long initrd_load_addr, + unsigned long initrd_len, const char *cmdline); +#endif /* CONFIG_KEXEC_FILE */ + #else /* !CONFIG_KEXEC_CORE */ static inline void crash_kexec_secondary(struct pt_regs *regs) { } diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile index de14b7eb11bb..424b13b1b2b0 100644 --- a/arch/powerpc/kernel/Makefile +++ b/arch/powerpc/kernel/Makefile @@ -109,7 +109,8 @@ obj-$(CONFIG_PCI) += pci_$(BITS).o $(pci64-y) \ obj-$(CONFIG_PCI_MSI) += msi.o obj-$(CONFIG_KEXEC_CORE) += machine_kexec.o crash.o \ machine_kexec_$(BITS).o -obj-$(CONFIG_KEXEC_FILE) += machine_kexec_file_$(BITS).o elf_util.o +obj-$(CONFIG_KEXEC_FILE) += machine_kexec_file_$(BITS).o elf_util.o \ + kexec_elf_$(BITS).o obj-$(CONFIG_AUDIT)+= audit.o obj64-$(CONFIG_AUDIT) += compat_audit.o diff --git a/arch/powerpc/kernel/kexec_elf_64.c b/arch/powerpc/kernel/kexec_elf_64.c new file mode 100644 index ..f58b77d80d59 --- /dev/null +++ b/arch/powerpc/kernel/kexec_elf_64.c @@ -0,0 +1,279 @@ +/* + * Load ELF vmlinux file for the kexec_file_load syscall. + * + * Copyright (C) 2004 Adam Litke (a...@us.ibm.com) + * Copyright (C) 2004 IBM Corp. + * Copyright (C) 2005 R Sharada (shar...@in.ibm.com) + * Copyright (C) 2006 Mohan Kumar M (mo...@in.ibm.com) + * Copyright (C) 2016 IBM Corporation + * + * Based on kexec-tools' kexec-elf-exec.c and kexec-elf-ppc64.c. + * Heavily modified for the kernel by + * Thiago Jung Bauermann <bauer...@linux.vnet.ibm.com>. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation (version 2 of the License). + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + */ + +#define pr_fmt(fmt)"kexec_elf: " fmt + +#include +#include +#include +#include +#include +#include +#include + +#define PURGATORY_STACK_SIZE (16 * 1024) + +/** + * build_elf_exec_info - read ELF executable and check that we can use it + */ +static int build_elf_exec_info(const char *buf, size_t len, struct elfhdr *ehdr, + struct elf_info *elf_info) +{ + int i; + int ret; + + ret = elf_read_from_buffer(buf, len, ehdr, elf_info); + if (ret) + return ret; + + /* Big endian vmlinux has type ET_DYN. */ + if (ehdr->e_type != ET_EXEC && ehdr->e_type != ET_DYN) { + pr_err("Not an ELF executable.\n"); + goto error; + } else if (!elf_info->proghdrs) { + pr_err("No ELF program header.\n"); + goto error; + } + + for (i = 0; i < ehdr->e_phnum; i++) { + /* +* Kexec does not support loading interpreters. +* In addition this check keeps us from attempting +* to kexec ordinay executables. +*/ + if (elf_info->proghdrs[i].p_type == PT_INTERP) { + pr_err("Requires an ELF interpreter.\n"); + goto error; + } + } + + return 0; +error: + elf_fre
[PATCH v10 10/10] powerpc: Enable CONFIG_KEXEC_FILE in powerpc server defconfigs.
Enable CONFIG_KEXEC_FILE in powernv_defconfig, ppc64_defconfig and pseries_defconfig. It depends on CONFIG_CRYPTO_SHA256=y, so add that as well. Signed-off-by: Thiago Jung Bauermann <bauer...@linux.vnet.ibm.com> --- arch/powerpc/configs/powernv_defconfig | 2 ++ arch/powerpc/configs/ppc64_defconfig | 2 ++ arch/powerpc/configs/pseries_defconfig | 2 ++ 3 files changed, 6 insertions(+) diff --git a/arch/powerpc/configs/powernv_defconfig b/arch/powerpc/configs/powernv_defconfig index d98b6eb3254f..5a190aa5534b 100644 --- a/arch/powerpc/configs/powernv_defconfig +++ b/arch/powerpc/configs/powernv_defconfig @@ -49,6 +49,7 @@ CONFIG_BINFMT_MISC=m CONFIG_PPC_TRANSACTIONAL_MEM=y CONFIG_HOTPLUG_CPU=y CONFIG_KEXEC=y +CONFIG_KEXEC_FILE=y CONFIG_IRQ_ALL_CPUS=y CONFIG_NUMA=y CONFIG_MEMORY_HOTPLUG=y @@ -301,6 +302,7 @@ CONFIG_CRYPTO_CCM=m CONFIG_CRYPTO_PCBC=m CONFIG_CRYPTO_HMAC=y CONFIG_CRYPTO_MICHAEL_MIC=m +CONFIG_CRYPTO_SHA256=y CONFIG_CRYPTO_TGR192=m CONFIG_CRYPTO_WP512=m CONFIG_CRYPTO_ANUBIS=m diff --git a/arch/powerpc/configs/ppc64_defconfig b/arch/powerpc/configs/ppc64_defconfig index 58a98d40086f..0059d2088b9c 100644 --- a/arch/powerpc/configs/ppc64_defconfig +++ b/arch/powerpc/configs/ppc64_defconfig @@ -46,6 +46,7 @@ CONFIG_HZ_100=y CONFIG_BINFMT_MISC=m CONFIG_PPC_TRANSACTIONAL_MEM=y CONFIG_KEXEC=y +CONFIG_KEXEC_FILE=y CONFIG_CRASH_DUMP=y CONFIG_IRQ_ALL_CPUS=y CONFIG_MEMORY_HOTREMOVE=y @@ -336,6 +337,7 @@ CONFIG_CRYPTO_TEST=m CONFIG_CRYPTO_PCBC=m CONFIG_CRYPTO_HMAC=y CONFIG_CRYPTO_MICHAEL_MIC=m +CONFIG_CRYPTO_SHA256=y CONFIG_CRYPTO_TGR192=m CONFIG_CRYPTO_WP512=m CONFIG_CRYPTO_ANUBIS=m diff --git a/arch/powerpc/configs/pseries_defconfig b/arch/powerpc/configs/pseries_defconfig index 8a3bc016b732..f022f657a984 100644 --- a/arch/powerpc/configs/pseries_defconfig +++ b/arch/powerpc/configs/pseries_defconfig @@ -52,6 +52,7 @@ CONFIG_HZ_100=y CONFIG_BINFMT_MISC=m CONFIG_PPC_TRANSACTIONAL_MEM=y CONFIG_KEXEC=y +CONFIG_KEXEC_FILE=y CONFIG_IRQ_ALL_CPUS=y CONFIG_MEMORY_HOTPLUG=y CONFIG_MEMORY_HOTREMOVE=y @@ -303,6 +304,7 @@ CONFIG_CRYPTO_TEST=m CONFIG_CRYPTO_PCBC=m CONFIG_CRYPTO_HMAC=y CONFIG_CRYPTO_MICHAEL_MIC=m +CONFIG_CRYPTO_SHA256=y CONFIG_CRYPTO_TGR192=m CONFIG_CRYPTO_WP512=m CONFIG_CRYPTO_ANUBIS=m -- 2.7.4 ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
[PATCH v10 05/10] powerpc: Change places using CONFIG_KEXEC to use CONFIG_KEXEC_CORE instead.
Commit 2965faa5e03d ("kexec: split kexec_load syscall from kexec core code") introduced CONFIG_KEXEC_CORE so that CONFIG_KEXEC means whether the kexec_load system call should be compiled-in and CONFIG_KEXEC_FILE means whether the kexec_file_load system call should be compiled-in. These options can be set independently from each other. Since until now powerpc only supported kexec_load, CONFIG_KEXEC and CONFIG_KEXEC_CORE were synonyms. That is not the case anymore, so we need to make a distinction. Almost all places where CONFIG_KEXEC was being used should be using CONFIG_KEXEC_CORE instead, since kexec_file_load also needs that code compiled in. Signed-off-by: Thiago Jung Bauermann <bauer...@linux.vnet.ibm.com> --- arch/powerpc/Kconfig | 2 +- arch/powerpc/include/asm/debug.h | 2 +- arch/powerpc/include/asm/kexec.h | 6 +++--- arch/powerpc/include/asm/machdep.h| 4 ++-- arch/powerpc/include/asm/smp.h| 2 +- arch/powerpc/kernel/Makefile | 4 ++-- arch/powerpc/kernel/head_64.S | 2 +- arch/powerpc/kernel/misc_32.S | 2 +- arch/powerpc/kernel/misc_64.S | 6 +++--- arch/powerpc/kernel/prom.c| 2 +- arch/powerpc/kernel/setup_64.c| 4 ++-- arch/powerpc/kernel/smp.c | 6 +++--- arch/powerpc/kernel/traps.c | 2 +- arch/powerpc/platforms/85xx/corenet_generic.c | 2 +- arch/powerpc/platforms/85xx/smp.c | 8 arch/powerpc/platforms/cell/spu_base.c| 2 +- arch/powerpc/platforms/powernv/setup.c| 6 +++--- arch/powerpc/platforms/ps3/setup.c| 4 ++-- arch/powerpc/platforms/pseries/Makefile | 2 +- arch/powerpc/platforms/pseries/setup.c| 4 ++-- 20 files changed, 36 insertions(+), 36 deletions(-) diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 65fba4c34cd7..6cb59c6e5ba4 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -489,7 +489,7 @@ config CRASH_DUMP config FA_DUMP bool "Firmware-assisted dump" - depends on PPC64 && PPC_RTAS && CRASH_DUMP && KEXEC + depends on PPC64 && PPC_RTAS && CRASH_DUMP && KEXEC_CORE help A robust mechanism to get reliable kernel crash dump with assistance from firmware. This approach does not use kexec, diff --git a/arch/powerpc/include/asm/debug.h b/arch/powerpc/include/asm/debug.h index a954e4975049..86308f177f2d 100644 --- a/arch/powerpc/include/asm/debug.h +++ b/arch/powerpc/include/asm/debug.h @@ -10,7 +10,7 @@ struct pt_regs; extern struct dentry *powerpc_debugfs_root; -#if defined(CONFIG_DEBUGGER) || defined(CONFIG_KEXEC) +#if defined(CONFIG_DEBUGGER) || defined(CONFIG_KEXEC_CORE) extern int (*__debugger)(struct pt_regs *regs); extern int (*__debugger_ipi)(struct pt_regs *regs); diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h index a46f5f45570c..eca2f975bf44 100644 --- a/arch/powerpc/include/asm/kexec.h +++ b/arch/powerpc/include/asm/kexec.h @@ -53,7 +53,7 @@ typedef void (*crash_shutdown_t)(void); -#ifdef CONFIG_KEXEC +#ifdef CONFIG_KEXEC_CORE /* * This function is responsible for capturing register states if coming @@ -91,7 +91,7 @@ static inline bool kdump_in_progress(void) return crashing_cpu >= 0; } -#else /* !CONFIG_KEXEC */ +#else /* !CONFIG_KEXEC_CORE */ static inline void crash_kexec_secondary(struct pt_regs *regs) { } static inline int overlaps_crashkernel(unsigned long start, unsigned long size) @@ -116,7 +116,7 @@ static inline bool kdump_in_progress(void) return false; } -#endif /* CONFIG_KEXEC */ +#endif /* CONFIG_KEXEC_CORE */ #endif /* ! __ASSEMBLY__ */ #endif /* __KERNEL__ */ #endif /* _ASM_POWERPC_KEXEC_H */ diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h index e02cbc6a6c70..5011b69107a7 100644 --- a/arch/powerpc/include/asm/machdep.h +++ b/arch/powerpc/include/asm/machdep.h @@ -183,7 +183,7 @@ struct machdep_calls { */ void (*machine_shutdown)(void); -#ifdef CONFIG_KEXEC +#ifdef CONFIG_KEXEC_CORE void (*kexec_cpu_down)(int crash_shutdown, int secondary); /* Called to do what every setup is needed on image and the @@ -198,7 +198,7 @@ struct machdep_calls { * no return. */ void (*machine_kexec)(struct kimage *image); -#endif /* CONFIG_KEXEC */ +#endif /* CONFIG_KEXEC_CORE */ #ifdef CONFIG_SUSPEND /* These are called to disable and enable, respectively, IRQs when diff --git a/arch/powerpc/include/asm/smp.h b/arch/powerpc/include/asm/smp.h index 0d02c11dc331..32db16d2e7ad 100644 --- a/arch/powerpc/include/asm/smp.h +++ b/arch/powerpc/include/asm/smp.h @@ -176,7 +176,7 @@ static inline void set_hard_smp_processor_id(int cpu, int phys) #endif /* !CONFI
[PATCH v10 07/10] powerpc: Add functions to read ELF files of any endianness.
A little endian kernel might need to kexec a big endian kernel (the opposite is less likely but could happen as well), so we can't just cast the buffer with the binary to ELF structs and use them as is done elsewhere. This patch adds functions which do byte-swapping as necessary when populating the ELF structs. These functions will be used in the next patch in the series. Signed-off-by: Thiago Jung Bauermann <bauer...@linux.vnet.ibm.com> --- arch/powerpc/include/asm/elf_util.h | 43 arch/powerpc/kernel/Makefile| 2 +- arch/powerpc/kernel/elf_util.c | 437 3 files changed, 481 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/include/asm/elf_util.h b/arch/powerpc/include/asm/elf_util.h new file mode 100644 index ..944b3a2d8b73 --- /dev/null +++ b/arch/powerpc/include/asm/elf_util.h @@ -0,0 +1,43 @@ +/* + * Utility functions to work with ELF files. + * + * Copyright (C) 2016, IBM Corporation + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2, or (at your option) + * any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + */ + +#ifndef _ASM_POWERPC_ELF_UTIL_H +#define _ASM_POWERPC_ELF_UTIL_H + +#include + +struct elf_info { + /* +* Where the ELF binary contents are kept. +* Memory managed by the user of the struct. +*/ + const char *buffer; + + const struct elfhdr *ehdr; + const struct elf_phdr *proghdrs; + struct elf_shdr *sechdrs; +}; + +static inline bool elf_is_elf_file(const struct elfhdr *ehdr) +{ + return memcmp(ehdr->e_ident, ELFMAG, SELFMAG) == 0; +} + +int elf_read_from_buffer(const char *buf, size_t len, struct elfhdr *ehdr, +struct elf_info *elf_info); +void elf_free_info(struct elf_info *elf_info); + +#endif /* _ASM_POWERPC_ELF_UTIL_H */ diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile index 6de731d90bff..de14b7eb11bb 100644 --- a/arch/powerpc/kernel/Makefile +++ b/arch/powerpc/kernel/Makefile @@ -109,7 +109,7 @@ obj-$(CONFIG_PCI) += pci_$(BITS).o $(pci64-y) \ obj-$(CONFIG_PCI_MSI) += msi.o obj-$(CONFIG_KEXEC_CORE) += machine_kexec.o crash.o \ machine_kexec_$(BITS).o -obj-$(CONFIG_KEXEC_FILE) += machine_kexec_file_$(BITS).o +obj-$(CONFIG_KEXEC_FILE) += machine_kexec_file_$(BITS).o elf_util.o obj-$(CONFIG_AUDIT)+= audit.o obj64-$(CONFIG_AUDIT) += compat_audit.o diff --git a/arch/powerpc/kernel/elf_util.c b/arch/powerpc/kernel/elf_util.c new file mode 100644 index ..8572fc84a802 --- /dev/null +++ b/arch/powerpc/kernel/elf_util.c @@ -0,0 +1,437 @@ +/* + * Utility functions to work with ELF files. + * + * Copyright (C) 2016, IBM Corporation + * + * Based on kexec-tools' kexec-elf.c. Heavily modified for the + * kernel by Thiago Jung Bauermann <bauer...@linux.vnet.ibm.com>. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation (version 2 of the License). + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + */ + +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + +#include +#include +#include + +#if ELF_CLASS == ELFCLASS32 +#define elf_addr_to_cpuelf32_to_cpu + +#ifndef Elf_Rel +#define Elf_RelElf32_Rel +#endif /* Elf_Rel */ +#else /* ELF_CLASS == ELFCLASS32 */ +#define elf_addr_to_cpuelf64_to_cpu + +#ifndef Elf_Rel +#define Elf_RelElf64_Rel +#endif /* Elf_Rel */ + +static uint64_t elf64_to_cpu(const struct elfhdr *ehdr, uint64_t value) +{ + if (ehdr->e_ident[EI_DATA] == ELFDATA2LSB) + value = le64_to_cpu(value); + else if (ehdr->e_ident[EI_DATA] == ELFDATA2MSB) + value = be64_to_cpu(value); + + return value; +} +#endif /* ELF_CLASS == ELFCLASS32 */ + +static uint16_t elf16_to_cpu(const struct elfhdr *ehdr, uint16_t value) +{ + if (ehdr->e_ident[EI_DATA] == ELFDATA2LSB) + value = le16_to_cpu(value); + else if (ehdr->e_ident[EI_DATA] == ELFDATA2MSB) + value = be16_to_cpu(value); + + return value; +} + +static uint32_t elf32_to_cpu(const struct elfhdr *ehdr, uint32_t value) +{ + if (ehdr->e_ident[EI_DATA] == ELFDATA2LSB) +
[PATCH v10 09/10] powerpc: Add purgatory for kexec_file_load implementation.
This purgatory implementation comes from kexec-tools and was trimmed down a bit. It uses the memset, memcpy and memcmp implementations from lib/string.c. It's not straightforward to #include "lib/string.c" so we simply copy those functions. The changes made to the purgatory code relative to the version in kexec-tools were: The support for printing messages to the console was removed. Also, since we don't support loading a crashdump kernel via kexec_file_load yet, the code related to that functionality has been removed for now. The sha256_regions global variable was renamed to sha_regions to match what kexec_file_load expects, and to use the sha256.c file from x86's purgatory (this avoids adding yet another SHA-256 implementation). The global variables in purgatory.c and purgatory-ppc64.c now use a __section attribute to put them in the .data section instead of being initialized to zero. It doesn't matter what their initial value is, because they will be set by the kernel when preparing the kexec image. Finally, some checkpatch.pl warnings were fixed. Signed-off-by: Thiago Jung Bauermann <bauer...@linux.vnet.ibm.com> --- arch/powerpc/Makefile | 1 + arch/powerpc/purgatory/.gitignore | 2 + arch/powerpc/purgatory/Makefile| 40 arch/powerpc/purgatory/crtsavres.S | 5 + arch/powerpc/purgatory/kexec-sha256.h | 11 +++ arch/powerpc/purgatory/purgatory-ppc64.c | 36 +++ arch/powerpc/purgatory/purgatory.c | 48 + arch/powerpc/purgatory/purgatory.h | 10 ++ arch/powerpc/purgatory/sha256.c| 6 ++ arch/powerpc/purgatory/sha256.h| 1 + arch/powerpc/purgatory/string.c| 60 +++ arch/powerpc/purgatory/v2wrap.S| 132 + arch/powerpc/scripts/check-purgatory-relocs.sh | 47 + 13 files changed, 399 insertions(+) diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile index 617dece67924..5e7dcdaf93f5 100644 --- a/arch/powerpc/Makefile +++ b/arch/powerpc/Makefile @@ -249,6 +249,7 @@ core-y += arch/powerpc/kernel/ \ core-$(CONFIG_XMON)+= arch/powerpc/xmon/ core-$(CONFIG_KVM) += arch/powerpc/kvm/ core-$(CONFIG_PERF_EVENTS) += arch/powerpc/perf/ +core-$(CONFIG_KEXEC_FILE) += arch/powerpc/purgatory/ drivers-$(CONFIG_OPROFILE) += arch/powerpc/oprofile/ diff --git a/arch/powerpc/purgatory/.gitignore b/arch/powerpc/purgatory/.gitignore new file mode 100644 index ..e9e66f178a6d --- /dev/null +++ b/arch/powerpc/purgatory/.gitignore @@ -0,0 +1,2 @@ +kexec-purgatory.c +purgatory.ro diff --git a/arch/powerpc/purgatory/Makefile b/arch/powerpc/purgatory/Makefile new file mode 100644 index ..2dfb53ac9944 --- /dev/null +++ b/arch/powerpc/purgatory/Makefile @@ -0,0 +1,40 @@ +OBJECT_FILES_NON_STANDARD := y + +purgatory-y := purgatory.o string.o v2wrap.o purgatory-ppc64.o crtsavres.o \ + sha256.o + +targets += $(purgatory-y) +PURGATORY_OBJS = $(addprefix $(obj)/,$(purgatory-y)) + +LDFLAGS_purgatory.ro := -pie --no-dynamic-linker -e purgatory_start \ + --no-undefined -nostartfiles -nostdlib -nodefaultlibs +targets += purgatory.ro + +KBUILD_CFLAGS := $(filter-out $(CC_FLAGS_FTRACE), $(KBUILD_CFLAGS)) + +KBUILD_CFLAGS += -fno-zero-initialized-in-bss -fno-builtin -ffreestanding \ +-fno-stack-protector -fno-exceptions -fpie +KBUILD_AFLAGS += -fno-exceptions -msoft-float -fpie + +$(obj)/purgatory.ro: $(PURGATORY_OBJS) FORCE + $(call if_changed,ld) + +targets += kexec-purgatory.c + +quiet_cmd_relocs_check = CALL$< + cmd_relocs_check = $(CONFIG_SHELL) $< "$(OBJDUMP)" "$(obj)/purgatory.ro" + +PHONY += relocs_check +relocs_check: arch/powerpc/scripts/check-purgatory-relocs.sh $(obj)/purgatory.ro + $(call cmd,relocs_check) + +CMD_BIN2C = $(objtree)/scripts/basic/bin2c +quiet_cmd_bin2c = BIN2C $@ + cmd_bin2c = $(CMD_BIN2C) kexec_purgatory < $< > $@ + +$(obj)/kexec-purgatory.c: $(obj)/purgatory.ro relocs_check FORCE + $(call if_changed,bin2c) + @: + + +obj-$(CONFIG_KEXEC_FILE) += kexec-purgatory.o diff --git a/arch/powerpc/purgatory/crtsavres.S b/arch/powerpc/purgatory/crtsavres.S new file mode 100644 index ..5d17e1c0d575 --- /dev/null +++ b/arch/powerpc/purgatory/crtsavres.S @@ -0,0 +1,5 @@ +#ifndef CONFIG_CC_OPTIMIZE_FOR_SIZE +#define CONFIG_CC_OPTIMIZE_FOR_SIZE 1 +#endif + +#include "../lib/crtsavres.S" diff --git a/arch/powerpc/purgatory/kexec-sha256.h b/arch/powerpc/purgatory/kexec-sha256.h new file mode 100644 index ..4418ed02c052 --- /dev/null +++ b/arch/powerpc/purgatory/kexec-sha256.h @@ -0,0 +1,11 @@ +#ifndef KEXEC_SHA256_H +#define KEXEC_SHA256_H + +struct kexec_sha_region { + unsigned long start
[PATCH v10 03/10] kexec_file: Factor out kexec_locate_mem_hole from kexec_add_buffer.
kexec_locate_mem_hole will be used by the PowerPC kexec_file_load implementation to find free memory for the purgatory stack. Signed-off-by: Thiago Jung Bauermann <bauer...@linux.vnet.ibm.com> Acked-by: Dave Young <dyo...@redhat.com> --- include/linux/kexec.h | 1 + kernel/kexec_file.c | 25 - 2 files changed, 21 insertions(+), 5 deletions(-) diff --git a/include/linux/kexec.h b/include/linux/kexec.h index 437ef1b47428..a33f63351f86 100644 --- a/include/linux/kexec.h +++ b/include/linux/kexec.h @@ -176,6 +176,7 @@ struct kexec_buf { int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf, int (*func)(u64, u64, void *)); extern int kexec_add_buffer(struct kexec_buf *kbuf); +int kexec_locate_mem_hole(struct kexec_buf *kbuf); #endif /* CONFIG_KEXEC_FILE */ struct kimage { diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c index efd2c094af7e..0c2df7f73792 100644 --- a/kernel/kexec_file.c +++ b/kernel/kexec_file.c @@ -450,6 +450,23 @@ int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf, } /** + * kexec_locate_mem_hole - find free memory for the purgatory or the next kernel + * @kbuf: Parameters for the memory search. + * + * On success, kbuf->mem will have the start address of the memory region found. + * + * Return: 0 on success, negative errno on error. + */ +int kexec_locate_mem_hole(struct kexec_buf *kbuf) +{ + int ret; + + ret = arch_kexec_walk_mem(kbuf, locate_mem_hole_callback); + + return ret == 1 ? 0 : -EADDRNOTAVAIL; +} + +/** * kexec_add_buffer - place a buffer in a kexec segment * @kbuf: Buffer contents and memory parameters. * @@ -489,11 +506,9 @@ int kexec_add_buffer(struct kexec_buf *kbuf) kbuf->buf_align = max(kbuf->buf_align, PAGE_SIZE); /* Walk the RAM ranges and allocate a suitable range for the buffer */ - ret = arch_kexec_walk_mem(kbuf, locate_mem_hole_callback); - if (ret != 1) { - /* A suitable memory range could not be found for buffer */ - return -EADDRNOTAVAIL; - } + ret = kexec_locate_mem_hole(kbuf); + if (ret) + return ret; /* Found a suitable memory range */ ksegment = >image->segment[kbuf->image->nr_segments]; -- 2.7.4 ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
[PATCH v10 02/10] kexec_file: Change kexec_add_buffer to take kexec_buf as argument.
This is done to simplify the kexec_add_buffer argument list. Adapt all callers to set up a kexec_buf to pass to kexec_add_buffer. In addition, change the type of kexec_buf.buffer from char * to void *. There is no particular reason for it to be a char *, and the change allows us to get rid of 3 existing casts to char * in the code. Signed-off-by: Thiago Jung Bauermann <bauer...@linux.vnet.ibm.com> Acked-by: Dave Young <dyo...@redhat.com> Acked-by: Balbir Singh <bsinghar...@gmail.com> --- arch/x86/kernel/crash.c | 37 arch/x86/kernel/kexec-bzimage64.c | 48 +++-- include/linux/kexec.h | 8 +--- kernel/kexec_file.c | 88 ++- 4 files changed, 87 insertions(+), 94 deletions(-) diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c index 650830e39e3a..3741461c63a0 100644 --- a/arch/x86/kernel/crash.c +++ b/arch/x86/kernel/crash.c @@ -631,9 +631,9 @@ static int determine_backup_region(u64 start, u64 end, void *arg) int crash_load_segments(struct kimage *image) { - unsigned long src_start, src_sz, elf_sz; - void *elf_addr; int ret; + struct kexec_buf kbuf = { .image = image, .buf_min = 0, + .buf_max = ULONG_MAX, .top_down = false }; /* * Determine and load a segment for backup area. First 640K RAM @@ -647,43 +647,44 @@ int crash_load_segments(struct kimage *image) if (ret < 0) return ret; - src_start = image->arch.backup_src_start; - src_sz = image->arch.backup_src_sz; - /* Add backup segment. */ - if (src_sz) { + if (image->arch.backup_src_sz) { + kbuf.buffer = _zero_bytes; + kbuf.bufsz = sizeof(crash_zero_bytes); + kbuf.memsz = image->arch.backup_src_sz; + kbuf.buf_align = PAGE_SIZE; /* * Ideally there is no source for backup segment. This is * copied in purgatory after crash. Just add a zero filled * segment for now to make sure checksum logic works fine. */ - ret = kexec_add_buffer(image, (char *)_zero_bytes, - sizeof(crash_zero_bytes), src_sz, - PAGE_SIZE, 0, -1, 0, - >arch.backup_load_addr); + ret = kexec_add_buffer(); if (ret) return ret; + image->arch.backup_load_addr = kbuf.mem; pr_debug("Loaded backup region at 0x%lx backup_start=0x%lx memsz=0x%lx\n", -image->arch.backup_load_addr, src_start, src_sz); +image->arch.backup_load_addr, +image->arch.backup_src_start, kbuf.memsz); } /* Prepare elf headers and add a segment */ - ret = prepare_elf_headers(image, _addr, _sz); + ret = prepare_elf_headers(image, , ); if (ret) return ret; - image->arch.elf_headers = elf_addr; - image->arch.elf_headers_sz = elf_sz; + image->arch.elf_headers = kbuf.buffer; + image->arch.elf_headers_sz = kbuf.bufsz; - ret = kexec_add_buffer(image, (char *)elf_addr, elf_sz, elf_sz, - ELF_CORE_HEADER_ALIGN, 0, -1, 0, - >arch.elf_load_addr); + kbuf.memsz = kbuf.bufsz; + kbuf.buf_align = ELF_CORE_HEADER_ALIGN; + ret = kexec_add_buffer(); if (ret) { vfree((void *)image->arch.elf_headers); return ret; } + image->arch.elf_load_addr = kbuf.mem; pr_debug("Loaded ELF headers at 0x%lx bufsz=0x%lx memsz=0x%lx\n", -image->arch.elf_load_addr, elf_sz, elf_sz); +image->arch.elf_load_addr, kbuf.bufsz, kbuf.bufsz); return ret; } diff --git a/arch/x86/kernel/kexec-bzimage64.c b/arch/x86/kernel/kexec-bzimage64.c index 3407b148c240..d0a814a9d96a 100644 --- a/arch/x86/kernel/kexec-bzimage64.c +++ b/arch/x86/kernel/kexec-bzimage64.c @@ -331,17 +331,17 @@ static void *bzImage64_load(struct kimage *image, char *kernel, struct setup_header *header; int setup_sects, kern16_size, ret = 0; - unsigned long setup_header_size, params_cmdline_sz, params_misc_sz; + unsigned long setup_header_size, params_cmdline_sz; struct boot_params *params; unsigned long bootparam_load_addr, kernel_load_addr, initrd_load_addr; unsigned long purgatory_load_addr; - unsigned long kernel_bufsz, kernel_memsz, kernel_align; - char *kernel_buf; struct bzimage64_data *ldata; struct kexec_entry64_regs regs64; void *stack; unsigned int setup_hdr_offset = offsetof(struct boot_param
[PATCH v10 04/10] kexec_file: Add support for purgatory built as PIE.
powerpc's purgatory.ro has 12 relocation types when built as a relocatable object. To implement support for them requires arch_kexec_apply_relocations_add to duplicate a lot of code with module_64.c:apply_relocate_add. When built as a Position Independent Executable there are only 4 relocation types in purgatory.ro, so it becomes practical for the powerpc implementation of kexec_file to have its own relocation implementation. Also, the purgatory is an executable and not an intermediary output from the compiler so it makes sense conceptually that it is easier to build it as a PIE than as a partially linked object. Apart from the greatly reduced number of relocations, there are two differences between a relocatable object and a PIE: 1. __kexec_load_purgatory needs to use the program headers rather than the section headers to figure out how to load the binary. 2. Symbol values are absolute addresses instead of relative to the start of the section. This patch adds the support needed in generic code for the differences above and allows powerpc to load and relocate a position independent purgatory. Suggested-by: Michael Ellerman <m...@ellerman.id.au> Signed-off-by: Thiago Jung Bauermann <bauer...@linux.vnet.ibm.com> --- arch/Kconfig | 11 ++ include/linux/kexec.h | 4 + kernel/kexec_file.c | 314 ++ 3 files changed, 253 insertions(+), 76 deletions(-) diff --git a/arch/Kconfig b/arch/Kconfig index 659bdd079277..f4498530a618 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -5,6 +5,17 @@ config KEXEC_CORE bool +config HAVE_KEXEC_FILE_PIE_PURGATORY + bool + help + By default, the purgatory binary is built as a relocatable + object, but on some architectures it might be an advantage + to build it as a Position Independent Executable to reduce + the types of relocation that have to be dealt with. + + If an architecture builds a PIE purgatory it should select + this symbol. + config OPROFILE tristate "OProfile system profiling" depends on PROFILING diff --git a/include/linux/kexec.h b/include/linux/kexec.h index a33f63351f86..5c356e387240 100644 --- a/include/linux/kexec.h +++ b/include/linux/kexec.h @@ -112,6 +112,10 @@ struct compat_kexec_segment { #endif #ifdef CONFIG_KEXEC_FILE +#ifndef PURGATORY_ELF_TYPE +#define PURGATORY_ELF_TYPE ET_REL +#endif + struct purgatory_info { /* Pointer to elf header of read only purgatory */ Elf_Ehdr *ehdr; diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c index 0c2df7f73792..cf8c17111b12 100644 --- a/kernel/kexec_file.c +++ b/kernel/kexec_file.c @@ -633,68 +633,139 @@ static int kexec_calculate_store_digests(struct kimage *image) return ret; } -/* Actually load purgatory. Lot of code taken from kexec-tools */ -static int __kexec_load_purgatory(struct kimage *image, unsigned long min, - unsigned long max, int top_down) +#ifdef CONFIG_HAVE_KEXEC_FILE_PIE_PURGATORY +/* + * Load position independent executable purgatory using program header + * information. + */ +static int __kexec_really_load_purgatory(const Elf_Ehdr *ehdr, +Elf_Shdr *sechdrs, +struct kexec_buf *kbuf, +unsigned long *entry_addr) { - struct purgatory_info *pi = >purgatory_info; - unsigned long align, bss_align, bss_sz, bss_pad; - unsigned long entry, load_addr, curr_load_addr, bss_addr, offset; - unsigned char *buf_addr, *src; - int i, ret = 0, entry_sidx = -1; - const Elf_Shdr *sechdrs_c; - Elf_Shdr *sechdrs = NULL; - struct kexec_buf kbuf = { .image = image, .bufsz = 0, .buf_align = 1, - .buf_min = min, .buf_max = max, - .top_down = top_down }; + int ret; + unsigned long entry, dst_mem; + void *dst; + const Elf_Phdr *phdr, *first_load_seg = NULL, *last_load_seg = NULL; + const Elf_Phdr *prev_load_seg; + const Elf_Phdr *phdrs = (const void *) ehdr + ehdr->e_phoff; + + /* Determine how much memory is needed to load the executable. */ + for (phdr = phdrs; phdr < phdrs + ehdr->e_phnum; phdr++) { + if (phdr->p_type != PT_LOAD) + continue; - /* -* sechdrs_c points to section headers in purgatory and are read -* only. No modifications allowed. -*/ - sechdrs_c = (void *)pi->ehdr + pi->ehdr->e_shoff; + if (!first_load_seg) + first_load_seg = phdr; - /* -* We can not modify sechdrs_c[] and its fields. It is read only. -* Copy it over to a local copy where one can store some temporary -* data and free it at the end. We need to modify ->
[PATCH v10 01/10] kexec_file: Allow arch-specific memory walking for kexec_add_buffer
Allow architectures to specify a different memory walking function for kexec_add_buffer. x86 uses iomem to track reserved memory ranges, but PowerPC uses the memblock subsystem. Signed-off-by: Thiago Jung Bauermann <bauer...@linux.vnet.ibm.com> Acked-by: Dave Young <dyo...@redhat.com> Acked-by: Balbir Singh <bsinghar...@gmail.com> --- include/linux/kexec.h | 29 - kernel/kexec_file.c | 30 ++ kernel/kexec_internal.h | 16 3 files changed, 50 insertions(+), 25 deletions(-) diff --git a/include/linux/kexec.h b/include/linux/kexec.h index 406c33dcae13..5e320ddaaa82 100644 --- a/include/linux/kexec.h +++ b/include/linux/kexec.h @@ -148,7 +148,34 @@ struct kexec_file_ops { kexec_verify_sig_t *verify_sig; #endif }; -#endif + +/** + * struct kexec_buf - parameters for finding a place for a buffer in memory + * @image: kexec image in which memory to search. + * @buffer:Contents which will be copied to the allocated memory. + * @bufsz: Size of @buffer. + * @mem: On return will have address of the buffer in memory. + * @memsz: Size for the buffer in memory. + * @buf_align: Minimum alignment needed. + * @buf_min: The buffer can't be placed below this address. + * @buf_max: The buffer can't be placed above this address. + * @top_down: Allocate from top of memory. + */ +struct kexec_buf { + struct kimage *image; + char *buffer; + unsigned long bufsz; + unsigned long mem; + unsigned long memsz; + unsigned long buf_align; + unsigned long buf_min; + unsigned long buf_max; + bool top_down; +}; + +int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf, + int (*func)(u64, u64, void *)); +#endif /* CONFIG_KEXEC_FILE */ struct kimage { kimage_entry_t head; diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c index 037c321c5618..f865674bff51 100644 --- a/kernel/kexec_file.c +++ b/kernel/kexec_file.c @@ -428,6 +428,27 @@ static int locate_mem_hole_callback(u64 start, u64 end, void *arg) return locate_mem_hole_bottom_up(start, end, kbuf); } +/** + * arch_kexec_walk_mem - call func(data) on free memory regions + * @kbuf: Context info for the search. Also passed to @func. + * @func: Function to call for each memory region. + * + * Return: The memory walk will stop when func returns a non-zero value + * and that value will be returned. If all free regions are visited without + * func returning non-zero, then zero will be returned. + */ +int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf, + int (*func)(u64, u64, void *)) +{ + if (kbuf->image->type == KEXEC_TYPE_CRASH) + return walk_iomem_res_desc(crashk_res.desc, + IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY, + crashk_res.start, crashk_res.end, + kbuf, func); + else + return walk_system_ram_res(0, ULONG_MAX, kbuf, func); +} + /* * Helper function for placing a buffer in a kexec segment. This assumes * that kexec_mutex is held. @@ -474,14 +495,7 @@ int kexec_add_buffer(struct kimage *image, char *buffer, unsigned long bufsz, kbuf->top_down = top_down; /* Walk the RAM ranges and allocate a suitable range for the buffer */ - if (image->type == KEXEC_TYPE_CRASH) - ret = walk_iomem_res_desc(crashk_res.desc, - IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY, - crashk_res.start, crashk_res.end, kbuf, - locate_mem_hole_callback); - else - ret = walk_system_ram_res(0, -1, kbuf, - locate_mem_hole_callback); + ret = arch_kexec_walk_mem(kbuf, locate_mem_hole_callback); if (ret != 1) { /* A suitable memory range could not be found for buffer */ return -EADDRNOTAVAIL; diff --git a/kernel/kexec_internal.h b/kernel/kexec_internal.h index 0a52315d9c62..4cef7e4706b0 100644 --- a/kernel/kexec_internal.h +++ b/kernel/kexec_internal.h @@ -20,22 +20,6 @@ struct kexec_sha_region { unsigned long len; }; -/* - * Keeps track of buffer parameters as provided by caller for requesting - * memory placement of buffer. - */ -struct kexec_buf { - struct kimage *image; - char *buffer; - unsigned long bufsz; - unsigned long mem; - unsigned long memsz; - unsigned long buf_align; - unsigned long buf_min; - unsigned long buf_max; - bool top_down; /* allocate from top of memory hole */ -}; - void kimage_file_post_load_cleanup(struct kimage *image); #else /* CONFIG_KEXEC_FILE */ static inline void kimage_file_post_load_cleanup(struc
Re: [RFC] kexec_file: Add support for purgatory built as PIE
Hello Eric, Am Freitag, 4. November 2016, 10:13:39 BRST schrieb Eric W. Biederman: > Baoquan He <b...@redhat.com> writes: > > On 11/02/16 at 04:00am, Thiago Jung Bauermann wrote: > >> Hello, > >> > >> The kexec_file code currently builds the purgatory as a partially linked > >> object (using ld -r). Is there a particular reason to use that instead > >> of a position independent executable (PIE)? > > > > It's taken as "-r", relocatable in user space kexec-tools too originally. > > I think Vivek just keeps it the same when moving into kernel. > > At least on x86 using just -r removed the need for a GOT and all of the > other nasty dynamic relocatable bits, that are not needed when the you > don't want to share your text bits with the page cache. > > I can see reaons for refactoring code but I expect PIE expecutables need > a GOT and all of that pain in the neck stuff that can just be avoided by > building the code to run at an absolute address. At least on powerpc, building the purgatory as PIE resulted in only the following differences: 1. A lot less relocation types to deal with. 2. __kexec_load_purgatory needs to use the program headers rather than the section headers to figure out how to load the binary. 3. Symbol values are absolute addresses instead of relative to the start of the section. 2. is an advantage too because it's actually easier to use the program headers because unlike section headers, the purpose of program headers is to provide the information needed by a program loader. You can see this by comparing the two implementations of __kexec_load_purgatory in the WIP patch I posted. The one using program headers is simpler. 3. isn't a problem, it's easy to convert the absolute addresses back into relative ones, as can be seen in my patch. > So far I have not seen ELF relocations that are difficult to process. The problem is not that it's difficult to process, but that on powerpc it takes a lot of code to implement that processing. In v9 of the kexec_file_load implementation for powerpc, the switch statement implementing all the relocation types (shared by powerpc's module_64.c and machine_kexec_file_64.c) has 200 lines. The switch statement implementing only the relocation types used by the PIE purgatory has 26 lines. This is not a problem in x86, though: the purgatory built as a relocatable object has only two relocation types. -- Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [RFC] kexec_file: Add support for purgatory built as PIE
Hello Baoquan, Am Freitag, 4. November 2016, 15:38:40 BRST schrieb Baoquan He: > On 11/02/16 at 04:00am, Thiago Jung Bauermann wrote: > > Hello, > > > > The kexec_file code currently builds the purgatory as a partially linked > > object (using ld -r). Is there a particular reason to use that instead of > > a position independent executable (PIE)? > > It's taken as "-r", relocatable in user space kexec-tools too originally. > I think Vivek just keeps it the same when moving into kernel. Ok. If that's the only reason then PIE is better suited at least for powerpc. > > I found a discussion from 2013 in the archives but from what I understood > > it was about the purgatory as a separate object vs having it linked into > > the kernel, which is different from what I'm asking: > > > > http://lists.infradead.org/pipermail/kexec/2013-December/010535.html > > > > Here is my motivation for this question: > > On ppc64 purgatory.ro has 12 relocation types when built as a partially > > > > linked object. This makes arch_kexec_apply_relocations_add duplicate a lot > > of code with module_64.c:apply_relocate_add to implement these > > relocations. The alternative is to do some refactoring so that both > > functions can share the implementation of the relocations. This is done > > in patches 5 and 6 of the > > kexec_file_load implementation for powerpc: > In user space kexec-tools utility, you also got this problem? Yes, kexec-tools' purgatory.ro has 10 relocation types instead of 12 (I don't know why), but that's still a lot. > > @@ -942,7 +1085,13 @@ static Elf_Sym *kexec_purgatory_find_symbol(struct > > purgatory_info *pi,> > > /* Go through symbols for a match */ > > for (k = 0; k < sechdrs[i].sh_size/sizeof(Elf_Sym); k++) { > > > > - if (ELF_ST_BIND(syms[k].st_info) != STB_GLOBAL) > > + /* > > +* FIXME: See if we can or should export the .TOC. > > +* symbol as global instead of searching local symbols > > +* here. > > +*/ > > + if (ELF_ST_BIND(syms[k].st_info) != STB_GLOBAL && > > + ELF_ST_BIND(syms[k].st_info) != STB_LOCAL) > > > > continue; > > > > if (strcmp(strtab + syms[k].st_name, name) != 0) > > I don't need the change above anymore. I found a way to obtain the TOC pointer without looking for the .TOC. symbol. -- Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
[RFC] kexec_file: Add support for purgatory built as PIE
Hello, The kexec_file code currently builds the purgatory as a partially linked object (using ld -r). Is there a particular reason to use that instead of a position independent executable (PIE)? I found a discussion from 2013 in the archives but from what I understood it was about the purgatory as a separate object vs having it linked into the kernel, which is different from what I'm asking: http://lists.infradead.org/pipermail/kexec/2013-December/010535.html Here is my motivation for this question: On ppc64 purgatory.ro has 12 relocation types when built as a partially linked object. This makes arch_kexec_apply_relocations_add duplicate a lot of code with module_64.c:apply_relocate_add to implement these relocations. The alternative is to do some refactoring so that both functions can share the implementation of the relocations. This is done in patches 5 and 6 of the kexec_file_load implementation for powerpc: https://lists.ozlabs.org/pipermail/linuxppc-dev/2016-October/149984.html Michael Ellerman would prefer if module_64.c didn't need to be changed, and suggested that the purgatory could be a position independent executable. Indeed, in that case there are only 4 relocation types in purgatory.ro (which aren't even implemented in module_64.c:apply_relocate_add), so the relocation code for the purgatory can leave that file alone and have its own relocation implementation. Also, the purgatory is an executable and not an intermediary output from the compiler, so in my mind it makes sense conceptually that it is easier to build it as a PIE than as a partially linked object. The patch below adds the support needed in kexec_file.c to allow powerpc- specific code to load and relocate a purgatory binary built as PIE. This is WIP and can probably be refined a bit. Would you accept a change along these lines? Signed-off-by: Thiago Jung Bauermann <bauer...@linux.vnet.ibm.com> --- arch/Kconfig| 3 + kernel/kexec_file.c | 159 ++-- kernel/kexec_internal.h | 26 3 files changed, 183 insertions(+), 5 deletions(-) diff --git a/arch/Kconfig b/arch/Kconfig index 659bdd079277..7fd6879be222 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -5,6 +5,9 @@ config KEXEC_CORE bool +config HAVE_KEXEC_FILE_PIE_PURGATORY + bool + config OPROFILE tristate "OProfile system profiling" depends on PROFILING diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c index 0c2df7f73792..dfc3e015160d 100644 --- a/kernel/kexec_file.c +++ b/kernel/kexec_file.c @@ -633,7 +633,149 @@ static int kexec_calculate_store_digests(struct kimage *image) return ret; } -/* Actually load purgatory. Lot of code taken from kexec-tools */ +#ifdef CONFIG_HAVE_KEXEC_FILE_PIE_PURGATORY +/* Load PIE purgatory using the program header information. */ +static int __kexec_load_purgatory(struct kimage *image, unsigned long min, + unsigned long max, int top_down) +{ + struct purgatory_info *pi = >purgatory_info; + unsigned long first_offset; + unsigned long orig_load_addr = 0; + const void *src; + int i, ret; + const Elf_Phdr *phdrs = (const void *) pi->ehdr + pi->ehdr->e_phoff; + const Elf_Phdr *phdr; + const Elf_Shdr *sechdrs_c; + Elf_Shdr *sechdr; + Elf_Shdr *sechdrs = NULL; + struct kexec_buf kbuf = { .image = image, .bufsz = 0, .buf_align = 1, + .buf_min = min, .buf_max = max, + .top_down = top_down }; + + /* +* sechdrs_c points to section headers in purgatory and are read +* only. No modifications allowed. +*/ + sechdrs_c = (void *) pi->ehdr + pi->ehdr->e_shoff; + + /* +* We can not modify sechdrs_c[] and its fields. It is read only. +* Copy it over to a local copy where one can store some temporary +* data and free it at the end. We need to modify ->sh_addr and +* ->sh_offset fields to keep track of permanent and temporary +* locations of sections. +*/ + sechdrs = vzalloc(pi->ehdr->e_shnum * sizeof(Elf_Shdr)); + if (!sechdrs) + return -ENOMEM; + + memcpy(sechdrs, sechdrs_c, pi->ehdr->e_shnum * sizeof(Elf_Shdr)); + + /* +* We seem to have multiple copies of sections. First copy is which +* is embedded in kernel in read only section. Some of these sections +* will be copied to a temporary buffer and relocated. And these +* sections will finally be copied to their final destination at +* segment load time. +* +* Use ->sh_offset to reflect section address in memory. It will +* point to original read only copy if section is not allocatable. +* Otherwise it will point to temporary copy which will be relocated. +
[PATCH v9 06/10] powerpc: Implement kexec_file_load.
Add arch-specific functions needed by generic kexec_file code. Also, module_64.c's apply_relocate_add and kexec_file's arch_kexec_apply_relocations_add have slightly different needs, so elf64_apply_relocate_add_item needs to be adapted to accommodate both: When apply_relocate_add is called, the module is already loaded at its final location in memory so the place where the relocation needs to be applied and its address in the module's memory are the same. This is not the case for kexec's purgatory, because it is stored in a buffer and will only be copied to its final location in memory right before being executed. Therefore, it needs to be relocated while still in its buffer. In this case, the place where the relocation needs to be applied is different from its address in the purgatory's memory. So we add an address argument to elf64_apply_relocate_add_item to specify the final address of the relocation in memory. We also add more relocation types that are used by the purgatory. Signed-off-by: Josh Sklar <sk...@linux.vnet.ibm.com> Signed-off-by: Thiago Jung Bauermann <bauer...@linux.vnet.ibm.com> --- arch/powerpc/Kconfig| 13 ++ arch/powerpc/include/asm/elf_util.h | 43 + arch/powerpc/include/asm/systbl.h | 1 + arch/powerpc/include/asm/unistd.h | 2 +- arch/powerpc/include/uapi/asm/unistd.h | 1 + arch/powerpc/kernel/Makefile| 1 + arch/powerpc/kernel/elf_util.c | 46 ++ arch/powerpc/kernel/machine_kexec_file_64.c | 245 arch/powerpc/kernel/module_64.c | 71 ++-- 9 files changed, 406 insertions(+), 17 deletions(-) diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 6cb59c6e5ba4..897d0f14447d 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -455,6 +455,19 @@ config KEXEC interface is strongly in flux, so no good recommendation can be made. +config KEXEC_FILE + bool "kexec file based system call" + select KEXEC_CORE + select BUILD_BIN2C + depends on PPC64 + depends on CRYPTO=y + depends on CRYPTO_SHA256=y + help + This is a new version of the kexec system call. This call is + file based and takes in file descriptors as system call arguments + for kernel and initramfs as opposed to a list of segments as is the + case for the older kexec call. + config RELOCATABLE bool "Build a relocatable kernel" depends on (PPC64 && !COMPILE_TEST) || (FLATMEM && (44x || FSL_BOOKE)) diff --git a/arch/powerpc/include/asm/elf_util.h b/arch/powerpc/include/asm/elf_util.h new file mode 100644 index ..1df232f65ec8 --- /dev/null +++ b/arch/powerpc/include/asm/elf_util.h @@ -0,0 +1,43 @@ +/* + * Utility functions to work with ELF files. + * + * Copyright (C) 2016, IBM Corporation + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2, or (at your option) + * any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + */ + +#ifndef _ASM_POWERPC_ELF_UTIL_H +#define _ASM_POWERPC_ELF_UTIL_H + +#include + +/* + * r2 is the TOC pointer: it actually points 0x8000 into the TOC (this + * gives the value maximum span in an instruction which uses a signed + * offset) + */ +static inline unsigned long elf_my_r2(const struct elf_shdr *sechdrs, + unsigned int toc_section) +{ + return sechdrs[toc_section].sh_addr + 0x8000; +} + +unsigned int elf_toc_section(const struct elfhdr *ehdr, +const struct elf_shdr *sechdrs); + +int elf64_apply_relocate_add_item(const Elf64_Shdr *sechdrs, const char *strtab, + const Elf64_Rela *rela, const Elf64_Sym *sym, + unsigned long *location, + unsigned long address, unsigned long value, + unsigned long my_r2, const char *obj_name, + struct module *me); + +#endif /* _ASM_POWERPC_ELF_UTIL_H */ diff --git a/arch/powerpc/include/asm/systbl.h b/arch/powerpc/include/asm/systbl.h index 2fc5d4db503c..4b369d83fe9c 100644 --- a/arch/powerpc/include/asm/systbl.h +++ b/arch/powerpc/include/asm/systbl.h @@ -386,3 +386,4 @@ SYSCALL(mlock2) SYSCALL(copy_file_range) COMPAT_SYS_SPU(preadv2) COMPAT_SYS_SPU(pwritev2) +SYSCALL(kexec_file_load) diff --git a/arch/powerpc/include/asm/unistd.h b/arch/powerpc/include/asm/unistd.h index cf12c580f6b2..a01e97d3f305 100644 --- a/arch/powerpc/include/asm/
[PATCH v6 08/10] ima: support restoring multiple template formats
From: Mimi ZoharThe configured IMA measurement list template format can be replaced at runtime on the boot command line, including a custom template format. This patch adds support for restoring a measuremement list containing multiple builtin/custom template formats. Signed-off-by: Mimi Zohar --- security/integrity/ima/ima_template.c | 53 +-- 1 file changed, 50 insertions(+), 3 deletions(-) diff --git a/security/integrity/ima/ima_template.c b/security/integrity/ima/ima_template.c index c0d808c20c40..e57b4682ff93 100644 --- a/security/integrity/ima/ima_template.c +++ b/security/integrity/ima/ima_template.c @@ -155,9 +155,14 @@ static int template_desc_init_fields(const char *template_fmt, { const char *template_fmt_ptr; struct ima_template_field *found_fields[IMA_TEMPLATE_NUM_FIELDS_MAX]; - int template_num_fields = template_fmt_size(template_fmt); + int template_num_fields; int i, len; + if (num_fields && *num_fields > 0) /* already initialized? */ + return 0; + + template_num_fields = template_fmt_size(template_fmt); + if (template_num_fields > IMA_TEMPLATE_NUM_FIELDS_MAX) { pr_err("format string '%s' contains too many fields\n", template_fmt); @@ -237,6 +242,35 @@ int __init ima_init_template(void) return result; } +static struct ima_template_desc *restore_template_fmt(char *template_name) +{ + struct ima_template_desc *template_desc = NULL; + int ret; + + ret = template_desc_init_fields(template_name, NULL, NULL); + if (ret < 0) { + pr_err("attempting to initialize the template \"%s\" failed\n", + template_name); + goto out; + } + + template_desc = kzalloc(sizeof(*template_desc), GFP_KERNEL); + if (!template_desc) + goto out; + + template_desc->name = ""; + template_desc->fmt = kstrdup(template_name, GFP_KERNEL); + if (!template_desc->fmt) + goto out; + + spin_lock(_list); + list_add_tail_rcu(_desc->list, _templates); + spin_unlock(_list); + synchronize_rcu(); +out: + return template_desc; +} + static int ima_restore_template_data(struct ima_template_desc *template_desc, void *template_data, int template_data_size, @@ -367,10 +401,23 @@ int ima_restore_measurement_list(loff_t size, void *buf) } data_v1 = bufp += (u_int8_t)hdr_v1->template_name_len; - /* get template format */ template_desc = lookup_template_desc(template_name); if (!template_desc) { - pr_err("template \"%s\" not found\n", template_name); + template_desc = restore_template_fmt(template_name); + if (!template_desc) + break; + } + + /* +* Only the running system's template format is initialized +* on boot. As needed, initialize the other template formats. +*/ + ret = template_desc_init_fields(template_desc->fmt, + &(template_desc->fields), + &(template_desc->num_fields)); + if (ret < 0) { + pr_err("attempting to restore the template fmt \"%s\" \ + failed\n", template_desc->fmt); ret = -EINVAL; break; } -- 2.7.4 ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
[PATCH v6 06/10] ima: on soft reboot, save the measurement list
From: Mimi Zohar <zo...@linux.vnet.ibm.com> The TPM PCRs are only reset on a hard reboot. In order to validate a TPM's quote after a soft reboot (eg. kexec -e), the IMA measurement list of the running kernel must be saved and restored on boot. This patch uses the kexec buffer passing mechanism to pass the serialized IMA binary_runtime_measurements to the next kernel. Changelog v5: - move writing the IMA measurement list to kexec load and remove from kexec execute. - remove registering notifier to call update on kexec execute - add includes needed by code in this patch to ima_kexec.c (Thiago) - fold patch "ima: serialize the binary_runtime_measurements" into this patch. Changelog v4: - Revert the skip_checksum change.  Instead calculate the checksum with the measurement list segment, on update validate the existing checksum before re-calulating a new checksum with the updated measurement list. Changelog v3: - Request a kexec segment for storing the measurement list a half page, not a full page, more than needed for additional measurements. - Added binary_runtime_size overflow test - Limit maximum number of pages needed for kexec_segment_size to half of totalram_pages. (Dave Young) Changelog v2: - Fix build issue by defining a stub ima_add_kexec_buffer and stub struct kimage when CONFIG_IMA=n and CONFIG_IMA_KEXEC=n. (Fenguang Wu) - removed kexec_add_handover_buffer() checksum argument. - added skip_checksum member to kexec_buf - only register reboot notifier once Changelog v1: - updated to call IMA functions (Mimi) - move code from ima_template.c to ima_kexec.c (Mimi) Signed-off-by: Thiago Jung Bauermann <bauer...@linux.vnet.ibm.com> Signed-off-by: Mimi Zohar <zo...@linux.vnet.ibm.com> Acked-by: "Eric W. Biederman" <ebied...@xmission.com> --- include/linux/ima.h| 12 kernel/kexec_file.c| 4 ++ security/integrity/ima/ima.h | 1 + security/integrity/ima/ima_fs.c| 2 +- security/integrity/ima/ima_kexec.c | 117 + 5 files changed, 135 insertions(+), 1 deletion(-) diff --git a/include/linux/ima.h b/include/linux/ima.h index 0eb7c2e7f0d6..7f6952f8d6aa 100644 --- a/include/linux/ima.h +++ b/include/linux/ima.h @@ -11,6 +11,7 @@ #define _LINUX_IMA_H #include +#include struct linux_binprm; #ifdef CONFIG_IMA @@ -23,6 +24,10 @@ extern int ima_post_read_file(struct file *file, void *buf, loff_t size, enum kernel_read_file_id id); extern void ima_post_path_mknod(struct dentry *dentry); +#ifdef CONFIG_IMA_KEXEC +extern void ima_add_kexec_buffer(struct kimage *image); +#endif + #else static inline int ima_bprm_check(struct linux_binprm *bprm) { @@ -62,6 +67,13 @@ static inline void ima_post_path_mknod(struct dentry *dentry) #endif /* CONFIG_IMA */ +#ifndef CONFIG_IMA_KEXEC +struct kimage; + +static inline void ima_add_kexec_buffer(struct kimage *image) +{} +#endif + #ifdef CONFIG_IMA_APPRAISE extern void ima_inode_post_setattr(struct dentry *dentry); extern int ima_inode_setxattr(struct dentry *dentry, const char *xattr_name, diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c index 0c2df7f73792..b56a558e406d 100644 --- a/kernel/kexec_file.c +++ b/kernel/kexec_file.c @@ -19,6 +19,7 @@ #include #include #include +#include #include #include #include @@ -132,6 +133,9 @@ kimage_file_prepare_segments(struct kimage *image, int kernel_fd, int initrd_fd, return ret; image->kernel_buf_len = size; + /* IMA needs to pass the measurement list to the next kernel. */ + ima_add_kexec_buffer(image); + /* Call arch image probe handlers */ ret = arch_kexec_kernel_image_probe(image, image->kernel_buf, image->kernel_buf_len); diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h index ea1dcc452911..139dec67dcbf 100644 --- a/security/integrity/ima/ima.h +++ b/security/integrity/ima/ima.h @@ -143,6 +143,7 @@ void ima_print_digest(struct seq_file *m, u8 *digest, u32 size); struct ima_template_desc *ima_template_desc_current(void); int ima_restore_measurement_entry(struct ima_template_entry *entry); int ima_restore_measurement_list(loff_t bufsize, void *buf); +int ima_measurements_show(struct seq_file *m, void *v); unsigned long ima_get_binary_runtime_size(void); int ima_init_template(void); diff --git a/security/integrity/ima/ima_fs.c b/security/integrity/ima/ima_fs.c index c07a3844ea0a..66e5dd5e226f 100644 --- a/security/integrity/ima/ima_fs.c +++ b/security/integrity/ima/ima_fs.c @@ -116,7 +116,7 @@ void ima_putc(struct seq_file *m, void *data, int datalen) * [eventdata length] * eventdata[n]=template specific data */ -static int ima_measurements_show(struct seq_file *m, void *v) +int ima_measurements_show(struct seq_file *m, void *v) { /* the list never sh
[PATCH v6 07/10] ima: store the builtin/custom template definitions in a list
From: Mimi ZoharThe builtin and single custom templates are currently stored in an array. In preparation for being able to restore a measurement list containing multiple builtin/custom templates, this patch stores the builtin and custom templates as a linked list. This will permit defining more than one custom template per boot. Changelog v4: - fix "spinlock bad magic" BUG - reported by Dmitry Vyukov Changelog v3: - initialize template format list in ima_template_desc_current(), as it might be called during __setup before normal initialization. (kernel test robot) - remove __init annotation of ima_init_template_list() Changelog v2: - fix lookup_template_desc() preemption imbalance (kernel test robot) Signed-off-by: Mimi Zohar --- security/integrity/ima/ima.h | 2 ++ security/integrity/ima/ima_main.c | 1 + security/integrity/ima/ima_template.c | 52 +++ 3 files changed, 44 insertions(+), 11 deletions(-) diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h index 139dec67dcbf..6b0540ad189f 100644 --- a/security/integrity/ima/ima.h +++ b/security/integrity/ima/ima.h @@ -85,6 +85,7 @@ struct ima_template_field { /* IMA template descriptor definition */ struct ima_template_desc { + struct list_head list; char *name; char *fmt; int num_fields; @@ -146,6 +147,7 @@ int ima_restore_measurement_list(loff_t bufsize, void *buf); int ima_measurements_show(struct seq_file *m, void *v); unsigned long ima_get_binary_runtime_size(void); int ima_init_template(void); +void ima_init_template_list(void); /* * used to protect h_table and sha_table diff --git a/security/integrity/ima/ima_main.c b/security/integrity/ima/ima_main.c index 423d111b3b94..50818c60538b 100644 --- a/security/integrity/ima/ima_main.c +++ b/security/integrity/ima/ima_main.c @@ -418,6 +418,7 @@ static int __init init_ima(void) { int error; + ima_init_template_list(); hash_setup(CONFIG_IMA_DEFAULT_HASH); error = ima_init(); if (!error) { diff --git a/security/integrity/ima/ima_template.c b/security/integrity/ima/ima_template.c index 37f972cb05fe..c0d808c20c40 100644 --- a/security/integrity/ima/ima_template.c +++ b/security/integrity/ima/ima_template.c @@ -15,16 +15,20 @@ #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt +#include #include "ima.h" #include "ima_template_lib.h" -static struct ima_template_desc defined_templates[] = { +static struct ima_template_desc builtin_templates[] = { {.name = IMA_TEMPLATE_IMA_NAME, .fmt = IMA_TEMPLATE_IMA_FMT}, {.name = "ima-ng", .fmt = "d-ng|n-ng"}, {.name = "ima-sig", .fmt = "d-ng|n-ng|sig"}, {.name = "", .fmt = ""},/* placeholder for a custom format */ }; +static LIST_HEAD(defined_templates); +static DEFINE_SPINLOCK(template_list); + static struct ima_template_field supported_fields[] = { {.field_id = "d", .field_init = ima_eventdigest_init, .field_show = ima_show_template_digest}, @@ -53,6 +57,8 @@ static int __init ima_template_setup(char *str) if (ima_template) return 1; + ima_init_template_list(); + /* * Verify that a template with the supplied name exists. * If not, use CONFIG_IMA_DEFAULT_TEMPLATE. @@ -81,7 +87,7 @@ __setup("ima_template=", ima_template_setup); static int __init ima_template_fmt_setup(char *str) { - int num_templates = ARRAY_SIZE(defined_templates); + int num_templates = ARRAY_SIZE(builtin_templates); if (ima_template) return 1; @@ -92,22 +98,28 @@ static int __init ima_template_fmt_setup(char *str) return 1; } - defined_templates[num_templates - 1].fmt = str; - ima_template = defined_templates + num_templates - 1; + builtin_templates[num_templates - 1].fmt = str; + ima_template = builtin_templates + num_templates - 1; + return 1; } __setup("ima_template_fmt=", ima_template_fmt_setup); static struct ima_template_desc *lookup_template_desc(const char *name) { - int i; + struct ima_template_desc *template_desc; + int found = 0; - for (i = 0; i < ARRAY_SIZE(defined_templates); i++) { - if (strcmp(defined_templates[i].name, name) == 0) - return defined_templates + i; + rcu_read_lock(); + list_for_each_entry_rcu(template_desc, _templates, list) { + if ((strcmp(template_desc->name, name) == 0) || + (strcmp(template_desc->fmt, name) == 0)) { + found = 1; + break; + } } - - return NULL; + rcu_read_unlock(); + return found ? template_desc : NULL; } static struct ima_template_field *lookup_template_field(const char *field_id) @@ -183,11 +195,29 @@ static int
[PATCH v6 09/10] ima: define a canonical binary_runtime_measurements list format
From: Mimi ZoharThe IMA binary_runtime_measurements list is currently in platform native format. To allow restoring a measurement list carried across kexec with a different endianness than the targeted kernel, this patch defines little-endian as the canonical format. For big endian systems wanting to save/restore the measurement list from a system with a different endianness, a new boot command line parameter named "ima_canonical_fmt" is defined. Considerations: use of the "ima_canonical_fmt" boot command line option will break existing userspace applications on big endian systems expecting the binary_runtime_measurements list to be in platform native format. Changelog v3: - restore PCR value properly Signed-off-by: Mimi Zohar --- Documentation/kernel-parameters.txt | 4 security/integrity/ima/ima.h | 6 ++ security/integrity/ima/ima_fs.c | 28 +--- security/integrity/ima/ima_kexec.c| 11 +-- security/integrity/ima/ima_template.c | 24 ++-- security/integrity/ima/ima_template_lib.c | 7 +-- 6 files changed, 67 insertions(+), 13 deletions(-) diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 37babf91f2cb..3ee81afad7e9 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -1641,6 +1641,10 @@ bytes respectively. Such letter suffixes can also be entirely omitted. The builtin appraise policy appraises all files owned by uid=0. + ima_canonical_fmt [IMA] + Use the canonical format for the binary runtime + measurements, instead of host native format. + ima_hash= [IMA] Format: { md5 | sha1 | rmd160 | sha256 | sha384 | sha512 | ... } diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h index 6b0540ad189f..5e6180a4da7d 100644 --- a/security/integrity/ima/ima.h +++ b/security/integrity/ima/ima.h @@ -122,6 +122,12 @@ void ima_load_kexec_buffer(void); static inline void ima_load_kexec_buffer(void) {} #endif /* CONFIG_HAVE_IMA_KEXEC */ +/* + * The default binary_runtime_measurements list format is defined as the + * platform native format. The canonical format is defined as little-endian. + */ +extern bool ima_canonical_fmt; + /* Internal IMA function definitions */ int ima_init(void); int ima_fs_init(void); diff --git a/security/integrity/ima/ima_fs.c b/security/integrity/ima/ima_fs.c index 66e5dd5e226f..2bcad99d434e 100644 --- a/security/integrity/ima/ima_fs.c +++ b/security/integrity/ima/ima_fs.c @@ -28,6 +28,16 @@ static DEFINE_MUTEX(ima_write_mutex); +bool ima_canonical_fmt; +static int __init default_canonical_fmt_setup(char *str) +{ +#ifdef __BIG_ENDIAN + ima_canonical_fmt = 1; +#endif + return 1; +} +__setup("ima_canonical_fmt", default_canonical_fmt_setup); + static int valid_policy = 1; #define TMPBUFLEN 12 static ssize_t ima_show_htable_value(char __user *buf, size_t count, @@ -122,7 +132,7 @@ int ima_measurements_show(struct seq_file *m, void *v) struct ima_queue_entry *qe = v; struct ima_template_entry *e; char *template_name; - int namelen; + u32 pcr, namelen, template_data_len; /* temporary fields */ bool is_ima_template = false; int i; @@ -139,25 +149,29 @@ int ima_measurements_show(struct seq_file *m, void *v) * PCR used defaults to the same (config option) in * little-endian format, unless set in policy */ - ima_putc(m, >pcr, sizeof(e->pcr)); + pcr = !ima_canonical_fmt ? e->pcr : cpu_to_le32(e->pcr); + ima_putc(m, , sizeof(e->pcr)); /* 2nd: template digest */ ima_putc(m, e->digest, TPM_DIGEST_SIZE); /* 3rd: template name size */ - namelen = strlen(template_name); + namelen = !ima_canonical_fmt ? strlen(template_name) : + cpu_to_le32(strlen(template_name)); ima_putc(m, , sizeof(namelen)); /* 4th: template name */ - ima_putc(m, template_name, namelen); + ima_putc(m, template_name, strlen(template_name)); /* 5th: template length (except for 'ima' template) */ if (strcmp(template_name, IMA_TEMPLATE_IMA_NAME) == 0) is_ima_template = true; - if (!is_ima_template) - ima_putc(m, >template_data_len, -sizeof(e->template_data_len)); + if (!is_ima_template) { + template_data_len = !ima_canonical_fmt ? e->template_data_len : + cpu_to_le32(e->template_data_len); + ima_putc(m, _data_len, sizeof(e->template_data_len)); + } /* 6th: template specific data */ for (i = 0; i < e->template_desc->num_fields; i++) {
[PATCH v6 10/10] ima: platform-independent hash value
From: Andreas SteffenFor remote attestion it is important for the ima measurement values to be platform-independent. Therefore integer fields to be hashed must be converted to canonical format. Changelog: - Define canonical format as little endian (Mimi) Signed-off-by: Andreas Steffen Signed-off-by: Mimi Zohar --- security/integrity/ima/ima_crypto.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/security/integrity/ima/ima_crypto.c b/security/integrity/ima/ima_crypto.c index 38f2ed830dd6..802d5d20f36f 100644 --- a/security/integrity/ima/ima_crypto.c +++ b/security/integrity/ima/ima_crypto.c @@ -477,11 +477,13 @@ static int ima_calc_field_array_hash_tfm(struct ima_field_data *field_data, u8 buffer[IMA_EVENT_NAME_LEN_MAX + 1] = { 0 }; u8 *data_to_hash = field_data[i].data; u32 datalen = field_data[i].len; + u32 datalen_to_hash = + !ima_canonical_fmt ? datalen : cpu_to_le32(datalen); if (strcmp(td->name, IMA_TEMPLATE_IMA_NAME) != 0) { rc = crypto_shash_update(shash, - (const u8 *) _data[i].len, - sizeof(field_data[i].len)); + (const u8 *) _to_hash, + sizeof(datalen_to_hash)); if (rc) break; } else if (strcmp(td->fields[i]->field_id, "n") == 0) { -- 2.7.4 ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
[PATCH v6 02/10] ima: on soft reboot, restore the measurement list
From: Mimi Zohar <zo...@linux.vnet.ibm.com> The TPM PCRs are only reset on a hard reboot. In order to validate a TPM's quote after a soft reboot (eg. kexec -e), the IMA measurement list of the running kernel must be saved and restored on boot. This patch restores the measurement list. Changelog v5: - replace CONFIG_KEXEC_FILE with architecture CONFIG_HAVE_IMA_KEXEC (Thiago) - replace kexec_get_handover_buffer() with ima_get_kexec_buffer() (Thiago) - replace kexec_free_handover_buffer() with ima_free_kexec_buffer() (Thiago) - remove unnecessary includes from ima_kexec.c (Thiago) - fix off-by-one error when checking hdr_v1->template_name_len (Colin King) Changelog v2: - redefined ima_kexec_hdr to use types with well defined sizes (M. Ellerman) - defined missing ima_load_kexec_buffer() stub function Changelog v1: - call ima_load_kexec_buffer() (Thiago) Signed-off-by: Mimi Zohar <zo...@linux.vnet.ibm.com> --- security/integrity/ima/Makefile | 1 + security/integrity/ima/ima.h | 21 + security/integrity/ima/ima_init.c | 2 + security/integrity/ima/ima_kexec.c| 44 + security/integrity/ima/ima_queue.c| 10 ++ security/integrity/ima/ima_template.c | 170 ++ 6 files changed, 248 insertions(+) diff --git a/security/integrity/ima/Makefile b/security/integrity/ima/Makefile index 9aeaedad1e2b..29f198bde02b 100644 --- a/security/integrity/ima/Makefile +++ b/security/integrity/ima/Makefile @@ -8,4 +8,5 @@ obj-$(CONFIG_IMA) += ima.o ima-y := ima_fs.o ima_queue.o ima_init.o ima_main.o ima_crypto.o ima_api.o \ ima_policy.o ima_template.o ima_template_lib.o ima-$(CONFIG_IMA_APPRAISE) += ima_appraise.o +ima-$(CONFIG_HAVE_IMA_KEXEC) += ima_kexec.o obj-$(CONFIG_IMA_BLACKLIST_KEYRING) += ima_mok.o diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h index db25f54a04fe..51dc8d57d64d 100644 --- a/security/integrity/ima/ima.h +++ b/security/integrity/ima/ima.h @@ -28,6 +28,10 @@ #include "../integrity.h" +#ifdef CONFIG_HAVE_IMA_KEXEC +#include +#endif + enum ima_show_type { IMA_SHOW_BINARY, IMA_SHOW_BINARY_NO_FIELD_LEN, IMA_SHOW_BINARY_OLD_STRING_FMT, IMA_SHOW_ASCII }; enum tpm_pcrs { TPM_PCR0 = 0, TPM_PCR8 = 8 }; @@ -102,6 +106,21 @@ struct ima_queue_entry { }; extern struct list_head ima_measurements; /* list of all measurements */ +/* Some details preceding the binary serialized measurement list */ +struct ima_kexec_hdr { + u16 version; + u16 _reserved0; + u32 _reserved1; + u64 buffer_size; + u64 count; +}; + +#ifdef CONFIG_HAVE_IMA_KEXEC +void ima_load_kexec_buffer(void); +#else +static inline void ima_load_kexec_buffer(void) {} +#endif /* CONFIG_HAVE_IMA_KEXEC */ + /* Internal IMA function definitions */ int ima_init(void); int ima_fs_init(void); @@ -122,6 +141,8 @@ int ima_init_crypto(void); void ima_putc(struct seq_file *m, void *data, int datalen); void ima_print_digest(struct seq_file *m, u8 *digest, u32 size); struct ima_template_desc *ima_template_desc_current(void); +int ima_restore_measurement_entry(struct ima_template_entry *entry); +int ima_restore_measurement_list(loff_t bufsize, void *buf); int ima_init_template(void); /* diff --git a/security/integrity/ima/ima_init.c b/security/integrity/ima/ima_init.c index 32912bd54ead..3ba0ca49cba6 100644 --- a/security/integrity/ima/ima_init.c +++ b/security/integrity/ima/ima_init.c @@ -128,6 +128,8 @@ int __init ima_init(void) if (rc != 0) return rc; + ima_load_kexec_buffer(); + rc = ima_add_boot_aggregate(); /* boot aggregate must be first entry */ if (rc != 0) return rc; diff --git a/security/integrity/ima/ima_kexec.c b/security/integrity/ima/ima_kexec.c new file mode 100644 index ..36afd0fe9747 --- /dev/null +++ b/security/integrity/ima/ima_kexec.c @@ -0,0 +1,44 @@ +/* + * Copyright (C) 2016 IBM Corporation + * + * Authors: + * Thiago Jung Bauermann <bauer...@linux.vnet.ibm.com> + * Mimi Zohar <zo...@linux.vnet.ibm.com> + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + */ +#include "ima.h" + +/* + * Restore the measurement list from the previous kernel. + */ +void ima_load_kexec_buffer(void) +{ + void *kexec_buffer = NULL; + size_t kexec_buffer_size = 0; + int rc; + + rc = ima_get_kexec_buffer(_buffer, _buffer_size); + switch (rc) { + case 0: + rc = ima_restore_measurement_list(kexec_buffer_size, + kexec_buffer); + if (rc != 0) + pr_err("Failed to restore the measurement list: %d\n", +
[PATCH v6 01/10] powerpc: ima: Get the kexec buffer passed by the previous kernel
The IMA kexec buffer allows the currently running kernel to pass the measurement list via a kexec segment to the kernel that will be kexec'd. The second kernel can check whether the previous kernel sent the buffer and retrieve it. This is the architecture-specific part which enables IMA to receive the measurement list passed by the previous kernel. It will be used in the next patch. The change in machine_kexec_64.c is to factor out the logic of removing an FDT memory reservation so that it can be used by remove_ima_buffer. Changelog v6: - The kexec_file_load patches v9 already define delete_fdt_mem_rsv, so now we just need to export it. Changelog v5: - New patch in this version. This code was previously in the kexec buffer handover patch series. Changelog relative to kexec handover patches v5: - Added CONFIG_HAVE_IMA_KEXEC. - Added arch/powerpc/include/asm/ima.h. - Moved code to arch/powerpc/kernel/ima_kexec.c. - Renamed functions to variations of ima_kexec_buffer instead of variations of kexec_handover_buffer. - Use a single property /chosen/linux,ima-kexec-buffer containing the buffer address and length, instead of /chosen/linux,kexec-handover-buffer-{start,end}. - Use #address-cells and #size-cells to read the DT property. - Use size_t instead of unsigned long for size arguments. - Always remove linux,ima-kexec-buffer and its memory reservation when preparing a device tree for kexec_file_load. Signed-off-by: Thiago Jung Bauermann <bauer...@linux.vnet.ibm.com> Acked-by: "Eric W. Biederman" <ebied...@xmission.com> --- arch/Kconfig| 3 + arch/powerpc/Kconfig| 1 + arch/powerpc/include/asm/ima.h | 13 +++ arch/powerpc/include/asm/kexec.h| 1 + arch/powerpc/kernel/Makefile| 4 + arch/powerpc/kernel/ima_kexec.c | 132 arch/powerpc/kernel/machine_kexec_file_64.c | 5 +- 7 files changed, 158 insertions(+), 1 deletion(-) diff --git a/arch/Kconfig b/arch/Kconfig index 659bdd079277..e1605ff286a1 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -5,6 +5,9 @@ config KEXEC_CORE bool +config HAVE_IMA_KEXEC + bool + config OPROFILE tristate "OProfile system profiling" depends on PROFILING diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 897d0f14447d..40ee044f1915 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -458,6 +458,7 @@ config KEXEC config KEXEC_FILE bool "kexec file based system call" select KEXEC_CORE + select HAVE_IMA_KEXEC select BUILD_BIN2C depends on PPC64 depends on CRYPTO=y diff --git a/arch/powerpc/include/asm/ima.h b/arch/powerpc/include/asm/ima.h new file mode 100644 index ..d5a72dd9b499 --- /dev/null +++ b/arch/powerpc/include/asm/ima.h @@ -0,0 +1,13 @@ +#ifndef _ASM_POWERPC_IMA_H +#define _ASM_POWERPC_IMA_H + +int ima_get_kexec_buffer(void **addr, size_t *size); +int ima_free_kexec_buffer(void); + +#ifdef CONFIG_IMA +void remove_ima_buffer(void *fdt, int chosen_node); +#else +static inline void remove_ima_buffer(void *fdt, int chosen_node) {} +#endif + +#endif /* _ASM_POWERPC_IMA_H */ diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h index 4497db7555b0..23056d2dc330 100644 --- a/arch/powerpc/include/asm/kexec.h +++ b/arch/powerpc/include/asm/kexec.h @@ -101,6 +101,7 @@ int setup_purgatory(struct kimage *image, const void *slave_code, int setup_new_fdt(void *fdt, unsigned long initrd_load_addr, unsigned long initrd_len, const char *cmdline); bool find_debug_console(const void *fdt); +int delete_fdt_mem_rsv(void *fdt, unsigned long start, unsigned long size); #endif /* CONFIG_KEXEC_FILE */ #else /* !CONFIG_KEXEC_CORE */ diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile index 424b13b1b2b0..c3b37171168c 100644 --- a/arch/powerpc/kernel/Makefile +++ b/arch/powerpc/kernel/Makefile @@ -111,6 +111,10 @@ obj-$(CONFIG_KEXEC_CORE) += machine_kexec.o crash.o \ machine_kexec_$(BITS).o obj-$(CONFIG_KEXEC_FILE) += machine_kexec_file_$(BITS).o elf_util.o \ kexec_elf_$(BITS).o +ifeq ($(CONFIG_HAVE_IMA_KEXEC)$(CONFIG_IMA),yy) +obj-y += ima_kexec.o +endif + obj-$(CONFIG_AUDIT)+= audit.o obj64-$(CONFIG_AUDIT) += compat_audit.o diff --git a/arch/powerpc/kernel/ima_kexec.c b/arch/powerpc/kernel/ima_kexec.c new file mode 100644 index ..36e5a5df3804 --- /dev/null +++ b/arch/powerpc/kernel/ima_kexec.c @@ -0,0 +1,132 @@ +/* + * Copyright (C) 2016 IBM Corporation + * + * Authors: + * Thiago Jung Bauermann <bauer...@linux.vnet.ibm.com> + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as publi
[PATCH v6 05/10] powerpc: ima: Send the kexec buffer to the next kernel
The IMA kexec buffer allows the currently running kernel to pass the measurement list via a kexec segment to the kernel that will be kexec'd. This is the architecture-specific part of setting up the IMA kexec buffer for the next kernel. It will be used in the next patch. Changelog v5: - New patch in this version. This code was previously in the kexec buffer handover patch series. Changelog relative to kexec handover patches v5: - Moved code to arch/powerpc/kernel/ima_kexec.c. - Renamed functions and struct members to variations of ima_kexec_buffer instead of variations of kexec_handover_buffer. - Use a single property /chosen/linux,ima-kexec-buffer containing the buffer address and length, instead of /chosen/linux,kexec-handover-buffer-{start,end}. - Use #address-cells and #size-cells to write the DT property. - Use size_t instead of unsigned long for size arguments. - Use CONFIG_IMA_KEXEC to build this code only when necessary. Signed-off-by: Thiago Jung Bauermann <bauer...@linux.vnet.ibm.com> Acked-by: "Eric W. Biederman" <ebied...@xmission.com> --- arch/powerpc/include/asm/ima.h | 16 + arch/powerpc/include/asm/kexec.h| 14 - arch/powerpc/kernel/ima_kexec.c | 91 + arch/powerpc/kernel/kexec_elf_64.c | 2 +- arch/powerpc/kernel/machine_kexec_file_64.c | 12 +++- 5 files changed, 129 insertions(+), 6 deletions(-) diff --git a/arch/powerpc/include/asm/ima.h b/arch/powerpc/include/asm/ima.h index d5a72dd9b499..2313bdface34 100644 --- a/arch/powerpc/include/asm/ima.h +++ b/arch/powerpc/include/asm/ima.h @@ -1,6 +1,8 @@ #ifndef _ASM_POWERPC_IMA_H #define _ASM_POWERPC_IMA_H +struct kimage; + int ima_get_kexec_buffer(void **addr, size_t *size); int ima_free_kexec_buffer(void); @@ -10,4 +12,18 @@ void remove_ima_buffer(void *fdt, int chosen_node); static inline void remove_ima_buffer(void *fdt, int chosen_node) {} #endif +#ifdef CONFIG_IMA_KEXEC +int arch_ima_add_kexec_buffer(struct kimage *image, unsigned long load_addr, + size_t size); + +int setup_ima_buffer(const struct kimage *image, void *fdt, int chosen_node); +#else +static inline int setup_ima_buffer(const struct kimage *image, void *fdt, + int chosen_node) +{ + remove_ima_buffer(fdt, chosen_node); + return 0; +} +#endif /* CONFIG_IMA_KEXEC */ + #endif /* _ASM_POWERPC_IMA_H */ diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h index 23056d2dc330..a49cab287acb 100644 --- a/arch/powerpc/include/asm/kexec.h +++ b/arch/powerpc/include/asm/kexec.h @@ -94,12 +94,22 @@ static inline bool kdump_in_progress(void) #ifdef CONFIG_KEXEC_FILE extern struct kexec_file_ops kexec_elf64_ops; +#ifdef CONFIG_IMA_KEXEC +#define ARCH_HAS_KIMAGE_ARCH + +struct kimage_arch { + phys_addr_t ima_buffer_addr; + size_t ima_buffer_size; +}; +#endif + int setup_purgatory(struct kimage *image, const void *slave_code, const void *fdt, unsigned long kernel_load_addr, unsigned long fdt_load_addr, unsigned long stack_top, int debug); -int setup_new_fdt(void *fdt, unsigned long initrd_load_addr, - unsigned long initrd_len, const char *cmdline); +int setup_new_fdt(const struct kimage *image, void *fdt, + unsigned long initrd_load_addr, unsigned long initrd_len, + const char *cmdline); bool find_debug_console(const void *fdt); int delete_fdt_mem_rsv(void *fdt, unsigned long start, unsigned long size); #endif /* CONFIG_KEXEC_FILE */ diff --git a/arch/powerpc/kernel/ima_kexec.c b/arch/powerpc/kernel/ima_kexec.c index 36e5a5df3804..5ea42c937ca9 100644 --- a/arch/powerpc/kernel/ima_kexec.c +++ b/arch/powerpc/kernel/ima_kexec.c @@ -130,3 +130,94 @@ void remove_ima_buffer(void *fdt, int chosen_node) if (!ret) pr_debug("Removed old IMA buffer reservation.\n"); } + +#ifdef CONFIG_IMA_KEXEC +/** + * arch_ima_add_kexec_buffer - do arch-specific steps to add the IMA buffer + * + * Architectures should use this function to pass on the IMA buffer + * information to the next kernel. + * + * Return: 0 on success, negative errno on error. + */ +int arch_ima_add_kexec_buffer(struct kimage *image, unsigned long load_addr, + size_t size) +{ + image->arch.ima_buffer_addr = load_addr; + image->arch.ima_buffer_size = size; + + return 0; +} + +static int write_number(void *p, u64 value, int cells) +{ + if (cells == 1) { + u32 tmp; + + if (value > U32_MAX) + return -EINVAL; + + tmp = cpu_to_be32(value); + memcpy(p, , sizeof(tmp)); + } else if (cells == 2) { + u64 tmp; + + tmp = cpu_to_be64(value); + memcpy(p, , sizeof(tmp)); +
[PATCH v6 03/10] ima: permit duplicate measurement list entries
From: Mimi ZoharMeasurements carried across kexec need to be added to the IMA measurement list, but should not prevent measurements of the newly booted kernel from being added to the measurement list. This patch adds support for allowing duplicate measurements. The "boot_aggregate" measurement entry is the delimiter between soft boots. Signed-off-by: Mimi Zohar --- security/integrity/ima/ima_queue.c | 15 +-- 1 file changed, 9 insertions(+), 6 deletions(-) diff --git a/security/integrity/ima/ima_queue.c b/security/integrity/ima/ima_queue.c index 4b1bb7787839..12d1b040bca9 100644 --- a/security/integrity/ima/ima_queue.c +++ b/security/integrity/ima/ima_queue.c @@ -65,11 +65,12 @@ static struct ima_queue_entry *ima_lookup_digest_entry(u8 *digest_value, } /* ima_add_template_entry helper function: - * - Add template entry to measurement list and hash table. + * - Add template entry to the measurement list and hash table, for + * all entries except those carried across kexec. * * (Called with ima_extend_list_mutex held.) */ -static int ima_add_digest_entry(struct ima_template_entry *entry) +static int ima_add_digest_entry(struct ima_template_entry *entry, int flags) { struct ima_queue_entry *qe; unsigned int key; @@ -85,8 +86,10 @@ static int ima_add_digest_entry(struct ima_template_entry *entry) list_add_tail_rcu(>later, _measurements); atomic_long_inc(_htable.len); - key = ima_hash_key(entry->digest); - hlist_add_head_rcu(>hnext, _htable.queue[key]); + if (flags) { + key = ima_hash_key(entry->digest); + hlist_add_head_rcu(>hnext, _htable.queue[key]); + } return 0; } @@ -126,7 +129,7 @@ int ima_add_template_entry(struct ima_template_entry *entry, int violation, } } - result = ima_add_digest_entry(entry); + result = ima_add_digest_entry(entry, 1); if (result < 0) { audit_cause = "ENOMEM"; audit_info = 0; @@ -155,7 +158,7 @@ int ima_restore_measurement_entry(struct ima_template_entry *entry) int result = 0; mutex_lock(_extend_list_mutex); - result = ima_add_digest_entry(entry); + result = ima_add_digest_entry(entry, 0); mutex_unlock(_extend_list_mutex); return result; } -- 2.7.4 ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
[PATCH v6 00/10] ima: carry the measurement list across kexec
Hello, This is just a rebase on top of kexec_file_load patches v9 which I just posted. The previous version of this series has some conflicts with it. Original cover letter: The TPM PCRs are only reset on a hard reboot. In order to validate a TPM's quote after a soft reboot (eg. kexec -e), the IMA measurement list of the running kernel must be saved and then restored on the subsequent boot, possibly of a different architecture. The existing securityfs binary_runtime_measurements file conveniently provides a serialized format of the IMA measurement list. This patch set serializes the measurement list in this format and restores it. Up to now, the binary_runtime_measurements was defined as architecture native format. The assumption being that userspace could and would handle any architecture conversions. With the ability of carrying the measurement list across kexec, possibly from one architecture to a different one, the per boot architecture information is lost and with it the ability of recalculating the template digest hash. To resolve this problem, without breaking the existing ABI, this patch set introduces the boot command line option "ima_canonical_fmt", which is arbitrarily defined as little endian. The need for this boot command line option will be limited to the existing version 1 format of the binary_runtime_measurements. Subsequent formats will be defined as canonical format (eg. TPM 2.0 support for larger digests). A simplified method of Thiago Bauermann's "kexec buffer handover" patch series for carrying the IMA measurement list across kexec is included in this patch set. The simplified method requires all file measurements be taken prior to executing the kexec load, as subsequent measurements will not be carried across the kexec and restored. Changelog v6: - Rebased on top of "kexec_file_load implementation for PowerPC" patches v9. Changelog v5: - Included patches from Thiago Bauermann's "kexec buffer handover" patch series for carrying the IMA measurement list across kexec. - Added CONFIG_HAVE_IMA_KEXEC - Renamed functions to variations of ima_kexec_buffer instead of variations of kexec_handover_buffer Changelog v4: - Fixed "spinlock bad magic" BUG - reported by Dmitry Vyukov - Rebased on Thiago Bauermann's v5 patch set - Removed the skip_checksum initialization Changelog v3: - Cleaned up the code for calculating the requested kexec segment size needed for the IMA measurement list, limiting the segment size to half of the totalram_pages. - Fixed kernel test robot reports as enumerated in the respective patch changelog. Changelog v2: - Canonical measurement list support added - Redefined the ima_kexec_hdr struct to use well defined sizes Andreas Steffen (1): ima: platform-independent hash value Mimi Zohar (7): ima: on soft reboot, restore the measurement list ima: permit duplicate measurement list entries ima: maintain memory size needed for serializing the measurement list ima: on soft reboot, save the measurement list ima: store the builtin/custom template definitions in a list ima: support restoring multiple template formats ima: define a canonical binary_runtime_measurements list format Thiago Jung Bauermann (2): powerpc: ima: Get the kexec buffer passed by the previous kernel powerpc: ima: Send the kexec buffer to the next kernel Documentation/kernel-parameters.txt | 4 + arch/Kconfig| 3 + arch/powerpc/Kconfig| 1 + arch/powerpc/include/asm/ima.h | 29 +++ arch/powerpc/include/asm/kexec.h| 15 +- arch/powerpc/kernel/Makefile| 4 + arch/powerpc/kernel/ima_kexec.c | 223 + arch/powerpc/kernel/kexec_elf_64.c | 2 +- arch/powerpc/kernel/machine_kexec_file_64.c | 15 +- include/linux/ima.h | 12 ++ kernel/kexec_file.c | 4 + security/integrity/ima/Kconfig | 12 ++ security/integrity/ima/Makefile | 1 + security/integrity/ima/ima.h| 31 +++ security/integrity/ima/ima_crypto.c | 6 +- security/integrity/ima/ima_fs.c | 30 ++- security/integrity/ima/ima_init.c | 2 + security/integrity/ima/ima_kexec.c | 168 security/integrity/ima/ima_main.c | 1 + security/integrity/ima/ima_queue.c | 76 +++- security/integrity/ima/ima_template.c | 293 ++-- security/integrity/ima/ima_template_lib.c | 7 +- 22 files changed, 901 insertions(+), 38 deletions(-) create mode 100644 arch/powerpc/include/asm/ima.h create mode 100644 arch/powerpc/kernel/ima_kexec.c create mode 100644 security/integrity/ima/ima_kexec.c -- 2.7.4 ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
[PATCH v9 10/10] powerpc: Enable CONFIG_KEXEC_FILE in powerpc server defconfigs.
Enable CONFIG_KEXEC_FILE in powernv_defconfig, ppc64_defconfig and pseries_defconfig. It depends on CONFIG_CRYPTO_SHA256=y, so add that as well. Signed-off-by: Thiago Jung Bauermann <bauer...@linux.vnet.ibm.com> --- arch/powerpc/configs/powernv_defconfig | 2 ++ arch/powerpc/configs/ppc64_defconfig | 2 ++ arch/powerpc/configs/pseries_defconfig | 2 ++ 3 files changed, 6 insertions(+) diff --git a/arch/powerpc/configs/powernv_defconfig b/arch/powerpc/configs/powernv_defconfig index d98b6eb3254f..5a190aa5534b 100644 --- a/arch/powerpc/configs/powernv_defconfig +++ b/arch/powerpc/configs/powernv_defconfig @@ -49,6 +49,7 @@ CONFIG_BINFMT_MISC=m CONFIG_PPC_TRANSACTIONAL_MEM=y CONFIG_HOTPLUG_CPU=y CONFIG_KEXEC=y +CONFIG_KEXEC_FILE=y CONFIG_IRQ_ALL_CPUS=y CONFIG_NUMA=y CONFIG_MEMORY_HOTPLUG=y @@ -301,6 +302,7 @@ CONFIG_CRYPTO_CCM=m CONFIG_CRYPTO_PCBC=m CONFIG_CRYPTO_HMAC=y CONFIG_CRYPTO_MICHAEL_MIC=m +CONFIG_CRYPTO_SHA256=y CONFIG_CRYPTO_TGR192=m CONFIG_CRYPTO_WP512=m CONFIG_CRYPTO_ANUBIS=m diff --git a/arch/powerpc/configs/ppc64_defconfig b/arch/powerpc/configs/ppc64_defconfig index 58a98d40086f..0059d2088b9c 100644 --- a/arch/powerpc/configs/ppc64_defconfig +++ b/arch/powerpc/configs/ppc64_defconfig @@ -46,6 +46,7 @@ CONFIG_HZ_100=y CONFIG_BINFMT_MISC=m CONFIG_PPC_TRANSACTIONAL_MEM=y CONFIG_KEXEC=y +CONFIG_KEXEC_FILE=y CONFIG_CRASH_DUMP=y CONFIG_IRQ_ALL_CPUS=y CONFIG_MEMORY_HOTREMOVE=y @@ -336,6 +337,7 @@ CONFIG_CRYPTO_TEST=m CONFIG_CRYPTO_PCBC=m CONFIG_CRYPTO_HMAC=y CONFIG_CRYPTO_MICHAEL_MIC=m +CONFIG_CRYPTO_SHA256=y CONFIG_CRYPTO_TGR192=m CONFIG_CRYPTO_WP512=m CONFIG_CRYPTO_ANUBIS=m diff --git a/arch/powerpc/configs/pseries_defconfig b/arch/powerpc/configs/pseries_defconfig index 8a3bc016b732..f022f657a984 100644 --- a/arch/powerpc/configs/pseries_defconfig +++ b/arch/powerpc/configs/pseries_defconfig @@ -52,6 +52,7 @@ CONFIG_HZ_100=y CONFIG_BINFMT_MISC=m CONFIG_PPC_TRANSACTIONAL_MEM=y CONFIG_KEXEC=y +CONFIG_KEXEC_FILE=y CONFIG_IRQ_ALL_CPUS=y CONFIG_MEMORY_HOTPLUG=y CONFIG_MEMORY_HOTREMOVE=y @@ -303,6 +304,7 @@ CONFIG_CRYPTO_TEST=m CONFIG_CRYPTO_PCBC=m CONFIG_CRYPTO_HMAC=y CONFIG_CRYPTO_MICHAEL_MIC=m +CONFIG_CRYPTO_SHA256=y CONFIG_CRYPTO_TGR192=m CONFIG_CRYPTO_WP512=m CONFIG_CRYPTO_ANUBIS=m -- 2.7.4 ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
[PATCH v9 04/10] powerpc: Change places using CONFIG_KEXEC to use CONFIG_KEXEC_CORE instead.
Commit 2965faa5e03d ("kexec: split kexec_load syscall from kexec core code") introduced CONFIG_KEXEC_CORE so that CONFIG_KEXEC means whether the kexec_load system call should be compiled-in and CONFIG_KEXEC_FILE means whether the kexec_file_load system call should be compiled-in. These options can be set independently from each other. Since until now powerpc only supported kexec_load, CONFIG_KEXEC and CONFIG_KEXEC_CORE were synonyms. That is not the case anymore, so we need to make a distinction. Almost all places where CONFIG_KEXEC was being used should be using CONFIG_KEXEC_CORE instead, since kexec_file_load also needs that code compiled in. Signed-off-by: Thiago Jung Bauermann <bauer...@linux.vnet.ibm.com> --- arch/powerpc/Kconfig | 2 +- arch/powerpc/include/asm/debug.h | 2 +- arch/powerpc/include/asm/kexec.h | 6 +++--- arch/powerpc/include/asm/machdep.h| 4 ++-- arch/powerpc/include/asm/smp.h| 2 +- arch/powerpc/kernel/Makefile | 4 ++-- arch/powerpc/kernel/head_64.S | 2 +- arch/powerpc/kernel/misc_32.S | 2 +- arch/powerpc/kernel/misc_64.S | 6 +++--- arch/powerpc/kernel/prom.c| 2 +- arch/powerpc/kernel/setup_64.c| 4 ++-- arch/powerpc/kernel/smp.c | 6 +++--- arch/powerpc/kernel/traps.c | 2 +- arch/powerpc/platforms/85xx/corenet_generic.c | 2 +- arch/powerpc/platforms/85xx/smp.c | 8 arch/powerpc/platforms/cell/spu_base.c| 2 +- arch/powerpc/platforms/powernv/setup.c| 6 +++--- arch/powerpc/platforms/ps3/setup.c| 4 ++-- arch/powerpc/platforms/pseries/Makefile | 2 +- arch/powerpc/platforms/pseries/setup.c| 4 ++-- 20 files changed, 36 insertions(+), 36 deletions(-) diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 65fba4c34cd7..6cb59c6e5ba4 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -489,7 +489,7 @@ config CRASH_DUMP config FA_DUMP bool "Firmware-assisted dump" - depends on PPC64 && PPC_RTAS && CRASH_DUMP && KEXEC + depends on PPC64 && PPC_RTAS && CRASH_DUMP && KEXEC_CORE help A robust mechanism to get reliable kernel crash dump with assistance from firmware. This approach does not use kexec, diff --git a/arch/powerpc/include/asm/debug.h b/arch/powerpc/include/asm/debug.h index a954e4975049..86308f177f2d 100644 --- a/arch/powerpc/include/asm/debug.h +++ b/arch/powerpc/include/asm/debug.h @@ -10,7 +10,7 @@ struct pt_regs; extern struct dentry *powerpc_debugfs_root; -#if defined(CONFIG_DEBUGGER) || defined(CONFIG_KEXEC) +#if defined(CONFIG_DEBUGGER) || defined(CONFIG_KEXEC_CORE) extern int (*__debugger)(struct pt_regs *regs); extern int (*__debugger_ipi)(struct pt_regs *regs); diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h index a46f5f45570c..eca2f975bf44 100644 --- a/arch/powerpc/include/asm/kexec.h +++ b/arch/powerpc/include/asm/kexec.h @@ -53,7 +53,7 @@ typedef void (*crash_shutdown_t)(void); -#ifdef CONFIG_KEXEC +#ifdef CONFIG_KEXEC_CORE /* * This function is responsible for capturing register states if coming @@ -91,7 +91,7 @@ static inline bool kdump_in_progress(void) return crashing_cpu >= 0; } -#else /* !CONFIG_KEXEC */ +#else /* !CONFIG_KEXEC_CORE */ static inline void crash_kexec_secondary(struct pt_regs *regs) { } static inline int overlaps_crashkernel(unsigned long start, unsigned long size) @@ -116,7 +116,7 @@ static inline bool kdump_in_progress(void) return false; } -#endif /* CONFIG_KEXEC */ +#endif /* CONFIG_KEXEC_CORE */ #endif /* ! __ASSEMBLY__ */ #endif /* __KERNEL__ */ #endif /* _ASM_POWERPC_KEXEC_H */ diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h index e02cbc6a6c70..5011b69107a7 100644 --- a/arch/powerpc/include/asm/machdep.h +++ b/arch/powerpc/include/asm/machdep.h @@ -183,7 +183,7 @@ struct machdep_calls { */ void (*machine_shutdown)(void); -#ifdef CONFIG_KEXEC +#ifdef CONFIG_KEXEC_CORE void (*kexec_cpu_down)(int crash_shutdown, int secondary); /* Called to do what every setup is needed on image and the @@ -198,7 +198,7 @@ struct machdep_calls { * no return. */ void (*machine_kexec)(struct kimage *image); -#endif /* CONFIG_KEXEC */ +#endif /* CONFIG_KEXEC_CORE */ #ifdef CONFIG_SUSPEND /* These are called to disable and enable, respectively, IRQs when diff --git a/arch/powerpc/include/asm/smp.h b/arch/powerpc/include/asm/smp.h index 0d02c11dc331..32db16d2e7ad 100644 --- a/arch/powerpc/include/asm/smp.h +++ b/arch/powerpc/include/asm/smp.h @@ -176,7 +176,7 @@ static inline void set_hard_smp_processor_id(int cpu, int phys) #endif /* !CONFI
[PATCH v9 09/10] powerpc: Add purgatory for kexec_file_load implementation.
This purgatory implementation comes from kexec-tools, almost unchanged. In order to use boot/string.S in ppc64 big endian mode, the functions defined in it need to have dot symbols so that they can be called from C code. Therefore, change the file to use a DOTSYM macro if one is defined, so that the purgatory can add those dot symbols. The changes made to the purgatory code relative to the version in kexec-tools were: The sha256_regions global variable was renamed to sha_regions to match what kexec_file_load expects, and to use the sha256.c file from x86's purgatory (this avoids adding yet another SHA-256 implementation). The global variables in purgatory.c and purgatory-ppc64.c now use a __section attribute to put them in the .data section instead of being initialized to zero. It doesn't matter what their initial value is, because they will be set by the kernel when preparing the kexec image. Also, since we don't support loading a crashdump kernel via kexec_file_load yet, the code related to that functionality has been removed. Finally, some checkpatch.pl warnings were fixed. Signed-off-by: Thiago Jung Bauermann <bauer...@linux.vnet.ibm.com> --- arch/powerpc/Makefile| 1 + arch/powerpc/boot/string.S | 67 +++-- arch/powerpc/purgatory/.gitignore| 2 + arch/powerpc/purgatory/Makefile | 33 +++ arch/powerpc/purgatory/console-ppc64.c | 37 +++ arch/powerpc/purgatory/crtsavres.S | 5 + arch/powerpc/purgatory/hvCall.S | 27 + arch/powerpc/purgatory/hvCall.h | 8 ++ arch/powerpc/purgatory/kexec-sha256.h| 11 +++ arch/powerpc/purgatory/ppc64_asm.h | 20 arch/powerpc/purgatory/printf.c | 164 +++ arch/powerpc/purgatory/purgatory-ppc64.c | 36 +++ arch/powerpc/purgatory/purgatory.c | 62 arch/powerpc/purgatory/purgatory.h | 14 +++ arch/powerpc/purgatory/sha256.c | 6 ++ arch/powerpc/purgatory/sha256.h | 1 + arch/powerpc/purgatory/string.S | 2 + arch/powerpc/purgatory/v2wrap.S | 134 + 18 files changed, 601 insertions(+), 29 deletions(-) diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile index 617dece67924..5e7dcdaf93f5 100644 --- a/arch/powerpc/Makefile +++ b/arch/powerpc/Makefile @@ -249,6 +249,7 @@ core-y += arch/powerpc/kernel/ \ core-$(CONFIG_XMON)+= arch/powerpc/xmon/ core-$(CONFIG_KVM) += arch/powerpc/kvm/ core-$(CONFIG_PERF_EVENTS) += arch/powerpc/perf/ +core-$(CONFIG_KEXEC_FILE) += arch/powerpc/purgatory/ drivers-$(CONFIG_OPROFILE) += arch/powerpc/oprofile/ diff --git a/arch/powerpc/boot/string.S b/arch/powerpc/boot/string.S index acc9428f2789..b54bbad5f83d 100644 --- a/arch/powerpc/boot/string.S +++ b/arch/powerpc/boot/string.S @@ -11,9 +11,18 @@ #include "ppc_asm.h" +/* + * The ppc64 kexec purgatory uses this file and packages it in ELF64, + * so it needs dot symbols for the ppc64 big endian ABI. This macro + * allows it to create those symbols. + */ +#ifndef DOTSYM +#define DOTSYM(a) a +#endif + .text - .globl strcpy -strcpy: + .globl DOTSYM(strcpy) +DOTSYM(strcpy): addir5,r3,-1 addir4,r4,-1 1: lbzur0,1(r4) @@ -22,8 +31,8 @@ strcpy: bne 1b blr - .globl strncpy -strncpy: + .globl DOTSYM(strncpy) +DOTSYM(strncpy): cmpwi 0,r5,0 beqlr mtctr r5 @@ -35,8 +44,8 @@ strncpy: bdnzf 2,1b/* dec ctr, branch if ctr != 0 && !cr0.eq */ blr - .globl strcat -strcat: + .globl DOTSYM(strcat) +DOTSYM(strcat): addir5,r3,-1 addir4,r4,-1 1: lbzur0,1(r5) @@ -49,8 +58,8 @@ strcat: bne 1b blr - .globl strchr -strchr: + .globl DOTSYM(strchr) +DOTSYM(strchr): addir3,r3,-1 1: lbzur0,1(r3) cmpw0,r0,r4 @@ -60,8 +69,8 @@ strchr: li r3,0 blr - .globl strcmp -strcmp: + .globl DOTSYM(strcmp) +DOTSYM(strcmp): addir5,r3,-1 addir4,r4,-1 1: lbzur3,1(r5) @@ -72,8 +81,8 @@ strcmp: beq 1b blr - .globl strncmp -strncmp: + .globl DOTSYM(strncmp) +DOTSYM(strncmp): mtctr r5 addir5,r3,-1 addir4,r4,-1 @@ -85,8 +94,8 @@ strncmp: bdnzt eq,1b blr - .globl strlen -strlen: + .globl DOTSYM(strlen) +DOTSYM(strlen): addir4,r3,-1 1: lbzur0,1(r4) cmpwi 0,r0,0 @@ -94,8 +103,8 @@ strlen: subfr3,r3,r4 blr - .globl memset -memset: + .globl DOTSYM(memset) +DOTSYM(memset): rlwimi r4,r4,8,16,23 rlwimi r4,r4,16,0,15 addir6,r3,-4 @@ -120,14 +129,14 @@ me
[PATCH v9 08/10] powerpc: Add support for loading ELF kernels with kexec_file_load.
This uses all the infrastructure built up by the previous patches in the series to load an ELF vmlinux file and an initrd. It uses the flattened device tree at initial_boot_params as a base and adjusts memory reservations and its /chosen node for the next kernel. [a...@linux-foundation.org: coding-style fixes] Signed-off-by: Thiago Jung Bauermann <bauer...@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <a...@linux-foundation.org> --- arch/powerpc/include/asm/kexec.h| 12 + arch/powerpc/kernel/Makefile| 3 +- arch/powerpc/kernel/kexec_elf_64.c | 280 +++ arch/powerpc/kernel/machine_kexec_file_64.c | 338 +++- 4 files changed, 630 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h index eca2f975bf44..4497db7555b0 100644 --- a/arch/powerpc/include/asm/kexec.h +++ b/arch/powerpc/include/asm/kexec.h @@ -91,6 +91,18 @@ static inline bool kdump_in_progress(void) return crashing_cpu >= 0; } +#ifdef CONFIG_KEXEC_FILE +extern struct kexec_file_ops kexec_elf64_ops; + +int setup_purgatory(struct kimage *image, const void *slave_code, + const void *fdt, unsigned long kernel_load_addr, + unsigned long fdt_load_addr, unsigned long stack_top, + int debug); +int setup_new_fdt(void *fdt, unsigned long initrd_load_addr, + unsigned long initrd_len, const char *cmdline); +bool find_debug_console(const void *fdt); +#endif /* CONFIG_KEXEC_FILE */ + #else /* !CONFIG_KEXEC_CORE */ static inline void crash_kexec_secondary(struct pt_regs *regs) { } diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile index de14b7eb11bb..424b13b1b2b0 100644 --- a/arch/powerpc/kernel/Makefile +++ b/arch/powerpc/kernel/Makefile @@ -109,7 +109,8 @@ obj-$(CONFIG_PCI) += pci_$(BITS).o $(pci64-y) \ obj-$(CONFIG_PCI_MSI) += msi.o obj-$(CONFIG_KEXEC_CORE) += machine_kexec.o crash.o \ machine_kexec_$(BITS).o -obj-$(CONFIG_KEXEC_FILE) += machine_kexec_file_$(BITS).o elf_util.o +obj-$(CONFIG_KEXEC_FILE) += machine_kexec_file_$(BITS).o elf_util.o \ + kexec_elf_$(BITS).o obj-$(CONFIG_AUDIT)+= audit.o obj64-$(CONFIG_AUDIT) += compat_audit.o diff --git a/arch/powerpc/kernel/kexec_elf_64.c b/arch/powerpc/kernel/kexec_elf_64.c new file mode 100644 index ..dc29e0131b76 --- /dev/null +++ b/arch/powerpc/kernel/kexec_elf_64.c @@ -0,0 +1,280 @@ +/* + * Load ELF vmlinux file for the kexec_file_load syscall. + * + * Copyright (C) 2004 Adam Litke (a...@us.ibm.com) + * Copyright (C) 2004 IBM Corp. + * Copyright (C) 2005 R Sharada (shar...@in.ibm.com) + * Copyright (C) 2006 Mohan Kumar M (mo...@in.ibm.com) + * Copyright (C) 2016 IBM Corporation + * + * Based on kexec-tools' kexec-elf-exec.c and kexec-elf-ppc64.c. + * Heavily modified for the kernel by + * Thiago Jung Bauermann <bauer...@linux.vnet.ibm.com>. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation (version 2 of the License). + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + */ + +#define pr_fmt(fmt)"kexec_elf: " fmt + +#include +#include +#include +#include +#include +#include +#include + +#define PURGATORY_STACK_SIZE (16 * 1024) + +/** + * build_elf_exec_info - read ELF executable and check that we can use it + */ +static int build_elf_exec_info(const char *buf, size_t len, struct elfhdr *ehdr, + struct elf_info *elf_info) +{ + int i; + int ret; + + ret = elf_read_from_buffer(buf, len, ehdr, elf_info); + if (ret) + return ret; + + /* Big endian vmlinux has type ET_DYN. */ + if (ehdr->e_type != ET_EXEC && ehdr->e_type != ET_DYN) { + pr_err("Not an ELF executable.\n"); + goto error; + } else if (!elf_info->proghdrs) { + pr_err("No ELF program header.\n"); + goto error; + } + + for (i = 0; i < ehdr->e_phnum; i++) { + /* +* Kexec does not support loading interpreters. +* In addition this check keeps us from attempting +* to kexec ordinay executables. +*/ + if (elf_info->proghdrs[i].p_type == PT_INTERP) { + pr_err("Requires an ELF interpreter.\n"); + goto error; + } + } + + retur
[PATCH v9 03/10] kexec_file: Factor out kexec_locate_mem_hole from kexec_add_buffer.
kexec_locate_mem_hole will be used by the PowerPC kexec_file_load implementation to find free memory for the purgatory stack. Signed-off-by: Thiago Jung Bauermann <bauer...@linux.vnet.ibm.com> Acked-by: Dave Young <dyo...@redhat.com> --- include/linux/kexec.h | 1 + kernel/kexec_file.c | 25 - 2 files changed, 21 insertions(+), 5 deletions(-) diff --git a/include/linux/kexec.h b/include/linux/kexec.h index 437ef1b47428..a33f63351f86 100644 --- a/include/linux/kexec.h +++ b/include/linux/kexec.h @@ -176,6 +176,7 @@ struct kexec_buf { int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf, int (*func)(u64, u64, void *)); extern int kexec_add_buffer(struct kexec_buf *kbuf); +int kexec_locate_mem_hole(struct kexec_buf *kbuf); #endif /* CONFIG_KEXEC_FILE */ struct kimage { diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c index efd2c094af7e..0c2df7f73792 100644 --- a/kernel/kexec_file.c +++ b/kernel/kexec_file.c @@ -450,6 +450,23 @@ int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf, } /** + * kexec_locate_mem_hole - find free memory for the purgatory or the next kernel + * @kbuf: Parameters for the memory search. + * + * On success, kbuf->mem will have the start address of the memory region found. + * + * Return: 0 on success, negative errno on error. + */ +int kexec_locate_mem_hole(struct kexec_buf *kbuf) +{ + int ret; + + ret = arch_kexec_walk_mem(kbuf, locate_mem_hole_callback); + + return ret == 1 ? 0 : -EADDRNOTAVAIL; +} + +/** * kexec_add_buffer - place a buffer in a kexec segment * @kbuf: Buffer contents and memory parameters. * @@ -489,11 +506,9 @@ int kexec_add_buffer(struct kexec_buf *kbuf) kbuf->buf_align = max(kbuf->buf_align, PAGE_SIZE); /* Walk the RAM ranges and allocate a suitable range for the buffer */ - ret = arch_kexec_walk_mem(kbuf, locate_mem_hole_callback); - if (ret != 1) { - /* A suitable memory range could not be found for buffer */ - return -EADDRNOTAVAIL; - } + ret = kexec_locate_mem_hole(kbuf); + if (ret) + return ret; /* Found a suitable memory range */ ksegment = >image->segment[kbuf->image->nr_segments]; -- 2.7.4 ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
[PATCH v9 05/10] powerpc: Factor out relocation code in module_64.c
The kexec_file_load system call needs to relocate the purgatory, so factor out the module relocation code so that it can be shared. This patch's purpose is to move the ELF relocation logic from apply_relocate_add to the new function elf64_apply_relocate_add_item with as few changes as possible. The following changes were needed: elf64_apply_relocate_add_item takes a my_r2 argument because the kexec code can't use the my_r2 function since it doesn't have a struct module to pass to it. For the same reason, it also takes an obj_name argument to use in error messages. It still takes a pointer to struct module argument, but kexec code can just pass NULL because except for the TOC symbol, the purgatory doesn't have undefined symbols so the module pointer isn't used. Apart from what is described in the paragraph above, the code has no functional changes. Suggested-by: Michael Ellerman <m...@ellerman.id.au> Signed-off-by: Thiago Jung Bauermann <bauer...@linux.vnet.ibm.com> --- arch/powerpc/kernel/module_64.c | 344 +--- 1 file changed, 182 insertions(+), 162 deletions(-) diff --git a/arch/powerpc/kernel/module_64.c b/arch/powerpc/kernel/module_64.c index 183368e008cf..61baad036639 100644 --- a/arch/powerpc/kernel/module_64.c +++ b/arch/powerpc/kernel/module_64.c @@ -507,6 +507,181 @@ static int restore_r2(u32 *instruction, struct module *me) return 1; } +static int elf64_apply_relocate_add_item(const Elf64_Shdr *sechdrs, +const char *strtab, +const Elf64_Rela *rela, +const Elf64_Sym *sym, +unsigned long *location, +unsigned long value, +unsigned long my_r2, +const char *obj_name, +struct module *me) +{ + switch (ELF64_R_TYPE(rela->r_info)) { + case R_PPC64_ADDR32: + /* Simply set it */ + *(u32 *)location = value; + break; + + case R_PPC64_ADDR64: + /* Simply set it */ + *(unsigned long *)location = value; + break; + + case R_PPC64_TOC: + *(unsigned long *)location = my_r2; + break; + + case R_PPC64_TOC16: + /* Subtract TOC pointer */ + value -= my_r2; + if (value + 0x8000 > 0x) { + pr_err("%s: bad TOC16 relocation (0x%lx)\n", + obj_name, value); + return -ENOEXEC; + } + *((uint16_t *) location) + = (*((uint16_t *) location) & ~0x) + | (value & 0x); + break; + + case R_PPC64_TOC16_LO: + /* Subtract TOC pointer */ + value -= my_r2; + *((uint16_t *) location) + = (*((uint16_t *) location) & ~0x) + | (value & 0x); + break; + + case R_PPC64_TOC16_DS: + /* Subtract TOC pointer */ + value -= my_r2; + if ((value & 3) != 0 || value + 0x8000 > 0x) { + pr_err("%s: bad TOC16_DS relocation (0x%lx)\n", + obj_name, value); + return -ENOEXEC; + } + *((uint16_t *) location) + = (*((uint16_t *) location) & ~0xfffc) + | (value & 0xfffc); + break; + + case R_PPC64_TOC16_LO_DS: + /* Subtract TOC pointer */ + value -= my_r2; + if ((value & 3) != 0) { + pr_err("%s: bad TOC16_LO_DS relocation (0x%lx)\n", + obj_name, value); + return -ENOEXEC; + } + *((uint16_t *) location) + = (*((uint16_t *) location) & ~0xfffc) + | (value & 0xfffc); + break; + + case R_PPC64_TOC16_HA: + /* Subtract TOC pointer */ + value -= my_r2; + value = ((value + 0x8000) >> 16); + *((uint16_t *) location) + = (*((uint16_t *) location) & ~0x) + | (value & 0x); + break; + + case R_PPC_REL24: + /* FIXME: Handle weak symbols here --RR */ + if (sym->st_shndx == SHN_UNDEF) { + /* External: go via stub */ + value = stub_for_addr(sechdrs, value, me); + if (!value) +
[PATCH v9 01/10] kexec_file: Allow arch-specific memory walking for kexec_add_buffer
Allow architectures to specify a different memory walking function for kexec_add_buffer. x86 uses iomem to track reserved memory ranges, but PowerPC uses the memblock subsystem. Signed-off-by: Thiago Jung Bauermann <bauer...@linux.vnet.ibm.com> Acked-by: Dave Young <dyo...@redhat.com> Acked-by: Balbir Singh <bsinghar...@gmail.com> --- include/linux/kexec.h | 29 - kernel/kexec_file.c | 30 ++ kernel/kexec_internal.h | 16 3 files changed, 50 insertions(+), 25 deletions(-) diff --git a/include/linux/kexec.h b/include/linux/kexec.h index 406c33dcae13..5e320ddaaa82 100644 --- a/include/linux/kexec.h +++ b/include/linux/kexec.h @@ -148,7 +148,34 @@ struct kexec_file_ops { kexec_verify_sig_t *verify_sig; #endif }; -#endif + +/** + * struct kexec_buf - parameters for finding a place for a buffer in memory + * @image: kexec image in which memory to search. + * @buffer:Contents which will be copied to the allocated memory. + * @bufsz: Size of @buffer. + * @mem: On return will have address of the buffer in memory. + * @memsz: Size for the buffer in memory. + * @buf_align: Minimum alignment needed. + * @buf_min: The buffer can't be placed below this address. + * @buf_max: The buffer can't be placed above this address. + * @top_down: Allocate from top of memory. + */ +struct kexec_buf { + struct kimage *image; + char *buffer; + unsigned long bufsz; + unsigned long mem; + unsigned long memsz; + unsigned long buf_align; + unsigned long buf_min; + unsigned long buf_max; + bool top_down; +}; + +int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf, + int (*func)(u64, u64, void *)); +#endif /* CONFIG_KEXEC_FILE */ struct kimage { kimage_entry_t head; diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c index 037c321c5618..f865674bff51 100644 --- a/kernel/kexec_file.c +++ b/kernel/kexec_file.c @@ -428,6 +428,27 @@ static int locate_mem_hole_callback(u64 start, u64 end, void *arg) return locate_mem_hole_bottom_up(start, end, kbuf); } +/** + * arch_kexec_walk_mem - call func(data) on free memory regions + * @kbuf: Context info for the search. Also passed to @func. + * @func: Function to call for each memory region. + * + * Return: The memory walk will stop when func returns a non-zero value + * and that value will be returned. If all free regions are visited without + * func returning non-zero, then zero will be returned. + */ +int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf, + int (*func)(u64, u64, void *)) +{ + if (kbuf->image->type == KEXEC_TYPE_CRASH) + return walk_iomem_res_desc(crashk_res.desc, + IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY, + crashk_res.start, crashk_res.end, + kbuf, func); + else + return walk_system_ram_res(0, ULONG_MAX, kbuf, func); +} + /* * Helper function for placing a buffer in a kexec segment. This assumes * that kexec_mutex is held. @@ -474,14 +495,7 @@ int kexec_add_buffer(struct kimage *image, char *buffer, unsigned long bufsz, kbuf->top_down = top_down; /* Walk the RAM ranges and allocate a suitable range for the buffer */ - if (image->type == KEXEC_TYPE_CRASH) - ret = walk_iomem_res_desc(crashk_res.desc, - IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY, - crashk_res.start, crashk_res.end, kbuf, - locate_mem_hole_callback); - else - ret = walk_system_ram_res(0, -1, kbuf, - locate_mem_hole_callback); + ret = arch_kexec_walk_mem(kbuf, locate_mem_hole_callback); if (ret != 1) { /* A suitable memory range could not be found for buffer */ return -EADDRNOTAVAIL; diff --git a/kernel/kexec_internal.h b/kernel/kexec_internal.h index 0a52315d9c62..4cef7e4706b0 100644 --- a/kernel/kexec_internal.h +++ b/kernel/kexec_internal.h @@ -20,22 +20,6 @@ struct kexec_sha_region { unsigned long len; }; -/* - * Keeps track of buffer parameters as provided by caller for requesting - * memory placement of buffer. - */ -struct kexec_buf { - struct kimage *image; - char *buffer; - unsigned long bufsz; - unsigned long mem; - unsigned long memsz; - unsigned long buf_align; - unsigned long buf_min; - unsigned long buf_max; - bool top_down; /* allocate from top of memory hole */ -}; - void kimage_file_post_load_cleanup(struct kimage *image); #else /* CONFIG_KEXEC_FILE */ static inline void kimage_file_post_load_cleanup(struc
[PATCH v9 02/10] kexec_file: Change kexec_add_buffer to take kexec_buf as argument.
This is done to simplify the kexec_add_buffer argument list. Adapt all callers to set up a kexec_buf to pass to kexec_add_buffer. In addition, change the type of kexec_buf.buffer from char * to void *. There is no particular reason for it to be a char *, and the change allows us to get rid of 3 existing casts to char * in the code. Signed-off-by: Thiago Jung Bauermann <bauer...@linux.vnet.ibm.com> Acked-by: Dave Young <dyo...@redhat.com> Acked-by: Balbir Singh <bsinghar...@gmail.com> --- arch/x86/kernel/crash.c | 37 arch/x86/kernel/kexec-bzimage64.c | 48 +++-- include/linux/kexec.h | 8 +--- kernel/kexec_file.c | 88 ++- 4 files changed, 87 insertions(+), 94 deletions(-) diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c index 650830e39e3a..3741461c63a0 100644 --- a/arch/x86/kernel/crash.c +++ b/arch/x86/kernel/crash.c @@ -631,9 +631,9 @@ static int determine_backup_region(u64 start, u64 end, void *arg) int crash_load_segments(struct kimage *image) { - unsigned long src_start, src_sz, elf_sz; - void *elf_addr; int ret; + struct kexec_buf kbuf = { .image = image, .buf_min = 0, + .buf_max = ULONG_MAX, .top_down = false }; /* * Determine and load a segment for backup area. First 640K RAM @@ -647,43 +647,44 @@ int crash_load_segments(struct kimage *image) if (ret < 0) return ret; - src_start = image->arch.backup_src_start; - src_sz = image->arch.backup_src_sz; - /* Add backup segment. */ - if (src_sz) { + if (image->arch.backup_src_sz) { + kbuf.buffer = _zero_bytes; + kbuf.bufsz = sizeof(crash_zero_bytes); + kbuf.memsz = image->arch.backup_src_sz; + kbuf.buf_align = PAGE_SIZE; /* * Ideally there is no source for backup segment. This is * copied in purgatory after crash. Just add a zero filled * segment for now to make sure checksum logic works fine. */ - ret = kexec_add_buffer(image, (char *)_zero_bytes, - sizeof(crash_zero_bytes), src_sz, - PAGE_SIZE, 0, -1, 0, - >arch.backup_load_addr); + ret = kexec_add_buffer(); if (ret) return ret; + image->arch.backup_load_addr = kbuf.mem; pr_debug("Loaded backup region at 0x%lx backup_start=0x%lx memsz=0x%lx\n", -image->arch.backup_load_addr, src_start, src_sz); +image->arch.backup_load_addr, +image->arch.backup_src_start, kbuf.memsz); } /* Prepare elf headers and add a segment */ - ret = prepare_elf_headers(image, _addr, _sz); + ret = prepare_elf_headers(image, , ); if (ret) return ret; - image->arch.elf_headers = elf_addr; - image->arch.elf_headers_sz = elf_sz; + image->arch.elf_headers = kbuf.buffer; + image->arch.elf_headers_sz = kbuf.bufsz; - ret = kexec_add_buffer(image, (char *)elf_addr, elf_sz, elf_sz, - ELF_CORE_HEADER_ALIGN, 0, -1, 0, - >arch.elf_load_addr); + kbuf.memsz = kbuf.bufsz; + kbuf.buf_align = ELF_CORE_HEADER_ALIGN; + ret = kexec_add_buffer(); if (ret) { vfree((void *)image->arch.elf_headers); return ret; } + image->arch.elf_load_addr = kbuf.mem; pr_debug("Loaded ELF headers at 0x%lx bufsz=0x%lx memsz=0x%lx\n", -image->arch.elf_load_addr, elf_sz, elf_sz); +image->arch.elf_load_addr, kbuf.bufsz, kbuf.bufsz); return ret; } diff --git a/arch/x86/kernel/kexec-bzimage64.c b/arch/x86/kernel/kexec-bzimage64.c index 3407b148c240..d0a814a9d96a 100644 --- a/arch/x86/kernel/kexec-bzimage64.c +++ b/arch/x86/kernel/kexec-bzimage64.c @@ -331,17 +331,17 @@ static void *bzImage64_load(struct kimage *image, char *kernel, struct setup_header *header; int setup_sects, kern16_size, ret = 0; - unsigned long setup_header_size, params_cmdline_sz, params_misc_sz; + unsigned long setup_header_size, params_cmdline_sz; struct boot_params *params; unsigned long bootparam_load_addr, kernel_load_addr, initrd_load_addr; unsigned long purgatory_load_addr; - unsigned long kernel_bufsz, kernel_memsz, kernel_align; - char *kernel_buf; struct bzimage64_data *ldata; struct kexec_entry64_regs regs64; void *stack; unsigned int setup_hdr_offset = offsetof(struct boot_param
[PATCH v9 00/10] kexec_file_load implementation for PowerPC
.8-rc1 + the extend kexec_file_load series. - Patch "powerpc: Adapt elf64_apply_relocate_add for kexec_file_load." - New patch. These changes were previously in patch 10. The code itself is unchanged from v4. - Patch "powerpc: Implement kexec_file_load." - Moved arch_kexec_walk_mem, arch_kexec_apply_relocations_add and setup_purgatory from patch 10 to this patch. - arch_kexec_apply_relocations_add is unchanged from v4. - Fixed off-by-one error in arch_kexec_walk_mem when passing range to func. - Moved setup_purgatory from kexec_elf_64.c to machine_kexec_64.c, and changed it to receive a pointer to the slave code directly rather than a struct elf_info and getting the pointer from there. - Patch "powerpc: Add code to work with device trees in kexec_file_load." - New patch. These changes were previously in patch 10. - find_debug_console moved from kexec_elf_64.c to machine_kexec_64.c. The code is unchanged from v4. - setup_new_fdt is a new function factored out of elf64_load. The only code change from v4 is to create /chosen if it doesn't exist yet. - Patch "powerpc: Add support for loading ELF kernels with kexec_file_load." - This patch was too big, so moved some of its changes to other patches to facilitate review. - Allow loading ELF file type ET_DYN, which is what the BE kernel uses. - The code adapting the device tree for booting the new kernel was moved out of elf64_load to setup_new_fdt. - Patch "powerpc: Allow userspace to set device tree properties in kexec_file_load" - New patch. - The code in this patch didn't exist in v4. - This is the only patch that depends on the extend kexec_file_load series. - Patch "powerpc: Enable CONFIG_KEXEC_FILE in powerpc server defconfigs." - New patch. Thiago Jung Bauermann (10): kexec_file: Allow arch-specific memory walking for kexec_add_buffer kexec_file: Change kexec_add_buffer to take kexec_buf as argument. kexec_file: Factor out kexec_locate_mem_hole from kexec_add_buffer. powerpc: Change places using CONFIG_KEXEC to use CONFIG_KEXEC_CORE instead. powerpc: Factor out relocation code in module_64.c powerpc: Implement kexec_file_load. powerpc: Add functions to read ELF files of any endianness. powerpc: Add support for loading ELF kernels with kexec_file_load. powerpc: Add purgatory for kexec_file_load implementation. powerpc: Enable CONFIG_KEXEC_FILE in powerpc server defconfigs. arch/powerpc/Kconfig | 15 +- arch/powerpc/Makefile | 1 + arch/powerpc/boot/string.S| 67 +-- arch/powerpc/configs/powernv_defconfig| 2 + arch/powerpc/configs/ppc64_defconfig | 2 + arch/powerpc/configs/pseries_defconfig| 2 + arch/powerpc/include/asm/debug.h | 2 +- arch/powerpc/include/asm/elf_util.h | 64 +++ arch/powerpc/include/asm/kexec.h | 18 +- arch/powerpc/include/asm/machdep.h| 4 +- arch/powerpc/include/asm/smp.h| 2 +- arch/powerpc/include/asm/systbl.h | 1 + arch/powerpc/include/asm/unistd.h | 2 +- arch/powerpc/include/uapi/asm/unistd.h| 1 + arch/powerpc/kernel/Makefile | 6 +- arch/powerpc/kernel/elf_util.c| 464 + arch/powerpc/kernel/head_64.S | 2 +- arch/powerpc/kernel/kexec_elf_64.c| 280 + arch/powerpc/kernel/machine_kexec_file_64.c | 579 ++ arch/powerpc/kernel/misc_32.S | 2 +- arch/powerpc/kernel/misc_64.S | 6 +- arch/powerpc/kernel/module_64.c | 383 ++--- arch/powerpc/kernel/prom.c| 2 +- arch/powerpc/kernel/setup_64.c| 4 +- arch/powerpc/kernel/smp.c | 6 +- arch/powerpc/kernel/traps.c | 2 +- arch/powerpc/platforms/85xx/corenet_generic.c | 2 +- arch/powerpc/platforms/85xx/smp.c | 8 +- arch/powerpc/platforms/cell/spu_base.c| 2 +- arch/powerpc/platforms/powernv/setup.c| 6 +- arch/powerpc/platforms/ps3/setup.c| 4 +- arch/powerpc/platforms/pseries/Makefile | 2 +- arch/powerpc/platforms/pseries/setup.c| 4 +- arch/powerpc/purgatory/.gitignore | 2 + arch/powerpc/purgatory/Makefile | 33 ++ arch/powerpc/purgatory/console-ppc64.c| 37 ++ arch/powerpc/purgatory/crtsavres.S| 5 + arch/powerpc/purgatory/hvCall.S | 27 ++ arch/powerpc/purgatory/hvCall.h | 8 + arch/powerpc/purgatory/kexec-sha256.h | 11 + arch/powerpc/purgatory/ppc64_asm.h| 20 + arch/powerpc/purgatory/printf.c | 164 arch/powerpc/purgatory/purgatory-ppc64.c | 36 ++ arch/power
Re: [PATHC v2 0/9] ima: carry the measurement list across kexec
Am Donnerstag, 29 September 2016, 16:43:08 schrieb Eric W. Biederman: > Thiago Jung Bauermann <bauer...@linux.vnet.ibm.com> writes: > > Hello Eric, > > > > Am Dienstag, 20 September 2016, 11:07:29 schrieb Eric W. Biederman: > >> A semi-generic concept called a hand-over buffer seems to be a > >> construction of infrustructure for no actual reason that will just > >> result in confusion. There are lots of things that are handed over, > >> the > >> flattend device tree, ramdisks, bootparams on x86, etc, etc. ima is > >> not > >> special in this execpt for being perhaps the first addition that we are > >> going to want the option of including on most architectures. > > > > Ok, I understand. I decided to implement a generic concept because I > > thought that proposing a feature that is more useful than what I need > > it for would increase its chance of being accepted. It's interesting to > > see that it had the opposite effect. > > Yes. In this case it was not clear that anyone else could use it, and > being less generic you can tweak the needs of the code to ima without > anyone having to worry about it. > > So thank you very much for making the code more specific to the > circumstances. Thank you very much for your feedback and your reviews! -- []'s Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATHC v2 0/9] ima: carry the measurement list across kexec
Hello Eric, Am Dienstag, 20 September 2016, 11:07:29 schrieb Eric W. Biederman: > Thiago Jung Bauermann <bauer...@linux.vnet.ibm.com> writes: > > Am Samstag, 17 September 2016, 00:17:37 schrieb Eric W. Biederman: > >> Thiago Jung Bauermann <bauer...@linux.vnet.ibm.com> writes: > > Is this what you had in mind? > > Sort of. > > I was just thinking that instead of having the boot path verify your ima > list matches what is in the tpm and halting the boot there, we could > test that on reboot. Which would give a clean failure without the nasty > poking into a prepared image. The downside is that we have already run > the shutdown scripts so it wouldn't be much cleaner, than triggering a > machine reboot from elsewhere. > > But I don't think we should spend too much time on that. It was a > passing thought. We should focus on getting a non-poked ima buffer > cleanly into kexec and we can worry about the rest later. I was thinking of this as something orthogonal to the ima buffer feature. But you're right, it's better not to discuss this now. I'll post a separate patch for this later. > >> So from 10,000 feet I think that is correct. > >> > >> I am not quite certain why a new mechanism is being invented. We have > >> other information that is already passed (much of it architecture > >> specific) like the flattened device tree. If you remove the need to > >> update the information can you just append this information to the > >> flattened device tree without a new special mechanism to pass the data? > >> > >> I am just reluctant to invent a new mechanism when there is an existing > >> mechanism that looks like it should work without problems. > > > > Michael Ellerman suggested putting the buffer contents inside the device > > tree itself, but the s390 people are also planning to implement this > > feature. That architecture doesn't use device trees, so a solution that > > depends on DTs won't help them. > > > > With this mechanism each architecture will still need its own way of > > communicating to the next kernel where the buffer is, but I think it's > > easier to pass a base address and length than to pass a whole buffer. > > A base address and length pair is fine. There are several other pieces > of data that we pass that way. > > > I suppose we could piggyback the ima measurements buffer at the end of > > one of the other segments such as the kernel or, in the case of > > powerpc, the dtb but it looks hackish to me. I think it's cleaner to > > put it in its own segment. > > The boot protocol unfortunately is different on different architectures, > and for each one we will have to implement and document the change. > Because when you get into boot protocol issues you can't assume the > kernel you are booting is the same version as the kernel that is booting > it. > > Where I run into a problem is you added a semi-generic concept a > hand-over buffer. Not a ima data buffer but a hand-over buffer. > > The data falling in it's own dedicated area of memory and being added > with kexec_add_buffer is completely fine. I can see a dedicated pointer > in struct kimage if necessary. > > A semi-generic concept called a hand-over buffer seems to be a > construction of infrustructure for no actual reason that will just > result in confusion. There are lots of things that are handed over, the > flattend device tree, ramdisks, bootparams on x86, etc, etc. ima is not > special in this execpt for being perhaps the first addition that we are > going to want the option of including on most architectures. Ok, I understand. I decided to implement a generic concept because I thought that proposing a feature that is more useful than what I need it for would increase its chance of being accepted. It's interesting to see that it had the opposite effect. I reworked and simplified the code and folded the hand-over buffer patches into Mimi's patch series to carry the measurement list across kexec. The kexec buffer code is in the following patches now: [PATCH v5 01/10] powerpc: ima: Get the kexec buffer passed by the previous kernel [PATCH v5 05/10] powerpc: ima: Send the kexec buffer to the next kernel Each patch has a changelog listing what I changed to make it specific to IMA. -- []'s Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATHC v2 0/9] ima: carry the measurement list across kexec
Am Samstag, 17 September 2016, 00:17:37 schrieb Eric W. Biederman: > Thiago Jung Bauermann <bauer...@linux.vnet.ibm.com> writes: > > Hello Eric, > > > > Am Freitag, 16 September 2016, 14:47:13 schrieb Eric W. Biederman: > >> I can see tracking to see if the list has changed at some > >> point and causing a reboot(LINUX_REBOOT_CMD_KEXEC) to fail. > > > > Yes, that is an interesting feature that I can add using the checksum- > > verifying part of my code. I can submit a patch for that if there's > > interest, adding a reboot notifier that verifies the checksum and causes > > a regular reboot instead of a kexec reboot if the checksum fails. > > I was thinking an early failure instead of getting all of the way down > into a kernel an discovering the tpm/ima subsystem would not > initialized. But where that falls in the reboot pathway I don't expect > there is much value in it. I'm not sure I understand. What I described doesn't involve the tpm or ima. I'm suggesting that if I take the parts of patch 4/5 in the kexec hand-over buffer series that verify the image checksum, I can submit a separate patch that checks the integrity of the kexec image early in kernel_kexec() and reverts to a regular reboot if the check fails. This would be orthogonal to ima carrying its measurement list across kexec. I think there is value in that, because if the kexec image is corrupted the machine will just get stuck in the purgatory and (unless it's a platform where the purgatory can print to the console) without even an error message explaining what is going on. Whereas if we notice the corruption before jumping into the purgatory we can switch to a regular reboot and the machine will boot successfully. To have an early failure, when would the checksum verification be done? What I can think of is to have kexec_file_load accept a new flag KEXEC_FILE_VERIFY_IMAGE, which userspace could use to request an integrity check when it's about to start the reboot procedure. Then it can decide to either reload the kernel or use a regular reboot if the image is corrupted. Is this what you had in mind? > >> At least the common bootloader cases that I know of using kexec are > >> very > >> minimal distributions that live in a ramdisk and as such it should be > >> very straight forward to measure what is needed at or before > >> sys_kexec_load. But that was completely dismissed as unrealistic so I > >> don't have a clue what actual problem you are trying to solve. > > > > We are interested in solving the problem in a general way because it > > will be useful to us in the future for the case of an arbitrary number > > of kexecs (and thus not only a bootloader but also multiple full-blown > > distros may be involved in the chain). > > > > But you are right that for the use case for which we currently need this > > feature it's feasible to measure everything upfront. We can cross the > > other bridge when we get there. > > Then let's start there. Passing the measurment list is something that > should not be controversial. Great! > >> If there is anyway we can start small and not with this big scary > >> infrastructure change I would very much prefer it. > > > > Sounds good. If we pre-measure everything then the following patches > > from my buffer hand-over series are enough: > > > > [PATCH v5 2/5] kexec_file: Add buffer hand-over support for the next > > kernel [PATCH v5 3/5] powerpc: kexec_file: Add buffer hand-over support > > for the next kernel > > > > Would you consider including those two? > > > > And like I mentioned in the cover letter, patch 1/5 is an interesting > > improvement that is worth considering. > > So from 10,000 feet I think that is correct. > > I am not quite certain why a new mechanism is being invented. We have > other information that is already passed (much of it architecture > specific) like the flattened device tree. If you remove the need to > update the information can you just append this information to the > flattened device tree without a new special mechanism to pass the data? > > I am just reluctant to invent a new mechanism when there is an existing > mechanism that looks like it should work without problems. Michael Ellerman suggested putting the buffer contents inside the device tree itself, but the s390 people are also planning to implement this feature. That architecture doesn't use device trees, so a solution that depends on DTs won't help them. With this mechanism each architecture will still need its own way of communicating to the next kernel where the buffer is, but I think it's easier to pass a base address and
Re: [PATHC v2 0/9] ima: carry the measurement list across kexec
s an interesting feature that I can add using the checksum- verifying part of my code. I can submit a patch for that if there's interest, adding a reboot notifier that verifies the checksum and causes a regular reboot instead of a kexec reboot if the checksum fails. > At least the common bootloader cases that I know of using kexec are very > minimal distributions that live in a ramdisk and as such it should be > very straight forward to measure what is needed at or before > sys_kexec_load. But that was completely dismissed as unrealistic so I > don't have a clue what actual problem you are trying to solve. We are interested in solving the problem in a general way because it will be useful to us in the future for the case of an arbitrary number of kexecs (and thus not only a bootloader but also multiple full-blown distros may be involved in the chain). But you are right that for the use case for which we currently need this feature it's feasible to measure everything upfront. We can cross the other bridge when we get there. > If there is anyway we can start small and not with this big scary > infrastructure change I would very much prefer it. Sounds good. If we pre-measure everything then the following patches from my buffer hand-over series are enough: [PATCH v5 2/5] kexec_file: Add buffer hand-over support for the next kernel [PATCH v5 3/5] powerpc: kexec_file: Add buffer hand-over support for the next kernel Would you consider including those two? And like I mentioned in the cover letter, patch 1/5 is an interesting improvement that is worth considering. -- []'s Thiago Jung Bauermann IBM Linux Technology Center ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec