[PATCH v6 2/4] resource: Use list_head to link sibling resource
The struct resource uses singly linked list to link siblings, implemented by pointer operation. Replace it with list_head for better code readability. Based on this list_head replacement, it will be very easy to do reverse iteration on iomem_resource's sibling list in later patch. Besides, type of member variables of struct resource, sibling and child, are changed from 'struct resource *' to 'struct list_head'. This brings two pointers of size increase. Suggested-by: Andrew Morton Signed-off-by: Baoquan He Cc: Patrik Jakobsson Cc: David Airlie Cc: "K. Y. Srinivasan" Cc: Haiyang Zhang Cc: Stephen Hemminger Cc: Dmitry Torokhov Cc: Dan Williams Cc: Rob Herring Cc: Frank Rowand Cc: Keith Busch Cc: Jonathan Derrick Cc: Lorenzo Pieralisi Cc: Bjorn Helgaas Cc: Thomas Gleixner Cc: Brijesh Singh Cc: "Jérôme Glisse" Cc: Borislav Petkov Cc: Tom Lendacky Cc: Greg Kroah-Hartman Cc: Yaowei Bai Cc: Wei Yang Cc: de...@linuxdriverproject.org Cc: linux-in...@vger.kernel.org Cc: linux-nvd...@lists.01.org Cc: devicet...@vger.kernel.org Cc: linux-...@vger.kernel.org --- arch/arm/plat-samsung/pm-check.c| 6 +- arch/microblaze/pci/pci-common.c| 4 +- arch/powerpc/kernel/pci-common.c| 4 +- arch/sparc/kernel/ioport.c | 2 +- arch/xtensa/include/asm/pci-bridge.h| 4 +- drivers/eisa/eisa-bus.c | 2 + drivers/gpu/drm/drm_memory.c| 3 +- drivers/gpu/drm/gma500/gtt.c| 5 +- drivers/hv/vmbus_drv.c | 52 +++ drivers/input/joystick/iforce/iforce-main.c | 4 +- drivers/nvdimm/namespace_devs.c | 6 +- drivers/nvdimm/nd.h | 5 +- drivers/of/address.c| 4 +- drivers/parisc/lba_pci.c| 4 +- drivers/pci/controller/vmd.c| 8 +- drivers/pci/probe.c | 2 + drivers/pci/setup-bus.c | 2 +- include/linux/ioport.h | 17 ++- kernel/resource.c | 208 ++-- 19 files changed, 175 insertions(+), 167 deletions(-) diff --git a/arch/arm/plat-samsung/pm-check.c b/arch/arm/plat-samsung/pm-check.c index cd2c02c68bc3..5494355b1c49 100644 --- a/arch/arm/plat-samsung/pm-check.c +++ b/arch/arm/plat-samsung/pm-check.c @@ -46,8 +46,8 @@ typedef u32 *(run_fn_t)(struct resource *ptr, u32 *arg); static void s3c_pm_run_res(struct resource *ptr, run_fn_t fn, u32 *arg) { while (ptr != NULL) { - if (ptr->child != NULL) - s3c_pm_run_res(ptr->child, fn, arg); + if (!list_empty(>child)) + s3c_pm_run_res(resource_first_child(>child), fn, arg); if ((ptr->flags & IORESOURCE_SYSTEM_RAM) == IORESOURCE_SYSTEM_RAM) { @@ -57,7 +57,7 @@ static void s3c_pm_run_res(struct resource *ptr, run_fn_t fn, u32 *arg) arg = (fn)(ptr, arg); } - ptr = ptr->sibling; + ptr = resource_sibling(ptr); } } diff --git a/arch/microblaze/pci/pci-common.c b/arch/microblaze/pci/pci-common.c index 7899bafab064..2bf73e27e231 100644 --- a/arch/microblaze/pci/pci-common.c +++ b/arch/microblaze/pci/pci-common.c @@ -533,7 +533,9 @@ void pci_process_bridge_OF_ranges(struct pci_controller *hose, res->flags = range.flags; res->start = range.cpu_addr; res->end = range.cpu_addr + range.size - 1; - res->parent = res->child = res->sibling = NULL; + res->parent = NULL; + INIT_LIST_HEAD(>child); + INIT_LIST_HEAD(>sibling); } } diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c index 926035bb378d..28fbe83c9daf 100644 --- a/arch/powerpc/kernel/pci-common.c +++ b/arch/powerpc/kernel/pci-common.c @@ -761,7 +761,9 @@ void pci_process_bridge_OF_ranges(struct pci_controller *hose, res->flags = range.flags; res->start = range.cpu_addr; res->end = range.cpu_addr + range.size - 1; - res->parent = res->child = res->sibling = NULL; + res->parent = NULL; + INIT_LIST_HEAD(>child); + INIT_LIST_HEAD(>sibling); } } } diff --git a/arch/sparc/kernel/ioport.c b/arch/sparc/kernel/ioport.c index cca9134cfa7d..99efe4e98b16 100644 --- a/arch/sparc/kernel/ioport.c +++ b/arch/sparc/kernel/ioport.c @@ -669,7 +669,7 @@ static int sparc_io_proc_show(struct seq_file *m, void *v) struct resource *root = m->private, *r; const char *nm; - for (r = root->child; r != NULL; r = r->sibling) { + list_for_each_entry(r,
Re: [PATCH v5 1/4] resource: Move reparent_resources() to kernel/resource.c and make it public
Hi Andy, On 06/12/18 at 05:24pm, Andy Shevchenko wrote: > On Tue, Jun 12, 2018 at 5:20 PM, Andy Shevchenko > wrote: > >> Hmm, I just copied it from arch/powerpc/kernel/pci-common.c. The > >> function interface expects an integer returned value, not sure what a > >> real error codes look like, could you give more hints? Will change > >> accordingly. > > > > I briefly looked at the code and error codes we have, so, my proposal > > is one of the following > > > - use -ECANCELED (not the best choice for first occurrence here, > > though I can't find better) > > Actually -ENOTSUPP might suit the first case (although the actual > would be something like -EOVERLAP, which we don't have) Sorry for late reply, and many thanks for your great suggestion. I am fine to use -ENOTSUPP as the first returned value, and -ECANCELED for the 2nd one. Or define an enum as you suggested inside the function or in header file. Or use -EBUSY for the first case because existing resource is overlapping but not fully contained by 'res'; and -EINVAL for the 2nd case since didn't find any one resources which is contained by 'res', means we passed in a invalid resource. All is fine to me, I can repost with each of them. Thanks Baoquan > > > - use positive integers (or enum), like > > #define RES_REPARENTED 0 > > #define RES_OVERLAPPED 1 > > #define RES_NOCONFLICT 2 > > > > > >>> > + if (firstpp == NULL) > >>> > + firstpp = pp; > >>> > + } > >>> > >>> > + if (firstpp == NULL) > >>> > + return -1; /* didn't find any conflicting entries? > >>> > */ > >>> > >>> Ditto. > > > > Ditto. > > > >>> > >>> > +} > >>> > +EXPORT_SYMBOL(reparent_resources); > > > > -- > > With Best Regards, > > Andy Shevchenko > > > > -- > With Best Regards, > Andy Shevchenko ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
[PATCH v6 0/4] resource: Use list_head to link sibling resource
This patchset is doing: 1) Replace struct resource's sibling list from singly linked list to list_head. Clearing out those pointer operation within singly linked list for better code readability. 2) Based on list_head replacement, add a new function walk_system_ram_res_rev() which can does reversed iteration on iomem_resource's siblings. 3) Change kexec_file loading to search system RAM top down for kernel loadin, using walk_system_ram_res_rev(). Note: This patchset only passed testing on x86_64 arch with network enabling. The thing we need pay attetion to is that a root resource's child member need be initialized specifically with LIST_HEAD_INIT() if statically defined or INIT_LIST_HEAD() for dynamically definition. Here Just like we do for iomem_resource/ioport_resource, or the change in get_pci_domain_busn_res(). v5: http://lkml.kernel.org/r/20180612032831.29747-1-...@redhat.com v4: http://lkml.kernel.org/r/20180507063224.24229-1-...@redhat.com v3: http://lkml.kernel.org/r/20180419001848.3041-1-...@redhat.com v2: http://lkml.kernel.org/r/20180408024724.16812-1-...@redhat.com v1: http://lkml.kernel.org/r/20180322033722.9279-1-...@redhat.com Changelog: v5->v6: Fix code style problems in reparent_resources() and use existing error codes, according to Andy's suggestion. Fix bugs test robot reported. v4->v5: Add new patch 0001 to move duplicated reparent_resources() to kernel/resource.c to make it be shared by different ARCH-es. Fix several code bugs reported by test robot on ARCH powerpc and microblaze. v3->v4: Fix several bugs test robot reported. Rewrite cover letter and patch log according to reviewer's comment. v2->v3: Rename resource functions first_child() and sibling() to resource_first_chils() and resource_sibling(). Dan suggested this. Move resource_first_chils() and resource_sibling() to linux/ioport.h and make them as inline function. Rob suggested this. Accordingly add linux/list.h including in linux/ioport.h, please help review if this bring efficiency degradation or code redundancy. The change on struct resource {} bring two pointers of size increase, mention this in git log to make it more specifically, Rob suggested this. v1->v2: Use list_head instead to link resource siblings. This is suggested by Andrew. Rewrite walk_system_ram_res_rev() after list_head is taken to link resouce siblings. Baoquan He (4): resource: Move reparent_resources() to kernel/resource.c and make it public resource: Use list_head to link sibling resource resource: add walk_system_ram_res_rev() kexec_file: Load kernel at top of system RAM if required arch/arm/plat-samsung/pm-check.c| 6 +- arch/microblaze/pci/pci-common.c| 41 + arch/powerpc/kernel/pci-common.c| 39 + arch/sparc/kernel/ioport.c | 2 +- arch/xtensa/include/asm/pci-bridge.h| 4 +- drivers/eisa/eisa-bus.c | 2 + drivers/gpu/drm/drm_memory.c| 3 +- drivers/gpu/drm/gma500/gtt.c| 5 +- drivers/hv/vmbus_drv.c | 52 +++--- drivers/input/joystick/iforce/iforce-main.c | 4 +- drivers/nvdimm/namespace_devs.c | 6 +- drivers/nvdimm/nd.h | 5 +- drivers/of/address.c| 4 +- drivers/parisc/lba_pci.c| 4 +- drivers/pci/controller/vmd.c| 8 +- drivers/pci/probe.c | 2 + drivers/pci/setup-bus.c | 2 +- include/linux/ioport.h | 21 ++- kernel/kexec_file.c | 2 + kernel/resource.c | 263 ++-- 20 files changed, 248 insertions(+), 227 deletions(-) -- 2.13.6 ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
[PATCH v6 3/4] resource: add walk_system_ram_res_rev()
This function, being a variant of walk_system_ram_res() introduced in commit 8c86e70acead ("resource: provide new functions to walk through resources"), walks through a list of all the resources of System RAM in reversed order, i.e., from higher to lower. It will be used in kexec_file code. Signed-off-by: Baoquan He Cc: Andrew Morton Cc: Thomas Gleixner Cc: Brijesh Singh Cc: "Jérôme Glisse" Cc: Borislav Petkov Cc: Tom Lendacky Cc: Wei Yang --- include/linux/ioport.h | 3 +++ kernel/resource.c | 40 2 files changed, 43 insertions(+) diff --git a/include/linux/ioport.h b/include/linux/ioport.h index b7456ae889dd..066cc263e2cc 100644 --- a/include/linux/ioport.h +++ b/include/linux/ioport.h @@ -279,6 +279,9 @@ extern int walk_system_ram_res(u64 start, u64 end, void *arg, int (*func)(struct resource *, void *)); extern int +walk_system_ram_res_rev(u64 start, u64 end, void *arg, + int (*func)(struct resource *, void *)); +extern int walk_iomem_res_desc(unsigned long desc, unsigned long flags, u64 start, u64 end, void *arg, int (*func)(struct resource *, void *)); diff --git a/kernel/resource.c b/kernel/resource.c index 6d647a3824b1..4c5fbef4ea24 100644 --- a/kernel/resource.c +++ b/kernel/resource.c @@ -23,6 +23,8 @@ #include #include #include +#include +#include #include @@ -443,6 +445,44 @@ int walk_system_ram_res(u64 start, u64 end, void *arg, } /* + * This function, being a variant of walk_system_ram_res(), calls the @func + * callback against all memory ranges of type System RAM which are marked as + * IORESOURCE_SYSTEM_RAM and IORESOUCE_BUSY in reversed order, i.e., from + * higher to lower. + */ +int walk_system_ram_res_rev(u64 start, u64 end, void *arg, + int (*func)(struct resource *, void *)) +{ + unsigned long flags; + struct resource *res; + int ret = -1; + + flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY; + + read_lock(_lock); + list_for_each_entry_reverse(res, _resource.child, sibling) { + if (start >= end) + break; + if ((res->flags & flags) != flags) + continue; + if (res->desc != IORES_DESC_NONE) + continue; + if (res->end < start) + break; + + if ((res->end >= start) && (res->start < end)) { + ret = (*func)(res, arg); + if (ret) + break; + } + end = res->start - 1; + + } + read_unlock(_lock); + return ret; +} + +/* * This function calls the @func callback against all memory ranges, which * are ranges marked as IORESOURCE_MEM and IORESOUCE_BUSY. */ -- 2.13.6 ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
[PATCH v6 1/4] resource: Move reparent_resources() to kernel/resource.c and make it public
reparent_resources() is duplicated in arch/microblaze/pci/pci-common.c and arch/powerpc/kernel/pci-common.c, so move it to kernel/resource.c so that it's shared. Signed-off-by: Baoquan He --- arch/microblaze/pci/pci-common.c | 37 - arch/powerpc/kernel/pci-common.c | 35 --- include/linux/ioport.h | 1 + kernel/resource.c| 39 +++ 4 files changed, 40 insertions(+), 72 deletions(-) diff --git a/arch/microblaze/pci/pci-common.c b/arch/microblaze/pci/pci-common.c index f34346d56095..7899bafab064 100644 --- a/arch/microblaze/pci/pci-common.c +++ b/arch/microblaze/pci/pci-common.c @@ -619,43 +619,6 @@ int pcibios_add_device(struct pci_dev *dev) EXPORT_SYMBOL(pcibios_add_device); /* - * Reparent resource children of pr that conflict with res - * under res, and make res replace those children. - */ -static int __init reparent_resources(struct resource *parent, -struct resource *res) -{ - struct resource *p, **pp; - struct resource **firstpp = NULL; - - for (pp = >child; (p = *pp) != NULL; pp = >sibling) { - if (p->end < res->start) - continue; - if (res->end < p->start) - break; - if (p->start < res->start || p->end > res->end) - return -1; /* not completely contained */ - if (firstpp == NULL) - firstpp = pp; - } - if (firstpp == NULL) - return -1; /* didn't find any conflicting entries? */ - res->parent = parent; - res->child = *firstpp; - res->sibling = *pp; - *firstpp = res; - *pp = NULL; - for (p = res->child; p != NULL; p = p->sibling) { - p->parent = res; - pr_debug("PCI: Reparented %s [%llx..%llx] under %s\n", -p->name, -(unsigned long long)p->start, -(unsigned long long)p->end, res->name); - } - return 0; -} - -/* * Handle resources of PCI devices. If the world were perfect, we could * just allocate all the resource regions and do nothing more. It isn't. * On the other hand, we cannot just re-allocate all devices, as it would diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c index fe9733aa..926035bb378d 100644 --- a/arch/powerpc/kernel/pci-common.c +++ b/arch/powerpc/kernel/pci-common.c @@ -1088,41 +1088,6 @@ resource_size_t pcibios_align_resource(void *data, const struct resource *res, EXPORT_SYMBOL(pcibios_align_resource); /* - * Reparent resource children of pr that conflict with res - * under res, and make res replace those children. - */ -static int reparent_resources(struct resource *parent, -struct resource *res) -{ - struct resource *p, **pp; - struct resource **firstpp = NULL; - - for (pp = >child; (p = *pp) != NULL; pp = >sibling) { - if (p->end < res->start) - continue; - if (res->end < p->start) - break; - if (p->start < res->start || p->end > res->end) - return -1; /* not completely contained */ - if (firstpp == NULL) - firstpp = pp; - } - if (firstpp == NULL) - return -1; /* didn't find any conflicting entries? */ - res->parent = parent; - res->child = *firstpp; - res->sibling = *pp; - *firstpp = res; - *pp = NULL; - for (p = res->child; p != NULL; p = p->sibling) { - p->parent = res; - pr_debug("PCI: Reparented %s %pR under %s\n", -p->name, p, res->name); - } - return 0; -} - -/* * Handle resources of PCI devices. If the world were perfect, we could * just allocate all the resource regions and do nothing more. It isn't. * On the other hand, we cannot just re-allocate all devices, as it would diff --git a/include/linux/ioport.h b/include/linux/ioport.h index da0ebaec25f0..dfdcd0bfe54e 100644 --- a/include/linux/ioport.h +++ b/include/linux/ioport.h @@ -192,6 +192,7 @@ extern int allocate_resource(struct resource *root, struct resource *new, struct resource *lookup_resource(struct resource *root, resource_size_t start); int adjust_resource(struct resource *res, resource_size_t start, resource_size_t size); +int reparent_resources(struct resource *parent, struct resource *res); resource_size_t resource_alignment(struct resource *res); static inline resource_size_t resource_size(const struct resource *res) { diff --git a/kernel/resource.c b/kernel/resource.c index 30e1bc68503b..d1cbf4b50e17 100644 --- a/kernel/resource.c +++ b/kernel/resource.c @@ -983,6 +983,45 @@ int
[PATCH v6 4/4] kexec_file: Load kernel at top of system RAM if required
For kexec_file loading, if kexec_buf.top_down is 'true', the memory which is used to load kernel/initrd/purgatory is supposed to be allocated from top to down. This is what we have been doing all along in the old kexec loading interface and the kexec loading is still default setting in some distributions. However, the current kexec_file loading interface doesn't do likt this. The function arch_kexec_walk_mem() it calls ignores checking kexec_buf.top_down, but calls walk_system_ram_res() directly to go through all resources of System RAM from bottom to up, to try to find memory region which can contain the specific kexec buffer, then call locate_mem_hole_callback() to allocate memory in that found memory region from top to down. This brings confusion especially when KASLR is widely supported , users have to make clear why kexec/kdump kernel loading position is different between these two interfaces in order to exclude unnecessary noises. Hence these two interfaces need be unified on behaviour. Here add checking if kexec_buf.top_down is 'true' in arch_kexec_walk_mem(), if yes, call the newly added walk_system_ram_res_rev() to find memory region from top to down to load kernel. Signed-off-by: Baoquan He Cc: Eric Biederman Cc: Vivek Goyal Cc: Dave Young Cc: Andrew Morton Cc: Yinghai Lu Cc: kexec@lists.infradead.org --- kernel/kexec_file.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c index c6a3b6851372..75226c1d08ce 100644 --- a/kernel/kexec_file.c +++ b/kernel/kexec_file.c @@ -518,6 +518,8 @@ int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf, IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY, crashk_res.start, crashk_res.end, kbuf, func); + else if (kbuf->top_down) + return walk_system_ram_res_rev(0, ULONG_MAX, kbuf, func); else return walk_system_ram_res(0, ULONG_MAX, kbuf, func); } -- 2.13.6 ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v5 1/4] resource: Move reparent_resources() to kernel/resource.c and make it public
On 07/03/18 at 11:57pm, Andy Shevchenko wrote: > On Tue, Jul 3, 2018 at 5:55 PM, Baoquan He wrote: > > On 06/12/18 at 05:24pm, Andy Shevchenko wrote: > >> On Tue, Jun 12, 2018 at 5:20 PM, Andy Shevchenko > >> wrote: > > >> > I briefly looked at the code and error codes we have, so, my proposal > >> > is one of the following > >> > >> > - use -ECANCELED (not the best choice for first occurrence here, > >> > though I can't find better) > >> > >> Actually -ENOTSUPP might suit the first case (although the actual > >> would be something like -EOVERLAP, which we don't have) > > > > Sorry for late reply, and many thanks for your great suggestion. > > > > > I am fine to use -ENOTSUPP as the first returned value, and -ECANCELED > > for the 2nd one. > > I have no strong opinion, but I like (slightly better) this approach ^^^ Done, post v6 in this way, many thanks. > > > Or define an enum as you suggested inside the function > > or in header file. > > > > > Or use -EBUSY for the first case because existing resource is > > overlapping but not fully contained by 'res'; and -EINVAL for > > the 2nd case since didn't find any one resources which is contained by > > 'res', means we passed in a invalid resource. > > > > All is fine to me, I can repost with each of them. > > >> > - use positive integers (or enum), like > >> > #define RES_REPARENTED 0 > >> > #define RES_OVERLAPPED 1 > >> > #define RES_NOCONFLICT 2 > > -- > With Best Regards, > Andy Shevchenko ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v5 4/8] firmware: add call to LSM hook before firmware sysfs fallback
Hi Mimi, I love your patch! Yet something to improve: [auto build test ERROR on integrity/next-integrity] [also build test ERROR on v4.18-rc3 next-20180702] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/Mimi-Zohar/kexec-firmware-support-system-wide-policy-requiring-signatures/20180703-04 base: https://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity.git next-integrity config: x86_64-randconfig-ws0-07031240 (attached as .config) compiler: gcc-7 (Debian 7.3.0-16) 7.3.0 reproduce: # save the attached .config to linux build tree make ARCH=x86_64 All errors (new ones prefixed by >>): >> ERROR: "security_kernel_load_data" >> [drivers/base/firmware_loader/firmware_class.ko] undefined! --- 0-DAY kernel test infrastructureOpen Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation .config.gz Description: application/gzip ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v5 1/4] resource: Move reparent_resources() to kernel/resource.c and make it public
On Tue, Jul 3, 2018 at 5:55 PM, Baoquan He wrote: > On 06/12/18 at 05:24pm, Andy Shevchenko wrote: >> On Tue, Jun 12, 2018 at 5:20 PM, Andy Shevchenko >> wrote: >> > I briefly looked at the code and error codes we have, so, my proposal >> > is one of the following >> >> > - use -ECANCELED (not the best choice for first occurrence here, >> > though I can't find better) >> >> Actually -ENOTSUPP might suit the first case (although the actual >> would be something like -EOVERLAP, which we don't have) > > Sorry for late reply, and many thanks for your great suggestion. > > I am fine to use -ENOTSUPP as the first returned value, and -ECANCELED > for the 2nd one. I have no strong opinion, but I like (slightly better) this approach ^^^ > Or define an enum as you suggested inside the function > or in header file. > > Or use -EBUSY for the first case because existing resource is > overlapping but not fully contained by 'res'; and -EINVAL for > the 2nd case since didn't find any one resources which is contained by > 'res', means we passed in a invalid resource. > > All is fine to me, I can repost with each of them. >> > - use positive integers (or enum), like >> > #define RES_REPARENTED 0 >> > #define RES_OVERLAPPED 1 >> > #define RES_NOCONFLICT 2 -- With Best Regards, Andy Shevchenko ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
makedumpfile-1.6.4: Add support for 5-level paging
Hello, makedumpfile version 1.6.4 is released. Your comments/patches are welcome. I took over the maintainer role from Tachibana-san. Thank you so much for development and maintenance of makedumpfile for a long time! And I'm pleased to work on makedumpfile with everyone here. The maintainer changed to me, but there is no need to change your way of contribution. Main new features: o 5-level paging support - 5-level paging on x86_64 system is supported from this version. o --mem-usage support for arm64 - The makedumpfile --mem-usage option is now supported on arm64 system. And the option was also tested on ppc64 and s390x by Bhupesh Sharma. o Support new kernels - The supported kernel is updated to 4.17.0 in this version. Changelog: o New feature - [PATCH v2 0/3] arm64: Add '--mem-usage' support (Bhupesh Sharma) f49ca13 - [PATCH v2 1/3] Add a new helper file 'tools.c' that provides some useful APIs (Bhupesh Sharma) 4d86cc7 - [PATCH v2 2/3] arm64: Add support to read symbols like _stext from '/proc/kallsyms' (Bhupesh Sharma) d1e7805 - [PATCH v2 3/3] Documentation: Update documentation regarding --mem-usage' option (Bhupesh Sharma) 8449bda - [PATCH 0/4] Add 5-level paging support (Baoquan He) 342fdab - [PATCH 2/4] Add pgtable_l5_enabled to number_table (Baoquan He) 8ae6f36 - [PATCH 3/4] Add a new function check_5level_paging() (Baoquan He) 33ca808 - [PATCH 4/4] arch/x86_64: Add 5-level paging support (Dou Liyang) 30a3214 - [PATCH] handle mem_section as either a pointer or an array (Thadeu Lima de Souza Cascardo) 14876c4 - [PATCH] ppc64: Increase the VA range (Hari Bathini) bd1ae9c o Bugfix - [PATCH] sadump: Fix a problem of PTI enabled kernel (Takao Indoh) 38de12a - [PATCH] Fix a bug when multi-threads feature meets enospace (Zhou Wenjian) 63b05c9 - [PATCH] Fix array index out of bound exception (Lianbo Jiang) e5f96e7 - [PATCH] fix for hugepages filtering (Hari Bathini) 246801c - [PATCH v2] Use integer arithmetics for the progress bar (Petr Tesarik) c6b79cb - [PATCH 2/2] Check PG_swapbacked for swap cache pages (Petr Tesarik) ca1f31e - [PATCH] Fix -g for change of mem_section (Kazuhito Hagio) f3c87e0 - [PATCH] Fix -i for KASLR (Kazuhito Hagio) 08db98c - [PATCH] Fix validate_mem_section() (Kazuhito Hagio) 8f1aafa - [PATCH] arm64: Fix calculation of page_offset in case we are running cases other than mem-usage (Bhupesh Sharma) cb7af5f o Cleanup - [PATCH v2] Always use bigger SECTION_MAP_MASK (Petr Tesarik) 6c6789f - [PATCH 1/2] Add is_cache_page() helper to check if a page belongs to the cache (Petr Tesarik) 694f28c - [PATCH 1/4] arch/x86_64: Cleanup the address translation of the 4-level page tables (Dou Liyang) acab2a2 - [PATCH] Fix some compilation warnings on x86 system (Kazuhito Hagio) 2148e43 - [PATCH] Update maintainers (Masaki Tachibana) bc7420b Explanation of makedumpfile: To shorten the size of the dumpfile and the time of creating the dumpfile, makedumpfile copies only the necessary pages for analysis to the dumpfile from /proc/vmcore. You can specify the kind of unnecessary pages with dump_level. If you want to shorten the size further, enable the compression of the page data. Download: You can download the latest makedumpfile from the following URL. Details of the change are written on the git page of the following site. https://sourceforge.net/projects/makedumpfile/ Method of installation: You can compile the makedumpfile command as follows; 1. "tar -zxvf makedumpfile-x.y.z.tar.gz" 2. "cd makedumpfile-x.y.z" 3. "make; make install" Usage: makedumpfile [-c|-l|-p] [-E] [-d dump_level] [-x vmlinux] dump_mem dump_file Example: If you want to exclude pages filled by zero, cache pages, user pages and free pages and to enable compression, please execute the following command. # makedumpfile -c -d 31 /proc/vmcore dumpfile Thanks, Kazu ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH] arm64: Fix calculation of page_offset in case we are running cases other than mem-usage
On Wed, Jul 4, 2018 at 12:15 AM, Kazuhito Hagio wrote: > On 7/3/2018 1:23 AM, Bhupesh Sharma wrote: >> Hello Kazu-san, >> >> On Sat, Jun 30, 2018 at 3:18 AM, Bhupesh Sharma wrote: >>> Patch f49ca13e5eed5bbdc69e0fd5ef099cb46050cb3d added '--mem-usage' >>> support for arm64 architecture. >>> >>> However, we also need to make sure that the calculation of >>> 'page_offset' is valid in case we are running cases other than >>> '--mem-usage'. >>> >>> This patch does the same. I tested this patch on my qualcomm and apm >>> mustang arm64 boards. >>> >>> Signed-off-by: Bhupesh Sharma >>> --- >>> arch/arm64.c | 6 ++ >>> 1 file changed, 6 insertions(+) >>> >>> diff --git a/arch/arm64.c b/arch/arm64.c >>> index c9dab677f2c9..2fd3e1874376 100644 >>> --- a/arch/arm64.c >>> +++ b/arch/arm64.c >>> @@ -222,6 +222,12 @@ get_stext_symbol(void) >>> int >>> get_machdep_info_arm64(void) >>> { >>> + /* Check if va_bits is still not initialized. If still 0, call >>> +* get_versiondep_info() to initialize the same. >>> +*/ >>> + if (!va_bits) >>> + get_versiondep_info_arm64(); >>> + >>> if (!calculate_plat_config()) { >>> ERRMSG("Can't determine platform config values\n"); >>> return FALSE; >>> -- >>> 2.7.4 >>> >> >> This is an important fix for arm64 makedumpfile as it fixes issues >> reported with Fedora when we use makedumpfile for cases other than >> '--mem-usage'. >> >> So kindly consider this one for the next makedumpfile release. > > Hi Bhupesh, > > I understood its importance and I think that this problem should be > fixed before the v1.6.4 release. Thanks for letting me know it. > > I've reviewed the patch and the process flow of arm64 architecture > around it. It looks good to me, and applied. Great. Thanks a lot for your help. Regards, Bhupesh ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: panic kexec broken on ARM64?
Hi Akashi, On 04/07/18 09:41, takahiro.aka...@linaro.org wrote: > On Tue, Jul 03, 2018 at 09:58:44AM +0100, Marc Zyngier wrote: >> On 03/07/18 08:01, takahiro.aka...@linaro.org wrote: >>> On Sun, Jun 10, 2018 at 01:24:17PM +0100, Marc Zyngier wrote: On Wed, 06 Jun 2018 12:37:02 +0100 James Morse wrote:, > Bingo: its the lan78xx driver that is sleeping from the irqchip > callbacks; The smsc95xx driver doesn't have a struct irq_chip, which > is why the RPi-3-B doesn't do this. > > It may be valid for kdump to only teardown the 'root irqdomain' (if > that even means anything). I assume these secondary irqchip's would > have a summary-interrupt that goes to another irqchip. But I can't > see a way to tell them apart.., Overall, I can't think of an easy fix. We have a few options, but none of them involve a centralised change: 1) We provide a reset infrastructure for irqchips, with an opt-in mechanism. This involves changing the way we teardown irqs at crash-time, and we'd then need some notion of reset ordering (think of the layered ITS and GICv3, for example). >>> >>> Does this mean that all the irqchips have to be implemented with reset? >> >> No. Only those that want to be reset at kexec time. > > I don't get the point yet. (this stuff is new to me, below terminology is probably wrong:) It seems there is actually a tree of irqchips, which feed interrupts through the tree via some summary-interrupt. The problem is one of these later-irqchips is on the other end of the USB bus, and requires a fair amount of sleeping in order to reset it. The trick is everything comes through the root irqchip. So we can reset that to disable all the other interrupts in this tree. This only works until the new kernel re-enables the summary-interrupt, it needs to reset the irqchip that the summary-interrupt leads to first. (I assume shared summary-interrupts are somehow forbidden). > Who should have reset interface? > What is the criteria? (reset-interface -> be reset at kdump time) Just the root irchip that all interrupts have to come through. Thanks, James ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
[PATCH 0/1] x86/purgatory: Fix Makefile bug
Hi everybody, when i moved the purgatories sha256 implementation to common code, i forgot to add FORCE to the new Makefile target. This patch fixes it. Note: There is an equivalent bug in the s390 purgatory. The fix for that will go upstream via Martin. Thanks Philipp Philipp Rudo (1): x86/purgatory: Add missing FORCE to Makefile target arch/x86/purgatory/Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -- 2.16.4 ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
[PATCH 1/1] x86/purgatory: Add missing FORCE to Makefile target
Without FORCE make does not detect changes only made to the command line options. So object files might not be re-built even when they should be. Fix this by adding FORCE where it is missing. Fixes: df6f2801f511 ("kernel/kexec_file.c: move purgatories sha256 to common code") Cc: # 4.17 Signed-off-by: Philipp Rudo --- arch/x86/purgatory/Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/purgatory/Makefile b/arch/x86/purgatory/Makefile index 2e9ee023e6bc..81a8e33115ad 100644 --- a/arch/x86/purgatory/Makefile +++ b/arch/x86/purgatory/Makefile @@ -6,7 +6,7 @@ purgatory-y := purgatory.o stack.o setup-x86_$(BITS).o sha256.o entry64.o string targets += $(purgatory-y) PURGATORY_OBJS = $(addprefix $(obj)/,$(purgatory-y)) -$(obj)/sha256.o: $(srctree)/lib/sha256.c +$(obj)/sha256.o: $(srctree)/lib/sha256.c FORCE $(call if_changed_rule,cc_o_c) LDFLAGS_purgatory.ro := -e purgatory_start -r --no-undefined -nostdlib -z nodefaultlib -- 2.16.4 ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: panic kexec broken on ARM64?
On 04/07/18 09:41, takahiro.aka...@linaro.org wrote: > On Tue, Jul 03, 2018 at 09:58:44AM +0100, Marc Zyngier wrote: >> On 03/07/18 08:01, takahiro.aka...@linaro.org wrote: >>> Marc, James, >>> >>> I'd like to re-ignite the discussion. >>> >>> On Sun, Jun 10, 2018 at 01:24:17PM +0100, Marc Zyngier wrote: On Wed, 06 Jun 2018 12:37:02 +0100, James Morse wrote: > > Hi Stefan, > > On 06/06/18 08:02, Stefan Wahren wrote: >> Am 05.06.2018 um 19:46 schrieb James Morse: >>> On 05/06/18 09:01, Petr Tesarik wrote: I attached a hardware debugger and found out that all CPU cores were stopped except one which was stuck in the idle thread. It seems that irq_set_irqchip_state() may sleep, which is definitely not safe after a kernel panic. > >>> I don't know much about irqchip stuff, but __irq_get_desc_lock() takes a >>> raw_spin_lock(), and calls gic_irq_get_irqchip_state() which is just >>> poking >>> around in mmio registers, this should all be safe unless you re-entered >>> the same >>> code. > If I'm right, then this is broken in general, but I have only ever seen it on RPi 3 Model B+ (even RPi3 Model B works fine), so the issue may be more subtle. > >>> Is there a hardware difference around the interrupt controller on these? > >> No, but the RPi 3 B has a different USB network chip on board (smsc95xx, >> Fast >> ethernet) instead of lan78xx (Gigabit ethernet). > > Bingo: its the lan78xx driver that is sleeping from the irqchip > callbacks; The smsc95xx driver doesn't have a struct irq_chip, which > is why the RPi-3-B doesn't do this. > > It may be valid for kdump to only teardown the 'root irqdomain' (if > that even means anything). I assume these secondary irqchip's would > have a summary-interrupt that goes to another irqchip. But I can't > see a way to tell them apart.., There is none. A cascaded irqchip is just like a root irqchip, just that its output line is connected to another irqchip. But we have no easy way to identify the parent. Also, this particular driver looks quite creative (it reinvents the wheel for chained interrupts -- see intr_complete and lan78xx_status), meaning that even if we could have a magic way of identify a chained irqchip, we'd miss that one. Broken. > I think we need to wait until after the merge window for Marc's > wisdom on this! Overall, I can't think of an easy fix. We have a few options, but none of them involve a centralised change: 1) We provide a reset infrastructure for irqchips, with an opt-in mechanism. This involves changing the way we teardown irqs at crash-time, and we'd then need some notion of reset ordering (think of the layered ITS and GICv3, for example). >>> >>> Does this mean that all the irqchips have to be implemented with reset? >> >> No. Only those that want to be reset at kexec time. > > I don't get the point yet. Who should have reset interface? > What is the criteria? The criteria is "this irqchip requires a reset to be safely used in the secondary kernel". This is a judgement call from the person writing the driver. > 2) We provide a way to identify interrupts that are ultimately backed by a root controller, which implies walking down the hierarchy for >>> >>> To be clear, from bottom to top (or root), right? >> >> I'm not sure I understand your question. The idea is to walk the >> irq_data chain, until we hit a root irqchip. If we do hit one, we >> deactivate/eoi/disable this interrupt. If we don't, we do nothing. > > I thought that we would traverse the (chained irq) hierarchy from > bottom to top and call deactivate or others in that order. > Am I wrong here? You *cannot* traverse a hierarchy through a chained irq. At that stage, irq_data->parent_data is NULL. The only thing you can do is to iterate over all the interrupts and deactivate those that are directly on the root interrupt controller. This will have the effect of stopping the interrupts that are behind a chained controller. M. -- Jazz is not dead. It just smells funny... ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: panic kexec broken on ARM64?
On Tue, Jul 03, 2018 at 09:58:44AM +0100, Marc Zyngier wrote: > On 03/07/18 08:01, takahiro.aka...@linaro.org wrote: > > Marc, James, > > > > I'd like to re-ignite the discussion. > > > > On Sun, Jun 10, 2018 at 01:24:17PM +0100, Marc Zyngier wrote: > >> On Wed, 06 Jun 2018 12:37:02 +0100, > >> James Morse wrote: > >>> > >>> Hi Stefan, > >>> > >>> On 06/06/18 08:02, Stefan Wahren wrote: > Am 05.06.2018 um 19:46 schrieb James Morse: > > On 05/06/18 09:01, Petr Tesarik wrote: > >> I attached a hardware debugger and found > >> out that all CPU cores were stopped except one which was stuck in the > >> idle thread. It seems that irq_set_irqchip_state() may sleep, which is > >> definitely not safe after a kernel panic. > >>> > > I don't know much about irqchip stuff, but __irq_get_desc_lock() takes a > > raw_spin_lock(), and calls gic_irq_get_irqchip_state() which is just > > poking > > around in mmio registers, this should all be safe unless you re-entered > > the same > > code. > >>> > >> If I'm right, then this is broken in general, but I have only ever seen > >> it on RPi 3 Model B+ (even RPi3 Model B works fine), so the issue may > >> be more subtle. > >>> > > Is there a hardware difference around the interrupt controller on these? > >>> > No, but the RPi 3 B has a different USB network chip on board (smsc95xx, > Fast > ethernet) instead of lan78xx (Gigabit ethernet). > >>> > >>> Bingo: its the lan78xx driver that is sleeping from the irqchip > >>> callbacks; The smsc95xx driver doesn't have a struct irq_chip, which > >>> is why the RPi-3-B doesn't do this. > >>> > >>> It may be valid for kdump to only teardown the 'root irqdomain' (if > >>> that even means anything). I assume these secondary irqchip's would > >>> have a summary-interrupt that goes to another irqchip. But I can't > >>> see a way to tell them apart.., > >> > >> There is none. A cascaded irqchip is just like a root irqchip, just > >> that its output line is connected to another irqchip. But we have no > >> easy way to identify the parent. Also, this particular driver looks > >> quite creative (it reinvents the wheel for chained interrupts -- see > >> intr_complete and lan78xx_status), meaning that even if we could have > >> a magic way of identify a chained irqchip, we'd miss that one. Broken. > >> > >>> I think we need to wait until after the merge window for Marc's > >>> wisdom on this! > >> > >> Overall, I can't think of an easy fix. We have a few options, but none > >> of them involve a centralised change: > >> > >> 1) We provide a reset infrastructure for irqchips, with an opt-in > >>mechanism. This involves changing the way we teardown irqs at > >>crash-time, and we'd then need some notion of reset ordering (think > >>of the layered ITS and GICv3, for example). > > > > Does this mean that all the irqchips have to be implemented with reset? > > No. Only those that want to be reset at kexec time. I don't get the point yet. Who should have reset interface? What is the criteria? > >> > >> 2) We provide a way to identify interrupts that are ultimately backed > >>by a root controller, which implies walking down the hierarchy for > > > > To be clear, from bottom to top (or root), right? > > I'm not sure I understand your question. The idea is to walk the > irq_data chain, until we hit a root irqchip. If we do hit one, we > deactivate/eoi/disable this interrupt. If we don't, we do nothing. I thought that we would traverse the (chained irq) hierarchy from bottom to top and call deactivate or others in that order. Am I wrong here? > This would avoid the above brokenness, and still ensures that no > interrupt reaches the CPU. > > > > >>each one of them. Fairly expensive, but minimal in way of changes > >>in the crash code. Requires a per-irqchip flag, but ordering comes > >>in for free. > >> > >> 3) We do the same as (2), but at the irqdomain level. Not sure that's > >>any better, and it may be even more complicated and bring back some > >>ordering issues. > > > > Do you think that the same thing may happen in case of pci/msi? > > I have no confidence but MSI has some kind of irq domain hierarchy. > > Anything can happen, as people implement their interrupt infrastructure > in weird and wonderful ways. So we need to be prepared for the worse. > > I've pushed 3 patches on a branch[1]. It is mostly untested, but it > should allow the above RPi3 disaster to cope with kexec. I don't have any hardware that sees this kind of issue and can't test. -Takahiro AKASHI > M. > > [1]: > https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/log/?h=irq/root-irqchip > > -- > Jazz is not dead, it just smell funny. ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: panic kexec broken on ARM64?
On 03/07/18 10:58, Marc Zyngier wrote: > On 03/07/18 08:01, takahiro.aka...@linaro.org wrote: >> Marc, James, >> >> I'd like to re-ignite the discussion. >> >> On Sun, Jun 10, 2018 at 01:24:17PM +0100, Marc Zyngier wrote: >>> On Wed, 06 Jun 2018 12:37:02 +0100, >>> James Morse wrote: Hi Stefan, On 06/06/18 08:02, Stefan Wahren wrote: > Am 05.06.2018 um 19:46 schrieb James Morse: >> On 05/06/18 09:01, Petr Tesarik wrote: >>> I attached a hardware debugger and found >>> out that all CPU cores were stopped except one which was stuck in the >>> idle thread. It seems that irq_set_irqchip_state() may sleep, which is >>> definitely not safe after a kernel panic. >> I don't know much about irqchip stuff, but __irq_get_desc_lock() takes a >> raw_spin_lock(), and calls gic_irq_get_irqchip_state() which is just >> poking >> around in mmio registers, this should all be safe unless you re-entered >> the same >> code. >>> If I'm right, then this is broken in general, but I have only ever seen >>> it on RPi 3 Model B+ (even RPi3 Model B works fine), so the issue may >>> be more subtle. >> Is there a hardware difference around the interrupt controller on these? > No, but the RPi 3 B has a different USB network chip on board (smsc95xx, > Fast > ethernet) instead of lan78xx (Gigabit ethernet). Bingo: its the lan78xx driver that is sleeping from the irqchip callbacks; The smsc95xx driver doesn't have a struct irq_chip, which is why the RPi-3-B doesn't do this. It may be valid for kdump to only teardown the 'root irqdomain' (if that even means anything). I assume these secondary irqchip's would have a summary-interrupt that goes to another irqchip. But I can't see a way to tell them apart.., >>> >>> There is none. A cascaded irqchip is just like a root irqchip, just >>> that its output line is connected to another irqchip. But we have no >>> easy way to identify the parent. Also, this particular driver looks >>> quite creative (it reinvents the wheel for chained interrupts -- see >>> intr_complete and lan78xx_status), meaning that even if we could have >>> a magic way of identify a chained irqchip, we'd miss that one. Broken. >>> I think we need to wait until after the merge window for Marc's wisdom on this! >>> >>> Overall, I can't think of an easy fix. We have a few options, but none >>> of them involve a centralised change: >>> >>> 1) We provide a reset infrastructure for irqchips, with an opt-in >>>mechanism. This involves changing the way we teardown irqs at >>>crash-time, and we'd then need some notion of reset ordering (think >>>of the layered ITS and GICv3, for example). >> >> Does this mean that all the irqchips have to be implemented with reset? > > No. Only those that want to be reset at kexec time. > >>> >>> 2) We provide a way to identify interrupts that are ultimately backed >>>by a root controller, which implies walking down the hierarchy for >> >> To be clear, from bottom to top (or root), right? > > I'm not sure I understand your question. The idea is to walk the > irq_data chain, until we hit a root irqchip. If we do hit one, we > deactivate/eoi/disable this interrupt. If we don't, we do nothing. > > This would avoid the above brokenness, and still ensures that no > interrupt reaches the CPU. > >> >>>each one of them. Fairly expensive, but minimal in way of changes >>>in the crash code. Requires a per-irqchip flag, but ordering comes >>>in for free. >>> >>> 3) We do the same as (2), but at the irqdomain level. Not sure that's >>>any better, and it may be even more complicated and bring back some >>>ordering issues. >> >> Do you think that the same thing may happen in case of pci/msi? >> I have no confidence but MSI has some kind of irq domain hierarchy. > > Anything can happen, as people implement their interrupt infrastructure > in weird and wonderful ways. So we need to be prepared for the worse. > > I've pushed 3 patches on a branch[1]. It is mostly untested, but it > should allow the above RPi3 disaster to cope with kexec. > > M. > > [1]: > https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/log/?h=irq/root-irqchip > I threw the kernel on my RPi3+ model but I wasn't able to start the crash kernel. Unfortunately I don't have a JTAG adapter to check if it hangs for the same reason. Regards, Matthias ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: panic kexec broken on ARM64?
On Wed, 04 Jul 2018 15:08:38 +0100, Matthias Brugger wrote: > On 03/07/18 10:58, Marc Zyngier wrote: > > I've pushed 3 patches on a branch[1]. It is mostly untested, but it > > should allow the above RPi3 disaster to cope with kexec. > > > > M. > > > > [1]: > > https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/log/?h=irq/root-irqchip > > > > I threw the kernel on my RPi3+ model but I wasn't able to start the crash > kernel. Unfortunately I don't have a JTAG adapter to check if it hangs for the > same reason. Have you annotated the RPi3 root irq_chip structure so that the arm64 kexec code knows that this is the root interrupt controller? I've only done so on the GIC drivers. M. -- Jazz is not dead, it just smell funny. ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v2 3/4] efi/arm: map UEFI memory map earlier on boot
On 4 July 2018 at 19:06, Will Deacon wrote: > Hi all, > > [Ard -- please can you look at the EFI parts of this patch] > > On Tue, Jun 19, 2018 at 03:44:23PM +0900, AKASHI Takahiro wrote: >> Since arm_enter_runtime_services() was modified to always create a virtual >> mapping of UEFI memory map in the previous patch, it is now renamed to >> efi_enter_virtual_mode() and called earlier before acpi_load_tables() >> in acpi_early_init(). >> >> This will allow us to use UEFI memory map in acpi_os_ioremap() to create >> mappings of ACPI tables using memory attributes described in UEFI memory >> map. >> >> See a relevant commit: >> arm64: acpi: fix alignment fault in accessing ACPI tables >> >> Signed-off-by: AKASHI Takahiro >> Cc: Ard Biesheuvel >> Cc: Andrew Morton >> --- >> drivers/firmware/efi/arm-runtime.c | 15 ++- >> init/main.c| 3 +++ >> 2 files changed, 9 insertions(+), 9 deletions(-) >> >> diff --git a/drivers/firmware/efi/arm-runtime.c >> b/drivers/firmware/efi/arm-runtime.c >> index 30ac5c82051e..566ef0a9edb5 100644 >> --- a/drivers/firmware/efi/arm-runtime.c >> +++ b/drivers/firmware/efi/arm-runtime.c >> @@ -106,46 +106,43 @@ static bool __init efi_virtmap_init(void) >> * non-early mapping of the UEFI system table and virtual mappings for all >> * EFI_MEMORY_RUNTIME regions. >> */ >> -static int __init arm_enable_runtime_services(void) >> +void __init efi_enter_virtual_mode(void) >> { >> u64 mapsize; >> >> if (!efi_enabled(EFI_BOOT)) { >> pr_info("EFI services will not be available.\n"); >> - return 0; >> + return; >> } >> >> mapsize = efi.memmap.desc_size * efi.memmap.nr_map; >> >> if (efi_memmap_init_late(efi.memmap.phys_map, mapsize)) { >> pr_err("Failed to remap EFI memory map\n"); >> - return 0; >> + return; >> } >> >> if (efi_runtime_disabled()) { >> pr_info("EFI runtime services will be disabled.\n"); >> - return 0; >> + return; >> } >> >> if (efi_enabled(EFI_RUNTIME_SERVICES)) { >> pr_info("EFI runtime services access via paravirt.\n"); >> - return 0; >> + return; >> } >> >> pr_info("Remapping and enabling EFI services.\n"); >> >> if (!efi_virtmap_init()) { >> pr_err("UEFI virtual mapping missing or invalid -- runtime >> services will not be available\n"); >> - return -ENOMEM; >> + return; >> } >> >> /* Set up runtime services function pointers */ >> efi_native_runtime_setup(); >> set_bit(EFI_RUNTIME_SERVICES, ); >> - >> - return 0; >> } >> -early_initcall(arm_enable_runtime_services); >> >> void efi_virtmap_load(void) >> { >> diff --git a/init/main.c b/init/main.c >> index 3b4ada11ed52..532fc0d02353 100644 >> --- a/init/main.c >> +++ b/init/main.c >> @@ -694,6 +694,9 @@ asmlinkage __visible void __init start_kernel(void) >> debug_objects_mem_init(); >> setup_per_cpu_pageset(); >> numa_policy_init(); >> + if (IS_ENABLED(CONFIG_EFI) && >> + (IS_ENABLED(CONFIG_ARM64) || IS_ENABLED(CONFIG_ARM))) >> + efi_enter_virtual_mode(); > > Hmm, this is ugly as hell. Is there nothing else we can piggy-back off? > It's also fairly jarring that, on x86, efi_enter_virtual_mode() is called > a few lines later, *after* acpi_early_init() has been called. > Currently, there is a gap where we have already torn down the early mapping and haven't created the definitive mapping of the UEFI memory map. There are other reasons why this is an issue, and I recently proposed [0] myself to address one of them (and I didn't remember this particular series, or the fact that I actually suggested this approach IIRC) Akashi-san, could you please confirm whether the patch below would be sufficient for you? Apologies for going back and forth on this, but I agree with Will that we should try to avoid warts like the one above in generic code. [0] https://marc.info/?l=linux-efi=152930773507524=2 > The rest of the series looks fine to me, but I'm not comfortable taking > changes like this via the arm64 tree. > > Will ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v2 3/4] efi/arm: map UEFI memory map earlier on boot
Hi all, [Ard -- please can you look at the EFI parts of this patch] On Tue, Jun 19, 2018 at 03:44:23PM +0900, AKASHI Takahiro wrote: > Since arm_enter_runtime_services() was modified to always create a virtual > mapping of UEFI memory map in the previous patch, it is now renamed to > efi_enter_virtual_mode() and called earlier before acpi_load_tables() > in acpi_early_init(). > > This will allow us to use UEFI memory map in acpi_os_ioremap() to create > mappings of ACPI tables using memory attributes described in UEFI memory > map. > > See a relevant commit: > arm64: acpi: fix alignment fault in accessing ACPI tables > > Signed-off-by: AKASHI Takahiro > Cc: Ard Biesheuvel > Cc: Andrew Morton > --- > drivers/firmware/efi/arm-runtime.c | 15 ++- > init/main.c| 3 +++ > 2 files changed, 9 insertions(+), 9 deletions(-) > > diff --git a/drivers/firmware/efi/arm-runtime.c > b/drivers/firmware/efi/arm-runtime.c > index 30ac5c82051e..566ef0a9edb5 100644 > --- a/drivers/firmware/efi/arm-runtime.c > +++ b/drivers/firmware/efi/arm-runtime.c > @@ -106,46 +106,43 @@ static bool __init efi_virtmap_init(void) > * non-early mapping of the UEFI system table and virtual mappings for all > * EFI_MEMORY_RUNTIME regions. > */ > -static int __init arm_enable_runtime_services(void) > +void __init efi_enter_virtual_mode(void) > { > u64 mapsize; > > if (!efi_enabled(EFI_BOOT)) { > pr_info("EFI services will not be available.\n"); > - return 0; > + return; > } > > mapsize = efi.memmap.desc_size * efi.memmap.nr_map; > > if (efi_memmap_init_late(efi.memmap.phys_map, mapsize)) { > pr_err("Failed to remap EFI memory map\n"); > - return 0; > + return; > } > > if (efi_runtime_disabled()) { > pr_info("EFI runtime services will be disabled.\n"); > - return 0; > + return; > } > > if (efi_enabled(EFI_RUNTIME_SERVICES)) { > pr_info("EFI runtime services access via paravirt.\n"); > - return 0; > + return; > } > > pr_info("Remapping and enabling EFI services.\n"); > > if (!efi_virtmap_init()) { > pr_err("UEFI virtual mapping missing or invalid -- runtime > services will not be available\n"); > - return -ENOMEM; > + return; > } > > /* Set up runtime services function pointers */ > efi_native_runtime_setup(); > set_bit(EFI_RUNTIME_SERVICES, ); > - > - return 0; > } > -early_initcall(arm_enable_runtime_services); > > void efi_virtmap_load(void) > { > diff --git a/init/main.c b/init/main.c > index 3b4ada11ed52..532fc0d02353 100644 > --- a/init/main.c > +++ b/init/main.c > @@ -694,6 +694,9 @@ asmlinkage __visible void __init start_kernel(void) > debug_objects_mem_init(); > setup_per_cpu_pageset(); > numa_policy_init(); > + if (IS_ENABLED(CONFIG_EFI) && > + (IS_ENABLED(CONFIG_ARM64) || IS_ENABLED(CONFIG_ARM))) > + efi_enter_virtual_mode(); Hmm, this is ugly as hell. Is there nothing else we can piggy-back off? It's also fairly jarring that, on x86, efi_enter_virtual_mode() is called a few lines later, *after* acpi_early_init() has been called. The rest of the series looks fine to me, but I'm not comfortable taking changes like this via the arm64 tree. Will ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec