[PATCH] efi/fdt: More cleanups
* Ard Biesheuvel wrote: > From: Julien Thierry > > Closing bracket seems to end a for statement when it is actually ending > the contained if. Add some brackets to have clear delimitation of each > scope. > > No functional change/fix, just fix the indentation. > > Signed-off-by: Julien Thierry > Signed-off-by: Ard Biesheuvel > --- > drivers/firmware/efi/libstub/fdt.c | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff --git a/drivers/firmware/efi/libstub/fdt.c > b/drivers/firmware/efi/libstub/fdt.c > index 0c0d2312f4a8..a3614f9b5f75 100644 > --- a/drivers/firmware/efi/libstub/fdt.c > +++ b/drivers/firmware/efi/libstub/fdt.c > @@ -376,7 +376,7 @@ void *get_fdt(efi_system_table_t *sys_table, unsigned > long *fdt_size) > tables = (efi_config_table_t *) sys_table->tables; > fdt = NULL; > > - for (i = 0; i < sys_table->nr_tables; i++) > + for (i = 0; i < sys_table->nr_tables; i++) { > if (efi_guidcmp(tables[i].guid, fdt_guid) == 0) { > fdt = (void *) tables[i].table; > if (fdt_check_header(fdt) != 0) { > @@ -385,7 +385,8 @@ void *get_fdt(efi_system_table_t *sys_table, unsigned > long *fdt_size) > } > *fdt_size = fdt_totalsize(fdt); > break; > - } > + } > + } So if we are doing trivial cleanups, how about the patch below on top? It cleans up this file for good. Only minimally build tested. Thanks, Ingo ==> Subject: efi/fdt: More cleanups From: Ingo Molnar Apply a number of cleanups: - Introduce fdt_setprop_*var() helper macros to simplify and shorten repetitive sequences - this also makes it less likely that the wrong variable size is passed in. This change makes a lot of the property-setting calls single-line and easier to read. - Harmonize comment style: capitalization, punctuation, whitespaces, etc. - Fix some whitespace noise in the libstub Makefile which I happened to notice. - Use the standard tabular initialization style: - map.map = _map; - map.map_size = _size; - map.desc_size = _size; - map.desc_ver = _ver; - map.key_ptr = _key; - map.buff_size = _size; + map.map = _map; + map.map_size= _size; + map.desc_size = _size; + map.desc_ver= _ver; + map.key_ptr = _key; + map.buff_size = _size; - Use tabular structure definition for better readability. - Make all pr*() lines single-line, even if they marginally exceed 80 cols - this makes them visually less intrusive. - Unbreak line breaks into single lines when the length exceeds 80 cols only marginally, for better readability. - Move assignment closer to the actual usage site. - Plus some other smaller cleanups, spelling fixes, etc. No change in functionality intended. Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Signed-off-by: Ingo Molnar --- drivers/firmware/efi/libstub/Makefile |4 - drivers/firmware/efi/libstub/fdt.c| 109 -- scripts/dtc/libfdt/libfdt.h | 10 +++ 3 files changed, 64 insertions(+), 59 deletions(-) Index: tip/drivers/firmware/efi/libstub/Makefile === --- tip.orig/drivers/firmware/efi/libstub/Makefile +++ tip/drivers/firmware/efi/libstub/Makefile @@ -49,7 +49,7 @@ lib-$(CONFIG_EFI_ARMSTUB) += arm-stub.o lib-$(CONFIG_ARM) += arm32-stub.o lib-$(CONFIG_ARM64)+= arm64-stub.o -CFLAGS_arm64-stub.o:= -DTEXT_OFFSET=$(TEXT_OFFSET) +CFLAGS_arm64-stub.o:= -DTEXT_OFFSET=$(TEXT_OFFSET) # # arm64 puts the stub in the kernel proper, which will unnecessarily retain all @@ -86,7 +86,7 @@ quiet_cmd_stubcopy = STUBCPY $@ cmd_stubcopy = if $(STRIP) --strip-debug $(STUBCOPY_RM-y) -o $@ $<; \ then if $(OBJDUMP) -r $@ | grep $(STUBCOPY_RELOC-y); \ then (echo >&2 "$@: absolute symbol references not allowed in the EFI stub"; \ - rm -f $@; /bin/false); \ + rm -f $@; /bin/false); \ else $(OBJCOPY) $(STUBCOPY_FLAGS-y) $< $@; fi\ else /bin/false; fi Index: tip/drivers/firmware/efi/libstub/fdt.c === --- tip.orig/drivers/firmware/efi/libstub/fdt.c +++ tip/drivers/firmware/efi/libstub/fdt.c @@ -26,10 +26,8 @@ static void fdt_update_cell_size(efi_sys offset = fdt_path_offset(fdt, "/"); /* Set the #address-cells and #size-cells values for an empty tree */ - fdt_setprop_u32(fdt, offset, "#address-cells", - EFI_DT_ADDR_CELLS_DEFAULT); - -
Re: [PATCH 01/11] x86/efi: Allocate e820 buffer before calling efi_exit_boot_service
* Ard Biesheuvel wrote: > From: Eric Snowberg > > Commit d64934019f6c ("x86/efi: Use efi_exit_boot_services()") > introduced a regression on systems with large memory maps > causing them to hang on boot. The first "goto get_map" that was removed > from exit_boot insured there was enough room for the memory map when > efi_call_early(exit_boot_services) was called. This happens when > (nr_desc > ARRAY_SIZE(params->e820_table). > > Chain of events: > exit_boot() > efi_exit_boot_services() > efi_get_memory_map <- at this point the mm can't grow over 8 desc > priv_func() > exit_boot_func() > allocate_e820ext() <- new mm grows over 8 desc from e820 alloc > efi_call_early(exit_boot_services) <- mm key doesn't match so retry > efi_call_early(get_memory_map) <- not enough room for new mm > system hangs > > This patch allocates the e820 buffer before calling efi_exit_boot_services > and fixes the regression. > > Signed-off-by: Eric Snowberg > Signed-off-by: Ard Biesheuvel > --- > arch/x86/boot/compressed/eboot.c | 65 > 1 file changed, 41 insertions(+), 24 deletions(-) Any objections against marking this for -stable and filing it in efi/urgent? Boot hangs are show-stopper bugs, so distributions would want to backport this fix anyway. > + boot_map.map = > + boot_map.map_size = _size; > + boot_map.desc_size =_size; > + boot_map.desc_ver = NULL; > + boot_map.key_ptr = NULL; > + boot_map.buff_size =_size; I have changed this to the canonical style, which is also used elsewhere in the file: > map.buff_size = _size; > priv.boot_params= boot_params; > priv.efi= _params->efi_info; > - priv.e820ext= NULL; > - priv.e820ext_size = 0; I.e.: + boot_map.map= + boot_map.map_size = _size; + boot_map.desc_size = _size; + boot_map.desc_ver = NULL; + boot_map.key_ptr= NULL; + boot_map.buff_size = _size; + + status = efi_get_memory_map(sys_table, _map); Thanks, Ingo
Re: [PATCH 1/2 v8] resource: add the new I/O resource descriptor 'IORES_DESC_RESERVED'
在 2018年11月30日 11:37, Dave Young 写道: > + more people > Thank you, Dave. That would be fine to let more people review this patch. > On 11/29/18 at 04:09pm, Lianbo Jiang wrote: >> When doing kexec_file_load, the first kernel needs to pass the e820 >> reserved ranges to the second kernel. But kernel can not exactly >> match the e820 reserved ranges when walking through the iomem resources >> with the descriptor 'IORES_DESC_NONE', because several e820 types( >> e.g. E820_TYPE_RESERVED_KERN/E820_TYPE_RAM/E820_TYPE_UNUSABLE/E820 >> _TYPE_RESERVED) are converted to the descriptor 'IORES_DESC_NONE'. It >> may pass these four types to the kdump kernel, that is not desired result. >> >> So, this patch adds a new I/O resource descriptor 'IORES_DESC_RESERVED' >> for the iomem resources search interfaces. It is helpful to exactly >> match the reserved resource ranges when walking through iomem resources. >> >> In addition, since the new descriptor 'IORES_DESC_RESERVED' is introduced, >> these code originally related to the descriptor 'IORES_DESC_NONE' need to >> be updated. Otherwise, it will be easily confused and also cause some >> errors. Because the 'E820_TYPE_RESERVED' type is converted to the new >> descriptor 'IORES_DESC_RESERVED' instead of 'IORES_DESC_NONE', it has been >> changed. >> >> Suggested-by: Dave Young >> Signed-off-by: Lianbo Jiang >> --- >> arch/ia64/kernel/efi.c | 4 >> arch/x86/kernel/e820.c | 2 +- >> arch/x86/mm/ioremap.c | 13 - >> include/linux/ioport.h | 1 + >> kernel/resource.c | 6 +++--- >> 5 files changed, 21 insertions(+), 5 deletions(-) >> >> diff --git a/arch/ia64/kernel/efi.c b/arch/ia64/kernel/efi.c >> index 8f106638913c..1841e9b4db30 100644 >> --- a/arch/ia64/kernel/efi.c >> +++ b/arch/ia64/kernel/efi.c >> @@ -1231,6 +1231,10 @@ efi_initialize_iomem_resources(struct resource >> *code_resource, >> break; >> >> case EFI_RESERVED_TYPE: >> +name = "reserved"; > > Ingo updated X86 code to use "Reserved", I think it would be good to do > same for this case as well > I have noticed the changes on x86, but for IA64, i'm not sure whether it should do the same thing, so keep it as before. If IA64 people would like to give any comment, that will be appreciated. >> +desc = IORES_DESC_RESERVED; >> +break; >> + >> case EFI_RUNTIME_SERVICES_CODE: >> case EFI_RUNTIME_SERVICES_DATA: >> case EFI_ACPI_RECLAIM_MEMORY: > > Originally, above 3 are all "reserved", so probably they all should be > IORES_DESC_RESERVED. > This is a good question. If above 3 types are converted to the new descriptor 'IORES_DESC_RESERVED', how to handle the all 'default' case? Because the all 'default' case is also converted to the descriptor 'IORES_DESC_NONE' and the name 'reserved'. > Can any IA64 people to review this? > >> diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c >> index 50895c2f937d..57fafdafb860 100644 >> --- a/arch/x86/kernel/e820.c >> +++ b/arch/x86/kernel/e820.c >> @@ -1048,10 +1048,10 @@ static unsigned long __init >> e820_type_to_iores_desc(struct e820_entry *entry) >> case E820_TYPE_NVS: return IORES_DESC_ACPI_NV_STORAGE; >> case E820_TYPE_PMEM:return IORES_DESC_PERSISTENT_MEMORY; >> case E820_TYPE_PRAM:return >> IORES_DESC_PERSISTENT_MEMORY_LEGACY; >> +case E820_TYPE_RESERVED:return IORES_DESC_RESERVED; >> case E820_TYPE_RESERVED_KERN: /* Fall-through: */ >> case E820_TYPE_RAM: /* Fall-through: */ >> case E820_TYPE_UNUSABLE:/* Fall-through: */ >> -case E820_TYPE_RESERVED:/* Fall-through: */ >> default:return IORES_DESC_NONE; >> } >> } >> diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c >> index 5378d10f1d31..fea2ef99415d 100644 >> --- a/arch/x86/mm/ioremap.c >> +++ b/arch/x86/mm/ioremap.c >> @@ -83,7 +83,18 @@ static bool __ioremap_check_ram(struct resource *res) >> >> static int __ioremap_check_desc_other(struct resource *res) >> { >> -return (res->desc != IORES_DESC_NONE); >> +/* >> + * But now, the 'E820_TYPE_RESERVED' type is converted to the new >> + * descriptor 'IORES_DESC_RESERVED' instead of 'IORES_DESC_NONE', >> + * it has been changed. And the value of 'mem_flags.desc_other' >> + * is equal to 'true' if we don't strengthen the condition in this >> + * function, that is wrong. Because originally it is equal to >> + * 'false' for the same reserved type. >> + * >> + * So, that would be nice to keep it the same as before. >> + */ >> +return ((res->desc != IORES_DESC_NONE) && >> +(res->desc != IORES_DESC_RESERVED)); >> } > > Added Tom since he added the check function. Is it possible to only > check explict valid desc
Re: [PATCH 1/2 v8] resource: add the new I/O resource descriptor 'IORES_DESC_RESERVED'
在 2018年11月30日 11:41, Dave Young 写道: > On 11/29/18 at 04:09pm, Lianbo Jiang wrote: >> When doing kexec_file_load, the first kernel needs to pass the e820 >> reserved ranges to the second kernel. But kernel can not exactly >> match the e820 reserved ranges when walking through the iomem resources >> with the descriptor 'IORES_DESC_NONE', because several e820 types( >> e.g. E820_TYPE_RESERVED_KERN/E820_TYPE_RAM/E820_TYPE_UNUSABLE/E820 >> _TYPE_RESERVED) are converted to the descriptor 'IORES_DESC_NONE'. It >> may pass these four types to the kdump kernel, that is not desired result. >> >> So, this patch adds a new I/O resource descriptor 'IORES_DESC_RESERVED' >> for the iomem resources search interfaces. It is helpful to exactly >> match the reserved resource ranges when walking through iomem resources. >> >> In addition, since the new descriptor 'IORES_DESC_RESERVED' is introduced, >> these code originally related to the descriptor 'IORES_DESC_NONE' need to >> be updated. Otherwise, it will be easily confused and also cause some >> errors. Because the 'E820_TYPE_RESERVED' type is converted to the new >> descriptor 'IORES_DESC_RESERVED' instead of 'IORES_DESC_NONE', it has been >> changed. >> >> Suggested-by: Dave Young > > This was suggested by Boris instead :) > Sorry for this, i forgot to change it. And i will correct this in next version. Thanks. >> Signed-off-by: Lianbo Jiang >> --- >> arch/ia64/kernel/efi.c | 4 >> arch/x86/kernel/e820.c | 2 +- >> arch/x86/mm/ioremap.c | 13 - >> include/linux/ioport.h | 1 + >> kernel/resource.c | 6 +++--- >> 5 files changed, 21 insertions(+), 5 deletions(-) >> >> > [snip] > > Thanks > Dave >
Re: [PATCH 1/2 v8] resource: add the new I/O resource descriptor 'IORES_DESC_RESERVED'
Correct Toshi's email addr On 11/30/18 at 11:37am, Dave Young wrote: > + more people > > On 11/29/18 at 04:09pm, Lianbo Jiang wrote: > > When doing kexec_file_load, the first kernel needs to pass the e820 > > reserved ranges to the second kernel. But kernel can not exactly > > match the e820 reserved ranges when walking through the iomem resources > > with the descriptor 'IORES_DESC_NONE', because several e820 types( > > e.g. E820_TYPE_RESERVED_KERN/E820_TYPE_RAM/E820_TYPE_UNUSABLE/E820 > > _TYPE_RESERVED) are converted to the descriptor 'IORES_DESC_NONE'. It > > may pass these four types to the kdump kernel, that is not desired result. > > > > So, this patch adds a new I/O resource descriptor 'IORES_DESC_RESERVED' > > for the iomem resources search interfaces. It is helpful to exactly > > match the reserved resource ranges when walking through iomem resources. > > > > In addition, since the new descriptor 'IORES_DESC_RESERVED' is introduced, > > these code originally related to the descriptor 'IORES_DESC_NONE' need to > > be updated. Otherwise, it will be easily confused and also cause some > > errors. Because the 'E820_TYPE_RESERVED' type is converted to the new > > descriptor 'IORES_DESC_RESERVED' instead of 'IORES_DESC_NONE', it has been > > changed. > > > > Suggested-by: Dave Young > > Signed-off-by: Lianbo Jiang > > --- > > arch/ia64/kernel/efi.c | 4 > > arch/x86/kernel/e820.c | 2 +- > > arch/x86/mm/ioremap.c | 13 - > > include/linux/ioport.h | 1 + > > kernel/resource.c | 6 +++--- > > 5 files changed, 21 insertions(+), 5 deletions(-) > > > > diff --git a/arch/ia64/kernel/efi.c b/arch/ia64/kernel/efi.c > > index 8f106638913c..1841e9b4db30 100644 > > --- a/arch/ia64/kernel/efi.c > > +++ b/arch/ia64/kernel/efi.c > > @@ -1231,6 +1231,10 @@ efi_initialize_iomem_resources(struct resource > > *code_resource, > > break; > > > > case EFI_RESERVED_TYPE: > > + name = "reserved"; > > Ingo updated X86 code to use "Reserved", I think it would be good to do > same for this case as well > > > + desc = IORES_DESC_RESERVED; > > + break; > > + > > case EFI_RUNTIME_SERVICES_CODE: > > case EFI_RUNTIME_SERVICES_DATA: > > case EFI_ACPI_RECLAIM_MEMORY: > > Originally, above 3 are all "reserved", so probably they all should be > IORES_DESC_RESERVED. > > Can any IA64 people to review this? > > > diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c > > index 50895c2f937d..57fafdafb860 100644 > > --- a/arch/x86/kernel/e820.c > > +++ b/arch/x86/kernel/e820.c > > @@ -1048,10 +1048,10 @@ static unsigned long __init > > e820_type_to_iores_desc(struct e820_entry *entry) > > case E820_TYPE_NVS: return IORES_DESC_ACPI_NV_STORAGE; > > case E820_TYPE_PMEM:return IORES_DESC_PERSISTENT_MEMORY; > > case E820_TYPE_PRAM:return > > IORES_DESC_PERSISTENT_MEMORY_LEGACY; > > + case E820_TYPE_RESERVED:return IORES_DESC_RESERVED; > > case E820_TYPE_RESERVED_KERN: /* Fall-through: */ > > case E820_TYPE_RAM: /* Fall-through: */ > > case E820_TYPE_UNUSABLE:/* Fall-through: */ > > - case E820_TYPE_RESERVED:/* Fall-through: */ > > default:return IORES_DESC_NONE; > > } > > } > > diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c > > index 5378d10f1d31..fea2ef99415d 100644 > > --- a/arch/x86/mm/ioremap.c > > +++ b/arch/x86/mm/ioremap.c > > @@ -83,7 +83,18 @@ static bool __ioremap_check_ram(struct resource *res) > > > > static int __ioremap_check_desc_other(struct resource *res) > > { > > - return (res->desc != IORES_DESC_NONE); > > + /* > > +* But now, the 'E820_TYPE_RESERVED' type is converted to the new > > +* descriptor 'IORES_DESC_RESERVED' instead of 'IORES_DESC_NONE', > > +* it has been changed. And the value of 'mem_flags.desc_other' > > +* is equal to 'true' if we don't strengthen the condition in this > > +* function, that is wrong. Because originally it is equal to > > +* 'false' for the same reserved type. > > +* > > +* So, that would be nice to keep it the same as before. > > +*/ > > + return ((res->desc != IORES_DESC_NONE) && > > + (res->desc != IORES_DESC_RESERVED)); > > } > > Added Tom since he added the check function. Is it possible to only > check explict valid desc types instead of exclude IORES_DESC_NONE? > > > > > static int __ioremap_res_check(struct resource *res, void *arg) > > diff --git a/include/linux/ioport.h b/include/linux/ioport.h > > index da0ebaec25f0..6ed59de48bd5 100644 > > --- a/include/linux/ioport.h > > +++ b/include/linux/ioport.h > > @@ -133,6 +133,7 @@ enum { > > IORES_DESC_PERSISTENT_MEMORY_LEGACY = 5, > > IORES_DESC_DEVICE_PRIVATE_MEMORY= 6, > >
Re: [PATCH 1/2 v8] resource: add the new I/O resource descriptor 'IORES_DESC_RESERVED'
+ more people On 11/29/18 at 04:09pm, Lianbo Jiang wrote: > When doing kexec_file_load, the first kernel needs to pass the e820 > reserved ranges to the second kernel. But kernel can not exactly > match the e820 reserved ranges when walking through the iomem resources > with the descriptor 'IORES_DESC_NONE', because several e820 types( > e.g. E820_TYPE_RESERVED_KERN/E820_TYPE_RAM/E820_TYPE_UNUSABLE/E820 > _TYPE_RESERVED) are converted to the descriptor 'IORES_DESC_NONE'. It > may pass these four types to the kdump kernel, that is not desired result. > > So, this patch adds a new I/O resource descriptor 'IORES_DESC_RESERVED' > for the iomem resources search interfaces. It is helpful to exactly > match the reserved resource ranges when walking through iomem resources. > > In addition, since the new descriptor 'IORES_DESC_RESERVED' is introduced, > these code originally related to the descriptor 'IORES_DESC_NONE' need to > be updated. Otherwise, it will be easily confused and also cause some > errors. Because the 'E820_TYPE_RESERVED' type is converted to the new > descriptor 'IORES_DESC_RESERVED' instead of 'IORES_DESC_NONE', it has been > changed. > > Suggested-by: Dave Young > Signed-off-by: Lianbo Jiang > --- > arch/ia64/kernel/efi.c | 4 > arch/x86/kernel/e820.c | 2 +- > arch/x86/mm/ioremap.c | 13 - > include/linux/ioport.h | 1 + > kernel/resource.c | 6 +++--- > 5 files changed, 21 insertions(+), 5 deletions(-) > > diff --git a/arch/ia64/kernel/efi.c b/arch/ia64/kernel/efi.c > index 8f106638913c..1841e9b4db30 100644 > --- a/arch/ia64/kernel/efi.c > +++ b/arch/ia64/kernel/efi.c > @@ -1231,6 +1231,10 @@ efi_initialize_iomem_resources(struct resource > *code_resource, > break; > > case EFI_RESERVED_TYPE: > + name = "reserved"; Ingo updated X86 code to use "Reserved", I think it would be good to do same for this case as well > + desc = IORES_DESC_RESERVED; > + break; > + > case EFI_RUNTIME_SERVICES_CODE: > case EFI_RUNTIME_SERVICES_DATA: > case EFI_ACPI_RECLAIM_MEMORY: Originally, above 3 are all "reserved", so probably they all should be IORES_DESC_RESERVED. Can any IA64 people to review this? > diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c > index 50895c2f937d..57fafdafb860 100644 > --- a/arch/x86/kernel/e820.c > +++ b/arch/x86/kernel/e820.c > @@ -1048,10 +1048,10 @@ static unsigned long __init > e820_type_to_iores_desc(struct e820_entry *entry) > case E820_TYPE_NVS: return IORES_DESC_ACPI_NV_STORAGE; > case E820_TYPE_PMEM:return IORES_DESC_PERSISTENT_MEMORY; > case E820_TYPE_PRAM:return > IORES_DESC_PERSISTENT_MEMORY_LEGACY; > + case E820_TYPE_RESERVED:return IORES_DESC_RESERVED; > case E820_TYPE_RESERVED_KERN: /* Fall-through: */ > case E820_TYPE_RAM: /* Fall-through: */ > case E820_TYPE_UNUSABLE:/* Fall-through: */ > - case E820_TYPE_RESERVED:/* Fall-through: */ > default:return IORES_DESC_NONE; > } > } > diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c > index 5378d10f1d31..fea2ef99415d 100644 > --- a/arch/x86/mm/ioremap.c > +++ b/arch/x86/mm/ioremap.c > @@ -83,7 +83,18 @@ static bool __ioremap_check_ram(struct resource *res) > > static int __ioremap_check_desc_other(struct resource *res) > { > - return (res->desc != IORES_DESC_NONE); > + /* > + * But now, the 'E820_TYPE_RESERVED' type is converted to the new > + * descriptor 'IORES_DESC_RESERVED' instead of 'IORES_DESC_NONE', > + * it has been changed. And the value of 'mem_flags.desc_other' > + * is equal to 'true' if we don't strengthen the condition in this > + * function, that is wrong. Because originally it is equal to > + * 'false' for the same reserved type. > + * > + * So, that would be nice to keep it the same as before. > + */ > + return ((res->desc != IORES_DESC_NONE) && > + (res->desc != IORES_DESC_RESERVED)); > } Added Tom since he added the check function. Is it possible to only check explict valid desc types instead of exclude IORES_DESC_NONE? > > static int __ioremap_res_check(struct resource *res, void *arg) > diff --git a/include/linux/ioport.h b/include/linux/ioport.h > index da0ebaec25f0..6ed59de48bd5 100644 > --- a/include/linux/ioport.h > +++ b/include/linux/ioport.h > @@ -133,6 +133,7 @@ enum { > IORES_DESC_PERSISTENT_MEMORY_LEGACY = 5, > IORES_DESC_DEVICE_PRIVATE_MEMORY= 6, > IORES_DESC_DEVICE_PUBLIC_MEMORY = 7, > + IORES_DESC_RESERVED = 8, > }; > > /* helpers to define resources */ > diff --git a/kernel/resource.c b/kernel/resource.c > index
Re: [PATCH 1/2 v8] resource: add the new I/O resource descriptor 'IORES_DESC_RESERVED'
On 11/29/18 at 04:09pm, Lianbo Jiang wrote: > When doing kexec_file_load, the first kernel needs to pass the e820 > reserved ranges to the second kernel. But kernel can not exactly > match the e820 reserved ranges when walking through the iomem resources > with the descriptor 'IORES_DESC_NONE', because several e820 types( > e.g. E820_TYPE_RESERVED_KERN/E820_TYPE_RAM/E820_TYPE_UNUSABLE/E820 > _TYPE_RESERVED) are converted to the descriptor 'IORES_DESC_NONE'. It > may pass these four types to the kdump kernel, that is not desired result. > > So, this patch adds a new I/O resource descriptor 'IORES_DESC_RESERVED' > for the iomem resources search interfaces. It is helpful to exactly > match the reserved resource ranges when walking through iomem resources. > > In addition, since the new descriptor 'IORES_DESC_RESERVED' is introduced, > these code originally related to the descriptor 'IORES_DESC_NONE' need to > be updated. Otherwise, it will be easily confused and also cause some > errors. Because the 'E820_TYPE_RESERVED' type is converted to the new > descriptor 'IORES_DESC_RESERVED' instead of 'IORES_DESC_NONE', it has been > changed. > > Suggested-by: Dave Young This was suggested by Boris instead :) > Signed-off-by: Lianbo Jiang > --- > arch/ia64/kernel/efi.c | 4 > arch/x86/kernel/e820.c | 2 +- > arch/x86/mm/ioremap.c | 13 - > include/linux/ioport.h | 1 + > kernel/resource.c | 6 +++--- > 5 files changed, 21 insertions(+), 5 deletions(-) > > [snip] Thanks Dave
Re: EFI-pstore, sleeping from back context after fault
On Mon, Nov 26, 2018 at 9:04 AM Sebastian Andrzej Siewior wrote: > > So I triggered a bug in x86 code. First the "okay" backtrace: > |BUG: GPF in non-whitelisted uaccess (non-canonical address?) > |general protection fault: [#1] PREEMPT SMP NOPTI > |CPU: 26 PID: 2236 Comm: sig-xstate-bum Not tainted 4.20.0-rc3 #45 > |RIP: 0010:__fpu__restore_sig+0x1c1/0x540 > |Call Trace: > | fpu__restore_sig+0x28/0x40 > | restore_sigcontext+0x13a/0x180 > | __ia32_sys_rt_sigreturn+0xae/0x100 > | do_syscall_64+0x4f/0x100 > | entry_SYSCALL_64_after_hwframe+0x44/0xa9 > |RIP: 0033:0x7f9b06aea227 > |---[ end trace a45ac23b593e9ab0 ]--- This bug got handled by Jann Horn, yes? (I remember seeing a related thread go by...) > > and now the not okay part: > > |BUG: sleeping function called from invalid context at > kernel/sched/completion.c:99 > |in_atomic(): 1, irqs_disabled(): 1, pid: 2236, name: sig-xstate-bum > |Preemption disabled at: > |[] pstore_dump+0x72/0x330 > |CPU: 26 PID: 2236 Comm: sig-xstate-bum Tainted: G D > 4.20.0-rc3 #45 > |Call Trace: > | dump_stack+0x4f/0x6a > | ___might_sleep.cold.91+0xd3/0xe4 > | __might_sleep+0x50/0x90 > | wait_for_completion+0x32/0x130 > | virt_efi_query_variable_info+0x14e/0x160 > | efi_query_variable_store+0x51/0x1a0 > | efivar_entry_set_safe+0xa3/0x1b0 > | efi_pstore_write+0x109/0x140 Eww. :) > | pstore_dump+0x11c/0x330 > | kmsg_dump+0xa4/0xd0 > | oops_exit+0x22/0x30 > | oops_end+0x67/0xd0 > | die+0x41/0x4a > | do_general_protection+0xc1/0x150 > | general_protection+0x1e/0x30 > |RIP: 0010:__fpu__restore_sig+0x1c1/0x540 > | fpu__restore_sig+0x28/0x40 > | restore_sigcontext+0x13a/0x180 > | __ia32_sys_rt_sigreturn+0xae/0x100 > | do_syscall_64+0x4f/0x100 > | entry_SYSCALL_64_after_hwframe+0x44/0xa9 > |RIP: 0033:0x7f9b06aea227 > … [ moar backtrace ] > > This looks like it comes from commit 21b3ddd39fee ("efi: Don't use > spinlocks for efi vars") because it replaced a spin_lock() with > down_interruptible() in this case. In this case, since pstore_dump() > holds psinfo->buf_lock with disabled interrupts we can't block… > > I have this as a workaround: I'm not sure this is correct... > diff --git a/drivers/firmware/efi/vars.c b/drivers/firmware/efi/vars.c > index 9336ffdf6e2c..d4dcfe46eb2e 100644 > --- a/drivers/firmware/efi/vars.c > +++ b/drivers/firmware/efi/vars.c > @@ -755,7 +755,9 @@ int efivar_entry_set_safe(efi_char16_t *name, efi_guid_t > vendor, u32 attributes, > return -EINTR; > } > > - status = check_var_size(attributes, size + ucs2_strsize(name, 1024)); > + status = ops->query_variable_store(attributes, > + size + ucs2_strsize(name, 1024), > + block); Why does this change help? > if (status != EFI_SUCCESS) { > up(_lock); > return -ENOSPC; > diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c > index b821054ca3ed..9aa27a2c8d36 100644 > --- a/fs/pstore/platform.c > +++ b/fs/pstore/platform.c > @@ -127,10 +127,10 @@ static const char *get_reason_str(enum kmsg_dump_reason > reason) > bool pstore_cannot_block_path(enum kmsg_dump_reason reason) > { > /* > -* In case of NMI path, pstore shouldn't be blocked > -* regardless of reason. > +* In case of disabled preemption / interrupts we can't block on any > +* lock regardless of reason. > */ > - if (in_nmi()) > + if (in_atomic() || irqs_disabled()) > return true; > > switch (reason) { I'd like to avoid any case where pstore might "miss" a crash report, though... this seems to make it easier for it to fail to record a crash, yes? I have some pending clean-ups in this area (buf_lock may not be needed at all, actually), so I'll try to work through those too. -Kees -- Kees Cook
[PATCH] efi: let kmemleak ignore false positives
unreferenced object 0x8096c1acf580 (size 128): comm "swapper/63", pid 0, jiffies 4294937418 (age 1201.230s) hex dump (first 32 bytes): 80 87 b5 c1 96 00 00 00 00 00 cc c2 16 00 00 00 00 00 01 00 00 00 00 00 6b 6b 6b 6b 6b 6b 6b 6b backtrace: [<1d2549ba>] kmem_cache_alloc_trace+0x430/0x500 [<93a6dfab>] efi_mem_reserve_persistent+0x50/0xf8 [<0a730828>] its_cpu_init_lpis+0x394/0x4b8 [] its_cpu_init+0x104/0x150 [<4d0342c5>] gic_starting_cpu+0x34/0x40 [<5d9da772>] cpuhp_invoke_callback+0x228/0x1d68 [<61eace9b>] notify_cpu_starting+0xc0/0x118 [<48bc2dc5>] secondary_start_kernel+0x23c/0x3b0 [<15137d6a>] 0x efi_mem_reserve_persistent+0x50/0xf8: kmalloc at include/linux/slab.h:546 (inlined by) efi_mem_reserve_persistent at drivers/firmware/efi/efi.c:979 This line, rsv = kmalloc(sizeof(*rsv), GFP_ATOMIC); Kmemleak has a known limitation that can only track pointers in the kernel virtual space. Hence, it will report false positives due to "rsv" will only reference to other physical addresses, rsv->next = efi_memreserve_root->next; efi_memreserve_root->next = __pa(rsv); Signed-off-by: Qian Cai --- drivers/firmware/efi/efi.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c index fad7c62..0b69bb6 100644 --- a/drivers/firmware/efi/efi.c +++ b/drivers/firmware/efi/efi.c @@ -31,6 +31,7 @@ #include #include #include +#include #include @@ -980,6 +981,8 @@ int efi_mem_reserve_persistent(phys_addr_t addr, u64 size) if (!rsv) return -ENOMEM; + kmemleak_ignore(rsv); + rsv->base = addr; rsv->size = size; -- 1.8.3.1
RE: [GIT PULL 00/11] EFI updates
Hi Ard, While building from next branch of efi tree, I noticed the below warning. Could you please check the same on your side? CC lib/list_debug.o drivers/firmware/efi/efi.c: In function 'efi_mem_reserve_persistent': drivers/firmware/efi/efi.c:1000:6: warning: unused variable 'rsvsize' [-Wunused-variable] int rsvsize = EFI_MEMRESERVE_SIZE(1); ^~~ CC lib/bitrev.o CC net/core/sock_reuseport.o > -Original Message- > From: Ard Biesheuvel [mailto:ard.biesheu...@linaro.org] > Sent: Thursday, November 29, 2018 9:12 AM > To: linux-efi@vger.kernel.org; Ingo Molnar ; Thomas > Gleixner > Cc: Ard Biesheuvel ; linux-ker...@vger.kernel.org; > Andy Lutomirski ; Arend van Spriel > ; Bhupesh Sharma ; > Borislav Petkov ; Hansen, Dave ; Eric > Snowberg ; Hans de Goede > ; Joe Perches ; Jon Hunter > ; Julien Thierry ; Marc > Zyngier ; Nathan Chancellor > ; Peter Zijlstra ; Prakhya, > Sai Praneeth ; Sedat Dilek > ; YiFei Zhu > Subject: [GIT PULL 00/11] EFI updates > > The following changes since commit > 976b489120cdab2b1b3a41ffa14661db43d58190: > > efi: Prevent GICv3 WARN() by mapping the memreserve table before first use > (2018-11-27 13:50:20 +0100) > > are available in the Git repository at: > > git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi.git tags/efi-next > > for you to fetch changes up to 1d29afdbf7ae878a23627ebee81efcd213f11749: > > efi/x86: earlyprintk - Fix infinite loop on some screen widths (2018-11-28 > 17:58:42 +0100) > Regards, Sai
[PATCH 10/11] efi: reduce the amount of memblock reservations for persistent allocations
The current implementation of efi_mem_reserve_persistent() is rather naive, in the sense that for each invocation, it creates a separate linked list entry to describe the reservation. Since the linked list entries themselves need to persist across subsequent kexec reboots, every reservation created this way results in two memblock_reserve() calls at the next boot. On arm64 systems with 100s of CPUs, this may result in a excessive number of memblock reservations, and needless fragmentation. So instead, make use of the newly updated struct linux_efi_memreserve layout to put multiple reservations into a single linked list entry. This should get rid of the numerous tiny memblock reservations, and effectively cut the total number of reservations in half on arm64 systems with many CPUs. Tested-by: Marc Zyngier Signed-off-by: Ard Biesheuvel --- drivers/firmware/efi/efi.c | 20 +--- include/linux/efi.h| 3 +++ 2 files changed, 20 insertions(+), 3 deletions(-) diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c index 80b11521627a..e90bc32c2670 100644 --- a/drivers/firmware/efi/efi.c +++ b/drivers/firmware/efi/efi.c @@ -998,7 +998,8 @@ int __ref efi_mem_reserve_persistent(phys_addr_t addr, u64 size) { struct linux_efi_memreserve *rsv; int rsvsize = EFI_MEMRESERVE_SIZE(1); - int rc; + unsigned long prsv; + int rc, index; if (efi_memreserve_root == (void *)ULONG_MAX) return -ENODEV; @@ -1009,11 +1010,24 @@ int __ref efi_mem_reserve_persistent(phys_addr_t addr, u64 size) return rc; } - rsv = kmalloc(rsvsize, GFP_ATOMIC); + /* first try to find a slot in an existing linked list entry */ + for (prsv = efi_memreserve_root->next; prsv; prsv = rsv->next) { + rsv = __va(prsv); + index = atomic_fetch_add_unless(>count, 1, rsv->size); + if (index < rsv->size) { + rsv->entry[index].base = addr; + rsv->entry[index].size = size; + + return 0; + } + } + + /* no slot found - allocate a new linked list entry */ + rsv = (struct linux_efi_memreserve *)__get_free_page(GFP_ATOMIC); if (!rsv) return -ENOMEM; - rsv->size = 1; + rsv->size = EFI_MEMRESERVE_COUNT(PAGE_SIZE); atomic_set(>count, 1); rsv->entry[0].base = addr; rsv->entry[0].size = size; diff --git a/include/linux/efi.h b/include/linux/efi.h index 4f27640fdcdc..becd5d76a207 100644 --- a/include/linux/efi.h +++ b/include/linux/efi.h @@ -1724,4 +1724,7 @@ struct linux_efi_memreserve { #define EFI_MEMRESERVE_SIZE(count) (sizeof(struct linux_efi_memreserve) + \ (count) * sizeof(((struct linux_efi_memreserve *)0)->entry[0])) +#define EFI_MEMRESERVE_COUNT(size) (((size) - sizeof(struct linux_efi_memreserve)) \ + / sizeof(((struct linux_efi_memreserve *)0)->entry[0])) + #endif /* _LINUX_EFI_H */ -- 2.19.1
[PATCH 04/11] x86/mm/pageattr: Introduce helper function to unmap EFI boot services
From: Sai Praneeth Prakhya Ideally, after kernel assumes control of the platform, firmware shouldn't access EFI boot services code/data regions. But, it's noticed that this is not so true in many x86 platforms. Hence, during boot, kernel reserves EFI boot services code/data regions [1] and maps [2] them to efi_pgd so that call to set_virtual_address_map() doesn't fail. After returning from set_virtual_address_map(), kernel frees the reserved regions [3] but they still remain mapped. Hence, introduce kernel_unmap_pages_in_pgd() which will later be used to unmap EFI boot services code/data regions. While at it modify kernel_map_pages_in_pgd() by 1. Adding __init modifier because it's always used *only* during boot. 2. Add a warning if it's used after SMP is initialized because it uses __flush_tlb_all() which flushes mappings only on current CPU. Unmapping EFI boot services code/data regions will result in clearing PAGE_PRESENT bit and it shouldn't bother L1TF cases because it's already handled by protnone_mask() at arch/x86/include/asm/pgtable-invert.h. [1] efi_reserve_boot_services() [2] efi_map_region() -> __map_region() -> kernel_map_pages_in_pgd() [3] efi_free_boot_services() Signed-off-by: Sai Praneeth Prakhya Cc: Borislav Petkov Cc: Ingo Molnar Cc: Andy Lutomirski Cc: Dave Hansen Cc: Bhupesh Sharma Cc: Peter Zijlstra Reviewed-by: Thomas Gleixner Signed-off-by: Ard Biesheuvel --- arch/x86/include/asm/pgtable_types.h | 8 -- arch/x86/mm/pageattr.c | 40 ++-- 2 files changed, 44 insertions(+), 4 deletions(-) diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h index 106b7d0e2dae..d6ff0bbdb394 100644 --- a/arch/x86/include/asm/pgtable_types.h +++ b/arch/x86/include/asm/pgtable_types.h @@ -564,8 +564,12 @@ extern pte_t *lookup_address_in_pgd(pgd_t *pgd, unsigned long address, unsigned int *level); extern pmd_t *lookup_pmd_address(unsigned long address); extern phys_addr_t slow_virt_to_phys(void *__address); -extern int kernel_map_pages_in_pgd(pgd_t *pgd, u64 pfn, unsigned long address, - unsigned numpages, unsigned long page_flags); +extern int __init kernel_map_pages_in_pgd(pgd_t *pgd, u64 pfn, + unsigned long address, + unsigned numpages, + unsigned long page_flags); +extern int __init kernel_unmap_pages_in_pgd(pgd_t *pgd, unsigned long address, + unsigned long numpages); #endif /* !__ASSEMBLY__ */ #endif /* _ASM_X86_PGTABLE_DEFS_H */ diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c index db7a10082238..bac35001d896 100644 --- a/arch/x86/mm/pageattr.c +++ b/arch/x86/mm/pageattr.c @@ -2338,8 +2338,8 @@ bool kernel_page_present(struct page *page) #endif /* CONFIG_DEBUG_PAGEALLOC */ -int kernel_map_pages_in_pgd(pgd_t *pgd, u64 pfn, unsigned long address, - unsigned numpages, unsigned long page_flags) +int __init kernel_map_pages_in_pgd(pgd_t *pgd, u64 pfn, unsigned long address, + unsigned numpages, unsigned long page_flags) { int retval = -EINVAL; @@ -2353,6 +2353,8 @@ int kernel_map_pages_in_pgd(pgd_t *pgd, u64 pfn, unsigned long address, .flags = 0, }; + WARN_ONCE(num_online_cpus() > 1, "Don't call after initializing SMP"); + if (!(__supported_pte_mask & _PAGE_NX)) goto out; @@ -2374,6 +2376,40 @@ int kernel_map_pages_in_pgd(pgd_t *pgd, u64 pfn, unsigned long address, return retval; } +/* + * __flush_tlb_all() flushes mappings only on current CPU and hence this + * function shouldn't be used in an SMP environment. Presently, it's used only + * during boot (way before smp_init()) by EFI subsystem and hence is ok. + */ +int __init kernel_unmap_pages_in_pgd(pgd_t *pgd, unsigned long address, +unsigned long numpages) +{ + int retval; + + /* +* The typical sequence for unmapping is to find a pte through +* lookup_address_in_pgd() (ideally, it should never return NULL because +* the address is already mapped) and change it's protections. As pfn is +* the *target* of a mapping, it's not useful while unmapping. +*/ + struct cpa_data cpa = { + .vaddr = , + .pfn= 0, + .pgd= pgd, + .numpages = numpages, + .mask_set = __pgprot(0), + .mask_clr = __pgprot(_PAGE_PRESENT | _PAGE_RW), + .flags = 0, + }; + + WARN_ONCE(num_online_cpus() > 1, "Don't call after initializing SMP"); + + retval = __change_page_attr_set_clr(, 0); + __flush_tlb_all(); + + return
[PATCH 09/11] efi: permit multiple entries in persistent memreserve data structure
In preparation of updating efi_mem_reserve_persistent() to cause less fragmentation when dealing with many persistent reservations, update the struct definition and the code that handles it currently so it can describe an arbitrary number of reservations using a single linked list entry. The actual optimization will be implemented in a subsequent patch. Tested-by: Marc Zyngier Signed-off-by: Ard Biesheuvel --- drivers/firmware/efi/efi.c | 39 + drivers/firmware/efi/libstub/arm-stub.c | 2 +- include/linux/efi.h | 13 +++-- 3 files changed, 38 insertions(+), 16 deletions(-) diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c index 415849bab233..80b11521627a 100644 --- a/drivers/firmware/efi/efi.c +++ b/drivers/firmware/efi/efi.c @@ -602,21 +602,33 @@ int __init efi_apply_persistent_mem_reservations(void) while (prsv) { struct linux_efi_memreserve *rsv; - - /* reserve the entry itself */ - memblock_reserve(prsv, sizeof(*rsv)); - - rsv = early_memremap(prsv, sizeof(*rsv)); - if (rsv == NULL) { + u8 *p; + int i; + + /* +* Just map a full page: that is what we will get +* anyway, and it permits us to map the entire entry +* before knowing its size. +*/ + p = early_memremap(ALIGN_DOWN(prsv, PAGE_SIZE), + PAGE_SIZE); + if (p == NULL) { pr_err("Could not map UEFI memreserve entry!\n"); return -ENOMEM; } - if (rsv->size) - memblock_reserve(rsv->base, rsv->size); + rsv = (void *)(p + prsv % PAGE_SIZE); + + /* reserve the entry itself */ + memblock_reserve(prsv, EFI_MEMRESERVE_SIZE(rsv->size)); + + for (i = 0; i < atomic_read(>count); i++) { + memblock_reserve(rsv->entry[i].base, +rsv->entry[i].size); + } prsv = rsv->next; - early_memunmap(rsv, sizeof(*rsv)); + early_memunmap(p, PAGE_SIZE); } } @@ -985,6 +997,7 @@ static int __init efi_memreserve_map_root(void) int __ref efi_mem_reserve_persistent(phys_addr_t addr, u64 size) { struct linux_efi_memreserve *rsv; + int rsvsize = EFI_MEMRESERVE_SIZE(1); int rc; if (efi_memreserve_root == (void *)ULONG_MAX) @@ -996,12 +1009,14 @@ int __ref efi_mem_reserve_persistent(phys_addr_t addr, u64 size) return rc; } - rsv = kmalloc(sizeof(*rsv), GFP_ATOMIC); + rsv = kmalloc(rsvsize, GFP_ATOMIC); if (!rsv) return -ENOMEM; - rsv->base = addr; - rsv->size = size; + rsv->size = 1; + atomic_set(>count, 1); + rsv->entry[0].base = addr; + rsv->entry[0].size = size; spin_lock(_mem_reserve_persistent_lock); rsv->next = efi_memreserve_root->next; diff --git a/drivers/firmware/efi/libstub/arm-stub.c b/drivers/firmware/efi/libstub/arm-stub.c index 3d36142cf812..9e20159ea5f5 100644 --- a/drivers/firmware/efi/libstub/arm-stub.c +++ b/drivers/firmware/efi/libstub/arm-stub.c @@ -86,8 +86,8 @@ void install_memreserve_table(efi_system_table_t *sys_table_arg) } rsv->next = 0; - rsv->base = 0; rsv->size = 0; + atomic_set(>count, 0); status = efi_call_early(install_configuration_table, _table_guid, diff --git a/include/linux/efi.h b/include/linux/efi.h index 2b3b33c83b05..4f27640fdcdc 100644 --- a/include/linux/efi.h +++ b/include/linux/efi.h @@ -1712,9 +1712,16 @@ extern struct efi_runtime_work efi_rts_work; extern struct workqueue_struct *efi_rts_wq; struct linux_efi_memreserve { - phys_addr_t next; - phys_addr_t base; - phys_addr_t size; + int size; // allocated size of the array + atomic_tcount; // number of entries used + phys_addr_t next; // pa of next struct instance + struct { + phys_addr_t base; + phys_addr_t size; + } entry[0]; }; +#define EFI_MEMRESERVE_SIZE(count) (sizeof(struct linux_efi_memreserve) + \ + (count) * sizeof(((struct linux_efi_memreserve *)0)->entry[0])) + #endif /* _LINUX_EFI_H */ -- 2.19.1
[PATCH 01/11] x86/efi: Allocate e820 buffer before calling efi_exit_boot_service
From: Eric Snowberg Commit d64934019f6c ("x86/efi: Use efi_exit_boot_services()") introduced a regression on systems with large memory maps causing them to hang on boot. The first "goto get_map" that was removed from exit_boot insured there was enough room for the memory map when efi_call_early(exit_boot_services) was called. This happens when (nr_desc > ARRAY_SIZE(params->e820_table). Chain of events: exit_boot() efi_exit_boot_services() efi_get_memory_map <- at this point the mm can't grow over 8 desc priv_func() exit_boot_func() allocate_e820ext() <- new mm grows over 8 desc from e820 alloc efi_call_early(exit_boot_services) <- mm key doesn't match so retry efi_call_early(get_memory_map) <- not enough room for new mm system hangs This patch allocates the e820 buffer before calling efi_exit_boot_services and fixes the regression. Signed-off-by: Eric Snowberg Signed-off-by: Ard Biesheuvel --- arch/x86/boot/compressed/eboot.c | 65 1 file changed, 41 insertions(+), 24 deletions(-) diff --git a/arch/x86/boot/compressed/eboot.c b/arch/x86/boot/compressed/eboot.c index 8b4c5e001157..f7bad07bb251 100644 --- a/arch/x86/boot/compressed/eboot.c +++ b/arch/x86/boot/compressed/eboot.c @@ -1,3 +1,4 @@ + /* --- * * Copyright 2011 Intel Corporation; author Matt Fleming @@ -634,37 +635,54 @@ static efi_status_t alloc_e820ext(u32 nr_desc, struct setup_data **e820ext, return status; } +static efi_status_t allocate_e820(struct boot_params *params, + struct setup_data **e820ext, + u32 *e820ext_size) +{ + unsigned long map_size, desc_size, buff_size; + struct efi_boot_memmap boot_map; + efi_memory_desc_t *map; + efi_status_t status; + __u32 nr_desc; + + boot_map.map = + boot_map.map_size = _size; + boot_map.desc_size =_size; + boot_map.desc_ver = NULL; + boot_map.key_ptr = NULL; + boot_map.buff_size =_size; + + status = efi_get_memory_map(sys_table, _map); + if (status != EFI_SUCCESS) + return status; + + nr_desc = buff_size / desc_size; + + if (nr_desc > ARRAY_SIZE(params->e820_table)) { + u32 nr_e820ext = nr_desc - ARRAY_SIZE(params->e820_table); + + status = alloc_e820ext(nr_e820ext, e820ext, e820ext_size); + if (status != EFI_SUCCESS) + return status; + } + + return EFI_SUCCESS; +} + struct exit_boot_struct { struct boot_params *boot_params; struct efi_info *efi; - struct setup_data *e820ext; - __u32 e820ext_size; }; static efi_status_t exit_boot_func(efi_system_table_t *sys_table_arg, struct efi_boot_memmap *map, void *priv) { - static bool first = true; const char *signature; __u32 nr_desc; efi_status_t status; struct exit_boot_struct *p = priv; - if (first) { - nr_desc = *map->buff_size / *map->desc_size; - if (nr_desc > ARRAY_SIZE(p->boot_params->e820_table)) { - u32 nr_e820ext = nr_desc - - ARRAY_SIZE(p->boot_params->e820_table); - - status = alloc_e820ext(nr_e820ext, >e820ext, - >e820ext_size); - if (status != EFI_SUCCESS) - return status; - } - first = false; - } - signature = efi_is_64bit() ? EFI64_LOADER_SIGNATURE : EFI32_LOADER_SIGNATURE; memcpy(>efi->efi_loader_signature, signature, sizeof(__u32)); @@ -687,8 +705,8 @@ static efi_status_t exit_boot(struct boot_params *boot_params, void *handle) { unsigned long map_sz, key, desc_size, buff_size; efi_memory_desc_t *mem_map; - struct setup_data *e820ext; - __u32 e820ext_size; + struct setup_data *e820ext = NULL; + __u32 e820ext_size = 0; efi_status_t status; __u32 desc_version; struct efi_boot_memmap map; @@ -702,8 +720,10 @@ static efi_status_t exit_boot(struct boot_params *boot_params, void *handle) map.buff_size = _size; priv.boot_params= boot_params; priv.efi= _params->efi_info; - priv.e820ext= NULL; - priv.e820ext_size = 0; + + status = allocate_e820(boot_params, , _size); + if (status != EFI_SUCCESS) + return status; /* Might as well exit boot services now */ status = efi_exit_boot_services(sys_table, handle, , , @@ -711,9 +731,6 @@ static
[PATCH 02/11] efi/fdt: Indentation fix
From: Julien Thierry Closing bracket seems to end a for statement when it is actually ending the contained if. Add some brackets to have clear delimitation of each scope. No functional change/fix, just fix the indentation. Signed-off-by: Julien Thierry Signed-off-by: Ard Biesheuvel --- drivers/firmware/efi/libstub/fdt.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/firmware/efi/libstub/fdt.c b/drivers/firmware/efi/libstub/fdt.c index 0c0d2312f4a8..a3614f9b5f75 100644 --- a/drivers/firmware/efi/libstub/fdt.c +++ b/drivers/firmware/efi/libstub/fdt.c @@ -376,7 +376,7 @@ void *get_fdt(efi_system_table_t *sys_table, unsigned long *fdt_size) tables = (efi_config_table_t *) sys_table->tables; fdt = NULL; - for (i = 0; i < sys_table->nr_tables; i++) + for (i = 0; i < sys_table->nr_tables; i++) { if (efi_guidcmp(tables[i].guid, fdt_guid) == 0) { fdt = (void *) tables[i].table; if (fdt_check_header(fdt) != 0) { @@ -385,7 +385,8 @@ void *get_fdt(efi_system_table_t *sys_table, unsigned long *fdt_size) } *fdt_size = fdt_totalsize(fdt); break; -} + } + } return fdt; } -- 2.19.1
[PATCH 06/11] x86/efi: Move efi__boot_services() to arch/x86
From: Sai Praneeth Prakhya efi__boot_services() are x86 specific quirks and as such should be in asm/efi.h, so move them from linux/efi.h. Also, call efi_free_boot_services() from __efi_enter_virtual_mode() as it is x86 specific call and ideally shouldn't be part of init/main.c Signed-off-by: Sai Praneeth Prakhya Cc: Borislav Petkov Cc: Ingo Molnar Cc: Andy Lutomirski Cc: Dave Hansen Cc: Bhupesh Sharma Cc: Peter Zijlstra Acked-by: Thomas Gleixner Signed-off-by: Ard Biesheuvel --- arch/x86/include/asm/efi.h | 2 ++ arch/x86/platform/efi/efi.c | 2 ++ include/linux/efi.h | 3 --- init/main.c | 4 4 files changed, 4 insertions(+), 7 deletions(-) diff --git a/arch/x86/include/asm/efi.h b/arch/x86/include/asm/efi.h index eea40d52ca78..d1e64ac80b9c 100644 --- a/arch/x86/include/asm/efi.h +++ b/arch/x86/include/asm/efi.h @@ -141,6 +141,8 @@ extern int __init efi_reuse_config(u64 tables, int nr_tables); extern void efi_delete_dummy_variable(void); extern void efi_switch_mm(struct mm_struct *mm); extern void efi_recover_from_page_fault(unsigned long phys_addr); +extern void efi_free_boot_services(void); +extern void efi_reserve_boot_services(void); struct efi_setup_data { u64 fw_vendor; diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c index 7ae939e353cd..e1cb01a22fa8 100644 --- a/arch/x86/platform/efi/efi.c +++ b/arch/x86/platform/efi/efi.c @@ -993,6 +993,8 @@ static void __init __efi_enter_virtual_mode(void) panic("EFI call to SetVirtualAddressMap() failed!"); } + efi_free_boot_services(); + /* * Now that EFI is in virtual mode, update the function * pointers in the runtime service table to the new virtual addresses. diff --git a/include/linux/efi.h b/include/linux/efi.h index 100ce4a4aff6..2b3b33c83b05 100644 --- a/include/linux/efi.h +++ b/include/linux/efi.h @@ -1000,13 +1000,11 @@ extern void efi_memmap_walk (efi_freemem_callback_t callback, void *arg); extern void efi_gettimeofday (struct timespec64 *ts); extern void efi_enter_virtual_mode (void); /* switch EFI to virtual mode, if possible */ #ifdef CONFIG_X86 -extern void efi_free_boot_services(void); extern efi_status_t efi_query_variable_store(u32 attributes, unsigned long size, bool nonblocking); extern void efi_find_mirror(void); #else -static inline void efi_free_boot_services(void) {} static inline efi_status_t efi_query_variable_store(u32 attributes, unsigned long size, @@ -1046,7 +1044,6 @@ extern void efi_mem_reserve(phys_addr_t addr, u64 size); extern int efi_mem_reserve_persistent(phys_addr_t addr, u64 size); extern void efi_initialize_iomem_resources(struct resource *code_resource, struct resource *data_resource, struct resource *bss_resource); -extern void efi_reserve_boot_services(void); extern int efi_get_fdt_params(struct efi_fdt_params *params); extern struct kobject *efi_kobj; diff --git a/init/main.c b/init/main.c index ee147103ba1b..ccefcd8e855f 100644 --- a/init/main.c +++ b/init/main.c @@ -737,10 +737,6 @@ asmlinkage __visible void __init start_kernel(void) arch_post_acpi_subsys_init(); sfi_init_late(); - if (efi_enabled(EFI_RUNTIME_SERVICES)) { - efi_free_boot_services(); - } - /* Do the rest non-__init'ed, we're now alive */ arch_call_rest_init(); } -- 2.19.1
[PATCH 08/11] firmware: efi: add NULL pointer checks in efivars api functions
From: Arend van Spriel Since commit: ce2e6db554fa ("brcmfmac: Add support for getting nvram contents from EFI variables") we have a device driver accessing the efivars API. Several functions in the efivars API assume __efivars is set, i.e., that they will be accessed only after efivars_register() has been called. However, the following NULL pointer access was reported calling efivar_entry_size() from the brcmfmac device driver. Unable to handle kernel NULL pointer dereference at virtual address 0008 pgd = 60bfa5f1 [0008] *pgd= Internal error: Oops: 5 [#1] SMP ARM ... Hardware name: NVIDIA Tegra SoC (Flattened Device Tree) Workqueue: events request_firmware_work_func PC is at efivar_entry_size+0x28/0x90 LR is at brcmf_fw_complete_request+0x3f8/0x8d4 [brcmfmac] pc : []lr : []psr: a00d0113 sp : ede7fe28 ip : ee983410 fp : c1787f30 r10: r9 : r8 : bf2b2258 r7 : ee983000 r6 : c1604c48 r5 : ede7fe88 r4 : edf337c0 r3 : r2 : r1 : ede7fe88 r0 : c17712c8 Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none Control: 10c5387d Table: ad16804a DAC: 0051 Disassembly showed that the local static variable __efivars is NULL, which is not entirely unexpected given that it is a non-EFI platform. So add a NULL pointer check to efivar_entry_size(), and to related functions while at it. In efivars_register() a couple of sanity checks are added as well. Cc: Hans de Goede Reported-by: Jon Hunter Signed-off-by: Arend van Spriel Signed-off-by: Ard Biesheuvel --- drivers/firmware/efi/vars.c | 99 + 1 file changed, 78 insertions(+), 21 deletions(-) diff --git a/drivers/firmware/efi/vars.c b/drivers/firmware/efi/vars.c index 9336ffdf6e2c..fceaafd67ec6 100644 --- a/drivers/firmware/efi/vars.c +++ b/drivers/firmware/efi/vars.c @@ -318,7 +318,12 @@ EXPORT_SYMBOL_GPL(efivar_variable_is_removable); static efi_status_t check_var_size(u32 attributes, unsigned long size) { - const struct efivar_operations *fops = __efivars->ops; + const struct efivar_operations *fops; + + if (!__efivars) + return EFI_UNSUPPORTED; + + fops = __efivars->ops; if (!fops->query_variable_store) return EFI_UNSUPPORTED; @@ -329,7 +334,12 @@ check_var_size(u32 attributes, unsigned long size) static efi_status_t check_var_size_nonblocking(u32 attributes, unsigned long size) { - const struct efivar_operations *fops = __efivars->ops; + const struct efivar_operations *fops; + + if (!__efivars) + return EFI_UNSUPPORTED; + + fops = __efivars->ops; if (!fops->query_variable_store) return EFI_UNSUPPORTED; @@ -429,13 +439,18 @@ static void dup_variable_bug(efi_char16_t *str16, efi_guid_t *vendor_guid, int efivar_init(int (*func)(efi_char16_t *, efi_guid_t, unsigned long, void *), void *data, bool duplicates, struct list_head *head) { - const struct efivar_operations *ops = __efivars->ops; + const struct efivar_operations *ops; unsigned long variable_name_size = 1024; efi_char16_t *variable_name; efi_status_t status; efi_guid_t vendor_guid; int err = 0; + if (!__efivars) + return -EFAULT; + + ops = __efivars->ops; + variable_name = kzalloc(variable_name_size, GFP_KERNEL); if (!variable_name) { printk(KERN_ERR "efivars: Memory allocation failed.\n"); @@ -583,12 +598,14 @@ static void efivar_entry_list_del_unlock(struct efivar_entry *entry) */ int __efivar_entry_delete(struct efivar_entry *entry) { - const struct efivar_operations *ops = __efivars->ops; efi_status_t status; - status = ops->set_variable(entry->var.VariableName, - >var.VendorGuid, - 0, 0, NULL); + if (!__efivars) + return -EINVAL; + + status = __efivars->ops->set_variable(entry->var.VariableName, + >var.VendorGuid, + 0, 0, NULL); return efi_status_to_err(status); } @@ -607,12 +624,17 @@ EXPORT_SYMBOL_GPL(__efivar_entry_delete); */ int efivar_entry_delete(struct efivar_entry *entry) { - const struct efivar_operations *ops = __efivars->ops; + const struct efivar_operations *ops; efi_status_t status; if (down_interruptible(_lock)) return -EINTR; + if (!__efivars) { + up(_lock); + return -EINVAL; + } + ops = __efivars->ops; status = ops->set_variable(entry->var.VariableName, >var.VendorGuid, 0, 0, NULL); @@ -650,13 +672,19 @@ EXPORT_SYMBOL_GPL(efivar_entry_delete); int
[PATCH 07/11] efi/libstub: Disable some warnings for x86{,_64}
From: Nathan Chancellor When building the kernel with Clang, some disabled warnings appear because this Makefile overrides KBUILD_CFLAGS for x86{,_64}. Add them to this list so that the build is clean again. -Wpointer-sign was disabled for the whole kernel before the beginning of git history. -Waddress-of-packed-member was disabled for the whole kernel in commit bfb38988c51e ("kbuild: clang: Disable 'address-of-packed-member' warning") and for x86/boot/compressed in commit 20c6c1890455 ("x86/boot: Disable the address-of-packed-member compiler warning"). -Wgnu was disabled for the whole kernel in commit 61163efae020 ("kbuild: LLVMLinux: Add Kbuild support for building kernel with Clang") and for x86/boot/compressed in commit 6c3b56b19730 ("x86/boot: Disable Clang warnings about GNU extensions"). Link: https://github.com/ClangBuiltLinux/linux/issues/112 Signed-off-by: Nathan Chancellor Reviewed-by: Sedat Dilek Tested-by: Sedat Dilek Signed-off-by: Ard Biesheuvel --- drivers/firmware/efi/libstub/Makefile | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile index c51627660dbb..d9845099635e 100644 --- a/drivers/firmware/efi/libstub/Makefile +++ b/drivers/firmware/efi/libstub/Makefile @@ -9,7 +9,10 @@ cflags-$(CONFIG_X86_32):= -march=i386 cflags-$(CONFIG_X86_64):= -mcmodel=small cflags-$(CONFIG_X86) += -m$(BITS) -D__KERNEL__ -O2 \ -fPIC -fno-strict-aliasing -mno-red-zone \ - -mno-mmx -mno-sse -fshort-wchar + -mno-mmx -mno-sse -fshort-wchar \ + -Wno-pointer-sign \ + $(call cc-disable-warning, address-of-packed-member) \ + $(call cc-disable-warning, gnu) # arm64 uses the full KBUILD_CFLAGS so it's necessary to explicitly # disable the stackleak plugin -- 2.19.1
[PATCH 05/11] x86/efi: Unmap EFI boot services code/data regions from efi_pgd
From: Sai Praneeth Prakhya efi_free_boot_services(), as the name suggests, frees EFI boot services code/data regions but forgets to unmap these regions from efi_pgd. This means that any code that's running in efi_pgd address space (e.g: any EFI runtime service) would still be able to access these regions but the contents of these regions would have long been over written by someone else. So, it's important to unmap these regions. Hence, introduce efi_unmap_pages() to unmap these regions from efi_pgd. After unmapping EFI boot services code/data regions, any illegal access by buggy firmware to these regions would result in page fault which will be handled by EFI specific fault handler. Signed-off-by: Sai Praneeth Prakhya Cc: Borislav Petkov Cc: Ingo Molnar Cc: Andy Lutomirski Cc: Dave Hansen Cc: Bhupesh Sharma Cc: Peter Zijlstra Acked-by: Thomas Gleixner Signed-off-by: Ard Biesheuvel --- arch/x86/platform/efi/quirks.c | 25 + 1 file changed, 25 insertions(+) diff --git a/arch/x86/platform/efi/quirks.c b/arch/x86/platform/efi/quirks.c index 95e77a667ba5..09e811b9da26 100644 --- a/arch/x86/platform/efi/quirks.c +++ b/arch/x86/platform/efi/quirks.c @@ -369,6 +369,24 @@ void __init efi_reserve_boot_services(void) } } +/* + * Apart from having VA mappings for EFI boot services code/data regions, + * (duplicate) 1:1 mappings were also created as a quirk for buggy firmware. So, + * unmap both 1:1 and VA mappings. + */ +static void __init efi_unmap_pages(efi_memory_desc_t *md) +{ + pgd_t *pgd = efi_mm.pgd; + u64 pa = md->phys_addr; + u64 va = md->virt_addr; + + if (kernel_unmap_pages_in_pgd(pgd, pa, md->num_pages)) + pr_err("Failed to unmap 1:1 mapping for 0x%llx\n", pa); + + if (kernel_unmap_pages_in_pgd(pgd, va, md->num_pages)) + pr_err("Failed to unmap VA mapping for 0x%llx\n", va); +} + void __init efi_free_boot_services(void) { phys_addr_t new_phys, new_size; @@ -393,6 +411,13 @@ void __init efi_free_boot_services(void) continue; } + /* +* Before calling set_virtual_address_map(), EFI boot services +* code/data regions were mapped as a quirk for buggy firmware. +* Unmap them from efi_pgd before freeing them up. +*/ + efi_unmap_pages(md); + /* * Nasty quirk: if all sub-1MB memory is used for boot * services, we can get here without having allocated the -- 2.19.1
[PATCH 03/11] efi/fdt: Simplify get_fdt flow
From: Julien Thierry Reorganize get_fdt lookup loop, clearly showing that: - Nothing is done for table entries that do not have fdt_guid - Once an entry with fdt_guid is found, break out of the loop No functional changes. Suggested-by: Joe Perches Signed-off-by: Julien Thierry Signed-off-by: Ard Biesheuvel --- drivers/firmware/efi/libstub/fdt.c | 25 + 1 file changed, 13 insertions(+), 12 deletions(-) diff --git a/drivers/firmware/efi/libstub/fdt.c b/drivers/firmware/efi/libstub/fdt.c index a3614f9b5f75..0dc7b4987cc2 100644 --- a/drivers/firmware/efi/libstub/fdt.c +++ b/drivers/firmware/efi/libstub/fdt.c @@ -370,23 +370,24 @@ void *get_fdt(efi_system_table_t *sys_table, unsigned long *fdt_size) { efi_guid_t fdt_guid = DEVICE_TREE_GUID; efi_config_table_t *tables; - void *fdt; int i; - tables = (efi_config_table_t *) sys_table->tables; - fdt = NULL; + tables = (efi_config_table_t *)sys_table->tables; for (i = 0; i < sys_table->nr_tables; i++) { - if (efi_guidcmp(tables[i].guid, fdt_guid) == 0) { - fdt = (void *) tables[i].table; - if (fdt_check_header(fdt) != 0) { - pr_efi_err(sys_table, "Invalid header detected on UEFI supplied FDT, ignoring ...\n"); - return NULL; - } - *fdt_size = fdt_totalsize(fdt); - break; + void *fdt; + + if (efi_guidcmp(tables[i].guid, fdt_guid) != 0) + continue; + + fdt = (void *)tables[i].table; + if (fdt_check_header(fdt) != 0) { + pr_efi_err(sys_table, "Invalid header detected on UEFI supplied FDT, ignoring ...\n"); + return NULL; } + *fdt_size = fdt_totalsize(fdt); + return fdt; } - return fdt; + return NULL; } -- 2.19.1
[GIT PULL 00/11] EFI updates
The following changes since commit 976b489120cdab2b1b3a41ffa14661db43d58190: efi: Prevent GICv3 WARN() by mapping the memreserve table before first use (2018-11-27 13:50:20 +0100) are available in the Git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi.git tags/efi-next for you to fetch changes up to 1d29afdbf7ae878a23627ebee81efcd213f11749: efi/x86: earlyprintk - Fix infinite loop on some screen widths (2018-11-28 17:58:42 +0100) EFI updates for v4.21: - Allocate the E820 buffer before doing the GetMemoryMap/ExitBootServices dance so we don't run out of space (Eric) - Clear EFI boot services mappings when freeing the memory (Sai) - Harden efivars against callers that invoke it on non-EFI boots (Arend) - Reduce the number of memblock reservations resulting from extensive use of the new efi_mem_reserve_persistent() API (Ard) - Other assorted fixes and cleanups. Ard Biesheuvel (2): efi: permit multiple entries in persistent memreserve data structure efi: reduce the amount of memblock reservations for persistent allocations Arend van Spriel (1): firmware: efi: add NULL pointer checks in efivars api functions Eric Snowberg (1): x86/efi: Allocate e820 buffer before calling efi_exit_boot_service Julien Thierry (2): efi/fdt: Indentation fix efi/fdt: Simplify get_fdt flow Nathan Chancellor (1): efi/libstub: Disable some warnings for x86{,_64} Sai Praneeth Prakhya (3): x86/mm/pageattr: Introduce helper function to unmap EFI boot services x86/efi: Unmap EFI boot services code/data regions from efi_pgd x86/efi: Move efi__boot_services() to arch/x86 YiFei Zhu (1): efi/x86: earlyprintk - Fix infinite loop on some screen widths arch/x86/boot/compressed/eboot.c| 65 ++ arch/x86/include/asm/efi.h | 2 + arch/x86/include/asm/pgtable_types.h| 8 ++- arch/x86/mm/pageattr.c | 40 - arch/x86/platform/efi/early_printk.c| 2 +- arch/x86/platform/efi/efi.c | 2 + arch/x86/platform/efi/quirks.c | 25 + drivers/firmware/efi/efi.c | 55 +- drivers/firmware/efi/libstub/Makefile | 5 +- drivers/firmware/efi/libstub/arm-stub.c | 2 +- drivers/firmware/efi/libstub/fdt.c | 30 +- drivers/firmware/efi/vars.c | 99 ++--- include/linux/efi.h | 19 +-- init/main.c | 4 -- 14 files changed, 269 insertions(+), 89 deletions(-)
Re: [PATCH 4.9 39/92] efi/arm: Revert deferred unmap of early memmap mapping
On Thu, 29 Nov 2018 at 16:02, Greg Kroah-Hartman wrote: > > On Thu, Nov 29, 2018 at 03:28:44PM +0100, Ard Biesheuvel wrote: > > On Thu, 29 Nov 2018 at 15:21, Greg Kroah-Hartman > > wrote: > > > > > > 4.9-stable review patch. If anyone has any objections, please let me > > > know. > > > > > > -- > > > > > > [ Upstream commit 33412b8673135b18ea42beb7f5117ed0091798b6 ] > > > > > > Commit: > > > > > > 3ea86495aef2 ("efi/arm: preserve early mapping of UEFI memory map > > > longer for BGRT") > > > > > > deferred the unmap of the early mapping of the UEFI memory map to > > > accommodate the ACPI BGRT code, which looks up the memory type that > > > backs the BGRT table to validate it against the requirements of the UEFI > > > spec. > > > > > > Unfortunately, this causes problems on ARM, which does not permit > > > early mappings to persist after paging_init() is called, resulting > > > in a WARN() splat. Since we don't support the BGRT table on ARM anway, > > > let's revert ARM to the old behaviour, which is to take down the > > > early mapping at the end of efi_init(). > > > > > > Signed-off-by: Ard Biesheuvel > > > Cc: Linus Torvalds > > > Cc: Peter Zijlstra > > > Cc: Thomas Gleixner > > > Cc: linux-efi@vger.kernel.org > > > Fixes: 3ea86495aef2 ("efi/arm: preserve early mapping of UEFI memory ...") > > > > This commit is only in v4.19 as far as I know. Does it even apply? > > This commit is in the following releases: > 4.9.129 4.14.72 4.18.10 4.19 > > so it should apply :) > Ah, of course. I didn't realize (or failed to remember) that it had been taken into -stable. Thanks,
Re: [PATCH 4.9 39/92] efi/arm: Revert deferred unmap of early memmap mapping
On Thu, Nov 29, 2018 at 03:28:44PM +0100, Ard Biesheuvel wrote: > On Thu, 29 Nov 2018 at 15:21, Greg Kroah-Hartman > wrote: > > > > 4.9-stable review patch. If anyone has any objections, please let me know. > > > > -- > > > > [ Upstream commit 33412b8673135b18ea42beb7f5117ed0091798b6 ] > > > > Commit: > > > > 3ea86495aef2 ("efi/arm: preserve early mapping of UEFI memory map longer > > for BGRT") > > > > deferred the unmap of the early mapping of the UEFI memory map to > > accommodate the ACPI BGRT code, which looks up the memory type that > > backs the BGRT table to validate it against the requirements of the UEFI > > spec. > > > > Unfortunately, this causes problems on ARM, which does not permit > > early mappings to persist after paging_init() is called, resulting > > in a WARN() splat. Since we don't support the BGRT table on ARM anway, > > let's revert ARM to the old behaviour, which is to take down the > > early mapping at the end of efi_init(). > > > > Signed-off-by: Ard Biesheuvel > > Cc: Linus Torvalds > > Cc: Peter Zijlstra > > Cc: Thomas Gleixner > > Cc: linux-efi@vger.kernel.org > > Fixes: 3ea86495aef2 ("efi/arm: preserve early mapping of UEFI memory ...") > > This commit is only in v4.19 as far as I know. Does it even apply? This commit is in the following releases: 4.9.129 4.14.72 4.18.10 4.19 so it should apply :) thanks, greg k-h
[PATCH 4.9 39/92] efi/arm: Revert deferred unmap of early memmap mapping
4.9-stable review patch. If anyone has any objections, please let me know. -- [ Upstream commit 33412b8673135b18ea42beb7f5117ed0091798b6 ] Commit: 3ea86495aef2 ("efi/arm: preserve early mapping of UEFI memory map longer for BGRT") deferred the unmap of the early mapping of the UEFI memory map to accommodate the ACPI BGRT code, which looks up the memory type that backs the BGRT table to validate it against the requirements of the UEFI spec. Unfortunately, this causes problems on ARM, which does not permit early mappings to persist after paging_init() is called, resulting in a WARN() splat. Since we don't support the BGRT table on ARM anway, let's revert ARM to the old behaviour, which is to take down the early mapping at the end of efi_init(). Signed-off-by: Ard Biesheuvel Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: linux-efi@vger.kernel.org Fixes: 3ea86495aef2 ("efi/arm: preserve early mapping of UEFI memory ...") Link: http://lkml.kernel.org/r/20181114175544.12860-3-ard.biesheu...@linaro.org Signed-off-by: Ingo Molnar Signed-off-by: Sasha Levin --- drivers/firmware/efi/arm-init.c| 4 drivers/firmware/efi/arm-runtime.c | 2 +- drivers/firmware/efi/memmap.c | 3 +++ 3 files changed, 8 insertions(+), 1 deletion(-) diff --git a/drivers/firmware/efi/arm-init.c b/drivers/firmware/efi/arm-init.c index 1d1c9693ebfb..8ee91777abce 100644 --- a/drivers/firmware/efi/arm-init.c +++ b/drivers/firmware/efi/arm-init.c @@ -256,6 +256,10 @@ void __init efi_init(void) (params.mmap & ~PAGE_MASK))); init_screen_info(); + + /* ARM does not permit early mappings to persist across paging_init() */ + if (IS_ENABLED(CONFIG_ARM)) + efi_memmap_unmap(); } static int __init register_gop_device(void) diff --git a/drivers/firmware/efi/arm-runtime.c b/drivers/firmware/efi/arm-runtime.c index 4d788e0debfe..069c5a4479e6 100644 --- a/drivers/firmware/efi/arm-runtime.c +++ b/drivers/firmware/efi/arm-runtime.c @@ -118,7 +118,7 @@ static int __init arm_enable_runtime_services(void) { u64 mapsize; - if (!efi_enabled(EFI_BOOT) || !efi_enabled(EFI_MEMMAP)) { + if (!efi_enabled(EFI_BOOT)) { pr_info("EFI services will not be available.\n"); return 0; } diff --git a/drivers/firmware/efi/memmap.c b/drivers/firmware/efi/memmap.c index 78686443cb37..3fd2b450c649 100644 --- a/drivers/firmware/efi/memmap.c +++ b/drivers/firmware/efi/memmap.c @@ -117,6 +117,9 @@ int __init efi_memmap_init_early(struct efi_memory_map_data *data) void __init efi_memmap_unmap(void) { + if (!efi_enabled(EFI_MEMMAP)) + return; + if (!efi.memmap.late) { unsigned long size; -- 2.17.1
[PATCH 4.14 061/100] efi/arm: Revert deferred unmap of early memmap mapping
4.14-stable review patch. If anyone has any objections, please let me know. -- [ Upstream commit 33412b8673135b18ea42beb7f5117ed0091798b6 ] Commit: 3ea86495aef2 ("efi/arm: preserve early mapping of UEFI memory map longer for BGRT") deferred the unmap of the early mapping of the UEFI memory map to accommodate the ACPI BGRT code, which looks up the memory type that backs the BGRT table to validate it against the requirements of the UEFI spec. Unfortunately, this causes problems on ARM, which does not permit early mappings to persist after paging_init() is called, resulting in a WARN() splat. Since we don't support the BGRT table on ARM anway, let's revert ARM to the old behaviour, which is to take down the early mapping at the end of efi_init(). Signed-off-by: Ard Biesheuvel Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: linux-efi@vger.kernel.org Fixes: 3ea86495aef2 ("efi/arm: preserve early mapping of UEFI memory ...") Link: http://lkml.kernel.org/r/20181114175544.12860-3-ard.biesheu...@linaro.org Signed-off-by: Ingo Molnar Signed-off-by: Sasha Levin --- drivers/firmware/efi/arm-init.c| 4 drivers/firmware/efi/arm-runtime.c | 2 +- drivers/firmware/efi/memmap.c | 3 +++ 3 files changed, 8 insertions(+), 1 deletion(-) diff --git a/drivers/firmware/efi/arm-init.c b/drivers/firmware/efi/arm-init.c index a7c522eac640..312f9f32e168 100644 --- a/drivers/firmware/efi/arm-init.c +++ b/drivers/firmware/efi/arm-init.c @@ -265,6 +265,10 @@ void __init efi_init(void) (params.mmap & ~PAGE_MASK))); init_screen_info(); + + /* ARM does not permit early mappings to persist across paging_init() */ + if (IS_ENABLED(CONFIG_ARM)) + efi_memmap_unmap(); } static int __init register_gop_device(void) diff --git a/drivers/firmware/efi/arm-runtime.c b/drivers/firmware/efi/arm-runtime.c index 8995a48bd067..ad1530aff633 100644 --- a/drivers/firmware/efi/arm-runtime.c +++ b/drivers/firmware/efi/arm-runtime.c @@ -122,7 +122,7 @@ static int __init arm_enable_runtime_services(void) { u64 mapsize; - if (!efi_enabled(EFI_BOOT) || !efi_enabled(EFI_MEMMAP)) { + if (!efi_enabled(EFI_BOOT)) { pr_info("EFI services will not be available.\n"); return 0; } diff --git a/drivers/firmware/efi/memmap.c b/drivers/firmware/efi/memmap.c index 5fc70520e04c..1907db2b38d8 100644 --- a/drivers/firmware/efi/memmap.c +++ b/drivers/firmware/efi/memmap.c @@ -118,6 +118,9 @@ int __init efi_memmap_init_early(struct efi_memory_map_data *data) void __init efi_memmap_unmap(void) { + if (!efi_enabled(EFI_MEMMAP)) + return; + if (!efi.memmap.late) { unsigned long size; -- 2.17.1
Re: [PATCH 4.9 39/92] efi/arm: Revert deferred unmap of early memmap mapping
On Thu, 29 Nov 2018 at 15:21, Greg Kroah-Hartman wrote: > > 4.9-stable review patch. If anyone has any objections, please let me know. > > -- > > [ Upstream commit 33412b8673135b18ea42beb7f5117ed0091798b6 ] > > Commit: > > 3ea86495aef2 ("efi/arm: preserve early mapping of UEFI memory map longer > for BGRT") > > deferred the unmap of the early mapping of the UEFI memory map to > accommodate the ACPI BGRT code, which looks up the memory type that > backs the BGRT table to validate it against the requirements of the UEFI spec. > > Unfortunately, this causes problems on ARM, which does not permit > early mappings to persist after paging_init() is called, resulting > in a WARN() splat. Since we don't support the BGRT table on ARM anway, > let's revert ARM to the old behaviour, which is to take down the > early mapping at the end of efi_init(). > > Signed-off-by: Ard Biesheuvel > Cc: Linus Torvalds > Cc: Peter Zijlstra > Cc: Thomas Gleixner > Cc: linux-efi@vger.kernel.org > Fixes: 3ea86495aef2 ("efi/arm: preserve early mapping of UEFI memory ...") This commit is only in v4.19 as far as I know. Does it even apply? > Link: > http://lkml.kernel.org/r/20181114175544.12860-3-ard.biesheu...@linaro.org > Signed-off-by: Ingo Molnar > Signed-off-by: Sasha Levin > --- > drivers/firmware/efi/arm-init.c| 4 > drivers/firmware/efi/arm-runtime.c | 2 +- > drivers/firmware/efi/memmap.c | 3 +++ > 3 files changed, 8 insertions(+), 1 deletion(-) > > diff --git a/drivers/firmware/efi/arm-init.c b/drivers/firmware/efi/arm-init.c > index 1d1c9693ebfb..8ee91777abce 100644 > --- a/drivers/firmware/efi/arm-init.c > +++ b/drivers/firmware/efi/arm-init.c > @@ -256,6 +256,10 @@ void __init efi_init(void) > (params.mmap & ~PAGE_MASK))); > > init_screen_info(); > + > + /* ARM does not permit early mappings to persist across paging_init() > */ > + if (IS_ENABLED(CONFIG_ARM)) > + efi_memmap_unmap(); > } > > static int __init register_gop_device(void) > diff --git a/drivers/firmware/efi/arm-runtime.c > b/drivers/firmware/efi/arm-runtime.c > index 4d788e0debfe..069c5a4479e6 100644 > --- a/drivers/firmware/efi/arm-runtime.c > +++ b/drivers/firmware/efi/arm-runtime.c > @@ -118,7 +118,7 @@ static int __init arm_enable_runtime_services(void) > { > u64 mapsize; > > - if (!efi_enabled(EFI_BOOT) || !efi_enabled(EFI_MEMMAP)) { > + if (!efi_enabled(EFI_BOOT)) { > pr_info("EFI services will not be available.\n"); > return 0; > } > diff --git a/drivers/firmware/efi/memmap.c b/drivers/firmware/efi/memmap.c > index 78686443cb37..3fd2b450c649 100644 > --- a/drivers/firmware/efi/memmap.c > +++ b/drivers/firmware/efi/memmap.c > @@ -117,6 +117,9 @@ int __init efi_memmap_init_early(struct > efi_memory_map_data *data) > > void __init efi_memmap_unmap(void) > { > + if (!efi_enabled(EFI_MEMMAP)) > + return; > + > if (!efi.memmap.late) { > unsigned long size; > > -- > 2.17.1 > > >
[PATCH 4.19 094/110] efi/arm: Revert deferred unmap of early memmap mapping
4.19-stable review patch. If anyone has any objections, please let me know. -- [ Upstream commit 33412b8673135b18ea42beb7f5117ed0091798b6 ] Commit: 3ea86495aef2 ("efi/arm: preserve early mapping of UEFI memory map longer for BGRT") deferred the unmap of the early mapping of the UEFI memory map to accommodate the ACPI BGRT code, which looks up the memory type that backs the BGRT table to validate it against the requirements of the UEFI spec. Unfortunately, this causes problems on ARM, which does not permit early mappings to persist after paging_init() is called, resulting in a WARN() splat. Since we don't support the BGRT table on ARM anway, let's revert ARM to the old behaviour, which is to take down the early mapping at the end of efi_init(). Signed-off-by: Ard Biesheuvel Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: linux-efi@vger.kernel.org Fixes: 3ea86495aef2 ("efi/arm: preserve early mapping of UEFI memory ...") Link: http://lkml.kernel.org/r/20181114175544.12860-3-ard.biesheu...@linaro.org Signed-off-by: Ingo Molnar Signed-off-by: Sasha Levin --- drivers/firmware/efi/arm-init.c| 4 drivers/firmware/efi/arm-runtime.c | 2 +- drivers/firmware/efi/memmap.c | 3 +++ 3 files changed, 8 insertions(+), 1 deletion(-) diff --git a/drivers/firmware/efi/arm-init.c b/drivers/firmware/efi/arm-init.c index 388a929baf95..1a6a77df8a5e 100644 --- a/drivers/firmware/efi/arm-init.c +++ b/drivers/firmware/efi/arm-init.c @@ -265,6 +265,10 @@ void __init efi_init(void) (params.mmap & ~PAGE_MASK))); init_screen_info(); + + /* ARM does not permit early mappings to persist across paging_init() */ + if (IS_ENABLED(CONFIG_ARM)) + efi_memmap_unmap(); } static int __init register_gop_device(void) diff --git a/drivers/firmware/efi/arm-runtime.c b/drivers/firmware/efi/arm-runtime.c index 922cfb813109..a00934d263c5 100644 --- a/drivers/firmware/efi/arm-runtime.c +++ b/drivers/firmware/efi/arm-runtime.c @@ -110,7 +110,7 @@ static int __init arm_enable_runtime_services(void) { u64 mapsize; - if (!efi_enabled(EFI_BOOT) || !efi_enabled(EFI_MEMMAP)) { + if (!efi_enabled(EFI_BOOT)) { pr_info("EFI services will not be available.\n"); return 0; } diff --git a/drivers/firmware/efi/memmap.c b/drivers/firmware/efi/memmap.c index 5fc70520e04c..1907db2b38d8 100644 --- a/drivers/firmware/efi/memmap.c +++ b/drivers/firmware/efi/memmap.c @@ -118,6 +118,9 @@ int __init efi_memmap_init_early(struct efi_memory_map_data *data) void __init efi_memmap_unmap(void) { + if (!efi_enabled(EFI_MEMMAP)) + return; + if (!efi.memmap.late) { unsigned long size; -- 2.17.1
[PATCH 0/2 v8] add reserved e820 ranges to the kdump kernel e820 table
This patchset did two things: a). add a new I/O resource descriptor 'IORES_DESC_RESERVED' When doing kexec_file_load, the first kernel needs to pass the e820 reserved ranges to the second kernel. But kernel can not exactly match the e820 reserved ranges when walking through the iomem resources with the descriptor 'IORES_DESC_NONE', because several e820 types( e.g. E820_TYPE_RESERVED_KERN/E820_TYPE_RAM/E820_TYPE_UNUSABLE/E820 _TYPE_RESERVED) are converted to the descriptor 'IORES_DESC_NONE'. It may pass these four types to the kdump kernel, that is not desired result. So, this patch adds a new I/O resource descriptor 'IORES_DESC_RESERVED' for the iomem resources search interfaces. It is helpful to exactly match the reserved resource ranges when walking through iomem resources. In addition, since the new descriptor 'IORES_DESC_RESERVED' is introduced, these code originally related to the descriptor 'IORES_DESC_NONE' need to be updated. Otherwise, it will be easily confused and also cause some errors. Because the 'E820_TYPE_RESERVED' type is converted to the new descriptor 'IORES_DESC_RESERVED' instead of 'IORES_DESC_NONE', it has been changed. b). add the e820 reserved ranges to kdump kernel e820 table At present, when use the kexec_file_load syscall to load the kernel image and initramfs(for example: kexec -s -p xxx), kernel does not pass the e820 reserved ranges to the second kernel, which might cause two problems: The first one is the MMCONFIG issue. The basic problem is that this device is in PCI segment 1 and the kernel PCI probing can not find it without all the e820 I/O reservations being present in the e820 table. And the kdump kernel does not have those reservations because the kexec command does not pass the I/O reservation via the "memmap=xxx" command line option. (This problem does not show up for other vendors, as SGI is apparently the actually fails for everyone, but devices in segment 0 are then found by some legacy lookup method.) The workaround for this is to pass the I/O reserved regions to the kdump kernel. MMCONFIG(aka ECAM) space is described in the ACPI MCFG table. If you don't have ECAM: (a) PCI devices won't work at all on non-x86 systems that use only ECAM for config access, (b) you won't be albe to access devices on non-0 segments, (c) you won't be able to access extended config space( address 0x100-0x), which means none of the Extended Capabilities will be available(AER, ACS, ATS, etc). [Bjorn's comment] The second issue is that the SME kdump kernel doesn't work without the e820 reserved ranges. When SME is active in kdump kernel, actually, those reserved regions are still decrypted, but because those reserved ranges are not present at all in kdump kernel e820 table, those reserved regions are considered as encrypted, it goes wrong. The e820 reserved range is useful in kdump kernel, so it is necessary to pass the e820 reserved ranges to kdump kernel. Changes since v1: 1. Modified the value of flags to "0", when walking through the whole tree for e820 reserved ranges. Changes since v2: 1. Modified the value of flags to "0", when walking through the whole tree for e820 reserved ranges. 2. Modified the invalid SOB chain issue. Changes since v3: 1. Dropped [PATCH 1/3 v3] resource: fix an error which walks through iomem resources. Please refer to this commit <010a93bf97c7> "resource: Fix find_next_iomem_res() iteration issue" Changes since v4: 1. Improve the patch log, and add kernel log. Changes since v5: 1. Rewrite these patches log. Changes since v6: 1. Modify the [PATCH 1/2], and add the new I/O resource descriptor 'IORES_DESC_RESERVED' for the iomem resources search interfaces, and also updates these codes relates to 'IORES_DESC_NONE'. 2. Modify the [PATCH 2/2], and walk through io resource based on the new descriptor 'IORES_DESC_RESERVED'. 3. Update patch log. Changes since v7: 1. Improve patch log. 2. Improve this function __ioremap_check_desc_other(). 3. Modify code comment in the __ioremap_check_desc_other() Lianbo Jiang (2): resource: add the new I/O resource descriptor 'IORES_DESC_RESERVED' x86/kexec_file: add reserved e820 ranges to kdump kernel e820 table arch/ia64/kernel/efi.c | 4 arch/x86/kernel/crash.c | 6 ++ arch/x86/kernel/e820.c | 2 +- arch/x86/mm/ioremap.c | 13 - include/linux/ioport.h | 1 + kernel/resource.c | 6 +++--- 6 files changed, 27 insertions(+), 5 deletions(-) -- 2.17.1
[PATCH 1/2 v8] resource: add the new I/O resource descriptor 'IORES_DESC_RESERVED'
When doing kexec_file_load, the first kernel needs to pass the e820 reserved ranges to the second kernel. But kernel can not exactly match the e820 reserved ranges when walking through the iomem resources with the descriptor 'IORES_DESC_NONE', because several e820 types( e.g. E820_TYPE_RESERVED_KERN/E820_TYPE_RAM/E820_TYPE_UNUSABLE/E820 _TYPE_RESERVED) are converted to the descriptor 'IORES_DESC_NONE'. It may pass these four types to the kdump kernel, that is not desired result. So, this patch adds a new I/O resource descriptor 'IORES_DESC_RESERVED' for the iomem resources search interfaces. It is helpful to exactly match the reserved resource ranges when walking through iomem resources. In addition, since the new descriptor 'IORES_DESC_RESERVED' is introduced, these code originally related to the descriptor 'IORES_DESC_NONE' need to be updated. Otherwise, it will be easily confused and also cause some errors. Because the 'E820_TYPE_RESERVED' type is converted to the new descriptor 'IORES_DESC_RESERVED' instead of 'IORES_DESC_NONE', it has been changed. Suggested-by: Dave Young Signed-off-by: Lianbo Jiang --- arch/ia64/kernel/efi.c | 4 arch/x86/kernel/e820.c | 2 +- arch/x86/mm/ioremap.c | 13 - include/linux/ioport.h | 1 + kernel/resource.c | 6 +++--- 5 files changed, 21 insertions(+), 5 deletions(-) diff --git a/arch/ia64/kernel/efi.c b/arch/ia64/kernel/efi.c index 8f106638913c..1841e9b4db30 100644 --- a/arch/ia64/kernel/efi.c +++ b/arch/ia64/kernel/efi.c @@ -1231,6 +1231,10 @@ efi_initialize_iomem_resources(struct resource *code_resource, break; case EFI_RESERVED_TYPE: + name = "reserved"; + desc = IORES_DESC_RESERVED; + break; + case EFI_RUNTIME_SERVICES_CODE: case EFI_RUNTIME_SERVICES_DATA: case EFI_ACPI_RECLAIM_MEMORY: diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c index 50895c2f937d..57fafdafb860 100644 --- a/arch/x86/kernel/e820.c +++ b/arch/x86/kernel/e820.c @@ -1048,10 +1048,10 @@ static unsigned long __init e820_type_to_iores_desc(struct e820_entry *entry) case E820_TYPE_NVS: return IORES_DESC_ACPI_NV_STORAGE; case E820_TYPE_PMEM:return IORES_DESC_PERSISTENT_MEMORY; case E820_TYPE_PRAM:return IORES_DESC_PERSISTENT_MEMORY_LEGACY; + case E820_TYPE_RESERVED:return IORES_DESC_RESERVED; case E820_TYPE_RESERVED_KERN: /* Fall-through: */ case E820_TYPE_RAM: /* Fall-through: */ case E820_TYPE_UNUSABLE:/* Fall-through: */ - case E820_TYPE_RESERVED:/* Fall-through: */ default:return IORES_DESC_NONE; } } diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c index 5378d10f1d31..fea2ef99415d 100644 --- a/arch/x86/mm/ioremap.c +++ b/arch/x86/mm/ioremap.c @@ -83,7 +83,18 @@ static bool __ioremap_check_ram(struct resource *res) static int __ioremap_check_desc_other(struct resource *res) { - return (res->desc != IORES_DESC_NONE); + /* +* But now, the 'E820_TYPE_RESERVED' type is converted to the new +* descriptor 'IORES_DESC_RESERVED' instead of 'IORES_DESC_NONE', +* it has been changed. And the value of 'mem_flags.desc_other' +* is equal to 'true' if we don't strengthen the condition in this +* function, that is wrong. Because originally it is equal to +* 'false' for the same reserved type. +* +* So, that would be nice to keep it the same as before. +*/ + return ((res->desc != IORES_DESC_NONE) && + (res->desc != IORES_DESC_RESERVED)); } static int __ioremap_res_check(struct resource *res, void *arg) diff --git a/include/linux/ioport.h b/include/linux/ioport.h index da0ebaec25f0..6ed59de48bd5 100644 --- a/include/linux/ioport.h +++ b/include/linux/ioport.h @@ -133,6 +133,7 @@ enum { IORES_DESC_PERSISTENT_MEMORY_LEGACY = 5, IORES_DESC_DEVICE_PRIVATE_MEMORY= 6, IORES_DESC_DEVICE_PUBLIC_MEMORY = 7, + IORES_DESC_RESERVED = 8, }; /* helpers to define resources */ diff --git a/kernel/resource.c b/kernel/resource.c index b0fbf685c77a..f34a632c4169 100644 --- a/kernel/resource.c +++ b/kernel/resource.c @@ -994,7 +994,7 @@ __reserve_region_with_split(struct resource *root, resource_size_t start, res->start = start; res->end = end; res->flags = type | IORESOURCE_BUSY; - res->desc = IORES_DESC_NONE; + res->desc = IORES_DESC_RESERVED; while (1) { @@ -1029,7 +1029,7 @@ __reserve_region_with_split(struct resource *root, resource_size_t start, next_res->start = conflict->end + 1;
[PATCH 2/2 v8] x86/kexec_file: add reserved e820 ranges to kdump kernel e820 table
At present, when use the kexec_file_load syscall to load the kernel image and initramfs(for example: kexec -s -p xxx), kernel does not pass the e820 reserved ranges to the second kernel, which might cause two problems: The first one is the MMCONFIG issue. The basic problem is that this device is in PCI segment 1 and the kernel PCI probing can not find it without all the e820 I/O reservations being present in the e820 table. And the kdump kernel does not have those reservations because the kexec command does not pass the I/O reservation via the "memmap=xxx" command line option. (This problem does not show up for other vendors, as SGI is apparently the actually fails for everyone, but devices in segment 0 are then found by some legacy lookup method.) The workaround for this is to pass the I/O reserved regions to the kdump kernel. MMCONFIG(aka ECAM) space is described in the ACPI MCFG table. If you don't have ECAM: (a) PCI devices won't work at all on non-x86 systems that use only ECAM for config access, (b) you won't be albe to access devices on non-0 segments, (c) you won't be able to access extended config space( address 0x100-0x), which means none of the Extended Capabilities will be available(AER, ACS, ATS, etc). [Bjorn's comment] The second issue is that the SME kdump kernel doesn't work without the e820 reserved ranges. When SME is active in kdump kernel, actually, those reserved regions are still decrypted, but because those reserved ranges are not present at all in kdump kernel e820 table, those reserved regions are considered as encrypted, it goes wrong. The e820 reserved range is useful in kdump kernel, so it is necessary to pass the e820 reserved ranges to kdump kernel. Suggested-by: Dave Young Signed-off-by: Lianbo Jiang --- arch/x86/kernel/crash.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c index f631a3f15587..5354a84f1684 100644 --- a/arch/x86/kernel/crash.c +++ b/arch/x86/kernel/crash.c @@ -380,6 +380,12 @@ int crash_setup_memmap_entries(struct kimage *image, struct boot_params *params) walk_iomem_res_desc(IORES_DESC_ACPI_NV_STORAGE, flags, 0, -1, , memmap_entry_callback); + /* Add e820 reserved ranges */ + cmd.type = E820_TYPE_RESERVED; + flags = IORESOURCE_MEM; + walk_iomem_res_desc(IORES_DESC_RESERVED, flags, 0, -1, , + memmap_entry_callback); + /* Add crashk_low_res region */ if (crashk_low_res.end) { ei.addr = crashk_low_res.start; -- 2.17.1