Re: [PATCH v8 01/15] x86/boot: Place kernel_info at a fixed offset

2024-03-22 Thread Daniel P. Smith

On 3/22/24 10:18, H. Peter Anvin wrote:
> On March 21, 2024 6:45:48 AM PDT, "Daniel P. Smith" wrote:
>> Hi Ard!
>>
>> On 2/15/24 02:56, Ard Biesheuvel wrote:
>>> On Wed, 14 Feb 2024 at 23:31, Ross Philipson wrote:
>>>>
>>>> From: Arvind Sankar
>>>>
>>>> There are use cases for storing the offset of a symbol in kernel_info.
>>>> For example, the trenchboot series [0] needs to store the offset of the
>>>> Measured Launch Environment header in kernel_info.
>>>>
>>>
>>> Why? Is this information consumed by the bootloader?
>>
>> Yes, the bootloader needs a standardized means to find the offset of the MLE
>> header, which communicates a set of meta-data needed by the DCE in order to
>> set up for and start the loaded kernel. Arm will also need to provide a
>> similar metadata structure and alternative entry point (or a complete rewrite
>> of the existing entry point), as the current Arm entry point is in direct
>> conflict with the Arm DRTM specification.
>>
>>> I'd like to get away from x86 specific hacks for boot code and boot
>>> images, so I would like to explore if we can avoid kernel_info, or at
>>> least expose it in a generic way. We might just add a 32-bit offset
>>> somewhere in the first 64 bytes of the bootable image: this could
>>> co-exist with EFI bootable images, and can be implemented on arm64,
>>> RISC-V and LoongArch as well.
>>
>> With all due respect, I would not refer to boot params and the kernel_info
>> extension designed by the x86 maintainers as a hack. It is the well-defined
>> boot protocol for x86, just as Arm has its own boot protocol around Device
>> Tree.
>>
>> We would gladly adopt a cross arch/cross image type (zImage and bzImage) means
>> to embed meta-data about the kernel that can be discovered by a bootloader.
>> Otherwise, we are relegated to doing a per arch/per image type discovery
>> mechanism. If you have any suggestions that are cross arch/cross image type
>> that we could explore, we would be grateful and willing to investigate how to
>> adopt such a method.
>>
>> V/r,
>> Daniel
>
> To be fair, the way things are going UEFI, i.e. PE/COFF, is becoming the new
> standard format. Yes, ELF would have been better, but...

Fully agree with the ELF sentiment. We started looking to see if PE/COFF
has something similar to an ELF NOTE, but figured maybe this has been
solved for other cases. If that is not the case or there are not any
suggestions, then we can see what we can devise.
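
For illustration only, here is a minimal sketch of the "32-bit offset
somewhere in the first 64 bytes of the bootable image" idea quoted above, as
seen from the bootloader's side. The field position (0x30), the little-endian
encoding, and the name read_meta_offset() are all assumptions made for the
example; nothing in this thread settles any of them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical position of the 32-bit offset field; not defined by any spec. */
#define META_OFFSET_POS 0x30u

/*
 * Read a little-endian 32-bit metadata offset from a fixed position inside
 * the first 64 bytes of an image. Returns 0 and stores the offset in *out on
 * success; returns -1 if the image is too short to hold the field or the
 * offset points outside the image.
 */
int read_meta_offset(const uint8_t *image, size_t size, uint32_t *out)
{
	uint32_t off;

	assert(image != NULL && out != NULL);

	if (size < META_OFFSET_POS + 4u)
		return -1;

	/* Assemble the value byte by byte so host endianness does not matter. */
	off = (uint32_t)image[META_OFFSET_POS] |
	      ((uint32_t)image[META_OFFSET_POS + 1] << 8) |
	      ((uint32_t)image[META_OFFSET_POS + 2] << 16) |
	      ((uint32_t)image[META_OFFSET_POS + 3] << 24);

	if (off >= size)
		return -1;

	*out = off;
	return 0;
}
```

A loader would call this on the mapped image and then parse whatever
structure lives at the returned offset. The bounds check matters because the
offset comes from an untrusted file.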




[GIT PULL] Please pull powerpc/linux.git powerpc-6.9-2 tag

2024-03-22 Thread Michael Ellerman

Hi Linus,

Please pull some more powerpc updates for 6.9. These were posted before the
merge window but had complicated dependencies and/or conflicts with other
content that has gone into 6.9.

cheers

The following changes since commit 66a27abac311a30edbbb65fe8c41ff1b13876faa:

  Merge tag 'powerpc-6.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux (2024-03-15 17:53:48 -0700)

are available in the git repository at:

  https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git tags/powerpc-6.9-2

for you to fetch changes up to 5c4233cc0920cc90787aafe950b90f6c57a35b88:

  powerpc/kdump: Split KEXEC_CORE and CRASH_DUMP dependency (2024-03-17 13:34:00 +1100)

- --
powerpc updates for 6.9 #2

 - Handle errors in mark_rodata_ro() and mark_initmem_nx().

 - Make struct crash_mem available without CONFIG_CRASH_DUMP.

Thanks to: Christophe Leroy, Hari Bathini.

- --
Christophe Leroy (1):
  powerpc: Handle error in mark_rodata_ro() and mark_initmem_nx()

Hari Bathini (3):
  kexec/kdump: make struct crash_mem available without CONFIG_CRASH_DUMP
  powerpc/kexec: split CONFIG_KEXEC_FILE and CONFIG_CRASH_DUMP
  powerpc/kdump: Split KEXEC_CORE and CRASH_DUMP dependency


 arch/powerpc/Kconfig                 |   9 +-
 arch/powerpc/include/asm/kexec.h     |  98 +++
 arch/powerpc/kernel/prom.c           |   2 +-
 arch/powerpc/kernel/setup-common.c   |   2 +-
 arch/powerpc/kernel/smp.c            |   4 +-
 arch/powerpc/kexec/Makefile          |   3 +-
 arch/powerpc/kexec/core.c            |   4 +
 arch/powerpc/kexec/elf_64.c          |   4 +-
 arch/powerpc/kexec/file_load_64.c    | 269 ++--
 arch/powerpc/mm/book3s32/mmu.c       |   7 +-
 arch/powerpc/mm/mmu_decl.h           |   8 +-
 arch/powerpc/mm/nohash/8xx.c         |  33 ++-
 arch/powerpc/mm/nohash/e500.c        |  10 +-
 arch/powerpc/mm/pgtable_32.c         |  38 ++-
 arch/powerpc/platforms/powernv/smp.c |   2 +-
 include/linux/crash_core.h           |  12 +-
 16 files changed, 274 insertions(+), 231 deletions(-)

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH linux-next v2 0/3] powerpc/kexec: split CONFIG_CRASH_DUMP out from CONFIG_KEXEC_CORE

2024-03-22 Thread Michael Ellerman
On Mon, 26 Feb 2024 16:00:07 +0530, Hari Bathini wrote:
> This patch series is a follow-up to [1] based on discussions at [2]
> about additional work needed to get it working on powerpc.
> 
> The first patch in the series makes struct crash_mem available with or
> without CONFIG_CRASH_DUMP enabled. The next patch moves kdump specific
> code for kexec_file_load syscall under CONFIG_CRASH_DUMP and the last
> patch splits other kdump specific code under CONFIG_CRASH_DUMP and
> removes dependency with CONFIG_CRASH_DUMP for CONFIG_KEXEC_CORE.
> 
> [...]

Applied to powerpc/next.

[1/3] kexec/kdump: make struct crash_mem available without CONFIG_CRASH_DUMP
  https://git.kernel.org/powerpc/c/56a34d799bfa53064e7b8bd354aacd176aeaecc8
[2/3] powerpc/kexec: split CONFIG_KEXEC_FILE and CONFIG_CRASH_DUMP
  https://git.kernel.org/powerpc/c/33f2cc0a2e90f7177c49559b434191b02efd0cd5
[3/3] powerpc/kdump: Split KEXEC_CORE and CRASH_DUMP dependency
  https://git.kernel.org/powerpc/c/5c4233cc0920cc90787aafe950b90f6c57a35b88

cheers



Re: [PATCH] util_lib/elf_info.c: fix a warning

2024-03-22 Thread Simon Horman
On Thu, Mar 21, 2024 at 03:30:37PM +0800, Baoquan He wrote:
> There's an incorrect array operation in function scan_vmcoreinfo(), which
> causes the warning below.
> 
> ---
> util_lib/elf_info.c: In function ‘scan_vmcoreinfo’:
> util_lib/elf_info.c:360:43: warning: writing 1 byte into a region of size 0 [-Wstringop-overflow=]
>   360 | temp_buf[len + 1] = '\0';
>   | ~~^~
> util_lib/elf_info.c:319:14: note: at offset 1024 into destination object ‘temp_buf’ of size 1024
>   319 | char temp_buf[1024];
>   |  ^~~~
> -
> 
> Fix it to avoid an out-of-bounds array access.
> 
> Signed-off-by: Baoquan He 

Thanks, applied.
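
For readers following along: temp_buf has 1024 bytes, so its last valid index
is 1023, and the compiler note shows it can see index len + 1 reaching 1024.
Below is a minimal sketch of a bounded copy that terminates at index len
instead. The helper name copy_line() and its parameters are hypothetical, and
this is a generic illustration of the pattern, not the patch that was applied.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy at most len bytes of src into dst and NUL-terminate the result.
 * dst_size is the full size of dst, so the last valid index is dst_size - 1.
 * Returns the number of bytes copied, not counting the terminator.
 */
size_t copy_line(char *dst, size_t dst_size, const char *src, size_t len)
{
	assert(dst != NULL && src != NULL && dst_size > 0);

	if (len >= dst_size)
		len = dst_size - 1;	/* leave room for the terminator */

	memcpy(dst, src, len);
	dst[len] = '\0';		/* index len, never len + 1 */
	return len;
}
```

With the clamp in place the terminator always lands inside the buffer, which
is the property -Wstringop-overflow is checking for.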



Re: [PATCH v8 01/15] x86/boot: Place kernel_info at a fixed offset

2024-03-22 Thread H. Peter Anvin
On March 21, 2024 6:45:48 AM PDT, "Daniel P. Smith" wrote:
>Hi Ard!
>
>On 2/15/24 02:56, Ard Biesheuvel wrote:
>> On Wed, 14 Feb 2024 at 23:31, Ross Philipson wrote:
>>> 
>>> From: Arvind Sankar 
>>> 
>>> There are use cases for storing the offset of a symbol in kernel_info.
>>> For example, the trenchboot series [0] needs to store the offset of the
>>> Measured Launch Environment header in kernel_info.
>>> 
>> 
>> Why? Is this information consumed by the bootloader?
>
>Yes, the bootloader needs a standardized means to find the offset of the MLE
>header, which communicates a set of meta-data needed by the DCE in order to
>set up for and start the loaded kernel. Arm will also need to provide a
>similar metadata structure and alternative entry point (or a complete rewrite
>of the existing entry point), as the current Arm entry point is in direct
>conflict with the Arm DRTM specification.
>
>> I'd like to get away from x86 specific hacks for boot code and boot
>> images, so I would like to explore if we can avoid kernel_info, or at
>> least expose it in a generic way. We might just add a 32-bit offset
>> somewhere in the first 64 bytes of the bootable image: this could
>> co-exist with EFI bootable images, and can be implemented on arm64,
>> RISC-V and LoongArch as well.
>
>With all due respect, I would not refer to boot params and the kernel_info
>extension designed by the x86 maintainers as a hack. It is the well-defined
>boot protocol for x86, just as Arm has its own boot protocol around Device
>Tree.
>
>We would gladly adopt a cross arch/cross image type (zImage and bzImage) means
>to embed meta-data about the kernel that can be discovered by a bootloader.
>Otherwise, we are relegated to doing a per arch/per image type discovery
>mechanism. If you have any suggestions that are cross arch/cross image type
>that we could explore, we would be grateful and willing to investigate how to
>adopt such a method.
>
>V/r,
>Daniel

To be fair, the way things are going UEFI, i.e. PE/COFF, is becoming the new 
standard format. Yes, ELF would have been better, but...



Re: Question about Address Range Validation in Crash Kernel Allocation

2024-03-22 Thread Li Huafei


On 2024/3/22 15:18, Dave Young wrote:
> On Thu, 21 Mar 2024 at 20:37, Li Huafei  wrote:
>>
>>
>>
>> On 2024/3/21 18:06, Dave Young wrote:
>>> Hi,
>>>
>>> On Thu, 21 Mar 2024 at 17:49, Li Huafei  wrote:

>>>> Hi Baoquan,
>>>>
>>>> On 2024/3/21 17:17, chenhaixiang (A) wrote:
>
>>> I'm sorry for the delay. Here are some details from the boot log and
>> /proc/iomem:
>>> The Boot log:
>>> [0.00] Linux version 6.8.0 (root@localhost.localdomain) (gcc 
>>> (GCC)
>> 10.3.1, GNU ld (GNU Binutils) 2.37) #3 SMP PREEMPT_DYNAMIC Wed Mar 20
>> 11:46:11 UTC 2024
>>> [0.00] Command line: BOOT_IMAGE=/vmlinuz-6.8.0
>> root=/dev/mapper/root ro crashkernel=512M resume=/dev/mapper/swap
>> rd.lvm.lv=root rd.lvm.lv=swap crash_kexec_post_notifiers 
>> softlockup_panic=1
>> reserve_kbox_mem=16M fsck.mode=auto fsck.repair=yes panic=3
>> nmi_watchdog=1 quiet rd.shell=0 memblock=debug efi=debug
>> console=ttyS0,115200n8 console=tty0
>> ..snip...
>>> [0.022622] memblock_phys_alloc_range: 536870912 bytes 
>>> align=0x100
>> from=0x max_addr=0x0001
>> reserve_crashkernel_generic+0x7c/0x220
>>> [0.022628] memblock_phys_alloc_range: 536870912 bytes 
>>> align=0x100
>> from=0x0001 max_addr=0x4000
>> reserve_crashkernel_generic+0x7c/0x220
>>> [0.022632] memblock_reserve: [0x00c01f00-0x00c03eff]
>> memblock_alloc_range_nid+0xee/0x170
>>> [0.022634] memblock_phys_alloc_range: 268435456 bytes 
>>> align=0x100
>> from=0x max_addr=0x0001
>> reserve_crashkernel_generic+0x11d/0x220
>>> [0.022638] memblock_reserve: [0x4900-0x58ff]
>> memblock_alloc_range_nid+0xee/0x170
>>> [0.022640] crashkernel low memory reserved: 0x4900 - 0x5900
>> (256 MB)
>>> [0.022641] crashkernel reserved: 0x00c01f00 -
>> 0x00c03f00 (512 MB)
>>
>> Here, crashkernel,low is reserved in region:  [0x4900 - 0x5900] (256 MB)
>>   crashkernel,high is reserved in region: [0x00c01f00 - 0x00c03f00] (512 MB) ..
>>> [0.029839] memblock_reserve: [0x00c03740-0x00c03f7f]
>> memblock_alloc_range_nid+0xee/0x170
>>> [0.029843] e820: update [mem 0x53cbd000-0x53cc] usable ==>
>> reserved
>>> [0.029861] TSC deadline timer available
>>
>> Then here, region [0x53cbd000-0x53cc] is reserved in e820, as the
>> "usable ==> reserved" line above shows. This should be the step which
>> prevents the earlier reserved crashkernel,low from being added to the iomem
>> tree. I am not sure what triggered the e820 update.

>>>> We added dump_stack() printing in efi_mem_reserve() and found that
>>>> [0x53cbd000-0x53cc] was reserved by BGRT:
>>>>
>>>>   [0.032259] e820: update [mem 0x53cbd000-0x53cc] usable ==> reserved
>>>>   [0.032262] CPU: 0 PID: 0 Comm: swapper Not tainted 5.10.0-60.18.0.50.h820.eulerosv2r11.x86_64 #7
>>>>   [0.032263] Hardware name: Huawei 2288H V5/BC11SPSCB0, BIOS 8.25 08/30/2022
>>>>   [0.032264] Call Trace:
>>>>   [0.032265]  ? dump_stack+0x57/0x6e
>>>>   [0.032267]  ? bgrt_init+0xc2/0xc2
>>>>   [0.032268]  ? __e820__range_update+0x7a/0x1d6
>>>>   [0.032270]  ? bgrt_init+0xc2/0xc2
>>>>   [0.032272]  ? bgrt_init+0xc2/0xc2
>>>>   [0.032274]  ? efi_arch_mem_reserve+0x1a3/0x1d0
>>>>   [0.032276]  ? efi_mem_reserve+0x2d/0x42
>>>>   [0.032278]  ? acpi_parse_bgrt+0xa/0x11
>>>>   [0.032279]  ? acpi_table_parse+0x86/0xbc
>>>>   [0.032281]  ? acpi_boot_init+0x79/0xad
>>>>   [0.032282]  ? setup_arch+0x835/0x954
>>>>   [0.032284]  ? start_kernel+0x5d/0x455
>>>>   [0.032286]  ? secondary_startup_64_no_verify+0xc2/0xcb
>>>>
>>>> efi_reserve_boot_services() has reserved memory of type
>>>> EFI_BOOT_SERVICES_CODE & EFI_BOOT_SERVICES_DATA before crashkernel.
>>>> efi_bgrt_init() assumes that EFI_BOOT_SERVICES_DATA is not reserved by
>>>> other modules. Then, the e820_table is directly updated, and the BGRT
>>>> memory is reserved.
>>>>
>>>> However, memblock_is_region_reserved() in efi_reserve_boot_services()
>>>> returns true when the ranges only overlap.
>>>>
>>>>  already_reserved = memblock_is_region_reserved(start, size);
>>>
>>> Do you mean efi_reserve_boot_services is supposed to reserve the bgrt
>>> memory but it does not reserve it due to the region overlapping with
>>> some other reserved region?  If so can you debug and find what exact
>>> memblock reserved region overlaps with the bgrt?
>>
>> Yes. I added the following debug print to efi_reserve_boot_services():
>>
>> --- a/arch/x86/platform/efi/quirks.c
>> +++ b/arch/x86/platform/efi/quirks.c
>> @@ -339,6 +339,10 @@ void __init 
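
The point made above about memblock_is_region_reserved() comes down to the
difference between "intersects a reserved range" and "lies entirely inside a
reserved range". A minimal, self-contained sketch of the two checks follows.
The struct and function names are hypothetical and this is not kernel code.
It also ignores the case where start + size wraps around.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Half-open range [start, start + size). */
struct range {
	uint64_t start;
	uint64_t size;
};

/* True if a and b share at least one byte (an intersection test). */
bool range_overlaps(struct range a, struct range b)
{
	assert(a.size > 0 && b.size > 0);
	return a.start < b.start + b.size && b.start < a.start + a.size;
}

/* True only if inner lies entirely inside outer (a containment test). */
bool range_contains(struct range outer, struct range inner)
{
	assert(outer.size > 0 && inner.size > 0);
	return inner.start >= outer.start &&
	       inner.start + inner.size <= outer.start + outer.size;
}
```

A caller that treats an intersection result as "already fully reserved" will
skip reserving the part of the region that falls outside the existing
reservation, which is the failure mode described in this thread.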

Re: Question about Address Range Validation in Crash Kernel Allocation

2024-03-22 Thread Dave Young
Hi,

On Fri, 22 Mar 2024 at 09:16, Baoquan He  wrote:
>
> On 03/21/24 at 08:37pm, Li Huafei wrote:
> >
> >
> > On 2024/3/21 18:06, Dave Young wrote:
> > > Hi,
> > >
> > > On Thu, 21 Mar 2024 at 17:49, Li Huafei  wrote:
> > >>
> > >> Hi Baoquan,
> > >>
> > >> On 2024/3/21 17:17, chenhaixiang (A) wrote:
> > >>>
> > > I'm sorry for the delay. Here are some details from the boot log and
> >  /proc/iomem:
> > > The Boot log:
> > > [0.00] Linux version 6.8.0 (root@localhost.localdomain) (gcc 
> > > (GCC)
> >  10.3.1, GNU ld (GNU Binutils) 2.37) #3 SMP PREEMPT_DYNAMIC Wed Mar 20
> >  11:46:11 UTC 2024
> > > [0.00] Command line: BOOT_IMAGE=/vmlinuz-6.8.0
> >  root=/dev/mapper/root ro crashkernel=512M resume=/dev/mapper/swap
> >  rd.lvm.lv=root rd.lvm.lv=swap crash_kexec_post_notifiers 
> >  softlockup_panic=1
> >  reserve_kbox_mem=16M fsck.mode=auto fsck.repair=yes panic=3
> >  nmi_watchdog=1 quiet rd.shell=0 memblock=debug efi=debug
> >  console=ttyS0,115200n8 console=tty0
> >  ..snip...
> > > [0.022622] memblock_phys_alloc_range: 536870912 bytes 
> > > align=0x100
> >  from=0x max_addr=0x0001
> >  reserve_crashkernel_generic+0x7c/0x220
> > > [0.022628] memblock_phys_alloc_range: 536870912 bytes 
> > > align=0x100
> >  from=0x0001 max_addr=0x4000
> >  reserve_crashkernel_generic+0x7c/0x220
> > > [0.022632] memblock_reserve: 
> > > [0x00c01f00-0x00c03eff]
> >  memblock_alloc_range_nid+0xee/0x170
> > > [0.022634] memblock_phys_alloc_range: 268435456 bytes 
> > > align=0x100
> >  from=0x max_addr=0x0001
> >  reserve_crashkernel_generic+0x11d/0x220
> > > [0.022638] memblock_reserve: 
> > > [0x4900-0x58ff]
> >  memblock_alloc_range_nid+0xee/0x170
> > > [0.022640] crashkernel low memory reserved: 0x4900 - 
> > > 0x5900
> >  (256 MB)
> > > [0.022641] crashkernel reserved: 0x00c01f00 -
> >  0x00c03f00 (512 MB)
> > 
> >  Here, crashkernel,low is reserved in region:  [0x4900 - 0x5900] (256 MB)
> >    crashkernel,high is reserved in region: [0x00c01f00 - 0x00c03f00] (512 MB) ..
> > > [0.029839] memblock_reserve: 
> > > [0x00c03740-0x00c03f7f]
> >  memblock_alloc_range_nid+0xee/0x170
> > > [0.029843] e820: update [mem 0x53cbd000-0x53cc] usable ==>
> >  reserved
> > > [0.029861] TSC deadline timer available
> > 
> >  Then here, region [0x53cbd000-0x53cc] is reserved in e820, as the
> >  "usable ==> reserved" line above shows. This should be the step which
> >  prevents the earlier reserved crashkernel,low from being added to the
> >  iomem tree. I am not sure what triggered the e820 update.
> > >>
> > >> We added dump_stack () printing in efi_mem_reserve () and found that
> > >> [0x53cbd000-0x53cc] was reserved by BGRT:
> > >>
> > >>   [0.032259] e820: update [mem 0x53cbd000-0x53cc] usable ==>
> > >> reserved
> > >>   [0.032262] CPU: 0 PID: 0 Comm: swapper Not tainted
> > >> 5.10.0-60.18.0.50.h820.eulerosv2r11.x86_64 #7
> > >>   [0.032263] Hardware name: Huawei 2288H V5/BC11SPSCB0, BIOS 8.25
> > >> 08/30/2022
> > >>   [0.032264] Call Trace:
> > >>   [0.032265]  ? dump_stack+0x57/0x6e
> > >>   [0.032267]  ? bgrt_init+0xc2/0xc2
> > >>   [0.032268]  ? __e820__range_update+0x7a/0x1d6
> > >>   [0.032270]  ? bgrt_init+0xc2/0xc2
> > >>   [0.032272]  ? bgrt_init+0xc2/0xc2
> > >>   [0.032274]  ? efi_arch_mem_reserve+0x1a3/0x1d0
> > >>   [0.032276]  ? efi_mem_reserve+0x2d/0x42
> > >>   [0.032278]  ? acpi_parse_bgrt+0xa/0x11
> > >>   [0.032279]  ? acpi_table_parse+0x86/0xbc
> > >>   [0.032281]  ? acpi_boot_init+0x79/0xad
> > >>   [0.032282]  ? setup_arch+0x835/0x954
> > >>   [0.032284]  ? start_kernel+0x5d/0x455
> > >>   [0.032286]  ? secondary_startup_64_no_verify+0xc2/0xcb
> > >>
> > >> efi_reserve_boot_services() has reserved memory of type
> > >> EFI_BOOT_SERVICES_CODE & EFI_BOOT_SERVICES_DATA  before crashkernel.
> > >> efi_bgrt_init() assumes that EFI_BOOT_SERVICES_DATA is not reserved by
> > >> other modules. Then, the e820_table is directly updated, and the BGRT
> > >> memory is reserved.
> > >>
> > >> However, memblock_is_region_reserved() in efi_reserve_boot_services()
> > >> returns true when the ranges only overlap.
> > >>
> > >>  already_reserved = memblock_is_region_reserved(start, size);
> > >
> > > Do you mean efi_reserve_boot_services is supposed to reserve the bgrt
> > > memory but it does not reserve it due to the region overlapping with
> > > some other reserved region?  If so can you debug and find what exact
> > > memblock 

Re: Question about Address Range Validation in Crash Kernel Allocation

2024-03-22 Thread Dave Young
On Thu, 21 Mar 2024 at 20:37, Li Huafei  wrote:
>
>
>
> On 2024/3/21 18:06, Dave Young wrote:
> > Hi,
> >
> > On Thu, 21 Mar 2024 at 17:49, Li Huafei  wrote:
> >>
> >> Hi Baoquan,
> >>
> >> On 2024/3/21 17:17, chenhaixiang (A) wrote:
> >>>
> > I'm sorry for the delay. Here are some details from the boot log and
>  /proc/iomem:
> > The Boot log:
> > [0.00] Linux version 6.8.0 (root@localhost.localdomain) (gcc 
> > (GCC)
>  10.3.1, GNU ld (GNU Binutils) 2.37) #3 SMP PREEMPT_DYNAMIC Wed Mar 20
>  11:46:11 UTC 2024
> > [0.00] Command line: BOOT_IMAGE=/vmlinuz-6.8.0
>  root=/dev/mapper/root ro crashkernel=512M resume=/dev/mapper/swap
>  rd.lvm.lv=root rd.lvm.lv=swap crash_kexec_post_notifiers 
>  softlockup_panic=1
>  reserve_kbox_mem=16M fsck.mode=auto fsck.repair=yes panic=3
>  nmi_watchdog=1 quiet rd.shell=0 memblock=debug efi=debug
>  console=ttyS0,115200n8 console=tty0
>  ..snip...
> > [0.022622] memblock_phys_alloc_range: 536870912 bytes 
> > align=0x100
>  from=0x max_addr=0x0001
>  reserve_crashkernel_generic+0x7c/0x220
> > [0.022628] memblock_phys_alloc_range: 536870912 bytes 
> > align=0x100
>  from=0x0001 max_addr=0x4000
>  reserve_crashkernel_generic+0x7c/0x220
> > [0.022632] memblock_reserve: [0x00c01f00-0x00c03eff]
>  memblock_alloc_range_nid+0xee/0x170
> > [0.022634] memblock_phys_alloc_range: 268435456 bytes 
> > align=0x100
>  from=0x max_addr=0x0001
>  reserve_crashkernel_generic+0x11d/0x220
> > [0.022638] memblock_reserve: [0x4900-0x58ff]
>  memblock_alloc_range_nid+0xee/0x170
> > [0.022640] crashkernel low memory reserved: 0x4900 - 0x5900
>  (256 MB)
> > [0.022641] crashkernel reserved: 0x00c01f00 -
>  0x00c03f00 (512 MB)
> 
>  Here, crashkernel,low is reserved in region:  [0x4900 - 0x5900] (256 MB)
>    crashkernel,high is reserved in region: [0x00c01f00 - 0x00c03f00] (512 MB) ..
> > [0.029839] memblock_reserve: [0x00c03740-0x00c03f7f]
>  memblock_alloc_range_nid+0xee/0x170
> > [0.029843] e820: update [mem 0x53cbd000-0x53cc] usable ==>
>  reserved
> > [0.029861] TSC deadline timer available
> 
>  Then here, region [0x53cbd000-0x53cc] is reserved in e820, as the
>  "usable ==> reserved" line above shows. This should be the step which
>  prevents the earlier reserved crashkernel,low from being added to the iomem
>  tree. I am not sure what triggered the e820 update.
> >>
> >> We added dump_stack () printing in efi_mem_reserve () and found that
> >> [0x53cbd000-0x53cc] was reserved by BGRT:
> >>
> >>   [0.032259] e820: update [mem 0x53cbd000-0x53cc] usable ==>
> >> reserved
> >>   [0.032262] CPU: 0 PID: 0 Comm: swapper Not tainted
> >> 5.10.0-60.18.0.50.h820.eulerosv2r11.x86_64 #7
> >>   [0.032263] Hardware name: Huawei 2288H V5/BC11SPSCB0, BIOS 8.25
> >> 08/30/2022
> >>   [0.032264] Call Trace:
> >>   [0.032265]  ? dump_stack+0x57/0x6e
> >>   [0.032267]  ? bgrt_init+0xc2/0xc2
> >>   [0.032268]  ? __e820__range_update+0x7a/0x1d6
> >>   [0.032270]  ? bgrt_init+0xc2/0xc2
> >>   [0.032272]  ? bgrt_init+0xc2/0xc2
> >>   [0.032274]  ? efi_arch_mem_reserve+0x1a3/0x1d0
> >>   [0.032276]  ? efi_mem_reserve+0x2d/0x42
> >>   [0.032278]  ? acpi_parse_bgrt+0xa/0x11
> >>   [0.032279]  ? acpi_table_parse+0x86/0xbc
> >>   [0.032281]  ? acpi_boot_init+0x79/0xad
> >>   [0.032282]  ? setup_arch+0x835/0x954
> >>   [0.032284]  ? start_kernel+0x5d/0x455
> >>   [0.032286]  ? secondary_startup_64_no_verify+0xc2/0xcb
> >>
> >> efi_reserve_boot_services() has reserved memory of type
> >> EFI_BOOT_SERVICES_CODE & EFI_BOOT_SERVICES_DATA  before crashkernel.
> >> efi_bgrt_init() assumes that EFI_BOOT_SERVICES_DATA is not reserved by
> >> other modules. Then, the e820_table is directly updated, and the BGRT
> >> memory is reserved.
> >>
> >> However, memblock_is_region_reserved() in efi_reserve_boot_services()
> >> returns true when the ranges only overlap.
> >>
> >>  already_reserved = memblock_is_region_reserved(start, size);
> >
> > Do you mean efi_reserve_boot_services is supposed to reserve the bgrt
> > memory but it does not reserve it due to the region overlapping with
> > some other reserved region?  If so can you debug and find what exact
> > memblock reserved region overlaps with the bgrt?
>
> Yes. I added the following debug print to efi_reserve_boot_services():
>
> --- a/arch/x86/platform/efi/quirks.c
> +++ b/arch/x86/platform/efi/quirks.c
> @@ -339,6 +339,10 @@ void __init efi_reserve_boot_services(void)
>
> already_reserved =