[RFC PATCH v3 1/3] x86/boot/e820: Expose kexec range update, remove and table update functions

2023-10-04 Thread Stanislav Kinsburskii
These functions are to be used by other kernel subsystems to reserve memory
regions in the kexec kernel.

Signed-off-by: Stanislav Kinsburskii 
---
 arch/x86/include/asm/e820/api.h |4 
 arch/x86/kernel/e820.c  |   21 +++--
 2 files changed, 19 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/e820/api.h b/arch/x86/include/asm/e820/api.h
index e8f58ddd06d9..24bb8da928bb 100644
--- a/arch/x86/include/asm/e820/api.h
+++ b/arch/x86/include/asm/e820/api.h
@@ -22,6 +22,10 @@ extern void e820__print_table(char *who);
 extern int  e820__update_table(struct e820_table *table);
 extern void e820__update_table_print(void);
 
+extern u64  e820__range_update_kexec(u64 start, u64 size, enum e820_type old_type, enum e820_type new_type);
+extern u64  e820__range_remove_kexec(u64 start, u64 size, enum e820_type old_type, bool check_type);
+extern void e820__update_table_kexec(void);
+
 extern unsigned long e820__end_of_ram_pfn(void);
 extern unsigned long e820__end_of_low_ram_pfn(void);
 
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index fb8cf953380d..f339815029f7 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -532,13 +532,12 @@ u64 __init e820__range_update(u64 start, u64 size, enum e820_type old_type, enum
return __e820__range_update(e820_table, start, size, old_type, new_type);
 }
 
-static u64 __init e820__range_update_kexec(u64 start, u64 size, enum e820_type old_type, enum e820_type  new_type)
+u64 __init e820__range_update_kexec(u64 start, u64 size, enum e820_type old_type, enum e820_type  new_type)
 {
return __e820__range_update(e820_table_kexec, start, size, old_type, new_type);
 }
 
-/* Remove a range of memory from the E820 table: */
-u64 __init e820__range_remove(u64 start, u64 size, enum e820_type old_type, bool check_type)
+u64 __init __e820__range_remove(struct e820_table *table, u64 start, u64 size, enum e820_type old_type, bool check_type)
 {
int i;
u64 end;
@@ -553,8 +552,8 @@ u64 __init e820__range_remove(u64 start, u64 size, enum e820_type old_type, bool
e820_print_type(old_type);
pr_cont("\n");
 
-   for (i = 0; i < e820_table->nr_entries; i++) {
-   struct e820_entry *entry = &e820_table->entries[i];
+   for (i = 0; i < table->nr_entries; i++) {
+   struct e820_entry *entry = &table->entries[i];
u64 final_start, final_end;
u64 entry_end;
 
@@ -599,6 +598,16 @@ u64 __init e820__range_remove(u64 start, u64 size, enum e820_type old_type, bool
return real_removed_size;
 }
 
+u64 __init e820__range_remove(u64 start, u64 size, enum e820_type old_type, bool check_type)
+{
+   return __e820__range_remove(e820_table, start, size, old_type, check_type);
+}
+
+u64 __init e820__range_remove_kexec(u64 start, u64 size, enum e820_type old_type, bool check_type)
+{
+   return __e820__range_remove(e820_table_kexec, start, size, old_type, check_type);
+}
+}
+
 void __init e820__update_table_print(void)
 {
if (e820__update_table(e820_table))
@@ -608,7 +617,7 @@ void __init e820__update_table_print(void)
e820__print_table("modified");
 }
 
-static void __init e820__update_table_kexec(void)
+void __init e820__update_table_kexec(void)
 {
e820__update_table(e820_table_kexec);
 }
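
For readers who want to see the shape of this refactoring outside the kernel, here is a minimal userspace sketch of the same pattern: one internal helper takes the table as a parameter, and thin public wrappers bind it to a specific table. All names (region_table, table_range_update, range_update, range_update_kexec) are hypothetical stand-ins, not kernel API, and the helper is deliberately simplified: it only retypes entries that lie entirely inside the range and does not split partially overlapping ones.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for enum e820_type and struct e820_table. */
enum region_type { REGION_RAM = 1, REGION_RESERVED_KERN = 2 };

struct region {
	uint64_t addr;
	uint64_t size;
	enum region_type type;
};

struct region_table {
	size_t nr_entries;
	struct region entries[8];
};

/* Two independent tables, mirroring the boot table and the kexec table. */
static struct region_table boot_table;
static struct region_table kexec_table;

/*
 * Internal helper: retype every entry of old_type that lies entirely
 * inside [start, start + size) and return the number of bytes retyped.
 * Simplification: partially overlapping entries are skipped, not split.
 */
static uint64_t table_range_update(struct region_table *table,
				   uint64_t start, uint64_t size,
				   enum region_type old_type,
				   enum region_type new_type)
{
	uint64_t end = start + size;
	uint64_t updated = 0;
	size_t i;

	assert(table != NULL);

	for (i = 0; i < table->nr_entries; i++) {
		struct region *entry = &table->entries[i];
		uint64_t entry_end = entry->addr + entry->size;

		if (entry->type != old_type)
			continue;
		if (entry->addr < start || entry_end > end)
			continue;

		entry->type = new_type;
		updated += entry->size;
	}

	return updated;
}

/* Public wrapper bound to the boot table. */
uint64_t range_update(uint64_t start, uint64_t size,
		      enum region_type old_type, enum region_type new_type)
{
	return table_range_update(&boot_table, start, size, old_type, new_type);
}

/* Public wrapper bound to the table handed to the next kernel. */
uint64_t range_update_kexec(uint64_t start, uint64_t size,
			    enum region_type old_type,
			    enum region_type new_type)
{
	return table_range_update(&kexec_table, start, size, old_type, new_type);
}
```

The point of the split is that a caller can change how the next kernel sees a range without disturbing the map the running kernel uses.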



___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


[RFC PATCH v3 0/3] Introduce persistent memory pool

2023-10-04 Thread Stanislav Kinsburskii
This patch introduces a memory allocator specifically tailored for
persistent memory within the kernel. The allocator maintains
kernel-specific states like DMA passthrough device states, IOMMU state, and
more across kexec.

The current implementation provides a foundation for custom solutions that
may be developed in the future. Although the design is kept concise and
straightforward to encourage discussion and feedback, it remains fully
functional.

The immediate need for the allocator is the ability to persist the kernel
pages deposited into Microsoft Hypervisor across kexec: these pages must
not be accessed by the kernel while deposited, but can be withdrawn and
released back to the kernel. Kexec, in turn, is used for servicing and aims
to minimize service downtime during kernel upgrades across a fleet of machines.

The persistent memory pool builds upon the contiguous memory allocator
(CMA) and ensures CMA state persistence across kexec by incorporating the
CMA bitmap into the memory region instead of allocating it from kernel
memory.
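
As a rough illustration of why keeping the bitmap inside the region makes allocation state survive, here is a small userspace model. It is a sketch under stated assumptions, not the pmpool code: the names (struct pool, pool_attach, pool_alloc_page) and the one-bit-per-page byte bitmap are hypothetical, and a plain buffer stands in for the reserved physical range. A "fresh" attach zeroes the bitmap and marks the pages that hold it as allocated; a later attach to the same buffer reuses the bitmap as found.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define POOL_PAGE_SIZE 4096u

/* Hypothetical pool descriptor; the bitmap lives at the start of base. */
struct pool {
	uint8_t *base;
	size_t nr_pages;
	size_t meta_pages;	/* pages occupied by the bitmap itself */
};

static size_t bitmap_bytes(size_t nr_pages)
{
	return (nr_pages + 7) / 8;
}

static int page_is_used(const uint8_t *bitmap, size_t page)
{
	return (bitmap[page / 8] >> (page % 8)) & 1;
}

static void page_mark_used(uint8_t *bitmap, size_t page)
{
	bitmap[page / 8] |= (uint8_t)(1u << (page % 8));
}

/*
 * Attach to a region. With fresh != 0 the bitmap is initialized and the
 * pages holding it are reserved; otherwise the existing bitmap is kept,
 * which models the state found after a kexec.
 */
static void pool_attach(struct pool *pool, uint8_t *region,
			size_t nr_pages, int fresh)
{
	size_t i;

	assert(pool != NULL && region != NULL);

	pool->base = region;
	pool->nr_pages = nr_pages;
	pool->meta_pages = (bitmap_bytes(nr_pages) + POOL_PAGE_SIZE - 1) /
			   POOL_PAGE_SIZE;

	if (fresh) {
		memset(region, 0, bitmap_bytes(nr_pages));
		for (i = 0; i < pool->meta_pages; i++)
			page_mark_used(region, i);
	}
}

/* Allocate one page; returns its index in the region, or -1 if full. */
static long pool_alloc_page(struct pool *pool)
{
	size_t i;

	for (i = 0; i < pool->nr_pages; i++) {
		if (!page_is_used(pool->base, i)) {
			page_mark_used(pool->base, i);
			return (long)i;
		}
	}

	return -1;
}
```

Because the only allocator state is the bitmap, and the bitmap sits in the preserved range, re-attaching without reinitializing is enough to pick up where the previous owner left off.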

Persistent memory pool metadata is passed across kexec by using Flattened
Device Tree, which is added as another kexec segment for x86 architecture.

Potential applications include:

  1. Enabling various in-kernel entities to allocate persistent pages from
 a unified memory pool, obviating the need for reserving multiple
 regions.

  2. For in-kernel components that need the allocation address to be
 retained on kernel kexec, this address can be exposed to user space
 and subsequently passed through the command line.

  3. Distinct subsystems or drivers can set aside their region, allocating
 a segment for their persistent memory pool, suitable for uses such as
 file systems, key-value stores, and other applications.

Changes since v2:

  1. Device tree-related changes are removed.

  2. Persistent memory pool region is marked as "reserved by kernel" in
 kexec e820 table, which indicates to the new kernel that the pool
 must be restored.

Changes since v1:

  1. Persistent memory pool is now a wrapper on top of CMA instead of being a
 new allocator.

  2. Persistent memory pool metadata doesn't belong to the pool anymore and
 is now passed to the new kernel across kexec via Flattened Device
 Tree.

The following series implements...

---

Stanislav Kinsburskii (3):
  x86/boot/e820: Expose kexec range update, remove and table update functions
  pmpool: Introduce persistent memory pool
  pmpool: Mark reserved range as "kernel reserved" in kexec e820 table


 arch/x86/include/asm/e820/api.h |4 +
 arch/x86/kernel/e820.c  |   21 -
 include/linux/pmpool.h  |   22 +
 mm/Kconfig  |8 ++
 mm/Makefile |1 
 mm/pmpool.c |  159 +++
 6 files changed, 209 insertions(+), 6 deletions(-)
 create mode 100644 include/linux/pmpool.h
 create mode 100644 mm/pmpool.c




[RFC PATCH v3 3/3] pmpool: Mark reserved range as "kernel reserved" in kexec e820 table

2023-10-04 Thread Stanislav Kinsburskii
Update the logic to classify the persistent memory pool in the kexec e820
table as "kernel reserved" when its corresponding e820 region type is
"System RAM". Restore the pool when its type is "kernel reserved". This
ensures the persistence of the memory pool across kexec operations.

Signed-off-by: Stanislav Kinsburskii 
---
 mm/pmpool.c |   50 +++---
 1 file changed, 47 insertions(+), 3 deletions(-)

diff --git a/mm/pmpool.c b/mm/pmpool.c
index c74f09b99283..1e3a2dffc5d3 100644
--- a/mm/pmpool.c
+++ b/mm/pmpool.c
@@ -11,11 +11,14 @@
 #include 
 #include 
 
+#include 
+
 #include "cma.h"
 
 struct pmpool {
struct resource resource;
struct cma *cma;
+   bool exists;
 };
 
 static struct pmpool *default_pmpool;
@@ -50,6 +53,18 @@ static void pmpool_cma_accomodate_bitmap(struct cma *cma)
pr_info("CMA bitmap moved to %#llx\n", virt_to_phys(cma->bitmap));
 }
 
+static void pmpool_cma_restore_bitmap(struct cma *cma)
+{
+   u64 base;
+
+   base = PFN_PHYS(cma->base_pfn);
+
+   bitmap_free(cma->bitmap);
+   cma->bitmap = phys_to_virt(base);
+
+   pr_info("CMA bitmap restored to %#llx\n", base);
+}
+
 static int __init default_pmpool_fixup(void)
 {
if (!default_pmpool)
@@ -58,7 +73,11 @@ static int __init default_pmpool_fixup(void)
if (insert_resource(&iomem_resource, &default_pmpool->resource))
pr_err("failed to insert resource\n");
 
-   pmpool_cma_accomodate_bitmap(default_pmpool->cma);
+   if (default_pmpool->exists)
+   pmpool_cma_restore_bitmap(default_pmpool->cma);
+   else
+   pmpool_cma_accomodate_bitmap(default_pmpool->cma);
+
return 0;
 }
 postcore_initcall(default_pmpool_fixup);
@@ -73,7 +92,7 @@ static int __init parse_pmpool_opt(char *str)
}
};
phys_addr_t base, size, end;
-   int err;
+   int err, e820_type;
 
/* Format is pmpool=, */
base = memparse(str, );
@@ -92,10 +111,33 @@ static int __init parse_pmpool_opt(char *str)
return 0;
}
 
+   e820_type = e820__get_entry_type(base, end);
+   switch (e820_type) {
+   case E820_TYPE_RAM:
+   e820__range_update_kexec(base, size, E820_TYPE_RAM,
+E820_TYPE_RESERVED_KERN);
+   e820__update_table_kexec();
+   break;
+   case E820_TYPE_RESERVED_KERN:
+   /*
+* TODO: there are several assumptions here:
+*   1. That the kernel reserved region represents pmpool,
+*   2. That the region had the same base and size and
+*   3. That the region was properly initialized.
+* All these assumptions aren't valid in general case and this
+* should be addressed.
+*/
+   pmpool.exists = true;
+   break;
+   default:
+   pr_err("unsupported e820 type: %d\n", e820_type);
+   goto free_memblock;
+   }
+
err = cma_init_reserved_mem(base, size, 0, "pmpool", );
if (err) {
pr_err("failed to initialize CMA: %d\n", err);
-   goto free_memblock;
+   goto remove_e820_kexec_range;
}
 
pmpool.resource.start = base;
@@ -108,6 +150,8 @@ static int __init parse_pmpool_opt(char *str)
 
return 0;
 
+remove_e820_kexec_range:
+   e820__range_remove_kexec(base, size, E820_TYPE_RESERVED_KERN, 1);
 free_memblock:
memblock_phys_free(base, size);
return 0;
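
The classification described in the commit message can be summarized as a small decision function. The sketch below is a standalone model with hypothetical names (enum mem_kind, struct pool_setup, pool_classify); it does not call any e820 or CMA API and only captures the three outcomes: create the pool and mark it for the next kernel, restore an existing pool, or reject the range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the e820 types involved. */
enum mem_kind { MEM_KIND_RAM, MEM_KIND_RESERVED_KERN, MEM_KIND_OTHER };

/* Outcome of classifying the range named on the command line. */
struct pool_setup {
	bool exists;			/* pool was set up by a previous kernel */
	bool mark_for_next_kernel;	/* retype range in the kexec table */
};

/*
 * Returns 0 and fills *setup when the range is usable, -1 otherwise.
 * RAM means first boot: create the pool and mark it reserved for the
 * next kernel. Kernel-reserved means the pool is assumed to exist and
 * is restored. Anything else is rejected.
 */
static int pool_classify(enum mem_kind kind, struct pool_setup *setup)
{
	assert(setup != NULL);

	setup->exists = false;
	setup->mark_for_next_kernel = false;

	switch (kind) {
	case MEM_KIND_RAM:
		setup->mark_for_next_kernel = true;
		return 0;
	case MEM_KIND_RESERVED_KERN:
		setup->exists = true;
		return 0;
	default:
		return -1;
	}
}
```

As the TODO in the patch notes, treating every kernel-reserved range as a valid pool is an assumption; a real implementation would need some way to validate what it finds there.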





[RFC PATCH v3 2/3] pmpool: Introduce persistent memory pool

2023-10-04 Thread Stanislav Kinsburskii
This patch introduces a memory allocator specifically tailored for
persistent memory within the kernel. The allocator maintains
kernel-specific states like DMA passthrough device states, IOMMU state, and
more across kexec.

The current implementation provides a foundation for custom solutions that
may be developed in the future. Although the design is kept concise and
straightforward to encourage discussion and feedback, it remains fully
functional.

The persistent memory pool builds upon the contiguous memory allocator
(CMA) and ensures CMA state persistence across kexec by incorporating the
CMA bitmap into the memory region.

Potential applications include:

  1. Enabling various in-kernel entities to allocate persistent pages from
 a unified memory pool, obviating the need for reserving multiple
 regions.

  2. For in-kernel components that need the allocation address to be
 retained on kernel kexec, this address can be exposed to user space
 and subsequently passed through the command line.

  3. Distinct subsystems or drivers can set aside their region, allocating
 a segment for their persistent memory pool, suitable for uses such as
 file systems, key-value stores, and other applications.

Signed-off-by: Stanislav Kinsburskii 
---
 include/linux/pmpool.h |   22 +
 mm/Kconfig |8 +++
 mm/Makefile|1 
 mm/pmpool.c|  115 
 4 files changed, 146 insertions(+)
 create mode 100644 include/linux/pmpool.h
 create mode 100644 mm/pmpool.c

diff --git a/include/linux/pmpool.h b/include/linux/pmpool.h
new file mode 100644
index ..b41f16fa9660
--- /dev/null
+++ b/include/linux/pmpool.h
@@ -0,0 +1,22 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef _PMPOOL_H
+#define _PMPOOL_H
+
+struct page;
+
+#if defined(CONFIG_PMPOOL)
+struct page *pmpool_alloc(unsigned long count);
+bool pmpool_release(struct page *pages, unsigned long count);
+#else
+static inline struct page *pmpool_alloc(unsigned long count)
+{
+   return NULL;
+}
+static inline bool pmpool_release(struct page *pages, unsigned long count)
+{
+   return false;
+}
+#endif
+
+#endif /* _PMPOOL_H */
diff --git a/mm/Kconfig b/mm/Kconfig
index 09130434e30d..e7c10094fb10 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -922,6 +922,14 @@ config CMA_AREAS
 
  If unsure, leave the default value "7" in UMA and "19" in NUMA.
 
+config PMPOOL
+   bool "Persistent memory pool support"
+   select CMA
+   help
+ This option adds support for CMA-based persistent memory pool
+ feature, which provides pages allocation and freeing from a set of
+ persistent memory ranges, deposited to the memory pool.
+
 config MEM_SOFT_DIRTY
bool "Track memory changes"
depends on CHECKPOINT_RESTORE && HAVE_ARCH_SOFT_DIRTY && PROC_FS
diff --git a/mm/Makefile b/mm/Makefile
index 678530a07326..8d3579e58c2c 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -139,3 +139,4 @@ obj-$(CONFIG_IO_MAPPING) += io-mapping.o
 obj-$(CONFIG_HAVE_BOOTMEM_INFO_NODE) += bootmem_info.o
 obj-$(CONFIG_GENERIC_IOREMAP) += ioremap.o
 obj-$(CONFIG_SHRINKER_DEBUG) += shrinker_debug.o
+obj-$(CONFIG_PMPOOL) += pmpool.o
diff --git a/mm/pmpool.c b/mm/pmpool.c
new file mode 100644
index ..c74f09b99283
--- /dev/null
+++ b/mm/pmpool.c
@@ -0,0 +1,115 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#define pr_fmt(fmt) "pmpool: " fmt
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "cma.h"
+
+struct pmpool {
+   struct resource resource;
+   struct cma *cma;
+};
+
+static struct pmpool *default_pmpool;
+
+bool pmpool_release(struct page *pages, unsigned long count)
+{
+   if (!default_pmpool)
+   return false;
+
+   return cma_release(default_pmpool->cma, pages, count);
+}
+
+struct page *pmpool_alloc(unsigned long count)
+{
+   if (!default_pmpool)
+   return NULL;
+
+   return cma_alloc(default_pmpool->cma, count, 0, true);
+}
+
+static void pmpool_cma_accomodate_bitmap(struct cma *cma)
+{
+   unsigned long bitmap_size;
+
+   bitmap_free(cma->bitmap);
+   cma->bitmap = phys_to_virt(PFN_PHYS(cma->base_pfn));
+
+   bitmap_size = BITS_TO_LONGS(cma_bitmap_maxno(cma));
+   memset(cma->bitmap, 0, bitmap_size);
+   bitmap_set(cma->bitmap, 0, PAGE_ALIGN(bitmap_size) >> PAGE_SHIFT);
+
+   pr_info("CMA bitmap moved to %#llx\n", virt_to_phys(cma->bitmap));
+}
+
+static int __init default_pmpool_fixup(void)
+{
+   if (!default_pmpool)
+   return 0;
+
+   if (insert_resource(&iomem_resource, &default_pmpool->resource))
+   pr_err("failed to insert resource\n");
+
+   pmpool_cma_accomodate_bitmap(default_pmpool->cma);
+   return 0;
+}
+postcore_initcall(default_pmpool_fixup);
+
+static int __init parse_pmpool_opt(char *str)
+{
+   static struct pmpool pmpool = {
+   

Re: [PATCH v3 0/6] crashdump: Kernel handling of CPU and memory hot un/plug

2023-10-04 Thread Eric DeVolder




On 10/4/23 07:08, Simon Horman wrote:

On Wed, Sep 27, 2023 at 02:11:30PM -0400, Eric DeVolder wrote:

When the kdump service is loaded, if a CPU or memory is hot
un/plugged, the crash elfcorehdr, which describes the CPUs and memory
in the system, must also be updated, else the resulting vmcore is
inaccurate (eg. missing either CPU context or memory regions).

The current solution utilizes udev (eg. RHEL /usr/lib/udev/rules.d/
98-kexec.rules) to initiate an unload-then-reload of the *entire* kdump
image (eg. kernel, initrd, boot_params, purgatory and elfcorehdr) by
the userspace kexec utility. This occurs just so the elfcorehdr can
be updated with the latest list of CPUs and memory regions. In a
previous post I have outlined the significant performance problems
related to offloading this activity to userspace.

With the Linux kernel 6.6 commit below, the kernel now has the ability
to directly modify the elfcorehdr, eliminating the need to
unload-then-reload the entire kdump image when CPU or memory is hot
un/plugged or on/offlined.

  
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d68b4b6f307d155475cce541f2aee938032ed22e

This kexec-tools patch series is for supporting hotplug with the
kexec_load() syscall; the kernel directly supports hotplug for the
kexec_file_load() syscall, requiring no userspace help.

There are two basic obstacles/requirements for the kexec-tools to
overcome in order to support kernel hotplug rewriting of the
elfcorehdr.

First, the buffer containing the elfcorehdr must be excluded from the
purgatory checksum/digest, which is computed at load time. Otherwise
kernel run-time changes to the elfcorehdr, as a result of hot un/plug,
would result in the checksum failing (specifically in purgatory at
panic kernel boot time), and kdump capture kernel failing to start.
To let the kernel know it is okay to modify the elfcorehdr, kexec
sets the KEXEC_UPDATE_ELFCOREHDR flag.

NOTE: The kernel specifically does *NOT* attempt to recompute the
checksum/digest as that would ultimately require patching the in-
memory purgatory image with the updated checksum. As that purgatory
image is already fully linked, it is a binary blob containing no ELF
information which would allow it to be re-linked or patched. Thus
excluding the elfcorehdr from the checksum/digests avoids all these
problems.
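
To make the first requirement concrete, the following sketch computes a digest over a list of segments while skipping any segment flagged as excluded. It is only a model: the struct and function names are hypothetical, and a 64-bit FNV-1a hash stands in for the real purgatory digest. The property it demonstrates is the one that matters here: rewriting an excluded buffer after load time leaves the digest unchanged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one loaded segment. */
struct segment {
	const unsigned char *buf;
	size_t len;
	bool exclude_from_digest;	/* e.g. the elfcorehdr buffer */
};

/*
 * Digest all segments except the excluded ones. FNV-1a (64-bit) is used
 * purely as a stand-in hash for illustration.
 */
static uint64_t digest_segments(const struct segment *segs, size_t nr_segs)
{
	uint64_t hash = 0xcbf29ce484222325ULL;	/* FNV offset basis */
	size_t i, j;

	assert(segs != NULL || nr_segs == 0);

	for (i = 0; i < nr_segs; i++) {
		if (segs[i].exclude_from_digest)
			continue;

		for (j = 0; j < segs[i].len; j++) {
			hash ^= segs[i].buf[j];
			hash *= 0x100000001b3ULL;	/* FNV prime */
		}
	}

	return hash;
}
```

With the header segment flagged, the digest checked at panic time still matches the one computed at load time, even though the header contents may have been rewritten in between.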

Second, the size of the elfcorehdr buffer must be large enough
to accommodate growth of the number of CPUs and/or memory regions.

To satisfy the first requirement, this patch series introduces the
--hotplug option to indicate to kexec-tools that kexec should exclude
the elfcorehdr buffer from the purgatory checksum/digest calculation
and set the KEXEC_UPDATE_ELFCOREHDR flag.

To satisfy the second requirement, the size is obtained from the
/sys/kernel/crash_elfcorehdr_size node (new with the kernel series
cited above).
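
Reading a size from a sysfs node of this kind could look roughly like the sketch below. The helper name read_size_node is hypothetical and this is not the kexec-tools implementation; it assumes the node contains a single number followed by a newline, and it treats a missing node (for example on an older kernel) as an error the caller must handle.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Read one unsigned number from a text file such as
 * /sys/kernel/crash_elfcorehdr_size. Returns 0 on success and stores
 * the value in *size_out; returns -1 if the file is missing, empty,
 * or does not start with a number.
 */
static int read_size_node(const char *path, unsigned long long *size_out)
{
	char line[64];
	char *end;
	FILE *fp;
	unsigned long long value;

	assert(path != NULL && size_out != NULL);

	fp = fopen(path, "r");
	if (!fp)
		return -1;

	if (!fgets(line, sizeof(line), fp)) {
		fclose(fp);
		return -1;
	}
	fclose(fp);

	errno = 0;
	value = strtoull(line, &end, 0);	/* base 0: decimal or 0x-hex */
	if (errno != 0 || end == line)
		return -1;

	*size_out = value;
	return 0;
}
```

A caller would then size the elfcorehdr buffer from the returned value instead of from the current CPU and memory counts.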

To use this feature with kexec_load() syscall, invoke kexec with:

  kexec -c --hotplug ...

Thanks!
eric


Thanks Eric,

applied.


Excellent, thank you!
eric



Re: [PATCH 1/1] kexec: provide a memfd_create() wrapper if not present in libc

2023-10-04 Thread Simon Horman
On Sat, Sep 23, 2023 at 06:46:06PM +0200, Julien Olivain wrote:
> Commit 714fa115 "kexec/arm64: Simplify the code for zImage" introduced
> a use of the memfd_create() system call, included in version
> kexec-tools v2.0.27.
> 
> This system call was introduced in kernel commit [1], first included
> in kernel v3.17 (released on 2014-10-05).
> 
> The memfd_create() glibc wrapper function was added much later in
> commit [2], first included in glibc version 2.27 (released on
> 2018-02-01).
> 
> This direct use of memfd_create() introduced a requirement on
> Kernel >= 3.17 and glibc >= 2.27.
> 
> There are old toolchains, such as [3] (which ships gcc 7.3.1,
> glibc 2.25 and includes kernel v4.10 headers), that can still be used
> to build newer kernels. Even if such toolchains can be seen as
> outdated, they are still claimed as supported by recent kernels.
> For example, Kernel v6.5.5 has a requirement on gcc version 5.1 and
> greater. See [4].
> 
> Moreover, kexec-tools <= 2.0.26 could be compiled using recent
> toolchains with alternative libc (e.g. uclibc-ng, musl) which are not
> providing the memfd_create() wrapper.
> 
> When compiling kexec-tools v2.0.27 with a toolchain not providing the
> memfd_create() syscall wrapper, compilation fails with the message:
> 
> kexec/kexec.c: In function 'copybuf_memfd':
> kexec/kexec.c:645:7: warning: implicit declaration of function 'memfd_create'; did you mean 'SYS_memfd_create'? [-Wimplicit-function-declaration]
>   fd = memfd_create("kernel", MFD_ALLOW_SEALING);
>^~~~
>SYS_memfd_create
> kexec/kexec.c:645:30: error: 'MFD_ALLOW_SEALING' undeclared (first use in this function); did you mean '_PC_ALLOC_SIZE_MIN'?
>   fd = memfd_create("kernel", MFD_ALLOW_SEALING);
>   ^
>   _PC_ALLOC_SIZE_MIN
> 
> In order to let kexec-tools compile in a wider range of configurations,
> this commit adds a memfd_create() function check in autoconf configure
> script, and adds a system call wrapper which will be used if the
> function is not available. With this commit, the environment
> requirement is relaxed to only kernel >= v3.17.
> 
> Note: this issue was found in kexec-tools integration in Buildroot [5]
> using the command "utils/test-pkg -a -p kexec", which tests many
> toolchain/arch combinations.
> 
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=9183df25fe7b194563db3fec6dc3202a5855839c
> [2] https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=59d2cbb1fe4b8601d5cbd359c3806973eab6c62d
> [3] https://releases.linaro.org/components/toolchain/binaries/7.3-2018.05/aarch64-linux-gnu/gcc-linaro-7.3.1-2018.05-x86_64_aarch64-linux-gnu.tar.xz
> [4] https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/Documentation/process/changes.rst?h=v6.5.5#n32
> [5] https://buildroot.org/
> 
> Signed-off-by: Julien Olivain 

Thanks Julien,

applied.
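
For reference, a fallback of the kind described in the quoted commit message generally looks something like the sketch below. This is not the code from the applied patch: the function name compat_memfd_create is hypothetical, and the sketch assumes the kernel headers define SYS_memfd_create (true for v3.17 and later headers). The flag value is the one from the kernel UAPI headers, defined here only when the libc headers do not provide it.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Older libc headers may not define the flag; value from linux/memfd.h. */
#ifndef MFD_ALLOW_SEALING
#define MFD_ALLOW_SEALING 0x0002U
#endif

/*
 * Hypothetical fallback wrapper: invoke the memfd_create system call
 * directly instead of relying on a libc wrapper function. Returns a
 * file descriptor, or -1 with errno set on failure.
 */
static int compat_memfd_create(const char *name, unsigned int flags)
{
	assert(name != NULL);

	return (int)syscall(SYS_memfd_create, name, flags);
}
```

In a real build, a configure-time check would select the libc function when it exists and fall back to a wrapper like this one only when it does not.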



Re: [PATCH v3 0/6] crashdump: Kernel handling of CPU and memory hot un/plug

2023-10-04 Thread Simon Horman
On Wed, Sep 27, 2023 at 02:11:30PM -0400, Eric DeVolder wrote:
> When the kdump service is loaded, if a CPU or memory is hot
> un/plugged, the crash elfcorehdr, which describes the CPUs and memory
> in the system, must also be updated, else the resulting vmcore is
> inaccurate (eg. missing either CPU context or memory regions).
> 
> The current solution utilizes udev (eg. RHEL /usr/lib/udev/rules.d/
> 98-kexec.rules) to initiate an unload-then-reload of the *entire* kdump
> image (eg. kernel, initrd, boot_params, purgatory and elfcorehdr) by
> the userspace kexec utility. This occurs just so the elfcorehdr can
> be updated with the latest list of CPUs and memory regions. In a
> previous post I have outlined the significant performance problems
> related to offloading this activity to userspace.
> 
> With the Linux kernel 6.6 commit below, the kernel now has the ability
> to directly modify the elfcorehdr, eliminating the need to
> unload-then-reload the entire kdump image when CPU or memory is hot
> un/plugged or on/offlined.
> 
>  
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d68b4b6f307d155475cce541f2aee938032ed22e
> 
> This kexec-tools patch series is for supporting hotplug with the
> kexec_load() syscall; the kernel directly supports hotplug for the
> kexec_file_load() syscall, requiring no userspace help.
> 
> There are two basic obstacles/requirements for the kexec-tools to
> overcome in order to support kernel hotplug rewriting of the
> elfcorehdr.
> 
> First, the buffer containing the elfcorehdr must be excluded from the
> purgatory checksum/digest, which is computed at load time. Otherwise
> kernel run-time changes to the elfcorehdr, as a result of hot un/plug,
> would result in the checksum failing (specifically in purgatory at
> panic kernel boot time), and kdump capture kernel failing to start.
> To let the kernel know it is okay to modify the elfcorehdr, kexec
> sets the KEXEC_UPDATE_ELFCOREHDR flag.
> 
> NOTE: The kernel specifically does *NOT* attempt to recompute the
> checksum/digest as that would ultimately require patching the in-
> memory purgatory image with the updated checksum. As that purgatory
> image is already fully linked, it is a binary blob containing no ELF
> information which would allow it to be re-linked or patched. Thus
> excluding the elfcorehdr from the checksum/digests avoids all these
> problems.
> 
> Second, the size of the elfcorehdr buffer must be large enough
> to accommodate growth of the number of CPUs and/or memory regions.
> 
> To satisfy the first requirement, this patch series introduces the
> --hotplug option to indicate to kexec-tools that kexec should exclude
> the elfcorehdr buffer from the purgatory checksum/digest calculation
> and set the KEXEC_UPDATE_ELFCOREHDR flag.
> 
> To satisfy the second requirement, the size is obtained from the
> /sys/kernel/crash_elfcorehdr_size node (new with the kernel series
> cited above).
> 
> To use this feature with kexec_load() syscall, invoke kexec with:
> 
>  kexec -c --hotplug ...
> 
> Thanks!
> eric

Thanks Eric,

applied.



Re: [PATCH v2] kexec: update manpage with explicit mention of clean kexec

2023-10-04 Thread Simon Horman
On Wed, Sep 20, 2023 at 05:29:27PM +0530, Hari Bathini wrote:
> While the manpage does mention kexec boot with a clean shutdown,
> it is not explicit about it. Make it explicit.
> 
> Signed-off-by: Hari Bathini 

Thanks, applied.
