Re: [PATCH AUTOSEL 6.9 18/23] powerpc: make fadump resilient with memory add/remove events

2024-05-30 Thread Sourabh Jain

Hello Sasha,

Thank you for considering this patch for the stable tree 6.9, 6.8, 6.6, 
and 6.1.


This patch does two things:
1. Fixes a potential memory corruption issue mentioned as the third 
point in the commit message
2. Enables the kernel to avoid unnecessary fadump re-registration on 
memory add/remove events


To make the second functionality available to users, I request you also 
consider the upstream patch
mentioned below along with this patch. Both patches were part of the 
same patch series.


https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git/commit/?id=bc446c5acabadeb38b61b565535401c5dfdd1214

Now, there was also a third patch in the same patch series, but that is 
about a documentation update.


Link to patch series:
https://lore.kernel.org/all/20240422195932.1583833-1-sourabhj...@linux.ibm.com/

Thanks,
Sourabh Jain


On 27/05/24 21:20, Sasha Levin wrote:

From: Sourabh Jain 

[ Upstream commit c6c5b14dac0d1bd0da8b4d1d3b77f18eb9085fcb ]

Due to changes in memory resources caused by either memory hotplug or
online/offline events, the elfcorehdr, which describes the CPUs and
memory of the crashed kernel to the kernel that collects the dump (known
as second/fadump kernel), becomes outdated. Consequently, attempting
dump collection with an outdated elfcorehdr can lead to failed or
inaccurate dump collection.

Memory hotplug or online/offline events is referred as memory add/remove
events in reset of the commit message.

The current solution to address the aforementioned issue is as follows:
Monitor memory add/remove events in userspace using udev rules, and
re-register fadump whenever there are changes in memory resources. This
leads to the creation of a new elfcorehdr with updated system memory
information.

There are several notable issues associated with re-registering fadump
for every memory add/remove events.

1. Bulk memory add/remove events with udev-based fadump re-registration
can lead to race conditions and, more importantly, it creates a wide
window during which fadump is inactive until all memory add/remove
events are settled.
2. Re-registering fadump for every memory add/remove event is
inefficient.
3. The memory for elfcorehdr is allocated based on the memblock regions
available during early boot and remains fixed thereafter. However, if
elfcorehdr is later recreated with additional memblock regions, its
size will increase, potentially leading to memory corruption.

Address the aforementioned challenges by shifting the creation of
elfcorehdr from the first kernel (also referred as the crashed kernel),
where it was created and frequently recreated for every memory
add/remove event, to the fadump kernel. As a result, the elfcorehdr only
needs to be created once, thus eliminating the necessity to re-register
fadump during memory add/remove events.

At present, the first kernel prepares fadump header and stores it in the
fadump reserved area. The fadump header includes the start address of
the elfcorehdr, crashing CPU details, and other relevant information. In
the event of a crash in the first kernel, the second/fadump boots and
accesses the fadump header prepared by the first kernel. It then
performs the following steps in a platform-specific function
[rtas|opal]_fadump_process:

1. Sanity check for fadump header
2. Update CPU notes in elfcorehdr

Along with the above, update the setup_fadump()/fadump.c to create
elfcorehdr and set its address to the global variable elfcorehdr_addr
for the vmcore module to process it in the second/fadump kernel.

Section below outlines the information required to create the elfcorehdr
and the changes made to make it available to the fadump kernel if it's
not already.

To create elfcorehdr, the following crashed kernel information is
required: CPU notes, vmcoreinfo, and memory ranges.

At present, the CPU notes are already prepared in the fadump kernel, so
no changes are needed in that regard. The fadump kernel has access to
all crashed kernel memory regions, including boot memory regions that
are relocated by firmware to fadump reserved areas, so no changes for
that either. However, it is necessary to add new members to the fadump
header, i.e., the 'fadump_crash_info_header' structure, in order to pass
the crashed kernel's vmcoreinfo address and its size to fadump kernel.

In addition to the vmcoreinfo address and size, there are a few other
attributes also added to the fadump_crash_info_header structure.

1. version:
It stores the fadump header version, which is currently set to 1.
This provides flexibility to update the fadump crash info header in
the future without changing the magic number. For each change in the
fadump header, the version will be increased. This will help the
updated kernel determine how to handle kernel dumps from older
kernels. The magic number remains relevant for checking fadump header
corruption.

2. pt_regs_sz/cpu_mask_sz:
Store size of pt_regs

Re: [PATCH 2/2] arch/powerpc: hotplug driver bridge support

2024-05-13 Thread Sourabh Jain
slot for non-bridge/EP case */
    if (!((class >> 8) == PCI_CLASS_BRIDGE_PCI) && start->child == 
dn) {

        slotno = PCI_SLOT(PCI_DN(dn)->devfn);
        pci_scan_slot(bus, PCI_DEVFN(slotno, 0));
        of_node_put(dn);
        return;
    }

    /* Call of pci_scan_slot for bridge port */
    if ((class >> 8) == PCI_CLASS_BRIDGE_PCI) {
        slotno = PCI_SLOT(PCI_DN(dn)->devfn);
        pci_scan_slot(bus, PCI_DEVFN(slotno, 0));
    }
    }



+}
+EXPORT_SYMBOL_GPL(pci_traverse_sibling_nodes_and_scan_slot);


What is the need for exporting the above function?


+
  DECLARE_PCI_FIXUP_EARLY(PCI_ANY_ID, PCI_ANY_ID, pci_dev_pdn_setup);


- Sourabh Jain


[PATCH v2 2/2] powerpc/kexec_file: fix cpus node update to FDT

2024-05-10 Thread Sourabh Jain
While updating the cpus node, commit 40c753993e3a ("powerpc/kexec_file:
 Use current CPU info while setting up FDT") first deletes all subnodes
under the /cpus node. However, while adding sub-nodes back, it missed
adding cpus subnodes whose device_type != "cpu", such as l2-cache*,
l3-cache*, ibm,powerpc-cpu-features.

Fix this by only deleting cpus sub-nodes of device_type == "cpus" and
then adding all available nodes with device_type == "cpu".

Fixes: 40c753993e3a ("powerpc/kexec_file: Use current CPU info while setting up 
FDT")
Cc: Aditya Gupta 
Cc: Hari Bathini 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: "Naveen N. Rao" 
Signed-off-by: Sourabh Jain 
---

* No changes in v2.

---

 arch/powerpc/kexec/core_64.c | 53 +---
 1 file changed, 37 insertions(+), 16 deletions(-)

diff --git a/arch/powerpc/kexec/core_64.c b/arch/powerpc/kexec/core_64.c
index 85050be08a23..2e625c2cb6b9 100644
--- a/arch/powerpc/kexec/core_64.c
+++ b/arch/powerpc/kexec/core_64.c
@@ -456,9 +456,15 @@ static int add_node_props(void *fdt, int node_offset, 
const struct device_node *
  * @fdt:  Flattened device tree of the kernel.
  *
  * Returns 0 on success, negative errno on error.
+ *
+ * Note: expecting no subnodes under /cpus/ with device_type == "cpu".
+ * If this changes, update this function to include them.
  */
 int update_cpus_node(void *fdt)
 {
+   int prev_node_offset;
+   const char *device_type;
+   const struct fdt_property *prop;
struct device_node *cpus_node, *dn;
int cpus_offset, cpus_subnode_offset, ret = 0;
 
@@ -469,30 +475,44 @@ int update_cpus_node(void *fdt)
return cpus_offset;
}
 
-   if (cpus_offset > 0) {
-   ret = fdt_del_node(fdt, cpus_offset);
+   prev_node_offset = cpus_offset;
+   /* Delete sub-nodes of /cpus node with device_type == "cpu" */
+   for (cpus_subnode_offset = fdt_first_subnode(fdt, cpus_offset); 
cpus_subnode_offset >= 0;) {
+   /* Ignore nodes that do not have a device_type property or 
device_type != "cpu" */
+   prop = fdt_get_property(fdt, cpus_subnode_offset, 
"device_type", NULL);
+   if (!prop || strcmp(prop->data, "cpu")) {
+   prev_node_offset = cpus_subnode_offset;
+   goto next_node;
+   }
+
+   ret = fdt_del_node(fdt, cpus_subnode_offset);
if (ret < 0) {
-   pr_err("Error deleting /cpus node: %s\n", 
fdt_strerror(ret));
-   return -EINVAL;
+   pr_err("Failed to delete a cpus sub-node: %s\n", 
fdt_strerror(ret));
+   return ret;
}
+next_node:
+   if (prev_node_offset == cpus_offset)
+   cpus_subnode_offset = fdt_first_subnode(fdt, 
cpus_offset);
+   else
+   cpus_subnode_offset = fdt_next_subnode(fdt, 
prev_node_offset);
}
 
-   /* Add cpus node to fdt */
-   cpus_offset = fdt_add_subnode(fdt, fdt_path_offset(fdt, "/"), "cpus");
-   if (cpus_offset < 0) {
-   pr_err("Error creating /cpus node: %s\n", 
fdt_strerror(cpus_offset));
+   cpus_node = of_find_node_by_path("/cpus");
+   /* Fail here to avoid kexec/kdump kernel boot hung */
+   if (!cpus_node) {
+   pr_err("No /cpus node found\n");
return -EINVAL;
}
 
-   /* Add cpus node properties */
-   cpus_node = of_find_node_by_path("/cpus");
-   ret = add_node_props(fdt, cpus_offset, cpus_node);
-   of_node_put(cpus_node);
-   if (ret < 0)
-   return ret;
+   /* Add all /cpus sub-nodes of device_type == "cpu" to FDT */
+   for_each_child_of_node(cpus_node, dn) {
+   /* Ignore device nodes that do not have a device_type property
+* or device_type != "cpu".
+*/
+   device_type = of_get_property(dn, "device_type", NULL);
+   if (!device_type || strcmp(device_type, "cpu"))
+   continue;
 
-   /* Loop through all subnodes of cpus and add them to fdt */
-   for_each_node_by_type(dn, "cpu") {
cpus_subnode_offset = fdt_add_subnode(fdt, cpus_offset, 
dn->full_name);
if (cpus_subnode_offset < 0) {
pr_err("Unable to add %s subnode: %s\n", dn->full_name,
@@ -506,6 +526,7 @@ int update_cpus_node(void *fdt)
goto out;
}
 out:
+   of_node_put(cpus_node);
of_node_put(dn);
return ret;
 }
-- 
2.44.0



[PATCH v2 1/2] powerpc/kexec_file: fix extra size calculation for kexec FDT

2024-05-10 Thread Sourabh Jain
While setting up the FDT for kexec, CPU nodes that are added after the
system boots and reserved memory ranges are incorporated into the
initial_boot_params (base FDT).

However, they are not taken into account when determining the additional
size needed for the kexec FDT. As a result, kexec fails to load,
generating the following error:

[1116.774451] Error updating memory reserve map: FDT_ERR_NOSPACE
kexec_file_load failed: No such process

Therefore, consider the extra size for CPU nodes added post-system boot
and reserved memory ranges while preparing the kexec FDT.

While adding a new parameter to the setup_new_fdt_ppc64 function, it was
noticed that there were a couple of unused parameters, so they were
removed.

Cc: Aditya Gupta 
Cc: Hari Bathini 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: "Naveen N. Rao" 
Signed-off-by: Sourabh Jain 
---

Changelog:

Since v1:
  - Initialize local variable `cpu_nodes` before using it. 01/02

---

 arch/powerpc/include/asm/kexec.h  |  6 ++--
 arch/powerpc/kexec/elf_64.c   | 12 +--
 arch/powerpc/kexec/file_load_64.c | 53 +--
 3 files changed, 33 insertions(+), 38 deletions(-)

diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
index 95a98b390d62..270ee93a0f7d 100644
--- a/arch/powerpc/include/asm/kexec.h
+++ b/arch/powerpc/include/asm/kexec.h
@@ -103,10 +103,8 @@ int load_crashdump_segments_ppc64(struct kimage *image,
 int setup_purgatory_ppc64(struct kimage *image, const void *slave_code,
  const void *fdt, unsigned long kernel_load_addr,
  unsigned long fdt_load_addr);
-unsigned int kexec_extra_fdt_size_ppc64(struct kimage *image);
-int setup_new_fdt_ppc64(const struct kimage *image, void *fdt,
-   unsigned long initrd_load_addr,
-   unsigned long initrd_len, const char *cmdline);
+unsigned int kexec_extra_fdt_size_ppc64(struct kimage *image, struct crash_mem 
*rmem);
+int setup_new_fdt_ppc64(const struct kimage *image, void *fdt, struct 
crash_mem *rmem);
 #endif /* CONFIG_PPC64 */
 
 #endif /* CONFIG_KEXEC_FILE */
diff --git a/arch/powerpc/kexec/elf_64.c b/arch/powerpc/kexec/elf_64.c
index 214c071c58ed..5d6d616404cf 100644
--- a/arch/powerpc/kexec/elf_64.c
+++ b/arch/powerpc/kexec/elf_64.c
@@ -23,6 +23,7 @@
 #include 
 #include 
 #include 
+#include 
 
 static void *elf64_load(struct kimage *image, char *kernel_buf,
unsigned long kernel_len, char *initrd,
@@ -36,6 +37,7 @@ static void *elf64_load(struct kimage *image, char 
*kernel_buf,
const void *slave_code;
struct elfhdr ehdr;
char *modified_cmdline = NULL;
+   struct crash_mem *rmem = NULL;
struct kexec_elf_info elf_info;
struct kexec_buf kbuf = { .image = image, .buf_min = 0,
  .buf_max = ppc64_rma_size };
@@ -102,17 +104,20 @@ static void *elf64_load(struct kimage *image, char 
*kernel_buf,
kexec_dprintk("Loaded initrd at 0x%lx\n", initrd_load_addr);
}
 
+   ret = get_reserved_memory_ranges();
+   if (ret)
+   goto out;
+
fdt = of_kexec_alloc_and_setup_fdt(image, initrd_load_addr,
   initrd_len, cmdline,
-  kexec_extra_fdt_size_ppc64(image));
+  kexec_extra_fdt_size_ppc64(image, 
rmem));
if (!fdt) {
pr_err("Error setting up the new device tree.\n");
ret = -EINVAL;
goto out;
}
 
-   ret = setup_new_fdt_ppc64(image, fdt, initrd_load_addr,
- initrd_len, cmdline);
+   ret = setup_new_fdt_ppc64(image, fdt, rmem);
if (ret)
goto out_free_fdt;
 
@@ -146,6 +151,7 @@ static void *elf64_load(struct kimage *image, char 
*kernel_buf,
 out_free_fdt:
kvfree(fdt);
 out:
+   kfree(rmem);
kfree(modified_cmdline);
kexec_free_elf_info(_info);
 
diff --git a/arch/powerpc/kexec/file_load_64.c 
b/arch/powerpc/kexec/file_load_64.c
index 925a69ad2468..413c76de283d 100644
--- a/arch/powerpc/kexec/file_load_64.c
+++ b/arch/powerpc/kexec/file_load_64.c
@@ -803,10 +803,9 @@ static unsigned int cpu_node_size(void)
return size;
 }
 
-static unsigned int kdump_extra_fdt_size_ppc64(struct kimage *image)
+static unsigned int kdump_extra_fdt_size_ppc64(struct kimage *image, unsigned 
int cpu_nodes)
 {
-   unsigned int cpu_nodes, extra_size = 0;
-   struct device_node *dn;
+   unsigned int extra_size = 0;
u64 usm_entries;
 #ifdef CONFIG_CRASH_HOTPLUG
unsigned int possible_cpu_nodes;
@@ -826,18 +825,6 @@ static unsigned int kdump_extra_fdt_size_ppc64(struct 
kimage *image)
extra_size += (unsigned int)(usm_entries * sizeof(u64));
}
 
-   /*
-* Get the number 

[PATCH v2 0/2] powerpc: kexec fixes

2024-05-10 Thread Sourabh Jain
Patch series fixes two kexec issues.

01/02: Update extra size calculation for kexec FDT to avoid kexec load
failure due to FDT_ERR_NOSPACE while including CPU nodes added post
boot and reserved memory ranges.

02/02: Fix update_cpus_node/core_64.c function to include missing device
nodes under /cpus node with device_type != "cpu".

Note: this patch series is rebased on top of the linux-next/master,
tag: next-20240509 to avoid the conflict with the below patch series:
https://lore.kernel.org/all/171509287314.62008.11812494124513471250.b4...@ellerman.id.au/

Changelog:
==

v2:
  - Initialize local variable `cpu_nodes` before using it. 01/02
  - Rebased on top of linux-next/mater tag: next-20240509.

v1:
  - 
https://lore.kernel.org/all/20240508130558.1939304-1-sourabhj...@linux.ibm.com/


Cc: Aditya Gupta 
Cc: Hari Bathini 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: "Naveen N. Rao" 

Sourabh Jain (2):
  powerpc/kexec_file: fix extra size calculation for kexec FDT
  powerpc/kexec_file: fix cpus node update to FDT

 arch/powerpc/include/asm/kexec.h  |  6 ++--
 arch/powerpc/kexec/core_64.c  | 53 +--
 arch/powerpc/kexec/elf_64.c   | 12 +--
 arch/powerpc/kexec/file_load_64.c | 53 +--
 4 files changed, 70 insertions(+), 54 deletions(-)

-- 
2.44.0



[PATCH 2/2] powerpc/kexec_file: fix cpus node update to FDT

2024-05-08 Thread Sourabh Jain
While updating the cpus node, commit 40c753993e3a ("powerpc/kexec_file:
 Use current CPU info while setting up FDT") first deletes all subnodes
under the /cpus node. However, while adding sub-nodes back, it missed
adding cpus subnodes whose device_type != "cpu", such as l2-cache*,
l3-cache*, ibm,powerpc-cpu-features.

Fix this by only deleting cpus sub-nodes of device_type == "cpus" and
then adding all available nodes with device_type == "cpu".

Fixes: 40c753993e3a ("powerpc/kexec_file: Use current CPU info while setting up 
FDT")
Cc: Aditya Gupta 
Cc: Hari Bathini 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: "Naveen N. Rao" 
Signed-off-by: Sourabh Jain 
---
 arch/powerpc/kexec/core_64.c | 53 +---
 1 file changed, 37 insertions(+), 16 deletions(-)

diff --git a/arch/powerpc/kexec/core_64.c b/arch/powerpc/kexec/core_64.c
index 85050be08a23..2e625c2cb6b9 100644
--- a/arch/powerpc/kexec/core_64.c
+++ b/arch/powerpc/kexec/core_64.c
@@ -456,9 +456,15 @@ static int add_node_props(void *fdt, int node_offset, 
const struct device_node *
  * @fdt:  Flattened device tree of the kernel.
  *
  * Returns 0 on success, negative errno on error.
+ *
+ * Note: expecting no subnodes under /cpus/ with device_type == "cpu".
+ * If this changes, update this function to include them.
  */
 int update_cpus_node(void *fdt)
 {
+   int prev_node_offset;
+   const char *device_type;
+   const struct fdt_property *prop;
struct device_node *cpus_node, *dn;
int cpus_offset, cpus_subnode_offset, ret = 0;
 
@@ -469,30 +475,44 @@ int update_cpus_node(void *fdt)
return cpus_offset;
}
 
-   if (cpus_offset > 0) {
-   ret = fdt_del_node(fdt, cpus_offset);
+   prev_node_offset = cpus_offset;
+   /* Delete sub-nodes of /cpus node with device_type == "cpu" */
+   for (cpus_subnode_offset = fdt_first_subnode(fdt, cpus_offset); 
cpus_subnode_offset >= 0;) {
+   /* Ignore nodes that do not have a device_type property or 
device_type != "cpu" */
+   prop = fdt_get_property(fdt, cpus_subnode_offset, 
"device_type", NULL);
+   if (!prop || strcmp(prop->data, "cpu")) {
+   prev_node_offset = cpus_subnode_offset;
+   goto next_node;
+   }
+
+   ret = fdt_del_node(fdt, cpus_subnode_offset);
if (ret < 0) {
-   pr_err("Error deleting /cpus node: %s\n", 
fdt_strerror(ret));
-   return -EINVAL;
+   pr_err("Failed to delete a cpus sub-node: %s\n", 
fdt_strerror(ret));
+   return ret;
}
+next_node:
+   if (prev_node_offset == cpus_offset)
+   cpus_subnode_offset = fdt_first_subnode(fdt, 
cpus_offset);
+   else
+   cpus_subnode_offset = fdt_next_subnode(fdt, 
prev_node_offset);
}
 
-   /* Add cpus node to fdt */
-   cpus_offset = fdt_add_subnode(fdt, fdt_path_offset(fdt, "/"), "cpus");
-   if (cpus_offset < 0) {
-   pr_err("Error creating /cpus node: %s\n", 
fdt_strerror(cpus_offset));
+   cpus_node = of_find_node_by_path("/cpus");
+   /* Fail here to avoid kexec/kdump kernel boot hung */
+   if (!cpus_node) {
+   pr_err("No /cpus node found\n");
return -EINVAL;
}
 
-   /* Add cpus node properties */
-   cpus_node = of_find_node_by_path("/cpus");
-   ret = add_node_props(fdt, cpus_offset, cpus_node);
-   of_node_put(cpus_node);
-   if (ret < 0)
-   return ret;
+   /* Add all /cpus sub-nodes of device_type == "cpu" to FDT */
+   for_each_child_of_node(cpus_node, dn) {
+   /* Ignore device nodes that do not have a device_type property
+* or device_type != "cpu".
+*/
+   device_type = of_get_property(dn, "device_type", NULL);
+   if (!device_type || strcmp(device_type, "cpu"))
+   continue;
 
-   /* Loop through all subnodes of cpus and add them to fdt */
-   for_each_node_by_type(dn, "cpu") {
cpus_subnode_offset = fdt_add_subnode(fdt, cpus_offset, 
dn->full_name);
if (cpus_subnode_offset < 0) {
pr_err("Unable to add %s subnode: %s\n", dn->full_name,
@@ -506,6 +526,7 @@ int update_cpus_node(void *fdt)
goto out;
}
 out:
+   of_node_put(cpus_node);
of_node_put(dn);
return ret;
 }
-- 
2.44.0



[PATCH 0/2] powerpc: kexec fixes

2024-05-08 Thread Sourabh Jain
Patch series fixes two kexec issues.

01/02: Update extra size calculation for kexec FDT to avoid kexec load
failure due to FDT_ERR_NOSPACE while including CPU nodes added post
boot and reserved memory ranges.

02/02: Fix update_cpus_node/core_64.c function to include missing device
nodes under /cpus node with device_type != "cpu".

Note: this patch series is rebased on top of the linux-next/master
to avoid the conflict with the below patch series:
https://lore.kernel.org/all/171509287314.62008.11812494124513471250.b4...@ellerman.id.au/

Cc: Aditya Gupta 
Cc: Hari Bathini 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: "Naveen N. Rao" 

Sourabh Jain (2):
  powerpc/kexec_file: fix extra size calculation for kexec FDT
  powerpc/kexec_file: fix cpus node update to FDT

 arch/powerpc/include/asm/kexec.h  |  6 ++--
 arch/powerpc/kexec/core_64.c  | 53 +--
 arch/powerpc/kexec/elf_64.c   | 12 +--
 arch/powerpc/kexec/file_load_64.c | 53 +--
 4 files changed, 70 insertions(+), 54 deletions(-)

-- 
2.44.0



[PATCH 1/2] powerpc/kexec_file: fix extra size calculation for kexec FDT

2024-05-08 Thread Sourabh Jain
While setting up the FDT for kexec, CPU nodes that are added after the
system boots and reserved memory ranges are incorporated into the
initial_boot_params (base FDT).

However, they are not taken into account when determining the additional
size needed for the kexec FDT. As a result, kexec fails to load,
generating the following error:

[1116.774451] Error updating memory reserve map: FDT_ERR_NOSPACE
kexec_file_load failed: No such process

Therefore, consider the extra size for CPU nodes added post-system boot
and reserved memory ranges while preparing the kexec FDT.

While adding a new parameter to the setup_new_fdt_ppc64 function, it was
noticed that there were a couple of unused parameters, so they were
removed.

Cc: Aditya Gupta 
Cc: Hari Bathini 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: "Naveen N. Rao" 
Signed-off-by: Sourabh Jain 
---
 arch/powerpc/include/asm/kexec.h  |  6 ++--
 arch/powerpc/kexec/elf_64.c   | 12 +--
 arch/powerpc/kexec/file_load_64.c | 53 +--
 3 files changed, 33 insertions(+), 38 deletions(-)

diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
index 95a98b390d62..270ee93a0f7d 100644
--- a/arch/powerpc/include/asm/kexec.h
+++ b/arch/powerpc/include/asm/kexec.h
@@ -103,10 +103,8 @@ int load_crashdump_segments_ppc64(struct kimage *image,
 int setup_purgatory_ppc64(struct kimage *image, const void *slave_code,
  const void *fdt, unsigned long kernel_load_addr,
  unsigned long fdt_load_addr);
-unsigned int kexec_extra_fdt_size_ppc64(struct kimage *image);
-int setup_new_fdt_ppc64(const struct kimage *image, void *fdt,
-   unsigned long initrd_load_addr,
-   unsigned long initrd_len, const char *cmdline);
+unsigned int kexec_extra_fdt_size_ppc64(struct kimage *image, struct crash_mem 
*rmem);
+int setup_new_fdt_ppc64(const struct kimage *image, void *fdt, struct 
crash_mem *rmem);
 #endif /* CONFIG_PPC64 */
 
 #endif /* CONFIG_KEXEC_FILE */
diff --git a/arch/powerpc/kexec/elf_64.c b/arch/powerpc/kexec/elf_64.c
index 214c071c58ed..5d6d616404cf 100644
--- a/arch/powerpc/kexec/elf_64.c
+++ b/arch/powerpc/kexec/elf_64.c
@@ -23,6 +23,7 @@
 #include 
 #include 
 #include 
+#include 
 
 static void *elf64_load(struct kimage *image, char *kernel_buf,
unsigned long kernel_len, char *initrd,
@@ -36,6 +37,7 @@ static void *elf64_load(struct kimage *image, char 
*kernel_buf,
const void *slave_code;
struct elfhdr ehdr;
char *modified_cmdline = NULL;
+   struct crash_mem *rmem = NULL;
struct kexec_elf_info elf_info;
struct kexec_buf kbuf = { .image = image, .buf_min = 0,
  .buf_max = ppc64_rma_size };
@@ -102,17 +104,20 @@ static void *elf64_load(struct kimage *image, char 
*kernel_buf,
kexec_dprintk("Loaded initrd at 0x%lx\n", initrd_load_addr);
}
 
+   ret = get_reserved_memory_ranges();
+   if (ret)
+   goto out;
+
fdt = of_kexec_alloc_and_setup_fdt(image, initrd_load_addr,
   initrd_len, cmdline,
-  kexec_extra_fdt_size_ppc64(image));
+  kexec_extra_fdt_size_ppc64(image, 
rmem));
if (!fdt) {
pr_err("Error setting up the new device tree.\n");
ret = -EINVAL;
goto out;
}
 
-   ret = setup_new_fdt_ppc64(image, fdt, initrd_load_addr,
- initrd_len, cmdline);
+   ret = setup_new_fdt_ppc64(image, fdt, rmem);
if (ret)
goto out_free_fdt;
 
@@ -146,6 +151,7 @@ static void *elf64_load(struct kimage *image, char 
*kernel_buf,
 out_free_fdt:
kvfree(fdt);
 out:
+   kfree(rmem);
kfree(modified_cmdline);
kexec_free_elf_info(_info);
 
diff --git a/arch/powerpc/kexec/file_load_64.c 
b/arch/powerpc/kexec/file_load_64.c
index 925a69ad2468..41be4546a34e 100644
--- a/arch/powerpc/kexec/file_load_64.c
+++ b/arch/powerpc/kexec/file_load_64.c
@@ -803,10 +803,9 @@ static unsigned int cpu_node_size(void)
return size;
 }
 
-static unsigned int kdump_extra_fdt_size_ppc64(struct kimage *image)
+static unsigned int kdump_extra_fdt_size_ppc64(struct kimage *image, unsigned 
int cpu_nodes)
 {
-   unsigned int cpu_nodes, extra_size = 0;
-   struct device_node *dn;
+   unsigned int extra_size = 0;
u64 usm_entries;
 #ifdef CONFIG_CRASH_HOTPLUG
unsigned int possible_cpu_nodes;
@@ -826,18 +825,6 @@ static unsigned int kdump_extra_fdt_size_ppc64(struct 
kimage *image)
extra_size += (unsigned int)(usm_entries * sizeof(u64));
}
 
-   /*
-* Get the number of CPU nodes in the current DT. This allows to
-* reserve places for CPU n

[PATCH] powerpc/crash: remove unnecessary NULL check before kvfree()

2024-05-02 Thread Sourabh Jain
Fix the following coccicheck build warning:

arch/powerpc/kexec/crash.c:488:2-8: WARNING: NULL check before some
freeing functions is not needed.

Reported-by: kernel test robot 
Closes: 
https://lore.kernel.org/oe-kbuild-all/202404261048.skfv5ddb-...@intel.com/
Cc: Michael Ellerman 
Cc: Stephen Rothwell 
Signed-off-by: Sourabh Jain 
---
 arch/powerpc/kexec/crash.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/powerpc/kexec/crash.c b/arch/powerpc/kexec/crash.c
index 21b193e938a3..9ac3266e4965 100644
--- a/arch/powerpc/kexec/crash.c
+++ b/arch/powerpc/kexec/crash.c
@@ -484,8 +484,7 @@ static void update_crash_elfcorehdr(struct kimage *image, 
struct memory_notify *
}
 out:
kvfree(cmem);
-   if (elfbuf)
-   kvfree(elfbuf);
+   kvfree(elfbuf);
 }
 
 /**
-- 
2.44.0



[PATCH v19 6/6] powerpc/crash: add crash memory hotplug support

2024-04-26 Thread Sourabh Jain
Extend the arch crash hotplug handler, as introduced by the patch title
("powerpc: add crash CPU hotplug support"), to also support memory
add/remove events.

Elfcorehdr describes the memory of the crash kernel to capture the
kernel; hence, it needs to be updated if memory resources change due to
memory add/remove events. Therefore, arch_crash_handle_hotplug_event()
is updated to recreate the elfcorehdr and replace it with the previous
one on memory add/remove events.

The memblock list is used to prepare the elfcorehdr. In the case of
memory hot remove, the memblock list is updated after the arch crash
hotplug handler is triggered, as depicted in Figure 1. Thus, the
hot-removed memory is explicitly removed from the crash memory ranges
to ensure that the memory ranges added to elfcorehdr do not include the
hot-removed memory.

Memory remove
  |
  v
Offline pages
  |
  v
 Initiate memory notify call <> crash hotplug handler
 chain for MEM_OFFLINE event
  |
  v
 Update memblock list

Figure 1

There are two system calls, `kexec_file_load` and `kexec_load`, used to
load the kdump image. A few changes have been made to ensure that the
kernel can safely update the elfcorehdr component of the kdump image for
both system calls.

For the kexec_file_load syscall, kdump image is prepared in the kernel.
To support an increasing number of memory regions, the elfcorehdr is
built with extra buffer space to ensure that it can accommodate
additional memory ranges in future.

For the kexec_load syscall, the elfcorehdr is updated only if the
KEXEC_CRASH_HOTPLUG_SUPPORT kexec flag is passed to the kernel by the
kexec tool. Passing this flag to the kernel indicates that the
elfcorehdr is built to accommodate additional memory ranges and the
elfcorehdr segment is not considered for SHA calculation, making it safe
to update.

The changes related to this feature are kept under the CRASH_HOTPLUG
config, and it is enabled by default.

Signed-off-by: Sourabh Jain 
Acked-by: Hari Bathini 
Cc: Akhil Raj 
Cc: Andrew Morton 
Cc: Aneesh Kumar K.V 
Cc: Baoquan He 
Cc: Borislav Petkov (AMD) 
Cc: Boris Ostrovsky 
Cc: Christophe Leroy 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Hildenbrand 
Cc: Greg Kroah-Hartman 
Cc: Laurent Dufour 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Mimi Zohar 
Cc: Naveen N Rao 
Cc: Oscar Salvador 
Cc: Stephen Rothwell 
Cc: Thomas Gleixner 
Cc: Valentin Schneider 
Cc: Vivek Goyal 
Cc: ke...@lists.infradead.org
Cc: x...@kernel.org
---

Changes in v19:
* Fix a build warning: remove NULL check before freeing memory for
  elfbuf in update_crash_elfcorehdr function.

 arch/powerpc/include/asm/kexec.h|  3 +
 arch/powerpc/include/asm/kexec_ranges.h |  1 +
 arch/powerpc/kexec/crash.c  | 94 -
 arch/powerpc/kexec/file_load_64.c   | 20 +-
 arch/powerpc/kexec/ranges.c | 85 ++
 5 files changed, 201 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
index e75970351bcd..95a98b390d62 100644
--- a/arch/powerpc/include/asm/kexec.h
+++ b/arch/powerpc/include/asm/kexec.h
@@ -141,6 +141,9 @@ void arch_crash_handle_hotplug_event(struct kimage *image, 
void *arg);
 
 int arch_crash_hotplug_support(struct kimage *image, unsigned long 
kexec_flags);
 #define arch_crash_hotplug_support arch_crash_hotplug_support
+
+unsigned int arch_crash_get_elfcorehdr_size(void);
+#define crash_get_elfcorehdr_size arch_crash_get_elfcorehdr_size
 #endif /* CONFIG_CRASH_HOTPLUG */
 
 extern int crashing_cpu;
diff --git a/arch/powerpc/include/asm/kexec_ranges.h 
b/arch/powerpc/include/asm/kexec_ranges.h
index 8489e844b447..14055896cbcb 100644
--- a/arch/powerpc/include/asm/kexec_ranges.h
+++ b/arch/powerpc/include/asm/kexec_ranges.h
@@ -7,6 +7,7 @@
 void sort_memory_ranges(struct crash_mem *mrngs, bool merge);
 struct crash_mem *realloc_mem_ranges(struct crash_mem **mem_ranges);
 int add_mem_range(struct crash_mem **mem_ranges, u64 base, u64 size);
+int remove_mem_range(struct crash_mem **mem_ranges, u64 base, u64 size);
 int get_exclude_memory_ranges(struct crash_mem **mem_ranges);
 int get_reserved_memory_ranges(struct crash_mem **mem_ranges);
 int get_crash_memory_ranges(struct crash_mem **mem_ranges);
diff --git a/arch/powerpc/kexec/crash.c b/arch/powerpc/kexec/crash.c
index 8938a19af12f..9ac3266e4965 100644
--- a/arch/powerpc/kexec/crash.c
+++ b/arch/powerpc/kexec/crash.c
@@ -17,6 +17,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -25,6 +26,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /*
  * The primary CPU waits a while for all secondary CPUs to enter. This is to
@@ -398,6 +400,93 @@ void default_machine_crash_shutdown(struct pt_regs *regs)
 #undef pr_fmt
 #define pr_fmt(fmt) "crash hp: " fmt
 
+/*
+ * Advertise preferred elfcorehdr size to userspace via
+ * /sys/kernel/c

[PATCH v19 5/6] powerpc/crash: add crash CPU hotplug support

2024-04-26 Thread Sourabh Jain
Due to CPU/Memory hotplug or online/offline events, the elfcorehdr
(which describes the CPUs and memory of the crashed kernel) and FDT
(Flattened Device Tree) of kdump image becomes outdated. Consequently,
attempting dump collection with an outdated elfcorehdr or FDT can lead
to failed or inaccurate dump collection.

Going forward, CPU hotplug or online/offline events are referred as
CPU/Memory add/remove events.

The current solution to address the above issue involves monitoring the
CPU/Memory add/remove events in userspace using udev rules and whenever
there are changes in CPU and memory resources, the entire kdump image
is loaded again. The kdump image includes kernel, initrd, elfcorehdr,
FDT, purgatory. Given that only elfcorehdr and FDT get outdated due to
CPU/Memory add/remove events, reloading the entire kdump image is
inefficient. More importantly, kdump remains inactive for a substantial
amount of time until the kdump reload completes.

To address the aforementioned issue, commit 247262756121 ("crash: add
generic infrastructure for crash hotplug support") added a generic
infrastructure that allows architectures to selectively update the kdump
image component during CPU or memory add/remove events within the kernel
itself.

In the event of a CPU or memory add/remove events, the generic crash
hotplug event handler, `crash_handle_hotplug_event()`, is triggered. It
then acquires the necessary locks to update the kdump image and invokes
the architecture-specific crash hotplug handler,
`arch_crash_handle_hotplug_event()`, to update the required kdump image
components.

This patch adds crash hotplug handler for PowerPC and enable support to
update the kdump image on CPU add/remove events. Support for memory
add/remove events is added in a subsequent patch with the title
"powerpc: add crash memory hotplug support"

As mentioned earlier, only the elfcorehdr and FDT kdump image components
need to be updated in the event of CPU or memory add/remove events.
However, on PowerPC architecture crash hotplug handler only updates the
FDT to enable crash hotplug support for CPU add/remove events. Here's
why.

The elfcorehdr on PowerPC is built with possible CPUs, and thus, it does
not need an update on CPU add/remove events. On the other hand, the FDT
needs to be updated on CPU add events to include the newly added CPU. If
the FDT is not updated and the kernel crashes on a newly added CPU, the
kdump kernel will fail to boot due to the unavailability of the crashing
CPU in the FDT. During the early boot, it is expected that the boot CPU
must be a part of the FDT; otherwise, the kernel will raise a BUG and
fail to boot. For more information, refer to commit 36ae37e3436b0
("powerpc: Make boot_cpuid common between 32 and 64-bit"). Since it is
okay to have an offline CPU in the kdump FDT, no action is taken in case
of CPU removal.

There are two system calls, `kexec_file_load` and `kexec_load`, used to
load the kdump image. Few changes have been made to ensure kernel can
safely update the FDT of kdump image loaded using both system calls.

For kexec_file_load syscall the kdump image is prepared in kernel. So to
support an increasing number of CPUs, the FDT is constructed with extra
buffer space to ensure it can accommodate a possible number of CPU
nodes. Additionally, a call to fdt_pack (which trims the unused space
once the FDT is prepared) is avoided if this feature is enabled.

For the kexec_load syscall, the FDT is updated only if the
KEXEC_CRASH_HOTPLUG_SUPPORT kexec flag is passed to the kernel by
userspace (kexec tools). When userspace passes this flag to the kernel,
it indicates that the FDT is built to accommodate possible CPUs, and the
FDT segment is excluded from SHA calculation, making it safe to update.

The changes related to this feature are kept under the CRASH_HOTPLUG
config, and it is enabled by default.

Signed-off-by: Sourabh Jain 
Acked-by: Hari Bathini 
Cc: Akhil Raj 
Cc: Andrew Morton 
Cc: Aneesh Kumar K.V 
Cc: Baoquan He 
Cc: Borislav Petkov (AMD) 
Cc: Boris Ostrovsky 
Cc: Christophe Leroy 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Hildenbrand 
Cc: Greg Kroah-Hartman 
Cc: Laurent Dufour 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Mimi Zohar 
Cc: Naveen N Rao 
Cc: Oscar Salvador 
Cc: Stephen Rothwell 
Cc: Thomas Gleixner 
Cc: Valentin Schneider 
Cc: Vivek Goyal 
Cc: ke...@lists.infradead.org
Cc: x...@kernel.org
---

* No changes in v19.

 arch/powerpc/Kconfig  |   4 ++
 arch/powerpc/include/asm/kexec.h  |   8 +++
 arch/powerpc/kexec/crash.c| 103 ++
 arch/powerpc/kexec/elf_64.c   |   3 +-
 arch/powerpc/kexec/file_load_64.c |  17 +
 5 files changed, 134 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 1c4be3373686..a1a3b3363008 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -686,6 +686,10 @@ config ARCH_SELECTS_CRASH_DUMP
depend

[PATCH v19 3/6] powerpc/kexec: move *_memory_ranges functions to ranges.c

2024-04-26 Thread Sourabh Jain
Move the following functions form kexec/{file_load_64.c => ranges.c} and
make them public so that components other than KEXEC_FILE can also use
these functions.
1. get_exclude_memory_ranges
2. get_reserved_memory_ranges
3. get_crash_memory_ranges
4. get_usable_memory_ranges

Later in the series get_crash_memory_ranges function is utilized for
in-kernel updates to kdump image during CPU/Memory hotplug or
online/offline events for both kexec_load and kexec_file_load syscalls.

Since the above functions are moved to ranges.c, some of the helper
functions in ranges.c are no longer required to be public. Mark them as
static and removed them from kexec_ranges.h header file.

Finally, remove the CONFIG_KEXEC_FILE build dependency for range.c
because it is required for other config, such as CONFIG_CRASH_DUMP.

No functional changes are intended.

Signed-off-by: Sourabh Jain 
Acked-by: Hari Bathini 
Cc: Akhil Raj 
Cc: Andrew Morton 
Cc: Aneesh Kumar K.V 
Cc: Baoquan He 
Cc: Borislav Petkov (AMD) 
Cc: Boris Ostrovsky 
Cc: Christophe Leroy 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Hildenbrand 
Cc: Greg Kroah-Hartman 
Cc: Laurent Dufour 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Mimi Zohar 
Cc: Naveen N Rao 
Cc: Oscar Salvador 
Cc: Stephen Rothwell 
Cc: Thomas Gleixner 
Cc: Valentin Schneider 
Cc: Vivek Goyal 
Cc: ke...@lists.infradead.org
Cc: x...@kernel.org
---

* No changes in v19.

 arch/powerpc/include/asm/kexec_ranges.h |  19 +-
 arch/powerpc/kexec/Makefile |   4 +-
 arch/powerpc/kexec/file_load_64.c   | 190 
 arch/powerpc/kexec/ranges.c | 227 +++-
 4 files changed, 224 insertions(+), 216 deletions(-)

diff --git a/arch/powerpc/include/asm/kexec_ranges.h 
b/arch/powerpc/include/asm/kexec_ranges.h
index f83866a19e87..8489e844b447 100644
--- a/arch/powerpc/include/asm/kexec_ranges.h
+++ b/arch/powerpc/include/asm/kexec_ranges.h
@@ -7,19 +7,8 @@
 void sort_memory_ranges(struct crash_mem *mrngs, bool merge);
 struct crash_mem *realloc_mem_ranges(struct crash_mem **mem_ranges);
 int add_mem_range(struct crash_mem **mem_ranges, u64 base, u64 size);
-int add_tce_mem_ranges(struct crash_mem **mem_ranges);
-int add_initrd_mem_range(struct crash_mem **mem_ranges);
-#ifdef CONFIG_PPC_64S_HASH_MMU
-int add_htab_mem_range(struct crash_mem **mem_ranges);
-#else
-static inline int add_htab_mem_range(struct crash_mem **mem_ranges)
-{
-   return 0;
-}
-#endif
-int add_kernel_mem_range(struct crash_mem **mem_ranges);
-int add_rtas_mem_range(struct crash_mem **mem_ranges);
-int add_opal_mem_range(struct crash_mem **mem_ranges);
-int add_reserved_mem_ranges(struct crash_mem **mem_ranges);
-
+int get_exclude_memory_ranges(struct crash_mem **mem_ranges);
+int get_reserved_memory_ranges(struct crash_mem **mem_ranges);
+int get_crash_memory_ranges(struct crash_mem **mem_ranges);
+int get_usable_memory_ranges(struct crash_mem **mem_ranges);
 #endif /* _ASM_POWERPC_KEXEC_RANGES_H */
diff --git a/arch/powerpc/kexec/Makefile b/arch/powerpc/kexec/Makefile
index 8e469c4da3f8..470eb0453e17 100644
--- a/arch/powerpc/kexec/Makefile
+++ b/arch/powerpc/kexec/Makefile
@@ -3,11 +3,11 @@
 # Makefile for the linux kernel.
 #
 
-obj-y  += core.o core_$(BITS).o
+obj-y  += core.o core_$(BITS).o ranges.o
 
 obj-$(CONFIG_PPC32)+= relocate_32.o
 
-obj-$(CONFIG_KEXEC_FILE)   += file_load.o ranges.o file_load_$(BITS).o 
elf_$(BITS).o
+obj-$(CONFIG_KEXEC_FILE)   += file_load.o file_load_$(BITS).o elf_$(BITS).o
 obj-$(CONFIG_VMCORE_INFO)  += vmcore_info.o
 obj-$(CONFIG_CRASH_DUMP)   += crash.o
 
diff --git a/arch/powerpc/kexec/file_load_64.c 
b/arch/powerpc/kexec/file_load_64.c
index 1bc65de6174f..6a01f62b8fcf 100644
--- a/arch/powerpc/kexec/file_load_64.c
+++ b/arch/powerpc/kexec/file_load_64.c
@@ -47,83 +47,6 @@ const struct kexec_file_ops * const kexec_file_loaders[] = {
NULL
 };
 
-/**
- * get_exclude_memory_ranges - Get exclude memory ranges. This list includes
- * regions like opal/rtas, tce-table, initrd,
- * kernel, htab which should be avoided while
- * setting up kexec load segments.
- * @mem_ranges:Range list to add the memory ranges to.
- *
- * Returns 0 on success, negative errno on error.
- */
-static int get_exclude_memory_ranges(struct crash_mem **mem_ranges)
-{
-   int ret;
-
-   ret = add_tce_mem_ranges(mem_ranges);
-   if (ret)
-   goto out;
-
-   ret = add_initrd_mem_range(mem_ranges);
-   if (ret)
-   goto out;
-
-   ret = add_htab_mem_range(mem_ranges);
-   if (ret)
-   goto out;
-
-   ret = add_kernel_mem_range(mem_ranges);
-   if (ret)
-   goto out;
-
-   ret = add_rtas_mem_range(mem_ranges);
-   if (ret)
-   goto out;
-
-   ret = add_opal_mem_range(mem_ran

[PATCH v19 4/6] PowerPC/kexec: make the update_cpus_node() function public

2024-04-26 Thread Sourabh Jain
Move the update_cpus_node() from kexec/{file_load_64.c => core_64.c}
to allow other kexec components to use it.

Later in the series, this function is used for in-kernel updates
to the kdump image during CPU/memory hotplug or online/offline events for
both kexec_load and kexec_file_load syscalls.

No functional changes are intended.

Signed-off-by: Sourabh Jain 
Acked-by: Hari Bathini 
Cc: Akhil Raj 
Cc: Andrew Morton 
Cc: Aneesh Kumar K.V 
Cc: Baoquan He 
Cc: Borislav Petkov (AMD) 
Cc: Boris Ostrovsky 
Cc: Christophe Leroy 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Hildenbrand 
Cc: Greg Kroah-Hartman 
Cc: Laurent Dufour 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Mimi Zohar 
Cc: Naveen N Rao 
Cc: Oscar Salvador 
Cc: Stephen Rothwell 
Cc: Thomas Gleixner 
Cc: Valentin Schneider 
Cc: Vivek Goyal 
Cc: ke...@lists.infradead.org
Cc: x...@kernel.org
---

* No changes in v19.

 arch/powerpc/include/asm/kexec.h  |  4 ++
 arch/powerpc/kexec/core_64.c  | 91 +++
 arch/powerpc/kexec/file_load_64.c | 87 -
 3 files changed, 95 insertions(+), 87 deletions(-)

diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
index fdb90e24dc74..d9ff4d0e392d 100644
--- a/arch/powerpc/include/asm/kexec.h
+++ b/arch/powerpc/include/asm/kexec.h
@@ -185,6 +185,10 @@ static inline void crash_send_ipi(void 
(*crash_ipi_callback)(struct pt_regs *))
 
 #endif /* CONFIG_CRASH_DUMP */
 
+#if defined(CONFIG_KEXEC_FILE) || defined(CONFIG_CRASH_DUMP)
+int update_cpus_node(void *fdt);
+#endif
+
 #ifdef CONFIG_PPC_BOOK3S_64
 #include 
 #endif
diff --git a/arch/powerpc/kexec/core_64.c b/arch/powerpc/kexec/core_64.c
index 762e4d09aacf..85050be08a23 100644
--- a/arch/powerpc/kexec/core_64.c
+++ b/arch/powerpc/kexec/core_64.c
@@ -17,6 +17,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -30,6 +31,7 @@
 #include 
 #include 
 #include 
+#include 
 
 int machine_kexec_prepare(struct kimage *image)
 {
@@ -419,3 +421,92 @@ static int __init export_htab_values(void)
 }
 late_initcall(export_htab_values);
 #endif /* CONFIG_PPC_64S_HASH_MMU */
+
+#if defined(CONFIG_KEXEC_FILE) || defined(CONFIG_CRASH_DUMP)
+/**
+ * add_node_props - Reads node properties from device node structure and add
+ *  them to fdt.
+ * @fdt:Flattened device tree of the kernel
+ * @node_offset:offset of the node to add a property at
+ * @dn: device node pointer
+ *
+ * Returns 0 on success, negative errno on error.
+ */
+static int add_node_props(void *fdt, int node_offset, const struct device_node 
*dn)
+{
+   int ret = 0;
+   struct property *pp;
+
+   if (!dn)
+   return -EINVAL;
+
+   for_each_property_of_node(dn, pp) {
+   ret = fdt_setprop(fdt, node_offset, pp->name, pp->value, 
pp->length);
+   if (ret < 0) {
+   pr_err("Unable to add %s property: %s\n", pp->name, 
fdt_strerror(ret));
+   return ret;
+   }
+   }
+   return ret;
+}
+
+/**
+ * update_cpus_node - Update cpus node of flattened device tree using of_root
+ *device node.
+ * @fdt:  Flattened device tree of the kernel.
+ *
+ * Returns 0 on success, negative errno on error.
+ */
+int update_cpus_node(void *fdt)
+{
+   struct device_node *cpus_node, *dn;
+   int cpus_offset, cpus_subnode_offset, ret = 0;
+
+   cpus_offset = fdt_path_offset(fdt, "/cpus");
+   if (cpus_offset < 0 && cpus_offset != -FDT_ERR_NOTFOUND) {
+   pr_err("Malformed device tree: error reading /cpus node: %s\n",
+  fdt_strerror(cpus_offset));
+   return cpus_offset;
+   }
+
+   if (cpus_offset > 0) {
+   ret = fdt_del_node(fdt, cpus_offset);
+   if (ret < 0) {
+   pr_err("Error deleting /cpus node: %s\n", 
fdt_strerror(ret));
+   return -EINVAL;
+   }
+   }
+
+   /* Add cpus node to fdt */
+   cpus_offset = fdt_add_subnode(fdt, fdt_path_offset(fdt, "/"), "cpus");
+   if (cpus_offset < 0) {
+   pr_err("Error creating /cpus node: %s\n", 
fdt_strerror(cpus_offset));
+   return -EINVAL;
+   }
+
+   /* Add cpus node properties */
+   cpus_node = of_find_node_by_path("/cpus");
+   ret = add_node_props(fdt, cpus_offset, cpus_node);
+   of_node_put(cpus_node);
+   if (ret < 0)
+   return ret;
+
+   /* Loop through all subnodes of cpus and add them to fdt */
+   for_each_node_by_type(dn, "cpu") {
+   cpus_subnode_offset = fdt_add_subnode(fdt, cpus_offset, 
dn->full_name);
+   if (cpus_subnode_offset < 0) {
+   pr_err("Unable to add %s subno

[PATCH v19 2/6] crash: add a new kexec flag for hotplug support

2024-04-26 Thread Sourabh Jain
Commit a72bbec70da2 ("crash: hotplug support for kexec_load()")
introduced a new kexec flag, `KEXEC_UPDATE_ELFCOREHDR`. Kexec tool uses
this flag to indicate to the kernel that it is safe to modify the
elfcorehdr of the kdump image loaded using the kexec_load system call.

However, it is possible that architectures may need to update kexec
segments other then elfcorehdr. For example, FDT (Flatten Device Tree)
on PowerPC. Introducing a new kexec flag for every new kexec segment
may not be a good solution. Hence, a generic kexec flag bit,
`KEXEC_CRASH_HOTPLUG_SUPPORT`, is introduced to share the CPU/Memory
hotplug support intent between the kexec tool and the kernel for the
kexec_load system call.

Now we have two kexec flags that enables crash hotplug support for
kexec_load system call. First is KEXEC_UPDATE_ELFCOREHDR (only used in
x86), and second is KEXEC_CRASH_HOTPLUG_SUPPORT (for all architectures).

To simplify the process of finding and reporting the crash hotplug
support the following changes are introduced.

1. Define arch specific function to process the kexec flags and
   determine crash hotplug support

2. Rename the @update_elfcorehdr member of struct kimage to
   @hotplug_support and populate it for both kexec_load and
   kexec_file_load syscalls, because architecture can update more than
   one kexec segment

3. Let generic function crash_check_hotplug_support report hotplug
   support for loaded kdump image based on value of @hotplug_support

To bring the x86 crash hotplug support in line with the above points,
the following changes have been made:

- Introduce the arch_crash_hotplug_support function to process kexec
  flags and determine crash hotplug support

- Remove the arch_crash_hotplug_[cpu|memory]_support functions

Signed-off-by: Sourabh Jain 
Acked-by: Baoquan He 
Acked-by: Hari Bathini 
Cc: Akhil Raj 
Cc: Andrew Morton 
Cc: Aneesh Kumar K.V 
Cc: Borislav Petkov (AMD) 
Cc: Boris Ostrovsky 
Cc: Christophe Leroy 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Hildenbrand 
Cc: Eric DeVolder 
Cc: Greg Kroah-Hartman 
Cc: Laurent Dufour 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Mimi Zohar 
Cc: Naveen N Rao 
Cc: Oscar Salvador 
Cc: Stephen Rothwell 
Cc: Thomas Gleixner 
Cc: Valentin Schneider 
Cc: Vivek Goyal 
Cc: ke...@lists.infradead.org
Cc: x...@kernel.org
---

* No changes in v19.

 arch/x86/include/asm/kexec.h | 11 ++-
 arch/x86/kernel/crash.c  | 28 +---
 drivers/base/cpu.c   |  2 +-
 drivers/base/memory.c|  2 +-
 include/linux/crash_core.h   | 13 ++---
 include/linux/kexec.h| 11 +++
 include/uapi/linux/kexec.h   |  1 +
 kernel/crash_core.c  | 15 ++-
 kernel/kexec.c   |  4 ++--
 kernel/kexec_file.c  |  5 +
 10 files changed, 48 insertions(+), 44 deletions(-)

diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
index cb1320ebbc23..ae5482a2f0ca 100644
--- a/arch/x86/include/asm/kexec.h
+++ b/arch/x86/include/asm/kexec.h
@@ -210,15 +210,8 @@ extern void kdump_nmi_shootdown_cpus(void);
 void arch_crash_handle_hotplug_event(struct kimage *image, void *arg);
 #define arch_crash_handle_hotplug_event arch_crash_handle_hotplug_event
 
-#ifdef CONFIG_HOTPLUG_CPU
-int arch_crash_hotplug_cpu_support(void);
-#define crash_hotplug_cpu_support arch_crash_hotplug_cpu_support
-#endif
-
-#ifdef CONFIG_MEMORY_HOTPLUG
-int arch_crash_hotplug_memory_support(void);
-#define crash_hotplug_memory_support arch_crash_hotplug_memory_support
-#endif
+int arch_crash_hotplug_support(struct kimage *image, unsigned long 
kexec_flags);
+#define arch_crash_hotplug_support arch_crash_hotplug_support
 
 unsigned int arch_crash_get_elfcorehdr_size(void);
 #define crash_get_elfcorehdr_size arch_crash_get_elfcorehdr_size
diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index 2a682fe86352..f06501445cd9 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -402,20 +402,26 @@ int crash_load_segments(struct kimage *image)
 #undef pr_fmt
 #define pr_fmt(fmt) "crash hp: " fmt
 
-/* These functions provide the value for the sysfs crash_hotplug nodes */
-#ifdef CONFIG_HOTPLUG_CPU
-int arch_crash_hotplug_cpu_support(void)
+int arch_crash_hotplug_support(struct kimage *image, unsigned long kexec_flags)
 {
-   return crash_check_update_elfcorehdr();
-}
-#endif
 
-#ifdef CONFIG_MEMORY_HOTPLUG
-int arch_crash_hotplug_memory_support(void)
-{
-   return crash_check_update_elfcorehdr();
-}
+#ifdef CONFIG_KEXEC_FILE
+   if (image->file_mode)
+   return 1;
 #endif
+   /*
+* Initially, crash hotplug support for kexec_load was added
+* with the KEXEC_UPDATE_ELFCOREHDR flag. Later, this
+* functionality was expanded to accommodate multiple kexec
+* segment updates, leading to the introduction of the
+* KEXEC_CRASH_HOTPLUG_SUPPORT kexec flag bit. Consequently,
+* when the 

[PATCH v19 0/6]powerpc/crash: Kernel handling of CPU and memory hotplug

2024-04-26 Thread Sourabh Jain
er files
  - Rebase to v6.7-rc5

v13:
  - Fix a build warning, take ranges.c out of CONFIG_KEXEC_FILE
  - Rebase to v6.7-rc4

v12:
  - A patch to add new kexec flags to support this feature on kexec_load
system call
  - Change in the way this feature is advertise to userspace for both
kexec_load syscall
  - Rebase to v6.6-rc7

v11:
  - Rebase to v6.4-rc6
  - The patch that introduced CONFIG_CRASH_HOTPLUG for PowerPC has been
removed. The config is now part of common configuration:
https://lore.kernel.org/all/87ilbpflsk.fsf@mail.lhotse/

v10:
  - Drop the patch that adds fdt_index attribute to struct kimage_arch
Find the fdt segment index when needed.
  - Added more details into commits messages.
  - Rebased onto 6.3.0-rc5

v9:
  - Removed patch to prepare elfcorehdr crash notes for possible CPUs.
The patch is moved to generic patch series that introduces generic
infrastructure for in kernel crash update.
  - Removed patch to pass the hotplug action type to the arch crash
hotplug handler function. The generic patch series has introduced
the hotplug action type in kimage struct.
  - Add detail commit message for better understanding.

v8:
  - Restrict fdt_index initialization to machine_kexec_post_load
it work for both kexec_load and kexec_file_load.[3/8] Laurent Dufour

  - Updated the logic to find the number of offline core. [6/8]

  - Changed the logic to find the elfcore program header to accommodate
future memory ranges due memory hotplug events. [8/8]

v7
  - added a new config to configure this feature
  - pass hotplug action type to arch specific handler

v6
  - Added crash memory hotplug support

v5:
  - Replace COFNIG_CRASH_HOTPLUG with CONFIG_HOTPLUG_CPU.
  - Move fdt segment identification for kexec_load case to load path
instead of crash hotplug handler
  - Keep new attribute defined under kimage_arch to track FDT segment
under CONFIG_HOTPLUG_CPU config.

v4:
  - Update the logic to find the additional space needed for hotadd CPUs
post kexec load. Refer "[RFC v4 PATCH 4/5] powerpc/crash hp: add crash
hotplug support for kexec_file_load" patch to know more about the
change.
  - Fix a couple of typo.
  - Replace pr_err to pr_info_once to warn user about memory hotplug
support.
  - In crash hotplug handle exit the for loop if FDT segment is found.

v3
  - Move fdt_index and fdt_index_vaild variables to kimage_arch struct.
  - Rebase patche on top of

https://lore.kernel.org/lkml/20220303162725.49640-1-eric.devol...@oracle.com/
  - Fixed warning reported by checpatch script

v2:
  - Use generic hotplug handler introduced by

https://lore.kernel.org/lkml/20220209195706.51522-1-eric.devol...@oracle.com/
a significant change from v1.

Cc: Akhil Raj 
Cc: Andrew Morton 
Cc: Aneesh Kumar K.V 
Cc: Baoquan He 
Cc: Borislav Petkov (AMD) 
Cc: Boris Ostrovsky 
Cc: Christophe Leroy 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Hildenbrand 
Cc: Greg Kroah-Hartman 
Cc: Hari Bathini 
Cc: Laurent Dufour 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Mimi Zohar 
Cc: Naveen N Rao 
Cc: Oscar Salvador 
Cc: Stephen Rothwell 
Cc: Thomas Gleixner 
Cc: Valentin Schneider 
Cc: Vivek Goyal 
Cc: ke...@lists.infradead.org
Cc: x...@kernel.org

Sourabh Jain (6):
  crash: forward memory_notify arg to arch crash hotplug handler
  crash: add a new kexec flag for hotplug support
  powerpc/kexec: move *_memory_ranges functions to ranges.c
  PowerPC/kexec: make the update_cpus_node() function public
  powerpc/crash: add crash CPU hotplug support
  powerpc/crash: add crash memory hotplug support

 arch/powerpc/Kconfig|   4 +
 arch/powerpc/include/asm/kexec.h|  15 ++
 arch/powerpc/include/asm/kexec_ranges.h |  20 +-
 arch/powerpc/kexec/Makefile |   4 +-
 arch/powerpc/kexec/core_64.c|  91 +++
 arch/powerpc/kexec/crash.c  | 195 +++
 arch/powerpc/kexec/elf_64.c |   3 +-
 arch/powerpc/kexec/file_load_64.c   | 314 +++-
 arch/powerpc/kexec/ranges.c | 312 ++-
 arch/x86/include/asm/kexec.h|  13 +-
 arch/x86/kernel/crash.c |  32 ++-
 drivers/base/cpu.c  |   2 +-
 drivers/base/memory.c   |   2 +-
 include/linux/crash_core.h  |  15 +-
 include/linux/kexec.h   |  11 +-
 include/uapi/linux/kexec.h  |   1 +
 kernel/crash_core.c |  29 +--
 kernel/kexec.c  |   4 +-
 kernel/kexec_file.c |   5 +
 19 files changed, 713 insertions(+), 359 deletions(-)

-- 
2.44.0



[PATCH v19 1/6] crash: forward memory_notify arg to arch crash hotplug handler

2024-04-26 Thread Sourabh Jain
In the event of memory hotplug or online/offline events, the crash
memory hotplug notifier `crash_memhp_notifier()` receives a
`memory_notify` object but doesn't forward that object to the
generic and architecture-specific crash hotplug handler.

The `memory_notify` object contains the starting PFN (Page Frame Number)
and the number of pages in the hot-removed memory. This information is
necessary for architectures like PowerPC to update/recreate the kdump
image, specifically `elfcorehdr`.

So update the function signature of `crash_handle_hotplug_event()` and
`arch_crash_handle_hotplug_event()` to accept the `memory_notify` object
as an argument from crash memory hotplug notifier.

Since no such object is available in the case of CPU hotplug event, the
crash CPU hotplug notifier `crash_cpuhp_online()` passes NULL to the
crash hotplug handler.

Signed-off-by: Sourabh Jain 
Acked-by: Baoquan He 
Acked-by: Hari Bathini 
Cc: Akhil Raj 
Cc: Andrew Morton 
Cc: Aneesh Kumar K.V 
Cc: Borislav Petkov (AMD) 
Cc: Boris Ostrovsky 
Cc: Christophe Leroy 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Hildenbrand 
Cc: Greg Kroah-Hartman 
Cc: Laurent Dufour 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Mimi Zohar 
Cc: Naveen N Rao 
Cc: Oscar Salvador 
Cc: Stephen Rothwell 
Cc: Thomas Gleixner 
Cc: Valentin Schneider 
Cc: Vivek Goyal 
Cc: ke...@lists.infradead.org
Cc: x...@kernel.org
---

* No changes in v19.

 arch/x86/include/asm/kexec.h |  2 +-
 arch/x86/kernel/crash.c  |  4 +++-
 include/linux/crash_core.h   |  2 +-
 kernel/crash_core.c  | 14 +++---
 4 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
index 91ca9a9ee3a2..cb1320ebbc23 100644
--- a/arch/x86/include/asm/kexec.h
+++ b/arch/x86/include/asm/kexec.h
@@ -207,7 +207,7 @@ int arch_kimage_file_post_load_cleanup(struct kimage 
*image);
 extern void kdump_nmi_shootdown_cpus(void);
 
 #ifdef CONFIG_CRASH_HOTPLUG
-void arch_crash_handle_hotplug_event(struct kimage *image);
+void arch_crash_handle_hotplug_event(struct kimage *image, void *arg);
 #define arch_crash_handle_hotplug_event arch_crash_handle_hotplug_event
 
 #ifdef CONFIG_HOTPLUG_CPU
diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index e74d0c4286c1..2a682fe86352 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -432,10 +432,12 @@ unsigned int arch_crash_get_elfcorehdr_size(void)
 /**
  * arch_crash_handle_hotplug_event() - Handle hotplug elfcorehdr changes
  * @image: a pointer to kexec_crash_image
+ * @arg: struct memory_notify handler for memory hotplug case and
+ *   NULL for CPU hotplug case.
  *
  * Prepare the new elfcorehdr and replace the existing elfcorehdr.
  */
-void arch_crash_handle_hotplug_event(struct kimage *image)
+void arch_crash_handle_hotplug_event(struct kimage *image, void *arg)
 {
void *elfbuf = NULL, *old_elfcorehdr;
unsigned long nr_mem_ranges;
diff --git a/include/linux/crash_core.h b/include/linux/crash_core.h
index d33352c2e386..647e928efee8 100644
--- a/include/linux/crash_core.h
+++ b/include/linux/crash_core.h
@@ -37,7 +37,7 @@ static inline void arch_kexec_unprotect_crashkres(void) { }
 
 
 #ifndef arch_crash_handle_hotplug_event
-static inline void arch_crash_handle_hotplug_event(struct kimage *image) { }
+static inline void arch_crash_handle_hotplug_event(struct kimage *image, void 
*arg) { }
 #endif
 
 int crash_check_update_elfcorehdr(void);
diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index 78b5dc7cee3a..70fa8111a9d6 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -534,7 +534,7 @@ int crash_check_update_elfcorehdr(void)
  * list of segments it checks (since the elfcorehdr changes and thus
  * would require an update to purgatory itself to update the digest).
  */
-static void crash_handle_hotplug_event(unsigned int hp_action, unsigned int 
cpu)
+static void crash_handle_hotplug_event(unsigned int hp_action, unsigned int 
cpu, void *arg)
 {
struct kimage *image;
 
@@ -596,7 +596,7 @@ static void crash_handle_hotplug_event(unsigned int 
hp_action, unsigned int cpu)
image->hp_action = hp_action;
 
/* Now invoke arch-specific update handler */
-   arch_crash_handle_hotplug_event(image);
+   arch_crash_handle_hotplug_event(image, arg);
 
/* No longer handling a hotplug event */
image->hp_action = KEXEC_CRASH_HP_NONE;
@@ -612,17 +612,17 @@ static void crash_handle_hotplug_event(unsigned int 
hp_action, unsigned int cpu)
crash_hotplug_unlock();
 }
 
-static int crash_memhp_notifier(struct notifier_block *nb, unsigned long val, 
void *v)
+static int crash_memhp_notifier(struct notifier_block *nb, unsigned long val, 
void *arg)
 {
switch (val) {
case MEM_ONLINE:
crash_handle_hotplug_event(KEXEC_CRASH_HP_ADD_MEMORY,
-   KEXEC_CRASH_HP_INVALID_CPU);
+   KEXEC_CRASH_HP_INVALID_CP

[PATCH v10 3/3] Documentation/powerpc: update fadump implementation details

2024-04-22 Thread Sourabh Jain
The patch titled ("powerpc: make fadump resilient with memory add/remove
events") has made significant changes to the implementation of fadump,
particularly on elfcorehdr creation and fadump crash info header
structure. Therefore, updating the fadump implementation documentation
to reflect those changes.

Following updates are done to firmware assisted dump documentation:

1. The elfcorehdr is no longer stored after fadump HDR in the reserved
   dump area. Instead, the second kernel dynamically allocates memory
   for the elfcorehdr within the address range from 0 to the boot memory
   size. Therefore, update figures 1 and 2 of Memory Reservation during
   the first and second kernels to reflect this change.

2. A version field has been added to the fadump header to manage the
   future changes to fadump crash info header structure without changing
   the fadump header magic number in the future. Therefore, remove the
   corresponding TODO from the document.

Signed-off-by: Sourabh Jain 
Cc: Aditya Gupta 
Cc: Aneesh Kumar K.V 
Cc: Hari Bathini 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Naveen N Rao 
---
 .../arch/powerpc/firmware-assisted-dump.rst   | 91 +--
 1 file changed, 42 insertions(+), 49 deletions(-)

diff --git a/Documentation/arch/powerpc/firmware-assisted-dump.rst 
b/Documentation/arch/powerpc/firmware-assisted-dump.rst
index e363fc48529a..7e37aadd1f77 100644
--- a/Documentation/arch/powerpc/firmware-assisted-dump.rst
+++ b/Documentation/arch/powerpc/firmware-assisted-dump.rst
@@ -134,12 +134,12 @@ that are run. If there is dump data, then the
 memory is held.
 
 If there is no waiting dump data, then only the memory required to
-hold CPU state, HPTE region, boot memory dump, FADump header and
-elfcore header, is usually reserved at an offset greater than boot
-memory size (see Fig. 1). This area is *not* released: this region
-will be kept permanently reserved, so that it can act as a receptacle
-for a copy of the boot memory content in addition to CPU state and
-HPTE region, in the case a crash does occur.
+hold CPU state, HPTE region, boot memory dump, and FADump header is
+usually reserved at an offset greater than boot memory size (see Fig. 1).
+This area is *not* released: this region will be kept permanently
+reserved, so that it can act as a receptacle for a copy of the boot
+memory content in addition to CPU state and HPTE region, in the case
+a crash does occur.
 
 Since this reserved memory area is used only after the system crash,
 there is no point in blocking this significant chunk of memory from
@@ -153,22 +153,22 @@ that were present in CMA region::
 
   o Memory Reservation during first kernel
 
-  Low memory Top of memory
-  0boot memory size   |<--- Reserved dump area --->|   |
-  |   |   |Permanent Reservation   |   |
-  V   V   ||   V
-  +---+-/ /---+---++---+-+-++--+
-  |   |   |///||  DUMP | HDR | ELF ||  |
-  +---+-/ /---+---++---+-+-++--+
-|   ^^ ^  ^   ^
-|   || |  |   |
-\  CPU  HPTE   /  |   |
- --   |   |
-  Boot memory content gets transferred|   |
-  to reserved area by firmware at the |   |
-  time of crash.  |   |
-  FADump Header   |
-   (meta area)|
+  Low memory  Top of memory
+  0boot memory size   |<-- Reserved dump area ->| |
+  |   |   |  Permanent Reservation  | |
+  V   V   | | V
+  +---+-/ /---+---++---+---++-+
+  |   |   |///||DUMP   |  HDR  || |
+  +---+-/ /---+---++---+---++-+
+|   ^^   ^ ^  ^
+|   ||   | |  |
+\  CPU  HPTE / |  |
+   |  |
+  Boot memory content gets transferred |  |
+  to reserved area by firmware at the  |  |
+  time of crash.   |  |
+   FADump Header  |
+(meta area)   |
   |
   |
   Metadata: This area holds a metadata structure whose
@@ -186,13 +186,2

[PATCH v10 2/3] powerpc/fadump: add hotplug_ready sysfs interface

2024-04-22 Thread Sourabh Jain
The elfcorehdr describes the CPUs and memory of the crashed kernel to
the kernel that captures the dump, known as the second or fadump kernel.
The elfcorehdr needs to be updated if the system's memory changes due to
memory hotplug or online/offline events.

Currently, memory hotplug events are monitored in userspace by udev
rules, and fadump is re-registered, which recreates the elfcorehdr with
the latest available memory in the system.

However, the previous patch ("powerpc: make fadump resilient with memory
add/remove events") moved the creation of elfcorehdr to the second or
fadump kernel. This eliminates the need to regenerate the elfcorehdr
during memory hotplug or online/offline events.

Create a sysfs entry at /sys/kernel/fadump/hotplug_ready to let
userspace know that fadump re-registration is not required for memory
add/remove events.

Signed-off-by: Sourabh Jain 
Cc: Aditya Gupta 
Cc: "Aneesh Kumar K.V" 
Cc: Hari Bathini 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Naveen N Rao 
---
 Documentation/ABI/testing/sysfs-kernel-fadump | 11 +++
 arch/powerpc/kernel/fadump.c  | 14 ++
 2 files changed, 25 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-kernel-fadump 
b/Documentation/ABI/testing/sysfs-kernel-fadump
index 8f7a64a81783..c586054657d6 100644
--- a/Documentation/ABI/testing/sysfs-kernel-fadump
+++ b/Documentation/ABI/testing/sysfs-kernel-fadump
@@ -38,3 +38,14 @@ Contact: linuxppc-dev@lists.ozlabs.org
 Description:   read only
Provide information about the amount of memory reserved by
FADump to save the crash dump in bytes.
+
+What:  /sys/kernel/fadump/hotplug_ready
+Date:  Apr 2024
+Contact:   linuxppc-dev@lists.ozlabs.org
+Description:   read only
+   Kdump udev rule re-registers fadump on memory add/remove events,
+   primarily to update the elfcorehdr. This sysfs indicates the
+   kdump udev rule that fadump re-registration is not required on
+   memory add/remove events because elfcorehdr is now prepared in
+   the second/fadump kernel.
+User:  kexec-tools
diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
index 35254fc1516b..dfab452e947b 100644
--- a/arch/powerpc/kernel/fadump.c
+++ b/arch/powerpc/kernel/fadump.c
@@ -1442,6 +1442,18 @@ static ssize_t enabled_show(struct kobject *kobj,
return sprintf(buf, "%d\n", fw_dump.fadump_enabled);
 }
 
+/*
+ * /sys/kernel/fadump/hotplug_ready sysfs node returns 1, which inidcates
+ * to usersapce that fadump re-registration is not required on memory
+ * hotplug events.
+ */
+static ssize_t hotplug_ready_show(struct kobject *kobj,
+ struct kobj_attribute *attr,
+ char *buf)
+{
+   return sprintf(buf, "%d\n", 1);
+}
+
 static ssize_t mem_reserved_show(struct kobject *kobj,
 struct kobj_attribute *attr,
 char *buf)
@@ -1514,11 +1526,13 @@ static struct kobj_attribute release_attr = 
__ATTR_WO(release_mem);
 static struct kobj_attribute enable_attr = __ATTR_RO(enabled);
 static struct kobj_attribute register_attr = __ATTR_RW(registered);
 static struct kobj_attribute mem_reserved_attr = __ATTR_RO(mem_reserved);
+static struct kobj_attribute hotplug_ready_attr = __ATTR_RO(hotplug_ready);
 
 static struct attribute *fadump_attrs[] = {
_attr.attr,
_attr.attr,
_reserved_attr.attr,
+   _ready_attr.attr,
NULL,
 };
 
-- 
2.44.0



[PATCH v10 1/3] powerpc: make fadump resilient with memory add/remove events

2024-04-22 Thread Sourabh Jain
Due to changes in memory resources caused by either memory hotplug or
online/offline events, the elfcorehdr, which describes the CPUs and
memory of the crashed kernel to the kernel that collects the dump (known
as second/fadump kernel), becomes outdated. Consequently, attempting
dump collection with an outdated elfcorehdr can lead to failed or
inaccurate dump collection.

Memory hotplug or online/offline events is referred as memory add/remove
events in reset of the commit message.

The current solution to address the aforementioned issue is as follows:
Monitor memory add/remove events in userspace using udev rules, and
re-register fadump whenever there are changes in memory resources. This
leads to the creation of a new elfcorehdr with updated system memory
information.

There are several notable issues associated with re-registering fadump
for every memory add/remove events.

1. Bulk memory add/remove events with udev-based fadump re-registration
   can lead to race conditions and, more importantly, it creates a wide
   window during which fadump is inactive until all memory add/remove
   events are settled.
2. Re-registering fadump for every memory add/remove event is
   inefficient.
3. The memory for elfcorehdr is allocated based on the memblock regions
   available during early boot and remains fixed thereafter. However, if
   elfcorehdr is later recreated with additional memblock regions, its
   size will increase, potentially leading to memory corruption.

Address the aforementioned challenges by shifting the creation of
elfcorehdr from the first kernel (also referred as the crashed kernel),
where it was created and frequently recreated for every memory
add/remove event, to the fadump kernel. As a result, the elfcorehdr only
needs to be created once, thus eliminating the necessity to re-register
fadump during memory add/remove events.

At present, the first kernel prepares fadump header and stores it in the
fadump reserved area. The fadump header includes the start address of
the elfcorehdr, crashing CPU details, and other relevant information. In
the event of a crash in the first kernel, the second/fadump boots and
accesses the fadump header prepared by the first kernel. It then
performs the following steps in a platform-specific function
[rtas|opal]_fadump_process:

1. Sanity check for fadump header
2. Update CPU notes in elfcorehdr

Along with the above, update the setup_fadump()/fadump.c to create
elfcorehdr and set its address to the global variable elfcorehdr_addr
for the vmcore module to process it in the second/fadump kernel.

Section below outlines the information required to create the elfcorehdr
and the changes made to make it available to the fadump kernel if it's
not already.

To create elfcorehdr, the following crashed kernel information is
required: CPU notes, vmcoreinfo, and memory ranges.

At present, the CPU notes are already prepared in the fadump kernel, so
no changes are needed in that regard. The fadump kernel has access to
all crashed kernel memory regions, including boot memory regions that
are relocated by firmware to fadump reserved areas, so no changes for
that either. However, it is necessary to add new members to the fadump
header, i.e., the 'fadump_crash_info_header' structure, in order to pass
the crashed kernel's vmcoreinfo address and its size to fadump kernel.

In addition to the vmcoreinfo address and size, there are a few other
attributes also added to the fadump_crash_info_header structure.

1. version:
   It stores the fadump header version, which is currently set to 1.
   This provides flexibility to update the fadump crash info header in
   the future without changing the magic number. For each change in the
   fadump header, the version will be increased. This will help the
   updated kernel determine how to handle kernel dumps from older
   kernels. The magic number remains relevant for checking fadump header
   corruption.

2. pt_regs_sz/cpu_mask_sz:
   Store size of pt_regs and cpu_mask structure of first kernel. These
   attributes are used to prevent dump processing if the sizes of
   pt_regs or cpu_mask structure differ between the first and fadump
   kernels.

Note: if either first/crashed kernel or second/fadump kernel do not have
the changes introduced here then kernel fail to collect the dump and
prints relevant error message on the console.

Signed-off-by: Sourabh Jain 
Cc: Aditya Gupta 
Cc: "Aneesh Kumar K.V" 
Cc: Hari Bathini 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Naveen N Rao 
---
 arch/powerpc/include/asm/fadump-internal.h   |  31 +-
 arch/powerpc/kernel/fadump.c | 361 +++
 arch/powerpc/platforms/powernv/opal-fadump.c |  22 +-
 arch/powerpc/platforms/pseries/rtas-fadump.c |  34 +-
 4 files changed, 242 insertions(+), 206 deletions(-)

diff --git a/arch/powerpc/include/asm/fadump-internal.h 
b/arch/powerpc/include/asm/fadump-internal.h
index 27f9e11eda28..5d706a7acc8a 100644
--- a/arch/power

[PATCH v10 0/3] powerpc: make fadump resilient with memory add/remove events

2024-04-22 Thread Sourabh Jain
ader contains an old magic number
  - Rebased it to 6.7.0-rc4

v5: 29 Oct 2023 
  https://lore.kernel.org/all/20231029124548.12198-1-sourabhj...@linux.ibm.com/
  - Fix a comment on the first patch

v4: 21 Oct 2023
  https://lore.kernel.org/all/20231021181733.204311-1-sourabhj...@linux.ibm.com/
  - Fix a build warning about type casting

v3: 9 Oct 2023
  https://lore.kernel.org/all/20231009041953.36139-1-sourabhj...@linux.ibm.com/
  - Assign physical address of elfcorehdr to fdh->elfcorehdr_addr
  - Rename a variable, boot_mem_dest_addr -> boot_mem_dest_offset

v2: 25 Sep 2023
  https://lore.kernel.org/all/20230925051214.678957-1-sourabhj...@linux.ibm.com/
  - Fixed a few indentation issues reported by the checkpatch script.
  - Rebased it to 6.6.0-rc3

v1: 17 Sep 2023
  https://lore.kernel.org/all/20230917080225.561627-1-sourabhj...@linux.ibm.com/

Cc: Aditya Gupta 
Cc: "Aneesh Kumar K.V" 
Cc: Hari Bathini 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Naveen N Rao 

Sourabh Jain (3):
  powerpc: make fadump resilient with memory add/remove events
  powerpc/fadump: add hotplug_ready sysfs interface
  Documentation/powerpc: update fadump implementation details

 Documentation/ABI/testing/sysfs-kernel-fadump |  11 +
 .../arch/powerpc/firmware-assisted-dump.rst   |  91 ++---
 arch/powerpc/include/asm/fadump-internal.h|  31 +-
 arch/powerpc/kernel/fadump.c  | 375 ++
 arch/powerpc/platforms/powernv/opal-fadump.c  |  22 +-
 arch/powerpc/platforms/pseries/rtas-fadump.c  |  34 +-
 6 files changed, 309 insertions(+), 255 deletions(-)

-- 
2.44.0



[PATCH v9 0/3] powerpc: make fadump resilient with memory add/remove events

2024-04-16 Thread Sourabh Jain
m/
  - Fix a comment on the first patch

v4: 21 Oct 2023
  https://lore.kernel.org/all/20231021181733.204311-1-sourabhj...@linux.ibm.com/
  - Fix a build warning about type casting

v3: 9 Oct 2023
  https://lore.kernel.org/all/20231009041953.36139-1-sourabhj...@linux.ibm.com/
  - Assign physical address of elfcorehdr to fdh->elfcorehdr_addr
  - Rename a variable, boot_mem_dest_addr -> boot_mem_dest_offset

v2: 25 Sep 2023
  https://lore.kernel.org/all/20230925051214.678957-1-sourabhj...@linux.ibm.com/
  - Fixed a few indentation issues reported by the checkpatch script.
  - Rebased it to 6.6.0-rc3

v1: 17 Sep 2023
  https://lore.kernel.org/all/20230917080225.561627-1-sourabhj...@linux.ibm.com/

Cc: Aditya Gupta 
Cc: "Aneesh Kumar K.V" 
Cc: Hari Bathini 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Naveen N Rao 

Sourabh Jain (3):
  powerpc: make fadump resilient with memory add/remove events
  powerpc/fadump: add hotplug_ready sysfs interface
  Documentation/powerpc: update fadump implementation details

 Documentation/ABI/testing/sysfs-kernel-fadump |  11 +
 .../arch/powerpc/firmware-assisted-dump.rst   |  91 ++---
 arch/powerpc/include/asm/fadump-internal.h|  31 +-
 arch/powerpc/kernel/fadump.c  | 375 ++
 arch/powerpc/platforms/powernv/opal-fadump.c  |  22 +-
 arch/powerpc/platforms/pseries/rtas-fadump.c  |  34 +-
 6 files changed, 309 insertions(+), 255 deletions(-)

-- 
2.43.0



[PATCH v9 3/3] Documentation/powerpc: update fadump implementation details

2024-04-16 Thread Sourabh Jain
The patch titled ("powerpc: make fadump resilient with memory add/remove
events") has made significant changes to the implementation of fadump,
particularly on elfcorehdr creation and fadump crash info header
structure. Therefore, updating the fadump implementation documentation
to reflect those changes.

Following updates are done to firmware assisted dump documentation:

1. The elfcorehdr is no longer stored after fadump HDR in the reserved
   dump area. Instead, the second kernel dynamically allocates memory
   for the elfcorehdr within the address range from 0 to the boot memory
   size. Therefore, update figures 1 and 2 of Memory Reservation during
   the first and second kernels to reflect this change.

2. A version field has been added to the fadump header to manage the
   future changes to fadump crash info header structure without changing
   the fadump header magic number in the future. Therefore, remove the
   corresponding TODO from the document.

Signed-off-by: Sourabh Jain 
Cc: Aditya Gupta 
Cc: Aneesh Kumar K.V 
Cc: Hari Bathini 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Naveen N Rao 
---
 .../arch/powerpc/firmware-assisted-dump.rst   | 91 +--
 1 file changed, 42 insertions(+), 49 deletions(-)

diff --git a/Documentation/arch/powerpc/firmware-assisted-dump.rst 
b/Documentation/arch/powerpc/firmware-assisted-dump.rst
index e363fc48529a..7e37aadd1f77 100644
--- a/Documentation/arch/powerpc/firmware-assisted-dump.rst
+++ b/Documentation/arch/powerpc/firmware-assisted-dump.rst
@@ -134,12 +134,12 @@ that are run. If there is dump data, then the
 memory is held.
 
 If there is no waiting dump data, then only the memory required to
-hold CPU state, HPTE region, boot memory dump, FADump header and
-elfcore header, is usually reserved at an offset greater than boot
-memory size (see Fig. 1). This area is *not* released: this region
-will be kept permanently reserved, so that it can act as a receptacle
-for a copy of the boot memory content in addition to CPU state and
-HPTE region, in the case a crash does occur.
+hold CPU state, HPTE region, boot memory dump, and FADump header is
+usually reserved at an offset greater than boot memory size (see Fig. 1).
+This area is *not* released: this region will be kept permanently
+reserved, so that it can act as a receptacle for a copy of the boot
+memory content in addition to CPU state and HPTE region, in the case
+a crash does occur.
 
 Since this reserved memory area is used only after the system crash,
 there is no point in blocking this significant chunk of memory from
@@ -153,22 +153,22 @@ that were present in CMA region::
 
   o Memory Reservation during first kernel
 
-  Low memory Top of memory
-  0boot memory size   |<--- Reserved dump area --->|   |
-  |   |   |Permanent Reservation   |   |
-  V   V   ||   V
-  +---+-/ /---+---++---+-+-++--+
-  |   |   |///||  DUMP | HDR | ELF ||  |
-  +---+-/ /---+---++---+-+-++--+
-|   ^^ ^  ^   ^
-|   || |  |   |
-\  CPU  HPTE   /  |   |
- --   |   |
-  Boot memory content gets transferred|   |
-  to reserved area by firmware at the |   |
-  time of crash.  |   |
-  FADump Header   |
-   (meta area)|
+  Low memory  Top of memory
+  0boot memory size   |<-- Reserved dump area ->| |
+  |   |   |  Permanent Reservation  | |
+  V   V   | | V
+  +---+-/ /---+---++---+---++-+
+  |   |   |///||DUMP   |  HDR  || |
+  +---+-/ /---+---++---+---++-+
+|   ^^   ^ ^  ^
+|   ||   | |  |
+\  CPU  HPTE / |  |
+   |  |
+  Boot memory content gets transferred |  |
+  to reserved area by firmware at the  |  |
+  time of crash.   |  |
+   FADump Header  |
+(meta area)   |
   |
   |
   Metadata: This area holds a metadata structure whose
@@ -186,13 +186,2

[PATCH v9 1/3] powerpc: make fadump resilient with memory add/remove events

2024-04-16 Thread Sourabh Jain
Due to changes in memory resources caused by either memory hotplug or
online/offline events, the elfcorehdr, which describes the CPUs and
memory of the crashed kernel to the kernel that collects the dump (known
as second/fadump kernel), becomes outdated. Consequently, attempting
dump collection with an outdated elfcorehdr can lead to failed or
inaccurate dump collection.

Memory hotplug or online/offline events is referred as memory add/remove
events in reset of the commit message.

The current solution to address the aforementioned issue is as follows:
Monitor memory add/remove events in userspace using udev rules, and
re-register fadump whenever there are changes in memory resources. This
leads to the creation of a new elfcorehdr with updated system memory
information.

There are several notable issues associated with re-registering fadump
for every memory add/remove events.

1. Bulk memory add/remove events with udev-based fadump re-registration
   can lead to race conditions and, more importantly, it creates a wide
   window during which fadump is inactive until all memory add/remove
   events are settled.
2. Re-registering fadump for every memory add/remove event is
   inefficient.
3. The memory for elfcorehdr is allocated based on the memblock regions
   available during early boot and remains fixed thereafter. However, if
   elfcorehdr is later recreated with additional memblock regions, its
   size will increase, potentially leading to memory corruption.

Address the aforementioned challenges by shifting the creation of
elfcorehdr from the first kernel (also referred as the crashed kernel),
where it was created and frequently recreated for every memory
add/remove event, to the fadump kernel. As a result, the elfcorehdr only
needs to be created once, thus eliminating the necessity to re-register
fadump during memory add/remove events.

At present, the first kernel prepares fadump header and stores it in the
fadump reserved area. The fadump header includes the start address of
the elfcorehdr, crashing CPU details, and other relevant information. In
the event of a crash in the first kernel, the second/fadump boots and
accesses the fadump header prepared by the first kernel. It then
performs the following steps in a platform-specific function
[rtas|opal]_fadump_process:

1. Sanity check for fadump header
2. Update CPU notes in elfcorehdr

Along with the above, update the setup_fadump()/fadump.c to create
elfcorehdr and set its address to the global variable elfcorehdr_addr
for the vmcore module to process it in the second/fadump kernel.

Section below outlines the information required to create the elfcorehdr
and the changes made to make it available to the fadump kernel if it's
not already.

To create elfcorehdr, the following crashed kernel information is
required: CPU notes, vmcoreinfo, and memory ranges.

At present, the CPU notes are already prepared in the fadump kernel, so
no changes are needed in that regard. The fadump kernel has access to
all crashed kernel memory regions, including boot memory regions that
are relocated by firmware to fadump reserved areas, so no changes for
that either. However, it is necessary to add new members to the fadump
header, i.e., the 'fadump_crash_info_header' structure, in order to pass
the crashed kernel's vmcoreinfo address and its size to fadump kernel.

In addition to the vmcoreinfo address and size, there are a few other
attributes also added to the fadump_crash_info_header structure.

1. version:
   It stores the fadump header version, which is currently set to 1.
   This provides flexibility to update the fadump crash info header in
   the future without changing the magic number. For each change in the
   fadump header, the version will be increased. This will help the
   updated kernel determine how to handle kernel dumps from older
   kernels. The magic number remains relevant for checking fadump header
   corruption.

2. pt_regs_sz/cpu_mask_sz:
   Store size of pt_regs and cpu_mask structure of first kernel. These
   attributes are used to prevent dump processing if the sizes of
   pt_regs or cpu_mask structure differ between the first and fadump
   kernels.

Note: if either first/crashed kernel or second/fadump kernel do not have
the changes introduced here then kernel fail to collect the dump and
prints relevant error message on the console.

Signed-off-by: Sourabh Jain 
Cc: Aditya Gupta 
Cc: "Aneesh Kumar K.V" 
Cc: Hari Bathini 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Naveen N Rao 
---
 arch/powerpc/include/asm/fadump-internal.h   |  31 +-
 arch/powerpc/kernel/fadump.c | 361 +++
 arch/powerpc/platforms/powernv/opal-fadump.c |  22 +-
 arch/powerpc/platforms/pseries/rtas-fadump.c |  34 +-
 4 files changed, 242 insertions(+), 206 deletions(-)

diff --git a/arch/powerpc/include/asm/fadump-internal.h 
b/arch/powerpc/include/asm/fadump-internal.h
index 27f9e11eda28..5d706a7acc8a 100644
--- a/arch/power

[PATCH v9 2/3] powerpc/fadump: add hotplug_ready sysfs interface

2024-04-16 Thread Sourabh Jain
The elfcorehdr describes the CPUs and memory of the crashed kernel to
the kernel that captures the dump, known as the second or fadump kernel.
The elfcorehdr needs to be updated if the system's memory changes due to
memory hotplug or online/offline events.

Currently, memory hotplug events are monitored in userspace by udev
rules, and fadump is re-registered, which recreates the elfcorehdr with
the latest available memory in the system.

However, the previous patch ("powerpc: make fadump resilient with memory
add/remove events") moved the creation of elfcorehdr to the second or
fadump kernel. This eliminates the need to regenerate the elfcorehdr
during memory hotplug or online/offline events.

Create a sysfs entry at /sys/kernel/fadump/hotplug_ready to let
userspace know that fadump re-registration is not required for memory
add/remove events.

Signed-off-by: Sourabh Jain 
Cc: Aditya Gupta 
Cc: "Aneesh Kumar K.V" 
Cc: Hari Bathini 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Naveen N Rao 
---
 Documentation/ABI/testing/sysfs-kernel-fadump | 11 +++
 arch/powerpc/kernel/fadump.c  | 14 ++
 2 files changed, 25 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-kernel-fadump 
b/Documentation/ABI/testing/sysfs-kernel-fadump
index 8f7a64a81783..c586054657d6 100644
--- a/Documentation/ABI/testing/sysfs-kernel-fadump
+++ b/Documentation/ABI/testing/sysfs-kernel-fadump
@@ -38,3 +38,14 @@ Contact: linuxppc-dev@lists.ozlabs.org
 Description:   read only
Provide information about the amount of memory reserved by
FADump to save the crash dump in bytes.
+
+What:  /sys/kernel/fadump/hotplug_ready
+Date:  Apr 2024
+Contact:   linuxppc-dev@lists.ozlabs.org
+Description:   read only
+   Kdump udev rule re-registers fadump on memory add/remove events,
+   primarily to update the elfcorehdr. This sysfs indicates the
+   kdump udev rule that fadump re-registration is not required on
+   memory add/remove events because elfcorehdr is now prepared in
+   the second/fadump kernel.
+User:  kexec-tools
diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
index e816725a11c0..131bf0d2d45d 100644
--- a/arch/powerpc/kernel/fadump.c
+++ b/arch/powerpc/kernel/fadump.c
@@ -1442,6 +1442,18 @@ static ssize_t enabled_show(struct kobject *kobj,
return sprintf(buf, "%d\n", fw_dump.fadump_enabled);
 }
 
+/*
+ * /sys/kernel/fadump/hotplug_ready sysfs node returns 1, which inidcates
+ * to usersapce that fadump re-registration is not required on memory
+ * hotplug events.
+ */
+static ssize_t hotplug_ready_show(struct kobject *kobj,
+ struct kobj_attribute *attr,
+ char *buf)
+{
+   return sprintf(buf, "%d\n", 1);
+}
+
 static ssize_t mem_reserved_show(struct kobject *kobj,
 struct kobj_attribute *attr,
 char *buf)
@@ -1514,11 +1526,13 @@ static struct kobj_attribute release_attr = 
__ATTR_WO(release_mem);
 static struct kobj_attribute enable_attr = __ATTR_RO(enabled);
 static struct kobj_attribute register_attr = __ATTR_RW(registered);
 static struct kobj_attribute mem_reserved_attr = __ATTR_RO(mem_reserved);
+static struct kobj_attribute hotplug_ready_attr = __ATTR_RO(hotplug_ready);
 
 static struct attribute *fadump_attrs[] = {
_attr.attr,
_attr.attr,
_reserved_attr.attr,
+   _ready_attr.attr,
NULL,
 };
 
-- 
2.43.0



[PATCH v18 6/6] powerpc/crash: add crash memory hotplug support

2024-03-26 Thread Sourabh Jain
Extend the arch crash hotplug handler, as introduced by the patch title
("powerpc: add crash CPU hotplug support"), to also support memory
add/remove events.

Elfcorehdr describes the memory of the crash kernel to capture the
kernel; hence, it needs to be updated if memory resources change due to
memory add/remove events. Therefore, arch_crash_handle_hotplug_event()
is updated to recreate the elfcorehdr and replace it with the previous
one on memory add/remove events.

The memblock list is used to prepare the elfcorehdr. In the case of
memory hot remove, the memblock list is updated after the arch crash
hotplug handler is triggered, as depicted in Figure 1. Thus, the
hot-removed memory is explicitly removed from the crash memory ranges
to ensure that the memory ranges added to elfcorehdr do not include the
hot-removed memory.

Memory remove
  |
  v
Offline pages
  |
  v
 Initiate memory notify call <> crash hotplug handler
 chain for MEM_OFFLINE event
  |
  v
 Update memblock list

Figure 1

There are two system calls, `kexec_file_load` and `kexec_load`, used to
load the kdump image. A few changes have been made to ensure that the
kernel can safely update the elfcorehdr component of the kdump image for
both system calls.

For the kexec_file_load syscall, kdump image is prepared in the kernel.
To support an increasing number of memory regions, the elfcorehdr is
built with extra buffer space to ensure that it can accommodate
additional memory ranges in future.

For the kexec_load syscall, the elfcorehdr is updated only if the
KEXEC_CRASH_HOTPLUG_SUPPORT kexec flag is passed to the kernel by the
kexec tool. Passing this flag to the kernel indicates that the
elfcorehdr is built to accommodate additional memory ranges and the
elfcorehdr segment is not considered for SHA calculation, making it safe
to update.

The changes related to this feature are kept under the CRASH_HOTPLUG
config, and it is enabled by default.

Signed-off-by: Sourabh Jain 
Acked-by: Hari Bathini 
Cc: Akhil Raj 
Cc: Andrew Morton 
Cc: Aneesh Kumar K.V 
Cc: Baoquan He 
Cc: Borislav Petkov (AMD) 
Cc: Boris Ostrovsky 
Cc: Christophe Leroy 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Hildenbrand 
Cc: Greg Kroah-Hartman 
Cc: Laurent Dufour 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Mimi Zohar 
Cc: Naveen N Rao 
Cc: Oscar Salvador 
Cc: Thomas Gleixner 
Cc: Valentin Schneider 
Cc: Vivek Goyal 
Cc: ke...@lists.infradead.org
Cc: x...@kernel.org
---
* No changes in v18.

 arch/powerpc/include/asm/kexec.h|  3 +
 arch/powerpc/include/asm/kexec_ranges.h |  1 +
 arch/powerpc/kexec/crash.c  | 95 -
 arch/powerpc/kexec/file_load_64.c   | 20 +-
 arch/powerpc/kexec/ranges.c | 85 ++
 5 files changed, 202 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
index e75970351bcd..95a98b390d62 100644
--- a/arch/powerpc/include/asm/kexec.h
+++ b/arch/powerpc/include/asm/kexec.h
@@ -141,6 +141,9 @@ void arch_crash_handle_hotplug_event(struct kimage *image, 
void *arg);
 
 int arch_crash_hotplug_support(struct kimage *image, unsigned long 
kexec_flags);
 #define arch_crash_hotplug_support arch_crash_hotplug_support
+
+unsigned int arch_crash_get_elfcorehdr_size(void);
+#define crash_get_elfcorehdr_size arch_crash_get_elfcorehdr_size
 #endif /* CONFIG_CRASH_HOTPLUG */
 
 extern int crashing_cpu;
diff --git a/arch/powerpc/include/asm/kexec_ranges.h 
b/arch/powerpc/include/asm/kexec_ranges.h
index 8489e844b447..14055896cbcb 100644
--- a/arch/powerpc/include/asm/kexec_ranges.h
+++ b/arch/powerpc/include/asm/kexec_ranges.h
@@ -7,6 +7,7 @@
 void sort_memory_ranges(struct crash_mem *mrngs, bool merge);
 struct crash_mem *realloc_mem_ranges(struct crash_mem **mem_ranges);
 int add_mem_range(struct crash_mem **mem_ranges, u64 base, u64 size);
+int remove_mem_range(struct crash_mem **mem_ranges, u64 base, u64 size);
 int get_exclude_memory_ranges(struct crash_mem **mem_ranges);
 int get_reserved_memory_ranges(struct crash_mem **mem_ranges);
 int get_crash_memory_ranges(struct crash_mem **mem_ranges);
diff --git a/arch/powerpc/kexec/crash.c b/arch/powerpc/kexec/crash.c
index 8938a19af12f..21b193e938a3 100644
--- a/arch/powerpc/kexec/crash.c
+++ b/arch/powerpc/kexec/crash.c
@@ -17,6 +17,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -25,6 +26,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /*
  * The primary CPU waits a while for all secondary CPUs to enter. This is to
@@ -398,6 +400,94 @@ void default_machine_crash_shutdown(struct pt_regs *regs)
 #undef pr_fmt
 #define pr_fmt(fmt) "crash hp: " fmt
 
+/*
+ * Advertise preferred elfcorehdr size to userspace via
+ * /sys/kernel/crash_elfcorehdr_size sysfs interface.
+ */
+unsigned int arch_crash_get_elfcorehdr_size(void)
+{
+   unsigned long phdr_cnt;
+
+

[PATCH v18 5/6] powerpc/crash: add crash CPU hotplug support

2024-03-25 Thread Sourabh Jain
Due to CPU/Memory hotplug or online/offline events, the elfcorehdr
(which describes the CPUs and memory of the crashed kernel) and FDT
(Flattened Device Tree) of kdump image becomes outdated. Consequently,
attempting dump collection with an outdated elfcorehdr or FDT can lead
to failed or inaccurate dump collection.

Going forward, CPU hotplug or online/offline events are referred as
CPU/Memory add/remove events.

The current solution to address the above issue involves monitoring the
CPU/Memory add/remove events in userspace using udev rules and whenever
there are changes in CPU and memory resources, the entire kdump image
is loaded again. The kdump image includes kernel, initrd, elfcorehdr,
FDT, purgatory. Given that only elfcorehdr and FDT get outdated due to
CPU/Memory add/remove events, reloading the entire kdump image is
inefficient. More importantly, kdump remains inactive for a substantial
amount of time until the kdump reload completes.

To address the aforementioned issue, commit 247262756121 ("crash: add
generic infrastructure for crash hotplug support") added a generic
infrastructure that allows architectures to selectively update the kdump
image component during CPU or memory add/remove events within the kernel
itself.

In the event of a CPU or memory add/remove events, the generic crash
hotplug event handler, `crash_handle_hotplug_event()`, is triggered. It
then acquires the necessary locks to update the kdump image and invokes
the architecture-specific crash hotplug handler,
`arch_crash_handle_hotplug_event()`, to update the required kdump image
components.

This patch adds crash hotplug handler for PowerPC and enable support to
update the kdump image on CPU add/remove events. Support for memory
add/remove events is added in a subsequent patch with the title
"powerpc: add crash memory hotplug support"

As mentioned earlier, only the elfcorehdr and FDT kdump image components
need to be updated in the event of CPU or memory add/remove events.
However, on PowerPC architecture crash hotplug handler only updates the
FDT to enable crash hotplug support for CPU add/remove events. Here's
why.

The elfcorehdr on PowerPC is built with possible CPUs, and thus, it does
not need an update on CPU add/remove events. On the other hand, the FDT
needs to be updated on CPU add events to include the newly added CPU. If
the FDT is not updated and the kernel crashes on a newly added CPU, the
kdump kernel will fail to boot due to the unavailability of the crashing
CPU in the FDT. During the early boot, it is expected that the boot CPU
must be a part of the FDT; otherwise, the kernel will raise a BUG and
fail to boot. For more information, refer to commit 36ae37e3436b0
("powerpc: Make boot_cpuid common between 32 and 64-bit"). Since it is
okay to have an offline CPU in the kdump FDT, no action is taken in case
of CPU removal.

There are two system calls, `kexec_file_load` and `kexec_load`, used to
load the kdump image. Few changes have been made to ensure kernel can
safely update the FDT of kdump image loaded using both system calls.

For kexec_file_load syscall the kdump image is prepared in kernel. So to
support an increasing number of CPUs, the FDT is constructed with extra
buffer space to ensure it can accommodate a possible number of CPU
nodes. Additionally, a call to fdt_pack (which trims the unused space
once the FDT is prepared) is avoided if this feature is enabled.

For the kexec_load syscall, the FDT is updated only if the
KEXEC_CRASH_HOTPLUG_SUPPORT kexec flag is passed to the kernel by
userspace (kexec tools). When userspace passes this flag to the kernel,
it indicates that the FDT is built to accommodate possible CPUs, and the
FDT segment is excluded from SHA calculation, making it safe to update.

The changes related to this feature are kept under the CRASH_HOTPLUG
config, and it is enabled by default.

Signed-off-by: Sourabh Jain 
Acked-by: Hari Bathini 
Cc: Akhil Raj 
Cc: Andrew Morton 
Cc: Aneesh Kumar K.V 
Cc: Baoquan He 
Cc: Borislav Petkov (AMD) 
Cc: Boris Ostrovsky 
Cc: Christophe Leroy 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Hildenbrand 
Cc: Greg Kroah-Hartman 
Cc: Laurent Dufour 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Mimi Zohar 
Cc: Naveen N Rao 
Cc: Oscar Salvador 
Cc: Thomas Gleixner 
Cc: Valentin Schneider 
Cc: Vivek Goyal 
Cc: ke...@lists.infradead.org
Cc: x...@kernel.org
---
* No changes in v18

 arch/powerpc/Kconfig  |   4 ++
 arch/powerpc/include/asm/kexec.h  |   8 +++
 arch/powerpc/kexec/crash.c| 103 ++
 arch/powerpc/kexec/elf_64.c   |   3 +-
 arch/powerpc/kexec/file_load_64.c |  17 +
 5 files changed, 134 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 1c4be3373686..a1a3b3363008 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -686,6 +686,10 @@ config ARCH_SELECTS_CRASH_DUMP
depends on CRASH_DUMP
sele

[PATCH v18 3/6] powerpc/kexec: move *_memory_ranges functions to ranges.c

2024-03-25 Thread Sourabh Jain
Move the following functions form kexec/{file_load_64.c => ranges.c} and
make them public so that components other than KEXEC_FILE can also use
these functions.
1. get_exclude_memory_ranges
2. get_reserved_memory_ranges
3. get_crash_memory_ranges
4. get_usable_memory_ranges

Later in the series get_crash_memory_ranges function is utilized for
in-kernel updates to kdump image during CPU/Memory hotplug or
online/offline events for both kexec_load and kexec_file_load syscalls.

Since the above functions are moved to ranges.c, some of the helper
functions in ranges.c are no longer required to be public. Mark them as
static and removed them from kexec_ranges.h header file.

Finally, remove the CONFIG_KEXEC_FILE build dependency for range.c
because it is required for other config, such as CONFIG_CRASH_DUMP.

No functional changes are intended.

Signed-off-by: Sourabh Jain 
Acked-by: Hari Bathini 
Cc: Akhil Raj 
Cc: Andrew Morton 
Cc: Aneesh Kumar K.V 
Cc: Baoquan He 
Cc: Borislav Petkov (AMD) 
Cc: Boris Ostrovsky 
Cc: Christophe Leroy 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Hildenbrand 
Cc: Greg Kroah-Hartman 
Cc: Laurent Dufour 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Mimi Zohar 
Cc: Naveen N Rao 
Cc: Oscar Salvador 
Cc: Thomas Gleixner 
Cc: Valentin Schneider 
Cc: Vivek Goyal 
Cc: ke...@lists.infradead.org
Cc: x...@kernel.org
---
Chnages in v18:
* Fix a typo in the commit message

 arch/powerpc/include/asm/kexec_ranges.h |  19 +-
 arch/powerpc/kexec/Makefile |   4 +-
 arch/powerpc/kexec/file_load_64.c   | 190 
 arch/powerpc/kexec/ranges.c | 227 +++-
 4 files changed, 224 insertions(+), 216 deletions(-)

diff --git a/arch/powerpc/include/asm/kexec_ranges.h 
b/arch/powerpc/include/asm/kexec_ranges.h
index f83866a19e87..8489e844b447 100644
--- a/arch/powerpc/include/asm/kexec_ranges.h
+++ b/arch/powerpc/include/asm/kexec_ranges.h
@@ -7,19 +7,8 @@
 void sort_memory_ranges(struct crash_mem *mrngs, bool merge);
 struct crash_mem *realloc_mem_ranges(struct crash_mem **mem_ranges);
 int add_mem_range(struct crash_mem **mem_ranges, u64 base, u64 size);
-int add_tce_mem_ranges(struct crash_mem **mem_ranges);
-int add_initrd_mem_range(struct crash_mem **mem_ranges);
-#ifdef CONFIG_PPC_64S_HASH_MMU
-int add_htab_mem_range(struct crash_mem **mem_ranges);
-#else
-static inline int add_htab_mem_range(struct crash_mem **mem_ranges)
-{
-   return 0;
-}
-#endif
-int add_kernel_mem_range(struct crash_mem **mem_ranges);
-int add_rtas_mem_range(struct crash_mem **mem_ranges);
-int add_opal_mem_range(struct crash_mem **mem_ranges);
-int add_reserved_mem_ranges(struct crash_mem **mem_ranges);
-
+int get_exclude_memory_ranges(struct crash_mem **mem_ranges);
+int get_reserved_memory_ranges(struct crash_mem **mem_ranges);
+int get_crash_memory_ranges(struct crash_mem **mem_ranges);
+int get_usable_memory_ranges(struct crash_mem **mem_ranges);
 #endif /* _ASM_POWERPC_KEXEC_RANGES_H */
diff --git a/arch/powerpc/kexec/Makefile b/arch/powerpc/kexec/Makefile
index 8e469c4da3f8..470eb0453e17 100644
--- a/arch/powerpc/kexec/Makefile
+++ b/arch/powerpc/kexec/Makefile
@@ -3,11 +3,11 @@
 # Makefile for the linux kernel.
 #
 
-obj-y  += core.o core_$(BITS).o
+obj-y  += core.o core_$(BITS).o ranges.o
 
 obj-$(CONFIG_PPC32)+= relocate_32.o
 
-obj-$(CONFIG_KEXEC_FILE)   += file_load.o ranges.o file_load_$(BITS).o 
elf_$(BITS).o
+obj-$(CONFIG_KEXEC_FILE)   += file_load.o file_load_$(BITS).o elf_$(BITS).o
 obj-$(CONFIG_VMCORE_INFO)  += vmcore_info.o
 obj-$(CONFIG_CRASH_DUMP)   += crash.o
 
diff --git a/arch/powerpc/kexec/file_load_64.c 
b/arch/powerpc/kexec/file_load_64.c
index 1bc65de6174f..6a01f62b8fcf 100644
--- a/arch/powerpc/kexec/file_load_64.c
+++ b/arch/powerpc/kexec/file_load_64.c
@@ -47,83 +47,6 @@ const struct kexec_file_ops * const kexec_file_loaders[] = {
NULL
 };
 
-/**
- * get_exclude_memory_ranges - Get exclude memory ranges. This list includes
- * regions like opal/rtas, tce-table, initrd,
- * kernel, htab which should be avoided while
- * setting up kexec load segments.
- * @mem_ranges:Range list to add the memory ranges to.
- *
- * Returns 0 on success, negative errno on error.
- */
-static int get_exclude_memory_ranges(struct crash_mem **mem_ranges)
-{
-   int ret;
-
-   ret = add_tce_mem_ranges(mem_ranges);
-   if (ret)
-   goto out;
-
-   ret = add_initrd_mem_range(mem_ranges);
-   if (ret)
-   goto out;
-
-   ret = add_htab_mem_range(mem_ranges);
-   if (ret)
-   goto out;
-
-   ret = add_kernel_mem_range(mem_ranges);
-   if (ret)
-   goto out;
-
-   ret = add_rtas_mem_range(mem_ranges);
-   if (ret)
-   goto out;
-
-   ret = add_opal_mem_range(mem_ran

[PATCH v18 4/6] PowerPC/kexec: make the update_cpus_node() function public

2024-03-25 Thread Sourabh Jain
Move the update_cpus_node() from kexec/{file_load_64.c => core_64.c}
to allow other kexec components to use it.

Later in the series, this function is used for in-kernel updates
to the kdump image during CPU/memory hotplug or online/offline events for
both kexec_load and kexec_file_load syscalls.

No functional changes are intended.

Signed-off-by: Sourabh Jain 
Acked-by: Hari Bathini 
Cc: Akhil Raj 
Cc: Andrew Morton 
Cc: Aneesh Kumar K.V 
Cc: Baoquan He 
Cc: Borislav Petkov (AMD) 
Cc: Boris Ostrovsky 
Cc: Christophe Leroy 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Hildenbrand 
Cc: Greg Kroah-Hartman 
Cc: Laurent Dufour 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Mimi Zohar 
Cc: Naveen N Rao 
Cc: Oscar Salvador 
Cc: Thomas Gleixner 
Cc: Valentin Schneider 
Cc: Vivek Goyal 
Cc: ke...@lists.infradead.org
Cc: x...@kernel.org
---
* No changes in v18

 arch/powerpc/include/asm/kexec.h  |  4 ++
 arch/powerpc/kexec/core_64.c  | 91 +++
 arch/powerpc/kexec/file_load_64.c | 87 -
 3 files changed, 95 insertions(+), 87 deletions(-)

diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
index fdb90e24dc74..d9ff4d0e392d 100644
--- a/arch/powerpc/include/asm/kexec.h
+++ b/arch/powerpc/include/asm/kexec.h
@@ -185,6 +185,10 @@ static inline void crash_send_ipi(void 
(*crash_ipi_callback)(struct pt_regs *))
 
 #endif /* CONFIG_CRASH_DUMP */
 
+#if defined(CONFIG_KEXEC_FILE) || defined(CONFIG_CRASH_DUMP)
+int update_cpus_node(void *fdt);
+#endif
+
 #ifdef CONFIG_PPC_BOOK3S_64
 #include 
 #endif
diff --git a/arch/powerpc/kexec/core_64.c b/arch/powerpc/kexec/core_64.c
index 762e4d09aacf..85050be08a23 100644
--- a/arch/powerpc/kexec/core_64.c
+++ b/arch/powerpc/kexec/core_64.c
@@ -17,6 +17,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -30,6 +31,7 @@
 #include 
 #include 
 #include 
+#include 
 
 int machine_kexec_prepare(struct kimage *image)
 {
@@ -419,3 +421,92 @@ static int __init export_htab_values(void)
 }
 late_initcall(export_htab_values);
 #endif /* CONFIG_PPC_64S_HASH_MMU */
+
+#if defined(CONFIG_KEXEC_FILE) || defined(CONFIG_CRASH_DUMP)
+/**
+ * add_node_props - Reads node properties from device node structure and add
+ *  them to fdt.
+ * @fdt:Flattened device tree of the kernel
+ * @node_offset:offset of the node to add a property at
+ * @dn: device node pointer
+ *
+ * Returns 0 on success, negative errno on error.
+ */
+static int add_node_props(void *fdt, int node_offset, const struct device_node 
*dn)
+{
+   int ret = 0;
+   struct property *pp;
+
+   if (!dn)
+   return -EINVAL;
+
+   for_each_property_of_node(dn, pp) {
+   ret = fdt_setprop(fdt, node_offset, pp->name, pp->value, 
pp->length);
+   if (ret < 0) {
+   pr_err("Unable to add %s property: %s\n", pp->name, 
fdt_strerror(ret));
+   return ret;
+   }
+   }
+   return ret;
+}
+
+/**
+ * update_cpus_node - Update cpus node of flattened device tree using of_root
+ *device node.
+ * @fdt:  Flattened device tree of the kernel.
+ *
+ * Returns 0 on success, negative errno on error.
+ */
+int update_cpus_node(void *fdt)
+{
+   struct device_node *cpus_node, *dn;
+   int cpus_offset, cpus_subnode_offset, ret = 0;
+
+   cpus_offset = fdt_path_offset(fdt, "/cpus");
+   if (cpus_offset < 0 && cpus_offset != -FDT_ERR_NOTFOUND) {
+   pr_err("Malformed device tree: error reading /cpus node: %s\n",
+  fdt_strerror(cpus_offset));
+   return cpus_offset;
+   }
+
+   if (cpus_offset > 0) {
+   ret = fdt_del_node(fdt, cpus_offset);
+   if (ret < 0) {
+   pr_err("Error deleting /cpus node: %s\n", 
fdt_strerror(ret));
+   return -EINVAL;
+   }
+   }
+
+   /* Add cpus node to fdt */
+   cpus_offset = fdt_add_subnode(fdt, fdt_path_offset(fdt, "/"), "cpus");
+   if (cpus_offset < 0) {
+   pr_err("Error creating /cpus node: %s\n", 
fdt_strerror(cpus_offset));
+   return -EINVAL;
+   }
+
+   /* Add cpus node properties */
+   cpus_node = of_find_node_by_path("/cpus");
+   ret = add_node_props(fdt, cpus_offset, cpus_node);
+   of_node_put(cpus_node);
+   if (ret < 0)
+   return ret;
+
+   /* Loop through all subnodes of cpus and add them to fdt */
+   for_each_node_by_type(dn, "cpu") {
+   cpus_subnode_offset = fdt_add_subnode(fdt, cpus_offset, 
dn->full_name);
+   if (cpus_subnode_offset < 0) {
+   pr_err("Unable to add %s subnode: %s\n", dn->full_name,

[PATCH v18 2/6] crash: add a new kexec flag for hotplug support

2024-03-25 Thread Sourabh Jain
Commit a72bbec70da2 ("crash: hotplug support for kexec_load()")
introduced a new kexec flag, `KEXEC_UPDATE_ELFCOREHDR`. Kexec tool uses
this flag to indicate to the kernel that it is safe to modify the
elfcorehdr of the kdump image loaded using the kexec_load system call.

However, it is possible that architectures may need to update kexec
segments other then elfcorehdr. For example, FDT (Flatten Device Tree)
on PowerPC. Introducing a new kexec flag for every new kexec segment
may not be a good solution. Hence, a generic kexec flag bit,
`KEXEC_CRASH_HOTPLUG_SUPPORT`, is introduced to share the CPU/Memory
hotplug support intent between the kexec tool and the kernel for the
kexec_load system call.

Now we have two kexec flags that enables crash hotplug support for
kexec_load system call. First is KEXEC_UPDATE_ELFCOREHDR (only used in
x86), and second is KEXEC_CRASH_HOTPLUG_SUPPORT (for all architectures).

To simplify the process of finding and reporting the crash hotplug
support the following changes are introduced.

1. Define arch specific function to process the kexec flags and
   determine crash hotplug support

2. Rename the @update_elfcorehdr member of struct kimage to
   @hotplug_support and populate it for both kexec_load and
   kexec_file_load syscalls, because architecture can update more than
   one kexec segment

3. Let generic function crash_check_hotplug_support report hotplug
   support for loaded kdump image based on value of @hotplug_support

To bring the x86 crash hotplug support in line with the above points,
the following changes have been made:

- Introduce the arch_crash_hotplug_support function to process kexec
  flags and determine crash hotplug support

- Remove the arch_crash_hotplug_[cpu|memory]_support functions

Signed-off-by: Sourabh Jain 
Acked-by: Baoquan He 
Acked-by: Hari Bathini 
Cc: Akhil Raj 
Cc: Andrew Morton 
Cc: Aneesh Kumar K.V 
Cc: Borislav Petkov (AMD) 
Cc: Boris Ostrovsky 
Cc: Christophe Leroy 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Hildenbrand 
Cc: Eric DeVolder 
Cc: Greg Kroah-Hartman 
Cc: Laurent Dufour 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Mimi Zohar 
Cc: Naveen N Rao 
Cc: Oscar Salvador 
Cc: Thomas Gleixner 
Cc: Valentin Schneider 
Cc: Vivek Goyal 
Cc: ke...@lists.infradead.org
Cc: x...@kernel.org
---
Changes in v18:

* Describe x86 changes in commit message
* Update comment of crash_check_hotplug_support()

 arch/x86/include/asm/kexec.h | 11 ++-
 arch/x86/kernel/crash.c  | 28 +---
 drivers/base/cpu.c   |  2 +-
 drivers/base/memory.c|  2 +-
 include/linux/crash_core.h   | 13 ++---
 include/linux/kexec.h| 11 +++
 include/uapi/linux/kexec.h   |  1 +
 kernel/crash_core.c  | 15 ++-
 kernel/kexec.c   |  4 ++--
 kernel/kexec_file.c  |  5 +
 10 files changed, 48 insertions(+), 44 deletions(-)

diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
index cb1320ebbc23..ae5482a2f0ca 100644
--- a/arch/x86/include/asm/kexec.h
+++ b/arch/x86/include/asm/kexec.h
@@ -210,15 +210,8 @@ extern void kdump_nmi_shootdown_cpus(void);
 void arch_crash_handle_hotplug_event(struct kimage *image, void *arg);
 #define arch_crash_handle_hotplug_event arch_crash_handle_hotplug_event
 
-#ifdef CONFIG_HOTPLUG_CPU
-int arch_crash_hotplug_cpu_support(void);
-#define crash_hotplug_cpu_support arch_crash_hotplug_cpu_support
-#endif
-
-#ifdef CONFIG_MEMORY_HOTPLUG
-int arch_crash_hotplug_memory_support(void);
-#define crash_hotplug_memory_support arch_crash_hotplug_memory_support
-#endif
+int arch_crash_hotplug_support(struct kimage *image, unsigned long 
kexec_flags);
+#define arch_crash_hotplug_support arch_crash_hotplug_support
 
 unsigned int arch_crash_get_elfcorehdr_size(void);
 #define crash_get_elfcorehdr_size arch_crash_get_elfcorehdr_size
diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index 2a682fe86352..f06501445cd9 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -402,20 +402,26 @@ int crash_load_segments(struct kimage *image)
 #undef pr_fmt
 #define pr_fmt(fmt) "crash hp: " fmt
 
-/* These functions provide the value for the sysfs crash_hotplug nodes */
-#ifdef CONFIG_HOTPLUG_CPU
-int arch_crash_hotplug_cpu_support(void)
+int arch_crash_hotplug_support(struct kimage *image, unsigned long kexec_flags)
 {
-   return crash_check_update_elfcorehdr();
-}
-#endif
 
-#ifdef CONFIG_MEMORY_HOTPLUG
-int arch_crash_hotplug_memory_support(void)
-{
-   return crash_check_update_elfcorehdr();
-}
+#ifdef CONFIG_KEXEC_FILE
+   if (image->file_mode)
+   return 1;
 #endif
+   /*
+* Initially, crash hotplug support for kexec_load was added
+* with the KEXEC_UPDATE_ELFCOREHDR flag. Later, this
+* functionality was expanded to accommodate multiple kexec
+* segment updates, leading to the introduction of the
+* KEXEC_CRASH_HOT

[PATCH v18 0/6] powerpc/crash: Kernel handling of CPU and memory hotplug

2024-03-25 Thread Sourabh Jain
ystem call
  - Change in the way this feature is advertise to userspace for both
kexec_load syscall
  - Rebase to v6.6-rc7

v11:
  - Rebase to v6.4-rc6
  - The patch that introduced CONFIG_CRASH_HOTPLUG for PowerPC has been
removed. The config is now part of common configuration:
https://lore.kernel.org/all/87ilbpflsk.fsf@mail.lhotse/

v10:
  - Drop the patch that adds fdt_index attribute to struct kimage_arch
Find the fdt segment index when needed.
  - Added more details into commits messages.
  - Rebased onto 6.3.0-rc5

v9:
  - Removed patch to prepare elfcorehdr crash notes for possible CPUs.
The patch is moved to generic patch series that introduces generic
infrastructure for in kernel crash update.
  - Removed patch to pass the hotplug action type to the arch crash
hotplug handler function. The generic patch series has introduced
the hotplug action type in kimage struct.
  - Add detail commit message for better understanding.

v8:
  - Restrict fdt_index initialization to machine_kexec_post_load
it work for both kexec_load and kexec_file_load.[3/8] Laurent Dufour

  - Updated the logic to find the number of offline core. [6/8]

  - Changed the logic to find the elfcore program header to accommodate
future memory ranges due memory hotplug events. [8/8]

v7
  - added a new config to configure this feature
  - pass hotplug action type to arch specific handler

v6
  - Added crash memory hotplug support

v5:
  - Replace COFNIG_CRASH_HOTPLUG with CONFIG_HOTPLUG_CPU.
  - Move fdt segment identification for kexec_load case to load path
instead of crash hotplug handler
  - Keep new attribute defined under kimage_arch to track FDT segment
under CONFIG_HOTPLUG_CPU config.

v4:
  - Update the logic to find the additional space needed for hotadd CPUs
post kexec load. Refer "[RFC v4 PATCH 4/5] powerpc/crash hp: add crash
hotplug support for kexec_file_load" patch to know more about the
change.
  - Fix a couple of typo.
  - Replace pr_err to pr_info_once to warn user about memory hotplug
support.
  - In crash hotplug handle exit the for loop if FDT segment is found.

v3
  - Move fdt_index and fdt_index_vaild variables to kimage_arch struct.
  - Rebase patche on top of

https://lore.kernel.org/lkml/20220303162725.49640-1-eric.devol...@oracle.com/
  - Fixed warning reported by checpatch script

v2:
  - Use generic hotplug handler introduced by

https://lore.kernel.org/lkml/20220209195706.51522-1-eric.devol...@oracle.com/
a significant change from v1.

Cc: Akhil Raj 
Cc: Andrew Morton 
Cc: Aneesh Kumar K.V 
Cc: Baoquan He 
Cc: Borislav Petkov (AMD) 
Cc: Boris Ostrovsky 
Cc: Christophe Leroy 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Hildenbrand 
Cc: Greg Kroah-Hartman 
Cc: Hari Bathini 
Cc: Laurent Dufour 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Mimi Zohar 
Cc: Naveen N Rao 
Cc: Oscar Salvador 
Cc: Thomas Gleixner 
Cc: Valentin Schneider 
Cc: Vivek Goyal 
Cc: ke...@lists.infradead.org
Cc: x...@kernel.org

Sourabh Jain (6):
  crash: forward memory_notify arg to arch crash hotplug handler
  crash: add a new kexec flag for hotplug support
  powerpc/kexec: move *_memory_ranges functions to ranges.c
  PowerPC/kexec: make the update_cpus_node() function public
  powerpc/crash: add crash CPU hotplug support
  powerpc/crash: add crash memory hotplug support

 arch/powerpc/Kconfig|   4 +
 arch/powerpc/include/asm/kexec.h|  15 ++
 arch/powerpc/include/asm/kexec_ranges.h |  20 +-
 arch/powerpc/kexec/Makefile |   4 +-
 arch/powerpc/kexec/core_64.c|  91 +++
 arch/powerpc/kexec/crash.c  | 196 +++
 arch/powerpc/kexec/elf_64.c |   3 +-
 arch/powerpc/kexec/file_load_64.c   | 314 +++-
 arch/powerpc/kexec/ranges.c | 312 ++-
 arch/x86/include/asm/kexec.h|  13 +-
 arch/x86/kernel/crash.c |  32 ++-
 drivers/base/cpu.c  |   2 +-
 drivers/base/memory.c   |   2 +-
 include/linux/crash_core.h  |  15 +-
 include/linux/kexec.h   |  11 +-
 include/uapi/linux/kexec.h  |   1 +
 kernel/crash_core.c |  29 +--
 kernel/kexec.c  |   4 +-
 kernel/kexec_file.c |   5 +
 19 files changed, 714 insertions(+), 359 deletions(-)

-- 
2.43.0



[PATCH v18 1/6] crash: forward memory_notify arg to arch crash hotplug handler

2024-03-25 Thread Sourabh Jain
In the event of memory hotplug or online/offline events, the crash
memory hotplug notifier `crash_memhp_notifier()` receives a
`memory_notify` object but doesn't forward that object to the
generic and architecture-specific crash hotplug handler.

The `memory_notify` object contains the starting PFN (Page Frame Number)
and the number of pages in the hot-removed memory. This information is
necessary for architectures like PowerPC to update/recreate the kdump
image, specifically `elfcorehdr`.

So update the function signature of `crash_handle_hotplug_event()` and
`arch_crash_handle_hotplug_event()` to accept the `memory_notify` object
as an argument from crash memory hotplug notifier.

Since no such object is available in the case of CPU hotplug event, the
crash CPU hotplug notifier `crash_cpuhp_online()` passes NULL to the
crash hotplug handler.

Signed-off-by: Sourabh Jain 
Acked-by: Baoquan He 
Acked-by: Hari Bathini 
Cc: Akhil Raj 
Cc: Andrew Morton 
Cc: Aneesh Kumar K.V 
Cc: Borislav Petkov (AMD) 
Cc: Boris Ostrovsky 
Cc: Christophe Leroy 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Hildenbrand 
Cc: Greg Kroah-Hartman 
Cc: Laurent Dufour 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Mimi Zohar 
Cc: Naveen N Rao 
Cc: Oscar Salvador 
Cc: Thomas Gleixner 
Cc: Valentin Schneider 
Cc: Vivek Goyal 
Cc: ke...@lists.infradead.org
Cc: x...@kernel.org
---
* No chnages in v18

 arch/x86/include/asm/kexec.h |  2 +-
 arch/x86/kernel/crash.c  |  4 +++-
 include/linux/crash_core.h   |  2 +-
 kernel/crash_core.c  | 14 +++---
 4 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
index 91ca9a9ee3a2..cb1320ebbc23 100644
--- a/arch/x86/include/asm/kexec.h
+++ b/arch/x86/include/asm/kexec.h
@@ -207,7 +207,7 @@ int arch_kimage_file_post_load_cleanup(struct kimage 
*image);
 extern void kdump_nmi_shootdown_cpus(void);
 
 #ifdef CONFIG_CRASH_HOTPLUG
-void arch_crash_handle_hotplug_event(struct kimage *image);
+void arch_crash_handle_hotplug_event(struct kimage *image, void *arg);
 #define arch_crash_handle_hotplug_event arch_crash_handle_hotplug_event
 
 #ifdef CONFIG_HOTPLUG_CPU
diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index e74d0c4286c1..2a682fe86352 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -432,10 +432,12 @@ unsigned int arch_crash_get_elfcorehdr_size(void)
 /**
  * arch_crash_handle_hotplug_event() - Handle hotplug elfcorehdr changes
  * @image: a pointer to kexec_crash_image
+ * @arg: struct memory_notify handler for memory hotplug case and
+ *   NULL for CPU hotplug case.
  *
  * Prepare the new elfcorehdr and replace the existing elfcorehdr.
  */
-void arch_crash_handle_hotplug_event(struct kimage *image)
+void arch_crash_handle_hotplug_event(struct kimage *image, void *arg)
 {
void *elfbuf = NULL, *old_elfcorehdr;
unsigned long nr_mem_ranges;
diff --git a/include/linux/crash_core.h b/include/linux/crash_core.h
index d33352c2e386..647e928efee8 100644
--- a/include/linux/crash_core.h
+++ b/include/linux/crash_core.h
@@ -37,7 +37,7 @@ static inline void arch_kexec_unprotect_crashkres(void) { }
 
 
 #ifndef arch_crash_handle_hotplug_event
-static inline void arch_crash_handle_hotplug_event(struct kimage *image) { }
+static inline void arch_crash_handle_hotplug_event(struct kimage *image, void 
*arg) { }
 #endif
 
 int crash_check_update_elfcorehdr(void);
diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index 78b5dc7cee3a..70fa8111a9d6 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -534,7 +534,7 @@ int crash_check_update_elfcorehdr(void)
  * list of segments it checks (since the elfcorehdr changes and thus
  * would require an update to purgatory itself to update the digest).
  */
-static void crash_handle_hotplug_event(unsigned int hp_action, unsigned int 
cpu)
+static void crash_handle_hotplug_event(unsigned int hp_action, unsigned int 
cpu, void *arg)
 {
struct kimage *image;
 
@@ -596,7 +596,7 @@ static void crash_handle_hotplug_event(unsigned int 
hp_action, unsigned int cpu)
image->hp_action = hp_action;
 
/* Now invoke arch-specific update handler */
-   arch_crash_handle_hotplug_event(image);
+   arch_crash_handle_hotplug_event(image, arg);
 
/* No longer handling a hotplug event */
image->hp_action = KEXEC_CRASH_HP_NONE;
@@ -612,17 +612,17 @@ static void crash_handle_hotplug_event(unsigned int 
hp_action, unsigned int cpu)
crash_hotplug_unlock();
 }
 
-static int crash_memhp_notifier(struct notifier_block *nb, unsigned long val, 
void *v)
+static int crash_memhp_notifier(struct notifier_block *nb, unsigned long val, 
void *arg)
 {
switch (val) {
case MEM_ONLINE:
crash_handle_hotplug_event(KEXEC_CRASH_HP_ADD_MEMORY,
-   KEXEC_CRASH_HP_INVALID_CPU);
+   KEXEC_CRASH_HP_INVALID_CPU, arg);
   

Re: [PATCH v8 1/3] powerpc: make fadump resilient with memory add/remove events

2024-03-11 Thread Sourabh Jain

Hello Hari,

On 11/03/24 14:08, Hari Bathini wrote:



On 17/02/24 12:50 pm, Sourabh Jain wrote:

Due to changes in memory resources caused by either memory hotplug or
online/offline events, the elfcorehdr, which describes the CPUs and
memory of the crashed kernel to the kernel that collects the dump (known
as second/fadump kernel), becomes outdated. Consequently, attempting
dump collection with an outdated elfcorehdr can lead to failed or
inaccurate dump collection.

Memory hotplug or online/offline events is referred as memory add/remove
events in reset of the commit message.

The current solution to address the aforementioned issue is as follows:
Monitor memory add/remove events in userspace using udev rules, and
re-register fadump whenever there are changes in memory resources. This
leads to the creation of a new elfcorehdr with updated system memory
information.

There are several notable issues associated with re-registering fadump
for every memory add/remove events.

1. Bulk memory add/remove events with udev-based fadump re-registration
    can lead to race conditions and, more importantly, it creates a wide
    window during which fadump is inactive until all memory add/remove
    events are settled.
2. Re-registering fadump for every memory add/remove event is
    inefficient.
3. The memory for elfcorehdr is allocated based on the memblock regions
    available during early boot and remains fixed thereafter. 
However, if

    elfcorehdr is later recreated with additional memblock regions, its
    size will increase, potentially leading to memory corruption.

Address the aforementioned challenges by shifting the creation of
elfcorehdr from the first kernel (also referred as the crashed kernel),
where it was created and frequently recreated for every memory
add/remove event, to the fadump kernel. As a result, the elfcorehdr only
needs to be created once, thus eliminating the necessity to re-register
fadump during memory add/remove events.

At present, the first kernel is responsible for preparing the fadump
header and storing it in the fadump reserved area. The fadump header
includes the start address of the elfcorehdr, crashing CPU details, and
other relevant information. In the event of a crash in the first kernel,
the second/fadump boots and accesses the fadump header prepared by the
first kernel. It then performs the following steps in a
platform-specific function [rtas|opal]_fadump_process:

1. Sanity check for fadump header
2. Update CPU notes in elfcorehdr

Along with the above, update the setup_fadump()/fadump.c to create
elfcorehdr and set its address to the global variable elfcorehdr_addr
for the vmcore module to process it in the second/fadump kernel.

Section below outlines the information required to create the elfcorehdr
and the changes made to make it available to the fadump kernel if it's
not already.

To create elfcorehdr, the following crashed kernel information is
required: CPU notes, vmcoreinfo, and memory ranges.

At present, the CPU notes are already prepared in the fadump kernel, so
no changes are needed in that regard. The fadump kernel has access to
all crashed kernel memory regions, including boot memory regions that
are relocated by firmware to fadump reserved areas, so no changes for
that either. However, it is necessary to add new members to the fadump
header, i.e., the 'fadump_crash_info_header' structure, in order to pass
the crashed kernel's vmcoreinfo address and its size to fadump kernel.

In addition to the vmcoreinfo address and size, there are a few other
attributes also added to the fadump_crash_info_header structure.

1. version:
    It stores the fadump header version, which is currently set to 1.
    This provides flexibility to update the fadump crash info header in
    the future without changing the magic number. For each change in the
    fadump header, the version will be increased. This will help the
    updated kernel determine how to handle kernel dumps from older
    kernels. The magic number remains relevant for checking fadump 
header

    corruption.

2. pt_regs_sz/cpu_mask_sz:
    Store size of pt_regs and cpu_mask structure of first kernel. These
    attributes are used to prevent dump processing if the sizes of
    pt_regs or cpu_mask structure differ between the first and fadump
    kernels.

Note: if either first/crashed kernel or second/fadump kernel do not have
the changes introduced here then kernel fail to collect the dump and
prints relevant error message on the console.

Signed-off-by: Sourabh Jain 
Cc: Aditya Gupta 
Cc: Aneesh Kumar K.V 
Cc: Hari Bathini 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Naveen N Rao 
---
  arch/powerpc/include/asm/fadump-internal.h   |  31 +-
  arch/powerpc/kernel/fadump.c | 339 +++
  arch/powerpc/platforms/powernv/opal-fadump.c |  22 +-
  arch/powerpc/platforms/pseries/rtas-fadump.c |  30 +-
  4 files changed, 232 insertions(+), 190 deletions(-)

diff --git a/arch

Re: [PATCH v17 6/6] powerpc/crash: add crash memory hotplug support

2024-03-04 Thread Sourabh Jain




On 02/03/24 18:49, Hari Bathini wrote:



On 26/02/24 2:11 pm, Sourabh Jain wrote:

Extend the arch crash hotplug handler, as introduced by the patch title
("powerpc: add crash CPU hotplug support"), to also support memory
add/remove events.

Elfcorehdr describes the memory of the crash kernel to capture the
kernel; hence, it needs to be updated if memory resources change due to
memory add/remove events. Therefore, arch_crash_handle_hotplug_event()
is updated to recreate the elfcorehdr and replace it with the previous
one on memory add/remove events.

The memblock list is used to prepare the elfcorehdr. In the case of
memory hot remove, the memblock list is updated after the arch crash
hotplug handler is triggered, as depicted in Figure 1. Thus, the
hot-removed memory is explicitly removed from the crash memory ranges
to ensure that the memory ranges added to elfcorehdr do not include the
hot-removed memory.

 Memory remove
   |
   v
 Offline pages
   |
   v
  Initiate memory notify call <> crash hotplug handler
  chain for MEM_OFFLINE event
   |
   v
  Update memblock list

  Figure 1

There are two system calls, `kexec_file_load` and `kexec_load`, used to
load the kdump image. A few changes have been made to ensure that the
kernel can safely update the elfcorehdr component of the kdump image for
both system calls.

For the kexec_file_load syscall, kdump image is prepared in the kernel.
To support an increasing number of memory regions, the elfcorehdr is
built with extra buffer space to ensure that it can accommodate
additional memory ranges in future.

For the kexec_load syscall, the elfcorehdr is updated only if the
KEXEC_CRASH_HOTPLUG_SUPPORT kexec flag is passed to the kernel by the
kexec tool. Passing this flag to the kernel indicates that the
elfcorehdr is built to accommodate additional memory ranges and the
elfcorehdr segment is not considered for SHA calculation, making it safe
to update.

The changes related to this feature are kept under the CRASH_HOTPLUG
config, and it is enabled by default.



Overall, the patchset looks good. I tried out the changes too.

Acked-by: Hari Bathini 


Hello Hari,

Thanks for trying out the change.

- Sourabh




Signed-off-by: Sourabh Jain 
Cc: Akhil Raj 
Cc: Andrew Morton 
Cc: Aneesh Kumar K.V 
Cc: Baoquan He 
Cc: Borislav Petkov (AMD) 
Cc: Boris Ostrovsky 
Cc: Christophe Leroy 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Hildenbrand 
Cc: Greg Kroah-Hartman 
Cc: Hari Bathini 
Cc: Laurent Dufour 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Mimi Zohar 
Cc: Naveen N Rao 
Cc: Oscar Salvador 
Cc: Thomas Gleixner 
Cc: Valentin Schneider 
Cc: Vivek Goyal 
Cc: ke...@lists.infradead.org
Cc: x...@kernel.org
---
  arch/powerpc/include/asm/kexec.h    |  3 +
  arch/powerpc/include/asm/kexec_ranges.h |  1 +
  arch/powerpc/kexec/crash.c  | 95 -
  arch/powerpc/kexec/file_load_64.c   | 20 +-
  arch/powerpc/kexec/ranges.c | 85 ++
  5 files changed, 202 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/kexec.h 
b/arch/powerpc/include/asm/kexec.h

index e75970351bcd..95a98b390d62 100644
--- a/arch/powerpc/include/asm/kexec.h
+++ b/arch/powerpc/include/asm/kexec.h
@@ -141,6 +141,9 @@ void arch_crash_handle_hotplug_event(struct 
kimage *image, void *arg);
    int arch_crash_hotplug_support(struct kimage *image, unsigned 
long kexec_flags);

  #define arch_crash_hotplug_support arch_crash_hotplug_support
+
+unsigned int arch_crash_get_elfcorehdr_size(void);
+#define crash_get_elfcorehdr_size arch_crash_get_elfcorehdr_size
  #endif /* CONFIG_CRASH_HOTPLUG */
    extern int crashing_cpu;
diff --git a/arch/powerpc/include/asm/kexec_ranges.h 
b/arch/powerpc/include/asm/kexec_ranges.h

index 8489e844b447..14055896cbcb 100644
--- a/arch/powerpc/include/asm/kexec_ranges.h
+++ b/arch/powerpc/include/asm/kexec_ranges.h
@@ -7,6 +7,7 @@
  void sort_memory_ranges(struct crash_mem *mrngs, bool merge);
  struct crash_mem *realloc_mem_ranges(struct crash_mem **mem_ranges);
  int add_mem_range(struct crash_mem **mem_ranges, u64 base, u64 size);
+int remove_mem_range(struct crash_mem **mem_ranges, u64 base, u64 
size);

  int get_exclude_memory_ranges(struct crash_mem **mem_ranges);
  int get_reserved_memory_ranges(struct crash_mem **mem_ranges);
  int get_crash_memory_ranges(struct crash_mem **mem_ranges);
diff --git a/arch/powerpc/kexec/crash.c b/arch/powerpc/kexec/crash.c
index 8938a19af12f..21b193e938a3 100644
--- a/arch/powerpc/kexec/crash.c
+++ b/arch/powerpc/kexec/crash.c
@@ -17,6 +17,7 @@
  #include 
  #include 
  #include 
+#include 
    #include 
  #include 
@@ -25,6 +26,7 @@
  #include 
  #include 
  #include 
+#include 
    /*
   * The primary CPU waits a while for all secondary CPUs to enter. 
This is to
@@ -398,6 +400,94 @@ void default_machine_crash_shutdown(struct 
pt_regs *regs)

  #undef pr_fmt

Re: [PATCH v17 2/6] crash: add a new kexec flag for hotplug support

2024-03-03 Thread Sourabh Jain

Hello Hari,

On 02/03/24 18:47, Hari Bathini wrote:



On 26/02/24 2:11 pm, Sourabh Jain wrote:

Commit a72bbec70da2 ("crash: hotplug support for kexec_load()")
introduced a new kexec flag, `KEXEC_UPDATE_ELFCOREHDR`. Kexec tool uses
this flag to indicate to the kernel that it is safe to modify the
elfcorehdr of the kdump image loaded using the kexec_load system call.

However, it is possible that architectures may need to update kexec
segments other then elfcorehdr. For example, FDT (Flatten Device Tree)
on PowerPC. Introducing a new kexec flag for every new kexec segment
may not be a good solution. Hence, a generic kexec flag bit,
`KEXEC_CRASH_HOTPLUG_SUPPORT`, is introduced to share the CPU/Memory
hotplug support intent between the kexec tool and the kernel for the
kexec_load system call.

Now, if the kexec tool sends KEXEC_CRASH_HOTPLUG_SUPPORT kexec flag to
the kernel, it indicates to the kernel that all the required kexec
segment is skipped from SHA calculation and it is safe to update kdump
image loaded using the kexec_load syscall.

While loading the kdump image using the kexec_load syscall, the
@update_elfcorehdr member of struct kimage is set if the kexec tool
sends the KEXEC_UPDATE_ELFCOREHDR kexec flag. This member is later used
to determine whether it is safe to update elfcorehdr on hotplug events.
However, with the introduction of the KEXEC_CRASH_HOTPLUG_SUPPORT kexec
flag, the kexec tool could mark all the required kexec segments on an
architecture as safe to update. So rename the @update_elfcorehdr to
@hotplug_support. If @hotplug_support is set, the kernel can safely
update all the required kexec segments of the kdump image during
CPU/Memory hotplug events.

Introduce an architecture-specific function to process kexec flags for
determining hotplug support. Set the @hotplug_support member of struct
kimage for both kexec_load and kexec_file_load system calls. This
simplifies kernel checks to identify hotplug support for the currently
loaded kdump image by just examining the value of @hotplug_support.



Couple of minor nits. See comments below.
Otherwise, looks good to me.

Acked-by: Hari Bathini 


Thank you!




Signed-off-by: Sourabh Jain 
Cc: Akhil Raj 
Cc: Andrew Morton 
Cc: Aneesh Kumar K.V 
Cc: Baoquan He 
Cc: Borislav Petkov (AMD) 
Cc: Boris Ostrovsky 
Cc: Christophe Leroy 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Hildenbrand 
Cc: Eric DeVolder 
Cc: Greg Kroah-Hartman 
Cc: Hari Bathini 
Cc: Laurent Dufour 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Mimi Zohar 
Cc: Naveen N Rao 
Cc: Oscar Salvador 
Cc: Thomas Gleixner 
Cc: Valentin Schneider 
Cc: Vivek Goyal 
Cc: ke...@lists.infradead.org
Cc: x...@kernel.org
---
  arch/x86/include/asm/kexec.h | 11 ++-
  arch/x86/kernel/crash.c  | 28 +---
  drivers/base/cpu.c   |  2 +-
  drivers/base/memory.c    |  2 +-
  include/linux/crash_core.h   | 13 ++---
  include/linux/kexec.h    | 11 +++
  include/uapi/linux/kexec.h   |  1 +
  kernel/crash_core.c  | 11 ---
  kernel/kexec.c   |  4 ++--
  kernel/kexec_file.c  |  5 +
  10 files changed, 46 insertions(+), 42 deletions(-)

diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
index cb1320ebbc23..ae5482a2f0ca 100644
--- a/arch/x86/include/asm/kexec.h
+++ b/arch/x86/include/asm/kexec.h
@@ -210,15 +210,8 @@ extern void kdump_nmi_shootdown_cpus(void);
  void arch_crash_handle_hotplug_event(struct kimage *image, void *arg);
  #define arch_crash_handle_hotplug_event 
arch_crash_handle_hotplug_event

  -#ifdef CONFIG_HOTPLUG_CPU
-int arch_crash_hotplug_cpu_support(void);
-#define crash_hotplug_cpu_support arch_crash_hotplug_cpu_support
-#endif
-
-#ifdef CONFIG_MEMORY_HOTPLUG
-int arch_crash_hotplug_memory_support(void);
-#define crash_hotplug_memory_support arch_crash_hotplug_memory_support
-#endif
+int arch_crash_hotplug_support(struct kimage *image, unsigned long 
kexec_flags);

+#define arch_crash_hotplug_support arch_crash_hotplug_support
    unsigned int arch_crash_get_elfcorehdr_size(void);
  #define crash_get_elfcorehdr_size arch_crash_get_elfcorehdr_size
diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index 2a682fe86352..f06501445cd9 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -402,20 +402,26 @@ int crash_load_segments(struct kimage *image)
  #undef pr_fmt
  #define pr_fmt(fmt) "crash hp: " fmt




-/* These functions provide the value for the sysfs crash_hotplug 
nodes */

-#ifdef CONFIG_HOTPLUG_CPU
-int arch_crash_hotplug_cpu_support(void)
+int arch_crash_hotplug_support(struct kimage *image, unsigned long 
kexec_flags)

  {
-    return crash_check_update_elfcorehdr();
-}
-#endif
  -#ifdef CONFIG_MEMORY_HOTPLUG
-int arch_crash_hotplug_memory_support(void)
-{
-    return crash_check_update_elfcorehdr();
-}
+#ifdef CONFIG_KEXEC_FILE
+    if (image->file_mode)
+    return 1;
  #endif
+    /*
+

Re: [PATCH v17 0/6] powerpc/crash: Kernel handling of CPU and memory hotplug

2024-02-29 Thread Sourabh Jain

Hello Baoquan,

On 29/02/24 19:21, Baoquan He wrote:

Hi Sourabh,

On 02/26/24 at 02:11pm, Sourabh Jain wrote:

Commit 247262756121 ("crash: add generic infrastructure for crash
hotplug support") added a generic infrastructure that allows
architectures to selectively update the kdump image component during CPU
or memory add/remove events within the kernel itself.

This patch series adds crash hotplug handler for PowerPC and enable
support to update the kdump image on CPU/Memory add/remove events.

Among the 5 patches in this series, the first two patches make changes
to the generic crash hotplug handler to assist PowerPC in adding support
for this feature. The last three patches add support for this feature.

The whole series looks good to me. I have acked patch 1 and 2. Leave
those three ppc patches to ppc expert to review and approve. Thanks a
lot for your great work.
Thanks for your feedback. I will soon send v18 to fix the two mirror 
document issues
and will look forward to PPC maintainers to provide feedback on the rest 
of the series.


Appreciate your support!

- Sourabh


Re: [PATCH v17 2/6] crash: add a new kexec flag for hotplug support

2024-02-29 Thread Sourabh Jain

Hello

On 29/02/24 12:58, Baoquan He wrote:

On 02/26/24 at 02:11pm, Sourabh Jain wrote:
..snip...

diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index 70fa8111a9d6..630c4fd7ea39 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -496,7 +496,7 @@ static DEFINE_MUTEX(__crash_hotplug_lock);
   * It reflects the kernel's ability/permission to update the crash
   * elfcorehdr directly.

  ~ this should be updated too.


   */
-int crash_check_update_elfcorehdr(void)
+int crash_check_hotplug_support(void)
  {
int rc = 0;
  
@@ -508,10 +508,7 @@ int crash_check_update_elfcorehdr(void)

return 0;
}
if (kexec_crash_image) {
-   if (kexec_crash_image->file_mode)
-   rc = 1;
-   else
-   rc = kexec_crash_image->update_elfcorehdr;
+   rc = kexec_crash_image->hotplug_support;
}
/* Release lock now that update complete */
kexec_unlock();
@@ -552,8 +549,8 @@ static void crash_handle_hotplug_event(unsigned int 
hp_action, unsigned int cpu,
  
  	image = kexec_crash_image;
  
-	/* Check that updating elfcorehdr is permitted */

-   if (!(image->file_mode || image->update_elfcorehdr))
+   /* Check that kexec segments update is permitted */
+   if (!image->hotplug_support)
goto out;
  
  	if (hp_action == KEXEC_CRASH_HP_ADD_CPU ||

diff --git a/kernel/kexec.c b/kernel/kexec.c
index bab542fc1463..a6b3f96bb50c 100644
--- a/kernel/kexec.c
+++ b/kernel/kexec.c
@@ -135,8 +135,8 @@ static int do_kexec_load(unsigned long entry, unsigned long 
nr_segments,
image->preserve_context = 1;
  
  #ifdef CONFIG_CRASH_HOTPLUG

-   if (flags & KEXEC_UPDATE_ELFCOREHDR)
-   image->update_elfcorehdr = 1;
+   if ((flags & KEXEC_ON_CRASH) && arch_crash_hotplug_support(image, 
flags))
+   image->hotplug_support = 1;
  #endif
  
  	ret = machine_kexec_prepare(image);

diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index 2d1db05fbf04..3d64290d24c9 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -376,6 +376,11 @@ SYSCALL_DEFINE5(kexec_file_load, int, kernel_fd, int, 
initrd_fd,
if (ret)
goto out;
  
+#ifdef CONFIG_CRASH_HOTPLUG

+   if ((flags & KEXEC_FILE_ON_CRASH) && arch_crash_hotplug_support(image, 
flags))
+   image->hotplug_support = 1;
+#endif
+
ret = machine_kexec_prepare(image);
if (ret)
goto out;

Other than the tiny part, the overall looks good to me.

Acked-by: Baoquan He 



Thank you for the review and feedback.

- Sourabh


Re: [PATCH v17 3/6] powerpc/kexec: move *_memory_ranges functions to ranges.c

2024-02-29 Thread Sourabh Jain




On 29/02/24 13:41, Baoquan He wrote:

On 02/26/24 at 02:11pm, Sourabh Jain wrote:

Move the following functions form kexec/{file_load_64.c => ranges.c} and
make them public so that components other KEXEC_FILE can also use these

^
   'than' missed?


Yes, I will update it.

Thanks,
Sourabh Jain


functions.
1. get_exclude_memory_ranges
2. get_reserved_memory_ranges
3. get_crash_memory_ranges
4. get_usable_memory_ranges

Later in the series get_crash_memory_ranges function is utilized for
in-kernel updates to kdump image during CPU/Memory hotplug or
online/offline events for both kexec_load and kexec_file_load syscalls.

Since the above functions are moved to ranges.c, some of the helper
functions in ranges.c are no longer required to be public. Mark them as
static and removed them from kexec_ranges.h header file.

Finally, remove the CONFIG_KEXEC_FILE build dependency for range.c
because it is required for other config, such as CONFIG_CRASH_DUMP.

No functional changes are intended.

..snip





Re: [PATCH v17 2/6] crash: add a new kexec flag for hotplug support

2024-02-29 Thread Sourabh Jain

Hello Baoquan,

On 29/02/24 12:58, Baoquan He wrote:

On 02/26/24 at 02:11pm, Sourabh Jain wrote:
..snip...

diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index 70fa8111a9d6..630c4fd7ea39 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -496,7 +496,7 @@ static DEFINE_MUTEX(__crash_hotplug_lock);
   * It reflects the kernel's ability/permission to update the crash
   * elfcorehdr directly.

  ~ this should be updated too.


Yes, it should.

Thanks,
Sourabh



   */
-int crash_check_update_elfcorehdr(void)
+int crash_check_hotplug_support(void)
  {
int rc = 0;
  
@@ -508,10 +508,7 @@ int crash_check_update_elfcorehdr(void)

return 0;
}
if (kexec_crash_image) {
-   if (kexec_crash_image->file_mode)
-   rc = 1;
-   else
-   rc = kexec_crash_image->update_elfcorehdr;
+   rc = kexec_crash_image->hotplug_support;
}
/* Release lock now that update complete */
kexec_unlock();
@@ -552,8 +549,8 @@ static void crash_handle_hotplug_event(unsigned int 
hp_action, unsigned int cpu,
  
  	image = kexec_crash_image;
  
-	/* Check that updating elfcorehdr is permitted */

-   if (!(image->file_mode || image->update_elfcorehdr))
+   /* Check that kexec segments update is permitted */
+   if (!image->hotplug_support)
goto out;
  
  	if (hp_action == KEXEC_CRASH_HP_ADD_CPU ||

diff --git a/kernel/kexec.c b/kernel/kexec.c
index bab542fc1463..a6b3f96bb50c 100644
--- a/kernel/kexec.c
+++ b/kernel/kexec.c
@@ -135,8 +135,8 @@ static int do_kexec_load(unsigned long entry, unsigned long 
nr_segments,
image->preserve_context = 1;
  
  #ifdef CONFIG_CRASH_HOTPLUG

-   if (flags & KEXEC_UPDATE_ELFCOREHDR)
-   image->update_elfcorehdr = 1;
+   if ((flags & KEXEC_ON_CRASH) && arch_crash_hotplug_support(image, 
flags))
+   image->hotplug_support = 1;
  #endif
  
  	ret = machine_kexec_prepare(image);

diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index 2d1db05fbf04..3d64290d24c9 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -376,6 +376,11 @@ SYSCALL_DEFINE5(kexec_file_load, int, kernel_fd, int, 
initrd_fd,
if (ret)
goto out;
  
+#ifdef CONFIG_CRASH_HOTPLUG

+   if ((flags & KEXEC_FILE_ON_CRASH) && arch_crash_hotplug_support(image, 
flags))
+   image->hotplug_support = 1;
+#endif
+
ret = machine_kexec_prepare(image);
if (ret)
goto out;

Other than the tiny part, the overall looks good to me.

Acked-by: Baoquan He 





Re: [PATCH v17 2/6] crash: add a new kexec flag for hotplug support

2024-02-29 Thread Sourabh Jain




On 29/02/24 11:26, Baoquan He wrote:

On 02/29/24 at 10:35am, Sourabh Jain wrote:

Hello Baoquan,

Do you have any comments or suggestions for this patch series, especially
for this patch?

Have applied this series and reviewing, will ack or add comment if any
concern. Thanks.


Thanks, looking forward to your feedback!

- Sourabh




On 26/02/24 14:11, Sourabh Jain wrote:

Commit a72bbec70da2 ("crash: hotplug support for kexec_load()")
introduced a new kexec flag, `KEXEC_UPDATE_ELFCOREHDR`. Kexec tool uses
this flag to indicate to the kernel that it is safe to modify the
elfcorehdr of the kdump image loaded using the kexec_load system call.

However, it is possible that architectures may need to update kexec
segments other then elfcorehdr. For example, FDT (Flatten Device Tree)
on PowerPC. Introducing a new kexec flag for every new kexec segment
may not be a good solution. Hence, a generic kexec flag bit,
`KEXEC_CRASH_HOTPLUG_SUPPORT`, is introduced to share the CPU/Memory
hotplug support intent between the kexec tool and the kernel for the
kexec_load system call.

Now, if the kexec tool sends KEXEC_CRASH_HOTPLUG_SUPPORT kexec flag to
the kernel, it indicates to the kernel that all the required kexec
segment is skipped from SHA calculation and it is safe to update kdump
image loaded using the kexec_load syscall.

While loading the kdump image using the kexec_load syscall, the
@update_elfcorehdr member of struct kimage is set if the kexec tool
sends the KEXEC_UPDATE_ELFCOREHDR kexec flag. This member is later used
to determine whether it is safe to update elfcorehdr on hotplug events.
However, with the introduction of the KEXEC_CRASH_HOTPLUG_SUPPORT kexec
flag, the kexec tool could mark all the required kexec segments on an
architecture as safe to update. So rename the @update_elfcorehdr to
@hotplug_support. If @hotplug_support is set, the kernel can safely
update all the required kexec segments of the kdump image during
CPU/Memory hotplug events.

Introduce an architecture-specific function to process kexec flags for
determining hotplug support. Set the @hotplug_support member of struct
kimage for both kexec_load and kexec_file_load system calls. This
simplifies kernel checks to identify hotplug support for the currently
loaded kdump image by just examining the value of @hotplug_support.

Signed-off-by: Sourabh Jain 
Cc: Akhil Raj 
Cc: Andrew Morton 
Cc: Aneesh Kumar K.V 
Cc: Baoquan He 
Cc: Borislav Petkov (AMD) 
Cc: Boris Ostrovsky 
Cc: Christophe Leroy 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Hildenbrand 
Cc: Eric DeVolder 
Cc: Greg Kroah-Hartman 
Cc: Hari Bathini 
Cc: Laurent Dufour 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Mimi Zohar 
Cc: Naveen N Rao 
Cc: Oscar Salvador 
Cc: Thomas Gleixner 
Cc: Valentin Schneider 
Cc: Vivek Goyal 
Cc: ke...@lists.infradead.org
Cc: x...@kernel.org
---
   arch/x86/include/asm/kexec.h | 11 ++-
   arch/x86/kernel/crash.c  | 28 +---
   drivers/base/cpu.c   |  2 +-
   drivers/base/memory.c|  2 +-
   include/linux/crash_core.h   | 13 ++---
   include/linux/kexec.h| 11 +++
   include/uapi/linux/kexec.h   |  1 +
   kernel/crash_core.c  | 11 ---
   kernel/kexec.c   |  4 ++--
   kernel/kexec_file.c  |  5 +
   10 files changed, 46 insertions(+), 42 deletions(-)

diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
index cb1320ebbc23..ae5482a2f0ca 100644
--- a/arch/x86/include/asm/kexec.h
+++ b/arch/x86/include/asm/kexec.h
@@ -210,15 +210,8 @@ extern void kdump_nmi_shootdown_cpus(void);
   void arch_crash_handle_hotplug_event(struct kimage *image, void *arg);
   #define arch_crash_handle_hotplug_event arch_crash_handle_hotplug_event
-#ifdef CONFIG_HOTPLUG_CPU
-int arch_crash_hotplug_cpu_support(void);
-#define crash_hotplug_cpu_support arch_crash_hotplug_cpu_support
-#endif
-
-#ifdef CONFIG_MEMORY_HOTPLUG
-int arch_crash_hotplug_memory_support(void);
-#define crash_hotplug_memory_support arch_crash_hotplug_memory_support
-#endif
+int arch_crash_hotplug_support(struct kimage *image, unsigned long 
kexec_flags);
+#define arch_crash_hotplug_support arch_crash_hotplug_support
   unsigned int arch_crash_get_elfcorehdr_size(void);
   #define crash_get_elfcorehdr_size arch_crash_get_elfcorehdr_size
diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index 2a682fe86352..f06501445cd9 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -402,20 +402,26 @@ int crash_load_segments(struct kimage *image)
   #undef pr_fmt
   #define pr_fmt(fmt) "crash hp: " fmt
-/* These functions provide the value for the sysfs crash_hotplug nodes */
-#ifdef CONFIG_HOTPLUG_CPU
-int arch_crash_hotplug_cpu_support(void)
+int arch_crash_hotplug_support(struct kimage *image, unsigned long kexec_flags)
   {
-   return crash_check_update_elfcorehdr();
-}
-#endif
-#ifdef CONFIG_MEMORY_HOTPLUG
-int arch_crash_ho

Re: [PATCH v8 1/3] powerpc: make fadump resilient with memory add/remove events

2024-02-28 Thread Sourabh Jain

Hello Michael and Aneesh,

Please let me know if you have any comments or suggestions for this 
patch series.


Thanks,
Sourabh


On 17/02/24 12:50, Sourabh Jain wrote:

Due to changes in memory resources caused by either memory hotplug or
online/offline events, the elfcorehdr, which describes the CPUs and
memory of the crashed kernel to the kernel that collects the dump (known
as second/fadump kernel), becomes outdated. Consequently, attempting
dump collection with an outdated elfcorehdr can lead to failed or
inaccurate dump collection.

Memory hotplug or online/offline events is referred as memory add/remove
events in reset of the commit message.

The current solution to address the aforementioned issue is as follows:
Monitor memory add/remove events in userspace using udev rules, and
re-register fadump whenever there are changes in memory resources. This
leads to the creation of a new elfcorehdr with updated system memory
information.

There are several notable issues associated with re-registering fadump
for every memory add/remove events.

1. Bulk memory add/remove events with udev-based fadump re-registration
can lead to race conditions and, more importantly, it creates a wide
window during which fadump is inactive until all memory add/remove
events are settled.
2. Re-registering fadump for every memory add/remove event is
inefficient.
3. The memory for elfcorehdr is allocated based on the memblock regions
available during early boot and remains fixed thereafter. However, if
elfcorehdr is later recreated with additional memblock regions, its
size will increase, potentially leading to memory corruption.

Address the aforementioned challenges by shifting the creation of
elfcorehdr from the first kernel (also referred as the crashed kernel),
where it was created and frequently recreated for every memory
add/remove event, to the fadump kernel. As a result, the elfcorehdr only
needs to be created once, thus eliminating the necessity to re-register
fadump during memory add/remove events.

At present, the first kernel is responsible for preparing the fadump
header and storing it in the fadump reserved area. The fadump header
includes the start address of the elfcorehdr, crashing CPU details, and
other relevant information. In the event of a crash in the first kernel,
the second/fadump boots and accesses the fadump header prepared by the
first kernel. It then performs the following steps in a
platform-specific function [rtas|opal]_fadump_process:

1. Sanity check for fadump header
2. Update CPU notes in elfcorehdr

Along with the above, update the setup_fadump()/fadump.c to create
elfcorehdr and set its address to the global variable elfcorehdr_addr
for the vmcore module to process it in the second/fadump kernel.

Section below outlines the information required to create the elfcorehdr
and the changes made to make it available to the fadump kernel if it's
not already.

To create elfcorehdr, the following crashed kernel information is
required: CPU notes, vmcoreinfo, and memory ranges.

At present, the CPU notes are already prepared in the fadump kernel, so
no changes are needed in that regard. The fadump kernel has access to
all crashed kernel memory regions, including boot memory regions that
are relocated by firmware to fadump reserved areas, so no changes for
that either. However, it is necessary to add new members to the fadump
header, i.e., the 'fadump_crash_info_header' structure, in order to pass
the crashed kernel's vmcoreinfo address and its size to fadump kernel.

In addition to the vmcoreinfo address and size, there are a few other
attributes also added to the fadump_crash_info_header structure.

1. version:
It stores the fadump header version, which is currently set to 1.
This provides flexibility to update the fadump crash info header in
the future without changing the magic number. For each change in the
fadump header, the version will be increased. This will help the
updated kernel determine how to handle kernel dumps from older
kernels. The magic number remains relevant for checking fadump header
corruption.

2. pt_regs_sz/cpu_mask_sz:
Store size of pt_regs and cpu_mask structure of first kernel. These
attributes are used to prevent dump processing if the sizes of
pt_regs or cpu_mask structure differ between the first and fadump
kernels.

Note: if either first/crashed kernel or second/fadump kernel do not have
the changes introduced here then kernel fail to collect the dump and
prints relevant error message on the console.

Signed-off-by: Sourabh Jain 
Cc: Aditya Gupta 
Cc: Aneesh Kumar K.V 
Cc: Hari Bathini 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Naveen N Rao 
---
  arch/powerpc/include/asm/fadump-internal.h   |  31 +-
  arch/powerpc/kernel/fadump.c | 339 +++
  arch/powerpc/platforms/powernv/opal-fadump.c |  22 +-
  arch/powerpc/platforms/pseries/rtas-fadump.c |  30 +-
  4

Re: [PATCH v17 2/6] crash: add a new kexec flag for hotplug support

2024-02-28 Thread Sourabh Jain

Hello Baoquan,

Do you have any comments or suggestions for this patch series, 
especially for this patch?


Thanks,
Sourabh

On 26/02/24 14:11, Sourabh Jain wrote:

Commit a72bbec70da2 ("crash: hotplug support for kexec_load()")
introduced a new kexec flag, `KEXEC_UPDATE_ELFCOREHDR`. Kexec tool uses
this flag to indicate to the kernel that it is safe to modify the
elfcorehdr of the kdump image loaded using the kexec_load system call.

However, it is possible that architectures may need to update kexec
segments other then elfcorehdr. For example, FDT (Flatten Device Tree)
on PowerPC. Introducing a new kexec flag for every new kexec segment
may not be a good solution. Hence, a generic kexec flag bit,
`KEXEC_CRASH_HOTPLUG_SUPPORT`, is introduced to share the CPU/Memory
hotplug support intent between the kexec tool and the kernel for the
kexec_load system call.

Now, if the kexec tool sends KEXEC_CRASH_HOTPLUG_SUPPORT kexec flag to
the kernel, it indicates to the kernel that all the required kexec
segment is skipped from SHA calculation and it is safe to update kdump
image loaded using the kexec_load syscall.

While loading the kdump image using the kexec_load syscall, the
@update_elfcorehdr member of struct kimage is set if the kexec tool
sends the KEXEC_UPDATE_ELFCOREHDR kexec flag. This member is later used
to determine whether it is safe to update elfcorehdr on hotplug events.
However, with the introduction of the KEXEC_CRASH_HOTPLUG_SUPPORT kexec
flag, the kexec tool could mark all the required kexec segments on an
architecture as safe to update. So rename the @update_elfcorehdr to
@hotplug_support. If @hotplug_support is set, the kernel can safely
update all the required kexec segments of the kdump image during
CPU/Memory hotplug events.

Introduce an architecture-specific function to process kexec flags for
determining hotplug support. Set the @hotplug_support member of struct
kimage for both kexec_load and kexec_file_load system calls. This
simplifies kernel checks to identify hotplug support for the currently
loaded kdump image by just examining the value of @hotplug_support.

Signed-off-by: Sourabh Jain 
Cc: Akhil Raj 
Cc: Andrew Morton 
Cc: Aneesh Kumar K.V 
Cc: Baoquan He 
Cc: Borislav Petkov (AMD) 
Cc: Boris Ostrovsky 
Cc: Christophe Leroy 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Hildenbrand 
Cc: Eric DeVolder 
Cc: Greg Kroah-Hartman 
Cc: Hari Bathini 
Cc: Laurent Dufour 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Mimi Zohar 
Cc: Naveen N Rao 
Cc: Oscar Salvador 
Cc: Thomas Gleixner 
Cc: Valentin Schneider 
Cc: Vivek Goyal 
Cc: ke...@lists.infradead.org
Cc: x...@kernel.org
---
  arch/x86/include/asm/kexec.h | 11 ++-
  arch/x86/kernel/crash.c  | 28 +---
  drivers/base/cpu.c   |  2 +-
  drivers/base/memory.c|  2 +-
  include/linux/crash_core.h   | 13 ++---
  include/linux/kexec.h| 11 +++
  include/uapi/linux/kexec.h   |  1 +
  kernel/crash_core.c  | 11 ---
  kernel/kexec.c   |  4 ++--
  kernel/kexec_file.c  |  5 +
  10 files changed, 46 insertions(+), 42 deletions(-)

diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
index cb1320ebbc23..ae5482a2f0ca 100644
--- a/arch/x86/include/asm/kexec.h
+++ b/arch/x86/include/asm/kexec.h
@@ -210,15 +210,8 @@ extern void kdump_nmi_shootdown_cpus(void);
  void arch_crash_handle_hotplug_event(struct kimage *image, void *arg);
  #define arch_crash_handle_hotplug_event arch_crash_handle_hotplug_event
  
-#ifdef CONFIG_HOTPLUG_CPU

-int arch_crash_hotplug_cpu_support(void);
-#define crash_hotplug_cpu_support arch_crash_hotplug_cpu_support
-#endif
-
-#ifdef CONFIG_MEMORY_HOTPLUG
-int arch_crash_hotplug_memory_support(void);
-#define crash_hotplug_memory_support arch_crash_hotplug_memory_support
-#endif
+int arch_crash_hotplug_support(struct kimage *image, unsigned long 
kexec_flags);
+#define arch_crash_hotplug_support arch_crash_hotplug_support
  
  unsigned int arch_crash_get_elfcorehdr_size(void);

  #define crash_get_elfcorehdr_size arch_crash_get_elfcorehdr_size
diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index 2a682fe86352..f06501445cd9 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -402,20 +402,26 @@ int crash_load_segments(struct kimage *image)
  #undef pr_fmt
  #define pr_fmt(fmt) "crash hp: " fmt
  
-/* These functions provide the value for the sysfs crash_hotplug nodes */

-#ifdef CONFIG_HOTPLUG_CPU
-int arch_crash_hotplug_cpu_support(void)
+int arch_crash_hotplug_support(struct kimage *image, unsigned long kexec_flags)
  {
-   return crash_check_update_elfcorehdr();
-}
-#endif
  
-#ifdef CONFIG_MEMORY_HOTPLUG

-int arch_crash_hotplug_memory_support(void)
-{
-   return crash_check_update_elfcorehdr();
-}
+#ifdef CONFIG_KEXEC_FILE
+   if (image->file_mode)
+   return 1;
  #endif
+   /*
+* Initially

[PATCH v17 6/6] powerpc/crash: add crash memory hotplug support

2024-02-26 Thread Sourabh Jain
Extend the arch crash hotplug handler, as introduced by the patch title
("powerpc: add crash CPU hotplug support"), to also support memory
add/remove events.

Elfcorehdr describes the memory of the crash kernel to capture the
kernel; hence, it needs to be updated if memory resources change due to
memory add/remove events. Therefore, arch_crash_handle_hotplug_event()
is updated to recreate the elfcorehdr and replace it with the previous
one on memory add/remove events.

The memblock list is used to prepare the elfcorehdr. In the case of
memory hot remove, the memblock list is updated after the arch crash
hotplug handler is triggered, as depicted in Figure 1. Thus, the
hot-removed memory is explicitly removed from the crash memory ranges
to ensure that the memory ranges added to elfcorehdr do not include the
hot-removed memory.

Memory remove
  |
  v
Offline pages
  |
  v
 Initiate memory notify call <> crash hotplug handler
 chain for MEM_OFFLINE event
  |
  v
 Update memblock list

Figure 1

There are two system calls, `kexec_file_load` and `kexec_load`, used to
load the kdump image. A few changes have been made to ensure that the
kernel can safely update the elfcorehdr component of the kdump image for
both system calls.

For the kexec_file_load syscall, kdump image is prepared in the kernel.
To support an increasing number of memory regions, the elfcorehdr is
built with extra buffer space to ensure that it can accommodate
additional memory ranges in future.

For the kexec_load syscall, the elfcorehdr is updated only if the
KEXEC_CRASH_HOTPLUG_SUPPORT kexec flag is passed to the kernel by the
kexec tool. Passing this flag to the kernel indicates that the
elfcorehdr is built to accommodate additional memory ranges and the
elfcorehdr segment is not considered for SHA calculation, making it safe
to update.

The changes related to this feature are kept under the CRASH_HOTPLUG
config, and it is enabled by default.

Signed-off-by: Sourabh Jain 
Cc: Akhil Raj 
Cc: Andrew Morton 
Cc: Aneesh Kumar K.V 
Cc: Baoquan He 
Cc: Borislav Petkov (AMD) 
Cc: Boris Ostrovsky 
Cc: Christophe Leroy 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Hildenbrand 
Cc: Greg Kroah-Hartman 
Cc: Hari Bathini 
Cc: Laurent Dufour 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Mimi Zohar 
Cc: Naveen N Rao 
Cc: Oscar Salvador 
Cc: Thomas Gleixner 
Cc: Valentin Schneider 
Cc: Vivek Goyal 
Cc: ke...@lists.infradead.org
Cc: x...@kernel.org
---
 arch/powerpc/include/asm/kexec.h|  3 +
 arch/powerpc/include/asm/kexec_ranges.h |  1 +
 arch/powerpc/kexec/crash.c  | 95 -
 arch/powerpc/kexec/file_load_64.c   | 20 +-
 arch/powerpc/kexec/ranges.c | 85 ++
 5 files changed, 202 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
index e75970351bcd..95a98b390d62 100644
--- a/arch/powerpc/include/asm/kexec.h
+++ b/arch/powerpc/include/asm/kexec.h
@@ -141,6 +141,9 @@ void arch_crash_handle_hotplug_event(struct kimage *image, 
void *arg);
 
 int arch_crash_hotplug_support(struct kimage *image, unsigned long 
kexec_flags);
 #define arch_crash_hotplug_support arch_crash_hotplug_support
+
+unsigned int arch_crash_get_elfcorehdr_size(void);
+#define crash_get_elfcorehdr_size arch_crash_get_elfcorehdr_size
 #endif /* CONFIG_CRASH_HOTPLUG */
 
 extern int crashing_cpu;
diff --git a/arch/powerpc/include/asm/kexec_ranges.h 
b/arch/powerpc/include/asm/kexec_ranges.h
index 8489e844b447..14055896cbcb 100644
--- a/arch/powerpc/include/asm/kexec_ranges.h
+++ b/arch/powerpc/include/asm/kexec_ranges.h
@@ -7,6 +7,7 @@
 void sort_memory_ranges(struct crash_mem *mrngs, bool merge);
 struct crash_mem *realloc_mem_ranges(struct crash_mem **mem_ranges);
 int add_mem_range(struct crash_mem **mem_ranges, u64 base, u64 size);
+int remove_mem_range(struct crash_mem **mem_ranges, u64 base, u64 size);
 int get_exclude_memory_ranges(struct crash_mem **mem_ranges);
 int get_reserved_memory_ranges(struct crash_mem **mem_ranges);
 int get_crash_memory_ranges(struct crash_mem **mem_ranges);
diff --git a/arch/powerpc/kexec/crash.c b/arch/powerpc/kexec/crash.c
index 8938a19af12f..21b193e938a3 100644
--- a/arch/powerpc/kexec/crash.c
+++ b/arch/powerpc/kexec/crash.c
@@ -17,6 +17,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -25,6 +26,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /*
  * The primary CPU waits a while for all secondary CPUs to enter. This is to
@@ -398,6 +400,94 @@ void default_machine_crash_shutdown(struct pt_regs *regs)
 #undef pr_fmt
 #define pr_fmt(fmt) "crash hp: " fmt
 
+/*
+ * Advertise preferred elfcorehdr size to userspace via
+ * /sys/kernel/crash_elfcorehdr_size sysfs interface.
+ */
+unsigned int arch_crash_get_elfcorehdr_size(void)
+{
+   unsigned long phdr_cnt;
+
+   /* A progra

[PATCH v17 5/6] powerpc/crash: add crash CPU hotplug support

2024-02-26 Thread Sourabh Jain
Due to CPU/Memory hotplug or online/offline events, the elfcorehdr
(which describes the CPUs and memory of the crashed kernel) and FDT
(Flattened Device Tree) of kdump image becomes outdated. Consequently,
attempting dump collection with an outdated elfcorehdr or FDT can lead
to failed or inaccurate dump collection.

Going forward, CPU hotplug or online/offline events are referred as
CPU/Memory add/remove events.

The current solution to address the above issue involves monitoring the
CPU/Memory add/remove events in userspace using udev rules and whenever
there are changes in CPU and memory resources, the entire kdump image
is loaded again. The kdump image includes kernel, initrd, elfcorehdr,
FDT, purgatory. Given that only elfcorehdr and FDT get outdated due to
CPU/Memory add/remove events, reloading the entire kdump image is
inefficient. More importantly, kdump remains inactive for a substantial
amount of time until the kdump reload completes.

To address the aforementioned issue, commit 247262756121 ("crash: add
generic infrastructure for crash hotplug support") added a generic
infrastructure that allows architectures to selectively update the kdump
image component during CPU or memory add/remove events within the kernel
itself.

In the event of a CPU or memory add/remove events, the generic crash
hotplug event handler, `crash_handle_hotplug_event()`, is triggered. It
then acquires the necessary locks to update the kdump image and invokes
the architecture-specific crash hotplug handler,
`arch_crash_handle_hotplug_event()`, to update the required kdump image
components.

This patch adds crash hotplug handler for PowerPC and enable support to
update the kdump image on CPU add/remove events. Support for memory
add/remove events is added in a subsequent patch with the title
"powerpc: add crash memory hotplug support"

As mentioned earlier, only the elfcorehdr and FDT kdump image components
need to be updated in the event of CPU or memory add/remove events.
However, on PowerPC architecture crash hotplug handler only updates the
FDT to enable crash hotplug support for CPU add/remove events. Here's
why.

The elfcorehdr on PowerPC is built with possible CPUs, and thus, it does
not need an update on CPU add/remove events. On the other hand, the FDT
needs to be updated on CPU add events to include the newly added CPU. If
the FDT is not updated and the kernel crashes on a newly added CPU, the
kdump kernel will fail to boot due to the unavailability of the crashing
CPU in the FDT. During the early boot, it is expected that the boot CPU
must be a part of the FDT; otherwise, the kernel will raise a BUG and
fail to boot. For more information, refer to commit 36ae37e3436b0
("powerpc: Make boot_cpuid common between 32 and 64-bit"). Since it is
okay to have an offline CPU in the kdump FDT, no action is taken in case
of CPU removal.

There are two system calls, `kexec_file_load` and `kexec_load`, used to
load the kdump image. Few changes have been made to ensure kernel can
safely update the FDT of kdump image loaded using both system calls.

For kexec_file_load syscall the kdump image is prepared in kernel. So to
support an increasing number of CPUs, the FDT is constructed with extra
buffer space to ensure it can accommodate a possible number of CPU
nodes. Additionally, a call to fdt_pack (which trims the unused space
once the FDT is prepared) is avoided if this feature is enabled.

For the kexec_load syscall, the FDT is updated only if the
KEXEC_CRASH_HOTPLUG_SUPPORT kexec flag is passed to the kernel by
userspace (kexec tools). When userspace passes this flag to the kernel,
it indicates that the FDT is built to accommodate possible CPUs, and the
FDT segment is excluded from SHA calculation, making it safe to update.

The changes related to this feature are kept under the CRASH_HOTPLUG
config, and it is enabled by default.

Signed-off-by: Sourabh Jain 
Cc: Akhil Raj 
Cc: Andrew Morton 
Cc: Aneesh Kumar K.V 
Cc: Baoquan He 
Cc: Borislav Petkov (AMD) 
Cc: Boris Ostrovsky 
Cc: Christophe Leroy 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Hildenbrand 
Cc: Greg Kroah-Hartman 
Cc: Hari Bathini 
Cc: Laurent Dufour 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Mimi Zohar 
Cc: Naveen N Rao 
Cc: Oscar Salvador 
Cc: Thomas Gleixner 
Cc: Valentin Schneider 
Cc: Vivek Goyal 
Cc: ke...@lists.infradead.org
Cc: x...@kernel.org
---
 arch/powerpc/Kconfig  |   4 ++
 arch/powerpc/include/asm/kexec.h  |   8 +++
 arch/powerpc/kexec/crash.c| 103 ++
 arch/powerpc/kexec/elf_64.c   |   3 +-
 arch/powerpc/kexec/file_load_64.c |  17 +
 5 files changed, 134 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index e377deefa2dc..16d2b20574c4 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -686,6 +686,10 @@ config ARCH_SELECTS_CRASH_DUMP
depends on CRASH_DUMP
select RELOCATABLE if PPC6

[PATCH v17 4/6] PowerPC/kexec: make the update_cpus_node() function public

2024-02-26 Thread Sourabh Jain
Move the update_cpus_node() from kexec/{file_load_64.c => core_64.c}
to allow other kexec components to use it.

Later in the series, this function is used for in-kernel updates
to the kdump image during CPU/memory hotplug or online/offline events for
both kexec_load and kexec_file_load syscalls.

No functional changes are intended.

Signed-off-by: Sourabh Jain 
Cc: Akhil Raj 
Cc: Andrew Morton 
Cc: Aneesh Kumar K.V 
Cc: Baoquan He 
Cc: Borislav Petkov (AMD) 
Cc: Boris Ostrovsky 
Cc: Christophe Leroy 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Hildenbrand 
Cc: Greg Kroah-Hartman 
Cc: Hari Bathini 
Cc: Laurent Dufour 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Mimi Zohar 
Cc: Naveen N Rao 
Cc: Oscar Salvador 
Cc: Thomas Gleixner 
Cc: Valentin Schneider 
Cc: Vivek Goyal 
Cc: ke...@lists.infradead.org
Cc: x...@kernel.org
---
 arch/powerpc/include/asm/kexec.h  |  4 ++
 arch/powerpc/kexec/core_64.c  | 91 +++
 arch/powerpc/kexec/file_load_64.c | 87 -
 3 files changed, 95 insertions(+), 87 deletions(-)

diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
index fdb90e24dc74..d9ff4d0e392d 100644
--- a/arch/powerpc/include/asm/kexec.h
+++ b/arch/powerpc/include/asm/kexec.h
@@ -185,6 +185,10 @@ static inline void crash_send_ipi(void 
(*crash_ipi_callback)(struct pt_regs *))
 
 #endif /* CONFIG_CRASH_DUMP */
 
+#if defined(CONFIG_KEXEC_FILE) || defined(CONFIG_CRASH_DUMP)
+int update_cpus_node(void *fdt);
+#endif
+
 #ifdef CONFIG_PPC_BOOK3S_64
 #include 
 #endif
diff --git a/arch/powerpc/kexec/core_64.c b/arch/powerpc/kexec/core_64.c
index 762e4d09aacf..85050be08a23 100644
--- a/arch/powerpc/kexec/core_64.c
+++ b/arch/powerpc/kexec/core_64.c
@@ -17,6 +17,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -30,6 +31,7 @@
 #include 
 #include 
 #include 
+#include 
 
 int machine_kexec_prepare(struct kimage *image)
 {
@@ -419,3 +421,92 @@ static int __init export_htab_values(void)
 }
 late_initcall(export_htab_values);
 #endif /* CONFIG_PPC_64S_HASH_MMU */
+
+#if defined(CONFIG_KEXEC_FILE) || defined(CONFIG_CRASH_DUMP)
+/**
+ * add_node_props - Reads node properties from device node structure and add
+ *  them to fdt.
+ * @fdt:Flattened device tree of the kernel
+ * @node_offset:offset of the node to add a property at
+ * @dn: device node pointer
+ *
+ * Returns 0 on success, negative errno on error.
+ */
+static int add_node_props(void *fdt, int node_offset, const struct device_node 
*dn)
+{
+   int ret = 0;
+   struct property *pp;
+
+   if (!dn)
+   return -EINVAL;
+
+   for_each_property_of_node(dn, pp) {
+   ret = fdt_setprop(fdt, node_offset, pp->name, pp->value, 
pp->length);
+   if (ret < 0) {
+   pr_err("Unable to add %s property: %s\n", pp->name, 
fdt_strerror(ret));
+   return ret;
+   }
+   }
+   return ret;
+}
+
+/**
+ * update_cpus_node - Update cpus node of flattened device tree using of_root
+ *device node.
+ * @fdt:  Flattened device tree of the kernel.
+ *
+ * Returns 0 on success, negative errno on error.
+ */
+int update_cpus_node(void *fdt)
+{
+   struct device_node *cpus_node, *dn;
+   int cpus_offset, cpus_subnode_offset, ret = 0;
+
+   cpus_offset = fdt_path_offset(fdt, "/cpus");
+   if (cpus_offset < 0 && cpus_offset != -FDT_ERR_NOTFOUND) {
+   pr_err("Malformed device tree: error reading /cpus node: %s\n",
+  fdt_strerror(cpus_offset));
+   return cpus_offset;
+   }
+
+   if (cpus_offset > 0) {
+   ret = fdt_del_node(fdt, cpus_offset);
+   if (ret < 0) {
+   pr_err("Error deleting /cpus node: %s\n", 
fdt_strerror(ret));
+   return -EINVAL;
+   }
+   }
+
+   /* Add cpus node to fdt */
+   cpus_offset = fdt_add_subnode(fdt, fdt_path_offset(fdt, "/"), "cpus");
+   if (cpus_offset < 0) {
+   pr_err("Error creating /cpus node: %s\n", 
fdt_strerror(cpus_offset));
+   return -EINVAL;
+   }
+
+   /* Add cpus node properties */
+   cpus_node = of_find_node_by_path("/cpus");
+   ret = add_node_props(fdt, cpus_offset, cpus_node);
+   of_node_put(cpus_node);
+   if (ret < 0)
+   return ret;
+
+   /* Loop through all subnodes of cpus and add them to fdt */
+   for_each_node_by_type(dn, "cpu") {
+   cpus_subnode_offset = fdt_add_subnode(fdt, cpus_offset, 
dn->full_name);
+   if (cpus_subnode_offset < 0) {
+   pr_err("Unable to add %s subnode: %s\n", dn->full_name,

[PATCH v17 3/6] powerpc/kexec: move *_memory_ranges functions to ranges.c

2024-02-26 Thread Sourabh Jain
Move the following functions form kexec/{file_load_64.c => ranges.c} and
make them public so that components other KEXEC_FILE can also use these
functions.
1. get_exclude_memory_ranges
2. get_reserved_memory_ranges
3. get_crash_memory_ranges
4. get_usable_memory_ranges

Later in the series get_crash_memory_ranges function is utilized for
in-kernel updates to kdump image during CPU/Memory hotplug or
online/offline events for both kexec_load and kexec_file_load syscalls.

Since the above functions are moved to ranges.c, some of the helper
functions in ranges.c are no longer required to be public. Mark them as
static and removed them from kexec_ranges.h header file.

Finally, remove the CONFIG_KEXEC_FILE build dependency for range.c
because it is required for other config, such as CONFIG_CRASH_DUMP.

No functional changes are intended.

Signed-off-by: Sourabh Jain 
Cc: Akhil Raj 
Cc: Andrew Morton 
Cc: Aneesh Kumar K.V 
Cc: Baoquan He 
Cc: Borislav Petkov (AMD) 
Cc: Boris Ostrovsky 
Cc: Christophe Leroy 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Hildenbrand 
Cc: Greg Kroah-Hartman 
Cc: Hari Bathini 
Cc: Laurent Dufour 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Mimi Zohar 
Cc: Naveen N Rao 
Cc: Oscar Salvador 
Cc: Thomas Gleixner 
Cc: Valentin Schneider 
Cc: Vivek Goyal 
Cc: ke...@lists.infradead.org
Cc: x...@kernel.org
---
 arch/powerpc/include/asm/kexec_ranges.h |  19 +-
 arch/powerpc/kexec/Makefile |   4 +-
 arch/powerpc/kexec/file_load_64.c   | 190 
 arch/powerpc/kexec/ranges.c | 227 +++-
 4 files changed, 224 insertions(+), 216 deletions(-)

diff --git a/arch/powerpc/include/asm/kexec_ranges.h 
b/arch/powerpc/include/asm/kexec_ranges.h
index f83866a19e87..8489e844b447 100644
--- a/arch/powerpc/include/asm/kexec_ranges.h
+++ b/arch/powerpc/include/asm/kexec_ranges.h
@@ -7,19 +7,8 @@
 void sort_memory_ranges(struct crash_mem *mrngs, bool merge);
 struct crash_mem *realloc_mem_ranges(struct crash_mem **mem_ranges);
 int add_mem_range(struct crash_mem **mem_ranges, u64 base, u64 size);
-int add_tce_mem_ranges(struct crash_mem **mem_ranges);
-int add_initrd_mem_range(struct crash_mem **mem_ranges);
-#ifdef CONFIG_PPC_64S_HASH_MMU
-int add_htab_mem_range(struct crash_mem **mem_ranges);
-#else
-static inline int add_htab_mem_range(struct crash_mem **mem_ranges)
-{
-   return 0;
-}
-#endif
-int add_kernel_mem_range(struct crash_mem **mem_ranges);
-int add_rtas_mem_range(struct crash_mem **mem_ranges);
-int add_opal_mem_range(struct crash_mem **mem_ranges);
-int add_reserved_mem_ranges(struct crash_mem **mem_ranges);
-
+int get_exclude_memory_ranges(struct crash_mem **mem_ranges);
+int get_reserved_memory_ranges(struct crash_mem **mem_ranges);
+int get_crash_memory_ranges(struct crash_mem **mem_ranges);
+int get_usable_memory_ranges(struct crash_mem **mem_ranges);
 #endif /* _ASM_POWERPC_KEXEC_RANGES_H */
diff --git a/arch/powerpc/kexec/Makefile b/arch/powerpc/kexec/Makefile
index 8e469c4da3f8..470eb0453e17 100644
--- a/arch/powerpc/kexec/Makefile
+++ b/arch/powerpc/kexec/Makefile
@@ -3,11 +3,11 @@
 # Makefile for the linux kernel.
 #
 
-obj-y  += core.o core_$(BITS).o
+obj-y  += core.o core_$(BITS).o ranges.o
 
 obj-$(CONFIG_PPC32)+= relocate_32.o
 
-obj-$(CONFIG_KEXEC_FILE)   += file_load.o ranges.o file_load_$(BITS).o 
elf_$(BITS).o
+obj-$(CONFIG_KEXEC_FILE)   += file_load.o file_load_$(BITS).o elf_$(BITS).o
 obj-$(CONFIG_VMCORE_INFO)  += vmcore_info.o
 obj-$(CONFIG_CRASH_DUMP)   += crash.o
 
diff --git a/arch/powerpc/kexec/file_load_64.c 
b/arch/powerpc/kexec/file_load_64.c
index 1bc65de6174f..6a01f62b8fcf 100644
--- a/arch/powerpc/kexec/file_load_64.c
+++ b/arch/powerpc/kexec/file_load_64.c
@@ -47,83 +47,6 @@ const struct kexec_file_ops * const kexec_file_loaders[] = {
NULL
 };
 
-/**
- * get_exclude_memory_ranges - Get exclude memory ranges. This list includes
- * regions like opal/rtas, tce-table, initrd,
- * kernel, htab which should be avoided while
- * setting up kexec load segments.
- * @mem_ranges:Range list to add the memory ranges to.
- *
- * Returns 0 on success, negative errno on error.
- */
-static int get_exclude_memory_ranges(struct crash_mem **mem_ranges)
-{
-   int ret;
-
-   ret = add_tce_mem_ranges(mem_ranges);
-   if (ret)
-   goto out;
-
-   ret = add_initrd_mem_range(mem_ranges);
-   if (ret)
-   goto out;
-
-   ret = add_htab_mem_range(mem_ranges);
-   if (ret)
-   goto out;
-
-   ret = add_kernel_mem_range(mem_ranges);
-   if (ret)
-   goto out;
-
-   ret = add_rtas_mem_range(mem_ranges);
-   if (ret)
-   goto out;
-
-   ret = add_opal_mem_range(mem_ranges);
-   if (ret)
-   goto out;
-
-   

[PATCH v17 2/6] crash: add a new kexec flag for hotplug support

2024-02-26 Thread Sourabh Jain
Commit a72bbec70da2 ("crash: hotplug support for kexec_load()")
introduced a new kexec flag, `KEXEC_UPDATE_ELFCOREHDR`. Kexec tool uses
this flag to indicate to the kernel that it is safe to modify the
elfcorehdr of the kdump image loaded using the kexec_load system call.

However, it is possible that architectures may need to update kexec
segments other then elfcorehdr. For example, FDT (Flatten Device Tree)
on PowerPC. Introducing a new kexec flag for every new kexec segment
may not be a good solution. Hence, a generic kexec flag bit,
`KEXEC_CRASH_HOTPLUG_SUPPORT`, is introduced to share the CPU/Memory
hotplug support intent between the kexec tool and the kernel for the
kexec_load system call.

Now, if the kexec tool sends KEXEC_CRASH_HOTPLUG_SUPPORT kexec flag to
the kernel, it indicates to the kernel that all the required kexec
segment is skipped from SHA calculation and it is safe to update kdump
image loaded using the kexec_load syscall.

While loading the kdump image using the kexec_load syscall, the
@update_elfcorehdr member of struct kimage is set if the kexec tool
sends the KEXEC_UPDATE_ELFCOREHDR kexec flag. This member is later used
to determine whether it is safe to update elfcorehdr on hotplug events.
However, with the introduction of the KEXEC_CRASH_HOTPLUG_SUPPORT kexec
flag, the kexec tool could mark all the required kexec segments on an
architecture as safe to update. So rename the @update_elfcorehdr to
@hotplug_support. If @hotplug_support is set, the kernel can safely
update all the required kexec segments of the kdump image during
CPU/Memory hotplug events.

Introduce an architecture-specific function to process kexec flags for
determining hotplug support. Set the @hotplug_support member of struct
kimage for both kexec_load and kexec_file_load system calls. This
simplifies kernel checks to identify hotplug support for the currently
loaded kdump image by just examining the value of @hotplug_support.

Signed-off-by: Sourabh Jain 
Cc: Akhil Raj 
Cc: Andrew Morton 
Cc: Aneesh Kumar K.V 
Cc: Baoquan He 
Cc: Borislav Petkov (AMD) 
Cc: Boris Ostrovsky 
Cc: Christophe Leroy 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Hildenbrand 
Cc: Eric DeVolder 
Cc: Greg Kroah-Hartman 
Cc: Hari Bathini 
Cc: Laurent Dufour 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Mimi Zohar 
Cc: Naveen N Rao 
Cc: Oscar Salvador 
Cc: Thomas Gleixner 
Cc: Valentin Schneider 
Cc: Vivek Goyal 
Cc: ke...@lists.infradead.org
Cc: x...@kernel.org
---
 arch/x86/include/asm/kexec.h | 11 ++-
 arch/x86/kernel/crash.c  | 28 +---
 drivers/base/cpu.c   |  2 +-
 drivers/base/memory.c|  2 +-
 include/linux/crash_core.h   | 13 ++---
 include/linux/kexec.h| 11 +++
 include/uapi/linux/kexec.h   |  1 +
 kernel/crash_core.c  | 11 ---
 kernel/kexec.c   |  4 ++--
 kernel/kexec_file.c  |  5 +
 10 files changed, 46 insertions(+), 42 deletions(-)

diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
index cb1320ebbc23..ae5482a2f0ca 100644
--- a/arch/x86/include/asm/kexec.h
+++ b/arch/x86/include/asm/kexec.h
@@ -210,15 +210,8 @@ extern void kdump_nmi_shootdown_cpus(void);
 void arch_crash_handle_hotplug_event(struct kimage *image, void *arg);
 #define arch_crash_handle_hotplug_event arch_crash_handle_hotplug_event
 
-#ifdef CONFIG_HOTPLUG_CPU
-int arch_crash_hotplug_cpu_support(void);
-#define crash_hotplug_cpu_support arch_crash_hotplug_cpu_support
-#endif
-
-#ifdef CONFIG_MEMORY_HOTPLUG
-int arch_crash_hotplug_memory_support(void);
-#define crash_hotplug_memory_support arch_crash_hotplug_memory_support
-#endif
+int arch_crash_hotplug_support(struct kimage *image, unsigned long 
kexec_flags);
+#define arch_crash_hotplug_support arch_crash_hotplug_support
 
 unsigned int arch_crash_get_elfcorehdr_size(void);
 #define crash_get_elfcorehdr_size arch_crash_get_elfcorehdr_size
diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index 2a682fe86352..f06501445cd9 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -402,20 +402,26 @@ int crash_load_segments(struct kimage *image)
 #undef pr_fmt
 #define pr_fmt(fmt) "crash hp: " fmt
 
-/* These functions provide the value for the sysfs crash_hotplug nodes */
-#ifdef CONFIG_HOTPLUG_CPU
-int arch_crash_hotplug_cpu_support(void)
+int arch_crash_hotplug_support(struct kimage *image, unsigned long kexec_flags)
 {
-   return crash_check_update_elfcorehdr();
-}
-#endif
 
-#ifdef CONFIG_MEMORY_HOTPLUG
-int arch_crash_hotplug_memory_support(void)
-{
-   return crash_check_update_elfcorehdr();
-}
+#ifdef CONFIG_KEXEC_FILE
+   if (image->file_mode)
+   return 1;
 #endif
+   /*
+* Initially, crash hotplug support for kexec_load was added
+* with the KEXEC_UPDATE_ELFCOREHDR flag. Later, this
+* functionality was expanded to accommodate multiple kexec
+* s

[PATCH v17 0/6] powerpc/crash: Kernel handling of CPU and memory hotplug

2024-02-26 Thread Sourabh Jain
 The patch that introduced CONFIG_CRASH_HOTPLUG for PowerPC has been
removed. The config is now part of common configuration:
https://lore.kernel.org/all/87ilbpflsk.fsf@mail.lhotse/

v10:
  - Drop the patch that adds fdt_index attribute to struct kimage_arch
Find the fdt segment index when needed.
  - Added more details into commits messages.
  - Rebased onto 6.3.0-rc5

v9:
  - Removed patch to prepare elfcorehdr crash notes for possible CPUs.
The patch is moved to generic patch series that introduces generic
infrastructure for in kernel crash update.
  - Removed patch to pass the hotplug action type to the arch crash
hotplug handler function. The generic patch series has introduced
the hotplug action type in kimage struct.
  - Add detail commit message for better understanding.

v8:
  - Restrict fdt_index initialization to machine_kexec_post_load
it work for both kexec_load and kexec_file_load.[3/8] Laurent Dufour

  - Updated the logic to find the number of offline core. [6/8]

  - Changed the logic to find the elfcore program header to accommodate
future memory ranges due memory hotplug events. [8/8]

v7
  - added a new config to configure this feature
  - pass hotplug action type to arch specific handler

v6
  - Added crash memory hotplug support

v5:
  - Replace COFNIG_CRASH_HOTPLUG with CONFIG_HOTPLUG_CPU.
  - Move fdt segment identification for kexec_load case to load path
instead of crash hotplug handler
  - Keep new attribute defined under kimage_arch to track FDT segment
under CONFIG_HOTPLUG_CPU config.

v4:
  - Update the logic to find the additional space needed for hotadd CPUs
post kexec load. Refer "[RFC v4 PATCH 4/5] powerpc/crash hp: add crash
hotplug support for kexec_file_load" patch to know more about the
change.
  - Fix a couple of typo.
  - Replace pr_err to pr_info_once to warn user about memory hotplug
support.
  - In crash hotplug handle exit the for loop if FDT segment is found.

v3
  - Move fdt_index and fdt_index_vaild variables to kimage_arch struct.
  - Rebase patche on top of

https://lore.kernel.org/lkml/20220303162725.49640-1-eric.devol...@oracle.com/
  - Fixed warning reported by checpatch script

v2:
  - Use generic hotplug handler introduced by

https://lore.kernel.org/lkml/20220209195706.51522-1-eric.devol...@oracle.com/
a significant change from v1.

Cc: Akhil Raj 
Cc: Andrew Morton 
Cc: Aneesh Kumar K.V 
Cc: Baoquan He 
Cc: Borislav Petkov (AMD) 
Cc: Boris Ostrovsky 
Cc: Christophe Leroy 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Hildenbrand 
Cc: Greg Kroah-Hartman 
Cc: Hari Bathini 
Cc: Laurent Dufour 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Mimi Zohar 
Cc: Naveen N Rao 
Cc: Oscar Salvador 
Cc: Thomas Gleixner 
Cc: Valentin Schneider 
Cc: Vivek Goyal 
Cc: ke...@lists.infradead.org
Cc: x...@kernel.org

Sourabh Jain (6):
  crash: forward memory_notify arg to arch crash hotplug handler
  crash: add a new kexec flag for hotplug support
  powerpc/kexec: move *_memory_ranges functions to ranges.c
  PowerPC/kexec: make the update_cpus_node() function public
  powerpc/crash: add crash CPU hotplug support
  powerpc/crash: add crash memory hotplug support

 arch/powerpc/Kconfig|   4 +
 arch/powerpc/include/asm/kexec.h|  15 ++
 arch/powerpc/include/asm/kexec_ranges.h |  20 +-
 arch/powerpc/kexec/Makefile |   4 +-
 arch/powerpc/kexec/core_64.c|  91 +++
 arch/powerpc/kexec/crash.c  | 196 +++
 arch/powerpc/kexec/elf_64.c |   3 +-
 arch/powerpc/kexec/file_load_64.c   | 314 +++-
 arch/powerpc/kexec/ranges.c | 312 ++-
 arch/x86/include/asm/kexec.h|  13 +-
 arch/x86/kernel/crash.c |  32 ++-
 drivers/base/cpu.c  |   2 +-
 drivers/base/memory.c   |   2 +-
 include/linux/crash_core.h  |  15 +-
 include/linux/kexec.h   |  11 +-
 include/uapi/linux/kexec.h  |   1 +
 kernel/crash_core.c |  25 +-
 kernel/kexec.c  |   4 +-
 kernel/kexec_file.c |   5 +
 19 files changed, 712 insertions(+), 357 deletions(-)

-- 
2.43.0



[PATCH v17 1/6] crash: forward memory_notify arg to arch crash hotplug handler

2024-02-26 Thread Sourabh Jain
In the event of memory hotplug or online/offline events, the crash
memory hotplug notifier `crash_memhp_notifier()` receives a
`memory_notify` object but doesn't forward that object to the
generic and architecture-specific crash hotplug handler.

The `memory_notify` object contains the starting PFN (Page Frame Number)
and the number of pages in the hot-removed memory. This information is
necessary for architectures like PowerPC to update/recreate the kdump
image, specifically `elfcorehdr`.

So update the function signature of `crash_handle_hotplug_event()` and
`arch_crash_handle_hotplug_event()` to accept the `memory_notify` object
as an argument from crash memory hotplug notifier.

Since no such object is available in the case of CPU hotplug event, the
crash CPU hotplug notifier `crash_cpuhp_online()` passes NULL to the
crash hotplug handler.

Signed-off-by: Sourabh Jain 
Acked-by: Baoquan He 
Cc: Akhil Raj 
Cc: Andrew Morton 
Cc: Aneesh Kumar K.V 
Cc: Borislav Petkov (AMD) 
Cc: Boris Ostrovsky 
Cc: Christophe Leroy 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Hildenbrand 
Cc: Greg Kroah-Hartman 
Cc: Hari Bathini 
Cc: Laurent Dufour 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Mimi Zohar 
Cc: Naveen N Rao 
Cc: Oscar Salvador 
Cc: Thomas Gleixner 
Cc: Valentin Schneider 
Cc: Vivek Goyal 
Cc: ke...@lists.infradead.org
Cc: x...@kernel.org
---
 arch/x86/include/asm/kexec.h |  2 +-
 arch/x86/kernel/crash.c  |  4 +++-
 include/linux/crash_core.h   |  2 +-
 kernel/crash_core.c  | 14 +++---
 4 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
index 91ca9a9ee3a2..cb1320ebbc23 100644
--- a/arch/x86/include/asm/kexec.h
+++ b/arch/x86/include/asm/kexec.h
@@ -207,7 +207,7 @@ int arch_kimage_file_post_load_cleanup(struct kimage 
*image);
 extern void kdump_nmi_shootdown_cpus(void);
 
 #ifdef CONFIG_CRASH_HOTPLUG
-void arch_crash_handle_hotplug_event(struct kimage *image);
+void arch_crash_handle_hotplug_event(struct kimage *image, void *arg);
 #define arch_crash_handle_hotplug_event arch_crash_handle_hotplug_event
 
 #ifdef CONFIG_HOTPLUG_CPU
diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index e74d0c4286c1..2a682fe86352 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -432,10 +432,12 @@ unsigned int arch_crash_get_elfcorehdr_size(void)
 /**
  * arch_crash_handle_hotplug_event() - Handle hotplug elfcorehdr changes
  * @image: a pointer to kexec_crash_image
+ * @arg: struct memory_notify handler for memory hotplug case and
+ *   NULL for CPU hotplug case.
  *
  * Prepare the new elfcorehdr and replace the existing elfcorehdr.
  */
-void arch_crash_handle_hotplug_event(struct kimage *image)
+void arch_crash_handle_hotplug_event(struct kimage *image, void *arg)
 {
void *elfbuf = NULL, *old_elfcorehdr;
unsigned long nr_mem_ranges;
diff --git a/include/linux/crash_core.h b/include/linux/crash_core.h
index d33352c2e386..647e928efee8 100644
--- a/include/linux/crash_core.h
+++ b/include/linux/crash_core.h
@@ -37,7 +37,7 @@ static inline void arch_kexec_unprotect_crashkres(void) { }
 
 
 #ifndef arch_crash_handle_hotplug_event
-static inline void arch_crash_handle_hotplug_event(struct kimage *image) { }
+static inline void arch_crash_handle_hotplug_event(struct kimage *image, void 
*arg) { }
 #endif
 
 int crash_check_update_elfcorehdr(void);
diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index 78b5dc7cee3a..70fa8111a9d6 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -534,7 +534,7 @@ int crash_check_update_elfcorehdr(void)
  * list of segments it checks (since the elfcorehdr changes and thus
  * would require an update to purgatory itself to update the digest).
  */
-static void crash_handle_hotplug_event(unsigned int hp_action, unsigned int 
cpu)
+static void crash_handle_hotplug_event(unsigned int hp_action, unsigned int 
cpu, void *arg)
 {
struct kimage *image;
 
@@ -596,7 +596,7 @@ static void crash_handle_hotplug_event(unsigned int 
hp_action, unsigned int cpu)
image->hp_action = hp_action;
 
/* Now invoke arch-specific update handler */
-   arch_crash_handle_hotplug_event(image);
+   arch_crash_handle_hotplug_event(image, arg);
 
/* No longer handling a hotplug event */
image->hp_action = KEXEC_CRASH_HP_NONE;
@@ -612,17 +612,17 @@ static void crash_handle_hotplug_event(unsigned int 
hp_action, unsigned int cpu)
crash_hotplug_unlock();
 }
 
-static int crash_memhp_notifier(struct notifier_block *nb, unsigned long val, 
void *v)
+static int crash_memhp_notifier(struct notifier_block *nb, unsigned long val, 
void *arg)
 {
switch (val) {
case MEM_ONLINE:
crash_handle_hotplug_event(KEXEC_CRASH_HP_ADD_MEMORY,
-   KEXEC_CRASH_HP_INVALID_CPU);
+   KEXEC_CRASH_HP_INVALID_CPU, arg);
break;
 
  

Re: [PATCH linux-next 3/3] powerpc/kdump: Split KEXEC_CORE and CRASH_DUMP dependency

2024-02-22 Thread Sourabh Jain

Hello Hari,

Build failure detected.

On 13/02/24 17:01, Hari Bathini wrote:

Remove CONFIG_CRASH_DUMP dependency on CONFIG_KEXEC. CONFIG_KEXEC_CORE
was used at places where CONFIG_CRASH_DUMP or CONFIG_CRASH_RESERVE was
appropriate. Replace with appropriate #ifdefs to support CONFIG_KEXEC
and !CONFIG_CRASH_DUMP configuration option. Also, make CONFIG_FA_DUMP
dependent on CONFIG_CRASH_DUMP to avoid unmet dependencies for FA_DUMP
with !CONFIG_KEXEC_CORE configuration option.

Signed-off-by: Hari Bathini 
---
  arch/powerpc/Kconfig   |  9 +--
  arch/powerpc/include/asm/kexec.h   | 98 +++---
  arch/powerpc/kernel/prom.c |  2 +-
  arch/powerpc/kernel/setup-common.c |  2 +-
  arch/powerpc/kernel/smp.c  |  4 +-
  arch/powerpc/kexec/Makefile|  3 +-
  arch/powerpc/kexec/core.c  |  4 ++
  7 files changed, 60 insertions(+), 62 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 5cf8ad8d7e8e..e377deefa2dc 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -607,11 +607,6 @@ config PPC64_SUPPORTS_MEMORY_FAILURE
  config ARCH_SUPPORTS_KEXEC
def_bool PPC_BOOK3S || PPC_E500 || (44x && !SMP)
  
-config ARCH_SELECTS_KEXEC

-   def_bool y
-   depends on KEXEC
-   select CRASH_DUMP
-
  config ARCH_SUPPORTS_KEXEC_FILE
def_bool PPC64
  
@@ -622,7 +617,6 @@ config ARCH_SELECTS_KEXEC_FILE

def_bool y
depends on KEXEC_FILE
select KEXEC_ELF
-   select CRASH_DUMP
select HAVE_IMA_KEXEC if IMA
  
  config PPC64_BIG_ENDIAN_ELF_ABI_V2

@@ -694,8 +688,7 @@ config ARCH_SELECTS_CRASH_DUMP
  
  config FA_DUMP

bool "Firmware-assisted dump"
-   depends on PPC64 && (PPC_RTAS || PPC_POWERNV)
-   select CRASH_DUMP
+   depends on CRASH_DUMP && PPC64 && (PPC_RTAS || PPC_POWERNV)
help
  A robust mechanism to get reliable kernel crash dump with
  assistance from firmware. This approach does not use kexec,
diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
index e1b43aa12175..fdb90e24dc74 100644
--- a/arch/powerpc/include/asm/kexec.h
+++ b/arch/powerpc/include/asm/kexec.h
@@ -55,59 +55,18 @@
  typedef void (*crash_shutdown_t)(void);
  
  #ifdef CONFIG_KEXEC_CORE

-
-/*
- * This function is responsible for capturing register states if coming
- * via panic or invoking dump using sysrq-trigger.
- */
-static inline void crash_setup_regs(struct pt_regs *newregs,
-   struct pt_regs *oldregs)
-{
-   if (oldregs)
-   memcpy(newregs, oldregs, sizeof(*newregs));
-   else
-   ppc_save_regs(newregs);
-}
+struct kimage;
+struct pt_regs;
  
  extern void kexec_smp_wait(void);	/* get and clear naca physid, wait for

  master to copy new code to 0 */
-extern int crashing_cpu;
-extern void crash_send_ipi(void (*crash_ipi_callback)(struct pt_regs *));
-extern void crash_ipi_callback(struct pt_regs *);
-extern int crash_wake_offline;
-
-struct kimage;
-struct pt_regs;
  extern void default_machine_kexec(struct kimage *image);
-extern void default_machine_crash_shutdown(struct pt_regs *regs);
-extern int crash_shutdown_register(crash_shutdown_t handler);
-extern int crash_shutdown_unregister(crash_shutdown_t handler);
-
-extern void crash_kexec_prepare(void);
-extern void crash_kexec_secondary(struct pt_regs *regs);
-int __init overlaps_crashkernel(unsigned long start, unsigned long size);
-extern void reserve_crashkernel(void);
  extern void machine_kexec_mask_interrupts(void);
  
-static inline bool kdump_in_progress(void)

-{
-   return crashing_cpu >= 0;
-}
-
  void relocate_new_kernel(unsigned long indirection_page, unsigned long 
reboot_code_buffer,
 unsigned long start_address) __noreturn;
-
  void kexec_copy_flush(struct kimage *image);
  
-#if defined(CONFIG_CRASH_DUMP)

-bool is_kdump_kernel(void);
-#define is_kdump_kernelis_kdump_kernel
-#if defined(CONFIG_PPC_RTAS)
-void crash_free_reserved_phys_range(unsigned long begin, unsigned long end);
-#define crash_free_reserved_phys_range crash_free_reserved_phys_range
-#endif /* CONFIG_PPC_RTAS */
-#endif /* CONFIG_CRASH_DUMP */
-
  #ifdef CONFIG_KEXEC_FILE
  extern const struct kexec_file_ops kexec_elf64_ops;
  
@@ -152,15 +111,56 @@ int setup_new_fdt_ppc64(const struct kimage *image, void *fdt,
  
  #endif /* CONFIG_KEXEC_FILE */
  
-#else /* !CONFIG_KEXEC_CORE */

-static inline void crash_kexec_secondary(struct pt_regs *regs) { }
+#endif /* CONFIG_KEXEC_CORE */
+
+#ifdef CONFIG_CRASH_RESERVE
+int __init overlaps_crashkernel(unsigned long start, unsigned long size);
+extern void reserve_crashkernel(void);
+#else
+static inline void reserve_crashkernel(void) {}
+static inline int overlaps_crashkernel(unsigned long start, unsigned long 
size) { return 0; }
+#endif
  
-static inline int overlaps_crashkernel(unsigned 

Re: [PATCH v16 2/5] crash: add a new kexec flag for hotplug support

2024-02-21 Thread Sourabh Jain




On 22/02/24 09:28, Baoquan He wrote:

On 02/22/24 at 09:01am, Sourabh Jain wrote:

Hello Baoquan,

There are a lot of code movements introduced by your patch series, 'Split
crash out from kexec and clean up related config items.'

https://lore.kernel.org/all/20240221125752.36fbfe9c307496313198b...@linux-foundation.org/

Do you want me to rebase this patch series on top of the above patch series?

Yes, appreciate that, that would be very helpful. Rebasing this to
latest next/master. And I saw Hari's patch sereis, basically it's fine
to me. It will be great if this patch can sit on that patchset.


Sure, let me rebase it and send v17.

Thanks,
Sourabh


On 17/02/24 13:44, Sourabh Jain wrote:

Commit a72bbec70da2 ("crash: hotplug support for kexec_load()")
introduced a new kexec flag, `KEXEC_UPDATE_ELFCOREHDR`. Kexec tool uses
this flag to indicate to the kernel that it is safe to modify the
elfcorehdr of the kdump image loaded using the kexec_load system call.

However, it is possible that architectures may need to update kexec
segments other then elfcorehdr. For example, FDT (Flatten Device Tree)
on PowerPC. Introducing a new kexec flag for every new kexec segment
may not be a good solution. Hence, a generic kexec flag bit,
`KEXEC_CRASH_HOTPLUG_SUPPORT`, is introduced to share the CPU/Memory
hotplug support intent between the kexec tool and the kernel for the
kexec_load system call.

Now, if the kexec tool sends KEXEC_CRASH_HOTPLUG_SUPPORT kexec flag to
the kernel, it indicates to the kernel that all the required kexec
segment is skipped from SHA calculation and it is safe to update kdump
image loaded using the kexec_load syscall.

While loading the kdump image using the kexec_load syscall, the
@update_elfcorehdr member of struct kimage is set if the kexec tool
sends the KEXEC_UPDATE_ELFCOREHDR kexec flag. This member is later used
to determine whether it is safe to update elfcorehdr on hotplug events.
However, with the introduction of the KEXEC_CRASH_HOTPLUG_SUPPORT kexec
flag, the kexec tool could mark all the required kexec segments on an
architecture as safe to update. So rename the @update_elfcorehdr to
@hotplug_support. If @hotplug_support is set, the kernel can safely
update all the required kexec segments of the kdump image during
CPU/Memory hotplug events.

Introduce an architecture-specific function to process kexec flags for
determining hotplug support. Set the @hotplug_support member of struct
kimage for both kexec_load and kexec_file_load system calls. This
simplifies kernel checks to identify hotplug support for the currently
loaded kdump image by just examining the value of @hotplug_support.

Signed-off-by: Sourabh Jain 
Cc: Akhil Raj 
Cc: Andrew Morton 
Cc: Aneesh Kumar K.V 
Cc: Baoquan He 
Cc: Borislav Petkov (AMD) 
Cc: Boris Ostrovsky 
Cc: Christophe Leroy 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Hildenbrand 
Cc: Eric DeVolder 
Cc: Greg Kroah-Hartman 
Cc: Hari Bathini 
Cc: Laurent Dufour 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Mimi Zohar 
Cc: Naveen N Rao 
Cc: Oscar Salvador 
Cc: Thomas Gleixner 
Cc: Valentin Schneider 
Cc: Vivek Goyal 
Cc: ke...@lists.infradead.org
Cc: x...@kernel.org
---
   arch/x86/include/asm/kexec.h | 11 ++-
   arch/x86/kernel/crash.c  | 28 +---
   drivers/base/cpu.c   |  2 +-
   drivers/base/memory.c|  2 +-
   include/linux/kexec.h| 27 +--
   include/uapi/linux/kexec.h   |  1 +
   kernel/crash_core.c  | 11 ---
   kernel/kexec.c   |  4 ++--
   kernel/kexec_file.c  |  5 +
   9 files changed, 50 insertions(+), 41 deletions(-)

diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
index 9bb6607e864e..8be622e82ba8 100644
--- a/arch/x86/include/asm/kexec.h
+++ b/arch/x86/include/asm/kexec.h
@@ -211,15 +211,8 @@ extern void kdump_nmi_shootdown_cpus(void);
   void arch_crash_handle_hotplug_event(struct kimage *image, void *arg);
   #define arch_crash_handle_hotplug_event arch_crash_handle_hotplug_event
-#ifdef CONFIG_HOTPLUG_CPU
-int arch_crash_hotplug_cpu_support(void);
-#define crash_hotplug_cpu_support arch_crash_hotplug_cpu_support
-#endif
-
-#ifdef CONFIG_MEMORY_HOTPLUG
-int arch_crash_hotplug_memory_support(void);
-#define crash_hotplug_memory_support arch_crash_hotplug_memory_support
-#endif
+int arch_crash_hotplug_support(struct kimage *image, unsigned long 
kexec_flags);
+#define arch_crash_hotplug_support arch_crash_hotplug_support
   unsigned int arch_crash_get_elfcorehdr_size(void);
   #define crash_get_elfcorehdr_size arch_crash_get_elfcorehdr_size
diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index 44744e9c68ec..7072aaee2ea0 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -398,20 +398,26 @@ int crash_load_segments(struct kimage *image)
   #undef pr_fmt
   #define pr_fmt(fmt) "crash hp: " fmt
-/* These functions provide the value for the sysfs

Re: [PATCH v16 2/5] crash: add a new kexec flag for hotplug support

2024-02-21 Thread Sourabh Jain

Hello Baoquan,

There are a lot of code movements introduced by your patch series, 
'Split crash out from kexec and clean up related config items.'


https://lore.kernel.org/all/20240221125752.36fbfe9c307496313198b...@linux-foundation.org/

Do you want me to rebase this patch series on top of the above patch series?


Thanks,
Sourabh Jain

On 17/02/24 13:44, Sourabh Jain wrote:

Commit a72bbec70da2 ("crash: hotplug support for kexec_load()")
introduced a new kexec flag, `KEXEC_UPDATE_ELFCOREHDR`. Kexec tool uses
this flag to indicate to the kernel that it is safe to modify the
elfcorehdr of the kdump image loaded using the kexec_load system call.

However, it is possible that architectures may need to update kexec
segments other then elfcorehdr. For example, FDT (Flatten Device Tree)
on PowerPC. Introducing a new kexec flag for every new kexec segment
may not be a good solution. Hence, a generic kexec flag bit,
`KEXEC_CRASH_HOTPLUG_SUPPORT`, is introduced to share the CPU/Memory
hotplug support intent between the kexec tool and the kernel for the
kexec_load system call.

Now, if the kexec tool sends KEXEC_CRASH_HOTPLUG_SUPPORT kexec flag to
the kernel, it indicates to the kernel that all the required kexec
segment is skipped from SHA calculation and it is safe to update kdump
image loaded using the kexec_load syscall.

While loading the kdump image using the kexec_load syscall, the
@update_elfcorehdr member of struct kimage is set if the kexec tool
sends the KEXEC_UPDATE_ELFCOREHDR kexec flag. This member is later used
to determine whether it is safe to update elfcorehdr on hotplug events.
However, with the introduction of the KEXEC_CRASH_HOTPLUG_SUPPORT kexec
flag, the kexec tool could mark all the required kexec segments on an
architecture as safe to update. So rename the @update_elfcorehdr to
@hotplug_support. If @hotplug_support is set, the kernel can safely
update all the required kexec segments of the kdump image during
CPU/Memory hotplug events.

Introduce an architecture-specific function to process kexec flags for
determining hotplug support. Set the @hotplug_support member of struct
kimage for both kexec_load and kexec_file_load system calls. This
simplifies kernel checks to identify hotplug support for the currently
loaded kdump image by just examining the value of @hotplug_support.

Signed-off-by: Sourabh Jain 
Cc: Akhil Raj 
Cc: Andrew Morton 
Cc: Aneesh Kumar K.V 
Cc: Baoquan He 
Cc: Borislav Petkov (AMD) 
Cc: Boris Ostrovsky 
Cc: Christophe Leroy 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Hildenbrand 
Cc: Eric DeVolder 
Cc: Greg Kroah-Hartman 
Cc: Hari Bathini 
Cc: Laurent Dufour 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Mimi Zohar 
Cc: Naveen N Rao 
Cc: Oscar Salvador 
Cc: Thomas Gleixner 
Cc: Valentin Schneider 
Cc: Vivek Goyal 
Cc: ke...@lists.infradead.org
Cc: x...@kernel.org
---
  arch/x86/include/asm/kexec.h | 11 ++-
  arch/x86/kernel/crash.c  | 28 +---
  drivers/base/cpu.c   |  2 +-
  drivers/base/memory.c|  2 +-
  include/linux/kexec.h| 27 +--
  include/uapi/linux/kexec.h   |  1 +
  kernel/crash_core.c  | 11 ---
  kernel/kexec.c   |  4 ++--
  kernel/kexec_file.c  |  5 +
  9 files changed, 50 insertions(+), 41 deletions(-)

diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
index 9bb6607e864e..8be622e82ba8 100644
--- a/arch/x86/include/asm/kexec.h
+++ b/arch/x86/include/asm/kexec.h
@@ -211,15 +211,8 @@ extern void kdump_nmi_shootdown_cpus(void);
  void arch_crash_handle_hotplug_event(struct kimage *image, void *arg);
  #define arch_crash_handle_hotplug_event arch_crash_handle_hotplug_event
  
-#ifdef CONFIG_HOTPLUG_CPU

-int arch_crash_hotplug_cpu_support(void);
-#define crash_hotplug_cpu_support arch_crash_hotplug_cpu_support
-#endif
-
-#ifdef CONFIG_MEMORY_HOTPLUG
-int arch_crash_hotplug_memory_support(void);
-#define crash_hotplug_memory_support arch_crash_hotplug_memory_support
-#endif
+int arch_crash_hotplug_support(struct kimage *image, unsigned long 
kexec_flags);
+#define arch_crash_hotplug_support arch_crash_hotplug_support
  
  unsigned int arch_crash_get_elfcorehdr_size(void);

  #define crash_get_elfcorehdr_size arch_crash_get_elfcorehdr_size
diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index 44744e9c68ec..7072aaee2ea0 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -398,20 +398,26 @@ int crash_load_segments(struct kimage *image)
  #undef pr_fmt
  #define pr_fmt(fmt) "crash hp: " fmt
  
-/* These functions provide the value for the sysfs crash_hotplug nodes */

-#ifdef CONFIG_HOTPLUG_CPU
-int arch_crash_hotplug_cpu_support(void)
+int arch_crash_hotplug_support(struct kimage *image, unsigned long kexec_flags)
  {
-   return crash_check_update_elfcorehdr();
-}
-#endif
  
-#ifdef CONFIG_MEMORY_HOTPLUG

-int arch_crash_hotplug_memory_support(voi

Re: [PATCH v2 02/14] crash: split vmcoreinfo exporting code out from crash_core.c

2024-02-21 Thread Sourabh Jain

Hello Baoquan,

On 19/01/24 20:22, Baoquan He wrote:

Now move the relevant codes into separate files:
kernel/crash_reserve.c, include/linux/crash_reserve.h.

And add config item CRASH_RESERVE to control its enabling.


Feels like this patch is more about vmcore_info.[c|h] and CONFIG_VMCORE_INFO
then the above mentioned files and config.

- Sourabh



And also update the old ifdeffery of CONFIG_CRASH_CORE, including of
 and config item dependency on CRASH_CORE
accordingly.

And also do renaming as follows:
  - arch/xxx/kernel/{crash_core.c => vmcore_info.c}
because they are only related to vmcoreinfo exporting on x86, arm64,
riscv.

And also Remove config item CRASH_CORE, and rely on CONFIG_KEXEC_CORE to
decide if build in crash_core.c.

Signed-off-by: Baoquan He 
---
  arch/arm64/kernel/Makefile|   2 +-
  .../kernel/{crash_core.c => vmcore_info.c}|   2 +-
  arch/powerpc/Kconfig  |   2 +-
  arch/powerpc/kernel/setup-common.c|   2 +-
  arch/powerpc/platforms/powernv/opal-core.c|   2 +-
  arch/riscv/kernel/Makefile|   2 +-
  .../kernel/{crash_core.c => vmcore_info.c}|   2 +-
  arch/x86/kernel/Makefile  |   2 +-
  .../{crash_core_32.c => vmcore_info_32.c} |   2 +-
  .../{crash_core_64.c => vmcore_info_64.c} |   2 +-
  drivers/firmware/qemu_fw_cfg.c|  14 +-
  fs/proc/Kconfig   |   2 +-
  fs/proc/kcore.c   |   2 +-
  include/linux/buildid.h   |   2 +-
  include/linux/crash_core.h|  73 --
  include/linux/kexec.h |   1 +
  include/linux/vmcore_info.h   |  81 ++
  kernel/Kconfig.kexec  |   4 +-
  kernel/Makefile   |   4 +-
  kernel/crash_core.c   | 208 
  kernel/ksysfs.c   |   6 +-
  kernel/printk/printk.c|   4 +-
  kernel/vmcore_info.c  | 233 ++
  lib/buildid.c |   2 +-
  24 files changed, 345 insertions(+), 311 deletions(-)
  rename arch/arm64/kernel/{crash_core.c => vmcore_info.c} (97%)
  rename arch/riscv/kernel/{crash_core.c => vmcore_info.c} (96%)
  rename arch/x86/kernel/{crash_core_32.c => vmcore_info_32.c} (90%)
  rename arch/x86/kernel/{crash_core_64.c => vmcore_info_64.c} (94%)
  create mode 100644 include/linux/vmcore_info.h
  create mode 100644 kernel/vmcore_info.c

diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index d95b3d6b471a..bcf89587a549 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -66,7 +66,7 @@ obj-$(CONFIG_KEXEC_FILE)  += machine_kexec_file.o 
kexec_image.o
  obj-$(CONFIG_ARM64_RELOC_TEST)+= arm64-reloc-test.o
  arm64-reloc-test-y := reloc_test_core.o reloc_test_syms.o
  obj-$(CONFIG_CRASH_DUMP)  += crash_dump.o
-obj-$(CONFIG_CRASH_CORE)   += crash_core.o
+obj-$(CONFIG_VMCORE_INFO)  += vmcore_info.o
  obj-$(CONFIG_ARM_SDE_INTERFACE)   += sdei.o
  obj-$(CONFIG_ARM64_PTR_AUTH)  += pointer_auth.o
  obj-$(CONFIG_ARM64_MTE)   += mte.o
diff --git a/arch/arm64/kernel/crash_core.c b/arch/arm64/kernel/vmcore_info.c
similarity index 97%
rename from arch/arm64/kernel/crash_core.c
rename to arch/arm64/kernel/vmcore_info.c
index 66cde752cd74..a5abf7186922 100644
--- a/arch/arm64/kernel/crash_core.c
+++ b/arch/arm64/kernel/vmcore_info.c
@@ -4,7 +4,7 @@
   * Copyright (C) Huawei Futurewei Technologies.
   */
  
-#include 

+#include 
  #include 
  #include 
  #include 
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 6aeab95f0edd..1520146d7c2c 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -690,7 +690,7 @@ config ARCH_SELECTS_CRASH_DUMP
  config FA_DUMP
bool "Firmware-assisted dump"
depends on PPC64 && (PPC_RTAS || PPC_POWERNV)
-   select CRASH_CORE
+   select VMCORE_INFO
select CRASH_RESERVE
select CRASH_DUMP
help
diff --git a/arch/powerpc/kernel/setup-common.c 
b/arch/powerpc/kernel/setup-common.c
index 9b142b9d5187..733f210ffda1 100644
--- a/arch/powerpc/kernel/setup-common.c
+++ b/arch/powerpc/kernel/setup-common.c
@@ -109,7 +109,7 @@ int ppc_do_canonicalize_irqs;
  EXPORT_SYMBOL(ppc_do_canonicalize_irqs);
  #endif
  
-#ifdef CONFIG_CRASH_CORE

+#ifdef CONFIG_VMCORE_INFO
  /* This keeps a track of which one is the crashing cpu. */
  int crashing_cpu = -1;
  #endif
diff --git a/arch/powerpc/platforms/powernv/opal-core.c 
b/arch/powerpc/platforms/powernv/opal-core.c
index bb7657115f1d..c9a9b759cc92 100644
--- a/arch/powerpc/platforms/powernv/opal-core.c
+++ b/arch/powerpc/platforms/powernv/opal-core.c
@@ -16,7 +16,7 @@
  #include 
  #include 
  #include 
-#include 
+#include 
  #include 
 

Re: [PATCH v2 01/14] kexec: split crashkernel reservation code out from crash_core.c

2024-02-21 Thread Sourabh Jain

Hello Baoquan,

Thank you for reorganizing the kexec and kdump code with a well-defined
configuration structure.

While reviewing the patch series, I noticed a few typos.

On 19/01/24 20:22, Baoquan He wrote:

Both kdump and fa_dump of ppc rely on crashkernel reservation. Move the
relevant codes into separate files:
crash_reserve.c, include/linux/crash_reserve.h.

And also add config item CRASH_RESERVE to control its enabling of the
codes. And update config items which has relationship with crashkernel
reservation.

And also change ifdeffery from CONFIG_CRASH_CORE to CONFIG_CRASH_RESERVE
when those scopes are only crashkernel reservation related.

And also rename arch/XXX/include/asm/{crash_core.h => crash_reserve.h}
on arm64, x86 and risc-v because those architectures' crash_core.h
is only related to crashkernel reservation.

Signed-off-by: Baoquan He 
---
  arch/arm64/Kconfig|   2 +-
  .../asm/{crash_core.h => crash_reserve.h} |   4 +-
  arch/powerpc/Kconfig  |   1 +
  arch/powerpc/mm/nohash/kaslr_booke.c  |   4 +-
  arch/riscv/Kconfig|   2 +-
  .../asm/{crash_core.h => crash_reserve.h} |   4 +-
  arch/x86/Kconfig  |   2 +-
  .../asm/{crash_core.h => crash_reserve.h} |   6 +-
  include/linux/crash_core.h|  40 --
  include/linux/crash_reserve.h |  48 ++
  include/linux/kexec.h |   1 +
  kernel/Kconfig.kexec  |   5 +-
  kernel/Makefile   |   1 +
  kernel/crash_core.c   | 438 -
  kernel/crash_reserve.c| 464 ++
  15 files changed, 531 insertions(+), 491 deletions(-)
  rename arch/arm64/include/asm/{crash_core.h => crash_reserve.h} (81%)
  rename arch/riscv/include/asm/{crash_core.h => crash_reserve.h} (78%)
  rename arch/x86/include/asm/{crash_core.h => crash_reserve.h} (92%)
  create mode 100644 include/linux/crash_reserve.h
  create mode 100644 kernel/crash_reserve.c

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index ea01a2c43efa..d96bc3c67ec6 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -1501,7 +1501,7 @@ config ARCH_SUPPORTS_CRASH_DUMP
def_bool y
  
  config ARCH_HAS_GENERIC_CRASHKERNEL_RESERVATION

-   def_bool CRASH_CORE
+   def_bool CRASH_RESERVE
  
  config TRANS_TABLE

def_bool y
diff --git a/arch/arm64/include/asm/crash_core.h 
b/arch/arm64/include/asm/crash_reserve.h
similarity index 81%
rename from arch/arm64/include/asm/crash_core.h
rename to arch/arm64/include/asm/crash_reserve.h
index 9f5c8d339f44..4afe027a4e7b 100644
--- a/arch/arm64/include/asm/crash_core.h
+++ b/arch/arm64/include/asm/crash_reserve.h
@@ -1,6 +1,6 @@
  /* SPDX-License-Identifier: GPL-2.0-only */
-#ifndef _ARM64_CRASH_CORE_H
-#define _ARM64_CRASH_CORE_H
+#ifndef _ARM64_CRASH_RESERVE_H
+#define _ARM64_CRASH_RESERVE_H
  
  /* Current arm64 boot protocol requires 2MB alignment */

  #define CRASH_ALIGN SZ_2M
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 414b978b8010..6aeab95f0edd 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -691,6 +691,7 @@ config FA_DUMP
bool "Firmware-assisted dump"
depends on PPC64 && (PPC_RTAS || PPC_POWERNV)
select CRASH_CORE
+   select CRASH_RESERVE
select CRASH_DUMP
help
  A robust mechanism to get reliable kernel crash dump with
diff --git a/arch/powerpc/mm/nohash/kaslr_booke.c 
b/arch/powerpc/mm/nohash/kaslr_booke.c
index b4f2786a7d2b..cdff129abb14 100644
--- a/arch/powerpc/mm/nohash/kaslr_booke.c
+++ b/arch/powerpc/mm/nohash/kaslr_booke.c
@@ -13,7 +13,7 @@
  #include 
  #include 
  #include 
-#include 
+#include 
  #include 
  #include 
  #include 
@@ -173,7 +173,7 @@ static __init bool overlaps_region(const void *fdt, u32 
start,
  
  static void __init get_crash_kernel(void *fdt, unsigned long size)

  {
-#ifdef CONFIG_CRASH_CORE
+#ifdef CONFIG_CRASH_RESERVE
unsigned long long crash_size, crash_base;
int ret;
  
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig

index b549499eb363..37a438c23deb 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -712,7 +712,7 @@ config ARCH_SUPPORTS_CRASH_DUMP
def_bool y
  
  config ARCH_HAS_GENERIC_CRASHKERNEL_RESERVATION

-   def_bool CRASH_CORE
+   def_bool CRASH_RESERVE
  
  config COMPAT

bool "Kernel support for 32-bit U-mode"
diff --git a/arch/riscv/include/asm/crash_core.h 
b/arch/riscv/include/asm/crash_reserve.h
similarity index 78%
rename from arch/riscv/include/asm/crash_core.h
rename to arch/riscv/include/asm/crash_reserve.h
index e1874b23feaf..013962e63587 100644
--- a/arch/riscv/include/asm/crash_core.h
+++ b/arch/riscv/include/asm/crash_reserve.h
@@ -1,6 +1,6 @@
  /* SPDX-License-Identifier: GPL-2.0-only */
-#ifndef 

Re: [PATCH v7 1/3] powerpc: make fadump resilient with memory add/remove events

2024-02-19 Thread Sourabh Jain

Hello Hari,

On 23/01/24 15:39, Hari Bathini wrote:



On 11/01/24 7:39 pm, Sourabh Jain wrote:

Due to changes in memory resources caused by either memory hotplug or
online/offline events, the elfcorehdr, which describes the CPUs and
memory of the crashed kernel to the kernel that collects the dump (known
as second/fadump kernel), becomes outdated. Consequently, attempting
dump collection with an outdated elfcorehdr can lead to failed or
inaccurate dump collection.

Memory hotplug or online/offline events is referred as memory add/remove
events in reset of the commit message.

The current solution to address the aforementioned issue is as follows:
Monitor memory add/remove events in userspace using udev rules, and
re-register fadump whenever there are changes in memory resources. This
leads to the creation of a new elfcorehdr with updated system memory
information.

There are several notable issues associated with re-registering fadump
for every memory add/remove events.

1. Bulk memory add/remove events with udev-based fadump re-registration
    can lead to race conditions and, more importantly, it creates a wide
    window during which fadump is inactive until all memory add/remove
    events are settled.
2. Re-registering fadump for every memory add/remove event is
    inefficient.
3. The memory for elfcorehdr is allocated based on the memblock regions
    available during early boot and remains fixed thereafter. 
However, if

    elfcorehdr is later recreated with additional memblock regions, its
    size will increase, potentially leading to memory corruption.

Address the aforementioned challenges by shifting the creation of
elfcorehdr from the first kernel (also referred as the crashed kernel),
where it was created and frequently recreated for every memory
add/remove event, to the fadump kernel. As a result, the elfcorehdr only
needs to be created once, thus eliminating the necessity to re-register
fadump during memory add/remove events.

At present, the first kernel prepares the fadump header and stores it in
the fadump reserved area. The fadump header contains start address of
the elfcorehd, crashing CPU details, etc.  In the event of first kernel


"elfcorehd" used instead of "elfcorehdr" at a couple of places..


Fixed it now. Thanks.




crash, the second/fadump boots and access the fadump header prepared by
first kernel and do the following in a platform-specific function
[rtas|opal]_fadump_process:

At present, the first kernel is responsible for preparing the fadump
header and storing it in the fadump reserved area. The fadump header
includes the start address of the elfcorehd, crashing CPU details, and
other relevant information. In the event of a crash in the first kernel,
the second/fadump boots and accesses the fadump header prepared by the
first kernel. It then performs the following steps in a
platform-specific function [rtas|opal]_fadump_process:

1. Sanity check for fadump header
2. Update CPU notes in elfcorehdr
3. Set the global variable elfcorehdr_addr to the address of the
    fadump header's elfcorehdr. For vmcore module to process it later 
on.


Along with the above, update the setup_fadump()/fadump.c to create
elfcorehdr in second/fadump kernel.

Section below outlines the information required to create the elfcorehdr
and the changes made to make it available to the fadump kernel if it's
not already.

To create elfcorehdr, the following crashed kernel information is
required: CPU notes, vmcoreinfo, and memory ranges.

At present, the CPU notes are already prepared in the fadump kernel, so
no changes are needed in that regard. The fadump kernel has access to
all crashed kernel memory regions, including boot memory regions that
are relocated by firmware to fadump reserved areas, so no changes for
that either. However, it is necessary to add new members to the fadump
header, i.e., the 'fadump_crash_info_header' structure, in order to pass
the crashed kernel's vmcoreinfo address and its size to fadump kernel.

In addition to the vmcoreinfo address and size, there are a few other
attributes also added to the fadump_crash_info_header structure.

1. version:
    It stores the fadump header version, which is currently set to 1.
    This provides flexibility to update the fadump crash info header in
    the future without changing the magic number. For each change in the
    fadump header, the version will be increased. This will help the
    updated kernel determine how to handle kernel dumps from older
    kernels. The magic number remains relevant for checking fadump 
header

    corruption.

2. elfcorehdr_size:
    since elfcorehdr is now prepared in the fadump/second kernel and
    it is not part of the reserved area, this attribute is needed to
    track the memory allocated for elfcorehdr to do the deallocation
    properly.

3. pt_regs_sz/cpu_mask_sz:
    Store size of pt_regs and cpu_mask strucutre in first kernel. These
    attributes are used avoid

[PATCH v16 5/5] powerpc: add crash memory hotplug support

2024-02-17 Thread Sourabh Jain
Extend the arch crash hotplug handler, as introduced by the patch title
("powerpc: add crash CPU hotplug support"), to also support memory
add/remove events.

Elfcorehdr describes the memory of the crash kernel to capture the
kernel; hence, it needs to be updated if memory resources change due to
memory add/remove events. Therefore, arch_crash_handle_hotplug_event()
is updated to recreate the elfcorehdr and replace it with the previous
one on memory add/remove events.

The memblock list is used to prepare the elfcorehdr. In the case of
memory hot remove, the memblock list is updated after the arch crash
hotplug handler is triggered, as depicted in Figure 1. Thus, the
hot-removed memory is explicitly removed from the crash memory ranges
to ensure that the memory ranges added to elfcorehdr do not include the
hot-removed memory.

Memory remove
  |
  v
Offline pages
  |
  v
 Initiate memory notify call <> crash hotplug handler
 chain for MEM_OFFLINE event
  |
  v
 Update memblock list

Figure 1

There are two system calls, `kexec_file_load` and `kexec_load`, used to
load the kdump image. A few changes have been made to ensure that the
kernel can safely update the elfcorehdr component of the kdump image for
both system calls.

For the kexec_file_load syscall, kdump image is prepared in the kernel.
To support an increasing number of memory regions, the elfcorehdr is
built with extra buffer space to ensure that it can accommodate
additional memory ranges in future.

For the kexec_load syscall, the elfcorehdr is updated only if the
KEXEC_CRASH_HOTPLUG_SUPPORT kexec flag is passed to the kernel by the
kexec tool. Passing this flag to the kernel indicates that the
elfcorehdr is built to accommodate additional memory ranges and the
elfcorehdr segment is not considered for SHA calculation, making it safe
to update.

The changes related to this feature are kept under the CRASH_HOTPLUG
config, and it is enabled by default.

Signed-off-by: Sourabh Jain 
Cc: Akhil Raj 
Cc: Andrew Morton 
Cc: Aneesh Kumar K.V 
Cc: Baoquan He 
Cc: Borislav Petkov (AMD) 
Cc: Boris Ostrovsky 
Cc: Christophe Leroy 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Hildenbrand 
Cc: Greg Kroah-Hartman 
Cc: Hari Bathini 
Cc: Laurent Dufour 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Mimi Zohar 
Cc: Naveen N Rao 
Cc: Oscar Salvador 
Cc: Thomas Gleixner 
Cc: Valentin Schneider 
Cc: Vivek Goyal 
Cc: ke...@lists.infradead.org
Cc: x...@kernel.org
---
 arch/powerpc/include/asm/kexec.h|   5 +-
 arch/powerpc/include/asm/kexec_ranges.h |   1 +
 arch/powerpc/kexec/core_64.c| 105 +++-
 arch/powerpc/kexec/file_load_64.c   |  34 +++-
 arch/powerpc/kexec/ranges.c |  85 +++
 5 files changed, 226 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
index 67bace4c90cf..a4193d922d99 100644
--- a/arch/powerpc/include/asm/kexec.h
+++ b/arch/powerpc/include/asm/kexec.h
@@ -116,8 +116,11 @@ int get_crash_memory_ranges(struct crash_mem **mem_ranges);
 #ifdef CONFIG_CRASH_HOTPLUG
 void arch_crash_handle_hotplug_event(struct kimage *image, void *arg);
 #define arch_crash_handle_hotplug_event arch_crash_handle_hotplug_event
-#endif /* CONFIG_CRASH_HOTPLUG */
 
+unsigned int arch_crash_get_elfcorehdr_size(void);
+#define crash_get_elfcorehdr_size arch_crash_get_elfcorehdr_size
+
+#endif /* CONFIG_CRASH_HOTPLUG */
 #endif /* CONFIG_PPC64 */
 
 #ifdef CONFIG_KEXEC_FILE
diff --git a/arch/powerpc/include/asm/kexec_ranges.h 
b/arch/powerpc/include/asm/kexec_ranges.h
index f83866a19e87..802abf580cf0 100644
--- a/arch/powerpc/include/asm/kexec_ranges.h
+++ b/arch/powerpc/include/asm/kexec_ranges.h
@@ -7,6 +7,7 @@
 void sort_memory_ranges(struct crash_mem *mrngs, bool merge);
 struct crash_mem *realloc_mem_ranges(struct crash_mem **mem_ranges);
 int add_mem_range(struct crash_mem **mem_ranges, u64 base, u64 size);
+int remove_mem_range(struct crash_mem **mem_ranges, u64 base, u64 size);
 int add_tce_mem_ranges(struct crash_mem **mem_ranges);
 int add_initrd_mem_range(struct crash_mem **mem_ranges);
 #ifdef CONFIG_PPC_64S_HASH_MMU
diff --git a/arch/powerpc/kexec/core_64.c b/arch/powerpc/kexec/core_64.c
index ff04cdc80f3d..6f188b5ef51e 100644
--- a/arch/powerpc/kexec/core_64.c
+++ b/arch/powerpc/kexec/core_64.c
@@ -19,8 +19,11 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
+#include 
+#include 
 #include 
 #include 
 #include 
@@ -590,6 +593,103 @@ late_initcall(export_htab_values);
 #undef pr_fmt
 #define pr_fmt(fmt) "crash hp: " fmt
 
+/*
+ * Advertise preferred elfcorehdr size to userspace via
+ * /sys/kernel/crash_elfcorehdr_size sysfs interface.
+ */
+unsigned int arch_crash_get_elfcorehdr_size(void)
+{
+   unsigned int sz;
+   unsigned long elf_phdr_cnt;
+
+   /* Program header for CPU notes and vmcoreinfo *

[PATCH v16 4/5] powerpc: add crash CPU hotplug support

2024-02-17 Thread Sourabh Jain
Due to CPU/Memory hotplug or online/offline events, the elfcorehdr
(which describes the CPUs and memory of the crashed kernel) and FDT
(Flattened Device Tree) of kdump image becomes outdated. Consequently,
attempting dump collection with an outdated elfcorehdr or FDT can lead
to failed or inaccurate dump collection.

Going forward, CPU hotplug or online/offline events are referred as
CPU/Memory add/remove events.

The current solution to address the above issue involves monitoring the
CPU/Memory add/remove events in userspace using udev rules and whenever
there are changes in CPU and memory resources, the entire kdump image
is loaded again. The kdump image includes kernel, initrd, elfcorehdr,
FDT, purgatory. Given that only elfcorehdr and FDT get outdated due to
CPU/Memory add/remove events, reloading the entire kdump image is
inefficient. More importantly, kdump remains inactive for a substantial
amount of time until the kdump reload completes.

To address the aforementioned issue, commit 247262756121 ("crash: add
generic infrastructure for crash hotplug support") added a generic
infrastructure that allows architectures to selectively update the kdump
image component during CPU or memory add/remove events within the kernel
itself.

In the event of a CPU or memory add/remove events, the generic crash
hotplug event handler, `crash_handle_hotplug_event()`, is triggered. It
then acquires the necessary locks to update the kdump image and invokes
the architecture-specific crash hotplug handler,
`arch_crash_handle_hotplug_event()`, to update the required kdump image
components.

This patch adds crash hotplug handler for PowerPC and enable support to
update the kdump image on CPU add/remove events. Support for memory
add/remove events is added in a subsequent patch with the title
"powerpc: add crash memory hotplug support"

As mentioned earlier, only the elfcorehdr and FDT kdump image components
need to be updated in the event of CPU or memory add/remove events.
However, on PowerPC architecture crash hotplug handler only updates the
FDT to enable crash hotplug support for CPU add/remove events. Here's
why.

The elfcorehdr on PowerPC is built with possible CPUs, and thus, it does
not need an update on CPU add/remove events. On the other hand, the FDT
needs to be updated on CPU add events to include the newly added CPU. If
the FDT is not updated and the kernel crashes on a newly added CPU, the
kdump kernel will fail to boot due to the unavailability of the crashing
CPU in the FDT. During the early boot, it is expected that the boot CPU
must be a part of the FDT; otherwise, the kernel will raise a BUG and
fail to boot. For more information, refer to commit 36ae37e3436b0
("powerpc: Make boot_cpuid common between 32 and 64-bit"). Since it is
okay to have an offline CPU in the kdump FDT, no action is taken in case
of CPU removal.

There are two system calls, `kexec_file_load` and `kexec_load`, used to
load the kdump image. Few changes have been made to ensure kernel can
safely update the FDT of kdump image loaded using both system calls.

For kexec_file_load syscall the kdump image is prepared in kernel. So to
support an increasing number of CPUs, the FDT is constructed with extra
buffer space to ensure it can accommodate a possible number of CPU
nodes. Additionally, a call to fdt_pack (which trims the unused space
once the FDT is prepared) is avoided if this feature is enabled.

For the kexec_load syscall, the FDT is updated only if the
KEXEC_CRASH_HOTPLUG_SUPPORT kexec flag is passed to the kernel by
userspace (kexec tools). When userspace passes this flag to the kernel,
it indicates that the FDT is built to accommodate possible CPUs, and the
FDT segment is excluded from SHA calculation, making it safe to update.

The changes related to this feature are kept under the CRASH_HOTPLUG
config, and it is enabled by default.

Signed-off-by: Sourabh Jain 
Cc: Akhil Raj 
Cc: Andrew Morton 
Cc: Aneesh Kumar K.V 
Cc: Baoquan He 
Cc: Borislav Petkov (AMD) 
Cc: Boris Ostrovsky 
Cc: Christophe Leroy 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Hildenbrand 
Cc: Greg Kroah-Hartman 
Cc: Hari Bathini 
Cc: Laurent Dufour 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Mimi Zohar 
Cc: Naveen N Rao 
Cc: Oscar Salvador 
Cc: Thomas Gleixner 
Cc: Valentin Schneider 
Cc: Vivek Goyal 
Cc: ke...@lists.infradead.org
Cc: x...@kernel.org
---
 arch/powerpc/Kconfig  |  4 ++
 arch/powerpc/include/asm/kexec.h  |  6 ++
 arch/powerpc/kexec/core_64.c  | 93 +++
 arch/powerpc/kexec/elf_64.c   | 12 +++-
 arch/powerpc/kexec/file_load_64.c | 17 ++
 5 files changed, 131 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index b9fc064d38d2..fd1bf07244c6 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -682,6 +682,10 @@ config RELOCATABLE_TEST
 config ARCH_SUPPORTS_CRASH_DUMP
def_bool PPC64 || PPC_BOOK3S_32 || PPC_8

[PATCH v16 3/5] powerpc/kexec: turn some static helper functions public

2024-02-17 Thread Sourabh Jain
Move the functions update_cpus_node and get_crash_memory_ranges from
kexec/file_load_64.c to kexec/core_64.c to make these functions usable
by other kexec components.

get_crash_memory_ranges uses functions defined in ranges.c, so take
ranges.c out of CONFIG_KEXEC_FILE.

Later in the series, these functions are utilized for in-kernel updates
to kdump image during CPU/Memory hotplug or online/offline events for
both kexec_load and kexec_file_load syscalls.

There is no intended functional change.

Signed-off-by: Sourabh Jain 
Reviewed-by: Laurent Dufour 
Cc: Akhil Raj 
Cc: Andrew Morton 
Cc: Aneesh Kumar K.V 
Cc: Baoquan He 
Cc: Borislav Petkov (AMD) 
Cc: Boris Ostrovsky 
Cc: Christophe Leroy 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Hildenbrand 
Cc: Greg Kroah-Hartman 
Cc: Hari Bathini 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Mimi Zohar 
Cc: Naveen N Rao 
Cc: Oscar Salvador 
Cc: Thomas Gleixner 
Cc: Valentin Schneider 
Cc: Vivek Goyal 
Cc: ke...@lists.infradead.org
Cc: x...@kernel.org
---
 arch/powerpc/include/asm/kexec.h  |   6 ++
 arch/powerpc/kexec/Makefile   |   4 +-
 arch/powerpc/kexec/core_64.c  | 166 ++
 arch/powerpc/kexec/file_load_64.c | 162 -
 4 files changed, 174 insertions(+), 164 deletions(-)

diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
index e1b43aa12175..562e1bb689da 100644
--- a/arch/powerpc/include/asm/kexec.h
+++ b/arch/powerpc/include/asm/kexec.h
@@ -108,6 +108,12 @@ void crash_free_reserved_phys_range(unsigned long begin, 
unsigned long end);
 #endif /* CONFIG_PPC_RTAS */
 #endif /* CONFIG_CRASH_DUMP */
 
+#ifdef CONFIG_PPC64
+struct crash_mem;
+int update_cpus_node(void *fdt);
+int get_crash_memory_ranges(struct crash_mem **mem_ranges);
+#endif /* CONFIG_PPC64 */
+
 #ifdef CONFIG_KEXEC_FILE
 extern const struct kexec_file_ops kexec_elf64_ops;
 
diff --git a/arch/powerpc/kexec/Makefile b/arch/powerpc/kexec/Makefile
index 0c2abe7f9908..f2ed5b85b912 100644
--- a/arch/powerpc/kexec/Makefile
+++ b/arch/powerpc/kexec/Makefile
@@ -3,11 +3,11 @@
 # Makefile for the linux kernel.
 #
 
-obj-y  += core.o crash.o core_$(BITS).o
+obj-y  += core.o crash.o ranges.o core_$(BITS).o
 
 obj-$(CONFIG_PPC32)+= relocate_32.o
 
-obj-$(CONFIG_KEXEC_FILE)   += file_load.o ranges.o file_load_$(BITS).o 
elf_$(BITS).o
+obj-$(CONFIG_KEXEC_FILE)   += file_load.o file_load_$(BITS).o elf_$(BITS).o
 
 # Disable GCOV, KCOV & sanitizers in odd or sensitive code
 GCOV_PROFILE_core_$(BITS).o := n
diff --git a/arch/powerpc/kexec/core_64.c b/arch/powerpc/kexec/core_64.c
index 762e4d09aacf..48beaadcfb70 100644
--- a/arch/powerpc/kexec/core_64.c
+++ b/arch/powerpc/kexec/core_64.c
@@ -17,6 +17,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 #include 
 #include 
@@ -30,6 +32,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 int machine_kexec_prepare(struct kimage *image)
 {
@@ -376,6 +380,168 @@ void default_machine_kexec(struct kimage *image)
/* NOTREACHED */
 }
 
+/**
+ * get_crash_memory_ranges - Get crash memory ranges. This list includes
+ *   first/crashing kernel's memory regions that
+ *   would be exported via an elfcore.
+ * @mem_ranges:  Range list to add the memory ranges to.
+ *
+ * Returns 0 on success, negative errno on error.
+ */
+int get_crash_memory_ranges(struct crash_mem **mem_ranges)
+{
+   phys_addr_t base, end;
+   struct crash_mem *tmem;
+   u64 i;
+   int ret;
+
+   for_each_mem_range(i, , ) {
+   u64 size = end - base;
+
+   /* Skip backup memory region, which needs a separate entry */
+   if (base == BACKUP_SRC_START) {
+   if (size > BACKUP_SRC_SIZE) {
+   base = BACKUP_SRC_END + 1;
+   size -= BACKUP_SRC_SIZE;
+   } else
+   continue;
+   }
+
+   ret = add_mem_range(mem_ranges, base, size);
+   if (ret)
+   goto out;
+
+   /* Try merging adjacent ranges before reallocation attempt */
+   if ((*mem_ranges)->nr_ranges == (*mem_ranges)->max_nr_ranges)
+   sort_memory_ranges(*mem_ranges, true);
+   }
+
+   /* Reallocate memory ranges if there is no space to split ranges */
+   tmem = *mem_ranges;
+   if (tmem && (tmem->nr_ranges == tmem->max_nr_ranges)) {
+   tmem = realloc_mem_ranges(mem_ranges);
+   if (!tmem)
+   goto out;
+   }
+
+   /* Exclude crashkernel region */
+   ret = crash_exclude_mem_range(tmem, crashk_res.start, crashk_res.end);
+   if (ret)
+   goto out;
+
+   /*
+* FIXME: For now, stay in parity with k

[PATCH v16 2/5] crash: add a new kexec flag for hotplug support

2024-02-17 Thread Sourabh Jain
Commit a72bbec70da2 ("crash: hotplug support for kexec_load()")
introduced a new kexec flag, `KEXEC_UPDATE_ELFCOREHDR`. Kexec tool uses
this flag to indicate to the kernel that it is safe to modify the
elfcorehdr of the kdump image loaded using the kexec_load system call.

However, it is possible that architectures may need to update kexec
segments other then elfcorehdr. For example, FDT (Flatten Device Tree)
on PowerPC. Introducing a new kexec flag for every new kexec segment
may not be a good solution. Hence, a generic kexec flag bit,
`KEXEC_CRASH_HOTPLUG_SUPPORT`, is introduced to share the CPU/Memory
hotplug support intent between the kexec tool and the kernel for the
kexec_load system call.

Now, if the kexec tool sends KEXEC_CRASH_HOTPLUG_SUPPORT kexec flag to
the kernel, it indicates to the kernel that all the required kexec
segment is skipped from SHA calculation and it is safe to update kdump
image loaded using the kexec_load syscall.

While loading the kdump image using the kexec_load syscall, the
@update_elfcorehdr member of struct kimage is set if the kexec tool
sends the KEXEC_UPDATE_ELFCOREHDR kexec flag. This member is later used
to determine whether it is safe to update elfcorehdr on hotplug events.
However, with the introduction of the KEXEC_CRASH_HOTPLUG_SUPPORT kexec
flag, the kexec tool could mark all the required kexec segments on an
architecture as safe to update. So rename the @update_elfcorehdr to
@hotplug_support. If @hotplug_support is set, the kernel can safely
update all the required kexec segments of the kdump image during
CPU/Memory hotplug events.

Introduce an architecture-specific function to process kexec flags for
determining hotplug support. Set the @hotplug_support member of struct
kimage for both kexec_load and kexec_file_load system calls. This
simplifies kernel checks to identify hotplug support for the currently
loaded kdump image by just examining the value of @hotplug_support.

Signed-off-by: Sourabh Jain 
Cc: Akhil Raj 
Cc: Andrew Morton 
Cc: Aneesh Kumar K.V 
Cc: Baoquan He 
Cc: Borislav Petkov (AMD) 
Cc: Boris Ostrovsky 
Cc: Christophe Leroy 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Hildenbrand 
Cc: Eric DeVolder 
Cc: Greg Kroah-Hartman 
Cc: Hari Bathini 
Cc: Laurent Dufour 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Mimi Zohar 
Cc: Naveen N Rao 
Cc: Oscar Salvador 
Cc: Thomas Gleixner 
Cc: Valentin Schneider 
Cc: Vivek Goyal 
Cc: ke...@lists.infradead.org
Cc: x...@kernel.org
---
 arch/x86/include/asm/kexec.h | 11 ++-
 arch/x86/kernel/crash.c  | 28 +---
 drivers/base/cpu.c   |  2 +-
 drivers/base/memory.c|  2 +-
 include/linux/kexec.h| 27 +--
 include/uapi/linux/kexec.h   |  1 +
 kernel/crash_core.c  | 11 ---
 kernel/kexec.c   |  4 ++--
 kernel/kexec_file.c  |  5 +
 9 files changed, 50 insertions(+), 41 deletions(-)

diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
index 9bb6607e864e..8be622e82ba8 100644
--- a/arch/x86/include/asm/kexec.h
+++ b/arch/x86/include/asm/kexec.h
@@ -211,15 +211,8 @@ extern void kdump_nmi_shootdown_cpus(void);
 void arch_crash_handle_hotplug_event(struct kimage *image, void *arg);
 #define arch_crash_handle_hotplug_event arch_crash_handle_hotplug_event
 
-#ifdef CONFIG_HOTPLUG_CPU
-int arch_crash_hotplug_cpu_support(void);
-#define crash_hotplug_cpu_support arch_crash_hotplug_cpu_support
-#endif
-
-#ifdef CONFIG_MEMORY_HOTPLUG
-int arch_crash_hotplug_memory_support(void);
-#define crash_hotplug_memory_support arch_crash_hotplug_memory_support
-#endif
+int arch_crash_hotplug_support(struct kimage *image, unsigned long 
kexec_flags);
+#define arch_crash_hotplug_support arch_crash_hotplug_support
 
 unsigned int arch_crash_get_elfcorehdr_size(void);
 #define crash_get_elfcorehdr_size arch_crash_get_elfcorehdr_size
diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index 44744e9c68ec..7072aaee2ea0 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -398,20 +398,26 @@ int crash_load_segments(struct kimage *image)
 #undef pr_fmt
 #define pr_fmt(fmt) "crash hp: " fmt
 
-/* These functions provide the value for the sysfs crash_hotplug nodes */
-#ifdef CONFIG_HOTPLUG_CPU
-int arch_crash_hotplug_cpu_support(void)
+int arch_crash_hotplug_support(struct kimage *image, unsigned long kexec_flags)
 {
-   return crash_check_update_elfcorehdr();
-}
-#endif
 
-#ifdef CONFIG_MEMORY_HOTPLUG
-int arch_crash_hotplug_memory_support(void)
-{
-   return crash_check_update_elfcorehdr();
-}
+#ifdef CONFIG_KEXEC_FILE
+   if (image->file_mode)
+   return 1;
 #endif
+   /*
+* Initially, crash hotplug support for kexec_load was added
+* with the KEXEC_UPDATE_ELFCOREHDR flag. Later, this
+* functionality was expanded to accommodate multiple kexec
+* segment updates, lead

[PATCH v16 1/5] crash: forward memory_notify arg to arch crash hotplug handler

2024-02-17 Thread Sourabh Jain
In the event of memory hotplug or online/offline events, the crash
memory hotplug notifier `crash_memhp_notifier()` receives a
`memory_notify` object but doesn't forward that object to the
generic and architecture-specific crash hotplug handler.

The `memory_notify` object contains the starting PFN (Page Frame Number)
and the number of pages in the hot-removed memory. This information is
necessary for architectures like PowerPC to update/recreate the kdump
image, specifically `elfcorehdr`.

So update the function signature of `crash_handle_hotplug_event()` and
`arch_crash_handle_hotplug_event()` to accept the `memory_notify` object
as an argument from crash memory hotplug notifier.

Since no such object is available in the case of CPU hotplug event, the
crash CPU hotplug notifier `crash_cpuhp_online()` passes NULL to the
crash hotplug handler.

Signed-off-by: Sourabh Jain 
Acked-by: Baoquan He 
Cc: Akhil Raj 
Cc: Andrew Morton 
Cc: Aneesh Kumar K.V 
Cc: Borislav Petkov (AMD) 
Cc: Boris Ostrovsky 
Cc: Christophe Leroy 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Hildenbrand 
Cc: Greg Kroah-Hartman 
Cc: Hari Bathini 
Cc: Laurent Dufour 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Mimi Zohar 
Cc: Naveen N Rao 
Cc: Oscar Salvador 
Cc: Thomas Gleixner 
Cc: Valentin Schneider 
Cc: Vivek Goyal 
Cc: ke...@lists.infradead.org
Cc: x...@kernel.org
---
 arch/x86/include/asm/kexec.h |  2 +-
 arch/x86/kernel/crash.c  |  3 ++-
 include/linux/kexec.h|  2 +-
 kernel/crash_core.c  | 14 +++---
 4 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
index c9f6a6c5de3c..9bb6607e864e 100644
--- a/arch/x86/include/asm/kexec.h
+++ b/arch/x86/include/asm/kexec.h
@@ -208,7 +208,7 @@ int arch_kimage_file_post_load_cleanup(struct kimage 
*image);
 extern void kdump_nmi_shootdown_cpus(void);
 
 #ifdef CONFIG_CRASH_HOTPLUG
-void arch_crash_handle_hotplug_event(struct kimage *image);
+void arch_crash_handle_hotplug_event(struct kimage *image, void *arg);
 #define arch_crash_handle_hotplug_event arch_crash_handle_hotplug_event
 
 #ifdef CONFIG_HOTPLUG_CPU
diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index b6b044356f1b..44744e9c68ec 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -428,10 +428,11 @@ unsigned int arch_crash_get_elfcorehdr_size(void)
 /**
  * arch_crash_handle_hotplug_event() - Handle hotplug elfcorehdr changes
  * @image: a pointer to kexec_crash_image
+ * @arg: struct memory_notify handler for memory hotplug case and NULL for CPU 
hotplug case.
  *
  * Prepare the new elfcorehdr and replace the existing elfcorehdr.
  */
-void arch_crash_handle_hotplug_event(struct kimage *image)
+void arch_crash_handle_hotplug_event(struct kimage *image, void *arg)
 {
void *elfbuf = NULL, *old_elfcorehdr;
unsigned long nr_mem_ranges;
diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index 400cb6c02176..802052d9c64b 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -483,7 +483,7 @@ static inline void arch_kexec_pre_free_pages(void *vaddr, 
unsigned int pages) {
 #endif
 
 #ifndef arch_crash_handle_hotplug_event
-static inline void arch_crash_handle_hotplug_event(struct kimage *image) { }
+static inline void arch_crash_handle_hotplug_event(struct kimage *image, void 
*arg) { }
 #endif
 
 int crash_check_update_elfcorehdr(void);
diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index 75cd6a736d03..b692ec5955de 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -924,7 +924,7 @@ int crash_check_update_elfcorehdr(void)
  * list of segments it checks (since the elfcorehdr changes and thus
  * would require an update to purgatory itself to update the digest).
  */
-static void crash_handle_hotplug_event(unsigned int hp_action, unsigned int 
cpu)
+static void crash_handle_hotplug_event(unsigned int hp_action, unsigned int 
cpu, void *arg)
 {
struct kimage *image;
 
@@ -986,7 +986,7 @@ static void crash_handle_hotplug_event(unsigned int 
hp_action, unsigned int cpu)
image->hp_action = hp_action;
 
/* Now invoke arch-specific update handler */
-   arch_crash_handle_hotplug_event(image);
+   arch_crash_handle_hotplug_event(image, arg);
 
/* No longer handling a hotplug event */
image->hp_action = KEXEC_CRASH_HP_NONE;
@@ -1002,17 +1002,17 @@ static void crash_handle_hotplug_event(unsigned int 
hp_action, unsigned int cpu)
crash_hotplug_unlock();
 }
 
-static int crash_memhp_notifier(struct notifier_block *nb, unsigned long val, 
void *v)
+static int crash_memhp_notifier(struct notifier_block *nb, unsigned long val, 
void *arg)
 {
switch (val) {
case MEM_ONLINE:
crash_handle_hotplug_event(KEXEC_CRASH_HP_ADD_MEMORY,
-   KEXEC_CRASH_HP_INVALID_CPU);
+   KEXEC_CRASH_HP_INVALID_CPU, arg);
break;
 
  

[PATCH v16 0/5] powerpc/crash: Kernel handling of CPU and memory hotplug

2024-02-17 Thread Sourabh Jain
mit message for better understanding.

v8:
  - Restrict fdt_index initialization to machine_kexec_post_load
it work for both kexec_load and kexec_file_load.[3/8] Laurent Dufour

  - Updated the logic to find the number of offline core. [6/8]

  - Changed the logic to find the elfcore program header to accommodate
future memory ranges due memory hotplug events. [8/8]

v7
  - added a new config to configure this feature
  - pass hotplug action type to arch specific handler

v6
  - Added crash memory hotplug support

v5:
  - Replace COFNIG_CRASH_HOTPLUG with CONFIG_HOTPLUG_CPU.
  - Move fdt segment identification for kexec_load case to load path
instead of crash hotplug handler
  - Keep new attribute defined under kimage_arch to track FDT segment
under CONFIG_HOTPLUG_CPU config.

v4:
  - Update the logic to find the additional space needed for hotadd CPUs
post kexec load. Refer "[RFC v4 PATCH 4/5] powerpc/crash hp: add crash
hotplug support for kexec_file_load" patch to know more about the
change.
  - Fix a couple of typo.
  - Replace pr_err to pr_info_once to warn user about memory hotplug
support.
  - In crash hotplug handle exit the for loop if FDT segment is found.

v3
  - Move fdt_index and fdt_index_vaild variables to kimage_arch struct.
  - Rebase patche on top of

https://lore.kernel.org/lkml/20220303162725.49640-1-eric.devol...@oracle.com/
  - Fixed warning reported by checpatch script

v2:
  - Use generic hotplug handler introduced by

https://lore.kernel.org/lkml/20220209195706.51522-1-eric.devol...@oracle.com/
a significant change from v1.

Cc: Akhil Raj 
Cc: Andrew Morton 
Cc: Aneesh Kumar K.V 
Cc: Baoquan He 
Cc: Borislav Petkov (AMD) 
Cc: Boris Ostrovsky 
Cc: Christophe Leroy 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Hildenbrand 
Cc: Greg Kroah-Hartman 
Cc: Hari Bathini 
Cc: Laurent Dufour 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Mimi Zohar 
Cc: Naveen N Rao 
Cc: Oscar Salvador 
Cc: Thomas Gleixner 
Cc: Valentin Schneider 
Cc: Vivek Goyal 
Cc: ke...@lists.infradead.org
Cc: x...@kernel.org

Sourabh Jain (5):
  crash: forward memory_notify arg to arch crash hotplug handler
  crash: add a new kexec flag for hotplug support
  powerpc/kexec: turn some static helper functions public
  powerpc: add crash CPU hotplug support
  powerpc: add crash memory hotplug support

 arch/powerpc/Kconfig|   4 +
 arch/powerpc/include/asm/kexec.h|  15 +
 arch/powerpc/include/asm/kexec_ranges.h |   1 +
 arch/powerpc/kexec/Makefile |   4 +-
 arch/powerpc/kexec/core_64.c| 362 
 arch/powerpc/kexec/elf_64.c |  12 +-
 arch/powerpc/kexec/file_load_64.c   | 211 --
 arch/powerpc/kexec/ranges.c |  85 ++
 arch/x86/include/asm/kexec.h|  13 +-
 arch/x86/kernel/crash.c |  31 +-
 drivers/base/cpu.c  |   2 +-
 drivers/base/memory.c   |   2 +-
 include/linux/kexec.h   |  29 +-
 include/uapi/linux/kexec.h  |   1 +
 kernel/crash_core.c |  25 +-
 kernel/kexec.c  |   4 +-
 kernel/kexec_file.c |   5 +
 17 files changed, 589 insertions(+), 217 deletions(-)

-- 
2.43.0



[PATCH v8 3/3] Documentation/powerpc: update fadump implementation details

2024-02-16 Thread Sourabh Jain
The patch titled ("powerpc: make fadump resilient with memory add/remove
events") has made significant changes to the implementation of fadump,
particularly on elfcorehdr creation and fadump crash info header
structure. Therefore, updating the fadump implementation documentation
to reflect those changes.

Following updates are done to firmware assisted dump documentation:

1. The elfcorehdr is no longer stored after fadump HDR in the reserved
   dump area. Instead, the second kernel dynamically allocates memory
   for the elfcorehdr within the address range from 0 to the boot memory
   size. Therefore, update figures 1 and 2 of Memory Reservation during
   the first and second kernels to reflect this change.

2. A version field has been added to the fadump header to manage the
   future changes to fadump crash info header structure without changing
   the fadump header magic number in the future. Therefore, remove the
   corresponding TODO from the document.

Signed-off-by: Sourabh Jain 
Cc: Aditya Gupta 
Cc: Aneesh Kumar K.V 
Cc: Hari Bathini 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Naveen N Rao 
---
 .../arch/powerpc/firmware-assisted-dump.rst   | 91 +--
 1 file changed, 42 insertions(+), 49 deletions(-)

diff --git a/Documentation/arch/powerpc/firmware-assisted-dump.rst 
b/Documentation/arch/powerpc/firmware-assisted-dump.rst
index e363fc48529a..7e37aadd1f77 100644
--- a/Documentation/arch/powerpc/firmware-assisted-dump.rst
+++ b/Documentation/arch/powerpc/firmware-assisted-dump.rst
@@ -134,12 +134,12 @@ that are run. If there is dump data, then the
 memory is held.
 
 If there is no waiting dump data, then only the memory required to
-hold CPU state, HPTE region, boot memory dump, FADump header and
-elfcore header, is usually reserved at an offset greater than boot
-memory size (see Fig. 1). This area is *not* released: this region
-will be kept permanently reserved, so that it can act as a receptacle
-for a copy of the boot memory content in addition to CPU state and
-HPTE region, in the case a crash does occur.
+hold CPU state, HPTE region, boot memory dump, and FADump header is
+usually reserved at an offset greater than boot memory size (see Fig. 1).
+This area is *not* released: this region will be kept permanently
+reserved, so that it can act as a receptacle for a copy of the boot
+memory content in addition to CPU state and HPTE region, in the case
+a crash does occur.
 
 Since this reserved memory area is used only after the system crash,
 there is no point in blocking this significant chunk of memory from
@@ -153,22 +153,22 @@ that were present in CMA region::
 
   o Memory Reservation during first kernel
 
-  Low memory Top of memory
-  0boot memory size   |<--- Reserved dump area --->|   |
-  |   |   |Permanent Reservation   |   |
-  V   V   ||   V
-  +---+-/ /---+---++---+-+-++--+
-  |   |   |///||  DUMP | HDR | ELF ||  |
-  +---+-/ /---+---++---+-+-++--+
-|   ^^ ^  ^   ^
-|   || |  |   |
-\  CPU  HPTE   /  |   |
- --   |   |
-  Boot memory content gets transferred|   |
-  to reserved area by firmware at the |   |
-  time of crash.  |   |
-  FADump Header   |
-   (meta area)|
+  Low memory  Top of memory
+  0boot memory size   |<-- Reserved dump area ->| |
+  |   |   |  Permanent Reservation  | |
+  V   V   | | V
+  +---+-/ /---+---++---+---++-+
+  |   |   |///||DUMP   |  HDR  || |
+  +---+-/ /---+---++---+---++-+
+|   ^^   ^ ^  ^
+|   ||   | |  |
+\  CPU  HPTE / |  |
+   |  |
+  Boot memory content gets transferred |  |
+  to reserved area by firmware at the  |  |
+  time of crash.   |  |
+   FADump Header  |
+(meta area)   |
   |
   |
   Metadata: This area holds a metadata structure whose
@@ -186,13 +186,2

[PATCH v8 2/3] powerpc/fadump: add hotplug_ready sysfs interface

2024-02-16 Thread Sourabh Jain
The elfcorehdr describes the CPUs and memory of the crashed kernel to
the kernel that captures the dump, known as the second or fadump kernel.
The elfcorehdr needs to be updated if the system's memory changes due to
memory hotplug or online/offline events.

Currently, memory hotplug events are monitored in userspace by udev
rules, and fadump is re-registered, which recreates the elfcorehdr with
the latest available memory in the system.

However, the previous patch ("powerpc: make fadump resilient with memory
add/remove events") moved the creation of elfcorehdr to the second or
fadump kernel. This eliminates the need to regenerate the elfcorehdr
during memory hotplug or online/offline events.

Create a sysfs entry at /sys/kernel/fadump/hotplug_ready to let
userspace know that fadump re-registration is not required for memory
add/remove events.

Signed-off-by: Sourabh Jain 
Cc: Aditya Gupta 
Cc: Aneesh Kumar K.V 
Cc: Hari Bathini 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Naveen N Rao 
---
 Documentation/ABI/testing/sysfs-kernel-fadump | 11 +++
 arch/powerpc/kernel/fadump.c  | 14 ++
 2 files changed, 25 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-kernel-fadump 
b/Documentation/ABI/testing/sysfs-kernel-fadump
index 8f7a64a81783..e2786a3db8dd 100644
--- a/Documentation/ABI/testing/sysfs-kernel-fadump
+++ b/Documentation/ABI/testing/sysfs-kernel-fadump
@@ -38,3 +38,14 @@ Contact: linuxppc-dev@lists.ozlabs.org
 Description:   read only
Provide information about the amount of memory reserved by
FADump to save the crash dump in bytes.
+
+What:  /sys/kernel/fadump/hotplug_ready
+Date:  Feb 2024
+Contact:   linuxppc-dev@lists.ozlabs.org
+Description:   read only
+   Kdump udev rule re-registers fadump on memory add/remove events,
+   primarily to update the elfcorehdr. This sysfs indicates the
+   kdump udev rule that fadump re-registration is not required on
+   memory add/remove events because elfcorehdr is now prepared in
+   the second/fadump kernel.
+User:  kexec-tools
diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
index a55ad8514745..6478820a2038 100644
--- a/arch/powerpc/kernel/fadump.c
+++ b/arch/powerpc/kernel/fadump.c
@@ -1444,6 +1444,18 @@ static ssize_t enabled_show(struct kobject *kobj,
return sprintf(buf, "%d\n", fw_dump.fadump_enabled);
 }
 
+/*
+ * /sys/kernel/fadump/hotplug_ready sysfs node returns 1, which inidcates
+ * to usersapce that fadump re-registration is not required on memory
+ * hotplug events.
+ */
+static ssize_t hotplug_ready_show(struct kobject *kobj,
+ struct kobj_attribute *attr,
+ char *buf)
+{
+   return sprintf(buf, "%d\n", 1);
+}
+
 static ssize_t mem_reserved_show(struct kobject *kobj,
 struct kobj_attribute *attr,
 char *buf)
@@ -1516,11 +1528,13 @@ static struct kobj_attribute release_attr = 
__ATTR_WO(release_mem);
 static struct kobj_attribute enable_attr = __ATTR_RO(enabled);
 static struct kobj_attribute register_attr = __ATTR_RW(registered);
 static struct kobj_attribute mem_reserved_attr = __ATTR_RO(mem_reserved);
+static struct kobj_attribute hotplug_ready_attr = __ATTR_RO(hotplug_ready);
 
 static struct attribute *fadump_attrs[] = {
_attr.attr,
_attr.attr,
_reserved_attr.attr,
+   _ready_attr.attr,
NULL,
 };
 
-- 
2.43.0



[PATCH v8 1/3] powerpc: make fadump resilient with memory add/remove events

2024-02-16 Thread Sourabh Jain
Due to changes in memory resources caused by either memory hotplug or
online/offline events, the elfcorehdr, which describes the CPUs and
memory of the crashed kernel to the kernel that collects the dump (known
as second/fadump kernel), becomes outdated. Consequently, attempting
dump collection with an outdated elfcorehdr can lead to failed or
inaccurate dump collection.

Memory hotplug or online/offline events is referred as memory add/remove
events in reset of the commit message.

The current solution to address the aforementioned issue is as follows:
Monitor memory add/remove events in userspace using udev rules, and
re-register fadump whenever there are changes in memory resources. This
leads to the creation of a new elfcorehdr with updated system memory
information.

There are several notable issues associated with re-registering fadump
for every memory add/remove events.

1. Bulk memory add/remove events with udev-based fadump re-registration
   can lead to race conditions and, more importantly, it creates a wide
   window during which fadump is inactive until all memory add/remove
   events are settled.
2. Re-registering fadump for every memory add/remove event is
   inefficient.
3. The memory for elfcorehdr is allocated based on the memblock regions
   available during early boot and remains fixed thereafter. However, if
   elfcorehdr is later recreated with additional memblock regions, its
   size will increase, potentially leading to memory corruption.

Address the aforementioned challenges by shifting the creation of
elfcorehdr from the first kernel (also referred as the crashed kernel),
where it was created and frequently recreated for every memory
add/remove event, to the fadump kernel. As a result, the elfcorehdr only
needs to be created once, thus eliminating the necessity to re-register
fadump during memory add/remove events.

At present, the first kernel is responsible for preparing the fadump
header and storing it in the fadump reserved area. The fadump header
includes the start address of the elfcorehdr, crashing CPU details, and
other relevant information. In the event of a crash in the first kernel,
the second/fadump boots and accesses the fadump header prepared by the
first kernel. It then performs the following steps in a
platform-specific function [rtas|opal]_fadump_process:

1. Sanity check for fadump header
2. Update CPU notes in elfcorehdr

Along with the above, update the setup_fadump()/fadump.c to create
elfcorehdr and set its address to the global variable elfcorehdr_addr
for the vmcore module to process it in the second/fadump kernel.

Section below outlines the information required to create the elfcorehdr
and the changes made to make it available to the fadump kernel if it's
not already.

To create elfcorehdr, the following crashed kernel information is
required: CPU notes, vmcoreinfo, and memory ranges.

At present, the CPU notes are already prepared in the fadump kernel, so
no changes are needed in that regard. The fadump kernel has access to
all crashed kernel memory regions, including boot memory regions that
are relocated by firmware to fadump reserved areas, so no changes for
that either. However, it is necessary to add new members to the fadump
header, i.e., the 'fadump_crash_info_header' structure, in order to pass
the crashed kernel's vmcoreinfo address and its size to fadump kernel.

In addition to the vmcoreinfo address and size, there are a few other
attributes also added to the fadump_crash_info_header structure.

1. version:
   It stores the fadump header version, which is currently set to 1.
   This provides flexibility to update the fadump crash info header in
   the future without changing the magic number. For each change in the
   fadump header, the version will be increased. This will help the
   updated kernel determine how to handle kernel dumps from older
   kernels. The magic number remains relevant for checking fadump header
   corruption.

2. pt_regs_sz/cpu_mask_sz:
   Store size of pt_regs and cpu_mask structure of first kernel. These
   attributes are used to prevent dump processing if the sizes of
   pt_regs or cpu_mask structure differ between the first and fadump
   kernels.

Note: if either first/crashed kernel or second/fadump kernel do not have
the changes introduced here then kernel fail to collect the dump and
prints relevant error message on the console.

Signed-off-by: Sourabh Jain 
Cc: Aditya Gupta 
Cc: Aneesh Kumar K.V 
Cc: Hari Bathini 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Naveen N Rao 
---
 arch/powerpc/include/asm/fadump-internal.h   |  31 +-
 arch/powerpc/kernel/fadump.c | 339 +++
 arch/powerpc/platforms/powernv/opal-fadump.c |  22 +-
 arch/powerpc/platforms/pseries/rtas-fadump.c |  30 +-
 4 files changed, 232 insertions(+), 190 deletions(-)

diff --git a/arch/powerpc/include/asm/fadump-internal.h 
b/arch/powerpc/include/asm/fadump-internal.h
index 27f9e11eda28..5d706a7acc8a 100644

[PATCH v8 0/3] powerpc: make fadump resilient with memory add/remove events

2024-02-16 Thread Sourabh Jain
d by the checkpatch script.
  - Rebased it to 6.6.0-rc3

v1: 17 Sep 2023
  https://lore.kernel.org/all/20230917080225.561627-1-sourabhj...@linux.ibm.com/

Cc: Aditya Gupta 
Cc: Aneesh Kumar K.V 
Cc: Hari Bathini 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Naveen N Rao 

Sourabh Jain (3):
  powerpc: make fadump resilient with memory add/remove events
  powerpc/fadump: add hotplug_ready sysfs interface
  Documentation/powerpc: update fadump implementation details

 Documentation/ABI/testing/sysfs-kernel-fadump |  11 +
 .../arch/powerpc/firmware-assisted-dump.rst   |  91 +++--
 arch/powerpc/include/asm/fadump-internal.h|  31 +-
 arch/powerpc/kernel/fadump.c  | 353 +++---
 arch/powerpc/platforms/powernv/opal-fadump.c  |  22 +-
 arch/powerpc/platforms/pseries/rtas-fadump.c  |  30 +-
 6 files changed, 299 insertions(+), 239 deletions(-)

-- 
2.43.0



Re: [PATCH v15 2/5] crash: add a new kexec flag for hotplug support

2024-02-14 Thread Sourabh Jain




On 13/02/24 08:51, Baoquan He wrote:

On 02/12/24 at 07:27pm, Sourabh Jain wrote:

Hello Baoquan,

On 05/02/24 08:40, Baoquan He wrote:

Hi Sourabh,


..

diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index 802052d9c64b..7880d74dc5c4 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -317,8 +317,8 @@ struct kimage {
/* If set, we are using file mode kexec syscall */
unsigned int file_mode:1;
   #ifdef CONFIG_CRASH_HOTPLUG
-   /* If set, allow changes to elfcorehdr of kexec_load'd image */
-   unsigned int update_elfcorehdr:1;
+   /* If set, allow changes to kexec segments of kexec_load'd image */

The code comment doesn't reflect the usage of the flag.

I should have updated the comment to indicate that this flag is for both
system calls.
More comments below.


You set it too
when it's kexec_file_load. Speaking of this, I do wonder why you need
set it too for kexec_file_load,

If we do this one can just access image->hotplug_support to find hotplug
support for currently loaded kdump image without bothering about which
system call was used to load the kdump image.


and why we have
arch_crash_hotplug_support(), then crash_check_hotplug_support() both of
which have the same effect.

arch_crash_hotplug_support(): This function processes the kexec flags and
finds the
hotplug support for the kdump image. Based on the return value of this
function,
the image->hotplug_support attribute is set.

Now, once the kdump image is loaded, we no longer have access to the kexec
flags.
Therefore, crash_check_hotplug_support simply returns the value of
image->hotplug_support
when user space accesses the following sysfs files:
/sys/devices/system/[cpu|memory]/crash_hotplug.

To keep things simple, I have introduced two functions: One function
processes the kexec flags
and determines the hotplug support for the image being loaded. And other
function simply
accesses image->hotplug_support and advertises CPU/Memory hotplug support to
userspace.

 From the function name and their functionality, they seems to be
duplicated, even though it's different from the internal detail. This
could bring a little confusion to code understanding. It's fine, we can
refactor them if needed in the future. So let's keep it as the patch is.
Thanks.

Ok sure.

- Sourabh


Re: [PATCH v15 2/5] crash: add a new kexec flag for hotplug support

2024-02-12 Thread Sourabh Jain

Hello Baoquan,

On 05/02/24 08:40, Baoquan He wrote:

Hi Sourabh,

Thanks for the great work. There are some concerns, please see inline
comments.


Thank you :)



On 01/11/24 at 04:21pm, Sourabh Jain wrote:
..

Now, if the kexec tool sends KEXEC_CRASH_HOTPLUG_SUPPORT kexec flag to
the kernel, it indicates to the kernel that all the required kexec
segment is skipped from SHA calculation and it is safe to update kdump
image loaded using the kexec_load syscall.

So finally you add a new KEXEC_CRASH_HOTPLUG_SUPPORT flag, that's fine.

..

diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
index 9bb6607e864e..e791129fdf6c 100644
--- a/arch/x86/include/asm/kexec.h
+++ b/arch/x86/include/asm/kexec.h
@@ -211,6 +211,9 @@ extern void kdump_nmi_shootdown_cpus(void);
  void arch_crash_handle_hotplug_event(struct kimage *image, void *arg);
  #define arch_crash_handle_hotplug_event arch_crash_handle_hotplug_event
  
+int arch_crash_hotplug_support(struct kimage *image, unsigned long kexec_flags);

+#define arch_crash_hotplug_support arch_crash_hotplug_support
+
  #ifdef CONFIG_HOTPLUG_CPU
  int arch_crash_hotplug_cpu_support(void);
  #define crash_hotplug_cpu_support arch_crash_hotplug_cpu_support

Then crash_hotplug_cpu_support is not needed any more on x86_64, and
crash_hotplug_memory_support(), if you remove their implementation in
arch/x86/kernel/crash.c, won't it cause building warning or error on x86?


Yeah, crash_hotplug_cpu_support and crash_hotplug_memory_support are
no longer required. My bad, I forgot to remove them.


diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index 44744e9c68ec..293b54bff706 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -398,20 +398,16 @@ int crash_load_segments(struct kimage *image)
  #undef pr_fmt
  #define pr_fmt(fmt) "crash hp: " fmt
  
-/* These functions provide the value for the sysfs crash_hotplug nodes */

-#ifdef CONFIG_HOTPLUG_CPU
-int arch_crash_hotplug_cpu_support(void)
+int arch_crash_hotplug_support(struct kimage *image, unsigned long kexec_flags)
  {
-   return crash_check_update_elfcorehdr();
-}
-#endif
  
-#ifdef CONFIG_MEMORY_HOTPLUG

-int arch_crash_hotplug_memory_support(void)
-{
-   return crash_check_update_elfcorehdr();
-}
+#ifdef CONFIG_KEXEC_FILE
+   if (image->file_mode)
+   return 1;
  #endif
+   return (kexec_flags & KEXEC_UPDATE_ELFCOREHDR ||
+   kexec_flags & KEXEC_CRASH_HOTPLUG_SUPPORT);

Do we need add some document to tell why there are two kexec flags on
x86_64, except of checking this patch log?


Sure I will add a comment about it.




+}
  
  unsigned int arch_crash_get_elfcorehdr_size(void)

  {
diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
index 548491de818e..2f411ddfbd8b 100644
--- a/drivers/base/cpu.c
+++ b/drivers/base/cpu.c
@@ -306,7 +306,7 @@ static ssize_t crash_hotplug_show(struct device *dev,
 struct device_attribute *attr,
 char *buf)
  {
-   return sysfs_emit(buf, "%d\n", crash_hotplug_cpu_support());
+   return sysfs_emit(buf, "%d\n", crash_check_hotplug_support());
  }
  static DEVICE_ATTR_ADMIN_RO(crash_hotplug);
  #endif
diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index 8a13babd826c..e70ab1d3428e 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -514,7 +514,7 @@ static DEVICE_ATTR_RW(auto_online_blocks);
  static ssize_t crash_hotplug_show(struct device *dev,
   struct device_attribute *attr, char *buf)
  {
-   return sysfs_emit(buf, "%d\n", crash_hotplug_memory_support());
+   return sysfs_emit(buf, "%d\n", crash_check_hotplug_support());
  }
  static DEVICE_ATTR_RO(crash_hotplug);
  #endif
diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index 802052d9c64b..7880d74dc5c4 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -317,8 +317,8 @@ struct kimage {
/* If set, we are using file mode kexec syscall */
unsigned int file_mode:1;
  #ifdef CONFIG_CRASH_HOTPLUG
-   /* If set, allow changes to elfcorehdr of kexec_load'd image */
-   unsigned int update_elfcorehdr:1;
+   /* If set, allow changes to kexec segments of kexec_load'd image */

The code comment doesn't reflect the usage of the flag.
I should have updated the comment to indicate that this flag is for both 
system calls.

More comments below.


You set it too
when it's kexec_file_load. Speaking of this, I do wonder why you need
set it too for kexec_file_load,

If we do this one can just access image->hotplug_support to find hotplug
support for currently loaded kdump image without bothering about which
system call was used to load the kdump image.


and why we have
arch_crash_hotplug_support(), then crash_check_hotplug_support() both of
which have the same effect.


arch_crash_hotpl

Re: [PATCH v15 1/5] crash: forward memory_notify arg to arch crash hotplug handler

2024-02-12 Thread Sourabh Jain




On 05/02/24 08:41, Baoquan He wrote:

On 01/11/24 at 04:21pm, Sourabh Jain wrote:

In the event of memory hotplug or online/offline events, the crash
memory hotplug notifier `crash_memhp_notifier()` receives a
`memory_notify` object but doesn't forward that object to the
generic and architecture-specific crash hotplug handler.

The `memory_notify` object contains the starting PFN (Page Frame Number)
and the number of pages in the hot-removed memory. This information is
necessary for architectures like PowerPC to update/recreate the kdump
image, specifically `elfcorehdr`.

So update the function signature of `crash_handle_hotplug_event()` and
`arch_crash_handle_hotplug_event()` to accept the `memory_notify` object
as an argument from crash memory hotplug notifier.

Since no such object is available in the case of CPU hotplug event, the
crash CPU hotplug notifier `crash_cpuhp_online()` passes NULL to the
crash hotplug handler.


..

---
  arch/x86/include/asm/kexec.h |  2 +-
  arch/x86/kernel/crash.c  |  3 ++-
  include/linux/kexec.h|  2 +-
  kernel/crash_core.c  | 14 +++---
  4 files changed, 11 insertions(+), 10 deletions(-)

LGTM,

Acked-by: Baoquan He 


Thanks Baoquan He

- Sourabh




diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
index c9f6a6c5de3c..9bb6607e864e 100644
--- a/arch/x86/include/asm/kexec.h
+++ b/arch/x86/include/asm/kexec.h
@@ -208,7 +208,7 @@ int arch_kimage_file_post_load_cleanup(struct kimage 
*image);
  extern void kdump_nmi_shootdown_cpus(void);
  
  #ifdef CONFIG_CRASH_HOTPLUG

-void arch_crash_handle_hotplug_event(struct kimage *image);
+void arch_crash_handle_hotplug_event(struct kimage *image, void *arg);
  #define arch_crash_handle_hotplug_event arch_crash_handle_hotplug_event
  
  #ifdef CONFIG_HOTPLUG_CPU

diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index b6b044356f1b..44744e9c68ec 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -428,10 +428,11 @@ unsigned int arch_crash_get_elfcorehdr_size(void)
  /**
   * arch_crash_handle_hotplug_event() - Handle hotplug elfcorehdr changes
   * @image: a pointer to kexec_crash_image
+ * @arg: struct memory_notify handler for memory hotplug case and NULL for CPU 
hotplug case.
   *
   * Prepare the new elfcorehdr and replace the existing elfcorehdr.
   */
-void arch_crash_handle_hotplug_event(struct kimage *image)
+void arch_crash_handle_hotplug_event(struct kimage *image, void *arg)
  {
void *elfbuf = NULL, *old_elfcorehdr;
unsigned long nr_mem_ranges;
diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index 400cb6c02176..802052d9c64b 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -483,7 +483,7 @@ static inline void arch_kexec_pre_free_pages(void *vaddr, 
unsigned int pages) {
  #endif
  
  #ifndef arch_crash_handle_hotplug_event

-static inline void arch_crash_handle_hotplug_event(struct kimage *image) { }
+static inline void arch_crash_handle_hotplug_event(struct kimage *image, void 
*arg) { }
  #endif
  
  int crash_check_update_elfcorehdr(void);

diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index d48315667752..ab1c8e79759d 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -914,7 +914,7 @@ int crash_check_update_elfcorehdr(void)
   * list of segments it checks (since the elfcorehdr changes and thus
   * would require an update to purgatory itself to update the digest).
   */
-static void crash_handle_hotplug_event(unsigned int hp_action, unsigned int 
cpu)
+static void crash_handle_hotplug_event(unsigned int hp_action, unsigned int 
cpu, void *arg)
  {
struct kimage *image;
  
@@ -976,7 +976,7 @@ static void crash_handle_hotplug_event(unsigned int hp_action, unsigned int cpu)

image->hp_action = hp_action;
  
  	/* Now invoke arch-specific update handler */

-   arch_crash_handle_hotplug_event(image);
+   arch_crash_handle_hotplug_event(image, arg);
  
  	/* No longer handling a hotplug event */

image->hp_action = KEXEC_CRASH_HP_NONE;
@@ -992,17 +992,17 @@ static void crash_handle_hotplug_event(unsigned int 
hp_action, unsigned int cpu)
crash_hotplug_unlock();
  }
  
-static int crash_memhp_notifier(struct notifier_block *nb, unsigned long val, void *v)

+static int crash_memhp_notifier(struct notifier_block *nb, unsigned long val, 
void *arg)
  {
switch (val) {
case MEM_ONLINE:
crash_handle_hotplug_event(KEXEC_CRASH_HP_ADD_MEMORY,
-   KEXEC_CRASH_HP_INVALID_CPU);
+   KEXEC_CRASH_HP_INVALID_CPU, arg);
break;
  
  	case MEM_OFFLINE:

crash_handle_hotplug_event(KEXEC_CRASH_HP_REMOVE_MEMORY,
-   KEXEC_CRASH_HP_INVALID_CPU);
+   KEXEC_CRASH_HP_INVALID_CPU, arg);
break;
}
return NOTIFY_OK;
@@ -1015,13 +1015,13 @@ static struct notifier_block crash_me

Re: [PATCH v15 5/5] powerpc: add crash memory hotplug support

2024-01-28 Thread Sourabh Jain




On 23/01/24 15:52, Hari Bathini wrote:



On 11/01/24 4:21 pm, Sourabh Jain wrote:

Extend the arch crash hotplug handler, as introduced by the patch title
("powerpc: add crash CPU hotplug support"), to also support memory
add/remove events.

Elfcorehdr describes the memory of the crash kernel to capture the
kernel; hence, it needs to be updated if memory resources change due to
memory add/remove events. Therefore, arch_crash_handle_hotplug_event()
is updated to recreate the elfcorehdr and replace it with the previous
one on memory add/remove events.

The memblock list is used to prepare the elfcorehdr. In the case of
memory hot removal, the memblock list is updated after the arch crash
hotplug handler is triggered, as depicted in Figure 1. Thus, the
hot-removed memory is explicitly removed from the crash memory ranges
to ensure that the memory ranges added to elfcorehdr do not include the
hot-removed memory.

 Memory remove
   |
   v
 Offline pages
   |
   v
  Initiate memory notify call <> crash hotplug handler
  chain for MEM_OFFLINE event
   |
   v
  Update memblock list

  Figure 1

There are two system calls, `kexec_file_load` and `kexec_load`, used to
load the kdump image. A few changes have been made to ensure that the
kernel can safely update the elfcorehdr component of the kdump image for
both system calls.

For the kexec_file_load syscall, kdump image is prepared in the kernel.
To support an increasing number of memory regions, the elfcorehdr is
built with extra buffer space to ensure that it can accommodate
additional memory ranges in future.

For the kexec_load syscall, the elfcorehdr is updated only if the
KEXEC_CRASH_HOTPLUG_SUPPORT kexec flag is passed to the kernel by the
kexec tool. Passing this flag to the kernel indicates that the
elfcorehdr is built to accommodate additional memory ranges and the
elfcorehdr segment is not considered for SHA calculation, making it safe
to update.

The changes related to this feature are kept under the CRASH_HOTPLUG
config, and it is enabled by default.

Signed-off-by: Sourabh Jain 
Cc: Akhil Raj 
Cc: Andrew Morton 
Cc: Aneesh Kumar K.V 
Cc: Baoquan He 
Cc: Borislav Petkov (AMD) 
Cc: Boris Ostrovsky 
Cc: Christophe Leroy 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Hildenbrand 
Cc: Greg Kroah-Hartman 
Cc: Hari Bathini 
Cc: Laurent Dufour 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Mimi Zohar 
Cc: Naveen N Rao 
Cc: Oscar Salvador 
Cc: Thomas Gleixner 
Cc: Valentin Schneider 
Cc: Vivek Goyal 
Cc: ke...@lists.infradead.org
Cc: x...@kernel.org
---
  arch/powerpc/include/asm/kexec.h    |   5 +-
  arch/powerpc/include/asm/kexec_ranges.h |   1 +
  arch/powerpc/kexec/core_64.c    | 107 +++-
  arch/powerpc/kexec/file_load_64.c   |  34 +++-
  arch/powerpc/kexec/ranges.c |  85 +++
  5 files changed, 225 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/include/asm/kexec.h 
b/arch/powerpc/include/asm/kexec.h

index 943e58eb9bff..25ff5b7f1a28 100644
--- a/arch/powerpc/include/asm/kexec.h
+++ b/arch/powerpc/include/asm/kexec.h
@@ -116,8 +116,11 @@ int get_crash_memory_ranges(struct crash_mem 
**mem_ranges);

  #ifdef CONFIG_CRASH_HOTPLUG
  void arch_crash_handle_hotplug_event(struct kimage *image, void *arg);
  #define arch_crash_handle_hotplug_event 
arch_crash_handle_hotplug_event

-#endif /*CONFIG_CRASH_HOTPLUG */
  +unsigned int arch_crash_get_elfcorehdr_size(void);
+#define crash_get_elfcorehdr_size arch_crash_get_elfcorehdr_size
+
+#endif /*CONFIG_CRASH_HOTPLUG */
  #endif /* CONFIG_PPC64 */
    #ifdef CONFIG_KEXEC_FILE
diff --git a/arch/powerpc/include/asm/kexec_ranges.h 
b/arch/powerpc/include/asm/kexec_ranges.h

index f83866a19e87..802abf580cf0 100644
--- a/arch/powerpc/include/asm/kexec_ranges.h
+++ b/arch/powerpc/include/asm/kexec_ranges.h
@@ -7,6 +7,7 @@
  void sort_memory_ranges(struct crash_mem *mrngs, bool merge);
  struct crash_mem *realloc_mem_ranges(struct crash_mem **mem_ranges);
  int add_mem_range(struct crash_mem **mem_ranges, u64 base, u64 size);
+int remove_mem_range(struct crash_mem **mem_ranges, u64 base, u64 
size);

  int add_tce_mem_ranges(struct crash_mem **mem_ranges);
  int add_initrd_mem_range(struct crash_mem **mem_ranges);
  #ifdef CONFIG_PPC_64S_HASH_MMU
diff --git a/arch/powerpc/kexec/core_64.c b/arch/powerpc/kexec/core_64.c
index 43fcd78c2102..4673f150f973 100644
--- a/arch/powerpc/kexec/core_64.c
+++ b/arch/powerpc/kexec/core_64.c
@@ -19,8 +19,11 @@
  #include 
  #include 
  #include 
+#include 
    #include 
+#include 
+#include 
  #include 
  #include 
  #include 
@@ -546,6 +549,101 @@ int update_cpus_node(void *fdt)
  #undef pr_fmt
  #define pr_fmt(fmt) "crash hp: " fmt
  +/*
+ * Advertise preferred elfcorehdr size to userspace via
+ * /sys/kernel/crash_elfcorehdr_size sysfs interface.
+ */
+unsigned int arch_crash_get_elfcorehdr_size(void)
+{
+    unsi

[PATCH v7 3/3] Documentation/powerpc: update fadump implementation details

2024-01-11 Thread Sourabh Jain
The patch titled ("powerpc: make fadump resilient with memory add/remove
events") has made significant changes to the implementation of fadump,
particularly on elfcorehdr creation and fadump crash info header
structure. Therefore, updating the fadump implementation documentation
to reflect those changes.

Following updates are done to firmware assisted dump documentation:

1. The elfcorehdr is no longer stored after fadump HDR in the reserved
   dump area. Instead, the second kernel dynamically allocates memory
   for the elfcorehdr within the address range from 0 to the boot memory
   size. Therefore, update figures 1 and 2 of Memory Reservation during
   the first and second kernels to reflect this change.

2. A version field has been added to the fadump header to manage the
   future changes to fadump crash info header structure without changing
   the fadump header magic number in the future. Therefore, remove the
   corresponding TODO from the document.

Signed-off-by: Sourabh Jain 
Cc: Aditya Gupta 
Cc: Aneesh Kumar K.V 
Cc: Hari Bathini 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Naveen N Rao 
---
 .../arch/powerpc/firmware-assisted-dump.rst   | 91 +--
 1 file changed, 42 insertions(+), 49 deletions(-)

diff --git a/Documentation/arch/powerpc/firmware-assisted-dump.rst 
b/Documentation/arch/powerpc/firmware-assisted-dump.rst
index e363fc48529a..7e37aadd1f77 100644
--- a/Documentation/arch/powerpc/firmware-assisted-dump.rst
+++ b/Documentation/arch/powerpc/firmware-assisted-dump.rst
@@ -134,12 +134,12 @@ that are run. If there is dump data, then the
 memory is held.
 
 If there is no waiting dump data, then only the memory required to
-hold CPU state, HPTE region, boot memory dump, FADump header and
-elfcore header, is usually reserved at an offset greater than boot
-memory size (see Fig. 1). This area is *not* released: this region
-will be kept permanently reserved, so that it can act as a receptacle
-for a copy of the boot memory content in addition to CPU state and
-HPTE region, in the case a crash does occur.
+hold CPU state, HPTE region, boot memory dump, and FADump header is
+usually reserved at an offset greater than boot memory size (see Fig. 1).
+This area is *not* released: this region will be kept permanently
+reserved, so that it can act as a receptacle for a copy of the boot
+memory content in addition to CPU state and HPTE region, in the case
+a crash does occur.
 
 Since this reserved memory area is used only after the system crash,
 there is no point in blocking this significant chunk of memory from
@@ -153,22 +153,22 @@ that were present in CMA region::
 
   o Memory Reservation during first kernel
 
-  Low memory Top of memory
-  0boot memory size   |<--- Reserved dump area --->|   |
-  |   |   |Permanent Reservation   |   |
-  V   V   ||   V
-  +---+-/ /---+---++---+-+-++--+
-  |   |   |///||  DUMP | HDR | ELF ||  |
-  +---+-/ /---+---++---+-+-++--+
-|   ^^ ^  ^   ^
-|   || |  |   |
-\  CPU  HPTE   /  |   |
- --   |   |
-  Boot memory content gets transferred|   |
-  to reserved area by firmware at the |   |
-  time of crash.  |   |
-  FADump Header   |
-   (meta area)|
+  Low memory  Top of memory
+  0boot memory size   |<-- Reserved dump area ->| |
+  |   |   |  Permanent Reservation  | |
+  V   V   | | V
+  +---+-/ /---+---++---+---++-+
+  |   |   |///||DUMP   |  HDR  || |
+  +---+-/ /---+---++---+---++-+
+|   ^^   ^ ^  ^
+|   ||   | |  |
+\  CPU  HPTE / |  |
+   |  |
+  Boot memory content gets transferred |  |
+  to reserved area by firmware at the  |  |
+  time of crash.   |  |
+   FADump Header  |
+(meta area)   |
   |
   |
   Metadata: This area holds a metadata structure whose
@@ -186,13 +186,2

[PATCH v7 2/3] powerpc/fadump: add hotplug_ready sysfs interface

2024-01-11 Thread Sourabh Jain
The elfcorehdr describes the CPUs and memory of the crashed kernel to
the kernel that captures the dump, known as the second or fadump kernel.
The elfcorehdr needs to be updated if the system's memory changes due to
memory hotplug or online/offline events.

Currently, memory hotplug events are monitored in userspace by udev
rules, and fadump is re-registered, which recreates the elfcorehdr with
the latest available memory in the system.

However, the previous patch ("powerpc: make fadump resilient with memory
add/remove events") moved the creation of elfcorehdr to the second or
fadump kernel. This eliminates the need to regenerate the elfcorehdr
during memory hotplug or online/offline events.

Create a sysfs entry at /sys/kernel/fadump/hotplug_ready to let
userspace know that fadump re-registration is not required for memory
add/remove events.

Signed-off-by: Sourabh Jain 
Cc: Aditya Gupta 
Cc: Aneesh Kumar K.V 
Cc: Hari Bathini 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Naveen N Rao 
---
 Documentation/ABI/testing/sysfs-kernel-fadump | 11 +++
 arch/powerpc/kernel/fadump.c  | 14 ++
 2 files changed, 25 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-kernel-fadump 
b/Documentation/ABI/testing/sysfs-kernel-fadump
index 8f7a64a81783..8e18a6c93650 100644
--- a/Documentation/ABI/testing/sysfs-kernel-fadump
+++ b/Documentation/ABI/testing/sysfs-kernel-fadump
@@ -38,3 +38,14 @@ Contact: linuxppc-dev@lists.ozlabs.org
 Description:   read only
Provide information about the amount of memory reserved by
FADump to save the crash dump in bytes.
+
+What:  /sys/kernel/fadump/hotplug_ready
+Date:  Jan 2024
+Contact:   linuxppc-dev@lists.ozlabs.org
+Description:   read only
+   Kdump udev rule re-registers fadump on memory add/remove events,
+   primarily to update the elfcorehdr. This sysfs indicates the
+   kdump udev rule that fadump re-registration is not required on
+   memory add/remove events because elfcorehdr is now prepared in
+   the second/fadump kernel.
+User:  kexec-tools
diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
index eb9132538268..a55dd9bf754c 100644
--- a/arch/powerpc/kernel/fadump.c
+++ b/arch/powerpc/kernel/fadump.c
@@ -1455,6 +1455,18 @@ static ssize_t enabled_show(struct kobject *kobj,
return sprintf(buf, "%d\n", fw_dump.fadump_enabled);
 }
 
+/*
+ * /sys/kernel/fadump/hotplug_ready sysfs node returns 1, which inidcates
+ * to usersapce that fadump re-registration is not required on memory
+ * hotplug events.
+ */
+static ssize_t hotplug_ready_show(struct kobject *kobj,
+ struct kobj_attribute *attr,
+ char *buf)
+{
+   return sprintf(buf, "%d\n", 1);
+}
+
 static ssize_t mem_reserved_show(struct kobject *kobj,
 struct kobj_attribute *attr,
 char *buf)
@@ -1527,11 +1539,13 @@ static struct kobj_attribute release_attr = 
__ATTR_WO(release_mem);
 static struct kobj_attribute enable_attr = __ATTR_RO(enabled);
 static struct kobj_attribute register_attr = __ATTR_RW(registered);
 static struct kobj_attribute mem_reserved_attr = __ATTR_RO(mem_reserved);
+static struct kobj_attribute hotplug_ready_attr = __ATTR_RO(hotplug_ready);
 
 static struct attribute *fadump_attrs[] = {
_attr.attr,
_attr.attr,
_reserved_attr.attr,
+   _ready_attr.attr,
NULL,
 };
 
-- 
2.41.0



[PATCH v7 1/3] powerpc: make fadump resilient with memory add/remove events

2024-01-11 Thread Sourabh Jain
 to collect the dump and
prints relevant error message on the console.

Signed-off-by: Sourabh Jain 
Cc: Aditya Gupta 
Cc: Aneesh Kumar K.V 
Cc: Hari Bathini 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Naveen N Rao 
---
 arch/powerpc/include/asm/fadump-internal.h   |  31 +-
 arch/powerpc/kernel/fadump.c | 355 +++
 arch/powerpc/platforms/powernv/opal-fadump.c |  18 +-
 arch/powerpc/platforms/pseries/rtas-fadump.c |  23 +-
 4 files changed, 242 insertions(+), 185 deletions(-)

diff --git a/arch/powerpc/include/asm/fadump-internal.h 
b/arch/powerpc/include/asm/fadump-internal.h
index 27f9e11eda28..a632e9708610 100644
--- a/arch/powerpc/include/asm/fadump-internal.h
+++ b/arch/powerpc/include/asm/fadump-internal.h
@@ -42,13 +42,40 @@ static inline u64 fadump_str_to_u64(const char *str)
 
 #define FADUMP_CPU_UNKNOWN (~((u32)0))
 
-#define FADUMP_CRASH_INFO_MAGICfadump_str_to_u64("FADMPINF")
+/*
+ * The introduction of new fields in the fadump crash info header has
+ * led to a change in the magic key from `FADMPINF` to `FADMPSIG` for
+ * identifying a kernel crash from an old kernel.
+ *
+ * To prevent the need for further changes to the magic number in the
+ * event of future modifications to the fadump crash info header, a
+ * version field has been introduced to track the fadump crash info
+ * header version.
+ *
+ * Consider a few points before adding new members to the fadump crash info
+ * header structure:
+ *
+ *  - Append new members; avoid adding them in between.
+ *  - Non-primitive members should have a size member as well.
+ *  - For every change in the fadump header, increment the
+ *fadump header version. This helps the updated kernel decide how to
+ *handle kernel dumps from older kernels.
+ */
+#define FADUMP_CRASH_INFO_MAGIC_OLDfadump_str_to_u64("FADMPINF")
+#define FADUMP_CRASH_INFO_MAGICfadump_str_to_u64("FADMPSIG")
+#define FADUMP_HEADER_VERSION  1
 
 /* fadump crash info structure */
 struct fadump_crash_info_header {
u64 magic_number;
-   u64 elfcorehdr_addr;
+   u32 version;
u32 crashing_cpu;
+   u64 elfcorehdr_addr;
+   u64 elfcorehdr_size;
+   u64 vmcoreinfo_raddr;
+   u64 vmcoreinfo_size;
+   u32 pt_regs_sz;
+   u32 cpu_mask_sz;
struct pt_regs  regs;
struct cpumask  cpu_mask;
 };
diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
index d14eda1e8589..eb9132538268 100644
--- a/arch/powerpc/kernel/fadump.c
+++ b/arch/powerpc/kernel/fadump.c
@@ -53,8 +53,6 @@ static struct kobject *fadump_kobj;
 static atomic_t cpus_in_fadump;
 static DEFINE_MUTEX(fadump_mutex);
 
-static struct fadump_mrange_info crash_mrange_info = { "crash", NULL, 0, 0, 0, 
false };
-
 #define RESERVED_RNGS_SZ   16384 /* 16K - 128 entries */
 #define RESERVED_RNGS_CNT  (RESERVED_RNGS_SZ / \
 sizeof(struct fadump_memory_range))
@@ -373,12 +371,6 @@ static unsigned long __init get_fadump_area_size(void)
size = PAGE_ALIGN(size);
size += fw_dump.boot_memory_size;
size += sizeof(struct fadump_crash_info_header);
-   size += sizeof(struct elfhdr); /* ELF core header.*/
-   size += sizeof(struct elf_phdr); /* place holder for cpu notes */
-   /* Program headers for crash memory regions. */
-   size += sizeof(struct elf_phdr) * (memblock_num_regions(memory) + 2);
-
-   size = PAGE_ALIGN(size);
 
/* This is to hold kernel metadata on platforms that support it */
size += (fw_dump.ops->fadump_get_metadata_size ?
@@ -931,36 +923,6 @@ static inline int fadump_add_mem_range(struct 
fadump_mrange_info *mrange_info,
return 0;
 }
 
-static int fadump_exclude_reserved_area(u64 start, u64 end)
-{
-   u64 ra_start, ra_end;
-   int ret = 0;
-
-   ra_start = fw_dump.reserve_dump_area_start;
-   ra_end = ra_start + fw_dump.reserve_dump_area_size;
-
-   if ((ra_start < end) && (ra_end > start)) {
-   if ((start < ra_start) && (end > ra_end)) {
-   ret = fadump_add_mem_range(_mrange_info,
-  start, ra_start);
-   if (ret)
-   return ret;
-
-   ret = fadump_add_mem_range(_mrange_info,
-  ra_end, end);
-   } else if (start < ra_start) {
-   ret = fadump_add_mem_range(_mrange_info,
-  start, ra_start);
-   } else if (ra_end < end) {
-   ret = fadump_add_mem_range(_mrange_info,
-  ra_end, end)

[PATCH v7 0/3] powerpc: make fadump resilient with memory add/remove events

2024-01-11 Thread Sourabh Jain
Problem:

Due to changes in memory resources caused by either memory hotplug or
online/offline events, the elfcorehdr, which describes the cpus and
memory of the crashed kernel to the kernel that collects the dump (known
as second/fadump kernel), becomes outdated. Consequently, attempting
dump collection with an outdated elfcorehdr can lead to failed or
inaccurate dump collection.

Memory hotplug or online/offline events is referred as memory add/remove
events in reset of the patch series.

Existing solution:
==
Monitor memory add/remove events in userspace using udev rules, and
re-register fadump whenever there are changes in memory resources. This
leads to the creation of a new elfcorehdr with updated system memory
information.

Challenges with existing solution:
==
1. Performing bulk memory add/remove with udev-based fadump
   re-registration can lead to race conditions and, more importantly,
   it creates a large wide window during which fadump is inactive until
   all memory add/remove events are settled.
2. Re-registering fadump for every memory add/remove event is
   inefficient.
3. Memory for elfcorehdr is allocated based on the memblock regions
   available during first kernel early boot and it remains fixed
   thereafter. However, if the elfcorehdr is later recreated with
   additional memblock regions, its size will increase, potentially
   leading to memory corruption.

Proposed solution:
==
Address the aforementioned challenges by shifting the creation of
elfcorehdr from the first kernel (also referred as the crashed kernel),
where it was created and frequently recreated for every memory
add/remove event, to the fadump kernel. As a result, the elfcorehdr only
needs to be created once, thus eliminating the necessity to re-register
fadump during memory add/remove events.

To know more about elfcorehdr creation in the fadump kernel, refer to
the first patch in this series.

The second patch includes a new sysfs interface that tells userspace
that fadump re-registration isn't needed for memory add/remove events. 
note that userspace changes do not need to be in sync with kernel
changes; they can roll out independently.

Since there are significant changes in the fadump implementation, the
third patch updates the fadump documentation to reflect the changes made
in this patch series.

Kernel tree rebased on 6.7.0-rc4 with patch series applied:
=
https://github.com/sourabhjains/linux/tree/fadump-mem-hotplug-v7

Userspace changes:
==
To realize this feature, one must update the kdump udev rules to prevent
fadump re-registration during memory add/remove events.

On rhel apply the following changes to file
/usr/lib/udev/rules.d/98-kexec.rules

-run+="/bin/sh -c '/usr/bin/systemctl is-active kdump.service || exit 0; 
/usr/bin/systemd-run --quiet --no-block /usr/lib/udev/kdump-udev-throttler'"
+# don't re-register fadump if the value of the node
+# /sys/kernel/fadump/hotplug_ready is 1.
+
+run+="/bin/sh -c '/usr/bin/systemctl is-active kdump.service || exit 0; ! test 
-f /sys/kernel/fadump_enabled || cat /sys/kernel/fadump_enabled | grep 0  || ! 
test -f /sys/kernel/fadump/hotplug_ready || cat 
/sys/kernel/fadump/hotplug_ready | grep 0 || exit 0; /usr/bin/systemd-run 
--quiet --no-block /usr/lib/udev/kdump-udev-throttler'"

Changelog:
==
v7: 11 Jan 2023
  - Rebase it to 6.7

v6: 8 Dec 2023
  https://lore.kernel.org/all/20231208115159.82236-1-sourabhj...@linux.ibm.com/
  - Add size fields for `pt_regs` and `cpumask` in the fadump header
structure
  - Don't process the dump if the size of `pt_regs` and `cpu_mask` is
not same in the crashed and fadump kernel
  - Include an additional check for endianness mismatch when the magic
number doesn't match, to print the relevant error message
  - Don't process the dump if the fadump header contains an old magic number
  - Rebased it to 6.7.0-rc4

v5: 29 Oct 2023 
  https://lore.kernel.org/all/20231029124548.12198-1-sourabhj...@linux.ibm.com/
  - Fix a comment on the first patch

v4: 21 Oct 2023
  https://lore.kernel.org/all/20231021181733.204311-1-sourabhj...@linux.ibm.com/
  - Fix a build warning about type casting

v3: 9 Oct 2023
  https://lore.kernel.org/all/20231009041953.36139-1-sourabhj...@linux.ibm.com/
  - Assign physical address of elfcorehdr to fdh->elfcorehdr_addr
  - Rename a variable, boot_mem_dest_addr -> boot_mem_dest_offset

v2: 25 Sep 2023
  https://lore.kernel.org/all/20230925051214.678957-1-sourabhj...@linux.ibm.com/
  - Fixed a few indentation issues reported by the checkpatch script.
  - Rebased it to 6.6.0-rc3

v1: 17 Sep 2023
  https://lore.kernel.org/all/20230917080225.561627-1-sourabhj...@linux.ibm.com/

Cc: Aditya Gupta 
Cc: Aneesh Kumar K.V 
Cc: Hari Bathini 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Naveen N Rao 

Sourabh Jain (3):
  powe

[PATCH v15 1/5] crash: forward memory_notify arg to arch crash hotplug handler

2024-01-11 Thread Sourabh Jain
In the event of memory hotplug or online/offline events, the crash
memory hotplug notifier `crash_memhp_notifier()` receives a
`memory_notify` object but doesn't forward that object to the
generic and architecture-specific crash hotplug handler.

The `memory_notify` object contains the starting PFN (Page Frame Number)
and the number of pages in the hot-removed memory. This information is
necessary for architectures like PowerPC to update/recreate the kdump
image, specifically `elfcorehdr`.

So update the function signature of `crash_handle_hotplug_event()` and
`arch_crash_handle_hotplug_event()` to accept the `memory_notify` object
as an argument from crash memory hotplug notifier.

Since no such object is available in the case of CPU hotplug event, the
crash CPU hotplug notifier `crash_cpuhp_online()` passes NULL to the
crash hotplug handler.

Signed-off-by: Sourabh Jain 
Cc: Akhil Raj 
Cc: Andrew Morton 
Cc: Aneesh Kumar K.V 
Cc: Baoquan He 
Cc: Borislav Petkov (AMD) 
Cc: Boris Ostrovsky 
Cc: Christophe Leroy 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Hildenbrand 
Cc: Greg Kroah-Hartman 
Cc: Hari Bathini 
Cc: Laurent Dufour 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Mimi Zohar 
Cc: Naveen N Rao 
Cc: Oscar Salvador 
Cc: Thomas Gleixner 
Cc: Valentin Schneider 
Cc: Vivek Goyal 
Cc: ke...@lists.infradead.org
Cc: x...@kernel.org
---
 arch/x86/include/asm/kexec.h |  2 +-
 arch/x86/kernel/crash.c  |  3 ++-
 include/linux/kexec.h|  2 +-
 kernel/crash_core.c  | 14 +++---
 4 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
index c9f6a6c5de3c..9bb6607e864e 100644
--- a/arch/x86/include/asm/kexec.h
+++ b/arch/x86/include/asm/kexec.h
@@ -208,7 +208,7 @@ int arch_kimage_file_post_load_cleanup(struct kimage 
*image);
 extern void kdump_nmi_shootdown_cpus(void);
 
 #ifdef CONFIG_CRASH_HOTPLUG
-void arch_crash_handle_hotplug_event(struct kimage *image);
+void arch_crash_handle_hotplug_event(struct kimage *image, void *arg);
 #define arch_crash_handle_hotplug_event arch_crash_handle_hotplug_event
 
 #ifdef CONFIG_HOTPLUG_CPU
diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index b6b044356f1b..44744e9c68ec 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -428,10 +428,11 @@ unsigned int arch_crash_get_elfcorehdr_size(void)
 /**
  * arch_crash_handle_hotplug_event() - Handle hotplug elfcorehdr changes
  * @image: a pointer to kexec_crash_image
+ * @arg: struct memory_notify handler for memory hotplug case and NULL for CPU 
hotplug case.
  *
  * Prepare the new elfcorehdr and replace the existing elfcorehdr.
  */
-void arch_crash_handle_hotplug_event(struct kimage *image)
+void arch_crash_handle_hotplug_event(struct kimage *image, void *arg)
 {
void *elfbuf = NULL, *old_elfcorehdr;
unsigned long nr_mem_ranges;
diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index 400cb6c02176..802052d9c64b 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -483,7 +483,7 @@ static inline void arch_kexec_pre_free_pages(void *vaddr, 
unsigned int pages) {
 #endif
 
 #ifndef arch_crash_handle_hotplug_event
-static inline void arch_crash_handle_hotplug_event(struct kimage *image) { }
+static inline void arch_crash_handle_hotplug_event(struct kimage *image, void 
*arg) { }
 #endif
 
 int crash_check_update_elfcorehdr(void);
diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index d48315667752..ab1c8e79759d 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -914,7 +914,7 @@ int crash_check_update_elfcorehdr(void)
  * list of segments it checks (since the elfcorehdr changes and thus
  * would require an update to purgatory itself to update the digest).
  */
-static void crash_handle_hotplug_event(unsigned int hp_action, unsigned int 
cpu)
+static void crash_handle_hotplug_event(unsigned int hp_action, unsigned int 
cpu, void *arg)
 {
struct kimage *image;
 
@@ -976,7 +976,7 @@ static void crash_handle_hotplug_event(unsigned int 
hp_action, unsigned int cpu)
image->hp_action = hp_action;
 
/* Now invoke arch-specific update handler */
-   arch_crash_handle_hotplug_event(image);
+   arch_crash_handle_hotplug_event(image, arg);
 
/* No longer handling a hotplug event */
image->hp_action = KEXEC_CRASH_HP_NONE;
@@ -992,17 +992,17 @@ static void crash_handle_hotplug_event(unsigned int 
hp_action, unsigned int cpu)
crash_hotplug_unlock();
 }
 
-static int crash_memhp_notifier(struct notifier_block *nb, unsigned long val, 
void *v)
+static int crash_memhp_notifier(struct notifier_block *nb, unsigned long val, 
void *arg)
 {
switch (val) {
case MEM_ONLINE:
crash_handle_hotplug_event(KEXEC_CRASH_HP_ADD_MEMORY,
-   KEXEC_CRASH_HP_INVALID_CPU);
+   KEXEC_CRASH_HP_INVALID_CPU, arg);
break;
 
case MEM_O

[PATCH v15 0/5] powerpc/crash: Kernel handling of CPU and memory hotplug

2024-01-11 Thread Sourabh Jain
dded a new config to configure this feature
  - pass hotplug action type to arch specific handler

v6
  - Added crash memory hotplug support

v5:
  - Replace COFNIG_CRASH_HOTPLUG with CONFIG_HOTPLUG_CPU.
  - Move fdt segment identification for kexec_load case to load path
instead of crash hotplug handler
  - Keep new attribute defined under kimage_arch to track FDT segment
under CONFIG_HOTPLUG_CPU config.

v4:
  - Update the logic to find the additional space needed for hotadd CPUs
post kexec load. Refer "[RFC v4 PATCH 4/5] powerpc/crash hp: add crash
hotplug support for kexec_file_load" patch to know more about the
change.
  - Fix a couple of typo.
  - Replace pr_err to pr_info_once to warn user about memory hotplug
support.
  - In crash hotplug handle exit the for loop if FDT segment is found.

v3
  - Move fdt_index and fdt_index_vaild variables to kimage_arch struct.
  - Rebase patche on top of

https://lore.kernel.org/lkml/20220303162725.49640-1-eric.devol...@oracle.com/
  - Fixed warning reported by checpatch script

v2:
  - Use generic hotplug handler introduced by

https://lore.kernel.org/lkml/20220209195706.51522-1-eric.devol...@oracle.com/
a significant change from v1.

Cc: Akhil Raj 
Cc: Andrew Morton 
Cc: Aneesh Kumar K.V 
Cc: Baoquan He 
Cc: Borislav Petkov (AMD) 
Cc: Boris Ostrovsky 
Cc: Christophe Leroy 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Hildenbrand 
Cc: Greg Kroah-Hartman 
Cc: Hari Bathini 
Cc: Laurent Dufour 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Mimi Zohar 
Cc: Naveen N Rao 
Cc: Oscar Salvador 
Cc: Thomas Gleixner 
Cc: Valentin Schneider 
Cc: Vivek Goyal 
Cc: ke...@lists.infradead.org
Cc: x...@kernel.org

Sourabh Jain (5):
  crash: forward memory_notify arg to arch crash hotplug handler
  crash: add a new kexec flag for hotplug support
  powerpc/kexec: turn some static helper functions public
  powerpc: add crash CPU hotplug support
  powerpc: add crash memory hotplug support

 arch/powerpc/Kconfig|   4 +
 arch/powerpc/include/asm/kexec.h|  15 ++
 arch/powerpc/include/asm/kexec_ranges.h |   1 +
 arch/powerpc/kexec/Makefile |   4 +-
 arch/powerpc/kexec/core_64.c| 334 
 arch/powerpc/kexec/elf_64.c |  12 +-
 arch/powerpc/kexec/file_load_64.c   | 211 ---
 arch/powerpc/kexec/ranges.c |  85 ++
 arch/x86/include/asm/kexec.h|   5 +-
 arch/x86/kernel/crash.c |  21 +-
 drivers/base/cpu.c  |   2 +-
 drivers/base/memory.c   |   2 +-
 include/linux/kexec.h   |  27 +-
 include/uapi/linux/kexec.h  |   1 +
 kernel/crash_core.c |  25 +-
 kernel/kexec.c  |   4 +-
 kernel/kexec_file.c |   5 +
 17 files changed, 549 insertions(+), 209 deletions(-)

-- 
2.41.0



[PATCH v15 5/5] powerpc: add crash memory hotplug support

2024-01-11 Thread Sourabh Jain
Extend the arch crash hotplug handler, as introduced by the patch title
("powerpc: add crash CPU hotplug support"), to also support memory
add/remove events.

Elfcorehdr describes the memory of the crash kernel to capture the
kernel; hence, it needs to be updated if memory resources change due to
memory add/remove events. Therefore, arch_crash_handle_hotplug_event()
is updated to recreate the elfcorehdr and replace it with the previous
one on memory add/remove events.

The memblock list is used to prepare the elfcorehdr. In the case of
memory hot removal, the memblock list is updated after the arch crash
hotplug handler is triggered, as depicted in Figure 1. Thus, the
hot-removed memory is explicitly removed from the crash memory ranges
to ensure that the memory ranges added to elfcorehdr do not include the
hot-removed memory.

Memory remove
  |
  v
Offline pages
  |
  v
 Initiate memory notify call <> crash hotplug handler
 chain for MEM_OFFLINE event
  |
  v
 Update memblock list

Figure 1

There are two system calls, `kexec_file_load` and `kexec_load`, used to
load the kdump image. A few changes have been made to ensure that the
kernel can safely update the elfcorehdr component of the kdump image for
both system calls.

For the kexec_file_load syscall, kdump image is prepared in the kernel.
To support an increasing number of memory regions, the elfcorehdr is
built with extra buffer space to ensure that it can accommodate
additional memory ranges in future.

For the kexec_load syscall, the elfcorehdr is updated only if the
KEXEC_CRASH_HOTPLUG_SUPPORT kexec flag is passed to the kernel by the
kexec tool. Passing this flag to the kernel indicates that the
elfcorehdr is built to accommodate additional memory ranges and the
elfcorehdr segment is not considered for SHA calculation, making it safe
to update.

The changes related to this feature are kept under the CRASH_HOTPLUG
config, and it is enabled by default.

Signed-off-by: Sourabh Jain 
Cc: Akhil Raj 
Cc: Andrew Morton 
Cc: Aneesh Kumar K.V 
Cc: Baoquan He 
Cc: Borislav Petkov (AMD) 
Cc: Boris Ostrovsky 
Cc: Christophe Leroy 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Hildenbrand 
Cc: Greg Kroah-Hartman 
Cc: Hari Bathini 
Cc: Laurent Dufour 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Mimi Zohar 
Cc: Naveen N Rao 
Cc: Oscar Salvador 
Cc: Thomas Gleixner 
Cc: Valentin Schneider 
Cc: Vivek Goyal 
Cc: ke...@lists.infradead.org
Cc: x...@kernel.org
---
 arch/powerpc/include/asm/kexec.h|   5 +-
 arch/powerpc/include/asm/kexec_ranges.h |   1 +
 arch/powerpc/kexec/core_64.c| 107 +++-
 arch/powerpc/kexec/file_load_64.c   |  34 +++-
 arch/powerpc/kexec/ranges.c |  85 +++
 5 files changed, 225 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
index 943e58eb9bff..25ff5b7f1a28 100644
--- a/arch/powerpc/include/asm/kexec.h
+++ b/arch/powerpc/include/asm/kexec.h
@@ -116,8 +116,11 @@ int get_crash_memory_ranges(struct crash_mem **mem_ranges);
 #ifdef CONFIG_CRASH_HOTPLUG
 void arch_crash_handle_hotplug_event(struct kimage *image, void *arg);
 #define arch_crash_handle_hotplug_event arch_crash_handle_hotplug_event
-#endif /*CONFIG_CRASH_HOTPLUG */
 
+unsigned int arch_crash_get_elfcorehdr_size(void);
+#define crash_get_elfcorehdr_size arch_crash_get_elfcorehdr_size
+
+#endif /*CONFIG_CRASH_HOTPLUG */
 #endif /* CONFIG_PPC64 */
 
 #ifdef CONFIG_KEXEC_FILE
diff --git a/arch/powerpc/include/asm/kexec_ranges.h 
b/arch/powerpc/include/asm/kexec_ranges.h
index f83866a19e87..802abf580cf0 100644
--- a/arch/powerpc/include/asm/kexec_ranges.h
+++ b/arch/powerpc/include/asm/kexec_ranges.h
@@ -7,6 +7,7 @@
 void sort_memory_ranges(struct crash_mem *mrngs, bool merge);
 struct crash_mem *realloc_mem_ranges(struct crash_mem **mem_ranges);
 int add_mem_range(struct crash_mem **mem_ranges, u64 base, u64 size);
+int remove_mem_range(struct crash_mem **mem_ranges, u64 base, u64 size);
 int add_tce_mem_ranges(struct crash_mem **mem_ranges);
 int add_initrd_mem_range(struct crash_mem **mem_ranges);
 #ifdef CONFIG_PPC_64S_HASH_MMU
diff --git a/arch/powerpc/kexec/core_64.c b/arch/powerpc/kexec/core_64.c
index 43fcd78c2102..4673f150f973 100644
--- a/arch/powerpc/kexec/core_64.c
+++ b/arch/powerpc/kexec/core_64.c
@@ -19,8 +19,11 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
+#include 
+#include 
 #include 
 #include 
 #include 
@@ -546,6 +549,101 @@ int update_cpus_node(void *fdt)
 #undef pr_fmt
 #define pr_fmt(fmt) "crash hp: " fmt
 
+/*
+ * Advertise preferred elfcorehdr size to userspace via
+ * /sys/kernel/crash_elfcorehdr_size sysfs interface.
+ */
+unsigned int arch_crash_get_elfcorehdr_size(void)
+{
+   unsigned int sz;
+   unsigned long elf_phdr_cnt;
+
+   /* Program header for CPU notes and vmcoreinfo */
+   elf_phdr_cnt =

[PATCH v15 3/5] powerpc/kexec: turn some static helper functions public

2024-01-11 Thread Sourabh Jain
Move the functions update_cpus_node and get_crash_memory_ranges from
kexec/file_load_64.c to kexec/core_64.c to make these functions usable
by other kexec components.

get_crash_memory_ranges uses functions defined in ranges.c, so take
ranges.c out of CONFIG_KEXEC_FILE.

Later in the series, these functions are utilized for in-kernel updates
to kdump image during CPU/Memory hotplug or online/offline events for
both kexec_load and kexec_file_load syscalls.

There is no intended functional change.

Signed-off-by: Sourabh Jain 
Reviewed-by: Laurent Dufour 
Cc: Akhil Raj 
Cc: Andrew Morton 
Cc: Aneesh Kumar K.V 
Cc: Baoquan He 
Cc: Borislav Petkov (AMD) 
Cc: Boris Ostrovsky 
Cc: Christophe Leroy 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Hildenbrand 
Cc: Greg Kroah-Hartman 
Cc: Hari Bathini 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Mimi Zohar 
Cc: Naveen N Rao 
Cc: Oscar Salvador 
Cc: Thomas Gleixner 
Cc: Valentin Schneider 
Cc: Vivek Goyal 
Cc: ke...@lists.infradead.org
Cc: x...@kernel.org
---
 arch/powerpc/include/asm/kexec.h  |   6 ++
 arch/powerpc/kexec/Makefile   |   4 +-
 arch/powerpc/kexec/core_64.c  | 166 ++
 arch/powerpc/kexec/file_load_64.c | 162 -
 4 files changed, 174 insertions(+), 164 deletions(-)

diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
index e1b43aa12175..562e1bb689da 100644
--- a/arch/powerpc/include/asm/kexec.h
+++ b/arch/powerpc/include/asm/kexec.h
@@ -108,6 +108,12 @@ void crash_free_reserved_phys_range(unsigned long begin, 
unsigned long end);
 #endif /* CONFIG_PPC_RTAS */
 #endif /* CONFIG_CRASH_DUMP */
 
+#ifdef CONFIG_PPC64
+struct crash_mem;
+int update_cpus_node(void *fdt);
+int get_crash_memory_ranges(struct crash_mem **mem_ranges);
+#endif /* CONFIG_PPC64 */
+
 #ifdef CONFIG_KEXEC_FILE
 extern const struct kexec_file_ops kexec_elf64_ops;
 
diff --git a/arch/powerpc/kexec/Makefile b/arch/powerpc/kexec/Makefile
index 0c2abe7f9908..f2ed5b85b912 100644
--- a/arch/powerpc/kexec/Makefile
+++ b/arch/powerpc/kexec/Makefile
@@ -3,11 +3,11 @@
 # Makefile for the linux kernel.
 #
 
-obj-y  += core.o crash.o core_$(BITS).o
+obj-y  += core.o crash.o ranges.o core_$(BITS).o
 
 obj-$(CONFIG_PPC32)+= relocate_32.o
 
-obj-$(CONFIG_KEXEC_FILE)   += file_load.o ranges.o file_load_$(BITS).o 
elf_$(BITS).o
+obj-$(CONFIG_KEXEC_FILE)   += file_load.o file_load_$(BITS).o elf_$(BITS).o
 
 # Disable GCOV, KCOV & sanitizers in odd or sensitive code
 GCOV_PROFILE_core_$(BITS).o := n
diff --git a/arch/powerpc/kexec/core_64.c b/arch/powerpc/kexec/core_64.c
index 762e4d09aacf..48beaadcfb70 100644
--- a/arch/powerpc/kexec/core_64.c
+++ b/arch/powerpc/kexec/core_64.c
@@ -17,6 +17,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 #include 
 #include 
@@ -30,6 +32,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 int machine_kexec_prepare(struct kimage *image)
 {
@@ -376,6 +380,168 @@ void default_machine_kexec(struct kimage *image)
/* NOTREACHED */
 }
 
+/**
+ * get_crash_memory_ranges - Get crash memory ranges. This list includes
+ *   first/crashing kernel's memory regions that
+ *   would be exported via an elfcore.
+ * @mem_ranges:  Range list to add the memory ranges to.
+ *
+ * Returns 0 on success, negative errno on error.
+ */
+int get_crash_memory_ranges(struct crash_mem **mem_ranges)
+{
+   phys_addr_t base, end;
+   struct crash_mem *tmem;
+   u64 i;
+   int ret;
+
+   for_each_mem_range(i, , ) {
+   u64 size = end - base;
+
+   /* Skip backup memory region, which needs a separate entry */
+   if (base == BACKUP_SRC_START) {
+   if (size > BACKUP_SRC_SIZE) {
+   base = BACKUP_SRC_END + 1;
+   size -= BACKUP_SRC_SIZE;
+   } else
+   continue;
+   }
+
+   ret = add_mem_range(mem_ranges, base, size);
+   if (ret)
+   goto out;
+
+   /* Try merging adjacent ranges before reallocation attempt */
+   if ((*mem_ranges)->nr_ranges == (*mem_ranges)->max_nr_ranges)
+   sort_memory_ranges(*mem_ranges, true);
+   }
+
+   /* Reallocate memory ranges if there is no space to split ranges */
+   tmem = *mem_ranges;
+   if (tmem && (tmem->nr_ranges == tmem->max_nr_ranges)) {
+   tmem = realloc_mem_ranges(mem_ranges);
+   if (!tmem)
+   goto out;
+   }
+
+   /* Exclude crashkernel region */
+   ret = crash_exclude_mem_range(tmem, crashk_res.start, crashk_res.end);
+   if (ret)
+   goto out;
+
+   /*
+* FIXME: For now, stay in parity with k

[PATCH v15 4/5] powerpc: add crash CPU hotplug support

2024-01-11 Thread Sourabh Jain
Due to CPU/Memory hotplug or online/offline events, the elfcorehdr
(which describes the CPUs and memory of the crashed kernel) and FDT
(Flattened Device Tree) of kdump image becomes outdated. Consequently,
attempting dump collection with an outdated elfcorehdr or FDT can lead
to failed or inaccurate dump collection.

Going forward, CPU hotplug or online/offline events are referred as
CPU/Memory add/remove events.

The current solution to address the above issue involves monitoring the
CPU/Memory add/remove events in userspace using udev rules and whenever
there are changes in CPU and memory resources, the entire kdump image
is loaded again. The kdump image includes kernel, initrd, elfcorehdr,
FDT, purgatory. Given that only elfcorehdr and FDT get outdated due to
CPU/Memory add/remove events, reloading the entire kdump image is
inefficient. More importantly, kdump remains inactive for a substantial
amount of time until the kdump reload completes.

To address the aforementioned issue, commit 247262756121 ("crash: add
generic infrastructure for crash hotplug support") added a generic
infrastructure that allows architectures to selectively update the kdump
image component during CPU or memory add/remove events within the kernel
itself.

In the event of a CPU or memory add/remove events, the generic crash
hotplug event handler, `crash_handle_hotplug_event()`, is triggered. It
then acquires the necessary locks to update the kdump image and invokes
the architecture-specific crash hotplug handler,
`arch_crash_handle_hotplug_event()`, to update the required kdump image
components.

This patch adds crash hotplug handler for PowerPC and enable support to
update the kdump image on CPU add/remove events. Support for memory
add/remove events is added in a subsequent patch with the title
"powerpc: add crash memory hotplug support"

As mentioned earlier, only the elfcorehdr and FDT kdump image components
need to be updated in the event of CPU or memory add/remove events.
However, on PowerPC architecture crash hotplug handler only updates the
FDT to enable crash hotplug support for CPU add/remove events. Here's
why.

The elfcorehdr on PowerPC is built with possible CPUs, and thus, it does
not need an update on CPU add/remove events. On the other hand, the FDT
needs to be updated on CPU add events to include the newly added CPU. If
the FDT is not updated and the kernel crashes on a newly added CPU, the
kdump kernel will fail to boot due to the unavailability of the crashing
CPU in the FDT. During the early boot, it is expected that the boot CPU
must be a part of the FDT; otherwise, the kernel will raise a BUG and
fail to boot. For more information, refer to commit 36ae37e3436b0
("powerpc: Make boot_cpuid common between 32 and 64-bit"). Since it is
okay to have an offline CPU in the kdump FDT, no action is taken in case
of CPU removal.

There are two system calls, `kexec_file_load` and `kexec_load`, used to
load the kdump image. Few changes have been made to ensure kernel can
safely update the FDT of kdump image loaded using both system calls.

For kexec_file_load syscall the kdump image is prepared in kernel. So to
support an increasing number of CPUs, the FDT is constructed with extra
buffer space to ensure it can accommodate a possible number of CPU
nodes. Additionally, a call to fdt_pack (which trims the unused space
once the FDT is prepared) is avoided if this feature is enabled.

For the kexec_load syscall, the FDT is updated only if the
KEXEC_CRASH_HOTPLUG_SUPPORT kexec flag is passed to the kernel by
userspace (kexec tools). When userspace passes this flag to the kernel,
it indicates that the FDT is built to accommodate possible CPUs, and the
FDT segment is excluded from SHA calculation, making it safe to update.

The changes related to this feature are kept under the CRASH_HOTPLUG
config, and it is enabled by default.

Signed-off-by: Sourabh Jain 
Cc: Akhil Raj 
Cc: Andrew Morton 
Cc: Aneesh Kumar K.V 
Cc: Baoquan He 
Cc: Borislav Petkov (AMD) 
Cc: Boris Ostrovsky 
Cc: Christophe Leroy 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Hildenbrand 
Cc: Greg Kroah-Hartman 
Cc: Hari Bathini 
Cc: Laurent Dufour 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Mimi Zohar 
Cc: Naveen N Rao 
Cc: Oscar Salvador 
Cc: Thomas Gleixner 
Cc: Valentin Schneider 
Cc: Vivek Goyal 
Cc: ke...@lists.infradead.org
Cc: x...@kernel.org
---
 arch/powerpc/Kconfig  |  4 ++
 arch/powerpc/include/asm/kexec.h  |  6 +++
 arch/powerpc/kexec/core_64.c  | 69 +++
 arch/powerpc/kexec/elf_64.c   | 12 +-
 arch/powerpc/kexec/file_load_64.c | 15 +++
 5 files changed, 105 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 414b978b8010..91d7bb0b81ee 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -682,6 +682,10 @@ config RELOCATABLE_TEST
 config ARCH_SUPPORTS_CRASH_DUMP
def_bool PPC64 || PPC_BOOK3S_32 || PPC_8

[PATCH v15 2/5] crash: add a new kexec flag for hotplug support

2024-01-11 Thread Sourabh Jain
Commit a72bbec70da2 ("crash: hotplug support for kexec_load()")
introduced a new kexec flag, `KEXEC_UPDATE_ELFCOREHDR`. Kexec tool uses
this flag to indicate to the kernel that it is safe to modify the
elfcorehdr of the kdump image loaded using the kexec_load system call.

However, it is possible that architectures may need to update kexec
segments other then elfcorehdr. For example, FDT (Flatten Device Tree)
on PowerPC. Introducing a new kexec flag for every new kexec segment
may not be a good solution. Hence, a generic kexec flag bit,
`KEXEC_CRASH_HOTPLUG_SUPPORT`, is introduced to share the CPU/Memory
hotplug support intent between the kexec tool and the kernel for the
kexec_load system call.

Now, if the kexec tool sends KEXEC_CRASH_HOTPLUG_SUPPORT kexec flag to
the kernel, it indicates to the kernel that all the required kexec
segment is skipped from SHA calculation and it is safe to update kdump
image loaded using the kexec_load syscall.

While loading the kdump image using the kexec_load syscall, the
@update_elfcorehdr member of struct kimage is set if the kexec tool
sends the KEXEC_UPDATE_ELFCOREHDR kexec flag. This member is later used
to determine whether it is safe to update elfcorehdr on hotplug events.
However, with the introduction of the KEXEC_CRASH_HOTPLUG_SUPPORT kexec
flag, the kexec tool could mark all the required kexec segments on an
architecture as safe to update. So rename the @update_elfcorehdr to
@hotplug_support. If @hotplug_support is set, the kernel can safely
update all the required kexec segments of the kdump image during
CPU/Memory hotplug events.

Introduce an architecture-specific function to process kexec flags for
determining hotplug support. Set the @hotplug_support member of struct
kimage for both kexec_load and kexec_file_load system calls. This
simplifies kernel checks to identify hotplug support for the currently
loaded kdump image by just examining the value of @hotplug_support.

Signed-off-by: Sourabh Jain 
Cc: Akhil Raj 
Cc: Andrew Morton 
Cc: Aneesh Kumar K.V 
Cc: Baoquan He 
Cc: Borislav Petkov (AMD) 
Cc: Boris Ostrovsky 
Cc: Christophe Leroy 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Hildenbrand 
Cc: Eric DeVolder 
Cc: Greg Kroah-Hartman 
Cc: Hari Bathini 
Cc: Laurent Dufour 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Mimi Zohar 
Cc: Naveen N Rao 
Cc: Oscar Salvador 
Cc: Thomas Gleixner 
Cc: Valentin Schneider 
Cc: Vivek Goyal 
Cc: ke...@lists.infradead.org
Cc: x...@kernel.org
---
 arch/x86/include/asm/kexec.h |  3 +++
 arch/x86/kernel/crash.c  | 18 +++---
 drivers/base/cpu.c   |  2 +-
 drivers/base/memory.c|  2 +-
 include/linux/kexec.h| 25 +++--
 include/uapi/linux/kexec.h   |  1 +
 kernel/crash_core.c  | 11 ---
 kernel/kexec.c   |  4 ++--
 kernel/kexec_file.c  |  5 +
 9 files changed, 39 insertions(+), 32 deletions(-)

diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
index 9bb6607e864e..e791129fdf6c 100644
--- a/arch/x86/include/asm/kexec.h
+++ b/arch/x86/include/asm/kexec.h
@@ -211,6 +211,9 @@ extern void kdump_nmi_shootdown_cpus(void);
 void arch_crash_handle_hotplug_event(struct kimage *image, void *arg);
 #define arch_crash_handle_hotplug_event arch_crash_handle_hotplug_event
 
+int arch_crash_hotplug_support(struct kimage *image, unsigned long 
kexec_flags);
+#define arch_crash_hotplug_support arch_crash_hotplug_support
+
 #ifdef CONFIG_HOTPLUG_CPU
 int arch_crash_hotplug_cpu_support(void);
 #define crash_hotplug_cpu_support arch_crash_hotplug_cpu_support
diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index 44744e9c68ec..293b54bff706 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -398,20 +398,16 @@ int crash_load_segments(struct kimage *image)
 #undef pr_fmt
 #define pr_fmt(fmt) "crash hp: " fmt
 
-/* These functions provide the value for the sysfs crash_hotplug nodes */
-#ifdef CONFIG_HOTPLUG_CPU
-int arch_crash_hotplug_cpu_support(void)
+int arch_crash_hotplug_support(struct kimage *image, unsigned long kexec_flags)
 {
-   return crash_check_update_elfcorehdr();
-}
-#endif
 
-#ifdef CONFIG_MEMORY_HOTPLUG
-int arch_crash_hotplug_memory_support(void)
-{
-   return crash_check_update_elfcorehdr();
-}
+#ifdef CONFIG_KEXEC_FILE
+   if (image->file_mode)
+   return 1;
 #endif
+   return (kexec_flags & KEXEC_UPDATE_ELFCOREHDR ||
+   kexec_flags & KEXEC_CRASH_HOTPLUG_SUPPORT);
+}
 
 unsigned int arch_crash_get_elfcorehdr_size(void)
 {
diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
index 548491de818e..2f411ddfbd8b 100644
--- a/drivers/base/cpu.c
+++ b/drivers/base/cpu.c
@@ -306,7 +306,7 @@ static ssize_t crash_hotplug_show(struct device *dev,
 struct device_attribute *attr,
 char *buf)
 {
-   return sysfs_emit(buf, "%d\n", cra

Re: [PATCH v14 3/6] crash: add a new kexec flag for FDT update

2023-12-20 Thread Sourabh Jain

Hello Baoquan,

While replying to this email earlier, I mistakenly pressed "Reply to List"
instead of "Reply to All." Consequently, my response was sent only to 
powerpc

mailing list.

On 17/12/23 06:29, Baoquan He wrote:

On 12/17/23 at 12:27am, Sourabh Jain wrote:

On 16/12/23 15:11, Baoquan He wrote:

On 12/15/23 at 12:17pm, Sourabh Jain wrote:
..

diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index 0f6ea35879ee..bcedb7625b1f 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -319,6 +319,7 @@ struct kimage {
#ifdef CONFIG_CRASH_HOTPLUG
/* If set, allow changes to elfcorehdr of kexec_load'd image */
unsigned int update_elfcorehdr:1;
+   unsigned int update_fdt:1;

Can we unify this to one flag, e.g hotplug_update?

With this, on x86_64, we will skip the sha calculation for elfcorehdr.
On ppc, we will skip the sha calculation for elfcorehdr and fdt.

Yeah, that's what I suggested to Eric. I can do that, but I see one
problem with powerpc or other platforms that need to skip SHA
for more kexec segments in addition to elfcorehdr.

`update_elfcorehdr` is set when the kexec tool sends the
`KEXEC_UPDATE_ELFCOREHDR`
flag to the kernel for the `kexec_load` system call.

Given that the kexec tool has already been updated to send the
`KEXEC_UPDATE_ELFCOREHDR` flag only when elfcorehdr is skipped from
SHA verification in generic code, now it would be tricky for architectures
to
determine whether kexec has skipped SHA verification for just elfcorehdr
or all segments needed on the platform with the same flag.

In kexec-tools, it's judged by do_hotplug to skip the elfcorehdr
segment. I am wondering how you skip the fdt segment when calculating
and verifying sha, only saw the update_fdt mark.

In the kexec tool where we loop through all the kexec segments to calculate
the SHA, there will be a arch call made to determine whether the segment
needs
to be excluded from SHA or not.

OK, a arch call will be added to exclude segments in the ARCH. And the
elfcorehdr segment need be excluded in x86 ARCH in case other ARCH later
may not want to exclude elfcorehdr.


Yes, Arch can choose which segment to exclude.



Now in the arch function if decide a specific segment needs to excluded then
corresponding flag is also set by arch function to communicate same with the
kernel.

But I don't see how you exclude elfcorehdr and fdt in kernel for
kexec_file codes. It's not happening in kexec-tools.


On PowerPC, SHA verification is NOT performed for the kexec_file_load 
case; hence, you
won't find any code changes in my patch series to exclude FDT in the 
kernel code.


However, let's consider a scenario where it gets added in the future, or 
other architectures
need to skip the kexec segment, in addition to elfcorehdr. In that case, 
we can use the
same setup as you suggested below. For each kexec segment, there should 
be an
architecture-specific function call to decide whether the segment needs 
to be excluded or not.



About the existing KEXEC_UPDATE_ELFCOREHDR, we only rename the macro,
but still use the same value, could you think of what problem could be
caused between kernel and kexec-tools utility, the old and new version
compatibility?

Just changing the macro name will NOT help because the current kexec tool
enables the KEXEC_UPDATE_ELFCOREHDR = 0x0004 kexec flag bit
if
the command argument --hotplug is passed to the kexec
and
the /sys/kernel/crash_elfcorehdr_size file exists in the system.

As we have discussed, excluding will be done in each ARCH's function
when doing sha calculation in kexec-tools, isn't it?

diff --git a/kexec/kexec.c b/kexec/kexec.c
index b5393e3b20aa..0095aeec988a 100644
--- a/kexec/kexec.c
+++ b/kexec/kexec.c
@@ -701,10 +701,10 @@ static void update_purgatory(struct kexec_info *info)
continue;
}
  
-		/* Don't include elfcorehdr in the checksum, if hotplug

+   /* Don't include unwanted segments in the checksum, if hotplug
 * support enabled.
-*/
-   if (do_hotplug && (info->segment[i].mem == (void 
*)info->elfcorehdr)) {
+   if (do_hotplug)
+   arch_exclude_segments(info, )
continue;
}


Yes, something like the above should work.

Now, let's say an architecture enables this feature in the kernel with the
assumption
that the 0x0004 kexec flag bit is passed from the kexec tool when all
the required
kexec segments are skipped from SHA calculation. In this case, the current
kexec tool,
which passes the 0x0004 kexec flag bit only when the elfcorehdr is
skipped, will
cause issues for architectures.


If it's about the new header files installed on older kernel, we can
change it like below? Fortunately only one release, 6.6 passed.

diff --git a/include/uapi/linux/kexec.h b/include/uapi/linux/kexec.h
index 3d5b3d757bed..df6a6505e267 10064

Re: [RFC PATCH 1/3] powerpc/pseries/fadump: add support for multiple boot memory regions

2023-12-18 Thread Sourabh Jain

Hello Aditya,

On 17/12/23 14:11, Aditya Gupta wrote:

Hi sourabh,

On 06/12/23 01:48, Hari Bathini wrote:

From: Sourabh Jain 

Currently, fadump on pseries assumes a single boot memory region even
though f/w supports more than one boot memory region. Add support for
more boot memory regions to make the implementation flexible for any
enhancements that introduce other region types. For this, rtas memory
structure for fadump is updated to have multiple boot memory regions
instead of just one. Additionally, methods responsible for creating
the fadump memory structure during both the first and second kernel
boot have been modified to take these multiple boot memory regions
into account. Also, a new callback has been added to the fadump_ops
structure to get the maximum boot memory regions supported by the
platform.

Signed-off-by: Sourabh Jain 
Signed-off-by: Hari Bathini 
---
  arch/powerpc/include/asm/fadump-internal.h   |   2 +-
  arch/powerpc/kernel/fadump.c |  27 +-
  arch/powerpc/platforms/powernv/opal-fadump.c |   8 +
  arch/powerpc/platforms/pseries/rtas-fadump.c | 258 ---
  arch/powerpc/platforms/pseries/rtas-fadump.h |  26 +-
  5 files changed, 199 insertions(+), 122 deletions(-)

diff --git a/arch/powerpc/include/asm/fadump-internal.h 
b/arch/powerpc/include/asm/fadump-internal.h

index 27f9e11eda28..b3956c400519 100644
--- a/arch/powerpc/include/asm/fadump-internal.h
+++ b/arch/powerpc/include/asm/fadump-internal.h
@@ -129,6 +129,7 @@ struct fadump_ops {
    struct seq_file *m);
  void    (*fadump_trigger)(struct fadump_crash_info_header *fdh,
    const char *msg);
+    int    (*fadump_max_boot_mem_rgns)(void);
  };
    /* Helper functions */
@@ -136,7 +137,6 @@ s32 __init fadump_setup_cpu_notes_buf(u32 num_cpus);
  void fadump_free_cpu_notes_buf(void);
  u32 *__init fadump_regs_to_elf_notes(u32 *buf, struct pt_regs *regs);
  void __init fadump_update_elfcore_header(char *bufp);
-bool is_fadump_boot_mem_contiguous(void);
  bool is_fadump_reserved_mem_contiguous(void);
    #else /* !CONFIG_PRESERVE_FA_DUMP */
diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
index d14eda1e8589..757681658dda 100644
--- a/arch/powerpc/kernel/fadump.c
+++ b/arch/powerpc/kernel/fadump.c
@@ -222,28 +222,6 @@ static bool is_fadump_mem_area_contiguous(u64 
d_start, u64 d_end)

  return ret;
  }
  -/*
- * Returns true, if there are no holes in boot memory area,
- * false otherwise.
- */
-bool is_fadump_boot_mem_contiguous(void)
-{
-    unsigned long d_start, d_end;
-    bool ret = false;
-    int i;
-
-    for (i = 0; i < fw_dump.boot_mem_regs_cnt; i++) {
-    d_start = fw_dump.boot_mem_addr[i];
-    d_end   = d_start + fw_dump.boot_mem_sz[i];
-
-    ret = is_fadump_mem_area_contiguous(d_start, d_end);
-    if (!ret)
-    break;
-    }
-
-    return ret;
-}
-
  /*
   * Returns true, if there are no holes in reserved memory area,
   * false otherwise.
@@ -389,10 +367,11 @@ static unsigned long __init 
get_fadump_area_size(void)

  static int __init add_boot_mem_region(unsigned long rstart,
    unsigned long rsize)
  {
+    int max_boot_mem_rgns = fw_dump.ops->fadump_max_boot_mem_rgns();
  int i = fw_dump.boot_mem_regs_cnt++;
  -    if (fw_dump.boot_mem_regs_cnt > FADUMP_MAX_MEM_REGS) {
-    fw_dump.boot_mem_regs_cnt = FADUMP_MAX_MEM_REGS;
+    if (fw_dump.boot_mem_regs_cnt > max_boot_mem_rgns) {
+    fw_dump.boot_mem_regs_cnt = max_boot_mem_rgns;
  return 0;
  }
  diff --git a/arch/powerpc/platforms/powernv/opal-fadump.c 
b/arch/powerpc/platforms/powernv/opal-fadump.c

index 964f464b1b0e..fa26c21a08d9 100644
--- a/arch/powerpc/platforms/powernv/opal-fadump.c
+++ b/arch/powerpc/platforms/powernv/opal-fadump.c
@@ -615,6 +615,13 @@ static void opal_fadump_trigger(struct 
fadump_crash_info_header *fdh,

  pr_emerg("No backend support for MPIPL!\n");
  }
  +/* FADUMP_MAX_MEM_REGS or lower */
+static int opal_fadump_max_boot_mem_rgns(void)
+{
+    return FADUMP_MAX_MEM_REGS;
+
+}
+
  static struct fadump_ops opal_fadump_ops = {
  .fadump_init_mem_struct    = opal_fadump_init_mem_struct,
  .fadump_get_metadata_size    = opal_fadump_get_metadata_size,
@@ -627,6 +634,7 @@ static struct fadump_ops opal_fadump_ops = {
  .fadump_process    = opal_fadump_process,
  .fadump_region_show    = opal_fadump_region_show,
  .fadump_trigger    = opal_fadump_trigger,
+    .fadump_max_boot_mem_rgns    = opal_fadump_max_boot_mem_rgns,
  };
    void __init opal_fadump_dt_scan(struct fw_dump *fadump_conf, u64 
node)
diff --git a/arch/powerpc/platforms/pseries/rtas-fadump.c 
b/arch/powerpc/platforms/pseries/rtas-fadump.c

index b5853e9fcc3c..1b05b4cefdfd 100644
--- a/arch/powerpc/platforms/pseries/rtas-fadump.c
+++ b/arch/powerpc/platforms/pseries/rtas-fadump.c
@@ -29,9 +29,6 @@ static const struct rtas_fadump_m

Re: [PATCH v14 3/6] crash: add a new kexec flag for FDT update

2023-12-17 Thread Sourabh Jain




On 17/12/23 06:29, Baoquan He wrote:

On 12/17/23 at 12:27am, Sourabh Jain wrote:


On 16/12/23 15:11, Baoquan He wrote:

On 12/15/23 at 12:17pm, Sourabh Jain wrote:
..

diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index 0f6ea35879ee..bcedb7625b1f 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -319,6 +319,7 @@ struct kimage {
#ifdef CONFIG_CRASH_HOTPLUG
/* If set, allow changes to elfcorehdr of kexec_load'd image */
unsigned int update_elfcorehdr:1;
+   unsigned int update_fdt:1;

Can we unify this to one flag, e.g hotplug_update?

With this, on x86_64, we will skip the sha calculation for elfcorehdr.
On ppc, we will skip the sha calculation for elfcorehdr and fdt.

Yeah, that's what I suggested to Eric. I can do that, but I see one
problem with powerpc or other platforms that need to skip SHA
for more kexec segments in addition to elfcorehdr.

`update_elfcorehdr` is set when the kexec tool sends the
`KEXEC_UPDATE_ELFCOREHDR`
flag to the kernel for the `kexec_load` system call.

Given that the kexec tool has already been updated to send the
`KEXEC_UPDATE_ELFCOREHDR` flag only when elfcorehdr is skipped from
SHA verification in generic code, now it would be tricky for architectures
to
determine whether kexec has skipped SHA verification for just elfcorehdr
or all segments needed on the platform with the same flag.

In kexec-tools, it's judged by do_hotplug to skip the elfcorehdr
segment. I am wondering how you skip the fdt segment when calculating
and verifying sha, only saw the update_fdt mark.

In the kexec tool where we loop through all the kexec segments to calculate
the SHA, there will be a arch call made to determine whether the segment
needs
to be excluded from SHA or not.

OK, a arch call will be added to exclude segments in the ARCH. And the
elfcorehdr segment need be excluded in x86 ARCH in case other ARCH later
may not want to exclude elfcorehdr.


Yes, Arch can choose which segment to exclude.




Now in the arch function if decide a specific segment needs to excluded then
corresponding flag is also set by arch function to communicate same with the
kernel.

But I don't see how you exclude elfcorehdr and fdt in kernel for
kexec_file codes. It's not happening in kexec-tools.


On PowerPC, SHA verification is NOT performed for the kexec_file_load 
case; hence, you
won't find any code changes in my patch series to exclude FDT in the 
kernel code.


However, let's consider a scenario where it gets added in the future, or 
other architectures
need to skip the kexec segment, in addition to elfcorehdr. In that case, 
we can use the
same setup as you suggested below. For each kexec segment, there should 
be an
architecture-specific function call to decide whether the segment needs 
to be excluded or not.





About the existing KEXEC_UPDATE_ELFCOREHDR, we only rename the macro,
but still use the same value, could you think of what problem could be
caused between kernel and kexec-tools utility, the old and new version
compatibility?

Just changing the macro name will NOT help because the current kexec tool
enables the KEXEC_UPDATE_ELFCOREHDR = 0x0004 kexec flag bit
if
the command argument --hotplug is passed to the kexec
and
the /sys/kernel/crash_elfcorehdr_size file exists in the system.

As we have discussed, excluding will be done in each ARCH's function
when doing sha calculation in kexec-tools, isn't it?

diff --git a/kexec/kexec.c b/kexec/kexec.c
index b5393e3b20aa..0095aeec988a 100644
--- a/kexec/kexec.c
+++ b/kexec/kexec.c
@@ -701,10 +701,10 @@ static void update_purgatory(struct kexec_info *info)
continue;
}
  
-		/* Don't include elfcorehdr in the checksum, if hotplug

+   /* Don't include unwanted segments in the checksum, if hotplug
 * support enabled.
-*/
-   if (do_hotplug && (info->segment[i].mem == (void 
*)info->elfcorehdr)) {
+   if (do_hotplug)
+   arch_exclude_segments(info, )
continue;
}


Yes, something like the above should work.

  


Now, let's say an architecture enables this feature in the kernel with the
assumption
that the 0x0004 kexec flag bit is passed from the kexec tool when all
the required
kexec segments are skipped from SHA calculation. In this case, the current
kexec tool,
which passes the 0x0004 kexec flag bit only when the elfcorehdr is
skipped, will
cause issues for architectures.


If it's about the new header files installed on older kernel, we can
change it like below? Fortunately only one release, 6.6 passed.

diff --git a/include/uapi/linux/kexec.h b/include/uapi/linux/kexec.h
index 3d5b3d757bed..df6a6505e267 100644
--- a/include/uapi/linux/kexec.h
+++ b/include/uapi/linux/kexec.h
@@ -13,7 +13,7 @@
   #define KEXEC_ON_CRASH 0x0001
   #define KEXEC_PRESERVE_CONTEXT 0x0002
-#define KE

Re: [PATCH v14 3/6] crash: add a new kexec flag for FDT update

2023-12-16 Thread Sourabh Jain




On 16/12/23 15:11, Baoquan He wrote:

On 12/15/23 at 12:17pm, Sourabh Jain wrote:
..

diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index 0f6ea35879ee..bcedb7625b1f 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -319,6 +319,7 @@ struct kimage {
   #ifdef CONFIG_CRASH_HOTPLUG
/* If set, allow changes to elfcorehdr of kexec_load'd image */
unsigned int update_elfcorehdr:1;
+   unsigned int update_fdt:1;

Can we unify this to one flag, e.g hotplug_update?

With this, on x86_64, we will skip the sha calculation for elfcorehdr.
On ppc, we will skip the sha calculation for elfcorehdr and fdt.

Yeah, that's what I suggested to Eric. I can do that, but I see one
problem with powerpc or other platforms that need to skip SHA
for more kexec segments in addition to elfcorehdr.

`update_elfcorehdr` is set when the kexec tool sends the
`KEXEC_UPDATE_ELFCOREHDR`
flag to the kernel for the `kexec_load` system call.

Given that the kexec tool has already been updated to send the
`KEXEC_UPDATE_ELFCOREHDR` flag only when elfcorehdr is skipped from
SHA verification in generic code, now it would be tricky for architectures
to
determine whether kexec has skipped SHA verification for just elfcorehdr
or all segments needed on the platform with the same flag.

In kexec-tools, it's judged by do_hotplug to skip the elfcorehdr
segment. I am wondering how you skip the fdt segment when calculating
and verifying sha, only saw the update_fdt mark.


In the kexec tool where we loop through all the kexec segments to calculate
the SHA, there will be a arch call made to determine whether the segment 
needs

to be excluded from SHA or not.

Now in the arch function if decide a specific segment needs to excluded then
corresponding flag is also set by arch function to communicate same with the
kernel.



About the existing KEXEC_UPDATE_ELFCOREHDR, we only rename the macro,
but still use the same value, could you think of what problem could be
caused between kernel and kexec-tools utility, the old and new version
compatibility?


Just changing the macro name will NOT help because the current kexec tool
enables the KEXEC_UPDATE_ELFCOREHDR = 0x0004 kexec flag bit
if
the command argument --hotplug is passed to the kexec
and
the /sys/kernel/crash_elfcorehdr_size file exists in the system.

Now, let's say an architecture enables this feature in the kernel with 
the assumption
that the 0x0004 kexec flag bit is passed from the kexec tool when 
all the required
kexec segments are skipped from SHA calculation. In this case, the 
current kexec tool,
which passes the 0x0004 kexec flag bit only when the elfcorehdr is 
skipped, will

cause issues for architectures.



If it's about the new header files installed on older kernel, we can
change it like below? Fortunately only one release, 6.6 passed.

diff --git a/include/uapi/linux/kexec.h b/include/uapi/linux/kexec.h
index 3d5b3d757bed..df6a6505e267 100644
--- a/include/uapi/linux/kexec.h
+++ b/include/uapi/linux/kexec.h
@@ -13,7 +13,7 @@
  #define KEXEC_ON_CRASH 0x0001
  #define KEXEC_PRESERVE_CONTEXT 0x0002
-#define KEXEC_UPDATE_FDT   0x0008
+#define KEXEC_CRASH_HOTPLUG_UPDATE 0x0004
  #define KEXEC_UPDATE_ELFCOREHDR0x0004
  #define KEXEC_ARCH_MASK0x
  
  /*


With my understanding, the kexec flag should be indicating the action,
the mem/cpu hotplug, but not relating to any detail. Imagine later
another segment need be skipped on one ARCH again, then another flag
need be added, this sounds not reasonable.

I strongly agree with you. The KEXEC_CRASH_HOTPLUG_UPDATE kexec flag
should be sufficient to inform the kernel that the kexec tool has been 
updated

to support CPU/Memory hotplug for the kexec_load system call. Unfortunately,
we cannot use the 0x0004 kexec flags bit for KEXEC_CRASH_HOTPLUG_UPDATE
at the moment.

What about using 0x0008 for the KEXEC_CRASH_HOTPLUG_UPDATE flag?

I am aware that we are utilizing two kexec flag bits (0x0004 and 
0x0008)

for the same feature, but what other options do we have?

Thanks,
Sourabh

Code snippet from the kexec tool:

main() {
     ...
     /* NOTE: Xen KEXEC_LIVE_UPDATE and KEXEC_UPDATE_ELFCOREHDR collide */
     if (do_hotplug) {
         ...

     /* Indicate to the kernel it is ok to modify the elfcorehdr */
     kexec_flags |= KEXEC_UPDATE_ELFCOREHDR;
     }
     ...
}

Any suggestion how to handle this with just one kexec flag?

Thanks for the review.

Thanks,
Sourabh Jain


   #endif
   #ifdef ARCH_HAS_KIMAGE_ARCH
@@ -396,9 +397,10 @@ bool kexec_load_permitted(int kexec_image_type);
   /* List of defined/legal kexec flags */
   #ifndef CONFIG_KEXEC_JUMP
-#define KEXEC_FLAGS(KEXEC_ON_CRASH | KEXEC_UPDATE_ELFCOREHDR)
+#define KEXEC_FLAGS(KEXEC_ON_CRASH | KEXEC_UPDATE_ELFCOREHDR | 
KEXEC_UPDATE_FDT)
   #else
-#define KEXEC_FLAGS(KEXEC_ON_CRASH | KEXEC_PRESERVE_CONTEXT

Re: [RFC PATCH 2/3] powerpc/fadump: pass additional parameters to dump capture kernel

2023-12-15 Thread Sourabh Jain

Hello Hari,

On 06/12/23 01:48, Hari Bathini wrote:

For fadump case, passing additional parameters to dump capture kernel
helps in minimizing the memory footprint for it and also provides the
flexibility to disable components/modules, like hugepages, that are
hindering the boot process of the special dump capture environment.

Set up a dedicated parameter area to be passed to the capture kernel.
This area type is defined as RTAS_FADUMP_PARAM_AREA. Sysfs attribute
'/sys/kernel/fadump/bootargs_append' is exported to the userspace to
specify the additional parameters to be passed to the capture kernel

Signed-off-by: Hari Bathini 
---
  arch/powerpc/include/asm/fadump-internal.h   |  3 +
  arch/powerpc/kernel/fadump.c | 80 
  arch/powerpc/platforms/powernv/opal-fadump.c |  6 +-
  arch/powerpc/platforms/pseries/rtas-fadump.c | 35 -
  arch/powerpc/platforms/pseries/rtas-fadump.h | 11 ++-
  5 files changed, 126 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/include/asm/fadump-internal.h 
b/arch/powerpc/include/asm/fadump-internal.h
index b3956c400519..81629226b15f 100644
--- a/arch/powerpc/include/asm/fadump-internal.h
+++ b/arch/powerpc/include/asm/fadump-internal.h
@@ -97,6 +97,8 @@ struct fw_dump {
unsigned long   cpu_notes_buf_vaddr;
unsigned long   cpu_notes_buf_size;
  
+	unsigned long	param_area;

+
/*
 * Maximum size supported by firmware to copy from source to
 * destination address per entry.
@@ -111,6 +113,7 @@ struct fw_dump {
unsigned long   dump_active:1;
unsigned long   dump_registered:1;
unsigned long   nocma:1;
+   unsigned long   param_area_supported:1;
  
  	struct fadump_ops	*ops;

  };
diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
index 757681658dda..98f089747ac9 100644
--- a/arch/powerpc/kernel/fadump.c
+++ b/arch/powerpc/kernel/fadump.c
@@ -1470,6 +1470,7 @@ static ssize_t mem_reserved_show(struct kobject *kobj,
return sprintf(buf, "%ld\n", fw_dump.reserve_dump_area_size);
  }
  
+

  static ssize_t registered_show(struct kobject *kobj,
   struct kobj_attribute *attr,
   char *buf)
@@ -1477,6 +1478,43 @@ static ssize_t registered_show(struct kobject *kobj,
return sprintf(buf, "%d\n", fw_dump.dump_registered);
  }
  
+static ssize_t bootargs_append_show(struct kobject *kobj,

+  struct kobj_attribute *attr,
+  char *buf)
+{
+   return sprintf(buf, "%s\n", (char *)__va(fw_dump.param_area));
+}
+
+static ssize_t bootargs_append_store(struct kobject *kobj,
+  struct kobj_attribute *attr,
+  const char *buf, size_t count)
+{
+   char *params;
+
+   if (!fw_dump.fadump_enabled || fw_dump.dump_active)
+   return -EPERM;
+
+   if (count >= COMMAND_LINE_SIZE)
+   return -EINVAL;
+
+   /*
+* Fail here instead of handling this scenario with
+* some silly workaround in capture kernel.
+*/
+   if (saved_command_line_len + count >= COMMAND_LINE_SIZE) {
+   pr_err("Appending parameters exceeds cmdline size!\n");
+   return -ENOSPC;
+   }
+
+   params = __va(fw_dump.param_area);
+   strscpy_pad(params, buf, COMMAND_LINE_SIZE);
+   /* Remove newline character at the end. */
+   if (params[count-1] == '\n')
+   params[count-1] = '\0';
+
+   return count;
+}
+
  static ssize_t registered_store(struct kobject *kobj,
struct kobj_attribute *attr,
const char *buf, size_t count)
@@ -1535,6 +1573,7 @@ static struct kobj_attribute release_attr = 
__ATTR_WO(release_mem);
  static struct kobj_attribute enable_attr = __ATTR_RO(enabled);
  static struct kobj_attribute register_attr = __ATTR_RW(registered);
  static struct kobj_attribute mem_reserved_attr = __ATTR_RO(mem_reserved);
+static struct kobj_attribute bootargs_append_attr = __ATTR_RW(bootargs_append);
  
  static struct attribute *fadump_attrs[] = {

_attr.attr,
@@ -1611,6 +1650,46 @@ static void __init fadump_init_files(void)
return;
  }
  
+/*

+ * Reserve memory to store additional parameters to be passed
+ * for fadump/capture kernel.
+ */
+static void fadump_setup_param_area(void)
+{
+   phys_addr_t range_start, range_end;
+
+   if (!fw_dump.param_area_supported || fw_dump.dump_active)
+   return;
+
+   /* This memory can't be used by PFW or bootloader as it is shared 
across kernels */
+   if (radix_enabled()) {
+   /*
+* Anywhere in the upper half should be good enough as all 
memory
+* is accessible in real mode.
+*/
+   range_start = memblock_end_of_DRAM() / 2;
+   range_end = 

Re: [RFC PATCH 1/3] powerpc/pseries/fadump: add support for multiple boot memory regions

2023-12-15 Thread Sourabh Jain

Hello Hari,

On 06/12/23 01:48, Hari Bathini wrote:

From: Sourabh Jain 

Currently, fadump on pseries assumes a single boot memory region even
though f/w supports more than one boot memory region. Add support for
more boot memory regions to make the implementation flexible for any
enhancements that introduce other region types. For this, rtas memory
structure for fadump is updated to have multiple boot memory regions
instead of just one. Additionally, methods responsible for creating
the fadump memory structure during both the first and second kernel
boot have been modified to take these multiple boot memory regions
into account. Also, a new callback has been added to the fadump_ops
structure to get the maximum boot memory regions supported by the
platform.

Signed-off-by: Sourabh Jain 
Signed-off-by: Hari Bathini 
---
  arch/powerpc/include/asm/fadump-internal.h   |   2 +-
  arch/powerpc/kernel/fadump.c |  27 +-
  arch/powerpc/platforms/powernv/opal-fadump.c |   8 +
  arch/powerpc/platforms/pseries/rtas-fadump.c | 258 ---
  arch/powerpc/platforms/pseries/rtas-fadump.h |  26 +-
  5 files changed, 199 insertions(+), 122 deletions(-)

diff --git a/arch/powerpc/include/asm/fadump-internal.h 
b/arch/powerpc/include/asm/fadump-internal.h
index 27f9e11eda28..b3956c400519 100644
--- a/arch/powerpc/include/asm/fadump-internal.h
+++ b/arch/powerpc/include/asm/fadump-internal.h
@@ -129,6 +129,7 @@ struct fadump_ops {
  struct seq_file *m);
void(*fadump_trigger)(struct fadump_crash_info_header *fdh,
  const char *msg);
+   int (*fadump_max_boot_mem_rgns)(void);
  };
  
  /* Helper functions */

@@ -136,7 +137,6 @@ s32 __init fadump_setup_cpu_notes_buf(u32 num_cpus);
  void fadump_free_cpu_notes_buf(void);
  u32 *__init fadump_regs_to_elf_notes(u32 *buf, struct pt_regs *regs);
  void __init fadump_update_elfcore_header(char *bufp);
-bool is_fadump_boot_mem_contiguous(void);
  bool is_fadump_reserved_mem_contiguous(void);
  
  #else /* !CONFIG_PRESERVE_FA_DUMP */

diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
index d14eda1e8589..757681658dda 100644
--- a/arch/powerpc/kernel/fadump.c
+++ b/arch/powerpc/kernel/fadump.c
@@ -222,28 +222,6 @@ static bool is_fadump_mem_area_contiguous(u64 d_start, u64 
d_end)
return ret;
  }
  
-/*

- * Returns true, if there are no holes in boot memory area,
- * false otherwise.
- */
-bool is_fadump_boot_mem_contiguous(void)
-{
-   unsigned long d_start, d_end;
-   bool ret = false;
-   int i;
-
-   for (i = 0; i < fw_dump.boot_mem_regs_cnt; i++) {
-   d_start = fw_dump.boot_mem_addr[i];
-   d_end   = d_start + fw_dump.boot_mem_sz[i];
-
-   ret = is_fadump_mem_area_contiguous(d_start, d_end);
-   if (!ret)
-   break;
-   }
-
-   return ret;
-}
-
  /*
   * Returns true, if there are no holes in reserved memory area,
   * false otherwise.
@@ -389,10 +367,11 @@ static unsigned long __init get_fadump_area_size(void)
  static int __init add_boot_mem_region(unsigned long rstart,
  unsigned long rsize)
  {
+   int max_boot_mem_rgns = fw_dump.ops->fadump_max_boot_mem_rgns();
int i = fw_dump.boot_mem_regs_cnt++;
  
-	if (fw_dump.boot_mem_regs_cnt > FADUMP_MAX_MEM_REGS) {

-   fw_dump.boot_mem_regs_cnt = FADUMP_MAX_MEM_REGS;
+   if (fw_dump.boot_mem_regs_cnt > max_boot_mem_rgns) {
+   fw_dump.boot_mem_regs_cnt = max_boot_mem_rgns;
return 0;
}
  
diff --git a/arch/powerpc/platforms/powernv/opal-fadump.c b/arch/powerpc/platforms/powernv/opal-fadump.c

index 964f464b1b0e..fa26c21a08d9 100644
--- a/arch/powerpc/platforms/powernv/opal-fadump.c
+++ b/arch/powerpc/platforms/powernv/opal-fadump.c
@@ -615,6 +615,13 @@ static void opal_fadump_trigger(struct 
fadump_crash_info_header *fdh,
pr_emerg("No backend support for MPIPL!\n");
  }
  
+/* FADUMP_MAX_MEM_REGS or lower */

+static int opal_fadump_max_boot_mem_rgns(void)
+{
+   return FADUMP_MAX_MEM_REGS;
+


Nitpick: we can get rid of the above blank line.

- Sourabh


+}
+
  static struct fadump_ops opal_fadump_ops = {
.fadump_init_mem_struct = opal_fadump_init_mem_struct,
.fadump_get_metadata_size   = opal_fadump_get_metadata_size,
@@ -627,6 +634,7 @@ static struct fadump_ops opal_fadump_ops = {
.fadump_process = opal_fadump_process,
.fadump_region_show = opal_fadump_region_show,
.fadump_trigger = opal_fadump_trigger,
+   .fadump_max_boot_mem_rgns   = opal_fadump_max_boot_mem_rgns,
  };
  
  void __init opal_fadump_dt_scan(struct fw_dump *fadump_conf, u64 node)

diff --git a/arch/powerpc/platforms/pseries/rtas-fadump.c 
b/arch/powerpc/platforms/pseries/rtas-fadum

Re: [PATCH v14 3/6] crash: add a new kexec flag for FDT update

2023-12-14 Thread Sourabh Jain

Hello Baoquan,

On 15/12/23 07:58, Baoquan He wrote:

On 12/11/23 at 02:00pm, Sourabh Jain wrote:

The commit a72bbec70da2 ("crash: hotplug support for kexec_load()")
introduced a new kexec flag, `KEXEC_UPDATE_ELFCOREHDR`. Kexec tool uses
this flag to indicate kernel that it is safe to modify the elfcorehdr
of kdump image loaded using kexec_load system call.

Similarly, add a new kexec flag, `KEXEC_UPDATE_FDT`, for another kdump
component named FDT (Flatten Device Tree). Architectures like PowerPC
need to update FDT kdump image component on CPU hotplug events. Kexec
tool passing `KEXEC_UPDATE_FDT` will be an indication to kernel that FDT
segment is not part of SHA calculation hence it is safe to update it.

With the `KEXEC_UPDATE_ELFCOREHDR` and `KEXEC_UPDATE_FDT` kexec flags,
crash hotplug support can be added to PowerPC for the kexec_load syscall
while maintaining the backward compatibility with older kexec tools that
do not have these newly introduced flags.

Signed-off-by: Sourabh Jain 
Cc: Akhil Raj 
Cc: Andrew Morton 
Cc: Aneesh Kumar K.V 
Cc: Baoquan He 
Cc: Borislav Petkov (AMD) 
Cc: Boris Ostrovsky 
Cc: Christophe Leroy 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Hildenbrand 
Cc: Eric DeVolder 
Cc: Greg Kroah-Hartman 
Cc: Hari Bathini 
Cc: Laurent Dufour 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Mimi Zohar 
Cc: Naveen N Rao 
Cc: Oscar Salvador 
Cc: Thomas Gleixner 
Cc: Valentin Schneider 
Cc: Vivek Goyal 
Cc: ke...@lists.infradead.org
Cc: x...@kernel.org
---
  include/linux/kexec.h  | 6 --
  include/uapi/linux/kexec.h | 1 +
  kernel/kexec.c | 2 ++
  3 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index 0f6ea35879ee..bcedb7625b1f 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -319,6 +319,7 @@ struct kimage {
  #ifdef CONFIG_CRASH_HOTPLUG
/* If set, allow changes to elfcorehdr of kexec_load'd image */
unsigned int update_elfcorehdr:1;
+   unsigned int update_fdt:1;

Can we unify this to one flag, e.g hotplug_update?

With this, on x86_64, we will skip the sha calculation for elfcorehdr.
On ppc, we will skip the sha calculation for elfcorehdr and fdt.

Yeah, that's what I suggested to Eric. I can do that, but I see one
problem with powerpc or other platforms that need to skip SHA
for more kexec segments in addition to elfcorehdr.

`update_elfcorehdr` is set when the kexec tool sends the 
`KEXEC_UPDATE_ELFCOREHDR`

flag to the kernel for the `kexec_load` system call.

Given that the kexec tool has already been updated to send the
`KEXEC_UPDATE_ELFCOREHDR` flag only when elfcorehdr is skipped from
SHA verification in generic code, now it would be tricky for 
architectures to

determine whether kexec has skipped SHA verification for just elfcorehdr
or all segments needed on the platform with the same flag.

Code snippet from the kexec tool:

main() {
    ...
    /* NOTE: Xen KEXEC_LIVE_UPDATE and KEXEC_UPDATE_ELFCOREHDR collide */
    if (do_hotplug) {
        ...

    /* Indicate to the kernel it is ok to modify the elfcorehdr */
    kexec_flags |= KEXEC_UPDATE_ELFCOREHDR;
    }
    ...
}

Any suggestion how to handle this with just one kexec flag?

Thanks for the review.

Thanks,
Sourabh Jain




  #endif
  
  #ifdef ARCH_HAS_KIMAGE_ARCH

@@ -396,9 +397,10 @@ bool kexec_load_permitted(int kexec_image_type);
  
  /* List of defined/legal kexec flags */

  #ifndef CONFIG_KEXEC_JUMP
-#define KEXEC_FLAGS(KEXEC_ON_CRASH | KEXEC_UPDATE_ELFCOREHDR)
+#define KEXEC_FLAGS(KEXEC_ON_CRASH | KEXEC_UPDATE_ELFCOREHDR | 
KEXEC_UPDATE_FDT)
  #else
-#define KEXEC_FLAGS(KEXEC_ON_CRASH | KEXEC_PRESERVE_CONTEXT | 
KEXEC_UPDATE_ELFCOREHDR)
+#define KEXEC_FLAGS(KEXEC_ON_CRASH | KEXEC_PRESERVE_CONTEXT | 
KEXEC_UPDATE_ELFCOREHDR | \
+   KEXEC_UPDATE_FDT)
  #endif
  
  /* List of defined/legal kexec file flags */

diff --git a/include/uapi/linux/kexec.h b/include/uapi/linux/kexec.h
index 01766dd839b0..3d5b3d757bed 100644
--- a/include/uapi/linux/kexec.h
+++ b/include/uapi/linux/kexec.h
@@ -13,6 +13,7 @@
  #define KEXEC_ON_CRASH0x0001
  #define KEXEC_PRESERVE_CONTEXT0x0002
  #define KEXEC_UPDATE_ELFCOREHDR   0x0004
+#define KEXEC_UPDATE_FDT   0x0008
  #define KEXEC_ARCH_MASK   0x
  
  /*

diff --git a/kernel/kexec.c b/kernel/kexec.c
index 8f35a5a42af8..97eb151cd931 100644
--- a/kernel/kexec.c
+++ b/kernel/kexec.c
@@ -132,6 +132,8 @@ static int do_kexec_load(unsigned long entry, unsigned long 
nr_segments,
  #ifdef CONFIG_CRASH_HOTPLUG
if (flags & KEXEC_UPDATE_ELFCOREHDR)
image->update_elfcorehdr = 1;
+   if (flags & KEXEC_UPDATE_FDT)
+   image->update_fdt = 1;
  #endif
  
  	ret = machine_kexec_prepare(image);

--
2.41.0





Re: [PATCH v14 6/6] powerpc: add crash memory hotplug support

2023-12-14 Thread Sourabh Jain




On 15/12/23 06:53, Baoquan He wrote:

On 12/11/23 at 02:00pm, Sourabh Jain wrote:
..

diff --git a/arch/powerpc/include/asm/kexec_ranges.h 
b/arch/powerpc/include/asm/kexec_ranges.h
index f83866a19e87..802abf580cf0 100644
--- a/arch/powerpc/include/asm/kexec_ranges.h
+++ b/arch/powerpc/include/asm/kexec_ranges.h
@@ -7,6 +7,7 @@
  void sort_memory_ranges(struct crash_mem *mrngs, bool merge);
  struct crash_mem *realloc_mem_ranges(struct crash_mem **mem_ranges);
  int add_mem_range(struct crash_mem **mem_ranges, u64 base, u64 size);
+int remove_mem_range(struct crash_mem **mem_ranges, u64 base, u64 size);
  int add_tce_mem_ranges(struct crash_mem **mem_ranges);
  int add_initrd_mem_range(struct crash_mem **mem_ranges);
  #ifdef CONFIG_PPC_64S_HASH_MMU
diff --git a/arch/powerpc/kexec/core_64.c b/arch/powerpc/kexec/core_64.c
index 9932793cd64b..5be30659172f 100644
--- a/arch/powerpc/kexec/core_64.c
+++ b/arch/powerpc/kexec/core_64.c
@@ -19,8 +19,11 @@
  #include 
  #include 
  #include 
+#include 
  
  #include 

+#include 
+#include 
  #include 
  #include 
  #include 
@@ -547,9 +550,7 @@ int update_cpus_node(void *fdt)
  #undef pr_fmt
  #define pr_fmt(fmt) "crash hp: " fmt
  
-#ifdef CONFIG_HOTPLUG_CPU

- /* Provides the value for the sysfs crash_hotplug nodes */
-int arch_crash_hotplug_cpu_support(struct kimage *image)
+static int crash_hotplug_support(struct kimage *image)
  {
if (image->file_mode)
return 1;
@@ -560,8 +561,118 @@ int arch_crash_hotplug_cpu_support(struct kimage *image)
 */
return image->update_elfcorehdr && image->update_fdt;
  }
+
+#ifdef CONFIG_HOTPLUG_CPU
+ /* Provides the value for the sysfs crash_hotplug nodes */
+int arch_crash_hotplug_cpu_support(struct kimage *image)
+{
+   return crash_hotplug_support(image);
+}
+#endif
+
+#ifdef CONFIG_MEMORY_HOTPLUG
+ /* Provides the value for the sysfs memory_hotplug nodes */
+int arch_crash_hotplug_memory_support(struct kimage *image)
+{
+   return crash_hotplug_support(image);
+}
  #endif
  
+/*

+ * Advertise preferred elfcorehdr size to userspace via
+ * /sys/kernel/crash_elfcorehdr_size sysfs interface.
+ */
+unsigned int arch_crash_get_elfcorehdr_size(void)
+{
+   unsigned int sz;
+   unsigned long elf_phdr_cnt;
+
+   /* Program header for CPU notes and vmcoreinfo */
+   elf_phdr_cnt = 2;
+   if (IS_ENABLED(CONFIG_MEMORY_HOTPLUG))
+   /* In the worst case, a Phdr is needed for every other LMB to be
+* represented as an individual crash range.
+*/
+   elf_phdr_cnt += memory_hotplug_max() / (2 * drmem_lmb_size());
+
+   /* Do not cross the max limit */
+   if (elf_phdr_cnt > PN_XNUM)
+   elf_phdr_cnt = PN_XNUM;
+
+   sz = sizeof(struct elfhdr) + (elf_phdr_cnt * sizeof(Elf64_Phdr));
+   return sz;
+}
+
+/**
+ * update_crash_elfcorehdr() - Recreate the elfcorehdr and replace it with old
+ *elfcorehdr in the kexec segment array.
+ * @image: the active struct kimage
+ * @mn: struct memory_notify data handler
+ */
+static void update_crash_elfcorehdr(struct kimage *image, struct memory_notify 
*mn)
+{
+   int ret;
+   struct crash_mem *cmem = NULL;
+   struct kexec_segment *ksegment;
+   void *ptr, *mem, *elfbuf = NULL;
+   unsigned long elfsz, memsz, base_addr, size;
+
+   ksegment = >segment[image->elfcorehdr_index];
+   mem = (void *) ksegment->mem;
+   memsz = ksegment->memsz;
+
+   ret = get_crash_memory_ranges();
+   if (ret) {
+   pr_err("Failed to get crash mem range\n");
+   return;
+   }
+
+   /*
+* The hot unplugged memory is part of crash memory ranges,
+* remove it here.
+*/
+   if (image->hp_action == KEXEC_CRASH_HP_REMOVE_MEMORY) {
+   base_addr = PFN_PHYS(mn->start_pfn);
+   size = mn->nr_pages * PAGE_SIZE;
+   ret = remove_mem_range(, base_addr, size);

Althouth this is ppc specific, I don't understand. Why don't you
recreate the elfcorehdr, but take removing the removed region. Comparing the
remove_mem_range() implementation with recreating, I don't see too much
benefit from that, and it makes your code more complicated. Just
curious, surely ppc people can decide what should be taken.


I am recreating `elfcorehdr` by calling `crash_prepare_elf64_headers()` 
below.


This complexity is necessary to avoid adding hot-removed memory to the
new `elfcorehdr`.

On powerpc, the memblock list is utilized to prepare the `elfcorehdr`. 
In the

case of memory hot removal, the memblock list is updated after the arch
crash hotplug handler is triggered. Thus, the hot-removed memory is 
explicitly

removed from the crash memory ranges to ensure that the memory ranges
added to `elfcorehdr` do not include the hot-removed memory.

Thanks,
Sourabh Jain




+   

Re: [PATCH v14 2/6] crash: make CPU and Memory hotplug support reporting flexible

2023-12-14 Thread Sourabh Jain

Hello Baoquan,

On 14/12/23 19:43, Baoquan He wrote:

On 12/11/23 at 02:00pm, Sourabh Jain wrote:

Architectures' specific functions `arch_crash_hotplug_cpu_support()` and
`arch_crash_hotplug_memory_support()` advertise the kernel's capability
to update the kdump image on CPU and Memory hotplug events to userspace
via the sysfs interface. These architecture-specific functions need to
access various attributes of the `kexec_crash_image` object to determine
whether the kernel can update the kdump image and advertise this
information to userspace accordingly.

As the architecture-specific code is not exposed to the APIs required to
acquire the lock for accessing the `kexec_crash_image` object, it calls
a generic function, `crash_check_update_elfcorehdr()`, to determine
whether the kernel can update the kdump image or not.

The lack of access to the `kexec_crash_image` object in
architecture-specific code restricts architectures from performing
additional architecture-specific checks required to determine if the
kdump image is updatable or not. For instance, on PowerPC, the kernel
can update the kdump image only if both the elfcorehdr and FDT are
marked as updatable for the `kexec_load` system call.

So define two generic functions, `crash_hotplug_cpu_support()` and
`crash_hotplug_memory_support()`, which are called when userspace
attempts to read the crash CPU and Memory hotplug support via the sysfs
interface. These functions take the necessary locks needed to access the
`kexec_crash_image` object and then forward it to the
architecture-specific handler to do the rest.

Signed-off-by: Sourabh Jain 
Cc: Akhil Raj 
Cc: Andrew Morton 
Cc: Aneesh Kumar K.V 
Cc: Baoquan He 
Cc: Borislav Petkov (AMD) 
Cc: Boris Ostrovsky 
Cc: Christophe Leroy 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Hildenbrand 
Cc: Eric DeVolder 
Cc: Greg Kroah-Hartman 
Cc: Hari Bathini 
Cc: Laurent Dufour 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Mimi Zohar 
Cc: Naveen N Rao 
Cc: Oscar Salvador 
Cc: Thomas Gleixner 
Cc: Valentin Schneider 
Cc: Vivek Goyal 
Cc: ke...@lists.infradead.org
Cc: x...@kernel.org
---
  arch/x86/include/asm/kexec.h |  8 
  arch/x86/kernel/crash.c  | 20 +++-
  include/linux/kexec.h| 13 +++--
  kernel/crash_core.c  | 23 +--
  4 files changed, 43 insertions(+), 21 deletions(-)

diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
index 9bb6607e864e..5c88d27b086d 100644
--- a/arch/x86/include/asm/kexec.h
+++ b/arch/x86/include/asm/kexec.h
@@ -212,13 +212,13 @@ void arch_crash_handle_hotplug_event(struct kimage 
*image, void *arg);
  #define arch_crash_handle_hotplug_event arch_crash_handle_hotplug_event
  
  #ifdef CONFIG_HOTPLUG_CPU

-int arch_crash_hotplug_cpu_support(void);
-#define crash_hotplug_cpu_support arch_crash_hotplug_cpu_support
+int arch_crash_hotplug_cpu_support(struct kimage *image);
+#define arch_crash_hotplug_cpu_support arch_crash_hotplug_cpu_support
  #endif
  
  #ifdef CONFIG_MEMORY_HOTPLUG

-int arch_crash_hotplug_memory_support(void);
-#define crash_hotplug_memory_support arch_crash_hotplug_memory_support
+int arch_crash_hotplug_memory_support(struct kimage *image);
+#define arch_crash_hotplug_memory_support arch_crash_hotplug_memory_support
  #endif
  
  unsigned int arch_crash_get_elfcorehdr_size(void);

diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index 0d7b2657beb6..ad5941665589 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -398,18 +398,28 @@ int crash_load_segments(struct kimage *image)
  #undef pr_fmt
  #define pr_fmt(fmt) "crash hp: " fmt
  
-/* These functions provide the value for the sysfs crash_hotplug nodes */

+#if defined(CONFIG_HOTPLUG_CPU) || defined(CONFIG_MEMORY_HOTPLUG)
+static int crash_hotplug_support(struct kimage *image)
+{
+   if (image->file_mode)
+   return 1;
+
+   return image->update_elfcorehdr;
+}
+#endif
+
  #ifdef CONFIG_HOTPLUG_CPU
-int arch_crash_hotplug_cpu_support(void)
+/* These functions provide the value for the sysfs crash_hotplug nodes */
+int arch_crash_hotplug_cpu_support(struct kimage *image)
  {
-   return crash_check_update_elfcorehdr();
+   return crash_hotplug_support(image);
  }
  #endif
  
  #ifdef CONFIG_MEMORY_HOTPLUG

-int arch_crash_hotplug_memory_support(void)
+int arch_crash_hotplug_memory_support(struct kimage *image)
  {
-   return crash_check_update_elfcorehdr();
+   return crash_hotplug_support(image);
  }
  #endif
  
diff --git a/include/linux/kexec.h b/include/linux/kexec.h

index ee28c09a7fb0..0f6ea35879ee 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -486,16 +486,17 @@ static inline void arch_kexec_pre_free_pages(void *vaddr, 
unsigned int pages) {
  static inline void arch_crash_handle_hotplug_event(struct kimage *image, void 
*arg) { }
  #endif
  
-int crash_check_update_elfcorehdr(void);

-
-#ifndef crash_hotplug_cpu

[PATCH v14 6/6] powerpc: add crash memory hotplug support

2023-12-11 Thread Sourabh Jain
Extend the arch crash hotplug handler, as introduced by the patch title
("powerpc: add crash CPU hotplug support"), to also support memory
add/remove events.

Elfcorehdr describes the memory of the crash kernel to capture the
kernel; hence, it needs to be updated if memory resources change due to
memory add/remove events. Therefore, arch_crash_handle_hotplug_event()
is updated to recreate the elfcorehdr and replace it with the previous
one on memory add/remove events.

The memblock list is used to prepare the elfcorehdr. In the case of
memory hot removal, the memblock list is updated after the arch crash
hotplug handler is triggered, as depicted in Figure 1. Thus, the
hot-removed memory is explicitly removed from the crash memory ranges
to ensure that the memory ranges added to elfcorehdr do not include the
hot-removed memory.

Memory remove
  |
  v
Offline pages
  |
  v
 Initiate memory notify call <> crash hotplug handler
 chain for MEM_OFFLINE event
  |
  v
 Update memblock list

Figure 1

There are two system calls, `kexec_file_load` and `kexec_load`, used to
load the kdump image. A few changes have been made to ensure that the
kernel can safely update the elfcorehdr component of the kdump image for
both system calls.

For the kexec_file_load syscall, kdump image is prepared in the kernel.
To support an increasing number of memory regions, the elfcorehdr is
built with extra buffer space to ensure that it can accommodate
additional memory ranges in future.

For the kexec_load syscall, the elfcorehdr is updated only if both the
KEXEC_UPDATE_FDT and KEXEC_UPDATE_ELFCOREHDR kexec flags are passed to
the kernel by the kexec tool. Passing these flags to the kernel
indicates that the elfcorehdr is built to accommodate additional memory
ranges and the elfcorehdr segment is not considered for SHA calculation,
making it safe to update it.

Commit 88a6f8994421 ("crash: memory and CPU hotplug sysfs attributes")
added a sysfs interface to indicate userspace (kdump udev rule) that
kernel will update the kdump image on memory hotplug events, so kdump
reload can be avoided. Implement arch specific function
`arch_crash_hotplug_memory_support()` to correctly advertise kernel
capability to update kdump image.

This feature is advertised to usersapce when following conditions are
met:

1. Kdump image is loaded using kexec_file_load system call.
2. Kdump image is loaded using kexec_load system and both
   KEXEC_UPATE_ELFCOREHDR and KEXEC_UPDATE_FDT kexec flags are passed
   to kernel.

The changes related to this feature are kept under the CRASH_HOTPLUG
config, and it is enabled by default.

Signed-off-by: Sourabh Jain 
Cc: Akhil Raj 
Cc: Andrew Morton 
Cc: Aneesh Kumar K.V 
Cc: Baoquan He 
Cc: Borislav Petkov (AMD) 
Cc: Boris Ostrovsky 
Cc: Christophe Leroy 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Hildenbrand 
Cc: Eric DeVolder 
Cc: Greg Kroah-Hartman 
Cc: Hari Bathini 
Cc: Laurent Dufour 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Mimi Zohar 
Cc: Naveen N Rao 
Cc: Oscar Salvador 
Cc: Thomas Gleixner 
Cc: Valentin Schneider 
Cc: Vivek Goyal 
Cc: ke...@lists.infradead.org
Cc: x...@kernel.org
---
 arch/powerpc/include/asm/kexec.h|   8 ++
 arch/powerpc/include/asm/kexec_ranges.h |   1 +
 arch/powerpc/kexec/core_64.c| 126 ++--
 arch/powerpc/kexec/file_load_64.c   |  34 ++-
 arch/powerpc/kexec/ranges.c |  85 
 5 files changed, 245 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
index 7823ab10d323..c8d6cfda523c 100644
--- a/arch/powerpc/include/asm/kexec.h
+++ b/arch/powerpc/include/asm/kexec.h
@@ -122,6 +122,14 @@ int arch_crash_hotplug_cpu_support(struct kimage *image);
 #define arch_crash_hotplug_cpu_support arch_crash_hotplug_cpu_support
 #endif
 
+#ifdef CONFIG_MEMORY_HOTPLUG
+int arch_crash_hotplug_memory_support(struct kimage *image);
+#define arch_crash_hotplug_memory_support arch_crash_hotplug_memory_support
+#endif
+
+unsigned int arch_crash_get_elfcorehdr_size(void);
+#define crash_get_elfcorehdr_size arch_crash_get_elfcorehdr_size
+
 #endif /*CONFIG_CRASH_HOTPLUG */
 #endif /* CONFIG_PPC64 */
 
diff --git a/arch/powerpc/include/asm/kexec_ranges.h 
b/arch/powerpc/include/asm/kexec_ranges.h
index f83866a19e87..802abf580cf0 100644
--- a/arch/powerpc/include/asm/kexec_ranges.h
+++ b/arch/powerpc/include/asm/kexec_ranges.h
@@ -7,6 +7,7 @@
 void sort_memory_ranges(struct crash_mem *mrngs, bool merge);
 struct crash_mem *realloc_mem_ranges(struct crash_mem **mem_ranges);
 int add_mem_range(struct crash_mem **mem_ranges, u64 base, u64 size);
+int remove_mem_range(struct crash_mem **mem_ranges, u64 base, u64 size);
 int add_tce_mem_ranges(struct crash_mem **mem_ranges);
 int add_initrd_mem_range(struct crash_mem **mem_ranges);
 #ifdef CONFIG_PPC_64S_HASH_MMU
diff --git a/arch/powe

[PATCH v14 5/6] powerpc: add crash CPU hotplug support

2023-12-11 Thread Sourabh Jain
Due to CPU/Memory hotplug or online/offline events the elfcorehdr
(which describes the CPUs and memory of the crashed kernel) and FDT
(Flattened Device Tree) of kdump image becomes outdated. Consequently,
attempting dump collection with an outdated elfcorehdr or FDT can lead
to failed or inaccurate dump collection.

Going forward CPU hotplug or online/offlice events are referred as
CPU/Memory add/remvoe events.

The current solution to address the above issue involves monitoring the
CPU/memory add/remove events in userspace using udev rules and whenever
there are changes in CPU and memory resources, the entire kdump image
is loaded again. The kdump image includes kernel, initrd, elfcorehdr,
FDT, purgatory. Given that only elfcorehdr and FDT get outdated due to
CPU/Memory add/remove events, reloading the entire kdump image is
inefficient. More importantly, kdump remains inactive for a substantial
amount of time until the kdump reload completes.

To address the aforementioned issue, commit 247262756121 ("crash: add
generic infrastructure for crash hotplug support") added a generic
infrastructure that allows architectures to selectively update the kdump
image component during CPU or memory add/remove events within the kernel
itself.

In the event of a CPU or memory add/remove event, the generic crash
hotplug event handler, `crash_handle_hotplug_event()`, is triggered. It
then acquires the necessary locks to update the kdump image and invokes
the architecture-specific crash hotplug handler,
`arch_crash_handle_hotplug_event()`, to update the required kdump image
components.

This patch adds crash hotplug handler for PowerPC and enable support to
update the kdump image on CPU add/remove events. Support for memory
add/remove events is added in a subsequent patch with the title
"powerpc: add crash memory hotplug support."

As mentioned earlier, only the elfcorehdr and FDT kdump image components
need to be updated in the event of CPU or memory add/remove events.
However, the PowerPC architecture crash hotplug handler only updates the
FDT to enable crash hotplug support for CPU add/remove events. Here's
why.

The Elfcorehdr on PowerPC is built with possible CPUs, and thus, it does
not need an update on CPU add/remove events. On the other hand, the FDT
needs to be updated on CPU add events to include the newly added CPU. If
the FDT is not updated and the kernel crashes on a newly added CPU, the
kdump kernel will fail to boot due to the unavailability of the crashing
CPU in the FDT. During the early boot, it is expected that the boot CPU
must be a part of the FDT; otherwise, the kernel will raise a BUG and
fail to boot. For more information, refer to commit 36ae37e3436b0
("powerpc: Make boot_cpuid common between 32 and 64-bit"). Since it is
okay to have an offline CPU in the kdump FDT, no action is taken in case
of CPU removal.

There are two system calls, `kexec_file_load` and `kexec_load`, used to
load the kdump image. Few changes have been made to ensure kernel can
safely update the kdump FDT for both system calls.

For kexec_file_load syscall the kdump image is prepared in kernel. So to
support an increasing number of CPUs, the FDT is constructed with extra
buffer space to ensure it can accommodate a possible number of CPU
nodes. Additionally, a call to fdt_pack (which trims the unused space
once the FDT is prepared) is avoided for kdump image loading if this
feature is enabled.

For the kexec_load syscall, the FDT is updated only if both the
KEXEC_UPDATE_FDT and KEXEC_UPDATE_ELFCOREHDR kexec flags are passed to
the kernel by the kexec tool. Passing these flags to the kernel
indicates that the FDT is built to accommodate possible CPUs, and the
FDT segment is not considered for SHA calculation, making it safe to
update the FDT.

Commit 88a6f8994421 ("crash: memory and CPU hotplug sysfs attributes")
added a sysfs interface to indicate userspace (kdump udev rule) that
kernel will update the kdump image on CPU hotplug events, so kdump
reload can be avoided. Implement arch specific function
`arch_crash_hotplug_cpu_support()` to correctly advertise kernel
capability to update kdump image.

This feature is advertised to userspace when the following conditions
are met:

1. Kdump image is loaded using kexec_file_load system call.
2. Kdump image is loaded using kexec_load system and both
   KEXEC_UPATE_ELFCOREHDR and KEXEC_UPDATE_FDT kexec flags are
   passed to kernel.

The changes related to this feature are kept under the CRASH_HOTPLUG
config, and it is enabled by default.

Signed-off-by: Sourabh Jain 
Cc: Akhil Raj 
Cc: Andrew Morton 
Cc: Aneesh Kumar K.V 
Cc: Baoquan He 
Cc: Borislav Petkov (AMD) 
Cc: Boris Ostrovsky 
Cc: Christophe Leroy 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Hildenbrand 
Cc: Eric DeVolder 
Cc: Greg Kroah-Hartman 
Cc: Hari Bathini 
Cc: Laurent Dufour 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Mimi Zohar 
Cc: Naveen N Rao 
Cc: Oscar Salvador 
Cc: Tho

[PATCH v14 3/6] crash: add a new kexec flag for FDT update

2023-12-11 Thread Sourabh Jain
The commit a72bbec70da2 ("crash: hotplug support for kexec_load()")
introduced a new kexec flag, `KEXEC_UPDATE_ELFCOREHDR`. Kexec tool uses
this flag to indicate kernel that it is safe to modify the elfcorehdr
of kdump image loaded using kexec_load system call.

Similarly, add a new kexec flag, `KEXEC_UPDATE_FDT`, for another kdump
component named FDT (Flatten Device Tree). Architectures like PowerPC
need to update FDT kdump image component on CPU hotplug events. Kexec
tool passing `KEXEC_UPDATE_FDT` will be an indication to kernel that FDT
segment is not part of SHA calculation hence it is safe to update it.

With the `KEXEC_UPDATE_ELFCOREHDR` and `KEXEC_UPDATE_FDT` kexec flags,
crash hotplug support can be added to PowerPC for the kexec_load syscall
while maintaining the backward compatibility with older kexec tools that
do not have these newly introduced flags.

Signed-off-by: Sourabh Jain 
Cc: Akhil Raj 
Cc: Andrew Morton 
Cc: Aneesh Kumar K.V 
Cc: Baoquan He 
Cc: Borislav Petkov (AMD) 
Cc: Boris Ostrovsky 
Cc: Christophe Leroy 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Hildenbrand 
Cc: Eric DeVolder 
Cc: Greg Kroah-Hartman 
Cc: Hari Bathini 
Cc: Laurent Dufour 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Mimi Zohar 
Cc: Naveen N Rao 
Cc: Oscar Salvador 
Cc: Thomas Gleixner 
Cc: Valentin Schneider 
Cc: Vivek Goyal 
Cc: ke...@lists.infradead.org
Cc: x...@kernel.org
---
 include/linux/kexec.h  | 6 --
 include/uapi/linux/kexec.h | 1 +
 kernel/kexec.c | 2 ++
 3 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index 0f6ea35879ee..bcedb7625b1f 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -319,6 +319,7 @@ struct kimage {
 #ifdef CONFIG_CRASH_HOTPLUG
/* If set, allow changes to elfcorehdr of kexec_load'd image */
unsigned int update_elfcorehdr:1;
+   unsigned int update_fdt:1;
 #endif
 
 #ifdef ARCH_HAS_KIMAGE_ARCH
@@ -396,9 +397,10 @@ bool kexec_load_permitted(int kexec_image_type);
 
 /* List of defined/legal kexec flags */
 #ifndef CONFIG_KEXEC_JUMP
-#define KEXEC_FLAGS(KEXEC_ON_CRASH | KEXEC_UPDATE_ELFCOREHDR)
+#define KEXEC_FLAGS(KEXEC_ON_CRASH | KEXEC_UPDATE_ELFCOREHDR | 
KEXEC_UPDATE_FDT)
 #else
-#define KEXEC_FLAGS(KEXEC_ON_CRASH | KEXEC_PRESERVE_CONTEXT | 
KEXEC_UPDATE_ELFCOREHDR)
+#define KEXEC_FLAGS(KEXEC_ON_CRASH | KEXEC_PRESERVE_CONTEXT | 
KEXEC_UPDATE_ELFCOREHDR | \
+   KEXEC_UPDATE_FDT)
 #endif
 
 /* List of defined/legal kexec file flags */
diff --git a/include/uapi/linux/kexec.h b/include/uapi/linux/kexec.h
index 01766dd839b0..3d5b3d757bed 100644
--- a/include/uapi/linux/kexec.h
+++ b/include/uapi/linux/kexec.h
@@ -13,6 +13,7 @@
 #define KEXEC_ON_CRASH 0x0001
 #define KEXEC_PRESERVE_CONTEXT 0x0002
 #define KEXEC_UPDATE_ELFCOREHDR0x0004
+#define KEXEC_UPDATE_FDT   0x0008
 #define KEXEC_ARCH_MASK0x
 
 /*
diff --git a/kernel/kexec.c b/kernel/kexec.c
index 8f35a5a42af8..97eb151cd931 100644
--- a/kernel/kexec.c
+++ b/kernel/kexec.c
@@ -132,6 +132,8 @@ static int do_kexec_load(unsigned long entry, unsigned long 
nr_segments,
 #ifdef CONFIG_CRASH_HOTPLUG
if (flags & KEXEC_UPDATE_ELFCOREHDR)
image->update_elfcorehdr = 1;
+   if (flags & KEXEC_UPDATE_FDT)
+   image->update_fdt = 1;
 #endif
 
ret = machine_kexec_prepare(image);
-- 
2.41.0



[PATCH v14 0/6] powerpc/crash: Kernel handling of CPU and memory hotplug

2023-12-11 Thread Sourabh Jain
 load. Refer "[RFC v4 PATCH 4/5] powerpc/crash hp: add crash
hotplug support for kexec_file_load" patch to know more about the
change.
  - Fix a couple of typo.
  - Replace pr_err to pr_info_once to warn user about memory hotplug
support.
  - In crash hotplug handle exit the for loop if FDT segment is found.

v3
  - Move fdt_index and fdt_index_vaild variables to kimage_arch struct.
  - Rebase patche on top of

https://lore.kernel.org/lkml/20220303162725.49640-1-eric.devol...@oracle.com/
  - Fixed warning reported by checpatch script

v2:
  - Use generic hotplug handler introduced by

https://lore.kernel.org/lkml/20220209195706.51522-1-eric.devol...@oracle.com/
a significant change from v1.

Cc: Akhil Raj 
Cc: Andrew Morton 
Cc: Aneesh Kumar K.V 
Cc: Baoquan He 
Cc: Borislav Petkov (AMD) 
Cc: Boris Ostrovsky 
Cc: Christophe Leroy 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Hildenbrand 
Cc: Eric DeVolder 
Cc: Greg Kroah-Hartman 
Cc: Hari Bathini 
Cc: Laurent Dufour 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Mimi Zohar 
Cc: Naveen N Rao 
Cc: Oscar Salvador 
Cc: Thomas Gleixner 
Cc: Valentin Schneider 
Cc: Vivek Goyal 
Cc: ke...@lists.infradead.org
Cc: x...@kernel.org

Sourabh Jain (6):
  crash: forward memory_notify arg to arch crash hotplug handler
  crash: make CPU and Memory hotplug support reporting flexible
  crash: add a new kexec flag for FDT update
  powerpc/kexec: turn some static helper functions public
  powerpc: add crash CPU hotplug support
  powerpc: add crash memory hotplug support

 arch/powerpc/Kconfig|   4 +
 arch/powerpc/include/asm/kexec.h|  25 ++
 arch/powerpc/include/asm/kexec_ranges.h |   1 +
 arch/powerpc/kexec/Makefile |   4 +-
 arch/powerpc/kexec/core_64.c| 369 
 arch/powerpc/kexec/elf_64.c |  12 +-
 arch/powerpc/kexec/file_load_64.c   | 211 +++---
 arch/powerpc/kexec/ranges.c |  85 ++
 arch/x86/include/asm/kexec.h|  10 +-
 arch/x86/kernel/crash.c |  23 +-
 include/linux/kexec.h   |  21 +-
 include/uapi/linux/kexec.h  |   1 +
 kernel/crash_core.c |  37 ++-
 kernel/kexec.c  |   2 +
 14 files changed, 605 insertions(+), 200 deletions(-)

-- 
2.41.0



[PATCH v14 4/6] powerpc/kexec: turn some static helper functions public

2023-12-11 Thread Sourabh Jain
Move the functions update_cpus_node and get_crash_memory_ranges from
kexec/file_load_64.c to kexec/core_64.c to make these functions usable
by other kexec components.

get_crash_memory_ranges uses functions defined in ranges.c, so take
ranges.c out of CONFIG_KEXEC_FILE.

Later in the series, these functions are utilized for in-kernel updates
to kdump image during CPU/Memory hotplug or online/offline events for
both kexec_load and kexec_file_load syscalls.

There is no intended functional change.

Signed-off-by: Sourabh Jain 
Reviewed-by: Laurent Dufour 
Cc: Akhil Raj 
Cc: Andrew Morton 
Cc: Aneesh Kumar K.V 
Cc: Baoquan He 
Cc: Borislav Petkov (AMD) 
Cc: Boris Ostrovsky 
Cc: Christophe Leroy 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Hildenbrand 
Cc: Eric DeVolder 
Cc: Greg Kroah-Hartman 
Cc: Hari Bathini 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Mimi Zohar 
Cc: Naveen N Rao 
Cc: Oscar Salvador 
Cc: Thomas Gleixner 
Cc: Valentin Schneider 
Cc: Vivek Goyal 
Cc: ke...@lists.infradead.org
Cc: x...@kernel.org
---
 arch/powerpc/include/asm/kexec.h  |   6 ++
 arch/powerpc/kexec/Makefile   |   4 +-
 arch/powerpc/kexec/core_64.c  | 166 ++
 arch/powerpc/kexec/file_load_64.c | 162 -
 4 files changed, 174 insertions(+), 164 deletions(-)

diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
index e1b43aa12175..562e1bb689da 100644
--- a/arch/powerpc/include/asm/kexec.h
+++ b/arch/powerpc/include/asm/kexec.h
@@ -108,6 +108,12 @@ void crash_free_reserved_phys_range(unsigned long begin, 
unsigned long end);
 #endif /* CONFIG_PPC_RTAS */
 #endif /* CONFIG_CRASH_DUMP */
 
+#ifdef CONFIG_PPC64
+struct crash_mem;
+int update_cpus_node(void *fdt);
+int get_crash_memory_ranges(struct crash_mem **mem_ranges);
+#endif /* CONFIG_PPC64 */
+
 #ifdef CONFIG_KEXEC_FILE
 extern const struct kexec_file_ops kexec_elf64_ops;
 
diff --git a/arch/powerpc/kexec/Makefile b/arch/powerpc/kexec/Makefile
index 0c2abe7f9908..f2ed5b85b912 100644
--- a/arch/powerpc/kexec/Makefile
+++ b/arch/powerpc/kexec/Makefile
@@ -3,11 +3,11 @@
 # Makefile for the linux kernel.
 #
 
-obj-y  += core.o crash.o core_$(BITS).o
+obj-y  += core.o crash.o ranges.o core_$(BITS).o
 
 obj-$(CONFIG_PPC32)+= relocate_32.o
 
-obj-$(CONFIG_KEXEC_FILE)   += file_load.o ranges.o file_load_$(BITS).o 
elf_$(BITS).o
+obj-$(CONFIG_KEXEC_FILE)   += file_load.o file_load_$(BITS).o elf_$(BITS).o
 
 # Disable GCOV, KCOV & sanitizers in odd or sensitive code
 GCOV_PROFILE_core_$(BITS).o := n
diff --git a/arch/powerpc/kexec/core_64.c b/arch/powerpc/kexec/core_64.c
index 0bee7ca9a77c..9966b51d9aa8 100644
--- a/arch/powerpc/kexec/core_64.c
+++ b/arch/powerpc/kexec/core_64.c
@@ -17,6 +17,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 #include 
 #include 
@@ -30,6 +32,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 int machine_kexec_prepare(struct kimage *image)
 {
@@ -377,6 +381,168 @@ void default_machine_kexec(struct kimage *image)
/* NOTREACHED */
 }
 
+/**
+ * get_crash_memory_ranges - Get crash memory ranges. This list includes
+ *   first/crashing kernel's memory regions that
+ *   would be exported via an elfcore.
+ * @mem_ranges:  Range list to add the memory ranges to.
+ *
+ * Returns 0 on success, negative errno on error.
+ */
+int get_crash_memory_ranges(struct crash_mem **mem_ranges)
+{
+   phys_addr_t base, end;
+   struct crash_mem *tmem;
+   u64 i;
+   int ret;
+
+   for_each_mem_range(i, , ) {
+   u64 size = end - base;
+
+   /* Skip backup memory region, which needs a separate entry */
+   if (base == BACKUP_SRC_START) {
+   if (size > BACKUP_SRC_SIZE) {
+   base = BACKUP_SRC_END + 1;
+   size -= BACKUP_SRC_SIZE;
+   } else
+   continue;
+   }
+
+   ret = add_mem_range(mem_ranges, base, size);
+   if (ret)
+   goto out;
+
+   /* Try merging adjacent ranges before reallocation attempt */
+   if ((*mem_ranges)->nr_ranges == (*mem_ranges)->max_nr_ranges)
+   sort_memory_ranges(*mem_ranges, true);
+   }
+
+   /* Reallocate memory ranges if there is no space to split ranges */
+   tmem = *mem_ranges;
+   if (tmem && (tmem->nr_ranges == tmem->max_nr_ranges)) {
+   tmem = realloc_mem_ranges(mem_ranges);
+   if (!tmem)
+   goto out;
+   }
+
+   /* Exclude crashkernel region */
+   ret = crash_exclude_mem_range(tmem, crashk_res.start, crashk_res.end);
+   if (ret)
+   goto out;
+
+   /*
+* FIXME: For now, stay

[PATCH v14 1/6] crash: forward memory_notify arg to arch crash hotplug handler

2023-12-11 Thread Sourabh Jain
In the event of memory hotplug or online/offline events, the crash
memory hotplug notifier `crash_memhp_notifier()` receives a
`memory_notify` object but doesn't forward that object to the
generic and architecture-specific crash hotplug handler.

The `memory_notify` object contains the starting PFN (Page Frame Number)
and the number of pages in the hot-removed memory. This information is
necessary for architectures like PowerPC to update/recreate the kdump
image, specifically `elfcorehdr`.

So update the function signature of `crash_handle_hotplug_event()` and
`arch_crash_handle_hotplug_event()` to accept the `memory_notify` object
as an argument from crash memory hotplug notifier.

Since no such object is available in the case of CPU hotplug event, the
crash CPU hotplug notifier `crash_cpuhp_online()` passes NULL to the
crash hotplug handler.

Signed-off-by: Sourabh Jain 
Cc: Akhil Raj 
Cc: Andrew Morton 
Cc: Aneesh Kumar K.V 
Cc: Baoquan He 
Cc: Borislav Petkov (AMD) 
Cc: Boris Ostrovsky 
Cc: Christophe Leroy 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Hildenbrand 
Cc: Eric DeVolder 
Cc: Greg Kroah-Hartman 
Cc: Hari Bathini 
Cc: Laurent Dufour 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Mimi Zohar 
Cc: Naveen N Rao 
Cc: Oscar Salvador 
Cc: Thomas Gleixner 
Cc: Valentin Schneider 
Cc: Vivek Goyal 
Cc: ke...@lists.infradead.org
Cc: x...@kernel.org
---
 arch/x86/include/asm/kexec.h |  2 +-
 arch/x86/kernel/crash.c  |  3 ++-
 include/linux/kexec.h|  2 +-
 kernel/crash_core.c  | 14 +++---
 4 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
index c9f6a6c5de3c..9bb6607e864e 100644
--- a/arch/x86/include/asm/kexec.h
+++ b/arch/x86/include/asm/kexec.h
@@ -208,7 +208,7 @@ int arch_kimage_file_post_load_cleanup(struct kimage 
*image);
 extern void kdump_nmi_shootdown_cpus(void);
 
 #ifdef CONFIG_CRASH_HOTPLUG
-void arch_crash_handle_hotplug_event(struct kimage *image);
+void arch_crash_handle_hotplug_event(struct kimage *image, void *arg);
 #define arch_crash_handle_hotplug_event arch_crash_handle_hotplug_event
 
 #ifdef CONFIG_HOTPLUG_CPU
diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index c92d88680dbf..0d7b2657beb6 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -428,10 +428,11 @@ unsigned int arch_crash_get_elfcorehdr_size(void)
 /**
  * arch_crash_handle_hotplug_event() - Handle hotplug elfcorehdr changes
  * @image: a pointer to kexec_crash_image
+ * @arg: struct memory_notify handler for memory hotplug case and NULL for CPU 
hotplug case.
  *
  * Prepare the new elfcorehdr and replace the existing elfcorehdr.
  */
-void arch_crash_handle_hotplug_event(struct kimage *image)
+void arch_crash_handle_hotplug_event(struct kimage *image, void *arg)
 {
void *elfbuf = NULL, *old_elfcorehdr;
unsigned long nr_mem_ranges;
diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index 8227455192b7..ee28c09a7fb0 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -483,7 +483,7 @@ static inline void arch_kexec_pre_free_pages(void *vaddr, 
unsigned int pages) {
 #endif
 
 #ifndef arch_crash_handle_hotplug_event
-static inline void arch_crash_handle_hotplug_event(struct kimage *image) { }
+static inline void arch_crash_handle_hotplug_event(struct kimage *image, void 
*arg) { }
 #endif
 
 int crash_check_update_elfcorehdr(void);
diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index efe87d501c8c..b9190265fe52 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -935,7 +935,7 @@ int crash_check_update_elfcorehdr(void)
  * list of segments it checks (since the elfcorehdr changes and thus
  * would require an update to purgatory itself to update the digest).
  */
-static void crash_handle_hotplug_event(unsigned int hp_action, unsigned int 
cpu)
+static void crash_handle_hotplug_event(unsigned int hp_action, unsigned int 
cpu, void *arg)
 {
struct kimage *image;
 
@@ -997,7 +997,7 @@ static void crash_handle_hotplug_event(unsigned int 
hp_action, unsigned int cpu)
image->hp_action = hp_action;
 
/* Now invoke arch-specific update handler */
-   arch_crash_handle_hotplug_event(image);
+   arch_crash_handle_hotplug_event(image, arg);
 
/* No longer handling a hotplug event */
image->hp_action = KEXEC_CRASH_HP_NONE;
@@ -1013,17 +1013,17 @@ static void crash_handle_hotplug_event(unsigned int 
hp_action, unsigned int cpu)
crash_hotplug_unlock();
 }
 
-static int crash_memhp_notifier(struct notifier_block *nb, unsigned long val, 
void *v)
+static int crash_memhp_notifier(struct notifier_block *nb, unsigned long val, 
void *arg)
 {
switch (val) {
case MEM_ONLINE:
crash_handle_hotplug_event(KEXEC_CRASH_HP_ADD_MEMORY,
-   KEXEC_CRASH_HP_INVALID_CPU);
+   KEXEC_CRASH_HP_INVALID_CPU, arg);
   

[PATCH v14 2/6] crash: make CPU and Memory hotplug support reporting flexible

2023-12-11 Thread Sourabh Jain
Architectures' specific functions `arch_crash_hotplug_cpu_support()` and
`arch_crash_hotplug_memory_support()` advertise the kernel's capability
to update the kdump image on CPU and Memory hotplug events to userspace
via the sysfs interface. These architecture-specific functions need to
access various attributes of the `kexec_crash_image` object to determine
whether the kernel can update the kdump image and advertise this
information to userspace accordingly.

As the architecture-specific code is not exposed to the APIs required to
acquire the lock for accessing the `kexec_crash_image` object, it calls
a generic function, `crash_check_update_elfcorehdr()`, to determine
whether the kernel can update the kdump image or not.

The lack of access to the `kexec_crash_image` object in
architecture-specific code restricts architectures from performing
additional architecture-specific checks required to determine if the
kdump image is updatable or not. For instance, on PowerPC, the kernel
can update the kdump image only if both the elfcorehdr and FDT are
marked as updatable for the `kexec_load` system call.

So define two generic functions, `crash_hotplug_cpu_support()` and
`crash_hotplug_memory_support()`, which are called when userspace
attempts to read the crash CPU and Memory hotplug support via the sysfs
interface. These functions take the necessary locks needed to access the
`kexec_crash_image` object and then forward it to the
architecture-specific handler to do the rest.

Signed-off-by: Sourabh Jain 
Cc: Akhil Raj 
Cc: Andrew Morton 
Cc: Aneesh Kumar K.V 
Cc: Baoquan He 
Cc: Borislav Petkov (AMD) 
Cc: Boris Ostrovsky 
Cc: Christophe Leroy 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Hildenbrand 
Cc: Eric DeVolder 
Cc: Greg Kroah-Hartman 
Cc: Hari Bathini 
Cc: Laurent Dufour 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Mimi Zohar 
Cc: Naveen N Rao 
Cc: Oscar Salvador 
Cc: Thomas Gleixner 
Cc: Valentin Schneider 
Cc: Vivek Goyal 
Cc: ke...@lists.infradead.org
Cc: x...@kernel.org
---
 arch/x86/include/asm/kexec.h |  8 
 arch/x86/kernel/crash.c  | 20 +++-
 include/linux/kexec.h| 13 +++--
 kernel/crash_core.c  | 23 +--
 4 files changed, 43 insertions(+), 21 deletions(-)

diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
index 9bb6607e864e..5c88d27b086d 100644
--- a/arch/x86/include/asm/kexec.h
+++ b/arch/x86/include/asm/kexec.h
@@ -212,13 +212,13 @@ void arch_crash_handle_hotplug_event(struct kimage 
*image, void *arg);
 #define arch_crash_handle_hotplug_event arch_crash_handle_hotplug_event
 
 #ifdef CONFIG_HOTPLUG_CPU
-int arch_crash_hotplug_cpu_support(void);
-#define crash_hotplug_cpu_support arch_crash_hotplug_cpu_support
+int arch_crash_hotplug_cpu_support(struct kimage *image);
+#define arch_crash_hotplug_cpu_support arch_crash_hotplug_cpu_support
 #endif
 
 #ifdef CONFIG_MEMORY_HOTPLUG
-int arch_crash_hotplug_memory_support(void);
-#define crash_hotplug_memory_support arch_crash_hotplug_memory_support
+int arch_crash_hotplug_memory_support(struct kimage *image);
+#define arch_crash_hotplug_memory_support arch_crash_hotplug_memory_support
 #endif
 
 unsigned int arch_crash_get_elfcorehdr_size(void);
diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index 0d7b2657beb6..ad5941665589 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -398,18 +398,28 @@ int crash_load_segments(struct kimage *image)
 #undef pr_fmt
 #define pr_fmt(fmt) "crash hp: " fmt
 
-/* These functions provide the value for the sysfs crash_hotplug nodes */
+#if defined(CONFIG_HOTPLUG_CPU) || defined(CONFIG_MEMORY_HOTPLUG)
+static int crash_hotplug_support(struct kimage *image)
+{
+   if (image->file_mode)
+   return 1;
+
+   return image->update_elfcorehdr;
+}
+#endif
+
 #ifdef CONFIG_HOTPLUG_CPU
-int arch_crash_hotplug_cpu_support(void)
+/* These functions provide the value for the sysfs crash_hotplug nodes */
+int arch_crash_hotplug_cpu_support(struct kimage *image)
 {
-   return crash_check_update_elfcorehdr();
+   return crash_hotplug_support(image);
 }
 #endif
 
 #ifdef CONFIG_MEMORY_HOTPLUG
-int arch_crash_hotplug_memory_support(void)
+int arch_crash_hotplug_memory_support(struct kimage *image)
 {
-   return crash_check_update_elfcorehdr();
+   return crash_hotplug_support(image);
 }
 #endif
 
diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index ee28c09a7fb0..0f6ea35879ee 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -486,16 +486,17 @@ static inline void arch_kexec_pre_free_pages(void *vaddr, 
unsigned int pages) {
 static inline void arch_crash_handle_hotplug_event(struct kimage *image, void 
*arg) { }
 #endif
 
-int crash_check_update_elfcorehdr(void);
-
-#ifndef crash_hotplug_cpu_support
-static inline int crash_hotplug_cpu_support(void) { return 0; }
+#ifndef arch_crash_hotplug_cpu_support
+s

[PATCH v6 3/3] Documentation/powerpc: update fadump implementation details

2023-12-08 Thread Sourabh Jain
The patch titled ("powerpc: make fadump resilient with memory add/remove
events") has made significant changes to the implementation of fadump,
particularly on elfcorehdr creation and fadump crash info header
structure. Therefore, updating the fadump implementation documentation
to reflect those changes.

Following updates are done to firmware assisted dump documentation:

1. The elfcorehdr is no longer stored after fadump HDR in the reserved
   dump area. Instead, the second kernel dynamically allocates memory
   for the elfcorehdr within the address range from 0 to the boot memory
   size. Therefore, update figures 1 and 2 of Memory Reservation during
   the first and second kernels to reflect this change.

2. A version field has been added to the fadump header to manage the
   future changes to fadump crash info header structure without changing
   the fadump header magic number in the future. Therefore, remove the
   corresponding TODO from the document.

Signed-off-by: Sourabh Jain 
Cc: Aditya Gupta 
Cc: Aneesh Kumar K.V 
Cc: Hari Bathini 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Naveen N Rao 
---
 .../arch/powerpc/firmware-assisted-dump.rst   | 91 +--
 1 file changed, 42 insertions(+), 49 deletions(-)

diff --git a/Documentation/arch/powerpc/firmware-assisted-dump.rst 
b/Documentation/arch/powerpc/firmware-assisted-dump.rst
index e363fc48529a..7e37aadd1f77 100644
--- a/Documentation/arch/powerpc/firmware-assisted-dump.rst
+++ b/Documentation/arch/powerpc/firmware-assisted-dump.rst
@@ -134,12 +134,12 @@ that are run. If there is dump data, then the
 memory is held.
 
 If there is no waiting dump data, then only the memory required to
-hold CPU state, HPTE region, boot memory dump, FADump header and
-elfcore header, is usually reserved at an offset greater than boot
-memory size (see Fig. 1). This area is *not* released: this region
-will be kept permanently reserved, so that it can act as a receptacle
-for a copy of the boot memory content in addition to CPU state and
-HPTE region, in the case a crash does occur.
+hold CPU state, HPTE region, boot memory dump, and FADump header is
+usually reserved at an offset greater than boot memory size (see Fig. 1).
+This area is *not* released: this region will be kept permanently
+reserved, so that it can act as a receptacle for a copy of the boot
+memory content in addition to CPU state and HPTE region, in the case
+a crash does occur.
 
 Since this reserved memory area is used only after the system crash,
 there is no point in blocking this significant chunk of memory from
@@ -153,22 +153,22 @@ that were present in CMA region::
 
   o Memory Reservation during first kernel
 
-  Low memory Top of memory
-  0boot memory size   |<--- Reserved dump area --->|   |
-  |   |   |Permanent Reservation   |   |
-  V   V   ||   V
-  +---+-/ /---+---++---+-+-++--+
-  |   |   |///||  DUMP | HDR | ELF ||  |
-  +---+-/ /---+---++---+-+-++--+
-|   ^^ ^  ^   ^
-|   || |  |   |
-\  CPU  HPTE   /  |   |
- --   |   |
-  Boot memory content gets transferred|   |
-  to reserved area by firmware at the |   |
-  time of crash.  |   |
-  FADump Header   |
-   (meta area)|
+  Low memory  Top of memory
+  0boot memory size   |<-- Reserved dump area ->| |
+  |   |   |  Permanent Reservation  | |
+  V   V   | | V
+  +---+-/ /---+---++---+---++-+
+  |   |   |///||DUMP   |  HDR  || |
+  +---+-/ /---+---++---+---++-+
+|   ^^   ^ ^  ^
+|   ||   | |  |
+\  CPU  HPTE / |  |
+   |  |
+  Boot memory content gets transferred |  |
+  to reserved area by firmware at the  |  |
+  time of crash.   |  |
+   FADump Header  |
+(meta area)   |
   |
   |
   Metadata: This area holds a metadata structure whose
@@ -186,13 +186,2

[PATCH v6 2/3] powerpc/fadump: add hotplug_ready sysfs interface

2023-12-08 Thread Sourabh Jain
The elfcorehdr describes the CPUs and memory of the crashed kernel to
the kernel that captures the dump, known as the second or fadump kernel.
The elfcorehdr needs to be updated if the system's memory changes due to
memory hotplug or online/offline events.

Currently, memory hotplug events are monitored in userspace by udev
rules, and fadump is re-registered, which recreates the elfcorehdr with
the latest available memory in the system.

However, the previous patch ("powerpc: make fadump resilient with memory
add/remove events") moved the creation of elfcorehdr to the second or
fadump kernel. This eliminates the need to regenerate the elfcorehdr
during memory hotplug or online/offline events.

Create a sysfs entry at /sys/kernel/fadump/hotplug_ready to let
userspace know that fadump re-registration is not required for memory
add/remove events.

Signed-off-by: Sourabh Jain 
Cc: Aditya Gupta 
Cc: Aneesh Kumar K.V 
Cc: Hari Bathini 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Naveen N Rao 
---
 Documentation/ABI/testing/sysfs-kernel-fadump | 11 +++
 arch/powerpc/kernel/fadump.c  | 14 ++
 2 files changed, 25 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-kernel-fadump 
b/Documentation/ABI/testing/sysfs-kernel-fadump
index 8f7a64a81783..971934b2891e 100644
--- a/Documentation/ABI/testing/sysfs-kernel-fadump
+++ b/Documentation/ABI/testing/sysfs-kernel-fadump
@@ -38,3 +38,14 @@ Contact: linuxppc-dev@lists.ozlabs.org
 Description:   read only
Provide information about the amount of memory reserved by
FADump to save the crash dump in bytes.
+
+What:  /sys/kernel/fadump/hotplug_ready
+Date:  Sep 2023
+Contact:   linuxppc-dev@lists.ozlabs.org
+Description:   read only
+   Kdump udev rule re-registers fadump on memory add/remove events,
+   primarily to update the elfcorehdr. This sysfs indicates the
+   kdump udev rule that fadump re-registration is not required on
+   memory add/remove events because elfcorehdr is now prepared in
+   the second/fadump kernel.
+User:  kexec-tools
diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
index eb9132538268..a55dd9bf754c 100644
--- a/arch/powerpc/kernel/fadump.c
+++ b/arch/powerpc/kernel/fadump.c
@@ -1455,6 +1455,18 @@ static ssize_t enabled_show(struct kobject *kobj,
return sprintf(buf, "%d\n", fw_dump.fadump_enabled);
 }
 
+/*
+ * /sys/kernel/fadump/hotplug_ready sysfs node returns 1, which inidcates
+ * to usersapce that fadump re-registration is not required on memory
+ * hotplug events.
+ */
+static ssize_t hotplug_ready_show(struct kobject *kobj,
+ struct kobj_attribute *attr,
+ char *buf)
+{
+   return sprintf(buf, "%d\n", 1);
+}
+
 static ssize_t mem_reserved_show(struct kobject *kobj,
 struct kobj_attribute *attr,
 char *buf)
@@ -1527,11 +1539,13 @@ static struct kobj_attribute release_attr = 
__ATTR_WO(release_mem);
 static struct kobj_attribute enable_attr = __ATTR_RO(enabled);
 static struct kobj_attribute register_attr = __ATTR_RW(registered);
 static struct kobj_attribute mem_reserved_attr = __ATTR_RO(mem_reserved);
+static struct kobj_attribute hotplug_ready_attr = __ATTR_RO(hotplug_ready);
 
 static struct attribute *fadump_attrs[] = {
_attr.attr,
_attr.attr,
_reserved_attr.attr,
+   _ready_attr.attr,
NULL,
 };
 
-- 
2.41.0



[PATCH v6 1/3] powerpc: make fadump resilient with memory add/remove events

2023-12-08 Thread Sourabh Jain
 the dump and prints
relevant error message on console.

Signed-off-by: Sourabh Jain 
Cc: Aditya Gupta 
Cc: Aneesh Kumar K.V 
Cc: Hari Bathini 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Naveen N Rao 
---
 arch/powerpc/include/asm/fadump-internal.h   |  31 +-
 arch/powerpc/kernel/fadump.c | 355 +++
 arch/powerpc/platforms/powernv/opal-fadump.c |  18 +-
 arch/powerpc/platforms/pseries/rtas-fadump.c |  23 +-
 4 files changed, 242 insertions(+), 185 deletions(-)

diff --git a/arch/powerpc/include/asm/fadump-internal.h 
b/arch/powerpc/include/asm/fadump-internal.h
index 27f9e11eda28..a632e9708610 100644
--- a/arch/powerpc/include/asm/fadump-internal.h
+++ b/arch/powerpc/include/asm/fadump-internal.h
@@ -42,13 +42,40 @@ static inline u64 fadump_str_to_u64(const char *str)
 
 #define FADUMP_CPU_UNKNOWN (~((u32)0))
 
-#define FADUMP_CRASH_INFO_MAGICfadump_str_to_u64("FADMPINF")
+/*
+ * The introduction of new fields in the fadump crash info header has
+ * led to a change in the magic key from `FADMPINF` to `FADMPSIG` for
+ * identifying a kernel crash from an old kernel.
+ *
+ * To prevent the need for further changes to the magic number in the
+ * event of future modifications to the fadump crash info header, a
+ * version field has been introduced to track the fadump crash info
+ * header version.
+ *
+ * Consider a few points before adding new members to the fadump crash info
+ * header structure:
+ *
+ *  - Append new members; avoid adding them in between.
+ *  - Non-primitive members should have a size member as well.
+ *  - For every change in the fadump header, increment the
+ *fadump header version. This helps the updated kernel decide how to
+ *handle kernel dumps from older kernels.
+ */
+#define FADUMP_CRASH_INFO_MAGIC_OLDfadump_str_to_u64("FADMPINF")
+#define FADUMP_CRASH_INFO_MAGICfadump_str_to_u64("FADMPSIG")
+#define FADUMP_HEADER_VERSION  1
 
 /* fadump crash info structure */
 struct fadump_crash_info_header {
u64 magic_number;
-   u64 elfcorehdr_addr;
+   u32 version;
u32 crashing_cpu;
+   u64 elfcorehdr_addr;
+   u64 elfcorehdr_size;
+   u64 vmcoreinfo_raddr;
+   u64 vmcoreinfo_size;
+   u32 pt_regs_sz;
+   u32 cpu_mask_sz;
struct pt_regs  regs;
struct cpumask  cpu_mask;
 };
diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
index d14eda1e8589..eb9132538268 100644
--- a/arch/powerpc/kernel/fadump.c
+++ b/arch/powerpc/kernel/fadump.c
@@ -53,8 +53,6 @@ static struct kobject *fadump_kobj;
 static atomic_t cpus_in_fadump;
 static DEFINE_MUTEX(fadump_mutex);
 
-static struct fadump_mrange_info crash_mrange_info = { "crash", NULL, 0, 0, 0, 
false };
-
 #define RESERVED_RNGS_SZ   16384 /* 16K - 128 entries */
 #define RESERVED_RNGS_CNT  (RESERVED_RNGS_SZ / \
 sizeof(struct fadump_memory_range))
@@ -373,12 +371,6 @@ static unsigned long __init get_fadump_area_size(void)
size = PAGE_ALIGN(size);
size += fw_dump.boot_memory_size;
size += sizeof(struct fadump_crash_info_header);
-   size += sizeof(struct elfhdr); /* ELF core header.*/
-   size += sizeof(struct elf_phdr); /* place holder for cpu notes */
-   /* Program headers for crash memory regions. */
-   size += sizeof(struct elf_phdr) * (memblock_num_regions(memory) + 2);
-
-   size = PAGE_ALIGN(size);
 
/* This is to hold kernel metadata on platforms that support it */
size += (fw_dump.ops->fadump_get_metadata_size ?
@@ -931,36 +923,6 @@ static inline int fadump_add_mem_range(struct 
fadump_mrange_info *mrange_info,
return 0;
 }
 
-static int fadump_exclude_reserved_area(u64 start, u64 end)
-{
-   u64 ra_start, ra_end;
-   int ret = 0;
-
-   ra_start = fw_dump.reserve_dump_area_start;
-   ra_end = ra_start + fw_dump.reserve_dump_area_size;
-
-   if ((ra_start < end) && (ra_end > start)) {
-   if ((start < ra_start) && (end > ra_end)) {
-   ret = fadump_add_mem_range(_mrange_info,
-  start, ra_start);
-   if (ret)
-   return ret;
-
-   ret = fadump_add_mem_range(_mrange_info,
-  ra_end, end);
-   } else if (start < ra_start) {
-   ret = fadump_add_mem_range(_mrange_info,
-  start, ra_start);
-   } else if (ra_end < end) {
-   ret = fadump_add_mem_range(_mrange_info,
-  ra_end, end);
-   }

[PATCH v6 0/3] powerpc: make fadump resilient with memory add/remove events

2023-12-08 Thread Sourabh Jain
Problem:

Due to changes in memory resources caused by either memory hotplug or
online/offline events, the elfcorehdr, which describes the cpus and
memory of the crashed kernel to the kernel that collects the dump (known
as second/fadump kernel), becomes outdated. Consequently, attempting
dump collection with an outdated elfcorehdr can lead to failed or
inaccurate dump collection.

Memory hotplug or online/offline events is referred as memory add/remove
events in reset of the patch series.

Existing solution:
==
Monitor memory add/remove events in userspace using udev rules, and
re-register fadump whenever there are changes in memory resources. This
leads to the creation of a new elfcorehdr with updated system memory
information.

Challenges with existing solution:
==
1. Performing bulk memory add/remove with udev-based fadump
   re-registration can lead to race conditions and, more importantly,
   it creates a large wide window during which fadump is inactive until
   all memory add/remove events are settled.
2. Re-registering fadump for every memory add/remove event is
   inefficient.
3. Memory for elfcorehdr is allocated based on the memblock regions
   available during first kernel early boot and it remains fixed
   thereafter. However, if the elfcorehdr is later recreated with
   additional memblock regions, its size will increase, potentially
   leading to memory corruption.

Proposed solution:
==
Address the aforementioned challenges by shifting the creation of
elfcorehdr from the first kernel (also referred as the crashed kernel),
where it was created and frequently recreated for every memory
add/remove event, to the fadump kernel. As a result, the elfcorehdr only
needs to be created once, thus eliminating the necessity to re-register
fadump during memory add/remove events.

To know more about elfcorehdr creation in the fadump kernel, refer to
the first patch in this series.

The second patch includes a new sysfs interface that tells userspace
that fadump re-registration isn't needed for memory add/remove events. 
note that userspace changes do not need to be in sync with kernel
changes; they can roll out independently.

Since there are significant changes in the fadump implementation, the
third patch updates the fadump documentation to reflect the changes made
in this patch series.

Kernel tree rebased on 6.7.0-rc4 with patch series applied:
=
https://github.com/sourabhjains/linux/tree/fadump-mem-hotplug-v6

Userspace changes:
==
To realize this feature, one must update the kdump udev rules to prevent
fadump re-registration during memory add/remove events.

On rhel apply the following changes to file
/usr/lib/udev/rules.d/98-kexec.rules

-run+="/bin/sh -c '/usr/bin/systemctl is-active kdump.service || exit 0; 
/usr/bin/systemd-run --quiet --no-block /usr/lib/udev/kdump-udev-throttler'"
+# don't re-register fadump if the value of the node
+# /sys/kernel/fadump/hotplug_ready is 1.
+
+run+="/bin/sh -c '/usr/bin/systemctl is-active kdump.service || exit 0; ! test 
-f /sys/kernel/fadump_enabled || cat /sys/kernel/fadump_enabled | grep 0  || ! 
test -f /sys/kernel/fadump/hotplug_ready || cat 
/sys/kernel/fadump/hotplug_ready | grep 0 || exit 0; /usr/bin/systemd-run 
--quiet --no-block /usr/lib/udev/kdump-udev-throttler'"

Changelog:
==
v6: 8 Dec 2023
  - Add size fields for `pt_regs` and `cpumask` in the fadump header
structure
  - Don't process the dump if the size of `pt_regs` and `cpu_mask` is
not same in the crashed and fadump kernel
  - Include an additional check for endianness mismatch when the magic
number doesn't match, to print the relevant error message
  - Don't process the dump if the fadump header contains an old magic number
  - Rebased it to 6.7.0-rc4

v5: 29 Oct 2023 
  https://lore.kernel.org/all/20231029124548.12198-1-sourabhj...@linux.ibm.com/
  - Fix a comment on the first patch

v4: 21 Oct 2023
  https://lore.kernel.org/all/20231021181733.204311-1-sourabhj...@linux.ibm.com/
  - Fix a build warning about type casting

v3: 9 Oct 2023
  https://lore.kernel.org/all/20231009041953.36139-1-sourabhj...@linux.ibm.com/
  - Assign physical address of elfcorehdr to fdh->elfcorehdr_addr
  - Rename a variable, boot_mem_dest_addr -> boot_mem_dest_offset

v2: 25 Sep 2023
  https://lore.kernel.org/all/20230925051214.678957-1-sourabhj...@linux.ibm.com/
  - Fixed a few indentation issues reported by the checkpatch script.
  - Rebased it to 6.6.0-rc3

v1: 17 Sep 2023
  https://lore.kernel.org/all/20230917080225.561627-1-sourabhj...@linux.ibm.com/

Cc: Aditya Gupta 
Cc: Aneesh Kumar K.V 
Cc: Hari Bathini 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Naveen N Rao 

Sourabh Jain (3):
  powerpc: make fadump resilient with memory add/remove events
  powerpc/fadump: add hotplug_ready sysfs interface
  Documentat

[PATCH v13 6/6] powerpc: add crash memory hotplug support

2023-12-03 Thread Sourabh Jain
Extend the arch crash hotplug handler, as introduced by the patch title
("powerpc: add crash CPU hotplug support"), to also support memory
add/remove events.

Elfcorehdr describes the memory of the crash kernel to capture the
kernel; hence, it needs to be updated if memory resources change due to
memory add/remove events. Therefore, arch_crash_handle_hotplug_event()
is updated to recreate the elfcorehdr and replace it with the previous
one on memory add/remove events.

The memblock list is used to prepare the elfcorehdr. In the case of
memory hot removal, the memblock list is updated after the arch crash
hotplug handler is triggered, as depicted in Figure 1. Thus, the
hot-removed memory is explicitly removed from the crash memory ranges
to ensure that the memory ranges added to elfcorehdr do not include the
hot-removed memory.

Memory remove
  |
  v
Offline pages
  |
  v
 Initiate memory notify call <> crash hotplug handler
 chain for MEM_OFFLINE event
  |
  v
 Update memblock list

Figure 1

There are two system calls, `kexec_file_load` and `kexec_load`, used to
load the kdump image. A few changes have been made to ensure that the
kernel can safely update the elfcorehdr component of the kdump image for
both system calls.

For the kexec_file_load syscall, kdump image is prepared in the kernel.
To support an increasing number of memory regions, the elfcorehdr is
built with extra buffer space to ensure that it can accommodate
additional memory ranges in future.

For the kexec_load syscall, the elfcorehdr is updated only if both the
KEXEC_UPDATE_FDT and KEXEC_UPDATE_ELFCOREHDR kexec flags are passed to
the kernel by the kexec tool. Passing these flags to the kernel
indicates that the elfcorehdr is built to accommodate additional memory
ranges and the elfcorehdr segment is not considered for SHA calculation,
making it safe to update it.

Commit 88a6f8994421 ("crash: memory and CPU hotplug sysfs attributes")
added a sysfs interface to indicate userspace (kdump udev rule) that
kernel will update the kdump image on memory hotplug events, so kdump
reload can be avoided. Implement arch specific function
`arch_crash_hotplug_memory_support()` to correctly advertise kernel
capability to update kdump image.

This feature is advertised to usersapce when following conditions are
met:

1. Kdump image is loaded using kexec_file_load system call.
2. Kdump image is loaded using kexec_load system and both
   KEXEC_UPATE_ELFCOREHDR and KEXEC_UPDATE_FDT kexec flags are passed
   to kernel.

The changes related to this feature are kept under the CRASH_HOTPLUG
config, and it is enabled by default.

Signed-off-by: Sourabh Jain 
Cc: Akhil Raj 
Cc: Andrew Morton 
Cc: Baoquan He 
Cc: Borislav Petkov (AMD) 
Cc: Boris Ostrovsky 
Cc: Christophe Leroy 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Hildenbrand 
Cc: Eric DeVolder 
Cc: Greg Kroah-Hartman 
Cc: Hari Bathini 
Cc: Laurent Dufour 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Mimi Zohar 
Cc: Oscar Salvador 
Cc: Thomas Gleixner 
Cc: Valentin Schneider 
Cc: Vivek Goyal 
Cc: ke...@lists.infradead.org
Cc: x...@kernel.org
---
 arch/powerpc/include/asm/kexec.h|   8 ++
 arch/powerpc/include/asm/kexec_ranges.h |   1 +
 arch/powerpc/kexec/core_64.c| 125 ++--
 arch/powerpc/kexec/file_load_64.c   |  34 ++-
 arch/powerpc/kexec/ranges.c |  85 
 5 files changed, 244 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
index 7823ab10d323..c8d6cfda523c 100644
--- a/arch/powerpc/include/asm/kexec.h
+++ b/arch/powerpc/include/asm/kexec.h
@@ -122,6 +122,14 @@ int arch_crash_hotplug_cpu_support(struct kimage *image);
 #define arch_crash_hotplug_cpu_support arch_crash_hotplug_cpu_support
 #endif
 
+#ifdef CONFIG_MEMORY_HOTPLUG
+int arch_crash_hotplug_memory_support(struct kimage *image);
+#define arch_crash_hotplug_memory_support arch_crash_hotplug_memory_support
+#endif
+
+unsigned int arch_crash_get_elfcorehdr_size(void);
+#define crash_get_elfcorehdr_size arch_crash_get_elfcorehdr_size
+
 #endif /*CONFIG_CRASH_HOTPLUG */
 #endif /* CONFIG_PPC64 */
 
diff --git a/arch/powerpc/include/asm/kexec_ranges.h 
b/arch/powerpc/include/asm/kexec_ranges.h
index f83866a19e87..802abf580cf0 100644
--- a/arch/powerpc/include/asm/kexec_ranges.h
+++ b/arch/powerpc/include/asm/kexec_ranges.h
@@ -7,6 +7,7 @@
 void sort_memory_ranges(struct crash_mem *mrngs, bool merge);
 struct crash_mem *realloc_mem_ranges(struct crash_mem **mem_ranges);
 int add_mem_range(struct crash_mem **mem_ranges, u64 base, u64 size);
+int remove_mem_range(struct crash_mem **mem_ranges, u64 base, u64 size);
 int add_tce_mem_ranges(struct crash_mem **mem_ranges);
 int add_initrd_mem_range(struct crash_mem **mem_ranges);
 #ifdef CONFIG_PPC_64S_HASH_MMU
diff --git a/arch/powerpc/kexec/core_64.c b/arch/powerpc/kexec/co

[PATCH v13 5/6] powerpc: add crash CPU hotplug support

2023-12-03 Thread Sourabh Jain
Due to CPU/Memory hotplug or online/offline events the elfcorehdr
(which describes the CPUs and memory of the crashed kernel) and FDT
(Flattened Device Tree) of kdump image becomes outdated. Consequently,
attempting dump collection with an outdated elfcorehdr or FDT can lead
to failed or inaccurate dump collection.

Going forward CPU hotplug or online/offlice events are referred as
CPU/Memory add/remvoe events.

The current solution to address the above issue involves monitoring the
CPU/memory add/remove events in userspace using udev rules and whenever
there are changes in CPU and memory resources, the entire kdump image
is loaded again. The kdump image includes kernel, initrd, elfcorehdr,
FDT, purgatory. Given that only elfcorehdr and FDT get outdated due to
CPU/Memory add/remove events, reloading the entire kdump image is
inefficient. More importantly, kdump remains inactive for a substantial
amount of time until the kdump reload completes.

To address the aforementioned issue, commit 247262756121 ("crash: add
generic infrastructure for crash hotplug support") added a generic
infrastructure that allows architectures to selectively update the kdump
image component during CPU or memory add/remove events within the kernel
itself.

In the event of a CPU or memory add/remove event, the generic crash
hotplug event handler, `crash_handle_hotplug_event()`, is triggered. It
then acquires the necessary locks to update the kdump image and invokes
the architecture-specific crash hotplug handler,
`arch_crash_handle_hotplug_event()`, to update the required kdump image
components.

This patch adds crash hotplug handler for PowerPC and enable support to
update the kdump image on CPU add/remove events. Support for memory
add/remove events is added in a subsequent patch with the title
"powerpc: add crash memory hotplug support."

As mentioned earlier, only the elfcorehdr and FDT kdump image components
need to be updated in the event of CPU or memory add/remove events.
However, the PowerPC architecture crash hotplug handler only updates the
FDT to enable crash hotplug support for CPU add/remove events. Here's
why.

The Elfcorehdr on PowerPC is built with possible CPUs, and thus, it does
not need an update on CPU add/remove events. On the other hand, the FDT
needs to be updated on CPU add events to include the newly added CPU. If
the FDT is not updated and the kernel crashes on a newly added CPU, the
kdump kernel will fail to boot due to the unavailability of the crashing
CPU in the FDT. During the early boot, it is expected that the boot CPU
must be a part of the FDT; otherwise, the kernel will raise a BUG and
fail to boot. For more information, refer to commit 36ae37e3436b0
("powerpc: Make boot_cpuid common between 32 and 64-bit"). Since it is
okay to have an offline CPU in the kdump FDT, no action is taken in case
of CPU removal.

There are two system calls, `kexec_file_load` and `kexec_load`, used to
load the kdump image. Few changes have been made to ensure kernel can
safely update the kdump FDT for both system calls.

For kexec_file_load syscall the kdump image is prepared in kernel. So to
support an increasing number of CPUs, the FDT is constructed with extra
buffer space to ensure it can accommodate a possible number of CPU
nodes. Additionally, a call to fdt_pack (which trims the unused space
once the FDT is prepared) is avoided for kdump image loading if this
feature is enabled.

For the kexec_load syscall, the FDT is updated only if both the
KEXEC_UPDATE_FDT and KEXEC_UPDATE_ELFCOREHDR kexec flags are passed to
the kernel by the kexec tool. Passing these flags to the kernel
indicates that the FDT is built to accommodate possible CPUs, and the
FDT segment is not considered for SHA calculation, making it safe to
update the FDT.

Commit 88a6f8994421 ("crash: memory and CPU hotplug sysfs attributes")
added a sysfs interface to indicate userspace (kdump udev rule) that
kernel will update the kdump image on CPU hotplug events, so kdump
reload can be avoided. Implement arch specific function
`arch_crash_hotplug_cpu_support()` to correctly advertise kernel
capability to update kdump image.

This feature is advertised to userspace when the following conditions
are met:

1. Kdump image is loaded using kexec_file_load system call.
2. Kdump image is loaded using kexec_load system and both
   KEXEC_UPATE_ELFCOREHDR and KEXEC_UPDATE_FDT kexec flags are
   passed to kernel.

The changes related to this feature are kept under the CRASH_HOTPLUG
config, and it is enabled by default.

Signed-off-by: Sourabh Jain 
Cc: Akhil Raj 
Cc: Andrew Morton 
Cc: Baoquan He 
Cc: Borislav Petkov (AMD) 
Cc: Boris Ostrovsky 
Cc: Christophe Leroy 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Hildenbrand 
Cc: Eric DeVolder 
Cc: Greg Kroah-Hartman 
Cc: Hari Bathini 
Cc: Laurent Dufour 
Cc: Mahesh Salgaonkar 
Cc: Michael Ellerman 
Cc: Mimi Zohar 
Cc: Oscar Salvador 
Cc: Thomas Gleixner 
Cc: Val

  1   2   3   4   >