On 03/11/25 15:40, Ritesh Harjani (IBM) wrote:
Sourabh Jain <[email protected]> writes:

Commit 35c18f2933c5 ("Add a new optional ",cma" suffix to the
crashkernel= command line option") and commit ab475510e042 ("kdump:
implement reserve_crashkernel_cma") added CMA support for kdump
crashkernel reservation.

Extend crashkernel CMA reservation support to powerpc.

The following changes are made to enable CMA reservation on powerpc:

- Parse and obtain the CMA reservation size along with other crashkernel
   parameters
- Call reserve_crashkernel_cma() to allocate the CMA region for kdump
- Include the CMA-reserved ranges in the usable memory ranges for the
   kdump kernel to use.
- Exclude the CMA-reserved ranges from the crash kernel memory to
   prevent them from being exported through /proc/vmcore.

With the introduction of the CMA crashkernel regions,
crash_exclude_mem_range() needs to be called multiple times to exclude
both crashk_res and crashk_cma_ranges from the crash memory ranges. To
avoid repetitive logic for validating mem_ranges size and handling
reallocation when required, this functionality is moved to a new wrapper
function crash_exclude_mem_range_guarded().

To ensure proper CMA reservation, reserve_crashkernel_cma() is called
after pageblock_order is initialized.

Update kernel-parameters.txt to document CMA support for crashkernel on
powerpc architecture.

Cc: Baoquan he <[email protected]>
Cc: Jiri Bohac <[email protected]>
Cc: Hari Bathini <[email protected]>
Cc: Madhavan Srinivasan <[email protected]>
Cc: Mahesh Salgaonkar <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Ritesh Harjani (IBM) <[email protected]>
Cc: Shivang Upadhyay <[email protected]>
Cc: [email protected]
Signed-off-by: Sourabh Jain <[email protected]>
---
Changlog:

v3 -> v4
  - Removed repeated initialization to tmem in
    crash_exclude_mem_range_guarded()
  - Call crash_exclude_mem_range() with right crashk ranges

v4 -> v5:
  - Document CMA-based crashkernel support for ppc64 in kernel-parameters.txt
---
  .../admin-guide/kernel-parameters.txt         |  2 +-
  arch/powerpc/include/asm/kexec.h              |  2 +
  arch/powerpc/kernel/setup-common.c            |  4 +-
  arch/powerpc/kexec/core.c                     | 10 ++++-
  arch/powerpc/kexec/ranges.c                   | 43 ++++++++++++++-----
  5 files changed, 47 insertions(+), 14 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt 
b/Documentation/admin-guide/kernel-parameters.txt
index 6c42061ca20e..0f386b546cec 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1013,7 +1013,7 @@
                        It will be ignored when crashkernel=X,high is not used
                        or memory reserved is below 4G.
        crashkernel=size[KMG],cma
-                       [KNL, X86] Reserve additional crash kernel memory from
+                       [KNL, X86, ppc64] Reserve additional crash kernel 
memory from
Shouldn't this be PPC and not ppc64?

If I see the crash_dump support...

config ARCH_SUPPORTS_CRASH_DUMP
        def_bool PPC64 || PPC_BOOK3S_32 || PPC_85xx || (44x && !SMP)

The changes below aren't specific to ppc64 correct?

The thing is this feature is only supported with KEXEC_FILE and which only supported on PPC64:

config ARCH_SUPPORTS_KEXEC_FILE
    def_bool PPC64

Hence I kept it as ppc64.

I think I should update that in the commit message.

Also do you think is it good to restrict this feature to KEXEC_FILE?


                        CMA. This reservation is usable by the first system's
                        userspace memory and kernel movable allocations (memory
                        balloon, zswap). Pages allocated from this memory range
diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
index 4bbf9f699aaa..bd4a6c42a5f3 100644
--- a/arch/powerpc/include/asm/kexec.h
+++ b/arch/powerpc/include/asm/kexec.h
@@ -115,9 +115,11 @@ int setup_new_fdt_ppc64(const struct kimage *image, void 
*fdt, struct crash_mem
  #ifdef CONFIG_CRASH_RESERVE
  int __init overlaps_crashkernel(unsigned long start, unsigned long size);
  extern void arch_reserve_crashkernel(void);
+extern void kdump_cma_reserve(void);
  #else
  static inline void arch_reserve_crashkernel(void) {}
  static inline int overlaps_crashkernel(unsigned long start, unsigned long 
size) { return 0; }
+static inline void kdump_cma_reserve(void) { }
  #endif
#if defined(CONFIG_CRASH_DUMP)
diff --git a/arch/powerpc/kernel/setup-common.c 
b/arch/powerpc/kernel/setup-common.c
index 68d47c53876c..c8c42b419742 100644
--- a/arch/powerpc/kernel/setup-common.c
+++ b/arch/powerpc/kernel/setup-common.c
@@ -35,6 +35,7 @@
  #include <linux/of_irq.h>
  #include <linux/hugetlb.h>
  #include <linux/pgtable.h>
+#include <asm/kexec.h>
  #include <asm/io.h>
  #include <asm/paca.h>
  #include <asm/processor.h>
@@ -995,11 +996,12 @@ void __init setup_arch(char **cmdline_p)
        initmem_init();
/*
-        * Reserve large chunks of memory for use by CMA for fadump, KVM and
+        * Reserve large chunks of memory for use by CMA for kdump, fadump, KVM 
and
         * hugetlb. These must be called after initmem_init(), so that
         * pageblock_order is initialised.
         */
        fadump_cma_init();
+       kdump_cma_reserve();
        kvm_cma_reserve();
        gigantic_hugetlb_cma_reserve();
diff --git a/arch/powerpc/kexec/core.c b/arch/powerpc/kexec/core.c
index d1a2d755381c..25744737eff5 100644
--- a/arch/powerpc/kexec/core.c
+++ b/arch/powerpc/kexec/core.c
@@ -33,6 +33,8 @@ void machine_kexec_cleanup(struct kimage *image)
  {
  }
+unsigned long long cma_size;
+
nit:
Since this is a gloabal powerpc variable you are defining, then can we
keep it's name to crashk_cma_size?

Yeah make sense. I will update the variable name.



  /*
   * Do not allocate memory (or fail in any way) in machine_kexec().
   * We are past the point of no return, committed to rebooting now.
@@ -110,7 +112,7 @@ void __init arch_reserve_crashkernel(void)
/* use common parsing */
        ret = parse_crashkernel(boot_command_line, total_mem_sz, &crash_size,
-                               &crash_base, NULL, NULL, NULL);
+                               &crash_base, NULL, &cma_size, NULL);
if (ret)
                return;
@@ -130,6 +132,12 @@ void __init arch_reserve_crashkernel(void)
        reserve_crashkernel_generic(crash_size, crash_base, 0, false);
  }
+void __init kdump_cma_reserve(void)
+{
+       if (cma_size)
+               reserve_crashkernel_cma(cma_size);
+}
+
nit:
cma_size is already checked for null within reserve_crashkernel_cma(),
so we don't really need kdump_cma_reserve() function call as such.

Also kdump_cma_reserve() only make sense with #ifdef CRASHKERNEL_CMA..
so instead do you think we can directly call reserve_crashkernel_cma(cma_size)?

I think the above kdump_cma_reserve() definition should come under CONFIG_CRASH_RESERVE
because the way it is declared in arch/powerpc/include/asm/kexec.h.

I would like to keep kdump_cma_reserve() as is it because of two reasons:

- It keeps setup_arch() free from kdump #ifdefs
- In case if we want to add some condition on this reservation it would straight forward.

So lets keep kdump_cma_reserve as is, unless you have strong opinion on not to.

  int __init overlaps_crashkernel(unsigned long start, unsigned long size)
  {
        return (start + size) > crashk_res.start && start <= crashk_res.end;
diff --git a/arch/powerpc/kexec/ranges.c b/arch/powerpc/kexec/ranges.c
index 3702b0bdab14..3bd27c38726b 100644
--- a/arch/powerpc/kexec/ranges.c
+++ b/arch/powerpc/kexec/ranges.c
@@ -515,7 +515,7 @@ int get_exclude_memory_ranges(struct crash_mem **mem_ranges)
   */
  int get_usable_memory_ranges(struct crash_mem **mem_ranges)
  {
-       int ret;
+       int ret, i;
/*
         * Early boot failure observed on guests when low memory (first memory
@@ -528,6 +528,13 @@ int get_usable_memory_ranges(struct crash_mem **mem_ranges)
        if (ret)
                goto out;
+ for (i = 0; i < crashk_cma_cnt; i++) {
+               ret = add_mem_range(mem_ranges, crashk_cma_ranges[i].start,
+                                   crashk_cma_ranges[i].end - 
crashk_cma_ranges[i].start + 1);
+               if (ret)
+                       goto out;
+       }
+
        ret = add_rtas_mem_range(mem_ranges);
        if (ret)
                goto out;
@@ -546,6 +553,22 @@ int get_usable_memory_ranges(struct crash_mem **mem_ranges)
  #endif /* CONFIG_KEXEC_FILE */
#ifdef CONFIG_CRASH_DUMP
+static int crash_exclude_mem_range_guarded(struct crash_mem **mem_ranges,
+                                          unsigned long long mstart,
+                                          unsigned long long mend)
+{
+       struct crash_mem *tmem = *mem_ranges;
+
+       /* Reallocate memory ranges if there is no space to split ranges */
+       if (tmem && (tmem->nr_ranges == tmem->max_nr_ranges)) {
+               tmem = realloc_mem_ranges(mem_ranges);
+               if (!tmem)
+                       return -ENOMEM;
+       }
+
+       return crash_exclude_mem_range(tmem, mstart, mend);
+}
+
  /**
   * get_crash_memory_ranges - Get crash memory ranges. This list includes
   *                           first/crashing kernel's memory regions that
@@ -557,7 +580,6 @@ int get_usable_memory_ranges(struct crash_mem **mem_ranges)
  int get_crash_memory_ranges(struct crash_mem **mem_ranges)
  {
        phys_addr_t base, end;
-       struct crash_mem *tmem;
        u64 i;
        int ret;
@@ -582,19 +604,18 @@ int get_crash_memory_ranges(struct crash_mem **mem_ranges)
                        sort_memory_ranges(*mem_ranges, true);
        }
- /* Reallocate memory ranges if there is no space to split ranges */
-       tmem = *mem_ranges;
-       if (tmem && (tmem->nr_ranges == tmem->max_nr_ranges)) {
-               tmem = realloc_mem_ranges(mem_ranges);
-               if (!tmem)
-                       goto out;
-       }
-
        /* Exclude crashkernel region */
-       ret = crash_exclude_mem_range(tmem, crashk_res.start, crashk_res.end);
+       ret = crash_exclude_mem_range_guarded(mem_ranges, crashk_res.start, 
crashk_res.end);
        if (ret)
                goto out;
+ for (i = 0; i < crashk_cma_cnt; ++i) {
+               ret = crash_exclude_mem_range_guarded(mem_ranges, 
crashk_cma_ranges[i].start,
+                                             crashk_cma_ranges[i].end);
+               if (ret)
+                       goto out;
+       }
+
        /*
         * FIXME: For now, stay in parity with kexec-tools but if RTAS/OPAL
         *        regions are exported to save their context at the time of
--
2.51.0


Reply via email to