Re: [PATCH v3 12/12] kdump: Use vmlinux_build_id to simplify

2021-04-07 Thread Stephen Boyd
Quoting Petr Mladek (2021-04-07 10:03:28)
> On Tue 2021-03-30 20:05:20, Stephen Boyd wrote:
> > We can use the vmlinux_build_id array here now instead of open coding
> > it. This mostly consolidates code.
> > 
> > Cc: Jiri Olsa 
> > Cc: Alexei Starovoitov 
> > Cc: Jessica Yu 
> > Cc: Evan Green 
> > Cc: Hsin-Yi Wang 
> > Cc: Dave Young 
> > Cc: Baoquan He 
> > Cc: Vivek Goyal 
> > Cc: 
> > Signed-off-by: Stephen Boyd 
> > ---
> >  include/linux/crash_core.h |  6 +-
> >  kernel/crash_core.c| 41 ++
> >  2 files changed, 3 insertions(+), 44 deletions(-)
> > 
> > diff --git a/include/linux/crash_core.h b/include/linux/crash_core.h
> > index 206bde8308b2..fb8ab99bb2ee 100644
> > --- a/include/linux/crash_core.h
> > +++ b/include/linux/crash_core.h
> > @@ -39,7 +39,7 @@ phys_addr_t paddr_vmcoreinfo_note(void);
> >  #define VMCOREINFO_OSRELEASE(value) \
> >   vmcoreinfo_append_str("OSRELEASE=%s\n", value)
> >  #define VMCOREINFO_BUILD_ID(value) \
> > - vmcoreinfo_append_str("BUILD-ID=%s\n", value)
> > + vmcoreinfo_append_str("BUILD-ID=%20phN\n", value)
> 
> Please also add a build check that BUILD_ID_MAX == 20.
> 

I added a BUILD_BUG_ON() in kernel/crash_core.c. I tried static_assert()
here but got "ISO C90 forbids mixed declarations" errors from gcc-10,
although it feels like it should work.

In file included from ./arch/arm64/include/asm/cmpxchg.h:10,
 from ./arch/arm64/include/asm/atomic.h:16,
 from ./include/linux/atomic.h:7,
 from ./include/linux/mm_types_task.h:13,
 from ./include/linux/mm_types.h:5,
 from ./include/linux/buildid.h:5,
 from kernel/crash_core.c:7:
kernel/crash_core.c: In function 'crash_save_vmcoreinfo_init':
./include/linux/build_bug.h:78:41: warning: ISO C90 forbids mixed declarations 
and code [-Wdeclaration-after-statement]
   78 | #define __static_assert(expr, msg, ...) _Static_assert(expr, msg)
  | ^~
./include/linux/build_bug.h:77:34: note: in expansion of macro '__static_assert'
   77 | #define static_assert(expr, ...) __static_assert(expr, ##__VA_ARGS__, 
#expr)
  |  ^~~
./include/linux/crash_core.h:42:2: note: in expansion of macro 'static_assert'
   42 |  static_assert(ARRAY_SIZE(value) == BUILD_ID_SIZE_MAX); \
  |  ^
kernel/crash_core.c:401:2: note: in expansion of macro 'VMCOREINFO_BUILD_ID'
  401 |  VMCOREINFO_BUILD_ID(vmlinux_build_id);

> 
> The function add_build_id_vmcoreinfo() is used in
> crash_save_vmcoreinfo_init() in this context:
> 
> 
> VMCOREINFO_OSRELEASE(init_uts_ns.name.release);
> add_build_id_vmcoreinfo();
> VMCOREINFO_PAGESIZE(PAGE_SIZE);
> 
> VMCOREINFO_SYMBOL(init_uts_ns);
> VMCOREINFO_OFFSET(uts_namespace, name);
> VMCOREINFO_SYMBOL(node_online_map);
> 
> The function is no longer needed. VMCOREINFO_BUILD_ID()
> can be used directly:
> 
> VMCOREINFO_OSRELEASE(init_uts_ns.name.release);
> VMCOREINFO_BUILD_ID(vmlinux_build_id);
> VMCOREINFO_PAGESIZE(PAGE_SIZE);
> 
> VMCOREINFO_SYMBOL(init_uts_ns);
> VMCOREINFO_OFFSET(uts_namespace, name);
> VMCOREINFO_SYMBOL(node_online_map);
> 
> 

Thanks. Makes sense. I've rolled that in.

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


[PATCH v13 18/18] arm64/mm: remove useless trans_pgd_map_page()

2021-04-07 Thread Pavel Tatashin
From: Pingfan Liu 

The intent of trans_pgd_map_page() was to map a contiguous range of VA
memory to the memory that is getting relocated during kexec. However,
since we are now using the linear map instead of a contiguous range, this
function is not needed.

Signed-off-by: Pingfan Liu 
[Changed commit message]
Signed-off-by: Pavel Tatashin 
---
 arch/arm64/include/asm/trans_pgd.h |  5 +--
 arch/arm64/mm/trans_pgd.c  | 57 --
 2 files changed, 1 insertion(+), 61 deletions(-)

diff --git a/arch/arm64/include/asm/trans_pgd.h 
b/arch/arm64/include/asm/trans_pgd.h
index e0760e52d36d..234353df2f13 100644
--- a/arch/arm64/include/asm/trans_pgd.h
+++ b/arch/arm64/include/asm/trans_pgd.h
@@ -15,7 +15,7 @@
 /*
  * trans_alloc_page
  * - Allocator that should return exactly one zeroed page, if this
- *   allocator fails, trans_pgd_create_copy() and trans_pgd_map_page()
+ *   allocator fails, trans_pgd_create_copy() and trans_pgd_idmap_page()
  *   return -ENOMEM error.
  *
  * trans_alloc_arg
@@ -30,9 +30,6 @@ struct trans_pgd_info {
 int trans_pgd_create_copy(struct trans_pgd_info *info, pgd_t **trans_pgd,
  unsigned long start, unsigned long end);
 
-int trans_pgd_map_page(struct trans_pgd_info *info, pgd_t *trans_pgd,
-  void *page, unsigned long dst_addr, pgprot_t pgprot);
-
 int trans_pgd_idmap_page(struct trans_pgd_info *info, phys_addr_t *trans_ttbr0,
 unsigned long *t0sz, void *page);
 
diff --git a/arch/arm64/mm/trans_pgd.c b/arch/arm64/mm/trans_pgd.c
index 61549451ed3a..e24a749013c1 100644
--- a/arch/arm64/mm/trans_pgd.c
+++ b/arch/arm64/mm/trans_pgd.c
@@ -217,63 +217,6 @@ int trans_pgd_create_copy(struct trans_pgd_info *info, 
pgd_t **dst_pgdp,
return rc;
 }
 
-/*
- * Add map entry to trans_pgd for a base-size page at PTE level.
- * info:   contains allocator and its argument
- * trans_pgd:  page table in which new map is added.
- * page:   page to be mapped.
- * dst_addr:   new VA address for the page
- * pgprot: protection for the page.
- *
- * Returns 0 on success, and -ENOMEM on failure.
- */
-int trans_pgd_map_page(struct trans_pgd_info *info, pgd_t *trans_pgd,
-  void *page, unsigned long dst_addr, pgprot_t pgprot)
-{
-   pgd_t *pgdp;
-   p4d_t *p4dp;
-   pud_t *pudp;
-   pmd_t *pmdp;
-   pte_t *ptep;
-
-   pgdp = pgd_offset_pgd(trans_pgd, dst_addr);
-   if (pgd_none(READ_ONCE(*pgdp))) {
-   p4dp = trans_alloc(info);
-   if (!pgdp)
-   return -ENOMEM;
-   pgd_populate(NULL, pgdp, p4dp);
-   }
-
-   p4dp = p4d_offset(pgdp, dst_addr);
-   if (p4d_none(READ_ONCE(*p4dp))) {
-   pudp = trans_alloc(info);
-   if (!pudp)
-   return -ENOMEM;
-   p4d_populate(NULL, p4dp, pudp);
-   }
-
-   pudp = pud_offset(p4dp, dst_addr);
-   if (pud_none(READ_ONCE(*pudp))) {
-   pmdp = trans_alloc(info);
-   if (!pmdp)
-   return -ENOMEM;
-   pud_populate(NULL, pudp, pmdp);
-   }
-
-   pmdp = pmd_offset(pudp, dst_addr);
-   if (pmd_none(READ_ONCE(*pmdp))) {
-   ptep = trans_alloc(info);
-   if (!ptep)
-   return -ENOMEM;
-   pmd_populate_kernel(NULL, pmdp, ptep);
-   }
-
-   ptep = pte_offset_kernel(pmdp, dst_addr);
-   set_pte(ptep, pfn_pte(virt_to_pfn(page), pgprot));
-
-   return 0;
-}
-
 /*
  * The page we want to idmap may be outside the range covered by VA_BITS that
  * can be built using the kernel's p?d_populate() helpers. As a one off, for a
-- 
2.25.1




[PATCH v13 16/18] arm64: kexec: remove the pre-kexec PoC maintenance

2021-04-07 Thread Pavel Tatashin
Now that kexec does its relocations with the MMU enabled, we no longer
need to clean the relocation data to the PoC.

Co-developed-by: James Morse 
Signed-off-by: Pavel Tatashin 
---
 arch/arm64/kernel/machine_kexec.c | 40 ---
 1 file changed, 40 deletions(-)

diff --git a/arch/arm64/kernel/machine_kexec.c 
b/arch/arm64/kernel/machine_kexec.c
index d5c8aefc66f3..a1c9bee0cddd 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -76,45 +76,6 @@ int machine_kexec_prepare(struct kimage *kimage)
return 0;
 }
 
-/**
- * kexec_list_flush - Helper to flush the kimage list and source pages to PoC.
- */
-static void kexec_list_flush(struct kimage *kimage)
-{
-   kimage_entry_t *entry;
-
-   __flush_dcache_area(kimage, sizeof(*kimage));
-
-   for (entry = &kimage->head; ; entry++) {
-   unsigned int flag;
-   void *addr;
-
-   /* flush the list entries. */
-   __flush_dcache_area(entry, sizeof(kimage_entry_t));
-
-   flag = *entry & IND_FLAGS;
-   if (flag == IND_DONE)
-   break;
-
-   addr = phys_to_virt(*entry & PAGE_MASK);
-
-   switch (flag) {
-   case IND_INDIRECTION:
-   /* Set entry point just before the new list page. */
-   entry = (kimage_entry_t *)addr - 1;
-   break;
-   case IND_SOURCE:
-   /* flush the source pages. */
-   __flush_dcache_area(addr, PAGE_SIZE);
-   break;
-   case IND_DESTINATION:
-   break;
-   default:
-   BUG();
-   }
-   }
-}
-
 /**
  * kexec_segment_flush - Helper to flush the kimage segments to PoC.
  */
@@ -200,7 +161,6 @@ int machine_kexec_post_load(struct kimage *kimage)
__flush_dcache_area(reloc_code, reloc_size);
flush_icache_range((uintptr_t)reloc_code, (uintptr_t)reloc_code +
   reloc_size);
-   kexec_list_flush(kimage);
kexec_image_info(kimage);
 
return 0;
-- 
2.25.1




[PATCH v13 17/18] arm64: kexec: Remove cpu-reset.h

2021-04-07 Thread Pavel Tatashin
This header now contains only cpu_soft_restart(), which is never used
directly anymore. So, remove this header, and rename the underlying
__cpu_soft_restart() helper to cpu_soft_restart().

Suggested-by: James Morse 
Signed-off-by: Pavel Tatashin 
---
 arch/arm64/include/asm/kexec.h|  6 ++
 arch/arm64/kernel/cpu-reset.S |  7 +++
 arch/arm64/kernel/cpu-reset.h | 30 --
 arch/arm64/kernel/machine_kexec.c |  6 ++
 4 files changed, 11 insertions(+), 38 deletions(-)
 delete mode 100644 arch/arm64/kernel/cpu-reset.h

diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
index 5fc87b51f8a9..ee71ae3b93ed 100644
--- a/arch/arm64/include/asm/kexec.h
+++ b/arch/arm64/include/asm/kexec.h
@@ -90,6 +90,12 @@ static inline void crash_prepare_suspend(void) {}
 static inline void crash_post_resume(void) {}
 #endif
 
+#if defined(CONFIG_KEXEC_CORE)
+void cpu_soft_restart(unsigned long el2_switch, unsigned long entry,
+ unsigned long arg0, unsigned long arg1,
+ unsigned long arg2);
+#endif
+
 #define ARCH_HAS_KIMAGE_ARCH
 
 struct kimage_arch {
diff --git a/arch/arm64/kernel/cpu-reset.S b/arch/arm64/kernel/cpu-reset.S
index 37721eb6f9a1..5d47d6c92634 100644
--- a/arch/arm64/kernel/cpu-reset.S
+++ b/arch/arm64/kernel/cpu-reset.S
@@ -16,8 +16,7 @@
 .pushsection.idmap.text, "awx"
 
 /*
- * __cpu_soft_restart(el2_switch, entry, arg0, arg1, arg2) - Helper for
- * cpu_soft_restart.
+ * cpu_soft_restart(el2_switch, entry, arg0, arg1, arg2)
  *
  * @el2_switch: Flag to indicate a switch to EL2 is needed.
  * @entry: Location to jump to for soft reset.
@@ -29,7 +28,7 @@
  * branch to what would be the reset vector. It must be executed with the
  * flat identity mapping.
  */
-SYM_CODE_START(__cpu_soft_restart)
+SYM_CODE_START(cpu_soft_restart)
/* Clear sctlr_el1 flags. */
mrs x12, sctlr_el1
mov_q   x13, SCTLR_ELx_FLAGS
@@ -51,6 +50,6 @@ SYM_CODE_START(__cpu_soft_restart)
mov x1, x3  // arg1
mov x2, x4  // arg2
br  x8
-SYM_CODE_END(__cpu_soft_restart)
+SYM_CODE_END(cpu_soft_restart)
 
 .popsection
diff --git a/arch/arm64/kernel/cpu-reset.h b/arch/arm64/kernel/cpu-reset.h
deleted file mode 100644
index f6d95512fec6..
--- a/arch/arm64/kernel/cpu-reset.h
+++ /dev/null
@@ -1,30 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0-only */
-/*
- * CPU reset routines
- *
- * Copyright (C) 2015 Huawei Futurewei Technologies.
- */
-
-#ifndef _ARM64_CPU_RESET_H
-#define _ARM64_CPU_RESET_H
-
-#include 
-
-void __cpu_soft_restart(unsigned long el2_switch, unsigned long entry,
-   unsigned long arg0, unsigned long arg1, unsigned long arg2);
-
-static inline void __noreturn cpu_soft_restart(unsigned long entry,
-  unsigned long arg0,
-  unsigned long arg1,
-  unsigned long arg2)
-{
-   typeof(__cpu_soft_restart) *restart;
-
-   restart = (void *)__pa_symbol(__cpu_soft_restart);
-
-   cpu_install_idmap();
-   restart(0, entry, arg0, arg1, arg2);
-   unreachable();
-}
-
-#endif
diff --git a/arch/arm64/kernel/machine_kexec.c 
b/arch/arm64/kernel/machine_kexec.c
index a1c9bee0cddd..ef7ba93f2bd6 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -23,8 +23,6 @@
 #include 
 #include 
 
-#include "cpu-reset.h"
-
 /**
  * kexec_image_info - For debugging output.
  */
@@ -197,10 +195,10 @@ void machine_kexec(struct kimage *kimage)
 * In kexec_file case, the kernel starts directly without purgatory.
 */
if (kimage->head & IND_DONE) {
-   typeof(__cpu_soft_restart) *restart;
+   typeof(cpu_soft_restart) *restart;
 
cpu_install_idmap();
-   restart = (void *)__pa_symbol(__cpu_soft_restart);
+   restart = (void *)__pa_symbol(cpu_soft_restart);
restart(is_hyp_callable(), kimage->start, kimage->arch.dtb_mem,
0, 0);
} else {
-- 
2.25.1




[PATCH v13 15/18] arm64: kexec: keep MMU enabled during kexec relocation

2021-04-07 Thread Pavel Tatashin
Now that we have the linear map page tables configured, keep the MMU
enabled to allow faster relocation of segments to their final destination.


Cavium ThunderX2:
Kernel image size: 38M, initramfs size: 46M, total relocation size: 84M
MMU-disabled:
relocation  7.489539915s
MMU-enabled:
relocation  0.03946095s

Broadcom Stingray:
For a moderate size kernel + initramfs (25M) the relocation was taking
0.382s; with the MMU enabled it now takes only 0.019s, a 20x improvement.

The time is proportional to the size of the relocation; therefore, with a
larger initramfs (e.g. 100M) it could take over a second.

Signed-off-by: Pavel Tatashin 
---
 arch/arm64/include/asm/kexec.h  |  3 +++
 arch/arm64/kernel/asm-offsets.c |  1 +
 arch/arm64/kernel/machine_kexec.c   | 16 ++
 arch/arm64/kernel/relocate_kernel.S | 33 +++--
 4 files changed, 38 insertions(+), 15 deletions(-)

diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
index 59ac166daf53..5fc87b51f8a9 100644
--- a/arch/arm64/include/asm/kexec.h
+++ b/arch/arm64/include/asm/kexec.h
@@ -97,8 +97,11 @@ struct kimage_arch {
phys_addr_t dtb_mem;
phys_addr_t kern_reloc;
phys_addr_t el2_vectors;
+   phys_addr_t ttbr0;
phys_addr_t ttbr1;
phys_addr_t zero_page;
+   unsigned long phys_offset;
+   unsigned long t0sz;
/* Core ELF header buffer */
void *elf_headers;
unsigned long elf_headers_mem;
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index 609362b5aa76..ec7bb80aedc8 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -159,6 +159,7 @@ int main(void)
   DEFINE(KIMAGE_ARCH_DTB_MEM,  offsetof(struct kimage, arch.dtb_mem));
   DEFINE(KIMAGE_ARCH_EL2_VECTORS,  offsetof(struct kimage, 
arch.el2_vectors));
   DEFINE(KIMAGE_ARCH_ZERO_PAGE,offsetof(struct kimage, 
arch.zero_page));
+  DEFINE(KIMAGE_ARCH_PHYS_OFFSET,  offsetof(struct kimage, 
arch.phys_offset));
   DEFINE(KIMAGE_ARCH_TTBR1,offsetof(struct kimage, arch.ttbr1));
   DEFINE(KIMAGE_HEAD,  offsetof(struct kimage, head));
   DEFINE(KIMAGE_START, offsetof(struct kimage, start));
diff --git a/arch/arm64/kernel/machine_kexec.c 
b/arch/arm64/kernel/machine_kexec.c
index c875ef522e53..d5c8aefc66f3 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -190,6 +190,11 @@ int machine_kexec_post_load(struct kimage *kimage)
reloc_size = __relocate_new_kernel_end - __relocate_new_kernel_start;
memcpy(reloc_code, __relocate_new_kernel_start, reloc_size);
kimage->arch.kern_reloc = __pa(reloc_code);
+   rc = trans_pgd_idmap_page(&info, &kimage->arch.ttbr0,
+ &kimage->arch.t0sz, reloc_code);
+   if (rc)
+   return rc;
+   kimage->arch.phys_offset = virt_to_phys(kimage) - (long)kimage;
 
/* Flush the reloc_code in preparation for its execution. */
__flush_dcache_area(reloc_code, reloc_size);
@@ -223,9 +228,9 @@ void machine_kexec(struct kimage *kimage)
local_daif_mask();
 
/*
-* Both restart and cpu_soft_restart will shutdown the MMU, disable data
+* Both restart and kernel_reloc will shutdown the MMU, disable data
 * caches. However, restart will start new kernel or purgatory directly,
-* cpu_soft_restart will transfer control to arm64_relocate_new_kernel
+* kernel_reloc contains the body of arm64_relocate_new_kernel
 * In kexec case, kimage->start points to purgatory assuming that
 * kernel entry and dtb address are embedded in purgatory by
 * userspace (kexec-tools).
@@ -239,10 +244,13 @@ void machine_kexec(struct kimage *kimage)
restart(is_hyp_callable(), kimage->start, kimage->arch.dtb_mem,
0, 0);
} else {
+   void (*kernel_reloc)(struct kimage *kimage);
+
if (is_hyp_callable())
__hyp_set_vectors(kimage->arch.el2_vectors);
-   cpu_soft_restart(kimage->arch.kern_reloc,
-virt_to_phys(kimage), 0, 0);
+   cpu_install_ttbr0(kimage->arch.ttbr0, kimage->arch.t0sz);
+   kernel_reloc = (void *)kimage->arch.kern_reloc;
+   kernel_reloc(kimage);
}
 
BUG(); /* Should never get here. */
diff --git a/arch/arm64/kernel/relocate_kernel.S 
b/arch/arm64/kernel/relocate_kernel.S
index e83b6380907d..433a57b3d76e 100644
--- a/arch/arm64/kernel/relocate_kernel.S
+++ b/arch/arm64/kernel/relocate_kernel.S
@@ -4,6 +4,8 @@
  *
  * Copyright (C) Linaro.
  * Copyright (C) Huawei Futurewei Technologies.
+ * Copyright (C) 2020, Microsoft Corporation.
+ * Pavel Tatashin 
  */
 
 #include 
@@ -15,6 +17,15 @@
 #include 
 #include 
 
+.macro turn_off_mmu tmp1, tmp2
+   mrs \tmp1, sctlr_el1
+  

[PATCH v13 13/18] arm64: kexec: use ld script for relocation function

2021-04-07 Thread Pavel Tatashin
Currently, relocation code declares start and end variables
which are used to compute its size.

A better way to do this is to use the ld script instead, and put the
relocation function in its own section.

Signed-off-by: Pavel Tatashin 
---
 arch/arm64/include/asm/sections.h   |  1 +
 arch/arm64/kernel/machine_kexec.c   | 14 ++
 arch/arm64/kernel/relocate_kernel.S | 15 ++-
 arch/arm64/kernel/vmlinux.lds.S | 19 +++
 4 files changed, 28 insertions(+), 21 deletions(-)

diff --git a/arch/arm64/include/asm/sections.h 
b/arch/arm64/include/asm/sections.h
index 2f36b16a5b5d..31e459af89f6 100644
--- a/arch/arm64/include/asm/sections.h
+++ b/arch/arm64/include/asm/sections.h
@@ -20,5 +20,6 @@ extern char __exittext_begin[], __exittext_end[];
 extern char __irqentry_text_start[], __irqentry_text_end[];
 extern char __mmuoff_data_start[], __mmuoff_data_end[];
 extern char __entry_tramp_text_start[], __entry_tramp_text_end[];
+extern char __relocate_new_kernel_start[], __relocate_new_kernel_end[];
 
 #endif /* __ASM_SECTIONS_H */
diff --git a/arch/arm64/kernel/machine_kexec.c 
b/arch/arm64/kernel/machine_kexec.c
index d5940b7889f8..f1451d807708 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -20,14 +20,11 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include "cpu-reset.h"
 
-/* Global variables for the arm64_relocate_new_kernel routine. */
-extern const unsigned char arm64_relocate_new_kernel[];
-extern const unsigned long arm64_relocate_new_kernel_size;
-
 /**
  * kexec_image_info - For debugging output.
  */
@@ -157,6 +154,7 @@ static void *kexec_page_alloc(void *arg)
 int machine_kexec_post_load(struct kimage *kimage)
 {
void *reloc_code = page_to_virt(kimage->control_code_page);
+   long reloc_size;
struct trans_pgd_info info = {
.trans_alloc_page   = kexec_page_alloc,
.trans_alloc_arg= kimage,
@@ -177,14 +175,14 @@ int machine_kexec_post_load(struct kimage *kimage)
return rc;
}
 
-   memcpy(reloc_code, arm64_relocate_new_kernel,
-  arm64_relocate_new_kernel_size);
+   reloc_size = __relocate_new_kernel_end - __relocate_new_kernel_start;
+   memcpy(reloc_code, __relocate_new_kernel_start, reloc_size);
kimage->arch.kern_reloc = __pa(reloc_code);
 
/* Flush the reloc_code in preparation for its execution. */
-   __flush_dcache_area(reloc_code, arm64_relocate_new_kernel_size);
+   __flush_dcache_area(reloc_code, reloc_size);
flush_icache_range((uintptr_t)reloc_code, (uintptr_t)reloc_code +
-  arm64_relocate_new_kernel_size);
+  reloc_size);
kexec_list_flush(kimage);
kexec_image_info(kimage);
 
diff --git a/arch/arm64/kernel/relocate_kernel.S 
b/arch/arm64/kernel/relocate_kernel.S
index df023b82544b..7a600ba33ae1 100644
--- a/arch/arm64/kernel/relocate_kernel.S
+++ b/arch/arm64/kernel/relocate_kernel.S
@@ -15,6 +15,7 @@
 #include 
 #include 
 
+.pushsection".kexec_relocate.text", "ax"
 /*
  * arm64_relocate_new_kernel - Put a 2nd stage image in place and boot it.
  *
@@ -77,16 +78,4 @@ SYM_CODE_START(arm64_relocate_new_kernel)
mov x3, xzr
br  x4  /* Jumps from el1 */
 SYM_CODE_END(arm64_relocate_new_kernel)
-
-.align 3   /* To keep the 64-bit values below naturally aligned. */
-
-.Lcopy_end:
-.org   KEXEC_CONTROL_PAGE_SIZE
-
-/*
- * arm64_relocate_new_kernel_size - Number of bytes to copy to the
- * control_code_page.
- */
-.globl arm64_relocate_new_kernel_size
-arm64_relocate_new_kernel_size:
-   .quad   .Lcopy_end - arm64_relocate_new_kernel
+.popsection
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 7eea7888bb02..0d9d5e6af66f 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -12,6 +12,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -92,6 +93,16 @@ jiffies = jiffies_64;
 #define HIBERNATE_TEXT
 #endif
 
+#ifdef CONFIG_KEXEC_CORE
+#define KEXEC_TEXT \
+   . = ALIGN(SZ_4K);   \
+   __relocate_new_kernel_start = .;\
+   *(.kexec_relocate.text) \
+   __relocate_new_kernel_end = .;
+#else
+#define KEXEC_TEXT
+#endif
+
 #ifdef CONFIG_UNMAP_KERNEL_AT_EL0
 #define TRAMP_TEXT \
. = ALIGN(PAGE_SIZE);   \
@@ -152,6 +163,7 @@ SECTIONS
HYPERVISOR_TEXT
IDMAP_TEXT
HIBERNATE_TEXT
+   KEXEC_TEXT
TRAMP_TEXT
*(.fixup)
*(.gnu.warning)
@@ -336,3 +348,10 @@ ASSERT(swapper_pg_dir - reserved_pg_dir == 

[PATCH v13 14/18] arm64: kexec: install a copy of the linear-map

2021-04-07 Thread Pavel Tatashin
To perform the kexec relocations with the MMU enabled, we need a copy
of the linear map.

Create one, and install it from the relocation code. This has to be done
from the assembly code as it will be idmapped with TTBR0. The kernel
runs in TTBR1, so it can't use the break-before-make sequence on the mapping
it is executing from.

This makes no difference yet as the relocation code runs with the MMU
disabled.

Co-developed-by: James Morse 
Signed-off-by: Pavel Tatashin 
---
 arch/arm64/include/asm/assembler.h  | 19 +++
 arch/arm64/include/asm/kexec.h  |  2 ++
 arch/arm64/kernel/asm-offsets.c |  2 ++
 arch/arm64/kernel/hibernate-asm.S   | 20 
 arch/arm64/kernel/machine_kexec.c   | 16 ++--
 arch/arm64/kernel/relocate_kernel.S |  3 +++
 6 files changed, 40 insertions(+), 22 deletions(-)

diff --git a/arch/arm64/include/asm/assembler.h 
b/arch/arm64/include/asm/assembler.h
index 29061b76aab6..3ce8131ad660 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -425,6 +425,25 @@ USER(\label, icivau, \tmp2)// 
invalidate I line PoU
isb
.endm
 
+/*
+ * To prevent the possibility of old and new partial table walks being visible
+ * in the tlb, switch the ttbr to a zero page when we invalidate the old
+ * records. D4.7.1 'General TLB maintenance requirements' in ARM DDI 0487A.i
+ * Even switching to our copied tables will cause a changed output address at
+ * each stage of the walk.
+ */
+   .macro break_before_make_ttbr_switch zero_page, page_table, tmp, tmp2
+   phys_to_ttbr \tmp, \zero_page
+   msr ttbr1_el1, \tmp
+   isb
+   tlbivmalle1
+   dsb nsh
+   phys_to_ttbr \tmp, \page_table
+   offset_ttbr1 \tmp, \tmp2
+   msr ttbr1_el1, \tmp
+   isb
+   .endm
+
 /*
  * reset_pmuserenr_el0 - reset PMUSERENR_EL0 if PMUv3 present
  */
diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
index 305cf0840ed3..59ac166daf53 100644
--- a/arch/arm64/include/asm/kexec.h
+++ b/arch/arm64/include/asm/kexec.h
@@ -97,6 +97,8 @@ struct kimage_arch {
phys_addr_t dtb_mem;
phys_addr_t kern_reloc;
phys_addr_t el2_vectors;
+   phys_addr_t ttbr1;
+   phys_addr_t zero_page;
/* Core ELF header buffer */
void *elf_headers;
unsigned long elf_headers_mem;
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index 2e3278df1fc3..609362b5aa76 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -158,6 +158,8 @@ int main(void)
 #ifdef CONFIG_KEXEC_CORE
   DEFINE(KIMAGE_ARCH_DTB_MEM,  offsetof(struct kimage, arch.dtb_mem));
   DEFINE(KIMAGE_ARCH_EL2_VECTORS,  offsetof(struct kimage, 
arch.el2_vectors));
+  DEFINE(KIMAGE_ARCH_ZERO_PAGE,offsetof(struct kimage, 
arch.zero_page));
+  DEFINE(KIMAGE_ARCH_TTBR1,offsetof(struct kimage, arch.ttbr1));
   DEFINE(KIMAGE_HEAD,  offsetof(struct kimage, head));
   DEFINE(KIMAGE_START, offsetof(struct kimage, start));
   BLANK();
diff --git a/arch/arm64/kernel/hibernate-asm.S 
b/arch/arm64/kernel/hibernate-asm.S
index 8ccca660034e..a31e621ba867 100644
--- a/arch/arm64/kernel/hibernate-asm.S
+++ b/arch/arm64/kernel/hibernate-asm.S
@@ -15,26 +15,6 @@
 #include 
 #include 
 
-/*
- * To prevent the possibility of old and new partial table walks being visible
- * in the tlb, switch the ttbr to a zero page when we invalidate the old
- * records. D4.7.1 'General TLB maintenance requirements' in ARM DDI 0487A.i
- * Even switching to our copied tables will cause a changed output address at
- * each stage of the walk.
- */
-.macro break_before_make_ttbr_switch zero_page, page_table, tmp, tmp2
-   phys_to_ttbr \tmp, \zero_page
-   msr ttbr1_el1, \tmp
-   isb
-   tlbivmalle1
-   dsb nsh
-   phys_to_ttbr \tmp, \page_table
-   offset_ttbr1 \tmp, \tmp2
-   msr ttbr1_el1, \tmp
-   isb
-.endm
-
-
 /*
  * Resume from hibernate
  *
diff --git a/arch/arm64/kernel/machine_kexec.c 
b/arch/arm64/kernel/machine_kexec.c
index f1451d807708..c875ef522e53 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -153,6 +153,8 @@ static void *kexec_page_alloc(void *arg)
 
 int machine_kexec_post_load(struct kimage *kimage)
 {
+   int rc;
+   pgd_t *trans_pgd;
void *reloc_code = page_to_virt(kimage->control_code_page);
long reloc_size;
struct trans_pgd_info info = {
@@ -169,12 +171,22 @@ int machine_kexec_post_load(struct kimage *kimage)
 
kimage->arch.el2_vectors = 0;
if (is_hyp_callable()) {
-   int rc = trans_pgd_copy_el2_vectors(&info,
-   &kimage->arch.el2_vectors);
+   rc = trans_pgd_copy_el2_vectors(&info,
+   

[PATCH v13 11/18] arm64: kexec: kexec may require EL2 vectors

2021-04-07 Thread Pavel Tatashin
If we have an EL2 mode without VHE, the EL2 vectors are needed in order
to switch to EL2 and jump to the new world with hypervisor privileges.

In preparation for MMU enabled relocation, configure our EL2 table now.

Suggested-by: James Morse 
Signed-off-by: Pavel Tatashin 
---
 arch/arm64/Kconfig|  2 +-
 arch/arm64/include/asm/kexec.h|  1 +
 arch/arm64/kernel/asm-offsets.c   |  1 +
 arch/arm64/kernel/machine_kexec.c | 31 +++
 4 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index e4e1b6550115..0e876d980a1f 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -1149,7 +1149,7 @@ config CRASH_DUMP
 
 config TRANS_TABLE
def_bool y
-   depends on HIBERNATION
+   depends on HIBERNATION || KEXEC_CORE
 
 config XEN_DOM0
def_bool y
diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
index 9befcd87e9a8..305cf0840ed3 100644
--- a/arch/arm64/include/asm/kexec.h
+++ b/arch/arm64/include/asm/kexec.h
@@ -96,6 +96,7 @@ struct kimage_arch {
void *dtb;
phys_addr_t dtb_mem;
phys_addr_t kern_reloc;
+   phys_addr_t el2_vectors;
/* Core ELF header buffer */
void *elf_headers;
unsigned long elf_headers_mem;
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index 0c92e193f866..2e3278df1fc3 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -157,6 +157,7 @@ int main(void)
 #endif
 #ifdef CONFIG_KEXEC_CORE
   DEFINE(KIMAGE_ARCH_DTB_MEM,  offsetof(struct kimage, arch.dtb_mem));
+  DEFINE(KIMAGE_ARCH_EL2_VECTORS,  offsetof(struct kimage, 
arch.el2_vectors));
   DEFINE(KIMAGE_HEAD,  offsetof(struct kimage, head));
   DEFINE(KIMAGE_START, offsetof(struct kimage, start));
   BLANK();
diff --git a/arch/arm64/kernel/machine_kexec.c 
b/arch/arm64/kernel/machine_kexec.c
index 2e734e4ae12e..fb03b6676fb9 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -20,6 +20,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "cpu-reset.h"
 
@@ -42,7 +43,9 @@ static void _kexec_image_info(const char *func, int line,
pr_debug("start:   %lx\n", kimage->start);
pr_debug("head:%lx\n", kimage->head);
pr_debug("nr_segments: %lu\n", kimage->nr_segments);
+   pr_debug("dtb_mem: %pa\n", &kimage->arch.dtb_mem);
pr_debug("kern_reloc: %pa\n", &kimage->arch.kern_reloc);
+   pr_debug("el2_vectors: %pa\n", &kimage->arch.el2_vectors);
 
for (i = 0; i < kimage->nr_segments; i++) {
pr_debug("  segment[%lu]: %016lx - %016lx, 0x%lx bytes, %lu 
pages\n",
@@ -137,9 +140,27 @@ static void kexec_segment_flush(const struct kimage 
*kimage)
}
 }
 
+/* Allocates pages for kexec page table */
+static void *kexec_page_alloc(void *arg)
+{
+   struct kimage *kimage = (struct kimage *)arg;
+   struct page *page = kimage_alloc_control_pages(kimage, 0);
+
+   if (!page)
+   return NULL;
+
+   memset(page_address(page), 0, PAGE_SIZE);
+
+   return page_address(page);
+}
+
 int machine_kexec_post_load(struct kimage *kimage)
 {
void *reloc_code = page_to_virt(kimage->control_code_page);
+   struct trans_pgd_info info = {
+   .trans_alloc_page   = kexec_page_alloc,
+   .trans_alloc_arg= kimage,
+   };
 
/* If in place, relocation is not used, only flush next kernel */
if (kimage->head & IND_DONE) {
@@ -148,6 +169,14 @@ int machine_kexec_post_load(struct kimage *kimage)
return 0;
}
 
+   kimage->arch.el2_vectors = 0;
+   if (is_hyp_callable()) {
+   int rc = trans_pgd_copy_el2_vectors(&info,
+   &kimage->arch.el2_vectors);
+   if (rc)
+   return rc;
+   }
+
memcpy(reloc_code, arm64_relocate_new_kernel,
   arm64_relocate_new_kernel_size);
kimage->arch.kern_reloc = __pa(reloc_code);
@@ -200,6 +229,8 @@ void machine_kexec(struct kimage *kimage)
restart(is_hyp_callable(), kimage->start, kimage->arch.dtb_mem,
0, 0);
} else {
+   if (is_hyp_callable())
+   __hyp_set_vectors(kimage->arch.el2_vectors);
cpu_soft_restart(kimage->arch.kern_reloc, virt_to_phys(kimage),
 0, 0);
}
-- 
2.25.1




[PATCH v13 12/18] arm64: kexec: relocate in EL1 mode

2021-04-07 Thread Pavel Tatashin
Since we are going to keep the MMU enabled during relocation, we need to
stay in EL1 throughout the relocation.

Keep EL1 enabled, and switch to EL2 only before entering the new world.

Suggested-by: James Morse 
Signed-off-by: Pavel Tatashin 
---
 arch/arm64/kernel/cpu-reset.h   |  3 +--
 arch/arm64/kernel/machine_kexec.c   |  4 ++--
 arch/arm64/kernel/relocate_kernel.S | 13 +++--
 3 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/kernel/cpu-reset.h b/arch/arm64/kernel/cpu-reset.h
index 1922e7a690f8..f6d95512fec6 100644
--- a/arch/arm64/kernel/cpu-reset.h
+++ b/arch/arm64/kernel/cpu-reset.h
@@ -20,11 +20,10 @@ static inline void __noreturn cpu_soft_restart(unsigned 
long entry,
 {
typeof(__cpu_soft_restart) *restart;
 
-   unsigned long el2_switch = is_hyp_callable();
restart = (void *)__pa_symbol(__cpu_soft_restart);
 
cpu_install_idmap();
-   restart(el2_switch, entry, arg0, arg1, arg2);
+   restart(0, entry, arg0, arg1, arg2);
unreachable();
 }
 
diff --git a/arch/arm64/kernel/machine_kexec.c 
b/arch/arm64/kernel/machine_kexec.c
index fb03b6676fb9..d5940b7889f8 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -231,8 +231,8 @@ void machine_kexec(struct kimage *kimage)
} else {
if (is_hyp_callable())
__hyp_set_vectors(kimage->arch.el2_vectors);
-   cpu_soft_restart(kimage->arch.kern_reloc, virt_to_phys(kimage),
-0, 0);
+   cpu_soft_restart(kimage->arch.kern_reloc,
+virt_to_phys(kimage), 0, 0);
}
 
BUG(); /* Should never get here. */
diff --git a/arch/arm64/kernel/relocate_kernel.S 
b/arch/arm64/kernel/relocate_kernel.S
index 36b4496524c3..df023b82544b 100644
--- a/arch/arm64/kernel/relocate_kernel.S
+++ b/arch/arm64/kernel/relocate_kernel.S
@@ -13,6 +13,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /*
  * arm64_relocate_new_kernel - Put a 2nd stage image in place and boot it.
@@ -61,12 +62,20 @@ SYM_CODE_START(arm64_relocate_new_kernel)
isb
 
/* Start new image. */
+   ldr x1, [x0, #KIMAGE_ARCH_EL2_VECTORS]  /* relocation start */
+   cbz x1, .Lel1
+   ldr x1, [x0, #KIMAGE_START] /* relocation start */
+   ldr x2, [x0, #KIMAGE_ARCH_DTB_MEM]  /* dtb address */
+   mov x3, xzr
+   mov x4, xzr
+   mov x0, #HVC_SOFT_RESTART
+   hvc #0  /* Jumps from el2 */
+.Lel1:
ldr x4, [x0, #KIMAGE_START] /* relocation start */
ldr x0, [x0, #KIMAGE_ARCH_DTB_MEM]  /* dtb address */
-   mov x1, xzr
mov x2, xzr
mov x3, xzr
-   br  x4
+   br  x4  /* Jumps from el1 */
 SYM_CODE_END(arm64_relocate_new_kernel)
 
 .align 3   /* To keep the 64-bit values below naturally aligned. */
-- 
2.25.1




[PATCH v13 07/18] arm64: kexec: flush image and lists during kexec load time

2021-04-07 Thread Pavel Tatashin
Currently, during kexec load we copy the relocation function and
flush it. However, we can also flush the kexec relocation buffers at
this point, and if the new kernel image is already in place (i.e. a
crash kernel), we can also flush the new kernel image itself.

Signed-off-by: Pavel Tatashin 
---
 arch/arm64/kernel/machine_kexec.c | 49 +++
 1 file changed, 23 insertions(+), 26 deletions(-)

diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
index 90a335c74442..3a034bc25709 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -59,23 +59,6 @@ void machine_kexec_cleanup(struct kimage *kimage)
/* Empty routine needed to avoid build errors. */
 }
 
-int machine_kexec_post_load(struct kimage *kimage)
-{
-   void *reloc_code = page_to_virt(kimage->control_code_page);
-
-   memcpy(reloc_code, arm64_relocate_new_kernel,
-  arm64_relocate_new_kernel_size);
-   kimage->arch.kern_reloc = __pa(reloc_code);
-   kexec_image_info(kimage);
-
-   /* Flush the reloc_code in preparation for its execution. */
-   __flush_dcache_area(reloc_code, arm64_relocate_new_kernel_size);
-   flush_icache_range((uintptr_t)reloc_code, (uintptr_t)reloc_code +
-  arm64_relocate_new_kernel_size);
-
-   return 0;
-}
-
 /**
  * machine_kexec_prepare - Prepare for a kexec reboot.
  *
@@ -152,6 +135,29 @@ static void kexec_segment_flush(const struct kimage *kimage)
}
 }
 
+int machine_kexec_post_load(struct kimage *kimage)
+{
+   void *reloc_code = page_to_virt(kimage->control_code_page);
+
+   /* If in place flush new kernel image, else flush lists and buffers */
+   if (kimage->head & IND_DONE)
+   kexec_segment_flush(kimage);
+   else
+   kexec_list_flush(kimage);
+
+   memcpy(reloc_code, arm64_relocate_new_kernel,
+  arm64_relocate_new_kernel_size);
+   kimage->arch.kern_reloc = __pa(reloc_code);
+   kexec_image_info(kimage);
+
+   /* Flush the reloc_code in preparation for its execution. */
+   __flush_dcache_area(reloc_code, arm64_relocate_new_kernel_size);
+   flush_icache_range((uintptr_t)reloc_code, (uintptr_t)reloc_code +
+  arm64_relocate_new_kernel_size);
+
+   return 0;
+}
+
 /**
  * machine_kexec - Do the kexec reboot.
  *
@@ -169,13 +175,6 @@ void machine_kexec(struct kimage *kimage)
WARN(in_kexec_crash && (stuck_cpus || smp_crash_stop_failed()),
"Some CPUs may be stale, kdump will be unreliable.\n");
 
-   /* Flush the kimage list and its buffers. */
-   kexec_list_flush(kimage);
-
-   /* Flush the new image if already in place. */
-   if ((kimage != kexec_crash_image) && (kimage->head & IND_DONE))
-   kexec_segment_flush(kimage);
-
pr_info("Bye!\n");
 
local_daif_mask();
@@ -250,8 +249,6 @@ void arch_kexec_protect_crashkres(void)
 {
int i;
 
-   kexec_segment_flush(kexec_crash_image);
-
for (i = 0; i < kexec_crash_image->nr_segments; i++)
set_memory_valid(
__phys_to_virt(kexec_crash_image->segment[i].mem),
-- 
2.25.1




[PATCH v13 10/18] arm64: kexec: pass kimage as the only argument to relocation function

2021-04-07 Thread Pavel Tatashin
Currently, kexec relocation function (arm64_relocate_new_kernel) accepts
the following arguments:

head:   start of array that contains relocation information.
entry:  entry point for new kernel or purgatory.
dtb_mem:first and only argument to entry.

The number of arguments cannot be easily expanded, because this
function is also called from HVC_SOFT_RESTART, which preserves only
three arguments. Also, arm64_relocate_new_kernel is written in
assembly and called without a stack, so there is no place to spill
extra arguments into free registers.

Soon, we will need to pass more arguments: once we enable the MMU, we
will need to pass information about the page tables.

Pass kimage to arm64_relocate_new_kernel, and teach it to get the
required fields from kimage.
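The constants the assembly uses to index into kimage (KIMAGE_HEAD and
friends) come from asm-offsets.c. A minimal stand-alone C sketch of that
mechanism, using a simplified stand-in struct rather than the real
kernel kimage layout, looks like:

```c
#include <stddef.h>

/* Simplified stand-in for struct kimage -- the field set and layout
 * here are illustrative only, not the real kernel definition. */
struct kimage_sketch {
	unsigned long head;              /* relocation list head */
	unsigned long start;             /* entry point of the new kernel */
	struct {
		unsigned long dtb_mem;   /* physical address of the dtb */
	} arch;
};

/* asm-offsets.c turns offsetof() values like these into assembler
 * constants, which lets assembly load fields with a single base
 * register, e.g. "ldr x16, [x0, #KIMAGE_HEAD]". */
enum {
	KIMAGE_HEAD         = offsetof(struct kimage_sketch, head),
	KIMAGE_START        = offsetof(struct kimage_sketch, start),
	KIMAGE_ARCH_DTB_MEM = offsetof(struct kimage_sketch, arch.dtb_mem),
};
```

This is why passing the kimage pointer scales: adding a field costs one
new DEFINE() in asm-offsets.c instead of another scarce register
argument.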

Suggested-by: James Morse 
Signed-off-by: Pavel Tatashin 
---
 arch/arm64/kernel/asm-offsets.c |  7 +++
 arch/arm64/kernel/machine_kexec.c   |  6 --
 arch/arm64/kernel/relocate_kernel.S | 10 --
 3 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index a36e2fc330d4..0c92e193f866 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -9,6 +9,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -153,6 +154,12 @@ int main(void)
  DEFINE(PTRAUTH_USER_KEY_APGA,		offsetof(struct ptrauth_keys_user, apga));
  DEFINE(PTRAUTH_KERNEL_KEY_APIA,	offsetof(struct ptrauth_keys_kernel, apia));
   BLANK();
+#endif
+#ifdef CONFIG_KEXEC_CORE
+  DEFINE(KIMAGE_ARCH_DTB_MEM,  offsetof(struct kimage, arch.dtb_mem));
+  DEFINE(KIMAGE_HEAD,  offsetof(struct kimage, head));
+  DEFINE(KIMAGE_START, offsetof(struct kimage, start));
+  BLANK();
 #endif
   return 0;
 }
diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
index b150b65f0b84..2e734e4ae12e 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -83,6 +83,8 @@ static void kexec_list_flush(struct kimage *kimage)
 {
kimage_entry_t *entry;
 
+   __flush_dcache_area(kimage, sizeof(*kimage));
+
for (entry = >head; ; entry++) {
unsigned int flag;
void *addr;
@@ -198,8 +200,8 @@ void machine_kexec(struct kimage *kimage)
restart(is_hyp_callable(), kimage->start, kimage->arch.dtb_mem,
0, 0);
} else {
-   cpu_soft_restart(kimage->arch.kern_reloc, kimage->head,
-kimage->start, kimage->arch.dtb_mem);
+   cpu_soft_restart(kimage->arch.kern_reloc, virt_to_phys(kimage),
+0, 0);
}
 
BUG(); /* Should never get here. */
diff --git a/arch/arm64/kernel/relocate_kernel.S b/arch/arm64/kernel/relocate_kernel.S
index 718037bef560..36b4496524c3 100644
--- a/arch/arm64/kernel/relocate_kernel.S
+++ b/arch/arm64/kernel/relocate_kernel.S
@@ -27,9 +27,7 @@
  */
 SYM_CODE_START(arm64_relocate_new_kernel)
/* Setup the list loop variables. */
-   mov x18, x2 /* x18 = dtb address */
-   mov x17, x1 /* x17 = kimage_start */
-   mov x16, x0 /* x16 = kimage_head */
+   ldr x16, [x0, #KIMAGE_HEAD] /* x16 = kimage_head */
mov x14, xzr/* x14 = entry ptr */
mov x13, xzr/* x13 = copy dest */
raw_dcache_line_size x15, x1/* x15 = dcache line size */
@@ -63,12 +61,12 @@ SYM_CODE_START(arm64_relocate_new_kernel)
isb
 
/* Start new image. */
-   mov x0, x18
+   ldr x4, [x0, #KIMAGE_START] /* relocation start */
+   ldr x0, [x0, #KIMAGE_ARCH_DTB_MEM]  /* dtb address */
mov x1, xzr
mov x2, xzr
mov x3, xzr
-   br  x17
-
+   br  x4
 SYM_CODE_END(arm64_relocate_new_kernel)
 
 .align 3   /* To keep the 64-bit values below naturally aligned. */
-- 
2.25.1




[PATCH v13 08/18] arm64: kexec: skip relocation code for inplace kexec

2021-04-07 Thread Pavel Tatashin
In the case of kdump, or when segments are already in place, relocation
is not needed; therefore, the setup of the relocation function and the
call to it can be skipped.

Signed-off-by: Pavel Tatashin 
Suggested-by: James Morse 
---
 arch/arm64/kernel/machine_kexec.c   | 34 ++---
 arch/arm64/kernel/relocate_kernel.S |  3 ---
 2 files changed, 21 insertions(+), 16 deletions(-)

diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
index 3a034bc25709..b150b65f0b84 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -139,21 +139,23 @@ int machine_kexec_post_load(struct kimage *kimage)
 {
void *reloc_code = page_to_virt(kimage->control_code_page);
 
-   /* If in place flush new kernel image, else flush lists and buffers */
-   if (kimage->head & IND_DONE)
+   /* If in place, relocation is not used, only flush next kernel */
+   if (kimage->head & IND_DONE) {
kexec_segment_flush(kimage);
-   else
-   kexec_list_flush(kimage);
+   kexec_image_info(kimage);
+   return 0;
+   }
 
memcpy(reloc_code, arm64_relocate_new_kernel,
   arm64_relocate_new_kernel_size);
kimage->arch.kern_reloc = __pa(reloc_code);
-   kexec_image_info(kimage);
 
/* Flush the reloc_code in preparation for its execution. */
__flush_dcache_area(reloc_code, arm64_relocate_new_kernel_size);
flush_icache_range((uintptr_t)reloc_code, (uintptr_t)reloc_code +
   arm64_relocate_new_kernel_size);
+   kexec_list_flush(kimage);
+   kexec_image_info(kimage);
 
return 0;
 }
@@ -180,19 +182,25 @@ void machine_kexec(struct kimage *kimage)
local_daif_mask();
 
/*
-* cpu_soft_restart will shutdown the MMU, disable data caches, then
-* transfer control to the kern_reloc which contains a copy of
-* the arm64_relocate_new_kernel routine.  arm64_relocate_new_kernel
-* uses physical addressing to relocate the new image to its final
-* position and transfers control to the image entry point when the
-* relocation is complete.
+* Both restart and cpu_soft_restart will shutdown the MMU, disable data
+* caches. However, restart will start new kernel or purgatory directly,
+* cpu_soft_restart will transfer control to arm64_relocate_new_kernel
 * In kexec case, kimage->start points to purgatory assuming that
 * kernel entry and dtb address are embedded in purgatory by
 * userspace (kexec-tools).
 * In kexec_file case, the kernel starts directly without purgatory.
 */
-   cpu_soft_restart(kimage->arch.kern_reloc, kimage->head, kimage->start,
-kimage->arch.dtb_mem);
+   if (kimage->head & IND_DONE) {
+   typeof(__cpu_soft_restart) *restart;
+
+   cpu_install_idmap();
+   restart = (void *)__pa_symbol(__cpu_soft_restart);
+   restart(is_hyp_callable(), kimage->start, kimage->arch.dtb_mem,
+   0, 0);
+   } else {
+   cpu_soft_restart(kimage->arch.kern_reloc, kimage->head,
+kimage->start, kimage->arch.dtb_mem);
+   }
 
BUG(); /* Should never get here. */
 }
diff --git a/arch/arm64/kernel/relocate_kernel.S b/arch/arm64/kernel/relocate_kernel.S
index b78ea5de97a4..8058fabe0a76 100644
--- a/arch/arm64/kernel/relocate_kernel.S
+++ b/arch/arm64/kernel/relocate_kernel.S
@@ -32,8 +32,6 @@ SYM_CODE_START(arm64_relocate_new_kernel)
mov x16, x0 /* x16 = kimage_head */
mov x14, xzr/* x14 = entry ptr */
mov x13, xzr/* x13 = copy dest */
-   /* Check if the new image needs relocation. */
-   tbnzx16, IND_DONE_BIT, .Ldone
raw_dcache_line_size x15, x1/* x15 = dcache line size */
 .Lloop:
and x12, x16, PAGE_MASK /* x12 = addr */
@@ -65,7 +63,6 @@ SYM_CODE_START(arm64_relocate_new_kernel)
 .Lnext:
ldr x16, [x14], #8  /* entry = *ptr++ */
tbz x16, IND_DONE_BIT, .Lloop   /* while (!(entry & DONE)) */
-.Ldone:
/* wait for writes from copy_page to finish */
dsb nsh
ic  iallu
-- 
2.25.1




[PATCH v13 09/18] arm64: kexec: Use dcache ops macros instead of open-coding

2021-04-07 Thread Pavel Tatashin
From: James Morse 

kexec does dcache maintenance when it re-writes all memory. Our
dcache_by_line_op macro depends on reading the sanitised DminLine
from memory. Kexec may have overwritten this, so kexec open-codes
the sequence.

dcache_by_line_op is a whole set of macros; it uses dcache_line_size,
which uses read_ctr for the sanitised DminLine. Reading the DminLine
is the first thing dcache_by_line_op does.

Rename dcache_by_line_op to dcache_by_myline_op and take DminLine as
an argument. Kexec can now use the slightly smaller macro.

This makes up-coming changes to the dcache maintenance easier on
the eye.

Code generated by the existing callers is unchanged.
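The line-by-line loop that both macros implement can be modelled in C.
The sketch below (function name hypothetical) does no real cache
maintenance; it only computes how many per-line ops the loop issues,
which is enough to show the round-down-and-step structure:

```c
#include <stdint.h>
#include <stddef.h>

/* Model of the dcache_by_myline_op loop: round kaddr down to a cache
 * line boundary, then issue one maintenance op per line until the
 * range [kaddr, kaddr + size) is covered. Returns the number of ops
 * (dc instructions) that would be issued. */
size_t dcache_lines_touched(uintptr_t kaddr, size_t size, size_t linesz)
{
	uintptr_t end = kaddr + size;
	size_t n = 0;

	kaddr &= ~(uintptr_t)(linesz - 1);	/* "bic \kaddr, \kaddr, \tmp2" */
	for (; kaddr < end; kaddr += linesz)	/* "dc <op>; add; cmp; b.lo" */
		n++;
	return n;
}
```

Note that an unaligned start address can add one extra line at the
front of the range, which is why the macro aligns before looping.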

Signed-off-by: James Morse 

[Fixed merging issues]

Signed-off-by: Pavel Tatashin 
---
 arch/arm64/include/asm/assembler.h  | 12 
 arch/arm64/kernel/relocate_kernel.S | 13 +++--
 2 files changed, 11 insertions(+), 14 deletions(-)

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index ca31594d3d6c..29061b76aab6 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -371,10 +371,9 @@ alternative_else
 alternative_endif
.endm
 
-   .macro dcache_by_line_op op, domain, kaddr, size, tmp1, tmp2
-   dcache_line_size \tmp1, \tmp2
+   .macro dcache_by_myline_op op, domain, kaddr, size, linesz, tmp2
add \size, \kaddr, \size
-   sub \tmp2, \tmp1, #1
+   sub \tmp2, \linesz, #1
bic \kaddr, \kaddr, \tmp2
 9998:
.ifc\op, cvau
@@ -394,12 +393,17 @@ alternative_endif
.endif
.endif
.endif
-   add \kaddr, \kaddr, \tmp1
+   add \kaddr, \kaddr, \linesz
cmp \kaddr, \size
b.lo9998b
dsb \domain
.endm
 
+   .macro dcache_by_line_op op, domain, kaddr, size, tmp1, tmp2
+   dcache_line_size \tmp1, \tmp2
+   dcache_by_myline_op \op, \domain, \kaddr, \size, \tmp1, \tmp2
+   .endm
+
 /*
  * Macro to perform an instruction cache maintenance for the interval
  * [start, end)
diff --git a/arch/arm64/kernel/relocate_kernel.S b/arch/arm64/kernel/relocate_kernel.S
index 8058fabe0a76..718037bef560 100644
--- a/arch/arm64/kernel/relocate_kernel.S
+++ b/arch/arm64/kernel/relocate_kernel.S
@@ -41,16 +41,9 @@ SYM_CODE_START(arm64_relocate_new_kernel)
tbz x16, IND_SOURCE_BIT, .Ltest_indirection
 
/* Invalidate dest page to PoC. */
-   mov x2, x13
-   add x20, x2, #PAGE_SIZE
-   sub x1, x15, #1
-   bic x2, x2, x1
-2: dc  ivac, x2
-   add x2, x2, x15
-   cmp x2, x20
-   b.lo2b
-   dsb sy
-
+   mov x2, x13
+   mov x1, #PAGE_SIZE
+   dcache_by_myline_op ivac, sy, x2, x1, x15, x20
copy_page x13, x12, x1, x2, x3, x4, x5, x6, x7, x8
b   .Lnext
 .Ltest_indirection:
-- 
2.25.1




[PATCH v13 05/18] arm64: trans_pgd: hibernate: Add trans_pgd_copy_el2_vectors

2021-04-07 Thread Pavel Tatashin
Users of trans_pgd may also need a copy of the vector table, because
it may be overwritten whenever the linear map can be overwritten.

Move the setup of the EL2 vectors from hibernate to trans_pgd, so it
can later be shared with kexec as well.

Suggested-by: James Morse 
Signed-off-by: Pavel Tatashin 
---
 arch/arm64/include/asm/trans_pgd.h |  3 +++
 arch/arm64/include/asm/virt.h  |  3 +++
 arch/arm64/kernel/hibernate.c  | 28 ++--
 arch/arm64/mm/trans_pgd.c  | 20 
 4 files changed, 36 insertions(+), 18 deletions(-)

diff --git a/arch/arm64/include/asm/trans_pgd.h b/arch/arm64/include/asm/trans_pgd.h
index 5d08e5adf3d5..e0760e52d36d 100644
--- a/arch/arm64/include/asm/trans_pgd.h
+++ b/arch/arm64/include/asm/trans_pgd.h
@@ -36,4 +36,7 @@ int trans_pgd_map_page(struct trans_pgd_info *info, pgd_t *trans_pgd,
 int trans_pgd_idmap_page(struct trans_pgd_info *info, phys_addr_t *trans_ttbr0,
 unsigned long *t0sz, void *page);
 
+int trans_pgd_copy_el2_vectors(struct trans_pgd_info *info,
+  phys_addr_t *el2_vectors);
+
 #endif /* _ASM_TRANS_TABLE_H */
diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h
index 4216c8623538..bfbb66018114 100644
--- a/arch/arm64/include/asm/virt.h
+++ b/arch/arm64/include/asm/virt.h
@@ -67,6 +67,9 @@
  */
 extern u32 __boot_cpu_mode[2];
 
+extern char __hyp_stub_vectors[];
+#define ARM64_VECTOR_TABLE_LEN SZ_2K
+
 void __hyp_set_vectors(phys_addr_t phys_vector_base);
 void __hyp_reset_vectors(void);
 
diff --git a/arch/arm64/kernel/hibernate.c b/arch/arm64/kernel/hibernate.c
index c764574a1acb..0b8bad8bb6eb 100644
--- a/arch/arm64/kernel/hibernate.c
+++ b/arch/arm64/kernel/hibernate.c
@@ -48,12 +48,6 @@
  */
 extern int in_suspend;
 
-/* temporary el2 vectors in the __hibernate_exit_text section. */
-extern char hibernate_el2_vectors[];
-
-/* hyp-stub vectors, used to restore el2 during resume from hibernate. */
-extern char __hyp_stub_vectors[];
-
 /*
  * The logical cpu number we should resume on, initialised to a non-cpu
  * number.
@@ -428,6 +422,7 @@ int swsusp_arch_resume(void)
void *zero_page;
size_t exit_size;
pgd_t *tmp_pg_dir;
+   phys_addr_t el2_vectors;
void __noreturn (*hibernate_exit)(phys_addr_t, phys_addr_t, void *,
  void *, phys_addr_t, phys_addr_t);
struct trans_pgd_info trans_info = {
@@ -455,6 +450,14 @@ int swsusp_arch_resume(void)
return -ENOMEM;
}
 
+   if (is_hyp_callable()) {
+   rc = trans_pgd_copy_el2_vectors(_info, _vectors);
+   if (rc) {
+   pr_err("Failed to setup el2 vectors\n");
+   return rc;
+   }
+   }
+
exit_size = __hibernate_exit_text_end - __hibernate_exit_text_start;
/*
 * Copy swsusp_arch_suspend_exit() to a safe page. This will generate
@@ -467,25 +470,14 @@ int swsusp_arch_resume(void)
return rc;
}
 
-   /*
-* The hibernate exit text contains a set of el2 vectors, that will
-* be executed at el2 with the mmu off in order to reload hyp-stub.
-*/
-   __flush_dcache_area(hibernate_exit, exit_size);
-
/*
 * KASLR will cause the el2 vectors to be in a different location in
 * the resumed kernel. Load hibernate's temporary copy into el2.
 *
 * We can skip this step if we booted at EL1, or are running with VHE.
 */
-   if (is_hyp_callable()) {
-   phys_addr_t el2_vectors = (phys_addr_t)hibernate_exit;
-   el2_vectors += hibernate_el2_vectors -
-  __hibernate_exit_text_start; /* offset */
-
+   if (is_hyp_callable())
__hyp_set_vectors(el2_vectors);
-   }
 
hibernate_exit(virt_to_phys(tmp_pg_dir), resume_hdr.ttbr1_el1,
   resume_hdr.reenter_kernel, restore_pblist,
diff --git a/arch/arm64/mm/trans_pgd.c b/arch/arm64/mm/trans_pgd.c
index 527f0a39c3da..61549451ed3a 100644
--- a/arch/arm64/mm/trans_pgd.c
+++ b/arch/arm64/mm/trans_pgd.c
@@ -322,3 +322,23 @@ int trans_pgd_idmap_page(struct trans_pgd_info *info, phys_addr_t *trans_ttbr0,
 
return 0;
 }
+
+/*
+ * Create a copy of the vector table so we can call HVC_SET_VECTORS or
+ * HVC_SOFT_RESTART from contexts where the table may be overwritten.
+ */
+int trans_pgd_copy_el2_vectors(struct trans_pgd_info *info,
+  phys_addr_t *el2_vectors)
+{
+   void *hyp_stub = trans_alloc(info);
+
+   if (!hyp_stub)
+   return -ENOMEM;
+   *el2_vectors = virt_to_phys(hyp_stub);
+   memcpy(hyp_stub, &__hyp_stub_vectors, ARM64_VECTOR_TABLE_LEN);
+   __flush_icache_range((unsigned long)hyp_stub,
+(unsigned long)hyp_stub + ARM64_VECTOR_TABLE_LEN);
+   

[PATCH v13 06/18] arm64: hibernate: abstract ttbr0 setup function

2021-04-07 Thread Pavel Tatashin
Currently, only hibernate sets a custom ttbr0 from a safe idmapped
function. Kexec is also going to use this functionality once the
relocation code is idmapped.

Move the setup sequence to a dedicated cpu_install_ttbr0() helper for
installing a custom ttbr0.

Suggested-by: James Morse 
Signed-off-by: Pavel Tatashin 
---
 arch/arm64/include/asm/mmu_context.h | 24 
 arch/arm64/kernel/hibernate.c| 21 +
 2 files changed, 25 insertions(+), 20 deletions(-)

diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h
index bd02e99b1a4c..f64d0d5e1b1f 100644
--- a/arch/arm64/include/asm/mmu_context.h
+++ b/arch/arm64/include/asm/mmu_context.h
@@ -115,6 +115,30 @@ static inline void cpu_install_idmap(void)
cpu_switch_mm(lm_alias(idmap_pg_dir), _mm);
 }
 
+/*
+ * Load our new page tables. A strict BBM approach requires that we ensure that
+ * TLBs are free of any entries that may overlap with the global mappings we 
are
+ * about to install.
+ *
+ * For a real hibernate/resume/kexec cycle TTBR0 currently points to a zero
+ * page, but TLBs may contain stale ASID-tagged entries (e.g. for EFI runtime
+ * services), while for a userspace-driven test_resume cycle it points to
+ * userspace page tables (and we must point it at a zero page ourselves).
+ *
+ * We change T0SZ as part of installing the idmap. This is undone by
+ * cpu_uninstall_idmap() in __cpu_suspend_exit().
+ */
+static inline void cpu_install_ttbr0(phys_addr_t ttbr0, unsigned long t0sz)
+{
+   cpu_set_reserved_ttbr0();
+   local_flush_tlb_all();
+   __cpu_set_tcr_t0sz(t0sz);
+
+   /* avoid cpu_switch_mm() and its SW-PAN and CNP interactions */
+   write_sysreg(ttbr0, ttbr0_el1);
+   isb();
+}
+
 /*
  * Atomically replaces the active TTBR1_EL1 PGD with a new VA-compatible PGD,
  * avoiding the possibility of conflicting TLB entries being allocated.
diff --git a/arch/arm64/kernel/hibernate.c b/arch/arm64/kernel/hibernate.c
index 0b8bad8bb6eb..ded5115bcb63 100644
--- a/arch/arm64/kernel/hibernate.c
+++ b/arch/arm64/kernel/hibernate.c
@@ -206,26 +206,7 @@ static int create_safe_exec_page(void *src_start, size_t length,
if (rc)
return rc;
 
-   /*
-* Load our new page tables. A strict BBM approach requires that we
-* ensure that TLBs are free of any entries that may overlap with the
-* global mappings we are about to install.
-*
-* For a real hibernate/resume cycle TTBR0 currently points to a zero
-* page, but TLBs may contain stale ASID-tagged entries (e.g. for EFI
-* runtime services), while for a userspace-driven test_resume cycle it
-* points to userspace page tables (and we must point it at a zero page
-* ourselves).
-*
-* We change T0SZ as part of installing the idmap. This is undone by
-* cpu_uninstall_idmap() in __cpu_suspend_exit().
-*/
-   cpu_set_reserved_ttbr0();
-   local_flush_tlb_all();
-   __cpu_set_tcr_t0sz(t0sz);
-   write_sysreg(trans_ttbr0, ttbr0_el1);
-   isb();
-
+   cpu_install_ttbr0(trans_ttbr0, t0sz);
*phys_dst_addr = virt_to_phys(page);
 
return 0;
-- 
2.25.1




[PATCH v13 02/18] arm64: hyp-stub: Move invalid vector entries into the vectors

2021-04-07 Thread Pavel Tatashin
From: James Morse 

Most of the hyp-stub's vector entries are invalid. These are each
a unique function that branches to itself. To move these into the
vectors, merge the ventry and invalid_vector macros and give each
one a unique name.

This means we can copy the hyp-stub as it is self-contained within
its vectors.

Signed-off-by: James Morse 

[Fixed merging issues]

Signed-off-by: Pavel Tatashin 
---
 arch/arm64/kernel/hyp-stub.S | 56 +++-
 1 file changed, 23 insertions(+), 33 deletions(-)

diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S
index 572b28646005..ff329c5c074d 100644
--- a/arch/arm64/kernel/hyp-stub.S
+++ b/arch/arm64/kernel/hyp-stub.S
@@ -16,31 +16,38 @@
 #include 
 #include 
 
+.macro invalid_vector  label
+SYM_CODE_START_LOCAL(\label)
+   .align 7
+   b   \label
+SYM_CODE_END(\label)
+.endm
+
.text
.pushsection.hyp.text, "ax"
 
.align 11
 
 SYM_CODE_START(__hyp_stub_vectors)
-   ventry  el2_sync_invalid// Synchronous EL2t
-   ventry  el2_irq_invalid // IRQ EL2t
-   ventry  el2_fiq_invalid // FIQ EL2t
-   ventry  el2_error_invalid   // Error EL2t
+   invalid_vector  hyp_stub_el2t_sync_invalid  // Synchronous EL2t
+   invalid_vector  hyp_stub_el2t_irq_invalid   // IRQ EL2t
+   invalid_vector  hyp_stub_el2t_fiq_invalid   // FIQ EL2t
+   invalid_vector  hyp_stub_el2t_error_invalid // Error EL2t
 
-   ventry  el2_sync_invalid// Synchronous EL2h
-   ventry  el2_irq_invalid // IRQ EL2h
-   ventry  el2_fiq_invalid // FIQ EL2h
-   ventry  el2_error_invalid   // Error EL2h
+   invalid_vector  hyp_stub_el2h_sync_invalid  // Synchronous EL2h
+   invalid_vector  hyp_stub_el2h_irq_invalid   // IRQ EL2h
+   invalid_vector  hyp_stub_el2h_fiq_invalid   // FIQ EL2h
+   invalid_vector  hyp_stub_el2h_error_invalid // Error EL2h
 
ventry  el1_sync// Synchronous 64-bit EL1
-   ventry  el1_irq_invalid // IRQ 64-bit EL1
-   ventry  el1_fiq_invalid // FIQ 64-bit EL1
-   ventry  el1_error_invalid   // Error 64-bit EL1
-
-   ventry  el1_sync_invalid// Synchronous 32-bit EL1
-   ventry  el1_irq_invalid // IRQ 32-bit EL1
-   ventry  el1_fiq_invalid // FIQ 32-bit EL1
-   ventry  el1_error_invalid   // Error 32-bit EL1
+   invalid_vector  hyp_stub_el1_irq_invalid// IRQ 64-bit EL1
+   invalid_vector  hyp_stub_el1_fiq_invalid// FIQ 64-bit EL1
+   invalid_vector  hyp_stub_el1_error_invalid  // Error 64-bit EL1
+
+   invalid_vector  hyp_stub_32b_el1_sync_invalid   // Synchronous 32-bit EL1
+   invalid_vector  hyp_stub_32b_el1_irq_invalid// IRQ 32-bit EL1
+   invalid_vector  hyp_stub_32b_el1_fiq_invalid// FIQ 32-bit EL1
+   invalid_vector  hyp_stub_32b_el1_error_invalid  // Error 32-bit EL1
.align 11
 SYM_INNER_LABEL(__hyp_stub_vectors_end, SYM_L_LOCAL)
 SYM_CODE_END(__hyp_stub_vectors)
@@ -173,23 +180,6 @@ SYM_CODE_END(enter_vhe)
 
.popsection
 
-.macro invalid_vector  label
-SYM_CODE_START_LOCAL(\label)
-   b \label
-SYM_CODE_END(\label)
-.endm
-
-   invalid_vector  el2_sync_invalid
-   invalid_vector  el2_irq_invalid
-   invalid_vector  el2_fiq_invalid
-   invalid_vector  el2_error_invalid
-   invalid_vector  el1_sync_invalid
-   invalid_vector  el1_irq_invalid
-   invalid_vector  el1_fiq_invalid
-   invalid_vector  el1_error_invalid
-
-   .popsection
-
 /*
  * __hyp_set_vectors: Call this after boot to set the initial hypervisor
  * vectors as part of hypervisor installation.  On an SMP system, this should
-- 
2.25.1




[PATCH v13 04/18] arm64: kernel: add helper for booted at EL2 and not VHE

2021-04-07 Thread Pavel Tatashin
Replace places that contain logic like this:
is_hyp_mode_available() && !is_kernel_in_hyp_mode()

with a dedicated boolean function, is_hyp_callable(). This will be
needed later by kexec in order to switch back to EL2 sooner.
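As a sketch, the helper is just a small truth table. The stand-alone C
model below is hypothetical (the two boolean inputs stand in for the
kernel's is_hyp_mode_available() and is_kernel_in_hyp_mode()
predicates):

```c
#include <stdbool.h>

/* Hypothetical model of is_hyp_callable(): EL2 is callable only if we
 * booted with EL2 present (hyp mode available) and the kernel itself
 * is not already running there (i.e. not VHE). */
static bool is_hyp_callable_model(bool hyp_available, bool kernel_in_hyp)
{
	return hyp_available && !kernel_in_hyp;
}
```

Only one of the four input combinations makes a HVC into the hyp-stub
both possible and necessary, which is why a named predicate is clearer
than repeating the compound condition.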

Suggested-by: James Morse 
Signed-off-by: Pavel Tatashin 
---
 arch/arm64/include/asm/virt.h | 5 +
 arch/arm64/kernel/cpu-reset.h | 3 +--
 arch/arm64/kernel/hibernate.c | 9 +++--
 arch/arm64/kernel/sdei.c  | 2 +-
 4 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h
index 7379f35ae2c6..4216c8623538 100644
--- a/arch/arm64/include/asm/virt.h
+++ b/arch/arm64/include/asm/virt.h
@@ -128,6 +128,11 @@ static __always_inline bool is_protected_kvm_enabled(void)
return cpus_have_final_cap(ARM64_KVM_PROTECTED_MODE);
 }
 
+static inline bool is_hyp_callable(void)
+{
+   return is_hyp_mode_available() && !is_kernel_in_hyp_mode();
+}
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* ! __ASM__VIRT_H */
diff --git a/arch/arm64/kernel/cpu-reset.h b/arch/arm64/kernel/cpu-reset.h
index ed50e9587ad8..1922e7a690f8 100644
--- a/arch/arm64/kernel/cpu-reset.h
+++ b/arch/arm64/kernel/cpu-reset.h
@@ -20,8 +20,7 @@ static inline void __noreturn cpu_soft_restart(unsigned long entry,
 {
typeof(__cpu_soft_restart) *restart;
 
-   unsigned long el2_switch = !is_kernel_in_hyp_mode() &&
-   is_hyp_mode_available();
+   unsigned long el2_switch = is_hyp_callable();
restart = (void *)__pa_symbol(__cpu_soft_restart);
 
cpu_install_idmap();
diff --git a/arch/arm64/kernel/hibernate.c b/arch/arm64/kernel/hibernate.c
index b1cef371df2b..c764574a1acb 100644
--- a/arch/arm64/kernel/hibernate.c
+++ b/arch/arm64/kernel/hibernate.c
@@ -48,9 +48,6 @@
  */
 extern int in_suspend;
 
-/* Do we need to reset el2? */
-#define el2_reset_needed() (is_hyp_mode_available() && !is_kernel_in_hyp_mode())
-
 /* temporary el2 vectors in the __hibernate_exit_text section. */
 extern char hibernate_el2_vectors[];
 
@@ -125,7 +122,7 @@ int arch_hibernation_header_save(void *addr, unsigned int max_size)
hdr->reenter_kernel = _cpu_resume;
 
/* We can't use __hyp_get_vectors() because kvm may still be loaded */
-   if (el2_reset_needed())
+   if (is_hyp_callable())
hdr->__hyp_stub_vectors = __pa_symbol(__hyp_stub_vectors);
else
hdr->__hyp_stub_vectors = 0;
@@ -387,7 +384,7 @@ int swsusp_arch_suspend(void)
dcache_clean_range(__idmap_text_start, __idmap_text_end);
 
/* Clean kvm setup code to PoC? */
-   if (el2_reset_needed()) {
+   if (is_hyp_callable()) {
dcache_clean_range(__hyp_idmap_text_start, __hyp_idmap_text_end);
dcache_clean_range(__hyp_text_start, __hyp_text_end);
}
@@ -482,7 +479,7 @@ int swsusp_arch_resume(void)
 *
 * We can skip this step if we booted at EL1, or are running with VHE.
 */
-   if (el2_reset_needed()) {
+   if (is_hyp_callable()) {
phys_addr_t el2_vectors = (phys_addr_t)hibernate_exit;
el2_vectors += hibernate_el2_vectors -
   __hibernate_exit_text_start; /* offset */
diff --git a/arch/arm64/kernel/sdei.c b/arch/arm64/kernel/sdei.c
index 2c7ca449dd51..af0ac2f920cf 100644
--- a/arch/arm64/kernel/sdei.c
+++ b/arch/arm64/kernel/sdei.c
@@ -200,7 +200,7 @@ unsigned long sdei_arch_get_entry_point(int conduit)
 * dropped to EL1 because we don't support VHE, then we can't support
 * SDEI.
 */
-   if (is_hyp_mode_available() && !is_kernel_in_hyp_mode()) {
+   if (is_hyp_callable()) {
pr_err("Not supported on this hardware/boot configuration\n");
goto out_err;
}
-- 
2.25.1




[PATCH v13 00/18] arm64: MMU enabled kexec relocation

2021-04-07 Thread Pavel Tatashin
Changelog:
v13:
- Fixed a hang on ThunderX2; thank you Pingfan Liu for reporting
  the problem. In the relocation function we need civac, not ivac:
  we need to clean data in addition to invalidating it.
  Since I was using a ThunderX2 machine, I also measured the new
  performance data on this large ARM64 server. The MMU makes
  kexec relocation 190 times faster on this machine (see below for
  raw data), saving 7.5s during a CentOS kexec reboot.
v12:
- A major change compared to previous version. Instead of using
  contiguous VA range a copy of linear map is now used to perform
  copying of segments during relocation as it was agreed in the
  discussion of version 11 of this project.
- In addition to using linear map, I also took several ideas from
  James Morse to better organize the kexec relocation:
1. skip relocation function entirely if that is not needed
2. remove the PoC flushing function since it is not needed
   anymore with MMU enabled.
v11:
- Fixed missing KEXEC_CORE dependency for trans_pgd.c
- Removed useless "if(rc) return rc" statement (thank you Tyler Hicks)
- Another 12 patches were accepted into the maintainer's tree.
  Re-based patches against:
  https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git
  Branch: for-next/kexec
v10:
- Addressed a lot of comments form James Morse and from  Marc Zyngier
- Added review-by's
- Synchronized with mainline

v9: - 9 patches from previous series landed in upstream, so now series
  is smaller
- Added two patches from James Morse to address idmap issues for
  machines with high physical addresses.
- Addressed comments from Selin Dag about compiling issues. He also
  tested my series and got similar performance results: ~60 ms instead
  of ~580 ms with an initramfs size of ~120MB.
v8:
- Synced with mainline to keep series up-to-date
v7:
- Addressed comments from James Morse
- arm64: hibernate: pass the allocated pgdp to ttbr0
  Removed "Fixes" tag, and added Added Reviewed-by: James Morse
- arm64: hibernate: check pgd table allocation
  Sent out as a standalone patch so it can be sent to stable
  Series applies on mainline + this patch
- arm64: hibernate: add trans_pgd public functions
  Remove second allocation of tmp_pg_dir in swsusp_arch_resume
  Added Reviewed-by: James Morse 
- arm64: kexec: move relocation function setup and clean up
  Fixed typo in commit log
  Changed kern_reloc to phys_addr_t types.
  Added explanation why kern_reloc is needed.
  Split into four patches:
  arm64: kexec: make dtb_mem always enabled
  arm64: kexec: remove unnecessary debug prints
  arm64: kexec: call kexec_image_info only once
  arm64: kexec: move relocation function setup
- arm64: kexec: add expandable argument to relocation function
  Changed types of new arguments from unsigned long to phys_addr_t.
  Changed offset prefix to KEXEC_*
  Split into four patches:
  arm64: kexec: cpu_soft_restart change argument types
  arm64: kexec: arm64_relocate_new_kernel clean-ups
  arm64: kexec: arm64_relocate_new_kernel don't use x0 as temp
  arm64: kexec: add expandable argument to relocation function
- arm64: kexec: configure trans_pgd page table for kexec
  Added invalid entries into EL2 vector table
  Removed KEXEC_EL2_VECTOR_TABLE_SIZE and KEXEC_EL2_VECTOR_TABLE_OFFSET
  Copy relocation functions and table into separate pages
  Changed types in kern_reloc_arg.
  Split into three patches:
  arm64: kexec: offset for relocation function
  arm64: kexec: kexec EL2 vectors
  arm64: kexec: configure trans_pgd page table for kexec
- arm64: kexec: enable MMU during kexec relocation
  Split into two patches:
  arm64: kexec: enable MMU during kexec relocation
  arm64: kexec: remove head from relocation argument
v6:
- Sync with mainline tip
- Added Acked's from Dave Young
v5:
- Addressed comments from Matthias Brugger: added review-by's, improved
  comments, and made cleanups to swsusp_arch_resume() in addition to
  create_safe_exec_page().
- Synced with mainline tip.
v4:
- Addressed comments from James Morse.
- Split "check pgd table allocation" into two patches, and moved them to
  the beginning of the series for simpler backport of the fixes.
  Added "Fixes:" tags to commit logs.
- Changed "arm64, hibernate:" to "arm64: hibernate:"
- Added Reviewed-by's
- Moved "add PUD_SECT_RDONLY" earlier in the series

[PATCH v13 03/18] arm64: hyp-stub: Move el1_sync into the vectors

2021-04-07 Thread Pavel Tatashin
From: James Morse 

The hyp-stub's el1_sync code doesn't do very much, so it can easily fit
in the vectors.

With this, all of the hyp-stub's behaviour is contained in its vectors.
This lets kexec and hibernate copy the hyp-stub when they need its
behaviour, instead of re-implementing it.

Signed-off-by: James Morse 

[Fixed merging issues]

Signed-off-by: Pavel Tatashin 
---
 arch/arm64/kernel/hyp-stub.S | 59 ++--
 1 file changed, 29 insertions(+), 30 deletions(-)

diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S
index ff329c5c074d..d1a73d0f74e0 100644
--- a/arch/arm64/kernel/hyp-stub.S
+++ b/arch/arm64/kernel/hyp-stub.S
@@ -21,6 +21,34 @@ SYM_CODE_START_LOCAL(\label)
.align 7
b   \label
 SYM_CODE_END(\label)
+.endm
+
+.macro hyp_stub_el1_sync
+SYM_CODE_START_LOCAL(hyp_stub_el1_sync)
+   .align 7
+   cmp x0, #HVC_SET_VECTORS
+   b.ne2f
+   msr vbar_el2, x1
+   b   9f
+
+2: cmp x0, #HVC_SOFT_RESTART
+   b.ne3f
+   mov x0, x2
+   mov x2, x4
+   mov x4, x1
+   mov x1, x3
+   br  x4  // no return
+
+3: cmp x0, #HVC_RESET_VECTORS
+   beq 9f  // Nothing to reset!
+
+   /* Someone called kvm_call_hyp() against the hyp-stub... */
+   mov_q   x0, HVC_STUB_ERR
+   eret
+
+9: mov x0, xzr
+   eret
+SYM_CODE_END(hyp_stub_el1_sync)
 .endm
 
.text
@@ -39,7 +67,7 @@ SYM_CODE_START(__hyp_stub_vectors)
invalid_vector  hyp_stub_el2h_fiq_invalid   // FIQ EL2h
invalid_vector  hyp_stub_el2h_error_invalid // Error EL2h
 
-   ventry  el1_sync// Synchronous 64-bit EL1
+   hyp_stub_el1_sync   // Synchronous 64-bit EL1
invalid_vector  hyp_stub_el1_irq_invalid// IRQ 64-bit EL1
invalid_vector  hyp_stub_el1_fiq_invalid// FIQ 64-bit EL1
invalid_vector  hyp_stub_el1_error_invalid  // Error 64-bit EL1
@@ -55,35 +83,6 @@ SYM_CODE_END(__hyp_stub_vectors)
 # Check the __hyp_stub_vectors didn't overflow
 .org . - (__hyp_stub_vectors_end - __hyp_stub_vectors) + SZ_2K
 
-
-SYM_CODE_START_LOCAL(el1_sync)
-   cmp x0, #HVC_SET_VECTORS
-   b.ne1f
-   msr vbar_el2, x1
-   b   9f
-
-1: cmp x0, #HVC_VHE_RESTART
-   b.eqmutate_to_vhe
-
-2: cmp x0, #HVC_SOFT_RESTART
-   b.ne3f
-   mov x0, x2
-   mov x2, x4
-   mov x4, x1
-   mov x1, x3
-   br  x4  // no return
-
-3: cmp x0, #HVC_RESET_VECTORS
-   beq 9f  // Nothing to reset!
-
-   /* Someone called kvm_call_hyp() against the hyp-stub... */
-   mov_q   x0, HVC_STUB_ERR
-   eret
-
-9: mov x0, xzr
-   eret
-SYM_CODE_END(el1_sync)
-
 // nVHE? No way! Give me the real thing!
 SYM_CODE_START_LOCAL(mutate_to_vhe)
// Sanity check: MMU *must* be off
-- 
2.25.1


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


[PATCH v13 01/18] arm64: hyp-stub: Check the size of the HYP stub's vectors

2021-04-07 Thread Pavel Tatashin
From: James Morse 

Hibernate contains a set of temporary EL2 vectors used to 'park'
EL2 somewhere safe while all the memory is thrown in the air.
Making kexec do its relocations with the MMU on means they have to
be done at EL1, so EL2 has to be parked. This means yet another
set of vectors.

All these things do is HVC_SET_VECTORS and HVC_SOFT_RESTART, both
of which are implemented by the hyp-stub. Let's copy it instead
of re-inventing it.

To do this the hyp-stub's entrails need to be packed neatly inside
its 2K vectors.

Start by moving the final 2K alignment inside the end marker, and
add a build check that we didn't overflow 2K.

Signed-off-by: James Morse 
Signed-off-by: Pavel Tatashin 
---
 arch/arm64/kernel/hyp-stub.S | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S
index 5eccbd62fec8..572b28646005 100644
--- a/arch/arm64/kernel/hyp-stub.S
+++ b/arch/arm64/kernel/hyp-stub.S
@@ -41,9 +41,13 @@ SYM_CODE_START(__hyp_stub_vectors)
ventry  el1_irq_invalid // IRQ 32-bit EL1
ventry  el1_fiq_invalid // FIQ 32-bit EL1
ventry  el1_error_invalid   // Error 32-bit EL1
+   .align 11
+SYM_INNER_LABEL(__hyp_stub_vectors_end, SYM_L_LOCAL)
 SYM_CODE_END(__hyp_stub_vectors)
 
-   .align 11
+# Check the __hyp_stub_vectors didn't overflow
+.org . - (__hyp_stub_vectors_end - __hyp_stub_vectors) + SZ_2K
+
 
 SYM_CODE_START_LOCAL(el1_sync)
cmp x0, #HVC_SET_VECTORS
-- 
2.25.1




Re: [PATCH v1 1/1] kernel.h: Split out panic and oops helpers

2021-04-07 Thread Luis Chamberlain
On Wed, Apr 07, 2021 at 05:59:19PM +0300, Andy Shevchenko wrote:
> On Wed, Apr 7, 2021 at 5:30 PM Luis Chamberlain  wrote:
> > On Wed, Apr 07, 2021 at 10:33:44AM +0300, Andy Shevchenko wrote:
> > > On Wed, Apr 7, 2021 at 10:25 AM Luis Chamberlain  
> > > wrote:
> > > > On Tue, Apr 06, 2021 at 04:31:58PM +0300, Andy Shevchenko wrote:
> 
> ...
> 
> > > > Why is it worth it to add another file just for this?
> > >
> > > The main point is to break tons of loops that prevent having clean
> > > headers anymore.
> > >
> > > In this case, see bug.h, which is very important in this sense.
> >
> > OK based on the commit log this was not clear, it seemed more of moving
> > panic stuff to its own file, so just cleanup.
> 
> Sorry for that. It should have mentioned the kernel folder instead of
> lib. But I think it won't clarify the above.
> 
> In any case there are several purposes in this case
>  - dropping dependency in bug.h
>  - dropping a loop by moving out panic_notifier.h
>  - unload kernel.h from something which has its own domain
> 
> I think that you are referring to the commit message describing 3rd
> one, but not 1st and 2nd.

Right!

> I will amend this for the future splits, thanks!

Don't get me wrong, I love the motivation behind just the 3rd purpose,
however I figured there might be something more when I saw panic_notifier.h.
It was just not clear.

But awesome stuff!

  Luis



Re: [PATCH v1 1/1] kernel.h: Split out panic and oops helpers

2021-04-07 Thread Andy Shevchenko
On Wed, Apr 7, 2021 at 5:30 PM Luis Chamberlain  wrote:
> On Wed, Apr 07, 2021 at 10:33:44AM +0300, Andy Shevchenko wrote:
> > On Wed, Apr 7, 2021 at 10:25 AM Luis Chamberlain  wrote:
> > > On Tue, Apr 06, 2021 at 04:31:58PM +0300, Andy Shevchenko wrote:

...

> > > Why is it worth it to add another file just for this?
> >
> > The main point is to break tons of loops that prevent having clean
> > headers anymore.
> >
> > In this case, see bug.h, which is very important in this sense.
>
> OK based on the commit log this was not clear, it seemed more of moving
> panic stuff to its own file, so just cleanup.

Sorry for that. It should have mentioned the kernel folder instead of
lib. But I think it won't clarify the above.

In any case there are several purposes in this case
 - dropping dependency in bug.h
 - dropping a loop by moving out panic_notifier.h
 - unload kernel.h from something which has its own domain

I think that you are referring to the commit message describing 3rd
one, but not 1st and 2nd.

I will amend this for the future splits, thanks!

> > >  Seems like a very
> > > small file.
> >
> > If it is an argument, it's kinda strange. We have much smaller headers.
>
> The motivation for such separate file was just not clear on the commit
> log.

-- 
With Best Regards,
Andy Shevchenko



Re: [PATCH kexec-tools 0/7] build enhancements and GitHub workflow

2021-04-07 Thread Simon Horman
On Fri, Apr 02, 2021 at 12:17:30PM +0200, Simon Horman wrote:
> This series aims to:
> 
> 1. Allow creation of dist tarball without self-referential hard links
> 2. Add a distcheck target
> 3. Add a GitHub workflow which performs basic build testing
> 
> A sample run of the workflow can be seen here
> https://github.com/horms/kexec-tools/actions/runs/711376088

Series applied.



Re: [PATCH 0/2] Fix early boot OOM issues for some platforms

2021-04-07 Thread Simon Horman
On Tue, Apr 06, 2021 at 03:11:51PM +0100, Hongyan Xia wrote:
> From: Hongyan Xia 
> 
> We have observed a couple of cases where after a successful kexec, the
> crash kernel loaded in the 2nd kernel will run out of memory and
> crash. We narrowed down to two issues:
> 
> 1. when preparing the memory map, kexec excludes the Interrupt Vector
>Table. However, the end address of IVT is incorrect.
> 2. The wrong end address of IVT is not 1KiB aligned. When preparing the
>crashkernel, the memory map will reject unaligned memory chunks. On
>many x86 platforms this means the entire bottom 1MiB range is
>excluded from the crashkernel memory map, resulting in OOM when the
>crashkernel boots.
> 
> Patch 1 fixes 1 which is actually enough to eliminate the issue but we
> feel that such issue may happen again (e.g., with a weird BIOS that has
> unaligned e820 map), so we also have patch 2 to improve the handling of
> unaligned memory.
> 
> Hongyan Xia (2):
>   Fix where the real mode interrupt vector ends
>   Shrink segments to fit alignment instead of throwing them away

Thanks, series applied.



Re: kexec does not work for kernel version with patch level >= 256

2021-04-07 Thread Eric W. Biederman
Liu Tao  writes:

> Hello Eric,
>
> Please correct me if I'm wrong. After my research, I found that the
> KERNEL_VERSION
> check cannot be removed.
>
> In x86_64 case, function get_kernel_page_offset set different hard coded
> values into
> elf_info->page_offset according to KERNEL_VERSION, then in function
> get_kernel_vaddr_and_size,
> elf_info->page_offset gets refreshed by reading program segments of
> /proc/kcore.
> The refresh can fail when KASLR is off, thus the hard coded values are
> still needed as pre-set
> default values.

I see that the code is conditional upon KASLR, but I don't see any
particular reason why the code in get_kernel_vaddr_and_size is
conditional upon KASLR.

Skimming through arch/x86/kernel/vmlinux.lds.S and fs/proc/kcore.c I
don't see anything that is ASLR specific.  So everything should work
simply by removing the unnecessary gate on the presence of the
page_address_base symbol.

I suspect the code will even correctly compute PAGE_OFFSET on all
architectures, but we don't need to go that far to remove our use of the
kernel version.

> In addition, If I set a wrong value in elf_info->page_offset, readelf -l
> vmcore will give the value I set,
> reading symbols in crash-utility is not affected.

Especially if the reading the symbols is not affected by a wrong value
just auto-detecting the value really seems to make the most sense.

> From my point of view, extending the patch number from 8bit to 16bit is the
> solution. Any thoughts?

My thought is that in general the kernel version can not be depended
upon for anything as there exist enterprise kernels that get feature
backports.  So there very easily could be a kernel where the kernel
version does not accurately reflect what is going on.  So unless we can
say with certainty that there is no other way to detect the base address
of the kernel we really don't want to use the kernel version.

Right now it just looks like one all that is necessary is the removal of
an unnecessary if check.

Eric



[BUG] kexec-tools blows up when the kernel is 4.4.262

2021-04-07 Thread Joe Korty
The following lines in kexec/kernel_version.c cause a crash dump to fail,
when the kernel is 4.4.262, or indeed any kernel whose patchlevel is >255.

if (major >= 256 || minor >= 256 || patch >= 256) {
fprintf(stderr, "Unsupported utsname.release: %s\n",
utsname.release);
return -1;
}

return KERNEL_VERSION(major, minor, patch);

Regards,
Joe



Re: [PATCH v3 12/12] kdump: Use vmlinux_build_id to simplify

2021-04-07 Thread Petr Mladek
On Tue 2021-03-30 20:05:20, Stephen Boyd wrote:
> We can use the vmlinux_build_id array here now instead of open coding
> it. This mostly consolidates code.
> 
> Cc: Jiri Olsa 
> Cc: Alexei Starovoitov 
> Cc: Jessica Yu 
> Cc: Evan Green 
> Cc: Hsin-Yi Wang 
> Cc: Dave Young 
> Cc: Baoquan He 
> Cc: Vivek Goyal 
> Cc: 
> Signed-off-by: Stephen Boyd 
> ---
>  include/linux/crash_core.h |  6 +-
>  kernel/crash_core.c| 41 ++
>  2 files changed, 3 insertions(+), 44 deletions(-)
> 
> diff --git a/include/linux/crash_core.h b/include/linux/crash_core.h
> index 206bde8308b2..fb8ab99bb2ee 100644
> --- a/include/linux/crash_core.h
> +++ b/include/linux/crash_core.h
> @@ -39,7 +39,7 @@ phys_addr_t paddr_vmcoreinfo_note(void);
>  #define VMCOREINFO_OSRELEASE(value) \
>   vmcoreinfo_append_str("OSRELEASE=%s\n", value)
>  #define VMCOREINFO_BUILD_ID(value) \
> - vmcoreinfo_append_str("BUILD-ID=%s\n", value)
> + vmcoreinfo_append_str("BUILD-ID=%20phN\n", value)

Please, add also build check that BUILD_ID_MAX == 20.


>  #define VMCOREINFO_PAGESIZE(value) \
>   vmcoreinfo_append_str("PAGESIZE=%ld\n", value)
>  #define VMCOREINFO_SYMBOL(name) \
> @@ -69,10 +69,6 @@ extern unsigned char *vmcoreinfo_data;
>  extern size_t vmcoreinfo_size;
>  extern u32 *vmcoreinfo_note;
>  
> -/* raw contents of kernel .notes section */
> -extern const void __start_notes __weak;
> -extern const void __stop_notes __weak;
> -
>  Elf_Word *append_elf_note(Elf_Word *buf, char *name, unsigned int type,
> void *data, size_t data_len);
>  void final_note(Elf_Word *buf);
> diff --git a/kernel/crash_core.c b/kernel/crash_core.c
> index 825284baaf46..6b560cf9f374 100644
> --- a/kernel/crash_core.c
> +++ b/kernel/crash_core.c
> @@ -4,6 +4,7 @@
>   * Copyright (C) 2002-2004 Eric Biederman  
>   */
>  
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -378,51 +379,13 @@ phys_addr_t __weak paddr_vmcoreinfo_note(void)
>  }
>  EXPORT_SYMBOL(paddr_vmcoreinfo_note);
>  
> -#define NOTES_SIZE (&__stop_notes - &__start_notes)
> -#define BUILD_ID_MAX SHA1_DIGEST_SIZE
> -#define NT_GNU_BUILD_ID 3
> -
> -struct elf_note_section {
> - struct elf_note n_hdr;
> - u8 n_data[];
> -};
> -
>  /*
>   * Add build ID from .notes section as generated by the GNU ld(1)
>   * or LLVM lld(1) --build-id option.
>   */
>  static void add_build_id_vmcoreinfo(void)
>  {
> - char build_id[BUILD_ID_MAX * 2 + 1];
> - int n_remain = NOTES_SIZE;
> -
> - while (n_remain >= sizeof(struct elf_note)) {
> - const struct elf_note_section *note_sec =
> - &__start_notes + NOTES_SIZE - n_remain;
> - const u32 n_namesz = note_sec->n_hdr.n_namesz;
> -
> - if (note_sec->n_hdr.n_type == NT_GNU_BUILD_ID &&
> - n_namesz != 0 &&
> - !strcmp((char *)&note_sec->n_data[0], "GNU")) {
> - if (note_sec->n_hdr.n_descsz <= BUILD_ID_MAX) {
> - const u32 n_descsz = note_sec->n_hdr.n_descsz;
> - const u8 *s = &note_sec->n_data[n_namesz];
> -
> - s = PTR_ALIGN(s, 4);
> - bin2hex(build_id, s, n_descsz);
> - build_id[2 * n_descsz] = '\0';
> - VMCOREINFO_BUILD_ID(build_id);
> - return;
> - }
> - pr_warn("Build ID is too large to include in vmcoreinfo: %u > %u\n",
> - note_sec->n_hdr.n_descsz,
> - BUILD_ID_MAX);
> - return;
> - }
> - n_remain -= sizeof(struct elf_note) +
> - ALIGN(note_sec->n_hdr.n_namesz, 4) +
> - ALIGN(note_sec->n_hdr.n_descsz, 4);
> - }
> + VMCOREINFO_BUILD_ID(vmlinux_build_id);
>  }

The function add_build_id_vmcoreinfo() is used in
crash_save_vmcoreinfo_init() in this context:


VMCOREINFO_OSRELEASE(init_uts_ns.name.release);
add_build_id_vmcoreinfo();
VMCOREINFO_PAGESIZE(PAGE_SIZE);

VMCOREINFO_SYMBOL(init_uts_ns);
VMCOREINFO_OFFSET(uts_namespace, name);
VMCOREINFO_SYMBOL(node_online_map);

The function is not longer need. VMCOREINFO_BUILD_ID()
can be used directly:

VMCOREINFO_OSRELEASE(init_uts_ns.name.release);
VMCOREINFO_BUILD_ID(vmlinux_build_id);
VMCOREINFO_PAGESIZE(PAGE_SIZE);

VMCOREINFO_SYMBOL(init_uts_ns);
VMCOREINFO_OFFSET(uts_namespace, name);
VMCOREINFO_SYMBOL(node_online_map);


Best Regards,
Petr


>  
>  static int __init crash_save_vmcoreinfo_init(void)
> -- 
> https://chromeos.dev



Re: crashkernel reservation failed - No suitable area found on a cortina/gemini SoC

2021-04-07 Thread Bruce Mitchell

On 4/7/2021 07:48, Corentin Labbe wrote:

On Wed, Apr 07, 2021 at 07:28:26AM -0700, Bruce Mitchell wrote:

On 4/7/2021 07:23, Corentin Labbe wrote:

On Wed, Apr 07, 2021 at 07:13:04AM -0700, Bruce Mitchell wrote:

On 4/7/2021 05:54, Corentin Labbe wrote:

Hello

I am trying to do kexec on a cortina/gemini SoC.
On a "normal" boot, kexec fails to find memory, so I added crashkernel=8M
to the cmdline (kernel size is ~6M).
But now, the kernel fails to reserve memory:
Load Kern image from 0x3002 to 0x80 size 7340032
Booting Linux on physical CPU 0x0
Linux version 5.12.0-rc5-next-20210401+ (compile@Red) 
(armv7a-unknown-linux-gnueabihf-gcc (Gentoo 9.3.0-r2 p4) 9.3.0, GNU ld (Gentoo 
2.34 p6) 2.34.0) #98 PREEMPT Wed Apr 7 14:14:08 CEST 2021
CPU: FA526 [66015261] revision 1 (ARMv4), cr=397f
CPU: VIVT data cache, VIVT instruction cache
OF: fdt: Machine model: Edimax NS-2502
Memory policy: Data cache writeback
Zone ranges:
 Normal   [mem 0x-0x07ff]
 HighMem  empty
Movable zone start for each node
Early memory node ranges
 node   0: [mem 0x-0x07ff]
Initmem setup node 0 [mem 0x-0x07ff]
crashkernel reservation failed - No suitable area found.
Built 1 zonelists, mobility grouping on.  Total pages: 32512
Kernel command line: console=ttyS0,19200n8 ip=dhcp crashkernel=8M
Dentry cache hash table entries: 16384 (order: 4, 65536 bytes, linear)
Inode-cache hash table entries: 8192 (order: 3, 32768 bytes, linear)
mem auto-init: stack:off, heap alloc:off, heap free:off
Memory: 119476K/131072K available (5034K kernel code, 579K rwdata, 1372K 
rodata, 3020K init, 210K bss, 11596K reserved, 0K cma-reserved, 0K highmem)
SLUB: HWalign=32, Order=0-3, MinObjects=0, CPUs=1, Nodes=1

What can I do?

Thanks
Regards




Hello Corentin,

I see much larger crashkernel=xxM being shown here
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/admin-guide/kdump/kdump.rst
and from many of my other searches.

Here is an interesting article on kdump for ARM-32
https://kaiwantech.wordpress.com/2017/07/13/setting-up-kdump-and-crash-for-arm-32-an-ongoing-saga/


Here is the kernel command line reference
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/admin-guide/kernel-parameters.txt?h=v5.11#n732

I feel your frustrations too.


Hello

Thanks, but I have already read that documentation.
I am trying to find out why the kernel cannot find 8M of memory out of 128.

Regards



How much more memory do the kernel and initrd need above and beyond
just their physical size? (heaps, stacks, buffers, virtual filesystems)


The kernel size includes a rootfs.cpio.lzma of 3MB, and the dtb is appended.
The total kernel size is 7MB.
The uncompressed size of the kernel is 13M (size of vmlinux)
The uncompressed size of rootfs is 11M.

cat /proc/meminfo
MemTotal: 122496 kB
MemFree:  103700 kB
MemAvailable: 101936 kB
Buffers:   0 kB
Cached:10904 kB
SwapCached:0 kB
Active: 4304 kB
Inactive:   8012 kB
Active(anon):   4304 kB
Inactive(anon): 8012 kB
Active(file):  0 kB
Inactive(file):0 kB
Unevictable:   0 kB
Mlocked:   0 kB
HighTotal: 0 kB
HighFree:  0 kB
LowTotal: 122496 kB
LowFree:  103700 kB
SwapTotal: 0 kB
SwapFree:  0 kB
Dirty: 0 kB
Writeback: 0 kB
AnonPages:  1428 kB
Mapped: 3552 kB
Shmem: 10904 kB
KReclaimable:608 kB
Slab:   2960 kB
SReclaimable:608 kB
SUnreclaim: 2352 kB
KernelStack: 312 kB
PageTables:  136 kB
NFS_Unstable:  0 kB
Bounce:0 kB
WritebackTmp:  0 kB
CommitLimit:   61248 kB
Committed_AS:  14336 kB
VmallocTotal: 901120 kB
VmallocUsed:  64 kB
VmallocChunk:  0 kB
Percpu:   32 kB
CmaTotal:  0 kB
CmaFree:   0 kB



I believe you need space for all of that;
the smallest that would work for me was 20MB.





Re: [PATCH v1 1/1] kernel.h: Split out panic and oops helpers

2021-04-07 Thread Kees Cook
On Tue, Apr 06, 2021 at 04:31:58PM +0300, Andy Shevchenko wrote:
> kernel.h is being used as a dump for all kinds of stuff for a long time.
> Here is the attempt to start cleaning it up by splitting out panic and
> oops helpers.
> 
> At the same time convert users in header and lib folder to use new header.
> Though for time being include new header back to kernel.h to avoid twisted
> indirected includes for existing users.
> 
> Signed-off-by: Andy Shevchenko 

I like it! Do you have a multi-arch CI to do allmodconfig builds to
double-check this?

Acked-by: Kees Cook 

-Kees

-- 
Kees Cook



Re: [PATCH v1 1/1] kernel.h: Split out panic and oops helpers

2021-04-07 Thread Andy Shevchenko
On Wed, Apr 7, 2021 at 10:25 AM Luis Chamberlain  wrote:
>
> On Tue, Apr 06, 2021 at 04:31:58PM +0300, Andy Shevchenko wrote:
> > diff --git a/include/linux/panic_notifier.h b/include/linux/panic_notifier.h
> > new file mode 100644
> > index ..41e32483d7a7
> > --- /dev/null
> > +++ b/include/linux/panic_notifier.h
> > @@ -0,0 +1,12 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +#ifndef _LINUX_PANIC_NOTIFIERS_H
> > +#define _LINUX_PANIC_NOTIFIERS_H
> > +
> > +#include 
> > +#include 
> > +
> > +extern struct atomic_notifier_head panic_notifier_list;
> > +
> > +extern bool crash_kexec_post_notifiers;
> > +
> > +#endif   /* _LINUX_PANIC_NOTIFIERS_H */
>
> Why is it worth it to add another file just for this?

The main point is to break tons of loops that prevent having clean
headers anymore.

In this case, see bug.h, which is very important in this sense.

>  Seems like a very
> small file.

If it is an argument, it's kinda strange. We have much smaller headers.

-- 
With Best Regards,
Andy Shevchenko



Re: [PATCH v1 1/1] kernel.h: Split out panic and oops helpers

2021-04-07 Thread Wei Liu
On Tue, Apr 06, 2021 at 04:31:58PM +0300, Andy Shevchenko wrote:
> kernel.h is being used as a dump for all kinds of stuff for a long time.
> Here is the attempt to start cleaning it up by splitting out panic and
> oops helpers.
> 
> At the same time convert users in header and lib folder to use new header.
> Though for time being include new header back to kernel.h to avoid twisted
> indirected includes for existing users.
> 
> Signed-off-by: Andy Shevchenko 

Acked-by: Wei Liu 



Re: [PATCH v1 1/1] kernel.h: Split out panic and oops helpers

2021-04-07 Thread Luis Chamberlain
On Wed, Apr 07, 2021 at 10:33:44AM +0300, Andy Shevchenko wrote:
> On Wed, Apr 7, 2021 at 10:25 AM Luis Chamberlain  wrote:
> >
> > On Tue, Apr 06, 2021 at 04:31:58PM +0300, Andy Shevchenko wrote:
> > > diff --git a/include/linux/panic_notifier.h 
> > > b/include/linux/panic_notifier.h
> > > new file mode 100644
> > > index ..41e32483d7a7
> > > --- /dev/null
> > > +++ b/include/linux/panic_notifier.h
> > > @@ -0,0 +1,12 @@
> > > +/* SPDX-License-Identifier: GPL-2.0 */
> > > +#ifndef _LINUX_PANIC_NOTIFIERS_H
> > > +#define _LINUX_PANIC_NOTIFIERS_H
> > > +
> > > +#include 
> > > +#include 
> > > +
> > > +extern struct atomic_notifier_head panic_notifier_list;
> > > +
> > > +extern bool crash_kexec_post_notifiers;
> > > +
> > > +#endif   /* _LINUX_PANIC_NOTIFIERS_H */
> >
> > Why is it worth it to add another file just for this?
> 
> The main point is to break tons of loops that prevent having clean
> headers anymore.
>
> In this case, see bug.h, which is very important in this sense.

OK based on the commit log this was not clear, it seemed more of moving
panic stuff to its own file, so just cleanup.

> >  Seems like a very
> > small file.
> 
> If it is an argument, it's kinda strange. We have much smaller headers.

The motivation for such separate file was just not clear on the commit
log.

  Luis



Re: [PATCH v1 1/1] kernel.h: Split out panic and oops helpers

2021-04-07 Thread Andy Shevchenko
On Wed, Apr 7, 2021 at 11:17 AM Kees Cook  wrote:
>
> On Tue, Apr 06, 2021 at 04:31:58PM +0300, Andy Shevchenko wrote:
> > kernel.h is being used as a dump for all kinds of stuff for a long time.
> > Here is the attempt to start cleaning it up by splitting out panic and
> > oops helpers.
> >
> > At the same time convert users in header and lib folder to use new header.
> > Though for time being include new header back to kernel.h to avoid twisted
> > indirected includes for existing users.
> >
> > Signed-off-by: Andy Shevchenko 
>
> I like it! Do you have a multi-arch CI to do allmodconfig builds to
> double-check this?

Unfortunately no, I rely on plenty of bots that are harvesting mailing lists.

But I will appreciate it if somebody can run this through various build tests.

> Acked-by: Kees Cook 

Thanks!


-- 
With Best Regards,
Andy Shevchenko



Re: crashkernel reservation failed - No suitable area found on a cortina/gemini SoC

2021-04-07 Thread Corentin Labbe
On Wed, Apr 07, 2021 at 07:28:26AM -0700, Bruce Mitchell wrote:
> On 4/7/2021 07:23, Corentin Labbe wrote:
> > On Wed, Apr 07, 2021 at 07:13:04AM -0700, Bruce Mitchell wrote:
> >> On 4/7/2021 05:54, Corentin Labbe wrote:
> >>> Hello
> >>>
> >>> I am trying to do kexec on a cortina/gemini SoC.
> >>> On a "normal" boot, kexec fails to find memory, so I added
> >>> crashkernel=8M to the cmdline (kernel size is ~6M).
> >>> But now, the kernel fails to reserve memory:
> >>> Load Kern image from 0x3002 to 0x80 size 7340032
> >>> Booting Linux on physical CPU 0x0
> >>> Linux version 5.12.0-rc5-next-20210401+ (compile@Red) 
> >>> (armv7a-unknown-linux-gnueabihf-gcc (Gentoo 9.3.0-r2 p4) 9.3.0, GNU ld 
> >>> (Gentoo 2.34 p6) 2.34.0) #98 PREEMPT Wed Apr 7 14:14:08 CEST 2021
> >>> CPU: FA526 [66015261] revision 1 (ARMv4), cr=397f
> >>> CPU: VIVT data cache, VIVT instruction cache
> >>> OF: fdt: Machine model: Edimax NS-2502
> >>> Memory policy: Data cache writeback
> >>> Zone ranges:
> >>> Normal   [mem 0x-0x07ff]
> >>> HighMem  empty
> >>> Movable zone start for each node
> >>> Early memory node ranges
> >>> node   0: [mem 0x-0x07ff]
> >>> Initmem setup node 0 [mem 0x-0x07ff]
> >>> crashkernel reservation failed - No suitable area found.
> >>> Built 1 zonelists, mobility grouping on.  Total pages: 32512
> >>> Kernel command line: console=ttyS0,19200n8 ip=dhcp crashkernel=8M
> >>> Dentry cache hash table entries: 16384 (order: 4, 65536 bytes, linear)
> >>> Inode-cache hash table entries: 8192 (order: 3, 32768 bytes, linear)
> >>> mem auto-init: stack:off, heap alloc:off, heap free:off
> >>> Memory: 119476K/131072K available (5034K kernel code, 579K rwdata, 1372K 
> >>> rodata, 3020K init, 210K bss, 11596K reserved, 0K cma-reserved, 0K 
> >>> highmem)
> >>> SLUB: HWalign=32, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
> >>>
> >>> What can I do?
> >>>
> >>> Thanks
> >>> Regards
> >>>
> >>>
> >>
> >> Hello Corentin,
> >>
> >> I see much larger crashkernel=xxM being shown here
> >> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/admin-guide/kdump/kdump.rst
> >> and from many of my other searches.
> >>
> >> Here is an interesting article on kdump for ARM-32
> >> https://kaiwantech.wordpress.com/2017/07/13/setting-up-kdump-and-crash-for-arm-32-an-ongoing-saga/
> >>
> >>
> >> Here is the kernel command line reference
> >> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/admin-guide/kernel-parameters.txt?h=v5.11#n732
> >>
> >> I feel your frustrations too.
> > 
> > Hello
> > 
> > Thanks, but I have already read that documentation.
> > I am trying to find out why the kernel cannot find 8M of memory out of 128.
> > 
> > Regards
> > 
> 
> How much more memory do the kernel and initrd need above and beyond
> just their physical size? (heaps, stacks, buffers, virtual filesystems)

The kernel size includes a rootfs.cpio.lzma of 3MB, and the dtb is appended.
The total kernel size is 7MB.
The uncompressed size of the kernel is 13M (size of vmlinux)
The uncompressed size of rootfs is 11M.

cat /proc/meminfo 
MemTotal: 122496 kB
MemFree:  103700 kB
MemAvailable: 101936 kB
Buffers:   0 kB
Cached:10904 kB
SwapCached:0 kB
Active: 4304 kB
Inactive:   8012 kB
Active(anon):   4304 kB
Inactive(anon): 8012 kB
Active(file):  0 kB
Inactive(file):0 kB
Unevictable:   0 kB
Mlocked:   0 kB
HighTotal: 0 kB
HighFree:  0 kB
LowTotal: 122496 kB
LowFree:  103700 kB
SwapTotal: 0 kB
SwapFree:  0 kB
Dirty: 0 kB
Writeback: 0 kB
AnonPages:  1428 kB
Mapped: 3552 kB
Shmem: 10904 kB
KReclaimable:608 kB
Slab:   2960 kB
SReclaimable:608 kB
SUnreclaim: 2352 kB
KernelStack: 312 kB
PageTables:  136 kB
NFS_Unstable:  0 kB
Bounce:0 kB
WritebackTmp:  0 kB
CommitLimit:   61248 kB
Committed_AS:  14336 kB
VmallocTotal: 901120 kB
VmallocUsed:  64 kB
VmallocChunk:  0 kB
Percpu:   32 kB
CmaTotal:  0 kB
CmaFree:   0 kB




Re: crashkernel reservation failed - No suitable area found on a cortina/gemini SoC

2021-04-07 Thread Bruce Mitchell

On 4/7/2021 07:23, Corentin Labbe wrote:

On Wed, Apr 07, 2021 at 07:13:04AM -0700, Bruce Mitchell wrote:

On 4/7/2021 05:54, Corentin Labbe wrote:

Hello

I am trying to use kexec on a cortina/gemini SoC.
On a "normal" boot, kexec fails to find memory, so I added crashkernel=8M to
the cmdline (the kernel size is ~6M).
But now the kernel fails to reserve memory:
Load Kern image from 0x3002 to 0x80 size 7340032
Booting Linux on physical CPU 0x0
Linux version 5.12.0-rc5-next-20210401+ (compile@Red) 
(armv7a-unknown-linux-gnueabihf-gcc (Gentoo 9.3.0-r2 p4) 9.3.0, GNU ld (Gentoo 
2.34 p6) 2.34.0) #98 PREEMPT Wed Apr 7 14:14:08 CEST 2021
CPU: FA526 [66015261] revision 1 (ARMv4), cr=397f
CPU: VIVT data cache, VIVT instruction cache
OF: fdt: Machine model: Edimax NS-2502
Memory policy: Data cache writeback
Zone ranges:
Normal   [mem 0x-0x07ff]
HighMem  empty
Movable zone start for each node
Early memory node ranges
node   0: [mem 0x-0x07ff]
Initmem setup node 0 [mem 0x-0x07ff]
crashkernel reservation failed - No suitable area found.
Built 1 zonelists, mobility grouping on.  Total pages: 32512
Kernel command line: console=ttyS0,19200n8 ip=dhcp crashkernel=8M
Dentry cache hash table entries: 16384 (order: 4, 65536 bytes, linear)
Inode-cache hash table entries: 8192 (order: 3, 32768 bytes, linear)
mem auto-init: stack:off, heap alloc:off, heap free:off
Memory: 119476K/131072K available (5034K kernel code, 579K rwdata, 1372K 
rodata, 3020K init, 210K bss, 11596K reserved, 0K cma-reserved, 0K highmem)
SLUB: HWalign=32, Order=0-3, MinObjects=0, CPUs=1, Nodes=1

What can I do?

Thanks
Regards




Hello Corentin,

I see much larger crashkernel=xxM being shown here
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/admin-guide/kdump/kdump.rst
and from many of my other searches.

Here is an interesting article on kdump for ARM-32
https://kaiwantech.wordpress.com/2017/07/13/setting-up-kdump-and-crash-for-arm-32-an-ongoing-saga/


Here is the kernel command line reference
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/admin-guide/kernel-parameters.txt?h=v5.11#n732

I feel your frustrations too.


Hello

Thanks, but I have already read that documentation.
I am trying to find out why the kernel cannot find 8M of memory out of 128.

Regards



How much more memory do the kernel and initrd use above and beyond just
their physical size?  (heaps, stacks, buffers, virtual filesystems)





Re: crashkernel reservation failed - No suitable area found on a cortina/gemini SoC

2021-04-07 Thread Corentin Labbe
Le Wed, Apr 07, 2021 at 07:13:04AM -0700, Bruce Mitchell a écrit :
> On 4/7/2021 05:54, Corentin Labbe wrote:
> > Hello
> > 
> > I am trying to use kexec on a cortina/gemini SoC.
> > On a "normal" boot, kexec fails to find memory, so I added crashkernel=8M to
> > the cmdline (the kernel size is ~6M).
> > But now the kernel fails to reserve memory:
> > Load Kern image from 0x3002 to 0x80 size 7340032
> > Booting Linux on physical CPU 0x0
> > Linux version 5.12.0-rc5-next-20210401+ (compile@Red) 
> > (armv7a-unknown-linux-gnueabihf-gcc (Gentoo 9.3.0-r2 p4) 9.3.0, GNU ld 
> > (Gentoo 2.34 p6) 2.34.0) #98 PREEMPT Wed Apr 7 14:14:08 CEST 2021
> > CPU: FA526 [66015261] revision 1 (ARMv4), cr=397f
> > CPU: VIVT data cache, VIVT instruction cache
> > OF: fdt: Machine model: Edimax NS-2502
> > Memory policy: Data cache writeback
> > Zone ranges:
> >Normal   [mem 0x-0x07ff]
> >HighMem  empty
> > Movable zone start for each node
> > Early memory node ranges
> >node   0: [mem 0x-0x07ff]
> > Initmem setup node 0 [mem 0x-0x07ff]
> > crashkernel reservation failed - No suitable area found.
> > Built 1 zonelists, mobility grouping on.  Total pages: 32512
> > Kernel command line: console=ttyS0,19200n8 ip=dhcp crashkernel=8M
> > Dentry cache hash table entries: 16384 (order: 4, 65536 bytes, linear)
> > Inode-cache hash table entries: 8192 (order: 3, 32768 bytes, linear)
> > mem auto-init: stack:off, heap alloc:off, heap free:off
> > Memory: 119476K/131072K available (5034K kernel code, 579K rwdata, 1372K 
> > rodata, 3020K init, 210K bss, 11596K reserved, 0K cma-reserved, 0K highmem)
> > SLUB: HWalign=32, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
> > 
> > What can I do?
> > 
> > Thanks
> > Regards
> > 
> > 
> 
> Hello Corentin,
> 
> I see much larger crashkernel=xxM being shown here
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/admin-guide/kdump/kdump.rst
> and from many of my other searches.
> 
> Here is an interesting article on kdump for ARM-32
> https://kaiwantech.wordpress.com/2017/07/13/setting-up-kdump-and-crash-for-arm-32-an-ongoing-saga/
> 
> 
> Here is the kernel command line reference
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/admin-guide/kernel-parameters.txt?h=v5.11#n732
> 
> I feel your frustrations too.

Hello

Thanks, but I have already read that documentation.
I am trying to find out why the kernel cannot find 8M of memory out of 128.

Regards



Re: crashkernel reservation failed - No suitable area found on a cortina/gemini SoC

2021-04-07 Thread Bruce Mitchell

On 4/7/2021 05:54, Corentin Labbe wrote:

Hello

I am trying to use kexec on a cortina/gemini SoC.
On a "normal" boot, kexec fails to find memory, so I added crashkernel=8M to
the cmdline (the kernel size is ~6M).
But now the kernel fails to reserve memory:
Load Kern image from 0x3002 to 0x80 size 7340032
Booting Linux on physical CPU 0x0
Linux version 5.12.0-rc5-next-20210401+ (compile@Red) 
(armv7a-unknown-linux-gnueabihf-gcc (Gentoo 9.3.0-r2 p4) 9.3.0, GNU ld (Gentoo 
2.34 p6) 2.34.0) #98 PREEMPT Wed Apr 7 14:14:08 CEST 2021
CPU: FA526 [66015261] revision 1 (ARMv4), cr=397f
CPU: VIVT data cache, VIVT instruction cache
OF: fdt: Machine model: Edimax NS-2502
Memory policy: Data cache writeback
Zone ranges:
  Normal   [mem 0x-0x07ff]
  HighMem  empty
Movable zone start for each node
Early memory node ranges
  node   0: [mem 0x-0x07ff]
Initmem setup node 0 [mem 0x-0x07ff]
crashkernel reservation failed - No suitable area found.
Built 1 zonelists, mobility grouping on.  Total pages: 32512
Kernel command line: console=ttyS0,19200n8 ip=dhcp crashkernel=8M
Dentry cache hash table entries: 16384 (order: 4, 65536 bytes, linear)
Inode-cache hash table entries: 8192 (order: 3, 32768 bytes, linear)
mem auto-init: stack:off, heap alloc:off, heap free:off
Memory: 119476K/131072K available (5034K kernel code, 579K rwdata, 1372K 
rodata, 3020K init, 210K bss, 11596K reserved, 0K cma-reserved, 0K highmem)
SLUB: HWalign=32, Order=0-3, MinObjects=0, CPUs=1, Nodes=1

What can I do?

Thanks
Regards




Hello Corentin,

I see much larger crashkernel=xxM being shown here
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/admin-guide/kdump/kdump.rst
and from many of my other searches.

Here is an interesting article on kdump for ARM-32
https://kaiwantech.wordpress.com/2017/07/13/setting-up-kdump-and-crash-for-arm-32-an-ongoing-saga/


Here is the kernel command line reference
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/admin-guide/kernel-parameters.txt?h=v5.11#n732

I feel your frustrations too.


--
Bruce



[PATCH] x86/efi: Do not release sub-1MB memory regions when the crashkernel option is specified

2021-04-07 Thread Lianbo Jiang
Some sub-1MB memory regions may be reserved by EFI boot services, and the
memory regions will be released later in the efi_free_boot_services().

Currently, all sub-1MB memory regions are unconditionally reserved when the
crashkernel option is specified. Unfortunately, EFI boot services may have
already reserved some sub-1MB memory regions before crash_reserve_low_1M()
is called, so crash_reserve_low_1M() ends up owning only the remaining
sub-1MB memory regions, not all of them; the regions reserved by EFI boot
services are freed again later in efi_free_boot_services(). Eventually, DMA
is able to allocate memory from the sub-1MB area and cause the following
error:

crash> kmem -s |grep invalid
kmem: dma-kmalloc-512: slab: d52c40001900 invalid freepointer: 
9403c0067300
kmem: dma-kmalloc-512: slab: d52c40001900 invalid freepointer: 
9403c0067300
crash> vtop 9403c0067300
VIRTUAL   PHYSICAL
9403c0067300  67300   --->The physical address falls into this range 
[0x00063000-0x0008efff]

kernel debugging log:
...
[0.008927] memblock_reserve: [0x0001-0x00013fff] 
efi_reserve_boot_services+0x85/0xd0
[0.008930] memblock_reserve: [0x00063000-0x0008efff] 
efi_reserve_boot_services+0x85/0xd0
...
[0.009425] memblock_reserve: [0x-0x000f] 
crash_reserve_low_1M+0x2c/0x49
...
[0.010586] Zone ranges:
[0.010587]   DMA  [mem 0x1000-0x00ff]
[0.010589]   DMA32[mem 0x0100-0x]
[0.010591]   Normal   [mem 0x0001-0x000c7fff]
[0.010593]   Device   empty
...
[8.814894] __memblock_free_late: [0x00063000-0x0008efff] 
efi_free_boot_services+0x14b/0x23b
[8.815793] __memblock_free_late: [0x0001-0x00013fff] 
efi_free_boot_services+0x14b/0x23b

Do not release sub-1MB memory regions even when they were reserved by EFI
boot services, so that all sub-1MB memory regions are always kept reserved
when the crashkernel option is specified.

Signed-off-by: Lianbo Jiang 
---
 arch/x86/platform/efi/quirks.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/arch/x86/platform/efi/quirks.c b/arch/x86/platform/efi/quirks.c
index 67d93a243c35..637f932c4fd4 100644
--- a/arch/x86/platform/efi/quirks.c
+++ b/arch/x86/platform/efi/quirks.c
@@ -18,6 +18,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #define EFI_MIN_RESERVE 5120
 
@@ -303,6 +304,19 @@ void __init efi_arch_mem_reserve(phys_addr_t addr, u64 
size)
  */
 static __init bool can_free_region(u64 start, u64 size)
 {
+   /*
+* Some sub-1MB memory regions may be reserved by EFI boot
+* services, and these memory regions will be released later
+* in the efi_free_boot_services().
+*
+* Do not release sub-1MB memory regions even though they are
+* reserved by EFI boot services, because all sub-1MB memory
+* must stay reserved when the crashkernel option is specified.
+*/
+   if (cmdline_find_option(boot_command_line, "crashkernel", NULL, 0) > 0
+   && (start + size < (1<<20)))
+   return false;
+
if (start + size > __pa_symbol(_text) && start <= __pa_symbol(_end))
return false;
 
-- 
2.17.1




crashkernel reservation failed - No suitable area found on a cortina/gemini SoC

2021-04-07 Thread Corentin Labbe
Hello

I am trying to use kexec on a cortina/gemini SoC.
On a "normal" boot, kexec fails to find memory, so I added crashkernel=8M to
the cmdline (the kernel size is ~6M).
But now the kernel fails to reserve memory:
Load Kern image from 0x3002 to 0x80 size 7340032
Booting Linux on physical CPU 0x0
Linux version 5.12.0-rc5-next-20210401+ (compile@Red)
(armv7a-unknown-linux-gnueabihf-gcc (Gentoo 9.3.0-r2 p4) 9.3.0, GNU ld (Gentoo
2.34 p6) 2.34.0) #98 PREEMPT Wed Apr 7 14:14:08 CEST 2021
CPU: FA526 [66015261] revision 1 (ARMv4), cr=397f
CPU: VIVT data cache, VIVT instruction cache
OF: fdt: Machine model: Edimax NS-2502
Memory policy: Data cache writeback
Zone ranges:
  Normal   [mem 0x-0x07ff]
  HighMem  empty
Movable zone start for each node
Early memory node ranges
  node   0: [mem 0x-0x07ff]
Initmem setup node 0 [mem 0x-0x07ff]
crashkernel reservation failed - No suitable area found.
Built 1 zonelists, mobility grouping on.  Total pages: 32512
Kernel command line: console=ttyS0,19200n8 ip=dhcp crashkernel=8M
Dentry cache hash table entries: 16384 (order: 4, 65536 bytes, linear)
Inode-cache hash table entries: 8192 (order: 3, 32768 bytes, linear)
mem auto-init: stack:off, heap alloc:off, heap free:off
Memory: 119476K/131072K available (5034K kernel code, 579K rwdata, 1372K
rodata, 3020K init, 210K bss, 11596K reserved, 0K cma-reserved, 0K highmem)
SLUB: HWalign=32, Order=0-3, MinObjects=0, CPUs=1, Nodes=1

What can I do?

Thanks
Regards
