On this HPE Apollo 70 arm64 server with 256 CPUs, triggering a crash
dump just hung. It has 4 threads on each core. Each 2-core share a same
L1 and L2 caches, so that is 8 CPUs shares those. All CPUs share a same
L3 cache.

It turned out that this was due to the TLB contained stale entries (or
uninitialized junk which just happened to look valid) from the first
kernel before turning the MMU on in the second kernel which caused this
instruction hung,

msr     sctlr_el1, x0

Signed-off-by: Qian Cai <c...@lca.pw>
---
 arch/arm64/kernel/head.S | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 4471f570a295..5196f3d729de 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -771,6 +771,10 @@ ENTRY(__enable_mmu)
        msr     ttbr0_el1, x2                   // load TTBR0
        msr     ttbr1_el1, x1                   // load TTBR1
        isb
+       dsb     nshst
+       tlbi    vmalle1                         // invalidate TLB
+       dsb     nsh
+       isb
        msr     sctlr_el1, x0
        isb
        /*
-- 
2.17.2 (Apple Git-113)

Reply via email to