4.15-stable review patch.  If anyone has any objections, please let me know.


From: Nicholas Piggin <npig...@gmail.com>

commit eeb715c3e995fbdda0cc05e61216c6c5609bce66 upstream.

This change restores and formalises the behaviour that access to NULL
or other user addresses by the kernel during boot should fault rather
than succeed and modify memory. This was inadvertently broken when
fixing another bug, because it was previously not well defined and
only worked by chance.

powerpc/64s/radix uses high address bits to select an address space
"quadrant", which determines which PID and LPID are used to translate
the rest of the address (effective PID, effective LPID). The kernel
mapping at 0xC... selects quadrant 3, which uses PID=0 and LPID=0. So
the kernel page tables are installed in the PID 0 process table entry.

An address at 0x0... selects quadrant 0, which uses PID=PIDR for
translating the rest of the address (that is, it uses the value of the
PIDR register as the effective PID). If PIDR=0, then the translation
is performed with the PID 0 process table entry page tables. This is
the kernel mapping, so we effectively get another copy of the kernel
address space at 0. A NULL pointer access will access physical memory
address 0.

To prevent duplicating the kernel address space in quadrant 0, this
patch allocates a guard PID containing no translations, and
initializes PIDR with this during boot, before the MMU is switched on.
Any kernel access to quadrant 0 will use this guard PID for
translation and find no valid mappings, and therefore fault.

After boot, this PID will be switchd away to user context PIDs, but
those contain user mappings (and usually NULL pointer protection)
rather than kernel mapping, which is much safer (and by design). It
may be in future this is tightened further, which the guard PID could
be used for.

Commit 371b8044 ("powerpc/64s: Initialize ISAv3 MMU registers before
setting partition table"), introduced this problem because it zeroes
PIDR at boot. However previously the value was inherited from firmware
or kexec, which is not robust and can be zero (e.g., mambo).

Fixes: 371b80447ff3 ("powerpc/64s: Initialize ISAv3 MMU registers before 
setting partition table")
Cc: sta...@vger.kernel.org # v4.15+
Reported-by: Florian Weimer <fwei...@redhat.com>
Tested-by: Mauricio Faria de Oliveira <mauri...@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npig...@gmail.com>
Signed-off-by: Michael Ellerman <m...@ellerman.id.au>
[mauricfo: backport to v4.15.7 (context line updates only) and re-test]
Signed-off-by: Mauricio Faria de Oliveira <mauri...@linux.vnet.ibm.com>
Acked-by: Michael Ellerman <m...@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

 arch/powerpc/mm/pgtable-radix.c |   20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

--- a/arch/powerpc/mm/pgtable-radix.c
+++ b/arch/powerpc/mm/pgtable-radix.c
@@ -21,6 +21,7 @@
 #include <asm/pgtable.h>
 #include <asm/pgalloc.h>
+#include <asm/mmu_context.h>
 #include <asm/dma.h>
 #include <asm/machdep.h>
 #include <asm/mmu.h>
@@ -334,6 +335,22 @@ static void __init radix_init_pgtable(vo
                     "r" (TLBIEL_INVAL_SET_LPID), "r" (0));
        asm volatile("eieio; tlbsync; ptesync" : : : "memory");
        trace_tlbie(0, 0, TLBIEL_INVAL_SET_LPID, 0, 2, 1, 1);
+       /*
+        * The init_mm context is given the first available (non-zero) PID,
+        * which is the "guard PID" and contains no page table. PIDR should
+        * never be set to zero because that duplicates the kernel address
+        * space at the 0x0... offset (quadrant 0)!
+        *
+        * An arbitrary PID that may later be allocated by the PID allocator
+        * for userspace processes must not be used either, because that
+        * would cause stale user mappings for that PID on CPUs outside of
+        * the TLB invalidation scheme (because it won't be in mm_cpumask).
+        *
+        * So permanently carve out one PID for the purpose of a guard PID.
+        */
+       init_mm.context.id = mmu_base_pid;
+       mmu_base_pid++;
 static void __init radix_init_partition_table(void)
@@ -580,6 +597,8 @@ void __init radix__early_init_mmu(void)
+       /* Switch to the guard PID before turning on MMU */
+       radix__switch_mmu_context(NULL, &init_mm);
 void radix__early_init_mmu_secondary(void)
@@ -601,6 +620,7 @@ void radix__early_init_mmu_secondary(voi
+       radix__switch_mmu_context(NULL, &init_mm);
 void radix__mmu_cleanup_all(void)

Reply via email to