Putting the logical ASID into CR3's PCID bits directly means that we
have two cases to consider separately: ASID == 0 and ASID != 0.
This means that bugs that only hit in one of these cases trigger
nondeterministically.

There were some bugs like this in the past, and I think there's
still one in current kernels.  In particular, we have a number of
ASID-unaware code paths that save CR3, write some special value, and
then restore CR3.  This includes suspend/resume, hibernate, kexec,
EFI, and maybe other things I've missed.  This is currently
dangerous: if ASID != 0, then this code sequence will leave garbage
in the TLB tagged for ASID 0.  We could potentially see corruption
when switching back to ASID 0.  In principle, an
initialize_tlbstate_and_flush() call after these sequences would
solve the problem, but EFI, at least, does not call this.  (And it
probably shouldn't -- initialize_tlbstate_and_flush() is rather
expensive.)
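
As a rough illustration of the hazard, here is a user-space model, not
kernel code.  The model_* helpers and MODEL_PCID_MASK are hypothetical
names invented for this sketch; the only facts taken from the patch are
that the old scheme ORs the ASID into CR3 directly and the new scheme
uses ASID + 1.  It assumes the PCID occupies CR3 bits 11:0 and that
PCID-unaware code loads CR3 with those bits clear.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: CR3 bits 11:0 carry the PCID when CR4.PCIDE is set. */
#define MODEL_PCID_MASK 0xfffULL

/* Old scheme: the logical ASID is used as the PCID directly. */
static inline uint64_t model_cr3_old(uint64_t pgd_pa, uint16_t asid)
{
	return pgd_pa | asid;
}

/* New scheme: ASID-aware code uses PCID = ASID + 1, never PCID 0. */
static inline uint64_t model_cr3_new(uint64_t pgd_pa, uint16_t asid)
{
	assert(asid <= 4094);	/* ASID + 1 must still fit in 12 bits */
	return pgd_pa | ((uint64_t)asid + 1);
}

/*
 * PCID-unaware code writes CR3 with the PCID bits clear, so any TLB
 * entries it leaves behind are tagged PCID 0.  They can only be
 * mistaken for an ASID's entries if that ASID's CR3 also uses PCID 0.
 */
static inline bool model_shares_pcid_with_unaware(uint64_t cr3)
{
	return (cr3 & MODEL_PCID_MASK) == 0;
}
```

In this model, the old encoding makes ASID 0 share PCID 0 with the
unaware code paths, while the new encoding leaves no ASID on PCID 0.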

Signed-off-by: Andy Lutomirski <[email protected]>
---
 arch/x86/include/asm/mmu_context.h | 21 +++++++++++++++++++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h
index a999ba6b721f..9001f7a52216 100644
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -286,14 +286,31 @@ static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
        return __pkru_allows_pkey(vma_pkey(vma), write);
 }
 
+/*
+ * If PCID is on, ASID-aware code paths put the ASID + 1 into the PCID
+ * bits.  This serves two purposes.  It prevents a nasty situation in
+ * which PCID-unaware code saves CR3, loads some other value (with PCID
+ * == 0), and then restores CR3, thus corrupting the TLB for ASID 0 if
+ * the saved ASID was nonzero.  It also means that any bugs involving
+ * loading a PCID-enabled CR3 with CR4.PCIDE off will trigger
+ * deterministically.
+ */
+
 static inline unsigned long build_cr3(struct mm_struct *mm, u16 asid)
 {
-       return __sme_pa(mm->pgd) | asid;
+       if (static_cpu_has(X86_FEATURE_PCID)) {
+               VM_WARN_ON_ONCE(asid > 4094);
+               return __sme_pa(mm->pgd) | (asid + 1);
+       } else {
+               VM_WARN_ON_ONCE(asid != 0);
+               return __sme_pa(mm->pgd);
+       }
 }
 
 static inline unsigned long build_cr3_noflush(struct mm_struct *mm, u16 asid)
 {
-       return __sme_pa(mm->pgd) | asid | CR3_NOFLUSH;
+       VM_WARN_ON_ONCE(asid > 4094);
+       return __sme_pa(mm->pgd) | (asid + 1) | CR3_NOFLUSH;
 }
 
 /*
-- 
2.13.5
