Re: [PATCH V3 06/10] x86/entry: Preserve PKRS MSR across exceptions

2020-12-17 Thread Thomas Gleixner
On Fri, Nov 06 2020 at 15:29, ira weiny wrote:
> +#ifdef CONFIG_ARCH_HAS_SUPERVISOR_PKEYS
> +/*
> + * PKRS is a per-logical-processor MSR which overlays additional protection for
> + * pages which have been mapped with a protection key.
> + *
> + * The register is not maintained with XSAVE so we have to maintain the MSR
> + * value in software during context switch and exception handling.
> + *
> + * Context switches save the MSR in the task struct thus taking that value to
> + * other processors if necessary.
> + *
> + * To protect against exceptions having access to this memory we save the
> + * current running value and set the PKRS value for the duration of the
> + * exception.  Thus preventing exception handlers from having the elevated
> + * access of the interrupted task.
> + */
> +noinstr void irq_save_set_pkrs(irqentry_state_t *irq_state, u32 val)
> +{
> +	if (!cpu_feature_enabled(X86_FEATURE_PKS))
> +		return;
> +
> +	irq_state->thread_pkrs = current->thread.saved_pkrs;
> +	write_pkrs(INIT_PKRS_VALUE);

Why is this noinstr? Just because it's called from a noinstr function?

Of course the function itself violates the noinstr constraints:

  vmlinux.o: warning: objtool: write_pkrs()+0x36: call to do_trace_write_msr() leaves .noinstr.text section

There is absolutely no reason to have this marked noinstr.

Thanks,

tglx


[PATCH V3 06/10] x86/entry: Preserve PKRS MSR across exceptions

2020-11-06 Thread ira . weiny
From: Ira Weiny 

The PKRS MSR is not managed by XSAVE.  It is preserved through a context
switch, but that alone leaves exception handlers running with the
interrupted task's PKRS value, and therefore with its elevated access to
PKS-protected memory.

Two possible places for preserving this state were considered:
irqentry_state_t or pt_regs.[1]  Using pt_regs was much more complicated
and was potentially fraught with unintended consequences.[2]
irqentry_state_t is already used in exception handling and is
straightforward.  It also makes it easy to track any number of nested
states, and it can eventually be enhanced to store the reference counting
required to support PKS through kmap reentry.

Preserve the current task's PKRS values in irqentry_state_t on exception
entry and restore them on exception exit.

The state is saved again for each nested exception, allowing any number
of levels of exception handling.
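
For illustration only, here is a minimal userspace sketch of the nesting
behaviour described above.  It is not the kernel code: the MSR is simulated
by a plain variable, struct task, struct irq_state and task_set_pkrs() are
hypothetical stand-ins, the unused 'val' argument is dropped, and the
INIT_PKRS_VALUE constant is an assumed default (access-disable bit set for
keys 1..15).

```c
#include <assert.h>
#include <stdint.h>

/* All names below are stand-ins for this sketch, not the kernel's definitions. */
#define PKR_AD_KEY(k)   (1u << ((k) * 2))  /* access-disable bit of key k */
#define INIT_PKRS_VALUE 0x55555554u        /* assumed default: AD set for keys 1..15 */

struct task { uint32_t saved_pkrs; };        /* models current->thread */
struct irq_state { uint32_t thread_pkrs; };  /* models irqentry_state_t */

static uint32_t fake_msr = INIT_PKRS_VALUE;    /* simulated PKRS MSR */
static struct task cur = { INIT_PKRS_VALUE };  /* simulated 'current' */

static void write_pkrs(uint32_t val)
{
	fake_msr = val;
}

/* Hypothetical helper: code that changes the running task's PKRS value. */
static void task_set_pkrs(uint32_t val)
{
	cur.saved_pkrs = val;
	write_pkrs(val);
}

/* Exception entry: stash the task value, drop to the default. */
static void irq_save_set_pkrs(struct irq_state *s)
{
	s->thread_pkrs = cur.saved_pkrs;
	write_pkrs(INIT_PKRS_VALUE);
}

/* Exception exit: put back what this level saved. */
static void irq_restore_pkrs(struct irq_state *s)
{
	write_pkrs(s->thread_pkrs);
	cur.saved_pkrs = s->thread_pkrs;
}

/* Walks one exception nested inside another; returns the final MSR value. */
static uint32_t run_nested_demo(void)
{
	struct irq_state outer, inner;
	const uint32_t task_val = INIT_PKRS_VALUE & ~PKR_AD_KEY(1);
	const uint32_t handler_val = INIT_PKRS_VALUE & ~PKR_AD_KEY(2);

	task_set_pkrs(task_val);              /* task enables access to key 1 */

	irq_save_set_pkrs(&outer);            /* first exception */
	assert(fake_msr == INIT_PKRS_VALUE);  /* handler lacks the task's access */

	task_set_pkrs(handler_val);           /* handler enables access to key 2 */

	irq_save_set_pkrs(&inner);            /* nested exception */
	assert(fake_msr == INIT_PKRS_VALUE);
	irq_restore_pkrs(&inner);
	assert(fake_msr == handler_val);      /* outer handler's value is back */

	irq_restore_pkrs(&outer);
	assert(fake_msr == task_val && cur.saved_pkrs == task_val);
	return fake_msr;
}
```

Because each level keeps its saved value in its own irq_state object, the
nesting depth is bounded only by the stack.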

Peter and Thomas both suggested parts of the patch, IDT and NMI respectively.

[1] https://lore.kernel.org/lkml/calcetrve1i5jdyzd_bcctxqjn+ze3t38efpgjxn1f577m36...@mail.gmail.com/
[2] https://lore.kernel.org/lkml/874kpxx4jf@nanos.tec.linutronix.de/#t

Cc: Dave Hansen 
Cc: Andy Lutomirski 
Suggested-by: Peter Zijlstra 
Suggested-by: Thomas Gleixner 
Signed-off-by: Ira Weiny 

---
Changes from V1
	remove redundant irq_state->pkrs
		This value is only needed for the global tracking.  So
		it should be included in that patch and not in this one.

Changes from RFC V3
	Standardize on 'irq_state' variable name
	Per Dave Hansen
	irq_save_pkrs() -> irq_save_set_pkrs()
	Rebased based on clean up patch by Thomas Gleixner
		This includes moving irq_[save_set|restore]_pkrs() to
		the core as well.
---
 arch/x86/entry/common.c             | 38 +
 arch/x86/include/asm/pkeys_common.h |  5 ++--
 arch/x86/mm/pkeys.c                 |  2 +-
 include/linux/entry-common.h        | 13 ++
 kernel/entry/common.c               | 14 +--
 5 files changed, 67 insertions(+), 5 deletions(-)

diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index 87dea56a15d2..1b6a419a6fac 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -19,6 +19,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #ifdef CONFIG_XEN_PV
 #include 
@@ -209,6 +210,41 @@ SYSCALL_DEFINE0(ni_syscall)
 	return -ENOSYS;
 }
 
+#ifdef CONFIG_ARCH_HAS_SUPERVISOR_PKEYS
+/*
+ * PKRS is a per-logical-processor MSR which overlays additional protection for
+ * pages which have been mapped with a protection key.
+ *
+ * The register is not maintained with XSAVE so we have to maintain the MSR
+ * value in software during context switch and exception handling.
+ *
+ * Context switches save the MSR in the task struct thus taking that value to
+ * other processors if necessary.
+ *
+ * To protect against exceptions having access to this memory we save the
+ * current running value and set the PKRS value for the duration of the
+ * exception.  Thus preventing exception handlers from having the elevated
+ * access of the interrupted task.
+ */
+noinstr void irq_save_set_pkrs(irqentry_state_t *irq_state, u32 val)
+{
+	if (!cpu_feature_enabled(X86_FEATURE_PKS))
+		return;
+
+	irq_state->thread_pkrs = current->thread.saved_pkrs;
+	write_pkrs(INIT_PKRS_VALUE);
+}
+
+noinstr void irq_restore_pkrs(irqentry_state_t *irq_state)
+{
+	if (!cpu_feature_enabled(X86_FEATURE_PKS))
+		return;
+
+	write_pkrs(irq_state->thread_pkrs);
+	current->thread.saved_pkrs = irq_state->thread_pkrs;
+}
+#endif /* CONFIG_ARCH_HAS_SUPERVISOR_PKEYS */
+
 #ifdef CONFIG_XEN_PV
 #ifndef CONFIG_PREEMPTION
 /*
@@ -272,6 +308,8 @@ __visible noinstr void xen_pv_evtchn_do_upcall(struct pt_regs *regs)
 
 	inhcall = get_and_clear_inhcall();
 	if (inhcall && !WARN_ON_ONCE(irq_state.exit_rcu)) {
+		/* Normally called by irqentry_exit, we must restore pkrs here */
+		irq_restore_pkrs(&irq_state);
 		instrumentation_begin();
 		irqentry_exit_cond_resched();
 		instrumentation_end();
diff --git a/arch/x86/include/asm/pkeys_common.h b/arch/x86/include/asm/pkeys_common.h
index 801a75615209..11a95e6efd2d 100644
--- a/arch/x86/include/asm/pkeys_common.h
+++ b/arch/x86/include/asm/pkeys_common.h
@@ -27,9 +27,10 @@
 PKR_AD_KEY(13) | PKR_AD_KEY(14) | PKR_AD_KEY(15))
 
 #ifdef CONFIG_ARCH_HAS_SUPERVISOR_PKEYS
-void write_pkrs(u32 new_pkrs);
+DECLARE_PER_CPU(u32, pkrs_cache);
+noinstr void write_pkrs(u32 new_pkrs);
 #else
-static inline void write_pkrs(u32 new_pkrs) { }
+static __always_inline void write_pkrs(u32 new_pkrs) { }
 #endif
 
 #endif /*_ASM_X86_PKEYS_INTERNAL_H */
diff --git a/arch/x86/mm/pkeys.c b/arch/x86/mm/pkeys.c
index 76a62419c446..6892d4524868 100644
--- a/arch/x86/mm/pkeys.c
+++ b/arch/x86/mm/pkeys.c
@@ -248,7 +248,7 @@