Re: [PATCH RFC V3 7/9] x86/entry: Preserve PKRS MSR across exceptions

2020-10-14 Thread Ira Weiny
On Wed, Oct 14, 2020 at 09:06:44PM -0700, Dave Hansen wrote:
> On 10/14/20 8:46 PM, Ira Weiny wrote:
> > On Tue, Oct 13, 2020 at 11:52:32AM -0700, Dave Hansen wrote:
> >> On 10/9/20 12:42 PM, ira.we...@intel.com wrote:
> >>> @@ -341,6 +341,9 @@ noinstr void irqentry_enter(struct pt_regs *regs, 
> >>> irqentry_state_t *state)
> >>>   /* Use the combo lockdep/tracing function */
> >>>   trace_hardirqs_off();
> >>>   instrumentation_end();
> >>> +
> >>> +done:
> >>> + irq_save_pkrs(state);
> >>>  }
> >> One nit: This saves *and* sets PKRS.  It's not obvious from the call
> >> here that PKRS is altered at this site.  Seems like there could be a
> >> better name.
> >>
> >> Even if we did:
> >>
> >>	irq_save_set_pkrs(state, INIT_VAL);
> >>
> >> It would probably compile down to the same thing, but be *really*
> >> obvious what's going on.
> > I suppose that is true.  But I think it is odd having a parameter which is the
> > same for every call site.
> 
> Well, it depends on what you optimize for.  I'm trying to optimize for
> the code being understood quickly the first time someone reads it.  To
> me, that's more important than minimizing the number of function
> parameters (which are essentially free).
>

Agreed.  Sorry, I was not trying to be confrontational.  There are just enough
other things that will take me time to get right that I need to focus on
them...  :-D

Sorry,
Ira


Re: [PATCH RFC V3 7/9] x86/entry: Preserve PKRS MSR across exceptions

2020-10-14 Thread Dave Hansen
On 10/14/20 8:46 PM, Ira Weiny wrote:
> On Tue, Oct 13, 2020 at 11:52:32AM -0700, Dave Hansen wrote:
>> On 10/9/20 12:42 PM, ira.we...@intel.com wrote:
>>> @@ -341,6 +341,9 @@ noinstr void irqentry_enter(struct pt_regs *regs, 
>>> irqentry_state_t *state)
>>> /* Use the combo lockdep/tracing function */
>>> trace_hardirqs_off();
>>> instrumentation_end();
>>> +
>>> +done:
>>> +   irq_save_pkrs(state);
>>>  }
>> One nit: This saves *and* sets PKRS.  It's not obvious from the call
>> here that PKRS is altered at this site.  Seems like there could be a
>> better name.
>>
>> Even if we did:
>>
>>  irq_save_set_pkrs(state, INIT_VAL);
>>
>> It would probably compile down to the same thing, but be *really*
>> obvious what's going on.
> I suppose that is true.  But I think it is odd having a parameter which is the
> same for every call site.

Well, it depends on what you optimize for.  I'm trying to optimize for
the code being understood quickly the first time someone reads it.  To
me, that's more important than minimizing the number of function
parameters (which are essentially free).


Re: [PATCH RFC V3 7/9] x86/entry: Preserve PKRS MSR across exceptions

2020-10-14 Thread Ira Weiny
On Tue, Oct 13, 2020 at 11:52:32AM -0700, Dave Hansen wrote:
> On 10/9/20 12:42 PM, ira.we...@intel.com wrote:
> > @@ -341,6 +341,9 @@ noinstr void irqentry_enter(struct pt_regs *regs, 
> > irqentry_state_t *state)
> > /* Use the combo lockdep/tracing function */
> > trace_hardirqs_off();
> > instrumentation_end();
> > +
> > +done:
> > +   irq_save_pkrs(state);
> >  }
> 
> One nit: This saves *and* sets PKRS.  It's not obvious from the call
> here that PKRS is altered at this site.  Seems like there could be a
> better name.
> 
> Even if we did:
> 
>   irq_save_set_pkrs(state, INIT_VAL);
> 
> It would probably compile down to the same thing, but be *really*
> obvious what's going on.

I suppose that is true.  But I think it is odd having a parameter which is the
same for every call site.

But I'm not going to quibble over something like this.

Changed,
Ira

> 
> >  void irqentry_exit_cond_resched(void)
> > @@ -362,7 +365,12 @@ noinstr void irqentry_exit(struct pt_regs *regs, 
> > irqentry_state_t *state)
> > /* Check whether this returns to user mode */
> > if (user_mode(regs)) {
> > irqentry_exit_to_user_mode(regs);
> > -   } else if (!regs_irqs_disabled(regs)) {
> > +   return;
> > +   }
> > +
> > +   irq_restore_pkrs(state);
> > +
> > +   if (!regs_irqs_disabled(regs)) {
> > /*
> >  * If RCU was not watching on entry this needs to be done
> >  * carefully and needs the same ordering of lockdep/tracing
> > 
> 


Re: [PATCH RFC V3 7/9] x86/entry: Preserve PKRS MSR across exceptions

2020-10-13 Thread Dave Hansen
On 10/9/20 12:42 PM, ira.we...@intel.com wrote:
> @@ -341,6 +341,9 @@ noinstr void irqentry_enter(struct pt_regs *regs, 
> irqentry_state_t *state)
>   /* Use the combo lockdep/tracing function */
>   trace_hardirqs_off();
>   instrumentation_end();
> +
> +done:
> + irq_save_pkrs(state);
>  }

One nit: This saves *and* sets PKRS.  It's not obvious from the call
here that PKRS is altered at this site.  Seems like there could be a
better name.

Even if we did:

	irq_save_set_pkrs(state, INIT_VAL);

It would probably compile down to the same thing, but be *really*
obvious what's going on.
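
To make the naming point concrete, here is a minimal userspace sketch, not
kernel code.  Every demo_ name, the struct, and the value of DEMO_INIT_VAL
are assumptions made up for illustration.  Both variants do the same work;
only the second one shows at the call site that the register is written.

```c
#include <stdint.h>

#define DEMO_INIT_VAL 0u	/* placeholder, not the real default value */

static uint32_t demo_pkrs;	/* stand-in for the PKRS register */

struct demo_state {
	uint32_t pkrs;		/* value that was live on entry */
};

/* The name hides the write: a reader must open the body to see it. */
static inline void demo_irq_save_pkrs(struct demo_state *state)
{
	state->pkrs = demo_pkrs;
	demo_pkrs = DEMO_INIT_VAL;
}

/* The name and the argument both announce the write at the call site. */
static inline void demo_irq_save_set_pkrs(struct demo_state *state,
					  uint32_t val)
{
	state->pkrs = demo_pkrs;
	demo_pkrs = val;
}

/*
 * Run one variant from a given starting value and return the register
 * value the "handler" would then see.
 */
static uint32_t demo_enter(int explicit_variant, uint32_t start,
			   struct demo_state *state)
{
	demo_pkrs = start;
	if (explicit_variant)
		demo_irq_save_set_pkrs(state, DEMO_INIT_VAL);
	else
		demo_irq_save_pkrs(state);
	return demo_pkrs;
}
```

Calling demo_enter() with either variant and the same start value leaves the
same saved value in the state and the same value in the register, so the
extra parameter changes readability and not behaviour.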

>  void irqentry_exit_cond_resched(void)
> @@ -362,7 +365,12 @@ noinstr void irqentry_exit(struct pt_regs *regs, 
> irqentry_state_t *state)
>   /* Check whether this returns to user mode */
>   if (user_mode(regs)) {
>   irqentry_exit_to_user_mode(regs);
> - } else if (!regs_irqs_disabled(regs)) {
> + return;
> + }
> +
> + irq_restore_pkrs(state);
> +
> + if (!regs_irqs_disabled(regs)) {
>   /*
>* If RCU was not watching on entry this needs to be done
>* carefully and needs the same ordering of lockdep/tracing
> 



[PATCH RFC V3 7/9] x86/entry: Preserve PKRS MSR across exceptions

2020-10-09 Thread ira . weiny
From: Ira Weiny 

The PKRS MSR is not managed by XSAVE.  It is preserved across context
switches, but that alone leaves exception handlers running with the
PKRS value, and therefore the memory access, of the interrupted task.

Two possible places for preserving this state were considered:
irqentry_state_t or pt_regs.[1]  pt_regs was much more complicated and
was potentially fraught with unintended consequences.[2]
irqentry_state_t was already an object being used in the exception
handling and is straightforward.  It is also easy for any number of
nested states to be tracked, and it can eventually be enhanced to store
the reference counting required to support PKS through kmap reentry.

Preserve the current task's PKRS values in irqentry_state_t on exception
entry and restore them on exception exit.

Each nested exception is further saved allowing for any number of levels
of exception handling.
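
As a rough illustration of the nesting described above, the following is a
userspace simulation only.  The sim_ names, the struct layout, and the
value of SIM_INIT_PKRS are hypothetical, and the thread_pkrs bookkeeping
from the patch is left out.  Each nesting level keeps its own state object
on the stack, so any depth unwinds correctly.

```c
#include <stdint.h>

#define SIM_INIT_PKRS 0u	/* placeholder default, not the kernel's value */

static uint32_t sim_pkrs;	/* stand-in for the per-CPU PKRS MSR */

struct sim_irqentry_state {
	uint32_t pkrs;		/* value in effect when the exception hit */
};

/* Entry: remember the interrupted value, then drop to the default. */
static void sim_irq_save_pkrs(struct sim_irqentry_state *state)
{
	state->pkrs = sim_pkrs;
	sim_pkrs = SIM_INIT_PKRS;
}

/* Exit: put back whatever was interrupted. */
static void sim_irq_restore_pkrs(const struct sim_irqentry_state *state)
{
	sim_pkrs = state->pkrs;
}

/*
 * Simulate 'depth' nested exceptions.  Each handler starts from the
 * default, relaxes its own access, and is then interrupted by the next
 * level.  Returns how many levels both started from the default and got
 * their interrupted value back on exit.
 */
static int sim_exception(int depth)
{
	struct sim_irqentry_state state;	/* one per nesting level */
	uint32_t interrupted = sim_pkrs;
	int started_default;
	int ok = 0;

	sim_irq_save_pkrs(&state);
	started_default = (sim_pkrs == SIM_INIT_PKRS);

	if (depth > 1) {
		sim_pkrs = 0x10u + (uint32_t)depth;	/* handler's own access */
		ok = sim_exception(depth - 1);
	}

	sim_irq_restore_pkrs(&state);
	return ok + (started_default && sim_pkrs == interrupted);
}
```

Setting sim_pkrs to a task value and calling sim_exception(3) walks three
levels down and back up; the return value counts the levels that behaved
as expected, and sim_pkrs ends up holding the original task value again.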

Peter and Thomas both suggested parts of the patch, IDT and NMI respectively.

[1] https://lore.kernel.org/lkml/calcetrve1i5jdyzd_bcctxqjn+ze3t38efpgjxn1f577m36...@mail.gmail.com/
[2] https://lore.kernel.org/lkml/874kpxx4jf@nanos.tec.linutronix.de/#t

Cc: Dave Hansen 
Cc: Andy Lutomirski 
Suggested-by: Peter Zijlstra 
Suggested-by: Thomas Gleixner 
Signed-off-by: Ira Weiny 
---
 arch/x86/entry/common.c | 43 +
 arch/x86/include/asm/pkeys_common.h |  5 ++--
 arch/x86/kernel/cpu/mce/core.c  |  4 +++
 arch/x86/mm/pkeys.c |  2 +-
 include/linux/entry-common.h| 12 
 kernel/entry/common.c   | 12 ++--
 6 files changed, 73 insertions(+), 5 deletions(-)

diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index 305da13770b6..324a8fd5ac10 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -19,6 +19,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #ifdef CONFIG_XEN_PV
 #include 
@@ -222,6 +223,8 @@ noinstr void idtentry_enter_nmi(struct pt_regs *regs, irqentry_state_t *irq_stat
trace_hardirqs_off_finish();
ftrace_nmi_enter();
instrumentation_end();
+
+   irq_save_pkrs(irq_state);
 }
 
 noinstr void idtentry_exit_nmi(struct pt_regs *regs, irqentry_state_t *irq_state)
@@ -238,9 +241,47 @@ noinstr void idtentry_exit_nmi(struct pt_regs *regs, irqentry_state_t *irq_state
lockdep_hardirq_exit();
if (irq_state->exit_rcu)
lockdep_hardirqs_on(CALLER_ADDR0);
+
+   irq_restore_pkrs(irq_state);
__nmi_exit();
 }
 
+#ifdef CONFIG_ARCH_HAS_SUPERVISOR_PKEYS
+/*
+ * PKRS is a per-logical-processor MSR which overlays additional protection for
+ * pages which have been mapped with a protection key.
+ *
+ * The register is not maintained with XSAVE so we have to maintain the MSR
+ * value in software during context switch and exception handling.
+ *
+ * Context switches save the MSR in the task struct thus taking that value to
+ * other processors if necessary.
+ *
+ * To protect against exceptions having access to this memory we save the
+ * current running value and set the default PKRS value for the duration of the
+ * exception.  Thus preventing exception handlers from having the elevated
+ * access of the interrupted task.
+ */
+noinstr void irq_save_pkrs(irqentry_state_t *state)
+{
+   if (!cpu_feature_enabled(X86_FEATURE_PKS))
+   return;
+
+   state->thread_pkrs = current->thread.saved_pkrs;
+   state->pkrs = this_cpu_read(pkrs_cache);
+   write_pkrs(INIT_PKRS_VALUE);
+}
+
+noinstr void irq_restore_pkrs(irqentry_state_t *state)
+{
+   if (!cpu_feature_enabled(X86_FEATURE_PKS))
+   return;
+
+   write_pkrs(state->pkrs);
+   current->thread.saved_pkrs = state->thread_pkrs;
+}
+#endif /* CONFIG_ARCH_HAS_SUPERVISOR_PKEYS */
+
 #ifdef CONFIG_XEN_PV
 #ifndef CONFIG_PREEMPTION
 /*
@@ -304,6 +345,8 @@ __visible noinstr void xen_pv_evtchn_do_upcall(struct pt_regs *regs)
 
inhcall = get_and_clear_inhcall();
if (inhcall && !WARN_ON_ONCE(state.exit_rcu)) {
+   /* Normally called by irqentry_exit, we must restore pkrs here */
+   irq_restore_pkrs(&state);
instrumentation_begin();
irqentry_exit_cond_resched();
instrumentation_end();
diff --git a/arch/x86/include/asm/pkeys_common.h b/arch/x86/include/asm/pkeys_common.h
index 40464c170522..8961e2ddd6ff 100644
--- a/arch/x86/include/asm/pkeys_common.h
+++ b/arch/x86/include/asm/pkeys_common.h
@@ -27,9 +27,10 @@
 #define PKS_NUM_KEYS		16
 
 #ifdef CONFIG_ARCH_HAS_SUPERVISOR_PKEYS
-void write_pkrs(u32 new_pkrs);
+DECLARE_PER_CPU(u32, pkrs_cache);
+noinstr void write_pkrs(u32 new_pkrs);
 #else
-static inline void write_pkrs(u32 new_pkrs) { }
+static __always_inline void write_pkrs(u32 new_pkrs) { }
 #endif
 
 #endif /*_ASM_X86_PKEYS_INTERNAL_H */
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index f43a78bde670..abcd41f19669 100644
--- a/arch/x86/kernel/cpu/mce/core.c