Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-02-06 Thread David Woodhouse


On Sun, 2018-02-04 at 19:43 +0100, Thomas Gleixner wrote:
> Yet another possibility is to avoid the function entry and accouting magic
> and use the generic gcc return thunk:
> 
> __x86_return_thunk:
> call L2
> L1:
> pause
> lfence
> jmp L1
> L2:
> lea 8(%rsp), %rsp|lea 4(%esp), %esp
> ret
> 
> which basically refills the RSB on every return. That can be inline or
> extern, but in both cases we should be able to patch it out.
> 
> I have no idea how that affects performance, but it might be worthwhile to
> experiment with that.

That was what I had in mind when I asked HJ to add -mfunction-return.

I suspect the performance hit would be significant because it would
cause a prediction miss on *every* return.

But as I said, let's implement what we can without IBRS for Skylake,
then we can compare the two options for performance, security coverage
and general fugliness.

smime.p7s
Description: S/MIME cryptographic signature


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-02-06 Thread David Woodhouse


On Sun, 2018-02-04 at 19:43 +0100, Thomas Gleixner wrote:
> Yet another possibility is to avoid the function entry and accouting magic
> and use the generic gcc return thunk:
> 
> __x86_return_thunk:
> call L2
> L1:
> pause
> lfence
> jmp L1
> L2:
> lea 8(%rsp), %rsp|lea 4(%esp), %esp
> ret
> 
> which basically refills the RSB on every return. That can be inline or
> extern, but in both cases we should be able to patch it out.
> 
> I have no idea how that affects performance, but it might be worthwhile to
> experiment with that.

That was what I had in mind when I asked HJ to add -mfunction-return.

I suspect the performance hit would be significant because it would
cause a prediction miss on *every* return.

But as I said, let's implement what we can without IBRS for Skylake,
then we can compare the two options for performance, security coverage
and general fugliness.

smime.p7s
Description: S/MIME cryptographic signature


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-02-04 Thread David Woodhouse
On Sun, 2018-02-04 at 19:43 +0100, Thomas Gleixner wrote:
> 
> __x86_return_thunk would look like this:
> 
> __x86_return_thunk:
> testl   $0xf, PER_CPU_VAR(call_depth)
> jnz 1f  
> stuff_rsb
>    1:
> declPER_CPU_VAR(call_depth)
> ret
> 
> The call_depth variable would be reset on context switch.

Note that the 'jnz' can be predicted taken there, allowing the CPU to
speculate all the way to the 'ret'... and beyond.



smime.p7s
Description: S/MIME cryptographic signature


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-02-04 Thread David Woodhouse
On Sun, 2018-02-04 at 19:43 +0100, Thomas Gleixner wrote:
> 
> __x86_return_thunk would look like this:
> 
> __x86_return_thunk:
> testl   $0xf, PER_CPU_VAR(call_depth)
> jnz 1f  
> stuff_rsb
>    1:
> declPER_CPU_VAR(call_depth)
> ret
> 
> The call_depth variable would be reset on context switch.

Note that the 'jnz' can be predicted taken there, allowing the CPU to
speculate all the way to the 'ret'... and beyond.



smime.p7s
Description: S/MIME cryptographic signature


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-02-04 Thread Thomas Gleixner
On Tue, 23 Jan 2018, Ingo Molnar wrote:
> * David Woodhouse  wrote:
> 
> > > On SkyLake this would add an overhead of maybe 2-3 cycles per function 
> > > call and 
> > > obviously all this code and data would be very cache hot. Given that the 
> > > average 
> > > number of function calls per system call is around a dozen, this would be 
> > > _much_ 
> > > faster than any microcode/MSR based approach.
> > 
> > That's kind of neat, except you don't want it at the top of the
> > function; you want it at the bottom.
> > 
> > If you could hijack the *return* site, then you could check for
> > underflow and stuff the RSB right there. But in __fentry__ there's not
> > a lot you can do other than complain that something bad is going to
> > happen in the future. You know that a string of 16+ rets is going to
> > happen, but you've got no gadget in *there* to deal with it when it
> > does.
> 
> No, it can be done with the existing CALL instrumentation callback that 
> CONFIG_DYNAMIC_FTRACE=y provides, by pushing a RET trampoline on the stack 
> from 
> the CALL trampoline - see my previous email.
> 
> > HJ did have patches to turn 'ret' into a form of retpoline, which I
> > don't think ever even got performance-tested.
> 
> Return instrumentation is possible as well, but there are two major drawbacks:
> 
>  - GCC support for it is not as widely available and return instrumentation 
> is 
>less tested in Linux kernel contexts
> 
>  - a major point of my suggestion is that CONFIG_DYNAMIC_FTRACE=y is already 
>enabled in distros here and today, so the runtime overhead to non-SkyLake 
> CPUs 
>would be literally zero, while still allowing to fix the RSB vulnerability 
> on 
>SkyLake.

I played around with that a bit during the week and it turns out to be less
simple than you thought.

1) Injecting a trampoline return only works for functions which have all
   arguments in registers. For functions with arguments on stack like all
   varg functions this breaks because the function wont find its arguments
   anymore.

   I have not yet found a way to figure out reliably which functions have
   arguments on stack. That might be an option to simply ignore them.

   The workaround is to replace the original return on stack with the
   trampoline and store the original return in a per thread stack, which I
   implemented. But this sucks performance wise badly.

2) Doing the whole dance on function entry has a real down side because you
   refill RSB on every 15th return no matter whether its required or
   not. That really gives a very prominent performance hit.

An alternative idea is to do the following (not yet implemented):

__fentry__:
inclPER_CPU_VAR(call_depth)
retq

and use -mfunction-return=thunk-extern which is available on retpoline
enabled compilers. That's a reasonable requirement because w/o retpoline
the whole SKL magic is pointless anyway.

-mfunction-return=thunk-extern issues

jump__x86_return_thunk

instead of ret. In the thunk we can do the whole shebang of mitigation.
That jump can be identified at build time and it can be patched into a ret
for unaffected CPUs. Ideally we do the patching at build time and only
patch the jump in when SKL is detected or paranoia requests it.

We could actually look into that for tracing as well. The only reason why
we don't do that is to select the ideal nop for the CPU the kernel runs on,
which obviously cannot be known at build time.

__x86_return_thunk would look like this:

__x86_return_thunk:
testl   $0xf, PER_CPU_VAR(call_depth)
jnz 1f  
stuff_rsb
   1:
declPER_CPU_VAR(call_depth)
ret

The call_depth variable would be reset on context switch.

Though that has another problem: tail calls. Tail calls will invoke the
__fentry__ call of the tail called function, which makes the call_depth
counter unbalanced. Tail calls can be prevented by using
-fno-optimize-sibling-calls, but that probably sucks as well.

Yet another possibility is to avoid the function entry and accouting magic
and use the generic gcc return thunk:

__x86_return_thunk:
call L2
L1:
pause
lfence
jmp L1
L2:
lea 8(%rsp), %rsp|lea 4(%esp), %esp
ret

which basically refills the RSB on every return. That can be inline or
extern, but in both cases we should be able to patch it out.

I have no idea how that affects performance, but it might be worthwhile to
experiment with that.

If nobody beats me to it, I'll play around with that some more after
vacation.

Thanks,

tglx

Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-02-04 Thread Thomas Gleixner
On Tue, 23 Jan 2018, Ingo Molnar wrote:
> * David Woodhouse  wrote:
> 
> > > On SkyLake this would add an overhead of maybe 2-3 cycles per function 
> > > call and 
> > > obviously all this code and data would be very cache hot. Given that the 
> > > average 
> > > number of function calls per system call is around a dozen, this would be 
> > > _much_ 
> > > faster than any microcode/MSR based approach.
> > 
> > That's kind of neat, except you don't want it at the top of the
> > function; you want it at the bottom.
> > 
> > If you could hijack the *return* site, then you could check for
> > underflow and stuff the RSB right there. But in __fentry__ there's not
> > a lot you can do other than complain that something bad is going to
> > happen in the future. You know that a string of 16+ rets is going to
> > happen, but you've got no gadget in *there* to deal with it when it
> > does.
> 
> No, it can be done with the existing CALL instrumentation callback that 
> CONFIG_DYNAMIC_FTRACE=y provides, by pushing a RET trampoline on the stack 
> from 
> the CALL trampoline - see my previous email.
> 
> > HJ did have patches to turn 'ret' into a form of retpoline, which I
> > don't think ever even got performance-tested.
> 
> Return instrumentation is possible as well, but there are two major drawbacks:
> 
>  - GCC support for it is not as widely available and return instrumentation 
> is 
>less tested in Linux kernel contexts
> 
>  - a major point of my suggestion is that CONFIG_DYNAMIC_FTRACE=y is already 
>enabled in distros here and today, so the runtime overhead to non-SkyLake 
> CPUs 
>would be literally zero, while still allowing to fix the RSB vulnerability 
> on 
>SkyLake.

I played around with that a bit during the week and it turns out to be less
simple than you thought.

1) Injecting a trampoline return only works for functions which have all
   arguments in registers. For functions with arguments on stack like all
   varg functions this breaks because the function wont find its arguments
   anymore.

   I have not yet found a way to figure out reliably which functions have
   arguments on stack. That might be an option to simply ignore them.

   The workaround is to replace the original return on stack with the
   trampoline and store the original return in a per thread stack, which I
   implemented. But this sucks performance wise badly.

2) Doing the whole dance on function entry has a real down side because you
   refill RSB on every 15th return no matter whether its required or
   not. That really gives a very prominent performance hit.

An alternative idea is to do the following (not yet implemented):

__fentry__:
inclPER_CPU_VAR(call_depth)
retq

and use -mfunction-return=thunk-extern which is available on retpoline
enabled compilers. That's a reasonable requirement because w/o retpoline
the whole SKL magic is pointless anyway.

-mfunction-return=thunk-extern issues

jump__x86_return_thunk

instead of ret. In the thunk we can do the whole shebang of mitigation.
That jump can be identified at build time and it can be patched into a ret
for unaffected CPUs. Ideally we do the patching at build time and only
patch the jump in when SKL is detected or paranoia requests it.

We could actually look into that for tracing as well. The only reason why
we don't do that is to select the ideal nop for the CPU the kernel runs on,
which obviously cannot be known at build time.

__x86_return_thunk would look like this:

__x86_return_thunk:
testl   $0xf, PER_CPU_VAR(call_depth)
jnz 1f  
stuff_rsb
   1:
declPER_CPU_VAR(call_depth)
ret

The call_depth variable would be reset on context switch.

Though that has another problem: tail calls. Tail calls will invoke the
__fentry__ call of the tail called function, which makes the call_depth
counter unbalanced. Tail calls can be prevented by using
-fno-optimize-sibling-calls, but that probably sucks as well.

Yet another possibility is to avoid the function entry and accouting magic
and use the generic gcc return thunk:

__x86_return_thunk:
call L2
L1:
pause
lfence
jmp L1
L2:
lea 8(%rsp), %rsp|lea 4(%esp), %esp
ret

which basically refills the RSB on every return. That can be inline or
extern, but in both cases we should be able to patch it out.

I have no idea how that affects performance, but it might be worthwhile to
experiment with that.

If nobody beats me to it, I'll play around with that some more after
vacation.

Thanks,

tglx

Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-29 Thread Mason
[ Dropping large CC list ]

On 25/01/2018 18:16, Greg Kroah-Hartman wrote:

> On Thu, Jan 25, 2018 at 05:19:04PM +0100, Mason wrote:
> 
>> On 23/01/2018 10:30, David Woodhouse wrote:
>>
>>> Skylake takes predictions from the generic branch target buffer when
>>> the RSB underflows.
>>
>> Adding LAKML.
>>
>> AFAIU, some ARM Cortex cores have the same optimization.
>> (A9 maybe, A17 probably, some recent 64-bit cores)
>>
>> Are there software work-arounds for Spectre planned for arm32 and arm64?
> 
> Yes, I think they are currently buried in one of the arm64 trees, and
> they have been posted to the mailing list a few times in the past.

Found the burial ground, thanks Greg.

  https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git/log/?h=kpti

Via https://developer.arm.com/support/security-update

"For Cortex-R8, Cortex-A8, Cortex-A9, and Cortex-A17, invalidate
the branch predictor using a BPIALL instruction."

The latest arm32 patch series was submitted recently:

  https://www.spinics.net/lists/arm-kernel/msg630892.html

Regards.


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-29 Thread Mason
[ Dropping large CC list ]

On 25/01/2018 18:16, Greg Kroah-Hartman wrote:

> On Thu, Jan 25, 2018 at 05:19:04PM +0100, Mason wrote:
> 
>> On 23/01/2018 10:30, David Woodhouse wrote:
>>
>>> Skylake takes predictions from the generic branch target buffer when
>>> the RSB underflows.
>>
>> Adding LAKML.
>>
>> AFAIU, some ARM Cortex cores have the same optimization.
>> (A9 maybe, A17 probably, some recent 64-bit cores)
>>
>> Are there software work-arounds for Spectre planned for arm32 and arm64?
> 
> Yes, I think they are currently buried in one of the arm64 trees, and
> they have been posted to the mailing list a few times in the past.

Found the burial ground, thanks Greg.

  https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git/log/?h=kpti

Via https://developer.arm.com/support/security-update

"For Cortex-R8, Cortex-A8, Cortex-A9, and Cortex-A17, invalidate
the branch predictor using a BPIALL instruction."

The latest arm32 patch series was submitted recently:

  https://www.spinics.net/lists/arm-kernel/msg630892.html

Regards.


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-27 Thread Dave Hansen
On 01/27/2018 05:42 AM, Konrad Rzeszutek Wilk wrote:
> On Fri, Jan 26, 2018 at 07:11:47PM +, Hansen, Dave wrote:
>> The need for RSB stuffing in all the various scenarios and what the heck it 
>> actually mitigates is freakishly complicated.  I've tried to write it all 
>> down in one place: https://goo.gl/pXbvBE
> Thank you for sharing that.
> 
> One question on the third from the top (' RSB Stuff (16) After
> irq/nmi/#PF/...').
> 
> It says that :"Return from interrupt path (more than 16 deep) can empty
> RSB".
> 
> Just to clarify - you mean all the returns ('ret') that are happening after
> we call do_IRQ and the stack unwinds - but before we do an 'iret' correct?

Correct.  The RSB is not used or updated by iret.


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-27 Thread Dave Hansen
On 01/27/2018 05:42 AM, Konrad Rzeszutek Wilk wrote:
> On Fri, Jan 26, 2018 at 07:11:47PM +, Hansen, Dave wrote:
>> The need for RSB stuffing in all the various scenarios and what the heck it 
>> actually mitigates is freakishly complicated.  I've tried to write it all 
>> down in one place: https://goo.gl/pXbvBE
> Thank you for sharing that.
> 
> One question on the third from the top (' RSB Stuff (16) After
> irq/nmi/#PF/...').
> 
> It says that :"Return from interrupt path (more than 16 deep) can empty
> RSB".
> 
> Just to clarify - you mean all the returns ('ret') that are happening after
> we call do_IRQ and the stack unwinds - but before we do an 'iret' correct?

Correct.  The RSB is not used or updated by iret.


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-27 Thread Konrad Rzeszutek Wilk
On Fri, Jan 26, 2018 at 07:11:47PM +, Hansen, Dave wrote:
> The need for RSB stuffing in all the various scenarios and what the heck it 
> actually mitigates is freakishly complicated.  I've tried to write it all 
> down in one place: https://goo.gl/pXbvBE

Thank you for sharing that.

One question on the third from the top (' RSB Stuff (16) After
irq/nmi/#PF/...').

It says that :"Return from interrupt path (more than 16 deep) can empty
RSB".

Just to clarify - you mean all the returns ('ret') that are happening after
we call do_IRQ and the stack unwinds - but before we do an 'iret' correct?

I am 99% sure that is what you mean, but just confirming as one could read
this as: 'Need to do RSB after an iret' (say you are in the kernel
and then get an interrupt and iret back to kernel).


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-27 Thread Konrad Rzeszutek Wilk
On Fri, Jan 26, 2018 at 07:11:47PM +, Hansen, Dave wrote:
> The need for RSB stuffing in all the various scenarios and what the heck it 
> actually mitigates is freakishly complicated.  I've tried to write it all 
> down in one place: https://goo.gl/pXbvBE

Thank you for sharing that.

One question on the third from the top (' RSB Stuff (16) After
irq/nmi/#PF/...').

It says that :"Return from interrupt path (more than 16 deep) can empty
RSB".

Just to clarify - you mean all the returns ('ret') that are happening after
we call do_IRQ and the stack unwinds - but before we do an 'iret' correct?

I am 99% sure that is what you mean, but just confirming as one could read
this as: 'Need to do RSB after an iret' (say you are in the kernel
and then get an interrupt and iret back to kernel).


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-27 Thread Borislav Petkov
On Fri, Jan 26, 2018 at 09:19:09AM -0800, Linus Torvalds wrote:
> But did we do that "disable stuffing with SMEP"? I'm not seeing it. In
> my tree, it's only conditional on X86_FEATURE_RETPOLINE.

Or rather, enable stuffing on !SMEP:

+   if ((!boot_cpu_has(X86_FEATURE_PTI) &&
+!boot_cpu_has(X86_FEATURE_SMEP)) || is_skylake_era()) {
+   setup_force_cpu_cap(X86_FEATURE_RSB_CTXSW);
+   pr_info("Filling RSB on context switch\n");
+   }

Should be

c995efd5a740 ("x86/retpoline: Fill RSB on context switch for affected CPUs")

in your tree.

-- 
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 
(AG Nürnberg)
-- 


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-27 Thread Borislav Petkov
On Fri, Jan 26, 2018 at 09:19:09AM -0800, Linus Torvalds wrote:
> But did we do that "disable stuffing with SMEP"? I'm not seeing it. In
> my tree, it's only conditional on X86_FEATURE_RETPOLINE.

Or rather, enable stuffing on !SMEP:

+   if ((!boot_cpu_has(X86_FEATURE_PTI) &&
+!boot_cpu_has(X86_FEATURE_SMEP)) || is_skylake_era()) {
+   setup_force_cpu_cap(X86_FEATURE_RSB_CTXSW);
+   pr_info("Filling RSB on context switch\n");
+   }

Should be

c995efd5a740 ("x86/retpoline: Fill RSB on context switch for affected CPUs")

in your tree.

-- 
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 
(AG Nürnberg)
-- 


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-26 Thread David Woodhouse
On Fri, 2018-01-26 at 14:02 -0500, Konrad Rzeszutek Wilk wrote:
> 
> -ECONFUSED, see ==>
> 
> Is this incorrect then?
> I see:
> 
> 241  * Skylake era CPUs have a separate issue with *underflow* of the 
>   
> 242  * RSB, when they will predict 'ret' targets from the generic 
> BTB.  
> 243  * The proper mitigation for this is IBRS. If IBRS is not 
> supported 
> 244  * or deactivated in favour of retpolines the RSB fill on context 
>   
> 245  * switch is required.
>   
> 246  */   

No, that's correct (well, except that it's kind of written for a world
where Linus is going to let IBRS anywhere near his kernel, and could
survive being rephrased a little :)

The RSB-stuffing on context switch (or kernel entry) is one of a
*litany* of additional hacks we need on Skylake to make retpolines
safe.

We were adding the RSB-stuffing in this case *anyway* for !SMEP, so it
was trivial enough to add in the (|| Skylake) condition while we were
at it.


smime.p7s
Description: S/MIME cryptographic signature


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-26 Thread David Woodhouse
On Fri, 2018-01-26 at 14:02 -0500, Konrad Rzeszutek Wilk wrote:
> 
> -ECONFUSED, see ==>
> 
> Is this incorrect then?
> I see:
> 
> 241  * Skylake era CPUs have a separate issue with *underflow* of the 
>   
> 242  * RSB, when they will predict 'ret' targets from the generic 
> BTB.  
> 243  * The proper mitigation for this is IBRS. If IBRS is not 
> supported 
> 244  * or deactivated in favour of retpolines the RSB fill on context 
>   
> 245  * switch is required.
>   
> 246  */   

No, that's correct (well, except that it's kind of written for a world
where Linus is going to let IBRS anywhere near his kernel, and could
survive being rephrased a little :)

The RSB-stuffing on context switch (or kernel entry) is one of a
*litany* of additional hacks we need on Skylake to make retpolines
safe.

We were adding the RSB-stuffing in this case *anyway* for !SMEP, so it
was trivial enough to add in the (|| Skylake) condition while we were
at it.


smime.p7s
Description: S/MIME cryptographic signature


RE: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-26 Thread Hansen, Dave
The need for RSB stuffing in all the various scenarios and what the heck it 
actually mitigates is freakishly complicated.  I've tried to write it all down 
in one place: https://goo.gl/pXbvBE


RE: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-26 Thread Hansen, Dave
The need for RSB stuffing in all the various scenarios and what the heck it 
actually mitigates is freakishly complicated.  I've tried to write it all down 
in one place: https://goo.gl/pXbvBE


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-26 Thread Konrad Rzeszutek Wilk
On Fri, Jan 26, 2018 at 09:59:01AM -0800, Andi Kleen wrote:
> On Fri, Jan 26, 2018 at 09:19:09AM -0800, Linus Torvalds wrote:
> > On Fri, Jan 26, 2018 at 1:11 AM, David Woodhouse  
> > wrote:
> > >
> > > Do we need to look again at the fact that we've disabled the RSB-
> > > stuffing for SMEP?
> > 
> > Absolutely. SMEP helps make people a lot less worried about things,
> > but it doesn't fix the "BTB only contains partial addresses" case.
> > 
> > But did we do that "disable stuffing with SMEP"? I'm not seeing it. In
> > my tree, it's only conditional on X86_FEATURE_RETPOLINE.
> 
> For Skylake we need RSB stuffing even with SMEP to avoid falling back to the
> BTB on underflow.
> 
> It's also always needed with virtualization.

-ECONFUSED, see ==>

Is this incorrect then?
I see:

241  * Skylake era CPUs have a separate issue with *underflow* of the   

242  * RSB, when they will predict 'ret' targets from the generic BTB.  

243  * The proper mitigation for this is IBRS. If IBRS is not supported 

244  * or deactivated in favour of retpolines the RSB fill on context   

245  * switch is required.  

246  */

which came from this:

commit c995efd5a740d9cbafbf58bde4973e8b50b4d761
Author: David Woodhouse 
Date:   Fri Jan 12 17:49:25 2018 +

x86/retpoline: Fill RSB on context switch for affected CPUs

On context switch from a shallow call stack to a deeper one, as the CPU
does 'ret' up the deeper side it may encounter RSB entries (predictions for
where the 'ret' goes to) which were populated in userspace.

This is problematic if neither SMEP nor KPTI (the latter of which marks
userspace pages as NX for the kernel) are active, as malicious code in
userspace may then be executed speculatively.

Overwrite the CPU's return prediction stack with calls which are predicted
to return to an infinite loop, to "capture" speculation if this
happens. This is required both for retpoline, and also in conjunction with
IBRS for !SMEP && !KPTI.

On Skylake+ the problem is slightly different, and an *underflow* of the
RSB may cause errant branch predictions to occur. So there it's not so much
overwrite, as *filling* the RSB to attempt to prevent it getting
empty. This is only a partial solution for Skylake+ since there are many
==>other conditions which may result in the RSB becoming empty. The full
<==
==>solution on Skylake+ is to use IBRS, which will prevent the problem even 
<==
when the RSB becomes empty. With IBRS, the RSB-stuffing will not be
required on context switch.


Signed-off-by: David Woodhouse 
Signed-off-by: Thomas Gleixner 
Acked-by: Arjan van de Ven 


The "full solution" is what is making me confused.
> 
> -Andi


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-26 Thread Konrad Rzeszutek Wilk
On Fri, Jan 26, 2018 at 09:59:01AM -0800, Andi Kleen wrote:
> On Fri, Jan 26, 2018 at 09:19:09AM -0800, Linus Torvalds wrote:
> > On Fri, Jan 26, 2018 at 1:11 AM, David Woodhouse  
> > wrote:
> > >
> > > Do we need to look again at the fact that we've disabled the RSB-
> > > stuffing for SMEP?
> > 
> > Absolutely. SMEP helps make people a lot less worried about things,
> > but it doesn't fix the "BTB only contains partial addresses" case.
> > 
> > But did we do that "disable stuffing with SMEP"? I'm not seeing it. In
> > my tree, it's only conditional on X86_FEATURE_RETPOLINE.
> 
> For Skylake we need RSB stuffing even with SMEP to avoid falling back to the
> BTB on underflow.
> 
> It's also always needed with virtualization.

-ECONFUSED, see ==>

Is this incorrect then?
I see:

241  * Skylake era CPUs have a separate issue with *underflow* of the   

242  * RSB, when they will predict 'ret' targets from the generic BTB.  

243  * The proper mitigation for this is IBRS. If IBRS is not supported 

244  * or deactivated in favour of retpolines the RSB fill on context   

245  * switch is required.  

246  */

which came from this:

commit c995efd5a740d9cbafbf58bde4973e8b50b4d761
Author: David Woodhouse 
Date:   Fri Jan 12 17:49:25 2018 +

x86/retpoline: Fill RSB on context switch for affected CPUs

On context switch from a shallow call stack to a deeper one, as the CPU
does 'ret' up the deeper side it may encounter RSB entries (predictions for
where the 'ret' goes to) which were populated in userspace.

This is problematic if neither SMEP nor KPTI (the latter of which marks
userspace pages as NX for the kernel) are active, as malicious code in
userspace may then be executed speculatively.

Overwrite the CPU's return prediction stack with calls which are predicted
to return to an infinite loop, to "capture" speculation if this
happens. This is required both for retpoline, and also in conjunction with
IBRS for !SMEP && !KPTI.

On Skylake+ the problem is slightly different, and an *underflow* of the
RSB may cause errant branch predictions to occur. So there it's not so much
overwrite, as *filling* the RSB to attempt to prevent it getting
empty. This is only a partial solution for Skylake+ since there are many
==>other conditions which may result in the RSB becoming empty. The full
<==
==>solution on Skylake+ is to use IBRS, which will prevent the problem even 
<==
when the RSB becomes empty. With IBRS, the RSB-stuffing will not be
required on context switch.


Signed-off-by: David Woodhouse 
Signed-off-by: Thomas Gleixner 
Acked-by: Arjan van de Ven 


The "full solution" is what is making me confused.
> 
> -Andi


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-26 Thread David Woodhouse
On Fri, 2018-01-26 at 18:44 +, Van De Ven, Arjan wrote:
> your question was specific to RSB not BTB. But please show the empirical 
> evidence for RSB ?

We were hypothesising, which should have been clear from:

On Fri, 2018-01-26 at 09:11 +, David Woodhouse wrote:
> Likewise if the RSB only stores the low 31 bits of the target, SMEP
> isn't much help there either.
> 
> Do we need to look again at the fact that we've disabled the RSB-
> stuffing for SMEP?

... and later... 

On Fri, 2018-01-26 at 17:31 +, David Woodhouse wrote:
> Note, we've switched from talking about BTB to RSB here, so this is a
> valid concern if the *RSB* only has the low bits of the target.

I'm glad to hear that it *isn't* a valid concern for the RSB and the
code in Linus' tree is correct.

Thank you for clearing that up.


smime.p7s
Description: S/MIME cryptographic signature


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-26 Thread David Woodhouse
On Fri, 2018-01-26 at 18:44 +, Van De Ven, Arjan wrote:
> your question was specific to RSB not BTB. But please show the empirical 
> evidence for RSB ?

We were hypothesising, which should have been clear from:

On Fri, 2018-01-26 at 09:11 +, David Woodhouse wrote:
> Likewise if the RSB only stores the low 31 bits of the target, SMEP
> isn't much help there either.
> 
> Do we need to look again at the fact that we've disabled the RSB-
> stuffing for SMEP?

... and later... 

On Fri, 2018-01-26 at 17:31 +, David Woodhouse wrote:
> Note, we've switched from talking about BTB to RSB here, so this is a
> valid concern if the *RSB* only has the low bits of the target.

I'm glad to hear that it *isn't* a valid concern for the RSB and the
code in Linus' tree is correct.

Thank you for clearing that up.


smime.p7s
Description: S/MIME cryptographic signature


RE: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-26 Thread Van De Ven, Arjan
> > you asked before and even before you sent the email I confirmed to
> > you that the document is correct
> >
> > I'm not sure what the point is to then question that again 15 minutes
> > later other than creating more noise.
> 
> Apologies, I hadn't seen the comment on IRC.
> 
> Sometimes the docs *don't* get it right, especially when they're
> released in a hurry as that one was. I note there's a *fourth* version
> of microcode-update-guidance.pdf available now, for example :)
> 
> So it is useful that you have explicitly stated that for *this*
> specific concern, the document is in fact correct that SMEP saves us
> from BTB and RSB pollution, *despite* the empirical evidence that those
> structures only hold the low 31 bits.

your question was specific to RSB not BTB. But please show the empirical 
evidence for RSB ?



RE: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-26 Thread Van De Ven, Arjan
> > you asked before and even before you sent the email I confirmed to
> > you that the document is correct
> >
> > I'm not sure what the point is to then question that again 15 minutes
> > later other than creating more noise.
> 
> Apologies, I hadn't seen the comment on IRC.
> 
> Sometimes the docs *don't* get it right, especially when they're
> released in a hurry as that one was. I note there's a *fourth* version
> of microcode-update-guidance.pdf available now, for example :)
> 
> So it is useful that you have explicitly stated that for *this*
> specific concern, the document is in fact correct that SMEP saves us
> from BTB and RSB pollution, *despite* the empirical evidence that those
> structures only hold the low 31 bits.

your question was specific to RSB not BTB. But please show the empirical 
evidence for RSB ?



Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-26 Thread David Woodhouse
On Fri, 2018-01-26 at 18:28 +, Van De Ven, Arjan wrote:
> > As you know well, I mean "we think Intel's document is not
> > correct".
>
> you asked before and even before you sent the email I confirmed to
> you that the document is correct
> 
> I'm not sure what the point is to then question that again 15 minutes
> later other than creating more noise.

Apologies, I hadn't seen the comment on IRC.

Sometimes the docs *don't* get it right, especially when they're
released in a hurry as that one was. I note there's a *fourth* version
of microcode-update-guidance.pdf available now, for example :)

So it is useful that you have explicitly stated that for *this*
specific concern, the document is in fact correct that SMEP saves us
from BTB and RSB pollution, *despite* the empirical evidence that those
structures only hold the low 31 bits.

I'm going to get back to other things now, although I'm sure others may
be very interested to reconcile the empirical evidence with what you
say, and want to know *how* that can be the case. Which I'm sure you
won't be able to say in public anyway.

smime.p7s
Description: S/MIME cryptographic signature


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-26 Thread David Woodhouse
On Fri, 2018-01-26 at 18:28 +, Van De Ven, Arjan wrote:
> > As you know well, I mean "we think Intel's document is not
> > correct".
>
> you asked before and even before you sent the email I confirmed to
> you that the document is correct
> 
> I'm not sure what the point is to then question that again 15 minutes
> later other than creating more noise.

Apologies, I hadn't seen the comment on IRC.

Sometimes the docs *don't* get it right, especially when they're
released in a hurry as that one was. I note there's a *fourth* version
of microcode-update-guidance.pdf available now, for example :)

So it is useful that you have explicitly stated that for *this*
specific concern, the document is in fact correct that SMEP saves us
from BTB and RSB pollution, *despite* the empirical evidence that those
structures only hold the low 31 bits.

I'm going to get back to other things now, although I'm sure others may
be very interested to reconcile the empirical evidence with what you
say, and want to know *how* that can be the case. Which I'm sure you
won't be able to say in public anyway.

smime.p7s
Description: S/MIME cryptographic signature


RE: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-26 Thread Van De Ven, Arjan
> On Fri, 2018-01-26 at 10:12 -0800, Arjan van de Ven wrote:
> > On 1/26/2018 10:11 AM, David Woodhouse wrote:
> > >
> > > I am *actively* ignoring Skylake right now. This is about per-SKL
> > > userspace even with SMEP, because we think Intel's document lies to us.
> >
> > if you think we lie to you then I think we're done with the conversation?
> >
> > Please tell us then what you deploy in AWS for your customers ?
> >
> > or show us research that shows we lied to you?
> 
> As you know well, I mean "we think Intel's document is not correct".

you asked before and even before you sent the email I confirmed to you that the 
document is correct

I'm not sure what the point is to then question that again 15 minutes later 
other than creating more noise.




RE: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-26 Thread Van De Ven, Arjan
> On Fri, 2018-01-26 at 10:12 -0800, Arjan van de Ven wrote:
> > On 1/26/2018 10:11 AM, David Woodhouse wrote:
> > >
> > > I am *actively* ignoring Skylake right now. This is about per-SKL
> > > userspace even with SMEP, because we think Intel's document lies to us.
> >
> > if you think we lie to you then I think we're done with the conversation?
> >
> > Please tell us then what you deploy in AWS for your customers ?
> >
> > or show us research that shows we lied to you?
> 
> As you know well, I mean "we think Intel's document is not correct".

you asked before and even before you sent the email I confirmed to you that the 
document is correct

I'm not sure what the point is to then question that again 15 minutes later 
other than creating more noise.




Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-26 Thread David Woodhouse
On Fri, 2018-01-26 at 10:12 -0800, Arjan van de Ven wrote:
> On 1/26/2018 10:11 AM, David Woodhouse wrote:
> > 
> > I am *actively* ignoring Skylake right now. This is about per-SKL
> > userspace even with SMEP, because we think Intel's document lies to us.
> 
> if you think we lie to you then I think we're done with the conversation?
> 
> Please tell us then what you deploy in AWS for your customers ?
> 
> or show us research that shows we lied to you?

As you know well, I mean "we think Intel's document is not correct". 

The evidence which made us suspect that is fairly clear in the last few
emails in this thread — it's about the BTB/RSB only having the low bits
of the target, which would mean that userspace *can* put malicious
targets into the RSB, regardless of SMEP.



smime.p7s
Description: S/MIME cryptographic signature


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-26 Thread David Woodhouse
On Fri, 2018-01-26 at 10:12 -0800, Arjan van de Ven wrote:
> On 1/26/2018 10:11 AM, David Woodhouse wrote:
> > 
> > I am *actively* ignoring Skylake right now. This is about per-SKL
> > userspace even with SMEP, because we think Intel's document lies to us.
> 
> if you think we lie to you then I think we're done with the conversation?
> 
> Please tell us then what you deploy in AWS for your customers ?
> 
> or show us research that shows we lied to you?

As you know well, I mean "we think Intel's document is not correct". 

The evidence which made us suspect that is fairly clear in the last few
emails in this thread — it's about the BTB/RSB only having the low bits
of the target, which would mean that userspace *can* put malicious
targets into the RSB, regardless of SMEP.



smime.p7s
Description: S/MIME cryptographic signature


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-26 Thread David Woodhouse
On Fri, 2018-01-26 at 09:59 -0800, Andi Kleen wrote:
> On Fri, Jan 26, 2018 at 09:19:09AM -0800, Linus Torvalds wrote:
> > 
> > On Fri, Jan 26, 2018 at 1:11 AM, David Woodhouse  wrote:
> > > 
> > > 
> > > Do we need to look again at the fact that we've disabled the RSB-
> > > stuffing for SMEP?
> > Absolutely. SMEP helps make people a lot less worried about things,
> > but it doesn't fix the "BTB only contains partial addresses" case.
> > 
> > But did we do that "disable stuffing with SMEP"? I'm not seeing it. In
> > my tree, it's only conditional on X86_FEATURE_RETPOLINE.
>
> For Skylake we need RSB stuffing even with SMEP to avoid falling back to the
> BTB on underflow.

I am *actively* ignoring Skylake right now. This is about per-SKL
userspace even with SMEP, because we think Intel's document lies to us.

If the RSB only holds the low bits of the target, then a userspace
attacker can populate an RSB entry which points to a kernel gadget of
her choice, even with SMEP or KPTI enabled.


smime.p7s
Description: S/MIME cryptographic signature


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-26 Thread David Woodhouse
On Fri, 2018-01-26 at 09:59 -0800, Andi Kleen wrote:
> On Fri, Jan 26, 2018 at 09:19:09AM -0800, Linus Torvalds wrote:
> > 
> > On Fri, Jan 26, 2018 at 1:11 AM, David Woodhouse  wrote:
> > > 
> > > 
> > > Do we need to look again at the fact that we've disabled the RSB-
> > > stuffing for SMEP?
> > Absolutely. SMEP helps make people a lot less worried about things,
> > but it doesn't fix the "BTB only contains partial addresses" case.
> > 
> > But did we do that "disable stuffing with SMEP"? I'm not seeing it. In
> > my tree, it's only conditional on X86_FEATURE_RETPOLINE.
>
> For Skylake we need RSB stuffing even with SMEP to avoid falling back to the
> BTB on underflow.

I am *actively* ignoring Skylake right now. This is about per-SKL
userspace even with SMEP, because we think Intel's document lies to us.

If the RSB only holds the low bits of the target, then a userspace
attacker can populate an RSB entry which points to a kernel gadget of
her choice, even with SMEP or KPTI enabled.


smime.p7s
Description: S/MIME cryptographic signature


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-26 Thread Arjan van de Ven

On 1/26/2018 10:11 AM, David Woodhouse wrote:


I am *actively* ignoring Skylake right now. This is about per-SKL
userspace even with SMEP, because we think Intel's document lies to us.


if you think we lie to you then I think we're done with the conversation?

Please tell us then what you deploy in AWS for your customers ?

or show us research that shows we lied to you?


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-26 Thread Arjan van de Ven

On 1/26/2018 10:11 AM, David Woodhouse wrote:


I am *actively* ignoring Skylake right now. This is about per-SKL
userspace even with SMEP, because we think Intel's document lies to us.


if you think we lie to you then I think we're done with the conversation?

Please tell us then what you deploy in AWS for your customers ?

or show us research that shows we lied to you?


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-26 Thread Andi Kleen
On Fri, Jan 26, 2018 at 09:19:09AM -0800, Linus Torvalds wrote:
> On Fri, Jan 26, 2018 at 1:11 AM, David Woodhouse  wrote:
> >
> > Do we need to look again at the fact that we've disabled the RSB-
> > stuffing for SMEP?
> 
> Absolutely. SMEP helps make people a lot less worried about things,
> but it doesn't fix the "BTB only contains partial addresses" case.
> 
> But did we do that "disable stuffing with SMEP"? I'm not seeing it. In
> my tree, it's only conditional on X86_FEATURE_RETPOLINE.

For Skylake we need RSB stuffing even with SMEP to avoid falling back to the
BTB on underflow.

It's also always needed with virtualization.

-Andi


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-26 Thread Andi Kleen
On Fri, Jan 26, 2018 at 09:19:09AM -0800, Linus Torvalds wrote:
> On Fri, Jan 26, 2018 at 1:11 AM, David Woodhouse  wrote:
> >
> > Do we need to look again at the fact that we've disabled the RSB-
> > stuffing for SMEP?
> 
> Absolutely. SMEP helps make people a lot less worried about things,
> but it doesn't fix the "BTB only contains partial addresses" case.
> 
> But did we do that "disable stuffing with SMEP"? I'm not seeing it. In
> my tree, it's only conditional on X86_FEATURE_RETPOLINE.

For Skylake we need RSB stuffing even with SMEP to avoid falling back to the
BTB on underflow.

It's also always needed with virtualization.

-Andi


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-26 Thread David Woodhouse
On Fri, 2018-01-26 at 17:29 +, David Woodhouse wrote:
> On Fri, 2018-01-26 at 09:19 -0800, Linus Torvalds wrote:
> > On Fri, Jan 26, 2018 at 1:11 AM, David Woodhouse  
> > wrote:
> > > Do we need to look again at the fact that we've disabled the RSB-
> > > stuffing for SMEP?
> >
> > Absolutely. SMEP helps make people a lot less worried about things,
> > but it doesn't fix the "BTB only contains partial addresses" case.
> > 
> > But did we do that "disable stuffing with SMEP"? I'm not seeing it. In
> > my tree, it's only conditional on X86_FEATURE_RETPOLINE.
>
> That's the vmexit one. The one on context switch is in
> commit c995efd5a7 and has its own X86_FEATURE_RSB_CTXSW which in
> kernel/cpu/bugs.c is turned on for (!SMEP || Skylake).
> 
> The "low bits of the BTB" issue probably means that wants to be
> X86_FEATURE_RETPOLINE too. Despite Intel's doc saying otherwise.
> 
> (Intel's doc also says to do it on kernel entry, but we elected to do
> it on context switch instead since *that's* when the imbalances show up
> in the RSB.)

Note, we've switched from talking about BTB to RSB here, so this is a
valid concern if the *RSB* only has the low bits of the target.

smime.p7s
Description: S/MIME cryptographic signature


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-26 Thread David Woodhouse
On Fri, 2018-01-26 at 17:29 +, David Woodhouse wrote:
> On Fri, 2018-01-26 at 09:19 -0800, Linus Torvalds wrote:
> > On Fri, Jan 26, 2018 at 1:11 AM, David Woodhouse  
> > wrote:
> > > Do we need to look again at the fact that we've disabled the RSB-
> > > stuffing for SMEP?
> >
> > Absolutely. SMEP helps make people a lot less worried about things,
> > but it doesn't fix the "BTB only contains partial addresses" case.
> > 
> > But did we do that "disable stuffing with SMEP"? I'm not seeing it. In
> > my tree, it's only conditional on X86_FEATURE_RETPOLINE.
>
> That's the vmexit one. The one on context switch is in
> commit c995efd5a7 and has its own X86_FEATURE_RSB_CTXSW which in
> kernel/cpu/bugs.c is turned on for (!SMEP || Skylake).
> 
> The "low bits of the BTB" issue probably means that wants to be
> X86_FEATURE_RETPOLINE too. Despite Intel's doc saying otherwise.
> 
> (Intel's doc also says to do it on kernel entry, but we elected to do
> it on context switch instead since *that's* when the imbalances show up
> in the RSB.)

Note, we've switched from talking about BTB to RSB here, so this is a
valid concern if the *RSB* only has the low bits of the target.

smime.p7s
Description: S/MIME cryptographic signature


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-26 Thread David Woodhouse
On Fri, 2018-01-26 at 09:19 -0800, Linus Torvalds wrote:
> On Fri, Jan 26, 2018 at 1:11 AM, David Woodhouse  wrote:
> > 
> > 
> > Do we need to look again at the fact that we've disabled the RSB-
> > stuffing for SMEP?
> Absolutely. SMEP helps make people a lot less worried about things,
> but it doesn't fix the "BTB only contains partial addresses" case.
> 
> But did we do that "disable stuffing with SMEP"? I'm not seeing it. In
> my tree, it's only conditional on X86_FEATURE_RETPOLINE.

That's the vmexit one. The one on context switch is in
commit c995efd5a7 and has its own X86_FEATURE_RSB_CTXSW which in
kernel/cpu/bugs.c is turned on for (!SMEP || Skylake).

The "low bits of the BTB" issue probably means that wants to be
X86_FEATURE_RETPOLINE too. Despite Intel's doc saying otherwise.

(Intel's doc also says to do it on kernel entry, but we elected to do
it on context switch instead since *that's* when the imbalances show up
in the RSB.)

smime.p7s
Description: S/MIME cryptographic signature


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-26 Thread David Woodhouse
On Fri, 2018-01-26 at 09:19 -0800, Linus Torvalds wrote:
> On Fri, Jan 26, 2018 at 1:11 AM, David Woodhouse  wrote:
> > 
> > 
> > Do we need to look again at the fact that we've disabled the RSB-
> > stuffing for SMEP?
> Absolutely. SMEP helps make people a lot less worried about things,
> but it doesn't fix the "BTB only contains partial addresses" case.
> 
> But did we do that "disable stuffing with SMEP"? I'm not seeing it. In
> my tree, it's only conditional on X86_FEATURE_RETPOLINE.

That's the vmexit one. The one on context switch is in
commit c995efd5a7 and has its own X86_FEATURE_RSB_CTXSW which in
kernel/cpu/bugs.c is turned on for (!SMEP || Skylake).

The "low bits of the BTB" issue probably means that wants to be
X86_FEATURE_RETPOLINE too. Despite Intel's doc saying otherwise.

(Intel's doc also says to do it on kernel entry, but we elected to do
it on context switch instead since *that's* when the imbalances show up
in the RSB.)

smime.p7s
Description: S/MIME cryptographic signature


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-26 Thread Linus Torvalds
On Fri, Jan 26, 2018 at 1:11 AM, David Woodhouse  wrote:
>
> Do we need to look again at the fact that we've disabled the RSB-
> stuffing for SMEP?

Absolutely. SMEP helps make people a lot less worried about things,
but it doesn't fix the "BTB only contains partial addresses" case.

But did we do that "disable stuffing with SMEP"? I'm not seeing it. In
my tree, it's only conditional on X86_FEATURE_RETPOLINE.

   Linus


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-26 Thread Linus Torvalds
On Fri, Jan 26, 2018 at 1:11 AM, David Woodhouse  wrote:
>
> Do we need to look again at the fact that we've disabled the RSB-
> stuffing for SMEP?

Absolutely. SMEP helps make people a lot less worried about things,
but it doesn't fix the "BTB only contains partial addresses" case.

But did we do that "disable stuffing with SMEP"? I'm not seeing it. In
my tree, it's only conditional on X86_FEATURE_RETPOLINE.

   Linus


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-26 Thread David Woodhouse
On Thu, 2018-01-25 at 18:23 -0800, Dave Hansen wrote:
> On 01/25/2018 06:11 PM, Liran Alon wrote:
> > 
> > It is true that attacker cannot speculate to a kernel-address, but it
> > doesn't mean it cannot use the leaked kernel-address together with
> > another unrelated vulnerability to build a reliable exploit.
>
> The address doesn't leak if you can't execute there.  It's the same
> reason that we don't worry about speculation to user addresses from the
> kernel when SMEP is in play.

If both tags and target in the BTB are only 31 bits, then surely a
user-learned prediction of a branch from

  0x01234567 → 0x07654321

would be equivalent to a kernel-mode branch from

 0x81234567 → 0x87654321

... and interpreted in kernel mode as the latter? So I'm not sure why
SMEP saves us there?

Likewise if the RSB only stores the low 31 bits of the target, SMEP
isn't much help there either.

Do we need to look again at the fact that we've disabled the RSB-
stuffing for SMEP?

smime.p7s
Description: S/MIME cryptographic signature


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-26 Thread David Woodhouse
On Thu, 2018-01-25 at 18:23 -0800, Dave Hansen wrote:
> On 01/25/2018 06:11 PM, Liran Alon wrote:
> > 
> > It is true that attacker cannot speculate to a kernel-address, but it
> > doesn't mean it cannot use the leaked kernel-address together with
> > another unrelated vulnerability to build a reliable exploit.
>
> The address doesn't leak if you can't execute there.  It's the same
> reason that we don't worry about speculation to user addresses from the
> kernel when SMEP is in play.

If both tags and target in the BTB are only 31 bits, then surely a
user-learned prediction of a branch from

  0x01234567 → 0x07654321

would be equivalent to a kernel-mode branch from

 0x81234567 → 0x87654321

... and interpreted in kernel mode as the latter? So I'm not sure why
SMEP saves us there?

Likewise if the RSB only stores the low 31 bits of the target, SMEP
isn't much help there either.

Do we need to look again at the fact that we've disabled the RSB-
stuffing for SMEP?

smime.p7s
Description: S/MIME cryptographic signature


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-26 Thread David Woodhouse
On Thu, 2018-01-25 at 18:11 -0800, Liran Alon wrote:
> 
> P.S:
> It seems to me that all these issues could be resolved completely at
> hardware in future CPUs if BTB/BHB/RSB entries were tagged with
> prediction-mode (or similar metadata). It will be nice if Intel/AMD
> could share if that is the planned long-term solution instead of
> IBRS-all-the-time.

IBRS-all-the-time is tagging with the ring and VMX root/non-root mode,
it seems. That much they could slip into the upcoming generation of
CPUs. And it's supposed to be fast¹; none of the dirty hacks in
microcode that they needed to implement the first-generation IBRS.

But we still need to tag with ASID/VMID and do proper flushing for
those, before we can completely ditch the need to do IBPB at the right
times.

Reading between the lines, I don't think they could add *that* without
stopping the fabs for a year or so while they go back to the drawing
board. But yes, I sincerely hope they *are* planning to do it, and
expose a 'SPECTRE_NO' bit in IA32_ARCH_CAPABILITIES, as soon as is
humanly possible.



¹ Fast enough that we'll want to use it and ALTERNATIVE out the 
  retpolines.

smime.p7s
Description: S/MIME cryptographic signature


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-26 Thread David Woodhouse
On Thu, 2018-01-25 at 18:11 -0800, Liran Alon wrote:
> 
> P.S:
> It seems to me that all these issues could be resolved completely at
> hardware in future CPUs if BTB/BHB/RSB entries were tagged with
> prediction-mode (or similar metadata). It will be nice if Intel/AMD
> could share if that is the planned long-term solution instead of
> IBRS-all-the-time.

IBRS-all-the-time is tagging with the ring and VMX root/non-root mode,
it seems. That much they could slip into the upcoming generation of
CPUs. And it's supposed to be fast¹; none of the dirty hacks in
microcode that they needed to implement the first-generation IBRS.

But we still need to tag with ASID/VMID and do proper flushing for
those, before we can completely ditch the need to do IBPB at the right
times.

Reading between the lines, I don't think they could add *that* without
stopping the fabs for a year or so while they go back to the drawing
board. But yes, I sincerely hope they *are* planning to do it, and
expose a 'SPECTRE_NO' bit in IA32_ARCH_CAPABILITIES, as soon as is
humanly possible.



¹ Fast enough that we'll want to use it and ALTERNATIVE out the 
  retpolines.

smime.p7s
Description: S/MIME cryptographic signature


RE: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-25 Thread Van De Ven, Arjan




> -Original Message-
> From: Liran Alon [mailto:liran.a...@oracle.com]
> Sent: Thursday, January 25, 2018 6:50 PM
> To: Hansen, Dave <dave.han...@intel.com>
> Cc: labb...@redhat.com; l...@kernel.org; janakarajan.natara...@amd.com;
> torva...@linux-foundation.org; b...@suse.de; Mallick, Asit K
> <asit.k.mall...@intel.com>; rkrc...@redhat.com; karah...@amazon.de;
> h...@zytor.com; mi...@redhat.com; Nakajima, Jun
> <jun.nakaj...@intel.com>; x...@kernel.org; Raj, Ashok <ashok@intel.com>;
> Van De Ven, Arjan <arjan.van.de@intel.com>; tim.c.c...@linux.intel.com;
> pbonz...@redhat.com; a...@linux.intel.com; linux-kernel@vger.kernel.org;
> dw...@infradead.org; pet...@infradead.org; t...@linutronix.de;
> gre...@linuxfoundation.org; mhira...@kernel.org; ar...@linux.intel.com;
> thomas.lenda...@amd.com; Williams, Dan J <dan.j.willi...@intel.com>;
> j...@8bytes.org; k...@vger.kernel.org; aarca...@redhat.com
> Subject: Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict 
> Indirect
> Branch Speculation
> 
> 
 
> Google P0 blog-post
> (https://googleprojectzero.blogspot.co.il/2018/01/reading-privileged-memory-
> with-side.html) claims that BTB & BHB only use <31 low bits of the address of
> the source instruction to lookup into the BTB. In addition, it claims that the
> higher bits of the predicated destination change together with the higher 
> bits of
> the source instruction.
> 
> Therefore, it should be possible to leak the low bits of high predicition-mode
> code BTB/BHB entries from low prediction-mode code. Because the predicted
> destination address will reside in user-space.
> 
> What am I missing?


I thought this email thread was about the RSB...



RE: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-25 Thread Van De Ven, Arjan




> -Original Message-
> From: Liran Alon [mailto:liran.a...@oracle.com]
> Sent: Thursday, January 25, 2018 6:50 PM
> To: Hansen, Dave 
> Cc: labb...@redhat.com; l...@kernel.org; janakarajan.natara...@amd.com;
> torva...@linux-foundation.org; b...@suse.de; Mallick, Asit K
> ; rkrc...@redhat.com; karah...@amazon.de;
> h...@zytor.com; mi...@redhat.com; Nakajima, Jun
> ; x...@kernel.org; Raj, Ashok ;
> Van De Ven, Arjan ; tim.c.c...@linux.intel.com;
> pbonz...@redhat.com; a...@linux.intel.com; linux-kernel@vger.kernel.org;
> dw...@infradead.org; pet...@infradead.org; t...@linutronix.de;
> gre...@linuxfoundation.org; mhira...@kernel.org; ar...@linux.intel.com;
> thomas.lenda...@amd.com; Williams, Dan J ;
> j...@8bytes.org; k...@vger.kernel.org; aarca...@redhat.com
> Subject: Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict 
> Indirect
> Branch Speculation
> 
> 
 
> Google P0 blog-post
> (https://googleprojectzero.blogspot.co.il/2018/01/reading-privileged-memory-
> with-side.html) claims that BTB & BHB only use <31 low bits of the address of
> the source instruction to lookup into the BTB. In addition, it claims that the
> higher bits of the predicated destination change together with the higher 
> bits of
> the source instruction.
> 
> Therefore, it should be possible to leak the low bits of high predicition-mode
> code BTB/BHB entries from low prediction-mode code. Because the predicted
> destination address will reside in user-space.
> 
> What am I missing?


I thought this email thread was about the RSB...



Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-25 Thread Liran Alon

- dave.han...@intel.com wrote:

> On 01/25/2018 06:11 PM, Liran Alon wrote:
> > It is true that attacker cannot speculate to a kernel-address, but
> it
> > doesn't mean it cannot use the leaked kernel-address together with
> > another unrelated vulnerability to build a reliable exploit.
> 
> The address doesn't leak if you can't execute there.  It's the same
> reason that we don't worry about speculation to user addresses from
> the
> kernel when SMEP is in play.

Maybe I misunderstand BTB & BHB internals. Will be glad if you could pinpoint 
my error.

Google P0 blog-post 
(https://googleprojectzero.blogspot.co.il/2018/01/reading-privileged-memory-with-side.html)
 claims that BTB & BHB only use <31 low bits of the address of the source 
instruction to lookup into the BTB. In addition, it claims that the higher bits 
of the predicated destination change together with the higher bits of the 
source instruction.

Therefore, it should be possible to leak the low bits of high predicition-mode 
code BTB/BHB entries from low prediction-mode code. Because the predicted 
destination address will reside in user-space.

What am I missing?

Thanks,
-Liran


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-25 Thread Liran Alon

- dave.han...@intel.com wrote:

> On 01/25/2018 06:11 PM, Liran Alon wrote:
> > It is true that attacker cannot speculate to a kernel-address, but
> it
> > doesn't mean it cannot use the leaked kernel-address together with
> > another unrelated vulnerability to build a reliable exploit.
> 
> The address doesn't leak if you can't execute there.  It's the same
> reason that we don't worry about speculation to user addresses from
> the
> kernel when SMEP is in play.

Maybe I misunderstand BTB & BHB internals. Will be glad if you could pinpoint 
my error.

Google P0 blog-post 
(https://googleprojectzero.blogspot.co.il/2018/01/reading-privileged-memory-with-side.html)
 claims that BTB & BHB only use <31 low bits of the address of the source 
instruction to lookup into the BTB. In addition, it claims that the higher bits 
of the predicated destination change together with the higher bits of the 
source instruction.

Therefore, it should be possible to leak the low bits of high predicition-mode 
code BTB/BHB entries from low prediction-mode code. Because the predicted 
destination address will reside in user-space.

What am I missing?

Thanks,
-Liran


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-25 Thread Dave Hansen
On 01/25/2018 06:11 PM, Liran Alon wrote:
> It is true that attacker cannot speculate to a kernel-address, but it
> doesn't mean it cannot use the leaked kernel-address together with
> another unrelated vulnerability to build a reliable exploit.

The address doesn't leak if you can't execute there.  It's the same
reason that we don't worry about speculation to user addresses from the
kernel when SMEP is in play.


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-25 Thread Dave Hansen
On 01/25/2018 06:11 PM, Liran Alon wrote:
> It is true that attacker cannot speculate to a kernel-address, but it
> doesn't mean it cannot use the leaked kernel-address together with
> another unrelated vulnerability to build a reliable exploit.

The address doesn't leak if you can't execute there.  It's the same
reason that we don't worry about speculation to user addresses from the
kernel when SMEP is in play.


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-25 Thread Liran Alon

- dave.han...@intel.com wrote:

> On 01/23/2018 03:13 AM, Liran Alon wrote:
> > Therefore, breaking KASLR. In order to handle this, every exit from
> > kernel-mode to user-mode should stuff RSB. In addition, this
> stuffing
> > of RSB may need to be done from a fixed address to avoid leaking
> the
> > address of the RSB stuffing itself.
> 
> With PTI alone in place, I don't see how userspace could do anything
> with this information.  Even if userspace started to speculate to a
> kernel address, there is nothing at the kernel address to execute: no
> TLB entry, no PTE to load, nothing.
> 
> You probably have a valid point about host->guest, though.

I see it differently.

It is true that attacker cannot speculate to a kernel-address, but it doesn't 
mean it cannot use the leaked kernel-address together with another unrelated 
vulnerability to build a reliable exploit.

Security is built in layers.
The purpose of KASLR is to break the reliablity of an exploit which relies on 
vulnerability primitives such as: memory-corruption of a kernel-address, hijack 
kernel control-flow to a kernel-address or even just read a kernel-address. In 
modern exploitation, it is common to chain multiple different vulnerabilities 
in order to build a reliable exploit. Therefore, leaking a kernel-address could 
be exactly the missing primitive to complete a vulnerability-chain of a 
reliable exploit.

I don't see a big difference between leaking a kernel-address from user-mode 
vs. leaking a hypervisor-address from guest. They are both useful just as a 
primitive which is part of an exploit chain.

One could argue though, that currently KASLR is fundementally broken and 
therefore should not be considered a security boundary anymore. This argument 
could be legit as there were some well-known techniques that could break KASLR 
before KPTI patch-set was introduced (e.g. Timing memory accesses to 
kernel-addresses and messure reliably by leveraging TSX). Another well-known 
argument against KASLR is that it is a non-deterministic mitigation which some 
argue is not good enough. However, I think that if we decide KASLR is not a 
security boundary anymore, it should be made loud and clear.

In general, I think there are some info-leak vulnerabilities in our current 
mitigation plan which doesn't seem to be addressed. I will be glad if we could 
address them clearly. These are all the open issues as I see them:

1) Because IBRS doesn't restrict low prediction-mode code from using BTB of 
high prediction-mode code, It is possible to info-leak addresses from high 
prediction-mode code to low prediciton-mode code.
This is the KASLR breakage discussed above. Again, could be ignored if we 
discard KASLR as a security boundary.

2) Both IBRS & retpoline don't prevent use of BHB of high prediction-mode code 
from being used by low prediction-mode code. Therefore, low prediction-mode 
code could deduce the conditional branches taken by high prediction-mode code.

3) Similar leak to (1) exists from the fact that RSB entries of high 
prediction-mode code could be leaked by low prediction-mode code which may 
reveal kernel-addresses. Again, we could decide that this isn't a security 
boundary. An alternative to solve this could be to just stuff RSB from a fixed 
address between prediction-mode transitions.

-Liran

P.S:
It seems to me that all these issues could be resolved completely at hardware 
in future CPUs if BTB/BHB/RSB entries were tagged with prediction-mode (or 
similar metadata). It will be nice if Intel/AMD could share if that is the 
planned long-term solution instead of IBRS-all-the-time.


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-25 Thread Liran Alon

- dave.han...@intel.com wrote:

> On 01/23/2018 03:13 AM, Liran Alon wrote:
> > Therefore, breaking KASLR. In order to handle this, every exit from
> > kernel-mode to user-mode should stuff RSB. In addition, this
> stuffing
> > of RSB may need to be done from a fixed address to avoid leaking
> the
> > address of the RSB stuffing itself.
> 
> With PTI alone in place, I don't see how userspace could do anything
> with this information.  Even if userspace started to speculate to a
> kernel address, there is nothing at the kernel address to execute: no
> TLB entry, no PTE to load, nothing.
> 
> You probably have a valid point about host->guest, though.

I see it differently.

It is true that attacker cannot speculate to a kernel-address, but it doesn't 
mean it cannot use the leaked kernel-address together with another unrelated 
vulnerability to build a reliable exploit.

Security is built in layers.
The purpose of KASLR is to break the reliablity of an exploit which relies on 
vulnerability primitives such as: memory-corruption of a kernel-address, hijack 
kernel control-flow to a kernel-address or even just read a kernel-address. In 
modern exploitation, it is common to chain multiple different vulnerabilities 
in order to build a reliable exploit. Therefore, leaking a kernel-address could 
be exactly the missing primitive to complete a vulnerability-chain of a 
reliable exploit.

I don't see a big difference between leaking a kernel-address from user-mode 
vs. leaking a hypervisor-address from guest. They are both useful just as a 
primitive which is part of an exploit chain.

One could argue though, that currently KASLR is fundementally broken and 
therefore should not be considered a security boundary anymore. This argument 
could be legit as there were some well-known techniques that could break KASLR 
before KPTI patch-set was introduced (e.g. Timing memory accesses to 
kernel-addresses and messure reliably by leveraging TSX). Another well-known 
argument against KASLR is that it is a non-deterministic mitigation which some 
argue is not good enough. However, I think that if we decide KASLR is not a 
security boundary anymore, it should be made loud and clear.

In general, I think there are some info-leak vulnerabilities in our current 
mitigation plan which doesn't seem to be addressed. I will be glad if we could 
address them clearly. These are all the open issues as I see them:

1) Because IBRS doesn't restrict low prediction-mode code from using BTB of 
high prediction-mode code, It is possible to info-leak addresses from high 
prediction-mode code to low prediciton-mode code.
This is the KASLR breakage discussed above. Again, could be ignored if we 
discard KASLR as a security boundary.

2) Both IBRS & retpoline don't prevent use of BHB of high prediction-mode code 
from being used by low prediction-mode code. Therefore, low prediction-mode 
code could deduce the conditional branches taken by high prediction-mode code.

3) Similar leak to (1) exists from the fact that RSB entries of high 
prediction-mode code could be leaked by low prediction-mode code which may 
reveal kernel-addresses. Again, we could decide that this isn't a security 
boundary. An alternative to solve this could be to just stuff RSB from a fixed 
address between prediction-mode transitions.

-Liran

P.S:
It seems to me that all these issues could be resolved completely at hardware 
in future CPUs if BTB/BHB/RSB entries were tagged with prediction-mode (or 
similar metadata). It will be nice if Intel/AMD could share if that is the 
planned long-term solution instead of IBRS-all-the-time.


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-25 Thread Dave Hansen
On 01/23/2018 03:13 AM, Liran Alon wrote:
> Therefore, breaking KASLR. In order to handle this, every exit from
> kernel-mode to user-mode should stuff RSB. In addition, this stuffing
> of RSB may need to be done from a fixed address to avoid leaking the
> address of the RSB stuffing itself.

With PTI alone in place, I don't see how userspace could do anything
with this information.  Even if userspace started to speculate to a
kernel address, there is nothing at the kernel address to execute: no
TLB entry, no PTE to load, nothing.

You probably have a valid point about host->guest, though.


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-25 Thread Dave Hansen
On 01/23/2018 03:13 AM, Liran Alon wrote:
> Therefore, breaking KASLR. In order to handle this, every exit from
> kernel-mode to user-mode should stuff RSB. In addition, this stuffing
> of RSB may need to be done from a fixed address to avoid leaking the
> address of the RSB stuffing itself.

With PTI alone in place, I don't see how userspace could do anything
with this information.  Even if userspace started to speculate to a
kernel address, there is nothing at the kernel address to execute: no
TLB entry, no PTE to load, nothing.

You probably have a valid point about host->guest, though.


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-25 Thread Greg Kroah-Hartman
On Thu, Jan 25, 2018 at 05:19:04PM +0100, Mason wrote:
> On 23/01/2018 10:30, David Woodhouse wrote:
> 
> > Skylake takes predictions from the generic branch target buffer when
> > the RSB underflows.
> 
> Adding LAKML.
> 
> AFAIU, some ARM Cortex cores have the same optimization.
> (A9 maybe, A17 probably, some recent 64-bit cores)
> 
> Are there software work-arounds for Spectre planned for arm32 and arm64?

Yes, I think they are currently burried in one of the arm64 trees, and
they have been posted to the mailing list a few times in the past.

thanks,

greg k-h


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-25 Thread Greg Kroah-Hartman
On Thu, Jan 25, 2018 at 05:19:04PM +0100, Mason wrote:
> On 23/01/2018 10:30, David Woodhouse wrote:
> 
> > Skylake takes predictions from the generic branch target buffer when
> > the RSB underflows.
> 
> Adding LAKML.
> 
> AFAIU, some ARM Cortex cores have the same optimization.
> (A9 maybe, A17 probably, some recent 64-bit cores)
> 
> Are there software work-arounds for Spectre planned for arm32 and arm64?

Yes, I think they are currently burried in one of the arm64 trees, and
they have been posted to the mailing list a few times in the past.

thanks,

greg k-h


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-25 Thread Mason
On 23/01/2018 10:30, David Woodhouse wrote:

> Skylake takes predictions from the generic branch target buffer when
> the RSB underflows.

Adding LAKML.

AFAIU, some ARM Cortex cores have the same optimization.
(A9 maybe, A17 probably, some recent 64-bit cores)

Are there software work-arounds for Spectre planned for arm32 and arm64?

Regards.


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-25 Thread Mason
On 23/01/2018 10:30, David Woodhouse wrote:

> Skylake takes predictions from the generic branch target buffer when
> the RSB underflows.

Adding LAKML.

AFAIU, some ARM Cortex cores have the same optimization.
(A9 maybe, A17 probably, some recent 64-bit cores)

Are there software work-arounds for Spectre planned for arm32 and arm64?

Regards.


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-23 Thread Andy Lutomirski


> On Jan 23, 2018, at 5:59 PM, Van De Ven, Arjan  
> wrote:
> 
> 
>>> It is a reasonable approach.  Let a process who needs max security
>>> opt in with disabled dumpable. It can have a flush with IBPB clear before
>>> starting to run, and have STIBP set while running.
>>> 
>> 
>> Do we maybe want a separate opt in?  I can easily imagine things like
>> web browsers that *don't* want to be non-dumpable but do want this
>> opt-in.
> 
> eventually we need something better. Probably in addition.
> dumpable is used today for things that want this.
> 
>> 
>> Also, what's the performance hit of STIBP?
> 
> pretty steep, but it depends on the CPU generation, for some it's cheaper 
> than others. (yes I realize this is a vague answer, but the range is really 
> from just about zero to oh my god)
> 
> I'm not a fan of doing this right now to be honest. We really need to not 
> piece meal some of this, and come up with a better concept of protection on a 
> higher level.
> For example, you mention web browsers, but the threat model for browsers is 
> generally internet content. For V2 to work you need to get some "evil 
> pointer" into the app from the observer and browsers usually aren't doing 
> that.
> The most likely user would be some software-TPM-like service that has magic 
> keys.
> 
> And for keys we want something else... we want an madvice() sort of thing 
> that does a few things, like equivalent of mlock (so the key does not end up 
> in swap),

I'd love to see a slight variant: encrypt that page against some ephemeral key 
if it gets swapped.

> not having the page (but potentially the rest) end up in core dumps, and the 
> kernel making sure that if the program exits (say for segv) that the key page 
> gets zeroed before going into the free pool. Once you do that as feature, 
> making the key speculation safe is not too hard (intel and arm have cpu 
> options to mark pages for that)
> 
> 

How do we do that on Intel?  Make it UC?

Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-23 Thread Andy Lutomirski


> On Jan 23, 2018, at 5:59 PM, Van De Ven, Arjan  
> wrote:
> 
> 
>>> It is a reasonable approach.  Let a process who needs max security
>>> opt in with disabled dumpable. It can have a flush with IBPB clear before
>>> starting to run, and have STIBP set while running.
>>> 
>> 
>> Do we maybe want a separate opt in?  I can easily imagine things like
>> web browsers that *don't* want to be non-dumpable but do want this
>> opt-in.
> 
> eventually we need something better. Probably in addition.
> dumpable is used today for things that want this.
> 
>> 
>> Also, what's the performance hit of STIBP?
> 
> pretty steep, but it depends on the CPU generation, for some it's cheaper 
> than others. (yes I realize this is a vague answer, but the range is really 
> from just about zero to oh my god)
> 
> I'm not a fan of doing this right now to be honest. We really need to not 
> piece meal some of this, and come up with a better concept of protection on a 
> higher level.
> For example, you mention web browsers, but the threat model for browsers is 
> generally internet content. For V2 to work you need to get some "evil 
> pointer" into the app from the observer and browsers usually aren't doing 
> that.
> The most likely user would be some software-TPM-like service that has magic 
> keys.
> 
> And for keys we want something else... we want an madvice() sort of thing 
> that does a few things, like equivalent of mlock (so the key does not end up 
> in swap),

I'd love to see a slight variant: encrypt that page against some ephemeral key 
if it gets swapped.

> not having the page (but potentially the rest) end up in core dumps, and the 
> kernel making sure that if the program exits (say for segv) that the key page 
> gets zeroed before going into the free pool. Once you do that as feature, 
> making the key speculation safe is not too hard (intel and arm have cpu 
> options to mark pages for that)
> 
> 

How do we do that on Intel?  Make it UC?

RE: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-23 Thread Van De Ven, Arjan

> > It is a reasonable approach.  Let a process who needs max security
> > opt in with disabled dumpable. It can have a flush with IBPB clear before
> > starting to run, and have STIBP set while running.
> >
> 
> Do we maybe want a separate opt in?  I can easily imagine things like
> web browsers that *don't* want to be non-dumpable but do want this
> opt-in.

eventually we need something better. Probably in addition.
dumpable is used today for things that want this.

> 
> Also, what's the performance hit of STIBP?

pretty steep, but it depends on the CPU generation, for some it's cheaper than 
others. (yes I realize this is a vague answer, but the range is really from 
just about zero to oh my god)

I'm not a fan of doing this right now to be honest. We really need to not piece 
meal some of this, and come up with a better concept of protection on a higher 
level.
For example, you mention web browsers, but the threat model for browsers is 
generally internet content. For V2 to work you need to get some "evil pointer" 
into the app from the observer and browsers usually aren't doing that.
The most likely user would be some software-TPM-like service that has magic 
keys.

And for keys we want something else... we want an madvice() sort of thing that 
does a few things, like equivalent of mlock (so the key does not end up in 
swap), not having the page (but potentially the rest) end up in core dumps, and 
the kernel making sure that if the program exits (say for segv) that the key 
page gets zeroed before going into the free pool. Once you do that as feature, 
making the key speculation safe is not too hard (intel and arm have cpu options 
to mark pages for that)




RE: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-23 Thread Van De Ven, Arjan

> > It is a reasonable approach.  Let a process who needs max security
> > opt in with disabled dumpable. It can have a flush with IBPB clear before
> > starting to run, and have STIBP set while running.
> >
> 
> Do we maybe want a separate opt in?  I can easily imagine things like
> web browsers that *don't* want to be non-dumpable but do want this
> opt-in.

eventually we need something better. Probably in addition.
dumpable is used today for things that want this.

> 
> Also, what's the performance hit of STIBP?

pretty steep, but it depends on the CPU generation, for some it's cheaper than 
others. (yes I realize this is a vague answer, but the range is really from 
just about zero to oh my god)

I'm not a fan of doing this right now to be honest. We really need to not piece 
meal some of this, and come up with a better concept of protection on a higher 
level.
For example, you mention web browsers, but the threat model for browsers is 
generally internet content. For V2 to work you need to get some "evil pointer" 
into the app from the observer and browsers usually aren't doing that.
The most likely user would be some software-TPM-like service that has magic 
keys.

And for keys we want something else... we want an madvice() sort of thing that 
does a few things, like equivalent of mlock (so the key does not end up in 
swap), not having the page (but potentially the rest) end up in core dumps, and 
the kernel making sure that if the program exits (say for segv) that the key 
page gets zeroed before going into the free pool. Once you do that as feature, 
making the key speculation safe is not too hard (intel and arm have cpu options 
to mark pages for that)




Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-23 Thread David Woodhouse
On Tue, 2018-01-23 at 17:00 -0800, Andy Lutomirski wrote:
> On Tue, Jan 23, 2018 at 4:47 PM, Tim Chen  wrote:
> > 
> > On 01/23/2018 03:14 PM, Woodhouse, David wrote:
> > > 
> > > On Tue, 2018-01-23 at 14:49 -0800, Andi Kleen wrote:
> > > > 
> > > > > 
> > > > > Not sure.  Maybe to start, the answer might be to allow it to be set 
> > > > > for
> > > > > the ultra-paranoid, but in general don't enable it by default.  
> > > > > Having it
> > > > > enabled would be an alternative to someone deciding to disable SMT, 
> > > > > since
> > > > > that would have even more of a performance impact.
> > > > I agree. A reasonable strategy would be to only enable it for
> > > > processes that have dumpable disabled. This should be already set for
> > > > high value processes like GPG, and allows others to opt-in if
> > > > they need to.
> > > That seems to make sense, and I think was the solution we were
> > > approaching for IBPB on context switch too, right?
> > > 
> > > Are we generally agreed on dumpable as the criterion for both of those?
> > > 
> > It is a reasonable approach.  Let a process who needs max security
> > opt in with disabled dumpable. It can have a flush with IBPB clear before
> > starting to run, and have STIBP set while running.
> > 
> Do we maybe want a separate opt in?  I can easily imagine things like
> web browsers that *don't* want to be non-dumpable but do want this
> opt-in.
 
This is to protect you from another local process running on a HT
sibling. Not the kind of thing that web browsers are normally worrying
about.

> Also, what's the performance hit of STIBP?

Varies per CPU generation, but generally approaching that of full IBRS
I think? I don't recall looking at this specifically (since we haven't
actually used it for this yet).

smime.p7s
Description: S/MIME cryptographic signature


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-23 Thread David Woodhouse
On Tue, 2018-01-23 at 17:00 -0800, Andy Lutomirski wrote:
> On Tue, Jan 23, 2018 at 4:47 PM, Tim Chen  wrote:
> > 
> > On 01/23/2018 03:14 PM, Woodhouse, David wrote:
> > > 
> > > On Tue, 2018-01-23 at 14:49 -0800, Andi Kleen wrote:
> > > > 
> > > > > 
> > > > > Not sure.  Maybe to start, the answer might be to allow it to be set 
> > > > > for
> > > > > the ultra-paranoid, but in general don't enable it by default.  
> > > > > Having it
> > > > > enabled would be an alternative to someone deciding to disable SMT, 
> > > > > since
> > > > > that would have even more of a performance impact.
> > > > I agree. A reasonable strategy would be to only enable it for
> > > > processes that have dumpable disabled. This should be already set for
> > > > high value processes like GPG, and allows others to opt-in if
> > > > they need to.
> > > That seems to make sense, and I think was the solution we were
> > > approaching for IBPB on context switch too, right?
> > > 
> > > Are we generally agreed on dumpable as the criterion for both of those?
> > > 
> > It is a reasonable approach.  Let a process who needs max security
> > opt in with disabled dumpable. It can have a flush with IBPB clear before
> > starting to run, and have STIBP set while running.
> > 
> Do we maybe want a separate opt in?  I can easily imagine things like
> web browsers that *don't* want to be non-dumpable but do want this
> opt-in.
 
This is to protect you from another local process running on a HT
sibling. Not the kind of thing that web browsers are normally worrying
about.

> Also, what's the performance hit of STIBP?

Varies per CPU generation, but generally approaching that of full IBRS
I think? I don't recall looking at this specifically (since we haven't
actually used it for this yet).

smime.p7s
Description: S/MIME cryptographic signature


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-23 Thread Andy Lutomirski
On Tue, Jan 23, 2018 at 4:47 PM, Tim Chen  wrote:
> On 01/23/2018 03:14 PM, Woodhouse, David wrote:
>> On Tue, 2018-01-23 at 14:49 -0800, Andi Kleen wrote:
 Not sure.  Maybe to start, the answer might be to allow it to be set for
 the ultra-paranoid, but in general don't enable it by default.  Having it
 enabled would be an alternative to someone deciding to disable SMT, since
 that would have even more of a performance impact.
>>>
>>> I agree. A reasonable strategy would be to only enable it for
>>> processes that have dumpable disabled. This should be already set for
>>> high value processes like GPG, and allows others to opt-in if
>>> they need to.
>>
>> That seems to make sense, and I think was the solution we were
>> approaching for IBPB on context switch too, right?
>>
>> Are we generally agreed on dumpable as the criterion for both of those?
>>
>
> It is a reasonable approach.  Let a process who needs max security
> opt in with disabled dumpable. It can have a flush with IBPB clear before
> starting to run, and have STIBP set while running.
>

Do we maybe want a separate opt in?  I can easily imagine things like
web browsers that *don't* want to be non-dumpable but do want this
opt-in.

Also, what's the performance hit of STIBP?


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-23 Thread Andy Lutomirski
On Tue, Jan 23, 2018 at 4:47 PM, Tim Chen  wrote:
> On 01/23/2018 03:14 PM, Woodhouse, David wrote:
>> On Tue, 2018-01-23 at 14:49 -0800, Andi Kleen wrote:
 Not sure.  Maybe to start, the answer might be to allow it to be set for
 the ultra-paranoid, but in general don't enable it by default.  Having it
 enabled would be an alternative to someone deciding to disable SMT, since
 that would have even more of a performance impact.
>>>
>>> I agree. A reasonable strategy would be to only enable it for
>>> processes that have dumpable disabled. This should be already set for
>>> high value processes like GPG, and allows others to opt-in if
>>> they need to.
>>
>> That seems to make sense, and I think was the solution we were
>> approaching for IBPB on context switch too, right?
>>
>> Are we generally agreed on dumpable as the criterion for both of those?
>>
>
> It is a reasonable approach.  Let a process who needs max security
> opt in with disabled dumpable. It can have a flush with IBPB clear before
> starting to run, and have STIBP set while running.
>

Do we maybe want a separate opt in?  I can easily imagine things like
web browsers that *don't* want to be non-dumpable but do want this
opt-in.

Also, what's the performance hit of STIBP?


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-23 Thread Tim Chen
On 01/23/2018 03:14 PM, Woodhouse, David wrote:
> On Tue, 2018-01-23 at 14:49 -0800, Andi Kleen wrote:
>>> Not sure.  Maybe to start, the answer might be to allow it to be set for
>>> the ultra-paranoid, but in general don't enable it by default.  Having it
>>> enabled would be an alternative to someone deciding to disable SMT, since
>>> that would have even more of a performance impact.
>>
>> I agree. A reasonable strategy would be to only enable it for
>> processes that have dumpable disabled. This should be already set for
>> high value processes like GPG, and allows others to opt-in if
>> they need to.
> 
> That seems to make sense, and I think was the solution we were
> approaching for IBPB on context switch too, right?
> 
> Are we generally agreed on dumpable as the criterion for both of those?
> 

It is a reasonable approach.  Let a process who needs max security
opt in with disabled dumpable. It can have a flush with IBPB clear before
starting to run, and have STIBP set while running.

Tim


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-23 Thread Tim Chen
On 01/23/2018 03:14 PM, Woodhouse, David wrote:
> On Tue, 2018-01-23 at 14:49 -0800, Andi Kleen wrote:
>>> Not sure.  Maybe to start, the answer might be to allow it to be set for
>>> the ultra-paranoid, but in general don't enable it by default.  Having it
>>> enabled would be an alternative to someone deciding to disable SMT, since
>>> that would have even more of a performance impact.
>>
>> I agree. A reasonable strategy would be to only enable it for
>> processes that have dumpable disabled. This should be already set for
>> high value processes like GPG, and allows others to opt-in if
>> they need to.
> 
> That seems to make sense, and I think was the solution we were
> approaching for IBPB on context switch too, right?
> 
> Are we generally agreed on dumpable as the criterion for both of those?
> 

It is a reasonable approach.  Let a process who needs max security
opt in with disabled dumpable. It can have a flush with IBPB clear before
starting to run, and have STIBP set while running.

Tim


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-23 Thread Andi Kleen
Ingo Molnar  writes:
>
> Is there any reason why this wouldn't work?

To actually maintain the true call depth you would need to intercept the
return of the function too, because the counter has to be decremented
at the end of the function.

Plain ftrace cannot do that because it only intercepts the function
entry.

The function graph tracer can do this, but only at the cost of
overwriting the return address (and saving return in a special stack)

This always causes a mispredict on every return, and other
overhead, and is one of the reasons why function graph
is so much slower than the plain function tracer.

I suspect the overhead would be significant.

To make your scheme work efficiently work likely we would
need custom gcc instrumentation for the returns.

FWIW our plan was to add enough manual stuffing at strategic
points, until we're sure enough of good enough coverage.

-Andi


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-23 Thread Andi Kleen
Ingo Molnar  writes:
>
> Is there any reason why this wouldn't work?

To actually maintain the true call depth you would need to intercept the
return of the function too, because the counter has to be decremented
at the end of the function.

Plain ftrace cannot do that because it only intercepts the function
entry.

The function graph tracer can do this, but only at the cost of
overwriting the return address (and saving return in a special stack)

This always causes a mispredict on every return, and other
overhead, and is one of the reasons why function graph
is so much slower than the plain function tracer.

I suspect the overhead would be significant.

To make your scheme work efficiently work likely we would
need custom gcc instrumentation for the returns.

FWIW our plan was to add enough manual stuffing at strategic
points, until we're sure enough of good enough coverage.

-Andi


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-23 Thread Andi Kleen
On Tue, Jan 23, 2018 at 11:14:36PM +, Woodhouse, David wrote:
> On Tue, 2018-01-23 at 14:49 -0800, Andi Kleen wrote:
> > > Not sure.  Maybe to start, the answer might be to allow it to be set for
> > > the ultra-paranoid, but in general don't enable it by default.  Having it
> > > enabled would be an alternative to someone deciding to disable SMT, since
> > > that would have even more of a performance impact.
> > 
> > I agree. A reasonable strategy would be to only enable it for
> > processes that have dumpable disabled. This should be already set for
> > high value processes like GPG, and allows others to opt-in if
> > they need to.
> 
> That seems to make sense, and I think was the solution we were
> approaching for IBPB on context switch too, right?

Right.

-Andi


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-23 Thread Andi Kleen
On Tue, Jan 23, 2018 at 11:14:36PM +, Woodhouse, David wrote:
> On Tue, 2018-01-23 at 14:49 -0800, Andi Kleen wrote:
> > > Not sure.  Maybe to start, the answer might be to allow it to be set for
> > > the ultra-paranoid, but in general don't enable it by default.  Having it
> > > enabled would be an alternative to someone deciding to disable SMT, since
> > > that would have even more of a performance impact.
> > 
> > I agree. A reasonable strategy would be to only enable it for
> > processes that have dumpable disabled. This should be already set for
> > high value processes like GPG, and allows others to opt-in if
> > they need to.
> 
> That seems to make sense, and I think was the solution we were
> approaching for IBPB on context switch too, right?

Right.

-Andi


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-23 Thread Woodhouse, David
On Tue, 2018-01-23 at 14:49 -0800, Andi Kleen wrote:
> > Not sure.  Maybe to start, the answer might be to allow it to be set for
> > the ultra-paranoid, but in general don't enable it by default.  Having it
> > enabled would be an alternative to someone deciding to disable SMT, since
> > that would have even more of a performance impact.
> 
> I agree. A reasonable strategy would be to only enable it for
> processes that have dumpable disabled. This should be already set for
> high value processes like GPG, and allows others to opt-in if
> they need to.

That seems to make sense, and I think was the solution we were
approaching for IBPB on context switch too, right?

Are we generally agreed on dumpable as the criterion for both of those?

smime.p7s
Description: S/MIME cryptographic signature


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-23 Thread Woodhouse, David
On Tue, 2018-01-23 at 14:49 -0800, Andi Kleen wrote:
> > Not sure.  Maybe to start, the answer might be to allow it to be set for
> > the ultra-paranoid, but in general don't enable it by default.  Having it
> > enabled would be an alternative to someone deciding to disable SMT, since
> > that would have even more of a performance impact.
> 
> I agree. A reasonable strategy would be to only enable it for
> processes that have dumpable disabled. This should be already set for
> high value processes like GPG, and allows others to opt-in if
> they need to.

That seems to make sense, and I think was the solution we were
approaching for IBPB on context switch too, right?

Are we generally agreed on dumpable as the criterion for both of those?

smime.p7s
Description: S/MIME cryptographic signature


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-23 Thread Andi Kleen
> Not sure.  Maybe to start, the answer might be to allow it to be set for
> the ultra-paranoid, but in general don't enable it by default.  Having it
> enabled would be an alternative to someone deciding to disable SMT, since
> that would have even more of a performance impact.

I agree. A reasonable strategy would be to only enable it for
processes that have dumpable disabled. This should be already set for
high value processes like GPG, and allows others to opt-in if
they need to.

-Andi


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-23 Thread Andi Kleen
> Not sure.  Maybe to start, the answer might be to allow it to be set for
> the ultra-paranoid, but in general don't enable it by default.  Having it
> enabled would be an alternative to someone deciding to disable SMT, since
> that would have even more of a performance impact.

I agree. A reasonable strategy would be to only enable it for
processes that have dumpable disabled. This should be already set for
high value processes like GPG, and allows others to opt-in if
they need to.

-Andi


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-23 Thread Tom Lendacky
On 1/23/2018 10:20 AM, Woodhouse, David wrote:
> On Tue, 2018-01-23 at 10:12 -0600, Tom Lendacky wrote:
>>
 +.macro UNRESTRICT_IB_SPEC
 +    ALTERNATIVE "jmp .Lskip_\@", "", X86_FEATURE_IBRS
 +    PUSH_MSR_REGS
 +    WRMSR_ASM $MSR_IA32_SPEC_CTRL, $0, $0
>>>  
>> I think you should be writing 2, not 0, since I'm reasonably
>> confident that we want STIBP on.  Can you explain why you're writing
>> 0?
>>
>> Do we want to talk about STIBP in general?  Should it be (yet another)
>> boot option to enable or disable?  If there is STIBP support without
>> IBRS support, it could be a set and forget at boot time.
> 
> We haven't got patches which enable STIBP in general. The kernel itself
> is safe either way with retpoline, or because IBRS implies STIBP too
> (that is, there's no difference between writing 1 and 3).
> 
> So STIBP is purely about protecting userspace processes from one
> another, and VM guests from one another, when they run on HT siblings.
> 
> There's an argument that there are so many other information leaks
> between HT siblings that we might not care. Especially as it's hard to
> *tell* when you're scheduling, whether you trust all the processes (or
> guests) on your HT siblings right now... let alone later when
> scheduling another process if you need to *now* set STIBP on a sibling
> which is no longer save from this process now running.
> 
> I'm not sure we want to set STIBP *unconditionally* either because of
> the performance implications.
> 
> For IBRS we had an answer and it was just ugly. For STIBP we don't
> actually have an answer for "how do we use this?". Do we?

Not sure.  Maybe to start, the answer might be to allow it to be set for
the ultra-paranoid, but in general don't enable it by default.  Having it
enabled would be an alternative to someone deciding to disable SMT, since
that would have even more of a performance impact.

Thanks,
Tom

> 
> 


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-23 Thread Tom Lendacky
On 1/23/2018 10:20 AM, Woodhouse, David wrote:
> On Tue, 2018-01-23 at 10:12 -0600, Tom Lendacky wrote:
>>
 +.macro UNRESTRICT_IB_SPEC
 +    ALTERNATIVE "jmp .Lskip_\@", "", X86_FEATURE_IBRS
 +    PUSH_MSR_REGS
 +    WRMSR_ASM $MSR_IA32_SPEC_CTRL, $0, $0
>>>  
>> I think you should be writing 2, not 0, since I'm reasonably
>> confident that we want STIBP on.  Can you explain why you're writing
>> 0?
>>
>> Do we want to talk about STIBP in general?  Should it be (yet another)
>> boot option to enable or disable?  If there is STIBP support without
>> IBRS support, it could be a set and forget at boot time.
> 
> We haven't got patches which enable STIBP in general. The kernel itself
> is safe either way with retpoline, or because IBRS implies STIBP too
> (that is, there's no difference between writing 1 and 3).
> 
> So STIBP is purely about protecting userspace processes from one
> another, and VM guests from one another, when they run on HT siblings.
> 
> There's an argument that there are so many other information leaks
> between HT siblings that we might not care. Especially as it's hard to
> *tell* when you're scheduling, whether you trust all the processes (or
> guests) on your HT siblings right now... let alone later when
> scheduling another process if you need to *now* set STIBP on a sibling
> which is no longer save from this process now running.
> 
> I'm not sure we want to set STIBP *unconditionally* either because of
> the performance implications.
> 
> For IBRS we had an answer and it was just ugly. For STIBP we don't
> actually have an answer for "how do we use this?". Do we?

Not sure.  Maybe to start, the answer might be to allow it to be set for
the ultra-paranoid, but in general don't enable it by default.  Having it
enabled would be an alternative to someone deciding to disable SMT, since
that would have even more of a performance impact.

Thanks,
Tom

> 
> 


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-23 Thread Pavel Machek
On Sun 2018-01-21 20:28:17, David Woodhouse wrote:
> On Sun, 2018-01-21 at 11:34 -0800, Linus Torvalds wrote:
> > All of this is pure garbage.
> > 
> > Is Intel really planning on making this shit architectural? Has
> > anybody talked to them and told them they are f*cking insane?
> > 
> > Please, any Intel engineers here - talk to your managers. 
> 
> If the alternative was a two-decade product recall and giving everyone
> free CPUs, I'm not sure it was entirely insane.
> 
> Certainly it's a nasty hack, but hey — the world was on fire and in the
> end we didn't have to just turn the datacentres off and go back to goat
> farming, so it's not all bad.

Well, someone at Intel put world on fire. And then was selling faulty
CPUs for half a year while world was on fire; they knew they are
faulty yet they sold them anyway.

Then Intel talks about how great they are and how security is
important for them Intentionaly confusing between Meltdown and
Spectre so they can mask how badly they screwed. And without apologies.

> As a hack for existing CPUs, it's just about tolerable — as long as it
> can die entirely by the next generation.
> 
> So the part is I think is odd is the IBRS_ALL feature, where a future
> CPU will advertise "I am able to be not broken" and then you have to
> set the IBRS bit once at boot time to *ask* it not to be broken. That
> part is weird, because it ought to have been treated like the RDCL_NO
> bit — just "you don't have to worry any more, it got better".

And now Intel wants to cheat at benchmarks, to put companies that do
right thing at disadvantage and thinks that that's okay because world
was on fire?

At this point, I believe that yes, product recall would be
appropriate. If Intel is not willing to do it on their own, well,
perhaps courts can force them. Ouch and I wound not mind some jail time
for whoever is responsible for selling known-faulty CPUs to the public.

Oh, and still no word about the real fixes. World is not only Linux,
you see? https://pavelmachek.livejournal.com/140949.html?nojs=1

Best regards,
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-23 Thread Pavel Machek
On Sun 2018-01-21 20:28:17, David Woodhouse wrote:
> On Sun, 2018-01-21 at 11:34 -0800, Linus Torvalds wrote:
> > All of this is pure garbage.
> > 
> > Is Intel really planning on making this shit architectural? Has
> > anybody talked to them and told them they are f*cking insane?
> > 
> > Please, any Intel engineers here - talk to your managers. 
> 
> If the alternative was a two-decade product recall and giving everyone
> free CPUs, I'm not sure it was entirely insane.
> 
> Certainly it's a nasty hack, but hey — the world was on fire and in the
> end we didn't have to just turn the datacentres off and go back to goat
> farming, so it's not all bad.

Well, someone at Intel put world on fire. And then was selling faulty
CPUs for half a year while world was on fire; they knew they are
faulty yet they sold them anyway.

Then Intel talks about how great they are and how security is
important for them Intentionaly confusing between Meltdown and
Spectre so they can mask how badly they screwed. And without apologies.

> As a hack for existing CPUs, it's just about tolerable — as long as it
> can die entirely by the next generation.
> 
> So the part is I think is odd is the IBRS_ALL feature, where a future
> CPU will advertise "I am able to be not broken" and then you have to
> set the IBRS bit once at boot time to *ask* it not to be broken. That
> part is weird, because it ought to have been treated like the RDCL_NO
> bit — just "you don't have to worry any more, it got better".

And now Intel wants to cheat at benchmarks, to put companies that do
right thing at disadvantage and thinks that that's okay because world
was on fire?

At this point, I believe that yes, product recall would be
appropriate. If Intel is not willing to do it on their own, well,
perhaps courts can force them. Ouch and I wound not mind some jail time
for whoever is responsible for selling known-faulty CPUs to the public.

Oh, and still no word about the real fixes. World is not only Linux,
you see? https://pavelmachek.livejournal.com/140949.html?nojs=1

Best regards,
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-23 Thread Woodhouse, David
On Tue, 2018-01-23 at 10:12 -0600, Tom Lendacky wrote:
> 
> >> +.macro UNRESTRICT_IB_SPEC
> >> +    ALTERNATIVE "jmp .Lskip_\@", "", X86_FEATURE_IBRS
> >> +    PUSH_MSR_REGS
> >> +    WRMSR_ASM $MSR_IA32_SPEC_CTRL, $0, $0
> > 
> I think you should be writing 2, not 0, since I'm reasonably
> confident that we want STIBP on.  Can you explain why you're writing
> 0?
> 
> Do we want to talk about STIBP in general?  Should it be (yet another)
> boot option to enable or disable?  If there is STIBP support without
> IBRS support, it could be a set and forget at boot time.

We haven't got patches which enable STIBP in general. The kernel itself
is safe either way with retpoline, or because IBRS implies STIBP too
(that is, there's no difference between writing 1 and 3).

So STIBP is purely about protecting userspace processes from one
another, and VM guests from one another, when they run on HT siblings.

There's an argument that there are so many other information leaks
between HT siblings that we might not care. Especially as it's hard to
*tell* when you're scheduling, whether you trust all the processes (or
guests) on your HT siblings right now... let alone later when
scheduling another process if you need to *now* set STIBP on a sibling
which is no longer save from this process now running.

I'm not sure we want to set STIBP *unconditionally* either because of
the performance implications.

For IBRS we had an answer and it was just ugly. For STIBP we don't
actually have an answer for "how do we use this?". Do we?




smime.p7s
Description: S/MIME cryptographic signature


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-23 Thread Woodhouse, David
On Tue, 2018-01-23 at 10:12 -0600, Tom Lendacky wrote:
> 
> >> +.macro UNRESTRICT_IB_SPEC
> >> +    ALTERNATIVE "jmp .Lskip_\@", "", X86_FEATURE_IBRS
> >> +    PUSH_MSR_REGS
> >> +    WRMSR_ASM $MSR_IA32_SPEC_CTRL, $0, $0
> > 
> I think you should be writing 2, not 0, since I'm reasonably
> confident that we want STIBP on.  Can you explain why you're writing
> 0?
> 
> Do we want to talk about STIBP in general?  Should it be (yet another)
> boot option to enable or disable?  If there is STIBP support without
> IBRS support, it could be a set and forget at boot time.

We haven't got patches which enable STIBP in general. The kernel itself
is safe either way with retpoline, or because IBRS implies STIBP too
(that is, there's no difference between writing 1 and 3).

So STIBP is purely about protecting userspace processes from one
another, and VM guests from one another, when they run on HT siblings.

There's an argument that there are so many other information leaks
between HT siblings that we might not care. Especially as it's hard to
*tell* when you're scheduling, whether you trust all the processes (or
guests) on your HT siblings right now... let alone later when
scheduling another process if you need to *now* set STIBP on a sibling
which is no longer save from this process now running.

I'm not sure we want to set STIBP *unconditionally* either because of
the performance implications.

For IBRS we had an answer and it was just ugly. For STIBP we don't
actually have an answer for "how do we use this?". Do we?




smime.p7s
Description: S/MIME cryptographic signature


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-23 Thread Tom Lendacky
On 1/21/2018 1:14 PM, Andy Lutomirski wrote:
> 
> 
>> On Jan 20, 2018, at 11:23 AM, KarimAllah Ahmed  wrote:
>>
>> From: Tim Chen 
>>
>> Create macros to control Indirect Branch Speculation.
>>
>> Name them so they reflect what they are actually doing.
>> The macros are used to restrict and unrestrict the indirect branch 
>> speculation.
>> They do not *disable* (or *enable*) indirect branch speculation. A trip back 
>> to
>> user-space after *restricting* speculation would still affect the BTB.
>>
>> Quoting from a commit by Tim Chen:
>>
>> """
>>If IBRS is set, near returns and near indirect jumps/calls will not allow
>>their predicted target address to be controlled by code that executed in a
>>less privileged prediction mode *BEFORE* the IBRS mode was last written 
>> with
>>a value of 1 or on another logical processor so long as all Return Stack
>>Buffer (RSB) entries from the previous less privileged prediction mode are
>>overwritten.
>>
>>Thus a near indirect jump/call/return may be affected by code in a less
>>privileged prediction mode that executed *AFTER* IBRS mode was last 
>> written
>>with a value of 1.
>> """
>>
>> [ tglx: Changed macro names and rewrote changelog ]
>> [ karahmed: changed macro names *again* and rewrote changelog ]
>>
>> Signed-off-by: Tim Chen 
>> Signed-off-by: Thomas Gleixner 
>> Signed-off-by: KarimAllah Ahmed 
>> Cc: Andrea Arcangeli 
>> Cc: Andi Kleen 
>> Cc: Peter Zijlstra 
>> Cc: Greg KH 
>> Cc: Dave Hansen 
>> Cc: Andy Lutomirski 
>> Cc: Paolo Bonzini 
>> Cc: Dan Williams 
>> Cc: Arjan Van De Ven 
>> Cc: Linus Torvalds 
>> Cc: David Woodhouse 
>> Cc: Ashok Raj 
>> Link: 
>> https://lkml.kernel.org/r/3aab341725ee6a9aafd3141387453b45d788d61a.1515542293.git.tim.c.c...@linux.intel.com
>> Signed-off-by: David Woodhouse 
>> ---
>> arch/x86/entry/calling.h | 73 
>> 
>> 1 file changed, 73 insertions(+)
>>
>> diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h
>> index 3f48f69..5aafb51 100644
>> --- a/arch/x86/entry/calling.h
>> +++ b/arch/x86/entry/calling.h
>> @@ -6,6 +6,8 @@
>> #include 
>> #include 
>> #include 
>> +#include 
>> +#include 
>>
>> /*
>>
>> @@ -349,3 +351,74 @@ For 32-bit we have the following conventions - kernel 
>> is built with
>> .Lafter_call_\@:
>> #endif
>> .endm
>> +
>> +/*
>> + * IBRS related macros
>> + */
>> +.macro PUSH_MSR_REGS
>> +pushq%rax
>> +pushq%rcx
>> +pushq%rdx
>> +.endm
>> +
>> +.macro POP_MSR_REGS
>> +popq%rdx
>> +popq%rcx
>> +popq%rax
>> +.endm
>> +
>> +.macro WRMSR_ASM msr_nr:req edx_val:req eax_val:req
>> +movl\msr_nr, %ecx
>> +movl\edx_val, %edx
>> +movl\eax_val, %eax
>> +wrmsr
>> +.endm
>> +
>> +.macro RESTRICT_IB_SPEC
>> +ALTERNATIVE "jmp .Lskip_\@", "", X86_FEATURE_IBRS
>> +PUSH_MSR_REGS
>> +WRMSR_ASM $MSR_IA32_SPEC_CTRL, $0, $SPEC_CTRL_IBRS
>> +POP_MSR_REGS
>> +.Lskip_\@:
>> +.endm
>> +
>> +.macro UNRESTRICT_IB_SPEC
>> +ALTERNATIVE "jmp .Lskip_\@", "", X86_FEATURE_IBRS
>> +PUSH_MSR_REGS
>> +WRMSR_ASM $MSR_IA32_SPEC_CTRL, $0, $0
> 
> I think you should be writing 2, not 0, since I'm reasonably confident that 
> we want STIBP on.  Can you explain why you're writing 0?

Do we want to talk about STIBP in general?  Should it be (yet another)
boot option to enable or disable?  If there is STIBP support without
IBRS support, it could be a set and forget at boot time.

Thanks,
Tom

> 
> Also, holy cow, there are so many macros here.
> 
> And a meta question: why are there so many submitters of the same series?
> 


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-23 Thread Tom Lendacky
On 1/21/2018 1:14 PM, Andy Lutomirski wrote:
> 
> 
>> On Jan 20, 2018, at 11:23 AM, KarimAllah Ahmed  wrote:
>>
>> From: Tim Chen 
>>
>> Create macros to control Indirect Branch Speculation.
>>
>> Name them so they reflect what they are actually doing.
>> The macros are used to restrict and unrestrict the indirect branch 
>> speculation.
>> They do not *disable* (or *enable*) indirect branch speculation. A trip back 
>> to
>> user-space after *restricting* speculation would still affect the BTB.
>>
>> Quoting from a commit by Tim Chen:
>>
>> """
>>If IBRS is set, near returns and near indirect jumps/calls will not allow
>>their predicted target address to be controlled by code that executed in a
>>less privileged prediction mode *BEFORE* the IBRS mode was last written 
>> with
>>a value of 1 or on another logical processor so long as all Return Stack
>>Buffer (RSB) entries from the previous less privileged prediction mode are
>>overwritten.
>>
>>Thus a near indirect jump/call/return may be affected by code in a less
>>privileged prediction mode that executed *AFTER* IBRS mode was last 
>> written
>>with a value of 1.
>> """
>>
>> [ tglx: Changed macro names and rewrote changelog ]
>> [ karahmed: changed macro names *again* and rewrote changelog ]
>>
>> Signed-off-by: Tim Chen 
>> Signed-off-by: Thomas Gleixner 
>> Signed-off-by: KarimAllah Ahmed 
>> Cc: Andrea Arcangeli 
>> Cc: Andi Kleen 
>> Cc: Peter Zijlstra 
>> Cc: Greg KH 
>> Cc: Dave Hansen 
>> Cc: Andy Lutomirski 
>> Cc: Paolo Bonzini 
>> Cc: Dan Williams 
>> Cc: Arjan Van De Ven 
>> Cc: Linus Torvalds 
>> Cc: David Woodhouse 
>> Cc: Ashok Raj 
>> Link: 
>> https://lkml.kernel.org/r/3aab341725ee6a9aafd3141387453b45d788d61a.1515542293.git.tim.c.c...@linux.intel.com
>> Signed-off-by: David Woodhouse 
>> ---
>> arch/x86/entry/calling.h | 73 
>> 
>> 1 file changed, 73 insertions(+)
>>
>> diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h
>> index 3f48f69..5aafb51 100644
>> --- a/arch/x86/entry/calling.h
>> +++ b/arch/x86/entry/calling.h
>> @@ -6,6 +6,8 @@
>> #include 
>> #include 
>> #include 
>> +#include 
>> +#include 
>>
>> /*
>>
>> @@ -349,3 +351,74 @@ For 32-bit we have the following conventions - kernel 
>> is built with
>> .Lafter_call_\@:
>> #endif
>> .endm
>> +
>> +/*
>> + * IBRS related macros
>> + */
>> +.macro PUSH_MSR_REGS
>> +pushq%rax
>> +pushq%rcx
>> +pushq%rdx
>> +.endm
>> +
>> +.macro POP_MSR_REGS
>> +popq%rdx
>> +popq%rcx
>> +popq%rax
>> +.endm
>> +
>> +.macro WRMSR_ASM msr_nr:req edx_val:req eax_val:req
>> +movl\msr_nr, %ecx
>> +movl\edx_val, %edx
>> +movl\eax_val, %eax
>> +wrmsr
>> +.endm
>> +
>> +.macro RESTRICT_IB_SPEC
>> +ALTERNATIVE "jmp .Lskip_\@", "", X86_FEATURE_IBRS
>> +PUSH_MSR_REGS
>> +WRMSR_ASM $MSR_IA32_SPEC_CTRL, $0, $SPEC_CTRL_IBRS
>> +POP_MSR_REGS
>> +.Lskip_\@:
>> +.endm
>> +
>> +.macro UNRESTRICT_IB_SPEC
>> +ALTERNATIVE "jmp .Lskip_\@", "", X86_FEATURE_IBRS
>> +PUSH_MSR_REGS
>> +WRMSR_ASM $MSR_IA32_SPEC_CTRL, $0, $0
> 
> I think you should be writing 2, not 0, since I'm reasonably confident that 
> we want STIBP on.  Can you explain why you're writing 0?

Do we want to talk about STIBP in general?  Should it be (yet another)
boot option to enable or disable?  If there is STIBP support without
IBRS support, it could be a set and forget at boot time.

Thanks,
Tom

> 
> Also, holy cow, there are so many macros here.
> 
> And a meta question: why are there so many submitters of the same series?
> 


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-23 Thread Dave Hansen
On 01/23/2018 01:27 AM, Ingo Molnar wrote:
> 
>  - All asynchronous contexts (IRQs, NMIs, etc.) stuff the RSB before IRET. 
> (The 
>tracking could probably made IRQ and maybe even NMI safe, but the 
> worst-case 
>nesting scenarios make my head ache.)

This all sounds totally workable to me.  We talked about using ftrace
itself to track call depth, but it would be unusable in production, of
course.  This seems workable, though.  You're also totally right about
the zero overhead on most kernels with it turned off when we don't need
RSB underflow protection (basically pre-Skylake).

I also agree that the safe thing to do is to just stuff before iret.  I
bet we can get a ftrace-driven RSB tracker working precisely enough even
with NMIs, but it's way simpler to just stuff and be done with it for now.


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-23 Thread Dave Hansen
On 01/23/2018 01:27 AM, Ingo Molnar wrote:
> 
>  - All asynchronous contexts (IRQs, NMIs, etc.) stuff the RSB before IRET. 
> (The 
>tracking could probably made IRQ and maybe even NMI safe, but the 
> worst-case 
>nesting scenarios make my head ache.)

This all sounds totally workable to me.  We talked about using ftrace
itself to track call depth, but it would be unusable in production, of
course.  This seems workable, though.  You're also totally right about
the zero overhead on most kernels with it turned off when we don't need
RSB underflow protection (basically pre-Skylake).

I also agree that the safe thing to do is to just stuff before iret.  I
bet we can get a ftrace-driven RSB tracker working precisely enough even
with NMIs, but it's way simpler to just stuff and be done with it for now.


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-23 Thread Liran Alon

- dw...@infradead.org wrote:

> On Sun, 2018-01-21 at 14:27 -0800, Linus Torvalds wrote:
> > On Sun, Jan 21, 2018 at 2:00 PM, David Woodhouse
>  wrote:
> > >>
> > >> The patches do things like add the garbage MSR writes to the
> kernel
> > >> entry/exit points. That's insane. That says "we're trying to
> protect
> > >> the kernel".  We already have retpoline there, with less
> overhead.
> > >
> > > You're looking at IBRS usage, not IBPB. They are different
> things.
> > 
> > Ehh. Odd intel naming detail.
> > 
> > If you look at this series, it very much does that kernel
> entry/exit
> > stuff. It was patch 10/10, iirc. In fact, the patch I was replying
> to
> > was explicitly setting that garbage up.
> > 
> > And I really don't want to see these garbage patches just
> mindlessly
> > sent around.
> 
> I think we've covered the technical part of this now, not that you
> like
> it — not that any of us *like* it. But since the peanut gallery is
> paying lots of attention it's probably worth explaining it a little
> more for their benefit.
> 
> This is all about Spectre variant 2, where the CPU can be tricked
> into
> mispredicting the target of an indirect branch. And I'm specifically
> looking at what we can do on *current* hardware, where we're limited
> to
> the hacks they can manage to add in the microcode.
> 
> The new microcode from Intel and AMD adds three new features.
> 
> One new feature (IBPB) is a complete barrier for branch prediction.
> After frobbing this, no branch targets learned earlier are going to
> be
> used. It's kind of expensive (order of magnitude ~4000 cycles).
> 
> The second (STIBP) protects a hyperthread sibling from following
> branch
> predictions which were learned on another sibling. You *might* want
> this when running unrelated processes in userspace, for example. Or
> different VM guests running on HT siblings.
> 
> The third feature (IBRS) is more complicated. It's designed to be
> set when you enter a more privileged execution mode (i.e. the
> kernel).
> It prevents branch targets learned in a less-privileged execution
> mode,
> BEFORE IT WAS MOST RECENTLY SET, from taking effect. But it's not
> just
> a 'set-and-forget' feature, it also has barrier-like semantics and
> needs to be set on *each* entry into the kernel (from userspace or a
> VM
> guest). It's *also* expensive. And a vile hack, but for a while it
> was
> the only option we had.
> 
> Even with IBRS, the CPU cannot tell the difference between different
> userspace processes, and between different VM guests. So in addition
> to
> IBRS to protect the kernel, we need the full IBPB barrier on context
> switch and vmexit. And maybe STIBP while they're running.
> 
> Then along came Paul with the cunning plan of "oh, indirect branches
> can be exploited? Screw it, let's not have any of *those* then",
> which
> is retpoline. And it's a *lot* faster than frobbing IBRS on every
> entry
> into the kernel. It's a massive performance win.
> 
> So now we *mostly* don't need IBRS. We build with retpoline, use IBPB
> on context switches/vmexit (which is in the first part of this patch
> series before IBRS is added), and we're safe. We even refactored the
> patch series to put retpoline first.
> 
> But wait, why did I say "mostly"? Well, not everyone has a retpoline
> compiler yet... but OK, screw them; they need to update.
> 
> Then there's Skylake, and that generation of CPU cores. For
> complicated
> reasons they actually end up being vulnerable not just on indirect
> branches, but also on a 'ret' in some circumstances (such as 16+
> CALLs
> in a deep chain).
> 
> The IBRS solution, ugly though it is, did address that. Retpoline
> doesn't. There are patches being floated to detect and prevent deep
> stacks, and deal with some of the other special cases that bite on
> SKL,
> but those are icky too. And in fact IBRS performance isn't anywhere
> near as bad on this generation of CPUs as it is on earlier CPUs
> *anyway*, which makes it not quite so insane to *contemplate* using
> it
> as Intel proposed.
> 
> That's why my initial idea, as implemented in this RFC patchset, was
> to
> stick with IBRS on Skylake, and use retpoline everywhere else. I'll
> give you "garbage patches", but they weren't being "just mindlessly
> sent around". If we're going to drop IBRS support and accept the
> caveats, then let's do it as a conscious decision having seen what it
> would look like, not just drop it quietly because poor Davey is too
> scared that Linus might shout at him again. :)
> 
> I have seen *hand-wavy* analyses of the Skylake thing that mean I'm
> not
> actually lying awake at night fretting about it, but nothing concrete
> that really says it's OK.
> 
> If you view retpoline as a performance optimisation, which is how it
> first arrived, then it's rather unconventional to say "well, it only
> opens a *little* bit of a security hole but it does go nice and fast
> so
> let's do it".
> 
> But fine, I'm 

Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-23 Thread Liran Alon

- dw...@infradead.org wrote:

> On Sun, 2018-01-21 at 14:27 -0800, Linus Torvalds wrote:
> > On Sun, Jan 21, 2018 at 2:00 PM, David Woodhouse
>  wrote:
> > >>
> > >> The patches do things like add the garbage MSR writes to the
> kernel
> > >> entry/exit points. That's insane. That says "we're trying to
> protect
> > >> the kernel".  We already have retpoline there, with less
> overhead.
> > >
> > > You're looking at IBRS usage, not IBPB. They are different
> things.
> > 
> > Ehh. Odd intel naming detail.
> > 
> > If you look at this series, it very much does that kernel
> entry/exit
> > stuff. It was patch 10/10, iirc. In fact, the patch I was replying
> to
> > was explicitly setting that garbage up.
> > 
> > And I really don't want to see these garbage patches just
> mindlessly
> > sent around.
> 
> I think we've covered the technical part of this now, not that you
> like
> it — not that any of us *like* it. But since the peanut gallery is
> paying lots of attention it's probably worth explaining it a little
> more for their benefit.
> 
> This is all about Spectre variant 2, where the CPU can be tricked
> into
> mispredicting the target of an indirect branch. And I'm specifically
> looking at what we can do on *current* hardware, where we're limited
> to
> the hacks they can manage to add in the microcode.
> 
> The new microcode from Intel and AMD adds three new features.
> 
> One new feature (IBPB) is a complete barrier for branch prediction.
> After frobbing this, no branch targets learned earlier are going to
> be
> used. It's kind of expensive (order of magnitude ~4000 cycles).
> 
> The second (STIBP) protects a hyperthread sibling from following
> branch
> predictions which were learned on another sibling. You *might* want
> this when running unrelated processes in userspace, for example. Or
> different VM guests running on HT siblings.
> 
> The third feature (IBRS) is more complicated. It's designed to be
> set when you enter a more privileged execution mode (i.e. the
> kernel).
> It prevents branch targets learned in a less-privileged execution
> mode,
> BEFORE IT WAS MOST RECENTLY SET, from taking effect. But it's not
> just
> a 'set-and-forget' feature, it also has barrier-like semantics and
> needs to be set on *each* entry into the kernel (from userspace or a
> VM
> guest). It's *also* expensive. And a vile hack, but for a while it
> was
> the only option we had.
> 
> Even with IBRS, the CPU cannot tell the difference between different
> userspace processes, and between different VM guests. So in addition
> to
> IBRS to protect the kernel, we need the full IBPB barrier on context
> switch and vmexit. And maybe STIBP while they're running.
> 
> Then along came Paul with the cunning plan of "oh, indirect branches
> can be exploited? Screw it, let's not have any of *those* then",
> which
> is retpoline. And it's a *lot* faster than frobbing IBRS on every
> entry
> into the kernel. It's a massive performance win.
> 
> So now we *mostly* don't need IBRS. We build with retpoline, use IBPB
> on context switches/vmexit (which is in the first part of this patch
> series before IBRS is added), and we're safe. We even refactored the
> patch series to put retpoline first.
> 
> But wait, why did I say "mostly"? Well, not everyone has a retpoline
> compiler yet... but OK, screw them; they need to update.
> 
> Then there's Skylake, and that generation of CPU cores. For
> complicated
> reasons they actually end up being vulnerable not just on indirect
> branches, but also on a 'ret' in some circumstances (such as 16+
> CALLs
> in a deep chain).
> 
> The IBRS solution, ugly though it is, did address that. Retpoline
> doesn't. There are patches being floated to detect and prevent deep
> stacks, and deal with some of the other special cases that bite on
> SKL,
> but those are icky too. And in fact IBRS performance isn't anywhere
> near as bad on this generation of CPUs as it is on earlier CPUs
> *anyway*, which makes it not quite so insane to *contemplate* using
> it
> as Intel proposed.
> 
> That's why my initial idea, as implemented in this RFC patchset, was
> to
> stick with IBRS on Skylake, and use retpoline everywhere else. I'll
> give you "garbage patches", but they weren't being "just mindlessly
> sent around". If we're going to drop IBRS support and accept the
> caveats, then let's do it as a conscious decision having seen what it
> would look like, not just drop it quietly because poor Davey is too
> scared that Linus might shout at him again. :)
> 
> I have seen *hand-wavy* analyses of the Skylake thing that mean I'm
> not
> actually lying awake at night fretting about it, but nothing concrete
> that really says it's OK.
> 
> If you view retpoline as a performance optimisation, which is how it
> first arrived, then it's rather unconventional to say "well, it only
> opens a *little* bit of a security hole but it does go nice and fast
> so
> let's do it".
> 
> But fine, I'm content with ditching the 

Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-23 Thread David Woodhouse
On Tue, 2018-01-23 at 11:44 +0100, Ingo Molnar wrote:
> * David Woodhouse  wrote:
> > Hm? We still have GCC emitting 'call __fentry__' don't we? Would be nice to 
> > get 
> > to the point where we can patch *that* out into a NOP... or are you saying 
> > we 
> > already can?
> Yes, we already can and do patch the 'call __fentry__/ mcount' call site into 
> a 
> NOP today - all 50,000+ call sites on a typical distro kernel.
> 
> We did so for a long time - this is all a well established, working mechanism.

That's neat; I'd missed that.

> > But this is a digression. I was being pedantic about the "0 cycles" but 
> > sure, 
> > this would be perfectly tolerable.
> It's not a digression in two ways:
> 
> - I wanted to make it clear that for distro kernels it _is_ a zero cycles 
> overhead
>   mechanism for non-SkyLake CPUs, literally.
> 
> - I noticed that Meltdown and the CR3 writes for PTI appears to have 
> established a
>   kind of ... insensitivity and numbness to kernel micro-costs, which peaked 
> with
>   the per-syscall MSR write nonsense patch of the SkyLake workaround.
>   That attitude is totally unacceptable to me as x86 maintainer and yes, still
>   every cycle counts.

Yeah, absolutely. But here we're talking about the overhead on non-SKL, 
and on non-SKL the IBRS overhead is zero too (well, again not precisely
zero because it turns into NOPs).

You're absolutely right that we shouldn't stop counting cycles.

I've already noted that on SKL IBRS is actually a lot faster than on
earlier generations, and we also get back some of the overhead by
turning the retpoline into a bare jmp again. We haven't *forgotten*
about performance.

I'd like to see your solution once the details are sorted out, and see
proper benchmarks — both microbenchmarks and real workloads — comparing
the two. And then make a reasoned decision based on that, and on how
happy we are with the theoretical holes that your solution leaves, in
the cold light of day.

We should also look at whether we want to set STIBP too, which is
somewhat orthogonal to using IBRS to protect the kernel, and could end
up with some of the same MSR writes (at least setting to zero) on some
of the same code paths.

smime.p7s
Description: S/MIME cryptographic signature


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-23 Thread David Woodhouse
On Tue, 2018-01-23 at 11:44 +0100, Ingo Molnar wrote:
> * David Woodhouse  wrote:
> > Hm? We still have GCC emitting 'call __fentry__' don't we? Would be nice to 
> > get 
> > to the point where we can patch *that* out into a NOP... or are you saying 
> > we 
> > already can?
> Yes, we already can and do patch the 'call __fentry__/ mcount' call site into 
> a 
> NOP today - all 50,000+ call sites on a typical distro kernel.
> 
> We did so for a long time - this is all a well established, working mechanism.

That's neat; I'd missed that.

> > But this is a digression. I was being pedantic about the "0 cycles" but 
> > sure, 
> > this would be perfectly tolerable.
> It's not a digression in two ways:
> 
> - I wanted to make it clear that for distro kernels it _is_ a zero cycles 
> overhead
>   mechanism for non-SkyLake CPUs, literally.
> 
> - I noticed that Meltdown and the CR3 writes for PTI appears to have 
> established a
>   kind of ... insensitivity and numbness to kernel micro-costs, which peaked 
> with
>   the per-syscall MSR write nonsense patch of the SkyLake workaround.
>   That attitude is totally unacceptable to me as x86 maintainer and yes, still
>   every cycle counts.

Yeah, absolutely. But here we're talking about the overhead on non-SKL, 
and on non-SKL the IBRS overhead is zero too (well, again not precisely
zero because it turns into NOPs).

You're absolutely right that we shouldn't stop counting cycles.

I've already noted that on SKL IBRS is actually a lot faster than on
earlier generations, and we also get back some of the overhead by
turning the retpoline into a bare jmp again. We haven't *forgotten*
about performance.

I'd like to see your solution once the details are sorted out, and see
proper benchmarks — both microbenchmarks and real workloads — comparing
the two. And then make a reasoned decision based on that, and on how
happy we are with the theoretical holes that your solution leaves, in
the cold light of day.

We should also look at whether we want to set STIBP too, which is
somewhat orthogonal to using IBRS to protect the kernel, and could end
up with some of the same MSR writes (at least setting to zero) on some
of the same code paths.

smime.p7s
Description: S/MIME cryptographic signature


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-23 Thread Ingo Molnar

* David Woodhouse  wrote:

> On Tue, 2018-01-23 at 11:15 +0100, Ingo Molnar wrote:
> > 
> > BTW., the reason this is enabled on all distro kernels is because the 
> > overhead 
> > is  a single patched-in NOP instruction in the function epilogue, when 
> > tracing 
> > is  disabled. So it's not even a CALL+RET - it's a patched in NOP.
> 
> Hm? We still have GCC emitting 'call __fentry__' don't we? Would be nice to 
> get 
> to the point where we can patch *that* out into a NOP... or are you saying we 
> already can?

Yes, we already can and do patch the 'call __fentry__/ mcount' call site into a 
NOP today - all 50,000+ call sites on a typical distro kernel.

We did so for a long time - this is all a well established, working mechanism.

> But this is a digression. I was being pedantic about the "0 cycles" but sure, 
> this would be perfectly tolerable.

It's not a digression in two ways:

- I wanted to make it clear that for distro kernels it _is_ a zero cycles 
overhead
  mechanism for non-SkyLake CPUs, literally.

- I noticed that Meltdown and the CR3 writes for PTI appears to have 
established a
  kind of ... insensitivity and numbness to kernel micro-costs, which peaked 
with
  the per-syscall MSR write nonsense patch of the SkyLake workaround.
  That attitude is totally unacceptable to me as x86 maintainer and yes, still
  every cycle counts.

Thanks,

Ingo


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-23 Thread Ingo Molnar

* David Woodhouse  wrote:

> On Tue, 2018-01-23 at 11:15 +0100, Ingo Molnar wrote:
> > 
> > BTW., the reason this is enabled on all distro kernels is because the 
> > overhead 
> > is  a single patched-in NOP instruction in the function epilogue, when 
> > tracing 
> > is  disabled. So it's not even a CALL+RET - it's a patched in NOP.
> 
> Hm? We still have GCC emitting 'call __fentry__' don't we? Would be nice to 
> get 
> to the point where we can patch *that* out into a NOP... or are you saying we 
> already can?

Yes, we already can and do patch the 'call __fentry__/ mcount' call site into a 
NOP today - all 50,000+ call sites on a typical distro kernel.

We did so for a long time - this is all a well established, working mechanism.

> But this is a digression. I was being pedantic about the "0 cycles" but sure, 
> this would be perfectly tolerable.

It's not a digression in two ways:

- I wanted to make it clear that for distro kernels it _is_ a zero cycles 
overhead
  mechanism for non-SkyLake CPUs, literally.

- I noticed that Meltdown and the CR3 writes for PTI appears to have 
established a
  kind of ... insensitivity and numbness to kernel micro-costs, which peaked 
with
  the per-syscall MSR write nonsense patch of the SkyLake workaround.
  That attitude is totally unacceptable to me as x86 maintainer and yes, still
  every cycle counts.

Thanks,

Ingo


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-23 Thread David Woodhouse
On Tue, 2018-01-23 at 11:23 +0100, Ingo Molnar wrote:
> * David Woodhouse  wrote:
> 
> > 
> > > 
> > > On SkyLake this would add an overhead of maybe 2-3 cycles per function 
> > > call and 
> > > obviously all this code and data would be very cache hot. Given that the 
> > > average 
> > > number of function calls per system call is around a dozen, this would be 
> > > _much_ 
> > > faster than any microcode/MSR based approach.
> > That's kind of neat, except you don't want it at the top of the
> > function; you want it at the bottom.
> > 
> > If you could hijack the *return* site, then you could check for
> > underflow and stuff the RSB right there. But in __fentry__ there's not
> > a lot you can do other than complain that something bad is going to
> > happen in the future. You know that a string of 16+ rets is going to
> > happen, but you've got no gadget in *there* to deal with it when it
> > does.
>
> No, it can be done with the existing CALL instrumentation callback that 
> CONFIG_DYNAMIC_FTRACE=y provides, by pushing a RET trampoline on the stack 
> from 
> the CALL trampoline - see my previous email.

Yes, that's a neat solution.

> > 
> > HJ did have patches to turn 'ret' into a form of retpoline, which I
> > don't think ever even got performance-tested.
> Return instrumentation is possible as well, but there are two major drawbacks:
> 
>  - GCC support for it is not as widely available and return instrumentation 
> is 
>    less tested in Linux kernel contexts

Hey, we're *already* making people upgrade their compiler, and HJ
apparently never sleeps. So don't actually be held back too much by
that consideration. If it could be better done with GCC help, we really
*can* explore that.

>  - a major point of my suggestion is that CONFIG_DYNAMIC_FTRACE=y is already 
>    enabled in distros here and today, so the runtime overhead to non-SkyLake 
> CPUs 
>    would be literally zero, while still allowing to fix the RSB vulnerability 
> on 
>    SkyLake.

Sure. You still have a few holes to fix (or declare acceptable) to
bring it to the full coverage of the IBRS solution, and it's still
possible that by the time it's complete it's approaching the ick factor
of IBRS, but I'd love to see it.

smime.p7s
Description: S/MIME cryptographic signature


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-23 Thread David Woodhouse
On Tue, 2018-01-23 at 11:23 +0100, Ingo Molnar wrote:
> * David Woodhouse  wrote:
> 
> > 
> > > 
> > > On SkyLake this would add an overhead of maybe 2-3 cycles per function 
> > > call and 
> > > obviously all this code and data would be very cache hot. Given that the 
> > > average 
> > > number of function calls per system call is around a dozen, this would be 
> > > _much_ 
> > > faster than any microcode/MSR based approach.
> > That's kind of neat, except you don't want it at the top of the
> > function; you want it at the bottom.
> > 
> > If you could hijack the *return* site, then you could check for
> > underflow and stuff the RSB right there. But in __fentry__ there's not
> > a lot you can do other than complain that something bad is going to
> > happen in the future. You know that a string of 16+ rets is going to
> > happen, but you've got no gadget in *there* to deal with it when it
> > does.
>
> No, it can be done with the existing CALL instrumentation callback that 
> CONFIG_DYNAMIC_FTRACE=y provides, by pushing a RET trampoline on the stack 
> from 
> the CALL trampoline - see my previous email.

Yes, that's a neat solution.

> > 
> > HJ did have patches to turn 'ret' into a form of retpoline, which I
> > don't think ever even got performance-tested.
> Return instrumentation is possible as well, but there are two major drawbacks:
> 
>  - GCC support for it is not as widely available and return instrumentation 
> is 
>    less tested in Linux kernel contexts

Hey, we're *already* making people upgrade their compiler, and HJ
apparently never sleeps. So don't actually be held back too much by
that consideration. If it could be better done with GCC help, we really
*can* explore that.

>  - a major point of my suggestion is that CONFIG_DYNAMIC_FTRACE=y is already 
>    enabled in distros here and today, so the runtime overhead to non-SkyLake 
> CPUs 
>    would be literally zero, while still allowing to fix the RSB vulnerability 
> on 
>    SkyLake.

Sure. You still have a few holes to fix (or declare acceptable) to
bring it to the full coverage of the IBRS solution, and it's still
possible that by the time it's complete it's approaching the ick factor
of IBRS, but I'd love to see it.

smime.p7s
Description: S/MIME cryptographic signature


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-23 Thread David Woodhouse
On Tue, 2018-01-23 at 11:15 +0100, Ingo Molnar wrote:
> 
> BTW., the reason this is enabled on all distro kernels is because the 
> overhead is 
> a single patched-in NOP instruction in the function epilogue, when tracing is 
> disabled. So it's not even a CALL+RET - it's a patched in NOP.

Hm? We still have GCC emitting 'call __fentry__' don't we? Would be
nice to get to the point where we can patch *that* out into a NOP... or
are you saying we already can?

But this is a digression. I was being pedantic about the "0 cycles" but
sure, this would be perfectly tolerable.

smime.p7s
Description: S/MIME cryptographic signature


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-23 Thread David Woodhouse
On Tue, 2018-01-23 at 11:15 +0100, Ingo Molnar wrote:
> 
> BTW., the reason this is enabled on all distro kernels is because the 
> overhead is 
> a single patched-in NOP instruction in the function epilogue, when tracing is 
> disabled. So it's not even a CALL+RET - it's a patched in NOP.

Hm? We still have GCC emitting 'call __fentry__' don't we? Would be
nice to get to the point where we can patch *that* out into a NOP... or
are you saying we already can?

But this is a digression. I was being pedantic about the "0 cycles" but
sure, this would be perfectly tolerable.

smime.p7s
Description: S/MIME cryptographic signature


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-23 Thread Ingo Molnar

* David Woodhouse  wrote:

> > On SkyLake this would add an overhead of maybe 2-3 cycles per function call 
> > and 
> > obviously all this code and data would be very cache hot. Given that the 
> > average 
> > number of function calls per system call is around a dozen, this would be 
> > _much_ 
> > faster than any microcode/MSR based approach.
> 
> That's kind of neat, except you don't want it at the top of the
> function; you want it at the bottom.
> 
> If you could hijack the *return* site, then you could check for
> underflow and stuff the RSB right there. But in __fentry__ there's not
> a lot you can do other than complain that something bad is going to
> happen in the future. You know that a string of 16+ rets is going to
> happen, but you've got no gadget in *there* to deal with it when it
> does.

No, it can be done with the existing CALL instrumentation callback that 
CONFIG_DYNAMIC_FTRACE=y provides, by pushing a RET trampoline on the stack from 
the CALL trampoline - see my previous email.

> HJ did have patches to turn 'ret' into a form of retpoline, which I
> don't think ever even got performance-tested.

Return instrumentation is possible as well, but there are two major drawbacks:

 - GCC support for it is not as widely available and return instrumentation is 
   less tested in Linux kernel contexts

 - a major point of my suggestion is that CONFIG_DYNAMIC_FTRACE=y is already 
   enabled in distros here and today, so the runtime overhead to non-SkyLake 
CPUs 
   would be literally zero, while still allowing to fix the RSB vulnerability 
on 
   SkyLake.

Thanks,

Ingo


Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-23 Thread Ingo Molnar

* David Woodhouse  wrote:

> > On SkyLake this would add an overhead of maybe 2-3 cycles per function call 
> > and 
> > obviously all this code and data would be very cache hot. Given that the 
> > average 
> > number of function calls per system call is around a dozen, this would be 
> > _much_ 
> > faster than any microcode/MSR based approach.
> 
> That's kind of neat, except you don't want it at the top of the
> function; you want it at the bottom.
> 
> If you could hijack the *return* site, then you could check for
> underflow and stuff the RSB right there. But in __fentry__ there's not
> a lot you can do other than complain that something bad is going to
> happen in the future. You know that a string of 16+ rets is going to
> happen, but you've got no gadget in *there* to deal with it when it
> does.

No, it can be done with the existing CALL instrumentation callback that 
CONFIG_DYNAMIC_FTRACE=y provides, by pushing a RET trampoline on the stack from 
the CALL trampoline - see my previous email.

> HJ did have patches to turn 'ret' into a form of retpoline, which I
> don't think ever even got performance-tested.

Return instrumentation is possible as well, but there are two major drawbacks:

 - GCC support for it is not as widely available and return instrumentation is 
   less tested in Linux kernel contexts

 - a major point of my suggestion is that CONFIG_DYNAMIC_FTRACE=y is already 
   enabled in distros here and today, so the runtime overhead to non-SkyLake 
CPUs 
   would be literally zero, while still allowing to fix the RSB vulnerability 
on 
   SkyLake.

Thanks,

Ingo


  1   2   >