Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On Sun, 2018-02-04 at 19:43 +0100, Thomas Gleixner wrote: > Yet another possibility is to avoid the function entry and accouting magic > and use the generic gcc return thunk: > > __x86_return_thunk: > call L2 > L1: > pause > lfence > jmp L1 > L2: > lea 8(%rsp), %rsp|lea 4(%esp), %esp > ret > > which basically refills the RSB on every return. That can be inline or > extern, but in both cases we should be able to patch it out. > > I have no idea how that affects performance, but it might be worthwhile to > experiment with that. That was what I had in mind when I asked HJ to add -mfunction-return. I suspect the performance hit would be significant because it would cause a prediction miss on *every* return. But as I said, let's implement what we can without IBRS for Skylake, then we can compare the two options for performance, security coverage and general fugliness. smime.p7s Description: S/MIME cryptographic signature
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On Sun, 2018-02-04 at 19:43 +0100, Thomas Gleixner wrote: > Yet another possibility is to avoid the function entry and accouting magic > and use the generic gcc return thunk: > > __x86_return_thunk: > call L2 > L1: > pause > lfence > jmp L1 > L2: > lea 8(%rsp), %rsp|lea 4(%esp), %esp > ret > > which basically refills the RSB on every return. That can be inline or > extern, but in both cases we should be able to patch it out. > > I have no idea how that affects performance, but it might be worthwhile to > experiment with that. That was what I had in mind when I asked HJ to add -mfunction-return. I suspect the performance hit would be significant because it would cause a prediction miss on *every* return. But as I said, let's implement what we can without IBRS for Skylake, then we can compare the two options for performance, security coverage and general fugliness. smime.p7s Description: S/MIME cryptographic signature
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On Sun, 2018-02-04 at 19:43 +0100, Thomas Gleixner wrote: > > __x86_return_thunk would look like this: > > __x86_return_thunk: > testl $0xf, PER_CPU_VAR(call_depth) > jnz 1f > stuff_rsb > 1: > declPER_CPU_VAR(call_depth) > ret > > The call_depth variable would be reset on context switch. Note that the 'jnz' can be predicted taken there, allowing the CPU to speculate all the way to the 'ret'... and beyond. smime.p7s Description: S/MIME cryptographic signature
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On Sun, 2018-02-04 at 19:43 +0100, Thomas Gleixner wrote: > > __x86_return_thunk would look like this: > > __x86_return_thunk: > testl $0xf, PER_CPU_VAR(call_depth) > jnz 1f > stuff_rsb > 1: > declPER_CPU_VAR(call_depth) > ret > > The call_depth variable would be reset on context switch. Note that the 'jnz' can be predicted taken there, allowing the CPU to speculate all the way to the 'ret'... and beyond. smime.p7s Description: S/MIME cryptographic signature
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On Tue, 23 Jan 2018, Ingo Molnar wrote: > * David Woodhousewrote: > > > > On SkyLake this would add an overhead of maybe 2-3 cycles per function > > > call and > > > obviously all this code and data would be very cache hot. Given that the > > > average > > > number of function calls per system call is around a dozen, this would be > > > _much_ > > > faster than any microcode/MSR based approach. > > > > That's kind of neat, except you don't want it at the top of the > > function; you want it at the bottom. > > > > If you could hijack the *return* site, then you could check for > > underflow and stuff the RSB right there. But in __fentry__ there's not > > a lot you can do other than complain that something bad is going to > > happen in the future. You know that a string of 16+ rets is going to > > happen, but you've got no gadget in *there* to deal with it when it > > does. > > No, it can be done with the existing CALL instrumentation callback that > CONFIG_DYNAMIC_FTRACE=y provides, by pushing a RET trampoline on the stack > from > the CALL trampoline - see my previous email. > > > HJ did have patches to turn 'ret' into a form of retpoline, which I > > don't think ever even got performance-tested. > > Return instrumentation is possible as well, but there are two major drawbacks: > > - GCC support for it is not as widely available and return instrumentation > is >less tested in Linux kernel contexts > > - a major point of my suggestion is that CONFIG_DYNAMIC_FTRACE=y is already >enabled in distros here and today, so the runtime overhead to non-SkyLake > CPUs >would be literally zero, while still allowing to fix the RSB vulnerability > on >SkyLake. I played around with that a bit during the week and it turns out to be less simple than you thought. 1) Injecting a trampoline return only works for functions which have all arguments in registers. For functions with arguments on stack like all varg functions this breaks because the function wont find its arguments anymore. I have not yet found a way to figure out reliably which functions have arguments on stack. That might be an option to simply ignore them. The workaround is to replace the original return on stack with the trampoline and store the original return in a per thread stack, which I implemented. But this sucks performance wise badly. 2) Doing the whole dance on function entry has a real down side because you refill RSB on every 15th return no matter whether its required or not. That really gives a very prominent performance hit. An alternative idea is to do the following (not yet implemented): __fentry__: inclPER_CPU_VAR(call_depth) retq and use -mfunction-return=thunk-extern which is available on retpoline enabled compilers. That's a reasonable requirement because w/o retpoline the whole SKL magic is pointless anyway. -mfunction-return=thunk-extern issues jump__x86_return_thunk instead of ret. In the thunk we can do the whole shebang of mitigation. That jump can be identified at build time and it can be patched into a ret for unaffected CPUs. Ideally we do the patching at build time and only patch the jump in when SKL is detected or paranoia requests it. We could actually look into that for tracing as well. The only reason why we don't do that is to select the ideal nop for the CPU the kernel runs on, which obviously cannot be known at build time. __x86_return_thunk would look like this: __x86_return_thunk: testl $0xf, PER_CPU_VAR(call_depth) jnz 1f stuff_rsb 1: declPER_CPU_VAR(call_depth) ret The call_depth variable would be reset on context switch. Though that has another problem: tail calls. Tail calls will invoke the __fentry__ call of the tail called function, which makes the call_depth counter unbalanced. Tail calls can be prevented by using -fno-optimize-sibling-calls, but that probably sucks as well. Yet another possibility is to avoid the function entry and accouting magic and use the generic gcc return thunk: __x86_return_thunk: call L2 L1: pause lfence jmp L1 L2: lea 8(%rsp), %rsp|lea 4(%esp), %esp ret which basically refills the RSB on every return. That can be inline or extern, but in both cases we should be able to patch it out. I have no idea how that affects performance, but it might be worthwhile to experiment with that. If nobody beats me to it, I'll play around with that some more after vacation. Thanks, tglx
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On Tue, 23 Jan 2018, Ingo Molnar wrote: > * David Woodhouse wrote: > > > > On SkyLake this would add an overhead of maybe 2-3 cycles per function > > > call and > > > obviously all this code and data would be very cache hot. Given that the > > > average > > > number of function calls per system call is around a dozen, this would be > > > _much_ > > > faster than any microcode/MSR based approach. > > > > That's kind of neat, except you don't want it at the top of the > > function; you want it at the bottom. > > > > If you could hijack the *return* site, then you could check for > > underflow and stuff the RSB right there. But in __fentry__ there's not > > a lot you can do other than complain that something bad is going to > > happen in the future. You know that a string of 16+ rets is going to > > happen, but you've got no gadget in *there* to deal with it when it > > does. > > No, it can be done with the existing CALL instrumentation callback that > CONFIG_DYNAMIC_FTRACE=y provides, by pushing a RET trampoline on the stack > from > the CALL trampoline - see my previous email. > > > HJ did have patches to turn 'ret' into a form of retpoline, which I > > don't think ever even got performance-tested. > > Return instrumentation is possible as well, but there are two major drawbacks: > > - GCC support for it is not as widely available and return instrumentation > is >less tested in Linux kernel contexts > > - a major point of my suggestion is that CONFIG_DYNAMIC_FTRACE=y is already >enabled in distros here and today, so the runtime overhead to non-SkyLake > CPUs >would be literally zero, while still allowing to fix the RSB vulnerability > on >SkyLake. I played around with that a bit during the week and it turns out to be less simple than you thought. 1) Injecting a trampoline return only works for functions which have all arguments in registers. For functions with arguments on stack like all varg functions this breaks because the function wont find its arguments anymore. I have not yet found a way to figure out reliably which functions have arguments on stack. That might be an option to simply ignore them. The workaround is to replace the original return on stack with the trampoline and store the original return in a per thread stack, which I implemented. But this sucks performance wise badly. 2) Doing the whole dance on function entry has a real down side because you refill RSB on every 15th return no matter whether its required or not. That really gives a very prominent performance hit. An alternative idea is to do the following (not yet implemented): __fentry__: inclPER_CPU_VAR(call_depth) retq and use -mfunction-return=thunk-extern which is available on retpoline enabled compilers. That's a reasonable requirement because w/o retpoline the whole SKL magic is pointless anyway. -mfunction-return=thunk-extern issues jump__x86_return_thunk instead of ret. In the thunk we can do the whole shebang of mitigation. That jump can be identified at build time and it can be patched into a ret for unaffected CPUs. Ideally we do the patching at build time and only patch the jump in when SKL is detected or paranoia requests it. We could actually look into that for tracing as well. The only reason why we don't do that is to select the ideal nop for the CPU the kernel runs on, which obviously cannot be known at build time. __x86_return_thunk would look like this: __x86_return_thunk: testl $0xf, PER_CPU_VAR(call_depth) jnz 1f stuff_rsb 1: declPER_CPU_VAR(call_depth) ret The call_depth variable would be reset on context switch. Though that has another problem: tail calls. Tail calls will invoke the __fentry__ call of the tail called function, which makes the call_depth counter unbalanced. Tail calls can be prevented by using -fno-optimize-sibling-calls, but that probably sucks as well. Yet another possibility is to avoid the function entry and accouting magic and use the generic gcc return thunk: __x86_return_thunk: call L2 L1: pause lfence jmp L1 L2: lea 8(%rsp), %rsp|lea 4(%esp), %esp ret which basically refills the RSB on every return. That can be inline or extern, but in both cases we should be able to patch it out. I have no idea how that affects performance, but it might be worthwhile to experiment with that. If nobody beats me to it, I'll play around with that some more after vacation. Thanks, tglx
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
[ Dropping large CC list ] On 25/01/2018 18:16, Greg Kroah-Hartman wrote: > On Thu, Jan 25, 2018 at 05:19:04PM +0100, Mason wrote: > >> On 23/01/2018 10:30, David Woodhouse wrote: >> >>> Skylake takes predictions from the generic branch target buffer when >>> the RSB underflows. >> >> Adding LAKML. >> >> AFAIU, some ARM Cortex cores have the same optimization. >> (A9 maybe, A17 probably, some recent 64-bit cores) >> >> Are there software work-arounds for Spectre planned for arm32 and arm64? > > Yes, I think they are currently buried in one of the arm64 trees, and > they have been posted to the mailing list a few times in the past. Found the burial ground, thanks Greg. https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git/log/?h=kpti Via https://developer.arm.com/support/security-update "For Cortex-R8, Cortex-A8, Cortex-A9, and Cortex-A17, invalidate the branch predictor using a BPIALL instruction." The latest arm32 patch series was submitted recently: https://www.spinics.net/lists/arm-kernel/msg630892.html Regards.
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
[ Dropping large CC list ] On 25/01/2018 18:16, Greg Kroah-Hartman wrote: > On Thu, Jan 25, 2018 at 05:19:04PM +0100, Mason wrote: > >> On 23/01/2018 10:30, David Woodhouse wrote: >> >>> Skylake takes predictions from the generic branch target buffer when >>> the RSB underflows. >> >> Adding LAKML. >> >> AFAIU, some ARM Cortex cores have the same optimization. >> (A9 maybe, A17 probably, some recent 64-bit cores) >> >> Are there software work-arounds for Spectre planned for arm32 and arm64? > > Yes, I think they are currently buried in one of the arm64 trees, and > they have been posted to the mailing list a few times in the past. Found the burial ground, thanks Greg. https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git/log/?h=kpti Via https://developer.arm.com/support/security-update "For Cortex-R8, Cortex-A8, Cortex-A9, and Cortex-A17, invalidate the branch predictor using a BPIALL instruction." The latest arm32 patch series was submitted recently: https://www.spinics.net/lists/arm-kernel/msg630892.html Regards.
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On 01/27/2018 05:42 AM, Konrad Rzeszutek Wilk wrote: > On Fri, Jan 26, 2018 at 07:11:47PM +, Hansen, Dave wrote: >> The need for RSB stuffing in all the various scenarios and what the heck it >> actually mitigates is freakishly complicated. I've tried to write it all >> down in one place: https://goo.gl/pXbvBE > Thank you for sharing that. > > One question on the third from the top (' RSB Stuff (16) After > irq/nmi/#PF/...'). > > It says that :"Return from interrupt path (more than 16 deep) can empty > RSB". > > Just to clarify - you mean all the returns ('ret') that are happening after > we call do_IRQ and the stack unwinds - but before we do an 'iret' correct? Correct. The RSB is not used or updated by iret.
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On 01/27/2018 05:42 AM, Konrad Rzeszutek Wilk wrote: > On Fri, Jan 26, 2018 at 07:11:47PM +, Hansen, Dave wrote: >> The need for RSB stuffing in all the various scenarios and what the heck it >> actually mitigates is freakishly complicated. I've tried to write it all >> down in one place: https://goo.gl/pXbvBE > Thank you for sharing that. > > One question on the third from the top (' RSB Stuff (16) After > irq/nmi/#PF/...'). > > It says that :"Return from interrupt path (more than 16 deep) can empty > RSB". > > Just to clarify - you mean all the returns ('ret') that are happening after > we call do_IRQ and the stack unwinds - but before we do an 'iret' correct? Correct. The RSB is not used or updated by iret.
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On Fri, Jan 26, 2018 at 07:11:47PM +, Hansen, Dave wrote: > The need for RSB stuffing in all the various scenarios and what the heck it > actually mitigates is freakishly complicated. I've tried to write it all > down in one place: https://goo.gl/pXbvBE Thank you for sharing that. One question on the third from the top (' RSB Stuff (16) After irq/nmi/#PF/...'). It says that :"Return from interrupt path (more than 16 deep) can empty RSB". Just to clarify - you mean all the returns ('ret') that are happening after we call do_IRQ and the stack unwinds - but before we do an 'iret' correct? I am 99% sure that is what you mean, but just confirming as one could read this as: 'Need to do RSB after an iret' (say you are in the kernel and then get an interrupt and iret back to kernel).
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On Fri, Jan 26, 2018 at 07:11:47PM +, Hansen, Dave wrote: > The need for RSB stuffing in all the various scenarios and what the heck it > actually mitigates is freakishly complicated. I've tried to write it all > down in one place: https://goo.gl/pXbvBE Thank you for sharing that. One question on the third from the top (' RSB Stuff (16) After irq/nmi/#PF/...'). It says that :"Return from interrupt path (more than 16 deep) can empty RSB". Just to clarify - you mean all the returns ('ret') that are happening after we call do_IRQ and the stack unwinds - but before we do an 'iret' correct? I am 99% sure that is what you mean, but just confirming as one could read this as: 'Need to do RSB after an iret' (say you are in the kernel and then get an interrupt and iret back to kernel).
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On Fri, Jan 26, 2018 at 09:19:09AM -0800, Linus Torvalds wrote: > But did we do that "disable stuffing with SMEP"? I'm not seeing it. In > my tree, it's only conditional on X86_FEATURE_RETPOLINE. Or rather, enable stuffing on !SMEP: + if ((!boot_cpu_has(X86_FEATURE_PTI) && +!boot_cpu_has(X86_FEATURE_SMEP)) || is_skylake_era()) { + setup_force_cpu_cap(X86_FEATURE_RSB_CTXSW); + pr_info("Filling RSB on context switch\n"); + } Should be c995efd5a740 ("x86/retpoline: Fill RSB on context switch for affected CPUs") in your tree. -- Regards/Gruss, Boris. SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg) --
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On Fri, Jan 26, 2018 at 09:19:09AM -0800, Linus Torvalds wrote: > But did we do that "disable stuffing with SMEP"? I'm not seeing it. In > my tree, it's only conditional on X86_FEATURE_RETPOLINE. Or rather, enable stuffing on !SMEP: + if ((!boot_cpu_has(X86_FEATURE_PTI) && +!boot_cpu_has(X86_FEATURE_SMEP)) || is_skylake_era()) { + setup_force_cpu_cap(X86_FEATURE_RSB_CTXSW); + pr_info("Filling RSB on context switch\n"); + } Should be c995efd5a740 ("x86/retpoline: Fill RSB on context switch for affected CPUs") in your tree. -- Regards/Gruss, Boris. SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg) --
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On Fri, 2018-01-26 at 14:02 -0500, Konrad Rzeszutek Wilk wrote: > > -ECONFUSED, see ==> > > Is this incorrect then? > I see: > > 241 * Skylake era CPUs have a separate issue with *underflow* of the > > 242 * RSB, when they will predict 'ret' targets from the generic > BTB. > 243 * The proper mitigation for this is IBRS. If IBRS is not > supported > 244 * or deactivated in favour of retpolines the RSB fill on context > > 245 * switch is required. > > 246 */ No, that's correct (well, except that it's kind of written for a world where Linus is going to let IBRS anywhere near his kernel, and could survive being rephrased a little :) The RSB-stuffing on context switch (or kernel entry) is one of a *litany* of additional hacks we need on Skylake to make retpolines safe. We were adding the RSB-stuffing in this case *anyway* for !SMEP, so it was trivial enough to add in the (|| Skylake) condition while we were at it. smime.p7s Description: S/MIME cryptographic signature
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On Fri, 2018-01-26 at 14:02 -0500, Konrad Rzeszutek Wilk wrote: > > -ECONFUSED, see ==> > > Is this incorrect then? > I see: > > 241 * Skylake era CPUs have a separate issue with *underflow* of the > > 242 * RSB, when they will predict 'ret' targets from the generic > BTB. > 243 * The proper mitigation for this is IBRS. If IBRS is not > supported > 244 * or deactivated in favour of retpolines the RSB fill on context > > 245 * switch is required. > > 246 */ No, that's correct (well, except that it's kind of written for a world where Linus is going to let IBRS anywhere near his kernel, and could survive being rephrased a little :) The RSB-stuffing on context switch (or kernel entry) is one of a *litany* of additional hacks we need on Skylake to make retpolines safe. We were adding the RSB-stuffing in this case *anyway* for !SMEP, so it was trivial enough to add in the (|| Skylake) condition while we were at it. smime.p7s Description: S/MIME cryptographic signature
RE: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
The need for RSB stuffing in all the various scenarios and what the heck it actually mitigates is freakishly complicated. I've tried to write it all down in one place: https://goo.gl/pXbvBE
RE: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
The need for RSB stuffing in all the various scenarios and what the heck it actually mitigates is freakishly complicated. I've tried to write it all down in one place: https://goo.gl/pXbvBE
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On Fri, Jan 26, 2018 at 09:59:01AM -0800, Andi Kleen wrote: > On Fri, Jan 26, 2018 at 09:19:09AM -0800, Linus Torvalds wrote: > > On Fri, Jan 26, 2018 at 1:11 AM, David Woodhouse> > wrote: > > > > > > Do we need to look again at the fact that we've disabled the RSB- > > > stuffing for SMEP? > > > > Absolutely. SMEP helps make people a lot less worried about things, > > but it doesn't fix the "BTB only contains partial addresses" case. > > > > But did we do that "disable stuffing with SMEP"? I'm not seeing it. In > > my tree, it's only conditional on X86_FEATURE_RETPOLINE. > > For Skylake we need RSB stuffing even with SMEP to avoid falling back to the > BTB on underflow. > > It's also always needed with virtualization. -ECONFUSED, see ==> Is this incorrect then? I see: 241 * Skylake era CPUs have a separate issue with *underflow* of the 242 * RSB, when they will predict 'ret' targets from the generic BTB. 243 * The proper mitigation for this is IBRS. If IBRS is not supported 244 * or deactivated in favour of retpolines the RSB fill on context 245 * switch is required. 246 */ which came from this: commit c995efd5a740d9cbafbf58bde4973e8b50b4d761 Author: David Woodhouse Date: Fri Jan 12 17:49:25 2018 + x86/retpoline: Fill RSB on context switch for affected CPUs On context switch from a shallow call stack to a deeper one, as the CPU does 'ret' up the deeper side it may encounter RSB entries (predictions for where the 'ret' goes to) which were populated in userspace. This is problematic if neither SMEP nor KPTI (the latter of which marks userspace pages as NX for the kernel) are active, as malicious code in userspace may then be executed speculatively. Overwrite the CPU's return prediction stack with calls which are predicted to return to an infinite loop, to "capture" speculation if this happens. This is required both for retpoline, and also in conjunction with IBRS for !SMEP && !KPTI. On Skylake+ the problem is slightly different, and an *underflow* of the RSB may cause errant branch predictions to occur. So there it's not so much overwrite, as *filling* the RSB to attempt to prevent it getting empty. This is only a partial solution for Skylake+ since there are many ==>other conditions which may result in the RSB becoming empty. The full <== ==>solution on Skylake+ is to use IBRS, which will prevent the problem even <== when the RSB becomes empty. With IBRS, the RSB-stuffing will not be required on context switch. Signed-off-by: David Woodhouse Signed-off-by: Thomas Gleixner Acked-by: Arjan van de Ven The "full solution" is what is making me confused. > > -Andi
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On Fri, Jan 26, 2018 at 09:59:01AM -0800, Andi Kleen wrote: > On Fri, Jan 26, 2018 at 09:19:09AM -0800, Linus Torvalds wrote: > > On Fri, Jan 26, 2018 at 1:11 AM, David Woodhouse > > wrote: > > > > > > Do we need to look again at the fact that we've disabled the RSB- > > > stuffing for SMEP? > > > > Absolutely. SMEP helps make people a lot less worried about things, > > but it doesn't fix the "BTB only contains partial addresses" case. > > > > But did we do that "disable stuffing with SMEP"? I'm not seeing it. In > > my tree, it's only conditional on X86_FEATURE_RETPOLINE. > > For Skylake we need RSB stuffing even with SMEP to avoid falling back to the > BTB on underflow. > > It's also always needed with virtualization. -ECONFUSED, see ==> Is this incorrect then? I see: 241 * Skylake era CPUs have a separate issue with *underflow* of the 242 * RSB, when they will predict 'ret' targets from the generic BTB. 243 * The proper mitigation for this is IBRS. If IBRS is not supported 244 * or deactivated in favour of retpolines the RSB fill on context 245 * switch is required. 246 */ which came from this: commit c995efd5a740d9cbafbf58bde4973e8b50b4d761 Author: David Woodhouse Date: Fri Jan 12 17:49:25 2018 + x86/retpoline: Fill RSB on context switch for affected CPUs On context switch from a shallow call stack to a deeper one, as the CPU does 'ret' up the deeper side it may encounter RSB entries (predictions for where the 'ret' goes to) which were populated in userspace. This is problematic if neither SMEP nor KPTI (the latter of which marks userspace pages as NX for the kernel) are active, as malicious code in userspace may then be executed speculatively. Overwrite the CPU's return prediction stack with calls which are predicted to return to an infinite loop, to "capture" speculation if this happens. This is required both for retpoline, and also in conjunction with IBRS for !SMEP && !KPTI. On Skylake+ the problem is slightly different, and an *underflow* of the RSB may cause errant branch predictions to occur. So there it's not so much overwrite, as *filling* the RSB to attempt to prevent it getting empty. This is only a partial solution for Skylake+ since there are many ==>other conditions which may result in the RSB becoming empty. The full <== ==>solution on Skylake+ is to use IBRS, which will prevent the problem even <== when the RSB becomes empty. With IBRS, the RSB-stuffing will not be required on context switch. Signed-off-by: David Woodhouse Signed-off-by: Thomas Gleixner Acked-by: Arjan van de Ven The "full solution" is what is making me confused. > > -Andi
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On Fri, 2018-01-26 at 18:44 +, Van De Ven, Arjan wrote: > your question was specific to RSB not BTB. But please show the empirical > evidence for RSB ? We were hypothesising, which should have been clear from: On Fri, 2018-01-26 at 09:11 +, David Woodhouse wrote: > Likewise if the RSB only stores the low 31 bits of the target, SMEP > isn't much help there either. > > Do we need to look again at the fact that we've disabled the RSB- > stuffing for SMEP? ... and later... On Fri, 2018-01-26 at 17:31 +, David Woodhouse wrote: > Note, we've switched from talking about BTB to RSB here, so this is a > valid concern if the *RSB* only has the low bits of the target. I'm glad to hear that it *isn't* a valid concern for the RSB and the code in Linus' tree is correct. Thank you for clearing that up. smime.p7s Description: S/MIME cryptographic signature
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On Fri, 2018-01-26 at 18:44 +, Van De Ven, Arjan wrote: > your question was specific to RSB not BTB. But please show the empirical > evidence for RSB ? We were hypothesising, which should have been clear from: On Fri, 2018-01-26 at 09:11 +, David Woodhouse wrote: > Likewise if the RSB only stores the low 31 bits of the target, SMEP > isn't much help there either. > > Do we need to look again at the fact that we've disabled the RSB- > stuffing for SMEP? ... and later... On Fri, 2018-01-26 at 17:31 +, David Woodhouse wrote: > Note, we've switched from talking about BTB to RSB here, so this is a > valid concern if the *RSB* only has the low bits of the target. I'm glad to hear that it *isn't* a valid concern for the RSB and the code in Linus' tree is correct. Thank you for clearing that up. smime.p7s Description: S/MIME cryptographic signature
RE: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
> > you asked before and even before you sent the email I confirmed to > > you that the document is correct > > > > I'm not sure what the point is to then question that again 15 minutes > > later other than creating more noise. > > Apologies, I hadn't seen the comment on IRC. > > Sometimes the docs *don't* get it right, especially when they're > released in a hurry as that one was. I note there's a *fourth* version > of microcode-update-guidance.pdf available now, for example :) > > So it is useful that you have explicitly stated that for *this* > specific concern, the document is in fact correct that SMEP saves us > from BTB and RSB pollution, *despite* the empirical evidence that those > structures only hold the low 31 bits. your question was specific to RSB not BTB. But please show the empirical evidence for RSB ?
RE: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
> > you asked before and even before you sent the email I confirmed to > > you that the document is correct > > > > I'm not sure what the point is to then question that again 15 minutes > > later other than creating more noise. > > Apologies, I hadn't seen the comment on IRC. > > Sometimes the docs *don't* get it right, especially when they're > released in a hurry as that one was. I note there's a *fourth* version > of microcode-update-guidance.pdf available now, for example :) > > So it is useful that you have explicitly stated that for *this* > specific concern, the document is in fact correct that SMEP saves us > from BTB and RSB pollution, *despite* the empirical evidence that those > structures only hold the low 31 bits. your question was specific to RSB not BTB. But please show the empirical evidence for RSB ?
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On Fri, 2018-01-26 at 18:28 +, Van De Ven, Arjan wrote: > > As you know well, I mean "we think Intel's document is not > > correct". > > you asked before and even before you sent the email I confirmed to > you that the document is correct > > I'm not sure what the point is to then question that again 15 minutes > later other than creating more noise. Apologies, I hadn't seen the comment on IRC. Sometimes the docs *don't* get it right, especially when they're released in a hurry as that one was. I note there's a *fourth* version of microcode-update-guidance.pdf available now, for example :) So it is useful that you have explicitly stated that for *this* specific concern, the document is in fact correct that SMEP saves us from BTB and RSB pollution, *despite* the empirical evidence that those structures only hold the low 31 bits. I'm going to get back to other things now, although I'm sure others may be very interested to reconcile the empirical evidence with what you say, and want to know *how* that can be the case. Which I'm sure you won't be able to say in public anyway. smime.p7s Description: S/MIME cryptographic signature
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On Fri, 2018-01-26 at 18:28 +, Van De Ven, Arjan wrote: > > As you know well, I mean "we think Intel's document is not > > correct". > > you asked before and even before you sent the email I confirmed to > you that the document is correct > > I'm not sure what the point is to then question that again 15 minutes > later other than creating more noise. Apologies, I hadn't seen the comment on IRC. Sometimes the docs *don't* get it right, especially when they're released in a hurry as that one was. I note there's a *fourth* version of microcode-update-guidance.pdf available now, for example :) So it is useful that you have explicitly stated that for *this* specific concern, the document is in fact correct that SMEP saves us from BTB and RSB pollution, *despite* the empirical evidence that those structures only hold the low 31 bits. I'm going to get back to other things now, although I'm sure others may be very interested to reconcile the empirical evidence with what you say, and want to know *how* that can be the case. Which I'm sure you won't be able to say in public anyway. smime.p7s Description: S/MIME cryptographic signature
RE: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
> On Fri, 2018-01-26 at 10:12 -0800, Arjan van de Ven wrote: > > On 1/26/2018 10:11 AM, David Woodhouse wrote: > > > > > > I am *actively* ignoring Skylake right now. This is about per-SKL > > > userspace even with SMEP, because we think Intel's document lies to us. > > > > if you think we lie to you then I think we're done with the conversation? > > > > Please tell us then what you deploy in AWS for your customers ? > > > > or show us research that shows we lied to you? > > As you know well, I mean "we think Intel's document is not correct". you asked before and even before you sent the email I confirmed to you that the document is correct I'm not sure what the point is to then question that again 15 minutes later other than creating more noise.
RE: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
> On Fri, 2018-01-26 at 10:12 -0800, Arjan van de Ven wrote: > > On 1/26/2018 10:11 AM, David Woodhouse wrote: > > > > > > I am *actively* ignoring Skylake right now. This is about per-SKL > > > userspace even with SMEP, because we think Intel's document lies to us. > > > > if you think we lie to you then I think we're done with the conversation? > > > > Please tell us then what you deploy in AWS for your customers ? > > > > or show us research that shows we lied to you? > > As you know well, I mean "we think Intel's document is not correct". you asked before and even before you sent the email I confirmed to you that the document is correct I'm not sure what the point is to then question that again 15 minutes later other than creating more noise.
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On Fri, 2018-01-26 at 10:12 -0800, Arjan van de Ven wrote: > On 1/26/2018 10:11 AM, David Woodhouse wrote: > > > > I am *actively* ignoring Skylake right now. This is about per-SKL > > userspace even with SMEP, because we think Intel's document lies to us. > > if you think we lie to you then I think we're done with the conversation? > > Please tell us then what you deploy in AWS for your customers ? > > or show us research that shows we lied to you? As you know well, I mean "we think Intel's document is not correct". The evidence which made us suspect that is fairly clear in the last few emails in this thread — it's about the BTB/RSB only having the low bits of the target, which would mean that userspace *can* put malicious targets into the RSB, regardless of SMEP. smime.p7s Description: S/MIME cryptographic signature
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On Fri, 2018-01-26 at 10:12 -0800, Arjan van de Ven wrote: > On 1/26/2018 10:11 AM, David Woodhouse wrote: > > > > I am *actively* ignoring Skylake right now. This is about per-SKL > > userspace even with SMEP, because we think Intel's document lies to us. > > if you think we lie to you then I think we're done with the conversation? > > Please tell us then what you deploy in AWS for your customers ? > > or show us research that shows we lied to you? As you know well, I mean "we think Intel's document is not correct". The evidence which made us suspect that is fairly clear in the last few emails in this thread — it's about the BTB/RSB only having the low bits of the target, which would mean that userspace *can* put malicious targets into the RSB, regardless of SMEP. smime.p7s Description: S/MIME cryptographic signature
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On Fri, 2018-01-26 at 09:59 -0800, Andi Kleen wrote: > On Fri, Jan 26, 2018 at 09:19:09AM -0800, Linus Torvalds wrote: > > > > On Fri, Jan 26, 2018 at 1:11 AM, David Woodhouse wrote: > > > > > > > > > Do we need to look again at the fact that we've disabled the RSB- > > > stuffing for SMEP? > > Absolutely. SMEP helps make people a lot less worried about things, > > but it doesn't fix the "BTB only contains partial addresses" case. > > > > But did we do that "disable stuffing with SMEP"? I'm not seeing it. In > > my tree, it's only conditional on X86_FEATURE_RETPOLINE. > > For Skylake we need RSB stuffing even with SMEP to avoid falling back to the > BTB on underflow. I am *actively* ignoring Skylake right now. This is about per-SKL userspace even with SMEP, because we think Intel's document lies to us. If the RSB only holds the low bits of the target, then a userspace attacker can populate an RSB entry which points to a kernel gadget of her choice, even with SMEP or KPTI enabled. smime.p7s Description: S/MIME cryptographic signature
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On Fri, 2018-01-26 at 09:59 -0800, Andi Kleen wrote: > On Fri, Jan 26, 2018 at 09:19:09AM -0800, Linus Torvalds wrote: > > > > On Fri, Jan 26, 2018 at 1:11 AM, David Woodhouse wrote: > > > > > > > > > Do we need to look again at the fact that we've disabled the RSB- > > > stuffing for SMEP? > > Absolutely. SMEP helps make people a lot less worried about things, > > but it doesn't fix the "BTB only contains partial addresses" case. > > > > But did we do that "disable stuffing with SMEP"? I'm not seeing it. In > > my tree, it's only conditional on X86_FEATURE_RETPOLINE. > > For Skylake we need RSB stuffing even with SMEP to avoid falling back to the > BTB on underflow. I am *actively* ignoring Skylake right now. This is about per-SKL userspace even with SMEP, because we think Intel's document lies to us. If the RSB only holds the low bits of the target, then a userspace attacker can populate an RSB entry which points to a kernel gadget of her choice, even with SMEP or KPTI enabled. smime.p7s Description: S/MIME cryptographic signature
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On 1/26/2018 10:11 AM, David Woodhouse wrote: I am *actively* ignoring Skylake right now. This is about per-SKL userspace even with SMEP, because we think Intel's document lies to us. if you think we lie to you then I think we're done with the conversation? Please tell us then what you deploy in AWS for your customers ? or show us research that shows we lied to you?
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On 1/26/2018 10:11 AM, David Woodhouse wrote: I am *actively* ignoring Skylake right now. This is about per-SKL userspace even with SMEP, because we think Intel's document lies to us. if you think we lie to you then I think we're done with the conversation? Please tell us then what you deploy in AWS for your customers ? or show us research that shows we lied to you?
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On Fri, Jan 26, 2018 at 09:19:09AM -0800, Linus Torvalds wrote: > On Fri, Jan 26, 2018 at 1:11 AM, David Woodhousewrote: > > > > Do we need to look again at the fact that we've disabled the RSB- > > stuffing for SMEP? > > Absolutely. SMEP helps make people a lot less worried about things, > but it doesn't fix the "BTB only contains partial addresses" case. > > But did we do that "disable stuffing with SMEP"? I'm not seeing it. In > my tree, it's only conditional on X86_FEATURE_RETPOLINE. For Skylake we need RSB stuffing even with SMEP to avoid falling back to the BTB on underflow. It's also always needed with virtualization. -Andi
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On Fri, Jan 26, 2018 at 09:19:09AM -0800, Linus Torvalds wrote: > On Fri, Jan 26, 2018 at 1:11 AM, David Woodhouse wrote: > > > > Do we need to look again at the fact that we've disabled the RSB- > > stuffing for SMEP? > > Absolutely. SMEP helps make people a lot less worried about things, > but it doesn't fix the "BTB only contains partial addresses" case. > > But did we do that "disable stuffing with SMEP"? I'm not seeing it. In > my tree, it's only conditional on X86_FEATURE_RETPOLINE. For Skylake we need RSB stuffing even with SMEP to avoid falling back to the BTB on underflow. It's also always needed with virtualization. -Andi
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On Fri, 2018-01-26 at 17:29 +, David Woodhouse wrote: > On Fri, 2018-01-26 at 09:19 -0800, Linus Torvalds wrote: > > On Fri, Jan 26, 2018 at 1:11 AM, David Woodhouse> > wrote: > > > Do we need to look again at the fact that we've disabled the RSB- > > > stuffing for SMEP? > > > > Absolutely. SMEP helps make people a lot less worried about things, > > but it doesn't fix the "BTB only contains partial addresses" case. > > > > But did we do that "disable stuffing with SMEP"? I'm not seeing it. In > > my tree, it's only conditional on X86_FEATURE_RETPOLINE. > > That's the vmexit one. The one on context switch is in > commit c995efd5a7 and has its own X86_FEATURE_RSB_CTXSW which in > kernel/cpu/bugs.c is turned on for (!SMEP || Skylake). > > The "low bits of the BTB" issue probably means that wants to be > X86_FEATURE_RETPOLINE too. Despite Intel's doc saying otherwise. > > (Intel's doc also says to do it on kernel entry, but we elected to do > it on context switch instead since *that's* when the imbalances show up > in the RSB.) Note, we've switched from talking about BTB to RSB here, so this is a valid concern if the *RSB* only has the low bits of the target. smime.p7s Description: S/MIME cryptographic signature
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On Fri, 2018-01-26 at 17:29 +, David Woodhouse wrote: > On Fri, 2018-01-26 at 09:19 -0800, Linus Torvalds wrote: > > On Fri, Jan 26, 2018 at 1:11 AM, David Woodhouse > > wrote: > > > Do we need to look again at the fact that we've disabled the RSB- > > > stuffing for SMEP? > > > > Absolutely. SMEP helps make people a lot less worried about things, > > but it doesn't fix the "BTB only contains partial addresses" case. > > > > But did we do that "disable stuffing with SMEP"? I'm not seeing it. In > > my tree, it's only conditional on X86_FEATURE_RETPOLINE. > > That's the vmexit one. The one on context switch is in > commit c995efd5a7 and has its own X86_FEATURE_RSB_CTXSW which in > kernel/cpu/bugs.c is turned on for (!SMEP || Skylake). > > The "low bits of the BTB" issue probably means that wants to be > X86_FEATURE_RETPOLINE too. Despite Intel's doc saying otherwise. > > (Intel's doc also says to do it on kernel entry, but we elected to do > it on context switch instead since *that's* when the imbalances show up > in the RSB.) Note, we've switched from talking about BTB to RSB here, so this is a valid concern if the *RSB* only has the low bits of the target. smime.p7s Description: S/MIME cryptographic signature
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On Fri, 2018-01-26 at 09:19 -0800, Linus Torvalds wrote: > On Fri, Jan 26, 2018 at 1:11 AM, David Woodhousewrote: > > > > > > Do we need to look again at the fact that we've disabled the RSB- > > stuffing for SMEP? > Absolutely. SMEP helps make people a lot less worried about things, > but it doesn't fix the "BTB only contains partial addresses" case. > > But did we do that "disable stuffing with SMEP"? I'm not seeing it. In > my tree, it's only conditional on X86_FEATURE_RETPOLINE. That's the vmexit one. The one on context switch is in commit c995efd5a7 and has its own X86_FEATURE_RSB_CTXSW which in kernel/cpu/bugs.c is turned on for (!SMEP || Skylake). The "low bits of the BTB" issue probably means that wants to be X86_FEATURE_RETPOLINE too. Despite Intel's doc saying otherwise. (Intel's doc also says to do it on kernel entry, but we elected to do it on context switch instead since *that's* when the imbalances show up in the RSB.) smime.p7s Description: S/MIME cryptographic signature
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On Fri, 2018-01-26 at 09:19 -0800, Linus Torvalds wrote: > On Fri, Jan 26, 2018 at 1:11 AM, David Woodhouse wrote: > > > > > > Do we need to look again at the fact that we've disabled the RSB- > > stuffing for SMEP? > Absolutely. SMEP helps make people a lot less worried about things, > but it doesn't fix the "BTB only contains partial addresses" case. > > But did we do that "disable stuffing with SMEP"? I'm not seeing it. In > my tree, it's only conditional on X86_FEATURE_RETPOLINE. That's the vmexit one. The one on context switch is in commit c995efd5a7 and has its own X86_FEATURE_RSB_CTXSW which in kernel/cpu/bugs.c is turned on for (!SMEP || Skylake). The "low bits of the BTB" issue probably means that wants to be X86_FEATURE_RETPOLINE too. Despite Intel's doc saying otherwise. (Intel's doc also says to do it on kernel entry, but we elected to do it on context switch instead since *that's* when the imbalances show up in the RSB.) smime.p7s Description: S/MIME cryptographic signature
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On Fri, Jan 26, 2018 at 1:11 AM, David Woodhousewrote: > > Do we need to look again at the fact that we've disabled the RSB- > stuffing for SMEP? Absolutely. SMEP helps make people a lot less worried about things, but it doesn't fix the "BTB only contains partial addresses" case. But did we do that "disable stuffing with SMEP"? I'm not seeing it. In my tree, it's only conditional on X86_FEATURE_RETPOLINE. Linus
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On Fri, Jan 26, 2018 at 1:11 AM, David Woodhouse wrote: > > Do we need to look again at the fact that we've disabled the RSB- > stuffing for SMEP? Absolutely. SMEP helps make people a lot less worried about things, but it doesn't fix the "BTB only contains partial addresses" case. But did we do that "disable stuffing with SMEP"? I'm not seeing it. In my tree, it's only conditional on X86_FEATURE_RETPOLINE. Linus
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On Thu, 2018-01-25 at 18:23 -0800, Dave Hansen wrote: > On 01/25/2018 06:11 PM, Liran Alon wrote: > > > > It is true that attacker cannot speculate to a kernel-address, but it > > doesn't mean it cannot use the leaked kernel-address together with > > another unrelated vulnerability to build a reliable exploit. > > The address doesn't leak if you can't execute there. It's the same > reason that we don't worry about speculation to user addresses from the > kernel when SMEP is in play. If both tags and target in the BTB are only 31 bits, then surely a user-learned prediction of a branch from 0x01234567 → 0x07654321 would be equivalent to a kernel-mode branch from 0x81234567 → 0x87654321 ... and interpreted in kernel mode as the latter? So I'm not sure why SMEP saves us there? Likewise if the RSB only stores the low 31 bits of the target, SMEP isn't much help there either. Do we need to look again at the fact that we've disabled the RSB- stuffing for SMEP? smime.p7s Description: S/MIME cryptographic signature
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On Thu, 2018-01-25 at 18:23 -0800, Dave Hansen wrote: > On 01/25/2018 06:11 PM, Liran Alon wrote: > > > > It is true that attacker cannot speculate to a kernel-address, but it > > doesn't mean it cannot use the leaked kernel-address together with > > another unrelated vulnerability to build a reliable exploit. > > The address doesn't leak if you can't execute there. It's the same > reason that we don't worry about speculation to user addresses from the > kernel when SMEP is in play. If both tags and target in the BTB are only 31 bits, then surely a user-learned prediction of a branch from 0x01234567 → 0x07654321 would be equivalent to a kernel-mode branch from 0x81234567 → 0x87654321 ... and interpreted in kernel mode as the latter? So I'm not sure why SMEP saves us there? Likewise if the RSB only stores the low 31 bits of the target, SMEP isn't much help there either. Do we need to look again at the fact that we've disabled the RSB- stuffing for SMEP? smime.p7s Description: S/MIME cryptographic signature
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On Thu, 2018-01-25 at 18:11 -0800, Liran Alon wrote: > > P.S: > It seems to me that all these issues could be resolved completely at > hardware in future CPUs if BTB/BHB/RSB entries were tagged with > prediction-mode (or similar metadata). It will be nice if Intel/AMD > could share if that is the planned long-term solution instead of > IBRS-all-the-time. IBRS-all-the-time is tagging with the ring and VMX root/non-root mode, it seems. That much they could slip into the upcoming generation of CPUs. And it's supposed to be fast¹; none of the dirty hacks in microcode that they needed to implement the first-generation IBRS. But we still need to tag with ASID/VMID and do proper flushing for those, before we can completely ditch the need to do IBPB at the right times. Reading between the lines, I don't think they could add *that* without stopping the fabs for a year or so while they go back to the drawing board. But yes, I sincerely hope they *are* planning to do it, and expose a 'SPECTRE_NO' bit in IA32_ARCH_CAPABILITIES, as soon as is humanly possible. ¹ Fast enough that we'll want to use it and ALTERNATIVE out the retpolines. smime.p7s Description: S/MIME cryptographic signature
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On Thu, 2018-01-25 at 18:11 -0800, Liran Alon wrote: > > P.S: > It seems to me that all these issues could be resolved completely at > hardware in future CPUs if BTB/BHB/RSB entries were tagged with > prediction-mode (or similar metadata). It will be nice if Intel/AMD > could share if that is the planned long-term solution instead of > IBRS-all-the-time. IBRS-all-the-time is tagging with the ring and VMX root/non-root mode, it seems. That much they could slip into the upcoming generation of CPUs. And it's supposed to be fast¹; none of the dirty hacks in microcode that they needed to implement the first-generation IBRS. But we still need to tag with ASID/VMID and do proper flushing for those, before we can completely ditch the need to do IBPB at the right times. Reading between the lines, I don't think they could add *that* without stopping the fabs for a year or so while they go back to the drawing board. But yes, I sincerely hope they *are* planning to do it, and expose a 'SPECTRE_NO' bit in IA32_ARCH_CAPABILITIES, as soon as is humanly possible. ¹ Fast enough that we'll want to use it and ALTERNATIVE out the retpolines. smime.p7s Description: S/MIME cryptographic signature
RE: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
> -Original Message- > From: Liran Alon [mailto:liran.a...@oracle.com] > Sent: Thursday, January 25, 2018 6:50 PM > To: Hansen, Dave <dave.han...@intel.com> > Cc: labb...@redhat.com; l...@kernel.org; janakarajan.natara...@amd.com; > torva...@linux-foundation.org; b...@suse.de; Mallick, Asit K > <asit.k.mall...@intel.com>; rkrc...@redhat.com; karah...@amazon.de; > h...@zytor.com; mi...@redhat.com; Nakajima, Jun > <jun.nakaj...@intel.com>; x...@kernel.org; Raj, Ashok <ashok@intel.com>; > Van De Ven, Arjan <arjan.van.de@intel.com>; tim.c.c...@linux.intel.com; > pbonz...@redhat.com; a...@linux.intel.com; linux-kernel@vger.kernel.org; > dw...@infradead.org; pet...@infradead.org; t...@linutronix.de; > gre...@linuxfoundation.org; mhira...@kernel.org; ar...@linux.intel.com; > thomas.lenda...@amd.com; Williams, Dan J <dan.j.willi...@intel.com>; > j...@8bytes.org; k...@vger.kernel.org; aarca...@redhat.com > Subject: Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict > Indirect > Branch Speculation > > > Google P0 blog-post > (https://googleprojectzero.blogspot.co.il/2018/01/reading-privileged-memory- > with-side.html) claims that BTB & BHB only use <31 low bits of the address of > the source instruction to lookup into the BTB. In addition, it claims that the > higher bits of the predicated destination change together with the higher > bits of > the source instruction. > > Therefore, it should be possible to leak the low bits of high predicition-mode > code BTB/BHB entries from low prediction-mode code. Because the predicted > destination address will reside in user-space. > > What am I missing? I thought this email thread was about the RSB...
RE: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
> -Original Message- > From: Liran Alon [mailto:liran.a...@oracle.com] > Sent: Thursday, January 25, 2018 6:50 PM > To: Hansen, Dave > Cc: labb...@redhat.com; l...@kernel.org; janakarajan.natara...@amd.com; > torva...@linux-foundation.org; b...@suse.de; Mallick, Asit K > ; rkrc...@redhat.com; karah...@amazon.de; > h...@zytor.com; mi...@redhat.com; Nakajima, Jun > ; x...@kernel.org; Raj, Ashok ; > Van De Ven, Arjan ; tim.c.c...@linux.intel.com; > pbonz...@redhat.com; a...@linux.intel.com; linux-kernel@vger.kernel.org; > dw...@infradead.org; pet...@infradead.org; t...@linutronix.de; > gre...@linuxfoundation.org; mhira...@kernel.org; ar...@linux.intel.com; > thomas.lenda...@amd.com; Williams, Dan J ; > j...@8bytes.org; k...@vger.kernel.org; aarca...@redhat.com > Subject: Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict > Indirect > Branch Speculation > > > Google P0 blog-post > (https://googleprojectzero.blogspot.co.il/2018/01/reading-privileged-memory- > with-side.html) claims that BTB & BHB only use <31 low bits of the address of > the source instruction to lookup into the BTB. In addition, it claims that the > higher bits of the predicated destination change together with the higher > bits of > the source instruction. > > Therefore, it should be possible to leak the low bits of high predicition-mode > code BTB/BHB entries from low prediction-mode code. Because the predicted > destination address will reside in user-space. > > What am I missing? I thought this email thread was about the RSB...
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
- dave.han...@intel.com wrote: > On 01/25/2018 06:11 PM, Liran Alon wrote: > > It is true that attacker cannot speculate to a kernel-address, but > it > > doesn't mean it cannot use the leaked kernel-address together with > > another unrelated vulnerability to build a reliable exploit. > > The address doesn't leak if you can't execute there. It's the same > reason that we don't worry about speculation to user addresses from > the > kernel when SMEP is in play. Maybe I misunderstand BTB & BHB internals. Will be glad if you could pinpoint my error. Google P0 blog-post (https://googleprojectzero.blogspot.co.il/2018/01/reading-privileged-memory-with-side.html) claims that BTB & BHB only use <31 low bits of the address of the source instruction to lookup into the BTB. In addition, it claims that the higher bits of the predicated destination change together with the higher bits of the source instruction. Therefore, it should be possible to leak the low bits of high predicition-mode code BTB/BHB entries from low prediction-mode code. Because the predicted destination address will reside in user-space. What am I missing? Thanks, -Liran
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
- dave.han...@intel.com wrote: > On 01/25/2018 06:11 PM, Liran Alon wrote: > > It is true that attacker cannot speculate to a kernel-address, but > it > > doesn't mean it cannot use the leaked kernel-address together with > > another unrelated vulnerability to build a reliable exploit. > > The address doesn't leak if you can't execute there. It's the same > reason that we don't worry about speculation to user addresses from > the > kernel when SMEP is in play. Maybe I misunderstand BTB & BHB internals. Will be glad if you could pinpoint my error. Google P0 blog-post (https://googleprojectzero.blogspot.co.il/2018/01/reading-privileged-memory-with-side.html) claims that BTB & BHB only use <31 low bits of the address of the source instruction to lookup into the BTB. In addition, it claims that the higher bits of the predicated destination change together with the higher bits of the source instruction. Therefore, it should be possible to leak the low bits of high predicition-mode code BTB/BHB entries from low prediction-mode code. Because the predicted destination address will reside in user-space. What am I missing? Thanks, -Liran
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On 01/25/2018 06:11 PM, Liran Alon wrote: > It is true that attacker cannot speculate to a kernel-address, but it > doesn't mean it cannot use the leaked kernel-address together with > another unrelated vulnerability to build a reliable exploit. The address doesn't leak if you can't execute there. It's the same reason that we don't worry about speculation to user addresses from the kernel when SMEP is in play.
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On 01/25/2018 06:11 PM, Liran Alon wrote: > It is true that attacker cannot speculate to a kernel-address, but it > doesn't mean it cannot use the leaked kernel-address together with > another unrelated vulnerability to build a reliable exploit. The address doesn't leak if you can't execute there. It's the same reason that we don't worry about speculation to user addresses from the kernel when SMEP is in play.
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
- dave.han...@intel.com wrote: > On 01/23/2018 03:13 AM, Liran Alon wrote: > > Therefore, breaking KASLR. In order to handle this, every exit from > > kernel-mode to user-mode should stuff RSB. In addition, this > stuffing > > of RSB may need to be done from a fixed address to avoid leaking > the > > address of the RSB stuffing itself. > > With PTI alone in place, I don't see how userspace could do anything > with this information. Even if userspace started to speculate to a > kernel address, there is nothing at the kernel address to execute: no > TLB entry, no PTE to load, nothing. > > You probably have a valid point about host->guest, though. I see it differently. It is true that attacker cannot speculate to a kernel-address, but it doesn't mean it cannot use the leaked kernel-address together with another unrelated vulnerability to build a reliable exploit. Security is built in layers. The purpose of KASLR is to break the reliablity of an exploit which relies on vulnerability primitives such as: memory-corruption of a kernel-address, hijack kernel control-flow to a kernel-address or even just read a kernel-address. In modern exploitation, it is common to chain multiple different vulnerabilities in order to build a reliable exploit. Therefore, leaking a kernel-address could be exactly the missing primitive to complete a vulnerability-chain of a reliable exploit. I don't see a big difference between leaking a kernel-address from user-mode vs. leaking a hypervisor-address from guest. They are both useful just as a primitive which is part of an exploit chain. One could argue though, that currently KASLR is fundementally broken and therefore should not be considered a security boundary anymore. This argument could be legit as there were some well-known techniques that could break KASLR before KPTI patch-set was introduced (e.g. Timing memory accesses to kernel-addresses and messure reliably by leveraging TSX). Another well-known argument against KASLR is that it is a non-deterministic mitigation which some argue is not good enough. However, I think that if we decide KASLR is not a security boundary anymore, it should be made loud and clear. In general, I think there are some info-leak vulnerabilities in our current mitigation plan which doesn't seem to be addressed. I will be glad if we could address them clearly. These are all the open issues as I see them: 1) Because IBRS doesn't restrict low prediction-mode code from using BTB of high prediction-mode code, It is possible to info-leak addresses from high prediction-mode code to low prediciton-mode code. This is the KASLR breakage discussed above. Again, could be ignored if we discard KASLR as a security boundary. 2) Both IBRS & retpoline don't prevent use of BHB of high prediction-mode code from being used by low prediction-mode code. Therefore, low prediction-mode code could deduce the conditional branches taken by high prediction-mode code. 3) Similar leak to (1) exists from the fact that RSB entries of high prediction-mode code could be leaked by low prediction-mode code which may reveal kernel-addresses. Again, we could decide that this isn't a security boundary. An alternative to solve this could be to just stuff RSB from a fixed address between prediction-mode transitions. -Liran P.S: It seems to me that all these issues could be resolved completely at hardware in future CPUs if BTB/BHB/RSB entries were tagged with prediction-mode (or similar metadata). It will be nice if Intel/AMD could share if that is the planned long-term solution instead of IBRS-all-the-time.
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
- dave.han...@intel.com wrote: > On 01/23/2018 03:13 AM, Liran Alon wrote: > > Therefore, breaking KASLR. In order to handle this, every exit from > > kernel-mode to user-mode should stuff RSB. In addition, this > stuffing > > of RSB may need to be done from a fixed address to avoid leaking > the > > address of the RSB stuffing itself. > > With PTI alone in place, I don't see how userspace could do anything > with this information. Even if userspace started to speculate to a > kernel address, there is nothing at the kernel address to execute: no > TLB entry, no PTE to load, nothing. > > You probably have a valid point about host->guest, though. I see it differently. It is true that attacker cannot speculate to a kernel-address, but it doesn't mean it cannot use the leaked kernel-address together with another unrelated vulnerability to build a reliable exploit. Security is built in layers. The purpose of KASLR is to break the reliablity of an exploit which relies on vulnerability primitives such as: memory-corruption of a kernel-address, hijack kernel control-flow to a kernel-address or even just read a kernel-address. In modern exploitation, it is common to chain multiple different vulnerabilities in order to build a reliable exploit. Therefore, leaking a kernel-address could be exactly the missing primitive to complete a vulnerability-chain of a reliable exploit. I don't see a big difference between leaking a kernel-address from user-mode vs. leaking a hypervisor-address from guest. They are both useful just as a primitive which is part of an exploit chain. One could argue though, that currently KASLR is fundementally broken and therefore should not be considered a security boundary anymore. This argument could be legit as there were some well-known techniques that could break KASLR before KPTI patch-set was introduced (e.g. Timing memory accesses to kernel-addresses and messure reliably by leveraging TSX). Another well-known argument against KASLR is that it is a non-deterministic mitigation which some argue is not good enough. However, I think that if we decide KASLR is not a security boundary anymore, it should be made loud and clear. In general, I think there are some info-leak vulnerabilities in our current mitigation plan which doesn't seem to be addressed. I will be glad if we could address them clearly. These are all the open issues as I see them: 1) Because IBRS doesn't restrict low prediction-mode code from using BTB of high prediction-mode code, It is possible to info-leak addresses from high prediction-mode code to low prediciton-mode code. This is the KASLR breakage discussed above. Again, could be ignored if we discard KASLR as a security boundary. 2) Both IBRS & retpoline don't prevent use of BHB of high prediction-mode code from being used by low prediction-mode code. Therefore, low prediction-mode code could deduce the conditional branches taken by high prediction-mode code. 3) Similar leak to (1) exists from the fact that RSB entries of high prediction-mode code could be leaked by low prediction-mode code which may reveal kernel-addresses. Again, we could decide that this isn't a security boundary. An alternative to solve this could be to just stuff RSB from a fixed address between prediction-mode transitions. -Liran P.S: It seems to me that all these issues could be resolved completely at hardware in future CPUs if BTB/BHB/RSB entries were tagged with prediction-mode (or similar metadata). It will be nice if Intel/AMD could share if that is the planned long-term solution instead of IBRS-all-the-time.
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On 01/23/2018 03:13 AM, Liran Alon wrote: > Therefore, breaking KASLR. In order to handle this, every exit from > kernel-mode to user-mode should stuff RSB. In addition, this stuffing > of RSB may need to be done from a fixed address to avoid leaking the > address of the RSB stuffing itself. With PTI alone in place, I don't see how userspace could do anything with this information. Even if userspace started to speculate to a kernel address, there is nothing at the kernel address to execute: no TLB entry, no PTE to load, nothing. You probably have a valid point about host->guest, though.
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On 01/23/2018 03:13 AM, Liran Alon wrote: > Therefore, breaking KASLR. In order to handle this, every exit from > kernel-mode to user-mode should stuff RSB. In addition, this stuffing > of RSB may need to be done from a fixed address to avoid leaking the > address of the RSB stuffing itself. With PTI alone in place, I don't see how userspace could do anything with this information. Even if userspace started to speculate to a kernel address, there is nothing at the kernel address to execute: no TLB entry, no PTE to load, nothing. You probably have a valid point about host->guest, though.
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On Thu, Jan 25, 2018 at 05:19:04PM +0100, Mason wrote: > On 23/01/2018 10:30, David Woodhouse wrote: > > > Skylake takes predictions from the generic branch target buffer when > > the RSB underflows. > > Adding LAKML. > > AFAIU, some ARM Cortex cores have the same optimization. > (A9 maybe, A17 probably, some recent 64-bit cores) > > Are there software work-arounds for Spectre planned for arm32 and arm64? Yes, I think they are currently burried in one of the arm64 trees, and they have been posted to the mailing list a few times in the past. thanks, greg k-h
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On Thu, Jan 25, 2018 at 05:19:04PM +0100, Mason wrote: > On 23/01/2018 10:30, David Woodhouse wrote: > > > Skylake takes predictions from the generic branch target buffer when > > the RSB underflows. > > Adding LAKML. > > AFAIU, some ARM Cortex cores have the same optimization. > (A9 maybe, A17 probably, some recent 64-bit cores) > > Are there software work-arounds for Spectre planned for arm32 and arm64? Yes, I think they are currently burried in one of the arm64 trees, and they have been posted to the mailing list a few times in the past. thanks, greg k-h
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On 23/01/2018 10:30, David Woodhouse wrote: > Skylake takes predictions from the generic branch target buffer when > the RSB underflows. Adding LAKML. AFAIU, some ARM Cortex cores have the same optimization. (A9 maybe, A17 probably, some recent 64-bit cores) Are there software work-arounds for Spectre planned for arm32 and arm64? Regards.
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On 23/01/2018 10:30, David Woodhouse wrote: > Skylake takes predictions from the generic branch target buffer when > the RSB underflows. Adding LAKML. AFAIU, some ARM Cortex cores have the same optimization. (A9 maybe, A17 probably, some recent 64-bit cores) Are there software work-arounds for Spectre planned for arm32 and arm64? Regards.
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
> On Jan 23, 2018, at 5:59 PM, Van De Ven, Arjan> wrote: > > >>> It is a reasonable approach. Let a process who needs max security >>> opt in with disabled dumpable. It can have a flush with IBPB clear before >>> starting to run, and have STIBP set while running. >>> >> >> Do we maybe want a separate opt in? I can easily imagine things like >> web browsers that *don't* want to be non-dumpable but do want this >> opt-in. > > eventually we need something better. Probably in addition. > dumpable is used today for things that want this. > >> >> Also, what's the performance hit of STIBP? > > pretty steep, but it depends on the CPU generation, for some it's cheaper > than others. (yes I realize this is a vague answer, but the range is really > from just about zero to oh my god) > > I'm not a fan of doing this right now to be honest. We really need to not > piece meal some of this, and come up with a better concept of protection on a > higher level. > For example, you mention web browsers, but the threat model for browsers is > generally internet content. For V2 to work you need to get some "evil > pointer" into the app from the observer and browsers usually aren't doing > that. > The most likely user would be some software-TPM-like service that has magic > keys. > > And for keys we want something else... we want an madvice() sort of thing > that does a few things, like equivalent of mlock (so the key does not end up > in swap), I'd love to see a slight variant: encrypt that page against some ephemeral key if it gets swapped. > not having the page (but potentially the rest) end up in core dumps, and the > kernel making sure that if the program exits (say for segv) that the key page > gets zeroed before going into the free pool. Once you do that as feature, > making the key speculation safe is not too hard (intel and arm have cpu > options to mark pages for that) > > How do we do that on Intel? Make it UC?
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
> On Jan 23, 2018, at 5:59 PM, Van De Ven, Arjan > wrote: > > >>> It is a reasonable approach. Let a process who needs max security >>> opt in with disabled dumpable. It can have a flush with IBPB clear before >>> starting to run, and have STIBP set while running. >>> >> >> Do we maybe want a separate opt in? I can easily imagine things like >> web browsers that *don't* want to be non-dumpable but do want this >> opt-in. > > eventually we need something better. Probably in addition. > dumpable is used today for things that want this. > >> >> Also, what's the performance hit of STIBP? > > pretty steep, but it depends on the CPU generation, for some it's cheaper > than others. (yes I realize this is a vague answer, but the range is really > from just about zero to oh my god) > > I'm not a fan of doing this right now to be honest. We really need to not > piece meal some of this, and come up with a better concept of protection on a > higher level. > For example, you mention web browsers, but the threat model for browsers is > generally internet content. For V2 to work you need to get some "evil > pointer" into the app from the observer and browsers usually aren't doing > that. > The most likely user would be some software-TPM-like service that has magic > keys. > > And for keys we want something else... we want an madvice() sort of thing > that does a few things, like equivalent of mlock (so the key does not end up > in swap), I'd love to see a slight variant: encrypt that page against some ephemeral key if it gets swapped. > not having the page (but potentially the rest) end up in core dumps, and the > kernel making sure that if the program exits (say for segv) that the key page > gets zeroed before going into the free pool. Once you do that as feature, > making the key speculation safe is not too hard (intel and arm have cpu > options to mark pages for that) > > How do we do that on Intel? Make it UC?
RE: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
> > It is a reasonable approach. Let a process who needs max security > > opt in with disabled dumpable. It can have a flush with IBPB clear before > > starting to run, and have STIBP set while running. > > > > Do we maybe want a separate opt in? I can easily imagine things like > web browsers that *don't* want to be non-dumpable but do want this > opt-in. eventually we need something better. Probably in addition. dumpable is used today for things that want this. > > Also, what's the performance hit of STIBP? pretty steep, but it depends on the CPU generation, for some it's cheaper than others. (yes I realize this is a vague answer, but the range is really from just about zero to oh my god) I'm not a fan of doing this right now to be honest. We really need to not piece meal some of this, and come up with a better concept of protection on a higher level. For example, you mention web browsers, but the threat model for browsers is generally internet content. For V2 to work you need to get some "evil pointer" into the app from the observer and browsers usually aren't doing that. The most likely user would be some software-TPM-like service that has magic keys. And for keys we want something else... we want an madvice() sort of thing that does a few things, like equivalent of mlock (so the key does not end up in swap), not having the page (but potentially the rest) end up in core dumps, and the kernel making sure that if the program exits (say for segv) that the key page gets zeroed before going into the free pool. Once you do that as feature, making the key speculation safe is not too hard (intel and arm have cpu options to mark pages for that)
RE: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
> > It is a reasonable approach. Let a process who needs max security > > opt in with disabled dumpable. It can have a flush with IBPB clear before > > starting to run, and have STIBP set while running. > > > > Do we maybe want a separate opt in? I can easily imagine things like > web browsers that *don't* want to be non-dumpable but do want this > opt-in. eventually we need something better. Probably in addition. dumpable is used today for things that want this. > > Also, what's the performance hit of STIBP? pretty steep, but it depends on the CPU generation, for some it's cheaper than others. (yes I realize this is a vague answer, but the range is really from just about zero to oh my god) I'm not a fan of doing this right now to be honest. We really need to not piece meal some of this, and come up with a better concept of protection on a higher level. For example, you mention web browsers, but the threat model for browsers is generally internet content. For V2 to work you need to get some "evil pointer" into the app from the observer and browsers usually aren't doing that. The most likely user would be some software-TPM-like service that has magic keys. And for keys we want something else... we want an madvice() sort of thing that does a few things, like equivalent of mlock (so the key does not end up in swap), not having the page (but potentially the rest) end up in core dumps, and the kernel making sure that if the program exits (say for segv) that the key page gets zeroed before going into the free pool. Once you do that as feature, making the key speculation safe is not too hard (intel and arm have cpu options to mark pages for that)
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On Tue, 2018-01-23 at 17:00 -0800, Andy Lutomirski wrote: > On Tue, Jan 23, 2018 at 4:47 PM, Tim Chenwrote: > > > > On 01/23/2018 03:14 PM, Woodhouse, David wrote: > > > > > > On Tue, 2018-01-23 at 14:49 -0800, Andi Kleen wrote: > > > > > > > > > > > > > > Not sure. Maybe to start, the answer might be to allow it to be set > > > > > for > > > > > the ultra-paranoid, but in general don't enable it by default. > > > > > Having it > > > > > enabled would be an alternative to someone deciding to disable SMT, > > > > > since > > > > > that would have even more of a performance impact. > > > > I agree. A reasonable strategy would be to only enable it for > > > > processes that have dumpable disabled. This should be already set for > > > > high value processes like GPG, and allows others to opt-in if > > > > they need to. > > > That seems to make sense, and I think was the solution we were > > > approaching for IBPB on context switch too, right? > > > > > > Are we generally agreed on dumpable as the criterion for both of those? > > > > > It is a reasonable approach. Let a process who needs max security > > opt in with disabled dumpable. It can have a flush with IBPB clear before > > starting to run, and have STIBP set while running. > > > Do we maybe want a separate opt in? I can easily imagine things like > web browsers that *don't* want to be non-dumpable but do want this > opt-in. This is to protect you from another local process running on a HT sibling. Not the kind of thing that web browsers are normally worrying about. > Also, what's the performance hit of STIBP? Varies per CPU generation, but generally approaching that of full IBRS I think? I don't recall looking at this specifically (since we haven't actually used it for this yet). smime.p7s Description: S/MIME cryptographic signature
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On Tue, 2018-01-23 at 17:00 -0800, Andy Lutomirski wrote: > On Tue, Jan 23, 2018 at 4:47 PM, Tim Chen wrote: > > > > On 01/23/2018 03:14 PM, Woodhouse, David wrote: > > > > > > On Tue, 2018-01-23 at 14:49 -0800, Andi Kleen wrote: > > > > > > > > > > > > > > Not sure. Maybe to start, the answer might be to allow it to be set > > > > > for > > > > > the ultra-paranoid, but in general don't enable it by default. > > > > > Having it > > > > > enabled would be an alternative to someone deciding to disable SMT, > > > > > since > > > > > that would have even more of a performance impact. > > > > I agree. A reasonable strategy would be to only enable it for > > > > processes that have dumpable disabled. This should be already set for > > > > high value processes like GPG, and allows others to opt-in if > > > > they need to. > > > That seems to make sense, and I think was the solution we were > > > approaching for IBPB on context switch too, right? > > > > > > Are we generally agreed on dumpable as the criterion for both of those? > > > > > It is a reasonable approach. Let a process who needs max security > > opt in with disabled dumpable. It can have a flush with IBPB clear before > > starting to run, and have STIBP set while running. > > > Do we maybe want a separate opt in? I can easily imagine things like > web browsers that *don't* want to be non-dumpable but do want this > opt-in. This is to protect you from another local process running on a HT sibling. Not the kind of thing that web browsers are normally worrying about. > Also, what's the performance hit of STIBP? Varies per CPU generation, but generally approaching that of full IBRS I think? I don't recall looking at this specifically (since we haven't actually used it for this yet). smime.p7s Description: S/MIME cryptographic signature
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On Tue, Jan 23, 2018 at 4:47 PM, Tim Chenwrote: > On 01/23/2018 03:14 PM, Woodhouse, David wrote: >> On Tue, 2018-01-23 at 14:49 -0800, Andi Kleen wrote: Not sure. Maybe to start, the answer might be to allow it to be set for the ultra-paranoid, but in general don't enable it by default. Having it enabled would be an alternative to someone deciding to disable SMT, since that would have even more of a performance impact. >>> >>> I agree. A reasonable strategy would be to only enable it for >>> processes that have dumpable disabled. This should be already set for >>> high value processes like GPG, and allows others to opt-in if >>> they need to. >> >> That seems to make sense, and I think was the solution we were >> approaching for IBPB on context switch too, right? >> >> Are we generally agreed on dumpable as the criterion for both of those? >> > > It is a reasonable approach. Let a process who needs max security > opt in with disabled dumpable. It can have a flush with IBPB clear before > starting to run, and have STIBP set while running. > Do we maybe want a separate opt in? I can easily imagine things like web browsers that *don't* want to be non-dumpable but do want this opt-in. Also, what's the performance hit of STIBP?
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On Tue, Jan 23, 2018 at 4:47 PM, Tim Chen wrote: > On 01/23/2018 03:14 PM, Woodhouse, David wrote: >> On Tue, 2018-01-23 at 14:49 -0800, Andi Kleen wrote: Not sure. Maybe to start, the answer might be to allow it to be set for the ultra-paranoid, but in general don't enable it by default. Having it enabled would be an alternative to someone deciding to disable SMT, since that would have even more of a performance impact. >>> >>> I agree. A reasonable strategy would be to only enable it for >>> processes that have dumpable disabled. This should be already set for >>> high value processes like GPG, and allows others to opt-in if >>> they need to. >> >> That seems to make sense, and I think was the solution we were >> approaching for IBPB on context switch too, right? >> >> Are we generally agreed on dumpable as the criterion for both of those? >> > > It is a reasonable approach. Let a process who needs max security > opt in with disabled dumpable. It can have a flush with IBPB clear before > starting to run, and have STIBP set while running. > Do we maybe want a separate opt in? I can easily imagine things like web browsers that *don't* want to be non-dumpable but do want this opt-in. Also, what's the performance hit of STIBP?
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On 01/23/2018 03:14 PM, Woodhouse, David wrote: > On Tue, 2018-01-23 at 14:49 -0800, Andi Kleen wrote: >>> Not sure. Maybe to start, the answer might be to allow it to be set for >>> the ultra-paranoid, but in general don't enable it by default. Having it >>> enabled would be an alternative to someone deciding to disable SMT, since >>> that would have even more of a performance impact. >> >> I agree. A reasonable strategy would be to only enable it for >> processes that have dumpable disabled. This should be already set for >> high value processes like GPG, and allows others to opt-in if >> they need to. > > That seems to make sense, and I think was the solution we were > approaching for IBPB on context switch too, right? > > Are we generally agreed on dumpable as the criterion for both of those? > It is a reasonable approach. Let a process who needs max security opt in with disabled dumpable. It can have a flush with IBPB clear before starting to run, and have STIBP set while running. Tim
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On 01/23/2018 03:14 PM, Woodhouse, David wrote: > On Tue, 2018-01-23 at 14:49 -0800, Andi Kleen wrote: >>> Not sure. Maybe to start, the answer might be to allow it to be set for >>> the ultra-paranoid, but in general don't enable it by default. Having it >>> enabled would be an alternative to someone deciding to disable SMT, since >>> that would have even more of a performance impact. >> >> I agree. A reasonable strategy would be to only enable it for >> processes that have dumpable disabled. This should be already set for >> high value processes like GPG, and allows others to opt-in if >> they need to. > > That seems to make sense, and I think was the solution we were > approaching for IBPB on context switch too, right? > > Are we generally agreed on dumpable as the criterion for both of those? > It is a reasonable approach. Let a process who needs max security opt in with disabled dumpable. It can have a flush with IBPB clear before starting to run, and have STIBP set while running. Tim
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
Ingo Molnarwrites: > > Is there any reason why this wouldn't work? To actually maintain the true call depth you would need to intercept the return of the function too, because the counter has to be decremented at the end of the function. Plain ftrace cannot do that because it only intercepts the function entry. The function graph tracer can do this, but only at the cost of overwriting the return address (and saving return in a special stack) This always causes a mispredict on every return, and other overhead, and is one of the reasons why function graph is so much slower than the plain function tracer. I suspect the overhead would be significant. To make your scheme work efficiently work likely we would need custom gcc instrumentation for the returns. FWIW our plan was to add enough manual stuffing at strategic points, until we're sure enough of good enough coverage. -Andi
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
Ingo Molnar writes: > > Is there any reason why this wouldn't work? To actually maintain the true call depth you would need to intercept the return of the function too, because the counter has to be decremented at the end of the function. Plain ftrace cannot do that because it only intercepts the function entry. The function graph tracer can do this, but only at the cost of overwriting the return address (and saving return in a special stack) This always causes a mispredict on every return, and other overhead, and is one of the reasons why function graph is so much slower than the plain function tracer. I suspect the overhead would be significant. To make your scheme work efficiently work likely we would need custom gcc instrumentation for the returns. FWIW our plan was to add enough manual stuffing at strategic points, until we're sure enough of good enough coverage. -Andi
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On Tue, Jan 23, 2018 at 11:14:36PM +, Woodhouse, David wrote: > On Tue, 2018-01-23 at 14:49 -0800, Andi Kleen wrote: > > > Not sure. Maybe to start, the answer might be to allow it to be set for > > > the ultra-paranoid, but in general don't enable it by default. Having it > > > enabled would be an alternative to someone deciding to disable SMT, since > > > that would have even more of a performance impact. > > > > I agree. A reasonable strategy would be to only enable it for > > processes that have dumpable disabled. This should be already set for > > high value processes like GPG, and allows others to opt-in if > > they need to. > > That seems to make sense, and I think was the solution we were > approaching for IBPB on context switch too, right? Right. -Andi
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On Tue, Jan 23, 2018 at 11:14:36PM +, Woodhouse, David wrote: > On Tue, 2018-01-23 at 14:49 -0800, Andi Kleen wrote: > > > Not sure. Maybe to start, the answer might be to allow it to be set for > > > the ultra-paranoid, but in general don't enable it by default. Having it > > > enabled would be an alternative to someone deciding to disable SMT, since > > > that would have even more of a performance impact. > > > > I agree. A reasonable strategy would be to only enable it for > > processes that have dumpable disabled. This should be already set for > > high value processes like GPG, and allows others to opt-in if > > they need to. > > That seems to make sense, and I think was the solution we were > approaching for IBPB on context switch too, right? Right. -Andi
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On Tue, 2018-01-23 at 14:49 -0800, Andi Kleen wrote: > > Not sure. Maybe to start, the answer might be to allow it to be set for > > the ultra-paranoid, but in general don't enable it by default. Having it > > enabled would be an alternative to someone deciding to disable SMT, since > > that would have even more of a performance impact. > > I agree. A reasonable strategy would be to only enable it for > processes that have dumpable disabled. This should be already set for > high value processes like GPG, and allows others to opt-in if > they need to. That seems to make sense, and I think was the solution we were approaching for IBPB on context switch too, right? Are we generally agreed on dumpable as the criterion for both of those? smime.p7s Description: S/MIME cryptographic signature
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On Tue, 2018-01-23 at 14:49 -0800, Andi Kleen wrote: > > Not sure. Maybe to start, the answer might be to allow it to be set for > > the ultra-paranoid, but in general don't enable it by default. Having it > > enabled would be an alternative to someone deciding to disable SMT, since > > that would have even more of a performance impact. > > I agree. A reasonable strategy would be to only enable it for > processes that have dumpable disabled. This should be already set for > high value processes like GPG, and allows others to opt-in if > they need to. That seems to make sense, and I think was the solution we were approaching for IBPB on context switch too, right? Are we generally agreed on dumpable as the criterion for both of those? smime.p7s Description: S/MIME cryptographic signature
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
> Not sure. Maybe to start, the answer might be to allow it to be set for > the ultra-paranoid, but in general don't enable it by default. Having it > enabled would be an alternative to someone deciding to disable SMT, since > that would have even more of a performance impact. I agree. A reasonable strategy would be to only enable it for processes that have dumpable disabled. This should be already set for high value processes like GPG, and allows others to opt-in if they need to. -Andi
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
> Not sure. Maybe to start, the answer might be to allow it to be set for > the ultra-paranoid, but in general don't enable it by default. Having it > enabled would be an alternative to someone deciding to disable SMT, since > that would have even more of a performance impact. I agree. A reasonable strategy would be to only enable it for processes that have dumpable disabled. This should be already set for high value processes like GPG, and allows others to opt-in if they need to. -Andi
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On 1/23/2018 10:20 AM, Woodhouse, David wrote: > On Tue, 2018-01-23 at 10:12 -0600, Tom Lendacky wrote: >> +.macro UNRESTRICT_IB_SPEC + ALTERNATIVE "jmp .Lskip_\@", "", X86_FEATURE_IBRS + PUSH_MSR_REGS + WRMSR_ASM $MSR_IA32_SPEC_CTRL, $0, $0 >>> >> I think you should be writing 2, not 0, since I'm reasonably >> confident that we want STIBP on. Can you explain why you're writing >> 0? >> >> Do we want to talk about STIBP in general? Should it be (yet another) >> boot option to enable or disable? If there is STIBP support without >> IBRS support, it could be a set and forget at boot time. > > We haven't got patches which enable STIBP in general. The kernel itself > is safe either way with retpoline, or because IBRS implies STIBP too > (that is, there's no difference between writing 1 and 3). > > So STIBP is purely about protecting userspace processes from one > another, and VM guests from one another, when they run on HT siblings. > > There's an argument that there are so many other information leaks > between HT siblings that we might not care. Especially as it's hard to > *tell* when you're scheduling, whether you trust all the processes (or > guests) on your HT siblings right now... let alone later when > scheduling another process if you need to *now* set STIBP on a sibling > which is no longer save from this process now running. > > I'm not sure we want to set STIBP *unconditionally* either because of > the performance implications. > > For IBRS we had an answer and it was just ugly. For STIBP we don't > actually have an answer for "how do we use this?". Do we? Not sure. Maybe to start, the answer might be to allow it to be set for the ultra-paranoid, but in general don't enable it by default. Having it enabled would be an alternative to someone deciding to disable SMT, since that would have even more of a performance impact. Thanks, Tom > >
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On 1/23/2018 10:20 AM, Woodhouse, David wrote: > On Tue, 2018-01-23 at 10:12 -0600, Tom Lendacky wrote: >> +.macro UNRESTRICT_IB_SPEC + ALTERNATIVE "jmp .Lskip_\@", "", X86_FEATURE_IBRS + PUSH_MSR_REGS + WRMSR_ASM $MSR_IA32_SPEC_CTRL, $0, $0 >>> >> I think you should be writing 2, not 0, since I'm reasonably >> confident that we want STIBP on. Can you explain why you're writing >> 0? >> >> Do we want to talk about STIBP in general? Should it be (yet another) >> boot option to enable or disable? If there is STIBP support without >> IBRS support, it could be a set and forget at boot time. > > We haven't got patches which enable STIBP in general. The kernel itself > is safe either way with retpoline, or because IBRS implies STIBP too > (that is, there's no difference between writing 1 and 3). > > So STIBP is purely about protecting userspace processes from one > another, and VM guests from one another, when they run on HT siblings. > > There's an argument that there are so many other information leaks > between HT siblings that we might not care. Especially as it's hard to > *tell* when you're scheduling, whether you trust all the processes (or > guests) on your HT siblings right now... let alone later when > scheduling another process if you need to *now* set STIBP on a sibling > which is no longer save from this process now running. > > I'm not sure we want to set STIBP *unconditionally* either because of > the performance implications. > > For IBRS we had an answer and it was just ugly. For STIBP we don't > actually have an answer for "how do we use this?". Do we? Not sure. Maybe to start, the answer might be to allow it to be set for the ultra-paranoid, but in general don't enable it by default. Having it enabled would be an alternative to someone deciding to disable SMT, since that would have even more of a performance impact. Thanks, Tom > >
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On Sun 2018-01-21 20:28:17, David Woodhouse wrote: > On Sun, 2018-01-21 at 11:34 -0800, Linus Torvalds wrote: > > All of this is pure garbage. > > > > Is Intel really planning on making this shit architectural? Has > > anybody talked to them and told them they are f*cking insane? > > > > Please, any Intel engineers here - talk to your managers. > > If the alternative was a two-decade product recall and giving everyone > free CPUs, I'm not sure it was entirely insane. > > Certainly it's a nasty hack, but hey — the world was on fire and in the > end we didn't have to just turn the datacentres off and go back to goat > farming, so it's not all bad. Well, someone at Intel put world on fire. And then was selling faulty CPUs for half a year while world was on fire; they knew they are faulty yet they sold them anyway. Then Intel talks about how great they are and how security is important for them Intentionaly confusing between Meltdown and Spectre so they can mask how badly they screwed. And without apologies. > As a hack for existing CPUs, it's just about tolerable — as long as it > can die entirely by the next generation. > > So the part is I think is odd is the IBRS_ALL feature, where a future > CPU will advertise "I am able to be not broken" and then you have to > set the IBRS bit once at boot time to *ask* it not to be broken. That > part is weird, because it ought to have been treated like the RDCL_NO > bit — just "you don't have to worry any more, it got better". And now Intel wants to cheat at benchmarks, to put companies that do right thing at disadvantage and thinks that that's okay because world was on fire? At this point, I believe that yes, product recall would be appropriate. If Intel is not willing to do it on their own, well, perhaps courts can force them. Ouch and I wound not mind some jail time for whoever is responsible for selling known-faulty CPUs to the public. Oh, and still no word about the real fixes. World is not only Linux, you see? https://pavelmachek.livejournal.com/140949.html?nojs=1 Best regards, Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On Sun 2018-01-21 20:28:17, David Woodhouse wrote: > On Sun, 2018-01-21 at 11:34 -0800, Linus Torvalds wrote: > > All of this is pure garbage. > > > > Is Intel really planning on making this shit architectural? Has > > anybody talked to them and told them they are f*cking insane? > > > > Please, any Intel engineers here - talk to your managers. > > If the alternative was a two-decade product recall and giving everyone > free CPUs, I'm not sure it was entirely insane. > > Certainly it's a nasty hack, but hey — the world was on fire and in the > end we didn't have to just turn the datacentres off and go back to goat > farming, so it's not all bad. Well, someone at Intel put world on fire. And then was selling faulty CPUs for half a year while world was on fire; they knew they are faulty yet they sold them anyway. Then Intel talks about how great they are and how security is important for them Intentionaly confusing between Meltdown and Spectre so they can mask how badly they screwed. And without apologies. > As a hack for existing CPUs, it's just about tolerable — as long as it > can die entirely by the next generation. > > So the part is I think is odd is the IBRS_ALL feature, where a future > CPU will advertise "I am able to be not broken" and then you have to > set the IBRS bit once at boot time to *ask* it not to be broken. That > part is weird, because it ought to have been treated like the RDCL_NO > bit — just "you don't have to worry any more, it got better". And now Intel wants to cheat at benchmarks, to put companies that do right thing at disadvantage and thinks that that's okay because world was on fire? At this point, I believe that yes, product recall would be appropriate. If Intel is not willing to do it on their own, well, perhaps courts can force them. Ouch and I wound not mind some jail time for whoever is responsible for selling known-faulty CPUs to the public. Oh, and still no word about the real fixes. World is not only Linux, you see? https://pavelmachek.livejournal.com/140949.html?nojs=1 Best regards, Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On Tue, 2018-01-23 at 10:12 -0600, Tom Lendacky wrote: > > >> +.macro UNRESTRICT_IB_SPEC > >> + ALTERNATIVE "jmp .Lskip_\@", "", X86_FEATURE_IBRS > >> + PUSH_MSR_REGS > >> + WRMSR_ASM $MSR_IA32_SPEC_CTRL, $0, $0 > > > I think you should be writing 2, not 0, since I'm reasonably > confident that we want STIBP on. Can you explain why you're writing > 0? > > Do we want to talk about STIBP in general? Should it be (yet another) > boot option to enable or disable? If there is STIBP support without > IBRS support, it could be a set and forget at boot time. We haven't got patches which enable STIBP in general. The kernel itself is safe either way with retpoline, or because IBRS implies STIBP too (that is, there's no difference between writing 1 and 3). So STIBP is purely about protecting userspace processes from one another, and VM guests from one another, when they run on HT siblings. There's an argument that there are so many other information leaks between HT siblings that we might not care. Especially as it's hard to *tell* when you're scheduling, whether you trust all the processes (or guests) on your HT siblings right now... let alone later when scheduling another process if you need to *now* set STIBP on a sibling which is no longer save from this process now running. I'm not sure we want to set STIBP *unconditionally* either because of the performance implications. For IBRS we had an answer and it was just ugly. For STIBP we don't actually have an answer for "how do we use this?". Do we? smime.p7s Description: S/MIME cryptographic signature
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On Tue, 2018-01-23 at 10:12 -0600, Tom Lendacky wrote: > > >> +.macro UNRESTRICT_IB_SPEC > >> + ALTERNATIVE "jmp .Lskip_\@", "", X86_FEATURE_IBRS > >> + PUSH_MSR_REGS > >> + WRMSR_ASM $MSR_IA32_SPEC_CTRL, $0, $0 > > > I think you should be writing 2, not 0, since I'm reasonably > confident that we want STIBP on. Can you explain why you're writing > 0? > > Do we want to talk about STIBP in general? Should it be (yet another) > boot option to enable or disable? If there is STIBP support without > IBRS support, it could be a set and forget at boot time. We haven't got patches which enable STIBP in general. The kernel itself is safe either way with retpoline, or because IBRS implies STIBP too (that is, there's no difference between writing 1 and 3). So STIBP is purely about protecting userspace processes from one another, and VM guests from one another, when they run on HT siblings. There's an argument that there are so many other information leaks between HT siblings that we might not care. Especially as it's hard to *tell* when you're scheduling, whether you trust all the processes (or guests) on your HT siblings right now... let alone later when scheduling another process if you need to *now* set STIBP on a sibling which is no longer save from this process now running. I'm not sure we want to set STIBP *unconditionally* either because of the performance implications. For IBRS we had an answer and it was just ugly. For STIBP we don't actually have an answer for "how do we use this?". Do we? smime.p7s Description: S/MIME cryptographic signature
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On 1/21/2018 1:14 PM, Andy Lutomirski wrote: > > >> On Jan 20, 2018, at 11:23 AM, KarimAllah Ahmedwrote: >> >> From: Tim Chen >> >> Create macros to control Indirect Branch Speculation. >> >> Name them so they reflect what they are actually doing. >> The macros are used to restrict and unrestrict the indirect branch >> speculation. >> They do not *disable* (or *enable*) indirect branch speculation. A trip back >> to >> user-space after *restricting* speculation would still affect the BTB. >> >> Quoting from a commit by Tim Chen: >> >> """ >>If IBRS is set, near returns and near indirect jumps/calls will not allow >>their predicted target address to be controlled by code that executed in a >>less privileged prediction mode *BEFORE* the IBRS mode was last written >> with >>a value of 1 or on another logical processor so long as all Return Stack >>Buffer (RSB) entries from the previous less privileged prediction mode are >>overwritten. >> >>Thus a near indirect jump/call/return may be affected by code in a less >>privileged prediction mode that executed *AFTER* IBRS mode was last >> written >>with a value of 1. >> """ >> >> [ tglx: Changed macro names and rewrote changelog ] >> [ karahmed: changed macro names *again* and rewrote changelog ] >> >> Signed-off-by: Tim Chen >> Signed-off-by: Thomas Gleixner >> Signed-off-by: KarimAllah Ahmed >> Cc: Andrea Arcangeli >> Cc: Andi Kleen >> Cc: Peter Zijlstra >> Cc: Greg KH >> Cc: Dave Hansen >> Cc: Andy Lutomirski >> Cc: Paolo Bonzini >> Cc: Dan Williams >> Cc: Arjan Van De Ven >> Cc: Linus Torvalds >> Cc: David Woodhouse >> Cc: Ashok Raj >> Link: >> https://lkml.kernel.org/r/3aab341725ee6a9aafd3141387453b45d788d61a.1515542293.git.tim.c.c...@linux.intel.com >> Signed-off-by: David Woodhouse >> --- >> arch/x86/entry/calling.h | 73 >> >> 1 file changed, 73 insertions(+) >> >> diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h >> index 3f48f69..5aafb51 100644 >> --- a/arch/x86/entry/calling.h >> +++ b/arch/x86/entry/calling.h >> @@ -6,6 +6,8 @@ >> #include >> #include >> #include >> +#include >> +#include >> >> /* >> >> @@ -349,3 +351,74 @@ For 32-bit we have the following conventions - kernel >> is built with >> .Lafter_call_\@: >> #endif >> .endm >> + >> +/* >> + * IBRS related macros >> + */ >> +.macro PUSH_MSR_REGS >> +pushq%rax >> +pushq%rcx >> +pushq%rdx >> +.endm >> + >> +.macro POP_MSR_REGS >> +popq%rdx >> +popq%rcx >> +popq%rax >> +.endm >> + >> +.macro WRMSR_ASM msr_nr:req edx_val:req eax_val:req >> +movl\msr_nr, %ecx >> +movl\edx_val, %edx >> +movl\eax_val, %eax >> +wrmsr >> +.endm >> + >> +.macro RESTRICT_IB_SPEC >> +ALTERNATIVE "jmp .Lskip_\@", "", X86_FEATURE_IBRS >> +PUSH_MSR_REGS >> +WRMSR_ASM $MSR_IA32_SPEC_CTRL, $0, $SPEC_CTRL_IBRS >> +POP_MSR_REGS >> +.Lskip_\@: >> +.endm >> + >> +.macro UNRESTRICT_IB_SPEC >> +ALTERNATIVE "jmp .Lskip_\@", "", X86_FEATURE_IBRS >> +PUSH_MSR_REGS >> +WRMSR_ASM $MSR_IA32_SPEC_CTRL, $0, $0 > > I think you should be writing 2, not 0, since I'm reasonably confident that > we want STIBP on. Can you explain why you're writing 0? Do we want to talk about STIBP in general? Should it be (yet another) boot option to enable or disable? If there is STIBP support without IBRS support, it could be a set and forget at boot time. Thanks, Tom > > Also, holy cow, there are so many macros here. > > And a meta question: why are there so many submitters of the same series? >
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On 1/21/2018 1:14 PM, Andy Lutomirski wrote: > > >> On Jan 20, 2018, at 11:23 AM, KarimAllah Ahmed wrote: >> >> From: Tim Chen >> >> Create macros to control Indirect Branch Speculation. >> >> Name them so they reflect what they are actually doing. >> The macros are used to restrict and unrestrict the indirect branch >> speculation. >> They do not *disable* (or *enable*) indirect branch speculation. A trip back >> to >> user-space after *restricting* speculation would still affect the BTB. >> >> Quoting from a commit by Tim Chen: >> >> """ >>If IBRS is set, near returns and near indirect jumps/calls will not allow >>their predicted target address to be controlled by code that executed in a >>less privileged prediction mode *BEFORE* the IBRS mode was last written >> with >>a value of 1 or on another logical processor so long as all Return Stack >>Buffer (RSB) entries from the previous less privileged prediction mode are >>overwritten. >> >>Thus a near indirect jump/call/return may be affected by code in a less >>privileged prediction mode that executed *AFTER* IBRS mode was last >> written >>with a value of 1. >> """ >> >> [ tglx: Changed macro names and rewrote changelog ] >> [ karahmed: changed macro names *again* and rewrote changelog ] >> >> Signed-off-by: Tim Chen >> Signed-off-by: Thomas Gleixner >> Signed-off-by: KarimAllah Ahmed >> Cc: Andrea Arcangeli >> Cc: Andi Kleen >> Cc: Peter Zijlstra >> Cc: Greg KH >> Cc: Dave Hansen >> Cc: Andy Lutomirski >> Cc: Paolo Bonzini >> Cc: Dan Williams >> Cc: Arjan Van De Ven >> Cc: Linus Torvalds >> Cc: David Woodhouse >> Cc: Ashok Raj >> Link: >> https://lkml.kernel.org/r/3aab341725ee6a9aafd3141387453b45d788d61a.1515542293.git.tim.c.c...@linux.intel.com >> Signed-off-by: David Woodhouse >> --- >> arch/x86/entry/calling.h | 73 >> >> 1 file changed, 73 insertions(+) >> >> diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h >> index 3f48f69..5aafb51 100644 >> --- a/arch/x86/entry/calling.h >> +++ b/arch/x86/entry/calling.h >> @@ -6,6 +6,8 @@ >> #include >> #include >> #include >> +#include >> +#include >> >> /* >> >> @@ -349,3 +351,74 @@ For 32-bit we have the following conventions - kernel >> is built with >> .Lafter_call_\@: >> #endif >> .endm >> + >> +/* >> + * IBRS related macros >> + */ >> +.macro PUSH_MSR_REGS >> +pushq%rax >> +pushq%rcx >> +pushq%rdx >> +.endm >> + >> +.macro POP_MSR_REGS >> +popq%rdx >> +popq%rcx >> +popq%rax >> +.endm >> + >> +.macro WRMSR_ASM msr_nr:req edx_val:req eax_val:req >> +movl\msr_nr, %ecx >> +movl\edx_val, %edx >> +movl\eax_val, %eax >> +wrmsr >> +.endm >> + >> +.macro RESTRICT_IB_SPEC >> +ALTERNATIVE "jmp .Lskip_\@", "", X86_FEATURE_IBRS >> +PUSH_MSR_REGS >> +WRMSR_ASM $MSR_IA32_SPEC_CTRL, $0, $SPEC_CTRL_IBRS >> +POP_MSR_REGS >> +.Lskip_\@: >> +.endm >> + >> +.macro UNRESTRICT_IB_SPEC >> +ALTERNATIVE "jmp .Lskip_\@", "", X86_FEATURE_IBRS >> +PUSH_MSR_REGS >> +WRMSR_ASM $MSR_IA32_SPEC_CTRL, $0, $0 > > I think you should be writing 2, not 0, since I'm reasonably confident that > we want STIBP on. Can you explain why you're writing 0? Do we want to talk about STIBP in general? Should it be (yet another) boot option to enable or disable? If there is STIBP support without IBRS support, it could be a set and forget at boot time. Thanks, Tom > > Also, holy cow, there are so many macros here. > > And a meta question: why are there so many submitters of the same series? >
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On 01/23/2018 01:27 AM, Ingo Molnar wrote: > > - All asynchronous contexts (IRQs, NMIs, etc.) stuff the RSB before IRET. > (The >tracking could probably made IRQ and maybe even NMI safe, but the > worst-case >nesting scenarios make my head ache.) This all sounds totally workable to me. We talked about using ftrace itself to track call depth, but it would be unusable in production, of course. This seems workable, though. You're also totally right about the zero overhead on most kernels with it turned off when we don't need RSB underflow protection (basically pre-Skylake). I also agree that the safe thing to do is to just stuff before iret. I bet we can get a ftrace-driven RSB tracker working precisely enough even with NMIs, but it's way simpler to just stuff and be done with it for now.
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On 01/23/2018 01:27 AM, Ingo Molnar wrote: > > - All asynchronous contexts (IRQs, NMIs, etc.) stuff the RSB before IRET. > (The >tracking could probably made IRQ and maybe even NMI safe, but the > worst-case >nesting scenarios make my head ache.) This all sounds totally workable to me. We talked about using ftrace itself to track call depth, but it would be unusable in production, of course. This seems workable, though. You're also totally right about the zero overhead on most kernels with it turned off when we don't need RSB underflow protection (basically pre-Skylake). I also agree that the safe thing to do is to just stuff before iret. I bet we can get a ftrace-driven RSB tracker working precisely enough even with NMIs, but it's way simpler to just stuff and be done with it for now.
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
- dw...@infradead.org wrote: > On Sun, 2018-01-21 at 14:27 -0800, Linus Torvalds wrote: > > On Sun, Jan 21, 2018 at 2:00 PM, David Woodhouse >wrote: > > >> > > >> The patches do things like add the garbage MSR writes to the > kernel > > >> entry/exit points. That's insane. That says "we're trying to > protect > > >> the kernel". We already have retpoline there, with less > overhead. > > > > > > You're looking at IBRS usage, not IBPB. They are different > things. > > > > Ehh. Odd intel naming detail. > > > > If you look at this series, it very much does that kernel > entry/exit > > stuff. It was patch 10/10, iirc. In fact, the patch I was replying > to > > was explicitly setting that garbage up. > > > > And I really don't want to see these garbage patches just > mindlessly > > sent around. > > I think we've covered the technical part of this now, not that you > like > it — not that any of us *like* it. But since the peanut gallery is > paying lots of attention it's probably worth explaining it a little > more for their benefit. > > This is all about Spectre variant 2, where the CPU can be tricked > into > mispredicting the target of an indirect branch. And I'm specifically > looking at what we can do on *current* hardware, where we're limited > to > the hacks they can manage to add in the microcode. > > The new microcode from Intel and AMD adds three new features. > > One new feature (IBPB) is a complete barrier for branch prediction. > After frobbing this, no branch targets learned earlier are going to > be > used. It's kind of expensive (order of magnitude ~4000 cycles). > > The second (STIBP) protects a hyperthread sibling from following > branch > predictions which were learned on another sibling. You *might* want > this when running unrelated processes in userspace, for example. Or > different VM guests running on HT siblings. > > The third feature (IBRS) is more complicated. It's designed to be > set when you enter a more privileged execution mode (i.e. the > kernel). > It prevents branch targets learned in a less-privileged execution > mode, > BEFORE IT WAS MOST RECENTLY SET, from taking effect. But it's not > just > a 'set-and-forget' feature, it also has barrier-like semantics and > needs to be set on *each* entry into the kernel (from userspace or a > VM > guest). It's *also* expensive. And a vile hack, but for a while it > was > the only option we had. > > Even with IBRS, the CPU cannot tell the difference between different > userspace processes, and between different VM guests. So in addition > to > IBRS to protect the kernel, we need the full IBPB barrier on context > switch and vmexit. And maybe STIBP while they're running. > > Then along came Paul with the cunning plan of "oh, indirect branches > can be exploited? Screw it, let's not have any of *those* then", > which > is retpoline. And it's a *lot* faster than frobbing IBRS on every > entry > into the kernel. It's a massive performance win. > > So now we *mostly* don't need IBRS. We build with retpoline, use IBPB > on context switches/vmexit (which is in the first part of this patch > series before IBRS is added), and we're safe. We even refactored the > patch series to put retpoline first. > > But wait, why did I say "mostly"? Well, not everyone has a retpoline > compiler yet... but OK, screw them; they need to update. > > Then there's Skylake, and that generation of CPU cores. For > complicated > reasons they actually end up being vulnerable not just on indirect > branches, but also on a 'ret' in some circumstances (such as 16+ > CALLs > in a deep chain). > > The IBRS solution, ugly though it is, did address that. Retpoline > doesn't. There are patches being floated to detect and prevent deep > stacks, and deal with some of the other special cases that bite on > SKL, > but those are icky too. And in fact IBRS performance isn't anywhere > near as bad on this generation of CPUs as it is on earlier CPUs > *anyway*, which makes it not quite so insane to *contemplate* using > it > as Intel proposed. > > That's why my initial idea, as implemented in this RFC patchset, was > to > stick with IBRS on Skylake, and use retpoline everywhere else. I'll > give you "garbage patches", but they weren't being "just mindlessly > sent around". If we're going to drop IBRS support and accept the > caveats, then let's do it as a conscious decision having seen what it > would look like, not just drop it quietly because poor Davey is too > scared that Linus might shout at him again. :) > > I have seen *hand-wavy* analyses of the Skylake thing that mean I'm > not > actually lying awake at night fretting about it, but nothing concrete > that really says it's OK. > > If you view retpoline as a performance optimisation, which is how it > first arrived, then it's rather unconventional to say "well, it only > opens a *little* bit of a security hole but it does go nice and fast > so > let's do it". > > But fine, I'm
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
- dw...@infradead.org wrote: > On Sun, 2018-01-21 at 14:27 -0800, Linus Torvalds wrote: > > On Sun, Jan 21, 2018 at 2:00 PM, David Woodhouse > wrote: > > >> > > >> The patches do things like add the garbage MSR writes to the > kernel > > >> entry/exit points. That's insane. That says "we're trying to > protect > > >> the kernel". We already have retpoline there, with less > overhead. > > > > > > You're looking at IBRS usage, not IBPB. They are different > things. > > > > Ehh. Odd intel naming detail. > > > > If you look at this series, it very much does that kernel > entry/exit > > stuff. It was patch 10/10, iirc. In fact, the patch I was replying > to > > was explicitly setting that garbage up. > > > > And I really don't want to see these garbage patches just > mindlessly > > sent around. > > I think we've covered the technical part of this now, not that you > like > it — not that any of us *like* it. But since the peanut gallery is > paying lots of attention it's probably worth explaining it a little > more for their benefit. > > This is all about Spectre variant 2, where the CPU can be tricked > into > mispredicting the target of an indirect branch. And I'm specifically > looking at what we can do on *current* hardware, where we're limited > to > the hacks they can manage to add in the microcode. > > The new microcode from Intel and AMD adds three new features. > > One new feature (IBPB) is a complete barrier for branch prediction. > After frobbing this, no branch targets learned earlier are going to > be > used. It's kind of expensive (order of magnitude ~4000 cycles). > > The second (STIBP) protects a hyperthread sibling from following > branch > predictions which were learned on another sibling. You *might* want > this when running unrelated processes in userspace, for example. Or > different VM guests running on HT siblings. > > The third feature (IBRS) is more complicated. It's designed to be > set when you enter a more privileged execution mode (i.e. the > kernel). > It prevents branch targets learned in a less-privileged execution > mode, > BEFORE IT WAS MOST RECENTLY SET, from taking effect. But it's not > just > a 'set-and-forget' feature, it also has barrier-like semantics and > needs to be set on *each* entry into the kernel (from userspace or a > VM > guest). It's *also* expensive. And a vile hack, but for a while it > was > the only option we had. > > Even with IBRS, the CPU cannot tell the difference between different > userspace processes, and between different VM guests. So in addition > to > IBRS to protect the kernel, we need the full IBPB barrier on context > switch and vmexit. And maybe STIBP while they're running. > > Then along came Paul with the cunning plan of "oh, indirect branches > can be exploited? Screw it, let's not have any of *those* then", > which > is retpoline. And it's a *lot* faster than frobbing IBRS on every > entry > into the kernel. It's a massive performance win. > > So now we *mostly* don't need IBRS. We build with retpoline, use IBPB > on context switches/vmexit (which is in the first part of this patch > series before IBRS is added), and we're safe. We even refactored the > patch series to put retpoline first. > > But wait, why did I say "mostly"? Well, not everyone has a retpoline > compiler yet... but OK, screw them; they need to update. > > Then there's Skylake, and that generation of CPU cores. For > complicated > reasons they actually end up being vulnerable not just on indirect > branches, but also on a 'ret' in some circumstances (such as 16+ > CALLs > in a deep chain). > > The IBRS solution, ugly though it is, did address that. Retpoline > doesn't. There are patches being floated to detect and prevent deep > stacks, and deal with some of the other special cases that bite on > SKL, > but those are icky too. And in fact IBRS performance isn't anywhere > near as bad on this generation of CPUs as it is on earlier CPUs > *anyway*, which makes it not quite so insane to *contemplate* using > it > as Intel proposed. > > That's why my initial idea, as implemented in this RFC patchset, was > to > stick with IBRS on Skylake, and use retpoline everywhere else. I'll > give you "garbage patches", but they weren't being "just mindlessly > sent around". If we're going to drop IBRS support and accept the > caveats, then let's do it as a conscious decision having seen what it > would look like, not just drop it quietly because poor Davey is too > scared that Linus might shout at him again. :) > > I have seen *hand-wavy* analyses of the Skylake thing that mean I'm > not > actually lying awake at night fretting about it, but nothing concrete > that really says it's OK. > > If you view retpoline as a performance optimisation, which is how it > first arrived, then it's rather unconventional to say "well, it only > opens a *little* bit of a security hole but it does go nice and fast > so > let's do it". > > But fine, I'm content with ditching the
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On Tue, 2018-01-23 at 11:44 +0100, Ingo Molnar wrote: > * David Woodhousewrote: > > Hm? We still have GCC emitting 'call __fentry__' don't we? Would be nice to > > get > > to the point where we can patch *that* out into a NOP... or are you saying > > we > > already can? > Yes, we already can and do patch the 'call __fentry__/ mcount' call site into > a > NOP today - all 50,000+ call sites on a typical distro kernel. > > We did so for a long time - this is all a well established, working mechanism. That's neat; I'd missed that. > > But this is a digression. I was being pedantic about the "0 cycles" but > > sure, > > this would be perfectly tolerable. > It's not a digression in two ways: > > - I wanted to make it clear that for distro kernels it _is_ a zero cycles > overhead > mechanism for non-SkyLake CPUs, literally. > > - I noticed that Meltdown and the CR3 writes for PTI appears to have > established a > kind of ... insensitivity and numbness to kernel micro-costs, which peaked > with > the per-syscall MSR write nonsense patch of the SkyLake workaround. > That attitude is totally unacceptable to me as x86 maintainer and yes, still > every cycle counts. Yeah, absolutely. But here we're talking about the overhead on non-SKL, and on non-SKL the IBRS overhead is zero too (well, again not precisely zero because it turns into NOPs). You're absolutely right that we shouldn't stop counting cycles. I've already noted that on SKL IBRS is actually a lot faster than on earlier generations, and we also get back some of the overhead by turning the retpoline into a bare jmp again. We haven't *forgotten* about performance. I'd like to see your solution once the details are sorted out, and see proper benchmarks — both microbenchmarks and real workloads — comparing the two. And then make a reasoned decision based on that, and on how happy we are with the theoretical holes that your solution leaves, in the cold light of day. We should also look at whether we want to set STIBP too, which is somewhat orthogonal to using IBRS to protect the kernel, and could end up with some of the same MSR writes (at least setting to zero) on some of the same code paths. smime.p7s Description: S/MIME cryptographic signature
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On Tue, 2018-01-23 at 11:44 +0100, Ingo Molnar wrote: > * David Woodhouse wrote: > > Hm? We still have GCC emitting 'call __fentry__' don't we? Would be nice to > > get > > to the point where we can patch *that* out into a NOP... or are you saying > > we > > already can? > Yes, we already can and do patch the 'call __fentry__/ mcount' call site into > a > NOP today - all 50,000+ call sites on a typical distro kernel. > > We did so for a long time - this is all a well established, working mechanism. That's neat; I'd missed that. > > But this is a digression. I was being pedantic about the "0 cycles" but > > sure, > > this would be perfectly tolerable. > It's not a digression in two ways: > > - I wanted to make it clear that for distro kernels it _is_ a zero cycles > overhead > mechanism for non-SkyLake CPUs, literally. > > - I noticed that Meltdown and the CR3 writes for PTI appears to have > established a > kind of ... insensitivity and numbness to kernel micro-costs, which peaked > with > the per-syscall MSR write nonsense patch of the SkyLake workaround. > That attitude is totally unacceptable to me as x86 maintainer and yes, still > every cycle counts. Yeah, absolutely. But here we're talking about the overhead on non-SKL, and on non-SKL the IBRS overhead is zero too (well, again not precisely zero because it turns into NOPs). You're absolutely right that we shouldn't stop counting cycles. I've already noted that on SKL IBRS is actually a lot faster than on earlier generations, and we also get back some of the overhead by turning the retpoline into a bare jmp again. We haven't *forgotten* about performance. I'd like to see your solution once the details are sorted out, and see proper benchmarks — both microbenchmarks and real workloads — comparing the two. And then make a reasoned decision based on that, and on how happy we are with the theoretical holes that your solution leaves, in the cold light of day. We should also look at whether we want to set STIBP too, which is somewhat orthogonal to using IBRS to protect the kernel, and could end up with some of the same MSR writes (at least setting to zero) on some of the same code paths. smime.p7s Description: S/MIME cryptographic signature
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
* David Woodhousewrote: > On Tue, 2018-01-23 at 11:15 +0100, Ingo Molnar wrote: > > > > BTW., the reason this is enabled on all distro kernels is because the > > overhead > > is a single patched-in NOP instruction in the function epilogue, when > > tracing > > is disabled. So it's not even a CALL+RET - it's a patched in NOP. > > Hm? We still have GCC emitting 'call __fentry__' don't we? Would be nice to > get > to the point where we can patch *that* out into a NOP... or are you saying we > already can? Yes, we already can and do patch the 'call __fentry__/ mcount' call site into a NOP today - all 50,000+ call sites on a typical distro kernel. We did so for a long time - this is all a well established, working mechanism. > But this is a digression. I was being pedantic about the "0 cycles" but sure, > this would be perfectly tolerable. It's not a digression in two ways: - I wanted to make it clear that for distro kernels it _is_ a zero cycles overhead mechanism for non-SkyLake CPUs, literally. - I noticed that Meltdown and the CR3 writes for PTI appears to have established a kind of ... insensitivity and numbness to kernel micro-costs, which peaked with the per-syscall MSR write nonsense patch of the SkyLake workaround. That attitude is totally unacceptable to me as x86 maintainer and yes, still every cycle counts. Thanks, Ingo
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
* David Woodhouse wrote: > On Tue, 2018-01-23 at 11:15 +0100, Ingo Molnar wrote: > > > > BTW., the reason this is enabled on all distro kernels is because the > > overhead > > is a single patched-in NOP instruction in the function epilogue, when > > tracing > > is disabled. So it's not even a CALL+RET - it's a patched in NOP. > > Hm? We still have GCC emitting 'call __fentry__' don't we? Would be nice to > get > to the point where we can patch *that* out into a NOP... or are you saying we > already can? Yes, we already can and do patch the 'call __fentry__/ mcount' call site into a NOP today - all 50,000+ call sites on a typical distro kernel. We did so for a long time - this is all a well established, working mechanism. > But this is a digression. I was being pedantic about the "0 cycles" but sure, > this would be perfectly tolerable. It's not a digression in two ways: - I wanted to make it clear that for distro kernels it _is_ a zero cycles overhead mechanism for non-SkyLake CPUs, literally. - I noticed that Meltdown and the CR3 writes for PTI appears to have established a kind of ... insensitivity and numbness to kernel micro-costs, which peaked with the per-syscall MSR write nonsense patch of the SkyLake workaround. That attitude is totally unacceptable to me as x86 maintainer and yes, still every cycle counts. Thanks, Ingo
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On Tue, 2018-01-23 at 11:23 +0100, Ingo Molnar wrote: > * David Woodhousewrote: > > > > > > > > > On SkyLake this would add an overhead of maybe 2-3 cycles per function > > > call and > > > obviously all this code and data would be very cache hot. Given that the > > > average > > > number of function calls per system call is around a dozen, this would be > > > _much_ > > > faster than any microcode/MSR based approach. > > That's kind of neat, except you don't want it at the top of the > > function; you want it at the bottom. > > > > If you could hijack the *return* site, then you could check for > > underflow and stuff the RSB right there. But in __fentry__ there's not > > a lot you can do other than complain that something bad is going to > > happen in the future. You know that a string of 16+ rets is going to > > happen, but you've got no gadget in *there* to deal with it when it > > does. > > No, it can be done with the existing CALL instrumentation callback that > CONFIG_DYNAMIC_FTRACE=y provides, by pushing a RET trampoline on the stack > from > the CALL trampoline - see my previous email. Yes, that's a neat solution. > > > > HJ did have patches to turn 'ret' into a form of retpoline, which I > > don't think ever even got performance-tested. > Return instrumentation is possible as well, but there are two major drawbacks: > > - GCC support for it is not as widely available and return instrumentation > is > less tested in Linux kernel contexts Hey, we're *already* making people upgrade their compiler, and HJ apparently never sleeps. So don't actually be held back too much by that consideration. If it could be better done with GCC help, we really *can* explore that. > - a major point of my suggestion is that CONFIG_DYNAMIC_FTRACE=y is already > enabled in distros here and today, so the runtime overhead to non-SkyLake > CPUs > would be literally zero, while still allowing to fix the RSB vulnerability > on > SkyLake. Sure. You still have a few holes to fix (or declare acceptable) to bring it to the full coverage of the IBRS solution, and it's still possible that by the time it's complete it's approaching the ick factor of IBRS, but I'd love to see it. smime.p7s Description: S/MIME cryptographic signature
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On Tue, 2018-01-23 at 11:23 +0100, Ingo Molnar wrote: > * David Woodhouse wrote: > > > > > > > > > On SkyLake this would add an overhead of maybe 2-3 cycles per function > > > call and > > > obviously all this code and data would be very cache hot. Given that the > > > average > > > number of function calls per system call is around a dozen, this would be > > > _much_ > > > faster than any microcode/MSR based approach. > > That's kind of neat, except you don't want it at the top of the > > function; you want it at the bottom. > > > > If you could hijack the *return* site, then you could check for > > underflow and stuff the RSB right there. But in __fentry__ there's not > > a lot you can do other than complain that something bad is going to > > happen in the future. You know that a string of 16+ rets is going to > > happen, but you've got no gadget in *there* to deal with it when it > > does. > > No, it can be done with the existing CALL instrumentation callback that > CONFIG_DYNAMIC_FTRACE=y provides, by pushing a RET trampoline on the stack > from > the CALL trampoline - see my previous email. Yes, that's a neat solution. > > > > HJ did have patches to turn 'ret' into a form of retpoline, which I > > don't think ever even got performance-tested. > Return instrumentation is possible as well, but there are two major drawbacks: > > - GCC support for it is not as widely available and return instrumentation > is > less tested in Linux kernel contexts Hey, we're *already* making people upgrade their compiler, and HJ apparently never sleeps. So don't actually be held back too much by that consideration. If it could be better done with GCC help, we really *can* explore that. > - a major point of my suggestion is that CONFIG_DYNAMIC_FTRACE=y is already > enabled in distros here and today, so the runtime overhead to non-SkyLake > CPUs > would be literally zero, while still allowing to fix the RSB vulnerability > on > SkyLake. Sure. You still have a few holes to fix (or declare acceptable) to bring it to the full coverage of the IBRS solution, and it's still possible that by the time it's complete it's approaching the ick factor of IBRS, but I'd love to see it. smime.p7s Description: S/MIME cryptographic signature
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On Tue, 2018-01-23 at 11:15 +0100, Ingo Molnar wrote: > > BTW., the reason this is enabled on all distro kernels is because the > overhead is > a single patched-in NOP instruction in the function epilogue, when tracing is > disabled. So it's not even a CALL+RET - it's a patched in NOP. Hm? We still have GCC emitting 'call __fentry__' don't we? Would be nice to get to the point where we can patch *that* out into a NOP... or are you saying we already can? But this is a digression. I was being pedantic about the "0 cycles" but sure, this would be perfectly tolerable. smime.p7s Description: S/MIME cryptographic signature
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On Tue, 2018-01-23 at 11:15 +0100, Ingo Molnar wrote: > > BTW., the reason this is enabled on all distro kernels is because the > overhead is > a single patched-in NOP instruction in the function epilogue, when tracing is > disabled. So it's not even a CALL+RET - it's a patched in NOP. Hm? We still have GCC emitting 'call __fentry__' don't we? Would be nice to get to the point where we can patch *that* out into a NOP... or are you saying we already can? But this is a digression. I was being pedantic about the "0 cycles" but sure, this would be perfectly tolerable. smime.p7s Description: S/MIME cryptographic signature
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
* David Woodhousewrote: > > On SkyLake this would add an overhead of maybe 2-3 cycles per function call > > and > > obviously all this code and data would be very cache hot. Given that the > > average > > number of function calls per system call is around a dozen, this would be > > _much_ > > faster than any microcode/MSR based approach. > > That's kind of neat, except you don't want it at the top of the > function; you want it at the bottom. > > If you could hijack the *return* site, then you could check for > underflow and stuff the RSB right there. But in __fentry__ there's not > a lot you can do other than complain that something bad is going to > happen in the future. You know that a string of 16+ rets is going to > happen, but you've got no gadget in *there* to deal with it when it > does. No, it can be done with the existing CALL instrumentation callback that CONFIG_DYNAMIC_FTRACE=y provides, by pushing a RET trampoline on the stack from the CALL trampoline - see my previous email. > HJ did have patches to turn 'ret' into a form of retpoline, which I > don't think ever even got performance-tested. Return instrumentation is possible as well, but there are two major drawbacks: - GCC support for it is not as widely available and return instrumentation is less tested in Linux kernel contexts - a major point of my suggestion is that CONFIG_DYNAMIC_FTRACE=y is already enabled in distros here and today, so the runtime overhead to non-SkyLake CPUs would be literally zero, while still allowing to fix the RSB vulnerability on SkyLake. Thanks, Ingo
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
* David Woodhouse wrote: > > On SkyLake this would add an overhead of maybe 2-3 cycles per function call > > and > > obviously all this code and data would be very cache hot. Given that the > > average > > number of function calls per system call is around a dozen, this would be > > _much_ > > faster than any microcode/MSR based approach. > > That's kind of neat, except you don't want it at the top of the > function; you want it at the bottom. > > If you could hijack the *return* site, then you could check for > underflow and stuff the RSB right there. But in __fentry__ there's not > a lot you can do other than complain that something bad is going to > happen in the future. You know that a string of 16+ rets is going to > happen, but you've got no gadget in *there* to deal with it when it > does. No, it can be done with the existing CALL instrumentation callback that CONFIG_DYNAMIC_FTRACE=y provides, by pushing a RET trampoline on the stack from the CALL trampoline - see my previous email. > HJ did have patches to turn 'ret' into a form of retpoline, which I > don't think ever even got performance-tested. Return instrumentation is possible as well, but there are two major drawbacks: - GCC support for it is not as widely available and return instrumentation is less tested in Linux kernel contexts - a major point of my suggestion is that CONFIG_DYNAMIC_FTRACE=y is already enabled in distros here and today, so the runtime overhead to non-SkyLake CPUs would be literally zero, while still allowing to fix the RSB vulnerability on SkyLake. Thanks, Ingo