Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-05-03 Thread Borislav Petkov
On Thu, Apr 30, 2015 at 02:39:07PM -0700, H. Peter Anvin wrote: > This is the microbenchmark I used. > > For the record, Intel's intention going forward is that 0F 1F will > always be as fast or faster than any other alternative. It looks like this is the case on AMD too. So I took your

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-05-03 Thread Borislav Petkov
On Thu, Apr 30, 2015 at 02:39:07PM -0700, H. Peter Anvin wrote: This is the microbenchmark I used. For the record, Intel's intention going forward is that 0F 1F will always be as fast or faster than any other alternative. It looks like this is the case on AMD too. So I took your benchmark

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-05-01 Thread Borislav Petkov
On Thu, Apr 30, 2015 at 04:23:26PM -0700, H. Peter Anvin wrote: > I probably should have added that the microbenchmark specifically tests > for an atomic 5-byte NOP (as required by tracepoints etc.) If the > requirement for 5-byte atomic is dropped there might be faster > combinations, e.g. 66 66

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-05-01 Thread Borislav Petkov
On Thu, Apr 30, 2015 at 04:23:26PM -0700, H. Peter Anvin wrote: I probably should have added that the microbenchmark specifically tests for an atomic 5-byte NOP (as required by tracepoints etc.) If the requirement for 5-byte atomic is dropped there might be faster combinations, e.g. 66 66 66

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-30 Thread H. Peter Anvin
On 04/30/2015 02:39 PM, H. Peter Anvin wrote: > This is the microbenchmark I used. > > For the record, Intel's intention going forward is that 0F 1F will > always be as fast or faster than any other alternative. > I probably should have added that the microbenchmark specifically tests for an

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-30 Thread H. Peter Anvin
This is the microbenchmark I used. For the record, Intel's intention going forward is that 0F 1F will always be as fast or faster than any other alternative. -hpa #define _GNU_SOURCE #include <stdio.h> #include <stdlib.h> #include <time.h> #include <stdbool.h> #include <sys/time.h> static void nop_p6(void) { asm volatile(".rept

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-30 Thread H. Peter Anvin
This is the microbenchmark I used. For the record, Intel's intention going forward is that 0F 1F will always be as fast or faster than any other alternative. -hpa #define _GNU_SOURCE #include <stdio.h> #include <stdlib.h> #include <time.h> #include <stdbool.h> #include <sys/time.h> static void

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-30 Thread H. Peter Anvin
On 04/30/2015 02:39 PM, H. Peter Anvin wrote: This is the microbenchmark I used. For the record, Intel's intention going forward is that 0F 1F will always be as fast or faster than any other alternative. I probably should have added that the microbenchmark specifically tests for an atomic

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-28 Thread Borislav Petkov
On Tue, Apr 28, 2015 at 10:16:33AM -0700, Linus Torvalds wrote: > I suspect it might be related to things like getting performance > counters and instruction debug traps etc right. There are quite > possibly also simply constraints where the front end has to generate > *something* just to keep the

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-28 Thread Linus Torvalds
On Tue, Apr 28, 2015 at 9:58 AM, Borislav Petkov wrote: > > Well, AFAIK, NOPs do require resources for tracking in the machine. I > was hoping that hw would be smarter and discard at decode time but there > probably are reasons that it can't be done (...yet). I suspect it might be related to

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-28 Thread Borislav Petkov
On Tue, Apr 28, 2015 at 09:28:52AM -0700, Linus Torvalds wrote: > On Tue, Apr 28, 2015 at 8:55 AM, Borislav Petkov wrote: > > > > Provided it is correct, it shows that the 0x66-prefixed 3-byte NOPs are > > better than the 0F 1F 00 suggested by the manual (Haha!): > > That's which AMD CPU? F16h.

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-28 Thread Linus Torvalds
On Tue, Apr 28, 2015 at 8:55 AM, Borislav Petkov wrote: > > Provided it is correct, it shows that the 0x66-prefixed 3-byte NOPs are > better than the 0F 1F 00 suggested by the manual (Haha!): That's which AMD CPU? On my intel i7-4770S, they are the same cost (I cut down your loop numbers by an

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-28 Thread Borislav Petkov
On Mon, Apr 27, 2015 at 01:14:51PM -0700, H. Peter Anvin wrote: > I did a microbenchmark in user space... let's see if I can find it. How about the simple one below? Provided it is correct, it shows that the 0x66-prefixed 3-byte NOPs are better than the 0F 1F 00 suggested by the manual (Haha!):

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-28 Thread Borislav Petkov
On Mon, Apr 27, 2015 at 09:45:12PM +0200, Borislav Petkov wrote: > > Maybe you are measuring random noise. > > Yeah. Last exercise tomorrow. Let's see what those numbers would look > like. Right, so with Mel's help, I did a simple microbenchmark to measure how many cycles a syscall (getpid())

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-28 Thread Borislav Petkov
On Mon, Apr 27, 2015 at 09:45:12PM +0200, Borislav Petkov wrote: Maybe you are measuring random noise. Yeah. Last exercise tomorrow. Let's see what those numbers would look like. Right, so with Mel's help, I did a simple microbenchmark to measure how many cycles a syscall (getpid()) needs

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-28 Thread Borislav Petkov
On Mon, Apr 27, 2015 at 01:14:51PM -0700, H. Peter Anvin wrote: I did a microbenchmark in user space... let's see if I can find it. How about the simple one below? Provided it is correct, it shows that the 0x66-prefixed 3-byte NOPs are better than the 0F 1F 00 suggested by the manual (Haha!):

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-28 Thread Linus Torvalds
On Tue, Apr 28, 2015 at 8:55 AM, Borislav Petkov b...@alien8.de wrote: Provided it is correct, it shows that the 0x66-prefixed 3-byte NOPs are better than the 0F 1F 00 suggested by the manual (Haha!): That's which AMD CPU? On my intel i7-4770S, they are the same cost (I cut down your loop

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-28 Thread Borislav Petkov
On Tue, Apr 28, 2015 at 09:28:52AM -0700, Linus Torvalds wrote: On Tue, Apr 28, 2015 at 8:55 AM, Borislav Petkov b...@alien8.de wrote: Provided it is correct, it shows that the 0x66-prefixed 3-byte NOPs are better than the 0F 1F 00 suggested by the manual (Haha!): That's which AMD CPU?

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-28 Thread Linus Torvalds
On Tue, Apr 28, 2015 at 9:58 AM, Borislav Petkov b...@alien8.de wrote: Well, AFAIK, NOPs do require resources for tracking in the machine. I was hoping that hw would be smarter and discard at decode time but there probably are reasons that it can't be done (...yet). I suspect it might be

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-28 Thread Borislav Petkov
On Tue, Apr 28, 2015 at 10:16:33AM -0700, Linus Torvalds wrote: I suspect it might be related to things like getting performance counters and instruction debug traps etc right. There are quite possibly also simply constraints where the front end has to generate *something* just to keep the

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread H. Peter Anvin
I did a microbenchmark in user space... let's see if I can find it. On April 27, 2015 1:03:29 PM PDT, Borislav Petkov wrote: >On Mon, Apr 27, 2015 at 12:59:11PM -0700, H. Peter Anvin wrote: >> It really comes down to this: it seems older cores from both Intel >> and AMD perform better with 66 66

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread Borislav Petkov
On Mon, Apr 27, 2015 at 12:59:11PM -0700, H. Peter Anvin wrote: > It really comes down to this: it seems older cores from both Intel > and AMD perform better with 66 66 66 90, whereas the 0F 1F series is > better on newer cores. > > When I measured it, the differences were sometimes dramatic. How

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread H. Peter Anvin
It really comes down to this: it seems older cores from both Intel and AMD perform better with 66 66 66 90, whereas the 0F 1F series is better on newer cores. When I measured it, the differences were sometimes dramatic. On April 27, 2015 11:53:44 AM PDT, Borislav Petkov wrote: >On Mon, Apr

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread Borislav Petkov
On Mon, Apr 27, 2015 at 09:21:34PM +0200, Denys Vlasenko wrote: > On 04/27/2015 09:11 PM, Borislav Petkov wrote: > > A: 709.528485252 seconds time elapsed > > ( +- 0.02% ) > > B: 708.976557288 seconds time elapsed

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread Denys Vlasenko
On 04/27/2015 09:11 PM, Borislav Petkov wrote: > A: 709.528485252 seconds time elapsed > ( +- 0.02% ) > B: 708.976557288 seconds time elapsed > ( +- 0.04% ) > C: 709.312844791 seconds time elapsed

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread Borislav Petkov
On Mon, Apr 27, 2015 at 08:38:54PM +0200, Borislav Petkov wrote: > I'm running them now and will report numbers relative to the last run > once it is done. And those numbers should in practice get even better if > we revert to the simpler canonical-ness check but let's see... Results are done.

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread Borislav Petkov
On Mon, Apr 27, 2015 at 11:47:30AM -0700, Linus Torvalds wrote: > On Mon, Apr 27, 2015 at 11:38 AM, Borislav Petkov wrote: > > > > So our current NOP-infrastructure does ASM_NOP_MAX NOPs of 8 bytes so > > without more invasive changes, our longest NOPs are 8 byte long and then > > we have to

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread Borislav Petkov
On Mon, Apr 27, 2015 at 11:12:05AM -0700, Linus Torvalds wrote: > So if one or two cycles in this code doesn't matter, then why are we > adding alternate instructions just to avoid a few ALU instructions and > a conditional branch that predicts perfectly? And if it does matter, > then the 6-byte

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread Linus Torvalds
On Mon, Apr 27, 2015 at 11:38 AM, Borislav Petkov wrote: > > So our current NOP-infrastructure does ASM_NOP_MAX NOPs of 8 bytes so > without more invasive changes, our longest NOPs are 8 byte long and then > we have to repeat. Btw (and I'm too lazy to check) do we take alignment into account?
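
To illustrate the "longest NOPs are 8 byte long and then we have to repeat" point quoted above, here is a minimal sketch in C of splitting a padding region into NOP lengths. `NOP_MAX_LEN` mirrors the 8-byte limit mentioned in the thread; the function name and its interface are hypothetical, and this is not the kernel's actual padding code.

```c
#include <stddef.h>

/* Assumption: longest single NOP available is 8 bytes, as stated in the thread. */
#define NOP_MAX_LEN 8

/*
 * Hypothetical helper: split `pad` bytes of padding into NOP lengths of at
 * most NOP_MAX_LEN bytes each. Lengths are written to `out` as long as there
 * is room (`out_cap` entries); the return value is the number of NOPs needed,
 * even if it exceeds `out_cap`.
 */
static size_t split_nop_padding(size_t pad, size_t *out, size_t out_cap)
{
	size_t n = 0;

	while (pad > 0) {
		size_t len = pad > NOP_MAX_LEN ? NOP_MAX_LEN : pad;

		if (n < out_cap)
			out[n] = len;
		n++;
		pad -= len;
	}
	return n;
}
```

With this scheme a 12-byte gap comes out as one 8-byte NOP followed by one 4-byte NOP. It does not consider alignment, which is the open question raised in the message above.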

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread Borislav Petkov
On Mon, Apr 27, 2015 at 11:14:15AM -0700, Linus Torvalds wrote: > Btw, please don't use the "more than three 66h overrides" version. Oh yeah, a notorious "frontend choker". > Sure, that's what the optimization manual suggests if you want > single-instruction decode for all sizes up to 15 bytes,

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread Linus Torvalds
On Mon, Apr 27, 2015 at 9:40 AM, Borislav Petkov wrote: > > Either way, the NOPs-version is faster and I'm running the test with the > F16h-specific NOPs to see how they perform. Btw, please don't use the "more than three 66h overrides" version. Sure, that's what the optimization manual suggests

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread Linus Torvalds
On Mon, Apr 27, 2015 at 9:12 AM, Denys Vlasenko wrote: > > It is smaller, but not by much. It is two instructions smaller. Ehh. That's _half_. And on a decoding side, it's the difference between 6 bytes that decode cleanly and can be decoded in parallel with other things (assuming the 6-byte

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread Borislav Petkov
On Mon, Apr 27, 2015 at 09:00:08AM -0700, Linus Torvalds wrote: > On Mon, Apr 27, 2015 at 8:46 AM, Borislav Petkov wrote: > > > > Right, what about the false positives: > > Anybody who tries to return to kernel addresses with sysret is > suspect. It's more likely to be an attack vector than

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread Denys Vlasenko
On 04/27/2015 04:57 PM, Linus Torvalds wrote: > On Mon, Apr 27, 2015 at 4:35 AM, Borislav Petkov wrote: >> >> /* >> * Change top 16 bits to be the sign-extension of 47th bit, if this >> * changed %rcx, it was not canonical. >> */ >> ALTERNATIVE "", \ >>

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread Denys Vlasenko
On 04/27/2015 06:04 PM, Brian Gerst wrote: > On Mon, Apr 27, 2015 at 11:56 AM, Andy Lutomirski wrote: >> On Mon, Apr 27, 2015 at 8:46 AM, Borislav Petkov wrote: >>> On Mon, Apr 27, 2015 at 07:57:36AM -0700, Linus Torvalds wrote: On Mon, Apr 27, 2015 at 4:35 AM, Borislav Petkov wrote: >

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread Brian Gerst
On Mon, Apr 27, 2015 at 11:56 AM, Andy Lutomirski wrote: > On Mon, Apr 27, 2015 at 8:46 AM, Borislav Petkov wrote: >> On Mon, Apr 27, 2015 at 07:57:36AM -0700, Linus Torvalds wrote: >>> On Mon, Apr 27, 2015 at 4:35 AM, Borislav Petkov wrote: >>> > >>> > /* >>> > * Change top 16

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread Linus Torvalds
On Mon, Apr 27, 2015 at 8:46 AM, Borislav Petkov wrote: > > Right, what about the false positives: Anybody who tries to return to kernel addresses with sysret is suspect. It's more likely to be an attack vector than anything else (ie somebody who is trying to take advantage of a CPU bug). I

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread Andy Lutomirski
On Mon, Apr 27, 2015 at 8:46 AM, Borislav Petkov wrote: > On Mon, Apr 27, 2015 at 07:57:36AM -0700, Linus Torvalds wrote: >> On Mon, Apr 27, 2015 at 4:35 AM, Borislav Petkov wrote: >> > >> > /* >> > * Change top 16 bits to be the sign-extension of 47th bit, if this >> >

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread Borislav Petkov
On Mon, Apr 27, 2015 at 07:57:36AM -0700, Linus Torvalds wrote: > On Mon, Apr 27, 2015 at 4:35 AM, Borislav Petkov wrote: > > > > /* > > * Change top 16 bits to be the sign-extension of 47th bit, if this > > * changed %rcx, it was not canonical. > > */ > >

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread Borislav Petkov
On Mon, Apr 27, 2015 at 08:06:16AM -0700, Linus Torvalds wrote: > So maybe our AMD nop tables should be updated? Ho-humm, we're using k8_nops on all 64-bit AMD. I better do some opt-guide staring. -- Regards/Gruss, Boris. ECO tip #101: Trim your mails when you reply. --

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread Linus Torvalds
On Mon, Apr 27, 2015 at 7:57 AM, Linus Torvalds wrote: > > ..end result is just six bytes. That way you can use alternative to > replace it with one single noop on AMD. Actually, it looks like we have no good 6-byte no-ops on AMD. So you'd get two three-byte ones. Oh well. It's still better than

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread Linus Torvalds
On Mon, Apr 27, 2015 at 4:35 AM, Borislav Petkov wrote: > > /* > * Change top 16 bits to be the sign-extension of 47th bit, if this > * changed %rcx, it was not canonical. > */ > ALTERNATIVE "", \ > "shl $(64 - (47+1)), %rcx; \ >
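
The canonical-address check described in the quoted comment can be sketched in user-space C. This is a minimal sketch, assuming 48 implemented virtual-address bits (bit 47 is the top one, matching the `47+1` in the quoted shift count); the function names are hypothetical and do not come from the patch. It masks instead of using the shl/sar pair so that it does not rely on arithmetic right shift of a signed value.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 48 implemented virtual-address bits, so bit 47 acts as the sign bit. */
#define VADDR_BITS 48

/*
 * Hypothetical helper: replace bits 63..48 with copies of bit 47.
 * This has the same effect as shifting left by 16 and then doing an
 * arithmetic shift right by 16.
 */
static uint64_t sign_extend_47(uint64_t addr)
{
	const uint64_t low_mask = (UINT64_C(1) << VADDR_BITS) - 1;
	const uint64_t sign_bit = UINT64_C(1) << (VADDR_BITS - 1);
	uint64_t low = addr & low_mask;

	return (low & sign_bit) ? (low | ~low_mask) : low;
}

/* An address is canonical exactly when sign extension leaves it unchanged. */
static bool is_canonical(uint64_t addr)
{
	return sign_extend_47(addr) == addr;
}
```

For example, `0x00007fffffffffff` and `0xffff800000000000` pass the check, while `0x0000800000000000` does not, because sign extension would turn it into `0xffff800000000000`.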

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread Borislav Petkov
On Sun, Apr 26, 2015 at 04:39:38PM -0700, Andy Lutomirski wrote: > I know it would be ugly, but would it be worth saving two bytes by > using ALTERNATIVE "jmp 1f", "shl ...", ...? Damn, it is actually visible even that saving the unconditional forward JMP makes the numbers marginally nicer (E:

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread Borislav Petkov
On Mon, Apr 27, 2015 at 02:08:40PM +0200, Denys Vlasenko wrote: > > 819ef40c: 48 c1 e1 10 shl $0x10,%rcx > > 819ef410: 48 c1 f9 10 sar $0x10,%rcx > > 819ef414: 49 39 cb cmp %rcx,%r11 > > 819ef417:

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread Denys Vlasenko
On 04/27/2015 01:35 PM, Borislav Petkov wrote: > On Mon, Apr 27, 2015 at 10:53:05AM +0200, Borislav Petkov wrote: >> ALTERNATIVE "", >> "shl $(64 - (__VIRTUAL_MASK_SHIFT+1)), %rcx \ >> sar $(64 - (__VIRTUAL_MASK_SHIFT+1)), %rcx \ >>

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread Borislav Petkov
On Mon, Apr 27, 2015 at 10:53:05AM +0200, Borislav Petkov wrote: > ALTERNATIVE "", > "shl $(64 - (__VIRTUAL_MASK_SHIFT+1)), %rcx \ > sar $(64 - (__VIRTUAL_MASK_SHIFT+1)), %rcx \ > cmpq %rcx, %r11 \ > jne

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread Borislav Petkov
On Mon, Apr 27, 2015 at 12:07:14PM +0200, Denys Vlasenko wrote: > /* Only three 0x66 prefixes for NOP for fast decode on all CPUs */ > ALTERNATIVE ".byte 0x66,0x66,0x66,0x90 \ > .byte 0x66,0x66,0x66,0x90 \ > .byte 0x66,0x66,0x66,0x90", >

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread Denys Vlasenko
On 04/27/2015 10:53 AM, Borislav Petkov wrote: > On Sun, Apr 26, 2015 at 04:39:38PM -0700, Andy Lutomirski wrote: >>> +#define X86_BUG_CANONICAL_RCX X86_BUG(8) /* SYSRET #GPs when %RCX >>> non-canonical */ >> >> I think that "sysret" should appear in the name. > > Yeah, I thought about it too,

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread Borislav Petkov
On Sun, Apr 26, 2015 at 04:39:38PM -0700, Andy Lutomirski wrote: > > +#define X86_BUG_CANONICAL_RCX X86_BUG(8) /* SYSRET #GPs when %RCX > > non-canonical */ > > I think that "sysret" should appear in the name. Yeah, I thought about it too, will fix. > Oh no! My laptop is currently bug-free,

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread Denys Vlasenko
On 04/27/2015 10:53 AM, Borislav Petkov wrote: On Sun, Apr 26, 2015 at 04:39:38PM -0700, Andy Lutomirski wrote: +#define X86_BUG_CANONICAL_RCX X86_BUG(8) /* SYSRET #GPs when %RCX non-canonical */ I think that "sysret" should appear in the name. Yeah, I thought about it too, will fix. Oh

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread Borislav Petkov
On Mon, Apr 27, 2015 at 12:07:14PM +0200, Denys Vlasenko wrote: /* Only three 0x66 prefixes for NOP for fast decode on all CPUs */ ALTERNATIVE ".byte 0x66,0x66,0x66,0x90 \ .byte 0x66,0x66,0x66,0x90 \ .byte 0x66,0x66,0x66,0x90",

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread Borislav Petkov
On Mon, Apr 27, 2015 at 12:59:11PM -0700, H. Peter Anvin wrote: It really comes down to this: it seems older cores from both Intel and AMD perform better with 66 66 66 90, whereas the 0F 1F series is better on newer cores. When I measured it, the differences were sometimes dramatic. How did

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread H. Peter Anvin
It really comes down to this: it seems older cores from both Intel and AMD perform better with 66 66 66 90, whereas the 0F 1F series is better on newer cores. When I measured it, the differences were sometimes dramatic. On April 27, 2015 11:53:44 AM PDT, Borislav Petkov b...@alien8.de wrote:

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread Borislav Petkov
On Mon, Apr 27, 2015 at 09:21:34PM +0200, Denys Vlasenko wrote: On 04/27/2015 09:11 PM, Borislav Petkov wrote: A: 709.528485252 seconds time elapsed ( +- 0.02% ) B: 708.976557288 seconds time elapsed

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread H. Peter Anvin
I did a microbenchmark in user space... let's see if I can find it. On April 27, 2015 1:03:29 PM PDT, Borislav Petkov b...@alien8.de wrote: On Mon, Apr 27, 2015 at 12:59:11PM -0700, H. Peter Anvin wrote: It really comes down to this: it seems older cores from both Intel and AMD perform better

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread Borislav Petkov
On Sun, Apr 26, 2015 at 04:39:38PM -0700, Andy Lutomirski wrote: +#define X86_BUG_CANONICAL_RCX X86_BUG(8) /* SYSRET #GPs when %RCX non-canonical */ I think that "sysret" should appear in the name. Yeah, I thought about it too, will fix. Oh no! My laptop is currently bug-free, and

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread Borislav Petkov
On Mon, Apr 27, 2015 at 02:08:40PM +0200, Denys Vlasenko wrote: 819ef40c: 48 c1 e1 10 shl $0x10,%rcx 819ef410: 48 c1 f9 10 sar $0x10,%rcx 819ef414: 49 39 cb cmp %rcx,%r11 819ef417: 0f 85

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread Denys Vlasenko
On 04/27/2015 01:35 PM, Borislav Petkov wrote: On Mon, Apr 27, 2015 at 10:53:05AM +0200, Borislav Petkov wrote: ALTERNATIVE "", "shl $(64 - (__VIRTUAL_MASK_SHIFT+1)), %rcx \ sar $(64 - (__VIRTUAL_MASK_SHIFT+1)), %rcx \ cmpq

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread Borislav Petkov
On Mon, Apr 27, 2015 at 10:53:05AM +0200, Borislav Petkov wrote: ALTERNATIVE "", "shl $(64 - (__VIRTUAL_MASK_SHIFT+1)), %rcx \ sar $(64 - (__VIRTUAL_MASK_SHIFT+1)), %rcx \ cmpq %rcx, %r11 \ jne

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread Linus Torvalds
On Mon, Apr 27, 2015 at 4:35 AM, Borislav Petkov b...@alien8.de wrote: /* * Change top 16 bits to be the sign-extension of 47th bit, if this * changed %rcx, it was not canonical. */ ALTERNATIVE "", \ "shl $(64 - (47+1)), %rcx; \

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread Linus Torvalds
On Mon, Apr 27, 2015 at 7:57 AM, Linus Torvalds torva...@linux-foundation.org wrote: ..end result is just six bytes. That way you can use alternative to replace it with one single noop on AMD. Actually, it looks like we have no good 6-byte no-ops on AMD. So you'd get two three-byte ones. Oh

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread Borislav Petkov
On Sun, Apr 26, 2015 at 04:39:38PM -0700, Andy Lutomirski wrote: I know it would be ugly, but would it be worth saving two bytes by using ALTERNATIVE "jmp 1f", "shl ...", ...? Damn, it is actually visible even that saving the unconditional forward JMP makes the numbers marginally nicer (E: row). So

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread Brian Gerst
On Mon, Apr 27, 2015 at 11:56 AM, Andy Lutomirski l...@amacapital.net wrote: On Mon, Apr 27, 2015 at 8:46 AM, Borislav Petkov b...@alien8.de wrote: On Mon, Apr 27, 2015 at 07:57:36AM -0700, Linus Torvalds wrote: On Mon, Apr 27, 2015 at 4:35 AM, Borislav Petkov b...@alien8.de wrote:

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread Borislav Petkov
On Mon, Apr 27, 2015 at 07:57:36AM -0700, Linus Torvalds wrote: On Mon, Apr 27, 2015 at 4:35 AM, Borislav Petkov b...@alien8.de wrote: /* * Change top 16 bits to be the sign-extension of 47th bit, if this * changed %rcx, it was not canonical. */

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread Andy Lutomirski
On Mon, Apr 27, 2015 at 8:46 AM, Borislav Petkov b...@alien8.de wrote: On Mon, Apr 27, 2015 at 07:57:36AM -0700, Linus Torvalds wrote: On Mon, Apr 27, 2015 at 4:35 AM, Borislav Petkov b...@alien8.de wrote: /* * Change top 16 bits to be the sign-extension of 47th bit, if

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread Linus Torvalds
On Mon, Apr 27, 2015 at 8:46 AM, Borislav Petkov b...@alien8.de wrote: Right, what about the false positives: Anybody who tries to return to kernel addresses with sysret is suspect. It's more likely to be an attack vector than anything else (ie somebody who is trying to take advantage of a CPU

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread Borislav Petkov
On Mon, Apr 27, 2015 at 08:06:16AM -0700, Linus Torvalds wrote: So maybe our AMD nop tables should be updated? Ho-humm, we're using k8_nops on all 64-bit AMD. I better do some opt-guide staring. -- Regards/Gruss, Boris. ECO tip #101: Trim your mails when you reply. --

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread Denys Vlasenko
On 04/27/2015 06:04 PM, Brian Gerst wrote: On Mon, Apr 27, 2015 at 11:56 AM, Andy Lutomirski l...@amacapital.net wrote: On Mon, Apr 27, 2015 at 8:46 AM, Borislav Petkov b...@alien8.de wrote: On Mon, Apr 27, 2015 at 07:57:36AM -0700, Linus Torvalds wrote: On Mon, Apr 27, 2015 at 4:35 AM,

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread Denys Vlasenko
On 04/27/2015 04:57 PM, Linus Torvalds wrote: On Mon, Apr 27, 2015 at 4:35 AM, Borislav Petkov b...@alien8.de wrote: /* * Change top 16 bits to be the sign-extension of 47th bit, if this * changed %rcx, it was not canonical. */ ALTERNATIVE "", \

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread Borislav Petkov
On Mon, Apr 27, 2015 at 09:00:08AM -0700, Linus Torvalds wrote: On Mon, Apr 27, 2015 at 8:46 AM, Borislav Petkov b...@alien8.de wrote: Right, what about the false positives: Anybody who tries to return to kernel addresses with sysret is suspect. It's more likely to be an attack vector

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread Borislav Petkov
On Mon, Apr 27, 2015 at 11:14:15AM -0700, Linus Torvalds wrote: Btw, please don't use the "more than three 66h overrides" version. Oh yeah, a notorious "frontend choker". Sure, that's what the optimization manual suggests if you want single-instruction decode for all sizes up to 15 bytes, but I

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread Linus Torvalds
On Mon, Apr 27, 2015 at 11:38 AM, Borislav Petkov b...@alien8.de wrote: So our current NOP-infrastructure does ASM_NOP_MAX NOPs of 8 bytes so without more invasive changes, our longest NOPs are 8 byte long and then we have to repeat. Btw (and I'm too lazy to check) do we take alignment into

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread Borislav Petkov
On Mon, Apr 27, 2015 at 11:12:05AM -0700, Linus Torvalds wrote: So if one or two cycles in this code doesn't matter, then why are we adding alternate instructions just to avoid a few ALU instructions and a conditional branch that predicts perfectly? And if it does matter, then the 6-byte

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread Borislav Petkov
On Mon, Apr 27, 2015 at 11:47:30AM -0700, Linus Torvalds wrote: On Mon, Apr 27, 2015 at 11:38 AM, Borislav Petkov b...@alien8.de wrote: So our current NOP-infrastructure does ASM_NOP_MAX NOPs of 8 bytes so without more invasive changes, our longest NOPs are 8 byte long and then we have to

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread Linus Torvalds
On Mon, Apr 27, 2015 at 9:12 AM, Denys Vlasenko dvlas...@redhat.com wrote: It is smaller, but not by much. It is two instructions smaller. Ehh. That's _half_. And on a decoding side, it's the difference between 6 bytes that decode cleanly and can be decoded in parallel with other things

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread Linus Torvalds
On Mon, Apr 27, 2015 at 9:40 AM, Borislav Petkov b...@alien8.de wrote: Either way, the NOPs-version is faster and I'm running the test with the F16h-specific NOPs to see how they perform. Btw, please don't use the "more than three 66h overrides" version. Sure, that's what the optimization manual

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread Borislav Petkov
On Mon, Apr 27, 2015 at 08:38:54PM +0200, Borislav Petkov wrote: I'm running them now and will report numbers relative to the last run once it is done. And those numbers should in practice get even better if we revert to the simpler canonical-ness check but let's see... Results are done. New

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-27 Thread Denys Vlasenko
On 04/27/2015 09:11 PM, Borislav Petkov wrote: A: 709.528485252 seconds time elapsed ( +- 0.02% ) B: 708.976557288 seconds time elapsed ( +- 0.04% ) C: 709.312844791 seconds time elapsed

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-26 Thread Andy Lutomirski
> > diff --git a/arch/x86/include/asm/cpufeature.h > b/arch/x86/include/asm/cpufeature.h > index 7ee9b94d9921..8d555b046fe9 100644 > --- a/arch/x86/include/asm/cpufeature.h > +++ b/arch/x86/include/asm/cpufeature.h > @@ -265,6 +265,7 @@ > #define X86_BUG_11AP X86_BUG(5) /* Bad local

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-26 Thread Andy Lutomirski
On Fri, Apr 24, 2015 at 7:17 PM, Denys Vlasenko wrote: > On Fri, Apr 24, 2015 at 10:50 PM, Andy Lutomirski wrote: >> On Fri, Apr 24, 2015 at 1:46 PM, Denys Vlasenko This might be way more trouble than it's worth. >>> >>> Exactly my feeling. What are you trying to save? About four CPU >>>

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-26 Thread Denys Vlasenko
On Fri, Apr 24, 2015 at 4:18 AM, Andy Lutomirski wrote: > On Thu, Apr 23, 2015 at 7:15 PM, Andy Lutomirski wrote: > Even if the issue affects SYSRETQ, it could be that we don't care. If > the extent of the info leak is whether we context switched during a > 64-bit syscall to a non-syscall

perf numbers (was: Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue)

2015-04-26 Thread Borislav Petkov
On Sat, Apr 25, 2015 at 11:12:06PM +0200, Borislav Petkov wrote: > I've prepended the perf stat output with markers A:, B: or C: for easier > comparing. The markers mean: > > A: Linus' master from a couple of days ago + tip/master + tip/x86/asm > B: With Andy's SYSRET patch ontop > C: Without RCX

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-25 Thread Borislav Petkov
On Thu, Apr 23, 2015 at 07:15:01PM -0700, Andy Lutomirski wrote: > AMD CPUs don't reinitialize the SS descriptor on SYSRET, so SYSRET > with SS == 0 results in an invalid usermode state in which SS is > apparently equal to __USER_DS but causes #SS if used. > > Work around the issue by replacing

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-24 Thread Denys Vlasenko
On Fri, Apr 24, 2015 at 10:50 PM, Andy Lutomirski wrote: > On Fri, Apr 24, 2015 at 1:46 PM, Denys Vlasenko >>> This might be way more trouble than it's worth. >> >> Exactly my feeling. What are you trying to save? About four CPU >> cycles of checking %ss != __KERNEL_DS on each switch_to? >>

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-24 Thread H. Peter Anvin
On 04/24/2015 01:50 PM, Andy Lutomirski wrote: >> >> Exactly my feeling. What are you trying to save? About four CPU >> cycles of checking %ss != __KERNEL_DS on each switch_to? >> That's not worth bothering about. Your last patch seems to be perfect. > > We'll have to do the write to ss almost

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-24 Thread Linus Torvalds
On Fri, Apr 24, 2015 at 1:21 PM, Andy Lutomirski wrote: > > 2. SYSRETQ. The only way that I know of to see the problem is SYSRETQ > followed by a far jump or return. This is presumably *extremely* > rare. > > What if we fixed #2 up in do_stack_segment. We should double-check > the docs, but I

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-24 Thread Andy Lutomirski
On Fri, Apr 24, 2015 at 1:46 PM, Denys Vlasenko wrote: > On Fri, Apr 24, 2015 at 10:21 PM, Andy Lutomirski wrote: >> On Thu, Apr 23, 2015 at 7:15 PM, Andy Lutomirski wrote: >>> AMD CPUs don't reinitialize the SS descriptor on SYSRET, so SYSRET >>> with SS == 0 results in an invalid usermode

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-24 Thread Denys Vlasenko
On Fri, Apr 24, 2015 at 10:21 PM, Andy Lutomirski wrote: > On Thu, Apr 23, 2015 at 7:15 PM, Andy Lutomirski wrote: >> AMD CPUs don't reinitialize the SS descriptor on SYSRET, so SYSRET >> with SS == 0 results in an invalid usermode state in which SS is >> apparently equal to __USER_DS but causes

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-24 Thread Andy Lutomirski
On Thu, Apr 23, 2015 at 7:15 PM, Andy Lutomirski wrote: > AMD CPUs don't reinitialize the SS descriptor on SYSRET, so SYSRET > with SS == 0 results in an invalid usermode state in which SS is > apparently equal to __USER_DS but causes #SS if used. > > Work around the issue by replacing NULL SS

Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

2015-04-24 Thread Borislav Petkov
On Fri, Apr 24, 2015 at 12:59:06PM +0200, Borislav Petkov wrote: > Yeah, that makes more sense. So I tested Andy's patch but changed it as > above and I get > > $ taskset -c 0 ./sysret_ss_attrs_32 > [RUN] Syscalls followed by SS validation > [OK] We survived Andy, you wanted the 64-bit
