Re: [RFC/HACK] x86: Fast return to kernel

2014-05-04 Thread H. Peter Anvin
On 05/04/2014 04:46 PM, Paolo Bonzini wrote: > > Your suggested trick of splitting the return paths for IF=0/IF=1 can be > also done like this: > > movq EFLAGS-ARGOFFSET(%rsp), %rdi > btrq $9, %rdi # Clear IF, save old value in CF > movq %rdi, (%rsi) > ... > popfq >
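
For readers unfamiliar with the `btrq $9` step quoted above, here is a minimal C sketch of what that one instruction computes. It is not code from the thread: the helper name `clear_if_save_old` and the macro `X86_EFLAGS_IF_BIT` are made up for illustration. The only facts it relies on are that bit 9 of the x86 flags register is IF and that BTR clears the selected bit and leaves its old value in CF.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit 9 of the x86 flags register is IF, the interrupt-enable flag. */
#define X86_EFLAGS_IF_BIT 9 /* illustrative name, not taken from the thread */

/*
 * Hypothetical helper modelling "btrq $9, %rdi": clear bit 9 of the
 * saved flags word and return the bit's previous value, which is what
 * the real instruction leaves in CF.
 */
static int clear_if_save_old(uint64_t *flags)
{
    assert(flags != NULL);

    /* Read the old IF bit first, then clear it in place. */
    int old_if = (int)((*flags >> X86_EFLAGS_IF_BIT) & 1u);
    *flags &= ~(UINT64_C(1) << X86_EFLAGS_IF_BIT);
    return old_if;
}
```

Under these assumptions, a saved flags value of 0x246 (IF set) becomes 0x46 and the helper returns 1; a value that already has IF clear is left unchanged and the helper returns 0. In the quoted assembly, that returned bit is what lets the code after `popfq` choose between the IF=0 and IF=1 return paths.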

Re: [RFC/HACK] x86: Fast return to kernel

2014-05-04 Thread Paolo Bonzini
On 02/05/2014 21:51, Linus Torvalds wrote: > Also, are you *really* sure that "popf" has the same one-instruction > interrupt shadow that "sti" has? Because I'm not at all sure that is > true, and it's not documented as far as I can tell. In contrast, the > one-instruction shadow after

Re: [RFC/HACK] x86: Fast return to kernel

2014-05-04 Thread H. Peter Anvin
On 05/04/2014 02:31 PM, Linus Torvalds wrote: > On Sun, May 4, 2014 at 12:59 PM, H. Peter Anvin > wrote: >> >> Maybe let userspace sit in a tight loop doing RDTSC, and look for data >> points too far apart to have been uninterrupted? > > That won't work, since Andy's patch improves on the

Re: [RFC/HACK] x86: Fast return to kernel

2014-05-04 Thread Linus Torvalds
On Sun, May 4, 2014 at 12:59 PM, H. Peter Anvin wrote: > > Maybe let userspace sit in a tight loop doing RDTSC, and look for data > points too far apart to have been uninterrupted? That won't work, since Andy's patch improves on the "interrupt happened in kernel space", not on the user-space

Re: [RFC/HACK] x86: Fast return to kernel

2014-05-04 Thread H. Peter Anvin
On 05/04/2014 11:40 AM, Ingo Molnar wrote: > > * Andy Lutomirski wrote: > >>> That said, regular *device* interrupts do often return to kernel >>> mode (the idle loop in particular), so if you have any way to >>> measure that, that might be interesting, and might show some of >>> the same

Re: [RFC/HACK] x86: Fast return to kernel

2014-05-04 Thread Ingo Molnar
* Andy Lutomirski wrote: > > That said, regular *device* interrupts do often return to kernel > > mode (the idle loop in particular), so if you have any way to > > measure that, that might be interesting, and might show some of > > the same advantages. > > I can try something awful

Re: [RFC/HACK] x86: Fast return to kernel

2014-05-04 Thread Paolo Bonzini
On 02/05/2014 21:51, Linus Torvalds wrote: Also, are you *really* sure that popf has the same one-instruction interrupt shadow that sti has? Because I'm not at all sure that is true, and it's not documented as far as I can tell. In contrast, the one-instruction shadow after sti very

Re: [RFC/HACK] x86: Fast return to kernel

2014-05-04 Thread H. Peter Anvin
On 05/04/2014 04:46 PM, Paolo Bonzini wrote: Your suggested trick of splitting the return paths for IF=0/IF=1 can be also done like this: movq EFLAGS-ARGOFFSET(%rsp), %rdi btrq $9, %rdi # Clear IF, save old value in CF movq %rdi, (%rsi) ... popfq jnc

Re: [RFC/HACK] x86: Fast return to kernel

2014-05-02 Thread H. Peter Anvin
On 05/02/2014 02:42 PM, Andy Lutomirski wrote: > > Hah -- I think I just faked both of you out :) > > I don't think this has anything to do with the error code, and I think > that the errorentry code already does more or less that: it pushes -1. > > The real issue here is probably the magic

Re: [RFC/HACK] x86: Fast return to kernel

2014-05-02 Thread Andy Lutomirski
On Fri, May 2, 2014 at 2:37 PM, H. Peter Anvin wrote: > On 05/02/2014 02:07 PM, Linus Torvalds wrote: >> On Fri, May 2, 2014 at 2:04 PM, Andy Lutomirski wrote: >>> >>> Because otherwise I'd have to keep track of whether it's a zeroentry >>> or an errorentry. I can't stuff the offset in a

Re: [RFC/HACK] x86: Fast return to kernel

2014-05-02 Thread H. Peter Anvin
On 05/02/2014 02:07 PM, Linus Torvalds wrote: > On Fri, May 2, 2014 at 2:04 PM, Andy Lutomirski wrote: >> >> Because otherwise I'd have to keep track of whether it's a zeroentry >> or an errorentry. I can't stuff the offset in a register without even >> more stack hackery, since there are no

Re: [RFC/HACK] x86: Fast return to kernel

2014-05-02 Thread Thomas Gleixner
On Fri, 2 May 2014, Linus Torvalds wrote: > On Fri, May 2, 2014 at 1:30 PM, Thomas Gleixner wrote: > > > > So what about manipulating the stack so that the popf does not enable > > interrupts and do an explicit sti to get the benefit of the > > one-instruction shadow ? > > That's what I already

Re: [RFC/HACK] x86: Fast return to kernel

2014-05-02 Thread Linus Torvalds
On Fri, May 2, 2014 at 2:04 PM, Andy Lutomirski wrote: > > Because otherwise I'd have to keep track of whether it's a zeroentry > or an errorentry. I can't stuff the offset in a register without even > more stack hackery, since there are no available registers there. I > could split the whole

Re: [RFC/HACK] x86: Fast return to kernel

2014-05-02 Thread Andy Lutomirski
On Fri, May 2, 2014 at 2:01 PM, Linus Torvalds wrote: > On Fri, May 2, 2014 at 1:30 PM, Thomas Gleixner wrote: >> >> So what about manipulating the stack so that the popf does not enable >> interrupts and do an explicit sti to get the benefit of the >> one-instruction shadow ? > > That's what I

Re: [RFC/HACK] x86: Fast return to kernel

2014-05-02 Thread Linus Torvalds
On Fri, May 2, 2014 at 1:30 PM, Thomas Gleixner wrote: > > So what about manipulating the stack so that the popf does not enable > interrupts and do an explicit sti to get the benefit of the > one-instruction shadow ? That's what I already suggested in the original "I don't think popf works"

Re: [RFC/HACK] x86: Fast return to kernel

2014-05-02 Thread Thomas Gleixner
On Fri, 2 May 2014, Linus Torvalds wrote: > On Fri, May 2, 2014 at 12:31 PM, Linus Torvalds > wrote: > > > > Also, are you *really* sure that "popf" has the same one-instruction > > interrupt shadow that "sti" has? Because I'm not at all sure that is > > true, and it's not documented as far as I

Re: [RFC/HACK] x86: Fast return to kernel

2014-05-02 Thread Steven Rostedt
On Fri, 2 May 2014 12:31:42 -0700 Linus Torvalds wrote: > > And NMI not being re-enabled might just be a real advantage. Adding > Steven to the cc to make him aware of this patch. > There's not much of an advantage for NMIs, as they seldom page fault. We may get some due to vmalloc'd areas,

Re: [RFC/HACK] x86: Fast return to kernel

2014-05-02 Thread H. Peter Anvin
On 05/02/2014 12:51 PM, Linus Torvalds wrote: > On Fri, May 2, 2014 at 12:31 PM, Linus Torvalds > wrote: >> >> Also, are you *really* sure that "popf" has the same one-instruction >> interrupt shadow that "sti" has? Because I'm not at all sure that is >> true, and it's not documented as far as I

Re: [RFC/HACK] x86: Fast return to kernel

2014-05-02 Thread Linus Torvalds
On Fri, May 2, 2014 at 12:31 PM, Linus Torvalds wrote: > > Also, are you *really* sure that "popf" has the same one-instruction > interrupt shadow that "sti" has? Because I'm not at all sure that is > true, and it's not documented as far as I can tell. In contrast, the > one-instruction shadow

Re: [RFC/HACK] x86: Fast return to kernel

2014-05-02 Thread Andy Lutomirski
On Fri, May 2, 2014 at 12:31 PM, Linus Torvalds wrote: > On Fri, May 2, 2014 at 12:04 PM, Andy Lutomirski wrote: >> This speeds up my kernel_pf microbenchmark by about 17%. The cfi >> annotations need some work. > > Sadly, performance of page faults in kernel mode is pretty much > completely

Re: [RFC/HACK] x86: Fast return to kernel

2014-05-02 Thread Linus Torvalds
On Fri, May 2, 2014 at 12:04 PM, Andy Lutomirski wrote: > This speeds up my kernel_pf microbenchmark by about 17%. The cfi > annotations need some work. Sadly, performance of page faults in kernel mode is pretty much completely uninteresting. It simply doesn't happen on any real load. That

[RFC/HACK] x86: Fast return to kernel

2014-05-02 Thread Andy Lutomirski
This speeds up my kernel_pf microbenchmark by about 17%. The cfi annotations need some work. Signed-off-by: Andy Lutomirski --- My test case is here: https://gitorious.org/linux-test-utils/linux-clock-tests/source/kernel_pf.c This could have some other interesting benefits. For example,

Re: [RFC/HACK] x86: Fast return to kernel

2014-05-02 Thread H. Peter Anvin
On 05/02/2014 02:42 PM, Andy Lutomirski wrote: Hah -- I think I just faked both of you out :) I don't think this has anything to do with the error code, and I think that the errorentry code already does more or less that: it pushes -1. The real issue here is probably the magic 16-byte