Re: [PATCH] KVM: x86: Reduce retpoline performance impact in slot_handle_level_range()

2018-02-05 Thread Paolo Bonzini
On 02/02/2018 19:50, Linus Torvalds wrote:
> On Fri, Feb 2, 2018 at 6:59 AM, David Woodhouse wrote:
>> With retpoline, tight loops of "call this function for every XXX" are
>> very much pessimised by taking a prediction miss *every* time.
>>
>> This one showed up very high in our early testing,

Re: [PATCH] KVM: x86: Reduce retpoline performance impact in slot_handle_level_range()

2018-02-05 Thread Peter Zijlstra
On Sat, Feb 03, 2018 at 02:46:47PM +, David Woodhouse wrote:
> Yeah. I'm keen on being able to use something like alternatives to
> *change* 'usualfunction' at runtime though. I suspect it'll be a win
> for stuff like dma_ops.

That shouldn't be too hard to implement.

Re: [PATCH] KVM: x86: Reduce retpoline performance impact in slot_handle_level_range()

2018-02-05 Thread Peter Zijlstra
On Sat, Feb 03, 2018 at 02:46:47PM +, David Woodhouse wrote:
> > For the simple case how about wrapping the if into
> >
> > call_likely(foo->bar, usualfunction, args)
> >
> > as a companion to
> >
> >  foo->bar(args)
> >
> > that can resolve to nothing
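
A rough sketch of what a call_likely() wrapper of the shape quoted above might look like. Only the name call_likely and its argument order come from the quoted text; the macro body, struct ops, rarefunction() and invoke() are hypothetical illustrations, and __builtin_expect assumes GCC or Clang. The idea is that the common target is reached through a direct call and only the uncommon case falls back to the indirect call through the pointer.

```c
/*
 * Hypothetical macro: compare the pointer against the expected target and
 * call that target directly when they match. Note that ptr is evaluated
 * twice, so it should not have side effects.
 */
#define call_likely(ptr, usual, ...)                     \
	(__builtin_expect((ptr) == (usual), 1)           \
		? (usual)(__VA_ARGS__)   /* direct call */ \
		: (ptr)(__VA_ARGS__))    /* indirect call */

/* Illustrative types and functions, not taken from the kernel. */
struct ops {
	int (*bar)(int);
};

static int usualfunction(int x)
{
	return x + 1;
}

static int rarefunction(int x)
{
	return x * 2;
}

static int invoke(const struct ops *foo, int arg)
{
	return call_likely(foo->bar, usualfunction, arg);
}
```

With this sketch, invoke() returns the same result as foo->bar(arg) in both cases; only the way the call is emitted differs.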

Re: [PATCH] KVM: x86: Reduce retpoline performance impact in slot_handle_level_range()

2018-02-03 Thread David Woodhouse
On Fri, 2018-02-02 at 21:23 +, Alan Cox wrote:
> In addition the problem with switch() is that gcc might decide in some
> cases that the best way to implement your switch is an indirect call
> from a lookup table.

That's also true of the

  if (ptr == usualfunction)
     usualfunction();

Re: [PATCH] KVM: x86: Reduce retpoline performance impact in slot_handle_level_range()

2018-02-02 Thread Alan Cox
> Either way, that does look like a reasonable answer. I had looked at
> the various one-line wrappers around slot_handle_level_range() and
> thought "hm, those should be inline", but I hadn't made the next step
> and pondered putting the whole thing inline. We'll give it a spin and
> work out

Re: [PATCH] KVM: x86: Reduce retpoline performance impact in slot_handle_level_range()

2018-02-02 Thread David Woodhouse
On Fri, 2018-02-02 at 16:10 -0500, Paolo Bonzini wrote:
> > > On 2. Feb 2018, at 15:59, David Woodhouse wrote:
> > > With retpoline, tight loops of "call this function for every XXX" are
> > > very much pessimised by taking a prediction miss *every* time.
> > >
> > > This one showed up very high

Re: [PATCH] KVM: x86: Reduce retpoline performance impact in slot_handle_level_range()

2018-02-02 Thread Paolo Bonzini
> > On 2. Feb 2018, at 15:59, David Woodhouse wrote:
> > With retpoline, tight loops of "call this function for every XXX" are
> > very much pessimised by taking a prediction miss *every* time.
> >
> > This one showed up very high in our early testing, and it only has five
> > things it'll ever

Re: [PATCH] KVM: x86: Reduce retpoline performance impact in slot_handle_level_range()

2018-02-02 Thread David Woodhouse
On Fri, 2018-02-02 at 11:10 -0800, Linus Torvalds wrote:
> On Fri, Feb 2, 2018 at 10:50 AM, Linus Torvalds wrote:
> >
> >
> > Will it make for bigger code? Yes. But probably not really all *that*
> > much bigger, because of how it also will allow the compiler to
> > simplify some things.
> >

Re: [PATCH] KVM: x86: Reduce retpoline performance impact in slot_handle_level_range()

2018-02-02 Thread Linus Torvalds
On Fri, Feb 2, 2018 at 10:50 AM, Linus Torvalds wrote:
>
> Will it make for bigger code? Yes. But probably not really all *that*
> much bigger, because of how it also will allow the compiler to
> simplify some things.

Actually, testing this with my fairly minimal config, it actually makes for

Re: [PATCH] KVM: x86: Reduce retpoline performance impact in slot_handle_level_range()

2018-02-02 Thread Linus Torvalds
On Fri, Feb 2, 2018 at 6:59 AM, David Woodhouse wrote:
> With retpoline, tight loops of "call this function for every XXX" are
> very much pessimised by taking a prediction miss *every* time.
>
> This one showed up very high in our early testing, and it only has five
> things it'll ever call so

Re: [PATCH] KVM: x86: Reduce retpoline performance impact in slot_handle_level_range()

2018-02-02 Thread Sironi, Filippo
> On 2. Feb 2018, at 15:59, David Woodhouse wrote:
>
> With retpoline, tight loops of "call this function for every XXX" are
> very much pessimised by taking a prediction miss *every* time.
>
> This one showed up very high in our early testing, and it only has five
> things it'll ever call so

[PATCH] KVM: x86: Reduce retpoline performance impact in slot_handle_level_range()

2018-02-02 Thread David Woodhouse
With retpoline, tight loops of "call this function for every XXX" are very much pessimised by taking a prediction miss *every* time. This one showed up very high in our early testing, and it only has five things it'll ever call so make it take an 'op' enum instead of a function pointer and let's
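
A minimal sketch of the general technique the description outlines: pass an 'op' enum instead of a function pointer, so the loop body makes direct calls rather than an indirect call on each iteration. This is not the patch itself. Every name below (enum slot_op, struct slot, clear_dirty, write_protect, do_op, handle_range) is a hypothetical stand-in, and only two operations are shown rather than the five the description mentions. It also assumes the compiler does not lower the switch to a jump table, a concern raised elsewhere in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names come from the KVM source. */
enum slot_op {
	OP_CLEAR_DIRTY,
	OP_WRITE_PROTECT,
};

struct slot {
	unsigned long flags;	/* bit 0: dirty, bit 1: write-protected */
};

/* Each handler reports whether it changed the slot. */
static bool clear_dirty(struct slot *s)
{
	bool changed = (s->flags & 1UL) != 0;

	s->flags &= ~1UL;
	return changed;
}

static bool write_protect(struct slot *s)
{
	bool changed = (s->flags & 2UL) == 0;

	s->flags |= 2UL;
	return changed;
}

/* Select the handler by enum value so that every call is a direct call. */
static inline bool do_op(enum slot_op op, struct slot *s)
{
	switch (op) {
	case OP_CLEAR_DIRTY:
		return clear_dirty(s);
	case OP_WRITE_PROTECT:
		return write_protect(s);
	}
	assert(!"unknown slot_op");
	return false;
}

/* The "call this for every XXX" loop, taking an op instead of a pointer. */
static bool handle_range(struct slot *slots, size_t n, enum slot_op op)
{
	bool flush = false;

	for (size_t i = 0; i < n; i++)
		flush |= do_op(op, &slots[i]);
	return flush;
}
```

A caller would pass, for example, OP_CLEAR_DIRTY where it previously passed a pointer to the corresponding handler; handle_range() returns true if any slot was modified.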
