Re: [PATCH v2] kvm/x86: Inform RCU of quiescent state when entering guest mode

2018-07-12 Thread Christian Borntraeger



On 07/12/2018 01:58 PM, Christian Borntraeger wrote:
> 
> 
> On 07/12/2018 01:10 PM, David Woodhouse wrote:
>> On Thu, 2018-07-12 at 13:00 +0200, Christian Borntraeger wrote:
>>>
>>> On s390 this seems to add about 10ns (~3%) for a guest exit/reentry
>>> microbenchmark, mostly due to rcu_eqs_enter and rcu_eqs_exit now being
>>> visible in perf samples. The older interface was cheaper.
>>
>> Well, the older interface wasn't actually working, which made it
>> moderately suboptimal :)
>>
>> But that is fixed by the first patch of the two, so perhaps the second
>> ought to be conditional on CONFIG_NO_HZ_FULL since it's only really
>> fixing a real bug there?
>>
>> Unless we can contrive some way to do the rcu_eqs_enter/exit only on
>> the paths to/from userspace, and *not* when we loop in the kernel to
>> handle interrupts (etc.) and immediately re-enter the guest without
>> returning? That's somewhat non-trivial though...
> 
> Assuming that patch1 really fixes the issue (and all KVM hot loops are
> supposed to have something like
> 
> if (need_resched())
> schedule();
> 
> in them), this could work out.
> 
> So you are thinking of something like
> 
> diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
> index d8ee9f949ce3..cac5de8a887c 100644
> --- a/include/linux/rcutree.h
> +++ b/include/linux/rcutree.h
> @@ -72,8 +72,17 @@ void cond_synchronize_sched(unsigned long oldstate);
>  
>  void rcu_idle_enter(void);
>  void rcu_idle_exit(void);
> +#ifdef CONFIG_NO_HZ_FULL
>  void rcu_kvm_enter(void);
>  void rcu_kvm_exit(void);
> +#else
> +static inline void rcu_kvm_enter(void)
> +{
> +}
> +static inline void rcu_kvm_exit(void)
> +{
> +}
> +#endif
>  void rcu_irq_enter(void);
>  void rcu_irq_exit(void);
>  void rcu_irq_enter_irqson(void);
> 
> ?
> 

Of course we also need to shift the matching ifdefs in kernel/rcu/tree.c,
and keep the rcu_virt_note_context_switch() call on enter, as you said.
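
As a rough sketch of what that could look like, reusing the rcu_kvm_enter()/
rcu_kvm_exit() bodies from Paul's patch later in this thread (illustration
only, not a tested change):

#ifdef CONFIG_NO_HZ_FULL
void rcu_kvm_enter(void)
{
	lockdep_assert_irqs_disabled();
	rcu_eqs_enter(true);
}
EXPORT_SYMBOL_GPL(rcu_kvm_enter);

void rcu_kvm_exit(void)
{
	lockdep_assert_irqs_disabled();
	rcu_eqs_exit(true);
}
EXPORT_SYMBOL_GPL(rcu_kvm_exit);
#endif /* CONFIG_NO_HZ_FULL */

with guest_enter_irqoff() keeping the old
rcu_virt_note_context_switch(smp_processor_id()) call in the
!CONFIG_NO_HZ_FULL case, where rcu_kvm_enter() becomes an empty stub.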



Re: [PATCH v2] kvm/x86: Inform RCU of quiescent state when entering guest mode

2018-07-12 Thread Christian Borntraeger



On 07/12/2018 01:10 PM, David Woodhouse wrote:
> On Thu, 2018-07-12 at 13:00 +0200, Christian Borntraeger wrote:
>>
>> On s390 this seems to add about 10ns (~3%) for a guest exit/reentry
>> microbenchmark, mostly due to rcu_eqs_enter and rcu_eqs_exit now being
>> visible in perf samples. The older interface was cheaper.
> 
> Well, the older interface wasn't actually working, which made it
> moderately suboptimal :)
> 
> But that is fixed by the first patch of the two, so perhaps the second
> ought to be conditional on CONFIG_NO_HZ_FULL since it's only really
> fixing a real bug there?
> 
> Unless we can contrive some way to do the rcu_eqs_enter/exit only on
> the paths to/from userspace, and *not* when we loop in the kernel to
> handle interrupts (etc.) and immediately re-enter the guest without
> returning? That's somewhat non-trivial though...

Assuming that patch1 really fixes the issue (and all KVM hot loops are
supposed to have something like

if (need_resched())
schedule();

in them), this could work out.
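
For reference, the hot-loop shape being relied on here is roughly the
following (a sketch of the common pattern only; the helper names are
illustrative stand-ins, not real KVM symbols):

	while (vcpu_should_stay_in_kernel(vcpu)) {
		run_guest_once(vcpu);		/* one guest entry/exit */
		if (need_resched())
			schedule();		/* patch1 makes this report the RCU QS */
	}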

So you are thinking of something like

diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
index d8ee9f949ce3..cac5de8a887c 100644
--- a/include/linux/rcutree.h
+++ b/include/linux/rcutree.h
@@ -72,8 +72,17 @@ void cond_synchronize_sched(unsigned long oldstate);
 
 void rcu_idle_enter(void);
 void rcu_idle_exit(void);
+#ifdef CONFIG_NO_HZ_FULL
 void rcu_kvm_enter(void);
 void rcu_kvm_exit(void);
+#else
+static inline void rcu_kvm_enter(void)
+{
+}
+static inline void rcu_kvm_exit(void)
+{
+}
+#endif
 void rcu_irq_enter(void);
 void rcu_irq_exit(void);
 void rcu_irq_enter_irqson(void);

?



Re: [PATCH v2] kvm/x86: Inform RCU of quiescent state when entering guest mode

2018-07-12 Thread David Woodhouse
On Thu, 2018-07-12 at 13:00 +0200, Christian Borntraeger wrote:
> 
> On s390 this seems to add about 10ns (~3%) for a guest exit/reentry
> microbenchmark, mostly due to rcu_eqs_enter and rcu_eqs_exit now being
> visible in perf samples. The older interface was cheaper.

Well, the older interface wasn't actually working, which made it
moderately suboptimal :)

But that is fixed by the first patch of the two, so perhaps the second
ought to be conditional on CONFIG_NO_HZ_FULL since it's only really
fixing a real bug there?

Unless we can contrive some way to do the rcu_eqs_enter/exit only on
the paths to/from userspace, and *not* when we loop in the kernel to
handle interrupts (etc.) and immediately re-enter the guest without
returning? That's somewhat non-trivial though...
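
A rough sketch of that idea, just to make the shape concrete (vcpu_run()
and vcpu_enter_guest() as in arch/x86/kvm/x86.c, rcu_kvm_*() as in the
patches in this thread). The hard part is everything the inner loop may
legitimately touch while RCU is ignoring the CPU:

static int vcpu_run(struct kvm_vcpu *vcpu)
{
	int r = 1;

	rcu_kvm_enter();		/* leave RCU-watched mode once, on the way in */
	while (r > 0) {
		r = vcpu_enter_guest(vcpu);	/* may iterate many times */
		/* interrupt/work handling here runs while RCU ignores this CPU */
	}
	rcu_kvm_exit();			/* back to RCU-watched mode on the way out */

	return r;
}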



Re: [PATCH v2] kvm/x86: Inform RCU of quiescent state when entering guest mode

2018-07-12 Thread Christian Borntraeger



On 07/12/2018 10:31 AM, David Woodhouse wrote:
> 
> 
> On Wed, 2018-07-11 at 16:47 -0700, Paul E. McKenney wrote:
>>
 What changed was RCU's reactions to longish grace periods.  It used to
 be very aggressive about forcing the scheduler to do otherwise-unneeded
 context switches, which became a problem somewhere between v4.9 and v4.15.
 I therefore reduced the number of such context switches, which in turn
 caused KVM to tell RCU about quiescent states way too infrequently.
>>>  
>>> You talk about 
>>> commit bcbfdd01dce5556a952fae84ef16fd0f12525e7b
>>>  rcu: Make non-preemptive schedule be Tasks RCU quiescent state
>>>  
>>> correct? If so, then whatever (properly sent) patch comes up should
>>> contain a Fixes: tag.
>>
>> Not that one, but this one is at least part of the "team":
>>
>> 28053bc72c0e5 ("rcu: Add long-term CPU kicking").  I might need to use
>> "git bisect" to find the most relevant commit...  :-/
> 
> Whichever commit we blame, I suspect the Fixes: tag wants to go on
> Paul's earlier 'Make need_resched() respond to urgent RCU-QS needs'
> patch.
> 
> The rcu_kvm_enter() thing we're talking about now is actually
> addressing a problem which has existed for much longer — that with
> NO_HZ_FULL, a CPU might end up in guest mode for an indefinite period
> of time, with RCU waiting for it to return.

On s390 this seems to add about 10ns (~3%) for a guest exit/reentry
microbenchmark, mostly due to rcu_eqs_enter and rcu_eqs_exit now being
visible in perf samples. The older interface was cheaper.



Re: [PATCH v2] kvm/x86: Inform RCU of quiescent state when entering guest mode

2018-07-12 Thread David Woodhouse
On Thu, 2018-07-12 at 08:21 +0200, Christian Borntraeger wrote:
> 
> Is there a single patch that I can test, or do I have to combine
> all the pieces that are sprinkled in this mail thread?

I believe it should be just these two commits, and the first is
probably optional as the second makes it mostly redundant for the KVM
case. Fengguang's 0day bot is watching this tree, so I expect to get
success/failure results fairly shortly for various build tests at
least.

http://git.infradead.org/users/dwmw2/random-2.6.git/shortlog/refs/heads/kvm-rcu

As noted, the first (need_resched) patch took our observed
RCU latencies down from 5-10 seconds to 160ms, while the second one
that we're working on now took it down to 40ms.



Re: [PATCH v2] kvm/x86: Inform RCU of quiescent state when entering guest mode

2018-07-12 Thread David Woodhouse


On Wed, 2018-07-11 at 16:47 -0700, Paul E. McKenney wrote:
> 
> > > What changed was RCU's reactions to longish grace periods.  It used to
> > > be very aggressive about forcing the scheduler to do otherwise-unneeded
> > > context switches, which became a problem somewhere between v4.9 and v4.15.
> > > I therefore reduced the number of such context switches, which in turn
> > > caused KVM to tell RCU about quiescent states way too infrequently.
> > 
> > You talk about 
> > commit bcbfdd01dce5556a952fae84ef16fd0f12525e7b
> > rcu: Make non-preemptive schedule be Tasks RCU quiescent state
> > 
> > correct? If so, then whatever (properly sent) patch comes up should
> > contain a Fixes: tag.
> 
> Not that one, but this one is at least part of the "team":
> 
> 28053bc72c0e5 ("rcu: Add long-term CPU kicking").  I might need to use
> "git bisect" to find the most relevant commit...  :-/

Whichever commit we blame, I suspect the Fixes: tag wants to go on
Paul's earlier 'Make need_resched() respond to urgent RCU-QS needs'
patch.

The rcu_kvm_enter() thing we're talking about now is actually
addressing a problem which has existed for much longer — that with
NO_HZ_FULL, a CPU might end up in guest mode for an indefinite period
of time, with RCU waiting for it to return.



Re: [PATCH v2] kvm/x86: Inform RCU of quiescent state when entering guest mode

2018-07-11 Thread Christian Borntraeger



On 07/12/2018 01:37 AM, Paul E. McKenney wrote:
> On Wed, Jul 11, 2018 at 02:32:59PM -0700, Paul E. McKenney wrote:
>> On Wed, Jul 11, 2018 at 11:11:19PM +0200, Christian Borntraeger wrote:
>>>
>>>
>>> On 07/11/2018 10:27 PM, Paul E. McKenney wrote:
 On Wed, Jul 11, 2018 at 08:39:36PM +0200, Christian Borntraeger wrote:
>
>
> On 07/11/2018 08:36 PM, Paul E. McKenney wrote:
>> On Wed, Jul 11, 2018 at 11:20:53AM -0700, Paul E. McKenney wrote:
>>> On Wed, Jul 11, 2018 at 07:01:01PM +0100, David Woodhouse wrote:
 From: David Woodhouse 

 RCU can spend long periods of time waiting for a CPU which is actually 
 in
 KVM guest mode, entirely pointlessly. Treat it like the idle and 
 userspace
 modes, and don't wait for it.

 Signed-off-by: David Woodhouse 
>>>
>>> And idiot here forgot about some of the debugging code in RCU's 
>>> dyntick-idle
>>> code.  I will reply with a fixed patch.
>>>
>>> The code below works just fine as long as you don't enable 
>>> CONFIG_RCU_EQS_DEBUG,
>>> so should be OK for testing, just not for mainline.
>>
>> And here is the updated code that allegedly avoids splatting when run 
>> with
>> CONFIG_RCU_EQS_DEBUG.
>>
>> Thoughts?
>>
>>  Thanx, Paul
>>
>> 
>>
>> commit 12cd59e49cf734f907f44b696e2c6e4b46a291c3
>> Author: David Woodhouse 
>> Date:   Wed Jul 11 19:01:01 2018 +0100
>>
>> kvm/x86: Inform RCU of quiescent state when entering guest mode
>> 
>> RCU can spend long periods of time waiting for a CPU which is 
>> actually in
>> KVM guest mode, entirely pointlessly. Treat it like the idle and 
>> userspace
>> modes, and don't wait for it.
>> 
>> Signed-off-by: David Woodhouse 
>> Signed-off-by: Paul E. McKenney 
>> [ paulmck: Adjust to avoid bad advice I gave to dwmw, avoid 
>> WARN_ON()s. ]
>>
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 0046aa70205a..b0c82f70afa7 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -7458,7 +7458,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>>  vcpu->arch.switch_db_regs &= ~KVM_DEBUGREG_RELOAD;
>>  }
>>
>> +rcu_kvm_enter();
>>  kvm_x86_ops->run(vcpu);
>> +rcu_kvm_exit();
>
> As indicated in my other mail. This is supposed to be handled in the 
> guest_enter|exit_ calls around
> the run function. This would also handle other architectures. So if the 
> guest_enter_irqoff code is
> not good enough, we should rather fix that instead of adding another rcu 
> hint.

 Something like this, on top of the earlier patch?  I am not at all
 confident of this patch because there might be other entry/exit
 paths I am missing.  Plus there might be RCU uses on the arch-specific
 paths to and from the guest OS.

 Thoughts?

>>>
>>> If you instrument guest_enter/exit, you should cover all cases and all
>>> architectures as far as I can tell. FWIW, we did this rcu_note thing back
>>> then precisely to handle this case of long-running guests blocking RCU for
>>> many seconds, and I am pretty sure that it did help.
>>
>> And my second patch on the email you replied to replaced the only call
>> to rcu_virt_note_context_switch().  So maybe it covers what it needs to,
>> but yes, there might well be things I missed.  Let's see what David
>> comes up with.
>>
>> What changed was RCU's reactions to longish grace periods.  It used to
>> be very aggressive about forcing the scheduler to do otherwise-unneeded
>> context switches, which became a problem somewhere between v4.9 and v4.15.
>> I therefore reduced the number of such context switches, which in turn
>> caused KVM to tell RCU about quiescent states way too infrequently.
>>
>> The advantage of the rcu_kvm_enter()/rcu_kvm_exit() approach is that
>> it tells RCU of an extended duration in the guest, which means that
>> RCU can ignore the corresponding CPU, which in turn allows the guest
>> to proceed without any RCU-induced interruptions.
>>
>> Does that make sense, or am I missing something?  I freely admit to
>> much ignorance of both kvm and s390!  ;-)
> 
> But I am getting some rcutorture near misses on the commit that
> introduces rcu_kvm_enter() and rcu_kvm_exit() to the x86 arch-specific
> vcpu_enter_guest() function.  These near misses occur when running
> rcutorture scenarios TREE01 and TREE03, and in my -rcu tree rather
> than the v4.15 version of this patch.
> 
> Given that I am making pervasive changes to the way that RCU works,
> it might well be that this commit is an innocent bystander.

Re: [PATCH v2] kvm/x86: Inform RCU of quiescent state when entering guest mode

2018-07-11 Thread Paul E. McKenney
On Wed, Jul 11, 2018 at 04:37:27PM -0700, Paul E. McKenney wrote:
> On Wed, Jul 11, 2018 at 02:32:59PM -0700, Paul E. McKenney wrote:
> > On Wed, Jul 11, 2018 at 11:11:19PM +0200, Christian Borntraeger wrote:
> > > 
> > > 
> > > On 07/11/2018 10:27 PM, Paul E. McKenney wrote:
> > > > On Wed, Jul 11, 2018 at 08:39:36PM +0200, Christian Borntraeger wrote:
> > > >>
> > > >>
> > > >> On 07/11/2018 08:36 PM, Paul E. McKenney wrote:
> > > >>> On Wed, Jul 11, 2018 at 11:20:53AM -0700, Paul E. McKenney wrote:
> > >  On Wed, Jul 11, 2018 at 07:01:01PM +0100, David Woodhouse wrote:
> > > > From: David Woodhouse 
> > > >
> > > > RCU can spend long periods of time waiting for a CPU which is 
> > > > actually in
> > > > KVM guest mode, entirely pointlessly. Treat it like the idle and 
> > > > userspace
> > > > modes, and don't wait for it.
> > > >
> > > > Signed-off-by: David Woodhouse 
> > > 
> > >  And idiot here forgot about some of the debugging code in RCU's 
> > >  dyntick-idle
> > >  code.  I will reply with a fixed patch.
> > > 
> > >  The code below works just fine as long as you don't enable 
> > >  CONFIG_RCU_EQS_DEBUG,
> > >  so should be OK for testing, just not for mainline.
> > > >>>
> > > >>> And here is the updated code that allegedly avoids splatting when run 
> > > >>> with
> > > >>> CONFIG_RCU_EQS_DEBUG.
> > > >>>
> > > >>> Thoughts?
> > > >>>
> > > >>>   Thanx, Paul
> > > >>>
> > > >>> 
> > > >>>
> > > >>> commit 12cd59e49cf734f907f44b696e2c6e4b46a291c3
> > > >>> Author: David Woodhouse 
> > > >>> Date:   Wed Jul 11 19:01:01 2018 +0100
> > > >>>
> > > >>> kvm/x86: Inform RCU of quiescent state when entering guest mode
> > > >>> 
> > > >>> RCU can spend long periods of time waiting for a CPU which is 
> > > >>> actually in
> > > >>> KVM guest mode, entirely pointlessly. Treat it like the idle and 
> > > >>> userspace
> > > >>> modes, and don't wait for it.
> > > >>> 
> > > >>> Signed-off-by: David Woodhouse 
> > > >>> Signed-off-by: Paul E. McKenney 
> > > >>> [ paulmck: Adjust to avoid bad advice I gave to dwmw, avoid 
> > > >>> WARN_ON()s. ]
> > > >>>
> > > >>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > > >>> index 0046aa70205a..b0c82f70afa7 100644
> > > >>> --- a/arch/x86/kvm/x86.c
> > > >>> +++ b/arch/x86/kvm/x86.c
> > > >>> @@ -7458,7 +7458,9 @@ static int vcpu_enter_guest(struct kvm_vcpu 
> > > >>> *vcpu)
> > > >>>   vcpu->arch.switch_db_regs &= ~KVM_DEBUGREG_RELOAD;
> > > >>>   }
> > > >>>
> > > >>> + rcu_kvm_enter();
> > > >>>   kvm_x86_ops->run(vcpu);
> > > >>> + rcu_kvm_exit();
> > > >>
> > > >> As indicated in my other mail. This is supposed to be handled in the 
> > > >> guest_enter|exit_ calls around
> > > >> the run function. This would also handle other architectures. So if 
> > > >> the guest_enter_irqoff code is
> > > >> not good enough, we should rather fix that instead of adding another 
> > > >> rcu hint.
> > > > 
> > > > Something like this, on top of the earlier patch?  I am not at all
> > > > confident of this patch because there might be other entry/exit
> > > > paths I am missing.  Plus there might be RCU uses on the arch-specific
> > > > paths to and from the guest OS.
> > > > 
> > > > Thoughts?
> > > > 
> > > 
> > > If you instrument guest_enter/exit, you should cover all cases and all
> > > architectures as far as I can tell. FWIW, we did this rcu_note thing back
> > > then precisely to handle this case of long-running guests blocking RCU for
> > > many seconds, and I am pretty sure that it did help.
> > 
> > And my second patch on the email you replied to replaced the only call
> > to rcu_virt_note_context_switch().  So maybe it covers what it needs to,
> > but yes, there might well be things I missed.  Let's see what David
> > comes up with.
> > 
> > What changed was RCU's reactions to longish grace periods.  It used to
> > be very aggressive about forcing the scheduler to do otherwise-unneeded
> > context switches, which became a problem somewhere between v4.9 and v4.15.
> > I therefore reduced the number of such context switches, which in turn
> > caused KVM to tell RCU about quiescent states way too infrequently.
> > 
> > The advantage of the rcu_kvm_enter()/rcu_kvm_exit() approach is that
> > it tells RCU of an extended duration in the guest, which means that
> > RCU can ignore the corresponding CPU, which in turn allows the guest
> > to proceed without any RCU-induced interruptions.
> > 
> > Does that make sense, or am I missing something?  I freely admit to
> > much ignorance of both kvm and s390!  ;-)
> 
> But I am getting some rcutorture near misses on the commit that
> introduces rcu_kvm_enter() and rcu_kvm_exit() to the x86 arch-specific
> vcpu_enter_

Re: [PATCH v2] kvm/x86: Inform RCU of quiescent state when entering guest mode

2018-07-11 Thread Paul E. McKenney
On Wed, Jul 11, 2018 at 11:39:10PM +0200, Christian Borntraeger wrote:
> 
> 
> On 07/11/2018 11:32 PM, Paul E. McKenney wrote:
> > On Wed, Jul 11, 2018 at 11:11:19PM +0200, Christian Borntraeger wrote:
> >>
> >>
> >> On 07/11/2018 10:27 PM, Paul E. McKenney wrote:
> >>> On Wed, Jul 11, 2018 at 08:39:36PM +0200, Christian Borntraeger wrote:
> 
> 
>  On 07/11/2018 08:36 PM, Paul E. McKenney wrote:
> > On Wed, Jul 11, 2018 at 11:20:53AM -0700, Paul E. McKenney wrote:
> >> On Wed, Jul 11, 2018 at 07:01:01PM +0100, David Woodhouse wrote:
> >>> From: David Woodhouse 
> >>>
> >>> RCU can spend long periods of time waiting for a CPU which is 
> >>> actually in
> >>> KVM guest mode, entirely pointlessly. Treat it like the idle and 
> >>> userspace
> >>> modes, and don't wait for it.
> >>>
> >>> Signed-off-by: David Woodhouse 
> >>
> >> And idiot here forgot about some of the debugging code in RCU's 
> >> dyntick-idle
> >> code.  I will reply with a fixed patch.
> >>
> >> The code below works just fine as long as you don't enable 
> >> CONFIG_RCU_EQS_DEBUG,
> >> so should be OK for testing, just not for mainline.
> >
> > And here is the updated code that allegedly avoids splatting when run 
> > with
> > CONFIG_RCU_EQS_DEBUG.
> >
> > Thoughts?
> >
> > Thanx, Paul
> >
> > 
> >
> > commit 12cd59e49cf734f907f44b696e2c6e4b46a291c3
> > Author: David Woodhouse 
> > Date:   Wed Jul 11 19:01:01 2018 +0100
> >
> > kvm/x86: Inform RCU of quiescent state when entering guest mode
> > 
> > RCU can spend long periods of time waiting for a CPU which is 
> > actually in
> > KVM guest mode, entirely pointlessly. Treat it like the idle and 
> > userspace
> > modes, and don't wait for it.
> > 
> > Signed-off-by: David Woodhouse 
> > Signed-off-by: Paul E. McKenney 
> > [ paulmck: Adjust to avoid bad advice I gave to dwmw, avoid 
> > WARN_ON()s. ]
> >
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 0046aa70205a..b0c82f70afa7 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -7458,7 +7458,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
> > vcpu->arch.switch_db_regs &= ~KVM_DEBUGREG_RELOAD;
> > }
> >
> > +   rcu_kvm_enter();
> > kvm_x86_ops->run(vcpu);
> > +   rcu_kvm_exit();
> 
>  As indicated in my other mail. This is supposed to be handled in the 
>  guest_enter|exit_ calls around
>  the run function. This would also handle other architectures. So if the 
>  guest_enter_irqoff code is
>  not good enough, we should rather fix that instead of adding another rcu 
>  hint.
> >>>
> >>> Something like this, on top of the earlier patch?  I am not at all
> >>> confident of this patch because there might be other entry/exit
> >>> paths I am missing.  Plus there might be RCU uses on the arch-specific
> >>> paths to and from the guest OS.
> >>>
> >>> Thoughts?
> >>>
> >>
> >> If you instrument guest_enter/exit, you should cover all cases and all
> >> architectures as far as I can tell. FWIW, we did this rcu_note thing back
> >> then precisely to handle this case of long-running guests blocking RCU for
> >> many seconds, and I am pretty sure that it did help.
> > 
> > And my second patch on the email you replied to replaced the only call
> > to rcu_virt_note_context_switch().  So maybe it covers what it needs to,
> > but yes, there might well be things I missed.  Let's see what David
> > comes up with.
> > 
> > What changed was RCU's reactions to longish grace periods.  It used to
> > be very aggressive about forcing the scheduler to do otherwise-unneeded
> > context switches, which became a problem somewhere between v4.9 and v4.15.
> > I therefore reduced the number of such context switches, which in turn
> > caused KVM to tell RCU about quiescent states way too infrequently.
> 
> You talk about 
> commit bcbfdd01dce5556a952fae84ef16fd0f12525e7b
> rcu: Make non-preemptive schedule be Tasks RCU quiescent state
> 
> correct? If so, then whatever (properly sent) patch comes up should
> contain a Fixes: tag.

Not that one, but this one is at least part of the "team":

28053bc72c0e5 ("rcu: Add long-term CPU kicking").  I might need to use
"git bisect" to find the most relevant commit...  :-/

> > The advantage of the rcu_kvm_enter()/rcu_kvm_exit() approach is that
> > it tells RCU of an extended duration in the guest, which means that
> > RCU can ignore the corresponding CPU, which in turn allows the guest
> > to proceed without any RCU-induced interruptions.
> > 
> > Does that make sense, or a

Re: [PATCH v2] kvm/x86: Inform RCU of quiescent state when entering guest mode

2018-07-11 Thread Paul E. McKenney
On Wed, Jul 11, 2018 at 02:32:59PM -0700, Paul E. McKenney wrote:
> On Wed, Jul 11, 2018 at 11:11:19PM +0200, Christian Borntraeger wrote:
> > 
> > 
> > On 07/11/2018 10:27 PM, Paul E. McKenney wrote:
> > > On Wed, Jul 11, 2018 at 08:39:36PM +0200, Christian Borntraeger wrote:
> > >>
> > >>
> > >> On 07/11/2018 08:36 PM, Paul E. McKenney wrote:
> > >>> On Wed, Jul 11, 2018 at 11:20:53AM -0700, Paul E. McKenney wrote:
> >  On Wed, Jul 11, 2018 at 07:01:01PM +0100, David Woodhouse wrote:
> > > From: David Woodhouse 
> > >
> > > RCU can spend long periods of time waiting for a CPU which is 
> > > actually in
> > > KVM guest mode, entirely pointlessly. Treat it like the idle and 
> > > userspace
> > > modes, and don't wait for it.
> > >
> > > Signed-off-by: David Woodhouse 
> > 
> >  And idiot here forgot about some of the debugging code in RCU's 
> >  dyntick-idle
> >  code.  I will reply with a fixed patch.
> > 
> >  The code below works just fine as long as you don't enable 
> >  CONFIG_RCU_EQS_DEBUG,
> >  so should be OK for testing, just not for mainline.
> > >>>
> > >>> And here is the updated code that allegedly avoids splatting when run 
> > >>> with
> > >>> CONFIG_RCU_EQS_DEBUG.
> > >>>
> > >>> Thoughts?
> > >>>
> > >>> Thanx, Paul
> > >>>
> > >>> 
> > >>>
> > >>> commit 12cd59e49cf734f907f44b696e2c6e4b46a291c3
> > >>> Author: David Woodhouse 
> > >>> Date:   Wed Jul 11 19:01:01 2018 +0100
> > >>>
> > >>> kvm/x86: Inform RCU of quiescent state when entering guest mode
> > >>> 
> > >>> RCU can spend long periods of time waiting for a CPU which is 
> > >>> actually in
> > >>> KVM guest mode, entirely pointlessly. Treat it like the idle and 
> > >>> userspace
> > >>> modes, and don't wait for it.
> > >>> 
> > >>> Signed-off-by: David Woodhouse 
> > >>> Signed-off-by: Paul E. McKenney 
> > >>> [ paulmck: Adjust to avoid bad advice I gave to dwmw, avoid 
> > >>> WARN_ON()s. ]
> > >>>
> > >>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > >>> index 0046aa70205a..b0c82f70afa7 100644
> > >>> --- a/arch/x86/kvm/x86.c
> > >>> +++ b/arch/x86/kvm/x86.c
> > >>> @@ -7458,7 +7458,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
> > >>> vcpu->arch.switch_db_regs &= ~KVM_DEBUGREG_RELOAD;
> > >>> }
> > >>>
> > >>> +   rcu_kvm_enter();
> > >>> kvm_x86_ops->run(vcpu);
> > >>> +   rcu_kvm_exit();
> > >>
> > >> As indicated in my other mail. This is supposed to be handled in the 
> > >> guest_enter|exit_ calls around
> > >> the run function. This would also handle other architectures. So if the 
> > >> guest_enter_irqoff code is
> > >> not good enough, we should rather fix that instead of adding another rcu 
> > >> hint.
> > > 
> > > Something like this, on top of the earlier patch?  I am not at all
> > > confident of this patch because there might be other entry/exit
> > > paths I am missing.  Plus there might be RCU uses on the arch-specific
> > > paths to and from the guest OS.
> > > 
> > > Thoughts?
> > > 
> > 
> > If you instrument guest_enter/exit, you should cover all cases and all
> > architectures as far as I can tell. FWIW, we did this rcu_note thing back
> > then precisely to handle this case of long-running guests blocking RCU for
> > many seconds, and I am pretty sure that it did help.
> 
> And my second patch on the email you replied to replaced the only call
> to rcu_virt_note_context_switch().  So maybe it covers what it needs to,
> but yes, there might well be things I missed.  Let's see what David
> comes up with.
> 
> What changed was RCU's reactions to longish grace periods.  It used to
> be very aggressive about forcing the scheduler to do otherwise-unneeded
> context switches, which became a problem somewhere between v4.9 and v4.15.
> I therefore reduced the number of such context switches, which in turn
> caused KVM to tell RCU about quiescent states way too infrequently.
> 
> The advantage of the rcu_kvm_enter()/rcu_kvm_exit() approach is that
> it tells RCU of an extended duration in the guest, which means that
> RCU can ignore the corresponding CPU, which in turn allows the guest
> to proceed without any RCU-induced interruptions.
> 
> Does that make sense, or am I missing something?  I freely admit to
> much ignorance of both kvm and s390!  ;-)

But I am getting some rcutorture near misses on the commit that
introduces rcu_kvm_enter() and rcu_kvm_exit() to the x86 arch-specific
vcpu_enter_guest() function.  These near misses occur when running
rcutorture scenarios TREE01 and TREE03, and in my -rcu tree rather
than the v4.15 version of this patch.

Given that I am making pervasive changes to the way that RCU works,
it might well be that this commit is an innocent bystander.  

Re: [PATCH v2] kvm/x86: Inform RCU of quiescent state when entering guest mode

2018-07-11 Thread Christian Borntraeger



On 07/11/2018 11:32 PM, Paul E. McKenney wrote:
> On Wed, Jul 11, 2018 at 11:11:19PM +0200, Christian Borntraeger wrote:
>>
>>
>> On 07/11/2018 10:27 PM, Paul E. McKenney wrote:
>>> On Wed, Jul 11, 2018 at 08:39:36PM +0200, Christian Borntraeger wrote:


 On 07/11/2018 08:36 PM, Paul E. McKenney wrote:
> On Wed, Jul 11, 2018 at 11:20:53AM -0700, Paul E. McKenney wrote:
>> On Wed, Jul 11, 2018 at 07:01:01PM +0100, David Woodhouse wrote:
>>> From: David Woodhouse 
>>>
>>> RCU can spend long periods of time waiting for a CPU which is actually 
>>> in
>>> KVM guest mode, entirely pointlessly. Treat it like the idle and 
>>> userspace
>>> modes, and don't wait for it.
>>>
>>> Signed-off-by: David Woodhouse 
>>
>> And idiot here forgot about some of the debugging code in RCU's 
>> dyntick-idle
>> code.  I will reply with a fixed patch.
>>
>> The code below works just fine as long as you don't enable 
>> CONFIG_RCU_EQS_DEBUG,
>> so should be OK for testing, just not for mainline.
>
> And here is the updated code that allegedly avoids splatting when run with
> CONFIG_RCU_EQS_DEBUG.
>
> Thoughts?
>
>   Thanx, Paul
>
> 
>
> commit 12cd59e49cf734f907f44b696e2c6e4b46a291c3
> Author: David Woodhouse 
> Date:   Wed Jul 11 19:01:01 2018 +0100
>
> kvm/x86: Inform RCU of quiescent state when entering guest mode
> 
> RCU can spend long periods of time waiting for a CPU which is 
> actually in
> KVM guest mode, entirely pointlessly. Treat it like the idle and 
> userspace
> modes, and don't wait for it.
> 
> Signed-off-by: David Woodhouse 
> Signed-off-by: Paul E. McKenney 
> [ paulmck: Adjust to avoid bad advice I gave to dwmw, avoid 
> WARN_ON()s. ]
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 0046aa70205a..b0c82f70afa7 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -7458,7 +7458,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>   vcpu->arch.switch_db_regs &= ~KVM_DEBUGREG_RELOAD;
>   }
>
> + rcu_kvm_enter();
>   kvm_x86_ops->run(vcpu);
> + rcu_kvm_exit();

 As indicated in my other mail. This is supposed to be handled in the 
 guest_enter|exit_ calls around
 the run function. This would also handle other architectures. So if the 
 guest_enter_irqoff code is
 not good enough, we should rather fix that instead of adding another rcu 
 hint.
>>>
>>> Something like this, on top of the earlier patch?  I am not at all
>>> confident of this patch because there might be other entry/exit
>>> paths I am missing.  Plus there might be RCU uses on the arch-specific
>>> paths to and from the guest OS.
>>>
>>> Thoughts?
>>>
>>
>> If you instrument guest_enter/exit, you should cover all cases and all
>> architectures as far as I can tell. FWIW, we did this rcu_note thing back
>> then precisely to handle this case of long-running guests blocking RCU for
>> many seconds, and I am pretty sure that it did help.
> 
> And my second patch on the email you replied to replaced the only call
> to rcu_virt_note_context_switch().  So maybe it covers what it needs to,
> but yes, there might well be things I missed.  Let's see what David
> comes up with.
> 
> What changed was RCU's reactions to longish grace periods.  It used to
> be very aggressive about forcing the scheduler to do otherwise-unneeded
> context switches, which became a problem somewhere between v4.9 and v4.15.
> I therefore reduced the number of such context switches, which in turn
> caused KVM to tell RCU about quiescent states way too infrequently.

You talk about 
commit bcbfdd01dce5556a952fae84ef16fd0f12525e7b
rcu: Make non-preemptive schedule be Tasks RCU quiescent state

correct? If so, then whatever (properly sent) patch comes up should
contain a Fixes: tag.



> 
> The advantage of the rcu_kvm_enter()/rcu_kvm_exit() approach is that
> it tells RCU of an extended duration in the guest, which means that
> RCU can ignore the corresponding CPU, which in turn allows the guest
> to proceed without any RCU-induced interruptions.
> 
> Does that make sense, or am I missing something?  I freely admit to
> much ignorance of both kvm and s390!  ;-)

With that explanation it makes perfect sense to replace
rcu_virt_note_context_switch with rcu_kvm_enter/exit from an RCU performance
perspective. I assume that rcu_kvm_enter is not much slower than
rcu_virt_note_context_switch? Because we call it on every guest entry/exit,
and we may have plenty of those with a ping-pong I/O workload.
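
For a rough sense of scale, using only the numbers quoted earlier in this
thread: 10ns at ~3% puts the measured exit/reentry round trip near
10ns / 0.03, i.e. roughly 330ns, and a guest taking 100,000 lightweight
exits per second would pay about 10ns * 100,000 = 1ms of extra CPU time
per second (~0.1%). The absolute cost is small; the concern is the ~3%
relative hit on exit-heavy workloads such as ping-pong I/O.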



Re: [PATCH v2] kvm/x86: Inform RCU of quiescent state when entering guest mode

2018-07-11 Thread Paul E. McKenney
On Wed, Jul 11, 2018 at 11:11:19PM +0200, Christian Borntraeger wrote:
> 
> 
> On 07/11/2018 10:27 PM, Paul E. McKenney wrote:
> > On Wed, Jul 11, 2018 at 08:39:36PM +0200, Christian Borntraeger wrote:
> >>
> >>
> >> On 07/11/2018 08:36 PM, Paul E. McKenney wrote:
> >>> On Wed, Jul 11, 2018 at 11:20:53AM -0700, Paul E. McKenney wrote:
>  On Wed, Jul 11, 2018 at 07:01:01PM +0100, David Woodhouse wrote:
> > From: David Woodhouse 
> >
> > RCU can spend long periods of time waiting for a CPU which is actually 
> > in
> > KVM guest mode, entirely pointlessly. Treat it like the idle and 
> > userspace
> > modes, and don't wait for it.
> >
> > Signed-off-by: David Woodhouse 
> 
>  And idiot here forgot about some of the debugging code in RCU's 
>  dyntick-idle
>  code.  I will reply with a fixed patch.
> 
>  The code below works just fine as long as you don't enable 
>  CONFIG_RCU_EQS_DEBUG,
>  so should be OK for testing, just not for mainline.
> >>>
> >>> And here is the updated code that allegedly avoids splatting when run with
> >>> CONFIG_RCU_EQS_DEBUG.
> >>>
> >>> Thoughts?
> >>>
> >>>   Thanx, Paul
> >>>
> >>> 
> >>>
> >>> commit 12cd59e49cf734f907f44b696e2c6e4b46a291c3
> >>> Author: David Woodhouse 
> >>> Date:   Wed Jul 11 19:01:01 2018 +0100
> >>>
> >>> kvm/x86: Inform RCU of quiescent state when entering guest mode
> >>> 
> >>> RCU can spend long periods of time waiting for a CPU which is 
> >>> actually in
> >>> KVM guest mode, entirely pointlessly. Treat it like the idle and 
> >>> userspace
> >>> modes, and don't wait for it.
> >>> 
> >>> Signed-off-by: David Woodhouse 
> >>> Signed-off-by: Paul E. McKenney 
> >>> [ paulmck: Adjust to avoid bad advice I gave to dwmw, avoid 
> >>> WARN_ON()s. ]
> >>>
> >>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> >>> index 0046aa70205a..b0c82f70afa7 100644
> >>> --- a/arch/x86/kvm/x86.c
> >>> +++ b/arch/x86/kvm/x86.c
> >>> @@ -7458,7 +7458,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
> >>>   vcpu->arch.switch_db_regs &= ~KVM_DEBUGREG_RELOAD;
> >>>   }
> >>>
> >>> + rcu_kvm_enter();
> >>>   kvm_x86_ops->run(vcpu);
> >>> + rcu_kvm_exit();
> >>
> >> As indicated in my other mail. This is supposed to be handled in the 
> >> guest_enter|exit_ calls around
> >> the run function. This would also handle other architectures. So if the 
> >> guest_enter_irqoff code is
> >> not good enough, we should rather fix that instead of adding another rcu 
> >> hint.
> > 
> > Something like this, on top of the earlier patch?  I am not at all
> > confident of this patch because there might be other entry/exit
> > paths I am missing.  Plus there might be RCU uses on the arch-specific
> > paths to and from the guest OS.
> > 
> > Thoughts?
> > 
> 
> If you instrument guest_enter/exit, you should cover all cases and all
> architectures as far as I can tell. FWIW, we did this rcu_note thing back
> then precisely to handle this case of long-running guests blocking RCU for
> many seconds, and I am pretty sure that it did help.

And my second patch on the email you replied to replaced the only call
to rcu_virt_note_context_switch().  So maybe it covers what it needs to,
but yes, there might well be things I missed.  Let's see what David
comes up with.

What changed was RCU's reactions to longish grace periods.  It used to
be very aggressive about forcing the scheduler to do otherwise-unneeded
context switches, which became a problem somewhere between v4.9 and v4.15.
I therefore reduced the number of such context switches, which in turn
caused KVM to tell RCU about quiescent states way too infrequently.

The advantage of the rcu_kvm_enter()/rcu_kvm_exit() approach is that
it tells RCU of an extended duration in the guest, which means that
RCU can ignore the corresponding CPU, which in turn allows the guest
to proceed without any RCU-induced interruptions.

Does that make sense, or am I missing something?  I freely admit to
much ignorance of both kvm and s390!  ;-)
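
Condensed into code, the difference being described (both call sites appear
verbatim in the patches elsewhere in this thread):

	/* old hint (from guest_enter_irqoff()): one momentary quiescent-state
	 * report; RCU still expects further reports from this CPU while the
	 * guest keeps running */
	rcu_virt_note_context_switch(smp_processor_id());
	kvm_x86_ops->run(vcpu);

	/* new approach: an extended quiescent state, as for idle and
	 * userspace; RCU ignores this CPU until the matching exit */
	rcu_kvm_enter();
	kvm_x86_ops->run(vcpu);
	rcu_kvm_exit();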

Thanx, Paul



Re: [PATCH v2] kvm/x86: Inform RCU of quiescent state when entering guest mode

2018-07-11 Thread Christian Borntraeger



On 07/11/2018 10:27 PM, Paul E. McKenney wrote:
> On Wed, Jul 11, 2018 at 08:39:36PM +0200, Christian Borntraeger wrote:
>>
>>
>> On 07/11/2018 08:36 PM, Paul E. McKenney wrote:
>>> On Wed, Jul 11, 2018 at 11:20:53AM -0700, Paul E. McKenney wrote:
 On Wed, Jul 11, 2018 at 07:01:01PM +0100, David Woodhouse wrote:
> From: David Woodhouse 
>
> RCU can spend long periods of time waiting for a CPU which is actually in
> KVM guest mode, entirely pointlessly. Treat it like the idle and userspace
> modes, and don't wait for it.
>
> Signed-off-by: David Woodhouse 

 And idiot here forgot about some of the debugging code in RCU's 
 dyntick-idle
 code.  I will reply with a fixed patch.

 The code below works just fine as long as you don't enable 
 CONFIG_RCU_EQS_DEBUG,
 so should be OK for testing, just not for mainline.
>>>
>>> And here is the updated code that allegedly avoids splatting when run with
>>> CONFIG_RCU_EQS_DEBUG.
>>>
>>> Thoughts?
>>>
>>> Thanx, Paul
>>>
>>> 
>>>
>>> commit 12cd59e49cf734f907f44b696e2c6e4b46a291c3
>>> Author: David Woodhouse 
>>> Date:   Wed Jul 11 19:01:01 2018 +0100
>>>
>>> kvm/x86: Inform RCU of quiescent state when entering guest mode
>>> 
>>> RCU can spend long periods of time waiting for a CPU which is actually 
>>> in
>>> KVM guest mode, entirely pointlessly. Treat it like the idle and 
>>> userspace
>>> modes, and don't wait for it.
>>> 
>>> Signed-off-by: David Woodhouse 
>>> Signed-off-by: Paul E. McKenney 
>>> [ paulmck: Adjust to avoid bad advice I gave to dwmw, avoid WARN_ON()s. 
>>> ]
>>>
>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>> index 0046aa70205a..b0c82f70afa7 100644
>>> --- a/arch/x86/kvm/x86.c
>>> +++ b/arch/x86/kvm/x86.c
>>> @@ -7458,7 +7458,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>>> vcpu->arch.switch_db_regs &= ~KVM_DEBUGREG_RELOAD;
>>> }
>>>
>>> +   rcu_kvm_enter();
>>> kvm_x86_ops->run(vcpu);
>>> +   rcu_kvm_exit();
>>
>> As indicated in my other mail. This is supposed to be handled in the 
>> guest_enter|exit_ calls around
>> the run function. This would also handle other architectures. So if the 
>> guest_enter_irqoff code is
>> not good enough, we should rather fix that instead of adding another rcu 
>> hint.
> 
> Something like this, on top of the earlier patch?  I am not at all
> confident of this patch because there might be other entry/exit
> paths I am missing.  Plus there might be RCU uses on the arch-specific
> paths to and from the guest OS.
> 
> Thoughts?
> 

If you instrument guest_enter/exit, you should cover all cases and all
architectures as far as I can tell. FWIW, we did this rcu_note thing back
then precisely to handle this case of long-running guests blocking RCU for
many seconds, and I am pretty sure that it did help.
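
To make the "covers all architectures" point concrete, each arch's vcpu-run
path brackets its low-level entry with the same helpers, roughly like this
(arch_run_guest() is an illustrative stand-in, not a real symbol):

	local_irq_disable();
	guest_enter_irqoff();	/* common hook: rcu_kvm_enter() could live here */
	arch_run_guest(vcpu);	/* arch-specific, e.g. kvm_x86_ops->run() on x86 */
	guest_exit_irqoff();	/* ...and rcu_kvm_exit() here */
	local_irq_enable();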



Re: [PATCH v2] kvm/x86: Inform RCU of quiescent state when entering guest mode

2018-07-11 Thread Paul E. McKenney
On Wed, Jul 11, 2018 at 09:54:52PM +0100, David Woodhouse wrote:
> On Wed, 2018-07-11 at 13:27 -0700, Paul E. McKenney wrote:
> > 
> > Something like this, on top of the earlier patch?  I am not at all
> > confident of this patch because there might be other entry/exit
> > paths I am missing.  Plus there might be RCU uses on the arch-
> > specific paths to and from the guest OS.
> 
> That was going to be my starting point, yes. With some additional
> manual review and testing to cover the "not at all confident" part. :)

Please feel free to change as needed, up to and including throwing
my patch away and starting over.  ;-)

Thanx, Paul



Re: [PATCH v2] kvm/x86: Inform RCU of quiescent state when entering guest mode

2018-07-11 Thread David Woodhouse
On Wed, 2018-07-11 at 13:27 -0700, Paul E. McKenney wrote:
> 
> Something like this, on top of the earlier patch?  I am not at all
> confident of this patch because there might be other entry/exit
> paths I am missing.  Plus there might be RCU uses on the arch-
> specific paths to and from the guest OS.

That was going to be my starting point, yes. With some additional
manual review and testing to cover the "not at all confident" part. :)

Thanks.



Re: [PATCH v2] kvm/x86: Inform RCU of quiescent state when entering guest mode

2018-07-11 Thread Paul E. McKenney
On Wed, Jul 11, 2018 at 08:39:36PM +0200, Christian Borntraeger wrote:
> 
> 
> On 07/11/2018 08:36 PM, Paul E. McKenney wrote:
> > On Wed, Jul 11, 2018 at 11:20:53AM -0700, Paul E. McKenney wrote:
> >> On Wed, Jul 11, 2018 at 07:01:01PM +0100, David Woodhouse wrote:
> >>> From: David Woodhouse 
> >>>
> >>> RCU can spend long periods of time waiting for a CPU which is actually in
> >>> KVM guest mode, entirely pointlessly. Treat it like the idle and userspace
> >>> modes, and don't wait for it.
> >>>
> >>> Signed-off-by: David Woodhouse 
> >>
> >> And idiot here forgot about some of the debugging code in RCU's 
> >> dyntick-idle
> >> code.  I will reply with a fixed patch.
> >>
> >> The code below works just fine as long as you don't enable 
> >> CONFIG_RCU_EQS_DEBUG,
> >> so should be OK for testing, just not for mainline.
> > 
> > And here is the updated code that allegedly avoids splatting when run with
> > CONFIG_RCU_EQS_DEBUG.
> > 
> > Thoughts?
> > 
> > Thanx, Paul
> > 
> > 
> > 
> > commit 12cd59e49cf734f907f44b696e2c6e4b46a291c3
> > Author: David Woodhouse 
> > Date:   Wed Jul 11 19:01:01 2018 +0100
> > 
> > kvm/x86: Inform RCU of quiescent state when entering guest mode
> > 
> > RCU can spend long periods of time waiting for a CPU which is actually 
> > in
> > KVM guest mode, entirely pointlessly. Treat it like the idle and 
> > userspace
> > modes, and don't wait for it.
> > 
> > Signed-off-by: David Woodhouse 
> > Signed-off-by: Paul E. McKenney 
> > [ paulmck: Adjust to avoid bad advice I gave to dwmw, avoid WARN_ON()s. 
> > ]
> > 
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 0046aa70205a..b0c82f70afa7 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -7458,7 +7458,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
> > vcpu->arch.switch_db_regs &= ~KVM_DEBUGREG_RELOAD;
> > }
> > 
> > +   rcu_kvm_enter();
> > kvm_x86_ops->run(vcpu);
> > +   rcu_kvm_exit();
> 
> As indicated in my other mail. This is supposed to be handled in the 
> guest_enter|exit_ calls around
> the run function. This would also handle other architectures. So if the 
> guest_enter_irqoff code is
> not good enough, we should rather fix that instead of adding another rcu hint.

Something like this, on top of the earlier patch?  I am not at all
confident of this patch because there might be other entry/exit
paths I am missing.  Plus there might be RCU uses on the arch-specific
paths to and from the guest OS.

Thoughts?

Thanx, Paul



diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b0c82f70afa7..0046aa70205a 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7458,9 +7458,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
vcpu->arch.switch_db_regs &= ~KVM_DEBUGREG_RELOAD;
}
 
-   rcu_kvm_enter();
kvm_x86_ops->run(vcpu);
-   rcu_kvm_exit();
 
/*
 * Do this here before restoring debug registers on the host.  And
diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
index d05609ad329d..8d2a9d3073ad 100644
--- a/include/linux/context_tracking.h
+++ b/include/linux/context_tracking.h
@@ -118,12 +118,12 @@ static inline void guest_enter_irqoff(void)
 * one time slice). Lets treat guest mode as quiescent state, just like
 * we do with user-mode execution.
 */
-   if (!context_tracking_cpu_is_enabled())
-   rcu_virt_note_context_switch(smp_processor_id());
+   rcu_kvm_enter();
 }
 
 static inline void guest_exit_irqoff(void)
 {
+   rcu_kvm_exit();
if (context_tracking_is_enabled())
__context_tracking_exit(CONTEXT_GUEST);
 
@@ -143,12 +143,13 @@ static inline void guest_enter_irqoff(void)
 */
vtime_account_system(current);
current->flags |= PF_VCPU;
-   rcu_virt_note_context_switch(smp_processor_id());
+   rcu_kvm_enter();
 }
 
 static inline void guest_exit_irqoff(void)
 {
/* Flush the guest cputime we spent on the guest */
+   rcu_kvm_exit();
vtime_account_system(current);
current->flags &= ~PF_VCPU;
 }
diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h
index 4b2d691e453f..a7aa5b3cfb81 100644
--- a/include/linux/rcutiny.h
+++ b/include/linux/rcutiny.h
@@ -81,7 +81,6 @@ static inline int rcu_needs_cpu(u64 basemono, u64 *nextevt)
  * Take advantage of the fact that there is only one CPU, which
  * allows us to ignore virtualization-based context switches.
  */
-static inline void rcu_virt_note_context_switch(int cpu) { }
 static inline void rcu_cpu_stall_reset(void) { }
 static inline void rcu_idle_enter(void) { }
 static inline void rcu_idle

Re: [PATCH v2] kvm/x86: Inform RCU of quiescent state when entering guest mode

2018-07-11 Thread Christian Borntraeger



On 07/11/2018 08:36 PM, Paul E. McKenney wrote:
> On Wed, Jul 11, 2018 at 11:20:53AM -0700, Paul E. McKenney wrote:
>> On Wed, Jul 11, 2018 at 07:01:01PM +0100, David Woodhouse wrote:
>>> From: David Woodhouse 
>>>
>>> RCU can spend long periods of time waiting for a CPU which is actually in
>>> KVM guest mode, entirely pointlessly. Treat it like the idle and userspace
>>> modes, and don't wait for it.
>>>
>>> Signed-off-by: David Woodhouse 
>>
>> And idiot here forgot about some of the debugging code in RCU's dyntick-idle
>> code.  I will reply with a fixed patch.
>>
>> The code below works just fine as long as you don't enable 
>> CONFIG_RCU_EQS_DEBUG,
>> so should be OK for testing, just not for mainline.
> 
> And here is the updated code that allegedly avoids splatting when run with
> CONFIG_RCU_EQS_DEBUG.
> 
> Thoughts?
> 
>   Thanx, Paul
> 
> 
> 
> commit 12cd59e49cf734f907f44b696e2c6e4b46a291c3
> Author: David Woodhouse 
> Date:   Wed Jul 11 19:01:01 2018 +0100
> 
> kvm/x86: Inform RCU of quiescent state when entering guest mode
> 
> RCU can spend long periods of time waiting for a CPU which is actually in
> KVM guest mode, entirely pointlessly. Treat it like the idle and userspace
> modes, and don't wait for it.
> 
> Signed-off-by: David Woodhouse 
> Signed-off-by: Paul E. McKenney 
> [ paulmck: Adjust to avoid bad advice I gave to dwmw, avoid WARN_ON()s. ]
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 0046aa70205a..b0c82f70afa7 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -7458,7 +7458,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>   vcpu->arch.switch_db_regs &= ~KVM_DEBUGREG_RELOAD;
>   }
> 
> + rcu_kvm_enter();
>   kvm_x86_ops->run(vcpu);
> + rcu_kvm_exit();

As indicated in my other mail. This is supposed to be handled in the 
guest_enter|exit_ calls around
the run function. This would also handle other architectures. So if the 
guest_enter_irqoff code is
not good enough, we should rather fix that instead of adding another rcu hint.



Re: [PATCH v2] kvm/x86: Inform RCU of quiescent state when entering guest mode

2018-07-11 Thread Paul E. McKenney
On Wed, Jul 11, 2018 at 11:20:53AM -0700, Paul E. McKenney wrote:
> On Wed, Jul 11, 2018 at 07:01:01PM +0100, David Woodhouse wrote:
> > From: David Woodhouse 
> > 
> > RCU can spend long periods of time waiting for a CPU which is actually in
> > KVM guest mode, entirely pointlessly. Treat it like the idle and userspace
> > modes, and don't wait for it.
> > 
> > Signed-off-by: David Woodhouse 
> 
> And idiot here forgot about some of the debugging code in RCU's dyntick-idle
> code.  I will reply with a fixed patch.
> 
> The code below works just fine as long as you don't enable 
> CONFIG_RCU_EQS_DEBUG,
> so should be OK for testing, just not for mainline.

And here is the updated code that allegedly avoids splatting when run with
CONFIG_RCU_EQS_DEBUG.

Thoughts?

Thanx, Paul



commit 12cd59e49cf734f907f44b696e2c6e4b46a291c3
Author: David Woodhouse 
Date:   Wed Jul 11 19:01:01 2018 +0100

kvm/x86: Inform RCU of quiescent state when entering guest mode

RCU can spend long periods of time waiting for a CPU which is actually in
KVM guest mode, entirely pointlessly. Treat it like the idle and userspace
modes, and don't wait for it.

Signed-off-by: David Woodhouse 
Signed-off-by: Paul E. McKenney 
[ paulmck: Adjust to avoid bad advice I gave to dwmw, avoid WARN_ON()s. ]

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 0046aa70205a..b0c82f70afa7 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7458,7 +7458,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
vcpu->arch.switch_db_regs &= ~KVM_DEBUGREG_RELOAD;
}
 
+   rcu_kvm_enter();
kvm_x86_ops->run(vcpu);
+   rcu_kvm_exit();
 
/*
 * Do this here before restoring debug registers on the host.  And
diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h
index 7fa4fb9e899e..4b2d691e453f 100644
--- a/include/linux/rcutiny.h
+++ b/include/linux/rcutiny.h
@@ -85,6 +85,8 @@ static inline void rcu_virt_note_context_switch(int cpu) { }
 static inline void rcu_cpu_stall_reset(void) { }
 static inline void rcu_idle_enter(void) { }
 static inline void rcu_idle_exit(void) { }
+static inline void rcu_kvm_enter(void) { }
+static inline void rcu_kvm_exit(void) { }
 static inline void rcu_irq_enter(void) { }
 static inline void rcu_irq_exit_irqson(void) { }
 static inline void rcu_irq_enter_irqson(void) { }
diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
index 7f83179177d1..48ce58b53ece 100644
--- a/include/linux/rcutree.h
+++ b/include/linux/rcutree.h
@@ -55,6 +55,8 @@ void cond_synchronize_rcu(unsigned long oldstate);
 
 void rcu_idle_enter(void);
 void rcu_idle_exit(void);
+void rcu_kvm_enter(void);
+void rcu_kvm_exit(void);
 void rcu_irq_enter(void);
 void rcu_irq_exit(void);
 void rcu_irq_enter_irqson(void);
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 765c81dd675e..0c0672faa6d1 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -583,6 +583,24 @@ void rcu_idle_enter(void)
rcu_eqs_enter(false);
 }
 
+/**
+ * rcu_kvm_enter - inform RCU that current CPU is entering a guest OS
+ *
+ * Enter guest-OS mode, in other words, -leave- the mode in which RCU
+ * read-side critical sections can occur.  (Though RCU read-side critical
+ * sections can occur in irq handlers from guest OSes, a possibility
+ * handled by irq_enter() and irq_exit().)
+ *
+ * If you add or remove a call to rcu_kvm_enter(), be sure to test with
+ * CONFIG_RCU_EQS_DEBUG=y.
+ */
+void rcu_kvm_enter(void)
+{
+   lockdep_assert_irqs_disabled();
+   rcu_eqs_enter(true);
+}
+EXPORT_SYMBOL_GPL(rcu_kvm_enter);
+
 #ifdef CONFIG_NO_HZ_FULL
 /**
  * rcu_user_enter - inform RCU that we are resuming userspace.
@@ -747,6 +765,22 @@ void rcu_idle_exit(void)
local_irq_restore(flags);
 }
 
+/**
+ * rcu_kvm_exit - inform RCU that current CPU is leaving a guest OS
+ *
+ * Exit guest-OS mode, in other words, -enter- the mode in which RCU
+ * read-side critical sections can occur.
+ *
+ * If you add or remove a call to rcu_kvm_exit(), be sure to test with
+ * CONFIG_RCU_EQS_DEBUG=y.
+ */
+void rcu_kvm_exit(void)
+{
+   lockdep_assert_irqs_disabled();
+   rcu_eqs_exit(true);
+}
+EXPORT_SYMBOL_GPL(rcu_kvm_exit);
+
 #ifdef CONFIG_NO_HZ_FULL
 /**
  * rcu_user_exit - inform RCU that we are exiting userspace.



Re: [PATCH v2] kvm/x86: Inform RCU of quiescent state when entering guest mode

2018-07-11 Thread Paul E. McKenney
On Wed, Jul 11, 2018 at 07:01:01PM +0100, David Woodhouse wrote:
> From: David Woodhouse 
> 
> RCU can spend long periods of time waiting for a CPU which is actually in
> KVM guest mode, entirely pointlessly. Treat it like the idle and userspace
> modes, and don't wait for it.
> 
> Signed-off-by: David Woodhouse 

And idiot here forgot about some of the debugging code in RCU's dyntick-idle
code.  I will reply with a fixed patch.

The code below works just fine as long as you don't enable CONFIG_RCU_EQS_DEBUG,
so should be OK for testing, just not for mainline.

Thanx, Paul

> ---
>  arch/x86/kvm/x86.c   | 2 ++
>  include/linux/rcupdate.h | 7 +++
>  kernel/rcu/tree.c| 2 ++
>  3 files changed, 11 insertions(+)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 0046aa70205a..b0c82f70afa7 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -7458,7 +7458,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>   vcpu->arch.switch_db_regs &= ~KVM_DEBUGREG_RELOAD;
>   }
> 
> + rcu_kvm_enter();
>   kvm_x86_ops->run(vcpu);
> + rcu_kvm_exit();
> 
>   /*
>* Do this here before restoring debug registers on the host.  And
> diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> index 65163aa0bb04..1325f9d9ce00 100644
> --- a/include/linux/rcupdate.h
> +++ b/include/linux/rcupdate.h
> @@ -212,6 +212,13 @@ do { \
>  #error "Unknown RCU implementation specified to kernel configuration"
>  #endif
> 
> +/*
> + * These are currently identical to the _idle_ versions but let's
> + * explicitly have separate copies to keep Paul honest in future.
> + */
> +static inline void rcu_kvm_enter(void) { rcu_idle_enter(); }
> +static inline void rcu_kvm_exit(void) { rcu_idle_exit(); }
> +
>  /*
>   * The init_rcu_head_on_stack() and destroy_rcu_head_on_stack() calls
>   * are needed for dynamic initialization and destruction of rcu_head
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index aa7cade1b9f3..d38c381bf4e3 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -760,6 +760,7 @@ void rcu_idle_enter(void)
>   lockdep_assert_irqs_disabled();
>   rcu_eqs_enter(false);
>  }
> +EXPORT_SYMBOL_GPL(rcu_idle_enter);
> 
>  #ifdef CONFIG_NO_HZ_FULL
>  /**
> @@ -913,6 +914,7 @@ void rcu_idle_exit(void)
>   rcu_eqs_exit(false);
>   local_irq_restore(flags);
>  }
> +EXPORT_SYMBOL_GPL(rcu_idle_exit);
> 
>  #ifdef CONFIG_NO_HZ_FULL
>  /**
> -- 
> 2.17.1
> 



[PATCH v2] kvm/x86: Inform RCU of quiescent state when entering guest mode

2018-07-11 Thread David Woodhouse
From: David Woodhouse 

RCU can spend long periods of time waiting for a CPU which is actually in
KVM guest mode, entirely pointlessly. Treat it like the idle and userspace
modes, and don't wait for it.

Signed-off-by: David Woodhouse 
---
 arch/x86/kvm/x86.c   | 2 ++
 include/linux/rcupdate.h | 7 +++
 kernel/rcu/tree.c| 2 ++
 3 files changed, 11 insertions(+)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 0046aa70205a..b0c82f70afa7 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7458,7 +7458,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
vcpu->arch.switch_db_regs &= ~KVM_DEBUGREG_RELOAD;
}
 
+   rcu_kvm_enter();
kvm_x86_ops->run(vcpu);
+   rcu_kvm_exit();
 
/*
 * Do this here before restoring debug registers on the host.  And
diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 65163aa0bb04..1325f9d9ce00 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -212,6 +212,13 @@ do { \
 #error "Unknown RCU implementation specified to kernel configuration"
 #endif
 
+/*
+ * These are currently identical to the _idle_ versions but let's
+ * explicitly have separate copies to keep Paul honest in future.
+ */
+static inline void rcu_kvm_enter(void) { rcu_idle_enter(); }
+static inline void rcu_kvm_exit(void) { rcu_idle_exit(); }
+
 /*
  * The init_rcu_head_on_stack() and destroy_rcu_head_on_stack() calls
  * are needed for dynamic initialization and destruction of rcu_head
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index aa7cade1b9f3..d38c381bf4e3 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -760,6 +760,7 @@ void rcu_idle_enter(void)
lockdep_assert_irqs_disabled();
rcu_eqs_enter(false);
 }
+EXPORT_SYMBOL_GPL(rcu_idle_enter);
 
 #ifdef CONFIG_NO_HZ_FULL
 /**
@@ -913,6 +914,7 @@ void rcu_idle_exit(void)
rcu_eqs_exit(false);
local_irq_restore(flags);
 }
+EXPORT_SYMBOL_GPL(rcu_idle_exit);
 
 #ifdef CONFIG_NO_HZ_FULL
 /**
-- 
2.17.1