Re: [PATCH 2/3] KVM: arm64: nv: Emulate ISTATUS when emulated timers are fired.

2023-01-02 Thread Ganapatrao Kulkarni




On 02-01-2023 05:16 pm, Marc Zyngier wrote:
> On Thu, 29 Dec 2022 13:53:15 +,
> Marc Zyngier  wrote:
>>
>> On Wed, 24 Aug 2022 07:03:03 +0100,
>> Ganapatrao Kulkarni  wrote:
>>>
>>> Guest-Hypervisor forwards the timer interrupt to Guest-Guest, if it is
>>> enabled, unmasked and ISTATUS bit of register CNTV_CTL_EL0 is set for a
>>> loaded timer.
>>>
>>> For NV2 implementation, the Host-Hypervisor is not emulating the ISTATUS
>>> bit while forwarding the Emulated Vtimer Interrupt to Guest-Hypervisor.
>>> This results in the drop of interrupt from Guest-Hypervisor, where as
>>> Host Hypervisor marked it as an active interrupt and expecting Guest-Guest
>>> to consume and acknowledge. Due to this, some of the Guest-Guest vCPUs
>>> are stuck in Idle thread and rcu soft lockups are seen.
>>>
>>> This issue is not seen with NV1 case since the register CNTV_CTL_EL0 read
>>> trap handler is emulating the ISTATUS bit.
>>>
>>> Adding code to set/emulate the ISTATUS when the emulated timers are fired.
>>>
>>> Signed-off-by: Ganapatrao Kulkarni 
>>> ---
>>>  arch/arm64/kvm/arch_timer.c | 5 +
>>>  1 file changed, 5 insertions(+)
>>>
>>> diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
>>> index 27a6ec46803a..0b32d943d2d5 100644
>>> --- a/arch/arm64/kvm/arch_timer.c
>>> +++ b/arch/arm64/kvm/arch_timer.c
>>> @@ -63,6 +63,7 @@ static u64 kvm_arm_timer_read(struct kvm_vcpu *vcpu,
>>>  			      struct arch_timer_context *timer,
>>>  			      enum kvm_arch_timer_regs treg);
>>>  static bool kvm_arch_timer_get_input_level(int vintid);
>>> +static u64 read_timer_ctl(struct arch_timer_context *timer);
>>>  
>>>  static struct irq_ops arch_timer_irq_ops = {
>>>  	.get_input_level = kvm_arch_timer_get_input_level,
>>> @@ -356,6 +357,8 @@ static enum hrtimer_restart kvm_hrtimer_expire(struct hrtimer *hrt)
>>>  		return HRTIMER_RESTART;
>>>  	}
>>>  
>>> +	/* Timer emulated, emulate ISTATUS also */
>>> +	timer_set_ctl(ctx, read_timer_ctl(ctx));
>>
>> Why should we do that for non-NV2 configurations?
>>
>>>  	kvm_timer_update_irq(vcpu, true, ctx);
>>>  	return HRTIMER_NORESTART;
>>>  }
>>> @@ -458,6 +461,8 @@ static void timer_emulate(struct arch_timer_context *ctx)
>>>  	trace_kvm_timer_emulate(ctx, should_fire);
>>>  
>>>  	if (should_fire != ctx->irq.level) {
>>> +		/* Timer emulated, emulate ISTATUS also */
>>> +		timer_set_ctl(ctx, read_timer_ctl(ctx));
>>>  		kvm_timer_update_irq(ctx->vcpu, should_fire, ctx);
>>>  		return;
>>>  	}
>>
>> I'm not overly keen on this. Yes, we can set the status bit there. But
>> conversely, the bit will not get cleared when the guest reprograms the
>> timer, and will take a full exit/entry cycle for it to appear.
>>
>> Ergo, the architecture is buggy as memory (the VNCR page) cannot be
>> used to emulate something as dynamic as a timer.
>>
>> It is only with FEAT_ECV that we can solve this correctly by trapping
>> the counter/timer accesses and emulate them for the guest hypervisor.
>> I'd rather we add support for that, as I expect all the FEAT_NV2
>> implementations to have it (and hopefully FEAT_FGT as well).
>
> So I went ahead and implemented some very basic FEAT_ECV support to
> correctly emulate the timers (trapping the CTL/CVAL accesses).
>
> Performance dropped like a rock (~30% extra overhead) for L2
> exit-heavy workloads that are terminated in userspace, such as virtio.
> For those workloads, vcpu_{load,put}() in L1 now generate extra traps,
> as we save/restore the timer context, and this is enough to make
> things visibly slower, even on a pretty fast machine.
>
> I managed to get *some* performance back by satisfying CTL/CVAL reads
> very early on the exit path (a pretty common theme with NV). Which
> means we end-up needing something like what you have -- only a bit
> more complete. I came up with the following:
>
> diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
> index 4945c5b96f05..a198a6211e2a 100644
> --- a/arch/arm64/kvm/arch_timer.c
> +++ b/arch/arm64/kvm/arch_timer.c
> @@ -450,6 +450,25 @@ static void kvm_timer_update_irq(struct kvm_vcpu *vcpu, bool new_level,
>  {
>  	int ret;
>  
> +	/*
> +	 * Paper over NV2 brokenness by publishing the interrupt status
> +	 * bit. This still results in a poor quality of emulation (guest
> +	 * writes will have no effect until the next exit).
> +	 *
> +	 * But hey, it's fast, right?
> +	 */
> +	if (vcpu_has_nv2(vcpu) && is_hyp_ctxt(vcpu) &&
> +	    (timer_ctx == vcpu_vtimer(vcpu) || timer_ctx == vcpu_ptimer(vcpu))) {
> +		u32 ctl = timer_get_ctl(timer_ctx);
> +
> +		if (new_level)
> +			ctl |= ARCH_TIMER_CTRL_IT_STAT;
> +		else
> +			ctl &= ~ARCH_TIMER_CTRL_IT_STAT;
> +
> +		timer_set_ctl(timer_ctx, ctl);
> +	}
> +
>  	timer_ctx->irq.level = new_level;
>  	trace_kvm_timer_update_irq(vcpu->vcpu_id, timer_ctx->irq.irq,
>  				   timer_ctx->irq.level);
>
> which reports the interrupt state in all cases.
>
> Does this work for you?
>

Thanks Marc for the patch. I will 

Re: [PATCH 2/3] KVM: arm64: nv: Emulate ISTATUS when emulated timers are fired.

2023-01-02 Thread Marc Zyngier
On Thu, 29 Dec 2022 13:53:15 +,
Marc Zyngier  wrote:
> 
> On Wed, 24 Aug 2022 07:03:03 +0100,
> Ganapatrao Kulkarni  wrote:
> > 
> > Guest-Hypervisor forwards the timer interrupt to Guest-Guest, if it is
> > enabled, unmasked and ISTATUS bit of register CNTV_CTL_EL0 is set for a
> > loaded timer.
> > 
> > For NV2 implementation, the Host-Hypervisor is not emulating the ISTATUS
> > bit while forwarding the Emulated Vtimer Interrupt to Guest-Hypervisor.
> > This results in the drop of interrupt from Guest-Hypervisor, where as
> > Host Hypervisor marked it as an active interrupt and expecting Guest-Guest
> > to consume and acknowledge. Due to this, some of the Guest-Guest vCPUs
> > are stuck in Idle thread and rcu soft lockups are seen.
> > 
> > This issue is not seen with NV1 case since the register CNTV_CTL_EL0 read
> > trap handler is emulating the ISTATUS bit.
> > 
> > Adding code to set/emulate the ISTATUS when the emulated timers are fired.
> > 
> > Signed-off-by: Ganapatrao Kulkarni 
> > ---
> >  arch/arm64/kvm/arch_timer.c | 5 +
> >  1 file changed, 5 insertions(+)
> > 
> > diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
> > index 27a6ec46803a..0b32d943d2d5 100644
> > --- a/arch/arm64/kvm/arch_timer.c
> > +++ b/arch/arm64/kvm/arch_timer.c
> > @@ -63,6 +63,7 @@ static u64 kvm_arm_timer_read(struct kvm_vcpu *vcpu,
> >   struct arch_timer_context *timer,
> >   enum kvm_arch_timer_regs treg);
> >  static bool kvm_arch_timer_get_input_level(int vintid);
> > +static u64 read_timer_ctl(struct arch_timer_context *timer);
> >  
> >  static struct irq_ops arch_timer_irq_ops = {
> > .get_input_level = kvm_arch_timer_get_input_level,
> > @@ -356,6 +357,8 @@ static enum hrtimer_restart kvm_hrtimer_expire(struct hrtimer *hrt)
> > return HRTIMER_RESTART;
> > }
> >  
> > +   /* Timer emulated, emulate ISTATUS also */
> > +   timer_set_ctl(ctx, read_timer_ctl(ctx));
> 
> Why should we do that for non-NV2 configurations?
> 
> > kvm_timer_update_irq(vcpu, true, ctx);
> > return HRTIMER_NORESTART;
> >  }
> > @@ -458,6 +461,8 @@ static void timer_emulate(struct arch_timer_context *ctx)
> > trace_kvm_timer_emulate(ctx, should_fire);
> >  
> > if (should_fire != ctx->irq.level) {
> > +   /* Timer emulated, emulate ISTATUS also */
> > +   timer_set_ctl(ctx, read_timer_ctl(ctx));
> > kvm_timer_update_irq(ctx->vcpu, should_fire, ctx);
> > return;
> > }
> 
> I'm not overly keen on this. Yes, we can set the status bit there. But
> conversely, the bit will not get cleared when the guest reprograms the
> timer, and will take a full exit/entry cycle for it to appear.
> 
> Ergo, the architecture is buggy as memory (the VNCR page) cannot be
> used to emulate something as dynamic as a timer.
> 
> It is only with FEAT_ECV that we can solve this correctly by trapping
> the counter/timer accesses and emulate them for the guest hypervisor.
> I'd rather we add support for that, as I expect all the FEAT_NV2
> implementations to have it (and hopefully FEAT_FGT as well).

So I went ahead and implemented some very basic FEAT_ECV support to
correctly emulate the timers (trapping the CTL/CVAL accesses).

Performance dropped like a rock (~30% extra overhead) for L2
exit-heavy workloads that are terminated in userspace, such as virtio.
For those workloads, vcpu_{load,put}() in L1 now generate extra traps,
as we save/restore the timer context, and this is enough to make
things visibly slower, even on a pretty fast machine.

I managed to get *some* performance back by satisfying CTL/CVAL reads
very early on the exit path (a pretty common theme with NV). Which
means we end-up needing something like what you have -- only a bit
more complete. I came up with the following:

diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
index 4945c5b96f05..a198a6211e2a 100644
--- a/arch/arm64/kvm/arch_timer.c
+++ b/arch/arm64/kvm/arch_timer.c
@@ -450,6 +450,25 @@ static void kvm_timer_update_irq(struct kvm_vcpu *vcpu, bool new_level,
 {
 	int ret;
 
+	/*
+	 * Paper over NV2 brokenness by publishing the interrupt status
+	 * bit. This still results in a poor quality of emulation (guest
+	 * writes will have no effect until the next exit).
+	 *
+	 * But hey, it's fast, right?
+	 */
+	if (vcpu_has_nv2(vcpu) && is_hyp_ctxt(vcpu) &&
+	    (timer_ctx == vcpu_vtimer(vcpu) || timer_ctx == vcpu_ptimer(vcpu))) {
+		u32 ctl = timer_get_ctl(timer_ctx);
+
+		if (new_level)
+			ctl |= ARCH_TIMER_CTRL_IT_STAT;
+		else
+			ctl &= ~ARCH_TIMER_CTRL_IT_STAT;
+
+		timer_set_ctl(timer_ctx, ctl);
+	}
+
 	timer_ctx->irq.level = new_level;
 	trace_kvm_timer_update_irq(vcpu->vcpu_id, timer_ctx->irq.irq,
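
The set-or-clear step in this hunk can be tried outside the kernel. A
minimal sketch, assuming ISTATUS is bit 2 of the control register;
SKETCH_CTL_ISTATUS and sketch_publish_level() are hypothetical names
used for illustration, not the kernel's ARCH_TIMER_CTRL_IT_STAT or
timer_set_ctl():

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: ISTATUS sits at bit 2 of the timer control register. */
#define SKETCH_CTL_ISTATUS (UINT32_C(1) << 2)

/*
 * Return the control value to publish for a given interrupt line level:
 * ISTATUS follows the level, all other bits are passed through untouched.
 */
static uint32_t sketch_publish_level(uint32_t ctl, bool new_level)
{
	uint32_t out;

	if (new_level)
		out = ctl | SKETCH_CTL_ISTATUS;
	else
		out = ctl & ~SKETCH_CTL_ISTATUS;

	/* Sanity check: nothing except ISTATUS may differ from the input. */
	assert(((out ^ ctl) & ~SKETCH_CTL_ISTATUS) == 0);
	return out;
}
```

Unlike a set-only approach, the same call both raises and lowers the
bit, so the published value tracks the line level in either direction.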
 

Re: [PATCH 2/3] KVM: arm64: nv: Emulate ISTATUS when emulated timers are fired.

2022-12-29 Thread Marc Zyngier
On Wed, 24 Aug 2022 07:03:03 +0100,
Ganapatrao Kulkarni  wrote:
> 
> Guest-Hypervisor forwards the timer interrupt to Guest-Guest, if it is
> enabled, unmasked and ISTATUS bit of register CNTV_CTL_EL0 is set for a
> loaded timer.
> 
> For NV2 implementation, the Host-Hypervisor is not emulating the ISTATUS
> bit while forwarding the Emulated Vtimer Interrupt to Guest-Hypervisor.
> This results in the drop of interrupt from Guest-Hypervisor, where as
> Host Hypervisor marked it as an active interrupt and expecting Guest-Guest
> to consume and acknowledge. Due to this, some of the Guest-Guest vCPUs
> are stuck in Idle thread and rcu soft lockups are seen.
> 
> This issue is not seen with NV1 case since the register CNTV_CTL_EL0 read
> trap handler is emulating the ISTATUS bit.
> 
> Adding code to set/emulate the ISTATUS when the emulated timers are fired.
> 
> Signed-off-by: Ganapatrao Kulkarni 
> ---
>  arch/arm64/kvm/arch_timer.c | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
> index 27a6ec46803a..0b32d943d2d5 100644
> --- a/arch/arm64/kvm/arch_timer.c
> +++ b/arch/arm64/kvm/arch_timer.c
> @@ -63,6 +63,7 @@ static u64 kvm_arm_timer_read(struct kvm_vcpu *vcpu,
> struct arch_timer_context *timer,
> enum kvm_arch_timer_regs treg);
>  static bool kvm_arch_timer_get_input_level(int vintid);
> +static u64 read_timer_ctl(struct arch_timer_context *timer);
>  
>  static struct irq_ops arch_timer_irq_ops = {
>   .get_input_level = kvm_arch_timer_get_input_level,
> @@ -356,6 +357,8 @@ static enum hrtimer_restart kvm_hrtimer_expire(struct hrtimer *hrt)
>   return HRTIMER_RESTART;
>   }
>  
> + /* Timer emulated, emulate ISTATUS also */
> + timer_set_ctl(ctx, read_timer_ctl(ctx));

Why should we do that for non-NV2 configurations?

>   kvm_timer_update_irq(vcpu, true, ctx);
>   return HRTIMER_NORESTART;
>  }
> @@ -458,6 +461,8 @@ static void timer_emulate(struct arch_timer_context *ctx)
>   trace_kvm_timer_emulate(ctx, should_fire);
>  
>   if (should_fire != ctx->irq.level) {
> + /* Timer emulated, emulate ISTATUS also */
> + timer_set_ctl(ctx, read_timer_ctl(ctx));
>   kvm_timer_update_irq(ctx->vcpu, should_fire, ctx);
>   return;
>   }

I'm not overly keen on this. Yes, we can set the status bit there. But
conversely, the bit will not get cleared when the guest reprograms the
timer, and will take a full exit/entry cycle for it to appear.

Ergo, the architecture is buggy as memory (the VNCR page) cannot be
used to emulate something as dynamic as a timer.

It is only with FEAT_ECV that we can solve this correctly by trapping
the counter/timer accesses and emulate them for the guest hypervisor.
I'd rather we add support for that, as I expect all the FEAT_NV2
implementations to have it (and hopefully FEAT_FGT as well).
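
A minimal sketch of the staleness described above, assuming a
simplified model in plain C rather than KVM code. struct model_timer
and the model_*() helpers are hypothetical names, and ISTATUS is
modelled as "timer enabled and counter has reached CVAL". The host can
only refresh the in-memory bit when it runs, so a guest write to CVAL
leaves the published bit stale until the next exit:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: ENABLE is bit 0, IMASK bit 1, ISTATUS bit 2. */
#define CTL_ENABLE  (UINT32_C(1) << 0)
#define CTL_IMASK   (UINT32_C(1) << 1)
#define CTL_ISTATUS (UINT32_C(1) << 2)

/* Hypothetical stand-in for timer state kept in memory (not KVM's struct). */
struct model_timer {
	uint32_t ctl;	/* in-memory copy of the control register */
	uint64_t cval;	/* in-memory copy of the compare value */
};

/* What a live timer would report for the current counter value. */
static bool model_hw_istatus(const struct model_timer *t, uint64_t now)
{
	assert(t);
	return (t->ctl & CTL_ENABLE) && now >= t->cval;
}

/* Host side: refresh the in-memory ISTATUS bit. Only runs on an exit. */
static void model_host_publish(struct model_timer *t, uint64_t now)
{
	if (model_hw_istatus(t, now))
		t->ctl |= CTL_ISTATUS;
	else
		t->ctl &= ~CTL_ISTATUS;
}

/* Guest side: a plain memory write, so nothing recomputes ISTATUS. */
static void model_guest_write_cval(struct model_timer *t, uint64_t cval)
{
	t->cval = cval;
}
```

With ctl = CTL_ENABLE, cval = 100 and the counter at 150, publishing
sets ISTATUS. After the guest moves cval to 1000, the in-memory bit
stays set until model_host_publish() runs again.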

Thanks,

M.

-- 
Without deviation from the norm, progress is not possible.