Re: [RFC PATCH v5 08/38] KVM: arm64: Unlock memslots after stage 2 tables are freed

2022-03-17 Thread Reiji Watanabe

Hi Alex,

On 11/17/21 7:38 AM, Alexandru Elisei wrote:

Unpin the backing pages mapped at stage 2 after the stage 2 translation
tables are destroyed.

Signed-off-by: Alexandru Elisei 
---
  arch/arm64/kvm/mmu.c | 23 ++-
  1 file changed, 18 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index cd6f1bc7842d..072e2aba371f 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1627,11 +1627,19 @@ int kvm_mmu_lock_memslot(struct kvm *kvm, u64 slot, u64 flags)
return ret;
  }
  
-static void unlock_memslot(struct kvm *kvm, struct kvm_memory_slot *memslot)
+static void __unlock_memslot(struct kvm *kvm, struct kvm_memory_slot *memslot)
 {
	bool writable = memslot->arch.flags & KVM_MEMSLOT_LOCK_WRITE;
	unsigned long npages = memslot->npages;
 
+	unpin_memslot_pages(memslot, writable);
+	account_locked_vm(current->mm, npages, false);
+
+	memslot->arch.flags &= ~KVM_MEMSLOT_LOCK_MASK;
+}
+
+static void unlock_memslot(struct kvm *kvm, struct kvm_memory_slot *memslot)
+{
/*
 * MMU maintenance operations aren't performed on an unlocked memslot.
 * Unmap it from stage 2 so the abort handler performs the necessary
@@ -1640,10 +1648,7 @@ static void unlock_memslot(struct kvm *kvm, struct kvm_memory_slot *memslot)
if (kvm_mmu_has_pending_ops(kvm))
kvm_arch_flush_shadow_memslot(kvm, memslot);
  
-	unpin_memslot_pages(memslot, writable);
-	account_locked_vm(current->mm, npages, false);
-
-	memslot->arch.flags &= ~KVM_MEMSLOT_LOCK_MASK;
+	__unlock_memslot(kvm, memslot);
  }
  
  int kvm_mmu_unlock_memslot(struct kvm *kvm, u64 slot, u64 flags)

@@ -1951,7 +1956,15 @@ void kvm_arch_memslots_updated(struct kvm *kvm, u64 gen)
  
  void kvm_arch_flush_shadow_all(struct kvm *kvm)
  {
+	struct kvm_memory_slot *memslot;
+
	kvm_free_stage2_pgd(&kvm->arch.mmu);
+
+   kvm_for_each_memslot(memslot, kvm_memslots(kvm)) {
+   if (!memslot_is_locked(memslot))
+   continue;
+   __unlock_memslot(kvm, memslot);
+   }
  }


Perhaps it might be useful to track the number of locked memslots?
(It could also be used in the fix for kvm_mmu_unlock_memslot in patch 7.)
 
Thanks,

Reiji



___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [kvm-unit-tests PATCH] arm/run: Use TCG with qemu-system-arm on arm64 systems

2022-03-17 Thread Alexandru Elisei
Hi,

On Thu, Mar 17, 2022 at 06:45:07PM +0100, Andrew Jones wrote:
> On Thu, Mar 17, 2022 at 04:56:01PM +, Alexandru Elisei wrote:
> > From: Andrew Jones 
> > 
> > If the user sets QEMU=qemu-system-arm on arm64 systems, the tests can only
> > be run by using the TCG accelerator. In this case use TCG instead of KVM.
> > 
> > Signed-off-by: Andrew Jones 
> > [ Alex E: Added commit message ]
> > Signed-off-by: Alexandru Elisei 
> > ---
> >  arm/run | 12 ++--
> >  1 file changed, 10 insertions(+), 2 deletions(-)
> > 
> > diff --git a/arm/run b/arm/run
> > index 28a0b4ad2729..128489125dcb 100755
> > --- a/arm/run
> > +++ b/arm/run
> > @@ -10,16 +10,24 @@ if [ -z "$KUT_STANDALONE" ]; then
> >  fi
> >  processor="$PROCESSOR"
> >  
> > -ACCEL=$(get_qemu_accelerator) ||
> > +accel=$(get_qemu_accelerator) ||
> > exit $?
> >  
> > -if [ "$ACCEL" = "kvm" ]; then
> > +if [ "$accel" = "kvm" ]; then
> > QEMU_ARCH=$HOST
> >  fi
> >  
> >  qemu=$(search_qemu_binary) ||
> > exit $?
> >  
> > +if [ "$QEMU" ] && [ -z "$ACCEL" ] &&
> > +   [ "$HOST" = "aarch64" ] && [ "$ARCH" = "arm" ] &&
> > +   [ "$(basename $QEMU)" = "qemu-system-arm" ]; then
> > +   accel=tcg
> > +fi
> > +
> > +ACCEL=$accel
> > +
> >  if ! $qemu -machine '?' 2>&1 | grep 'ARM Virtual Machine' > /dev/null; then
> > echo "$qemu doesn't support mach-virt ('-machine virt'). Exiting."
> > exit 2
> > -- 
> > 2.35.1
> >
> 
> Ha, OK, I guess you posting this is a strong vote in favor of this
> behavior. I've queued it

Ah, yes, maybe I should have been clearer about it. I think this is more
intuitive for new users who might not be very familiar with
run_tests.sh internals, and, like you said, it won't break existing users
who had to set ACCEL=tcg to get the desired behaviour anyway.

Thank you for queueing it so fast! I should probably have also mentioned
this as a comment in the commit, but I take full responsibility for
breaking stuff.

Alex

> 
> https://gitlab.com/rhdrjones/kvm-unit-tests/-/commits/arm/queue
> 
> Thanks,
> drew 
> 


Re: [kvm-unit-tests PATCH] arm/run: Use TCG with qemu-system-arm on arm64 systems

2022-03-17 Thread Andrew Jones
On Thu, Mar 17, 2022 at 04:56:01PM +, Alexandru Elisei wrote:
> From: Andrew Jones 
> 
> If the user sets QEMU=qemu-system-arm on arm64 systems, the tests can only
> be run by using the TCG accelerator. In this case use TCG instead of KVM.
> 
> Signed-off-by: Andrew Jones 
> [ Alex E: Added commit message ]
> Signed-off-by: Alexandru Elisei 
> ---
>  arm/run | 12 ++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/arm/run b/arm/run
> index 28a0b4ad2729..128489125dcb 100755
> --- a/arm/run
> +++ b/arm/run
> @@ -10,16 +10,24 @@ if [ -z "$KUT_STANDALONE" ]; then
>  fi
>  processor="$PROCESSOR"
>  
> -ACCEL=$(get_qemu_accelerator) ||
> +accel=$(get_qemu_accelerator) ||
>   exit $?
>  
> -if [ "$ACCEL" = "kvm" ]; then
> +if [ "$accel" = "kvm" ]; then
>   QEMU_ARCH=$HOST
>  fi
>  
>  qemu=$(search_qemu_binary) ||
>   exit $?
>  
> +if [ "$QEMU" ] && [ -z "$ACCEL" ] &&
> +   [ "$HOST" = "aarch64" ] && [ "$ARCH" = "arm" ] &&
> +   [ "$(basename $QEMU)" = "qemu-system-arm" ]; then
> + accel=tcg
> +fi
> +
> +ACCEL=$accel
> +
>  if ! $qemu -machine '?' 2>&1 | grep 'ARM Virtual Machine' > /dev/null; then
>   echo "$qemu doesn't support mach-virt ('-machine virt'). Exiting."
>   exit 2
> -- 
> 2.35.1
>

Ha, OK, I guess you posting this is a strong vote in favor of this
behavior. I've queued it

https://gitlab.com/rhdrjones/kvm-unit-tests/-/commits/arm/queue

Thanks,
drew 



[kvm-unit-tests PATCH] arm/run: Use TCG with qemu-system-arm on arm64 systems

2022-03-17 Thread Alexandru Elisei
From: Andrew Jones 

If the user sets QEMU=qemu-system-arm on arm64 systems, the tests can only
be run by using the TCG accelerator. In this case use TCG instead of KVM.

Signed-off-by: Andrew Jones 
[ Alex E: Added commit message ]
Signed-off-by: Alexandru Elisei 
---
 arm/run | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/arm/run b/arm/run
index 28a0b4ad2729..128489125dcb 100755
--- a/arm/run
+++ b/arm/run
@@ -10,16 +10,24 @@ if [ -z "$KUT_STANDALONE" ]; then
 fi
 processor="$PROCESSOR"
 
-ACCEL=$(get_qemu_accelerator) ||
+accel=$(get_qemu_accelerator) ||
exit $?
 
-if [ "$ACCEL" = "kvm" ]; then
+if [ "$accel" = "kvm" ]; then
QEMU_ARCH=$HOST
 fi
 
 qemu=$(search_qemu_binary) ||
exit $?
 
+if [ "$QEMU" ] && [ -z "$ACCEL" ] &&
+   [ "$HOST" = "aarch64" ] && [ "$ARCH" = "arm" ] &&
+   [ "$(basename $QEMU)" = "qemu-system-arm" ]; then
+   accel=tcg
+fi
+
+ACCEL=$accel
+
 if ! $qemu -machine '?' 2>&1 | grep 'ARM Virtual Machine' > /dev/null; then
echo "$qemu doesn't support mach-virt ('-machine virt'). Exiting."
exit 2
-- 
2.35.1



Re: [PATCH v4 2/6] Partially revert "KVM: Pass kvm_init()'s opaque param to additional arch funcs"

2022-03-17 Thread Suzuki Kuruppassery Poulose

On 16/02/2022 03:15, Chao Gao wrote:

This partially reverts commit b99040853738 ("KVM: Pass kvm_init()'s opaque
param to additional arch funcs"), removing the opaque parameter from
kvm_arch_check_processor_compat() because no one uses it now.
Conflicts for ARM (due to file movement) are addressed, and RISC-V, which
was added after that commit, is handled manually.

The changes to kvm_arch_hardware_setup() in the original commit are still
needed, so they are not reverted.

Signed-off-by: Chao Gao 
Reviewed-by: Sean Christopherson 
---
  arch/arm64/kvm/arm.c   |  2 +-
  arch/mips/kvm/mips.c   |  2 +-
  arch/powerpc/kvm/powerpc.c |  2 +-
  arch/riscv/kvm/main.c  |  2 +-
  arch/s390/kvm/kvm-s390.c   |  2 +-
  arch/x86/kvm/x86.c |  2 +-
  include/linux/kvm_host.h   |  2 +-
  virt/kvm/kvm_main.c| 16 +++-
  8 files changed, 10 insertions(+), 20 deletions(-)

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index ecc5958e27fe..0165cf3aac3a 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -73,7 +73,7 @@ int kvm_arch_hardware_setup(void *opaque)
return 0;
  }
  
-int kvm_arch_check_processor_compat(void *opaque)

+int kvm_arch_check_processor_compat(void)
  {
return 0;
  }


For arm64 :

Reviewed-by: Suzuki K Poulose 


Re: [PATCH v1 2/2] KVM: arm64: Add debug tracepoint for vcpu exits

2022-03-17 Thread Marc Zyngier
Hi Jing,

On Thu, 17 Mar 2022 00:56:30 +,
Jing Zhang  wrote:
> 
> This tracepoint only provides a hook for collecting vcpu exit
> information; it is not exported to tracefs.
> A timestamp is added for the last vcpu exit, which would be useful
> for analysing vcpu exits.

The trace itself gives you a timestamp. Why do you need an extra one?

> 
> Signed-off-by: Jing Zhang 
> ---
>  arch/arm64/include/asm/kvm_host.h | 3 +++
>  arch/arm64/kvm/arm.c  | 2 ++
>  arch/arm64/kvm/trace_arm.h| 8 
>  3 files changed, 13 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/kvm_host.h 
> b/arch/arm64/include/asm/kvm_host.h
> index daa68b053bdc..576f2c18d008 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -415,6 +415,9 @@ struct kvm_vcpu_arch {
>  
>   /* Arch specific exit reason */
>   enum arm_exit_reason exit_reason;
> +
> + /* Timestamp for the last vcpu exit */
> + u64 last_exit_time;
>  };
>  
>  /* Pointer to the vcpu's SVE FFR for sve_{save,load}_state() */
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index f49ebdd9c990..98631f79c182 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -783,6 +783,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
>   ret = 1;
>   run->exit_reason = KVM_EXIT_UNKNOWN;
>   while (ret > 0) {
> + trace_kvm_vcpu_exits(vcpu);

Exit? We haven't entered the guest yet!

>   /*
>* Check conditions before entering the guest
>*/
> @@ -898,6 +899,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
>   local_irq_enable();
>  
>   trace_kvm_exit(ret, kvm_vcpu_trap_get_class(vcpu), 
> *vcpu_pc(vcpu));
> + vcpu->arch.last_exit_time = ktime_to_ns(ktime_get());

Why isn't the above tracepoint sufficient? It gives you the EC, and
comes with a timestamp for free. And why should *everyone* pay the
price of this timestamp update if not tracing?

>  
>   /* Exit types that need handling before we can be preempted */
>   handle_exit_early(vcpu, ret);
> diff --git a/arch/arm64/kvm/trace_arm.h b/arch/arm64/kvm/trace_arm.h
> index 33e4e7dd2719..3e7dfd640e23 100644
> --- a/arch/arm64/kvm/trace_arm.h
> +++ b/arch/arm64/kvm/trace_arm.h
> @@ -301,6 +301,14 @@ TRACE_EVENT(kvm_timer_emulate,
> __entry->timer_idx, __entry->should_fire)
>  );
>  
> +/*
> + * Following tracepoints are not exported in tracefs and provide hooking
> + * mechanisms only for testing and debugging purposes.
> + */
> +DECLARE_TRACE(kvm_vcpu_exits,
> + TP_PROTO(struct kvm_vcpu *vcpu),
> + TP_ARGS(vcpu));
> +
>  #endif /* _TRACE_ARM_ARM64_KVM_H */
>  
>  #undef TRACE_INCLUDE_PATH

I guess this is the only bit I actually like about this series: a
generic, out of the way mechanism to let people hook whatever they
want and dump the state they need.

Thanks,

M.

-- 
Without deviation from the norm, progress is not possible.


Re: [PATCH v1 1/2] KVM: arm64: Add arch specific exit reasons

2022-03-17 Thread Marc Zyngier
Hi Jing,

On Thu, 17 Mar 2022 00:56:29 +,
Jing Zhang  wrote:
> 
> Arch specific exit reasons have been available for other architectures.
> Add arch specific exit reason support for ARM64, which would be used in
> KVM stats for monitoring VCPU status.
> 
> Signed-off-by: Jing Zhang 
> ---
>  arch/arm64/include/asm/kvm_emulate.h |  5 +++
>  arch/arm64/include/asm/kvm_host.h| 33 +++
>  arch/arm64/kvm/handle_exit.c | 62 +---
>  arch/arm64/kvm/mmu.c |  4 ++
>  arch/arm64/kvm/sys_regs.c|  6 +++
>  5 files changed, 105 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_emulate.h 
> b/arch/arm64/include/asm/kvm_emulate.h
> index d62405ce3e6d..f73c8d900642 100644
> --- a/arch/arm64/include/asm/kvm_emulate.h
> +++ b/arch/arm64/include/asm/kvm_emulate.h
> @@ -321,6 +321,11 @@ static inline bool kvm_vcpu_trap_is_iabt(const struct 
> kvm_vcpu *vcpu)
>   return kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_IABT_LOW;
>  }
>  
> +static inline bool kvm_vcpu_trap_is_dabt(const struct kvm_vcpu *vcpu)
> +{
> + return kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_DABT_LOW;
> +}
> +
>  static inline bool kvm_vcpu_trap_is_exec_fault(const struct kvm_vcpu *vcpu)
>  {
>   return kvm_vcpu_trap_is_iabt(vcpu) && !kvm_vcpu_abt_iss1tw(vcpu);
> diff --git a/arch/arm64/include/asm/kvm_host.h 
> b/arch/arm64/include/asm/kvm_host.h
> index 76f795b628f1..daa68b053bdc 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -282,6 +282,36 @@ struct vcpu_reset_state {
>   boolreset;
>  };
>  
> +enum arm_exit_reason {
> + ARM_EXIT_UNKNOWN,
> + ARM_EXIT_IRQ,
> + ARM_EXIT_EL1_SERROR,
> + ARM_EXIT_HYP_GONE,
> + ARM_EXIT_IL,
> + ARM_EXIT_WFI,
> + ARM_EXIT_WFE,
> + ARM_EXIT_CP15_32,
> + ARM_EXIT_CP15_64,
> + ARM_EXIT_CP14_32,
> + ARM_EXIT_CP14_LS,
> + ARM_EXIT_CP14_64,
> + ARM_EXIT_HVC32,
> + ARM_EXIT_SMC32,
> + ARM_EXIT_HVC64,
> + ARM_EXIT_SMC64,
> + ARM_EXIT_SYS64,
> + ARM_EXIT_SVE,
> + ARM_EXIT_IABT_LOW,
> + ARM_EXIT_DABT_LOW,
> + ARM_EXIT_SOFTSTP_LOW,
> + ARM_EXIT_WATCHPT_LOW,
> + ARM_EXIT_BREAKPT_LOW,
> + ARM_EXIT_BKPT32,
> + ARM_EXIT_BRK64,
> + ARM_EXIT_FP_ASIMD,
> + ARM_EXIT_PAC,
> +};
> +
>  struct kvm_vcpu_arch {
>   struct kvm_cpu_context ctxt;
>   void *sve_state;
> @@ -382,6 +412,9 @@ struct kvm_vcpu_arch {
>   u64 last_steal;
>   gpa_t base;
>   } steal;
> +
> + /* Arch specific exit reason */
> + enum arm_exit_reason exit_reason;

We already have a copy of ESR_EL2. Together with the exit code, this
gives you everything you need. Why add another piece of state?

[...]

> @@ -135,6 +179,7 @@ static int kvm_handle_unknown_ec(struct kvm_vcpu *vcpu)
>   kvm_pr_unimpl("Unknown exception class: esr: %#08x -- %s\n",
> esr, esr_get_class_string(esr));
>  
> + vcpu->arch.exit_reason = ARM_EXIT_UNKNOWN;

If anything, this should say "either CPU out of spec, or KVM bug". And
I don't see the point of tracking these. This should be reported in a
completely different manner, because this has nothing to do with the
normal exits a vcpu does.

> @@ -250,6 +299,7 @@ int handle_exit(struct kvm_vcpu *vcpu, int 
> exception_index)
>* EL2 has been reset to the hyp-stub. This happens when a guest
>* is pre-empted by kvm_reboot()'s shutdown call.
>*/
> + vcpu->arch.exit_reason = ARM_EXIT_HYP_GONE;

Same thing here: the machine is *rebooting*. Who cares?

>   run->exit_reason = KVM_EXIT_FAIL_ENTRY;
>   return 0;
>   case ARM_EXCEPTION_IL:
> @@ -257,11 +307,13 @@ int handle_exit(struct kvm_vcpu *vcpu, int 
> exception_index)
>* We attempted an illegal exception return.  Guest state must
>* have been corrupted somehow.  Give up.
>*/
> + vcpu->arch.exit_reason = ARM_EXIT_IL;

This is another reason why I dislike this patch. It mixes
architectural state (ESR) and KVM gunk (exit code). Why not spit these
two bits of information in the trace, and let whoever deals with it to
infer what they want from it?

>   run->exit_reason = KVM_EXIT_FAIL_ENTRY;
>   return -EINVAL;
>   default:
>   kvm_pr_unimpl("Unsupported exception type: %d",
> exception_index);
> + vcpu->arch.exit_reason = ARM_EXIT_UNKNOWN;

See? Now you have UNKNOWN covering two really different concepts.
That's broken. Overall, this patch is reinventing the wheel (and
slightly square one), and I don't see a good reason for the state
duplication.

Thanks,

M.

-- 
Without deviation from the norm, progress is not possible.

Re: Report an error on GICv4.1 vcpu de-schedule

2022-03-17 Thread Marc Zyngier
Hi Jingyi,

On Thu, 17 Mar 2022 07:27:45 +,
Jingyi Wang  wrote:
> 
> Hi Marc,
> 
> The patch "KVM: arm64: Delay the polling of the GICR_VPENDBASER.Dirty
> bit" (57e3cebd022fbc035dcf190ac789fd2ffc747f5b) removed the polling of
> the GICR_VPENDBASER.Dirty bit in vcpu_load(), instead checking that VPT
> parsing is ready in kvm_vgic_flush_hwstate() for better performance.
> 
> Most of the time this works, but we have recently hit an error on our
> hardware. In a preemptible kernel, the vcpu can be preempted between
> vcpu_load() and kvm_vgic_flush_hwstate(). As a result, it gets
> de-scheduled and its_clear_vpend_valid() is called:
> 
>   val = gicr_read_vpendbaser(vlpi_base + GICR_VPENDBASER);
>   val &= ~GICR_VPENDBASER_Valid;
>   val &= ~clr;
>   val |= set;
>   gicr_write_vpendbaser(val, vlpi_base + GICR_VPENDBASER);
> 
> 
> The function clears the Valid bit while GICR_VPENDBASER.Dirty may still
> be 1, which causes the subsequent GICR_VPENDBASER.Dirty polling to fail
> and report "ITS virtual pending table not cleaning".
> 
> We have communicated with Martin from ARM and reached the conclusion
> that we should not change the Valid bit while the Dirty bit is not
> clear: "The dirty bit reports whether the last schedule/de-schedule
> operation has completed. The restriction on not changing Valid when Dirty
> is 1 is so that hardware can always complete the last operation for
> starting the next."

Indeed, the spec is crystal clear about that, and clearing Valid while
Dirty is set is plain wrong.

> 
> I think we could check that the Dirty bit is clear before clearing the
> Valid bit in its_clear_vpend_valid(). I would like to know your opinion
> on this issue.

Yes, that's what should happen. I came up with the patch below. Please
give it a shot and let me know if that helps. If it does, I'll queue
it as a fix.

Thanks,

M.

From c23ccc9cfa603e30ac189d43af75f03b60d780bc Mon Sep 17 00:00:00 2001
From: Marc Zyngier 
Date: Thu, 17 Mar 2022 09:49:02 +
Subject: [PATCH] irqchip/gic-v4: Wait for GICR_VPENDBASER.Dirty to clear
 before descheduling

The way KVM drives GICv4.{0,1} is as follows:
- vcpu_load() makes the VPE resident, instructing the RD to start
  scanning for interrupts
- just before entering the guest, we check that the RD has finished
  scanning and that we can start running the vcpu
- on preemption, we deschedule the VPE by making it invalid on
  the RD

However, we are preemptible between the first two steps. If it so
happens *and* that the RD was still scanning, we nonetheless write
to the GICR_VPENDBASER register while Dirty is set, and bad things
happen (we're in UNPRED land).

This affects both the 4.0 and 4.1 implementations.

Make sure Dirty is cleared before performing the deschedule,
meaning that its_clear_vpend_valid() becomes a sort of full VPE
residency barrier.

Reported-by: Jingyi Wang 
Signed-off-by: Marc Zyngier 
Fixes: 57e3cebd022f ("KVM: arm64: Delay the polling of the GICR_VPENDBASER.Dirty bit")
Link: https://lore.kernel.org/r/4aae10ba-b39a-5f84-754b-69c2eb0a2...@huawei.com
---
 drivers/irqchip/irq-gic-v3-its.c | 28 +++-
 1 file changed, 19 insertions(+), 9 deletions(-)

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 9e93ff2b6375..c9b1df980899 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -3011,18 +3011,12 @@ static int __init allocate_lpi_tables(void)
return 0;
 }
 
-static u64 its_clear_vpend_valid(void __iomem *vlpi_base, u64 clr, u64 set)
+static u64 read_vpend_dirty_clear(void __iomem *vlpi_base)
 {
u32 count = 100;/* 1s! */
bool clean;
u64 val;
 
-   val = gicr_read_vpendbaser(vlpi_base + GICR_VPENDBASER);
-   val &= ~GICR_VPENDBASER_Valid;
-   val &= ~clr;
-   val |= set;
-   gicr_write_vpendbaser(val, vlpi_base + GICR_VPENDBASER);
-
do {
val = gicr_read_vpendbaser(vlpi_base + GICR_VPENDBASER);
clean = !(val & GICR_VPENDBASER_Dirty);
@@ -3033,10 +3027,26 @@ static u64 its_clear_vpend_valid(void __iomem *vlpi_base, u64 clr, u64 set)
}
} while (!clean && count);
 
-   if (unlikely(val & GICR_VPENDBASER_Dirty)) {
+   if (unlikely(!clean))
pr_err_ratelimited("ITS virtual pending table not cleaning\n");
+
+   return val;
+}
+
+static u64 its_clear_vpend_valid(void __iomem *vlpi_base, u64 clr, u64 set)
+{
+   u64 val;
+
+   /* Make sure we wait until the RD is done with the initial scan */
+   val = read_vpend_dirty_clear(vlpi_base);
+   val &= ~GICR_VPENDBASER_Valid;
+   val &= ~clr;
+   val |= set;
+   gicr_write_vpendbaser(val, vlpi_base + GICR_VPENDBASER);
+
+   val = read_vpend_dirty_clear(vlpi_base);
+   if (unlikely(val & GICR_VPENDBASER_Dirty))
val |= GICR_VPENDBASER_PendingLast;
-   }
 
return val;
 }
-- 
2.34.1


-- 
Without deviation from the norm, progress is not possible.

Re: [PATCH v12 20/40] arm64/sme: Implement streaming SVE signal handling

2022-03-17 Thread Thiago Jung Bauermann


Hello,

Mark Brown  writes:

> @@ -186,9 +189,16 @@ struct sve_context {
>   * sve_context.vl must equal the thread's current vector length when
>   * doing a sigreturn.
>   *
> + * On systems with support for SME the SVE register state may reflect either
> + * streaming or non-streaming mode.  In streaming mode the streaming mode
> + * vector length will be used and the flag SVE_SIG_FLAG_SM will be set in
> + * the flags field. It is permitted to enter or leave streaming mode in
> + * a signal return, applications should take care to ensure that any 
> difference
> + * in vector length between the two modes is handled, including any resixing
> + * and movement of context blocks.

s/resixing/resizing/

-- 
Thiago


Re: [PATCH v4 0/6] Improve KVM's interaction with CPU hotplug

2022-03-17 Thread Chao Gao
Ping. Can anyone help review this series (particularly patches 3-5)?

FYI, Sean gave his Reviewed-by to patches 1, 2, 5 and 6.


Re: [PATCH v12 11/40] arm64/sme: Identify supported SME vector lengths at boot

2022-03-17 Thread Thiago Jung Bauermann

Hello,

Just a small suggestion:

Mark Brown  writes:

> diff --git a/arch/arm64/include/asm/cpu.h b/arch/arm64/include/asm/cpu.h
> index d08062bcb9c1..550e1fc4ae6c 100644
> --- a/arch/arm64/include/asm/cpu.h
> +++ b/arch/arm64/include/asm/cpu.h
> @@ -64,6 +64,9 @@ struct cpuinfo_arm64 {
>  
>   /* pseudo-ZCR for recording maximum ZCR_EL1 LEN value: */
>   u64 reg_zcr;
> +
> + /* pseudo-SMCR for recording maximum ZCR_EL1 LEN value: */
> + u64 reg_smcr;
>  };

Perhaps append “when in streaming mode” to the comment above (or mention
streaming mode in some other way), to convey the difference between
reg_smcr and reg_zcr?

-- 
Thiago


Re: [PATCH v2 2/3] KVM: arm64: selftests: add arch_timer_edge_cases

2022-03-17 Thread Marc Zyngier

On 2022-03-17 06:44, Oliver Upton wrote:

On Wed, Mar 16, 2022 at 09:51:26PM -0700, Ricardo Koller wrote:

Add an arch_timer edge-cases selftest. For now, just add some basic
sanity checks, and some stress conditions (like waiting for the timers
while re-scheduling the vcpu). The next commit will add the actual edge
case tests.

This test fails without a867e9d0cc1 "KVM: arm64: Don't miss pending
interrupts for suspended vCPU".

Reviewed-by: Reiji Watanabe 
Reviewed-by: Raghavendra Rao Ananta 
Signed-off-by: Ricardo Koller 


[...]


+   asm volatile("wfi\n"
+"msr daifclr, #2\n"
+/* handle IRQ */


I believe an isb is owed here (DDI0487G.b D1.13.4). Annoyingly, I am
having a hard time finding the same language in the H.a revision of the
manual :-/


D1.3.6 probably is what you are looking for.

"Context synchronization event" is the key phrase to remember
when grepping through the ARM ARM. And yes, the new layout is
a nightmare (as if we really needed an additional 2800 pages...).




+"msr daifset, #2\n"
+: : : "memory");
+   }
+}


[...]


+   /* tval should keep down-counting from 0 to -1. */
+   SET_COUNTER(DEF_CNT, test_args.timer);
+   timer_set_tval(test_args.timer, 0);
+   if (use_sched)
+   USERSPACE_SCHEDULE();
+   /* We just need 1 cycle to pass. */
+   isb();


Somewhat paranoid, but:

If the CPU retires the ISB much more quickly than the counter ticks, it's
possible that you could observe an invalid TVAL even with a valid
implementation.


Worse than that:

- ISB doesn't need to take any time at all. It just needs to ensure
  that everything is synchronised. Depending on how the CPU is built,
  this can come for free.

- There is no relation between the counter ticks and CPU cycles.

What if you spin waiting for CNT to increment before the assertion? Then
you know for sure (and can tell by reading the test) that the
implementation is broken.


That's basically the only way to implement this. You can't rely
on any other event.

Thanks,

M.
--
Jazz is not dead. It just smells funny...