Re: [PATCH 0/3] KVM: arm64: Handle CCSIDR associativity mismatches

2022-12-11 Thread Akihiko Odaki

On 2022/12/11 19:21, Marc Zyngier wrote:

On Sun, 11 Dec 2022 05:25:31 +,
Akihiko Odaki  wrote:


On 2022/12/04 23:57, Marc Zyngier wrote:

On Fri, 02 Dec 2022 09:55:24 +,
Akihiko Odaki  wrote:


On 2022/12/02 18:40, Marc Zyngier wrote:

On Fri, 02 Dec 2022 05:17:12 +,
Akihiko Odaki  wrote:



On the M2 MacBook Air, I have seen no other difference in the standard ID
registers; the CCSIDRs are the exception. Perhaps Apple designed it this way
so that macOS's Hypervisor can freely migrate vCPUs, but I can't confirm
that without more analysis. This is still enough to migrate vCPUs
running Linux, at least.


I guess that MacOS hides more of the underlying HW than KVM does. And
KVM definitely doesn't hide the MIDR_EL1 registers, which *are*
different between the two clusters.


It seems KVM stores the MIDR value of one CPU and reuses it as the
"invariant" value for ioctls, while it exposes the MIDR value each
physical CPU owns to the vCPU.


This only affects the VMM though, and not the guest which sees the
MIDR of the CPU it runs on. The problem is that, short of pinning
the vcpus, you don't know where they will run. So any value is fair
game.


Yes, my concern is that the VMM can be confused if it sees something
different from what the guest on the vCPU sees.


Well, this has been part of the ABI for about 10 years, since Rusty
introduced this notion of invariant, so userspace is already working
around it if that's an actual issue.


In that case, I think it is better to document that the interface is
not working properly and is deprecated.


This means nothing. Deprecating an API doesn't mean we don't support
it and doesn't solve any issue for existing userspace.

I'd rather not change anything, TBH. Existing userspace already knows
how to deal with this,





This would be easily addressed though, and shouldn't result in any
issue. The following should do the trick (only lightly tested on an
M1).


This can be problematic when restoring vcpu state saved with the old
kernel. A possible solution is to allow userspace to overwrite
MIDR_EL1 as proposed for CCSIDR_EL1.


That would break most guests for obvious reasons. At best what can be
done is to make the MIDR WI.


Making MIDR WI sounds good to me. Whether we keep the current behavior
or make it WI, I think the behavior should be documented. The
documentation obviously does not help existing userspace code run, but
it will be helpful when writing new userspace code or understanding how
existing userspace code works.


Regards,
Akihiko Odaki



M.


___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH 0/3] KVM: arm64: Handle CCSIDR associativity mismatches

2022-12-11 Thread Marc Zyngier
On Sun, 11 Dec 2022 05:25:31 +,
Akihiko Odaki  wrote:
> 
> On 2022/12/04 23:57, Marc Zyngier wrote:
> > On Fri, 02 Dec 2022 09:55:24 +,
> > Akihiko Odaki  wrote:
> >> 
> >> On 2022/12/02 18:40, Marc Zyngier wrote:
> >>> On Fri, 02 Dec 2022 05:17:12 +,
> >>> Akihiko Odaki  wrote:
>  
> >> On the M2 MacBook Air, I have seen no other difference in the standard ID
> >> registers; the CCSIDRs are the exception. Perhaps Apple designed it this way
> >> so that macOS's Hypervisor can freely migrate vCPUs, but I can't confirm
> >> that without more analysis. This is still enough to migrate vCPUs
> >> running Linux, at least.
> > 
> > I guess that MacOS hides more of the underlying HW than KVM does. And
> > KVM definitely doesn't hide the MIDR_EL1 registers, which *are*
> > different between the two clusters.
>  
>  It seems KVM stores the MIDR value of one CPU and reuses it as the
>  "invariant" value for ioctls, while it exposes the MIDR value each
>  physical CPU owns to the vCPU.
> >>> 
> >>> This only affects the VMM though, and not the guest which sees the
> >>> MIDR of the CPU it runs on. The problem is that, short of pinning
> >>> the vcpus, you don't know where they will run. So any value is fair
> >>> game.
> >> 
> >> Yes, my concern is that the VMM can be confused if it sees something
> >> different from what the guest on the vCPU sees.
> > 
> > Well, this has been part of the ABI for about 10 years, since Rusty
> > introduced this notion of invariant, so userspace is already working
> > around it if that's an actual issue.
> 
> In that case, I think it is better to document that the interface is
> not working properly and is deprecated.

This means nothing. Deprecating an API doesn't mean we don't support
it and doesn't solve any issue for existing userspace.

I'd rather not change anything, TBH. Existing userspace already knows
how to deal with this,

> 
> > 
> > This would be easily addressed though, and shouldn't result in any
> > issue. The following should do the trick (only lightly tested on an
> > M1).
> 
> This can be problematic when restoring vcpu state saved with the old
> kernel. A possible solution is to allow userspace to overwrite
> MIDR_EL1 as proposed for CCSIDR_EL1.

That would break most guests for obvious reasons. At best what can be
done is to make the MIDR WI.

M.

-- 
Without deviation from the norm, progress is not possible.


Re: [PATCH 0/3] KVM: arm64: Handle CCSIDR associativity mismatches

2022-12-10 Thread Akihiko Odaki

On 2022/12/04 23:57, Marc Zyngier wrote:

On Fri, 02 Dec 2022 09:55:24 +,
Akihiko Odaki  wrote:


On 2022/12/02 18:40, Marc Zyngier wrote:

On Fri, 02 Dec 2022 05:17:12 +,
Akihiko Odaki  wrote:



On the M2 MacBook Air, I have seen no other difference in the standard ID
registers; the CCSIDRs are the exception. Perhaps Apple designed it this way
so that macOS's Hypervisor can freely migrate vCPUs, but I can't confirm
that without more analysis. This is still enough to migrate vCPUs
running Linux, at least.


I guess that MacOS hides more of the underlying HW than KVM does. And
KVM definitely doesn't hide the MIDR_EL1 registers, which *are*
different between the two clusters.


It seems KVM stores the MIDR value of one CPU and reuses it as the
"invariant" value for ioctls, while it exposes the MIDR value each
physical CPU owns to the vCPU.


This only affects the VMM though, and not the guest which sees the
MIDR of the CPU it runs on. The problem is that, short of pinning
the vcpus, you don't know where they will run. So any value is fair
game.


Yes, my concern is that the VMM can be confused if it sees something
different from what the guest on the vCPU sees.


Well, this has been part of the ABI for about 10 years, since Rusty
introduced this notion of invariant, so userspace is already working
around it if that's an actual issue.


In that case, I think it is better to document that the interface is not
working properly and is deprecated.




This would be easily addressed though, and shouldn't result in any
issue. The following should do the trick (only lightly tested on an
M1).


This can be problematic when restoring vcpu state saved with the old 
kernel. A possible solution is to allow userspace to overwrite
MIDR_EL1 as proposed for CCSIDR_EL1.


Regards,
Akihiko Odaki



Thanks,

M.

 From f1caacb89eb8ae40dc38669160a2f081f87f4b15 Mon Sep 17 00:00:00 2001
From: Marc Zyngier 
Date: Sun, 4 Dec 2022 14:22:22 +
Subject: [PATCH] KVM: arm64: Return MIDR_EL1 to userspace as seen on the vcpu
  thread

When booting, KVM samples the MIDR of the CPU it initialises on,
and keeps this as the value that will forever be exposed to userspace.

However, this has nothing to do with the value that the guest will
see. On an asymmetric system, this can result in userspace observing
weird things, especially if it has pinned the vcpus on a *different*
set of CPUs.

Instead, return the MIDR value for the vcpu we're currently on and
that the vcpu will observe if it has been pinned onto that CPU.

For symmetric systems, this changes nothing. For asymmetric machines,
they will observe the correct MIDR value at the point of the call.

Reported-by: Akihiko Odaki 
Signed-off-by: Marc Zyngier 
---
 arch/arm64/kvm/sys_regs.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index f4a7c5abcbca..f6bcf8ba9b2e 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -1246,6 +1246,22 @@ static int set_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
 	return 0;
 }
 
+static int get_midr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
+		    u64 *val)
+{
+	*val = read_sysreg(midr_el1);
+	return 0;
+}
+
+static int set_midr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
+		    u64 val)
+{
+	if (val != read_sysreg(midr_el1))
+		return -EINVAL;
+
+	return 0;
+}
+
 static int get_raz_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
 		       u64 *val)
 {
@@ -1432,6 +1448,7 @@ static const struct sys_reg_desc sys_reg_descs[] = {
 
 	{ SYS_DESC(SYS_DBGVCR32_EL2), NULL, reset_val, DBGVCR32_EL2, 0 },
 
+	{ SYS_DESC(SYS_MIDR_EL1), .get_user = get_midr, .set_user = set_midr },
 	{ SYS_DESC(SYS_MPIDR_EL1), NULL, reset_mpidr, MPIDR_EL1 },
 
 	/*
@@ -2609,7 +2626,6 @@ id_to_sys_reg_desc(struct kvm_vcpu *vcpu, u64 id,
 		((struct sys_reg_desc *)r)->val = read_sysreg(reg);	\
 	}
 
-FUNCTION_INVARIANT(midr_el1)
 FUNCTION_INVARIANT(revidr_el1)
 FUNCTION_INVARIANT(clidr_el1)
 FUNCTION_INVARIANT(aidr_el1)
@@ -2621,7 +2637,6 @@ static void get_ctr_el0(struct kvm_vcpu *v, const struct sys_reg_desc *r)
 
 /* ->val is filled in by kvm_sys_reg_table_init() */
 static struct sys_reg_desc invariant_sys_regs[] = {
-	{ SYS_DESC(SYS_MIDR_EL1), NULL, get_midr_el1 },
 	{ SYS_DESC(SYS_REVIDR_EL1), NULL, get_revidr_el1 },
 	{ SYS_DESC(SYS_CLIDR_EL1), NULL, get_clidr_el1 },
 	{ SYS_DESC(SYS_AIDR_EL1), NULL, get_aidr_el1 },



Re: [PATCH 0/3] KVM: arm64: Handle CCSIDR associativity mismatches

2022-12-04 Thread Marc Zyngier
On Fri, 02 Dec 2022 09:55:24 +,
Akihiko Odaki  wrote:
> 
> On 2022/12/02 18:40, Marc Zyngier wrote:
> > On Fri, 02 Dec 2022 05:17:12 +,
> > Akihiko Odaki  wrote:
> >> 
>  On the M2 MacBook Air, I have seen no other difference in the standard ID
>  registers; the CCSIDRs are the exception. Perhaps Apple designed it this way
>  so that macOS's Hypervisor can freely migrate vCPUs, but I can't confirm
>  that without more analysis. This is still enough to migrate vCPUs
>  running Linux, at least.
> >>> 
> >>> I guess that MacOS hides more of the underlying HW than KVM does. And
> >>> KVM definitely doesn't hide the MIDR_EL1 registers, which *are*
> >>> different between the two clusters.
> >> 
> >> It seems KVM stores the MIDR value of one CPU and reuses it as the
> >> "invariant" value for ioctls, while it exposes the MIDR value each
> >> physical CPU owns to the vCPU.
> > 
> > This only affects the VMM though, and not the guest which sees the
> > MIDR of the CPU it runs on. The problem is that, short of pinning
> > the vcpus, you don't know where they will run. So any value is fair
> > game.
> 
> Yes, my concern is that the VMM can be confused if it sees something
> different from what the guest on the vCPU sees.

Well, this has been part of the ABI for about 10 years, since Rusty
introduced this notion of invariant, so userspace is already working
around it if that's an actual issue.

This would be easily addressed though, and shouldn't result in any
issue. The following should do the trick (only lightly tested on an
M1).

Thanks,

M.

From f1caacb89eb8ae40dc38669160a2f081f87f4b15 Mon Sep 17 00:00:00 2001
From: Marc Zyngier 
Date: Sun, 4 Dec 2022 14:22:22 +
Subject: [PATCH] KVM: arm64: Return MIDR_EL1 to userspace as seen on the vcpu
 thread

When booting, KVM samples the MIDR of the CPU it initialises on,
and keeps this as the value that will forever be exposed to userspace.

However, this has nothing to do with the value that the guest will
see. On an asymmetric system, this can result in userspace observing
weird things, especially if it has pinned the vcpus on a *different*
set of CPUs.

Instead, return the MIDR value for the vcpu we're currently on and
that the vcpu will observe if it has been pinned onto that CPU.

For symmetric systems, this changes nothing. For asymmetric machines,
they will observe the correct MIDR value at the point of the call.

Reported-by: Akihiko Odaki 
Signed-off-by: Marc Zyngier 
---
 arch/arm64/kvm/sys_regs.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index f4a7c5abcbca..f6bcf8ba9b2e 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -1246,6 +1246,22 @@ static int set_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
return 0;
 }
 
+static int get_midr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
+   u64 *val)
+{
+   *val = read_sysreg(midr_el1);
+   return 0;
+}
+
+static int set_midr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
+   u64 val)
+{
+   if (val != read_sysreg(midr_el1))
+   return -EINVAL;
+
+   return 0;
+}
+
 static int get_raz_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
   u64 *val)
 {
@@ -1432,6 +1448,7 @@ static const struct sys_reg_desc sys_reg_descs[] = {
 
{ SYS_DESC(SYS_DBGVCR32_EL2), NULL, reset_val, DBGVCR32_EL2, 0 },
 
+   { SYS_DESC(SYS_MIDR_EL1), .get_user = get_midr, .set_user = set_midr },
{ SYS_DESC(SYS_MPIDR_EL1), NULL, reset_mpidr, MPIDR_EL1 },
 
/*
@@ -2609,7 +2626,6 @@ id_to_sys_reg_desc(struct kvm_vcpu *vcpu, u64 id,
((struct sys_reg_desc *)r)->val = read_sysreg(reg); \
}
 
-FUNCTION_INVARIANT(midr_el1)
 FUNCTION_INVARIANT(revidr_el1)
 FUNCTION_INVARIANT(clidr_el1)
 FUNCTION_INVARIANT(aidr_el1)
@@ -2621,7 +2637,6 @@ static void get_ctr_el0(struct kvm_vcpu *v, const struct sys_reg_desc *r)
 
 /* ->val is filled in by kvm_sys_reg_table_init() */
 static struct sys_reg_desc invariant_sys_regs[] = {
-   { SYS_DESC(SYS_MIDR_EL1), NULL, get_midr_el1 },
{ SYS_DESC(SYS_REVIDR_EL1), NULL, get_revidr_el1 },
{ SYS_DESC(SYS_CLIDR_EL1), NULL, get_clidr_el1 },
{ SYS_DESC(SYS_AIDR_EL1), NULL, get_aidr_el1 },
-- 
2.34.1


-- 
Without deviation from the norm, progress is not possible.


Re: [PATCH 0/3] KVM: arm64: Handle CCSIDR associativity mismatches

2022-12-02 Thread Oliver Upton
On Thu, Dec 01, 2022 at 11:14:43PM +, Marc Zyngier wrote:
> On Thu, 01 Dec 2022 18:29:51 +,
> Oliver Upton  wrote:
> > Could we extend your suggestion about accepting different topologies to
> > effectively tolerate _any_ topology provided by userspace? KVM can
> > default to the virtual topology, but a well-informed userspace could
> > still provide different values to its guest. No point in trying to
> > babyproofing the UAPI further, IMO.
> 
> I think this is *exactly* what I suggested. Any valid topology should
> be able to be restored, as we currently present the VM with any
> topology the host HW may have. This must be preserved.

Ah, I was narrowly reading into the conversation as it relates to the M2
implementation, my bad. SGTM :)

--
Thanks,
Oliver


Re: [PATCH 0/3] KVM: arm64: Handle CCSIDR associativity mismatches

2022-12-02 Thread Akihiko Odaki

On 2022/12/02 18:40, Marc Zyngier wrote:

On Fri, 02 Dec 2022 05:17:12 +,
Akihiko Odaki  wrote:



On the M2 MacBook Air, I have seen no other difference in the standard ID
registers; the CCSIDRs are the exception. Perhaps Apple designed it this way
so that macOS's Hypervisor can freely migrate vCPUs, but I can't confirm
that without more analysis. This is still enough to migrate vCPUs
running Linux, at least.


I guess that MacOS hides more of the underlying HW than KVM does. And
KVM definitely doesn't hide the MIDR_EL1 registers, which *are*
different between the two clusters.


It seems KVM stores the MIDR value of one CPU and reuses it as the
"invariant" value for ioctls, while it exposes the MIDR value each
physical CPU owns to the vCPU.


This only affects the VMM though, and not the guest which sees the
MIDR of the CPU it runs on. The problem is that, short of pinning
the vcpus, you don't know where they will run. So any value is fair
game.


Yes, my concern is that the VMM can be confused if it sees something
different from what the guest on the vCPU sees.



crosvm uses KVM on big.LITTLE processors by pinning
vCPUs to physical CPUs, and it is a real-world application which needs
to be supported.

For an application like crosvm, you would expect the vCPU thread to get
the MIDR value of the physical CPU which the thread is pinned to when it
calls an ioctl, but in reality it can get that of another arbitrary CPU.


No. It will get the MIDR of the CPU it runs on. Check again. What you
are describing above applies solely to userspace.


By "ioctl", I meant the value the VMM gets for the vCPU thread. The
problem is that the guest on the vCPU and the VMM issuing an ioctl on
the same thread can see different values, which doesn't seem right.



...or we may just say the value of MPIDR_EL0 (and possibly other


I assume you meant MIDR_EL1 here, as MPIDR_EL1 is something else (and
it has no _EL0 equivalent).


Yes, I meant MIDR_EL1.




"invariant" registers) exposed via ioctl are useless and deprecated.


Useless? Not really. They are all meaningful to the guest, and a change
there will cause issues.

CTR_EL0 must, for example, be an invariant. Otherwise, you need to
trap all the CMOs when the {I,D}minLine values that are restored from
userspace are bigger than the ones the HW has. Even worse, when the
DIC/IDC bits are set from userspace while the HW has them cleared: you
cannot mitigate that one, and you'll end up with memory corruption.

I've been toying with the idea of exposing to guests the list of
MIDR/REVIDR the guest is allowed to run on, as a PV service. This
would allow that guest to enable all the mitigations it wants in one
go.

Not sure I have time for this at the moment, but that'd be something
to explore.

[...]


I meant that the values exposed to the VMM via ioctls do not seem
useful. They still need to be exposed to the guest, as you say.


Regards,
Akihiko Odaki


Re: [PATCH 0/3] KVM: arm64: Handle CCSIDR associativity mismatches

2022-12-02 Thread Marc Zyngier
On Fri, 02 Dec 2022 05:17:12 +,
Akihiko Odaki  wrote:
> 
> >> On M2 MacBook Air, I have seen no other difference in standard ID
> >> registers and CCSIDRs are exceptions. Perhaps Apple designed this way
> >> so that macOS's Hypervisor can freely migrate vCPU, but I can't assure
> >> that without more analysis. This is still enough to migrate vCPU
> >> running Linux at least.
> > 
> > I guess that MacOS hides more of the underlying HW than KVM does. And
> > KVM definitely doesn't hide the MIDR_EL1 registers, which *are*
> > different between the two clusters.
> 
> It seems KVM stores the MIDR value of one CPU and reuses it as the
> "invariant" value for ioctls, while it exposes the MIDR value each
> physical CPU owns to the vCPU.

This only affects the VMM though, and not the guest which sees the
MIDR of the CPU it runs on. The problem is that, short of pinning
the vcpus, you don't know where they will run. So any value is fair
game.

> This may be a problem worth fixing. My understanding is that while
> there is no serious application which requires vCPU migration among
> physical clusters,

Hey, I do that all the time with kvmtool! It's just that my guests do
not care about being run on one CPU or another.

> crosvm uses KVM on big.LITTLE processors by pinning
> vCPUs to physical CPUs, and it is a real-world application which needs
> to be supported.
> 
> For an application like crosvm, you would expect the vCPU thread to get
> the MIDR value of the physical CPU which the thread is pinned to when it
> calls an ioctl, but in reality it can get that of another arbitrary CPU.

No. It will get the MIDR of the CPU it runs on. Check again. What you
are describing above applies solely to userspace.

> 
> Fixing this problem will pose two design questions:
> 
> 1. Should it expose a value consistent among clusters?
> 
> For example, we can change the KVM initialization code so that it
> initializes VPIDR with the value stored as "invariant". This would
> help migrating vCPU among clusters, but if you pin each vCPU thread to
> a distinct phyiscal CPU, you may rather want the vCPU to see the MIDR
> value specific to each physical CPU and to apply quirks or tuning
> parameters according to the value.

Which is what happens. Not at the cluster level, but at the CPU
level. The architecture doesn't describe what a *cluster* is.

> 2. Should it be invariant or variable?
> 
> Fortunately, making it variable is easy. Arm provides the VPIDR_EL1
> register to specify the value exposed as MPIDR_EL0, so there is no
> trapping cost.

And if you do that you make it impossible for the guest to mitigate
errata, as most of the errata handling is based on the MIDR values.

> ...or we may just say the value of MPIDR_EL0 (and possibly other

I assume you meant MIDR_EL1 here, as MPIDR_EL1 is something else (and
it has no _EL0 equivalent).

> "invariant" registers) exposed via ioctl are useless and deprecated.

Useless? Not really. They are all meaningful to the guest, and a change
there will cause issues.

CTR_EL0 must, for example, be an invariant. Otherwise, you need to
trap all the CMOs when the {I,D}minLine values that are restored from
userspace are bigger than the ones the HW has. Even worse, when the
DIC/IDC bits are set from userspace while the HW has them cleared: you
cannot mitigate that one, and you'll end up with memory corruption.

I've been toying with the idea of exposing to guests the list of
MIDR/REVIDR the guest is allowed to run on, as a PV service. This
would allow that guest to enable all the mitigations it wants in one
go.

Not sure I have time for this at the moment, but that'd be something
to explore.

[...]

> > So let's first build on top of HCR_EL2.TID2, and only then once we
> > have an idea of the overhead add support for HCR_EL2.TID4 for the
> > systems that have FEAT_EVT.
> 
> That sounds good, I'll write a new series according to this idea.

Thanks!

M.

-- 
Without deviation from the norm, progress is not possible.


Re: [PATCH 0/3] KVM: arm64: Handle CCSIDR associativity mismatches

2022-12-01 Thread Akihiko Odaki




On 2022/12/02 8:13, Marc Zyngier wrote:

On Thu, 01 Dec 2022 17:26:08 +,
Akihiko Odaki  wrote:


On 2022/12/01 20:06, Marc Zyngier wrote:

On Thu, 01 Dec 2022 10:49:11 +,
Akihiko Odaki  wrote:

Thanks for looking into this.


The M2 MacBook Air has mismatched CCSIDR associativity bits, which makes
the bits a KVM vCPU sees inconsistent when migrating.


Can you describe the actual discrepancy? Is that an issue between the
two core types? In which case, nothing says that these two cluster
should have the same cache topology.


Yes, the processor has big.LITTLE configuration.

On the processor, the valid CSSELR values are 0 (L1D), 1 (L1I), 3
(L2D). For each CSSELR value, each cluster has:
- 0x700FE03A, 0x203FE01A, 0x70FFE07B
- 0x701FE03A, 0x203FE02A, 0x73FFE07B


This is a perfectly valid configuration. The architecture doesn't
place any limitation on how different or identical the cache
hierarchies are from the PoV of each CPU. Actually, most big-little
systems show similar differences across their clusters.


It also makes QEMU fail to restore the vCPU registers, because QEMU saves
and restores all of the registers including CCSIDRs, and if the vCPU
has migrated among physical CPUs between saving and restoring, it tries to
restore CCSIDR values that mismatch with the current physical CPU, which
causes EFAULT.


Well, QEMU will have plenty of other problems, starting with MIDRs,
which always reflect the physical one. In general, KVM isn't well
geared for VMs spanning multiple CPU types. It is improving, but there
is a long way to go.


On the M2 MacBook Air, I have seen no other difference in the standard ID
registers; the CCSIDRs are the exception. Perhaps Apple designed it this way
so that macOS's Hypervisor can freely migrate vCPUs, but I can't confirm
that without more analysis. This is still enough to migrate vCPUs
running Linux, at least.


I guess that MacOS hides more of the underlying HW than KVM does. And
KVM definitely doesn't hide the MIDR_EL1 registers, which *are*
different between the two clusters.


It seems KVM stores the MIDR value of one CPU and reuses it as the
"invariant" value for ioctls, while it exposes the MIDR value each
physical CPU owns to the vCPU.


This may be a problem worth fixing. My understanding is that while there 
is no serious application which requires vCPU migration among physical 
clusters, crosvm uses KVM on big.LITTLE processors by pinning vCPUs to
physical CPUs, and it is a real-world application which needs to be
supported.


For an application like crosvm, you would expect the vCPU thread to get
the MIDR value of the physical CPU which the thread is pinned to when it
calls an ioctl, but in reality it can get that of another arbitrary CPU.


Fixing this problem will pose two design questions:

1. Should it expose a value consistent among clusters?

For example, we can change the KVM initialization code so that it 
initializes VPIDR with the value stored as "invariant". This would help
migrate vCPUs among clusters, but if you pin each vCPU thread to a
distinct physical CPU, you may rather want the vCPU to see the MIDR
value specific to each physical CPU and to apply quirks or tuning 
parameters according to the value.


2. Should it be invariant or variable?

Fortunately, making it variable is easy. Arm provides the VPIDR_EL1
register to specify the value exposed as MPIDR_EL0, so there is no
trapping cost.


...or we may just say the value of MPIDR_EL0 (and possibly other 
"invariant" registers) exposed via ioctl are useless and deprecated.





Trap CCSIDRs if there are CCSIDR value mismatches, and override the
associativity bits when handling the trap.


TBH, I'd rather we stop reporting this stuff altogether.

There is nothing a correctly written arm64 guest should do with any of
this (this is only useful for set/way CMOs, which non-secure SW should
never issue). It would be a lot better to expose a virtual topology
(one set, one way, one level). It would also save us from the CCSIDRX
silliness.

The only complexity would be to still accept different topologies from
userspace so that we can restore a VM saved before this virtual
topology.


Another (minor) concern is that trapping relevant registers may cost
too much. Currently KVM traps CSSELR and CCSIDR accesses with
HCR_TID2, but HCR_TID2 also affects CTR_EL0.


It will have an additional impact (JITs, for example, will take a hit
if they don't cache that value), but this is pretty easy to mitigate
if it proves to have too much of an impact. We already have a bunch of
fast-paths for things that we want to emulate more efficiently, and
CTR_EL0 could be one of them,


Although I'm not sure if the register is referenced frequently, Arm
introduced FEAT_EVT to trap CSSELR and CCSIDR but not CTR_EL0, so
there may be some case where trapping CTR_EL0 is not
tolerated. Perhaps Arm worried that a userspace application may read
CTR_EL0 frequently.


FEAT_EVT is one of these "let's add random traps" extensions,
culminating in FEAT_FGT. Having FEAT_EVT would make it more efficient,
but we need to support this for all revisions of the architecture.

Re: [PATCH 0/3] KVM: arm64: Handle CCSIDR associativity mismatches

2022-12-01 Thread Marc Zyngier
On Thu, 01 Dec 2022 18:29:51 +,
Oliver Upton  wrote:
> 
> On Thu, Dec 01, 2022 at 11:06:50AM +, Marc Zyngier wrote:
> 
> [...]
> 
> > It would be a lot better to expose a virtual topology
> > (one set, one way, one level). It would also save us from the CCSIDRX
> > silliness.
> > 
> > The only complexity would be to still accept different topologies from
> > userspace so that we can restore a VM saved before this virtual
> > topology.
> 
> I generally agree that the reported topology is meaningless to
> non-secure software.
> 
> However, with the cloud vendor hat on, I'm worried that inevitably some
> customer will inspect the cache topology of the VM we've provided them
> and complain.

That's their prerogative. It is idiotic, but I guess paying customers
get this privilege ;-).

> Could we extend your suggestion about accepting different topologies to
> effectively tolerate _any_ topology provided by userspace? KVM can
> default to the virtual topology, but a well-informed userspace could
> still provide different values to its guest. No point in trying to
> babyproofing the UAPI further, IMO.

I think this is *exactly* what I suggested. Any valid topology should
be able to be restored, as we currently present the VM with any
topology the host HW may have. This must be preserved.

Eventually, we may even have to expose CCSIDRX, but let's cross that
bridge when we get to it.

Thanks,

M.

-- 
Without deviation from the norm, progress is not possible.


Re: [PATCH 0/3] KVM: arm64: Handle CCSIDR associativity mismatches

2022-12-01 Thread Marc Zyngier
On Thu, 01 Dec 2022 17:26:08 +,
Akihiko Odaki  wrote:
> 
> On 2022/12/01 20:06, Marc Zyngier wrote:
> > On Thu, 01 Dec 2022 10:49:11 +,
> > Akihiko Odaki  wrote:
> > 
> > Thanks for looking into this.
> > 
> >> The M2 MacBook Air has mismatched CCSIDR associativity bits, which makes
> >> the bits a KVM vCPU sees inconsistent when migrating.
> > 
> > Can you describe the actual discrepancy? Is that an issue between the
> > two core types? In which case, nothing says that these two cluster
> > should have the same cache topology.
> 
> Yes, the processor has big.LITTLE configuration.
> 
> On the processor, the valid CSSELR values are 0 (L1D), 1 (L1I), 3
> (L2D). For each CSSELR value, each cluster has:
> - 0x700FE03A, 0x203FE01A, 0x70FFE07B
> - 0x701FE03A, 0x203FE02A, 0x73FFE07B

This is a perfectly valid configuration. The architecture doesn't
place any limitation on how different or identical the cache
hierarchies are from the PoV of each CPU. Actually, most big-little
systems show similar differences across their clusters.

> >> It also makes QEMU fail to restore the vCPU registers, because QEMU saves
> >> and restores all of the registers including CCSIDRs, and if the vCPU
> >> has migrated among physical CPUs between saving and restoring, it tries to
> >> restore CCSIDR values that mismatch with the current physical CPU, which
> >> causes EFAULT.
> > 
> > Well, QEMU will have plenty of other problems, starting with MIDRs,
> > which always reflect the physical one. In general, KVM isn't well
> > geared for VMs spanning multiple CPU types. It is improving, but there
> > is a long way to go.
> 
> On the M2 MacBook Air, I have seen no other difference in the standard ID
> registers; the CCSIDRs are the exception. Perhaps Apple designed it this way
> so that macOS's Hypervisor can freely migrate vCPUs, but I can't confirm
> that without more analysis. This is still enough to migrate vCPUs
> running Linux, at least.

I guess that MacOS hides more of the underlying HW than KVM does. And
KVM definitely doesn't hide the MIDR_EL1 registers, which *are*
different between the two clusters.

> >> Trap CCSIDRs if there are CCSIDR value mismatches, and override the
> >> associativity bits when handling the trap.
> > 
> > TBH, I'd rather we stop reporting this stuff altogether.
> > 
> > There is nothing a correctly written arm64 guest should do with any of
> > this (this is only useful for set/way CMOs, which non-secure SW should
> > never issue). It would be a lot better to expose a virtual topology
> > (one set, one way, one level). It would also save us from the CCSIDRX
> > silliness.
> > 
> > The only complexity would be to still accept different topologies from
> > userspace so that we can restore a VM saved before this virtual
> > topology.
> 
> Another (minor) concern is that trapping relevant registers may cost
> too much. Currently KVM traps CSSELR and CCSIDR accesses with
> HCR_TID2, but HCR_TID2 also affects CTR_EL0.

It will have an additional impact (JITs, for example, will take a hit
if they don't cache that value), but this is pretty easy to mitigate
if it proves too costly. We already have a bunch of fast-paths for
things that we want to emulate more efficiently, and CTR_EL0 could be
one of them.

> Although I'm not sure how frequently the register is read, Arm
> introduced FEAT_EVT to trap CSSELR and CCSIDR but not CTR_EL0, so
> there may be cases where trapping CTR_EL0 is not tolerable. Perhaps
> Arm worried that a userspace application may read CTR_EL0 frequently.

FEAT_EVT is one of these "let's add random traps" extensions,
culminating in FEAT_FGT. Having FEAT_EVT would make it more efficient,
but we need to support this for all revisions of the architecture.

So let's first build on top of HCR_EL2.TID2, and only then once we
have an idea of the overhead add support for HCR_EL2.TID4 for the
systems that have FEAT_EVT.

> If you think the concern on VM restoration you mentioned and the
> trapping overhead are tolerable, I'll write a new, much smaller patch
> accordingly.

That would be great, thanks. There are a number of gotchas around that
(like the 32bit stuff that already plays the emulation game), but this
is the right time to start and have something in 6.3 if you keep to it!

M.

-- 
Without deviation from the norm, progress is not possible.
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH 0/3] KVM: arm64: Handle CCSIDR associativity mismatches

2022-12-01 Thread Oliver Upton
On Thu, Dec 01, 2022 at 11:06:50AM +, Marc Zyngier wrote:

[...]

> It would be a lot better to expose a virtual topology
> (one set, one way, one level). It would also save us from the CCSIDRX
> silliness.
> 
> The only complexity would be to still accept different topologies from
> userspace so that we can restore a VM saved before this virtual
> topology.

I generally agree that the reported topology is meaningless to
non-secure software.

However, with the cloud vendor hat on, I'm worried that inevitably some
customer will inspect the cache topology of the VM we've provided them
and complain.

Could we extend your suggestion about accepting different topologies to
effectively tolerate _any_ topology provided by userspace? KVM can
default to the virtual topology, but a well-informed userspace could
still provide different values to its guest. No point in trying to
babyproof the UAPI further, IMO.

--
Thanks,
Oliver


Re: [PATCH 0/3] KVM: arm64: Handle CCSIDR associativity mismatches

2022-12-01 Thread Akihiko Odaki

On 2022/12/01 20:06, Marc Zyngier wrote:

> On Thu, 01 Dec 2022 10:49:11 +,
> Akihiko Odaki  wrote:
> 
> Thanks for looking into this.
> 
> > M2 MacBook Air has mismatched CCSIDR associativity bits, which makes the
> > bits a KVM vCPU sees inconsistent when migrating.
> 
> Can you describe the actual discrepancy? Is that an issue between the
> two core types? In which case, nothing says that these two clusters
> should have the same cache topology.

Yes, the processor has a big.LITTLE configuration.

On the processor, the valid CSSELR values are 0 (L1D), 1 (L1I), 3 (L2D).
For each CSSELR value, each cluster has:

- 0x700FE03A, 0x203FE01A, 0x70FFE07B
- 0x701FE03A, 0x203FE02A, 0x73FFE07B

> > It also makes QEMU fail restoring the vCPU registers because QEMU saves
> > and restores all of the registers including CCSIDRs, and if the vCPU
> > migrated among physical CPUs between saving and restoring, it tries to
> > restore CCSIDR values that mismatch with the current physical CPU, which
> > causes EFAULT.
> 
> Well, QEMU will have plenty of other problems, starting with MIDRs,
> which always reflect the physical one. In general, KVM isn't well
> geared for VMs spanning multiple CPU types. It is improving, but there
> is a long way to go.

On M2 MacBook Air, I have seen no other difference in standard ID
registers and CCSIDRs are exceptions. Perhaps Apple designed this way so
that macOS's Hypervisor can freely migrate vCPUs, but I can't assure
that without more analysis. This is still enough to migrate vCPUs
running Linux at least.

> > Trap CCSIDRs if there are CCSIDR value mismatches, and override the
> > associativity bits when handling the trap.
> 
> TBH, I'd rather we stop reporting this stuff altogether.
> 
> There is nothing a correctly written arm64 guest should do with any of
> this (this is only useful for set/way CMOs, which non-secure SW should
> never issue). It would be a lot better to expose a virtual topology
> (one set, one way, one level). It would also save us from the CCSIDRX
> silliness.
> 
> The only complexity would be to still accept different topologies from
> userspace so that we can restore a VM saved before this virtual
> topology.

Another (minor) concern is that trapping the relevant registers may cost
too much. Currently KVM traps CSSELR and CCSIDR accesses with HCR_TID2,
but HCR_TID2 also affects CTR_EL0. Although I'm not sure how frequently
the register is read, Arm introduced FEAT_EVT to trap CSSELR and CCSIDR
but not CTR_EL0, so there may be cases where trapping CTR_EL0 is not
tolerable. Perhaps Arm worried that a userspace application may read
CTR_EL0 frequently.

If you think the concern on VM restoration you mentioned and the
trapping overhead are tolerable, I'll write a new, much smaller patch
accordingly.

Regards,
Akihiko Odaki

> Do you mind having a look at this?
> 
> Thanks,
> 
> M.




Re: [PATCH 0/3] KVM: arm64: Handle CCSIDR associativity mismatches

2022-12-01 Thread Marc Zyngier
On Thu, 01 Dec 2022 10:49:11 +,
Akihiko Odaki  wrote:

Thanks for looking into this.

> M2 MacBook Air has mismatched CCSIDR associativity bits, which makes the
> bits a KVM vCPU sees inconsistent when migrating.

Can you describe the actual discrepancy? Is that an issue between the
two core types? In which case, nothing says that these two clusters
should have the same cache topology.

> It also makes QEMU fail restoring the vCPU registers because QEMU saves
> and restores all of the registers including CCSIDRs, and if the vCPU
> migrated among physical CPUs between saving and restoring, it tries to
> restore CCSIDR values that mismatch with the current physical CPU, which
> causes EFAULT.

Well, QEMU will have plenty of other problems, starting with MIDRs,
which always reflect the physical one. In general, KVM isn't well
geared for VMs spanning multiple CPU types. It is improving, but there
is a long way to go.

> Trap CCSIDRs if there are CCSIDR value mismatches, and override the
> associativity bits when handling the trap.

TBH, I'd rather we stop reporting this stuff altogether.

There is nothing a correctly written arm64 guest should do with any of
this (this is only useful for set/way CMOs, which non-secure SW should
never issue). It would be a lot better to expose a virtual topology
(one set, one way, one level). It would also save us from the CCSIDRX
silliness.

The only complexity would be to still accept different topologies from
userspace so that we can restore a VM saved before this virtual
topology.

Do you mind having a look at this?

Thanks,

M.

-- 
Without deviation from the norm, progress is not possible.