Re: [PATCH 2/2] KVM: arm64: Workaround firmware wrongly advertising GICv2-on-v3 compatibility

2021-01-08 Thread Ard Biesheuvel
On Fri, 8 Jan 2021 at 19:13, Marc Zyngier  wrote:
>
> On 2021-01-08 17:59, Ard Biesheuvel wrote:
> > On Fri, 8 Jan 2021 at 18:12, Marc Zyngier  wrote:
> >>
>> It looks like we have broken firmware out there that wrongly advertises
>> a GICv2 compatibility interface, despite the CPUs not being able to deal
>> with it.
>>
>> To work around this, check that the CPU initialising KVM is actually able
>> to switch to MMIO instead of system registers, and use that as a
>> precondition to enable GICv2 compatibility in KVM.
>>
>> Note that the detection happens on a single CPU. If the firmware is
>> lying *and* the CPUs are asymmetric, all hope is lost anyway.
>>
>> Reported-by: Shameerali Kolothum Thodi
>> Signed-off-by: Marc Zyngier
>> ---
>>  arch/arm64/kvm/hyp/vgic-v3-sr.c | 34 +++--
>>  arch/arm64/kvm/vgic/vgic-v3.c   |  8 ++--
>>  2 files changed, 38 insertions(+), 4 deletions(-)
> >>  arch/arm64/kvm/vgic/vgic-v3.c   |  8 ++--
> >>  2 files changed, 38 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/arch/arm64/kvm/hyp/vgic-v3-sr.c
> >> b/arch/arm64/kvm/hyp/vgic-v3-sr.c
> >> index 005daa0c9dd7..d504499ab917 100644
> >> --- a/arch/arm64/kvm/hyp/vgic-v3-sr.c
> >> +++ b/arch/arm64/kvm/hyp/vgic-v3-sr.c
> >> @@ -408,11 +408,41 @@ void __vgic_v3_init_lrs(void)
> >>  /*
> >>   * Return the GIC CPU configuration:
> >>   * - [31:0]  ICH_VTR_EL2
> >> - * - [63:32] RES0
> >> + * - [62:32] RES0
>> + * - [63]    MMIO (GICv2) capable
> >>   */
> >>  u64 __vgic_v3_get_gic_config(void)
> >>  {
> >> -   return read_gicreg(ICH_VTR_EL2);
> >> +   u64 sre = read_gicreg(ICC_SRE_EL1);
> >> +   unsigned long flags = 0;
> >> +   bool v2_capable;
> >> +
> >> +   /*
> >> +* To check whether we have a MMIO-based (GICv2 compatible)
> >> +* CPU interface, we need to disable the system register
> >> +* view. To do that safely, we have to prevent any interrupt
> >> +* from firing (which would be deadly).
> >> +*
> >> +* Note that this only makes sense on VHE, as interrupts are
> >> +* already masked for nVHE as part of the exception entry to
> >> +* EL2.
> >> +*/
> >> +   if (has_vhe())
> >> +   flags = local_daif_save();
> >> +
> >> +   write_gicreg(0, ICC_SRE_EL1);
> >> +   isb();
> >> +
> >> +   v2_capable = !(read_gicreg(ICC_SRE_EL1) & ICC_SRE_EL1_SRE);
> >> +
> >> +   write_gicreg(sre, ICC_SRE_EL1);
> >> +   isb();
> >> +
> >> +   if (has_vhe())
> >> +   local_daif_restore(flags);
> >> +
>> +   return (read_gicreg(ICH_VTR_EL2) |
>> +   (v2_capable ? (1ULL << 63) : 0));
> >>  }
> >>
> >
> > Is it necessary to perform this check unconditionally? We only care
> > about this if the firmware claims v2 compat support.
>
> Indeed. But this is done exactly once per boot, and I see it as
> a way to extract the CPU configuration more than anything else.
>
> Extracting it *only* when we have some v2 compat info would mean
> sharing that information with EL2 (in the nVHE case), and it felt
> more hassle than it is worth.
>
> Do you foresee any issue with this, other than the whole thing
> being disgusting (which I wilfully admit)?
>

No, I don't think it's a problem per se. Just a bit disappointing that
every system will be burdened with this for as long as the last
v2-compat-capable system is still being supported.


Re: [PATCH 2/2] KVM: arm64: Workaround firmware wrongly advertising GICv2-on-v3 compatibility

2021-01-08 Thread Marc Zyngier

On 2021-01-08 17:59, Ard Biesheuvel wrote:

> On Fri, 8 Jan 2021 at 18:12, Marc Zyngier wrote:
>>
>> It looks like we have broken firmware out there that wrongly advertises
>> a GICv2 compatibility interface, despite the CPUs not being able to deal
>> with it.
>>
>> To work around this, check that the CPU initialising KVM is actually able
>> to switch to MMIO instead of system registers, and use that as a
>> precondition to enable GICv2 compatibility in KVM.
>>
>> Note that the detection happens on a single CPU. If the firmware is
>> lying *and* the CPUs are asymmetric, all hope is lost anyway.
>>
>> Reported-by: Shameerali Kolothum Thodi
>> Signed-off-by: Marc Zyngier
>> ---
>>  arch/arm64/kvm/hyp/vgic-v3-sr.c | 34 +++--
>>  arch/arm64/kvm/vgic/vgic-v3.c   |  8 ++--
>>  2 files changed, 38 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/arm64/kvm/hyp/vgic-v3-sr.c b/arch/arm64/kvm/hyp/vgic-v3-sr.c
>> index 005daa0c9dd7..d504499ab917 100644
>> --- a/arch/arm64/kvm/hyp/vgic-v3-sr.c
>> +++ b/arch/arm64/kvm/hyp/vgic-v3-sr.c
>> @@ -408,11 +408,41 @@ void __vgic_v3_init_lrs(void)
>>  /*
>>   * Return the GIC CPU configuration:
>>   * - [31:0]  ICH_VTR_EL2
>> - * - [63:32] RES0
>> + * - [62:32] RES0
>> + * - [63]    MMIO (GICv2) capable
>>   */
>>  u64 __vgic_v3_get_gic_config(void)
>>  {
>> -   return read_gicreg(ICH_VTR_EL2);
>> +   u64 sre = read_gicreg(ICC_SRE_EL1);
>> +   unsigned long flags = 0;
>> +   bool v2_capable;
>> +
>> +   /*
>> +* To check whether we have a MMIO-based (GICv2 compatible)
>> +* CPU interface, we need to disable the system register
>> +* view. To do that safely, we have to prevent any interrupt
>> +* from firing (which would be deadly).
>> +*
>> +* Note that this only makes sense on VHE, as interrupts are
>> +* already masked for nVHE as part of the exception entry to
>> +* EL2.
>> +*/
>> +   if (has_vhe())
>> +   flags = local_daif_save();
>> +
>> +   write_gicreg(0, ICC_SRE_EL1);
>> +   isb();
>> +
>> +   v2_capable = !(read_gicreg(ICC_SRE_EL1) & ICC_SRE_EL1_SRE);
>> +
>> +   write_gicreg(sre, ICC_SRE_EL1);
>> +   isb();
>> +
>> +   if (has_vhe())
>> +   local_daif_restore(flags);
>> +
>> +   return (read_gicreg(ICH_VTR_EL2) |
>> +   (v2_capable ? (1ULL << 63) : 0));
>>  }
>
> Is it necessary to perform this check unconditionally? We only care
> about this if the firmware claims v2 compat support.


Indeed. But this is done exactly once per boot, and I see it as
a way to extract the CPU configuration more than anything else.

Extracting it *only* when we have some v2 compat info would mean
sharing that information with EL2 (in the nVHE case), and it felt
more hassle than it is worth.

Do you foresee any issue with this, other than the whole thing
being disgusting (which I wilfully admit)?

Thanks,

M.
--
Jazz is not dead. It just smells funny...


Re: [PATCH 2/2] KVM: arm64: Workaround firmware wrongly advertising GICv2-on-v3 compatibility

2021-01-08 Thread Ard Biesheuvel
On Fri, 8 Jan 2021 at 18:12, Marc Zyngier  wrote:
>
> It looks like we have broken firmware out there that wrongly advertises
> a GICv2 compatibility interface, despite the CPUs not being able to deal
> with it.
>
> To work around this, check that the CPU initialising KVM is actually able
> to switch to MMIO instead of system registers, and use that as a
> precondition to enable GICv2 compatibility in KVM.
>
> Note that the detection happens on a single CPU. If the firmware is
> lying *and* the CPUs are asymmetric, all hope is lost anyway.
>
> Reported-by: Shameerali Kolothum Thodi 
> Signed-off-by: Marc Zyngier 
> ---
>  arch/arm64/kvm/hyp/vgic-v3-sr.c | 34 +++--
>  arch/arm64/kvm/vgic/vgic-v3.c   |  8 ++--
>  2 files changed, 38 insertions(+), 4 deletions(-)
>
> diff --git a/arch/arm64/kvm/hyp/vgic-v3-sr.c b/arch/arm64/kvm/hyp/vgic-v3-sr.c
> index 005daa0c9dd7..d504499ab917 100644
> --- a/arch/arm64/kvm/hyp/vgic-v3-sr.c
> +++ b/arch/arm64/kvm/hyp/vgic-v3-sr.c
> @@ -408,11 +408,41 @@ void __vgic_v3_init_lrs(void)
>  /*
>   * Return the GIC CPU configuration:
>   * - [31:0]  ICH_VTR_EL2
> - * - [63:32] RES0
> + * - [62:32] RES0
> + * - [63]    MMIO (GICv2) capable
>   */
>  u64 __vgic_v3_get_gic_config(void)
>  {
> -   return read_gicreg(ICH_VTR_EL2);
> +   u64 sre = read_gicreg(ICC_SRE_EL1);
> +   unsigned long flags = 0;
> +   bool v2_capable;
> +
> +   /*
> +* To check whether we have a MMIO-based (GICv2 compatible)
> +* CPU interface, we need to disable the system register
> +* view. To do that safely, we have to prevent any interrupt
> +* from firing (which would be deadly).
> +*
> +* Note that this only makes sense on VHE, as interrupts are
> +* already masked for nVHE as part of the exception entry to
> +* EL2.
> +*/
> +   if (has_vhe())
> +   flags = local_daif_save();
> +
> +   write_gicreg(0, ICC_SRE_EL1);
> +   isb();
> +
> +   v2_capable = !(read_gicreg(ICC_SRE_EL1) & ICC_SRE_EL1_SRE);
> +
> +   write_gicreg(sre, ICC_SRE_EL1);
> +   isb();
> +
> +   if (has_vhe())
> +   local_daif_restore(flags);
> +
> +   return (read_gicreg(ICH_VTR_EL2) |
> +   (v2_capable ? (1ULL << 63) : 0));
>  }
>

Is it necessary to perform this check unconditionally? We only care
about this if the firmware claims v2 compat support.

>  u64 __vgic_v3_read_vmcr(void)
> diff --git a/arch/arm64/kvm/vgic/vgic-v3.c b/arch/arm64/kvm/vgic/vgic-v3.c
> index 8e7bf3151057..67b27b47312b 100644
> --- a/arch/arm64/kvm/vgic/vgic-v3.c
> +++ b/arch/arm64/kvm/vgic/vgic-v3.c
> @@ -584,8 +584,10 @@ early_param("kvm-arm.vgic_v4_enable", 
> early_gicv4_enable);
>  int vgic_v3_probe(const struct gic_kvm_info *info)
>  {
> u64 ich_vtr_el2 = kvm_call_hyp_ret(__vgic_v3_get_gic_config);
> +   bool has_v2;
> int ret;
>
> +   has_v2 = ich_vtr_el2 >> 63;
> ich_vtr_el2 = (u32)ich_vtr_el2;
>
> /*
> @@ -605,13 +607,15 @@ int vgic_v3_probe(const struct gic_kvm_info *info)
>  gicv4_enable ? "en" : "dis");
> }
>
> +   kvm_vgic_global_state.vcpu_base = 0;
> +
> if (!info->vcpu.start) {
> kvm_info("GICv3: no GICV resource entry\n");
> -   kvm_vgic_global_state.vcpu_base = 0;
> +   } else if (!has_v2) {
> +   pr_warn("CPU interface incapable of MMIO access\n");
> } else if (!PAGE_ALIGNED(info->vcpu.start)) {
> pr_warn("GICV physical address 0x%llx not page aligned\n",
> (unsigned long long)info->vcpu.start);
> -   kvm_vgic_global_state.vcpu_base = 0;
> } else {
> kvm_vgic_global_state.vcpu_base = info->vcpu.start;
> kvm_vgic_global_state.can_emulate_gicv2 = true;
> --
> 2.29.2
>


[PATCH 0/2] KVM: arm64: Work around firmware wrongly advertising GICv2 compatibility

2021-01-08 Thread Marc Zyngier
It appears that there is firmware out there that advertises GICv2
compatibility on GICv3, despite the CPUs not being able to actually do
it. That's a bummer, and at best creates unexpected behaviours for the
users. At worst, it will crash the machine. Awesome!

In order to mitigate this issue, try and validate whether we can
actually flip the CPU into supporting MMIO accesses instead of system
registers. If we can't, ignore the compatibility information and
shout. It's not completely foolproof, but it should cover the existing
broken platforms...

The workaround is much bigger than Shameer's initial proposal, but
that's because I wanted to keep it localised to KVM, and not spread
the horror at every level (after all, only KVM is concerned with v2
compat).

Marc Zyngier (2):
  KVM: arm64: Rename __vgic_v3_get_ich_vtr_el2() to
__vgic_v3_get_gic_config()
  KVM: arm64: Workaround firmware wrongly advertising GICv2-on-v3
compatibility

 arch/arm64/include/asm/kvm_asm.h   |  4 +--
 arch/arm64/kvm/hyp/nvhe/hyp-main.c |  6 ++---
 arch/arm64/kvm/hyp/vgic-v3-sr.c| 39 --
 arch/arm64/kvm/vgic/vgic-v3.c  | 12 ++---
 4 files changed, 51 insertions(+), 10 deletions(-)

-- 
2.29.2



[PATCH 2/2] KVM: arm64: Workaround firmware wrongly advertising GICv2-on-v3 compatibility

2021-01-08 Thread Marc Zyngier
It looks like we have broken firmware out there that wrongly advertises
a GICv2 compatibility interface, despite the CPUs not being able to deal
with it.

To work around this, check that the CPU initialising KVM is actually able
to switch to MMIO instead of system registers, and use that as a
precondition to enable GICv2 compatibility in KVM.

Note that the detection happens on a single CPU. If the firmware is
lying *and* the CPUs are asymmetric, all hope is lost anyway.

Reported-by: Shameerali Kolothum Thodi 
Signed-off-by: Marc Zyngier 
---
 arch/arm64/kvm/hyp/vgic-v3-sr.c | 34 +++--
 arch/arm64/kvm/vgic/vgic-v3.c   |  8 ++--
 2 files changed, 38 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/kvm/hyp/vgic-v3-sr.c b/arch/arm64/kvm/hyp/vgic-v3-sr.c
index 005daa0c9dd7..d504499ab917 100644
--- a/arch/arm64/kvm/hyp/vgic-v3-sr.c
+++ b/arch/arm64/kvm/hyp/vgic-v3-sr.c
@@ -408,11 +408,41 @@ void __vgic_v3_init_lrs(void)
 /*
  * Return the GIC CPU configuration:
  * - [31:0]  ICH_VTR_EL2
- * - [63:32] RES0
+ * - [62:32] RES0
+ * - [63]    MMIO (GICv2) capable
  */
 u64 __vgic_v3_get_gic_config(void)
 {
-   return read_gicreg(ICH_VTR_EL2);
+   u64 sre = read_gicreg(ICC_SRE_EL1);
+   unsigned long flags = 0;
+   bool v2_capable;
+
+   /*
+* To check whether we have a MMIO-based (GICv2 compatible)
+* CPU interface, we need to disable the system register
+* view. To do that safely, we have to prevent any interrupt
+* from firing (which would be deadly).
+*
+* Note that this only makes sense on VHE, as interrupts are
+* already masked for nVHE as part of the exception entry to
+* EL2.
+*/
+   if (has_vhe())
+   flags = local_daif_save();
+
+   write_gicreg(0, ICC_SRE_EL1);
+   isb();
+
+   v2_capable = !(read_gicreg(ICC_SRE_EL1) & ICC_SRE_EL1_SRE);
+
+   write_gicreg(sre, ICC_SRE_EL1);
+   isb();
+
+   if (has_vhe())
+   local_daif_restore(flags);
+
+   return (read_gicreg(ICH_VTR_EL2) |
+   (v2_capable ? (1ULL << 63) : 0));
 }
 
 u64 __vgic_v3_read_vmcr(void)
diff --git a/arch/arm64/kvm/vgic/vgic-v3.c b/arch/arm64/kvm/vgic/vgic-v3.c
index 8e7bf3151057..67b27b47312b 100644
--- a/arch/arm64/kvm/vgic/vgic-v3.c
+++ b/arch/arm64/kvm/vgic/vgic-v3.c
@@ -584,8 +584,10 @@ early_param("kvm-arm.vgic_v4_enable", early_gicv4_enable);
 int vgic_v3_probe(const struct gic_kvm_info *info)
 {
u64 ich_vtr_el2 = kvm_call_hyp_ret(__vgic_v3_get_gic_config);
+   bool has_v2;
int ret;
 
+   has_v2 = ich_vtr_el2 >> 63;
ich_vtr_el2 = (u32)ich_vtr_el2;
 
/*
@@ -605,13 +607,15 @@ int vgic_v3_probe(const struct gic_kvm_info *info)
 gicv4_enable ? "en" : "dis");
}
 
+   kvm_vgic_global_state.vcpu_base = 0;
+
if (!info->vcpu.start) {
kvm_info("GICv3: no GICV resource entry\n");
-   kvm_vgic_global_state.vcpu_base = 0;
+   } else if (!has_v2) {
+   pr_warn("CPU interface incapable of MMIO access\n");
} else if (!PAGE_ALIGNED(info->vcpu.start)) {
pr_warn("GICV physical address 0x%llx not page aligned\n",
(unsigned long long)info->vcpu.start);
-   kvm_vgic_global_state.vcpu_base = 0;
} else {
kvm_vgic_global_state.vcpu_base = info->vcpu.start;
kvm_vgic_global_state.can_emulate_gicv2 = true;
-- 
2.29.2



[PATCH 1/2] KVM: arm64: Rename __vgic_v3_get_ich_vtr_el2() to __vgic_v3_get_gic_config()

2021-01-08 Thread Marc Zyngier
As we are about to report a bit more information to the rest of
the kernel, rename __vgic_v3_get_ich_vtr_el2() to the more
explicit __vgic_v3_get_gic_config().

No functional change.

Signed-off-by: Marc Zyngier 
---
 arch/arm64/include/asm/kvm_asm.h   | 4 ++--
 arch/arm64/kvm/hyp/nvhe/hyp-main.c | 6 +++---
 arch/arm64/kvm/hyp/vgic-v3-sr.c| 7 ++-
 arch/arm64/kvm/vgic/vgic-v3.c  | 4 +++-
 4 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 8a33d83ea843..37b9cd3e458e 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -50,7 +50,7 @@
 #define __KVM_HOST_SMCCC_FUNC___kvm_tlb_flush_local_vmid   5
 #define __KVM_HOST_SMCCC_FUNC___kvm_timer_set_cntvoff  6
#define __KVM_HOST_SMCCC_FUNC___kvm_enable_ssbs 7
-#define __KVM_HOST_SMCCC_FUNC___vgic_v3_get_ich_vtr_el2 8
+#define __KVM_HOST_SMCCC_FUNC___vgic_v3_get_gic_config 8
 #define __KVM_HOST_SMCCC_FUNC___vgic_v3_read_vmcr  9
 #define __KVM_HOST_SMCCC_FUNC___vgic_v3_write_vmcr 10
 #define __KVM_HOST_SMCCC_FUNC___vgic_v3_init_lrs   11
@@ -192,7 +192,7 @@ extern void __kvm_timer_set_cntvoff(u64 cntvoff);
 
 extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu);
 
-extern u64 __vgic_v3_get_ich_vtr_el2(void);
+extern u64 __vgic_v3_get_gic_config(void);
 extern u64 __vgic_v3_read_vmcr(void);
 extern void __vgic_v3_write_vmcr(u32 vmcr);
 extern void __vgic_v3_init_lrs(void);
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c 
b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index bde658d51404..3dc7f0c4fa94 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -67,9 +67,9 @@ static void handle___kvm_enable_ssbs(struct kvm_cpu_context 
*host_ctxt)
write_sysreg_el2(tmp, SYS_SCTLR);
 }
 
-static void handle___vgic_v3_get_ich_vtr_el2(struct kvm_cpu_context *host_ctxt)
+static void handle___vgic_v3_get_gic_config(struct kvm_cpu_context *host_ctxt)
 {
-   cpu_reg(host_ctxt, 1) = __vgic_v3_get_ich_vtr_el2();
+   cpu_reg(host_ctxt, 1) = __vgic_v3_get_gic_config();
 }
 
 static void handle___vgic_v3_read_vmcr(struct kvm_cpu_context *host_ctxt)
@@ -118,7 +118,7 @@ static const hcall_t *host_hcall[] = {
HANDLE_FUNC(__kvm_tlb_flush_local_vmid),
HANDLE_FUNC(__kvm_timer_set_cntvoff),
HANDLE_FUNC(__kvm_enable_ssbs),
-   HANDLE_FUNC(__vgic_v3_get_ich_vtr_el2),
+   HANDLE_FUNC(__vgic_v3_get_gic_config),
HANDLE_FUNC(__vgic_v3_read_vmcr),
HANDLE_FUNC(__vgic_v3_write_vmcr),
HANDLE_FUNC(__vgic_v3_init_lrs),
diff --git a/arch/arm64/kvm/hyp/vgic-v3-sr.c b/arch/arm64/kvm/hyp/vgic-v3-sr.c
index 80406f463c28..005daa0c9dd7 100644
--- a/arch/arm64/kvm/hyp/vgic-v3-sr.c
+++ b/arch/arm64/kvm/hyp/vgic-v3-sr.c
@@ -405,7 +405,12 @@ void __vgic_v3_init_lrs(void)
__gic_v3_set_lr(0, i);
 }
 
-u64 __vgic_v3_get_ich_vtr_el2(void)
+/*
+ * Return the GIC CPU configuration:
+ * - [31:0]  ICH_VTR_EL2
+ * - [63:32] RES0
+ */
+u64 __vgic_v3_get_gic_config(void)
 {
return read_gicreg(ICH_VTR_EL2);
 }
diff --git a/arch/arm64/kvm/vgic/vgic-v3.c b/arch/arm64/kvm/vgic/vgic-v3.c
index 9cdf39a94a63..8e7bf3151057 100644
--- a/arch/arm64/kvm/vgic/vgic-v3.c
+++ b/arch/arm64/kvm/vgic/vgic-v3.c
@@ -583,9 +583,11 @@ early_param("kvm-arm.vgic_v4_enable", early_gicv4_enable);
  */
 int vgic_v3_probe(const struct gic_kvm_info *info)
 {
-   u32 ich_vtr_el2 = kvm_call_hyp_ret(__vgic_v3_get_ich_vtr_el2);
+   u64 ich_vtr_el2 = kvm_call_hyp_ret(__vgic_v3_get_gic_config);
int ret;
 
+   ich_vtr_el2 = (u32)ich_vtr_el2;
+
/*
 * The ListRegs field is 5 bits, but there is an architectural
 * maximum of 16 list registers. Just ignore bit 4...
-- 
2.29.2



RE: [PATCH v13 00/15] SMMUv3 Nested Stage Setup (IOMMU part)

2021-01-08 Thread Shameerali Kolothum Thodi
Hi Eric,

> -Original Message-
> From: Eric Auger [mailto:eric.au...@redhat.com]
> Sent: 18 November 2020 11:22
> To: eric.auger@gmail.com; eric.au...@redhat.com;
> io...@lists.linux-foundation.org; linux-ker...@vger.kernel.org;
> k...@vger.kernel.org; kvmarm@lists.cs.columbia.edu; w...@kernel.org;
> j...@8bytes.org; m...@kernel.org; robin.mur...@arm.com;
> alex.william...@redhat.com
> Cc: jean-phili...@linaro.org; zhangfei@linaro.org;
> zhangfei@gmail.com; vivek.gau...@arm.com; Shameerali Kolothum
> Thodi ;
> jacob.jun@linux.intel.com; yi.l@intel.com; t...@semihalf.com;
> nicoleots...@gmail.com; yuzenghui 
> Subject: [PATCH v13 00/15] SMMUv3 Nested Stage Setup (IOMMU part)
> 
> This series brings the IOMMU part of HW nested paging support
> in the SMMUv3. The VFIO part is submitted separately.
> 
> The IOMMU API is extended to support 2 new API functionalities:
> 1) pass the guest stage 1 configuration
> 2) pass stage 1 MSI bindings
> 
> Then those capabilities get implemented in the SMMUv3 driver.
> 
> The virtualizer passes information through the VFIO user API
> which cascades them to the iommu subsystem. This allows the guest
> to own stage 1 tables and context descriptors (so-called PASID
> table) while the host owns stage 2 tables and main configuration
> structures (STE).

I am seeing an issue with guest testpmd runs with this series.
I have two different setups, and testpmd works fine with the
first one but not with the second.

1) The guest doesn't have a kernel driver built in for the pass-through device.

root@ubuntu:/# lspci -v
...
00:02.0 Ethernet controller: Huawei Technologies Co., Ltd. Device a22e (rev 21)
Subsystem: Huawei Technologies Co., Ltd. Device 
Flags: fast devsel
Memory at 800010 (64-bit, prefetchable) [disabled] [size=64K]
Memory at 80 (64-bit, prefetchable) [disabled] [size=1M]
Capabilities: [40] Express Root Complex Integrated Endpoint, MSI 00
Capabilities: [a0] MSI-X: Enable- Count=67 Masked-
Capabilities: [b0] Power Management version 3
Capabilities: [100] Access Control Services
Capabilities: [300] Transaction Processing Hints

root@ubuntu:/# echo vfio-pci > /sys/bus/pci/devices/0000:00:02.0/driver_override
root@ubuntu:/# echo 0000:00:02.0 > /sys/bus/pci/drivers_probe

root@ubuntu:/mnt/dpdk/build/app# ./testpmd -w 0000:00:02.0 --file-prefix socket0 -l 0-1 -n 2 -- -i
EAL: Detected 8 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: Multi-process socket /var/run/dpdk/socket0/mp_socket
EAL: Selected IOVA mode 'VA'
EAL: No available hugepages reported in hugepages-32768kB
EAL: No available hugepages reported in hugepages-64kB
EAL: No available hugepages reported in hugepages-1048576kB
EAL: Probing VFIO support...
EAL: VFIO support initialized
EAL:   Invalid NUMA socket, default to 0
EAL:   using IOMMU type 1 (Type 1)
EAL: Probe PCI driver: net_hns3_vf (19e5:a22e) device: 0000:00:02.0 (socket 0)
EAL: No legacy callbacks, legacy socket not created
Interactive-mode selected
testpmd: create a new mbuf pool : n=155456, size=2176, socket=0
testpmd: preferred mempool ops selected: ring_mp_mc

Warning! port-topology=paired and odd forward ports number, the last port will pair with itself.

Configuring Port 0 (socket 0)
Port 0: 8E:A6:8C:43:43:45
Checking link statuses...
Done
testpmd>

2) The guest has a kernel driver built in for the pass-through device.

root@ubuntu:/# lspci -v
...
00:02.0 Ethernet controller: Huawei Technologies Co., Ltd. Device a22e (rev 21)
Subsystem: Huawei Technologies Co., Ltd. Device 
Flags: bus master, fast devsel, latency 0
Memory at 800010 (64-bit, prefetchable) [size=64K]
Memory at 80 (64-bit, prefetchable) [size=1M]
Capabilities: [40] Express Root Complex Integrated Endpoint, MSI 00
Capabilities: [a0] MSI-X: Enable+ Count=67 Masked-
Capabilities: [b0] Power Management version 3
Capabilities: [100] Access Control Services
Capabilities: [300] Transaction Processing Hints
Kernel driver in use: hns3

root@ubuntu:/# echo vfio-pci > /sys/bus/pci/devices/0000:00:02.0/driver_override
root@ubuntu:/# echo 0000:00:02.0 > /sys/bus/pci/drivers/hns3/unbind
root@ubuntu:/# echo 0000:00:02.0 > /sys/bus/pci/drivers_probe

root@ubuntu:/mnt/dpdk/build/app# ./testpmd -w 0000:00:02.0 --file-prefix socket0 -l 0-1 -n 2 -- -i
EAL: Detected 8 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: Multi-process socket /var/run/dpdk/socket0/mp_socket
EAL: Selected IOVA mode 'VA'
EAL: No available hugepages reported in hugepages-32768kB
EAL: No available hugepages reported in hugepages-64kB
EAL: No available hugepages reported in hugepages-1048576kB
EAL: Probing VFIO support...
EAL: VFIO support initialized
EAL:   Invalid NUMA socket, default to 0
EAL:   using IOMMU type 1 (Type 1)
EAL: Probe PCI driver: net_hns3_vf (19e5:a22e) device: 0000:00:02.0 (socket 0)
0000:00:02.0 hns3_get_mbx_resp(): VF could not get mbx(11,0) head(1) tail(0) lost(1) from PF in_irq:0
hns3vf_get_queue_info(): Failed to get tqp info from PF: -62
hns3vf_init_vf(): Failed to fetch 

Re: [PATCH] KVM: arm64: Compute TPIDR_EL2 ignoring MTE tag

2021-01-08 Thread Steven Price

On 08/01/2021 16:51, Marc Zyngier wrote:

Hi Steven,

On 2021-01-08 16:12, Steven Price wrote:

KASAN in HW_TAGS mode will store MTE tags in the top byte of the
pointer. When computing the offset for TPIDR_EL2 we don't want anything
in the top byte, so remove the tag to ensure the computation is correct
no matter what the tag is.

Fixes: 94ab5b61ee16 ("kasan, arm64: enable CONFIG_KASAN_HW_TAGS")
Signed-off-by: Steven Price 
---
Without this fix I can't boot a config with KASAN_HW_TAGS and KVM on an
MTE enabled host. I'm unsure if this should really be in
this_cpu_ptr_nvhe_sym().


this_cpu_ptr_nvhe_sym() should return something that is valid for
the EL1 kernel, so I guess untagging in the helper may not be
that useful.


Makes sense and was my suspicion.


However, I'm more concerned by anything that requires us to follow
pointers set up by EL1 at EL2. It looks to me that the only reason
the whole thing works is because kern_hyp_va() *accidentally* drops
tags before applying the EL1/EL2 offset...


In the case I'm fixing this is intended to be an offset calculation - 
it's just messed up by the presence of an MTE tag in one of the pointers.


I agree I was somewhat surprised when everything 'just worked' with this 
one change - and I think you're right it's because kern_hyp_va() 'just 
happens' to lose the tags. Of course there may be other bugs lurking - 
running MTE+KASAN on the model is slow so I didn't do much beyond boot it.


One of the 'fun' things about MTE is that you can no longer do pointer 
subtraction to calculate the offset unless the pointers are actually 
from the same allocation (and therefore have the same tag). I'm sure the 
C language experts would point out that's "always been the case" but it 
will probably break things elsewhere too.
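
To make that concrete, the sort of computation at stake looks like this
(pointer names are illustrative, not the exact patch; kasan_reset_tag()
is the generic helper for stripping such tags):

	/*
	 * Both aliases refer to the same per-CPU object, but the kernel
	 * alias may carry a KASAN/MTE tag in bits 63:56. Subtracting
	 * the raw pointers leaves the offset wrong by (tag << 56), so
	 * strip the tag first.
	 */
	unsigned long kern = (unsigned long)kasan_reset_tag(kern_alias);
	unsigned long hyp  = (unsigned long)hyp_alias;

	tpidr_el2 = kern - hyp;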


Steve


Or am I getting it wrong?

Thanks,

     M.




Re: [PATCH] KVM: arm64: Compute TPIDR_EL2 ignoring MTE tag

2021-01-08 Thread Marc Zyngier

Hi Steven,

On 2021-01-08 16:12, Steven Price wrote:

KASAN in HW_TAGS mode will store MTE tags in the top byte of the
pointer. When computing the offset for TPIDR_EL2 we don't want anything
in the top byte, so remove the tag to ensure the computation is correct
no matter what the tag is.

Fixes: 94ab5b61ee16 ("kasan, arm64: enable CONFIG_KASAN_HW_TAGS")
Signed-off-by: Steven Price 
---
Without this fix I can't boot a config with KASAN_HW_TAGS and KVM on an
MTE enabled host. I'm unsure if this should really be in
this_cpu_ptr_nvhe_sym().


this_cpu_ptr_nvhe_sym() should return something that is valid for
the EL1 kernel, so I guess untagging in the helper may not be
that useful.

However, I'm more concerned by anything that requires us to follow
pointers set up by EL1 at EL2. It looks to me that the only reason
the whole thing works is because kern_hyp_va() *accidentally* drops
tags before applying the EL1/EL2 offset...

Or am I getting it wrong?

Thanks,

M.
--
Jazz is not dead. It just smells funny...


[RFC PATCH v2 25/26] KVM: arm64: Reserve memory for host stage 2

2021-01-08 Thread Quentin Perret
Extend the memory pool allocated for the hypervisor to include enough
pages to map all of memory at page granularity for the host stage 2.
While at it, also reserve some memory for device mappings.

Signed-off-by: Quentin Perret 
---
 arch/arm64/kvm/hyp/include/nvhe/mm.h | 36 
 arch/arm64/kvm/hyp/nvhe/setup.c  | 12 ++
 arch/arm64/kvm/hyp/reserved_mem.c|  2 ++
 3 files changed, 46 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/kvm/hyp/include/nvhe/mm.h 
b/arch/arm64/kvm/hyp/include/nvhe/mm.h
index f0cc09b127a5..cdf2e3447b2a 100644
--- a/arch/arm64/kvm/hyp/include/nvhe/mm.h
+++ b/arch/arm64/kvm/hyp/include/nvhe/mm.h
@@ -52,15 +52,12 @@ static inline unsigned long 
__hyp_pgtable_max_pages(unsigned long nr_pages)
return total;
 }
 
-static inline unsigned long hyp_s1_pgtable_size(void)
+static inline unsigned long __hyp_pgtable_total_size(void)
 {
struct hyp_memblock_region *reg;
unsigned long nr_pages, res = 0;
int i;
 
-   if (kvm_nvhe_sym(hyp_memblock_nr) <= 0)
-   return 0;
-
for (i = 0; i < kvm_nvhe_sym(hyp_memblock_nr); i++) {
reg = &kvm_nvhe_sym(hyp_memory)[i];
nr_pages = (reg->end - reg->start) >> PAGE_SHIFT;
@@ -68,6 +65,18 @@ static inline unsigned long hyp_s1_pgtable_size(void)
res += nr_pages << PAGE_SHIFT;
}
 
+   return res;
+}
+
+static inline unsigned long hyp_s1_pgtable_size(void)
+{
+   unsigned long res, nr_pages;
+
+   if (kvm_nvhe_sym(hyp_memblock_nr) <= 0)
+   return 0;
+
+   res = __hyp_pgtable_total_size();
+
/* Allow 1 GiB for private mappings */
nr_pages = (1 << 30) >> PAGE_SHIFT;
nr_pages = __hyp_pgtable_max_pages(nr_pages);
@@ -76,4 +85,23 @@ static inline unsigned long hyp_s1_pgtable_size(void)
return res;
 }
 
+static inline unsigned long host_s2_mem_pgtable_size(void)
+{
+   unsigned long max_pgd_sz = 16 << PAGE_SHIFT;
+
+   if (kvm_nvhe_sym(hyp_memblock_nr) <= 0)
+   return 0;
+
+   return __hyp_pgtable_total_size() + max_pgd_sz;
+}
+
+static inline unsigned long host_s2_dev_pgtable_size(void)
+{
+   if (kvm_nvhe_sym(hyp_memblock_nr) <= 0)
+   return 0;
+
+   /* Allow 1 GiB for private mappings */
+   return __hyp_pgtable_max_pages((1 << 30) >> PAGE_SHIFT) << PAGE_SHIFT;
+}
+
 #endif /* __KVM_HYP_MM_H */
diff --git a/arch/arm64/kvm/hyp/nvhe/setup.c b/arch/arm64/kvm/hyp/nvhe/setup.c
index 6d1faede86ae..79b697df01e2 100644
--- a/arch/arm64/kvm/hyp/nvhe/setup.c
+++ b/arch/arm64/kvm/hyp/nvhe/setup.c
@@ -24,6 +24,8 @@ unsigned long hyp_nr_cpus;
 static void *stacks_base;
 static void *vmemmap_base;
 static void *hyp_pgt_base;
+static void *host_s2_mem_pgt_base;
+static void *host_s2_dev_pgt_base;
 
 static int divide_memory_pool(void *virt, unsigned long size)
 {
@@ -46,6 +48,16 @@ static int divide_memory_pool(void *virt, unsigned long size)
if (!hyp_pgt_base)
return -ENOMEM;
 
+   nr_pages = host_s2_mem_pgtable_size() >> PAGE_SHIFT;
+   host_s2_mem_pgt_base = hyp_early_alloc_contig(nr_pages);
+   if (!host_s2_mem_pgt_base)
+   return -ENOMEM;
+
+   nr_pages = host_s2_dev_pgtable_size() >> PAGE_SHIFT;
+   host_s2_dev_pgt_base = hyp_early_alloc_contig(nr_pages);
+   if (!host_s2_dev_pgt_base)
+   return -ENOMEM;
+
return 0;
 }
 
diff --git a/arch/arm64/kvm/hyp/reserved_mem.c 
b/arch/arm64/kvm/hyp/reserved_mem.c
index 32f648992835..ee97e55e3c59 100644
--- a/arch/arm64/kvm/hyp/reserved_mem.c
+++ b/arch/arm64/kvm/hyp/reserved_mem.c
@@ -74,6 +74,8 @@ void __init kvm_hyp_reserve(void)
 */
hyp_mem_size += NR_CPUS << PAGE_SHIFT;
hyp_mem_size += hyp_s1_pgtable_size();
+   hyp_mem_size += host_s2_mem_pgtable_size();
+   hyp_mem_size += host_s2_dev_pgtable_size();
 
/*
 * The hyp_vmemmap needs to be backed by pages, but these pages
-- 
2.30.0.284.gd98b1dd5eaa7-goog



[RFC PATCH v2 26/26] KVM: arm64: Wrap the host with a stage 2

2021-01-08 Thread Quentin Perret
When KVM runs in protected nVHE mode, make use of a stage 2 page-table
to give the hypervisor some control over the host memory accesses. At
the moment all memory aborts from the host are simply resolved by
lazily idmapping the faulting address RWX at stage 2. Later patches
will make use of that infrastructure to implement access control
restrictions, e.g. to protect guest memory from the host.
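
To illustrate, the lazy idmap boils down to something like this (a
simplified sketch, not the exact code in mem_protect.c: the pool name
is made up, and the real handler also has to decode the fault and deal
with MMIO ranges):

	static int host_stage2_idmap(u64 addr)
	{
		enum kvm_pgtable_prot prot = KVM_PGTABLE_PROT_R |
					     KVM_PGTABLE_PROT_W |
					     KVM_PGTABLE_PROT_X;
		int ret;

		/* Map the faulting page back onto itself (idmap), RWX. */
		addr = ALIGN_DOWN(addr, PAGE_SIZE);
		hyp_spin_lock(&host_kvm.lock);
		ret = kvm_pgtable_stage2_map(&host_kvm.pgt, addr, PAGE_SIZE,
					     addr, prot, &host_s2_mem_pool);
		hyp_spin_unlock(&host_kvm.lock);

		return ret;
	}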

Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/kvm_cpufeature.h   |   2 +
 arch/arm64/kernel/image-vars.h|   3 +
 arch/arm64/kvm/hyp/include/nvhe/mem_protect.h |  33 +++
 arch/arm64/kvm/hyp/nvhe/Makefile  |   2 +-
 arch/arm64/kvm/hyp/nvhe/hyp-init.S|   1 +
 arch/arm64/kvm/hyp/nvhe/hyp-main.c|   6 +
 arch/arm64/kvm/hyp/nvhe/mem_protect.c | 191 ++
 arch/arm64/kvm/hyp/nvhe/setup.c   |   6 +
 arch/arm64/kvm/hyp/nvhe/switch.c  |   7 +-
 arch/arm64/kvm/hyp/nvhe/tlb.c |   4 +-
 10 files changed, 248 insertions(+), 7 deletions(-)
 create mode 100644 arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
 create mode 100644 arch/arm64/kvm/hyp/nvhe/mem_protect.c

diff --git a/arch/arm64/include/asm/kvm_cpufeature.h 
b/arch/arm64/include/asm/kvm_cpufeature.h
index d34f85cba358..74043a149322 100644
--- a/arch/arm64/include/asm/kvm_cpufeature.h
+++ b/arch/arm64/include/asm/kvm_cpufeature.h
@@ -15,3 +15,5 @@
 #endif
 
 KVM_HYP_CPU_FTR_REG(SYS_CTR_EL0, arm64_ftr_reg_ctrel0)
+KVM_HYP_CPU_FTR_REG(SYS_ID_AA64MMFR0_EL1, arm64_ftr_reg_id_aa64mmfr0_el1)
+KVM_HYP_CPU_FTR_REG(SYS_ID_AA64MMFR1_EL1, arm64_ftr_reg_id_aa64mmfr1_el1)
diff --git a/arch/arm64/kernel/image-vars.h b/arch/arm64/kernel/image-vars.h
index 366d837f0d39..e4e4f30ac251 100644
--- a/arch/arm64/kernel/image-vars.h
+++ b/arch/arm64/kernel/image-vars.h
@@ -132,6 +132,9 @@ KVM_NVHE_ALIAS(__hyp_data_ro_after_init_end);
 KVM_NVHE_ALIAS(__hyp_bss_start);
 KVM_NVHE_ALIAS(__hyp_bss_end);
 
+/* pKVM static key */
+KVM_NVHE_ALIAS(kvm_protected_mode_initialized);
+
 #endif /* CONFIG_KVM */
 
 #endif /* __ARM64_KERNEL_IMAGE_VARS_H */
diff --git a/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h 
b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
new file mode 100644
index ..a22ef118a610
--- /dev/null
+++ b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
@@ -0,0 +1,33 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2020 Google LLC
+ * Author: Quentin Perret 
+ */
+
+#ifndef __KVM_NVHE_MEM_PROTECT__
+#define __KVM_NVHE_MEM_PROTECT__
+#include <linux/kvm_host.h>
+#include <asm/kvm_hyp.h>
+#include <asm/kvm_pgtable.h>
+#include <asm/virt.h>
+#include <nvhe/spinlock.h>
+
+struct host_kvm {
+   struct kvm_arch arch;
+   struct kvm_pgtable pgt;
+   struct kvm_pgtable_mm_ops mm_ops;
+   hyp_spinlock_t lock;
+};
+extern struct host_kvm host_kvm;
+
+int kvm_host_prepare_stage2(void *mem_pgt_pool, void *dev_pgt_pool);
+void handle_host_mem_abort(struct kvm_cpu_context *host_ctxt);
+
+static __always_inline void __load_host_stage2(void)
+{
+   if (static_branch_likely(&kvm_protected_mode_initialized))
+   __load_stage2(&host_kvm.arch.mmu, host_kvm.arch.vtcr);
+   else
+   write_sysreg(0, vttbr_el2);
+}
+#endif /* __KVM_NVHE_MEM_PROTECT__ */
diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
index d7381a503182..c3e2f98555c4 100644
--- a/arch/arm64/kvm/hyp/nvhe/Makefile
+++ b/arch/arm64/kvm/hyp/nvhe/Makefile
@@ -11,7 +11,7 @@ lib-objs := $(addprefix ../../../lib/, $(lib-objs))
 
 obj-y := timer-sr.o sysreg-sr.o debug-sr.o switch.o tlb.o hyp-init.o host.o \
 hyp-main.o hyp-smp.o psci-relay.o early_alloc.o stub.o page_alloc.o \
-cache.o cpufeature.o setup.o mm.o
+cache.o cpufeature.o setup.o mm.o mem_protect.o
 obj-y += ../vgic-v3-sr.o ../aarch32.o ../vgic-v2-cpuif-proxy.o ../entry.o \
 ../fpsimd.o ../hyp-entry.o ../exception.o ../pgtable.o
 obj-y += $(lib-objs)
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-init.S 
b/arch/arm64/kvm/hyp/nvhe/hyp-init.S
index b1341bb4b453..32591db76c75 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-init.S
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-init.S
@@ -129,6 +129,7 @@ alternative_else_nop_endif
 
/* Invalidate the stale TLBs from Bootloader */
	tlbi	alle2
+	tlbi	vmalls12e1
dsb sy
 
/*
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c 
b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index 3075f117651c..93699600bc22 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -13,6 +13,7 @@
 #include <asm/kvm_host.h>
 #include <asm/kvm_hyp.h>
 
+#include <nvhe/mem_protect.h>
 #include <nvhe/mm.h>
 #include <nvhe/trap_handler.h>
 
@@ -222,6 +223,11 @@ void handle_trap(struct kvm_cpu_context *host_ctxt)
case ESR_ELx_EC_SMC64:
handle_host_smc(host_ctxt);
break;
+   case ESR_ELx_EC_IABT_LOW:
+   fallthrough;
+   case ESR_ELx_EC_DABT_LOW:
+   handle_host_mem_abort(host_ctxt);
+   break;
default:
hyp_panic();
}
diff --git 

[RFC PATCH v2 22/26] KVM: arm64: Refactor __load_guest_stage2()

2021-01-08 Thread Quentin Perret
Refactor __load_guest_stage2() to introduce __load_stage2() which will
be re-used when loading the host stage 2.

Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/kvm_mmu.h | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 83b4c5cf4768..8d37d6d1ed29 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -345,9 +345,9 @@ static __always_inline u64 kvm_get_vttbr(struct kvm_s2_mmu 
*mmu)
  * Must be called from hyp code running at EL2 with an updated VTTBR
  * and interrupts disabled.
  */
-static __always_inline void __load_guest_stage2(struct kvm_s2_mmu *mmu)
+static __always_inline void __load_stage2(struct kvm_s2_mmu *mmu, unsigned long vtcr)
 {
-   write_sysreg(kern_hyp_va(mmu->arch)->vtcr, vtcr_el2);
+   write_sysreg(vtcr, vtcr_el2);
write_sysreg(kvm_get_vttbr(mmu), vttbr_el2);
 
/*
@@ -358,6 +358,11 @@ static __always_inline void __load_guest_stage2(struct 
kvm_s2_mmu *mmu)
asm(ALTERNATIVE("nop", "isb", ARM64_WORKAROUND_SPECULATIVE_AT));
 }
 
+static __always_inline void __load_guest_stage2(struct kvm_s2_mmu *mmu)
+{
+   __load_stage2(mmu, kern_hyp_va(mmu->arch)->vtcr);
+}
+
 static inline struct kvm *kvm_s2_mmu_to_kvm(struct kvm_s2_mmu *mmu)
 {
return container_of(mmu->arch, struct kvm, arch);
-- 
2.30.0.284.gd98b1dd5eaa7-goog



[RFC PATCH v2 23/26] KVM: arm64: Refactor __populate_fault_info()

2021-01-08 Thread Quentin Perret
Refactor __populate_fault_info() to introduce __get_fault_info() which
will be used once the host is wrapped in a stage 2.

Signed-off-by: Quentin Perret 
---
 arch/arm64/kvm/hyp/include/hyp/switch.h | 36 +++--
 1 file changed, 22 insertions(+), 14 deletions(-)

diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h 
b/arch/arm64/kvm/hyp/include/hyp/switch.h
index 84473574c2e7..e9005255d639 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -157,19 +157,9 @@ static inline bool __translate_far_to_hpfar(u64 far, u64 
*hpfar)
return true;
 }
 
-static inline bool __populate_fault_info(struct kvm_vcpu *vcpu)
+static inline bool __get_fault_info(u64 esr, u64 *far, u64 *hpfar)
 {
-   u8 ec;
-   u64 esr;
-   u64 hpfar, far;
-
-   esr = vcpu->arch.fault.esr_el2;
-   ec = ESR_ELx_EC(esr);
-
-   if (ec != ESR_ELx_EC_DABT_LOW && ec != ESR_ELx_EC_IABT_LOW)
-   return true;
-
-   far = read_sysreg_el2(SYS_FAR);
+   *far = read_sysreg_el2(SYS_FAR);
 
/*
 * The HPFAR can be invalid if the stage 2 fault did not
@@ -185,12 +175,30 @@ static inline bool __populate_fault_info(struct kvm_vcpu 
*vcpu)
if (!(esr & ESR_ELx_S1PTW) &&
(cpus_have_final_cap(ARM64_WORKAROUND_834220) ||
 (esr & ESR_ELx_FSC_TYPE) == FSC_PERM)) {
-   if (!__translate_far_to_hpfar(far, &hpfar))
+   if (!__translate_far_to_hpfar(*far, hpfar))
return false;
} else {
-   hpfar = read_sysreg(hpfar_el2);
+   *hpfar = read_sysreg(hpfar_el2);
}
 
+   return true;
+}
+
+static inline bool __populate_fault_info(struct kvm_vcpu *vcpu)
+{
+   u8 ec;
+   u64 esr;
+   u64 hpfar, far;
+
+   esr = vcpu->arch.fault.esr_el2;
+   ec = ESR_ELx_EC(esr);
+
+   if (ec != ESR_ELx_EC_DABT_LOW && ec != ESR_ELx_EC_IABT_LOW)
+   return true;
+
+   if (!__get_fault_info(esr, &far, &hpfar))
+   return false;
+
vcpu->arch.fault.far_el2 = far;
vcpu->arch.fault.hpfar_el2 = hpfar;
return true;
-- 
2.30.0.284.gd98b1dd5eaa7-goog



[RFC PATCH v2 21/26] KVM: arm64: Refactor kvm_arm_setup_stage2()

2021-01-08 Thread Quentin Perret
In order to re-use some of the stage 2 setup at EL2, factor parts of
kvm_arm_setup_stage2() out into static inline functions.

No functional change intended.

Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/kvm_mmu.h | 48 
 arch/arm64/kvm/reset.c   | 42 +++-
 2 files changed, 52 insertions(+), 38 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 662f0415344e..83b4c5cf4768 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -280,6 +280,54 @@ static inline int kvm_write_guest_lock(struct kvm *kvm, 
gpa_t gpa,
return ret;
 }
 
+static inline u64 kvm_get_parange(u64 mmfr0)
+{
+   u64 parange = cpuid_feature_extract_unsigned_field(mmfr0,
+   ID_AA64MMFR0_PARANGE_SHIFT);
+   if (parange > ID_AA64MMFR0_PARANGE_MAX)
+   parange = ID_AA64MMFR0_PARANGE_MAX;
+
+   return parange;
+}
+
+/*
+ * The VTCR value is common across all the physical CPUs on the system.
+ * We use system wide sanitised values to fill in different fields,
+ * except for Hardware Management of Access Flags. HA Flag is set
+ * unconditionally on all CPUs, as it is safe to run with or without
+ * the feature and the bit is RES0 on CPUs that don't support it.
+ */
+static inline u64 kvm_get_vtcr(u64 mmfr0, u64 mmfr1, u32 phys_shift)
+{
+   u64 vtcr = VTCR_EL2_FLAGS;
+   u8 lvls;
+
+   vtcr |= kvm_get_parange(mmfr0) << VTCR_EL2_PS_SHIFT;
+   vtcr |= VTCR_EL2_T0SZ(phys_shift);
+   /*
+* Use a minimum 2 level page table to prevent splitting
+* host PMD huge pages at stage2.
+*/
+   lvls = stage2_pgtable_levels(phys_shift);
+   if (lvls < 2)
+   lvls = 2;
+   vtcr |= VTCR_EL2_LVLS_TO_SL0(lvls);
+
+   /*
+* Enable the Hardware Access Flag management, unconditionally
+* on all CPUs. The features is RES0 on CPUs without the support
+* and must be ignored by the CPUs.
+*/
+   vtcr |= VTCR_EL2_HA;
+
+   /* Set the vmid bits */
+   vtcr |= (get_vmid_bits(mmfr1) == 16) ?
+   VTCR_EL2_VS_16BIT :
+   VTCR_EL2_VS_8BIT;
+
+   return vtcr;
+}
+
 #define kvm_phys_to_vttbr(addr)phys_to_ttbr(addr)
 
 static __always_inline u64 kvm_get_vttbr(struct kvm_s2_mmu *mmu)
diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
index 47f3f035f3ea..6aae118c960a 100644
--- a/arch/arm64/kvm/reset.c
+++ b/arch/arm64/kvm/reset.c
@@ -332,19 +332,10 @@ int kvm_set_ipa_limit(void)
return 0;
 }
 
-/*
- * Configure the VTCR_EL2 for this VM. The VTCR value is common
- * across all the physical CPUs on the system. We use system wide
- * sanitised values to fill in different fields, except for Hardware
- * Management of Access Flags. HA Flag is set unconditionally on
- * all CPUs, as it is safe to run with or without the feature and
- * the bit is RES0 on CPUs that don't support it.
- */
 int kvm_arm_setup_stage2(struct kvm *kvm, unsigned long type)
 {
-   u64 vtcr = VTCR_EL2_FLAGS, mmfr0;
-   u32 parange, phys_shift;
-   u8 lvls;
+   u64 mmfr0, mmfr1;
+   u32 phys_shift;
 
if (type & ~KVM_VM_TYPE_ARM_IPA_SIZE_MASK)
return -EINVAL;
@@ -359,33 +350,8 @@ int kvm_arm_setup_stage2(struct kvm *kvm, unsigned long 
type)
}
 
mmfr0 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);
-   parange = cpuid_feature_extract_unsigned_field(mmfr0,
-   ID_AA64MMFR0_PARANGE_SHIFT);
-   if (parange > ID_AA64MMFR0_PARANGE_MAX)
-   parange = ID_AA64MMFR0_PARANGE_MAX;
-   vtcr |= parange << VTCR_EL2_PS_SHIFT;
-
-   vtcr |= VTCR_EL2_T0SZ(phys_shift);
-   /*
-* Use a minimum 2 level page table to prevent splitting
-* host PMD huge pages at stage2.
-*/
-   lvls = stage2_pgtable_levels(phys_shift);
-   if (lvls < 2)
-   lvls = 2;
-   vtcr |= VTCR_EL2_LVLS_TO_SL0(lvls);
-
-   /*
-* Enable the Hardware Access Flag management, unconditionally
-* on all CPUs. The features is RES0 on CPUs without the support
-* and must be ignored by the CPUs.
-*/
-   vtcr |= VTCR_EL2_HA;
+   mmfr1 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR1_EL1);
+   kvm->arch.vtcr = kvm_get_vtcr(mmfr0, mmfr1, phys_shift);
 
-   /* Set the vmid bits */
-   vtcr |= (kvm_get_vmid_bits() == 16) ?
-   VTCR_EL2_VS_16BIT :
-   VTCR_EL2_VS_8BIT;
-   kvm->arch.vtcr = vtcr;
return 0;
 }
-- 
2.30.0.284.gd98b1dd5eaa7-goog



[RFC PATCH v2 24/26] KVM: arm64: Make memcache anonymous in pgtable allocator

2021-01-08 Thread Quentin Perret
The current stage 2 page-table allocator uses a memcache to get
pre-allocated pages when it needs any. To allow re-using this code at
EL2, which uses a concept of memory pools, make the memcache argument to
kvm_pgtable_stage2_map() anonymous, and let the mm_ops zalloc_page()
callbacks use it the way they need to.
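
On the EL1 side, for instance, the callback can simply treat the opaque
pointer as the memcache it always was. A sketch of the idea (the
callback name here is illustrative):

	static void *stage2_memcache_zalloc_page(void *arg)
	{
		/*
		 * 'arg' is the anonymous mc cookie, which at EL1
		 * really is a struct kvm_mmu_memory_cache.
		 */
		struct kvm_mmu_memory_cache *mc = arg;

		return kvm_mmu_memory_cache_alloc(mc);
	}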

Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/kvm_pgtable.h | 6 +++---
 arch/arm64/kvm/hyp/pgtable.c | 4 ++--
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_pgtable.h 
b/arch/arm64/include/asm/kvm_pgtable.h
index 8e8f1d2c5e0e..d846bc3d3b77 100644
--- a/arch/arm64/include/asm/kvm_pgtable.h
+++ b/arch/arm64/include/asm/kvm_pgtable.h
@@ -176,8 +176,8 @@ void kvm_pgtable_stage2_destroy(struct kvm_pgtable *pgt);
  * @size:  Size of the mapping.
  * @phys:  Physical address of the memory to map.
  * @prot:  Permissions and attributes for the mapping.
- * @mc:	Cache of pre-allocated GFP_PGTABLE_USER memory from which to
- *		allocate page-table pages.
+ * @mc:	Cache of pre-allocated memory from which to allocate
+ *		page-table pages.
  *
  * The offset of @addr within a page is ignored, @size is rounded-up to
  * the next page boundary and @phys is rounded-down to the previous page
@@ -194,7 +194,7 @@ void kvm_pgtable_stage2_destroy(struct kvm_pgtable *pgt);
  */
 int kvm_pgtable_stage2_map(struct kvm_pgtable *pgt, u64 addr, u64 size,
   u64 phys, enum kvm_pgtable_prot prot,
-  struct kvm_mmu_memory_cache *mc);
+  void *mc);
 
 /**
 * kvm_pgtable_stage2_unmap() - Remove a mapping from a guest stage-2 page-table.
diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index 96a25d0b7b6e..5dd1b4978fe8 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -443,7 +443,7 @@ struct stage2_map_data {
kvm_pte_t   *anchor;
 
struct kvm_s2_mmu   *mmu;
-   struct kvm_mmu_memory_cache *memcache;
+   void*memcache;
 
struct kvm_pgtable_mm_ops   *mm_ops;
 };
@@ -613,7 +613,7 @@ static int stage2_map_walker(u64 addr, u64 end, u32 level, 
kvm_pte_t *ptep,
 
 int kvm_pgtable_stage2_map(struct kvm_pgtable *pgt, u64 addr, u64 size,
   u64 phys, enum kvm_pgtable_prot prot,
-  struct kvm_mmu_memory_cache *mc)
+  void *mc)
 {
int ret;
struct stage2_map_data map_data = {
-- 
2.30.0.284.gd98b1dd5eaa7-goog



[RFC PATCH v2 18/26] KVM: arm64: Use kvm_arch for stage 2 pgtable

2021-01-08 Thread Quentin Perret
In order to make use of the stage 2 pgtable code for the host stage 2,
use struct kvm_arch in lieu of struct kvm as the host will have the
former but not the latter.

Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/kvm_pgtable.h | 5 +++--
 arch/arm64/kvm/hyp/pgtable.c | 6 +++---
 arch/arm64/kvm/mmu.c | 2 +-
 3 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_pgtable.h 
b/arch/arm64/include/asm/kvm_pgtable.h
index 45acc9dc6c45..8e8f1d2c5e0e 100644
--- a/arch/arm64/include/asm/kvm_pgtable.h
+++ b/arch/arm64/include/asm/kvm_pgtable.h
@@ -151,12 +151,13 @@ int kvm_pgtable_hyp_map(struct kvm_pgtable *pgt, u64 
addr, u64 size, u64 phys,
 /**
  * kvm_pgtable_stage2_init() - Initialise a guest stage-2 page-table.
  * @pgt:   Uninitialised page-table structure to initialise.
- * @kvm:   KVM structure representing the guest virtual machine.
+ * @arch:  Arch-specific KVM structure representing the guest virtual
+ * machine.
  * @mm_ops:Memory management callbacks.
  *
  * Return: 0 on success, negative error code on failure.
  */
-int kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm *kvm,
+int kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm_arch *arch,
struct kvm_pgtable_mm_ops *mm_ops);
 
 /**
diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index 61a8a34ddfdb..96a25d0b7b6e 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -855,11 +855,11 @@ int kvm_pgtable_stage2_flush(struct kvm_pgtable *pgt, u64 
addr, u64 size)
return kvm_pgtable_walk(pgt, addr, size, &walker);
 }
 
-int kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm *kvm,
+int kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm_arch *arch,
struct kvm_pgtable_mm_ops *mm_ops)
 {
size_t pgd_sz;
-   u64 vtcr = kvm->arch.vtcr;
+   u64 vtcr = arch->vtcr;
u32 ia_bits = VTCR_EL2_IPA(vtcr);
u32 sl0 = FIELD_GET(VTCR_EL2_SL0_MASK, vtcr);
u32 start_level = VTCR_EL2_TGRAN_SL0_BASE - sl0;
@@ -872,7 +872,7 @@ int kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct 
kvm *kvm,
	pgt->ia_bits		= ia_bits;
	pgt->start_level	= start_level;
	pgt->mm_ops		= mm_ops;
-	pgt->mmu		= &kvm->arch.mmu;
+	pgt->mmu		= &arch->mmu;
 
/* Ensure zeroed PGD pages are visible to the hardware walker */
dsb(ishst);
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 9d4c9251208e..7e6263103943 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -461,7 +461,7 @@ int kvm_init_stage2_mmu(struct kvm *kvm, struct kvm_s2_mmu 
*mmu)
if (!pgt)
return -ENOMEM;
 
-   err = kvm_pgtable_stage2_init(pgt, kvm, &kvm_s2_mm_ops);
+   err = kvm_pgtable_stage2_init(pgt, &kvm->arch, &kvm_s2_mm_ops);
if (err)
goto out_free_pgtable;
 
-- 
2.30.0.284.gd98b1dd5eaa7-goog



[RFC PATCH v2 10/26] KVM: arm64: Introduce an early Hyp page allocator

2021-01-08 Thread Quentin Perret
With nVHE, the host currently creates all s1 hypervisor mappings at EL1
during boot, installs them at EL2, and extends them as required (e.g.
when creating a new VM). But in a world where the host is no longer
trusted, it cannot have full control over the code mapped in the
hypervisor.

In preparation for enabling the hypervisor to create its own s1 mappings
during boot, introduce an early page allocator with minimal
functionality. This allocator is designed to be used only during the
early bootstrap of the hyp code when memory protection is enabled; the
hyp code then switches to a full-fledged page allocator after init.
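
A typical bootstrap sequence would look something like this (a sketch
under assumptions: the variable names are made up, and the
kvm_pgtable_hyp_init() call assumes the mm_ops-taking variant
introduced earlier in this series):

	/* Hand the allocator a range of pages reserved by the host... */
	hyp_early_alloc_init(reserved_va, reserved_size);

	/* ...and let the hyp s1 page-table code allocate from it. */
	ret = kvm_pgtable_hyp_init(&pgt, hyp_va_bits, &hyp_early_alloc_mm_ops);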

Signed-off-by: Quentin Perret 
---
 arch/arm64/kvm/hyp/include/nvhe/early_alloc.h | 14 +
 arch/arm64/kvm/hyp/include/nvhe/memory.h  | 24 
 arch/arm64/kvm/hyp/nvhe/Makefile  |  2 +-
 arch/arm64/kvm/hyp/nvhe/early_alloc.c | 60 +++
 arch/arm64/kvm/hyp/nvhe/psci-relay.c  |  4 +-
 5 files changed, 100 insertions(+), 4 deletions(-)
 create mode 100644 arch/arm64/kvm/hyp/include/nvhe/early_alloc.h
 create mode 100644 arch/arm64/kvm/hyp/include/nvhe/memory.h
 create mode 100644 arch/arm64/kvm/hyp/nvhe/early_alloc.c

diff --git a/arch/arm64/kvm/hyp/include/nvhe/early_alloc.h 
b/arch/arm64/kvm/hyp/include/nvhe/early_alloc.h
new file mode 100644
index ..68ce2bf9a718
--- /dev/null
+++ b/arch/arm64/kvm/hyp/include/nvhe/early_alloc.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef __KVM_HYP_EARLY_ALLOC_H
+#define __KVM_HYP_EARLY_ALLOC_H
+
+#include <asm/kvm_pgtable.h>
+
+void hyp_early_alloc_init(unsigned long virt, unsigned long size);
+unsigned long hyp_early_alloc_nr_pages(void);
+void *hyp_early_alloc_page(void *arg);
+void *hyp_early_alloc_contig(unsigned int nr_pages);
+
+extern struct kvm_pgtable_mm_ops hyp_early_alloc_mm_ops;
+
+#endif /* __KVM_HYP_EARLY_ALLOC_H */
diff --git a/arch/arm64/kvm/hyp/include/nvhe/memory.h 
b/arch/arm64/kvm/hyp/include/nvhe/memory.h
new file mode 100644
index ..64c44c142c95
--- /dev/null
+++ b/arch/arm64/kvm/hyp/include/nvhe/memory.h
@@ -0,0 +1,24 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef __KVM_HYP_MEMORY_H
+#define __KVM_HYP_MEMORY_H
+
+#include <asm/page.h>
+
+#include <linux/types.h>
+
+extern s64 hyp_physvirt_offset;
+
+#define __hyp_pa(virt) ((phys_addr_t)(virt) + hyp_physvirt_offset)
+#define __hyp_va(virt) ((void *)((phys_addr_t)(virt) - hyp_physvirt_offset))
+
+static inline void *hyp_phys_to_virt(phys_addr_t phys)
+{
+   return __hyp_va(phys);
+}
+
+static inline phys_addr_t hyp_virt_to_phys(void *addr)
+{
+   return __hyp_pa(addr);
+}
+
+#endif /* __KVM_HYP_MEMORY_H */
diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
index 590fdefb42dd..1fc0684a7678 100644
--- a/arch/arm64/kvm/hyp/nvhe/Makefile
+++ b/arch/arm64/kvm/hyp/nvhe/Makefile
@@ -10,7 +10,7 @@ lib-objs := clear_page.o copy_page.o memcpy.o memset.o
 lib-objs := $(addprefix ../../../lib/, $(lib-objs))
 
 obj-y := timer-sr.o sysreg-sr.o debug-sr.o switch.o tlb.o hyp-init.o host.o \
-hyp-main.o hyp-smp.o psci-relay.o
+hyp-main.o hyp-smp.o psci-relay.o early_alloc.o
 obj-y += ../vgic-v3-sr.o ../aarch32.o ../vgic-v2-cpuif-proxy.o ../entry.o \
 ../fpsimd.o ../hyp-entry.o ../exception.o
 obj-y += $(lib-objs)
diff --git a/arch/arm64/kvm/hyp/nvhe/early_alloc.c 
b/arch/arm64/kvm/hyp/nvhe/early_alloc.c
new file mode 100644
index ..de4c45662970
--- /dev/null
+++ b/arch/arm64/kvm/hyp/nvhe/early_alloc.c
@@ -0,0 +1,60 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2020 Google LLC
+ * Author: Quentin Perret 
+ */
+
+#include <asm/kvm_pgtable.h>
+
+#include <nvhe/early_alloc.h>
+#include <nvhe/memory.h>
+
+struct kvm_pgtable_mm_ops hyp_early_alloc_mm_ops;
+s64 __ro_after_init hyp_physvirt_offset;
+
+static unsigned long base;
+static unsigned long end;
+static unsigned long cur;
+
+unsigned long hyp_early_alloc_nr_pages(void)
+{
+   return (cur - base) >> PAGE_SHIFT;
+}
+
+extern void clear_page(void *to);
+
+void *hyp_early_alloc_contig(unsigned int nr_pages)
+{
+   unsigned long ret = cur, i, p;
+
+   if (!nr_pages)
+   return NULL;
+
+   cur += nr_pages << PAGE_SHIFT;
+   if (cur > end) {
+   cur = ret;
+   return NULL;
+   }
+
+   for (i = 0; i < nr_pages; i++) {
+   p = ret + (i << PAGE_SHIFT);
+   clear_page((void *)(p));
+   }
+
+   return (void *)ret;
+}
+
+void *hyp_early_alloc_page(void *arg)
+{
+   return hyp_early_alloc_contig(1);
+}
+
+void hyp_early_alloc_init(unsigned long virt, unsigned long size)
+{
+   base = virt;
+   end = virt + size;
+   cur = virt;
+
+   hyp_early_alloc_mm_ops.zalloc_page = hyp_early_alloc_page;
+   hyp_early_alloc_mm_ops.phys_to_virt = hyp_phys_to_virt;
+   hyp_early_alloc_mm_ops.virt_to_phys = hyp_virt_to_phys;
+}
diff --git a/arch/arm64/kvm/hyp/nvhe/psci-relay.c 
b/arch/arm64/kvm/hyp/nvhe/psci-relay.c
index e3947846ffcb..bdd8054bce4c 100644
--- 

[RFC PATCH v2 03/26] arm64: kvm: Add standalone ticket spinlock implementation for use at hyp

2021-01-08 Thread Quentin Perret
From: Will Deacon 

We will soon need to synchronise multiple CPUs in the hyp text at EL2.
The qspinlock-based locking used by the host is overkill for this purpose
and relies on the kernel's "percpu" implementation for the MCS nodes.

Implement a simple ticket locking scheme based heavily on the code removed
by commit c11090474d70 ("arm64: locking: Replace ticket lock implementation
with qspinlock").

Signed-off-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/kvm/hyp/include/nvhe/spinlock.h | 92 ++
 1 file changed, 92 insertions(+)
 create mode 100644 arch/arm64/kvm/hyp/include/nvhe/spinlock.h

diff --git a/arch/arm64/kvm/hyp/include/nvhe/spinlock.h 
b/arch/arm64/kvm/hyp/include/nvhe/spinlock.h
new file mode 100644
index ..7584c397bbac
--- /dev/null
+++ b/arch/arm64/kvm/hyp/include/nvhe/spinlock.h
@@ -0,0 +1,92 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * A stand-alone ticket spinlock implementation for use by the non-VHE
+ * KVM hypervisor code running at EL2.
+ *
+ * Copyright (C) 2020 Google LLC
+ * Author: Will Deacon 
+ *
+ * Heavily based on the implementation removed by c11090474d70 which was:
+ * Copyright (C) 2012 ARM Ltd.
+ */
+
+#ifndef __ARM64_KVM_NVHE_SPINLOCK_H__
+#define __ARM64_KVM_NVHE_SPINLOCK_H__
+
+#include <asm/alternative.h>
+#include <asm/lse.h>
+
+typedef union hyp_spinlock {
+   u32 __val;
+   struct {
+#ifdef __AARCH64EB__
+   u16 next, owner;
+#else
+   u16 owner, next;
+#endif
+   };
+} hyp_spinlock_t;
+
+#define hyp_spin_lock_init(l)  \
+do {   \
+   *(l) = (hyp_spinlock_t){ .__val = 0 };  \
+} while (0)
+
+static inline void hyp_spin_lock(hyp_spinlock_t *lock)
+{
+   u32 tmp;
+   hyp_spinlock_t lockval, newval;
+
+   asm volatile(
+   /* Atomically increment the next ticket. */
+   ARM64_LSE_ATOMIC_INSN(
+   /* LL/SC */
+"  prfmpstl1strm, %3\n"
+"1:ldaxr   %w0, %3\n"
+"  add %w1, %w0, #(1 << 16)\n"
+"  stxr%w2, %w1, %3\n"
+"  cbnz%w2, 1b\n",
+   /* LSE atomics */
+"  mov %w2, #(1 << 16)\n"
+"  ldadda  %w2, %w0, %3\n"
+   __nops(3))
+
+   /* Did we get the lock? */
+"  eor %w1, %w0, %w0, ror #16\n"
+"  cbz %w1, 3f\n"
+   /*
+* No: spin on the owner. Send a local event to avoid missing an
+* unlock before the exclusive load.
+*/
+"  sevl\n"
+"2:wfe\n"
+"  ldaxrh  %w2, %4\n"
+"  eor %w1, %w2, %w0, lsr #16\n"
+"  cbnz%w1, 2b\n"
+   /* We got the lock. Critical section starts here. */
+"3:"
+   : "=" (lockval), "=" (newval), "=" (tmp), "+Q" (*lock)
+   : "Q" (lock->owner)
+   : "memory");
+}
+
+static inline void hyp_spin_unlock(hyp_spinlock_t *lock)
+{
+   u64 tmp;
+
+   asm volatile(
+   ARM64_LSE_ATOMIC_INSN(
+   /* LL/SC */
+   "   ldrh%w1, %0\n"
+   "   add %w1, %w1, #1\n"
+   "   stlrh   %w1, %0",
+   /* LSE atomics */
+   "   mov %w1, #1\n"
+   "   staddlh %w1, %0\n"
+   __nops(1))
+   : "=Q" (lock->owner), "=" (tmp)
+   :
+   : "memory");
+}
+
+#endif /* __ARM64_KVM_NVHE_SPINLOCK_H__ */
-- 
2.30.0.284.gd98b1dd5eaa7-goog

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[RFC PATCH v2 11/26] KVM: arm64: Stub CONFIG_DEBUG_LIST at Hyp

2021-01-08 Thread Quentin Perret
In order to use the kernel list library at EL2, introduce stubs for the
CONFIG_DEBUG_LIST out-of-line calls.
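
For context, with CONFIG_DEBUG_LIST=y the generic list helpers call these
validators before touching any node, roughly as follows (slightly
simplified from include/linux/list.h):

  static inline void __list_add(struct list_head *new,
                                struct list_head *prev,
                                struct list_head *next)
  {
          if (!__list_add_valid(new, prev, next))  /* out-of-line check */
                  return;

          next->prev = new;
          new->next = next;
          new->prev = prev;
          prev->next = new;
  }

Returning true unconditionally from the EL2 stubs therefore keeps the list
primitives fully functional while skipping the host-oriented sanity checks.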

Signed-off-by: Quentin Perret 
---
 arch/arm64/kvm/hyp/nvhe/Makefile |  2 +-
 arch/arm64/kvm/hyp/nvhe/stub.c   | 22 ++
 2 files changed, 23 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/kvm/hyp/nvhe/stub.c

diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
index 1fc0684a7678..33bd381d8f73 100644
--- a/arch/arm64/kvm/hyp/nvhe/Makefile
+++ b/arch/arm64/kvm/hyp/nvhe/Makefile
@@ -10,7 +10,7 @@ lib-objs := clear_page.o copy_page.o memcpy.o memset.o
 lib-objs := $(addprefix ../../../lib/, $(lib-objs))
 
 obj-y := timer-sr.o sysreg-sr.o debug-sr.o switch.o tlb.o hyp-init.o host.o \
-hyp-main.o hyp-smp.o psci-relay.o early_alloc.o
+hyp-main.o hyp-smp.o psci-relay.o early_alloc.o stub.o
 obj-y += ../vgic-v3-sr.o ../aarch32.o ../vgic-v2-cpuif-proxy.o ../entry.o \
 ../fpsimd.o ../hyp-entry.o ../exception.o
 obj-y += $(lib-objs)
diff --git a/arch/arm64/kvm/hyp/nvhe/stub.c b/arch/arm64/kvm/hyp/nvhe/stub.c
new file mode 100644
index ..c0aa6bbfd79d
--- /dev/null
+++ b/arch/arm64/kvm/hyp/nvhe/stub.c
@@ -0,0 +1,22 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Stubs for out-of-line function calls caused by re-using kernel
+ * infrastructure at EL2.
+ *
+ * Copyright (C) 2020 - Google LLC
+ */
+
+#include 
+
+#ifdef CONFIG_DEBUG_LIST
+bool __list_add_valid(struct list_head *new, struct list_head *prev,
+ struct list_head *next)
+{
+   return true;
+}
+
+bool __list_del_entry_valid(struct list_head *entry)
+{
+   return true;
+}
+#endif
-- 
2.30.0.284.gd98b1dd5eaa7-goog

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[RFC PATCH v2 01/26] arm64: lib: Annotate {clear, copy}_page() as position-independent

2021-01-08 Thread Quentin Perret
From: Will Deacon 

clear_page() and copy_page() are suitable for use outside of the kernel
address space, so annotate them as position-independent code.

Signed-off-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/lib/clear_page.S | 4 ++--
 arch/arm64/lib/copy_page.S  | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/lib/clear_page.S b/arch/arm64/lib/clear_page.S
index 073acbf02a7c..b84b179edba3 100644
--- a/arch/arm64/lib/clear_page.S
+++ b/arch/arm64/lib/clear_page.S
@@ -14,7 +14,7 @@
  * Parameters:
  * x0 - dest
  */
-SYM_FUNC_START(clear_page)
+SYM_FUNC_START_PI(clear_page)
mrs x1, dczid_el0
and w1, w1, #0xf
mov x2, #4
@@ -25,5 +25,5 @@ SYM_FUNC_START(clear_page)
tst x0, #(PAGE_SIZE - 1)
b.ne1b
ret
-SYM_FUNC_END(clear_page)
+SYM_FUNC_END_PI(clear_page)
 EXPORT_SYMBOL(clear_page)
diff --git a/arch/arm64/lib/copy_page.S b/arch/arm64/lib/copy_page.S
index e7a793961408..29144f4cd449 100644
--- a/arch/arm64/lib/copy_page.S
+++ b/arch/arm64/lib/copy_page.S
@@ -17,7 +17,7 @@
  * x0 - dest
  * x1 - src
  */
-SYM_FUNC_START(copy_page)
+SYM_FUNC_START_PI(copy_page)
 alternative_if ARM64_HAS_NO_HW_PREFETCH
// Prefetch three cache lines ahead.
prfmpldl1strm, [x1, #128]
@@ -75,5 +75,5 @@ alternative_else_nop_endif
stnpx16, x17, [x0, #112 - 256]
 
ret
-SYM_FUNC_END(copy_page)
+SYM_FUNC_END_PI(copy_page)
 EXPORT_SYMBOL(copy_page)
-- 
2.30.0.284.gd98b1dd5eaa7-goog

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[RFC PATCH v2 02/26] KVM: arm64: Link position-independent string routines into .hyp.text

2021-01-08 Thread Quentin Perret
From: Will Deacon 

Pull clear_page(), copy_page(), memcpy() and memset() into the nVHE hyp
code and ensure that we always execute the '__pi_' entry point on the
off chance that it changes in the future.

[ qperret: Commit title nits ]

Signed-off-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/hyp_image.h |  3 +++
 arch/arm64/kernel/image-vars.h | 11 +++
 arch/arm64/kvm/hyp/nvhe/Makefile   |  4 
 3 files changed, 18 insertions(+)

diff --git a/arch/arm64/include/asm/hyp_image.h 
b/arch/arm64/include/asm/hyp_image.h
index daa1a1da539e..e06842756051 100644
--- a/arch/arm64/include/asm/hyp_image.h
+++ b/arch/arm64/include/asm/hyp_image.h
@@ -31,6 +31,9 @@
  */
 #define KVM_NVHE_ALIAS(sym)kvm_nvhe_sym(sym) = sym;
 
+/* Defines a linker script alias for KVM nVHE hyp symbols */
+#define KVM_NVHE_ALIAS_HYP(first, sec) kvm_nvhe_sym(first) = kvm_nvhe_sym(sec);
+
 #endif /* LINKER_SCRIPT */
 
 #endif /* __ARM64_HYP_IMAGE_H__ */
diff --git a/arch/arm64/kernel/image-vars.h b/arch/arm64/kernel/image-vars.h
index 39289d75118d..43f3a1d6e92d 100644
--- a/arch/arm64/kernel/image-vars.h
+++ b/arch/arm64/kernel/image-vars.h
@@ -102,6 +102,17 @@ KVM_NVHE_ALIAS(__stop___kvm_ex_table);
 /* Array containing bases of nVHE per-CPU memory regions. */
 KVM_NVHE_ALIAS(kvm_arm_hyp_percpu_base);
 
+/* Position-independent library routines */
+KVM_NVHE_ALIAS_HYP(clear_page, __pi_clear_page);
+KVM_NVHE_ALIAS_HYP(copy_page, __pi_copy_page);
+KVM_NVHE_ALIAS_HYP(memcpy, __pi_memcpy);
+KVM_NVHE_ALIAS_HYP(memset, __pi_memset);
+
+#ifdef CONFIG_KASAN
+KVM_NVHE_ALIAS_HYP(__memcpy, __pi_memcpy);
+KVM_NVHE_ALIAS_HYP(__memset, __pi_memset);
+#endif
+
 #endif /* CONFIG_KVM */
 
 #endif /* __ARM64_KERNEL_IMAGE_VARS_H */
diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
index 1f1e351c5fe2..590fdefb42dd 100644
--- a/arch/arm64/kvm/hyp/nvhe/Makefile
+++ b/arch/arm64/kvm/hyp/nvhe/Makefile
@@ -6,10 +6,14 @@
 asflags-y := -D__KVM_NVHE_HYPERVISOR__
 ccflags-y := -D__KVM_NVHE_HYPERVISOR__
 
+lib-objs := clear_page.o copy_page.o memcpy.o memset.o
+lib-objs := $(addprefix ../../../lib/, $(lib-objs))
+
 obj-y := timer-sr.o sysreg-sr.o debug-sr.o switch.o tlb.o hyp-init.o host.o \
 hyp-main.o hyp-smp.o psci-relay.o
 obj-y += ../vgic-v3-sr.o ../aarch32.o ../vgic-v2-cpuif-proxy.o ../entry.o \
 ../fpsimd.o ../hyp-entry.o ../exception.o
+obj-y += $(lib-objs)
 
 ##
 ## Build rules for compiling nVHE hyp code
-- 
2.30.0.284.gd98b1dd5eaa7-goog

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[RFC PATCH v2 14/26] KVM: arm64: Factor out vector address calculation

2021-01-08 Thread Quentin Perret
In order to re-map the guest vectors at EL2 when pKVM is enabled,
refactor __kvm_vector_slot2idx() and kvm_init_vector_slot() to move all
the address calculation logic into a static inline function.
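
As a worked example (slot numbering assumed from enum
arm64_hyp_spectre_vector):

  /*
   * idx = slot - (slot != HYP_VECTOR_DIRECT), each vector copy is SZ_2K:
   *
   *   HYP_VECTOR_DIRECT           (0) -> idx 0 -> base + 0K
   *   HYP_VECTOR_SPECTRE_DIRECT   (1) -> idx 0 -> base + 0K
   *   HYP_VECTOR_INDIRECT         (2) -> idx 1 -> base + 2K
   *   HYP_VECTOR_SPECTRE_INDIRECT (3) -> idx 2 -> base + 4K
   *
   * i.e. the two 'direct' slots share the first copy within their
   * respective bases, while each indirect slot gets its own 2K copy.
   */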

Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/kvm_mmu.h | 8 
 arch/arm64/kvm/arm.c | 9 +
 2 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index e52d82aeadca..d7ebd73ec86f 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -195,6 +195,14 @@ phys_addr_t kvm_mmu_get_httbr(void);
 phys_addr_t kvm_get_idmap_vector(void);
 int kvm_mmu_init(void);
 
+static inline void *__kvm_vector_slot2addr(void *base,
+  enum arm64_hyp_spectre_vector slot)
+{
+   int idx = slot - (slot != HYP_VECTOR_DIRECT);
+
+   return base + (idx * SZ_2K);
+}
+
 struct kvm;
 
 #define kvm_flush_dcache_to_poc(a,l)   __flush_dcache_area((a), (l))
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 9fd769349e9e..6af9204bcd5b 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1346,16 +1346,9 @@ static unsigned long nvhe_percpu_order(void)
 /* A lookup table holding the hypervisor VA for each vector slot */
 static void *hyp_spectre_vector_selector[BP_HARDEN_EL2_SLOTS];
 
-static int __kvm_vector_slot2idx(enum arm64_hyp_spectre_vector slot)
-{
-   return slot - (slot != HYP_VECTOR_DIRECT);
-}
-
 static void kvm_init_vector_slot(void *base, enum arm64_hyp_spectre_vector 
slot)
 {
-   int idx = __kvm_vector_slot2idx(slot);
-
-   hyp_spectre_vector_selector[slot] = base + (idx * SZ_2K);
+   hyp_spectre_vector_selector[slot] = __kvm_vector_slot2addr(base, slot);
 }
 
 static int kvm_init_vector_slots(void)
-- 
2.30.0.284.gd98b1dd5eaa7-goog

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[RFC PATCH v2 13/26] KVM: arm64: Enable access to sanitized CPU features at EL2

2021-01-08 Thread Quentin Perret
Introduce infrastructure in KVM to copy CPU feature
registers into EL2-owned data structures, allowing sanitised
values to be read directly at EL2 in nVHE.

Given that only a subset of these features are read by the
hypervisor, the ones that need to be copied are to be listed under
<asm/kvm_cpufeature.h> together with the name of the nVHE variable that
will hold the copy.

While at it, introduce the first user of this infrastructure by
implementing __flush_dcache_area at EL2, which needs
arm64_ftr_reg_ctrel0.
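
As an illustration (a sketch, not part of the patch; CTR_DMINLINE_SHIFT
and the sys_val field are the kernel's existing names), EL2 code can then
derive the D-cache line size from the copied register:

  #include <asm/cpufeature.h>
  #include <asm/kvm_cpufeature.h>

  static inline u32 hyp_dcache_line_size(void)
  {
          u64 ctr = arm64_ftr_reg_ctrel0.sys_val;  /* sanitised copy */

          /* DminLine is log2(words), a word being 4 bytes */
          return 4 << ((ctr >> CTR_DMINLINE_SHIFT) & 0xf);
  }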

Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/cpufeature.h |  1 +
 arch/arm64/include/asm/kvm_cpufeature.h | 17 ++
 arch/arm64/kernel/cpufeature.c  | 12 ++
 arch/arm64/kvm/arm.c| 31 +
 arch/arm64/kvm/hyp/nvhe/Makefile|  3 ++-
 arch/arm64/kvm/hyp/nvhe/cache.S | 13 +++
 arch/arm64/kvm/hyp/nvhe/cpufeature.c|  8 +++
 7 files changed, 84 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/include/asm/kvm_cpufeature.h
 create mode 100644 arch/arm64/kvm/hyp/nvhe/cache.S
 create mode 100644 arch/arm64/kvm/hyp/nvhe/cpufeature.c

diff --git a/arch/arm64/include/asm/cpufeature.h 
b/arch/arm64/include/asm/cpufeature.h
index 16063c813dcd..742e9bcc051b 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -600,6 +600,7 @@ void __init setup_cpu_features(void);
 void check_local_cpu_capabilities(void);
 
 u64 read_sanitised_ftr_reg(u32 id);
+int copy_ftr_reg(u32 id, struct arm64_ftr_reg *dst);
 
 static inline bool cpu_supports_mixed_endian_el0(void)
 {
diff --git a/arch/arm64/include/asm/kvm_cpufeature.h 
b/arch/arm64/include/asm/kvm_cpufeature.h
new file mode 100644
index ..d34f85cba358
--- /dev/null
+++ b/arch/arm64/include/asm/kvm_cpufeature.h
@@ -0,0 +1,17 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2020 - Google LLC
+ * Author: Quentin Perret 
+ */
+
+#include <asm/cpufeature.h>
+
+#ifndef KVM_HYP_CPU_FTR_REG
+#if defined(__KVM_NVHE_HYPERVISOR__)
+#define KVM_HYP_CPU_FTR_REG(id, name) extern struct arm64_ftr_reg name;
+#else
+#define KVM_HYP_CPU_FTR_REG(id, name) DECLARE_KVM_NVHE_SYM(name);
+#endif
+#endif
+
+KVM_HYP_CPU_FTR_REG(SYS_CTR_EL0, arm64_ftr_reg_ctrel0)
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index bc3549663957..c2019dc3 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1113,6 +1113,18 @@ u64 read_sanitised_ftr_reg(u32 id)
 }
 EXPORT_SYMBOL_GPL(read_sanitised_ftr_reg);
 
+int copy_ftr_reg(u32 id, struct arm64_ftr_reg *dst)
+{
+   struct arm64_ftr_reg *regp = get_arm64_ftr_reg(id);
+
+   if (!regp)
+   return -EINVAL;
+
+   memcpy(dst, regp, sizeof(*regp));
+
+   return 0;
+}
+
 #define read_sysreg_case(r)\
case r: return read_sysreg_s(r)
 
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 51b53ca36dc5..9fd769349e9e 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -34,6 +34,7 @@
 #include 
 #include 
 #include 
+#include <asm/kvm_cpufeature.h>
 #include 
 #include 
 #include 
@@ -1697,6 +1698,29 @@ static void teardown_hyp_mode(void)
}
 }
 
+#undef KVM_HYP_CPU_FTR_REG
+#define KVM_HYP_CPU_FTR_REG(id, name) \
+   { .sys_id = id, .dst = (struct arm64_ftr_reg *)&kvm_nvhe_sym(name) },
+static const struct __ftr_reg_copy_entry {
+   u32 sys_id;
+   struct arm64_ftr_reg*dst;
+} hyp_ftr_regs[] = {
+   #include <asm/kvm_cpufeature.h>
+};
+
+static int copy_cpu_ftr_regs(void)
+{
+   int i, ret;
+
+   for (i = 0; i < ARRAY_SIZE(hyp_ftr_regs); i++) {
+   ret = copy_ftr_reg(hyp_ftr_regs[i].sys_id, hyp_ftr_regs[i].dst);
+   if (ret)
+   return ret;
+   }
+
+   return 0;
+}
+
 /**
  * Inits Hyp-mode on all online CPUs
  */
@@ -1705,6 +1729,13 @@ static int init_hyp_mode(void)
int cpu;
int err = 0;
 
+   /*
+* Copy the required CPU feature registers into their EL2 counterparts
+*/
+   err = copy_cpu_ftr_regs();
+   if (err)
+   return err;
+
/*
 * Allocate Hyp PGD and setup Hyp identity mapping
 */
diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
index 9e5eacfec6ec..72cfe53f106f 100644
--- a/arch/arm64/kvm/hyp/nvhe/Makefile
+++ b/arch/arm64/kvm/hyp/nvhe/Makefile
@@ -10,7 +10,8 @@ lib-objs := clear_page.o copy_page.o memcpy.o memset.o
 lib-objs := $(addprefix ../../../lib/, $(lib-objs))
 
 obj-y := timer-sr.o sysreg-sr.o debug-sr.o switch.o tlb.o hyp-init.o host.o \
-hyp-main.o hyp-smp.o psci-relay.o early_alloc.o stub.o page_alloc.o
+hyp-main.o hyp-smp.o psci-relay.o early_alloc.o stub.o page_alloc.o \
+cache.o cpufeature.o
 obj-y += ../vgic-v3-sr.o ../aarch32.o ../vgic-v2-cpuif-proxy.o ../entry.o \
 ../fpsimd.o ../hyp-entry.o ../exception.o
 obj-y += $(lib-objs)
diff --git 

[RFC PATCH v2 17/26] KVM: arm64: Elevate Hyp mappings creation at EL2

2021-01-08 Thread Quentin Perret
Previous commits have introduced infrastructure at EL2 to enable the Hyp
code to manage its own memory, and more specifically its stage 1 page
tables. However, this was preliminary work, and none of it is currently
in use.

Put all of this together by elevating the hyp mappings creation at EL2
when memory protection is enabled. In this case, the host kernel running
at EL1 still creates _temporary_ Hyp mappings, only used while
initializing the hypervisor, but frees them right after.

As such, all calls to create_hyp_mappings() after kvm init has finished
turn into hypercalls, as the host now has no 'legal' way to modify the
hypervisor page-tables directly.
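
Conceptually, the resulting dispatch looks like this (a sketch of the
idea, not the exact diff; kvm_host_owns_hyp_mappings() and
__create_hyp_mappings_locked() are illustrative helper names):

  int create_hyp_mappings(void *from, void *to, enum kvm_pgtable_prot prot)
  {
          /* Before __pkvm_init, the host still owns the hyp page-tables. */
          if (kvm_host_owns_hyp_mappings())
                  return __create_hyp_mappings_locked(from, to, prot);

          /* Afterwards, only the hypervisor may touch them. */
          return kvm_call_hyp_nvhe(__pkvm_create_mappings,
                                   (unsigned long)kern_hyp_va(from),
                                   (unsigned long)(to - from),
                                   __pa(from), prot);
  }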

Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/kvm_mmu.h |  1 -
 arch/arm64/kvm/arm.c | 62 +---
 arch/arm64/kvm/mmu.c | 34 ++
 3 files changed, 92 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index d7ebd73ec86f..6c8466a042a9 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -309,6 +309,5 @@ static __always_inline void __load_guest_stage2(struct 
kvm_s2_mmu *mmu)
 */
asm(ALTERNATIVE("nop", "isb", ARM64_WORKAROUND_SPECULATIVE_AT));
 }
-
 #endif /* __ASSEMBLY__ */
 #endif /* __ARM64_KVM_MMU_H__ */
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 6af9204bcd5b..e524682c2ccf 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1421,7 +1421,7 @@ static void cpu_prepare_hyp_mode(int cpu)
kvm_flush_dcache_to_poc(params, sizeof(*params));
 }
 
-static void cpu_init_hyp_mode(void)
+static void kvm_set_hyp_vector(void)
 {
struct kvm_nvhe_init_params *params;
struct arm_smccc_res res;
@@ -1439,6 +1439,11 @@ static void cpu_init_hyp_mode(void)
params = this_cpu_ptr_nvhe_sym(kvm_init_params);
	arm_smccc_1_1_hvc(KVM_HOST_SMCCC_FUNC(__kvm_hyp_init), virt_to_phys(params), &res);
WARN_ON(res.a0 != SMCCC_RET_SUCCESS);
+}
+
+static void cpu_init_hyp_mode(void)
+{
+   kvm_set_hyp_vector();
 
/*
 * Disabling SSBD on a non-VHE system requires us to enable SSBS
@@ -1481,7 +1486,10 @@ static void cpu_set_hyp_vector(void)
	struct bp_hardening_data *data = this_cpu_ptr(&bp_hardening_data);
void *vector = hyp_spectre_vector_selector[data->slot];
 
-   *this_cpu_ptr_hyp_sym(kvm_hyp_vector) = (unsigned long)vector;
+   if (!is_protected_kvm_enabled())
+   *this_cpu_ptr_hyp_sym(kvm_hyp_vector) = (unsigned long)vector;
+   else
+   kvm_call_hyp_nvhe(__pkvm_cpu_set_vector, data->slot);
 }
 
 static void cpu_hyp_reinit(void)
@@ -1489,13 +1497,14 @@ static void cpu_hyp_reinit(void)

	kvm_init_host_cpu_context(&this_cpu_ptr_hyp_sym(kvm_host_data)->host_ctxt);
 
cpu_hyp_reset();
-   cpu_set_hyp_vector();
 
if (is_kernel_in_hyp_mode())
kvm_timer_init_vhe();
else
cpu_init_hyp_mode();
 
+   cpu_set_hyp_vector();
+
kvm_arm_init_debug();
 
if (vgic_present)
@@ -1714,13 +1723,52 @@ static int copy_cpu_ftr_regs(void)
return 0;
 }
 
+static int kvm_hyp_enable_protection(void)
+{
+   void *per_cpu_base = kvm_ksym_ref(kvm_arm_hyp_percpu_base);
+   int ret, cpu;
+   void *addr;
+
+   if (!is_protected_kvm_enabled())
+   return 0;
+
+   if (!hyp_mem_base)
+   return -ENOMEM;
+
+   addr = phys_to_virt(hyp_mem_base);
+   ret = create_hyp_mappings(addr, addr + hyp_mem_size - 1, PAGE_HYP);
+   if (ret)
+   return ret;
+
+   preempt_disable();
+   kvm_set_hyp_vector();
+   ret = kvm_call_hyp_nvhe(__pkvm_init, hyp_mem_base, hyp_mem_size,
+   num_possible_cpus(), kern_hyp_va(per_cpu_base));
+   preempt_enable();
+   if (ret)
+   return ret;
+
+   free_hyp_pgds();
+   for_each_possible_cpu(cpu)
+   free_page(per_cpu(kvm_arm_hyp_stack_page, cpu));
+
+   return 0;
+}
+
 /**
  * Inits Hyp-mode on all online CPUs
  */
 static int init_hyp_mode(void)
 {
int cpu;
-   int err = 0;
+   int err = -ENOMEM;
+
+   /*
+* The protected Hyp-mode cannot be initialized if the memory pool
+* allocation has failed.
+*/
+   if (is_protected_kvm_enabled() && !hyp_mem_base)
+   return err;
 
/*
 * Copy the required CPU feature registers into their EL2 counterparts
@@ -1854,6 +1902,12 @@ static int init_hyp_mode(void)
for_each_possible_cpu(cpu)
cpu_prepare_hyp_mode(cpu);
 
+   err = kvm_hyp_enable_protection();
+   if (err) {
+   kvm_err("Failed to enable hyp memory protection: %d\n", err);
+   goto out_err;
+   }
+
return 0;
 
 out_err:
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 3cf9397dabdb..9d4c9251208e 

[RFC PATCH v2 12/26] KVM: arm64: Introduce a Hyp buddy page allocator

2021-01-08 Thread Quentin Perret
When memory protection is enabled, the hyp code will require a basic
form of memory management in order to allocate and free memory pages at
EL2. This is needed for various use-cases, including the creation of hyp
mappings or the allocation of stage 2 page tables.

To address these use-cases, introduce a simple memory allocator in the
hyp code. The allocator is designed as a conventional 'buddy allocator',
working at page granularity. It allows allocating and freeing
physically contiguous pages from memory 'pools', with a guaranteed order
alignment in the PA space. Each page in a memory pool is associated
with a struct hyp_page which holds the page's metadata, including its
refcount, as well as its current order, hence mimicking the kernel's
buddy system in the GFP infrastructure. The hyp_page metadata is made
accessible through a hyp_vmemmap, following the concept of
SPARSE_VMEMMAP in the kernel.
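
The heart of any buddy scheme is locating a page's buddy by flipping the
address bit corresponding to the current order; a minimal sketch,
modelled on this patch (details may differ):

  static struct hyp_page *__find_buddy(struct hyp_pool *pool,
                                       struct hyp_page *p,
                                       unsigned int order)
  {
          phys_addr_t addr = hyp_page_to_phys(p);

          addr ^= (PAGE_SIZE << order);   /* flip the 'order' bit */
          if (addr < pool->range_start || addr >= pool->range_end)
                  return NULL;            /* buddy outside this pool */

          return hyp_phys_to_page(addr);
  }

Freeing a page then repeatedly coalesces it with its buddy, as long as
the buddy is itself free and of the same order, before putting the merged
block on the matching free_area list.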

Signed-off-by: Quentin Perret 
---
 arch/arm64/kvm/hyp/include/nvhe/gfp.h|  32 
 arch/arm64/kvm/hyp/include/nvhe/memory.h |  25 +++
 arch/arm64/kvm/hyp/nvhe/Makefile |   2 +-
 arch/arm64/kvm/hyp/nvhe/page_alloc.c | 185 +++
 4 files changed, 243 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/kvm/hyp/include/nvhe/gfp.h
 create mode 100644 arch/arm64/kvm/hyp/nvhe/page_alloc.c

diff --git a/arch/arm64/kvm/hyp/include/nvhe/gfp.h 
b/arch/arm64/kvm/hyp/include/nvhe/gfp.h
new file mode 100644
index ..95587faee171
--- /dev/null
+++ b/arch/arm64/kvm/hyp/include/nvhe/gfp.h
@@ -0,0 +1,32 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef __KVM_HYP_GFP_H
+#define __KVM_HYP_GFP_H
+
+#include 
+
+#include 
+#include 
+
+#define HYP_MAX_ORDER  11U
+#define HYP_NO_ORDER   UINT_MAX
+
+struct hyp_pool {
+   hyp_spinlock_t lock;
+   struct list_head free_area[HYP_MAX_ORDER + 1];
+   phys_addr_t range_start;
+   phys_addr_t range_end;
+};
+
+/* GFP flags */
+#define HYP_GFP_NONE   0
+#define HYP_GFP_ZERO   1
+
+/* Allocation */
+void *hyp_alloc_pages(struct hyp_pool *pool, gfp_t mask, unsigned int order);
+void hyp_get_page(void *addr);
+void hyp_put_page(void *addr);
+
+/* Used pages cannot be freed */
+int hyp_pool_init(struct hyp_pool *pool, phys_addr_t phys,
+ unsigned int nr_pages, unsigned int used_pages);
+#endif /* __KVM_HYP_GFP_H */
diff --git a/arch/arm64/kvm/hyp/include/nvhe/memory.h 
b/arch/arm64/kvm/hyp/include/nvhe/memory.h
index 64c44c142c95..ed47674bc988 100644
--- a/arch/arm64/kvm/hyp/include/nvhe/memory.h
+++ b/arch/arm64/kvm/hyp/include/nvhe/memory.h
@@ -6,7 +6,17 @@
 
 #include 
 
+struct hyp_pool;
+struct hyp_page {
+   unsigned int refcount;
+   unsigned int order;
+   struct hyp_pool *pool;
+   struct list_head node;
+};
+
 extern s64 hyp_physvirt_offset;
+extern u64 __hyp_vmemmap;
+#define hyp_vmemmap ((struct hyp_page *)__hyp_vmemmap)
 
 #define __hyp_pa(virt) ((phys_addr_t)(virt) + hyp_physvirt_offset)
 #define __hyp_va(virt) ((void *)((phys_addr_t)(virt) - hyp_physvirt_offset))
@@ -21,4 +31,19 @@ static inline phys_addr_t hyp_virt_to_phys(void *addr)
return __hyp_pa(addr);
 }
 
+#define hyp_phys_to_pfn(phys)  ((phys) >> PAGE_SHIFT)
+#define hyp_phys_to_page(phys) (&hyp_vmemmap[hyp_phys_to_pfn(phys)])
+#define hyp_virt_to_page(virt) hyp_phys_to_page(__hyp_pa(virt))
+
+#define hyp_page_to_phys(page)  ((phys_addr_t)((page) - hyp_vmemmap) << PAGE_SHIFT)
+#define hyp_page_to_virt(page) __hyp_va(hyp_page_to_phys(page))
+#define hyp_page_to_pool(page) (((struct hyp_page *)page)->pool)
+
+static inline int hyp_page_count(void *addr)
+{
+   struct hyp_page *p = hyp_virt_to_page(addr);
+
+   return p->refcount;
+}
+
 #endif /* __KVM_HYP_MEMORY_H */
diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
index 33bd381d8f73..9e5eacfec6ec 100644
--- a/arch/arm64/kvm/hyp/nvhe/Makefile
+++ b/arch/arm64/kvm/hyp/nvhe/Makefile
@@ -10,7 +10,7 @@ lib-objs := clear_page.o copy_page.o memcpy.o memset.o
 lib-objs := $(addprefix ../../../lib/, $(lib-objs))
 
 obj-y := timer-sr.o sysreg-sr.o debug-sr.o switch.o tlb.o hyp-init.o host.o \
-hyp-main.o hyp-smp.o psci-relay.o early_alloc.o stub.o
+hyp-main.o hyp-smp.o psci-relay.o early_alloc.o stub.o page_alloc.o
 obj-y += ../vgic-v3-sr.o ../aarch32.o ../vgic-v2-cpuif-proxy.o ../entry.o \
 ../fpsimd.o ../hyp-entry.o ../exception.o
 obj-y += $(lib-objs)
diff --git a/arch/arm64/kvm/hyp/nvhe/page_alloc.c 
b/arch/arm64/kvm/hyp/nvhe/page_alloc.c
new file mode 100644
index ..6de6515f0432
--- /dev/null
+++ b/arch/arm64/kvm/hyp/nvhe/page_alloc.c
@@ -0,0 +1,185 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2020 Google LLC
+ * Author: Quentin Perret 
+ */
+
+#include 
+#include 
+
+u64 __hyp_vmemmap;
+
+/*
+ * Example buddy-tree for a 4-pages physically contiguous pool:
+ *
+ *                 o : Page 3
+ *                /
+ *               o-o : Page 2
+ *              /
+ *

[RFC PATCH v2 16/26] KVM: arm64: Prepare Hyp memory protection

2021-01-08 Thread Quentin Perret
When memory protection is enabled, the Hyp code needs the ability to
create and manage its own page-table. To do so, introduce a new set of
hypercalls to initialize Hyp memory protection.

During the init hcall, the hypervisor runs with the host-provided
page-table and uses the trivial early page allocator to create its own
set of page-tables, using a memory pool that was donated by the host.
Specifically, the hypervisor creates its own mappings for __hyp_text,
the Hyp memory pool, the __hyp_bss, the portion of hyp_vmemmap
corresponding to the Hyp pool, among other things. It then jumps back in
the idmap page, switches to use the newly-created pgd (instead of the
temporary one provided by the host) and then installs the full-fledged
buddy allocator, which is the only one used from then on.
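
In pseudo-code, the init hcall therefore performs roughly the following
sequence (a sketch only: __pkvm_init_switch_pgd()'s prototype is taken
from this patch, but recreate_hyp_mappings() and the local names are
illustrative):

  int __pkvm_init(phys_addr_t phys, unsigned long size, unsigned long nr_cpus,
                  unsigned long *per_cpu_base)
  {
          void *virt = hyp_phys_to_virt(phys);

          /* 1. Bootstrap with the trivial early (bump) allocator. */
          hyp_early_alloc_init(virt, size);

          /* 2. Build hyp-owned stage 1 tables: text, bss, vmemmap, pool. */
          recreate_hyp_mappings(phys, size, per_cpu_base);

          /*
           * 3. Jump via the idmap page, switch TTBR0_EL2 to the new pgd
           *    and stack, and continue in a function that installs the
           *    buddy allocator over what is left of the memory pool.
           */
          __pkvm_init_switch_pgd(phys, size, new_pgd_phys, new_stack,
                                 __pkvm_init_finalise);
          /* not reached: the continuation returns to the host itself */
  }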

Note that for the sake of simplifying the review, this only introduces
the code doing this operation, without actually being called by anything
yet. This will be done in a subsequent patch, which will introduce the
necessary host kernel changes.

Credits to Will for __pkvm_init_switch_pgd.

Co-authored-by: Will Deacon 
Signed-off-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/kvm_asm.h |   4 +
 arch/arm64/include/asm/kvm_host.h|   8 +
 arch/arm64/include/asm/kvm_hyp.h |   8 +
 arch/arm64/kernel/image-vars.h   |  19 +++
 arch/arm64/kvm/hyp/Makefile  |   2 +-
 arch/arm64/kvm/hyp/include/nvhe/memory.h |   6 +
 arch/arm64/kvm/hyp/include/nvhe/mm.h |  79 +
 arch/arm64/kvm/hyp/nvhe/Makefile |   4 +-
 arch/arm64/kvm/hyp/nvhe/hyp-init.S   |  31 
 arch/arm64/kvm/hyp/nvhe/hyp-main.c   |  42 +
 arch/arm64/kvm/hyp/nvhe/mm.c | 174 
 arch/arm64/kvm/hyp/nvhe/setup.c  | 196 +++
 arch/arm64/kvm/hyp/reserved_mem.c| 102 
 arch/arm64/kvm/mmu.c |   2 +-
 arch/arm64/mm/init.c |   3 +
 15 files changed, 676 insertions(+), 4 deletions(-)
 create mode 100644 arch/arm64/kvm/hyp/include/nvhe/mm.h
 create mode 100644 arch/arm64/kvm/hyp/nvhe/mm.c
 create mode 100644 arch/arm64/kvm/hyp/nvhe/setup.c
 create mode 100644 arch/arm64/kvm/hyp/reserved_mem.c

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 7ccf770c53d9..4fc27ac08836 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -57,6 +57,10 @@
 #define __KVM_HOST_SMCCC_FUNC___kvm_get_mdcr_el2   12
 #define __KVM_HOST_SMCCC_FUNC___vgic_v3_save_aprs  13
 #define __KVM_HOST_SMCCC_FUNC___vgic_v3_restore_aprs   14
+#define __KVM_HOST_SMCCC_FUNC___pkvm_init  15
+#define __KVM_HOST_SMCCC_FUNC___pkvm_create_mappings   16
+#define __KVM_HOST_SMCCC_FUNC___pkvm_create_private_mapping 17
+#define __KVM_HOST_SMCCC_FUNC___pkvm_cpu_set_vector 18
 
 #ifndef __ASSEMBLY__
 
diff --git a/arch/arm64/include/asm/kvm_host.h 
b/arch/arm64/include/asm/kvm_host.h
index 81212958ef55..9a2feb83eea0 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -777,4 +777,12 @@ bool kvm_arm_vcpu_is_finalized(struct kvm_vcpu *vcpu);
 #define kvm_vcpu_has_pmu(vcpu) \
(test_bit(KVM_ARM_VCPU_PMU_V3, (vcpu)->arch.features))
 
+#ifdef CONFIG_KVM
+extern phys_addr_t hyp_mem_base;
+extern phys_addr_t hyp_mem_size;
+void __init kvm_hyp_reserve(void);
+#else
+static inline void kvm_hyp_reserve(void) { }
+#endif
+
 #endif /* __ARM64_KVM_HOST_H__ */
diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
index c0450828378b..a0e113734b20 100644
--- a/arch/arm64/include/asm/kvm_hyp.h
+++ b/arch/arm64/include/asm/kvm_hyp.h
@@ -100,4 +100,12 @@ void __noreturn hyp_panic(void);
 void __noreturn __hyp_do_panic(bool restore_host, u64 spsr, u64 elr, u64 par);
 #endif
 
+#ifdef __KVM_NVHE_HYPERVISOR__
+void __pkvm_init_switch_pgd(phys_addr_t phys, unsigned long size,
+   phys_addr_t pgd, void *sp, void *cont_fn);
+int __pkvm_init(phys_addr_t phys, unsigned long size, unsigned long nr_cpus,
+   unsigned long *per_cpu_base);
+void __noreturn __host_enter(struct kvm_cpu_context *host_ctxt);
+#endif
+
 #endif /* __ARM64_KVM_HYP_H__ */
diff --git a/arch/arm64/kernel/image-vars.h b/arch/arm64/kernel/image-vars.h
index 43f3a1d6e92d..366d837f0d39 100644
--- a/arch/arm64/kernel/image-vars.h
+++ b/arch/arm64/kernel/image-vars.h
@@ -113,6 +113,25 @@ KVM_NVHE_ALIAS_HYP(__memcpy, __pi_memcpy);
 KVM_NVHE_ALIAS_HYP(__memset, __pi_memset);
 #endif
 
+/* Hypervisor VA size */
+KVM_NVHE_ALIAS(hyp_va_bits);
+
+/* Kernel memory sections */
+KVM_NVHE_ALIAS(__start_rodata);
+KVM_NVHE_ALIAS(__end_rodata);
+KVM_NVHE_ALIAS(__bss_start);
+KVM_NVHE_ALIAS(__bss_stop);
+
+/* Hyp memory sections */
+KVM_NVHE_ALIAS(__hyp_idmap_text_start);

[RFC PATCH v2 08/26] KVM: arm64: Make kvm_call_hyp() a function call at Hyp

2021-01-08 Thread Quentin Perret
kvm_call_hyp() has some logic to issue a function call or a hypercall
depending on the EL at which the kernel is running. However, all the code
compiled under __KVM_NVHE_HYPERVISOR__ is guaranteed to run only at EL2,
and in this case a simple function call is needed.

Add ifdefery to kvm_host.h to simplify kvm_call_hyp() in .hyp.text.
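
The effect at a call site in code shared between EL1 and EL2 is, for
instance (illustrative):

  static void flush_context(void)  /* shared helper, name illustrative */
  {
          kvm_call_hyp(__kvm_flush_vm_context);
  }

  /*
   * Built for EL1 (nVHE host): expands to an SMCCC HVC into EL2.
   * Built for EL2 (.hyp.text): now a plain call, __kvm_flush_vm_context().
   */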

Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/kvm_host.h | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_host.h 
b/arch/arm64/include/asm/kvm_host.h
index 8fcfab0c2567..81212958ef55 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -592,6 +592,7 @@ int kvm_test_age_hva(struct kvm *kvm, unsigned long hva);
 void kvm_arm_halt_guest(struct kvm *kvm);
 void kvm_arm_resume_guest(struct kvm *kvm);
 
+#ifndef __KVM_NVHE_HYPERVISOR__
#define kvm_call_hyp_nvhe(f, ...)  \
({  \
struct arm_smccc_res res;   \
@@ -631,6 +632,11 @@ void kvm_arm_resume_guest(struct kvm *kvm);
\
ret;\
})
+#else /* __KVM_NVHE_HYPERVISOR__ */
+#define kvm_call_hyp(f, ...) f(__VA_ARGS__)
+#define kvm_call_hyp_ret(f, ...) f(__VA_ARGS__)
+#define kvm_call_hyp_nvhe(f, ...) f(__VA_ARGS__)
+#endif /* __KVM_NVHE_HYPERVISOR__ */
 
 void force_vm_exit(const cpumask_t *mask);
 void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);
-- 
2.30.0.284.gd98b1dd5eaa7-goog

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[RFC PATCH v2 07/26] KVM: arm64: Introduce a BSS section for use at Hyp

2021-01-08 Thread Quentin Perret
Currently, the hyp code cannot make full use of a BSS, as the kernel's
.bss section is mapped read-only at EL2.

While this mapping could simply be changed to read-write, it would
intermingle the hyp and kernel state even more than they already are.
Instead, introduce a __hyp_bss section, that uses reserved pages, and
create the appropriate RW hyp mappings during KVM init.

Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/sections.h |  1 +
 arch/arm64/kernel/vmlinux.lds.S   |  7 +++
 arch/arm64/kvm/arm.c  | 11 +++
 arch/arm64/kvm/hyp/nvhe/hyp.lds.S |  1 +
 4 files changed, 20 insertions(+)

diff --git a/arch/arm64/include/asm/sections.h 
b/arch/arm64/include/asm/sections.h
index 8ff579361731..f58cf493de16 100644
--- a/arch/arm64/include/asm/sections.h
+++ b/arch/arm64/include/asm/sections.h
@@ -12,6 +12,7 @@ extern char __hibernate_exit_text_start[], 
__hibernate_exit_text_end[];
 extern char __hyp_idmap_text_start[], __hyp_idmap_text_end[];
 extern char __hyp_text_start[], __hyp_text_end[];
 extern char __hyp_data_ro_after_init_start[], __hyp_data_ro_after_init_end[];
+extern char __hyp_bss_start[], __hyp_bss_end[];
 extern char __idmap_text_start[], __idmap_text_end[];
 extern char __initdata_begin[], __initdata_end[];
 extern char __inittext_begin[], __inittext_end[];
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 43af13968dfd..3eca35d5a7cf 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -8,6 +8,13 @@
 #define RO_EXCEPTION_TABLE_ALIGN   8
 #define RUNTIME_DISCARD_EXIT
 
+#define BSS_FIRST_SECTIONS \
+   . = ALIGN(PAGE_SIZE);   \
+   __hyp_bss_start = .;\
+   *(.hyp.bss) \
+   . = ALIGN(PAGE_SIZE);   \
+   __hyp_bss_end = .;
+
 #include 
 #include 
 #include 
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 3ac0f3425833..51b53ca36dc5 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1770,7 +1770,18 @@ static int init_hyp_mode(void)
goto out_err;
}
 
+   /*
+* .hyp.bss is placed at the beginning of the .bss section, so map that
+* part RW, and the rest RO as the hyp shouldn't be touching it.
+*/
err = create_hyp_mappings(kvm_ksym_ref(__bss_start),
+ kvm_ksym_ref(__hyp_bss_end), PAGE_HYP);
+   if (err) {
+   kvm_err("Cannot map hyp bss section: %d\n", err);
+   goto out_err;
+   }
+
+   err = create_hyp_mappings(kvm_ksym_ref(__hyp_bss_end),
  kvm_ksym_ref(__bss_stop), PAGE_HYP_RO);
if (err) {
kvm_err("Cannot map bss section\n");
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp.lds.S 
b/arch/arm64/kvm/hyp/nvhe/hyp.lds.S
index 5d76ff2ba63e..dc281d90063e 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp.lds.S
+++ b/arch/arm64/kvm/hyp/nvhe/hyp.lds.S
@@ -17,4 +17,5 @@ SECTIONS {
PERCPU_INPUT(L1_CACHE_BYTES)
}
HYP_SECTION(.data..ro_after_init)
+   HYP_SECTION(.bss)
 }
-- 
2.30.0.284.gd98b1dd5eaa7-goog

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[RFC PATCH v2 09/26] KVM: arm64: Allow using kvm_nvhe_sym() in hyp code

2021-01-08 Thread Quentin Perret
In order to allow code shared by the host and the hyp to be used in
static inline library functions, allow the use of kvm_nvhe_sym() at EL2
by defaulting to the raw symbol name.

Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/hyp_image.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/arch/arm64/include/asm/hyp_image.h 
b/arch/arm64/include/asm/hyp_image.h
index e06842756051..fb16e1018ea9 100644
--- a/arch/arm64/include/asm/hyp_image.h
+++ b/arch/arm64/include/asm/hyp_image.h
@@ -7,11 +7,15 @@
 #ifndef __ARM64_HYP_IMAGE_H__
 #define __ARM64_HYP_IMAGE_H__
 
+#ifndef __KVM_NVHE_HYPERVISOR__
 /*
  * KVM nVHE code has its own symbol namespace prefixed with __kvm_nvhe_,
  * to separate it from the kernel proper.
  */
 #define kvm_nvhe_sym(sym)  __kvm_nvhe_##sym
+#else
+#define kvm_nvhe_sym(sym)  sym
+#endif
 
 #ifdef LINKER_SCRIPT
 
-- 
2.30.0.284.gd98b1dd5eaa7-goog

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[RFC PATCH v2 04/26] KVM: arm64: Initialize kvm_nvhe_init_params early

2021-01-08 Thread Quentin Perret
Move the initialization of kvm_nvhe_init_params into a dedicated function
that is run early, and only once during KVM init, rather than every time
the KVM vectors are set and reset.

This also opens the opportunity for the hypervisor to change the init
structs during boot, hence simplifying the replacement of host-provided
page-tables and stacks by the ones the hypervisor will create for
itself.

Signed-off-by: Quentin Perret 
---
 arch/arm64/kvm/arm.c | 28 
 1 file changed, 20 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 04c44853b103..3ac0f3425833 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1383,21 +1383,17 @@ static int kvm_init_vector_slots(void)
return 0;
 }
 
-static void cpu_init_hyp_mode(void)
+static void cpu_prepare_hyp_mode(int cpu)
 {
-   struct kvm_nvhe_init_params *params = this_cpu_ptr_nvhe_sym(kvm_init_params);
-   struct arm_smccc_res res;
+   struct kvm_nvhe_init_params *params = per_cpu_ptr_nvhe_sym(kvm_init_params, cpu);
unsigned long tcr;
 
-   /* Switch from the HYP stub to our own HYP init vector */
-   __hyp_set_vectors(kvm_get_idmap_vector());
-
/*
 * Calculate the raw per-cpu offset without a translation from the
 * kernel's mapping to the linear mapping, and store it in tpidr_el2
 * so that we can use adr_l to access per-cpu variables in EL2.
 */
-   params->tpidr_el2 = (unsigned long)this_cpu_ptr_nvhe_sym(__per_cpu_start) -
+   params->tpidr_el2 = (unsigned long)per_cpu_ptr_nvhe_sym(__per_cpu_start, cpu) -
		    (unsigned long)kvm_ksym_ref(CHOOSE_NVHE_SYM(__per_cpu_start));
 
params->mair_el2 = read_sysreg(mair_el1);
@@ -1421,7 +1417,7 @@ static void cpu_init_hyp_mode(void)
tcr |= (idmap_t0sz & GENMASK(TCR_TxSZ_WIDTH - 1, 0)) << TCR_T0SZ_OFFSET;
params->tcr_el2 = tcr;
 
-   params->stack_hyp_va = kern_hyp_va(__this_cpu_read(kvm_arm_hyp_stack_page) + PAGE_SIZE);
+   params->stack_hyp_va = kern_hyp_va(per_cpu(kvm_arm_hyp_stack_page, cpu) + PAGE_SIZE);
params->pgd_pa = kvm_mmu_get_httbr();
 
/*
@@ -1429,6 +1425,15 @@ static void cpu_init_hyp_mode(void)
 * be read while the MMU is off.
 */
kvm_flush_dcache_to_poc(params, sizeof(*params));
+}
+
+static void cpu_init_hyp_mode(void)
+{
+   struct kvm_nvhe_init_params *params;
+   struct arm_smccc_res res;
+
+   /* Switch from the HYP stub to our own HYP init vector */
+   __hyp_set_vectors(kvm_get_idmap_vector());
 
/*
 * Call initialization code, and switch to the full blown HYP code.
@@ -1437,6 +1442,7 @@ static void cpu_init_hyp_mode(void)
 * cpus_have_const_cap() wrapper.
 */
BUG_ON(!system_capabilities_finalized());
+   params = this_cpu_ptr_nvhe_sym(kvm_init_params);
	arm_smccc_1_1_hvc(KVM_HOST_SMCCC_FUNC(__kvm_hyp_init), virt_to_phys(params), &res);
WARN_ON(res.a0 != SMCCC_RET_SUCCESS);
 
@@ -1807,6 +1813,12 @@ static int init_hyp_mode(void)
goto out_err;
}
 
+   /*
+* Prepare the CPU initialization parameters
+*/
+   for_each_possible_cpu(cpu)
+   cpu_prepare_hyp_mode(cpu);
+
return 0;
 
 out_err:
-- 
2.30.0.284.gd98b1dd5eaa7-goog

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[RFC PATCH v2 15/26] of/fdt: Introduce early_init_dt_add_memory_hyp()

2021-01-08 Thread Quentin Perret
Introduce early_init_dt_add_memory_hyp() to allow KVM to keep a copy
of the memory regions parsed from DT. This will be needed in the context
of the protected nVHE feature of KVM/arm64 where the code running at EL2
will be cleanly separated from the host kernel during boot, and will
need its own representation of memory.
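
The arm64 side is expected to record each region in a hyp-private table,
along these lines (a sketch under that assumption; the array and limit
names are illustrative):

  void __init early_init_dt_add_memory_hyp(u64 base, u64 size)
  {
          if (hyp_memblock_nr >= HYP_MEMBLOCK_REGIONS)
                  return;  /* table full: drop the region */

          hyp_memory[hyp_memblock_nr].start = base;
          hyp_memory[hyp_memblock_nr].end = base + size;
          hyp_memblock_nr++;
  }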

Signed-off-by: Quentin Perret 
---
 drivers/of/fdt.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
index 4602e467ca8b..af2b5a09c5b4 100644
--- a/drivers/of/fdt.c
+++ b/drivers/of/fdt.c
@@ -1099,6 +1099,10 @@ int __init early_init_dt_scan_chosen(unsigned long node, 
const char *uname,
 #define MAX_MEMBLOCK_ADDR  ((phys_addr_t)~0)
 #endif
 
+void __init __weak early_init_dt_add_memory_hyp(u64 base, u64 size)
+{
+}
+
 void __init __weak early_init_dt_add_memory_arch(u64 base, u64 size)
 {
const u64 phys_offset = MIN_MEMBLOCK_ADDR;
@@ -1139,6 +1143,7 @@ void __init __weak early_init_dt_add_memory_arch(u64 
base, u64 size)
base = phys_offset;
}
memblock_add(base, size);
+   early_init_dt_add_memory_hyp(base, size);
 }
 
 int __init __weak early_init_dt_mark_hotplug_memory_arch(u64 base, u64 size)
-- 
2.30.0.284.gd98b1dd5eaa7-goog

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[RFC PATCH v2 06/26] KVM: arm64: Factor memory allocation out of pgtable.c

2021-01-08 Thread Quentin Perret
In preparation for enabling the creation of page-tables at EL2, factor
all memory allocation out of the page-table code, hence making it
re-usable with any compatible memory allocator.

No functional changes intended.
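
With this in place, each user supplies its own allocator behind the
common interface; for instance, the host side might wire it up along
these lines (a sketch; the helper names are illustrative):

  static void *host_zalloc_page(void *arg)
  {
          return (void *)__get_free_page(GFP_KERNEL | __GFP_ZERO);
  }

  static void *host_va(phys_addr_t phys)
  {
          return __va(phys);
  }

  static phys_addr_t host_pa(void *addr)
  {
          return __pa(addr);
  }

  static struct kvm_pgtable_mm_ops host_mm_ops = {
          .zalloc_page    = host_zalloc_page,
          .phys_to_virt   = host_va,
          .virt_to_phys   = host_pa,
  };

while the EL2 code can later plug its early and buddy allocators into the
same callbacks.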

Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/kvm_pgtable.h | 32 +-
 arch/arm64/kvm/hyp/pgtable.c | 90 +---
 arch/arm64/kvm/mmu.c | 70 +-
 3 files changed, 154 insertions(+), 38 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_pgtable.h 
b/arch/arm64/include/asm/kvm_pgtable.h
index 52ab38db04c7..45acc9dc6c45 100644
--- a/arch/arm64/include/asm/kvm_pgtable.h
+++ b/arch/arm64/include/asm/kvm_pgtable.h
@@ -13,17 +13,41 @@
 
 typedef u64 kvm_pte_t;
 
+/**
+ * struct kvm_pgtable_mm_ops - Memory management callbacks.
+ * @zalloc_page:   Allocate a zeroed memory page.
+ * @zalloc_pages_exact:Allocate an exact number of zeroed memory pages.
+ * @free_pages_exact:  Free an exact number of memory pages.
+ * @get_page:  Increment the refcount on a page.
+ * @put_page:  Decrement the refcount on a page.
+ * @page_count:Returns the refcount of a page.
+ * @phys_to_virt:  Convert a physical address into a virtual address.
+ * @virt_to_phys:  Convert a virtual address into a physical address.
+ */
+struct kvm_pgtable_mm_ops {
+   void*   (*zalloc_page)(void *arg);
+   void*   (*zalloc_pages_exact)(size_t size);
+   void(*free_pages_exact)(void *addr, size_t size);
+   void(*get_page)(void *addr);
+   void(*put_page)(void *addr);
+   int (*page_count)(void *addr);
+   void*   (*phys_to_virt)(phys_addr_t phys);
+   phys_addr_t (*virt_to_phys)(void *addr);
+};
+
 /**
  * struct kvm_pgtable - KVM page-table.
  * @ia_bits:   Maximum input address size, in bits.
  * @start_level:   Level at which the page-table walk starts.
  * @pgd:   Pointer to the first top-level entry of the page-table.
+ * @mm_ops:Memory management callbacks.
  * @mmu:   Stage-2 KVM MMU struct. Unused for stage-1 page-tables.
  */
 struct kvm_pgtable {
u32 ia_bits;
u32 start_level;
kvm_pte_t   *pgd;
+   struct kvm_pgtable_mm_ops   *mm_ops;
 
/* Stage-2 only */
struct kvm_s2_mmu   *mmu;
@@ -86,10 +110,12 @@ struct kvm_pgtable_walker {
  * kvm_pgtable_hyp_init() - Initialise a hypervisor stage-1 page-table.
  * @pgt:   Uninitialised page-table structure to initialise.
  * @va_bits:   Maximum virtual address bits.
+ * @mm_ops:Memory management callbacks.
  *
  * Return: 0 on success, negative error code on failure.
  */
-int kvm_pgtable_hyp_init(struct kvm_pgtable *pgt, u32 va_bits);
+int kvm_pgtable_hyp_init(struct kvm_pgtable *pgt, u32 va_bits,
+struct kvm_pgtable_mm_ops *mm_ops);
 
 /**
  * kvm_pgtable_hyp_destroy() - Destroy an unused hypervisor stage-1 page-table.
@@ -126,10 +152,12 @@ int kvm_pgtable_hyp_map(struct kvm_pgtable *pgt, u64 
addr, u64 size, u64 phys,
  * kvm_pgtable_stage2_init() - Initialise a guest stage-2 page-table.
  * @pgt:   Uninitialised page-table structure to initialise.
  * @kvm:   KVM structure representing the guest virtual machine.
+ * @mm_ops:Memory management callbacks.
  *
  * Return: 0 on success, negative error code on failure.
  */
-int kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm *kvm);
+int kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm *kvm,
+   struct kvm_pgtable_mm_ops *mm_ops);
 
 /**
  * kvm_pgtable_stage2_destroy() - Destroy an unused guest stage-2 page-table.
diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index d7122c5eac24..61a8a34ddfdb 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -148,9 +148,9 @@ static kvm_pte_t kvm_phys_to_pte(u64 pa)
return pte;
 }
 
-static kvm_pte_t *kvm_pte_follow(kvm_pte_t pte)
+static kvm_pte_t *kvm_pte_follow(kvm_pte_t pte, struct kvm_pgtable_mm_ops *mm_ops)
 {
-   return __va(kvm_pte_to_phys(pte));
+   return mm_ops->phys_to_virt(kvm_pte_to_phys(pte));
 }
 
 static void kvm_set_invalid_pte(kvm_pte_t *ptep)
@@ -159,9 +159,10 @@ static void kvm_set_invalid_pte(kvm_pte_t *ptep)
WRITE_ONCE(*ptep, pte & ~KVM_PTE_VALID);
 }
 
-static void kvm_set_table_pte(kvm_pte_t *ptep, kvm_pte_t *childp)
+static void kvm_set_table_pte(kvm_pte_t *ptep, kvm_pte_t *childp,
+ struct kvm_pgtable_mm_ops *mm_ops)
 {
-   kvm_pte_t old = *ptep, pte = kvm_phys_to_pte(__pa(childp));
+   kvm_pte_t old = *ptep, pte = kvm_phys_to_pte(mm_ops->virt_to_phys(childp));
 
pte |= FIELD_PREP(KVM_PTE_TYPE, 

[RFC PATCH v2 05/26] KVM: arm64: Avoid free_page() in page-table allocator

2021-01-08 Thread Quentin Perret
Currently, the KVM page-table allocator uses a mix of put_page() and
free_page() calls depending on the context even though page-allocation
is always achieved using variants of __get_free_page().

Make the code consistent by using put_page() throughout, and reduce the
memory management API surface used by the page-table code. This will
ease factoring out page allocation from pgtable.c, which is a
pre-requisite to creating page-tables at EL2.

Signed-off-by: Quentin Perret 
---
 arch/arm64/kvm/hyp/pgtable.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index 0271b4a3b9fe..d7122c5eac24 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -410,7 +410,7 @@ int kvm_pgtable_hyp_init(struct kvm_pgtable *pgt, u32 
va_bits)
 static int hyp_free_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep,
   enum kvm_pgtable_walk_flags flag, void * const arg)
 {
-   free_page((unsigned long)kvm_pte_follow(*ptep));
+   put_page(virt_to_page(kvm_pte_follow(*ptep)));
return 0;
 }
 
@@ -422,7 +422,7 @@ void kvm_pgtable_hyp_destroy(struct kvm_pgtable *pgt)
};
 
WARN_ON(kvm_pgtable_walk(pgt, 0, BIT(pgt->ia_bits), ));
-   free_page((unsigned long)pgt->pgd);
+   put_page(virt_to_page(pgt->pgd));
pgt->pgd = NULL;
 }
 
@@ -551,7 +551,7 @@ static int stage2_map_walk_table_post(u64 addr, u64 end, 
u32 level,
if (!data->anchor)
return 0;
 
-   free_page((unsigned long)kvm_pte_follow(*ptep));
+   put_page(virt_to_page(kvm_pte_follow(*ptep)));
put_page(virt_to_page(ptep));
 
if (data->anchor == ptep) {
@@ -674,7 +674,7 @@ static int stage2_unmap_walker(u64 addr, u64 end, u32 
level, kvm_pte_t *ptep,
}
 
if (childp)
-   free_page((unsigned long)childp);
+   put_page(virt_to_page(childp));
 
return 0;
 }
@@ -871,7 +871,7 @@ static int stage2_free_walker(u64 addr, u64 end, u32 level, 
kvm_pte_t *ptep,
put_page(virt_to_page(ptep));
 
if (kvm_pte_table(pte, level))
-   free_page((unsigned long)kvm_pte_follow(pte));
+   put_page(virt_to_page(kvm_pte_follow(pte)));
 
return 0;
 }
-- 
2.30.0.284.gd98b1dd5eaa7-goog

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[RFC PATCH v2 20/26] KVM: arm64: Set host stage 2 using kvm_nvhe_init_params

2021-01-08 Thread Quentin Perret
Move the registers relevant to host stage 2 enablement to
kvm_nvhe_init_params to prepare the ground for enabling it in later
patches.

Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/kvm_asm.h   | 3 +++
 arch/arm64/kernel/asm-offsets.c| 3 +++
 arch/arm64/kvm/arm.c   | 5 +
 arch/arm64/kvm/hyp/nvhe/hyp-init.S | 9 +
 arch/arm64/kvm/hyp/nvhe/switch.c   | 5 +
 5 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 4fc27ac08836..5354b05eb9e2 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -158,6 +158,9 @@ struct kvm_nvhe_init_params {
unsigned long tpidr_el2;
unsigned long stack_hyp_va;
phys_addr_t pgd_pa;
+   unsigned long hcr_el2;
+   unsigned long vttbr;
+   unsigned long vtcr;
 };
 
 /* Translate a kernel address @ptr into its equivalent linear mapping */
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index 5e82488f1b82..9cf7736e31db 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -114,6 +114,9 @@ int main(void)
   DEFINE(NVHE_INIT_TPIDR_EL2,  offsetof(struct kvm_nvhe_init_params, 
tpidr_el2));
   DEFINE(NVHE_INIT_STACK_HYP_VA,   offsetof(struct kvm_nvhe_init_params, 
stack_hyp_va));
   DEFINE(NVHE_INIT_PGD_PA, offsetof(struct kvm_nvhe_init_params, pgd_pa));
+  DEFINE(NVHE_INIT_HCR_EL2,offsetof(struct kvm_nvhe_init_params, hcr_el2));
+  DEFINE(NVHE_INIT_VTTBR,  offsetof(struct kvm_nvhe_init_params, vttbr));
+  DEFINE(NVHE_INIT_VTCR,   offsetof(struct kvm_nvhe_init_params, vtcr));
 #endif
 #ifdef CONFIG_CPU_PM
   DEFINE(CPU_CTX_SP,   offsetof(struct cpu_suspend_ctx, sp));
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index e524682c2ccf..00cee4489cd7 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1413,6 +1413,11 @@ static void cpu_prepare_hyp_mode(int cpu)
 
	params->stack_hyp_va = kern_hyp_va(per_cpu(kvm_arm_hyp_stack_page, cpu) + PAGE_SIZE);
params->pgd_pa = kvm_mmu_get_httbr();
+   if (is_protected_kvm_enabled())
+   params->hcr_el2 = HCR_HOST_NVHE_PROTECTED_FLAGS;
+   else
+   params->hcr_el2 = HCR_HOST_NVHE_FLAGS;
+   params->vttbr = params->vtcr = 0;
 
/*
 * Flush the init params from the data cache because the struct will
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-init.S 
b/arch/arm64/kvm/hyp/nvhe/hyp-init.S
index ad943966c39f..b1341bb4b453 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-init.S
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-init.S
@@ -102,6 +102,15 @@ alternative_else_nop_endif
ldr x1, [x0, #NVHE_INIT_MAIR_EL2]
msr mair_el2, x1
 
+   ldr x1, [x0, #NVHE_INIT_HCR_EL2]
+   msr hcr_el2, x1
+
+   ldr x1, [x0, #NVHE_INIT_VTTBR]
+   msr vttbr_el2, x1
+
+   ldr x1, [x0, #NVHE_INIT_VTCR]
+   msr vtcr_el2, x1
+
ldr x1, [x0, #NVHE_INIT_PGD_PA]
phys_to_ttbr x2, x1
 alternative_if ARM64_HAS_CNP
diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c
index f3d0e9eca56c..979a76cdf9fb 100644
--- a/arch/arm64/kvm/hyp/nvhe/switch.c
+++ b/arch/arm64/kvm/hyp/nvhe/switch.c
@@ -97,10 +97,7 @@ static void __deactivate_traps(struct kvm_vcpu *vcpu)
mdcr_el2 |= MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT;
 
write_sysreg(mdcr_el2, mdcr_el2);
-   if (is_protected_kvm_enabled())
-   write_sysreg(HCR_HOST_NVHE_PROTECTED_FLAGS, hcr_el2);
-   else
-   write_sysreg(HCR_HOST_NVHE_FLAGS, hcr_el2);
+   write_sysreg(this_cpu_ptr(&kvm_init_params)->hcr_el2, hcr_el2);
write_sysreg(CPTR_EL2_DEFAULT, cptr_el2);
write_sysreg(__kvm_hyp_host_vector, vbar_el2);
 }
-- 
2.30.0.284.gd98b1dd5eaa7-goog

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[RFC PATCH v2 00/26] KVM/arm64: A stage 2 for the host

2021-01-08 Thread Quentin Perret
Hi all,

This is the v2 of the series previously posted here:

  https://lore.kernel.org/kvmarm/20201117181607.1761516-1-qper...@google.com/

This basically allows us to wrap the host with a stage 2 when running in
nVHE, hence paving the way for protecting guest memory from the host in
the future (among other use-cases). For more details about the
motivation and the design angle taken here, I would recommend to have a
look at the cover letter of v1, and/or to watch these presentations at
LPC [1] and KVM forum 2020 [2].

In short, the changes since v1 include:

 - Renamed most pkvm-specific pgtable functions as pkvm_* to avoid
   confusion with the host's (Fuad)

 - Added an IC flush when switching pgtables (Fuad, Mark)

 - Cleaned-up the PI aliasing in image-vars.h (David)

 - Added a TLB flush when enabling the host stage 2 to avoid stale TLBs
   from bootloader

 - Fixed the early memory reservation by using NR_CPUS instead of
   num_possible_cpus() (which is always 1 that early)

 - Added missing preempt_{dis,en}able() guards in
   kvm_hyp_enable_protection()

 - Rebased on latest kvmarm/next

And if you'd like a branch that has all the goodies, there it is:

https://android-kvm.googlesource.com/linux qperret/host-stage2-v2

Thanks!
Quentin

[1] https://youtu.be/54q6RzS9BpQ?t=10859
[2] 
https://kvmforum2020.sched.com/event/eE24/virtualization-for-the-masses-exposing-kvm-on-android-will-deacon-google

Quentin Perret (23):
  KVM: arm64: Initialize kvm_nvhe_init_params early
  KVM: arm64: Avoid free_page() in page-table allocator
  KVM: arm64: Factor memory allocation out of pgtable.c
  KVM: arm64: Introduce a BSS section for use at Hyp
  KVM: arm64: Make kvm_call_hyp() a function call at Hyp
  KVM: arm64: Allow using kvm_nvhe_sym() in hyp code
  KVM: arm64: Introduce an early Hyp page allocator
  KVM: arm64: Stub CONFIG_DEBUG_LIST at Hyp
  KVM: arm64: Introduce a Hyp buddy page allocator
  KVM: arm64: Enable access to sanitized CPU features at EL2
  KVM: arm64: Factor out vector address calculation
  of/fdt: Introduce early_init_dt_add_memory_hyp()
  KVM: arm64: Prepare Hyp memory protection
  KVM: arm64: Elevate Hyp mappings creation at EL2
  KVM: arm64: Use kvm_arch for stage 2 pgtable
  KVM: arm64: Use kvm_arch in kvm_s2_mmu
  KVM: arm64: Set host stage 2 using kvm_nvhe_init_params
  KVM: arm64: Refactor kvm_arm_setup_stage2()
  KVM: arm64: Refactor __load_guest_stage2()
  KVM: arm64: Refactor __populate_fault_info()
  KVM: arm64: Make memcache anonymous in pgtable allocator
  KVM: arm64: Reserve memory for host stage 2
  KVM: arm64: Wrap the host with a stage 2

Will Deacon (3):
  arm64: lib: Annotate {clear,copy}_page() as position-independent
  KVM: arm64: Link position-independent string routines into .hyp.text
  arm64: kvm: Add standalone ticket spinlock implementation for use at
hyp

 arch/arm64/include/asm/cpufeature.h   |   1 +
 arch/arm64/include/asm/hyp_image.h|   7 +
 arch/arm64/include/asm/kvm_asm.h  |   7 +
 arch/arm64/include/asm/kvm_cpufeature.h   |  19 ++
 arch/arm64/include/asm/kvm_host.h |  16 +-
 arch/arm64/include/asm/kvm_hyp.h  |   8 +
 arch/arm64/include/asm/kvm_mmu.h  |  69 +-
 arch/arm64/include/asm/kvm_pgtable.h  |  41 +++-
 arch/arm64/include/asm/sections.h |   1 +
 arch/arm64/kernel/asm-offsets.c   |   3 +
 arch/arm64/kernel/cpufeature.c|  12 +
 arch/arm64/kernel/image-vars.h|  33 +++
 arch/arm64/kernel/vmlinux.lds.S   |   7 +
 arch/arm64/kvm/arm.c  | 144 ++--
 arch/arm64/kvm/hyp/Makefile   |   2 +-
 arch/arm64/kvm/hyp/include/hyp/switch.h   |  36 +--
 arch/arm64/kvm/hyp/include/nvhe/early_alloc.h |  14 ++
 arch/arm64/kvm/hyp/include/nvhe/gfp.h |  32 +++
 arch/arm64/kvm/hyp/include/nvhe/mem_protect.h |  33 +++
 arch/arm64/kvm/hyp/include/nvhe/memory.h  |  55 +
 arch/arm64/kvm/hyp/include/nvhe/mm.h  | 107 +
 arch/arm64/kvm/hyp/include/nvhe/spinlock.h|  92 
 arch/arm64/kvm/hyp/nvhe/Makefile  |   9 +-
 arch/arm64/kvm/hyp/nvhe/cache.S   |  13 ++
 arch/arm64/kvm/hyp/nvhe/cpufeature.c  |   8 +
 arch/arm64/kvm/hyp/nvhe/early_alloc.c |  60 +
 arch/arm64/kvm/hyp/nvhe/hyp-init.S|  41 
 arch/arm64/kvm/hyp/nvhe/hyp-main.c|  48 
 arch/arm64/kvm/hyp/nvhe/hyp.lds.S |   1 +
 arch/arm64/kvm/hyp/nvhe/mem_protect.c | 191 
 arch/arm64/kvm/hyp/nvhe/mm.c  | 174 ++
 arch/arm64/kvm/hyp/nvhe/page_alloc.c  | 185 +++
 arch/arm64/kvm/hyp/nvhe/psci-relay.c  |   4 +-
 arch/arm64/kvm/hyp/nvhe/setup.c   | 214 ++
 arch/arm64/kvm/hyp/nvhe/stub.c|  22 ++
 arch/arm64/kvm/hyp/nvhe/switch.c  |  12 +-
 arch/arm64/kvm/hyp/nvhe/tlb.c

[RFC PATCH v2 19/26] KVM: arm64: Use kvm_arch in kvm_s2_mmu

2021-01-08 Thread Quentin Perret
In order to make use of the stage 2 pgtable code for the host stage 2,
change kvm_s2_mmu to use a kvm_arch pointer in lieu of the kvm pointer,
as the host will have the former but not the latter.

Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/kvm_host.h | 2 +-
 arch/arm64/include/asm/kvm_mmu.h  | 7 ++-
 arch/arm64/kvm/mmu.c  | 8 
 3 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_host.h 
b/arch/arm64/include/asm/kvm_host.h
index 9a2feb83eea0..9d59bebcc5ef 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -95,7 +95,7 @@ struct kvm_s2_mmu {
/* The last vcpu id that ran on each physical CPU */
int __percpu *last_vcpu_ran;
 
-   struct kvm *kvm;
+   struct kvm_arch *arch;
 };
 
 struct kvm_arch_memory_slot {
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 6c8466a042a9..662f0415344e 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -299,7 +299,7 @@ static __always_inline u64 kvm_get_vttbr(struct kvm_s2_mmu 
*mmu)
  */
 static __always_inline void __load_guest_stage2(struct kvm_s2_mmu *mmu)
 {
-   write_sysreg(kern_hyp_va(mmu->kvm)->arch.vtcr, vtcr_el2);
+   write_sysreg(kern_hyp_va(mmu->arch)->vtcr, vtcr_el2);
write_sysreg(kvm_get_vttbr(mmu), vttbr_el2);
 
/*
@@ -309,5 +309,10 @@ static __always_inline void __load_guest_stage2(struct 
kvm_s2_mmu *mmu)
 */
asm(ALTERNATIVE("nop", "isb", ARM64_WORKAROUND_SPECULATIVE_AT));
 }
+
+static inline struct kvm *kvm_s2_mmu_to_kvm(struct kvm_s2_mmu *mmu)
+{
+   return container_of(mmu->arch, struct kvm, arch);
+}
 #endif /* __ASSEMBLY__ */
 #endif /* __ARM64_KVM_MMU_H__ */
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 7e6263103943..6f9bf71722bd 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -169,7 +169,7 @@ static void *kvm_host_va(phys_addr_t phys)
 static void __unmap_stage2_range(struct kvm_s2_mmu *mmu, phys_addr_t start, 
u64 size,
 bool may_block)
 {
-   struct kvm *kvm = mmu->kvm;
+   struct kvm *kvm = kvm_s2_mmu_to_kvm(mmu);
phys_addr_t end = start + size;
 
	assert_spin_locked(&kvm->mmu_lock);
@@ -474,7 +474,7 @@ int kvm_init_stage2_mmu(struct kvm *kvm, struct kvm_s2_mmu 
*mmu)
for_each_possible_cpu(cpu)
*per_cpu_ptr(mmu->last_vcpu_ran, cpu) = -1;
 
-   mmu->kvm = kvm;
-   mmu->arch = &kvm->arch;
mmu->pgt = pgt;
mmu->pgd_phys = __pa(pgt->pgd);
mmu->vmid.vmid_gen = 0;
@@ -556,7 +556,7 @@ void stage2_unmap_vm(struct kvm *kvm)
 
 void kvm_free_stage2_pgd(struct kvm_s2_mmu *mmu)
 {
-   struct kvm *kvm = mmu->kvm;
+   struct kvm *kvm = kvm_s2_mmu_to_kvm(mmu);
struct kvm_pgtable *pgt = NULL;
 
	spin_lock(&kvm->mmu_lock);
@@ -625,7 +625,7 @@ int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t 
guest_ipa,
  */
 static void stage2_wp_range(struct kvm_s2_mmu *mmu, phys_addr_t addr, 
phys_addr_t end)
 {
-   struct kvm *kvm = mmu->kvm;
+   struct kvm *kvm = kvm_s2_mmu_to_kvm(mmu);
	stage2_apply_range_resched(kvm, addr, end, kvm_pgtable_stage2_wrprotect);
 }
 
-- 
2.30.0.284.gd98b1dd5eaa7-goog

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH] KVM: arm64: Compute TPIDR_EL2 ignoring MTE tag

2021-01-08 Thread Steven Price
KASAN in HW_TAGS mode will store MTE tags in the top byte of the
pointer. When computing the offset for TPIDR_EL2 we don't want anything
in the top byte, so remove the tag to ensure the computation is correct
no matter what the tag value is.
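
To illustrate (addresses made up): with HW tags enabled, the per-cpu
pointer carries a random tag in bits [63:56], which would otherwise leak
into the offset arithmetic:

  void *p = this_cpu_ptr_nvhe_sym(__per_cpu_start);
  /* e.g. p == 0xf4ff800012345000, tag 0xf4 in the top byte          */

  p = kasan_reset_tag(p);
  /* p == 0xffff800012345000: the plain kernel VA, so subtracting    */
  /* kvm_ksym_ref(...) now yields the true tpidr_el2 offset          */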

Fixes: 94ab5b61ee16 ("kasan, arm64: enable CONFIG_KASAN_HW_TAGS")
Signed-off-by: Steven Price 
---
Without this fix I can't boot a config with KASAN_HW_TAGS and KVM on an
MTE enabled host. I'm unsure if this should really be in
this_cpu_ptr_nvhe_sym().

 arch/arm64/kvm/arm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 6e637d2b4cfb..3783082148bc 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1403,7 +1403,7 @@ static void cpu_init_hyp_mode(void)
 * kernel's mapping to the linear mapping, and store it in tpidr_el2
 * so that we can use adr_l to access per-cpu variables in EL2.
 */
-   params->tpidr_el2 = (unsigned long)this_cpu_ptr_nvhe_sym(__per_cpu_start) -
+   params->tpidr_el2 = (unsigned long)kasan_reset_tag(this_cpu_ptr_nvhe_sym(__per_cpu_start)) -
    (unsigned long)kvm_ksym_ref(CHOOSE_NVHE_SYM(__per_cpu_start));
 
params->mair_el2 = read_sysreg(mair_el1);
-- 
2.20.1

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


RE: [PATCH v11 12/13] vfio/pci: Register a DMA fault response region

2021-01-08 Thread Shameerali Kolothum Thodi
Hi Eric,

> -Original Message-
> From: Eric Auger [mailto:eric.au...@redhat.com]
> Sent: 16 November 2020 11:00
> To: eric.auger@gmail.com; eric.au...@redhat.com;
> io...@lists.linux-foundation.org; linux-ker...@vger.kernel.org;
> k...@vger.kernel.org; kvmarm@lists.cs.columbia.edu; w...@kernel.org;
> j...@8bytes.org; m...@kernel.org; robin.mur...@arm.com;
> alex.william...@redhat.com
> Cc: jean-phili...@linaro.org; zhangfei@linaro.org;
> zhangfei@gmail.com; vivek.gau...@arm.com; Shameerali Kolothum
> Thodi ;
> jacob.jun@linux.intel.com; yi.l@intel.com; t...@semihalf.com;
> nicoleots...@gmail.com; yuzenghui 
> Subject: [PATCH v11 12/13] vfio/pci: Register a DMA fault response region
> 
> In preparation for vSVA, let's register a DMA fault response region,
> where userspace will push the page responses and increment the
> head of the buffer. The kernel will pop those responses and inject them
> on the IOMMU side.
> 
> Signed-off-by: Eric Auger 
> ---
>  drivers/vfio/pci/vfio_pci.c | 114 +---
>  drivers/vfio/pci/vfio_pci_private.h |   5 ++
>  drivers/vfio/pci/vfio_pci_rdwr.c|  39 ++
>  include/uapi/linux/vfio.h   |  32 
>  4 files changed, 181 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
> index 65a83fd0e8c0..e9a904ce3f0d 100644
> --- a/drivers/vfio/pci/vfio_pci.c
> +++ b/drivers/vfio/pci/vfio_pci.c
> @@ -318,9 +318,20 @@ static void vfio_pci_dma_fault_release(struct vfio_pci_device *vdev,
>   kfree(vdev->fault_pages);
>  }
> 
> -static int vfio_pci_dma_fault_mmap(struct vfio_pci_device *vdev,
> -struct vfio_pci_region *region,
> -struct vm_area_struct *vma)
> +static void
> +vfio_pci_dma_fault_response_release(struct vfio_pci_device *vdev,
> + struct vfio_pci_region *region)
> +{
> + if (vdev->dma_fault_response_wq)
> + destroy_workqueue(vdev->dma_fault_response_wq);
> + kfree(vdev->fault_response_pages);
> + vdev->fault_response_pages = NULL;
> +}
> +
> +static int __vfio_pci_dma_fault_mmap(struct vfio_pci_device *vdev,
> +  struct vfio_pci_region *region,
> +  struct vm_area_struct *vma,
> +  u8 *pages)
>  {
>   u64 phys_len, req_len, pgoff, req_start;
>   unsigned long long addr;
> @@ -333,14 +344,14 @@ static int vfio_pci_dma_fault_mmap(struct vfio_pci_device *vdev,
>   ((1U << (VFIO_PCI_OFFSET_SHIFT - PAGE_SHIFT)) - 1);
>   req_start = pgoff << PAGE_SHIFT;
> 
> - /* only the second page of the producer fault region is mmappable */
> + /* only the second page of the fault region is mmappable */
>   if (req_start < PAGE_SIZE)
>   return -EINVAL;
> 
>   if (req_start + req_len > phys_len)
>   return -EINVAL;
> 
> - addr = virt_to_phys(vdev->fault_pages);
> + addr = virt_to_phys(pages);
>   vma->vm_private_data = vdev;
>   vma->vm_pgoff = (addr >> PAGE_SHIFT) + pgoff;
> 
> @@ -349,13 +360,29 @@ static int vfio_pci_dma_fault_mmap(struct vfio_pci_device *vdev,
>   return ret;
>  }
> 
> -static int vfio_pci_dma_fault_add_capability(struct vfio_pci_device *vdev,
> -  struct vfio_pci_region *region,
> -  struct vfio_info_cap *caps)
> +static int vfio_pci_dma_fault_mmap(struct vfio_pci_device *vdev,
> +struct vfio_pci_region *region,
> +struct vm_area_struct *vma)
> +{
> + return __vfio_pci_dma_fault_mmap(vdev, region, vma, vdev->fault_pages);
> +}
> +
> +static int
> +vfio_pci_dma_fault_response_mmap(struct vfio_pci_device *vdev,
> + struct vfio_pci_region *region,
> + struct vm_area_struct *vma)
> +{
> + return __vfio_pci_dma_fault_mmap(vdev, region, vma, vdev->fault_response_pages);
> +}
> +
> +static int __vfio_pci_dma_fault_add_capability(struct vfio_pci_device *vdev,
> +struct vfio_pci_region *region,
> +struct vfio_info_cap *caps,
> +u32 cap_id)
>  {
>   struct vfio_region_info_cap_sparse_mmap *sparse = NULL;
>   struct vfio_region_info_cap_fault cap = {
> - .header.id = VFIO_REGION_INFO_CAP_DMA_FAULT,
> + .header.id = cap_id,
>   .header.version = 1,
>   .version = 1,
>   };
> @@ -383,6 +410,14 @@ static int vfio_pci_dma_fault_add_capability(struct vfio_pci_device *vdev,
>   return ret;
>  }
> 
> +static int vfio_pci_dma_fault_add_capability(struct vfio_pci_device *vdev,
> +  struct vfio_pci_region *region,
> +   
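
To make the flow described in the commit message concrete, here is a rough
userspace-side sketch of the scheme: userspace writes one page response into
the mmapped response region and advances the head index, and the kernel
consumes from the tail. Every structure and field name below is invented for
illustration and is not the actual VFIO UAPI:

#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical header layout; the real UAPI defines its own structures
 * in <linux/vfio.h>. */
struct fault_resp_header {
	uint32_t head;		/* producer index, written by userspace */
	uint32_t tail;		/* consumer index, written by the kernel */
	uint32_t nb_entries;	/* ring size, set by the kernel */
	uint32_t entry_size;	/* size of one response record */
};

struct fault_resp {		/* invented response record */
	uint64_t pasid;
	uint64_t grpid;
	uint32_t code;		/* e.g. success / invalid / failure */
};

/* Push one response into the ring; `entries` points at the records,
 * which loosely mirrors the patch's layout where only the second page
 * of the region is mmappable. Returns 0 on success, -1 if full. */
static int push_response(struct fault_resp_header *hdr, void *entries,
			 const struct fault_resp *r)
{
	uint32_t head = hdr->head;
	uint32_t next = (head + 1) % hdr->nb_entries;

	if (next == hdr->tail)	/* full: would overwrite unconsumed data */
		return -1;

	memcpy((char *)entries + (size_t)head * hdr->entry_size, r, sizeof(*r));
	atomic_thread_fence(memory_order_release);	/* entry before index */
	hdr->head = next;
	return 0;
}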

Re: [GIT PULL] KVM/arm64 fixes for 5.11, take #1

2021-01-08 Thread Paolo Bonzini

On 08/01/21 09:22, Marc Zyngier wrote:
>>>    git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git tags/kvmarm-fixes-5.11-1
>>
>> Looks like there are issues with the upstream changes brought in by
>> this pull request.  Unless my bisection is quick tomorrow it may not
>> make it into 5.11-rc3.  In any case, it's in my hands.
>
> I'm not sure what you mean by "upstream changes", as there are no
> additional changes on top of what is described in this pull request,
> which is directly based on the tag you pulled for the merge window.
>
> If there is an issue with any of these 18 patches themselves, please
> shout as soon as you can.

You're right, it's not related to this pull request but just to Linus's
tree.  It was too late yesterday, and now it's all set for sending it out.


Paolo

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [GIT PULL] KVM/arm64 fixes for 5.11, take #1

2021-01-08 Thread Marc Zyngier

Hi Paolo,

On 2021-01-07 23:09, Paolo Bonzini wrote:
> On 07/01/21 12:20, Marc Zyngier wrote:
>>    git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git tags/kvmarm-fixes-5.11-1
>
> Looks like there are issues with the upstream changes brought in by
> this pull request.  Unless my bisection is quick tomorrow it may not
> make it into 5.11-rc3.  In any case, it's in my hands.

I'm not sure what you mean by "upstream changes", as there are no
additional changes on top of what is described in this pull request,
which is directly based on the tag you pulled for the merge window.

If there is an issue with any of these 18 patches themselves, please
shout as soon as you can.

Thanks,

M.
--
Jazz is not dead. It just smells funny...
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm