Re: [PATCH 2/2] KVM: arm64: Workaround firmware wrongly advertising GICv2-on-v3 compatibility
On Fri, 8 Jan 2021 at 19:13, Marc Zyngier wrote: > > On 2021-01-08 17:59, Ard Biesheuvel wrote: > > On Fri, 8 Jan 2021 at 18:12, Marc Zyngier wrote: > >> > >> It looks like we have broken firmware out there that wrongly > >> advertises > >> a GICv2 compatibility interface, despite the CPUs not being able to > >> deal > >> with it. > >> > >> To work around this, check that the CPU initialising KVM is actually > >> able > >> to switch to MMIO instead of system registers, and use that as a > >> precondition to enable GICv2 compatibility in KVM. > >> > >> Note that the detection happens on a single CPU. If the firmware is > >> lying *and* that the CPUs are asymetric, all hope is lost anyway. > >> > >> Reported-by: Shameerali Kolothum Thodi > >> > >> Signed-off-by: Marc Zyngier > >> --- > >> arch/arm64/kvm/hyp/vgic-v3-sr.c | 34 > >> +++-- > >> arch/arm64/kvm/vgic/vgic-v3.c | 8 ++-- > >> 2 files changed, 38 insertions(+), 4 deletions(-) > >> > >> diff --git a/arch/arm64/kvm/hyp/vgic-v3-sr.c > >> b/arch/arm64/kvm/hyp/vgic-v3-sr.c > >> index 005daa0c9dd7..d504499ab917 100644 > >> --- a/arch/arm64/kvm/hyp/vgic-v3-sr.c > >> +++ b/arch/arm64/kvm/hyp/vgic-v3-sr.c > >> @@ -408,11 +408,41 @@ void __vgic_v3_init_lrs(void) > >> /* > >> * Return the GIC CPU configuration: > >> * - [31:0] ICH_VTR_EL2 > >> - * - [63:32] RES0 > >> + * - [62:32] RES0 > >> + * - [63]MMIO (GICv2) capable > >> */ > >> u64 __vgic_v3_get_gic_config(void) > >> { > >> - return read_gicreg(ICH_VTR_EL2); > >> + u64 sre = read_gicreg(ICC_SRE_EL1); > >> + unsigned long flags = 0; > >> + bool v2_capable; > >> + > >> + /* > >> +* To check whether we have a MMIO-based (GICv2 compatible) > >> +* CPU interface, we need to disable the system register > >> +* view. To do that safely, we have to prevent any interrupt > >> +* from firing (which would be deadly). > >> +* > >> +* Note that this only makes sense on VHE, as interrupts are > >> +* already masked for nVHE as part of the exception entry to > >> +* EL2. 
> >> +*/ > >> + if (has_vhe()) > >> + flags = local_daif_save(); > >> + > >> + write_gicreg(0, ICC_SRE_EL1); > >> + isb(); > >> + > >> + v2_capable = !(read_gicreg(ICC_SRE_EL1) & ICC_SRE_EL1_SRE); > >> + > >> + write_gicreg(sre, ICC_SRE_EL1); > >> + isb(); > >> + > >> + if (has_vhe()) > >> + local_daif_restore(flags); > >> + > >> + return (read_gicreg(ICH_VTR_EL2) | > >> + v2_capable ? (1ULL << 63) : 0); > >> } > >> > > > > Is it necessary to perform this check unconditionally? We only care > > about this if the firmware claims v2 compat support. > > Indeed. But this is done exactly once per boot, and I see it as > a way to extract the CPU configuration more than anything else. > > Extracting it *only* when we have some v2 compat info would mean > sharing that information with EL2 (in the nVHE case), and it felt > more hassle than it is worth. > > Do you foresee any issue with this, other than the whole thing > being disgusting (which I wilfully admit)? > No I don't think it's a problem per se. Just a bit disappointing that every system will be burdened with this for as long as the last v2 compat capable system is still being supported. ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [PATCH 2/2] KVM: arm64: Workaround firmware wrongly advertising GICv2-on-v3 compatibility
On 2021-01-08 17:59, Ard Biesheuvel wrote: On Fri, 8 Jan 2021 at 18:12, Marc Zyngier wrote: It looks like we have broken firmware out there that wrongly advertises a GICv2 compatibility interface, despite the CPUs not being able to deal with it. To work around this, check that the CPU initialising KVM is actually able to switch to MMIO instead of system registers, and use that as a precondition to enable GICv2 compatibility in KVM. Note that the detection happens on a single CPU. If the firmware is lying *and* that the CPUs are asymetric, all hope is lost anyway. Reported-by: Shameerali Kolothum Thodi Signed-off-by: Marc Zyngier --- arch/arm64/kvm/hyp/vgic-v3-sr.c | 34 +++-- arch/arm64/kvm/vgic/vgic-v3.c | 8 ++-- 2 files changed, 38 insertions(+), 4 deletions(-) diff --git a/arch/arm64/kvm/hyp/vgic-v3-sr.c b/arch/arm64/kvm/hyp/vgic-v3-sr.c index 005daa0c9dd7..d504499ab917 100644 --- a/arch/arm64/kvm/hyp/vgic-v3-sr.c +++ b/arch/arm64/kvm/hyp/vgic-v3-sr.c @@ -408,11 +408,41 @@ void __vgic_v3_init_lrs(void) /* * Return the GIC CPU configuration: * - [31:0] ICH_VTR_EL2 - * - [63:32] RES0 + * - [62:32] RES0 + * - [63]MMIO (GICv2) capable */ u64 __vgic_v3_get_gic_config(void) { - return read_gicreg(ICH_VTR_EL2); + u64 sre = read_gicreg(ICC_SRE_EL1); + unsigned long flags = 0; + bool v2_capable; + + /* +* To check whether we have a MMIO-based (GICv2 compatible) +* CPU interface, we need to disable the system register +* view. To do that safely, we have to prevent any interrupt +* from firing (which would be deadly). +* +* Note that this only makes sense on VHE, as interrupts are +* already masked for nVHE as part of the exception entry to +* EL2. +*/ + if (has_vhe()) + flags = local_daif_save(); + + write_gicreg(0, ICC_SRE_EL1); + isb(); + + v2_capable = !(read_gicreg(ICC_SRE_EL1) & ICC_SRE_EL1_SRE); + + write_gicreg(sre, ICC_SRE_EL1); + isb(); + + if (has_vhe()) + local_daif_restore(flags); + + return (read_gicreg(ICH_VTR_EL2) | + v2_capable ? 
(1ULL << 63) : 0); } Is it necessary to perform this check unconditionally? We only care about this if the firmware claims v2 compat support. Indeed. But this is done exactly once per boot, and I see it as a way to extract the CPU configuration more than anything else. Extracting it *only* when we have some v2 compat info would mean sharing that information with EL2 (in the nVHE case), and it felt more hassle than it is worth. Do you foresee any issue with this, other than the whole thing being disgusting (which I wilfully admit)? Thanks, M. -- Jazz is not dead. It just smells funny... ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [PATCH 2/2] KVM: arm64: Workaround firmware wrongly advertising GICv2-on-v3 compatibility
On Fri, 8 Jan 2021 at 18:12, Marc Zyngier wrote: > > It looks like we have broken firmware out there that wrongly advertises > a GICv2 compatibility interface, despite the CPUs not being able to deal > with it. > > To work around this, check that the CPU initialising KVM is actually able > to switch to MMIO instead of system registers, and use that as a > precondition to enable GICv2 compatibility in KVM. > > Note that the detection happens on a single CPU. If the firmware is > lying *and* that the CPUs are asymetric, all hope is lost anyway. > > Reported-by: Shameerali Kolothum Thodi > Signed-off-by: Marc Zyngier > --- > arch/arm64/kvm/hyp/vgic-v3-sr.c | 34 +++-- > arch/arm64/kvm/vgic/vgic-v3.c | 8 ++-- > 2 files changed, 38 insertions(+), 4 deletions(-) > > diff --git a/arch/arm64/kvm/hyp/vgic-v3-sr.c b/arch/arm64/kvm/hyp/vgic-v3-sr.c > index 005daa0c9dd7..d504499ab917 100644 > --- a/arch/arm64/kvm/hyp/vgic-v3-sr.c > +++ b/arch/arm64/kvm/hyp/vgic-v3-sr.c > @@ -408,11 +408,41 @@ void __vgic_v3_init_lrs(void) > /* > * Return the GIC CPU configuration: > * - [31:0] ICH_VTR_EL2 > - * - [63:32] RES0 > + * - [62:32] RES0 > + * - [63]MMIO (GICv2) capable > */ > u64 __vgic_v3_get_gic_config(void) > { > - return read_gicreg(ICH_VTR_EL2); > + u64 sre = read_gicreg(ICC_SRE_EL1); > + unsigned long flags = 0; > + bool v2_capable; > + > + /* > +* To check whether we have a MMIO-based (GICv2 compatible) > +* CPU interface, we need to disable the system register > +* view. To do that safely, we have to prevent any interrupt > +* from firing (which would be deadly). > +* > +* Note that this only makes sense on VHE, as interrupts are > +* already masked for nVHE as part of the exception entry to > +* EL2. 
> +*/ > + if (has_vhe()) > + flags = local_daif_save(); > + > + write_gicreg(0, ICC_SRE_EL1); > + isb(); > + > + v2_capable = !(read_gicreg(ICC_SRE_EL1) & ICC_SRE_EL1_SRE); > + > + write_gicreg(sre, ICC_SRE_EL1); > + isb(); > + > + if (has_vhe()) > + local_daif_restore(flags); > + > + return (read_gicreg(ICH_VTR_EL2) | > + v2_capable ? (1ULL << 63) : 0); > } > Is it necessary to perform this check unconditionally? We only care about this if the firmware claims v2 compat support. > u64 __vgic_v3_read_vmcr(void) > diff --git a/arch/arm64/kvm/vgic/vgic-v3.c b/arch/arm64/kvm/vgic/vgic-v3.c > index 8e7bf3151057..67b27b47312b 100644 > --- a/arch/arm64/kvm/vgic/vgic-v3.c > +++ b/arch/arm64/kvm/vgic/vgic-v3.c > @@ -584,8 +584,10 @@ early_param("kvm-arm.vgic_v4_enable", > early_gicv4_enable); > int vgic_v3_probe(const struct gic_kvm_info *info) > { > u64 ich_vtr_el2 = kvm_call_hyp_ret(__vgic_v3_get_gic_config); > + bool has_v2; > int ret; > > + has_v2 = ich_vtr_el2 >> 63; > ich_vtr_el2 = (u32)ich_vtr_el2; > > /* > @@ -605,13 +607,15 @@ int vgic_v3_probe(const struct gic_kvm_info *info) > gicv4_enable ? "en" : "dis"); > } > > + kvm_vgic_global_state.vcpu_base = 0; > + > if (!info->vcpu.start) { > kvm_info("GICv3: no GICV resource entry\n"); > - kvm_vgic_global_state.vcpu_base = 0; > + } else if (!has_v2) { > + pr_warn("CPU interface incapable of MMIO access\n"); > } else if (!PAGE_ALIGNED(info->vcpu.start)) { > pr_warn("GICV physical address 0x%llx not page aligned\n", > (unsigned long long)info->vcpu.start); > - kvm_vgic_global_state.vcpu_base = 0; > } else { > kvm_vgic_global_state.vcpu_base = info->vcpu.start; > kvm_vgic_global_state.can_emulate_gicv2 = true; > -- > 2.29.2 > ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
[PATCH 0/2] KVM: arm64: Work around firmware wrongly advertising GICv2 compatibility
It appears that there is firmware out there that advertises GICv2
compatibility on GICv3, despite the CPUs not being able to actually do
it. That's a bummer, and at best creates unexpected behaviours for the
users. At worst, it will crash the machine. Awesome!

In order to mitigate this issue, try and validate whether we can
actually flip the CPU into supporting MMIO accesses instead of system
registers. If we can't, ignore the compatibility information and
shout. It's not completely foolproof, but it should cover the existing
broken platforms...

The workaround is much bigger than Shameer's initial proposal, but
that's because I wanted to keep it localised to KVM, and not spread
the horror at every level (after all, only KVM is concerned with v2
compat).

Marc Zyngier (2):
  KVM: arm64: Rename __vgic_v3_get_ich_vtr_el2() to
    __vgic_v3_get_gic_config()
  KVM: arm64: Workaround firmware wrongly advertising GICv2-on-v3
    compatibility

 arch/arm64/include/asm/kvm_asm.h   |  4 +--
 arch/arm64/kvm/hyp/nvhe/hyp-main.c |  6 ++---
 arch/arm64/kvm/hyp/vgic-v3-sr.c    | 39 --
 arch/arm64/kvm/vgic/vgic-v3.c      | 12 ++---
 4 files changed, 51 insertions(+), 10 deletions(-)

--
2.29.2
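For anyone reading along who wants the detection idea in isolation: the
approach described above amounts to a write-then-read-back probe. Below
is a minimal sketch in plain C against a simulated register, not the
real ICC_SRE_EL1 accessors. The struct fake_sre type, the reg_read() and
reg_write() helpers and the SRE_BIT value are made up for illustration
and are not kernel APIs. A CPU that cannot leave system-register mode is
modelled here as a bit that is stuck at one.

```c
#include <stdbool.h>
#include <stdint.h>

#define SRE_BIT (1ULL << 0)	/* illustrative "system registers enabled" bit */

/* Simulated register: bits set in stuck_ones ignore attempts to clear them. */
struct fake_sre {
	uint64_t value;
	uint64_t stuck_ones;
};

uint64_t reg_read(const struct fake_sre *r)
{
	return r->value;
}

void reg_write(struct fake_sre *r, uint64_t v)
{
	r->value = v | r->stuck_ones;
}

/* Try to clear SRE, check whether the write took effect, then restore. */
bool probe_mmio_capable(struct fake_sre *r)
{
	uint64_t saved = reg_read(r);
	bool capable;

	reg_write(r, 0);
	capable = !(reg_read(r) & SRE_BIT);
	reg_write(r, saved);

	return capable;
}
```

With stuck_ones = 0 the probe reports the interface as MMIO capable;
with stuck_ones = SRE_BIT it does not. In both cases the register ends
up holding its original value. The real code additionally needs the
barriers and interrupt masking discussed in patch 2/2.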
[PATCH 2/2] KVM: arm64: Workaround firmware wrongly advertising GICv2-on-v3 compatibility
It looks like we have broken firmware out there that wrongly advertises
a GICv2 compatibility interface, despite the CPUs not being able to deal
with it.

To work around this, check that the CPU initialising KVM is actually able
to switch to MMIO instead of system registers, and use that as a
precondition to enable GICv2 compatibility in KVM.

Note that the detection happens on a single CPU. If the firmware is
lying *and* the CPUs are asymmetric, all hope is lost anyway.

Reported-by: Shameerali Kolothum Thodi
Signed-off-by: Marc Zyngier
---
 arch/arm64/kvm/hyp/vgic-v3-sr.c | 34 +++--
 arch/arm64/kvm/vgic/vgic-v3.c   |  8 ++--
 2 files changed, 38 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/kvm/hyp/vgic-v3-sr.c b/arch/arm64/kvm/hyp/vgic-v3-sr.c
index 005daa0c9dd7..d504499ab917 100644
--- a/arch/arm64/kvm/hyp/vgic-v3-sr.c
+++ b/arch/arm64/kvm/hyp/vgic-v3-sr.c
@@ -408,11 +408,41 @@ void __vgic_v3_init_lrs(void)
 /*
  * Return the GIC CPU configuration:
  * - [31:0]  ICH_VTR_EL2
- * - [63:32] RES0
+ * - [62:32] RES0
+ * - [63]    MMIO (GICv2) capable
  */
 u64 __vgic_v3_get_gic_config(void)
 {
-	return read_gicreg(ICH_VTR_EL2);
+	u64 sre = read_gicreg(ICC_SRE_EL1);
+	unsigned long flags = 0;
+	bool v2_capable;
+
+	/*
+	 * To check whether we have a MMIO-based (GICv2 compatible)
+	 * CPU interface, we need to disable the system register
+	 * view. To do that safely, we have to prevent any interrupt
+	 * from firing (which would be deadly).
+	 *
+	 * Note that this only makes sense on VHE, as interrupts are
+	 * already masked for nVHE as part of the exception entry to
+	 * EL2.
+	 */
+	if (has_vhe())
+		flags = local_daif_save();
+
+	write_gicreg(0, ICC_SRE_EL1);
+	isb();
+
+	v2_capable = !(read_gicreg(ICC_SRE_EL1) & ICC_SRE_EL1_SRE);
+
+	write_gicreg(sre, ICC_SRE_EL1);
+	isb();
+
+	if (has_vhe())
+		local_daif_restore(flags);
+
+	return (read_gicreg(ICH_VTR_EL2) |
+		v2_capable ? (1ULL << 63) : 0);
 }
 
 u64 __vgic_v3_read_vmcr(void)
diff --git a/arch/arm64/kvm/vgic/vgic-v3.c b/arch/arm64/kvm/vgic/vgic-v3.c
index 8e7bf3151057..67b27b47312b 100644
--- a/arch/arm64/kvm/vgic/vgic-v3.c
+++ b/arch/arm64/kvm/vgic/vgic-v3.c
@@ -584,8 +584,10 @@ early_param("kvm-arm.vgic_v4_enable", early_gicv4_enable);
 int vgic_v3_probe(const struct gic_kvm_info *info)
 {
 	u64 ich_vtr_el2 = kvm_call_hyp_ret(__vgic_v3_get_gic_config);
+	bool has_v2;
 	int ret;
 
+	has_v2 = ich_vtr_el2 >> 63;
 	ich_vtr_el2 = (u32)ich_vtr_el2;
 
 	/*
@@ -605,13 +607,15 @@ int vgic_v3_probe(const struct gic_kvm_info *info)
 			 gicv4_enable ? "en" : "dis");
 	}
 
+	kvm_vgic_global_state.vcpu_base = 0;
+
 	if (!info->vcpu.start) {
 		kvm_info("GICv3: no GICV resource entry\n");
-		kvm_vgic_global_state.vcpu_base = 0;
+	} else if (!has_v2) {
+		pr_warn("CPU interface incapable of MMIO access\n");
 	} else if (!PAGE_ALIGNED(info->vcpu.start)) {
 		pr_warn("GICV physical address 0x%llx not page aligned\n",
 			(unsigned long long)info->vcpu.start);
-		kvm_vgic_global_state.vcpu_base = 0;
 	} else {
 		kvm_vgic_global_state.vcpu_base = info->vcpu.start;
 		kvm_vgic_global_state.can_emulate_gicv2 = true;
--
2.29.2
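A note on the return statement of __vgic_v3_get_gic_config() in the
patch above. In C the conditional operator binds less tightly than
bitwise OR, so as posted the expression appears to group as
(ICH_VTR_EL2 | v2_capable) ? (1ULL << 63) : 0, which would drop the
register value from the low bits. The standalone sketch below shows the
two groupings side by side, together with the unpacking that
vgic_v3_probe() does. The function names and V2_CAPABLE_FLAG are
invented for the illustration and are not part of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define V2_CAPABLE_FLAG (1ULL << 63)

/* How "vtr | v2_capable ? FLAG : 0" is grouped: '|' is evaluated first. */
uint64_t pack_as_parsed(uint32_t vtr, bool v2_capable)
{
	return (vtr | v2_capable) ? V2_CAPABLE_FLAG : 0;
}

/* Presumably intended grouping: register in [31:0], flag in bit 63. */
uint64_t pack_intended(uint32_t vtr, bool v2_capable)
{
	return vtr | (v2_capable ? V2_CAPABLE_FLAG : 0);
}

/* Caller-side unpacking, mirroring "ich_vtr_el2 >> 63" and "(u32)ich_vtr_el2". */
bool unpack_has_v2(uint64_t cfg)
{
	return cfg >> 63;
}

uint32_t unpack_vtr(uint64_t cfg)
{
	return (uint32_t)cfg;
}
```

With any non-zero register value, pack_as_parsed() sets bit 63 and
clears everything else regardless of v2_capable, whereas pack_intended()
round-trips both fields through the unpack helpers.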
[PATCH 1/2] KVM: arm64: Rename __vgic_v3_get_ich_vtr_el2() to __vgic_v3_get_gic_config()
As we are about to report a bit more information to the rest of
the kernel, rename __vgic_v3_get_ich_vtr_el2() to the more
explicit __vgic_v3_get_gic_config().

No functional change.

Signed-off-by: Marc Zyngier
---
 arch/arm64/include/asm/kvm_asm.h   | 4 ++--
 arch/arm64/kvm/hyp/nvhe/hyp-main.c | 6 +++---
 arch/arm64/kvm/hyp/vgic-v3-sr.c    | 7 ++-
 arch/arm64/kvm/vgic/vgic-v3.c      | 4 +++-
 4 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 8a33d83ea843..37b9cd3e458e 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -50,7 +50,7 @@
 #define __KVM_HOST_SMCCC_FUNC___kvm_tlb_flush_local_vmid	5
 #define __KVM_HOST_SMCCC_FUNC___kvm_timer_set_cntvoff		6
 #define __KVM_HOST_SMCCC_FUNC___kvm_enable_ssbs			7
-#define __KVM_HOST_SMCCC_FUNC___vgic_v3_get_ich_vtr_el2		8
+#define __KVM_HOST_SMCCC_FUNC___vgic_v3_get_gic_config		8
 #define __KVM_HOST_SMCCC_FUNC___vgic_v3_read_vmcr		9
 #define __KVM_HOST_SMCCC_FUNC___vgic_v3_write_vmcr		10
 #define __KVM_HOST_SMCCC_FUNC___vgic_v3_init_lrs		11
@@ -192,7 +192,7 @@ extern void __kvm_timer_set_cntvoff(u64 cntvoff);
 
 extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu);
 
-extern u64 __vgic_v3_get_ich_vtr_el2(void);
+extern u64 __vgic_v3_get_gic_config(void);
 extern u64 __vgic_v3_read_vmcr(void);
 extern void __vgic_v3_write_vmcr(u32 vmcr);
 extern void __vgic_v3_init_lrs(void);
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index bde658d51404..3dc7f0c4fa94 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -67,9 +67,9 @@ static void handle___kvm_enable_ssbs(struct kvm_cpu_context *host_ctxt)
 	write_sysreg_el2(tmp, SYS_SCTLR);
 }
 
-static void handle___vgic_v3_get_ich_vtr_el2(struct kvm_cpu_context *host_ctxt)
+static void handle___vgic_v3_get_gic_config(struct kvm_cpu_context *host_ctxt)
 {
-	cpu_reg(host_ctxt, 1) = __vgic_v3_get_ich_vtr_el2();
+	cpu_reg(host_ctxt, 1) = __vgic_v3_get_gic_config();
 }
 
 static void handle___vgic_v3_read_vmcr(struct kvm_cpu_context *host_ctxt)
@@ -118,7 +118,7 @@ static const hcall_t *host_hcall[] = {
 	HANDLE_FUNC(__kvm_tlb_flush_local_vmid),
 	HANDLE_FUNC(__kvm_timer_set_cntvoff),
 	HANDLE_FUNC(__kvm_enable_ssbs),
-	HANDLE_FUNC(__vgic_v3_get_ich_vtr_el2),
+	HANDLE_FUNC(__vgic_v3_get_gic_config),
 	HANDLE_FUNC(__vgic_v3_read_vmcr),
 	HANDLE_FUNC(__vgic_v3_write_vmcr),
 	HANDLE_FUNC(__vgic_v3_init_lrs),
diff --git a/arch/arm64/kvm/hyp/vgic-v3-sr.c b/arch/arm64/kvm/hyp/vgic-v3-sr.c
index 80406f463c28..005daa0c9dd7 100644
--- a/arch/arm64/kvm/hyp/vgic-v3-sr.c
+++ b/arch/arm64/kvm/hyp/vgic-v3-sr.c
@@ -405,7 +405,12 @@ void __vgic_v3_init_lrs(void)
 		__gic_v3_set_lr(0, i);
 }
 
-u64 __vgic_v3_get_ich_vtr_el2(void)
+/*
+ * Return the GIC CPU configuration:
+ * - [31:0]  ICH_VTR_EL2
+ * - [63:32] RES0
+ */
+u64 __vgic_v3_get_gic_config(void)
 {
 	return read_gicreg(ICH_VTR_EL2);
 }
diff --git a/arch/arm64/kvm/vgic/vgic-v3.c b/arch/arm64/kvm/vgic/vgic-v3.c
index 9cdf39a94a63..8e7bf3151057 100644
--- a/arch/arm64/kvm/vgic/vgic-v3.c
+++ b/arch/arm64/kvm/vgic/vgic-v3.c
@@ -583,9 +583,11 @@ early_param("kvm-arm.vgic_v4_enable", early_gicv4_enable);
  */
 int vgic_v3_probe(const struct gic_kvm_info *info)
 {
-	u32 ich_vtr_el2 = kvm_call_hyp_ret(__vgic_v3_get_ich_vtr_el2);
+	u64 ich_vtr_el2 = kvm_call_hyp_ret(__vgic_v3_get_gic_config);
 	int ret;
 
+	ich_vtr_el2 = (u32)ich_vtr_el2;
+
 	/*
 	 * The ListRegs field is 5 bits, but there is an architectural
 	 * maximum of 16 list registers. Just ignore bit 4...
--
2.29.2
RE: [PATCH v13 00/15] SMMUv3 Nested Stage Setup (IOMMU part)
Hi Eric, > -Original Message- > From: Eric Auger [mailto:eric.au...@redhat.com] > Sent: 18 November 2020 11:22 > To: eric.auger@gmail.com; eric.au...@redhat.com; > io...@lists.linux-foundation.org; linux-ker...@vger.kernel.org; > k...@vger.kernel.org; kvmarm@lists.cs.columbia.edu; w...@kernel.org; > j...@8bytes.org; m...@kernel.org; robin.mur...@arm.com; > alex.william...@redhat.com > Cc: jean-phili...@linaro.org; zhangfei@linaro.org; > zhangfei@gmail.com; vivek.gau...@arm.com; Shameerali Kolothum > Thodi ; > jacob.jun@linux.intel.com; yi.l@intel.com; t...@semihalf.com; > nicoleots...@gmail.com; yuzenghui > Subject: [PATCH v13 00/15] SMMUv3 Nested Stage Setup (IOMMU part) > > This series brings the IOMMU part of HW nested paging support > in the SMMUv3. The VFIO part is submitted separately. > > The IOMMU API is extended to support 2 new API functionalities: > 1) pass the guest stage 1 configuration > 2) pass stage 1 MSI bindings > > Then those capabilities gets implemented in the SMMUv3 driver. > > The virtualizer passes information through the VFIO user API > which cascades them to the iommu subsystem. This allows the guest > to own stage 1 tables and context descriptors (so-called PASID > table) while the host owns stage 2 tables and main configuration > structures (STE). I am seeing an issue with Guest testpmd run with this series. I have two different setups and testpmd works fine with the first one but not with the second. 1). Guest doesn't have kernel driver built-in for pass-through dev. root@ubuntu:/# lspci -v ... 00:02.0 Ethernet controller: Huawei Technologies Co., Ltd. Device a22e (rev 21) Subsystem: Huawei Technologies Co., Ltd. 
Device Flags: fast devsel Memory at 800010 (64-bit, prefetchable) [disabled] [size=64K] Memory at 80 (64-bit, prefetchable) [disabled] [size=1M] Capabilities: [40] Express Root Complex Integrated Endpoint, MSI 00 Capabilities: [a0] MSI-X: Enable- Count=67 Masked- Capabilities: [b0] Power Management version 3 Capabilities: [100] Access Control Services Capabilities: [300] Transaction Processing Hints root@ubuntu:/# echo vfio-pci > /sys/bus/pci/devices/:00:02.0/driver_override root@ubuntu:/# echo :00:02.0 > /sys/bus/pci/drivers_probe root@ubuntu:/mnt/dpdk/build/app# ./testpmd -w :00:02.0 --file-prefix socket0 -l 0-1 -n 2 -- -i EAL: Detected 8 lcore(s) EAL: Detected 1 NUMA nodes EAL: Multi-process socket /var/run/dpdk/socket0/mp_socket EAL: Selected IOVA mode 'VA' EAL: No available hugepages reported in hugepages-32768kB EAL: No available hugepages reported in hugepages-64kB EAL: No available hugepages reported in hugepages-1048576kB EAL: Probing VFIO support... EAL: VFIO support initialized EAL: Invalid NUMA socket, default to 0 EAL: using IOMMU type 1 (Type 1) EAL: Probe PCI driver: net_hns3_vf (19e5:a22e) device: :00:02.0 (socket 0) EAL: No legacy callbacks, legacy socket not created Interactive-mode selected testpmd: create a new mbuf pool : n=155456, size=2176, socket=0 testpmd: preferred mempool ops selected: ring_mp_mc Warning! port-topology=paired and odd forward ports number, the last port will pair with itself. Configuring Port 0 (socket 0) Port 0: 8E:A6:8C:43:43:45 Checking link statuses... Done testpmd> 2). Guest have kernel driver built-in for pass-through dev. root@ubuntu:/# lspci -v ... 00:02.0 Ethernet controller: Huawei Technologies Co., Ltd. Device a22e (rev 21) Subsystem: Huawei Technologies Co., Ltd. 
Device Flags: bus master, fast devsel, latency 0 Memory at 800010 (64-bit, prefetchable) [size=64K] Memory at 80 (64-bit, prefetchable) [size=1M] Capabilities: [40] Express Root Complex Integrated Endpoint, MSI 00 Capabilities: [a0] MSI-X: Enable+ Count=67 Masked- Capabilities: [b0] Power Management version 3 Capabilities: [100] Access Control Services Capabilities: [300] Transaction Processing Hints Kernel driver in use: hns3 root@ubuntu:/# echo vfio-pci > /sys/bus/pci/devices/:00:02.0/driver_override root@ubuntu:/# echo :00:02.0 > /sys/bus/pci/drivers/hns3/unbind root@ubuntu:/# echo :00:02.0 > /sys/bus/pci/drivers_probe root@ubuntu:/mnt/dpdk/build/app# ./testpmd -w :00:02.0 --file-prefix socket0 -l 0-1 -n 2 -- -i EAL: Detected 8 lcore(s) EAL: Detected 1 NUMA nodes EAL: Multi-process socket /var/run/dpdk/socket0/mp_socket EAL: Selected IOVA mode 'VA' EAL: No available hugepages reported in hugepages-32768kB EAL: No available hugepages reported in hugepages-64kB EAL: No available hugepages reported in hugepages-1048576kB EAL: Probing VFIO support... EAL: VFIO support initialized EAL: Invalid NUMA socket, default to 0 EAL: using IOMMU type 1 (Type 1) EAL: Probe PCI driver: net_hns3_vf (19e5:a22e) device: :00:02.0 (socket 0) :00:02.0 hns3_get_mbx_resp(): VF could not get mbx(11,0) head(1) tail(0) lost(1) from PF in_irq:0 hns3vf_get_queue_info(): Failed to get tqp info from PF: -62 hns3vf_init_vf(): Failed to fetch
Re: [PATCH] KVM: arm64: Compute TPIDR_EL2 ignoring MTE tag
On 08/01/2021 16:51, Marc Zyngier wrote:
> Hi Steven,
>
> On 2021-01-08 16:12, Steven Price wrote:
>> KASAN in HW_TAGS mode will store MTE tags in the top byte of the
>> pointer. When computing the offset for TPIDR_EL2 we don't want anything
>> in the top byte, so remove the tag to ensure the computation is correct
>> no matter what the tag.
>>
>> Fixes: 94ab5b61ee16 ("kasan, arm64: enable CONFIG_KASAN_HW_TAGS")
>> Signed-off-by: Steven Price
>> ---
>> Without this fix I can't boot a config with KASAN_HW_TAGS and KVM on an
>> MTE enabled host. I'm unsure if this should really be in
>> this_cpu_ptr_nvhe_sym().
>
> this_cpu_ptr_nvhe_sym() should return something that is valid for
> the EL1 kernel, so I guess untagging in the helper may not be
> that useful.

Makes sense and was my suspicion.

> However, I'm more concerned by anything that requires us to follow
> pointers set up by EL1 at EL2. It looks to me that the only reason
> the whole thing works is because kern_hyp_va() *accidentally* drops
> tags before applying the EL1/EL2 offset...

In the case I'm fixing this is intended to be an offset calculation -
it's just messed up by the presence of an MTE tag in one of the
pointers. I agree I was somewhat surprised when everything 'just worked'
with this one change - and I think you're right it's because
kern_hyp_va() 'just happens' to lose the tags.

Of course there may be other bugs lurking - running MTE+KASAN on the
model is slow so I didn't do much beyond boot it.

One of the 'fun' things about MTE is that you can no longer do pointer
subtraction to calculate the offset unless the pointers are actually
from the same allocation (and therefore have the same tag). I'm sure the
C language experts would point out that's "always been the case" but it
will probably break things elsewhere too.

Steve

> Or am I getting it wrong?
>
> Thanks,
>
>         M.
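To make the pointer subtraction remark above concrete, here is a small
standalone sketch of why a tag in the top byte spoils an offset
calculation, and how stripping it first restores the expected result.
It works on plain 64-bit integers rather than real tagged pointers. The
set_tag() and strip_tag() helpers are hypothetical, and clearing the
whole top byte is a simplification made for this example; it is not
necessarily what the kernel's own untagging helpers do.

```c
#include <stdint.h>

#define TAG_SHIFT 56
#define TAG_MASK  (0xffULL << TAG_SHIFT)

/* Place an 8-bit tag in the top byte of an address (illustrative only). */
uint64_t set_tag(uint64_t addr, uint8_t tag)
{
	return (addr & ~TAG_MASK) | ((uint64_t)tag << TAG_SHIFT);
}

/* Drop whatever is in the top byte so only the address bits remain. */
uint64_t strip_tag(uint64_t addr)
{
	return addr & ~TAG_MASK;
}

/* Offset between two addresses, ignoring their tags. */
uint64_t untagged_offset(uint64_t from, uint64_t to)
{
	return strip_tag(to) - strip_tag(from);
}
```

Take two addresses 0x40 bytes apart but carrying different tags, say
0xf3 and 0xf7. Subtracting them directly gives 0x40 plus the tag
difference shifted into the top byte, while untagged_offset() gives
0x40.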
Re: [PATCH] KVM: arm64: Compute TPIDR_EL2 ignoring MTE tag
Hi Steven,

On 2021-01-08 16:12, Steven Price wrote:
> KASAN in HW_TAGS mode will store MTE tags in the top byte of the
> pointer. When computing the offset for TPIDR_EL2 we don't want anything
> in the top byte, so remove the tag to ensure the computation is correct
> no matter what the tag.
>
> Fixes: 94ab5b61ee16 ("kasan, arm64: enable CONFIG_KASAN_HW_TAGS")
> Signed-off-by: Steven Price
> ---
> Without this fix I can't boot a config with KASAN_HW_TAGS and KVM on an
> MTE enabled host. I'm unsure if this should really be in
> this_cpu_ptr_nvhe_sym().

this_cpu_ptr_nvhe_sym() should return something that is valid for
the EL1 kernel, so I guess untagging in the helper may not be
that useful.

However, I'm more concerned by anything that requires us to follow
pointers set up by EL1 at EL2. It looks to me that the only reason
the whole thing works is because kern_hyp_va() *accidentally* drops
tags before applying the EL1/EL2 offset...

Or am I getting it wrong?

Thanks,

        M.
--
Jazz is not dead. It just smells funny...
[RFC PATCH v2 25/26] KVM: arm64: Reserve memory for host stage 2
Extend the memory pool allocated for the hypervisor to include enough
pages to map all of memory at page granularity for the host stage 2.
While at it, also reserve some memory for device mappings.

Signed-off-by: Quentin Perret
---
 arch/arm64/kvm/hyp/include/nvhe/mm.h | 36
 arch/arm64/kvm/hyp/nvhe/setup.c      | 12 ++
 arch/arm64/kvm/hyp/reserved_mem.c    |  2 ++
 3 files changed, 46 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/kvm/hyp/include/nvhe/mm.h b/arch/arm64/kvm/hyp/include/nvhe/mm.h
index f0cc09b127a5..cdf2e3447b2a 100644
--- a/arch/arm64/kvm/hyp/include/nvhe/mm.h
+++ b/arch/arm64/kvm/hyp/include/nvhe/mm.h
@@ -52,15 +52,12 @@ static inline unsigned long __hyp_pgtable_max_pages(unsigned long nr_pages)
 	return total;
 }
 
-static inline unsigned long hyp_s1_pgtable_size(void)
+static inline unsigned long __hyp_pgtable_total_size(void)
 {
 	struct hyp_memblock_region *reg;
 	unsigned long nr_pages, res = 0;
 	int i;
 
-	if (kvm_nvhe_sym(hyp_memblock_nr) <= 0)
-		return 0;
-
 	for (i = 0; i < kvm_nvhe_sym(hyp_memblock_nr); i++) {
 		reg = &kvm_nvhe_sym(hyp_memory)[i];
 		nr_pages = (reg->end - reg->start) >> PAGE_SHIFT;
@@ -68,6 +65,18 @@ static inline unsigned long hyp_s1_pgtable_size(void)
 		res += nr_pages << PAGE_SHIFT;
 	}
 
+	return res;
+}
+
+static inline unsigned long hyp_s1_pgtable_size(void)
+{
+	unsigned long res, nr_pages;
+
+	if (kvm_nvhe_sym(hyp_memblock_nr) <= 0)
+		return 0;
+
+	res = __hyp_pgtable_total_size();
+
 	/* Allow 1 GiB for private mappings */
 	nr_pages = (1 << 30) >> PAGE_SHIFT;
 	nr_pages = __hyp_pgtable_max_pages(nr_pages);
@@ -76,4 +85,23 @@ static inline unsigned long hyp_s1_pgtable_size(void)
 	return res;
 }
 
+static inline unsigned long host_s2_mem_pgtable_size(void)
+{
+	unsigned long max_pgd_sz = 16 << PAGE_SHIFT;
+
+	if (kvm_nvhe_sym(hyp_memblock_nr) <= 0)
+		return 0;
+
+	return __hyp_pgtable_total_size() + max_pgd_sz;
+}
+
+static inline unsigned long host_s2_dev_pgtable_size(void)
+{
+	if (kvm_nvhe_sym(hyp_memblock_nr) <= 0)
+		return 0;
+
+	/* Allow 1 GiB for private mappings */
+	return __hyp_pgtable_max_pages((1 << 30) >> PAGE_SHIFT) << PAGE_SHIFT;
+}
+
 #endif /* __KVM_HYP_MM_H */
diff --git a/arch/arm64/kvm/hyp/nvhe/setup.c b/arch/arm64/kvm/hyp/nvhe/setup.c
index 6d1faede86ae..79b697df01e2 100644
--- a/arch/arm64/kvm/hyp/nvhe/setup.c
+++ b/arch/arm64/kvm/hyp/nvhe/setup.c
@@ -24,6 +24,8 @@ unsigned long hyp_nr_cpus;
 static void *stacks_base;
 static void *vmemmap_base;
 static void *hyp_pgt_base;
+static void *host_s2_mem_pgt_base;
+static void *host_s2_dev_pgt_base;
 
 static int divide_memory_pool(void *virt, unsigned long size)
 {
@@ -46,6 +48,16 @@ static int divide_memory_pool(void *virt, unsigned long size)
 	if (!hyp_pgt_base)
 		return -ENOMEM;
 
+	nr_pages = host_s2_mem_pgtable_size() >> PAGE_SHIFT;
+	host_s2_mem_pgt_base = hyp_early_alloc_contig(nr_pages);
+	if (!host_s2_mem_pgt_base)
+		return -ENOMEM;
+
+	nr_pages = host_s2_dev_pgtable_size() >> PAGE_SHIFT;
+	host_s2_dev_pgt_base = hyp_early_alloc_contig(nr_pages);
+	if (!host_s2_dev_pgt_base)
+		return -ENOMEM;
+
 	return 0;
 }
 
diff --git a/arch/arm64/kvm/hyp/reserved_mem.c b/arch/arm64/kvm/hyp/reserved_mem.c
index 32f648992835..ee97e55e3c59 100644
--- a/arch/arm64/kvm/hyp/reserved_mem.c
+++ b/arch/arm64/kvm/hyp/reserved_mem.c
@@ -74,6 +74,8 @@ void __init kvm_hyp_reserve(void)
 	 */
 	hyp_mem_size += NR_CPUS << PAGE_SHIFT;
 	hyp_mem_size += hyp_s1_pgtable_size();
+	hyp_mem_size += host_s2_mem_pgtable_size();
+	hyp_mem_size += host_s2_dev_pgtable_size();
 
 	/*
 	 * The hyp_vmemmap needs to be backed by pages, but these pages
--
2.30.0.284.gd98b1dd5eaa7-goog
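For a feel of the numbers involved in reserving page-table memory for a
1 GiB range, here is a rough standalone sketch of a worst-case estimate.
The body of __hyp_pgtable_max_pages() is not part of the hunks above, so
this is not a copy of it. It assumes 4 KiB pages, 512 entries per table
and four levels of tables, and the names below are invented for the
example.

```c
#define EX_PAGE_SHIFT      12UL		/* assumed 4 KiB pages */
#define EX_PTRS_PER_TABLE  512UL	/* assumed entries per table */
#define EX_LEVELS          4		/* assumed number of levels */

unsigned long ex_div_round_up(unsigned long n, unsigned long d)
{
	return (n + d - 1) / d;
}

/*
 * Worst-case number of page-table pages needed to map nr_pages pages
 * at page granularity: each level needs one table per 512 entries of
 * the level below it.
 */
unsigned long ex_pgtable_max_pages(unsigned long nr_pages)
{
	unsigned long total = 0;
	int i;

	for (i = 0; i < EX_LEVELS; i++) {
		nr_pages = ex_div_round_up(nr_pages, EX_PTRS_PER_TABLE);
		total += nr_pages;
	}

	return total;
}
```

Under those assumptions, mapping 1 GiB (262144 pages) takes 512 tables
at the last level plus one table at each of the three levels above, so
515 pages, or a little over 2 MiB of page-table memory.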
[RFC PATCH v2 26/26] KVM: arm64: Wrap the host with a stage 2
When KVM runs in protected nVHE mode, make use of a stage 2 page-table
to give the hypervisor some control over the host memory accesses. At
the moment all memory aborts from the host will be instantly idmapped
RWX at stage 2 in a lazy fashion. Later patches will make use of that
infrastructure to implement access control restrictions to e.g. protect
guest memory from the host.

Signed-off-by: Quentin Perret
---
 arch/arm64/include/asm/kvm_cpufeature.h       |   2 +
 arch/arm64/kernel/image-vars.h                |   3 +
 arch/arm64/kvm/hyp/include/nvhe/mem_protect.h |  33 +++
 arch/arm64/kvm/hyp/nvhe/Makefile              |   2 +-
 arch/arm64/kvm/hyp/nvhe/hyp-init.S            |   1 +
 arch/arm64/kvm/hyp/nvhe/hyp-main.c            |   6 +
 arch/arm64/kvm/hyp/nvhe/mem_protect.c         | 191 ++
 arch/arm64/kvm/hyp/nvhe/setup.c               |   6 +
 arch/arm64/kvm/hyp/nvhe/switch.c              |   7 +-
 arch/arm64/kvm/hyp/nvhe/tlb.c                 |   4 +-
 10 files changed, 248 insertions(+), 7 deletions(-)
 create mode 100644 arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
 create mode 100644 arch/arm64/kvm/hyp/nvhe/mem_protect.c

diff --git a/arch/arm64/include/asm/kvm_cpufeature.h b/arch/arm64/include/asm/kvm_cpufeature.h
index d34f85cba358..74043a149322 100644
--- a/arch/arm64/include/asm/kvm_cpufeature.h
+++ b/arch/arm64/include/asm/kvm_cpufeature.h
@@ -15,3 +15,5 @@
 #endif
 
 KVM_HYP_CPU_FTR_REG(SYS_CTR_EL0, arm64_ftr_reg_ctrel0)
+KVM_HYP_CPU_FTR_REG(SYS_ID_AA64MMFR0_EL1, arm64_ftr_reg_id_aa64mmfr0_el1)
+KVM_HYP_CPU_FTR_REG(SYS_ID_AA64MMFR1_EL1, arm64_ftr_reg_id_aa64mmfr1_el1)
diff --git a/arch/arm64/kernel/image-vars.h b/arch/arm64/kernel/image-vars.h
index 366d837f0d39..e4e4f30ac251 100644
--- a/arch/arm64/kernel/image-vars.h
+++ b/arch/arm64/kernel/image-vars.h
@@ -132,6 +132,9 @@ KVM_NVHE_ALIAS(__hyp_data_ro_after_init_end);
 KVM_NVHE_ALIAS(__hyp_bss_start);
 KVM_NVHE_ALIAS(__hyp_bss_end);
 
+/* pKVM static key */
+KVM_NVHE_ALIAS(kvm_protected_mode_initialized);
+
 #endif /* CONFIG_KVM */
 
 #endif /* __ARM64_KERNEL_IMAGE_VARS_H */
diff --git a/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
new file mode 100644
index ..a22ef118a610
--- /dev/null
+++ b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
@@ -0,0 +1,33 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2020 Google LLC
+ * Author: Quentin Perret
+ */
+
+#ifndef __KVM_NVHE_MEM_PROTECT__
+#define __KVM_NVHE_MEM_PROTECT__
+#include
+#include
+#include
+#include
+#include
+
+struct host_kvm {
+	struct kvm_arch arch;
+	struct kvm_pgtable pgt;
+	struct kvm_pgtable_mm_ops mm_ops;
+	hyp_spinlock_t lock;
+};
+extern struct host_kvm host_kvm;
+
+int kvm_host_prepare_stage2(void *mem_pgt_pool, void *dev_pgt_pool);
+void handle_host_mem_abort(struct kvm_cpu_context *host_ctxt);
+
+static __always_inline void __load_host_stage2(void)
+{
+	if (static_branch_likely(&kvm_protected_mode_initialized))
+		__load_stage2(&host_kvm.arch.mmu, host_kvm.arch.vtcr);
+	else
+		write_sysreg(0, vttbr_el2);
+}
+#endif /* __KVM_NVHE_MEM_PROTECT__ */
diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
index d7381a503182..c3e2f98555c4 100644
--- a/arch/arm64/kvm/hyp/nvhe/Makefile
+++ b/arch/arm64/kvm/hyp/nvhe/Makefile
@@ -11,7 +11,7 @@ lib-objs := $(addprefix ../../../lib/, $(lib-objs))
 
 obj-y := timer-sr.o sysreg-sr.o debug-sr.o switch.o tlb.o hyp-init.o host.o \
 	 hyp-main.o hyp-smp.o psci-relay.o early_alloc.o stub.o page_alloc.o \
-	 cache.o cpufeature.o setup.o mm.o
+	 cache.o cpufeature.o setup.o mm.o mem_protect.o
 obj-y += ../vgic-v3-sr.o ../aarch32.o ../vgic-v2-cpuif-proxy.o ../entry.o \
 	 ../fpsimd.o ../hyp-entry.o ../exception.o ../pgtable.o
 obj-y += $(lib-objs)
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-init.S b/arch/arm64/kvm/hyp/nvhe/hyp-init.S
index b1341bb4b453..32591db76c75 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-init.S
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-init.S
@@ -129,6 +129,7 @@ alternative_else_nop_endif
 
 	/* Invalidate the stale TLBs from Bootloader */
 	tlbi	alle2
+	tlbi	vmalls12e1
 	dsb	sy
 
 	/*
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index 3075f117651c..93699600bc22 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -13,6 +13,7 @@
 
 #include
 #include
+#include
 #include
 #include
 
@@ -222,6 +223,11 @@ void handle_trap(struct kvm_cpu_context *host_ctxt)
 	case ESR_ELx_EC_SMC64:
 		handle_host_smc(host_ctxt);
 		break;
+	case ESR_ELx_EC_IABT_LOW:
+		fallthrough;
+	case ESR_ELx_EC_DABT_LOW:
+		handle_host_mem_abort(host_ctxt);
+		break;
 	default:
 		hyp_panic();
 	}
diff --git
[RFC PATCH v2 22/26] KVM: arm64: Refactor __load_guest_stage2()
Refactor __load_guest_stage2() to introduce __load_stage2() which will be re-used when loading the host stage 2. Signed-off-by: Quentin Perret --- arch/arm64/include/asm/kvm_mmu.h | 9 +++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h index 83b4c5cf4768..8d37d6d1ed29 100644 --- a/arch/arm64/include/asm/kvm_mmu.h +++ b/arch/arm64/include/asm/kvm_mmu.h @@ -345,9 +345,9 @@ static __always_inline u64 kvm_get_vttbr(struct kvm_s2_mmu *mmu) * Must be called from hyp code running at EL2 with an updated VTTBR * and interrupts disabled. */ -static __always_inline void __load_guest_stage2(struct kvm_s2_mmu *mmu) +static __always_inline void __load_stage2(struct kvm_s2_mmu *mmu, unsigned long vtcr) { - write_sysreg(kern_hyp_va(mmu->arch)->vtcr, vtcr_el2); + write_sysreg(vtcr, vtcr_el2); write_sysreg(kvm_get_vttbr(mmu), vttbr_el2); /* @@ -358,6 +358,11 @@ static __always_inline void __load_guest_stage2(struct kvm_s2_mmu *mmu) asm(ALTERNATIVE("nop", "isb", ARM64_WORKAROUND_SPECULATIVE_AT)); } +static __always_inline void __load_guest_stage2(struct kvm_s2_mmu *mmu) +{ + __load_stage2(mmu, kern_hyp_va(mmu->arch)->vtcr); +} + static inline struct kvm *kvm_s2_mmu_to_kvm(struct kvm_s2_mmu *mmu) { return container_of(mmu->arch, struct kvm, arch); -- 2.30.0.284.gd98b1dd5eaa7-goog ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
[RFC PATCH v2 23/26] KVM: arm64: Refactor __populate_fault_info()
Refactor __populate_fault_info() to introduce __get_fault_info() which
will be used once the host is wrapped in a stage 2.

Signed-off-by: Quentin Perret
---
 arch/arm64/kvm/hyp/include/hyp/switch.h | 36 +++--
 1 file changed, 22 insertions(+), 14 deletions(-)

diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index 84473574c2e7..e9005255d639 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -157,19 +157,9 @@ static inline bool __translate_far_to_hpfar(u64 far, u64 *hpfar)
 	return true;
 }
 
-static inline bool __populate_fault_info(struct kvm_vcpu *vcpu)
+static inline bool __get_fault_info(u64 esr, u64 *far, u64 *hpfar)
 {
-	u8 ec;
-	u64 esr;
-	u64 hpfar, far;
-
-	esr = vcpu->arch.fault.esr_el2;
-	ec = ESR_ELx_EC(esr);
-
-	if (ec != ESR_ELx_EC_DABT_LOW && ec != ESR_ELx_EC_IABT_LOW)
-		return true;
-
-	far = read_sysreg_el2(SYS_FAR);
+	*far = read_sysreg_el2(SYS_FAR);
 
 	/*
 	 * The HPFAR can be invalid if the stage 2 fault did not
@@ -185,12 +175,30 @@ static inline bool __populate_fault_info(struct kvm_vcpu *vcpu)
 	if (!(esr & ESR_ELx_S1PTW) &&
 	    (cpus_have_final_cap(ARM64_WORKAROUND_834220) ||
 	     (esr & ESR_ELx_FSC_TYPE) == FSC_PERM)) {
-		if (!__translate_far_to_hpfar(far, &hpfar))
+		if (!__translate_far_to_hpfar(*far, hpfar))
 			return false;
 	} else {
-		hpfar = read_sysreg(hpfar_el2);
+		*hpfar = read_sysreg(hpfar_el2);
 	}
 
+	return true;
+}
+
+static inline bool __populate_fault_info(struct kvm_vcpu *vcpu)
+{
+	u8 ec;
+	u64 esr;
+	u64 hpfar, far;
+
+	esr = vcpu->arch.fault.esr_el2;
+	ec = ESR_ELx_EC(esr);
+
+	if (ec != ESR_ELx_EC_DABT_LOW && ec != ESR_ELx_EC_IABT_LOW)
+		return true;
+
+	if (!__get_fault_info(esr, &far, &hpfar))
+		return false;
+
 	vcpu->arch.fault.far_el2 = far;
 	vcpu->arch.fault.hpfar_el2 = hpfar;
 	return true;
-- 
2.30.0.284.gd98b1dd5eaa7-goog
[RFC PATCH v2 21/26] KVM: arm64: Refactor kvm_arm_setup_stage2()
In order to re-use some of the stage 2 setup at EL2, factor parts of kvm_arm_setup_stage2() out into static inline functions. No functional change intended. Signed-off-by: Quentin Perret --- arch/arm64/include/asm/kvm_mmu.h | 48 arch/arm64/kvm/reset.c | 42 +++- 2 files changed, 52 insertions(+), 38 deletions(-) diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h index 662f0415344e..83b4c5cf4768 100644 --- a/arch/arm64/include/asm/kvm_mmu.h +++ b/arch/arm64/include/asm/kvm_mmu.h @@ -280,6 +280,54 @@ static inline int kvm_write_guest_lock(struct kvm *kvm, gpa_t gpa, return ret; } +static inline u64 kvm_get_parange(u64 mmfr0) +{ + u64 parange = cpuid_feature_extract_unsigned_field(mmfr0, + ID_AA64MMFR0_PARANGE_SHIFT); + if (parange > ID_AA64MMFR0_PARANGE_MAX) + parange = ID_AA64MMFR0_PARANGE_MAX; + + return parange; +} + +/* + * The VTCR value is common across all the physical CPUs on the system. + * We use system wide sanitised values to fill in different fields, + * except for Hardware Management of Access Flags. HA Flag is set + * unconditionally on all CPUs, as it is safe to run with or without + * the feature and the bit is RES0 on CPUs that don't support it. + */ +static inline u64 kvm_get_vtcr(u64 mmfr0, u64 mmfr1, u32 phys_shift) +{ + u64 vtcr = VTCR_EL2_FLAGS; + u8 lvls; + + vtcr |= kvm_get_parange(mmfr0) << VTCR_EL2_PS_SHIFT; + vtcr |= VTCR_EL2_T0SZ(phys_shift); + /* +* Use a minimum 2 level page table to prevent splitting +* host PMD huge pages at stage2. +*/ + lvls = stage2_pgtable_levels(phys_shift); + if (lvls < 2) + lvls = 2; + vtcr |= VTCR_EL2_LVLS_TO_SL0(lvls); + + /* +* Enable the Hardware Access Flag management, unconditionally +* on all CPUs. The features is RES0 on CPUs without the support +* and must be ignored by the CPUs. +*/ + vtcr |= VTCR_EL2_HA; + + /* Set the vmid bits */ + vtcr |= (get_vmid_bits(mmfr1) == 16) ? 
+ VTCR_EL2_VS_16BIT : + VTCR_EL2_VS_8BIT; + + return vtcr; +} + #define kvm_phys_to_vttbr(addr)phys_to_ttbr(addr) static __always_inline u64 kvm_get_vttbr(struct kvm_s2_mmu *mmu) diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c index 47f3f035f3ea..6aae118c960a 100644 --- a/arch/arm64/kvm/reset.c +++ b/arch/arm64/kvm/reset.c @@ -332,19 +332,10 @@ int kvm_set_ipa_limit(void) return 0; } -/* - * Configure the VTCR_EL2 for this VM. The VTCR value is common - * across all the physical CPUs on the system. We use system wide - * sanitised values to fill in different fields, except for Hardware - * Management of Access Flags. HA Flag is set unconditionally on - * all CPUs, as it is safe to run with or without the feature and - * the bit is RES0 on CPUs that don't support it. - */ int kvm_arm_setup_stage2(struct kvm *kvm, unsigned long type) { - u64 vtcr = VTCR_EL2_FLAGS, mmfr0; - u32 parange, phys_shift; - u8 lvls; + u64 mmfr0, mmfr1; + u32 phys_shift; if (type & ~KVM_VM_TYPE_ARM_IPA_SIZE_MASK) return -EINVAL; @@ -359,33 +350,8 @@ int kvm_arm_setup_stage2(struct kvm *kvm, unsigned long type) } mmfr0 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1); - parange = cpuid_feature_extract_unsigned_field(mmfr0, - ID_AA64MMFR0_PARANGE_SHIFT); - if (parange > ID_AA64MMFR0_PARANGE_MAX) - parange = ID_AA64MMFR0_PARANGE_MAX; - vtcr |= parange << VTCR_EL2_PS_SHIFT; - - vtcr |= VTCR_EL2_T0SZ(phys_shift); - /* -* Use a minimum 2 level page table to prevent splitting -* host PMD huge pages at stage2. -*/ - lvls = stage2_pgtable_levels(phys_shift); - if (lvls < 2) - lvls = 2; - vtcr |= VTCR_EL2_LVLS_TO_SL0(lvls); - - /* -* Enable the Hardware Access Flag management, unconditionally -* on all CPUs. The features is RES0 on CPUs without the support -* and must be ignored by the CPUs. 
-*/ - vtcr |= VTCR_EL2_HA; + mmfr1 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR1_EL1); + kvm->arch.vtcr = kvm_get_vtcr(mmfr0, mmfr1, phys_shift); - /* Set the vmid bits */ - vtcr |= (kvm_get_vmid_bits() == 16) ? - VTCR_EL2_VS_16BIT : - VTCR_EL2_VS_8BIT; - kvm->arch.vtcr = vtcr; return 0; } -- 2.30.0.284.gd98b1dd5eaa7-goog ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
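For readers following the refactoring, here is a minimal user-space sketch of the clamping that kvm_get_parange() performs. The helper name extract_unsigned_field(), the shift of 0 and the cap of 0x6 are assumptions made for illustration only; the kernel takes the real values from its sysreg headers and they may depend on the configuration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed for illustration only; not taken from the patch. */
#define PARANGE_SHIFT	0
#define PARANGE_MAX	0x6

/* Extract an unsigned 4-bit ID register field starting at 'shift'. */
static inline uint64_t extract_unsigned_field(uint64_t reg, unsigned int shift)
{
	return (reg >> shift) & 0xf;
}

/* Same shape as kvm_get_parange(): read the field, then clamp it. */
static inline uint64_t get_parange(uint64_t mmfr0)
{
	uint64_t parange = extract_unsigned_field(mmfr0, PARANGE_SHIFT);

	if (parange > PARANGE_MAX)
		parange = PARANGE_MAX;

	return parange;
}
```

Keeping the clamp in a helper that only takes the register value is what lets the same logic run at EL2, where the value presumably comes from a copied feature register rather than read_sanitised_ftr_reg().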
[RFC PATCH v2 24/26] KVM: arm64: Make memcache anonymous in pgtable allocator
The current stage2 page-table allocator uses a memcache to get
pre-allocated pages when it needs any. To allow re-using this code at
EL2, which uses a concept of memory pools, make the memcache argument to
kvm_pgtable_stage2_map() anonymous, and let the mm_ops zalloc_page()
callbacks use it the way they need to.

Signed-off-by: Quentin Perret
---
 arch/arm64/include/asm/kvm_pgtable.h | 6 +++---
 arch/arm64/kvm/hyp/pgtable.c         | 4 ++--
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
index 8e8f1d2c5e0e..d846bc3d3b77 100644
--- a/arch/arm64/include/asm/kvm_pgtable.h
+++ b/arch/arm64/include/asm/kvm_pgtable.h
@@ -176,8 +176,8 @@ void kvm_pgtable_stage2_destroy(struct kvm_pgtable *pgt);
  * @size:	Size of the mapping.
  * @phys:	Physical address of the memory to map.
  * @prot:	Permissions and attributes for the mapping.
- * @mc:		Cache of pre-allocated GFP_PGTABLE_USER memory from which to
- *		allocate page-table pages.
+ * @mc:		Cache of pre-allocated memory from which to allocate page-table
+ *		pages.
  *
  * The offset of @addr within a page is ignored, @size is rounded-up to
  * the next page boundary and @phys is rounded-down to the previous page
@@ -194,7 +194,7 @@ void kvm_pgtable_stage2_destroy(struct kvm_pgtable *pgt);
  */
 int kvm_pgtable_stage2_map(struct kvm_pgtable *pgt, u64 addr, u64 size,
 			   u64 phys, enum kvm_pgtable_prot prot,
-			   struct kvm_mmu_memory_cache *mc);
+			   void *mc);
 
 /**
  * kvm_pgtable_stage2_unmap() - Remove a mapping from a guest stage-2 page-table.
diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index 96a25d0b7b6e..5dd1b4978fe8 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -443,7 +443,7 @@ struct stage2_map_data {
 	kvm_pte_t			*anchor;
 
 	struct kvm_s2_mmu		*mmu;
-	struct kvm_mmu_memory_cache	*memcache;
+	void				*memcache;
 
 	struct kvm_pgtable_mm_ops	*mm_ops;
 };
@@ -613,7 +613,7 @@ static int stage2_map_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep,
 
 int kvm_pgtable_stage2_map(struct kvm_pgtable *pgt, u64 addr, u64 size,
 			   u64 phys, enum kvm_pgtable_prot prot,
-			   struct kvm_mmu_memory_cache *mc)
+			   void *mc)
 {
 	int ret;
 	struct stage2_map_data map_data = {
-- 
2.30.0.284.gd98b1dd5eaa7-goog
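To illustrate why an anonymous pointer is enough here, the following is a rough user-space sketch, not the kernel code. The types struct mm_ops, struct page_cache and struct page_pool and the helper alloc_table() are hypothetical stand-ins: the generic code only forwards the opaque pointer, and each zalloc_page() backend casts it back to whatever it expects.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, cut-down stand-in for struct kvm_pgtable_mm_ops. */
struct mm_ops {
	void *(*zalloc_page)(void *arg);
};

/* Backend 1: a stack of pre-allocated, already zeroed pages. */
struct page_cache {
	void *pages[4];
	int nr;
};

static void *cache_zalloc_page(void *arg)
{
	struct page_cache *mc = arg;

	return mc->nr ? mc->pages[--mc->nr] : NULL;
}

/* Backend 2: a linear pool that hands out fixed-size chunks. */
struct page_pool {
	char *cur;
	char *end;
	size_t page_size;
};

static void *pool_zalloc_page(void *arg)
{
	struct page_pool *pool = arg;
	char *page = pool->cur;

	if ((size_t)(pool->end - pool->cur) < pool->page_size)
		return NULL;

	pool->cur += pool->page_size;
	memset(page, 0, pool->page_size);
	return page;
}

/* Generic code never looks inside 'mc'; it just passes it through. */
static void *alloc_table(const struct mm_ops *ops, void *mc)
{
	return ops->zalloc_page(mc);
}
```

The cost of this design is that type checking is lost at the call site, so the caller and the callback have to agree on what the pointer really is.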
[RFC PATCH v2 18/26] KVM: arm64: Use kvm_arch for stage 2 pgtable
In order to make use of the stage 2 pgtable code for the host stage 2,
use struct kvm_arch in lieu of struct kvm as the host will have the
former but not the latter.

Signed-off-by: Quentin Perret
---
 arch/arm64/include/asm/kvm_pgtable.h | 5 +++--
 arch/arm64/kvm/hyp/pgtable.c         | 6 +++---
 arch/arm64/kvm/mmu.c                 | 2 +-
 3 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
index 45acc9dc6c45..8e8f1d2c5e0e 100644
--- a/arch/arm64/include/asm/kvm_pgtable.h
+++ b/arch/arm64/include/asm/kvm_pgtable.h
@@ -151,12 +151,13 @@ int kvm_pgtable_hyp_map(struct kvm_pgtable *pgt, u64 addr, u64 size, u64 phys,
 /**
  * kvm_pgtable_stage2_init() - Initialise a guest stage-2 page-table.
  * @pgt:	Uninitialised page-table structure to initialise.
- * @kvm:	KVM structure representing the guest virtual machine.
+ * @arch:	Arch-specific KVM structure representing the guest virtual
+ *		machine.
  * @mm_ops:	Memory management callbacks.
  *
  * Return: 0 on success, negative error code on failure.
  */
-int kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm *kvm,
+int kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm_arch *arch,
 			    struct kvm_pgtable_mm_ops *mm_ops);
 
 /**
diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index 61a8a34ddfdb..96a25d0b7b6e 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -855,11 +855,11 @@ int kvm_pgtable_stage2_flush(struct kvm_pgtable *pgt, u64 addr, u64 size)
 	return kvm_pgtable_walk(pgt, addr, size, &walker);
 }
 
-int kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm *kvm,
+int kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm_arch *arch,
 			    struct kvm_pgtable_mm_ops *mm_ops)
 {
 	size_t pgd_sz;
-	u64 vtcr = kvm->arch.vtcr;
+	u64 vtcr = arch->vtcr;
 	u32 ia_bits = VTCR_EL2_IPA(vtcr);
 	u32 sl0 = FIELD_GET(VTCR_EL2_SL0_MASK, vtcr);
 	u32 start_level = VTCR_EL2_TGRAN_SL0_BASE - sl0;
@@ -872,7 +872,7 @@ int kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm *kvm,
 	pgt->ia_bits		= ia_bits;
 	pgt->start_level	= start_level;
 	pgt->mm_ops		= mm_ops;
-	pgt->mmu		= &kvm->arch.mmu;
+	pgt->mmu		= &arch->mmu;
 
 	/* Ensure zeroed PGD pages are visible to the hardware walker */
 	dsb(ishst);
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 9d4c9251208e..7e6263103943 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -461,7 +461,7 @@ int kvm_init_stage2_mmu(struct kvm *kvm, struct kvm_s2_mmu *mmu)
 	if (!pgt)
 		return -ENOMEM;
 
-	err = kvm_pgtable_stage2_init(pgt, kvm, &kvm_s2_mm_ops);
+	err = kvm_pgtable_stage2_init(pgt, &kvm->arch, &kvm_s2_mm_ops);
 	if (err)
 		goto out_free_pgtable;
 
-- 
2.30.0.284.gd98b1dd5eaa7-goog
[RFC PATCH v2 10/26] KVM: arm64: Introduce an early Hyp page allocator
With nVHE, the host currently creates all s1 hypervisor mappings at EL1
during boot, installs them at EL2, and extends them as required (e.g.
when creating a new VM). But in a world where the host is no longer
trusted, it cannot have full control over the code mapped in the
hypervisor.

In preparation for enabling the hypervisor to create its own s1 mappings
during boot, introduce an early page allocator, with minimal
functionality. This allocator is designed to be used only during early
bootstrap of the hyp code when memory protection is enabled, which will
then switch to using a full-fledged page allocator after init.

Signed-off-by: Quentin Perret
---
 arch/arm64/kvm/hyp/include/nvhe/early_alloc.h | 14 +
 arch/arm64/kvm/hyp/include/nvhe/memory.h      | 24
 arch/arm64/kvm/hyp/nvhe/Makefile              |  2 +-
 arch/arm64/kvm/hyp/nvhe/early_alloc.c         | 60 +++
 arch/arm64/kvm/hyp/nvhe/psci-relay.c          |  4 +-
 5 files changed, 100 insertions(+), 4 deletions(-)
 create mode 100644 arch/arm64/kvm/hyp/include/nvhe/early_alloc.h
 create mode 100644 arch/arm64/kvm/hyp/include/nvhe/memory.h
 create mode 100644 arch/arm64/kvm/hyp/nvhe/early_alloc.c

diff --git a/arch/arm64/kvm/hyp/include/nvhe/early_alloc.h b/arch/arm64/kvm/hyp/include/nvhe/early_alloc.h
new file mode 100644
index 000000000000..68ce2bf9a718
--- /dev/null
+++ b/arch/arm64/kvm/hyp/include/nvhe/early_alloc.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef __KVM_HYP_EARLY_ALLOC_H
+#define __KVM_HYP_EARLY_ALLOC_H
+
+#include
+
+void hyp_early_alloc_init(void *virt, unsigned long size);
+unsigned long hyp_early_alloc_nr_pages(void);
+void *hyp_early_alloc_page(void *arg);
+void *hyp_early_alloc_contig(unsigned int nr_pages);
+
+extern struct kvm_pgtable_mm_ops hyp_early_alloc_mm_ops;
+
+#endif /* __KVM_HYP_EARLY_ALLOC_H */
diff --git a/arch/arm64/kvm/hyp/include/nvhe/memory.h b/arch/arm64/kvm/hyp/include/nvhe/memory.h
new file mode 100644
index 000000000000..64c44c142c95
--- /dev/null
+++ b/arch/arm64/kvm/hyp/include/nvhe/memory.h
@@ -0,0 +1,24 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef __KVM_HYP_MEMORY_H
+#define __KVM_HYP_MEMORY_H
+
+#include
+
+#include
+
+extern s64 hyp_physvirt_offset;
+
+#define __hyp_pa(virt)	((phys_addr_t)(virt) + hyp_physvirt_offset)
+#define __hyp_va(virt)	((void *)((phys_addr_t)(virt) - hyp_physvirt_offset))
+
+static inline void *hyp_phys_to_virt(phys_addr_t phys)
+{
+	return __hyp_va(phys);
+}
+
+static inline phys_addr_t hyp_virt_to_phys(void *addr)
+{
+	return __hyp_pa(addr);
+}
+
+#endif /* __KVM_HYP_MEMORY_H */
diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
index 590fdefb42dd..1fc0684a7678 100644
--- a/arch/arm64/kvm/hyp/nvhe/Makefile
+++ b/arch/arm64/kvm/hyp/nvhe/Makefile
@@ -10,7 +10,7 @@ lib-objs := clear_page.o copy_page.o memcpy.o memset.o
 lib-objs := $(addprefix ../../../lib/, $(lib-objs))
 
 obj-y := timer-sr.o sysreg-sr.o debug-sr.o switch.o tlb.o hyp-init.o host.o \
-	 hyp-main.o hyp-smp.o psci-relay.o
+	 hyp-main.o hyp-smp.o psci-relay.o early_alloc.o
 obj-y += ../vgic-v3-sr.o ../aarch32.o ../vgic-v2-cpuif-proxy.o ../entry.o \
 	 ../fpsimd.o ../hyp-entry.o ../exception.o
 obj-y += $(lib-objs)
diff --git a/arch/arm64/kvm/hyp/nvhe/early_alloc.c b/arch/arm64/kvm/hyp/nvhe/early_alloc.c
new file mode 100644
index 000000000000..de4c45662970
--- /dev/null
+++ b/arch/arm64/kvm/hyp/nvhe/early_alloc.c
@@ -0,0 +1,60 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2020 Google LLC
+ * Author: Quentin Perret
+ */
+
+#include
+
+#include
+
+struct kvm_pgtable_mm_ops hyp_early_alloc_mm_ops;
+s64 __ro_after_init hyp_physvirt_offset;
+
+static unsigned long base;
+static unsigned long end;
+static unsigned long cur;
+
+unsigned long hyp_early_alloc_nr_pages(void)
+{
+	return (cur - base) >> PAGE_SHIFT;
+}
+
+extern void clear_page(void *to);
+
+void *hyp_early_alloc_contig(unsigned int nr_pages)
+{
+	unsigned long ret = cur, i, p;
+
+	if (!nr_pages)
+		return NULL;
+
+	cur += nr_pages << PAGE_SHIFT;
+	if (cur > end) {
+		cur = ret;
+		return NULL;
+	}
+
+	for (i = 0; i < nr_pages; i++) {
+		p = ret + (i << PAGE_SHIFT);
+		clear_page((void *)(p));
+	}
+
+	return (void *)ret;
+}
+
+void *hyp_early_alloc_page(void *arg)
+{
+	return hyp_early_alloc_contig(1);
+}
+
+void hyp_early_alloc_init(void *virt, unsigned long size)
+{
+	base = (unsigned long)virt;
+	end = base + size;
+	cur = base;
+
+	hyp_early_alloc_mm_ops.zalloc_page = hyp_early_alloc_page;
+	hyp_early_alloc_mm_ops.phys_to_virt = hyp_phys_to_virt;
+	hyp_early_alloc_mm_ops.virt_to_phys = hyp_virt_to_phys;
+}
diff --git a/arch/arm64/kvm/hyp/nvhe/psci-relay.c b/arch/arm64/kvm/hyp/nvhe/psci-relay.c
index e3947846ffcb..bdd8054bce4c 100644
---
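As a reading aid, the allocator above is a plain bump allocator: it only ever moves a cursor forward and has no free path. Below is a rough user-space sketch of the same idea, under stated assumptions. The early_alloc_* names and the 4 KiB PAGE_SHIFT are made up for the example, memset() stands in for clear_page(), and the bounds check is done before the cursor moves rather than after, which is a deliberate deviation from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SHIFT	12	/* assumed 4 KiB pages, for illustration */

static uintptr_t base, end, cur;

/* Hand the allocator a contiguous region to carve pages from. */
void early_alloc_init(void *virt, size_t size)
{
	base = cur = (uintptr_t)virt;
	end = base + size;
}

/* Number of pages handed out so far. */
unsigned long early_alloc_nr_pages(void)
{
	return (cur - base) >> PAGE_SHIFT;
}

/* Return nr_pages zeroed, contiguous pages, or NULL if the region is full. */
void *early_alloc_contig(unsigned int nr_pages)
{
	uintptr_t ret = cur;
	size_t bytes = (size_t)nr_pages << PAGE_SHIFT;

	if (!nr_pages || bytes > end - cur)
		return NULL;

	cur += bytes;
	memset((void *)ret, 0, bytes);

	return (void *)ret;
}
```

Since nothing is ever freed, this is only suitable for the bootstrap phase, which matches the commit message's plan to switch to a full page allocator after init.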
[RFC PATCH v2 03/26] arm64: kvm: Add standalone ticket spinlock implementation for use at hyp
From: Will Deacon

We will soon need to synchronise multiple CPUs in the hyp text at EL2.
The qspinlock-based locking used by the host is overkill for this purpose
and relies on the kernel's "percpu" implementation for the MCS nodes.

Implement a simple ticket locking scheme based heavily on the code removed
by commit c11090474d70 ("arm64: locking: Replace ticket lock implementation
with qspinlock").

Signed-off-by: Will Deacon
Signed-off-by: Quentin Perret
---
 arch/arm64/kvm/hyp/include/nvhe/spinlock.h | 92 ++
 1 file changed, 92 insertions(+)
 create mode 100644 arch/arm64/kvm/hyp/include/nvhe/spinlock.h

diff --git a/arch/arm64/kvm/hyp/include/nvhe/spinlock.h b/arch/arm64/kvm/hyp/include/nvhe/spinlock.h
new file mode 100644
index 000000000000..7584c397bbac
--- /dev/null
+++ b/arch/arm64/kvm/hyp/include/nvhe/spinlock.h
@@ -0,0 +1,92 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * A stand-alone ticket spinlock implementation for use by the non-VHE
+ * KVM hypervisor code running at EL2.
+ *
+ * Copyright (C) 2020 Google LLC
+ * Author: Will Deacon
+ *
+ * Heavily based on the implementation removed by c11090474d70 which was:
+ * Copyright (C) 2012 ARM Ltd.
+ */
+
+#ifndef __ARM64_KVM_NVHE_SPINLOCK_H__
+#define __ARM64_KVM_NVHE_SPINLOCK_H__
+
+#include
+#include
+
+typedef union hyp_spinlock {
+	u32	__val;
+	struct {
+#ifdef __AARCH64EB__
+		u16 next, owner;
+#else
+		u16 owner, next;
+#endif
+	};
+} hyp_spinlock_t;
+
+#define hyp_spin_lock_init(l)						\
+do {									\
+	*(l) = (hyp_spinlock_t){ .__val = 0 };				\
+} while (0)
+
+static inline void hyp_spin_lock(hyp_spinlock_t *lock)
+{
+	u32 tmp;
+	hyp_spinlock_t lockval, newval;
+
+	asm volatile(
+	/* Atomically increment the next ticket. */
+	ARM64_LSE_ATOMIC_INSN(
+	/* LL/SC */
+"	prfm	pstl1strm, %3\n"
+"1:	ldaxr	%w0, %3\n"
+"	add	%w1, %w0, #(1 << 16)\n"
+"	stxr	%w2, %w1, %3\n"
+"	cbnz	%w2, 1b\n",
+	/* LSE atomics */
+"	mov	%w2, #(1 << 16)\n"
+"	ldadda	%w2, %w0, %3\n"
+	__nops(3))
+
+	/* Did we get the lock? */
+"	eor	%w1, %w0, %w0, ror #16\n"
+"	cbz	%w1, 3f\n"
+	/*
+	 * No: spin on the owner. Send a local event to avoid missing an
+	 * unlock before the exclusive load.
+	 */
+"	sevl\n"
+"2:	wfe\n"
+"	ldaxrh	%w2, %4\n"
+"	eor	%w1, %w2, %w0, lsr #16\n"
+"	cbnz	%w1, 2b\n"
+	/* We got the lock. Critical section starts here. */
+"3:"
+	: "=&r" (lockval), "=&r" (newval), "=&r" (tmp), "+Q" (*lock)
+	: "Q" (lock->owner)
+	: "memory");
+}
+
+static inline void hyp_spin_unlock(hyp_spinlock_t *lock)
+{
+	u64 tmp;
+
+	asm volatile(
+	ARM64_LSE_ATOMIC_INSN(
+	/* LL/SC */
+	"	ldrh	%w1, %0\n"
+	"	add	%w1, %w1, #1\n"
+	"	stlrh	%w1, %0",
+	/* LSE atomics */
+	"	mov	%w1, #1\n"
+	"	staddlh	%w1, %0\n"
+	__nops(1))
+	: "=Q" (lock->owner), "=&r" (tmp)
+	:
+	: "memory");
+}
+
+#endif /* __ARM64_KVM_NVHE_SPINLOCK_H__ */
-- 
2.30.0.284.gd98b1dd5eaa7-goog
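For anyone less used to reading the inline assembly, the ticket scheme can be sketched in portable C11 atomics roughly as follows. This is not the patch's implementation: the ticket_lock_* names are invented for the example, the two counters are kept as separate fields instead of being packed into one 32-bit word, the ticket is taken with a relaxed increment with ordering coming from the acquire load of the owner, and the waiter busy-spins where the patch sleeps in WFE.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

typedef struct {
	_Atomic uint16_t owner;	/* ticket currently being served */
	_Atomic uint16_t next;	/* next ticket to hand out */
} ticket_lock_t;

static inline void ticket_lock_init(ticket_lock_t *lock)
{
	atomic_init(&lock->owner, 0);
	atomic_init(&lock->next, 0);
}

static inline void ticket_lock(ticket_lock_t *lock)
{
	/* Take a ticket; ordering is provided by the acquire load below. */
	uint16_t ticket = atomic_fetch_add_explicit(&lock->next, 1,
						    memory_order_relaxed);

	/* Spin until our ticket is the one being served. */
	while (atomic_load_explicit(&lock->owner, memory_order_acquire) != ticket)
		;
}

static inline void ticket_unlock(ticket_lock_t *lock)
{
	/* Only the lock holder updates 'owner', so load + store is enough. */
	uint16_t owner = atomic_load_explicit(&lock->owner, memory_order_relaxed);

	atomic_store_explicit(&lock->owner, (uint16_t)(owner + 1),
			      memory_order_release);
}
```

Tickets are served strictly in the order they were taken, which is what makes the lock fair; both counters wrap at 16 bits, so the scheme holds as long as fewer than 65536 CPUs wait at once.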
[RFC PATCH v2 11/26] KVM: arm64: Stub CONFIG_DEBUG_LIST at Hyp
In order to use the kernel list library at EL2, introduce stubs for the CONFIG_DEBUG_LIST out-of-lines calls. Signed-off-by: Quentin Perret --- arch/arm64/kvm/hyp/nvhe/Makefile | 2 +- arch/arm64/kvm/hyp/nvhe/stub.c | 22 ++ 2 files changed, 23 insertions(+), 1 deletion(-) create mode 100644 arch/arm64/kvm/hyp/nvhe/stub.c diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile index 1fc0684a7678..33bd381d8f73 100644 --- a/arch/arm64/kvm/hyp/nvhe/Makefile +++ b/arch/arm64/kvm/hyp/nvhe/Makefile @@ -10,7 +10,7 @@ lib-objs := clear_page.o copy_page.o memcpy.o memset.o lib-objs := $(addprefix ../../../lib/, $(lib-objs)) obj-y := timer-sr.o sysreg-sr.o debug-sr.o switch.o tlb.o hyp-init.o host.o \ -hyp-main.o hyp-smp.o psci-relay.o early_alloc.o +hyp-main.o hyp-smp.o psci-relay.o early_alloc.o stub.o obj-y += ../vgic-v3-sr.o ../aarch32.o ../vgic-v2-cpuif-proxy.o ../entry.o \ ../fpsimd.o ../hyp-entry.o ../exception.o obj-y += $(lib-objs) diff --git a/arch/arm64/kvm/hyp/nvhe/stub.c b/arch/arm64/kvm/hyp/nvhe/stub.c new file mode 100644 index ..c0aa6bbfd79d --- /dev/null +++ b/arch/arm64/kvm/hyp/nvhe/stub.c @@ -0,0 +1,22 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Stubs for out-of-line function calls caused by re-using kernel + * infrastructure at EL2. + * + * Copyright (C) 2020 - Google LLC + */ + +#include + +#ifdef CONFIG_DEBUG_LIST +bool __list_add_valid(struct list_head *new, struct list_head *prev, + struct list_head *next) +{ + return true; +} + +bool __list_del_entry_valid(struct list_head *entry) +{ + return true; +} +#endif -- 2.30.0.284.gd98b1dd5eaa7-goog ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
[RFC PATCH v2 01/26] arm64: lib: Annotate {clear, copy}_page() as position-independent
From: Will Deacon

clear_page() and copy_page() are suitable for use outside of the kernel
address space, so annotate them as position-independent code.

Signed-off-by: Will Deacon
Signed-off-by: Quentin Perret
---
 arch/arm64/lib/clear_page.S | 4 ++--
 arch/arm64/lib/copy_page.S  | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/lib/clear_page.S b/arch/arm64/lib/clear_page.S
index 073acbf02a7c..b84b179edba3 100644
--- a/arch/arm64/lib/clear_page.S
+++ b/arch/arm64/lib/clear_page.S
@@ -14,7 +14,7 @@
  * Parameters:
  *	x0 - dest
  */
-SYM_FUNC_START(clear_page)
+SYM_FUNC_START_PI(clear_page)
 	mrs	x1, dczid_el0
 	and	w1, w1, #0xf
 	mov	x2, #4
@@ -25,5 +25,5 @@ SYM_FUNC_START(clear_page)
 	tst	x0, #(PAGE_SIZE - 1)
 	b.ne	1b
 	ret
-SYM_FUNC_END(clear_page)
+SYM_FUNC_END_PI(clear_page)
 EXPORT_SYMBOL(clear_page)
diff --git a/arch/arm64/lib/copy_page.S b/arch/arm64/lib/copy_page.S
index e7a793961408..29144f4cd449 100644
--- a/arch/arm64/lib/copy_page.S
+++ b/arch/arm64/lib/copy_page.S
@@ -17,7 +17,7 @@
  *	x0 - dest
  *	x1 - src
  */
-SYM_FUNC_START(copy_page)
+SYM_FUNC_START_PI(copy_page)
 alternative_if ARM64_HAS_NO_HW_PREFETCH
 	// Prefetch three cache lines ahead.
 	prfm	pldl1strm, [x1, #128]
@@ -75,5 +75,5 @@ alternative_else_nop_endif
 	stnp	x16, x17, [x0, #112 - 256]
 
 	ret
-SYM_FUNC_END(copy_page)
+SYM_FUNC_END_PI(copy_page)
 EXPORT_SYMBOL(copy_page)
-- 
2.30.0.284.gd98b1dd5eaa7-goog
[RFC PATCH v2 02/26] KVM: arm64: Link position-independent string routines into .hyp.text
From: Will Deacon

Pull clear_page(), copy_page(), memcpy() and memset() into the nVHE hyp
code and ensure that we always execute the '__pi_' entry point on the
off chance that it changes in future.

[ qperret: Commit title nits ]

Signed-off-by: Will Deacon
Signed-off-by: Quentin Perret
---
 arch/arm64/include/asm/hyp_image.h |  3 +++
 arch/arm64/kernel/image-vars.h     | 11 +++
 arch/arm64/kvm/hyp/nvhe/Makefile   |  4
 3 files changed, 18 insertions(+)

diff --git a/arch/arm64/include/asm/hyp_image.h b/arch/arm64/include/asm/hyp_image.h
index daa1a1da539e..e06842756051 100644
--- a/arch/arm64/include/asm/hyp_image.h
+++ b/arch/arm64/include/asm/hyp_image.h
@@ -31,6 +31,9 @@
  */
 #define KVM_NVHE_ALIAS(sym)	kvm_nvhe_sym(sym) = sym;
 
+/* Defines a linker script alias for KVM nVHE hyp symbols */
+#define KVM_NVHE_ALIAS_HYP(first, sec)	kvm_nvhe_sym(first) = kvm_nvhe_sym(sec);
+
 #endif /* LINKER_SCRIPT */
 
 #endif /* __ARM64_HYP_IMAGE_H__ */
diff --git a/arch/arm64/kernel/image-vars.h b/arch/arm64/kernel/image-vars.h
index 39289d75118d..43f3a1d6e92d 100644
--- a/arch/arm64/kernel/image-vars.h
+++ b/arch/arm64/kernel/image-vars.h
@@ -102,6 +102,17 @@ KVM_NVHE_ALIAS(__stop___kvm_ex_table);
 /* Array containing bases of nVHE per-CPU memory regions. */
 KVM_NVHE_ALIAS(kvm_arm_hyp_percpu_base);
 
+/* Position-independent library routines */
+KVM_NVHE_ALIAS_HYP(clear_page, __pi_clear_page);
+KVM_NVHE_ALIAS_HYP(copy_page, __pi_copy_page);
+KVM_NVHE_ALIAS_HYP(memcpy, __pi_memcpy);
+KVM_NVHE_ALIAS_HYP(memset, __pi_memset);
+
+#ifdef CONFIG_KASAN
+KVM_NVHE_ALIAS_HYP(__memcpy, __pi_memcpy);
+KVM_NVHE_ALIAS_HYP(__memset, __pi_memset);
+#endif
+
 #endif /* CONFIG_KVM */
 
 #endif /* __ARM64_KERNEL_IMAGE_VARS_H */
diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
index 1f1e351c5fe2..590fdefb42dd 100644
--- a/arch/arm64/kvm/hyp/nvhe/Makefile
+++ b/arch/arm64/kvm/hyp/nvhe/Makefile
@@ -6,10 +6,14 @@
 asflags-y := -D__KVM_NVHE_HYPERVISOR__
 ccflags-y := -D__KVM_NVHE_HYPERVISOR__
 
+lib-objs := clear_page.o copy_page.o memcpy.o memset.o
+lib-objs := $(addprefix ../../../lib/, $(lib-objs))
+
 obj-y := timer-sr.o sysreg-sr.o debug-sr.o switch.o tlb.o hyp-init.o host.o \
 	 hyp-main.o hyp-smp.o psci-relay.o
 obj-y += ../vgic-v3-sr.o ../aarch32.o ../vgic-v2-cpuif-proxy.o ../entry.o \
 	 ../fpsimd.o ../hyp-entry.o ../exception.o
+obj-y += $(lib-objs)
 
 ##
 ## Build rules for compiling nVHE hyp code
-- 
2.30.0.284.gd98b1dd5eaa7-goog
[RFC PATCH v2 14/26] KVM: arm64: Factor out vector address calculation
In order to re-map the guest vectors at EL2 when pKVM is enabled, refactor __kvm_vector_slot2idx() and kvm_init_vector_slot() to move all the address calculation logic in a static inline function. Signed-off-by: Quentin Perret --- arch/arm64/include/asm/kvm_mmu.h | 8 arch/arm64/kvm/arm.c | 9 + 2 files changed, 9 insertions(+), 8 deletions(-) diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h index e52d82aeadca..d7ebd73ec86f 100644 --- a/arch/arm64/include/asm/kvm_mmu.h +++ b/arch/arm64/include/asm/kvm_mmu.h @@ -195,6 +195,14 @@ phys_addr_t kvm_mmu_get_httbr(void); phys_addr_t kvm_get_idmap_vector(void); int kvm_mmu_init(void); +static inline void *__kvm_vector_slot2addr(void *base, + enum arm64_hyp_spectre_vector slot) +{ + int idx = slot - (slot != HYP_VECTOR_DIRECT); + + return base + (idx * SZ_2K); +} + struct kvm; #define kvm_flush_dcache_to_poc(a,l) __flush_dcache_area((a), (l)) diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index 9fd769349e9e..6af9204bcd5b 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -1346,16 +1346,9 @@ static unsigned long nvhe_percpu_order(void) /* A lookup table holding the hypervisor VA for each vector slot */ static void *hyp_spectre_vector_selector[BP_HARDEN_EL2_SLOTS]; -static int __kvm_vector_slot2idx(enum arm64_hyp_spectre_vector slot) -{ - return slot - (slot != HYP_VECTOR_DIRECT); -} - static void kvm_init_vector_slot(void *base, enum arm64_hyp_spectre_vector slot) { - int idx = __kvm_vector_slot2idx(slot); - - hyp_spectre_vector_selector[slot] = base + (idx * SZ_2K); + hyp_spectre_vector_selector[slot] = __kvm_vector_slot2addr(base, slot); } static int kvm_init_vector_slots(void) -- 2.30.0.284.gd98b1dd5eaa7-goog ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
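The index arithmetic in __kvm_vector_slot2addr() is compact enough to be worth spelling out. The sketch below is a user-space approximation: the enum ordering is an assumption for illustration (the patch does not show the definition), and vector_slot2addr() and enum hyp_vector_slot are hypothetical names. With that ordering, the first two slots map to index 0 and every later slot is shifted down by one.

```c
#include <assert.h>

/* Assumed slot ordering, for illustration; not shown in the patch. */
enum hyp_vector_slot {
	HYP_VECTOR_DIRECT,
	HYP_VECTOR_SPECTRE_DIRECT,
	HYP_VECTOR_INDIRECT,
	HYP_VECTOR_SPECTRE_INDIRECT,
};

#define SZ_2K	0x800

static inline void *vector_slot2addr(void *base, enum hyp_vector_slot slot)
{
	/* The first two slots share index 0; later slots shift down by one. */
	int idx = slot - (slot != HYP_VECTOR_DIRECT);

	return (char *)base + (idx * SZ_2K);
}
```

Under the assumed ordering, four slots therefore resolve to only three distinct 2K-aligned addresses.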
[RFC PATCH v2 13/26] KVM: arm64: Enable access to sanitized CPU features at EL2
Introduce infrastructure in KVM for copying CPU feature registers into
EL2-owned data structures, so that sanitised values can be read directly
at EL2 in nVHE.

Since the hypervisor reads only a subset of these features, the registers
that need to be copied are listed in asm/kvm_cpufeature.h, together with
the name of the nVHE variable that will hold each copy.

While at it, introduce the first user of this infrastructure by
implementing __flush_dcache_area at EL2, which needs
arm64_ftr_reg_ctrel0.

Signed-off-by: Quentin Perret
---
 arch/arm64/include/asm/cpufeature.h     |  1 +
 arch/arm64/include/asm/kvm_cpufeature.h | 17 ++
 arch/arm64/kernel/cpufeature.c          | 12 ++
 arch/arm64/kvm/arm.c                    | 31 +
 arch/arm64/kvm/hyp/nvhe/Makefile        |  3 ++-
 arch/arm64/kvm/hyp/nvhe/cache.S         | 13 +++
 arch/arm64/kvm/hyp/nvhe/cpufeature.c    |  8 +++
 7 files changed, 84 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/include/asm/kvm_cpufeature.h
 create mode 100644 arch/arm64/kvm/hyp/nvhe/cache.S
 create mode 100644 arch/arm64/kvm/hyp/nvhe/cpufeature.c

diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index 16063c813dcd..742e9bcc051b 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -600,6 +600,7 @@ void __init setup_cpu_features(void);
 void check_local_cpu_capabilities(void);

 u64 read_sanitised_ftr_reg(u32 id);
+int copy_ftr_reg(u32 id, struct arm64_ftr_reg *dst);

 static inline bool cpu_supports_mixed_endian_el0(void)
 {
diff --git a/arch/arm64/include/asm/kvm_cpufeature.h b/arch/arm64/include/asm/kvm_cpufeature.h
new file mode 100644
index ..d34f85cba358
--- /dev/null
+++ b/arch/arm64/include/asm/kvm_cpufeature.h
@@ -0,0 +1,17 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2020 - Google LLC
+ * Author: Quentin Perret
+ */
+
+#include
+
+#ifndef KVM_HYP_CPU_FTR_REG
+#if defined(__KVM_NVHE_HYPERVISOR__)
+#define KVM_HYP_CPU_FTR_REG(id, name) extern struct arm64_ftr_reg name;
+#else
+#define KVM_HYP_CPU_FTR_REG(id, name) DECLARE_KVM_NVHE_SYM(name);
+#endif
+#endif
+
+KVM_HYP_CPU_FTR_REG(SYS_CTR_EL0, arm64_ftr_reg_ctrel0)
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index bc3549663957..c2019dc3 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1113,6 +1113,18 @@ u64 read_sanitised_ftr_reg(u32 id)
 }
 EXPORT_SYMBOL_GPL(read_sanitised_ftr_reg);

+int copy_ftr_reg(u32 id, struct arm64_ftr_reg *dst)
+{
+	struct arm64_ftr_reg *regp = get_arm64_ftr_reg(id);
+
+	if (!regp)
+		return -EINVAL;
+
+	memcpy(dst, regp, sizeof(*regp));
+
+	return 0;
+}
+
 #define read_sysreg_case(r)	\
 	case r:		return read_sysreg_s(r)

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 51b53ca36dc5..9fd769349e9e 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -34,6 +34,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -1697,6 +1698,29 @@ static void teardown_hyp_mode(void)
 	}
 }

+#undef KVM_HYP_CPU_FTR_REG
+#define KVM_HYP_CPU_FTR_REG(id, name) \
+	{ .sys_id = id, .dst = (struct arm64_ftr_reg *)_nvhe_sym(name) },
+static const struct __ftr_reg_copy_entry {
+	u32			sys_id;
+	struct arm64_ftr_reg	*dst;
+} hyp_ftr_regs[] = {
+	#include
+};
+
+static int copy_cpu_ftr_regs(void)
+{
+	int i, ret;
+
+	for (i = 0; i < ARRAY_SIZE(hyp_ftr_regs); i++) {
+		ret = copy_ftr_reg(hyp_ftr_regs[i].sys_id, hyp_ftr_regs[i].dst);
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
+
 /**
  * Inits Hyp-mode on all online CPUs
  */
@@ -1705,6 +1729,13 @@ static int init_hyp_mode(void)
 	int cpu;
 	int err = 0;

+	/*
+	 * Copy the required CPU feature register in their EL2 counterpart
+	 */
+	err = copy_cpu_ftr_regs();
+	if (err)
+		return err;
+
 	/*
 	 * Allocate Hyp PGD and setup Hyp identity mapping
 	 */
diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
index 9e5eacfec6ec..72cfe53f106f 100644
--- a/arch/arm64/kvm/hyp/nvhe/Makefile
+++ b/arch/arm64/kvm/hyp/nvhe/Makefile
@@ -10,7 +10,8 @@ lib-objs := clear_page.o copy_page.o memcpy.o memset.o
 lib-objs := $(addprefix ../../../lib/, $(lib-objs))

 obj-y := timer-sr.o sysreg-sr.o debug-sr.o switch.o tlb.o hyp-init.o host.o \
-	 hyp-main.o hyp-smp.o psci-relay.o early_alloc.o stub.o page_alloc.o
+	 hyp-main.o hyp-smp.o psci-relay.o early_alloc.o stub.o page_alloc.o \
+	 cache.o cpufeature.o
 obj-y += ../vgic-v3-sr.o ../aarch32.o ../vgic-v2-cpuif-proxy.o ../entry.o \
 	 ../fpsimd.o ../hyp-entry.o ../exception.o
 obj-y += $(lib-objs)
diff --git
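The patch above builds both the EL2 declarations and the host-side copy table from a single list of KVM_HYP_CPU_FTR_REG() entries. The pattern can be seen in isolation in the standalone sketch below. It is an illustration only: FTR_REG_LIST, struct ftr_reg, lookup_reg() and the numeric IDs are hypothetical stand-ins, and the sketch expands a list macro where the patch re-includes a header.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct arm64_ftr_reg. */
struct ftr_reg {
	unsigned long sys_val;
};

/* One entry per register: (numeric id, name of the variable holding the copy). */
#define FTR_REG_LIST(X)		\
	X(0x10, copy_ctr)	\
	X(0x20, copy_mmfr0)

/* First expansion: define one destination variable per entry. */
#define DEFINE_COPY(id, name)	static struct ftr_reg name;
FTR_REG_LIST(DEFINE_COPY)

/* Second expansion: build the (id, destination) table from the same list. */
#define TABLE_ENTRY(id, name)	{ id, &name },
static const struct copy_entry {
	unsigned int id;
	struct ftr_reg *dst;
} copy_table[] = {
	FTR_REG_LIST(TABLE_ENTRY)
};

/* Hypothetical source of sanitised values, keyed by id. */
static const struct ftr_reg *lookup_reg(unsigned int id)
{
	static const struct ftr_reg ctr = { 0x84448004UL };
	static const struct ftr_reg mmfr0 = { 0x1122UL };

	switch (id) {
	case 0x10:
		return &ctr;
	case 0x20:
		return &mmfr0;
	default:
		return NULL;
	}
}

/* Copy every listed register; returns 0 on success, -1 on an unknown id. */
static int copy_all_regs(void)
{
	size_t i;

	for (i = 0; i < sizeof(copy_table) / sizeof(copy_table[0]); i++) {
		const struct ftr_reg *src = lookup_reg(copy_table[i].id);

		if (!src)
			return -1;
		memcpy(copy_table[i].dst, src, sizeof(*src));
	}
	return 0;
}
```

Adding a register then only requires one new line in the list; the variable and the table entry follow automatically.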
[RFC PATCH v2 17/26] KVM: arm64: Elevate Hyp mappings creation at EL2
Previous commits have introduced infrastructure at EL2 to enable the Hyp
code to manage its own memory, and more specifically its stage 1 page
tables. However, this was preliminary work, and none of it is currently
in use.

Put all of this together by elevating the hyp mappings creation at EL2
when memory protection is enabled. In this case, the host kernel running
at EL1 still creates _temporary_ Hyp mappings, only used while
initializing the hypervisor, but frees them right after.

As such, all calls to create_hyp_mappings() after kvm init has finished
turn into hypercalls, as the host now has no 'legal' way to modify the
hypervisor page tables directly.

Signed-off-by: Quentin Perret
---
 arch/arm64/include/asm/kvm_mmu.h |  1 -
 arch/arm64/kvm/arm.c             | 62 +---
 arch/arm64/kvm/mmu.c             | 34 ++
 3 files changed, 92 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index d7ebd73ec86f..6c8466a042a9 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -309,6 +309,5 @@ static __always_inline void __load_guest_stage2(struct kvm_s2_mmu *mmu)
 	 */
 	asm(ALTERNATIVE("nop", "isb", ARM64_WORKAROUND_SPECULATIVE_AT));
 }
-
 #endif /* __ASSEMBLY__ */
 #endif /* __ARM64_KVM_MMU_H__ */
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 6af9204bcd5b..e524682c2ccf 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1421,7 +1421,7 @@ static void cpu_prepare_hyp_mode(int cpu)
 	kvm_flush_dcache_to_poc(params, sizeof(*params));
 }

-static void cpu_init_hyp_mode(void)
+static void kvm_set_hyp_vector(void)
 {
 	struct kvm_nvhe_init_params *params;
 	struct arm_smccc_res res;
@@ -1439,6 +1439,11 @@ static void cpu_init_hyp_mode(void)
 	params = this_cpu_ptr_nvhe_sym(kvm_init_params);
 	arm_smccc_1_1_hvc(KVM_HOST_SMCCC_FUNC(__kvm_hyp_init), virt_to_phys(params), );
 	WARN_ON(res.a0 != SMCCC_RET_SUCCESS);
+}
+
+static void cpu_init_hyp_mode(void)
+{
+	kvm_set_hyp_vector();

 	/*
 	 * Disabling SSBD on a non-VHE system requires us to enable SSBS
@@ -1481,7 +1486,10 @@ static void cpu_set_hyp_vector(void)
 	struct bp_hardening_data *data = this_cpu_ptr(_hardening_data);
 	void *vector = hyp_spectre_vector_selector[data->slot];

-	*this_cpu_ptr_hyp_sym(kvm_hyp_vector) = (unsigned long)vector;
+	if (!is_protected_kvm_enabled())
+		*this_cpu_ptr_hyp_sym(kvm_hyp_vector) = (unsigned long)vector;
+	else
+		kvm_call_hyp_nvhe(__pkvm_cpu_set_vector, data->slot);
 }

 static void cpu_hyp_reinit(void)
@@ -1489,13 +1497,14 @@ static void cpu_hyp_reinit(void)
 	kvm_init_host_cpu_context(_cpu_ptr_hyp_sym(kvm_host_data)->host_ctxt);

 	cpu_hyp_reset();
-	cpu_set_hyp_vector();

 	if (is_kernel_in_hyp_mode())
 		kvm_timer_init_vhe();
 	else
 		cpu_init_hyp_mode();

+	cpu_set_hyp_vector();
+
 	kvm_arm_init_debug();

 	if (vgic_present)
@@ -1714,13 +1723,52 @@ static int copy_cpu_ftr_regs(void)
 	return 0;
 }

+static int kvm_hyp_enable_protection(void)
+{
+	void *per_cpu_base = kvm_ksym_ref(kvm_arm_hyp_percpu_base);
+	int ret, cpu;
+	void *addr;
+
+	if (!is_protected_kvm_enabled())
+		return 0;
+
+	if (!hyp_mem_base)
+		return -ENOMEM;
+
+	addr = phys_to_virt(hyp_mem_base);
+	ret = create_hyp_mappings(addr, addr + hyp_mem_size - 1, PAGE_HYP);
+	if (ret)
+		return ret;
+
+	preempt_disable();
+	kvm_set_hyp_vector();
+	ret = kvm_call_hyp_nvhe(__pkvm_init, hyp_mem_base, hyp_mem_size,
+				num_possible_cpus(), kern_hyp_va(per_cpu_base));
+	preempt_enable();
+	if (ret)
+		return ret;
+
+	free_hyp_pgds();
+	for_each_possible_cpu(cpu)
+		free_page(per_cpu(kvm_arm_hyp_stack_page, cpu));
+
+	return 0;
+}
+
 /**
  * Inits Hyp-mode on all online CPUs
  */
 static int init_hyp_mode(void)
 {
 	int cpu;
-	int err = 0;
+	int err = -ENOMEM;
+
+	/*
+	 * The protected Hyp-mode cannot be initialized if the memory pool
+	 * allocation has failed.
+	 */
+	if (is_protected_kvm_enabled() && !hyp_mem_base)
+		return err;

 	/*
 	 * Copy the required CPU feature register in their EL2 counterpart
@@ -1854,6 +1902,12 @@ static int init_hyp_mode(void)
 	for_each_possible_cpu(cpu)
 		cpu_prepare_hyp_mode(cpu);

+	err = kvm_hyp_enable_protection();
+	if (err) {
+		kvm_err("Failed to enable hyp memory protection: %d\n", err);
+		goto out_err;
+	}
+
 	return 0;

 out_err:
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 3cf9397dabdb..9d4c9251208e
[RFC PATCH v2 12/26] KVM: arm64: Introduce a Hyp buddy page allocator
When memory protection is enabled, the hyp code will require a basic form
of memory management in order to allocate and free memory pages at EL2.
This is needed for various use-cases, including the creation of hyp
mappings or the allocation of stage 2 page tables.

To address these use-cases, introduce a simple memory allocator in the
hyp code. The allocator is designed as a conventional 'buddy allocator',
working with a page granularity. It allows allocating and freeing
physically contiguous pages from memory 'pools', with a guaranteed order
alignment in the PA space. Each page in a memory pool is associated with
a struct hyp_page which holds the page's metadata, including its refcount
and its current order, hence mimicking the kernel's buddy system in the
GFP infrastructure. The hyp_page metadata are made accessible through a
hyp_vmemmap, following the concept of SPARSE_VMEMMAP in the kernel.

Signed-off-by: Quentin Perret
---
 arch/arm64/kvm/hyp/include/nvhe/gfp.h    |  32
 arch/arm64/kvm/hyp/include/nvhe/memory.h |  25 +++
 arch/arm64/kvm/hyp/nvhe/Makefile         |   2 +-
 arch/arm64/kvm/hyp/nvhe/page_alloc.c     | 185 +++
 4 files changed, 243 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/kvm/hyp/include/nvhe/gfp.h
 create mode 100644 arch/arm64/kvm/hyp/nvhe/page_alloc.c

diff --git a/arch/arm64/kvm/hyp/include/nvhe/gfp.h b/arch/arm64/kvm/hyp/include/nvhe/gfp.h
new file mode 100644
index ..95587faee171
--- /dev/null
+++ b/arch/arm64/kvm/hyp/include/nvhe/gfp.h
@@ -0,0 +1,32 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef __KVM_HYP_GFP_H
+#define __KVM_HYP_GFP_H
+
+#include
+
+#include
+#include
+
+#define HYP_MAX_ORDER	11U
+#define HYP_NO_ORDER	UINT_MAX
+
+struct hyp_pool {
+	hyp_spinlock_t lock;
+	struct list_head free_area[HYP_MAX_ORDER + 1];
+	phys_addr_t range_start;
+	phys_addr_t range_end;
+};
+
+/* GFP flags */
+#define HYP_GFP_NONE	0
+#define HYP_GFP_ZERO	1
+
+/* Allocation */
+void *hyp_alloc_pages(struct hyp_pool *pool, gfp_t mask, unsigned int order);
+void hyp_get_page(void *addr);
+void hyp_put_page(void *addr);
+
+/* Used pages cannot be freed */
+int hyp_pool_init(struct hyp_pool *pool, phys_addr_t phys,
+		  unsigned int nr_pages, unsigned int used_pages);
+#endif /* __KVM_HYP_GFP_H */
diff --git a/arch/arm64/kvm/hyp/include/nvhe/memory.h b/arch/arm64/kvm/hyp/include/nvhe/memory.h
index 64c44c142c95..ed47674bc988 100644
--- a/arch/arm64/kvm/hyp/include/nvhe/memory.h
+++ b/arch/arm64/kvm/hyp/include/nvhe/memory.h
@@ -6,7 +6,17 @@

 #include

+struct hyp_pool;
+struct hyp_page {
+	unsigned int refcount;
+	unsigned int order;
+	struct hyp_pool *pool;
+	struct list_head node;
+};
+
 extern s64 hyp_physvirt_offset;
+extern u64 __hyp_vmemmap;
+#define hyp_vmemmap ((struct hyp_page *)__hyp_vmemmap)

 #define __hyp_pa(virt)	((phys_addr_t)(virt) + hyp_physvirt_offset)
 #define __hyp_va(virt)	((void *)((phys_addr_t)(virt) - hyp_physvirt_offset))
@@ -21,4 +31,19 @@ static inline phys_addr_t hyp_virt_to_phys(void *addr)
 	return __hyp_pa(addr);
 }

+#define hyp_phys_to_pfn(phys)	((phys) >> PAGE_SHIFT)
+#define hyp_phys_to_page(phys)	(_vmemmap[hyp_phys_to_pfn(phys)])
+#define hyp_virt_to_page(virt)	hyp_phys_to_page(__hyp_pa(virt))
+
+#define hyp_page_to_phys(page)	((phys_addr_t)((page) - hyp_vmemmap) << PAGE_SHIFT)
+#define hyp_page_to_virt(page)	__hyp_va(hyp_page_to_phys(page))
+#define hyp_page_to_pool(page)	(((struct hyp_page *)page)->pool)
+
+static inline int hyp_page_count(void *addr)
+{
+	struct hyp_page *p = hyp_virt_to_page(addr);
+
+	return p->refcount;
+}
+
 #endif /* __KVM_HYP_MEMORY_H */
diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
index 33bd381d8f73..9e5eacfec6ec 100644
--- a/arch/arm64/kvm/hyp/nvhe/Makefile
+++ b/arch/arm64/kvm/hyp/nvhe/Makefile
@@ -10,7 +10,7 @@ lib-objs := clear_page.o copy_page.o memcpy.o memset.o
 lib-objs := $(addprefix ../../../lib/, $(lib-objs))

 obj-y := timer-sr.o sysreg-sr.o debug-sr.o switch.o tlb.o hyp-init.o host.o \
-	 hyp-main.o hyp-smp.o psci-relay.o early_alloc.o stub.o
+	 hyp-main.o hyp-smp.o psci-relay.o early_alloc.o stub.o page_alloc.o
 obj-y += ../vgic-v3-sr.o ../aarch32.o ../vgic-v2-cpuif-proxy.o ../entry.o \
 	 ../fpsimd.o ../hyp-entry.o ../exception.o
 obj-y += $(lib-objs)
diff --git a/arch/arm64/kvm/hyp/nvhe/page_alloc.c b/arch/arm64/kvm/hyp/nvhe/page_alloc.c
new file mode 100644
index ..6de6515f0432
--- /dev/null
+++ b/arch/arm64/kvm/hyp/nvhe/page_alloc.c
@@ -0,0 +1,185 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2020 Google LLC
+ * Author: Quentin Perret
+ */
+
+#include
+#include
+
+u64 __hyp_vmemmap;
+
+/*
+ * Example buddy-tree for a 4-pages physically contiguous pool:
+ *
+ *                 o : Page 3
+ *                /
+ *               o-o : Page 2
+ *              /
+ *
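Since the page_alloc.c hunk is cut short in this archive, the following standalone sketch shows the usual address arithmetic behind a buddy allocator with "guaranteed order alignment in the PA space". It is not taken from the patch: the function names and SKETCH_PAGE_SHIFT are hypothetical, 4 KiB pages are assumed, and the XOR trick assumes blocks are aligned on their own size.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/*
 * Physical address of the buddy of the order-'order' block at 'phys'.
 * The two buddies differ only in bit (SKETCH_PAGE_SHIFT + order).
 */
static uint64_t buddy_phys(uint64_t phys, unsigned int order)
{
	return phys ^ ((uint64_t)1 << (SKETCH_PAGE_SHIFT + order));
}

/* Start of the merged block of order 'order + 1': the lower of the two buddies. */
static uint64_t merged_phys(uint64_t phys, unsigned int order)
{
	return phys & ~((uint64_t)1 << (SKETCH_PAGE_SHIFT + order));
}

/* A block of a given order must be aligned on its own size. */
static int is_order_aligned(uint64_t phys, unsigned int order)
{
	uint64_t size = (uint64_t)1 << (SKETCH_PAGE_SHIFT + order);

	return (phys & (size - 1)) == 0;
}
```

For instance, with these definitions the order-0 page at 0x80003000 has its buddy at 0x80002000, and once both are free they can be merged into the order-1 block starting at 0x80002000.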
[RFC PATCH v2 16/26] KVM: arm64: Prepare Hyp memory protection
When memory protection is enabled, the Hyp code needs the ability to
create and manage its own page-table. To do so, introduce a new set of
hypercalls to initialize Hyp memory protection.

During the init hcall, the hypervisor runs with the host-provided
page-table and uses the trivial early page allocator to create its own
set of page-tables, using a memory pool that was donated by the host.
Specifically, the hypervisor creates its own mappings for __hyp_text,
the Hyp memory pool, the __hyp_bss, the portion of hyp_vmemmap
corresponding to the Hyp pool, among other things. It then jumps back in
the idmap page, switches to use the newly-created pgd (instead of the
temporary one provided by the host) and then installs the full-fledged
buddy allocator which will then be the only one in use from then on.

Note that for the sake of simplifying the review, this only introduces
the code doing this operation, without it actually being called by
anything yet. This will be done in a subsequent patch, which will
introduce the necessary host kernel changes.

Credits to Will for __pkvm_init_switch_pgd.

Co-authored-by: Will Deacon
Signed-off-by: Will Deacon
Signed-off-by: Quentin Perret
---
 arch/arm64/include/asm/kvm_asm.h         |   4 +
 arch/arm64/include/asm/kvm_host.h        |   8 +
 arch/arm64/include/asm/kvm_hyp.h         |   8 +
 arch/arm64/kernel/image-vars.h           |  19 +++
 arch/arm64/kvm/hyp/Makefile              |   2 +-
 arch/arm64/kvm/hyp/include/nvhe/memory.h |   6 +
 arch/arm64/kvm/hyp/include/nvhe/mm.h     |  79 +
 arch/arm64/kvm/hyp/nvhe/Makefile         |   4 +-
 arch/arm64/kvm/hyp/nvhe/hyp-init.S       |  31
 arch/arm64/kvm/hyp/nvhe/hyp-main.c       |  42 +
 arch/arm64/kvm/hyp/nvhe/mm.c             | 174
 arch/arm64/kvm/hyp/nvhe/setup.c          | 196 +++
 arch/arm64/kvm/hyp/reserved_mem.c        | 102
 arch/arm64/kvm/mmu.c                     |   2 +-
 arch/arm64/mm/init.c                     |   3 +
 15 files changed, 676 insertions(+), 4 deletions(-)
 create mode 100644 arch/arm64/kvm/hyp/include/nvhe/mm.h
 create mode 100644 arch/arm64/kvm/hyp/nvhe/mm.c
 create mode 100644 arch/arm64/kvm/hyp/nvhe/setup.c
 create mode 100644 arch/arm64/kvm/hyp/reserved_mem.c

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 7ccf770c53d9..4fc27ac08836 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -57,6 +57,10 @@
 #define __KVM_HOST_SMCCC_FUNC___kvm_get_mdcr_el2		12
 #define __KVM_HOST_SMCCC_FUNC___vgic_v3_save_aprs		13
 #define __KVM_HOST_SMCCC_FUNC___vgic_v3_restore_aprs		14
+#define __KVM_HOST_SMCCC_FUNC___pkvm_init			15
+#define __KVM_HOST_SMCCC_FUNC___pkvm_create_mappings		16
+#define __KVM_HOST_SMCCC_FUNC___pkvm_create_private_mapping	17
+#define __KVM_HOST_SMCCC_FUNC___pkvm_cpu_set_vector		18

 #ifndef __ASSEMBLY__

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 81212958ef55..9a2feb83eea0 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -777,4 +777,12 @@ bool kvm_arm_vcpu_is_finalized(struct kvm_vcpu *vcpu);
 #define kvm_vcpu_has_pmu(vcpu)					\
 	(test_bit(KVM_ARM_VCPU_PMU_V3, (vcpu)->arch.features))

+#ifdef CONFIG_KVM
+extern phys_addr_t hyp_mem_base;
+extern phys_addr_t hyp_mem_size;
+void __init kvm_hyp_reserve(void);
+#else
+static inline void kvm_hyp_reserve(void) { }
+#endif
+
 #endif /* __ARM64_KVM_HOST_H__ */
diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
index c0450828378b..a0e113734b20 100644
--- a/arch/arm64/include/asm/kvm_hyp.h
+++ b/arch/arm64/include/asm/kvm_hyp.h
@@ -100,4 +100,12 @@ void __noreturn hyp_panic(void);
 void __noreturn __hyp_do_panic(bool restore_host, u64 spsr, u64 elr, u64 par);
 #endif

+#ifdef __KVM_NVHE_HYPERVISOR__
+void __pkvm_init_switch_pgd(phys_addr_t phys, unsigned long size,
+			    phys_addr_t pgd, void *sp, void *cont_fn);
+int __pkvm_init(phys_addr_t phys, unsigned long size, unsigned long nr_cpus,
+		unsigned long *per_cpu_base);
+void __noreturn __host_enter(struct kvm_cpu_context *host_ctxt);
+#endif
+
 #endif /* __ARM64_KVM_HYP_H__ */
diff --git a/arch/arm64/kernel/image-vars.h b/arch/arm64/kernel/image-vars.h
index 43f3a1d6e92d..366d837f0d39 100644
--- a/arch/arm64/kernel/image-vars.h
+++ b/arch/arm64/kernel/image-vars.h
@@ -113,6 +113,25 @@ KVM_NVHE_ALIAS_HYP(__memcpy, __pi_memcpy);
 KVM_NVHE_ALIAS_HYP(__memset, __pi_memset);
 #endif

+/* Hypervisor VA size */
+KVM_NVHE_ALIAS(hyp_va_bits);
+
+/* Kernel memory sections */
+KVM_NVHE_ALIAS(__start_rodata);
+KVM_NVHE_ALIAS(__end_rodata);
+KVM_NVHE_ALIAS(__bss_start);
+KVM_NVHE_ALIAS(__bss_stop);
+
+/* Hyp memory sections */
+KVM_NVHE_ALIAS(__hyp_idmap_text_start);
[RFC PATCH v2 08/26] KVM: arm64: Make kvm_call_hyp() a function call at Hyp
kvm_call_hyp() has some logic to issue a function call or a hypercall
depending on the EL at which the kernel is running. However, all the code
compiled under __KVM_NVHE_HYPERVISOR__ is guaranteed to run only at EL2,
in which case a simple function call is all that is needed.

Add ifdefery to kvm_host.h to simplify kvm_call_hyp() in .hyp.text.

Signed-off-by: Quentin Perret
---
 arch/arm64/include/asm/kvm_host.h | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 8fcfab0c2567..81212958ef55 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -592,6 +592,7 @@ int kvm_test_age_hva(struct kvm *kvm, unsigned long hva);
 void kvm_arm_halt_guest(struct kvm *kvm);
 void kvm_arm_resume_guest(struct kvm *kvm);

+#ifndef __KVM_NVHE_HYPERVISOR__
 #define kvm_call_hyp_nvhe(f, ...)					\
 	({								\
 		struct arm_smccc_res res;				\
@@ -631,6 +632,11 @@ void kvm_arm_resume_guest(struct kvm *kvm);
 									\
 		ret;							\
 	})
+#else /* __KVM_NVHE_HYPERVISOR__ */
+#define kvm_call_hyp(f, ...) f(__VA_ARGS__)
+#define kvm_call_hyp_ret(f, ...) f(__VA_ARGS__)
+#define kvm_call_hyp_nvhe(f, ...) f(__VA_ARGS__)
+#endif /* __KVM_NVHE_HYPERVISOR__ */

 void force_vm_exit(const cpumask_t *mask);
 void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);
--
2.30.0.284.gd98b1dd5eaa7-goog
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
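As a rough illustration of the idea in this patch, the standalone sketch below selects between a "hypercall" flavour and a plain function call at preprocessing time. Everything in it is hypothetical (SKETCH_AT_HYP, sketch_call_hyp, the counter that stands in for a trap to EL2); only the shape of the `f(__VA_ARGS__)` definition mirrors the patch.

```c
/* Define SKETCH_AT_HYP to select the "already running at EL2" flavour. */
#define SKETCH_AT_HYP

static int sketch_hypercalls;	/* counts simulated traps to EL2 */

#ifndef SKETCH_AT_HYP
/* Host flavour: account for a (simulated) hypercall, then run f. */
#define sketch_call_hyp(f, ...)	(sketch_hypercalls++, f(__VA_ARGS__))
#else
/* Hyp flavour: the caller is already at EL2, so this is a plain call. */
#define sketch_call_hyp(f, ...)	f(__VA_ARGS__)
#endif

static int add_one(int x)
{
	return x + 1;
}

/* Shared code: the same source line works in both flavours. */
static int demo_call(void)
{
	return sketch_call_hyp(add_one, 41);
}
```

With SKETCH_AT_HYP defined, demo_call() expands to a direct add_one(41) call and the hypercall counter is never touched, which is the effect the patch aims for in code built under __KVM_NVHE_HYPERVISOR__.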
[RFC PATCH v2 07/26] KVM: arm64: Introduce a BSS section for use at Hyp
Currently, the hyp code cannot make full use of a bss, as the kernel
section is mapped read-only.

While this mapping could simply be changed to read-write, it would
intermingle the hyp and kernel state even more than they currently are.
Instead, introduce a __hyp_bss section, that uses reserved pages, and
create the appropriate RW hyp mappings during KVM init.

Signed-off-by: Quentin Perret
---
 arch/arm64/include/asm/sections.h |  1 +
 arch/arm64/kernel/vmlinux.lds.S   |  7 +++
 arch/arm64/kvm/arm.c              | 11 +++
 arch/arm64/kvm/hyp/nvhe/hyp.lds.S |  1 +
 4 files changed, 20 insertions(+)

diff --git a/arch/arm64/include/asm/sections.h b/arch/arm64/include/asm/sections.h
index 8ff579361731..f58cf493de16 100644
--- a/arch/arm64/include/asm/sections.h
+++ b/arch/arm64/include/asm/sections.h
@@ -12,6 +12,7 @@ extern char __hibernate_exit_text_start[], __hibernate_exit_text_end[];
 extern char __hyp_idmap_text_start[], __hyp_idmap_text_end[];
 extern char __hyp_text_start[], __hyp_text_end[];
 extern char __hyp_data_ro_after_init_start[], __hyp_data_ro_after_init_end[];
+extern char __hyp_bss_start[], __hyp_bss_end[];
 extern char __idmap_text_start[], __idmap_text_end[];
 extern char __initdata_begin[], __initdata_end[];
 extern char __inittext_begin[], __inittext_end[];
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 43af13968dfd..3eca35d5a7cf 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -8,6 +8,13 @@
 #define RO_EXCEPTION_TABLE_ALIGN	8
 #define RUNTIME_DISCARD_EXIT

+#define BSS_FIRST_SECTIONS				\
+	. = ALIGN(PAGE_SIZE);				\
+	__hyp_bss_start = .;				\
+	*(.hyp.bss)					\
+	. = ALIGN(PAGE_SIZE);				\
+	__hyp_bss_end = .;
+
 #include
 #include
 #include
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 3ac0f3425833..51b53ca36dc5 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1770,7 +1770,18 @@ static int init_hyp_mode(void)
 		goto out_err;
 	}

+	/*
+	 * .hyp.bss is placed at the beginning of the .bss section, so map that
+	 * part RW, and the rest RO as the hyp shouldn't be touching it.
+	 */
 	err = create_hyp_mappings(kvm_ksym_ref(__bss_start),
+				  kvm_ksym_ref(__hyp_bss_end), PAGE_HYP);
+	if (err) {
+		kvm_err("Cannot map hyp bss section: %d\n", err);
+		goto out_err;
+	}
+
+	err = create_hyp_mappings(kvm_ksym_ref(__hyp_bss_end),
 				  kvm_ksym_ref(__bss_stop), PAGE_HYP_RO);
 	if (err) {
 		kvm_err("Cannot map bss section\n");
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp.lds.S b/arch/arm64/kvm/hyp/nvhe/hyp.lds.S
index 5d76ff2ba63e..dc281d90063e 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp.lds.S
+++ b/arch/arm64/kvm/hyp/nvhe/hyp.lds.S
@@ -17,4 +17,5 @@ SECTIONS {
 		PERCPU_INPUT(L1_CACHE_BYTES)
 	}
 	HYP_SECTION(.data..ro_after_init)
+	HYP_SECTION(.bss)
 }
[RFC PATCH v2 09/26] KVM: arm64: Allow using kvm_nvhe_sym() in hyp code
In order to allow the usage of code shared by the host and the hyp in
static inline library functions, allow the usage of kvm_nvhe_sym() at EL2
by defaulting to the raw symbol name.

Signed-off-by: Quentin Perret
---
 arch/arm64/include/asm/hyp_image.h | 4
 1 file changed, 4 insertions(+)

diff --git a/arch/arm64/include/asm/hyp_image.h b/arch/arm64/include/asm/hyp_image.h
index e06842756051..fb16e1018ea9 100644
--- a/arch/arm64/include/asm/hyp_image.h
+++ b/arch/arm64/include/asm/hyp_image.h
@@ -7,11 +7,15 @@
 #ifndef __ARM64_HYP_IMAGE_H__
 #define __ARM64_HYP_IMAGE_H__

+#ifndef __KVM_NVHE_HYPERVISOR__
 /*
  * KVM nVHE code has its own symbol namespace prefixed with __kvm_nvhe_,
  * to separate it from the kernel proper.
  */
 #define kvm_nvhe_sym(sym)	__kvm_nvhe_##sym
+#else
+#define kvm_nvhe_sym(sym)	sym
+#endif

 #ifdef LINKER_SCRIPT

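The two definitions of kvm_nvhe_sym() can be contrasted in a small standalone sketch. Because a single file cannot define one macro in two ways, the sketch uses two hypothetical macros (host_nvhe_sym and hyp_nvhe_sym) and a hypothetical sketch_nvhe_ prefix in place of __kvm_nvhe_; the variables are stand-ins, not kernel symbols.

```c
/* Host view: hyp symbols live in a prefixed namespace. */
#define host_nvhe_sym(sym)	sketch_nvhe_##sym
/* Hyp view: the code is compiled as hyp code, so the raw name is used. */
#define hyp_nvhe_sym(sym)	sym

static int sketch_nvhe_counter = 7;	/* the name the host-side macro resolves to */
static int counter = 3;			/* the name the hyp-side macro resolves to */

/* Same spelling at the call site, different symbol depending on the flavour. */
static int read_from_host(void)
{
	return host_nvhe_sym(counter);	/* expands to sketch_nvhe_counter */
}

static int read_from_hyp(void)
{
	return hyp_nvhe_sym(counter);	/* expands to counter */
}
```

A static inline helper shared by both sides can therefore be written once with the macro, and each build picks the symbol name that is valid in its own namespace.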
[RFC PATCH v2 04/26] KVM: arm64: Initialize kvm_nvhe_init_params early
Move the initialization of kvm_nvhe_init_params into a dedicated function
that is run early, and only once during KVM init, rather than every time
the KVM vectors are set and reset.

This also opens the opportunity for the hypervisor to change the init
structs during boot, hence simplifying the replacement of host-provided
page-tables and stacks by the ones the hypervisor will create for itself.

Signed-off-by: Quentin Perret
---
 arch/arm64/kvm/arm.c | 28
 1 file changed, 20 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 04c44853b103..3ac0f3425833 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1383,21 +1383,17 @@ static int kvm_init_vector_slots(void)
 	return 0;
 }

-static void cpu_init_hyp_mode(void)
+static void cpu_prepare_hyp_mode(int cpu)
 {
-	struct kvm_nvhe_init_params *params = this_cpu_ptr_nvhe_sym(kvm_init_params);
-	struct arm_smccc_res res;
+	struct kvm_nvhe_init_params *params = per_cpu_ptr_nvhe_sym(kvm_init_params, cpu);
 	unsigned long tcr;

-	/* Switch from the HYP stub to our own HYP init vector */
-	__hyp_set_vectors(kvm_get_idmap_vector());
-
 	/*
 	 * Calculate the raw per-cpu offset without a translation from the
 	 * kernel's mapping to the linear mapping, and store it in tpidr_el2
 	 * so that we can use adr_l to access per-cpu variables in EL2.
 	 */
-	params->tpidr_el2 = (unsigned long)this_cpu_ptr_nvhe_sym(__per_cpu_start) -
+	params->tpidr_el2 = (unsigned long)per_cpu_ptr_nvhe_sym(__per_cpu_start, cpu) -
 			    (unsigned long)kvm_ksym_ref(CHOOSE_NVHE_SYM(__per_cpu_start));

 	params->mair_el2 = read_sysreg(mair_el1);
@@ -1421,7 +1417,7 @@ static void cpu_init_hyp_mode(void)
 	tcr |= (idmap_t0sz & GENMASK(TCR_TxSZ_WIDTH - 1, 0)) << TCR_T0SZ_OFFSET;
 	params->tcr_el2 = tcr;

-	params->stack_hyp_va = kern_hyp_va(__this_cpu_read(kvm_arm_hyp_stack_page) + PAGE_SIZE);
+	params->stack_hyp_va = kern_hyp_va(per_cpu(kvm_arm_hyp_stack_page, cpu) + PAGE_SIZE);
 	params->pgd_pa = kvm_mmu_get_httbr();

 	/*
@@ -1429,6 +1425,15 @@ static void cpu_init_hyp_mode(void)
 	 * be read while the MMU is off.
 	 */
 	kvm_flush_dcache_to_poc(params, sizeof(*params));
+}
+
+static void cpu_init_hyp_mode(void)
+{
+	struct kvm_nvhe_init_params *params;
+	struct arm_smccc_res res;
+
+	/* Switch from the HYP stub to our own HYP init vector */
+	__hyp_set_vectors(kvm_get_idmap_vector());

 	/*
 	 * Call initialization code, and switch to the full blown HYP code.
@@ -1437,6 +1442,7 @@ static void cpu_init_hyp_mode(void)
 	 * cpus_have_const_cap() wrapper.
 	 */
 	BUG_ON(!system_capabilities_finalized());
+	params = this_cpu_ptr_nvhe_sym(kvm_init_params);
 	arm_smccc_1_1_hvc(KVM_HOST_SMCCC_FUNC(__kvm_hyp_init), virt_to_phys(params), );
 	WARN_ON(res.a0 != SMCCC_RET_SUCCESS);

@@ -1807,6 +1813,12 @@ static int init_hyp_mode(void)
 		goto out_err;
 	}

+	/*
+	 * Prepare the CPU initialization parameters
+	 */
+	for_each_possible_cpu(cpu)
+		cpu_prepare_hyp_mode(cpu);
+
 	return 0;

 out_err:
[RFC PATCH v2 15/26] of/fdt: Introduce early_init_dt_add_memory_hyp()
Introduce early_init_dt_add_memory_hyp() to allow KVM to keep a copy of
the memory regions parsed from DT. This will be needed in the context of
the protected nVHE feature of KVM/arm64 where the code running at EL2
will be cleanly separated from the host kernel during boot, and will need
its own representation of memory.

Signed-off-by: Quentin Perret
---
 drivers/of/fdt.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
index 4602e467ca8b..af2b5a09c5b4 100644
--- a/drivers/of/fdt.c
+++ b/drivers/of/fdt.c
@@ -1099,6 +1099,10 @@ int __init early_init_dt_scan_chosen(unsigned long node, const char *uname,
 #define MAX_MEMBLOCK_ADDR	((phys_addr_t)~0)
 #endif

+void __init __weak early_init_dt_add_memory_hyp(u64 base, u64 size)
+{
+}
+
 void __init __weak early_init_dt_add_memory_arch(u64 base, u64 size)
 {
 	const u64 phys_offset = MIN_MEMBLOCK_ADDR;
@@ -1139,6 +1143,7 @@ void __init __weak early_init_dt_add_memory_arch(u64 base, u64 size)
 		base = phys_offset;
 	}
 	memblock_add(base, size);
+	early_init_dt_add_memory_hyp(base, size);
 }

 int __init __weak early_init_dt_mark_hotplug_memory_arch(u64 base, u64 size)
[RFC PATCH v2 06/26] KVM: arm64: Factor memory allocation out of pgtable.c
In preparation for enabling the creation of page-tables at EL2, factor
all memory allocation out of the page-table code, hence making it
re-usable with any compatible memory allocator.

No functional changes intended.

Signed-off-by: Quentin Perret
---
 arch/arm64/include/asm/kvm_pgtable.h | 32 +-
 arch/arm64/kvm/hyp/pgtable.c         | 90 +---
 arch/arm64/kvm/mmu.c                 | 70 +-
 3 files changed, 154 insertions(+), 38 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
index 52ab38db04c7..45acc9dc6c45 100644
--- a/arch/arm64/include/asm/kvm_pgtable.h
+++ b/arch/arm64/include/asm/kvm_pgtable.h
@@ -13,17 +13,41 @@

 typedef u64 kvm_pte_t;

+/**
+ * struct kvm_pgtable_mm_ops - Memory management callbacks.
+ * @zalloc_page:		Allocate a zeroed memory page.
+ * @zalloc_pages_exact:	Allocate an exact number of zeroed memory pages.
+ * @free_pages_exact:		Free an exact number of memory pages.
+ * @get_page:			Increment the refcount on a page.
+ * @put_page:			Decrement the refcount on a page.
+ * @page_count:		Returns the refcount of a page.
+ * @phys_to_virt:		Convert a physical address into a virtual address.
+ * @virt_to_phys:		Convert a virtual address into a physical address.
+ */
+struct kvm_pgtable_mm_ops {
+	void*		(*zalloc_page)(void *arg);
+	void*		(*zalloc_pages_exact)(size_t size);
+	void		(*free_pages_exact)(void *addr, size_t size);
+	void		(*get_page)(void *addr);
+	void		(*put_page)(void *addr);
+	int		(*page_count)(void *addr);
+	void*		(*phys_to_virt)(phys_addr_t phys);
+	phys_addr_t	(*virt_to_phys)(void *addr);
+};
+
 /**
  * struct kvm_pgtable - KVM page-table.
  * @ia_bits:		Maximum input address size, in bits.
  * @start_level:	Level at which the page-table walk starts.
  * @pgd:		Pointer to the first top-level entry of the page-table.
+ * @mm_ops:		Memory management callbacks.
  * @mmu:		Stage-2 KVM MMU struct. Unused for stage-1 page-tables.
  */
 struct kvm_pgtable {
 	u32					ia_bits;
 	u32					start_level;
 	kvm_pte_t				*pgd;
+	struct kvm_pgtable_mm_ops		*mm_ops;

 	/* Stage-2 only */
 	struct kvm_s2_mmu			*mmu;
@@ -86,10 +110,12 @@ struct kvm_pgtable_walker {
  * kvm_pgtable_hyp_init() - Initialise a hypervisor stage-1 page-table.
  * @pgt:	Uninitialised page-table structure to initialise.
  * @va_bits:	Maximum virtual address bits.
+ * @mm_ops:	Memory management callbacks.
  *
  * Return: 0 on success, negative error code on failure.
  */
-int kvm_pgtable_hyp_init(struct kvm_pgtable *pgt, u32 va_bits);
+int kvm_pgtable_hyp_init(struct kvm_pgtable *pgt, u32 va_bits,
+			 struct kvm_pgtable_mm_ops *mm_ops);

 /**
  * kvm_pgtable_hyp_destroy() - Destroy an unused hypervisor stage-1 page-table.
@@ -126,10 +152,12 @@ int kvm_pgtable_hyp_map(struct kvm_pgtable *pgt, u64 addr, u64 size, u64 phys,
  * kvm_pgtable_stage2_init() - Initialise a guest stage-2 page-table.
  * @pgt:	Uninitialised page-table structure to initialise.
  * @kvm:	KVM structure representing the guest virtual machine.
+ * @mm_ops:	Memory management callbacks.
  *
  * Return: 0 on success, negative error code on failure.
  */
-int kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm *kvm);
+int kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm *kvm,
+			    struct kvm_pgtable_mm_ops *mm_ops);

 /**
  * kvm_pgtable_stage2_destroy() - Destroy an unused guest stage-2 page-table.
diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index d7122c5eac24..61a8a34ddfdb 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -148,9 +148,9 @@ static kvm_pte_t kvm_phys_to_pte(u64 pa)
 	return pte;
 }

-static kvm_pte_t *kvm_pte_follow(kvm_pte_t pte)
+static kvm_pte_t *kvm_pte_follow(kvm_pte_t pte, struct kvm_pgtable_mm_ops *mm_ops)
 {
-	return __va(kvm_pte_to_phys(pte));
+	return mm_ops->phys_to_virt(kvm_pte_to_phys(pte));
 }

 static void kvm_set_invalid_pte(kvm_pte_t *ptep)
@@ -159,9 +159,10 @@ static void kvm_set_invalid_pte(kvm_pte_t *ptep)
 	WRITE_ONCE(*ptep, pte & ~KVM_PTE_VALID);
 }

-static void kvm_set_table_pte(kvm_pte_t *ptep, kvm_pte_t *childp)
+static void kvm_set_table_pte(kvm_pte_t *ptep, kvm_pte_t *childp,
+			      struct kvm_pgtable_mm_ops *mm_ops)
 {
-	kvm_pte_t old = *ptep, pte = kvm_phys_to_pte(__pa(childp));
+	kvm_pte_t old = *ptep, pte = kvm_phys_to_pte(mm_ops->virt_to_phys(childp));

 	pte |= FIELD_PREP(KVM_PTE_TYPE,
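To make the callback indirection concrete, here is a minimal userspace sketch in which "page-table" code allocates and frees a child table purely through a table of function pointers. It only loosely mirrors struct kvm_pgtable_mm_ops: the struct, the libc-backed callbacks, the identity address conversion and the 0x3 "table" marker are all assumptions made for the example.

```c
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_PAGE_SIZE 4096

/* Hypothetical, reduced version of the callback table. */
struct sketch_mm_ops {
	void *(*zalloc_page)(void *arg);
	void (*put_page)(void *addr);
	void *(*phys_to_virt)(uint64_t phys);
	uint64_t (*virt_to_phys)(void *addr);
};

static int live_pages;	/* pages currently handed out by the backend */

/* One possible backend: libc allocation, identity "physical" addresses. */
static void *libc_zalloc_page(void *arg)
{
	void *p = calloc(1, SKETCH_PAGE_SIZE);

	(void)arg;
	if (p)
		live_pages++;
	return p;
}

static void libc_put_page(void *addr)
{
	free(addr);
	live_pages--;
}

static void *identity_phys_to_virt(uint64_t phys)
{
	return (void *)(uintptr_t)phys;
}

static uint64_t identity_virt_to_phys(void *addr)
{
	return (uint64_t)(uintptr_t)addr;
}

static struct sketch_mm_ops libc_mm_ops = {
	.zalloc_page	= libc_zalloc_page,
	.put_page	= libc_put_page,
	.phys_to_virt	= identity_phys_to_virt,
	.virt_to_phys	= identity_virt_to_phys,
};

/* Allocator-agnostic code: install a child table into *entry. */
static int install_table(uint64_t *entry, struct sketch_mm_ops *ops)
{
	void *child = ops->zalloc_page(NULL);

	if (!child)
		return -1;
	*entry = ops->virt_to_phys(child) | 0x3;	/* 0x3: made-up "table" marker */
	return 0;
}

/* Allocator-agnostic code: follow *entry and release the child table. */
static void free_table(uint64_t *entry, struct sketch_mm_ops *ops)
{
	ops->put_page(ops->phys_to_virt(*entry & ~(uint64_t)0x3));
	*entry = 0;
}

/* Round trip through the callbacks; returns the page count seen in between. */
static int demo_roundtrip(void)
{
	uint64_t entry = 0;
	int seen;

	if (install_table(&entry, &libc_mm_ops))
		return -1;
	seen = live_pages;
	free_table(&entry, &libc_mm_ops);
	return seen;
}
```

install_table() and free_table() never name an allocator, so swapping libc_mm_ops for another backend requires no change to them, which is the property the patch is after for running the page-table code at EL2.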
[RFC PATCH v2 05/26] KVM: arm64: Avoid free_page() in page-table allocator
Currently, the KVM page-table allocator uses a mix of put_page() and
free_page() calls depending on the context even though page-allocation
is always achieved using variants of __get_free_page().

Make the code consistent by using put_page() throughout, and reduce the
memory management API surface used by the page-table code. This will
ease factoring out page-allocation from pgtable.c, which is a
pre-requisite to creating page-tables at EL2.

Signed-off-by: Quentin Perret
---
 arch/arm64/kvm/hyp/pgtable.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index 0271b4a3b9fe..d7122c5eac24 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -410,7 +410,7 @@ int kvm_pgtable_hyp_init(struct kvm_pgtable *pgt, u32 va_bits)
 static int hyp_free_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep,
 			   enum kvm_pgtable_walk_flags flag, void * const arg)
 {
-	free_page((unsigned long)kvm_pte_follow(*ptep));
+	put_page(virt_to_page(kvm_pte_follow(*ptep)));
 	return 0;
 }

@@ -422,7 +422,7 @@ void kvm_pgtable_hyp_destroy(struct kvm_pgtable *pgt)
 	};

 	WARN_ON(kvm_pgtable_walk(pgt, 0, BIT(pgt->ia_bits), ));
-	free_page((unsigned long)pgt->pgd);
+	put_page(virt_to_page(pgt->pgd));
 	pgt->pgd = NULL;
 }

@@ -551,7 +551,7 @@ static int stage2_map_walk_table_post(u64 addr, u64 end, u32 level,
 	if (!data->anchor)
 		return 0;

-	free_page((unsigned long)kvm_pte_follow(*ptep));
+	put_page(virt_to_page(kvm_pte_follow(*ptep)));
 	put_page(virt_to_page(ptep));

 	if (data->anchor == ptep) {
@@ -674,7 +674,7 @@ static int stage2_unmap_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep,
 	}

 	if (childp)
-		free_page((unsigned long)childp);
+		put_page(virt_to_page(childp));

 	return 0;
 }
@@ -871,7 +871,7 @@ static int stage2_free_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep,
 	put_page(virt_to_page(ptep));

 	if (kvm_pte_table(pte, level))
-		free_page((unsigned long)kvm_pte_follow(pte));
+		put_page(virt_to_page(kvm_pte_follow(pte)));

 	return 0;
 }
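The reason a final put_page() can stand in for free_page() is the reference count: a page obtained from the allocator starts with one reference, and dropping the last reference releases it. The toy model below illustrates that rule only; struct sketch_page and the sketch_* helpers are hypothetical and do not reproduce the kernel's struct page.

```c
/* Hypothetical stand-in for struct page: only what the sketch needs. */
struct sketch_page {
	int refcount;	/* number of users holding the page */
	int freed;	/* set once the page has gone back to the allocator */
};

/* A freshly allocated page starts with a single reference. */
static void sketch_alloc_page(struct sketch_page *p)
{
	p->refcount = 1;
	p->freed = 0;
}

static void sketch_get_page(struct sketch_page *p)
{
	p->refcount++;
}

/* Drop one reference; the page is released when the last one goes away. */
static void sketch_put_page(struct sketch_page *p)
{
	if (--p->refcount == 0)
		p->freed = 1;
}

/* Returns 1 if the page is released only by the final put, 0 otherwise. */
static int demo_put_page(void)
{
	struct sketch_page page;

	sketch_alloc_page(&page);
	sketch_get_page(&page);		/* e.g. an entry was installed in the table */
	sketch_put_page(&page);		/* entry removed: the page must survive */
	if (page.freed)
		return 0;
	sketch_put_page(&page);		/* last reference: the page is released */
	return page.freed;
}
```

Using a single release primitive means the page-table code needs only get/put style callbacks from its allocator, rather than a separate "free" entry point.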
[RFC PATCH v2 20/26] KVM: arm64: Set host stage 2 using kvm_nvhe_init_params
Move the registers relevant to host stage 2 enablement to
kvm_nvhe_init_params to prepare the ground for enabling it in later
patches.

Signed-off-by: Quentin Perret
---
 arch/arm64/include/asm/kvm_asm.h   | 3 +++
 arch/arm64/kernel/asm-offsets.c    | 3 +++
 arch/arm64/kvm/arm.c               | 5 +
 arch/arm64/kvm/hyp/nvhe/hyp-init.S | 9 +
 arch/arm64/kvm/hyp/nvhe/switch.c   | 5 +
 5 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 4fc27ac08836..5354b05eb9e2 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -158,6 +158,9 @@ struct kvm_nvhe_init_params {
 	unsigned long tpidr_el2;
 	unsigned long stack_hyp_va;
 	phys_addr_t pgd_pa;
+	unsigned long hcr_el2;
+	unsigned long vttbr;
+	unsigned long vtcr;
 };
 
 /* Translate a kernel address @ptr into its equivalent linear mapping */
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index 5e82488f1b82..9cf7736e31db 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -114,6 +114,9 @@ int main(void)
   DEFINE(NVHE_INIT_TPIDR_EL2,	offsetof(struct kvm_nvhe_init_params, tpidr_el2));
   DEFINE(NVHE_INIT_STACK_HYP_VA,	offsetof(struct kvm_nvhe_init_params, stack_hyp_va));
   DEFINE(NVHE_INIT_PGD_PA,	offsetof(struct kvm_nvhe_init_params, pgd_pa));
+  DEFINE(NVHE_INIT_HCR_EL2,	offsetof(struct kvm_nvhe_init_params, hcr_el2));
+  DEFINE(NVHE_INIT_VTTBR,	offsetof(struct kvm_nvhe_init_params, vttbr));
+  DEFINE(NVHE_INIT_VTCR,	offsetof(struct kvm_nvhe_init_params, vtcr));
 #endif
 #ifdef CONFIG_CPU_PM
   DEFINE(CPU_CTX_SP,		offsetof(struct cpu_suspend_ctx, sp));
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index e524682c2ccf..00cee4489cd7 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1413,6 +1413,11 @@ static void cpu_prepare_hyp_mode(int cpu)
 	params->stack_hyp_va = kern_hyp_va(per_cpu(kvm_arm_hyp_stack_page, cpu) + PAGE_SIZE);
 	params->pgd_pa = kvm_mmu_get_httbr();
+	if (is_protected_kvm_enabled())
+		params->hcr_el2 = HCR_HOST_NVHE_PROTECTED_FLAGS;
+	else
+		params->hcr_el2 = HCR_HOST_NVHE_FLAGS;
+	params->vttbr = params->vtcr = 0;
 
 	/*
 	 * Flush the init params from the data cache because the struct will
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-init.S b/arch/arm64/kvm/hyp/nvhe/hyp-init.S
index ad943966c39f..b1341bb4b453 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-init.S
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-init.S
@@ -102,6 +102,15 @@ alternative_else_nop_endif
 	ldr	x1, [x0, #NVHE_INIT_MAIR_EL2]
 	msr	mair_el2, x1
 
+	ldr	x1, [x0, #NVHE_INIT_HCR_EL2]
+	msr	hcr_el2, x1
+
+	ldr	x1, [x0, #NVHE_INIT_VTTBR]
+	msr	vttbr_el2, x1
+
+	ldr	x1, [x0, #NVHE_INIT_VTCR]
+	msr	vtcr_el2, x1
+
 	ldr	x1, [x0, #NVHE_INIT_PGD_PA]
 	phys_to_ttbr x2, x1
 alternative_if ARM64_HAS_CNP
diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c
index f3d0e9eca56c..979a76cdf9fb 100644
--- a/arch/arm64/kvm/hyp/nvhe/switch.c
+++ b/arch/arm64/kvm/hyp/nvhe/switch.c
@@ -97,10 +97,7 @@ static void __deactivate_traps(struct kvm_vcpu *vcpu)
 	mdcr_el2 |= MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT;
 	write_sysreg(mdcr_el2, mdcr_el2);
-	if (is_protected_kvm_enabled())
-		write_sysreg(HCR_HOST_NVHE_PROTECTED_FLAGS, hcr_el2);
-	else
-		write_sysreg(HCR_HOST_NVHE_FLAGS, hcr_el2);
+	write_sysreg(this_cpu_ptr(&kvm_init_params)->hcr_el2, hcr_el2);
 	write_sysreg(CPTR_EL2_DEFAULT, cptr_el2);
 	write_sysreg(__kvm_hyp_host_vector, vbar_el2);
 }
-- 
2.30.0.284.gd98b1dd5eaa7-goog

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
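Editor's illustration (not part of the patch): the patch fills a parameter
struct in C and has the init assembly read individual fields through
offsets generated with offsetof(). The same pattern can be sketched in
plain userspace C. Everything named below (struct init_params, the
PARAMS_OFF_* constants, load_field()) is a hypothetical stand-in for
kvm_nvhe_init_params, the NVHE_INIT_* offsets and the
"ldr x1, [x0, #offset]" loads; it is a minimal sketch, not kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-in for struct kvm_nvhe_init_params. */
struct init_params {
	unsigned long tpidr_el2;
	unsigned long stack_hyp_va;
	unsigned long pgd_pa;
	unsigned long hcr_el2;
	unsigned long vttbr;
	unsigned long vtcr;
};

/*
 * Byte offsets of the new fields. These play the role of the
 * DEFINE(NVHE_INIT_*, offsetof(...)) lines in asm-offsets.c.
 */
enum {
	PARAMS_OFF_HCR_EL2 = offsetof(struct init_params, hcr_el2),
	PARAMS_OFF_VTTBR   = offsetof(struct init_params, vttbr),
	PARAMS_OFF_VTCR    = offsetof(struct init_params, vtcr),
};

/* Mimics "ldr x1, [x0, #off]": read one unsigned long at base + off. */
static unsigned long load_field(const void *base, size_t off)
{
	unsigned long val;

	memcpy(&val, (const char *)base + off, sizeof(val));
	return val;
}
```

With a struct whose hcr_el2 field was set by the C side,
load_field(&params, PARAMS_OFF_HCR_EL2) returns that value without the
reader needing to know the struct layout, which is the property the
assembly relies on.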
[RFC PATCH v2 00/26] KVM/arm64: A stage 2 for the host
Hi all,

This is the v2 of the series previously posted here:

  https://lore.kernel.org/kvmarm/20201117181607.1761516-1-qper...@google.com/

This basically allows us to wrap the host with a stage 2 when running in
nVHE, hence paving the way for protecting guest memory from the host in
the future (among other use-cases). For more details about the
motivation and the design angle taken here, I would recommend to have a
look at the cover letter of v1, and/or to watch these presentations at
LPC [1] and KVM forum 2020 [2].

In short, the changes since v1 include:

  - Renamed most pkvm-specific pgtable functions as pkvm_* to avoid
    confusion with the host's (Fuad)

  - Added an IC flush when switching pgtables (Fuad, Mark)

  - Cleaned-up the PI aliasing in image-vars.h (David)

  - Added a TLB flush when enabling the host stage 2 to avoid stale TLBs
    from bootloader

  - Fixed the early memory reservation by using NR_CPUS instead of
    num_possible_cpus() (which is always 1 that early)

  - Added missing preempt_{dis,en}able() guards in
    kvm_hyp_enable_protection()

  - Rebased on latest kvmarm/next

And if you'd like a branch that has all the goodies, there it is:

  https://android-kvm.googlesource.com/linux qperret/host-stage2-v2

Thanks!
Quentin

[1] https://youtu.be/54q6RzS9BpQ?t=10859
[2] https://kvmforum2020.sched.com/event/eE24/virtualization-for-the-masses-exposing-kvm-on-android-will-deacon-google

Quentin Perret (23):
  KVM: arm64: Initialize kvm_nvhe_init_params early
  KVM: arm64: Avoid free_page() in page-table allocator
  KVM: arm64: Factor memory allocation out of pgtable.c
  KVM: arm64: Introduce a BSS section for use at Hyp
  KVM: arm64: Make kvm_call_hyp() a function call at Hyp
  KVM: arm64: Allow using kvm_nvhe_sym() in hyp code
  KVM: arm64: Introduce an early Hyp page allocator
  KVM: arm64: Stub CONFIG_DEBUG_LIST at Hyp
  KVM: arm64: Introduce a Hyp buddy page allocator
  KVM: arm64: Enable access to sanitized CPU features at EL2
  KVM: arm64: Factor out vector address calculation
  of/fdt: Introduce early_init_dt_add_memory_hyp()
  KVM: arm64: Prepare Hyp memory protection
  KVM: arm64: Elevate Hyp mappings creation at EL2
  KVM: arm64: Use kvm_arch for stage 2 pgtable
  KVM: arm64: Use kvm_arch in kvm_s2_mmu
  KVM: arm64: Set host stage 2 using kvm_nvhe_init_params
  KVM: arm64: Refactor kvm_arm_setup_stage2()
  KVM: arm64: Refactor __load_guest_stage2()
  KVM: arm64: Refactor __populate_fault_info()
  KVM: arm64: Make memcache anonymous in pgtable allocator
  KVM: arm64: Reserve memory for host stage 2
  KVM: arm64: Wrap the host with a stage 2

Will Deacon (3):
  arm64: lib: Annotate {clear,copy}_page() as position-independent
  KVM: arm64: Link position-independent string routines into .hyp.text
  arm64: kvm: Add standalone ticket spinlock implementation for use at hyp

 arch/arm64/include/asm/cpufeature.h           | 1 +
 arch/arm64/include/asm/hyp_image.h            | 7 +
 arch/arm64/include/asm/kvm_asm.h              | 7 +
 arch/arm64/include/asm/kvm_cpufeature.h       | 19 ++
 arch/arm64/include/asm/kvm_host.h             | 16 +-
 arch/arm64/include/asm/kvm_hyp.h              | 8 +
 arch/arm64/include/asm/kvm_mmu.h              | 69 +-
 arch/arm64/include/asm/kvm_pgtable.h          | 41 +++-
 arch/arm64/include/asm/sections.h             | 1 +
 arch/arm64/kernel/asm-offsets.c               | 3 +
 arch/arm64/kernel/cpufeature.c                | 12 +
 arch/arm64/kernel/image-vars.h                | 33 +++
 arch/arm64/kernel/vmlinux.lds.S               | 7 +
 arch/arm64/kvm/arm.c                          | 144 ++--
 arch/arm64/kvm/hyp/Makefile                   | 2 +-
 arch/arm64/kvm/hyp/include/hyp/switch.h       | 36 +--
 arch/arm64/kvm/hyp/include/nvhe/early_alloc.h | 14 ++
 arch/arm64/kvm/hyp/include/nvhe/gfp.h         | 32 +++
 arch/arm64/kvm/hyp/include/nvhe/mem_protect.h | 33 +++
 arch/arm64/kvm/hyp/include/nvhe/memory.h      | 55 +
 arch/arm64/kvm/hyp/include/nvhe/mm.h          | 107 +
 arch/arm64/kvm/hyp/include/nvhe/spinlock.h    | 92
 arch/arm64/kvm/hyp/nvhe/Makefile              | 9 +-
 arch/arm64/kvm/hyp/nvhe/cache.S               | 13 ++
 arch/arm64/kvm/hyp/nvhe/cpufeature.c          | 8 +
 arch/arm64/kvm/hyp/nvhe/early_alloc.c         | 60 +
 arch/arm64/kvm/hyp/nvhe/hyp-init.S            | 41
 arch/arm64/kvm/hyp/nvhe/hyp-main.c            | 48
 arch/arm64/kvm/hyp/nvhe/hyp.lds.S             | 1 +
 arch/arm64/kvm/hyp/nvhe/mem_protect.c         | 191
 arch/arm64/kvm/hyp/nvhe/mm.c                  | 174 ++
 arch/arm64/kvm/hyp/nvhe/page_alloc.c          | 185 +++
 arch/arm64/kvm/hyp/nvhe/psci-relay.c          | 4 +-
 arch/arm64/kvm/hyp/nvhe/setup.c               | 214 ++
 arch/arm64/kvm/hyp/nvhe/stub.c                | 22 ++
 arch/arm64/kvm/hyp/nvhe/switch.c              | 12 +-
 arch/arm64/kvm/hyp/nvhe/tlb.c
[RFC PATCH v2 19/26] KVM: arm64: Use kvm_arch in kvm_s2_mmu
In order to make use of the stage 2 pgtable code for the host stage 2,
change kvm_s2_mmu to use a kvm_arch pointer in lieu of the kvm pointer,
as the host will have the former but not the latter.

Signed-off-by: Quentin Perret
---
 arch/arm64/include/asm/kvm_host.h | 2 +-
 arch/arm64/include/asm/kvm_mmu.h  | 7 ++-
 arch/arm64/kvm/mmu.c              | 8
 3 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 9a2feb83eea0..9d59bebcc5ef 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -95,7 +95,7 @@ struct kvm_s2_mmu {
 	/* The last vcpu id that ran on each physical CPU */
 	int __percpu *last_vcpu_ran;
 
-	struct kvm *kvm;
+	struct kvm_arch *arch;
 };
 
 struct kvm_arch_memory_slot {
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 6c8466a042a9..662f0415344e 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -299,7 +299,7 @@ static __always_inline u64 kvm_get_vttbr(struct kvm_s2_mmu *mmu)
  */
 static __always_inline void __load_guest_stage2(struct kvm_s2_mmu *mmu)
 {
-	write_sysreg(kern_hyp_va(mmu->kvm)->arch.vtcr, vtcr_el2);
+	write_sysreg(kern_hyp_va(mmu->arch)->vtcr, vtcr_el2);
 	write_sysreg(kvm_get_vttbr(mmu), vttbr_el2);
 
 	/*
@@ -309,5 +309,10 @@ static __always_inline void __load_guest_stage2(struct kvm_s2_mmu *mmu)
 	 */
 	asm(ALTERNATIVE("nop", "isb", ARM64_WORKAROUND_SPECULATIVE_AT));
 }
+
+static inline struct kvm *kvm_s2_mmu_to_kvm(struct kvm_s2_mmu *mmu)
+{
+	return container_of(mmu->arch, struct kvm, arch);
+}
 #endif /* __ASSEMBLY__ */
 #endif /* __ARM64_KVM_MMU_H__ */
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 7e6263103943..6f9bf71722bd 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -169,7 +169,7 @@ static void *kvm_host_va(phys_addr_t phys)
 static void __unmap_stage2_range(struct kvm_s2_mmu *mmu, phys_addr_t start, u64 size,
 				 bool may_block)
 {
-	struct kvm *kvm = mmu->kvm;
+	struct kvm *kvm = kvm_s2_mmu_to_kvm(mmu);
 	phys_addr_t end = start + size;
 
 	assert_spin_locked(&kvm->mmu_lock);
@@ -474,7 +474,7 @@ int kvm_init_stage2_mmu(struct kvm *kvm, struct kvm_s2_mmu *mmu)
 	for_each_possible_cpu(cpu)
 		*per_cpu_ptr(mmu->last_vcpu_ran, cpu) = -1;
 
-	mmu->kvm = kvm;
+	mmu->arch = &kvm->arch;
 	mmu->pgt = pgt;
 	mmu->pgd_phys = __pa(pgt->pgd);
 	mmu->vmid.vmid_gen = 0;
@@ -556,7 +556,7 @@ void stage2_unmap_vm(struct kvm *kvm)
 
 void kvm_free_stage2_pgd(struct kvm_s2_mmu *mmu)
 {
-	struct kvm *kvm = mmu->kvm;
+	struct kvm *kvm = kvm_s2_mmu_to_kvm(mmu);
 	struct kvm_pgtable *pgt = NULL;
 
 	spin_lock(&kvm->mmu_lock);
@@ -625,7 +625,7 @@ int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa,
  */
 static void stage2_wp_range(struct kvm_s2_mmu *mmu, phys_addr_t addr, phys_addr_t end)
 {
-	struct kvm *kvm = mmu->kvm;
+	struct kvm *kvm = kvm_s2_mmu_to_kvm(mmu);
 	stage2_apply_range_resched(kvm, addr, end, kvm_pgtable_stage2_wrprotect);
 }
 
-- 
2.30.0.284.gd98b1dd5eaa7-goog
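Editor's illustration (not part of the patch): kvm_s2_mmu_to_kvm() relies
on container_of() to walk back from a pointer to an embedded member to
the structure that contains it. The userspace sketch below shows the
idea. The container_of() macro here is a simplified re-implementation
without the kernel's type checking, and struct arch_state, struct vm and
struct s2_mmu are hypothetical stand-ins for kvm_arch, kvm and
kvm_s2_mmu.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace container_of(): no type checking, unlike the kernel's. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for struct kvm_arch. */
struct arch_state {
	unsigned long vtcr;
};

/* Hypothetical stand-in for struct kvm: the arch state is embedded by value. */
struct vm {
	int id;
	struct arch_state arch;
};

/* Hypothetical stand-in for struct kvm_s2_mmu: it only points at the arch state. */
struct s2_mmu {
	struct arch_state *arch;
};

/* Recover the enclosing struct vm from the arch pointer held by the mmu. */
static struct vm *s2_mmu_to_vm(struct s2_mmu *mmu)
{
	return container_of(mmu->arch, struct vm, arch);
}
```

The conversion is only meaningful when the arch pointer really points
into an enclosing structure. As the commit message notes, the host has a
kvm_arch but no kvm, so a helper of this kind would presumably have to be
avoided on paths that handle the host stage 2.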
[PATCH] KVM: arm64: Compute TPIDR_EL2 ignoring MTE tag
KASAN in HW_TAGS mode will store MTE tags in the top byte of the
pointer. When computing the offset for TPIDR_EL2 we don't want anything
in the top byte, so remove the tag to ensure the computation is correct
no matter what the tag.

Fixes: 94ab5b61ee16 ("kasan, arm64: enable CONFIG_KASAN_HW_TAGS")
Signed-off-by: Steven Price
---
Without this fix I can't boot a config with KASAN_HW_TAGS and KVM on an
MTE enabled host. I'm unsure if this should really be in
this_cpu_ptr_nvhe_sym().

 arch/arm64/kvm/arm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 6e637d2b4cfb..3783082148bc 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1403,7 +1403,7 @@ static void cpu_init_hyp_mode(void)
 	 * kernel's mapping to the linear mapping, and store it in tpidr_el2
 	 * so that we can use adr_l to access per-cpu variables in EL2.
 	 */
-	params->tpidr_el2 = (unsigned long)this_cpu_ptr_nvhe_sym(__per_cpu_start) -
+	params->tpidr_el2 = (unsigned long)kasan_reset_tag(this_cpu_ptr_nvhe_sym(__per_cpu_start)) -
 			    (unsigned long)kvm_ksym_ref(CHOOSE_NVHE_SYM(__per_cpu_start));
 	params->mair_el2 = read_sysreg(mair_el1);
 
-- 
2.20.1
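Editor's illustration (not part of the patch): the failure mode described
above can be reproduced with plain 64-bit arithmetic. The sketch below
subtracts two addresses, one of which carries a tag in its top byte, and
shows that the difference is only meaningful once the tag has been
reset. set_tag(), reset_tag() and offset_between() are hypothetical
helpers; reset_tag() assumes that resetting means restoring the top byte
to 0xff, which is how an untagged kernel address is taken to look here.

```c
#include <assert.h>
#include <stdint.h>

#define TAG_SHIFT 56
#define TAG_MASK  (UINT64_C(0xff) << TAG_SHIFT)

/* Hypothetical helper: place an arbitrary tag in the top byte of addr. */
static uint64_t set_tag(uint64_t addr, uint8_t tag)
{
	return (addr & ~TAG_MASK) | ((uint64_t)tag << TAG_SHIFT);
}

/*
 * Hypothetical analogue of kasan_reset_tag(): restore the top byte to
 * 0xff (an assumption of this sketch, see the text above).
 */
static uint64_t reset_tag(uint64_t addr)
{
	return addr | TAG_MASK;
}

/* The kind of subtraction used to derive the per-CPU offset. */
static uint64_t offset_between(uint64_t a, uint64_t b)
{
	return a - b;
}
```

For example, with base 0xffff800010000000 and a per-CPU address 0x4000
above it, tagging the per-CPU address with 0xf3 makes the subtraction
yield 0xf400000000004000 instead of 0x4000; resetting the tag first
gives 0x4000 again.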
RE: [PATCH v11 12/13] vfio/pci: Register a DMA fault response region
Hi Eric,

> -----Original Message-----
> From: Eric Auger [mailto:eric.au...@redhat.com]
> Sent: 16 November 2020 11:00
> To: eric.auger@gmail.com; eric.au...@redhat.com;
> io...@lists.linux-foundation.org; linux-ker...@vger.kernel.org;
> k...@vger.kernel.org; kvmarm@lists.cs.columbia.edu; w...@kernel.org;
> j...@8bytes.org; m...@kernel.org; robin.mur...@arm.com;
> alex.william...@redhat.com
> Cc: jean-phili...@linaro.org; zhangfei@linaro.org;
> zhangfei@gmail.com; vivek.gau...@arm.com; Shameerali Kolothum
> Thodi ;
> jacob.jun@linux.intel.com; yi.l@intel.com; t...@semihalf.com;
> nicoleots...@gmail.com; yuzenghui
> Subject: [PATCH v11 12/13] vfio/pci: Register a DMA fault response region
>
> In preparation for vSVA, let's register a DMA fault response region,
> where the userspace will push the page responses and increment the
> head of the buffer. The kernel will pop those responses and inject them
> on iommu side.
>
> Signed-off-by: Eric Auger
> ---
>  drivers/vfio/pci/vfio_pci.c         | 114 +---
>  drivers/vfio/pci/vfio_pci_private.h | 5 ++
>  drivers/vfio/pci/vfio_pci_rdwr.c    | 39 ++
>  include/uapi/linux/vfio.h           | 32
>  4 files changed, 181 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
> index 65a83fd0e8c0..e9a904ce3f0d 100644
> --- a/drivers/vfio/pci/vfio_pci.c
> +++ b/drivers/vfio/pci/vfio_pci.c
> @@ -318,9 +318,20 @@ static void vfio_pci_dma_fault_release(struct
> vfio_pci_device *vdev,
>  	kfree(vdev->fault_pages);
>  }
>
> -static int vfio_pci_dma_fault_mmap(struct vfio_pci_device *vdev,
> -				   struct vfio_pci_region *region,
> -				   struct vm_area_struct *vma)
> +static void
> +vfio_pci_dma_fault_response_release(struct vfio_pci_device *vdev,
> +				    struct vfio_pci_region *region)
> +{
> +	if (vdev->dma_fault_response_wq)
> +		destroy_workqueue(vdev->dma_fault_response_wq);
> +	kfree(vdev->fault_response_pages);
> +	vdev->fault_response_pages = NULL;
> +}
> +
> +static int __vfio_pci_dma_fault_mmap(struct vfio_pci_device *vdev,
> +				     struct vfio_pci_region *region,
> +				     struct vm_area_struct *vma,
> +				     u8 *pages)
>  {
>  	u64 phys_len, req_len, pgoff, req_start;
>  	unsigned long long addr;
> @@ -333,14 +344,14 @@ static int vfio_pci_dma_fault_mmap(struct
> vfio_pci_device *vdev,
>  		((1U << (VFIO_PCI_OFFSET_SHIFT - PAGE_SHIFT)) - 1);
>  	req_start = pgoff << PAGE_SHIFT;
>
> -	/* only the second page of the producer fault region is mmappable */
> +	/* only the second page of the fault region is mmappable */
>  	if (req_start < PAGE_SIZE)
>  		return -EINVAL;
>
>  	if (req_start + req_len > phys_len)
>  		return -EINVAL;
>
> -	addr = virt_to_phys(vdev->fault_pages);
> +	addr = virt_to_phys(pages);
>  	vma->vm_private_data = vdev;
>  	vma->vm_pgoff = (addr >> PAGE_SHIFT) + pgoff;
>
> @@ -349,13 +360,29 @@ static int vfio_pci_dma_fault_mmap(struct
> vfio_pci_device *vdev,
>  	return ret;
>  }
>
> -static int vfio_pci_dma_fault_add_capability(struct vfio_pci_device *vdev,
> -					     struct vfio_pci_region *region,
> -					     struct vfio_info_cap *caps)
> +static int vfio_pci_dma_fault_mmap(struct vfio_pci_device *vdev,
> +				   struct vfio_pci_region *region,
> +				   struct vm_area_struct *vma)
> +{
> +	return __vfio_pci_dma_fault_mmap(vdev, region, vma,
> vdev->fault_pages);
> +}
> +
> +static int
> +vfio_pci_dma_fault_response_mmap(struct vfio_pci_device *vdev,
> +				 struct vfio_pci_region *region,
> +				 struct vm_area_struct *vma)
> +{
> +	return __vfio_pci_dma_fault_mmap(vdev, region, vma,
> vdev->fault_response_pages);
> +}
> +
> +static int __vfio_pci_dma_fault_add_capability(struct vfio_pci_device *vdev,
> +					       struct vfio_pci_region *region,
> +					       struct vfio_info_cap *caps,
> +					       u32 cap_id)
>  {
>  	struct vfio_region_info_cap_sparse_mmap *sparse = NULL;
>  	struct vfio_region_info_cap_fault cap = {
> -		.header.id = VFIO_REGION_INFO_CAP_DMA_FAULT,
> +		.header.id = cap_id,
>  		.header.version = 1,
>  		.version = 1,
>  	};
> @@ -383,6 +410,14 @@ static int vfio_pci_dma_fault_add_capability(struct
> vfio_pci_device *vdev,
>  	return ret;
>  }
>
> +static int vfio_pci_dma_fault_add_capability(struct vfio_pci_device *vdev,
> +					     struct vfio_pci_region *region,
> +
Re: [GIT PULL] KVM/arm64 fixes for 5.11, take #1
On 08/01/21 09:22, Marc Zyngier wrote:
>>>    git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git
>>> tags/kvmarm-fixes-5.11-1
>>
>> Looks like there are issues with the upstream changes brought in by
>> this pull request.  Unless my bisection is quick tomorrow it may not
>> make it into 5.11-rc3.  In any case, it's in my hands.
>
> I'm not sure what you mean by "upstream changes", as there are no
> additional changes on top of what is described in this pull request,
> which is directly based on the tag you pulled for the merge window.
>
> If there is an issue with any of these 18 patches themselves, please
> shout as soon as you can.

You're right, it's not related to this pull request but just to Linus's
tree.  It was too late yesterday, and now it's all set for sending it
out.

Paolo
Re: [GIT PULL] KVM/arm64 fixes for 5.11, take #1
Hi Paolo,

On 2021-01-07 23:09, Paolo Bonzini wrote:
> On 07/01/21 12:20, Marc Zyngier wrote:
>>    git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git
>> tags/kvmarm-fixes-5.11-1
>
> Looks like there are issues with the upstream changes brought in by
> this pull request.  Unless my bisection is quick tomorrow it may not
> make it into 5.11-rc3.  In any case, it's in my hands.

I'm not sure what you mean by "upstream changes", as there are no
additional changes on top of what is described in this pull request,
which is directly based on the tag you pulled for the merge window.

If there is an issue with any of these 18 patches themselves, please
shout as soon as you can.

Thanks,

        M.
-- 
Jazz is not dead. It just smells funny...