[PATCH] kselftest: kvm/arm64: Skip tests if we can't create a vgic-v3
The arch_timer and vgic_irq kselftests assume that they can create a
vgic-v3, using the library function vgic_v3_setup() which aborts with a
test failure if it is not possible to do so. Since vgic-v3 can only be
instantiated on systems where the host has GICv3 this leads to false
positives on older systems where that is not the case.

Fix this by changing vgic_v3_setup() to return an error if the vgic
can't be instantiated and have the callers skip if this happens. We
could also exit flagging a skip in vgic_v3_setup() but this would
prevent future test cases conditionally deciding which GIC to use or
generally doing more complex output.

Signed-off-by: Mark Brown
---
 tools/testing/selftests/kvm/aarch64/arch_timer.c | 7 ++++++-
 tools/testing/selftests/kvm/aarch64/vgic_irq.c   | 5 +++++
 tools/testing/selftests/kvm/lib/aarch64/vgic.c   | 4 +++-
 3 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/kvm/aarch64/arch_timer.c b/tools/testing/selftests/kvm/aarch64/arch_timer.c
index 9ad38bd360a4..791d38404652 100644
--- a/tools/testing/selftests/kvm/aarch64/arch_timer.c
+++ b/tools/testing/selftests/kvm/aarch64/arch_timer.c
@@ -366,6 +366,7 @@ static struct kvm_vm *test_vm_create(void)
 {
 	struct kvm_vm *vm;
 	unsigned int i;
+	int ret;
 	int nr_vcpus = test_args.nr_vcpus;
 
 	vm = vm_create_default_with_vcpus(nr_vcpus, 0, 0, guest_code, NULL);
@@ -382,7 +383,11 @@ static struct kvm_vm *test_vm_create(void)
 	ucall_init(vm, NULL);
 
 	test_init_timer_irq(vm);
-	vgic_v3_setup(vm, nr_vcpus, 64, GICD_BASE_GPA, GICR_BASE_GPA);
+	ret = vgic_v3_setup(vm, nr_vcpus, 64, GICD_BASE_GPA, GICR_BASE_GPA);
+	if (ret < 0) {
+		pr_info("Failed to create vgic-v3, skipping\n");
+		exit(KSFT_SKIP);
+	}
 
 	/* Make all the test's cmdline args visible to the guest */
 	sync_global_to_guest(vm, test_args);
diff --git a/tools/testing/selftests/kvm/aarch64/vgic_irq.c b/tools/testing/selftests/kvm/aarch64/vgic_irq.c
index e6c7d7f8fbd1..8c6b61b8e6aa 100644
--- a/tools/testing/selftests/kvm/aarch64/vgic_irq.c
+++ b/tools/testing/selftests/kvm/aarch64/vgic_irq.c
@@ -761,6 +761,11 @@ static void test_vgic(uint32_t nr_irqs, bool level_sensitive, bool eoi_split)
 
 	gic_fd = vgic_v3_setup(vm, 1, nr_irqs, GICD_BASE_GPA, GICR_BASE_GPA);
+	if (gic_fd < 0) {
+		pr_info("Failed to create vgic-v3, skipping\n");
+		exit(KSFT_SKIP);
+	}
+
 	vm_install_exception_handler(vm, VECTOR_IRQ_CURRENT,
 		guest_irq_handlers[args.eoi_split][args.level_sensitive]);
diff --git a/tools/testing/selftests/kvm/lib/aarch64/vgic.c b/tools/testing/selftests/kvm/lib/aarch64/vgic.c
index b3a0fca0d780..647c18733e1b 100644
--- a/tools/testing/selftests/kvm/lib/aarch64/vgic.c
+++ b/tools/testing/selftests/kvm/lib/aarch64/vgic.c
@@ -52,7 +52,9 @@ int vgic_v3_setup(struct kvm_vm *vm, unsigned int nr_vcpus, uint32_t nr_irqs,
 		    nr_vcpus, nr_vcpus_created);
 
 	/* Distributor setup */
-	gic_fd = kvm_create_device(vm, KVM_DEV_TYPE_ARM_VGIC_V3, false);
+	gic_fd = kvm_create_device(vm, KVM_DEV_TYPE_ARM_VGIC_V3, true);
+	if (gic_fd == -1)
+		return -1;
 
 	kvm_device_access(gic_fd, KVM_DEV_ARM_VGIC_GRP_NR_IRQS, 0, &nr_irqs, true);
-- 
2.30.2

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
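The patch's rationale — return an error from vgic_v3_setup() rather than exit, so a future caller could "conditionally decide which GIC to use" — can be sketched as below. This is a minimal, self-contained illustration, not selftest-library code: `vgic_v3_setup_stub()` and `choose_gic()` are invented names, and the stub simply simulates a host without GICv3.

```c
#include <assert.h>
#include <stdio.h>

#define KSFT_SKIP 4	/* kselftest "skip" exit code */

/* Stand-in for the library call: returns a GIC device fd on success,
 * a negative value when the vgic cannot be instantiated. Here it
 * simulates a host without GICv3. */
static int vgic_v3_setup_stub(int nr_vcpus, int nr_irqs)
{
	(void)nr_vcpus;
	(void)nr_irqs;
	return -1;
}

/* Because setup reports failure instead of aborting the test, the
 * caller stays in control: it can skip, or first try an alternative
 * (e.g. a hypothetical vgic-v2 fallback). */
static int choose_gic(int nr_vcpus, int nr_irqs, int *gic_fd)
{
	*gic_fd = vgic_v3_setup_stub(nr_vcpus, nr_irqs);
	if (*gic_fd >= 0)
		return 3;	/* using vgic-v3 */

	/* a v2 fallback attempt would go here; give up otherwise */
	fprintf(stderr, "Failed to create vgic-v3, skipping\n");
	return KSFT_SKIP;
}
```

Had vgic_v3_setup() called exit(KSFT_SKIP) itself, this layering would be impossible — which is exactly the argument made in the commit message.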
Re: [PATCH v4 07/21] KVM: arm64: Support SDEI_EVENT_UNREGISTER hypercall
Hi Gavin,

On 1/12/22 3:38 AM, Gavin Shan wrote:
> Hi Eric,
>
> On 11/10/21 1:05 AM, Eric Auger wrote:
>> On 8/15/21 2:13 AM, Gavin Shan wrote:
>>> This supports SDEI_EVENT_UNREGISTER hypercall. It's used by the
>>> guest to unregister SDEI event. The SDEI event won't be raised to
>>> the guest or specific vCPU after it's unregistered successfully.
>>> It's notable the SDEI event is disabled automatically on the guest
>>> or specific vCPU once it's unregistered successfully.
>>>
>>> Signed-off-by: Gavin Shan
>>> ---
>>>  arch/arm64/kvm/sdei.c | 61 +++++++++++++++++++++++++++++++++++++++
>>>  1 file changed, 61 insertions(+)
>>>
>>> diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
>>> index b4162efda470..a3ba69dc91cb 100644
>>> --- a/arch/arm64/kvm/sdei.c
>>> +++ b/arch/arm64/kvm/sdei.c
>>> @@ -308,6 +308,65 @@ static unsigned long kvm_sdei_hypercall_context(struct kvm_vcpu *vcpu)
>>>  	return ret;
>>>  }
>>> +
>>> +static unsigned long kvm_sdei_hypercall_unregister(struct kvm_vcpu *vcpu)
>>> +{
>>> +	struct kvm *kvm = vcpu->kvm;
>>> +	struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
>>> +	struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
>>> +	struct kvm_sdei_event *kse = NULL;
>>> +	struct kvm_sdei_kvm_event *kske = NULL;
>>> +	unsigned long event_num = smccc_get_arg1(vcpu);
>>> +	int index = 0;
>>> +	unsigned long ret = SDEI_SUCCESS;
>>> +
>>> +	/* Sanity check */
>>> +	if (!(ksdei && vsdei)) {
>>> +		ret = SDEI_NOT_SUPPORTED;
>>> +		goto out;
>>> +	}
>>> +
>>> +	if (!kvm_sdei_is_valid_event_num(event_num)) {
>>> +		ret = SDEI_INVALID_PARAMETERS;
>>> +		goto out;
>>> +	}
>>> +
>>> +	/* Check if the KVM event exists */
>>> +	spin_lock(&ksdei->lock);
>>> +	kske = kvm_sdei_find_kvm_event(kvm, event_num);
>>> +	if (!kske) {
>>> +		ret = SDEI_INVALID_PARAMETERS;
>>> +		goto unlock;
>>> +	}
>>> +
>>> +	/* Check if there is pending events */
>>> +	if (kske->state.refcount) {
>>> +		ret = SDEI_PENDING;
>> don't you want to record the fact the unregistration is outstanding to
>> perform subsequent actions? Otherwise nothing will happen when the
>> current executing handlers complete?
>
> It's not necessary. The guest should retry in this case.

I do not understand that from the spec: 6.7 Unregistering an event says
that with the PENDING status, the unregister request will be queued
until the event is completed using SDEI_EVENT_COMPLETE. Also there is a
state called "Handler-unregister-pending". But well, I would need to dig
further into the spec again :)

>
>>> +		goto unlock;
>>> +	}
>>> +
>>> +	/* Check if it has been registered */
>>> +	kse = kske->kse;
>>> +	index = (kse->state.type == SDEI_EVENT_TYPE_PRIVATE) ?
>>> +		vcpu->vcpu_idx : 0;
>> you could have an inline for the above as this is executed in many
>> functions. even including the code below.
>
> Ok, it's a good idea.
>
>>> +	if (!kvm_sdei_is_registered(kske, index)) {
>>> +		ret = SDEI_DENIED;
>>> +		goto unlock;
>>> +	}
>>> +
>>> +	/* The event is disabled when it's unregistered */
>>> +	kvm_sdei_clear_enabled(kske, index);
>>> +	kvm_sdei_clear_registered(kske, index);
>>> +	if (kvm_sdei_empty_registered(kske)) {
>> a refcount mechanism would be cleaner I think.
>
> A refcount isn't working well. We need a mapping here because the private
> SDEI event can be enabled/registered on multiple vCPUs. We need to know
> the exact vCPUs where the private SDEI event is enabled/registered.

I don't get why you can't increment/decrement the ref count each time
the event is registered/unregistered by a given vcpu to manage its life
cycle? It does not mean you don't need the bitmap to know the actual
mapping.

Thanks

Eric

>
>>> +		list_del(&kske->link);
>>> +		kfree(kske);
>>> +	}
>>> +
>>> +unlock:
>>> +	spin_unlock(&ksdei->lock);
>>> +out:
>>> +	return ret;
>>> +}
>>> +
>>>  int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
>>>  {
>>>  	u32 func = smccc_get_function(vcpu);
>>> @@ -333,6 +392,8 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
>>>  	case SDEI_1_0_FN_SDEI_EVENT_COMPLETE:
>>>  	case SDEI_1_0_FN_SDEI_EVENT_COMPLETE_AND_RESUME:
>>>  	case SDEI_1_0_FN_SDEI_EVENT_UNREGISTER:
>>> +		ret = kvm_sdei_hypercall_unregister(vcpu);
>>> +		break;
>>>  	case SDEI_1_0_FN_SDEI_EVENT_STATUS:
>>>  	case SDEI_1_0_FN_SDEI_EVENT_GET_INFO:
>>>  	case SDEI_1_0_FN_SDEI_EVENT_ROUTING_SET:
>>>
>
> Thanks,
> Gavin
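Eric's suggestion in the thread above — factor the repeated "(type == PRIVATE) ? vcpu->vcpu_idx : 0" computation into one inline — can be sketched as follows. The structures here are simplified stand-ins for the patch's kvm_vcpu and kvm_sdei_event state, not the real kernel types.

```c
#include <assert.h>

/* Simplified stand-ins for the SDEI structures used in the patch. */
enum { SDEI_EVENT_TYPE_SHARED = 0, SDEI_EVENT_TYPE_PRIVATE = 1 };

struct kvm_sdei_event_state { int type; };
struct kvm_vcpu_stub { int vcpu_idx; };

/* One inline for the index computation repeated across the REGISTER,
 * UNREGISTER, ENABLE and DISABLE handlers: private events track state
 * per vCPU, shared events use slot 0. */
static inline int kvm_sdei_vcpu_index(const struct kvm_vcpu_stub *vcpu,
				      const struct kvm_sdei_event_state *kse)
{
	return (kse->type == SDEI_EVENT_TYPE_PRIVATE) ? vcpu->vcpu_idx : 0;
}
```

Each handler would then call this helper instead of open-coding the ternary, which also makes the private/shared distinction greppable in one place.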
Re: [PATCH 4/4] KVM: arm64: Workaround Cortex-A510's single-step and PAC trap errata
On Tue, 25 Jan 2022 18:19:45 +, James Morse wrote: > > Hi Marc, > > On 25/01/2022 16:51, Marc Zyngier wrote: > > On Tue, 25 Jan 2022 15:38:03 +, > > James Morse wrote: > >> > >> Cortex-A510's erratum #2077057 causes SPSR_EL2 to be corrupted when > >> single-stepping authenticated ERET instructions. A single step is > >> expected, but a pointer authentication trap is taken instead. The > >> erratum causes SPSR_EL1 to be copied to SPSR_EL2, which could allow > >> EL1 to cause a return to EL2 with a guest controlled ELR_EL2. > >> > >> Because the conditions require an ERET into active-not-pending state, > >> this is only a problem for the EL2 when EL2 is stepping EL1. In this case > >> the previous SPSR_EL2 value is preserved in struct kvm_vcpu, and can be > >> restored. > > > Urgh. That's pretty nasty :-(. > > >> diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h > >> b/arch/arm64/kvm/hyp/include/hyp/switch.h > >> index 331dd10821df..93bf140cc972 100644 > >> --- a/arch/arm64/kvm/hyp/include/hyp/switch.h > >> +++ b/arch/arm64/kvm/hyp/include/hyp/switch.h > >> @@ -440,6 +442,22 @@ static inline bool fixup_guest_exit(struct kvm_vcpu > >> *vcpu, u64 *exit_code) > >>write_sysreg_el2(read_sysreg_el2(SYS_ELR) - 4, SYS_ELR); > >>} > >> > >> + /* > >> + * Check for the conditions of Cortex-A510's #2077057. When these occur > >> + * SPSR_EL2 can't be trusted, but isn't needed either as it is > >> + * unchanged from the value in vcpu_gp_regs(vcpu)->pstate. > >> + * Did we just take a PAC exception when a step exception was expected? > >> + */ > >> + if (IS_ENABLED(CONFIG_ARM64_ERRATUM_2077057) && > > > > nit: we can drop this IS_ENABLED()... > > Hmmm, I thought dead code elimination was a good thing! > Without the cpu_errata.c match, (which is also guarded by #ifdef), > the cap will never be true, even if an affected CPU showed up. This > way the compiler knows it can remove all this. I don't dispute that. 
However, experience shows that the more of these we sprinkle around, the quicker this code bitrots as we don't build for all the possible configurations. In general, we hardly have any dependency on configuration symbols, and rely on static keys got things not be terribly sucky. > > > >> + cpus_have_const_cap(ARM64_WORKAROUND_2077057) && > > > > and make this a final cap. Being a ARM64_CPUCAP_LOCAL_CPU_ERRATUM, we > > won't accept late CPUs on a system that wasn't affected until then. > > > >> + ARM_EXCEPTION_CODE(*exit_code) != ARM_EXCEPTION_IRQ && > >> + esr_ec == ESR_ELx_EC_PAC && > >> + vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP) { > >> + /* Active-not-pending? */ > >> + if (*vcpu_cpsr(vcpu) & DBG_SPSR_SS) > >> + write_sysreg_el2(*vcpu_cpsr(vcpu), SYS_SPSR); > > > > Err... Isn't this way too late? The function starts with: > > > > vcpu->arch.ctxt.regs.pstate = read_sysreg_el2(SYS_SPSR); > > > > which is just another way to write: > > > > *vcpu_cpsr(vcpu) = read_sysreg_el2(SYS_SPSR); > > > > By that time, the vcpu's PSTATE is terminally corrupted. > > Yes - bother. Staring at it didn't let me spot that! > I can hit the conditions to test this, but due to lack of > imagination the model doesn't corrupt the SPSR. Bad model. ;-) > > > > I think you need to hoist this workaround way up, before we call into > > early_exit_filter() as it will assume that the guest pstate is correct > > (this is used by both pKVM and the NV code). 
> > > > Something like this (untested): > > > > diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h > > b/arch/arm64/kvm/hyp/include/hyp/switch.h > > index 93bf140cc972..a8a1502db237 100644 > > --- a/arch/arm64/kvm/hyp/include/hyp/switch.h > > +++ b/arch/arm64/kvm/hyp/include/hyp/switch.h > > @@ -402,6 +402,26 @@ static inline bool kvm_hyp_handle_exit(struct kvm_vcpu > > *vcpu, u64 *exit_code) > > return false; > > } > > > > +static inline void synchronize_vcpu_pstate(struct kvm_vcpu *vcpu, > > + u64 *exit_code) > > +{ > > + /* > > +* Check for the conditions of Cortex-A510's #2077057. When these occur > > +* SPSR_EL2 can't be trusted, but isn't needed either as it is > > +* unchanged from the value in vcpu_gp_regs(vcpu)->pstate. > > +* Did we just take a PAC exception when a step exception (being in the > > +* Active-not-pending state) was expected? > > +*/ > > + if (cpus_have_final_cap(ARM64_WORKAROUND_2077057) && > > + ARM_EXCEPTION_CODE(*exit_code) != ARM_EXCEPTION_IRQ && > > > + kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_PAC && > > The vcpu's esr_el2 isn't yet set: > | ESR_ELx_EC(read_sysreg_el2(SYS_ESR)) == ESR_ELx_EC_PAC Ah, well spotted! > > (and I'll shuffle the order so this is last as its an extra sysreg read) > > > > + vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP && > > + *vcpu_cpsr(vcpu) & DBG_SPSR_SS) > > +
Re: [PATCH v4 06/21] KVM: arm64: Support SDEI_EVENT_CONTEXT hypercall
Hi Gavin, On 1/13/22 8:13 AM, Gavin Shan wrote: > Hi Shannon, > > On 1/13/22 3:02 PM, Gavin Shan wrote: >> On 1/11/22 5:43 PM, Shannon Zhao wrote: >>> On 2021/8/15 8:13, Gavin Shan wrote: +static unsigned long kvm_sdei_hypercall_context(struct kvm_vcpu *vcpu) +{ + struct kvm *kvm = vcpu->kvm; + struct kvm_sdei_kvm *ksdei = kvm->arch.sdei; + struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei; + struct kvm_sdei_vcpu_regs *regs; + unsigned long index = smccc_get_arg1(vcpu); + unsigned long ret = SDEI_SUCCESS; + + /* Sanity check */ + if (!(ksdei && vsdei)) { + ret = SDEI_NOT_SUPPORTED; + goto out; + } >>> Maybe we could move these common sanity check codes to >>> kvm_sdei_hypercall to save some lines. >>> >> >> Not all hypercalls need this check. For example, >> COMPLETE/COMPLETE_RESUME/CONTEXT don't >> have SDEI event number as the argument. If we really want move this >> check into function >> kvm_sdei_hypercall(), we would have code like below. Too much >> duplicated snippets will >> be seen. I don't think it's better than what we have if I fully >> understand your comments. >> > > oops... sorry. Please ignore my previous reply. I thought you talk about > the check on the SDEI event number wrongly. Yes, you're correct that the > check should be moved to kvm_sdei_hypercall(). even better than my previous proposal then Eric > > Thanks, > Gavin > > ___ > kvmarm mailing list > kvmarm@lists.cs.columbia.edu > https://lists.cs.columbia.edu/mailman/listinfo/kvmarm ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [PATCH v4 06/21] KVM: arm64: Support SDEI_EVENT_CONTEXT hypercall
Hi Gavin, On 1/13/22 8:02 AM, Gavin Shan wrote: > Hi Shannon, > > On 1/11/22 5:43 PM, Shannon Zhao wrote: >> On 2021/8/15 8:13, Gavin Shan wrote: >>> +static unsigned long kvm_sdei_hypercall_context(struct kvm_vcpu *vcpu) >>> +{ >>> + struct kvm *kvm = vcpu->kvm; >>> + struct kvm_sdei_kvm *ksdei = kvm->arch.sdei; >>> + struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei; >>> + struct kvm_sdei_vcpu_regs *regs; >>> + unsigned long index = smccc_get_arg1(vcpu); >>> + unsigned long ret = SDEI_SUCCESS; >>> + >>> + /* Sanity check */ >>> + if (!(ksdei && vsdei)) { >>> + ret = SDEI_NOT_SUPPORTED; >>> + goto out; >>> + } >> Maybe we could move these common sanity check codes to >> kvm_sdei_hypercall to save some lines. >> > > Not all hypercalls need this check. For example, > COMPLETE/COMPLETE_RESUME/CONTEXT don't > have SDEI event number as the argument. If we really want move this > check into function > kvm_sdei_hypercall(), we would have code like below. Too much duplicated > snippets will > be seen. I don't think it's better than what we have if I fully > understand your comments. > > switch (...) { > case REGISTER: > if (!(ksdei && vsdei)) { > ret = SDEI_NOT_SUPPORTED; > break; > } at least you can use an inline function taking the vcpu as param? Thanks Eric > > ret = kvm_sdei_hypercall_register(vcpu); > break; > case UNREGISTER: > if (!(ksdei && vsdei)) { > ret = SDEI_NOT_SUPPORTED; > break; > } > > ret = kvm_sdei_hypercall_unregister(vcpu); > break; > case CONTEXT: > ret = kvm_sdei_hypercall_context(vcpu); > break; > : > } > > Thanks, > Gavin > > ___ > kvmarm mailing list > kvmarm@lists.cs.columbia.edu > https://lists.cs.columbia.edu/mailman/listinfo/kvmarm ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
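Eric's closing remark — "at least you can use an inline function taking the vcpu as param" — points at a helper like the one below for the `!(ksdei && vsdei)` sanity check duplicated in every handler. The struct here is a simplified stand-in (the real check reads kvm->arch.sdei and vcpu->arch.sdei), and the helper name is illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-ins for the per-VM and per-vCPU SDEI state. */
struct kvm_sdei_kvm { int dummy; };
struct kvm_sdei_vcpu { int dummy; };

struct vcpu_stub {
	struct kvm_sdei_kvm *ksdei;	/* kvm->arch.sdei in the patch */
	struct kvm_sdei_vcpu *vsdei;	/* vcpu->arch.sdei in the patch */
};

/* One inline replacing the "!(ksdei && vsdei)" check open-coded in
 * each hypercall handler: SDEI is usable only when both the per-VM
 * and the per-vCPU state have been allocated. */
static inline bool kvm_sdei_available(const struct vcpu_stub *vcpu)
{
	return vcpu->ksdei != NULL && vcpu->vsdei != NULL;
}
```

A handler would then start with `if (!kvm_sdei_available(vcpu)) return SDEI_NOT_SUPPORTED;` rather than repeating the compound condition.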
Re: [PATCH v4 06/21] KVM: arm64: Support SDEI_EVENT_CONTEXT hypercall
Hi Gavin, On 1/12/22 3:33 AM, Gavin Shan wrote: > Hi Eric, > > On 11/10/21 7:16 PM, Eric Auger wrote: >> On 8/15/21 2:13 AM, Gavin Shan wrote: >>> This supports SDEI_EVENT_CONTEXT hypercall. It's used by the guest >>> to retrieved the original registers (R0 - R17) in its SDEI event >>> handler. Those registers can be corrupted during the SDEI event >>> delivery. >>> >>> Signed-off-by: Gavin Shan >>> --- >>> arch/arm64/kvm/sdei.c | 40 >>> 1 file changed, 40 insertions(+) >>> >>> diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c >>> index b022ce0a202b..b4162efda470 100644 >>> --- a/arch/arm64/kvm/sdei.c >>> +++ b/arch/arm64/kvm/sdei.c >>> @@ -270,6 +270,44 @@ static unsigned long >>> kvm_sdei_hypercall_enable(struct kvm_vcpu *vcpu, >>> return ret; >>> } >>> +static unsigned long kvm_sdei_hypercall_context(struct kvm_vcpu >>> *vcpu) >>> +{ >>> + struct kvm *kvm = vcpu->kvm; >>> + struct kvm_sdei_kvm *ksdei = kvm->arch.sdei; >>> + struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei; >>> + struct kvm_sdei_vcpu_regs *regs; >>> + unsigned long index = smccc_get_arg1(vcpu); >> s/index/param_id to match the spec? > > Sure, but "reg_id" seems better here. As the parameter indicates the GPR > index > to be fetched on request of the guest kernel. fine with me. > >>> + unsigned long ret = SDEI_SUCCESS; >>> + >>> + /* Sanity check */ >>> + if (!(ksdei && vsdei)) { >>> + ret = SDEI_NOT_SUPPORTED; >>> + goto out; >>> + } >>> + >>> + if (index > ARRAY_SIZE(vsdei->state.critical_regs.regs)) { >>> + ret = SDEI_INVALID_PARAMETERS; >>> + goto out; >>> + } >> I would move the above after regs = and use regs there (although the >> regs ARRAY_SIZE of both is identifical) > > Ok. > >>> + >>> + /* Check if the pending event exists */ >>> + spin_lock(>lock); >>> + if (!(vsdei->critical_event || vsdei->normal_event)) { >>> + ret = SDEI_DENIED; >>> + goto unlock; >>> + } >>> + >>> + /* Fetch the requested register */ >>> + regs = vsdei->critical_event ? 
>state.critical_regs : >>> + >state.normal_regs; >>> + ret = regs->regs[index]; >>> + >>> +unlock: >>> + spin_unlock(>lock); >>> +out: >>> + return ret; >>> +} >>> + >>> int kvm_sdei_hypercall(struct kvm_vcpu *vcpu) >>> { >>> u32 func = smccc_get_function(vcpu); >>> @@ -290,6 +328,8 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu) >>> ret = kvm_sdei_hypercall_enable(vcpu, false); >>> break; >>> case SDEI_1_0_FN_SDEI_EVENT_CONTEXT: >>> + ret = kvm_sdei_hypercall_context(vcpu); >>> + break; >>> case SDEI_1_0_FN_SDEI_EVENT_COMPLETE: >>> case SDEI_1_0_FN_SDEI_EVENT_COMPLETE_AND_RESUME: >>> case SDEI_1_0_FN_SDEI_EVENT_UNREGISTER: >>> > > Thanks, > Gavin > Eric ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
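The EVENT_CONTEXT fetch reviewed above boils down to a bounds check plus an array read. A minimal, self-contained sketch is below; the function name is invented, the array size follows SDEI's saved X0–X17 context (18 registers), and the comparison deliberately uses `>=` so that an index equal to the array size is rejected too (the quoted patch checks `index > ARRAY_SIZE(...)`, which would let that boundary value through).

```c
#include <assert.h>
#include <stddef.h>

#define SDEI_SUCCESS		0
#define SDEI_INVALID_PARAMETERS	(-2)

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* The 18 GPRs (X0-X17) saved at SDEI event delivery. */
struct sdei_regs { unsigned long regs[18]; };

/* Bounds-checked register fetch for an EVENT_CONTEXT-style call:
 * valid indices are 0 .. ARRAY_SIZE-1, hence the >= comparison. */
static long sdei_context_fetch(const struct sdei_regs *regs,
			       unsigned long index, unsigned long *value)
{
	if (index >= ARRAY_SIZE(regs->regs))
		return SDEI_INVALID_PARAMETERS;
	*value = regs->regs[index];
	return SDEI_SUCCESS;
}
```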
Re: [PATCH v4 05/21] KVM: arm64: Support SDEI_EVENT_{ENABLE, DISABLE} hypercall
Hi Gavin, On 1/12/22 3:29 AM, Gavin Shan wrote: > Hi Eric, > > On 11/10/21 12:02 AM, Eric Auger wrote: >> On 8/15/21 2:13 AM, Gavin Shan wrote: >>> This supports SDEI_EVENT_{ENABLE, DISABLE} hypercall. After SDEI >>> event is registered by guest, it won't be delivered to the guest >>> until it's enabled. On the other hand, the SDEI event won't be >>> raised to the guest or specific vCPU if it's has been disabled >>> on the guest or specific vCPU. >>> >>> Signed-off-by: Gavin Shan >>> --- >>> arch/arm64/kvm/sdei.c | 68 +++ >>> 1 file changed, 68 insertions(+) >>> >>> diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c >>> index d3ea3eee154b..b022ce0a202b 100644 >>> --- a/arch/arm64/kvm/sdei.c >>> +++ b/arch/arm64/kvm/sdei.c >>> @@ -206,6 +206,70 @@ static unsigned long >>> kvm_sdei_hypercall_register(struct kvm_vcpu *vcpu) >>> return ret; >>> } >>> +static unsigned long kvm_sdei_hypercall_enable(struct kvm_vcpu *vcpu, >>> + bool enable) >>> +{ >>> + struct kvm *kvm = vcpu->kvm; >>> + struct kvm_sdei_kvm *ksdei = kvm->arch.sdei; >>> + struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei; >>> + struct kvm_sdei_event *kse = NULL; >>> + struct kvm_sdei_kvm_event *kske = NULL; >>> + unsigned long event_num = smccc_get_arg1(vcpu); >>> + int index = 0; >>> + unsigned long ret = SDEI_SUCCESS; >>> + >>> + /* Sanity check */ >>> + if (!(ksdei && vsdei)) { >>> + ret = SDEI_NOT_SUPPORTED; >>> + goto out; >>> + } >>> + >>> + if (!kvm_sdei_is_valid_event_num(event_num)) { >> I would rename into is_exposed_event_num() > > kvm_sdei_is_virtual() has been recommended by you when you reviewed the > following > patch. 
I think kvm_sdei_is_virtual() is good enough :) argh, is_virtual() then :) Eric > > [PATCH v4 02/21] KVM: arm64: Add SDEI virtualization infrastructure > >>> + ret = SDEI_INVALID_PARAMETERS; >>> + goto out; >>> + } >>> + >>> + /* Check if the KVM event exists */ >>> + spin_lock(>lock); >>> + kske = kvm_sdei_find_kvm_event(kvm, event_num); >>> + if (!kske) { >>> + ret = SDEI_INVALID_PARAMETERS; >> should be DENIED according to the spec, ie. nobody registered that event? > > Ok. > >>> + goto unlock; >>> + } >>> + >>> + /* Check if there is pending events */ >> does that match the "handler-unregister-pending state" case mentionned >> in the spec? >>> + if (kske->state.refcount) { >>> + ret = SDEI_PENDING; >> ? not documented in my A spec? DENIED? > > Yep, It should be DENIED. > >>> + goto unlock; >>> + } >>> + >>> + /* Check if it has been registered */ >> isn't duplicate of /* Check if the KVM event exists */ ? > > It's not duplicate check, but the comment here seems misleading. I will > correct this to: > > /* Check if it has been defined or exposed */ > >>> + kse = kske->kse; >>> + index = (kse->state.type == SDEI_EVENT_TYPE_PRIVATE) ? >>> + vcpu->vcpu_idx : 0; >>> + if (!kvm_sdei_is_registered(kske, index)) { >>> + ret = SDEI_DENIED; >>> + goto unlock; >>> + } >>> + >>> + /* Verify its enablement state */ >>> + if (enable == kvm_sdei_is_enabled(kske, index)) { >> spec says: >> Enabling/disabled an event, which is already enabled/disabled, is >> permitted and has no effect. I guess ret should be OK. > > yep, it should be ok. 
> >>> + ret = SDEI_DENIED; >>> + goto unlock; >>> + } >>> + >>> + /* Update enablement state */ >>> + if (enable) >>> + kvm_sdei_set_enabled(kske, index); >>> + else >>> + kvm_sdei_clear_enabled(kske, index); >>> + >>> +unlock: >>> + spin_unlock(>lock); >>> +out: >>> + return ret; >>> +} >>> + >>> int kvm_sdei_hypercall(struct kvm_vcpu *vcpu) >>> { >>> u32 func = smccc_get_function(vcpu); >>> @@ -220,7 +284,11 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu) >>> ret = kvm_sdei_hypercall_register(vcpu); >>> break; >>> case SDEI_1_0_FN_SDEI_EVENT_ENABLE: >>> + ret = kvm_sdei_hypercall_enable(vcpu, true); >>> + break; >>> case SDEI_1_0_FN_SDEI_EVENT_DISABLE: >>> + ret = kvm_sdei_hypercall_enable(vcpu, false); >>> + break; >>> case SDEI_1_0_FN_SDEI_EVENT_CONTEXT: >>> case SDEI_1_0_FN_SDEI_EVENT_COMPLETE: >>> case SDEI_1_0_FN_SDEI_EVENT_COMPLETE_AND_RESUME: >>> > > Thanks, > Gavin > ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
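The spec point Eric raises above — enabling an already-enabled event (or disabling an already-disabled one) is permitted and has no effect — reduces the enablement update to the idempotent shape sketched below. This is a toy model over a single bool, not the patch's per-event bitmap code.

```c
#include <assert.h>
#include <stdbool.h>

#define SDEI_SUCCESS 0

/* Idempotent enable/disable: a request matching the current state is
 * a successful no-op, per the SDEI spec, rather than SDEI_DENIED as
 * in the reviewed patch. */
static long sdei_set_enabled(bool *enabled, bool enable)
{
	if (*enabled == enable)
		return SDEI_SUCCESS;	/* no-op, explicitly not an error */
	*enabled = enable;
	return SDEI_SUCCESS;
}
```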
Re: [PATCH 4/4] KVM: arm64: Workaround Cortex-A510's single-step and PAC trap errata
Hi Marc, On 25/01/2022 16:51, Marc Zyngier wrote: > On Tue, 25 Jan 2022 15:38:03 +, > James Morse wrote: >> >> Cortex-A510's erratum #2077057 causes SPSR_EL2 to be corrupted when >> single-stepping authenticated ERET instructions. A single step is >> expected, but a pointer authentication trap is taken instead. The >> erratum causes SPSR_EL1 to be copied to SPSR_EL2, which could allow >> EL1 to cause a return to EL2 with a guest controlled ELR_EL2. >> >> Because the conditions require an ERET into active-not-pending state, >> this is only a problem for the EL2 when EL2 is stepping EL1. In this case >> the previous SPSR_EL2 value is preserved in struct kvm_vcpu, and can be >> restored. > Urgh. That's pretty nasty :-(. >> diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h >> b/arch/arm64/kvm/hyp/include/hyp/switch.h >> index 331dd10821df..93bf140cc972 100644 >> --- a/arch/arm64/kvm/hyp/include/hyp/switch.h >> +++ b/arch/arm64/kvm/hyp/include/hyp/switch.h >> @@ -440,6 +442,22 @@ static inline bool fixup_guest_exit(struct kvm_vcpu >> *vcpu, u64 *exit_code) >> write_sysreg_el2(read_sysreg_el2(SYS_ELR) - 4, SYS_ELR); >> } >> >> +/* >> + * Check for the conditions of Cortex-A510's #2077057. When these occur >> + * SPSR_EL2 can't be trusted, but isn't needed either as it is >> + * unchanged from the value in vcpu_gp_regs(vcpu)->pstate. >> + * Did we just take a PAC exception when a step exception was expected? >> + */ >> +if (IS_ENABLED(CONFIG_ARM64_ERRATUM_2077057) && > > nit: we can drop this IS_ENABLED()... Hmmm, I thought dead code elimination was a good thing! Without the cpu_errata.c match, (which is also guarded by #ifdef), the cap will never be true, even if an affected CPU showed up. This way the compiler knows it can remove all this. >> +cpus_have_const_cap(ARM64_WORKAROUND_2077057) && > > and make this a final cap. Being a ARM64_CPUCAP_LOCAL_CPU_ERRATUM, we > won't accept late CPUs on a system that wasn't affected until then. 
> >> +ARM_EXCEPTION_CODE(*exit_code) != ARM_EXCEPTION_IRQ && >> +esr_ec == ESR_ELx_EC_PAC && >> +vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP) { >> +/* Active-not-pending? */ >> +if (*vcpu_cpsr(vcpu) & DBG_SPSR_SS) >> +write_sysreg_el2(*vcpu_cpsr(vcpu), SYS_SPSR); > > Err... Isn't this way too late? The function starts with: > > vcpu->arch.ctxt.regs.pstate = read_sysreg_el2(SYS_SPSR); > > which is just another way to write: > > *vcpu_cpsr(vcpu) = read_sysreg_el2(SYS_SPSR); > > By that time, the vcpu's PSTATE is terminally corrupted. Yes - bother. Staring at it didn't let me spot that! I can hit the conditions to test this, but due to lack of imagination the model doesn't corrupt the SPSR. > I think you need to hoist this workaround way up, before we call into > early_exit_filter() as it will assume that the guest pstate is correct > (this is used by both pKVM and the NV code). > > Something like this (untested): > > diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h > b/arch/arm64/kvm/hyp/include/hyp/switch.h > index 93bf140cc972..a8a1502db237 100644 > --- a/arch/arm64/kvm/hyp/include/hyp/switch.h > +++ b/arch/arm64/kvm/hyp/include/hyp/switch.h > @@ -402,6 +402,26 @@ static inline bool kvm_hyp_handle_exit(struct kvm_vcpu > *vcpu, u64 *exit_code) > return false; > } > > +static inline void synchronize_vcpu_pstate(struct kvm_vcpu *vcpu, > +u64 *exit_code) > +{ > + /* > + * Check for the conditions of Cortex-A510's #2077057. When these occur > + * SPSR_EL2 can't be trusted, but isn't needed either as it is > + * unchanged from the value in vcpu_gp_regs(vcpu)->pstate. > + * Did we just take a PAC exception when a step exception (being in the > + * Active-not-pending state) was expected? 
> + */ > + if (cpus_have_final_cap(ARM64_WORKAROUND_2077057) && > + ARM_EXCEPTION_CODE(*exit_code) != ARM_EXCEPTION_IRQ && > + kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_PAC && The vcpu's esr_el2 isn't yet set: | ESR_ELx_EC(read_sysreg_el2(SYS_ESR)) == ESR_ELx_EC_PAC (and I'll shuffle the order so this is last as its an extra sysreg read) > + vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP && > + *vcpu_cpsr(vcpu) & DBG_SPSR_SS) > + write_sysreg_el2(*vcpu_cpsr(vcpu), SYS_SPSR); > + > + *vcpu_cpsr(vcpu) = read_sysreg_el2(SYS_SPSR); > +} > + > /* > * Return true when we were able to fixup the guest exit and should return to > * the guest, false when we should restore the host state and return to the > @@ -415,7 +435,7 @@ static inline bool fixup_guest_exit(struct kvm_vcpu > *vcpu, u64 *exit_code) >* Save PSTATE early so that we can evaluate the vcpu mode >* early on. >*/ > - vcpu->arch.ctxt.regs.pstate = read_sysreg_el2(SYS_SPSR); > +
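The ordering being debated above — restore the hardware SPSR_EL2 from the trusted in-memory copy *before* snapshotting SPSR_EL2 into the vcpu's pstate — can be modelled with two plain variables. This is only a toy illustration of the data flow in Marc's (untested) synchronize_vcpu_pstate() suggestion; a global stands in for the system register and no real sysreg access is involved.

```c
#include <assert.h>
#include <stdbool.h>

/* Stands in for the SPSR_EL2 system register. */
static unsigned long spsr_el2;

/* When the A510 erratum conditions hit, SPSR_EL2 is corrupted but the
 * saved vcpu pstate from the previous guest entry is still correct, so
 * write that back first; only then re-read the register as usual. */
static void synchronize_vcpu_pstate_toy(unsigned long *vcpu_pstate,
					bool erratum_hit)
{
	if (erratum_hit)
		spsr_el2 = *vcpu_pstate;	/* restore the trusted copy */
	*vcpu_pstate = spsr_el2;		/* normal exit-time snapshot */
}
```

Doing the snapshot first (as the original fixup_guest_exit() hunk did) would overwrite the only good copy with the corrupted register value, which is Marc's "terminally corrupted" observation.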
Re: [PATCH 4/4] KVM: arm64: Workaround Cortex-A510's single-step and PAC trap errata
Hi James, Thanks for this. I guess. On Tue, 25 Jan 2022 15:38:03 +, James Morse wrote: > > Cortex-A510's erratum #2077057 causes SPSR_EL2 to be corrupted when > single-stepping authenticated ERET instructions. A single step is > expected, but a pointer authentication trap is taken instead. The > erratum causes SPSR_EL1 to be copied to SPSR_EL2, which could allow > EL1 to cause a return to EL2 with a guest controlled ELR_EL2. > > Because the conditions require an ERET into active-not-pending state, > this is only a problem for the EL2 when EL2 is stepping EL1. In this case > the previous SPSR_EL2 value is preserved in struct kvm_vcpu, and can be > restored. Urgh. That's pretty nasty :-(. > > Cc: sta...@vger.kernel.org # ${GITHASHHERE}: arm64: Add Cortex-A510 CPU part > definition > Cc: sta...@vger.kernel.org > Signed-off-by: James Morse > --- > Documentation/arm64/silicon-errata.rst | 2 ++ > arch/arm64/Kconfig | 16 > arch/arm64/kernel/cpu_errata.c | 8 > arch/arm64/kvm/hyp/include/hyp/switch.h | 24 +--- > arch/arm64/tools/cpucaps| 1 + > 5 files changed, 48 insertions(+), 3 deletions(-) > > diff --git a/Documentation/arm64/silicon-errata.rst > b/Documentation/arm64/silicon-errata.rst > index 5342e895fb60..ac1ae34564c9 100644 > --- a/Documentation/arm64/silicon-errata.rst > +++ b/Documentation/arm64/silicon-errata.rst > @@ -92,6 +92,8 @@ stable kernels. 
> > ++-+-+-+ > | ARM| Cortex-A77 | #1508412| ARM64_ERRATUM_1508412 > | > > ++-+-+-+ > +| ARM| Cortex-A510 | #2077057| ARM64_ERRATUM_2077057 > | > +++-+-+-+ > | ARM| Cortex-A710 | #2119858| ARM64_ERRATUM_2119858 > | > > ++-+-+-+ > | ARM| Cortex-A710 | #2054223| ARM64_ERRATUM_2054223 > | > diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig > index 6978140edfa4..02b542ec18c8 100644 > --- a/arch/arm64/Kconfig > +++ b/arch/arm64/Kconfig > @@ -670,6 +670,22 @@ config ARM64_ERRATUM_1508412 > config ARM64_WORKAROUND_TRBE_OVERWRITE_FILL_MODE > bool > > +config ARM64_ERRATUM_2077057 > + bool "Cortex-A510: 2077057: workaround software-step corrupting > SPSR_EL2" > + help > + This option adds the workaround for ARM Cortex-A510 erratum 2077057. > + Affected Cortex-A510 may corrupt SPSR_EL2 when the a step exception is > + expected, but a Pointer Authentication trap is taken instead. The > + erratum causes SPSR_EL1 to be copied to SPSR_EL2, which could allow > + EL1 to cause a return to EL2 with a guest controlled ELR_EL2. > + > + This can only happen when EL2 is stepping EL1. > + > + When these conditions occur, the SPSR_EL2 value is unchanged from the > + previous guest entry, and can be restored from the in-memory copy. > + > + If unsure, say Y. 
> + > config ARM64_ERRATUM_2119858 > bool "Cortex-A710: 2119858: workaround TRBE overwriting trace data in > FILL mode" > default y > diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c > index 9e1c1aef9ebd..04a014c63251 100644 > --- a/arch/arm64/kernel/cpu_errata.c > +++ b/arch/arm64/kernel/cpu_errata.c > @@ -597,6 +597,14 @@ const struct arm64_cpu_capabilities arm64_errata[] = { > .type = ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE, > CAP_MIDR_RANGE_LIST(trbe_write_out_of_range_cpus), > }, > +#endif > +#ifdef CONFIG_ARM64_ERRATUM_2077057 > + { > + .desc = "ARM erratum 2077057", > + .capability = ARM64_WORKAROUND_2077057, > + .type = ARM64_CPUCAP_LOCAL_CPU_ERRATUM, > + ERRATA_MIDR_REV_RANGE(MIDR_CORTEX_A510, 0, 0, 2), > + }, > #endif > { > } > diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h > b/arch/arm64/kvm/hyp/include/hyp/switch.h > index 331dd10821df..93bf140cc972 100644 > --- a/arch/arm64/kvm/hyp/include/hyp/switch.h > +++ b/arch/arm64/kvm/hyp/include/hyp/switch.h > @@ -409,6 +409,8 @@ static inline bool kvm_hyp_handle_exit(struct kvm_vcpu > *vcpu, u64 *exit_code) > */ > static inline bool fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code) > { > + u8 esr_ec; > + > /* >* Save PSTATE early so that we can evaluate the vcpu mode >* early on. > @@ -421,13 +423,13 @@ static inline bool fixup_guest_exit(struct kvm_vcpu > *vcpu, u64 *exit_code) >*/ > early_exit_filter(vcpu, exit_code); > > - if (ARM_EXCEPTION_CODE(*exit_code) != ARM_EXCEPTION_IRQ) > + if (ARM_EXCEPTION_CODE(*exit_code) != ARM_EXCEPTION_IRQ) { >
[PATCH 4/4] KVM: arm64: Workaround Cortex-A510's single-step and PAC trap errata
Cortex-A510's erratum #2077057 causes SPSR_EL2 to be corrupted when
single-stepping authenticated ERET instructions. A single step is
expected, but a pointer authentication trap is taken instead. The
erratum causes SPSR_EL1 to be copied to SPSR_EL2, which could allow
EL1 to cause a return to EL2 with a guest controlled ELR_EL2.

Because the conditions require an ERET into active-not-pending state,
this is only a problem for EL2 when EL2 is stepping EL1. In this case
the previous SPSR_EL2 value is preserved in struct kvm_vcpu, and can
be restored.

Cc: sta...@vger.kernel.org # ${GITHASHHERE}: arm64: Add Cortex-A510 CPU part definition
Cc: sta...@vger.kernel.org
Signed-off-by: James Morse
---
 Documentation/arm64/silicon-errata.rst  |  2 ++
 arch/arm64/Kconfig                      | 16
 arch/arm64/kernel/cpu_errata.c          |  8
 arch/arm64/kvm/hyp/include/hyp/switch.h | 24 +---
 arch/arm64/tools/cpucaps                |  1 +
 5 files changed, 48 insertions(+), 3 deletions(-)

diff --git a/Documentation/arm64/silicon-errata.rst b/Documentation/arm64/silicon-errata.rst
index 5342e895fb60..ac1ae34564c9 100644
--- a/Documentation/arm64/silicon-errata.rst
+++ b/Documentation/arm64/silicon-errata.rst
@@ -92,6 +92,8 @@ stable kernels.
 ++-+-+-+
 | ARM| Cortex-A77 | #1508412| ARM64_ERRATUM_1508412 |
 ++-+-+-+
+| ARM| Cortex-A510 | #2077057| ARM64_ERRATUM_2077057 |
+++-+-+-+
 | ARM| Cortex-A710 | #2119858| ARM64_ERRATUM_2119858 |
 ++-+-+-+
 | ARM| Cortex-A710 | #2054223| ARM64_ERRATUM_2054223 |
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 6978140edfa4..02b542ec18c8 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -670,6 +670,22 @@ config ARM64_ERRATUM_1508412
 config ARM64_WORKAROUND_TRBE_OVERWRITE_FILL_MODE
	bool

+config ARM64_ERRATUM_2077057
+	bool "Cortex-A510: 2077057: workaround software-step corrupting SPSR_EL2"
+	help
+	  This option adds the workaround for ARM Cortex-A510 erratum 2077057.
+	  Affected Cortex-A510 may corrupt SPSR_EL2 when a step exception is
+	  expected, but a Pointer Authentication trap is taken instead. The
+	  erratum causes SPSR_EL1 to be copied to SPSR_EL2, which could allow
+	  EL1 to cause a return to EL2 with a guest controlled ELR_EL2.
+
+	  This can only happen when EL2 is stepping EL1.
+
+	  When these conditions occur, the SPSR_EL2 value is unchanged from the
+	  previous guest entry, and can be restored from the in-memory copy.
+
+	  If unsure, say Y.
+
 config ARM64_ERRATUM_2119858
	bool "Cortex-A710: 2119858: workaround TRBE overwriting trace data in FILL mode"
	default y
diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
index 9e1c1aef9ebd..04a014c63251 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -597,6 +597,14 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
		.type = ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE,
		CAP_MIDR_RANGE_LIST(trbe_write_out_of_range_cpus),
	},
+#endif
+#ifdef CONFIG_ARM64_ERRATUM_2077057
+	{
+		.desc = "ARM erratum 2077057",
+		.capability = ARM64_WORKAROUND_2077057,
+		.type = ARM64_CPUCAP_LOCAL_CPU_ERRATUM,
+		ERRATA_MIDR_REV_RANGE(MIDR_CORTEX_A510, 0, 0, 2),
+	},
 #endif
	{
	}
diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index 331dd10821df..93bf140cc972 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -409,6 +409,8 @@ static inline bool kvm_hyp_handle_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
  */
 static inline bool fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
 {
+	u8 esr_ec;
+
	/*
	 * Save PSTATE early so that we can evaluate the vcpu mode
	 * early on.
@@ -421,13 +423,13 @@ static inline bool fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
	 */
	early_exit_filter(vcpu, exit_code);

-	if (ARM_EXCEPTION_CODE(*exit_code) != ARM_EXCEPTION_IRQ)
+	if (ARM_EXCEPTION_CODE(*exit_code) != ARM_EXCEPTION_IRQ) {
		vcpu->arch.fault.esr_el2 = read_sysreg_el2(SYS_ESR);
+		esr_ec = kvm_vcpu_trap_get_class(vcpu);
+	}

	if (ARM_SERROR_PENDING(*exit_code) &&
	    ARM_EXCEPTION_CODE(*exit_code) != ARM_EXCEPTION_IRQ) {
-		u8 esr_ec =
[PATCH 1/4] arm64: Add Cortex-A510 CPU part definition
From: Anshuman Khandual

Add the CPU Partnumbers for the new Arm designs.

Cc: Catalin Marinas
Cc: Will Deacon
Cc: Suzuki Poulose
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-ker...@vger.kernel.org
Reviewed-by: Suzuki K Poulose
Acked-by: Catalin Marinas
Signed-off-by: Anshuman Khandual
Signed-off-by: James Morse
---
 arch/arm64/include/asm/cputype.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm64/include/asm/cputype.h b/arch/arm64/include/asm/cputype.h
index 19b8441aa8f2..e8fdc10395b6 100644
--- a/arch/arm64/include/asm/cputype.h
+++ b/arch/arm64/include/asm/cputype.h
@@ -73,6 +73,7 @@
 #define ARM_CPU_PART_CORTEX_A76		0xD0B
 #define ARM_CPU_PART_NEOVERSE_N1	0xD0C
 #define ARM_CPU_PART_CORTEX_A77		0xD0D
+#define ARM_CPU_PART_CORTEX_A510	0xD46
 #define ARM_CPU_PART_CORTEX_A710	0xD47
 #define ARM_CPU_PART_NEOVERSE_N2	0xD49
@@ -115,6 +116,7 @@
 #define MIDR_CORTEX_A76		MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A76)
 #define MIDR_NEOVERSE_N1	MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_N1)
 #define MIDR_CORTEX_A77		MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A77)
+#define MIDR_CORTEX_A510	MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A510)
 #define MIDR_CORTEX_A710	MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A710)
 #define MIDR_NEOVERSE_N2	MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_N2)
 #define MIDR_THUNDERX		MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX)
--
2.30.2
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
[PATCH 2/4] KVM: arm64: Avoid consuming a stale esr value when SError occur
When any exception other than an IRQ occurs, the CPU updates the ESR_EL2
register with the exception syndrome. An SError may also become pending,
and will be synchronised by KVM. KVM notes the exception type, and
whether an SError was synchronised, in exit_code.

When an exception other than an IRQ occurs, fixup_guest_exit() updates
vcpu->arch.fault.esr_el2 from the hardware register. When an SError was
synchronised, the vcpu esr value is used to determine if the exception
was due to an HVC. If so, ELR_EL2 is moved back one instruction. This is
so that KVM can process the SError first, and re-execute the HVC if the
guest survives the SError.

But if an IRQ synchronises an SError, the vcpu's esr value is stale. If
the previous non-IRQ exception was an HVC, KVM will corrupt ELR_EL2,
causing an unrelated guest instruction to be executed twice.

Check ARM_EXCEPTION_CODE() before messing with ELR_EL2; IRQs don't
update ESR_EL2, so they don't need the check.

Fixes: defe21f49bc9 ("KVM: arm64: Move PC rollback on SError to HYP")
Cc: sta...@vger.kernel.org
Reported-by: Steven Price
Signed-off-by: James Morse
---
 arch/arm64/kvm/hyp/include/hyp/switch.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index 58e14f8ead23..331dd10821df 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -424,7 +424,8 @@ static inline bool fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
	if (ARM_EXCEPTION_CODE(*exit_code) != ARM_EXCEPTION_IRQ)
		vcpu->arch.fault.esr_el2 = read_sysreg_el2(SYS_ESR);

-	if (ARM_SERROR_PENDING(*exit_code)) {
+	if (ARM_SERROR_PENDING(*exit_code) &&
+	    ARM_EXCEPTION_CODE(*exit_code) != ARM_EXCEPTION_IRQ) {
		u8 esr_ec = kvm_vcpu_trap_get_class(vcpu);

		/*
--
2.30.2
[PATCH 3/4] KVM: arm64: Stop handle_exit() from handling HVC twice when an SError occurs
Prior to commit defe21f49bc9 ("KVM: arm64: Move PC rollback on SError to
HYP"), when an SError is synchronised due to another exception, KVM
handles the SError first. If the guest survives, the instruction that
triggered the original exception is re-executed to handle the first
exception. HVC is treated as a special case as the instruction wouldn't
normally be re-executed, as it's not a trap.

Commit defe21f49bc9 didn't preserve the behaviour of the 'return 1' that
skips the rest of handle_exit().

Since commit defe21f49bc9, KVM will try to handle the SError and the
original exception at the same time. When the exception was an HVC,
fixup_guest_exit() has already rolled back ELR_EL2, meaning if the guest
has virtual SError masked, it will execute and handle the HVC twice.

Restore the original behaviour.

Fixes: defe21f49bc9 ("KVM: arm64: Move PC rollback on SError to HYP")
Cc: sta...@vger.kernel.org
Signed-off-by: James Morse
---
It may be possible to remove both this patch, and the HVC handling code
in fixup_guest_exit(). This means KVM would always handle the exception
and the SError. This may result in unnecessary work if the guest takes
the virtual SError when it is next restarted, but should be harmless if
SError are always re-injected using HCR_EL2.VSE.
---
 arch/arm64/kvm/handle_exit.c | 8
 1 file changed, 8 insertions(+)

diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
index fd2dd26caf91..e3140abd2e2e 100644
--- a/arch/arm64/kvm/handle_exit.c
+++ b/arch/arm64/kvm/handle_exit.c
@@ -228,6 +228,14 @@ int handle_exit(struct kvm_vcpu *vcpu, int exception_index)
 {
	struct kvm_run *run = vcpu->run;

+	if (ARM_SERROR_PENDING(exception_index)) {
+		/*
+		 * The SError is handled by handle_exit_early(). If the guest
+		 * survives it will re-execute the original instruction.
+		 */
+		return 1;
+	}
+
	exception_index = ARM_EXCEPTION_CODE(exception_index);

	switch (exception_index) {
--
2.30.2
[PATCH 0/4] KVM: arm64: A510 errata workaround and fixes for fixup_guest_exit()
Hello!

Early Cortex-A510 parts have a nasty erratum where two ERETs, pointer-auth
and software step conspire to corrupt SPSR_EL2. A guest can only trigger
this when it is being stepped by EL2, which gives EL2 the opportunity to
work around the erratum. Patch 4 does this; the SDEN is available from:
https://developer.arm.com/documentation/SDEN2397239/900

Patches 2 and 3 fix two issues with the adjacent code where a stale esr
value could be used to alter the ELR_EL2 when an IRQ synchronises an
SError, and when an HVC synchronises an SError, the HVC may be handled
twice (not just execute twice).

There are three series that would add the Cortex-A510 part macros. I've
picked Anshuman's patch that does this, on the assumption that makes
someone's life easier. I haven't spotted that patch on the
arm64/for-next/fixes branch, so I've not included the hash in the
prerequisite field of the CC-stable. Let me know if you want this
reposted once that value is known.

This series is based on v5.17-rc1 and can be retrieved from:
https://git.gitlab.arm.com/linux-arm/linux-jm.git a510_errata/kvm_bits/v1

Thanks,

James

Anshuman Khandual (1):
  arm64: Add Cortex-A510 CPU part definition

James Morse (3):
  KVM: arm64: Avoid consuming a stale esr value when SError occur
  KVM: arm64: Stop handle_exit() from handling HVC twice when an SError
    occurs
  KVM: arm64: Workaround Cortex-A510's single-step and PAC trap errata

 Documentation/arm64/silicon-errata.rst  |  2 ++
 arch/arm64/Kconfig                      | 16 +++
 arch/arm64/include/asm/cputype.h        |  2 ++
 arch/arm64/kernel/cpu_errata.c          |  8
 arch/arm64/kvm/handle_exit.c            |  8
 arch/arm64/kvm/hyp/include/hyp/switch.h | 27 +
 arch/arm64/tools/cpucaps                |  1 +
 7 files changed, 60 insertions(+), 4 deletions(-)

--
2.30.2
Re: [RFC PATCH v3 01/11] KVM: Capture VM start
On Wed, 19 Jan 2022 00:07:44 +, Sean Christopherson wrote:
>
> On Fri, Jan 14, 2022, Reiji Watanabe wrote:
> > The restriction, with which KVM doesn't need to worry about the changes
> > in the registers after KVM_RUN, could potentially protect or be useful
> > to protect KVM and simplify future changes/maintenance of the KVM codes
> > that consume the values.
>
> That sort of protection is definitely welcome; the previously mentioned
> CPUID mess on x86 would have benefited greatly from KVM being restrictive
> in the past. That said, hooking KVM_RUN is likely the wrong way to go
> about implementing any restrictions. Running a vCPU is where much of the
> vCPU's state is explicitly consumed, but it's all too easy for KVM to
> implicitly/indirectly consume state via a different ioctl(), e.g. if
> there are side effects that are visible in other registers, then an
> update can also be visible to userspace via KVM_{G,S}ET_{S,}REGS, at
> which point disallowing modifying state after KVM_RUN but not after
> reading/writing regs is arbitrary and inconsistent.
>
> If possible, preventing modification if kvm->created_vcpus > 0 is
> ideal as it's a relatively common pattern in KVM, and provides a
> clear boundary to userspace regarding what is/isn't allowed.

No, that's way too late. The configuration is in general per-CPU, and I
really don't want to expand the surface of the userspace API to allow
all sorts of magic tricks depending on the nature of what you
save/restore.

The "first run" crap is already there. We have it on a per-CPU basis,
and we need it at the VM level for other reasons (see the recent
discussion about PMU filtering vs binding to a specific PMU
implementation).

M.

--
Without deviation from the norm, progress is not possible.
Re: [RFC PATCH v3 01/11] KVM: Capture VM start
It's probably time I jump on this thread.

On Thu, 13 Jan 2022 17:21:19 +, Sean Christopherson wrote:
>
> On Wed, Jan 12, 2022, Raghavendra Rao Ananta wrote:
> > On Tue, Jan 11, 2022 at 11:16 AM Jim Mattson wrote:
> > > Perhaps it would help if you explained *why* you are doing this. It
> > > sounds like you are either trying to protect against a malicious
> > > userspace, or you are trying to keep userspace from doing something
> > > stupid. In general, kvm only enforces constraints that are necessary
> > > to protect the host. If that's what you're doing, I don't understand
> > > why live migration doesn't provide an end-run around your protections.
> > It's mainly to safeguard the guests. With respect to migration, KVM
> > and the userspace are collectively playing a role here. It's up to the
> > userspace to ensure that the registers are configured the same across
> > migrations and KVM ensures that the userspace doesn't modify the
> > registers after KVM_RUN so that they don't see features turned OFF/ON
> > during execution. I'm not sure if it falls into the definition of
> > protecting the host. Do you see a value in adding this extra
> > protection from KVM?
>
> Short answer: probably not?

Well, I beg to differ.

> There is precedent for disallowing userspace from doing stupid
> things, but that's either for KVM's protection (as Jim pointed out),
> or because KVM can't honor the change, e.g. x86 is currently in the
> process of disallowing most CPUID changes after KVM_RUN because KVM
> itself consumes the CPUID information and KVM doesn't support
> updating some of its own internal state (because removing features
> like GB hugepage support is nonsensical and would require a large
> pile of complicated, messy code).

We provide quite a lot of CPU emulation based on the feature registers
exposed to the guest. Allowing userspace to change this stuff once the
guest is running is a massive attack vector.
> Restricting CPUID changes does offer some "protection" to the guest, but
> that's not the goal. E.g. KVM won't detect CPUID misconfiguration in the
> migration case, and trying to do so is a fool's errand.
>
> If restricting updates in the arm64 case is necessary to ensure KVM
> provides sane behavior, then it could be justified. But if it's purely
> a sanity check on behalf of the guest, then it's not justified.

No. This is about preventing userspace from tripping the kernel into
doing things it really shouldn't by flipping bits of configuration that
should be set in stone.

M.

--
Without deviation from the norm, progress is not possible.
Re: [PATCH v8 25/38] KVM: arm64: Trap SME usage in guest
On Tue, Jan 25, 2022 at 01:21:47PM +, Marc Zyngier wrote:
> Mark Brown wrote:
> > OK, I'll remove the defensiveness here. FGT is mandatory from v8.6 and
> > SME is a v9 feature so people shouldn't build a SME implementation that
> > lacks FGT.

> Can you then please make it that SME doesn't get enabled at all if FGT
> isn't present? It would also be good to have a clarification in the
> architecture that it isn't allowed to build SME without FGT (specially
> given that v9.0 is congruent to v8.5, and thus doesn't have FGT).

Right, this should be handled by the time the full spec is published -
it's an issue people are aware of and it's not something that should
ever get built. It would be good to explicitly handle the dependency in
the cpufeature stuff, we'll have other issues like this, but I'd like to
handle that separately since at first look doing it properly is a bit of
surgery on cpufeature and the series is already rather large.
Re: [PATCH v8 26/38] KVM: arm64: Handle SME host state when running guests
On Tue, Jan 25, 2022 at 01:22:51PM +, Marc Zyngier wrote:
> Mark Brown wrote:
> > > I find the use of CPACR_EL1_SMEN in some cases and its individual bits
> > > in some others pretty confusing. I understand that you have modelled
> > > it after the SVE code, but maybe this is a mistake we don't need to
> > > repeat. I'd be in favour of directly exposing the individual bits in
> > > all cases.

> > OK, it is just the KVM code that uses the plain ZEN. I'll add a cleanup
> > patch for that at the start of the series for ZEN I guess otherwise it
> > looks worse, though that will inflate the size of the series a bit.

> I'm happy to merge such a patch early if that helps.

That'd be good. There's also similar stuff going on with FPEN BTW (which
is I imagine where the SVE stuff came from).
Re: [PATCH v8 26/38] KVM: arm64: Handle SME host state when running guests
On Tue, 25 Jan 2022 12:52:18 +, Mark Brown wrote:
>
> On Tue, Jan 25, 2022 at 11:59:02AM +, Marc Zyngier wrote:
> > Mark Brown wrote:
>
> > > +	if (has_vhe()) {
> > > +		if (system_supports_sme()) {
>
> > nit:	if (has_vhe() && system_supports_sme()) {
> > saves you one level of indentation.
>
> Yes, for now. IIRC there was some other stuff there when I had some of
> the code for doing the register switching properly.
>
> > > +			/* Also restore EL0 state seen on entry */
> > > +			if (vcpu->arch.flags & KVM_ARM64_HOST_SME_ENABLED)
> > > +				sysreg_clear_set(CPACR_EL1, 0,
> > > +						 CPACR_EL1_SMEN_EL0EN |
> > > +						 CPACR_EL1_SMEN_EL1EN);
> > > +			else
> > > +				sysreg_clear_set(CPACR_EL1,
> > > +						 CPACR_EL1_SMEN_EL0EN,
> > > +						 CPACR_EL1_SMEN_EL1EN);
>
> > I find the use of CPACR_EL1_SMEN in some cases and its individual bits
> > in some others pretty confusing. I understand that you have modelled
> > it after the SVE code, but maybe this is a mistake we don't need to
> > repeat. I'd be in favour of directly exposing the individual bits in
> > all cases.
>
> OK, it is just the KVM code that uses the plain ZEN. I'll add a cleanup
> patch for that at the start of the series for ZEN I guess otherwise it
> looks worse, though that will inflate the size of the series a bit.

I'm happy to merge such a patch early if that helps.

M.

--
Without deviation from the norm, progress is not possible.
Re: [PATCH v8 25/38] KVM: arm64: Trap SME usage in guest
On Tue, 25 Jan 2022 12:25:47 +, Mark Brown wrote:
>
> On Tue, Jan 25, 2022 at 11:27:55AM +, Marc Zyngier wrote:
> > Mark Brown wrote:
>
> > > +	if (IS_ENABLED(CONFIG_ARM64_SME) && cpus_have_final_cap(ARM64_SME))
>
> > Please drop the IS_ENABLED(). We purposely avoid conditional
> > compilation in KVM in order to avoid bitrot, and the amount of code
> > you save isn't significant. Having a static key is more than enough to
> > avoid runtime costs.
>
> Sure, I wanted to be extra careful here as this is all in hot paths and
> going to get moved elsewhere when we have real guest support.
>
> > > +	if (IS_ENABLED(CONFIG_ARM64_SME) && cpus_have_final_cap(ARM64_SME) &&
> > > +	    cpus_have_final_cap(ARM64_HAS_FGT)) {
> > > +		val = read_sysreg_s(SYS_HFGRTR_EL2);
> > > +		val &= ~(HFGxTR_EL2_nTPIDR_EL0_MASK |
> > > +			 HFGxTR_EL2_nSMPRI_EL1_MASK);
> > > +		write_sysreg_s(val, SYS_HFGRTR_EL2);
> > > +
> > > +		val = read_sysreg_s(SYS_HFGWTR_EL2);
> > > +		val &= ~(HFGxTR_EL2_nTPIDR_EL0_MASK |
> > > +			 HFGxTR_EL2_nSMPRI_EL1_MASK);
> > > +		write_sysreg_s(val, SYS_HFGWTR_EL2);
> > > +	}
>
> > If the CPUs do not have FGT, what provides the equivalent trapping?
>
> Nothing for nVHE mode.

That's what I feared.

> > If FGT is mandatory when SME exists, then you should simplify the
> > condition.
>
> OK, I'll remove the defensiveness here. FGT is mandatory from v8.6 and
> SME is a v9 feature so people shouldn't build a SME implementation that
> lacks FGT.

Can you then please make it that SME doesn't get enabled at all if FGT
isn't present? It would also be good to have a clarification in the
architecture that it isn't allowed to build SME without FGT (specially
given that v9.0 is congruent to v8.5, and thus doesn't have FGT).

Thanks,

M.

--
Without deviation from the norm, progress is not possible.
Re: [PATCH v8 26/38] KVM: arm64: Handle SME host state when running guests
On Tue, Jan 25, 2022 at 11:59:02AM +, Marc Zyngier wrote:
> Mark Brown wrote:

> > +	if (has_vhe()) {
> > +		if (system_supports_sme()) {

> nit:	if (has_vhe() && system_supports_sme()) {
> saves you one level of indentation.

Yes, for now. IIRC there was some other stuff there when I had some of
the code for doing the register switching properly.

> > +			/* Also restore EL0 state seen on entry */
> > +			if (vcpu->arch.flags & KVM_ARM64_HOST_SME_ENABLED)
> > +				sysreg_clear_set(CPACR_EL1, 0,
> > +						 CPACR_EL1_SMEN_EL0EN |
> > +						 CPACR_EL1_SMEN_EL1EN);
> > +			else
> > +				sysreg_clear_set(CPACR_EL1,
> > +						 CPACR_EL1_SMEN_EL0EN,
> > +						 CPACR_EL1_SMEN_EL1EN);

> I find the use of CPACR_EL1_SMEN in some cases and its individual bits
> in some others pretty confusing. I understand that you have modelled
> it after the SVE code, but maybe this is a mistake we don't need to
> repeat. I'd be in favour of directly exposing the individual bits in
> all cases.

OK, it is just the KVM code that uses the plain ZEN. I'll add a cleanup
patch for that at the start of the series for ZEN I guess otherwise it
looks worse, though that will inflate the size of the series a bit.
Re: [PATCH v8 25/38] KVM: arm64: Trap SME usage in guest
On Tue, Jan 25, 2022 at 11:27:55AM +, Marc Zyngier wrote:
> Mark Brown wrote:

> > +	if (IS_ENABLED(CONFIG_ARM64_SME) && cpus_have_final_cap(ARM64_SME))

> Please drop the IS_ENABLED(). We purposely avoid conditional
> compilation in KVM in order to avoid bitrot, and the amount of code
> you save isn't significant. Having a static key is more than enough to
> avoid runtime costs.

Sure, I wanted to be extra careful here as this is all in hot paths and
going to get moved elsewhere when we have real guest support.

> > +	if (IS_ENABLED(CONFIG_ARM64_SME) && cpus_have_final_cap(ARM64_SME) &&
> > +	    cpus_have_final_cap(ARM64_HAS_FGT)) {
> > +		val = read_sysreg_s(SYS_HFGRTR_EL2);
> > +		val &= ~(HFGxTR_EL2_nTPIDR_EL0_MASK |
> > +			 HFGxTR_EL2_nSMPRI_EL1_MASK);
> > +		write_sysreg_s(val, SYS_HFGRTR_EL2);
> > +
> > +		val = read_sysreg_s(SYS_HFGWTR_EL2);
> > +		val &= ~(HFGxTR_EL2_nTPIDR_EL0_MASK |
> > +			 HFGxTR_EL2_nSMPRI_EL1_MASK);
> > +		write_sysreg_s(val, SYS_HFGWTR_EL2);
> > +	}

> If the CPUs do not have FGT, what provides the equivalent trapping?

Nothing for nVHE mode.

> If FGT is mandatory when SME exists, then you should simplify the
> condition.

OK, I'll remove the defensiveness here. FGT is mandatory from v8.6 and
SME is a v9 feature so people shouldn't build a SME implementation that
lacks FGT.
Re: [PATCH v8 05/38] arm64/sme: System register and exception syndrome definitions
On Tue, Jan 25, 2022 at 11:25:31AM +, Marc Zyngier wrote:
> On Tue, 25 Jan 2022 00:10:41 +,
> Mark Brown wrote:

> > +/* HFG[WR]TR_EL2 bit definitions */
> > +#define HFGxTR_EL2_nTPIDR_EL0_SHIFT	55
> > +#define HFGxTR_EL2_nTPIDR_EL0_MASK	(1 << HFGxTR_EL2_nTPIDR_EL0_SHIFT)

> This annoyingly clashes with bit 35 of the same registers, which maps
> to TPIDR_EL0. I have the feeling that this really should be TPIDR2_EL0.

Yes, it should be.
Re: [PATCH v8 01/38] arm64: cpufeature: Always specify and use a field width for capabilities
On Tue, Jan 25, 2022 at 10:57:47AM +, Suzuki K Poulose wrote:
> On 25/01/2022 00:10, Mark Brown wrote:

> > +	int val = cpuid_feature_extract_field_width(reg, entry->field_pos,
> > +						    entry->field_width,
> > +						    entry->sign);

> Could we do something like :

> +	int val = cpuid_feature_extract_field_width(reg,
> 						    entry->field_pos,
> 						    entry->field_width ? : 4,
> 						    ..
> 						    );

> and leave the existing structures as they are ?

Obviously we *could* (ideally without the ternary operator) but having
the implicit default like that makes me nervous that it's too easy to
just forget to fill it in and get the wrong default.

> And that could avoid these changes too. We could add :
> #define HWCAP_CPUID_MATCH_WIDTH(...)
> when/if we need one, which sets the width.

I'd originally had a completely separate set of definitions for single
bit fields but Catalin wanted to get everything in one. I'm not sure I
see a huge advantage in sharing the match function and not also sharing
the macro TBH.
Re: [PATCH v8 26/38] KVM: arm64: Handle SME host state when running guests
On Tue, 25 Jan 2022 00:11:02 +, Mark Brown wrote:
>
> While we don't currently support SME in guests we do currently support it
> for the host system so we need to take care of SME's impact, including
> the floating point register state, when running guests. Similarly to SVE
> we need to manage the traps in CPACR_EL1; what is new is the handling of
> streaming mode and ZA.
>
> Normally we defer any handling of the floating point register state until
> the guest first uses it however if the system is in streaming mode FPSIMD
> and SVE operations may generate SME traps which we would need to
> distinguish from actual attempts by the guest to use SME. Rather than do
> this for the time being if we are in streaming mode when entering the
> guest we force the floating point state to be saved immediately and exit
> streaming mode, meaning that the guest won't generate SME traps for
> supported operations.
>
> We could handle ZA in the access trap similarly to the FPSIMD/SVE state
> without the disruption caused by streaming mode but for simplicity
> handle it the same way as streaming mode for now.
>
> This will be revisited when we support SME for guests (hopefully before
> SME hardware becomes available), for now it will only incur additional
> cost on systems with SME and even there only if streaming mode or ZA are
> enabled.
> > Signed-off-by: Mark Brown > --- > arch/arm64/include/asm/kvm_host.h | 1 + > arch/arm64/kvm/fpsimd.c | 38 +++ > 2 files changed, 39 insertions(+) > > diff --git a/arch/arm64/include/asm/kvm_host.h > b/arch/arm64/include/asm/kvm_host.h > index 7dc85d5a6552..404b7358ba96 100644 > --- a/arch/arm64/include/asm/kvm_host.h > +++ b/arch/arm64/include/asm/kvm_host.h > @@ -438,6 +438,7 @@ struct kvm_vcpu_arch { > #define KVM_ARM64_DEBUG_STATE_SAVE_SPE (1 << 12) /* Save SPE context > if active */ > #define KVM_ARM64_DEBUG_STATE_SAVE_TRBE (1 << 13) /* Save TRBE context > if active */ > #define KVM_ARM64_FP_FOREIGN_FPSTATE (1 << 14) > +#define KVM_ARM64_HOST_SME_ENABLED (1 << 15) /* SME enabled for EL0 */ > > #define KVM_GUESTDBG_VALID_MASK (KVM_GUESTDBG_ENABLE | \ >KVM_GUESTDBG_USE_SW_BP | \ > diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c > index 338733ac63f8..cecaddb644ce 100644 > --- a/arch/arm64/kvm/fpsimd.c > +++ b/arch/arm64/kvm/fpsimd.c > @@ -82,6 +82,26 @@ void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu) > > if (read_sysreg(cpacr_el1) & CPACR_EL1_ZEN_EL0EN) > vcpu->arch.flags |= KVM_ARM64_HOST_SVE_ENABLED; > + > + /* > + * We don't currently support SME guests but if we leave > + * things in streaming mode then when the guest starts running > + * FPSIMD or SVE code it may generate SME traps so as a > + * special case if we are in streaming mode we force the host > + * state to be saved now and exit streaming mode so that we > + * don't have to handle any SME traps for valid guest > + * operations. Do this for ZA as well for now for simplicity. 
> + */ > + if (system_supports_sme()) { > + if (read_sysreg(cpacr_el1) & CPACR_EL1_SMEN_EL0EN) > + vcpu->arch.flags |= KVM_ARM64_HOST_SME_ENABLED; > + > + if (read_sysreg_s(SYS_SVCR_EL0) & > + (SYS_SVCR_EL0_SM_MASK | SYS_SVCR_EL0_ZA_MASK)) { > + vcpu->arch.flags &= ~KVM_ARM64_FP_HOST; > + fpsimd_save_and_flush_cpu_state(); > + } > + } > } > > void kvm_arch_vcpu_ctxflush_fp(struct kvm_vcpu *vcpu) > @@ -129,6 +149,24 @@ void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu) > > local_irq_save(flags); > > + /* > + * If we have VHE then the Hyp code will reset CPACR_EL1 to > + * CPACR_EL1_DEFAULT and we need to reenable SME. > + */ > + if (has_vhe()) { > + if (system_supports_sme()) { nit:if (has_vhe() && system_supports_sme()) { saves you one level of indentation. > + /* Also restore EL0 state seen on entry */ > + if (vcpu->arch.flags & KVM_ARM64_HOST_SME_ENABLED) > + sysreg_clear_set(CPACR_EL1, 0, > + CPACR_EL1_SMEN_EL0EN | > + CPACR_EL1_SMEN_EL1EN); > + else > + sysreg_clear_set(CPACR_EL1, > + CPACR_EL1_SMEN_EL0EN, > + CPACR_EL1_SMEN_EL1EN); I find the use of CPACR_EL1_SMEN in some cases and its individual bits in some others pretty confusing. I understand that you have modelled it after the SVE code, but maybe this is a mistake we don't need to repeat. I'd be in favour of directly exposing the individual bits in all cases. Thanks, M. -- Without deviation from the norm, progress is not
Re: [PATCH v8 25/38] KVM: arm64: Trap SME usage in guest
On Tue, 25 Jan 2022 00:11:01 +, Mark Brown wrote: > > SME defines two new traps which need to be enabled for guests to ensure > that they can't use SME, one for the main SME operations which mirrors the > traps for SVE and another for access to TPIDR2 in SCTLR_EL2. > > For VHE manage SMEN along with ZEN in activate_traps() and the FP state > management callbacks. > > For nVHE the value to be used for CPTR_EL2 in the guest is stored in > vcpu->arch.cptr_el2, set TSM there during initialisation. It will be > cleared in __deactivate_traps_common() by virtue of not being set in > CPTR_EL2_DEFAULT. > > For both VHE and nVHE cases handle SCTLR_EL2.EnTPIDR2 in the shared > __active_traps_common() and __deactivate_traps_common(), there is no > existing dynamic management of SCTLR_EL2. > > Signed-off-by: Mark Brown > --- > arch/arm64/kvm/hyp/nvhe/switch.c | 30 ++ > arch/arm64/kvm/hyp/vhe/switch.c | 10 +- > 2 files changed, 39 insertions(+), 1 deletion(-) > > diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c > b/arch/arm64/kvm/hyp/nvhe/switch.c > index 6410d21d8695..184bf6bd79b9 100644 > --- a/arch/arm64/kvm/hyp/nvhe/switch.c > +++ b/arch/arm64/kvm/hyp/nvhe/switch.c > @@ -47,10 +47,25 @@ static void __activate_traps(struct kvm_vcpu *vcpu) > val |= CPTR_EL2_TFP | CPTR_EL2_TZ; > __activate_traps_fpsimd32(vcpu); > } > + if (IS_ENABLED(CONFIG_ARM64_SME) && cpus_have_final_cap(ARM64_SME)) Please drop the IS_ENABLED(). We purposely avoid conditional compilation in KVM in order to avoid bitrot, and the amount of code you save isn't significant. Having a static key is more than enough to avoid runtime costs. 
> +		val |= CPTR_EL2_TSM;
>
> 	write_sysreg(val, cptr_el2);
> 	write_sysreg(__this_cpu_read(kvm_hyp_vector), vbar_el2);
>
> +	if (IS_ENABLED(CONFIG_ARM64_SME) && cpus_have_final_cap(ARM64_SME) &&
> +	    cpus_have_final_cap(ARM64_HAS_FGT)) {
> +		val = read_sysreg_s(SYS_HFGRTR_EL2);
> +		val &= ~(HFGxTR_EL2_nTPIDR_EL0_MASK |
> +			 HFGxTR_EL2_nSMPRI_EL1_MASK);
> +		write_sysreg_s(val, SYS_HFGRTR_EL2);
> +
> +		val = read_sysreg_s(SYS_HFGWTR_EL2);
> +		val &= ~(HFGxTR_EL2_nTPIDR_EL0_MASK |
> +			 HFGxTR_EL2_nSMPRI_EL1_MASK);
> +		write_sysreg_s(val, SYS_HFGWTR_EL2);
> +	}

If the CPUs do not have FGT, what provides the equivalent trapping? If FGT
is mandatory when SME exists, then you should simplify the condition.

M.

--
Without deviation from the norm, progress is not possible.
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
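One subtlety in the hunk above is that these fine-grained trap bits have negative polarity: the access traps to EL2 when the bit is *zero*, so enabling the trap means clearing the bit, not setting it. A minimal standalone model of that read-modify-write, using the names from the patch (the nSMPRI_EL1 position at bit 54 is an assumption here; the nTPIDR_EL0 position at bit 55 comes from the series — verify both against the Arm ARM):

```c
#include <assert.h>
#include <stdint.h>

/* Assumed positions of the SME-related "n" bits in HFG[RW]TR_EL2. */
#define HFGxTR_EL2_nTPIDR_EL0_MASK (1UL << 55)
#define HFGxTR_EL2_nSMPRI_EL1_MASK (1UL << 54)

/*
 * Negative-polarity FGT controls: clearing the bit enables the trap,
 * so guest entry clears both bits and host return would set them back.
 */
static uint64_t enable_sme_fgt_traps(uint64_t hfgtr)
{
	return hfgtr & ~(HFGxTR_EL2_nTPIDR_EL0_MASK |
			 HFGxTR_EL2_nSMPRI_EL1_MASK);
}
```

This polarity is why the patch masks the bits out on guest entry rather than ORing them in, and why the host side has to restore them on exit.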
Re: [PATCH v8 05/38] arm64/sme: System register and exception syndrome definitions
On Tue, 25 Jan 2022 00:10:41 +, Mark Brown wrote:
>
> +/* HFG[WR]TR_EL2 bit definitions */
> +#define HFGxTR_EL2_nTPIDR_EL0_SHIFT	55
> +#define HFGxTR_EL2_nTPIDR_EL0_MASK	(1 << HFGxTR_EL2_nTPIDR_EL0_SHIFT)

This annoyingly clashes with bit 35 of the same registers, which maps to
TPIDR_EL0. I have the feeling that this really should be TPIDR2_EL0.

M.

--
Without deviation from the norm, progress is not possible.
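A sketch of the rename being asked for, keeping both controls distinct. The bit-35 position comes from the review above; note also that the original `(1 << 55)` would overflow a 32-bit `int`, so this sketch uses `1UL`:

```c
#include <assert.h>
#include <stdint.h>

/* Bit 35 already maps to TPIDR_EL0 (per the review), so the new SME
 * control at bit 55 should be named after TPIDR2_EL0. */
#define HFGxTR_EL2_nTPIDR_EL0_SHIFT	35
#define HFGxTR_EL2_nTPIDR_EL0_MASK	(1UL << HFGxTR_EL2_nTPIDR_EL0_SHIFT)

#define HFGxTR_EL2_nTPIDR2_EL0_SHIFT	55
#define HFGxTR_EL2_nTPIDR2_EL0_MASK	(1UL << HFGxTR_EL2_nTPIDR2_EL0_SHIFT)
```

With distinct names the two traps can no longer be conflated at the call sites that clear them.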
Re: [PATCH v8 01/38] arm64: cpufeature: Always specify and use a field width for capabilities
On 25/01/2022 00:10, Mark Brown wrote:
> Since all the fields in the main ID registers are 4 bits wide we have
> up until now not bothered specifying the width in the code. Since we
> now wish to use this mechanism to enumerate features from the floating
> point feature registers which do not follow this pattern add a width to
> the table. This means updating all the existing table entries but makes
> it less likely that we run into issues in future due to implicitly
> assuming a 4 bit width.
>
> Signed-off-by: Mark Brown
> Cc: Suzuki K Poulose
> ---
>  arch/arm64/include/asm/cpufeature.h |   1 +
>  arch/arm64/kernel/cpufeature.c      | 167 +---
>  2 files changed, 102 insertions(+), 66 deletions(-)
>
> diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
> index ef6be92b1921..2728abd9cae4 100644
> --- a/arch/arm64/include/asm/cpufeature.h
> +++ b/arch/arm64/include/asm/cpufeature.h
> @@ -356,6 +356,7 @@ struct arm64_cpu_capabilities {
> 	struct {	/* Feature register checking */
> 		u32 sys_reg;
> 		u8 field_pos;
> +		u8 field_width;
> 		u8 min_field_value;
> 		u8 hwcap_type;
> 		bool sign;
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index a46ab3b1c4d5..d9f09e40aaf6 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -1307,7 +1307,9 @@ u64 __read_sysreg_by_encoding(u32 sys_id)
>  static bool
>  feature_matches(u64 reg, const struct arm64_cpu_capabilities *entry)
>  {
> -	int val = cpuid_feature_extract_field(reg, entry->field_pos, entry->sign);
> +	int val = cpuid_feature_extract_field_width(reg, entry->field_pos,
> +						    entry->field_width,
> +						    entry->sign);

Could we do something like:

	int val = cpuid_feature_extract_field_width(reg, entry->field_pos,
						    entry->field_width ? : 4,
						    ...);

and leave the existing structures as they are?

> 	return val >= entry->min_field_value;
> }
>
> @@ -1952,6 +1954,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = {

There is the arm64_errata[] array in cpu_errata.c. So, if we use the above
proposal we could leave things unchanged for all existing uses.

> 		.matches = has_cpuid_feature,
> 		.sys_reg = SYS_ID_AA64MMFR0_EL1,
> 		.field_pos = ID_AA64MMFR0_ECV_SHIFT,
> +		.field_width = 4,
> 		.sign = FTR_UNSIGNED,
> 		.min_field_value = 1,
> 	},

...

> -#define HWCAP_CPUID_MATCH(reg, field, s, min_value) \
> +#define HWCAP_CPUID_MATCH(reg, field, width, s, min_value) \
> 	.matches = has_cpuid_feature, \
> 	.sys_reg = reg, \
> 	.field_pos = field, \
> +	.field_width = width, \
> 	.sign = s, \
> 	.min_field_value = min_value,

And that could avoid these changes too. We could add:

	#define HWCAP_CPUID_MATCH_WIDTH(...)

when/if we need one, which sets the width.

Cheers
Suzuki
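Suzuki's `field_width ? : 4` suggestion (GNU C's conditional with an omitted middle operand) can be modelled with a small standalone extractor. This is a sketch of the semantics only, not the kernel's `cpuid_feature_extract_field_width()` implementation; the function name and shape here are illustrative:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Extract a feature field of the given width from an ID register
 * value, treating a zero width as the historic 4-bit default so that
 * existing table entries (which never set field_width) keep working.
 * Signed fields are sign-extended from their top bit.
 */
static int extract_field(uint64_t reg, int pos, int width, int sign)
{
	if (!width)
		width = 4;	/* the "width ? : 4" default from the review */

	if (sign) {
		/* Shift the field to the top, then arithmetic-shift down. */
		int64_t shifted = (int64_t)(reg << (64 - width - pos));
		return (int)(shifted >> (64 - width));
	}
	return (int)((reg >> pos) & ((1ULL << width) - 1));
}
```

With the default in place, only entries for the new non-4-bit floating point feature fields would need to spell out a width, which is the point of leaving the existing structures untouched.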
Re: KVM/arm64: Guest ABI changes do not appear rollback-safe
On Tue, 25 Jan 2022 03:47:04 +, Raghavendra Rao Ananta wrote:
>
> Hello all,
>
> Based on the recent discussion on the patch '[RFC PATCH v3 01/11] KVM:
> Capture VM start' [1], Reiji, I (and others in the team) were
> wondering, since the hypercall feature-map is technically per-VM and
> not per-vCPU, would it make more sense to present it as a kvm_device,
> rather than vCPU pseudo-registers to the userspace?

I really don't think so.

> If I understand correctly, the original motivation for going with
> pseudo-registers was to comply with QEMU, which uses KVM_GET_REG_LIST
> and KVM_[GET|SET]_ONE_REG interface, but I'm guessing the VMMs doing
> save/restore across migration might write the same values for every
> vCPU.

KVM currently restricts the vcpu features to be unified across vcpus,
but that's only an implementation choice. The ARM architecture doesn't
mandate that these registers are all the same, and it isn't impossible
that we'd allow for the feature set to become per-vcpu at some point in
time. So this argument doesn't really hold.

Furthermore, compatibility with QEMU's save/restore model is essential,
and AFAICT, there is no open source alternative.

> Granted that we would be diverging from the existing implementation
> (psci versioning and spectre WA registers), but this can be a cleaner
> way forward for extending hypercall support.

Cleaner? Why? How? You'd have the exact same constraints, plus the
disadvantages of having to maintain a new interface concurrently with
the existing ones.

> The kvm_device interface can be made backward compatible with the way
> hypercalls are exposed today, it can have the same registers/features
> discovery mechanisms that we discussed above, and majorly the
> userspace has to configure it only once per-VM. We can probably make
> the feature selection immutable just before any vCPU is created.

What is the problem with immutability on first run? Or even with a
'finalize' ioctl (we already have one)?

> Please let me know your thoughts or any disadvantages that I'm overlooking.

A device means yet another configuration and migration API. Don't you
think we have enough of those? The complexity of KVM/arm64 userspace API
is already insane, and extremely fragile. Adding to it will be a
validation nightmare (it already is, and I don't see anyone actively
helping with it).

So no, I have no plan to consider anything but the GET/SET_ONE_REG
interface, and until someone writes another open source VMM that has the
same save/restore capabilities and proves that this interface isn't fit
for purpose, I see no incentive in deviating from a model that is known
to work.

Thanks,

M.

--
Without deviation from the norm, progress is not possible.