Re: [PATCH v8 20/20] KVM: ARM64: Add a new kvm ARM PMU device
On 22 December 2015 at 08:08, Shannon Zhaowrote: > From: Shannon Zhao > > Add a new kvm device type KVM_DEV_TYPE_ARM_PMU_V3 for ARM PMU. Implement > the kvm_device_ops for it. > > Signed-off-by: Shannon Zhao > --- > Documentation/virtual/kvm/devices/arm-pmu.txt | 24 + > arch/arm64/include/uapi/asm/kvm.h | 4 + > include/linux/kvm_host.h | 1 + > include/uapi/linux/kvm.h | 2 + > virt/kvm/arm/pmu.c| 128 > ++ > virt/kvm/kvm_main.c | 4 + > 6 files changed, 163 insertions(+) > create mode 100644 Documentation/virtual/kvm/devices/arm-pmu.txt > > diff --git a/Documentation/virtual/kvm/devices/arm-pmu.txt > b/Documentation/virtual/kvm/devices/arm-pmu.txt > new file mode 100644 > index 000..dda864e > --- /dev/null > +++ b/Documentation/virtual/kvm/devices/arm-pmu.txt > @@ -0,0 +1,24 @@ > +ARM Virtual Performance Monitor Unit (vPMU) > +=== > + > +Device types supported: > + KVM_DEV_TYPE_ARM_PMU_V3 ARM Performance Monitor Unit v3 > + > +Instantiate one PMU instance for per VCPU through this API. Do you mean that userspace has to call this API once per vCPU to create each PMU, or that calling the device create ioctl once makes the kernel instantiate a PMU for each vCPU? (It's a little bit confusing that we say "this API" to mean "not the API documented in this file at all but actually the KVM_CREATE_DEVICE ioctl", but I see we do that in the GIC API docs too.) > + > +Groups: > + KVM_DEV_ARM_PMU_GRP_IRQ > + Attributes: > +The attr field of kvm_device_attr encodes one value: > +bits: | 63 32 | 31 0 | > +values: | reserved | vcpu_index | > +A value describing the PMU overflow interrupt number for the specified > +vcpu_index vcpu. This interrupt could be a PPI or SPI, but for one VM the > +interrupt type must be same for each vcpu. As a PPI, the interrupt > number is > +same for all vcpus, while as a SPI it must be different for each vcpu. I see we're using vcpu_index rather than MPIDR affinity value for specifying which CPU we're configuring. Is this in line with our planned API for GICv3 configuration? > + Errors: > +-ENXIO: Unsupported attribute group > +-EBUSY: The PMU overflow interrupt is already set > +-ENODEV: Getting the PMU overflow interrupt number while it's not set > +-EINVAL: Invalid vcpu_index or PMU overflow interrupt number supplied What happens if you create a PMU but then never set the IRQ number? Is there a default or does the VM refuse to run or something? Do we want an attribute like KVM_DEV_ARM_VGIC_CTRL_INIT so userspace can say "I have definitely completed configuration of this device now" ? We wound up with a fair amount of legacy mess to deal with in the GIC code because we didn't put one of those in from the start. (Perhaps we should aim to standardize on all kvm devices having one?) thanks -- PMM -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v8 20/20] KVM: ARM64: Add a new kvm ARM PMU device
On 7 January 2016 at 14:49, Shannon Zhao <zhaoshengl...@huawei.com> wrote: > > > On 2016/1/7 22:36, Peter Maydell wrote: >> On 22 December 2015 at 08:08, Shannon Zhao <zhaoshengl...@huawei.com> wrote: >>> From: Shannon Zhao <shannon.z...@linaro.org> >>> >>> Add a new kvm device type KVM_DEV_TYPE_ARM_PMU_V3 for ARM PMU. Implement >>> the kvm_device_ops for it. >>> >>> Signed-off-by: Shannon Zhao <shannon.z...@linaro.org> >>> --- >>> Documentation/virtual/kvm/devices/arm-pmu.txt | 24 + >>> arch/arm64/include/uapi/asm/kvm.h | 4 + >>> include/linux/kvm_host.h | 1 + >>> include/uapi/linux/kvm.h | 2 + >>> virt/kvm/arm/pmu.c| 128 >>> ++ >>> virt/kvm/kvm_main.c | 4 + >>> 6 files changed, 163 insertions(+) >>> create mode 100644 Documentation/virtual/kvm/devices/arm-pmu.txt >>> >>> diff --git a/Documentation/virtual/kvm/devices/arm-pmu.txt >>> b/Documentation/virtual/kvm/devices/arm-pmu.txt >>> new file mode 100644 >>> index 000..dda864e >>> --- /dev/null >>> +++ b/Documentation/virtual/kvm/devices/arm-pmu.txt >>> @@ -0,0 +1,24 @@ >>> +ARM Virtual Performance Monitor Unit (vPMU) >>> +=== >>> + >>> +Device types supported: >>> + KVM_DEV_TYPE_ARM_PMU_V3 ARM Performance Monitor Unit v3 >>> + >>> +Instantiate one PMU instance for per VCPU through this API. >> >> Do you mean that userspace has to call this API once per vCPU to >> create each PMU, or that calling the device create ioctl once makes >> the kernel instantiate a PMU for each vCPU? >> > Call the device create ioctl once and kvm will create a PMU for each > vCPU. But userspace should set the irqs for each PMU since for SPI they > are different. > >> (It's a little bit confusing that we say "this API" to mean >> "not the API documented in this file at all but actually >> the KVM_CREATE_DEVICE ioctl", but I see we do that in the GIC >> API docs too.) >> >>> + >>> +Groups: >>> + KVM_DEV_ARM_PMU_GRP_IRQ >>> + Attributes: >>> +The attr field of kvm_device_attr encodes one value: >>> +bits: | 63 32 | 31 0 | >>> +values: | reserved | vcpu_index | >>> +A value describing the PMU overflow interrupt number for the specified >>> +vcpu_index vcpu. This interrupt could be a PPI or SPI, but for one VM >>> the >>> +interrupt type must be same for each vcpu. As a PPI, the interrupt >>> number is >>> +same for all vcpus, while as a SPI it must be different for each vcpu. >> >> I see we're using vcpu_index rather than MPIDR affinity value >> for specifying which CPU we're configuring. Is this in line with >> our planned API for GICv3 configuration? >> > Here vcpu_index is used to indexing the vCPU, no special use. Yes, but you can identify the CPU by index, or by its MPIDR. We had a discussion about which was the best way for doing the VGIC API, and I can't remember which way round we ended up going for. Whichever we chose, we should do the same thing here. >>> + Errors: >>> +-ENXIO: Unsupported attribute group >>> +-EBUSY: The PMU overflow interrupt is already set >>> +-ENODEV: Getting the PMU overflow interrupt number while it's not set >>> +-EINVAL: Invalid vcpu_index or PMU overflow interrupt number supplied >> >> What happens if you create a PMU but then never set the IRQ number? >> Is there a default or does the VM refuse to run or something? >> > If userspace doesn't specify the irq number, the guest will not receive > the PMU interrupt because we check if the irq is initialized when we > inject the interrupt. But guest could still use the vPMU if QEMU > generates a proper DTB or ACPI. So is it a valid use case to create a PMU with the interrupt not wired up to anything? (If it's never valid it would be nice to diagnose it rather than just silently letting the guest run but not work right.) thanks -- PMM -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] arm: KVM: Do not update PC if the trap handler has updated it
On 22 December 2015 at 09:55, Marc Zyngierwrote: > Assuming we trap a coprocessor access, and decide that the access > is illegal, we will inject an exception in the guest. In this > case, we shouldn't increment the PC, or the vcpu will miss the > first instruction of the handler, leading to a mildly confused > guest. > > Solve this by snapshoting PC before the access is performed, > and checking if it has moved or not before incrementing it. > > Reported-by: Shannon Zhao > Signed-off-by: Marc Zyngier > --- > arch/arm/kvm/coproc.c | 14 -- > 1 file changed, 12 insertions(+), 2 deletions(-) > > diff --git a/arch/arm/kvm/coproc.c b/arch/arm/kvm/coproc.c > index f3d88dc..f4ad2f2 100644 > --- a/arch/arm/kvm/coproc.c > +++ b/arch/arm/kvm/coproc.c > @@ -447,12 +447,22 @@ static int emulate_cp15(struct kvm_vcpu *vcpu, > r = find_reg(params, cp15_regs, ARRAY_SIZE(cp15_regs)); > > if (likely(r)) { > + unsigned long pc = *vcpu_pc(vcpu); > + > /* If we don't have an accessor, we should never get here! */ > BUG_ON(!r->access); > > if (likely(r->access(vcpu, params, r))) { > - /* Skip instruction, since it was emulated */ > - kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu)); > + /* > +* Skip the instruction if it was emulated > +* without PC having changed. This allows us > +* to detect a fault being injected > +* (incrementing the PC here would cause the > +* vcpu to skip the first instruction of its > +* fault handler). > +*/ > + if (pc == *vcpu_pc(vcpu)) > + kvm_skip_instr(vcpu, > kvm_vcpu_trap_il_is32bit(vcpu)); Won't this result in our incorrectly skipping the first insn in the fault handler if the original offending instruction was itself the first insn in the fault handler? thanks -- PMM -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] arm: KVM: Do not update PC if the trap handler has updated it
On 22 December 2015 at 14:39, Christoffer Dall <christoffer.d...@linaro.org> wrote: > On Tue, Dec 22, 2015 at 11:08:10AM +0000, Peter Maydell wrote: >> Won't this result in our incorrectly skipping the first insn >> in the fault handler if the original offending instruction >> was itself the first insn in the fault handler? >> > Wouldn't that then loop with the exception forever? Yes, but so would real hardware... thanks -- PMM -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v10 0/6] QEMU support for KVM Guest Debug on arm64
On 8 December 2015 at 18:32, Alex Bennéewrote: > Hi, > > Here is the latest patch set to support debugging of KVM guests on > arm64. The main changes are fixing arm32 compiles (mostly with stubs > for the upcomming arm32 debug) and the usual bunch of minor tweaks and > clarifications following review. > > I've kept the GDB Python based test in tests/guest-debug and cleaned > it up so it will work with python2/3 linked GDBs. It still isn't > plumbed it in to the "make check" so can be dropped until we have a > solution for testing against non-host binaries. > > So in summary the changes are: > > - Fixed arm32 compile > - Use results of debug capability checks > - Whitespace and comment cleanups > - Py2/3 cleanliness for test script > > More detailed changelogs are attached to each patch. Thanks, applied to target-arm.next. (I fixed a few typos in comments and commit messages in a couple of places). -- PMM -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v9 1/6] target-arm: kvm64 - introduce kvm_arm_init_debug()
On 12 November 2015 at 16:20, Alex Bennée <alex.ben...@linaro.org> wrote: > As we haven't always had guest debug support we need to probe for it. > Additionally we don't do this in the start-up capability code so we > don't fall over on old kernels. > > Signed-off-by: Alex Bennée <alex.ben...@linaro.org> > --- > target-arm/kvm64.c | 18 ++ > 1 file changed, 18 insertions(+) > > diff --git a/target-arm/kvm64.c b/target-arm/kvm64.c > index ceebfeb..d087794 100644 > --- a/target-arm/kvm64.c > +++ b/target-arm/kvm64.c > @@ -25,6 +25,22 @@ > #include "internals.h" > #include "hw/arm/arm.h" > > +static bool have_guest_debug; > + > +/** > + * kvm_arm_init_debug() > + * @cs: CPUState > + * > + * Check for guest debug capabilities. > + * > + */ > +static void kvm_arm_init_debug(CPUState *cs) > +{ > +have_guest_debug = kvm_check_extension(cs->kvm_state, > + KVM_CAP_SET_GUEST_DEBUG); > +return; > +} > + > static inline void set_feature(uint64_t *features, int feature) > { > *features |= 1ULL << feature; > @@ -121,6 +137,8 @@ int kvm_arch_init_vcpu(CPUState *cs) > } > cpu->mp_affinity = mpidr & ARM64_AFFINITY_MASK; > > +kvm_arm_init_debug(cs); > + > return kvm_arm_init_cpreg_list(cpu); > } I assume in practice the kernel guarantees that either all CPUs have the SET_GUEST_DEBUG cap, or none do :-) Reviewed-by: Peter Maydell <peter.mayd...@linaro.org> thanks -- PMM -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v9 1/6] target-arm: kvm64 - introduce kvm_arm_init_debug()
On 20 November 2015 at 15:05, Peter Maydell <peter.mayd...@linaro.org> wrote: > On 12 November 2015 at 16:20, Alex Bennée <alex.ben...@linaro.org> wrote: >> As we haven't always had guest debug support we need to probe for it. >> Additionally we don't do this in the start-up capability code so we >> don't fall over on old kernels. >> >> Signed-off-by: Alex Bennée <alex.ben...@linaro.org> >> --- >> target-arm/kvm64.c | 18 ++ >> 1 file changed, 18 insertions(+) >> >> diff --git a/target-arm/kvm64.c b/target-arm/kvm64.c >> index ceebfeb..d087794 100644 >> --- a/target-arm/kvm64.c >> +++ b/target-arm/kvm64.c >> @@ -25,6 +25,22 @@ >> #include "internals.h" >> #include "hw/arm/arm.h" >> >> +static bool have_guest_debug; >> + >> +/** >> + * kvm_arm_init_debug() >> + * @cs: CPUState >> + * >> + * Check for guest debug capabilities. >> + * >> + */ >> +static void kvm_arm_init_debug(CPUState *cs) >> +{ >> +have_guest_debug = kvm_check_extension(cs->kvm_state, >> + KVM_CAP_SET_GUEST_DEBUG); >> +return; >> +} >> + >> static inline void set_feature(uint64_t *features, int feature) >> { >> *features |= 1ULL << feature; >> @@ -121,6 +137,8 @@ int kvm_arch_init_vcpu(CPUState *cs) >> } >> cpu->mp_affinity = mpidr & ARM64_AFFINITY_MASK; >> >> +kvm_arm_init_debug(cs); >> + >> return kvm_arm_init_cpreg_list(cpu); >> } > > I assume in practice the kernel guarantees that either all > CPUs have the SET_GUEST_DEBUG cap, or none do :-) > > Reviewed-by: Peter Maydell <peter.mayd...@linaro.org> ...except I've just noticed that nothing else in this patchset ever reads the have_guest_debug bool we just set, so what is the purpose of this patch? thanks -- PMM -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v9 2/6] target-arm: kvm - implement software breakpoints
On 12 November 2015 at 16:20, Alex Bennéewrote: > These don't involve messing around with debug registers, just setting > the breakpoint instruction in memory. GDB will not use this mechanism if > it can't access the memory to write the breakpoint. > > All the kernel has to do is ensure the hypervisor traps the breakpoint > exceptions and returns to userspace. > > Signed-off-by: Alex Bennée > > -- > v2 > - handle debug exit with new hsr exception info > - add verbosity to UNIMP message > v3 > - sync with kvm_cpu_synchronize_state() before checking PC. > - use internals.h defines > - use env->pc > - use proper format types > v9 > - add include for error_report > - define a brk_insn constant > --- > target-arm/kvm.c | 90 > > 1 file changed, 78 insertions(+), 12 deletions(-) > > diff --git a/target-arm/kvm.c b/target-arm/kvm.c > index 79ef4c6..50f70ef 100644 > --- a/target-arm/kvm.c > +++ b/target-arm/kvm.c > @@ -17,6 +17,7 @@ > > #include "qemu-common.h" > #include "qemu/timer.h" > +#include "qemu/error-report.h" > #include "sysemu/sysemu.h" > #include "sysemu/kvm.h" > #include "kvm_arm.h" > @@ -516,9 +517,60 @@ MemTxAttrs kvm_arch_post_run(CPUState *cs, struct > kvm_run *run) > return MEMTXATTRS_UNSPECIFIED; > } > > +/* See v8 ARM ARM D7.2.27 ESR_ELx, Exception Syndrome Register > + * > + * To minimise translating between kernel and user-space the kernel > + * ABI just provides user-space with the full exception syndrome > + * register value to be decoded in QEMU. > + */ > + > +static int kvm_handle_debug(CPUState *cs, struct kvm_run *run) > +{ > +struct kvm_debug_exit_arch *arch_info = >debug.arch; > +int hsr_ec = arch_info->hsr >> ARM_EL_EC_SHIFT; This doesn't build for 32-bit ARM: target-arm/kvm.c:530:27: error: ‘struct kvm_debug_exit_arch’ has no member named ‘hsr’ int hsr_ec = arch_info->hsr >> ARM_EL_EC_SHIFT; > +ARMCPU *cpu = ARM_CPU(cs); > +CPUARMState *env = >env; > + > +/* Ensure PC is synchronised */ > +kvm_cpu_synchronize_state(cs); > + > +switch (hsr_ec) { > +case EC_AA64_BKPT: > +if (kvm_find_sw_breakpoint(cs, env->pc)) { > +return true; > +} > +break; > +default: > +error_report("%s: unhandled debug exit (%"PRIx32", %"PRIx64")\n", > + __func__, arch_info->hsr, env->pc); > +} > + > +/* If we don't handle this it could be it really is for the > + guest to handle */ > +qemu_log_mask(LOG_UNIMP, > + "%s: re-injecting exception not yet implemented" > + " (0x%"PRIx32", %"PRIx64")\n", > + __func__, hsr_ec, env->pc); > + > +return false; > +} > + > int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run) > { > -return 0; > +int ret = 0; > + > +switch (run->exit_reason) { > +case KVM_EXIT_DEBUG: > +if (kvm_handle_debug(cs, run)) { > +ret = EXCP_DEBUG; > +} /* otherwise return to guest */ > +break; > +default: > +qemu_log_mask(LOG_UNIMP, "%s: un-handled exit reason %d\n", > + __func__, run->exit_reason); > +break; > +} > +return ret; > } > > bool kvm_arch_stop_on_emulation_error(CPUState *cs) > @@ -543,14 +595,34 @@ int kvm_arch_on_sigbus(int code, void *addr) > > void kvm_arch_update_guest_debug(CPUState *cs, struct kvm_guest_debug *dbg) > { > -qemu_log_mask(LOG_UNIMP, "%s: not implemented\n", __func__); > +if (kvm_sw_breakpoints_active(cs)) { > +dbg->control |= KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_USE_SW_BP; This won't compile on 32-bit ARM either: target-arm/kvm.c:634:38: error: ‘KVM_GUESTDBG_USE_SW_BP’ undeclared (first use in this function) dbg->control |= KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_USE_SW_BP; > +} > } > > -int kvm_arch_insert_sw_breakpoint(CPUState *cs, > - struct kvm_sw_breakpoint *bp) > +/* C6.6.29 BRK instruction */ > +static const uint32_t brk_insn = 0xd420; This is the A64 breakpoint instruction, so why is it in the common-to-32-and-64 source file? How about the A32 and T16 breakpoint insns? > + > +int kvm_arch_insert_sw_breakpoint(CPUState *cs, struct kvm_sw_breakpoint *bp) > { > -qemu_log_mask(LOG_UNIMP, "%s: not implemented\n", __func__); > -return -EINVAL; > + > +if (cpu_memory_rw_debug(cs, bp->pc, (uint8_t *)>saved_insn, 4, 0) || > +cpu_memory_rw_debug(cs, bp->pc, (uint8_t *)_insn, 4, 1)) { > +return -EINVAL; > +} Should we allow insertion of sw breakpoint insns if the kernel doesn't implement KVM_EXIT_DEBUG and reporting the ESR to us? > +return 0; > +} > + > +int kvm_arch_remove_sw_breakpoint(CPUState *cs, struct kvm_sw_breakpoint *bp) > +{ > +static uint32_t brk; > + > +if (cpu_memory_rw_debug(cs, bp->pc, (uint8_t *), 4, 0) || > +
Re: [PATCH v9 4/6] target-arm: kvm - add support for HW assisted debug
On 12 November 2015 at 16:20, Alex Bennéewrote: > This adds basic support for HW assisted debug. The ioctl interface to > KVM allows us to pass an implementation defined number of break and > watch point registers. When KVM_GUESTDBG_USE_HW is specified these > debug registers will be installed in place on the world switch into the > guest. > > The hardware is actually capable of more advanced matching but it is > unclear if this expressiveness is available via the gdbstub protocol. > > Signed-off-by: Alex Bennée > > --- > v2 > - correct setting of PMC/BAS/MASK > - improved commentary > - added helper function to check watchpoint in range > - fix find/deletion of watchpoints > v3 > - use internals.h definitions > v6 > - KVM_GUESTDBG_USE_HW_BP->KVM_GUESTDBG_USE_HW > - renamed some helper functions to avoid confusion > v9 > - fix merge conflicts on re-base > - rm asm/ptrace.h include > - add additional commentry for hw breakpoints > - explain gdb's model for HW bkpts > - fix up spacing, formatting as per checkpatch > - better PAC values > - use is_power_of_2 > - use _arm_ fn naming and add docs > - add a CPUWatchpoint structure for reporting > - replace manual array manipulation with g_array abstraction > --- > target-arm/kvm.c | 38 +++--- > target-arm/kvm64.c | 352 > ++- > target-arm/kvm_arm.h | 38 ++ > 3 files changed, 406 insertions(+), 22 deletions(-) > > diff --git a/target-arm/kvm.c b/target-arm/kvm.c > index d505a7e..1f57e92 100644 > --- a/target-arm/kvm.c > +++ b/target-arm/kvm.c > @@ -547,6 +547,20 @@ static int kvm_handle_debug(CPUState *cs, struct kvm_run > *run) > return true; > } > break; > +case EC_BREAKPOINT: > +if (kvm_arm_find_hw_breakpoint(cs, env->pc)) { > +return true; > +} > +break; > +case EC_WATCHPOINT: > +{ > +CPUWatchpoint *wp = kvm_arm_find_hw_watchpoint(cs, arch_info->far); This won't compile for 32-bit ARM. > +if (wp) { > +cs->watchpoint_hit = wp; > +return true; > +} > +break; > +} > default: > error_report("%s: unhandled debug exit (%"PRIx32", %"PRIx64")\n", > __func__, arch_info->hsr, env->pc); > @@ -608,6 +622,10 @@ void kvm_arch_update_guest_debug(CPUState *cs, struct > kvm_guest_debug *dbg) > if (kvm_sw_breakpoints_active(cs)) { > dbg->control |= KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_USE_SW_BP; > } > +if (kvm_arm_hw_debug_active(cs)) { > +dbg->control |= KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_USE_HW; > +kvm_arm_copy_hw_debug_data(>arch); > +} > } > > /* C6.6.29 BRK instruction */ > @@ -635,26 +653,6 @@ int kvm_arch_remove_sw_breakpoint(CPUState *cs, struct > kvm_sw_breakpoint *bp) > return 0; > } > > -int kvm_arch_insert_hw_breakpoint(target_ulong addr, > - target_ulong len, int type) > -{ > -qemu_log_mask(LOG_UNIMP, "%s: not implemented\n", __func__); > -return -EINVAL; > -} You've moved these functions into kvm64.c but haven't provided a stub version in kvm32.c... > - > -int kvm_arch_remove_hw_breakpoint(target_ulong addr, > - target_ulong len, int type) > -{ > -qemu_log_mask(LOG_UNIMP, "%s: not implemented\n", __func__); > -return -EINVAL; > -} > - > - > -void kvm_arch_remove_all_hw_breakpoints(void) > -{ > -qemu_log_mask(LOG_UNIMP, "%s: not implemented\n", __func__); > -} > - > void kvm_arch_init_irq_routing(KVMState *s) > { > } > diff --git a/target-arm/kvm64.c b/target-arm/kvm64.c > index d087794..c468324 100644 > --- a/target-arm/kvm64.c > +++ b/target-arm/kvm64.c > @@ -2,6 +2,7 @@ > * ARM implementation of KVM hooks, 64 bit specific code > * > * Copyright Mian-M. Hamayun 2013, Virtual Open Systems > + * Copyright Alex Bennée 2014, Linaro > * > * This work is licensed under the terms of the GNU GPL, version 2 or later. > * See the COPYING file in the top-level directory. > @@ -12,12 +13,17 @@ > #include > #include > #include > +#include > > +#include > #include > > #include "config-host.h" > #include "qemu-common.h" > #include "qemu/timer.h" > +#include "qemu/host-utils.h" > +#include "qemu/error-report.h" > +#include "exec/gdbstub.h" > #include "sysemu/sysemu.h" > #include "sysemu/kvm.h" > #include "kvm_arm.h" > @@ -27,20 +33,362 @@ > > static bool have_guest_debug; > > +/* > + * Although the ARM implementation of hardware assisted debugging > + * allows for different breakpoints per-core the current GDB interface Comma after "per-core". > + * treats them as a global pool of registers (which seems to be the > + * case for x86, ppc and s390). As a result we store one copy of > + * registers which is used for all active cores. > + * > + * Write access is serialised by virtue of the
Re: [PATCH v9 6/6] tests/guest-debug: introduce basic gdbstub tests
On 12 November 2015 at 16:20, Alex Bennéewrote: > From: Alex Bennée > > The aim of these tests is to combine with an appropriate kernel > image (with symbol-file vmlinux) and check it behaves as it should. > Given a kernel it checks: > > - single step > - software breakpoint > - hardware breakpoint > - access, read and write watchpoints > > On success it returns 0 to the calling process. > > I've not plumbed this into the "make check" logic though as we need a > solution for providing non-host binaries to the tests. However the test > is structured to work with pretty much any Linux kernel image as it > uses the basic kernel_init code which is common across architectures. Do these tests pass if you run them on the TCG QEMU, just out of interest? I'm not a great fan of tests that aren't in 'make check' because IME they just bitrot, but as you say we have no sensible approach for handling tests that need to run real guest code :-( thanks -- PMM -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v9 5/6] target-arm: kvm - re-inject guest debug exceptions
On 12 November 2015 at 16:20, Alex Bennéewrote: > From: Alex Bennée > > If we can't find details for the debug exception in our debug state > then we can assume the exception is due to debugging inside the guest. > To inject the exception into the guest state we re-use the TCG exception > code (do_interupt). "do_interrupt". > > However while guest debugging is in effect we currently can't handle the > guest using single step which is heavily used by GDB. Can you expand this to be clearer about what the problem is here? Is this a thing fixed by this commit or a remaining issue after it? > Signed-off-by: Alex Bennée > > --- > v5: > - new for v5 > --- > target-arm/helper-a64.c | 12 ++-- > target-arm/kvm.c| 27 +++ > 2 files changed, 29 insertions(+), 10 deletions(-) > > diff --git a/target-arm/helper-a64.c b/target-arm/helper-a64.c > index deb8dbe..fc3ccdf 100644 > --- a/target-arm/helper-a64.c > +++ b/target-arm/helper-a64.c > @@ -25,6 +25,7 @@ > #include "qemu/bitops.h" > #include "internals.h" > #include "qemu/crc32c.h" > +#include "sysemu/kvm.h" > #include /* For crc32 */ > > /* C2.4.7 Multiply and divide */ > @@ -469,7 +470,8 @@ void aarch64_cpu_do_interrupt(CPUState *cs) >new_el); > if (qemu_loglevel_mask(CPU_LOG_INT) > && !excp_is_internal(cs->exception_index)) { > -qemu_log_mask(CPU_LOG_INT, "...with ESR 0x%" PRIx32 "\n", > +qemu_log_mask(CPU_LOG_INT, "...with ESR %x/0x%" PRIx32 "\n", > + env->exception.syndrome >> ARM_EL_EC_SHIFT, >env->exception.syndrome); > } > > @@ -535,6 +537,12 @@ void aarch64_cpu_do_interrupt(CPUState *cs) > aarch64_restore_sp(env, new_el); > > env->pc = addr; > -cs->interrupt_request |= CPU_INTERRUPT_EXITTB; > + > +qemu_log_mask(CPU_LOG_INT, "...to EL%d PC 0x%" PRIx64 " PSTATE 0x%x\n", > + new_el, env->pc, pstate_read(env)); > + > +if (!kvm_enabled()) { > +cs->interrupt_request |= CPU_INTERRUPT_EXITTB; > +} > } > #endif > diff --git a/target-arm/kvm.c b/target-arm/kvm.c > index 1f57e92..4ac177a 100644 > --- a/target-arm/kvm.c > +++ b/target-arm/kvm.c > @@ -529,9 +529,10 @@ static int kvm_handle_debug(CPUState *cs, struct kvm_run > *run) > struct kvm_debug_exit_arch *arch_info = >debug.arch; > int hsr_ec = arch_info->hsr >> ARM_EL_EC_SHIFT; > ARMCPU *cpu = ARM_CPU(cs); > +CPUClass *cc = CPU_GET_CLASS(cs); > CPUARMState *env = >env; > > -/* Ensure PC is synchronised */ > +/* Ensure all state is synchronised */ You might as well have just written the comment like that to start with :-) > kvm_cpu_synchronize_state(cs); > > switch (hsr_ec) { > @@ -539,7 +540,14 @@ static int kvm_handle_debug(CPUState *cs, struct kvm_run > *run) > if (cs->singlestep_enabled) { > return true; > } else { > -error_report("Came out of SINGLE STEP when not enabled"); > +/* > + * The kernel should have supressed the guests ability to "suppressed". "guest's". > + * single step at this point so something has gone wrong. > + */ > +error_report("%s: guest single-step while debugging unsupported" > + " (%"PRIx64", %"PRIx32")\n", > + __func__, env->pc, arch_info->hsr); > +return false; Why didn't we just write the error_report this way to start with? > } > break; > case EC_AA64_BKPT: > @@ -564,14 +572,17 @@ static int kvm_handle_debug(CPUState *cs, struct > kvm_run *run) > default: > error_report("%s: unhandled debug exit (%"PRIx32", %"PRIx64")\n", > __func__, arch_info->hsr, env->pc); > +return false; Might as well put this "return false;" in the original code that adds the default case, rather than changing the control flow in this patch. > } > > -/* If we don't handle this it could be it really is for the > - guest to handle */ > -qemu_log_mask(LOG_UNIMP, > - "%s: re-injecting exception not yet implemented" > - " (0x%"PRIx32", %"PRIx64")\n", > - __func__, hsr_ec, env->pc); > +/* If we are not handling the debug exception it must belong to > + * the guest. Let's re-use the existing TCG interrupt code to set > + * everything up properly Missing trailing ".". > + */ > +cs->exception_index = EXCP_BKPT; > +env->exception.syndrome = arch_info->hsr; > +env->exception.vaddress = arch_info->far; You need to set env->exception.target_el to 1 as well. > +cc->do_interrupt(cs); > > return false; > } > -- > 2.6.3 > thanks -- PMM -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo
Re: [PATCH v9 3/6] target-arm: kvm - support for single step
On 12 November 2015 at 16:20, Alex Bennéewrote: > This adds support for single-step. There isn't much to do on the QEMU > side as after we set-up the request for single step via the debug ioctl > it is all handled within the kernel. > > Signed-off-by: Alex Bennée > > --- > v2 > - convert to using HSR_EC > v3 > - use internals.h definitions > --- > target-arm/kvm.c | 10 ++ > 1 file changed, 10 insertions(+) > > diff --git a/target-arm/kvm.c b/target-arm/kvm.c > index 50f70ef..d505a7e 100644 > --- a/target-arm/kvm.c > +++ b/target-arm/kvm.c > @@ -535,6 +535,13 @@ static int kvm_handle_debug(CPUState *cs, struct kvm_run > *run) > kvm_cpu_synchronize_state(cs); > > switch (hsr_ec) { > +case EC_SOFTWARESTEP: > +if (cs->singlestep_enabled) { > +return true; > +} else { > +error_report("Came out of SINGLE STEP when not enabled"); > +} > +break; > case EC_AA64_BKPT: > if (kvm_find_sw_breakpoint(cs, env->pc)) { > return true; > @@ -595,6 +602,9 @@ int kvm_arch_on_sigbus(int code, void *addr) > > void kvm_arch_update_guest_debug(CPUState *cs, struct kvm_guest_debug *dbg) > { > +if (cs->singlestep_enabled) { > +dbg->control |= KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_SINGLESTEP; > +} Doesn't kvm_update_guest_debug() already set these bits, or am I misreading it? > if (kvm_sw_breakpoints_active(cs)) { > dbg->control |= KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_USE_SW_BP; > } > -- > 2.6.3 thanks -- PMM -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] [PATCH] kvm-all: PAGE_SIZE should be real host page size
On 10 November 2015 at 16:59, Andrew Jones <drjo...@redhat.com> wrote: > On Tue, Nov 10, 2015 at 04:29:31PM +0000, Peter Maydell wrote: >> On 10 November 2015 at 00:23, Andrew Jones <drjo...@redhat.com> wrote: >> > -/* KVM uses PAGE_SIZE in its definition of COALESCED_MMIO_MAX */ >> > -#define PAGE_SIZE TARGET_PAGE_SIZE >> > +/* KVM uses PAGE_SIZE in its definition of KVM_COALESCED_MMIO_MAX. We >> > + * need to use the real host PAGE_SIZE, as that's what KVM will use. >> > + */ >> > +#define PAGE_SIZE getpagesize() >> >> Rather than defining PAGE_SIZE here (a confusing macro given >> we have several page sizes to deal with), why not just use >> getpagesize() in the one and only location where we currently >> use this macro? > > The macro is used by kernel headers that we import and include in > kvm-all.c. It's ugly, I agree, but that's how the this cookie crumbled. Oh, I see. That's pretty horrible. thanks -- PMM -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] [PATCH] kvm-all: PAGE_SIZE should be real host page size
On 10 November 2015 at 00:23, Andrew Joneswrote: > Just noticed this while grepping TARGET_PAGE_SIZE for an unrelated > reason. I didn't use qemu_real_host_page_size as kvm_set_phys_mem() > does, because we'd need to make sure page_size_init() has run first. > > Signed-off-by: Andrew Jones > --- > kvm-all.c | 6 -- > 1 file changed, 4 insertions(+), 2 deletions(-) > > diff --git a/kvm-all.c b/kvm-all.c > index 1bc12737723c3..de9ff5971fb3b 100644 > --- a/kvm-all.c > +++ b/kvm-all.c > @@ -45,8 +45,10 @@ > #include > #endif > > -/* KVM uses PAGE_SIZE in its definition of COALESCED_MMIO_MAX */ > -#define PAGE_SIZE TARGET_PAGE_SIZE > +/* KVM uses PAGE_SIZE in its definition of KVM_COALESCED_MMIO_MAX. We > + * need to use the real host PAGE_SIZE, as that's what KVM will use. > + */ > +#define PAGE_SIZE getpagesize() Rather than defining PAGE_SIZE here (a confusing macro given we have several page sizes to deal with), why not just use getpagesize() in the one and only location where we currently use this macro? Also, you're guaranteed that page_size_init() has been run, because we call that from kvm_init(), and you can't call kvm_vcpu_init() before kvm_init(). thanks -- PMM -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4 0/3] KVM: arm/arm64: Clean up some obsolete code
On 5 November 2015 at 06:50, Pavel Fedin> You know, since we are talking about this... This definitely > has something to do with the reset, and... Looks like nobody > resets vGIC/vTimer, unless the userland does it explicitly by > resetting every register by hand. This is how KVM in-kernel device reset is supposed to work, yes. thanks -- PMM -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] [PATCH 3/7] linux-headers/kvm: add Hyper-V SynIC irq routing type and struct
On 26 October 2015 at 09:50, Andrey Smetaninwrote: > Signed-off-by: Andrey Smetanin > Reviewed-by: Roman Kagan > Signed-off-by: Denis V. Lunev > CC: Vitaly Kuznetsov > CC: "K. Y. Srinivasan" > CC: Gleb Natapov > CC: Paolo Bonzini > CC: Roman Kagan > CC: Denis V. Lunev > CC: kvm@vger.kernel.org > CC: virtualizat...@lists.linux-foundation.org > > --- > linux-headers/linux/kvm.h | 8 > 1 file changed, 8 insertions(+) Hi. Changes to linux-headers/ should only be made as part of an automated update from a mainline Linux kernel tree using the scripts/update-linux-headers.sh script. This patch looks like maybe it was a manual edit ? thanks -- PMM -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] [PATCH 3/7] linux-headers/kvm: add Hyper-V SynIC irq routing type and struct
On 26 October 2015 at 10:12, Denis V. Lunev <d...@openvz.org> wrote: > On 10/26/2015 01:03 PM, Peter Maydell wrote: >> Hi. Changes to linux-headers/ should only be made as part of >> an automated update from a mainline Linux kernel tree using >> the scripts/update-linux-headers.sh script. This patch looks >> like maybe it was a manual edit ? > yep. We know and have discussed this with Paolo already. > Kernel stuff is in progress at the moment. The patch > is presented to interested people to allow to compile and > run. > > Actual merge will be performed with proper sync > when kernel will be in rc3 or 4 stage and the patch will be > dropped. OK, cool. It's best to tag the series as 'RFC' rather than 'PATCH' in that case, as a reminder that it can't be applied just yet. thanks -- PMM -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v5 0/7] KVM: arm64: Implement API for vGICv3 live migration
On 12 October 2015 at 09:29, Pavel Fedin <p.fe...@samsung.com> wrote: > This patchset adds necessary userspace API in order to support vGICv3 live > migration. GICv3 registers are accessed using device attribute ioctls, > similar to GICv2. > > Whoever wants to test it, please note that this version is not > binary-compatible with previous one, the API has been seriously changed. > qemu patchess will be posted in some time. > > v4 => v5: > - Adapted to new API by Peter Maydell, Marc Zyngier and Christoffer Dall. > Acked-by's on the documentation were dropped, just in case, because i > slightly adjusted it. Additionally, i merged all doc updates into one > patch. Could you tell us what you changed in the doc patch from the version that got sent out with the acks, please? thanks -- PMM -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] KVM: arm/arm64: Add VGICv3 save/restore API documentation
On 8 October 2015 at 10:10, Pavel Fedinwrote: > Sorry, didn't want to offend anyone. I just wanted to tell that i know > that you, as maintainers, have much more power than i do, and you can > always say "it's political decision, we just want it and that's final", > and if you choose to do this, i'm perfectly OK with that, just say it. This isn't intended to be a political decision; it's our joint technical opinion on the best design for this API. Since we all happened to be in one physical location the other week it was a good opportunity for us to work through how we thought the API should look from a technical perspective. At that point it seemed to us to be clearer to write up the results of that discussion as a single patch to the API documentation, rather than doing it by (for instance) making lots of different review comments on your patches. Christoffer's API documentation patch is a patch like any other and it has to go through review. If there are parts where you don't understand the rationale or you think we got something wrong you should let us know. thanks -- PMM -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] KVM: arm/arm64: Add VGICv3 save/restore API documentation
On 8 October 2015 at 13:45, Pavel Fedinwrote: >> Speaking of which, does the QEMU side of this patch set require first >> adding the GICv3 emulation for the data structures or is there a >> stand-alone migration patch set somewhere? > > I rolled it out a week ago: > https://www.mail-archive.com/qemu-devel@nongnu.org/msg325331.html. I > tried to get reviewed/accepted/whatever at least data structure part, but > Peter replied that he > isn't interested before we have the kernel. More specifically, I wanted us to agree on the kernel API for migration, because that significantly affects how the QEMU code looks. A quick look at your patch suggests you still have data structures like +struct gicv3_irq_state { +/* The enable bits are only banked for per-cpu interrupts. */ +unsigned long *enabled; +unsigned long *pending; +unsigned long *active; +unsigned long *level; +unsigned long *group; +bool edge_trigger; /* true: edge-triggered, false: level-triggered */ +uint32_t mask_size; /* Size of bitfields in long's, for migration */ +}; I think it's probably going to be better to have an array of redistributor structures (one per CPU), and then keep the state that in hardware is in each redistributor inside those sub-structures. Similarly for state that in hardware is inside the distributor, and inside each cpu interface. thanks -- PMM -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] [PATCH 1/3] Target-microblaze: Remove unnecessary variable
On 5 October 2015 at 08:18, Michael Tokarevwrote: > 05.10.2015 08:18, Markus Armbruster wrote: >> Why? I like having the semantic patch in the commit message when >> there's any chance we'll want do the same mechanical change again later. >> >> You could save space and include it by reference, though: "Same >> Coccinelle semantic patch as is commit 74c373e". > > git commit messages aren't good documentation for various scripts > like this, this info will be lost in the noize. If it might be > better to keep such scripts in a separate file where it is easier > to find, or in a wiki page on the site. The key point is where to > find the info, git log is difficult for that, especially when you > don't know what to search for or that such a script exists in > there in the first place. > > On the other hand, when git log is cluttered by such a long messages > for such small changes, it becomes more difficult to find info which > you really look in git log -- namely, which changes were made that > might have introduced this regression, things like that. I think it can be useful when you're looking at a commit to know that it was automatically created, especially if it's a big commit. It means that if you're looking for a bug in it you can concentrate on the script that created it rather than the possibly large set of changes it produced, or if you're trying to cherry-pick it into another branch you can just apply the script instead. In a commit with a change this small it's not very significant either way, though. thanks -- PMM -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC] Handling CP15 timer without in-kernel irqchip
On 2 October 2015 at 11:18, Paolo Bonzini <pbonz...@redhat.com> wrote: > > > On 02/10/2015 12:16, Peter Maydell wrote: >> On 2 October 2015 at 11:05, Paolo Bonzini <pbonz...@redhat.com> wrote: >>> On 02/10/2015 11:58, Peter Maydell wrote: >>>> I definitely dislike the latter -- userspace ends up having to >>>> emulate part of the CPU even though that CPU support is really >>>> there in hardware. Also it requires us to edit the device tree, >>>> which means it won't work at all on boards other than 'virt' >>>> where we use the kernel's device tree rather than creating our >>>> own. Better for the kernel to forward the timer >>>> interrupts back out to userspace's irq controller. >>> >>> How do boards other than 'virt' work when emulated without KVM? It must >>> be possible to emulate the physical timer in QEMU. >> >> Without KVM is easy -- we emulate the physical timer as just >> one of the parts of the emulated CPU. With KVM, we don't emulate >> the CPU at all. We don't try to handle a "half TCG half KVM" setup. > > I mean in the device tree. Does the boot loader realize it's under a > hypervisor, and provide different device trees to the kernel? No. The device tree says "you have both physical and virtual timers", the guest kernel always prefers the virtual timer, and this works for both KVM and TCG. -- PMM -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC] Handling CP15 timer without in-kernel irqchip
On 2 October 2015 at 11:05, Paolo Bonzini <pbonz...@redhat.com> wrote: > On 02/10/2015 11:58, Peter Maydell wrote: >> I definitely dislike the latter -- userspace ends up having to >> emulate part of the CPU even though that CPU support is really >> there in hardware. Also it requires us to edit the device tree, >> which means it won't work at all on boards other than 'virt' >> where we use the kernel's device tree rather than creating our >> own. Better for the kernel to forward the timer >> interrupts back out to userspace's irq controller. > > How do boards other than 'virt' work when emulated without KVM? It must > be possible to emulate the physical timer in QEMU. Without KVM is easy -- we emulate the physical timer as just one of the parts of the emulated CPU. With KVM, we don't emulate the CPU at all. We don't try to handle a "half TCG half KVM" setup. thanks -- PMM -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC] Handling CP15 timer without in-kernel irqchip
On 2 October 2015 at 10:30, Paolo Bonziniwrote: > > > On 02/10/2015 09:28, Pavel Fedin wrote: >> 2. Another possible approach, based on how device tree binding is handled >> by Linux. It is possible >> to remove virtual timer IRQ from the device tree, in this case the kernel >> reverts to using physical >> timer. When running under hypervisor, accesses to physical CP15 timer are >> trapped into HYP, >> therefore we can forward them to userspace using new exit code, something >> like KVM_EXIT_REG_ACCESS. >> In this case the timer would be also emulated by the userspace, which is >> slower, but allows better >> emulation. Also, this could be used in order to run some other guests which >> just expect physical >> timer to be there. >> >> Both approaches have their own limitations, but anyway this is much better >> than nothing. What do >> you think, and which approach do you like more? > > I like the latter. But I guess one could even do both? I definitely dislike the latter -- userspace ends up having to emulate part of the CPU even though that CPU support is really there in hardware. Also it requires us to edit the device tree, which means it won't work at all on boards other than 'virt' where we use the kernel's device tree rather than creating our own. Better for the kernel to forward the timer interrupts back out to userspace's irq controller. -- PMM -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] [PATCH 2/3] Hw: timer: Remove unnecessary variable
On 25 September 2015 at 01:37, Shraddha Barkewrote: > Compress lines and remove the variable ret. > > Change made using Coccinelle script > diff --git a/hw/timer/tusb6010.c b/hw/timer/tusb6010.c > index 459c748..ba01050 100644 > --- a/hw/timer/tusb6010.c > +++ b/hw/timer/tusb6010.c > @@ -321,7 +321,6 @@ static uint32_t tusb_async_readw(void *opaque, hwaddr > addr) > TUSBState *s = (TUSBState *) opaque; > int offset = addr & 0xfff; > int epnum; > -uint32_t ret; > > switch (offset) { > case TUSB_DEV_CONF: > @@ -338,12 +337,11 @@ static uint32_t tusb_async_readw(void *opaque, hwaddr > addr) > return 0x00; /* TODO */ > > case TUSB_DEV_OTG_STAT: > -ret = s->otg_status; > #if 0 > if (!(s->prcm_mngmt & TUSB_PRCM_MNGMT_OTG_VBUS_DET_EN)) > ret &= ~TUSB_DEV_OTG_STAT_VBUS_VALID; > #endif > -return ret; > +return s->otg_status; > case TUSB_DEV_OTG_TIMER: > return s->otg_timer_val; > This change would break the #if-0'd out code if we ever reenabled it. -- PMM -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 5/5] KVM: arm64: Implement vGICv3 CPU interface access
On 25 September 2015 at 15:27, Andre Przywarawrote: > On 24/09/15 13:08, Pavel Fedin wrote: >> Hello! >> >>> The only thing that is pure 64-bit is the MRS/MSR _instruction_ in >>> Aarch64, which always takes a x register. >>> So can you model the register size according to the spec and allow >>> 32-bit accesses from userland? >> >> I would like to complete the rework and respin v4, but this is, i guess, >> the only major issue left. >> Additionally, it impacts the API. So... >> In order to allow 32-bit accesses we would have to drop using >> ARM64_SYS_REG() for building >> attribute ID and introduce something own, like KVM_DEV_ARM_VGIC_REG(). It >> will have different bits >> layout (actually it will be missing 'arch' and 'size' field, and instead i >> will use >> KVM_DEV_ARM_VGIC_64BIT flag for length specification, the same as for >> redistributor. >> Will this be OK ? > > No, instead you should go with your original approach ;-) > Thinking about that again I see that this interface is of course modeled > after the architectured GICv3 system registers, where AArch32 has its > own, separate encoding. So it's perfectly fine to use that 64-bit > interface between userland and KVM now. If we later get Aarch32 support > for the GICv3, we can add the appropriate Aarch32 sysregs to that > interface and have a natural match. Christoffer, Marc and I had a discussion about how best to handle GICv3 state save and restore here at Linaro Connect this week, and one of the conclusions there was that all of it should be done via the device fd's save/restore interface, not via the SET/GET_ONE_REG CPU register interface. (Because it's much less confusing to have all of the GICv3's state be dealt with via the same interface, rather than splitting it; it's also more convenient for QEMU.) More details to follow from one of us next week when we aren't all on aeroplanes, I expect. thanks -- PMM -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] KVM: arm64: remove all traces of the ThumbEE registers
On 15 September 2015 at 17:15, Will Deaconwrote: > Although the ThumbEE registers and traps were present in earlier > versions of the v8 architecture, it was retrospectively removed and so > we can do the same. > > Cc: Marc Zyngier > Signed-off-by: Will Deacon > diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c > index b41607d270ac..6c35e49757d8 100644 > --- a/arch/arm64/kvm/sys_regs.c > +++ b/arch/arm64/kvm/sys_regs.c > @@ -539,13 +539,6 @@ static const struct sys_reg_desc sys_reg_descs[] = { > { Op0(0b10), Op1(0b000), CRn(0b0111), CRm(0b1110), Op2(0b110), > trap_dbgauthstatus_el1 }, > > - /* TEECR32_EL1 */ > - { Op0(0b10), Op1(0b010), CRn(0b), CRm(0b), Op2(0b000), > - NULL, reset_val, TEECR32_EL1, 0 }, > - /* TEEHBR32_EL1 */ > - { Op0(0b10), Op1(0b010), CRn(0b0001), CRm(0b), Op2(0b000), > - NULL, reset_val, TEEHBR32_EL1, 0 }, > - I guess this is a VM migration compatibility break between kernels without this patch and kernels with it? I think that's OK at this point, but it would be nice to mention it in the commit message. thanks -- PMM -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] [kvm-unit-tests PATCH 2/2] arm/arm64 config: Fix arch_clean rule
On 4 September 2015 at 15:05, Andrew Joneswrote: > But anyway I would suggest you do your 'make clean' or even a > 'make distclean' *before* running configure the second time. > Otherwise you'll leave old object files around from the previously > configured arch. I.e. you want 'make clean' to apply to the > currently built config, not the new (not yet built) config. Separate object directories is usually the best approach if you regularly want to build multiple configurations IMHO... thanks -- PMM -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 2/5] KVM: arm64: Implement vGICv3 distributor and redistributor access from userspace
On 2 September 2015 at 09:09, Pavel Fedinwrote: > The access is done similar to vGICv2, using KVM_DEV_ARM_VGIC_GRP_DIST_REGS > and KVM_DEV_ARM_VGIC_GRP_REDIST_REGS with KVM_SET_DEVICE_ATTR and > KVM_GET_DEVICE_ATTR ioctls. > > Some registers are 64-bit wide according to the specification. > KVM_DEV_ARM_VGIC_64BIT flag is introduced, allowing to perform full 64-bit > accesses. > > Signed-off-by: Pavel Fedin > --- > + KVM_DEV_ARM_VGIC_GRP_REDIST_REGS > + Attributes: > +The attr field of kvm_device_attr encodes two values: > +bits: | 63 | 62 .. 40 | 39 .. 32 | 31 0 | > +values: | size | reserved | cpu id | offset | We should avoid imposing an accidental limit on the maximum number of CPUs in the userspace API. GICv3 doesn't have a limit at 256 CPUs, so I think we should define an attr format which lets us specify a complete affinity specification. thanks -- PMM -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/3] KVM: arm64: Implement vGICv3 distributor and redistributor access from userspace
On 1 September 2015 at 14:52, Andre Przywarawrote: > Also the GIC specification says that everything must be accessible with > 32-bit accesses. Correct me if I am wrong on this, but vCPUs are not > supposed to run while you are getting/setting VGIC registers, right? So > there shouldn't be any issues with non-atomic accesses to 64-bit > registers, which means you could just go ahead and do everything in > 32-bit only. This would also help with supporting 32-bit userland and/or > kernel later. We should design the userspace API based on the natural size of the registers in the GICv3 spec, not on what happens to be convenient for the kernel to implement. There's only one kernel but there can be multiple userspace consumers of the API... I don't see any reason why a 32-bit userland wouldn't be able to handle 64-bit accesses via the KVM_SET/GET_DEVICE_ATTR ioctls, or am I missing something? thanks -- PMM -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/3] KVM: arm64: Implement accessors for vGIC CPU interface registers
On 30 August 2015 at 17:50, Christoffer Dall christoffer.d...@linaro.org wrote: I had imagined we would encode the GICv3 register accesses through the device API and not through the system register API, since I'm not crazy about polluting the general system register handling logic with GIC registers solely for the purposes of migration. There's an interesting design question lurking under this about the extent to which you expose the h/w design split between the CPU interface and the GIC proper as part of the KVM APIs. I'm inclined to agree that it's better to for our purposes treat both bits as just part of an irqchip device, but I haven't given it a great deal of thought. (Similarly in the QEMU emulated-GICv3 case you could also split the CPU i/f more formally, or not. The kernel's choice would have implications for which way QEMU ends up going, I think.) thanks -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] [PATCH 0/2] mips/kvm: Fixes for big endian MIPS64 hosts
On 9 July 2015 at 14:49, Paolo Bonzini pbonz...@redhat.com wrote: On 09/07/2015 10:13, Leon Alrae wrote: On 08/07/2015 16:22, Paolo Bonzini wrote: On 08/07/2015 17:03, James Hogan wrote: Hi Paolo, On 24/04/15 11:26, James Hogan wrote: A couple of small fixes for accessing 32-bit KVM registers on big endian, and to sign extend struct kvm_regs registers so as to work on MIPS64 hosts. James Hogan (2): mips/kvm: Fix Big endian 32-bit register access mips/kvm: Sign extend registers written to KVM Any chance of applying these in time for v2.4? I had assumed they'd go through Leon, but I can apply them for 2.4-rc1 as well. I thought these changes would get applied via kvm queue since they touch target-mips/kvm.c only. But if such target-specific kvm patches usually go via target-* tree then I'm happy to pick them up. It's the same---they can go in through any tree. In this case I've now applied them and will send them out next week. I've actually just applied them to master as buildfixes :-) thanks -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC PATCH] KVM: arm/arm64: Don't let userspace update CNTVOFF once guest is running
On 9 July 2015 at 15:17, Christoffer Dall christoffer.d...@linaro.org wrote: On Thu, Jul 09, 2015 at 02:24:06PM +0200, Christoffer Dall wrote: So I ran this through GDB, and this happens when the guest probes the virtio devices, specifically the backtrace tells me that virtio_current_cpu_endian () at /root/src/qemu/hw/virtio/virtio.c:594 = cc-virtio_is_big_endian - arm_cpu_is_big_endian - cpu_synchronize_state - kvm_cpu_synchronize_state which causes cpu-kvm_vcpu_dirty = true, which causes the run-loop to write the CNTVOFF on a per-vcpu basis without stopping anything, as far as I can tell. Ah, I was wondering if it was going to turn out to be this, but I hadn't figured out why it was going to cause us to do a write-back of state rather than just a read. So yeah, I guess the only required fix here is to fix QEMU in some way as to not fiddle with the timer registers in this way, and then I honestly don't know if we should try to fix legacy userspace in the kernel, but based on the feedback from Jan, I suppose not. arm_cpu_is_big_endian() doesn't actually want to write back any state -- all it's interested in is reading. So we really ought not to need to write anything back there. kvm-all.c's sync functions don't seem to provide a sync kernel state to userspace but I promise I'm not going to dirty it function though. I guess we could add one. (Overall it's kind of fragile even if we can avoid this specific case where we're writing back the counter state, though -- we should probably think about the sync level stuff as well.) -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH for 2.4 0/2] MIPS build fixes for v2.4
On 9 July 2015 at 12:52, Leon Alrae leon.al...@imgtec.com wrote: On 09/07/2015 10:17, James Hogan wrote: These two patches fix build errors for the MIPS TCG backend and MIPS KVM. Please could they be applied for v2.4. James Hogan (2): tcg/mips: Fix build error from merged memop+mmu_idx parameter mips/kvm: Sync with newer MIPS KVM headers target-mips/kvm.c | 15 ++- tcg/mips/tcg-target.c | 4 ++-- 2 files changed, 4 insertions(+), 15 deletions(-) Reviewed-by: Leon Alrae leon.al...@imgtec.com Peter, since these are build fixes, could they be squeezed into rc0? Applied to master, thanks. -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] [PATCH 0/2] mips/kvm: Fixes for big endian MIPS64 hosts
On 9 July 2015 at 14:58, Peter Maydell peter.mayd...@linaro.org wrote: On 9 July 2015 at 14:49, Paolo Bonzini pbonz...@redhat.com wrote: It's the same---they can go in through any tree. In this case I've now applied them and will send them out next week. I've actually just applied them to master as buildfixes :-) No, wait, I'm confusing this set with a different 2-patch set of MIPS fixes. Paolo, can you go ahead and take them through the kvm tree? thanks -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC PATCH] KVM: arm/arm64: Don't let userspace update CNTVOFF once guest is running
On 9 July 2015 at 11:22, Christoffer Dall christoffer.d...@linaro.org wrote: On Wed, Jul 08, 2015 at 08:13:59PM +0100, Peter Maydell wrote: I suspect Jan is right and we really need to distinguish the KVM_PUT_*_STATE levels in ARM QEMU. This probably implies some kind of whitelist/override mechanism, since by and large we neither know nor want to know the semantics for system registers, we leave that up to the kernel. Q: if you have a running VM, and you pause it for an hour, what should the CNTVCT register do? Presumably it should not advance, but how do we arrange for that to happen? I think the CNTVCT should not advance when the VM is not scheduled, so if we pause the VM or starve all the VCPUs for enough time, the guest should not see time progressing, since otherwise the guest scheduler cannot maintain fairness and you're bound to see spurious RCU stalls etc. That's exactly why a guest can read both a virtual and physical counter and it is an area where you simply want some level of paravirtualization. I haven't studied how/if Linux deals with this at all. So I think adjusting CNTVOFF should be managed by the kernel for the pause/starvation scenario (which I think Avi once told me x86 does too - does anyone know the current state of the art?). Agreed that we should be aiming for same behaviour as x86 rather than reinventing the wheel... So the only situation where I think userspace should adjust the CNTVOFF value is for migration where we are talking about a brand new VM with has_run_once clear. For incoming migration/loadvm the VM will be freshly *reset*, but it will not be completely created from scratch. It's possible for the user to run a VM for a bit, and then use loadvm from the monitor to load an old snapshot. This will do a qemu_system_reset() and then restore the state. So in that case has_run_once won't be clear. (Would it be OK to clear it when userspace does VCPU_INIT? What else does it control?) Thus, if we were designing this from scratch now, the API should be to return an error when trying to set KVM_REG_ARM_TIMER_CNT after the VM has run once, but it's too late for that as we would break userspace. The best alternative IMHO would be to merge Marc's patch and fix CNTVOFF in the kernel side as well, and finally also fix QEMU so that it doesn't try to do the thing that future kernels will ignore. Isn't Marc's patch causing us to return an error to userspace? (It does a 'return -1', at least -- does that get ignored at a higher level?) In principle, if userspace does: * stop all vCPUs * read the counter values from each vCPU * write the counter values back to each vCPU * resume vCPUs this shouldn't cause time to do funny things, should it? (ie the problem only occurs if we try to write the vCPU counter value while only one vCPU is stopped). thanks -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC PATCH] KVM: arm/arm64: Don't let userspace update CNTVOFF once guest is running
On 9 July 2015 at 13:05, Christoffer Dall christoffer.d...@linaro.org wrote: As I understand it, the problem is that if we ever run a VCPU after reading the value, and write back the value afterwards, you potentially make time go backwards and get inconsistent views of time from different VCPUs because they may have read the time before/after updating the CNTVOFF. Right, but I think if QEMU does that it's a bug (and more to the point I don't entirely understand why we would do that yet, even given that we don't have a distinction between registers to sync always and registers to sync only on reset...) -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC PATCH] KVM: arm/arm64: Don't let userspace update CNTVOFF once guest is running
On 8 July 2015 at 16:56, Marc Zyngier marc.zyng...@arm.com wrote: On 29/06/15 18:37, Peter Maydell wrote: On 29 June 2015 at 18:20, Claudio Fontana claudio.font...@huawei.com wrote: On 26.06.2015 06:49, Jan Kiszka wrote: QEMU has the concept of write-back levels: KVM_PUT_RUNTIME_STATE, KVM_PUT_RESET_STATE and KVM_PUT_FULL_STATE. I suspect this registers is just sorted into the wrong category, thus written as part of the RUNTIME_STATE. We had such bug patterns during the x86 maturing phase as well. It seems that QEMU target-arm ignores the level parameter to kvm_arch_put_registers completely. Is it intended? Yes, sort of. We don't in general know anything about the semantics of most of the system registers. It should always be safe to read them all out of the kernel and write them back... I'm not sure you can safely assume this for time related things, unless you can guarantee that all vcpus are stopped. Claudio is seeing time jumping in weird ways, and so have I, which would tend to show that QEMU is introducing some jitter. Maybe not easily observable on real hardware, but the FastModel is enough to show the issue. So unless someone has a better solution, I'm seriously considering getting this patch merged. I'd prefer it if somebody could investigate to see why QEMU is actually doing this -- so far we just have speculation. -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC PATCH] KVM: arm/arm64: Don't let userspace update CNTVOFF once guest is running
On 8 July 2015 at 17:37, Marc Zyngier marc.zyng...@arm.com wrote: On 08/07/15 17:06, Peter Maydell wrote: I'd prefer it if somebody could investigate to see why QEMU is actually doing this -- so far we just have speculation. I'd prefer that too, but so far people seem to be more comfortable waiting for the issue to fix itself. In the meantime, VMs are broken in weird and wonderful ways, and I don't think the current status-quo helps anyone. Putting in a patch which might not be the right fix isn't necessarily a good plan either... Does has_run_once get cleared if we do a re-VCPU_INIT of a CPU that's run before? (We need to allow rewriting of guest state at that point so that reset VM and load migration state behaves correctly.) I suspect Jan is right and we really need to distinguish the KVM_PUT_*_STATE levels in ARM QEMU. This probably implies some kind of whitelist/override mechanism, since by and large we neither know nor want to know the semantics for system registers, we leave that up to the kernel. Q: if you have a running VM, and you pause it for an hour, what should the CNTVCT register do? Presumably it should not advance, but how do we arrange for that to happen? -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] arm64/kvm: Add generic v8 KVM target
On 3 July 2015 at 09:08, Marc Zyngier marc.zyng...@arm.com wrote: On 02/07/15 21:29, Chalamarla, Tirumalesh wrote: is there a chance that this get merged in to 4.2? if not is it possible to accept the other patches like adding implementations explicitly(like Thunder). We first need to reach a conclusion on this. Until then, I don't plan to get anything in. As for 4.2, it feels way too late (the merge window is almost closed now). I would still like to see the proponents of this patch say what their model is for userspace support of cross-host migration, if we're abandoning the model the current API envisages. thanks -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] arm64/kvm: Add generic v8 KVM target
On 3 July 2015 at 09:28, Marc Zyngier marc.zyng...@arm.com wrote: On 03/07/15 09:12, Peter Maydell wrote: I would still like to see the proponents of this patch say what their model is for userspace support of cross-host migration, if we're abandoning the model the current API envisages. I thought we had discussed this above, and don't really see this as a departure from the current model: - -cpu host results in GENERIC being used: VM can only be migrated to the exact same HW (no cross-host migration). MIDR should probably become RO. - -cpu host results in A57 (for example): VM can be migrated to a variety of A57 platforms, and allow for some fuzzing on the revision (or accept any revision). - -cpu a57 forces an A57 model to be emulated, always. It is always possible to migrate such a VM on any host. I think only the first point is new, but the last two are what we have (or what we should have). Right, but the implicit idea of this GENERIC patch seems to be that new host CPU types don't get their own KVM_ARM_TARGET_* constant, and are thus forever unable to do cross-host migration. It's not clear to me why we'd want to have new CPUs be second class citizens like that. -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 05/10] KVM: arm/arm64: vgic: Relax vgic_can_sample_irq for edge IRQs
On 3 July 2015 at 10:50, Marc Zyngier marc.zyng...@arm.com wrote: On 02/07/15 17:23, Christoffer Dall wrote: If we had a different *shared* device than the timer which is edge-triggered, don't we then also need to capture the physical distributor's pending state along with the state of the device unless we assume that upon restoring the state for the device count on the device to have another rising/falling edge to trigger the interrupt again? (I assume the line would always go high for a level-triggered interrupt in this case). I'd definitely assume that restoring the state of the device would make it generate an interrupt. This has to be a property of the device, otherwise it is not really shareable between vcpus. FWIW, QEMU's modelling approach to this is to say that devices do *not* generate interrupts on restore. If the device had previously generated an interrupt then this should be captured by the state of the interrupt controller (or whatever else it is connected to) and dealt with when the GIC state is restored. If you say that restoring the device state is supposed to generate an interrupt, you introduce an ordering requirement that the state of the interrupt controller is restored first and the device second (otherwise the incoming GIC state will overwrite the interrupt that the device just generated), which isn't ideal (especially since QEMU makes no guarantees about restore order between devices). thanks -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC PATCH] KVM: arm/arm64: Don't let userspace update CNTVOFF once guest is running
On 29 June 2015 at 18:20, Claudio Fontana claudio.font...@huawei.com wrote: On 26.06.2015 06:49, Jan Kiszka wrote: QEMU has the concept of write-back levels: KVM_PUT_RUNTIME_STATE, KVM_PUT_RESET_STATE and KVM_PUT_FULL_STATE. I suspect this registers is just sorted into the wrong category, thus written as part of the RUNTIME_STATE. We had such bug patterns during the x86 maturing phase as well. It seems that QEMU target-arm ignores the level parameter to kvm_arch_put_registers completely. Is it intended? Yes, sort of. We don't in general know anything about the semantics of most of the system registers. It should always be safe to read them all out of the kernel and write them back... -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] arm64/kvm: Add generic v8 KVM target
On 29 June 2015 at 18:30, Marc Zyngier marc.zyng...@arm.com wrote: On 29/06/15 18:13, Chalamarla, Tirumalesh wrote: Will this also prevents migrating between same implementations, if no how is this identified. This shouldn't. It is pretty easy to look at the incoming guest's MIDR, and verify that it matches the default MIDR on the receiving system. Any difference, and the migration should abort. Do you really want to forbid migration between (say) Cortex-A57 r3p0 and Cortex-A57 r3p1 ? That seems pretty harsh. -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] arm64/kvm: Add generic v8 KVM target
On 25 June 2015 at 14:44, Marc Zyngier marc.zyng...@arm.com wrote: It should always be possible to emulate a known CPU on a generic host, and it should be able to migrate. The case we can't migrate is when we let the guest be generic (which I guess should really be unknown, and not generic). So if the user specify -cpu cortex-a57 on the command line, the guest should be able to migrate from an A72 to an A53. if the user specified -cpu host, the resulting guest won't be able to migrate. Does it make sense? Yes. We've always said -cpu host won't be cross-host migratable from a QEMU point of view. -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC PATCH] KVM: arm/arm64: Don't let userspace update CNTVOFF once guest is running
On 25 June 2015 at 09:59, Claudio Fontana claudio.font...@huawei.com wrote: Once the VM is created, I think QEMU should not request kvm to change the virtual offset of the VM anymore: maybe an unexpected consequence of QEMU's target-arm/kvm64.c::kvm_arch_put_registers ? Hmm. In general we assume that we can: * stop the VM * read all the guest system registers * write those values back again * restart the VM if we need to. Is that what's happening here, or are we doing something odder? -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] arm64/kvm: Add generic v8 KVM target
On 23 June 2015 at 15:03, Suzuki K. Poulose suzuki.poul...@arm.com wrote: On 23/06/15 13:39, Christoffer Dall wrote: On Mon, Jun 22, 2015 at 09:44:48AM +0100, Peter Maydell wrote: I'm guessing the intention is to avoid having to add code in the kernel to support KVM on a new CPU where nothing else needs to be done to support KVM on that system. Yes, thats the *only* motivation behind the patch I'm not convinced we really want to change the userspace ABI design principles just for convenience of internal kernel implementation. At the moment the approach is: (1) if you want a specific guest CPU, you ask for it; the kernel guarantees that if this succeeds, you can cross-migrate between this and another kernel that also lets you init that guest CPU [at the moment this implies no cross-host-type migration, of course] (2) if you don't care, you ask the kernel what its preferred CPU type is, and use it If we want to change away from this model (which has at least the advantages of being coherent and sort-of similar to how x86 does things), what's the model we're moving to? May be we can create a dummy set of values for the ID registers, which doesn't provide any 'special functionality' so that it is safe to be migrated across any host ? I suspect this problem is larger and requires a more complex solution than merely providing funky ID register values. If we can do it successfully that would be really handy, though... thanks -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] arm64/kvm: Add generic v8 KVM target
On 17 June 2015 at 10:00, Suzuki K. Poulose suzuki.poul...@arm.com wrote: From: Suzuki K. Poulose suzuki.poul...@arm.com This patch adds a generic ARM v8 KVM target cpu type for use by the new CPUs which eventualy ends up using the common sys_reg table. For backward compatibility the existing targets have been preserved. Any new target CPU that can be covered by generic v8 sys_reg tables should make use of the new generic target. How do you intend this to work for cross-host migration? Is the idea that the kernel guarantees that generic looks 100% the same to the guest regardless of host hardware? I'm not sure that can be made to work, given impdef differences in ID register values, bp/wp registers, and so on. Given that, it seems to me that we still need to provide KVM_ARM_TARGET_$THISCPU defines so userspace can request a specific guest CPU flavour; so what does this patch provide that isn't already provided by just having userspace query for the preferred CPU type as it does already? thanks -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in
Re: [PATCH 04/10] KVM: arm/arm64: vgic: Allow HW irq to be encoded in LR
On 17 June 2015 at 12:53, Eric Auger eric.au...@linaro.org wrote: shouldn't we test somewhere that the hwirq is between 16 and 1019. Not directly related, but that reminds me that I noticed the other day that we have VGIC_MAX_IRQS = 1024 (and use that as a guard on how many irqs we let userspace configure and ask us to deliver), but that doesn't account for the couple of magic numbers at the top of the range. I think that lets userspace cause us to do UNPREDICTABLE things to the GIC... -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v5 2/6] target-arm: kvm64: introduce kvm_arm_init_debug()
On 29 May 2015 at 16:19, Alex Bennée alex.ben...@linaro.org wrote: As we haven't always had guest debug support we need to probe for it. Additionally we don't do this in the start-up capability code so we don't fall over on old kernels. Signed-off-by: Alex Bennée alex.ben...@linaro.org --- Reviewed-by: Peter Maydell peter.mayd...@linaro.org thanks -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v5 3/6] target-arm: kvm - implement software breakpoints
On 29 May 2015 at 16:19, Alex Bennée alex.ben...@linaro.org wrote: These don't involve messing around with debug registers, just setting the breakpoint instruction in memory. GDB will not use this mechanism if it can't access the memory to write the breakpoint. All the kernel has to do is ensure the hypervisor traps the breakpoint exceptions and returns to userspace. Signed-off-by: Alex Bennée alex.ben...@linaro.org -- v2 - handle debug exit with new hsr exception info - add verbosity to UNIMP message v3 - sync with kvm_cpu_synchronize_state() before checking PC. - use internals.h defines - use env-pc - use proper format types --- target-arm/kvm.c | 88 1 file changed, 76 insertions(+), 12 deletions(-) diff --git a/target-arm/kvm.c b/target-arm/kvm.c index fdd9ba3..c3bad6f 100644 --- a/target-arm/kvm.c +++ b/target-arm/kvm.c @@ -510,9 +510,60 @@ void kvm_arch_post_run(CPUState *cs, struct kvm_run *run) { } +/* See v8 ARM ARM D7.2.27 ESR_ELx, Exception Syndrome Register + * + * To minimise translating between kernel and user-space the kernel + * ABI just provides user-space with the full exception syndrome + * register value to be decoded in QEMU. + */ + +static int kvm_handle_debug(CPUState *cs, struct kvm_run *run) +{ +struct kvm_debug_exit_arch *arch_info = run-debug.arch; +int hsr_ec = arch_info-hsr ARM_EL_EC_SHIFT; +ARMCPU *cpu = ARM_CPU(cs); +CPUARMState *env = cpu-env; + +/* Ensure PC is synchronised */ +kvm_cpu_synchronize_state(cs); + +switch (hsr_ec) { +case EC_AA64_BKPT: +if (kvm_find_sw_breakpoint(cs, env-pc)) { +return true; +} +break; +default: +error_report(%s: unhandled debug exit (%PRIx32, %PRIx64)\n, + __func__, arch_info-hsr, env-pc); +} + +/* If we don't handle this it could be it really is for the + guest to handle */ +qemu_log_mask(LOG_UNIMP, + %s: re-injecting exception not yet implemented + (0x%PRIx32, %PRIx64)\n, + __func__, hsr_ec, env-pc); + +return false; +} + int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run) { -return 0; +int ret = 0; + +switch (run-exit_reason) { +case KVM_EXIT_DEBUG: +if (kvm_handle_debug(cs, run)) { +ret = EXCP_DEBUG; +} /* otherwise return to guest */ +break; +default: +qemu_log_mask(LOG_UNIMP, %s: un-handled exit reason %d\n, + __func__, run-exit_reason); +break; +} +return ret; } bool kvm_arch_stop_on_emulation_error(CPUState *cs) @@ -537,14 +588,33 @@ int kvm_arch_on_sigbus(int code, void *addr) void kvm_arch_update_guest_debug(CPUState *cs, struct kvm_guest_debug *dbg) { -qemu_log_mask(LOG_UNIMP, %s: not implemented\n, __func__); +if (kvm_sw_breakpoints_active(cs)) { +dbg-control |= KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_USE_SW_BP; +} } -int kvm_arch_insert_sw_breakpoint(CPUState *cs, - struct kvm_sw_breakpoint *bp) +/* C6.6.29 BRK instruction */ +int kvm_arch_insert_sw_breakpoint(CPUState *cs, struct kvm_sw_breakpoint *bp) { -qemu_log_mask(LOG_UNIMP, %s: not implemented\n, __func__); -return -EINVAL; +static const uint32_t brk = 0xd420; #define, please, since you're using it here and in the remove fn. + +if (cpu_memory_rw_debug(cs, bp-pc, (uint8_t *)bp-saved_insn, 4, 0) || +cpu_memory_rw_debug(cs, bp-pc, (uint8_t *)brk, 4, 1)) { +return -EINVAL; +} +return 0; +} Shouldn't we be testing the does the kernel implement debug flag before we allow gdb to write in bp insns or mess with dbg-control ? -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v5 4/6] target-arm: kvm - support for single step
On 29 May 2015 at 16:19, Alex Bennée alex.ben...@linaro.org wrote: This adds support for single-step. There isn't much to do on the QEMU side as after we set-up the request for single step via the debug ioctl it is all handled within the kernel. Signed-off-by: Alex Bennée alex.ben...@linaro.org --- v2 - convert to using HSR_EC v3 - use internals.h definitions --- target-arm/kvm.c | 10 ++ 1 file changed, 10 insertions(+) diff --git a/target-arm/kvm.c b/target-arm/kvm.c index c3bad6f..de2865a 100644 --- a/target-arm/kvm.c +++ b/target-arm/kvm.c @@ -528,6 +528,13 @@ static int kvm_handle_debug(CPUState *cs, struct kvm_run *run) kvm_cpu_synchronize_state(cs); switch (hsr_ec) { +case EC_SOFTWARESTEP: +if (cs-singlestep_enabled) { +return true; +} else { +error_report(Came out of SINGLE STEP when not enabled); All-caps there is a bit odd. +} +break; case EC_AA64_BKPT: if (kvm_find_sw_breakpoint(cs, env-pc)) { return true; @@ -588,6 +595,9 @@ int kvm_arch_on_sigbus(int code, void *addr) void kvm_arch_update_guest_debug(CPUState *cs, struct kvm_guest_debug *dbg) { +if (cs-singlestep_enabled) { +dbg-control |= KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_SINGLESTEP; +} if (kvm_sw_breakpoints_active(cs)) { dbg-control |= KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_USE_SW_BP; } Again, is there a guard somewhere to prevent us trying to enable singlestep if the kernel doesn't support it? -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v5 5/6] target-arm: kvm - add support for HW assisted debug
On 29 May 2015 at 16:19, Alex Bennée alex.ben...@linaro.org wrote: This adds basic support for HW assisted debug. The ioctl interface to KVM allows us to pass an implementation defined number of break and watch point registers. When KVM_GUESTDBG_USE_HW_BP is specified these debug registers will be installed in place on the world switch into the guest. The hardware is actually capable of more advanced matching but it is unclear if this expressiveness is available via the gdbstub protocol. Signed-off-by: Alex Bennée alex.ben...@linaro.org --- v2 - correct setting of PMC/BAS/MASK - improved commentary - added helper function to check watchpoint in range - fix find/deletion of watchpoints v3 - use internals.h definitions --- target-arm/kvm.c | 35 +++--- target-arm/kvm64.c | 304 ++- target-arm/kvm_arm.h | 21 3 files changed, 338 insertions(+), 22 deletions(-) diff --git a/target-arm/kvm.c b/target-arm/kvm.c index de2865a..e1fccdd 100644 --- a/target-arm/kvm.c +++ b/target-arm/kvm.c @@ -17,6 +17,7 @@ #include qemu-common.h #include qemu/timer.h +#include qemu/error-report.h #include sysemu/sysemu.h #include sysemu/kvm.h #include kvm_arm.h @@ -540,6 +541,16 @@ static int kvm_handle_debug(CPUState *cs, struct kvm_run *run) return true; } break; +case EC_BREAKPOINT: +if (kvm_arm_find_hw_breakpoint(cs, env-pc)) { +return true; +} +break; +case EC_WATCHPOINT: +if (kvm_arm_find_hw_watchpoint(cs, arch_info-far)) { +return true; +} I assume the kernel is giving us the hardware behaviour of watchpoint traps happen before load/store execution, not after? (This is the correct thing from the POV of the rest of QEMU.) +break; default: error_report(%s: unhandled debug exit (%PRIx32, %PRIx64)\n, __func__, arch_info-hsr, env-pc); @@ -601,6 +612,10 @@ void kvm_arch_update_guest_debug(CPUState *cs, struct kvm_guest_debug *dbg) if (kvm_sw_breakpoints_active(cs)) { dbg-control |= KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_USE_SW_BP; } +if (kvm_hw_breakpoints_active(cs)) { +dbg-control |= KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_USE_HW_BP; +kvm_copy_hw_breakpoint_data(dbg-arch); +} } /* C6.6.29 BRK instruction */ @@ -627,26 +642,6 @@ int kvm_arch_remove_sw_breakpoint(CPUState *cs, struct kvm_sw_breakpoint *bp) return 0; } -int kvm_arch_insert_hw_breakpoint(target_ulong addr, - target_ulong len, int type) -{ -qemu_log_mask(LOG_UNIMP, %s: not implemented\n, __func__); -return -EINVAL; -} - -int kvm_arch_remove_hw_breakpoint(target_ulong addr, - target_ulong len, int type) -{ -qemu_log_mask(LOG_UNIMP, %s: not implemented\n, __func__); -return -EINVAL; -} - - -void kvm_arch_remove_all_hw_breakpoints(void) -{ -qemu_log_mask(LOG_UNIMP, %s: not implemented\n, __func__); -} - void kvm_arch_init_irq_routing(KVMState *s) { } diff --git a/target-arm/kvm64.c b/target-arm/kvm64.c index 61592d2..06d4e1e 100644 --- a/target-arm/kvm64.c +++ b/target-arm/kvm64.c @@ -2,6 +2,7 @@ * ARM implementation of KVM hooks, 64 bit specific code * * Copyright Mian-M. Hamayun 2013, Virtual Open Systems + * Copyright Alex Bennée 2014, Linaro Really 2014? :-) * * This work is licensed under the terms of the GNU GPL, version 2 or later. * See the COPYING file in the top-level directory. @@ -12,12 +13,18 @@ #include sys/types.h #include sys/ioctl.h #include sys/mman.h +#include sys/ptrace.h +#include asm/ptrace.h +#include linux/elf.h #include linux/kvm.h #include config-host.h #include qemu-common.h #include qemu/timer.h +#include qemu/host-utils.h +#include qemu/error-report.h +#include exec/gdbstub.h #include sysemu/sysemu.h #include sysemu/kvm.h #include kvm_arm.h @@ -26,21 +33,314 @@ #include hw/arm/arm.h static bool have_guest_debug; +/* Max and current break/watch point counts */ +int max_hw_bp, max_hw_wp; +int cur_hw_bp, cur_hw_wp; +struct kvm_guest_debug_arch guest_debug_registers; ...did we figure out how this works for SMP? /** - * kvm_arm_init_debug() + * kvm_arm_init_debug() - check for guest debug capabilities * @cs: CPUState * - * Check for guest debug capabilities. + * kvm_check_extension returns 0 if we have no debug registers or the + * number we have. * */ static void kvm_arm_init_debug(CPUState *cs) { have_guest_debug = kvm_check_extension(cs-kvm_state, KVM_CAP_SET_GUEST_DEBUG); +max_hw_wp = kvm_check_extension(cs-kvm_state, KVM_CAP_GUEST_DEBUG_HW_WPS); +max_hw_bp = kvm_check_extension(cs-kvm_state, KVM_CAP_GUEST_DEBUG_HW_BPS); return; }
Re: [PATCH v5 6/6] target-arm: kvm - re-inject guest debug exceptions
On 29 May 2015 at 16:19, Alex Bennée alex.ben...@linaro.org wrote: From: Alex Bennée a...@bennee.com If we can't find details for the debug exception in our debug state then we can assume the exception is due to debugging inside the guest. To inject the exception into the guest state we re-use the TCG exception code (do_interupt). However while guest debugging is in effect we currently can't handle the guest using single step which is heavily used by GDB. Is this just the kernel not supporting this, or would QEMU need to change as well? Worth mentioning either way. Signed-off-by: Alex Bennée alex.ben...@linaro.org --- v5: - new for v5 --- target-arm/cpu.h| 1 + target-arm/helper-a64.c | 17 ++--- target-arm/internals.h | 1 + target-arm/kvm.c| 30 ++ 4 files changed, 38 insertions(+), 11 deletions(-) diff --git a/target-arm/cpu.h b/target-arm/cpu.h index 083211c..95ae3a8 100644 --- a/target-arm/cpu.h +++ b/target-arm/cpu.h @@ -56,6 +56,7 @@ #define EXCP_SMC13 /* Secure Monitor Call */ #define EXCP_VIRQ 14 #define EXCP_VFIQ 15 +#define EXCP_WAPT 16 Why do we need a new EXCP value here? In hardware watchpoints are reported via a particular syndrome value and (for AArch32) as Data Aborts, so I would expect that we would just do the same. The code in this patch doesn't seem to treat EXCP_WAPT any differently from EXCP_DABT... #define ARMV7M_EXCP_RESET 1 #define ARMV7M_EXCP_NMI 2 diff --git a/target-arm/helper-a64.c b/target-arm/helper-a64.c index 861f6fa..32bd27d 100644 --- a/target-arm/helper-a64.c +++ b/target-arm/helper-a64.c @@ -25,6 +25,7 @@ #include qemu/bitops.h #include internals.h #include qemu/crc32c.h +#include sysemu/kvm.h #include zlib.h /* For crc32 */ /* C2.4.7 Multiply and divide */ @@ -478,10 +479,13 @@ void aarch64_cpu_do_interrupt(CPUState *cs) } arm_log_exception(cs-exception_index); -qemu_log_mask(CPU_LOG_INT, ...from EL%d\n, arm_current_el(env)); +qemu_log_mask(CPU_LOG_INT, ...from EL%d PC 0x% PRIx64 \n, + arm_current_el(env), env-pc); + if (qemu_loglevel_mask(CPU_LOG_INT) !excp_is_internal(cs-exception_index)) { -qemu_log_mask(CPU_LOG_INT, ...with ESR 0x% PRIx32 \n, +qemu_log_mask(CPU_LOG_INT, ...with ESR %x/0x% PRIx32 \n, + env-exception.syndrome ARM_EL_EC_SHIFT, env-exception.syndrome); } If you just want to improve our debug logging that should be a separate patch. @@ -494,6 +498,7 @@ void aarch64_cpu_do_interrupt(CPUState *cs) switch (cs-exception_index) { case EXCP_PREFETCH_ABORT: case EXCP_DATA_ABORT: +case EXCP_WAPT: env-cp15.far_el[new_el] = env-exception.vaddress; qemu_log_mask(CPU_LOG_INT, ...with FAR 0x% PRIx64 \n, env-cp15.far_el[new_el]); @@ -539,6 +544,12 @@ void aarch64_cpu_do_interrupt(CPUState *cs) aarch64_restore_sp(env, new_el); env-pc = addr; -cs-interrupt_request |= CPU_INTERRUPT_EXITTB; + +qemu_log_mask(CPU_LOG_INT, ...to EL%d PC 0x% PRIx64 PSTATE 0x%x\n, + new_el, env-pc, pstate_read(env)); + +if (!kvm_enabled()) { +cs-interrupt_request |= CPU_INTERRUPT_EXITTB; +} Do we need a similar change in the AArch32 do_interrupt code, to handle the case of debugging a 32-bit guest? } #endif diff --git a/target-arm/internals.h b/target-arm/internals.h index 2cc3017..10e8999 100644 --- a/target-arm/internals.h +++ b/target-arm/internals.h @@ -58,6 +58,7 @@ static const char * const excnames[] = { [EXCP_SMC] = Secure Monitor Call, [EXCP_VIRQ] = Virtual IRQ, [EXCP_VFIQ] = Virtual FIQ, +[EXCP_WAPT] = Watchpoint, }; static inline void arm_log_exception(int idx) diff --git a/target-arm/kvm.c b/target-arm/kvm.c index e1fccdd..6f608d8 100644 --- a/target-arm/kvm.c +++ b/target-arm/kvm.c @@ -523,9 +523,11 @@ static int kvm_handle_debug(CPUState *cs, struct kvm_run *run) struct kvm_debug_exit_arch *arch_info = run-debug.arch; int hsr_ec = arch_info-hsr ARM_EL_EC_SHIFT; ARMCPU *cpu = ARM_CPU(cs); +CPUClass *cc = CPU_GET_CLASS(cs); CPUARMState *env = cpu-env; +int forward_excp = EXCP_BKPT; -/* Ensure PC is synchronised */ +/* Ensure all state is synchronised */ kvm_cpu_synchronize_state(cs); Not sure the comment is providing any useful information now; it probably should either be more explanatory or just deleted. switch (hsr_ec) { @@ -533,7 +535,14 @@ static int kvm_handle_debug(CPUState *cs, struct kvm_run *run) if (cs-singlestep_enabled) { return true; } else { -error_report(Came out of SINGLE STEP when not enabled); +/* + * The kernel should have supressed the guests ability
Re: [PATCH v5 0/6] QEMU support for KVM Guest Debug on arm64
On 29 May 2015 at 16:19, Alex Bennée alex.ben...@linaro.org wrote: You may be wondering what happened to v3 and v4. They do exist but they didn't change much from the the original patches as I've been mostly looking the kernel side of the equation. So in summary the changes are: - updates to the kernel ABI - don't fall over on kernels without debug support - better logging, syncing and use of internals.h - debug exception re-injection for guest events* Some generic remarks (which we've talked about in irc): * does this correctly handle single step over emulated MMIO insns? how about single step over insns emulated in the kernel without trapping out to userspace? (eg some of the sysregs) kvm_skip_instr() doesn't seem to update PSTATE.SS... * the kernel currently does kvm_skip_instr() before the emulated MMIO exit, not afterwards. That feels conceptually the wrong way round -- are there any interesting corner cases we would get wrong currently but that naturally fall out in the wash if it's done afterwards? * what about debugging a 32-bit guest which uses the 32-bit ARM/Thumb bkpt insns? thanks -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] Announcing qboot, a minimal x86 firmware for QEMU
On 27 May 2015 at 10:30, Paolo Bonzini pbonz...@redhat.com wrote: For example, ARM's -M virt uses a pl011 block for the RTC, and also uses fw_cfg. Another commonly used ISA device is the UART, for which again -M virt uses a pl031. Partly we do that because there were a number of reports that trying to use virtio for the console didn't work reliably... Using the stock UART that is widely supported in UEFI/uboot/kernel was a conservative design choice. The next thing that's likely to appear in virt is a PL061 GPIO device, which you need for CPU hotplug and external-shutdown-request notifications. -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] Announcing qboot, a minimal x86 firmware for QEMU
On 22 May 2015 at 12:01, Daniel P. Berrange berra...@redhat.com wrote: On the QEMU side of things I wonder if there is scope for taking AArch64's 'virt' machine type concept and duplicating it on all architectures. Experience suggests that holding the line on minimal is really quite tricky, though -- there's always one more thing that somebody really wants to add... -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] Announcing qboot, a minimal x86 firmware for QEMU
On 22 May 2015 at 12:12, Daniel P. Berrange berra...@redhat.com wrote: Yep, it is hard saying no - but I'd think as long as it was possible to add the extra features using -device, it ought to be practical to keep a virt machine types -nodefaults -nodefconfig base setup pretty minimal. Mmm, but -device only works for pluggable devices really. We don't have a coherent mechanism for saying put the PS/2 keyboard controller into the system at its usual IO ports on the command line. -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4 03/12] KVM: arm64: guest debug, define API headers
On 15 May 2015 at 16:14, Alex Bennée alex.ben...@linaro.org wrote: Mark Rutland mark.rutl...@arm.com writes: On Fri, May 15, 2015 at 03:27:06PM +0100, Alex Bennée wrote: +/* + * See v8 ARM ARM D7.3: Debug Registers + * + * The control registers are architecturally defined as 32 bits but are + * stored as 64 bit values alongside the value registers. This is done Stale comment? They're stored as __u32 below. Gah yes it is. It's possible that the registers could grow in future as happened in the case of CLIDR_EL1, so it might be worth treating system registers generally as u64 values. Really? I mean the existing debug *control* registers have reserved bits 24-31 so there is space for expansion. Other places in the userspace ABI which deal with sysregs (notably ONE_REG) consistently define them all as 64-bit (which makes sense anyway since the ISA only provides 64-bit accessors to them). Architecturally 32 bits only means top 32 bits reserved. -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 06/10] KVM: arm64: guest debug, add SW break point support
On 28 April 2015 at 09:42, Alex Bennée alex.ben...@linaro.org wrote: Peter Maydell peter.mayd...@linaro.org writes: Does the kernel already have a conveniently implemented inject exception into guest lump of code? If so it might be less effort to do it that way round, maybe. So you pointed out we can't just re-inject the exceptions we get as we need to map from things like ESR_ELx_EC_WATCHPT_LOW to ESR_ELx_EC_WATCHPT_CUR before re-injection. Of course if it is as simple as modifying the ESR_EL1 register and returning +ve in the handle_exit path then I can do that but I assumed if any other wrangling needs doing it should be done in userspace. Well, somebody's got to do it, and it's the same amount of work either way (fiddling with ESR, making sure we direct the guest to the right exception vector entry point, maybe a few other things). -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 06/10] KVM: arm64: guest debug, add SW break point support
On 27 April 2015 at 21:04, Christoffer Dall christoffer.d...@linaro.org wrote: On Thu, Apr 23, 2015 at 03:26:53PM +0100, Alex Bennée wrote: Christoffer Dall christoffer.d...@linaro.org writes: On Tue, Mar 31, 2015 at 04:08:04PM +0100, Alex Bennée wrote: + * just need to report the PC and the HSR values to userspace. + * Userspace may decide to re-inject the exception and deliver it to + * the guest if it wasn't for the host to deal with. now I'm confused - does userspace setup the guest to receive an exception or does it tell KVM to emulate an exception for the guest or do we execute the breakpoint without trapping the debug exception? I've made it all go through userspace as we may have to translate the hypervisor visible exception code to what the guest was expecting to see. ok, so I think you should re-phrase something like: Userspace may decide that this exception is caused by the guest using debugging itself, and may in that case emulate the guest debug exception in userspace before resuming KVM. But does that really work? Given that we don't support KVM-TCG migration, this sounds a little strange. Did we test it? The QEMU patches have a TODO note at the point where you'd want to do this... Design-wise you can do the reinjection in the kernel or in userspace (ppc QEMU does this in userspace, for instance) because it's pretty much just setting registers to fake up the exception-entry into EL1. Code-wise QEMU's ARM code isn't set up to do it right now, but it shouldn't be too difficult to persuade the TCG exception-entry code to work for this case too. Does the kernel already have a conveniently implemented inject exception into guest lump of code? If so it might be less effort to do it that way round, maybe. -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 3/4] target-arm: kvm - support for single step
On 21 April 2015 at 13:56, Alex Bennée alex.ben...@linaro.org wrote: Peter Maydell peter.mayd...@linaro.org writes: switch (hsr_ec) { +case HSR_EC_SOFT_STEP: +if (cs-singlestep_enabled) { +return true; +} else { +error_report(Came out of SINGLE STEP when not enabled); This can only happen if there's a kernel bug, right? Sure. Should we report it differently? abort() out? Saying 'This would be a kernel bug' in a comment would do... -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 4/4] target-arm: kvm - add support for HW assisted debug
On 31 March 2015 at 16:40, Alex Bennée alex.ben...@linaro.org wrote: From: Alex Bennée a...@bennee.com This adds basic support for HW assisted debug. The ioctl interface to KVM allows us to pass an implementation defined number of break and watch point registers. When KVM_GUESTDBG_USE_HW_BP is specified these debug registers will be installed in place on the world switch into the guest. The hardware is actually capable of more advanced matching but it is unclear if this expressiveness is available via the gdbstub protocol. Signed-off-by: Alex Bennée alex.ben...@linaro.org --- v2 - correct setting of PMC/BAS/MASK - improved commentary - added helper function to check watchpoint in range - fix find/deletion of watchpoints diff --git a/target-arm/kvm.c b/target-arm/kvm.c index ae0f8b2..d1adf5f 100644 --- a/target-arm/kvm.c +++ b/target-arm/kvm.c @@ -17,6 +17,7 @@ #include qemu-common.h #include qemu/timer.h +#include qemu/error-report.h #include sysemu/sysemu.h #include sysemu/kvm.h #include kvm_arm.h @@ -476,6 +477,8 @@ void kvm_arch_post_run(CPUState *cs, struct kvm_run *run) #define HSR_EC_SHIFT26 #define HSR_EC_SOFT_STEP0x32 +#define HSR_EC_HW_BKPT 0x30 +#define HSR_EC_HW_WATCH 0x34 #define HSR_EC_SW_BKPT 0x3c static int kvm_handle_debug(CPUState *cs, struct kvm_run *run) @@ -496,6 +499,16 @@ static int kvm_handle_debug(CPUState *cs, struct kvm_run *run) return true; } break; +case HSR_EC_HW_BKPT: +if (kvm_arm_find_hw_breakpoint(cs, arch_info-pc)) { +return true; +} +break; +case HSR_EC_HW_WATCH: +if (kvm_arm_find_hw_watchpoint(cs, arch_info-far)) { +return true; +} +break; default: error_report(%s: unhandled debug exit (%x, %llx)\n, __func__, arch_info-hsr, arch_info-pc); @@ -556,6 +569,10 @@ void kvm_arch_update_guest_debug(CPUState *cs, struct kvm_guest_debug *dbg) if (kvm_sw_breakpoints_active(cs)) { dbg-control |= KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_USE_SW_BP; } +if (kvm_hw_breakpoints_active(cs)) { +dbg-control |= KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_USE_HW_BP; +kvm_copy_hw_breakpoint_data(dbg-arch); +} } /* C6.6.29 BRK instruction */ @@ -582,26 +599,6 @@ int kvm_arch_remove_sw_breakpoint(CPUState *cs, struct kvm_sw_breakpoint *bp) return 0; } -int kvm_arch_insert_hw_breakpoint(target_ulong addr, - target_ulong len, int type) -{ -qemu_log_mask(LOG_UNIMP, %s: not implemented\n, __func__); -return -EINVAL; -} - -int kvm_arch_remove_hw_breakpoint(target_ulong addr, - target_ulong len, int type) -{ -qemu_log_mask(LOG_UNIMP, %s: not implemented\n, __func__); -return -EINVAL; -} - - -void kvm_arch_remove_all_hw_breakpoints(void) -{ -qemu_log_mask(LOG_UNIMP, %s: not implemented\n, __func__); -} - void kvm_arch_init_irq_routing(KVMState *s) { } diff --git a/target-arm/kvm64.c b/target-arm/kvm64.c index 8cf3a62..dbe81a7 100644 --- a/target-arm/kvm64.c +++ b/target-arm/kvm64.c @@ -2,6 +2,7 @@ * ARM implementation of KVM hooks, 64 bit specific code * * Copyright Mian-M. Hamayun 2013, Virtual Open Systems + * Copyright Alex Bennée 2014, Linaro * * This work is licensed under the terms of the GNU GPL, version 2 or later. * See the COPYING file in the top-level directory. @@ -12,11 +13,17 @@ #include sys/types.h #include sys/ioctl.h #include sys/mman.h +#include sys/ptrace.h +#include asm/ptrace.h We really need the asm/ include ? +#include linux/elf.h #include linux/kvm.h #include qemu-common.h #include qemu/timer.h +#include qemu/host-utils.h +#include qemu/error-report.h +#include exec/gdbstub.h #include sysemu/sysemu.h #include sysemu/kvm.h #include kvm_arm.h @@ -24,6 +31,312 @@ #include internals.h #include hw/arm/arm.h +/* Max and current break/watch point counts */ +int max_hw_bp, max_hw_wp; +int cur_hw_bp, cur_hw_wp; +struct kvm_guest_debug_arch guest_debug_registers; How does this work in an SMP guest? + +/** + * kvm_arm_init_debug() + * @cs: CPUState + * + * kvm_check_extension returns 0 if we have no debug registers or the + * number we have. + * + */ +static void kvm_arm_init_debug(CPUState *cs) +{ +max_hw_wp = kvm_check_extension(cs-kvm_state, KVM_CAP_GUEST_DEBUG_HW_WPS); +max_hw_bp = kvm_check_extension(cs-kvm_state, KVM_CAP_GUEST_DEBUG_HW_BPS); +return; +} + +/** + * insert_hw_breakpoint() + * @addr: address of breakpoint + * + * See ARM ARM D2.9.1 for details but here we are only going to create + * simple un-linked breakpoints (i.e. we don't chain breakpoints + * together to match address and context or vmid). The hardware is + * capable of fancier
Re: [PATCH v2 2/4] target-arm: kvm - implement software breakpoints
On 31 March 2015 at 16:40, Alex Bennée alex.ben...@linaro.org wrote: These don't involve messing around with debug registers, just setting the breakpoint instruction in memory. GDB will not use this mechanism if it can't access the memory to write the breakpoint. All the kernel has to do is ensure the hypervisor traps the breakpoint exceptions and returns to userspace. Signed-off-by: Alex Bennée alex.ben...@linaro.org -- v2 - handle debug exit with new hsr exception info - add verbosity to UNIMP message diff --git a/target-arm/kvm.c b/target-arm/kvm.c index 72c1fa1..290c1fe 100644 --- a/target-arm/kvm.c +++ b/target-arm/kvm.c @@ -25,6 +25,7 @@ #include hw/arm/arm.h const KVMCapabilityInfo kvm_arch_required_capabilities[] = { +KVM_CAP_INFO(SET_GUEST_DEBUG), KVM_CAP_LAST_INFO }; Doesn't this mean we'll suddenly stop working on older kernels that didn't implement this capability? It would be nicer to merely disable the breakpoint/debug support, rather than refuse to run at all... @@ -466,9 +467,57 @@ void kvm_arch_post_run(CPUState *cs, struct kvm_run *run) { } +/* See ARM ARM D7.2.27 ESR_ELx, Exception Syndrome Register You should probably say 'v8 ARM ARM'. +** +** To minimise translating between kernel and user-space the kernel +** ABI just provides user-space with the full exception syndrome +** register value to be decoded in QEMU. What's with the weird '**' comment format? +*/ + +#define HSR_EC_SHIFT26 +#define HSR_EC_SW_BKPT 0x3c + +static int kvm_handle_debug(CPUState *cs, struct kvm_run *run) +{ +struct kvm_debug_exit_arch *arch_info = run-debug.arch; +int hsr_ec = arch_info-hsr HSR_EC_SHIFT; + +switch (hsr_ec) { +case HSR_EC_SW_BKPT: +if (kvm_find_sw_breakpoint(cs, arch_info-pc)) { +return true; +} +break; +default: +error_report(%s: unhandled debug exit (%x, %llx)\n, + __func__, arch_info-hsr, arch_info-pc); Is this intended to be a can't happen case, or is it something a guest can trigger? +} + +/* If we don't handle this it could be it really is for the + guest to handle */ (suboptimal multiline comment format) +qemu_log_mask(LOG_UNIMP, + %s: re-injecting exception not yet implemented (0x%x, %llx)\n, + __func__, hsr_ec, arch_info-pc); When does this happen? Guest userspace program hits a hardcoded breakpoint insn while we're trying to debug the VM? If we just return to the guest in that situation will we try to re-execute the bkpt insn and loop round taking exceptions forever? + +return false; +} + int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run) { -return 0; +int ret = 0; + +switch (run-exit_reason) { +case KVM_EXIT_DEBUG: +if (kvm_handle_debug(cs, run)) { +ret = EXCP_DEBUG; +} /* otherwise return to guest */ +break; +default: +qemu_log_mask(LOG_UNIMP, %s: un-handled exit reason %d\n, + __func__, run-exit_reason); +break; +} +return ret; } bool kvm_arch_stop_on_emulation_error(CPUState *cs) @@ -493,14 +542,33 @@ int kvm_arch_on_sigbus(int code, void *addr) void kvm_arch_update_guest_debug(CPUState *cs, struct kvm_guest_debug *dbg) { -qemu_log_mask(LOG_UNIMP, %s: not implemented\n, __func__); +if (kvm_sw_breakpoints_active(cs)) { +dbg-control |= KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_USE_SW_BP; +} } -int kvm_arch_insert_sw_breakpoint(CPUState *cs, - struct kvm_sw_breakpoint *bp) +/* C6.6.29 BRK instruction */ This comment would be better placed next to the magic number it's explaining. +int kvm_arch_insert_sw_breakpoint(CPUState *cs, struct kvm_sw_breakpoint *bp) { -qemu_log_mask(LOG_UNIMP, %s: not implemented\n, __func__); -return -EINVAL; +static const uint32_t brk = 0xd420; + +if (cpu_memory_rw_debug(cs, bp-pc, (uint8_t *)bp-saved_insn, 4, 0) || +cpu_memory_rw_debug(cs, bp-pc, (uint8_t *)brk, 4, 1)) { Does this work correctly for big-endian hosts? +return -EINVAL; +} +return 0; +} + +int kvm_arch_remove_sw_breakpoint(CPUState *cs, struct kvm_sw_breakpoint *bp) +{ +static uint32_t brk; + +if (cpu_memory_rw_debug(cs, bp-pc, (uint8_t *)brk, 4, 0) || +brk != 0xd420 || If you're going to use this magic number twice please name it... +cpu_memory_rw_debug(cs, bp-pc, (uint8_t *)bp-saved_insn, 4, 1)) { +return -EINVAL; +} +return 0; } int kvm_arch_insert_hw_breakpoint(target_ulong addr, @@ -517,12 +585,6 @@ int kvm_arch_remove_hw_breakpoint(target_ulong addr, return -EINVAL; } -int kvm_arch_remove_sw_breakpoint(CPUState *cs, - struct kvm_sw_breakpoint *bp) -{ -
Re: [PATCH v2 3/4] target-arm: kvm - support for single step
On 31 March 2015 at 16:40, Alex Bennée alex.ben...@linaro.org wrote: This adds support for single-step. There isn't much to do on the QEMU side as after we set-up the request for single step via the debug ioctl it is all handled within the kernel. Signed-off-by: Alex Bennée alex.ben...@linaro.org --- v2 - convert to using HSR_EC diff --git a/target-arm/kvm.c b/target-arm/kvm.c index 290c1fe..ae0f8b2 100644 --- a/target-arm/kvm.c +++ b/target-arm/kvm.c @@ -475,6 +475,7 @@ void kvm_arch_post_run(CPUState *cs, struct kvm_run *run) */ #define HSR_EC_SHIFT26 +#define HSR_EC_SOFT_STEP0x32 #define HSR_EC_SW_BKPT 0x3c We already include internals.h in this file, so can you just use the EC_* constants and ARM_EL_EC_SHIFT rather than defining new ones? (Applies for patch 1 as well.) static int kvm_handle_debug(CPUState *cs, struct kvm_run *run) @@ -483,6 +484,13 @@ static int kvm_handle_debug(CPUState *cs, struct kvm_run *run) int hsr_ec = arch_info-hsr HSR_EC_SHIFT; switch (hsr_ec) { +case HSR_EC_SOFT_STEP: +if (cs-singlestep_enabled) { +return true; +} else { +error_report(Came out of SINGLE STEP when not enabled); This can only happen if there's a kernel bug, right? +} +break; case HSR_EC_SW_BKPT: if (kvm_find_sw_breakpoint(cs, arch_info-pc)) { return true; @@ -542,6 +550,9 @@ int kvm_arch_on_sigbus(int code, void *addr) void kvm_arch_update_guest_debug(CPUState *cs, struct kvm_guest_debug *dbg) { +if (cs-singlestep_enabled) { +dbg-control |= KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_SINGLESTEP; +} if (kvm_sw_breakpoints_active(cs)) { dbg-control |= KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_USE_SW_BP; } -- 2.3.4 thanks -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] kvm: add support for memory transaction attributes
On 8 April 2015 at 13:28, Paolo Bonzini pbonz...@redhat.com wrote: Let kvm_arch_post_run convert fields in the kvm_run struct to MemTxAttrs. These are then passed to address_space_rw. Signed-off-by: Paolo Bonzini pbonz...@redhat.com Reviewed-by: Peter Maydell peter.mayd...@linaro.org At some point it might be nice to support returning the transaction-failed response to the kernel, but that would be a KVM API enhancement, so definitely something for another day. thanks -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 03/10] KVM: arm: guest debug, define API headers
On 1 April 2015 at 17:01, Alex Bennée alex.ben...@linaro.org wrote: David Hildenbrand d...@linux.vnet.ibm.com writes: +/* + * See ARM ARM D7.3: Debug Registers Maybe drop one ARM Well technically it is the ARM Architecture Reference Manual but that's quite long winded. Dropping one ARM would be actively wrong, so don't do that. You probably want to say 'v8 ARM ARM, assuming you mean that. -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v6 2/5] target-arm: kvm: save/restore mp state
On 1 April 2015 at 16:39, Alex Bennée alex.ben...@linaro.org wrote: --- a/target-arm/kvm.c +++ b/target-arm/kvm.c @@ -28,6 +28,8 @@ const KVMCapabilityInfo kvm_arch_required_capabilities[] = { KVM_CAP_LAST_INFO }; +static bool cap_has_mp_state = false; This explicit init to false is unnecessary, and checkpatch complains about it. I will fix this as I apply to my tree. -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v5 3/6] hw/intc: arm_gic_kvm.c restore config first
On 23 March 2015 at 17:05, Alex Bennée alex.ben...@linaro.org wrote: As there is logic to deal with the difference between edge and level triggered interrupts in the kernel we must ensure it knows the configuration of the IRQs before we restore the pending state. Signed-off-by: Alex Bennée alex.ben...@linaro.org Acked-by: Christoffer Dall christoffer.d...@linaro.org diff --git a/hw/intc/arm_gic_kvm.c b/hw/intc/arm_gic_kvm.c index 0d20750..e2512f1 100644 --- a/hw/intc/arm_gic_kvm.c +++ b/hw/intc/arm_gic_kvm.c @@ -370,6 +370,11 @@ static void kvm_arm_gic_put(GICState *s) * the appropriate CPU interfaces in the kernel) */ kvm_dist_put(s, 0x800, 8, s-num_irq, translate_targets); +/* irq_state[n].trigger - GICD_ICFGRn + * (restore targets before pending IRQs so we treat level/edge + * correctly */ +kvm_dist_put(s, 0xc00, 2, s-num_irq, translate_trigger); You don't seem to have acted on Christoffer's query in v4 about this comment... -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v5 4/6] target-arm: kvm64 sync FP register state
On 23 March 2015 at 17:05, Alex Bennée alex.ben...@linaro.org wrote: +/* Advanced SIMD and FP registers + * We map Qn = regs[2n+1]:regs[2n] + */ +for (i = 0; i 32; i++) { +int rd = i 1; +float128 fp_val = make_float128(env-vfp.regs[rd + 1], +env-vfp.regs[rd]); float128 is part of softfloat; it doesn't really belong here... -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v5 6/6] target-arm: cpu.h document why env-spsr exists
On 23 March 2015 at 17:05, Alex Bennée alex.ben...@linaro.org wrote: I was getting very confused about the duplication of state so wanted to make it explicit. Signed-off-by: Alex Bennée alex.ben...@linaro.org diff --git a/target-arm/cpu.h b/target-arm/cpu.h index 083211c..6dc1799 100644 --- a/target-arm/cpu.h +++ b/target-arm/cpu.h @@ -155,6 +155,11 @@ typedef struct CPUARMState { This contains all the other bits. Use cpsr_{read,write} to access the whole CPSR. */ uint32_t uncached_cpsr; +/* The spsr is a alias for spsr_elN where N is the current + * exception level. This comment is still wrong. It is provided for here so the TCG msr/mrs + * implementation can access one register. ...and this is wrong too. Care needs to be taken + * to ensure the banked_spsr[] is also updated. + */ uint32_t spsr; /* Banked registers. */ I think it would be better to just drop this patch, really... -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v5 5/6] target-arm: kvm64 fix save/restore of SPSR regs
On 23 March 2015 at 17:05, Alex Bennée alex.ben...@linaro.org wrote: The current code was negatively indexing the cpu state array and not synchronizing banked spsr register state with the current mode's spsr state, causing occasional failures with migration. Some munging is done to take care of the aarch64 mapping and also to ensure the most current value of the spsr is updated to the banked registers (relevant for KVM-TCG migration). Signed-off-by: Alex Bennée alex.ben...@linaro.org --- v2 (ajb) - minor tweaks and clarifications v3 - Use the correct bank index function for setting/getting env-spsr - only deal with spsrs in elevated exception levels v4 - try and make commentary clearer - ensure env-banked_spsr[0] = env-spsr before we sync v5 - fix banking index now banking fixed - keep wide spacing on [ ] forms - claimed authorship diff --git a/target-arm/kvm64.c b/target-arm/kvm64.c index 857e970..5270fa7 100644 --- a/target-arm/kvm64.c +++ b/target-arm/kvm64.c @@ -139,6 +139,7 @@ int kvm_arch_put_registers(CPUState *cs, int level) uint64_t val; int i; int ret; +unsigned int el; ARMCPU *cpu = ARM_CPU(cs); CPUARMState *env = cpu-env; @@ -205,9 +206,24 @@ int kvm_arch_put_registers(CPUState *cs, int level) return ret; } +/* Saved Program State Registers + * + * Before we restore from the banked_spsr[] array we need to + * ensure that any modifications to env-spsr are correctly + * reflected in the banks. + */ +el = arm_current_el(env); +if (el 0) { +i = is_a64(env) ? +aarch64_banked_spsr_index(el) : +bank_number(env-uncached_cpsr CPSR_M); +env-banked_spsr[i] = env-spsr; This doesn't look right. If we're currently A64 then env-spsr is unused and this code will trash one of the banked_spsr[] fields. +} + +/* KVM 0-4 map to QEMU banks 1-5 */ for (i = 0; i KVM_NR_SPSR; i++) { reg.id = AARCH64_CORE_REG(spsr[i]); -reg.addr = (uintptr_t) env-banked_spsr[i - 1]; +reg.addr = (uintptr_t) env-banked_spsr[i + 1]; ret = kvm_vcpu_ioctl(cs, KVM_SET_ONE_REG, reg); if (ret) { return ret; @@ -253,11 +269,13 @@ int kvm_arch_put_registers(CPUState *cs, int level) return ret; } + Stray extra space. -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v5 2/6] target-arm: kvm: save/restore mp state
On 23 March 2015 at 17:05, Alex Bennée alex.ben...@linaro.org wrote: This adds the saving and restore of the current Multi-Processing state of the machine. While the KVM_GET/SET_MP_STATE API exposes a number of potential states for x86 we only use two for ARM. Either the process is running or not. We then save this state into the cpu_powered TCG state to avoid changing the serialisation format. Signed-off-by: Alex Bennée alex.ben...@linaro.org --- v2 - make mpstate field runtime dependant (kvm_enabled()) - drop initial KVM_CAP_MP_STATE requirement - re-use cpu_powered instead of new field v4 - s/HALTED/STOPPED/ - move code from machine.c to kvm. diff --git a/target-arm/kvm.c b/target-arm/kvm.c index 72c1fa1..a74832c 100644 --- a/target-arm/kvm.c +++ b/target-arm/kvm.c @@ -458,6 +458,46 @@ void kvm_arm_reset_vcpu(ARMCPU *cpu) } } +/* + * Update KVM's MP_STATE based on what QEMU thinks it is + */ +int kvm_arm_sync_mpstate_to_kvm(ARMCPU *cpu) +{ +if (kvm_check_extension(CPU(cpu)-kvm_state, KVM_CAP_MP_STATE)) { Doing the ioctl to check the extension every time isn't great. Look at the way ppc and s390 do a cap check at init time and cache the answer. +struct kvm_mp_state mp_state = { +.mp_state = +cpu-powered_off ? KVM_MP_STATE_STOPPED : KVM_MP_STATE_RUNNABLE +}; +int ret = kvm_vcpu_ioctl(CPU(cpu), KVM_SET_MP_STATE, mp_state); +if (ret) { +fprintf(stderr, %s: failed to set MP_STATE %d/%s\n, +__func__, ret, strerror(ret)); +return -1; +} +} + +return 0; +} + +/* + * Sync the KVM MP_STATE into QEMU + */ +int kvm_arm_sync_mpstate_to_qemu(ARMCPU *cpu) +{ +if (kvm_check_extension(CPU(cpu)-kvm_state, KVM_CAP_MP_STATE)) { +struct kvm_mp_state mp_state; +int ret = kvm_vcpu_ioctl(CPU(cpu), KVM_GET_MP_STATE, mp_state); +if (ret) { +fprintf(stderr, %s: failed to get MP_STATE %d/%s\n, +__func__, ret, strerror(ret)); +abort(); +} +cpu-powered_off = (mp_state.mp_state == KVM_MP_STATE_STOPPED); +} + +return 0; +} + void kvm_arch_pre_run(CPUState *cs, struct kvm_run *run) { } diff --git a/target-arm/kvm32.c b/target-arm/kvm32.c index 94030d1..49b6bab 100644 --- a/target-arm/kvm32.c +++ b/target-arm/kvm32.c @@ -356,6 +356,8 @@ int kvm_arch_put_registers(CPUState *cs, int level) return EINVAL; } +kvm_arm_sync_mpstate_to_kvm(cpu); + return ret; } @@ -427,5 +429,7 @@ int kvm_arch_get_registers(CPUState *cs) */ write_list_to_cpustate(cpu); +kvm_arm_sync_mpstate_to_qemu(cpu); + return 0; } diff --git a/target-arm/kvm64.c b/target-arm/kvm64.c index 8cf3a62..fed03f2 100644 --- a/target-arm/kvm64.c +++ b/target-arm/kvm64.c @@ -211,6 +211,8 @@ int kvm_arch_put_registers(CPUState *cs, int level) return EINVAL; } +kvm_arm_sync_mpstate_to_kvm(cpu); + /* TODO: * FP state */ @@ -310,6 +312,8 @@ int kvm_arch_get_registers(CPUState *cs) */ write_list_to_cpustate(cpu); +kvm_arm_sync_mpstate_to_qemu(cpu); + /* TODO: other registers */ return ret; } diff --git a/target-arm/kvm_arm.h b/target-arm/kvm_arm.h index 455dea3..7b75758 100644 --- a/target-arm/kvm_arm.h +++ b/target-arm/kvm_arm.h @@ -162,6 +162,24 @@ typedef struct ARMHostCPUClass { */ bool kvm_arm_get_host_cpu_features(ARMHostCPUClass *ahcc); + +/** + * kvm_arm_sync_mpstate_to_kvm + * @cpu: ARMCPU + * + * If supported set the KVM MP_STATE based on QEMUs migration data. Apostrophes (again!). And the reference to migration data here is spurious. + */ +int kvm_arm_sync_mpstate_to_kvm(ARMCPU *cpu); + +/** + * kvm_arm_sync_mpstate_to_qemu + * @cpu: ARMCPU + * + * If supported get the MP_STATE from KVM and store in QEMUs migration + * data. + */ +int kvm_arm_sync_mpstate_to_qemu(ARMCPU *cpu); + #endif #endif -- 2.3.2 -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] [PATCH v5 1/6] target-arm: Store SPSR_EL1 state in banked_spsr[1] (SPSR_svc)
On 24 March 2015 at 14:32, Greg Bellows greg.bell...@linaro.org wrote: On Mon, Mar 23, 2015 at 12:05 PM, Alex Bennée alex.ben...@linaro.org wrote: From: Peter Maydell peter.mayd...@linaro.org @@ -523,7 +523,7 @@ void aarch64_cpu_do_interrupt(CPUState *cs) aarch64_save_sp(env, arm_current_el(env)); env-elr_el[new_el] = env-pc; } else { -env-banked_spsr[0] = cpsr_read(env); +env-banked_spsr[aarch64_banked_spsr_index(new_el)] = cpsr_read(env); Are the other banks (2-5) only used for KVM? It seems we go out of our way to manage this larger SPSR array then not use all of the slots in QEMU itself. They're used in AArch32 (where they are the SPSR for various 32 bit modes). In AArch64 you can access those registers via MSR/MRS (we probably haven't implemented those yet because they are only accessible at EL2 and above) so hypervisors can do worldswitches. But for exception entry and return (which is what this code is) we only use SPSR_EL0/SPSR_EL1/SPSR_EL2/SPSR_EL3 which is a subset of the AArch32 SPSRs. -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4 3/5] target-arm: kvm64 sync FP register state
On 16 March 2015 at 11:01, Alex Bennée alex.ben...@linaro.org wrote: For migration to work we need to sync all of the register state. This is especially noticeable when GCC starts using FP registers as spill registers even with integer programs. Signed-off-by: Alex Bennée alex.ben...@linaro.org --- v4: - fixed merge conflicts - rm superfluous reg.id++ diff --git a/target-arm/kvm64.c b/target-arm/kvm64.c index fed03f2..8fd0c8d 100644 --- a/target-arm/kvm64.c +++ b/target-arm/kvm64.c @@ -126,9 +126,17 @@ bool kvm_arm_reg_syncs_via_cpreg_list(uint64_t regidx) #define AARCH64_CORE_REG(x) (KVM_REG_ARM64 | KVM_REG_SIZE_U64 | \ KVM_REG_ARM_CORE | KVM_REG_ARM_CORE_REG(x)) +/* The linux headers don't define a 128 bit wide SIMD macro for us */ I'm not clear what this comment is trying to tell me... +#define AARCH64_SIMD_CORE_REG(x) (KVM_REG_ARM64 | KVM_REG_SIZE_U128 | \ + KVM_REG_ARM_CORE | KVM_REG_ARM_CORE_REG(x)) + +#define AARCH64_SIMD_CTRL_REG(x) (KVM_REG_ARM64 | KVM_REG_SIZE_U32 | \ + KVM_REG_ARM_CORE | KVM_REG_ARM_CORE_REG(x)) + int kvm_arch_put_registers(CPUState *cs, int level) { struct kvm_one_reg reg; +uint32_t fpr; uint64_t val; int i; int ret; @@ -207,15 +215,37 @@ int kvm_arch_put_registers(CPUState *cs, int level) } } +/* Advanced SIMD and FP registers */ +for (i = 0; i 32; i++) { +reg.id = AARCH64_SIMD_CORE_REG(fp_regs.vregs[i]); +reg.addr = (uintptr_t)(env-vfp.regs[i]); env-vfp.regs[] is a 64 entry array, but here we are only dealing with the first 32 entries (and furthermore trying to write 128 bits into a 64 bit element...) +ret = kvm_vcpu_ioctl(cs, KVM_SET_ONE_REG, reg); Does this work right for bigendian? (ie do the kernel and QEMU agree on which order the two halves of the 128 bit value go in? I suspect not...) +if (ret) { +return ret; +} +} + +reg.addr = (uintptr_t)(fpr); +fpr = vfp_get_fpsr(env); +reg.id = AARCH64_SIMD_CTRL_REG(fp_regs.fpsr); +ret = kvm_vcpu_ioctl(cs, KVM_SET_ONE_REG, reg); +if (ret) { +return ret; +} + +fpr = vfp_get_fpcr(env); +reg.id = AARCH64_SIMD_CTRL_REG(fp_regs.fpcr); +ret = kvm_vcpu_ioctl(cs, KVM_SET_ONE_REG, reg); +if (ret) { +return ret; +} + if (!write_list_to_kvmstate(cpu)) { return EINVAL; } kvm_arm_sync_mpstate_to_kvm(cpu); -/* TODO: - * FP state - */ return ret; } @@ -223,6 +253,7 @@ int kvm_arch_get_registers(CPUState *cs) { struct kvm_one_reg reg; uint64_t val; +uint32_t fpr; int i; int ret; @@ -304,6 +335,31 @@ int kvm_arch_get_registers(CPUState *cs) } } +/* Advanced SIMD and FP registers */ +for (i = 0; i 32; i++) { +reg.id = AARCH64_SIMD_CORE_REG(fp_regs.vregs[i]); +reg.addr = (uintptr_t)(env-vfp.regs[i]); +ret = kvm_vcpu_ioctl(cs, KVM_GET_ONE_REG, reg); Same remarks apply here as for the set loop. +if (ret) { +return ret; +} +} + +reg.addr = (uintptr_t)(fpr); +reg.id = AARCH64_SIMD_CTRL_REG(fp_regs.fpsr); +ret = kvm_vcpu_ioctl(cs, KVM_GET_ONE_REG, reg); +if (ret) { +return ret; +} +vfp_set_fpsr(env, fpr); + +reg.id = AARCH64_SIMD_CTRL_REG(fp_regs.fpcr); +ret = kvm_vcpu_ioctl(cs, KVM_GET_ONE_REG, reg); +if (ret) { +return ret; +} +vfp_set_fpcr(env, fpr); + if (!write_kvmstate_to_list(cpu)) { return EINVAL; } -- 2.3.2 -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4 1/5] target-arm: kvm: save/restore mp state
On 16 March 2015 at 11:01, Alex Bennée alex.ben...@linaro.org wrote: This adds the saving and restore of the current Multi-Processing state of the machine. While the KVM_GET/SET_MP_STATE API exposes a number of potential states for x86 we only use two for ARM. Either the process is running or not. We then save this state into the cpu_powered TCG state to avoid changing the serialisation format. This (and the comments and function names below) still seem to be looking at this in terms of some ambiguous multiprocessing state. What we are actually dealing with here is is the vCPU powered on or not, and I'd rather we said so explicitly. Signed-off-by: Alex Bennée alex.ben...@linaro.org --- v2 - make mpstate field runtime dependant (kvm_enabled()) - drop initial KVM_CAP_MP_STATE requirement - re-use cpu_powered instead of new field v4 - s/HALTED/STOPPED/ - move code from machine.c to kvm. diff --git a/target-arm/kvm.c b/target-arm/kvm.c index 72c1fa1..a74832c 100644 --- a/target-arm/kvm.c +++ b/target-arm/kvm.c @@ -458,6 +458,46 @@ void kvm_arm_reset_vcpu(ARMCPU *cpu) } } +/* + * Update KVM's MP_STATE based on what QEMU thinks it is + */ +int kvm_arm_sync_mpstate_to_kvm(ARMCPU *cpu) +{ +if (kvm_check_extension(CPU(cpu)-kvm_state, KVM_CAP_MP_STATE)) { Doesn't seem like a great idea to do the extension check every time we sync state, when it's always going to be the same for any particular run of QEMU. +struct kvm_mp_state mp_state = { +.mp_state = +cpu-powered_off ? KVM_MP_STATE_STOPPED : KVM_MP_STATE_RUNNABLE +}; +int ret = kvm_vcpu_ioctl(CPU(cpu), KVM_SET_MP_STATE, mp_state); +if (ret) { +fprintf(stderr, %s: failed to set MP_STATE %d/%s\n, +__func__, ret, strerror(ret)); kvm_vcpu_ioctl() returns a negative errno, but strerror wants a positive one. +return -1; +} +} + +return 0; +} + +/* + * Sync the KVM MP_STATE into QEMU + */ +int kvm_arm_sync_mpstate_to_qemu(ARMCPU *cpu) +{ +if (kvm_check_extension(CPU(cpu)-kvm_state, KVM_CAP_MP_STATE)) { +struct kvm_mp_state mp_state; +int ret = kvm_vcpu_ioctl(CPU(cpu), KVM_GET_MP_STATE, mp_state); +if (ret) { +fprintf(stderr, %s: failed to get MP_STATE %d/%s\n, +__func__, ret, strerror(ret)); +abort(); +} +cpu-powered_off = (mp_state.mp_state == KVM_MP_STATE_STOPPED); +} + +return 0; +} + void kvm_arch_pre_run(CPUState *cs, struct kvm_run *run) { } diff --git a/target-arm/kvm32.c b/target-arm/kvm32.c index 94030d1..49b6bab 100644 --- a/target-arm/kvm32.c +++ b/target-arm/kvm32.c @@ -356,6 +356,8 @@ int kvm_arch_put_registers(CPUState *cs, int level) return EINVAL; } +kvm_arm_sync_mpstate_to_kvm(cpu); + return ret; } @@ -427,5 +429,7 @@ int kvm_arch_get_registers(CPUState *cs) */ write_list_to_cpustate(cpu); +kvm_arm_sync_mpstate_to_qemu(cpu); + return 0; } diff --git a/target-arm/kvm64.c b/target-arm/kvm64.c index 8cf3a62..fed03f2 100644 --- a/target-arm/kvm64.c +++ b/target-arm/kvm64.c @@ -211,6 +211,8 @@ int kvm_arch_put_registers(CPUState *cs, int level) return EINVAL; } +kvm_arm_sync_mpstate_to_kvm(cpu); + /* TODO: * FP state */ @@ -310,6 +312,8 @@ int kvm_arch_get_registers(CPUState *cs) */ write_list_to_cpustate(cpu); +kvm_arm_sync_mpstate_to_qemu(cpu); + /* TODO: other registers */ return ret; } diff --git a/target-arm/kvm_arm.h b/target-arm/kvm_arm.h index 455dea3..7b75758 100644 --- a/target-arm/kvm_arm.h +++ b/target-arm/kvm_arm.h @@ -162,6 +162,24 @@ typedef struct ARMHostCPUClass { */ bool kvm_arm_get_host_cpu_features(ARMHostCPUClass *ahcc); + +/** + * kvm_arm_sync_mpstate_to_kvm + * @cpu: ARMCPU + * + * If supported set the KVM MP_STATE based on QEMUs migration data. QEMU's. Also, not migration data, it's just our state. + */ +int kvm_arm_sync_mpstate_to_kvm(ARMCPU *cpu); + +/** + * kvm_arm_sync_mpstate_to_qemu + * @cpu: ARMCPU + * + * If supported get the MP_STATE from KVM and store in QEMUs migration + * data. Apostrophe again. + */ +int kvm_arm_sync_mpstate_to_qemu(ARMCPU *cpu); thanks -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4 4/5] target-arm: kvm64 fix save/restore of SPSR regs
On 17 March 2015 at 19:04, Christoffer Dall christoffer.d...@linaro.org wrote: On Tue, Mar 17, 2015 at 04:18:16PM +, Peter Maydell wrote: Note that this code is implicitly relying on the ordering of register banks defined by the bank_number() function, which is a bit icky. right, I thought you wrote this code with some deeper intention of doing it this way so I tried to stick with the general idea - but now I actually looked at git blame and realized that you didn't even write it. Given all this churn around this, probably it's much cleaner to get rid of the loop and have an explicit sync for each architecturally implemented register, i.e. the SPSR_EL1 and the various mode-specific AArch32 SPSR registers? Yes, this seems like a good idea. I almost suggested it when I was writing out my review comments, in fact... for (i = 0; i KVM_NR_SPSR; i++) { reg.id = AARCH64_CORE_REG(spsr[i]); -reg.addr = (uintptr_t) env-banked_spsr[i - 1]; +reg.addr = (uintptr_t) env-banked_spsr[i+1]; Spaces again. ret = kvm_vcpu_ioctl(cs, KVM_GET_ONE_REG, reg); if (ret) { return ret; } } +el = arm_current_el(env); +if (el 0) { +if (is_a64(env)) { +g_assert(el == 1); +/* KVM_SPSR_SVC holds the AARCH64 EL1 SPSR which QEMU + * keeps in bank 0 so copy it across. */ +env-banked_spsr[0] = env-banked_spsr[1]; +i = aarch64_banked_spsr_index(el); More workarounds for a bug we should just fix, I think. again, this is just for the loop above to be generic, and then fix things up afterwards so that things work both for is_a64() and !is_a64(). But the only reason we're fixing anything up is that we're working around the bug. If we didn't have that bug and the QEMU definition of where SPSR_EL1's state lived correctly pointed at banked_spsr[1], then the only thing you'd need to do for syncing is copy the KVM SPSRs into banked_spsr[1..5]. -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4 4/5] target-arm: kvm64 fix save/restore of SPSR regs
On 16 March 2015 at 11:01, Alex Bennée alex.ben...@linaro.org wrote: From: Christoffer Dall christoffer.d...@linaro.org The current code was negatively indexing the cpu state array and not synchronizing banked spsr register state with the current mode's spsr state, causing occasional failures with migration. Some munging is done to take care of the aarch64 mapping and also to ensure the most current value of the spsr is updated to the banked registers (relevant for KVM-TCG migration). Signed-off-by: Christoffer Dall christoffer.d...@linaro.org Signed-off-by: Alex Bennée alex.ben...@linaro.org --- v2 (ajb) - minor tweaks and clarifications v3 - Use the correct bank index function for setting/getting env-spsr - only deal with spsrs in elevated exception levels v4 - try and make commentary clearer - ensure env-banked_spsr[0] = env-spsr before we sync diff --git a/target-arm/kvm64.c b/target-arm/kvm64.c index 8fd0c8d..7ddb1b1 100644 --- a/target-arm/kvm64.c +++ b/target-arm/kvm64.c @@ -140,6 +140,7 @@ int kvm_arch_put_registers(CPUState *cs, int level) uint64_t val; int i; int ret; +unsigned int el; ARMCPU *cpu = ARM_CPU(cs); CPUARMState *env = cpu-env; @@ -206,9 +207,29 @@ int kvm_arch_put_registers(CPUState *cs, int level) return ret; } +/* Saved Program State Registers + * + * Before we restore from the banked_spsr[] array we need to + * ensure that any modifications to env-spsr are correctly + * reflected and map aarch64 exception levels if required. AArch64. Not entirely sure what the map exception levels is saying. + */ +el = arm_current_el(env); +if (el 0) { +if (is_a64(env)) { +g_assert(el == 1); +env-banked_spsr[0] = env-spsr; If we're in AArch64 mode then I don't believe env-spsr has valid content at this point. The live copy is in env-banked_sprsr[] for all cases. +/* QEMUs AARCH64 EL1 SPSR is in bank 0, so map it to + * KVM_SPSR_SVC for syncing to KVM */ QEMU's and AArch64, but this is just a bug in our implementation -- the mapping between AArch64 SPSR_EL1 and AArch32 SPSR_svc is architecturally mandated. We should be using banked_spsr[1] for our regdef for SPSR_EL1 in helper.c. Also the */ should be on its own line. +env-banked_spsr[1] = env-banked_spsr[0]; What is going on here? Architecturally, AArch64 SPSR_EL1 is mapped to AArch32 SPSR_svc, which is env-banked_spsr[1]. env-banked_spsr[0] would be the SPSR for USR and SYS, except they don't have an SPSR (you cannot take exceptions into them). I think QEMU just has a banked_spsr[0] for convenience of implementation and it should never actually be used. Is this code just working around the QEMU SPSR_EL1 bug? +i = bank_number(env-uncached_cpsr CPSR_M); +env-banked_spsr[i] = env-spsr; +} +} + for (i = 0; i KVM_NR_SPSR; i++) { reg.id = AARCH64_CORE_REG(spsr[i]); -reg.addr = (uintptr_t) env-banked_spsr[i - 1]; +reg.addr = (uintptr_t) env-banked_spsr[i+1]; Coding style demands spaces around the + operator. Note that this code is implicitly relying on the ordering of register banks defined by the bank_number() function, which is a bit icky. ret = kvm_vcpu_ioctl(cs, KVM_SET_ONE_REG, reg); if (ret) { return ret; @@ -254,6 +275,7 @@ int kvm_arch_get_registers(CPUState *cs) struct kvm_one_reg reg; uint64_t val; uint32_t fpr; +unsigned int el; int i; int ret; @@ -326,15 +348,34 @@ int kvm_arch_get_registers(CPUState *cs) return ret; } +/* Fetch the SPSR registers + * + * KVM has an array of state indexed for all the possible aarch32 + * privilege levels. These map onto QEMUs aarch32 banks 1 - 4. QEMU's. AArch32. Also, you mean 1 - 5. + */ for (i = 0; i KVM_NR_SPSR; i++) { reg.id = AARCH64_CORE_REG(spsr[i]); -reg.addr = (uintptr_t) env-banked_spsr[i - 1]; +reg.addr = (uintptr_t) env-banked_spsr[i+1]; Spaces again. ret = kvm_vcpu_ioctl(cs, KVM_GET_ONE_REG, reg); if (ret) { return ret; } } +el = arm_current_el(env); +if (el 0) { +if (is_a64(env)) { +g_assert(el == 1); +/* KVM_SPSR_SVC holds the AARCH64 EL1 SPSR which QEMU + * keeps in bank 0 so copy it across. */ +env-banked_spsr[0] = env-banked_spsr[1]; +i = aarch64_banked_spsr_index(el); More workarounds for a bug we should just fix, I think. +} else { +i = bank_number(env-uncached_cpsr CPSR_M); +} +env-spsr = env-banked_spsr[i]; +} + /* Advanced SIMD and FP registers */ for (i = 0; i 32; i++) { reg.id = AARCH64_SIMD_CORE_REG(fp_regs.vregs[i]); -- 2.3.2 thanks
Re: [PATCH v4 5/5] target-arm: cpu.h document why env-spsr exists
On 16 March 2015 at 11:01, Alex Bennée alex.ben...@linaro.org wrote: I was getting very confused about the duplication of state so wanted to make it explicit. Signed-off-by: Alex Bennée alex.ben...@linaro.org diff --git a/target-arm/cpu.h b/target-arm/cpu.h index 083211c..6dc1799 100644 --- a/target-arm/cpu.h +++ b/target-arm/cpu.h @@ -155,6 +155,11 @@ typedef struct CPUARMState { This contains all the other bits. Use cpsr_{read,write} to access the whole CPSR. */ uint32_t uncached_cpsr; +/* The spsr is a alias for spsr_elN where N is the current + * exception level. I could have sworn I'd commented about this in an earlier version. It's not an alias for spsr_elN, because on AArch32 there are multiple SPSRs at EL1; it is the current SPSR, and the SPSRs for other modes are stored in banked_spsr[]. /* SPSR for current mode. */ would do IMHO (matching the comment for env-regs[].) -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 6/6] target-arm/cpu.h: document why env-spsr exists
On 25 February 2015 at 16:02, Alex Bennée alex.ben...@linaro.org wrote: I was getting very confused about the duplication of state. Perhaps we should just get rid of env-spsr and use helpers that understand the banking? I've already disagreed with this. I would suggest putting tentative questions about future direction of the codebase below the '---' rather than in the commit log :-) Signed-off-by: Alex Bennée alex.ben...@linaro.org diff --git a/target-arm/cpu.h b/target-arm/cpu.h index 11845a6..d7fd13f 100644 --- a/target-arm/cpu.h +++ b/target-arm/cpu.h @@ -155,6 +155,11 @@ typedef struct CPUARMState { This contains all the other bits. Use cpsr_{read,write} to access the whole CPSR. */ uint32_t uncached_cpsr; +/* The spsr is a alias for spsr_elN where N is the current + * exception level. This isn't true for AArch32 (which has multiple different SPSRs any of which might be the one in env-spsr when we're at EL1). -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 1/6] target-arm: kvm: save/restore mp state
On 4 March 2015 at 14:35, Alex Bennée alex.ben...@linaro.org wrote: This adds the saving and restore of the current Multi-Processing state of the machine. While the KVM_GET/SET_MP_STATE API exposes a number of potential states for x86 we only use two for ARM. Either the process is running or not. We then save this state into the cpu_powered TCG state to avoid changing the serialisation format. Signed-off-by: Alex Bennée alex.ben...@linaro.org --- v2 - make mpstate field runtime dependant (kvm_enabled()) - drop initial KVM_CAP_MP_STATE requirement - re-use cpu_powered instead of new field diff --git a/target-arm/machine.c b/target-arm/machine.c index 9446e5a..185f9a2 100644 --- a/target-arm/machine.c +++ b/target-arm/machine.c @@ -161,6 +161,7 @@ static const VMStateInfo vmstate_cpsr = { .put = put_cpsr, }; + static void cpu_pre_save(void *opaque) { ARMCPU *cpu = opaque; @@ -170,6 +171,20 @@ static void cpu_pre_save(void *opaque) /* This should never fail */ abort(); } +#if defined CONFIG_KVM +if (kvm_check_extension(CPU(cpu)-kvm_state, KVM_CAP_MP_STATE)) { +struct kvm_mp_state mp_state; +int ret = kvm_vcpu_ioctl(CPU(cpu), KVM_GET_MP_STATE, mp_state); +if (ret) { +fprintf(stderr, %s: failed to get MP_STATE %d/%s\n, +__func__, ret, strerror(ret)); +abort(); +} +cpu-powered_off = +(mp_state.mp_state == KVM_MP_STATE_RUNNABLE) +? false : true; Ternary operator to produce a true-or-false result is a bit redundant... +} +#endif Why is this in pre-save/post-load rather than in the kvm_arch_get/put_registers functions like all the other syncing code? -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 3/6] hw/char: pl011 don't keep setting the IRQ if nothing changed
On 4 March 2015 at 14:35, Alex Bennée alex.ben...@linaro.org wrote: While observing KVM traces I can see additional IRQ calls on pretty much every MMIO access which is just plain inefficient. Only update the QEMU IRQ level if something has actually changed from last time. Otherwise we may be papering over other failure modes. Signed-off-by: Alex Bennée alex.ben...@linaro.org diff --git a/hw/char/pl011.c b/hw/char/pl011.c index 0a45115..bb554bc 100644 --- a/hw/char/pl011.c +++ b/hw/char/pl011.c @@ -36,6 +36,9 @@ typedef struct PL011State { CharDriverState *chr; qemu_irq irq; const unsigned char *id; + +/* not serialised, prevents pl011_update doing extra set_irqs */ +uint32_t current_irq; } PL011State; #define PL011_INT_TX 0x20 @@ -53,10 +56,11 @@ static const unsigned char pl011_id_luminary[8] = static void pl011_update(PL011State *s) { -uint32_t flags; - -flags = s-int_level s-int_enabled; -qemu_set_irq(s-irq, flags != 0); +uint32_t flags = s-int_level s-int_enabled; +if (flags != s-current_irq) { +s-current_irq = flags; +qemu_set_irq(s-irq, s-current_irq != 0); +} } Consider this sequence of events: * the guest does something causing the interrupt to be asserted; int_level and int_enabled are 1, and current_irq is also now 1. We call qemu_set_irq() to raise the interrupt with the GIC * we migrate the guest to another host * on the receiving end, QEMU is in a cleanly reset state, and so current_irq, int_level and int_enabled are all zero before incoming data arrives * int_level and int_enabled are both set to 1 from the incoming data stream * the GIC itself is set to the interrupt is asserted state by its own incoming data * current_irq remains zero, because it's not migrated * the guest is resumed, and does something to deassert the interrupt. the new 'flags' value is zero * because flags == s-current_irq, we don't call qemu_set_irq, and so we've just dropped the deassert of this interrupt on the floor. -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 3/6] hw/char: pl011 don't keep setting the IRQ if nothing changed
On 12 March 2015 at 15:51, Peter Maydell peter.mayd...@linaro.org wrote: On 4 March 2015 at 14:35, Alex Bennée alex.ben...@linaro.org wrote: While observing KVM traces I can see additional IRQ calls on pretty much every MMIO access which is just plain inefficient. Only update the QEMU IRQ level if something has actually changed from last time. Otherwise we may be papering over other failure modes. Consider this sequence of events: Incidentally, the code review rule of thumb that led me to construct that scenario is: * state inside the device struct which doesn't correspond to real hardware state is suspicious * state which doesn't have any handling on migration save/restore is doubly suspicious ...and then it was just a matter of find the situation where this is broken :-) If we want to avoid making syscalls back into KVM it might be better to attack the problem in the GIC object rather than in all the devices that might be connected to it. In general QEMU devices tend to assume they can just always call qemu_set_irq() and it's the other end's job to avoid doing anything too expensive in that situation. -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 5/6] target-arm/kvm64: fix save/restore of SPSR regs
On 9 March 2015 at 21:56, Christoffer Dall christoffer.d...@linaro.org wrote: this function, however, is not used only when migration, but should generally cover the case where you want to synchronize QEMU's state into KVM's state, right? So while it may not be harmful in currently supported use cases, is there ever a situation where (is_a64(env) el == 0) and env-spsr != banked_spsr[el], and where env-spsr is out-of-date? If EL == 0 then you can't access any SPSR, so env-spsr is by definition out of date. -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 1/5] arm: KVM: export vcpi-pause state via MP_STATE ioctls
On 10 March 2015 at 04:29, Christoffer Dall christoffer.d...@linaro.org wrote: On Mon, Mar 09, 2015 at 04:34:21PM +, Alex Bennée wrote: - Boot - Power on secondary CPUs - Power off one secondary CPU - Migrate to file (cpu_powered reflects state of each CPU) - Start fresh QEMU - Restore from file (cpu_powered - vcpu-paused via ioctl) - Run (we continue with the same power state pre-migrate) - PSCI RESET - Does what it does, power all secondaries down? - Kernel boots, turns them on? Hmmm. As long as QEMU always inits all VCPUs in the exact same way (including the KVM_ARM_VCPU_POWER_OFF feature bit) I guess it works and that's probably a reasonable requirement for migration. We init the VCPUs with the POWER_OFF flag set for exactly the set of CPUs that we want to start powered off. Typically that means that the first CPU is inited with POWER_OFF clear and the rest are inited with it set. Regardless, if we're doing an incoming migration, then the incoming data will be used to set the VCPU on/off state appropriately for resuming the VM. (This might include powering on a VCPU that's been inited in power-off and never run, or powering down a VCPU that was inited power-on but never ran. In the 'migration moves a vcpu to powered-on' case we'll also set the vcpu's PC and other registers so when it is run it will DTRT.) If the resumed guest subsequently does a PSCI reset then QEMU will re-init the VCPUs with the set of feature bits that the machine model/etc defines for a reset condition for this board, which will be the same first CPU powered on, all the rest powered off config we started with. (It's the user's responsibility to ensure that when doing a migration the QEMUs at both ends are identically configured, incidentally.) -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC/RFT PATCH 0/3] arm64: KVM: work around incoherency with uncached guest mappings
On 5 March 2015 at 20:04, Catalin Marinas catalin.mari...@arm.com wrote: On Thu, Mar 05, 2015 at 11:12:22AM +0100, Paolo Bonzini wrote: On 04/03/2015 18:28, Catalin Marinas wrote: Can you add that property to the device tree for PCI devices too? Yes but not with mainline yet: http://thread.gmane.org/gmane.linux.kernel.iommu/8935 We can add the property at the PCI host bridge level (with the drawback that it covers all the PCI devices), like here: Even covering all PCI devices is not enough if we want to support device assignment of PCI host devices. Can we not have another PCI bridge node in the DT for the host device assignments? I'd hate to have to do that. PCI should be entirely probeable given that we tell the guest where the host bridge is, that's one of its advantages. -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC/RFT PATCH 0/3] arm64: KVM: work around incoherency with uncached guest mappings
On 4 March 2015 at 23:29, Catalin Marinas catalin.mari...@arm.com wrote: I disagree it is 100% a host-side issue. It is a host-side issue _if_ the host tells the guest that the (virtual) device is non-coherent (or, more precisely, it does not explicitly tell the guest that the device is coherent). If the guest thinks the (virtual) device is non-coherent because of information passed by the host, I fully agree that the host needs to manage the cache coherency. However, the host could also pass a dma-coherent property in the DT given to the guest and avoid any form of cache maintenance. If the guest does not honour such coherency property, it's a guest problem and it needs fixing in the guest. This isn't any different from a real physical device behaviour. Right, and we should do that for things like virtio, because we want the performance. But we also have devices (like vga framebuffers) which shouldn't be handled as cacheable, so we need to be able to deal with both situations. -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 6/6] target-arm: cpu.h document why env-spsr exists
On 4 March 2015 at 23:35, Alex Bennée alex.ben...@linaro.org wrote: I was getting very confused about the duplication of state. Perhaps we should just get rid of env-spsr and use helpers that understand the banking? Doesn't seem worth changing the current working code to something else that's strictly less efficient... spsr is by no means the only banked-by-mode register, and it works just like all the others. -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/6] target-arm: kvm: save/restore mp state
On 3 March 2015 at 20:06, Paolo Bonzini pbonz...@redhat.com wrote: On 03/03/2015 11:56, Alex Bennée wrote: This adds the saving and restore of the current Multi-Processing state of the machine. While the KVM_GET/SET_MP_STATE API exposes a number of potential states for x86 we only use two for ARM. Either the process is running or not. By this you mean is the CPU in the PSCI powered down state or not, right? From the vcpu's perspective it is either running or not. However it is the same mechanism that is used when PSCI_0_2_FN_CPU_OFF is passed the VM, internally setting vcpu-arch.paused. Well, it has to be (ABI defined to be) identical with being PSCI powered down/up, because that's how userspace is going to be treating it. If we might tell userspace we're in the not running state for other cases than PSCI-powered-down then we probably need to consider separating those out into separate states. I suggest that you define a new MP_STATE constant for this. HALTED in x86 and s390 is the state an ARM processor enters when you execute wfi. Architecturally the CPU doesn't have to enter any state at all if you execute a WFI -- it might be implemented as go to low power state and wait for an interrupt or go low power but maybe be unnecessarily woken up or nop, do nothing... Right now this is not migrated on ARM if I remember correctly, but perhaps you'll want to add it in the future. ...which is why we don't need to migrate this: it just means that migration during WFI causes an unnecessary-wakeup, which is architecturally fine. -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/6] target-arm: kvm: save/restore mp state
On 26 February 2015 at 01:02, Alex Bennée alex.ben...@linaro.org wrote: This adds the saving and restore of the current Multi-Processing state of the machine. While the KVM_GET/SET_MP_STATE API exposes a number of potential states for x86 we only use two for ARM. Either the process is running or not. By this you mean is the CPU in the PSCI powered down state or not, right? Signed-off-by: Alex Bennée alex.ben...@linaro.org diff --git a/target-arm/kvm.c b/target-arm/kvm.c index 23cefe9..8732854 100644 --- a/target-arm/kvm.c +++ b/target-arm/kvm.c @@ -25,6 +25,7 @@ #include hw/arm/arm.h const KVMCapabilityInfo kvm_arch_required_capabilities[] = { +KVM_CAP_INFO(MP_STATE), Does this really want to be a required cap? I haven't checked, but assuming 'required' means what it says this presumably means we'll stop working on host kernels we previously ran fine on (even if migration didn't work there). KVM_CAP_LAST_INFO }; diff --git a/target-arm/machine.c b/target-arm/machine.c index 9446e5a..70b1bc4 100644 --- a/target-arm/machine.c +++ b/target-arm/machine.c @@ -161,6 +161,34 @@ static const VMStateInfo vmstate_cpsr = { .put = put_cpsr, }; +#if defined CONFIG_KVM +static int get_mpstate(QEMUFile *f, void *opaque, size_t size) +{ +ARMCPU *cpu = opaque; +struct kvm_mp_state mp_state = { .mp_state = qemu_get_be32(f)}; +return kvm_vcpu_ioctl(CPU(cpu), KVM_SET_MP_STATE, mp_state); +} Won't this break if you're running a QEMU built with KVM support in TCG mode on an aarch64 host? In any case, for that configuration we should be migrating the TCG cpu-powered_off flag, which is where we keep the PSCI power state for TCG. + +static void put_mpstate(QEMUFile *f, void *opaque, size_t size) +{ +ARMCPU *cpu = opaque; +struct kvm_mp_state mp_state; +int ret = kvm_vcpu_ioctl(CPU(cpu), KVM_GET_MP_STATE, mp_state); +if (ret) { +fprintf(stderr, %s: failed to get MP_STATE %d/%s\n, +__func__, ret, strerror(ret)); +abort(); +} +qemu_put_be32(f, mp_state.mp_state); +} + +static const VMStateInfo vmstate_mpstate = { +.name = mpstate, +.get = get_mpstate, +.put = put_mpstate, +}; +#endif + static void cpu_pre_save(void *opaque) { ARMCPU *cpu = opaque; @@ -244,6 +272,16 @@ const VMStateDescription vmstate_arm_cpu = { VMSTATE_UINT32_ARRAY(env.regs, ARMCPU, 16), VMSTATE_UINT64_ARRAY(env.xregs, ARMCPU, 32), VMSTATE_UINT64(env.pc, ARMCPU), +#if defined CONFIG_KVM +{ +.name = mp_state, +.version_id = 0, +.size = sizeof(uint32_t), +.info = vmstate_mpstate, +.flags = VMS_SINGLE, +.offset = 0, +}, +#endif This means the migration format will be different on different build configurations, which seems like a really bad idea. Have you considered having the KVM state sync functions just sync the kernel's MP state into cpu-powered_off, and then migrating that flag here unconditionally? { .name = cpsr, .version_id = 0, -- 2.3.0 -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: vexpress: Horribly slow MMC emulation on ARM host
On 16 February 2015 at 06:56, Jan Kiszka jan.kis...@web.de wrote: This applies to both emulated and KVM accelerated mode. If I run the same image (and guest kernel) emulated on my x86 box, there is still a difference between both disk modes, but it's not that excessive. On ARM the system requires minutes to boot from MMC - if it doesn't run into timeouts earlier. It's seconds with virtio. MMC is slow for several reasons, which boil down to: * we aren't doing async I/O on the block device layer (the SD/MMC card code is old and probably a long way from current best practice) * we model the mmc-controller-to-card interface fairly closely, which is not a very efficient way of transferring data * the MMC card interface in the vexpress board doesn't do DMA, which means that all data transferred is via (at best) word at a time MMIO register accesses * MMIO register access is particularly bad with KVM because it requires us to come out of EL1 guest to EL2 which then switches to EL1 host (host kernel) which then drops out to EL0 (QEMU), and then on return we do the reverse transition EL0-EL1-EL2-EL1. Some of this is likely fixable (and possibly there are bugs which make it slower than it need be as well as the issues we actually know about). But in general we've taken the approach of saying if you want sensible IO performance use virtio, because some of the above list (no DMA, for instance) is unavoidable. -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] kvmtool: don't use PCI config space IRQ line field
On 6 February 2015 at 18:55, Will Deacon will.dea...@arm.com wrote: On Wed, Feb 04, 2015 at 03:39:50PM +, Andre Przywara wrote: In PCI config space there is an interrupt line field (offset 0x3f), which is used to initially communicate the IRQ line number from firmware to the OS. _Hardware_ should never use this information, as the OS is free to write any information in there. Is this true even with probe-only? I appreciate that this isn't a BAR, but it still feels odd for Linux to write this in that case. The hardware (model) shouldn't be doing anything with the value in this register anyway, so I think this change to kvmtool is correct regardless of Linux's behaviour. -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 4/4] arm/arm64: KVM: use kernel mapping to perform invalidation on page fault
On 13 January 2015 at 12:04, Christoffer Dall christoffer.d...@linaro.org wrote: Additionally, I haven't been able to think of a reasonable guest scenario where this breaks. Once the guest turns on its MMU it should deal with the necessary icache invalidation itself (I think), so we're really talking about situations where the stage-1 MMU is off, and I gather that mostly you'll be seeing a single core doing any heavy lifting and then secondary cores basically coming up, only seeing valid entries in the icache, and doing the necessary invalidat+turn on mmu stuff. The trouble with that is that as the secondary comes up, before it turns on its icache its VA-PA mapping is the identity map; whereas the primary vCPU's VA-PA mapping is whatever the guest kernel's usual mapping is. If the kernel has some mapping other than identity for the VA which is wherever the secondary-CPU-startup-to-MMU-enable code lives (which seems quite likely), then you have potential problems. -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 4/4] arm/arm64: KVM: use kernel mapping to perform invalidation on page fault
On 13 January 2015 at 13:35, Christoffer Dall christoffer.d...@linaro.org wrote: Wouldn't a guest (and I believe Linux does this) reserve ASID 0 for additional cores and use ASID 1+++ for itself? If the guest reserves an ASID for MMU disabled then yes, that would work. The question of course is whether all guests do that... -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 4/4] arm/arm64: KVM: use kernel mapping to perform invalidation on page fault
On 11 January 2015 at 12:33, Christoffer Dall christoffer.d...@linaro.org wrote: On Fri, Jan 09, 2015 at 03:28:58PM +, Peter Maydell wrote: But implementations are allowed to hit in the cache even when the cache is disabled. In particular, setting the guest But how can it hit anything when the icache for the used VMID is guaranteed to be clear (maybe that requires another full icache invalidate for that VMID for PSCI reset)? The point is that at the moment we don't do anything to guarantee that we've cleared the icache. (Plus could there be stale data in the icache for this physical CPU for this VMID because we've run some other vCPU on it? Or does the process of rescheduling vCPUs across pCPUs and guest ASID management deal with that?) You probably want to clear the icache on vcpu (re-)init rather than reset, though (no guarantee that userspace is going to handle system resets via PSCI). -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 4/4] arm/arm64: KVM: use kernel mapping to perform invalidation on page fault
On 11 January 2015 at 17:58, Christoffer Dall christoffer.d...@linaro.org wrote: On Sun, Jan 11, 2015 at 05:37:52PM +, Peter Maydell wrote: On 11 January 2015 at 12:33, Christoffer Dall christoffer.d...@linaro.org wrote: On Fri, Jan 09, 2015 at 03:28:58PM +, Peter Maydell wrote: But implementations are allowed to hit in the cache even when the cache is disabled. In particular, setting the guest But how can it hit anything when the icache for the used VMID is guaranteed to be clear (maybe that requires another full icache invalidate for that VMID for PSCI reset)? The point is that at the moment we don't do anything to guarantee that we've cleared the icache. that's not entirely accurate, I assume all of the icache is invalidated/cleaned at system bring-up time, and every time we re-use a VMID (when we start a VMID rollover) we invalidate the entire icache. Right, but that doesn't catch the VM reset case, which is the one we're talking about. (Plus could there be stale data in the icache for this physical CPU for this VMID because we've run some other vCPU on it? Or does the process of rescheduling vCPUs across pCPUs and guest ASID management deal with that?) we don't clear the icache for vCPUs migrating onto other pCPUs but invalidating the icache on a page fault won't guarantee that either. Do we really need to do that? I don't think we do, but I haven't thought through exactly why we don't yet :-) -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 4/4] arm/arm64: KVM: use kernel mapping to perform invalidation on page fault
On 9 January 2015 at 14:16, Marc Zyngier marc.zyng...@arm.com wrote: On 09/01/15 13:03, Peter Maydell wrote: When we reset a cpu by re-calling KVM_ARM_VCPU_INIT, that doesn't mean we get a new VMID for it, though, does it? I thought that what causes the icache flush to happen for the reset guest is that we unmap all of stage 2 and then fault it back in, via this code. That works for PIPT (we flush the range) and for VIPT (we do a full icache flush), but at the moment for VIVT ASID tagged we assume we can do nothing, and I don't think that's right for this case (though it is right for faulted because page was swapped out and OK for page was written by DMA). When we reset the guest, we also turn both its Icache off. Before turning it back on, the guest has to invalidate it (the ARM ARM doesn't seem to define the state of the cache out of reset). But implementations are allowed to hit in the cache even when the cache is disabled. In particular, setting the guest SCTLR_EL1.I to 0 means iside accesses are Normal Noncacheable for stage 1 attributes and (v8 ARM ARM D3.4.6) these can be held in the instruction cache. So the guest is required to do an icache-invalidate for any instructions that it writes *itself* (or DMAs in) even with the icache off. But it cannot possibly do so for its own initial startup code, and it must be the job of KVM to do that for it. (You can think of this as the VCPU provided by KVM always invalidates the icache on reset and does not require an impdef magic cache-init routine as described by D3.4.4 if you like.) -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 4/4] arm/arm64: KVM: use kernel mapping to perform invalidation on page fault
On 9 January 2015 at 12:50, Christoffer Dall christoffer.d...@linaro.org wrote: On Thu, Jan 08, 2015 at 03:21:50PM +, Peter Maydell wrote: If this is the first instruction in the guest (ie we've just (warm) reset the VM and are running the kernel as loaded into the guest by QEMU/kvmtool) then the guest can't have invalidated the icache, and QEMU can't do the invalidate because it doesn't have the vaddr and VMID of the guest. The guest must clean its icache before turning on the MMU, no? Yes, but to execute the clean icache insn inside the guest, the guest is executing instructions, so we'd better not have stale info for that code... Whenever we reuse a VMID (rollover), we flush the entire icache for that vmid. When we reset a cpu by re-calling KVM_ARM_VCPU_INIT, that doesn't mean we get a new VMID for it, though, does it? I thought that what causes the icache flush to happen for the reset guest is that we unmap all of stage 2 and then fault it back in, via this code. That works for PIPT (we flush the range) and for VIPT (we do a full icache flush), but at the moment for VIVT ASID tagged we assume we can do nothing, and I don't think that's right for this case (though it is right for faulted because page was swapped out and OK for page was written by DMA). -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 4/4] arm/arm64: KVM: use kernel mapping to perform invalidation on page fault
On 8 January 2015 at 11:59, Marc Zyngier marc.zyng...@arm.com wrote: @@ -180,10 +177,32 @@ static inline void coherent_cache_guest_page(struct kvm_vcpu *vcpu, hva_t hva, * * VIVT caches are tagged using both the ASID and the VMID and doesn't * need any kind of flushing (DDI 0406C.b - Page B3-1392). +* +* We need to do this through a kernel mapping (using the +* user-space mapping has proved to be the wrong +* solution). For that, we need to kmap one page at a time, +* and iterate over the range. */ - if (icache_is_pipt()) { - __cpuc_coherent_user_range(hva, hva + size); - } else if (!icache_is_vivt_asid_tagged()) { + + VM_BUG_ON(size PAGE_MASK); + + while (size) { + void *va = kmap_atomic_pfn(pfn); + + if (!vcpu_has_cache_enabled(vcpu) || ipa_uncached) + kvm_flush_dcache_to_poc(va, PAGE_SIZE); + + if (icache_is_pipt()) + __cpuc_coherent_user_range((unsigned long)va, + (unsigned long)va + PAGE_SIZE); + + size -= PAGE_SIZE; + pfn++; + + kunmap_atomic(va); + } If (vcpu_has_cache_enabled(vcpu) !ipa_uncached !icache_is_pipt()) then we're going to run round this loop mapping and unmapping without actually doing anything. Is it worth hoisting that check out of the loop? (I think it's going to be the common case for a non-PIPT icache, right?) + if (!icache_is_pipt() !icache_is_vivt_asid_tagged()) { /* any kind of VIPT cache */ __flush_icache_all(); } Can you remind me why it's OK not to flush the icache for an ASID tagged VIVT icache? Making this page coherent might actually be revealing a change in the instructions associated with the VA, mightn't it? thanks -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 4/4] arm/arm64: KVM: use kernel mapping to perform invalidation on page fault
On 8 January 2015 at 13:07, Marc Zyngier marc.zyng...@arm.com wrote: Can you remind me why it's OK not to flush the icache for an ASID tagged VIVT icache? Making this page coherent might actually be revealing a change in the instructions associated with the VA, mightn't it? ASID cached VIVT icaches are also VMID tagged. It is thus impossible for stale cache lines to come with a new page. And if by synchronizing the caches you obtain a different instruction stream, it means you've restored the wrong page. ...is that true even if the dirty data in the dcache comes from the userspace process doing DMA or writing the initial boot image or whatever? -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html