Re: [PATCH v3 01/11] KVM: arm: plug guest debug exploit
On Wed, Jul 01, 2015 at 03:04:00PM +0800, zichao wrote: On June 29, 2015 11:49:53 PM GMT+08:00, Christoffer Dall christoffer.d...@linaro.org wrote: On Mon, Jun 22, 2015 at 06:41:24PM +0800, Zhichao Huang wrote: Hardware debugging in guests is not intercepted currently, it means that a malicious guest can bring down the entire machine by writing to the debug registers. This patch enable trapping of all debug registers, preventing the guests to access the debug registers. This patch also disable the debug mode(DBGDSCR) in the guest world all the time, preventing the guests to mess with the host state. However, it is a precursor for later patches which will need to do more to world switch debug states while necessary. Cc: sta...@vger.kernel.org Signed-off-by: Zhichao Huang zhichao.hu...@linaro.org --- arch/arm/include/asm/kvm_coproc.h | 3 +- arch/arm/kvm/coproc.c | 60 +++ arch/arm/kvm/handle_exit.c| 4 +-- arch/arm/kvm/interrupts_head.S| 13 - 4 files changed, 70 insertions(+), 10 deletions(-) diff --git a/arch/arm/include/asm/kvm_coproc.h b/arch/arm/include/asm/kvm_coproc.h index 4917c2f..e74ab0f 100644 --- a/arch/arm/include/asm/kvm_coproc.h +++ b/arch/arm/include/asm/kvm_coproc.h @@ -31,7 +31,8 @@ void kvm_register_target_coproc_table(struct kvm_coproc_target_table *table); int kvm_handle_cp10_id(struct kvm_vcpu *vcpu, struct kvm_run *run); int kvm_handle_cp_0_13_access(struct kvm_vcpu *vcpu, struct kvm_run *run); int kvm_handle_cp14_load_store(struct kvm_vcpu *vcpu, struct kvm_run *run); -int kvm_handle_cp14_access(struct kvm_vcpu *vcpu, struct kvm_run *run); +int kvm_handle_cp14_32(struct kvm_vcpu *vcpu, struct kvm_run *run); +int kvm_handle_cp14_64(struct kvm_vcpu *vcpu, struct kvm_run *run); int kvm_handle_cp15_32(struct kvm_vcpu *vcpu, struct kvm_run *run); int kvm_handle_cp15_64(struct kvm_vcpu *vcpu, struct kvm_run *run); diff --git a/arch/arm/kvm/coproc.c b/arch/arm/kvm/coproc.c index f3d88dc..2e12760 100644 --- a/arch/arm/kvm/coproc.c +++ b/arch/arm/kvm/coproc.c @@ -91,12 +91,6 @@ int kvm_handle_cp14_load_store(struct kvm_vcpu *vcpu, struct kvm_run *run) return 1; } -int kvm_handle_cp14_access(struct kvm_vcpu *vcpu, struct kvm_run *run) -{ - kvm_inject_undefined(vcpu); - return 1; -} - static void reset_mpidr(struct kvm_vcpu *vcpu, const struct coproc_reg *r) { /* @@ -519,6 +513,60 @@ int kvm_handle_cp15_32(struct kvm_vcpu *vcpu, struct kvm_run *run) return emulate_cp15(vcpu, params); } +/** + * kvm_handle_cp14_64 -- handles a mrrc/mcrr trap on a guest CP14 access + * @vcpu: The VCPU pointer + * @run: The kvm_run struct + */ +int kvm_handle_cp14_64(struct kvm_vcpu *vcpu, struct kvm_run *run) +{ + struct coproc_params params; + + params.CRn = (kvm_vcpu_get_hsr(vcpu) 1) 0xf; + params.Rt1 = (kvm_vcpu_get_hsr(vcpu) 5) 0xf; + params.is_write = ((kvm_vcpu_get_hsr(vcpu) 1) == 0); + params.is_64bit = true; + + params.Op1 = (kvm_vcpu_get_hsr(vcpu) 16) 0xf; + params.Op2 = 0; + params.Rt2 = (kvm_vcpu_get_hsr(vcpu) 10) 0xf; + params.CRm = 0; this is a complete duplicate of kvm_handle_cp15_64, can you share this code somehow? This patch just want to plug the exploit in the simplest way, and I shared the cp14/cp15 handlers in later patches [PATCH v3 04/11]. Should I take the patch [04/11] ahead of current patch [01/11] ? It would be good if the patch that we can cc stable and which fixes the issue is self-contained. If it's impossible to do that while sharing the handlers (I don't see why, but I didn't write the code) then ok, but otherwise just add that bit of code into this patch I would say. + + /* raz_wi */ + (void)pm_fake(vcpu, params, NULL); + + /* handled */ + kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu)); + return 1; +} + +/** + * kvm_handle_cp14_32 -- handles a mrc/mcr trap on a guest CP14 access + * @vcpu: The VCPU pointer + * @run: The kvm_run struct + */ +int kvm_handle_cp14_32(struct kvm_vcpu *vcpu, struct kvm_run *run) +{ + struct coproc_params params; + + params.CRm = (kvm_vcpu_get_hsr(vcpu) 1) 0xf; + params.Rt1 = (kvm_vcpu_get_hsr(vcpu) 5) 0xf; + params.is_write = ((kvm_vcpu_get_hsr(vcpu) 1) == 0); + params.is_64bit = false; + + params.CRn = (kvm_vcpu_get_hsr(vcpu) 10) 0xf; + params.Op1 = (kvm_vcpu_get_hsr(vcpu) 14) 0x7; + params.Op2 = (kvm_vcpu_get_hsr(vcpu) 17) 0x7; + params.Rt2 = 0; this is a complete duplicate of kvm_handle_cp15_32, can you share this code somehow? + + /* raz_wi */ + (void)pm_fake(vcpu, params, NULL); + + /* handled */ + kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu)); + return 1; +} +
Re: [PATCH v3 04/11] KVM: arm: common infrastructure for handling AArch32 CP14/CP15
On Wed, Jul 01, 2015 at 03:09:35PM +0800, zichao wrote: On June 30, 2015 3:43:34 AM GMT+08:00, Christoffer Dall christoffer.d...@linaro.org wrote: On Mon, Jun 22, 2015 at 06:41:27PM +0800, Zhichao Huang wrote: As we're about to trap a bunch of CP14 registers, let's rework the CP15 handling so it can be generalized and work with multiple tables. Signed-off-by: Zhichao Huang zhichao.hu...@linaro.org --- arch/arm/kvm/coproc.c | 176 ++--- arch/arm/kvm/interrupts_head.S | 2 +- 2 files changed, 112 insertions(+), 66 deletions(-) diff --git a/arch/arm/kvm/coproc.c b/arch/arm/kvm/coproc.c index 9d283d9..d23395b 100644 --- a/arch/arm/kvm/coproc.c +++ b/arch/arm/kvm/coproc.c @@ -375,6 +375,9 @@ static const struct coproc_reg cp15_regs[] = { { CRn(15), CRm( 0), Op1( 4), Op2( 0), is32, access_cbar}, }; +static const struct coproc_reg cp14_regs[] = { +}; + /* Target specific emulation tables */ static struct kvm_coproc_target_table *target_tables[KVM_ARM_NUM_TARGETS]; @@ -424,47 +427,75 @@ static const struct coproc_reg *find_reg(const struct coproc_params *params, return NULL; } -static int emulate_cp15(struct kvm_vcpu *vcpu, - const struct coproc_params *params) +/* + * emulate_cp -- tries to match a cp14/cp15 access in a handling table, + *and call the corresponding trap handler. + * + * @params: pointer to the descriptor of the access + * @table: array of trap descriptors + * @num: size of the trap descriptor array + * + * Return 0 if the access has been handled, and -1 if not. + */ +static int emulate_cp(struct kvm_vcpu *vcpu, + const struct coproc_params *params, + const struct coproc_reg *table, + size_t num) { - size_t num; - const struct coproc_reg *table, *r; - - trace_kvm_emulate_cp15_imp(params-Op1, params-Rt1, params-CRn, - params-CRm, params-Op2, params-is_write); + const struct coproc_reg *r; - table = get_target_table(vcpu-arch.target, num); + if (!table) + return -1; /* Not handled */ - /* Search target-specific then generic table. */ r = find_reg(params, table, num); - if (!r) - r = find_reg(params, cp15_regs, ARRAY_SIZE(cp15_regs)); - if (likely(r)) { + if (r) { /* If we don't have an accessor, we should never get here! */ BUG_ON(!r-access); if (likely(r-access(vcpu, params, r))) { /* Skip instruction, since it was emulated */ kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu)); - return 1; } - /* If access function fails, it should complain. */ - } else { - kvm_err(Unsupported guest CP15 access at: %08lx\n, - *vcpu_pc(vcpu)); - print_cp_instr(params); + + /* Handled */ + return 0; } + + /* Not handled */ + return -1; +} + +static void unhandled_cp_access(struct kvm_vcpu *vcpu, + const struct coproc_params *params) +{ + u8 hsr_ec = kvm_vcpu_trap_get_class(vcpu); + int cp; + + switch (hsr_ec) { + case HSR_EC_CP15_32: + case HSR_EC_CP15_64: + cp = 15; + break; + case HSR_EC_CP14_MR: + case HSR_EC_CP14_64: + cp = 14; + break; + default: + WARN_ON((cp = -1)); + } + + kvm_err(Unsupported guest CP%d access at: %08lx\n, + cp, *vcpu_pc(vcpu)); + print_cp_instr(params); kvm_inject_undefined(vcpu); - return 1; } -/** - * kvm_handle_cp15_64 -- handles a mrrc/mcrr trap on a guest CP15 access - * @vcpu: The VCPU pointer - * @run: The kvm_run struct - */ -int kvm_handle_cp15_64(struct kvm_vcpu *vcpu, struct kvm_run *run) +int kvm_handle_cp_64(struct kvm_vcpu *vcpu, + const struct coproc_reg *global, + size_t nr_global, + const struct coproc_reg *target_specific, + size_t nr_specific) { struct coproc_params params; @@ -478,7 +509,13 @@ int kvm_handle_cp15_64(struct kvm_vcpu *vcpu, struct kvm_run *run) params.Rt2 = (kvm_vcpu_get_hsr(vcpu) 10) 0xf; params.CRm = 0; - return emulate_cp15(vcpu, params); + if (!emulate_cp(vcpu, params, target_specific, nr_specific)) + return 1; + if (!emulate_cp(vcpu, params, global, nr_global)) + return 1; + + unhandled_cp_access(vcpu, params); + return 1; } static void reset_coproc_regs(struct kvm_vcpu *vcpu, @@ -491,12 +528,11 @@ static void reset_coproc_regs(struct kvm_vcpu *vcpu, table[i].reset(vcpu, table[i]); } -/** - * kvm_handle_cp15_32 -- handles a mrc/mcr trap
[PATCH v3 1/2] vhost: extend memory regions allocation to vmalloc
with large number of memory regions we could end up with high order allocations and kmalloc could fail if host is under memory pressure. Considering that memory regions array is used on hot path try harder to allocate using kmalloc and if it fails resort to vmalloc. It's still better than just failing vhost_set_memory() and causing guest crash due to it when a new memory hotplugged to guest. I'll still look at QEMU side solution to reduce amount of memory regions it feeds to vhost to make things even better, but it doesn't hurt for kernel to behave smarter and don't crash older QEMU's which could use large amount of memory regions. Signed-off-by: Igor Mammedov imamm...@redhat.com --- drivers/vhost/vhost.c | 22 +- 1 file changed, 17 insertions(+), 5 deletions(-) diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c index f1e07b8..99931a0 100644 --- a/drivers/vhost/vhost.c +++ b/drivers/vhost/vhost.c @@ -471,7 +471,7 @@ void vhost_dev_cleanup(struct vhost_dev *dev, bool locked) fput(dev-log_file); dev-log_file = NULL; /* No one will access memory at this point */ - kfree(dev-memory); + kvfree(dev-memory); dev-memory = NULL; WARN_ON(!list_empty(dev-work_list)); if (dev-worker) { @@ -601,6 +601,18 @@ static int vhost_memory_reg_sort_cmp(const void *p1, const void *p2) return 0; } +static void *vhost_kvzalloc(unsigned long size) +{ + void *n = kzalloc(size, GFP_KERNEL | __GFP_NOWARN | __GFP_REPEAT); + + if (!n) { + n = vzalloc(size); + if (!n) + return ERR_PTR(-ENOMEM); + } + return n; +} + static long vhost_set_memory(struct vhost_dev *d, struct vhost_memory __user *m) { struct vhost_memory mem, *newmem, *oldmem; @@ -613,21 +625,21 @@ static long vhost_set_memory(struct vhost_dev *d, struct vhost_memory __user *m) return -EOPNOTSUPP; if (mem.nregions VHOST_MEMORY_MAX_NREGIONS) return -E2BIG; - newmem = kmalloc(size + mem.nregions * sizeof *m-regions, GFP_KERNEL); + newmem = vhost_kvzalloc(size + mem.nregions * sizeof(*m-regions)); if (!newmem) return -ENOMEM; memcpy(newmem, mem, size); if (copy_from_user(newmem-regions, m-regions, mem.nregions * sizeof *m-regions)) { - kfree(newmem); + kvfree(newmem); return -EFAULT; } sort(newmem-regions, newmem-nregions, sizeof(*newmem-regions), vhost_memory_reg_sort_cmp, NULL); if (!memory_access_ok(d, newmem, 0)) { - kfree(newmem); + kvfree(newmem); return -EFAULT; } oldmem = d-memory; @@ -639,7 +651,7 @@ static long vhost_set_memory(struct vhost_dev *d, struct vhost_memory __user *m) d-vqs[i]-memory = newmem; mutex_unlock(d-vqs[i]-mutex); } - kfree(oldmem); + kvfree(oldmem); return 0; } -- 1.8.3.1 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 05/10] KVM: arm/arm64: vgic: Relax vgic_can_sample_irq for edge IRQs
On 30/06/15 21:19, Christoffer Dall wrote: On Mon, Jun 08, 2015 at 06:04:00PM +0100, Marc Zyngier wrote: We only set the irq_queued flag for level interrupts, meaning that !vgic_irq_is_queued(vcpu, irq) is a good enough predicate for all interrupts. This will allow us to inject edge HW interrupts, for which the state ACTIVE+PENDING is not allowed. I don't understand this; ACTIVE+PENDING is allowed for edge interrupts. Do you mean that if we set the HW bit in the LR, then we are linking to an HW interrupt where we don't allow that to be ACTIVE+PENDING on the HW GIC side? Why is this relevant here? I feel like I'm missing context. I've probably taken a shortcut here - bear with me while I'm trying to explain the issue. For HW interrupts, we shouldn't even try to use the state bits in the LR, because that state is contained in the physical distributor. Setting the HW bit really means there is something going on at the distributor level, just go there. If we were to inject a ACTIVE+PENDING interrupt at the LR level, we'd basically loose the second interrupt because that state is simply not considered. So the trick we're using is to only inject the active interrupt, and prevent anything else from being injected until we can confirm that the active state has been cleared at the physical level. Does it make any sense? M. -- Jazz is not dead. It just smells funny... -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 10/10] KVM: arm/arm64: vgic: Allow non-shared device HW interrupts
On 30/06/15 21:19, Christoffer Dall wrote: On Mon, Jun 08, 2015 at 06:04:05PM +0100, Marc Zyngier wrote: So far, the only use of the HW interrupt facility is the timer, implying that the active state is context-switched for each vcpu, as the device is is shared across all vcpus. This does not work for a device that has been assigned to a VM, as the guest is entierely in control of that device (the HW is not shared). In that case, it makes sense to bypass the whole active state switchint, and only track the deactivation of the interrupt. The discinction here between shared and non-shared feels a bit arbitrary (it may not be, but just feel that way) and I can't easily convince myself that this is the logical/correct/all-encompassing word to describe the nature of the two devices. Does the idea of global vs private resource feel more correct? M. -- Jazz is not dead. It just smells funny... -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 10/10] KVM: arm/arm64: vgic: Allow non-shared device HW interrupts
On Wed, Jul 01, 2015 at 09:26:59AM +0100, Marc Zyngier wrote: On 30/06/15 21:19, Christoffer Dall wrote: On Mon, Jun 08, 2015 at 06:04:05PM +0100, Marc Zyngier wrote: So far, the only use of the HW interrupt facility is the timer, implying that the active state is context-switched for each vcpu, as the device is is shared across all vcpus. This does not work for a device that has been assigned to a VM, as the guest is entierely in control of that device (the HW is not shared). In that case, it makes sense to bypass the whole active state switchint, and only track the deactivation of the interrupt. The discinction here between shared and non-shared feels a bit arbitrary (it may not be, but just feel that way) and I can't easily convince myself that this is the logical/correct/all-encompassing word to describe the nature of the two devices. Does the idea of global vs private resource feel more correct? I think shared covers that equally well. This feels like one of those things that just doesn't make intuitive sense on its own but when you think about the cases we are familiar with, then it fits for now. So what you have here is probably as good as it gets and hopefully it does cover all the cases we care about, i.e. shared and non-shared :) -Christoffer -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v3 2/2] vhost: add max_mem_regions module parameter
it became possible to use a bigger amount of memory slots, which is used by memory hotplug for registering hotplugged memory. However QEMU crashes if it's used with more than ~60 pc-dimm devices and vhost-net enabled since host kernel in module vhost-net refuses to accept more than 64 memory regions. Allow to tweak limit via max_mem_regions module paramemter with default value set to 64 slots. Signed-off-by: Igor Mammedov imamm...@redhat.com --- drivers/vhost/vhost.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c index 99931a0..5905cd7 100644 --- a/drivers/vhost/vhost.c +++ b/drivers/vhost/vhost.c @@ -29,8 +29,12 @@ #include vhost.h +static ushort max_mem_regions = 64; +module_param(max_mem_regions, ushort, 0444); +MODULE_PARM_DESC(max_mem_regions, + Maximum number of memory regions in memory map. (default: 64)); + enum { - VHOST_MEMORY_MAX_NREGIONS = 64, VHOST_MEMORY_F_LOG = 0x1, }; @@ -623,7 +627,7 @@ static long vhost_set_memory(struct vhost_dev *d, struct vhost_memory __user *m) return -EFAULT; if (mem.padding) return -EOPNOTSUPP; - if (mem.nregions VHOST_MEMORY_MAX_NREGIONS) + if (mem.nregions max_mem_regions) return -E2BIG; newmem = vhost_kvzalloc(size + mem.nregions * sizeof(*m-regions)); if (!newmem) -- 1.8.3.1 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v3 0/2] vhost: support more than 64 memory regions
changes since v2: * drop cache patches for now as suggested * add max_mem_regions module parameter instead of unconditionally increasing limit * drop bsearch patch since it's already queued References to previous versions: v2: https://lkml.org/lkml/2015/6/17/276 v1: http://www.spinics.net/lists/kvm/msg117654.html Series allows to tweak vhost's memory regions count limit. It fixes VM crashing on memory hotplug due to vhost refusing accepting more than 64 memory regions with max_mem_regions set to more than 262 slots in default QEMU configuration. Igor Mammedov (2): vhost: extend memory regions allocation to vmalloc vhost: add max_mem_regions module parameter drivers/vhost/vhost.c | 30 +++--- 1 file changed, 23 insertions(+), 7 deletions(-) -- 1.8.3.1 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/9] kvm: add hyper-v crash msrs values
On 01/07/2015 18:06, Peter Hornyack wrote: If userspace is controlling the crash capabilities then HV_X64_MSR_CRASH_CTL_CONTENTS is not needed. Actually you still need to: userspace cannot write anything but 0 or (1ULL 63). However, the name makes less sense, so I'm in favor of removing the value. Paolo -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v7 05/11] KVM: arm64: guest debug, add SW break point support
This adds support for SW breakpoints inserted by userspace. We do this by trapping all guest software debug exceptions to the hypervisor (MDCR_EL2.TDE). The exit handler sets an exit reason of KVM_EXIT_DEBUG with the kvm_debug_exit_arch structure holding the exception syndrome information. It will be up to userspace to extract the PC (via GET_ONE_REG) and determine if the debug event was for a breakpoint it inserted. If not userspace will need to re-inject the correct exception restart the hypervisor to deliver the debug exception to the guest. Any other guest software debug exception (e.g. single step or HW assisted breakpoints) will cause an error and the VM to be killed. This is addressed by later patches which add support for the other debug types. Signed-off-by: Alex Bennée alex.ben...@linaro.org Reviewed-by: Christoffer Dall christoffer.d...@linaro.org --- v2 - update to use new exit struct - tweak for C setup - do our setup in debug_setup/clear code - fixed up comments v3: - fix spacing in KVM_GUESTDBG_VALID_MASK - fix and clarify wording on kvm_handle_guest_debug - handle error case in kvm_handle_guest_debug - re-word the commit message v4 - rm else leg - add r-b-tag v7 - moved ioctl to guest --- Documentation/virtual/kvm/api.txt | 2 +- arch/arm64/kvm/debug.c| 3 +++ arch/arm64/kvm/guest.c| 2 +- arch/arm64/kvm/handle_exit.c | 36 4 files changed, 41 insertions(+), 2 deletions(-) diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt index ba635c7..33c8143 100644 --- a/Documentation/virtual/kvm/api.txt +++ b/Documentation/virtual/kvm/api.txt @@ -2667,7 +2667,7 @@ when running. Common control bits are: The top 16 bits of the control field are architecture specific control flags which can include the following: - - KVM_GUESTDBG_USE_SW_BP: using software breakpoints [x86] + - KVM_GUESTDBG_USE_SW_BP: using software breakpoints [x86, arm64] - KVM_GUESTDBG_USE_HW_BP: using hardware breakpoints [x86, s390] - KVM_GUESTDBG_INJECT_DB: inject DB type exception [x86] - KVM_GUESTDBG_INJECT_BP: inject BP type exception [x86] diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c index faf0e1f..8d1bfa4 100644 --- a/arch/arm64/kvm/debug.c +++ b/arch/arm64/kvm/debug.c @@ -73,6 +73,9 @@ void kvm_arm_setup_debug(struct kvm_vcpu *vcpu) if (trap_debug) vcpu-arch.mdcr_el2 |= MDCR_EL2_TDA; + /* Trap breakpoints? */ + if (vcpu-guest_debug KVM_GUESTDBG_USE_SW_BP) + vcpu-arch.mdcr_el2 |= MDCR_EL2_TDE; } void kvm_arm_clear_debug(struct kvm_vcpu *vcpu) diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c index 0ba8677..22d22c5 100644 --- a/arch/arm64/kvm/guest.c +++ b/arch/arm64/kvm/guest.c @@ -332,7 +332,7 @@ int kvm_arch_vcpu_ioctl_translate(struct kvm_vcpu *vcpu, return -EINVAL; } -#define KVM_GUESTDBG_VALID_MASK (KVM_GUESTDBG_ENABLE) +#define KVM_GUESTDBG_VALID_MASK (KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_USE_SW_BP) /** * kvm_arch_vcpu_ioctl_set_guest_debug - set up guest debugging diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c index 524fa25..27f38a9 100644 --- a/arch/arm64/kvm/handle_exit.c +++ b/arch/arm64/kvm/handle_exit.c @@ -82,6 +82,40 @@ static int kvm_handle_wfx(struct kvm_vcpu *vcpu, struct kvm_run *run) return 1; } +/** + * kvm_handle_guest_debug - handle a debug exception instruction + * + * @vcpu: the vcpu pointer + * @run: access to the kvm_run structure for results + * + * We route all debug exceptions through the same handler. If both the + * guest and host are using the same debug facilities it will be up to + * userspace to re-inject the correct exception for guest delivery. + * + * @return: 0 (while setting run-exit_reason), -1 for error + */ +static int kvm_handle_guest_debug(struct kvm_vcpu *vcpu, struct kvm_run *run) +{ + u32 hsr = kvm_vcpu_get_hsr(vcpu); + int ret = 0; + + run-exit_reason = KVM_EXIT_DEBUG; + run-debug.arch.hsr = hsr; + + switch (hsr ESR_ELx_EC_SHIFT) { + case ESR_ELx_EC_BKPT32: + case ESR_ELx_EC_BRK64: + break; + default: + kvm_err(%s: un-handled case hsr: %#08x\n, + __func__, (unsigned int) hsr); + ret = -1; + break; + } + + return ret; +} + static exit_handle_fn arm_exit_handlers[] = { [ESR_ELx_EC_WFx]= kvm_handle_wfx, [ESR_ELx_EC_CP15_32]= kvm_handle_cp15_32, @@ -96,6 +130,8 @@ static exit_handle_fn arm_exit_handlers[] = { [ESR_ELx_EC_SYS64] = kvm_handle_sys_reg, [ESR_ELx_EC_IABT_LOW] = kvm_handle_guest_abort, [ESR_ELx_EC_DABT_LOW] = kvm_handle_guest_abort, + [ESR_ELx_EC_BKPT32] = kvm_handle_guest_debug, + [ESR_ELx_EC_BRK64] = kvm_handle_guest_debug, }; static
[PATCH v7 04/11] KVM: arm: introduce kvm_arm_init/setup/clear_debug
This is a precursor for later patches which will need to do more to setup debug state before entering the hyp.S switch code. The existing functionality for setting mdcr_el2 has been moved out of hyp.S and now uses the value kept in vcpu-arch.mdcr_el2. As the assembler used to previously mask and preserve MDCR_EL2.HPMN I've had to add a mechanism to save the value of mdcr_el2 as a per-cpu variable during the initialisation code. The kernel never sets this number so we are assuming the bootcode has set up the correct value here. This also moves the conditional setting of the TDA bit from the hyp code into the C code which is currently used for the lazy debug register context switch code. Signed-off-by: Alex Bennée alex.ben...@linaro.org Reviewed-by: Christoffer Dall christoffer.d...@linaro.org --- v3 - rename fns from arch-arm - preserve MDCR_EL2.HPMN setting - re-word some of the comments - fix some minor grammar nits - merge setting of mdcr_el2 - introduce trap_debug flag - move setup/clear within the irq lock section v4 - fix TDOSA desc - rm un-needed else leg - s/arch/arm/ v6 - add s-o-b tag --- arch/arm/include/asm/kvm_host.h | 4 ++ arch/arm/kvm/arm.c| 9 - arch/arm64/include/asm/kvm_asm.h | 2 + arch/arm64/include/asm/kvm_host.h | 5 +++ arch/arm64/kernel/asm-offsets.c | 1 + arch/arm64/kvm/Makefile | 2 +- arch/arm64/kvm/debug.c| 81 +++ arch/arm64/kvm/hyp.S | 19 - 8 files changed, 110 insertions(+), 13 deletions(-) create mode 100644 arch/arm64/kvm/debug.c diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h index d71607c..746c0c69 100644 --- a/arch/arm/include/asm/kvm_host.h +++ b/arch/arm/include/asm/kvm_host.h @@ -236,4 +236,8 @@ static inline void kvm_arch_sync_events(struct kvm *kvm) {} static inline void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {} static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {} +static inline void kvm_arm_init_debug(void) {} +static inline void kvm_arm_setup_debug(struct kvm_vcpu *vcpu) {} +static inline void kvm_arm_clear_debug(struct kvm_vcpu *vcpu) {} + #endif /* __ARM_KVM_HOST_H__ */ diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c index 92b80bc..af60e6f 100644 --- a/arch/arm/kvm/arm.c +++ b/arch/arm/kvm/arm.c @@ -542,6 +542,8 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run) continue; } + kvm_arm_setup_debug(vcpu); + /** * Enter the guest */ @@ -554,7 +556,10 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run) vcpu-mode = OUTSIDE_GUEST_MODE; kvm_guest_exit(); trace_kvm_exit(kvm_vcpu_trap_get_class(vcpu), *vcpu_pc(vcpu)); - /* + + kvm_arm_clear_debug(vcpu); + +/* * We may have taken a host interrupt in HYP mode (ie * while executing the guest). This interrupt is still * pending, as we haven't serviced it yet! @@ -902,6 +907,8 @@ static void cpu_init_hyp_mode(void *dummy) vector_ptr = (unsigned long)__kvm_hyp_vector; __cpu_init_hyp_mode(boot_pgd_ptr, pgd_ptr, hyp_stack_ptr, vector_ptr); + + kvm_arm_init_debug(); } static int hyp_init_cpu_notify(struct notifier_block *self, diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h index 4f7310f..d6b507e 100644 --- a/arch/arm64/include/asm/kvm_asm.h +++ b/arch/arm64/include/asm/kvm_asm.h @@ -137,6 +137,8 @@ extern char __restore_vgic_v2_state[]; extern char __save_vgic_v3_state[]; extern char __restore_vgic_v3_state[]; +extern u32 __kvm_get_mdcr_el2(void); + #endif #endif /* __ARM_KVM_ASM_H__ */ diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index f0f58c9..7cb99b5 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -103,6 +103,7 @@ struct kvm_vcpu_arch { /* HYP configuration */ u64 hcr_el2; + u32 mdcr_el2; /* Exception Information */ struct kvm_vcpu_fault_info fault; @@ -250,4 +251,8 @@ static inline void kvm_arch_sync_events(struct kvm *kvm) {} static inline void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {} static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {} +void kvm_arm_init_debug(void); +void kvm_arm_setup_debug(struct kvm_vcpu *vcpu); +void kvm_arm_clear_debug(struct kvm_vcpu *vcpu); + #endif /* __ARM64_KVM_HOST_H__ */ diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c index da675cc..dfb25a2 100644 --- a/arch/arm64/kernel/asm-offsets.c +++ b/arch/arm64/kernel/asm-offsets.c @@ -117,6 +117,7 @@ int main(void) DEFINE(VCPU_HPFAR_EL2,
Re: [PATCH 05/10] KVM: arm/arm64: vgic: Relax vgic_can_sample_irq for edge IRQs
On 01/07/15 12:58, Christoffer Dall wrote: On Wed, Jul 01, 2015 at 10:17:52AM +0100, Marc Zyngier wrote: On 30/06/15 21:19, Christoffer Dall wrote: On Mon, Jun 08, 2015 at 06:04:00PM +0100, Marc Zyngier wrote: We only set the irq_queued flag for level interrupts, meaning that !vgic_irq_is_queued(vcpu, irq) is a good enough predicate for all interrupts. This will allow us to inject edge HW interrupts, for which the state ACTIVE+PENDING is not allowed. I don't understand this; ACTIVE+PENDING is allowed for edge interrupts. Do you mean that if we set the HW bit in the LR, then we are linking to an HW interrupt where we don't allow that to be ACTIVE+PENDING on the HW GIC side? Why is this relevant here? I feel like I'm missing context. I've probably taken a shortcut here - bear with me while I'm trying to explain the issue. For HW interrupts, we shouldn't even try to use the state bits in the LR, because that state is contained in the physical distributor. Setting the HW bit really means there is something going on at the distributor level, just go there. ok, so by HW interrupts you mean virtual interrupts with the HW bit in the LR set, correct? Yes, sorry. If we were to inject a ACTIVE+PENDING interrupt at the LR level, we'd basically loose the second interrupt because that state is simply not considered. Huh? Which second interrupt. I looked at the spec and it says don't use the state bits for HW interrupts, so isn't it simply not supported to set these bits at all and that's it? I managed to confuse myself reading the same bit. It says (GICv3 spec): A hypervisor must only use the pending and active state for software originated interrupts, which are typically associated with virtual devices, or SGIs. That's the PENDING+ACTIVE state, and not the pending and active bits like I read it initially. Now consider the following scenario: - We inject a virtual edge interrupt - We mark the corresponding physical interrupt as active. - Queue interrupt in an LR - Resume vcpu Now, we inject another edge interrupt, the vcpu exits for whatever reason, and the previously injected interrupt is still active. The normal vGIC flow would be to mark the interrupt as ACTIVE+PENDING in the LR, and resume the vcpu. But the above states that this is invalid for HW generated interrupts. So the trick we're using is to only inject the active interrupt, and prevent anything else from being injected until we can confirm that the active state has been cleared at the physical level. Does it make any sense? Sort of, but what I don't understand now is how the guest ever sees the interrupt then. If we always inject the virtual interrupt by setting the active state on the physical distributor, and we can't inject this as active+pending, and the guest doesn't see the state in the LR, then how does this ever raise a virtual interrupt and how does the guest see an interrupt which is only PENDING so that it can ack it etc. etc.? Maybe I don't fully understand how the HW bit works after all... The way the spec is written is slightly misleading. But the gist of it is that we still signal the guest using the PENDING bit in the LR, and switch the LR as usual. it is just that we can't use the PENDING+ACTIVE state (apparently, this can lead to a double deactivation). Not sure the above makes sense. Beer time, I suppose. M. -- Jazz is not dead. It just smells funny... -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v7 01/11] KVM: add comments for kvm_debug_exit_arch struct
Bring into line with the comments for the other structures and their KVM_EXIT_* cases. Also update api.txt to reflect use in kvm_run documentation. Signed-off-by: Alex Bennée alex.ben...@linaro.org Reviewed-by: David Hildenbrand d...@linux.vnet.ibm.com Reviewed-by: Andrew Jones drjo...@redhat.com Acked-by: Christoffer Dall christoffer.d...@linaro.org --- v2 - add comments for other exit types v3 - s/commentary/comments/ - add rb tags - update api.txt kvm_run to include KVM_EXIT_DEBUG desc v4 - sp fixes - add a-b --- Documentation/virtual/kvm/api.txt | 4 +++- include/uapi/linux/kvm.h | 3 +++ 2 files changed, 6 insertions(+), 1 deletion(-) diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt index 9fa2bf8..c34c32d 100644 --- a/Documentation/virtual/kvm/api.txt +++ b/Documentation/virtual/kvm/api.txt @@ -3070,11 +3070,13 @@ data_offset describes where the data is located (KVM_EXIT_IO_OUT) or where kvm expects application code to place the data for the next KVM_RUN invocation (KVM_EXIT_IO_IN). Data format is a packed array. + /* KVM_EXIT_DEBUG */ struct { struct kvm_debug_exit_arch arch; } debug; -Unused. +If the exit_reason is KVM_EXIT_DEBUG, then a vcpu is processing a debug event +for which architecture specific information is returned. /* KVM_EXIT_MMIO */ struct { diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h index 4b60056..70ac641 100644 --- a/include/uapi/linux/kvm.h +++ b/include/uapi/linux/kvm.h @@ -237,6 +237,7 @@ struct kvm_run { __u32 count; __u64 data_offset; /* relative to kvm_run start */ } io; + /* KVM_EXIT_DEBUG */ struct { struct kvm_debug_exit_arch arch; } debug; @@ -285,6 +286,7 @@ struct kvm_run { __u32 data; __u8 is_write; } dcr; + /* KVM_EXIT_INTERNAL_ERROR */ struct { __u32 suberror; /* Available with KVM_CAP_INTERNAL_ERROR_DATA: */ @@ -295,6 +297,7 @@ struct kvm_run { struct { __u64 gprs[32]; } osi; + /* KVM_EXIT_PAPR_HCALL */ struct { __u64 nr; __u64 ret; -- 2.4.5 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v7 06/11] KVM: arm64: guest debug, add support for single-step
This adds support for single-stepping the guest. To do this we need to manipulate the guests PSTATE.SS and MDSCR_EL1.SS bits to trigger stepping. We take care to preserve MDSCR_EL1 and trap access to it to ensure we don't affect the apparent state of the guest. As we have to enable trapping of all software debug exceptions we suppress the ability of the guest to single-step itself. If we didn't we would have to deal with the exception arriving while the guest was in kernelspace when the guest is expecting to single-step userspace. This is something we don't want to unwind in the kernel. Once the host is no longer debugging the guest its ability to single-step userspace is restored. Signed-off-by: Alex Bennée alex.ben...@linaro.org Reviewed-by: Christoffer Dall christoffer.d...@linaro.org --- v2 - Move pstate/mdscr manipulation into C - don't export guest_debug to assembly - add accessor for saved_debug regs - tweak save/restore of mdscr_el1 v3 - don't save PC in debug information struct - rename debug_saved_regs-guest_debug_state - save whole value, only use bits in restore - add save/restore_guest-debug_regs helper functions - simplify commit message for clarity - rm vcpu_debug_saved_reg access fn v4 - added more comments based on suggestions - guest_debug_state-guest_debug_preserved - no point masking restore, we will trap out v5 - more comments - don't bother preserving pstate.ss (guest never sees change) v6 - reword comments on guest SS suppression - simplify comment for save regs, SS explained in detail later on - add r-b-t (code) - expanded commit description v7 - merge fix for ioctl move to guest.c --- arch/arm64/include/asm/kvm_host.h | 11 +++ arch/arm64/kvm/debug.c| 68 --- arch/arm64/kvm/guest.c| 4 ++- arch/arm64/kvm/handle_exit.c | 2 ++ 4 files changed, 80 insertions(+), 5 deletions(-) diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 7cb99b5..e2db6a6 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -123,6 +123,17 @@ struct kvm_vcpu_arch { * here. */ + /* +* Guest registers we preserve during guest debugging. +* +* These shadow registers are updated by the kvm_handle_sys_reg +* trap handler if the guest accesses or updates them while we +* are using guest debug. +*/ + struct { + u32 mdscr_el1; + } guest_debug_preserved; + /* Don't run the guest */ bool pause; diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c index 8d1bfa4..d439eb8 100644 --- a/arch/arm64/kvm/debug.c +++ b/arch/arm64/kvm/debug.c @@ -19,11 +19,39 @@ #include linux/kvm_host.h +#include asm/debug-monitors.h +#include asm/kvm_asm.h #include asm/kvm_arm.h +#include asm/kvm_emulate.h + +/* These are the bits of MDSCR_EL1 we may manipulate */ +#define MDSCR_EL1_DEBUG_MASK (DBG_MDSCR_SS | \ + DBG_MDSCR_KDE | \ + DBG_MDSCR_MDE) static DEFINE_PER_CPU(u32, mdcr_el2); /** + * save/restore_guest_debug_regs + * + * For some debug operations we need to tweak some guest registers. As + * a result we need to save the state of those registers before we + * make those modifications. + * + * Guest access to MDSCR_EL1 is trapped by the hypervisor and handled + * after we have restored the preserved value to the main context. + */ +static void save_guest_debug_regs(struct kvm_vcpu *vcpu) +{ + vcpu-arch.guest_debug_preserved.mdscr_el1 = vcpu_sys_reg(vcpu, MDSCR_EL1); +} + +static void restore_guest_debug_regs(struct kvm_vcpu *vcpu) +{ + vcpu_sys_reg(vcpu, MDSCR_EL1) = vcpu-arch.guest_debug_preserved.mdscr_el1; +} + +/** * kvm_arm_init_debug - grab what we need for debug * * Currently the sole task of this function is to retrieve the initial @@ -38,7 +66,6 @@ void kvm_arm_init_debug(void) __this_cpu_write(mdcr_el2, kvm_call_hyp(__kvm_get_mdcr_el2)); } - /** * kvm_arm_setup_debug - set up debug related stuff * @@ -73,12 +100,45 @@ void kvm_arm_setup_debug(struct kvm_vcpu *vcpu) if (trap_debug) vcpu-arch.mdcr_el2 |= MDCR_EL2_TDA; - /* Trap breakpoints? */ - if (vcpu-guest_debug KVM_GUESTDBG_USE_SW_BP) + /* Is Guest debugging in effect? */ + if (vcpu-guest_debug) { + /* Route all software debug exceptions to EL2 */ vcpu-arch.mdcr_el2 |= MDCR_EL2_TDE; + + /* Save guest debug state */ + save_guest_debug_regs(vcpu); + + /* +* Single Step (ARM ARM D2.12.3 The software step state +* machine) +* +* If we are doing Single Step we need to manipulate +* the guest's MDSCR_EL1.SS and PSTATE.SS. Once the +* step has
[PATCH v7 11/11] KVM: arm64: add trace points for guest_debug debug
This includes trace points for: kvm_arch_setup_guest_debug kvm_arch_clear_guest_debug I've also added some generic register setting trace events and also a trace point to dump the array of hardware registers. Signed-off-by: Alex Bennée alex.ben...@linaro.org --- v3 - add trace event for debug access. - remove short trace #define, rename trace events - use __print_array with fixed array instead of own func - rationalise trace points (only one per register changed) - add vcpu ptr to the debug_setup trace - remove :: in prints v4 - u32/u64 split on debug registers - fix for renames - add tracing of traps/set_guest_debug - remove handle_guest_debug trace v5 - minor print fmt fix - rm pstate traces v6 - fix merge conflicts - update control reg tracking to u64 (abi change) v7 - fix merge conflicts from ioctl move - fix other minor merge conflicts - fixes for the re-factored sys_regs code --- arch/arm64/kvm/debug.c| 35 - arch/arm64/kvm/guest.c| 4 ++ arch/arm64/kvm/sys_regs.c | 21 arch/arm64/kvm/trace.h| 123 ++ 4 files changed, 182 insertions(+), 1 deletion(-) diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c index 46b73d7..119107f 100644 --- a/arch/arm64/kvm/debug.c +++ b/arch/arm64/kvm/debug.c @@ -24,6 +24,8 @@ #include asm/kvm_arm.h #include asm/kvm_emulate.h +#include trace.h + /* These are the bits of MDSCR_EL1 we may manipulate */ #define MDSCR_EL1_DEBUG_MASK (DBG_MDSCR_SS | \ DBG_MDSCR_KDE | \ @@ -44,11 +46,17 @@ static DEFINE_PER_CPU(u32, mdcr_el2); static void save_guest_debug_regs(struct kvm_vcpu *vcpu) { vcpu-arch.guest_debug_preserved.mdscr_el1 = vcpu_sys_reg(vcpu, MDSCR_EL1); + + trace_kvm_arm_set_dreg32(Saved MDSCR_EL1, + vcpu-arch.guest_debug_preserved.mdscr_el1); } static void restore_guest_debug_regs(struct kvm_vcpu *vcpu) { vcpu_sys_reg(vcpu, MDSCR_EL1) = vcpu-arch.guest_debug_preserved.mdscr_el1; + + trace_kvm_arm_set_dreg32(Restored MDSCR_EL1, + vcpu_sys_reg(vcpu, MDSCR_EL1)); } /** @@ -99,6 +107,8 @@ void kvm_arm_setup_debug(struct kvm_vcpu *vcpu) { bool trap_debug = !(vcpu-arch.debug_flags KVM_ARM64_DEBUG_DIRTY); + trace_kvm_arm_setup_debug(vcpu, vcpu-guest_debug); + vcpu-arch.mdcr_el2 = __this_cpu_read(mdcr_el2) MDCR_EL2_HPMN_MASK; vcpu-arch.mdcr_el2 |= (MDCR_EL2_TPM | MDCR_EL2_TPMCR | @@ -140,6 +150,8 @@ void kvm_arm_setup_debug(struct kvm_vcpu *vcpu) vcpu_sys_reg(vcpu, MDSCR_EL1) = ~DBG_MDSCR_SS; } + trace_kvm_arm_set_dreg32(SPSR_EL2, *vcpu_cpsr(vcpu)); + /* * HW Breakpoints and watchpoints * @@ -156,6 +168,14 @@ void kvm_arm_setup_debug(struct kvm_vcpu *vcpu) vcpu-arch.debug_ptr = vcpu-arch.external_debug_state; vcpu-arch.debug_flags |= KVM_ARM64_DEBUG_DIRTY; trap_debug = true; + + trace_kvm_arm_set_regset(BKPTS, get_num_brps(), + vcpu-arch.debug_ptr-dbg_bcr[0], + vcpu-arch.debug_ptr-dbg_bvr[0]); + + trace_kvm_arm_set_regset(WAPTS, get_num_wrps(), + vcpu-arch.debug_ptr-dbg_wcr[0], + vcpu-arch.debug_ptr-dbg_wvr[0]); } } @@ -165,10 +185,15 @@ void kvm_arm_setup_debug(struct kvm_vcpu *vcpu) /* Trap debug register access */ if (trap_debug) vcpu-arch.mdcr_el2 |= MDCR_EL2_TDA; + + trace_kvm_arm_set_dreg32(MDCR_EL2, vcpu-arch.mdcr_el2); + trace_kvm_arm_set_dreg32(MDSCR_EL1, vcpu_sys_reg(vcpu, MDSCR_EL1)); } void kvm_arm_clear_debug(struct kvm_vcpu *vcpu) { + trace_kvm_arm_clear_debug(vcpu-guest_debug); + if (vcpu-guest_debug) { restore_guest_debug_regs(vcpu); @@ -176,8 +201,16 @@ void kvm_arm_clear_debug(struct kvm_vcpu *vcpu) * If we were using HW debug we need to restore the * debug_ptr to the guest debug state. */ - if (vcpu-guest_debug KVM_GUESTDBG_USE_HW) + if (vcpu-guest_debug KVM_GUESTDBG_USE_HW) { kvm_arm_reset_debug_ptr(vcpu); + trace_kvm_arm_set_regset(BKPTS, get_num_brps(), + vcpu-arch.debug_ptr-dbg_bcr[0], + vcpu-arch.debug_ptr-dbg_bvr[0]); + + trace_kvm_arm_set_regset(WAPTS, get_num_wrps(), + vcpu-arch.debug_ptr-dbg_wcr[0], +
[PATCH v7 08/11] KVM: arm64: introduce vcpu-arch.debug_ptr
This introduces a level of indirection for the debug registers. Instead of using the sys_regs[] directly we store registers in a structure in the vcpu. The new kvm_arm_reset_debug_ptr() sets the debug ptr to the guest context. This also entails updating the sys_regs code to access this new structure. New access function have been added for each set of debug registers. The generic functions are still used for the few registers stored in the main context. New access function pointers have been added to the sys_reg_desc structure to support the GET/SET_ONE_REG ioctl operations. Signed-off-by: Alex Bennée alex.ben...@linaro.org --- v6: - fix up some ws issues - correct clobber info - re-word commentary in kvm_host.h - fix endian access issues for aarch32 fields - revert all KVM_GET/SET_ONE_REG to 64bit (also see ABI update) v7 - new fn kvm_arm_reset_debug_ptr(), stubbed for arm - split trap fns into bcr,bvr,bcr,wvr and wxvr - add set/get fns to sys_regs_desc - reg_to_dbg/dbg_to_reg helpers for 32bit support --- arch/arm/include/asm/kvm_host.h | 2 +- arch/arm/kvm/arm.c| 2 + arch/arm64/include/asm/kvm_asm.h | 24 ++-- arch/arm64/include/asm/kvm_host.h | 17 ++- arch/arm64/kernel/asm-offsets.c | 6 + arch/arm64/kvm/debug.c| 9 ++ arch/arm64/kvm/hyp.S | 24 ++-- arch/arm64/kvm/sys_regs.c | 281 ++ arch/arm64/kvm/sys_regs.h | 6 + 9 files changed, 321 insertions(+), 50 deletions(-) diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h index 746c0c69..f42759b 100644 --- a/arch/arm/include/asm/kvm_host.h +++ b/arch/arm/include/asm/kvm_host.h @@ -239,5 +239,5 @@ static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {} static inline void kvm_arm_init_debug(void) {} static inline void kvm_arm_setup_debug(struct kvm_vcpu *vcpu) {} static inline void kvm_arm_clear_debug(struct kvm_vcpu *vcpu) {} - +static inline void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu) {} #endif /* __ARM_KVM_HOST_H__ */ diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c index af60e6f..525473f 100644 --- a/arch/arm/kvm/arm.c +++ b/arch/arm/kvm/arm.c @@ -279,6 +279,8 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu) /* Set up the timer */ kvm_timer_vcpu_init(vcpu); + kvm_arm_reset_debug_ptr(vcpu); + return 0; } diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h index d6b507e..e997404 100644 --- a/arch/arm64/include/asm/kvm_asm.h +++ b/arch/arm64/include/asm/kvm_asm.h @@ -46,24 +46,16 @@ #defineCNTKCTL_EL1 20 /* Timer Control Register (EL1) */ #definePAR_EL1 21 /* Physical Address Register */ #define MDSCR_EL1 22 /* Monitor Debug System Control Register */ -#define DBGBCR0_EL123 /* Debug Breakpoint Control Registers (0-15) */ -#define DBGBCR15_EL1 38 -#define DBGBVR0_EL139 /* Debug Breakpoint Value Registers (0-15) */ -#define DBGBVR15_EL1 54 -#define DBGWCR0_EL155 /* Debug Watchpoint Control Registers (0-15) */ -#define DBGWCR15_EL1 70 -#define DBGWVR0_EL171 /* Debug Watchpoint Value Registers (0-15) */ -#define DBGWVR15_EL1 86 -#define MDCCINT_EL187 /* Monitor Debug Comms Channel Interrupt Enable Reg */ +#define MDCCINT_EL123 /* Monitor Debug Comms Channel Interrupt Enable Reg */ /* 32bit specific registers. Keep them at the end of the range */ -#defineDACR32_EL2 88 /* Domain Access Control Register */ -#defineIFSR32_EL2 89 /* Instruction Fault Status Register */ -#defineFPEXC32_EL2 90 /* Floating-Point Exception Control Register */ -#defineDBGVCR32_EL291 /* Debug Vector Catch Register */ -#defineTEECR32_EL1 92 /* ThumbEE Configuration Register */ -#defineTEEHBR32_EL193 /* ThumbEE Handler Base Register */ -#defineNR_SYS_REGS 94 +#defineDACR32_EL2 24 /* Domain Access Control Register */ +#defineIFSR32_EL2 25 /* Instruction Fault Status Register */ +#defineFPEXC32_EL2 26 /* Floating-Point Exception Control Register */ +#defineDBGVCR32_EL227 /* Debug Vector Catch Register */ +#defineTEECR32_EL1 28 /* ThumbEE Configuration Register */ +#defineTEEHBR32_EL129 /* ThumbEE Handler Base Register */ +#defineNR_SYS_REGS 30 /* 32bit mapping */ #define c0_MPIDR (MPIDR_EL1 * 2) /* MultiProcessor ID Register */ diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index e2db6a6..461d288 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -108,11 +108,25 @@ struct kvm_vcpu_arch { /* Exception Information */ struct kvm_vcpu_fault_info fault; - /* Debug state */ + /*
[PATCH v7 10/11] KVM: arm64: enable KVM_CAP_SET_GUEST_DEBUG
Finally advertise the KVM capability for SET_GUEST_DEBUG. Once arm support is added this check can be moved to the common kvm_vm_ioctl_check_extension() code. Signed-off-by: Alex Bennée alex.ben...@linaro.org Acked-by: Christoffer Dall christoffer.d...@linaro.org --- v3: - separated capability check from previous patches - moved into arm64 specific ioctl handler. v4: - add a-b-tag --- arch/arm64/kvm/reset.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c index 21d5a62..88e5331 100644 --- a/arch/arm64/kvm/reset.c +++ b/arch/arm64/kvm/reset.c @@ -76,6 +76,9 @@ int kvm_arch_dev_ioctl_check_extension(long ext) case KVM_CAP_GUEST_DEBUG_HW_WPS: r = get_num_wrps(); break; + case KVM_CAP_SET_GUEST_DEBUG: + r = 1; + break; default: r = 0; } -- 2.4.5 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v7 09/11] KVM: arm64: guest debug, HW assisted debug support
This adds support for userspace to control the HW debug registers for guest debug. In the debug ioctl we copy an IMPDEF registers into a new register set called host_debug_state. We use the recently introduced vcpu parameter debug_ptr to select which register set is copied into the real registers when world switch occurs. I've made some helper functions from hw_breakpoint.c more widely available for re-use. As with single step we need to tweak the guest registers to enable the exceptions so we need to save and restore those bits. Two new capabilities have been added to the KVM_EXTENSION ioctl to allow userspace to query the number of hardware break and watch points available on the host hardware. Signed-off-by: Alex Bennée alex.ben...@linaro.org Reviewed-by: Christoffer Dall christoffer.d...@linaro.org --- v2 - switched to C setup - replace host debug registers directly into context - minor tweak to api docs - setup right register for debug - add FAR_EL2 to debug exit structure - add support for trapping debug register access v3 - remove stray trace statement - fix spacing around operators (various) - clean-up usage of trap_debug - introduce debug_ptr, replace excessive memcpy stuff - don't use memcpy in ioctl, just assign - update cap ioctl documentation - reword a number comments - rename host_debug_state-external_debug_state v4 - use the new u32/u64 split debug_ptr approach - fix some wording/comments v5 - don't set MDSCR_EL1.KDE (not needed) v6 - update wording given change in commentary - KVM_GUESTDBG_USE_HW_BP-KVM_GUESTDBG_USE_HW v7 - fix merge conflicts from ioctl move to guest.c - use kvm_arm_reset_debug_ptr to reset ptr - a BUG_ON() test has been added to trap failure to reset debug_ptr - debugging-debug in kvm_host.h comment - s/defined// s/to// in commit msg - rm ref to introducing debug_ptr in commit msg - add r-b tag --- Documentation/virtual/kvm/api.txt | 7 +- arch/arm64/include/asm/hw_breakpoint.h | 4 arch/arm64/include/asm/kvm_host.h | 6 - arch/arm64/kernel/hw_breakpoint.c | 4 ++-- arch/arm64/kvm/debug.c | 40 +- arch/arm64/kvm/guest.c | 7 ++ arch/arm64/kvm/handle_exit.c | 6 + arch/arm64/kvm/reset.c | 12 ++ arch/arm64/kvm/sys_regs.c | 3 --- include/uapi/linux/kvm.h | 2 ++ 10 files changed, 79 insertions(+), 12 deletions(-) diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt index 33c8143..ada57df 100644 --- a/Documentation/virtual/kvm/api.txt +++ b/Documentation/virtual/kvm/api.txt @@ -2668,7 +2668,7 @@ The top 16 bits of the control field are architecture specific control flags which can include the following: - KVM_GUESTDBG_USE_SW_BP: using software breakpoints [x86, arm64] - - KVM_GUESTDBG_USE_HW_BP: using hardware breakpoints [x86, s390] + - KVM_GUESTDBG_USE_HW_BP: using hardware breakpoints [x86, s390, arm64] - KVM_GUESTDBG_INJECT_DB: inject DB type exception [x86] - KVM_GUESTDBG_INJECT_BP: inject BP type exception [x86] - KVM_GUESTDBG_EXIT_PENDING: trigger an immediate guest exit [s390] @@ -2683,6 +2683,11 @@ updated to the correct (supplied) values. The second part of the structure is architecture specific and typically contains a set of debug registers. +For arm64 the number of debug registers is implementation defined and +can be determined by querying the KVM_CAP_GUEST_DEBUG_HW_BPS and +KVM_CAP_GUEST_DEBUG_HW_WPS capabilities which return a positive number +indicating the number of supported registers. + When debug events exit the main run loop with the reason KVM_EXIT_DEBUG with the kvm_debug_exit_arch part of the kvm_run structure containing architecture specific debug information. diff --git a/arch/arm64/include/asm/hw_breakpoint.h b/arch/arm64/include/asm/hw_breakpoint.h index 52b484b..9da2824 100644 --- a/arch/arm64/include/asm/hw_breakpoint.h +++ b/arch/arm64/include/asm/hw_breakpoint.h @@ -130,6 +130,10 @@ static inline void ptrace_hw_copy_thread(struct task_struct *task) } #endif +/* Determine number of BRP/WRP registers available. */ +extern int get_num_brps(void); +extern int get_num_wrps(void); + extern struct pmu perf_ops_bp; #endif /* __KERNEL__ */ diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 461d288..6c745e0 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -116,13 +116,17 @@ struct kvm_vcpu_arch { * debugging the guest from the host and to maintain separate host and * guest state during world switches. vcpu_debug_state are the debug * registers of the vcpu as the guest sees them. host_debug_state are -* the host registers which are saved and restored during world switches. +* the host
[PATCH v7 07/11] KVM: arm64: re-factor hyp.S debug register code
This is a pre-cursor to sharing the code with the guest debug support. This replaces the big macro that fishes data out of a fixed location with a more general helper macro to restore a set of debug registers. It uses macro substitution so it can be re-used for debug control and value registers. It does however rely on the debug registers being 64 bit aligned (as they happen to be in the hyp ABI). Signed-off-by: Alex Bennée alex.ben...@linaro.org Reviewed-by: Christoffer Dall christoffer.d...@linaro.org --- v3: - return to the patch series - add save and restore targets - change register use and document v4: - keep original setup/restore names - don't use split u32/u64 structure yet v6: - fix ws and clobber info in hyp.S v7: - fix whitespace - add r-b-tag --- arch/arm64/kvm/hyp.S | 517 ++- 1 file changed, 138 insertions(+), 379 deletions(-) diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S index 2c67a14..77c08df 100644 --- a/arch/arm64/kvm/hyp.S +++ b/arch/arm64/kvm/hyp.S @@ -228,199 +228,52 @@ stp x24, x25, [x3, #160] .endm -.macro save_debug - // x2: base address for cpu context - // x3: tmp register - - mrs x26, id_aa64dfr0_el1 - ubfxx24, x26, #12, #4 // Extract BRPs - ubfxx25, x26, #20, #4 // Extract WRPs - mov w26, #15 - sub w24, w26, w24 // How many BPs to skip - sub w25, w26, w25 // How many WPs to skip - - add x3, x2, #CPU_SYSREG_OFFSET(DBGBCR0_EL1) - - adr x26, 1f - add x26, x26, x24, lsl #2 - br x26 -1: - mrs x20, dbgbcr15_el1 - mrs x19, dbgbcr14_el1 - mrs x18, dbgbcr13_el1 - mrs x17, dbgbcr12_el1 - mrs x16, dbgbcr11_el1 - mrs x15, dbgbcr10_el1 - mrs x14, dbgbcr9_el1 - mrs x13, dbgbcr8_el1 - mrs x12, dbgbcr7_el1 - mrs x11, dbgbcr6_el1 - mrs x10, dbgbcr5_el1 - mrs x9, dbgbcr4_el1 - mrs x8, dbgbcr3_el1 - mrs x7, dbgbcr2_el1 - mrs x6, dbgbcr1_el1 - mrs x5, dbgbcr0_el1 - - adr x26, 1f - add x26, x26, x24, lsl #2 - br x26 - -1: - str x20, [x3, #(15 * 8)] - str x19, [x3, #(14 * 8)] - str x18, [x3, #(13 * 8)] - str x17, [x3, #(12 * 8)] - str x16, [x3, #(11 * 8)] - str x15, [x3, #(10 * 8)] - str x14, [x3, #(9 * 8)] - str x13, [x3, #(8 * 8)] - str x12, [x3, #(7 * 8)] - str x11, [x3, #(6 * 8)] - str x10, [x3, #(5 * 8)] - str x9, [x3, #(4 * 8)] - str x8, [x3, #(3 * 8)] - str x7, [x3, #(2 * 8)] - str x6, [x3, #(1 * 8)] - str x5, [x3, #(0 * 8)] - - add x3, x2, #CPU_SYSREG_OFFSET(DBGBVR0_EL1) - - adr x26, 1f - add x26, x26, x24, lsl #2 - br x26 -1: - mrs x20, dbgbvr15_el1 - mrs x19, dbgbvr14_el1 - mrs x18, dbgbvr13_el1 - mrs x17, dbgbvr12_el1 - mrs x16, dbgbvr11_el1 - mrs x15, dbgbvr10_el1 - mrs x14, dbgbvr9_el1 - mrs x13, dbgbvr8_el1 - mrs x12, dbgbvr7_el1 - mrs x11, dbgbvr6_el1 - mrs x10, dbgbvr5_el1 - mrs x9, dbgbvr4_el1 - mrs x8, dbgbvr3_el1 - mrs x7, dbgbvr2_el1 - mrs x6, dbgbvr1_el1 - mrs x5, dbgbvr0_el1 - - adr x26, 1f - add x26, x26, x24, lsl #2 - br x26 - -1: - str x20, [x3, #(15 * 8)] - str x19, [x3, #(14 * 8)] - str x18, [x3, #(13 * 8)] - str x17, [x3, #(12 * 8)] - str x16, [x3, #(11 * 8)] - str x15, [x3, #(10 * 8)] - str x14, [x3, #(9 * 8)] - str x13, [x3, #(8 * 8)] - str x12, [x3, #(7 * 8)] - str x11, [x3, #(6 * 8)] - str x10, [x3, #(5 * 8)] - str x9, [x3, #(4 * 8)] - str x8, [x3, #(3 * 8)] - str x7, [x3, #(2 * 8)] - str x6, [x3, #(1 * 8)] - str x5, [x3, #(0 * 8)] - - add x3, x2, #CPU_SYSREG_OFFSET(DBGWCR0_EL1) - - adr x26, 1f - add x26, x26, x25, lsl #2 - br x26 +.macro save_debug type + // x4: pointer to register set + // x5: number of registers to skip + // x6..x22 trashed + + adr x22, 1f + add x22, x22, x5, lsl #2 + br x22 1: - mrs x20, dbgwcr15_el1 - mrs x19, dbgwcr14_el1 - mrs x18, dbgwcr13_el1 - mrs x17, dbgwcr12_el1 - mrs x16, dbgwcr11_el1 - mrs x15, dbgwcr10_el1 - mrs x14, dbgwcr9_el1 - mrs x13, dbgwcr8_el1 - mrs x12, dbgwcr7_el1 - mrs x11, dbgwcr6_el1 - mrs x10, dbgwcr5_el1 - mrs x9, dbgwcr4_el1 - mrs x8,
[PATCH v7 00/11] KVM Guest Debug support for arm64
Here is V7 of the KVM Guest Debug support for arm64. The fixes are fairly minor aside from the re-factoring of sys_regs.c to have individual trap functions for each debug register. There is a lot of boiler plate but it does make the ugliness of the previous offset hacks go away. On top of that I've fixed some build failures on v7 which were not apparent on my defconfig build. I've also been helped with kernelci.org doing the heavy lifting for me: http://kernelci.org/boot/all/job/alex/ For full details see the changelog on each of the patches. GIT Repos: The patches for this series are based off v4.1 and can be found at: Kernel: https://git.linaro.org/people/alex.bennee/linux.git branch: guest-debug/4.1-v7 describe: v4.1-11-g2a10438 QEMU: https://github.com/stsquad/qemu branch: kvm/guest-debug-v6 Alex Bennée (11): KVM: add comments for kvm_debug_exit_arch struct KVM: arm64: guest debug, define API headers KVM: arm: guest debug, add stub KVM_SET_GUEST_DEBUG ioctl KVM: arm: introduce kvm_arm_init/setup/clear_debug KVM: arm64: guest debug, add SW break point support KVM: arm64: guest debug, add support for single-step KVM: arm64: re-factor hyp.S debug register code KVM: arm64: introduce vcpu-arch.debug_ptr KVM: arm64: guest debug, HW assisted debug support KVM: arm64: enable KVM_CAP_SET_GUEST_DEBUG KVM: arm64: add trace points for guest_debug debug Documentation/virtual/kvm/api.txt | 15 +- arch/arm/include/asm/kvm_host.h| 4 + arch/arm/kvm/arm.c | 18 +- arch/arm/kvm/guest.c | 6 + arch/arm64/include/asm/hw_breakpoint.h | 4 + arch/arm64/include/asm/kvm_asm.h | 26 +- arch/arm64/include/asm/kvm_host.h | 37 ++- arch/arm64/include/uapi/asm/kvm.h | 27 ++ arch/arm64/kernel/asm-offsets.c| 7 + arch/arm64/kernel/hw_breakpoint.c | 4 +- arch/arm64/kvm/Makefile| 2 +- arch/arm64/kvm/debug.c | 216 + arch/arm64/kvm/guest.c | 40 +++ arch/arm64/kvm/handle_exit.c | 44 +++ arch/arm64/kvm/hyp.S | 544 ++--- arch/arm64/kvm/reset.c | 15 + arch/arm64/kvm/sys_regs.c | 299 -- arch/arm64/kvm/sys_regs.h | 6 + arch/arm64/kvm/trace.h | 123 include/uapi/linux/kvm.h | 5 + 20 files changed, 996 insertions(+), 446 deletions(-) create mode 100644 arch/arm64/kvm/debug.c -- 2.4.5 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v7 03/11] KVM: arm: guest debug, add stub KVM_SET_GUEST_DEBUG ioctl
This commit adds a stub function to support the KVM_SET_GUEST_DEBUG ioctl. Any unsupported flag will return -EINVAL. For now, only KVM_GUESTDBG_ENABLE is supported, although it won't have any effects. Signed-off-by: Alex Bennée alex.ben...@linaro.org. Reviewed-by: Christoffer Dall christoffer.d...@linaro.org --- v2 - simplified form of the ioctl (stuff will go into setup_debug) v3 - KVM_GUESTDBG_VALID-KVM_GUESTDBG_VALID_MASK - move mask check to the top of function - add ioctl doc header - split capability into separate patch - tweaked commit wording w.r.t return of -EINVAL v4 - add r-b-tag v7 - moved ioctl to arm64/kvm/guest.c, stubbed arm/kvm/guest.c --- Documentation/virtual/kvm/api.txt | 2 +- arch/arm/kvm/arm.c| 7 --- arch/arm/kvm/guest.c | 6 ++ arch/arm64/kvm/guest.c| 27 +++ 4 files changed, 34 insertions(+), 8 deletions(-) diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt index c34c32d..ba635c7 100644 --- a/Documentation/virtual/kvm/api.txt +++ b/Documentation/virtual/kvm/api.txt @@ -2645,7 +2645,7 @@ handled. 4.87 KVM_SET_GUEST_DEBUG Capability: KVM_CAP_SET_GUEST_DEBUG -Architectures: x86, s390, ppc +Architectures: x86, s390, ppc, arm64 Type: vcpu ioctl Parameters: struct kvm_guest_debug (in) Returns: 0 on success; -1 on error diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c index d9631ec..92b80bc 100644 --- a/arch/arm/kvm/arm.c +++ b/arch/arm/kvm/arm.c @@ -302,13 +302,6 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu) kvm_arm_set_running_vcpu(NULL); } -int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu, - struct kvm_guest_debug *dbg) -{ - return -EINVAL; -} - - int kvm_arch_vcpu_ioctl_get_mpstate(struct kvm_vcpu *vcpu, struct kvm_mp_state *mp_state) { diff --git a/arch/arm/kvm/guest.c b/arch/arm/kvm/guest.c index d503fbb..96e935b 100644 --- a/arch/arm/kvm/guest.c +++ b/arch/arm/kvm/guest.c @@ -290,3 +290,9 @@ int kvm_arch_vcpu_ioctl_translate(struct kvm_vcpu *vcpu, { return -EINVAL; } + +int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu, + struct kvm_guest_debug *dbg) +{ + return -EINVAL; +} diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c index 9535bd5..0ba8677 100644 --- a/arch/arm64/kvm/guest.c +++ b/arch/arm64/kvm/guest.c @@ -331,3 +331,30 @@ int kvm_arch_vcpu_ioctl_translate(struct kvm_vcpu *vcpu, { return -EINVAL; } + +#define KVM_GUESTDBG_VALID_MASK (KVM_GUESTDBG_ENABLE) + +/** + * kvm_arch_vcpu_ioctl_set_guest_debug - set up guest debugging + * @kvm: pointer to the KVM struct + * @kvm_guest_debug: the ioctl data buffer + * + * This sets up and enables the VM for guest debugging. Userspace + * passes in a control flag to enable different debug types and + * potentially other architecture specific information in the rest of + * the structure. + */ +int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu, + struct kvm_guest_debug *dbg) +{ + if (dbg-control ~KVM_GUESTDBG_VALID_MASK) + return -EINVAL; + + if (dbg-control KVM_GUESTDBG_ENABLE) { + vcpu-guest_debug = dbg-control; + } else { + /* If not enabled clear all flags */ + vcpu-guest_debug = 0; + } + return 0; +} -- 2.4.5 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/1] KVM: s390: virtio-ccw: Fix config space values
On Wed, Jul 01, 2015 at 04:05:27PM +0200, Paolo Bonzini wrote: On 01/07/2015 15:45, Michael S. Tsirkin wrote: Paolo, here is fix targetted for kvm/master (4.2) that fixes an issue with virtio config space on s390. It mostly manifests in vhost-scsi not working properly on s390. The problem itself might affect other things as well so cc stable/target 4.2. @Michael FYI, sending this via Paolo as most virtio-ccw kernel things went this way. OK but virtio patches should be Cc'd to the virtualization mailing list. So I think we need a separate MAINTAINERS entry for s390/virtio. See my other email---I think no special case is necessary. Paolo Hmm but MAINTAINERS doesn't tell people they should Cc virtio ML - isn't that a problem? -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/1] KVM: s390: virtio-ccw: don't overwrite config space values
On 29/06/2015 16:44, Christian Borntraeger wrote: From: Cornelia Huck cornelia.h...@de.ibm.com Eric noticed problems with vhost-scsi and virtio-ccw: vhost-scsi complained about overwriting values in the config space, which was triggered by a broken implementation of virtio-ccw's config get/set routines. It was probably sheer luck that we did not hit this before. When writing a value to the config space, the WRITE_CONF ccw will always write from the beginning of the config space up to and including the value to be set. If the config space up to the value has not yet been retrieved from the device, however, we'll end up overwriting values. Keep track of the known config space and update if needed to avoid this. Moreover, READ_CONF will only read the number of bytes it has been instructed to retrieve, so we must not copy more than that to the buffer, or we might overwrite trailing values. Reported-by: Eric Farman far...@linux.vnet.ibm.com Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com Reviewed-by: Eric Farman far...@linux.vnet.ibm.com Tested-by: Eric Farman far...@linux.vnet.ibm.com Signed-off-by: Christian Borntraeger borntrae...@de.ibm.com Cc: sta...@vger.kernel.org --- drivers/s390/kvm/virtio_ccw.c | 11 +-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/drivers/s390/kvm/virtio_ccw.c b/drivers/s390/kvm/virtio_ccw.c index 6f1fa17..f8d8fdb 100644 --- a/drivers/s390/kvm/virtio_ccw.c +++ b/drivers/s390/kvm/virtio_ccw.c @@ -65,6 +65,7 @@ struct virtio_ccw_device { bool is_thinint; bool going_away; bool device_lost; + unsigned int config_ready; void *airq_info; }; @@ -833,8 +834,11 @@ static void virtio_ccw_get_config(struct virtio_device *vdev, if (ret) goto out_free; - memcpy(vcdev-config, config_area, sizeof(vcdev-config)); - memcpy(buf, vcdev-config[offset], len); + memcpy(vcdev-config, config_area, offset + len); + if (buf) + memcpy(buf, vcdev-config[offset], len); + if (vcdev-config_ready offset + len) + vcdev-config_ready = offset + len; out_free: kfree(config_area); @@ -857,6 +861,9 @@ static void virtio_ccw_set_config(struct virtio_device *vdev, if (!config_area) goto out_free; + /* Make sure we don't overwrite fields. */ + if (vcdev-config_ready offset) + virtio_ccw_get_config(vdev, 0, NULL, offset); memcpy(vcdev-config[offset], buf, len); /* Write the config area to the host. */ memcpy(config_area, vcdev-config, sizeof(vcdev-config)); Applied (but I think in general virtio-ccw patches should go through mst---the exception is when matching changes to KVM are needed, and of course the exception was almost always the rule during bringup). Thanks, Paolo -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/5] PPC: Current patch queue for HV KVM
On 24.06.15 13:18, Paul Mackerras wrote: This is my current queue of patches for HV KVM. This series is based on the kvm next branch. They have all been posted 6 weeks ago or more, though I have just added a 3-line fix to patch 2/5 to fix a bug that we found in testing migration, and I expanded a comment (no code change) in patch 3/5 following a suggestion by Aneesh. I'd like to see these go into 4.2 if possible. Thanks, applied all to kvm-ppc-queue. Alex -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/5] PPC: Current patch queue for HV KVM
On 24.06.15 13:18, Paul Mackerras wrote: This is my current queue of patches for HV KVM. This series is based on the kvm next branch. They have all been posted 6 weeks ago or more, though I have just added a 3-line fix to patch 2/5 to fix a bug that we found in testing migration, and I expanded a comment (no code change) in patch 3/5 following a suggestion by Aneesh. I'd like to see these go into 4.2 if possible. Thanks, applied all to kvm-ppc-queue. Alex -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] KVM: x86: remove data variable from kvm_get_msr_common
On 29/06/2015 12:39, Nicolas Iooss wrote: Commit 609e36d372ad (KVM: x86: pass host_initiated to functions that read MSRs) modified kvm_get_msr_common function to use msr_info-data instead of data but missed one occurrence. Replace it and remove the unused local variable. Fixes: 609e36d372ad (KVM: x86: pass host_initiated to functions that read MSRs) Signed-off-by: Nicolas Iooss nicolas.iooss_li...@m4x.org --- arch/x86/kvm/x86.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index ac165c2fb8e5..bbaf44e8f0d3 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -2379,8 +2379,6 @@ static int get_msr_hyperv(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata) int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) { - u64 data; - switch (msr_info-index) { case MSR_IA32_PLATFORM_ID: case MSR_IA32_EBL_CR_POWERON: @@ -2453,7 +2451,7 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) /* TSC increment by tick */ msr_info-data = 1000ULL; /* CPU multiplier */ - data |= (((uint64_t)4ULL) 40); + msr_info-data |= (((uint64_t)4ULL) 40); break; case MSR_EFER: msr_info-data = vcpu-arch.efer; Applied, thanks. Paolo -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 1/2] arm64: KVM: Optimize arm64 skip 30-50% vfp/simd save/restore on exits
On Wed, Jun 24, 2015 at 05:04:11PM -0700, Mario Smarduch wrote: This patch only saves and restores FP/SIMD registers on Guest access. To do this cptr_el2 FP/SIMD trap is set on Guest entry and later checked on exit. lmbench, hackbench show significant improvements, for 30-50% exits FP/SIMD context is not saved/restored Signed-off-by: Mario Smarduch m.smard...@samsung.com --- arch/arm64/include/asm/kvm_arm.h |5 - arch/arm64/kvm/hyp.S | 46 +++--- 2 files changed, 47 insertions(+), 4 deletions(-) diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h index ac6fafb..7605e09 100644 --- a/arch/arm64/include/asm/kvm_arm.h +++ b/arch/arm64/include/asm/kvm_arm.h @@ -171,10 +171,13 @@ #define HSTR_EL2_TTEE(1 16) #define HSTR_EL2_T(x)(1 x) +/* Hyp Coproccessor Trap Register Shifts */ +#define CPTR_EL2_TFP_SHIFT 10 + /* Hyp Coprocessor Trap Register */ #define CPTR_EL2_TCPAC (1 31) #define CPTR_EL2_TTA (1 20) -#define CPTR_EL2_TFP (1 10) +#define CPTR_EL2_TFP (1 CPTR_EL2_TFP_SHIFT) /* Hyp Debug Configuration Register bits */ #define MDCR_EL2_TDRA(1 11) diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S index 5befd01..de0788f 100644 --- a/arch/arm64/kvm/hyp.S +++ b/arch/arm64/kvm/hyp.S @@ -673,6 +673,15 @@ tbz \tmp, #KVM_ARM64_DEBUG_DIRTY_SHIFT, \target .endm +/* + * Check cptr VFP/SIMD accessed bit, if set VFP/SIMD not accessed by guest. This comment doesn't really help me understand the function, may I suggest: Branch to target if CPTR_EL2.TFP bit is set (VFP/SIMD trapping enabled) + */ +.macro skip_fpsimd_state tmp, target + mrs \tmp, cptr_el2 + tbnz\tmp, #CPTR_EL2_TFP_SHIFT, \target +.endm + + .macro compute_debug_state target // Compute debug state: If any of KDE, MDE or KVM_ARM64_DEBUG_DIRTY // is set, we do a full save/restore cycle and disable trapping. @@ -763,6 +772,7 @@ ldr x2, [x0, #VCPU_HCR_EL2] msr hcr_el2, x2 mov x2, #CPTR_EL2_TTA + orr x2, x2, #CPTR_EL2_TFP msr cptr_el2, x2 mov x2, #(1 15) // Trap CP15 Cr=15 @@ -785,7 +795,6 @@ .macro deactivate_traps mov x2, #HCR_RW msr hcr_el2, x2 - msr cptr_el2, xzr msr hstr_el2, xzr mrs x2, mdcr_el2 @@ -912,6 +921,28 @@ __restore_fpsimd: restore_fpsimd ret +switch_to_guest_fpsimd: + pushx4, lr + + mrs x2, cptr_el2 + bic x2, x2, #CPTR_EL2_TFP + msr cptr_el2, x2 + + mrs x0, tpidr_el2 + + ldr x2, [x0, #VCPU_HOST_CONTEXT] + kern_hyp_va x2 + bl __save_fpsimd + + add x2, x0, #VCPU_CONTEXT + bl __restore_fpsimd + + pop x4, lr + pop x2, x3 + pop x0, x1 + + eret + /* * u64 __kvm_vcpu_run(struct kvm_vcpu *vcpu); * @@ -932,7 +963,6 @@ ENTRY(__kvm_vcpu_run) kern_hyp_va x2 save_host_regs - bl __save_fpsimd bl __save_sysregs compute_debug_state 1f @@ -948,7 +978,6 @@ ENTRY(__kvm_vcpu_run) add x2, x0, #VCPU_CONTEXT bl __restore_sysregs - bl __restore_fpsimd skip_debug_state x3, 1f bl __restore_debug @@ -967,7 +996,9 @@ __kvm_vcpu_return: add x2, x0, #VCPU_CONTEXT save_guest_regs + skip_fpsimd_state x3, 1f bl __save_fpsimd +1: bl __save_sysregs skip_debug_state x3, 1f @@ -986,7 +1017,11 @@ __kvm_vcpu_return: kern_hyp_va x2 bl __restore_sysregs + skip_fpsimd_state x3, 1f bl __restore_fpsimd +1: + /* Clear FPSIMD and Trace trapping */ + msr cptr_el2, xzr why not simply move the deactivate_traps down here instead? skip_debug_state x3, 1f // Clear the dirty flag for the next run, as all the state has @@ -1201,6 +1236,11 @@ el1_trap: * x1: ESR * x2: ESR_EC */ + + /* Guest accessed VFP/SIMD registers, save host, restore Guest */ + cmp x2, #ESR_ELx_EC_FP_ASIMD + b.eqswitch_to_guest_fpsimd + cmp x2, #ESR_ELx_EC_DABT_LOW mov x0, #ESR_ELx_EC_IABT_LOW ccmpx2, x0, #4, ne -- 1.7.9.5 Otherwise looks good, -Christoffer -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 0/9] HyperV equivalent of pvpanic driver
On 30/06/2015 13:33, Denis V. Lunev wrote: Windows 2012 guests can notify hypervisor about occurred guest crash (Windows bugcheck(BSOD)) by writing specific Hyper-V msrs. This patch does handling of this MSR's by KVM and sending notification to user space that allows to gather Windows guest crash dump by QEMU/LIBVIRT. The idea is to provide functionality equal to pvpanic device without QEMU guest agent for Windows. The idea is borrowed from Linux HyperV bus driver and validated against Windows 2k12. Changes from v2: * forbid modification crash ctl msr by guest * qemu_system_guest_panicked usage in pvpanic and s390x * hyper-v crash handler move from generic kvm to i386 * hyper-v crash handler: skip fetching crash msrs just mark crash occured * sync with linux-next 20150629 * patch 11 squashed to patch 10 * patch 9 squashed to patch 7 Changes from v1: * hyperv code move to hyperv.c * added read handlers of crash data msrs * added per vm and per cpu hyperv context structures * added saving crash msrs inside qemu cpu state * added qemu fetch and update of crash msrs * added qemu crash msrs store in cpu state and it's migration Signed-off-by: Andrey Smetanin asmeta...@virtuozzo.com Signed-off-by: Denis V. Lunev d...@openvz.org CC: Gleb Natapov g...@kernel.org CC: Paolo Bonzini pbonz...@redhat.com The patches look good, thanks. I'll queue them as soon as I start merging 4.3 features. Paolo -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/1] KVM: s390: virtio-ccw: Fix config space values
On 01/07/2015 15:45, Michael S. Tsirkin wrote: Paolo, here is fix targetted for kvm/master (4.2) that fixes an issue with virtio config space on s390. It mostly manifests in vhost-scsi not working properly on s390. The problem itself might affect other things as well so cc stable/target 4.2. @Michael FYI, sending this via Paolo as most virtio-ccw kernel things went this way. OK but virtio patches should be Cc'd to the virtualization mailing list. So I think we need a separate MAINTAINERS entry for s390/virtio. See my other email---I think no special case is necessary. Paolo -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/3] KVM: x86: legacy NMI watchdog fixes
On 30/06/2015 22:19, Radim Krčmář wrote: Until v2.6.37, Linux used NMI watchdog that utilized IO-APIC and LVT0. This series fixes some problems with APICv, restore, and concurrency while keeping the monster asleep. Queued for 4.2. Paolo -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/3] KVM: x86: make vapics_in_nmi_mode atomic
On 30/06/2015 22:19, Radim Krčmář wrote: Writes were a bit racy, but hard to turn into a bug at the same time. (Particularly because modern Linux doesn't use this feature anymore.) I suspect patch 2 makes this race much easier to trigger, so it deserves Cc: stable@ as well. Paolo Signed-off-by: Radim Krčmář rkrc...@redhat.com --- arch/x86/include/asm/kvm_host.h | 2 +- arch/x86/kvm/i8254.c| 2 +- arch/x86/kvm/lapic.c| 4 ++-- 3 files changed, 4 insertions(+), 4 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index c7fa57b529d2..2a7f5d782c33 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -607,7 +607,7 @@ struct kvm_arch { struct kvm_pic *vpic; struct kvm_ioapic *vioapic; struct kvm_pit *vpit; - int vapics_in_nmi_mode; + atomic_t vapics_in_nmi_mode; struct mutex apic_map_lock; struct kvm_apic_map *apic_map; diff --git a/arch/x86/kvm/i8254.c b/arch/x86/kvm/i8254.c index 4dce6f8b6129..f90952f64e79 100644 --- a/arch/x86/kvm/i8254.c +++ b/arch/x86/kvm/i8254.c @@ -305,7 +305,7 @@ static void pit_do_work(struct kthread_work *work) * LVT0 to NMI delivery. Other PIC interrupts are just sent to * VCPU0, and only if its LVT0 is in EXTINT mode. */ - if (kvm-arch.vapics_in_nmi_mode 0) + if (atomic_read(kvm-arch.vapics_in_nmi_mode) 0) kvm_for_each_vcpu(i, vcpu, kvm) kvm_apic_nmi_wd_deliver(vcpu); } diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c index 8dc32b5a4e0d..954e98a8c2e3 100644 --- a/arch/x86/kvm/lapic.c +++ b/arch/x86/kvm/lapic.c @@ -1264,9 +1264,9 @@ static void apic_manage_nmi_watchdog(struct kvm_lapic *apic, u32 lvt0_val) if (lvt0_in_nmi_mode) { apic_debug(Receive NMI setting on APIC_LVT0 for cpu %d\n, apic-vcpu-vcpu_id); - apic-vcpu-kvm-arch.vapics_in_nmi_mode++; + atomic_inc(apic-vcpu-kvm-arch.vapics_in_nmi_mode); } else - apic-vcpu-kvm-arch.vapics_in_nmi_mode--; + atomic_dec(apic-vcpu-kvm-arch.vapics_in_nmi_mode); } } -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/1] KVM: s390: virtio-ccw: Fix config space values
On 01/07/2015 16:18, Michael S. Tsirkin wrote: On Wed, Jul 01, 2015 at 04:05:27PM +0200, Paolo Bonzini wrote: On 01/07/2015 15:45, Michael S. Tsirkin wrote: Paolo, here is fix targetted for kvm/master (4.2) that fixes an issue with virtio config space on s390. It mostly manifests in vhost-scsi not working properly on s390. The problem itself might affect other things as well so cc stable/target 4.2. @Michael FYI, sending this via Paolo as most virtio-ccw kernel things went this way. OK but virtio patches should be Cc'd to the virtualization mailing list. So I think we need a separate MAINTAINERS entry for s390/virtio. See my other email---I think no special case is necessary. Paolo Hmm but MAINTAINERS doesn't tell people they should Cc virtio ML - isn't that a problem? Ah that's because ccw isn't under drivers/virtio. Yes, that should be fixed and the old pre-ccw drivers should also get a stanza in MAINTAINERS. Paolo -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/1] KVM: s390: virtio-ccw: Fix config space values
On Mon, Jun 29, 2015 at 04:44:00PM +0200, Christian Borntraeger wrote: Paolo, here is fix targetted for kvm/master (4.2) that fixes an issue with virtio config space on s390. It mostly manifests in vhost-scsi not working properly on s390. The problem itself might affect other things as well so cc stable/target 4.2. @Michael FYI, sending this via Paolo as most virtio-ccw kernel things went this way. OK but virtio patches should be Cc'd to the virtualization mailing list. So I think we need a separate MAINTAINERS entry for s390/virtio. Cornelia Huck (1): KVM: s390: virtio-ccw: don't overwrite config space values drivers/s390/kvm/virtio_ccw.c | 11 +-- 1 file changed, 9 insertions(+), 2 deletions(-) -- 2.3.0 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 2/2] arm: KVM: keep arm vfp/simd exit handling consistent with arm64
On Wed, Jun 24, 2015 at 05:04:12PM -0700, Mario Smarduch wrote: After enhancing arm64 FP/SIMD exit handling, ARMv7 VFP exit branch is moved to guest trap handling. This allows us to keep exit handling flow between both architectures consistent. Signed-off-by: Mario Smarduch m.smard...@samsung.com --- arch/arm/kvm/interrupts.S | 14 -- 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/arch/arm/kvm/interrupts.S b/arch/arm/kvm/interrupts.S index 79caf79..b245b4e 100644 --- a/arch/arm/kvm/interrupts.S +++ b/arch/arm/kvm/interrupts.S @@ -363,10 +363,6 @@ hyp_hvc: @ Check syndrome register mrc p15, 4, r1, c5, c2, 0 @ HSR lsr r0, r1, #HSR_EC_SHIFT -#ifdef CONFIG_VFPv3 - cmp r0, #HSR_EC_CP_0_13 - beq switch_to_guest_vfp -#endif cmp r0, #HSR_EC_HVC bne guest_trap @ Not HVC instr. @@ -380,7 +376,10 @@ hyp_hvc: cmp r2, #0 bne guest_trap @ Guest called HVC -host_switch_to_hyp: + /* + * Getting here means host called HVC, we shift parameters and branch + * to Hyp function. + */ pop {r0, r1, r2} /* Check for __hyp_get_vectors */ @@ -411,6 +410,10 @@ guest_trap: @ Check if we need the fault information lsr r1, r1, #HSR_EC_SHIFT +#ifdef CONFIG_VFPv3 + cmp r1, #HSR_EC_CP_0_13 + beq switch_to_guest_vfp +#endif cmp r1, #HSR_EC_IABT mrceq p15, 4, r2, c6, c0, 2 @ HIFAR beq 2f @@ -479,7 +482,6 @@ guest_trap: */ #ifdef CONFIG_VFPv3 switch_to_guest_vfp: - load_vcpu @ Load VCPU pointer to r0 push{r3-r7} @ NEON/VFP used. Turn on VFP access. -- 1.7.9.5 Reviewed-by: Christoffer Dall christoffer.d...@linaro.org -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/3] KVM: x86: properly restore LVT0
On 30/06/2015 22:19, Radim Krčmář wrote: Legacy NMI watchdog didn't work after migration/resume, because vapics_in_nmi_mode was left at 0. Signed-off-by: Radim Krčmář rkrc...@redhat.com --- arch/x86/kvm/lapic.c | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c index f49c7cca1de6..8dc32b5a4e0d 100644 --- a/arch/x86/kvm/lapic.c +++ b/arch/x86/kvm/lapic.c @@ -1824,6 +1824,7 @@ void kvm_apic_post_state_restore(struct kvm_vcpu *vcpu, apic_update_ppr(apic); hrtimer_cancel(apic-lapic_timer.timer); apic_update_lvtt(apic); + apic_manage_nmi_watchdog(apic, kvm_apic_get_reg(apic, APIC_LVT0)); update_divide_count(apic); start_apic_timer(apic); apic-irr_pending = true; Applied already, with Cc: stable, as it is not related to APICv. Paolo -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PULL] virtio/vhost: cross endian support
The following changes since commit 8a7b19d8b542b87bccc3eaaf81dcc90a5ca48aea: include/uapi/linux/virtio_balloon.h: include linux/virtio_types.h (2015-06-01 15:46:54 +0200) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git tags/for_linus for you to fetch changes up to 59a5b0f7bf74f88da6670bcbf924d8cc1e75b1ee: virtio-pci: alloc only resources actually used. (2015-06-24 08:15:09 +0200) virtio/vhost: cross endian support I have just queued some more bugfix patches today but none fix regressions and none are related to these ones, so it looks like a good time for a merge for -rc1. Signed-off-by: Michael S. Tsirkin m...@redhat.com Gerd Hoffmann (1): virtio-pci: alloc only resources actually used. Greg Kurz (8): virtio: introduce virtio_is_little_endian() helper tun: add tun_is_little_endian() helper macvtap: introduce macvtap_is_little_endian() helper vringh: introduce vringh_is_little_endian() helper vhost: introduce vhost_is_little_endian() helper virtio: add explicit big-endian support to memory accessors vhost: cross-endian support for legacy devices macvtap/tun: cross-endian support for little-endian hosts drivers/vhost/vhost.h | 25 --- drivers/virtio/virtio_pci_common.h | 2 + include/linux/virtio_byteorder.h | 24 ++- include/linux/virtio_config.h | 18 +--- include/linux/vringh.h | 18 +--- include/uapi/linux/if_tun.h| 6 +++ include/uapi/linux/vhost.h | 14 +++ drivers/net/macvtap.c | 65 - drivers/net/tun.c | 67 +- drivers/vhost/vhost.c | 85 +- drivers/virtio/virtio_pci_common.c | 7 drivers/virtio/virtio_pci_legacy.c | 13 +- drivers/virtio/virtio_pci_modern.c | 24 --- drivers/net/Kconfig| 14 +++ drivers/vhost/Kconfig | 15 +++ 15 files changed, 350 insertions(+), 47 deletions(-) -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 0/2] arm/arm64: KVM: Optimize arm64 fp/simd, saves 30-50% on exits
On Wed, Jun 24, 2015 at 05:04:10PM -0700, Mario Smarduch wrote: Currently we save/restore fp/simd on each exit. Fist patch optimizes arm64 save/restore, we only do so on Guest access. hackbench and several lmbench tests show anywhere from 30% to above 50% optimzation achieved. In second patch 32-bit handler is updated to keep exit handling consistent with 64-bit code. 30-50% of what? The overhead or overall performance? Changes since v1: - Addressed Marcs comments - Verified optimization improvements with lmbench and hackbench, updated commit message Changes since v2: - only for patch 2/2 - Reworked trapping to vfp access handler Changes since v3: - Only for patch 2/2 - Removed load_vcpu in switch_to_guest_vfp per Marcs comment - Got another chance to replace an unreferenced label with a comment Mario Smarduch (2): Optimize arm64 skip 30-50% vfp/simd save/restore on exits keep arm vfp/simd exit handling consistent with arm64 arch/arm/kvm/interrupts.S| 14 +++- arch/arm64/include/asm/kvm_arm.h |5 - arch/arm64/kvm/hyp.S | 46 +++--- 3 files changed, 55 insertions(+), 10 deletions(-) -- 1.7.9.5 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 05/10] KVM: arm/arm64: vgic: Relax vgic_can_sample_irq for edge IRQs
On Wed, Jul 01, 2015 at 10:17:52AM +0100, Marc Zyngier wrote: On 30/06/15 21:19, Christoffer Dall wrote: On Mon, Jun 08, 2015 at 06:04:00PM +0100, Marc Zyngier wrote: We only set the irq_queued flag for level interrupts, meaning that !vgic_irq_is_queued(vcpu, irq) is a good enough predicate for all interrupts. This will allow us to inject edge HW interrupts, for which the state ACTIVE+PENDING is not allowed. I don't understand this; ACTIVE+PENDING is allowed for edge interrupts. Do you mean that if we set the HW bit in the LR, then we are linking to an HW interrupt where we don't allow that to be ACTIVE+PENDING on the HW GIC side? Why is this relevant here? I feel like I'm missing context. I've probably taken a shortcut here - bear with me while I'm trying to explain the issue. For HW interrupts, we shouldn't even try to use the state bits in the LR, because that state is contained in the physical distributor. Setting the HW bit really means there is something going on at the distributor level, just go there. ok, so by HW interrupts you mean virtual interrupts with the HW bit in the LR set, correct? If we were to inject a ACTIVE+PENDING interrupt at the LR level, we'd basically loose the second interrupt because that state is simply not considered. Huh? Which second interrupt. I looked at the spec and it says don't use the state bits for HW interrupts, so isn't it simply not supported to set these bits at all and that's it? So the trick we're using is to only inject the active interrupt, and prevent anything else from being injected until we can confirm that the active state has been cleared at the physical level. Does it make any sense? Sort of, but what I don't understand now is how the guest ever sees the interrupt then. If we always inject the virtual interrupt by setting the active state on the physical distributor, and we can't inject this as active+pending, and the guest doesn't see the state in the LR, then how does this ever raise a virtual interrupt and how does the guest see an interrupt which is only PENDING so that it can ack it etc. etc.? Maybe I don't fully understand how the HW bit works after all... Thanks, -Christoffer -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 06/10] KVM: arm/arm64: vgic: Allow dynamic mapping of physical/virtual interrupts
On Wed, Jul 01, 2015 at 11:20:45AM +0100, Marc Zyngier wrote: On 30/06/15 21:19, Christoffer Dall wrote: On Mon, Jun 08, 2015 at 06:04:01PM +0100, Marc Zyngier wrote: In order to be able to feed physical interrupts to a guest, we need to be able to establish the virtual-physical mapping between the two worlds. The mapping is kept in a rbtree, indexed by virtual interrupts. how many of these do you expect there will be? Is the extra code and complexity of an rbtree really warranted? I would assume that you'll have one PPI for each CPU in the default case plus potentially a few more for an assigned network adapter, let's say a couple of handfulls. Am I missing something obvious or is this optimization of traversing a list of 10-12 mappings in the typical case not likely to be measurable? I would actually be more concerned about the additional locking and would look at RCU for protecting a list instead. Can you protect an rbtree with RCU easily? Not very easily. There was some work done a while ago for the dentry cache IIRC, but I doubt that's reusable directly, and probably overkill. RCU protected lists are, on the other hand, readily available. Bah. I'll switch to this. By the time it becomes the bottleneck, the world will have moved on. Or so I hope. We can also move to RB trees if we have some data to show us it's worth the hassle later on, but I assume that since these structs are fairly small and overhead like this is mostly to show up on a hot path, a better optimization would be to allocate a bunch of these structures contiguously for cache locality, but again, I feel like this is all premature and we should measure the beast first. Thanks, -Christoffer -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 06/10] KVM: arm/arm64: vgic: Allow dynamic mapping of physical/virtual interrupts
On 30/06/15 21:19, Christoffer Dall wrote: On Mon, Jun 08, 2015 at 06:04:01PM +0100, Marc Zyngier wrote: In order to be able to feed physical interrupts to a guest, we need to be able to establish the virtual-physical mapping between the two worlds. The mapping is kept in a rbtree, indexed by virtual interrupts. how many of these do you expect there will be? Is the extra code and complexity of an rbtree really warranted? I would assume that you'll have one PPI for each CPU in the default case plus potentially a few more for an assigned network adapter, let's say a couple of handfulls. Am I missing something obvious or is this optimization of traversing a list of 10-12 mappings in the typical case not likely to be measurable? I would actually be more concerned about the additional locking and would look at RCU for protecting a list instead. Can you protect an rbtree with RCU easily? Not very easily. There was some work done a while ago for the dentry cache IIRC, but I doubt that's reusable directly, and probably overkill. RCU protected lists are, on the other hand, readily available. Bah. I'll switch to this. By the time it becomes the bottleneck, the world will have moved on. Or so I hope. M. Thanks, -Christoffer Signed-off-by: Marc Zyngier marc.zyng...@arm.com --- include/kvm/arm_vgic.h | 18 virt/kvm/arm/vgic.c| 110 + 2 files changed, 128 insertions(+) diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h index 4f9fa1d..33d121a 100644 --- a/include/kvm/arm_vgic.h +++ b/include/kvm/arm_vgic.h @@ -159,6 +159,14 @@ struct vgic_io_device { struct kvm_io_device dev; }; +struct irq_phys_map { +struct rb_node node; +u32 virt_irq; +u32 phys_irq; +u32 irq; +boolactive; +}; + struct vgic_dist { spinlock_t lock; boolin_kernel; @@ -256,6 +264,10 @@ struct vgic_dist { struct vgic_vm_ops vm_ops; struct vgic_io_device dist_iodev; struct vgic_io_device *redist_iodevs; + +/* Virtual irq to hwirq mapping */ +spinlock_t irq_phys_map_lock; why do we need a separate lock here? +struct rb_root irq_phys_map; }; struct vgic_v2_cpu_if { @@ -307,6 +319,9 @@ struct vgic_cpu { struct vgic_v2_cpu_if vgic_v2; struct vgic_v3_cpu_if vgic_v3; }; + +/* Protected by the distributor's irq_phys_map_lock */ +struct rb_root irq_phys_map; }; #define LR_EMPTY0xff @@ -331,6 +346,9 @@ int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num, void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg); int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu); int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu); +struct irq_phys_map *vgic_map_phys_irq(struct kvm_vcpu *vcpu, + int virt_irq, int irq); +int vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, struct irq_phys_map *map); #define irqchip_in_kernel(k)(!!((k)-arch.vgic.in_kernel)) #define vgic_initialized(k) (!!((k)-arch.vgic.nr_cpus)) diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c index 59ed7a3..c6604f2 100644 --- a/virt/kvm/arm/vgic.c +++ b/virt/kvm/arm/vgic.c @@ -24,6 +24,7 @@ #include linux/of.h #include linux/of_address.h #include linux/of_irq.h +#include linux/rbtree.h #include linux/uaccess.h #include linux/irqchip/arm-gic.h @@ -84,6 +85,8 @@ static void vgic_retire_disabled_irqs(struct kvm_vcpu *vcpu); static void vgic_retire_lr(int lr_nr, int irq, struct kvm_vcpu *vcpu); static struct vgic_lr vgic_get_lr(const struct kvm_vcpu *vcpu, int lr); static void vgic_set_lr(struct kvm_vcpu *vcpu, int lr, struct vgic_lr lr_desc); +static struct irq_phys_map *vgic_irq_map_search(struct kvm_vcpu *vcpu, +int virt_irq); static const struct vgic_ops *vgic_ops; static const struct vgic_params *vgic; @@ -1585,6 +1588,112 @@ static irqreturn_t vgic_maintenance_handler(int irq, void *data) return IRQ_HANDLED; } +static struct rb_root *vgic_get_irq_phys_map(struct kvm_vcpu *vcpu, + int virt_irq) +{ +if (virt_irq VGIC_NR_PRIVATE_IRQS) +return vcpu-arch.vgic_cpu.irq_phys_map; +else +return vcpu-kvm-arch.vgic.irq_phys_map; +} + +struct irq_phys_map *vgic_map_phys_irq(struct kvm_vcpu *vcpu, + int virt_irq, int irq) +{ +struct vgic_dist *dist = vcpu-kvm-arch.vgic; +struct rb_root *root = vgic_get_irq_phys_map(vcpu, virt_irq); +struct rb_node **new = root-rb_node, *parent = NULL; +struct irq_phys_map *new_map; +struct irq_desc *desc; +struct irq_data *data; +
[PATCH 13/16] nvdimm: support NFIT_CMD_IMPLEMENTED function
__DSM is defined in ACPI 6.0: 9.14.1 _DSM (Device Specific Method) Function 0 is a query function. We do not support any function on root device and only 3 functions are support for NVDIMM device, NFIT_CMD_GET_CONFIG_SIZE, NFIT_CMD_GET_CONFIG_DATA and NFIT_CMD_SET_CONFIG_DATA, that means we currently only allow to access device's Label Namespace Signed-off-by: Xiao Guangrong guangrong.x...@linux.intel.com --- hw/mem/pc-nvdimm.c | 126 + 1 file changed, 126 insertions(+) diff --git a/hw/mem/pc-nvdimm.c b/hw/mem/pc-nvdimm.c index c0965ae..b586bf7 100644 --- a/hw/mem/pc-nvdimm.c +++ b/hw/mem/pc-nvdimm.c @@ -29,6 +29,15 @@ #include exec/address-spaces.h #include hw/acpi/aml-build.h #include hw/mem/pc-nvdimm.h +#include sysemu/sysemu.h + +//#define NVDIMM_DEBUG + +#ifdef NVDIMM_DEBUG +#define nvdebug(fmt, ...) fprintf(stderr, nvdimm: fmt, ## __VA_ARGS__) +#else +#define nvdebug(...) +#endif #define PAGE_SIZE (1UL 12) @@ -135,6 +144,22 @@ static void nfit_spa_uuid_pm(void *uuid) memcpy(uuid, uuid_pm, sizeof(uuid_pm)); } +static bool dsm_is_root_uuid(uint8_t *uuid) +{ +uuid_le uuid_root = UUID_LE(0x2f10e7a4, 0x9e91, 0x11e4, 0x89, +0xd3, 0x12, 0x3b, 0x93, 0xf7, 0x5c, 0xba); + +return !memcmp(uuid, uuid_root, sizeof(uuid_root)); +} + +static bool dsm_is_dimm_uuid(uint8_t *uuid) +{ +uuid_le uuid_dimm = UUID_LE(0x4309ac30, 0x0d11, 0x11e4, 0x91, +0x91, 0x08, 0x00, 0x20, 0x0c, 0x9a, 0x66); + +return !memcmp(uuid, uuid_dimm, sizeof(uuid_dimm)); +} + enum { NFIT_TABLE_SPA = 0, NFIT_TABLE_MEM = 1, @@ -349,6 +374,23 @@ enum { NFIT_CMD_VENDOR = 9, }; +enum { +NFIT_STATUS_SUCCESS = 0, +NFIT_STATUS_NOT_SUPPORTED = 1, +NFIT_STATUS_NON_EXISTING_MEM_DEV = 2, +NFIT_STATUS_INVALID_PARAS = 3, +NFIT_STATUS_VENDOR_SPECIFIC_ERROR = 4, +}; + +#define DSM_REVISION(1) + +/* do not support any command except NFIT_CMD_ARS_CAP on root. */ +#define ROOT_SUPPORT_CMD(1 NFIT_CMD_ARS_CAP) +#define DIMM_SUPPORT_CMD((1 NFIT_CMD_IMPLEMENTED)\ + | (1 NFIT_CMD_GET_CONFIG_SIZE)\ + | (1 NFIT_CMD_GET_CONFIG_DATA)\ + | (1 NFIT_CMD_SET_CONFIG_DATA)) + struct dsm_buffer { /* RAM page. */ uint32_t handle; @@ -366,6 +408,18 @@ struct dsm_buffer { }; }; +struct cmd_out_implemented { +uint64_t cmd_list; +}; + +struct dsm_out { +union { +uint32_t status; +struct cmd_out_implemented cmd_implemented; +uint8_t data[PAGE_SIZE]; +}; +}; + static uint64_t dsm_read(void *opaque, hwaddr addr, unsigned size) { @@ -374,10 +428,82 @@ static uint64_t dsm_read(void *opaque, hwaddr addr, return 0; } +static void dsm_write_root(struct dsm_buffer *in, struct dsm_out *out) +{ +uint32_t function = in-arg2; + +if (function == NFIT_CMD_IMPLEMENTED) { +out-cmd_implemented.cmd_list = ROOT_SUPPORT_CMD; +return; +} + +out-status = NFIT_STATUS_NOT_SUPPORTED; +nvdebug(Return status %#x.\n, out-status); +} + +static void dsm_write_nvdimm(struct dsm_buffer *in, struct dsm_out *out) +{ +uint32_t function = in-arg2; +uint32_t status; + +switch (function) { +case NFIT_CMD_IMPLEMENTED: +out-cmd_implemented.cmd_list = DIMM_SUPPORT_CMD; +return; +default: +status = NFIT_STATUS_NOT_SUPPORTED; +}; + +nvdebug(Return status %#x.\n, status); +out-status = status; +} + static void dsm_write(void *opaque, hwaddr addr, uint64_t val, unsigned size) { +struct MemoryRegion *dsm_ram_mr = opaque; +struct dsm_buffer *dsm; +struct dsm_out *out; +void *buf; + assert(val == NOTIFY_VALUE); + +buf = memory_region_get_ram_ptr(dsm_ram_mr); +dsm = buf; +out = buf; + +nvdebug(Arg0 UUID_FMT .\n, dsm-arg0[0], dsm-arg0[1], dsm-arg0[2], +dsm-arg0[3], dsm-arg0[4], dsm-arg0[5], dsm-arg0[6], +dsm-arg0[7], dsm-arg0[8], dsm-arg0[9], dsm-arg0[10], +dsm-arg0[11], dsm-arg0[12], dsm-arg0[13], dsm-arg0[14], +dsm-arg0[15]); +nvdebug(Handler %#x, Arg1 %#x, Arg2 %#x.\n, dsm-handle, dsm-arg1, +dsm-arg2); + +if (dsm-arg1 != DSM_REVISION) { +nvdebug(Revision %#x is not supported, expect %#x.\n, +dsm-arg1, DSM_REVISION); +goto exit; +} + +if (!dsm-handle) { +if (!dsm_is_root_uuid(dsm-arg0)) { +nvdebug(Root UUID does not match.\n); +goto exit; +} + +return dsm_write_root(dsm, out); +} + +if (!dsm_is_dimm_uuid(dsm-arg0)) { +nvdebug(DIMM UUID does not match.\n); +goto exit; +} + +return dsm_write_nvdimm(dsm, out); + +exit: +out-status = NFIT_STATUS_NOT_SUPPORTED; } static const
Re: [PATCH 8/9] kvm/x86: add sending hyper-v crash notification to user space
On 30/06/2015 13:33, Denis V. Lunev wrote: From: Andrey Smetanin asmeta...@virtuozzo.com Sending of notification is done by exiting vcpu to user space if KVM_REQ_HV_CRASH is enabled for vcpu. kvm_run structure will contains system_event with type KVM_SYSTEM_EVENT_CRASH and flag KVM_SYSTEM_EVENT_FL_HV_CRASH to clarify that crash occures inside Hyper-V based guest. This needs to be documented in Documentation/virtual/kvm/api.txt. Also, please rename KVM_SYSTEM_EVENT_FL_HV_CRASH to KVM_SYSTEM_EVENT_FLAG_HV_CRASH and move it to arch/x86/include/uapi/asm/kvm.h. You do not need to send the whole series again; just resend this one patch. Paolo Signed-off-by: Andrey Smetanin asmeta...@virtuozzo.com Signed-off-by: Denis V. Lunev d...@openvz.org CC: Paolo Bonzini pbonz...@redhat.com CC: Gleb Natapov g...@kernel.org --- arch/x86/kvm/x86.c | 8 include/uapi/linux/kvm.h | 2 ++ 2 files changed, 10 insertions(+) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 47b7507..55a4b92 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -6264,6 +6264,14 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu) vcpu_scan_ioapic(vcpu); if (kvm_check_request(KVM_REQ_APIC_PAGE_RELOAD, vcpu)) kvm_vcpu_reload_apic_access_page(vcpu); + if (kvm_check_request(KVM_REQ_HV_CRASH, vcpu)) { + vcpu-run-exit_reason = KVM_EXIT_SYSTEM_EVENT; + vcpu-run-system_event.type = KVM_SYSTEM_EVENT_CRASH; + vcpu-run-system_event.flags = + KVM_SYSTEM_EVENT_FL_HV_CRASH; + r = 0; + goto out; + } } if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) { diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h index 716ad4a..eefb8b9 100644 --- a/include/uapi/linux/kvm.h +++ b/include/uapi/linux/kvm.h @@ -317,6 +317,8 @@ struct kvm_run { struct { #define KVM_SYSTEM_EVENT_SHUTDOWN 1 #define KVM_SYSTEM_EVENT_RESET 2 +#define KVM_SYSTEM_EVENT_CRASH 3 +#define KVM_SYSTEM_EVENT_FL_HV_CRASH(1ULL 0) __u32 type; __u64 flags; } system_event; -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/9] kvm: add hyper-v crash msrs values
On 01/07/15 18:00, Paolo Bonzini wrote: On 30/06/2015 13:33, Denis V. Lunev wrote: +#define HV_X64_MSR_CRASH_CTL_NOTIFY(1ULL 63) +#define HV_X64_MSR_CRASH_CTL_CONTENTS \ + (HV_X64_MSR_CRASH_CTL_NOTIFY) Why is HV_X64_MSR_CRASH_CTL_CONTENTS needed? Can I just remove it? Paolo this was a direct request from Peter Hornyack peterhorny...@google.com I suggest here: #define HV_X64_MSR_CRASH_CTL_CONTENTS \ (HV_CRASH_CTL_CRASH_NOTIFY) To allow for more crash actions to be added in the future. Den -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[kvm-ppc:kvm-ppc-queue 6/9] kernel/fork.c:99:0: warning: MAX_THREADS redefined
tree: git://github.com/agraf/linux-2.6.git kvm-ppc-queue head: cc75c6b1368c88977d6015fd67b02c85ee04e57c commit: c98d80c7b761a4b3bcbcc9314c4492f76585caa0 [6/9] KVM: PPC: Book3S HV: Implement dynamic micro-threading on POWER8 config: powerpc-defconfig (attached as .config) reproduce: wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross git checkout c98d80c7b761a4b3bcbcc9314c4492f76585caa0 # save the attached .config to linux build tree make.cross ARCH=powerpc All warnings (new ones prefixed by ): kernel/fork.c:99:0: warning: MAX_THREADS redefined #define MAX_THREADS FUTEX_TID_MASK ^ In file included from arch/powerpc/include/asm/paca.h:25:0, from arch/powerpc/include/asm/hw_irq.h:42, from arch/powerpc/include/asm/irqflags.h:11, from include/linux/irqflags.h:15, from include/linux/spinlock.h:53, from include/linux/mmzone.h:7, from include/linux/gfp.h:5, from include/linux/slab.h:14, from kernel/fork.c:14: arch/powerpc/include/asm/kvm_book3s_asm.h:29:0: note: this is the location of the previous definition #define MAX_THREADS 8 ^ kernel/fork.c:99:0: warning: MAX_THREADS redefined #define MAX_THREADS FUTEX_TID_MASK ^ In file included from arch/powerpc/include/asm/paca.h:25:0, from arch/powerpc/include/asm/hw_irq.h:42, from arch/powerpc/include/asm/irqflags.h:11, from include/linux/irqflags.h:15, from include/linux/spinlock.h:53, from include/linux/mmzone.h:7, from include/linux/gfp.h:5, from include/linux/slab.h:14, from kernel/fork.c:14: arch/powerpc/include/asm/kvm_book3s_asm.h:29:0: note: this is the location of the previous definition #define MAX_THREADS 8 ^ vim +/MAX_THREADS +99 kernel/fork.c ^1da177e Linus Torvalds 2005-04-16 83 #include asm/cacheflush.h ^1da177e Linus Torvalds 2005-04-16 84 #include asm/tlbflush.h ^1da177e Linus Torvalds 2005-04-16 85 ad8d75ff Steven Rostedt 2009-04-14 86 #include trace/events/sched.h ad8d75ff Steven Rostedt 2009-04-14 87 43d2b113 KAMEZAWA Hiroyuki 2012-01-10 88 #define CREATE_TRACE_POINTS 43d2b113 KAMEZAWA Hiroyuki 2012-01-10 89 #include trace/events/task.h 43d2b113 KAMEZAWA Hiroyuki 2012-01-10 90 ^1da177e Linus Torvalds 2005-04-16 91 /* ac1b398d Heinrich Schuchardt 2015-04-16 92 * Minimum number of threads to boot the kernel ac1b398d Heinrich Schuchardt 2015-04-16 93 */ ac1b398d Heinrich Schuchardt 2015-04-16 94 #define MIN_THREADS 20 ac1b398d Heinrich Schuchardt 2015-04-16 95 ac1b398d Heinrich Schuchardt 2015-04-16 96 /* ac1b398d Heinrich Schuchardt 2015-04-16 97 * Maximum number of threads ac1b398d Heinrich Schuchardt 2015-04-16 98 */ ac1b398d Heinrich Schuchardt 2015-04-16 @99 #define MAX_THREADS FUTEX_TID_MASK ac1b398d Heinrich Schuchardt 2015-04-16 100 ac1b398d Heinrich Schuchardt 2015-04-16 101 /* ^1da177e Linus Torvalds 2005-04-16 102 * Protected counters by write_lock_irq(tasklist_lock) ^1da177e Linus Torvalds 2005-04-16 103 */ ^1da177e Linus Torvalds 2005-04-16 104 unsigned long total_forks; /* Handle normal Linux uptimes. */ ^1da177e Linus Torvalds 2005-04-16 105 int nr_threads; /* The idle threads do not count.. */ ^1da177e Linus Torvalds 2005-04-16 106 ^1da177e Linus Torvalds 2005-04-16 107 int max_threads; /* tunable limit on nr_threads */ :: The code at line 99 was first introduced by commit :: ac1b398de1ef94aeee8ba87b0120763526572a6e kernel/fork.c: avoid division by zero :: TO: Heinrich Schuchardt xypron.g...@gmx.de :: CC: Linus Torvalds torva...@linux-foundation.org --- 0-DAY kernel test infrastructureOpen Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation # # Automatically generated file; DO NOT EDIT. # Linux/powerpc 4.1.0 Kernel Configuration # CONFIG_PPC64=y # # Processor support # CONFIG_PPC_BOOK3S_64=y # CONFIG_PPC_BOOK3E_64 is not set CONFIG_GENERIC_CPU=y # CONFIG_CELL_CPU is not set # CONFIG_POWER4_CPU is not set # CONFIG_POWER5_CPU is not set # CONFIG_POWER6_CPU is not set # CONFIG_POWER7_CPU is not set # CONFIG_POWER8_CPU is not set CONFIG_PPC_BOOK3S=y # CONFIG_TUNE_CELL is not set CONFIG_PPC_FPU=y CONFIG_ALTIVEC=y CONFIG_VSX=y # CONFIG_PPC_ICSWX is not set CONFIG_PPC_STD_MMU=y CONFIG_PPC_STD_MMU_64=y CONFIG_PPC_MM_SLICES=y CONFIG_PPC_HAVE_PMU_SUPPORT=y CONFIG_PPC_PERF_CTRS=y CONFIG_SMP=y CONFIG_NR_CPUS=32 CONFIG_PPC_DOORBELL=y CONFIG_VDSO32=y CONFIG_CPU_BIG_ENDIAN=y # CONFIG_CPU_LITTLE_ENDIAN is not set
[PATCH 03/16] acpi: add aml_derefof
Implement DeRefOf term which is used by NVDIMM _DSM method in later patch Signed-off-by: Xiao Guangrong guangrong.x...@linux.intel.com --- hw/acpi/aml-build.c | 8 include/hw/acpi/aml-build.h | 1 + 2 files changed, 9 insertions(+) diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c index 02f9e3d..9e89efc 100644 --- a/hw/acpi/aml-build.c +++ b/hw/acpi/aml-build.c @@ -1135,6 +1135,14 @@ Aml *aml_unicode(const char *str) return var; } +/* ACPI 6.0: 20.2.5.4 Type 2 Opcodes Encoding: DefDerefOf */ +Aml *aml_derefof(Aml *arg) +{ +Aml *var = aml_opcode(0x83 /* DerefOfOp */); +aml_append(var, arg); +return var; +} + void build_header(GArray *linker, GArray *table_data, AcpiTableHeader *h, const char *sig, int len, uint8_t rev) diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h index 996ac5b..21dc5e9 100644 --- a/include/hw/acpi/aml-build.h +++ b/include/hw/acpi/aml-build.h @@ -275,6 +275,7 @@ Aml *aml_create_dword_field(Aml *srcbuf, Aml *index, const char *name); Aml *aml_varpackage(uint32_t num_elements); Aml *aml_touuid(const char *uuid); Aml *aml_unicode(const char *str); +Aml *aml_derefof(Aml *arg); void build_header(GArray *linker, GArray *table_data, -- 2.1.0 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 09/16] nvdimm: build ACPI NFIT table
NFIT is defined in ACPI 6.0: 5.2.25 NVDIMM Firmware Interface Table (NFIT) Currently, we only support PMEM mode. Each device has 3 tables: - SPA table, define the PMEM region info - MEM DEV table, it has the @handle which is used to associate specified ACPI NVDIMM device we will introduce in later patch. Also we can happily ignored the memory device's interleave, the real nvdimm hardware access is hidden behind host - DCR table, it defines Vendor ID used to associate specified vendor nvdimm driver. Since we only implement PMEM mode this time, Command window and Data window are not needed Signed-off-by: Xiao Guangrong guangrong.x...@linux.intel.com --- hw/i386/acpi-build.c | 3 + hw/mem/pc-nvdimm.c | 286 + include/hw/mem/pc-nvdimm.h | 8 ++ 3 files changed, 297 insertions(+) diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index 6a1ab09..80c21be 100644 --- a/hw/i386/acpi-build.c +++ b/hw/i386/acpi-build.c @@ -39,6 +39,7 @@ #include hw/loader.h #include hw/isa/isa.h #include hw/acpi/memory_hotplug.h +#include hw/mem/pc-nvdimm.h #include sysemu/tpm.h #include hw/acpi/tpm.h #include sysemu/tpm_backend.h @@ -1741,6 +1742,8 @@ void acpi_build(PcGuestInfo *guest_info, AcpiBuildTables *tables) build_dmar_q35(tables_blob, tables-linker); } +pc_nvdimm_build_nfit_table(table_offsets, tables_blob, tables-linker); + /* Add tables supplied by user (if any) */ for (u = acpi_table_first(); u; u = acpi_table_next(u)) { unsigned len = acpi_table_len(u); diff --git a/hw/mem/pc-nvdimm.c b/hw/mem/pc-nvdimm.c index 9531935..e7cff29 100644 --- a/hw/mem/pc-nvdimm.c +++ b/hw/mem/pc-nvdimm.c @@ -27,10 +27,12 @@ #include linux/fs.h #include exec/address-spaces.h +#include hw/acpi/aml-build.h #include hw/mem/pc-nvdimm.h #define PAGE_SIZE (1UL 12) +#define MAX_NVDIMM_NUMBER (10) #define MIN_CONFIG_DATA_SIZE(128 10) static struct nvdimms_info { @@ -65,6 +67,290 @@ static uint32_t new_device_index(void) return nvdimms_info.device_index++; } +static int pc_nvdimm_built_list(Object *obj, void *opaque) +{ +GSList **list = opaque; + +if (object_dynamic_cast(obj, TYPE_PC_NVDIMM)) { +PCNVDIMMDevice *nvdimm = PC_NVDIMM(obj); + +/* only realized NVDIMMs matter */ +if (memory_region_size(nvdimm-mr)) { +*list = g_slist_append(*list, nvdimm); +} +} + +object_child_foreach(obj, pc_nvdimm_built_list, opaque); +return 0; +} + +static GSList *get_nvdimm_built_list(void) +{ +GSList *list = NULL; + +object_child_foreach(qdev_get_machine(), pc_nvdimm_built_list, list); +return list; +} + +static int get_nvdimm_device_number(GSList *list) +{ +int nr = 0; + +for (; list; list = list-next) { +nr++; +} + +return nr; +} + +static uint32_t nvdimm_index_to_sn(int index) +{ +return 0x123456 + index; +} + +static uint32_t nvdimm_index_to_handle(int index) +{ +return index + 1; +} + +typedef struct { +uint8_t b[16]; +} uuid_le; + +#define UUID_LE(a, b, c, d0, d1, d2, d3, d4, d5, d6, d7) \ +((uuid_le) \ +{ { (a) 0xff, ((a) 8) 0xff, ((a) 16) 0xff, ((a) 24) 0xff, \ +(b) 0xff, ((b) 8) 0xff, (c) 0xff, ((c) 8) 0xff, \ +(d0), (d1), (d2), (d3), (d4), (d5), (d6), (d7) } }) + +static void nfit_spa_uuid_pm(void *uuid) +{ +uuid_le uuid_pm = UUID_LE(0x66f0d379, 0xb4f3, 0x4074, 0xac, 0x43, 0x0d, + 0x33, 0x18, 0xb7, 0x8c, 0xdb); +memcpy(uuid, uuid_pm, sizeof(uuid_pm)); +} + +enum { +NFIT_TABLE_SPA = 0, +NFIT_TABLE_MEM = 1, +NFIT_TABLE_IDT = 2, +NFIT_TABLE_SMBIOS = 3, +NFIT_TABLE_DCR = 4, +NFIT_TABLE_BDW = 5, +NFIT_TABLE_FLUSH = 6, +}; + +enum { +EFI_MEMORY_UC = 0x1ULL, +EFI_MEMORY_WC = 0x2ULL, +EFI_MEMORY_WT = 0x4ULL, +EFI_MEMORY_WB = 0x8ULL, +EFI_MEMORY_UCE = 0x10ULL, +EFI_MEMORY_WP = 0x1000ULL, +EFI_MEMORY_RP = 0x2000ULL, +EFI_MEMORY_XP = 0x4000ULL, +EFI_MEMORY_NV = 0x8000ULL, +EFI_MEMORY_MORE_RELIABLE = 0x1ULL, +}; + +/* + * struct nfit - Nvdimm Firmware Interface Table + * @signature: NFIT + */ +struct nfit { +ACPI_TABLE_HEADER_DEF +uint32_t reserved; +} QEMU_PACKED; + +/* + * struct nfit_spa - System Physical Address Range Structure + */ +struct nfit_spa { +uint16_t type; +uint16_t length; +uint16_t spa_index; +uint16_t flags; +uint32_t reserved; +uint32_t proximity_domain; +uint8_t type_uuid[16]; +uint64_t spa_base; +uint64_t spa_length; +uint64_t mem_attr; +} QEMU_PACKED; + +/* + * struct nfit_memdev - Memory Device to SPA Map Structure + */ +struct nfit_memdev { +uint16_t type; +uint16_t length; +uint32_t nfit_handle; +uint16_t phys_id; +uint16_t region_id; +uint16_t spa_index; +uint16_t
[PATCH 00/16] implement vNVDIMM
== Background == NVDIMM (A Non-Volatile Dual In-line Memory Module) is going to be supported on Intel's platform. They are discovered via ACPI and configured by _DSM method of NVDIMM device in ACPI. There has some supporting documents which can be found at: ACPI 6: http://www.uefi.org/sites/default/files/resources/ACPI_6.0.pdf NVDIMM Namespace: http://pmem.io/documents/NVDIMM_Namespace_Spec.pdf DSM Interface Example: http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf Driver Writer's Guide: http://pmem.io/documents/NVDIMM_Driver_Writers_Guide.pdf Currently, the NVDIMM driver has been merged into upstream Linux Kernel and this patchset tries to enable it in virtualization field == Design == NVDIMM supports two mode accesses, one is PMEM which maps NVDIMM into CPU's address space then CPU can directly access it as normal memory, another is BLK which is used as block device to reduce the occupying of CPU address space BLK mode accesses NVDIMM via Command Register window and Data Register window. BLK virtualization has high workload since each sector access will cause at least two VM-EXIT. So we currently only imperilment vPMEM in this patchset --- vPMEM design --- We introduce a new device named pc-nvdimm, it has a parameter, file, which is the file-based backed memory passed to guest. The file can be regular file and block device. We can use any file when we do test or emulation, however, in the real word, the files passed to guest are: - the regular file in the filesystem with DAX enabled created on NVDIMM device on host - the raw PMEM device on host, e,g /dev/pmem0 Memory access on the address created by mmap on these kinds of files can directly reach NVDIMM device on host. --- vConfigure data area design --- Each NVDIMM device has a configure data area which is used to store label namespace data. In order to emulating this area, we divide the file into two parts: - first parts is (0, size - 128K], which is used as PMEM - 128K at the end of the file, which is used as Config Data Area So that the label namespace data can be persistent during power lose or system failure --- _DSM method design --- _DSM in ACPI is used to configure NVDIMM, currently we only allow access of label namespace data, i.e, Get Namespace Label Size (Function Index 4), Get Namespace Label Data (Function Index 5) and Set Namespace Label Data (Function Index 6) _DSM uses two pages to transfer data between ACPI and Qemu, the first page is RAM-based used to save the input info of _DSM method and Qemu reuse it store output info and another page is MMIO-based, ACPI write data to this page to transfer the control to Qemu We use the address region above 4G to map these pages because there is huge free space above 4G and it can avoid the address overlap with PCI and other address reserved component (e,g HPET). This is also the reason we choose MMIO notification instead of PIO == Test == In host 1) create memory backed file, e.g # dd if=zero of=/tmp/nvdimm bs=1G count=10 2) append '-device pc-nvdimm,file=/tmp/nvdimm' in Qemu command line In guest, download the latest upsteam kernel (4.2 merge window) and enable ACPI_NFIT, LIBNVDIMM and BLK_DEV_PMEM. 1) insmod drivers/nvdimm/libnvdimm.ko 2) insmod drivers/acpi/nfit.ko 3) insmod drivers/nvdimm/nd_btt.ko 4) insmod drivers/nvdimm/nd_pmem.ko You can see the whole nvdimm device used as a single namespace and /dev/pmem0 appears. You can do whatever on /dev/pmem0 including DAX access. Currently Linux NVDIMM driver does not support namespace operation on this kind of PMEM, apply below changes to support dynamical namespace: @@ -798,7 +823,8 @@ static int acpi_nfit_register_dimms(struct acpi_nfit_desc *a continue; } - if (nfit_mem-bdw nfit_mem-memdev_pmem) + //if (nfit_mem-bdw nfit_mem-memdev_pmem) + if (nfit_mem-memdev_pmem) flags |= NDD_ALIASING; You can append another NVDIMM device in guest and do: # cd /sys/bus/nd/devices/ # cd namespace1.0/ # echo `uuidgen` uuid # echo `expr 1024 \* 1024 \* 128` size then reload nd.pmem.ko You can see /dev/pmem1 appears == TODO == 1) NVDIMM NUMA support 2) NVDIMM hotplug support Xiao Guangrong (16): acpi: allow aml_operation_region() working on 64 bit offset i386/acpi-build: allow SSDT to operate on 64 bit acpi: add aml_derefof acpi: add aml_sizeof acpi: add aml_create_field pc: implement NVDIMM device abstract nvdimm: reserve address range for NVDIMM nvdimm: init backend memory mapping and config data area nvdimm: build ACPI NFIT table nvdimm: init the address region used by _DSM method nvdimm: build ACPI nvdimm devices nvdimm: save arg3 for NVDIMM device _DSM method nvdimm: support NFIT_CMD_IMPLEMENTED function nvdimm: support NFIT_CMD_GET_CONFIG_SIZE function nvdimm: support NFIT_CMD_GET_CONFIG_DATA nvdimm: support NFIT_CMD_SET_CONFIG_DATA
[PATCH 08/16] nvdimm: init backend memory mapping and config data area
The parameter @file is used as backed memory for NVDIMM which is divided into two parts: - first parts is (0, size - 128K], which is used as PMEM (Persistent Memory) - 128K at the end of the file, which is used as Config Data Area, it's used to store Label namespace data The @file supports both regular file and block device, of course we can assign any these two kinds of files for test and emulation, however, in the real word for performance reason, we usually used these files as NVDIMM backed file: - the regular file in the filesystem with DAX enabled created on NVDIMM device on host - the raw PMEM device on host, e,g /dev/pmem0 Signed-off-by: Xiao Guangrong guangrong.x...@linux.intel.com --- hw/mem/pc-nvdimm.c | 102 - include/hw/mem/pc-nvdimm.h | 5 +++ 2 files changed, 106 insertions(+), 1 deletion(-) diff --git a/hw/mem/pc-nvdimm.c b/hw/mem/pc-nvdimm.c index b40d4e7..9531935 100644 --- a/hw/mem/pc-nvdimm.c +++ b/hw/mem/pc-nvdimm.c @@ -22,12 +22,20 @@ * License along with this library; if not, see http://www.gnu.org/licenses/ */ +#include sys/mman.h +#include sys/ioctl.h +#include linux/fs.h + +#include exec/address-spaces.h #include hw/mem/pc-nvdimm.h -#define PAGE_SIZE (1UL 12) +#define PAGE_SIZE (1UL 12) + +#define MIN_CONFIG_DATA_SIZE(128 10) static struct nvdimms_info { ram_addr_t current_addr; +int device_index; } nvdimms_info; /* the address range [offset, ~0ULL) is reserved for NVDIMM. */ @@ -37,6 +45,26 @@ void pc_nvdimm_reserve_range(ram_addr_t offset) nvdimms_info.current_addr = offset; } +static ram_addr_t reserved_range_push(uint64_t size) +{ +uint64_t current; + +current = ROUND_UP(nvdimms_info.current_addr, PAGE_SIZE); + +/* do not have enough space? */ +if (current + size current) { +return 0; +} + +nvdimms_info.current_addr = current + size; +return current; +} + +static uint32_t new_device_index(void) +{ +return nvdimms_info.device_index++; +} + static char *get_file(Object *obj, Error **errp) { PCNVDIMMDevice *nvdimm = PC_NVDIMM(obj); @@ -48,6 +76,11 @@ static void set_file(Object *obj, const char *str, Error **errp) { PCNVDIMMDevice *nvdimm = PC_NVDIMM(obj); +if (memory_region_size(nvdimm-mr)) { +error_setg(errp, cannot change property value); +return; +} + if (nvdimm-file) { g_free(nvdimm-file); } @@ -60,13 +93,80 @@ static void pc_nvdimm_init(Object *obj) object_property_add_str(obj, file, get_file, set_file, NULL); } +static uint64_t get_file_size(int fd) +{ +struct stat stat_buf; +uint64_t size; + +if (fstat(fd, stat_buf) 0) { +return 0; +} + +if (S_ISREG(stat_buf.st_mode)) { +return stat_buf.st_size; +} + +if (S_ISBLK(stat_buf.st_mode) !ioctl(fd, BLKGETSIZE64, size)) { +return size; +} + +return 0; +} + static void pc_nvdimm_realize(DeviceState *dev, Error **errp) { PCNVDIMMDevice *nvdimm = PC_NVDIMM(dev); +char name[512]; +void *buf; +ram_addr_t addr; +uint64_t size; +int fd; if (!nvdimm-file) { error_setg(errp, file property is not set); } + +fd = open(nvdimm-file, O_RDWR); +if (fd 0) { +error_setg(errp, can not open %s, nvdimm-file); +return; +} + +/* reserve MIN_CONFIGDATA_AREA_SIZE for configue data */ +size = get_file_size(fd) - MIN_CONFIG_DATA_SIZE; +if ((int64_t)size = 0) { +error_setg(errp, file size is too small to store NVDIMM + configure data); +goto do_close; +} + +buf = mmap(NULL, size + MIN_CONFIG_DATA_SIZE, PROT_READ | PROT_WRITE, + MAP_SHARED, fd, 0); +if (buf == MAP_FAILED) { +error_setg(errp, can not do mmap on %s, nvdimm-file); +goto do_close; +} + +addr = reserved_range_push(size); +if (!addr) { +error_setg(errp, do not have enough space for size %#lx.\n, size); +goto do_unmap; +} + +nvdimm-device_index = new_device_index(); +sprintf(name, NVDIMM-%d, nvdimm-device_index); +memory_region_init_ram_ptr(nvdimm-mr, OBJECT(dev), name, size, buf); +vmstate_register_ram(nvdimm-mr, DEVICE(dev)); +memory_region_add_subregion(get_system_memory(), addr, nvdimm-mr); + +nvdimm-config_data_addr = buf + size; +nvdimm-config_data_size = MIN_CONFIG_DATA_SIZE; + +return; +do_unmap: +munmap(buf, size); +do_close: +close(fd); } static void pc_nvdimm_class_init(ObjectClass *oc, void *data) diff --git a/include/hw/mem/pc-nvdimm.h b/include/hw/mem/pc-nvdimm.h index 2081e7c..e743ed1 100644 --- a/include/hw/mem/pc-nvdimm.h +++ b/include/hw/mem/pc-nvdimm.h @@ -21,6 +21,11 @@ typedef struct PCNVDIMMDevice { DeviceState parent_obj; char *file; +void *config_data_addr; +uint64_t config_data_size; + +int device_index; +
[PATCH 06/16] pc: implement NVDIMM device abstract
Introduce pc-nvdimm device and it only has one parameter, @file, which is the backed memory file for NVDIMM device We can use -device pc-nvdimm,file=/dev/pmem in the Qemu command to create NVDIMM device for the guest Signed-off-by: Xiao Guangrong guangrong.x...@linux.intel.com --- hw/mem/Makefile.objs | 1 + hw/mem/pc-nvdimm.c | 83 ++ include/hw/mem/pc-nvdimm.h | 32 ++ 3 files changed, 116 insertions(+) create mode 100644 hw/mem/pc-nvdimm.c create mode 100644 include/hw/mem/pc-nvdimm.h diff --git a/hw/mem/Makefile.objs b/hw/mem/Makefile.objs index b000fb4..9a7f5a9 100644 --- a/hw/mem/Makefile.objs +++ b/hw/mem/Makefile.objs @@ -1 +1,2 @@ common-obj-$(CONFIG_MEM_HOTPLUG) += pc-dimm.o +common-obj-$(CONFIG_LINUX) += pc-nvdimm.o diff --git a/hw/mem/pc-nvdimm.c b/hw/mem/pc-nvdimm.c new file mode 100644 index 000..0209ea9 --- /dev/null +++ b/hw/mem/pc-nvdimm.c @@ -0,0 +1,83 @@ +/* + * NVDIMM (A Non-Volatile Dual In-line Memory Module) Virtualization Implement + * + * Copyright(C) 2015 Intel Corporation. + * + * Author: + * Xiao Guangrong guangrong.x...@linux.intel.com + * + * Currently, it only supports PMEM Virtualization. + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, see http://www.gnu.org/licenses/ + */ + +#include hw/mem/pc-nvdimm.h + +static char *get_file(Object *obj, Error **errp) +{ +PCNVDIMMDevice *nvdimm = PC_NVDIMM(obj); + +return g_strdup(nvdimm-file); +} + +static void set_file(Object *obj, const char *str, Error **errp) +{ +PCNVDIMMDevice *nvdimm = PC_NVDIMM(obj); + +if (nvdimm-file) { +g_free(nvdimm-file); +} + +nvdimm-file = g_strdup(str); +} + +static void pc_nvdimm_init(Object *obj) +{ +object_property_add_str(obj, file, get_file, set_file, NULL); +} + +static void pc_nvdimm_realize(DeviceState *dev, Error **errp) +{ +PCNVDIMMDevice *nvdimm = PC_NVDIMM(dev); + +if (!nvdimm-file) { +error_setg(errp, file property is not set); +} +} + +static void pc_nvdimm_class_init(ObjectClass *oc, void *data) +{ +DeviceClass *dc = DEVICE_CLASS(oc); + +/* nvdimm hotplug has not supported yet. */ +dc-hotpluggable = false; + +dc-realize = pc_nvdimm_realize; +dc-desc = NVDIMM memory module; +} + +static TypeInfo pc_nvdimm_info = { +.name = TYPE_PC_NVDIMM, +.parent= TYPE_DEVICE, +.instance_size = sizeof(PCNVDIMMDevice), +.instance_init = pc_nvdimm_init, +.class_init= pc_nvdimm_class_init, +}; + +static void pc_nvdimm_register_types(void) +{ +type_register_static(pc_nvdimm_info); +} + +type_init(pc_nvdimm_register_types) diff --git a/include/hw/mem/pc-nvdimm.h b/include/hw/mem/pc-nvdimm.h new file mode 100644 index 000..7f37b46 --- /dev/null +++ b/include/hw/mem/pc-nvdimm.h @@ -0,0 +1,32 @@ +/* + * NVDIMM (A Non-Volatile Dual In-line Memory Module) Virtualization Implement + * + * Copyright(C) 2015 Intel Corporation. + * + * Author: + * Xiao Guangrong guangrong.x...@linux.intel.com + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + */ + +#ifndef __PC_NVDIMM_H +#define __PC_NVDIMM_H + +#include hw/qdev.h + +#ifdef CONFIG_LINUX +typedef struct PCNVDIMMDevice { +/* private */ +DeviceState parent_obj; + +char *file; +} PCNVDIMMDevice; + +#define TYPE_PC_NVDIMM pc-nvdimm + +#define PC_NVDIMM(obj) \ +OBJECT_CHECK(PCNVDIMMDevice, (obj), TYPE_PC_NVDIMM) +#else /* !CONFIG_LINUX */ +#endif +#endif -- 2.1.0 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 02/16] i386/acpi-build: allow SSDT to operate on 64 bit
Only 512M is left for MMIO below 4G and that are used by PCI, BIOS etc. Other components also reserve regions from their internal usage, e.g, [0xFED0, 0xFED0 + 0x400) is reserved for HPET Switch SSDT to 64 bit to use the huge free room above 4G. In the later patches, we will dynamical allocate free space within this region which is used by NVDIMM _DSM method Signed-off-by: Xiao Guangrong guangrong.x...@linux.intel.com --- hw/i386/acpi-build.c | 4 ++-- hw/i386/acpi-dsdt.dsl | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index 00818b9..6a1ab09 100644 --- a/hw/i386/acpi-build.c +++ b/hw/i386/acpi-build.c @@ -1348,7 +1348,7 @@ build_ssdt(GArray *table_data, GArray *linker, g_array_append_vals(table_data, ssdt-buf-data, ssdt-buf-len); build_header(linker, table_data, (void *)(table_data-data + table_data-len - ssdt-buf-len), -SSDT, ssdt-buf-len, 1); +SSDT, ssdt-buf-len, 2); free_aml_allocator(); } @@ -1586,7 +1586,7 @@ build_dsdt(GArray *table_data, GArray *linker, AcpiMiscInfo *misc) memset(dsdt, 0, sizeof *dsdt); build_header(linker, table_data, dsdt, DSDT, - misc-dsdt_size, 1); + misc-dsdt_size, 2); } static GArray * diff --git a/hw/i386/acpi-dsdt.dsl b/hw/i386/acpi-dsdt.dsl index a2d84ec..5cd3f0e 100644 --- a/hw/i386/acpi-dsdt.dsl +++ b/hw/i386/acpi-dsdt.dsl @@ -22,7 +22,7 @@ ACPI_EXTRACT_ALL_CODE AcpiDsdtAmlCode DefinitionBlock ( acpi-dsdt.aml,// Output Filename DSDT, // Signature -0x01, // DSDT Compliance Revision +0x02, // DSDT Compliance Revision BXPC, // OEMID BXDSDT, // TABLE ID 0x1 // OEM Revision -- 2.1.0 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 05/16] acpi: add aml_create_field
Implement CreateField term which are used by NVDIMM _DSM method in later patch Signed-off-by: Xiao Guangrong guangrong.x...@linux.intel.com --- hw/acpi/aml-build.c | 14 ++ include/hw/acpi/aml-build.h | 1 + 2 files changed, 15 insertions(+) diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c index a526eed..debdad2 100644 --- a/hw/acpi/aml-build.c +++ b/hw/acpi/aml-build.c @@ -1151,6 +1151,20 @@ Aml *aml_sizeof(Aml *arg) return var; } +/* ACPI 6.0: 20.2.5.2 Named Objects Encoding: DefCreateField */ +Aml *aml_create_field(Aml *srcbuf, Aml *index, Aml *len, const char *name) +{ +Aml *var = aml_alloc(); + +build_append_byte(var-buf, 0x5B); /* ExtOpPrefix */ +build_append_byte(var-buf, 0x13); /* CreateFieldOp */ +aml_append(var, srcbuf); +aml_append(var, index); +aml_append(var, len); +build_append_namestring(var-buf, %s, name); +return var; +} + void build_header(GArray *linker, GArray *table_data, AcpiTableHeader *h, const char *sig, int len, uint8_t rev) diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h index 6b591ab..d4dbd44 100644 --- a/include/hw/acpi/aml-build.h +++ b/include/hw/acpi/aml-build.h @@ -277,6 +277,7 @@ Aml *aml_touuid(const char *uuid); Aml *aml_unicode(const char *str); Aml *aml_derefof(Aml *arg); Aml *aml_sizeof(Aml *arg); +Aml *aml_create_field(Aml *srcbuf, Aml *index, Aml *len, const char *name); void build_header(GArray *linker, GArray *table_data, -- 2.1.0 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 04/16] acpi: add aml_sizeof
Implement SizeOf term which is used by NVDIMM _DSM method in later patch Signed-off-by: Xiao Guangrong guangrong.x...@linux.intel.com --- hw/acpi/aml-build.c | 8 include/hw/acpi/aml-build.h | 1 + 2 files changed, 9 insertions(+) diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c index 9e89efc..a526eed 100644 --- a/hw/acpi/aml-build.c +++ b/hw/acpi/aml-build.c @@ -1143,6 +1143,14 @@ Aml *aml_derefof(Aml *arg) return var; } +/* ACPI 6.0: 20.2.5.4 Type 2 Opcodes Encoding: DefSizeOf */ +Aml *aml_sizeof(Aml *arg) +{ +Aml *var = aml_opcode(0x87 /* SizeOfOp */); +aml_append(var, arg); +return var; +} + void build_header(GArray *linker, GArray *table_data, AcpiTableHeader *h, const char *sig, int len, uint8_t rev) diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h index 21dc5e9..6b591ab 100644 --- a/include/hw/acpi/aml-build.h +++ b/include/hw/acpi/aml-build.h @@ -276,6 +276,7 @@ Aml *aml_varpackage(uint32_t num_elements); Aml *aml_touuid(const char *uuid); Aml *aml_unicode(const char *str); Aml *aml_derefof(Aml *arg); +Aml *aml_sizeof(Aml *arg); void build_header(GArray *linker, GArray *table_data, -- 2.1.0 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 01/16] acpi: allow aml_operation_region() working on 64 bit offset
Currently, the offset in OperationRegion is limited to 32 bit, extend it to 64 bit so that we can switch SSDT to 64 bit in later patch Signed-off-by: Xiao Guangrong guangrong.x...@linux.intel.com --- hw/acpi/aml-build.c | 2 +- include/hw/acpi/aml-build.h | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c index 0d4b324..02f9e3d 100644 --- a/hw/acpi/aml-build.c +++ b/hw/acpi/aml-build.c @@ -752,7 +752,7 @@ Aml *aml_package(uint8_t num_elements) /* ACPI 1.0b: 16.2.5.2 Named Objects Encoding: DefOpRegion */ Aml *aml_operation_region(const char *name, AmlRegionSpace rs, - uint32_t offset, uint32_t len) + uint64_t offset, uint32_t len) { Aml *var = aml_alloc(); build_append_byte(var-buf, 0x5B); /* ExtOpPrefix */ diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h index e3afa13..996ac5b 100644 --- a/include/hw/acpi/aml-build.h +++ b/include/hw/acpi/aml-build.h @@ -222,7 +222,7 @@ Aml *aml_interrupt(AmlConsumerAndProducer con_and_pro, Aml *aml_io(AmlIODecode dec, uint16_t min_base, uint16_t max_base, uint8_t aln, uint8_t len); Aml *aml_operation_region(const char *name, AmlRegionSpace rs, - uint32_t offset, uint32_t len); + uint64_t offset, uint32_t len); Aml *aml_irq_no_flags(uint8_t irq); Aml *aml_named_field(const char *name, unsigned length); Aml *aml_reserved_field(unsigned length); -- 2.1.0 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 11/16] nvdimm: build ACPI nvdimm devices
NVDIMM devices is defined in ACPI 6.0 9.20 NVDIMM Devices This is a root device under \_SB and specified NVDIMM device are under the root device. Each NVDIMM device has _ADR which return its handle used to associate MEMDEV table in NFIT We reserve handle 0 for root device. In this patch, we save handle, arg0, arg1 and arg2. Arg3 is conditionally saved in later patch Signed-off-by: Xiao Guangrong guangrong.x...@linux.intel.com --- hw/i386/acpi-build.c | 2 + hw/mem/pc-nvdimm.c | 126 + include/hw/mem/pc-nvdimm.h | 6 +++ 3 files changed, 134 insertions(+) diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index 80c21be..85c7226 100644 --- a/hw/i386/acpi-build.c +++ b/hw/i386/acpi-build.c @@ -1342,6 +1342,8 @@ build_ssdt(GArray *table_data, GArray *linker, aml_append(sb_scope, scope); } } + +pc_nvdimm_build_acpi_devices(sb_scope); aml_append(ssdt, sb_scope); } diff --git a/hw/mem/pc-nvdimm.c b/hw/mem/pc-nvdimm.c index 4c290cb..0e2a9d5 100644 --- a/hw/mem/pc-nvdimm.c +++ b/hw/mem/pc-nvdimm.c @@ -32,6 +32,7 @@ #define PAGE_SIZE (1UL 12) +#define NOTIFY_VALUE(0x99) #define MAX_NVDIMM_NUMBER (10) #define MIN_CONFIG_DATA_SIZE(128 10) @@ -348,12 +349,15 @@ struct dsm_buffer { static uint64_t dsm_read(void *opaque, hwaddr addr, unsigned size) { +fprintf(stderr, BUG: we never read DSM notification MMIO.\n); +assert(0); return 0; } static void dsm_write(void *opaque, hwaddr addr, uint64_t val, unsigned size) { +assert(val == NOTIFY_VALUE); } static const MemoryRegionOps dsm_ops = { @@ -429,6 +433,128 @@ exit: g_slist_free(list); } +#define BUILD_STA_METHOD(_dev_, _method_) \ +do { \ +_method_ = aml_method(_STA, 0); \ +aml_append(_method_, aml_return(aml_int(0x0f))); \ +aml_append(_dev_, _method_); \ +} while (0) + +#define SAVE_ARG012_HANDLE(_method_, _handle_) \ +do { \ +aml_append(_method_, aml_store(_handle_, aml_name(HDLE))); \ +aml_append(_method_, aml_store(aml_arg(0), aml_name(ARG0))); \ +aml_append(_method_, aml_store(aml_arg(1), aml_name(ARG1))); \ +aml_append(_method_, aml_store(aml_arg(2), aml_name(ARG2))); \ +} while (0) + +#define NOTIFY_AND_RETURN(_method_)\ +do { \ +aml_append(_method_, aml_store(aml_int(NOTIFY_VALUE), \ + aml_name(NOTI))); \ +aml_append(_method_, aml_return(aml_name(ODAT)));\ +} while (0) + +static void build_nvdimm_devices(Aml *root_dev, GSList *list) +{ +for (; list; list = list-next) { +PCNVDIMMDevice *nvdimm = list-data; +uint32_t handle = nvdimm_index_to_handle(nvdimm-device_index); +Aml *dev, *method; + +dev = aml_device(NVD%d, nvdimm-device_index); +aml_append(dev, aml_name_decl(_ADR, aml_int(handle))); + +BUILD_STA_METHOD(dev, method); + +method = aml_method(_DSM, 4); +{ +SAVE_ARG012_HANDLE(method, aml_int(handle)); +NOTIFY_AND_RETURN(method); +} +aml_append(dev, method); + +aml_append(root_dev, dev); +} +} + +void pc_nvdimm_build_acpi_devices(Aml *sb_scope) +{ +Aml *dev, *method, *field; +struct dsm_buffer *dsm_buf; +GSList *list = get_nvdimm_built_list(); +int nr = get_nvdimm_device_number(list); + +if (nr = 0 || nr MAX_NVDIMM_NUMBER) { +g_slist_free(list); +return; +} + +dev = aml_device(NVDR); +aml_append(dev, aml_name_decl(_HID, aml_string(ACPI0012))); + +/* map DSM buffer into ACPI namespace. */ +aml_append(dev, aml_operation_region(DSMR, AML_SYSTEM_MEMORY, + nvdimms_info.dsm_addr, nvdimms_info.dsm_size)); + +/* + * DSM input: + * @HDLE: store device's handle, it's zero if the _DSM call happens + *on ROOT. + * @ARG0 ~ @ARG3: store the parameters of _DSM call. + * + * They are ram mapping on host so that these access never cause VM-EXIT. + */ +field = aml_field(DSMR, AML_DWORD_ACC, AML_PRESERVE); +aml_append(field, aml_named_field(HDLE, + sizeof(dsm_buf-handle) * BITS_PER_BYTE)); +aml_append(field, aml_named_field(ARG0, + sizeof(dsm_buf-arg0) * BITS_PER_BYTE)); +aml_append(field, aml_named_field(ARG1, + sizeof(dsm_buf-arg1) *
[PATCH 07/16] nvdimm: reserve address range for NVDIMM
NVDIMM reserves all the free range above 4G to do: - Persistent Memory (PMEM) mapping - implement NVDIMM ACPI device _DSM method Signed-off-by: Xiao Guangrong guangrong.x...@linux.intel.com --- hw/i386/pc.c | 11 +-- hw/mem/pc-nvdimm.c | 13 + include/hw/mem/pc-nvdimm.h | 5 + 3 files changed, 27 insertions(+), 2 deletions(-) diff --git a/hw/i386/pc.c b/hw/i386/pc.c index 7072930..82e80a9 100644 --- a/hw/i386/pc.c +++ b/hw/i386/pc.c @@ -64,6 +64,7 @@ #include hw/pci/pci_host.h #include acpi-build.h #include hw/mem/pc-dimm.h +#include hw/mem/pc-nvdimm.h #include trace.h #include qapi/visitor.h #include qapi-visit.h @@ -1241,6 +1242,7 @@ FWCfgState *pc_memory_init(MachineState *machine, MemoryRegion *ram_below_4g, *ram_above_4g; FWCfgState *fw_cfg; PCMachineState *pcms = PC_MACHINE(machine); +ram_addr_t offset; assert(machine-ram_size == below_4g_mem_size + above_4g_mem_size); @@ -1278,6 +1280,8 @@ FWCfgState *pc_memory_init(MachineState *machine, exit(EXIT_FAILURE); } +offset = 0x1ULL + above_4g_mem_size; + /* initialize hotplug memory address space */ if (guest_info-has_reserved_memory (machine-ram_size machine-maxram_size)) { @@ -1297,8 +1301,7 @@ FWCfgState *pc_memory_init(MachineState *machine, exit(EXIT_FAILURE); } -pcms-hotplug_memory_base = -ROUND_UP(0x1ULL + above_4g_mem_size, 1ULL 30); +pcms-hotplug_memory_base = ROUND_UP(offset, 1ULL 30); if (pcms-enforce_aligned_dimm) { /* size hotplug region assuming 1G page max alignment per slot */ @@ -1316,8 +1319,12 @@ FWCfgState *pc_memory_init(MachineState *machine, hotplug-memory, hotplug_mem_size); memory_region_add_subregion(system_memory, pcms-hotplug_memory_base, pcms-hotplug_memory); +offset = pcms-hotplug_memory_base + hotplug_mem_size; } +/* all the space left above 4G is reserved for NVDIMM. */ +pc_nvdimm_reserve_range(offset); + /* Initialize PC system firmware */ pc_system_firmware_init(rom_memory, guest_info-isapc_ram_fw); diff --git a/hw/mem/pc-nvdimm.c b/hw/mem/pc-nvdimm.c index 0209ea9..b40d4e7 100644 --- a/hw/mem/pc-nvdimm.c +++ b/hw/mem/pc-nvdimm.c @@ -24,6 +24,19 @@ #include hw/mem/pc-nvdimm.h +#define PAGE_SIZE (1UL 12) + +static struct nvdimms_info { +ram_addr_t current_addr; +} nvdimms_info; + +/* the address range [offset, ~0ULL) is reserved for NVDIMM. */ +void pc_nvdimm_reserve_range(ram_addr_t offset) +{ +offset = ROUND_UP(offset, PAGE_SIZE); +nvdimms_info.current_addr = offset; +} + static char *get_file(Object *obj, Error **errp) { PCNVDIMMDevice *nvdimm = PC_NVDIMM(obj); diff --git a/include/hw/mem/pc-nvdimm.h b/include/hw/mem/pc-nvdimm.h index 7f37b46..2081e7c 100644 --- a/include/hw/mem/pc-nvdimm.h +++ b/include/hw/mem/pc-nvdimm.h @@ -27,6 +27,11 @@ typedef struct PCNVDIMMDevice { #define PC_NVDIMM(obj) \ OBJECT_CHECK(PCNVDIMMDevice, (obj), TYPE_PC_NVDIMM) + +void pc_nvdimm_reserve_range(ram_addr_t offset); #else /* !CONFIG_LINUX */ +static inline void pc_nvdimm_reserve_range(ram_addr_t offset) +{ +} #endif #endif -- 2.1.0 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 10/16] nvdimm: init the address region used by _DSM method
This memory range is used to transfer data between ACPI in guest and Qemu, it occupies two pages: - one is RAM-based used to save the input info of _DSM method and Qemu reuse it store output info - another one is MMIO-based, ACPI write data to this page to transfer the control to Qemu Signed-off-by: Xiao Guangrong guangrong.x...@linux.intel.com --- hw/mem/pc-nvdimm.c | 80 +- 1 file changed, 79 insertions(+), 1 deletion(-) diff --git a/hw/mem/pc-nvdimm.c b/hw/mem/pc-nvdimm.c index e7cff29..4c290cb 100644 --- a/hw/mem/pc-nvdimm.c +++ b/hw/mem/pc-nvdimm.c @@ -37,6 +37,10 @@ static struct nvdimms_info { ram_addr_t current_addr; + +ram_addr_t dsm_addr; +int dsm_size; + int device_index; } nvdimms_info; @@ -324,14 +328,88 @@ static void build_nfit_table(GSList *device_list, char *buf) } } +struct dsm_buffer { +/* RAM page. */ +uint32_t handle; +uint8_t arg0[16]; +uint32_t arg1; +uint32_t arg2; +union { +char arg3[PAGE_SIZE - 3 * sizeof(uint32_t) - 16 * sizeof(uint8_t)]; +}; + +/* MMIO page. */ +union { +uint32_t notify; +char pedding[PAGE_SIZE]; +}; +}; + +static uint64_t dsm_read(void *opaque, hwaddr addr, + unsigned size) +{ +return 0; +} + +static void dsm_write(void *opaque, hwaddr addr, + uint64_t val, unsigned size) +{ +} + +static const MemoryRegionOps dsm_ops = { +.read = dsm_read, +.write = dsm_write, +.endianness = DEVICE_NATIVE_ENDIAN, +}; + +static int build_dsm_buffer(void) +{ +MemoryRegion *dsm_ram_mr, *dsm_mmio_mr; +ram_addr_t addr;; + +QEMU_BUILD_BUG_ON(PAGE_SIZE * 2 != sizeof(struct dsm_buffer)); + +/* DSM buffer has already been built. */ +if (nvdimms_info.dsm_addr) { +return 0; +} + +addr = reserved_range_push(2 * PAGE_SIZE); +if (!addr) { +return -1; +} + +nvdimms_info.dsm_addr = addr; +nvdimms_info.dsm_size = PAGE_SIZE * 2; + +dsm_ram_mr = g_new(MemoryRegion, 1); +memory_region_init_ram(dsm_ram_mr, NULL, dsm_ram, PAGE_SIZE, + error_abort); +vmstate_register_ram_global(dsm_ram_mr); +memory_region_add_subregion(get_system_memory(), addr, dsm_ram_mr); + +dsm_mmio_mr = g_new(MemoryRegion, 1); +memory_region_init_io(dsm_mmio_mr, NULL, dsm_ops, dsm_ram_mr, + dsm_mmio, PAGE_SIZE); +memory_region_add_subregion(get_system_memory(), addr + PAGE_SIZE, +dsm_mmio_mr); +return 0; +} + void pc_nvdimm_build_nfit_table(GArray *table_offsets, GArray *table_data, GArray *linker) { -GSList *list = get_nvdimm_built_list(); +GSList *list; size_t total; char *buf; int nfit_start, nr; +if (build_dsm_buffer()) { +fprintf(stderr, do not have enough space for DSM buffer.\n); +return; +} + +list = get_nvdimm_built_list(); nr = get_nvdimm_device_number(list); total = get_nfit_total_size(nr); -- 2.1.0 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 16/16] nvdimm: support NFIT_CMD_SET_CONFIG_DATA
Function 6 is used to set Namespace Label Data Signed-off-by: Xiao Guangrong guangrong.x...@linux.intel.com --- hw/mem/pc-nvdimm.c | 37 + 1 file changed, 37 insertions(+) diff --git a/hw/mem/pc-nvdimm.c b/hw/mem/pc-nvdimm.c index 0498de3..0d2d9fb 100644 --- a/hw/mem/pc-nvdimm.c +++ b/hw/mem/pc-nvdimm.c @@ -450,12 +450,17 @@ struct cmd_out_get_config_data { uint8_t out_buf[0]; } QEMU_PACKED; +struct cmd_out_set_config_data { +uint32_t status; +} QEMU_PACKED; + struct dsm_out { union { uint32_t status; struct cmd_out_implemented cmd_implemented; struct cmd_out_get_config_size cmd_config_size; struct cmd_out_get_config_data cmd_config_get; +struct cmd_out_set_config_data cmd_config_set; uint8_t data[PAGE_SIZE]; }; }; @@ -555,6 +560,35 @@ exit: return status; } +static uint32_t dsm_cmd_config_set(struct dsm_buffer *in, struct dsm_out *out) +{ +GSList *list = get_nvdimm_built_list(); +PCNVDIMMDevice *nvdimm = get_nvdimm_device_by_handle(list, in-handle); +struct cmd_in_set_config_data *cmd_in = in-cmd_config_set; +uint32_t status = NFIT_STATUS_NON_EXISTING_MEM_DEV; + +if (!nvdimm) { +goto exit; +} + +nvdebug(Write Config: offset %#x length %#x.\n, cmd_in-offset, +cmd_in-length); +if (nvdimm-config_data_size cmd_in-length + cmd_in-offset) { +nvdebug(position %#x is beyond config data (len = %#lx).\n, +cmd_in-length + cmd_in-offset, nvdimm-config_data_size); +status = NFIT_STATUS_INVALID_PARAS; +goto exit; +} + +status = NFIT_STATUS_SUCCESS; +memcpy(nvdimm-config_data_addr + cmd_in-offset, cmd_in-in_buf, + cmd_in-length); + +exit: +g_slist_free(list); +return status; +} + static void dsm_write_nvdimm(struct dsm_buffer *in, struct dsm_out *out) { uint32_t function = in-arg2; @@ -570,6 +604,9 @@ static void dsm_write_nvdimm(struct dsm_buffer *in, struct dsm_out *out) case NFIT_CMD_GET_CONFIG_DATA: status = dsm_cmd_config_get(in, out); break; +case NFIT_CMD_SET_CONFIG_DATA: +status = dsm_cmd_config_set(in, out); +break; default: status = NFIT_STATUS_NOT_SUPPORTED; }; -- 2.1.0 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 15/16] nvdimm: support NFIT_CMD_GET_CONFIG_DATA
Function 5 is used to get Namespace Label Data Signed-off-by: Xiao Guangrong guangrong.x...@linux.intel.com --- hw/mem/pc-nvdimm.c | 33 + 1 file changed, 33 insertions(+) diff --git a/hw/mem/pc-nvdimm.c b/hw/mem/pc-nvdimm.c index 7e5446c..0498de3 100644 --- a/hw/mem/pc-nvdimm.c +++ b/hw/mem/pc-nvdimm.c @@ -423,6 +423,7 @@ struct dsm_buffer { uint32_t arg1; uint32_t arg2; union { +struct cmd_in_get_config_data cmd_config_get; struct cmd_in_set_config_data cmd_config_set; char arg3[PAGE_SIZE - 3 * sizeof(uint32_t) - 16 * sizeof(uint8_t)]; }; @@ -525,6 +526,35 @@ exit: return status; } +static uint32_t dsm_cmd_config_get(struct dsm_buffer *in, struct dsm_out *out) +{ +GSList *list = get_nvdimm_built_list(); +PCNVDIMMDevice *nvdimm = get_nvdimm_device_by_handle(list, in-handle); +struct cmd_in_get_config_data *cmd_in = in-cmd_config_get; +uint32_t status = NFIT_STATUS_NON_EXISTING_MEM_DEV; + +if (!nvdimm) { +goto exit; +} + +nvdebug(Read Config: offset %#x length %#x.\n, cmd_in-offset, +cmd_in-length); +if (nvdimm-config_data_size cmd_in-length + cmd_in-offset) { +nvdebug(position %#x is beyond config data (len = %#lx).\n, +cmd_in-length + cmd_in-offset, nvdimm-config_data_size); +status = NFIT_STATUS_INVALID_PARAS; +goto exit; +} + +status = NFIT_STATUS_SUCCESS; +memcpy(out-cmd_config_get.out_buf, nvdimm-config_data_addr + + cmd_in-offset, cmd_in-length); + +exit: +g_slist_free(list); +return status; +} + static void dsm_write_nvdimm(struct dsm_buffer *in, struct dsm_out *out) { uint32_t function = in-arg2; @@ -537,6 +567,9 @@ static void dsm_write_nvdimm(struct dsm_buffer *in, struct dsm_out *out) case NFIT_CMD_GET_CONFIG_SIZE: status = dsm_cmd_config_size(in, out); break; +case NFIT_CMD_GET_CONFIG_DATA: +status = dsm_cmd_config_get(in, out); +break; default: status = NFIT_STATUS_NOT_SUPPORTED; }; -- 2.1.0 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 12/16] nvdimm: save arg3 for NVDIMM device _DSM method
Check if the function (Arg2) has additional input info (arg3) and save the info if needed We only do the save on NVDIMM device since we are not going to support any function on root device Signed-off-by: Xiao Guangrong guangrong.x...@linux.intel.com --- hw/mem/pc-nvdimm.c | 73 +- 1 file changed, 72 insertions(+), 1 deletion(-) diff --git a/hw/mem/pc-nvdimm.c b/hw/mem/pc-nvdimm.c index 0e2a9d5..c0965ae 100644 --- a/hw/mem/pc-nvdimm.c +++ b/hw/mem/pc-nvdimm.c @@ -329,6 +329,26 @@ static void build_nfit_table(GSList *device_list, char *buf) } } +enum { +NFIT_CMD_IMPLEMENTED = 0, + +/* bus commands */ +NFIT_CMD_ARS_CAP = 1, +NFIT_CMD_ARS_START = 2, +NFIT_CMD_ARS_QUERY = 3, + +/* per-dimm commands */ +NFIT_CMD_SMART = 1, +NFIT_CMD_SMART_THRESHOLD = 2, +NFIT_CMD_DIMM_FLAGS = 3, +NFIT_CMD_GET_CONFIG_SIZE = 4, +NFIT_CMD_GET_CONFIG_DATA = 5, +NFIT_CMD_SET_CONFIG_DATA = 6, +NFIT_CMD_VENDOR_EFFECT_LOG_SIZE = 7, +NFIT_CMD_VENDOR_EFFECT_LOG = 8, +NFIT_CMD_VENDOR = 9, +}; + struct dsm_buffer { /* RAM page. */ uint32_t handle; @@ -433,6 +453,19 @@ exit: g_slist_free(list); } +static bool device_cmd_has_arg3[] = { +false, /* NFIT_CMD_IMPLEMENTED */ +false, /* NFIT_CMD_SMART */ +false, /* NFIT_CMD_SMART_THRESHOLD */ +false, /* NFIT_CMD_DIMM_FLAGS */ +false, /* NFIT_CMD_GET_CONFIG_SIZE */ +true, /* NFIT_CMD_GET_CONFIG_DATA */ +true, /* NFIT_CMD_SET_CONFIG_DATA */ +false, /* NFIT_CMD_VENDOR_EFFECT_LOG_SIZE */ +false, /* NFIT_CMD_VENDOR_EFFECT_LOG */ +false, /* NFIT_CMD_VENDOR */ +}; + #define BUILD_STA_METHOD(_dev_, _method_) \ do { \ _method_ = aml_method(_STA, 0); \ @@ -457,10 +490,20 @@ exit: static void build_nvdimm_devices(Aml *root_dev, GSList *list) { +Aml *has_arg3; +int i, cmd_nr; + +cmd_nr = ARRAY_SIZE(device_cmd_has_arg3); +has_arg3 = aml_package(cmd_nr); +for (i = 0; i cmd_nr; i++) { +aml_append(has_arg3, aml_int(device_cmd_has_arg3[i])); +} +aml_append(root_dev, aml_name_decl(CAG3, has_arg3)); + for (; list; list = list-next) { PCNVDIMMDevice *nvdimm = list-data; uint32_t handle = nvdimm_index_to_handle(nvdimm-device_index); -Aml *dev, *method; +Aml *dev, *method, *ifctx; dev = aml_device(NVD%d, nvdimm-device_index); aml_append(dev, aml_name_decl(_ADR, aml_int(handle))); @@ -470,6 +513,34 @@ static void build_nvdimm_devices(Aml *root_dev, GSList *list) method = aml_method(_DSM, 4); { SAVE_ARG012_HANDLE(method, aml_int(handle)); + +/* Local5 = DeRefOf(Index(CAG3, Arg2)) */ +aml_append(method, + aml_store(aml_derefof(aml_index(aml_name(CAG3), + aml_arg(2))), aml_local(5))); +/* if 0 local5 */ +ifctx = aml_if(aml_lless(aml_int(0), aml_local(5))); +{ +/* Local0 = Index(Arg3, 0) */ +aml_append(ifctx, aml_store(aml_index(aml_arg(3), aml_int(0)), + aml_local(0))); +/* Local1 = sizeof(Local0) */ +aml_append(ifctx, aml_store(aml_sizeof(aml_local(0)), + aml_local(1))); +/* Local2 = Local1 3 */ +aml_append(ifctx, aml_store(aml_shiftleft(aml_local(1), + aml_int(3)), aml_local(2))); +/* Local3 = DeRefOf(Local0) */ +aml_append(ifctx, aml_store(aml_derefof(aml_local(0)), + aml_local(3))); +/* CreateField(Local3, 0, local2, IBUF) */ +aml_append(ifctx, aml_create_field(aml_local(3), + aml_int(0), aml_local(2), IBUF)); +/* ARG3 = IBUF */ +aml_append(ifctx, aml_store(aml_name(IBUF), + aml_name(ARG3))); +} +aml_append(method, ifctx); NOTIFY_AND_RETURN(method); } aml_append(dev, method); -- 2.1.0 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 14/16] nvdimm: support NFIT_CMD_GET_CONFIG_SIZE function
Function 4 is used to get Namespace lable size Signed-off-by: Xiao Guangrong guangrong.x...@linux.intel.com --- hw/mem/pc-nvdimm.c | 87 ++ 1 file changed, 87 insertions(+) diff --git a/hw/mem/pc-nvdimm.c b/hw/mem/pc-nvdimm.c index b586bf7..7e5446c 100644 --- a/hw/mem/pc-nvdimm.c +++ b/hw/mem/pc-nvdimm.c @@ -127,6 +127,20 @@ static uint32_t nvdimm_index_to_handle(int index) return index + 1; } +static PCNVDIMMDevice +*get_nvdimm_device_by_handle(GSList *list, uint32_t handle) +{ +for (; list; list = list-next) { +PCNVDIMMDevice *nvdimm = list-data; + +if (nvdimm_index_to_handle(nvdimm-device_index) == handle) { +return nvdimm; +} +} + +return NULL; +} + typedef struct { uint8_t b[16]; } uuid_le; @@ -391,6 +405,17 @@ enum { | (1 NFIT_CMD_GET_CONFIG_DATA)\ | (1 NFIT_CMD_SET_CONFIG_DATA)) +struct cmd_in_get_config_data { +uint32_t offset; +uint32_t length; +} QEMU_PACKED; + +struct cmd_in_set_config_data { +uint32_t offset; +uint32_t length; +uint8_t in_buf[0]; +} QEMU_PACKED; + struct dsm_buffer { /* RAM page. */ uint32_t handle; @@ -398,6 +423,7 @@ struct dsm_buffer { uint32_t arg1; uint32_t arg2; union { +struct cmd_in_set_config_data cmd_config_set; char arg3[PAGE_SIZE - 3 * sizeof(uint32_t) - 16 * sizeof(uint8_t)]; }; @@ -412,10 +438,23 @@ struct cmd_out_implemented { uint64_t cmd_list; }; +struct cmd_out_get_config_size { +uint32_t status; +uint32_t config_size; +uint32_t max_xfer; +} QEMU_PACKED; + +struct cmd_out_get_config_data { +uint32_t status; +uint8_t out_buf[0]; +} QEMU_PACKED; + struct dsm_out { union { uint32_t status; struct cmd_out_implemented cmd_implemented; +struct cmd_out_get_config_size cmd_config_size; +struct cmd_out_get_config_data cmd_config_get; uint8_t data[PAGE_SIZE]; }; }; @@ -441,6 +480,51 @@ static void dsm_write_root(struct dsm_buffer *in, struct dsm_out *out) nvdebug(Return status %#x.\n, out-status); } +/* + * the max transfer size is the max size transfered by both a + * NFIT_CMD_GET_CONFIG_DATA and a NFIT_CMD_SET_CONFIG_DATA + * command. + */ +static uint32_t max_xfer_config_size(void) +{ +struct dsm_buffer *in; +struct dsm_out *out; +uint32_t max_get_size, max_set_size; + +/* + * the max data ACPI can read one time which is transfered by + * the response of NFIT_CMD_GET_CONFIG_DATA. + */ +max_get_size = sizeof(out-data) - sizeof(out-cmd_config_get); + +/* + * the max data ACPI can write one time which is transfered by + * NFIT_CMD_SET_CONFIG_DATA + */ +max_set_size = sizeof(in-arg3) - sizeof(in-cmd_config_set); +return MIN(max_get_size, max_set_size); +} + +static uint32_t dsm_cmd_config_size(struct dsm_buffer *in, struct dsm_out *out) +{ +GSList *list = get_nvdimm_built_list(); +PCNVDIMMDevice *nvdimm = get_nvdimm_device_by_handle(list, in-handle); +uint32_t status = NFIT_STATUS_NON_EXISTING_MEM_DEV; + +if (!nvdimm) { +goto exit; +} + +status = NFIT_STATUS_SUCCESS; +out-cmd_config_size.config_size = nvdimm-config_data_size; +out-cmd_config_size.max_xfer = max_xfer_config_size(); +nvdebug(%s config_size %#x, max_xfer %#x.\n, __func__, +out-cmd_config_size.config_size, out-cmd_config_size.max_xfer); +exit: +g_slist_free(list); +return status; +} + static void dsm_write_nvdimm(struct dsm_buffer *in, struct dsm_out *out) { uint32_t function = in-arg2; @@ -450,6 +534,9 @@ static void dsm_write_nvdimm(struct dsm_buffer *in, struct dsm_out *out) case NFIT_CMD_IMPLEMENTED: out-cmd_implemented.cmd_list = DIMM_SUPPORT_CMD; return; +case NFIT_CMD_GET_CONFIG_SIZE: +status = dsm_cmd_config_size(in, out); +break; default: status = NFIT_STATUS_NOT_SUPPORTED; }; -- 2.1.0 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 9/9] qemu/kvm: kvm hyper-v based guest crash event handling
On 30/06/2015 13:33, Denis V. Lunev wrote: +static int kvm_arch_handle_hv_crash(CPUState *cs) +{ +X86CPU *cpu = X86_CPU(cs); +CPUX86State *env = cpu-env; + +/* Mark that Hyper-v guest crash occurred */ +env-hv_crash_occurred = 1; This need not be a hv crash. You can add crash_occurred to CPUState directly, and set it in qemu_system_guest_panicked: if (current_cpu) { current_cpu-crash_occurred = true; } Then you would add two subsections: one for crash_occurred in exec.c (attached to vmstate_cpu_common), one for hyperv crash params in target-i386/machine.c. This also gives an idea about splitting the patch: first the introduction of qemu_system_guest_panicked and crash_occurred, second the Hyper-V specific bits. +if (cpu-hyperv_crash) { +c-edx |= HV_X64_GUEST_CRASH_MSR_AVAILABLE; +has_msr_hv_crash = true; You can only set this to true if the kernel also supports the MSRs. +} + c = cpuid_data.entries[cpuid_i++]; c-function = HYPERV_CPUID_ENLIGHTMENT_INFO; if (cpu-hyperv_relaxed_timing) { @@ -761,6 +767,10 @@ void kvm_arch_reset_vcpu(X86CPU *cpu) } else { env-mp_state = KVM_MP_STATE_RUNNABLE; } +if (has_msr_hv_crash) { +env-msr_hv_crash_ctl = HV_X64_MSR_CRASH_CTL_NOTIFY; The value is always host-defined, so I think it doesn't need a field in CPUX86State. On the other hand, this: +static bool hyperv_crash_enable_needed(void *opaque) +{ +X86CPU *cpu = opaque; +CPUX86State *env = cpu-env; + +return (env-msr_hv_crash_ctl HV_X64_MSR_CRASH_CTL_CONTENTS) ? +true : false; +} + can just check if any of the params fields is nonzero. Thanks, Paolo +env-hv_crash_occurred = 0; +} -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 0/9] HyperV equivalent of pvpanic driver
On 01/07/15 17:09, Paolo Bonzini wrote: On 30/06/2015 13:33, Denis V. Lunev wrote: Windows 2012 guests can notify hypervisor about occurred guest crash (Windows bugcheck(BSOD)) by writing specific Hyper-V msrs. This patch does handling of this MSR's by KVM and sending notification to user space that allows to gather Windows guest crash dump by QEMU/LIBVIRT. The idea is to provide functionality equal to pvpanic device without QEMU guest agent for Windows. The idea is borrowed from Linux HyperV bus driver and validated against Windows 2k12. Changes from v2: * forbid modification crash ctl msr by guest * qemu_system_guest_panicked usage in pvpanic and s390x * hyper-v crash handler move from generic kvm to i386 * hyper-v crash handler: skip fetching crash msrs just mark crash occured * sync with linux-next 20150629 * patch 11 squashed to patch 10 * patch 9 squashed to patch 7 Changes from v1: * hyperv code move to hyperv.c * added read handlers of crash data msrs * added per vm and per cpu hyperv context structures * added saving crash msrs inside qemu cpu state * added qemu fetch and update of crash msrs * added qemu crash msrs store in cpu state and it's migration Signed-off-by: Andrey Smetanin asmeta...@virtuozzo.com Signed-off-by: Denis V. Lunev d...@openvz.org CC: Gleb Natapov g...@kernel.org CC: Paolo Bonzini pbonz...@redhat.com The patches look good, thanks. I'll queue them as soon as I start merging 4.3 features. Paolo that sounds good to me. We'll re-send patch 8 and fork second thread for QEMU part then. Den -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/9] kvm: add hyper-v crash msrs values
On 30/06/2015 13:33, Denis V. Lunev wrote: +#define HV_X64_MSR_CRASH_CTL_NOTIFY (1ULL 63) +#define HV_X64_MSR_CRASH_CTL_CONTENTS\ + (HV_X64_MSR_CRASH_CTL_NOTIFY) Why is HV_X64_MSR_CRASH_CTL_CONTENTS needed? Can I just remove it? Paolo -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] MAINTAINERS: separate section for s390 virtio drivers
On 01/07/2015 17:15, Cornelia Huck wrote: The s390-specific virtio drivers have probably more to do with virtio than with kvm today; let's move them out into a separate section to reflect this and to be able to add relevant mailing lists. CC: Christian Borntraeger borntrae...@de.ibm.com Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com --- MAINTAINERS | 10 +- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/MAINTAINERS b/MAINTAINERS index 246d9d8..fca5c00 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -5766,7 +5766,6 @@ S: Supported F: Documentation/s390/kvm.txt F: arch/s390/include/asm/kvm* F: arch/s390/kvm/ -F: drivers/s390/kvm/ KERNEL VIRTUAL MACHINE (KVM) FOR ARM M: Christoffer Dall christoffer.d...@linaro.org @@ -10671,6 +10670,15 @@ F: drivers/block/virtio_blk.c F: include/linux/virtio_*.h F: include/uapi/linux/virtio_*.h +VIRTIO DRIVERS FOR S390 +M: Christian Borntraeger borntrae...@de.ibm.com +M: Cornelia Huck cornelia.h...@de.ibm.com +L: linux-s...@vger.kernel.org +L: virtualizat...@lists.linux-foundation.org +L: kvm@vger.kernel.org Keeping the KVM mailing list is probably a good idea. +S: Supported +F: drivers/s390/kvm/ Since we are at it, do we want to rename the directory to drivers/s390/virtio? Anyway: Acked-by: Paolo Bonzini pbonz...@redhat.com Paolo VIRTIO HOST (VHOST) M: Michael S. Tsirkin m...@redhat.com L: kvm@vger.kernel.org -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] MAINTAINERS: separate section for s390 virtio drivers
The s390-specific virtio drivers have probably more to do with virtio than with kvm today; let's move them out into a separate section to reflect this and to be able to add relevant mailing lists. CC: Christian Borntraeger borntrae...@de.ibm.com Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com --- MAINTAINERS | 10 +- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/MAINTAINERS b/MAINTAINERS index 246d9d8..fca5c00 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -5766,7 +5766,6 @@ S:Supported F: Documentation/s390/kvm.txt F: arch/s390/include/asm/kvm* F: arch/s390/kvm/ -F: drivers/s390/kvm/ KERNEL VIRTUAL MACHINE (KVM) FOR ARM M: Christoffer Dall christoffer.d...@linaro.org @@ -10671,6 +10670,15 @@ F: drivers/block/virtio_blk.c F: include/linux/virtio_*.h F: include/uapi/linux/virtio_*.h +VIRTIO DRIVERS FOR S390 +M: Christian Borntraeger borntrae...@de.ibm.com +M: Cornelia Huck cornelia.h...@de.ibm.com +L: linux-s...@vger.kernel.org +L: virtualizat...@lists.linux-foundation.org +L: kvm@vger.kernel.org +S: Supported +F: drivers/s390/kvm/ + VIRTIO HOST (VHOST) M: Michael S. Tsirkin m...@redhat.com L: kvm@vger.kernel.org -- 2.3.8 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/9] kvm: add hyper-v crash msrs values
If userspace is controlling the crash capabilities then HV_X64_MSR_CRASH_CTL_CONTENTS is not needed. On Wed, Jul 1, 2015 at 8:53 AM, Denis V. Lunev d...@openvz.org wrote: On 01/07/15 18:00, Paolo Bonzini wrote: On 30/06/2015 13:33, Denis V. Lunev wrote: +#define HV_X64_MSR_CRASH_CTL_NOTIFY(1ULL 63) +#define HV_X64_MSR_CRASH_CTL_CONTENTS \ + (HV_X64_MSR_CRASH_CTL_NOTIFY) Why is HV_X64_MSR_CRASH_CTL_CONTENTS needed? Can I just remove it? Paolo this was a direct request from Peter Hornyack peterhorny...@google.com I suggest here: #define HV_X64_MSR_CRASH_CTL_CONTENTS \ (HV_CRASH_CTL_CRASH_NOTIFY) To allow for more crash actions to be added in the future. Den -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PULL] virtio/vhost: cross endian support
On Wed, Jul 1, 2015 at 2:31 AM, Michael S. Tsirkin m...@redhat.com wrote: virtio/vhost: cross endian support Ugh. Does this really have to be dynamic? Can't virtio do the sane thing, and just use a _fixed_ endianness? Doing a unconditional byte swap is faster and simpler than the crazy conditionals. That's true regardless of endianness, but gets to be even more so if the fixed endianness is little-endian, since BE is not-so-slowly fading from the world. Linus -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PULL] virtio/vhost: cross endian support
On Wed, Jul 1, 2015 at 12:02 PM, Linus Torvalds torva...@linux-foundation.org wrote: Doing a unconditional byte swap is faster and simpler than the crazy conditionals. Unconditional endianness not only makes for simpler and faster code, it also ends up being easier to debug and add things like type annotations for sparse. Linus -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v7 02/11] KVM: arm64: guest debug, define API headers
This commit defines the API headers for guest debugging. There are two architecture specific debug structures: - kvm_guest_debug_arch, allows us to pass in HW debug registers - kvm_debug_exit_arch, signals exception and possible faulting address The type of debugging being used is controlled by the architecture specific control bits of the kvm_guest_debug-control flags in the ioctl structure. Signed-off-by: Alex Bennée alex.ben...@linaro.org Reviewed-by: David Hildenbrand d...@linux.vnet.ibm.com Reviewed-by: Andrew Jones drjo...@redhat.com Acked-by: Christoffer Dall christoffer.d...@linaro.org --- v2 - expose hsr and pc directly to user-space v3 - s/control/controlled/ in commit message - add v8 to ARM ARM comment (ARM Architecture Reference Manual) - add rb tag - rm pc, add far - re-word comments on alignment - rename KVM_ARM_NDBG_REGS - KVM_ARM_MAX_DBG_REGS v4 - now uses common HW/SW BP define - add a-b-tag - use u32 for control regs v5 - revert to have arch specific KVM_GUESTDBG_USE_SW/HW_BP - rm stale comments dbgctrl was stored as u64 v6 - mv far comment from later patch - KVM_GUESTDBG_USE_HW_BP - KVM_GUESTDBG_USE_HW - revert control regs to u64 (parity with GET/SET_ONE_REG) --- arch/arm64/include/uapi/asm/kvm.h | 27 +++ 1 file changed, 27 insertions(+) diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h index d268320..d82f3f3 100644 --- a/arch/arm64/include/uapi/asm/kvm.h +++ b/arch/arm64/include/uapi/asm/kvm.h @@ -100,12 +100,39 @@ struct kvm_sregs { struct kvm_fpu { }; +/* + * See v8 ARM ARM D7.3: Debug Registers + * + * The architectural limit is 16 debug registers of each type although + * in practice there are usually less (see ID_AA64DFR0_EL1). + * + * Although the control registers are architecturally defined as 32 + * bits wide we use a 64 bit structure here to keep parity with + * KVM_GET/SET_ONE_REG behaviour which treats all system registers as + * 64 bit values. It also allows for the possibility of the + * architecture expanding the control registers without having to + * change the userspace ABI. + */ +#define KVM_ARM_MAX_DBG_REGS 16 struct kvm_guest_debug_arch { + __u64 dbg_bcr[KVM_ARM_MAX_DBG_REGS]; + __u64 dbg_bvr[KVM_ARM_MAX_DBG_REGS]; + __u64 dbg_wcr[KVM_ARM_MAX_DBG_REGS]; + __u64 dbg_wvr[KVM_ARM_MAX_DBG_REGS]; }; struct kvm_debug_exit_arch { + __u32 hsr; + __u64 far; /* used for watchpoints */ }; +/* + * Architecture specific defines for kvm_guest_debug-control + */ + +#define KVM_GUESTDBG_USE_SW_BP (1 16) +#define KVM_GUESTDBG_USE_HW(1 17) + struct kvm_sync_regs { }; -- 2.4.5 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 7/9] kvm/x86: added hyper-v crash data and ctl msr's get/set'ers
On 30/06/2015 13:33, Denis V. Lunev wrote: +static int kvm_hv_msr_set_crash_ctl(struct kvm_vcpu *vcpu, u64 data, bool host) +{ + struct kvm_hv *hv = vcpu-kvm-arch.hyperv; + + if (host) + hv-hv_crash_ctl = data; + You need to check against HV_X64_MSR_CRASH_CTL_CONTENTS here (or HV_X64_MSR_CRASH_CTL_NOTIFY) here. Paolo -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 06/11] KVM: arm: add trap handlers for 32-bit debug registers
On June 30, 2015 5:16:41 AM GMT+08:00, Christoffer Dall christoffer.d...@linaro.org wrote: On Mon, Jun 22, 2015 at 06:41:29PM +0800, Zhichao Huang wrote: Add handlers for all the 32-bit debug registers. Signed-off-by: Zhichao Huang zhichao.hu...@linaro.org --- arch/arm/include/asm/kvm_asm.h | 12 arch/arm/include/asm/kvm_host.h | 3 + arch/arm/kernel/asm-offsets.c | 1 + arch/arm/kvm/coproc.c | 122 4 files changed, 138 insertions(+) diff --git a/arch/arm/include/asm/kvm_asm.h b/arch/arm/include/asm/kvm_asm.h index 25410b2..ba65e05 100644 --- a/arch/arm/include/asm/kvm_asm.h +++ b/arch/arm/include/asm/kvm_asm.h @@ -52,6 +52,18 @@ #define c10_AMAIR1 30 /* Auxilary Memory Attribute Indirection Reg1 */ #define NR_CP15_REGS31 /* Number of regs (incl. invalid) */ +/* 0 is reserved as an invalid value. */ +#define cp14_DBGBVR01 /* Debug Breakpoint Control Registers (0-15) */ +#define cp14_DBGBVR15 16 +#define cp14_DBGBCR017 /* Debug Breakpoint Value Registers (0-15) */ +#define cp14_DBGBCR15 32 +#define cp14_DBGWVR033 /* Debug Watchpoint Control Registers (0-15) */ +#define cp14_DBGWVR15 48 +#define cp14_DBGWCR049 /* Debug Watchpoint Value Registers (0-15) */ +#define cp14_DBGWCR15 64 +#define cp14_DBGDSCRext 65 /* Debug Status and Control external */ +#define NR_CP14_REGS66 /* Number of regs (incl. invalid) */ + #define ARM_EXCEPTION_RESET 0 #define ARM_EXCEPTION_UNDEFINED 1 #define ARM_EXCEPTION_SOFTWARE2 diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h index d71607c..3d16820 100644 --- a/arch/arm/include/asm/kvm_host.h +++ b/arch/arm/include/asm/kvm_host.h @@ -124,6 +124,9 @@ struct kvm_vcpu_arch { struct vgic_cpu vgic_cpu; struct arch_timer_cpu timer_cpu; +/* System control coprocessor (cp14) */ +u32 cp14[NR_CP14_REGS]; + /* * Anything that is not used directly from assembly code goes * here. diff --git a/arch/arm/kernel/asm-offsets.c b/arch/arm/kernel/asm-offsets.c index 871b826..9158de0 100644 --- a/arch/arm/kernel/asm-offsets.c +++ b/arch/arm/kernel/asm-offsets.c @@ -172,6 +172,7 @@ int main(void) #ifdef CONFIG_KVM_ARM_HOST DEFINE(VCPU_KVM, offsetof(struct kvm_vcpu, kvm)); DEFINE(VCPU_MIDR, offsetof(struct kvm_vcpu, arch.midr)); + DEFINE(VCPU_CP14, offsetof(struct kvm_vcpu, arch.cp14)); DEFINE(VCPU_CP15, offsetof(struct kvm_vcpu, arch.cp15)); DEFINE(VCPU_VFP_GUEST,offsetof(struct kvm_vcpu, arch.vfp_guest)); DEFINE(VCPU_VFP_HOST, offsetof(struct kvm_vcpu, arch.host_cpu_context)); diff --git a/arch/arm/kvm/coproc.c b/arch/arm/kvm/coproc.c index 16d5f69..59b65b7 100644 --- a/arch/arm/kvm/coproc.c +++ b/arch/arm/kvm/coproc.c @@ -220,6 +220,47 @@ bool access_vm_reg(struct kvm_vcpu *vcpu, return true; } +static bool trap_debug32(struct kvm_vcpu *vcpu, +const struct coproc_params *p, +const struct coproc_reg *r) +{ +if (p-is_write) +vcpu-arch.cp14[r-reg] = *vcpu_reg(vcpu, p-Rt1); +else +*vcpu_reg(vcpu, p-Rt1) = vcpu-arch.cp14[r-reg]; + +return true; +} + +/* DBGIDR (RO) Debug ID */ +static bool trap_dbgidr(struct kvm_vcpu *vcpu, +const struct coproc_params *p, +const struct coproc_reg *r) +{ +u32 val; + +if (p-is_write) +return ignore_write(vcpu, p); + +ARM_DBG_READ(c0, c0, 0, val); +*vcpu_reg(vcpu, p-Rt1) = val; + +return true; +} + +/* DBGDSCRint (RO) Debug Status and Control Register */ +static bool trap_dbgdscr(struct kvm_vcpu *vcpu, +const struct coproc_params *p, +const struct coproc_reg *r) +{ +if (p-is_write) +return ignore_write(vcpu, p); + +*vcpu_reg(vcpu, p-Rt1) = vcpu-arch.cp14[r-reg]; + +return true; +} + /* * We could trap ID_DFR0 and tell the guest we don't support performance * monitoring. Unfortunately the patch to make the kernel check ID_DFR0 was @@ -375,7 +416,88 @@ static const struct coproc_reg cp15_regs[] = { { CRn(15), CRm( 0), Op1( 4), Op2( 0), is32, access_cbar}, }; +#define DBG_BCR_BVR_WCR_WVR(n) \ +/* DBGBVRn */ \ +{ CRn( 0), CRm((n)), Op1( 0), Op2( 4), is32,\ + trap_debug32, reset_val, (cp14_DBGBVR0 + (n)), 0 }, \ +/* DBGBCRn */ \ +{ CRn( 0), CRm((n)), Op1( 0), Op2( 5), is32,\ + trap_debug32, reset_val, (cp14_DBGBCR0 + (n)), 0 }, \ +/* DBGWVRn */ \ +{ CRn( 0), CRm((n)),
Re: [PATCH v3 04/11] KVM: arm: common infrastructure for handling AArch32 CP14/CP15
On June 30, 2015 3:43:34 AM GMT+08:00, Christoffer Dall christoffer.d...@linaro.org wrote: On Mon, Jun 22, 2015 at 06:41:27PM +0800, Zhichao Huang wrote: As we're about to trap a bunch of CP14 registers, let's rework the CP15 handling so it can be generalized and work with multiple tables. Signed-off-by: Zhichao Huang zhichao.hu...@linaro.org --- arch/arm/kvm/coproc.c | 176 ++--- arch/arm/kvm/interrupts_head.S | 2 +- 2 files changed, 112 insertions(+), 66 deletions(-) diff --git a/arch/arm/kvm/coproc.c b/arch/arm/kvm/coproc.c index 9d283d9..d23395b 100644 --- a/arch/arm/kvm/coproc.c +++ b/arch/arm/kvm/coproc.c @@ -375,6 +375,9 @@ static const struct coproc_reg cp15_regs[] = { { CRn(15), CRm( 0), Op1( 4), Op2( 0), is32, access_cbar}, }; +static const struct coproc_reg cp14_regs[] = { +}; + /* Target specific emulation tables */ static struct kvm_coproc_target_table *target_tables[KVM_ARM_NUM_TARGETS]; @@ -424,47 +427,75 @@ static const struct coproc_reg *find_reg(const struct coproc_params *params, return NULL; } -static int emulate_cp15(struct kvm_vcpu *vcpu, -const struct coproc_params *params) +/* + * emulate_cp -- tries to match a cp14/cp15 access in a handling table, + *and call the corresponding trap handler. + * + * @params: pointer to the descriptor of the access + * @table: array of trap descriptors + * @num: size of the trap descriptor array + * + * Return 0 if the access has been handled, and -1 if not. + */ +static int emulate_cp(struct kvm_vcpu *vcpu, +const struct coproc_params *params, +const struct coproc_reg *table, +size_t num) { -size_t num; -const struct coproc_reg *table, *r; - -trace_kvm_emulate_cp15_imp(params-Op1, params-Rt1, params-CRn, - params-CRm, params-Op2, params-is_write); +const struct coproc_reg *r; -table = get_target_table(vcpu-arch.target, num); +if (!table) +return -1; /* Not handled */ -/* Search target-specific then generic table. */ r = find_reg(params, table, num); -if (!r) -r = find_reg(params, cp15_regs, ARRAY_SIZE(cp15_regs)); -if (likely(r)) { +if (r) { /* If we don't have an accessor, we should never get here! */ BUG_ON(!r-access); if (likely(r-access(vcpu, params, r))) { /* Skip instruction, since it was emulated */ kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu)); -return 1; } -/* If access function fails, it should complain. */ -} else { -kvm_err(Unsupported guest CP15 access at: %08lx\n, -*vcpu_pc(vcpu)); -print_cp_instr(params); + +/* Handled */ +return 0; } + +/* Not handled */ +return -1; +} + +static void unhandled_cp_access(struct kvm_vcpu *vcpu, +const struct coproc_params *params) +{ +u8 hsr_ec = kvm_vcpu_trap_get_class(vcpu); +int cp; + +switch (hsr_ec) { +case HSR_EC_CP15_32: +case HSR_EC_CP15_64: +cp = 15; +break; +case HSR_EC_CP14_MR: +case HSR_EC_CP14_64: +cp = 14; +break; +default: +WARN_ON((cp = -1)); +} + +kvm_err(Unsupported guest CP%d access at: %08lx\n, +cp, *vcpu_pc(vcpu)); +print_cp_instr(params); kvm_inject_undefined(vcpu); -return 1; } -/** - * kvm_handle_cp15_64 -- handles a mrrc/mcrr trap on a guest CP15 access - * @vcpu: The VCPU pointer - * @run: The kvm_run struct - */ -int kvm_handle_cp15_64(struct kvm_vcpu *vcpu, struct kvm_run *run) +int kvm_handle_cp_64(struct kvm_vcpu *vcpu, +const struct coproc_reg *global, +size_t nr_global, +const struct coproc_reg *target_specific, +size_t nr_specific) { struct coproc_params params; @@ -478,7 +509,13 @@ int kvm_handle_cp15_64(struct kvm_vcpu *vcpu, struct kvm_run *run) params.Rt2 = (kvm_vcpu_get_hsr(vcpu) 10) 0xf; params.CRm = 0; -return emulate_cp15(vcpu, params); +if (!emulate_cp(vcpu, params, target_specific, nr_specific)) +return 1; +if (!emulate_cp(vcpu, params, global, nr_global)) +return 1; + +unhandled_cp_access(vcpu, params); +return 1; } static void reset_coproc_regs(struct kvm_vcpu *vcpu, @@ -491,12 +528,11 @@ static void reset_coproc_regs(struct kvm_vcpu *vcpu, table[i].reset(vcpu, table[i]); } -/** - * kvm_handle_cp15_32 -- handles a mrc/mcr trap on a guest CP15 access - * @vcpu: The VCPU pointer - * @run: The
Re: [PATCH v3 07/11] KVM: arm: add trap handlers for 64-bit debug registers
On June 30, 2015 9:20:29 PM GMT+08:00, Christoffer Dall christoffer.d...@linaro.org wrote: On Mon, Jun 22, 2015 at 06:41:30PM +0800, Zhichao Huang wrote: Add handlers for all the 64-bit debug registers. There is an overlap between 32 and 64bit registers. Make sure that 64-bit registers preceding 32-bit ones. Signed-off-by: Zhichao Huang zhichao.hu...@linaro.org --- arch/arm/kvm/coproc.c | 12 1 file changed, 12 insertions(+) diff --git a/arch/arm/kvm/coproc.c b/arch/arm/kvm/coproc.c index 59b65b7..648 100644 --- a/arch/arm/kvm/coproc.c +++ b/arch/arm/kvm/coproc.c @@ -435,9 +435,17 @@ static const struct coproc_reg cp15_regs[] = { { CRn( 1), CRm((n)), Op1( 0), Op2( 1), is32, trap_raz_wi } /* + * Architected CP14 registers. + * belongs in other patch? OK, I will move it to the patch [06/11]. * Trapped cp14 registers. We generally ignore most of the external * debug, on the principle that they don't really make sense to a * guest. Revisit this one day, whould this principle change. + * + * CRn denotes the primary register number, but is copied to the CRm in the + * user space API for 64-bit register access in line with the terminology used + * in the ARM ARM. + * Important: Must be sorted ascending by CRn, CRM, Op1, Op2 and with 64-bit + *registers preceding 32-bit ones. */ static const struct coproc_reg cp14_regs[] = { /* DBGIDR */ @@ -445,10 +453,14 @@ static const struct coproc_reg cp14_regs[] = { /* DBGDTRRXext */ { CRn( 0), CRm( 0), Op1( 0), Op2( 2), is32, trap_raz_wi }, DBG_BCR_BVR_WCR_WVR(0), +/* DBGDRAR (64bit) */ +{ CRn( 0), CRm( 1), Op1( 0), Op2( 0), is64, trap_raz_wi}, /* DBGDSCRint */ { CRn( 0), CRm( 1), Op1( 0), Op2( 0), is32, trap_dbgdscr, NULL, cp14_DBGDSCRext }, DBG_BCR_BVR_WCR_WVR(1), +/* DBGDSAR (64bit) */ +{ CRn( 0), CRm( 2), Op1( 0), Op2( 0), is64, trap_raz_wi}, /* DBGDSCRext */ { CRn( 0), CRm( 2), Op1( 0), Op2( 2), is32, trap_debug32, reset_val, cp14_DBGDSCRext, 0 }, -- 1.7.12.4 Otherwise: Reviewed-by: Christoffer Dall christoffer.d...@linaro.org -- Zhichao Huang -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 01/11] KVM: arm: plug guest debug exploit
On June 29, 2015 11:49:53 PM GMT+08:00, Christoffer Dall christoffer.d...@linaro.org wrote: On Mon, Jun 22, 2015 at 06:41:24PM +0800, Zhichao Huang wrote: Hardware debugging in guests is not intercepted currently, it means that a malicious guest can bring down the entire machine by writing to the debug registers. This patch enable trapping of all debug registers, preventing the guests to access the debug registers. This patch also disable the debug mode(DBGDSCR) in the guest world all the time, preventing the guests to mess with the host state. However, it is a precursor for later patches which will need to do more to world switch debug states while necessary. Cc: sta...@vger.kernel.org Signed-off-by: Zhichao Huang zhichao.hu...@linaro.org --- arch/arm/include/asm/kvm_coproc.h | 3 +- arch/arm/kvm/coproc.c | 60 +++ arch/arm/kvm/handle_exit.c| 4 +-- arch/arm/kvm/interrupts_head.S| 13 - 4 files changed, 70 insertions(+), 10 deletions(-) diff --git a/arch/arm/include/asm/kvm_coproc.h b/arch/arm/include/asm/kvm_coproc.h index 4917c2f..e74ab0f 100644 --- a/arch/arm/include/asm/kvm_coproc.h +++ b/arch/arm/include/asm/kvm_coproc.h @@ -31,7 +31,8 @@ void kvm_register_target_coproc_table(struct kvm_coproc_target_table *table); int kvm_handle_cp10_id(struct kvm_vcpu *vcpu, struct kvm_run *run); int kvm_handle_cp_0_13_access(struct kvm_vcpu *vcpu, struct kvm_run *run); int kvm_handle_cp14_load_store(struct kvm_vcpu *vcpu, struct kvm_run *run); -int kvm_handle_cp14_access(struct kvm_vcpu *vcpu, struct kvm_run *run); +int kvm_handle_cp14_32(struct kvm_vcpu *vcpu, struct kvm_run *run); +int kvm_handle_cp14_64(struct kvm_vcpu *vcpu, struct kvm_run *run); int kvm_handle_cp15_32(struct kvm_vcpu *vcpu, struct kvm_run *run); int kvm_handle_cp15_64(struct kvm_vcpu *vcpu, struct kvm_run *run); diff --git a/arch/arm/kvm/coproc.c b/arch/arm/kvm/coproc.c index f3d88dc..2e12760 100644 --- a/arch/arm/kvm/coproc.c +++ b/arch/arm/kvm/coproc.c @@ -91,12 +91,6 @@ int kvm_handle_cp14_load_store(struct kvm_vcpu *vcpu, struct kvm_run *run) return 1; } -int kvm_handle_cp14_access(struct kvm_vcpu *vcpu, struct kvm_run *run) -{ -kvm_inject_undefined(vcpu); -return 1; -} - static void reset_mpidr(struct kvm_vcpu *vcpu, const struct coproc_reg *r) { /* @@ -519,6 +513,60 @@ int kvm_handle_cp15_32(struct kvm_vcpu *vcpu, struct kvm_run *run) return emulate_cp15(vcpu, params); } +/** + * kvm_handle_cp14_64 -- handles a mrrc/mcrr trap on a guest CP14 access + * @vcpu: The VCPU pointer + * @run: The kvm_run struct + */ +int kvm_handle_cp14_64(struct kvm_vcpu *vcpu, struct kvm_run *run) +{ +struct coproc_params params; + +params.CRn = (kvm_vcpu_get_hsr(vcpu) 1) 0xf; +params.Rt1 = (kvm_vcpu_get_hsr(vcpu) 5) 0xf; +params.is_write = ((kvm_vcpu_get_hsr(vcpu) 1) == 0); +params.is_64bit = true; + +params.Op1 = (kvm_vcpu_get_hsr(vcpu) 16) 0xf; +params.Op2 = 0; +params.Rt2 = (kvm_vcpu_get_hsr(vcpu) 10) 0xf; +params.CRm = 0; this is a complete duplicate of kvm_handle_cp15_64, can you share this code somehow? This patch just want to plug the exploit in the simplest way, and I shared the cp14/cp15 handlers in later patches [PATCH v3 04/11]. Should I take the patch [04/11] ahead of current patch [01/11] ? + +/* raz_wi */ +(void)pm_fake(vcpu, params, NULL); + +/* handled */ +kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu)); +return 1; +} + +/** + * kvm_handle_cp14_32 -- handles a mrc/mcr trap on a guest CP14 access + * @vcpu: The VCPU pointer + * @run: The kvm_run struct + */ +int kvm_handle_cp14_32(struct kvm_vcpu *vcpu, struct kvm_run *run) +{ +struct coproc_params params; + +params.CRm = (kvm_vcpu_get_hsr(vcpu) 1) 0xf; +params.Rt1 = (kvm_vcpu_get_hsr(vcpu) 5) 0xf; +params.is_write = ((kvm_vcpu_get_hsr(vcpu) 1) == 0); +params.is_64bit = false; + +params.CRn = (kvm_vcpu_get_hsr(vcpu) 10) 0xf; +params.Op1 = (kvm_vcpu_get_hsr(vcpu) 14) 0x7; +params.Op2 = (kvm_vcpu_get_hsr(vcpu) 17) 0x7; +params.Rt2 = 0; this is a complete duplicate of kvm_handle_cp15_32, can you share this code somehow? + +/* raz_wi */ +(void)pm_fake(vcpu, params, NULL); + +/* handled */ +kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu)); +return 1; +} + /** * Userspace API */ diff --git a/arch/arm/kvm/handle_exit.c b/arch/arm/kvm/handle_exit.c index 95f12b2..357ad1b 100644 --- a/arch/arm/kvm/handle_exit.c +++ b/arch/arm/kvm/handle_exit.c @@ -104,9 +104,9 @@ static exit_handle_fn arm_exit_handlers[] = { [HSR_EC_WFI]= kvm_handle_wfx,