Re: [PATCH 14/15] KVM: MTRR: do not map huge page for non-consistent range
[ CCed Zhang Yang ]

On 06/04/2015 04:36 PM, Paolo Bonzini wrote:
>
> On 04/06/2015 10:23, Xiao Guangrong wrote:
>>> So, why do you need to always use IPAT=0?  Can patch 15 keep the
>>> current logic for RAM, like this:
>>>
>>> 	if (is_mmio || kvm_arch_has_noncoherent_dma(vcpu->kvm))
>>> 		ret = kvm_mtrr_get_guest_memory_type(vcpu, gfn) <<
>>> 		      VMX_EPT_MT_EPTE_SHIFT;
>>> 	else
>>> 		ret = (MTRR_TYPE_WRBACK << VMX_EPT_MT_EPTE_SHIFT)
>>> 			| VMX_EPT_IPAT_BIT;
>>
>> Yeah, it's okay, actually we considered this way, however
>> - it's light enough, it did not hurt guest performance based on our
>>   benchmark.
>> - the logic has always been used for the noncoherent_dma case, extending
>>   it to the normal case should have low risk and also help us to check
>>   the logic.
>
> But noncoherent_dma is not the common case, so it's not necessarily true
> that the risk is low.

I thought noncoherent_dma exists on 1st generation(s) IOMMU, it should
be fully tested at that time.

>
>> - completely following the MTRR spec would be better than the host
>>   hiding it.
>
> We are a virtualization platform, we know well when MTRRs are necessary.
>
> There is a risk from blindly obeying the guest MTRRs: userspace can see
> stale data if the guest's accesses bypass the cache.  AMD bypasses this
> by enabling snooping even in cases that ordinarily wouldn't snoop; for
> Intel the solution is that RAM-backed areas should always use IPAT.

Not sure if UC and other cacheable type combinations on guest and host
will cause problems. The SDM mentions that snooping is not required only when
"The UC attribute comes from the MTRRs and the processors are not required
 to snoop their caches since the data could never have been cached."
(Vol 3. 11.5.2.2)
VMX does not touch the hardware MTRR MSRs and I guess snooping works in this
case.

I also noticed that if SS (self-snooping) is supported we need not invalidate
the cache when programming the memory type (Vol 3. 11.11.8), so that means the
CPU works well on a page which has different cache types, I guess.

After thinking about it carefully, we (Zhang Yang) doubt whether always setting
WB for DMA memory is really a good idea, because we can not assume WB DMA works
well for all devices. One example is that audio DMA (not an MMIO region)
requires WC to improve its performance.

However, we think the SDM is not clear enough, so let's do full vMTRR on MMIO
and noncoherent_dma first. :)
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
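For readers following the thread, the selection logic Paolo proposes can be sketched as a stand-alone function. This is only an illustration, not KVM code: ept_memtype_bits() is a made-up name, guest_mtrr_type stands in for the result of kvm_mtrr_get_guest_memory_type(), and the constant values are quoted from memory of the kernel headers, so treat them as assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values, as remembered from the kernel headers. */
#define MTRR_TYPE_UNCACHABLE	0
#define MTRR_TYPE_WRCOMB	1
#define MTRR_TYPE_WRBACK	6
#define VMX_EPT_MT_EPTE_SHIFT	3
#define VMX_EPT_IPAT_BIT	(1ull << 6)

/*
 * Hypothetical helper: honour the guest MTRR type only for MMIO or when
 * non-coherent DMA is in use; otherwise force WB and set IPAT so that
 * the guest PAT is ignored for RAM-backed pages.
 */
static uint64_t ept_memtype_bits(bool is_mmio, bool noncoherent_dma,
				 uint8_t guest_mtrr_type)
{
	assert(guest_mtrr_type <= 7);	/* the memory type field is 3 bits */

	if (is_mmio || noncoherent_dma)
		return (uint64_t)guest_mtrr_type << VMX_EPT_MT_EPTE_SHIFT;

	return ((uint64_t)MTRR_TYPE_WRBACK << VMX_EPT_MT_EPTE_SHIFT) |
	       VMX_EPT_IPAT_BIT;
}
```

With these values, ordinary RAM always ends up as WB plus IPAT whatever the guest programmed, while an MMIO page with a guest WC MTRR keeps WC and leaves IPAT clear.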
[PATCH v2 4/8] AArch{32,64}: dynamically configure the number of GIC interrupts
From: Marc Zyngier <marc.zyng...@arm.com>

In order to reduce the memory usage of large guests (as well
as improve performance), tell KVM about the number of interrupts
we require.

To avoid synchronization with the various device creation,
use a late_init callback to compute the GIC configuration.
[Andre: rename to gic__init_gic() to ease future expansion]

Signed-off-by: Marc Zyngier <marc.zyng...@arm.com>
Signed-off-by: Andre Przywara <andre.przyw...@arm.com>
---
 arm/gic.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/arm/gic.c b/arm/gic.c
index ce5f7fa..6277af8 100644
--- a/arm/gic.c
+++ b/arm/gic.c
@@ -1,10 +1,12 @@
 #include "kvm/fdt.h"
+#include "kvm/irq.h"
 #include "kvm/kvm.h"
 #include "kvm/virtio.h"
 
 #include "arm-common/gic.h"
 
 #include <linux/byteorder.h>
+#include <linux/kernel.h>
 #include <linux/kvm.h>
 
 static int gic_fd = -1;
@@ -87,6 +89,29 @@ int gic__create(struct kvm *kvm)
 	return err;
 }
 
+static int gic__init_gic(struct kvm *kvm)
+{
+	int lines = irq__get_nr_allocated_lines();
+	u32 nr_irqs = ALIGN(lines, 32) + GIC_SPI_IRQ_BASE;
+	struct kvm_device_attr nr_irqs_attr = {
+		.group	= KVM_DEV_ARM_VGIC_GRP_NR_IRQS,
+		.addr	= (u64)(unsigned long)&nr_irqs,
+	};
+
+	/*
+	 * If we didn't use the KVM_CREATE_DEVICE method, KVM will
+	 * give us some default number of interrupts.
+	 */
+	if (gic_fd < 0)
+		return 0;
+
+	if (!ioctl(gic_fd, KVM_HAS_DEVICE_ATTR, &nr_irqs_attr))
+		return ioctl(gic_fd, KVM_SET_DEVICE_ATTR, &nr_irqs_attr);
+
+	return 0;
+}
+late_init(gic__init_gic)
+
 void gic__generate_fdt_nodes(void *fdt, u32 phandle)
 {
 	u64 reg_prop[] = {
--
2.3.5
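As a worked example of the nr_irqs computation in this patch, the sketch below rounds the number of allocated lines up to a multiple of 32 and adds the SPI base. It is a stand-alone illustration: gic_nr_irqs() and ALIGN_UP() are names made up here, and GIC_SPI_IRQ_BASE is assumed to be 32 (the SGI and PPI IDs that precede the SPIs).

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 16 SGIs plus 16 PPIs come before the first SPI. */
#define GIC_SPI_IRQ_BASE	32u

/* Round x up to a multiple of a; a must be a power of two. */
#define ALIGN_UP(x, a)		(((x) + ((a) - 1)) & ~((a) - 1))

/* Hypothetical stand-in for the nr_irqs computation in gic__init_gic(). */
static uint32_t gic_nr_irqs(uint32_t allocated_lines)
{
	/* Guard against wrap-around in the rounding and the addition. */
	assert(allocated_lines <= UINT32_MAX - 63u);

	return ALIGN_UP(allocated_lines, 32u) + GIC_SPI_IRQ_BASE;
}
```

Under these assumptions a guest with a handful of devices (say 5 lines) asks for 64 interrupts, and one with 33 lines asks for 96, instead of whatever default the kernel would otherwise allocate.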
[PATCH v2 5/8] arm: finish VGIC initialisation explicitly
Since Linux 3.19-rc1 there is a new API to explicitly initialise
the in-kernel GIC emulation by a userland KVM device call.
Use that to tell the kernel we are finished with the GIC
initialisation, since the automatic GIC init will only be provided
as a legacy functionality in the future.

Signed-off-by: Andre Przywara <andre.przyw...@arm.com>
---
 arm/gic.c | 25 ++++++++++++++++++++++---
 1 file changed, 22 insertions(+), 3 deletions(-)

diff --git a/arm/gic.c b/arm/gic.c
index 6277af8..8d47562 100644
--- a/arm/gic.c
+++ b/arm/gic.c
@@ -89,24 +89,43 @@ int gic__create(struct kvm *kvm)
 	return err;
 }
 
+/*
+ * Sets the number of used interrupts and finalizes the GIC init explicitly.
+ */
 static int gic__init_gic(struct kvm *kvm)
 {
+	int ret;
+
 	int lines = irq__get_nr_allocated_lines();
 	u32 nr_irqs = ALIGN(lines, 32) + GIC_SPI_IRQ_BASE;
 	struct kvm_device_attr nr_irqs_attr = {
 		.group	= KVM_DEV_ARM_VGIC_GRP_NR_IRQS,
 		.addr	= (u64)(unsigned long)&nr_irqs,
 	};
+	struct kvm_device_attr vgic_init_attr = {
+		.group	= KVM_DEV_ARM_VGIC_GRP_CTRL,
+		.attr	= KVM_DEV_ARM_VGIC_CTRL_INIT,
+	};
 
 	/*
 	 * If we didn't use the KVM_CREATE_DEVICE method, KVM will
-	 * give us some default number of interrupts.
+	 * give us some default number of interrupts. The GIC initialization
+	 * will be done automatically in this case.
 	 */
 	if (gic_fd < 0)
 		return 0;
 
-	if (!ioctl(gic_fd, KVM_HAS_DEVICE_ATTR, &nr_irqs_attr))
-		return ioctl(gic_fd, KVM_SET_DEVICE_ATTR, &nr_irqs_attr);
+	if (!ioctl(gic_fd, KVM_HAS_DEVICE_ATTR, &nr_irqs_attr)) {
+		ret = ioctl(gic_fd, KVM_SET_DEVICE_ATTR, &nr_irqs_attr);
+		if (ret)
+			return ret;
+	}
+
+	if (!ioctl(gic_fd, KVM_HAS_DEVICE_ATTR, &vgic_init_attr)) {
+		ret = ioctl(gic_fd, KVM_SET_DEVICE_ATTR, &vgic_init_attr);
+		if (ret)
+			return ret;
+	}
 
 	return 0;
 }
--
2.3.5
[PATCH v2 3/8] irq: add irq__get_nr_allocated_lines
From: Marc Zyngier <marc.zyng...@arm.com>

The ARM GIC emulation needs to be told the number of interrupts
it has to support. As commit 1c262fa1dc7bc ("kvm tools: irq: make
irq__alloc_line generic") made the interrupt counter private, add
a new accessor returning the number of interrupt lines we've
allocated so far.

Signed-off-by: Marc Zyngier <marc.zyng...@arm.com>
Signed-off-by: Andre Przywara <andre.przyw...@arm.com>
---
 include/kvm/irq.h | 1 +
 irq.c             | 5 +++++
 2 files changed, 6 insertions(+)

diff --git a/include/kvm/irq.h b/include/kvm/irq.h
index 4cec6f0..8a78e43 100644
--- a/include/kvm/irq.h
+++ b/include/kvm/irq.h
@@ -11,6 +11,7 @@
 struct kvm;
 
 int irq__alloc_line(void);
+int irq__get_nr_allocated_lines(void);
 
 int irq__init(struct kvm *kvm);
 int irq__exit(struct kvm *kvm);
diff --git a/irq.c b/irq.c
index 33ea8d2..71eaa05 100644
--- a/irq.c
+++ b/irq.c
@@ -7,3 +7,8 @@ int irq__alloc_line(void)
 {
 	return next_line++;
 }
+
+int irq__get_nr_allocated_lines(void)
+{
+	return next_line - KVM_IRQ_OFFSET;
+}
--
2.3.5
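To illustrate how the accessor relates to the allocator, here is a minimal stand-alone model of the two functions. It assumes the counter starts at KVM_IRQ_OFFSET, and the value 32 used for that offset is purely hypothetical (the real one is architecture specific).

```c
#include <assert.h>

/* Hypothetical value; the real KVM_IRQ_OFFSET depends on the architecture. */
#define KVM_IRQ_OFFSET	32

/* Assumption: the private counter starts at the first usable line. */
static int next_line = KVM_IRQ_OFFSET;

/* Hand out the next free interrupt line. */
static int irq__alloc_line(void)
{
	return next_line++;
}

/* Number of lines handed out so far, independent of the offset. */
static int irq__get_nr_allocated_lines(void)
{
	assert(next_line >= KVM_IRQ_OFFSET);

	return next_line - KVM_IRQ_OFFSET;
}
```

The point of subtracting the offset is that callers such as the GIC code get a count of lines, not the number of the last line, so they can add their own base afterwards.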
[PATCH v2 2/8] AArch{32,64}: use KVM_CREATE_DEVICE & co to instantiate the GIC
From: Marc Zyngier <marc.zyng...@arm.com>

As of 3.14, KVM/arm supports the creation/configuration of the GIC through
a more generic device API, which is now the preferred way to do so.

Plumb the new API in, and allow the old code to be used as a fallback.

[Andre: Rename some functions on the way to differentiate between
creation and initialisation more clearly.]

Signed-off-by: Marc Zyngier <marc.zyng...@arm.com>
Signed-off-by: Andre Przywara <andre.przyw...@arm.com>
---
 arm/gic.c                    | 60 ++--
 arm/include/arm-common/gic.h |  2 +-
 arm/kvm.c                    |  6 +++---
 3 files changed, 57 insertions(+), 11 deletions(-)

diff --git a/arm/gic.c b/arm/gic.c
index 5d8cbe6..ce5f7fa 100644
--- a/arm/gic.c
+++ b/arm/gic.c
@@ -7,7 +7,41 @@
 #include <linux/byteorder.h>
 #include <linux/kvm.h>
 
-int gic__init_irqchip(struct kvm *kvm)
+static int gic_fd = -1;
+
+static int gic__create_device(struct kvm *kvm)
+{
+	int err;
+	u64 cpu_if_addr = ARM_GIC_CPUI_BASE;
+	u64 dist_addr = ARM_GIC_DIST_BASE;
+	struct kvm_create_device gic_device = {
+		.type	= KVM_DEV_TYPE_ARM_VGIC_V2,
+	};
+	struct kvm_device_attr cpu_if_attr = {
+		.group	= KVM_DEV_ARM_VGIC_GRP_ADDR,
+		.attr	= KVM_VGIC_V2_ADDR_TYPE_CPU,
+		.addr	= (u64)(unsigned long)&cpu_if_addr,
+	};
+	struct kvm_device_attr dist_attr = {
+		.group	= KVM_DEV_ARM_VGIC_GRP_ADDR,
+		.attr	= KVM_VGIC_V2_ADDR_TYPE_DIST,
+		.addr	= (u64)(unsigned long)&dist_addr,
+	};
+
+	err = ioctl(kvm->vm_fd, KVM_CREATE_DEVICE, &gic_device);
+	if (err)
+		return err;
+
+	gic_fd = gic_device.fd;
+
+	err = ioctl(gic_fd, KVM_SET_DEVICE_ATTR, &cpu_if_attr);
+	if (err)
+		return err;
+
+	return ioctl(gic_fd, KVM_SET_DEVICE_ATTR, &dist_attr);
+}
+
+static int gic__create_irqchip(struct kvm *kvm)
 {
 	int err;
 	struct kvm_arm_device_addr gic_addr[] = {
@@ -23,12 +57,6 @@ int gic__init_irqchip(struct kvm *kvm)
 		}
 	};
 
-	if (kvm->nrcpus > GIC_MAX_CPUS) {
-		pr_warning("%d CPUS greater than maximum of %d -- truncating\n",
-				kvm->nrcpus, GIC_MAX_CPUS);
-		kvm->nrcpus = GIC_MAX_CPUS;
-	}
-
 	err = ioctl(kvm->vm_fd, KVM_CREATE_IRQCHIP);
 	if (err)
 		return err;
@@ -41,6 +69,24 @@ int gic__init_irqchip(struct kvm *kvm)
 	return err;
 }
 
+int gic__create(struct kvm *kvm)
+{
+	int err;
+
+	if (kvm->nrcpus > GIC_MAX_CPUS) {
+		pr_warning("%d CPUS greater than maximum of %d -- truncating\n",
+				kvm->nrcpus, GIC_MAX_CPUS);
+		kvm->nrcpus = GIC_MAX_CPUS;
+	}
+
+	/* Try the new way first, and fallback on legacy method otherwise */
+	err = gic__create_device(kvm);
+	if (err)
+		err = gic__create_irqchip(kvm);
+
+	return err;
+}
+
 void gic__generate_fdt_nodes(void *fdt, u32 phandle)
 {
 	u64 reg_prop[] = {
diff --git a/arm/include/arm-common/gic.h b/arm/include/arm-common/gic.h
index 5a36f2c..44859f7 100644
--- a/arm/include/arm-common/gic.h
+++ b/arm/include/arm-common/gic.h
@@ -24,7 +24,7 @@
 struct kvm;
 
 int gic__alloc_irqnum(void);
-int gic__init_irqchip(struct kvm *kvm);
+int gic__create(struct kvm *kvm);
 void gic__generate_fdt_nodes(void *fdt, u32 phandle);
 
 #endif /* ARM_COMMON__GIC_H */
diff --git a/arm/kvm.c b/arm/kvm.c
index 58ad9fa..bcd2533 100644
--- a/arm/kvm.c
+++ b/arm/kvm.c
@@ -81,7 +81,7 @@ void kvm__arch_init(struct kvm *kvm, const char *hugetlbfs_path, u64 ram_size)
 	madvise(kvm->arch.ram_alloc_start, kvm->arch.ram_alloc_size,
 		MADV_MERGEABLE | MADV_HUGEPAGE);
 
-	/* Initialise the virtual GIC. */
-	if (gic__init_irqchip(kvm))
-		die("Failed to initialise virtual GIC");
+	/* Create the virtual GIC. */
+	if (gic__create(kvm))
+		die("Failed to create virtual GIC");
 }
--
2.3.5
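The "try the new way first, fall back on the legacy method" structure of gic__create() can be exercised without a KVM host. The sketch below is only a model of that control flow: create_with_fallback() and the two stubs are hypothetical names that stand in for gic__create_device() and gic__create_irqchip().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

typedef int (*create_fn)(void *ctx);

/* Try the preferred method; only if it fails, try the legacy one. */
static int create_with_fallback(create_fn preferred, create_fn legacy,
				void *ctx)
{
	int err;

	assert(preferred != NULL && legacy != NULL);

	err = preferred(ctx);
	if (err)
		err = legacy(ctx);

	return err;
}

/* Stub for a kernel that does not know the requested interface. */
static int stub_unsupported(void *ctx)
{
	(void)ctx;
	return -ENODEV;
}

/* Stub for a working interface; counts how often it was used. */
static int stub_ok(void *ctx)
{
	if (ctx)
		++*(int *)ctx;
	return 0;
}
```

Note that the error from the preferred method is dropped once the fallback runs, so a caller only ever sees the legacy method's result in that case.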
[PATCH v2 7/8] arm: add support for supplying GICv3 redistributor addresses
The code currently is assuming fixed sized memory regions for the
distributor and CPU interface. GICv3 needs a dynamic allocation of
its redistributor region, since its size depends on the number of
vCPUs.
Also add the necessary code to create a GICv3 IRQ chip instance.
This contains some defines which are not (yet) in the (32 bit) header
files to allow compilation for ARM.

Signed-off-by: Andre Przywara <andre.przyw...@arm.com>
---
 arm/gic.c                         | 37 +++++++++++++++++++++++++++++++++++--
 arm/include/arm-common/gic.h      |  2 +-
 arm/include/arm-common/kvm-arch.h | 18 ++++++++++++++----
 arm/kvm-cpu.c                     |  4 +++-
 4 files changed, 53 insertions(+), 8 deletions(-)

diff --git a/arm/gic.c b/arm/gic.c
index 0ce40e4..c50d662 100644
--- a/arm/gic.c
+++ b/arm/gic.c
@@ -9,13 +9,24 @@
 #include <linux/kernel.h>
 #include <linux/kvm.h>
 
+/* Those names are not defined for ARM (yet) */
+#ifndef KVM_VGIC_V3_ADDR_TYPE_DIST
+#define KVM_VGIC_V3_ADDR_TYPE_DIST 2
+#endif
+
+#ifndef KVM_VGIC_V3_ADDR_TYPE_REDIST
+#define KVM_VGIC_V3_ADDR_TYPE_REDIST 3
+#endif
+
 static int gic_fd = -1;
+static int nr_redists;
 
 static int gic__create_device(struct kvm *kvm, enum irqchip_type type)
 {
 	int err;
 	u64 cpu_if_addr = ARM_GIC_CPUI_BASE;
 	u64 dist_addr = ARM_GIC_DIST_BASE;
+	u64 redist_addr = dist_addr - nr_redists * ARM_GIC_REDIST_SIZE;
 	struct kvm_create_device gic_device = {
 		.flags	= 0,
 	};
@@ -28,11 +39,19 @@ static int gic__create_device(struct kvm *kvm, enum irqchip_type type)
 		.group	= KVM_DEV_ARM_VGIC_GRP_ADDR,
 		.addr	= (u64)(unsigned long)&dist_addr,
 	};
+	struct kvm_device_attr redist_attr = {
+		.group	= KVM_DEV_ARM_VGIC_GRP_ADDR,
+		.attr	= KVM_VGIC_V3_ADDR_TYPE_REDIST,
+		.addr	= (u64)(unsigned long)&redist_addr,
+	};
 
 	switch (type) {
 	case IRQCHIP_GICV2:
 		gic_device.type = KVM_DEV_TYPE_ARM_VGIC_V2;
 		break;
+	case IRQCHIP_GICV3:
+		gic_device.type = KVM_DEV_TYPE_ARM_VGIC_V3;
+		break;
 	default:
 		return -ENODEV;
 	}
@@ -48,6 +67,10 @@ static int gic__create_device(struct kvm *kvm, enum irqchip_type type)
 		dist_attr.attr = KVM_VGIC_V2_ADDR_TYPE_DIST;
 		err = ioctl(gic_fd, KVM_SET_DEVICE_ATTR, &cpu_if_attr);
 		break;
+	case IRQCHIP_GICV3:
+		dist_attr.attr = KVM_VGIC_V3_ADDR_TYPE_DIST;
+		err = ioctl(gic_fd, KVM_SET_DEVICE_ATTR, &redist_attr);
+		break;
 	default:
 		return -ENODEV;
 	}
@@ -55,6 +78,8 @@ static int gic__create_device(struct kvm *kvm, enum irqchip_type type)
 		return err;
 
 	err = ioctl(gic_fd, KVM_SET_DEVICE_ATTR, &dist_attr);
+	if (err)
+		return err;
 
 	return err;
 }
@@ -162,17 +187,25 @@ void gic__generate_fdt_nodes(void *fdt, u32 phandle, enum irqchip_type type)
 	u64 reg_prop[] = {
 		cpu_to_fdt64(ARM_GIC_DIST_BASE),
 		cpu_to_fdt64(ARM_GIC_DIST_SIZE),
-		cpu_to_fdt64(ARM_GIC_CPUI_BASE),
-		cpu_to_fdt64(ARM_GIC_CPUI_SIZE),
+		0, 0,				/* to be filled */
 	};
 
 	switch (type) {
 	case IRQCHIP_GICV2:
 		compatible = "arm,cortex-a15-gic";
+		reg_prop[2] = ARM_GIC_CPUI_BASE;
+		reg_prop[3] = ARM_GIC_CPUI_SIZE;
+		break;
+	case IRQCHIP_GICV3:
+		compatible = "arm,gic-v3";
+		reg_prop[2] = ARM_GIC_DIST_BASE - nr_redists * ARM_GIC_REDIST_SIZE;
+		reg_prop[3] = ARM_GIC_REDIST_SIZE * nr_redists;
 		break;
 	default:
 		return;
 	}
+	reg_prop[2] = cpu_to_fdt64(reg_prop[2]);
+	reg_prop[3] = cpu_to_fdt64(reg_prop[3]);
 
 	_FDT(fdt_begin_node(fdt, "intc"));
 	_FDT(fdt_property_string(fdt, "compatible", compatible));
diff --git a/arm/include/arm-common/gic.h b/arm/include/arm-common/gic.h
index f5f6707..8d6ab01 100644
--- a/arm/include/arm-common/gic.h
+++ b/arm/include/arm-common/gic.h
@@ -21,7 +21,7 @@
 #define GIC_MAX_CPUS			8
 #define GIC_MAX_IRQ			255
 
-enum irqchip_type {IRQCHIP_DEFAULT, IRQCHIP_GICV2};
+enum irqchip_type {IRQCHIP_DEFAULT, IRQCHIP_GICV2, IRQCHIP_GICV3};
 
 struct kvm;
 
diff --git a/arm/include/arm-common/kvm-arch.h b/arm/include/arm-common/kvm-arch.h
index 082131d..be66a76 100644
--- a/arm/include/arm-common/kvm-arch.h
+++ b/arm/include/arm-common/kvm-arch.h
@@ -17,10 +17,8 @@
 #define ARM_GIC_DIST_BASE	(ARM_AXI_AREA - ARM_GIC_DIST_SIZE)
 #define ARM_GIC_CPUI_BASE	(ARM_GIC_DIST_BASE - ARM_GIC_CPUI_SIZE)
-#define ARM_GIC_SIZE		(ARM_GIC_DIST_SIZE + ARM_GIC_CPUI_SIZE)
 #define ARM_IOPORT_SIZE	(ARM_MMIO_AREA -
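The redistributor placement used in this patch (the region grows downwards from the distributor base, one frame per vCPU) boils down to a small piece of arithmetic. The sketch below uses made-up layout values, EX_GIC_DIST_BASE and EX_GIC_REDIST_SIZE, which are not taken from the patch; only the formula mirrors redist_addr and the GICv3 reg_prop entries.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout values, chosen only for this example. */
#define EX_GIC_DIST_BASE	0x3fff0000ull
#define EX_GIC_REDIST_SIZE	0x20000ull

/* Total size of the redistributor region for nr_redists vCPUs. */
static uint64_t gic_redist_region_size(unsigned int nr_redists)
{
	return (uint64_t)nr_redists * EX_GIC_REDIST_SIZE;
}

/* The region ends where the distributor begins and grows downwards. */
static uint64_t gic_redist_base(unsigned int nr_redists)
{
	uint64_t size = gic_redist_region_size(nr_redists);

	assert(size <= EX_GIC_DIST_BASE);	/* must not underflow */

	return EX_GIC_DIST_BASE - size;
}
```

With these example values, 8 vCPUs need a 1 MiB region starting at 0x3fef0000, which is why any other region placed below the distributor has to take nr_redists into account.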
[PATCH v2 0/8] kvmtool: arm64: GICv3 guest support
Hi,

a rework of the GICv3 support series for kvmtool.
I addressed Will's comments on the broken fallback in VGIC creation,
also changed the command line parameter to --irqchip=[gicv2,gicv3].
The default is still GICv2 emulation for the sake of reproducibility,
not sure we want to have an automatic switch-over in case GICv2
emulation is not supported by the hardware.

This is also the base for ITS support, which I will send later as
a follow-up series.

Cheers,
Andre.

-----

Since Linux 3.19 the kernel can emulate a GICv3 for KVM guests.
This allows more than 8 VCPUs in a guest and enables in-kernel irqchip
for non-backwards-compatible GICv3 implementations.

This series updates kvmtool to support this feature.
The first half of the series is mostly from Marc and supports some
newer features of the virtual GIC which we later depend on.
The second part enables support for a guest GICv3 by adding a new
command line parameter (--irqchip=).

We now use the KVM_CREATE_DEVICE interface to create a virtual GIC
and only fall back to the now legacy KVM_CREATE_IRQCHIP call if the
former is not supported by the kernel.
Also we use two new features the KVM_CREATE_DEVICE interface
introduces:
* We now set the number of actually used interrupts to avoid
  allocating too many of them without ever using them.
* We tell the kernel explicitly that we are finished with the GIC
  initialisation. This is a requirement for future VGIC versions.

The final three patches introduce virtual GICv3 support, so on
supported hardware (and given kernel support) the user can ask KVM to
emulate a GICv3, lifting the 8 VCPU limit of KVM. This is done by
specifying --irqchip=gicv3 on the command line.
As the kernel currently only supports this on ARM64, this parameter
is valid for the arm64 kvmtool build. But as the GIC is shared in
kvmtool, I had to add the macro definitions to not break the build
on ARM.

This series goes on top of the new official stand-alone repo hosted
on Will's kernel.org git [1].
Find a branch with those patches included at my repo [2].

[1] git://git.kernel.org/pub/scm/linux/kernel/git/will/kvmtool.git
[2] git://linux-arm.org/kvmtool.git (branch gicv3/v2)
    http://www.linux-arm.org/git?p=kvmtool.git;a=log;h=refs/heads/gicv3/v2

Andre Przywara (4):
  arm: finish VGIC initialisation explicitly
  arm: prepare for instantiating different IRQ chip devices
  arm: add support for supplying GICv3 redistributor addresses
  arm: use new irqchip parameter to create different vGIC types

Marc Zyngier (4):
  AArch64: Reserve two 64k pages for GIC CPU interface
  AArch{32,64}: use KVM_CREATE_DEVICE & co to instantiate the GIC
  irq: add irq__get_nr_allocated_lines
  AArch{32,64}: dynamically configure the number of GIC interrupts

 arm/aarch32/arm-cpu.c                    |   2 +-
 arm/aarch64/arm-cpu.c                    |   2 +-
 arm/aarch64/include/kvm/kvm-arch.h       |   2 +-
 arm/gic.c                                | 202 +--
 arm/include/arm-common/gic.h             |   6 +-
 arm/include/arm-common/kvm-arch.h        |  18 ++-
 arm/include/arm-common/kvm-config-arch.h |   9 +-
 arm/kvm-cpu.c                            |  10 +-
 arm/kvm.c                                |   8 +-
 include/kvm/irq.h                        |   1 +
 irq.c                                    |   5 +
 11 files changed, 240 insertions(+), 25 deletions(-)

--
2.3.5
[PATCH v2 8/8] arm: use new irqchip parameter to create different vGIC types
Currently we unconditionally create a virtual GICv2 in the guest.
Add a --irqchip= parameter to let the user specify a different GIC
type for the guest.
For now the only other supported type is GICv3.

Signed-off-by: Andre Przywara <andre.przyw...@arm.com>
---
 arm/aarch64/arm-cpu.c                    |  2 +-
 arm/gic.c                                | 21 +++++++++++++++++++++
 arm/include/arm-common/kvm-config-arch.h |  9 ++++++++-
 arm/kvm-cpu.c                            |  6 ++++++
 arm/kvm.c                                |  4 +++-
 5 files changed, 39 insertions(+), 3 deletions(-)

diff --git a/arm/aarch64/arm-cpu.c b/arm/aarch64/arm-cpu.c
index f702b9e..3dc8ea3 100644
--- a/arm/aarch64/arm-cpu.c
+++ b/arm/aarch64/arm-cpu.c
@@ -12,7 +12,7 @@ static void generate_fdt_nodes(void *fdt, struct kvm *kvm, u32 gic_phandle)
 {
 	int timer_interrupts[4] = {13, 14, 11, 10};
 
-	gic__generate_fdt_nodes(fdt, gic_phandle, IRQCHIP_GICV2);
+	gic__generate_fdt_nodes(fdt, gic_phandle, kvm->cfg.arch.irqchip);
 	timer__generate_fdt_nodes(fdt, kvm, timer_interrupts);
 }
 
diff --git a/arm/gic.c b/arm/gic.c
index c50d662..ab0f594 100644
--- a/arm/gic.c
+++ b/arm/gic.c
@@ -21,6 +21,23 @@
 static int gic_fd = -1;
 static int nr_redists;
 
+int irqchip_parser(const struct option *opt, const char *arg, int unset)
+{
+	enum irqchip_type *type = opt->value;
+
+	*type = IRQCHIP_DEFAULT;
+	if (!strcmp(arg, "gicv2")) {
+		*type = IRQCHIP_GICV2;
+	} else if (!strcmp(arg, "gicv3")) {
+		*type = IRQCHIP_GICV3;
+	} else if (strcmp(arg, "default")) {
+		fprintf(stderr, "irqchip: unknown type \"%s\"\n", arg);
+		return -1;
+	}
+
+	return 0;
+}
+
 static int gic__create_device(struct kvm *kvm, enum irqchip_type type)
 {
 	int err;
@@ -121,6 +138,10 @@ int gic__create(struct kvm *kvm, enum irqchip_type type)
 	case IRQCHIP_GICV2:
 		max_cpus = GIC_MAX_CPUS;
 		break;
+	case IRQCHIP_GICV3:
+		nr_redists = kvm->cfg.nrcpus;
+		max_cpus = 255;
+		break;
 	default:
 		return -ENODEV;
 	}
diff --git a/arm/include/arm-common/kvm-config-arch.h b/arm/include/arm-common/kvm-config-arch.h
index a8ebd94..ae4e89b 100644
--- a/arm/include/arm-common/kvm-config-arch.h
+++ b/arm/include/arm-common/kvm-config-arch.h
@@ -8,8 +8,11 @@ struct kvm_config_arch {
 	unsigned int	force_cntfrq;
 	bool		virtio_trans_pci;
 	bool		aarch32_guest;
+	int		irqchip;
 };
 
+int irqchip_parser(const struct option *opt, const char *arg, int unset);
+
 #define OPT_ARCH_RUN(pfx, cfg)							\
 	pfx,									\
 	ARM_OPT_ARCH_RUN(cfg)							\
@@ -21,6 +24,10 @@ struct kvm_config_arch {
 		     "updated to program CNTFRQ correctly*"),			\
 	OPT_BOOLEAN('\0', "force-pci", &(cfg)->virtio_trans_pci,		\
 		    "Force virtio devices to use PCI as their default "		\
-		    "transport"),
+		    "transport"),						\
+	OPT_CALLBACK('\0', "irqchip", &(cfg)->irqchip,				\
+		     "[gicv2|gicv3]",						\
+		     "type of interrupt controller to emulate in the guest",	\
+		     irqchip_parser, NULL),
 
 #endif /* ARM_COMMON__KVM_CONFIG_ARCH_H */
diff --git a/arm/kvm-cpu.c b/arm/kvm-cpu.c
index a3344fa..aacc172 100644
--- a/arm/kvm-cpu.c
+++ b/arm/kvm-cpu.c
@@ -144,6 +144,12 @@ bool kvm_cpu__emulate_mmio(struct kvm_cpu *vcpu, u64 phys_addr, u8 *data,
 {
 	int nr_redists = 0;
 
+	switch (vcpu->kvm->cfg.arch.irqchip) {
+	case IRQCHIP_GICV3:
+		nr_redists = vcpu->kvm->nrcpus;
+		break;
+	}
+
 	if (arm_addr_in_virtio_mmio_region(nr_redists, phys_addr)) {
 		return kvm__emulate_mmio(vcpu, phys_addr, data, len, is_write);
 	} else if (arm_addr_in_ioport_region(phys_addr)) {
diff --git a/arm/kvm.c b/arm/kvm.c
index f9685c2..2628d31 100644
--- a/arm/kvm.c
+++ b/arm/kvm.c
@@ -82,6 +82,8 @@ void kvm__arch_init(struct kvm *kvm, const char *hugetlbfs_path, u64 ram_size)
 		MADV_MERGEABLE | MADV_HUGEPAGE);
 
 	/* Create the virtual GIC. */
-	if (gic__create(kvm, IRQCHIP_GICV2))
+	if (kvm->cfg.arch.irqchip == IRQCHIP_DEFAULT)
+		kvm->cfg.arch.irqchip = IRQCHIP_GICV2;
+	if (gic__create(kvm, kvm->cfg.arch.irqchip))
 		die("Failed to create virtual GIC");
 }
--
2.3.5
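The string-to-type mapping done by irqchip_parser() can be tried outside kvmtool with a stripped-down version. This sketch drops the struct option plumbing and the error message; parse_irqchip() is a hypothetical name, and the enum is copied from the series.

```c
#include <assert.h>
#include <string.h>

enum irqchip_type { IRQCHIP_DEFAULT, IRQCHIP_GICV2, IRQCHIP_GICV3 };

/* Returns 0 and stores the type on success, -1 for an unknown name. */
static int parse_irqchip(const char *arg, enum irqchip_type *type)
{
	assert(arg != NULL && type != NULL);

	*type = IRQCHIP_DEFAULT;
	if (!strcmp(arg, "gicv2"))
		*type = IRQCHIP_GICV2;
	else if (!strcmp(arg, "gicv3"))
		*type = IRQCHIP_GICV3;
	else if (strcmp(arg, "default"))
		return -1;

	return 0;
}
```

As in the patch, "default" is accepted and leaves the choice to the caller, which then picks GICv2 in kvm__arch_init(); anything else that is not a known name is rejected.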
[PATCH v2 6/8] arm: prepare for instantiating different IRQ chip devices
Extend the vGIC handling code to potentially deal with different IRQ
chip devices instead of hard-coding the GICv2 in.
We extend most vGIC functions to take a type parameter, but still put
GICv2 in at the top for the time being.

Signed-off-by: Andre Przywara <andre.przyw...@arm.com>
---
 arm/aarch32/arm-cpu.c        |  2 +-
 arm/aarch64/arm-cpu.c        |  2 +-
 arm/gic.c                    | 66 ++--
 arm/include/arm-common/gic.h |  6 +++---
 arm/kvm.c                    |  2 +-
 5 files changed, 58 insertions(+), 20 deletions(-)

diff --git a/arm/aarch32/arm-cpu.c b/arm/aarch32/arm-cpu.c
index 946e443..d8d6293 100644
--- a/arm/aarch32/arm-cpu.c
+++ b/arm/aarch32/arm-cpu.c
@@ -12,7 +12,7 @@ static void generate_fdt_nodes(void *fdt, struct kvm *kvm, u32 gic_phandle)
 {
 	int timer_interrupts[4] = {13, 14, 11, 10};
 
-	gic__generate_fdt_nodes(fdt, gic_phandle);
+	gic__generate_fdt_nodes(fdt, gic_phandle, IRQCHIP_GICV2);
 	timer__generate_fdt_nodes(fdt, kvm, timer_interrupts);
 }
 
diff --git a/arm/aarch64/arm-cpu.c b/arm/aarch64/arm-cpu.c
index 8efe877..f702b9e 100644
--- a/arm/aarch64/arm-cpu.c
+++ b/arm/aarch64/arm-cpu.c
@@ -12,7 +12,7 @@ static void generate_fdt_nodes(void *fdt, struct kvm *kvm, u32 gic_phandle)
 {
 	int timer_interrupts[4] = {13, 14, 11, 10};
 
-	gic__generate_fdt_nodes(fdt, gic_phandle);
+	gic__generate_fdt_nodes(fdt, gic_phandle, IRQCHIP_GICV2);
 	timer__generate_fdt_nodes(fdt, kvm, timer_interrupts);
 }
 
diff --git a/arm/gic.c b/arm/gic.c
index 8d47562..0ce40e4 100644
--- a/arm/gic.c
+++ b/arm/gic.c
@@ -11,13 +11,13 @@
 
 static int gic_fd = -1;
 
-static int gic__create_device(struct kvm *kvm)
+static int gic__create_device(struct kvm *kvm, enum irqchip_type type)
 {
 	int err;
 	u64 cpu_if_addr = ARM_GIC_CPUI_BASE;
 	u64 dist_addr = ARM_GIC_DIST_BASE;
 	struct kvm_create_device gic_device = {
-		.type	= KVM_DEV_TYPE_ARM_VGIC_V2,
+		.flags	= 0,
 	};
 	struct kvm_device_attr cpu_if_attr = {
 		.group	= KVM_DEV_ARM_VGIC_GRP_ADDR,
@@ -26,21 +26,37 @@ static int gic__create_device(struct kvm *kvm)
 	};
 	struct kvm_device_attr dist_attr = {
 		.group	= KVM_DEV_ARM_VGIC_GRP_ADDR,
-		.attr	= KVM_VGIC_V2_ADDR_TYPE_DIST,
 		.addr	= (u64)(unsigned long)&dist_addr,
 	};
 
+	switch (type) {
+	case IRQCHIP_GICV2:
+		gic_device.type = KVM_DEV_TYPE_ARM_VGIC_V2;
+		break;
+	default:
+		return -ENODEV;
+	}
+
 	err = ioctl(kvm->vm_fd, KVM_CREATE_DEVICE, &gic_device);
 	if (err)
 		return err;
 
 	gic_fd = gic_device.fd;
 
-	err = ioctl(gic_fd, KVM_SET_DEVICE_ATTR, &cpu_if_attr);
+	switch (type) {
+	case IRQCHIP_GICV2:
+		dist_attr.attr = KVM_VGIC_V2_ADDR_TYPE_DIST;
+		err = ioctl(gic_fd, KVM_SET_DEVICE_ATTR, &cpu_if_attr);
+		break;
+	default:
+		return -ENODEV;
+	}
 	if (err)
 		return err;
 
-	return ioctl(gic_fd, KVM_SET_DEVICE_ATTR, &dist_attr);
+	err = ioctl(gic_fd, KVM_SET_DEVICE_ATTR, &dist_attr);
+
+	return err;
 }
 
 static int gic__create_irqchip(struct kvm *kvm)
@@ -71,19 +87,28 @@ static int gic__create_irqchip(struct kvm *kvm)
 	return err;
 }
 
-int gic__create(struct kvm *kvm)
+int gic__create(struct kvm *kvm, enum irqchip_type type)
 {
+	int max_cpus;
 	int err;
 
-	if (kvm->nrcpus > GIC_MAX_CPUS) {
+	switch (type) {
+	case IRQCHIP_GICV2:
+		max_cpus = GIC_MAX_CPUS;
+		break;
+	default:
+		return -ENODEV;
+	}
+
+	if (kvm->nrcpus > max_cpus) {
 		pr_warning("%d CPUS greater than maximum of %d -- truncating\n",
-				kvm->nrcpus, GIC_MAX_CPUS);
-		kvm->nrcpus = GIC_MAX_CPUS;
+				kvm->nrcpus, max_cpus);
+		kvm->nrcpus = max_cpus;
 	}
 
 	/* Try the new way first, and fallback on legacy method otherwise */
-	err = gic__create_device(kvm);
-	if (err)
+	err = gic__create_device(kvm, type);
+	if (err && type == IRQCHIP_GICV2)
 		err = gic__create_irqchip(kvm);
 
 	return err;
@@ -131,15 +156,26 @@ static int gic__init_gic(struct kvm *kvm)
 }
 late_init(gic__init_gic)
 
-void gic__generate_fdt_nodes(void *fdt, u32 phandle)
+void gic__generate_fdt_nodes(void *fdt, u32 phandle, enum irqchip_type type)
 {
+	const char *compatible;
 	u64 reg_prop[] = {
-		cpu_to_fdt64(ARM_GIC_DIST_BASE), cpu_to_fdt64(ARM_GIC_DIST_SIZE),
-		cpu_to_fdt64(ARM_GIC_CPUI_BASE), cpu_to_fdt64(ARM_GIC_CPUI_SIZE),
+		cpu_to_fdt64(ARM_GIC_DIST_BASE),
+		cpu_to_fdt64(ARM_GIC_DIST_SIZE),
[PATCH v2 1/8] AArch64: Reserve two 64k pages for GIC CPU interface
From: Marc Zyngier <marc.zyng...@arm.com>

On AArch64 system with a GICv2, the GICC range can be aligned
to the last 4k block of a 64k page, ending up straddling two
64k pages. In order not to conflict with the distributor mapping,
allocate two 64k pages to the CPU interface.

Signed-off-by: Marc Zyngier <marc.zyng...@arm.com>
Signed-off-by: Andre Przywara <andre.przyw...@arm.com>
---
 arm/aarch64/include/kvm/kvm-arch.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arm/aarch64/include/kvm/kvm-arch.h b/arm/aarch64/include/kvm/kvm-arch.h
index 2f08a26..4925736 100644
--- a/arm/aarch64/include/kvm/kvm-arch.h
+++ b/arm/aarch64/include/kvm/kvm-arch.h
@@ -2,7 +2,7 @@
 #define KVM__KVM_ARCH_H
 
 #define ARM_GIC_DIST_SIZE	0x10000
-#define ARM_GIC_CPUI_SIZE	0x10000
+#define ARM_GIC_CPUI_SIZE	0x20000
 
 #define ARM_KERN_OFFSET(kvm)	((kvm)->cfg.arch.aarch32_guest	?	\
 				0x8000				:	\
--
2.3.5
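The straddling case described in the commit message is easy to check with a little arithmetic. The sketch below counts how many pages a region touches; pages_spanned() is a made-up helper, and the 8 KiB size used for the GICC range in the example is an assumption about GICv2, not something stated in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Number of page_size-sized pages touched by [start, start + size). */
static unsigned int pages_spanned(uint64_t start, uint64_t size,
				  uint64_t page_size)
{
	uint64_t first, last;

	assert(size > 0 && page_size > 0);

	first = start / page_size;
	last = (start + size - 1) / page_size;

	return (unsigned int)(last - first + 1);
}
```

An 8 KiB range that starts in the last 4k block of a 64k page (offset 0xF000) ends at 0x10FFF and therefore touches two 64k pages, whereas the same range at offset 0 or 0xE000 fits in one. Reserving 0x20000 for the CPU interface covers the worst case.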
Re: [PATCH v3 1/4] KVM: x86: Split the APIC from the rest of IRQCHIP.
From the perspective of avoiding impacting other architectures, this is a good idea, but the naming seems strange in the x86 case. Having irqchip_in_kernel be true when the ioapic/pic are in userspace seems strange. Admittedly, the irqchip isn't a real concept on x86, so inventing a new meaning is fine. From the KVM point of view, the irqchip is whatever delivers interrupts to the vCPU---which is the LAPIC for x86.

Paolo
Re: [PATCH v2 00/13] SMM implementation for KVM
On 05/27/2015 08:05 PM, Paolo Bonzini wrote:
> This brings together the remaining parts of SMM.  For now I've left the
> weird interaction between SMM and NMI blocking, and I'm using the same
> format for the state save area (which is also the one used by QEMU) as
> the RFC.
>
> It builds on the previous cleanup patches, which (with the exception
> of "KVM: x86: pass kvm_mmu_page to gfn_to_rmap") are now in kvm/queue.
> The first six patches are more or less the same as the previous version,
> while the address spaces part hopefully touches all affected functions
> now.
>
> Patches 1-6 implement the SMM API and world switch; patches 7-12
> implement the multiple address spaces; patch 13 ties the loose
> ends and advertises the capability.
>
> Tested with SeaBIOS and OVMF, where SMM provides the trusted base
> for secure boot.

Nice work. While I did not do a thorough review, the mmu bits look robust.

> Thanks,
>
> Paolo
>
> Paolo Bonzini (13):
>   KVM: x86: introduce num_emulated_msrs
>   KVM: x86: pass host_initiated to functions that read MSRs
>   KVM: x86: pass the whole hflags field to emulator and back
>   KVM: x86: API changes for SMM support
>   KVM: x86: stubs for SMM support
>   KVM: x86: save/load state on SMM switch
>   KVM: add vcpu-specific functions to read/write/translate GFNs
>   KVM: implement multiple address spaces
>   KVM: x86: pass kvm_mmu_page to gfn_to_rmap
>   KVM: x86: use vcpu-specific functions to read/write/translate GFNs
>   KVM: x86: work on all available address spaces
>   KVM: x86: add SMM to the MMU role, support SMRAM address space
>   KVM: x86: advertise KVM_CAP_X86_SMM
>
>  Documentation/virtual/kvm/api.txt        |  52 ++-
>  arch/powerpc/include/asm/kvm_book3s_64.h |   2 +-
>  arch/x86/include/asm/kvm_emulate.h       |   9 +-
>  arch/x86/include/asm/kvm_host.h          |  44 ++-
>  arch/x86/include/asm/vmx.h               |   1 +
>  arch/x86/include/uapi/asm/kvm.h          |  11 +-
>  arch/x86/kvm/cpuid.h                     |   8 +
>  arch/x86/kvm/emulate.c                   | 262 +-
>  arch/x86/kvm/kvm_cache_regs.h            |   5 +
>  arch/x86/kvm/lapic.c                     |   4 +-
>  arch/x86/kvm/mmu.c                       | 171 +-
>  arch/x86/kvm/mmu_audit.c                 |  16 +-
>  arch/x86/kvm/paging_tmpl.h               |  18 +-
>  arch/x86/kvm/svm.c                       |  73 ++--
>  arch/x86/kvm/trace.h                     |  22 ++
>  arch/x86/kvm/vmx.c                       | 106 +++--
>  arch/x86/kvm/x86.c                       | 562 ++-
>  include/linux/kvm_host.h                 |  49 ++-
>  include/uapi/linux/kvm.h                 |   6 +-
>  virt/kvm/kvm_main.c                      | 237 ++---
>  20 files changed, 1337 insertions(+), 321 deletions(-)
Re: [PATCH] kvm: irqchip: Break up high order allocations of kvm_irq_routing_table
On 05/06/2015 12:50, Joerg Roedel wrote:
>> Great, I'll apply the patch.
>
> Gentle ping. I don't see the patch in the queue or next branches of the
> KVM tree yet. Do you plan to apply it for v4.2?

Fell through the cracks, sorry.  I will apply it today.

Paolo
Re: [PATCH v2 07/13] KVM: add vcpu-specific functions to read/write/translate GFNs
On 29/05/2015 21:23, Radim Krčmář wrote:
>> +int kvm_vcpu_write_guest(struct kvm_vcpu *vcpu, gpa_t gpa, const void *data,
>> +			 unsigned long len)
>> +{
>> +	gfn_t gfn = gpa >> PAGE_SHIFT;
>> +	int seg;
>> +	int offset = offset_in_page(gpa);
>> +	int ret;
>> +
>> +	while ((seg = next_segment(len, offset)) != 0) {
>> +		ret = kvm_vcpu_write_guest_page(vcpu, gfn, data, offset, seg);
>> +		if (ret < 0)
>> +			return ret;
>> +		offset = 0;
>> +		len -= seg;
>> +		data += seg;
>> +		++gfn;
>> +	}
>> +	return 0;
>> +}
>
> (There is no need to pass vcpu, and kvm, in this API.

How so?  A single kvm_vcpu_write_guest can cross multiple slots.

Paolo

>  Extracting memslots early will help to keep more code common.
>
>  I have patches that did a superset of this for the old API, so posting
>  them after this series is finalized will be simple.)
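For anyone wondering why a single write can cross multiple slots: the loop in kvm_vcpu_write_guest() splits the range at every page boundary and looks up each gfn separately. The user-space sketch below reproduces only the splitting; struct chunk and split_write() are hypothetical names, next_segment() is written from memory of the kernel helper, and a 4 KiB page size is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EX_PAGE_SHIFT	12
#define EX_PAGE_SIZE	(1ul << EX_PAGE_SHIFT)

struct chunk {
	uint64_t gfn;		/* guest frame the chunk falls into */
	unsigned long offset;	/* offset of the chunk within that frame */
	unsigned long len;	/* number of bytes in the chunk */
};

/* Bytes that can be handled before hitting the end of the current page. */
static unsigned long next_segment(unsigned long len, unsigned long offset)
{
	if (len > EX_PAGE_SIZE - offset)
		return EX_PAGE_SIZE - offset;
	return len;
}

/* Split [gpa, gpa + len) into per-page chunks; returns the chunk count. */
static size_t split_write(uint64_t gpa, unsigned long len,
			  struct chunk *out, size_t max)
{
	uint64_t gfn = gpa >> EX_PAGE_SHIFT;
	unsigned long offset = gpa & (EX_PAGE_SIZE - 1);
	unsigned long seg;
	size_t n = 0;

	while ((seg = next_segment(len, offset)) != 0) {
		assert(n < max);
		out[n++] = (struct chunk){ gfn, offset, seg };
		offset = 0;	/* later chunks start at a page boundary */
		len -= seg;
		++gfn;
	}

	return n;
}
```

A 32-byte write at gpa 0x1ff0 comes out as two 16-byte chunks, one at the end of gfn 1 and one at the start of gfn 2, and nothing guarantees that those two frames live in the same memslot.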
Re: [PATCH] kvm: irqchip: Break up high order allocations of kvm_irq_routing_table
Hi Paolo,

On Mon, May 11, 2015 at 03:27:26PM +0200, Paolo Bonzini wrote:
> Great, I'll apply the patch.

Gentle ping. I don't see the patch in the queue or next branches of the
KVM tree yet. Do you plan to apply it for v4.2?

	Joerg
Re: [PATCH v2 07/13] KVM: add vcpu-specific functions to read/write/translate GFNs
2015-06-05 12:26+0200, Paolo Bonzini:
> On 29/05/2015 21:23, Radim Krčmář wrote:
>>> +int kvm_vcpu_write_guest(struct kvm_vcpu *vcpu, gpa_t gpa, const void *data,
>>> +			 unsigned long len)
>>> +{
>> | [...]
>>> +}
>>
>> (There is no need to pass vcpu, and kvm, in this API.
>
> How so? A single kvm_vcpu_write_guest can cross multiple slots.

I meant passing 'struct kvm_memslots *' instead and as soon as possible,
which would still allow more slots. Something like this hunk:

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 10ae7e348dcc..8c6d84c12f18 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1718,8 +1718,8 @@ int kvm_vcpu_write_guest_page(struct kvm_vcpu *vcpu, gfn_t gfn,
 }
 EXPORT_SYMBOL_GPL(kvm_vcpu_write_guest_page);
 
-int kvm_write_guest(struct kvm *kvm, gpa_t gpa, const void *data,
-		    unsigned long len)
+static int __kvm_write_guest(struct kvm_memslots *slots, gpa_t gpa, const void *data,
+			     unsigned long len)
 {
 	gfn_t gfn = gpa >> PAGE_SHIFT;
 	int seg;
@@ -1727,7 +1727,8 @@ int kvm_write_guest(struct kvm *kvm, gpa_t gpa, const void *data,
 	int ret;
 
 	while ((seg = next_segment(len, offset)) != 0) {
-		ret = kvm_write_guest_page(kvm, gfn, data, offset, seg);
+		ret = __kvm_write_guest_page(__gfn_to_memslot(slots, gfn), gfn,
+					     data, offset, seg);
 		if (ret < 0)
 			return ret;
 		offset = 0;
@@ -1737,26 +1738,18 @@ int kvm_write_guest(struct kvm *kvm, gpa_t gpa, const void *data,
 	}
 	return 0;
 }
+
+int kvm_write_guest(struct kvm *kvm, gpa_t gpa, const void *data,
+		    unsigned long len)
+{
+	return __kvm_write_guest(kvm_memslots(kvm), gpa, data, len);
+}
 EXPORT_SYMBOL_GPL(kvm_write_guest);
 
 int kvm_vcpu_write_guest(struct kvm_vcpu *vcpu, gpa_t gpa, const void *data,
 			 unsigned long len)
 {
-	gfn_t gfn = gpa >> PAGE_SHIFT;
-	int seg;
-	int offset = offset_in_page(gpa);
-	int ret;
-
-	while ((seg = next_segment(len, offset)) != 0) {
-		ret = kvm_vcpu_write_guest_page(vcpu, gfn, data, offset, seg);
-		if (ret < 0)
-			return ret;
-		offset = 0;
-		len -= seg;
-		data += seg;
-		++gfn;
-	}
-	return 0;
+	return __kvm_write_guest(kvm_vcpu_memslots(vcpu), gpa, data, len);
 }
 EXPORT_SYMBOL_GPL(kvm_vcpu_write_guest);
Re: [PATCH v2] arm/arm64: KVM: Properly account for guest CPU time
On 06/02/2015 02:27 AM, Christoffer Dall wrote: On Mon, Jun 01, 2015 at 08:48:22AM -0700, Mario Smarduch wrote: On 05/30/2015 11:59 PM, Christoffer Dall wrote: Hi Mario, On Fri, May 29, 2015 at 03:34:47PM -0700, Mario Smarduch wrote: On 05/28/2015 11:49 AM, Christoffer Dall wrote: Until now we have been calling kvm_guest_exit after re-enabling interrupts when we come back from the guest, but this has the unfortunate effect that CPU time accounting done in the context of timer interrupts occurring while the guest is running doesn't properly notice that the time since the last tick was spent in the guest. Inspired by the comment in the x86 code, move the kvm_guest_exit() call below the local_irq_enable() call and change __kvm_guest_exit() to kvm_guest_exit(), because we are now calling this function with interrupts enabled. We have to now explicitly disable preemption and not enable preemption before we've called kvm_guest_exit(), since otherwise we could be preempted and everything happening before we eventually get scheduled again would be accounted for as guest time. At the same time, move the trace_kvm_exit() call outside of the atomic section, since there is no reason for us to do that with interrupts disabled. Signed-off-by: Christoffer Dall christoffer.d...@linaro.org --- This patch is based on kvm/queue, because it has the kvm_guest_enter/exit rework recently posted by Christian Borntraeger. I hope I got the logic of this right, there were 2 slightly worrying facts about this: First, we now enable and disable and enable interrupts on each exit path, but I couldn't see any performance overhead on hackbench - yes the only benchmark we care about. Second, looking at the ppc and mips code, they seem to also call kvm_guest_exit() before enabling interrupts, so I don't understand how guest CPU time accounting works on those architectures. Changes since v1: - Tweak comment and commit text based on Marc's feedback. 
- Explicitly disable preemption and enable it only after kvm_guest_exit(). arch/arm/kvm/arm.c | 21 + 1 file changed, 17 insertions(+), 4 deletions(-) diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c index e41cb11..fe8028d 100644 --- a/arch/arm/kvm/arm.c +++ b/arch/arm/kvm/arm.c @@ -532,6 +532,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run) kvm_vgic_flush_hwstate(vcpu); kvm_timer_flush_hwstate(vcpu); + preempt_disable(); local_irq_disable(); /* @@ -544,6 +545,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run) if (ret = 0 || need_new_vmid_gen(vcpu-kvm)) { local_irq_enable(); + preempt_enable(); kvm_timer_sync_hwstate(vcpu); kvm_vgic_sync_hwstate(vcpu); continue; @@ -559,8 +561,10 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run) ret = kvm_call_hyp(__kvm_vcpu_run, vcpu); vcpu-mode = OUTSIDE_GUEST_MODE; - __kvm_guest_exit(); - trace_kvm_exit(kvm_vcpu_trap_get_class(vcpu), *vcpu_pc(vcpu)); + /* + * Back from guest + */ + /* * We may have taken a host interrupt in HYP mode (ie * while executing the guest). This interrupt is still @@ -574,8 +578,17 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run) local_irq_enable(); /* - * Back from guest - */ + * We do local_irq_enable() before calling kvm_guest_exit() so + * that if a timer interrupt hits while running the guest we + * account that tick as being spent in the guest. We enable + * preemption after calling kvm_guest_exit() so that if we get + * preempted we make sure ticks after that is not counted as + * guest time. + */ + kvm_guest_exit(); + trace_kvm_exit(kvm_vcpu_trap_get_class(vcpu), *vcpu_pc(vcpu)); + preempt_enable(); + kvm_timer_sync_hwstate(vcpu); kvm_vgic_sync_hwstate(vcpu); Hi Christoffer, so currently we take a snap shot when we enter the guest (tsk-vtime_snap) and upon exit add the time we spent in the guest and update accrued time, which appears correct. 
not on arm64, because we don't select HAVE_VIRT_CPU_ACCOUNTING_GEN. Or am I missing something obvious here? I see what you mean we can't use cycle based accounting to accrue Guest time. See other thread, we can enable this in the config but it still only works with NO_HZ_FULL. With this patch it appears that interrupts running in host mode are accrued to Guest time, and additional preemption latency is added. It is
Re: [PATCH v2 07/13] KVM: add vcpu-specific functions to read/write/translate GFNs
On 05/06/2015 14:10, Radim Krčmář wrote:
> +		ret = __kvm_write_guest_page(__gfn_to_memslot(slots, gfn), gfn,
> +					     data, offset, seg);

Even better, let's pass memslots to all the __ functions.

Paolo
Re: [PATCH v2 07/13] KVM: add vcpu-specific functions to read/write/translate GFNs
On 05/06/2015 14:10, Radim Krčmář wrote:
> 2015-06-05 12:26+0200, Paolo Bonzini:
>> On 29/05/2015 21:23, Radim Krčmář wrote:
>>>> +int kvm_vcpu_write_guest(struct kvm_vcpu *vcpu, gpa_t gpa, const void *data,
>>>> +			 unsigned long len)
>>>> +{
>>> | [...]
>>>> +}
>>>
>>> (There is no need to pass vcpu, and kvm, in this API.
>>
>> How so? A single kvm_vcpu_write_guest can cross multiple slots.
>
> I meant passing 'struct kvm_memslots *' instead and as soon as possible,
> which would still allow more slots.

Oh, indeed that works fine!

Paolo
Re: [PATCH] KVM: x86: fix lapic.timer_mode on restore
On 05/06/2015 20:57, Radim Krčmář wrote: lapic.timer_mode was not properly initialized after migration, which broke few useful things, like login, by making every sleep eternal. Fix this by calling apic_update_lvtt in kvm_apic_post_state_restore. There are other slowpaths that update lvtt, so this patch makes sure something similar doesn't happen again by calling apic_update_lvtt after every modification. Cc: sta...@vger.kernel.org Fixes: f30ebc312ca9 (KVM: x86: optimize some accesses to LVTT and SPIV) Signed-off-by: Radim Krčmář rkrc...@redhat.com --- arch/x86/kvm/lapic.c | 26 -- 1 file changed, 16 insertions(+), 10 deletions(-) diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c index beeef05bb4d9..36e9de1b4127 100644 --- a/arch/x86/kvm/lapic.c +++ b/arch/x86/kvm/lapic.c @@ -1103,6 +1103,17 @@ static void update_divide_count(struct kvm_lapic *apic) apic-divide_count); } +static void apic_update_lvtt(struct kvm_lapic *apic) +{ + u32 timer_mode = kvm_apic_get_reg(apic, APIC_LVTT) + apic-lapic_timer.timer_mode_mask; + + if (apic-lapic_timer.timer_mode != timer_mode) { + apic-lapic_timer.timer_mode = timer_mode; + hrtimer_cancel(apic-lapic_timer.timer); + } +} + static void apic_timer_expired(struct kvm_lapic *apic) { struct kvm_vcpu *vcpu = apic-vcpu; @@ -1311,6 +1322,7 @@ static int apic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val) apic_set_reg(apic, APIC_LVTT + 0x10 * i, lvt_val | APIC_LVT_MASKED); } + apic_update_lvtt(apic); atomic_set(apic-lapic_timer.pending, 0); } @@ -1343,20 +1355,13 @@ static int apic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val) break; - case APIC_LVTT: { - u32 timer_mode = val apic-lapic_timer.timer_mode_mask; - - if (apic-lapic_timer.timer_mode != timer_mode) { - apic-lapic_timer.timer_mode = timer_mode; - hrtimer_cancel(apic-lapic_timer.timer); - } - + case APIC_LVTT: if (!kvm_apic_sw_enabled(apic)) val |= APIC_LVT_MASKED; val = (apic_lvt_mask[0] | apic-lapic_timer.timer_mode_mask); apic_set_reg(apic, APIC_LVTT, val); + 
apic_update_lvtt(apic); break; - } case APIC_TMICT: if (apic_lvtt_tscdeadline(apic)) @@ -1588,7 +1593,7 @@ void kvm_lapic_reset(struct kvm_vcpu *vcpu, bool init_event) for (i = 0; i APIC_LVT_NUM; i++) apic_set_reg(apic, APIC_LVTT + 0x10 * i, APIC_LVT_MASKED); - apic-lapic_timer.timer_mode = 0; + apic_update_lvtt(apic); if (!(vcpu-kvm-arch.disabled_quirks KVM_QUIRK_LINT0_REENABLED)) apic_set_reg(apic, APIC_LVT0, SET_APIC_DELIVERY_MODE(0, APIC_MODE_EXTINT)); @@ -1816,6 +1821,7 @@ void kvm_apic_post_state_restore(struct kvm_vcpu *vcpu, apic_update_ppr(apic); hrtimer_cancel(apic-lapic_timer.timer); + apic_update_lvtt(apic); update_divide_count(apic); start_apic_timer(apic); apic-irr_pending = true; Marcelo, if you have some free cycles feel free to apply this to kvm/master and send it to Linus sometime next week. I cannot do it on Monday and I'll be on vacation afterwards. (I'll be back as soon as June 16th so I didn't plan on a formal handoff, but I think it's better to have this in 4.1. The merge window conflicts with Linus's own vacation and might be delayed). And thanks Radim for the fix, it looks good. Paolo -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] KVM: x86: fix lapic.timer_mode on restore
lapic.timer_mode was not properly initialized after migration, which
broke few useful things, like login, by making every sleep eternal.

Fix this by calling apic_update_lvtt in kvm_apic_post_state_restore.

There are other slowpaths that update lvtt, so this patch makes sure
something similar doesn't happen again by calling apic_update_lvtt
after every modification.

Cc: sta...@vger.kernel.org
Fixes: f30ebc312ca9 ("KVM: x86: optimize some accesses to LVTT and SPIV")
Signed-off-by: Radim Krčmář <rkrc...@redhat.com>
---
 arch/x86/kvm/lapic.c | 26 ++++++++++++++++----------
 1 file changed, 16 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index beeef05bb4d9..36e9de1b4127 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1103,6 +1103,17 @@ static void update_divide_count(struct kvm_lapic *apic)
 				   apic->divide_count);
 }
 
+static void apic_update_lvtt(struct kvm_lapic *apic)
+{
+	u32 timer_mode = kvm_apic_get_reg(apic, APIC_LVTT) &
+			apic->lapic_timer.timer_mode_mask;
+
+	if (apic->lapic_timer.timer_mode != timer_mode) {
+		apic->lapic_timer.timer_mode = timer_mode;
+		hrtimer_cancel(&apic->lapic_timer.timer);
+	}
+}
+
 static void apic_timer_expired(struct kvm_lapic *apic)
 {
 	struct kvm_vcpu *vcpu = apic->vcpu;
@@ -1311,6 +1322,7 @@ static int apic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val)
 				apic_set_reg(apic, APIC_LVTT + 0x10 * i,
 					     lvt_val | APIC_LVT_MASKED);
 			}
+			apic_update_lvtt(apic);
 			atomic_set(&apic->lapic_timer.pending, 0);
 
 		}
@@ -1343,20 +1355,13 @@ static int apic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val)
 
 		break;
 
-	case APIC_LVTT: {
-		u32 timer_mode = val & apic->lapic_timer.timer_mode_mask;
-
-		if (apic->lapic_timer.timer_mode != timer_mode) {
-			apic->lapic_timer.timer_mode = timer_mode;
-			hrtimer_cancel(&apic->lapic_timer.timer);
-		}
-
+	case APIC_LVTT:
 		if (!kvm_apic_sw_enabled(apic))
 			val |= APIC_LVT_MASKED;
 		val &= (apic_lvt_mask[0] | apic->lapic_timer.timer_mode_mask);
 		apic_set_reg(apic, APIC_LVTT, val);
+		apic_update_lvtt(apic);
 		break;
-	}
 
 	case APIC_TMICT:
 		if (apic_lvtt_tscdeadline(apic))
@@ -1588,7 +1593,7 @@ void kvm_lapic_reset(struct kvm_vcpu *vcpu, bool init_event)
 
 	for (i = 0; i < APIC_LVT_NUM; i++)
 		apic_set_reg(apic, APIC_LVTT + 0x10 * i, APIC_LVT_MASKED);
-	apic->lapic_timer.timer_mode = 0;
+	apic_update_lvtt(apic);
 	if (!(vcpu->kvm->arch.disabled_quirks & KVM_QUIRK_LINT0_REENABLED))
 		apic_set_reg(apic, APIC_LVT0,
 			     SET_APIC_DELIVERY_MODE(0, APIC_MODE_EXTINT));
@@ -1816,6 +1821,7 @@ void kvm_apic_post_state_restore(struct kvm_vcpu *vcpu,
 
 	apic_update_ppr(apic);
 	hrtimer_cancel(&apic->lapic_timer.timer);
+	apic_update_lvtt(apic);
 	update_divide_count(apic);
 	start_apic_timer(apic);
 	apic->irr_pending = true;
-- 
2.4.2
Re: [PATCH] kvmtool: don't use PCI config space IRQ line field
On Thu, Jun 04, 2015 at 04:20:45PM +0100, Andre Przywara wrote:
> In PCI config space there is an interrupt line field (offset 0x3c),
> which is used to initially communicate the IRQ line number from
> firmware to the OS. _Hardware_ should never use this information,
> as the OS is free to write any information in there.
> But kvmtool uses this number when it triggers IRQs in the guest,
> which fails starting with Linux 3.19-rc1, where the PCI layer starts
> writing the virtual IRQ number in there.
>
> Fix that by storing the IRQ number in a separate field in
> struct virtio_pci, which is independent from the PCI config space
> and cannot be influenced by the guest.
> This fixes ARM/ARM64 guests using PCI with newer kernels.
>
> Signed-off-by: Andre Przywara <andre.przyw...@arm.com>
> ---
>  include/kvm/virtio-pci.h | 8 ++++++++
>  virtio/pci.c             | 9 ++++++---
>  2 files changed, 14 insertions(+), 3 deletions(-)
>
> diff --git a/include/kvm/virtio-pci.h b/include/kvm/virtio-pci.h
> index c795ce7..b70cadd 100644
> --- a/include/kvm/virtio-pci.h
> +++ b/include/kvm/virtio-pci.h
> @@ -30,6 +30,14 @@ struct virtio_pci {
>  	u8			isr;
>  	u32			features;
>  
> +	/*
> +	 * We cannot rely on the INTERRUPT_LINE byte in the config space once
> +	 * we have run guest code, as the OS is allowed to use that field
> +	 * as a scratch pad to communicate between driver and PCI layer.
> +	 * So store our legacy interrupt line number in here for internal use.
> +	 */
> +	u8			legacy_irq_line;
> +
>  	/* MSI-X */
>  	u16			config_vector;
>  	u32			config_gsi;
> diff --git a/virtio/pci.c b/virtio/pci.c
> index 7556239..e17e5a9 100644
> --- a/virtio/pci.c
> +++ b/virtio/pci.c
> @@ -141,7 +141,7 @@ static bool virtio_pci__io_in(struct ioport *ioport, struct kvm_cpu *vcpu, u16 p
>  		break;
>  	case VIRTIO_PCI_ISR:
>  		ioport__write8(data, vpci->isr);
> -		kvm__irq_line(kvm, vpci->pci_hdr.irq_line, VIRTIO_IRQ_LOW);
> +		kvm__irq_line(kvm, vpci->legacy_irq_line, VIRTIO_IRQ_LOW);
>  		vpci->isr = VIRTIO_IRQ_LOW;
>  		break;
>  	default:
> @@ -299,7 +299,7 @@ int virtio_pci__signal_vq(struct kvm *kvm, struct virtio_device *vdev, u32 vq)
>  			kvm__irq_trigger(kvm, vpci->gsis[vq]);
>  	} else {
>  		vpci->isr = VIRTIO_IRQ_HIGH;
> -		kvm__irq_trigger(kvm, vpci->pci_hdr.irq_line);
> +		kvm__irq_trigger(kvm, vpci->legacy_irq_line);
>  	}
>  	return 0;
>  }
> @@ -323,7 +323,7 @@ int virtio_pci__signal_config(struct kvm *kvm, struct virtio_device *vdev)
>  			kvm__irq_trigger(kvm, vpci->config_gsi);
>  	} else {
>  		vpci->isr = VIRTIO_PCI_ISR_CONFIG;
> -		kvm__irq_trigger(kvm, vpci->pci_hdr.irq_line);
> +		kvm__irq_trigger(kvm, vpci->legacy_irq_line);
>  	}
>  
>  	return 0;
> @@ -422,6 +422,9 @@ int virtio_pci__init(struct kvm *kvm, void *dev, struct virtio_device *vdev,
>  	if (r < 0)
>  		goto free_msix_mmio;
>  
> +	/* save the IRQ that device__register() has allocated */
> +	vpci->legacy_irq_line = vpci->pci_hdr.irq_line;

I'd rather we used the container_of trick that we do for virtio-mmio
devices when assigning the irq in device__register. Then we can avoid
this line completely.

Will
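A rough sketch of what the container_of suggestion could look like, under assumptions: the code that assigns the IRQ only sees the embedded config header, steps back to the enclosing device structure, and fills in a private copy the guest cannot reach. The names `toy_pci_hdr`, `toy_virtio_pci`, `toy_assign_irq()` and `toy_irq_to_trigger()` are hypothetical and do not mirror kvmtool's actual `device__register()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Recover the enclosing structure from a pointer to one of its members. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for the guest-visible PCI config header. */
struct toy_pci_hdr {
	uint8_t irq_line;	/* the guest OS may overwrite this at any time */
};

/* Hypothetical stand-in for struct virtio_pci. */
struct toy_virtio_pci {
	uint8_t isr;
	struct toy_pci_hdr pci_hdr;
	uint8_t legacy_irq_line;	/* private copy, never exposed to the guest */
};

/* Registration code is handed only the header, yet can set the private copy. */
static void toy_assign_irq(struct toy_pci_hdr *hdr, uint8_t irq)
{
	struct toy_virtio_pci *vpci;

	assert(hdr != NULL);
	vpci = container_of(hdr, struct toy_virtio_pci, pci_hdr);
	hdr->irq_line = irq;		/* initial value advertised to the OS */
	vpci->legacy_irq_line = irq;	/* value the device model keeps using */
}

/* Interrupt injection reads the private copy, not the config space byte. */
static uint8_t toy_irq_to_trigger(const struct toy_virtio_pci *vpci)
{
	return vpci->legacy_irq_line;
}
```

With this shape the separate copy in the init function is no longer needed, since the private field is set at the moment the IRQ is allocated.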
Re: Announcing qboot, a minimal x86 firmware for QEMU
On Tue, May 26, 2015 at 9:47 AM, Stefan Hajnoczi <stefa...@gmail.com> wrote:
> On Fri, May 22, 2015 at 10:53:54AM +0800, Yong Wang wrote:
>> On Thu, May 21, 2015 at 03:51:43PM +0200, Paolo Bonzini wrote:
>>> On the QEMU side, there is no support yet for persistent memory and the
>>> NFIT tables from ACPI 6.0. Once that (and ACPI support) is added, qboot
>>> will automatically start using it.
>>
>> We are working on adding NFIT support into virtual bios.
>
> Great. I asked about this on the #pmem (irc.oftc.net) IRC channel last
> week. Which virtual bios are you targeting?

Ping?

Interest in persistent memory is picking up and I'd like to avoid
duplicating work. Which pieces do you have patches for?

1. QEMU -device pmem,file=/path/to/dax/file,id=pmem1 and fw_cfg/ACPI
   info that gets passed to the guest
2. SeaBIOS NFIT ACPI table
3. ACPI NVDIMM DSM (probably not much needed, most features would be
   disabled)

Thanks,
Stefan
Re: [PATCH v2 07/13] KVM: add vcpu-specific functions to read/write/translate GFNs
2015-06-05 14:46+0200, Paolo Bonzini:
> On 05/06/2015 14:10, Radim Krčmář wrote:
>> +		ret = __kvm_write_guest_page(__gfn_to_memslot(slots, gfn), gfn,
>> +					     data, offset, seg);
>
> Even better, let's pass memslots to all the __ functions.

Yeah, while scoping it, I noticed a bug in the series ... makes me wish
that C had a useful type system.

A quick fix would be to replace gpa with gfn in calls to
__kvm_read_guest_atomic(). I presume you'd prefer a new patch to
rebasing, so it's below.

---
KVM: fix gpa/gfn mixup in __kvm_read_guest_atomic

Refactoring passed gpa instead of gfn to __kvm_read_guest_atomic.
While at it, lessen code duplication by extracting slots earlier.

Fixes: 841509f38372 ("KVM: add vcpu-specific functions to read/write/translate GFNs")
Signed-off-by: Radim Krčmář <rkrc...@redhat.com>
---
 virt/kvm/kvm_main.c | 19 +++++++++----------
 1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 10ae7e348dcc..4fa1edc34630 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1645,11 +1645,14 @@ int kvm_vcpu_read_guest(struct kvm_vcpu *vcpu, gpa_t gpa, void *data, unsigned l
 }
 EXPORT_SYMBOL_GPL(kvm_vcpu_read_guest);
 
-static int __kvm_read_guest_atomic(struct kvm_memory_slot *slot, gfn_t gfn,
-				   void *data, int offset, unsigned long len)
+static int __kvm_read_guest_atomic(struct kvm_memslots *slots, gpa_t gpa,
+				   void *data, unsigned long len)
 {
 	int r;
 	unsigned long addr;
+	gfn_t gfn = gpa >> PAGE_SHIFT;
+	struct kvm_memory_slot *slot = __gfn_to_memslot(slots, gfn);
+	int offset = offset_in_page(gpa);
 
 	addr = gfn_to_hva_memslot_prot(slot, gfn, NULL);
 	if (kvm_is_error_hva(addr))
@@ -1665,22 +1668,18 @@ static int __kvm_read_guest_atomic(struct kvm_memory_slot *slot, gfn_t gfn,
 int kvm_read_guest_atomic(struct kvm *kvm, gpa_t gpa, void *data,
 			  unsigned long len)
 {
-	gfn_t gfn = gpa >> PAGE_SHIFT;
-	struct kvm_memory_slot *slot = gfn_to_memslot(kvm, gfn);
-	int offset = offset_in_page(gpa);
+	struct kvm_memslots *slots = kvm_memslots(kvm);
 
-	return __kvm_read_guest_atomic(slot, gpa, data, offset, len);
+	return __kvm_read_guest_atomic(slots, gpa, data, len);
 }
 EXPORT_SYMBOL_GPL(kvm_read_guest_atomic);
 
 int kvm_vcpu_read_guest_atomic(struct kvm_vcpu *vcpu, gpa_t gpa,
 			       void *data, unsigned long len)
 {
-	gfn_t gfn = gpa >> PAGE_SHIFT;
-	struct kvm_memory_slot *slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn);
-	int offset = offset_in_page(gpa);
+	struct kvm_memslots *slots = kvm_vcpu_memslots(vcpu);
 
-	return __kvm_read_guest_atomic(slot, gpa, data, offset, len);
+	return __kvm_read_guest_atomic(slots, gpa, data, len);
 }
 EXPORT_SYMBOL_GPL(kvm_vcpu_read_guest_atomic);
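On the type-system remark: since gpa_t and gfn_t are both plain integer typedefs, the compiler accepts one where the other is expected. One way to get a diagnostic in plain C is to wrap each in its own single-member struct. The sketch below is only an illustration of that idea, not a proposal for the tree; `toy_gpa_t`, `toy_gfn_t` and the helpers are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_MASK_LOW ((1ull << PAGE_SHIFT) - 1)

/* Distinct struct types: passing one for the other is a compile error. */
typedef struct { uint64_t val; } toy_gpa_t;	/* guest physical address */
typedef struct { uint64_t val; } toy_gfn_t;	/* guest frame number */

static toy_gfn_t toy_gpa_to_gfn(toy_gpa_t gpa)
{
	toy_gfn_t gfn = { gpa.val >> PAGE_SHIFT };
	return gfn;
}

static uint64_t toy_offset_in_page(toy_gpa_t gpa)
{
	return gpa.val & PAGE_MASK_LOW;
}

/*
 * Stand-in for a helper that wants a frame number plus an offset.
 * It returns the byte address it would access, so the result is checkable.
 */
static uint64_t toy_read_guest_atomic(toy_gfn_t gfn, uint64_t offset)
{
	assert(offset <= PAGE_MASK_LOW);
	return (gfn.val << PAGE_SHIFT) + offset;
}

/*
 * toy_read_guest_atomic(gpa, offset) with a toy_gpa_t argument does not
 * compile ("incompatible type"), which is the diagnostic the plain
 * integer typedefs cannot give.
 */
```

The cost is the `.val` noise at every arithmetic site, which is probably why plain typedefs are the norm.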
[PATCH] KVM: arm/arm64: Enable the KVM-VFIO device
From: Kim Phillips <kim.phill...@linaro.org>

The KVM-VFIO device is used by the QEMU VFIO device. It is used to
record the list of in-use VFIO groups so that KVM can manipulate them.

Signed-off-by: Kim Phillips <kim.phill...@linaro.org>
Signed-off-by: Eric Auger <eric.au...@linaro.org>
---
- previously included in KVM-VFIO IRQ forward control v6 series.
  The rationale for splitting it out is that the unavailability of the
  kvm-vfio device produces a warning when launching the QEMU VFIO
  platform device that can puzzle some users (although it is not
  blocking):
  Failed to create KVM VFIO device: No such device
---
 arch/arm/kvm/Kconfig    | 1 +
 arch/arm/kvm/Makefile   | 2 +-
 arch/arm64/kvm/Kconfig  | 1 +
 arch/arm64/kvm/Makefile | 2 +-
 4 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/arm/kvm/Kconfig b/arch/arm/kvm/Kconfig
index f1f79d1..bfb915d 100644
--- a/arch/arm/kvm/Kconfig
+++ b/arch/arm/kvm/Kconfig
@@ -28,6 +28,7 @@ config KVM
 	select KVM_GENERIC_DIRTYLOG_READ_PROTECT
 	select SRCU
 	select MMU_NOTIFIER
+	select KVM_VFIO
 	select HAVE_KVM_EVENTFD
 	select HAVE_KVM_IRQFD
 	depends on ARM_VIRT_EXT && ARM_LPAE && ARM_ARCH_TIMER
diff --git a/arch/arm/kvm/Makefile b/arch/arm/kvm/Makefile
index 139e46c..c5eef02c 100644
--- a/arch/arm/kvm/Makefile
+++ b/arch/arm/kvm/Makefile
@@ -15,7 +15,7 @@ AFLAGS_init.o := -Wa,-march=armv7-a$(plus_virt)
 AFLAGS_interrupts.o := -Wa,-march=armv7-a$(plus_virt)
 
 KVM := ../../../virt/kvm
-kvm-arm-y = $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o $(KVM)/eventfd.o
+kvm-arm-y = $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o $(KVM)/eventfd.o $(KVM)/vfio.o
 
 obj-y += kvm-arm.o init.o interrupts.o
 obj-y += arm.o handle_exit.o guest.o mmu.o emulate.o reset.o
diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig
index 5105e29..bfffe8f 100644
--- a/arch/arm64/kvm/Kconfig
+++ b/arch/arm64/kvm/Kconfig
@@ -28,6 +28,7 @@ config KVM
 	select KVM_ARM_HOST
 	select KVM_GENERIC_DIRTYLOG_READ_PROTECT
 	select SRCU
+	select KVM_VFIO
 	select HAVE_KVM_EVENTFD
 	select HAVE_KVM_IRQFD
 	---help---
diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
index d5904f8..f90f4aa 100644
--- a/arch/arm64/kvm/Makefile
+++ b/arch/arm64/kvm/Makefile
@@ -11,7 +11,7 @@ ARM=../../../arch/arm/kvm
 
 obj-$(CONFIG_KVM_ARM_HOST) += kvm.o
 
-kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o $(KVM)/eventfd.o
+kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o $(KVM)/eventfd.o $(KVM)/vfio.o
 kvm-$(CONFIG_KVM_ARM_HOST) += $(ARM)/arm.o $(ARM)/mmu.o $(ARM)/mmio.o
 kvm-$(CONFIG_KVM_ARM_HOST) += $(ARM)/psci.o $(ARM)/perf.o
-- 
1.9.1
Re: [PATCH v2 07/13] KVM: add vcpu-specific functions to read/write/translate GFNs
On 05/06/2015 17:13, Radim Krčmář wrote:
> 2015-06-05 14:46+0200, Paolo Bonzini:
>> On 05/06/2015 14:10, Radim Krčmář wrote:
>>> +		ret = __kvm_write_guest_page(__gfn_to_memslot(slots, gfn), gfn,
>>> +					     data, offset, seg);
>>
>> Even better, let's pass memslots to all the __ functions.
>
> Yeah, while scoping it, I noticed a bug in the series ... makes me wish
> that C had a useful type system.
>
> A quick fix would be to replace gpa with gfn in calls to
> __kvm_read_guest_atomic(). I presume you'd prefer a new patch to
> rebasing, so it's below.

Since it was pushed only for 15 minutes or so, and the fix is two lines:

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 30425ce6a4a4..848af90b8091 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1669,7 +1669,7 @@ int kvm_read_guest_atomic(struct kvm *kvm, gpa_t gpa, void *data,
 	struct kvm_memory_slot *slot = gfn_to_memslot(kvm, gfn);
 	int offset = offset_in_page(gpa);
 
-	return __kvm_read_guest_atomic(slot, gpa, data, offset, len);
+	return __kvm_read_guest_atomic(slot, gfn, data, offset, len);
 }
 EXPORT_SYMBOL_GPL(kvm_read_guest_atomic);
 
@@ -1680,7 +1680,7 @@ int kvm_vcpu_read_guest_atomic(struct kvm_vcpu *vcpu, gpa_t gpa,
 	struct kvm_memory_slot *slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn);
 	int offset = offset_in_page(gpa);
 
-	return __kvm_read_guest_atomic(slot, gpa, data, offset, len);
+	return __kvm_read_guest_atomic(slot, gfn, data, offset, len);
 }
 EXPORT_SYMBOL_GPL(kvm_vcpu_read_guest_atomic);
 
I just force-pushed kvm/next. The patch is good, but I prefer to do
minimal changes before fleeing on holiday.

Paolo
Re: [v3 24/26] KVM: Update Posted-Interrupts Descriptor when vCPU is blocked
On Tue, Apr 14, 2015 at 07:37:44AM +, Wu, Feng wrote: -Original Message- From: Marcelo Tosatti [mailto:mtosa...@redhat.com] Sent: Tuesday, March 31, 2015 7:56 AM To: Wu, Feng Cc: h...@zytor.com; t...@linutronix.de; mi...@redhat.com; x...@kernel.org; g...@kernel.org; pbonz...@redhat.com; dw...@infradead.org; j...@8bytes.org; alex.william...@redhat.com; jiang@linux.intel.com; eric.au...@linaro.org; linux-ker...@vger.kernel.org; io...@lists.linux-foundation.org; kvm@vger.kernel.org Subject: Re: [v3 24/26] KVM: Update Posted-Interrupts Descriptor when vCPU is blocked On Mon, Mar 30, 2015 at 04:46:55AM +, Wu, Feng wrote: -Original Message- From: Marcelo Tosatti [mailto:mtosa...@redhat.com] Sent: Saturday, March 28, 2015 3:30 AM To: Wu, Feng Cc: h...@zytor.com; t...@linutronix.de; mi...@redhat.com; x...@kernel.org; g...@kernel.org; pbonz...@redhat.com; dw...@infradead.org; j...@8bytes.org; alex.william...@redhat.com; jiang@linux.intel.com; eric.au...@linaro.org; linux-ker...@vger.kernel.org; io...@lists.linux-foundation.org; kvm@vger.kernel.org Subject: Re: [v3 24/26] KVM: Update Posted-Interrupts Descriptor when vCPU is blocked On Fri, Mar 27, 2015 at 06:34:14AM +, Wu, Feng wrote: Currently, the following code is executed before local_irq_disable() is called, so do you mean 1)moving local_irq_disable() to the place before it. 2) after interrupt is disabled, set KVM_REQ_EVENT in case the ON bit is set? 2) after interrupt is disabled, set KVM_REQ_EVENT in case the ON bit is set. Here is my understanding about your comments here: - Disable interrupts - Check 'ON' - Set KVM_REQ_EVENT if 'ON' is set Then we can put the above code inside if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) just like it used to be. However, I still have some questions about this comment: 1. Where should I set KVM_REQ_EVENT? In function vcpu_enter_guest(), or other places? 
See below: If in vcpu_enter_guest(), since currently local_irq_disable() is called after 'KVM_REQ_EVENT' is checked, is it helpful to set KVM_REQ_EVENT after local_irq_disable() is called? local_irq_disable(); *** add code here *** So we need add code like the following here, right? if ('ON' is set) kvm_make_request(KVM_REQ_EVENT, vcpu); Hi Marcelo, I changed the code as above, then I found that the ping latency was extremely big, (70ms - 400ms). I digged into it and got the root cause. We cannot use checking-on as the judgment, since 'ON' can be cleared by hypervisor software in lots of places. In this case, KVM_REQ_EVENT cannot be set when we check 'ON' bit, hence the interrupts are not injected to the guest in time. Please refer to the following code, in which 'ON' bit can be cleared: apic_find_highest_irr () -- vmx_sync_pir_to_irr () -- pi_test_and_clear_on() Searching from the code step by step, apic_find_highest_irr() can be called by many other guys. Thanks, Ok then, ignore my suggestion. Can you resend the latest version please ? -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
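The problem Feng describes can be reproduced in a small userspace model: once any path does a test-and-clear of 'ON' and moves the posted requests into the IRR, a later check of 'ON' says "nothing pending" although an interrupt is still waiting for injection. This is only a sketch with C11 atomics; `struct toy_pi_desc`, `struct toy_vcpu` and the `toy_*` helpers are hypothetical and simplify the real descriptor to 32 vectors.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy posted-interrupt descriptor: request bits plus the 'ON' flag. */
struct toy_pi_desc {
	atomic_uint pir;	/* one bit per posted vector (toy: 32 vectors) */
	atomic_uint on;		/* outstanding notification */
};

struct toy_vcpu {
	struct toy_pi_desc pi;
	uint32_t irr;		/* software copy of pending interrupts */
};

static void toy_vcpu_init(struct toy_vcpu *v)
{
	atomic_init(&v->pi.pir, 0);
	atomic_init(&v->pi.on, 0);
	v->irr = 0;
}

/* Device side: post a vector and raise 'ON'. */
static void toy_post_interrupt(struct toy_vcpu *v, unsigned int vec)
{
	assert(vec < 32);
	atomic_fetch_or(&v->pi.pir, 1u << vec);
	atomic_store(&v->pi.on, 1);
}

/* Any sync path: test-and-clear 'ON', then move PIR into IRR. */
static void toy_sync_pir_to_irr(struct toy_vcpu *v)
{
	if (atomic_exchange(&v->pi.on, 0))
		v->irr |= atomic_exchange(&v->pi.pir, 0);
}

/* The check that was tried after local_irq_disable(). */
static bool toy_on_is_set(struct toy_vcpu *v)
{
	return atomic_load(&v->pi.on) != 0;
}

/* What actually matters for injection: is anything pending in IRR? */
static bool toy_irr_pending(const struct toy_vcpu *v)
{
	return v->irr != 0;
}
```

After post, then sync, `toy_on_is_set()` is false while `toy_irr_pending()` is true, so a request keyed only on 'ON' is never raised and the interrupt waits for some unrelated event.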
[kvm:queue 76/76] arch/x86/kvm/../../../virt/kvm/irqchip.c:144:35: sparse: incorrect type in argument 1 (different address spaces)
tree:   git://git.kernel.org/pub/scm/virt/kvm/kvm.git queue
head:   6aa5e7eb06cff8d317328a0c4696b5f635ba6be3
commit: 6aa5e7eb06cff8d317328a0c4696b5f635ba6be3 [76/76] kvm: irqchip: Break up high order allocations of kvm_irq_routing_table
reproduce:
  # apt-get install sparse
  git checkout 6aa5e7eb06cff8d317328a0c4696b5f635ba6be3
  make ARCH=x86_64 allmodconfig
  make C=1 CF=-D__CHECK_ENDIAN__

sparse warnings: (new ones prefixed by >>)

>> arch/x86/kvm/../../../virt/kvm/irqchip.c:144:35: sparse: incorrect type in argument 1 (different address spaces)
   arch/x86/kvm/../../../virt/kvm/irqchip.c:144:35:    expected struct kvm_irq_routing_table *rt
   arch/x86/kvm/../../../virt/kvm/irqchip.c:144:35:    got struct kvm_irq_routing_table [noderef] <asn:4>*irq_routing
   arch/x86/kvm/../../../virt/kvm/irqchip.c:224:13: sparse: incorrect type in assignment (different address spaces)
   arch/x86/kvm/../../../virt/kvm/irqchip.c:224:13:    expected struct kvm_irq_routing_table *old
   arch/x86/kvm/../../../virt/kvm/irqchip.c:224:13:    got struct kvm_irq_routing_table [noderef] <asn:4>*irq_routing

vim +144 arch/x86/kvm/../../../virt/kvm/irqchip.c

   128			struct kvm_kernel_irq_routing_entry *e;
   129			struct hlist_node *n;
   130	
   131			hlist_for_each_entry_safe(e, n, &rt->map[i], link) {
   132				hlist_del(&e->link);
   133				kfree(e);
   134			}
   135		}
   136	
   137		kfree(rt);
   138	}
   139	
   140	void kvm_free_irq_routing(struct kvm *kvm)
   141	{
   142		/* Called only during vm destruction. Nobody can use the pointer
   143		   at this stage */
   144		free_irq_routing_table(kvm->irq_routing);
   145	}
   146	
   147	static int setup_routing_entry(struct kvm_irq_routing_table *rt,
   148				       struct kvm_kernel_irq_routing_entry *e,
   149				       const struct kvm_irq_routing_entry *ue)
   150	{
   151		int r = -EINVAL;
   152		struct kvm_kernel_irq_routing_entry *ei;

---
0-DAY kernel test infrastructure                Open Source Technology Center
http://lists.01.org/mailman/listinfo/kbuild                 Intel Corporation