Re: [PATCH 14/15] KVM: MTRR: do not map huge page for non-consistent range

2015-06-05 Thread Xiao Guangrong


[ CCed Zhang Yang ]

On 06/04/2015 04:36 PM, Paolo Bonzini wrote:



On 04/06/2015 10:23, Xiao Guangrong wrote:


So, why do you need to always use IPAT=0?  Can patch 15 keep the current
logic for RAM, like this:

 if (is_mmio || kvm_arch_has_noncoherent_dma(vcpu->kvm))
 	ret = kvm_mtrr_get_guest_memory_type(vcpu, gfn) <<
 	      VMX_EPT_MT_EPTE_SHIFT;
 else
 	ret = (MTRR_TYPE_WRBACK << VMX_EPT_MT_EPTE_SHIFT)
 		| VMX_EPT_IPAT_BIT;


Yeah, that's okay. We actually considered this approach; however,
- it's light enough: it did not hurt guest performance in our
   benchmark.
- the logic has always been used for the noncoherent_dma case, so extending
   it to the normal case should be low risk and also helps us check the logic.


But noncoherent_dma is not the common case, so it's not necessarily true
that the risk is low.


I thought noncoherent_dma exists on first-generation IOMMUs, so it should
have been fully tested at that time.




- completely following the MTRR spec would be better than having the host hide it.


We are a virtualization platform, we know well when MTRRs are necessary.

There is a risk in blindly obeying the guest MTRRs: userspace can see stale
data if the guest's accesses bypass the cache.  AMD bypasses this by
enabling snooping even in cases that ordinarily wouldn't snoop; for
Intel the solution is that RAM-backed areas should always use IPAT.


I am not sure whether combining UC with other cacheable types on the guest and
host will cause problems. The SDM mentions that snooping is not required only when
"The UC attribute comes from the MTRRs and the processors are not required
 to snoop their caches since the data could never have been cached."
(Vol 3. 11.5.2.2)
VMX does not touch the hardware MTRR MSRs, so I guess snooping works in this case.

I also noticed that if SS (self-snooping) is supported we need not invalidate
the cache when programming the memory type (Vol 3. 11.11.8), so I guess that
means the CPU works well on a page that has different cache types.

After thinking about it carefully, we (Zhang Yang and I) doubt that always
setting WB for DMA memory is really a good idea, because we cannot assume WB DMA
works well for all devices. One example is audio DMA (not an MMIO region), which
requires WC to improve its performance.

However, we think the SDM is not clear enough, so let's do full vMTRR on MMIO
and noncoherent_dma first. :)
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
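
For readers following the thread, the two policies under discussion can be put side by side in a small standalone sketch. This is not the kernel code: the constant values are what I believe arch/x86 uses and should be checked against the tree, and ept_memtype_bits() and its parameters are hypothetical names; guest_mtrr_type stands in for the result of kvm_mtrr_get_guest_memory_type().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values, modelled on the x86 definitions; verify against the tree. */
#define MTRR_TYPE_WRBACK       6u
#define VMX_EPT_MT_EPTE_SHIFT  3
#define VMX_EPT_IPAT_BIT       (1ull << 6)

/*
 * Hypothetical helper: build the EPT memory-type bits for one guest page.
 * guest_mtrr_type stands in for kvm_mtrr_get_guest_memory_type(vcpu, gfn).
 */
uint64_t ept_memtype_bits(bool is_mmio, bool noncoherent_dma,
			  uint8_t guest_mtrr_type)
{
	/* The EPT memory type field is three bits wide. */
	assert(guest_mtrr_type <= 7);

	/* MMIO or non-coherent DMA: honour the guest MTRR type, IPAT clear. */
	if (is_mmio || noncoherent_dma)
		return (uint64_t)guest_mtrr_type << VMX_EPT_MT_EPTE_SHIFT;

	/* Ordinary RAM: force WB and set IPAT so the guest PAT is ignored. */
	return ((uint64_t)MTRR_TYPE_WRBACK << VMX_EPT_MT_EPTE_SHIFT) |
	       VMX_EPT_IPAT_BIT;
}
```

With these assumed constants, plain RAM always yields WB plus IPAT, while the guest type only matters for MMIO and non-coherent DMA, which is the behaviour the thread converges on.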


[PATCH v2 4/8] AArch{32,64}: dynamically configure the number of GIC interrupts

2015-06-05 Thread Andre Przywara
From: Marc Zyngier marc.zyng...@arm.com

In order to reduce the memory usage of large guests (as well
as improve performance), tell KVM about the number of interrupts
we require.

To avoid having to synchronize with the creation of the various devices,
use a late_init callback to compute the GIC configuration.
[Andre: rename to gic__init_gic() to ease future expansion]

Signed-off-by: Marc Zyngier marc.zyng...@arm.com
Signed-off-by: Andre Przywara andre.przyw...@arm.com
---
 arm/gic.c | 25 +
 1 file changed, 25 insertions(+)

diff --git a/arm/gic.c b/arm/gic.c
index ce5f7fa..6277af8 100644
--- a/arm/gic.c
+++ b/arm/gic.c
@@ -1,10 +1,12 @@
 #include "kvm/fdt.h"
+#include "kvm/irq.h"
 #include "kvm/kvm.h"
 #include "kvm/virtio.h"
 
 #include "arm-common/gic.h"
 
 #include <linux/byteorder.h>
+#include <linux/kernel.h>
 #include <linux/kvm.h>
 
 static int gic_fd = -1;
@@ -87,6 +89,29 @@ int gic__create(struct kvm *kvm)
 	return err;
 }
 
+static int gic__init_gic(struct kvm *kvm)
+{
+	int lines = irq__get_nr_allocated_lines();
+	u32 nr_irqs = ALIGN(lines, 32) + GIC_SPI_IRQ_BASE;
+	struct kvm_device_attr nr_irqs_attr = {
+		.group	= KVM_DEV_ARM_VGIC_GRP_NR_IRQS,
+		.addr	= (u64)(unsigned long)&nr_irqs,
+	};
+
+	/*
+	 * If we didn't use the KVM_CREATE_DEVICE method, KVM will
+	 * give us some default number of interrupts.
+	 */
+	if (gic_fd < 0)
+		return 0;
+
+	if (!ioctl(gic_fd, KVM_HAS_DEVICE_ATTR, &nr_irqs_attr))
+		return ioctl(gic_fd, KVM_SET_DEVICE_ATTR, &nr_irqs_attr);
+
+	return 0;
+}
+late_init(gic__init_gic)
+
 void gic__generate_fdt_nodes(void *fdt, u32 phandle)
 {
 	u64 reg_prop[] = {
-- 
2.3.5
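
As a worked example of the nr_irqs calculation in this patch, here is a minimal standalone sketch. It assumes GIC_SPI_IRQ_BASE is 32 (16 SGIs plus 16 PPIs); ALIGN_UP and gic_nr_irqs() are hypothetical names that only mirror the ALIGN(lines, 32) + GIC_SPI_IRQ_BASE expression above.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value: SPIs start after 16 SGIs and 16 PPIs. */
#define GIC_SPI_IRQ_BASE 32u

/* Round x up to a multiple of a, where a is a power of two. */
#define ALIGN_UP(x, a) (((x) + ((a) - 1)) & ~((uint32_t)(a) - 1))

/* Hypothetical stand-alone version of the nr_irqs calculation. */
uint32_t gic_nr_irqs(uint32_t allocated_lines)
{
	uint32_t nr_irqs = ALIGN_UP(allocated_lines, 32u) + GIC_SPI_IRQ_BASE;

	/* The result is always a whole number of 32-interrupt blocks. */
	assert(nr_irqs % 32 == 0);
	return nr_irqs;
}
```

So a guest with 33 allocated lines would ask for 96 interrupts (64 SPIs after rounding, plus the 32 private ones) rather than whatever default the kernel would otherwise pick.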



[PATCH v2 5/8] arm: finish VGIC initialisation explicitly

2015-06-05 Thread Andre Przywara
Since Linux 3.19-rc1 there is a new API to explicitly initialise
the in-kernel GIC emulation by a userland KVM device call.
Use that to tell the kernel we are finished with the GIC
initialisation, since the automatic GIC init will only be provided
as a legacy functionality in the future.

Signed-off-by: Andre Przywara andre.przyw...@arm.com
---
 arm/gic.c | 25 ++---
 1 file changed, 22 insertions(+), 3 deletions(-)

diff --git a/arm/gic.c b/arm/gic.c
index 6277af8..8d47562 100644
--- a/arm/gic.c
+++ b/arm/gic.c
@@ -89,24 +89,43 @@ int gic__create(struct kvm *kvm)
 	return err;
 }
 
+/*
+ * Sets the number of used interrupts and finalizes the GIC init explicitly.
+ */
 static int gic__init_gic(struct kvm *kvm)
 {
+	int ret;
+
 	int lines = irq__get_nr_allocated_lines();
 	u32 nr_irqs = ALIGN(lines, 32) + GIC_SPI_IRQ_BASE;
 	struct kvm_device_attr nr_irqs_attr = {
 		.group	= KVM_DEV_ARM_VGIC_GRP_NR_IRQS,
 		.addr	= (u64)(unsigned long)&nr_irqs,
 	};
+	struct kvm_device_attr vgic_init_attr = {
+		.group	= KVM_DEV_ARM_VGIC_GRP_CTRL,
+		.attr	= KVM_DEV_ARM_VGIC_CTRL_INIT,
+	};
 
 	/*
 	 * If we didn't use the KVM_CREATE_DEVICE method, KVM will
-	 * give us some default number of interrupts.
+	 * give us some default number of interrupts. The GIC initialization
+	 * will be done automatically in this case.
 	 */
 	if (gic_fd < 0)
 		return 0;
 
-	if (!ioctl(gic_fd, KVM_HAS_DEVICE_ATTR, &nr_irqs_attr))
-		return ioctl(gic_fd, KVM_SET_DEVICE_ATTR, &nr_irqs_attr);
+	if (!ioctl(gic_fd, KVM_HAS_DEVICE_ATTR, &nr_irqs_attr)) {
+		ret = ioctl(gic_fd, KVM_SET_DEVICE_ATTR, &nr_irqs_attr);
+		if (ret)
+			return ret;
+	}
+
+	if (!ioctl(gic_fd, KVM_HAS_DEVICE_ATTR, &vgic_init_attr)) {
+		ret = ioctl(gic_fd, KVM_SET_DEVICE_ATTR, &vgic_init_attr);
+		if (ret)
+			return ret;
+	}
 
 	return 0;
 }
-- 
2.3.5
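
The probe-then-set pattern used twice in this patch can be sketched without a KVM file descriptor. Everything below is hypothetical scaffolding: attr_req, attr_ioctl_fn, set_supported_attrs() and the fake device only model the KVM_HAS_DEVICE_ATTR / KVM_SET_DEVICE_ATTR sequence and are not part of kvmtool or the kernel API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the KVM device-attribute requests. */
enum attr_req { ATTR_HAS, ATTR_SET };

/* ioctl-like callback: returns 0 on success, non-zero on failure. */
typedef int (*attr_ioctl_fn)(int fd, enum attr_req req, int attr);

/*
 * Set each attribute only if the device reports support for it.
 * An unsupported attribute is skipped; a failed set aborts the sequence.
 */
int set_supported_attrs(attr_ioctl_fn do_ioctl, int fd,
			const int *attrs, size_t n)
{
	assert(do_ioctl != NULL);

	for (size_t i = 0; i < n; i++) {
		if (do_ioctl(fd, ATTR_HAS, attrs[i]))
			continue;	/* not supported, e.g. an older kernel */

		int ret = do_ioctl(fd, ATTR_SET, attrs[i]);
		if (ret)
			return ret;
	}
	return 0;
}

/* Fake device for experiments: supports attribute 1 only. */
int fake_last_set = -1;

int fake_ioctl(int fd, enum attr_req req, int attr)
{
	(void)fd;
	if (attr != 1)
		return -1;
	if (req == ATTR_SET)
		fake_last_set = attr;
	return 0;
}
```

The point of probing first is that a missing attribute is not treated as an error, so the same binary keeps working on kernels that predate the explicit init call.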



[PATCH v2 3/8] irq: add irq__get_nr_allocated_lines

2015-06-05 Thread Andre Przywara
From: Marc Zyngier marc.zyng...@arm.com

The ARM GIC emulation needs to be told the number of interrupts
it has to support. As commit 1c262fa1dc7bc ("kvm tools: irq: make
irq__alloc_line generic") made the interrupt counter private,
add a new accessor returning the number of interrupt lines we've
allocated so far.

Signed-off-by: Marc Zyngier marc.zyng...@arm.com
Signed-off-by: Andre Przywara andre.przyw...@arm.com
---
 include/kvm/irq.h | 1 +
 irq.c | 5 +
 2 files changed, 6 insertions(+)

diff --git a/include/kvm/irq.h b/include/kvm/irq.h
index 4cec6f0..8a78e43 100644
--- a/include/kvm/irq.h
+++ b/include/kvm/irq.h
@@ -11,6 +11,7 @@
 struct kvm;
 
 int irq__alloc_line(void);
+int irq__get_nr_allocated_lines(void);
 
 int irq__init(struct kvm *kvm);
 int irq__exit(struct kvm *kvm);
diff --git a/irq.c b/irq.c
index 33ea8d2..71eaa05 100644
--- a/irq.c
+++ b/irq.c
@@ -7,3 +7,8 @@ int irq__alloc_line(void)
 {
 	return next_line++;
 }
+
+int irq__get_nr_allocated_lines(void)
+{
+	return next_line - KVM_IRQ_OFFSET;
+}
+}
-- 
2.3.5
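
A minimal sketch of the allocator and the new accessor together, assuming the counter starts at an offset of 32. IRQ_OFFSET, irq_alloc_line() and irq_get_nr_allocated_lines() are illustrative names here; the real offset is KVM_IRQ_OFFSET and is architecture specific.

```c
#include <assert.h>

/* Assumed value: the first line number handed out by the allocator. */
#define IRQ_OFFSET 32

/* Private counter, as in irq.c: the next line number to hand out. */
static int next_line = IRQ_OFFSET;

/* Hand out the next free interrupt line. */
int irq_alloc_line(void)
{
	return next_line++;
}

/* Number of lines handed out so far, without exposing the counter. */
int irq_get_nr_allocated_lines(void)
{
	assert(next_line >= IRQ_OFFSET);
	return next_line - IRQ_OFFSET;
}
```

Subtracting the offset is what turns "next free line number" into "how many lines are in use", which is the quantity the GIC setup code needs.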



[PATCH v2 2/8] AArch{32,64}: use KVM_CREATE_DEVICE & co to instantiate the GIC

2015-06-05 Thread Andre Przywara
From: Marc Zyngier marc.zyng...@arm.com

As of 3.14, KVM/arm supports the creation/configuration of the GIC through
a more generic device API, which is now the preferred way to do so.

Plumb the new API in, and allow the old code to be used as a fallback.

[Andre: Rename some functions on the way to differentiate between
creation and initialisation more clearly.]

Signed-off-by: Marc Zyngier marc.zyng...@arm.com
Signed-off-by: Andre Przywara andre.przyw...@arm.com
---
 arm/gic.c| 60 ++--
 arm/include/arm-common/gic.h |  2 +-
 arm/kvm.c|  6 ++---
 3 files changed, 57 insertions(+), 11 deletions(-)

diff --git a/arm/gic.c b/arm/gic.c
index 5d8cbe6..ce5f7fa 100644
--- a/arm/gic.c
+++ b/arm/gic.c
@@ -7,7 +7,41 @@
 #include <linux/byteorder.h>
 #include <linux/kvm.h>
 
-int gic__init_irqchip(struct kvm *kvm)
+static int gic_fd = -1;
+
+static int gic__create_device(struct kvm *kvm)
+{
+	int err;
+	u64 cpu_if_addr = ARM_GIC_CPUI_BASE;
+	u64 dist_addr = ARM_GIC_DIST_BASE;
+	struct kvm_create_device gic_device = {
+		.type	= KVM_DEV_TYPE_ARM_VGIC_V2,
+	};
+	struct kvm_device_attr cpu_if_attr = {
+		.group	= KVM_DEV_ARM_VGIC_GRP_ADDR,
+		.attr	= KVM_VGIC_V2_ADDR_TYPE_CPU,
+		.addr	= (u64)(unsigned long)&cpu_if_addr,
+	};
+	struct kvm_device_attr dist_attr = {
+		.group	= KVM_DEV_ARM_VGIC_GRP_ADDR,
+		.attr	= KVM_VGIC_V2_ADDR_TYPE_DIST,
+		.addr	= (u64)(unsigned long)&dist_addr,
+	};
+
+	err = ioctl(kvm->vm_fd, KVM_CREATE_DEVICE, &gic_device);
+	if (err)
+		return err;
+
+	gic_fd = gic_device.fd;
+
+	err = ioctl(gic_fd, KVM_SET_DEVICE_ATTR, &cpu_if_attr);
+	if (err)
+		return err;
+
+	return ioctl(gic_fd, KVM_SET_DEVICE_ATTR, &dist_attr);
+}
+
+static int gic__create_irqchip(struct kvm *kvm)
 {
 	int err;
 	struct kvm_arm_device_addr gic_addr[] = {
@@ -23,12 +57,6 @@ int gic__init_irqchip(struct kvm *kvm)
 		}
 	};
 
-	if (kvm->nrcpus > GIC_MAX_CPUS) {
-		pr_warning("%d CPUS greater than maximum of %d -- truncating\n",
-				kvm->nrcpus, GIC_MAX_CPUS);
-		kvm->nrcpus = GIC_MAX_CPUS;
-	}
-
 	err = ioctl(kvm->vm_fd, KVM_CREATE_IRQCHIP);
 	if (err)
 		return err;
@@ -41,6 +69,24 @@ int gic__init_irqchip(struct kvm *kvm)
 	return err;
 }
 
+int gic__create(struct kvm *kvm)
+{
+	int err;
+
+	if (kvm->nrcpus > GIC_MAX_CPUS) {
+		pr_warning("%d CPUS greater than maximum of %d -- truncating\n",
+				kvm->nrcpus, GIC_MAX_CPUS);
+		kvm->nrcpus = GIC_MAX_CPUS;
+	}
+
+	/* Try the new way first, and fallback on legacy method otherwise */
+	err = gic__create_device(kvm);
+	if (err)
+		err = gic__create_irqchip(kvm);
+
+	return err;
+}
+
 void gic__generate_fdt_nodes(void *fdt, u32 phandle)
 {
 	u64 reg_prop[] = {
diff --git a/arm/include/arm-common/gic.h b/arm/include/arm-common/gic.h
index 5a36f2c..44859f7 100644
--- a/arm/include/arm-common/gic.h
+++ b/arm/include/arm-common/gic.h
@@ -24,7 +24,7 @@
 struct kvm;
 
 int gic__alloc_irqnum(void);
-int gic__init_irqchip(struct kvm *kvm);
+int gic__create(struct kvm *kvm);
 void gic__generate_fdt_nodes(void *fdt, u32 phandle);
 
 #endif /* ARM_COMMON__GIC_H */
diff --git a/arm/kvm.c b/arm/kvm.c
index 58ad9fa..bcd2533 100644
--- a/arm/kvm.c
+++ b/arm/kvm.c
@@ -81,7 +81,7 @@ void kvm__arch_init(struct kvm *kvm, const char *hugetlbfs_path, u64 ram_size)
 	madvise(kvm->arch.ram_alloc_start, kvm->arch.ram_alloc_size,
 		MADV_MERGEABLE | MADV_HUGEPAGE);
 
-	/* Initialise the virtual GIC. */
-	if (gic__init_irqchip(kvm))
-		die("Failed to initialise virtual GIC");
+	/* Create the virtual GIC. */
+	if (gic__create(kvm))
+		die("Failed to create virtual GIC");
 }
-- 
2.3.5
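
The control flow of gic__create() boils down to "try the preferred method, fall back to the legacy one on failure". A minimal sketch of just that flow follows; create_fn, create_with_fallback() and the two stubs are hypothetical and stand in for gic__create_device() and gic__create_irqchip().

```c
#include <assert.h>
#include <stddef.h>

/* A creation method: returns 0 on success, non-zero on failure. */
typedef int (*create_fn)(void *ctx);

/*
 * Try the preferred creation method first; only if it fails, try the
 * legacy one. Returns 0 on success or the legacy method's error code.
 */
int create_with_fallback(create_fn preferred, create_fn legacy, void *ctx)
{
	assert(preferred != NULL && legacy != NULL);

	int err = preferred(ctx);
	if (err)
		err = legacy(ctx);
	return err;
}

/* Stubs standing in for the two real creation paths. */
int stub_ok(void *ctx)   { (void)ctx; return 0; }
int stub_fail(void *ctx) { (void)ctx; return -1; }
```

One property worth keeping in mind with this shape: the fallback is taken on any error from the preferred path, including one that occurs after the device itself was created, so the caller cannot tell "not supported" apart from "supported but misconfigured".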



[PATCH v2 7/8] arm: add support for supplying GICv3 redistributor addresses

2015-06-05 Thread Andre Przywara
The code currently is assuming fixed sized memory regions for the
distributor and CPU interface. GICv3 needs a dynamic allocation of
its redistributor region, since its size depends on the number of
vCPUs.
Also add the necessary code to create a GICv3 IRQ chip instance.
This contains some defines which are not (yet) in the (32 bit) header
files to allow compilation for ARM.

Signed-off-by: Andre Przywara andre.przyw...@arm.com
---
 arm/gic.c | 37 +++--
 arm/include/arm-common/gic.h  |  2 +-
 arm/include/arm-common/kvm-arch.h | 18 ++
 arm/kvm-cpu.c |  4 +++-
 4 files changed, 53 insertions(+), 8 deletions(-)

diff --git a/arm/gic.c b/arm/gic.c
index 0ce40e4..c50d662 100644
--- a/arm/gic.c
+++ b/arm/gic.c
@@ -9,13 +9,24 @@
 #include <linux/kernel.h>
 #include <linux/kvm.h>
 
+/* Those names are not defined for ARM (yet) */
+#ifndef KVM_VGIC_V3_ADDR_TYPE_DIST
+#define KVM_VGIC_V3_ADDR_TYPE_DIST 2
+#endif
+
+#ifndef KVM_VGIC_V3_ADDR_TYPE_REDIST
+#define KVM_VGIC_V3_ADDR_TYPE_REDIST 3
+#endif
+
 static int gic_fd = -1;
+static int nr_redists;
 
 static int gic__create_device(struct kvm *kvm, enum irqchip_type type)
 {
 	int err;
 	u64 cpu_if_addr = ARM_GIC_CPUI_BASE;
 	u64 dist_addr = ARM_GIC_DIST_BASE;
+	u64 redist_addr = dist_addr - nr_redists * ARM_GIC_REDIST_SIZE;
 	struct kvm_create_device gic_device = {
 		.flags	= 0,
 	};
@@ -28,11 +39,19 @@ static int gic__create_device(struct kvm *kvm, enum irqchip_type type)
 		.group	= KVM_DEV_ARM_VGIC_GRP_ADDR,
 		.addr	= (u64)(unsigned long)&dist_addr,
 	};
+	struct kvm_device_attr redist_attr = {
+		.group	= KVM_DEV_ARM_VGIC_GRP_ADDR,
+		.attr	= KVM_VGIC_V3_ADDR_TYPE_REDIST,
+		.addr	= (u64)(unsigned long)&redist_addr,
+	};
 
 	switch (type) {
 	case IRQCHIP_GICV2:
 		gic_device.type = KVM_DEV_TYPE_ARM_VGIC_V2;
 		break;
+	case IRQCHIP_GICV3:
+		gic_device.type = KVM_DEV_TYPE_ARM_VGIC_V3;
+		break;
 	default:
 		return -ENODEV;
 	}
@@ -48,6 +67,10 @@ static int gic__create_device(struct kvm *kvm, enum irqchip_type type)
 		dist_attr.attr = KVM_VGIC_V2_ADDR_TYPE_DIST;
 		err = ioctl(gic_fd, KVM_SET_DEVICE_ATTR, &cpu_if_attr);
 		break;
+	case IRQCHIP_GICV3:
+		dist_attr.attr = KVM_VGIC_V3_ADDR_TYPE_DIST;
+		err = ioctl(gic_fd, KVM_SET_DEVICE_ATTR, &redist_attr);
+		break;
 	default:
 		return -ENODEV;
 	}
@@ -55,6 +78,8 @@ static int gic__create_device(struct kvm *kvm, enum irqchip_type type)
 		return err;
 
 	err = ioctl(gic_fd, KVM_SET_DEVICE_ATTR, &dist_attr);
+	if (err)
+		return err;
 
 	return err;
 }
@@ -162,17 +187,25 @@ void gic__generate_fdt_nodes(void *fdt, u32 phandle, enum irqchip_type type)
 	u64 reg_prop[] = {
 		cpu_to_fdt64(ARM_GIC_DIST_BASE),
 		cpu_to_fdt64(ARM_GIC_DIST_SIZE),
-		cpu_to_fdt64(ARM_GIC_CPUI_BASE),
-		cpu_to_fdt64(ARM_GIC_CPUI_SIZE),
+		0, 0,			/* to be filled */
 	};
 
 	switch (type) {
 	case IRQCHIP_GICV2:
 		compatible = "arm,cortex-a15-gic";
+		reg_prop[2] = ARM_GIC_CPUI_BASE;
+		reg_prop[3] = ARM_GIC_CPUI_SIZE;
+		break;
+	case IRQCHIP_GICV3:
+		compatible = "arm,gic-v3";
+		reg_prop[2] = ARM_GIC_DIST_BASE - nr_redists * ARM_GIC_REDIST_SIZE;
+		reg_prop[3] = ARM_GIC_REDIST_SIZE * nr_redists;
 		break;
 	default:
 		return;
 	}
+	reg_prop[2] = cpu_to_fdt64(reg_prop[2]);
+	reg_prop[3] = cpu_to_fdt64(reg_prop[3]);
 
 	_FDT(fdt_begin_node(fdt, "intc"));
 	_FDT(fdt_property_string(fdt, "compatible", compatible));
diff --git a/arm/include/arm-common/gic.h b/arm/include/arm-common/gic.h
index f5f6707..8d6ab01 100644
--- a/arm/include/arm-common/gic.h
+++ b/arm/include/arm-common/gic.h
@@ -21,7 +21,7 @@
 #define GIC_MAX_CPUS			8
 #define GIC_MAX_IRQ			255
 
-enum irqchip_type {IRQCHIP_DEFAULT, IRQCHIP_GICV2};
+enum irqchip_type {IRQCHIP_DEFAULT, IRQCHIP_GICV2, IRQCHIP_GICV3};
 
 struct kvm;
 
diff --git a/arm/include/arm-common/kvm-arch.h b/arm/include/arm-common/kvm-arch.h
index 082131d..be66a76 100644
--- a/arm/include/arm-common/kvm-arch.h
+++ b/arm/include/arm-common/kvm-arch.h
@@ -17,10 +17,8 @@
 
 #define ARM_GIC_DIST_BASE	(ARM_AXI_AREA - ARM_GIC_DIST_SIZE)
 #define ARM_GIC_CPUI_BASE	(ARM_GIC_DIST_BASE - ARM_GIC_CPUI_SIZE)
-#define ARM_GIC_SIZE		(ARM_GIC_DIST_SIZE + ARM_GIC_CPUI_SIZE)
 
 #define ARM_IOPORT_SIZE(ARM_MMIO_AREA - 
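
To illustrate the dynamic redistributor placement described in the commit message, here is a small standalone sketch of the address arithmetic. The layout constants are assumptions chosen for the example (the real ones live in the kvm-arch.h headers, whose hunk is cut off above), and struct gic_region and gicv3_redist_region() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout constants for illustration only. */
#define ARM_AXI_AREA        0x40000000ull
#define ARM_GIC_DIST_SIZE   0x10000ull
#define ARM_GIC_REDIST_SIZE 0x20000ull	/* per vCPU */
#define ARM_GIC_DIST_BASE   (ARM_AXI_AREA - ARM_GIC_DIST_SIZE)

struct gic_region {
	uint64_t base;
	uint64_t size;
};

/* Place the redistributor region directly below the distributor. */
struct gic_region gicv3_redist_region(unsigned int nr_redists)
{
	struct gic_region r;

	r.size = (uint64_t)nr_redists * ARM_GIC_REDIST_SIZE;
	assert(r.size <= ARM_GIC_DIST_BASE);	/* must not wrap below zero */
	r.base = ARM_GIC_DIST_BASE - r.size;
	return r;
}
```

Because the region grows downwards with the vCPU count, anything mapped below the GIC (the virtio MMIO area in this series) has to take nr_redists into account as well.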

[PATCH v2 0/8] kvmtool: arm64: GICv3 guest support

2015-06-05 Thread Andre Przywara
Hi,

a rework of the GICv3 support series for kvmtool.
I addressed Will's comments on the broken fallback in VGIC creation,
also changed the command line parameter to --irqchip=[gicv2,gicv3].
The default is still GICv2 emulation for the sake of reproducibility,
not sure we want to have an automatic switch-over in case GICv2
emulation is not supported by the hardware.
This is also the base for ITS support, which I will send later as
a follow-up series.

Cheers,
Andre.
-

Since Linux 3.19 the kernel can emulate a GICv3 for KVM guests.
This allows more than 8 VCPUs in a guest and enables in-kernel irqchip
for non-backwards-compatible GICv3 implementations.

This series updates kvmtool to support this feature.
The first half of the series is mostly from Marc and supports some
newer features of the virtual GIC which we later depend on. The second
part enables support for a guest GICv3 by adding a new command line
parameter (--irqchip=).

We now use the KVM_CREATE_DEVICE interface to create a virtual GIC
and only fall back to the now legacy KVM_CREATE_IRQCHIP call if the
former is not supported by the kernel.
Also we use two new features the KVM_CREATE_DEVICE interface
introduces:
* We now set the number of actually used interrupts to avoid
  allocating too many of them without ever using them.
* We tell the kernel explicitly that we are finished with the GIC
  initialisation. This is a requirement for future VGIC versions.

The final three patches introduce virtual GICv3 support, so on
supported hardware (and given kernel support) the user can ask KVM to
emulate a GICv3, lifting the 8 VCPU limit of KVM. This is done by
specifying --irqchip=gicv3 on the command line.
As the kernel currently only supports this on ARM64, this parameter
is valid for the arm64 kvmtool build. But as the GIC is shared in
kvmtool, I had to add the macro definitions to not break the build
on ARM.

This series goes on top of the new official stand-alone repo hosted
on Will's kernel.org git [1].
Find a branch with those patches included at my repo [2].

[1] git://git.kernel.org/pub/scm/linux/kernel/git/will/kvmtool.git
[2] git://linux-arm.org/kvmtool.git (branch gicv3/v2)
http://www.linux-arm.org/git?p=kvmtool.git;a=log;h=refs/heads/gicv3/v2

Andre Przywara (4):
  arm: finish VGIC initialisation explicitly
  arm: prepare for instantiating different IRQ chip devices
  arm: add support for supplying GICv3 redistributor addresses
  arm: use new irqchip parameter to create different vGIC types

Marc Zyngier (4):
  AArch64: Reserve two 64k pages for GIC CPU interface
  AArch{32,64}: use KVM_CREATE_DEVICE & co to instantiate the GIC
  irq: add irq__get_nr_allocated_lines
  AArch{32,64}: dynamically configure the number of GIC interrupts

 arm/aarch32/arm-cpu.c|   2 +-
 arm/aarch64/arm-cpu.c|   2 +-
 arm/aarch64/include/kvm/kvm-arch.h   |   2 +-
 arm/gic.c| 202 +--
 arm/include/arm-common/gic.h |   6 +-
 arm/include/arm-common/kvm-arch.h|  18 ++-
 arm/include/arm-common/kvm-config-arch.h |   9 +-
 arm/kvm-cpu.c|  10 +-
 arm/kvm.c|   8 +-
 include/kvm/irq.h|   1 +
 irq.c|   5 +
 11 files changed, 240 insertions(+), 25 deletions(-)

-- 
2.3.5



[PATCH v2 8/8] arm: use new irqchip parameter to create different vGIC types

2015-06-05 Thread Andre Przywara
Currently we unconditionally create a virtual GICv2 in the guest.
Add a --irqchip= parameter to let the user specify a different GIC
type for the guest.
For now the only other supported type is GICv3.

Signed-off-by: Andre Przywara andre.przyw...@arm.com
---
 arm/aarch64/arm-cpu.c|  2 +-
 arm/gic.c| 21 +
 arm/include/arm-common/kvm-config-arch.h |  9 -
 arm/kvm-cpu.c|  6 ++
 arm/kvm.c|  4 +++-
 5 files changed, 39 insertions(+), 3 deletions(-)

diff --git a/arm/aarch64/arm-cpu.c b/arm/aarch64/arm-cpu.c
index f702b9e..3dc8ea3 100644
--- a/arm/aarch64/arm-cpu.c
+++ b/arm/aarch64/arm-cpu.c
@@ -12,7 +12,7 @@
 static void generate_fdt_nodes(void *fdt, struct kvm *kvm, u32 gic_phandle)
 {
 	int timer_interrupts[4] = {13, 14, 11, 10};
-	gic__generate_fdt_nodes(fdt, gic_phandle, IRQCHIP_GICV2);
+	gic__generate_fdt_nodes(fdt, gic_phandle, kvm->cfg.arch.irqchip);
 	timer__generate_fdt_nodes(fdt, kvm, timer_interrupts);
 }
 
diff --git a/arm/gic.c b/arm/gic.c
index c50d662..ab0f594 100644
--- a/arm/gic.c
+++ b/arm/gic.c
@@ -21,6 +21,23 @@
 static int gic_fd = -1;
 static int nr_redists;
 
+int irqchip_parser(const struct option *opt, const char *arg, int unset)
+{
+	enum irqchip_type *type = opt->value;
+
+	*type = IRQCHIP_DEFAULT;
+	if (!strcmp(arg, "gicv2")) {
+		*type = IRQCHIP_GICV2;
+	} else if (!strcmp(arg, "gicv3")) {
+		*type = IRQCHIP_GICV3;
+	} else if (strcmp(arg, "default")) {
+		fprintf(stderr, "irqchip: unknown type \"%s\"\n", arg);
+		return -1;
+	}
+
+	return 0;
+}
+
 static int gic__create_device(struct kvm *kvm, enum irqchip_type type)
 {
 	int err;
@@ -121,6 +138,10 @@ int gic__create(struct kvm *kvm, enum irqchip_type type)
 	case IRQCHIP_GICV2:
 		max_cpus = GIC_MAX_CPUS;
 		break;
+	case IRQCHIP_GICV3:
+		nr_redists = kvm->cfg.nrcpus;
+		max_cpus = 255;
+		break;
 	default:
 		return -ENODEV;
 	}
diff --git a/arm/include/arm-common/kvm-config-arch.h b/arm/include/arm-common/kvm-config-arch.h
index a8ebd94..ae4e89b 100644
--- a/arm/include/arm-common/kvm-config-arch.h
+++ b/arm/include/arm-common/kvm-config-arch.h
@@ -8,8 +8,11 @@ struct kvm_config_arch {
 	unsigned int	force_cntfrq;
 	bool		virtio_trans_pci;
 	bool		aarch32_guest;
+	int		irqchip;
 };
 
+int irqchip_parser(const struct option *opt, const char *arg, int unset);
+
 #define OPT_ARCH_RUN(pfx, cfg)							\
 	pfx,									\
 	ARM_OPT_ARCH_RUN(cfg)							\
@@ -21,6 +24,10 @@ struct kvm_config_arch {
 		     "updated to program CNTFRQ correctly*"),			\
 	OPT_BOOLEAN('\0', "force-pci", &(cfg)->virtio_trans_pci,		\
 		    "Force virtio devices to use PCI as their default "		\
-		    "transport"),
+		    "transport"),						\
+	OPT_CALLBACK('\0', "irqchip", &(cfg)->irqchip,				\
+		     "[gicv2|gicv3]",						\
+		     "type of interrupt controller to emulate in the guest",	\
+		     irqchip_parser, NULL),
 
 #endif /* ARM_COMMON__KVM_CONFIG_ARCH_H */
diff --git a/arm/kvm-cpu.c b/arm/kvm-cpu.c
index a3344fa..aacc172 100644
--- a/arm/kvm-cpu.c
+++ b/arm/kvm-cpu.c
@@ -144,6 +144,12 @@ bool kvm_cpu__emulate_mmio(struct kvm_cpu *vcpu, u64 phys_addr, u8 *data,
 {
 	int nr_redists = 0;
 
+	switch (vcpu->kvm->cfg.arch.irqchip) {
+	case IRQCHIP_GICV3:
+		nr_redists = vcpu->kvm->nrcpus;
+		break;
+	}
+
 	if (arm_addr_in_virtio_mmio_region(nr_redists, phys_addr)) {
 		return kvm__emulate_mmio(vcpu, phys_addr, data, len, is_write);
 	} else if (arm_addr_in_ioport_region(phys_addr)) {
diff --git a/arm/kvm.c b/arm/kvm.c
index f9685c2..2628d31 100644
--- a/arm/kvm.c
+++ b/arm/kvm.c
@@ -82,6 +82,8 @@ void kvm__arch_init(struct kvm *kvm, const char *hugetlbfs_path, u64 ram_size)
 		MADV_MERGEABLE | MADV_HUGEPAGE);
 
 	/* Create the virtual GIC. */
-	if (gic__create(kvm, IRQCHIP_GICV2))
+	if (kvm->cfg.arch.irqchip == IRQCHIP_DEFAULT)
+		kvm->cfg.arch.irqchip = IRQCHIP_GICV2;
+	if (gic__create(kvm, kvm->cfg.arch.irqchip))
 		die("Failed to create virtual GIC");
 }
-- 
2.3.5
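
Taken together, the patch resolves the requested type ("default" becomes GICv2) and then limits the vCPU count to what that GIC model supports. A minimal sketch of that decision, with the limits of 8 and 255 taken from the diffs in this series; irqchip_resolve(), irqchip_clamp_cpus() and the two *_MAX_CPUS names are hypothetical.

```c
#include <assert.h>

enum irqchip_type { IRQCHIP_DEFAULT, IRQCHIP_GICV2, IRQCHIP_GICV3 };

#define GICV2_MAX_CPUS 8
#define GICV3_MAX_CPUS 255

/* Resolve "default" to GICv2, as kvm__arch_init() does in the patch. */
enum irqchip_type irqchip_resolve(enum irqchip_type requested)
{
	return requested == IRQCHIP_DEFAULT ? IRQCHIP_GICV2 : requested;
}

/*
 * Clamp the vCPU count to what the chosen GIC model supports.
 * Returns the usable vCPU count, or -1 for an unknown type.
 */
int irqchip_clamp_cpus(enum irqchip_type type, int nrcpus)
{
	int max_cpus;

	assert(nrcpus > 0);

	switch (irqchip_resolve(type)) {
	case IRQCHIP_GICV2:
		max_cpus = GICV2_MAX_CPUS;
		break;
	case IRQCHIP_GICV3:
		max_cpus = GICV3_MAX_CPUS;
		break;
	default:
		return -1;
	}

	return nrcpus > max_cpus ? max_cpus : nrcpus;
}
```

With this, a 16-vCPU guest is truncated to 8 vCPUs unless --irqchip=gicv3 is given, which matches the motivation stated in the cover letter.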


[PATCH v2 6/8] arm: prepare for instantiating different IRQ chip devices

2015-06-05 Thread Andre Przywara
Extend the vGIC handling code to potentially deal with different IRQ
chip devices instead of hard-coding the GICv2 in.
We extend most vGIC functions to take a type parameter, but still put
GICv2 in at the top for the time being.

Signed-off-by: Andre Przywara andre.przyw...@arm.com
---
 arm/aarch32/arm-cpu.c|  2 +-
 arm/aarch64/arm-cpu.c|  2 +-
 arm/gic.c| 66 ++--
 arm/include/arm-common/gic.h |  6 ++--
 arm/kvm.c|  2 +-
 5 files changed, 58 insertions(+), 20 deletions(-)

diff --git a/arm/aarch32/arm-cpu.c b/arm/aarch32/arm-cpu.c
index 946e443..d8d6293 100644
--- a/arm/aarch32/arm-cpu.c
+++ b/arm/aarch32/arm-cpu.c
@@ -12,7 +12,7 @@ static void generate_fdt_nodes(void *fdt, struct kvm *kvm, u32 gic_phandle)
 {
 	int timer_interrupts[4] = {13, 14, 11, 10};
 
-	gic__generate_fdt_nodes(fdt, gic_phandle);
+	gic__generate_fdt_nodes(fdt, gic_phandle, IRQCHIP_GICV2);
 	timer__generate_fdt_nodes(fdt, kvm, timer_interrupts);
 }
 
diff --git a/arm/aarch64/arm-cpu.c b/arm/aarch64/arm-cpu.c
index 8efe877..f702b9e 100644
--- a/arm/aarch64/arm-cpu.c
+++ b/arm/aarch64/arm-cpu.c
@@ -12,7 +12,7 @@
 static void generate_fdt_nodes(void *fdt, struct kvm *kvm, u32 gic_phandle)
 {
 	int timer_interrupts[4] = {13, 14, 11, 10};
-	gic__generate_fdt_nodes(fdt, gic_phandle);
+	gic__generate_fdt_nodes(fdt, gic_phandle, IRQCHIP_GICV2);
 	timer__generate_fdt_nodes(fdt, kvm, timer_interrupts);
 }
 
diff --git a/arm/gic.c b/arm/gic.c
index 8d47562..0ce40e4 100644
--- a/arm/gic.c
+++ b/arm/gic.c
@@ -11,13 +11,13 @@
 
 static int gic_fd = -1;
 
-static int gic__create_device(struct kvm *kvm)
+static int gic__create_device(struct kvm *kvm, enum irqchip_type type)
 {
 	int err;
 	u64 cpu_if_addr = ARM_GIC_CPUI_BASE;
 	u64 dist_addr = ARM_GIC_DIST_BASE;
 	struct kvm_create_device gic_device = {
-		.type	= KVM_DEV_TYPE_ARM_VGIC_V2,
+		.flags	= 0,
 	};
 	struct kvm_device_attr cpu_if_attr = {
 		.group	= KVM_DEV_ARM_VGIC_GRP_ADDR,
@@ -26,21 +26,37 @@ static int gic__create_device(struct kvm *kvm)
 	};
 	struct kvm_device_attr dist_attr = {
 		.group	= KVM_DEV_ARM_VGIC_GRP_ADDR,
-		.attr	= KVM_VGIC_V2_ADDR_TYPE_DIST,
 		.addr	= (u64)(unsigned long)&dist_addr,
 	};
 
+	switch (type) {
+	case IRQCHIP_GICV2:
+		gic_device.type = KVM_DEV_TYPE_ARM_VGIC_V2;
+		break;
+	default:
+		return -ENODEV;
+	}
+
 	err = ioctl(kvm->vm_fd, KVM_CREATE_DEVICE, &gic_device);
 	if (err)
 		return err;
 
 	gic_fd = gic_device.fd;
 
-	err = ioctl(gic_fd, KVM_SET_DEVICE_ATTR, &cpu_if_attr);
+	switch (type) {
+	case IRQCHIP_GICV2:
+		dist_attr.attr = KVM_VGIC_V2_ADDR_TYPE_DIST;
+		err = ioctl(gic_fd, KVM_SET_DEVICE_ATTR, &cpu_if_attr);
+		break;
+	default:
+		return -ENODEV;
+	}
 	if (err)
 		return err;
 
-	return ioctl(gic_fd, KVM_SET_DEVICE_ATTR, &dist_attr);
+	err = ioctl(gic_fd, KVM_SET_DEVICE_ATTR, &dist_attr);
+
+	return err;
 }
 
 static int gic__create_irqchip(struct kvm *kvm)
@@ -71,19 +87,28 @@ static int gic__create_irqchip(struct kvm *kvm)
 	return err;
 }
 
-int gic__create(struct kvm *kvm)
+int gic__create(struct kvm *kvm, enum irqchip_type type)
 {
+	int max_cpus;
 	int err;
 
-	if (kvm->nrcpus > GIC_MAX_CPUS) {
+	switch (type) {
+	case IRQCHIP_GICV2:
+		max_cpus = GIC_MAX_CPUS;
+		break;
+	default:
+		return -ENODEV;
+	}
+
+	if (kvm->nrcpus > max_cpus) {
 		pr_warning("%d CPUS greater than maximum of %d -- truncating\n",
-				kvm->nrcpus, GIC_MAX_CPUS);
-		kvm->nrcpus = GIC_MAX_CPUS;
+				kvm->nrcpus, max_cpus);
+		kvm->nrcpus = max_cpus;
 	}
 
 	/* Try the new way first, and fallback on legacy method otherwise */
-	err = gic__create_device(kvm);
-	if (err)
+	err = gic__create_device(kvm, type);
+	if (err && type == IRQCHIP_GICV2)
 		err = gic__create_irqchip(kvm);
 
 	return err;
@@ -131,15 +156,26 @@ static int gic__init_gic(struct kvm *kvm)
 }
 late_init(gic__init_gic)
 
-void gic__generate_fdt_nodes(void *fdt, u32 phandle)
+void gic__generate_fdt_nodes(void *fdt, u32 phandle, enum irqchip_type type)
 {
+	const char *compatible;
 	u64 reg_prop[] = {
-		cpu_to_fdt64(ARM_GIC_DIST_BASE), cpu_to_fdt64(ARM_GIC_DIST_SIZE),
-		cpu_to_fdt64(ARM_GIC_CPUI_BASE), cpu_to_fdt64(ARM_GIC_CPUI_SIZE),
+		cpu_to_fdt64(ARM_GIC_DIST_BASE),
+		cpu_to_fdt64(ARM_GIC_DIST_SIZE),
+   

[PATCH v2 1/8] AArch64: Reserve two 64k pages for GIC CPU interface

2015-06-05 Thread Andre Przywara
From: Marc Zyngier marc.zyng...@arm.com

On an AArch64 system with a GICv2, the GICC range can be aligned
to the last 4k block of a 64k page, ending up straddling two
64k pages. In order not to conflict with the distributor mapping,
allocate two 64k pages to the CPU interface.

Signed-off-by: Marc Zyngier marc.zyng...@arm.com
Signed-off-by: Andre Przywara andre.przyw...@arm.com
---
 arm/aarch64/include/kvm/kvm-arch.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arm/aarch64/include/kvm/kvm-arch.h b/arm/aarch64/include/kvm/kvm-arch.h
index 2f08a26..4925736 100644
--- a/arm/aarch64/include/kvm/kvm-arch.h
+++ b/arm/aarch64/include/kvm/kvm-arch.h
@@ -2,7 +2,7 @@
 #define KVM__KVM_ARCH_H
 
 #define ARM_GIC_DIST_SIZE	0x10000
-#define ARM_GIC_CPUI_SIZE	0x10000
+#define ARM_GIC_CPUI_SIZE	0x20000
 
 #define ARM_KERN_OFFSET(kvm)	((kvm)->cfg.arch.aarch32_guest	?	\
 				0x8000				:	\
-- 
2.3.5
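
The straddling case from the commit message can be checked with a little arithmetic. The sketch below assumes an 8 KiB GICC range (two 4k blocks) and uses a made-up base address; pages_64k_spanned() is a hypothetical helper, not kvmtool code.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_64K 0x10000ull

/* Number of 64k pages touched by the byte range [base, base + size). */
uint64_t pages_64k_spanned(uint64_t base, uint64_t size)
{
	assert(size > 0);

	uint64_t first = base / PAGE_64K;
	uint64_t last = (base + size - 1) / PAGE_64K;

	return last - first + 1;
}
```

For an 8 KiB range starting in the last 4k block of a 64k page (say at 0x2c00f000), the range ends at 0x2c010fff and therefore touches two 64k pages, which is why a single 64k page is not enough for the CPU interface mapping.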



Re: [PATCH v3 1/4] KVM: x86: Split the APIC from the rest of IRQCHIP.

2015-06-05 Thread Paolo Bonzini

 From the perspective of avoiding impacting other architectures, this is a
 good idea, but the naming seems strange in the x86 case. Having
 irqchip_in_kernel be true when the ioapic/pic are in userspace seems
 strange. Admittedly, the irqchip isn't a real concept on x86, so
 inventing a new meaning is fine.

From the KVM point of view, the irqchip is whatever delivers
interrupts to the vCPU---which is the LAPIC for x86.

Paolo


Re: [PATCH v2 00/13] SMM implementation for KVM

2015-06-05 Thread Avi Kivity

On 05/27/2015 08:05 PM, Paolo Bonzini wrote:

This brings together the remaining parts of SMM.  For now I've left the
weird interaction between SMM and NMI blocking, and I'm using the same
format for the state save area (which is also the one used by QEMU) as
the RFC.

It builds on the previous cleanup patches, which (with the exception
of "KVM: x86: pass kvm_mmu_page to gfn_to_rmap") are now in kvm/queue.
The first six patches are more or less the same as the previous version,
while the address spaces part hopefully touches all affected functions
now.

Patches 1-6 implement the SMM API and world switch; patches 7-12
implement the multiple address spaces; patch 13 ties up the loose
ends and advertises the capability.

Tested with SeaBIOS and OVMF, where SMM provides the trusted base
for secure boot.



Nice work.  While I did not do a thorough review, the mmu bits look robust.




Thanks,

Paolo

Paolo Bonzini (13):
   KVM: x86: introduce num_emulated_msrs
   KVM: x86: pass host_initiated to functions that read MSRs
   KVM: x86: pass the whole hflags field to emulator and back
   KVM: x86: API changes for SMM support
   KVM: x86: stubs for SMM support
   KVM: x86: save/load state on SMM switch
   KVM: add vcpu-specific functions to read/write/translate GFNs
   KVM: implement multiple address spaces
   KVM: x86: pass kvm_mmu_page to gfn_to_rmap
   KVM: x86: use vcpu-specific functions to read/write/translate GFNs
   KVM: x86: work on all available address spaces
   KVM: x86: add SMM to the MMU role, support SMRAM address space
   KVM: x86: advertise KVM_CAP_X86_SMM

  Documentation/virtual/kvm/api.txt|  52 ++-
  arch/powerpc/include/asm/kvm_book3s_64.h |   2 +-
  arch/x86/include/asm/kvm_emulate.h   |   9 +-
  arch/x86/include/asm/kvm_host.h  |  44 ++-
  arch/x86/include/asm/vmx.h   |   1 +
  arch/x86/include/uapi/asm/kvm.h  |  11 +-
  arch/x86/kvm/cpuid.h |   8 +
  arch/x86/kvm/emulate.c   | 262 +-
  arch/x86/kvm/kvm_cache_regs.h|   5 +
  arch/x86/kvm/lapic.c |   4 +-
  arch/x86/kvm/mmu.c   | 171 +-
  arch/x86/kvm/mmu_audit.c |  16 +-
  arch/x86/kvm/paging_tmpl.h   |  18 +-
  arch/x86/kvm/svm.c   |  73 ++--
  arch/x86/kvm/trace.h |  22 ++
  arch/x86/kvm/vmx.c   | 106 +++---
  arch/x86/kvm/x86.c   | 562 ++-
  include/linux/kvm_host.h |  49 ++-
  include/uapi/linux/kvm.h |   6 +-
  virt/kvm/kvm_main.c  | 237 ++---
  20 files changed, 1337 insertions(+), 321 deletions(-)



--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] kvm: irqchip: Break up high order allocations of kvm_irq_routing_table

2015-06-05 Thread Paolo Bonzini


On 05/06/2015 12:50, Joerg Roedel wrote:
  Great, I'll apply the patch.
 
 Gentle ping. I don't see the patch in the queue or next branches of the
 KVM tree yet. Do you plan to apply it for v4.2?

Fell through the cracks, sorry.  I will apply it today.

Paolo


Re: [PATCH v2 07/13] KVM: add vcpu-specific functions to read/write/translate GFNs

2015-06-05 Thread Paolo Bonzini


On 29/05/2015 21:23, Radim Krčmář wrote:
  +int kvm_vcpu_write_guest(struct kvm_vcpu *vcpu, gpa_t gpa, const void 
  *data,
  +   unsigned long len)
  +{
  +  gfn_t gfn = gpa >> PAGE_SHIFT;
  +  int seg;
  +  int offset = offset_in_page(gpa);
  +  int ret;
  +
  +  while ((seg = next_segment(len, offset)) != 0) {
  +  ret = kvm_vcpu_write_guest_page(vcpu, gfn, data, offset, seg);
  +  if (ret < 0)
  +  return ret;
  +  offset = 0;
  +  len -= seg;
  +  data += seg;
  +  ++gfn;
  +  }
  +  return 0;
  +}
 (There is no need to pass vcpu, and kvm, in this API.

How so?  A single kvm_vcpu_write_guest can cross multiple slots.

Paolo

  Extracting memslots early will help to keep more code common.
 
  I have patches that did a superset of this for the old API, so posting
  them after this series is finalized will be simple.)


Re: [PATCH] kvm: irqchip: Break up high order allocations of kvm_irq_routing_table

2015-06-05 Thread Joerg Roedel
Hi Paolo,

On Mon, May 11, 2015 at 03:27:26PM +0200, Paolo Bonzini wrote:
 Great, I'll apply the patch.

Gentle ping. I don't see the patch in the queue or next branches of the
KVM tree yet. Do you plan to apply it for v4.2?


Joerg



Re: [PATCH v2 07/13] KVM: add vcpu-specific functions to read/write/translate GFNs

2015-06-05 Thread Radim Krčmář
2015-06-05 12:26+0200, Paolo Bonzini:
 On 29/05/2015 21:23, Radim Krčmář wrote:
  +int kvm_vcpu_write_guest(struct kvm_vcpu *vcpu, gpa_t gpa, const void 
  *data,
  +  unsigned long len)
  +{
| [...]
  +}
 (There is no need to pass vcpu, and kvm, in this API.
 
 How so?  A single kvm_vcpu_write_guest can cross multiple slots.

I meant passing 'struct kvm_memslots *' instead and as soon as possible,
which would still allow more slots.  Something like this hunk:

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 10ae7e348dcc..8c6d84c12f18 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1718,8 +1718,8 @@ int kvm_vcpu_write_guest_page(struct kvm_vcpu *vcpu, 
gfn_t gfn,
 }
 EXPORT_SYMBOL_GPL(kvm_vcpu_write_guest_page);
 
-int kvm_write_guest(struct kvm *kvm, gpa_t gpa, const void *data,
-   unsigned long len)
+static int __kvm_write_guest(struct kvm_memslots *slots, gpa_t gpa, const void 
*data,
+   unsigned long len)
 {
gfn_t gfn = gpa >> PAGE_SHIFT;
int seg;
@@ -1727,7 +1727,8 @@ int kvm_write_guest(struct kvm *kvm, gpa_t gpa, const 
void *data,
int ret;
 
while ((seg = next_segment(len, offset)) != 0) {
-   ret = kvm_write_guest_page(kvm, gfn, data, offset, seg);
+   ret = __kvm_write_guest_page(__gfn_to_memslot(slots, gfn), gfn,
+   data, offset, seg);
if (ret < 0)
return ret;
offset = 0;
@@ -1737,26 +1738,18 @@ int kvm_write_guest(struct kvm *kvm, gpa_t gpa, const 
void *data,
}
return 0;
 }
+
+int kvm_write_guest(struct kvm *kvm, gpa_t gpa, const void *data,
+   unsigned long len)
+{
+   return __kvm_write_guest(kvm_memslots(kvm), gpa, data, len);
+}
 EXPORT_SYMBOL_GPL(kvm_write_guest);
 
 int kvm_vcpu_write_guest(struct kvm_vcpu *vcpu, gpa_t gpa, const void *data,
 unsigned long len)
 {
-   gfn_t gfn = gpa >> PAGE_SHIFT;
-   int seg;
-   int offset = offset_in_page(gpa);
-   int ret;
-
-   while ((seg = next_segment(len, offset)) != 0) {
-   ret = kvm_vcpu_write_guest_page(vcpu, gfn, data, offset, seg);
-   if (ret < 0)
-   return ret;
-   offset = 0;
-   len -= seg;
-   data += seg;
-   ++gfn;
-   }
-   return 0;
+   return __kvm_write_guest(kvm_vcpu_memslots(vcpu), gpa, data, len);
 }
 EXPORT_SYMBOL_GPL(kvm_vcpu_write_guest);
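
For readers following the thread, the shape of the shared helper can be tried out in userspace. What follows is only a sketch under stated assumptions: a 4 KiB page size, an opaque pointer standing in for 'struct kvm_memslots *', and a hypothetical callback (write_page_fn, count_page) standing in for __kvm_write_guest_page(). Only the loop structure mirrors the hunk above; none of the names with a _sketch suffix or the callback exist in the kernel.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)

/* Bytes that can be copied before hitting the end of the current page. */
static unsigned long next_segment(unsigned long len, unsigned long offset)
{
	unsigned long room = PAGE_SIZE - offset;

	return len > room ? room : len;
}

/* Hypothetical per-page callback; stands in for __kvm_write_guest_page(). */
typedef int (*write_page_fn)(void *slots, unsigned long gfn,
			     const void *data, unsigned long offset,
			     unsigned long len);

/* Walk [gpa, gpa + len) one page at a time, as the loop in the hunk does. */
static int write_guest_sketch(void *slots, unsigned long gpa,
			      const void *data, unsigned long len,
			      write_page_fn write_page)
{
	unsigned long gfn = gpa >> PAGE_SHIFT;
	unsigned long offset = gpa & (PAGE_SIZE - 1);
	const char *p = data;
	unsigned long seg;
	int ret;

	while ((seg = next_segment(len, offset)) != 0) {
		ret = write_page(slots, gfn, p, offset, seg);
		if (ret < 0)
			return ret;
		offset = 0;	/* later pages are written from their start */
		len -= seg;
		p += seg;
		++gfn;
	}
	return 0;
}

/* Example callback: counts the pages touched and the bytes handed over. */
struct walk_stats {
	unsigned long pages;
	unsigned long bytes;
};

static int count_page(void *opaque, unsigned long gfn, const void *data,
		      unsigned long offset, unsigned long len)
{
	struct walk_stats *st = opaque;

	(void)gfn;
	(void)data;
	(void)offset;
	st->pages++;
	st->bytes += len;
	return 0;
}
```

Because the slots pointer is resolved once by the caller, the same loop serves both the kvm and the vcpu entry points; a write of 0x30 bytes starting at 0x1ff0 is split into a 0x10-byte and a 0x20-byte segment on two consecutive pages.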
 


Re: [PATCH v2] arm/arm64: KVM: Properly account for guest CPU time

2015-06-05 Thread Mario Smarduch
On 06/02/2015 02:27 AM, Christoffer Dall wrote:
 On Mon, Jun 01, 2015 at 08:48:22AM -0700, Mario Smarduch wrote:
 On 05/30/2015 11:59 PM, Christoffer Dall wrote:
 Hi Mario,

 On Fri, May 29, 2015 at 03:34:47PM -0700, Mario Smarduch wrote:
 On 05/28/2015 11:49 AM, Christoffer Dall wrote:
 Until now we have been calling kvm_guest_exit after re-enabling
 interrupts when we come back from the guest, but this has the
 unfortunate effect that CPU time accounting done in the context of timer
 interrupts occurring while the guest is running doesn't properly notice
 that the time since the last tick was spent in the guest.

 Inspired by the comment in the x86 code, move the kvm_guest_exit() call
 below the local_irq_enable() call and change __kvm_guest_exit() to
 kvm_guest_exit(), because we are now calling this function with
 interrupts enabled.  We have to now explicitly disable preemption and
 not enable preemption before we've called kvm_guest_exit(), since
 otherwise we could be preempted and everything happening before we
 eventually get scheduled again would be accounted for as guest time.

 At the same time, move the trace_kvm_exit() call outside of the atomic
 section, since there is no reason for us to do that with interrupts
 disabled.

 Signed-off-by: Christoffer Dall christoffer.d...@linaro.org
 ---
 This patch is based on kvm/queue, because it has the kvm_guest_enter/exit
 rework recently posted by Christian Borntraeger.  I hope I got the logic
 of this right, there were 2 slightly worrying facts about this:

 First, we now enable and disable and enable interrupts on each exit
 path, but I couldn't see any performance overhead on hackbench - yes the
 only benchmark we care about.

 Second, looking at the ppc and mips code, they seem to also call
 kvm_guest_exit() before enabling interrupts, so I don't understand how
 guest CPU time accounting works on those architectures.

 Changes since v1:
  - Tweak comment and commit text based on Marc's feedback.
  - Explicitly disable preemption and enable it only after 
 kvm_guest_exit().

  arch/arm/kvm/arm.c | 21 +
  1 file changed, 17 insertions(+), 4 deletions(-)

 diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
 index e41cb11..fe8028d 100644
 --- a/arch/arm/kvm/arm.c
 +++ b/arch/arm/kvm/arm.c
 @@ -532,6 +532,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, 
 struct kvm_run *run)
   kvm_vgic_flush_hwstate(vcpu);
   kvm_timer_flush_hwstate(vcpu);
  
 + preempt_disable();
   local_irq_disable();
  
   /*
 @@ -544,6 +545,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, 
 struct kvm_run *run)
  
   if (ret <= 0 || need_new_vmid_gen(vcpu->kvm)) {
   local_irq_enable();
 + preempt_enable();
   kvm_timer_sync_hwstate(vcpu);
   kvm_vgic_sync_hwstate(vcpu);
   continue;
 @@ -559,8 +561,10 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, 
 struct kvm_run *run)
   ret = kvm_call_hyp(__kvm_vcpu_run, vcpu);
  
   vcpu->mode = OUTSIDE_GUEST_MODE;
 - __kvm_guest_exit();
 - trace_kvm_exit(kvm_vcpu_trap_get_class(vcpu), *vcpu_pc(vcpu));
 + /*
 +  * Back from guest
 +  */
 +
   /*
* We may have taken a host interrupt in HYP mode (ie
* while executing the guest). This interrupt is still
 @@ -574,8 +578,17 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, 
 struct kvm_run *run)
   local_irq_enable();
  
   /*
 -  * Back from guest
 -  */
 +  * We do local_irq_enable() before calling kvm_guest_exit() so
 +  * that if a timer interrupt hits while running the guest we
 +  * account that tick as being spent in the guest.  We enable
 +  * preemption after calling kvm_guest_exit() so that if we get
 +  * preempted we make sure ticks after that is not counted as
 +  * guest time.
 +  */
 + kvm_guest_exit();
 + trace_kvm_exit(kvm_vcpu_trap_get_class(vcpu), *vcpu_pc(vcpu));
 + preempt_enable();
 +
  
   kvm_timer_sync_hwstate(vcpu);
   kvm_vgic_sync_hwstate(vcpu);
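
The ordering argument behind the hunks above can be illustrated with a toy userspace model. This is a sketch only: the struct, the flag and the helper names are invented for illustration and are not kernel interfaces. It assumes that a timer tick which fired while the guest ran is delivered at the moment interrupts are re-enabled, and that the tick handler charges the tick to the guest only if the "in guest" flag is still set at that point.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one CPU; all names here are illustrative, not kernel APIs. */
struct cpu_model {
	bool vcpu_flag;		/* stands in for the "in guest" task flag */
	bool tick_pending;	/* a timer interrupt fired while the guest ran */
	unsigned int guest_ticks;
	unsigned int host_ticks;
};

/* Interrupts get enabled: a pending tick is delivered and accounted now. */
static void model_irq_enable(struct cpu_model *c)
{
	if (!c->tick_pending)
		return;
	c->tick_pending = false;
	if (c->vcpu_flag)
		c->guest_ticks++;
	else
		c->host_ticks++;
}

/* Guest exit bookkeeping: from here on, ticks count as host time. */
static void model_guest_exit(struct cpu_model *c)
{
	c->vcpu_flag = false;
}

/* Run the guest with a tick arriving meanwhile, then exit in either order. */
static struct cpu_model model_run(bool exit_before_irq_enable)
{
	struct cpu_model c = { .vcpu_flag = true, .tick_pending = true };

	if (exit_before_irq_enable) {
		model_guest_exit(&c);
		model_irq_enable(&c);
	} else {
		model_irq_enable(&c);
		model_guest_exit(&c);
	}
	return c;
}
```

With the exit bookkeeping done first, the tick lands in host_ticks; with interrupts enabled first, it lands in guest_ticks, which is the behaviour the patch is after. The model leaves out preemption, which is why the real patch additionally brackets the section with preempt_disable() and preempt_enable().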


 Hi Christoffer,
  so currently we take a snapshot when we enter the guest
 (tsk->vtime_snap) and upon exit add the time we spent in
 the guest and update accrued time, which appears correct.

 not on arm64, because we don't select HAVE_VIRT_CPU_ACCOUNTING_GEN.  Or
 am I missing something obvious here?
 I see what you mean we can't use cycle based accounting to accrue
 Guest time.

 
 See other thread, we can enable this in the config but it still only
 works with NO_HZ_FULL.
 


 With this patch it appears that interrupts running
 in host mode are accrued to Guest time, and additional preemption
 latency is added.

 It is 

Re: [PATCH v2 07/13] KVM: add vcpu-specific functions to read/write/translate GFNs

2015-06-05 Thread Paolo Bonzini


On 05/06/2015 14:10, Radim Krčmář wrote:
 + ret = __kvm_write_guest_page(__gfn_to_memslot(slots, gfn), gfn,
 + data, offset, seg);

Even better, let's pass memslots to all the __ functions.

Paolo


Re: [PATCH v2 07/13] KVM: add vcpu-specific functions to read/write/translate GFNs

2015-06-05 Thread Paolo Bonzini


On 05/06/2015 14:10, Radim Krčmář wrote:
 2015-06-05 12:26+0200, Paolo Bonzini:
 On 29/05/2015 21:23, Radim Krčmář wrote:
 +int kvm_vcpu_write_guest(struct kvm_vcpu *vcpu, gpa_t gpa, const void 
 *data,
 +  unsigned long len)
 +{
 | [...]
 +}
 (There is no need to pass vcpu, and kvm, in this API.

 How so?  A single kvm_vcpu_write_guest can cross multiple slots.
 
 I meant passing 'struct kvm_memslots *' instead and as soon as possible,
 which would still allow more slots.

Oh, indeed that works fine!

Paolo


Re: [PATCH] KVM: x86: fix lapic.timer_mode on restore

2015-06-05 Thread Paolo Bonzini


On 05/06/2015 20:57, Radim Krčmář wrote:
 lapic.timer_mode was not properly initialized after migration, which
 broke a few useful things, like login, by making every sleep eternal.
 
 Fix this by calling apic_update_lvtt in kvm_apic_post_state_restore.
 
 There are other slowpaths that update lvtt, so this patch makes sure
 something similar doesn't happen again by calling apic_update_lvtt
 after every modification.
 
 Cc: sta...@vger.kernel.org
 Fixes: f30ebc312ca9 (KVM: x86: optimize some accesses to LVTT and SPIV)
 Signed-off-by: Radim Krčmář rkrc...@redhat.com
 ---
  arch/x86/kvm/lapic.c | 26 --
  1 file changed, 16 insertions(+), 10 deletions(-)
 
 diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
 index beeef05bb4d9..36e9de1b4127 100644
 --- a/arch/x86/kvm/lapic.c
 +++ b/arch/x86/kvm/lapic.c
 @@ -1103,6 +1103,17 @@ static void update_divide_count(struct kvm_lapic *apic)
  apic->divide_count);
  }
  
 +static void apic_update_lvtt(struct kvm_lapic *apic)
 +{
 + u32 timer_mode = kvm_apic_get_reg(apic, APIC_LVTT) &
 + apic->lapic_timer.timer_mode_mask;
 +
 + if (apic->lapic_timer.timer_mode != timer_mode) {
 + apic->lapic_timer.timer_mode = timer_mode;
 + hrtimer_cancel(&apic->lapic_timer.timer);
 + }
 +}
 +
  static void apic_timer_expired(struct kvm_lapic *apic)
  {
   struct kvm_vcpu *vcpu = apic->vcpu;
 @@ -1311,6 +1322,7 @@ static int apic_reg_write(struct kvm_lapic *apic, u32 
 reg, u32 val)
   apic_set_reg(apic, APIC_LVTT + 0x10 * i,
lvt_val | APIC_LVT_MASKED);
   }
 + apic_update_lvtt(apic);
   atomic_set(&apic->lapic_timer.pending, 0);
  
   }
 @@ -1343,20 +1355,13 @@ static int apic_reg_write(struct kvm_lapic *apic, u32 
 reg, u32 val)
  
   break;
  
 - case APIC_LVTT: {
 - u32 timer_mode = val & apic->lapic_timer.timer_mode_mask;
 -
 - if (apic->lapic_timer.timer_mode != timer_mode) {
 - apic->lapic_timer.timer_mode = timer_mode;
 - hrtimer_cancel(&apic->lapic_timer.timer);
 - }
 -
 + case APIC_LVTT:
   if (!kvm_apic_sw_enabled(apic))
   val |= APIC_LVT_MASKED;
   val &= (apic_lvt_mask[0] | apic->lapic_timer.timer_mode_mask);
   apic_set_reg(apic, APIC_LVTT, val);
 + apic_update_lvtt(apic);
   break;
 - }
  
   case APIC_TMICT:
   if (apic_lvtt_tscdeadline(apic))
 @@ -1588,7 +1593,7 @@ void kvm_lapic_reset(struct kvm_vcpu *vcpu, bool 
 init_event)
  
   for (i = 0; i < APIC_LVT_NUM; i++)
   apic_set_reg(apic, APIC_LVTT + 0x10 * i, APIC_LVT_MASKED);
 - apic->lapic_timer.timer_mode = 0;
 + apic_update_lvtt(apic);
   if (!(vcpu->kvm->arch.disabled_quirks & KVM_QUIRK_LINT0_REENABLED))
   apic_set_reg(apic, APIC_LVT0,
SET_APIC_DELIVERY_MODE(0, APIC_MODE_EXTINT));
 @@ -1816,6 +1821,7 @@ void kvm_apic_post_state_restore(struct kvm_vcpu *vcpu,
  
   apic_update_ppr(apic);
   hrtimer_cancel(&apic->lapic_timer.timer);
 + apic_update_lvtt(apic);
   update_divide_count(apic);
   start_apic_timer(apic);
   apic->irr_pending = true;
 

Marcelo, if you have some free cycles feel free to apply this to
kvm/master and send it to Linus sometime next week.  I cannot do it on
Monday and I'll be on vacation afterwards.  (I'll be back as soon as
June 16th so I didn't plan on a formal handoff, but I think it's better
to have this in 4.1. The merge window conflicts with Linus's own
vacation and might be delayed).

And thanks Radim for the fix, it looks good.

Paolo


[PATCH] KVM: x86: fix lapic.timer_mode on restore

2015-06-05 Thread Radim Krčmář
lapic.timer_mode was not properly initialized after migration, which
broke a few useful things, like login, by making every sleep eternal.

Fix this by calling apic_update_lvtt in kvm_apic_post_state_restore.

There are other slowpaths that update lvtt, so this patch makes sure
something similar doesn't happen again by calling apic_update_lvtt
after every modification.

Cc: sta...@vger.kernel.org
Fixes: f30ebc312ca9 (KVM: x86: optimize some accesses to LVTT and SPIV)
Signed-off-by: Radim Krčmář rkrc...@redhat.com
---
 arch/x86/kvm/lapic.c | 26 --
 1 file changed, 16 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index beeef05bb4d9..36e9de1b4127 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1103,6 +1103,17 @@ static void update_divide_count(struct kvm_lapic *apic)
   apic->divide_count);
 }
 
+static void apic_update_lvtt(struct kvm_lapic *apic)
+{
+   u32 timer_mode = kvm_apic_get_reg(apic, APIC_LVTT) &
+   apic->lapic_timer.timer_mode_mask;
+
+   if (apic->lapic_timer.timer_mode != timer_mode) {
+   apic->lapic_timer.timer_mode = timer_mode;
+   hrtimer_cancel(&apic->lapic_timer.timer);
+   }
+}
+
 static void apic_timer_expired(struct kvm_lapic *apic)
 {
struct kvm_vcpu *vcpu = apic->vcpu;
@@ -1311,6 +1322,7 @@ static int apic_reg_write(struct kvm_lapic *apic, u32 
reg, u32 val)
apic_set_reg(apic, APIC_LVTT + 0x10 * i,
 lvt_val | APIC_LVT_MASKED);
}
+   apic_update_lvtt(apic);
atomic_set(&apic->lapic_timer.pending, 0);
 
}
@@ -1343,20 +1355,13 @@ static int apic_reg_write(struct kvm_lapic *apic, u32 
reg, u32 val)
 
break;
 
-   case APIC_LVTT: {
-   u32 timer_mode = val & apic->lapic_timer.timer_mode_mask;
-
-   if (apic->lapic_timer.timer_mode != timer_mode) {
-   apic->lapic_timer.timer_mode = timer_mode;
-   hrtimer_cancel(&apic->lapic_timer.timer);
-   }
-
+   case APIC_LVTT:
if (!kvm_apic_sw_enabled(apic))
val |= APIC_LVT_MASKED;
val &= (apic_lvt_mask[0] | apic->lapic_timer.timer_mode_mask);
apic_set_reg(apic, APIC_LVTT, val);
+   apic_update_lvtt(apic);
break;
-   }
 
case APIC_TMICT:
if (apic_lvtt_tscdeadline(apic))
@@ -1588,7 +1593,7 @@ void kvm_lapic_reset(struct kvm_vcpu *vcpu, bool 
init_event)
 
for (i = 0; i < APIC_LVT_NUM; i++)
apic_set_reg(apic, APIC_LVTT + 0x10 * i, APIC_LVT_MASKED);
-   apic->lapic_timer.timer_mode = 0;
+   apic_update_lvtt(apic);
if (!(vcpu->kvm->arch.disabled_quirks & KVM_QUIRK_LINT0_REENABLED))
apic_set_reg(apic, APIC_LVT0,
 SET_APIC_DELIVERY_MODE(0, APIC_MODE_EXTINT));
@@ -1816,6 +1821,7 @@ void kvm_apic_post_state_restore(struct kvm_vcpu *vcpu,
 
apic_update_ppr(apic);
hrtimer_cancel(&apic->lapic_timer.timer);
+   apic_update_lvtt(apic);
update_divide_count(apic);
start_apic_timer(apic);
apic->irr_pending = true;
-- 
2.4.2
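
The class of bug fixed here (a cached value derived from a register going stale when the register is overwritten behind its back) can be reproduced in a few lines of userspace C. The following is a minimal sketch, not the kernel code: struct lapic_sketch, the SKETCH_ constants and the resync_cache switch are invented for illustration, and the mode bit positions are an assumption of the model.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative constants; the bit positions are assumptions of this sketch. */
#define SKETCH_TIMER_MODE_MASK (3u << 17)
#define SKETCH_TIMER_PERIODIC  (1u << 17)

struct lapic_sketch {
	uint32_t lvtt;		/* register contents */
	uint32_t timer_mode;	/* cached copy of the mode bits of lvtt */
	unsigned int cancels;	/* how often the pending timer was cancelled */
};

/* Re-derive the cache from the register; cancel the timer on a mode change. */
static void sketch_update_lvtt(struct lapic_sketch *apic)
{
	uint32_t mode = apic->lvtt & SKETCH_TIMER_MODE_MASK;

	if (apic->timer_mode != mode) {
		apic->timer_mode = mode;
		apic->cancels++;
	}
}

/* Restore path: the register is overwritten wholesale, e.g. after migration. */
static void sketch_restore(struct lapic_sketch *apic, uint32_t saved_lvtt,
			   int resync_cache)
{
	apic->lvtt = saved_lvtt;
	if (resync_cache)
		sketch_update_lvtt(apic);
}
```

Restoring a periodic-mode LVTT without the resync leaves timer_mode at its old value, so any code that consults the cache sees the wrong mode; with the resync the cache follows the register. Routing every writer through one update helper, as the patch does, keeps that invariant in a single place.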



Re: [PATCH] kvmtool: don't use PCI config space IRQ line field

2015-06-05 Thread Will Deacon
On Thu, Jun 04, 2015 at 04:20:45PM +0100, Andre Przywara wrote:
 In PCI config space there is an interrupt line field (offset 0x3f),
 which is used to initially communicate the IRQ line number from
 firmware to the OS. _Hardware_ should never use this information,
 as the OS is free to write any information in there.
 But kvmtool uses this number when it triggers IRQs in the guest,
 which fails starting with Linux 3.19-rc1, where the PCI layer starts
 writing the virtual IRQ number in there.
 
 Fix that by storing the IRQ number in a separate field in
 struct virtio_pci, which is independent from the PCI config space
 and cannot be influenced by the guest.
 This fixes ARM/ARM64 guests using PCI with newer kernels.
 
 Signed-off-by: Andre Przywara andre.przyw...@arm.com
 ---
  include/kvm/virtio-pci.h | 8 
  virtio/pci.c | 9 ++---
  2 files changed, 14 insertions(+), 3 deletions(-)
 
 diff --git a/include/kvm/virtio-pci.h b/include/kvm/virtio-pci.h
 index c795ce7..b70cadd 100644
 --- a/include/kvm/virtio-pci.h
 +++ b/include/kvm/virtio-pci.h
 @@ -30,6 +30,14 @@ struct virtio_pci {
   u8  isr;
   u32 features;
  
 + /*
 +  * We cannot rely on the INTERRUPT_LINE byte in the config space once
 +  * we have run guest code, as the OS is allowed to use that field
 +  * as a scratch pad to communicate between driver and PCI layer.
 +  * So store our legacy interrupt line number in here for internal use.
 +  */
 + u8  legacy_irq_line;
 +
   /* MSI-X */
   u16 config_vector;
   u32 config_gsi;
 diff --git a/virtio/pci.c b/virtio/pci.c
 index 7556239..e17e5a9 100644
 --- a/virtio/pci.c
 +++ b/virtio/pci.c
 @@ -141,7 +141,7 @@ static bool virtio_pci__io_in(struct ioport *ioport, 
 struct kvm_cpu *vcpu, u16 p
   break;
   case VIRTIO_PCI_ISR:
   ioport__write8(data, vpci->isr);
 - kvm__irq_line(kvm, vpci->pci_hdr.irq_line, VIRTIO_IRQ_LOW);
 + kvm__irq_line(kvm, vpci->legacy_irq_line, VIRTIO_IRQ_LOW);
   vpci->isr = VIRTIO_IRQ_LOW;
   break;
   default:
 @@ -299,7 +299,7 @@ int virtio_pci__signal_vq(struct kvm *kvm, struct 
 virtio_device *vdev, u32 vq)
   kvm__irq_trigger(kvm, vpci->gsis[vq]);
   } else {
   vpci->isr = VIRTIO_IRQ_HIGH;
 - kvm__irq_trigger(kvm, vpci->pci_hdr.irq_line);
 + kvm__irq_trigger(kvm, vpci->legacy_irq_line);
   }
   return 0;
  }
 @@ -323,7 +323,7 @@ int virtio_pci__signal_config(struct kvm *kvm, struct 
 virtio_device *vdev)
   kvm__irq_trigger(kvm, vpci->config_gsi);
   } else {
   vpci->isr = VIRTIO_PCI_ISR_CONFIG;
 - kvm__irq_trigger(kvm, vpci->pci_hdr.irq_line);
 + kvm__irq_trigger(kvm, vpci->legacy_irq_line);
   }
  
   return 0;
 @@ -422,6 +422,9 @@ int virtio_pci__init(struct kvm *kvm, void *dev, struct 
 virtio_device *vdev,
   if (r < 0)
   goto free_msix_mmio;
  
 + /* save the IRQ that device__register() has allocated */
 + vpci->legacy_irq_line = vpci->pci_hdr.irq_line;

I'd rather we used the container_of trick that we do for virtio-mmio
devices when assigning the irq in device__register. Then we can avoid
this line completely.
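
To make the suggestion concrete for readers: the idea is that the registration code, which only sees the embedded generic header, can reach the surrounding transport struct via container_of and store the IRQ there directly. The following is a rough userspace sketch under stated assumptions; the struct layouts and function names (dev_hdr_sketch, vpci_sketch, sketch_register and friends) are hypothetical and do not reflect how kvmtool actually lays out its device structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define sketch_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical generic device header embedded in a transport-specific struct. */
struct dev_hdr_sketch {
	int bus_type;
};

struct vpci_sketch {
	uint8_t config_irq_line;	/* guest-writable config space byte */
	uint8_t legacy_irq_line;	/* private copy the guest cannot touch */
	struct dev_hdr_sketch hdr;
};

/* Registration assigns the IRQ and stores it in both places at once. */
static void sketch_register(struct dev_hdr_sketch *hdr, uint8_t irq)
{
	struct vpci_sketch *vpci =
		sketch_container_of(hdr, struct vpci_sketch, hdr);

	vpci->config_irq_line = irq;
	vpci->legacy_irq_line = irq;
}

/* The guest OS may later scribble on the config space field. */
static void sketch_guest_writes_config(struct vpci_sketch *vpci, uint8_t val)
{
	vpci->config_irq_line = val;
}

/* Interrupt injection uses only the private copy. */
static uint8_t sketch_irq_to_trigger(const struct vpci_sketch *vpci)
{
	return vpci->legacy_irq_line;
}
```

With the private copy filled in at registration time, no separate "save the IRQ" step is needed in the init function, and whatever the guest later writes to the config space byte has no effect on which line is raised.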

Will


Re: Announcing qboot, a minimal x86 firmware for QEMU

2015-06-05 Thread Stefan Hajnoczi
On Tue, May 26, 2015 at 9:47 AM, Stefan Hajnoczi stefa...@gmail.com wrote:
 On Fri, May 22, 2015 at 10:53:54AM +0800, Yong Wang wrote:
 On Thu, May 21, 2015 at 03:51:43PM +0200, Paolo Bonzini wrote:
  On the QEMU side, there is no support yet for persistent memory and the
  NFIT tables from ACPI 6.0.  Once that (and ACPI support) is added, qboot
  will automatically start using it.
 

 We are working on adding NFIT support into virtual bios.

 Great.  I asked about this on the #pmem (irc.oftc.net) IRC channel last week.

 Which virtual bios are you targeting?

Ping?

Interest in persistent memory is picking up and I'd like to avoid
duplicating work.  Which pieces do you have patches for?

1. QEMU -device pmem,file=/path/to/dax/file,id=pmem1 and fw_cfg/ACPI
info that gets passed to the guest
2. SeaBIOS NFIT ACPI table
3. ACPI NVDIMM DSM (probably not much needed, most features would be disabled)

Thanks,
Stefan


Re: [PATCH v2 07/13] KVM: add vcpu-specific functions to read/write/translate GFNs

2015-06-05 Thread Radim Krčmář
2015-06-05 14:46+0200, Paolo Bonzini:
 On 05/06/2015 14:10, Radim Krčmář wrote:
  +   ret = __kvm_write_guest_page(__gfn_to_memslot(slots, gfn), gfn,
  +   data, offset, seg);
 
 Even better, let's pass memslots to all the __ functions.

Yeah, while scoping it, I noticed a bug in the series ...
makes me wish that C had a useful type system.

A quick fix would be to replace gpa with gfn in calls to
__kvm_read_guest_atomic().  I presume you'd prefer a new patch to
rebasing, so it's below.

---
KVM: fix gpa/gfn mixup in __kvm_read_guest_atomic

Refactoring passed gpa instead of gfn to __kvm_read_guest_atomic.
While at it, lessen code duplication by extracting slots earlier.

Fixes: 841509f38372 (KVM: add vcpu-specific functions to read/write/translate 
GFNs)
Signed-off-by: Radim Krčmář rkrc...@redhat.com
---
 virt/kvm/kvm_main.c | 19 +--
 1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 10ae7e348dcc..4fa1edc34630 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1645,11 +1645,14 @@ int kvm_vcpu_read_guest(struct kvm_vcpu *vcpu, gpa_t 
gpa, void *data, unsigned l
 }
 EXPORT_SYMBOL_GPL(kvm_vcpu_read_guest);
 
-static int __kvm_read_guest_atomic(struct kvm_memory_slot *slot, gfn_t gfn,
-  void *data, int offset, unsigned long len)
+static int __kvm_read_guest_atomic(struct kvm_memslots *slots, gpa_t gpa,
+  void *data, unsigned long len)
 {
int r;
unsigned long addr;
+   gfn_t gfn = gpa >> PAGE_SHIFT;
+   struct kvm_memory_slot *slot = __gfn_to_memslot(slots, gfn);
+   int offset = offset_in_page(gpa);
 
addr = gfn_to_hva_memslot_prot(slot, gfn, NULL);
if (kvm_is_error_hva(addr))
@@ -1665,22 +1668,18 @@ static int __kvm_read_guest_atomic(struct 
kvm_memory_slot *slot, gfn_t gfn,
 int kvm_read_guest_atomic(struct kvm *kvm, gpa_t gpa, void *data,
  unsigned long len)
 {
-   gfn_t gfn = gpa >> PAGE_SHIFT;
-   struct kvm_memory_slot *slot = gfn_to_memslot(kvm, gfn);
-   int offset = offset_in_page(gpa);
+   struct kvm_memslots *slots = kvm_memslots(kvm);
 
-   return __kvm_read_guest_atomic(slot, gpa, data, offset, len);
+   return __kvm_read_guest_atomic(slots, gpa, data, len);
 }
 EXPORT_SYMBOL_GPL(kvm_read_guest_atomic);
 
 int kvm_vcpu_read_guest_atomic(struct kvm_vcpu *vcpu, gpa_t gpa,
   void *data, unsigned long len)
 {
-   gfn_t gfn = gpa >> PAGE_SHIFT;
-   struct kvm_memory_slot *slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn);
-   int offset = offset_in_page(gpa);
+   struct kvm_memslots *slots = kvm_vcpu_memslots(vcpu);
 
-   return __kvm_read_guest_atomic(slot, gpa, data, offset, len);
+   return __kvm_read_guest_atomic(slots, gpa, data, len);
 }
 EXPORT_SYMBOL_GPL(kvm_vcpu_read_guest_atomic);
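
On the type-system remark: since gpa_t and gfn_t are plain integer typedefs, the compiler accepts one where the other is expected. One way to get a diagnostic in C is to wrap each in a single-member struct. This is only a sketch of that idea, not a proposal for the kernel types; the names typed_gpa, typed_gfn and the SKETCH_ macros are made up for illustration, and a 4 KiB page size is assumed.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (UINT64_C(1) << SKETCH_PAGE_SHIFT)

/* Distinct struct types: mixing them up is a compile-time error. */
typedef struct { uint64_t val; } typed_gpa;
typedef struct { uint64_t val; } typed_gfn;

/* Guest physical address -> guest frame number. */
static inline typed_gfn typed_gpa_to_gfn(typed_gpa gpa)
{
	return (typed_gfn){ gpa.val >> SKETCH_PAGE_SHIFT };
}

/* Byte offset of a guest physical address within its page. */
static inline uint64_t typed_offset_in_page(typed_gpa gpa)
{
	return gpa.val & (SKETCH_PAGE_SIZE - 1);
}

/* Takes a frame number only; passing a typed_gpa here fails to compile. */
static inline uint64_t typed_frame_base(typed_gfn gfn)
{
	return gfn.val << SKETCH_PAGE_SHIFT;
}
```

With these wrappers, a call such as typed_frame_base(gpa) is rejected by the compiler as an incompatible argument type, whereas the equivalent mixup with integer typedefs compiles silently. The cost is the .val noise at every arithmetic site.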
 


[PATCH] KVM: arm/arm64: Enable the KVM-VFIO device

2015-06-05 Thread Eric Auger
From: Kim Phillips kim.phill...@linaro.org

The KVM-VFIO device is used by the QEMU VFIO device. It is used to
record the list of in-use VFIO groups so that KVM can manipulate
them.

Signed-off-by: Kim Phillips kim.phill...@linaro.org
Signed-off-by: Eric Auger eric.au...@linaro.org

---

- Previously included in the KVM-VFIO IRQ forward control v6 series.
  The rationale for splitting it out is that the unavailability of the
  kvm-vfio device produces a warning when launching the QEMU VFIO platform
  device, which can puzzle some users (although it is not blocking):
  Failed to create KVM VFIO device: No such device
---
 arch/arm/kvm/Kconfig| 1 +
 arch/arm/kvm/Makefile   | 2 +-
 arch/arm64/kvm/Kconfig  | 1 +
 arch/arm64/kvm/Makefile | 2 +-
 4 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/arm/kvm/Kconfig b/arch/arm/kvm/Kconfig
index f1f79d1..bfb915d 100644
--- a/arch/arm/kvm/Kconfig
+++ b/arch/arm/kvm/Kconfig
@@ -28,6 +28,7 @@ config KVM
select KVM_GENERIC_DIRTYLOG_READ_PROTECT
select SRCU
select MMU_NOTIFIER
+   select KVM_VFIO
select HAVE_KVM_EVENTFD
select HAVE_KVM_IRQFD
depends on ARM_VIRT_EXT && ARM_LPAE && ARM_ARCH_TIMER
diff --git a/arch/arm/kvm/Makefile b/arch/arm/kvm/Makefile
index 139e46c..c5eef02c 100644
--- a/arch/arm/kvm/Makefile
+++ b/arch/arm/kvm/Makefile
@@ -15,7 +15,7 @@ AFLAGS_init.o := -Wa,-march=armv7-a$(plus_virt)
 AFLAGS_interrupts.o := -Wa,-march=armv7-a$(plus_virt)
 
 KVM := ../../../virt/kvm
-kvm-arm-y = $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o $(KVM)/eventfd.o
+kvm-arm-y = $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o $(KVM)/eventfd.o 
$(KVM)/vfio.o
 
 obj-y += kvm-arm.o init.o interrupts.o
 obj-y += arm.o handle_exit.o guest.o mmu.o emulate.o reset.o
diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig
index 5105e29..bfffe8f 100644
--- a/arch/arm64/kvm/Kconfig
+++ b/arch/arm64/kvm/Kconfig
@@ -28,6 +28,7 @@ config KVM
select KVM_ARM_HOST
select KVM_GENERIC_DIRTYLOG_READ_PROTECT
select SRCU
+   select KVM_VFIO
select HAVE_KVM_EVENTFD
select HAVE_KVM_IRQFD
---help---
diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
index d5904f8..f90f4aa 100644
--- a/arch/arm64/kvm/Makefile
+++ b/arch/arm64/kvm/Makefile
@@ -11,7 +11,7 @@ ARM=../../../arch/arm/kvm
 
 obj-$(CONFIG_KVM_ARM_HOST) += kvm.o
 
-kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o 
$(KVM)/eventfd.o
+kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o 
$(KVM)/eventfd.o $(KVM)/vfio.o
 kvm-$(CONFIG_KVM_ARM_HOST) += $(ARM)/arm.o $(ARM)/mmu.o $(ARM)/mmio.o
 kvm-$(CONFIG_KVM_ARM_HOST) += $(ARM)/psci.o $(ARM)/perf.o
 
-- 
1.9.1



Re: [PATCH v2 07/13] KVM: add vcpu-specific functions to read/write/translate GFNs

2015-06-05 Thread Paolo Bonzini


On 05/06/2015 17:13, Radim Krčmář wrote:
 2015-06-05 14:46+0200, Paolo Bonzini:
 On 05/06/2015 14:10, Radim Krčmář wrote:
 +   ret = __kvm_write_guest_page(__gfn_to_memslot(slots, gfn), gfn,
 +   data, offset, seg);

 Even better, let's pass memslots to all the __ functions.
 
 Yeah, while scoping it, I noticed a bug in the series ...
 makes me wish that C had a useful type system.
 
 A quick fix would be to replace gpa with gfn in calls to
 __kvm_read_guest_atomic().  I presume you'd prefer a new patch to
 rebasing, so it's below.

Since it was pushed only for 15 minutes or so, and the fix is two lines:

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 30425ce6a4a4..848af90b8091 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1669,7 +1669,7 @@ int kvm_read_guest_atomic(struct kvm *kvm, gpa_t gpa, 
void *data,
struct kvm_memory_slot *slot = gfn_to_memslot(kvm, gfn);
int offset = offset_in_page(gpa);
 
-   return __kvm_read_guest_atomic(slot, gpa, data, offset, len);
+   return __kvm_read_guest_atomic(slot, gfn, data, offset, len);
 }
 EXPORT_SYMBOL_GPL(kvm_read_guest_atomic);
 
@@ -1680,7 +1680,7 @@ int kvm_vcpu_read_guest_atomic(struct kvm_vcpu *vcpu, 
gpa_t gpa,
struct kvm_memory_slot *slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn);
int offset = offset_in_page(gpa);
 
-   return __kvm_read_guest_atomic(slot, gpa, data, offset, len);
+   return __kvm_read_guest_atomic(slot, gfn, data, offset, len);
 }
 EXPORT_SYMBOL_GPL(kvm_vcpu_read_guest_atomic);

I just force-pushed kvm/next.

The patch is good, but I prefer to do minimal changes before fleeing
on holiday.

Paolo


Re: [v3 24/26] KVM: Update Posted-Interrupts Descriptor when vCPU is blocked

2015-06-05 Thread Marcelo Tosatti
On Tue, Apr 14, 2015 at 07:37:44AM +, Wu, Feng wrote:
 
 
  -Original Message-
  From: Marcelo Tosatti [mailto:mtosa...@redhat.com]
  Sent: Tuesday, March 31, 2015 7:56 AM
  To: Wu, Feng
  Cc: h...@zytor.com; t...@linutronix.de; mi...@redhat.com; x...@kernel.org;
  g...@kernel.org; pbonz...@redhat.com; dw...@infradead.org;
  j...@8bytes.org; alex.william...@redhat.com; jiang@linux.intel.com;
  eric.au...@linaro.org; linux-ker...@vger.kernel.org;
  io...@lists.linux-foundation.org; kvm@vger.kernel.org
  Subject: Re: [v3 24/26] KVM: Update Posted-Interrupts Descriptor when vCPU
  is blocked
  
  On Mon, Mar 30, 2015 at 04:46:55AM +, Wu, Feng wrote:
  
  
-Original Message-
From: Marcelo Tosatti [mailto:mtosa...@redhat.com]
Sent: Saturday, March 28, 2015 3:30 AM
To: Wu, Feng
Cc: h...@zytor.com; t...@linutronix.de; mi...@redhat.com;
  x...@kernel.org;
g...@kernel.org; pbonz...@redhat.com; dw...@infradead.org;
j...@8bytes.org; alex.william...@redhat.com; jiang@linux.intel.com;
eric.au...@linaro.org; linux-ker...@vger.kernel.org;
io...@lists.linux-foundation.org; kvm@vger.kernel.org
Subject: Re: [v3 24/26] KVM: Update Posted-Interrupts Descriptor when
  vCPU
is blocked
   
On Fri, Mar 27, 2015 at 06:34:14AM +, Wu, Feng wrote:
   Currently, the following code is executed before 
   local_irq_disable() is
called,
   so do you mean 1)moving local_irq_disable() to the place before 
   it. 2)
  after
  interrupt
   is disabled, set KVM_REQ_EVENT in case the ON bit is set?
 
  2) after interrupt is disabled, set KVM_REQ_EVENT in case the ON bit
  is set.

 Here is my understanding about your comments here:
 - Disable interrupts
 - Check 'ON'
 - Set KVM_REQ_EVENT if 'ON' is set

 Then we can put the above code inside  if
(kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) 
 just like it used to be. However, I still have some questions about 
 this
comment:

 1. Where should I set KVM_REQ_EVENT? In function vcpu_enter_guest(),
  or
other places?
   
See below:
   
 If in vcpu_enter_guest(): since local_irq_disable() is currently called after
 KVM_REQ_EVENT is checked, does it help to set KVM_REQ_EVENT after
 local_irq_disable() is called?
   
local_irq_disable();
   
*** add code here ***
  
   So we need to add code like the following here, right?

     if ('ON' is set)
         kvm_make_request(KVM_REQ_EVENT, vcpu);
  
 
 Hi Marcelo,
 
 I changed the code as above, and then found that the ping latency became
 extremely high (70ms - 400ms).
 I dug into it and found the root cause: we cannot use the 'ON' bit as the
 criterion, since 'ON' can be cleared by hypervisor software in many places.
 In that case, KVM_REQ_EVENT is not set when we check the 'ON' bit, hence the
 interrupts are not injected into the guest in time.
 
 Please refer to the following call chain, in which the 'ON' bit can be cleared:
 
 apic_find_highest_irr() --> vmx_sync_pir_to_irr() --> pi_test_and_clear_on()
 
 Searching through the code step by step, apic_find_highest_irr() can be
 called from many other places.
 
 Thanks,
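
The failure described above can be reproduced in a small user-space model. This is only a sketch under stated assumptions: the struct layouts, the 32-vector PIR, and every function name below are hypothetical stand-ins for illustration, not the kernel's definitions. It shows that once a sync step has test-and-cleared 'ON' and moved the posted request into the IRR, a later "set KVM_REQ_EVENT if 'ON' is set" check no longer sees anything to do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a posted-interrupt descriptor (not the kernel layout). */
struct pi_desc_model {
    uint32_t pir;   /* posted requests, one bit per vector (32 for brevity) */
    bool on;        /* outstanding-notification ('ON') bit */
};

struct vcpu_model {
    struct pi_desc_model pi;
    uint32_t irr;       /* stands in for the virtual APIC IRR */
    bool req_event;     /* stands in for KVM_REQ_EVENT */
};

/* A device posts an interrupt: record the vector and set 'ON'. */
void post_interrupt(struct vcpu_model *v, unsigned vec)
{
    assert(vec < 32);
    v->pi.pir |= UINT32_C(1) << vec;
    v->pi.on = true;
}

/* Models the sync step: test-and-clear 'ON', then move PIR into IRR. */
void sync_pir_to_irr(struct vcpu_model *v)
{
    if (!v->pi.on)
        return;
    v->pi.on = false;
    v->irr |= v->pi.pir;
    v->pi.pir = 0;
}

/* The check discussed in the thread: request an event only if 'ON' is set. */
void check_on_after_irq_disable(struct vcpu_model *v)
{
    if (v->pi.on)
        v->req_event = true;
}

/* True when an interrupt sits in IRR but no event was requested for it. */
bool interrupt_missed(const struct vcpu_model *v)
{
    return v->irr != 0 && !v->req_event;
}
```

If post_interrupt() is followed directly by check_on_after_irq_disable(), the event is requested. If sync_pir_to_irr() runs in between, 'ON' is already clear and the request is never made, which is consistent with the delayed injection reported above.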

Ok then, ignore my suggestion.

Can you resend the latest version, please?




[kvm:queue 76/76] arch/x86/kvm/../../../virt/kvm/irqchip.c:144:35: sparse: incorrect type in argument 1 (different address spaces)

2015-06-05 Thread kbuild test robot
tree:   git://git.kernel.org/pub/scm/virt/kvm/kvm.git queue
head:   6aa5e7eb06cff8d317328a0c4696b5f635ba6be3
commit: 6aa5e7eb06cff8d317328a0c4696b5f635ba6be3 [76/76] kvm: irqchip: Break up high order allocations of kvm_irq_routing_table
reproduce:
  # apt-get install sparse
  git checkout 6aa5e7eb06cff8d317328a0c4696b5f635ba6be3
  make ARCH=x86_64 allmodconfig
  make C=1 CF=-D__CHECK_ENDIAN__


sparse warnings: (new ones prefixed by >>)

>> arch/x86/kvm/../../../virt/kvm/irqchip.c:144:35: sparse: incorrect type in argument 1 (different address spaces)
   arch/x86/kvm/../../../virt/kvm/irqchip.c:144:35:    expected struct kvm_irq_routing_table *rt
   arch/x86/kvm/../../../virt/kvm/irqchip.c:144:35:    got struct kvm_irq_routing_table [noderef] <asn:4>*irq_routing
   arch/x86/kvm/../../../virt/kvm/irqchip.c:224:13: sparse: incorrect type in assignment (different address spaces)
   arch/x86/kvm/../../../virt/kvm/irqchip.c:224:13:    expected struct kvm_irq_routing_table *old
   arch/x86/kvm/../../../virt/kvm/irqchip.c:224:13:    got struct kvm_irq_routing_table [noderef] <asn:4>*irq_routing

vim +144 arch/x86/kvm/../../../virt/kvm/irqchip.c

   128                  struct kvm_kernel_irq_routing_entry *e;
   129                  struct hlist_node *n;
   130  
   131                  hlist_for_each_entry_safe(e, n, &rt->map[i], link) {
   132                          hlist_del(&e->link);
   133                          kfree(e);
   134                  }
   135          }
   136  
   137          kfree(rt);
   138  }
   139  
   140  void kvm_free_irq_routing(struct kvm *kvm)
   141  {
   142          /* Called only during vm destruction. Nobody can use the pointer
   143             at this stage */
 > 144          free_irq_routing_table(kvm->irq_routing);
   145  }
   146  
   147  static int setup_routing_entry(struct kvm_irq_routing_table *rt,
   148                                 struct kvm_kernel_irq_routing_entry *e,
   149                                 const struct kvm_irq_routing_entry *ue)
   150  {
   151          int r = -EINVAL;
   152          struct kvm_kernel_irq_routing_entry *ei;
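
For context, the "[noderef]" address-space annotation in the warning is how sparse reports a pointer marked __rcu being passed where a plain pointer is expected. The following is a minimal user-space sketch of that class of problem and of one conventional way to address it, namely stripping the annotation through an accessor before use. In the kernel the accessors for this are rcu_access_pointer() and rcu_dereference_protected(); everything else here (the struct names, the model_* macros, the helper functions) is a hypothetical stand-in, and this is not necessarily the fix that was applied to the tree.

```c
#include <assert.h>
#include <stdlib.h>

/* Under sparse (__CHECKER__) these become checker attributes; a regular
 * compiler sees nothing. Modelled on the kernel's annotations. */
#ifdef __CHECKER__
# define __rcu   __attribute__((noderef, address_space(4)))
# define __force __attribute__((force))
#else
# define __rcu
# define __force
#endif

/* Hypothetical, cut-down stand-ins for the kernel structures. */
struct routing_table_model {
    int nr_entries;
};

struct vm_model {
    struct routing_table_model __rcu *irq_routing;
};

/* Simplified stand-in for rcu_access_pointer(): fetch the pointer value
 * and drop the __rcu address space without dereferencing it. */
#define model_rcu_access_pointer(p) \
    ((struct routing_table_model __force *)(p))

/* Simplified stand-in for initialising an __rcu pointer. */
#define model_rcu_init_pointer(p, v) \
    ((p) = (struct routing_table_model __force __rcu *)(v))

/* Frees the table; returns how many entries it described (0 for NULL). */
int free_routing_table(struct routing_table_model *rt)
{
    if (!rt)
        return 0;
    int n = rt->nr_entries;
    free(rt);
    return n;
}

/* Allocates a table with n entries and publishes it; 0 on success. */
int vm_install_routing(struct vm_model *vm, int n)
{
    struct routing_table_model *rt = malloc(sizeof(*rt));
    if (!rt)
        return -1;
    rt->nr_entries = n;
    model_rcu_init_pointer(vm->irq_routing, rt);
    return 0;
}

/* Teardown path: passing vm->irq_routing directly is what sparse flags,
 * so go through the accessor first. */
int vm_free_irq_routing(struct vm_model *vm)
{
    struct routing_table_model *rt =
        model_rcu_access_pointer(vm->irq_routing);
    return free_routing_table(rt);
}
```

With an ordinary compiler the macros expand to nothing and the code behaves like plain pointer handling; the annotations only matter to the static checker.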

---
0-DAY kernel test infrastructure                Open Source Technology Center
http://lists.01.org/mailman/listinfo/kbuild                 Intel Corporation