Re: [PATCH v2 6/8] arm/arm64: KVM: Add forwarded physical interrupts documentation

2015-09-09 Thread Eric Auger
Hi Andre,
On 09/09/2015 10:49 AM, Christoffer Dall wrote:
> On Tue, Sep 8, 2015 at 6:57 PM, Andre Przywara  wrote:
>> Hi Eric,
>>
>> thanks for your answer.
>>
>> On 08/09/15 09:43, Eric Auger wrote:
>>> Hi Andre,
>>> On 09/07/2015 01:25 PM, Andre Przywara wrote:
>>>> Hi,
>>>>
>>>> firstly: this text is really great, thanks for coming up with that.
>>>> See below for some information I got from tracing the host which I
>>>> cannot make sense of
>>>>
>>>>
>>>> On 04/09/15 20:40, Christoffer Dall wrote:
>>>>> Forwarded physical interrupts on arm/arm64 is a tricky concept and the
>>>>> way we deal with them is not apparently easy to understand by reading
>>>>> various specs.
>>>>>
>>>>> Therefore, add a proper documentation file explaining the flow and
>>>>> rationale of the behavior of the vgic.
>>>>>
>>>>> Some of this text was contributed by Marc Zyngier and edited by me.
>>>>> Omissions and errors are all mine.
>>>>>
>>>>> Signed-off-by: Christoffer Dall 
>>>>> ---
>>>>>  Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt | 181 +
>>>>>  1 file changed, 181 insertions(+)
>>>>>  create mode 100644 Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt
>>>>>
>>>>> diff --git a/Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt b/Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt
>>>>> new file mode 100644
>>>>> index 0000000..24b6f28
>>>>> --- /dev/null
>>>>> +++ b/Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt
>>>>> @@ -0,0 +1,181 @@
>>>>> +KVM/ARM VGIC Forwarded Physical Interrupts
>>>>> +==========================================
>>>>> +
>>>>> +The KVM/ARM code implements software support for the ARM Generic
>>>>> +Interrupt Controller's (GIC's) hardware support for virtualization by
>>>>> +allowing software to inject virtual interrupts to a VM, which the guest
>>>>> +OS sees as regular interrupts.  The code is famously known as the VGIC.
>>>>> +
>>>>> +Some of these virtual interrupts, however, correspond to physical
>>>>> +interrupts from real physical devices.  One example could be the
>>>>> +architected timer, which itself supports virtualization, and therefore
>>>>> +lets a guest OS program the hardware device directly to raise an
>>>>> +interrupt at some point in time.  When such an interrupt is raised, the
>>>>> +host OS initially handles the interrupt and must somehow signal this
>>>>> +event as a virtual interrupt to the guest.  Another example could be a
>>>>> +passthrough device, where the physical interrupts are initially handled
>>>>> +by the host, but the device driver for the device lives in the guest OS
>>>>> +and KVM must therefore somehow inject a virtual interrupt on behalf of
>>>>> +the physical one to the guest OS.
>>>>> +
>>>>> +These virtual interrupts corresponding to a physical interrupt on the
>>>>> +host are called forwarded physical interrupts, but are also sometimes
>>>>> +referred to as 'virtualized physical interrupts' and 'mapped interrupts'.
>>>>> +
>>>>> +Forwarded physical interrupts are handled slightly differently compared
>>>>> +to virtual interrupts generated purely by a software emulated device.
>>>>> +
>>>>> +
>>>>> +The HW bit
>>>>> +----------
>>>>> +Virtual interrupts are signalled to the guest by programming the List
>>>>> +Registers (LRs) on the GIC before running a VCPU.  The LR is programmed
>>>>> +with the virtual IRQ number and the state of the interrupt (Pending,
>>>>> +Active, or Pending+Active).  When the guest ACKs and EOIs a virtual
>>>>> +interrupt, the LR state moves from Pending to Active, and finally to
>>>>> +inactive.
>>>>> +
>>>>> +The LRs include an extra bit, called the HW bit.  When this bit is set,
>>>>> +KVM must also program an additional field in the LR, the physical IRQ
>>>>> +number, to link the virtual with the physical IRQ.
>>>>> +
>>>>> +When the HW bit is set, KVM must EITHER set the Pending OR the Active
>>>>> +bit, never both at the same time.
>>>>> +
>>>>> +Setting the HW bit causes the hardware to deactivate the physical
>>>>> +interrupt on the physical distributor when the guest deactivates the
>>>>> +corresponding virtual interrupt.
>>>>> +
>>>>> +
>>>>> +Forwarded Physical Interrupts Life Cycle
>>>>> +----------------------------------------
>>>>> +
>>>>> +The state of forwarded physical interrupts is managed in the following way:
>>>>> +
>>>>> +  - The physical interrupt is acked by the host, and becomes active on
>>>>> +the physical distributor (*).
>>>>> +  - KVM sets the LR.Pending bit, because this is the only way the GICV
>>>>> +interface is going to present it to the guest.
>>>>> +  - LR.Pending will stay set as long as the guest has not acked the interrupt.
>>>>> +  - LR.Pending transitions to LR.Active on the guest read of the IAR, as
>>>>> +expected.
>>>>> +  - On guest EOI, the *physical distributor* active bit gets cleared,
>>>>> +but the LR.Active is left untouched (set).
>>>>
>>>> I tried 

[RFC PATCH v4 3/3] vfio: platform: return device properties as arrays of unsigned integers

2015-09-09 Thread Baptiste Reynal
From: Antonios Motakis 

Certain properties of a device are accessible as an array of unsigned
integers, either u64, u32, u16, or u8. Let the VFIO user query device
properties of these types.

Signed-off-by: Antonios Motakis 
Signed-off-by: Baptiste Reynal 
---
 drivers/vfio/platform/properties.c | 62 +-
 1 file changed, 61 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/platform/properties.c 
b/drivers/vfio/platform/properties.c
index 8bf9c8f..625e2d3 100644
--- a/drivers/vfio/platform/properties.c
+++ b/drivers/vfio/platform/properties.c
@@ -70,7 +70,67 @@ static int dev_property_get_uint(struct device *dev, 
uint32_t *flags,
 char *name, uint32_t type, unsigned *lenp,
 void __user *datap, unsigned long datasz)
 {
-   return -EINVAL;
+   int ret, n;
+   u8 *out;
+   size_t sz;
+   int (*func)(const struct device *, const char *, void *, size_t)
+   = NULL;
+
+   switch (type) {
+   case VFIO_DEV_PROPERTY_TYPE_U64:
+   sz = sizeof(u64);
+   func = (int (*)(const struct device *,
+   const char *, void *, size_t))
+   device_property_read_u64_array;
+   break;
+   case VFIO_DEV_PROPERTY_TYPE_U32:
+   sz = sizeof(u32);
+   func = (int (*)(const struct device *,
+   const char *, void *, size_t))
+   device_property_read_u32_array;
+   break;
+   case VFIO_DEV_PROPERTY_TYPE_U16:
+   sz = sizeof(u16);
+   func = (int (*)(const struct device *,
+   const char *, void *, size_t))
+   device_property_read_u16_array;
+   break;
+   case VFIO_DEV_PROPERTY_TYPE_U8:
+   sz = sizeof(u8);
+   func = (int (*)(const struct device *,
+   const char *, void *, size_t))
+   device_property_read_u8_array;
+   break;
+
+   default:
+   return -EINVAL;
+   }
+
+   /* get size of array */
+   n = func(dev, name, NULL, 0);
+   if (n < 0)
+   return n;
+
+   if (lenp)
+   *lenp = n * sz;
+
+   if (n * sz > datasz)
+   return -EOVERFLOW;
+
+   out = kcalloc(n, sz, GFP_KERNEL);
+   if (!out)
+   return -ENOMEM;
+
+   ret = func(dev, name, out, n);
+   if (ret)
+   goto out;
+
+   if (copy_to_user(datap, out, n * sz))
+   ret = -EFAULT;
+
+out:
+   kfree(out);
+   return ret;
 }
 
 int vfio_platform_dev_properties(struct device *dev, uint32_t *flags,
-- 
2.5.1

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
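For readers following the vfio properties patch above, the flow in dev_property_get_uint() (ask for the element count with a NULL buffer, report the byte length, check the caller's capacity, then read for real) can be sketched in user space. This is only an illustration under assumptions: fake_read_u32_array(), fake_prop and get_u32_property() are hypothetical stand-ins, not the kernel's device_property_read_u32_array() or anything from the patch, and only the u32 case is shown.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical property store standing in for a firmware node. */
static const uint32_t fake_prop[] = { 10, 20, 30 };

/*
 * Hypothetical reader: with buf == NULL it returns the element count,
 * otherwise it copies nval elements into buf and returns 0.
 */
static int fake_read_u32_array(const char *name, uint32_t *buf, size_t nval)
{
	size_t count = sizeof(fake_prop) / sizeof(fake_prop[0]);

	if (strcmp(name, "fake,prop") != 0)
		return -EINVAL;
	if (!buf)
		return (int)count;
	if (nval > count)
		return -EOVERFLOW;
	memcpy(buf, fake_prop, nval * sizeof(*buf));
	return 0;
}

/* Size query first, then capacity check, then the actual read. */
static int get_u32_property(const char *name, void *dst, size_t dstsz,
			    size_t *lenp)
{
	uint32_t *tmp;
	size_t bytes;
	int n, ret;

	assert(name != NULL && dst != NULL);

	n = fake_read_u32_array(name, NULL, 0);	/* get size of array */
	if (n < 0)
		return n;

	bytes = (size_t)n * sizeof(uint32_t);
	if (lenp)
		*lenp = bytes;		/* reported even when dst is too small */
	if (bytes > dstsz)
		return -EOVERFLOW;
	if (n == 0)
		return 0;		/* nothing to copy */

	tmp = calloc((size_t)n, sizeof(uint32_t));
	if (!tmp)
		return -ENOMEM;

	ret = fake_read_u32_array(name, tmp, (size_t)n);
	if (!ret)
		memcpy(dst, tmp, bytes);	/* copy_to_user() in the kernel */

	free(tmp);
	return ret;
}
```

Because the required length is written before the capacity check, a caller that passes a buffer that is too small still learns how many bytes to allocate for a retry.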


[PATCH v2 3/8] KVM: VMX: drop rdtscp_enabled check in prepare_vmcs02()

2015-09-09 Thread Xiao Guangrong
SECONDARY_EXEC_RDTSCP set for L2 guest comes from vmcs12

Signed-off-by: Xiao Guangrong 
---
 arch/x86/kvm/vmx.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 073cbc8..61d44b0 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -9323,8 +9323,7 @@ static void prepare_vmcs02(struct kvm_vcpu *vcpu, struct 
vmcs12 *vmcs12)
 
if (cpu_has_secondary_exec_ctrls()) {
exec_control = vmx_secondary_exec_control(vmx);
-   if (!vmx->rdtscp_enabled)
-   exec_control &= ~SECONDARY_EXEC_RDTSCP;
+
/* Take the following fields only from vmcs12 */
exec_control &= ~(SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES |
  SECONDARY_EXEC_RDTSCP |
-- 
2.4.3



[PATCH v2 4/8] KVM: VMX: simplify rdtscp handling in vmx_cpuid_update()

2015-09-09 Thread Xiao Guangrong
If vmx_rdtscp_supported() is true, SECONDARY_EXEC_RDTSCP must
already have been set in the current VMCS by
vmx_secondary_exec_control().

Signed-off-by: Xiao Guangrong 
---
 arch/x86/kvm/vmx.c | 17 -
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 61d44b0..bcc69de 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -8678,16 +8678,15 @@ static void vmx_cpuid_update(struct kvm_vcpu *vcpu)
vmx->rdtscp_enabled = false;
if (vmx_rdtscp_supported()) {
exec_control = vmcs_read32(SECONDARY_VM_EXEC_CONTROL);
-   if (exec_control & SECONDARY_EXEC_RDTSCP) {
-   best = kvm_find_cpuid_entry(vcpu, 0x80000001, 0);
-   if (best && (best->edx & bit(X86_FEATURE_RDTSCP)))
-   vmx->rdtscp_enabled = true;
-   else {
-   exec_control &= ~SECONDARY_EXEC_RDTSCP;
-   vmcs_write32(SECONDARY_VM_EXEC_CONTROL,
-   exec_control);
-   }
+   best = kvm_find_cpuid_entry(vcpu, 0x80000001, 0);
+   if (best && (best->edx & bit(X86_FEATURE_RDTSCP)))
+   vmx->rdtscp_enabled = true;
+   else {
+   exec_control &= ~SECONDARY_EXEC_RDTSCP;
+   vmcs_write32(SECONDARY_VM_EXEC_CONTROL,
+   exec_control);
}
+
if (nested && !vmx->rdtscp_enabled)
vmx->nested.nested_vmx_secondary_ctls_high &=
~SECONDARY_EXEC_RDTSCP;
-- 
2.4.3



[PATCH v2 6/8] KVM: VMX: unify SECONDARY_VM_EXEC_CONTROL update

2015-09-09 Thread Xiao Guangrong
Unify the update in vmx_cpuid_update()

Signed-off-by: Xiao Guangrong 
---
 arch/x86/kvm/vmx.c | 21 +++--
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 97e3340..5a074d0 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -8673,19 +8673,15 @@ static void vmx_cpuid_update(struct kvm_vcpu *vcpu)
 {
struct kvm_cpuid_entry2 *best;
struct vcpu_vmx *vmx = to_vmx(vcpu);
-   u32 exec_control;
+   u32 clear_exe_ctrl = 0;
 
vmx->rdtscp_enabled = false;
if (vmx_rdtscp_supported()) {
-   exec_control = vmcs_read32(SECONDARY_VM_EXEC_CONTROL);
best = kvm_find_cpuid_entry(vcpu, 0x80000001, 0);
if (best && (best->edx & bit(X86_FEATURE_RDTSCP)))
vmx->rdtscp_enabled = true;
-   else {
-   exec_control &= ~SECONDARY_EXEC_RDTSCP;
-   vmcs_write32(SECONDARY_VM_EXEC_CONTROL,
-   exec_control);
-   }
+   else
+   clear_exe_ctrl |= SECONDARY_EXEC_RDTSCP;
 
if (nested && !vmx->rdtscp_enabled)
vmx->nested.nested_vmx_secondary_ctls_high &=
@@ -8697,14 +8693,19 @@ static void vmx_cpuid_update(struct kvm_vcpu *vcpu)
if (vmx_invpcid_supported() &&
(!best || !(best->ebx & bit(X86_FEATURE_INVPCID)) ||
!guest_cpuid_has_pcid(vcpu))) {
-   exec_control = vmcs_read32(SECONDARY_VM_EXEC_CONTROL);
-   exec_control &= ~SECONDARY_EXEC_ENABLE_INVPCID;
-   vmcs_write32(SECONDARY_VM_EXEC_CONTROL, exec_control);
+   clear_exe_ctrl |= SECONDARY_EXEC_ENABLE_INVPCID;
 
if (best)
best->ebx &= ~bit(X86_FEATURE_INVPCID);
}
 
+   if (clear_exe_ctrl) {
+   u32 exec_ctl = vmcs_read32(SECONDARY_VM_EXEC_CONTROL);
+
+   exec_ctl &= ~clear_exe_ctrl;
+   vmcs_write32(SECONDARY_VM_EXEC_CONTROL, exec_ctl);
+   }
+
if (!guest_cpuid_has_pcommit(vcpu) && nested)
vmx->nested.nested_vmx_secondary_ctls_high &=
~SECONDARY_EXEC_PCOMMIT;
-- 
2.4.3
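
The idea of the patch above (collect every control bit that has to be cleared into one mask, then do a single read-modify-write) can be sketched outside the kernel. Everything here is an assumption made for illustration: struct demo_vmcs, demo_vmcs_read(), demo_vmcs_write() and the DEMO_* bit values are hypothetical and do not reflect the real VMCS accessors or the real VMX encodings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative bit positions only, not the real VMX encodings. */
#define DEMO_EXEC_RDTSCP	(1u << 3)
#define DEMO_EXEC_INVPCID	(1u << 12)

/* Hypothetical stand-in for the control field, with access counters. */
struct demo_vmcs {
	uint32_t secondary_ctl;
	unsigned int reads;
	unsigned int writes;
};

static uint32_t demo_vmcs_read(struct demo_vmcs *v)
{
	v->reads++;
	return v->secondary_ctl;
}

static void demo_vmcs_write(struct demo_vmcs *v, uint32_t val)
{
	v->writes++;
	v->secondary_ctl = val;
}

/* Accumulate the bits to clear, then touch the field at most once. */
static void demo_cpuid_update(struct demo_vmcs *v, bool guest_has_rdtscp,
			      bool guest_has_invpcid)
{
	uint32_t clear_mask = 0;

	assert(v != NULL);

	if (!guest_has_rdtscp)
		clear_mask |= DEMO_EXEC_RDTSCP;
	if (!guest_has_invpcid)
		clear_mask |= DEMO_EXEC_INVPCID;

	if (clear_mask) {
		uint32_t ctl = demo_vmcs_read(v);

		ctl &= ~clear_mask;
		demo_vmcs_write(v, ctl);
	}
}
```

With both features absent, the field is read and written exactly once instead of once per feature; with both present, it is not touched at all.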



[PATCH v2 8/8] KVM: VMX: drop rdtscp_enabled field

2015-09-09 Thread Xiao Guangrong
Check the guest CPUID bit instead of caching it in the rdtscp_enabled field

Signed-off-by: Xiao Guangrong 
---
 arch/x86/kvm/cpuid.h |  8 
 arch/x86/kvm/vmx.c   | 19 ++-
 2 files changed, 14 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
index aed7bfe..d434ee9 100644
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -141,4 +141,12 @@ static inline bool guest_cpuid_has_pcommit(struct kvm_vcpu 
*vcpu)
best = kvm_find_cpuid_entry(vcpu, 7, 0);
return best && (best->ebx & bit(X86_FEATURE_PCOMMIT));
 }
+
+static inline bool guest_cpuid_has_rdtscp(struct kvm_vcpu *vcpu)
+{
+   struct kvm_cpuid_entry2 *best;
+
+   best = kvm_find_cpuid_entry(vcpu, 0x80000001, 0);
+   return best && (best->edx & bit(X86_FEATURE_RDTSCP));
+}
 #endif
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index f18f744..2e98e6d 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -532,8 +532,6 @@ struct vcpu_vmx {
s64 vnmi_blocked_time;
u32 exit_reason;
 
-   bool rdtscp_enabled;
-
/* Posted interrupt descriptor */
struct pi_desc pi_desc;
 
@@ -2207,7 +2205,7 @@ static void setup_msrs(struct vcpu_vmx *vmx)
if (index >= 0)
move_msr_up(vmx, index, save_nmsrs++);
index = __find_msr_index(vmx, MSR_TSC_AUX);
-   if (index >= 0 && vmx->rdtscp_enabled)
+   if (index >= 0 && guest_cpuid_has_rdtscp(&vmx->vcpu))
move_msr_up(vmx, index, save_nmsrs++);
/*
 * MSR_STAR is only needed on long mode guests, and only
@@ -2674,7 +2672,7 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct 
msr_data *msr_info)
msr_info->data = vcpu->arch.ia32_xss;
break;
case MSR_TSC_AUX:
-   if (!to_vmx(vcpu)->rdtscp_enabled)
+   if (!guest_cpuid_has_rdtscp(vcpu))
return 1;
/* Otherwise falls through */
default:
@@ -2780,7 +2778,7 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct 
msr_data *msr_info)
clear_atomic_switch_msr(vmx, MSR_IA32_XSS);
break;
case MSR_TSC_AUX:
-   if (!vmx->rdtscp_enabled)
+   if (!guest_cpuid_has_rdtscp(vcpu))
return 1;
/* Check reserved bit, higher 32 bits should be zero */
if ((data >> 32) != 0)
@@ -8664,15 +8662,10 @@ static void vmx_cpuid_update(struct kvm_vcpu *vcpu)
struct vcpu_vmx *vmx = to_vmx(vcpu);
u32 clear_exe_ctrl = 0;
 
-   vmx->rdtscp_enabled = false;
-   if (vmx_rdtscp_supported()) {
-   best = kvm_find_cpuid_entry(vcpu, 0x80000001, 0);
-   if (best && (best->edx & bit(X86_FEATURE_RDTSCP)))
-   vmx->rdtscp_enabled = true;
-   else
-   clear_exe_ctrl |= SECONDARY_EXEC_RDTSCP;
+   if (vmx_rdtscp_supported() && !guest_cpuid_has_rdtscp(vcpu)) {
+   clear_exe_ctrl |= SECONDARY_EXEC_RDTSCP;
 
-   if (nested && !vmx->rdtscp_enabled)
+   if (nested)
vmx->nested.nested_vmx_secondary_ctls_high &=
~SECONDARY_EXEC_RDTSCP;
}
-- 
2.4.3
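
The change above replaces a cached boolean with a lookup that derives the answer from the guest's CPUID table each time. A rough user-space sketch of that lookup follows. The demo_* types and functions are hypothetical simplifications, not KVM's kvm_cpuid_entry2 or kvm_find_cpuid_entry(); the only architectural fact relied on is that RDTSCP is reported in CPUID leaf 0x80000001, EDX bit 27.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_CPUID_EXT_FEATURES	0x80000001u
#define DEMO_RDTSCP_EDX_BIT	(1u << 27)	/* CPUID.80000001H:EDX[27] */

/* Hypothetical, simplified guest CPUID entry and vcpu. */
struct demo_cpuid_entry {
	uint32_t function;
	uint32_t index;
	uint32_t eax, ebx, ecx, edx;
};

struct demo_vcpu {
	const struct demo_cpuid_entry *entries;
	size_t nent;
};

/* Linear search for the entry matching (function, index). */
static const struct demo_cpuid_entry *
demo_find_entry(const struct demo_vcpu *vcpu, uint32_t function,
		uint32_t index)
{
	size_t i;

	assert(vcpu != NULL);

	for (i = 0; i < vcpu->nent; i++) {
		const struct demo_cpuid_entry *e = &vcpu->entries[i];

		if (e->function == function && e->index == index)
			return e;
	}
	return NULL;
}

/* Derived on demand, so there is no cached flag to keep in sync. */
static bool demo_guest_has_rdtscp(const struct demo_vcpu *vcpu)
{
	const struct demo_cpuid_entry *best;

	best = demo_find_entry(vcpu, DEMO_CPUID_EXT_FEATURES, 0);
	return best && (best->edx & DEMO_RDTSCP_EDX_BIT);
}
```

The trade-off is a table lookup on each query in exchange for having a single source of truth; whether that cost matters on a given path is not something this sketch measures.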



[PATCH v2 0/8] KVM: x86: enable cflushopt/clwb/pcommit and simplify code

2015-09-09 Thread Xiao Guangrong
Changelog:
Thanks to Paolo's review, these are the changes in v2:
- use WARN_ON(1) instead of BUG() if a PCOMMIT exit happens for the L1 guest
- drop set_clear_2nd_exec_ctrl() and use vmcs_{set,clear}_bits instead
- improve the commit log and adjust code style

This patchset enables the clflushopt, clwb and pcommit instructions, which
are used by NVDIMM, for the guest.

The specification is located at:
https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf

Patches 1 and 2 enable these three instructions for the guest; the other
patches simplify the current VMX code.

Xiao Guangrong (8):
  KVM: x86: allow guest to use cflushopt and clwb
  KVM: x86: add pcommit support
  KVM: VMX: drop rdtscp_enabled check in prepare_vmcs02()
  KVM: VMX: simplify rdtscp handling in vmx_cpuid_update()
  KVM: VMX: simplify invpcid handling in vmx_cpuid_update()
  KVM: VMX: unify SECONDARY_VM_EXEC_CONTROL update
  KVM: VMX: clean up bit operation on SECONDARY_VM_EXEC_CONTROL
  KVM: VMX: drop rdtscp_enabled field

 arch/x86/include/asm/vmx.h  |   2 +-
 arch/x86/include/uapi/asm/vmx.h |   4 +-
 arch/x86/kvm/cpuid.c|   2 +-
 arch/x86/kvm/cpuid.h|  16 +++
 arch/x86/kvm/vmx.c  | 103 ++--
 5 files changed, 67 insertions(+), 60 deletions(-)

-- 
2.4.3



[PATCH v2 2/8] KVM: x86: add pcommit support

2015-09-09 Thread Xiao Guangrong
Pass the PCOMMIT CPU feature to the guest to enable the PCOMMIT instruction.

Currently we do not intercept the PCOMMIT instruction for the L1 guest, and
we allow L1 to intercept it for L2 if, as required by the spec,
L1 can enumerate the PCOMMIT instruction via CPUID:
| IA32_VMX_PROCBASED_CTLS2[53] (which enumerates support for the
| 1-setting of PCOMMIT exiting) is always the same as
| CPUID.07H:EBX.PCOMMIT[bit 22]. Thus, software can set PCOMMIT exiting
| to 1 if and only if the PCOMMIT instruction is enumerated via CPUID

The spec can be found at
https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf

Signed-off-by: Xiao Guangrong 
---
 arch/x86/include/asm/vmx.h  |  2 +-
 arch/x86/include/uapi/asm/vmx.h |  4 +++-
 arch/x86/kvm/cpuid.c|  2 +-
 arch/x86/kvm/cpuid.h|  8 
 arch/x86/kvm/vmx.c  | 29 -
 5 files changed, 37 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index 448b7ca..d25f32a 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -72,7 +72,7 @@
 #define SECONDARY_EXEC_SHADOW_VMCS  0x00004000
 #define SECONDARY_EXEC_ENABLE_PML   0x00020000
 #define SECONDARY_EXEC_XSAVES  0x00100000
-
+#define SECONDARY_EXEC_PCOMMIT 0x00200000
 
 #define PIN_BASED_EXT_INTR_MASK 0x00000001
 #define PIN_BASED_NMI_EXITING   0x00000008
diff --git a/arch/x86/include/uapi/asm/vmx.h b/arch/x86/include/uapi/asm/vmx.h
index 37fee27..5b15d94 100644
--- a/arch/x86/include/uapi/asm/vmx.h
+++ b/arch/x86/include/uapi/asm/vmx.h
@@ -78,6 +78,7 @@
 #define EXIT_REASON_PML_FULL62
 #define EXIT_REASON_XSAVES  63
 #define EXIT_REASON_XRSTORS 64
+#define EXIT_REASON_PCOMMIT 65
 
 #define VMX_EXIT_REASONS \
{ EXIT_REASON_EXCEPTION_NMI, "EXCEPTION_NMI" }, \
@@ -126,7 +127,8 @@
{ EXIT_REASON_INVVPID,   "INVVPID" }, \
{ EXIT_REASON_INVPCID,   "INVPCID" }, \
{ EXIT_REASON_XSAVES,"XSAVES" }, \
-   { EXIT_REASON_XRSTORS,   "XRSTORS" }
+   { EXIT_REASON_XRSTORS,   "XRSTORS" }, \
+   { EXIT_REASON_PCOMMIT,   "PCOMMIT" }
 
 #define VMX_ABORT_SAVE_GUEST_MSR_FAIL1
 #define VMX_ABORT_LOAD_HOST_MSR_FAIL 4
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 962fc7d..faeb0b3 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -348,7 +348,7 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 
*entry, u32 function,
F(FSGSBASE) | F(BMI1) | F(HLE) | F(AVX2) | F(SMEP) |
F(BMI2) | F(ERMS) | f_invpcid | F(RTM) | f_mpx | F(RDSEED) |
F(ADX) | F(SMAP) | F(AVX512F) | F(AVX512PF) | F(AVX512ER) |
-   F(AVX512CD) | F(CLFLUSHOPT) | F(CLWB);
+   F(AVX512CD) | F(CLFLUSHOPT) | F(CLWB) | F(PCOMMIT);
 
/* cpuid 0xD.1.eax */
const u32 kvm_supported_word10_x86_features =
diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
index dd05b9c..aed7bfe 100644
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -133,4 +133,12 @@ static inline bool guest_cpuid_has_mpx(struct kvm_vcpu 
*vcpu)
best = kvm_find_cpuid_entry(vcpu, 7, 0);
return best && (best->ebx & bit(X86_FEATURE_MPX));
 }
+
+static inline bool guest_cpuid_has_pcommit(struct kvm_vcpu *vcpu)
+{
+   struct kvm_cpuid_entry2 *best;
+
+   best = kvm_find_cpuid_entry(vcpu, 7, 0);
+   return best && (best->ebx & bit(X86_FEATURE_PCOMMIT));
+}
 #endif
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index da1590e..073cbc8 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2474,7 +2474,8 @@ static void nested_vmx_setup_ctls_msrs(struct vcpu_vmx 
*vmx)
SECONDARY_EXEC_APIC_REGISTER_VIRT |
SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY |
SECONDARY_EXEC_WBINVD_EXITING |
-   SECONDARY_EXEC_XSAVES;
+   SECONDARY_EXEC_XSAVES |
+   SECONDARY_EXEC_PCOMMIT;
 
if (enable_ept) {
/* nested EPT: emulate EPT also to L1 */
@@ -3015,7 +3016,8 @@ static __init int setup_vmcs_config(struct vmcs_config 
*vmcs_conf)
SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY |
SECONDARY_EXEC_SHADOW_VMCS |
SECONDARY_EXEC_XSAVES |
-   SECONDARY_EXEC_ENABLE_PML;
+   SECONDARY_EXEC_ENABLE_PML |
+   SECONDARY_EXEC_PCOMMIT;
if (adjust_vmx_controls(min2, opt2,
MSR_IA32_VMX_PROCBASED_CTLS2,
&_cpu_based_2nd_exec_control) < 0)
@@ -4570,6 +4572,9 @@ static u32 vmx_secondary_exec_control(struct vcpu_vmx 
*vmx)
/* PML is 

Re: [PATCH v2 6/8] arm/arm64: KVM: Add forwarded physical interrupts documentation

2015-09-09 Thread Christoffer Dall
On Tue, Sep 8, 2015 at 6:57 PM, Andre Przywara  wrote:
> Hi Eric,
>
> thanks for your answer.
>
> On 08/09/15 09:43, Eric Auger wrote:
>> Hi Andre,
>> On 09/07/2015 01:25 PM, Andre Przywara wrote:
>>> Hi,
>>>
>>> firstly: this text is really great, thanks for coming up with that.
>>> See below for some information I got from tracing the host which I
>>> cannot make sense of
>>>
>>>
>>> On 04/09/15 20:40, Christoffer Dall wrote:
>>>> Forwarded physical interrupts on arm/arm64 is a tricky concept and the
>>>> way we deal with them is not apparently easy to understand by reading
>>>> various specs.
>>>>
>>>> Therefore, add a proper documentation file explaining the flow and
>>>> rationale of the behavior of the vgic.
>>>>
>>>> Some of this text was contributed by Marc Zyngier and edited by me.
>>>> Omissions and errors are all mine.
>>>>
>>>> Signed-off-by: Christoffer Dall 
>>>> ---
>>>>  Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt | 181 +
>>>>  1 file changed, 181 insertions(+)
>>>>  create mode 100644 Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt
>>>>
>>>> diff --git a/Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt b/Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt
>>>> new file mode 100644
>>>> index 0000000..24b6f28
>>>> --- /dev/null
>>>> +++ b/Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt
>>>> @@ -0,0 +1,181 @@
>>>> +KVM/ARM VGIC Forwarded Physical Interrupts
>>>> +==========================================
>>>> +
>>>> +The KVM/ARM code implements software support for the ARM Generic
>>>> +Interrupt Controller's (GIC's) hardware support for virtualization by
>>>> +allowing software to inject virtual interrupts to a VM, which the guest
>>>> +OS sees as regular interrupts.  The code is famously known as the VGIC.
>>>> +
>>>> +Some of these virtual interrupts, however, correspond to physical
>>>> +interrupts from real physical devices.  One example could be the
>>>> +architected timer, which itself supports virtualization, and therefore
>>>> +lets a guest OS program the hardware device directly to raise an
>>>> +interrupt at some point in time.  When such an interrupt is raised, the
>>>> +host OS initially handles the interrupt and must somehow signal this
>>>> +event as a virtual interrupt to the guest.  Another example could be a
>>>> +passthrough device, where the physical interrupts are initially handled
>>>> +by the host, but the device driver for the device lives in the guest OS
>>>> +and KVM must therefore somehow inject a virtual interrupt on behalf of
>>>> +the physical one to the guest OS.
>>>> +
>>>> +These virtual interrupts corresponding to a physical interrupt on the
>>>> +host are called forwarded physical interrupts, but are also sometimes
>>>> +referred to as 'virtualized physical interrupts' and 'mapped interrupts'.
>>>> +
>>>> +Forwarded physical interrupts are handled slightly differently compared
>>>> +to virtual interrupts generated purely by a software emulated device.
>>>> +
>>>> +
>>>> +The HW bit
>>>> +----------
>>>> +Virtual interrupts are signalled to the guest by programming the List
>>>> +Registers (LRs) on the GIC before running a VCPU.  The LR is programmed
>>>> +with the virtual IRQ number and the state of the interrupt (Pending,
>>>> +Active, or Pending+Active).  When the guest ACKs and EOIs a virtual
>>>> +interrupt, the LR state moves from Pending to Active, and finally to
>>>> +inactive.
>>>> +
>>>> +The LRs include an extra bit, called the HW bit.  When this bit is set,
>>>> +KVM must also program an additional field in the LR, the physical IRQ
>>>> +number, to link the virtual with the physical IRQ.
>>>> +
>>>> +When the HW bit is set, KVM must EITHER set the Pending OR the Active
>>>> +bit, never both at the same time.
>>>> +
>>>> +Setting the HW bit causes the hardware to deactivate the physical
>>>> +interrupt on the physical distributor when the guest deactivates the
>>>> +corresponding virtual interrupt.
>>>> +
>>>> +
>>>> +Forwarded Physical Interrupts Life Cycle
>>>> +----------------------------------------
>>>> +
>>>> +The state of forwarded physical interrupts is managed in the following way:
>>>> +
>>>> +  - The physical interrupt is acked by the host, and becomes active on
>>>> +the physical distributor (*).
>>>> +  - KVM sets the LR.Pending bit, because this is the only way the GICV
>>>> +interface is going to present it to the guest.
>>>> +  - LR.Pending will stay set as long as the guest has not acked the interrupt.
>>>> +  - LR.Pending transitions to LR.Active on the guest read of the IAR, as
>>>> +expected.
>>>> +  - On guest EOI, the *physical distributor* active bit gets cleared,
>>>> +but the LR.Active is left untouched (set).
>>>
>>> I tried hard in the last week, but couldn't confirm this. Tracing shows
>>> the following pattern over and over (case 1):
>>> (This is the kvm/kvm.git:queue branch from last 

[RFC PATCH v4 1/3] vfio: platform: add device properties skeleton and user API

2015-09-09 Thread Baptiste Reynal
From: Antonios Motakis 

This patch introduces an API that returns device properties (OF
or ACPI) of a device bound to the vfio-platform/vfio-amba driver, and the
skeleton of the implementation for VFIO_PLATFORM. Information about any
device node bound by VFIO_PLATFORM should be queried via the newly introduced
ioctl VFIO_DEVICE_GET_DEV_PROPERTY.

The user needs to know the name and the data type of the property being
accessed.

Signed-off-by: Antonios Motakis 
Signed-off-by: Baptiste Reynal 

---
v3 -> v4:
 - added flags placeholder in vfio_dev_properties
 - ioctl returns -E2BIG if the buffer is too small
 - details VFIO_DEVICE_GET_DEV_PROPERTY documentation
---
 drivers/vfio/platform/Makefile|  3 +-
 drivers/vfio/platform/properties.c| 77 +++
 drivers/vfio/platform/vfio_platform_common.c  | 35 
 drivers/vfio/platform/vfio_platform_private.h |  7 +++
 include/uapi/linux/vfio.h | 31 +++
 5 files changed, 152 insertions(+), 1 deletion(-)
 create mode 100644 drivers/vfio/platform/properties.c

diff --git a/drivers/vfio/platform/Makefile b/drivers/vfio/platform/Makefile
index 9ce8afe..37cf5ed 100644
--- a/drivers/vfio/platform/Makefile
+++ b/drivers/vfio/platform/Makefile
@@ -1,5 +1,6 @@
 
-vfio-platform-y := vfio_platform.o vfio_platform_common.o vfio_platform_irq.o
+vfio-platform-y := vfio_platform.o vfio_platform_common.o vfio_platform_irq.o \
+  properties.o
 
 obj-$(CONFIG_VFIO_PLATFORM) += vfio-platform.o
 obj-$(CONFIG_VFIO_PLATFORM) += reset/
diff --git a/drivers/vfio/platform/properties.c 
b/drivers/vfio/platform/properties.c
new file mode 100644
index 0000000..98754c2
--- /dev/null
+++ b/drivers/vfio/platform/properties.c
@@ -0,0 +1,77 @@
+/*
+ * Copyright (C) 2015 - Virtual Open Systems
+ * Authors: Antonios Motakis 
+ *  Baptiste Reynal 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include 
+#include 
+#include 
+#include "vfio_platform_private.h"
+
+static int dev_property_get_strings(struct device *dev, uint32_t *flags,
+   char *name, unsigned *lenp,
+   void __user *datap, unsigned long datasz)
+{
+   return -EINVAL;
+}
+
+static int dev_property_get_uint(struct device *dev, uint32_t *flags,
+char *name, uint32_t type, unsigned *lenp,
+void __user *datap, unsigned long datasz)
+{
+   return -EINVAL;
+}
+
+int vfio_platform_dev_properties(struct device *dev, uint32_t *flags,
+uint32_t type, unsigned *lenp,
+void __user *datap, unsigned long datasz)
+{
+   char *name;
+   long namesz;
+   int ret;
+
+   namesz = strnlen_user(datap, datasz);
+   if (!namesz)
+   return -EFAULT;
+   if (namesz > datasz)
+   return -EINVAL;
+
+   name = kzalloc(namesz, GFP_KERNEL);
+   if (!name)
+   return -ENOMEM;
+   if (strncpy_from_user(name, datap, namesz) <= 0) {
+   kfree(name);
+   return -EFAULT;
+   }
+
+   switch (type) {
+   case VFIO_DEV_PROPERTY_TYPE_STRINGS:
+   ret = dev_property_get_strings(dev, flags, name, lenp,
+   datap, datasz);
+   break;
+
+   case VFIO_DEV_PROPERTY_TYPE_U64:
+   case VFIO_DEV_PROPERTY_TYPE_U32:
+   case VFIO_DEV_PROPERTY_TYPE_U16:
+   case VFIO_DEV_PROPERTY_TYPE_U8:
+   ret = dev_property_get_uint(dev, flags, name, type, lenp,
+   datap, datasz);
+   break;
+
+   default:
+   ret = -EINVAL;
+   }
+
+   kfree(name);
+   return ret;
+}
diff --git a/drivers/vfio/platform/vfio_platform_common.c 
b/drivers/vfio/platform/vfio_platform_common.c
index e43efb5..44ba22c 100644
--- a/drivers/vfio/platform/vfio_platform_common.c
+++ b/drivers/vfio/platform/vfio_platform_common.c
@@ -20,6 +20,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "vfio_platform_private.h"
 
@@ -302,6 +303,34 @@ static long vfio_platform_ioctl(void *device_data,
return vdev->reset(vdev);
else
return -EINVAL;
+   } else if (cmd == VFIO_DEVICE_GET_DEV_PROPERTY) {
+   struct 

[PATCH v2 1/8] KVM: x86: allow guest to use cflushopt and clwb

2015-09-09 Thread Xiao Guangrong
Pass these CPU features to guest to enable them in guest

They are needed by nvdimm drivers

Signed-off-by: Xiao Guangrong 
---
 arch/x86/kvm/cpuid.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 2fbea25..962fc7d 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -348,7 +348,7 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 
*entry, u32 function,
F(FSGSBASE) | F(BMI1) | F(HLE) | F(AVX2) | F(SMEP) |
F(BMI2) | F(ERMS) | f_invpcid | F(RTM) | f_mpx | F(RDSEED) |
F(ADX) | F(SMAP) | F(AVX512F) | F(AVX512PF) | F(AVX512ER) |
-   F(AVX512CD);
+   F(AVX512CD) | F(CLFLUSHOPT) | F(CLWB);
 
/* cpuid 0xD.1.eax */
const u32 kvm_supported_word10_x86_features =
-- 
2.4.3



[Bug 104271] New: Boot crashes on Lenovo W520 KVM and Discrete Graphic enabled in BIOS

2015-09-09 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=104271

Bug ID: 104271
   Summary: Boot crashes on Lenovo W520 KVM and Discrete Graphic
enabled in BIOS
   Product: Virtualization
   Version: unspecified
Kernel Version: 4.1.6
  Hardware: Intel
OS: Linux
  Tree: Mainline
Status: NEW
  Severity: blocking
  Priority: P1
 Component: kvm
  Assignee: virtualization_...@kernel-bugs.osdl.org
  Reporter: streibel.2...@gmx.net
Regression: No

Booting stops/crashes on a Lenovo W520 Notebook with these BIOS Settings:
- Virtualization
  Intel Virtualization Technology: Enabled
  Intel VT-d Feature: Enabled
- Display
  Boot Display Device: ThinkPad LCD
  Graphics Device: Discrete Graphics
  OS Detection for NVIDIA Optimus: Disabled

Booting starts and the last lines show that the kernel cannot access the hard
disks (lines starting with ata2 and ata1). The last lines are:

[ TIME ] Timed out waiting for device dev-disk-by\x2duuid-c30...4394b6b.device.
[ DEPEND ] Dependency failed for
/dev/disk/by-uuid/c305bd9f-9d6...7-359b94394b6b.
[ DEPEND ] Dependency failed for Swap.

After that booting stops.

Additional Information:
- This is no hardware fault. The error can be reproduced on a different W520.

- Booting also crashes with these kernels:
  -  3.16.0 (kernel from Debian Jessie)
  -  4.1.0 (backported kernel to Debian Jessie)

- Booting succeeds if:
  - both virtualization flags are set to Disabled in the BIOS
  or
  - Graphics Device is set to "Optimus" in the BIOS

-- 
You are receiving this mail because:
You are watching the assignee of the bug.


Dear webmail user.

2015-09-09 Thread Lillie Phillips
Dear webmail user.



We have a technical problem with our database at the moment. CLICK HERE to
reconfirm your webmail account once more.



Help desk


[PULL] vhost: cleanups and fixes

2015-09-09 Thread Michael S. Tsirkin
The following changes since commit 64291f7db5bd8150a74ad2036f1037e6a0428df2:

  Linux 4.2 (2015-08-30 11:34:09 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git tags/for_linus

for you to fetch changes up to 997e120843e82609c8d99a9d5714e6cf91e14cbe:

  virtio_balloon: do not change memory amount visible via /proc/meminfo (2015-09-08 13:32:11 +0300)


virtio: fixes and features 4.3

virtio-mmio can now be auto-loaded through acpi.
virtio blk supports extended partitions.
total memory is better reported when using virtio balloon with auto-deflate.
cache control is re-enabled when using virtio-blk in modern mode.

Signed-off-by: Michael S. Tsirkin 


Denis V. Lunev (2):
  virtio_ballon: change stub of release_pages_by_pfn
  virtio_balloon: do not change memory amount visible via /proc/meminfo

Fam Zheng (1):
  virtio-blk: Allow extended partitions

Graeme Gregory (1):
  virtio_mmio: add ACPI probing

Paolo Bonzini (1):
  virtio-blk: use VIRTIO_BLK_F_WCE and VIRTIO_BLK_F_CONFIG_WCE in virtio1

 drivers/block/virtio_blk.c  |  6 +++---
 drivers/virtio/virtio_balloon.c | 16 ++--
 drivers/virtio/virtio_mmio.c| 10 ++
 3 files changed, 23 insertions(+), 9 deletions(-)


[Bug 104271] Boot crashes on Lenovo W520 with KVM and Discrete Graphic enabled in BIOS

2015-09-09 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=104271

Hans Streibel  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|Boot crashes on Lenovo W520 |Boot crashes on Lenovo W520
                   |KVM and Discrete Graphic    |with KVM and Discrete
                   |enabled in BIOS             |Graphic enabled in BIOS



[PATCH v2 5/8] KVM: VMX: simplify invpcid handling in vmx_cpuid_update()

2015-09-09 Thread Xiao Guangrong
If vmx_invpcid_supported() is true, the secondary execution control
field must be supported and SECONDARY_EXEC_ENABLE_INVPCID must already
have been set in the current vmcs by vmx_secondary_exec_control().

If vmx_invpcid_supported() is false, there is no need to clear
SECONDARY_EXEC_ENABLE_INVPCID.

Signed-off-by: Xiao Guangrong 
---
 arch/x86/kvm/vmx.c | 17 +
 1 file changed, 5 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index bcc69de..97e3340 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -8695,19 +8695,12 @@ static void vmx_cpuid_update(struct kvm_vcpu *vcpu)
/* Exposing INVPCID only when PCID is exposed */
best = kvm_find_cpuid_entry(vcpu, 0x7, 0);
if (vmx_invpcid_supported() &&
-   best && (best->ebx & bit(X86_FEATURE_INVPCID)) &&
-   guest_cpuid_has_pcid(vcpu)) {
+   (!best || !(best->ebx & bit(X86_FEATURE_INVPCID)) ||
+   !guest_cpuid_has_pcid(vcpu))) {
exec_control = vmcs_read32(SECONDARY_VM_EXEC_CONTROL);
-   exec_control |= SECONDARY_EXEC_ENABLE_INVPCID;
-   vmcs_write32(SECONDARY_VM_EXEC_CONTROL,
-exec_control);
-   } else {
-   if (cpu_has_secondary_exec_ctrls()) {
-   exec_control = vmcs_read32(SECONDARY_VM_EXEC_CONTROL);
-   exec_control &= ~SECONDARY_EXEC_ENABLE_INVPCID;
-   vmcs_write32(SECONDARY_VM_EXEC_CONTROL,
-exec_control);
-   }
+   exec_control &= ~SECONDARY_EXEC_ENABLE_INVPCID;
+   vmcs_write32(SECONDARY_VM_EXEC_CONTROL, exec_control);
+
if (best)
best->ebx &= ~bit(X86_FEATURE_INVPCID);
}
-- 
2.4.3
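
The reasoning in the commit message can be checked with a small user-space
sketch. This is not kernel code: it models only the final state of the
SECONDARY_EXEC_ENABLE_INVPCID bit and assumes, as the commit message states,
that the bit starts out set whenever vmx_invpcid_supported() is true. The
function names and the three boolean inputs are hypothetical stand-ins, and
the CPUID adjustment done on best->ebx is not modelled.

```c
#include <stdbool.h>

/*
 * Hypothetical model of the INVPCID enable bit.
 *   hw_supported  - stands in for vmx_invpcid_supported()
 *   cpuid_invpcid - stands in for "best && (best->ebx & INVPCID)"
 *   guest_pcid    - stands in for guest_cpuid_has_pcid(vcpu)
 */

/* Old flow: set the bit when every condition holds, otherwise clear it. */
static bool invpcid_bit_old(bool hw_supported, bool cpuid_invpcid,
			    bool guest_pcid)
{
	bool bit = hw_supported;	/* assumed initial value */

	if (hw_supported && cpuid_invpcid && guest_pcid)
		bit = true;
	else
		bit = false;
	return bit;
}

/* New flow: clear only when hardware has it but the guest must not see it. */
static bool invpcid_bit_new(bool hw_supported, bool cpuid_invpcid,
			    bool guest_pcid)
{
	bool bit = hw_supported;	/* assumed initial value */

	if (hw_supported && (!cpuid_invpcid || !guest_pcid))
		bit = false;
	return bit;
}
```

Under that assumption about the initial value, both functions agree for all
eight input combinations, which is the point the commit message makes.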



[PATCH v2 7/8] KVM: VMX: clean up bit operation on SECONDARY_VM_EXEC_CONTROL

2015-09-09 Thread Xiao Guangrong
Use vmcs_set_bits() and vmcs_clear_bits() to clean up the code

Signed-off-by: Xiao Guangrong 
---
 arch/x86/kvm/vmx.c | 31 ---
 1 file changed, 8 insertions(+), 23 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 5a074d0..f18f744 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -6643,7 +6643,6 @@ static int nested_vmx_check_permission(struct kvm_vcpu 
*vcpu)
 
 static inline void nested_release_vmcs12(struct vcpu_vmx *vmx)
 {
-   u32 exec_control;
if (vmx->nested.current_vmptr == -1ull)
return;
 
@@ -6656,9 +6655,8 @@ static inline void nested_release_vmcs12(struct vcpu_vmx 
*vmx)
   they were modified */
copy_shadow_to_vmcs12(vmx);
vmx->nested.sync_shadow_vmcs = false;
-   exec_control = vmcs_read32(SECONDARY_VM_EXEC_CONTROL);
-   exec_control &= ~SECONDARY_EXEC_SHADOW_VMCS;
-   vmcs_write32(SECONDARY_VM_EXEC_CONTROL, exec_control);
+   vmcs_clear_bits(SECONDARY_VM_EXEC_CONTROL,
+   SECONDARY_EXEC_SHADOW_VMCS);
vmcs_write64(VMCS_LINK_POINTER, -1ull);
}
vmx->nested.posted_intr_nv = -1;
@@ -7054,7 +7052,6 @@ static int handle_vmptrld(struct kvm_vcpu *vcpu)
 {
struct vcpu_vmx *vmx = to_vmx(vcpu);
gpa_t vmptr;
-   u32 exec_control;
 
if (!nested_vmx_check_permission(vcpu))
return 1;
@@ -7086,9 +7083,8 @@ static int handle_vmptrld(struct kvm_vcpu *vcpu)
vmx->nested.current_vmcs12 = new_vmcs12;
vmx->nested.current_vmcs12_page = page;
if (enable_shadow_vmcs) {
-   exec_control = vmcs_read32(SECONDARY_VM_EXEC_CONTROL);
-   exec_control |= SECONDARY_EXEC_SHADOW_VMCS;
-   vmcs_write32(SECONDARY_VM_EXEC_CONTROL, exec_control);
+   vmcs_set_bits(SECONDARY_VM_EXEC_CONTROL,
+ SECONDARY_EXEC_SHADOW_VMCS);
vmcs_write64(VMCS_LINK_POINTER,
 __pa(vmx->nested.current_shadow_vmcs));
vmx->nested.sync_shadow_vmcs = true;
@@ -7598,7 +7594,6 @@ static void vmx_get_exit_info(struct kvm_vcpu *vcpu, u64 
*info1, u64 *info2)
 static int vmx_enable_pml(struct vcpu_vmx *vmx)
 {
struct page *pml_pg;
-   u32 exec_control;
 
pml_pg = alloc_page(GFP_KERNEL | __GFP_ZERO);
if (!pml_pg)
@@ -7609,24 +7604,18 @@ static int vmx_enable_pml(struct vcpu_vmx *vmx)
vmcs_write64(PML_ADDRESS, page_to_phys(vmx->pml_pg));
vmcs_write16(GUEST_PML_INDEX, PML_ENTITY_NUM - 1);
 
-   exec_control = vmcs_read32(SECONDARY_VM_EXEC_CONTROL);
-   exec_control |= SECONDARY_EXEC_ENABLE_PML;
-   vmcs_write32(SECONDARY_VM_EXEC_CONTROL, exec_control);
+   vmcs_set_bits(SECONDARY_VM_EXEC_CONTROL, SECONDARY_EXEC_ENABLE_PML);
 
return 0;
 }
 
 static void vmx_disable_pml(struct vcpu_vmx *vmx)
 {
-   u32 exec_control;
-
ASSERT(vmx->pml_pg);
__free_page(vmx->pml_pg);
vmx->pml_pg = NULL;
 
-   exec_control = vmcs_read32(SECONDARY_VM_EXEC_CONTROL);
-   exec_control &= ~SECONDARY_EXEC_ENABLE_PML;
-   vmcs_write32(SECONDARY_VM_EXEC_CONTROL, exec_control);
+   vmcs_clear_bits(SECONDARY_VM_EXEC_CONTROL, SECONDARY_EXEC_ENABLE_PML);
 }
 
 static void vmx_flush_pml_buffer(struct kvm_vcpu *vcpu)
@@ -8699,12 +8688,8 @@ static void vmx_cpuid_update(struct kvm_vcpu *vcpu)
best->ebx &= ~bit(X86_FEATURE_INVPCID);
}
 
-   if (clear_exe_ctrl) {
-   u32 exec_ctl = vmcs_read32(SECONDARY_VM_EXEC_CONTROL);
-
-   exec_ctl &= ~clear_exe_ctrl;
-   vmcs_write32(SECONDARY_VM_EXEC_CONTROL, exec_ctl);
-   }
+   if (clear_exe_ctrl)
+   vmcs_clear_bits(SECONDARY_VM_EXEC_CONTROL, clear_exe_ctrl);
 
if (!guest_cpuid_has_pcommit(vcpu) && nested)
vmx->nested.nested_vmx_secondary_ctls_high &=
-- 
2.4.3
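
For readers unfamiliar with the pattern this patch removes, here is a minimal
user-space sketch of what set-bits and clear-bits helpers do: they wrap the
read, modify, write sequence that was previously open-coded at each call site.
The single static variable and all field_* names are hypothetical and only
stand in for one 32-bit VMCS field; the real helpers take a field identifier.

```c
#include <stdint.h>

/* Hypothetical stand-in for one 32-bit control field. */
static uint32_t fake_field;

static uint32_t field_read32(void)
{
	return fake_field;
}

static void field_write32(uint32_t value)
{
	fake_field = value;
}

/* Read the field, OR in the mask, write it back. */
static void field_set_bits(uint32_t mask)
{
	field_write32(field_read32() | mask);
}

/* Read the field, mask the bits out, write it back. */
static void field_clear_bits(uint32_t mask)
{
	field_write32(field_read32() & ~mask);
}
```

Each three-line sequence on a local exec_control variable collapses into one
call, which is why the local variables can be dropped in the hunks above.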



[RFC PATCH v4 2/3] vfio: platform: access device property as a list of strings

2015-09-09 Thread Baptiste Reynal
From: Antonios Motakis 

Certain device properties (e.g. the device node name, the compatible
string) are available as a list of strings, each terminated by a null
character. Let the VFIO user query this type of property.

Signed-off-by: Antonios Motakis 
Signed-off-by: Baptiste Reynal 

---
v3 -> v4:
 - The list length is computed before the strings are copied. If the
   entire list doesn't fit, no strings are copied to the user.
---
 drivers/vfio/platform/properties.c | 43 +-
 1 file changed, 42 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/platform/properties.c 
b/drivers/vfio/platform/properties.c
index 98754c2..8bf9c8f 100644
--- a/drivers/vfio/platform/properties.c
+++ b/drivers/vfio/platform/properties.c
@@ -22,7 +22,48 @@ static int dev_property_get_strings(struct device *dev, 
uint32_t *flags,
char *name, unsigned *lenp,
void __user *datap, unsigned long datasz)
 {
-   return -EINVAL;
+   const char **val;
+   int n, i, ret;
+
+   if (lenp == NULL)
+   return -EFAULT;
+
+   *lenp = 0;
+
+   n = device_property_read_string_array(dev, name, NULL, 0);
+   if (n < 0)
+   return n;
+
+   val = kcalloc(n, sizeof(char *), GFP_KERNEL);
+   if (!val)
+   return -ENOMEM;
+
+   ret = device_property_read_string_array(dev, name, val, n);
+   if (ret < 0)
+   goto out;
+
+   for (i = 0; i < n; i++)
+   *lenp += strlen(val[i]) + 1;
+
+   if (datasz < *lenp) {
+   ret = -E2BIG;
+   goto out;
+   }
+
+   for (i = 0; i < n; i++) {
+   size_t len = strlen(val[i]) + 1;
+
+   if (copy_to_user(datap, val[i], strlen(val[i]) + 1)) {
+   ret = -EFAULT;
+   goto out;
+   }
+
+   datap += len;
+   }
+
+out:
+   kfree(val);
+   return ret;
 }
 
 static int dev_property_get_uint(struct device *dev, uint32_t *flags,
-- 
2.5.1
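
The "compute the length first, then copy all or nothing" behaviour described
in the v4 changelog can be sketched in plain user-space C as follows. This is
only an illustration: pack_string_list() and its parameters are hypothetical,
memcpy() stands in for copy_to_user(), and -1 stands in for -E2BIG.

```c
#include <stddef.h>
#include <string.h>

/*
 * Pack n strings back to back, each followed by its NUL terminator.
 * *needed always receives the total size required. Nothing is copied
 * unless the whole list fits in buf.
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int pack_string_list(const char *const *strs, size_t n,
			    char *buf, size_t bufsz, size_t *needed)
{
	size_t total = 0, off = 0, i;

	/* First pass: compute the full length, terminators included. */
	for (i = 0; i < n; i++)
		total += strlen(strs[i]) + 1;
	*needed = total;

	if (bufsz < total)
		return -1;

	/* Second pass: copy each string and advance the output offset. */
	for (i = 0; i < n; i++) {
		size_t len = strlen(strs[i]) + 1;

		memcpy(buf + off, strs[i], len);
		off += len;
	}
	return 0;
}
```

Reporting the required size even on failure lets a caller retry with a buffer
of the right size.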



Re: [PATCH v3 07/10] KVM: arm/arm64: vgic: Allow HW interrupts for non-shared devices

2015-09-09 Thread Eric Auger
Hi Christoffer,
On 09/02/2015 09:42 PM, Christoffer Dall wrote:
> On Mon, Aug 10, 2015 at 03:21:01PM +0200, Eric Auger wrote:
>> From: Marc Zyngier 
>>
>> So far, the only use of the HW interrupt facility was the timer,
>> implying that the active state is context-switched for each vcpu,
>> as the device is shared across all vcpus.
>>
>> This does not work for a device that has been assigned to a VM,
>> as the guest is entirely in control of that device (the HW is
>> not shared). In that case, it makes sense to bypass the whole
>> active state switching.
>>
>> Also the VGIC state machine is adapted to support those assigned
>> (non shared) HW IRQs:
>> - the IRQ can only be sampled when it is pending
>> - when queueing the IRQ (programming the LR), the pending state is
>>   removed as for edge sensitive IRQs
>> - queued state is not modelled. Level state is not modelled
>> - its injection is always valid since it stems from the HW.
>>
>> Signed-off-by: Marc Zyngier 
>> Signed-off-by: Eric Auger 
>>
>> ---
>>
>> - a mix of
>>   [PATCH v4 11/11] KVM: arm/arm64: vgic: Allow HW interrupts for
>>non-shared devices
>>   [RFC v2 2/4] KVM: arm: vgic: fix state machine for forwarded IRQ
>> ---
>>  include/kvm/arm_vgic.h|  6 +++--
>>  virt/kvm/arm/arch_timer.c |  3 ++-
>>  virt/kvm/arm/vgic.c   | 58 
>> +++
>>  3 files changed, 49 insertions(+), 18 deletions(-)
>>
>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
>> index d901f1a..7ef9ce0 100644
>> --- a/include/kvm/arm_vgic.h
>> +++ b/include/kvm/arm_vgic.h
>> @@ -163,7 +163,8 @@ struct irq_phys_map {
>>  u32 virt_irq;
>>  u32 phys_irq;
>>  u32 irq;
>> -boolactive;
>> +boolshared;
>> +boolactive; /* Only valid if shared */
>>  };
>>  
>>  struct irq_phys_map_entry {
>> @@ -356,7 +357,8 @@ void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 
>> reg);
>>  int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
>>  int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
>>  struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
>> -   int virt_irq, int irq);
>> +   int virt_irq, int irq,
>> +   bool shared);
>>  int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, struct irq_phys_map 
>> *map);
>>  bool kvm_vgic_get_phys_irq_active(struct irq_phys_map *map);
>>  void kvm_vgic_set_phys_irq_active(struct irq_phys_map *map, bool active);
>> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
>> index 76e38d2..db21d8f 100644
>> --- a/virt/kvm/arm/arch_timer.c
>> +++ b/virt/kvm/arm/arch_timer.c
>> @@ -203,7 +203,8 @@ int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
>>   * Tell the VGIC that the virtual interrupt is tied to a
>>   * physical interrupt. We do that once per VCPU.
>>   */
>> -map = kvm_vgic_map_phys_irq(vcpu, irq->irq, host_vtimer_irq);
>> +map = kvm_vgic_map_phys_irq(vcpu, irq->irq,
>> +host_vtimer_irq, true);
>>  if (WARN_ON(IS_ERR(map)))
>>  return PTR_ERR(map);
>>  
>> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
>> index 9eb489a..fbd5ba5 100644
>> --- a/virt/kvm/arm/vgic.c
>> +++ b/virt/kvm/arm/vgic.c
>> @@ -400,7 +400,11 @@ void vgic_cpu_irq_clear(struct kvm_vcpu *vcpu, int irq)
>>  
>>  static bool vgic_can_sample_irq(struct kvm_vcpu *vcpu, int irq)
>>  {
>> -return !vgic_irq_is_queued(vcpu, irq);
>> +struct irq_phys_map *map = vgic_irq_map_search(vcpu, irq);
>> +bool shared_hw = map && !map->shared;
> 
> why is shared true when map->shared is false?
> 
>> +
>> +return !vgic_irq_is_queued(vcpu, irq) ||
>> +(shared_hw && vgic_dist_irq_is_pending(vcpu, irq));
> 
> so for forwarded, non-shared, level-triggered IRQs, we always sample the
> line if it's pending?  Why?
I tried to integrate non-shared mapped IRQs into the updated state
machine, but I failed.

1) The first problem encountered is how to reset the level of the IRQ
(since its completion is not trapped). I added this reset in
process_queued_irq. I think this was the most natural place, since at
sync time we become aware that the IRQ is deactivated at physical
distributor level. However, I observe failures in
vgic_validate_injection. I think this is due to a race between
update_irq_pending and sync. As soon as the guest EOIs the virtual IRQ
(and also the pIRQ), a new physical IRQ hits and gets injected by irqfd.
This injection can happen before the sync. So I would be tempted to keep
my current strategy of ignoring validate_injection in the case of
non-shared mapped IRQs and not modelling the level state. The vIRQ comes
directly from the HW so it must be valid
(guest deactivated the 

KVM call for agenda for 2015-09-15

2015-09-09 Thread Juan Quintela

Hi

Please, send any topic that you are interested in covering.

At the end of Monday I will send an email with the agenda or the
cancellation of the call, so hurry up.

Following discussions at the QEMU Summit, we are going to always keep a
KVM call open to which you can add topics.

 Call details:

By popular demand, a google calendar public entry with it

  
https://www.google.com/calendar/embed?src=dG9iMXRqcXAzN3Y4ZXZwNzRoMHE4a3BqcXNAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ

(Let me know if you have any problems with the calendar entry.  I just
gave up on getting CEST, CET, EDT and DST right at the same time).

If you need phone number details,  contact me privately

Thanks, Juan.


Re: Who wants to mentor for Outreachy Dec-Mar?

2015-09-09 Thread Michal Privoznik
On 09.09.2015 12:28, Stefan Hajnoczi wrote:
> We are now looking for mentors for the next round of Outreachy running
> from December 7, 2015 to March 7, 2016.  I have set up a wiki page
> here:
> http://qemu-project.org/Outreachy_2015_DecemberMarch

I've copied over unused projects from GSoC which I'm willing to mentor.
Others are welcome to mentor too ;-)

Michal


RE: [PATCH v3 2/5] KVM: arm64: Implement vGICv3 distributor and redistributor access from userspace

2015-09-09 Thread Pavel Fedin
 Hello Andre! I haven't heard from you on any of my questions.
Nevertheless, I would like to inform you (and other interested people)
that I'm leaving on vacation and will be back in 2 weeks, just in case
someone writes to me and I don't reply.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia


> -Original Message-
> From: kvm-ow...@vger.kernel.org [mailto:kvm-ow...@vger.kernel.org] On Behalf 
> Of Pavel Fedin
> Sent: Monday, September 07, 2015 10:57 AM
> To: 'Andre Przywara'
> Cc: kvm...@lists.cs.columbia.edu; kvm@vger.kernel.org; 'Marc Zyngier'; 'Peter 
> Maydell'
> Subject: RE: [PATCH v3 2/5] KVM: arm64: Implement vGICv3 distributor and 
> redistributor access
> from userspace
> 
>  Hello!
> 
> > > --- a/virt/kvm/arm/vgic.c
> > > +++ b/virt/kvm/arm/vgic.c
> > > @@ -2468,6 +2468,7 @@ int vgic_attr_regs_access(struct kvm_device *dev,
> > >* access. For 64-bit registers we have to split up the operation.
> > >*/
> > >   mmio->len = sizeof(u32);
> > > + mmio->private = vcpu; /* For redistributor handlers */
> >
> > I guess this can be moved into the caller and then you can drop the vcpu
> > parameter and use private here instead, no?
> 
>  No because 'vcpu' is not a parameter. It is figured out in the middle of the 
> function, under
> dev->kvm->lock mutex, out of 'cpuid' index.
> 
> Kind regards,
> Pavel Fedin
> Expert Engineer
> Samsung Electronics Research center Russia
> 
> 



Who wants to mentor for Outreachy Dec-Mar?

2015-09-09 Thread Stefan Hajnoczi
We are now looking for mentors for the next round of Outreachy running
from December 7, 2015 to March 7, 2016.  I have set up a wiki page
here:
http://qemu-project.org/Outreachy_2015_DecemberMarch

Our communities have participated in previous years to mentor people
from underrepresented groups and help them get involved in open source
software.  To learn more, see the Outreachy website:
https://www.gnome.org/outreachy/

If you are a regular contributor to QEMU, libvirt, or the KVM kernel
module then you can become an Outreachy mentor.  Information on what's
involved is here:
https://wiki.gnome.org/Outreachy/Admin/InfoForMentors

Mentoring summary:

1. Post your project ideas here:
http://qemu-project.org/Outreachy_2015_DecemberMarch

2. You give each applicant a different small task so they can submit a
patch upstream.  You also interview promising candidates on IRC to get
a better picture.  Then you select a candidate you wish to work with
(or none).

3. Requires 5 hours/week from December 2015 to March 2016 to mentor
your intern, review their code, answer their questions, etc.

If you'd like to become a mentor, please let me know.


We are also looking for sponsors who wish to fund Outreachy interns
for QEMU, libvirt, and the KVM kernel module.  The sponsorship for one
intern is $6,500.  Learn more about sponsorship:
https://wiki.gnome.org/Outreachy/Admin/InfoForOrgs

Stefan


Re: Who wants to mentor for Outreachy Dec-Mar?

2015-09-09 Thread Stefan Hajnoczi
On Wed, Sep 9, 2015 at 12:59 PM, Michal Privoznik  wrote:
> On 09.09.2015 12:28, Stefan Hajnoczi wrote:
>> We are now looking for mentors for the next round of Outreachy running
>> from December 7, 2015 to March 7, 2016.  I have set up a wiki page
>> here:
>> http://qemu-project.org/Outreachy_2015_DecemberMarch
>
> I've copied over unused projects from GSoC which I'm willing to mentor.
> Others are welcomed to mentor too ;-)

Great, since there is interest I will start looking if we can secure
sponsorship.

We need to find sponsors for the Outreachy interns that we take.
Unlike GSoC, there is not automatically funding available for all
interns.

Stefan


Re: [PATCH v7 0/3] KVM: Dynamic Halt-Polling

2015-09-09 Thread Christian Borntraeger
On 03.09.2015 at 16:07, Wanpeng Li wrote:
> v6 -> v7:
>  * explicit signal (set a bool)
>  * fix the tracepoint 
> 
> v5 -> v6:
>  * fix wait_ns and poll_ns 
> 
> v4 -> v5:
>  * set base case 10us and max poll time 500us
>  * handle short/long halt, idea from David, many thanks David 
> 
> v3 -> v4:
>  * bring back grow vcpu->halt_poll_ns when interrupt arrives and shrinks
>when idle VCPU is detected 
> 
> v2 -> v3:
>  * grow/shrink vcpu->halt_poll_ns by *halt_poll_ns_grow or 
> /halt_poll_ns_shrink
>  * drop the macros and hard coding the numbers in the param definitions
>  * update the comments "5-7 us"
>  * remove halt_poll_ns_max and use halt_poll_ns as the max halt_poll_ns time,
>vcpu->halt_poll_ns start at zero
>  * drop the wrappers 
>  * move the grow/shrink logic before "out:" w/ "if (waited)"
> 
> v1 -> v2:
>  * change kvm_vcpu_block to read halt_poll_ns from the vcpu instead of 
>the module parameter
>  * use the shrink/grow matrix which is suggested by David
>  * set halt_poll_ns_max to 2ms
> 
> The downside of always-poll is that polling still happens for idle
> vCPUs, which wastes CPU time. This patchset adds the ability to adjust
> halt_poll_ns dynamically: halt_poll_ns grows when a short halt is
> detected and shrinks when a long halt is detected.
> 
> There are two new kernel parameters for changing the halt_poll_ns:
> halt_poll_ns_grow and halt_poll_ns_shrink. 
> 
>                          no-poll   always-poll   dynamic-poll
> ---------------------------------------------------------------
> Idle (nohz) vCPU %c0     0.15%     0.3%          0.2%
> Idle (250HZ) vCPU %c0    1.1%      4.6%~14%      1.2%
> TCP_RR latency           34us      27us          26.7us
> 
> "Idle (X) vCPU %c0" is the percent of time the physical cpu spent in
> c0 over 60 seconds (each vCPU is pinned to a pCPU). (nohz) means the
> guest was tickless. (250HZ) means the guest was ticking at 250HZ.
> 
> The big win is with ticking operating systems. Running the linux guest
> with nohz=off (and HZ=250), we save 3.4%~12.8% CPUs/second and get close 
> to no-polling overhead levels by using the dynamic-poll. The savings
> should be even higher for higher frequency ticks.
> 
> Wanpeng Li (3):
>   KVM: make halt_poll_ns per-vCPU
>   KVM: dynamic halt-polling
>   KVM: trace kvm_halt_poll_ns grow/shrink
> 
>  include/linux/kvm_host.h   |  1 +
>  include/trace/events/kvm.h | 30 +++
>  virt/kvm/kvm_main.c| 72 
> ++
>  3 files changed, 97 insertions(+), 6 deletions(-)
> 

I get some nice improvements for uperf between 2 guests, but there is one "bug":
If there is already some polling ongoing, it's impossible to disable the polling;
even a
 echo 0 >  /sys/module/kvm/parameters/halt_poll_ns
is ignored (as vcpu->halt_poll_ns is used for the check).
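
To make the grow/shrink scheme from the cover letter concrete, here is a rough
user-space model. It is an assumption about the structure, not the code from
the series: the toy_* names and the struct are hypothetical, and the parameter
values are only illustrative (the 10us base case and the use of halt_poll_ns
as the maximum are taken from the changelog above).

```c
#include <stdbool.h>

/* Illustrative stand-ins for the module parameters; values are assumptions. */
static unsigned int halt_poll_ns = 500000;	/* global upper bound, in ns */
static unsigned int halt_poll_ns_grow = 2;	/* multiply on a short halt */
static unsigned int halt_poll_ns_shrink = 2;	/* divide on a long halt */

/* Hypothetical per-vCPU state. */
struct toy_vcpu {
	unsigned int halt_poll_ns;
};

/* Short halt detected: start at the base case or multiply, then clamp. */
static void toy_grow(struct toy_vcpu *vcpu)
{
	unsigned int val = vcpu->halt_poll_ns;

	if (val == 0 && halt_poll_ns_grow)
		val = 10000;		/* 10us base case */
	else
		val *= halt_poll_ns_grow;

	if (val > halt_poll_ns)
		val = halt_poll_ns;	/* clamp to the global maximum */
	vcpu->halt_poll_ns = val;
}

/* Long halt detected: divide, or reset to zero if the divisor is zero. */
static void toy_shrink(struct toy_vcpu *vcpu)
{
	if (halt_poll_ns_shrink == 0)
		vcpu->halt_poll_ns = 0;
	else
		vcpu->halt_poll_ns /= halt_poll_ns_shrink;
}

/* The polling decision looks only at the per-vCPU value. */
static bool toy_should_poll(const struct toy_vcpu *vcpu)
{
	return vcpu->halt_poll_ns != 0;
}
```

In this model, setting the global halt_poll_ns to 0 does not stop polling
right away, because the decision reads the per-vCPU value; the new limit only
applies the next time that value is grown and clamped. Whether the series
behaves exactly like this is not confirmed here.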



[Bug 104271] Boot crashes on Lenovo W520 with KVM and Discrete Graphic enabled in BIOS

2015-09-09 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=104271

Huaitong Han  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |oen...@gmail.com

--- Comment #1 from Huaitong Han  ---
Does it work well on Windows? I guess it's a BIOS problem. Finding an available
disk device is the job of the BIOS, and the problem occurs right when you
change the BIOS settings.



Re: [PATCH] powerpc/e500: move qemu machine spec together with the rest

2015-09-09 Thread Scott Wood
On Fri, 2015-09-04 at 15:46 +0300, Laurentiu Tudor wrote:
> This way we get rid of an entire file with mostly
> duplicated code plus a Kconfig option that you always
> had to take care to check it in order for kvm to work.
> 
> Signed-off-by: Laurentiu Tudor 
> ---
>  arch/powerpc/platforms/85xx/Kconfig   | 15 -
>  arch/powerpc/platforms/85xx/Makefile  |  1 -
>  arch/powerpc/platforms/85xx/corenet_generic.c |  1 +
>  arch/powerpc/platforms/85xx/qemu_e500.c   | 85 


qemu_e500 is not only for corenet chips.  We can add it to the defconfig (in 
fact I've been meaning to do so).

> -static void __init qemu_e500_setup_arch(void)
> -{
> - ppc_md.progress("qemu_e500_setup_arch()", 0);
> -
> - fsl_pci_assign_primary();
> - swiotlb_detect_4g();

Where is fsl_pci_assign_primary() in corenet_generic.c?  At one point this 
was needed for QEMU's PCI implementation -- have you tested QEMU PCI without 
it?

-Scott

--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v7 0/3] KVM: Dynamic Halt-Polling

2015-09-09 Thread Wanpeng Li

On 9/9/15 9:39 PM, Christian Borntraeger wrote:

On 03.09.2015 at 16:07, Wanpeng Li wrote:

v6 -> v7:
  * explicit signal (set a bool)
  * fix the tracepoint

v5 -> v6:
  * fix wait_ns and poll_ns

v4 -> v5:
  * set base case 10us and max poll time 500us
  * handle short/long halt, idea from David, many thanks David

v3 -> v4:
  * bring back grow vcpu->halt_poll_ns when interrupt arrives and shrinks
when idle VCPU is detected

v2 -> v3:
  * grow/shrink vcpu->halt_poll_ns by *halt_poll_ns_grow or /halt_poll_ns_shrink
  * drop the macros and hard coding the numbers in the param definitions
  * update the comments "5-7 us"
  * remove halt_poll_ns_max and use halt_poll_ns as the max halt_poll_ns time,
vcpu->halt_poll_ns start at zero
  * drop the wrappers
  * move the grow/shrink logic before "out:" w/ "if (waited)"

v1 -> v2:
  * change kvm_vcpu_block to read halt_poll_ns from the vcpu instead of
the module parameter
  * use the shrink/grow matrix which is suggested by David
  * set halt_poll_ns_max to 2ms

The downside of always-poll is that polling still happens for idle
vCPUs, which wastes CPU time. This patchset adds the ability to adjust
halt_poll_ns dynamically: halt_poll_ns grows when a short halt is
detected and shrinks when a long halt is detected.

There are two new kernel parameters for changing the halt_poll_ns:
halt_poll_ns_grow and halt_poll_ns_shrink.

                         no-poll   always-poll   dynamic-poll
---------------------------------------------------------------
Idle (nohz) vCPU %c0     0.15%     0.3%          0.2%
Idle (250HZ) vCPU %c0    1.1%      4.6%~14%      1.2%
TCP_RR latency           34us      27us          26.7us

"Idle (X) vCPU %c0" is the percent of time the physical cpu spent in
c0 over 60 seconds (each vCPU is pinned to a pCPU). (nohz) means the
guest was tickless. (250HZ) means the guest was ticking at 250HZ.

The big win is with ticking operating systems. Running the linux guest
with nohz=off (and HZ=250), we save 3.4%~12.8% CPUs/second and get close
to no-polling overhead levels by using the dynamic-poll. The savings
should be even higher for higher frequency ticks.

Wanpeng Li (3):
   KVM: make halt_poll_ns per-vCPU
   KVM: dynamic halt-polling
   KVM: trace kvm_halt_poll_ns grow/shrink

  include/linux/kvm_host.h   |  1 +
  include/trace/events/kvm.h | 30 +++
  virt/kvm/kvm_main.c| 72 ++
  3 files changed, 97 insertions(+), 6 deletions(-)


I get some nice improvements for uperf between 2 guests,


Good to hear that.


but there is one "bug":
If there is already some polling ongoing, it's impossible to disable the polling,


The polling will stop if a long halt is detected, so there is no need for
manual tuning, just as the dynamic PLE window can detect false positives
and adjust the PLE window accordingly.


Regards,
Wanpeng Li


Re: [PATCH] kvmtool Makefile: relax arm test

2015-09-09 Thread Riku Voipio
On 4 September 2015 at 14:06, Andre Przywara  wrote:
> Hi Riku,
>
> On 04/09/15 11:52, Riku Voipio wrote:
>> On 4 September 2015 at 13:10, Andre Przywara  wrote:
>>> Hi Riku,
>>>
>>> On 03/09/15 12:20, riku.voi...@linaro.org wrote:
>>>> From: Riku Voipio
>>>>
>>>> Currently Makefile accepts only armv7l.* When building kvmtool under 32bit
>>>> personality on Aarch64 machines, uname -m reports "armv8l", so build fails.
>>>> We expect doing 32bit arm builds in Aarch64 to become standard the same way
>>>> people do i386 builds on x86_64 machines.
>>>>
>>>> Make the sed test a little more greedy so armv8l becomes acceptable.
>>>>
>>>> Signed-off-by: Riku Voipio
>>>
>>> The patch looks OK to me, I just wonder how you do the actual build
>>> within the linux32 environment?
>>> Do you have an arm cross compiler installed and set CROSS_COMPILE? Or is
>>> there a magic compiler (driver) which uses uname -m as well?
>>> And what would be the difference to setting ARCH=arm as well? Just
>>> convenience?
>>
>> It's just an arm32 chroot, with an native arm32 compiler. The chroot
>> is on an arm64 machine since these tend to be much faster than arm32
>> hardware.
>
> Oh right, a chroot, didn't think about the obvious ;-)
> Also it applies to 64-bit kernels with 32-bit root filesystems, I think.
> So:
>
> Acked-by: Andre Przywara 

Ping?

>>
>> It would of course be possible to set ARCH=arm, but that would mean
>> some ifdefs in the Debian packaging, since the same build rule should
>> work for all architectures.
>>
>> Riku
>>


Re: [RFC PATCH v4 1/3] vfio: platform: add device properties skeleton and user API

2015-09-09 Thread Alex Williamson
On Wed, 2015-09-09 at 11:17 +0200, Baptiste Reynal wrote:
> From: Antonios Motakis 
> 
> This patch introduces an API that allows to return device properties (OF
> or ACPI) of a device bound to the vfio-platform/vfio-amba driver and the
> skeleton of the implementation for VFIO_PLATFORM. Information about any
> device node bound by VFIO_PLATFORM should be queried via the introduced
> ioctl VFIO_DEVICE_GET_DEV_PROPERTY.
> 
> The user needs to know the name and the data type of the property he is
> accessing.
> 
> Signed-off-by: Antonios Motakis 
> Signed-off-by: Baptiste Reynal 
> 
> ---
> v3 -> v4:
>  - added flags placeholder in vfio_dev_properties
>  - ioctl returns -E2BIG if the buffer is too small
>  - details VFIO_DEVICE_GET_DEV_PROPERTY documentation
> ---
>  drivers/vfio/platform/Makefile|  3 +-
>  drivers/vfio/platform/properties.c| 77 
> +++
>  drivers/vfio/platform/vfio_platform_common.c  | 35 
>  drivers/vfio/platform/vfio_platform_private.h |  7 +++
>  include/uapi/linux/vfio.h | 31 +++
>  5 files changed, 152 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/vfio/platform/properties.c
> 
> diff --git a/drivers/vfio/platform/Makefile b/drivers/vfio/platform/Makefile
> index 9ce8afe..37cf5ed 100644
> --- a/drivers/vfio/platform/Makefile
> +++ b/drivers/vfio/platform/Makefile
> @@ -1,5 +1,6 @@
>  
> -vfio-platform-y := vfio_platform.o vfio_platform_common.o vfio_platform_irq.o
> +vfio-platform-y := vfio_platform.o vfio_platform_common.o vfio_platform_irq.o \
> +properties.o
>  
>  obj-$(CONFIG_VFIO_PLATFORM) += vfio-platform.o
>  obj-$(CONFIG_VFIO_PLATFORM) += reset/
> diff --git a/drivers/vfio/platform/properties.c b/drivers/vfio/platform/properties.c
> new file mode 100644
> index 000..98754c2
> --- /dev/null
> +++ b/drivers/vfio/platform/properties.c
> @@ -0,0 +1,77 @@
> +/*
> + * Copyright (C) 2015 - Virtual Open Systems
> + * Authors: Antonios Motakis 
> + *  Baptiste Reynal 
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License, version 2, as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include "vfio_platform_private.h"
> +
> +static int dev_property_get_strings(struct device *dev, uint32_t *flags,
> + char *name, unsigned *lenp,
> + void __user *datap, unsigned long datasz)
> +{
> + return -EINVAL;
> +}
> +
> +static int dev_property_get_uint(struct device *dev, uint32_t *flags,
> +  char *name, uint32_t type, unsigned *lenp,
> +  void __user *datap, unsigned long datasz)
> +{
> + return -EINVAL;
> +}
> +
> +int vfio_platform_dev_properties(struct device *dev, uint32_t *flags,
> +  uint32_t type, unsigned *lenp,
> +  void __user *datap, unsigned long datasz)
> +{
> + char *name;
> + long namesz;
> + int ret;
> +
> + namesz = strnlen_user(datap, datasz);
> + if (!namesz)
> + return -EFAULT;
> + if (namesz > datasz)
> + return -EINVAL;
> +
> + name = kzalloc(namesz, GFP_KERNEL);

What prevents the user from passing an arbitrarily large string here?
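
To illustrate the concern, here is a minimal userspace sketch of one way to bound the name before allocating anything. It is only an assumption of what a fix could look like, not what the patch does: PROP_NAME_MAX and prop_name_size() are hypothetical names, and memchr() on a plain buffer stands in for strnlen_user() on a user pointer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical cap on the property name size; not defined by the patch. */
#define PROP_NAME_MAX 256

/*
 * Return the number of bytes (including the terminating NUL) needed to
 * hold the name stored in buf, or -EINVAL if no NUL is found within
 * min(bufsz, PROP_NAME_MAX) bytes.
 */
static long prop_name_size(const char *buf, size_t bufsz)
{
	size_t limit = bufsz < PROP_NAME_MAX ? bufsz : PROP_NAME_MAX;
	const char *nul;

	assert(buf != NULL);

	nul = memchr(buf, '\0', limit);
	if (!nul)
		return -EINVAL;	/* unterminated or too long */

	return (long)(nul - buf) + 1;
}
```

With a cap like this, the allocation size no longer depends solely on a length supplied by the caller.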

> + if (!name)
> + return -ENOMEM;
> + if (strncpy_from_user(name, datap, namesz) <= 0) {
> + kfree(name);
> + return -EFAULT;
> + }
> +
> + switch (type) {
> + case VFIO_DEV_PROPERTY_TYPE_STRINGS:
> + ret = dev_property_get_strings(dev, flags, name, lenp,
> + datap, datasz);
> + break;
> +
> + case VFIO_DEV_PROPERTY_TYPE_U64:
> + case VFIO_DEV_PROPERTY_TYPE_U32:
> + case VFIO_DEV_PROPERTY_TYPE_U16:
> + case VFIO_DEV_PROPERTY_TYPE_U8:
> + ret = dev_property_get_uint(dev, flags, name, type, lenp,
> + datap, datasz);
> + break;
> +
> + default:
> + ret = -EINVAL;
> + }
> +
> + kfree(name);
> + return ret;
> +}
> diff --git a/drivers/vfio/platform/vfio_platform_common.c b/drivers/vfio/platform/vfio_platform_common.c
> index e43efb5..44ba22c 100644
> --- a/drivers/vfio/platform/vfio_platform_common.c
> +++ b/drivers/vfio/platform/vfio_platform_common.c
> @@ -20,6 +20,7 @@
>  #include 
>  #include 
>  

Re: [RFC PATCH v4 2/3] vfio: platform: access device property as a list of strings

2015-09-09 Thread Alex Williamson
On Wed, 2015-09-09 at 11:17 +0200, Baptiste Reynal wrote:
> From: Antonios Motakis 
> 
> Certain device properties (e.g. the device node name, the compatible
> string), are available as a list of strings (separated by the null
> terminating character). Let the VFIO user query this type of properties.
> 
> Signed-off-by: Antonios Motakis 
> Signed-off-by: Baptiste Reynal 
> 
> ---
> v3 -> v4:
>  - The list length is computed before strings copy. If the entire list
>doesn't fit, no strings are copied to the user.
> ---
>  drivers/vfio/platform/properties.c | 43 +-
>  1 file changed, 42 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/vfio/platform/properties.c b/drivers/vfio/platform/properties.c
> index 98754c2..8bf9c8f 100644
> --- a/drivers/vfio/platform/properties.c
> +++ b/drivers/vfio/platform/properties.c
> @@ -22,7 +22,48 @@ static int dev_property_get_strings(struct device *dev, uint32_t *flags,
>   char *name, unsigned *lenp,
>   void __user *datap, unsigned long datasz)
>  {
> - return -EINVAL;
> + const char **val;
> + int n, i, ret;
> +
> + if (lenp == NULL)
> + return -EFAULT;

Paranoia?

> +
> + *lenp = 0;
> +
> + n = device_property_read_string_array(dev, name, NULL, 0);
> + if (n < 0)
> + return n;
> +
> + val = kcalloc(n, sizeof(char *), GFP_KERNEL);
> + if (!val)
> + return -ENOMEM;
> +
> + ret = device_property_read_string_array(dev, name, val, n);
> + if (ret < 0)
> + goto out;
> +
> + for (i = 0; i < n; i++)
> + *lenp += strlen(val[i]) + 1;
> +
> + if (datasz < *lenp) {
> + ret = -E2BIG;
> + goto out;
> + }
> +
> + for (i = 0; i < n; i++) {
> + size_t len = strlen(val[i]) + 1;
> +
> + if (copy_to_user(datap, val[i], strlen(val[i]) + 1)) {

No need to call strlen() again here
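
As a rough userspace sketch of the two-pass scheme described in the changelog (size everything first, copy nothing unless the whole list fits), with the length computed once per string in the copy loop. pack_strings() is a hypothetical helper, and memcpy() stands in for copy_to_user():

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Pack n NUL-terminated strings back to back into dst (dstsz bytes).
 * *lenp receives the total size required. If the whole list does not
 * fit, nothing is copied and -E2BIG is returned.
 */
static int pack_strings(const char *const *val, size_t n,
			char *dst, size_t dstsz, size_t *lenp)
{
	size_t i, total = 0;

	assert(val != NULL && dst != NULL && lenp != NULL);

	/* First pass: compute the total size of the packed list. */
	for (i = 0; i < n; i++)
		total += strlen(val[i]) + 1;

	*lenp = total;
	if (dstsz < total)
		return -E2BIG;

	/* Second pass: copy, reusing len instead of calling strlen() twice. */
	for (i = 0; i < n; i++) {
		size_t len = strlen(val[i]) + 1;

		memcpy(dst, val[i], len);	/* stands in for copy_to_user() */
		dst += len;
	}

	return 0;
}
```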

> + ret = -EFAULT;
> + goto out;
> + }
> +
> + datap += len;
> + }
> +
> +out:
> + kfree(val);
> + return ret;
>  }
>  
>  static int dev_property_get_uint(struct device *dev, uint32_t *flags,





Re: [RFC PATCH v4 3/3] vfio: platform: return device properties as arrays of unsigned integers

2015-09-09 Thread Alex Williamson
On Wed, 2015-09-09 at 11:17 +0200, Baptiste Reynal wrote:
> From: Antonios Motakis 
> 
> Certain properties of a device are accessible as an array of unsigned
> integers, either u64, u32, u16, or u8. Let the VFIO user query this
> type of device properties.
> 
> Signed-off-by: Antonios Motakis 
> Signed-off-by: Baptiste Reynal 
> ---
>  drivers/vfio/platform/properties.c | 62 +-
>  1 file changed, 61 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/vfio/platform/properties.c b/drivers/vfio/platform/properties.c
> index 8bf9c8f..625e2d3 100644
> --- a/drivers/vfio/platform/properties.c
> +++ b/drivers/vfio/platform/properties.c
> @@ -70,7 +70,67 @@ static int dev_property_get_uint(struct device *dev, uint32_t *flags,
>char *name, uint32_t type, unsigned *lenp,
>void __user *datap, unsigned long datasz)
>  {
> - return -EINVAL;
> + int ret, n;
> + u8 *out;
> + size_t sz;
> + int (*func)(const struct device *, const char *, void *, size_t)
> + = NULL;
> +
> + switch (type) {
> + case VFIO_DEV_PROPERTY_TYPE_U64:
> + sz = sizeof(u64);
> + func = (int (*)(const struct device *,
> + const char *, void *, size_t))
> + device_property_read_u64_array;
> + break;
> + case VFIO_DEV_PROPERTY_TYPE_U32:
> + sz = sizeof(u32);
> + func = (int (*)(const struct device *,
> + const char *, void *, size_t))
> + device_property_read_u32_array;
> + break;
> + case VFIO_DEV_PROPERTY_TYPE_U16:
> + sz = sizeof(u16);
> + func = (int (*)(const struct device *,
> + const char *, void *, size_t))
> + device_property_read_u16_array;
> + break;
> + case VFIO_DEV_PROPERTY_TYPE_U8:
> + sz = sizeof(u8);
> + func = (int (*)(const struct device *,
> + const char *, void *, size_t))
> + device_property_read_u8_array;
> + break;
> +
> + default:
> + return -EINVAL;
> + }
> +
> + /* get size of array */
> + n = func(dev, name, NULL, 0);
> + if (n < 0)
> + return n;
> +
> + if (lenp)
> + *lenp = n * sz;

Why is this conditional?

> +
> + if (n * sz > datasz)
> + return -EOVERFLOW;

Ugh, this isn't E2BIG or ENOSPC...
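
To make both remarks concrete, here is a small userspace sketch of the size check, assuming the required length is reported unconditionally and a short buffer gets the same -E2BIG as the strings path (per the v4 changelog of patch 1/3). uint_array_check() is a hypothetical helper; the multiplication guard is an extra that the patch does not have.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Check whether n elements of elem_sz bytes fit into a buffer of bufsz
 * bytes. Unless the multiplication would wrap, *lenp receives the
 * required size in bytes, whether or not the buffer is large enough.
 */
static int uint_array_check(size_t n, size_t elem_sz, size_t bufsz,
			    size_t *lenp)
{
	assert(lenp != NULL);

	if (elem_sz != 0 && n > SIZE_MAX / elem_sz)
		return -EOVERFLOW;	/* n * elem_sz would wrap */

	*lenp = n * elem_sz;
	if (*lenp > bufsz)
		return -E2BIG;	/* same code as the strings path */

	return 0;
}
```

This way userspace can always retry with a buffer of the reported size, and the two property types fail the same way.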

> +
> + out = kcalloc(n, sz, GFP_KERNEL);
> + if (!out)
> + return -ENOMEM;
> +
> + ret = func(dev, name, out, n);
> + if (ret)
> + goto out;
> +
> + if (copy_to_user(datap, out, n * sz))
> + ret = -EFAULT;
> +
> +out:
> + kfree(out);
> + return ret;
>  }
>  
>  int vfio_platform_dev_properties(struct device *dev, uint32_t *flags,





[Bug 104271] Boot crashes on Lenovo W520 with KVM and Discrete Graphic enabled in BIOS

2015-09-09 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=104271

--- Comment #2 from Hans Streibel  ---
Yes, it works on Windows 7. I did not notice any problems there.
Ok, I did not try any virtualization like Virtual Box under Windows,
but everything else seems to work.

Even if it is a BIOS problem, then Windows obviously found a way to handle it.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.


[Bug 104271] Boot crashes on Lenovo W520 with KVM and Discrete Graphic enabled in BIOS

2015-09-09 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=104271

Alex Williamson  changed:

   What|Removed |Added

 CC||alex.william...@redhat.com

--- Comment #3 from Alex Williamson  ---
You might get more attention moving this bug to Drivers/PCI and including the
full dmesg or console log of the failure and lspci -vvv so we know something
about your system.  If it doesn't boot, there are always screen shots or
netconsole if you're serial port challenged.

KVM is a hypervisor, it has nothing to do with your system not booting.  More
likely it's something with the IOMMU.  Does disabling VT-d help?  What about
booting with iommu=pt?  BTW, reproducing on another W520 doesn't mean the
hardware isn't broken, it just means this isn't a point defect.  Hardware is
broken a disturbing amount of the time.



Re: [PATCH v4 3/4] irqchip: GIC: Convert to EOImode == 1

2015-09-09 Thread Julien Grall
Hi,

I've been trying the latest linus/master (a794b4f), which includes this
patch, as a bare-metal kernel on X-Gene. It fails during early boot
without much of a log.

After bisecting the tree, I found that the failure comes from this patch.
While the patch itself is valid, it reminded me that X-Gene (at least
the first version) has an odd GICv2.

The GICC is divided into two areas of 4K, each one aligned at a 64KB address.
This means that the address of GICC_DIR won't be 0x1000 but 0x10000.
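
As a small sketch of the arithmetic (not driver code): if each 4K half of the GICC sits at a multiple of some stride, a register's offset from the GICC base is its 4K page index times the stride plus its offset within the page. gicc_reg_offset() and the stride parameter are hypothetical; 0x10 and 0x1000 are the architectural GICv2 offsets of GICC_EOIR and GICC_DIR.

```c
#include <assert.h>
#include <stdint.h>

#define GICC_PAGE_SIZE	0x1000u	/* each GICC half is 4K */

/*
 * Translate an architectural GICC register offset into an offset from
 * the GICC base when each 4K half is placed at a multiple of stride.
 * stride == 0x1000 is the usual contiguous layout.
 */
static uint32_t gicc_reg_offset(uint32_t arch_off, uint32_t stride)
{
	uint32_t page = arch_off / GICC_PAGE_SIZE;
	uint32_t within = arch_off % GICC_PAGE_SIZE;

	assert(stride >= GICC_PAGE_SIZE);

	return page * stride + within;
}
```

With a 64KB stride, GICC_EOIR stays at 0x10 while GICC_DIR moves from 0x1000 to 0x10000, which is why only the deactivate path is affected.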

We had the same issue on Xen when we did the first port to X-Gene [1].
In that case, we chose to add a quirk in Xen for this platform in order to
map the two parts of the GICC contiguously in virtual memory.

Note that, back then, Ian suggested extending the bindings to support
such a platform [2]. AFAICT, there was no follow-up on it.

Regards,

[1] http://lists.xen.org/archives/html/xen-devel/2013-11/msg03526.html
[2] http://www.spinics.net/lists/devicetree/msg10478.html

On 26/08/15 17:00, Marc Zyngier wrote:
> So far, GICv2 has been used with EOImode == 0. The effect of this
> mode is to perform the priority drop and the deactivation of the
> interrupt at the same time.
> 
> While this works perfectly for Linux (we only have a single priority),
> it causes issues when an interrupt is forwarded to a guest, and when
> we want the guest to perform the EOI itself.
> 
> For this case, the GIC architecture provides EOImode == 1, where:
> - A write to the EOI register drops the priority of the interrupt
>   and leaves it active. Other interrupts at the same priority level
>   can now be taken, but the active interrupt cannot be taken again
> - A write to the DIR marks the interrupt as inactive, meaning it can
>   now be taken again.
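
For readers following along, the two modes described above can be pictured with a toy model of a single interrupt. This is only an illustration under simplifying assumptions (one interrupt, no priorities); struct irq_model and the model_*() helpers are hypothetical and unrelated to the driver code.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one interrupt at the CPU interface; not driver code. */
struct irq_model {
	bool active;		/* acknowledged and not yet deactivated */
	bool priority_dropped;	/* running priority has been dropped */
};

/* Acknowledge: the interrupt becomes active at its priority. */
static void model_ack(struct irq_model *m)
{
	m->active = true;
	m->priority_dropped = false;
}

/* Write to EOI: EOImode == 0 also deactivates, EOImode == 1 does not. */
static void model_write_eoi(struct irq_model *m, int eoimode)
{
	assert(eoimode == 0 || eoimode == 1);

	m->priority_dropped = true;
	if (eoimode == 0)
		m->active = false;
}

/* Write to DIR: deactivates the interrupt (used with EOImode == 1). */
static void model_write_dir(struct irq_model *m)
{
	m->active = false;
}
```

In EOImode == 1 the interrupt stays active between the EOI write and the DIR write, which is the window in which a guest can handle a forwarded interrupt.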
> 
> We only enable this feature when booted in HYP mode and that
> the device-tree reported a suitable CPU interface. Observable behaviour
> should remain unchanged.
> 
> Signed-off-by: Marc Zyngier 
> ---
>  drivers/irqchip/irq-gic.c   | 69 +++--
>  include/linux/irqchip/arm-gic.h |  4 +++
>  2 files changed, 70 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
> index 4dd8826..9215897 100644
> --- a/drivers/irqchip/irq-gic.c
> +++ b/drivers/irqchip/irq-gic.c
> @@ -46,6 +46,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include "irq-gic-common.h"
>  #include "irqchip.h"
> @@ -82,6 +83,8 @@ static DEFINE_RAW_SPINLOCK(irq_controller_lock);
>  #define NR_GIC_CPU_IF 8
>  static u8 gic_cpu_map[NR_GIC_CPU_IF] __read_mostly;
>  
> +static struct static_key supports_deactivate = STATIC_KEY_INIT_TRUE;
> +
>  #ifndef MAX_GIC_NR
>  #define MAX_GIC_NR   1
>  #endif
> @@ -157,6 +160,11 @@ static void gic_mask_irq(struct irq_data *d)
>   gic_poke_irq(d, GIC_DIST_ENABLE_CLEAR);
>  }
>  
> +static void gic_eoimode1_mask_irq(struct irq_data *d)
> +{
> + gic_mask_irq(d);
> +}
> +
>  static void gic_unmask_irq(struct irq_data *d)
>  {
>   gic_poke_irq(d, GIC_DIST_ENABLE_SET);
> @@ -167,6 +175,11 @@ static void gic_eoi_irq(struct irq_data *d)
>   writel_relaxed(gic_irq(d), gic_cpu_base(d) + GIC_CPU_EOI);
>  }
>  
> +static void gic_eoimode1_eoi_irq(struct irq_data *d)
> +{
> + writel_relaxed(gic_irq(d), gic_cpu_base(d) + GIC_CPU_DEACTIVATE);
> +}
> +
>  static int gic_irq_set_irqchip_state(struct irq_data *d,
>enum irqchip_irq_state which, bool val)
>  {
> @@ -272,11 +285,15 @@ static void __exception_irq_entry gic_handle_irq(struct pt_regs *regs)
>   irqnr = irqstat & GICC_IAR_INT_ID_MASK;
>  
>   if (likely(irqnr > 15 && irqnr < 1021)) {
> + if (static_key_true(&supports_deactivate))
> + writel_relaxed(irqstat, cpu_base + GIC_CPU_EOI);
>   handle_domain_irq(gic->domain, irqnr, regs);
>   continue;
>   }
>   if (irqnr < 16) {
>   writel_relaxed(irqstat, cpu_base + GIC_CPU_EOI);
> + if (static_key_true(&supports_deactivate))
> + writel_relaxed(irqstat, cpu_base + GIC_CPU_DEACTIVATE);
>  #ifdef CONFIG_SMP
>   handle_IPI(irqnr, regs);
>  #endif
> @@ -327,6 +344,20 @@ static struct irq_chip gic_chip = {
>   .flags  = IRQCHIP_SET_TYPE_MASKED,
>  };
>  
> +static struct irq_chip gic_eoimode1_chip = {
> + .name   = "GICv2",
> + .irq_mask   = gic_eoimode1_mask_irq,
> + .irq_unmask = gic_unmask_irq,
> + .irq_eoi= gic_eoimode1_eoi_irq,
> + .irq_set_type   = gic_set_type,
> +#ifdef CONFIG_SMP
> + .irq_set_affinity   = gic_set_affinity,
> +#endif
> + .irq_get_irqchip_state  = gic_irq_get_irqchip_state,
> + .irq_set_irqchip_state  = gic_irq_set_irqchip_state,
> + .flags  = IRQCHIP_SET_TYPE_MASKED,
> +};
> +
>