Re: [PATCH 05/13] KVM: Update IRTE according to guest interrupt configuration changes

2014-11-12 Thread Paolo Bonzini


On 12/11/2014 04:42, Zhang, Yang Z wrote:
 Personally, I think this feature will be helpful for legacy device
 assignment. I agree that VFIO is the right solution for future feature
 enabling, but old KVM without good VFIO support is still widely
 used today. Users really want this feature, but they will not
 upgrade their kernels. It is easy for us to backport this
 feature to old KVM with legacy device assignment, but it is
 impossible to backport the whole of VFIO.

You can certainly backport these patches to distros that do not have
VFIO.  But upstream we should work on VFIO first.  VFIO has feature
parity with legacy device assignment, and adding a new feature that is
not in VFIO would be a bad idea.

By the way, do you have benchmark results for it?  We have not been able
to see any performance improvement for APICv on e.g. netperf.

Paolo
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 05/13] KVM: Update IRTE according to guest interrupt configuration changes

2014-11-12 Thread Wu, Feng


 -Original Message-
 From: Paolo Bonzini [mailto:pbonz...@redhat.com]
 Sent: Wednesday, November 12, 2014 5:14 PM
 To: Zhang, Yang Z; Wu, Feng; Alex Williamson
 Cc: g...@kernel.org; dw...@infradead.org; j...@8bytes.org;
 t...@linutronix.de; mi...@redhat.com; h...@zytor.com; x...@kernel.org;
 kvm@vger.kernel.org; io...@lists.linux-foundation.org;
 linux-ker...@vger.kernel.org
 Subject: Re: [PATCH 05/13] KVM: Update IRTE according to guest interrupt
 configuration changes
 
 
 
 On 12/11/2014 04:42, Zhang, Yang Z wrote:
  Personally, I think this feature will be helpful for legacy device
  assignment. I agree that VFIO is the right solution for future feature
  enabling, but old KVM without good VFIO support is still widely
  used today. Users really want this feature, but they will not
  upgrade their kernels. It is easy for us to backport this
  feature to old KVM with legacy device assignment, but it is
  impossible to backport the whole of VFIO.
 
 You can certainly backport these patches to distros that do not have
 VFIO.  But upstream we should work on VFIO first.  VFIO has feature
 parity with legacy device assignment, and adding a new feature that is
 not in VFIO would be a bad idea.
 
 By the way, do you have benchmark results for it?  We have not been able
 to see any performance improvement for APICv on e.g. netperf.

Do you mean benchmark results for APICv itself or VT-d Posted-Interrupts?

Thanks,
Feng

 
 Paolo


Re: [PATCH 05/13] KVM: Update IRTE according to guest interrupt configuration changes

2014-11-12 Thread Paolo Bonzini


On 12/11/2014 10:19, Wu, Feng wrote:
 You can certainly backport these patches to distros that do not have
 VFIO.  But upstream we should work on VFIO first.  VFIO has feature
 parity with legacy device assignment, and adding a new feature that is
 not in VFIO would be a bad idea.

 By the way, do you have benchmark results for it?  We have not been able
 to see any performance improvement for APICv on e.g. netperf.
 
 Do you mean benchmark results for APICv itself or VT-d Posted-Interrupts?

Especially for VT-d posted interrupts---but it'd be great to know which
workloads see the biggest speedup from APICv.

Paolo


Re: [PATCH 05/13] KVM: Update IRTE according to guest interrupt configuration changes

2014-11-12 Thread Alex Williamson
On Wed, 2014-11-12 at 10:14 +0100, Paolo Bonzini wrote:
 
 On 12/11/2014 04:42, Zhang, Yang Z wrote:
  Personally, I think this feature will be helpful for legacy device
  assignment. I agree that VFIO is the right solution for future feature
  enabling, but old KVM without good VFIO support is still widely
  used today. Users really want this feature, but they will not
  upgrade their kernels. It is easy for us to backport this
  feature to old KVM with legacy device assignment, but it is
  impossible to backport the whole of VFIO.
 
 You can certainly backport these patches to distros that do not have
 VFIO.  But upstream we should work on VFIO first.  VFIO has feature
 parity with legacy device assignment, and adding a new feature that is
 not in VFIO would be a bad idea.

Thanks Paolo, I agree.  We should design the interfaces for VFIO since
we expect legacy KVM assignment to be deprecated and eventually removed.
I think that some of the platform device work for ARM's IRQ forwarding
should probably be leveraged for this interface.  IRQ forwarding
effectively allows level triggered interrupts to be handled as edge,
eliminating the mask/unmask overhead and EOI path entirely.  To do this
through VFIO they make use of the KVM-VFIO device to register the device
and set attributes for the forwarded IRQ.  This enables KVM to use the
VFIO external user interfaces to acquire a VFIO device reference and
access the struct device.  From there it can do some IRQ manipulation on
the device to reconfigure how the host handles the interrupt.  Ideally
we could use the same base KVM-VFIO device interface, perhaps
with different attributes, and obviously with different architecture
backing.  Thanks,

Alex
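
For readers who have not used the KVM-VFIO device mentioned above, here
is a minimal userspace sketch of how a VFIO group can be registered with
it. It assumes the KVM_CREATE_DEVICE and KVM_SET_DEVICE_ATTR ioctls and
the KVM_DEV_VFIO_GROUP / KVM_DEV_VFIO_GROUP_ADD constants from
<linux/kvm.h>. The helper names are made up for illustration, and vm_fd
and group_fd are assumed to be already-open VM and VFIO group file
descriptors. Only group registration is shown; attributes for forwarded
or posted interrupts are not, since the thread does not define them.

```c
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Create the KVM-VFIO pseudo device on a VM fd.
 * Returns the new device fd, or -1 on error. */
static int kvm_vfio_device_create(int vm_fd)
{
    struct kvm_create_device cd;

    memset(&cd, 0, sizeof(cd));
    cd.type = KVM_DEV_TYPE_VFIO;
    if (ioctl(vm_fd, KVM_CREATE_DEVICE, &cd) < 0)
        return -1;
    return (int)cd.fd;
}

/* Describe "add this VFIO group" as a device attribute.
 * attr.addr carries a pointer to the group fd. */
static struct kvm_device_attr kvm_vfio_group_add_attr(const int32_t *group_fd)
{
    struct kvm_device_attr attr;

    memset(&attr, 0, sizeof(attr));
    attr.group = KVM_DEV_VFIO_GROUP;
    attr.attr = KVM_DEV_VFIO_GROUP_ADD;
    attr.addr = (uint64_t)(uintptr_t)group_fd;
    return attr;
}

/* Register a VFIO group with the KVM-VFIO device.
 * Returns 0 on success, -1 on error (errno set by ioctl). */
static int kvm_vfio_group_add(int dev_fd, int32_t group_fd)
{
    struct kvm_device_attr attr = kvm_vfio_group_add_attr(&group_fd);

    return ioctl(dev_fd, KVM_SET_DEVICE_ATTR, &attr);
}
```

An interface for posted interrupts along the lines suggested above
would presumably add further attribute groups next to this one.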



RE: [PATCH 05/13] KVM: Update IRTE according to guest interrupt configuration changes

2014-11-12 Thread Zhang, Yang Z
Wu, Feng wrote on 2014-11-13:
 
 
 kvm-ow...@vger.kernel.org wrote on 2014-11-12:
 kvm@vger.kernel.org; io...@lists.linux-foundation.org; 
 linux-ker...@vger.kernel.org
 Subject: Re: [PATCH 05/13] KVM: Update IRTE according to guest 
 interrupt configuration changes
 
 
 
 On 12/11/2014 10:19, Wu, Feng wrote:
 You can certainly backport these patches to distros that do not 
 have VFIO.  But upstream we should work on VFIO first.  VFIO has 
 feature parity with legacy device assignment, and adding a new 
 feature that is not in VFIO would be a bad idea.
 
 By the way, do you have benchmark results for it?  We have not been 
 able to see any performance improvement for APICv on e.g. netperf.
 
 Do you mean benchmark results for APICv itself or VT-d Posted-Interrupts?
 
 Especially for VT-d posted interrupts---but it'd be great to know 
 which workloads see the biggest speedup from APICv.
 
 We have some draft performance data internally; please see the
 attached. For VT-d PI, I think we get the biggest performance gain
 if the VCPU is running in non-root mode most of the time (not in
 HLT state), since external interrupts from assigned devices are delivered
 to the guest directly in this case.
 That means we can run some CPU-intensive workloads in the guests.

Have you checked that the CPU-side posted interrupt is taking effect in the
case without VT-d PI? Per my understanding, the performance gap should not be
so large if you use CPU-side posted interrupts. This data looks more like VT-d
PI vs. no PI at all (neither VT-d nor CPU).

 
 Thanks,
 Feng
 
 
 Paolo


Best regards,
Yang




RE: [PATCH 05/13] KVM: Update IRTE according to guest interrupt configuration changes

2014-11-12 Thread Wu, Feng


 -Original Message-
 From: Zhang, Yang Z
 Sent: Thursday, November 13, 2014 9:21 AM
 To: Wu, Feng; Paolo Bonzini; Alex Williamson
 Cc: g...@kernel.org; dw...@infradead.org; j...@8bytes.org;
 t...@linutronix.de; mi...@redhat.com; h...@zytor.com; x...@kernel.org;
 kvm@vger.kernel.org; io...@lists.linux-foundation.org;
 linux-ker...@vger.kernel.org
 Subject: RE: [PATCH 05/13] KVM: Update IRTE according to guest interrupt
 configuration changes
 
 Wu, Feng wrote on 2014-11-13:
 
 
  kvm-ow...@vger.kernel.org wrote on 2014-11-12:
  kvm@vger.kernel.org; io...@lists.linux-foundation.org;
  linux-ker...@vger.kernel.org
  Subject: Re: [PATCH 05/13] KVM: Update IRTE according to guest
  interrupt configuration changes
 
 
 
  On 12/11/2014 10:19, Wu, Feng wrote:
  You can certainly backport these patches to distros that do not
  have VFIO.  But upstream we should work on VFIO first.  VFIO has
  feature parity with legacy device assignment, and adding a new
  feature that is not in VFIO would be a bad idea.
 
  By the way, do you have benchmark results for it?  We have not been
  able to see any performance improvement for APICv on e.g. netperf.
 
  Do you mean benchmark results for APICv itself or VT-d Posted-Interrupts?
 
  Especially for VT-d posted interrupts---but it'd be great to know
  which workloads see the biggest speedup from APICv.
 
  We have some draft performance data internally, please see the
  attached. For VT-d PI, I think we can get the biggest performance gain
  if the VCPU is running in non-root mode for most of the time (not in
  HLT state), since external interrupt from assigned devices will be 
  delivered by
 guest directly in this case.
  That means we can run some cpu intensive workload in the guests.
 
 Have you checked that the CPU-side posted interrupt is taking effect in the
 case without VT-d PI? Per my understanding, the performance gap should not be
 so large if you use CPU-side posted interrupts. This data looks more like VT-d
 PI vs. no PI at all (neither VT-d nor CPU).

Yes, this data is VT-d PI vs. non-VT-d PI. The CPU-side APICv mechanism
(including CPU-side Posted-Interrupts) is enabled.

Thanks,
Feng

 
 
  Thanks,
  Feng
 
 
  Paolo
 
 
 Best regards,
 Yang
 



RE: [PATCH 05/13] KVM: Update IRTE according to guest interrupt configuration changes

2014-11-12 Thread Zhang, Yang Z
Wu, Feng wrote on 2014-11-13:
 
 
 Zhang, Yang Z wrote on 2014-11-13:
 kvm@vger.kernel.org; io...@lists.linux-foundation.org;
 linux-ker...@vger.kernel.org
 Subject: RE: [PATCH 05/13] KVM: Update IRTE according to guest
 interrupt configuration changes
 
 Wu, Feng wrote on 2014-11-13:
 
 
 kvm-ow...@vger.kernel.org wrote on 2014-11-12:
 kvm@vger.kernel.org; io...@lists.linux-foundation.org;
 linux-ker...@vger.kernel.org
 Subject: Re: [PATCH 05/13] KVM: Update IRTE according to guest
 interrupt configuration changes
 
 
 
 On 12/11/2014 10:19, Wu, Feng wrote:
 You can certainly backport these patches to distros that do not
 have VFIO.  But upstream we should work on VFIO first.  VFIO
 has feature parity with legacy device assignment, and adding a
 new feature that is not in VFIO would be a bad idea.
 
 By the way, do you have benchmark results for it?  We have not
 been able to see any performance improvement for APICv on e.g.
 netperf.
 
 Do you mean benchmark results for APICv itself or VT-d
 Posted-Interrupts?
 
 Especially for VT-d posted interrupts---but it'd be great to know
 which workloads see the biggest speedup from APICv.
 
 We have some draft performance data internally, please see the
 attached. For VT-d PI, I think we can get the biggest performance gain
 if the VCPU is running in non-root mode for most of the time (not in
 HLT state), since external interrupt from assigned devices will be
 delivered by guest directly in this case. That means we can run some
 cpu intensive workload in the guests.
 
 Have you checked that the CPU-side posted interrupt is taking effect
 in the case without VT-d PI? Per my understanding, the performance gap
 should not be so large if you use CPU-side posted interrupts. This data
 looks more like VT-d PI vs. no PI at all (neither VT-d nor CPU).
 
 Yes, this data is VT-d PI vs Non VT-d PI. The CPU side APICv mechanism
 (including CPU side Posted-Interrupts) is enabled.

From the CPU utilization data, the APICv environment does not look reasonable
to me. With current APICv, the interrupt should not be delivered to the PCPU
where the VCPU is running. Otherwise, it will force a VM exit on the VCPU and
the CPU-side posted interrupt cannot take effect. Did you set the interrupt
affinity manually?
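
As an illustration of setting the affinity manually, here is a minimal
host-side sketch that steers an IRQ away from the PCPU a VCPU is pinned
to by writing a hex mask to /proc/irq/<irq>/smp_affinity. It assumes at
most 32 host CPUs, so the mask fits in a single hex word, and it needs
root privileges. The function names and the choice of IRQ and CPU
numbers are hypothetical.

```c
#include <stdint.h>
#include <stdio.h>

/* Build a mask of CPUs 0..ncpus-1 with vcpu_pcpu cleared.
 * Assumes ncpus <= 32; larger values are clamped to 32 CPUs. */
static uint32_t affinity_mask_excluding(unsigned ncpus, unsigned vcpu_pcpu)
{
    uint32_t mask = (ncpus >= 32) ? UINT32_MAX
                                  : ((UINT32_C(1) << ncpus) - 1);

    if (vcpu_pcpu < 32)
        mask &= ~(UINT32_C(1) << vcpu_pcpu);
    return mask;
}

/* Write the mask to /proc/irq/<irq>/smp_affinity.
 * Returns 0 on success, -1 on error. */
static int set_irq_affinity(unsigned irq, uint32_t mask)
{
    char path[64];
    FILE *f;
    int ok;

    if (mask == 0)
        return -1; /* refuse to write an empty CPU mask */

    snprintf(path, sizeof(path), "/proc/irq/%u/smp_affinity", irq);
    f = fopen(path, "w");
    if (!f)
        return -1;
    ok = fprintf(f, "%x\n", (unsigned)mask) > 0;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

For example, on a 4-CPU host with the VCPU pinned to PCPU 2,
affinity_mask_excluding(4, 2) yields 0xb, which could then be passed to
set_irq_affinity() for the assigned device's IRQ.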

 
 Thanks,
 Feng
 
 
 
 Thanks,
 Feng
 
 
 Paolo
 
 
 Best regards,
 Yang



Best regards,
Yang




RE: [PATCH 05/13] KVM: Update IRTE according to guest interrupt configuration changes

2014-11-11 Thread Wu, Feng


 -Original Message-
 From: Alex Williamson [mailto:alex.william...@redhat.com]
 Sent: Tuesday, November 11, 2014 5:58 AM
 To: Wu, Feng
 Cc: g...@kernel.org; pbonz...@redhat.com; dw...@infradead.org;
 j...@8bytes.org; t...@linutronix.de; mi...@redhat.com; h...@zytor.com;
 x...@kernel.org; kvm@vger.kernel.org; io...@lists.linux-foundation.org;
 linux-ker...@vger.kernel.org
 Subject: Re: [PATCH 05/13] KVM: Update IRTE according to guest interrupt
 configuration changes
 
 On Mon, 2014-11-10 at 14:26 +0800, Feng Wu wrote:
  When guest changes its interrupt configuration (such as, vector, etc.)
  for direct-assigned devices, we need to update the associated IRTE
  with the new guest vector, so external interrupts from the assigned
  devices can be injected to guests without VM-Exit.
 
  The current method of handling guest lowest-priority interrupts
  is to use a counter 'apic_arb_prio' for each VCPU: we choose the
  VCPU with the smallest 'apic_arb_prio' and then increase it by 1.
  However, for VT-d PI, we cannot reuse this, since we no longer
  have control over 'apic_arb_prio' when posted interrupts are
  delivered directly by hardware.
 
  Here, we introduce a mechanism similar to 'apic_arb_prio' to handle
  guest lowest-priority interrupts when VT-d PI is used. The idea is
  as follows:
  - Each VCPU has a counter 'round_robin_counter'.
  - When the guest sets an interrupt to lowest priority, we choose
  the VCPU with smallest 'round_robin_counter' as the destination,
  then increase it.
 
  Signed-off-by: Feng Wu feng...@intel.com
  ---
   arch/x86/include/asm/irq_remapping.h |6 ++
   arch/x86/include/asm/kvm_host.h  |2 +
   arch/x86/kvm/vmx.c   |   12 +++
   arch/x86/kvm/x86.c   |   11 +++
   drivers/iommu/amd_iommu.c|6 ++
   drivers/iommu/intel_irq_remapping.c  |   28 +++
   drivers/iommu/irq_remapping.c|9 ++
   drivers/iommu/irq_remapping.h|3 +
   include/linux/dmar.h |   26 ++
   include/linux/kvm_host.h |   22 +
   include/uapi/linux/kvm.h |1 +
   virt/kvm/assigned-dev.c  |  141
 ++
   virt/kvm/irq_comm.c  |4 +-
   virt/kvm/irqchip.c   |   11 ---
   14 files changed, 269 insertions(+), 13 deletions(-)
 
  diff --git a/arch/x86/include/asm/irq_remapping.h
 b/arch/x86/include/asm/irq_remapping.h
  index a3cc437..32d6cc4 100644
  --- a/arch/x86/include/asm/irq_remapping.h
  +++ b/arch/x86/include/asm/irq_remapping.h
  @@ -51,6 +51,7 @@ extern void compose_remapped_msi_msg(struct
 pci_dev *pdev,
   unsigned int irq, unsigned int dest,
   struct msi_msg *msg, u8 hpet_id);
   extern int setup_hpet_msi_remapped(unsigned int irq, unsigned int id);
  +extern int update_pi_irte(unsigned int irq, u64 pi_desc_addr, u32 vector);
   extern void panic_if_irq_remap(const char *msg);
   extern bool setup_remapped_irq(int irq,
 struct irq_cfg *cfg,
  @@ -88,6 +89,11 @@ static inline int setup_hpet_msi_remapped(unsigned
 int irq, unsigned int id)
  return -ENODEV;
   }
 
  +static inline int update_pi_irte(unsigned int irq, u64 pi_desc_addr, u32
 vector)
  +{
  +   return -ENODEV;
  +}
  +
   static inline void panic_if_irq_remap(const char *msg)
   {
   }
  diff --git a/arch/x86/include/asm/kvm_host.h
 b/arch/x86/include/asm/kvm_host.h
  index 6ed0c30..0630161 100644
  --- a/arch/x86/include/asm/kvm_host.h
  +++ b/arch/x86/include/asm/kvm_host.h
  @@ -358,6 +358,7 @@ struct kvm_vcpu_arch {
  struct kvm_lapic *apic;/* kernel irqchip context */
  unsigned long apic_attention;
  int32_t apic_arb_prio;
  +   int32_t round_robin_counter;
  int mp_state;
  u64 ia32_misc_enable_msr;
  bool tpr_access_reporting;
  @@ -771,6 +772,7 @@ struct kvm_x86_ops {
  int (*check_nested_events)(struct kvm_vcpu *vcpu, bool external_intr);
 
  void (*sched_in)(struct kvm_vcpu *kvm, int cpu);
  +   u64 (*get_pi_desc_addr)(struct kvm_vcpu *vcpu);
   };
 
   struct kvm_arch_async_pf {
  diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
  index a4670d3..ae91b72 100644
  --- a/arch/x86/kvm/vmx.c
  +++ b/arch/x86/kvm/vmx.c
  @@ -544,6 +544,11 @@ static inline struct vcpu_vmx *to_vmx(struct
 kvm_vcpu *vcpu)
  return container_of(vcpu, struct vcpu_vmx, vcpu);
   }
 
  +struct pi_desc *vcpu_to_pi_desc(struct kvm_vcpu *vcpu)
  +{
  +   return &(to_vmx(vcpu)->pi_desc);
  +}
  +
   #define VMCS12_OFFSET(x) offsetof(struct vmcs12, x)
   #define FIELD(number, name)[number] = VMCS12_OFFSET(name)
   #define FIELD64(number, name)  [number] = VMCS12_OFFSET(name), \
  @@ -4280,6 +4285,11 @@ static void vmx_sync_pir_to_irr_dummy(struct
 kvm_vcpu *vcpu)
  return;
   }
 
  +static u64 vmx_get_pi_desc_addr(struct kvm_vcpu *vcpu)
  +{
  +   return __pa((u64)vcpu_to_pi_desc(vcpu));
  +}
  +
   /*
* Set up

Re: [PATCH 05/13] KVM: Update IRTE according to guest interrupt configuration changes

2014-11-11 Thread Paolo Bonzini


On 11/11/2014 10:20, Wu, Feng wrote:
  Since legacy KVM device assignment is effectively deprecated, have you
  considered how we might do this with VFIO?  Thanks,
 
 I haven't thought about how to enable this in VFIO so far. I think I can 
 continue to
 implement that if needed after this patch set is finished. What do you think 
 of this?

Hi Feng,

we are not applying new features to legacy KVM device assignment, since
it is unsafe (it does not honor ACS).

Alex and I can help you with designing a way to interface VFIO with KVM
posted interrupts.  Give us a few days to study these patches more, or
feel free to request comments if you have ideas about it yourself.

Paolo


RE: [PATCH 05/13] KVM: Update IRTE according to guest interrupt configuration changes

2014-11-11 Thread Wu, Feng


 -Original Message-
 From: Paolo Bonzini [mailto:pbonz...@redhat.com]
 Sent: Tuesday, November 11, 2014 7:02 PM
 To: Wu, Feng; Alex Williamson
 Cc: g...@kernel.org; dw...@infradead.org; j...@8bytes.org;
 t...@linutronix.de; mi...@redhat.com; h...@zytor.com; x...@kernel.org;
 kvm@vger.kernel.org; io...@lists.linux-foundation.org;
 linux-ker...@vger.kernel.org
 Subject: Re: [PATCH 05/13] KVM: Update IRTE according to guest interrupt
 configuration changes
 
 
 
 On 11/11/2014 10:20, Wu, Feng wrote:
   Since legacy KVM device assignment is effectively deprecated, have you
   considered how we might do this with VFIO?  Thanks,
 
  I haven't thought about how to enable this in VFIO so far. I think I can 
  continue
 to
  implement that if needed after this patch set is finished. What do you 
  think of
 this?
 
 Hi Feng,
 
 we are not applying new features to legacy KVM device assignment, since
 it is unsafe (it does not honor ACS).
 
 I and Alex can help you with designing a way to interface VFIO with KVM
 posted interrupts.  Give us a few days to study these patches more, or
 feel free to request comments if you have ideas about it yourself.
 
 Paolo

Okay, then I will put some effort into getting familiar with the VFIO mechanism.
If you have any questions about these patches, we can discuss them together.

Thanks,
Feng

RE: [PATCH 05/13] KVM: Update IRTE according to guest interrupt configuration changes

2014-11-11 Thread Zhang, Yang Z
Paolo Bonzini wrote on 2014-11-11:
 
 
 On 11/11/2014 10:20, Wu, Feng wrote:
 Since legacy KVM device assignment is effectively deprecated, have
 you considered how we might do this with VFIO?  Thanks,
 
 I haven't thought about how to enable this in VFIO so far. I think I
 can continue to implement that if needed after this patch set is finished.
 What do you think of this?
 
 Hi Feng,
 
 we are not applying new features to legacy KVM device assignment,
 since it is unsafe (it does not honor ACS).

Personally, I think this feature will be helpful for legacy device
assignment. I agree that VFIO is the right solution for future feature enabling,
but old KVM without good VFIO support is still widely used today. Users really
want this feature, but they will not upgrade their kernels. It is easy for us
to backport this feature to old KVM with legacy device assignment, but it is
impossible to backport the whole of VFIO. So I think you could consider adding
this feature to both VFIO and legacy device assignment.

 
 I and Alex can help you with designing a way to interface VFIO with
 KVM posted interrupts.  Give us a few days to study these patches
 more, or feel free to request comments if you have ideas about it yourself.
 
 Paolo


Best regards,
Yang



Re: [PATCH 05/13] KVM: Update IRTE according to guest interrupt configuration changes

2014-11-10 Thread Alex Williamson
On Mon, 2014-11-10 at 14:26 +0800, Feng Wu wrote:
 When guest changes its interrupt configuration (such as, vector, etc.)
 for direct-assigned devices, we need to update the associated IRTE
 with the new guest vector, so external interrupts from the assigned
 devices can be injected to guests without VM-Exit.
 
 The current method of handling guest lowest-priority interrupts
 is to use a counter 'apic_arb_prio' for each VCPU: we choose the
 VCPU with the smallest 'apic_arb_prio' and then increase it by 1.
 However, for VT-d PI, we cannot reuse this, since we no longer
 have control over 'apic_arb_prio' when posted interrupts are
 delivered directly by hardware.
 
 Here, we introduce a mechanism similar to 'apic_arb_prio' to handle
 guest lowest-priority interrupts when VT-d PI is used. The idea is
 as follows:
 - Each VCPU has a counter 'round_robin_counter'.
 - When the guest sets an interrupt to lowest priority, we choose
 the VCPU with smallest 'round_robin_counter' as the destination,
 then increase it.
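
The selection rule described in the two bullets above can be sketched in
plain C as follows. This is only an illustration under stated
assumptions, not the code from the patch: struct vcpu_sketch and
pick_lowest_prio_dest() are hypothetical stand-ins for the real KVM
structures, ties go to the first candidate in the array, and locking is
left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the per-VCPU state. */
struct vcpu_sketch {
    int id;
    int32_t round_robin_counter;
};

/* Pick the candidate with the smallest round_robin_counter and
 * increase its counter by one. Returns NULL if there are no
 * candidates. NULL entries in the array are skipped. */
static struct vcpu_sketch *
pick_lowest_prio_dest(struct vcpu_sketch **cands, size_t n)
{
    struct vcpu_sketch *best = NULL;
    size_t i;

    for (i = 0; i < n; i++) {
        if (!cands[i])
            continue;
        if (!best ||
            cands[i]->round_robin_counter < best->round_robin_counter)
            best = cands[i];
    }
    if (best)
        best->round_robin_counter++;
    return best;
}
```

With two candidates that both start at zero, successive calls alternate
between them, which is the round-robin behaviour the counter name
suggests.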
 
 Signed-off-by: Feng Wu feng...@intel.com
 ---
  arch/x86/include/asm/irq_remapping.h |6 ++
  arch/x86/include/asm/kvm_host.h  |2 +
  arch/x86/kvm/vmx.c   |   12 +++
  arch/x86/kvm/x86.c   |   11 +++
  drivers/iommu/amd_iommu.c|6 ++
  drivers/iommu/intel_irq_remapping.c  |   28 +++
  drivers/iommu/irq_remapping.c|9 ++
  drivers/iommu/irq_remapping.h|3 +
  include/linux/dmar.h |   26 ++
  include/linux/kvm_host.h |   22 +
  include/uapi/linux/kvm.h |1 +
  virt/kvm/assigned-dev.c  |  141 
 ++
  virt/kvm/irq_comm.c  |4 +-
  virt/kvm/irqchip.c   |   11 ---
  14 files changed, 269 insertions(+), 13 deletions(-)
 
 diff --git a/arch/x86/include/asm/irq_remapping.h 
 b/arch/x86/include/asm/irq_remapping.h
 index a3cc437..32d6cc4 100644
 --- a/arch/x86/include/asm/irq_remapping.h
 +++ b/arch/x86/include/asm/irq_remapping.h
 @@ -51,6 +51,7 @@ extern void compose_remapped_msi_msg(struct pci_dev *pdev,
unsigned int irq, unsigned int dest,
struct msi_msg *msg, u8 hpet_id);
  extern int setup_hpet_msi_remapped(unsigned int irq, unsigned int id);
 +extern int update_pi_irte(unsigned int irq, u64 pi_desc_addr, u32 vector);
  extern void panic_if_irq_remap(const char *msg);
  extern bool setup_remapped_irq(int irq,
  struct irq_cfg *cfg,
 @@ -88,6 +89,11 @@ static inline int setup_hpet_msi_remapped(unsigned int 
 irq, unsigned int id)
   return -ENODEV;
  }
  
 +static inline int update_pi_irte(unsigned int irq, u64 pi_desc_addr, u32 
 vector)
 +{
 + return -ENODEV;
 +}
 +
  static inline void panic_if_irq_remap(const char *msg)
  {
  }
 diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
 index 6ed0c30..0630161 100644
 --- a/arch/x86/include/asm/kvm_host.h
 +++ b/arch/x86/include/asm/kvm_host.h
 @@ -358,6 +358,7 @@ struct kvm_vcpu_arch {
   struct kvm_lapic *apic;/* kernel irqchip context */
   unsigned long apic_attention;
   int32_t apic_arb_prio;
 + int32_t round_robin_counter;
   int mp_state;
   u64 ia32_misc_enable_msr;
   bool tpr_access_reporting;
 @@ -771,6 +772,7 @@ struct kvm_x86_ops {
   int (*check_nested_events)(struct kvm_vcpu *vcpu, bool external_intr);
  
   void (*sched_in)(struct kvm_vcpu *kvm, int cpu);
 + u64 (*get_pi_desc_addr)(struct kvm_vcpu *vcpu);
  };
  
  struct kvm_arch_async_pf {
 diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
 index a4670d3..ae91b72 100644
 --- a/arch/x86/kvm/vmx.c
 +++ b/arch/x86/kvm/vmx.c
 @@ -544,6 +544,11 @@ static inline struct vcpu_vmx *to_vmx(struct kvm_vcpu 
 *vcpu)
   return container_of(vcpu, struct vcpu_vmx, vcpu);
  }
  
 +struct pi_desc *vcpu_to_pi_desc(struct kvm_vcpu *vcpu)
 +{
 + return &(to_vmx(vcpu)->pi_desc);
 +}
 +
  #define VMCS12_OFFSET(x) offsetof(struct vmcs12, x)
  #define FIELD(number, name)  [number] = VMCS12_OFFSET(name)
  #define FIELD64(number, name)[number] = VMCS12_OFFSET(name), \
 @@ -4280,6 +4285,11 @@ static void vmx_sync_pir_to_irr_dummy(struct kvm_vcpu 
 *vcpu)
   return;
  }
  
 +static u64 vmx_get_pi_desc_addr(struct kvm_vcpu *vcpu)
 +{
 + return __pa((u64)vcpu_to_pi_desc(vcpu));
 +}
 +
  /*
   * Set up the vmcs's constant host-state fields, i.e., host-state fields that
   * will not change in the lifetime of the guest.
 @@ -9232,6 +9242,8 @@ static struct kvm_x86_ops vmx_x86_ops = {
   .check_nested_events = vmx_check_nested_events,
  
   .sched_in = vmx_sched_in,
 +
 + .get_pi_desc_addr = vmx_get_pi_desc_addr,
  };
  
  static int __init vmx_init(void)
 diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
 index b447a98..0c19d15 100644
 --- a/arch/x86/kvm/x86.c
 +++ b/arch/x86/kvm/x86.c
 @@ -7735,6 +7735,17 @@ bool