Re: [PATCH v2 11/13] KVM: x86: add KVM_CAP_X2APIC_API

2016-07-11 Thread Paolo Bonzini


On 11/07/2016 12:33, Yang Zhang wrote:
> On 2016/7/11 17:17, Paolo Bonzini wrote:
>> On 11/07/2016 10:56, Yang Zhang wrote:
>>> On 2016/7/11 15:44, Paolo Bonzini wrote:
>>>> On 11/07/2016 08:06, Yang Zhang wrote:
>>>> If interrupt remapping is on, KVM_CAP_X2APIC_API is needed even with 8
>>>> VCPUs, I think.  Otherwise KVM will believe that 0xff is "broadcast"
>>>> rather than "cluster 0, CPUs 0-7".
>>>
>>> If interrupt remapping is using, what 0xff means is relying on which
>>> mode the destination CPU is in. I think there is no KVM_CAP_X2APIC_API
>>> needed since interrupt remapping table gives all the information.
>>
>> If you have EIM 0xff never means broadcast, but KVM sees a 0xff in the
>> interrupt route or KVM_SIGNAL_MSI argument and translates it into a
>> broadcast.
> 
> I see your point. I thought there would be a new irq router(like
> KVM_IRQ_ROUTING_IR) to handle all interrupts after turn on IR and
> KVM_CAP_X2APIC_API would be dropped.

KVM_CAP_X2APIC_API seems simpler to me than a new type of irq routing.

> So we will continue to use
> KVM_IRQ_ROUTING_IRQCHIP and KVM_IRQ_ROUTING_MSI for interrupt from IR,
> right?

Actually only KVM_IRQ_ROUTING_MSI, because that's the only one that is
available with split irqchip.

Paolo


Re: [PATCH v2 11/13] KVM: x86: add KVM_CAP_X2APIC_API

2016-07-11 Thread Yang Zhang

On 2016/7/11 17:17, Paolo Bonzini wrote:
> On 11/07/2016 10:56, Yang Zhang wrote:
>> On 2016/7/11 15:44, Paolo Bonzini wrote:
>>> On 11/07/2016 08:06, Yang Zhang wrote:
>>>>> Changes to MSI addresses follow the format used by interrupt remapping
>>>>> unit.
>>>>> The upper address word, that used to be 0, contains upper 24 bits of
>>>>> the LAPIC
>>>>> address in its upper 24 bits.  Lower 8 bits are reserved as 0.
>>>>> Using the upper address word is not backward-compatible either as we
>>>>> didn't
>>>>> check that userspace zeroed the word.  Reserved bits are still not
>>>>> explicitly
>>>>
>>>> Does this means we cannot migrate the VM from KVM_CAP_X2APIC_API enabled
>>>> host to the disable host even VM doesn't have more than 255 VCPUs?
>>>
>>> Yes, but that's why KVM_CAP_X2APIC_API is enabled manually.  The idea is
>>> that QEMU will not use KVM_CAP_X2APIC_API except on the newest machine
>>> type.
>>
>> Thanks for confirmation. And when the KVM_CAP_X2APIC_API will be enabled
>> in Qemu?
>
> It could be 2.7 or 2.8.
>
>>> If interrupt remapping is on, KVM_CAP_X2APIC_API is needed even with 8
>>> VCPUs, I think.  Otherwise KVM will believe that 0xff is "broadcast"
>>> rather than "cluster 0, CPUs 0-7".
>>
>> If interrupt remapping is using, what 0xff means is relying on which
>> mode the destination CPU is in. I think there is no KVM_CAP_X2APIC_API
>> needed since interrupt remapping table gives all the information.
>
> If you have EIM 0xff never means broadcast, but KVM sees a 0xff in the
> interrupt route or KVM_SIGNAL_MSI argument and translates it into a
> broadcast.


I see your point. I thought there would be a new irq routing type (like
KVM_IRQ_ROUTING_IR) to handle all interrupts once IR is turned on, and that
KVM_CAP_X2APIC_API would be dropped. So we will continue to use
KVM_IRQ_ROUTING_IRQCHIP and KVM_IRQ_ROUTING_MSI for interrupts from IR,
right?


--
best regards
yang



Re: [PATCH v2 11/13] KVM: x86: add KVM_CAP_X2APIC_API

2016-07-11 Thread Paolo Bonzini


On 11/07/2016 10:56, Yang Zhang wrote:
> On 2016/7/11 15:44, Paolo Bonzini wrote:
>>
>>
>> On 11/07/2016 08:06, Yang Zhang wrote:
>>>> Changes to MSI addresses follow the format used by interrupt remapping
>>>> unit.
>>>> The upper address word, that used to be 0, contains upper 24 bits of
>>>> the LAPIC
>>>> address in its upper 24 bits.  Lower 8 bits are reserved as 0.
>>>> Using the upper address word is not backward-compatible either as we
>>>> didn't
>>>> check that userspace zeroed the word.  Reserved bits are still not
>>>> explicitly
>>>
>>> Does this means we cannot migrate the VM from KVM_CAP_X2APIC_API enabled
>>> host to the disable host even VM doesn't have more than 255 VCPUs?
>>
>> Yes, but that's why KVM_CAP_X2APIC_API is enabled manually.  The idea is
>> that QEMU will not use KVM_CAP_X2APIC_API except on the newest machine
>> type.
> 
> Thanks for confirmation. And when the KVM_CAP_X2APIC_API will be enabled
> in Qemu?

It could be 2.7 or 2.8.

>>
>> If interrupt remapping is on, KVM_CAP_X2APIC_API is needed even with 8
>> VCPUs, I think.  Otherwise KVM will believe that 0xff is "broadcast"
>> rather than "cluster 0, CPUs 0-7".
> 
> If interrupt remapping is using, what 0xff means is relying on which
> mode the destination CPU is in. I think there is no KVM_CAP_X2APIC_API
> needed since interrupt remapping table gives all the information.

If you have EIM, 0xff never means broadcast, but KVM sees a 0xff in the
interrupt route or KVM_SIGNAL_MSI argument and translates it into a
broadcast.

Paolo
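
To make the point above concrete, here is a minimal sketch (not taken from
the patch) of how a destination ID could be assembled from an MSI address.
It assumes the standard x86 MSI layout, where address_lo bits 19-12 carry
the low 8 bits of the destination, plus the address_hi rule from the
proposed api.txt text. The struct and function names are hypothetical, and
rejecting non-zero reserved bits is only one possible reading of "must be
zero".

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the MSI address fields of a routing entry. */
struct msi_msg_sketch {
    uint32_t address_lo;
    uint32_t address_hi;
};

/*
 * Compute the destination ID for an MSI.
 * Returns 0 on success, or -1 if the extended format is in use and the
 * reserved bits 7-0 of address_hi are not zero (an assumption of this
 * sketch, following the "must be zero" wording).
 */
static int msi_dest_id(const struct msi_msg_sketch *msg, bool x2apic_api,
                       uint32_t *dest_id)
{
    /* address_lo bits 19-12: low 8 bits of the destination ID. */
    uint32_t id = (msg->address_lo >> 12) & 0xffu;

    if (x2apic_api) {
        if (msg->address_hi & 0xffu)
            return -1;
        /* address_hi bits 31-8: bits 31-8 of the destination ID. */
        id |= msg->address_hi & 0xffffff00u;
    }
    /* Without the capability, address_hi is ignored entirely. */

    *dest_id = id;
    return 0;
}
```

With the capability off, address_hi is ignored, so a destination of 0xff
is all that is visible and it ends up being treated as broadcast.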


Re: [PATCH v2 11/13] KVM: x86: add KVM_CAP_X2APIC_API

2016-07-11 Thread Yang Zhang

On 2016/7/11 15:44, Paolo Bonzini wrote:
> On 11/07/2016 08:06, Yang Zhang wrote:
>>> Changes to MSI addresses follow the format used by interrupt remapping
>>> unit.
>>> The upper address word, that used to be 0, contains upper 24 bits of
>>> the LAPIC
>>> address in its upper 24 bits.  Lower 8 bits are reserved as 0.
>>> Using the upper address word is not backward-compatible either as we
>>> didn't
>>> check that userspace zeroed the word.  Reserved bits are still not
>>> explicitly
>>
>> Does this means we cannot migrate the VM from KVM_CAP_X2APIC_API enabled
>> host to the disable host even VM doesn't have more than 255 VCPUs?
>
> Yes, but that's why KVM_CAP_X2APIC_API is enabled manually.  The idea is
> that QEMU will not use KVM_CAP_X2APIC_API except on the newest machine type.

Thanks for the confirmation. When will KVM_CAP_X2APIC_API be enabled
in QEMU?

> If interrupt remapping is on, KVM_CAP_X2APIC_API is needed even with 8
> VCPUs, I think.  Otherwise KVM will believe that 0xff is "broadcast"
> rather than "cluster 0, CPUs 0-7".

If interrupt remapping is in use, what 0xff means depends on which
mode the destination CPU is in. I think KVM_CAP_X2APIC_API is not
needed, since the interrupt remapping table gives all the information.


--
best regards
yang


Re: [PATCH v2 11/13] KVM: x86: add KVM_CAP_X2APIC_API

2016-07-11 Thread Paolo Bonzini


On 11/07/2016 08:06, Yang Zhang wrote:
>> Changes to MSI addresses follow the format used by interrupt remapping
>> unit.
>> The upper address word, that used to be 0, contains upper 24 bits of
>> the LAPIC
>> address in its upper 24 bits.  Lower 8 bits are reserved as 0.
>> Using the upper address word is not backward-compatible either as we
>> didn't
>> check that userspace zeroed the word.  Reserved bits are still not
>> explicitly
> 
> Does this means we cannot migrate the VM from KVM_CAP_X2APIC_API enabled
> host to the disable host even VM doesn't have more than 255 VCPUs?

Yes, but that's why KVM_CAP_X2APIC_API is enabled manually.  The idea is
that QEMU will not use KVM_CAP_X2APIC_API except on the newest machine type.

If interrupt remapping is on, KVM_CAP_X2APIC_API is needed even with 8
VCPUs, I think.  Otherwise KVM will believe that 0xff is "broadcast"
rather than "cluster 0, CPUs 0-7".

Paolo
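
The 0xff ambiguity described above can be illustrated with a small sketch.
It is not KVM code. It only encodes the architectural conventions as I
understand them: an xAPIC destination is 8 bits wide with 0xff as
broadcast, while an x2APIC logical destination is 32 bits wide, with the
cluster ID in bits 31-16, a CPU bitmask in bits 15-0, and 0xffffffff as
broadcast. All function names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* xAPIC: only the low 8 bits are considered, and 0xff is broadcast. */
static bool xapic_is_broadcast(uint32_t dest)
{
    return (dest & 0xffu) == 0xffu;
}

/* x2APIC: destinations are 32 bits wide and broadcast is 0xffffffff. */
static bool x2apic_is_broadcast(uint32_t dest)
{
    return dest == 0xffffffffu;
}

/* x2APIC logical mode: bits 31-16 select the cluster. */
static uint32_t x2apic_cluster_id(uint32_t dest)
{
    return dest >> 16;
}

/* x2APIC logical mode: bits 15-0 are a bitmask of CPUs in the cluster. */
static uint32_t x2apic_cpu_mask(uint32_t dest)
{
    return dest & 0xffffu;
}
```

For a destination of 0xff, the xAPIC reading is "broadcast", whereas the
x2APIC logical reading is cluster 0 with CPU mask 0x00ff, that is, CPUs
0-7 of cluster 0.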


Re: [PATCH v2 11/13] KVM: x86: add KVM_CAP_X2APIC_API

2016-07-11 Thread Yang Zhang

On 2016/7/8 1:15, Radim Krčmář wrote:
> KVM_CAP_X2APIC_API can be enabled to extend APIC ID in get/set ioctl and MSI
> addresses to 32 bits.  Both are needed to support x2APIC.
>
> The capability has to be toggleable and disabled by default, because get/set
> ioctl shifted and truncated APIC ID to 8 bits by using a non-standard protocol
> inspired by xAPIC and the change is not backward-compatible.
>
> Changes to MSI addresses follow the format used by interrupt remapping unit.
> The upper address word, that used to be 0, contains upper 24 bits of the LAPIC
> address in its upper 24 bits.  Lower 8 bits are reserved as 0.
> Using the upper address word is not backward-compatible either as we didn't
> check that userspace zeroed the word.  Reserved bits are still not explicitly

Does this mean we cannot migrate a VM from a KVM_CAP_X2APIC_API-enabled
host to a host where it is disabled, even if the VM has no more than 255
VCPUs?

> checked, but non-zero data will affect LAPIC addresses, which will cause a bug.
>
> Signed-off-by: Radim Krčmář
> ---
>  v2:
>  * pass struct kvm into kvm_set_msi_irq [Paolo]
>  * trace address_hi [David]
>  * use hex dst, like other tracepoins
>  * strict reserved MSI bits checking [Paolo]
>  * loose reserved capability bits checking [Paolo]
>  * improved documentation [Paolo]
>
>  Documentation/virtual/kvm/api.txt | 32
>  arch/x86/include/asm/kvm_host.h   |  4 +++-
>  arch/x86/kvm/irq_comm.c   | 26 +-
>  arch/x86/kvm/lapic.c  | 13 +
>  arch/x86/kvm/vmx.c|  2 +-
>  arch/x86/kvm/x86.c|  5 +
>  include/trace/events/kvm.h|  5 +++--
>  include/uapi/linux/kvm.h  |  1 +
>  8 files changed, 75 insertions(+), 13 deletions(-)
>
> diff --git a/Documentation/virtual/kvm/api.txt
> b/Documentation/virtual/kvm/api.txt
> index 09efa9eb3926..a8f2ef910f98 100644
> --- a/Documentation/virtual/kvm/api.txt
> +++ b/Documentation/virtual/kvm/api.txt
> @@ -1482,6 +1482,10 @@ struct kvm_irq_routing_msi {
> __u32 pad;
>  };
>
> +On x86, address_hi is ignored unless the KVM_CAP_X2APIC_API capability is
> +enabled.  If it is enabled, address_hi bits 31-8 provide bits 31-8 of the
> +destination id.  Bits 7-0 of address_hi must be zero.
> +
>  struct kvm_irq_routing_s390_adapter {
> __u64 ind_addr;
> __u64 summary_addr;
> @@ -1583,6 +1587,15 @@ struct kvm_lapic_state {
>  Reads the Local APIC registers and copies them into the input argument.  The
>  data format and layout are the same as documented in the architecture manual.
>
> +If KVM_CAP_X2APIC_API is enabled, then the format of APIC_ID register depends
> +on the APIC mode (reported by MSR_IA32_APICBASE) of its VCPU.  x2APIC stores
> +APIC ID in the APIC_ID register (bytes 32-35).  xAPIC only allows an 8-bit APIC
> +ID which is stored in bits 31-24 of the APIC register, or equivalently in byte
> +35 of struct kvm_lapic_state's regs field.
> +
> +If KVM_CAP_X2APIC_API is disabled, struct kvm_lapic_state always uses xAPIC
> +format.
> +
>
>  4.58 KVM_SET_LAPIC
>
> @@ -1600,6 +1613,10 @@ struct kvm_lapic_state {
>  Copies the input argument into the Local APIC registers.  The data format
>  and layout are the same as documented in the architecture manual.
>
> +The format of the APIC ID register (bytes 32-35 of struct kvm_lapic_state's
> +regs field) depends on the state of the KVM_CAP_X2APIC_API capability.
> +See the note in KVM_GET_LAPIC.
> +
>
>  4.59 KVM_IOEVENTFD
>
> @@ -2180,6 +2197,10 @@ struct kvm_msi {
>
>  No flags are defined so far. The corresponding field must be 0.
>
> +On x86, address_hi is ignored unless the KVM_CAP_X2APIC_API capability is
> +enabled.  If it is enabled, address_hi bits 31-8 provide bits 31-8 of the
> +destination id.  Bits 7-0 of address_hi must be zero.
> +
>
>  4.71 KVM_CREATE_PIT2
>
> @@ -3811,6 +3832,17 @@ Allows use of runtime-instrumentation introduced with
> zEC12 processor.
>  Will return -EINVAL if the machine does not support runtime-instrumentation.
>  Will return -EBUSY if a VCPU has already been created.
>
> +7.7 KVM_CAP_X2APIC_API
> +
> +Architectures: x86
> +Parameters: none
> +Returns: 0
> +
> +Enabling this capability changes the behavior of KVM_SET_GSI_ROUTING,
> +KVM_SIGNAL_MSI, KVM_SET_LAPIC, and KVM_GET_LAPIC, allowing the use of 32-bit
> +APIC IDs.  See KVM_CAP_X2APIC_API in their respective sections.
> +
> +
>  8. Other capabilities.
>  --
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 643e3dffcd85..f1b202b34c72 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -782,6 +782,8 @@ struct kvm_arch {
> u32 ldr_mode;
> struct page *avic_logical_id_table_page;
> struct page *avic_physical_id_table_page;
> +
> +   bool x2apic_api;
>  };
>
>  struct kvm_vm_stat {
> @@ -1364,7 +1366,7 @@ bool kvm_vcpu_is_bsp(struct kvm_vcpu *vcpu);
>  bool kvm_intr_is_single_vcpu(struct kvm *kvm, struct kvm_lapic_irq *irq,
>  struct kvm_vcpu **dest_vcpu);
>
> -void kvm_set_msi_irq(struct 

[PATCH v2 11/13] KVM: x86: add KVM_CAP_X2APIC_API

2016-07-07 Thread Radim Krčmář
KVM_CAP_X2APIC_API can be enabled to extend APIC ID in get/set ioctl and MSI
addresses to 32 bits.  Both are needed to support x2APIC.

The capability has to be toggleable and disabled by default, because get/set
ioctl shifted and truncated APIC ID to 8 bits by using a non-standard protocol
inspired by xAPIC and the change is not backward-compatible.

Changes to MSI addresses follow the format used by interrupt remapping unit.
The upper address word, that used to be 0, contains upper 24 bits of the LAPIC
address in its upper 24 bits.  Lower 8 bits are reserved as 0.
Using the upper address word is not backward-compatible either as we didn't
check that userspace zeroed the word.  Reserved bits are still not explicitly
checked, but non-zero data will affect LAPIC addresses, which will cause a bug.

Signed-off-by: Radim Krčmář 
---
 v2:
 * pass struct kvm into kvm_set_msi_irq [Paolo]
 * trace address_hi [David]
 * use hex dst, like other tracepoints
 * strict reserved MSI bits checking [Paolo]
 * loose reserved capability bits checking [Paolo]
 * improved documentation [Paolo]

 Documentation/virtual/kvm/api.txt | 32
 arch/x86/include/asm/kvm_host.h   |  4 +++-
 arch/x86/kvm/irq_comm.c           | 26 +-
 arch/x86/kvm/lapic.c              | 13 +
 arch/x86/kvm/vmx.c                |  2 +-
 arch/x86/kvm/x86.c                |  5 +
 include/trace/events/kvm.h        |  5 +++--
 include/uapi/linux/kvm.h          |  1 +
 8 files changed, 75 insertions(+), 13 deletions(-)

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index 09efa9eb3926..a8f2ef910f98 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -1482,6 +1482,10 @@ struct kvm_irq_routing_msi {
__u32 pad;
 };
 
+On x86, address_hi is ignored unless the KVM_CAP_X2APIC_API capability is
+enabled.  If it is enabled, address_hi bits 31-8 provide bits 31-8 of the
+destination id.  Bits 7-0 of address_hi must be zero.
+
 struct kvm_irq_routing_s390_adapter {
__u64 ind_addr;
__u64 summary_addr;
@@ -1583,6 +1587,15 @@ struct kvm_lapic_state {
 Reads the Local APIC registers and copies them into the input argument.  The
 data format and layout are the same as documented in the architecture manual.
 
+If KVM_CAP_X2APIC_API is enabled, then the format of APIC_ID register depends
+on the APIC mode (reported by MSR_IA32_APICBASE) of its VCPU.  x2APIC stores
+APIC ID in the APIC_ID register (bytes 32-35).  xAPIC only allows an 8-bit APIC
+ID which is stored in bits 31-24 of the APIC register, or equivalently in byte
+35 of struct kvm_lapic_state's regs field.
+
+If KVM_CAP_X2APIC_API is disabled, struct kvm_lapic_state always uses xAPIC
+format.
+
 
 4.58 KVM_SET_LAPIC
 
@@ -1600,6 +1613,10 @@ struct kvm_lapic_state {
 Copies the input argument into the Local APIC registers.  The data format
 and layout are the same as documented in the architecture manual.
 
+The format of the APIC ID register (bytes 32-35 of struct kvm_lapic_state's
+regs field) depends on the state of the KVM_CAP_X2APIC_API capability.
+See the note in KVM_GET_LAPIC.
+
 
 4.59 KVM_IOEVENTFD
 
@@ -2180,6 +2197,10 @@ struct kvm_msi {
 
 No flags are defined so far. The corresponding field must be 0.
 
+On x86, address_hi is ignored unless the KVM_CAP_X2APIC_API capability is
+enabled.  If it is enabled, address_hi bits 31-8 provide bits 31-8 of the
+destination id.  Bits 7-0 of address_hi must be zero.
+
 
 4.71 KVM_CREATE_PIT2
 
@@ -3811,6 +3832,17 @@ Allows use of runtime-instrumentation introduced with zEC12 processor.
 Will return -EINVAL if the machine does not support runtime-instrumentation.
 Will return -EBUSY if a VCPU has already been created.
 
+7.7 KVM_CAP_X2APIC_API
+
+Architectures: x86
+Parameters: none
+Returns: 0
+
+Enabling this capability changes the behavior of KVM_SET_GSI_ROUTING,
+KVM_SIGNAL_MSI, KVM_SET_LAPIC, and KVM_GET_LAPIC, allowing the use of 32-bit
+APIC IDs.  See KVM_CAP_X2APIC_API in their respective sections.
+
+
 8. Other capabilities.
 --
 
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 643e3dffcd85..f1b202b34c72 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -782,6 +782,8 @@ struct kvm_arch {
u32 ldr_mode;
struct page *avic_logical_id_table_page;
struct page *avic_physical_id_table_page;
+
+   bool x2apic_api;
 };
 
 struct kvm_vm_stat {
@@ -1364,7 +1366,7 @@ bool kvm_vcpu_is_bsp(struct kvm_vcpu *vcpu);
 bool kvm_intr_is_single_vcpu(struct kvm *kvm, struct kvm_lapic_irq *irq,
 struct kvm_vcpu **dest_vcpu);
 
-void kvm_set_msi_irq(struct kvm_kernel_irq_routing_entry *e,
+void kvm_set_msi_irq(struct kvm *kvm, struct kvm_kernel_irq_routing_entry *e,
 struct kvm_lapic_irq *irq);
 
 static inline void 
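
As an illustration of the KVM_GET_LAPIC note in the api.txt hunk above,
the following sketch reads the APIC ID out of a register buffer shaped
like the regs field of struct kvm_lapic_state. It is not part of the
patch. The function name is hypothetical, and the x2apic_format flag is
assumed to be true only when KVM_CAP_X2APIC_API is enabled and the VCPU is
in x2APIC mode.

```c
#include <stdbool.h>
#include <stdint.h>

/* Offset of the APIC ID register in the register page (bytes 32-35). */
#define APIC_ID_OFFSET 0x20

/*
 * Extract the APIC ID from a LAPIC register buffer.
 * xAPIC format: 8-bit ID in bits 31-24 of the register (byte 35).
 * x2APIC format: the whole 32-bit register is the ID.
 */
static uint32_t lapic_regs_apic_id(const uint8_t *regs, bool x2apic_format)
{
    /* Assemble the little-endian 32-bit register byte by byte. */
    uint32_t reg = (uint32_t)regs[APIC_ID_OFFSET]
                 | (uint32_t)regs[APIC_ID_OFFSET + 1] << 8
                 | (uint32_t)regs[APIC_ID_OFFSET + 2] << 16
                 | (uint32_t)regs[APIC_ID_OFFSET + 3] << 24;

    return x2apic_format ? reg : reg >> 24;
}
```

Reading a buffer with the wrong format flag would shift or truncate the
ID, which is the backward-compatibility concern raised in the commit
message.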