On 09.11.2010 14:27, Avi Kivity wrote:
> On 11/08/2010 01:21 PM, Jan Kiszka wrote:
>> PCI 2.3 allows generically disabling IRQ sources at device level. This
>> enables us to share IRQs of such devices on the host side when
>> passing them to a guest. This feature is optional, user space has to
>> request it explicitly. Moreover, user space can inform us about its view
>> of PCI_COMMAND_INTX_DISABLE so that we can avoid unmasking the interrupt
>> and signaling it if the guest masked it via the PCI config space.
>>
>
> It's a pity this cannot be done transparently. We could detect multiple
> devices sharing the line,
Even that is not possible. Assigned or host devices may be activated
after we registered exclusively, pushing the breakage from VM start-up
to a different operation.
> but what about PCI_COMMAND_INTX_DISABLE?
>
> Perhaps we can hook the kernel's handler for this bit?
Some IRQ registration notifier that would allow us to reregister our
handler with IRQ sharing support? Maybe.
>
>>
>> /* Depends on KVM_CAP_IOMMU */
>> #define KVM_DEV_ASSIGN_ENABLE_IOMMU (1 << 0)
>> +/* The following two depend on KVM_CAP_PCI_2_3 */
>> +#define KVM_DEV_ASSIGN_PCI_2_3 (1 << 1)
>> +#define KVM_DEV_ASSIGN_MASK_INTX (1 << 2)
>> +
>> +If KVM_DEV_ASSIGN_PCI_2_3 is set, the kernel will manage legacy INTx interrupts
>> +via the PCI-2.3-compliant device-level mask, thus enabling IRQ sharing with other
>> +assigned devices or host devices. KVM_DEV_ASSIGN_MASK_INTX specifies the
>> +guest's view on the INTx mask, see KVM_ASSIGN_SET_INTX_MASK for details.
>>
>> 4.48 KVM_DEASSIGN_PCI_DEVICE
>>
>> @@ -1263,6 +1271,23 @@ struct kvm_assigned_msix_entry {
>> __u16 padding[3];
>> };
>>
>> +5.54 KVM_ASSIGN_SET_INTX_MASK
>
> 4.54?
Of course.
>
> (54? wow.)
And I don't think all IOCTLs are documented yet (though the majority
are by now).
>
>> +
>> +Capability: KVM_CAP_PCI_2_3
>> +Architectures: x86
>> +Type: vm ioctl
>> +Parameters: struct kvm_assigned_pci_dev (in)
>> +Returns: 0 on success, -1 on error
>> +
>> +Informs the kernel about the guest's view on the INTx mask. As long as the
>> +guest masks the legacy INTx, the kernel will refrain from unmasking it at
>> +hardware level and will not assert the guest's IRQ line. User space is still
>> +responsible for applying this state to the assigned device's real config space.
>
> What if userspace lies?
User space problem. We will at worst receive one IRQ, mask it, and then
user space needs to react again.
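
To make the worst case concrete, here is a minimal user-space model in C.
It is only a sketch, not the kernel code: struct model_dev and the model_*
functions are hypothetical names invented for illustration. The flag values
are the ones proposed in the patch, and PCI_COMMAND_INTX_DISABLE is the
standard bit from pci_regs.h. The model assumes the host handler always
masks INTx at device level and forwards the interrupt only if user space has
not reported the guest's mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag values as proposed in the patch. */
#define KVM_DEV_ASSIGN_PCI_2_3   (1 << 1)
#define KVM_DEV_ASSIGN_MASK_INTX (1 << 2)
/* INTx disable bit in the PCI_COMMAND register (pci_regs.h). */
#define PCI_COMMAND_INTX_DISABLE 0x400

/* Hypothetical model of one assigned device; not a kernel structure. */
struct model_dev {
    uint32_t flags;        /* what user space told the kernel */
    uint16_t command;      /* the device's real PCI_COMMAND register */
    bool     intx_pending; /* device wants to signal INTx */
    int      host_irqs;    /* interrupts the host handler accepted */
    int      guest_irqs;   /* times the guest IRQ line was asserted */
};

/* The device reaches the host only while INTx is enabled in hardware. */
static bool model_line_asserted(const struct model_dev *d)
{
    return d->intx_pending && !(d->command & PCI_COMMAND_INTX_DISABLE);
}

/*
 * Model of the host interrupt handler for a PCI 2.3 device.
 * Returns false if the interrupt was not ours (the line may be shared).
 */
static bool model_host_irq(struct model_dev *d)
{
    assert(d->flags & KVM_DEV_ASSIGN_PCI_2_3);

    if (!model_line_asserted(d))
        return false;

    d->host_irqs++;
    /* Always mask at device level so the shared line is released. */
    d->command |= PCI_COMMAND_INTX_DISABLE;

    /* Forward to the guest only if the guest has not masked INTx. */
    if (!(d->flags & KVM_DEV_ASSIGN_MASK_INTX))
        d->guest_irqs++;

    return true;
}
```

If user space sets KVM_DEV_ASSIGN_MASK_INTX but leaves the real command
register unmasked, the first model_host_irq() call accepts one interrupt,
masks the device, and injects nothing. A second call returns false until
user space touches the device again.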
>
>> +
>> +See KVM_ASSIGN_DEV_IRQ for the data structure. The target device is specified
>> +by assigned_dev_id. In the flags field, only KVM_DEV_ASSIGN_MASK_INTX is
>> +evaluated.
>> +
>>
>> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
>> index fe83eb0..7f1627c 100644
>> --- a/include/linux/kvm_host.h
>> +++ b/include/linux/kvm_host.h
>> @@ -468,6 +468,7 @@ struct kvm_assigned_dev_kernel {
>> unsigned int entries_nr;
>> int host_irq;
>> bool host_irq_disabled;
>> + bool pci_2_3;
>> struct msix_entry *host_msix_entries;
>> int guest_irq;
>> struct msix_entry *guest_msix_entries;
>> @@ -477,6 +478,7 @@ struct kvm_assigned_dev_kernel {
>> struct pci_dev *dev;
>> struct kvm *kvm;
>> spinlock_t intx_lock;
>> + struct mutex intx_mask_lock;
>> char irq_name[32];
>> };
>
> I saw no reason this can't be a spinlock, but perhaps I missed
> something. This would allow us to avoid srcu, which is slightly more
> expensive than rcu. Since pci 2.3 assigned devices are not a major use
> case, I'd like not to penalize the mainstream users for this.
The lock has to be held across kvm_set_irq, which is the potentially
expensive (O(n), n == number of VCPUs) operation.
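
A rough user-space sketch of the locking pattern, using a pthread mutex as a
stand-in for the kernel mutex. All names (struct model_locked_dev and the
model_* functions) are hypothetical, and model_kvm_set_irq() is a trivial
placeholder for the real kvm_set_irq(). The point is only that the check of
KVM_DEV_ASSIGN_MASK_INTX and the injection happen under the same lock, so
the mask update cannot race with an injection in flight.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag value as proposed in the patch. */
#define KVM_DEV_ASSIGN_MASK_INTX (1 << 2)

/* Hypothetical model; the real structure is kvm_assigned_dev_kernel. */
struct model_locked_dev {
    pthread_mutex_t intx_mask_lock; /* stand-in for the kernel mutex */
    uint32_t flags;
    int guest_line;                 /* modeled level of the guest IRQ line */
};

/* Placeholder for kvm_set_irq(), which may be O(number of VCPUs). */
static void model_kvm_set_irq(struct model_locked_dev *d, int level)
{
    assert(level == 0 || level == 1);
    d->guest_line = level;
}

/* Injection path: check the guest's mask and inject under one lock. */
static void model_irq_work(struct model_locked_dev *d)
{
    pthread_mutex_lock(&d->intx_mask_lock);
    if (!(d->flags & KVM_DEV_ASSIGN_MASK_INTX))
        model_kvm_set_irq(d, 1);
    pthread_mutex_unlock(&d->intx_mask_lock);
}

/* Mask update path: change the flag and lower the line atomically. */
static void model_set_intx_mask(struct model_locked_dev *d, bool masked)
{
    pthread_mutex_lock(&d->intx_mask_lock);
    if (masked) {
        d->flags |= KVM_DEV_ASSIGN_MASK_INTX;
        model_kvm_set_irq(d, 0);
    } else {
        d->flags &= ~KVM_DEV_ASSIGN_MASK_INTX;
    }
    pthread_mutex_unlock(&d->intx_mask_lock);
}
```

Because the potentially long-running injection sits inside the critical
section, a sleeping lock is the natural fit in this model; with a spinlock,
other CPUs would spin for the whole duration of the call.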
>
> This patch undoes some of the niceness of the previous patches, but I
> have no alternative to suggest.
>
Yes, it surely does not make things simpler. But much of the complexity
is avoided during runtime when MSIs are used.
Jan
--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux