* On Thursday 21 Aug 2008 16:35:57 Ben-Ami Yassour wrote:
> On Thu, 2008-08-21 at 12:13 +0530, Amit Shah wrote:
> > * On Thursday 07 Aug 2008 19:44:47 Ben-Ami Yassour wrote:
> > > Based on a patch by: Kay, Allen M <[EMAIL PROTECTED]>
> > >
> > > This patch enables pci device assignment based on VT-d support.
> > > When a device is assigned to the guest, the guest memory is pinned and
> > > the mapping is updated in the VT-d IOMMU.
> > >
> > > Signed-off-by: Kay, Allen M <[EMAIL PROTECTED]>
> > > Signed-off-by: Weidong Han <[EMAIL PROTECTED]>
> > > Signed-off-by: Ben-Ami Yassour <[EMAIL PROTECTED]>
> > > ---
> > > arch/x86/kvm/Makefile | 3 +
> > > arch/x86/kvm/vtd.c | 203
> > > ++++++++++++++++++++++++++++++++++++++++++++ arch/x86/kvm/x86.c
> > > | 10 ++
> > > include/asm-x86/kvm_host.h | 3 +
> > > include/linux/kvm_host.h | 32 +++++++
> > > virt/kvm/kvm_main.c | 9 ++-
> > > 6 files changed, 259 insertions(+), 1 deletions(-)
> > > create mode 100644 arch/x86/kvm/vtd.c
> > >
> > > +int kvm_iommu_map_guest(struct kvm *kvm,
> > > + struct kvm_assigned_dev_kernel *assigned_dev)
> > > +{
> > > + struct pci_dev *pdev = NULL;
> > > + int rc;
> > > +
> > > + if (!intel_iommu_found()) {
> > > + printk(KERN_ERR "intel iommu not found\n");
> > > + return -ENODEV;
> > > + }
> > > +
> > > + printk(KERN_DEBUG "VT-d direct map: host bdf = %x:%x:%x\n",
> > > + assigned_dev->host_busnr,
> > > + PCI_SLOT(assigned_dev->host_devfn),
> > > + PCI_FUNC(assigned_dev->host_devfn));
> > > +
> > > + pdev = assigned_dev->dev;
> > > +
> > > + if (pdev == NULL) {
> > > + if (kvm->arch.intel_iommu_domain) {
> > > + intel_iommu_domain_exit(kvm->arch.intel_iommu_domain);
> > > + kvm->arch.intel_iommu_domain = NULL;
> > > + }
> > > + return -ENODEV;
> > > + }
> > > +
> > > + kvm->arch.intel_iommu_domain = intel_iommu_domain_alloc(pdev);
> > >
> > > +int kvm_iommu_unmap_guest(struct kvm *kvm)
> > > +{
> > > + struct kvm_assigned_dev_kernel *entry;
> > > + struct pci_dev *pdev = NULL;
> > > + struct dmar_domain *domain = kvm->arch.intel_iommu_domain;
> > > +
> > > + /* check if iommu exists and in use */
> > > + if (!domain)
> > > + return 0;
> > > +
> > > + list_for_each_entry(entry, &kvm->arch.assigned_dev_head, list) {
> > > + printk(KERN_DEBUG "VT-d unmap: host bdf = %x:%x:%x\n",
> > > + entry->host_busnr,
> > > + PCI_SLOT(entry->host_devfn),
> > > + PCI_FUNC(entry->host_devfn));
> > > +
> > > + for_each_pci_dev(pdev) {
> > > + if ((pdev->bus->number == entry->host_busnr) &&
> > > + (pdev->devfn == entry->host_devfn))
> > > + break;
> > > + }
> > > +
> > > + if (pdev == NULL)
> > > + return -ENODEV;
> > > +
> > > + /* detach kvm dmar domain */
> > > + intel_iommu_detach_dev(domain,
> > > + pdev->bus->number, pdev->devfn);
> > > + }
> > > + kvm_iommu_unmap_memslots(kvm);
> > > + intel_iommu_domain_exit(domain);
> > > + return 0;
> > > +}
> > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > > index a97157c..5cfc21a 100644
> > > --- a/arch/x86/kvm/x86.c
> > > +++ b/arch/x86/kvm/x86.c
> > > @@ -35,6 +35,7 @@
> > > #include <linux/module.h>
> > > #include <linux/mman.h>
> > > #include <linux/highmem.h>
> > > +#include <linux/intel-iommu.h>
> > >
> > > #include <asm/uaccess.h>
> > > #include <asm/msr.h>
> > > @@ -271,9 +272,17 @@ static int kvm_vm_ioctl_assign_device(struct kvm
> > > *kvm,
> > >
> > > list_add(&match->list, &kvm->arch.assigned_dev_head);
> > >
> > > + /* currenlty iommu is required for handling dma */
> > > + r = kvm_iommu_map_guest(kvm, match);
> > > + if (r)
> > > + goto out_list_del;
> > > +
> >
> > This will break pvdma support. Please check if intel-iommu is found and
> > only then bail out of here.
> >
> > Even though pvdma is out-of-tree, it doesn't make sense to restrict our
> > usage here. AMD IOMMU support will need this to be handled in the right
> > way as well.
> >
> > > out:
> > > mutex_unlock(&kvm->lock);
> > > return r;
> > > +out_list_del:
> > > + list_del(&match->list);
> > > + pci_release_regions(dev);
> > > out_disable:
> > > pci_disable_device(dev);
> > > out_put:
> > > @@ -4219,6 +4228,7 @@ static void kvm_free_vcpus(struct kvm *kvm)
> > >
> > > void kvm_arch_destroy_vm(struct kvm *kvm)
> > > {
> > > + kvm_iommu_unmap_guest(kvm);
> >
> > Likewise, check if intel-iommu is found as a first step in this function.
> >
> > As part of getting AMD-iommu and pvdma support in, we should have generic
> > function names, but that can be done later.
> >
> > > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> > > index a18aaad..b703890 100644
> > > --- a/include/linux/kvm_host.h
> > > +++ b/include/linux/kvm_host.h
> > > @@ -285,6 +285,33 @@ int kvm_cpu_has_interrupt(struct kvm_vcpu *v);
> > > int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu);
> > > void kvm_vcpu_kick(struct kvm_vcpu *vcpu);
> > >
> > > +#ifdef CONFIG_DMAR
> > > +int kvm_iommu_map_pages(struct kvm *kvm, gfn_t base_gfn,
> > > + unsigned long npages);
> > > +int kvm_iommu_map_guest(struct kvm *kvm,
> > > + struct kvm_assigned_dev_kernel *assigned_dev);
> > > +int kvm_iommu_unmap_guest(struct kvm *kvm);
> > > +#else /* CONFIG_DMAR */
> > > +static inline int kvm_iommu_map_pages(struct kvm *kvm,
> > > + gfn_t base_gfn,
> > > + unsigned long npages)
> > > +{
> > > + return 0;
> > > +}
> > > +
> > > +static inline int kvm_iommu_map_guest(struct kvm *kvm,
> > > + struct kvm_assigned_dev_kernel
> > > + *assigned_dev)
> > > +{
> > > + return -ENODEV;
> > > +}
> > > +
> > > +static inline int kvm_iommu_unmap_guest(struct kvm *kvm)
> > > +{
> > > + return 0;
> > > +}
> > > +#endif /* CONFIG_DMAR */
> >
> > This means you can't use pvdma with iommu. Make these functions that
> > check if iommu support for device assignment is enabled. A command-line
> > parameter to qemu stating you want to enable vt-d for device assignment
> > would be appropriate.
> >
> > For example, we could have the device assignment parameter changed to
> > look like this:
> >
> > -pcidevice dev=00:13.0,dma=vtd,dma=pvdma
> >
> > which will mean we should use vtd in conjunction with pvdma for this
> > assigned device.
>
> Amit,
>
> I agree with your comments that adding pvdma and AMD iommu support will
> require more changes.
> But I think that we should merge this code now, knowing that it will
> need to change when more functionality is added in the future (isn't
> that the way it usually works?).
I don't agree with this; at the minimum, something like this should be done:
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 29f94ba..d655188 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -277,10 +277,12 @@ static int kvm_vm_ioctl_assign_device(struct kvm *kvm,
list_add(&match->list, &kvm->arch.assigned_dev_head);
- /* currenlty iommu is required for handling dma */
- r = kvm_iommu_map_guest(kvm, match);
- if (r)
- goto out_list_del;
+ /* Enable the Intel IOMMU if available */
+ if (intel_iommu_found()) {
+ r = kvm_iommu_map_guest(kvm, match);
+ if (r)
+ goto out_list_del;
+ }
out:
mutex_unlock(&kvm->lock);
> I think we need to avoid the scenario we had two months ago with a lot
> of pending code on a separate tree. It was much harder for people to
That happened because the code was under development. It wasn't sitting out
for lack of reviews or users.
> review the code and to consider merging it. If we merge this code then
> the future changes will hopefully be smaller and easier to review and
> merge.
Though true, the changes suggested are small enough and are a big convenience
to the other developers involved. Remember that parallel development on other
modes of DMA translation is ongoing and all are based on kvm-mainline. Once
we have VT-d in, the others will base their work off this. If a small patch
can make their lives easier later, it's worth it to include the patch now.
I'll mostly do the command-line option-based checking patch tomorrow unless
someone else steps up first.
Amit
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html