Kay, Allen M wrote: > Kvm kernel changes. > > Signed-off-by: Allen M Kay <[EMAIL PROTECTED]> > > ------ > arch/x86/kvm/Makefile | 2 > arch/x86/kvm/vtd.c | 183 > +++++++++++++++++++++++++++++++++++++++++++++ > arch/x86/kvm/x86.c | 7 + > include/asm-x86/kvm_host.h | 3 > include/asm-x86/kvm_para.h | 1 > include/linux/kvm_host.h | 6 + > virt/kvm/kvm_main.c | 3 > 7 files changed, 204 insertions(+), 1 deletion(-) > > ------ > > diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile > index c97d35c..b1057fb 100644 > --- a/arch/x86/kvm/Makefile > +++ b/arch/x86/kvm/Makefile > @@ -12,7 +12,7 @@ EXTRA_CFLAGS += -Ivirt/kvm -Iarch/x86/kvm > kvm-objs := $(common-objs) x86.o mmu.o x86_emulate.o i8259.o irq.o > lapic.o \ > i8254.o > obj-$(CONFIG_KVM) += kvm.o > -kvm-intel-objs = vmx.o > +kvm-intel-objs = vmx.o vtd.o > obj-$(CONFIG_KVM_INTEL) += kvm-intel.o > kvm-amd-objs = svm.o > obj-$(CONFIG_KVM_AMD) += kvm-amd.o > diff --git a/arch/x86/kvm/vtd.c b/arch/x86/kvm/vtd.c > new file mode 100644 > index 0000000..9a080b5 > --- /dev/null > +++ b/arch/x86/kvm/vtd.c > @@ -0,0 +1,183 @@ > +/* > + * Copyright (c) 2006, Intel Corporation. > + * > + * This program is free software; you can redistribute it and/or modify > it > + * under the terms and conditions of the GNU General Public License, > + * version 2, as published by the Free Software Foundation. > + * > + * This program is distributed in the hope it will be useful, but > WITHOUT > + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY > or > + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public > License for > + * more details. > + * > + * You should have received a copy of the GNU General Public License > along with > + * this program; if not, write to the Free Software Foundation, Inc., > 59 Temple > + * Place - Suite 330, Boston, MA 02111-1307 USA. > + * > + * Copyright (C) 2006-2008 Intel Corporation > + * Author: Allen M. Kay <[EMAIL PROTECTED]> > + * Author: Weidong Han <[EMAIL PROTECTED]> > + */ > + > +#include <linux/list.h> > +#include <linux/kvm_host.h> > +#include <linux/pci.h> > +#include <linux/dmar.h> > +#include <linux/intel-iommu.h> > + > +//#define DEBUG > + > +#define DEFAULT_DOMAIN_ADDRESS_WIDTH 48 > + > +struct dmar_drhd_unit * dmar_find_matched_drhd_unit(struct pci_dev > *dev); > +struct dmar_domain * iommu_alloc_domain(struct intel_iommu *iommu); > +void iommu_free_domain(struct dmar_domain *domain); > +int domain_init(struct dmar_domain *domain, int guest_width); > +int domain_context_mapping(struct dmar_domain *d, > + struct pci_dev *pdev); > +int domain_page_mapping(struct dmar_domain *domain, dma_addr_t iova, > + u64 hpa, size_t size, int prot); > +void detach_domain_for_dev(struct dmar_domain *domain, u8 bus, u8 > devfn); > +struct dmar_domain * find_domain(struct pci_dev *pdev); >
These definitely need to be moved to a common header. > + > +int kvm_iommu_map_pages(struct kvm *kvm, > + gfn_t base_gfn, unsigned long npages) > +{ > + unsigned long gpa; > + struct page *page; > + hpa_t hpa; > + int j, write; > + struct vm_area_struct *vma; > + > + if (!kvm->arch.domain) > + return 1; > In the kernel, we should be using -errno to return error codes. > + gpa = base_gfn << PAGE_SHIFT; > + page = gfn_to_page(kvm, base_gfn); > + hpa = page_to_phys(page); > Please use gfn_to_pfn(). Keep in mind, by using gfn_to_page/gfn_to_pfn, you take a reference to a page. You're leaking that reference here. > + printk(KERN_DEBUG "kvm_iommu_map_page: gpa = %lx\n", gpa); > + printk(KERN_DEBUG "kvm_iommu_map_page: hpa = %llx\n", hpa); > + printk(KERN_DEBUG "kvm_iommu_map_page: size = %lx\n", > + npages*PAGE_SIZE); > + > + for (j = 0; j < npages; j++) { > + gpa += PAGE_SIZE; > + page = gfn_to_page(kvm, gpa >> PAGE_SHIFT); > + hpa = page_to_phys(page); > Again, gfn_to_pfn() and you're taking a reference that I never see you releasing. > + domain_page_mapping(kvm->arch.domain, gpa, hpa, > PAGE_SIZE, > + DMA_PTE_READ | DMA_PTE_WRITE); > + vma = find_vma(current->mm, gpa); > + if (!vma) > + return 1; > + write = (vma->vm_flags & VM_WRITE) != 0; > + get_user_pages(current, current->mm, gpa, > + PAGE_SIZE, write, 0, NULL, NULL); > I don't quite see what you're doing here. It looks like you're trying to pre-fault the page in? gfn_to_pfn will do that for you. You're taking a bunch of references here that are never getting released. I think the general approach here is a bit faulty. I think what we want to do is mlock() from userspace to ensure all the memory is present for the guest. We should combine this with MMU-notifiers such that whenever the userspace mapping changes, we can reprogram the IOMMU. In the case where we don't have MMU-notifiers, we simply hold on to the memory forever and never program the IOMMU. The initial mlock() in userspace is somewhat of a nop here but it's important with MMU-notifiers because we will no longer be holding a reference for guest memory. We have to ensure we don't swap KVM guest memory while using hardware pass-through, but AFAICT, we do not need to make the memory non-reclaimable As long as we reprogram the IOMMU with a new, valid, mapping everything should be fine. mlock() really gives us the right semantics. Semantically, a PV API that supports DMA window registration simply mlock()s the DMA regions on behalf of the guest. No special logic should be needed. Regards, Anthony Liguori ------------------------------------------------------------------------- This SF.net email is sponsored by the 2008 JavaOne(SM) Conference Don't miss this year's exciting event. There's still time to save $100. Use priority code J8TL2D2. http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone _______________________________________________ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel