Kay, Allen M wrote:
> Kvm kernel changes.
>
> Signed-off-by: Allen M Kay <[EMAIL PROTECTED]>
>
> ------
>  arch/x86/kvm/Makefile      |    2 
>  arch/x86/kvm/vtd.c         |  183
> +++++++++++++++++++++++++++++++++++++++++++++
>  arch/x86/kvm/x86.c         |    7 +
>  include/asm-x86/kvm_host.h |    3 
>  include/asm-x86/kvm_para.h |    1 
>  include/linux/kvm_host.h   |    6 +
>  virt/kvm/kvm_main.c        |    3 
>  7 files changed, 204 insertions(+), 1 deletion(-)
>
> ------
>
> diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile
> index c97d35c..b1057fb 100644
> --- a/arch/x86/kvm/Makefile
> +++ b/arch/x86/kvm/Makefile
> @@ -12,7 +12,7 @@ EXTRA_CFLAGS += -Ivirt/kvm -Iarch/x86/kvm
>  kvm-objs := $(common-objs) x86.o mmu.o x86_emulate.o i8259.o irq.o
> lapic.o \
>       i8254.o
>  obj-$(CONFIG_KVM) += kvm.o
> -kvm-intel-objs = vmx.o
> +kvm-intel-objs = vmx.o vtd.o
>  obj-$(CONFIG_KVM_INTEL) += kvm-intel.o
>  kvm-amd-objs = svm.o
>  obj-$(CONFIG_KVM_AMD) += kvm-amd.o
> diff --git a/arch/x86/kvm/vtd.c b/arch/x86/kvm/vtd.c
> new file mode 100644
> index 0000000..9a080b5
> --- /dev/null
> +++ b/arch/x86/kvm/vtd.c
> @@ -0,0 +1,183 @@
> +/*
> + * Copyright (c) 2006, Intel Corporation.
> + *
> + * This program is free software; you can redistribute it and/or modify
> it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but
> WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
> or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
> License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License
> along with
> + * this program; if not, write to the Free Software Foundation, Inc.,
> 59 Temple
> + * Place - Suite 330, Boston, MA 02111-1307 USA.
> + *
> + * Copyright (C) 2006-2008 Intel Corporation
> + * Author: Allen M. Kay <[EMAIL PROTECTED]>
> + * Author: Weidong Han <[EMAIL PROTECTED]>
> + */
> +
> +#include <linux/list.h>
> +#include <linux/kvm_host.h>
> +#include <linux/pci.h>
> +#include <linux/dmar.h>
> +#include <linux/intel-iommu.h>
> +
> +//#define DEBUG
> +
> +#define DEFAULT_DOMAIN_ADDRESS_WIDTH 48
> +
> +struct dmar_drhd_unit * dmar_find_matched_drhd_unit(struct pci_dev
> *dev);
> +struct dmar_domain * iommu_alloc_domain(struct intel_iommu *iommu);
> +void iommu_free_domain(struct dmar_domain *domain);
> +int domain_init(struct dmar_domain *domain, int guest_width);
> +int domain_context_mapping(struct dmar_domain *d,
> +                     struct pci_dev *pdev);
> +int domain_page_mapping(struct dmar_domain *domain, dma_addr_t iova,
> +                     u64 hpa, size_t size, int prot);
> +void detach_domain_for_dev(struct dmar_domain *domain, u8 bus, u8
> devfn);
> +struct dmar_domain * find_domain(struct pci_dev *pdev);
>   

These definitely need to be moved to a common header.

> +
> +int kvm_iommu_map_pages(struct kvm *kvm,
> +     gfn_t base_gfn, unsigned long npages)
> +{
> +     unsigned long gpa;
> +     struct page *page;
> +     hpa_t hpa;
> +     int j, write;
> +     struct vm_area_struct *vma;
> +
> +     if (!kvm->arch.domain)
> +             return 1;
>   

In the kernel, we should be using -errno to return error codes.

> +     gpa = base_gfn << PAGE_SHIFT;
> +     page = gfn_to_page(kvm, base_gfn);
> +     hpa = page_to_phys(page);
>   

Please use gfn_to_pfn().  Keep in mind, by using gfn_to_page/gfn_to_pfn, 
you take a reference to a page.  You're leaking that reference here.

> +     printk(KERN_DEBUG "kvm_iommu_map_page: gpa = %lx\n", gpa);
> +     printk(KERN_DEBUG "kvm_iommu_map_page: hpa = %llx\n", hpa);
> +     printk(KERN_DEBUG "kvm_iommu_map_page: size = %lx\n",
> +             npages*PAGE_SIZE);
> +
> +     for (j = 0; j < npages; j++) {
> +             gpa +=  PAGE_SIZE;
> +             page = gfn_to_page(kvm, gpa >> PAGE_SHIFT);
> +             hpa = page_to_phys(page);
>   

Again, gfn_to_pfn() and you're taking a reference that I never see you 
releasing.

> +             domain_page_mapping(kvm->arch.domain, gpa, hpa,
> PAGE_SIZE,
> +                                     DMA_PTE_READ | DMA_PTE_WRITE);
> +             vma = find_vma(current->mm, gpa);
> +             if (!vma)
> +                     return 1;
> +             write = (vma->vm_flags & VM_WRITE) != 0;
> +             get_user_pages(current, current->mm, gpa,
> +                             PAGE_SIZE, write, 0, NULL, NULL);
>   

I don't quite see what you're doing here.  It looks like you're trying 
to pre-fault the page in?  gfn_to_pfn will do that for you.  You're 
taking a bunch of references here that are never getting released.

I think the general approach here is a bit faulty.  I think what we want 
to do is mlock() from userspace to ensure all the memory is present for 
the guest.  We should combine this with MMU-notifiers such that whenever 
the userspace mapping changes, we can reprogram the IOMMU.  In the case 
where we don't have MMU-notifiers, we simply hold on to the memory 
forever and never program the IOMMU.  The initial mlock() in userspace 
is somewhat of a nop here but it's important with MMU-notifiers because 
we will no longer be holding a reference for guest memory.

We have to ensure we don't swap KVM guest memory while using hardware 
pass-through, but AFAICT, we do not need to make the memory 
non-reclaimable  As long as we reprogram the IOMMU with a new, valid, 
mapping everything should be fine.  mlock() really gives us the right 
semantics.

Semantically, a PV API that supports DMA window registration simply 
mlock()s the DMA regions on behalf of the guest.  No special logic 
should be needed.

Regards,

Anthony Liguori

-------------------------------------------------------------------------
This SF.net email is sponsored by the 2008 JavaOne(SM) Conference 
Don't miss this year's exciting event. There's still time to save $100. 
Use priority code J8TL2D2. 
http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone
_______________________________________________
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel

Reply via email to