Hi Avi,

Good that you're back.

On Sun, Mar 16, 2008 at 04:00:06PM +0200, Avi Kivity wrote:
> Marcelo Tosatti wrote:
> >This patchset resends Anthony's QEMU balloon support plus:
> >
> >- Truncates the target size to ram size
> >- Enables madvise() conditioned on KVM_ZAP_GFN ioctl
> >
> >  
> 
> Once mmu notifiers are in, KVM_ZAP_GFN isn't needed.  So we have three 
> possible situations:
> 
> - zap needed, but not available: don't madvise()
> - zap needed and available: zap and madvise()
> - zap unneeded: madvise()
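That three-way decision can be sketched as two predicates (flag names are hypothetical; the real code would probe for mmu-notifier support and for the KVM_ZAP_GFN ioctl):

```c
#include <stdbool.h>

/* Hypothetical flags: "have_mmu_notifiers" means zap is unneeded,
 * "have_zap_ioctl" means KVM_ZAP_GFN is available. */
static bool should_zap(bool have_mmu_notifiers, bool have_zap_ioctl)
{
	/* zap only when it is both needed and available */
	return !have_mmu_notifiers && have_zap_ioctl;
}

static bool should_madvise(bool have_mmu_notifiers, bool have_zap_ioctl)
{
	/* madvise() only when stale shadow ptes get torn down somehow:
	 * either by mmu notifiers, or by an explicit zap beforehand */
	return have_mmu_notifiers || have_zap_ioctl;
}
```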

Right. That is what the patch does. You just have to fill in
"have_mmu_notifiers" here:

+int kvm_zap_single_gfn(struct kvm *kvm, gfn_t gfn)
+{
+       unsigned long addr;
+       int have_mmu_notifiers = 0;
+
+       down_read(&kvm->slots_lock);
+       addr = gfn_to_hva(kvm, gfn);
+
+       if (kvm_is_error_hva(addr)) {
+               up_read(&kvm->slots_lock);
+               return -EINVAL;
+       }
+
+       if (!have_mmu_notifiers) {
+               spin_lock(&kvm->mmu_lock);
+               rmap_nuke(kvm, gfn);
+               spin_unlock(&kvm->mmu_lock);
+       }
+       up_read(&kvm->slots_lock);
+
+       return 0;
+}

That way rmap_nuke() is skipped when mmu notifiers are present, since the
teardown will happen through the madvise() path instead.

> Did you find out what's causing the errors in the first place (if zap is 
> not used)?  It worries me greatly.

Yes, the problem is that the rmap code does not handle the qemu process
mappings vanishing while a present rmap still points at them. If that
happens, and there is a fault for a gfn whose qemu mapping has been removed,
a different physical page (a fresh zero page) will be allocated:

        rmap a -> gfn 0 -> physical host page 0
        mapping for gfn 0 gets removed
        guest faults in gfn 0 through the same pte "chain"
        rmap a -> gfn 0 -> physical host page 1

When instantiating the shadow mapping for the second time, the
"is_rmap_pte" check succeeds, so we release the reference grabbed by
gfn_to_page() at mmu_set_spte(). We now have a shadow mapping pointing
to a physical page without having an additional reference on that page.

The following makes the host not crash under such a condition, but the
condition itself is invalid and leads to inconsistent guest state. So IMHO
it shouldn't be allowed to happen in the first place.

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index f0cdfba..4c93b79 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1009,6 +1009,21 @@ struct page *gva_to_page(struct kvm_vcpu *vcpu, gva_t gva)
        return page;
 }

+static int was_spte_rmapped(struct kvm *kvm, u64 *spte, struct page *page)
+{
+       int ret = 0;
+       unsigned long host_pfn = (*spte & PT64_BASE_ADDR_MASK) >> PAGE_SHIFT;
+
+       if (is_rmap_pte(*spte)) {
+               if (host_pfn != page_to_pfn(page))
+                       rmap_remove(kvm, spte);
+               else
+                       ret = 1;
+       }
+
+       return ret;
+}
+
 static void mmu_set_spte(struct kvm_vcpu *vcpu, u64 *shadow_pte,
                         unsigned pt_access, unsigned pte_access,
                         int user_fault, int write_fault, int dirty,
@@ -1016,7 +1031,7 @@ static void mmu_set_spte(struct kvm_vcpu *vcpu, u64 *shadow_pte,
                         struct page *page)
 {
        u64 spte;
-       int was_rmapped = is_rmap_pte(*shadow_pte);
+       int was_rmapped = was_spte_rmapped(vcpu->kvm, shadow_pte, page);
        int was_writeble = is_writeble_pte(*shadow_pte);

        /*
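To make the reference-count accounting concrete, here is a toy userspace
model of that check (all names hypothetical, not the real kvm structures):
a page refcount, a gfn_to_page() reference taken on every call, and a
"was_rmapped" decision that either ignores or notices the pfn mismatch.

```c
#include <stddef.h>

/* Toy model of the mmu_set_spte() accounting (hypothetical names).
 * gfn_to_page() takes a reference on the new page; if the spte counts
 * as already rmapped, that reference is dropped again because the
 * existing rmap is assumed to hold one. */
struct toy_page {
	int refcount;
};

static void toy_set_spte(struct toy_page **spte, struct toy_page *page,
			 int fixed)
{
	/* old check: any present rmap counts as "was_rmapped";
	 * fixed check: only if it still points at the same host page */
	int was_rmapped = *spte && (!fixed || *spte == page);

	page->refcount++;			/* gfn_to_page() */
	if (fixed && *spte && *spte != page)
		(*spte)->refcount--;		/* rmap_remove() stale rmap */
	if (was_rmapped)
		page->refcount--;		/* duplicate ref released */
	*spte = page;
}
```

With the old check, remapping the spte to a different page leaves that page
mapped with zero references (the missing-reference condition described
above); with the fixed check, the stale rmap is removed and the new page
keeps its reference.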

_______________________________________________
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel
