Re: [patch 09/13] KVM: MMU: out of sync shadow core

Avi Kivity Sun, 07 Sep 2008 04:02:36 -0700

Marcelo Tosatti wrote:

Allow global and single-root, single-role-per-gfn leaf shadowed
pagetables to be unsynced.


Global unsync pages are saved into a per-vm array, synced on cr4/cr0 writes.


Why not a list?

Non-global unsync pages are linked off their root shadow page, syncedon cr3/cr4/cr0 writes.


Some of this logic is simplistic and could be smarter (page_multimapped and
the full root sync on higher level pagetable sharing).

Also unsyncing of non-leaf nodes might be interesting (but more complicated).

+static struct kvm_mmu_page *kvm_mmu_lookup_page_root(struct kvm_vcpu *vcpu,

+                                                    gfn_t gfn)
+{
+       unsigned index;
+       struct hlist_head *bucket;
+       struct kvm_mmu_page *sp;
+       struct hlist_node *node;
+       struct kvm *kvm = vcpu->kvm;
+       int level = vcpu->arch.mmu.root_level;
+       if (!is_long_mode(vcpu) && is_pae(vcpu))
+               level--;
+
+       pgprintk("%s: looking for gfn %lx\n", __func__, gfn);
+       index = kvm_page_table_hashfn(gfn);
+       bucket = &kvm->arch.mmu_page_hash[index];
+       hlist_for_each_entry(sp, node, bucket, hash_link)
+               if (sp->gfn == gfn && !sp->role.metaphysical
+                   && !sp->role.invalid && sp->role.level == level) {
+                       pgprintk("%s: found role %x\n",
+                                __func__, sp->role.word);
+                       return sp;
+               }
+       return NULL;
+}


I'm worried about the complexity this (and the rest) introduces.

A possible alternative is:

- for non-leaf pages, including roots, add a 'unsync_children' flag.
- when marking a page unsync, set the flag recursively on all parents

- when switching cr3, recursively descend to locate unsynced leaves,clearing flags along the way- to speed this up, put a bitmap with 1 bit per pte in the pages (512bits = 64 bytes)

- the bitmap can be externally allocated to save space, or not

This means we no longer have to worry about multiple roots, when a pageacquires another root while it is unsynced, etc.

@@ -963,8 +1112,24 @@ static struct kvm_mmu_page *kvm_mmu_get_
                 gfn, role.word);
        index = kvm_page_table_hashfn(gfn);
        bucket = &vcpu->kvm->arch.mmu_page_hash[index];
-       hlist_for_each_entry(sp, node, bucket, hash_link)
-               if (sp->gfn == gfn && sp->role.word == role.word) {
+       hlist_for_each_entry_safe(sp, node, tmp, bucket, hash_link)
+               if (sp->gfn == gfn) {
+                       /*
+                        * If a pagetable becomes referenced by more than one
+                        * root, or has multiple roles, unsync it and disable
+                        * oos. For higher level pgtables the entire tree
+                        * has to be synced.
+                        */
+                       if (sp->root_gfn != root_gfn) {
+                               kvm_set_pg_inuse(sp);


What does inuse mean exactly?

+                               if (set_shared_mmu_page(vcpu, sp))
+                                       tmp = bucket->first;
+                               kvm_clear_pg_inuse(sp);


Cleared here?

+                               unsyncable = 0;
+                       }
+                       if (sp->role.word != role.word)
+                               continue;
+
                        mmu_page_add_parent_pte(vcpu, sp, parent_pte);
                        pgprintk("%s: found\n", __func__);
                        return sp;
--- kvm.orig/include/asm-x86/kvm_host.h
+++ kvm/include/asm-x86/kvm_host.h
@@ -179,6 +179,10 @@ union kvm_mmu_page_role {
 struct kvm_mmu_page {
        struct list_head link;
        struct hlist_node hash_link;
+       /* FIXME: one list_head is enough */
+       struct list_head unsync_pages;
+       struct list_head oos_link;


That's okay, we may allow OOS roots one day.

+       gfn_t root_gfn; /* root this pagetable belongs to, -1 if multimapped */

/*

         * The following two entries are used to key the shadow page in the
@@ -362,6 +366,8 @@ struct kvm_arch{
        unsigned int n_requested_mmu_pages;
        unsigned int n_alloc_mmu_pages;
        struct hlist_head mmu_page_hash[KVM_NUM_MMU_PAGES];
+       struct kvm_mmu_page *oos_global_pages[7];
+       unsigned oos_global_idx;


What does the index mean?  An lru pointer?

I became a little unsynced myself reading the patch.  It's very complex.

We need to change something to keep it maintainable. Either a betterdata structure, or disallowing a parent to be zapped while any of itschildren are alive.


What happens when a sp->global changes its value while a page is oos?

Can we even detect a nonglobal->global change? Maybe instead of a flag,add a counter for active ptes and global ptes, and a page is global ifthe counters match. However most likely it doesn't matter at all.


--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [patch 09/13] KVM: MMU: out of sync shadow core

Reply via email to