Marcelo Tosatti wrote:
Allow global and single-root, single-role-per-gfn leaf shadowed
pagetables to be unsynced.

Global unsync pages are saved into a per-vm array, synced on cr4/cr0 writes.


Why not a list?

Non-global unsync pages are linked off their root shadow page, synced on cr3/cr4/cr0 writes.

Some of this logic is simplistic and could be smarter (page_multimapped and
the full root sync on higher level pagetable sharing).

Also unsyncing of non-leaf nodes might be interesting (but more complicated).


+static struct kvm_mmu_page *kvm_mmu_lookup_page_root(struct kvm_vcpu *vcpu,
+                                                    gfn_t gfn)
+{
+       unsigned index;
+       struct hlist_head *bucket;
+       struct kvm_mmu_page *sp;
+       struct hlist_node *node;
+       struct kvm *kvm = vcpu->kvm;
+       int level = vcpu->arch.mmu.root_level;
+       if (!is_long_mode(vcpu) && is_pae(vcpu))
+               level--;
+
+       pgprintk("%s: looking for gfn %lx\n", __func__, gfn);
+       index = kvm_page_table_hashfn(gfn);
+       bucket = &kvm->arch.mmu_page_hash[index];
+       hlist_for_each_entry(sp, node, bucket, hash_link)
+               if (sp->gfn == gfn && !sp->role.metaphysical
+                   && !sp->role.invalid && sp->role.level == level) {
+                       pgprintk("%s: found role %x\n",
+                                __func__, sp->role.word);
+                       return sp;
+               }
+       return NULL;
+}

I'm worried about the complexity this (and the rest) introduces.

A possible alternative is:

- for non-leaf pages, including roots, add a 'unsync_children' flag.
- when marking a page unsync, set the flag recursively on all parents
- when switching cr3, recursively descend to locate unsynced leaves, clearing flags along the way - to speed this up, put a bitmap with 1 bit per pte in the pages (512 bits = 64 bytes)
- the bitmap can be externally allocated to save space, or not

This means we no longer have to worry about multiple roots, when a page acquires another root while it is unsynced, etc.


@@ -963,8 +1112,24 @@ static struct kvm_mmu_page *kvm_mmu_get_
                 gfn, role.word);
        index = kvm_page_table_hashfn(gfn);
        bucket = &vcpu->kvm->arch.mmu_page_hash[index];
-       hlist_for_each_entry(sp, node, bucket, hash_link)
-               if (sp->gfn == gfn && sp->role.word == role.word) {
+       hlist_for_each_entry_safe(sp, node, tmp, bucket, hash_link)
+               if (sp->gfn == gfn) {
+                       /*
+                        * If a pagetable becomes referenced by more than one
+                        * root, or has multiple roles, unsync it and disable
+                        * oos. For higher level pgtables the entire tree
+                        * has to be synced.
+                        */
+                       if (sp->root_gfn != root_gfn) {
+                               kvm_set_pg_inuse(sp);

What does inuse mean exactly?

+                               if (set_shared_mmu_page(vcpu, sp))
+                                       tmp = bucket->first;
+                               kvm_clear_pg_inuse(sp);

Cleared here?

+                               unsyncable = 0;
+                       }
+                       if (sp->role.word != role.word)
+                               continue;
+
                        mmu_page_add_parent_pte(vcpu, sp, parent_pte);
                        pgprintk("%s: found\n", __func__);
                        return sp;
--- kvm.orig/include/asm-x86/kvm_host.h
+++ kvm/include/asm-x86/kvm_host.h
@@ -179,6 +179,10 @@ union kvm_mmu_page_role {
 struct kvm_mmu_page {
        struct list_head link;
        struct hlist_node hash_link;
+       /* FIXME: one list_head is enough */
+       struct list_head unsync_pages;
+       struct list_head oos_link;

That's okay, we may allow OOS roots one day.

+       gfn_t root_gfn; /* root this pagetable belongs to, -1 if multimapped */
/*
         * The following two entries are used to key the shadow page in the
@@ -362,6 +366,8 @@ struct kvm_arch{
        unsigned int n_requested_mmu_pages;
        unsigned int n_alloc_mmu_pages;
        struct hlist_head mmu_page_hash[KVM_NUM_MMU_PAGES];
+       struct kvm_mmu_page *oos_global_pages[7];
+       unsigned oos_global_idx;

What does the index mean?  An lru pointer?

I became a little unsynced myself reading the patch.  It's very complex.

We need to change something to keep it maintainable. Either a better data structure, or disallowing a parent to be zapped while any of its children are alive.

What happens when a sp->global changes its value while a page is oos?

Can we even detect a nonglobal->global change? Maybe instead of a flag, add a counter for active ptes and global ptes, and a page is global if the counters match. However most likely it doesn't matter at all.

--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to