Re: [patch 07/13] KVM: MMU: mode specific sync_page

Avi Kivity Mon, 08 Sep 2008 02:50:17 -0700

Marcelo Tosatti wrote:

On Sun, Sep 07, 2008 at 12:52:21PM +0300, Avi Kivity wrote:

What if vcpu0 is in mode X, while vcpu1 is in mode Y. vcpu0 writes tosome pagetable, causing both mode X and mode Y shadows to becomeunsynced, so on the next resync (either by vcpu0 or vcpu1) we need tosync both modes.


From the oos core patch:

-       hlist_for_each_entry(sp, node, bucket, hash_link)
-               if (sp->gfn == gfn && sp->role.word == role.word) {
+       hlist_for_each_entry_safe(sp, node, tmp, bucket, hash_link)
+               if (sp->gfn == gfn) {
+                       /*
+                        * If a pagetable becomes referenced by more than one
+                        * root, or has multiple roles, unsync it and disable
+                        * oos. For higher level pgtables the entire tree
+                        * has to be synced.
+                        */
+                       if (sp->root_gfn != root_gfn) {
+                               kvm_set_pg_inuse(sp);
+                               if (set_shared_mmu_page(vcpu, sp))
+                                       tmp = bucket->first;
+                               kvm_clear_pg_inuse(sp);
+                               unsyncable = 0;
+                       }

So as soon as a pagetable is shadowed with different modes, its resyncedand unsyncing is disabled.

Okay. But the complexity here, esp. with rarely used cases likemultiple mode shadows, is frightening.

+
+                       pte_access = sp->role.access & FNAME(gpte_access)(vcpu, 
*pt);
+                       /* user */
+                       if (pte_access & ACC_USER_MASK)
+                               spte |= shadow_user_mask;
There are some special cases involving cr0.wp=0 and the user mask. sospte.u is not correlated exactly with gpte.u.
How come?

When cr0.wp=0, the cpu ignores pte.w for cpl=0 accesses. kvm requirescr0.wp=1 (since we need to write protect pages, for many reasons, likeemulating pte.dirty). This is how we handle pte.u=1 + pte.w=0:

- for cpl 0 accesses, we set spte.w=1 (to allow the write) and spte.u=0(to forbid cpl>0 accesses)- for cpl>0 accesses, we set spte.w=0 (to forbid userspace writeaccesses) and spte.u=1 (to allow cpl>0 read accesses)

this works well except if the accesses keep alternating betweenuserspace and kernel.

+                       /* guest->shadow accessed sync */
+                       if (!(*pt & PT_ACCESSED_MASK))
+                               spte &= ~PT_ACCESSED_MASK;
spte shouldn't be accessible at all if gpte is not accessed, so we canset gpte.a on the next access (similar to spte not being writeable ifgpte is not dirty).
Right. Perhaps accessed bit synchronization to guest could be performed
lazily somehow, so as to avoid a vmexit on every first page access.

I don't think this is doable (well you can do it if you make the guestpage table not present, but then even reading the accessed bit faults).

+                       set_shadow_pte(&sp->spt[i], spte);

What if permissions are reduced?


Then a local TLB flush is needed. Flushing the TLB's of remote vcpus
should be done by the guest AFAICS.

hm. It depends on why they are reduced. If the page became shadowed,then we are responsible.


Don't think this is the case here, so local flush is likely sufficient.


--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [patch 07/13] KVM: MMU: mode specific sync_page

Reply via email to