Marcelo Tosatti wrote:
On Tue, Jan 06, 2009 at 12:41:40PM +0200, Avi Kivity wrote:
Alexander Graf wrote:
Avi Kivity wrote:
From: Avi Kivity <[email protected]>
Don't allow a vcpu with cr4.pge cleared to use a shadow page created with
cr4.pge set; this might cause a cr3 switch not to sync ptes that have the
global bit set (the global bit has no effect if !cr4.pge).
This can only occur on smp with different cr4.pge settings for different
vcpus (since a cr4 change will resync the shadow ptes), but there's no
cost to being correct here.
Signed-off-by: Avi Kivity <[email protected]>
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index f49bfd0..ab8ef1d 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -183,6 +183,7 @@ union kvm_mmu_page_role {
 		unsigned metaphysical:1;
 		unsigned access:3;
 		unsigned invalid:1;
+		unsigned cr4_pge:1;
 	};
 };
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c4da7fb..aa4575c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -363,6 +363,7 @@ void kvm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
 	}
 	kvm_x86_ops->set_cr4(vcpu, cr4);
 	vcpu->arch.cr4 = cr4;
+	vcpu->arch.mmu.base_role.cr4_pge = !!(cr4 & X86_CR4_PGE);
This line broke VMware ESX bootup using NPT on one VCPU for me. If I
comment it out and apply my patches to actually make ESX run, it boots
again.
I think this might be the problem:
static int mmu_need_write_protect(struct kvm_vcpu *vcpu, gfn_t gfn,
				  bool can_unsync)
{
	struct kvm_mmu_page *shadow;

	shadow = kvm_mmu_lookup_page(vcpu->kvm, gfn);
	if (shadow) {
		if (shadow->role.level != PT_PAGE_TABLE_LEVEL)
			return 1;
		if (shadow->unsync)
			return 0;
		if (can_unsync && oos_shadow)
			return kvm_unsync_page(vcpu, shadow);
		return 1;
	}
	return 0;
}
lookup_page() is not deterministic when there are multiple shadow pages for a
gfn, and the patch makes multiple shadows more likely (a page table can now be
shadowed once per cr4.pge setting).
Marcelo, shouldn't there be a for_each in there?
static int kvm_unsync_page(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp)
{
	unsigned index;
	struct hlist_head *bucket;
	struct kvm_mmu_page *s;
	struct hlist_node *node, *n;

	index = kvm_page_table_hashfn(sp->gfn);
	bucket = &vcpu->kvm->arch.mmu_page_hash[index];
	/* don't unsync if pagetable is shadowed with multiple roles */
	hlist_for_each_entry_safe(s, node, n, bucket, hash_link) {
		if (s->gfn != sp->gfn || s->role.metaphysical)
			continue;
		if (s->role.word != sp->role.word)
			return 1;
	}
This one?
Yes...
Looks like kvm_unsync_page can be folded into mmu_need_write_protect
(after which we can drop lookup_page(), which is not a good API). But
that's after we solve the current problem.
Looks like the addition of a second role for non-pge mode confuses the
mmu. After the second page is created, mmu_need_write_protect() will
return 1, but previously existing sptes can still be writable?
Looks like we need to call rmap_write_protect() when the new page is
created.
--
error compiling committee.c: too many arguments to function