On Wed, Mar 26, 2008 at 07:12:25PM +0100, Andrea Arcangeli wrote:
> On Wed, Mar 26, 2008 at 02:51:28PM -0300, Marcelo Tosatti wrote:
> > Nope. If a physical CPU has page translations cached it _must_ be
> > running in the context of a qemu thread (it does not matter if it's in
> > userspace or executing guest code). The bit corresponding to such CPUs
> > will be set in mm->vm_cpu_mask and flush_tlb_page() will take care of
> > flushing the TLBs appropriately.
>
> That would require the tlb to look up any SVM/VMX virtualized address
> that could point to the same physical page that is pointed to by the
> invlpg linux virtual address. So it might be feasible if the hardware
> is using a physical tag on each spte translation, but it'd require the
> tlb to do a lot more work than we need it to do, so I hope you're
> wrong on the hardware side of things. That would be a hardware
> complexity or slowdown that would provide no benefit to software.
>
> But regardless, even if this were the case and you're right that
> invalidating a linux userland address atomically invalidates all other
> virtualized addresses in the svm/vmx tlbs (all asn included, not just
> one), the spte is still instantiated when flush_tlb_page runs on all
> cpus.
You're right, there is no guarantee that a full TLB flush will happen, and
invlpg for the qemu task virtual address is not enough.

> So just after the global tlb flush, a guest spte tlb-miss (no page
> fault required, no get_user_pages required) can happen that will
> re-instantiate the spte contents inside the tlb before flush_tlb_page
> returns.
>
>    CPU0                                        CPU 1
>    pte_clear() inside ptep_clear_flush
>    flush_tlb_page inside ptep_clear_flush inside rmap
>    page_count = 1
>                                                guest tlb miss
>                                                tlb entry is back!
>    ioctl()
>    mark spte nonpresent
>    rmap_remove -> put_page
>    tlb flush
>
> Any tlb flush happening before clearing the shadow-pte entry is
> totally useless.
>
> Avi's patch is a great fix, and it will need further changes to properly
> fix this race, because many set_shadow_pte()/pt[i] = nonpresent
> assignments are executed _after_ rmap_remove (they must be executed
> first, in case the array is full and we have to flush the tlb and free
> the page).

One comment on the patch:

+static void kvm_release_page_clean_deferred(struct kvm *kvm,
+					     struct page *page)
+{
+	kvm_release_page_deferred(kvm, page, true);
+}

The last argument should be false here.
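That is (a trivial illustration of the comment above, assuming the third
argument of kvm_release_page_deferred() in the patch is the dirty flag),
the clean wrapper would become:

static void kvm_release_page_clean_deferred(struct kvm *kvm,
					    struct page *page)
{
	/* clean variant: do not mark the page dirty when it is released */
	kvm_release_page_deferred(kvm, page, false);
}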
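And to make the teardown ordering argued for above concrete, here is a rough
sketch, not a patch: set_shadow_pte(), shadow_trap_nonpresent_pte and
kvm_flush_remote_tlbs() are the existing mmu.c names, while rmap_remove_spte()
is a hypothetical variant of rmap_remove() that takes the saved spte value,
since the current rmap_remove() reads *sptep and therefore has to run while
the spte is still present (that is exactly the part needing the further
changes).

static void zap_spte(struct kvm *kvm, u64 *sptep)
{
	u64 old_spte = *sptep;

	/* 1) make the translation unreachable for new guest tlb misses */
	set_shadow_pte(sptep, shadow_trap_nonpresent_pte);

	/* 2) shoot down any copy already cached in the other vcpus' tlbs */
	kvm_flush_remote_tlbs(kvm);

	/*
	 * 3) only now drop the rmap entry and the page reference
	 *    (rmap_remove -> put_page): no vcpu can re-instantiate the
	 *    translation from the old spte anymore.
	 */
	rmap_remove_spte(kvm, sptep, old_spte);	/* hypothetical helper */
}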