On Fri, Oct 31, 2008 at 09:58:17PM +0200, Avi Kivity wrote:
> Marcelo Tosatti wrote:
>>>> +          sw->pte_gpa = (sp->gfn << PAGE_SHIFT);
>>>> +          sw->pte_gpa += (sptep - sp->spt) * sizeof(pt_element_t);
>>>> +
>>>> +          if (is_shadow_present_pte(*sptep)) {
>>>>                    rmap_remove(vcpu->kvm, sptep);
>>>> +                  sw->pte_gpa = -1;
>>>>         
>>> Why?  The pte could have been replaced (for example, a write access 
>>> to a cow page).
>>>     
>>
>> Well look-aheads on address space teardown will be useless. OTOH the
>> guest pte read cost is minimal compared to an exit.
>>   
>
> Don't understand.  We will incur an exit if a pte is replaced and  
> invlpg'ed due to a copy-on-write (do guests actually execute invlpg  
> after a cow? I don't think they have to).
>
> What is the downside?  A pagetable teardown that does not involve  
> zeroing the page?  I don't think we'll see invlpg on that path, more  
> likely a complete tlb flush.

Err, I'm on crack. The assumption is that the common case is pte
invalidation + invlpg: kunmap_atomic, page aging clearing the 
accessed bit, page reclaim.
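
To make the pte_gpa arithmetic in the hunk above concrete, here is a
standalone sketch of it. It assumes 4K pages and 8-byte guest ptes, and
that the index into the shadow page matches the index into the guest
page table. guest_pte_gpa() and PT_ENTRIES are made-up names for
illustration, not something in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12                   /* assumption: 4 KiB pages */

typedef uint64_t pt_element_t;          /* assumption: 64-bit guest ptes */
typedef uint64_t gpa_t;                 /* guest physical address */
typedef uint64_t gfn_t;                 /* guest frame number */

/* Number of ptes that fit in one page table page. */
#define PT_ENTRIES ((size_t)(1u << PAGE_SHIFT) / sizeof(pt_element_t))

/*
 * Hypothetical helper: guest physical address of the pte at 'index'
 * within the guest page table that lives at frame 'table_gfn'.
 * Mirrors the two pte_gpa lines in the hunk, where the index is
 * (sptep - sp->spt).
 */
static gpa_t guest_pte_gpa(gfn_t table_gfn, size_t index)
{
        assert(index < PT_ENTRIES);
        return (table_gfn << PAGE_SHIFT) + index * sizeof(pt_element_t);
}
```

With that address in hand, invlpg can read the guest pte directly
instead of waiting for the next fault, which is the look-ahead being
discussed.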

Linux COW will invalidate + invlpg (do_wp_page) first:

                entry = mk_pte(new_page, vma->vm_page_prot);
                entry = maybe_mkwrite(pte_mkdirty(entry), vma);
                /*
                 * Clear the pte entry and flush it first, before updating the
                 * pte with the new entry. This will avoid a race condition
                 * seen in the presence of one thread doing SMC and another
                 * thread doing COW.
                 */
                ptep_clear_flush_notify(vma, address, page_table);

Not sure about Windows.

>> Whatever you prefer. Learning guest behaviour as suggested earlier 
>> would be optimal, but simple is good.
>>   
>
> We're way past simple.  We can reclaim some of the complexity by always  
> doing unsync, and dropping emulation and kvm_mmu_set_pte(), but need to  
> make sure we don't regress on performance.  I think Windows does a pde  
> write on context switch, which will add a vmexit, but Windows  
> applications are not too context switch intensive AFAIK.
