Marcelo Tosatti wrote:
Instead of flushing remote TLBs at every page resync, do an initial
pass to write protect the sptes, collapsing the flushes into a single
remote TLB invalidation.

kernbench is 2.3% faster on a 4-way guest. Improvements have also been
seen with other workloads such as AIM7.

Avi: feel free to change this if you dislike the style (I do, but can't
think of anything nicer).

 static void mmu_sync_children(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp)
 {
        struct sync_walker walker = {
-               .walker = { .entry = mmu_sync_fn, },
+               .walker = { .entry = mmu_wprotect_fn,
+                           .clear_unsync = false, },
                .vcpu = vcpu,
+               .write_protected = false,
        };
+       /* collapse the TLB flushes as an optimization */
+       mmu_unsync_walk(sp, &walker.walker);
+       if (walker.write_protected)
+               kvm_flush_remote_tlbs(vcpu->kvm);
+
+       walker.walker.entry = mmu_sync_fn;
+       walker.walker.clear_unsync = true;
+
        while (mmu_unsync_walk(sp, &walker.walker))
                cond_resched_lock(&vcpu->kvm->mmu_lock);

We're always doing two passes here, which is a bit sad. How about having a single pass which:

- collects unsync pages into an array
- exits on no more unsync pages or max array size reached

Then, iterate over the array:

- write protect all pages
- flush tlb
- sync pages

Loop until the root is synced.

If the number of pages to sync is typically small, and the array is sized to be larger than this, then we only walk the pagetables once.

btw, our walkers are a bit awkward (though still better than what we had before). If we rewrite them into for_each style iterators, the code could become cleaner and shorter.

--
error compiling committee.c: too many arguments to function
