Marcelo Tosatti wrote:
Instead of flushing remote TLBs at every page resync, do an initial
pass to write protect the sptes, collapsing the flushes into a single
remote TLB invalidation.
kernbench is 2.3% faster on a 4-way guest. Improvements have also been
seen with other loads, such as AIM7.
Avi: feel free to change this if you dislike the style (I do, but can't
think of anything nicer).
 static void mmu_sync_children(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp)
 {
 	struct sync_walker walker = {
-		.walker = { .entry = mmu_sync_fn, },
+		.walker = { .entry = mmu_wprotect_fn,
+			    .clear_unsync = false, },
 		.vcpu = vcpu,
+		.write_protected = false
 	};
 
+	/* collapse the TLB flushes as an optimization */
+	mmu_unsync_walk(sp, &walker.walker);
+	if (walker.write_protected)
+		kvm_flush_remote_tlbs(vcpu->kvm);
+
+	walker.walker.entry = mmu_sync_fn;
+	walker.walker.clear_unsync = true;
+
 	while (mmu_unsync_walk(sp, &walker.walker))
 		cond_resched_lock(&vcpu->kvm->mmu_lock);
We're always doing two passes here, which is a bit sad. How about
having a single pass which:
- collects unsync pages into an array
- exits on no more unsync pages or max array size reached
Then, iterate over the array:
- write protect all pages
- flush tlb
- sync pages
Loop until the root is synced.
If the number of pages to sync is typically small, and the array is
sized to be larger than this, then we only walk the pagetables once.
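The batching scheme described above can be sketched, outside the kernel, as a
toy model (the names collect_unsync, sync_root, SYNC_BATCH and the counter
standing in for kvm_flush_remote_tlbs() are all hypothetical, not the actual
KVM code):

```c
#include <stdbool.h>

#define SYNC_BATCH 16           /* hypothetical max array size */
#define NPAGES     40           /* toy shadow-page count */

static bool unsync[NPAGES];     /* stands in for sp->unsync */
static int tlb_flushes;         /* counts simulated remote TLB flushes */

/* Collect up to 'max' unsync page indices into 'out'; return the count. */
static int collect_unsync(int *out, int max)
{
	int n = 0;

	for (int i = 0; i < NPAGES && n < max; i++)
		if (unsync[i])
			out[n++] = i;
	return n;
}

/* One pass per batch: write protect the batch, flush once, sync the batch.
 * Loop until no unsync pages remain under the root. */
static void sync_root(void)
{
	int batch[SYNC_BATCH], n;

	while ((n = collect_unsync(batch, SYNC_BATCH)) > 0) {
		/* ... write protect all pages in the batch here ... */
		tlb_flushes++;			/* single remote TLB flush */
		for (int i = 0; i < n; i++)
			unsync[batch[i]] = false;	/* sync, clear unsync */
	}
}
```

With 20 unsync pages and a batch of 16, this model performs two flushes
instead of one per page, matching the claim that a large-enough array usually
means a single walk and a single flush.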
btw, our walkers are a bit awkward (though still better than what we had
before). If we rewrite them as for_each-style iterators, the code
could become cleaner and shorter.
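A for_each-style walker hides the traversal behind an ordinary loop at the
call site instead of an entry callback. A minimal standalone sketch (the
page_node type and macro name are hypothetical, not the real kvm_mmu_page
layout):

```c
#include <stddef.h>

struct page_node {
	int gfn;			/* toy payload */
	struct page_node *next;
};

/* for_each-style iterator: callers write a plain loop body rather than
 * registering an .entry callback with a walker structure. */
#define for_each_unsync_page(pos, head) \
	for ((pos) = (head); (pos) != NULL; (pos) = (pos)->next)

static int count_pages(struct page_node *head)
{
	struct page_node *sp;
	int n = 0;

	for_each_unsync_page(sp, head)
		n++;
	return n;
}
```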
--
error compiling committee.c: too many arguments to function
--