Zachary Amsden wrote:
> I'm a bit skeptical you can get such a semantic to work without a very
> heavyweight method in the hypervisor. How do you guarantee no other CPU
> is fiddling with the A/D bits in the page table (it can be done by hardware
> with direct page tables), unless you use some kind of IPI? Is this why
> it is still 7x?
>
No, you just use cmpxchg. It's pretty lightweight really. Xen holds a
lock internally to stop other cpus from updating the pte in software, so
the only source of modification is the hardware itself; the cmpxchg loop
is guaranteed to terminate because the A/D bits can only transition from
0->1.
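
As a rough illustration of the loop described above (a sketch, not Xen's actual code): the helper below replaces the protection bits of a pte while keeping whatever Accessed/Dirty bits the hardware has set in the meantime. The function and macro names are made up for this example; the bit positions are the x86 ones, and C11 atomics stand in for the cmpxchg instruction.

```c
#include <stdatomic.h>
#include <stdint.h>

/* x86 pte Accessed and Dirty bits (bits 5 and 6). */
#define PTE_ACCESSED (UINT64_C(1) << 5)
#define PTE_DIRTY    (UINT64_C(1) << 6)
#define PTE_AD_MASK  (PTE_ACCESSED | PTE_DIRTY)

/*
 * Hypothetical helper: install newval's non-A/D bits into *pte while
 * preserving any A/D bits already set. Assumes software writers are
 * excluded by a lock, so only hardware can race with us.
 * Returns the pte value that was replaced.
 */
static uint64_t pte_update_preserve_ad(_Atomic uint64_t *pte, uint64_t newval)
{
    uint64_t old = atomic_load(pte);
    uint64_t desired;

    do {
        /* Carry over whatever A/D bits are currently set. */
        desired = (newval & ~PTE_AD_MASK) | (old & PTE_AD_MASK);
        /* On failure, old is reloaded with the current pte value. */
    } while (!atomic_compare_exchange_strong(pte, &old, desired));

    return old;
}
```

Under the stated assumption, a failed compare-exchange means the hardware set A or D since the last read. Each bit can only go 0->1, so the loop retries at most a couple of times.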

I haven't really gone into depth as to exactly where the 7x number comes
from. I could increase the batch size (currently max of 32 pte
updates/hypercall), and some of it is plain overhead from the in-kernel
infrastructure. A simpler and more hackish approach which basically
pastes the Xen hypercall directly into the mprotect loop gets the
overhead down to about 5.5x.
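
To make the batching idea concrete, here is a minimal sketch of queueing pte updates and issuing them in groups of at most 32. Everything here is hypothetical: `fake_hypercall` simply applies the updates in place and stands in for the real hypercall, and the struct and function names are invented for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define BATCH_MAX 32  /* max pte updates per hypercall, as mentioned above */

struct pte_update {
    uint64_t *ptep;   /* which pte to change */
    uint64_t  val;    /* new value for it */
};

struct pte_batch {
    struct pte_update ops[BATCH_MAX];
    size_t count;       /* updates queued but not yet issued */
    size_t hypercalls;  /* how many (fake) hypercalls were made */
};

/* Hypothetical stand-in for the hypervisor call: apply each update. */
static void fake_hypercall(const struct pte_update *ops, size_t n)
{
    for (size_t i = 0; i < n; i++)
        *ops[i].ptep = ops[i].val;
}

/* Issue everything queued so far as one call. */
static void batch_flush(struct pte_batch *b)
{
    if (b->count == 0)
        return;
    fake_hypercall(b->ops, b->count);
    b->hypercalls++;
    b->count = 0;
}

/* Queue one update, flushing first if the batch is already full. */
static void batch_queue(struct pte_batch *b, uint64_t *ptep, uint64_t val)
{
    if (b->count == BATCH_MAX)
        batch_flush(b);
    b->ops[b->count].ptep = ptep;
    b->ops[b->count].val = val;
    b->count++;
}
```

An mprotect-style loop would call `batch_queue` once per pte and `batch_flush` once at the end, before the tlb flush, so the per-call cost is paid once per batch rather than once per pte.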

> Still, a 7x gain from asynchronous batching is very nice. I wonder if
> that means the average mprotect size in your benchmark is 7 pages.
>
Yeah, it's around 7x. The batching pays off even for single-page
mprotects, because the trap-and-emulate of xchg is so expensive.

>> I believe that other virtualization systems, whether they use direct
>> paging like Xen, or a shadow pagetable scheme (vmi, kvm, lguest), can
>> make use of this interface to improve the performance.
>>
>
> On VMI, we don't trap the xchg of the pte, thus we don't have any
> bottleneck here to begin with.

If you're doing code rewriting then I guess you can effectively do the
same trick at that point. If not, then presumably you take a fault for
the first pte updated in the mprotect and then sync the shadow up when
the tlb flush happens; batching that trap and the tlb flush would give
you some benefit for small mprotects.

J