At the moment, exec ignores the high bits of each address for efficiency. This is incorrect: devices can do full 64-bit DMA; only the CPU is limited by the target address space. Using full 64-bit addresses was clocked at a 12% performance hit on a microbenchmark. To solve this, teach the page tables to skip bits at any level, not just the lowest level.
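To make the skip idea concrete, here is a minimal, self-contained sketch of a lookup in a radix tree whose entries carry a skip count. It is not the code from these patches: the names (Entry, Node, Section, lookup, demo_lookup) and the constants (12 page bits, 9 bits per level) are assumptions picked for illustration. Since skipped index bits are never examined during the walk, the sketch re-checks the address against the section range at the end.

```c
#include <assert.h>
#include <stdint.h>

/* All names and constants below are illustrative assumptions. */
#define PAGE_BITS   12
#define LEVEL_BITS  9
#define LEVEL_SIZE  (1u << LEVEL_BITS)
#define ADDR_BITS   64
#define LEVELS      (((ADDR_BITS - PAGE_BITS - 1) / LEVEL_BITS) + 1)  /* 6 */
#define NODE_NIL    UINT32_MAX
#define SECTION_UNASSIGNED 0u

typedef struct {
    uint8_t  skip;  /* levels consumed by following ptr; 0 marks a leaf */
    uint32_t ptr;   /* node index, or section index when skip == 0 */
} Entry;

typedef Entry Node[LEVEL_SIZE];

typedef struct {
    uint64_t base;
    uint64_t size;
} Section;

/* Walk the tree from root entry lp and return a section index. */
static unsigned lookup(Entry lp, uint64_t addr, Node *nodes,
                       const Section *sections)
{
    uint64_t index = addr >> PAGE_BITS;
    int i = LEVELS;

    /* Each step drops lp.skip levels at once instead of exactly one. */
    while (lp.skip && (i -= lp.skip) >= 0) {
        if (lp.ptr == NODE_NIL) {
            return SECTION_UNASSIGNED;
        }
        lp = nodes[lp.ptr][(index >> (i * LEVEL_BITS)) & (LEVEL_SIZE - 1)];
    }
    if (lp.skip) {
        return SECTION_UNASSIGNED;  /* ran out of levels before a leaf */
    }

    /* Skipped bits were never compared, so verify the range here. */
    const Section *s = &sections[lp.ptr];
    if (addr >= s->base && addr - s->base < s->size) {
        return lp.ptr;
    }
    return SECTION_UNASSIGNED;
}

/* Hand-built compressed tree: one 4 KiB section at 0x5000. */
static unsigned demo_lookup(uint64_t addr)
{
    /* Zero-initialized entries read as "leaf, unassigned". */
    static Node nodes[1];
    static const Section sections[] = {
        { 0, 0 },            /* index 0: unassigned */
        { 0x5000, 0x1000 },  /* index 1: the only mapped page */
    };
    Entry root = { .skip = LEVELS, .ptr = 0 };

    assert(LEVELS <= 7);     /* a 3-bit skip field holds at most 7 */
    nodes[0][5] = (Entry){ .skip = 0, .ptr = 1 };
    return lookup(root, addr, nodes, sections);
}
```

In this sketch the root entry skips all six levels and points straight at the single bottom-level node, so demo_lookup(0x5000) returns 1 after one table access. demo_lookup(0x100000005000) has the same low index bits but fails the final range check and returns 0.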
This should solve the performance problem (only one line of code changed on the data path). I'm still trying to figure out how to measure speed properly with TCG, so I'm sending this out for early feedback and flames.

Michael S. Tsirkin (3):
  exec: replace leaf with skip
  exec: extend skip field to 3 bits
  exec: memory radix tree page level compression

Paolo Bonzini (2):
  split definitions for exec.c and translate-all.c radix trees
  exec: make address spaces 64-bit wide

 translate-all.h |   7 ----
 exec.c          | 117 +++++++++++++++++++++++++++++++++++++++++++++++--------
 translate-all.c |  32 +++++++++-------
 3 files changed, 117 insertions(+), 39 deletions(-)

-- 
MST