Avi Kivity wrote:
Chris Wright wrote:
I think it's safe to say the perf folks are concerned with data integrity
first, stable/reproducible results second, and raw performance third.
So seeing data cached in the host was simply not what they expected. I
think writethrough is sufficient. However, I think uncached vs. writethrough
will show up on the radar under reproducible results (need to tune based on
cache size). And in most overcommit scenarios memory is typically more
precious than CPU; it's unclear to me whether the extra buffering is anything
other than memory overhead. As long as it's configurable, it's comparable,
and benchmarking and best practices can dictate the best choice.
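For reference, the cache behavior under discussion is selectable per drive on the qemu command line; a minimal sketch, with the image path as a placeholder and flag names as they existed in qemu of this vintage:

```shell
# Host page cache used; writes reach the disk before completion is
# reported to the guest (the integrity-safe cached mode):
qemu -drive file=disk.img,cache=writethrough

# Host page cache bypassed (O_DIRECT); only the guest caches:
qemu -drive file=disk.img,cache=off

# Host page cache used with write-back semantics; fastest, but dirty
# data can sit in host memory:
qemu -drive file=disk.img,cache=writeback
```

This is the configurability being asked for: the choice can be made per deployment and benchmarked.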
Getting good performance because we have a huge amount of free memory
in the host is not a good benchmark. Under most circumstances, the
free memory will be used either for more guests, or will be given to
the existing guests, which can utilize it more efficiently than the host.
I can see two cases where this is not true:
- using older, 32-bit guests which cannot utilize all of the cache. I
think Windows XP is limited to 512MB of cache, and usually doesn't
utilize even that. So if you have an application running on 32-bit
Windows (or on 32-bit Linux with PAE disabled) and a huge host, you
will see a significant boost from cache=writethrough. This is a case
where performance can exceed native, simply because native cannot
exploit all the resources of the host.
- if cache requirements vary over time across the different guests, and
if some smart ballooning is not in place, having free memory on the
host means we utilize it for whichever guest has the greatest need, so
overall performance improves.
Another justification for O_DIRECT is that many production systems will
use base images for their VMs.
This is mainly true for desktop virtualization, but probably also for some
server virtualization deployments.
In these types of scenarios, we can have the entire base image chain
opened with caching enabled for read-only access, while the
leaf images are opened with cache=off.
Since there is an ongoing effort (by both IT and developers) to keep the
base images as big as possible, this guarantees that the shared data,
which is the best candidate for caching in the host, is cached, while the
private leaf images remain uncached.
This way we provide good performance and caching for the shared parent
images while also promising correctness.
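A sketch of the scheme with qemu-img backing files (paths and sizes are placeholders; the per-image cache split described above is the intended behavior, not a claim about a dedicated qemu option):

```shell
# Shared, read-only base image (large, and the best candidate for
# host-side caching):
qemu-img create -f qcow2 base.qcow2 10G

# Private copy-on-write leaf backed by the shared base image:
qemu-img create -f qcow2 -b base.qcow2 leaf.qcow2

# Run the guest from the leaf; cache=off requests O_DIRECT on the
# opened image so private writes bypass the host page cache:
qemu -drive file=leaf.qcow2,cache=off
```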
Actually this is what happens on mainline qemu with cache=off.
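At the syscall level, cache=off means opening the image with O_DIRECT, which bypasses the host page cache and requires aligned buffers. A hedged sketch of that mechanism; `write_direct` and the 4096-byte alignment are illustrative assumptions, O_DIRECT is Linux-specific, and some filesystems (e.g. tmpfs) reject it:

```python
import os
import mmap

def write_direct(path, data, align=4096):
    """Write one aligned block to `path`, requesting O_DIRECT to bypass
    the host page cache. Returns True on success, False if the
    filesystem rejects O_DIRECT (e.g. tmpfs)."""
    # O_DIRECT requires the user buffer, length, and file offset to be
    # aligned; an anonymous mmap is page-aligned, satisfying all three.
    buf = mmap.mmap(-1, align)
    buf.write(data[:align].ljust(align, b"\0"))
    # O_DIRECT does not exist on non-Linux platforms; fall back to a
    # plain buffered write there (illustrative only).
    flags = os.O_WRONLY | os.O_CREAT | getattr(os, "O_DIRECT", 0)
    try:
        fd = os.open(path, flags, 0o644)
    except OSError:
        buf.close()
        return False
    try:
        os.write(fd, buf)
        return True
    except OSError:
        return False
    finally:
        os.close(fd)
        buf.close()
```

The alignment dance is exactly the overhead qemu takes on internally when cache=off is selected, in exchange for keeping guest data out of host memory.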
Cheers,
Dor