On 04/10/2011 11:30 AM, Alexander Graf wrote:
On 10.04.2011, at 10:15, Avi Kivity wrote:

>  On 04/09/2011 03:20 PM, Alexander Graf wrote:
>>  On 09.04.2011, at 14:14, Sasha Levin wrote:
>>
>>  >   Attempt to use mmap first for working with a disk image; if the attempt fails (for example, a large image on a 32-bit system), fall back to using read/write.
>>
>>  That reminds me of an idea I had quite a while back.
>>
>>  What if we mmap'ed a raw disk image directly into the guest's address space? This could for example be done through a virtio feature addition, keeping the disk accessible through normal virtio plus the mmap'ed part. At least in writeback mode, this should perform pretty well, as we'd save all the userspace exits. It'd basically be almost like vhost-blk :).
>>
>>  Have you thought about trying to implement such a feature?
>
>  A creative idea, but I don't think it will work.  On EPT hosts we don't have accessed/dirty bits, so you have to incur at least write faults to track dirty data and perhaps read faults to gather recency information.  On non-EPT you have to scan page tables to find out what you have to write out, and flush TLBs.  Cache misses, of which you'd expect quite a few, would stall the vcpu (unless you use asynchronous page faults) and contribute less information to the host than virtio-blk (location of access but not size).  Write misses are converted to read-modify-write operations.

Since we're moving to 4k sector sizes, the RMW argument shouldn't matter too 
much a couple of years from now.

I wasn't talking about the sector size, rather that

  memcpy(&mmapped_disk[sector * SECTOR_SIZE], data, SECTOR_SIZE)

writes the data word by word, so on the first write you have to read in the entire page, then modify the first and following words.
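
To make the difference concrete, here is a minimal sketch of the two write paths. The helper names write_sector_mmap and write_sector_pwrite are made up for illustration, and SECTOR_SIZE is assumed to be 4096 to match the 4k case mentioned above. With the mmap path the host only sees a fault at one address; with an explicit write it gets the offset and the length together, so for a full, aligned page it at least has the chance to skip reading the old contents.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define SECTOR_SIZE 4096 /* assumption: 4k sectors, same as the page size */

/*
 * mmap path (hypothetical helper): the first store into an absent page
 * faults. The host learns the faulting address but not that the rest of
 * the sector is about to be overwritten, so it has to read the page in
 * before the remaining words are modified.
 */
static void write_sector_mmap(uint8_t *mmapped_disk, uint64_t sector,
                              const void *data)
{
    assert(mmapped_disk != NULL && data != NULL);
    memcpy(&mmapped_disk[sector * SECTOR_SIZE], data, SECTOR_SIZE);
}

/*
 * Explicit path (hypothetical helper): offset and length arrive in a
 * single call, so the receiver knows up front that the whole sector is
 * being replaced. Returns the number of bytes written, or -1 on error.
 */
static ssize_t write_sector_pwrite(int fd, uint64_t sector, const void *data)
{
    assert(data != NULL);
    return pwrite(fd, data, SECTOR_SIZE, (off_t)(sector * SECTOR_SIZE));
}
```

Both helpers leave the same bytes on disk; what differs is how much the host knows about the access at the moment it has to act on it.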

As for the faults, yes. We'd basically have to declare the file region as dirty 
logged, which means we get lots of page faults when accessing them. However, 
these are all lightweight exits. So we take one lightweight exit for each 4k 
chunk when doing writes. For reads, we probably really do need asynchronous 
page faults - everything else would stall the vcpus way too long.

These are still very expensive compared to virtio-blk, which can get you megabytes' worth of data with a single exit.
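
As a rough back-of-the-envelope sketch of the counts involved (counts only, it says nothing about what each exit costs): the function names and the max_request_bytes parameter below are hypothetical, a 4k page size is assumed, and the write is assumed to start on a page boundary.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_BYTES 4096u /* assumption: 4k pages */

/* Dirty-logged mmap region: one write fault per 4k page touched. */
static uint64_t faults_for_mmap_write(uint64_t bytes)
{
    return (bytes + PAGE_BYTES - 1) / PAGE_BYTES;
}

/*
 * virtio-blk style request: one exit per submitted request, where a
 * single request can describe up to max_request_bytes of data.
 */
static uint64_t exits_for_virtio_write(uint64_t bytes,
                                       uint64_t max_request_bytes)
{
    assert(max_request_bytes > 0);
    return (bytes + max_request_bytes - 1) / max_request_bytes;
}
```

Under these assumptions a 1 MiB write comes out at 256 faults on the mmap path, against a single exit if the request can carry the full megabyte.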

--
error compiling committee.c: too many arguments to function
