Re: [Qemu-devel] [PATCH] Introduce cache images for the QCOW2 format

2013-08-15 Thread Kaveh Razavi
On 08/15/2013 10:32 AM, Stefan Hajnoczi wrote: I don't buy the argument about the page cache being evicted at any time: At the scale where caching is important, provisioning a measly 100 MB of RAM per guest should not be a challenge. cgroups can be used to isolate page cache between VMs if you

Re: [Qemu-devel] [PATCH] Introduce cache images for the QCOW2 format

2013-08-15 Thread Stefan Hajnoczi
On Wed, Aug 14, 2013 at 04:20:27PM +0200, Kaveh Razavi wrote: > Hi, > > On 08/14/2013 11:29 AM, Stefan Hajnoczi wrote: > > 100 MB is small enough for RAM. Did you try enabling the host kernel > > page cache for the backing file? That way all guests running on this > > host share a single RAM-cac

Re: [Qemu-devel] [PATCH] Introduce cache images for the QCOW2 format

2013-08-15 Thread Stefan Hajnoczi
On Wed, Aug 14, 2013 at 05:32:16PM +0200, Kevin Wolf wrote: > On 14.08.2013 at 16:26, Kaveh Razavi wrote: > > On 08/14/2013 03:50 PM, Alex Bligh wrote: > > > Assuming the cache quota is not exhausted, how do you know that > > > a VM has finished 'creating' the cache? At any point it mi

Re: [Qemu-devel] [PATCH] Introduce cache images for the QCOW2 format

2013-08-15 Thread Wenchao Xia
On 2013-8-14 23:32, Kevin Wolf wrote: On 14.08.2013 at 16:26, Kaveh Razavi wrote: On 08/14/2013 03:50 PM, Alex Bligh wrote: Assuming the cache quota is not exhausted, how do you know that a VM has finished 'creating' the cache? At any point it might read a bit more from the backing ima

Re: [Qemu-devel] [PATCH] Introduce cache images for the QCOW2 format

2013-08-14 Thread Alex Bligh
On 15 Aug 2013, at 01:53, Fam Zheng wrote: > On Wed, 08/14 13:03, Alex Bligh wrote: >> >> On 14 Aug 2013, at 12:52, Fam Zheng wrote: >> >>> Yes, this one sounds good to have. VMDK and VHDX have this kind of >>> backing file status validation. >> >> ... though I'd prefer something safer than lo

Re: [Qemu-devel] [PATCH] Introduce cache images for the QCOW2 format

2013-08-14 Thread Fam Zheng
On Wed, 08/14 13:03, Alex Bligh wrote: > > On 14 Aug 2013, at 12:52, Fam Zheng wrote: > > > Yes, this one sounds good to have. VMDK and VHDX have this kind of > > backing file status validation. > > ... though I'd prefer something safer than looking at mtime, for > instance a sequence number tha

Re: [Qemu-devel] [PATCH] Introduce cache images for the QCOW2 format

2013-08-14 Thread Richard W.M. Jones
On Wed, Aug 14, 2013 at 01:03:48PM +0100, Alex Bligh wrote: > > On 14 Aug 2013, at 12:52, Fam Zheng wrote: > > > Yes, this one sounds good to have. VMDK and VHDX have this kind of > > backing file status validation. > > ... though I'd prefer something safer than looking at mtime, for > instance

Re: [Qemu-devel] [PATCH] Introduce cache images for the QCOW2 format

2013-08-14 Thread Kevin Wolf
On 14.08.2013 at 16:26, Kaveh Razavi wrote: > On 08/14/2013 03:50 PM, Alex Bligh wrote: > > Assuming the cache quota is not exhausted, how do you know that > > a VM has finished 'creating' the cache? At any point it might > > read a bit more from the backing image. > > I was assuming

Re: [Qemu-devel] [PATCH] Introduce cache images for the QCOW2 format

2013-08-14 Thread Alex Bligh
On 14 Aug 2013, at 15:26, Kaveh Razavi wrote: > This is a good idea, since it relaxes the requirement for releasing the > cache only on shutdown. I am not sure how the 'finish point' can be > recognized. Full cache quota is one obvious scenario, but I imagine most > VMs do/should not really read

Re: [Qemu-devel] [PATCH] Introduce cache images for the QCOW2 format

2013-08-14 Thread Kaveh Razavi
On 08/14/2013 03:50 PM, Alex Bligh wrote: > Assuming the cache quota is not exhausted, how do you know that > a VM has finished 'creating' the cache? At any point it might > read a bit more from the backing image. I was assuming on shutdown. > I'm wondering whether you could just use POSIX ma

Re: [Qemu-devel] [PATCH] Introduce cache images for the QCOW2 format

2013-08-14 Thread Kaveh Razavi
Hi, On 08/14/2013 11:29 AM, Stefan Hajnoczi wrote: > 100 MB is small enough for RAM. Did you try enabling the host kernel > page cache for the backing file? That way all guests running on this > host share a single RAM-cached version of the backing file. > Yes, indeed. That is why we think it m

Re: [Qemu-devel] [PATCH] Introduce cache images for the QCOW2 format

2013-08-14 Thread Alex Bligh
On 14 Aug 2013, at 14:43, Kaveh Razavi wrote: > No, once the read-only cache is created, it can be used by different VMs > on the same host. But yes, it first needs to be created. OK - this was the point I had missed. Assuming the cache quota is not exhausted, how do you know that a VM has

Re: [Qemu-devel] [PATCH] Introduce cache images for the QCOW2 format

2013-08-14 Thread Kaveh Razavi
On 08/14/2013 02:02 PM, Alex Bligh wrote: >> > Not really. I meant different backing images, and not necessarily >> > booting on the same host. > So how does your cache solve the problem you mentioned in that > para? > If you have a fast network (think 10GbE), then qcow2 can easily boot many VMs

Re: [Qemu-devel] [PATCH] Introduce cache images for the QCOW2 format

2013-08-14 Thread Kaveh Razavi
On 08/14/2013 01:57 PM, Alex Bligh wrote: > I don't agree. The penalty for a qcow2 suffering a false positive on > a change to a backing file is that the VM can no longer boot. The > penalty for your cache suffering a false positive is that the > VM boots marginally slower. Moreover, it is expected

Re: [Qemu-devel] [PATCH] Introduce cache images for the QCOW2 format

2013-08-14 Thread Alex Bligh
On 14 Aug 2013, at 12:52, Fam Zheng wrote: > Yes, this one sounds good to have. VMDK and VHDX have this kind of > backing file status validation. ... though I'd prefer something safer than looking at mtime, for instance a sequence number that is incremented prior to any bdrv_close if a write has

Re: [Qemu-devel] [PATCH] Introduce cache images for the QCOW2 format

2013-08-14 Thread Alex Bligh
On 14 Aug 2013, at 12:42, Kaveh Razavi wrote: > On 08/14/2013 01:16 AM, Alex Bligh wrote: >> The above para implies you intend one cache file to be shared by >> two VMs booting from the same backing image on the same node. >> If that's true, how do you protect yourself from the following > > Not

Re: [Qemu-devel] [PATCH] Introduce cache images for the QCOW2 format

2013-08-14 Thread Alex Bligh
Kaveh, On 14 Aug 2013, at 12:28, Kaveh Razavi wrote: > On 08/14/2013 12:53 AM, Alex Bligh wrote: >> What is this cache keyed on and how is it invalidated? Let's say >> 2 VMs on node X boot with backing file A. The first populates the cache, >> and the second utilises the cache. I then stop both

Re: [Qemu-devel] [PATCH] Introduce cache images for the QCOW2 format

2013-08-14 Thread Fam Zheng
On Wed, 08/14 13:28, Kaveh Razavi wrote: > On 08/14/2013 12:53 AM, Alex Bligh wrote: > > What is this cache keyed on and how is it invalidated? Let's say > > 2 VMs on node X boot with backing file A. The first populates the cache, > > and the second utilises the cache. I then stop both VMs, delete

Re: [Qemu-devel] [PATCH] Introduce cache images for the QCOW2 format

2013-08-14 Thread Kaveh Razavi
On 08/14/2013 01:16 AM, Alex Bligh wrote: > The above para implies you intend one cache file to be shared by > two VMs booting from the same backing image on the same node. > If that's true, how do you protect yourself from the following: > Not really. I meant different backing images, and not nec

Re: [Qemu-devel] [PATCH] Introduce cache images for the QCOW2 format

2013-08-14 Thread Kaveh Razavi
On 08/14/2013 12:53 AM, Alex Bligh wrote: > What is this cache keyed on and how is it invalidated? Let's say > 2 VMs on node X boot with backing file A. The first populates the cache, > and the second utilises the cache. I then stop both VMs, delete > the derived disks, and change the contents of

Re: [Qemu-devel] [PATCH] Introduce cache images for the QCOW2 format

2013-08-14 Thread Kaveh Razavi
On 08/13/2013 11:37 PM, Eric Blake wrote: > What is the QMP counterpart for hot-plugging a disk with the cache > attached? Is this something that can integrate nicely with Kevin's > planned blockdev-add for 1.7? > I do not know the details of this, but as long as it has proper support for backin

Re: [Qemu-devel] [PATCH] Introduce cache images for the QCOW2 format

2013-08-14 Thread Stefan Hajnoczi
On Tue, Aug 13, 2013 at 07:03:56PM +0200, Kaveh Razavi wrote: > Using copy-on-write images with the base image stored remotely is common > practice in data centers. This saves significant network traffic by > avoiding the transfer of the complete base image. However, the data > blocks needed for a

Re: [Qemu-devel] [PATCH] Introduce cache images for the QCOW2 format

2013-08-13 Thread Alex Bligh
--On 13 August 2013 19:03:56 +0200 Kaveh Razavi wrote: Also, simultaneously booting VMs from more than one VM image creates a bottleneck at the storage device of the base image, if the storage device does not fare well with the random access pattern that happens during booting. Additional

Re: [Qemu-devel] [PATCH] Introduce cache images for the QCOW2 format

2013-08-13 Thread Alex Bligh
--On 13 August 2013 19:03:56 +0200 Kaveh Razavi wrote: This patch introduces a block-level caching mechanism by introducing a copy-on-read image that supports quota and goes in between the base image and copy-on-write image. This cache image can either be stored on the nodes that run VMs or o
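
The layering described in the quoted patch summary (a quota-limited copy-on-read image sitting between the base image and the copy-on-write image) can be illustrated with a small model. This is only a sketch under stated assumptions, not the patch's implementation: the names Layer, CacheLayer, quota_blocks and read_block are hypothetical, blocks are modelled as dictionary entries, and the quota is counted in blocks rather than bytes.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Layer:
    """A sparse image layer mapping block index -> block data (hypothetical model)."""
    blocks: Dict[int, bytes] = field(default_factory=dict)


@dataclass
class CacheLayer(Layer):
    """Copy-on-read layer that stops filling once its quota is reached."""
    quota_blocks: int = 0  # assumed quota unit: number of blocks

    def try_store(self, index: int, data: bytes) -> bool:
        # Refuse to store when the block is already cached or the quota is full.
        if index in self.blocks or len(self.blocks) >= self.quota_blocks:
            return False
        self.blocks[index] = data
        return True


def read_block(index: int, cow: Layer, cache: CacheLayer, base: Layer) -> bytes:
    """Resolve a read top-down: COW image, then cache image, then base image."""
    if index in cow.blocks:
        return cow.blocks[index]
    if index in cache.blocks:
        return cache.blocks[index]
    data = base.blocks[index]      # the expensive (e.g. remote) read
    cache.try_store(index, data)   # copy-on-read, subject to the quota
    return data
```

In this model a read that misses both upper layers is served from the base image and copied into the cache until the quota is exhausted; later reads of the same block, from any overlay that shares the cache, no longer touch the base image.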

Re: [Qemu-devel] [PATCH] Introduce cache images for the QCOW2 format

2013-08-13 Thread Eric Blake
On 08/13/2013 11:03 AM, Kaveh Razavi wrote: > Using copy-on-write images with the base image stored remotely is common > practice in data centers. This saves significant network traffic by > avoiding the transfer of the complete base image. However, the data > blocks needed for a VM boot still need

[Qemu-devel] [PATCH] Introduce cache images for the QCOW2 format

2013-08-13 Thread Kaveh Razavi
Using copy-on-write images with the base image stored remotely is common practice in data centers. This saves significant network traffic by avoiding the transfer of the complete base image. However, the data blocks needed for a VM boot still need to be transferred to the node that runs the VM. On s