[Qemu-devel] Re: [RFC][STABLE 0.13] Revert "qcow2: Use bdrv_(p)write_sync for metadata writes"

Anthony Liguori Wed, 25 Aug 2010 05:47:53 -0700

On 08/25/2010 02:14 AM, Avi Kivity wrote:

If (c) happens before (b), then we've created an extent that's attached to a table with a zero reference count. This is a corrupt image.
If the only issue is new block allocation, it can be easily solved.

Technically, I believe there are similar issues around creating snapshots but I don't think we care.

Instead of allocating exactly the needed amount of blocks, allocate a large extent and hold them in memory.

So you're suggesting that we allocate a bunch of blocks, update the refcount table so that they are seen as allocated even though they aren't attached to an l1 table?

The next allocation can then be filled from memory, so the allocation sync is amortized over many blocks. A power fail will leak the preallocated blocks, losing some megabytes of address space, but not real disk space.

It's a clever idea, but it would lose real disk space which is probably not a huge issue.
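
To make the amortization concrete, here is a toy model of the batched allocation idea in Python. It is only a sketch under stated assumptions: the class name, the batch size of 64, and the in-memory "durable" set are all made up for illustration and are not qcow2 code. It counts how many metadata syncs are needed and which clusters would be leaked if power failed.

```python
from collections import deque


class PreallocatingAllocator:
    """Toy model (hypothetical): reserve clusters in batches so that one
    metadata sync covers many later allocations."""

    def __init__(self, batch_size=64):
        self.batch_size = batch_size
        self.next_free = 0             # next cluster index never reserved
        self.durable_allocated = set() # clusters marked allocated "on disk"
        self.pool = deque()            # reserved clusters held only in memory
        self.sync_count = 0            # number of metadata syncs issued

    def _reserve_batch(self):
        batch = range(self.next_free, self.next_free + self.batch_size)
        self.next_free += self.batch_size
        self.durable_allocated.update(batch)  # mark the whole batch allocated
        self.sync_count += 1                  # one sync for the whole batch
        self.pool.extend(batch)

    def allocate(self):
        """Hand out one cluster, syncing only when the pool is empty."""
        if not self.pool:
            self._reserve_batch()
        return self.pool.popleft()

    def leaked_after_crash(self, attached):
        """Clusters marked allocated on disk but never attached to a table."""
        return self.durable_allocated - set(attached)


alloc = PreallocatingAllocator(batch_size=64)
attached = [alloc.allocate() for _ in range(100)]
leaked = alloc.leaked_after_crash(attached)
```

With a batch of 64, the 100 allocations above cost two syncs instead of 100, and a crash at that point leaves the 28 reserved but unused clusters leaked. Whether that leak costs real disk space or only address space is exactly the point under discussion and is not modelled here.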

Let's consider if we eliminate the reference count table which means eliminating internal snapshots.
1) guest submits write request
2) allocate extent
3) write data to disk (a)
4) write (a) completes
5) write extent table (c)
6) write (c) completes
7) complete guest write request
If this all happens in order and we lose power, we just leak a block. It means we need a periodic fsck.
If (c) completes before (a), then it means that the image is not corrupted but data gets lost. This is okay based on the guest contract.
And that's it.  There is no scenario where the disk is corrupted.
_if_ that's the only failure mode.
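
The crash cases for the sequence above can be enumerated mechanically. The Python sketch below does that under a simplifying assumption: only two writes matter, the data write (a) and the extent table write (c), and a crash can land after zero, one, or both of them have reached the disk, in either order. The function names and outcome labels are hypothetical and simply encode the reasoning in this mail; this is not a proof that no other failure mode exists.

```python
def classify(durable):
    """Label the on-disk state given which writes reached the disk.

    "a" is the data write and "c" is the extent table write.
    """
    if durable == {"a", "c"}:
        return "complete"
    if durable == {"a"}:
        # Data is on disk but no table entry points at it.
        return "leaked block"
    if durable == {"c"}:
        # The table entry points at a block whose data never arrived.
        return "unacknowledged data lost"
    return "unchanged"


def crash_outcomes():
    """Map (write order, number of durable writes) to (state, guest acked)."""
    outcomes = {}
    for order in (("a", "c"), ("c", "a")):
        for crash_after in range(len(order) + 1):
            durable = set(order[:crash_after])
            # Step 7 (complete guest write) happens only after both writes.
            acked = crash_after == len(order)
            outcomes[(order, crash_after)] = (classify(durable), acked)
    return outcomes


outcomes = crash_outcomes()
for key, value in sorted(outcomes.items()):
    print(key, value)
```

In this model the guest request is acknowledged only in the "complete" state, so the "unacknowledged data lost" case never affects a write the guest was told had finished.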

If we had another disk format that only supported growth and metadata for a backing file, can you think of another failure scenario?


Regards,

Anthony Liguori
