On Mon, Feb 10, 2020 at 04:29:53PM -0600, Eric Blake wrote:
> On 2/10/20 4:12 PM, Richard W.M. Jones wrote:
> >On Mon, Feb 10, 2020 at 03:37:20PM -0600, Eric Blake wrote:
> >>For now, only 2 of those 16 bits are defined: NBD_INIT_SPARSE (the
> >>image has at least one hole) and NBD_INIT_ZERO (the image reads
> >>completely as zero); the two bits are orthogonal and can be set
> >>independently, although it is easy enough to see completely sparse
> >>files with both bits set.
> >
> >I think I'm confused about the exact meaning of NBD_INIT_SPARSE.  Do
> >you really mean the whole image is sparse; or (as you seem to have
> >said above) that there exists a hole somewhere in the image but we're
> >not saying where it is and there can be non-sparse parts of the image?
>
> As implemented:
>
> NBD_INIT_SPARSE - there is at least one hole somewhere (allocation
> would be required to write to that part of the file), but there may
> be allocated data elsewhere in the image.  Most disk images will fit
> this definition (for example, it is very common to have a hole
> between the MBR or GPT and the first partition containing a file
> system, or for file systems themselves to be sparse within the
> larger block device).

I think I'm still confused about why this particular flag would be
useful for clients (I can completely understand why clients need
NBD_INIT_ZERO).

But anyway ... could a flag indicating that the whole image is sparse
be useful, either as well as NBD_INIT_SPARSE or instead of it?  You
could use it to avoid an initial disk trim, which is something that
mke2fs does:

  https://github.com/tytso/e2fsprogs/blob/0670fc20df4a4bbbeb0edb30d82628ea30a80598/misc/mke2fs.c#L2768

and which is painfully slow over NBD for very large devices because of
the 32 bit limit on request sizes - try doing mke2fs on a 1E nbdkit
memory disk some time.

> NBD_INIT_ZERO - all bytes read as zero.
>
> The combination NBD_INIT_SPARSE|NBD_INIT_ZERO is common (generally,
> if you use lseek(SEEK_DATA) to prove the entire image reads as
> zeroes, you also know the entire image is sparse), but NBD_INIT_ZERO
> in isolation is also possible (especially with the qcow2 proposal of
> a persistent autoclear bit, where even with a fully preallocated
> qcow2 image you still know it reads as zeroes but there are no
> holes).  But you are also right that for servers that can advertise
> both bits efficiently, NBD_INIT_SPARSE in isolation may be more
> common than NBD_INIT_SPARSE|NBD_INIT_ZERO (the former for most disk
> images, the latter only for a freshly-created image that happens to
> create with zero initialization).
>
> What's more, in my patches, I did NOT patch qemu to set or consume
> INIT_SPARSE; so far, it only sets/consumes INIT_ZERO.  Of course, if
> we can find a reason WHY qemu should track whether a qcow2 image is
> fully-allocated, by demonstrating a qemu-img algorithm that becomes
> easier for knowing if an image is sparse (even if our justification
> is: "when copying an image, I want to know if the _source_ is
> sparse, to know whether I have to bend over backwards to preallocate
> the destination"), then using that in qemu makes sense for my v2
> patches.
> But for v1, my only justification was "when copying an
> image, I can skip holes in the source if I know the _destination_
> already reads as zeroes", which only needed INIT_ZERO.
>
> Some of the nbdkit patches demonstrate the some-vs.-all nature of
> the two bits; for example, in the split plugin, I initialize
> h->init_sparse = false; h->init_zero = true; then in a loop over
> each file change h->init_sparse to true if at least one file was
> sparse, and change h->init_zero to false if at least one file had
> non-zero contents.

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html
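
For reference, the some-vs.-all behaviour quoted above for the split plugin could be sketched roughly as follows. This is Python rather than the plugin's C, and it is only an illustration: `FileInfo`, `has_hole`, `all_zero` and `combine_init_flags` are made-up names, and the two bit values are placeholders, not taken from the patches or the protocol proposal.

```python
from dataclasses import dataclass
from typing import Iterable

# Placeholder bit values for illustration only (assumption, not from the proposal).
INIT_SPARSE = 1 << 0
INIT_ZERO = 1 << 1


@dataclass
class FileInfo:
    """Hypothetical per-file summary; not an nbdkit structure."""
    has_hole: bool   # at least one hole somewhere in this file
    all_zero: bool   # every byte of this file reads as zero


def combine_init_flags(files: Iterable[FileInfo]) -> int:
    """Fold per-file facts into the two whole-image bits."""
    init_sparse = False  # "some": flips to True once any file has a hole
    init_zero = True     # "all": stays True only while every file is zero
    for f in files:
        if f.has_hole:
            init_sparse = True
        if not f.all_zero:
            init_zero = False
    flags = 0
    if init_sparse:
        flags |= INIT_SPARSE
    if init_zero:
        flags |= INIT_ZERO
    return flags
```

With one empty sparse file plus one file holding data, this yields INIT_SPARSE alone, which matches the "most disk images" case discussed above. Note that in this sketch an empty list of files yields INIT_ZERO, simply because nothing contradicts the initial value.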
