Re: [PATCH v2] qemu-img: Make unallocated part of backing chain obvious in map

2021-06-11 Thread Nir Soffer
On Fri, Jun 11, 2021 at 5:59 PM Eric Blake  wrote:
>
> On Fri, Jun 11, 2021 at 05:35:12PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> > > An obvious solution is to make 'qemu-img map --output=json'
> > > distinguish between clusters that have a local allocation from those
> > > that are found nowhere in the chain.  We already have a one-off
> > > mismatch between qemu-img map and NBD qemu:allocation-depth (the
> > > former chose 0, and the latter 1 for the local layer), so exposing the
> > > latter's choice of 0 for unallocated in the entire chain would mean
> > > using using "depth":-1 in the former, but a negative depth may confuse
> > > existing tools.  But there is an easy out: for any chain of length N,
> > > we can simply represent an unallocated cluster as "depth":N+1.  This
> > > does have a slight risk of confusing any tool that might try to
> > > dereference NULL when finding the backing image for the last file in
> > > the backing chain, but that risk sseems worth the more precise output.
> > > The iotests have several examples where this distinction demonstrates
> > > the additional accuracy.
> > >
> > > Signed-off-by: Eric Blake 
> > > ---
> > >
> > > Replaces v1: 20210610213906.1313440-1-ebl...@redhat.com
> > > (qemu-img: Use "depth":-1 to make backing probes obvious)
> > >
> > > Use N+1 instead of -1 for unallocated [Kevin]
> > >
> >
> > Bit in contrast with -1, or with separate boolean flag, you lose the 
> > possibility to distinguish case when we have 3 layers and the cluster is 
> > absent in all of them, and the case when we have 4 layers and the cluster 
> > is absent in top 3 but in 4 it is qcow2 UNALLOCATED_ZERO cluster.
>
> Using just 'qemu-img map --output-json', you only see depth numbers.
> You also have to use 'qemu-img info --backing-chain' to see what file
> those depth numbers correspond to, at which point it becomes obvious
> whether "depth":4 meant unallocated (because the chain was length 3)
> or allocated at depth 4 (because the chain was length 4 or longer).
> But that's no worse than pre-patch, where you had to use qemu-img info
> --backing-chain to learn which file a particular "depth" maps to.
>
> >
> > So, if someone use this API to reconstruct the chain, then for original 3 
> > empty layers he will create 3 empty layers and 4rd additional ZERO layer. 
> > And such reconstructed chain would not be equal to original chain (as if we 
> > take these two chains and add additional backing file as a new bottom 
> > layer, effect would be different).. I'm not sure is it a problem in the 
> > task you are solving :\
>
> It should be fairly easy to optimize the case of a backing chain where
> EVERY listed cluster at the final depth was "data":false,"zero":true
> to omit that file after all.
>
> And in oVirt's case, Nir pointed out that we have one more tool at our
> disposal in recreating a backing chain: if you use
> json:{"driver":"qcow2", "backing":null, ...} as your image file, you
> don't have to worry about arbitrary files in the backing chain, only
> about recreating the top-most layer of a chain.  And in that case, it
> becomes very obvious that "depth":0 is something you must recreate,
> and "depth":1 would be a non-existent backing file because you just
> passed "backing":null.

Note that oVirt does not use qemu-img map, we use qemu-nbd to get
image extents, since it is used only in context we already connect to
qemu-nbd server or run qemu-nbd.

Management tools already know the image format (they should avoid
doing format probing anyway), and using a json uri allows single command
to get the needed info when you inspect a single layer.

But this change introduces a risk that some program using qemu-img map
will interrupt the result in the wrong way, assuming that there is N+1 layer.

I think adding a new flag for absent extents is better. It cannot break
any user and it is easier to understand and use.

Nir




Re: [PATCH v2] qemu-img: Make unallocated part of backing chain obvious in map

2021-06-11 Thread Eric Blake
On Fri, Jun 11, 2021 at 05:35:12PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> > An obvious solution is to make 'qemu-img map --output=json'
> > distinguish between clusters that have a local allocation from those
> > that are found nowhere in the chain.  We already have a one-off
> > mismatch between qemu-img map and NBD qemu:allocation-depth (the
> > former chose 0, and the latter 1 for the local layer), so exposing the
> > latter's choice of 0 for unallocated in the entire chain would mean
> > using using "depth":-1 in the former, but a negative depth may confuse
> > existing tools.  But there is an easy out: for any chain of length N,
> > we can simply represent an unallocated cluster as "depth":N+1.  This
> > does have a slight risk of confusing any tool that might try to
> > dereference NULL when finding the backing image for the last file in
> > the backing chain, but that risk sseems worth the more precise output.
> > The iotests have several examples where this distinction demonstrates
> > the additional accuracy.
> > 
> > Signed-off-by: Eric Blake 
> > ---
> > 
> > Replaces v1: 20210610213906.1313440-1-ebl...@redhat.com
> > (qemu-img: Use "depth":-1 to make backing probes obvious)
> > 
> > Use N+1 instead of -1 for unallocated [Kevin]
> > 
> 
> Bit in contrast with -1, or with separate boolean flag, you lose the 
> possibility to distinguish case when we have 3 layers and the cluster is 
> absent in all of them, and the case when we have 4 layers and the cluster is 
> absent in top 3 but in 4 it is qcow2 UNALLOCATED_ZERO cluster.

Using just 'qemu-img map --output-json', you only see depth numbers.
You also have to use 'qemu-img info --backing-chain' to see what file
those depth numbers correspond to, at which point it becomes obvious
whether "depth":4 meant unallocated (because the chain was length 3)
or allocated at depth 4 (because the chain was length 4 or longer).
But that's no worse than pre-patch, where you had to use qemu-img info
--backing-chain to learn which file a particular "depth" maps to.

> 
> So, if someone use this API to reconstruct the chain, then for original 3 
> empty layers he will create 3 empty layers and 4rd additional ZERO layer. And 
> such reconstructed chain would not be equal to original chain (as if we take 
> these two chains and add additional backing file as a new bottom layer, 
> effect would be different).. I'm not sure is it a problem in the task you are 
> solving :\

It should be fairly easy to optimize the case of a backing chain where
EVERY listed cluster at the final depth was "data":false,"zero":true
to omit that file after all.

And in oVirt's case, Nir pointed out that we have one more tool at our
disposal in recreating a backing chain: if you use
json:{"driver":"qcow2", "backing":null, ...} as your image file, you
don't have to worry about arbitrary files in the backing chain, only
about recreating the top-most layer of a chain.  And in that case, it
becomes very obvious that "depth":0 is something you must recreate,
and "depth":1 would be a non-existent backing file because you just
passed "backing":null.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org




Re: [PATCH v2] qemu-img: Make unallocated part of backing chain obvious in map

2021-06-11 Thread Vladimir Sementsov-Ogievskiy

11.06.2021 17:01, Eric Blake wrote:

The recently-added NBD context qemu:allocation-depth is able to
distinguish between locally-present data (even with that data is
sparse) [shown as depth 1 over NBD], and data that could not be found
anywhere in the backing chain [shown as depth 0].  But qemu-img map
--output=json predates that addition, and prior to this patch has the
unfortunate behavior that all portions of the backing chain that
resolve without finding a hit in any backing layer report the same
depth as the final backing layer.  This makes it harder to reconstruct
a qcow2 backing chain using just 'qemu-img map' output, especially
when using "backing":null to artificially limit a backing chain,
because it is impossible to distinguish between a
QCOW2_CLUSTER_UNALLOCATED (which defers to a [missing] backing file)
and a QCOW2_CLUSTER_ZERO_PLAIN cluster (which would override any
backing file), since both types of clusters otherwise show as
"data":false,"zero":true" (but note that we can distinguish a
QCOW2_CLUSTER_ZERO_ALLOCATED, which would also have an "offset":
listing).

The task of reconstructing a qcow2 chain was made harder in commit
0da9856851 (nbd: server: Report holes for raw images), because prior
to that point, it was possible to abuse NBD's block status command to
see which portions of a qcow2 file resulted in BDRV_BLOCK_ALLOCATED
(showing up as NBD_STATE_ZERO in isolation) vs. missing from the chain
(showing up as NBD_STATE_ZERO|NBD_STATE_HOLE); but now qemu reports
more accurate sparseness information over NBD.

An obvious solution is to make 'qemu-img map --output=json'
distinguish between clusters that have a local allocation from those
that are found nowhere in the chain.  We already have a one-off
mismatch between qemu-img map and NBD qemu:allocation-depth (the
former chose 0, and the latter 1 for the local layer), so exposing the
latter's choice of 0 for unallocated in the entire chain would mean
using using "depth":-1 in the former, but a negative depth may confuse
existing tools.  But there is an easy out: for any chain of length N,
we can simply represent an unallocated cluster as "depth":N+1.  This
does have a slight risk of confusing any tool that might try to
dereference NULL when finding the backing image for the last file in
the backing chain, but that risk sseems worth the more precise output.
The iotests have several examples where this distinction demonstrates
the additional accuracy.

Signed-off-by: Eric Blake 
---

Replaces v1: 20210610213906.1313440-1-ebl...@redhat.com
(qemu-img: Use "depth":-1 to make backing probes obvious)

Use N+1 instead of -1 for unallocated [Kevin]



Bit in contrast with -1, or with separate boolean flag, you lose the 
possibility to distinguish case when we have 3 layers and the cluster is absent 
in all of them, and the case when we have 4 layers and the cluster is absent in 
top 3 but in 4 it is qcow2 UNALLOCATED_ZERO cluster.

So, if someone use this API to reconstruct the chain, then for original 3 empty 
layers he will create 3 empty layers and 4rd additional ZERO layer. And such 
reconstructed chain would not be equal to original chain (as if we take these 
two chains and add additional backing file as a new bottom layer, effect would 
be different).. I'm not sure is it a problem in the task you are solving :\


--
Best regards,
Vladimir



[PATCH v2] qemu-img: Make unallocated part of backing chain obvious in map

2021-06-11 Thread Eric Blake
The recently-added NBD context qemu:allocation-depth is able to
distinguish between locally-present data (even with that data is
sparse) [shown as depth 1 over NBD], and data that could not be found
anywhere in the backing chain [shown as depth 0].  But qemu-img map
--output=json predates that addition, and prior to this patch has the
unfortunate behavior that all portions of the backing chain that
resolve without finding a hit in any backing layer report the same
depth as the final backing layer.  This makes it harder to reconstruct
a qcow2 backing chain using just 'qemu-img map' output, especially
when using "backing":null to artificially limit a backing chain,
because it is impossible to distinguish between a
QCOW2_CLUSTER_UNALLOCATED (which defers to a [missing] backing file)
and a QCOW2_CLUSTER_ZERO_PLAIN cluster (which would override any
backing file), since both types of clusters otherwise show as
"data":false,"zero":true" (but note that we can distinguish a
QCOW2_CLUSTER_ZERO_ALLOCATED, which would also have an "offset":
listing).

The task of reconstructing a qcow2 chain was made harder in commit
0da9856851 (nbd: server: Report holes for raw images), because prior
to that point, it was possible to abuse NBD's block status command to
see which portions of a qcow2 file resulted in BDRV_BLOCK_ALLOCATED
(showing up as NBD_STATE_ZERO in isolation) vs. missing from the chain
(showing up as NBD_STATE_ZERO|NBD_STATE_HOLE); but now qemu reports
more accurate sparseness information over NBD.

An obvious solution is to make 'qemu-img map --output=json'
distinguish between clusters that have a local allocation from those
that are found nowhere in the chain.  We already have a one-off
mismatch between qemu-img map and NBD qemu:allocation-depth (the
former chose 0, and the latter 1 for the local layer), so exposing the
latter's choice of 0 for unallocated in the entire chain would mean
using using "depth":-1 in the former, but a negative depth may confuse
existing tools.  But there is an easy out: for any chain of length N,
we can simply represent an unallocated cluster as "depth":N+1.  This
does have a slight risk of confusing any tool that might try to
dereference NULL when finding the backing image for the last file in
the backing chain, but that risk sseems worth the more precise output.
The iotests have several examples where this distinction demonstrates
the additional accuracy.

Signed-off-by: Eric Blake 
---

Replaces v1: 20210610213906.1313440-1-ebl...@redhat.com
(qemu-img: Use "depth":-1 to make backing probes obvious)

Use N+1 instead of -1 for unallocated [Kevin]

 docs/tools/qemu-img.rst|   3 +-
 qemu-img.c |   3 ++
 tests/qemu-iotests/122.out |  34 ++---
 tests/qemu-iotests/154.out | 100 +++--
 tests/qemu-iotests/179.out |  67 +
 tests/qemu-iotests/223.out |  24 -
 tests/qemu-iotests/244.out |   7 +--
 tests/qemu-iotests/252.out |   4 +-
 tests/qemu-iotests/274.out |  16 +++---
 tests/qemu-iotests/291.out |   8 +--
 tests/qemu-iotests/309.out |   4 +-
 11 files changed, 162 insertions(+), 108 deletions(-)

diff --git a/docs/tools/qemu-img.rst b/docs/tools/qemu-img.rst
index cfe11478791f..c155b1bf3cc8 100644
--- a/docs/tools/qemu-img.rst
+++ b/docs/tools/qemu-img.rst
@@ -599,7 +599,8 @@ Command description:
   - whether the data is known to read as zero (boolean field ``zero``);
   - in order to make the output shorter, the target file is expressed as
 a ``depth``; for example, a depth of 2 refers to the backing file
-of the backing file of *FILENAME*.
+of the backing file of *FILENAME*.  Depth will be one larger than
+the chain length if no file in the chain provides the data.

   In JSON format, the ``offset`` field is optional; it is absent in
   cases where ``human`` format would omit the entry or exit with an error.
diff --git a/qemu-img.c b/qemu-img.c
index a5993682aad4..33a5cd012b8b 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -3035,6 +3035,9 @@ static int get_block_status(BlockDriverState *bs, int64_t 
offset,
 bdrv_refresh_filename(file);
 filename = file->filename;
 }
+if (!(ret & BDRV_BLOCK_ALLOCATED)) {
+depth++;
+}

 *e = (MapEntry) {
 .start = offset,
diff --git a/tests/qemu-iotests/122.out b/tests/qemu-iotests/122.out
index 3a3e121d579d..779dab4847f0 100644
--- a/tests/qemu-iotests/122.out
+++ b/tests/qemu-iotests/122.out
@@ -68,11 +68,11 @@ read 65536/65536 bytes at offset 4194304
 read 65536/65536 bytes at offset 8388608
 64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 [{ "start": 0, "length": 65536, "depth": 0, "zero": false, "data": true},
-{ "start": 65536, "length": 4128768, "depth": 0, "zero": true, "data": false},
+{ "start": 65536, "length": 4128768, "depth": 1, "zero": true, "data": false},
 { "start": 4194304, "length": 65536, "depth": 0, "zero": false, "data": true},
-{ "start": 4259840, "length": 4128768,