Re: [PATCH 1/5] iotests: Update 241 to expose backing layer fragmentation

2021-02-25 Thread Vladimir Sementsov-Ogievskiy

25.02.2021 18:52, Eric Blake wrote:

On 2/25/21 8:57 AM, Vladimir Sementsov-Ogievskiy wrote:

25.02.2021 16:50, Vladimir Sementsov-Ogievskiy wrote:

18.02.2021 23:15, Eric Blake wrote:

Previous commits (such as 6e280648, 75d34eb9) have mentioned that our
NBD server still sends unaligned fragments when an active layer with
large advertised minimum block size is backed by another layer with a
smaller block size. Expand the test to actually cover these scenario,
by using two different approaches: qcow2 encryption (which forces
512-byte alignment) with an unaligned raw backing file, and blkdebug
with a 4k alignment.





Now I don't think that aligning qemu:allocation-depth information is a
correct thing to do.


Why not?  First, it's very rare that you'd have a qcow2 image with
mandated 4k minimum block size, backed by another qcow2 image with 512
block size (blkdebug made it possible to expose the bug, but I could not
find a way in common day-to-day usage), so we really aren't impacting
normal users.  Second, from the perspective of copying backing chains
over NBD, what difference does it make if we have the backing chain:

A (granularity 512) <- B (granularity 512) <- C (granularity 4k)

with the allocation pattern:

A: -A-A-A-A-A-A-A-A
B: --BB--BB--BB--BB
C: 

and report the allocation depth as:



instead of

03220322

The former may be imprecise, but it obeys our bounds, and in all
reality, if all we have access to is 4k chunks, any decisions we make
about how to handle that 4k block should be based on the fact that at
least some of the data was allocated in our backing file, and not
treating the entire 4k as unallocated merely because the first 512 bytes
are neither in A nor B.



I'm not sure about NBD client, but in qemu block-jobs the decision may be 
different for different tasks, as I mentioned in my answer on [2/5].
For example block-stream will skip chunks allocated in top, because nothing to 
do, data is already in top. But if we imagine that top may return ALLOCATED for 
something that is not ALLOCATED the stream logic is broken.. Probably that's a 
bad example.

I agree that this is a rare case anyway and we probably shouldn't care too 
much. But we should at least describe it in allocation-depth specification.


--
Best regards,
Vladimir



Re: [PATCH 1/5] iotests: Update 241 to expose backing layer fragmentation

2021-02-25 Thread Eric Blake
On 2/25/21 8:57 AM, Vladimir Sementsov-Ogievskiy wrote:
> 25.02.2021 16:50, Vladimir Sementsov-Ogievskiy wrote:
>> 18.02.2021 23:15, Eric Blake wrote:
>>> Previous commits (such as 6e280648, 75d34eb9) have mentioned that our
>>> NBD server still sends unaligned fragments when an active layer with
>>> large advertised minimum block size is backed by another layer with a
>>> smaller block size. Expand the test to actually cover these scenario,
>>> by using two different approaches: qcow2 encryption (which forces
>>> 512-byte alignment) with an unaligned raw backing file, and blkdebug
>>> with a 4k alignment.
>>>

> 
> Now I don't think that aligning qemu:allocation-depth information is a
> correct thing to do.

Why not?  First, it's very rare that you'd have a qcow2 image with
mandated 4k minimum block size, backed by another qcow2 image with 512
block size (blkdebug made it possible to expose the bug, but I could not
find a way in common day-to-day usage), so we really aren't impacting
normal users.  Second, from the perspective of copying backing chains
over NBD, what difference does it make if we have the backing chain:

A (granularity 512) <- B (granularity 512) <- C (granularity 4k)

with the allocation pattern:

A: -A-A-A-A-A-A-A-A
B: --BB--BB--BB--BB
C: 

and report the allocation depth as:

   

instead of

   03220322

The former may be imprecise, but it obeys our bounds, and in all
reality, if all we have access to is 4k chunks, any decisions we make
about how to handle that 4k block should be based on the fact that at
least some of the data was allocated in our backing file, and not
treating the entire 4k as unallocated merely because the first 512 bytes
are neither in A nor B.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [PATCH 1/5] iotests: Update 241 to expose backing layer fragmentation

2021-02-25 Thread Eric Blake
On 2/25/21 7:50 AM, Vladimir Sementsov-Ogievskiy wrote:
> 18.02.2021 23:15, Eric Blake wrote:
>> Previous commits (such as 6e280648, 75d34eb9) have mentioned that our
>> NBD server still sends unaligned fragments when an active layer with
>> large advertised minimum block size is backed by another layer with a
>> smaller block size. Expand the test to actually cover these scenario,
>> by using two different approaches: qcow2 encryption (which forces
>> 512-byte alignment) with an unaligned raw backing file, and blkdebug
>> with a 4k alignment.
>>
>> The encryption test passes with the desired results, but only because
>> the client side works around the server's non-compliance; if you
>> repeat the test manually with tracing turned on, you will see the
>> server sending a status for 1000 bytes of data then 1048 bytes of
>> hole, which is not aligned. But reverting commit 737d3f5244 shows that
>> it is indeed the client working around the bug in the server.
>>
>> Meanwhile, the blkdebug test gives incorrect results: remember, when
>> using x-dirty-bitmap with qemu-img map as a way to sniff alternative
>> metadata contexts, the meanings of "data" and "zero" are determined by
> 
> How I'm tired of this abuse:) It seems that total amount of comments
> about it in code and commit messages worth creating more intuitive
> interface.. Don't you have an idea in mind?

Yes: 'nbdinfo' as part of the libnbd project ;)

Sadly, libnbd is not available on all our common porting targets yet,
and nbdinfo is less than a year old (so even distros that have libnbd
1.0 are too old).

> 
>> that context.  Our client workaround is assuming that the fragmented
>> replies can be merged according to base:allocation rules, but those
>> rules do not work for other contexts (merging dirty and clean bitmap
>> should produce dirty; merging allocated and unallocated should produce
>> allocated; see the FIXME for more about the decoded values we expect).
> 
> You could instead keep the test output correct (without FIXME marks) but
> add the test to "disabled" group and drop it from the group when fixed.

Either way, it's fixed by the end of the series.

> 
>>
>> Signed-off-by: Eric Blake 
> 
> Reviewed-by: Vladimir Sementsov-Ogievskiy 
> 

Thanks!

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [PATCH 1/5] iotests: Update 241 to expose backing layer fragmentation

2021-02-25 Thread Vladimir Sementsov-Ogievskiy

25.02.2021 16:50, Vladimir Sementsov-Ogievskiy wrote:

18.02.2021 23:15, Eric Blake wrote:

Previous commits (such as 6e280648, 75d34eb9) have mentioned that our
NBD server still sends unaligned fragments when an active layer with
large advertised minimum block size is backed by another layer with a
smaller block size. Expand the test to actually cover these scenario,
by using two different approaches: qcow2 encryption (which forces
512-byte alignment) with an unaligned raw backing file, and blkdebug
with a 4k alignment.

The encryption test passes with the desired results, but only because
the client side works around the server's non-compliance; if you
repeat the test manually with tracing turned on, you will see the
server sending a status for 1000 bytes of data then 1048 bytes of
hole, which is not aligned. But reverting commit 737d3f5244 shows that
it is indeed the client working around the bug in the server.

Meanwhile, the blkdebug test gives incorrect results: remember, when
using x-dirty-bitmap with qemu-img map as a way to sniff alternative
metadata contexts, the meanings of "data" and "zero" are determined by


How I'm tired of this abuse:) It seems that total amount of comments
about it in code and commit messages worth creating more intuitive
interface.. Don't you have an idea in mind?


that context.  Our client workaround is assuming that the fragmented
replies can be merged according to base:allocation rules, but those
rules do not work for other contexts (merging dirty and clean bitmap
should produce dirty; merging allocated and unallocated should produce
allocated; see the FIXME for more about the decoded values we expect).


You could instead keep the test output correct (without FIXME marks) but
add the test to "disabled" group and drop it from the group when fixed.



Signed-off-by: Eric Blake 


Reviewed-by: Vladimir Sementsov-Ogievskiy 



Now I don't think that aligning qemu:allocation-depth information is a correct 
thing to do.


--
Best regards,
Vladimir



Re: [PATCH 1/5] iotests: Update 241 to expose backing layer fragmentation

2021-02-25 Thread Vladimir Sementsov-Ogievskiy

18.02.2021 23:15, Eric Blake wrote:

Previous commits (such as 6e280648, 75d34eb9) have mentioned that our
NBD server still sends unaligned fragments when an active layer with
large advertised minimum block size is backed by another layer with a
smaller block size. Expand the test to actually cover these scenario,
by using two different approaches: qcow2 encryption (which forces
512-byte alignment) with an unaligned raw backing file, and blkdebug
with a 4k alignment.

The encryption test passes with the desired results, but only because
the client side works around the server's non-compliance; if you
repeat the test manually with tracing turned on, you will see the
server sending a status for 1000 bytes of data then 1048 bytes of
hole, which is not aligned. But reverting commit 737d3f5244 shows that
it is indeed the client working around the bug in the server.

Meanwhile, the blkdebug test gives incorrect results: remember, when
using x-dirty-bitmap with qemu-img map as a way to sniff alternative
metadata contexts, the meanings of "data" and "zero" are determined by


How I'm tired of this abuse:) It seems that total amount of comments
about it in code and commit messages worth creating more intuitive
interface.. Don't you have an idea in mind?


that context.  Our client workaround is assuming that the fragmented
replies can be merged according to base:allocation rules, but those
rules do not work for other contexts (merging dirty and clean bitmap
should produce dirty; merging allocated and unallocated should produce
allocated; see the FIXME for more about the decoded values we expect).


You could instead keep the test output correct (without FIXME marks) but
add the test to "disabled" group and drop it from the group when fixed.



Signed-off-by: Eric Blake 


Reviewed-by: Vladimir Sementsov-Ogievskiy 


---
  tests/qemu-iotests/241 | 60 ++
  tests/qemu-iotests/241.out | 21 +
  2 files changed, 76 insertions(+), 5 deletions(-)

diff --git a/tests/qemu-iotests/241 b/tests/qemu-iotests/241
index c962c8b6075d..5217af82dc65 100755
--- a/tests/qemu-iotests/241
+++ b/tests/qemu-iotests/241
@@ -3,7 +3,7 @@
  #
  # Test qemu-nbd vs. unaligned images
  #
-# Copyright (C) 2018-2019 Red Hat, Inc.
+# Copyright (C) 2018-2021 Red Hat, Inc.
  #
  # This program is free software; you can redistribute it and/or modify
  # it under the terms of the GNU General Public License as published by
@@ -27,7 +27,7 @@ status=1 # failure is the default!
  _cleanup()
  {
  _cleanup_test_img
-rm -f "$TEST_DIR/server.log"
+rm -f "$TEST_DIR/server.log" "$TEST_IMG_FILE.mid" "$TEST_IMG_FILE.qcow2"
  nbd_server_stop
  }
  trap "_cleanup; exit \$status" 0 1 2 3 15
@@ -37,7 +37,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
  . ./common.filter
  . ./common.nbd

-_supported_fmt raw
+_supported_fmt raw # although the test also requires use of qcow2
  _supported_proto nbd
  _supported_os Linux
  _require_command QEMU_NBD
@@ -89,11 +89,61 @@ $QEMU_IMG map --output=json "$TEST_IMG" | 
_filter_qemu_img_map
  $QEMU_IO -c map "$TEST_IMG"
  nbd_server_stop

-# Not tested yet: we also want to ensure that qemu as NBD client does
+echo
+echo "=== Encrypted qcow2 file backed by unaligned raw image ==="
+echo
+
+# Enabling encryption in qcow2 forces 512-alignment
+SECRET=secret,id=sec0,data=12345
+$QEMU_IMG create -f qcow2 -b "$TEST_IMG_FILE" -F raw --object "$SECRET" \
+  -o encrypt.format=luks,encrypt.key-secret=sec0,encrypt.iter-time=10 \
+  "$TEST_IMG_FILE.qcow2" 2k | _filter_img_create
+nbd_server_start_unix_socket --object "$SECRET" --image-opts \
+  driver=qcow2,file.filename="$TEST_IMG_FILE.qcow2",encrypt.key-secret=sec0
+
+$QEMU_NBD_PROG --list -k $nbd_unix_socket | grep '\(size\|min\)'
+$QEMU_IMG map --output=json "$TEST_IMG" | _filter_qemu_img_map
+$QEMU_IO -c map "$TEST_IMG"
+nbd_server_stop
+
+echo
+echo "=== Use blkdebug for larger alignment than backing layer ==="
+echo
+
+$QEMU_IMG create -f qcow2 -o cluster_size=1024 \
+  "$TEST_IMG_FILE" 2k | _filter_img_create
+$QEMU_IMG create -f qcow2 -b "$TEST_IMG_FILE" -F qcow2 -o cluster_size=512 \
+  "$TEST_IMG_FILE.mid" | _filter_img_create
+$QEMU_IMG bitmap --add -g 512 --enable "$TEST_IMG_FILE.mid" b0
+$QEMU_IO -f qcow2 -c 'w 512 512' "$TEST_IMG_FILE.mid" | _filter_qemu_io
+$QEMU_IMG create -f qcow2 -b "$TEST_IMG_FILE.mid" -F qcow2 \
+  "$TEST_IMG_FILE.qcow2" 4k | _filter_img_create
+FILE=image.file.filename="$TEST_IMG_FILE.qcow2"
+nbd_server_start_unix_socket -B b0 -A --image-opts \
+  driver=blkdebug,align=4096,image.driver=qcow2,image.file.driver=file,"$FILE"
+
+TEST_IMG="driver=nbd,server.type=unix,server.path=$nbd_unix_socket"
+$QEMU_NBD_PROG --list -k $nbd_unix_socket | grep '\(size\|min\)'
+# FIXME: this should report a single 4k block of "data":false which translates
+# to the dirty bitmap being set for at least part of the region; "data":true
+# is wrong unless the entire 4k is clean.
+$QEMU_IMG map --output=json 

[PATCH 1/5] iotests: Update 241 to expose backing layer fragmentation

2021-02-18 Thread Eric Blake
Previous commits (such as 6e280648, 75d34eb9) have mentioned that our
NBD server still sends unaligned fragments when an active layer with
large advertised minimum block size is backed by another layer with a
smaller block size. Expand the test to actually cover these scenario,
by using two different approaches: qcow2 encryption (which forces
512-byte alignment) with an unaligned raw backing file, and blkdebug
with a 4k alignment.

The encryption test passes with the desired results, but only because
the client side works around the server's non-compliance; if you
repeat the test manually with tracing turned on, you will see the
server sending a status for 1000 bytes of data then 1048 bytes of
hole, which is not aligned. But reverting commit 737d3f5244 shows that
it is indeed the client working around the bug in the server.

Meanwhile, the blkdebug test gives incorrect results: remember, when
using x-dirty-bitmap with qemu-img map as a way to sniff alternative
metadata contexts, the meanings of "data" and "zero" are determined by
that context.  Our client workaround is assuming that the fragmented
replies can be merged according to base:allocation rules, but those
rules do not work for other contexts (merging dirty and clean bitmap
should produce dirty; merging allocated and unallocated should produce
allocated; see the FIXME for more about the decoded values we expect).

Signed-off-by: Eric Blake 
---
 tests/qemu-iotests/241 | 60 ++
 tests/qemu-iotests/241.out | 21 +
 2 files changed, 76 insertions(+), 5 deletions(-)

diff --git a/tests/qemu-iotests/241 b/tests/qemu-iotests/241
index c962c8b6075d..5217af82dc65 100755
--- a/tests/qemu-iotests/241
+++ b/tests/qemu-iotests/241
@@ -3,7 +3,7 @@
 #
 # Test qemu-nbd vs. unaligned images
 #
-# Copyright (C) 2018-2019 Red Hat, Inc.
+# Copyright (C) 2018-2021 Red Hat, Inc.
 #
 # This program is free software; you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
@@ -27,7 +27,7 @@ status=1 # failure is the default!
 _cleanup()
 {
 _cleanup_test_img
-rm -f "$TEST_DIR/server.log"
+rm -f "$TEST_DIR/server.log" "$TEST_IMG_FILE.mid" "$TEST_IMG_FILE.qcow2"
 nbd_server_stop
 }
 trap "_cleanup; exit \$status" 0 1 2 3 15
@@ -37,7 +37,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
 . ./common.filter
 . ./common.nbd

-_supported_fmt raw
+_supported_fmt raw # although the test also requires use of qcow2
 _supported_proto nbd
 _supported_os Linux
 _require_command QEMU_NBD
@@ -89,11 +89,61 @@ $QEMU_IMG map --output=json "$TEST_IMG" | 
_filter_qemu_img_map
 $QEMU_IO -c map "$TEST_IMG"
 nbd_server_stop

-# Not tested yet: we also want to ensure that qemu as NBD client does
+echo
+echo "=== Encrypted qcow2 file backed by unaligned raw image ==="
+echo
+
+# Enabling encryption in qcow2 forces 512-alignment
+SECRET=secret,id=sec0,data=12345
+$QEMU_IMG create -f qcow2 -b "$TEST_IMG_FILE" -F raw --object "$SECRET" \
+  -o encrypt.format=luks,encrypt.key-secret=sec0,encrypt.iter-time=10 \
+  "$TEST_IMG_FILE.qcow2" 2k | _filter_img_create
+nbd_server_start_unix_socket --object "$SECRET" --image-opts \
+  driver=qcow2,file.filename="$TEST_IMG_FILE.qcow2",encrypt.key-secret=sec0
+
+$QEMU_NBD_PROG --list -k $nbd_unix_socket | grep '\(size\|min\)'
+$QEMU_IMG map --output=json "$TEST_IMG" | _filter_qemu_img_map
+$QEMU_IO -c map "$TEST_IMG"
+nbd_server_stop
+
+echo
+echo "=== Use blkdebug for larger alignment than backing layer ==="
+echo
+
+$QEMU_IMG create -f qcow2 -o cluster_size=1024 \
+  "$TEST_IMG_FILE" 2k | _filter_img_create
+$QEMU_IMG create -f qcow2 -b "$TEST_IMG_FILE" -F qcow2 -o cluster_size=512 \
+  "$TEST_IMG_FILE.mid" | _filter_img_create
+$QEMU_IMG bitmap --add -g 512 --enable "$TEST_IMG_FILE.mid" b0
+$QEMU_IO -f qcow2 -c 'w 512 512' "$TEST_IMG_FILE.mid" | _filter_qemu_io
+$QEMU_IMG create -f qcow2 -b "$TEST_IMG_FILE.mid" -F qcow2 \
+  "$TEST_IMG_FILE.qcow2" 4k | _filter_img_create
+FILE=image.file.filename="$TEST_IMG_FILE.qcow2"
+nbd_server_start_unix_socket -B b0 -A --image-opts \
+  driver=blkdebug,align=4096,image.driver=qcow2,image.file.driver=file,"$FILE"
+
+TEST_IMG="driver=nbd,server.type=unix,server.path=$nbd_unix_socket"
+$QEMU_NBD_PROG --list -k $nbd_unix_socket | grep '\(size\|min\)'
+# FIXME: this should report a single 4k block of "data":false which translates
+# to the dirty bitmap being set for at least part of the region; "data":true
+# is wrong unless the entire 4k is clean.
+$QEMU_IMG map --output=json --image-opts \
+ "$TEST_IMG",x-dirty-bitmap=qemu:dirty-bitmap:b0 | _filter_qemu_img_map
+
+# FIXME: this should report a single 4k block of "zero":true,"data":true,
+# meaning allocated from the backing chain.  Using "zero":false,"data":false
+# (allocated in active layer) or "zero":false,"data":true (entire region
+# unallocated) is wrong.
+$QEMU_IMG map --output=json --image-opts \
+