from:"Max Reitz"

[PULL 17/19] test-write-threshold: drop extra tests

2021-05-14 Thread Max Reitz

From: Vladimir Sementsov-Ogievskiy 

Testing set/get of one 64bit variable doesn't seem necessary. We have a
lot of such variables. Also remaining tests do test set/get anyway.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
Message-Id: <20210506090621.11848-7-vsement...@virtuozzo.com>
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Max Reitz 
---
 tests/unit/test-write-threshold.c | 43 ---
 1 file changed, 43 deletions(-)

diff --git a/tests/unit/test-write-threshold.c 
b/tests/unit/test-write-threshold.c
index bb5c1a5217..9e9986aefc 100644
--- a/tests/unit/test-write-threshold.c
+++ b/tests/unit/test-write-threshold.c
@@ -12,43 +12,6 @@
 #include "block/write-threshold.h"
 
 
-static void test_threshold_not_set_on_init(void)
-{
-uint64_t res;
-BlockDriverState bs;
-memset(, 0, sizeof(bs));
-
-res = bdrv_write_threshold_get();
-g_assert_cmpint(res, ==, 0);
-}
-
-static void test_threshold_set_get(void)
-{
-uint64_t threshold = 4 * 1024 * 1024;
-uint64_t res;
-BlockDriverState bs;
-memset(, 0, sizeof(bs));
-
-bdrv_write_threshold_set(, threshold);
-
-res = bdrv_write_threshold_get();
-g_assert_cmpint(res, ==, threshold);
-}
-
-static void test_threshold_multi_set_get(void)
-{
-uint64_t threshold1 = 4 * 1024 * 1024;
-uint64_t threshold2 = 15 * 1024 * 1024;
-uint64_t res;
-BlockDriverState bs;
-memset(, 0, sizeof(bs));
-
-bdrv_write_threshold_set(, threshold1);
-bdrv_write_threshold_set(, threshold2);
-res = bdrv_write_threshold_get();
-g_assert_cmpint(res, ==, threshold2);
-}
-
 static void test_threshold_not_trigger(void)
 {
 uint64_t threshold = 4 * 1024 * 1024;
@@ -84,12 +47,6 @@ int main(int argc, char **argv)
 {
 size_t i;
 TestStruct tests[] = {
-{ "/write-threshold/not-set-on-init",
-  test_threshold_not_set_on_init },
-{ "/write-threshold/set-get",
-  test_threshold_set_get },
-{ "/write-threshold/multi-set-get",
-  test_threshold_multi_set_get },
 { "/write-threshold/not-trigger",
   test_threshold_not_trigger },
 { "/write-threshold/trigger",
-- 
2.31.1

[PULL 16/19] block/write-threshold: drop extra APIs

2021-05-14 Thread Max Reitz

From: Vladimir Sementsov-Ogievskiy 

bdrv_write_threshold_exceeded() is unused.

bdrv_write_threshold_is_set() is used only to double check the value of
bs->write_threshold_offset in tests. No real sense in it (both tests do
check real value with help of bdrv_write_threshold_get())

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
Message-Id: <20210506090621.11848-5-vsement...@virtuozzo.com>
Reviewed-by: Eric Blake 
Reviewed-by: Stefan Hajnoczi 
[mreitz: Adjusted commit message as per Eric's suggestion]
Signed-off-by: Max Reitz 
---
 include/block/write-threshold.h   | 24 
 block/write-threshold.c   | 19 ---
 tests/unit/test-write-threshold.c |  4 
 3 files changed, 47 deletions(-)

diff --git a/include/block/write-threshold.h b/include/block/write-threshold.h
index 848a5dde85..a03ee1cacd 100644
--- a/include/block/write-threshold.h
+++ b/include/block/write-threshold.h
@@ -35,30 +35,6 @@ void bdrv_write_threshold_set(BlockDriverState *bs, uint64_t 
threshold_bytes);
  */
 uint64_t bdrv_write_threshold_get(const BlockDriverState *bs);
 
-/*
- * bdrv_write_threshold_is_set
- *
- * Tell if a write threshold is set for a given BDS.
- */
-bool bdrv_write_threshold_is_set(const BlockDriverState *bs);
-
-/*
- * bdrv_write_threshold_exceeded
- *
- * Return the extent of a write request that exceeded the threshold,
- * or zero if the request is below the threshold.
- * Return zero also if the threshold was not set.
- *
- * NOTE: here we assume the following holds for each request this code
- * deals with:
- *
- * assert((req->offset + req->bytes) <= UINT64_MAX)
- *
- * Please not there is *not* an actual C assert().
- */
-uint64_t bdrv_write_threshold_exceeded(const BlockDriverState *bs,
-   const BdrvTrackedRequest *req);
-
 /*
  * bdrv_write_threshold_check_write
  *
diff --git a/block/write-threshold.c b/block/write-threshold.c
index 71df3c434f..65a6acd142 100644
--- a/block/write-threshold.c
+++ b/block/write-threshold.c
@@ -24,25 +24,6 @@ uint64_t bdrv_write_threshold_get(const BlockDriverState *bs)
 return bs->write_threshold_offset;
 }
 
-bool bdrv_write_threshold_is_set(const BlockDriverState *bs)
-{
-return bs->write_threshold_offset > 0;
-}
-
-uint64_t bdrv_write_threshold_exceeded(const BlockDriverState *bs,
-   const BdrvTrackedRequest *req)
-{
-if (bdrv_write_threshold_is_set(bs)) {
-if (req->offset > bs->write_threshold_offset) {
-return (req->offset - bs->write_threshold_offset) + req->bytes;
-}
-if ((req->offset + req->bytes) > bs->write_threshold_offset) {
-return (req->offset + req->bytes) - bs->write_threshold_offset;
-}
-}
-return 0;
-}
-
 void bdrv_write_threshold_set(BlockDriverState *bs, uint64_t threshold_bytes)
 {
 bs->write_threshold_offset = threshold_bytes;
diff --git a/tests/unit/test-write-threshold.c 
b/tests/unit/test-write-threshold.c
index fd40a815b8..bb5c1a5217 100644
--- a/tests/unit/test-write-threshold.c
+++ b/tests/unit/test-write-threshold.c
@@ -18,8 +18,6 @@ static void test_threshold_not_set_on_init(void)
 BlockDriverState bs;
 memset(, 0, sizeof(bs));
 
-g_assert(!bdrv_write_threshold_is_set());
-
 res = bdrv_write_threshold_get();
 g_assert_cmpint(res, ==, 0);
 }
@@ -33,8 +31,6 @@ static void test_threshold_set_get(void)
 
 bdrv_write_threshold_set(, threshold);
 
-g_assert(bdrv_write_threshold_is_set());
-
 res = bdrv_write_threshold_get();
 g_assert_cmpint(res, ==, threshold);
 }
-- 
2.31.1

[PULL 15/19] test-write-threshold: rewrite test_threshold_(not_)trigger tests

2021-05-14 Thread Max Reitz

From: Vladimir Sementsov-Ogievskiy 

These tests use bdrv_write_threshold_exceeded() API, which is used only
for test (since pre-previous commit). Better is testing real API, which
is used in block.c as well.

So, let's call bdrv_write_threshold_check_write(), and check is
bs->write_threshold_offset cleared or not (it's cleared iff threshold
triggered).

Also we get rid of BdrvTrackedRequest use here. Note, that paranoiac
bdrv_check_request() calls were added in 8b1170012b1 to protect
BdrvTrackedRequest. Drop them now.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
Message-Id: <20210506090621.11848-4-vsement...@virtuozzo.com>
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Max Reitz 
---
 tests/unit/test-write-threshold.c | 22 --
 1 file changed, 4 insertions(+), 18 deletions(-)

diff --git a/tests/unit/test-write-threshold.c 
b/tests/unit/test-write-threshold.c
index fc1c45a2eb..fd40a815b8 100644
--- a/tests/unit/test-write-threshold.c
+++ b/tests/unit/test-write-threshold.c
@@ -55,41 +55,27 @@ static void test_threshold_multi_set_get(void)
 
 static void test_threshold_not_trigger(void)
 {
-uint64_t amount = 0;
 uint64_t threshold = 4 * 1024 * 1024;
 BlockDriverState bs;
-BdrvTrackedRequest req;
 
 memset(, 0, sizeof(bs));
-memset(, 0, sizeof(req));
-req.offset = 1024;
-req.bytes = 1024;
-
-bdrv_check_request(req.offset, req.bytes, _abort);
 
 bdrv_write_threshold_set(, threshold);
-amount = bdrv_write_threshold_exceeded(, );
-g_assert_cmpuint(amount, ==, 0);
+bdrv_write_threshold_check_write(, 1024, 1024);
+g_assert_cmpuint(bdrv_write_threshold_get(), ==, threshold);
 }
 
 
 static void test_threshold_trigger(void)
 {
-uint64_t amount = 0;
 uint64_t threshold = 4 * 1024 * 1024;
 BlockDriverState bs;
-BdrvTrackedRequest req;
 
 memset(, 0, sizeof(bs));
-memset(, 0, sizeof(req));
-req.offset = (4 * 1024 * 1024) - 1024;
-req.bytes = 2 * 1024;
-
-bdrv_check_request(req.offset, req.bytes, _abort);
 
 bdrv_write_threshold_set(, threshold);
-amount = bdrv_write_threshold_exceeded(, );
-g_assert_cmpuint(amount, >=, 1024);
+bdrv_write_threshold_check_write(, threshold - 1024, 2 * 1024);
+g_assert_cmpuint(bdrv_write_threshold_get(), ==, 0);
 }
 
 typedef struct TestStruct {
-- 
2.31.1

[PULL 09/19] qemu-iotests: fix case of SOCK_DIR already in the environment

2021-05-14 Thread Max Reitz

From: Paolo Bonzini 

Due to a typo, in this case the SOCK_DIR was not being created.

Reviewed-by: Vladimir Sementsov-Ogievskiy 
Signed-off-by: Paolo Bonzini 
Tested-by: Emanuele Giuseppe Esposito 
Message-Id: <20210323181928.311862-6-pbonz...@redhat.com>
Message-Id: <20210503110110.476887-6-pbonz...@redhat.com>
Signed-off-by: Max Reitz 
---
 tests/qemu-iotests/testenv.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/testenv.py b/tests/qemu-iotests/testenv.py
index cd0e39b789..0c3fe75636 100644
--- a/tests/qemu-iotests/testenv.py
+++ b/tests/qemu-iotests/testenv.py
@@ -120,7 +120,7 @@ def init_directories(self) -> None:
 try:
 self.sock_dir = os.environ['SOCK_DIR']
 self.tmp_sock_dir = False
-Path(self.test_dir).mkdir(parents=True, exist_ok=True)
+Path(self.sock_dir).mkdir(parents=True, exist_ok=True)
 except KeyError:
 self.sock_dir = tempfile.mkdtemp()
 self.tmp_sock_dir = True
-- 
2.31.1

[PULL 14/19] block: drop write notifiers

2021-05-14 Thread Max Reitz

From: Vladimir Sementsov-Ogievskiy 

They are unused now.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
Message-Id: <20210506090621.11848-3-vsement...@virtuozzo.com>
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Max Reitz 
---
 include/block/block_int.h | 12 
 block.c   |  1 -
 block/io.c|  6 --
 3 files changed, 19 deletions(-)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index aff948fb63..b2c8b09d0f 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -954,9 +954,6 @@ struct BlockDriverState {
  */
 int64_t total_sectors;
 
-/* Callback before write request is processed */
-NotifierWithReturnList before_write_notifiers;
-
 /* threshold limit for writes, in bytes. "High water mark". */
 uint64_t write_threshold_offset;
 
@@ -1083,15 +1080,6 @@ void bdrv_parse_filename_strip_prefix(const char 
*filename, const char *prefix,
 bool bdrv_backing_overridden(BlockDriverState *bs);
 
 
-/**
- * bdrv_add_before_write_notifier:
- *
- * Register a callback that is invoked before write requests are processed but
- * after any throttling or waiting for overlapping requests.
- */
-void bdrv_add_before_write_notifier(BlockDriverState *bs,
-NotifierWithReturn *notifier);
-
 /**
  * bdrv_add_aio_context_notifier:
  *
diff --git a/block.c b/block.c
index 9ad725d205..75a82af641 100644
--- a/block.c
+++ b/block.c
@@ -400,7 +400,6 @@ BlockDriverState *bdrv_new(void)
 for (i = 0; i < BLOCK_OP_TYPE_MAX; i++) {
 QLIST_INIT(>op_blockers[i]);
 }
-notifier_with_return_list_init(>before_write_notifiers);
 qemu_co_mutex_init(>reqs_lock);
 qemu_mutex_init(>dirty_bitmap_mutex);
 bs->refcnt = 1;
diff --git a/block/io.c b/block/io.c
index 3520de51bb..1e826ba9e8 100644
--- a/block/io.c
+++ b/block/io.c
@@ -3165,12 +3165,6 @@ bool bdrv_qiov_is_aligned(BlockDriverState *bs, 
QEMUIOVector *qiov)
 return true;
 }
 
-void bdrv_add_before_write_notifier(BlockDriverState *bs,
-NotifierWithReturn *notifier)
-{
-notifier_with_return_list_add(>before_write_notifiers, notifier);
-}
-
 void bdrv_io_plug(BlockDriverState *bs)
 {
 BdrvChild *child;
-- 
2.31.1

[PULL 12/19] qemu-iotests: fix pylint 2.8 consider-using-with error

2021-05-14 Thread Max Reitz

From: Emanuele Giuseppe Esposito 

pylint 2.8 introduces consider-using-with error, suggesting
to use the 'with' block statement when possible.

Modify all subprocess.Popen call to use the 'with' statement,
except the one in __init__ of QemuIoInteractive class, since
it is assigned to a class field and used in other methods.

Signed-off-by: Emanuele Giuseppe Esposito 
Message-Id: <20210510190449.65948-1-eespo...@redhat.com>
[mreitz: Disable bad-option-value warning in the iotests' pylintrc, so
 that disabling consider-using-with in QemuIoInteractive will
 not produce a warning in pre-2.8 pylint versions]
Signed-off-by: Max Reitz 
---
 tests/qemu-iotests/iotests.py| 65 
 tests/qemu-iotests/pylintrc  |  3 ++
 tests/qemu-iotests/testrunner.py | 22 +--
 3 files changed, 47 insertions(+), 43 deletions(-)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 5ead94229f..777fa2ec0e 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -112,15 +112,14 @@ def qemu_tool_pipe_and_status(tool: str, args: 
Sequence[str],
 Run a tool and return both its output and its exit code
 """
 stderr = subprocess.STDOUT if connect_stderr else None
-subp = subprocess.Popen(args,
-stdout=subprocess.PIPE,
-stderr=stderr,
-universal_newlines=True)
-output = subp.communicate()[0]
-if subp.returncode < 0:
-cmd = ' '.join(args)
-sys.stderr.write(f'{tool} received signal {-subp.returncode}: {cmd}\n')
-return (output, subp.returncode)
+with subprocess.Popen(args, stdout=subprocess.PIPE,
+  stderr=stderr, universal_newlines=True) as subp:
+output = subp.communicate()[0]
+if subp.returncode < 0:
+cmd = ' '.join(args)
+sys.stderr.write(f'{tool} received signal \
+   {-subp.returncode}: {cmd}\n')
+return (output, subp.returncode)
 
 def qemu_img_pipe_and_status(*args: str) -> Tuple[str, int]:
 """
@@ -236,6 +235,9 @@ def qemu_io_silent_check(*args):
 class QemuIoInteractive:
 def __init__(self, *args):
 self.args = qemu_io_args_no_fmt + list(args)
+# We need to keep the Popen objext around, and not
+# close it immediately. Therefore, disable the pylint check:
+# pylint: disable=consider-using-with
 self._p = subprocess.Popen(self.args, stdin=subprocess.PIPE,
stdout=subprocess.PIPE,
stderr=subprocess.STDOUT,
@@ -309,22 +311,22 @@ def qemu_nbd_popen(*args):
 cmd.extend(args)
 
 log('Start NBD server')
-p = subprocess.Popen(cmd)
-try:
-while not os.path.exists(pid_file):
-if p.poll() is not None:
-raise RuntimeError(
-"qemu-nbd terminated with exit code {}: {}"
-.format(p.returncode, ' '.join(cmd)))
-
-time.sleep(0.01)
-yield
-finally:
-if os.path.exists(pid_file):
-os.remove(pid_file)
-log('Kill NBD server')
-p.kill()
-p.wait()
+with subprocess.Popen(cmd) as p:
+try:
+while not os.path.exists(pid_file):
+if p.poll() is not None:
+raise RuntimeError(
+"qemu-nbd terminated with exit code {}: {}"
+.format(p.returncode, ' '.join(cmd)))
+
+time.sleep(0.01)
+yield
+finally:
+if os.path.exists(pid_file):
+os.remove(pid_file)
+log('Kill NBD server')
+p.kill()
+p.wait()
 
 def compare_images(img1, img2, fmt1=imgfmt, fmt2=imgfmt):
 '''Return True if two image files are identical'''
@@ -333,13 +335,12 @@ def compare_images(img1, img2, fmt1=imgfmt, fmt2=imgfmt):
 
 def create_image(name, size):
 '''Create a fully-allocated raw image with sector markers'''
-file = open(name, 'wb')
-i = 0
-while i < size:
-sector = struct.pack('>l504xl', i // 512, i // 512)
-file.write(sector)
-i = i + 512
-file.close()
+with open(name, 'wb') as file:
+i = 0
+while i < size:
+sector = struct.pack('>l504xl', i // 512, i // 512)
+file.write(sector)
+i = i + 512
 
 def image_size(img):
 '''Return image's virtual size'''
diff --git a/tests/qemu-iotests/pylintrc b/tests/qemu-iotests/pylintrc
index 7a6c0a9474..f2c0b522ac 100644
--- a/tests/qemu-iotests/pylintrc
+++ b/tests/qemu-iotests/pylintrc
@@ -19,6 +19,9 @@ disable=invalid-name,
 too-many-public-methods,
 # pylint warns about Optional[] etc. as unsubscriptable in 3.9
 unsubscriptable-obje

[PULL 10/19] Document qemu-img options data_file and data_file_raw

2021-05-14 Thread Max Reitz

From: Connor Kuehl 

The contents of this patch were initially developed and posted by Han
Han[1], however, it appears the original patch was not applied. Since
then, the relevant documentation has been moved and adapted to a new
format.

I've taken most of the original wording and tweaked it according to
some of the feedback from the original patch submission. I've also
adapted it to restructured text, which is the format the documentation
currently uses.

[1] https://lists.nongnu.org/archive/html/qemu-block/2019-10/msg01253.html

Fixes: https://bugzilla.redhat.com/1763105
Signed-off-by: Han Han 
Suggested-by: Max Reitz 
[ Max: provided description of data_file_raw behavior ]
Signed-off-by: Connor Kuehl 
Message-Id: <20210505195512.391128-1-cku...@redhat.com>
Signed-off-by: Max Reitz 
---
 docs/tools/qemu-img.rst | 31 +++
 1 file changed, 31 insertions(+)

diff --git a/docs/tools/qemu-img.rst b/docs/tools/qemu-img.rst
index c9efcfaefc..cfe1147879 100644
--- a/docs/tools/qemu-img.rst
+++ b/docs/tools/qemu-img.rst
@@ -866,6 +866,37 @@ Supported image file formats:
 issue ``lsattr filename`` to check if the NOCOW flag is set or not
 (Capital 'C' is NOCOW flag).
 
+  ``data_file``
+Filename where all guest data will be stored. If this option is used,
+the qcow2 file will only contain the image's metadata.
+
+Note: Data loss will occur if the given filename already exists when
+using this option with ``qemu-img create`` since ``qemu-img`` will create
+the data file anew, overwriting the file's original contents. To simply
+update the reference to point to the given pre-existing file, use
+``qemu-img amend``.
+
+  ``data_file_raw``
+If this option is set to ``on``, QEMU will always keep the external data
+file consistent as a standalone read-only raw image.
+
+It does this by forwarding all write accesses to the qcow2 file through to
+the raw data file, including their offsets. Therefore, data that is visible
+on the qcow2 node (i.e., to the guest) at some offset is visible at the 
same
+offset in the raw data file. This results in a read-only raw image. Writes
+that bypass the qcow2 metadata may corrupt the qcow2 metadata because the
+out-of-band writes may result in the metadata falling out of sync with the
+raw image.
+
+If this option is ``off``, QEMU will use the data file to store data in an
+arbitrary manner. The file’s content will not make sense without the
+accompanying qcow2 metadata. Where data is written will have no relation to
+its offset as seen by the guest, and some writes (specifically zero writes)
+may not be forwarded to the data file at all, but will only be handled by
+modifying qcow2 metadata.
+
+This option can only be enabled if ``data_file`` is set.
+
 ``Other``
 
   QEMU also supports various other image file formats for
-- 
2.31.1

[PULL 03/19] monitor: hmp_qemu_io: acquire aio contex, fix crash

2021-05-14 Thread Max Reitz

From: Vladimir Sementsov-Ogievskiy 

Max reported the following bug:

$ ./qemu-img create -f raw src.img 1G
$ ./qemu-img create -f raw dst.img 1G

$ (echo '
   {"execute":"qmp_capabilities"}
   {"execute":"blockdev-mirror",
"arguments":{"job-id":"mirror",
 "device":"source",
 "target":"target",
 "sync":"full",
 "filter-node-name":"mirror-top"}}
'; sleep 3; echo '
   {"execute":"human-monitor-command",
"arguments":{"command-line":
 "qemu-io mirror-top \"write 0 1G\""}}') \
| x86_64-softmmu/qemu-system-x86_64 \
   -qmp stdio \
   -blockdev file,node-name=source,filename=src.img \
   -blockdev file,node-name=target,filename=dst.img \
   -object iothread,id=iothr0 \
   -device virtio-blk,drive=source,iothread=iothr0

crashes:

0  raise () at /usr/lib/libc.so.6
1  abort () at /usr/lib/libc.so.6
2  error_exit
   (err=,
   msg=msg@entry=0x55fbb1634790 <__func__.27> "qemu_mutex_unlock_impl")
   at ../util/qemu-thread-posix.c:37
3  qemu_mutex_unlock_impl
   (mutex=mutex@entry=0x55fbb25ab6e0,
   file=file@entry=0x55fbb1636957 "../util/async.c",
   line=line@entry=650)
   at ../util/qemu-thread-posix.c:109
4  aio_context_release (ctx=ctx@entry=0x55fbb25ab680) at ../util/async.c:650
5  bdrv_do_drained_begin
   (bs=bs@entry=0x55fbb3a87000, recursive=recursive@entry=false,
   parent=parent@entry=0x0,
   ignore_bds_parents=ignore_bds_parents@entry=false,
   poll=poll@entry=true) at ../block/io.c:441
6  bdrv_do_drained_begin
   (poll=true, ignore_bds_parents=false, parent=0x0, recursive=false,
   bs=0x55fbb3a87000) at ../block/io.c:448
7  blk_drain (blk=0x55fbb26c5a00) at ../block/block-backend.c:1718
8  blk_unref (blk=0x55fbb26c5a00) at ../block/block-backend.c:498
9  blk_unref (blk=0x55fbb26c5a00) at ../block/block-backend.c:491
10 hmp_qemu_io (mon=0x7fffaf3fc7d0, qdict=)
   at ../block/monitor/block-hmp-cmds.c:628

man pthread_mutex_unlock
...
EPERM  The  mutex type is PTHREAD_MUTEX_ERRORCHECK or
PTHREAD_MUTEX_RECURSIVE, or the mutex is a robust mutex, and the
current thread does not own the mutex.

So, thread doesn't own the mutex. And we have iothread here.

Next, note that AIO_WAIT_WHILE() documents that ctx must be acquired
exactly once by caller. But where is it acquired in the call stack?
Seems nowhere.

qemuio_command do acquire aio context.. But we need context acquired
around blk_unref() as well and actually around blk_insert_bs() too.

Let's refactor qemuio_command so that it doesn't acquire aio context
but callers do that instead. This way we can cleanly acquire aio
context in hmp_qemu_io() around all three calls.

Reported-by: Max Reitz 
Signed-off-by: Vladimir Sementsov-Ogievskiy 
Message-Id: <20210423134233.51495-1-vsement...@virtuozzo.com>
[mreitz: Fixed comment]
Signed-off-by: Max Reitz 
---
 block/monitor/block-hmp-cmds.c | 31 +--
 qemu-io-cmds.c |  8 
 qemu-io.c  | 17 +++--
 3 files changed, 40 insertions(+), 16 deletions(-)

diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c
index ebf1033f31..3e6670c963 100644
--- a/block/monitor/block-hmp-cmds.c
+++ b/block/monitor/block-hmp-cmds.c
@@ -557,8 +557,10 @@ void hmp_eject(Monitor *mon, const QDict *qdict)
 
 void hmp_qemu_io(Monitor *mon, const QDict *qdict)
 {
-BlockBackend *blk;
+BlockBackend *blk = NULL;
+BlockDriverState *bs = NULL;
 BlockBackend *local_blk = NULL;
+AioContext *ctx = NULL;
 bool qdev = qdict_get_try_bool(qdict, "qdev", false);
 const char *device = qdict_get_str(qdict, "device");
 const char *command = qdict_get_str(qdict, "command");
@@ -573,20 +575,24 @@ void hmp_qemu_io(Monitor *mon, const QDict *qdict)
 } else {
 blk = blk_by_name(device);
 if (!blk) {
-BlockDriverState *bs = bdrv_lookup_bs(NULL, device, );
-if (bs) {
-blk = local_blk = blk_new(bdrv_get_aio_context(bs),
-  0, BLK_PERM_ALL);
-ret = blk_insert_bs(blk, bs, );
-if (ret < 0) {
-goto fail;
-}
-} else {
+bs = bdrv_lookup_bs(NULL, device, );
+if (!bs) {
 goto fail;
 }
 }
 }
 
+ctx = blk ? blk_get_aio_context(blk) : bdrv_get_aio_context(bs);
+aio_context_acquire(ctx);
+
+if (bs) {
+blk = local_blk = blk_new(bdrv_get_aio_context(bs), 0, BLK_PERM_ALL);
+ret = blk_insert_bs(blk, bs, );
+if (ret < 0) {
+goto fail;
+}
+}
+
 /*
  * Notably absent: Proper permi

[PULL 07/19] qemu-iotests: move command line and environment handling from TestRunner to TestEnv

2021-05-14 Thread Max Reitz

From: Paolo Bonzini 

In the next patch, "check" will learn how to execute a test script without
going through TestRunner.  To enable this, keep only the text output
and subprocess handling in the TestRunner; move into TestEnv the logic
to prepare for running a subprocess.

Reviewed-by: Vladimir Sementsov-Ogievskiy 
Signed-off-by: Paolo Bonzini 
Tested-by: Emanuele Giuseppe Esposito 
Message-Id: <20210323181928.311862-4-pbonz...@redhat.com>
Message-Id: <20210503110110.476887-4-pbonz...@redhat.com>
Signed-off-by: Max Reitz 
---
 tests/qemu-iotests/testenv.py| 17 -
 tests/qemu-iotests/testrunner.py | 14 +-
 2 files changed, 17 insertions(+), 14 deletions(-)

diff --git a/tests/qemu-iotests/testenv.py b/tests/qemu-iotests/testenv.py
index 6d27712617..fca3a609e0 100644
--- a/tests/qemu-iotests/testenv.py
+++ b/tests/qemu-iotests/testenv.py
@@ -25,7 +25,7 @@
 import random
 import subprocess
 import glob
-from typing import Dict, Any, Optional, ContextManager
+from typing import List, Dict, Any, Optional, ContextManager
 
 
 def isxfile(path: str) -> bool:
@@ -74,6 +74,21 @@ class TestEnv(ContextManager['TestEnv']):
  'CACHEMODE_IS_DEFAULT', 'IMGFMT_GENERIC', 'IMGOPTSSYNTAX',
  'IMGKEYSECRET', 'QEMU_DEFAULT_MACHINE', 'MALLOC_PERTURB_']
 
+def prepare_subprocess(self, args: List[str]) -> Dict[str, str]:
+if self.debug:
+args.append('-d')
+
+with open(args[0], encoding="utf-8") as f:
+try:
+if f.readline().rstrip() == '#!/usr/bin/env python3':
+args.insert(0, self.python)
+except UnicodeDecodeError:  # binary test? for future.
+pass
+
+os_env = os.environ.copy()
+os_env.update(self.get_env())
+return os_env
+
 def get_env(self) -> Dict[str, str]:
 env = {}
 for v in self.env_variables:
diff --git a/tests/qemu-iotests/testrunner.py b/tests/qemu-iotests/testrunner.py
index 1fc61fcaa3..519924dc81 100644
--- a/tests/qemu-iotests/testrunner.py
+++ b/tests/qemu-iotests/testrunner.py
@@ -129,7 +129,6 @@ class TestRunner(ContextManager['TestRunner']):
 def __init__(self, env: TestEnv, makecheck: bool = False,
  color: str = 'auto') -> None:
 self.env = env
-self.test_run_env = self.env.get_env()
 self.makecheck = makecheck
 self.last_elapsed = LastElapsedTime('.last-elapsed-cache', env)
 
@@ -243,18 +242,7 @@ def do_run_test(self, test: str) -> TestResult:
 silent_unlink(p)
 
 args = [str(f_test.resolve())]
-if self.env.debug:
-args.append('-d')
-
-with f_test.open(encoding="utf-8") as f:
-try:
-if f.readline().rstrip() == '#!/usr/bin/env python3':
-args.insert(0, self.env.python)
-except UnicodeDecodeError:  # binary test? for future.
-pass
-
-env = os.environ.copy()
-env.update(self.test_run_env)
+env = self.env.prepare_subprocess(args)
 
 t0 = time.time()
 with f_bad.open('w', encoding="utf-8") as f:
-- 
2.31.1

[PULL 08/19] qemu-iotests: let "check" spawn an arbitrary test command

2021-05-14 Thread Max Reitz

From: Paolo Bonzini 

Right now there is no easy way for "check" to print a reproducer command.
Because such a reproducer command line would be huge, we can instead teach
check to start a command of our choice.  This can be for example a Python
unit test with arguments to only run a specific subtest.

Move the trailing empty line to print_env(), since it always looks better
and one caller was not adding it.

Signed-off-by: Paolo Bonzini 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
Tested-by: Emanuele Giuseppe Esposito 
Message-Id: <20210323181928.311862-5-pbonz...@redhat.com>
Message-Id: <20210503110110.476887-5-pbonz...@redhat.com>
Signed-off-by: Max Reitz 
---
 tests/qemu-iotests/check | 19 ++-
 tests/qemu-iotests/testenv.py|  3 ++-
 tests/qemu-iotests/testrunner.py |  1 -
 3 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/tests/qemu-iotests/check b/tests/qemu-iotests/check
index 08f51366f1..2dd529eb75 100755
--- a/tests/qemu-iotests/check
+++ b/tests/qemu-iotests/check
@@ -19,6 +19,9 @@
 import os
 import sys
 import argparse
+import shutil
+from pathlib import Path
+
 from findtests import TestFinder
 from testenv import TestEnv
 from testrunner import TestRunner
@@ -100,7 +103,7 @@ def make_argparser() -> argparse.ArgumentParser:
'rerun failed ./check command, starting from the '
'middle of the process.')
 g_sel.add_argument('tests', metavar='TEST_FILES', nargs='*',
-   help='tests to run')
+   help='tests to run, or "--" followed by a command')
 
 return p
 
@@ -113,6 +116,20 @@ if __name__ == '__main__':
   imgopts=args.imgopts, misalign=args.misalign,
   debug=args.debug, valgrind=args.valgrind)
 
+if len(sys.argv) > 1 and sys.argv[-len(args.tests)-1] == '--':
+if not args.tests:
+sys.exit("missing command after '--'")
+cmd = args.tests
+env.print_env()
+exec_pathstr = shutil.which(cmd[0])
+if exec_pathstr is None:
+sys.exit('command not found: ' + cmd[0])
+exec_path = Path(exec_pathstr).resolve()
+cmd[0] = str(exec_path)
+full_env = env.prepare_subprocess(cmd)
+os.chdir(exec_path.parent)
+os.execve(cmd[0], cmd, full_env)
+
 testfinder = TestFinder(test_dir=env.source_iotests)
 
 groups = args.groups.split(',') if args.groups else None
diff --git a/tests/qemu-iotests/testenv.py b/tests/qemu-iotests/testenv.py
index fca3a609e0..cd0e39b789 100644
--- a/tests/qemu-iotests/testenv.py
+++ b/tests/qemu-iotests/testenv.py
@@ -284,7 +284,8 @@ def print_env(self) -> None:
 PLATFORM  -- {platform}
 TEST_DIR  -- {TEST_DIR}
 SOCK_DIR  -- {SOCK_DIR}
-SOCKET_SCM_HELPER -- {SOCKET_SCM_HELPER}"""
+SOCKET_SCM_HELPER -- {SOCKET_SCM_HELPER}
+"""
 
 args = collections.defaultdict(str, self.get_env())
 
diff --git a/tests/qemu-iotests/testrunner.py b/tests/qemu-iotests/testrunner.py
index 519924dc81..2f56ac545d 100644
--- a/tests/qemu-iotests/testrunner.py
+++ b/tests/qemu-iotests/testrunner.py
@@ -316,7 +316,6 @@ def run_tests(self, tests: List[str]) -> bool:
 
 if not self.makecheck:
 self.env.print_env()
-print()
 
 test_field_width = max(len(os.path.basename(t)) for t in tests) + 2
 
-- 
2.31.1

[PULL 13/19] block/write-threshold: don't use write notifiers

2021-05-14 Thread Max Reitz

From: Vladimir Sementsov-Ogievskiy 

write-notifiers are used only for write-threshold. New code for such
purpose should create filters.

Let's better special-case write-threshold and drop write notifiers at
all. (Actually, write-threshold is special-cased anyway, as the only
user of write-notifiers)

So, create a new direct interface for bdrv_co_write_req_prepare() and
drop all write-notifier related logic from write-threshold.c.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
Message-Id: <20210506090621.11848-2-vsement...@virtuozzo.com>
Reviewed-by: Eric Blake 
Reviewed-by: Stefan Hajnoczi 
[mreitz: Adjusted comment as per Eric's suggestion]
Signed-off-by: Max Reitz 
---
 include/block/block_int.h   |  1 -
 include/block/write-threshold.h |  9 +
 block/io.c  |  5 ++-
 block/write-threshold.c | 70 +++--
 4 files changed, 27 insertions(+), 58 deletions(-)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index 731ffedb27..aff948fb63 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -959,7 +959,6 @@ struct BlockDriverState {
 
 /* threshold limit for writes, in bytes. "High water mark". */
 uint64_t write_threshold_offset;
-NotifierWithReturn write_threshold_notifier;
 
 /* Writing to the list requires the BQL _and_ the dirty_bitmap_mutex.
  * Reading from the list can be done with either the BQL or the
diff --git a/include/block/write-threshold.h b/include/block/write-threshold.h
index c646f267a4..848a5dde85 100644
--- a/include/block/write-threshold.h
+++ b/include/block/write-threshold.h
@@ -59,4 +59,13 @@ bool bdrv_write_threshold_is_set(const BlockDriverState *bs);
 uint64_t bdrv_write_threshold_exceeded(const BlockDriverState *bs,
const BdrvTrackedRequest *req);
 
+/*
+ * bdrv_write_threshold_check_write
+ *
+ * Check whether the specified request exceeds the write threshold.
+ * If so, send a corresponding event and disable write threshold checking.
+ */
+void bdrv_write_threshold_check_write(BlockDriverState *bs, int64_t offset,
+  int64_t bytes);
+
 #endif
diff --git a/block/io.c b/block/io.c
index 35b6c56efc..3520de51bb 100644
--- a/block/io.c
+++ b/block/io.c
@@ -30,6 +30,7 @@
 #include "block/blockjob_int.h"
 #include "block/block_int.h"
 #include "block/coroutines.h"
+#include "block/write-threshold.h"
 #include "qemu/cutils.h"
 #include "qapi/error.h"
 #include "qemu/error-report.h"
@@ -2008,8 +2009,8 @@ bdrv_co_write_req_prepare(BdrvChild *child, int64_t 
offset, int64_t bytes,
 } else {
 assert(child->perm & BLK_PERM_WRITE);
 }
-return notifier_with_return_list_notify(>before_write_notifiers,
-req);
+bdrv_write_threshold_check_write(bs, offset, bytes);
+return 0;
 case BDRV_TRACKED_TRUNCATE:
 assert(child->perm & BLK_PERM_RESIZE);
 return 0;
diff --git a/block/write-threshold.c b/block/write-threshold.c
index 85b78dc2a9..71df3c434f 100644
--- a/block/write-threshold.c
+++ b/block/write-threshold.c
@@ -29,14 +29,6 @@ bool bdrv_write_threshold_is_set(const BlockDriverState *bs)
 return bs->write_threshold_offset > 0;
 }
 
-static void write_threshold_disable(BlockDriverState *bs)
-{
-if (bdrv_write_threshold_is_set(bs)) {
-notifier_with_return_remove(>write_threshold_notifier);
-bs->write_threshold_offset = 0;
-}
-}
-
 uint64_t bdrv_write_threshold_exceeded(const BlockDriverState *bs,
const BdrvTrackedRequest *req)
 {
@@ -51,55 +43,9 @@ uint64_t bdrv_write_threshold_exceeded(const 
BlockDriverState *bs,
 return 0;
 }
 
-static int coroutine_fn before_write_notify(NotifierWithReturn *notifier,
-void *opaque)
-{
-BdrvTrackedRequest *req = opaque;
-BlockDriverState *bs = req->bs;
-uint64_t amount = 0;
-
-amount = bdrv_write_threshold_exceeded(bs, req);
-if (amount > 0) {
-qapi_event_send_block_write_threshold(
-bs->node_name,
-amount,
-bs->write_threshold_offset);
-
-/* autodisable to avoid flooding the monitor */
-write_threshold_disable(bs);
-}
-
-return 0; /* should always let other notifiers run */
-}
-
-static void write_threshold_register_notifier(BlockDriverState *bs)
-{
-bs->write_threshold_notifier.notify = before_write_notify;
-bdrv_add_before_write_notifier(bs, >write_threshold_notifier);
-}
-
-static void write_threshold_update(BlockDriverState *bs,
-   int64_t threshold_bytes)
-{
-bs->write_threshold_offset = threshold_bytes;
-}
-
 void bdrv_write_thres

[PULL 04/19] mirror: stop cancelling in-flight requests on non-force cancel in READY

2021-05-14 Thread Max Reitz

From: Vladimir Sementsov-Ogievskiy 

If mirror is READY than cancel operation is not discarding the whole
result of the operation, but instead it's a documented way get a
point-in-time snapshot of source disk.

So, we should not cancel any requests if mirror is READ and
force=false. Let's fix that case.

Note, that bug that we have before this commit is not critical, as the
only .bdrv_cancel_in_flight implementation is nbd_cancel_in_flight()
and it cancels only requests waiting for reconnection, so it should be
rare case.

Fixes: 521ff8b779b11c394dbdc43f02e158dd99df308a
Signed-off-by: Vladimir Sementsov-Ogievskiy 
Message-Id: <20210421075858.40197-1-vsement...@virtuozzo.com>
Signed-off-by: Max Reitz 
---
 include/block/block_int.h | 2 +-
 include/qemu/job.h| 2 +-
 block/backup.c| 2 +-
 block/mirror.c| 6 --
 job.c | 2 +-
 tests/qemu-iotests/264| 2 +-
 6 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index c823f5b1b3..731ffedb27 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -357,7 +357,7 @@ struct BlockDriver {
  * of in-flight requests, so don't waste the time if possible.
  *
  * One example usage is to avoid waiting for an nbd target node reconnect
- * timeout during job-cancel.
+ * timeout during job-cancel with force=true.
  */
 void (*bdrv_cancel_in_flight)(BlockDriverState *bs);
 
diff --git a/include/qemu/job.h b/include/qemu/job.h
index efc6fa7544..41162ed494 100644
--- a/include/qemu/job.h
+++ b/include/qemu/job.h
@@ -254,7 +254,7 @@ struct JobDriver {
 /**
  * If the callback is not NULL, it will be invoked in job_cancel_async
  */
-void (*cancel)(Job *job);
+void (*cancel)(Job *job, bool force);
 
 
 /** Called when the job is freed */
diff --git a/block/backup.c b/block/backup.c
index 6cf2f974aa..bd3614ce70 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -331,7 +331,7 @@ static void coroutine_fn backup_set_speed(BlockJob *job, 
int64_t speed)
 }
 }
 
-static void backup_cancel(Job *job)
+static void backup_cancel(Job *job, bool force)
 {
 BackupBlockJob *s = container_of(job, BackupBlockJob, common.job);
 
diff --git a/block/mirror.c b/block/mirror.c
index 840b8e8c15..019f6deaa5 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -1178,12 +1178,14 @@ static bool mirror_drained_poll(BlockJob *job)
 return !!s->in_flight;
 }
 
-static void mirror_cancel(Job *job)
+static void mirror_cancel(Job *job, bool force)
 {
 MirrorBlockJob *s = container_of(job, MirrorBlockJob, common.job);
 BlockDriverState *target = blk_bs(s->target);
 
-bdrv_cancel_in_flight(target);
+if (force || !job_is_ready(job)) {
+bdrv_cancel_in_flight(target);
+}
 }
 
 static const BlockJobDriver mirror_job_driver = {
diff --git a/job.c b/job.c
index 4aff13d95a..8775c1803b 100644
--- a/job.c
+++ b/job.c
@@ -716,7 +716,7 @@ static int job_finalize_single(Job *job)
 static void job_cancel_async(Job *job, bool force)
 {
 if (job->driver->cancel) {
-job->driver->cancel(job);
+job->driver->cancel(job, force);
 }
 if (job->user_paused) {
 /* Do not call job_enter here, the caller will handle it.  */
diff --git a/tests/qemu-iotests/264 b/tests/qemu-iotests/264
index 4f96825a22..bc431d1a19 100755
--- a/tests/qemu-iotests/264
+++ b/tests/qemu-iotests/264
@@ -95,7 +95,7 @@ class TestNbdReconnect(iotests.QMPTestCase):
 self.assert_qmp(result, 'return', {})
 
 def cancel_job(self):
-result = self.vm.qmp('block-job-cancel', device='drive0')
+result = self.vm.qmp('block-job-cancel', device='drive0', force=True)
 self.assert_qmp(result, 'return', {})
 
 start_t = time.time()
-- 
2.31.1

[PULL 01/19] iotests/231: Update expected deprecation message

2021-05-14 Thread Max Reitz

From: Connor Kuehl 

The deprecation message in the expected output has technically been
wrong since the wrong version of a patch was applied to it. Because of
this, the test fails. Correct the expected output so that it passes.

Signed-off-by: Connor Kuehl 
Reviewed-by: Max Reitz 
Reviewed-by: Stefano Garzarella 
Message-Id: <20210421212343.85524-2-cku...@redhat.com>
Signed-off-by: Max Reitz 
---
 tests/qemu-iotests/231.out | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/tests/qemu-iotests/231.out b/tests/qemu-iotests/231.out
index 579ba11c16..747dd221bb 100644
--- a/tests/qemu-iotests/231.out
+++ b/tests/qemu-iotests/231.out
@@ -1,9 +1,7 @@
 QA output created by 231
-qemu-img: RBD options encoded in the filename as keyvalue pairs is deprecated. 
 Future versions may cease to parse these options in the future.
+qemu-img: warning: RBD options encoded in the filename as keyvalue pairs is 
deprecated
 unable to get monitor info from DNS SRV with service name: ceph-mon
-no monitors specified to connect to.
 qemu-img: Could not open 
'json:{'file.driver':'rbd','file.filename':'rbd:rbd/bogus:conf=BOGUS_CONF'}': 
error connecting: No such file or directory
 unable to get monitor info from DNS SRV with service name: ceph-mon
-no monitors specified to connect to.
 qemu-img: Could not open 
'json:{'file.driver':'rbd','file.pool':'rbd','file.image':'bogus','file.conf':'BOGUS_CONF'}':
 error connecting: No such file or directory
 *** done
-- 
2.31.1

[PULL 05/19] qemu-iotests: do not buffer the test output

2021-05-14 Thread Max Reitz

From: Paolo Bonzini 

Instead of buffering the test output into a StringIO, patch it on
the fly by wrapping sys.stdout's write method.  This can be
done unconditionally, even if using -d, which makes execute_unittest
a bit simpler.

Signed-off-by: Paolo Bonzini 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
Tested-by: Emanuele Giuseppe Esposito 
Message-Id: <20210323181928.311862-2-pbonz...@redhat.com>
Message-Id: <20210503110110.476887-2-pbonz...@redhat.com>
Signed-off-by: Max Reitz 
---
 tests/qemu-iotests/240.out|  8 ++--
 tests/qemu-iotests/245.out|  8 ++--
 tests/qemu-iotests/295.out|  6 +--
 tests/qemu-iotests/296.out|  8 ++--
 tests/qemu-iotests/iotests.py | 70 ---
 5 files changed, 56 insertions(+), 44 deletions(-)

diff --git a/tests/qemu-iotests/240.out b/tests/qemu-iotests/240.out
index e0982831ae..89ed25e506 100644
--- a/tests/qemu-iotests/240.out
+++ b/tests/qemu-iotests/240.out
@@ -15,7 +15,7 @@
 {"return": {}}
 {"execute": "blockdev-del", "arguments": {"node-name": "hd0"}}
 {"return": {}}
-==Attach two SCSI disks using the same block device and the same iothread==
+.==Attach two SCSI disks using the same block device and the same iothread==
 {"execute": "blockdev-add", "arguments": {"driver": "null-co", "node-name": 
"hd0", "read-only": true, "read-zeroes": true}}
 {"return": {}}
 {"execute": "object-add", "arguments": {"id": "iothread0", "qom-type": 
"iothread"}}
@@ -32,7 +32,7 @@
 {"return": {}}
 {"execute": "blockdev-del", "arguments": {"node-name": "hd0"}}
 {"return": {}}
-==Attach two SCSI disks using the same block device but different iothreads==
+.==Attach two SCSI disks using the same block device but different iothreads==
 {"execute": "blockdev-add", "arguments": {"driver": "null-co", "node-name": 
"hd0", "read-only": true, "read-zeroes": true}}
 {"return": {}}
 {"execute": "object-add", "arguments": {"id": "iothread0", "qom-type": 
"iothread"}}
@@ -55,7 +55,7 @@
 {"return": {}}
 {"execute": "blockdev-del", "arguments": {"node-name": "hd0"}}
 {"return": {}}
-==Attach a SCSI disks using the same block device as a NBD server==
+.==Attach a SCSI disks using the same block device as a NBD server==
 {"execute": "blockdev-add", "arguments": {"driver": "null-co", "node-name": 
"hd0", "read-only": true, "read-zeroes": true}}
 {"return": {}}
 {"execute": "nbd-server-start", "arguments": {"addr": {"data": {"path": 
"SOCK_DIR/PID-nbd.sock"}, "type": "unix"}}}
@@ -68,7 +68,7 @@
 {"return": {}}
 {"execute": "device_add", "arguments": {"drive": "hd0", "driver": "scsi-hd", 
"id": "scsi-hd0"}}
 {"return": {}}
-
+.
 --
 Ran 4 tests
 
diff --git a/tests/qemu-iotests/245.out b/tests/qemu-iotests/245.out
index 4b33dcaf5c..99c12f4f98 100644
--- a/tests/qemu-iotests/245.out
+++ b/tests/qemu-iotests/245.out
@@ -1,16 +1,16 @@
-{"execute": "job-finalize", "arguments": {"id": "commit0"}}
+..{"execute": "job-finalize", "arguments": {"id": "commit0"}}
 {"return": {}}
 {"data": {"id": "commit0", "type": "commit"}, "event": "BLOCK_JOB_PENDING", 
"timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
 {"data": {"device": "commit0", "len": 3145728, "offset": 3145728, "speed": 0, 
"type": "commit"}, "event": "BLOCK_JOB_COMPLETED", "timestamp": 
{"microseconds": "USECS", "seconds": "SECS"}}
-{"execute": "job-finalize", "arguments": {"id": "stream0"}}
+...{"execute": "job-finalize", "arguments": {"id": "stream0"}}
 {"return": {}}
 {"data": {"id": "stream0", "type": "stream"}, "event": "BLOCK_JOB_PENDI

[PULL 02/19] block/rbd: Add an escape-aware strchr helper

2021-05-14 Thread Max Reitz

From: Connor Kuehl 

Sometimes the parser needs to further split a token it has collected
from the token input stream. Right now, it does a cursory check to see
if the relevant characters appear in the token to determine if it should
break it down further.

However, qemu_rbd_next_tok() will escape characters as it removes tokens
from the token stream and plain strchr() won't. This can make the
initial strchr() check slightly misleading since it implies
qemu_rbd_next_tok() will find the token and split on it, except the
reality is that qemu_rbd_next_tok() will pass over it if it is escaped.

Use a custom strchr to avoid mixing escaped and unescaped string
operations. Furthermore, this code is identical to how
qemu_rbd_next_tok() seeks its next token, so incorporate this custom
strchr into the body of that function to reduce duplication.

Reported-by: Han Han 
Fixes: https://bugzilla.redhat.com/1873913
Signed-off-by: Connor Kuehl 
Message-Id: <20210421212343.85524-3-cku...@redhat.com>
Reviewed-by: Stefano Garzarella 
Signed-off-by: Max Reitz 
---
 block/rbd.c| 32 +---
 tests/qemu-iotests/231 |  4 
 tests/qemu-iotests/231.out |  3 +++
 3 files changed, 28 insertions(+), 11 deletions(-)

diff --git a/block/rbd.c b/block/rbd.c
index f098a89c7b..26f64cce7c 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -113,21 +113,31 @@ static int qemu_rbd_connect(rados_t *cluster, 
rados_ioctx_t *io_ctx,
 const char *keypairs, const char *secretid,
 Error **errp);
 
+static char *qemu_rbd_strchr(char *src, char delim)
+{
+char *p;
+
+for (p = src; *p; ++p) {
+if (*p == delim) {
+return p;
+}
+if (*p == '\\' && p[1] != '\0') {
+++p;
+}
+}
+
+return NULL;
+}
+
+
 static char *qemu_rbd_next_tok(char *src, char delim, char **p)
 {
 char *end;
 
 *p = NULL;
 
-for (end = src; *end; ++end) {
-if (*end == delim) {
-break;
-}
-if (*end == '\\' && end[1] != '\0') {
-end++;
-}
-}
-if (*end == delim) {
+end = qemu_rbd_strchr(src, delim);
+if (end) {
 *p = end + 1;
 *end = '\0';
 }
@@ -171,7 +181,7 @@ static void qemu_rbd_parse_filename(const char *filename, 
QDict *options,
 qemu_rbd_unescape(found_str);
 qdict_put_str(options, "pool", found_str);
 
-if (strchr(p, '@')) {
+if (qemu_rbd_strchr(p, '@')) {
 image_name = qemu_rbd_next_tok(p, '@', );
 
 found_str = qemu_rbd_next_tok(p, ':', );
@@ -181,7 +191,7 @@ static void qemu_rbd_parse_filename(const char *filename, 
QDict *options,
 image_name = qemu_rbd_next_tok(p, ':', );
 }
 /* Check for namespace in the image_name */
-if (strchr(image_name, '/')) {
+if (qemu_rbd_strchr(image_name, '/')) {
 found_str = qemu_rbd_next_tok(image_name, '/', _name);
 qemu_rbd_unescape(found_str);
 qdict_put_str(options, "namespace", found_str);
diff --git a/tests/qemu-iotests/231 b/tests/qemu-iotests/231
index 0f66d0ca36..8e6c6447c1 100755
--- a/tests/qemu-iotests/231
+++ b/tests/qemu-iotests/231
@@ -55,6 +55,10 @@ _filter_conf()
 $QEMU_IMG info 
"json:{'file.driver':'rbd','file.filename':'rbd:rbd/bogus:conf=${BOGUS_CONF}'}" 
2>&1 | _filter_conf
 $QEMU_IMG info 
"json:{'file.driver':'rbd','file.pool':'rbd','file.image':'bogus','file.conf':'${BOGUS_CONF}'}"
 2>&1 | _filter_conf
 
+# Regression test: the qemu-img invocation is expected to fail, but it should
+# not seg fault the parser.
+$QEMU_IMG create "rbd:rbd/aa\/bb:conf=${BOGUS_CONF}" 1M 2>&1 | _filter_conf
+
 # success, all done
 echo "*** done"
 rm -f $seq.full
diff --git a/tests/qemu-iotests/231.out b/tests/qemu-iotests/231.out
index 747dd221bb..a785a6e859 100644
--- a/tests/qemu-iotests/231.out
+++ b/tests/qemu-iotests/231.out
@@ -4,4 +4,7 @@ unable to get monitor info from DNS SRV with service name: 
ceph-mon
 qemu-img: Could not open 
'json:{'file.driver':'rbd','file.filename':'rbd:rbd/bogus:conf=BOGUS_CONF'}': 
error connecting: No such file or directory
 unable to get monitor info from DNS SRV with service name: ceph-mon
 qemu-img: Could not open 
'json:{'file.driver':'rbd','file.pool':'rbd','file.image':'bogus','file.conf':'BOGUS_CONF'}':
 error connecting: No such file or directory
+Formatting 'rbd:rbd/aa\/bb:conf=BOGUS_CONF', fmt=raw size=1048576
+unable to get monitor info from DNS SRV with service name: ceph-mon
+qemu-img: rbd:rbd/aa\/bb:conf=BOGUS_CONF: error connecting: No such file or 
directory
 *** done
-- 
2.31.1

[PULL 00/19] Block patches

2021-05-14 Thread Max Reitz

The following changes since commit 96662996eda78c48aa4e76d8615c7eb72d80:

  Merge remote-tracking branch 'remotes/dgilbert/tags/pull-migration-20210513a' 
into staging (2021-05-14 12:03:47 +0100)

are available in the Git repository at:

  https://github.com/XanClic/qemu.git tags/pull-block-2021-05-14

for you to fetch changes up to c61ebf362d0abf288ce266845519d5a550a1d89f:

  write-threshold: deal with includes (2021-05-14 16:14:10 +0200)


Block patches:
- drop block/io write notifiers
- qemu-iotests enhancements to make debugging easier
- rbd parsing fix
- HMP qemu-io fix (for iothreads)
- mirror job cancel relaxation (do not cancel in-flight requests when a
  READY mirror job is canceled with force=false)
- document qcow2's data_file and data_file_raw features
- fix iotest 297 for pylint 2.8
- block/copy-on-read refactoring


Connor Kuehl (3):
  iotests/231: Update expected deprecation message
  block/rbd: Add an escape-aware strchr helper
  Document qemu-img options data_file and data_file_raw

Emanuele Giuseppe Esposito (1):
  qemu-iotests: fix pylint 2.8 consider-using-with error

Paolo Bonzini (5):
  qemu-iotests: do not buffer the test output
  qemu-iotests: allow passing unittest.main arguments to the test
scripts
  qemu-iotests: move command line and environment handling from
TestRunner to TestEnv
  qemu-iotests: let "check" spawn an arbitrary test command
  qemu-iotests: fix case of SOCK_DIR already in the environment

Vladimir Sementsov-Ogievskiy (10):
  monitor: hmp_qemu_io: acquire aio contex, fix crash
  mirror: stop cancelling in-flight requests on non-force cancel in
READY
  block/copy-on-read: use bdrv_drop_filter() and drop s->active
  block/write-threshold: don't use write notifiers
  block: drop write notifiers
  test-write-threshold: rewrite test_threshold_(not_)trigger tests
  block/write-threshold: drop extra APIs
  test-write-threshold: drop extra tests
  test-write-threshold: drop extra TestStruct structure
  write-threshold: deal with includes

 docs/tools/qemu-img.rst   |  31 +++
 include/block/block_int.h |  15 +---
 include/block/write-threshold.h   |  27 ++
 include/qemu/job.h|   2 +-
 block.c   |   1 -
 block/backup.c|   2 +-
 block/copy-on-read.c  |  33 +--
 block/io.c|  11 +--
 block/mirror.c|   6 +-
 block/monitor/block-hmp-cmds.c|  31 ---
 block/rbd.c   |  32 ---
 block/write-threshold.c   |  91 ---
 job.c |   2 +-
 qemu-io-cmds.c|   8 +-
 qemu-io.c |  17 +++-
 tests/unit/test-write-threshold.c |  90 ++-
 tests/qemu-iotests/231|   4 +
 tests/qemu-iotests/231.out|   7 +-
 tests/qemu-iotests/240.out|   8 +-
 tests/qemu-iotests/245.out|   8 +-
 tests/qemu-iotests/264|   2 +-
 tests/qemu-iotests/295.out|   6 +-
 tests/qemu-iotests/296.out|   8 +-
 tests/qemu-iotests/check  |  19 +++-
 tests/qemu-iotests/iotests.py | 145 +-
 tests/qemu-iotests/pylintrc   |   3 +
 tests/qemu-iotests/testenv.py |  22 -
 tests/qemu-iotests/testrunner.py  |  37 +++-
 28 files changed, 289 insertions(+), 379 deletions(-)

-- 
2.31.1

[PULL 19/19] write-threshold: deal with includes

2021-05-14 Thread Max Reitz

From: Vladimir Sementsov-Ogievskiy 

"qemu/typedefs.h" is enough for include/block/write-threshold.h header
with forward declaration of BlockDriverState. Also drop extra includes
from block/write-threshold.c and tests/unit/test-write-threshold.c

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Message-Id: <20210506090621.11848-9-vsement...@virtuozzo.com>
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Max Reitz 
---
 include/block/write-threshold.h   | 2 +-
 block/write-threshold.c   | 2 --
 tests/unit/test-write-threshold.c | 1 -
 3 files changed, 1 insertion(+), 4 deletions(-)

diff --git a/include/block/write-threshold.h b/include/block/write-threshold.h
index a03ee1cacd..f50f923e7e 100644
--- a/include/block/write-threshold.h
+++ b/include/block/write-threshold.h
@@ -13,7 +13,7 @@
 #ifndef BLOCK_WRITE_THRESHOLD_H
 #define BLOCK_WRITE_THRESHOLD_H
 
-#include "block/block_int.h"
+#include "qemu/typedefs.h"
 
 /*
  * bdrv_write_threshold_set:
diff --git a/block/write-threshold.c b/block/write-threshold.c
index 65a6acd142..35cafbc22d 100644
--- a/block/write-threshold.c
+++ b/block/write-threshold.c
@@ -12,9 +12,7 @@
 
 #include "qemu/osdep.h"
 #include "block/block_int.h"
-#include "qemu/coroutine.h"
 #include "block/write-threshold.h"
-#include "qemu/notify.h"
 #include "qapi/error.h"
 #include "qapi/qapi-commands-block-core.h"
 #include "qapi/qapi-events-block-core.h"
diff --git a/tests/unit/test-write-threshold.c 
b/tests/unit/test-write-threshold.c
index 49b1ef7a20..0158e4637a 100644
--- a/tests/unit/test-write-threshold.c
+++ b/tests/unit/test-write-threshold.c
@@ -7,7 +7,6 @@
  */
 
 #include "qemu/osdep.h"
-#include "qapi/error.h"
 #include "block/block_int.h"
 #include "block/write-threshold.h"
 
-- 
2.31.1

[PULL 18/19] test-write-threshold: drop extra TestStruct structure

2021-05-14 Thread Max Reitz

From: Vladimir Sementsov-Ogievskiy 

We don't need this extra logic: it doesn't make code simpler.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
Message-Id: <20210506090621.11848-8-vsement...@virtuozzo.com>
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Max Reitz 
---
 tests/unit/test-write-threshold.c | 20 +++-
 1 file changed, 3 insertions(+), 17 deletions(-)

diff --git a/tests/unit/test-write-threshold.c 
b/tests/unit/test-write-threshold.c
index 9e9986aefc..49b1ef7a20 100644
--- a/tests/unit/test-write-threshold.c
+++ b/tests/unit/test-write-threshold.c
@@ -37,26 +37,12 @@ static void test_threshold_trigger(void)
 g_assert_cmpuint(bdrv_write_threshold_get(), ==, 0);
 }
 
-typedef struct TestStruct {
-const char *name;
-void (*func)(void);
-} TestStruct;
-
 
 int main(int argc, char **argv)
 {
-size_t i;
-TestStruct tests[] = {
-{ "/write-threshold/not-trigger",
-  test_threshold_not_trigger },
-{ "/write-threshold/trigger",
-  test_threshold_trigger },
-{ NULL, NULL }
-};
-
 g_test_init(, , NULL);
-for (i = 0; tests[i].name != NULL; i++) {
-g_test_add_func(tests[i].name, tests[i].func);
-}
+g_test_add_func("/write-threshold/not-trigger", 
test_threshold_not_trigger);
+g_test_add_func("/write-threshold/trigger", test_threshold_trigger);
+
 return g_test_run();
 }
-- 
2.31.1

[PULL 11/19] block/copy-on-read: use bdrv_drop_filter() and drop s->active

2021-05-14 Thread Max Reitz

From: Vladimir Sementsov-Ogievskiy 

Now, after huge update of block graph permission update algorithm, we
don't need this workaround with active state of the filter. Drop it and
use new smart bdrv_drop_filter() function.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Message-Id: <20210506194143.394141-1-vsement...@virtuozzo.com>
Signed-off-by: Max Reitz 
---
 block/copy-on-read.c | 33 +
 1 file changed, 1 insertion(+), 32 deletions(-)

diff --git a/block/copy-on-read.c b/block/copy-on-read.c
index 9cad9e1b8c..c428682272 100644
--- a/block/copy-on-read.c
+++ b/block/copy-on-read.c
@@ -29,7 +29,6 @@
 
 
 typedef struct BDRVStateCOR {
-bool active;
 BlockDriverState *bottom_bs;
 bool chain_frozen;
 } BDRVStateCOR;
@@ -89,7 +88,6 @@ static int cor_open(BlockDriverState *bs, QDict *options, int 
flags,
  */
 bdrv_ref(bottom_bs);
 }
-state->active = true;
 state->bottom_bs = bottom_bs;
 
 /*
@@ -112,17 +110,6 @@ static void cor_child_perm(BlockDriverState *bs, BdrvChild 
*c,
uint64_t perm, uint64_t shared,
uint64_t *nperm, uint64_t *nshared)
 {
-BDRVStateCOR *s = bs->opaque;
-
-if (!s->active) {
-/*
- * While the filter is being removed
- */
-*nperm = 0;
-*nshared = BLK_PERM_ALL;
-return;
-}
-
 *nperm = perm & PERM_PASSTHROUGH;
 *nshared = (shared & PERM_PASSTHROUGH) | PERM_UNCHANGED;
 
@@ -280,32 +267,14 @@ static BlockDriver bdrv_copy_on_read = {
 
 void bdrv_cor_filter_drop(BlockDriverState *cor_filter_bs)
 {
-BdrvChild *child;
-BlockDriverState *bs;
 BDRVStateCOR *s = cor_filter_bs->opaque;
 
-child = bdrv_filter_child(cor_filter_bs);
-if (!child) {
-return;
-}
-bs = child->bs;
-
-/* Retain the BDS until we complete the graph change. */
-bdrv_ref(bs);
-/* Hold a guest back from writing while permissions are being reset. */
-bdrv_drained_begin(bs);
-/* Drop permissions before the graph change. */
-s->active = false;
 /* unfreeze, as otherwise bdrv_replace_node() will fail */
 if (s->chain_frozen) {
 s->chain_frozen = false;
 bdrv_unfreeze_backing_chain(cor_filter_bs, s->bottom_bs);
 }
-bdrv_child_refresh_perms(cor_filter_bs, child, _abort);
-bdrv_replace_node(cor_filter_bs, bs, _abort);
-
-bdrv_drained_end(bs);
-bdrv_unref(bs);
+bdrv_drop_filter(cor_filter_bs, _abort);
 bdrv_unref(cor_filter_bs);
 }
 
-- 
2.31.1

[PULL 06/19] qemu-iotests: allow passing unittest.main arguments to the test scripts

2021-05-14 Thread Max Reitz

From: Paolo Bonzini 

Python test scripts that use unittest consist of multiple tests.
unittest.main allows selecting which tests to run, but currently this
is not possible because the iotests wrapper ignores sys.argv.

unittest.main command line options also allow the user to pick the
desired options for verbosity, failfast mode, etc.  While "-d" is
currently translated to "-v", it also enables extra debug output,
and other options are not available at all.

These command line options only work if the unittest.main testRunner
argument is a type, rather than a TestRunner instance.  Therefore, pass
the class name and "verbosity" argument to unittest.main, and adjust for
the different default warnings between TextTestRunner and unittest.main.

Signed-off-by: Paolo Bonzini 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
Tested-by: Emanuele Giuseppe Esposito 
Message-Id: <20210323181928.311862-3-pbonz...@redhat.com>
Message-Id: <20210503110110.476887-3-pbonz...@redhat.com>
Signed-off-by: Max Reitz 
---
 tests/qemu-iotests/iotests.py | 14 +-
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 55a017577f..5ead94229f 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -1308,12 +1308,16 @@ def __init__(self, stream: Optional[TextIO] = None,
  resultclass=resultclass,
  **kwargs)
 
-def execute_unittest(debug=False):
+def execute_unittest(argv: List[str], debug: bool = False) -> None:
 """Executes unittests within the calling module."""
 
-verbosity = 2 if debug else 1
-runner = ReproducibleTestRunner(verbosity=verbosity)
-unittest.main(testRunner=runner)
+# Some tests have warnings, especially ResourceWarnings for unclosed
+# files and sockets.  Ignore them for now to ensure reproducibility of
+# the test output.
+unittest.main(argv=argv,
+  testRunner=ReproducibleTestRunner,
+  verbosity=2 if debug else 1,
+  warnings=None if sys.warnoptions else 'ignore')
 
 def execute_setup_common(supported_fmts: Sequence[str] = (),
  supported_platforms: Sequence[str] = (),
@@ -1350,7 +1354,7 @@ def execute_test(*args, test_function=None, **kwargs):
 
 debug = execute_setup_common(*args, **kwargs)
 if not test_function:
-execute_unittest(debug)
+execute_unittest(sys.argv, debug)
 else:
 test_function()
 
-- 
2.31.1

[PATCH v3 0/4] iotests/297: Cover tests/

2021-05-14 Thread Max Reitz

v1: https://lists.nongnu.org/archive/html/qemu-block/2021-03/msg01471.html
v2: https://lists.nongnu.org/archive/html/qemu-block/2021-05/msg00492.html


Hi,

When reviewing Vladimir’s new addition to tests/, I noticed that 297 so
far does not cover named tests.  That isn’t so good.

This series makes it cover them, and because tests/ is rather sparse at
this point, I decided to also fix up the two tests in there that don’t
pass pylint’s scrutiny yet.  I think it would be nice if we could keep
all of tests/ clean.


v3:
- Fixed patch 3: Turns out replacing `lambda self: mc(self)` by just
  `mc` (as pylint suggests) breaks the test.  So leave it as it is and
  instead disable the warning locally.


git-backport-diff against v3:

Key:
[] : patches are identical
[] : number of functional differences between upstream/downstream patch
[down] : patch is downstream-only
The flags [FC] indicate (F)unctional and (C)ontextual differences, respectively

001/4:[] [--] 'iotests/297: Drop 169 and 199 from the skip list'
002/4:[] [--] 'migrate-bitmaps-postcopy-test: Fix pylint warnings'
003/4:[0005] [FC] 'migrate-bitmaps-test: Fix pylint warnings'
004/4:[] [--] 'iotests/297: Cover tests/'


Max Reitz (4):
  iotests/297: Drop 169 and 199 from the skip list
  migrate-bitmaps-postcopy-test: Fix pylint warnings
  migrate-bitmaps-test: Fix pylint warnings
  iotests/297: Cover tests/

 tests/qemu-iotests/297|  7 ++--
 .../tests/migrate-bitmaps-postcopy-test   | 13 +++---
 tests/qemu-iotests/tests/migrate-bitmaps-test | 41 +++
 3 files changed, 34 insertions(+), 27 deletions(-)

-- 
2.31.1

[PATCH v3 4/4] iotests/297: Cover tests/

2021-05-14 Thread Max Reitz

297 so far does not check the named tests, which reside in the tests/
directory (i.e. full path tests/qemu-iotests/tests).  Fix it.

Thanks to the previous two commits, all named tests pass its scrutiny,
so we do not have to add anything to SKIP_FILES.

Signed-off-by: Max Reitz 
Reviewed-by: Willian Rampazzo 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
---
 tests/qemu-iotests/297 | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/297 b/tests/qemu-iotests/297
index e3244d40a0..ce0e82e279 100755
--- a/tests/qemu-iotests/297
+++ b/tests/qemu-iotests/297
@@ -55,8 +55,9 @@ def is_python_file(filename):
 
 
 def run_linters():
-files = [filename for filename in (set(os.listdir('.')) - set(SKIP_FILES))
- if is_python_file(filename)]
+named_tests = [f'tests/{entry}' for entry in os.listdir('tests')]
+check_tests = set(os.listdir('.') + named_tests) - set(SKIP_FILES)
+files = [filename for filename in check_tests if is_python_file(filename)]
 
 iotests.logger.debug('Files to be checked:')
 iotests.logger.debug(', '.join(sorted(files)))
-- 
2.31.1

[PATCH v3 3/4] migrate-bitmaps-test: Fix pylint warnings

2021-05-14 Thread Max Reitz

There are a couple of things pylint takes issue with:
- The "time" import is unused
- The import order (iotests should come last)
- get_bitmap_hash() doesn't use @self and so should be a function
- Semicolons at the end of some lines
- Parentheses after "if"
- Some lines are too long (80 characters instead of 79)
- inject_test_case()'s @name parameter shadows a top-level @name
  variable
- "lambda self: mc(self)" were equivalent to just "mc", but in
  inject_test_case(), it is not equivalent, so add a comment and disable
  the warning locally
- Always put two empty lines after a function
- f'exec: cat > /dev/null' does not need to be an f-string

Fix them.

Signed-off-by: Max Reitz 
---
 tests/qemu-iotests/tests/migrate-bitmaps-test | 41 +++
 1 file changed, 23 insertions(+), 18 deletions(-)

diff --git a/tests/qemu-iotests/tests/migrate-bitmaps-test 
b/tests/qemu-iotests/tests/migrate-bitmaps-test
index a5c7bc83e0..31d3255943 100755
--- a/tests/qemu-iotests/tests/migrate-bitmaps-test
+++ b/tests/qemu-iotests/tests/migrate-bitmaps-test
@@ -20,11 +20,10 @@
 #
 
 import os
-import iotests
-import time
 import itertools
 import operator
 import re
+import iotests
 from iotests import qemu_img, qemu_img_create, Timeout
 
 
@@ -37,6 +36,12 @@ mig_cmd = 'exec: cat > ' + mig_file
 incoming_cmd = 'exec: cat ' + mig_file
 
 
+def get_bitmap_hash(vm):
+result = vm.qmp('x-debug-block-dirty-bitmap-sha256',
+node='drive0', name='bitmap0')
+return result['return']['sha256']
+
+
 class TestDirtyBitmapMigration(iotests.QMPTestCase):
 def tearDown(self):
 self.vm_a.shutdown()
@@ -62,21 +67,16 @@ class TestDirtyBitmapMigration(iotests.QMPTestCase):
 params['persistent'] = True
 
 result = vm.qmp('block-dirty-bitmap-add', **params)
-self.assert_qmp(result, 'return', {});
-
-def get_bitmap_hash(self, vm):
-result = vm.qmp('x-debug-block-dirty-bitmap-sha256',
-node='drive0', name='bitmap0')
-return result['return']['sha256']
+self.assert_qmp(result, 'return', {})
 
 def check_bitmap(self, vm, sha256):
 result = vm.qmp('x-debug-block-dirty-bitmap-sha256',
 node='drive0', name='bitmap0')
 if sha256:
-self.assert_qmp(result, 'return/sha256', sha256);
+self.assert_qmp(result, 'return/sha256', sha256)
 else:
 self.assert_qmp(result, 'error/desc',
-"Dirty bitmap 'bitmap0' not found");
+"Dirty bitmap 'bitmap0' not found")
 
 def do_test_migration_resume_source(self, persistent, migrate_bitmaps):
 granularity = 512
@@ -97,7 +97,7 @@ class TestDirtyBitmapMigration(iotests.QMPTestCase):
 self.add_bitmap(self.vm_a, granularity, persistent)
 for r in regions:
 self.vm_a.hmp_qemu_io('drive0', 'write %d %d' % r)
-sha256 = self.get_bitmap_hash(self.vm_a)
+sha256 = get_bitmap_hash(self.vm_a)
 
 result = self.vm_a.qmp('migrate', uri=mig_cmd)
 while True:
@@ -106,7 +106,7 @@ class TestDirtyBitmapMigration(iotests.QMPTestCase):
 break
 while True:
 result = self.vm_a.qmp('query-status')
-if (result['return']['status'] == 'postmigrate'):
+if result['return']['status'] == 'postmigrate':
 break
 
 # test that bitmap is still here
@@ -164,7 +164,7 @@ class TestDirtyBitmapMigration(iotests.QMPTestCase):
 self.add_bitmap(self.vm_a, granularity, persistent)
 for r in regions:
 self.vm_a.hmp_qemu_io('drive0', 'write %d %d' % r)
-sha256 = self.get_bitmap_hash(self.vm_a)
+sha256 = get_bitmap_hash(self.vm_a)
 
 if pre_shutdown:
 self.vm_a.shutdown()
@@ -214,16 +214,20 @@ class TestDirtyBitmapMigration(iotests.QMPTestCase):
 self.check_bitmap(self.vm_b, sha256 if persistent else False)
 
 
-def inject_test_case(klass, name, method, *args, **kwargs):
+def inject_test_case(klass, suffix, method, *args, **kwargs):
 mc = operator.methodcaller(method, *args, **kwargs)
-setattr(klass, 'test_' + method + name, lambda self: mc(self))
+# The lambda is required to enforce the `self` parameter.  Without it,
+# `mc` would be called without any arguments, and then complain.
+# pylint: disable=unnecessary-lambda
+setattr(klass, 'test_' + method + suffix, lambda self: mc(self))
+
 
 for cmb in list(itertools.product((True, False), repeat=5)):
 name = ('_' if cmb[0] else '_not_') + 'persistent_'
 name += ('_' if cmb[1] else '_not_') + 'migbitmap_'
 name += '_online' if cmb[2] else '_offline'
 name += '_shared' if cmb[3] else '_nonshared'
-if (cmb[4]):
+if cmb[4]:
 name += '__pre_shutdown'
 
 inject_test_case(TestDirtyBit

[PATCH v3 1/4] iotests/297: Drop 169 and 199 from the skip list

2021-05-14 Thread Max Reitz

169 and 199 have been renamed and moved to tests/ (commit a44be0334be:
"iotests: rename and move 169 and 199 tests"), so we can drop them from
the skip list.

Signed-off-by: Max Reitz 
Reviewed-by: Willian Rampazzo 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
---
 tests/qemu-iotests/297 | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/297 b/tests/qemu-iotests/297
index a37910b42d..e3244d40a0 100755
--- a/tests/qemu-iotests/297
+++ b/tests/qemu-iotests/297
@@ -29,7 +29,7 @@ import iotests
 SKIP_FILES = (
 '030', '040', '041', '044', '045', '055', '056', '057', '065', '093',
 '096', '118', '124', '132', '136', '139', '147', '148', '149',
-'151', '152', '155', '163', '165', '169', '194', '196', '199', '202',
+'151', '152', '155', '163', '165', '194', '196', '202',
 '203', '205', '206', '207', '208', '210', '211', '212', '213', '216',
 '218', '219', '222', '224', '228', '234', '235', '236', '237', '238',
 '240', '242', '245', '246', '248', '255', '256', '257', '258', '260',
-- 
2.31.1

[PATCH v3 2/4] migrate-bitmaps-postcopy-test: Fix pylint warnings

2021-05-14 Thread Max Reitz

pylint complains that discards1_sha256 and all_discards_sha256 are first
set in non-__init__ methods.

These variables are not really class-variables anyway, so let them
instead be returned by start_postcopy(), thus silencing pylint.

Suggested-by: Vladimir Sementsov-Ogievskiy 
Signed-off-by: Max Reitz 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
---
 .../tests/migrate-bitmaps-postcopy-test | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/tests/qemu-iotests/tests/migrate-bitmaps-postcopy-test 
b/tests/qemu-iotests/tests/migrate-bitmaps-postcopy-test
index 584062b412..00ebb5c251 100755
--- a/tests/qemu-iotests/tests/migrate-bitmaps-postcopy-test
+++ b/tests/qemu-iotests/tests/migrate-bitmaps-postcopy-test
@@ -132,10 +132,10 @@ class 
TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 
 result = self.vm_a.qmp('x-debug-block-dirty-bitmap-sha256',
node='drive0', name='bitmap0')
-self.discards1_sha256 = result['return']['sha256']
+discards1_sha256 = result['return']['sha256']
 
 # Check, that updating the bitmap by discards works
-assert self.discards1_sha256 != empty_sha256
+assert discards1_sha256 != empty_sha256
 
 # We want to calculate resulting sha256. Do it in bitmap0, so, disable
 # other bitmaps
@@ -148,7 +148,7 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 
 result = self.vm_a.qmp('x-debug-block-dirty-bitmap-sha256',
node='drive0', name='bitmap0')
-self.all_discards_sha256 = result['return']['sha256']
+all_discards_sha256 = result['return']['sha256']
 
 # Now, enable some bitmaps, to be updated during migration
 for i in range(2, nb_bitmaps, 2):
@@ -173,10 +173,11 @@ class 
TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 
 event_resume = self.vm_b.event_wait('RESUME')
 self.vm_b_events.append(event_resume)
-return event_resume
+return (event_resume, discards1_sha256, all_discards_sha256)
 
 def test_postcopy_success(self):
-event_resume = self.start_postcopy()
+event_resume, discards1_sha256, all_discards_sha256 = \
+self.start_postcopy()
 
 # enabled bitmaps should be updated
 apply_discards(self.vm_b, discards2)
@@ -217,7 +218,7 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 for i in range(0, nb_bitmaps, 5):
 result = self.vm_b.qmp('x-debug-block-dirty-bitmap-sha256',
node='drive0', name='bitmap{}'.format(i))
-sha = self.discards1_sha256 if i % 2 else self.all_discards_sha256
+sha = discards1_sha256 if i % 2 else all_discards_sha256
 self.assert_qmp(result, 'return/sha256', sha)
 
 def test_early_shutdown_destination(self):
-- 
2.31.1

Re: [PATCH v2 0/4] iotests/297: Cover tests/

2021-05-14 Thread Max Reitz


On 14.05.21 13:02, Max Reitz wrote:

On 12.05.21 19:43, Max Reitz wrote:
v1: 
https://lists.nongnu.org/archive/html/qemu-block/2021-03/msg01471.html



Hi,

When reviewing Vladimir’s new addition to tests/, I noticed that 297 so
far does not cover named tests.  That isn’t so good.

This series makes it cover them, and because tests/ is rather sparse at
this point, I decided to also fix up the two tests in there that don’t
pass pylint’s scrutiny yet.  I think it would be nice if we could keep
all of tests/ clean.


v2:
- Changed patch 2 as per Vladimir’s suggestion
   (i.e. don’t let discards1_sha256 and all_discards_sha256 be class
   variables at all)


Thanks for the review, applied to my block branch:

https://github.com/XanClic/qemu/commits/block


...and dropping again, patch 3 embarrassingly breaks 
migrate-bitmaps-test.  The problem seems to be that contrastingly to 
pylint’s opinion, the `lambda self: mc(self)` is necessary, it can’t be 
contracted to just `mc`.  I suspect that `mc` (returned by 
`methodcaller`) has a variable argument list, and so without the lambda, 
`setattr` adds it as a argument-less function, so when it is called, it 
doesn’t receive the `self` parameter.  (It complains that it expected 1 
argument, but got 0.)


So we need the lambda to enforce that the `self` parameter is passed.

Max

Re: [PATCH v2] qemu-iotests: fix pylint 2.8 consider-using-with error

2021-05-14 Thread Max Reitz


On 12.05.21 19:04, Max Reitz wrote:

On 10.05.21 21:04, Emanuele Giuseppe Esposito wrote:

pylint 2.8 introduces consider-using-with error, suggesting
to use the 'with' block statement when possible.

Modify all subprocess.Popen call to use the 'with' statement,
except the one in __init__ of QemuIoInteractive class, since
it is assigned to a class field and used in other methods.

Signed-off-by: Emanuele Giuseppe Esposito 
---
v1 -> v2:
* fix indentation [Max]
* explain why we disabled the check in QemuIoInteractive's __init__ [Max]


Thanks!

Applied to my block branch:

https://github.com/XanClic/qemu/commits/block


I’ve just seen that the “# pylint: disable=consider-using-with” line 
causes a warning in pylint versions that don’t know that warning…


I’d like to squash this in:


diff --git a/tests/qemu-iotests/pylintrc b/tests/qemu-iotests/pylintrc
index 7a6c0a9474..f2c0b522ac 100644
--- a/tests/qemu-iotests/pylintrc
+++ b/tests/qemu-iotests/pylintrc
@@ -19,6 +19,9 @@ disable=invalid-name,
 too-many-public-methods,
 # pylint warns about Optional[] etc. as unsubscriptable in 3.9
 unsubscriptable-object,
+# Sometimes we need to disable a newly introduced pylint warning.
+# Doing so should not produce a warning in older versions of pylint.
+bad-option-value,
 # These are temporary, and should be removed:
 missing-docstring,
 too-many-return-statements,


Would that be OK for you?

Max

Re: [PATCH v2 0/4] iotests/297: Cover tests/

2021-05-14 Thread Max Reitz


On 12.05.21 19:43, Max Reitz wrote:

v1: https://lists.nongnu.org/archive/html/qemu-block/2021-03/msg01471.html


Hi,

When reviewing Vladimir’s new addition to tests/, I noticed that 297 so
far does not cover named tests.  That isn’t so good.

This series makes it cover them, and because tests/ is rather sparse at
this point, I decided to also fix up the two tests in there that don’t
pass pylint’s scrutiny yet.  I think it would be nice if we could keep
all of tests/ clean.


v2:
- Changed patch 2 as per Vladimir’s suggestion
   (i.e. don’t let discards1_sha256 and all_discards_sha256 be class
   variables at all)


Thanks for the review, applied to my block branch:

https://github.com/XanClic/qemu/commits/block

Max

[PATCH v2 4/4] iotests/297: Cover tests/

2021-05-12 Thread Max Reitz

297 so far does not check the named tests, which reside in the tests/
directory (i.e. full path tests/qemu-iotests/tests).  Fix it.

Thanks to the previous two commits, all named tests pass its scrutiny,
so we do not have to add anything to SKIP_FILES.

Signed-off-by: Max Reitz 
Reviewed-by: Willian Rampazzo 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
---
 tests/qemu-iotests/297 | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/297 b/tests/qemu-iotests/297
index e3244d40a0..ce0e82e279 100755
--- a/tests/qemu-iotests/297
+++ b/tests/qemu-iotests/297
@@ -55,8 +55,9 @@ def is_python_file(filename):
 
 
 def run_linters():
-files = [filename for filename in (set(os.listdir('.')) - set(SKIP_FILES))
- if is_python_file(filename)]
+named_tests = [f'tests/{entry}' for entry in os.listdir('tests')]
+check_tests = set(os.listdir('.') + named_tests) - set(SKIP_FILES)
+files = [filename for filename in check_tests if is_python_file(filename)]
 
 iotests.logger.debug('Files to be checked:')
 iotests.logger.debug(', '.join(sorted(files)))
-- 
2.31.1

[PATCH v2 2/4] migrate-bitmaps-postcopy-test: Fix pylint warnings

2021-05-12 Thread Max Reitz

pylint complains that discards1_sha256 and all_discards_sha256 are first
set in non-__init__ methods.

These variables are not really class-variables anyway, so let them
instead be returned by start_postcopy(), thus silencing pylint.

Suggested-by: Vladimir Sementsov-Ogievskiy 
Signed-off-by: Max Reitz 
---
 .../tests/migrate-bitmaps-postcopy-test | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/tests/qemu-iotests/tests/migrate-bitmaps-postcopy-test 
b/tests/qemu-iotests/tests/migrate-bitmaps-postcopy-test
index 584062b412..00ebb5c251 100755
--- a/tests/qemu-iotests/tests/migrate-bitmaps-postcopy-test
+++ b/tests/qemu-iotests/tests/migrate-bitmaps-postcopy-test
@@ -132,10 +132,10 @@ class 
TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 
 result = self.vm_a.qmp('x-debug-block-dirty-bitmap-sha256',
node='drive0', name='bitmap0')
-self.discards1_sha256 = result['return']['sha256']
+discards1_sha256 = result['return']['sha256']
 
 # Check, that updating the bitmap by discards works
-assert self.discards1_sha256 != empty_sha256
+assert discards1_sha256 != empty_sha256
 
 # We want to calculate resulting sha256. Do it in bitmap0, so, disable
 # other bitmaps
@@ -148,7 +148,7 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 
 result = self.vm_a.qmp('x-debug-block-dirty-bitmap-sha256',
node='drive0', name='bitmap0')
-self.all_discards_sha256 = result['return']['sha256']
+all_discards_sha256 = result['return']['sha256']
 
 # Now, enable some bitmaps, to be updated during migration
 for i in range(2, nb_bitmaps, 2):
@@ -173,10 +173,11 @@ class 
TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 
 event_resume = self.vm_b.event_wait('RESUME')
 self.vm_b_events.append(event_resume)
-return event_resume
+return (event_resume, discards1_sha256, all_discards_sha256)
 
 def test_postcopy_success(self):
-event_resume = self.start_postcopy()
+event_resume, discards1_sha256, all_discards_sha256 = \
+self.start_postcopy()
 
 # enabled bitmaps should be updated
 apply_discards(self.vm_b, discards2)
@@ -217,7 +218,7 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 for i in range(0, nb_bitmaps, 5):
 result = self.vm_b.qmp('x-debug-block-dirty-bitmap-sha256',
node='drive0', name='bitmap{}'.format(i))
-sha = self.discards1_sha256 if i % 2 else self.all_discards_sha256
+sha = discards1_sha256 if i % 2 else all_discards_sha256
 self.assert_qmp(result, 'return/sha256', sha)
 
 def test_early_shutdown_destination(self):
-- 
2.31.1

[PATCH v2 1/4] iotests/297: Drop 169 and 199 from the skip list

2021-05-12 Thread Max Reitz

169 and 199 have been renamed and moved to tests/ (commit a44be0334be:
"iotests: rename and move 169 and 199 tests"), so we can drop them from
the skip list.

Signed-off-by: Max Reitz 
Reviewed-by: Willian Rampazzo 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
---
 tests/qemu-iotests/297 | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/297 b/tests/qemu-iotests/297
index a37910b42d..e3244d40a0 100755
--- a/tests/qemu-iotests/297
+++ b/tests/qemu-iotests/297
@@ -29,7 +29,7 @@ import iotests
 SKIP_FILES = (
 '030', '040', '041', '044', '045', '055', '056', '057', '065', '093',
 '096', '118', '124', '132', '136', '139', '147', '148', '149',
-'151', '152', '155', '163', '165', '169', '194', '196', '199', '202',
+'151', '152', '155', '163', '165', '194', '196', '202',
 '203', '205', '206', '207', '208', '210', '211', '212', '213', '216',
 '218', '219', '222', '224', '228', '234', '235', '236', '237', '238',
 '240', '242', '245', '246', '248', '255', '256', '257', '258', '260',
-- 
2.31.1

[PATCH v2 3/4] migrate-bitmaps-test: Fix pylint warnings

2021-05-12 Thread Max Reitz

There are a couple of things pylint takes issue with:
- The "time" import is unused
- The import order (iotests should come last)
- get_bitmap_hash() doesn't use @self and so should be a function
- Semicolons at the end of some lines
- Parentheses after "if"
- Some lines are too long (80 characters instead of 79)
- inject_test_case()'s @name parameter shadows a top-level @name
  variable
- "lambda self: mc(self)" is equivalent to just "mc"
- Always put two empty lines after a function
- f'exec: cat > /dev/null' does not need to be an f-string

Fix them.

Signed-off-by: Max Reitz 
Reviewed-by: Willian Rampazzo 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
---
 tests/qemu-iotests/tests/migrate-bitmaps-test | 38 ++-
 1 file changed, 20 insertions(+), 18 deletions(-)

diff --git a/tests/qemu-iotests/tests/migrate-bitmaps-test 
b/tests/qemu-iotests/tests/migrate-bitmaps-test
index a5c7bc83e0..fb5ffbb8c4 100755
--- a/tests/qemu-iotests/tests/migrate-bitmaps-test
+++ b/tests/qemu-iotests/tests/migrate-bitmaps-test
@@ -20,11 +20,10 @@
 #
 
 import os
-import iotests
-import time
 import itertools
 import operator
 import re
+import iotests
 from iotests import qemu_img, qemu_img_create, Timeout
 
 
@@ -37,6 +36,12 @@ mig_cmd = 'exec: cat > ' + mig_file
 incoming_cmd = 'exec: cat ' + mig_file
 
 
+def get_bitmap_hash(vm):
+result = vm.qmp('x-debug-block-dirty-bitmap-sha256',
+node='drive0', name='bitmap0')
+return result['return']['sha256']
+
+
 class TestDirtyBitmapMigration(iotests.QMPTestCase):
 def tearDown(self):
 self.vm_a.shutdown()
@@ -62,21 +67,16 @@ class TestDirtyBitmapMigration(iotests.QMPTestCase):
 params['persistent'] = True
 
 result = vm.qmp('block-dirty-bitmap-add', **params)
-self.assert_qmp(result, 'return', {});
-
-def get_bitmap_hash(self, vm):
-result = vm.qmp('x-debug-block-dirty-bitmap-sha256',
-node='drive0', name='bitmap0')
-return result['return']['sha256']
+self.assert_qmp(result, 'return', {})
 
 def check_bitmap(self, vm, sha256):
 result = vm.qmp('x-debug-block-dirty-bitmap-sha256',
 node='drive0', name='bitmap0')
 if sha256:
-self.assert_qmp(result, 'return/sha256', sha256);
+self.assert_qmp(result, 'return/sha256', sha256)
 else:
 self.assert_qmp(result, 'error/desc',
-"Dirty bitmap 'bitmap0' not found");
+"Dirty bitmap 'bitmap0' not found")
 
 def do_test_migration_resume_source(self, persistent, migrate_bitmaps):
 granularity = 512
@@ -97,7 +97,7 @@ class TestDirtyBitmapMigration(iotests.QMPTestCase):
 self.add_bitmap(self.vm_a, granularity, persistent)
 for r in regions:
 self.vm_a.hmp_qemu_io('drive0', 'write %d %d' % r)
-sha256 = self.get_bitmap_hash(self.vm_a)
+sha256 = get_bitmap_hash(self.vm_a)
 
 result = self.vm_a.qmp('migrate', uri=mig_cmd)
 while True:
@@ -106,7 +106,7 @@ class TestDirtyBitmapMigration(iotests.QMPTestCase):
 break
 while True:
 result = self.vm_a.qmp('query-status')
-if (result['return']['status'] == 'postmigrate'):
+if result['return']['status'] == 'postmigrate':
 break
 
 # test that bitmap is still here
@@ -164,7 +164,7 @@ class TestDirtyBitmapMigration(iotests.QMPTestCase):
 self.add_bitmap(self.vm_a, granularity, persistent)
 for r in regions:
 self.vm_a.hmp_qemu_io('drive0', 'write %d %d' % r)
-sha256 = self.get_bitmap_hash(self.vm_a)
+sha256 = get_bitmap_hash(self.vm_a)
 
 if pre_shutdown:
 self.vm_a.shutdown()
@@ -214,16 +214,17 @@ class TestDirtyBitmapMigration(iotests.QMPTestCase):
 self.check_bitmap(self.vm_b, sha256 if persistent else False)
 
 
-def inject_test_case(klass, name, method, *args, **kwargs):
+def inject_test_case(klass, suffix, method, *args, **kwargs):
 mc = operator.methodcaller(method, *args, **kwargs)
-setattr(klass, 'test_' + method + name, lambda self: mc(self))
+setattr(klass, 'test_' + method + suffix, mc)
+
 
 for cmb in list(itertools.product((True, False), repeat=5)):
 name = ('_' if cmb[0] else '_not_') + 'persistent_'
 name += ('_' if cmb[1] else '_not_') + 'migbitmap_'
 name += '_online' if cmb[2] else '_offline'
 name += '_shared' if cmb[3] else '_nonshared'
-if (cmb[4]):
+if cmb[4]:
 name += '__pre_shutdown'
 
 inject_test_case(TestDirtyBitmapMigration, name, 'do_test_migration',
@@ -270,7 +271,8 @@ class TestDirtyBitmapBackingMigration(iotests.QMPTestCase):
 self.assert_qmp(result, 'return', {})
 
 # Check that the bitmaps are there
-for

[PATCH v2 0/4] iotests/297: Cover tests/

2021-05-12 Thread Max Reitz

v1: https://lists.nongnu.org/archive/html/qemu-block/2021-03/msg01471.html


Hi,

When reviewing Vladimir’s new addition to tests/, I noticed that 297 so
far does not cover named tests.  That isn’t so good.

This series makes it cover them, and because tests/ is rather sparse at
this point, I decided to also fix up the two tests in there that don’t
pass pylint’s scrutiny yet.  I think it would be nice if we could keep
all of tests/ clean.


v2:
- Changed patch 2 as per Vladimir’s suggestion
  (i.e. don’t let discards1_sha256 and all_discards_sha256 be class
  variables at all)


git-backport-diff against v1:

Key:
[] : patches are identical
[] : number of functional differences between upstream/downstream patch
[down] : patch is downstream-only
The flags [FC] indicate (F)unctional and (C)ontextual differences, respectively

001/4:[] [--] 'iotests/297: Drop 169 and 199 from the skip list'
002/4:[0016] [FC] 'migrate-bitmaps-postcopy-test: Fix pylint warnings'
003/4:[] [--] 'migrate-bitmaps-test: Fix pylint warnings'
004/4:[] [--] 'iotests/297: Cover tests/'


Max Reitz (4):
  iotests/297: Drop 169 and 199 from the skip list
  migrate-bitmaps-postcopy-test: Fix pylint warnings
  migrate-bitmaps-test: Fix pylint warnings
  iotests/297: Cover tests/

 tests/qemu-iotests/297|  7 ++--
 .../tests/migrate-bitmaps-postcopy-test   | 13 ---
 tests/qemu-iotests/tests/migrate-bitmaps-test | 38 ++-
 3 files changed, 31 insertions(+), 27 deletions(-)

-- 
2.31.1

Re: [PATCH 0/4] iotests/297: Cover tests/

2021-05-12 Thread Max Reitz


On 05.05.21 10:02, Vladimir Sementsov-Ogievskiy wrote:

Hi!

Kindly remind. Didn't you forget? I'm responsible for these two tests, 
let me know if you don't have time, and I'll resend this series :)


Sorry, I forgot indeed... v2 coming right up. :)

Max

Re: [PATCH v3 0/8] block: refactor write threshold

2021-05-12 Thread Max Reitz


On 06.05.21 11:06, Vladimir Sementsov-Ogievskiy wrote:

Hi all!

v3:
01-04,06,07: add Max's r-b.
01: improve commit msg and comments
03: improve commit msg
05: add more comments and qemu/atomic.h include
08: new, replacement for v2:08,09

Vladimir Sementsov-Ogievskiy (8):
   block/write-threshold: don't use write notifiers
   block: drop write notifiers
   test-write-threshold: rewrite test_threshold_(not_)trigger tests
   block/write-threshold: drop extra APIs
   block/write-threshold: don't use aio context lock
   test-write-threshold: drop extra tests
   test-write-threshold: drop extra TestStruct structure
   write-threshold: deal with includes

  include/block/block_int.h |  19 ++---
  include/block/write-threshold.h   |  31 +++--
  block.c   |   1 -
  block/io.c|  11 +--
  block/write-threshold.c   | 111 +++---
  tests/unit/test-write-threshold.c |  90 ++--
  6 files changed, 52 insertions(+), 211 deletions(-)


Thanks, I’ve applied all patches but patch 5 to my block branch, with 
the changes Eric has suggested:


https://github.com/XanClic/qemu/commits/block

Max

Re: [PATCH v2] qemu-iotests: fix pylint 2.8 consider-using-with error

2021-05-12 Thread Max Reitz


On 10.05.21 21:04, Emanuele Giuseppe Esposito wrote:

pylint 2.8 introduces consider-using-with error, suggesting
to use the 'with' block statement when possible.

Modify all subprocess.Popen call to use the 'with' statement,
except the one in __init__ of QemuIoInteractive class, since
it is assigned to a class field and used in other methods.

Signed-off-by: Emanuele Giuseppe Esposito 
---
v1 -> v2:
* fix indentation [Max]
* explain why we disabled the check in QemuIoInteractive's __init__ [Max]


Thanks!

Applied to my block branch:

https://github.com/XanClic/qemu/commits/block

Max

Re: [PATCH 2/7] block: use GDateTime for formatting timestamp when dumping snapshot info

2021-05-07 Thread Max Reitz


On 05.05.21 12:36, Daniel P. Berrangé wrote:

The GDateTime APIs provided by GLib avoid portability pitfalls, such
as some platforms where 'struct timeval.tv_sec' field is still 'long'
instead of 'time_t'. When combined with automatic cleanup, GDateTime
often results in simpler code too.

Signed-off-by: Daniel P. Berrangé 
---
  block/qapi.c | 11 ---
  1 file changed, 4 insertions(+), 7 deletions(-)


Reviewed-by: Max Reitz

Re: [PATCH] block/copy-on-read: use bdrv_drop_filter() and drop s->active

2021-05-07 Thread Max Reitz


On 06.05.21 21:41, Vladimir Sementsov-Ogievskiy wrote:

Now, after huge update of block graph permission update algorithm, we
don't need this workaround with active state of the filter. Drop it and
use new smart bdrv_drop_filter() function.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  block/copy-on-read.c | 33 +
  1 file changed, 1 insertion(+), 32 deletions(-)


Thanks, applied to my block branch:

https://github.com/XanClic/qemu/commits/block

Max

Re: [PATCH 3/4] block/vvfat: Fix leak of mapping_t::path

2021-05-07 Thread Max Reitz


On 30.04.21 18:25, Philippe Mathieu-Daudé wrote:

read_directory() keeps pointers to alloc'ed data in path ...:

  743 static int read_directory(BDRVVVFATState* s, int mapping_index)
  744 {
  ...
  792 buffer = g_malloc(length);
  ...
  828 /* create mapping for this file */
  829 if(!is_dot && !is_dotdot && (S_ISDIR(st.st_mode) || st.st_size)) {
  830 s->current_mapping = array_get_next(&(s->mapping));
  ...
  847 s->current_mapping->path=buffer;

... but these pointers are never free'd.


“Never” is not quite right, they are freed in some places: Namely in 
remove_mapping(), and in handle_renames_and_mkdirs().


Speaking of the latter, which does some commit_t handling – should those 
paths also be freed?  (i.e. the ones in commit_t, in s->commits)



Free them in vvfat_close(),
to fix (QEMU built with --enable-sanitizers):

   Direct leak of 148 byte(s) in 6 object(s) allocated from:
 #0 0x55d7a363773f in malloc (qemu-system-x86_64+0x1dab73f)
 #1 0x7f55c6447958 in g_malloc (/lib64/libglib-2.0.so.0+0x58958)
 #2 0x55d7a5e17679 in init_directories block/vvfat.c:962:16
 #3 0x55d7a5e1251e in vvfat_open block/vvfat.c:1255:9
 #4 0x55d7a5a5363f in bdrv_open_driver block.c:1526:15
 #5 0x55d7a5a9d369 in bdrv_open_common block.c:1802:11
 #6 0x55d7a5a609f1 in bdrv_open_inherit block.c:3444:11
 #7 0x55d7a5a65411 in bdrv_open_child_bs block.c:3079:10
 #8 0x55d7a5a60079 in bdrv_open_inherit block.c:3391:19
 #9 0x55d7a5a65da3 in bdrv_open block.c:3537:12
 #10 0x55d7a5b33f6a in blk_new_open block/block-backend.c:421:10
 #11 0x55d7a5a0a33e in blockdev_init blockdev.c:610:15
 #12 0x55d7a5a088e7 in drive_new blockdev.c:994:11
 #13 0x55d7a51b10c4 in drive_init_func softmmu/vl.c:636:12
 #14 0x55d7a620e148 in qemu_opts_foreach util/qemu-option.c:1167:14
 #15 0x55d7a51b0e20 in configure_blockdev softmmu/vl.c:695:9
 #16 0x55d7a51a70b5 in qemu_create_early_backends softmmu/vl.c:1895:5
 #17 0x55d7a519bf87 in qemu_init softmmu/vl.c:3551:5
 #18 0x55d7a366f619 in main softmmu/main.c:49:5

Fixes: a046433a161 ("Major overhaul of the virtual FAT driver for read/write 
support")
Signed-off-by: Philippe Mathieu-Daudé 
---
  block/vvfat.c | 5 +
  1 file changed, 5 insertions(+)


First:

Reviewed-by: Max Reitz 

Second:

In commit_mappings(), there is this line:

next_mapping->path = mapping->path;

That looks strange.  Should that be a g_strdup(), or am I missing the 
place where only one of them can really exist?


But, well, all of the vvfat code is just strange to me.


diff --git a/block/vvfat.c b/block/vvfat.c
index 2cc21787600..c193a816646 100644
--- a/block/vvfat.c
+++ b/block/vvfat.c
@@ -3228,6 +3228,11 @@ static void vvfat_close(BlockDriverState *bs)
  {
  BDRVVVFATState *s = bs->opaque;
  
+for (unsigned j = 0; j < s->mapping.next; j++) {

+mapping_t *mapping = array_get(&(s->mapping), j);
+
+g_free(mapping->path);
+}
  vvfat_close_current_file(s);
  array_free(&(s->fat));
  array_free(&(s->directory));

Re: [PATCH] qemu-iotests: fix pylint 2.8 consider-using-with error

2021-05-07 Thread Max Reitz


On 06.05.21 10:48, Emanuele Giuseppe Esposito wrote:

pylint 2.8 introduces consider-using-with error, suggesting
to use the 'with' block statement when possible.
http://pylint.pycqa.org/en/latest/whatsnew/2.8.html

Modify all subprocess.Popen calls to use the 'with' statement,
except one in __init__ of QemuIoInteractive class, since
the return value is assigned to a class field and used in other methods.

Signed-off-by: Emanuele Giuseppe Esposito 
---
  tests/qemu-iotests/iotests.py| 63 
  tests/qemu-iotests/testrunner.py | 22 +--
  2 files changed, 42 insertions(+), 43 deletions(-)


Thanks, looks good, functionally.  But I just can’t keep myself from 
nagging about indentation (I wouldn’t have, but flake8 says I may be 
justified):



diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 5af0182895..e1117e8ae8 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -113,15 +113,14 @@ def qemu_tool_pipe_and_status(tool: str, args: 
Sequence[str],
  Run a tool and return both its output and its exit code
  """
  stderr = subprocess.STDOUT if connect_stderr else None
-subp = subprocess.Popen(args,
-stdout=subprocess.PIPE,
-stderr=stderr,
-universal_newlines=True)
-output = subp.communicate()[0]
-if subp.returncode < 0:
-cmd = ' '.join(args)
-sys.stderr.write(f'{tool} received signal {-subp.returncode}: {cmd}\n')
-return (output, subp.returncode)
+with subprocess.Popen(args, stdout=subprocess.PIPE,
+stderr=stderr, universal_newlines=True) as subp:


The second line’s indentation is not aligned to the opening parenthesis 
of the Popen call.  I’d like it better if it were.



+output = subp.communicate()[0]
+if subp.returncode < 0:
+cmd = ' '.join(args)
+sys.stderr.write(f'{tool} received signal \
+   {-subp.returncode}: {cmd}\n')
+return (output, subp.returncode)
  
  def qemu_img_pipe_and_status(*args: str) -> Tuple[str, int]:

  """
@@ -237,6 +236,7 @@ def qemu_io_silent_check(*args):
  class QemuIoInteractive:
  def __init__(self, *args):
  self.args = qemu_io_args_no_fmt + list(args)
+# pylint: disable=consider-using-with


I think I would have added an additional comment why we need this 
statement (because we need to keep the Popen object around), but then 
again, I suppose it really is obvious.


(Wouldn’t have commented on this bit if I hadn’t replied because of the 
indentation.  So if you think it’s obvious, no need to add a comment.)



  self._p = subprocess.Popen(self.args, stdin=subprocess.PIPE,
 stdout=subprocess.PIPE,
 stderr=subprocess.STDOUT,


[...]


diff --git a/tests/qemu-iotests/testrunner.py b/tests/qemu-iotests/testrunner.py
index 1fc61fcaa3..729fe9ee3b 100644
--- a/tests/qemu-iotests/testrunner.py
+++ b/tests/qemu-iotests/testrunner.py
@@ -258,17 +258,17 @@ def do_run_test(self, test: str) -> TestResult:
  
  t0 = time.time()

  with f_bad.open('w', encoding="utf-8") as f:
-proc = subprocess.Popen(args, cwd=str(f_test.parent), env=env,
-stdout=f, stderr=subprocess.STDOUT)
-try:
-proc.wait()
-except KeyboardInterrupt:
-proc.terminate()
-proc.wait()
-return TestResult(status='not run',
-  description='Interrupted by user',
-  interrupted=True)
-ret = proc.returncode
+with subprocess.Popen(args, cwd=str(f_test.parent), env=env,
+stdout=f, stderr=subprocess.STDOUT) as proc:


As above, the second line is (no longer) aligned to the opening parenthesis.


+try:
+proc.wait()
+except KeyboardInterrupt:
+proc.terminate()
+proc.wait()
+return TestResult(status='not run',
+description='Interrupted by user',
+interrupted=True)


And here, too.

Max


+ret = proc.returncode
  
  elapsed = round(time.time() - t0, 1)

Re: [PATCH v3] Document qemu-img options data_file and data_file_raw

2021-05-07 Thread Max Reitz


On 05.05.21 21:55, Connor Kuehl wrote:

The contents of this patch were initially developed and posted by Han
Han[1], however, it appears the original patch was not applied. Since
then, the relevant documentation has been moved and adapted to a new
format.

I've taken most of the original wording and tweaked it according to
some of the feedback from the original patch submission. I've also
adapted it to restructured text, which is the format the documentation
currently uses.

[1] https://lists.nongnu.org/archive/html/qemu-block/2019-10/msg01253.html

Fixes: https://bugzilla.redhat.com/1763105
Signed-off-by: Han Han 
Suggested-by: Max Reitz 
[ Max: provided description of data_file_raw behavior ]
Signed-off-by: Connor Kuehl 


Thanks, applied to my block branch:

https://github.com/XanClic/qemu/commits/block

Max

Re: [PATCH v2 9/9] block/write-threshold: drop extra includes

2021-05-06 Thread Max Reitz


On 05.05.21 22:34, Vladimir Sementsov-Ogievskiy wrote:

05.05.2021 19:23, Max Reitz wrote:

On 04.05.21 10:25, Vladimir Sementsov-Ogievskiy wrote:

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  block/write-threshold.c | 3 ---
  1 file changed, 3 deletions(-)

diff --git a/block/write-threshold.c b/block/write-threshold.c
index fbf4e6f5c4..db271c5537 100644
--- a/block/write-threshold.c
+++ b/block/write-threshold.c
@@ -12,10 +12,7 @@
   */
  #include "qemu/osdep.h"
-#include "block/block_int.h"


We need this (for bs->write_threshold_offset), but it’s included 
through block/block_int.h.  I’m not sure whether we should drop it 
from the includes.


Perhaps we should instead drop block_int.h from write-threshold.h? 
Shouldn’t including qemu/osdep.h (which includes qemu/typedefs.h, 
which forward-declares BDS) be sufficient there?



-#include "qemu/coroutine.h"
  #include "block/write-threshold.h"
-#include "qemu/notify.h"
  #include "qapi/error.h"
  #include "qapi/qapi-commands-block-core.h"
  #include "qapi/qapi-events-block-core.h"


Btw, where does qemu/atomic.h come in?  Looks like it comes through 
block_int.h.  I think we should include it explicitly.




I'm not sure. If something is included through another include, why to 
include it explicitly?


Because the other include may change.  I’d include something if you need 
the feature, and we need atomics here.


And block_int.h doesn’t even provide atomic.h, it comes through various 
of its includes (which are probably not included to provide atomics). 
So this is already indirect and basically just incidental; block_int.h 
doesn’t guarantee atomic.h.


Another thing: I see that other fields in BDS that are to be accessed 
with atomic ops have a comment saying so.  I think 
write_threshold_offset should have, too.


Max

For me, if statement can be removed with no effect, it's an extra 
statement.

Re: [PATCH v2 9/9] block/write-threshold: drop extra includes

2021-05-05 Thread Max Reitz


On 04.05.21 10:25, Vladimir Sementsov-Ogievskiy wrote:

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  block/write-threshold.c | 3 ---
  1 file changed, 3 deletions(-)

diff --git a/block/write-threshold.c b/block/write-threshold.c
index fbf4e6f5c4..db271c5537 100644
--- a/block/write-threshold.c
+++ b/block/write-threshold.c
@@ -12,10 +12,7 @@
   */
  
  #include "qemu/osdep.h"

-#include "block/block_int.h"


We need this (for bs->write_threshold_offset), but it’s included through 
block/block_int.h.  I’m not sure whether we should drop it from the 
includes.


Perhaps we should instead drop block_int.h from write-threshold.h? 
Shouldn’t including qemu/osdep.h (which includes qemu/typedefs.h, which 
forward-declares BDS) be sufficient there?



-#include "qemu/coroutine.h"
  #include "block/write-threshold.h"
-#include "qemu/notify.h"
  #include "qapi/error.h"
  #include "qapi/qapi-commands-block-core.h"
  #include "qapi/qapi-events-block-core.h"


Btw, where does qemu/atomic.h come in?  Looks like it comes through 
block_int.h.  I think we should include it explicitly.


Max

Re: [PATCH] block/snapshot: Clarify goto fallback behavior

2021-05-05 Thread Max Reitz


On 05.05.21 17:05, Vladimir Sementsov-Ogievskiy wrote:

03.05.2021 12:54, Max Reitz wrote:

In the bdrv_snapshot_goto() fallback code, we work with a pointer to
either bs->file or bs->backing. 



We close that child,


Do we?


We *detach it.


close the node
(with .bdrv_close()), apply the snapshot on the child node, and then
re-open the node (with .bdrv_open()).

In order for .bdrv_open() to attach the same child node that we had
before, we pass "file={child-node}" or "backing={child-node}" to it.
Therefore, when .bdrv_open() has returned success, we can assume that
bs->file or bs->backing (respectively) points to our original child
again.  This is verified by an assertion.

All of this is not immediately clear from a quick glance at the code,
so add a comment to the assertion what it is for, and why it is valid.
It certainly confused Coverity.

Reported-by: Coverity (CID 1452774)
Signed-off-by: Max Reitz 
---
  block/snapshot.c | 14 +-
  1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/block/snapshot.c b/block/snapshot.c
index e8ae9a28c1..cce5e35b35 100644
--- a/block/snapshot.c
+++ b/block/snapshot.c
@@ -275,13 +275,16 @@ int bdrv_snapshot_goto(BlockDriverState *bs,
  qobject_unref(file_options);
  g_free(subqdict_prefix);
+    /* Force .bdrv_open() below to re-attach fallback_bs on 
*fallback_ptr */

  qdict_put_str(options, (*fallback_ptr)->name,
    bdrv_get_node_name(fallback_bs));
+    /* Now close bs, apply the snapshot on fallback_bs, and 
re-open bs */

  if (drv->bdrv_close) {
  drv->bdrv_close(bs);
  }
+    /* .bdrv_open() will re-attach it */
  bdrv_unref_child(bs, *fallback_ptr);
  *fallback_ptr = NULL;
@@ -296,7 +299,16 @@ int bdrv_snapshot_goto(BlockDriverState *bs,
  return ret < 0 ? ret : open_ret;
  }
-    assert(fallback_bs == (*fallback_ptr)->bs);
+    /*
+ * fallback_ptr is >file or >backing.  *fallback_ptr
+ * was closed above and set to NULL, but the .bdrv_open() call
+ * has opened it again, because we set the respective option
+ * (with the qdict_put_str() call above).
+ * Assert that .bdrv_open() has attached some child on
+ * *fallback_ptr, and that it has attached the one we wanted
+ * it to (i.e., fallback_bs).
+ */
+    assert(*fallback_ptr && fallback_bs == (*fallback_ptr)->bs);
  bdrv_unref(fallback_bs);
  return ret;
  }



Reviewed-by: Vladimir Sementsov-Ogievskiy 

===

I still think that this fallback has more problems.. Nothing guarantee 
that all format drivers (even restricted to those who have only one 
child) support such logic.


For example, .bdrv_open() may rely on state structure were zeroed and 
not initialize some fields. And .bdrv_close() may just g_free() some 
things, not setting them to zero.. So we probably should call 
bdrv_open()/bdrv_close() generic functions. But we can't: at least 
bdrv_close() requires that node has no parents.


Not saying about that we lose everything on failure (when actually, it's 
better to restore original state on failure).


This feature should instead be done with help of bdrv_reopen_multiple(), 
and even with it it's not obvious how to do it properly.


The feature were done long ago in 2010 with commit 
7cdb1f6d305e1000b5f882257cbee71b8bb08ef5 by Morita Kazutaka.


Note also, that the only protocol driver that support snapshots is rbd, 
and snapshot support was added to it in 2012 with commit 
bd6032470631d8d5de6db84ecb55398b70d9d2f3 by Gregory Farnum.


Other two drivers with support are sheepdog (which is deprecated) and 
qcow2 (I doubt that it should be used as protocol driver for some other 
format).



Do we really need this fallback? Is it used somewhere? May be, we can 
just deprecate it, and look will someone complain?


Maybe? :)

Max

Re: [PATCH v2 8/9] test-write-threshold: drop extra includes

2021-05-05 Thread Max Reitz


On 04.05.21 10:25, Vladimir Sementsov-Ogievskiy wrote:

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  tests/unit/test-write-threshold.c | 2 --
  1 file changed, 2 deletions(-)


Reviewed-by: Max Reitz

Re: [PATCH v2 7/9] test-write-threshold: drop extra TestStruct structure

2021-05-05 Thread Max Reitz


On 04.05.21 10:25, Vladimir Sementsov-Ogievskiy wrote:

We don't need this extra logic: it doesn't make code simpler.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  tests/unit/test-write-threshold.c | 20 +++-
  1 file changed, 3 insertions(+), 17 deletions(-)


Reviewed-by: Max Reitz

Re: [PATCH v2 6/9] test-write-threshold: drop extra tests

2021-05-05 Thread Max Reitz


On 04.05.21 10:25, Vladimir Sementsov-Ogievskiy wrote:

Testing set/get of one 64bit variable doesn't seem necessary. We have a
lot of such variables. Also remaining tests do test set/get anyway.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  tests/unit/test-write-threshold.c | 43 ---
  1 file changed, 43 deletions(-)


Reviewed-by: Max Reitz

Re: [PATCH v2 5/9] block/write-threshold: don't use aio context lock

2021-05-05 Thread Max Reitz


On 04.05.21 10:25, Vladimir Sementsov-Ogievskiy wrote:

Instead of relying on aio context lock, let's make use of atomic
operations.

The tricky place is bdrv_write_threshold_check_write(): we want
atomically unset bs->write_threshold_offset iff
   offset + bytes > bs->write_threshold_offset
We don't have such atomic operation, so let's go in a loop:


This could also be solved by untangling the overloaded meaning of 
write_threshold_offset – if we had an additional boolean 
write_threshold_check, then this wouldn’t be a problem, and we could do 
this:


if (end > bdrv_write_threshold_get(bs)) {
if (qatomic_xchg(>write_threshold_check, false) == true) {
qapi_event_send(...);
}
}

However, the problem then becomes thinking about the memory access 
semantics, because if some other thread sets the threshold offset and 
enables the threshold check, we need to ensure that we see the updated 
offset here...


So I suppose your loop is simpler then.


1. fetch wtr atomically
2. if condition satisfied, try cmpxchg (if not satisfied, we are done,
don't send event)
3. if cmpxchg succeeded, we are done (send event), else go to [1]

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  block/write-threshold.c | 32 +---
  1 file changed, 17 insertions(+), 15 deletions(-)


Reviewed-by: Max Reitz

Re: [PATCH v2 4/9] block/write-threshold: drop extra APIs

2021-05-05 Thread Max Reitz


On 04.05.21 10:25, Vladimir Sementsov-Ogievskiy wrote:

bdrv_write_threshold_exceeded() is unused at all.

bdrv_write_threshold_is_set() is used only to double check the value of
bs->write_threshold_offset in tests. No real sense in it (both tests do
check real value with help of bdrv_write_threshold_get())


Well, depends on how one sees it.  One could also say that it neatly 
hides the fact that a threshold of 0 means disabled (i.e. the overloaded 
meaning of the write_threshold_offset attribute).


*shrug*

Reviewed-by: Max Reitz 


Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  include/block/write-threshold.h   | 24 
  block/write-threshold.c   | 19 ---
  tests/unit/test-write-threshold.c |  4 
  3 files changed, 47 deletions(-)

Re: [PATCH v2 1/9] block/write-threshold: don't use write notifiers

2021-05-05 Thread Max Reitz


On 05.05.21 15:35, Vladimir Sementsov-Ogievskiy wrote:

05.05.2021 15:37, Max Reitz wrote:

write-notifiers are used only for write-threshold. New code for such
purpose should create filters.

Let's better special-case write-threshold and drop write notifiers at
all. (Actually, write-threshold is special-cased anyway, as the only
user of write-notifiers)


Not noted here: That write-threshold.c is also reorganized.  (Doesn’t 
seem entirely necessary to do right in this patch, but why not.)


You mean, we probably could only add new interface here, keeping other 
things as is, and drop them in a separate patch?


Something like that, yes.  But not important.


If keep as is we can add the following here:

   So, create a new direct interface for bdrv_co_write_req_prepare() and 
drop

   all write-notifier related logic from write-threshold.c.



Sounds good!

Max

Re: [PATCH v2 3/9] test-write-threshold: rewrite test_threshold_(not_)trigger tests

2021-05-05 Thread Max Reitz


On 04.05.21 10:25, Vladimir Sementsov-Ogievskiy wrote:

These tests use bdrv_write_threshold_exceeded() API, which is used only
for test.


Well, now.  That used to be different before patch 1.


Better is testing real API, which is used in block.c as well.

So, let's call bdrv_write_threshold_check_write(), and check is
bs->write_threshold_offset cleared or not (it's cleared iff threshold
triggered).

Also we get rid of BdrvTrackedRequest use here. Tracked requests are
unrelated to write-threshold since we get rid of write notifiers.


The purpose behind the BdrvTrackedRequest was clearly because this is 
the object bdrv_write_threshold_exceeded() expected.  This reads like 
there was some other purpose (i.e. actually tracked requests), but there 
wasn’t.


The question that comes to my mind is why we had the 
bdrv_check_request() calls there, and why this patch removes them.  They 
seem unrelated to the write threshold, but someone must have thought 
something when adding them.


Looking into git blame, that someone is you :) (8b1170012b1)
Looks like you added those calls because BdrvTrackedRequest is a block 
layer structure, so getting rid of it means the reason for 
bdrv_check_request() disappears.  OK.


Reviewed-by: Max Reitz 


Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  tests/unit/test-write-threshold.c | 22 --
  1 file changed, 4 insertions(+), 18 deletions(-)

diff --git a/tests/unit/test-write-threshold.c 
b/tests/unit/test-write-threshold.c
index fc1c45a2eb..fd40a815b8 100644
--- a/tests/unit/test-write-threshold.c
+++ b/tests/unit/test-write-threshold.c
@@ -55,41 +55,27 @@ static void test_threshold_multi_set_get(void)
  
  static void test_threshold_not_trigger(void)

  {
-uint64_t amount = 0;
  uint64_t threshold = 4 * 1024 * 1024;
  BlockDriverState bs;
-BdrvTrackedRequest req;
  
  memset(, 0, sizeof(bs));

-memset(, 0, sizeof(req));
-req.offset = 1024;
-req.bytes = 1024;
-
-bdrv_check_request(req.offset, req.bytes, _abort);
  
  bdrv_write_threshold_set(, threshold);

-amount = bdrv_write_threshold_exceeded(, );
-g_assert_cmpuint(amount, ==, 0);
+bdrv_write_threshold_check_write(, 1024, 1024);
+g_assert_cmpuint(bdrv_write_threshold_get(), ==, threshold);
  }
  
  
  static void test_threshold_trigger(void)

  {
-uint64_t amount = 0;
  uint64_t threshold = 4 * 1024 * 1024;
  BlockDriverState bs;
-BdrvTrackedRequest req;
  
  memset(, 0, sizeof(bs));

-memset(, 0, sizeof(req));
-req.offset = (4 * 1024 * 1024) - 1024;
-req.bytes = 2 * 1024;
-
-bdrv_check_request(req.offset, req.bytes, _abort);
  
  bdrv_write_threshold_set(, threshold);

-amount = bdrv_write_threshold_exceeded(, );
-g_assert_cmpuint(amount, >=, 1024);
+bdrv_write_threshold_check_write(, threshold - 1024, 2 * 1024);
+g_assert_cmpuint(bdrv_write_threshold_get(), ==, 0);
  }
  
  typedef struct TestStruct {

Re: [PATCH v2 2/9] block: drop write notifiers

2021-05-05 Thread Max Reitz


On 04.05.21 10:25, Vladimir Sementsov-Ogievskiy wrote:

They are unused now.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  include/block/block_int.h | 12 
  block.c   |  1 -
  block/io.c|  6 --
  3 files changed, 19 deletions(-)


Reviewed-by: Max Reitz

Re: [PATCH v2 1/9] block/write-threshold: don't use write notifiers

2021-05-05 Thread Max Reitz


On 04.05.21 10:25, Vladimir Sementsov-Ogievskiy wrote:

write-notifiers are used only for write-threshold. New code for such
purpose should create filters.

Let's better special-case write-threshold and drop write notifiers at
all. (Actually, write-threshold is special-cased anyway, as the only
user of write-notifiers)


Not noted here: That write-threshold.c is also reorganized.  (Doesn’t 
seem entirely necessary to do right in this patch, but why not.)



Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  include/block/block_int.h   |  1 -
  include/block/write-threshold.h |  9 +
  block/io.c  |  5 ++-
  block/write-threshold.c | 68 +++--
  4 files changed, 25 insertions(+), 58 deletions(-)


[...]


diff --git a/include/block/write-threshold.h b/include/block/write-threshold.h
index c646f267a4..7942dcc368 100644
--- a/include/block/write-threshold.h
+++ b/include/block/write-threshold.h
@@ -59,4 +59,13 @@ bool bdrv_write_threshold_is_set(const BlockDriverState *bs);
  uint64_t bdrv_write_threshold_exceeded(const BlockDriverState *bs,
 const BdrvTrackedRequest *req);
  
+/*

+ * bdrv_write_threshold_check_write
+ *
+ * Check, does specified request exceeds write threshold. If it is, send


I’d prefer either “Check: Does the specified request exceed the write 
threshold?” or “Check whether the specified request exceeds the write 
threshold.”



+ * corresponding event and unset write threshold.


I’d call it “disable write threshold checking” instead of “unset write 
threshold”, because I don’t it immediately clear what unsetting the 
threshold means.



+ */
+void bdrv_write_threshold_check_write(BlockDriverState *bs, int64_t offset,
+  int64_t bytes);
+
  #endif


[...]


@@ -122,3 +68,15 @@ void qmp_block_set_write_threshold(const char *node_name,
  
  aio_context_release(aio_context);

  }
+
+void bdrv_write_threshold_check_write(BlockDriverState *bs, int64_t offset,
+  int64_t bytes)
+{
+int64_t end = offset + bytes;
+uint64_t wtr = bs->write_threshold_offset;
+
+if (wtr > 0 && end > wtr) {
+qapi_event_send_block_write_threshold(bs->node_name, end - wtr, wtr);


OK, don’t understand why bdrv_write_threshold_exceeded() had two cases 
(one for offset > wtr, one for end > wtr).  Perhaps overflow 
considerations?  Shouldn’t matter though.



+bdrv_write_threshold_set(bs, 0);


I’d keep the comment from before_write_notify() here (i.e. “autodisable 
to avoid flooding the monitor”).


But other than that, I have no complaints:

Reviewed-by: Max Reitz 


+}
+}

Re: [PATCH v2] Document qemu-img options data_file and data_file_raw

2021-05-04 Thread Max Reitz


On 04.05.21 01:15, Connor Kuehl wrote:

On 4/30/21 9:45 AM, Max Reitz wrote:

+  ``data_file_raw``
+If this option is set to ``on``, QEMU will always keep the external
+data file consistent as a standalone read-only raw image. It does
+this by forwarding updates through to the raw image in addition to
+updating the image metadata. If set to ``off``, QEMU will only
+update the image metadata without forwarding the changes through
+to the raw image. The default value is ``off``.


Hm, what updates and what changes?  I mean, the first part makes sense (the “It 
does this by...”), but the second part doesn’t.  qemu will still forward most 
writes to the data file.  (Not all, but most.)

(Also, nit pick: With data_file_raw=off, the data file is not a raw image.  
(You still call it that in the penultimate sentence.))
When you write data to a qcow2 file with data_file, the data also goes to the 
data_file, most of the time.  The exception is when it can be handled with a 
metadata update, i.e. when it's a zero write or discard.

In addition, such updates (i.e. zero writes, I presume) not happening to the 
data file are usually a minor problem.  The real problem is that without 
data_file_raw, data clusters can be allocated anywhere in the data file, 
whereas with data_file_raw, they are allocated at their respective guest offset 
(i.e. the host offset always equals the guest offset).

I personally would have been fine with the first sentence, but if we want more 
of an explanation...  Perhaps:

<

I found it very helpful. I'll incorporate your explanation into the next
revision.

I'm wondering what the most appropriate trailer would be for the next
revision?

Suggested-by: Max [..]
Co-developed-by: Max [..]

Let me know if you have a strong preference, otherwise I'll go with
Suggested-by:


I’m fine without any tag (if I merge this patch, it’ll get my S-o-b 
anyway :)), but if any, I’d probably go with a Suggested-by, yes.


Max

Re: [PATCH v3 14/15] qemu_iotests: add option to show qemu binary logs on stdout

2021-05-03 Thread Max Reitz


On 30.04.21 23:04, Emanuele Giuseppe Esposito wrote:



On 30/04/2021 15:50, Max Reitz wrote:

On 14.04.21 19:03, Emanuele Giuseppe Esposito wrote:

Using the flag -p, allow the qemu binary to print to stdout.
This helps especially when doing print-debugging.


I think this shouldn’t refer to prints but to qemu’s stdout/stderr in 
general, i.e



Signed-off-by: Emanuele Giuseppe Esposito 
---
  tests/qemu-iotests/check  | 3 ++-
  tests/qemu-iotests/iotests.py | 9 +
  tests/qemu-iotests/testenv.py | 9 +++--
  3 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/tests/qemu-iotests/check b/tests/qemu-iotests/check
index 489178d9a4..84483922eb 100755
--- a/tests/qemu-iotests/check
+++ b/tests/qemu-iotests/check
@@ -33,6 +33,7 @@ def make_argparser() -> argparse.ArgumentParser:
 help='pretty print output for make check')
  p.add_argument('-d', dest='debug', action='store_true', 
help='debug')
+    p.add_argument('-p', dest='print', action='store_true', 
help='shows qemu binary prints to stdout')


I’d prefer for the description to be “redirects qemu's stdout and 
stderr to the test output”.


Also, this line is too long.


  p.add_argument('-gdb', action='store_true',
 help="start gdbserver with $GDB_QEMU options. \
   Default is localhost:12345")
@@ -117,7 +118,7 @@ if __name__ == '__main__':
    aiomode=args.aiomode, cachemode=args.cachemode,
    imgopts=args.imgopts, misalign=args.misalign,
    debug=args.debug, valgrind=args.valgrind,
-  gdb=args.gdb)
+  gdb=args.gdb, qprint=args.print)
  testfinder = TestFinder(test_dir=env.source_iotests)
diff --git a/tests/qemu-iotests/iotests.py 
b/tests/qemu-iotests/iotests.py

index f9832558a0..52ff7332f8 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -79,6 +79,8 @@
  if os.environ.get('GDB_QEMU'):
  qemu_gdb = ['gdbserver'] + 
os.environ.get('GDB_QEMU').strip().split(' ')

+qemu_print = os.environ.get('PRINT_QEMU', False)
+
  imgfmt = os.environ.get('IMGFMT', 'raw')
  imgproto = os.environ.get('IMGPROTO', 'file')
  output_dir = os.environ.get('OUTPUT_DIR', '.')
@@ -621,6 +623,13 @@ def _post_shutdown(self) -> None:
  super()._post_shutdown()
  self.subprocess_check_valgrind(qemu_valgrind)
+    def _pre_launch(self) -> None:
+    super()._pre_launch()
+    if qemu_print and self._qemu_log_file != None:


I think "is not None" is mostly used in Python.  (I’m saying this in 
this weird way because I’m not the one to ask what’s mostly used in 
Python...)


(That also silences mypy, which otherwise complains and then fails 297.)


+    # set QEMU binary output to stdout
+    self._qemu_log_file.close()
+    self._qemu_log_file = None


I don’t know enough Python to know how this works.  I suppose this 
does access the super class’s member?  (I could also imagine it 
creates a new member, just for this child class, but that then 
effectively overwrites the super class’s member.)


Considering _qemu_log_file is apparently meant to be a private 
attribute (from the leading underscore), I think it would be better to 
have a function provided by the super class for this.


It should access the superclass member, and it's the same that 
_post_shutdown does. I can make a function for that.


Understood.

Regarding all other feedback in all patches that I did not answer: agree 
on all of them, will adjust everything in v4.


Thanks, looking forward to it! :)

Max

Re: [PATCH v3 05/15] qemu-iotests: delay QMP socket timers

2021-05-03 Thread Max Reitz


On 30.04.21 23:03, Emanuele Giuseppe Esposito wrote:



On 30/04/2021 13:59, Max Reitz wrote:

On 14.04.21 19:03, Emanuele Giuseppe Esposito wrote:

Attaching a gdbserver implies that the qmp socket
should wait indefinitely for an answer from QEMU.

Signed-off-by: Emanuele Giuseppe Esposito 
---
  python/qemu/machine.py    |  3 +++
  tests/qemu-iotests/iotests.py | 10 +-
  2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/python/qemu/machine.py b/python/qemu/machine.py
index 12752142c9..d6142271c2 100644
--- a/python/qemu/machine.py
+++ b/python/qemu/machine.py
@@ -409,6 +409,9 @@ def _launch(self) -> None:
 stderr=subprocess.STDOUT,
 shell=False,
 close_fds=False)
+
+    if 'gdbserver' in self._wrapper:
+    self._qmp_timer = None


Why doesn’t __init__() evaluate this?  This here doesn’t feel like the 
right place for it.  If we want to evaluate it here, self._qmp_timer 
shouldn’t exist, and instead the timeout should be a _post_launch() 
parameter.  (Which I would have nothing against, by the way.)


Uhm.. I got another comment in a previous version where for the "event" 
callbacks it was better a property than passing around a parameter. 
Which I honestly agree.


I think that comment was in the sense of providing a default value, 
which can be expressed by having a property that is set in __init__.


I don’t have anything against making this a property, but I also don’t 
have anything against making it a _post_launch() parameter.  I could 
even live with both, i.e. set _qmp_timer to 15 in __init__, then have a 
_post_launch parameter, and pass either self._qmp_timer or None if 
self._wrapper includes 'gdbserver'.


What I do mind is that I don’t understand why the property is modified 
here.  The value of self._qmp_timer is supposed to be 15 by default and 
None if self._wrapper includes 'gdbserver'.  It should thus be changed 
to None the moment self._wrapper is made to include 'gdbserver'. 
Because self._wrapper is set only in __init__, this should happen in 
__init__.


What should __init__() do? The check here is to see if the invocation 
has gdb (and a couple of patches ahead also valgrind), to remove the timer.

If I understand what you mean, you want something like
def __init__(self, timer):


Oh, no.  We can optionally do that perhaps later, but what I meant is 
just to put this in __init__() (without adding any parameters to it):


self._qmp_timer = 15.0 if 'gdbserver' not in self._wrapper else None

I think self._qmp_timer should always reflect what timeout we are going 
to use when a VM is launched.  So if the conditions influencing the 
timeout change, it should be updated immediately to reflect this.  The 
only condition we have right now is the content of self._wrapper, which 
is only set in __init__, so self._qmp_timer should be set once in 
__init__ and not changed afterwards.


That sounds academic, but imagine what would happen if we had a 
set_qmp_timer() method: The timout could be adjusted, but launch() would 
just ignore it and update the property, even though the conditions 
influencing the timout didn’t change between set_qmp_timer() and launch().


Or if we had a get_qmp_timer(); a caller would read a timeout of 15.0 
before launch(), even though the timeout is going to be None.


Therefore, I think a property should not be updated just before it is 
read, but instead when any condition that’s supposed to influence its 
value changes.



I suggested making it a parameter because updating a property when 
reading it sounds like it should be a parameter instead.  I.e., one 
would say


def __init__():
self._qmp_timeout_default = 15.0

def post_launch(qmp_timeout):
self._qmp.accept(qmp_timeout)

def launch(self):
...
qmp_timeout = None if 'gdbserver' in self._wrapper \
   else self._qmp_timout_default
self.post_launch(qmp_timeout)


Which is basically the structure your patch has, which gave me the idea.

[...]


  self._post_launch()
  def _early_cleanup(self) -> None:
diff --git a/tests/qemu-iotests/iotests.py 
b/tests/qemu-iotests/iotests.py

index 05d0dc0751..380527245e 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py


[...]


@@ -684,6 +687,11 @@ def qmp_to_opts(self, obj):
  output_list += [key + '=' + obj[key]]
  return ','.join(output_list)
+    def get_qmp_events(self, wait: bool = False) -> List[QMPMessage]:
+    if qemu_gdb:
+    wait = 0.0


[...]



Second, I don’t understand this.  If the caller wants to block waiting 
on an event, then that should have nothing to do with whether we have 
gdb running or not.  As far as I understand, setting wait to 0.0 is 
the same as wait = False, i.e. we don’t block and just return None 
immediately if there is no pending event.


You'r

Re: [PATCH v3 04/15] qemu-iotests: add option to attach gdbserver

2021-05-03 Thread Max Reitz


On 30.04.21 23:03, Emanuele Giuseppe Esposito wrote:



On 30/04/2021 13:38, Max Reitz wrote:

On 14.04.21 19:03, Emanuele Giuseppe Esposito wrote:

Add -gdb flag and GDB_QEMU environmental variable
to python tests to attach a gdbserver to each qemu instance.


Well, this patch doesn’t do this, but OK.


Maybe "define" rather than "add"? In the sense of defining the "-gdb" 
option, which is what it actually does.


That’s possible, but I think better would be to contrast it with what 
this patch doesn’t do, but what one could think when reading this 
description.


I.e. to say “Add/define -gdb flag [...] to each qemu instance.  This 
patch only adds and parses this flag, it does not yet add the 
implementation for it.”


Out of interest: Why gdbserver and not “just” gdb?  On Fedora, those 
are separate packages, so I don’t have gdbserver installed, that’s why 
I’m asking.


As far as I have tried, using only gdb with ./check is very hard to use, 
because the stdout is filtered out by the script.
So invoking gdb attaches it to QEMU, but it is not possible to start 
execution (run command) or interact with it, because of the python 
script filtering. This leaves the test hanging.


gdbserver is just something that a gdb client can attach to (for 
example, in another console or even in another host) for example with 
the command
# gdb -iex "target remote localhost:12345" . This provides a nice and 
separate gdb monitor to the client.


All right.  I thought gdb could be used as a server, too, but...  Looks 
like it can’t.  (Like, I thought, you could do something like
“gdb -ex 'listen localhost:12345' $cmd”.  But seems like there is no 
server built into gdb proper.)


Max

Re: [PATCH 1/2] block: Fix Transaction leak in bdrv_root_attach_child()

2021-05-03 Thread Max Reitz


On 03.05.21 13:51, Vladimir Sementsov-Ogievskiy wrote:

03.05.2021 14:49, Max Reitz wrote:

On 03.05.21 13:05, Kevin Wolf wrote:

The error path needs to call tran_finalize(), too.

Fixes: CID 1452773
Fixes: 548a74c0dbc858edd1a7ee3045b5f2fe710bd8b1
Signed-off-by: Kevin Wolf 
---
  block.c | 7 ---
  1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/block.c b/block.c
index 874c22c43e..5c0ced6238 100644
--- a/block.c
+++ b/block.c
@@ -2918,13 +2918,14 @@ BdrvChild 
*bdrv_root_attach_child(BlockDriverState *child_bs,
 child_role, perm, shared_perm, 
opaque,

 , tran, errp);
  if (ret < 0) {
-    bdrv_unref(child_bs);
-    return NULL;
+    assert(child == NULL);
+    goto out;
  }
  ret = bdrv_refresh_perms(child_bs, errp);
-    tran_finalize(tran, ret);
+out:
+    tran_finalize(tran, ret);
  bdrv_unref(child_bs);
  return child;


Looks OK, so:

Reviewed-by: Max Reitz 

However, the function’s description says that it will return NULL on 
error.  But if bdrv_refresh_perms() fails, it will still return a 
non-NULL child.  Is that right?




No, it's reset to NULL on transaction abort, so code is correct. It's 
not obvious, and I've added a comment and assertion in my version of 
this fix "[PATCH 1/6] block: fix leak of tran in bdrv_root_attach_child"


The fact that the transaction keeps the pointer to this local variable 
around is a bit horrifying, but well.


Max

Re: [PATCH 1/2] block: Fix Transaction leak in bdrv_root_attach_child()

2021-05-03 Thread Max Reitz


On 03.05.21 13:05, Kevin Wolf wrote:

The error path needs to call tran_finalize(), too.

Fixes: CID 1452773
Fixes: 548a74c0dbc858edd1a7ee3045b5f2fe710bd8b1
Signed-off-by: Kevin Wolf 
---
  block.c | 7 ---
  1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/block.c b/block.c
index 874c22c43e..5c0ced6238 100644
--- a/block.c
+++ b/block.c
@@ -2918,13 +2918,14 @@ BdrvChild *bdrv_root_attach_child(BlockDriverState 
*child_bs,
 child_role, perm, shared_perm, opaque,
 , tran, errp);
  if (ret < 0) {
-bdrv_unref(child_bs);
-return NULL;
+assert(child == NULL);
+goto out;
  }
  
  ret = bdrv_refresh_perms(child_bs, errp);

-tran_finalize(tran, ret);
  
+out:

+tran_finalize(tran, ret);
  bdrv_unref(child_bs);
  return child;


Looks OK, so:

Reviewed-by: Max Reitz 

However, the function’s description says that it will return NULL on 
error.  But if bdrv_refresh_perms() fails, it will still return a 
non-NULL child.  Is that right?

[PATCH] block/snapshot: Clarify goto fallback behavior

2021-05-03 Thread Max Reitz

In the bdrv_snapshot_goto() fallback code, we work with a pointer to
either bs->file or bs->backing.  We close that child, close the node
(with .bdrv_close()), apply the snapshot on the child node, and then
re-open the node (with .bdrv_open()).

In order for .bdrv_open() to attach the same child node that we had
before, we pass "file={child-node}" or "backing={child-node}" to it.
Therefore, when .bdrv_open() has returned success, we can assume that
bs->file or bs->backing (respectively) points to our original child
again.  This is verified by an assertion.

All of this is not immediately clear from a quick glance at the code,
so add a comment to the assertion what it is for, and why it is valid.
It certainly confused Coverity.

Reported-by: Coverity (CID 1452774)
Signed-off-by: Max Reitz 
---
 block/snapshot.c | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/block/snapshot.c b/block/snapshot.c
index e8ae9a28c1..cce5e35b35 100644
--- a/block/snapshot.c
+++ b/block/snapshot.c
@@ -275,13 +275,16 @@ int bdrv_snapshot_goto(BlockDriverState *bs,
 qobject_unref(file_options);
 g_free(subqdict_prefix);
 
+/* Force .bdrv_open() below to re-attach fallback_bs on *fallback_ptr 
*/
 qdict_put_str(options, (*fallback_ptr)->name,
   bdrv_get_node_name(fallback_bs));
 
+/* Now close bs, apply the snapshot on fallback_bs, and re-open bs */
 if (drv->bdrv_close) {
 drv->bdrv_close(bs);
 }
 
+/* .bdrv_open() will re-attach it */
 bdrv_unref_child(bs, *fallback_ptr);
 *fallback_ptr = NULL;
 
@@ -296,7 +299,16 @@ int bdrv_snapshot_goto(BlockDriverState *bs,
 return ret < 0 ? ret : open_ret;
 }
 
-assert(fallback_bs == (*fallback_ptr)->bs);
+/*
+ * fallback_ptr is >file or >backing.  *fallback_ptr
+ * was closed above and set to NULL, but the .bdrv_open() call
+ * has opened it again, because we set the respective option
+ * (with the qdict_put_str() call above).
+ * Assert that .bdrv_open() has attached some child on
+ * *fallback_ptr, and that it has attached the one we wanted
+ * it to (i.e., fallback_bs).
+ */
+assert(*fallback_ptr && fallback_bs == (*fallback_ptr)->bs);
 bdrv_unref(fallback_bs);
 return ret;
 }
-- 
2.31.1

Re: [PULL 37/64] block/snapshot: Fix fallback

2021-05-03 Thread Max Reitz


On 03.05.21 11:40, Kevin Wolf wrote:

Am 01.05.2021 um 00:30 hat Peter Maydell geschrieben:

On Mon, 7 Sept 2020 at 12:11, Kevin Wolf  wrote:


From: Max Reitz 

If the top node's driver does not provide snapshot functionality and we
want to fall back to a node down the chain, we need to snapshot all
non-COW children.  For simplicity's sake, just do not fall back if there
is more than one such child.  Furthermore, we really only can fall back
to bs->file and bs->backing, because bdrv_snapshot_goto() has to modify
the child link (notably, set it to NULL).

Suggested-by: Vladimir Sementsov-Ogievskiy 
Signed-off-by: Max Reitz 
Reviewed-by: Andrey Shinkevich 
Reviewed-by: Kevin Wolf 


Hi; Coverity thinks it's found a problem with this code
(CID 1452774):


Cc: Max as the patch author


Yes, I’m writing a patch to add a comment.


@@ -205,39 +258,46 @@ int bdrv_snapshot_goto(BlockDriverState *bs,
  return ret;
  }

-if (bs->file) {
-BlockDriverState *file;
-QDict *options = qdict_clone_shallow(bs->options);
+fallback_ptr = bdrv_snapshot_fallback_ptr(bs);
+if (fallback_ptr) {
+QDict *options;
  QDict *file_options;
  Error *local_err = NULL;
+BlockDriverState *fallback_bs = (*fallback_ptr)->bs;
+char *subqdict_prefix = g_strdup_printf("%s.", (*fallback_ptr)->name);
+
+options = qdict_clone_shallow(bs->options);

-file = bs->file->bs;
  /* Prevent it from getting deleted when detached from bs */
-bdrv_ref(file);
+bdrv_ref(fallback_bs);

-qdict_extract_subqdict(options, _options, "file.");
+qdict_extract_subqdict(options, _options, subqdict_prefix);
  qobject_unref(file_options);
-qdict_put_str(options, "file", bdrv_get_node_name(file));
+g_free(subqdict_prefix);
+
+qdict_put_str(options, (*fallback_ptr)->name,
+  bdrv_get_node_name(fallback_bs));

  if (drv->bdrv_close) {
  drv->bdrv_close(bs);
  }
-bdrv_unref_child(bs, bs->file);
-bs->file = NULL;

-ret = bdrv_snapshot_goto(file, snapshot_id, errp);
+bdrv_unref_child(bs, *fallback_ptr);
+*fallback_ptr = NULL;


Here we set *fallback_ptr to NULL...


+
+ret = bdrv_snapshot_goto(fallback_bs, snapshot_id, errp);
  open_ret = drv->bdrv_open(bs, options, bs->open_flags, _err);
  qobject_unref(options);
  if (open_ret < 0) {
-bdrv_unref(file);
+bdrv_unref(fallback_bs);
  bs->drv = NULL;
  /* A bdrv_snapshot_goto() error takes precedence */
  error_propagate(errp, local_err);
  return ret < 0 ? ret : open_ret;
  }

-assert(bs->file->bs == file);
-bdrv_unref(file);
+assert(fallback_bs == (*fallback_ptr)->bs);


...but here we dereference *fallback_ptr, and Coverity doesn't see
anything that it recognizes as being able to change it.


+bdrv_unref(fallback_bs);
  return ret;
  }


False positive, or real issue? (If a false positive, a comment
explaining what's going on wouldn't go amiss -- as a human reader
I'm kind of confused about whether there's some kind of hidden
magic going on here.)


I think it's a false positive because drv->bdrv_open() is supposed to
give it a non-NULL value again. Not sure if we can make the assumption
in every case without checking it, but it feels reasonable to require
that drv->bdrv_open() would return failure otherwise. Max?


Yes.  I think it’s sensible to add an *fallback_ptr non-NULL check to 
the assert condition (i.e.,


assert(*fallback_ptr && fallback_bs == (*fallback_ptr)->bs);

), because the intention of the condition is already to verify that 
.bdrv_open() has opened the right node.  So we might say what’s missing 
is to also assert that it has opened any node at all, but if we’re fine 
with asserting that it has opened the right node (which we did since 
7a9e51198c24), we should definitely be fine with asserting that it has 
opened any node at all.


Max

Re: [PATCH v2] Document qemu-img options data_file and data_file_raw

2021-04-30 Thread Max Reitz


On 30.04.21 15:34, Connor Kuehl wrote:

The contents of this patch were initially developed and posted by Han
Han[1], however, it appears the original patch was not applied. Since
then, the relevant documentation has been moved and adapted to a new
format.

I've taken most of the original wording and tweaked it according to
some of the feedback from the original patch submission. I've also
adapted it to restructured text, which is the format the documentation
currently uses.

[1] https://lists.nongnu.org/archive/html/qemu-block/2019-10/msg01253.html

Fixes: https://bugzilla.redhat.com/1763105
Signed-off-by: Han Han 
Signed-off-by: Connor Kuehl 
---
Changes since v1:
   * Clarify different behaviors with these options when using qemu-img
 create vs amend (Max)
   * Touch on the negative case of how the file becomes inconsistent
 (John)

  docs/tools/qemu-img.rst | 20 
  1 file changed, 20 insertions(+)

diff --git a/docs/tools/qemu-img.rst b/docs/tools/qemu-img.rst
index c9efcfaefc..87b4a65535 100644
--- a/docs/tools/qemu-img.rst
+++ b/docs/tools/qemu-img.rst
@@ -866,6 +866,26 @@ Supported image file formats:
  issue ``lsattr filename`` to check if the NOCOW flag is set or not
  (Capital 'C' is NOCOW flag).
  
+  ``data_file``

+Filename where all guest data will be stored. If this option is used,
+the qcow2 file will only contain the image's metadata.
+
+Note: Data loss will occur if the given filename already exists when
+using this option with ``qemu-img create`` since ``qemu-img`` will create
+the data file anew, overwriting the file's original contents. To simply
+update the reference to point to the given pre-existing file, use
+``qemu-img amend``.
+
+  ``data_file_raw``
+If this option is set to ``on``, QEMU will always keep the external
+data file consistent as a standalone read-only raw image. It does
+this by forwarding updates through to the raw image in addition to
+updating the image metadata. If set to ``off``, QEMU will only
+update the image metadata without forwarding the changes through
+to the raw image. The default value is ``off``.


Hm, what updates and what changes?  I mean, the first part makes sense 
(the “It does this by...”), but the second part doesn’t.  qemu will 
still forward most writes to the data file.  (Not all, but most.)


(Also, nit pick: With data_file_raw=off, the data file is not a raw 
image.  (You still call it that in the penultimate sentence.))


When you write data to a qcow2 file with data_file, the data also goes 
to the data_file, most of the time.  The exception is when it can be 
handled with a metadata update, i.e. when it's a zero write or discard.


In addition, such updates (i.e. zero writes, I presume) not happening to 
the data file are usually a minor problem.  The real problem is that 
without data_file_raw, data clusters can be allocated anywhere in the 
data file, whereas with data_file_raw, they are allocated at their 
respective guest offset (i.e. the host offset always equals the guest 
offset).


I personally would have been fine with the first sentence, but if we 
want more of an explanation...  Perhaps:

Re: [PATCH v3 15/15] docs/devel/testing: add -p option to the debug section of QEMU iotests

2021-04-30 Thread Max Reitz


On 14.04.21 19:03, Emanuele Giuseppe Esposito wrote:

Signed-off-by: Emanuele Giuseppe Esposito 
---
  docs/devel/testing.rst | 4 
  1 file changed, 4 insertions(+)

diff --git a/docs/devel/testing.rst b/docs/devel/testing.rst
index 62902cfd2d..0c18fc4571 100644
--- a/docs/devel/testing.rst
+++ b/docs/devel/testing.rst
@@ -246,6 +246,10 @@ given as options to the ``check`` script:
  * ``-d`` (debug) just increases the logging verbosity, showing
for example the QMP commands and answers.
  
+* ``-p`` (print) allows QEMU binary stdout to be shown in the


stderr, too.


+  test console, instead of saving it into a log file in
+  ``$TEST_DIR/qemu-machine-``.
+


It doesn’t allow this, though, it forces it.  That means that tests that 
use the log will fail (e.g. 245)[1].  That is, I’d drop the “allows” and 
just state “redirect QEMU’s stdout and stderr to the test output, 
instead of...”.


[1] I realize that all tests will technically fail with -p, because the 
reference output differs, but 245 will not just fail because of that 
difference, but because two of its cases actually really fail.
No need to make a note of this, though.  It’s just that “allows” sounds 
a bit like qemu could choose to put some info into the test output and 
some into the log, when really it’s all or nothing (-p or not -p).


Max


  Test case groups

Re: [PATCH v3 14/15] qemu_iotests: add option to show qemu binary logs on stdout

2021-04-30 Thread Max Reitz

T_QEMU--  "{PRINT_QEMU}"


Again, I personally don’t like the quotes.

(I think “PRINT_QEMU -- y” and “PRINT_QEMU --” look better.)

Another thing: Here in this place, the name of the environment variable 
is visible.  “PRINT_QEMU” doesn’t really have meaning; now, if it 
weren’t visible, I wouldn’t care, but here it is visible.  Perhaps 
“PRINT_QEMU_OUTPUT”, or "SHOW_QEMU_OUTPUT" would mean more?


Well, it’s all bike-shedding, so with “is not None” and the add_argument 
line broken so it doesn’t exceed 79 characters:


Reviewed-by: Max Reitz 


  """
  
  args = collections.defaultdict(str, self.get_env())

Re: [PATCH v3 13/15] docs/devel/testing: add -valgrind option to the debug section of QEMU iotests

2021-04-30 Thread Max Reitz


On 14.04.21 19:03, Emanuele Giuseppe Esposito wrote:

Signed-off-by: Emanuele Giuseppe Esposito 
---
  docs/devel/testing.rst | 7 +++
  1 file changed, 7 insertions(+)

diff --git a/docs/devel/testing.rst b/docs/devel/testing.rst
index 2ee77a057b..62902cfd2d 100644
--- a/docs/devel/testing.rst
+++ b/docs/devel/testing.rst
@@ -236,6 +236,13 @@ given as options to the ``check`` script:
by setting the ``$GDB_QEMU`` environmental variable.
The final command line will be ``gdbserver $GDB_QEMU $QEMU ...``
  
+* ``-valgrind`` wraps a valgrind instance to QEMU. If it detects


Not a native speaker, but I think it should be “attaches ... to QEMU”, 
“wraps QEMU in a valgrind instances”, or “wraps ... around QEMU".


Apart from that:

Reviewed-by: Max Reitz 


+  warnings, it will print and save the log in
+  ``$TEST_DIR/.valgrind``.
+  The final command line will be ``valgrind --log-file=$TEST_DIR/
+  .valgrind --error-exitcode=99 $QEMU ...``
+  Note: if used together with ``-gdb``, this command will be ignored.
+
  * ``-d`` (debug) just increases the logging verbosity, showing
for example the QMP commands and answers.

Re: [PATCH v3 06/15] qemu_iotests: insert gdbserver command line as wrapper for qemu binary

2021-04-30 Thread Max Reitz


On 14.04.21 19:03, Emanuele Giuseppe Esposito wrote:

Signed-off-by: Emanuele Giuseppe Esposito 
---
  tests/qemu-iotests/iotests.py | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)


Reviewed-by: Max Reitz

Re: [PATCH v3 12/15] qemu_iotests: insert valgrind command line as wrapper for qemu binary

2021-04-30 Thread Max Reitz


On 14.04.21 19:03, Emanuele Giuseppe Esposito wrote:

The priority will be given to gdb command line, meaning if the -gdb
parameter and -valgrind are given, gdb will be wrapped around
the qemu binary.

Signed-off-by: Emanuele Giuseppe Esposito 
---
  tests/qemu-iotests/iotests.py | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)


Reviewed-by: Max Reitz

Re: [PATCH v3 11/15] qemu_iotests: allow valgrind to read/delete the generated log file

2021-04-30 Thread Max Reitz


On 14.04.21 19:03, Emanuele Giuseppe Esposito wrote:

When using -valgrind on the script tests, it generates a log file
in $TEST_DIR that is either read (if valgrind finds problems) or
otherwise deleted. Provide the same exact behavior when using
-valgrind on the python tests.

Signed-off-by: Emanuele Giuseppe Esposito 
---
  tests/qemu-iotests/iotests.py | 20 
  1 file changed, 20 insertions(+)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 94597433fa..aef67e3a86 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -600,6 +600,26 @@ def __init__(self, path_suffix=''):
   sock_dir=sock_dir)
  self._num_drives = 0
  
+def subprocess_check_valgrind(self, valgrind) -> None:


A type annotation would be nice.  (I.e. List[str], or we make it a bool 
and the caller uses bool(qemu_valgrind).)



+


I’d drop this empty line.


+if not valgrind:
+return
+
+valgrind_filename =  test_dir + "/" + str(self._popen.pid) + 
".valgrind"


mypy (iotest 297) complains that _popen is Optional[], so .pid might not 
exist.  Perhaps this should be safeguarded in the "if not valgrind" 
condition (i.e. "if not valgrind or not self._popen").


Also, pylint complains about the line being too long (79 is the maximum 
length).  Can be fixed with an f-string:


valgrind_filename = f"{test_dir}/{self._popen.pid}.valgrind"


+
+if self.exitcode() == 99:
+with open(valgrind_filename) as f:
+content = f.readlines()
+for line in content:
+print(line, end ="")


'end=""' would be better, I think.  (flake8 complains.)

Also, would this be better as:

with open(valgrind_filename) as f:
for line in f.readlines():
print(line, end="")

?

(Or just

with open(valgrind_filename) as f:
print(f.read())

– wouldn’t that work, too?)

Max


+print("")
+else:
+os.remove(valgrind_filename)
+
+def _post_shutdown(self) -> None:
+super()._post_shutdown()
+self.subprocess_check_valgrind(qemu_valgrind)
+
  def add_object(self, opts):
  self._args.append('-object')
  self._args.append(opts)

Re: [PATCH v3 10/15] qemu_iotests: extent QMP socket timeout when using valgrind

2021-04-30 Thread Max Reitz


On 14.04.21 19:03, Emanuele Giuseppe Esposito wrote:

As with gdbserver, valgrind delays the test execution, so
the default QMP socket timeout timeout too soon.


I’m curious: The default timeouts should be long enough for slow 
systems, too, though (e.g. heavily-loaded CI systems).  I would expect 
that valgrind is used on developer systems where there is more leeway, 
so the timeouts should still work.


But in practice, thinking about that doesn’t matter.  If valgrind leads 
to a timeout being hit, that wouldn’t be nice.  OTOH, if you run 
valgrind to debug a test/qemu, you don’t particularly care about the 
timeouts anyway.


So in principle, this patch sounds good to me, it’s just that it’s based 
on patch 5, which I don’t fully agree with.


Max


Signed-off-by: Emanuele Giuseppe Esposito 
---
  python/qemu/machine.py| 2 +-
  tests/qemu-iotests/iotests.py | 4 ++--
  2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/python/qemu/machine.py b/python/qemu/machine.py
index d6142271c2..dce96e1858 100644
--- a/python/qemu/machine.py
+++ b/python/qemu/machine.py
@@ -410,7 +410,7 @@ def _launch(self) -> None:
 shell=False,
 close_fds=False)
  
-if 'gdbserver' in self._wrapper:

+if 'gdbserver' in self._wrapper or 'valgrind' in self._wrapper:
  self._qmp_timer = None
  self._post_launch()
  
diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py

index a2e8604674..94597433fa 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -489,7 +489,7 @@ def log(msg: Msg,
  
  class Timeout:

  def __init__(self, seconds, errmsg="Timeout"):
-if qemu_gdb:
+if qemu_gdb or qemu_valgrind:
  self.seconds = 3000
  else:
  self.seconds = seconds
@@ -700,7 +700,7 @@ def qmp_to_opts(self, obj):
  return ','.join(output_list)
  
  def get_qmp_events(self, wait: bool = False) -> List[QMPMessage]:

-if qemu_gdb:
+if qemu_gdb or qemu_valgrind:
  wait = 0.0
  return super().get_qmp_events(wait=wait)

Re: [PATCH v3 05/15] qemu-iotests: delay QMP socket timers

2021-04-30 Thread Max Reitz


On 14.04.21 19:03, Emanuele Giuseppe Esposito wrote:

Attaching a gdbserver implies that the qmp socket
should wait indefinitely for an answer from QEMU.

Signed-off-by: Emanuele Giuseppe Esposito 
---
  python/qemu/machine.py|  3 +++
  tests/qemu-iotests/iotests.py | 10 +-
  2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/python/qemu/machine.py b/python/qemu/machine.py
index 12752142c9..d6142271c2 100644
--- a/python/qemu/machine.py
+++ b/python/qemu/machine.py
@@ -409,6 +409,9 @@ def _launch(self) -> None:
 stderr=subprocess.STDOUT,
 shell=False,
 close_fds=False)
+
+if 'gdbserver' in self._wrapper:
+self._qmp_timer = None


Why doesn’t __init__() evaluate this?  This here doesn’t feel like the 
right place for it.  If we want to evaluate it here, self._qmp_timer 
shouldn’t exist, and instead the timeout should be a _post_launch() 
parameter.  (Which I would have nothing against, by the way.)


Also, mypy complains that this variable is a float, so iotest 297 (which 
runs mypy) fails.



  self._post_launch()
  
  def _early_cleanup(self) -> None:

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 05d0dc0751..380527245e 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -478,7 +478,10 @@ def log(msg: Msg,
  
  class Timeout:

  def __init__(self, seconds, errmsg="Timeout"):
-self.seconds = seconds
+if qemu_gdb:
+self.seconds = 3000
+else:
+self.seconds = seconds


We might as well completely disable the timeout then, that would be more 
honest, I think.  (I.e. to make __enter__ and __exit__ no-ops.)



  self.errmsg = errmsg
  def __enter__(self):
  signal.signal(signal.SIGALRM, self.timeout)
@@ -684,6 +687,11 @@ def qmp_to_opts(self, obj):
  output_list += [key + '=' + obj[key]]
  return ','.join(output_list)
  
+def get_qmp_events(self, wait: bool = False) -> List[QMPMessage]:

+if qemu_gdb:
+wait = 0.0


First, this is a bool.  I can see that qmp.py expects a
Union[bool, float], but machine.py expects just a bool.  (Also, mypy 
complains that this specific `wait` here is a `bool`.  You can see that 
by running iotest 297.)


Second, I don’t understand this.  If the caller wants to block waiting 
on an event, then that should have nothing to do with whether we have 
gdb running or not.  As far as I understand, setting wait to 0.0 is the 
same as wait = False, i.e. we don’t block and just return None 
immediately if there is no pending event.


Max


+return super().get_qmp_events(wait=wait)
+
  def get_qmp_events_filtered(self, wait=60.0):
  result = []
  for ev in self.get_qmp_events(wait=wait):

Re: [PATCH v3 09/15] qemu_iotests: extend the check script to support valgrind for python tests

2021-04-30 Thread Max Reitz


On 14.04.21 19:03, Emanuele Giuseppe Esposito wrote:

Currently, the check script only parses the option and sets the
VALGRIND_QEMU environmental variable to "y".
Add another local python variable that prepares the command line,
identical to the one provided in the test scripts.

Because the python script does not know in advance the valgring
PID to assign to the log file name, use the "%p" flag in valgrind
log file name that automatically puts the process PID at runtime.

Signed-off-by: Emanuele Giuseppe Esposito 
---
  tests/qemu-iotests/check  |  7 ---
  tests/qemu-iotests/iotests.py | 11 +++
  tests/qemu-iotests/testenv.py |  1 +
  3 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/tests/qemu-iotests/check b/tests/qemu-iotests/check
index 6186495eee..489178d9a4 100755
--- a/tests/qemu-iotests/check
+++ b/tests/qemu-iotests/check
@@ -36,6 +36,10 @@ def make_argparser() -> argparse.ArgumentParser:
  p.add_argument('-gdb', action='store_true',
 help="start gdbserver with $GDB_QEMU options. \
   Default is localhost:12345")
+p.add_argument('-valgrind', action='store_true',
+help='use valgrind, sets VALGRIND_QEMU environment '
+'variable')
+
  p.add_argument('-misalign', action='store_true',
 help='misalign memory allocations')
  p.add_argument('--color', choices=['on', 'off', 'auto'],
@@ -86,9 +90,6 @@ def make_argparser() -> argparse.ArgumentParser:
  g_bash.add_argument('-o', dest='imgopts',
  help='options to pass to qemu-img create/convert, '
  'sets IMGOPTS environment variable')
-g_bash.add_argument('-valgrind', action='store_true',
-help='use valgrind, sets VALGRIND_QEMU environment '
-'variable')
  
  g_sel = p.add_argument_group('test selecting options',

   'The following options specify test set '
diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 4f3fb13915..a2e8604674 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -96,6 +96,17 @@
  sys.stderr.write('Please run this test via the "check" script\n')
  sys.exit(os.EX_USAGE)
  
+qemu_valgrind = []

+if os.environ.get('VALGRIND_QEMU') == "y" and \
+os.environ.get('NO_VALGRIND') != "y":
+valgrind_logfile = "--log-file=" + test_dir.strip()
+# %p allows to put the valgrind process PID, since
+# we don't know it a priori (subprocess.Peopen is


*Popen


+# not yet invoked)
+valgrind_logfile += "/%p.valgrind"
+
+qemu_valgrind = ['valgrind', valgrind_logfile, '--error-exitcode=99']
+
  socket_scm_helper = os.environ.get('SOCKET_SCM_HELPER', 'socket_scm_helper')
  
  luks_default_secret_object = 'secret,id=keysec0,data=' + \

diff --git a/tests/qemu-iotests/testenv.py b/tests/qemu-iotests/testenv.py
index e131ff42cb..39ae7ace33 100644
--- a/tests/qemu-iotests/testenv.py
+++ b/tests/qemu-iotests/testenv.py
@@ -277,6 +277,7 @@ def print_env(self) -> None:
  SOCK_DIR  -- {SOCK_DIR}
  SOCKET_SCM_HELPER -- {SOCKET_SCM_HELPER}
  GDB_QEMU  -- "{GDB_QEMU}"
+VALGRIND_QEMU -- "{VALGRIND_QEMU}"


I don’t think this should be enclosed in quotes (and neither should 
GDB_QEMU be).  AFAIU, only the *_PROG options are quoted to signify that 
they will be executed as a single program, whether they include spaces 
or not.  OTOH, GDB_QEMU is a list of options, so spaces act as 
separators, and VALGRIND_QEMU is just y or not set, so it’s impossible 
for spaces to be in there.


(Also, I think the “--” should be aligned to all the other “--”s.)

Bikeshedding, though, so:

Reviewed-by: Max Reitz 


  """
  
  args = collections.defaultdict(str, self.get_env())

Re: [PATCH v3 08/15] docs/devel/testing: add -gdb option to the debugging section of QEMU iotests

2021-04-30 Thread Max Reitz


On 14.04.21 19:03, Emanuele Giuseppe Esposito wrote:

Signed-off-by: Emanuele Giuseppe Esposito 
---
  docs/devel/testing.rst | 7 +++
  1 file changed, 7 insertions(+)

diff --git a/docs/devel/testing.rst b/docs/devel/testing.rst
index b7e2370e7e..2ee77a057b 100644
--- a/docs/devel/testing.rst
+++ b/docs/devel/testing.rst
@@ -229,6 +229,13 @@ Debugging a test case
  QEMU iotests offers some options to debug a failing test, that can be
  given as options to the ``check`` script:
  
+* ``-gdb`` wraps ``gdbsever`` to the QEMU binary,

+  so it is possible to connect to it via gdb.
+  One way to do so is via ``gdb -iex "target remote $GDB_QEMU"``
+  The default address is ``localhost:12345``, and can be changed
+  by setting the ``$GDB_QEMU`` environmental variable.


*environment variable


+  The final command line will be ``gdbserver $GDB_QEMU $QEMU ...``
+


I think the order in this explanation is ordered not quite right, 
because it uses $GDB_QEMU before explaining what it is.  (Also, I 
suppose $GDB_QEMU might contain other options than the socket address, 
so "target remote $GDB_QEMU" does not necessarily work.)  I’d 
reorder/change it to:


``-gdb`` wraps every QEMU invocation in a ``gdbserver``, which waits for 
a connection from a gdb client.  The options given to ``gdbserver`` 
(e.g. the address on which to listen for connections) are taken from the 
``$GDB_QEMU`` environment variable.  By default (if ``$GDB_QEMU`` is 
empty), it listens on ``localhost:12345``.
You can connect to it for example with ``gdb -iex "target remote 
$addr"``, where ``$addr`` is the address ``gdbserver`` listens on.


Max


  * ``-d`` (debug) just increases the logging verbosity, showing
for example the QMP commands and answers.

Re: [PATCH v3 07/15] qemu-iotests: add gdbserver option to script tests too

2021-04-30 Thread Max Reitz


On 14.04.21 19:03, Emanuele Giuseppe Esposito wrote:

The only limitation here is that running a script with gdbserver
will make the test output mismatch with the expected
results, making the test fail.

Signed-off-by: Emanuele Giuseppe Esposito 
---
  tests/qemu-iotests/common.rc | 8 +++-
  1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/common.rc b/tests/qemu-iotests/common.rc
index 65cdba5723..53a3310fee 100644
--- a/tests/qemu-iotests/common.rc
+++ b/tests/qemu-iotests/common.rc
@@ -166,8 +166,14 @@ _qemu_wrapper()
  if [ -n "${QEMU_NEED_PID}" ]; then
  echo $BASHPID > "${QEMU_TEST_DIR}/qemu-${_QEMU_HANDLE}.pid"
  fi
+
+GDB="${QEMU_PROG}"
+if [ ! -z ${GDB_QEMU} ]; then
+GDB="gdbserver ${GDB_QEMU} ${GDB}"
+fi
+
  VALGRIND_QEMU="${VALGRIND_QEMU_VM}" _qemu_proc_exec 
"${VALGRIND_LOGFILE}" \
-"$QEMU_PROG" $QEMU_OPTIONS "$@"
+   $GDB $QEMU_OPTIONS "$@"


This looks strange, because now there is no qemu here.

Why not just fill $GDB with the GDB options, keeping it empty if no gdb 
is to be used?  Then we’d have


VALGRIND_QEMU=... ... \
$GDB "$QEMU_PROG" $QEMU_OPTIONS "$@"

here.

(Also, the indentation changed.  I’d keep it aligned to four spaces.)

Max


  )
  RETVAL=$?
  _qemu_proc_valgrind_log "${VALGRIND_LOGFILE}" $RETVAL

Re: [PATCH v3 04/15] qemu-iotests: add option to attach gdbserver

2021-04-30 Thread Max Reitz


On 14.04.21 19:03, Emanuele Giuseppe Esposito wrote:

Add -gdb flag and GDB_QEMU environmental variable
to python tests to attach a gdbserver to each qemu instance.

if -gdb is not provided but $GDB_QEMU is set, ignore the
environmental variable.

Signed-off-by: Emanuele Giuseppe Esposito 
---
  tests/qemu-iotests/check  |  6 +-
  tests/qemu-iotests/iotests.py |  4 
  tests/qemu-iotests/testenv.py | 15 ---
  3 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/tests/qemu-iotests/check b/tests/qemu-iotests/check
index d1c87ceaf1..6186495eee 100755
--- a/tests/qemu-iotests/check
+++ b/tests/qemu-iotests/check
@@ -33,6 +33,9 @@ def make_argparser() -> argparse.ArgumentParser:
 help='pretty print output for make check')
  
  p.add_argument('-d', dest='debug', action='store_true', help='debug')

+p.add_argument('-gdb', action='store_true',
+   help="start gdbserver with $GDB_QEMU options. \
+ Default is localhost:12345")
  p.add_argument('-misalign', action='store_true',
 help='misalign memory allocations')
  p.add_argument('--color', choices=['on', 'off', 'auto'],
@@ -112,7 +115,8 @@ if __name__ == '__main__':
  env = TestEnv(imgfmt=args.imgfmt, imgproto=args.imgproto,
aiomode=args.aiomode, cachemode=args.cachemode,
imgopts=args.imgopts, misalign=args.misalign,
-  debug=args.debug, valgrind=args.valgrind)
+  debug=args.debug, valgrind=args.valgrind,
+  gdb=args.gdb)
  
  testfinder = TestFinder(test_dir=env.source_iotests)
  
diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py

index 90d0b62523..05d0dc0751 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -75,6 +75,10 @@
  qemu_prog = os.environ.get('QEMU_PROG', 'qemu')
  qemu_opts = os.environ.get('QEMU_OPTIONS', '').strip().split(' ')
  
+qemu_gdb = []

+if os.environ.get('GDB_QEMU'):
+qemu_gdb = ['gdbserver'] + os.environ.get('GDB_QEMU').strip().split(' ')


os.environ.get('GDB_QEMU') returns an Option[str], so mypy complains 
about the .strip() (thus failing iotest 297).


(That can be fixed for example by either using os.environ['GDB_QEMU'] 
here, like most other places here do, or by something like:


gdb_qemu_env = os.environ.get('GDB_QEMU')
if gdb_qemu_env:
qemu_gdb = ['gdbserver'] + gdb_qemu_env.strip().split(' ')
)

Max


+
  imgfmt = os.environ.get('IMGFMT', 'raw')
  imgproto = os.environ.get('IMGPROTO', 'file')
  output_dir = os.environ.get('OUTPUT_DIR', '.')
diff --git a/tests/qemu-iotests/testenv.py b/tests/qemu-iotests/testenv.py
index 1fbec854c1..e131ff42cb 100644
--- a/tests/qemu-iotests/testenv.py
+++ b/tests/qemu-iotests/testenv.py
@@ -72,7 +72,8 @@ class TestEnv(ContextManager['TestEnv']):
   'QEMU_NBD_OPTIONS', 'IMGOPTS', 'IMGFMT', 'IMGPROTO',
   'AIOMODE', 'CACHEMODE', 'VALGRIND_QEMU',
   'CACHEMODE_IS_DEFAULT', 'IMGFMT_GENERIC', 
'IMGOPTSSYNTAX',
- 'IMGKEYSECRET', 'QEMU_DEFAULT_MACHINE', 'MALLOC_PERTURB_']
+ 'IMGKEYSECRET', 'QEMU_DEFAULT_MACHINE', 'MALLOC_PERTURB_',
+ 'GDB_QEMU']
  
  def get_env(self) -> Dict[str, str]:

  env = {}
@@ -163,7 +164,8 @@ def __init__(self, imgfmt: str, imgproto: str, aiomode: str,
   imgopts: Optional[str] = None,
   misalign: bool = False,
   debug: bool = False,
- valgrind: bool = False) -> None:
+ valgrind: bool = False,
+ gdb: bool = False) -> None:
  self.imgfmt = imgfmt
  self.imgproto = imgproto
  self.aiomode = aiomode
@@ -171,6 +173,11 @@ def __init__(self, imgfmt: str, imgproto: str, aiomode: 
str,
  self.misalign = misalign
  self.debug = debug
  
+if gdb:

+self.gdb_qemu = os.environ.get('GDB_QEMU', 'localhost:12345')
+elif 'GDB_QEMU' in os.environ:
+del os.environ['GDB_QEMU']
+
  if valgrind:
  self.valgrind_qemu = 'y'
  
@@ -268,7 +275,9 @@ def print_env(self) -> None:

  PLATFORM  -- {platform}
  TEST_DIR  -- {TEST_DIR}
  SOCK_DIR  -- {SOCK_DIR}
-SOCKET_SCM_HELPER -- {SOCKET_SCM_HELPER}"""
+SOCKET_SCM_HELPER -- {SOCKET_SCM_HELPER}
+GDB_QEMU  -- "{GDB_QEMU}"
+"""
  
  args = collections.defaultdict(str, self.get_env())

Re: [PATCH v3 04/15] qemu-iotests: add option to attach gdbserver

2021-04-30 Thread Max Reitz


On 14.04.21 19:03, Emanuele Giuseppe Esposito wrote:

Add -gdb flag and GDB_QEMU environmental variable
to python tests to attach a gdbserver to each qemu instance.


Well, this patch doesn’t do this, but OK.

Out of interest: Why gdbserver and not “just” gdb?  On Fedora, those are 
separate packages, so I don’t have gdbserver installed, that’s why I’m 
asking.


(I’ve also never used gdbserver before.  From what I can tell, it’s 
basically just a limited version of gdb so it only serves as a server.)



if -gdb is not provided but $GDB_QEMU is set, ignore the
environmental variable.


s/environmental/environment/



Signed-off-by: Emanuele Giuseppe Esposito 
---
  tests/qemu-iotests/check  |  6 +-
  tests/qemu-iotests/iotests.py |  4 
  tests/qemu-iotests/testenv.py | 15 ---
  3 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/tests/qemu-iotests/check b/tests/qemu-iotests/check
index d1c87ceaf1..6186495eee 100755
--- a/tests/qemu-iotests/check
+++ b/tests/qemu-iotests/check
@@ -33,6 +33,9 @@ def make_argparser() -> argparse.ArgumentParser:
 help='pretty print output for make check')
  
  p.add_argument('-d', dest='debug', action='store_true', help='debug')

+p.add_argument('-gdb', action='store_true',
+   help="start gdbserver with $GDB_QEMU options. \
+ Default is localhost:12345")


That makes it sound like this were the default for the `-gdb` option. 
Since `-gdb` is just a switch, it doesn’t have a default (apart from 
being off by default).


So I’d rephrase this at least to “The default options are 
'localhost:12345'”.  Or maybe “start gdbserver with $GDB_QEMU options 
('localhost:12345' if $GDB_QEMU is empty)”.


Also, $GDB_QEMU as a name is a bit strange, because it does not specify 
which gdb to use; it just gives the options to use for gdb. 
$GDB_QEMU_OPTIONS would be more in line with the naming of the rest of 
the environment variables (or just $GDB_OPTIONS).


Max


  p.add_argument('-misalign', action='store_true',
 help='misalign memory allocations')
  p.add_argument('--color', choices=['on', 'off', 'auto'],

Re: [PATCH v3 03/15] docs/devel/testing: add debug section to the QEMU iotests chapter

2021-04-30 Thread Max Reitz


On 14.04.21 19:03, Emanuele Giuseppe Esposito wrote:

Introduce the "Debugging a test case" section, in preparation
to the additional flags that will be added in the next patches.

Signed-off-by: Emanuele Giuseppe Esposito 
---
  docs/devel/testing.rst | 8 
  1 file changed, 8 insertions(+)

diff --git a/docs/devel/testing.rst b/docs/devel/testing.rst
index 1434a50cc4..b7e2370e7e 100644
--- a/docs/devel/testing.rst
+++ b/docs/devel/testing.rst
@@ -224,6 +224,14 @@ another application on the host may have locked the file, 
possibly leading to a
  test failure.  If using such devices are explicitly desired, consider adding
  ``locking=off`` option to disable image locking.
  
+Debugging a test case

+---
+QEMU iotests offers some options to debug a failing test, that can be


-,

Max


+given as options to the ``check`` script:
+
+* ``-d`` (debug) just increases the logging verbosity, showing
+  for example the QMP commands and answers.
+
  Test case groups

Re: [PATCH v3 02/15] python: qemu: pass the wrapper field from QEMUQtestmachine to QEMUMachine

2021-04-30 Thread Max Reitz


On 14.04.21 19:03, Emanuele Giuseppe Esposito wrote:

Signed-off-by: Emanuele Giuseppe Esposito 
---
  python/qemu/qtest.py | 4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/python/qemu/qtest.py b/python/qemu/qtest.py
index 39a0cf62fe..c18eae96c6 100644
--- a/python/qemu/qtest.py
+++ b/python/qemu/qtest.py
@@ -111,6 +111,7 @@ class QEMUQtestMachine(QEMUMachine):
  def __init__(self,
   binary: str,
   args: Sequence[str] = (),
+ wrapper: Sequence[str] = (),
   name: Optional[str] = None,
   test_dir: str = "/var/tmp",
   socket_scm_helper: Optional[str] = None,
@@ -119,7 +120,8 @@ def __init__(self,
  name = "qemu-%d" % os.getpid()
  if sock_dir is None:
  sock_dir = test_dir
-super().__init__(binary, args, name=name, test_dir=test_dir,
+super().__init__(binary, args, wrapper=wrapper, name=name,
+ test_dir=test_dir,
   socket_scm_helper=socket_scm_helper,
   sock_dir=sock_dir)
  self._qtest: Optional[QEMUQtestProtocol] = None


Reviewed-by: Max Reitz

Re: [PATCH v3 01/15] python: qemu: add timer parameter for qmp.accept socket

2021-04-30 Thread Max Reitz


On 14.04.21 19:03, Emanuele Giuseppe Esposito wrote:

Add a new _qmp_timer field to the QEMUMachine class.
The default timer is 15 sec, as per the default in the
qmp accept() function.

Signed-off-by: Emanuele Giuseppe Esposito 
---
  python/qemu/machine.py | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)


Reviewed-by: Max Reitz 

(I would have preferred for the commit message to tell me why we want 
this (I suspect some later patch is going to use it), but oh well.)

Re: [PATCH] mirror: stop cancelling in-flight requests on non-force cancel in READY

2021-04-30 Thread Max Reitz


On 21.04.21 09:58, Vladimir Sementsov-Ogievskiy wrote:

If mirror is READY than cancel operation is not discarding the whole
result of the operation, but instead it's a documented way get a
point-in-time snapshot of source disk.

So, we should not cancel any requests if mirror is READ and
force=false. Let's fix that case.

Note, that bug that we have before this commit is not critical, as the
only .bdrv_cancel_in_flight implementation is nbd_cancel_in_flight()
and it cancels only requests waiting for reconnection, so it should be
rare case.

Fixes: 521ff8b779b11c394dbdc43f02e158dd99df308a
Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  include/block/block_int.h | 2 +-
  include/qemu/job.h| 2 +-
  block/backup.c| 2 +-
  block/mirror.c| 6 --
  job.c | 2 +-
  tests/qemu-iotests/264| 2 +-
  6 files changed, 9 insertions(+), 7 deletions(-)


Thanks, applied to my block branch:

https://github.com/XanClic/qemu/commits/block

Max

Re: [PATCH] block: simplify write-threshold and drop write notifiers

2021-04-30 Thread Max Reitz


On 22.04.21 00:09, Vladimir Sementsov-Ogievskiy wrote:

write-notifiers are used only for write-threshold. New code for such
purpose should create filters.

Let's handle write-threshold simply in generic code and drop write
notifiers at all.

Also move part of write-threshold API that is used only for testing to
the test.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---

I agree that this could be split into 2-3 parts and not combining
everything into one. But I'm tired now.


Er...  You should have put it off until the next day then? O:)

It should be multiple patches.  At least one to move the write threshold 
update to block/io.c, and then another to drop write notifiers.



I can send v2 if needed, so
consider it as RFC. Or take as is if you think it's not too much-in-one.

I also suggest this as a prepartion (and partly substitution) for
"[PATCH v2 0/8] Block layer thread-safety, continued"

  include/block/block_int.h | 12 -
  include/block/write-threshold.h   | 24 -
  block.c   |  1 -
  block/io.c| 21 +---
  block/write-threshold.c   | 87 ++-
  tests/unit/test-write-threshold.c | 38 ++
  6 files changed, 54 insertions(+), 129 deletions(-)


[...]


diff --git a/block/io.c b/block/io.c
index ca2dca3007..e0aa775952 100644
--- a/block/io.c
+++ b/block/io.c
@@ -36,6 +36,8 @@
  #include "qemu/main-loop.h"
  #include "sysemu/replay.h"
  
+#include "qapi/qapi-events-block-core.h"

+
  /* Maximum bounce buffer for copy-on-read and write zeroes, in bytes */
  #define MAX_BOUNCE_BUFFER (32768 << BDRV_SECTOR_BITS)
  
@@ -1974,6 +1976,8 @@ bdrv_co_write_req_prepare(BdrvChild *child, int64_t offset, int64_t bytes,

 child->perm & BLK_PERM_RESIZE);
  
  switch (req->type) {

+uint64_t write_threshold;
+


Works, but I think this is the first time I see a variable declared in a 
switch block.  What I usually do for such cases is to put a block after 
the label.  (i.e. case X: { uint64_t write_threshold; ... })


But it wouldn’t hurt to just declare this at the beginning of 
bdrv_co_write_req_prepare(), I think.



  case BDRV_TRACKED_WRITE:
  case BDRV_TRACKED_DISCARD:
  if (flags & BDRV_REQ_WRITE_UNCHANGED) {
@@ -1981,8 +1985,15 @@ bdrv_co_write_req_prepare(BdrvChild *child, int64_t 
offset, int64_t bytes,
  } else {
  assert(child->perm & BLK_PERM_WRITE);
  }
-return notifier_with_return_list_notify(>before_write_notifiers,
-req);
+write_threshold = qatomic_read(>write_threshold_offset);
+if (write_threshold > 0 && offset + bytes > write_threshold) {
+qapi_event_send_block_write_threshold(
+bs->node_name,
+offset + bytes - write_threshold,
+write_threshold);
+qatomic_set(>write_threshold_offset, 0);
+}


I’d put all of this into a function in block/write-threshold.c that’s 
called from here.


Max


+return 0;
  case BDRV_TRACKED_TRUNCATE:
  assert(child->perm & BLK_PERM_RESIZE);
  return 0;
@@ -3137,12 +3148,6 @@ bool bdrv_qiov_is_aligned(BlockDriverState *bs, 
QEMUIOVector *qiov)
  return true;
  }
  
-void bdrv_add_before_write_notifier(BlockDriverState *bs,

-NotifierWithReturn *notifier)
-{
-notifier_with_return_list_add(>before_write_notifiers, notifier);
-}
-
  void bdrv_io_plug(BlockDriverState *bs)
  {
  BdrvChild *child;

Re: [PATCH v2] monitor: hmp_qemu_io: acquire aio contex, fix crash

2021-04-30 Thread Max Reitz


On 23.04.21 15:42, Vladimir Sementsov-Ogievskiy wrote:

Max reported the following bug:

$ ./qemu-img create -f raw src.img 1G
$ ./qemu-img create -f raw dst.img 1G

$ (echo '
{"execute":"qmp_capabilities"}
{"execute":"blockdev-mirror",
 "arguments":{"job-id":"mirror",
  "device":"source",
  "target":"target",
  "sync":"full",
  "filter-node-name":"mirror-top"}}
'; sleep 3; echo'
{"execute":"human-monitor-command",
 "arguments":{"command-line":
  "qemu-io mirror-top \"write 0 1G\""}}') \
| x86_64-softmmu/qemu-system-x86_64 \
-qmp stdio \
-blockdev file,node-name=source,filename=src.img \
-blockdev file,node-name=target,filename=dst.img \
-object iothread,id=iothr0 \
-device virtio-blk,drive=source,iothread=iothr0

crashes:

0  raise () at /usr/lib/libc.so.6
1  abort () at /usr/lib/libc.so.6
2  error_exit
(err=,
msg=msg@entry=0x55fbb1634790 <__func__.27> "qemu_mutex_unlock_impl")
at ../util/qemu-thread-posix.c:37
3  qemu_mutex_unlock_impl
(mutex=mutex@entry=0x55fbb25ab6e0,
file=file@entry=0x55fbb1636957 "../util/async.c",
line=line@entry=650)
at ../util/qemu-thread-posix.c:109
4  aio_context_release (ctx=ctx@entry=0x55fbb25ab680) at ../util/async.c:650
5  bdrv_do_drained_begin
(bs=bs@entry=0x55fbb3a87000, recursive=recursive@entry=false,
parent=parent@entry=0x0,
ignore_bds_parents=ignore_bds_parents@entry=false,
poll=poll@entry=true) at ../block/io.c:441
6  bdrv_do_drained_begin
(poll=true, ignore_bds_parents=false, parent=0x0, recursive=false,
bs=0x55fbb3a87000) at ../block/io.c:448
7  blk_drain (blk=0x55fbb26c5a00) at ../block/block-backend.c:1718
8  blk_unref (blk=0x55fbb26c5a00) at ../block/block-backend.c:498
9  blk_unref (blk=0x55fbb26c5a00) at ../block/block-backend.c:491
10 hmp_qemu_io (mon=0x7fffaf3fc7d0, qdict=)
at ../block/monitor/block-hmp-cmds.c:628

man pthread_mutex_unlock
...
 EPERM  The  mutex type is PTHREAD_MUTEX_ERRORCHECK or
 PTHREAD_MUTEX_RECURSIVE, or the mutex is a robust mutex, and the
 current thread does not own the mutex.

So, thread doesn't own the mutex. And we have iothread here.

Next, note that AIO_WAIT_WHILE() documents that ctx must be acquired
exactly once by caller. But where is it acquired in the call stack?
Seems nowhere.

qemuio_command do acquire aio context.. But we need context acquired
around blk_unref() as well and actually around blk_insert_bs() too.

Let's refactor qemuio_command so that it doesn't acquire aio context
but callers do that instead. This way we can cleanly acquire aio
context in hmp_qemu_io() around all three calls.

Reported-by: Max Reitz 
Signed-off-by: Vladimir Sementsov-Ogievskiy 
---

v2: rewrite, to cleanly acquire aio-context around all needed things in
 hmp_qemu_io

  block/monitor/block-hmp-cmds.c | 31 +--
  qemu-io-cmds.c |  8 
  qemu-io.c  | 17 +++--
  3 files changed, 40 insertions(+), 16 deletions(-)


Thanks, applied to my block branch:

https://github.com/XanClic/qemu/commits/block

(With the change below:)


diff --git a/qemu-io-cmds.c b/qemu-io-cmds.c
index 97611969cb..19149e014d 100644
--- a/qemu-io-cmds.c
+++ b/qemu-io-cmds.c
@@ -2457,9 +2457,12 @@ static const cmdinfo_t help_cmd = {
  .oneline= "help for one or all commands",
  };
  
+/*

+ * Called with aio context of blk acquired. Or with qemu_get_aio_context()
+ * context acquired if no blk is NULL.


-no


+ */
  int qemuio_command(BlockBackend *blk, const char *cmd)
  {
-AioContext *ctx;
  char *input;
  const cmdinfo_t *ct;
  char **v;

Re: [PATCH 1/2] block/export: Free ignored Error

2021-04-26 Thread Max Reitz


On 26.04.21 11:44, Vladimir Sementsov-Ogievskiy wrote:

22.04.2021 17:53, Max Reitz wrote:

When invoking block-export-add with some iothread and
fixed-iothread=false, and changing the node's iothread fails, the error
is supposed to be ignored.

However, it is still stored in *errp, which is wrong.  If a second error
occurs, the "*errp must be NULL" assertion in error_setv() fails:

   qemu-system-x86_64: ../util/error.c:59: error_setv: Assertion
   `*errp == NULL' failed.

So the error from bdrv_try_set_aio_context() must be freed when it is
ignored.

Fixes: f51d23c80af73c95e0ce703ad06a300f1b3d63ef
    ("block/export: add iothread and fixed-iothread options")
Signed-off-by: Max Reitz 
---
  block/export/export.c | 4 
  1 file changed, 4 insertions(+)

diff --git a/block/export/export.c b/block/export/export.c
index fec7d9f738..ce5dd3e59b 100644
--- a/block/export/export.c
+++ b/block/export/export.c
@@ -68,6 +68,7 @@ static const BlockExportDriver 
*blk_exp_find_driver(BlockExportType type)

  BlockExport *blk_exp_add(BlockExportOptions *export, Error **errp)
  {
+    ERRP_GUARD();
  bool fixed_iothread = export->has_fixed_iothread && 
export->fixed_iothread;

  const BlockExportDriver *drv;
  BlockExport *exp = NULL;
@@ -127,6 +128,9 @@ BlockExport *blk_exp_add(BlockExportOptions 
*export, Error **errp)

  ctx = new_ctx;
  } else if (fixed_iothread) {
  goto fail;
+    } else {
+    error_free(*errp);
+    *errp = NULL;
  }
  }



I don't think ERRP_GUARD is needed in this case: we don't need to handle 
errp somehow except for just free if it was set.


Perhaps not, but style-wise, I prefer not special-casing the
errp == NULL case.

(It can be argued that ERRP_GUARD similarly special-cases it, but that’s 
hidden from my view.  Also, the errp == NULL case actually doesn’t even 
happen, so ERRP_GUARD is effectively a no-op and it won’t cost 
performance (not that it really matters).)


Of course we could also do this:

ret = bdrv_try_set_aio_context(bs, new_ctx, fixed_iothread ? errp : NULL);

Would be even shorter.


So we can simply do:

} else if (errp) {
    error_free(*errp);
    *errp = NULL;
}

Let's only check that errp is really set on failure path of 
bdrv_try_set_aio_context():


OK,  but out of interest, why?  error_free() doesn’t care.  I mean it 
might be a problem if blk_exp_add() returns an error without setting 
*errp, but that’d’ve been pre-existing.


Max

bdrv_try_set_aio_context() fails iff bdrv_can_set_aio_context() fails, 
which in turn may fail iff bdrv_parent_can_set_aio_context() or 
bdrv_child_can_set_aio_context() fails.


bdrv_parent_can_set_aio_context() has two failure path, on first it set 
errp by hand, and on second it has assertion that errp is set.


bdrv_child_can_set_aio_context() may fail only if nested call to 
bdrv_can_set_aio_context() fails, so recursion is closed.

[PATCH 2/2] iotests/307: Test iothread conflict for exports

2021-04-22 Thread Max Reitz

Passing fixed-iothread=true should make iothread conflicts fatal,
whereas fixed-iothread=false should not.

Combine the second case with an error condition that is checked after
the iothread is handled, to verify that qemu does not crash if there is
such an error after changing the iothread failed.

Signed-off-by: Max Reitz 
---
 tests/qemu-iotests/307 | 15 +++
 tests/qemu-iotests/307.out |  8 
 2 files changed, 23 insertions(+)

diff --git a/tests/qemu-iotests/307 b/tests/qemu-iotests/307
index c7685347bc..b429b5aa50 100755
--- a/tests/qemu-iotests/307
+++ b/tests/qemu-iotests/307
@@ -41,9 +41,11 @@ with iotests.FilePath('image') as img, \
 iotests.log('=== Launch VM ===')
 
 vm.add_object('iothread,id=iothread0')
+vm.add_object('iothread,id=iothread1')
 vm.add_blockdev(f'file,filename={img},node-name=file')
 vm.add_blockdev(f'{iotests.imgfmt},file=file,node-name=fmt')
 vm.add_blockdev('raw,file=file,node-name=ro,read-only=on')
+vm.add_blockdev('null-co,node-name=null')
 vm.add_device(f'id=scsi0,driver=virtio-scsi,iothread=iothread0')
 vm.launch()
 
@@ -74,6 +76,19 @@ with iotests.FilePath('image') as img, \
 vm.qmp_log('query-block-exports')
 iotests.qemu_nbd_list_log('-k', socket)
 
+iotests.log('\n=== Add export with conflicting iothread ===')
+
+vm.qmp_log('device_add', id='sdb', driver='scsi-hd', drive='null')
+
+# Should fail because of fixed-iothread
+vm.qmp_log('block-export-add', id='export1', type='nbd', node_name='null',
+   iothread='iothread1', fixed_iothread=True, writable=True)
+
+# Should ignore the iothread conflict, but then fail because of the
+# permission conflict (and not crash)
+vm.qmp_log('block-export-add', id='export1', type='nbd', node_name='null',
+   iothread='iothread1', fixed_iothread=False, writable=True)
+
 iotests.log('\n=== Add a writable export ===')
 
 # This fails because share-rw=off
diff --git a/tests/qemu-iotests/307.out b/tests/qemu-iotests/307.out
index daa8ad2da0..69439d96ff 100644
--- a/tests/qemu-iotests/307.out
+++ b/tests/qemu-iotests/307.out
@@ -51,6 +51,14 @@ exports available: 1
base:allocation
 
 
+=== Add export with conflicting iothread ===
+{"execute": "device_add", "arguments": {"drive": "null", "driver": "scsi-hd", 
"id": "sdb"}}
+{"return": {}}
+{"execute": "block-export-add", "arguments": {"fixed-iothread": true, "id": 
"export1", "iothread": "iothread1", "node-name": "null", "type": "nbd", 
"writable": true}}
+{"error": {"class": "GenericError", "desc": "Cannot change iothread of active 
block backend"}}
+{"execute": "block-export-add", "arguments": {"fixed-iothread": false, "id": 
"export1", "iothread": "iothread1", "node-name": "null", "type": "nbd", 
"writable": true}}
+{"error": {"class": "GenericError", "desc": "Conflicts with use by sdb as 
'root', which does not allow 'write' on null"}}
+
 === Add a writable export ===
 {"execute": "block-export-add", "arguments": {"description": "This is the 
writable second export", "id": "export1", "name": "export1", "node-name": 
"fmt", "type": "nbd", "writable": true, "writethrough": true}}
 {"error": {"class": "GenericError", "desc": "Conflicts with use by sda as 
'root', which does not allow 'write' on fmt"}}
-- 
2.30.2

[PATCH 0/2] block/export: Fix crash on error after iothread conflict

2021-04-22 Thread Max Reitz

Hi,

By passing the @iothread option to block-export-add, the new export can
be moved to the given iothread.  This may conflict with an existing
parent of the node in question.  How this conflict is resolved, depends
on @fixed-iothread: If that option is true, the error is fatal and
block-export-add fails.  If it is false, the error is ignored and the
node stays in its original iothread.

However, in the implementation, the ignored error is still in *errp, and
so if a second error occurs afterwards and tries to put something into
*errp, that will fail an assertion.

To really ignore the error, we have to free it and clear *errp (with an
ERRP_GUARD()).

Patch 1 is the fix, patch 2 a regression test.


Max Reitz (2):
  block/export: Free ignored Error
  iotests/307: Test iothread conflict for exports

 block/export/export.c  |  4 
 tests/qemu-iotests/307 | 15 +++
 tests/qemu-iotests/307.out |  8 
 3 files changed, 27 insertions(+)

-- 
2.30.2

[PATCH 1/2] block/export: Free ignored Error

2021-04-22 Thread Max Reitz

When invoking block-export-add with some iothread and
fixed-iothread=false, and changing the node's iothread fails, the error
is supposed to be ignored.

However, it is still stored in *errp, which is wrong.  If a second error
occurs, the "*errp must be NULL" assertion in error_setv() fails:

  qemu-system-x86_64: ../util/error.c:59: error_setv: Assertion
  `*errp == NULL' failed.

So the error from bdrv_try_set_aio_context() must be freed when it is
ignored.

Fixes: f51d23c80af73c95e0ce703ad06a300f1b3d63ef
   ("block/export: add iothread and fixed-iothread options")
Signed-off-by: Max Reitz 
---
 block/export/export.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/block/export/export.c b/block/export/export.c
index fec7d9f738..ce5dd3e59b 100644
--- a/block/export/export.c
+++ b/block/export/export.c
@@ -68,6 +68,7 @@ static const BlockExportDriver 
*blk_exp_find_driver(BlockExportType type)
 
 BlockExport *blk_exp_add(BlockExportOptions *export, Error **errp)
 {
+ERRP_GUARD();
 bool fixed_iothread = export->has_fixed_iothread && export->fixed_iothread;
 const BlockExportDriver *drv;
 BlockExport *exp = NULL;
@@ -127,6 +128,9 @@ BlockExport *blk_exp_add(BlockExportOptions *export, Error 
**errp)
 ctx = new_ctx;
 } else if (fixed_iothread) {
 goto fail;
+} else {
+error_free(*errp);
+*errp = NULL;
 }
 }
 
-- 
2.30.2

Re: [PATCH] monitor: hmp_qemu_io: acquire aio contex, fix crash

2021-04-22 Thread Max Reitz


On 21.04.21 10:32, Vladimir Sementsov-Ogievskiy wrote:

Max reported the following bug:

$ ./qemu-img create -f raw src.img 1G
$ ./qemu-img create -f raw dst.img 1G

$ (echo '
{"execute":"qmp_capabilities"}
{"execute":"blockdev-mirror",
 "arguments":{"job-id":"mirror",
  "device":"source",
  "target":"target",
  "sync":"full",
  "filter-node-name":"mirror-top"}}
'; sleep 3; echo'
{"execute":"human-monitor-command",
 "arguments":{"command-line":
  "qemu-io mirror-top \"write 0 1G\""}}') \
| x86_64-softmmu/qemu-system-x86_64 \
-qmp stdio \
-blockdev file,node-name=source,filename=src.img \
-blockdev file,node-name=target,filename=dst.img \
-object iothread,id=iothr0 \
-device virtio-blk,drive=source,iothread=iothr0

crashes:

0  raise () at /usr/lib/libc.so.6
1  abort () at /usr/lib/libc.so.6
2  error_exit
(err=,
msg=msg@entry=0x55fbb1634790 <__func__.27> "qemu_mutex_unlock_impl")
at ../util/qemu-thread-posix.c:37
3  qemu_mutex_unlock_impl
(mutex=mutex@entry=0x55fbb25ab6e0,
file=file@entry=0x55fbb1636957 "../util/async.c",
line=line@entry=650)
at ../util/qemu-thread-posix.c:109
4  aio_context_release (ctx=ctx@entry=0x55fbb25ab680) at ../util/async.c:650
5  bdrv_do_drained_begin
(bs=bs@entry=0x55fbb3a87000, recursive=recursive@entry=false,
parent=parent@entry=0x0,
ignore_bds_parents=ignore_bds_parents@entry=false,
poll=poll@entry=true) at ../block/io.c:441
6  bdrv_do_drained_begin
(poll=true, ignore_bds_parents=false, parent=0x0, recursive=false,
bs=0x55fbb3a87000) at ../block/io.c:448
7  blk_drain (blk=0x55fbb26c5a00) at ../block/block-backend.c:1718
8  blk_unref (blk=0x55fbb26c5a00) at ../block/block-backend.c:498
9  blk_unref (blk=0x55fbb26c5a00) at ../block/block-backend.c:491
10 hmp_qemu_io (mon=0x7fffaf3fc7d0, qdict=)
at ../block/monitor/block-hmp-cmds.c:628

man pthread_mutex_unlock
...
 EPERM  The  mutex type is PTHREAD_MUTEX_ERRORCHECK or
 PTHREAD_MUTEX_RECURSIVE, or the mutex is a robust mutex, and the
 current thread does not own the mutex.

So, thread doesn't own the mutex. And we have iothread here.

Next, note that AIO_WAIT_WHILE() documents that ctx must be acquired
exactly once by caller. But where is it acquired in the call stack?
Seems nowhere.

qemuio_command do acquire aio context.. But we need context acquired
around blk_unref as well. Let's do it.

Reported-by: Max Reitz 
Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  block/monitor/block-hmp-cmds.c | 7 +++
  1 file changed, 7 insertions(+)

diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c
index ebf1033f31..934100d0eb 100644
--- a/block/monitor/block-hmp-cmds.c
+++ b/block/monitor/block-hmp-cmds.c
@@ -559,6 +559,7 @@ void hmp_qemu_io(Monitor *mon, const QDict *qdict)
  {
  BlockBackend *blk;
  BlockBackend *local_blk = NULL;
+AioContext *ctx;
  bool qdev = qdict_get_try_bool(qdict, "qdev", false);
  const char *device = qdict_get_str(qdict, "device");
  const char *command = qdict_get_str(qdict, "command");
@@ -615,7 +616,13 @@ void hmp_qemu_io(Monitor *mon, const QDict *qdict)
  qemuio_command(blk, command);
  
  fail:

+ctx = blk_get_aio_context(blk);
+aio_context_acquire(ctx);
+
  blk_unref(local_blk);
+
+aio_context_release(ctx);
+
  hmp_handle_error(mon, err);
  }


Looks good.  Now I wonder about the rest of this function, though. 
qemuio_command() acquires the context on its own.  So the only thing 
left that looks a bit like it may want to have the context locked is 
blk_insert_bs().  Most of its callers seem to run in the BB’s native 
context, so they don’t have to acquire it; but blk_exp_add() has the 
context held around it, so... should this place, too?


Max

Re: [PATCH v4 0/2] Fix segfault in qemu_rbd_parse_filename

2021-04-22 Thread Max Reitz


On 21.04.21 23:23, Connor Kuehl wrote:

Connor Kuehl (2):
   iotests/231: Update expected deprecation message
   block/rbd: Add an escape-aware strchr helper

  block/rbd.c| 32 +---
  tests/qemu-iotests/231 |  4 
  tests/qemu-iotests/231.out |  7 ---
  3 files changed, 29 insertions(+), 14 deletions(-)


Thanks, applied to my block-next branch (for 6.1):

https://git.xanclic.moe/XanClic/qemu/commits/branch/block-next

Max

Re: about mirror cancel

2021-04-16 Thread Max Reitz


On 16.04.21 09:05, Max Reitz wrote:

On 15.04.21 20:46, Vladimir Sementsov-Ogievskiy wrote:


[...]

Note, that if cancelling all in-flight requests on target is wrong on 
mirror cancel, we still don't have real bug, as the only 
implementation of .bdrv_cancel_in_flight is stopping reconnect waiting 
in nbd driver. So, we'll cancel requests only if connection is already 
lost anyway.


But that probably means, that correct name of the handler would be 
.bdrv_cancel_in_fligth_requests_that_will_most_probably_fail_anyway()..


It’s documentation states that it should cancel all in-flight requests, 
so it’s more likely it just isn’t implemented where it could be.


Oh, I now see you added it in the same series.  Well, then I suppose 
you’re free to change the semantics as you see fit.


But be aware that even cancelling those requests means that you abandon 
the target.  So it must then fail instead of emitting the COMPLETED 
event (AFAIR the mirror job emits COMPLETED when cancelled in READY with 
force=false).


If the user wants the mirror job to create a consistent copy and so 
cancels it after READY (with force=false), I don’t know whether 
cancelling those hanging requests is what we want.  If the cancel hangs 
and the user sees this, they are still free to decide to cancel again 
with force=true, no?


Max

Re: about mirror cancel

2021-04-16 Thread Max Reitz


On 15.04.21 20:46, Vladimir Sementsov-Ogievskiy wrote:

Hi all!

Recently I've implemented fast-cancelling of mirror job: do 
bdrv_cancel_in_flight() in mirror_cancel().


Now I'm in doubt: is it a correct thing? I heard, that mirror-cancel is 
a kind of valid mirror completion..


Looking at documentation:

# Note that if you issue 'block-job-cancel' after 'drive-mirror' has 
indicated

# (via the event BLOCK_JOB_READY) that the source and destination are
# synchronized, then the event triggered by this command changes to
# BLOCK_JOB_COMPLETED, to indicate that the mirroring has ended and the
# destination now has a point-in-time copy tied to the time of the 
cancellation.


So, in other words, do we guarantee something to the user, if it calls 
mirror-cancel in ready state? Does this abuse exist in libvirt?


How is it abuse it if it’s documented?  AFAIR it does exist, because 
libvirt’s blockcopy always uses mirror (with --pivot it’s allowed to 
complete, without it is cancelled).


(And the point of course is that if you want mirror to create a copy 
without pivoting, you need this behavior, because otherwise the target 
may be in an inconsistent state.)





Note, that if cancelling all in-flight requests on target is wrong on 
mirror cancel, we still don't have real bug, as the only implementation 
of .bdrv_cancel_in_flight is stopping reconnect waiting in nbd driver. 
So, we'll cancel requests only if connection is already lost anyway.


But that probably means, that correct name of the handler would be 
.bdrv_cancel_in_fligth_requests_that_will_most_probably_fail_anyway()..


It’s documentation states that it should cancel all in-flight requests, 
so it’s more likely it just isn’t implemented where it could be.


And it also means, that abuse of mirror-cancel as valid completion is a 
bad idea, as we can't distinguish the cases when user calls job-cancel 
to have a kind of point-in-time copy, or just to cancel job (and being 
not interested in the final state of target).


So, probably we need an option boolean argument for blockjob-cancel, 
like "hard", that will cancel in-flight writes on target node..


There is @force.  See commit b76e4458b1eb3c3.

Max

[PULL 1/1] block/nbd: fix possible use after free of s->connect_thread

2021-04-13 Thread Max Reitz

From: Vladimir Sementsov-Ogievskiy 

If on nbd_close() we detach the thread (in
nbd_co_establish_connection_cancel() thr->state becomes
CONNECT_THREAD_RUNNING_DETACHED), after that point we should not use
s->connect_thread (which is set to NULL), as running thread may free it
at any time.

Still nbd_co_establish_connection() does exactly this: it saves
s->connect_thread to local variable (just for better code style) and
use it even after yield point, when thread may be already detached.

Fix that. Also check thr to be non-NULL on
nbd_co_establish_connection() start for safety.

After this patch "case CONNECT_THREAD_RUNNING_DETACHED" becomes
impossible in the second switch in nbd_co_establish_connection().
Still, don't add extra abort() just before the release. If it somehow
possible to reach this "case:" it won't hurt. Anyway, good refactoring
of all this reconnect mess will come soon.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Message-Id: <20210406155114.1057355-1-vsement...@virtuozzo.com>
Reviewed-by: Roman Kagan 
Signed-off-by: Max Reitz 
---
 block/nbd.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/block/nbd.c b/block/nbd.c
index c26dc5a54f..1d4668d42d 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -443,6 +443,11 @@ nbd_co_establish_connection(BlockDriverState *bs, Error 
**errp)
 BDRVNBDState *s = bs->opaque;
 NBDConnectThread *thr = s->connect_thread;
 
+if (!thr) {
+/* detached */
+return -1;
+}
+
 qemu_mutex_lock(>mutex);
 
 switch (thr->state) {
@@ -486,6 +491,12 @@ nbd_co_establish_connection(BlockDriverState *bs, Error 
**errp)
 s->wait_connect = true;
 qemu_coroutine_yield();
 
+if (!s->connect_thread) {
+/* detached */
+return -1;
+}
+assert(thr == s->connect_thread);
+
 qemu_mutex_lock(>mutex);
 
 switch (thr->state) {
-- 
2.29.2

[PULL 0/1] Block patch for 6.0-rc3

2021-04-13 Thread Max Reitz

The following changes since commit dce628a97fde2594f99d738883a157f05aa0a14f:

  Merge remote-tracking branch 'remotes/dg-gitlab/tags/ppc-for-6.0-20210412' 
into staging (2021-04-13 13:05:07 +0100)

are available in the Git repository at:

  https://github.com/XanClic/qemu.git tags/pull-block-2021-04-13

for you to fetch changes up to 0267101af64292c9a84fd9319a763ddfbce9ddc7:

  block/nbd: fix possible use after free of s->connect_thread (2021-04-13 
15:35:12 +0200)

(This is the patch for which Vladimir has sent a pull request yesterday.)


Block patches for 6.0-rc3:
- Use-after-free fix for block/nbd.c


Vladimir Sementsov-Ogievskiy (1):
  block/nbd: fix possible use after free of s->connect_thread

 block/nbd.c | 11 +++
 1 file changed, 11 insertions(+)

-- 
2.29.2

Re: [PATCH for-6.0] block/nbd: fix possible use after free of s->connect_thread

2021-04-13 Thread Max Reitz


On 13.04.21 14:19, Vladimir Sementsov-Ogievskiy wrote:

13.04.2021 14:53, Max Reitz wrote:

On 06.04.21 17:51, Vladimir Sementsov-Ogievskiy wrote:

If on nbd_close() we detach the thread (in
nbd_co_establish_connection_cancel() thr->state becomes
CONNECT_THREAD_RUNNING_DETACHED), after that point we should not use
s->connect_thread (which is set to NULL), as running thread may free it
at any time.

Still nbd_co_establish_connection() does exactly this: it saves
s->connect_thread to local variable (just for better code style) and
use it even after yield point, when thread may be already detached.

Fix that. Also check thr to be non-NULL on
nbd_co_establish_connection() start for safety.

After this patch "case CONNECT_THREAD_RUNNING_DETACHED" becomes
impossible in the second switch in nbd_co_establish_connection().
Still, don't add extra abort() just before the release. If it somehow
possible to reach this "case:" it won't hurt. Anyway, good refactoring
of all this reconnect mess will come soon.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---

Hi all! I faced a crash, just running 277 iotest in a loop. I can't
reproduce it on master, it reproduces only on my branch with nbd
reconnect refactorings.

Still, it seems very possible that it may crash under some conditions.
So I propose this patch for 6.0. It's written so that it's obvious that
it will not hurt:

  pre-patch, on first hunk we'll just crash if thr is NULL,
  on second hunk it's safe to return -1, and using thr when
  s->connect_thread is already zeroed is obviously wrong.

  block/nbd.c | 11 +++
  1 file changed, 11 insertions(+)

diff --git a/block/nbd.c b/block/nbd.c
index c26dc5a54f..1d4668d42d 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -443,6 +443,11 @@ nbd_co_establish_connection(BlockDriverState 
*bs, Error **errp)

  BDRVNBDState *s = bs->opaque;
  NBDConnectThread *thr = s->connect_thread;
+    if (!thr) {
+    /* detached */
+    return -1;
+    }
+
  qemu_mutex_lock(>mutex);
  switch (thr->state) {


First, it is a bit strange not to set *errp in these cases. 


Oops, right! ashamed)


OK, so who cares.  It wouldn’t do anything anyway.

Apart from that, all the changes do is to turn use after frees or 
immediate NULL dereferences into clean errors.  I can’t see any 
resources that should be cleaned up, so I hope Coverity won’t hate me 
for taking this patch.


And then we’ll see whether Peter will take the pull request...

Max

Re: [PATCH for-6.0] block/nbd: fix possible use after free of s->connect_thread

2021-04-13 Thread Max Reitz


On 06.04.21 17:51, Vladimir Sementsov-Ogievskiy wrote:

If on nbd_close() we detach the thread (in
nbd_co_establish_connection_cancel() thr->state becomes
CONNECT_THREAD_RUNNING_DETACHED), after that point we should not use
s->connect_thread (which is set to NULL), as running thread may free it
at any time.

Still nbd_co_establish_connection() does exactly this: it saves
s->connect_thread to local variable (just for better code style) and
use it even after yield point, when thread may be already detached.

Fix that. Also check thr to be non-NULL on
nbd_co_establish_connection() start for safety.

After this patch "case CONNECT_THREAD_RUNNING_DETACHED" becomes
impossible in the second switch in nbd_co_establish_connection().
Still, don't add extra abort() just before the release. If it somehow
possible to reach this "case:" it won't hurt. Anyway, good refactoring
of all this reconnect mess will come soon.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---

Hi all! I faced a crash, just running 277 iotest in a loop. I can't
reproduce it on master, it reproduces only on my branch with nbd
reconnect refactorings.

Still, it seems very possible that it may crash under some conditions.
So I propose this patch for 6.0. It's written so that it's obvious that
it will not hurt:

  pre-patch, on first hunk we'll just crash if thr is NULL,
  on second hunk it's safe to return -1, and using thr when
  s->connect_thread is already zeroed is obviously wrong.

  block/nbd.c | 11 +++
  1 file changed, 11 insertions(+)

diff --git a/block/nbd.c b/block/nbd.c
index c26dc5a54f..1d4668d42d 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -443,6 +443,11 @@ nbd_co_establish_connection(BlockDriverState *bs, Error 
**errp)
  BDRVNBDState *s = bs->opaque;
  NBDConnectThread *thr = s->connect_thread;
  
+if (!thr) {

+/* detached */
+return -1;
+}
+
  qemu_mutex_lock(>mutex);
  
  switch (thr->state) {


First, it is a bit strange not to set *errp in these cases.  I don’t 
think it’d make a difference anyway, but now that I look into it...  The 
error would be propagated to s->connect_err, but that isn’t used 
anywhere, so...  What’s the point?


Anyway.  What I really wanted to ask is: nbd_co_establish_connection() 
may create a thread, which accesses the thread.  Is that a problem? 
Like, can this happen:


- nbd_co_establish_connection(): thr->state := THREAD_RUNNING
- nbd_co_establish_connection(): thread is created
- nbd_co_establish_connection(): thr->mutex unlocked
- connect_thread_func(): thr->mutex locked
- connect_thread_func(): thr->state := something else
- connect_thread_func(): thr->mutex unlocked
- nbd_co_establish_connection(): yields
- (As I understood it, this yield leads to
   nbd_co_establish_connection_cancel() continuing)
- nbd_co_EC_cancel(): thr->mutex locked
- nbd_co_EC_cancel(): do_free true
- nbd_co_EC_cancel(): nbd_free_connect_thread()
- connect_thread_func(): nbd_free_connect_thread()

?

Max

Re: [PATCH v2 2/2] block/rbd: Add an escape-aware strchr helper

2021-04-09 Thread Max Reitz


On 09.04.21 16:05, Connor Kuehl wrote:

On 4/6/21 9:24 AM, Max Reitz wrote:

On 01.04.21 23:01, Connor Kuehl wrote:

[..]
diff --git a/block/rbd.c b/block/rbd.c
index 9071a00e3f..c0e4d4a952 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -134,6 +134,22 @@ static char *qemu_rbd_next_tok(char *src, char 
delim, char **p)

  return src;
  }
+static char *qemu_rbd_strchr(char *src, char delim)
+{
+    char *p;
+
+    for (p = src; *p; ++p) {
+    if (*p == delim) {
+    return p;
+    }
+    if (*p == '\\') {
+    ++p;
+    }
+    }
+
+    return NULL;
+}
+


So I thought you could make qemu_rbd_do_next_tok() to do this.  (I 
didn’t say you should, but bear with me.)  That would be possible by 
giving it a new parameter (e.g. @find), and if that is set, return 
@end if *end == delim after the loop, and NULL otherwise.


Now, if you add wrapper functions to make it nice, there’s not much 
more difference in lines added compared to just adding a new function, 
but it does mean your function should basically be the same as 
qemu_rbd_next_tok(), except that no splitting happens, that there is 
no *p, and that @end is returned instead of @src.


Do you have a strong preference for this? I agree that 
qemu_rbd_next_tok() could grow this functionality, but I think it'd be 
simpler to keep it separate in the form of qemu_rbd_strchr().


Oh, no, no.  I mostly said this so it would be clear why both functions 
should basically have the same structure, i.e. why a difference in 
structure might be a sign that something’s wrong.


Sorry if I came across as too verbose.

So there is one difference, and that is that qemu_rbd_next_tok() has 
this condition to skip escaped characters:


 if (*end == '\\' && end[1] != '\0') {

where qemu_rbd_strchr() has only:

 if (*p == '\\') {

And I think qemu_rbd_next_tok() is right; if the string in question 
has a trailing backslash, qemu_rbd_strchr() will ignore the final NUL 
and continue searching past the end of the string.


Aha, good catch. I'll fix this up.


Thanks!

Max

Re: [PATCH v2 3/3] iotests/041: block-job-complete on user-paused job

2021-04-09 Thread Max Reitz


On 09.04.21 15:29, Max Reitz wrote:

Expand test_pause() to check what happens when issuing
block-job-complete on a job that is on STANDBY because it has been
paused by the user.  (This should be an error, and in particular not
hang job_wait_unpaused().)

Signed-off-by: Max Reitz 
---
  tests/qemu-iotests/041 | 13 -
  1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/041 b/tests/qemu-iotests/041
index 5cc02b24fc..d2c9669741 100755
--- a/tests/qemu-iotests/041
+++ b/tests/qemu-iotests/041
@@ -120,7 +120,18 @@ class TestSingleDrive(iotests.QMPTestCase):
  result = self.vm.qmp('block-job-resume', device='drive0')
  self.assert_qmp(result, 'return', {})
  
-self.complete_and_wait()

+self.wait_ready()
+
+# Check that a job on STANDBY cannot be completed
+self.pause_job('drive0')
+result = self.vm.qmp('block-job-complete', device='drive0')
+self.assert_qmp(result, 'error/desc',
+"Job 'drive0' has been paused by the user")


Oops.  Should now be

"Job 'drive0' has been paused and needs to be explicitly resumed"

of course.

Max


+
+result = self.vm.qmp('block-job-resume', device='drive0')
+self.assert_qmp(result, 'return', {})
+
+self.complete_and_wait(wait_ready=False)
  self.vm.shutdown()
  self.assertTrue(iotests.compare_images(test_img, target_img),
  'target image does not match source after mirroring')

[PATCH v2 2/3] test-blockjob: Test job_wait_unpaused()

2021-04-09 Thread Max Reitz

Create a job that remains on STANDBY after a drained section, and see
that invoking job_wait_unpaused() will get it unstuck.

Signed-off-by: Max Reitz 
---
 tests/unit/test-blockjob.c | 140 +
 1 file changed, 140 insertions(+)

diff --git a/tests/unit/test-blockjob.c b/tests/unit/test-blockjob.c
index 7519847912..b7736e298d 100644
--- a/tests/unit/test-blockjob.c
+++ b/tests/unit/test-blockjob.c
@@ -16,6 +16,7 @@
 #include "block/blockjob_int.h"
 #include "sysemu/block-backend.h"
 #include "qapi/qmp/qdict.h"
+#include "iothread.h"
 
 static const BlockJobDriver test_block_job_driver = {
 .job_driver = {
@@ -375,6 +376,144 @@ static void test_cancel_concluded(void)
 cancel_common(s);
 }
 
+/* (See test_yielding_driver for the job description) */
+typedef struct YieldingJob {
+BlockJob common;
+bool should_complete;
+} YieldingJob;
+
+static void yielding_job_complete(Job *job, Error **errp)
+{
+YieldingJob *s = container_of(job, YieldingJob, common.job);
+s->should_complete = true;
+job_enter(job);
+}
+
+static int coroutine_fn yielding_job_run(Job *job, Error **errp)
+{
+YieldingJob *s = container_of(job, YieldingJob, common.job);
+
+job_transition_to_ready(job);
+
+while (!s->should_complete) {
+job_yield(job);
+}
+
+return 0;
+}
+
+/*
+ * This job transitions immediately to the READY state, and then
+ * yields until it is to complete.
+ */
+static const BlockJobDriver test_yielding_driver = {
+.job_driver = {
+.instance_size  = sizeof(YieldingJob),
+.free   = block_job_free,
+.user_resume= block_job_user_resume,
+.run= yielding_job_run,
+.complete   = yielding_job_complete,
+},
+};
+
+/*
+ * Test that job_wait_unpaused() can get jobs from a paused state to
+ * a running state so that job_complete() can be applied (assuming the
+ * pause occurred due to a drain that has already been lifted).
+ * (This is what QMP's block-job-complete does so it can be executed
+ * even immediately after some other operation instated and lifted a
+ * drain.)
+ *
+ * To do this, run YieldingJob in an IO thread, get it into the READY
+ * state, then have a drained section.  Before ending the section,
+ * acquire the context so the job will not be entered and will thus
+ * remain on STANDBY.
+ *
+ * Invoking job_complete() then will fail.
+ *
+ * However, job_wait_unpaused() should see the job is to be resumed,
+ * wait for it to be resumed, and then we can invoke job_complete()
+ * without error.
+ *
+ * Note that on the QMP interface, it is impossible to lock an IO
+ * thread before a drained section ends.  In practice, the
+ * bdrv_drain_all_end() and the aio_context_acquire() will be
+ * reversed.  However, that makes for worse reproducibility here:
+ * Sometimes, the job would no longer be in STANDBY then but already
+ * be started.  We cannot prevent that, because the IO thread runs
+ * concurrently.  We can only prevent it by taking the lock before
+ * ending the drained section, so we do that.
+ *
+ * (You can reverse the order of operations and most of the time the
+ * test will pass, but sometimes the assert(status == STANDBY) will
+ * fail.)
+ */
+static void test_complete_in_standby(void)
+{
+BlockBackend *blk;
+IOThread *iothread;
+AioContext *ctx;
+Job *job;
+BlockJob *bjob;
+Error *local_err = NULL;
+
+/* Create a test drive, move it to an IO thread */
+blk = create_blk(NULL);
+iothread = iothread_new();
+
+ctx = iothread_get_aio_context(iothread);
+blk_set_aio_context(blk, ctx, _abort);
+
+/* Create our test job */
+bjob = mk_job(blk, "job", _yielding_driver, true,
+  JOB_MANUAL_FINALIZE | JOB_MANUAL_DISMISS);
+job = >job;
+assert(job->status == JOB_STATUS_CREATED);
+
+/* Wait for the job to become READY */
+job_start(job);
+aio_context_acquire(ctx);
+AIO_WAIT_WHILE(ctx, job->status != JOB_STATUS_READY);
+aio_context_release(ctx);
+
+/* Begin the drained section, pausing the job */
+bdrv_drain_all_begin();
+assert(job->status == JOB_STATUS_STANDBY);
+/* Lock the IO thread to prevent the job from being run */
+aio_context_acquire(ctx);
+/* This will schedule the job to resume it */
+bdrv_drain_all_end();
+
+/* But the job cannot run, so it will remain on standby */
+assert(job->status == JOB_STATUS_STANDBY);
+
+/* A job on standby cannot be completed */
+job_complete(job, _err);
+assert(local_err != NULL);
+error_free(local_err);
+local_err = NULL;
+
+/*
+ * But waiting for it and then completing it should work.
+ * (This is what qmp_block_job_complete() does.)
+ */
+job_wait_unpaused(job, _abort);
+job_complete(job, _abort);
+
+/* The test is done now, clean up. */
+job_finish

[PATCH v2 1/3] job: Add job_wait_unpaused() for block-job-complete

2021-04-09 Thread Max Reitz

block-job-complete can only be applied when the job is READY, not when
it is on STANDBY (ready, but paused).  Draining a job technically pauses
it (which makes a READY job enter STANDBY), and ending the drained
section does not synchronously resume it, but only schedules the job,
which will then be resumed.  So attempting to complete a job immediately
after a drained section may sometimes fail.

That is bad at least because users cannot really work nicely around
this: A job may be paused and resumed at any time, so waiting for the
job to be in the READY state and then issuing a block-job-complete poses
a TOCTTOU problem.  The only way around it would be to issue
block-job-complete until it no longer fails due to the job being in the
STANDBY state, but that would not be nice.

We can solve the problem by allowing block-job-complete to be invoked on
jobs that are on STANDBY, if that status is the result of a drained
section (not because the user has paused the job), and that section has
ended.  That is, if the job is on STANDBY, but scheduled to be resumed.

Perhaps we could actually just directly allow this, seeing that mirror
is the only user of ready/complete, and that mirror_complete() could
probably work under the given circumstances, but there may be many side
effects to consider.

It is simpler to add a function job_wait_unpaused() that waits for the
job to be resumed (under said circumstances), and to make
qmp_block_job_complete() use it to delay job_complete() until then.

Note that for the future, we probably want to make block-job-complete a
coroutine QMP handler, so instead of polling job_wait_unpaused() would
yield and have job_pause_point() wake it up.  That would get around the
problem of nested polling, which is currently the reason for returning
an error when job->pause_point > 0.

Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1945635
Signed-off-by: Max Reitz 
---
 include/qemu/job.h | 15 +
 blockdev.c |  3 +++
 job.c  | 53 ++
 3 files changed, 71 insertions(+)

diff --git a/include/qemu/job.h b/include/qemu/job.h
index efc6fa7544..cf3082b6d7 100644
--- a/include/qemu/job.h
+++ b/include/qemu/job.h
@@ -563,4 +563,19 @@ void job_dismiss(Job **job, Error **errp);
  */
 int job_finish_sync(Job *job, void (*finish)(Job *, Error **errp), Error 
**errp);
 
+/**
+ * If the job has been paused because of a drained section, and that
+ * section has ended, wait until the job is resumed.
+ *
+ * Return 0 if the job is not paused, or if it has been successfully
+ * resumed.
+ * Return an error if the job has been paused in such a way that
+ * waiting will not resume it, i.e. if it has been paused by the user,
+ * or if it is still drained.
+ *
+ * Callers must be in the home AioContext and hold the AioContext lock
+ * of job->aio_context.
+ */
+int job_wait_unpaused(Job *job, Error **errp);
+
 #endif
diff --git a/blockdev.c b/blockdev.c
index a57590aae4..c0cc2fa364 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3414,6 +3414,9 @@ void qmp_block_job_complete(const char *device, Error 
**errp)
 return;
 }
 
+if (job_wait_unpaused(>job, errp) < 0) {
+return;
+}
 trace_qmp_block_job_complete(job);
 job_complete(>job, errp);
 aio_context_release(aio_context);
diff --git a/job.c b/job.c
index 289edee143..fbccd4b32a 100644
--- a/job.c
+++ b/job.c
@@ -505,6 +505,7 @@ void coroutine_fn job_pause_point(Job *job)
 job_do_yield(job, -1);
 job->paused = false;
 job_state_transition(job, status);
+aio_wait_kick();
 }
 
 if (job->driver->resume) {
@@ -1023,3 +1024,55 @@ int job_finish_sync(Job *job, void (*finish)(Job *, 
Error **errp), Error **errp)
 job_unref(job);
 return ret;
 }
+
+int job_wait_unpaused(Job *job, Error **errp)
+{
+/*
+ * Only run this function from the main context, because this is
+ * what we need, and this way we do not have to think about what
+ * happens if the user concurrently pauses the job from the main
+ * monitor.
+ */
+assert(qemu_get_current_aio_context() == qemu_get_aio_context());
+
+/*
+ * Quick path (e.g. so we do not get an error if pause_count > 0
+ * but the job is not even paused)
+ */
+if (!job->paused) {
+return 0;
+}
+
+/* If the user has paused the job, waiting will not help */
+if (job->user_paused) {
+error_setg(errp, "Job '%s' has been paused and needs to be explicitly "
+   "resumed", job->id);
+return -EBUSY;
+}
+
+/*
+ * Similarly, if the job is still drained, waiting may not help
+ * either: If the drain came from an IO thread, waiting would
+ * probably help.  However, if the drain came from the main thread
+ * (which may be possible if the QMP handler calling this function
+ * has been invoked by BDRV_POLL_WHILE() f

[PATCH v2 4/3] iotests: Test completion immediately after drain

2021-04-09 Thread Max Reitz

Test what happens when you have multiple busy block jobs, drain all (via
an empty transaction), and immediately issue a block-job-complete on one
of the jobs.

Sometimes it will still be in STANDBY, in which case block-job-complete
used to fail.  It should not.

Signed-off-by: Max Reitz 
---
 .../tests/mirror-complete-after-drain | 89 +++
 .../tests/mirror-complete-after-drain.out | 14 +++
 2 files changed, 103 insertions(+)
 create mode 100755 tests/qemu-iotests/tests/mirror-complete-after-drain
 create mode 100644 tests/qemu-iotests/tests/mirror-complete-after-drain.out

diff --git a/tests/qemu-iotests/tests/mirror-complete-after-drain 
b/tests/qemu-iotests/tests/mirror-complete-after-drain
new file mode 100755
index 00..b096ffbcb4
--- /dev/null
+++ b/tests/qemu-iotests/tests/mirror-complete-after-drain
@@ -0,0 +1,89 @@
+#!/usr/bin/env python3
+# group: rw
+#
+# Tests for block-job-complete immediately after a drain
+#
+# Copyright (c) 2021 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+import iotests
+
+iotests.script_initialize(supported_fmts=['raw'])
+
+DISK_JOBS = 4
+NULL_JOBS = 1
+
+
+# We cannot auto-generate these in a loop because the files are
+# deleted when their scope ends
+src_imgs = iotests.file_path('src0', 'src1', 'src2', 'src3')
+dst_imgs = iotests.file_path('dst0', 'dst1', 'dst2', 'dst3')
+
+assert len(src_imgs) == DISK_JOBS
+assert len(dst_imgs) == DISK_JOBS
+
+
+for i in range(DISK_JOBS):
+ret = iotests.qemu_img('create', '-f', iotests.imgfmt, src_imgs[i], '128M')
+assert ret == 0
+
+ret = iotests.qemu_img('create', '-f', iotests.imgfmt, dst_imgs[i], '128M')
+assert ret == 0
+
+with iotests.VM() as vm:
+vm.add_object('iothread,id=iothr0')
+vm.add_device('virtio-scsi,iothread=iothr0')
+
+for i in range(DISK_JOBS):
+vm.add_blockdev(f'file,node-name=source-disk-{i},'
+f'filename={src_imgs[i]}')
+
+vm.add_blockdev(f'file,node-name=target-disk-{i},'
+f'filename={dst_imgs[i]}')
+
+vm.add_device(f'scsi-hd,id=device-disk-{i},drive=source-disk-{i}')
+
+for i in range(NULL_JOBS):
+vm.add_blockdev(f'null-co,node-name=source-null-{i},read-zeroes=on')
+vm.add_blockdev(f'null-co,node-name=target-null-{i},read-zeroes=on')
+vm.add_device(f'scsi-hd,id=device-null-{i},drive=source-null-{i}')
+
+vm.launch()
+
+for i in range(DISK_JOBS):
+vm.qmp_log('blockdev-mirror',
+   job_id=f'mirror-disk-{i}',
+   device=f'source-disk-{i}',
+   target=f'target-disk-{i}',
+   sync='full',
+   granularity=1048576,
+   buf_size=(16 * 1048576))
+
+for i in range(NULL_JOBS):
+vm.qmp_log('blockdev-mirror',
+   job_id=f'mirror-null-{i}',
+   device=f'source-null-{i}',
+   target=f'target-null-{i}',
+   sync='full')
+
+for i in range(DISK_JOBS + NULL_JOBS):
+vm.event_wait('BLOCK_JOB_READY')
+
+for i in range(DISK_JOBS):
+vm.hmp(f'qemu-io -d device-disk-{i} "write 0 128M"')
+
+vm.qmp_log('transaction', actions=[])
+vm.qmp_log('block-job-complete', device='mirror-null-0')
diff --git a/tests/qemu-iotests/tests/mirror-complete-after-drain.out 
b/tests/qemu-iotests/tests/mirror-complete-after-drain.out
new file mode 100644
index 00..4d9d0529fe
--- /dev/null
+++ b/tests/qemu-iotests/tests/mirror-complete-after-drain.out
@@ -0,0 +1,14 @@
+{"execute": "blockdev-mirror", "arguments": {"buf-size": 16777216, "device": 
"source-disk-0", "granularity": 1048576, "job-id": "mirror-disk-0", "sync": 
"full", "target": "target-disk-0"}}
+{"return": {}}
+{"execute": "blockdev-mirror", "arguments": {"buf-size": 16777216, "device": 
"source-disk-1", "granularity": 1048576, "job-id": "mirror-disk-1", "sync": 
"full", "target": "target-disk-1"}}
+{"return": {}}
+{"execute": "blockdev-mirror", "arguments": {"buf-size": 16

[PATCH v2 3/3] iotests/041: block-job-complete on user-paused job

2021-04-09 Thread Max Reitz

Expand test_pause() to check what happens when issuing
block-job-complete on a job that is on STANDBY because it has been
paused by the user.  (This should be an error, and in particular not
hang job_wait_unpaused().)

Signed-off-by: Max Reitz 
---
 tests/qemu-iotests/041 | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/041 b/tests/qemu-iotests/041
index 5cc02b24fc..d2c9669741 100755
--- a/tests/qemu-iotests/041
+++ b/tests/qemu-iotests/041
@@ -120,7 +120,18 @@ class TestSingleDrive(iotests.QMPTestCase):
 result = self.vm.qmp('block-job-resume', device='drive0')
 self.assert_qmp(result, 'return', {})
 
-self.complete_and_wait()
+self.wait_ready()
+
+# Check that a job on STANDBY cannot be completed
+self.pause_job('drive0')
+result = self.vm.qmp('block-job-complete', device='drive0')
+self.assert_qmp(result, 'error/desc',
+"Job 'drive0' has been paused by the user")
+
+result = self.vm.qmp('block-job-resume', device='drive0')
+self.assert_qmp(result, 'return', {})
+
+self.complete_and_wait(wait_ready=False)
 self.vm.shutdown()
 self.assertTrue(iotests.compare_images(test_img, target_img),
 'target image does not match source after mirroring')
-- 
2.29.2

[PATCH v2 0/3] job: Add job_wait_unpaused() for block-job-complete

2021-04-09 Thread Max Reitz

Hi,

v1:
https://lists.nongnu.org/archive/html/qemu-block/2021-04/msg00215.html

Alternative:
https://lists.nongnu.org/archive/html/qemu-block/2021-04/msg00261.html


Compared to v1, I’ve added an aio_wait_kick() to job_pause_point() (as
suggested by Kevin) and adjusted the error message on job->user_paused
(as suggested by John).  I’ve kept the job->pause_count > 0 block
because of the concern of nested event loops Kevin raised, and I’ve
expanded its comment to explain the problem (I hope).

Note also the new note in patch 1’s commit message, which explains how
we’d ideally want block-job-complete to be a coroutine QMP handler so we
could yield instead of polling.


Furthermore, I’ve added the flaky test that I’ve also appended to the
alternative series.  Sometimes it fails 50/100 times for me, sometimes
more like 20/100.  (On master.)  Maybe it won’t reproduce the problem
for you at all.


git-backport-diff against v1:

Key:
[] : patches are identical
[] : number of functional differences between upstream/downstream patch
[down] : patch is downstream-only
The flags [FC] indicate (F)unctional and (C)ontextual differences, respectively

001/4:[0015] [FC] 'job: Add job_wait_unpaused() for block-job-complete'
002/4:[] [--] 'test-blockjob: Test job_wait_unpaused()'
003/4:[] [--] 'iotests/041: block-job-complete on user-paused job'
004/4:[down] 'iotests: Test completion immediately after drain'



Max Reitz (4):
  job: Add job_wait_unpaused() for block-job-complete
  test-blockjob: Test job_wait_unpaused()
  iotests/041: block-job-complete on user-paused job
  iotests: Test completion immediately after drain

 include/qemu/job.h|  15 ++
 blockdev.c|   3 +
 job.c |  53 +++
 tests/unit/test-blockjob.c| 140 ++
 tests/qemu-iotests/041|  13 +-
 .../tests/mirror-complete-after-drain |  89 +++
 .../tests/mirror-complete-after-drain.out |  14 ++
 7 files changed, 326 insertions(+), 1 deletion(-)
 create mode 100755 tests/qemu-iotests/tests/mirror-complete-after-drain
 create mode 100644 tests/qemu-iotests/tests/mirror-complete-after-drain.out

-- 
2.29.2

< 1 2 3 4 5 6 7 8 9 10 >

201 - 300 of 9029 matches

Mail list logo