Re: [PULL 30/39] block: bdrv_reopen_multiple: refresh permissions on updated graph

2021-04-30 Thread Peter Maydell
On Fri, 30 Apr 2021 at 11:53, Kevin Wolf  wrote:
>
> From: Vladimir Sementsov-Ogievskiy 
>
> Move bdrv_reopen_multiple to new paradigm of permission update:
> first update graph relations, then do refresh the permissions.
>
> We have to modify reopen process in file-posix driver: with new scheme
> we don't have prepared permissions in raw_reopen_prepare(), so we
> should reconfigure fd in raw_check_perm(). Still this seems more native
> and simple anyway.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> Reviewed-by: Kevin Wolf 
> Message-Id: <20210428151804.439460-31-vsement...@virtuozzo.com>
> Signed-off-by: Kevin Wolf 

Hi; Coverity thinks this change introduced a resource leak
(CID 1452772):



> @@ -4271,6 +4270,9 @@ int bdrv_reopen_multiple(BlockReopenQueue *bs_queue, 
> Error **errp)
>  {
>  int ret = -1;
>  BlockReopenQueueEntry *bs_entry, *next;
> +Transaction *tran = tran_new();
> +g_autoptr(GHashTable) found = NULL;
> +g_autoptr(GSList) refresh_list = NULL;

Now we allocate a new Transaction at the start of the function...

>
>  assert(bs_queue != NULL);
>

...but in the code between these two hunks there is this:

QTAILQ_FOREACH(bs_entry, bs_queue, entry) {
ret = bdrv_flush(bs_entry->state.bs);
if (ret < 0) {
error_setg_errno(errp, -ret, "Error flushing drive");
goto cleanup;
}
}

which jumps to 'cleanup' on failure...


> -if (ret == 0) {
> -QTAILQ_FOREACH_REVERSE(bs_entry, bs_queue, entry) {
> -BlockDriverState *bs = bs_entry->state.bs;
> +ret = 0;
> +goto cleanup;
>
> -if (bs->drv->bdrv_reopen_commit_post)
> -bs->drv->bdrv_reopen_commit_post(&bs_entry->state);
> +abort:
> +tran_abort(tran);
> +QTAILQ_FOREACH_SAFE(bs_entry, bs_queue, entry, next) {
> +if (bs_entry->prepared) {
> +bdrv_reopen_abort(&bs_entry->state);
>  }
> +qobject_unref(bs_entry->state.explicit_options);
> +qobject_unref(bs_entry->state.options);
>  }
> +
>  cleanup:
>  QTAILQ_FOREACH_SAFE(bs_entry, bs_queue, entry, next) {
> -if (ret) {
> -if (bs_entry->prepared) {
> -bdrv_reopen_abort(&bs_entry->state);
> -}
> -qobject_unref(bs_entry->state.explicit_options);
> -qobject_unref(bs_entry->state.options);
> -}
> -if (bs_entry->state.new_backing_bs) {
> -bdrv_unref(bs_entry->state.new_backing_bs);
> -}
>  g_free(bs_entry);
>  }
>  g_free(bs_queue);

...and the 'cleanup' label doesn't free the Transaction.


An easy fix would be to move the call to tran_new() down to
below the loop that calls bdrv_flush().

thanks
-- PMM



Re: [PULL 18/39] block: add bdrv_attach_child_common() transaction action

2021-04-30 Thread Peter Maydell
On Fri, 30 Apr 2021 at 11:53, Kevin Wolf  wrote:
>
> From: Vladimir Sementsov-Ogievskiy 
>
> Split out no-perm part of bdrv_root_attach_child() into separate
> transaction action. bdrv_root_attach_child() now moves to new
> permission update paradigm: first update graph relations then update
> permissions.
>
> qsd-jobs test output updated. Seems now permission update goes in
> another order. Still, the test comment say that we only want to check
> that command doesn't crash, and it's still so.
>
> Error message is a bit misleading as it looks like job was added first.
> But actually in new paradigm of graph update we can't distinguish such
> things. We should update the error message, but let's not do it now.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> Message-Id: <20210428151804.439460-19-vsement...@virtuozzo.com>
> Signed-off-by: Kevin Wolf 

Hi; Coverity thinks it's found a resource leak in this function
(CID 1452773):

> +/*
> + * This function steals the reference to child_bs from the caller.
> + * That reference is later dropped by bdrv_root_unref_child().
> + *
> + * On failure NULL is returned, errp is set and the reference to
> + * child_bs is also dropped.
> + *
> + * The caller must hold the AioContext lock @child_bs, but not that of @ctx
> + * (unless @child_bs is already in @ctx).
> + */
> +BdrvChild *bdrv_root_attach_child(BlockDriverState *child_bs,
> +  const char *child_name,
> +  const BdrvChildClass *child_class,
> +  BdrvChildRole child_role,
> +  uint64_t perm, uint64_t shared_perm,
> +  void *opaque, Error **errp)
> +{
> +int ret;
> +BdrvChild *child = NULL;
> +Transaction *tran = tran_new();

Here we create a new Transaction...

> +
> +ret = bdrv_attach_child_common(child_bs, child_name, child_class,
> +   child_role, perm, shared_perm, opaque,
> +   &child, tran, errp);
> +if (ret < 0) {
> +bdrv_unref(child_bs);
> +return NULL;

...but in this error-return path we don't free it.

> +}
> +
> +ret = bdrv_refresh_perms(child_bs, errp);
> +tran_finalize(tran, ret);
> +
> +bdrv_unref(child_bs);
>  return child;
>  }

thanks
-- PMM



Re: [PULL 37/64] block/snapshot: Fix fallback

2021-04-30 Thread Peter Maydell
On Mon, 7 Sept 2020 at 12:11, Kevin Wolf  wrote:
>
> From: Max Reitz 
>
> If the top node's driver does not provide snapshot functionality and we
> want to fall back to a node down the chain, we need to snapshot all
> non-COW children.  For simplicity's sake, just do not fall back if there
> is more than one such child.  Furthermore, we really only can fall back
> to bs->file and bs->backing, because bdrv_snapshot_goto() has to modify
> the child link (notably, set it to NULL).
>
> Suggested-by: Vladimir Sementsov-Ogievskiy 
> Signed-off-by: Max Reitz 
> Reviewed-by: Andrey Shinkevich 
> Reviewed-by: Kevin Wolf 

Hi; Coverity thinks it's found a problem with this code
(CID 1452774):


> @@ -205,39 +258,46 @@ int bdrv_snapshot_goto(BlockDriverState *bs,
>  return ret;
>  }
>
> -if (bs->file) {
> -BlockDriverState *file;
> -QDict *options = qdict_clone_shallow(bs->options);
> +fallback_ptr = bdrv_snapshot_fallback_ptr(bs);
> +if (fallback_ptr) {
> +QDict *options;
>  QDict *file_options;
>  Error *local_err = NULL;
> +BlockDriverState *fallback_bs = (*fallback_ptr)->bs;
> +char *subqdict_prefix = g_strdup_printf("%s.", 
> (*fallback_ptr)->name);
> +
> +options = qdict_clone_shallow(bs->options);
>
> -file = bs->file->bs;
>  /* Prevent it from getting deleted when detached from bs */
> -bdrv_ref(file);
> +bdrv_ref(fallback_bs);
>
> -qdict_extract_subqdict(options, &file_options, "file.");
> +qdict_extract_subqdict(options, &file_options, subqdict_prefix);
>  qobject_unref(file_options);
> -qdict_put_str(options, "file", bdrv_get_node_name(file));
> +g_free(subqdict_prefix);
> +
> +qdict_put_str(options, (*fallback_ptr)->name,
> +  bdrv_get_node_name(fallback_bs));
>
>  if (drv->bdrv_close) {
>  drv->bdrv_close(bs);
>  }
> -bdrv_unref_child(bs, bs->file);
> -bs->file = NULL;
>
> -ret = bdrv_snapshot_goto(file, snapshot_id, errp);
> +bdrv_unref_child(bs, *fallback_ptr);
> +*fallback_ptr = NULL;

Here we set *fallback_ptr to NULL...

> +
> +ret = bdrv_snapshot_goto(fallback_bs, snapshot_id, errp);
>  open_ret = drv->bdrv_open(bs, options, bs->open_flags, &local_err);
>  qobject_unref(options);
>  if (open_ret < 0) {
> -bdrv_unref(file);
> +bdrv_unref(fallback_bs);
>  bs->drv = NULL;
>  /* A bdrv_snapshot_goto() error takes precedence */
>  error_propagate(errp, local_err);
>  return ret < 0 ? ret : open_ret;
>  }
>
> -assert(bs->file->bs == file);
> -bdrv_unref(file);
> +assert(fallback_bs == (*fallback_ptr)->bs);

...but here we dereference *fallback_ptr, and Coverity doesn't see
anything that it recognizes as being able to change it.

> +bdrv_unref(fallback_bs);
>  return ret;
>  }

False positive, or real issue? (If a false positive, a comment
explaining what's going on wouldn't go amiss -- as a human reader
I'm kind of confused about whether there's some kind of hidden
magic going on here.)

thanks
-- PMM



[PATCH] Add missing coroutine_fn function signature to functions

2021-04-30 Thread cennedee
>From 447601c28d5ed0b1208a0560390f760e75ce5613 Mon Sep 17 00:00:00 2001
From: Cenne Dee 
Date: Fri, 30 Apr 2021 15:52:28 -0400
Subject: [PATCH] Add missing coroutine_fn function signature to functions

Patch adds the signature for all relevant functions ending with _co
or those that use them.

Signed-off-by: Cenne Dee 
---
 block/block-gen.h | 2 +-
 migration/migration.c | 2 +-
 monitor/hmp.c | 2 +-
 scsi/qemu-pr-helper.c | 2 +-
 tests/unit/test-thread-pool.c | 4 ++--
 5 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/block/block-gen.h b/block/block-gen.h
index f80cf48..4963372 100644
--- a/block/block-gen.h
+++ b/block/block-gen.h
@@ -36,7 +36,7 @@ typedef struct BdrvPollCo {
 Coroutine *co; /* Keep pointer here for debugging */
 } BdrvPollCo;

-static inline int bdrv_poll_co(BdrvPollCo *s)
+static inline int coroutine_fn bdrv_poll_co(BdrvPollCo *s)
 {
 assert(!qemu_in_coroutine());

diff --git a/migration/migration.c b/migration/migration.c
index 8ca0341..1acade8 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -539,7 +539,7 @@ static void process_incoming_migration_bh(void *opaque)
 migration_incoming_state_destroy();
 }

-static void process_incoming_migration_co(void *opaque)
+static void coroutine_fn process_incoming_migration_co(void *opaque)
 {
 MigrationIncomingState *mis = migration_incoming_get_current();
 PostcopyState ps;
diff --git a/monitor/hmp.c b/monitor/hmp.c
index 6c0b33a..0a16d61 100644
--- a/monitor/hmp.c
+++ b/monitor/hmp.c
@@ -1068,7 +1068,7 @@ typedef struct HandleHmpCommandCo {
 bool done;
 } HandleHmpCommandCo;

-static void handle_hmp_command_co(void *opaque)
+static void coroutine_fn handle_hmp_command_co(void *opaque)
 {
 HandleHmpCommandCo *data = opaque;
 data->cmd->cmd(data->mon, data->qdict);
diff --git a/scsi/qemu-pr-helper.c b/scsi/qemu-pr-helper.c
index 7b9389b..7c1ed2a 100644
--- a/scsi/qemu-pr-helper.c
+++ b/scsi/qemu-pr-helper.c
@@ -175,7 +175,7 @@ static int do_sgio_worker(void *opaque)
 return status;
 }

-static int do_sgio(int fd, const uint8_t *cdb, uint8_t *sense,
+static int coroutine_fn do_sgio(int fd, const uint8_t *cdb, uint8_t *sense,
 uint8_t *buf, int *sz, int dir)
 {
 ThreadPool *pool = aio_get_thread_pool(qemu_get_aio_context());
diff --git a/tests/unit/test-thread-pool.c b/tests/unit/test-thread-pool.c
index 70dc631..21fc118 100644
--- a/tests/unit/test-thread-pool.c
+++ b/tests/unit/test-thread-pool.c
@@ -72,7 +72,7 @@ static void test_submit_aio(void)
 g_assert_cmpint(data.ret, ==, 0);
 }

-static void co_test_cb(void *opaque)
+static void coroutine_fn co_test_cb(void *opaque)
 {
 WorkerTestData *data = opaque;

@@ -90,7 +90,7 @@ static void co_test_cb(void *opaque)
 /* The test continues in test_submit_co, after aio_poll... */
 }

-static void test_submit_co(void)
+static void coroutine_fn test_submit_co(void)
 {
 WorkerTestData data;
 Coroutine *co = qemu_coroutine_create(co_test_cb, &data);
--
2.31.1





Re: [PATCH v3 14/15] qemu_iotests: add option to show qemu binary logs on stdout

2021-04-30 Thread Emanuele Giuseppe Esposito




On 30/04/2021 15:50, Max Reitz wrote:

On 14.04.21 19:03, Emanuele Giuseppe Esposito wrote:

Using the flag -p, allow the qemu binary to print to stdout.
This helps especially when doing print-debugging.


I think this shouldn’t refer to prints but to qemu’s stdout/stderr in 
general, i.e



Signed-off-by: Emanuele Giuseppe Esposito 
---
  tests/qemu-iotests/check  | 3 ++-
  tests/qemu-iotests/iotests.py | 9 +
  tests/qemu-iotests/testenv.py | 9 +++--
  3 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/tests/qemu-iotests/check b/tests/qemu-iotests/check
index 489178d9a4..84483922eb 100755
--- a/tests/qemu-iotests/check
+++ b/tests/qemu-iotests/check
@@ -33,6 +33,7 @@ def make_argparser() -> argparse.ArgumentParser:
 help='pretty print output for make check')
  p.add_argument('-d', dest='debug', action='store_true', 
help='debug')
+    p.add_argument('-p', dest='print', action='store_true', 
help='shows qemu binary prints to stdout')


I’d prefer for the description to be “redirects qemu's stdout and stderr 
to the test output”.


Also, this line is too long.


  p.add_argument('-gdb', action='store_true',
 help="start gdbserver with $GDB_QEMU options. \
   Default is localhost:12345")
@@ -117,7 +118,7 @@ if __name__ == '__main__':
    aiomode=args.aiomode, cachemode=args.cachemode,
    imgopts=args.imgopts, misalign=args.misalign,
    debug=args.debug, valgrind=args.valgrind,
-  gdb=args.gdb)
+  gdb=args.gdb, qprint=args.print)
  testfinder = TestFinder(test_dir=env.source_iotests)
diff --git a/tests/qemu-iotests/iotests.py 
b/tests/qemu-iotests/iotests.py

index f9832558a0..52ff7332f8 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -79,6 +79,8 @@
  if os.environ.get('GDB_QEMU'):
  qemu_gdb = ['gdbserver'] + 
os.environ.get('GDB_QEMU').strip().split(' ')

+qemu_print = os.environ.get('PRINT_QEMU', False)
+
  imgfmt = os.environ.get('IMGFMT', 'raw')
  imgproto = os.environ.get('IMGPROTO', 'file')
  output_dir = os.environ.get('OUTPUT_DIR', '.')
@@ -621,6 +623,13 @@ def _post_shutdown(self) -> None:
  super()._post_shutdown()
  self.subprocess_check_valgrind(qemu_valgrind)
+    def _pre_launch(self) -> None:
+    super()._pre_launch()
+    if qemu_print and self._qemu_log_file != None:


I think "is not None" is mostly used in Python.  (I’m saying this in 
this weird way because I’m not the one to ask what’s mostly used in 
Python...)


(That also silences mypy, which otherwise complains and then fails 297.)


+    # set QEMU binary output to stdout
+    self._qemu_log_file.close()
+    self._qemu_log_file = None


I don’t know enough Python to know how this works.  I suppose this does 
access the super class’s member?  (I could also imagine it creates a new 
member, just for this child class, but that then effectively overwrites 
the super class’s member.)


Considering _qemu_log_file is apparently meant to be a private attribute 
(from the leading underscore), I think it would be better to have a 
function provided by the super class for this.


It should access the superclass member, and it's the same that 
_post_shutdown does. I can make a function for that.


Regarding all other feedback in all patches that I did not answer: agree 
on all of them, will adjust everything in v4.


Thank you,
Emanuele




+
  def add_object(self, opts):
  self._args.append('-object')
  self._args.append(opts)
diff --git a/tests/qemu-iotests/testenv.py 
b/tests/qemu-iotests/testenv.py

index 39ae7ace33..6ae099114e 100644
--- a/tests/qemu-iotests/testenv.py
+++ b/tests/qemu-iotests/testenv.py
@@ -73,7 +73,7 @@ class TestEnv(ContextManager['TestEnv']):
   'AIOMODE', 'CACHEMODE', 'VALGRIND_QEMU',
   'CACHEMODE_IS_DEFAULT', 'IMGFMT_GENERIC', 
'IMGOPTSSYNTAX',
   'IMGKEYSECRET', 'QEMU_DEFAULT_MACHINE', 
'MALLOC_PERTURB_',

- 'GDB_QEMU']
+ 'GDB_QEMU', 'PRINT_QEMU']
  def get_env(self) -> Dict[str, str]:
  env = {}
@@ -165,7 +165,8 @@ def __init__(self, imgfmt: str, imgproto: str, 
aiomode: str,

   misalign: bool = False,
   debug: bool = False,
   valgrind: bool = False,
- gdb: bool = False) -> None:
+ gdb: bool = False,
+ qprint: bool = False) -> None:
  self.imgfmt = imgfmt
  self.imgproto = imgproto
  self.aiomode = aiomode
@@ -173,6 +174,9 @@ def __init__(self, imgfmt: str, imgproto: str, 
aiomode: str,

  self.misalign = misalign
  self.debug = debug
+    if qprint:
+    self.print_qemu = 'y'
+
  if gdb:
  self.gdb_qemu = os.environ.get('GDB_QEMU', 
'loca

Re: [PATCH v3 10/15] qemu_iotests: extent QMP socket timeout when using valgrind

2021-04-30 Thread Emanuele Giuseppe Esposito




On 30/04/2021 15:02, Max Reitz wrote:

On 14.04.21 19:03, Emanuele Giuseppe Esposito wrote:

As with gdbserver, valgrind delays the test execution, so
the default QMP socket timeout timeout too soon.


I’m curious: The default timeouts should be long enough for slow 
systems, too, though (e.g. heavily-loaded CI systems).  I would expect 
that valgrind is used on developer systems where there is more leeway, 
so the timeouts should still work.


As said in patch 5, I will check again which timeout is essential to 
avoid and which not.


Emanuele


But in practice, thinking about that doesn’t matter.  If valgrind leads 
to a timeout being hit, that wouldn’t be nice.  OTOH, if you run 
valgrind to debug a test/qemu, you don’t particularly care about the 
timeouts anyway.


So in principle, this patch sounds good to me, it’s just that it’s based 
on patch 5, which I don’t fully agree with.


Max


Signed-off-by: Emanuele Giuseppe Esposito 
---
  python/qemu/machine.py    | 2 +-
  tests/qemu-iotests/iotests.py | 4 ++--
  2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/python/qemu/machine.py b/python/qemu/machine.py
index d6142271c2..dce96e1858 100644
--- a/python/qemu/machine.py
+++ b/python/qemu/machine.py
@@ -410,7 +410,7 @@ def _launch(self) -> None:
 shell=False,
 close_fds=False)
-    if 'gdbserver' in self._wrapper:
+    if 'gdbserver' in self._wrapper or 'valgrind' in self._wrapper:
  self._qmp_timer = None
  self._post_launch()
diff --git a/tests/qemu-iotests/iotests.py 
b/tests/qemu-iotests/iotests.py

index a2e8604674..94597433fa 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -489,7 +489,7 @@ def log(msg: Msg,
  class Timeout:
  def __init__(self, seconds, errmsg="Timeout"):
-    if qemu_gdb:
+    if qemu_gdb or qemu_valgrind:
  self.seconds = 3000
  else:
  self.seconds = seconds
@@ -700,7 +700,7 @@ def qmp_to_opts(self, obj):
  return ','.join(output_list)
  def get_qmp_events(self, wait: bool = False) -> List[QMPMessage]:
-    if qemu_gdb:
+    if qemu_gdb or qemu_valgrind:
  wait = 0.0
  return super().get_qmp_events(wait=wait)








Re: [PATCH v3 04/15] qemu-iotests: add option to attach gdbserver

2021-04-30 Thread Emanuele Giuseppe Esposito




On 30/04/2021 13:38, Max Reitz wrote:

On 14.04.21 19:03, Emanuele Giuseppe Esposito wrote:

Add -gdb flag and GDB_QEMU environmental variable
to python tests to attach a gdbserver to each qemu instance.


Well, this patch doesn’t do this, but OK.


Maybe "define" rather than "add"? In the sense of defining the "-gdb" 
option, which is what it actually does.




Out of interest: Why gdbserver and not “just” gdb?  On Fedora, those are 
separate packages, so I don’t have gdbserver installed, that’s why I’m 
asking.


As far as I have tried, using only gdb with ./check is very hard to use, 
because the stdout is filtered out by the script.
So invoking gdb attaches it to QEMU, but it is not possible to start 
execution (run command) or interact with it, because of the python 
script filtering. This leaves the test hanging.


gdbserver is just something that a gdb client can attach to (for 
example, in another console or even in another host) for example with 
the command
# gdb -iex "target remote localhost:12345" . This provides a nice and 
separate gdb monitor to the client.


Emanuele


(I’ve also never used gdbserver before.  From what I can tell, it’s 
basically just a limited version of gdb so it only serves as a server.)



if -gdb is not provided but $GDB_QEMU is set, ignore the
environmental variable.


s/environmental/environment/



Signed-off-by: Emanuele Giuseppe Esposito 
---
  tests/qemu-iotests/check  |  6 +-
  tests/qemu-iotests/iotests.py |  4 
  tests/qemu-iotests/testenv.py | 15 ---
  3 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/tests/qemu-iotests/check b/tests/qemu-iotests/check
index d1c87ceaf1..6186495eee 100755
--- a/tests/qemu-iotests/check
+++ b/tests/qemu-iotests/check
@@ -33,6 +33,9 @@ def make_argparser() -> argparse.ArgumentParser:
 help='pretty print output for make check')
  p.add_argument('-d', dest='debug', action='store_true', 
help='debug')

+    p.add_argument('-gdb', action='store_true',
+   help="start gdbserver with $GDB_QEMU options. \
+ Default is localhost:12345")


That makes it sound like this were the default for the `-gdb` option. 
Since `-gdb` is just a switch, it doesn’t have a default (apart from 
being off by default).


So I’d rephrase this at least to “The default options are 
'localhost:12345'”.  Or maybe “start gdbserver with $GDB_QEMU options 
('localhost:12345' if $GDB_QEMU is empty)”.


Also, $GDB_QEMU as a name is a bit strange, because it does not specify 
which gdb to use; it just gives the options to use for gdb. 
$GDB_QEMU_OPTIONS would be more in line with the naming of the rest of 
the environment variables (or just $GDB_OPTIONS).


Max


  p.add_argument('-misalign', action='store_true',
 help='misalign memory allocations')
  p.add_argument('--color', choices=['on', 'off', 'auto'],







Re: [PATCH v3 05/15] qemu-iotests: delay QMP socket timers

2021-04-30 Thread Emanuele Giuseppe Esposito




On 30/04/2021 13:59, Max Reitz wrote:

On 14.04.21 19:03, Emanuele Giuseppe Esposito wrote:

Attaching a gdbserver implies that the qmp socket
should wait indefinitely for an answer from QEMU.

Signed-off-by: Emanuele Giuseppe Esposito 
---
  python/qemu/machine.py    |  3 +++
  tests/qemu-iotests/iotests.py | 10 +-
  2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/python/qemu/machine.py b/python/qemu/machine.py
index 12752142c9..d6142271c2 100644
--- a/python/qemu/machine.py
+++ b/python/qemu/machine.py
@@ -409,6 +409,9 @@ def _launch(self) -> None:
 stderr=subprocess.STDOUT,
 shell=False,
 close_fds=False)
+
+    if 'gdbserver' in self._wrapper:
+    self._qmp_timer = None


Why doesn’t __init__() evaluate this?  This here doesn’t feel like the 
right place for it.  If we want to evaluate it here, self._qmp_timer 
shouldn’t exist, and instead the timeout should be a _post_launch() 
parameter.  (Which I would have nothing against, by the way.)


Uhm.. I got another comment in a previous version where for the "event" 
callbacks it was better a property than passing around a parameter. 
Which I honestly agree.


What should __init__() do? The check here is to see if the invocation 
has gdb (and a couple of patches ahead also valgrind), to remove the timer.

If I understand what you mean, you want something like
def __init__(self, timer):

But from my understanding, the class hierarchy is:
QEMUMachine: in machine.py
QEMUQtestMachine(QEMUMachine): in qtest.py
VM(qtest.QEMUQtestMachine): in iotests.py
VM() is then invoked in each test.

So this is not easily reachable by check.py, to pass the parameter into 
__init__




Also, mypy complains that this variable is a float, so iotest 297 (which 
runs mypy) fails.


This and all mistakes of test 297 (mypy) is my fault: I did not have 
pylint-3 installed thus when testing it skipped the 297 test.





  self._post_launch()
  def _early_cleanup(self) -> None:
diff --git a/tests/qemu-iotests/iotests.py 
b/tests/qemu-iotests/iotests.py

index 05d0dc0751..380527245e 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -478,7 +478,10 @@ def log(msg: Msg,
  class Timeout:
  def __init__(self, seconds, errmsg="Timeout"):
-    self.seconds = seconds
+    if qemu_gdb:
+    self.seconds = 3000
+    else:
+    self.seconds = seconds


We might as well completely disable the timeout then, that would be more 
honest, I think.  (I.e. to make __enter__ and __exit__ no-ops.)



  self.errmsg = errmsg
  def __enter__(self):
  signal.signal(signal.SIGALRM, self.timeout)
@@ -684,6 +687,11 @@ def qmp_to_opts(self, obj):
  output_list += [key + '=' + obj[key]]
  return ','.join(output_list)
+    def get_qmp_events(self, wait: bool = False) -> List[QMPMessage]:
+    if qemu_gdb:
+    wait = 0.0


First, this is a bool.  I can see that qmp.py expects a
Union[bool, float], but machine.py expects just a bool.  (Also, mypy 
complains that this specific `wait` here is a `bool`.  You can see that 
by running iotest 297.)


I think here machine.py should reflect the qmp.py behavior and have an 
Union[bool, float] too.


Second, I don’t understand this.  If the caller wants to block waiting 
on an event, then that should have nothing to do with whether we have 
gdb running or not.  As far as I understand, setting wait to 0.0 is the 
same as wait = False, i.e. we don’t block and just return None 
immediately if there is no pending event.


You're right, this might not be needed here. The problem I had was that 
calling gdb and pausing at a breakpoint or something for a while would 
make the QMP socket timeout, thus aborting the whole test. In order to 
avoid that, I need to stop or delay timers.


I can't remember why I added this check here. At some point I am sure 
the test was failing because of socket timeout expiration, but I cannot 
reproduce the problem when commenting out this check above in 
get_qmp_events. The other check in patch 3 should be enough.


Emanuele


Max


+    return super().get_qmp_events(wait=wait)
+
  def get_qmp_events_filtered(self, wait=60.0):
  result = []
  for ev in self.get_qmp_events(wait=wait):








[PATCH 3/4] block/vvfat: Fix leak of mapping_t::path

2021-04-30 Thread Philippe Mathieu-Daudé
read_directory() keeps pointers to alloc'ed data in path ...:

 743 static int read_directory(BDRVVVFATState* s, int mapping_index)
 744 {
 ...
 792 buffer = g_malloc(length);
 ...
 828 /* create mapping for this file */
 829 if(!is_dot && !is_dotdot && (S_ISDIR(st.st_mode) || st.st_size)) {
 830 s->current_mapping = array_get_next(&(s->mapping));
 ...
 847 s->current_mapping->path=buffer;

... but these pointers are never free'd. Free them in vvfat_close(),
to fix (QEMU built with --enable-sanitizers):

  Direct leak of 148 byte(s) in 6 object(s) allocated from:
#0 0x55d7a363773f in malloc (qemu-system-x86_64+0x1dab73f)
#1 0x7f55c6447958 in g_malloc (/lib64/libglib-2.0.so.0+0x58958)
#2 0x55d7a5e17679 in init_directories block/vvfat.c:962:16
#3 0x55d7a5e1251e in vvfat_open block/vvfat.c:1255:9
#4 0x55d7a5a5363f in bdrv_open_driver block.c:1526:15
#5 0x55d7a5a9d369 in bdrv_open_common block.c:1802:11
#6 0x55d7a5a609f1 in bdrv_open_inherit block.c:3444:11
#7 0x55d7a5a65411 in bdrv_open_child_bs block.c:3079:10
#8 0x55d7a5a60079 in bdrv_open_inherit block.c:3391:19
#9 0x55d7a5a65da3 in bdrv_open block.c:3537:12
#10 0x55d7a5b33f6a in blk_new_open block/block-backend.c:421:10
#11 0x55d7a5a0a33e in blockdev_init blockdev.c:610:15
#12 0x55d7a5a088e7 in drive_new blockdev.c:994:11
#13 0x55d7a51b10c4 in drive_init_func softmmu/vl.c:636:12
#14 0x55d7a620e148 in qemu_opts_foreach util/qemu-option.c:1167:14
#15 0x55d7a51b0e20 in configure_blockdev softmmu/vl.c:695:9
#16 0x55d7a51a70b5 in qemu_create_early_backends softmmu/vl.c:1895:5
#17 0x55d7a519bf87 in qemu_init softmmu/vl.c:3551:5
#18 0x55d7a366f619 in main softmmu/main.c:49:5

Fixes: a046433a161 ("Major overhaul of the virtual FAT driver for read/write 
support")
Signed-off-by: Philippe Mathieu-Daudé 
---
 block/vvfat.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/block/vvfat.c b/block/vvfat.c
index 2cc21787600..c193a816646 100644
--- a/block/vvfat.c
+++ b/block/vvfat.c
@@ -3228,6 +3228,11 @@ static void vvfat_close(BlockDriverState *bs)
 {
 BDRVVVFATState *s = bs->opaque;
 
+for (unsigned j = 0; j < s->mapping.next; j++) {
+mapping_t *mapping = array_get(&(s->mapping), j);
+
+g_free(mapping->path);
+}
 vvfat_close_current_file(s);
 array_free(&(s->fat));
 array_free(&(s->directory));
-- 
2.26.3




[PATCH] xen-block: Use specific blockdev driver

2021-04-30 Thread Anthony PERARD via
From: Anthony PERARD 

... when a xen-block backend instance is created via xenstore.

Following 8d17adf34f50 ("block: remove support for using "file" driver
with block/char devices"), using the "file" blockdev driver for
everything doesn't work anymore, we need to use the "host_device"
driver when the disk image is a block device and "file" driver when it
is a regular file.

Signed-off-by: Anthony PERARD 
---
 hw/block/xen-block.c | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/hw/block/xen-block.c b/hw/block/xen-block.c
index 83754a434481..674953f1adee 100644
--- a/hw/block/xen-block.c
+++ b/hw/block/xen-block.c
@@ -728,6 +728,8 @@ static XenBlockDrive *xen_block_drive_create(const char *id,
 XenBlockDrive *drive = NULL;
 QDict *file_layer;
 QDict *driver_layer;
+struct stat st;
+int rc;
 
 if (params) {
 char **v = g_strsplit(params, ":", 2);
@@ -761,7 +763,17 @@ static XenBlockDrive *xen_block_drive_create(const char 
*id,
 file_layer = qdict_new();
 driver_layer = qdict_new();
 
-qdict_put_str(file_layer, "driver", "file");
+rc = stat(filename, &st);
+if (rc) {
+error_setg_errno(errp, errno, "Could not stat file '%s'", filename);
+goto done;
+}
+if (S_ISBLK(st.st_mode)) {
+qdict_put_str(file_layer, "driver", "host_device");
+} else {
+qdict_put_str(file_layer, "driver", "file");
+}
+
 qdict_put_str(file_layer, "filename", filename);
 g_free(filename);
 
-- 
Anthony PERARD




[PATCH 4/4] block/vvfat: Avoid out of bounds write in create_long_filename()

2021-04-30 Thread Philippe Mathieu-Daudé
The direntry_t::name holds 11 bytes:

  typedef struct direntry_t {
  uint8_t name[8 + 3];
  ...

However create_long_filename() writes up to 31 bytes into it:

 421 for(i=0;i<26*number_of_entries;i++) {
 422 int offset=(i%26);
 423 if(offset<10) offset=1+offset;
 424 else if(offset<22) offset=14+offset-10;
 425 else offset=28+offset-22;
 426 entry=array_get(&(s->directory),s->directory.next-1-(i/26));
 427 if (i >= 2 * length + 2) {
 428 entry->name[offset] = 0xff;
 429 } else if (i % 2 == 0) {
 430 entry->name[offset] = longname[i / 2] & 0xff;
 431 } else {
 432 entry->name[offset] = longname[i / 2] >> 8;
 433 }
 434 }

For example, if i=25, offset=28+25-22=31

Then in lines 428, 430 and 432 the entry->name[] array is written beside
its 11 bytes, as reported by Clang sanitizer:

  block/vvfat.c:430:13: runtime error: index 14 out of bounds for type 'uint8_t 
[11]'
  SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior block/vvfat.c:430:13 
in
  block/vvfat.c:432:13: runtime error: index 15 out of bounds for type 'uint8_t 
[11]'
  SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior block/vvfat.c:432:13 
in
  block/vvfat.c:428:13: runtime error: index 18 out of bounds for type 'uint8_t 
[11]'
  SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior block/vvfat.c:428:13 
in

As I have no idea about what this code does, simply skip the writes if
out of range, since it is not worst than what we have currently (and
my tests using vvfat work identically).

Fixes: de167e416fa ("Virtual VFAT support (Johannes Schindelin)")
Signed-off-by: Philippe Mathieu-Daudé 
---
 block/vvfat.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/block/vvfat.c b/block/vvfat.c
index c193a816646..c7162e77d68 100644
--- a/block/vvfat.c
+++ b/block/vvfat.c
@@ -423,6 +423,9 @@ static direntry_t *create_long_filename(BDRVVVFATState *s, 
const char *filename)
 if(offset<10) offset=1+offset;
 else if(offset<22) offset=14+offset-10;
 else offset=28+offset-22;
+if (offset >= ARRAY_SIZE(entry->name)) {
+continue;
+}
 entry=array_get(&(s->directory),s->directory.next-1-(i/26));
 if (i >= 2 * length + 2) {
 entry->name[offset] = 0xff;
-- 
2.26.3




[PATCH 0/4] block/vvfat: Fix leaks and out of bounds writes

2021-04-30 Thread Philippe Mathieu-Daudé
The first 3 patches are trivial leak fixes, the last
one is a RFC since I have no clue about this code.

Johannes, you wrote this 18 years ago, do you still
remember? =)

Philippe Mathieu-Daudé (4):
  block/vvfat: Fix leak of BDRVVVFATState::qcow_filename
  block/vvfat: Fix leak of BDRVVVFATState::used_clusters
  block/vvfat: Fix leak of mapping_t::path
  block/vvfat: Avoid out of bounds write in create_long_filename()

 block/vvfat.c | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

-- 
2.26.3





[PATCH 2/4] block/vvfat: Fix leak of BDRVVVFATState::used_clusters

2021-04-30 Thread Philippe Mathieu-Daudé
used_clusters is allocated in enable_write_target(), called by
vvfat_open(), but never free'd.
Allocate it using GLib API, and free it in vvfat_close().

This fixes (QEMU built with --enable-sanitizers):

  Direct leak of 64508 byte(s) in 1 object(s) allocated from:
  #0 0x55d7a36378f7 in calloc (qemu-system-x86_64+0x1dab8f7)
  #1 0x55d7a5e14246 in enable_write_target block/vvfat.c:3145:24
  #2 0x55d7a5e123aa in vvfat_open block/vvfat.c:1236:19
  #3 0x55d7a5a5363f in bdrv_open_driver block.c:1526:15
  #4 0x55d7a5a9d369 in bdrv_open_common block.c:1802:11
  #5 0x55d7a5a609f1 in bdrv_open_inherit block.c:3444:11
  #6 0x55d7a5a65411 in bdrv_open_child_bs block.c:3079:10
  #7 0x55d7a5a60079 in bdrv_open_inherit block.c:3391:19
  #8 0x55d7a5a65da3 in bdrv_open block.c:3537:12
  #9 0x55d7a5b33f6a in blk_new_open block/block-backend.c:421:10
  #10 0x55d7a5a0a33e in blockdev_init blockdev.c:610:15
  #11 0x55d7a5a088e7 in drive_new blockdev.c:994:11
  #12 0x55d7a51b10c4 in drive_init_func softmmu/vl.c:636:12
  #13 0x55d7a620e148 in qemu_opts_foreach util/qemu-option.c:1167:14
  #14 0x55d7a51b0e20 in configure_blockdev softmmu/vl.c:695:9
  #15 0x55d7a51a70b5 in qemu_create_early_backends softmmu/vl.c:1895:5
  #16 0x55d7a519bf87 in qemu_init softmmu/vl.c:3551:5
  #17 0x55d7a366f619 in main softmmu/main.c:49:5

Fixes: a046433a161 ("Major overhaul of the virtual FAT driver for read/write 
support")
Signed-off-by: Philippe Mathieu-Daudé 
---
 block/vvfat.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/block/vvfat.c b/block/vvfat.c
index 5a4a7915220..2cc21787600 100644
--- a/block/vvfat.c
+++ b/block/vvfat.c
@@ -3142,7 +3142,7 @@ static int enable_write_target(BlockDriverState *bs, 
Error **errp)
 int size = sector2cluster(s, s->sector_count);
 QDict *options;
 
-s->used_clusters = calloc(size, 1);
+s->used_clusters = g_malloc0(size);
 
 array_init(&(s->commits), sizeof(commit_t));
 
@@ -3233,6 +3233,7 @@ static void vvfat_close(BlockDriverState *bs)
 array_free(&(s->directory));
 array_free(&(s->mapping));
 g_free(s->cluster_buffer);
+g_free(s->used_clusters);
 g_free(s->qcow_filename);
 
 if (s->qcow) {
-- 
2.26.3




Re: [ANNOUNCE] libblkio v0.1.0 preview release

2021-04-30 Thread Stefan Hajnoczi
On Thu, Apr 29, 2021 at 04:00:38PM +0100, Richard W.M. Jones wrote:
> On Thu, Apr 29, 2021 at 03:41:29PM +0100, Stefan Hajnoczi wrote:
> > On Thu, Apr 29, 2021 at 03:22:59PM +0100, Richard W.M. Jones wrote:
> > > This is where I get confused about what this library actually does.
> > > It's not just a nicer wrapper around io_uring, but what is it actually
> > > doing?
> > 
> > It's a library for what QEMU calls protocol drivers (POSIX files,
> > userspace NVMe driver, etc). In particular, anything offering
> > multi-queue block I/O fits into libblkio.
> > 
> > It is not intended to replace libnbd or other network storage libraries.
> > libblkio's properties API is synchronous to keep things simple for
> > applications. Attaching to network storage needs to be asynchronous,
> > although the libblkio API could be extended if people want to support
> > network storage.
> 
> I think what confuses me is why is NVMe any different from io_uring?
> ie would this work?
> 
> $ blkio-info --output=json io_uring path=/dev/nvme0

The libblkio io_uring driver can handle /dev/nvme* block devices.

The future userspace NVMe driver mentioned above will be a VFIO PCI
driver, like the block/nvme.c driver in QEMU today. It uses a PCI device
directly, not a /dev/nvme* block device.

Stefan


signature.asc
Description: PGP signature


[PATCH 1/4] block/vvfat: Fix leak of BDRVVVFATState::qcow_filename

2021-04-30 Thread Philippe Mathieu-Daudé
qcow_filename is allocated in enable_write_target(), called by
vvfat_open(), but never free'd. Free it in vvfat_close().

This fixes (QEMU built with --enable-sanitizers):

  Direct leak of 4096 byte(s) in 1 object(s) allocated from:
  #0 0x55d7a363773f in malloc 
(/mnt/scratch/qemu/sanitizer/qemu-system-x86_64+0x1dab73f)
  #1 0x7f55c6447958 in g_malloc (/lib64/libglib-2.0.so.0+0x58958)
  #2 0x55d7a5e123aa in vvfat_open block/vvfat.c:1236:19
  #3 0x55d7a5a5363f in bdrv_open_driver block.c:1526:15
  #4 0x55d7a5a9d369 in bdrv_open_common block.c:1802:11
  #5 0x55d7a5a609f1 in bdrv_open_inherit block.c:3444:11
  #6 0x55d7a5a65411 in bdrv_open_child_bs block.c:3079:10
  #7 0x55d7a5a60079 in bdrv_open_inherit block.c:3391:19
  #8 0x55d7a5a65da3 in bdrv_open block.c:3537:12
  #9 0x55d7a5b33f6a in blk_new_open block/block-backend.c:421:10
  #10 0x55d7a5a0a33e in blockdev_init blockdev.c:610:15
  #11 0x55d7a5a088e7 in drive_new blockdev.c:994:11
  #12 0x55d7a51b10c4 in drive_init_func softmmu/vl.c:636:12
  #13 0x55d7a620e148 in qemu_opts_foreach util/qemu-option.c:1167:14
  #14 0x55d7a51b0e20 in configure_blockdev softmmu/vl.c:695:9
  #15 0x55d7a51a70b5 in qemu_create_early_backends softmmu/vl.c:1895:5
  #16 0x55d7a519bf87 in qemu_init softmmu/vl.c:3551:5
  #17 0x55d7a366f619 in main softmmu/main.c:49:5

Fixes: 8475ea48544 ("block/vvfat: Do not unref qcow on closing backing bdrv")
Signed-off-by: Philippe Mathieu-Daudé 
---
 block/vvfat.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/block/vvfat.c b/block/vvfat.c
index 54807f82ca1..5a4a7915220 100644
--- a/block/vvfat.c
+++ b/block/vvfat.c
@@ -3233,6 +3233,7 @@ static void vvfat_close(BlockDriverState *bs)
 array_free(&(s->directory));
 array_free(&(s->mapping));
 g_free(s->cluster_buffer);
+g_free(s->qcow_filename);
 
 if (s->qcow) {
 migrate_del_blocker(s->migration_blocker);
-- 
2.26.3




Re: [ANNOUNCE] libblkio v0.1.0 preview release

2021-04-30 Thread Stefan Hajnoczi
On Thu, Apr 29, 2021 at 05:51:16PM +0200, Kevin Wolf wrote:
> Am 29.04.2021 um 16:05 hat Stefan Hajnoczi geschrieben:
> > Hi,
> > A preview release of libblkio, a library for high-performance block I/O,
> > is now available:
> > 
> >   https://gitlab.com/libblkio/libblkio/-/tree/v0.1.0
> > 
> > Applications are increasingly integrating high-performance I/O
> > interfaces such as Linux io_uring, userspace device drivers, and
> > vhost-user device support. The effort required to add each of these
> > low-level interfaces into an application is relatively high. libblkio
> > provides a single API for efficiently accessing block devices and
> > eliminates the need to write custom code for each one.
> > 
> > The library is not yet ready to use and currently lacks many features.
> > In fact, I hope this news doesn't spread too far yet because libblkio is
> > at a very early stage, but feedback from the QEMU community is necessary
> > at this time.
> 
> I'm a bit worried about the configuration interface. This looks an awful
> lot like plain QOM properties, which have proven to result in both very
> verbose (not to say messy) and rather error prone implementations.
> 
> You have to write setters/getters for every property, even if it usually
> ends up doing the same thing, storing the value somewhere. Worse, these
> getters and setters have to work in very different circumstances, namely
> initialisation where you usually have to store the value away so that it
> can be checked for consistency with other properties in .realize() or
> .complete(), and property updates during runtime. Often enough, we
> forget the latter and create bugs. If we don't create bugs, we usually
> start with 'if (realized)' and have two completely different code paths.
> Another common bug in QOM objects is forgetting to check if mandatory
> properties were actually set.
>
> Did you already consider these problems and decided to go this way
> anyway, or is this more of an accidental design? And if the former, what
> were the reasons that made it appealing?

That's true. Here is the code to reject accesses when the instance is
not initialized:

  self.must_be_initialized()?;

It's very concise but you still need to remember to add it.

The overall reasons for choosing the properties API were:

1. It keeps the library's API very minimal (just generic getters/setters
   for primitive types). It minimizes ABI compatibility issues because
   there are no configuration structs or functions exported by the
   library.

2. It's trivial to have a string setter/getter that automatically
   converts to the primitive type representation, so application config
   file or command-line values can easily be set.

   This is kind of the inverse of what you're saying. If the library
   offers dedicated interfaces for each configuration value then the
   library doesn't need getters/setters for each one but all
   applications need special-purpose code for each configuration value.

That said, this is exactly why I published the preview release. If
someone has a better way to do this or the feedback is just that this is
bad style, then I'm happy to change it.

> Alternatives in QEMU are qdev properties (which are internally QOM
> properties, but provide default implementations and are at least
> automatically read-only after realize, avoiding that whole class of
> bugs) and QAPI.
> If this was QEMU code, I would of course go for QAPI, but a library is
> something different and adding the code generator would probably be a
> bit too much anyway. But the idea in the resulting code would be dealing
> with native structs instead of a bunch of function calls. This would
> probably be the least error prone way for the implementation, but of
> course, it would make binary compatibility a bit harder when adding new
> properties.

An alternative I considered was the typestate and builder patterns:

  /* Create a new io_uring driver in the uninitialized state */
  struct blkio_iou_uninit *blkio_new_io_uring(void);

  /* Uninitialized state property setters */
  int blkio_iou_uninit_set_path(struct blkio_iou_uninit *u,
const char *path);
  int blkio_iou_uninit_set_direct(struct blkio_iou_uninit *u,
  bool o_direct);
  ...

  /* Transition to initialized state. Frees u on success. */
  struct blkio_iou_init *blkio_iou_init(struct blkio_iou_uninit *u);

  /* Initialized state property setters/getters */
  int blkio_iou_init_get_capacity(struct blkio_iou_init *i,
  uint64_t *capacity);
  ...

  /* Transition to started state. Frees i on success. */
  struct blkio_iou_started *blkio_iou_start(struct blkio_iou_init *i);

  ...

  /* Transition back to initialized state. Frees s on success. */
  struct blkio_iou_init *blkio_iou_stop(struct blkio_iou_started *s);

On the plus side:

- No state checks are needed because an API won't even exist if it's
  unavailable in a given state

Re: [PULL 00/39] Block layer patches

2021-04-30 Thread Peter Maydell
On Fri, 30 Apr 2021 at 11:51, Kevin Wolf  wrote:
>
> The following changes since commit ccdf06c1db192152ac70a1dd974c624f566cb7d4:
>
>   Open 6.1 development tree (2021-04-30 11:15:40 +0100)
>
> are available in the Git repository at:
>
>   git://repo.or.cz/qemu/kevin.git tags/for-upstream
>
> for you to fetch changes up to 68bf7336533faa6aa90fdd4558edddbf5d8ef814:
>
>   vhost-user-blk: Fail gracefully on too large queue size (2021-04-30 
> 12:27:48 +0200)
>
> 
> Block layer patches
>
> - Fix permission update order problems with block graph changes
> - qemu-img convert: Unshare write permission for source
> - vhost-user-blk: Fail gracefully on too large queue size


Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/6.1
for any user-visible changes.

-- PMM



Re: [PATCH v2] Document qemu-img options data_file and data_file_raw

2021-04-30 Thread Max Reitz

On 30.04.21 15:34, Connor Kuehl wrote:

The contents of this patch were initially developed and posted by Han
Han[1], however, it appears the original patch was not applied. Since
then, the relevant documentation has been moved and adapted to a new
format.

I've taken most of the original wording and tweaked it according to
some of the feedback from the original patch submission. I've also
adapted it to restructured text, which is the format the documentation
currently uses.

[1] https://lists.nongnu.org/archive/html/qemu-block/2019-10/msg01253.html

Fixes: https://bugzilla.redhat.com/1763105
Signed-off-by: Han Han 
Signed-off-by: Connor Kuehl 
---
Changes since v1:
   * Clarify different behaviors with these options when using qemu-img
 create vs amend (Max)
   * Touch on the negative case of how the file becomes inconsistent
 (John)

  docs/tools/qemu-img.rst | 20 
  1 file changed, 20 insertions(+)

diff --git a/docs/tools/qemu-img.rst b/docs/tools/qemu-img.rst
index c9efcfaefc..87b4a65535 100644
--- a/docs/tools/qemu-img.rst
+++ b/docs/tools/qemu-img.rst
@@ -866,6 +866,26 @@ Supported image file formats:
  issue ``lsattr filename`` to check if the NOCOW flag is set or not
  (Capital 'C' is NOCOW flag).
  
+  ``data_file``

+Filename where all guest data will be stored. If this option is used,
+the qcow2 file will only contain the image's metadata.
+
+Note: Data loss will occur if the given filename already exists when
+using this option with ``qemu-img create`` since ``qemu-img`` will create
+the data file anew, overwriting the file's original contents. To simply
+update the reference to point to the given pre-existing file, use
+``qemu-img amend``.
+
+  ``data_file_raw``
+If this option is set to ``on``, QEMU will always keep the external
+data file consistent as a standalone read-only raw image. It does
+this by forwarding updates through to the raw image in addition to
+updating the image metadata. If set to ``off``, QEMU will only
+update the image metadata without forwarding the changes through
+to the raw image. The default value is ``off``.


Hm, what updates and what changes?  I mean, the first part makes sense 
(the “It does this by...”), but the second part doesn’t.  qemu will 
still forward most writes to the data file.  (Not all, but most.)


(Also, nit pick: With data_file_raw=off, the data file is not a raw 
image.  (You still call it that in the penultimate sentence.))


When you write data to a qcow2 file with data_file, the data also goes 
to the data_file, most of the time.  The exception is when it can be 
handled with a metadata update, i.e. when it's a zero write or discard.


In addition, such updates (i.e. zero writes, I presume) not happening to 
the data file are usually a minor problem.  The real problem is that 
without data_file_raw, data clusters can be allocated anywhere in the 
data file, whereas with data_file_raw, they are allocated at their 
respective guest offset (i.e. the host offset always equals the guest 
offset).


I personally would have been fine with the first sentence, but if we 
want more of an explanation...  Perhaps:




Re: [PATCH v3 15/15] docs/devel/testing: add -p option to the debug section of QEMU iotests

2021-04-30 Thread Max Reitz

On 14.04.21 19:03, Emanuele Giuseppe Esposito wrote:

Signed-off-by: Emanuele Giuseppe Esposito 
---
  docs/devel/testing.rst | 4 
  1 file changed, 4 insertions(+)

diff --git a/docs/devel/testing.rst b/docs/devel/testing.rst
index 62902cfd2d..0c18fc4571 100644
--- a/docs/devel/testing.rst
+++ b/docs/devel/testing.rst
@@ -246,6 +246,10 @@ given as options to the ``check`` script:
  * ``-d`` (debug) just increases the logging verbosity, showing
for example the QMP commands and answers.
  
+* ``-p`` (print) allows QEMU binary stdout to be shown in the


stderr, too.


+  test console, instead of saving it into a log file in
+  ``$TEST_DIR/qemu-machine-``.
+


It doesn’t allow this, though, it forces it.  That means that tests that 
use the log will fail (e.g. 245)[1].  That is, I’d drop the “allows” and 
just state “redirect QEMU’s stdout and stderr to the test output, 
instead of...”.


[1] I realize that all tests will technically fail with -p, because the 
reference output differs, but 245 will not just fail because of that 
difference, but because two of its cases actually really fail.
No need to make a note of this, though.  It’s just that “allows” sounds 
a bit like qemu could choose to put some info into the test output and 
some into the log, when really it’s all or nothing (-p or not -p).


Max


  Test case groups
  
  






[PATCH v2] Document qemu-img options data_file and data_file_raw

2021-04-30 Thread Connor Kuehl
The contents of this patch were initially developed and posted by Han
Han[1], however, it appears the original patch was not applied. Since
then, the relevant documentation has been moved and adapted to a new
format.

I've taken most of the original wording and tweaked it according to
some of the feedback from the original patch submission. I've also
adapted it to restructured text, which is the format the documentation
currently uses.

[1] https://lists.nongnu.org/archive/html/qemu-block/2019-10/msg01253.html

Fixes: https://bugzilla.redhat.com/1763105
Signed-off-by: Han Han 
Signed-off-by: Connor Kuehl 
---
Changes since v1:
  * Clarify different behaviors with these options when using qemu-img
create vs amend (Max)
  * Touch on the negative case of how the file becomes inconsistent
(John)

 docs/tools/qemu-img.rst | 20 
 1 file changed, 20 insertions(+)

diff --git a/docs/tools/qemu-img.rst b/docs/tools/qemu-img.rst
index c9efcfaefc..87b4a65535 100644
--- a/docs/tools/qemu-img.rst
+++ b/docs/tools/qemu-img.rst
@@ -866,6 +866,26 @@ Supported image file formats:
 issue ``lsattr filename`` to check if the NOCOW flag is set or not
 (Capital 'C' is NOCOW flag).
 
+  ``data_file``
+Filename where all guest data will be stored. If this option is used,
+the qcow2 file will only contain the image's metadata.
+
+Note: Data loss will occur if the given filename already exists when
+using this option with ``qemu-img create`` since ``qemu-img`` will create
+the data file anew, overwriting the file's original contents. To simply
+update the reference to point to the given pre-existing file, use
+``qemu-img amend``.
+
+  ``data_file_raw``
+If this option is set to ``on``, QEMU will always keep the external
+data file consistent as a standalone read-only raw image. It does
+this by forwarding updates through to the raw image in addition to
+updating the image metadata. If set to ``off``, QEMU will only
+update the image metadata without forwarding the changes through
+to the raw image. The default value is ``off``.
+
+This option can only be enabled if ``data_file`` is set.
+
 ``Other``
 
   QEMU also supports various other image file formats for
-- 
2.30.2




Re: [PATCH v3 03/15] docs/devel/testing: add debug section to the QEMU iotests chapter

2021-04-30 Thread Paolo Bonzini

On 30/04/21 13:23, Max Reitz wrote:


+---
+QEMU iotests offers some options to debug a failing test, that can be
+given as options to the ``check`` script: 


-,


Even better: "The following options to the ``check`` script can be 
useful when debugging a failing test:".


Paolo




Re: [PATCH v3 14/15] qemu_iotests: add option to show qemu binary logs on stdout

2021-04-30 Thread Max Reitz

On 14.04.21 19:03, Emanuele Giuseppe Esposito wrote:

Using the flag -p, allow the qemu binary to print to stdout.
This helps especially when doing print-debugging.


I think this shouldn’t refer to prints but to qemu’s stdout/stderr in 
general, i.e



Signed-off-by: Emanuele Giuseppe Esposito 
---
  tests/qemu-iotests/check  | 3 ++-
  tests/qemu-iotests/iotests.py | 9 +
  tests/qemu-iotests/testenv.py | 9 +++--
  3 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/tests/qemu-iotests/check b/tests/qemu-iotests/check
index 489178d9a4..84483922eb 100755
--- a/tests/qemu-iotests/check
+++ b/tests/qemu-iotests/check
@@ -33,6 +33,7 @@ def make_argparser() -> argparse.ArgumentParser:
 help='pretty print output for make check')
  
  p.add_argument('-d', dest='debug', action='store_true', help='debug')

+p.add_argument('-p', dest='print', action='store_true', help='shows qemu 
binary prints to stdout')


I’d prefer for the description to be “redirects qemu's stdout and stderr 
to the test output”.


Also, this line is too long.


  p.add_argument('-gdb', action='store_true',
 help="start gdbserver with $GDB_QEMU options. \
   Default is localhost:12345")
@@ -117,7 +118,7 @@ if __name__ == '__main__':
aiomode=args.aiomode, cachemode=args.cachemode,
imgopts=args.imgopts, misalign=args.misalign,
debug=args.debug, valgrind=args.valgrind,
-  gdb=args.gdb)
+  gdb=args.gdb, qprint=args.print)
  
  testfinder = TestFinder(test_dir=env.source_iotests)
  
diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py

index f9832558a0..52ff7332f8 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -79,6 +79,8 @@
  if os.environ.get('GDB_QEMU'):
  qemu_gdb = ['gdbserver'] + os.environ.get('GDB_QEMU').strip().split(' ')
  
+qemu_print = os.environ.get('PRINT_QEMU', False)

+
  imgfmt = os.environ.get('IMGFMT', 'raw')
  imgproto = os.environ.get('IMGPROTO', 'file')
  output_dir = os.environ.get('OUTPUT_DIR', '.')
@@ -621,6 +623,13 @@ def _post_shutdown(self) -> None:
  super()._post_shutdown()
  self.subprocess_check_valgrind(qemu_valgrind)
  
+def _pre_launch(self) -> None:

+super()._pre_launch()
+if qemu_print and self._qemu_log_file != None:


I think "is not None" is mostly used in Python.  (I’m saying this in 
this weird way because I’m not the one to ask what’s mostly used in 
Python...)


(That also silences mypy, which otherwise complains and then fails 297.)


+# set QEMU binary output to stdout
+self._qemu_log_file.close()
+self._qemu_log_file = None


I don’t know enough Python to know how this works.  I suppose this does 
access the super class’s member?  (I could also imagine it creates a new 
member, just for this child class, but that then effectively overwrites 
the super class’s member.)


Considering _qemu_log_file is apparently meant to be a private attribute 
(from the leading underscore), I think it would be better to have a 
function provided by the super class for this.



+
  def add_object(self, opts):
  self._args.append('-object')
  self._args.append(opts)
diff --git a/tests/qemu-iotests/testenv.py b/tests/qemu-iotests/testenv.py
index 39ae7ace33..6ae099114e 100644
--- a/tests/qemu-iotests/testenv.py
+++ b/tests/qemu-iotests/testenv.py
@@ -73,7 +73,7 @@ class TestEnv(ContextManager['TestEnv']):
   'AIOMODE', 'CACHEMODE', 'VALGRIND_QEMU',
   'CACHEMODE_IS_DEFAULT', 'IMGFMT_GENERIC', 
'IMGOPTSSYNTAX',
   'IMGKEYSECRET', 'QEMU_DEFAULT_MACHINE', 
'MALLOC_PERTURB_',
- 'GDB_QEMU']
+ 'GDB_QEMU', 'PRINT_QEMU']
  
  def get_env(self) -> Dict[str, str]:

  env = {}
@@ -165,7 +165,8 @@ def __init__(self, imgfmt: str, imgproto: str, aiomode: str,
   misalign: bool = False,
   debug: bool = False,
   valgrind: bool = False,
- gdb: bool = False) -> None:
+ gdb: bool = False,
+ qprint: bool = False) -> None:
  self.imgfmt = imgfmt
  self.imgproto = imgproto
  self.aiomode = aiomode
@@ -173,6 +174,9 @@ def __init__(self, imgfmt: str, imgproto: str, aiomode: str,
  self.misalign = misalign
  self.debug = debug
  
+if qprint:

+self.print_qemu = 'y'
+
  if gdb:
  self.gdb_qemu = os.environ.get('GDB_QEMU', 'localhost:12345')
  elif 'GDB_QEMU' in os.environ:
@@ -278,6 +282,7 @@ def print_env(self) -> None:
  SOCKET_SCM_HELPER -- {SOCKET_SCM_HELPER}
  GDB_QEMU  -- "{GDB_QEMU}"
  VALGRIND_QEMU -- "{VALGRIND_QEMU}"
+PRINT_QEMU--  "{PRINT_QEMU}"


Again, I personally don’t like the q

Re: [PATCH v3 13/15] docs/devel/testing: add -valgrind option to the debug section of QEMU iotests

2021-04-30 Thread Max Reitz

On 14.04.21 19:03, Emanuele Giuseppe Esposito wrote:

Signed-off-by: Emanuele Giuseppe Esposito 
---
  docs/devel/testing.rst | 7 +++
  1 file changed, 7 insertions(+)

diff --git a/docs/devel/testing.rst b/docs/devel/testing.rst
index 2ee77a057b..62902cfd2d 100644
--- a/docs/devel/testing.rst
+++ b/docs/devel/testing.rst
@@ -236,6 +236,13 @@ given as options to the ``check`` script:
by setting the ``$GDB_QEMU`` environmental variable.
The final command line will be ``gdbserver $GDB_QEMU $QEMU ...``
  
+* ``-valgrind`` wraps a valgrind instance to QEMU. If it detects


Not a native speaker, but I think it should be “attaches ... to QEMU”, 
“wraps QEMU in a valgrind instances”, or “wraps ... around QEMU".


Apart from that:

Reviewed-by: Max Reitz 


+  warnings, it will print and save the log in
+  ``$TEST_DIR/.valgrind``.
+  The final command line will be ``valgrind --log-file=$TEST_DIR/
+  .valgrind --error-exitcode=99 $QEMU ...``
+  Note: if used together with ``-gdb``, this command will be ignored.
+
  * ``-d`` (debug) just increases the logging verbosity, showing
for example the QMP commands and answers.
  






Re: [PATCH v3 06/15] qemu_iotests: insert gdbserver command line as wrapper for qemu binary

2021-04-30 Thread Max Reitz

On 14.04.21 19:03, Emanuele Giuseppe Esposito wrote:

Signed-off-by: Emanuele Giuseppe Esposito 
---
  tests/qemu-iotests/iotests.py | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)


Reviewed-by: Max Reitz 




Re: [PATCH v3 12/15] qemu_iotests: insert valgrind command line as wrapper for qemu binary

2021-04-30 Thread Max Reitz

On 14.04.21 19:03, Emanuele Giuseppe Esposito wrote:

The priority will be given to gdb command line, meaning if the -gdb
parameter and -valgrind are given, gdb will be wrapped around
the qemu binary.

Signed-off-by: Emanuele Giuseppe Esposito 
---
  tests/qemu-iotests/iotests.py | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)


Reviewed-by: Max Reitz 




Re: [PATCH 2/3] hw/ide: Add Kconfig dependency MICRODRIVE -> PCMCIA

2021-04-30 Thread Laurent Vivier
Le 25/04/2021 à 00:20, Philippe Mathieu-Daudé a écrit :
> The Microdrive Compact Flash can be plugged on a PCMCIA bus.
> Express the dependency using the 'depends on' Kconfig expression.
> 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  hw/ide/Kconfig | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/hw/ide/Kconfig b/hw/ide/Kconfig
> index 5d9106b1ac2..8e2c8934549 100644
> --- a/hw/ide/Kconfig
> +++ b/hw/ide/Kconfig
> @@ -41,6 +41,7 @@ config IDE_VIA
>  config MICRODRIVE
>  bool
>  select IDE_QDEV
> +depends on PCMCIA
>  
>  config AHCI
>  bool
> 

Applied to my trivial-patches branch.

Thanks,
Laurent




Re: [PATCH v3 11/15] qemu_iotests: allow valgrind to read/delete the generated log file

2021-04-30 Thread Max Reitz

On 14.04.21 19:03, Emanuele Giuseppe Esposito wrote:

When using -valgrind on the script tests, it generates a log file
in $TEST_DIR that is either read (if valgrind finds problems) or
otherwise deleted. Provide the same exact behavior when using
-valgrind on the python tests.

Signed-off-by: Emanuele Giuseppe Esposito 
---
  tests/qemu-iotests/iotests.py | 20 
  1 file changed, 20 insertions(+)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 94597433fa..aef67e3a86 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -600,6 +600,26 @@ def __init__(self, path_suffix=''):
   sock_dir=sock_dir)
  self._num_drives = 0
  
+def subprocess_check_valgrind(self, valgrind) -> None:


A type annotation would be nice.  (I.e. List[str], or we make it a bool 
and the caller uses bool(qemu_valgrind).)



+


I’d drop this empty line.


+if not valgrind:
+return
+
+valgrind_filename =  test_dir + "/" + str(self._popen.pid) + 
".valgrind"


mypy (iotest 297) complains that _popen is Optional[], so .pid might not 
exist.  Perhaps this should be safeguarded in the "if not valgrind" 
condition (i.e. "if not valgrind or not self._popen").


Also, pylint complains about the line being too long (79 is the maximum 
length).  Can be fixed with an f-string:


valgrind_filename = f"{test_dir}/{self._popen.pid}.valgrind"


+
+if self.exitcode() == 99:
+with open(valgrind_filename) as f:
+content = f.readlines()
+for line in content:
+print(line, end ="")


'end=""' would be better, I think.  (flake8 complains.)

Also, would this be better as:

with open(valgrind_filename) as f:
for line in f.readlines():
print(line, end="")

?

(Or just

with open(valgrind_filename) as f:
print(f.read())

– wouldn’t that work, too?)

Max


+print("")
+else:
+os.remove(valgrind_filename)
+
+def _post_shutdown(self) -> None:
+super()._post_shutdown()
+self.subprocess_check_valgrind(qemu_valgrind)
+
  def add_object(self, opts):
  self._args.append('-object')
  self._args.append(opts)






Re: [PATCH 3/3] hw/pcmcia: Do not register PCMCIA type if not required

2021-04-30 Thread Laurent Vivier
Le 25/04/2021 à 00:20, Philippe Mathieu-Daudé a écrit :
> If the Kconfig 'PCMCIA' value is not selected, it is pointless
> to build the PCMCIA core components.
> 
> (Currently only one machine of the ARM targets requires this).
> 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  hw/pcmcia/meson.build | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/pcmcia/meson.build b/hw/pcmcia/meson.build
> index ab50bd325d6..51f2512b8ed 100644
> --- a/hw/pcmcia/meson.build
> +++ b/hw/pcmcia/meson.build
> @@ -1,2 +1,2 @@
> -softmmu_ss.add(files('pcmcia.c'))
> +softmmu_ss.add(when: 'CONFIG_PCMCIA', if_true: files('pcmcia.c'))
>  softmmu_ss.add(when: 'CONFIG_PXA2XX', if_true: files('pxa2xx.c'))
> 

Applied to my trivial-patches branch.

Thanks,
Laurent



Re: [PATCH 1/3] hw/arm/pxa2xx: Declare PCMCIA bus with Kconfig

2021-04-30 Thread Laurent Vivier
Le 25/04/2021 à 00:20, Philippe Mathieu-Daudé a écrit :
> The Intel XScale PXA chipsets provide a PCMCIA controller,
> which expose a PCMCIA (IDE) bus. Express this dependency using
> the Kconfig 'select' expression.
> 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  hw/arm/Kconfig | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/hw/arm/Kconfig b/hw/arm/Kconfig
> index 8c37cf00da7..b887f6a5b17 100644
> --- a/hw/arm/Kconfig
> +++ b/hw/arm/Kconfig
> @@ -142,6 +142,7 @@ config PXA2XX
>  select SD
>  select SSI
>  select USB_OHCI
> +select PCMCIA
>  
>  config GUMSTIX
>  bool
> 

Applied to my trivial-patches branch.

Thanks,
Laurent




Re: [PATCH v3 10/15] qemu_iotests: extent QMP socket timeout when using valgrind

2021-04-30 Thread Max Reitz

On 14.04.21 19:03, Emanuele Giuseppe Esposito wrote:

As with gdbserver, valgrind delays the test execution, so
the default QMP socket timeout timeout too soon.


I’m curious: The default timeouts should be long enough for slow 
systems, too, though (e.g. heavily-loaded CI systems).  I would expect 
that valgrind is used on developer systems where there is more leeway, 
so the timeouts should still work.


But in practice, thinking about that doesn’t matter.  If valgrind leads 
to a timeout being hit, that wouldn’t be nice.  OTOH, if you run 
valgrind to debug a test/qemu, you don’t particularly care about the 
timeouts anyway.


So in principle, this patch sounds good to me, it’s just that it’s based 
on patch 5, which I don’t fully agree with.


Max


Signed-off-by: Emanuele Giuseppe Esposito 
---
  python/qemu/machine.py| 2 +-
  tests/qemu-iotests/iotests.py | 4 ++--
  2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/python/qemu/machine.py b/python/qemu/machine.py
index d6142271c2..dce96e1858 100644
--- a/python/qemu/machine.py
+++ b/python/qemu/machine.py
@@ -410,7 +410,7 @@ def _launch(self) -> None:
 shell=False,
 close_fds=False)
  
-if 'gdbserver' in self._wrapper:

+if 'gdbserver' in self._wrapper or 'valgrind' in self._wrapper:
  self._qmp_timer = None
  self._post_launch()
  
diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py

index a2e8604674..94597433fa 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -489,7 +489,7 @@ def log(msg: Msg,
  
  class Timeout:

  def __init__(self, seconds, errmsg="Timeout"):
-if qemu_gdb:
+if qemu_gdb or qemu_valgrind:
  self.seconds = 3000
  else:
  self.seconds = seconds
@@ -700,7 +700,7 @@ def qmp_to_opts(self, obj):
  return ','.join(output_list)
  
  def get_qmp_events(self, wait: bool = False) -> List[QMPMessage]:

-if qemu_gdb:
+if qemu_gdb or qemu_valgrind:
  wait = 0.0
  return super().get_qmp_events(wait=wait)
  






Re: [PATCH v3 05/15] qemu-iotests: delay QMP socket timers

2021-04-30 Thread Max Reitz

On 14.04.21 19:03, Emanuele Giuseppe Esposito wrote:

Attaching a gdbserver implies that the qmp socket
should wait indefinitely for an answer from QEMU.

Signed-off-by: Emanuele Giuseppe Esposito 
---
  python/qemu/machine.py|  3 +++
  tests/qemu-iotests/iotests.py | 10 +-
  2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/python/qemu/machine.py b/python/qemu/machine.py
index 12752142c9..d6142271c2 100644
--- a/python/qemu/machine.py
+++ b/python/qemu/machine.py
@@ -409,6 +409,9 @@ def _launch(self) -> None:
 stderr=subprocess.STDOUT,
 shell=False,
 close_fds=False)
+
+if 'gdbserver' in self._wrapper:
+self._qmp_timer = None


Why doesn’t __init__() evaluate this?  This here doesn’t feel like the 
right place for it.  If we want to evaluate it here, self._qmp_timer 
shouldn’t exist, and instead the timeout should be a _post_launch() 
parameter.  (Which I would have nothing against, by the way.)


Also, mypy complains that this variable is a float, so iotest 297 (which 
runs mypy) fails.



  self._post_launch()
  
  def _early_cleanup(self) -> None:

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 05d0dc0751..380527245e 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -478,7 +478,10 @@ def log(msg: Msg,
  
  class Timeout:

  def __init__(self, seconds, errmsg="Timeout"):
-self.seconds = seconds
+if qemu_gdb:
+self.seconds = 3000
+else:
+self.seconds = seconds


We might as well completely disable the timeout then, that would be more 
honest, I think.  (I.e. to make __enter__ and __exit__ no-ops.)



  self.errmsg = errmsg
  def __enter__(self):
  signal.signal(signal.SIGALRM, self.timeout)
@@ -684,6 +687,11 @@ def qmp_to_opts(self, obj):
  output_list += [key + '=' + obj[key]]
  return ','.join(output_list)
  
+def get_qmp_events(self, wait: bool = False) -> List[QMPMessage]:

+if qemu_gdb:
+wait = 0.0


First, this is a bool.  I can see that qmp.py expects a
Union[bool, float], but machine.py expects just a bool.  (Also, mypy 
complains that this specific `wait` here is a `bool`.  You can see that 
by running iotest 297.)


Second, I don’t understand this.  If the caller wants to block waiting 
on an event, then that should have nothing to do with whether we have 
gdb running or not.  As far as I understand, setting wait to 0.0 is the 
same as wait = False, i.e. we don’t block and just return None 
immediately if there is no pending event.


Max


+return super().get_qmp_events(wait=wait)
+
  def get_qmp_events_filtered(self, wait=60.0):
  result = []
  for ev in self.get_qmp_events(wait=wait):






Re: [PATCH v3 09/15] qemu_iotests: extend the check script to support valgrind for python tests

2021-04-30 Thread Max Reitz

On 14.04.21 19:03, Emanuele Giuseppe Esposito wrote:

Currently, the check script only parses the option and sets the
VALGRIND_QEMU environmental variable to "y".
Add another local python variable that prepares the command line,
identical to the one provided in the test scripts.

Because the python script does not know in advance the valgring
PID to assign to the log file name, use the "%p" flag in valgrind
log file name that automatically puts the process PID at runtime.

Signed-off-by: Emanuele Giuseppe Esposito 
---
  tests/qemu-iotests/check  |  7 ---
  tests/qemu-iotests/iotests.py | 11 +++
  tests/qemu-iotests/testenv.py |  1 +
  3 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/tests/qemu-iotests/check b/tests/qemu-iotests/check
index 6186495eee..489178d9a4 100755
--- a/tests/qemu-iotests/check
+++ b/tests/qemu-iotests/check
@@ -36,6 +36,10 @@ def make_argparser() -> argparse.ArgumentParser:
  p.add_argument('-gdb', action='store_true',
 help="start gdbserver with $GDB_QEMU options. \
   Default is localhost:12345")
+p.add_argument('-valgrind', action='store_true',
+help='use valgrind, sets VALGRIND_QEMU environment '
+'variable')
+
  p.add_argument('-misalign', action='store_true',
 help='misalign memory allocations')
  p.add_argument('--color', choices=['on', 'off', 'auto'],
@@ -86,9 +90,6 @@ def make_argparser() -> argparse.ArgumentParser:
  g_bash.add_argument('-o', dest='imgopts',
  help='options to pass to qemu-img create/convert, '
  'sets IMGOPTS environment variable')
-g_bash.add_argument('-valgrind', action='store_true',
-help='use valgrind, sets VALGRIND_QEMU environment '
-'variable')
  
  g_sel = p.add_argument_group('test selecting options',

   'The following options specify test set '
diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 4f3fb13915..a2e8604674 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -96,6 +96,17 @@
  sys.stderr.write('Please run this test via the "check" script\n')
  sys.exit(os.EX_USAGE)
  
+qemu_valgrind = []

+if os.environ.get('VALGRIND_QEMU') == "y" and \
+os.environ.get('NO_VALGRIND') != "y":
+valgrind_logfile = "--log-file=" + test_dir.strip()
+# %p allows to put the valgrind process PID, since
+# we don't know it a priori (subprocess.Peopen is


*Popen


+# not yet invoked)
+valgrind_logfile += "/%p.valgrind"
+
+qemu_valgrind = ['valgrind', valgrind_logfile, '--error-exitcode=99']
+
  socket_scm_helper = os.environ.get('SOCKET_SCM_HELPER', 'socket_scm_helper')
  
  luks_default_secret_object = 'secret,id=keysec0,data=' + \

diff --git a/tests/qemu-iotests/testenv.py b/tests/qemu-iotests/testenv.py
index e131ff42cb..39ae7ace33 100644
--- a/tests/qemu-iotests/testenv.py
+++ b/tests/qemu-iotests/testenv.py
@@ -277,6 +277,7 @@ def print_env(self) -> None:
  SOCK_DIR  -- {SOCK_DIR}
  SOCKET_SCM_HELPER -- {SOCKET_SCM_HELPER}
  GDB_QEMU  -- "{GDB_QEMU}"
+VALGRIND_QEMU -- "{VALGRIND_QEMU}"


I don’t think this should be enclosed in quotes (and neither should 
GDB_QEMU be).  AFAIU, only the *_PROG options are quoted to signify that 
they will be executed as a single program, whether they include spaces 
or not.  OTOH, GDB_QEMU is a list of options, so spaces act as 
separators, and VALGRIND_QEMU is just y or not set, so it’s impossible 
for spaces to be in there.


(Also, I think the “--” should be aligned to all the other “--”s.)

Bikeshedding, though, so:

Reviewed-by: Max Reitz 


  """
  
  args = collections.defaultdict(str, self.get_env())







Re: [PATCH v3 08/15] docs/devel/testing: add -gdb option to the debugging section of QEMU iotests

2021-04-30 Thread Max Reitz

On 14.04.21 19:03, Emanuele Giuseppe Esposito wrote:

Signed-off-by: Emanuele Giuseppe Esposito 
---
  docs/devel/testing.rst | 7 +++
  1 file changed, 7 insertions(+)

diff --git a/docs/devel/testing.rst b/docs/devel/testing.rst
index b7e2370e7e..2ee77a057b 100644
--- a/docs/devel/testing.rst
+++ b/docs/devel/testing.rst
@@ -229,6 +229,13 @@ Debugging a test case
  QEMU iotests offers some options to debug a failing test, that can be
  given as options to the ``check`` script:
  
+* ``-gdb`` wraps ``gdbsever`` to the QEMU binary,

+  so it is possible to connect to it via gdb.
+  One way to do so is via ``gdb -iex "target remote $GDB_QEMU"``
+  The default address is ``localhost:12345``, and can be changed
+  by setting the ``$GDB_QEMU`` environmental variable.


*environment variable


+  The final command line will be ``gdbserver $GDB_QEMU $QEMU ...``
+


I think the order in this explanation is ordered not quite right, 
because it uses $GDB_QEMU before explaining what it is.  (Also, I 
suppose $GDB_QEMU might contain other options than the socket address, 
so "target remote $GDB_QEMU" does not necessarily work.)  I’d 
reorder/change it to:


``-gdb`` wraps every QEMU invocation in a ``gdbserver``, which waits for 
a connection from a gdb client.  The options given to ``gdbserver`` 
(e.g. the address on which to listen for connections) are taken from the 
``$GDB_QEMU`` environment variable.  By default (if ``$GDB_QEMU`` is 
empty), it listens on ``localhost:12345``.
You can connect to it for example with ``gdb -iex "target remote 
$addr"``, where ``$addr`` is the address ``gdbserver`` listens on.


Max


  * ``-d`` (debug) just increases the logging verbosity, showing
for example the QMP commands and answers.
  






Re: [PATCH v3 07/15] qemu-iotests: add gdbserver option to script tests too

2021-04-30 Thread Max Reitz

On 14.04.21 19:03, Emanuele Giuseppe Esposito wrote:

The only limitation here is that running a script with gdbserver
will make the test output mismatch with the expected
results, making the test fail.

Signed-off-by: Emanuele Giuseppe Esposito 
---
  tests/qemu-iotests/common.rc | 8 +++-
  1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/common.rc b/tests/qemu-iotests/common.rc
index 65cdba5723..53a3310fee 100644
--- a/tests/qemu-iotests/common.rc
+++ b/tests/qemu-iotests/common.rc
@@ -166,8 +166,14 @@ _qemu_wrapper()
  if [ -n "${QEMU_NEED_PID}" ]; then
  echo $BASHPID > "${QEMU_TEST_DIR}/qemu-${_QEMU_HANDLE}.pid"
  fi
+
+GDB="${QEMU_PROG}"
+if [ ! -z ${GDB_QEMU} ]; then
+GDB="gdbserver ${GDB_QEMU} ${GDB}"
+fi
+
  VALGRIND_QEMU="${VALGRIND_QEMU_VM}" _qemu_proc_exec 
"${VALGRIND_LOGFILE}" \
-"$QEMU_PROG" $QEMU_OPTIONS "$@"
+   $GDB $QEMU_OPTIONS "$@"


This looks strange, because now there is no qemu here.

Why not just fill $GDB with the GDB options, keeping it empty if no gdb 
is to be used?  Then we’d have


VALGRIND_QEMU=... ... \
$GDB "$QEMU_PROG" $QEMU_OPTIONS "$@"

here.

(Also, the indentation changed.  I’d keep it aligned to four spaces.)

Max


  )
  RETVAL=$?
  _qemu_proc_valgrind_log "${VALGRIND_LOGFILE}" $RETVAL






Re: [PATCH v3 04/15] qemu-iotests: add option to attach gdbserver

2021-04-30 Thread Max Reitz

On 14.04.21 19:03, Emanuele Giuseppe Esposito wrote:

Add -gdb flag and GDB_QEMU environmental variable
to python tests to attach a gdbserver to each qemu instance.

if -gdb is not provided but $GDB_QEMU is set, ignore the
environmental variable.

Signed-off-by: Emanuele Giuseppe Esposito 
---
  tests/qemu-iotests/check  |  6 +-
  tests/qemu-iotests/iotests.py |  4 
  tests/qemu-iotests/testenv.py | 15 ---
  3 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/tests/qemu-iotests/check b/tests/qemu-iotests/check
index d1c87ceaf1..6186495eee 100755
--- a/tests/qemu-iotests/check
+++ b/tests/qemu-iotests/check
@@ -33,6 +33,9 @@ def make_argparser() -> argparse.ArgumentParser:
 help='pretty print output for make check')
  
  p.add_argument('-d', dest='debug', action='store_true', help='debug')

+p.add_argument('-gdb', action='store_true',
+   help="start gdbserver with $GDB_QEMU options. \
+ Default is localhost:12345")
  p.add_argument('-misalign', action='store_true',
 help='misalign memory allocations')
  p.add_argument('--color', choices=['on', 'off', 'auto'],
@@ -112,7 +115,8 @@ if __name__ == '__main__':
  env = TestEnv(imgfmt=args.imgfmt, imgproto=args.imgproto,
aiomode=args.aiomode, cachemode=args.cachemode,
imgopts=args.imgopts, misalign=args.misalign,
-  debug=args.debug, valgrind=args.valgrind)
+  debug=args.debug, valgrind=args.valgrind,
+  gdb=args.gdb)
  
  testfinder = TestFinder(test_dir=env.source_iotests)
  
diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py

index 90d0b62523..05d0dc0751 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -75,6 +75,10 @@
  qemu_prog = os.environ.get('QEMU_PROG', 'qemu')
  qemu_opts = os.environ.get('QEMU_OPTIONS', '').strip().split(' ')
  
+qemu_gdb = []

+if os.environ.get('GDB_QEMU'):
+qemu_gdb = ['gdbserver'] + os.environ.get('GDB_QEMU').strip().split(' ')


os.environ.get('GDB_QEMU') returns an Option[str], so mypy complains 
about the .strip() (thus failing iotest 297).


(That can be fixed for example by either using os.environ['GDB_QEMU'] 
here, like most other places here do, or by something like:


gdb_qemu_env = os.environ.get('GDB_QEMU')
if gdb_qemu_env:
qemu_gdb = ['gdbserver'] + gdb_qemu_env.strip().split(' ')
)

Max


+
  imgfmt = os.environ.get('IMGFMT', 'raw')
  imgproto = os.environ.get('IMGPROTO', 'file')
  output_dir = os.environ.get('OUTPUT_DIR', '.')
diff --git a/tests/qemu-iotests/testenv.py b/tests/qemu-iotests/testenv.py
index 1fbec854c1..e131ff42cb 100644
--- a/tests/qemu-iotests/testenv.py
+++ b/tests/qemu-iotests/testenv.py
@@ -72,7 +72,8 @@ class TestEnv(ContextManager['TestEnv']):
   'QEMU_NBD_OPTIONS', 'IMGOPTS', 'IMGFMT', 'IMGPROTO',
   'AIOMODE', 'CACHEMODE', 'VALGRIND_QEMU',
   'CACHEMODE_IS_DEFAULT', 'IMGFMT_GENERIC', 
'IMGOPTSSYNTAX',
- 'IMGKEYSECRET', 'QEMU_DEFAULT_MACHINE', 'MALLOC_PERTURB_']
+ 'IMGKEYSECRET', 'QEMU_DEFAULT_MACHINE', 'MALLOC_PERTURB_',
+ 'GDB_QEMU']
  
  def get_env(self) -> Dict[str, str]:

  env = {}
@@ -163,7 +164,8 @@ def __init__(self, imgfmt: str, imgproto: str, aiomode: str,
   imgopts: Optional[str] = None,
   misalign: bool = False,
   debug: bool = False,
- valgrind: bool = False) -> None:
+ valgrind: bool = False,
+ gdb: bool = False) -> None:
  self.imgfmt = imgfmt
  self.imgproto = imgproto
  self.aiomode = aiomode
@@ -171,6 +173,11 @@ def __init__(self, imgfmt: str, imgproto: str, aiomode: 
str,
  self.misalign = misalign
  self.debug = debug
  
+if gdb:

+self.gdb_qemu = os.environ.get('GDB_QEMU', 'localhost:12345')
+elif 'GDB_QEMU' in os.environ:
+del os.environ['GDB_QEMU']
+
  if valgrind:
  self.valgrind_qemu = 'y'
  
@@ -268,7 +275,9 @@ def print_env(self) -> None:

  PLATFORM  -- {platform}
  TEST_DIR  -- {TEST_DIR}
  SOCK_DIR  -- {SOCK_DIR}
-SOCKET_SCM_HELPER -- {SOCKET_SCM_HELPER}"""
+SOCKET_SCM_HELPER -- {SOCKET_SCM_HELPER}
+GDB_QEMU  -- "{GDB_QEMU}"
+"""
  
  args = collections.defaultdict(str, self.get_env())
  






Re: [PATCH v3 04/15] qemu-iotests: add option to attach gdbserver

2021-04-30 Thread Max Reitz

On 14.04.21 19:03, Emanuele Giuseppe Esposito wrote:

Add -gdb flag and GDB_QEMU environmental variable
to python tests to attach a gdbserver to each qemu instance.


Well, this patch doesn’t do this, but OK.

Out of interest: Why gdbserver and not “just” gdb?  On Fedora, those are 
separate packages, so I don’t have gdbserver installed, that’s why I’m 
asking.


(I’ve also never used gdbserver before.  From what I can tell, it’s 
basically just a limited version of gdb so it only serves as a server.)



if -gdb is not provided but $GDB_QEMU is set, ignore the
environmental variable.


s/environmental/environment/



Signed-off-by: Emanuele Giuseppe Esposito 
---
  tests/qemu-iotests/check  |  6 +-
  tests/qemu-iotests/iotests.py |  4 
  tests/qemu-iotests/testenv.py | 15 ---
  3 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/tests/qemu-iotests/check b/tests/qemu-iotests/check
index d1c87ceaf1..6186495eee 100755
--- a/tests/qemu-iotests/check
+++ b/tests/qemu-iotests/check
@@ -33,6 +33,9 @@ def make_argparser() -> argparse.ArgumentParser:
 help='pretty print output for make check')
  
  p.add_argument('-d', dest='debug', action='store_true', help='debug')

+p.add_argument('-gdb', action='store_true',
+   help="start gdbserver with $GDB_QEMU options. \
+ Default is localhost:12345")


That makes it sound like this were the default for the `-gdb` option. 
Since `-gdb` is just a switch, it doesn’t have a default (apart from 
being off by default).


So I’d rephrase this at least to “The default options are 
'localhost:12345'”.  Or maybe “start gdbserver with $GDB_QEMU options 
('localhost:12345' if $GDB_QEMU is empty)”.


Also, $GDB_QEMU as a name is a bit strange, because it does not specify 
which gdb to use; it just gives the options to use for gdb. 
$GDB_QEMU_OPTIONS would be more in line with the naming of the rest of 
the environment variables (or just $GDB_OPTIONS).


Max


  p.add_argument('-misalign', action='store_true',
 help='misalign memory allocations')
  p.add_argument('--color', choices=['on', 'off', 'auto'],





Re: [PATCH v3 03/15] docs/devel/testing: add debug section to the QEMU iotests chapter

2021-04-30 Thread Max Reitz

On 14.04.21 19:03, Emanuele Giuseppe Esposito wrote:

Introduce the "Debugging a test case" section, in preparation
to the additional flags that will be added in the next patches.

Signed-off-by: Emanuele Giuseppe Esposito 
---
  docs/devel/testing.rst | 8 
  1 file changed, 8 insertions(+)

diff --git a/docs/devel/testing.rst b/docs/devel/testing.rst
index 1434a50cc4..b7e2370e7e 100644
--- a/docs/devel/testing.rst
+++ b/docs/devel/testing.rst
@@ -224,6 +224,14 @@ another application on the host may have locked the file, 
possibly leading to a
  test failure.  If using such devices are explicitly desired, consider adding
  ``locking=off`` option to disable image locking.
  
+Debugging a test case

+---
+QEMU iotests offers some options to debug a failing test, that can be


-,

Max


+given as options to the ``check`` script:
+
+* ``-d`` (debug) just increases the logging verbosity, showing
+  for example the QMP commands and answers.
+
  Test case groups
  
  






Re: [PATCH v3 02/15] python: qemu: pass the wrapper field from QEMUQtestmachine to QEMUMachine

2021-04-30 Thread Max Reitz

On 14.04.21 19:03, Emanuele Giuseppe Esposito wrote:

Signed-off-by: Emanuele Giuseppe Esposito 
---
  python/qemu/qtest.py | 4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/python/qemu/qtest.py b/python/qemu/qtest.py
index 39a0cf62fe..c18eae96c6 100644
--- a/python/qemu/qtest.py
+++ b/python/qemu/qtest.py
@@ -111,6 +111,7 @@ class QEMUQtestMachine(QEMUMachine):
  def __init__(self,
   binary: str,
   args: Sequence[str] = (),
+ wrapper: Sequence[str] = (),
   name: Optional[str] = None,
   test_dir: str = "/var/tmp",
   socket_scm_helper: Optional[str] = None,
@@ -119,7 +120,8 @@ def __init__(self,
  name = "qemu-%d" % os.getpid()
  if sock_dir is None:
  sock_dir = test_dir
-super().__init__(binary, args, name=name, test_dir=test_dir,
+super().__init__(binary, args, wrapper=wrapper, name=name,
+ test_dir=test_dir,
   socket_scm_helper=socket_scm_helper,
   sock_dir=sock_dir)
  self._qtest: Optional[QEMUQtestProtocol] = None


Reviewed-by: Max Reitz 




Re: [PATCH v3 01/15] python: qemu: add timer parameter for qmp.accept socket

2021-04-30 Thread Max Reitz

On 14.04.21 19:03, Emanuele Giuseppe Esposito wrote:

Add a new _qmp_timer field to the QEMUMachine class.
The default timer is 15 sec, as per the default in the
qmp accept() function.

Signed-off-by: Emanuele Giuseppe Esposito 
---
  python/qemu/machine.py | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)


Reviewed-by: Max Reitz 

(I would have preferred for the commit message to tell me why we want 
this (I suspect some later patch is going to use it), but oh well.)





Re: [PATCH] mirror: stop cancelling in-flight requests on non-force cancel in READY

2021-04-30 Thread Max Reitz

On 21.04.21 09:58, Vladimir Sementsov-Ogievskiy wrote:

If mirror is READY than cancel operation is not discarding the whole
result of the operation, but instead it's a documented way get a
point-in-time snapshot of source disk.

So, we should not cancel any requests if mirror is READ and
force=false. Let's fix that case.

Note, that bug that we have before this commit is not critical, as the
only .bdrv_cancel_in_flight implementation is nbd_cancel_in_flight()
and it cancels only requests waiting for reconnection, so it should be
rare case.

Fixes: 521ff8b779b11c394dbdc43f02e158dd99df308a
Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  include/block/block_int.h | 2 +-
  include/qemu/job.h| 2 +-
  block/backup.c| 2 +-
  block/mirror.c| 6 --
  job.c | 2 +-
  tests/qemu-iotests/264| 2 +-
  6 files changed, 9 insertions(+), 7 deletions(-)


Thanks, applied to my block branch:

https://github.com/XanClic/qemu/commits/block

Max




[PULL 37/39] block: Add BDRV_O_NO_SHARE for blk_new_open()

2021-04-30 Thread Kevin Wolf
Normally, blk_new_open() just shares all permissions. This was fine
originally when permissions only protected against uses in the same
process because no other part of the code would actually get to access
the block nodes opened with blk_new_open(). However, since we use it for
file locking now, unsharing permissions becomes desirable.

Add a new BDRV_O_NO_SHARE flag that is used in blk_new_open() to unshare
any permissions that can be unshared.

Signed-off-by: Kevin Wolf 
Message-Id: <20210422164344.283389-2-kw...@redhat.com>
Reviewed-by: Eric Blake 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
Signed-off-by: Kevin Wolf 
---
 include/block/block.h |  1 +
 block/block-backend.c | 19 +--
 2 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index 8d5b3ecebd..82185965ff 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -102,6 +102,7 @@ typedef struct HDGeometry {
 uint32_t cylinders;
 } HDGeometry;
 
+#define BDRV_O_NO_SHARE0x0001 /* don't share permissions */
 #define BDRV_O_RDWR0x0002
 #define BDRV_O_RESIZE  0x0004 /* request permission for resizing the node 
*/
 #define BDRV_O_SNAPSHOT0x0008 /* open the file read only and save writes 
in a snapshot */
diff --git a/block/block-backend.c b/block/block-backend.c
index e4892fd6a5..6fca9853e1 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -407,15 +407,19 @@ BlockBackend *blk_new_open(const char *filename, const 
char *reference,
 BlockBackend *blk;
 BlockDriverState *bs;
 uint64_t perm = 0;
+uint64_t shared = BLK_PERM_ALL;
 
-/* blk_new_open() is mainly used in .bdrv_create implementations and the
- * tools where sharing isn't a concern because the BDS stays private, so we
- * just request permission according to the flags.
+/*
+ * blk_new_open() is mainly used in .bdrv_create implementations and the
+ * tools where sharing isn't a major concern because the BDS stays private
+ * and the file is generally not supposed to be used by a second process,
+ * so we just request permission according to the flags.
  *
  * The exceptions are xen_disk and blockdev_init(); in these cases, the
  * caller of blk_new_open() doesn't make use of the permissions, but they
  * shouldn't hurt either. We can still share everything here because the
- * guest devices will add their own blockers if they can't share. */
+ * guest devices will add their own blockers if they can't share.
+ */
 if ((flags & BDRV_O_NO_IO) == 0) {
 perm |= BLK_PERM_CONSISTENT_READ;
 if (flags & BDRV_O_RDWR) {
@@ -425,8 +429,11 @@ BlockBackend *blk_new_open(const char *filename, const 
char *reference,
 if (flags & BDRV_O_RESIZE) {
 perm |= BLK_PERM_RESIZE;
 }
+if (flags & BDRV_O_NO_SHARE) {
+shared = BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE_UNCHANGED;
+}
 
-blk = blk_new(qemu_get_aio_context(), perm, BLK_PERM_ALL);
+blk = blk_new(qemu_get_aio_context(), perm, shared);
 bs = bdrv_open(filename, reference, options, flags, errp);
 if (!bs) {
 blk_unref(blk);
@@ -435,7 +442,7 @@ BlockBackend *blk_new_open(const char *filename, const char 
*reference,
 
 blk->root = bdrv_root_attach_child(bs, "root", &child_root,
BDRV_CHILD_FILTERED | 
BDRV_CHILD_PRIMARY,
-   perm, BLK_PERM_ALL, blk, errp);
+   perm, shared, blk, errp);
 if (!blk->root) {
 blk_unref(blk);
 return NULL;
-- 
2.30.2




[PULL 36/39] block: refactor bdrv_node_check_perm()

2021-04-30 Thread Kevin Wolf
From: Vladimir Sementsov-Ogievskiy 

Now, bdrv_node_check_perm() is called only with fresh cumulative
permissions, so its actually "refresh_perm".

Move permission calculation to the function. Also, drop unreachable
error message and rewrite the remaining one to be more generic (as now
we don't know which node is added and which was already here).

Add also Virtuozzo copyright, as big work is done at this point.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
Message-Id: <20210428151804.439460-37-vsement...@virtuozzo.com>
Signed-off-by: Kevin Wolf 
---
 block.c| 38 +++---
 tests/qemu-iotests/245 |  2 +-
 2 files changed, 12 insertions(+), 28 deletions(-)

diff --git a/block.c b/block.c
index df8fa6003c..874c22c43e 100644
--- a/block.c
+++ b/block.c
@@ -2,6 +2,7 @@
  * QEMU System Emulator block driver
  *
  * Copyright (c) 2003 Fabrice Bellard
+ * Copyright (c) 2020 Virtuozzo International GmbH.
  *
  * Permission is hereby granted, free of charge, to any person obtaining a copy
  * of this software and associated documentation files (the "Software"), to 
deal
@@ -2270,22 +2271,18 @@ static void bdrv_replace_child(BdrvChild *child, 
BlockDriverState *new_bs,
 }
 
 /*
- * Check whether permissions on this node can be changed in a way that
- * @cumulative_perms and @cumulative_shared_perms are the new cumulative
- * permissions of all its parents. This involves checking whether all necessary
- * permission changes to child nodes can be performed.
- *
- * A call to this function must always be followed by a call to bdrv_set_perm()
- * or bdrv_abort_perm_update().
+ * Refresh permissions in @bs subtree. The function is intended to be called
+ * after some graph modification that was done without permission update.
  */
-static int bdrv_node_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
-uint64_t cumulative_perms,
-uint64_t cumulative_shared_perms,
-Transaction *tran, Error **errp)
+static int bdrv_node_refresh_perm(BlockDriverState *bs, BlockReopenQueue *q,
+  Transaction *tran, Error **errp)
 {
 BlockDriver *drv = bs->drv;
 BdrvChild *c;
 int ret;
+uint64_t cumulative_perms, cumulative_shared_perms;
+
+bdrv_get_cumulative_perm(bs, &cumulative_perms, &cumulative_shared_perms);
 
 /* Write permissions never work with read-only images */
 if ((cumulative_perms & (BLK_PERM_WRITE | BLK_PERM_WRITE_UNCHANGED)) &&
@@ -2294,15 +2291,8 @@ static int bdrv_node_check_perm(BlockDriverState *bs, 
BlockReopenQueue *q,
 if (!bdrv_is_writable_after_reopen(bs, NULL)) {
 error_setg(errp, "Block node is read-only");
 } else {
-uint64_t current_perms, current_shared;
-bdrv_get_cumulative_perm(bs, ¤t_perms, ¤t_shared);
-if (current_perms & (BLK_PERM_WRITE | BLK_PERM_WRITE_UNCHANGED)) {
-error_setg(errp, "Cannot make block node read-only, there is "
-   "a writer on it");
-} else {
-error_setg(errp, "Cannot make block node read-only and create "
-   "a writer on it");
-}
+error_setg(errp, "Read-only block node '%s' cannot support "
+   "read-write users", bdrv_get_node_name(bs));
 }
 
 return -EPERM;
@@ -2358,7 +2348,6 @@ static int bdrv_list_refresh_perms(GSList *list, 
BlockReopenQueue *q,
Transaction *tran, Error **errp)
 {
 int ret;
-uint64_t cumulative_perms, cumulative_shared_perms;
 BlockDriverState *bs;
 
 for ( ; list; list = list->next) {
@@ -2368,12 +2357,7 @@ static int bdrv_list_refresh_perms(GSList *list, 
BlockReopenQueue *q,
 return -EINVAL;
 }
 
-bdrv_get_cumulative_perm(bs, &cumulative_perms,
- &cumulative_shared_perms);
-
-ret = bdrv_node_check_perm(bs, q, cumulative_perms,
-   cumulative_shared_perms,
-   tran, errp);
+ret = bdrv_node_refresh_perm(bs, q, tran, errp);
 if (ret < 0) {
 return ret;
 }
diff --git a/tests/qemu-iotests/245 b/tests/qemu-iotests/245
index 11104b9208..fc5297e268 100755
--- a/tests/qemu-iotests/245
+++ b/tests/qemu-iotests/245
@@ -905,7 +905,7 @@ class TestBlockdevReopen(iotests.QMPTestCase):
 # We can't reopen hd1 to read-only, as block-stream requires it to be
 # read-write
 self.reopen(opts['backing'], {'read-only': True},
-"Cannot make block node read-only, there is a writer on 
it")
+"Read-only block node 'hd1' cannot support read-write 
users")
 
 # We can't remove hd2 while the stream job is ongoing
 opts['backing']['backing'] = None
-- 
2.30.2

[PULL 34/39] block: refactor bdrv_child_set_perm_safe() transaction action

2021-04-30 Thread Kevin Wolf
From: Vladimir Sementsov-Ogievskiy 

Old interfaces dropped, nobody directly calls
bdrv_child_set_perm_abort() and bdrv_child_set_perm_commit(), so we can
use personal state structure for the action and stop exploiting
BdrvChild structure. Also, drop "_safe" suffix which is redundant now.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
Message-Id: <20210428151804.439460-35-vsement...@virtuozzo.com>
Signed-off-by: Kevin Wolf 
---
 include/block/block_int.h |  5 
 block.c   | 63 ++-
 2 files changed, 22 insertions(+), 46 deletions(-)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index dd2de6bd1d..c823f5b1b3 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -813,11 +813,6 @@ struct BdrvChild {
  */
 uint64_t shared_perm;
 
-/* backup of permissions during permission update procedure */
-bool has_backup_perm;
-uint64_t backup_perm;
-uint64_t backup_shared_perm;
-
 /*
  * This link is frozen: the child can neither be replaced nor
  * detached from the parent.
diff --git a/block.c b/block.c
index 2362c934a4..7b2a8844f6 100644
--- a/block.c
+++ b/block.c
@@ -2135,59 +2135,40 @@ static GSList *bdrv_topological_dfs(GSList *list, 
GHashTable *found,
 return g_slist_prepend(list, bs);
 }
 
-static void bdrv_child_set_perm_commit(void *opaque)
-{
-BdrvChild *c = opaque;
-
-c->has_backup_perm = false;
-}
+typedef struct BdrvChildSetPermState {
+BdrvChild *child;
+uint64_t old_perm;
+uint64_t old_shared_perm;
+} BdrvChildSetPermState;
 
 static void bdrv_child_set_perm_abort(void *opaque)
 {
-BdrvChild *c = opaque;
-/*
- * We may have child->has_backup_perm unset at this point, as in case of
- * _check_ stage of permission update failure we may _check_ not the whole
- * subtree.  Still, _abort_ is called on the whole subtree anyway.
- */
-if (c->has_backup_perm) {
-c->perm = c->backup_perm;
-c->shared_perm = c->backup_shared_perm;
-c->has_backup_perm = false;
-}
+BdrvChildSetPermState *s = opaque;
+
+s->child->perm = s->old_perm;
+s->child->shared_perm = s->old_shared_perm;
 }
 
 static TransactionActionDrv bdrv_child_set_pem_drv = {
 .abort = bdrv_child_set_perm_abort,
-.commit = bdrv_child_set_perm_commit,
+.clean = g_free,
 };
 
-/*
- * With tran=NULL needs to be followed by direct call to either
- * bdrv_child_set_perm_commit() or bdrv_child_set_perm_abort().
- *
- * With non-NULL tran needs to be followed by tran_abort() or tran_commit()
- * instead.
- */
-static void bdrv_child_set_perm_safe(BdrvChild *c, uint64_t perm,
- uint64_t shared, Transaction *tran)
+static void bdrv_child_set_perm(BdrvChild *c, uint64_t perm,
+uint64_t shared, Transaction *tran)
 {
-if (!c->has_backup_perm) {
-c->has_backup_perm = true;
-c->backup_perm = c->perm;
-c->backup_shared_perm = c->shared_perm;
-}
-/*
- * Note: it's OK if c->has_backup_perm was already set, as we can find the
- * same c twice during check_perm procedure
- */
+BdrvChildSetPermState *s = g_new(BdrvChildSetPermState, 1);
+
+*s = (BdrvChildSetPermState) {
+.child = c,
+.old_perm = c->perm,
+.old_shared_perm = c->shared_perm,
+};
 
 c->perm = perm;
 c->shared_perm = shared;
 
-if (tran) {
-tran_add(tran, &bdrv_child_set_pem_drv, c);
-}
+tran_add(tran, &bdrv_child_set_pem_drv, s);
 }
 
 static void bdrv_drv_set_perm_commit(void *opaque)
@@ -2367,7 +2348,7 @@ static int bdrv_node_check_perm(BlockDriverState *bs, 
BlockReopenQueue *q,
 bdrv_child_perm(bs, c->bs, c, c->role, q,
 cumulative_perms, cumulative_shared_perms,
 &cur_perm, &cur_shared);
-bdrv_child_set_perm_safe(c, cur_perm, cur_shared, tran);
+bdrv_child_set_perm(c, cur_perm, cur_shared, tran);
 }
 
 return 0;
@@ -2466,7 +2447,7 @@ int bdrv_child_try_set_perm(BdrvChild *c, uint64_t perm, 
uint64_t shared,
 Transaction *tran = tran_new();
 int ret;
 
-bdrv_child_set_perm_safe(c, perm, shared, tran);
+bdrv_child_set_perm(c, perm, shared, tran);
 
 ret = bdrv_refresh_perms(c->bs, &local_err);
 
-- 
2.30.2




[PULL 27/39] block: make bdrv_refresh_limits() to be a transaction action

2021-04-30 Thread Kevin Wolf
From: Vladimir Sementsov-Ogievskiy 

To be used in further commit.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
Message-Id: <20210428151804.439460-28-vsement...@virtuozzo.com>
Signed-off-by: Kevin Wolf 
---
 include/block/block.h |  3 ++-
 block.c   |  9 -
 block/io.c| 31 +--
 3 files changed, 35 insertions(+), 8 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index 85481a05c6..ad38259c91 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -9,6 +9,7 @@
 #include "block/dirty-bitmap.h"
 #include "block/blockjob.h"
 #include "qemu/hbitmap.h"
+#include "qemu/transactions.h"
 
 /*
  * generated_co_wrapper
@@ -421,7 +422,7 @@ int64_t bdrv_get_allocated_file_size(BlockDriverState *bs);
 BlockMeasureInfo *bdrv_measure(BlockDriver *drv, QemuOpts *opts,
BlockDriverState *in_bs, Error **errp);
 void bdrv_get_geometry(BlockDriverState *bs, uint64_t *nb_sectors_ptr);
-void bdrv_refresh_limits(BlockDriverState *bs, Error **errp);
+void bdrv_refresh_limits(BlockDriverState *bs, Transaction *tran, Error 
**errp);
 int bdrv_commit(BlockDriverState *bs);
 int bdrv_make_empty(BdrvChild *c, Error **errp);
 int bdrv_change_backing_file(BlockDriverState *bs, const char *backing_file,
diff --git a/block.c b/block.c
index 1dc14908ac..9922943793 100644
--- a/block.c
+++ b/block.c
@@ -49,7 +49,6 @@
 #include "qemu/timer.h"
 #include "qemu/cutils.h"
 #include "qemu/id.h"
-#include "qemu/transactions.h"
 #include "block/coroutines.h"
 
 #ifdef CONFIG_BSD
@@ -1577,7 +1576,7 @@ static int bdrv_open_driver(BlockDriverState *bs, 
BlockDriver *drv,
 return ret;
 }
 
-bdrv_refresh_limits(bs, &local_err);
+bdrv_refresh_limits(bs, NULL, &local_err);
 if (local_err) {
 error_propagate(errp, local_err);
 return -EINVAL;
@@ -3363,7 +3362,7 @@ int bdrv_set_backing_hd(BlockDriverState *bs, 
BlockDriverState *backing_hd,
 }
 
 out:
-bdrv_refresh_limits(bs, NULL);
+bdrv_refresh_limits(bs, NULL, NULL);
 
 return ret;
 }
@@ -4847,7 +4846,7 @@ static void bdrv_reopen_commit(BDRVReopenState 
*reopen_state)
 bdrv_set_backing_hd(bs, reopen_state->new_backing_bs, &error_abort);
 }
 
-bdrv_refresh_limits(bs, NULL);
+bdrv_refresh_limits(bs, NULL, NULL);
 }
 
 /*
@@ -5244,7 +5243,7 @@ int bdrv_append(BlockDriverState *bs_new, 
BlockDriverState *bs_top,
 out:
 tran_finalize(tran, ret);
 
-bdrv_refresh_limits(bs_top, NULL);
+bdrv_refresh_limits(bs_top, NULL, NULL);
 
 return ret;
 }
diff --git a/block/io.c b/block/io.c
index ca2dca3007..35b6c56efc 100644
--- a/block/io.c
+++ b/block/io.c
@@ -133,13 +133,40 @@ static void bdrv_merge_limits(BlockLimits *dst, const 
BlockLimits *src)
 dst->max_iov = MIN_NON_ZERO(dst->max_iov, src->max_iov);
 }
 
-void bdrv_refresh_limits(BlockDriverState *bs, Error **errp)
+typedef struct BdrvRefreshLimitsState {
+BlockDriverState *bs;
+BlockLimits old_bl;
+} BdrvRefreshLimitsState;
+
+static void bdrv_refresh_limits_abort(void *opaque)
+{
+BdrvRefreshLimitsState *s = opaque;
+
+s->bs->bl = s->old_bl;
+}
+
+static TransactionActionDrv bdrv_refresh_limits_drv = {
+.abort = bdrv_refresh_limits_abort,
+.clean = g_free,
+};
+
+/* @tran is allowed to be NULL, in this case no rollback is possible. */
+void bdrv_refresh_limits(BlockDriverState *bs, Transaction *tran, Error **errp)
 {
 ERRP_GUARD();
 BlockDriver *drv = bs->drv;
 BdrvChild *c;
 bool have_limits;
 
+if (tran) {
+BdrvRefreshLimitsState *s = g_new(BdrvRefreshLimitsState, 1);
+*s = (BdrvRefreshLimitsState) {
+.bs = bs,
+.old_bl = bs->bl,
+};
+tran_add(tran, &bdrv_refresh_limits_drv, s);
+}
+
 memset(&bs->bl, 0, sizeof(bs->bl));
 
 if (!drv) {
@@ -156,7 +183,7 @@ void bdrv_refresh_limits(BlockDriverState *bs, Error **errp)
 QLIST_FOREACH(c, &bs->children, next) {
 if (c->role & (BDRV_CHILD_DATA | BDRV_CHILD_FILTERED | BDRV_CHILD_COW))
 {
-bdrv_refresh_limits(c->bs, errp);
+bdrv_refresh_limits(c->bs, tran, errp);
 if (*errp) {
 return;
 }
-- 
2.30.2




[PULL 29/39] block: bdrv_reopen_multiple(): move bdrv_flush to separate pre-prepare

2021-04-30 Thread Kevin Wolf
From: Vladimir Sementsov-Ogievskiy 

During reopen we may add backing bs from other aio context, which may
lead to changing original context of top bs.

We are going to move graph modification to prepare stage. So, it will
be possible that bdrv_flush() in bdrv_reopen_prepare called on bs in
non-original aio context, which we didn't aquire which leads to crash.

To avoid this problem move bdrv_flush() to be a separate reopen stage
before bdrv_reopen_prepare().

This doesn't seem correct to acquire only one aio context and not all
contexts participating in reopen. But it's not obvious how to do it
correctly, keeping in mind:

 1. rules of bdrv_set_aio_context_ignore() that requires new_context
lock not being held

 2. possible deadlocks because of holding all (or several?) AioContext
locks

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
Message-Id: <20210428151804.439460-30-vsement...@virtuozzo.com>
Signed-off-by: Kevin Wolf 
---
 block.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/block.c b/block.c
index bdfe59d94d..357ec1be2c 100644
--- a/block.c
+++ b/block.c
@@ -4274,6 +4274,14 @@ int bdrv_reopen_multiple(BlockReopenQueue *bs_queue, 
Error **errp)
 
 assert(bs_queue != NULL);
 
+QTAILQ_FOREACH(bs_entry, bs_queue, entry) {
+ret = bdrv_flush(bs_entry->state.bs);
+if (ret < 0) {
+error_setg_errno(errp, -ret, "Error flushing drive");
+goto cleanup;
+}
+}
+
 QTAILQ_FOREACH(bs_entry, bs_queue, entry) {
 assert(bs_entry->state.bs->quiesce_counter > 0);
 if (bdrv_reopen_prepare(&bs_entry->state, bs_queue, errp)) {
@@ -4669,12 +4677,6 @@ static int bdrv_reopen_prepare(BDRVReopenState 
*reopen_state,
 bdrv_reopen_perm(queue, reopen_state->bs,
  &reopen_state->perm, &reopen_state->shared_perm);
 
-ret = bdrv_flush(reopen_state->bs);
-if (ret) {
-error_setg_errno(errp, -ret, "Error flushing drive");
-goto error;
-}
-
 if (drv->bdrv_reopen_prepare) {
 /*
  * If a driver-specific option is missing, it means that we
-- 
2.30.2




[PULL 35/39] block: rename bdrv_replace_child_safe() to bdrv_replace_child()

2021-04-30 Thread Kevin Wolf
From: Vladimir Sementsov-Ogievskiy 

We don't have bdrv_replace_child(), so it's time for
bdrv_replace_child_safe() to take its place.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
Message-Id: <20210428151804.439460-36-vsement...@virtuozzo.com>
Signed-off-by: Kevin Wolf 
---
 block.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/block.c b/block.c
index 7b2a8844f6..df8fa6003c 100644
--- a/block.c
+++ b/block.c
@@ -2248,12 +2248,12 @@ static TransactionActionDrv bdrv_replace_child_drv = {
 };
 
 /*
- * bdrv_replace_child_safe
+ * bdrv_replace_child
  *
  * Note: real unref of old_bs is done only on commit.
  */
-static void bdrv_replace_child_safe(BdrvChild *child, BlockDriverState *new_bs,
-Transaction *tran)
+static void bdrv_replace_child(BdrvChild *child, BlockDriverState *new_bs,
+   Transaction *tran)
 {
 BdrvReplaceChildState *s = g_new(BdrvReplaceChildState, 1);
 *s = (BdrvReplaceChildState) {
@@ -4803,7 +4803,7 @@ static void 
bdrv_remove_filter_or_cow_child(BlockDriverState *bs,
 }
 
 if (child->bs) {
-bdrv_replace_child_safe(child, NULL, tran);
+bdrv_replace_child(child, NULL, tran);
 }
 
 s = g_new(BdrvRemoveFilterOrCowChild, 1);
@@ -4843,7 +4843,7 @@ static int bdrv_replace_node_noperm(BlockDriverState 
*from,
c->name, from->node_name);
 return -EPERM;
 }
-bdrv_replace_child_safe(c, to, tran);
+bdrv_replace_child(c, to, tran);
 }
 
 return 0;
-- 
2.30.2




[PULL 32/39] block: inline bdrv_check_perm_common()

2021-04-30 Thread Kevin Wolf
From: Vladimir Sementsov-Ogievskiy 

bdrv_check_perm_common() has only one caller, so no more sense in
"common".

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
Message-Id: <20210428151804.439460-33-vsement...@virtuozzo.com>
Signed-off-by: Kevin Wolf 
---
 block.c | 32 +++-
 1 file changed, 3 insertions(+), 29 deletions(-)

diff --git a/block.c b/block.c
index bf7c187435..38bd2ea32e 100644
--- a/block.c
+++ b/block.c
@@ -2373,33 +2373,13 @@ static int bdrv_node_check_perm(BlockDriverState *bs, 
BlockReopenQueue *q,
 return 0;
 }
 
-/*
- * If use_cumulative_perms is true, use cumulative_perms and
- * cumulative_shared_perms for first element of the list. Otherwise just 
refresh
- * all permissions.
- */
-static int bdrv_check_perm_common(GSList *list, BlockReopenQueue *q,
-  bool use_cumulative_perms,
-  uint64_t cumulative_perms,
-  uint64_t cumulative_shared_perms,
-  Transaction *tran, Error **errp)
+static int bdrv_list_refresh_perms(GSList *list, BlockReopenQueue *q,
+   Transaction *tran, Error **errp)
 {
 int ret;
+uint64_t cumulative_perms, cumulative_shared_perms;
 BlockDriverState *bs;
 
-if (use_cumulative_perms) {
-bs = list->data;
-
-ret = bdrv_node_check_perm(bs, q, cumulative_perms,
-   cumulative_shared_perms,
-   tran, errp);
-if (ret < 0) {
-return ret;
-}
-
-list = list->next;
-}
-
 for ( ; list; list = list->next) {
 bs = list->data;
 
@@ -2421,12 +2401,6 @@ static int bdrv_check_perm_common(GSList *list, 
BlockReopenQueue *q,
 return 0;
 }
 
-static int bdrv_list_refresh_perms(GSList *list, BlockReopenQueue *q,
-   Transaction *tran, Error **errp)
-{
-return bdrv_check_perm_common(list, q, false, 0, 0, tran, errp);
-}
-
 static void bdrv_node_set_perm(BlockDriverState *bs)
 {
 BlockDriver *drv = bs->drv;
-- 
2.30.2




[PULL 31/39] block: drop unused permission update functions

2021-04-30 Thread Kevin Wolf
From: Vladimir Sementsov-Ogievskiy 

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
Message-Id: <20210428151804.439460-32-vsement...@virtuozzo.com>
Signed-off-by: Kevin Wolf 
---
 block.c | 103 
 1 file changed, 103 deletions(-)

diff --git a/block.c b/block.c
index ffaa367a1b..bf7c187435 100644
--- a/block.c
+++ b/block.c
@@ -1987,11 +1987,6 @@ static int bdrv_fill_options(QDict **options, const char 
*filename,
 return 0;
 }
 
-static int bdrv_check_update_perm(BlockDriverState *bs, BlockReopenQueue *q,
-  uint64_t new_used_perm,
-  uint64_t new_shared_perm,
-  Error **errp);
-
 typedef struct BlockReopenQueueEntry {
  bool prepared;
  bool perms_checked;
@@ -2426,56 +2421,12 @@ static int bdrv_check_perm_common(GSList *list, 
BlockReopenQueue *q,
 return 0;
 }
 
-static int bdrv_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
-   uint64_t cumulative_perms,
-   uint64_t cumulative_shared_perms, Error **errp)
-{
-g_autoptr(GSList) list = bdrv_topological_dfs(NULL, NULL, bs);
-return bdrv_check_perm_common(list, q, true, cumulative_perms,
-  cumulative_shared_perms, NULL, errp);
-}
-
 static int bdrv_list_refresh_perms(GSList *list, BlockReopenQueue *q,
Transaction *tran, Error **errp)
 {
 return bdrv_check_perm_common(list, q, false, 0, 0, tran, errp);
 }
 
-/*
- * Notifies drivers that after a previous bdrv_check_perm() call, the
- * permission update is not performed and any preparations made for it (e.g.
- * taken file locks) need to be undone.
- */
-static void bdrv_node_abort_perm_update(BlockDriverState *bs)
-{
-BlockDriver *drv = bs->drv;
-BdrvChild *c;
-
-if (!drv) {
-return;
-}
-
-bdrv_drv_set_perm_abort(bs);
-
-QLIST_FOREACH(c, &bs->children, next) {
-bdrv_child_set_perm_abort(c);
-}
-}
-
-static void bdrv_list_abort_perm_update(GSList *list)
-{
-for ( ; list; list = list->next) {
-bdrv_node_abort_perm_update((BlockDriverState *)list->data);
-}
-}
-
-__attribute__((unused))
-static void bdrv_abort_perm_update(BlockDriverState *bs)
-{
-g_autoptr(GSList) list = bdrv_topological_dfs(NULL, NULL, bs);
-return bdrv_list_abort_perm_update(list);
-}
-
 static void bdrv_node_set_perm(BlockDriverState *bs)
 {
 BlockDriver *drv = bs->drv;
@@ -2557,60 +2508,6 @@ char *bdrv_perm_names(uint64_t perm)
 return g_string_free(result, FALSE);
 }
 
-/*
- * Checks whether a new reference to @bs can be added if the new user requires
- * @new_used_perm/@new_shared_perm as its permissions. If @ignore_children is
- * set, the BdrvChild objects in this list are ignored in the calculations;
- * this allows checking permission updates for an existing reference.
- *
- * Needs to be followed by a call to either bdrv_set_perm() or
- * bdrv_abort_perm_update(). */
-__attribute__((unused))
-static int bdrv_check_update_perm(BlockDriverState *bs, BlockReopenQueue *q,
-  uint64_t new_used_perm,
-  uint64_t new_shared_perm,
-  Error **errp)
-{
-BdrvChild *c;
-uint64_t cumulative_perms = new_used_perm;
-uint64_t cumulative_shared_perms = new_shared_perm;
-
-
-/* There is no reason why anyone couldn't tolerate write_unchanged */
-assert(new_shared_perm & BLK_PERM_WRITE_UNCHANGED);
-
-QLIST_FOREACH(c, &bs->parents, next_parent) {
-if ((new_used_perm & c->shared_perm) != new_used_perm) {
-char *user = bdrv_child_user_desc(c);
-char *perm_names = bdrv_perm_names(new_used_perm & 
~c->shared_perm);
-
-error_setg(errp, "Conflicts with use by %s as '%s', which does not 
"
- "allow '%s' on %s",
-   user, c->name, perm_names, bdrv_get_node_name(c->bs));
-g_free(user);
-g_free(perm_names);
-return -EPERM;
-}
-
-if ((c->perm & new_shared_perm) != c->perm) {
-char *user = bdrv_child_user_desc(c);
-char *perm_names = bdrv_perm_names(c->perm & ~new_shared_perm);
-
-error_setg(errp, "Conflicts with use by %s as '%s', which uses "
- "'%s' on %s",
-   user, c->name, perm_names, bdrv_get_node_name(c->bs));
-g_free(user);
-g_free(perm_names);
-return -EPERM;
-}
-
-cumulative_perms |= c->perm;
-cumulative_shared_perms &= c->shared_perm;
-}
-
-return bdrv_check_perm(bs, q, cumulative_perms, cumulative_shared_perms,
-   errp);
-}
 
 static int bdrv_refresh_perms(BlockDriverState *bs, Error **errp)
 {
-- 
2.30.2




[PULL 38/39] qemu-img convert: Unshare write permission for source

2021-04-30 Thread Kevin Wolf
For a successful conversion of an image, we must make sure that its
content doesn't change during the conversion.

A special case of this is using the same image file both as the source
and as the destination. If both input and output format are raw, the
operation would just be useless work, with other formats it is a sure
way to destroy the image. This will now fail because the image file
can't be opened a second time for the output when opening it for the
input has already acquired file locks to unshare BLK_PERM_WRITE.

Nevertheless, if there is some reason in a special case why it is
actually okay to allow writes to the image while it is being converted,
-U can still be used to force sharing all permissions.

Note that for most image formats, BLK_PERM_WRITE would already be
unshared by the format driver, so this only really makes a difference
for raw source images (but any output format).

Reported-by: Xueqiang Wei 
Signed-off-by: Kevin Wolf 
Reviewed-by: Eric Blake 
Message-Id: <20210422164344.283389-3-kw...@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy 
Signed-off-by: Kevin Wolf 
---
 qemu-img.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/qemu-img.c b/qemu-img.c
index babb5573ab..a5993682aa 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -2146,7 +2146,7 @@ static void set_rate_limit(BlockBackend *blk, int64_t 
rate_limit)
 
 static int img_convert(int argc, char **argv)
 {
-int c, bs_i, flags, src_flags = 0;
+int c, bs_i, flags, src_flags = BDRV_O_NO_SHARE;
 const char *fmt = NULL, *out_fmt = NULL, *cache = "unsafe",
*src_cache = BDRV_DEFAULT_CACHE, *out_baseimg = NULL,
*out_filename, *out_baseimg_param, *snapshot_name = NULL;
-- 
2.30.2




[PULL 39/39] vhost-user-blk: Fail gracefully on too large queue size

2021-04-30 Thread Kevin Wolf
virtio_add_queue() aborts when queue_size > VIRTQUEUE_MAX_SIZE, so
vhost_user_blk_device_realize() should check this before calling it.

Simple reproducer:

qemu-system-x86_64 \
-chardev null,id=foo \
-device vhost-user-blk-pci,queue-size=4096,chardev=foo

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1935014
Signed-off-by: Kevin Wolf 
Message-Id: <20210413165654.50810-1-kw...@redhat.com>
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Raphael Norwitz 
Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
Signed-off-by: Kevin Wolf 
---
 hw/block/vhost-user-blk.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
index 0b5b9d44cd..f5e9682703 100644
--- a/hw/block/vhost-user-blk.c
+++ b/hw/block/vhost-user-blk.c
@@ -467,6 +467,11 @@ static void vhost_user_blk_device_realize(DeviceState 
*dev, Error **errp)
 error_setg(errp, "vhost-user-blk: queue size must be non-zero");
 return;
 }
+if (s->queue_size > VIRTQUEUE_MAX_SIZE) {
+error_setg(errp, "vhost-user-blk: queue size must not exceed %d",
+   VIRTQUEUE_MAX_SIZE);
+return;
+}
 
 if (!vhost_user_init(&s->vhost_user, &s->chardev, errp)) {
 return;
-- 
2.30.2




[PULL 33/39] block: inline bdrv_replace_child()

2021-04-30 Thread Kevin Wolf
From: Vladimir Sementsov-Ogievskiy 

bdrv_replace_child() has only one caller, the second argument is
unused. Inline it now. This triggers deletion of some more unused
interfaces.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
Message-Id: <20210428151804.439460-34-vsement...@virtuozzo.com>
Signed-off-by: Kevin Wolf 
---
 block.c | 101 ++--
 1 file changed, 18 insertions(+), 83 deletions(-)

diff --git a/block.c b/block.c
index 38bd2ea32e..2362c934a4 100644
--- a/block.c
+++ b/block.c
@@ -2401,42 +2401,6 @@ static int bdrv_list_refresh_perms(GSList *list, 
BlockReopenQueue *q,
 return 0;
 }
 
-static void bdrv_node_set_perm(BlockDriverState *bs)
-{
-BlockDriver *drv = bs->drv;
-BdrvChild *c;
-
-if (!drv) {
-return;
-}
-
-bdrv_drv_set_perm_commit(bs);
-
-/* Drivers that never have children can omit .bdrv_child_perm() */
-if (!drv->bdrv_child_perm) {
-assert(QLIST_EMPTY(&bs->children));
-return;
-}
-
-/* Update all children */
-QLIST_FOREACH(c, &bs->children, next) {
-bdrv_child_set_perm_commit(c);
-}
-}
-
-static void bdrv_list_set_perm(GSList *list)
-{
-for ( ; list; list = list->next) {
-bdrv_node_set_perm((BlockDriverState *)list->data);
-}
-}
-
-static void bdrv_set_perm(BlockDriverState *bs)
-{
-g_autoptr(GSList) list = bdrv_topological_dfs(NULL, NULL, bs);
-return bdrv_list_set_perm(list);
-}
-
 void bdrv_get_cumulative_perm(BlockDriverState *bs, uint64_t *perm,
   uint64_t *shared_perm)
 {
@@ -2776,52 +2740,6 @@ static void bdrv_replace_child_noperm(BdrvChild *child,
 }
 }
 
-/*
- * Updates @child to change its reference to point to @new_bs, including
- * checking and applying the necessary permission updates both to the old node
- * and to @new_bs.
- *
- * NULL is passed as @new_bs for removing the reference before freeing @child.
- *
- * If @new_bs is not NULL, bdrv_check_perm() must be called beforehand, as this
- * function uses bdrv_set_perm() to update the permissions according to the new
- * reference that @new_bs gets.
- *
- * Callers must ensure that child->frozen is false.
- */
-static void bdrv_replace_child(BdrvChild *child, BlockDriverState *new_bs)
-{
-BlockDriverState *old_bs = child->bs;
-
-/* Asserts that child->frozen == false */
-bdrv_replace_child_noperm(child, new_bs);
-
-/*
- * Start with the new node's permissions.  If @new_bs is a (direct
- * or indirect) child of @old_bs, we must complete the permission
- * update on @new_bs before we loosen the restrictions on @old_bs.
- * Otherwise, bdrv_check_perm() on @old_bs would re-initiate
- * updating the permissions of @new_bs, and thus not purely loosen
- * restrictions.
- */
-if (new_bs) {
-bdrv_set_perm(new_bs);
-}
-
-if (old_bs) {
-/*
- * Update permissions for old node. We're just taking a parent away, so
- * we're loosening restrictions. Errors of permission update are not
- * fatal in this case, ignore them.
- */
-bdrv_refresh_perms(old_bs, NULL);
-
-/* When the parent requiring a non-default AioContext is removed, the
- * node moves back to the main AioContext */
-bdrv_try_set_aio_context(old_bs, qemu_get_aio_context(), NULL);
-}
-}
-
 static void bdrv_child_free(void *opaque)
 {
 BdrvChild *c = opaque;
@@ -2989,8 +2907,25 @@ static int bdrv_attach_child_noperm(BlockDriverState 
*parent_bs,
 
 static void bdrv_detach_child(BdrvChild *child)
 {
-bdrv_replace_child(child, NULL);
+BlockDriverState *old_bs = child->bs;
+
+bdrv_replace_child_noperm(child, NULL);
 bdrv_remove_empty_child(child);
+
+if (old_bs) {
+/*
+ * Update permissions for old node. We're just taking a parent away, so
+ * we're loosening restrictions. Errors of permission update are not
+ * fatal in this case, ignore them.
+ */
+bdrv_refresh_perms(old_bs, NULL);
+
+/*
+ * When the parent requiring a non-default AioContext is removed, the
+ * node moves back to the main AioContext
+ */
+bdrv_try_set_aio_context(old_bs, qemu_get_aio_context(), NULL);
+}
 }
 
 /*
-- 
2.30.2




[PULL 26/39] block: make bdrv_unset_inherits_from to be a transaction action

2021-04-30 Thread Kevin Wolf
From: Vladimir Sementsov-Ogievskiy 

To be used in the further commit.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
Message-Id: <20210428151804.439460-27-vsement...@virtuozzo.com>
Signed-off-by: Kevin Wolf 
---
 block.c | 46 ++
 1 file changed, 42 insertions(+), 4 deletions(-)

diff --git a/block.c b/block.c
index 46af5852ab..1dc14908ac 100644
--- a/block.c
+++ b/block.c
@@ -3205,11 +3205,49 @@ void bdrv_root_unref_child(BdrvChild *child)
 bdrv_unref(child_bs);
 }
 
+typedef struct BdrvSetInheritsFrom {
+BlockDriverState *bs;
+BlockDriverState *old_inherits_from;
+} BdrvSetInheritsFrom;
+
+static void bdrv_set_inherits_from_abort(void *opaque)
+{
+BdrvSetInheritsFrom *s = opaque;
+
+s->bs->inherits_from = s->old_inherits_from;
+}
+
+static TransactionActionDrv bdrv_set_inherits_from_drv = {
+.abort = bdrv_set_inherits_from_abort,
+.clean = g_free,
+};
+
+/* @tran is allowed to be NULL. In this case no rollback is possible */
+static void bdrv_set_inherits_from(BlockDriverState *bs,
+   BlockDriverState *new_inherits_from,
+   Transaction *tran)
+{
+if (tran) {
+BdrvSetInheritsFrom *s = g_new(BdrvSetInheritsFrom, 1);
+
+*s = (BdrvSetInheritsFrom) {
+.bs = bs,
+.old_inherits_from = bs->inherits_from,
+};
+
+tran_add(tran, &bdrv_set_inherits_from_drv, s);
+}
+
+bs->inherits_from = new_inherits_from;
+}
+
 /**
  * Clear all inherits_from pointers from children and grandchildren of
  * @root that point to @root, where necessary.
+ * @tran is allowed to be NULL. In this case no rollback is possible
  */
-static void bdrv_unset_inherits_from(BlockDriverState *root, BdrvChild *child)
+static void bdrv_unset_inherits_from(BlockDriverState *root, BdrvChild *child,
+ Transaction *tran)
 {
 BdrvChild *c;
 
@@ -3224,12 +3262,12 @@ static void bdrv_unset_inherits_from(BlockDriverState 
*root, BdrvChild *child)
 }
 }
 if (c == NULL) {
-child->bs->inherits_from = NULL;
+bdrv_set_inherits_from(child->bs, NULL, tran);
 }
 }
 
 QLIST_FOREACH(c, &child->bs->children, next) {
-bdrv_unset_inherits_from(root, c);
+bdrv_unset_inherits_from(root, c, tran);
 }
 }
 
@@ -3240,7 +3278,7 @@ void bdrv_unref_child(BlockDriverState *parent, BdrvChild 
*child)
 return;
 }
 
-bdrv_unset_inherits_from(parent, child);
+bdrv_unset_inherits_from(parent, child, NULL);
 bdrv_root_unref_child(child);
 }
 
-- 
2.30.2




[PULL 30/39] block: bdrv_reopen_multiple: refresh permissions on updated graph

2021-04-30 Thread Kevin Wolf
From: Vladimir Sementsov-Ogievskiy 

Move bdrv_reopen_multiple to new paradigm of permission update:
first update graph relations, then do refresh the permissions.

We have to modify reopen process in file-posix driver: with new scheme
we don't have prepared permissions in raw_reopen_prepare(), so we
should reconfigure fd in raw_check_perm(). Still this seems more native
and simple anyway.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
Message-Id: <20210428151804.439460-31-vsement...@virtuozzo.com>
Signed-off-by: Kevin Wolf 
---
 include/block/block.h |   3 +-
 block.c   | 187 --
 block/file-posix.c|  91 +++-
 3 files changed, 84 insertions(+), 197 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index ad38259c91..8d5b3ecebd 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -208,8 +208,7 @@ typedef struct BDRVReopenState {
 BlockdevDetectZeroesOptions detect_zeroes;
 bool backing_missing;
 bool replace_backing_bs;  /* new_backing_bs is ignored if this is false */
-BlockDriverState *new_backing_bs; /* If NULL then detach the current bs */
-uint64_t perm, shared_perm;
+BlockDriverState *old_backing_bs; /* keep pointer for permissions update */
 QDict *options;
 QDict *explicit_options;
 void *opaque;
diff --git a/block.c b/block.c
index 357ec1be2c..ffaa367a1b 100644
--- a/block.c
+++ b/block.c
@@ -95,8 +95,9 @@ static int bdrv_attach_child_noperm(BlockDriverState 
*parent_bs,
 static void bdrv_remove_filter_or_cow_child(BlockDriverState *bs,
 Transaction *tran);
 
-static int bdrv_reopen_prepare(BDRVReopenState *reopen_state, BlockReopenQueue
-   *queue, Error **errp);
+static int bdrv_reopen_prepare(BDRVReopenState *reopen_state,
+   BlockReopenQueue *queue,
+   Transaction *set_backings_tran, Error **errp);
 static void bdrv_reopen_commit(BDRVReopenState *reopen_state);
 static void bdrv_reopen_abort(BDRVReopenState *reopen_state);
 
@@ -2468,6 +2469,7 @@ static void bdrv_list_abort_perm_update(GSList *list)
 }
 }
 
+__attribute__((unused))
 static void bdrv_abort_perm_update(BlockDriverState *bs)
 {
 g_autoptr(GSList) list = bdrv_topological_dfs(NULL, NULL, bs);
@@ -2563,6 +2565,7 @@ char *bdrv_perm_names(uint64_t perm)
  *
  * Needs to be followed by a call to either bdrv_set_perm() or
  * bdrv_abort_perm_update(). */
+__attribute__((unused))
 static int bdrv_check_update_perm(BlockDriverState *bs, BlockReopenQueue *q,
   uint64_t new_used_perm,
   uint64_t new_shared_perm,
@@ -4183,10 +4186,6 @@ static BlockReopenQueue 
*bdrv_reopen_queue_child(BlockReopenQueue *bs_queue,
 bs_entry->state.explicit_options = explicit_options;
 bs_entry->state.flags = flags;
 
-/* This needs to be overwritten in bdrv_reopen_prepare() */
-bs_entry->state.perm = UINT64_MAX;
-bs_entry->state.shared_perm = 0;
-
 /*
  * If keep_old_opts is false then it means that unspecified
  * options must be reset to their original value. We don't allow
@@ -4271,6 +4270,9 @@ int bdrv_reopen_multiple(BlockReopenQueue *bs_queue, 
Error **errp)
 {
 int ret = -1;
 BlockReopenQueueEntry *bs_entry, *next;
+Transaction *tran = tran_new();
+g_autoptr(GHashTable) found = NULL;
+g_autoptr(GSList) refresh_list = NULL;
 
 assert(bs_queue != NULL);
 
@@ -4284,33 +4286,33 @@ int bdrv_reopen_multiple(BlockReopenQueue *bs_queue, 
Error **errp)
 
 QTAILQ_FOREACH(bs_entry, bs_queue, entry) {
 assert(bs_entry->state.bs->quiesce_counter > 0);
-if (bdrv_reopen_prepare(&bs_entry->state, bs_queue, errp)) {
-goto cleanup;
+ret = bdrv_reopen_prepare(&bs_entry->state, bs_queue, tran, errp);
+if (ret < 0) {
+goto abort;
 }
 bs_entry->prepared = true;
 }
 
+found = g_hash_table_new(NULL, NULL);
 QTAILQ_FOREACH(bs_entry, bs_queue, entry) {
 BDRVReopenState *state = &bs_entry->state;
-ret = bdrv_check_perm(state->bs, bs_queue, state->perm,
-  state->shared_perm, errp);
-if (ret < 0) {
-goto cleanup_perm;
-}
-/* Check if new_backing_bs would accept the new permissions */
-if (state->replace_backing_bs && state->new_backing_bs) {
-uint64_t nperm, nshared;
-bdrv_child_perm(state->bs, state->new_backing_bs,
-NULL, bdrv_backing_role(state->bs),
-bs_queue, state->perm, state->shared_perm,
-&nperm, &nshared);
-ret = bdrv_check_update_perm(state->new_backing_bs, NULL,
- nperm, nshared, errp);
-if (ret < 0) {
-  

[PULL 25/39] block: drop ignore_children for permission update functions

2021-04-30 Thread Kevin Wolf
From: Vladimir Sementsov-Ogievskiy 

This argument is always NULL. Drop it.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
Message-Id: <20210428151804.439460-26-vsement...@virtuozzo.com>
Signed-off-by: Kevin Wolf 
---
 block.c | 38 +++---
 1 file changed, 11 insertions(+), 27 deletions(-)

diff --git a/block.c b/block.c
index 68dfd822dd..46af5852ab 100644
--- a/block.c
+++ b/block.c
@@ -1988,7 +1988,6 @@ static int bdrv_fill_options(QDict **options, const char 
*filename,
 static int bdrv_check_update_perm(BlockDriverState *bs, BlockReopenQueue *q,
   uint64_t new_used_perm,
   uint64_t new_shared_perm,
-  GSList *ignore_children,
   Error **errp);
 
 typedef struct BlockReopenQueueEntry {
@@ -2065,9 +2064,7 @@ static bool bdrv_a_allow_b(BdrvChild *a, BdrvChild *b, 
Error **errp)
 return false;
 }
 
-static bool bdrv_parent_perms_conflict(BlockDriverState *bs,
-   GSList *ignore_children,
-   Error **errp)
+static bool bdrv_parent_perms_conflict(BlockDriverState *bs, Error **errp)
 {
 BdrvChild *a, *b;
 
@@ -2077,12 +2074,8 @@ static bool bdrv_parent_perms_conflict(BlockDriverState 
*bs,
  * directions.
  */
 QLIST_FOREACH(a, &bs->parents, next_parent) {
-if (g_slist_find(ignore_children, a)) {
-continue;
-}
-
 QLIST_FOREACH(b, &bs->parents, next_parent) {
-if (a == b || g_slist_find(ignore_children, b)) {
+if (a == b) {
 continue;
 }
 
@@ -2310,7 +2303,6 @@ static void bdrv_replace_child_safe(BdrvChild *child, 
BlockDriverState *new_bs,
 static int bdrv_node_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
 uint64_t cumulative_perms,
 uint64_t cumulative_shared_perms,
-GSList *ignore_children,
 Transaction *tran, Error **errp)
 {
 BlockDriver *drv = bs->drv;
@@ -2393,7 +2385,6 @@ static int bdrv_check_perm_common(GSList *list, 
BlockReopenQueue *q,
   bool use_cumulative_perms,
   uint64_t cumulative_perms,
   uint64_t cumulative_shared_perms,
-  GSList *ignore_children,
   Transaction *tran, Error **errp)
 {
 int ret;
@@ -2404,7 +2395,7 @@ static int bdrv_check_perm_common(GSList *list, 
BlockReopenQueue *q,
 
 ret = bdrv_node_check_perm(bs, q, cumulative_perms,
cumulative_shared_perms,
-   ignore_children, tran, errp);
+   tran, errp);
 if (ret < 0) {
 return ret;
 }
@@ -2415,7 +2406,7 @@ static int bdrv_check_perm_common(GSList *list, 
BlockReopenQueue *q,
 for ( ; list; list = list->next) {
 bs = list->data;
 
-if (bdrv_parent_perms_conflict(bs, ignore_children, errp)) {
+if (bdrv_parent_perms_conflict(bs, errp)) {
 return -EINVAL;
 }
 
@@ -2424,7 +2415,7 @@ static int bdrv_check_perm_common(GSList *list, 
BlockReopenQueue *q,
 
 ret = bdrv_node_check_perm(bs, q, cumulative_perms,
cumulative_shared_perms,
-   ignore_children, tran, errp);
+   tran, errp);
 if (ret < 0) {
 return ret;
 }
@@ -2435,19 +2426,17 @@ static int bdrv_check_perm_common(GSList *list, 
BlockReopenQueue *q,
 
 static int bdrv_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
uint64_t cumulative_perms,
-   uint64_t cumulative_shared_perms,
-   GSList *ignore_children, Error **errp)
+   uint64_t cumulative_shared_perms, Error **errp)
 {
 g_autoptr(GSList) list = bdrv_topological_dfs(NULL, NULL, bs);
 return bdrv_check_perm_common(list, q, true, cumulative_perms,
-  cumulative_shared_perms, ignore_children,
-  NULL, errp);
+  cumulative_shared_perms, NULL, errp);
 }
 
 static int bdrv_list_refresh_perms(GSList *list, BlockReopenQueue *q,
Transaction *tran, Error **errp)
 {
-return bdrv_check_perm_common(list, q, false, 0, 0, NULL, tran, errp);
+return bdrv_check_perm_common(list, q, false, 0, 0, tran, errp);
 }
 
 /*
@@ -2576,7 +2565,6 @@ char *bdrv_perm_names(uint64_t perm)
 static int bdrv_check_update_perm(BlockDriverState *bs, BlockReopenQueue *q,
   uint64_t new_used_perm,
  

[PULL 20/39] block: split out bdrv_replace_node_noperm()

2021-04-30 Thread Kevin Wolf
From: Vladimir Sementsov-Ogievskiy 

Split part of bdrv_replace_node_common() to be used separately.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
Message-Id: <20210428151804.439460-21-vsement...@virtuozzo.com>
Signed-off-by: Kevin Wolf 
---
 block.c | 50 +++---
 1 file changed, 31 insertions(+), 19 deletions(-)

diff --git a/block.c b/block.c
index c74e6e7cc1..9283c8d97b 100644
--- a/block.c
+++ b/block.c
@@ -4991,6 +4991,34 @@ static bool should_update_child(BdrvChild *c, 
BlockDriverState *to)
 return ret;
 }
 
+static int bdrv_replace_node_noperm(BlockDriverState *from,
+BlockDriverState *to,
+bool auto_skip, Transaction *tran,
+Error **errp)
+{
+BdrvChild *c, *next;
+
+QLIST_FOREACH_SAFE(c, &from->parents, next_parent, next) {
+assert(c->bs == from);
+if (!should_update_child(c, to)) {
+if (auto_skip) {
+continue;
+}
+error_setg(errp, "Should not change '%s' link to '%s'",
+   c->name, from->node_name);
+return -EINVAL;
+}
+if (c->frozen) {
+error_setg(errp, "Cannot change '%s' link to '%s'",
+   c->name, from->node_name);
+return -EPERM;
+}
+bdrv_replace_child_safe(c, to, tran);
+}
+
+return 0;
+}
+
 /*
  * With auto_skip=true bdrv_replace_node_common skips updating from parents
  * if it creates a parent-child relation loop or if parent is block-job.
@@ -5002,7 +5030,6 @@ static int bdrv_replace_node_common(BlockDriverState 
*from,
 BlockDriverState *to,
 bool auto_skip, Error **errp)
 {
-BdrvChild *c, *next;
 Transaction *tran = tran_new();
 g_autoptr(GHashTable) found = NULL;
 g_autoptr(GSList) refresh_list = NULL;
@@ -5022,24 +5049,9 @@ static int bdrv_replace_node_common(BlockDriverState 
*from,
  * permissions based on new graph. If we fail, we'll roll-back the
  * replacement.
  */
-QLIST_FOREACH_SAFE(c, &from->parents, next_parent, next) {
-assert(c->bs == from);
-if (!should_update_child(c, to)) {
-if (auto_skip) {
-continue;
-}
-ret = -EINVAL;
-error_setg(errp, "Should not change '%s' link to '%s'",
-   c->name, from->node_name);
-goto out;
-}
-if (c->frozen) {
-ret = -EPERM;
-error_setg(errp, "Cannot change '%s' link to '%s'",
-   c->name, from->node_name);
-goto out;
-}
-bdrv_replace_child_safe(c, to, tran);
+ret = bdrv_replace_node_noperm(from, to, auto_skip, tran, errp);
+if (ret < 0) {
+goto out;
 }
 
 found = g_hash_table_new(NULL, NULL);
-- 
2.30.2




[PULL 19/39] block: add bdrv_attach_child_noperm() transaction action

2021-04-30 Thread Kevin Wolf
From: Vladimir Sementsov-Ogievskiy 

Split no-perm part of bdrv_attach_child as separate transaction action.
It will be used in later commits.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
Message-Id: <20210428151804.439460-20-vsement...@virtuozzo.com>
Signed-off-by: Kevin Wolf 
---
 block.c | 71 ++---
 1 file changed, 58 insertions(+), 13 deletions(-)

diff --git a/block.c b/block.c
index 4d5c60f2ae..c74e6e7cc1 100644
--- a/block.c
+++ b/block.c
@@ -85,6 +85,14 @@ static BlockDriverState *bdrv_open_inherit(const char 
*filename,
 
 static void bdrv_replace_child_noperm(BdrvChild *child,
   BlockDriverState *new_bs);
+static int bdrv_attach_child_noperm(BlockDriverState *parent_bs,
+BlockDriverState *child_bs,
+const char *child_name,
+const BdrvChildClass *child_class,
+BdrvChildRole child_role,
+BdrvChild **child,
+Transaction *tran,
+Error **errp);
 
 static int bdrv_reopen_prepare(BDRVReopenState *reopen_state, BlockReopenQueue
*queue, Error **errp);
@@ -3079,6 +3087,40 @@ static int bdrv_attach_child_common(BlockDriverState 
*child_bs,
 return 0;
 }
 
+static int bdrv_attach_child_noperm(BlockDriverState *parent_bs,
+BlockDriverState *child_bs,
+const char *child_name,
+const BdrvChildClass *child_class,
+BdrvChildRole child_role,
+BdrvChild **child,
+Transaction *tran,
+Error **errp)
+{
+int ret;
+uint64_t perm, shared_perm;
+
+assert(parent_bs->drv);
+
+bdrv_get_cumulative_perm(parent_bs, &perm, &shared_perm);
+bdrv_child_perm(parent_bs, child_bs, NULL, child_role, NULL,
+perm, shared_perm, &perm, &shared_perm);
+
+ret = bdrv_attach_child_common(child_bs, child_name, child_class,
+   child_role, perm, shared_perm, parent_bs,
+   child, tran, errp);
+if (ret < 0) {
+return ret;
+}
+
+QLIST_INSERT_HEAD(&parent_bs->children, *child, next);
+/*
+ * child is removed in bdrv_attach_child_common_abort(), so don't care to
+ * abort this change separately.
+ */
+
+return 0;
+}
+
 static void bdrv_detach_child(BdrvChild *child)
 {
 bdrv_replace_child(child, NULL);
@@ -3139,23 +3181,26 @@ BdrvChild *bdrv_attach_child(BlockDriverState 
*parent_bs,
  BdrvChildRole child_role,
  Error **errp)
 {
-BdrvChild *child;
-uint64_t perm, shared_perm;
-
-bdrv_get_cumulative_perm(parent_bs, &perm, &shared_perm);
+int ret;
+BdrvChild *child = NULL;
+Transaction *tran = tran_new();
 
-assert(parent_bs->drv);
-bdrv_child_perm(parent_bs, child_bs, NULL, child_role, NULL,
-perm, shared_perm, &perm, &shared_perm);
+ret = bdrv_attach_child_noperm(parent_bs, child_bs, child_name, 
child_class,
+   child_role, &child, tran, errp);
+if (ret < 0) {
+goto out;
+}
 
-child = bdrv_root_attach_child(child_bs, child_name, child_class,
-   child_role, perm, shared_perm, parent_bs,
-   errp);
-if (child == NULL) {
-return NULL;
+ret = bdrv_refresh_perms(parent_bs, errp);
+if (ret < 0) {
+goto out;
 }
 
-QLIST_INSERT_HEAD(&parent_bs->children, child, next);
+out:
+tran_finalize(tran, ret);
+
+bdrv_unref(child_bs);
+
 return child;
 }
 
-- 
2.30.2




[PULL 18/39] block: add bdrv_attach_child_common() transaction action

2021-04-30 Thread Kevin Wolf
From: Vladimir Sementsov-Ogievskiy 

Split out no-perm part of bdrv_root_attach_child() into separate
transaction action. bdrv_root_attach_child() now moves to new
permission update paradigm: first update graph relations then update
permissions.

qsd-jobs test output updated. Seems now permission update goes in
another order. Still, the test comment say that we only want to check
that command doesn't crash, and it's still so.

Error message is a bit misleading as it looks like job was added first.
But actually in new paradigm of graph update we can't distinguish such
things. We should update the error message, but let's not do it now.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Message-Id: <20210428151804.439460-19-vsement...@virtuozzo.com>
Signed-off-by: Kevin Wolf 
---
 block.c   | 190 ++
 tests/qemu-iotests/tests/qsd-jobs.out |   2 +-
 2 files changed, 137 insertions(+), 55 deletions(-)

diff --git a/block.c b/block.c
index 6040b9dea0..4d5c60f2ae 100644
--- a/block.c
+++ b/block.c
@@ -2955,37 +2955,74 @@ static void bdrv_replace_child(BdrvChild *child, 
BlockDriverState *new_bs)
 }
 }
 
-/*
- * This function steals the reference to child_bs from the caller.
- * That reference is later dropped by bdrv_root_unref_child().
- *
- * On failure NULL is returned, errp is set and the reference to
- * child_bs is also dropped.
- *
- * The caller must hold the AioContext lock @child_bs, but not that of @ctx
- * (unless @child_bs is already in @ctx).
- */
-BdrvChild *bdrv_root_attach_child(BlockDriverState *child_bs,
-  const char *child_name,
-  const BdrvChildClass *child_class,
-  BdrvChildRole child_role,
-  uint64_t perm, uint64_t shared_perm,
-  void *opaque, Error **errp)
+static void bdrv_remove_empty_child(BdrvChild *child)
 {
-BdrvChild *child;
-Error *local_err = NULL;
-int ret;
-AioContext *ctx;
+assert(!child->bs);
+QLIST_SAFE_REMOVE(child, next);
+g_free(child->name);
+g_free(child);
+}
 
-ret = bdrv_check_update_perm(child_bs, NULL, perm, shared_perm, NULL, 
errp);
-if (ret < 0) {
-bdrv_abort_perm_update(child_bs);
-bdrv_unref(child_bs);
-return NULL;
+typedef struct BdrvAttachChildCommonState {
+BdrvChild **child;
+AioContext *old_parent_ctx;
+AioContext *old_child_ctx;
+} BdrvAttachChildCommonState;
+
+static void bdrv_attach_child_common_abort(void *opaque)
+{
+BdrvAttachChildCommonState *s = opaque;
+BdrvChild *child = *s->child;
+BlockDriverState *bs = child->bs;
+
+bdrv_replace_child_noperm(child, NULL);
+
+if (bdrv_get_aio_context(bs) != s->old_child_ctx) {
+bdrv_try_set_aio_context(bs, s->old_child_ctx, &error_abort);
 }
 
-child = g_new(BdrvChild, 1);
-*child = (BdrvChild) {
+if (bdrv_child_get_parent_aio_context(child) != s->old_parent_ctx) {
+GSList *ignore = g_slist_prepend(NULL, child);
+
+child->klass->can_set_aio_ctx(child, s->old_parent_ctx, &ignore,
+  &error_abort);
+g_slist_free(ignore);
+ignore = g_slist_prepend(NULL, child);
+child->klass->set_aio_ctx(child, s->old_parent_ctx, &ignore);
+
+g_slist_free(ignore);
+}
+
+bdrv_unref(bs);
+bdrv_remove_empty_child(child);
+*s->child = NULL;
+}
+
+static TransactionActionDrv bdrv_attach_child_common_drv = {
+.abort = bdrv_attach_child_common_abort,
+.clean = g_free,
+};
+
+/*
+ * Common part of attaching bdrv child to bs or to blk or to job
+ */
+static int bdrv_attach_child_common(BlockDriverState *child_bs,
+const char *child_name,
+const BdrvChildClass *child_class,
+BdrvChildRole child_role,
+uint64_t perm, uint64_t shared_perm,
+void *opaque, BdrvChild **child,
+Transaction *tran, Error **errp)
+{
+BdrvChild *new_child;
+AioContext *parent_ctx;
+AioContext *child_ctx = bdrv_get_aio_context(child_bs);
+
+assert(child);
+assert(*child == NULL);
+
+new_child = g_new(BdrvChild, 1);
+*new_child = (BdrvChild) {
 .bs = NULL,
 .name   = g_strdup(child_name),
 .klass  = child_class,
@@ -2995,37 +3032,92 @@ BdrvChild *bdrv_root_attach_child(BlockDriverState 
*child_bs,
 .opaque = opaque,
 };
 
-ctx = bdrv_child_get_parent_aio_context(child);
-
-/* If the AioContexts don't match, first try to move the subtree of
+/*
+ * If the AioContexts don't match, first try to move the subtree of
  * child_bs into the AioContext of the new parent. If this doesn't work,
- * try moving 

[PULL 28/39] block: add bdrv_set_backing_noperm() transaction action

2021-04-30 Thread Kevin Wolf
From: Vladimir Sementsov-Ogievskiy 

Split out no-perm part of bdrv_set_backing_hd() as a separate
transaction action. Note the in case of existing BdrvChild we reuse it,
not recreate, just to do less actions.

We don't need to create extra reference to backing_hd as we don't lose
it in bdrv_attach_child().

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
Message-Id: <20210428151804.439460-29-vsement...@virtuozzo.com>
Signed-off-by: Kevin Wolf 
---
 block.c | 54 +-
 1 file changed, 37 insertions(+), 17 deletions(-)

diff --git a/block.c b/block.c
index 9922943793..bdfe59d94d 100644
--- a/block.c
+++ b/block.c
@@ -92,6 +92,8 @@ static int bdrv_attach_child_noperm(BlockDriverState 
*parent_bs,
 BdrvChild **child,
 Transaction *tran,
 Error **errp);
+static void bdrv_remove_filter_or_cow_child(BlockDriverState *bs,
+Transaction *tran);
 
 static int bdrv_reopen_prepare(BDRVReopenState *reopen_state, BlockReopenQueue
*queue, Error **errp);
@@ -3322,8 +3324,9 @@ static BdrvChildRole bdrv_backing_role(BlockDriverState 
*bs)
  * Sets the bs->backing link of a BDS. A new reference is created; callers
  * which don't need their own reference any more must call bdrv_unref().
  */
-int bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd,
-Error **errp)
+static int bdrv_set_backing_noperm(BlockDriverState *bs,
+   BlockDriverState *backing_hd,
+   Transaction *tran, Error **errp)
 {
 int ret = 0;
 bool update_inherits_from = bdrv_chain_contains(bs, backing_hd) &&
@@ -,36 +3336,53 @@ int bdrv_set_backing_hd(BlockDriverState *bs, 
BlockDriverState *backing_hd,
 return -EPERM;
 }
 
-if (backing_hd) {
-bdrv_ref(backing_hd);
-}
-
 if (bs->backing) {
 /* Cannot be frozen, we checked that above */
-bdrv_unref_child(bs, bs->backing);
-bs->backing = NULL;
+bdrv_unset_inherits_from(bs, bs->backing, tran);
+bdrv_remove_filter_or_cow_child(bs, tran);
 }
 
 if (!backing_hd) {
 goto out;
 }
 
-bs->backing = bdrv_attach_child(bs, backing_hd, "backing", &child_of_bds,
-bdrv_backing_role(bs), errp);
-if (!bs->backing) {
-ret = -EPERM;
-goto out;
+ret = bdrv_attach_child_noperm(bs, backing_hd, "backing",
+   &child_of_bds, bdrv_backing_role(bs),
+   &bs->backing, tran, errp);
+if (ret < 0) {
+return ret;
 }
 
-/* If backing_hd was already part of bs's backing chain, and
+
+/*
+ * If backing_hd was already part of bs's backing chain, and
  * inherits_from pointed recursively to bs then let's update it to
- * point directly to bs (else it will become NULL). */
+ * point directly to bs (else it will become NULL).
+ */
 if (update_inherits_from) {
-backing_hd->inherits_from = bs;
+bdrv_set_inherits_from(backing_hd, bs, tran);
 }
 
 out:
-bdrv_refresh_limits(bs, NULL, NULL);
+bdrv_refresh_limits(bs, tran, NULL);
+
+return 0;
+}
+
+int bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd,
+Error **errp)
+{
+int ret;
+Transaction *tran = tran_new();
+
+ret = bdrv_set_backing_noperm(bs, backing_hd, tran, errp);
+if (ret < 0) {
+goto out;
+}
+
+ret = bdrv_refresh_perms(bs, errp);
+out:
+tran_finalize(tran, ret);
 
 return ret;
 }
-- 
2.30.2




[PULL 15/39] block: add bdrv_list_* permission update functions

2021-04-30 Thread Kevin Wolf
From: Vladimir Sementsov-Ogievskiy 

Add new interface, allowing use of existing node list. It will be used
to fix bdrv_replace_node() in the further commit.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
Message-Id: <20210428151804.439460-16-vsement...@virtuozzo.com>
Signed-off-by: Kevin Wolf 
---
 block.c | 106 +---
 1 file changed, 71 insertions(+), 35 deletions(-)

diff --git a/block.c b/block.c
index ab7a4d13d8..98b7bc179c 100644
--- a/block.c
+++ b/block.c
@@ -2249,7 +2249,8 @@ static int bdrv_drv_set_perm(BlockDriverState *bs, 
uint64_t perm,
 static int bdrv_node_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
 uint64_t cumulative_perms,
 uint64_t cumulative_shared_perms,
-GSList *ignore_children, Error **errp)
+GSList *ignore_children,
+Transaction *tran, Error **errp)
 {
 BlockDriver *drv = bs->drv;
 BdrvChild *c;
@@ -2297,7 +2298,7 @@ static int bdrv_node_check_perm(BlockDriverState *bs, 
BlockReopenQueue *q,
 return 0;
 }
 
-ret = bdrv_drv_set_perm(bs, cumulative_perms, cumulative_shared_perms, 
NULL,
+ret = bdrv_drv_set_perm(bs, cumulative_perms, cumulative_shared_perms, 
tran,
 errp);
 if (ret < 0) {
 return ret;
@@ -2316,36 +2317,53 @@ static int bdrv_node_check_perm(BlockDriverState *bs, 
BlockReopenQueue *q,
 bdrv_child_perm(bs, c->bs, c, c->role, q,
 cumulative_perms, cumulative_shared_perms,
 &cur_perm, &cur_shared);
-bdrv_child_set_perm_safe(c, cur_perm, cur_shared, NULL);
+bdrv_child_set_perm_safe(c, cur_perm, cur_shared, tran);
 }
 
 return 0;
 }
 
-static int bdrv_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
-   uint64_t cumulative_perms,
-   uint64_t cumulative_shared_perms,
-   GSList *ignore_children, Error **errp)
+/*
+ * If use_cumulative_perms is true, use cumulative_perms and
+ * cumulative_shared_perms for first element of the list. Otherwise just 
refresh
+ * all permissions.
+ */
+static int bdrv_check_perm_common(GSList *list, BlockReopenQueue *q,
+  bool use_cumulative_perms,
+  uint64_t cumulative_perms,
+  uint64_t cumulative_shared_perms,
+  GSList *ignore_children,
+  Transaction *tran, Error **errp)
 {
 int ret;
-BlockDriverState *root = bs;
-g_autoptr(GSList) list = bdrv_topological_dfs(NULL, NULL, root);
+BlockDriverState *bs;
 
-for ( ; list; list = list->next) {
+if (use_cumulative_perms) {
 bs = list->data;
 
-if (bs != root) {
-if (bdrv_parent_perms_conflict(bs, ignore_children, errp)) {
-return -EINVAL;
-}
+ret = bdrv_node_check_perm(bs, q, cumulative_perms,
+   cumulative_shared_perms,
+   ignore_children, tran, errp);
+if (ret < 0) {
+return ret;
+}
 
-bdrv_get_cumulative_perm(bs, &cumulative_perms,
- &cumulative_shared_perms);
+list = list->next;
+}
+
+for ( ; list; list = list->next) {
+bs = list->data;
+
+if (bdrv_parent_perms_conflict(bs, ignore_children, errp)) {
+return -EINVAL;
 }
 
+bdrv_get_cumulative_perm(bs, &cumulative_perms,
+ &cumulative_shared_perms);
+
 ret = bdrv_node_check_perm(bs, q, cumulative_perms,
cumulative_shared_perms,
-   ignore_children, errp);
+   ignore_children, tran, errp);
 if (ret < 0) {
 return ret;
 }
@@ -2354,6 +2372,23 @@ static int bdrv_check_perm(BlockDriverState *bs, 
BlockReopenQueue *q,
 return 0;
 }
 
+static int bdrv_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
+   uint64_t cumulative_perms,
+   uint64_t cumulative_shared_perms,
+   GSList *ignore_children, Error **errp)
+{
+g_autoptr(GSList) list = bdrv_topological_dfs(NULL, NULL, bs);
+return bdrv_check_perm_common(list, q, true, cumulative_perms,
+  cumulative_shared_perms, ignore_children,
+  NULL, errp);
+}
+
+static int bdrv_list_refresh_perms(GSList *list, BlockReopenQueue *q,
+   Transaction *tran, Error **errp)
+{
+return bdrv_check_perm_common(list, q, false, 0, 0, NULL, tran, errp);
+}
+
 /*
  * No

[PULL 24/39] block/backup-top: drop .active

2021-04-30 Thread Kevin Wolf
From: Vladimir Sementsov-Ogievskiy 

We don't need this workaround anymore: bdrv_append is already smart
enough and we can use new bdrv_drop_filter().

This commit efficiently reverts also recent 705dde27c6c53b73, which
checked .active on io path. Still it said that the problem should be
theoretical. And the logic of filter removement is changed anyway.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
Message-Id: <20210428151804.439460-25-vsement...@virtuozzo.com>
Signed-off-by: Kevin Wolf 
---
 block/backup-top.c | 47 +-
 tests/qemu-iotests/283.out |  2 +-
 2 files changed, 2 insertions(+), 47 deletions(-)

diff --git a/block/backup-top.c b/block/backup-top.c
index 62d09f213e..425e3778be 100644
--- a/block/backup-top.c
+++ b/block/backup-top.c
@@ -37,7 +37,6 @@
 typedef struct BDRVBackupTopState {
 BlockCopyState *bcs;
 BdrvChild *target;
-bool active;
 int64_t cluster_size;
 } BDRVBackupTopState;
 
@@ -45,12 +44,6 @@ static coroutine_fn int backup_top_co_preadv(
 BlockDriverState *bs, uint64_t offset, uint64_t bytes,
 QEMUIOVector *qiov, int flags)
 {
-BDRVBackupTopState *s = bs->opaque;
-
-if (!s->active) {
-return -EIO;
-}
-
 return bdrv_co_preadv(bs->backing, offset, bytes, qiov, flags);
 }
 
@@ -60,10 +53,6 @@ static coroutine_fn int backup_top_cbw(BlockDriverState *bs, 
uint64_t offset,
 BDRVBackupTopState *s = bs->opaque;
 uint64_t off, end;
 
-if (!s->active) {
-return -EIO;
-}
-
 if (flags & BDRV_REQ_WRITE_UNCHANGED) {
 return 0;
 }
@@ -137,21 +126,6 @@ static void backup_top_child_perm(BlockDriverState *bs, 
BdrvChild *c,
   uint64_t perm, uint64_t shared,
   uint64_t *nperm, uint64_t *nshared)
 {
-BDRVBackupTopState *s = bs->opaque;
-
-if (!s->active) {
-/*
- * The filter node may be in process of bdrv_append(), which firstly do
- * bdrv_set_backing_hd() and then bdrv_replace_node(). This means that
- * we can't unshare BLK_PERM_WRITE during bdrv_append() operation. So,
- * let's require nothing during bdrv_append() and refresh permissions
- * after it (see bdrv_backup_top_append()).
- */
-*nperm = 0;
-*nshared = BLK_PERM_ALL;
-return;
-}
-
 if (!(role & BDRV_CHILD_FILTERED)) {
 /*
  * Target child
@@ -241,17 +215,6 @@ BlockDriverState *bdrv_backup_top_append(BlockDriverState 
*source,
 }
 appended = true;
 
-/*
- * bdrv_append() finished successfully, now we can require permissions
- * we want.
- */
-state->active = true;
-ret = bdrv_child_refresh_perms(top, top->backing, errp);
-if (ret < 0) {
-error_prepend(errp, "Cannot set permissions for backup-top filter: ");
-goto fail;
-}
-
 state->cluster_size = cluster_size;
 state->bcs = block_copy_state_new(top->backing, state->target,
   cluster_size, perf->use_copy_range,
@@ -268,7 +231,6 @@ BlockDriverState *bdrv_backup_top_append(BlockDriverState 
*source,
 
 fail:
 if (appended) {
-state->active = false;
 bdrv_backup_top_drop(top);
 } else {
 bdrv_unref(top);
@@ -283,16 +245,9 @@ void bdrv_backup_top_drop(BlockDriverState *bs)
 {
 BDRVBackupTopState *s = bs->opaque;
 
-bdrv_drained_begin(bs);
+bdrv_drop_filter(bs, &error_abort);
 
 block_copy_state_free(s->bcs);
 
-s->active = false;
-bdrv_child_refresh_perms(bs, bs->backing, &error_abort);
-bdrv_replace_node(bs, bs->backing->bs, &error_abort);
-bdrv_set_backing_hd(bs, NULL, &error_abort);
-
-bdrv_drained_end(bs);
-
 bdrv_unref(bs);
 }
diff --git a/tests/qemu-iotests/283.out b/tests/qemu-iotests/283.out
index 73eb75102f..97e62a4c94 100644
--- a/tests/qemu-iotests/283.out
+++ b/tests/qemu-iotests/283.out
@@ -5,7 +5,7 @@
 {"execute": "blockdev-add", "arguments": {"driver": "blkdebug", "image": 
"base", "node-name": "other", "take-child-perms": ["write"]}}
 {"return": {}}
 {"execute": "blockdev-backup", "arguments": {"device": "source", "sync": 
"full", "target": "target"}}
-{"error": {"class": "GenericError", "desc": "Cannot set permissions for 
backup-top filter: Conflicts with use by source as 'image', which does not 
allow 'write' on base"}}
+{"error": {"class": "GenericError", "desc": "Cannot append backup-top filter: 
Conflicts with use by source as 'image', which does not allow 'write' on base"}}
 
 === backup-top should be gone after job-finalize ===
 
-- 
2.30.2




[PULL 21/39] block: adapt bdrv_append() for inserting filters

2021-04-30 Thread Kevin Wolf
From: Vladimir Sementsov-Ogievskiy 

bdrv_append is not very good for inserting filters: it does extra
permission update as part of bdrv_set_backing_hd(). During this update
filter may conflict with other parents of top_bs.

Instead, let's first do all graph modifications and after it update
permissions.

append-greedy-filter test-case in test-bdrv-graph-mod is now works, so
move it out of debug option.

Note: bdrv_append() is still only works for backing-child based
filters. It's something to improve later.

Note2: we use the fact that bdrv_append() is used to append new nodes,
without backing child, so we don't need frozen check and inherits_from
logic from bdrv_set_backing_hd().

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
Message-Id: <20210428151804.439460-22-vsement...@virtuozzo.com>
Signed-off-by: Kevin Wolf 
---
 block.c  | 27 ---
 tests/unit/test-bdrv-graph-mod.c | 17 ++---
 2 files changed, 22 insertions(+), 22 deletions(-)

diff --git a/block.c b/block.c
index 9283c8d97b..5bb6a2bef9 100644
--- a/block.c
+++ b/block.c
@@ -5088,25 +5088,38 @@ int bdrv_replace_node(BlockDriverState *from, 
BlockDriverState *to,
  * This will modify the BlockDriverState fields, and swap contents
  * between bs_new and bs_top. Both bs_new and bs_top are modified.
  *
- * bs_new must not be attached to a BlockBackend.
+ * bs_new must not be attached to a BlockBackend and must not have backing
+ * child.
  *
  * This function does not create any image files.
  */
 int bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top,
 Error **errp)
 {
-int ret = bdrv_set_backing_hd(bs_new, bs_top, errp);
+int ret;
+Transaction *tran = tran_new();
+
+assert(!bs_new->backing);
+
+ret = bdrv_attach_child_noperm(bs_new, bs_top, "backing",
+   &child_of_bds, bdrv_backing_role(bs_new),
+   &bs_new->backing, tran, errp);
 if (ret < 0) {
-return ret;
+goto out;
 }
 
-ret = bdrv_replace_node(bs_top, bs_new, errp);
+ret = bdrv_replace_node_noperm(bs_top, bs_new, true, tran, errp);
 if (ret < 0) {
-bdrv_set_backing_hd(bs_new, NULL, &error_abort);
-return ret;
+goto out;
 }
 
-return 0;
+ret = bdrv_refresh_perms(bs_new, errp);
+out:
+tran_finalize(tran, ret);
+
+bdrv_refresh_limits(bs_top, NULL);
+
+return ret;
 }
 
 static void bdrv_delete(BlockDriverState *bs)
diff --git a/tests/unit/test-bdrv-graph-mod.c b/tests/unit/test-bdrv-graph-mod.c
index 7b3c8b437a..88f25c0cdb 100644
--- a/tests/unit/test-bdrv-graph-mod.c
+++ b/tests/unit/test-bdrv-graph-mod.c
@@ -388,16 +388,6 @@ static void test_append_greedy_filter(void)
 
 int main(int argc, char *argv[])
 {
-int i;
-bool debug = false;
-
-for (i = 1; i < argc; i++) {
-if (!strcmp(argv[i], "-d")) {
-debug = true;
-break;
-}
-}
-
 bdrv_init();
 qemu_init_main_loop(&error_abort);
 
@@ -410,11 +400,8 @@ int main(int argc, char *argv[])
 test_parallel_perm_update);
 g_test_add_func("/bdrv-graph-mod/parallel-exclusive-write",
 test_parallel_exclusive_write);
-
-if (debug) {
-g_test_add_func("/bdrv-graph-mod/append-greedy-filter",
-test_append_greedy_filter);
-}
+g_test_add_func("/bdrv-graph-mod/append-greedy-filter",
+test_append_greedy_filter);
 
 return g_test_run();
 }
-- 
2.30.2




[PULL 11/39] block: rewrite bdrv_child_try_set_perm() using bdrv_refresh_perms()

2021-04-30 Thread Kevin Wolf
From: Vladimir Sementsov-Ogievskiy 

We are going to drop recursive bdrv_child_* functions, so stop use them
in bdrv_child_try_set_perm() as a first step.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
Message-Id: <20210428151804.439460-12-vsement...@virtuozzo.com>
Signed-off-by: Kevin Wolf 
---
 block.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/block.c b/block.c
index 0ee0c2f29a..4511766825 100644
--- a/block.c
+++ b/block.c
@@ -2454,11 +2454,16 @@ int bdrv_child_try_set_perm(BdrvChild *c, uint64_t 
perm, uint64_t shared,
 Error **errp)
 {
 Error *local_err = NULL;
+Transaction *tran = tran_new();
 int ret;
 
-ret = bdrv_child_check_perm(c, NULL, perm, shared, NULL, &local_err);
+bdrv_child_set_perm_safe(c, perm, shared, tran);
+
+ret = bdrv_refresh_perms(c->bs, &local_err);
+
+tran_finalize(tran, ret);
+
 if (ret < 0) {
-bdrv_child_abort_perm_update(c);
 if ((perm & ~c->perm) || (c->shared_perm & ~shared)) {
 /* tighten permissions */
 error_propagate(errp, local_err);
@@ -2472,12 +2477,9 @@ int bdrv_child_try_set_perm(BdrvChild *c, uint64_t perm, 
uint64_t shared,
 error_free(local_err);
 ret = 0;
 }
-return ret;
 }
 
-bdrv_child_set_perm(c);
-
-return 0;
+return ret;
 }
 
 int bdrv_child_refresh_perms(BlockDriverState *bs, BdrvChild *c, Error **errp)
-- 
2.30.2




[PULL 10/39] block: refactor bdrv_child* permission functions

2021-04-30 Thread Kevin Wolf
From: Vladimir Sementsov-Ogievskiy 

Split out non-recursive parts, and refactor as block graph transaction
action.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
Message-Id: <20210428151804.439460-11-vsement...@virtuozzo.com>
Signed-off-by: Kevin Wolf 
---
 block.c | 79 ++---
 1 file changed, 59 insertions(+), 20 deletions(-)

diff --git a/block.c b/block.c
index fd621f0403..0ee0c2f29a 100644
--- a/block.c
+++ b/block.c
@@ -49,6 +49,7 @@
 #include "qemu/timer.h"
 #include "qemu/cutils.h"
 #include "qemu/id.h"
+#include "qemu/transactions.h"
 #include "block/coroutines.h"
 
 #ifdef CONFIG_BSD
@@ -2093,6 +2094,61 @@ static void bdrv_child_perm(BlockDriverState *bs, 
BlockDriverState *child_bs,
 }
 }
 
+static void bdrv_child_set_perm_commit(void *opaque)
+{
+BdrvChild *c = opaque;
+
+c->has_backup_perm = false;
+}
+
+static void bdrv_child_set_perm_abort(void *opaque)
+{
+BdrvChild *c = opaque;
+/*
+ * We may have child->has_backup_perm unset at this point, as in case of
+ * _check_ stage of permission update failure we may _check_ not the whole
+ * subtree.  Still, _abort_ is called on the whole subtree anyway.
+ */
+if (c->has_backup_perm) {
+c->perm = c->backup_perm;
+c->shared_perm = c->backup_shared_perm;
+c->has_backup_perm = false;
+}
+}
+
+static TransactionActionDrv bdrv_child_set_pem_drv = {
+.abort = bdrv_child_set_perm_abort,
+.commit = bdrv_child_set_perm_commit,
+};
+
+/*
+ * With tran=NULL needs to be followed by direct call to either
+ * bdrv_child_set_perm_commit() or bdrv_child_set_perm_abort().
+ *
+ * With non-NULL tran needs to be followed by tran_abort() or tran_commit()
+ * instead.
+ */
+static void bdrv_child_set_perm_safe(BdrvChild *c, uint64_t perm,
+ uint64_t shared, Transaction *tran)
+{
+if (!c->has_backup_perm) {
+c->has_backup_perm = true;
+c->backup_perm = c->perm;
+c->backup_shared_perm = c->shared_perm;
+}
+/*
+ * Note: it's OK if c->has_backup_perm was already set, as we can find the
+ * same c twice during check_perm procedure
+ */
+
+c->perm = perm;
+c->shared_perm = shared;
+
+if (tran) {
+tran_add(tran, &bdrv_child_set_pem_drv, c);
+}
+}
+
 /*
  * Check whether permissions on this node can be changed in a way that
  * @cumulative_perms and @cumulative_shared_perms are the new cumulative
@@ -2358,37 +2414,20 @@ static int bdrv_child_check_perm(BdrvChild *c, 
BlockReopenQueue *q,
 return ret;
 }
 
-if (!c->has_backup_perm) {
-c->has_backup_perm = true;
-c->backup_perm = c->perm;
-c->backup_shared_perm = c->shared_perm;
-}
-/*
- * Note: it's OK if c->has_backup_perm was already set, as we can find the
- * same child twice during check_perm procedure
- */
-
-c->perm = perm;
-c->shared_perm = shared;
+bdrv_child_set_perm_safe(c, perm, shared, NULL);
 
 return 0;
 }
 
 static void bdrv_child_set_perm(BdrvChild *c)
 {
-c->has_backup_perm = false;
-
+bdrv_child_set_perm_commit(c);
 bdrv_set_perm(c->bs);
 }
 
 static void bdrv_child_abort_perm_update(BdrvChild *c)
 {
-if (c->has_backup_perm) {
-c->perm = c->backup_perm;
-c->shared_perm = c->backup_shared_perm;
-c->has_backup_perm = false;
-}
-
+bdrv_child_set_perm_abort(c);
 bdrv_abort_perm_update(c->bs);
 }
 
-- 
2.30.2




[PULL 22/39] block: add bdrv_remove_filter_or_cow transaction action

2021-04-30 Thread Kevin Wolf
From: Vladimir Sementsov-Ogievskiy 

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
Message-Id: <20210428151804.439460-23-vsement...@virtuozzo.com>
Signed-off-by: Kevin Wolf 
---
 block.c | 84 +++--
 1 file changed, 82 insertions(+), 2 deletions(-)

diff --git a/block.c b/block.c
index 5bb6a2bef9..2ea9cc110d 100644
--- a/block.c
+++ b/block.c
@@ -2963,12 +2963,19 @@ static void bdrv_replace_child(BdrvChild *child, 
BlockDriverState *new_bs)
 }
 }
 
+static void bdrv_child_free(void *opaque)
+{
+BdrvChild *c = opaque;
+
+g_free(c->name);
+g_free(c);
+}
+
 static void bdrv_remove_empty_child(BdrvChild *child)
 {
 assert(!child->bs);
 QLIST_SAFE_REMOVE(child, next);
-g_free(child->name);
-g_free(child);
+bdrv_child_free(child);
 }
 
 typedef struct BdrvAttachChildCommonState {
@@ -4991,6 +4998,79 @@ static bool should_update_child(BdrvChild *c, 
BlockDriverState *to)
 return ret;
 }
 
+typedef struct BdrvRemoveFilterOrCowChild {
+BdrvChild *child;
+bool is_backing;
+} BdrvRemoveFilterOrCowChild;
+
+static void bdrv_remove_filter_or_cow_child_abort(void *opaque)
+{
+BdrvRemoveFilterOrCowChild *s = opaque;
+BlockDriverState *parent_bs = s->child->opaque;
+
+QLIST_INSERT_HEAD(&parent_bs->children, s->child, next);
+if (s->is_backing) {
+parent_bs->backing = s->child;
+} else {
+parent_bs->file = s->child;
+}
+
+/*
+ * We don't have to restore child->bs here to undo bdrv_replace_child()
+ * because that function is transactionable and it registered own 
completion
+ * entries in @tran, so .abort() for bdrv_replace_child_safe() will be
+ * called automatically.
+ */
+}
+
+static void bdrv_remove_filter_or_cow_child_commit(void *opaque)
+{
+BdrvRemoveFilterOrCowChild *s = opaque;
+
+bdrv_child_free(s->child);
+}
+
+static TransactionActionDrv bdrv_remove_filter_or_cow_child_drv = {
+.abort = bdrv_remove_filter_or_cow_child_abort,
+.commit = bdrv_remove_filter_or_cow_child_commit,
+.clean = g_free,
+};
+
+/*
+ * A function to remove backing-chain child of @bs if exists: cow child for
+ * format nodes (always .backing) and filter child for filters (may be .file or
+ * .backing)
+ */
+__attribute__((unused))
+static void bdrv_remove_filter_or_cow_child(BlockDriverState *bs,
+Transaction *tran)
+{
+BdrvRemoveFilterOrCowChild *s;
+BdrvChild *child = bdrv_filter_or_cow_child(bs);
+
+if (!child) {
+return;
+}
+
+if (child->bs) {
+bdrv_replace_child_safe(child, NULL, tran);
+}
+
+s = g_new(BdrvRemoveFilterOrCowChild, 1);
+*s = (BdrvRemoveFilterOrCowChild) {
+.child = child,
+.is_backing = (child == bs->backing),
+};
+tran_add(tran, &bdrv_remove_filter_or_cow_child_drv, s);
+
+QLIST_SAFE_REMOVE(child, next);
+if (s->is_backing) {
+bs->backing = NULL;
+} else {
+bs->file = NULL;
+}
+}
+
 static int bdrv_replace_node_noperm(BlockDriverState *from,
 BlockDriverState *to,
 bool auto_skip, Transaction *tran,
-- 
2.30.2




[PULL 14/39] block: add bdrv_drv_set_perm transaction action

2021-04-30 Thread Kevin Wolf
From: Vladimir Sementsov-Ogievskiy 

Refactor calling driver callbacks to a separate transaction action to
be used later.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
Message-Id: <20210428151804.439460-15-vsement...@virtuozzo.com>
Signed-off-by: Kevin Wolf 
---
 block.c | 70 -
 1 file changed, 54 insertions(+), 16 deletions(-)

diff --git a/block.c b/block.c
index cbcc4c15a0..ab7a4d13d8 100644
--- a/block.c
+++ b/block.c
@@ -2189,6 +2189,54 @@ static void bdrv_child_set_perm_safe(BdrvChild *c, 
uint64_t perm,
 }
 }
 
+static void bdrv_drv_set_perm_commit(void *opaque)
+{
+BlockDriverState *bs = opaque;
+uint64_t cumulative_perms, cumulative_shared_perms;
+
+if (bs->drv->bdrv_set_perm) {
+bdrv_get_cumulative_perm(bs, &cumulative_perms,
+ &cumulative_shared_perms);
+bs->drv->bdrv_set_perm(bs, cumulative_perms, cumulative_shared_perms);
+}
+}
+
+static void bdrv_drv_set_perm_abort(void *opaque)
+{
+BlockDriverState *bs = opaque;
+
+if (bs->drv->bdrv_abort_perm_update) {
+bs->drv->bdrv_abort_perm_update(bs);
+}
+}
+
+TransactionActionDrv bdrv_drv_set_perm_drv = {
+.abort = bdrv_drv_set_perm_abort,
+.commit = bdrv_drv_set_perm_commit,
+};
+
+static int bdrv_drv_set_perm(BlockDriverState *bs, uint64_t perm,
+ uint64_t shared_perm, Transaction *tran,
+ Error **errp)
+{
+if (!bs->drv) {
+return 0;
+}
+
+if (bs->drv->bdrv_check_perm) {
+int ret = bs->drv->bdrv_check_perm(bs, perm, shared_perm, errp);
+if (ret < 0) {
+return ret;
+}
+}
+
+if (tran) {
+tran_add(tran, &bdrv_drv_set_perm_drv, bs);
+}
+
+return 0;
+}
+
 /*
  * Check whether permissions on this node can be changed in a way that
  * @cumulative_perms and @cumulative_shared_perms are the new cumulative
@@ -2249,12 +2297,10 @@ static int bdrv_node_check_perm(BlockDriverState *bs, 
BlockReopenQueue *q,
 return 0;
 }
 
-if (drv->bdrv_check_perm) {
-ret = drv->bdrv_check_perm(bs, cumulative_perms,
-   cumulative_shared_perms, errp);
-if (ret < 0) {
-return ret;
-}
+ret = bdrv_drv_set_perm(bs, cumulative_perms, cumulative_shared_perms, 
NULL,
+errp);
+if (ret < 0) {
+return ret;
 }
 
 /* Drivers that never have children can omit .bdrv_child_perm() */
@@ -2322,9 +2368,7 @@ static void bdrv_node_abort_perm_update(BlockDriverState 
*bs)
 return;
 }
 
-if (drv->bdrv_abort_perm_update) {
-drv->bdrv_abort_perm_update(bs);
-}
+bdrv_drv_set_perm_abort(bs);
 
 QLIST_FOREACH(c, &bs->children, next) {
 bdrv_child_set_perm_abort(c);
@@ -2342,7 +2386,6 @@ static void bdrv_abort_perm_update(BlockDriverState *bs)
 
 static void bdrv_node_set_perm(BlockDriverState *bs)
 {
-uint64_t cumulative_perms, cumulative_shared_perms;
 BlockDriver *drv = bs->drv;
 BdrvChild *c;
 
@@ -2350,12 +2393,7 @@ static void bdrv_node_set_perm(BlockDriverState *bs)
 return;
 }
 
-bdrv_get_cumulative_perm(bs, &cumulative_perms, &cumulative_shared_perms);
-
-/* Update this node */
-if (drv->bdrv_set_perm) {
-drv->bdrv_set_perm(bs, cumulative_perms, cumulative_shared_perms);
-}
+bdrv_drv_set_perm_commit(bs);
 
 /* Drivers that never have children can omit .bdrv_child_perm() */
 if (!drv->bdrv_child_perm) {
-- 
2.30.2




[PULL 23/39] block: introduce bdrv_drop_filter()

2021-04-30 Thread Kevin Wolf
From: Vladimir Sementsov-Ogievskiy 

Using bdrv_replace_node() for removing filter is not good enough: it
keeps child reference of the filter, which may conflict with original
top node during permission update.

Instead let's create new interface, which will do all graph
modifications first and then update permissions.

Let's modify bdrv_replace_node_common(), allowing it additionally drop
backing chain child link pointing to new node. This is quite
appropriate for bdrv_drop_intermediate() and makes possible to add
new bdrv_drop_filter() as a simple wrapper.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
Message-Id: <20210428151804.439460-24-vsement...@virtuozzo.com>
Signed-off-by: Kevin Wolf 
---
 include/block/block.h |  1 +
 block.c   | 43 +++
 2 files changed, 40 insertions(+), 4 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index 16e496a5c4..85481a05c6 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -362,6 +362,7 @@ int bdrv_replace_node(BlockDriverState *from, 
BlockDriverState *to,
   Error **errp);
 BlockDriverState *bdrv_insert_node(BlockDriverState *bs, QDict *node_options,
int flags, Error **errp);
+int bdrv_drop_filter(BlockDriverState *bs, Error **errp);
 
 int bdrv_parse_aio(const char *mode, int *flags);
 int bdrv_parse_cache_mode(const char *mode, int *flags, bool *writethrough);
diff --git a/block.c b/block.c
index 2ea9cc110d..68dfd822dd 100644
--- a/block.c
+++ b/block.c
@@ -5041,7 +5041,6 @@ static TransactionActionDrv 
bdrv_remove_filter_or_cow_child_drv = {
  * format nodes (always .backing) and filter child for filters (may be .file or
  * .backing)
  */
-__attribute__((unused))
 static void bdrv_remove_filter_or_cow_child(BlockDriverState *bs,
 Transaction *tran)
 {
@@ -5105,16 +5104,32 @@ static int bdrv_replace_node_noperm(BlockDriverState 
*from,
  *
  * With auto_skip=false the error is returned if from has a parent which should
  * not be updated.
+ *
+ * With @detach_subchain=true @to must be in a backing chain of @from. In this
+ * case backing link of the cow-parent of @to is removed.
  */
 static int bdrv_replace_node_common(BlockDriverState *from,
 BlockDriverState *to,
-bool auto_skip, Error **errp)
+bool auto_skip, bool detach_subchain,
+Error **errp)
 {
 Transaction *tran = tran_new();
 g_autoptr(GHashTable) found = NULL;
 g_autoptr(GSList) refresh_list = NULL;
+BlockDriverState *to_cow_parent;
 int ret;
 
+if (detach_subchain) {
+assert(bdrv_chain_contains(from, to));
+assert(from != to);
+for (to_cow_parent = from;
+ bdrv_filter_or_cow_bs(to_cow_parent) != to;
+ to_cow_parent = bdrv_filter_or_cow_bs(to_cow_parent))
+{
+;
+}
+}
+
 /* Make sure that @from doesn't go away until we have successfully attached
  * all of its parents to @to. */
 bdrv_ref(from);
@@ -5134,6 +5149,10 @@ static int bdrv_replace_node_common(BlockDriverState 
*from,
 goto out;
 }
 
+if (detach_subchain) {
+bdrv_remove_filter_or_cow_child(to_cow_parent, tran);
+}
+
 found = g_hash_table_new(NULL, NULL);
 
 refresh_list = bdrv_topological_dfs(refresh_list, found, to);
@@ -5158,7 +5177,13 @@ out:
 int bdrv_replace_node(BlockDriverState *from, BlockDriverState *to,
   Error **errp)
 {
-return bdrv_replace_node_common(from, to, true, errp);
+return bdrv_replace_node_common(from, to, true, false, errp);
+}
+
+int bdrv_drop_filter(BlockDriverState *bs, Error **errp)
+{
+return bdrv_replace_node_common(bs, bdrv_filter_or_cow_bs(bs), true, true,
+errp);
 }
 
 /*
@@ -5493,7 +5518,17 @@ int bdrv_drop_intermediate(BlockDriverState *top, 
BlockDriverState *base,
 updated_children = g_slist_prepend(updated_children, c);
 }
 
-bdrv_replace_node_common(top, base, false, &local_err);
+/*
+ * It seems correct to pass detach_subchain=true here, but it triggers
+ * one more yet not fixed bug, when due to nested aio_poll loop we switch 
to
+ * another drained section, which modify the graph (for example, removing
+ * the child, which we keep in updated_children list). So, it's a TODO.
+ *
+ * Note, bug triggered if pass detach_subchain=true here and run
+ * test-bdrv-drain. test_drop_intermediate_poll() test-case will crash.
+ * That's a FIXME.
+ */
+bdrv_replace_node_common(top, base, false, false, &local_err);
 if (local_err) {
 error_report_err(local_err);
 goto exit;
-- 
2.30.2




[PULL 16/39] block: add bdrv_replace_child_safe() transaction action

2021-04-30 Thread Kevin Wolf
From: Vladimir Sementsov-Ogievskiy 

To be used in the following commit.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
Message-Id: <20210428151804.439460-17-vsement...@virtuozzo.com>
Signed-off-by: Kevin Wolf 
---
 block.c | 54 ++
 1 file changed, 54 insertions(+)

diff --git a/block.c b/block.c
index 98b7bc179c..a5305662dc 100644
--- a/block.c
+++ b/block.c
@@ -83,6 +83,9 @@ static BlockDriverState *bdrv_open_inherit(const char 
*filename,
BdrvChildRole child_role,
Error **errp);
 
+static void bdrv_replace_child_noperm(BdrvChild *child,
+  BlockDriverState *new_bs);
+
 static int bdrv_reopen_prepare(BDRVReopenState *reopen_state, BlockReopenQueue
*queue, Error **errp);
 static void bdrv_reopen_commit(BDRVReopenState *reopen_state);
@@ -2237,6 +2240,57 @@ static int bdrv_drv_set_perm(BlockDriverState *bs, 
uint64_t perm,
 return 0;
 }
 
+typedef struct BdrvReplaceChildState {
+BdrvChild *child;
+BlockDriverState *old_bs;
+} BdrvReplaceChildState;
+
+static void bdrv_replace_child_commit(void *opaque)
+{
+BdrvReplaceChildState *s = opaque;
+
+bdrv_unref(s->old_bs);
+}
+
+static void bdrv_replace_child_abort(void *opaque)
+{
+BdrvReplaceChildState *s = opaque;
+BlockDriverState *new_bs = s->child->bs;
+
+/* old_bs reference is transparently moved from @s to @s->child */
+bdrv_replace_child_noperm(s->child, s->old_bs);
+bdrv_unref(new_bs);
+}
+
+static TransactionActionDrv bdrv_replace_child_drv = {
+.commit = bdrv_replace_child_commit,
+.abort = bdrv_replace_child_abort,
+.clean = g_free,
+};
+
+/*
+ * bdrv_replace_child_safe
+ *
+ * Note: real unref of old_bs is done only on commit.
+ */
+__attribute__((unused))
+static void bdrv_replace_child_safe(BdrvChild *child, BlockDriverState *new_bs,
+Transaction *tran)
+{
+BdrvReplaceChildState *s = g_new(BdrvReplaceChildState, 1);
+*s = (BdrvReplaceChildState) {
+.child = child,
+.old_bs = child->bs,
+};
+tran_add(tran, &bdrv_replace_child_drv, s);
+
+if (new_bs) {
+bdrv_ref(new_bs);
+}
+bdrv_replace_child_noperm(child, new_bs);
+/* old_bs reference is transparently moved from @child to @s */
+}
+
 /*
  * Check whether permissions on this node can be changed in a way that
  * @cumulative_perms and @cumulative_shared_perms are the new cumulative
-- 
2.30.2




[PULL 17/39] block: fix bdrv_replace_node_common

2021-04-30 Thread Kevin Wolf
From: Vladimir Sementsov-Ogievskiy 

inore_children thing doesn't help to track all propagated permissions
of children we want to ignore. The simplest way to correctly update
permissions is update graph first and then do permission update. In
this case we just referesh permissions for the whole subgraph (in
topological-sort defined order) and everything is correctly calculated
automatically without any ignore_children.

So, refactor bdrv_replace_node_common to first do graph update and then
refresh the permissions.

Test test_parallel_exclusive_write() now pass, so move it out of
debugging "if".

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
Message-Id: <20210428151804.439460-18-vsement...@virtuozzo.com>
Signed-off-by: Kevin Wolf 
---
 block.c  | 43 +---
 tests/unit/test-bdrv-graph-mod.c |  4 +--
 2 files changed, 20 insertions(+), 27 deletions(-)

diff --git a/block.c b/block.c
index a5305662dc..6040b9dea0 100644
--- a/block.c
+++ b/block.c
@@ -2273,7 +2273,6 @@ static TransactionActionDrv bdrv_replace_child_drv = {
  *
  * Note: real unref of old_bs is done only on commit.
  */
-__attribute__((unused))
 static void bdrv_replace_child_safe(BdrvChild *child, BlockDriverState *new_bs,
 Transaction *tran)
 {
@@ -4877,8 +4876,9 @@ static int bdrv_replace_node_common(BlockDriverState 
*from,
 bool auto_skip, Error **errp)
 {
 BdrvChild *c, *next;
-GSList *list = NULL, *p;
-uint64_t perm = 0, shared = BLK_PERM_ALL;
+Transaction *tran = tran_new();
+g_autoptr(GHashTable) found = NULL;
+g_autoptr(GSList) refresh_list = NULL;
 int ret;
 
 /* Make sure that @from doesn't go away until we have successfully attached
@@ -4889,7 +4889,12 @@ static int bdrv_replace_node_common(BlockDriverState 
*from,
 assert(bdrv_get_aio_context(from) == bdrv_get_aio_context(to));
 bdrv_drained_begin(from);
 
-/* Put all parents into @list and calculate their cumulative permissions */
+/*
+ * Do the replacement without permission update.
+ * Replacement may influence the permissions, we should calculate new
+ * permissions based on new graph. If we fail, we'll roll-back the
+ * replacement.
+ */
 QLIST_FOREACH_SAFE(c, &from->parents, next_parent, next) {
 assert(c->bs == from);
 if (!should_update_child(c, to)) {
@@ -4907,36 +4912,24 @@ static int bdrv_replace_node_common(BlockDriverState 
*from,
c->name, from->node_name);
 goto out;
 }
-list = g_slist_prepend(list, c);
-perm |= c->perm;
-shared &= c->shared_perm;
+bdrv_replace_child_safe(c, to, tran);
 }
 
-/* Check whether the required permissions can be granted on @to, ignoring
- * all BdrvChild in @list so that they can't block themselves. */
-ret = bdrv_check_update_perm(to, NULL, perm, shared, list, errp);
-if (ret < 0) {
-bdrv_abort_perm_update(to);
-goto out;
-}
+found = g_hash_table_new(NULL, NULL);
 
-/* Now actually perform the change. We performed the permission check for
- * all elements of @list at once, so set the permissions all at once at the
- * very end. */
-for (p = list; p != NULL; p = p->next) {
-c = p->data;
+refresh_list = bdrv_topological_dfs(refresh_list, found, to);
+refresh_list = bdrv_topological_dfs(refresh_list, found, from);
 
-bdrv_ref(to);
-bdrv_replace_child_noperm(c, to);
-bdrv_unref(from);
+ret = bdrv_list_refresh_perms(refresh_list, NULL, tran, errp);
+if (ret < 0) {
+goto out;
 }
 
-bdrv_set_perm(to);
-
 ret = 0;
 
 out:
-g_slist_free(list);
+tran_finalize(tran, ret);
+
 bdrv_drained_end(from);
 bdrv_unref(from);
 
diff --git a/tests/unit/test-bdrv-graph-mod.c b/tests/unit/test-bdrv-graph-mod.c
index e64e81a07c..7b3c8b437a 100644
--- a/tests/unit/test-bdrv-graph-mod.c
+++ b/tests/unit/test-bdrv-graph-mod.c
@@ -408,10 +408,10 @@ int main(int argc, char *argv[])
 test_should_update_child);
 g_test_add_func("/bdrv-graph-mod/parallel-perm-update",
 test_parallel_perm_update);
+g_test_add_func("/bdrv-graph-mod/parallel-exclusive-write",
+test_parallel_exclusive_write);
 
 if (debug) {
-g_test_add_func("/bdrv-graph-mod/parallel-exclusive-write",
-test_parallel_exclusive_write);
 g_test_add_func("/bdrv-graph-mod/append-greedy-filter",
 test_append_greedy_filter);
 }
-- 
2.30.2




[PULL 08/39] util: add transactions.c

2021-04-30 Thread Kevin Wolf
From: Vladimir Sementsov-Ogievskiy 

Add simple transaction API to use in further update of block graph
operations.

Supposed usage is:

- "prepare" is main function of the action and it should make the main
  effect of the action to be visible for the following actions, keeping
  possibility of roll-back, saving necessary things in action state,
  which is prepended to the action list (to do that, prepare func
  should call tran_add()). So, driver struct doesn't include "prepare"
  field, as it is supposed to be called directly.

- commit/rollback is supposed to be called for the list of action
  states, to commit/rollback all the actions in reverse order

- When possible "commit" should not make visible effect for other
  actions, which make possible transparent logical interaction between
  actions.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
Message-Id: <20210428151804.439460-9-vsement...@virtuozzo.com>
Signed-off-by: Kevin Wolf 
---
 include/qemu/transactions.h | 63 
 util/transactions.c | 96 +
 MAINTAINERS |  6 +++
 util/meson.build|  1 +
 4 files changed, 166 insertions(+)
 create mode 100644 include/qemu/transactions.h
 create mode 100644 util/transactions.c

diff --git a/include/qemu/transactions.h b/include/qemu/transactions.h
new file mode 100644
index 00..92c5965235
--- /dev/null
+++ b/include/qemu/transactions.h
@@ -0,0 +1,63 @@
+/*
+ * Simple transactions API
+ *
+ * Copyright (c) 2021 Virtuozzo International GmbH.
+ *
+ * Author:
+ *  Vladimir Sementsov-Ogievskiy 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see .
+ *
+ *
+ * = Generic transaction API =
+ *
+ * The intended usage is the following: you create "prepare" functions, which
+ * represents the actions. They will usually have Transaction* argument, and
+ * call tran_add() to register finalization callbacks. For finalization
+ * callbacks, prepare corresponding TransactionActionDrv structures.
+ *
+ * Then, when you need to make a transaction, create an empty Transaction by
+ * tran_create(), call your "prepare" functions on it, and finally call
+ * tran_abort() or tran_commit() to finalize the transaction by corresponding
+ * finalization actions in reverse order.
+ */
+
+#ifndef QEMU_TRANSACTIONS_H
+#define QEMU_TRANSACTIONS_H
+
+#include 
+
+typedef struct TransactionActionDrv {
+void (*abort)(void *opaque);
+void (*commit)(void *opaque);
+void (*clean)(void *opaque);
+} TransactionActionDrv;
+
+typedef struct Transaction Transaction;
+
+Transaction *tran_new(void);
+void tran_add(Transaction *tran, TransactionActionDrv *drv, void *opaque);
+void tran_abort(Transaction *tran);
+void tran_commit(Transaction *tran);
+
+static inline void tran_finalize(Transaction *tran, int ret)
+{
+if (ret < 0) {
+tran_abort(tran);
+} else {
+tran_commit(tran);
+}
+}
+
+#endif /* QEMU_TRANSACTIONS_H */
diff --git a/util/transactions.c b/util/transactions.c
new file mode 100644
index 00..d0bc9a3e73
--- /dev/null
+++ b/util/transactions.c
@@ -0,0 +1,96 @@
+/*
+ * Simple transactions API
+ *
+ * Copyright (c) 2021 Virtuozzo International GmbH.
+ *
+ * Author:
+ *  Sementsov-Ogievskiy Vladimir 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see .
+ */
+
+#include "qemu/osdep.h"
+
+#include "qemu/transactions.h"
+#include "qemu/queue.h"
+
+typedef struct TransactionAction {
+TransactionActionDrv *drv;
+void *opaque;
+QSLIST_ENTRY(TransactionAction) entry;
+} TransactionAction;
+
+struct Transaction {
+QSLIST_HEAD(, TransactionAction) actions;
+};
+
+Transaction *tran_new(void)
+{
+Transaction *tran = g_new(Transaction, 1);
+
+QSLIST_INIT(&tran->actions);
+
+  

[PULL 03/39] tests/test-bdrv-graph-mod: add test_append_greedy_filter

2021-04-30 Thread Kevin Wolf
From: Vladimir Sementsov-Ogievskiy 

bdrv_append() is not quite good for inserting filters: it does extra
permission update in intermediate state, where filter get it filtered
child but is not yet replace it in a backing chain.

Some filters (for example backup-top) may want permissions even when
have no parents. And described intermediate state becomes invalid.

That's (half a) reason, why we need "inactive" state for backup-top
filter.

bdrv_append() will be improved later, now let's add a unit test.

Now test fails, so it runs only with -d flag. To run do

  ./test-bdrv-graph-mod -d -p /bdrv-graph-mod/append-greedy-filter

from /tests.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
Message-Id: <20210428151804.439460-4-vsement...@virtuozzo.com>
Signed-off-by: Kevin Wolf 
---
 tests/unit/test-bdrv-graph-mod.c | 33 
 1 file changed, 33 insertions(+)

diff --git a/tests/unit/test-bdrv-graph-mod.c b/tests/unit/test-bdrv-graph-mod.c
index a8219b131e..5b6934e68b 100644
--- a/tests/unit/test-bdrv-graph-mod.c
+++ b/tests/unit/test-bdrv-graph-mod.c
@@ -352,6 +352,37 @@ static void test_parallel_perm_update(void)
 bdrv_unref(top);
 }
 
+/*
+ * It's possible that filter required permissions allows to insert it to 
backing
+ * chain, like:
+ *
+ *  1.  [top] -> [filter] -> [base]
+ *
+ * but doesn't allow to add it as a branch:
+ *
+ *  2.  [filter] --\
+ * v
+ *  [top] -> [base]
+ *
+ * So, inserting such filter should do all graph modifications and only then
+ * update permissions. If we try to go through intermediate state [2] and 
update
+ * permissions on it we'll fail.
+ *
+ * Let's check that bdrv_append() can append such a filter.
+ */
+static void test_append_greedy_filter(void)
+{
+BlockDriverState *top = exclusive_writer_node("top");
+BlockDriverState *base = no_perm_node("base");
+BlockDriverState *fl = exclusive_writer_node("fl1");
+
+bdrv_attach_child(top, base, "backing", &child_of_bds, BDRV_CHILD_COW,
+  &error_abort);
+
+bdrv_append(fl, base, &error_abort);
+bdrv_unref(top);
+}
+
 int main(int argc, char *argv[])
 {
 int i;
@@ -378,6 +409,8 @@ int main(int argc, char *argv[])
 test_parallel_exclusive_write);
 g_test_add_func("/bdrv-graph-mod/parallel-perm-update",
 test_parallel_perm_update);
+g_test_add_func("/bdrv-graph-mod/append-greedy-filter",
+test_append_greedy_filter);
 }
 
 return g_test_run();
-- 
2.30.2




[PULL 00/39] Block layer patches

2021-04-30 Thread Kevin Wolf
The following changes since commit ccdf06c1db192152ac70a1dd974c624f566cb7d4:

  Open 6.1 development tree (2021-04-30 11:15:40 +0100)

are available in the Git repository at:

  git://repo.or.cz/qemu/kevin.git tags/for-upstream

for you to fetch changes up to 68bf7336533faa6aa90fdd4558edddbf5d8ef814:

  vhost-user-blk: Fail gracefully on too large queue size (2021-04-30 12:27:48 
+0200)


Block layer patches

- Fix permission update order problems with block graph changes
- qemu-img convert: Unshare write permission for source
- vhost-user-blk: Fail gracefully on too large queue size


Kevin Wolf (3):
  block: Add BDRV_O_NO_SHARE for blk_new_open()
  qemu-img convert: Unshare write permission for source
  vhost-user-blk: Fail gracefully on too large queue size

Vladimir Sementsov-Ogievskiy (36):
  tests/test-bdrv-graph-mod: add test_parallel_exclusive_write
  tests/test-bdrv-graph-mod: add test_parallel_perm_update
  tests/test-bdrv-graph-mod: add test_append_greedy_filter
  block: bdrv_append(): don't consume reference
  block: BdrvChildClass: add .get_parent_aio_context handler
  block: drop ctx argument from bdrv_root_attach_child
  block: make bdrv_reopen_{prepare,commit,abort} private
  util: add transactions.c
  block: bdrv_refresh_perms: check for parents permissions conflict
  block: refactor bdrv_child* permission functions
  block: rewrite bdrv_child_try_set_perm() using bdrv_refresh_perms()
  block: inline bdrv_child_*() permission functions calls
  block: use topological sort for permission update
  block: add bdrv_drv_set_perm transaction action
  block: add bdrv_list_* permission update functions
  block: add bdrv_replace_child_safe() transaction action
  block: fix bdrv_replace_node_common
  block: add bdrv_attach_child_common() transaction action
  block: add bdrv_attach_child_noperm() transaction action
  block: split out bdrv_replace_node_noperm()
  block: adapt bdrv_append() for inserting filters
  block: add bdrv_remove_filter_or_cow transaction action
  block: introduce bdrv_drop_filter()
  block/backup-top: drop .active
  block: drop ignore_children for permission update functions
  block: make bdrv_unset_inherits_from to be a transaction action
  block: make bdrv_refresh_limits() to be a transaction action
  block: add bdrv_set_backing_noperm() transaction action
  block: bdrv_reopen_multiple(): move bdrv_flush to separate pre-prepare
  block: bdrv_reopen_multiple: refresh permissions on updated graph
  block: drop unused permission update functions
  block: inline bdrv_check_perm_common()
  block: inline bdrv_replace_child()
  block: refactor bdrv_child_set_perm_safe() transaction action
  block: rename bdrv_replace_child_safe() to bdrv_replace_child()
  block: refactor bdrv_node_check_perm()

 include/block/block.h |   14 +-
 include/block/block_int.h |8 +-
 include/qemu/transactions.h   |   63 ++
 block.c   | 1329 -
 block/backup-top.c|   48 +-
 block/block-backend.c |   30 +-
 block/commit.c|1 +
 block/file-posix.c|   91 +--
 block/io.c|   31 +-
 block/mirror.c|3 -
 blockdev.c|4 -
 blockjob.c|   11 +-
 hw/block/vhost-user-blk.c |5 +
 qemu-img.c|2 +-
 tests/unit/test-bdrv-drain.c  |2 +-
 tests/unit/test-bdrv-graph-mod.c  |  209 +-
 util/transactions.c   |   96 +++
 MAINTAINERS   |6 +
 tests/qemu-iotests/245|2 +-
 tests/qemu-iotests/283.out|2 +-
 tests/qemu-iotests/tests/qsd-jobs.out |2 +-
 util/meson.build  |1 +
 22 files changed, 1280 insertions(+), 680 deletions(-)
 create mode 100644 include/qemu/transactions.h
 create mode 100644 util/transactions.c




[PULL 13/39] block: use topological sort for permission update

2021-04-30 Thread Kevin Wolf
From: Vladimir Sementsov-Ogievskiy 

Rewrite bdrv_check_perm(), bdrv_abort_perm_update() and bdrv_set_perm()
to update nodes in topological sort order instead of simple DFS. With
topologically sorted nodes, we update a node only when all its parents
already updated. With DFS it's not so.

Consider the following example:

A -+
|  |
|  v
|  B
|  |
v  |
C<-+

A is parent for B and C, B is parent for C.

Obviously, to update permissions, we should go in order A B C, so, when
we update C, all parent permissions already updated. But with current
approach (simple recursion) we can update in sequence A C B C (C is
updated twice). On first update of C, we consider old B permissions, so
doing wrong thing. If it succeed, all is OK, on second C update we will
finish with correct graph. But if the wrong thing failed, we break the
whole process for no reason (it's possible that updated B permission
will be less strict, but we will never check it).

Also new approach gives a way to simultaneously and correctly update
several nodes, we just need to run bdrv_topological_dfs() several times
to add all nodes and their subtrees into one topologically sorted list
(next patch will update bdrv_replace_node() in this manner).

Test test_parallel_perm_update() is now passing, so move it out of
debugging "if".

We also need to support ignore_children in
bdrv_parent_perms_conflict()

For test 283 order of conflicting parents check is changed.

Note also that in bdrv_check_perm() we don't check for parents conflict
at root bs, as we may be in the middle of permission update in
bdrv_reopen_multiple(). bdrv_reopen_multiple() will be updated soon.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
Message-Id: <20210428151804.439460-14-vsement...@virtuozzo.com>
Signed-off-by: Kevin Wolf 
---
 block.c  | 116 +--
 tests/unit/test-bdrv-graph-mod.c |   4 +-
 tests/qemu-iotests/283.out   |   2 +-
 3 files changed, 99 insertions(+), 23 deletions(-)

diff --git a/block.c b/block.c
index e5af4cdae9..cbcc4c15a0 100644
--- a/block.c
+++ b/block.c
@@ -2054,7 +2054,9 @@ static bool bdrv_a_allow_b(BdrvChild *a, BdrvChild *b, 
Error **errp)
 return false;
 }
 
-static bool bdrv_parent_perms_conflict(BlockDriverState *bs, Error **errp)
+static bool bdrv_parent_perms_conflict(BlockDriverState *bs,
+   GSList *ignore_children,
+   Error **errp)
 {
 BdrvChild *a, *b;
 
@@ -2064,8 +2066,12 @@ static bool bdrv_parent_perms_conflict(BlockDriverState 
*bs, Error **errp)
  * directions.
  */
 QLIST_FOREACH(a, &bs->parents, next_parent) {
+if (g_slist_find(ignore_children, a)) {
+continue;
+}
+
 QLIST_FOREACH(b, &bs->parents, next_parent) {
-if (a == b) {
+if (a == b || g_slist_find(ignore_children, b)) {
 continue;
 }
 
@@ -2094,6 +2100,40 @@ static void bdrv_child_perm(BlockDriverState *bs, 
BlockDriverState *child_bs,
 }
 }
 
+/*
+ * Adds the whole subtree of @bs (including @bs itself) to the @list (except 
for
+ * nodes that are already in the @list, of course) so that final list is
+ * topologically sorted. Return the result (GSList @list object is updated, so
+ * don't use old reference after function call).
+ *
+ * On function start @list must be already topologically sorted and for any 
node
+ * in the @list the whole subtree of the node must be in the @list as well. The
+ * simplest way to satisfy this criteria: use only result of
+ * bdrv_topological_dfs() or NULL as @list parameter.
+ */
+static GSList *bdrv_topological_dfs(GSList *list, GHashTable *found,
+BlockDriverState *bs)
+{
+BdrvChild *child;
+g_autoptr(GHashTable) local_found = NULL;
+
+if (!found) {
+assert(!list);
+found = local_found = g_hash_table_new(NULL, NULL);
+}
+
+if (g_hash_table_contains(found, bs)) {
+return list;
+}
+g_hash_table_add(found, bs);
+
+QLIST_FOREACH(child, &bs->children, next) {
+list = bdrv_topological_dfs(list, found, child->bs);
+}
+
+return g_slist_prepend(list, bs);
+}
+
 static void bdrv_child_set_perm_commit(void *opaque)
 {
 BdrvChild *c = opaque;
@@ -2158,10 +2198,10 @@ static void bdrv_child_set_perm_safe(BdrvChild *c, 
uint64_t perm,
  * A call to this function must always be followed by a call to bdrv_set_perm()
  * or bdrv_abort_perm_update().
  */
-static int bdrv_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
-   uint64_t cumulative_perms,
-   uint64_t cumulative_shared_perms,
-   GSList *ignore_children, Error **errp)
+static int bdrv_node_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
+uint64_t cumulative_perms,
+   

[PULL 12/39] block: inline bdrv_child_*() permission functions calls

2021-04-30 Thread Kevin Wolf
From: Vladimir Sementsov-Ogievskiy 

Each of them has only one caller. Open-coding simplifies further
pemission-update system changes.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Alberto Garcia 
Reviewed-by: Kevin Wolf 
Message-Id: <20210428151804.439460-13-vsement...@virtuozzo.com>
Signed-off-by: Kevin Wolf 
---
 block.c | 59 +
 1 file changed, 17 insertions(+), 42 deletions(-)

diff --git a/block.c b/block.c
index 4511766825..e5af4cdae9 100644
--- a/block.c
+++ b/block.c
@@ -1974,11 +1974,11 @@ static int bdrv_fill_options(QDict **options, const 
char *filename,
 return 0;
 }
 
-static int bdrv_child_check_perm(BdrvChild *c, BlockReopenQueue *q,
- uint64_t perm, uint64_t shared,
- GSList *ignore_children, Error **errp);
-static void bdrv_child_abort_perm_update(BdrvChild *c);
-static void bdrv_child_set_perm(BdrvChild *c);
+static int bdrv_check_update_perm(BlockDriverState *bs, BlockReopenQueue *q,
+  uint64_t new_used_perm,
+  uint64_t new_shared_perm,
+  GSList *ignore_children,
+  Error **errp);
 
 typedef struct BlockReopenQueueEntry {
  bool prepared;
@@ -2226,15 +2226,21 @@ static int bdrv_check_perm(BlockDriverState *bs, 
BlockReopenQueue *q,
 /* Check all children */
 QLIST_FOREACH(c, &bs->children, next) {
 uint64_t cur_perm, cur_shared;
+GSList *cur_ignore_children;
 
 bdrv_child_perm(bs, c->bs, c, c->role, q,
 cumulative_perms, cumulative_shared_perms,
 &cur_perm, &cur_shared);
-ret = bdrv_child_check_perm(c, q, cur_perm, cur_shared, 
ignore_children,
-errp);
+
+cur_ignore_children = g_slist_prepend(g_slist_copy(ignore_children), 
c);
+ret = bdrv_check_update_perm(c->bs, q, cur_perm, cur_shared,
+ cur_ignore_children, errp);
+g_slist_free(cur_ignore_children);
 if (ret < 0) {
 return ret;
 }
+
+bdrv_child_set_perm_safe(c, cur_perm, cur_shared, NULL);
 }
 
 return 0;
@@ -2261,7 +2267,8 @@ static void bdrv_abort_perm_update(BlockDriverState *bs)
 }
 
 QLIST_FOREACH(c, &bs->children, next) {
-bdrv_child_abort_perm_update(c);
+bdrv_child_set_perm_abort(c);
+bdrv_abort_perm_update(c->bs);
 }
 }
 
@@ -2290,7 +2297,8 @@ static void bdrv_set_perm(BlockDriverState *bs)
 
 /* Update all children */
 QLIST_FOREACH(c, &bs->children, next) {
-bdrv_child_set_perm(c);
+bdrv_child_set_perm_commit(c);
+bdrv_set_perm(c->bs);
 }
 }
 
@@ -2398,39 +2406,6 @@ static int bdrv_check_update_perm(BlockDriverState *bs, 
BlockReopenQueue *q,
ignore_children, errp);
 }
 
-/* Needs to be followed by a call to either bdrv_child_set_perm() or
- * bdrv_child_abort_perm_update(). */
-static int bdrv_child_check_perm(BdrvChild *c, BlockReopenQueue *q,
- uint64_t perm, uint64_t shared,
- GSList *ignore_children, Error **errp)
-{
-int ret;
-
-ignore_children = g_slist_prepend(g_slist_copy(ignore_children), c);
-ret = bdrv_check_update_perm(c->bs, q, perm, shared, ignore_children, 
errp);
-g_slist_free(ignore_children);
-
-if (ret < 0) {
-return ret;
-}
-
-bdrv_child_set_perm_safe(c, perm, shared, NULL);
-
-return 0;
-}
-
-static void bdrv_child_set_perm(BdrvChild *c)
-{
-bdrv_child_set_perm_commit(c);
-bdrv_set_perm(c->bs);
-}
-
-static void bdrv_child_abort_perm_update(BdrvChild *c)
-{
-bdrv_child_set_perm_abort(c);
-bdrv_abort_perm_update(c->bs);
-}
-
 static int bdrv_refresh_perms(BlockDriverState *bs, Error **errp)
 {
 int ret;
-- 
2.30.2




[PULL 09/39] block: bdrv_refresh_perms: check for parents permissions conflict

2021-04-30 Thread Kevin Wolf
From: Vladimir Sementsov-Ogievskiy 

Add additional check that node parents do not interfere with each
other. This should not hurt existing callers and allows in further
patch use bdrv_refresh_perms() to update a subtree of changed
BdrvChild (check that change is correct).

New check will substitute bdrv_check_update_perm() in following
permissions refactoring, so keep error messages the same to avoid
unit test result changes.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Alberto Garcia 
Reviewed-by: Kevin Wolf 
Message-Id: <20210428151804.439460-10-vsement...@virtuozzo.com>
Signed-off-by: Kevin Wolf 
---
 block.c | 63 -
 1 file changed, 54 insertions(+), 9 deletions(-)

diff --git a/block.c b/block.c
index 34c728d7e4..fd621f0403 100644
--- a/block.c
+++ b/block.c
@@ -2026,6 +2026,57 @@ bool bdrv_is_writable(BlockDriverState *bs)
 return bdrv_is_writable_after_reopen(bs, NULL);
 }
 
+static char *bdrv_child_user_desc(BdrvChild *c)
+{
+if (c->klass->get_parent_desc) {
+return c->klass->get_parent_desc(c);
+}
+
+return g_strdup("another user");
+}
+
+static bool bdrv_a_allow_b(BdrvChild *a, BdrvChild *b, Error **errp)
+{
+g_autofree char *user = NULL;
+g_autofree char *perm_names = NULL;
+
+if ((b->perm & a->shared_perm) == b->perm) {
+return true;
+}
+
+perm_names = bdrv_perm_names(b->perm & ~a->shared_perm);
+user = bdrv_child_user_desc(a);
+error_setg(errp, "Conflicts with use by %s as '%s', which does not "
+   "allow '%s' on %s",
+   user, a->name, perm_names, bdrv_get_node_name(b->bs));
+
+return false;
+}
+
+static bool bdrv_parent_perms_conflict(BlockDriverState *bs, Error **errp)
+{
+BdrvChild *a, *b;
+
+/*
+ * During the loop we'll look at each pair twice. That's correct because
+ * bdrv_a_allow_b() is asymmetric and we should check each pair in both
+ * directions.
+ */
+QLIST_FOREACH(a, &bs->parents, next_parent) {
+QLIST_FOREACH(b, &bs->parents, next_parent) {
+if (a == b) {
+continue;
+}
+
+if (!bdrv_a_allow_b(a, b, errp)) {
+return true;
+}
+}
+}
+
+return false;
+}
+
 static void bdrv_child_perm(BlockDriverState *bs, BlockDriverState *child_bs,
 BdrvChild *c, BdrvChildRole role,
 BlockReopenQueue *reopen_queue,
@@ -2203,15 +2254,6 @@ void bdrv_get_cumulative_perm(BlockDriverState *bs, 
uint64_t *perm,
 *shared_perm = cumulative_shared_perms;
 }
 
-static char *bdrv_child_user_desc(BdrvChild *c)
-{
-if (c->klass->get_parent_desc) {
-return c->klass->get_parent_desc(c);
-}
-
-return g_strdup("another user");
-}
-
 char *bdrv_perm_names(uint64_t perm)
 {
 struct perm_name {
@@ -2355,6 +2397,9 @@ static int bdrv_refresh_perms(BlockDriverState *bs, Error 
**errp)
 int ret;
 uint64_t perm, shared_perm;
 
+if (bdrv_parent_perms_conflict(bs, errp)) {
+return -EPERM;
+}
 bdrv_get_cumulative_perm(bs, &perm, &shared_perm);
 ret = bdrv_check_perm(bs, NULL, perm, shared_perm, NULL, errp);
 if (ret < 0) {
-- 
2.30.2




[PULL 02/39] tests/test-bdrv-graph-mod: add test_parallel_perm_update

2021-04-30 Thread Kevin Wolf
From: Vladimir Sementsov-Ogievskiy 

Add test to show that simple DFS recursion order is not correct for
permission update. Correct order is topological-sort order, which will
be introduced later.

Consider the block driver which has two filter children: one active
with exclusive write access and one inactive with no specific
permissions.

And, these two children has a common base child, like this:

┌─┐ ┌──┐
│ fl2 │ ◀── │ top  │
└─┘ └──┘
  │   │
  │   │ w
  │   ▼
  │ ┌──┐
  │ │ fl1  │
  │ └──┘
  │   │
  │   │ w
  │   ▼
  │ ┌──┐
  └───▶ │ base │
└──┘

So, exclusive write is propagated.

Assume, we want to make fl2 active instead of fl1.
So, we set some option for top driver and do permission update.

If permission update (remember, it's DFS) goes first through
top->fl1->base branch it will succeed: it firstly drop exclusive write
permissions and than apply them for another BdrvChildren.
But if permission update goes first through top->fl2->base branch it
will fail, as when we try to update fl2->base child, old not yet
updated fl1->base child will be in conflict.

Now test fails, so it runs only with -d flag. To run do

  ./test-bdrv-graph-mod -d -p /bdrv-graph-mod/parallel-perm-update

from /tests.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
Message-Id: <20210428151804.439460-3-vsement...@virtuozzo.com>
Signed-off-by: Kevin Wolf 
---
 tests/unit/test-bdrv-graph-mod.c | 116 +++
 1 file changed, 116 insertions(+)

diff --git a/tests/unit/test-bdrv-graph-mod.c b/tests/unit/test-bdrv-graph-mod.c
index 80a9a20066..a8219b131e 100644
--- a/tests/unit/test-bdrv-graph-mod.c
+++ b/tests/unit/test-bdrv-graph-mod.c
@@ -238,6 +238,120 @@ static void test_parallel_exclusive_write(void)
 bdrv_unref(top);
 }
 
+static void write_to_file_perms(BlockDriverState *bs, BdrvChild *c,
+ BdrvChildRole role,
+ BlockReopenQueue *reopen_queue,
+ uint64_t perm, uint64_t shared,
+ uint64_t *nperm, uint64_t *nshared)
+{
+if (bs->file && c == bs->file) {
+*nperm = BLK_PERM_WRITE;
+*nshared = BLK_PERM_ALL & ~BLK_PERM_WRITE;
+} else {
+*nperm = 0;
+*nshared = BLK_PERM_ALL;
+}
+}
+
+static BlockDriver bdrv_write_to_file = {
+.format_name = "tricky-perm",
+.bdrv_child_perm = write_to_file_perms,
+};
+
+
+/*
+ * The following test shows that topological-sort order is required for
+ * permission update, simple DFS is not enough.
+ *
+ * Consider the block driver which has two filter children: one active
+ * with exclusive write access and one inactive with no specific
+ * permissions.
+ *
+ * And, these two children has a common base child, like this:
+ *
+ * ┌─┐ ┌──┐
+ * │ fl2 │ ◀── │ top  │
+ * └─┘ └──┘
+ *   │   │
+ *   │   │ w
+ *   │   ▼
+ *   │ ┌──┐
+ *   │ │ fl1  │
+ *   │ └──┘
+ *   │   │
+ *   │   │ w
+ *   │   ▼
+ *   │ ┌──┐
+ *   └───▶ │ base │
+ * └──┘
+ *
+ * So, exclusive write is propagated.
+ *
+ * Assume, we want to make fl2 active instead of fl1.
+ * So, we set some option for top driver and do permission update.
+ *
+ * With simple DFS, if permission update goes first through
+ * top->fl1->base branch it will succeed: it firstly drop exclusive write
+ * permissions and than apply them for another BdrvChildren.
+ * But if permission update goes first through top->fl2->base branch it
+ * will fail, as when we try to update fl2->base child, old not yet
+ * updated fl1->base child will be in conflict.
+ *
+ * With topological-sort order we always update parents before children, so fl1
+ * and fl2 are both updated when we update base and there is no conflict.
+ */
+static void test_parallel_perm_update(void)
+{
+BlockDriverState *top = no_perm_node("top");
+BlockDriverState *tricky =
+bdrv_new_open_driver(&bdrv_write_to_file, "tricky", BDRV_O_RDWR,
+ &error_abort);
+BlockDriverState *base = no_perm_node("base");
+BlockDriverState *fl1 = pass_through_node("fl1");
+BlockDriverState *fl2 = pass_through_node("fl2");
+BdrvChild *c_fl1, *c_fl2;
+
+/*
+ * bdrv_attach_child() eats child bs reference, so we need two @base
+ * references for two filters:
+ */
+bdrv_ref(base);
+
+bdrv_attach_child(top, tricky, "file", &child_of_bds, BDRV_CHILD_DATA,
+  &error_abort);
+c_fl1 = bdrv_attach_child(tricky, fl1, "first", &child_of_bds,
+  BDRV_CHILD_FILTERED, &error_abort);
+c_fl2 = bdrv_attach_child(tricky, fl2, "second", &child_of_bds,
+  BDRV_CHILD_FILTE

[PULL 04/39] block: bdrv_append(): don't consume reference

2021-04-30 Thread Kevin Wolf
From: Vladimir Sementsov-Ogievskiy 

We have too much comments for this feature. It seems better just don't
do it. Most of real users (tests don't count) have to create additional
reference.

Drop also comment in external_snapshot_prepare:
 - bdrv_append doesn't "remove" old bs in common sense, it sounds
   strange
 - the fact that bdrv_append can fail is obvious from the context
 - the fact that we must rollback all changes in transaction abort is
   known (it's the direct role of abort)

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
Message-Id: <20210428151804.439460-5-vsement...@virtuozzo.com>
Signed-off-by: Kevin Wolf 
---
 block.c  | 25 +++--
 block/backup-top.c   |  1 -
 block/commit.c   |  1 +
 block/mirror.c   |  3 ---
 blockdev.c   |  4 
 tests/unit/test-bdrv-drain.c |  2 +-
 tests/unit/test-bdrv-graph-mod.c |  3 +++
 7 files changed, 8 insertions(+), 31 deletions(-)

diff --git a/block.c b/block.c
index c5b887cec1..1e7e8907e4 100644
--- a/block.c
+++ b/block.c
@@ -3213,11 +3213,6 @@ static BlockDriverState 
*bdrv_append_temp_snapshot(BlockDriverState *bs,
 goto out;
 }
 
-/* bdrv_append() consumes a strong reference to bs_snapshot
- * (i.e. it will call bdrv_unref() on it) even on error, so in
- * order to be able to return one, we have to increase
- * bs_snapshot's refcount here */
-bdrv_ref(bs_snapshot);
 ret = bdrv_append(bs_snapshot, bs, errp);
 if (ret < 0) {
 bs_snapshot = NULL;
@@ -4679,36 +4674,22 @@ int bdrv_replace_node(BlockDriverState *from, 
BlockDriverState *to,
  * bs_new must not be attached to a BlockBackend.
  *
  * This function does not create any image files.
- *
- * bdrv_append() takes ownership of a bs_new reference and unrefs it because
- * that's what the callers commonly need. bs_new will be referenced by the old
- * parents of bs_top after bdrv_append() returns. If the caller needs to keep a
- * reference of its own, it must call bdrv_ref().
  */
 int bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top,
 Error **errp)
 {
 int ret = bdrv_set_backing_hd(bs_new, bs_top, errp);
 if (ret < 0) {
-goto out;
+return ret;
 }
 
 ret = bdrv_replace_node(bs_top, bs_new, errp);
 if (ret < 0) {
 bdrv_set_backing_hd(bs_new, NULL, &error_abort);
-goto out;
+return ret;
 }
 
-ret = 0;
-
-out:
-/*
- * bs_new is now referenced by its new parents, we don't need the
- * additional reference any more.
- */
-bdrv_unref(bs_new);
-
-return ret;
+return 0;
 }
 
 static void bdrv_delete(BlockDriverState *bs)
diff --git a/block/backup-top.c b/block/backup-top.c
index 589e8b651d..62d09f213e 100644
--- a/block/backup-top.c
+++ b/block/backup-top.c
@@ -234,7 +234,6 @@ BlockDriverState *bdrv_backup_top_append(BlockDriverState 
*source,
 
 bdrv_drained_begin(source);
 
-bdrv_ref(top);
 ret = bdrv_append(top, source, errp);
 if (ret < 0) {
 error_prepend(errp, "Cannot append backup-top filter: ");
diff --git a/block/commit.c b/block/commit.c
index dd9ba87349..b89bb20b75 100644
--- a/block/commit.c
+++ b/block/commit.c
@@ -312,6 +312,7 @@ void commit_start(const char *job_id, BlockDriverState *bs,
 commit_top_bs->total_sectors = top->total_sectors;
 
 ret = bdrv_append(commit_top_bs, top, errp);
+bdrv_unref(commit_top_bs); /* referenced by new parents or failed */
 if (ret < 0) {
 commit_top_bs = NULL;
 goto fail;
diff --git a/block/mirror.c b/block/mirror.c
index 5a71bd8bbc..840b8e8c15 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -1630,9 +1630,6 @@ static BlockJob *mirror_start_job(
 
 bs_opaque->is_commit = target_is_backing;
 
-/* bdrv_append takes ownership of the mirror_top_bs reference, need to keep
- * it alive until block_job_create() succeeds even if bs has no parent. */
-bdrv_ref(mirror_top_bs);
 bdrv_drained_begin(bs);
 ret = bdrv_append(mirror_top_bs, bs, errp);
 bdrv_drained_end(bs);
diff --git a/blockdev.c b/blockdev.c
index a57590aae4..834c2304a1 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1576,10 +1576,6 @@ static void external_snapshot_prepare(BlkActionState 
*common,
 goto out;
 }
 
-/* This removes our old bs and adds the new bs. This is an operation that
- * can fail, so we need to do it in .prepare; undoing it for abort is
- * always possible. */
-bdrv_ref(state->new_bs);
 ret = bdrv_append(state->new_bs, state->old_bs, errp);
 if (ret < 0) {
 goto out;
diff --git a/tests/unit/test-bdrv-drain.c b/tests/unit/test-bdrv-drain.c
index 8a29e33e00..892f7f47d8 100644
--- a/tests/unit/test-bdrv-drain.c
+++ b/tests/unit/test-bdrv-drain.c
@@ -1478,7 +1478,6 @@ static void test_append_to_drained(void)
 g_assert_cmpint(base_s->drain_count, ==, 1);
 

[PULL 07/39] block: make bdrv_reopen_{prepare,commit,abort} private

2021-04-30 Thread Kevin Wolf
From: Vladimir Sementsov-Ogievskiy 

These functions are called only from bdrv_reopen_multiple() in block.c.
No reason to publish them.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Alberto Garcia 
Reviewed-by: Kevin Wolf 
Message-Id: <20210428151804.439460-8-vsement...@virtuozzo.com>
Signed-off-by: Kevin Wolf 
---
 include/block/block.h |  4 
 block.c   | 13 +
 2 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index 54279baa95..16e496a5c4 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -387,10 +387,6 @@ BlockReopenQueue *bdrv_reopen_queue(BlockReopenQueue 
*bs_queue,
 int bdrv_reopen_multiple(BlockReopenQueue *bs_queue, Error **errp);
 int bdrv_reopen_set_read_only(BlockDriverState *bs, bool read_only,
   Error **errp);
-int bdrv_reopen_prepare(BDRVReopenState *reopen_state,
-BlockReopenQueue *queue, Error **errp);
-void bdrv_reopen_commit(BDRVReopenState *reopen_state);
-void bdrv_reopen_abort(BDRVReopenState *reopen_state);
 int bdrv_pwrite_zeroes(BdrvChild *child, int64_t offset,
int64_t bytes, BdrvRequestFlags flags);
 int bdrv_make_zero(BdrvChild *child, BdrvRequestFlags flags);
diff --git a/block.c b/block.c
index 54436c951e..34c728d7e4 100644
--- a/block.c
+++ b/block.c
@@ -82,6 +82,11 @@ static BlockDriverState *bdrv_open_inherit(const char 
*filename,
BdrvChildRole child_role,
Error **errp);
 
+static int bdrv_reopen_prepare(BDRVReopenState *reopen_state, BlockReopenQueue
+   *queue, Error **errp);
+static void bdrv_reopen_commit(BDRVReopenState *reopen_state);
+static void bdrv_reopen_abort(BDRVReopenState *reopen_state);
+
 /* If non-zero, use only whitelisted block drivers */
 static int use_bdrv_whitelist;
 
@@ -4153,8 +4158,8 @@ static int bdrv_reopen_parse_backing(BDRVReopenState 
*reopen_state,
  * commit() for any other BDS that have been left in a prepare() state
  *
  */
-int bdrv_reopen_prepare(BDRVReopenState *reopen_state, BlockReopenQueue *queue,
-Error **errp)
+static int bdrv_reopen_prepare(BDRVReopenState *reopen_state,
+   BlockReopenQueue *queue, Error **errp)
 {
 int ret = -1;
 int old_flags;
@@ -4369,7 +4374,7 @@ error:
  * makes them final by swapping the staging BlockDriverState contents into
  * the active BlockDriverState contents.
  */
-void bdrv_reopen_commit(BDRVReopenState *reopen_state)
+static void bdrv_reopen_commit(BDRVReopenState *reopen_state)
 {
 BlockDriver *drv;
 BlockDriverState *bs;
@@ -4429,7 +4434,7 @@ void bdrv_reopen_commit(BDRVReopenState *reopen_state)
  * Abort the reopen, and delete and free the staged changes in
  * reopen_state
  */
-void bdrv_reopen_abort(BDRVReopenState *reopen_state)
+static void bdrv_reopen_abort(BDRVReopenState *reopen_state)
 {
 BlockDriver *drv;
 
-- 
2.30.2




[PULL 06/39] block: drop ctx argument from bdrv_root_attach_child

2021-04-30 Thread Kevin Wolf
From: Vladimir Sementsov-Ogievskiy 

Passing parent aio context is redundant, as child_class and parent
opaque pointer are enough to retrieve it. Drop the argument and use new
bdrv_child_get_parent_aio_context() interface.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Alberto Garcia 
Reviewed-by: Kevin Wolf 
Message-Id: <20210428151804.439460-7-vsement...@virtuozzo.com>
Signed-off-by: Kevin Wolf 
---
 include/block/block_int.h | 1 -
 block.c   | 8 +---
 block/block-backend.c | 4 ++--
 blockjob.c| 3 +--
 4 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index 737ec632c4..dd2de6bd1d 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -1308,7 +1308,6 @@ BdrvChild *bdrv_root_attach_child(BlockDriverState 
*child_bs,
   const char *child_name,
   const BdrvChildClass *child_class,
   BdrvChildRole child_role,
-  AioContext *ctx,
   uint64_t perm, uint64_t shared_perm,
   void *opaque, Error **errp);
 void bdrv_root_unref_child(BdrvChild *child);
diff --git a/block.c b/block.c
index 2833912436..54436c951e 100644
--- a/block.c
+++ b/block.c
@@ -2700,13 +2700,13 @@ BdrvChild *bdrv_root_attach_child(BlockDriverState 
*child_bs,
   const char *child_name,
   const BdrvChildClass *child_class,
   BdrvChildRole child_role,
-  AioContext *ctx,
   uint64_t perm, uint64_t shared_perm,
   void *opaque, Error **errp)
 {
 BdrvChild *child;
 Error *local_err = NULL;
 int ret;
+AioContext *ctx;
 
 ret = bdrv_check_update_perm(child_bs, NULL, perm, shared_perm, NULL, 
errp);
 if (ret < 0) {
@@ -2726,6 +2726,8 @@ BdrvChild *bdrv_root_attach_child(BlockDriverState 
*child_bs,
 .opaque = opaque,
 };
 
+ctx = bdrv_child_get_parent_aio_context(child);
+
 /* If the AioContexts don't match, first try to move the subtree of
  * child_bs into the AioContext of the new parent. If this doesn't work,
  * try moving the parent into the AioContext of child_bs instead. */
@@ -2786,8 +2788,8 @@ BdrvChild *bdrv_attach_child(BlockDriverState *parent_bs,
 perm, shared_perm, &perm, &shared_perm);
 
 child = bdrv_root_attach_child(child_bs, child_name, child_class,
-   child_role, bdrv_get_aio_context(parent_bs),
-   perm, shared_perm, parent_bs, errp);
+   child_role, perm, shared_perm, parent_bs,
+   errp);
 if (child == NULL) {
 return NULL;
 }
diff --git a/block/block-backend.c b/block/block-backend.c
index 3f656ef361..e4892fd6a5 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -435,7 +435,7 @@ BlockBackend *blk_new_open(const char *filename, const char 
*reference,
 
 blk->root = bdrv_root_attach_child(bs, "root", &child_root,
BDRV_CHILD_FILTERED | 
BDRV_CHILD_PRIMARY,
-   blk->ctx, perm, BLK_PERM_ALL, blk, 
errp);
+   perm, BLK_PERM_ALL, blk, errp);
 if (!blk->root) {
 blk_unref(blk);
 return NULL;
@@ -849,7 +849,7 @@ int blk_insert_bs(BlockBackend *blk, BlockDriverState *bs, 
Error **errp)
 bdrv_ref(bs);
 blk->root = bdrv_root_attach_child(bs, "root", &child_root,
BDRV_CHILD_FILTERED | 
BDRV_CHILD_PRIMARY,
-   blk->ctx, blk->perm, blk->shared_perm,
+   blk->perm, blk->shared_perm,
blk, errp);
 if (blk->root == NULL) {
 return -EPERM;
diff --git a/blockjob.c b/blockjob.c
index 160bf38b19..2fe1d788ba 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -229,8 +229,7 @@ int block_job_add_bdrv(BlockJob *job, const char *name, 
BlockDriverState *bs,
 if (need_context_ops && job->job.aio_context != qemu_get_aio_context()) {
 aio_context_release(job->job.aio_context);
 }
-c = bdrv_root_attach_child(bs, name, &child_job, 0,
-   job->job.aio_context, perm, shared_perm, job,
+c = bdrv_root_attach_child(bs, name, &child_job, 0, perm, shared_perm, job,
errp);
 if (need_context_ops && job->job.aio_context != qemu_get_aio_context()) {
 aio_context_acquire(job->job.aio_context);
-- 
2.30.2




[PULL 05/39] block: BdrvChildClass: add .get_parent_aio_context handler

2021-04-30 Thread Kevin Wolf
From: Vladimir Sementsov-Ogievskiy 

Add new handler to get aio context and implement it in all child
classes. Add corresponding public interface to be used soon.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Alberto Garcia 
Reviewed-by: Kevin Wolf 
Message-Id: <20210428151804.439460-6-vsement...@virtuozzo.com>
Signed-off-by: Kevin Wolf 
---
 include/block/block.h |  2 ++
 include/block/block_int.h |  2 ++
 block.c   | 13 +
 block/block-backend.c |  9 +
 blockjob.c|  8 
 5 files changed, 34 insertions(+)

diff --git a/include/block/block.h b/include/block/block.h
index b3f6e509d4..54279baa95 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -702,6 +702,8 @@ bool bdrv_child_can_set_aio_context(BdrvChild *c, 
AioContext *ctx,
 GSList **ignore, Error **errp);
 bool bdrv_can_set_aio_context(BlockDriverState *bs, AioContext *ctx,
   GSList **ignore, Error **errp);
+AioContext *bdrv_child_get_parent_aio_context(BdrvChild *c);
+
 int bdrv_probe_blocksizes(BlockDriverState *bs, BlockSizes *bsz);
 int bdrv_probe_geometry(BlockDriverState *bs, HDGeometry *geo);
 
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 88e4111939..737ec632c4 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -789,6 +789,8 @@ struct BdrvChildClass {
 bool (*can_set_aio_ctx)(BdrvChild *child, AioContext *ctx,
 GSList **ignore, Error **errp);
 void (*set_aio_ctx)(BdrvChild *child, AioContext *ctx, GSList **ignore);
+
+AioContext *(*get_parent_aio_context)(BdrvChild *child);
 };
 
 extern const BdrvChildClass child_of_bds;
diff --git a/block.c b/block.c
index 1e7e8907e4..2833912436 100644
--- a/block.c
+++ b/block.c
@@ -1394,6 +1394,13 @@ static int bdrv_child_cb_update_filename(BdrvChild *c, 
BlockDriverState *base,
 return 0;
 }
 
+static AioContext *bdrv_child_cb_get_parent_aio_context(BdrvChild *c)
+{
+BlockDriverState *bs = c->opaque;
+
+return bdrv_get_aio_context(bs);
+}
+
 const BdrvChildClass child_of_bds = {
 .parent_is_bds   = true,
 .get_parent_desc = bdrv_child_get_parent_desc,
@@ -1407,8 +1414,14 @@ const BdrvChildClass child_of_bds = {
 .can_set_aio_ctx = bdrv_child_cb_can_set_aio_ctx,
 .set_aio_ctx = bdrv_child_cb_set_aio_ctx,
 .update_filename = bdrv_child_cb_update_filename,
+.get_parent_aio_context = bdrv_child_cb_get_parent_aio_context,
 };
 
+AioContext *bdrv_child_get_parent_aio_context(BdrvChild *c)
+{
+return c->klass->get_parent_aio_context(c);
+}
+
 static int bdrv_open_flags(BlockDriverState *bs, int flags)
 {
 int open_flags = flags;
diff --git a/block/block-backend.c b/block/block-backend.c
index 413af51f3b..3f656ef361 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -298,6 +298,13 @@ static void blk_root_detach(BdrvChild *child)
 }
 }
 
+static AioContext *blk_root_get_parent_aio_context(BdrvChild *c)
+{
+BlockBackend *blk = c->opaque;
+
+return blk_get_aio_context(blk);
+}
+
 static const BdrvChildClass child_root = {
 .inherit_options= blk_root_inherit_options,
 
@@ -318,6 +325,8 @@ static const BdrvChildClass child_root = {
 
 .can_set_aio_ctx= blk_root_can_set_aio_ctx,
 .set_aio_ctx= blk_root_set_aio_ctx,
+
+.get_parent_aio_context = blk_root_get_parent_aio_context,
 };
 
 /*
diff --git a/blockjob.c b/blockjob.c
index 207e8c7fd9..160bf38b19 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -163,6 +163,13 @@ static void child_job_set_aio_ctx(BdrvChild *c, AioContext 
*ctx,
 job->job.aio_context = ctx;
 }
 
+static AioContext *child_job_get_parent_aio_context(BdrvChild *c)
+{
+BlockJob *job = c->opaque;
+
+return job->job.aio_context;
+}
+
 static const BdrvChildClass child_job = {
 .get_parent_desc= child_job_get_parent_desc,
 .drained_begin  = child_job_drained_begin,
@@ -171,6 +178,7 @@ static const BdrvChildClass child_job = {
 .can_set_aio_ctx= child_job_can_set_aio_ctx,
 .set_aio_ctx= child_job_set_aio_ctx,
 .stay_at_node   = true,
+.get_parent_aio_context = child_job_get_parent_aio_context,
 };
 
 void block_job_remove_all_bdrv(BlockJob *job)
-- 
2.30.2




[PULL 01/39] tests/test-bdrv-graph-mod: add test_parallel_exclusive_write

2021-04-30 Thread Kevin Wolf
From: Vladimir Sementsov-Ogievskiy 

Add the test that shows that concept of ignore_children is incomplete.
Actually, when we want to update something, ignoring permission of some
existing BdrvChild, we should ignore also the propagated effect of this
child to the other children. But that's not done. Better approach
(update permissions on already updated graph) will be implemented
later.

Now the test fails, so it's added with -d argument to not break make
check.

Test fails with

 "Conflicts with use by fl1 as 'backing', which does not allow 'write' on base"

because when updating permissions we can ignore original top->fl1
BdrvChild. But we don't ignore exclusive write permission in fl1->base
BdrvChild, which is propagated. Correct thing to do is make graph
change first and then do permission update from the top node.

To run test do

  ./test-bdrv-graph-mod -d -p /bdrv-graph-mod/parallel-exclusive-write

from /tests.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
Message-Id: <20210428151804.439460-2-vsement...@virtuozzo.com>
Signed-off-by: Kevin Wolf 
---
 tests/unit/test-bdrv-graph-mod.c | 70 +++-
 1 file changed, 69 insertions(+), 1 deletion(-)

diff --git a/tests/unit/test-bdrv-graph-mod.c b/tests/unit/test-bdrv-graph-mod.c
index c4f7d16039..80a9a20066 100644
--- a/tests/unit/test-bdrv-graph-mod.c
+++ b/tests/unit/test-bdrv-graph-mod.c
@@ -1,7 +1,7 @@
 /*
  * Block node graph modifications tests
  *
- * Copyright (c) 2019 Virtuozzo International GmbH. All rights reserved.
+ * Copyright (c) 2019-2021 Virtuozzo International GmbH. All rights reserved.
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License as published by
@@ -44,6 +44,21 @@ static BlockDriver bdrv_no_perm = {
 .bdrv_child_perm = no_perm_default_perms,
 };
 
+static void exclusive_write_perms(BlockDriverState *bs, BdrvChild *c,
+  BdrvChildRole role,
+  BlockReopenQueue *reopen_queue,
+  uint64_t perm, uint64_t shared,
+  uint64_t *nperm, uint64_t *nshared)
+{
+*nperm = BLK_PERM_WRITE;
+*nshared = BLK_PERM_ALL & ~BLK_PERM_WRITE;
+}
+
+static BlockDriver bdrv_exclusive_writer = {
+.format_name = "exclusive-writer",
+.bdrv_child_perm = exclusive_write_perms,
+};
+
 static BlockDriverState *no_perm_node(const char *name)
 {
 return bdrv_new_open_driver(&bdrv_no_perm, name, BDRV_O_RDWR, 
&error_abort);
@@ -55,6 +70,12 @@ static BlockDriverState *pass_through_node(const char *name)
 BDRV_O_RDWR, &error_abort);
 }
 
+static BlockDriverState *exclusive_writer_node(const char *name)
+{
+return bdrv_new_open_driver(&bdrv_exclusive_writer, name,
+BDRV_O_RDWR, &error_abort);
+}
+
 /*
  * test_update_perm_tree
  *
@@ -185,8 +206,50 @@ static void test_should_update_child(void)
 blk_unref(root);
 }
 
+/*
+ * test_parallel_exclusive_write
+ *
+ * Check that when we replace node, old permissions of the node being removed
+ * doesn't break the replacement.
+ */
+static void test_parallel_exclusive_write(void)
+{
+BlockDriverState *top = exclusive_writer_node("top");
+BlockDriverState *base = no_perm_node("base");
+BlockDriverState *fl1 = pass_through_node("fl1");
+BlockDriverState *fl2 = pass_through_node("fl2");
+
+/*
+ * bdrv_attach_child() eats child bs reference, so we need two @base
+ * references for two filters:
+ */
+bdrv_ref(base);
+
+bdrv_attach_child(top, fl1, "backing", &child_of_bds, BDRV_CHILD_DATA,
+  &error_abort);
+bdrv_attach_child(fl1, base, "backing", &child_of_bds, BDRV_CHILD_FILTERED,
+  &error_abort);
+bdrv_attach_child(fl2, base, "backing", &child_of_bds, BDRV_CHILD_FILTERED,
+  &error_abort);
+
+bdrv_replace_node(fl1, fl2, &error_abort);
+
+bdrv_unref(fl2);
+bdrv_unref(top);
+}
+
 int main(int argc, char *argv[])
 {
+int i;
+bool debug = false;
+
+for (i = 1; i < argc; i++) {
+if (!strcmp(argv[i], "-d")) {
+debug = true;
+break;
+}
+}
+
 bdrv_init();
 qemu_init_main_loop(&error_abort);
 
@@ -196,5 +259,10 @@ int main(int argc, char *argv[])
 g_test_add_func("/bdrv-graph-mod/should-update-child",
 test_should_update_child);
 
+if (debug) {
+g_test_add_func("/bdrv-graph-mod/parallel-exclusive-write",
+test_parallel_exclusive_write);
+}
+
 return g_test_run();
 }
-- 
2.30.2




Re: [PATCH] block: simplify write-threshold and drop write notifiers

2021-04-30 Thread Max Reitz

On 22.04.21 00:09, Vladimir Sementsov-Ogievskiy wrote:

write-notifiers are used only for write-threshold. New code for such
purpose should create filters.

Let's handle write-threshold simply in generic code and drop write
notifiers at all.

Also move part of write-threshold API that is used only for testing to
the test.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---

I agree that this could be split into 2-3 parts and not combining
everything into one. But I'm tired now.


Er...  You should have put it off until the next day then? O:)

It should be multiple patches.  At least one to move the write threshold 
update to block/io.c, and then another to drop write notifiers.



I can send v2 if needed, so
consider it as RFC. Or take as is if you think it's not too much-in-one.

I also suggest this as a prepartion (and partly substitution) for
"[PATCH v2 0/8] Block layer thread-safety, continued"

  include/block/block_int.h | 12 -
  include/block/write-threshold.h   | 24 -
  block.c   |  1 -
  block/io.c| 21 +---
  block/write-threshold.c   | 87 ++-
  tests/unit/test-write-threshold.c | 38 ++
  6 files changed, 54 insertions(+), 129 deletions(-)


[...]


diff --git a/block/io.c b/block/io.c
index ca2dca3007..e0aa775952 100644
--- a/block/io.c
+++ b/block/io.c
@@ -36,6 +36,8 @@
  #include "qemu/main-loop.h"
  #include "sysemu/replay.h"
  
+#include "qapi/qapi-events-block-core.h"

+
  /* Maximum bounce buffer for copy-on-read and write zeroes, in bytes */
  #define MAX_BOUNCE_BUFFER (32768 << BDRV_SECTOR_BITS)
  
@@ -1974,6 +1976,8 @@ bdrv_co_write_req_prepare(BdrvChild *child, int64_t offset, int64_t bytes,

 child->perm & BLK_PERM_RESIZE);
  
  switch (req->type) {

+uint64_t write_threshold;
+


Works, but I think this is the first time I see a variable declared in a 
switch block.  What I usually do for such cases is to put a block after 
the label.  (i.e. case X: { uint64_t write_threshold; ... })


But it wouldn’t hurt to just declare this at the beginning of 
bdrv_co_write_req_prepare(), I think.



  case BDRV_TRACKED_WRITE:
  case BDRV_TRACKED_DISCARD:
  if (flags & BDRV_REQ_WRITE_UNCHANGED) {
@@ -1981,8 +1985,15 @@ bdrv_co_write_req_prepare(BdrvChild *child, int64_t 
offset, int64_t bytes,
  } else {
  assert(child->perm & BLK_PERM_WRITE);
  }
-return notifier_with_return_list_notify(&bs->before_write_notifiers,
-req);
+write_threshold = qatomic_read(&bs->write_threshold_offset);
+if (write_threshold > 0 && offset + bytes > write_threshold) {
+qapi_event_send_block_write_threshold(
+bs->node_name,
+offset + bytes - write_threshold,
+write_threshold);
+qatomic_set(&bs->write_threshold_offset, 0);
+}


I’d put all of this into a function in block/write-threshold.c that’s 
called from here.


Max


+return 0;
  case BDRV_TRACKED_TRUNCATE:
  assert(child->perm & BLK_PERM_RESIZE);
  return 0;
@@ -3137,12 +3148,6 @@ bool bdrv_qiov_is_aligned(BlockDriverState *bs, 
QEMUIOVector *qiov)
  return true;
  }
  
-void bdrv_add_before_write_notifier(BlockDriverState *bs,

-NotifierWithReturn *notifier)
-{
-notifier_with_return_list_add(&bs->before_write_notifiers, notifier);
-}
-
  void bdrv_io_plug(BlockDriverState *bs)
  {
  BdrvChild *child;





Re: [PATCH v2] vhost-user-blk: Fail gracefully on too large queue size

2021-04-30 Thread Kevin Wolf
Am 13.04.2021 um 18:56 hat Kevin Wolf geschrieben:
> virtio_add_queue() aborts when queue_size > VIRTQUEUE_MAX_SIZE, so
> vhost_user_blk_device_realize() should check this before calling it.
> 
> Simple reproducer:
> 
> qemu-system-x86_64 \
> -chardev null,id=foo \
> -device vhost-user-blk-pci,queue-size=4096,chardev=foo
> 
> Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1935014
> Signed-off-by: Kevin Wolf 

Thanks for the review, applied to the block branch.

Kevin




Re: [PATCH v2] monitor: hmp_qemu_io: acquire aio contex, fix crash

2021-04-30 Thread Max Reitz

On 23.04.21 15:42, Vladimir Sementsov-Ogievskiy wrote:

Max reported the following bug:

$ ./qemu-img create -f raw src.img 1G
$ ./qemu-img create -f raw dst.img 1G

$ (echo '
{"execute":"qmp_capabilities"}
{"execute":"blockdev-mirror",
 "arguments":{"job-id":"mirror",
  "device":"source",
  "target":"target",
  "sync":"full",
  "filter-node-name":"mirror-top"}}
'; sleep 3; echo'
{"execute":"human-monitor-command",
 "arguments":{"command-line":
  "qemu-io mirror-top \"write 0 1G\""}}') \
| x86_64-softmmu/qemu-system-x86_64 \
-qmp stdio \
-blockdev file,node-name=source,filename=src.img \
-blockdev file,node-name=target,filename=dst.img \
-object iothread,id=iothr0 \
-device virtio-blk,drive=source,iothread=iothr0

crashes:

0  raise () at /usr/lib/libc.so.6
1  abort () at /usr/lib/libc.so.6
2  error_exit
(err=,
msg=msg@entry=0x55fbb1634790 <__func__.27> "qemu_mutex_unlock_impl")
at ../util/qemu-thread-posix.c:37
3  qemu_mutex_unlock_impl
(mutex=mutex@entry=0x55fbb25ab6e0,
file=file@entry=0x55fbb1636957 "../util/async.c",
line=line@entry=650)
at ../util/qemu-thread-posix.c:109
4  aio_context_release (ctx=ctx@entry=0x55fbb25ab680) at ../util/async.c:650
5  bdrv_do_drained_begin
(bs=bs@entry=0x55fbb3a87000, recursive=recursive@entry=false,
parent=parent@entry=0x0,
ignore_bds_parents=ignore_bds_parents@entry=false,
poll=poll@entry=true) at ../block/io.c:441
6  bdrv_do_drained_begin
(poll=true, ignore_bds_parents=false, parent=0x0, recursive=false,
bs=0x55fbb3a87000) at ../block/io.c:448
7  blk_drain (blk=0x55fbb26c5a00) at ../block/block-backend.c:1718
8  blk_unref (blk=0x55fbb26c5a00) at ../block/block-backend.c:498
9  blk_unref (blk=0x55fbb26c5a00) at ../block/block-backend.c:491
10 hmp_qemu_io (mon=0x7fffaf3fc7d0, qdict=)
at ../block/monitor/block-hmp-cmds.c:628

man pthread_mutex_unlock
...
 EPERM  The  mutex type is PTHREAD_MUTEX_ERRORCHECK or
 PTHREAD_MUTEX_RECURSIVE, or the mutex is a robust mutex, and the
 current thread does not own the mutex.

So, thread doesn't own the mutex. And we have iothread here.

Next, note that AIO_WAIT_WHILE() documents that ctx must be acquired
exactly once by caller. But where is it acquired in the call stack?
Seems nowhere.

qemuio_command do acquire aio context.. But we need context acquired
around blk_unref() as well and actually around blk_insert_bs() too.

Let's refactor qemuio_command so that it doesn't acquire aio context
but callers do that instead. This way we can cleanly acquire aio
context in hmp_qemu_io() around all three calls.

Reported-by: Max Reitz 
Signed-off-by: Vladimir Sementsov-Ogievskiy 
---

v2: rewrite, to cleanly acquire aio-context around all needed things in
 hmp_qemu_io

  block/monitor/block-hmp-cmds.c | 31 +--
  qemu-io-cmds.c |  8 
  qemu-io.c  | 17 +++--
  3 files changed, 40 insertions(+), 16 deletions(-)


Thanks, applied to my block branch:

https://github.com/XanClic/qemu/commits/block

(With the change below:)


diff --git a/qemu-io-cmds.c b/qemu-io-cmds.c
index 97611969cb..19149e014d 100644
--- a/qemu-io-cmds.c
+++ b/qemu-io-cmds.c
@@ -2457,9 +2457,12 @@ static const cmdinfo_t help_cmd = {
  .oneline= "help for one or all commands",
  };
  
+/*

+ * Called with aio context of blk acquired. Or with qemu_get_aio_context()
+ * context acquired if no blk is NULL.


-no


+ */
  int qemuio_command(BlockBackend *blk, const char *cmd)
  {
-AioContext *ctx;
  char *input;
  const cmdinfo_t *ct;
  char **v;





Re: [PATCH] virtio-blk: drop deprecated scsi=on|off property

2021-04-30 Thread Peter Krempa
On Fri, Apr 30, 2021 at 09:42:05 +0200, Markus Armbruster wrote:
> Eduardo Habkost  writes:
> 
> > On Thu, Apr 29, 2021 at 04:52:21PM +0100, Stefan Hajnoczi wrote:
> >> The scsi=on|off property was deprecated in QEMU 5.0 and can be removed
> >> completely at this point.
> >> 
> >> Drop the scsi=on|off option. It was only available on Legacy virtio-blk
> >> devices. Linux v5.6 already dropped support for it.
> >> 
> >> Remove the hw_compat_2_4[] property assignment since scsi=on|off no
> >> longer exists. Old guests with Legacy virtio-blk devices no longer see
> >> the SCSI host features bit.
> >> 
> >
> > This means pc-2.4 will now break guest ABI if using virtio-blk
> > devices, correct?
> >
> > This looks like a sign we should have deprecated pc-2.4 a long
> > time ago.
> 
> The last batch of PC machine type retiring was pc-1.0 to pc-1.3:
> deprecated in 5.0 (commit 30d2a17b4, Dec 2019), dropped in 6.0 (commit
> f862ddbb1, just weeks ago).  pc-1.3 was a bit over seven years old when
> we released 5.0.  pc-2.4 will be six years old by the time we release
> 6.1.  Fair game?

As a data-point, libvirt will be dropping support for 

Re: [PATCH] virtio-blk: drop deprecated scsi=on|off property

2021-04-30 Thread Markus Armbruster
Eduardo Habkost  writes:

> On Thu, Apr 29, 2021 at 04:52:21PM +0100, Stefan Hajnoczi wrote:
>> The scsi=on|off property was deprecated in QEMU 5.0 and can be removed
>> completely at this point.
>> 
>> Drop the scsi=on|off option. It was only available on Legacy virtio-blk
>> devices. Linux v5.6 already dropped support for it.
>> 
>> Remove the hw_compat_2_4[] property assignment since scsi=on|off no
>> longer exists. Old guests with Legacy virtio-blk devices no longer see
>> the SCSI host features bit.
>> 
>
> This means pc-2.4 will now break guest ABI if using virtio-blk
> devices, correct?
>
> This looks like a sign we should have deprecated pc-2.4 a long
> time ago.

The last batch of PC machine type retiring was pc-1.0 to pc-1.3:
deprecated in 5.0 (commit 30d2a17b4, Dec 2019), dropped in 6.0 (commit
f862ddbb1, just weeks ago).  pc-1.3 was a bit over seven years old when
we released 5.0.  pc-2.4 will be six years old by the time we release
6.1.  Fair game?

>> Live migrating old guests from an old QEMU with the SCSI feature bit
>> enabled will fail with "Features 0x... unsupported. Allowed features:
>> 0x...". We've followed the QEMU deprecation policy so users have been
>> warned...
>> 
>
> Were they really warned, though?  People running
> "-machine pc-i440fx-2.4" might be completely unaware that it was
> silently enabling a deprecated feature.

We've gotten better at documenting deprecations, but we're still bad at
warning on use of deprecated features.

[...]