[PATCH 2/2] ide: Explicitly poll for BHs on cancel

2023-06-22 Thread Lukas Straub
When we still have an AIOCB registered for DMA operations, we try to
settle the respective operation by draining the BlockBackend associated
with the IDE device.

However, this assumes that every DMA operation is associated with some
I/O operation on the BlockBackend, and so settling the latter will
settle the former.  That is not the case; for example, the guest is free
to issue a zero-length TRIM operation that will not result in any I/O
operation forwarded to the BlockBackend.  In such a case, blk_drain()
will be a no-op if no other operations are in flight.

It is clear that if blk_drain() is a no-op, the value of
s->bus->dma->aiocb will not change between checking it in the `if`
condition and asserting that it is NULL after blk_drain().

To settle the DMA operation, we will thus need to explicitly invoke
aio_poll() ourselves, which will run any outstanding BHs (like
ide_trim_bh_cb()), until s->bus->dma->aiocb is NULL.  To stop this from
being an infinite loop, assert that we made progress with every
aio_poll() call (i.e., invoked some BH).

Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2029980
Signed-off-by: Hanna Reitz 
Signed-off-by: Lukas Straub 
---
 hw/ide/core.c | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/hw/ide/core.c b/hw/ide/core.c
index d172e70f1e..a5fd89ebdd 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -736,7 +736,17 @@ void ide_cancel_dma_sync(IDEState *s)
 if (s->bus->dma->aiocb) {
 trace_ide_cancel_dma_sync_remaining();
 blk_drain(s->blk);
-assert(s->bus->dma->aiocb == NULL);
+
+/*
+ * Wait for potentially still-scheduled BHs, like ide_trim_bh_cb()
+ * (blk_drain() will only poll if there are in-flight requests on the
+ * BlockBackend, which there may not necessarily be, e.g. when the
+ * guest has issued a zero-length TRIM request)
+ */
+while (s->bus->dma->aiocb) {
+bool progress = aio_poll(qemu_get_aio_context(), true);
+assert(progress);
+}
 }
 }
 
-- 
2.39.2


pgpj6vMoQ4HKf.pgp
Description: OpenPGP digital signature


[PATCH 1/2] ide: Fix a rare hang during block draining

2023-06-22 Thread Lukas Straub
If the guest issues a discard during a block drain section, the
blk_aio_pdiscard() may not be processed, but queued instead.
And so the callback will never be called to issue the bh and decrease
the BB in-flight number again.
This causes a hang in the drain code, since it will wait forever for the
BB in-flight counter to decrease.

This reverts commit 7e5cdb34 "ide: Increment BB in-flight counter for TRIM BH"
to fix this hang. The bug fixed by that commit will be fixed differently
in the next commit.

Signed-off-by: Lukas Straub 
---
 hw/ide/core.c | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/hw/ide/core.c b/hw/ide/core.c
index de48ff9f86..d172e70f1e 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -436,16 +436,12 @@ static const AIOCBInfo trim_aiocb_info = {
 static void ide_trim_bh_cb(void *opaque)
 {
 TrimAIOCB *iocb = opaque;
-BlockBackend *blk = iocb->s->blk;
 
 iocb->common.cb(iocb->common.opaque, iocb->ret);
 
 qemu_bh_delete(iocb->bh);
 iocb->bh = NULL;
 qemu_aio_unref(iocb);
-
-/* Paired with an increment in ide_issue_trim() */
-blk_dec_in_flight(blk);
 }
 
 static void ide_issue_trim_cb(void *opaque, int ret)
@@ -516,9 +512,6 @@ BlockAIOCB *ide_issue_trim(
 IDEDevice *dev = s->unit ? s->bus->slave : s->bus->master;
 TrimAIOCB *iocb;
 
-/* Paired with a decrement in ide_trim_bh_cb() */
-blk_inc_in_flight(s->blk);
-
 iocb = blk_aio_get(_aiocb_info, s->blk, cb, cb_opaque);
 iocb->s = s;
 iocb->bh = qemu_bh_new_guarded(ide_trim_bh_cb, iocb,
-- 
2.39.2



pgpsiHGM3Qy0x.pgp
Description: OpenPGP digital signature


Re: [PATCH 17/17] qapi: Reformat doc comments to conform to current conventions

2023-04-28 Thread Lukas Straub
On Fri, 28 Apr 2023 12:54:29 +0200
Markus Armbruster  wrote:

> Change
> 
> # @name: Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed
> #do eiusmod tempor incididunt ut labore et dolore magna aliqua.
> 
> to
> 
> # @name: Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed
> # do eiusmod tempor incididunt ut labore et dolore magna aliqua.
> 
> See recent commit "qapi: Relax doc string @name: description
> indentation rules" for rationale.
> 
> Reflow paragraphs to 70 columns width, and consistently use two spaces
> to separate sentences.
> 
> To check the generated documentation does not change, I compared the
> generated HTML before and after this commit with "wdiff -3".  Finds no
> differences.  Comparing with diff is not useful, as the reflown
> paragraphs are visible there.
> 
> Signed-off-by: Markus Armbruster 

Acked-by: Lukas Straub 

> ---
>  qapi/acpi.json   |   50 +-
>  qapi/audio.json  |   85 +-
>  qapi/authz.json  |   29 +-
>  qapi/block-core.json | 2801 --
>  qapi/block-export.json   |  242 ++--
>  qapi/block.json  |  214 +--
>  qapi/char.json   |  134 +-
>  qapi/common.json |   19 +-
>  qapi/compat.json |   13 +-
>  qapi/control.json|   59 +-
>  qapi/crypto.json |  261 ++--
>  qapi/cryptodev.json  |3 +
>  qapi/cxl.json|   74 +-
>  qapi/dump.json   |   78 +-
>  qapi/error.json  |6 +-
>  qapi/introspect.json |   89 +-
>  qapi/job.json|  139 +-
>  qapi/machine-target.json |  303 +++--
>  qapi/machine.json|  389 +++---
>  qapi/migration.json  | 1117 ---
>  qapi/misc-target.json|   67 +-
>  qapi/misc.json   |  180 ++-
>  qapi/net.json|  260 ++--
>  qapi/pci.json|   35 +-
>  qapi/qapi-schema.json|   25 +-
>  qapi/qdev.json   |   63 +-
>  qapi/qom.json|  404 +++---
>  qapi/rdma.json   |1 -
>  qapi/replay.json |   48 +-
>  qapi/rocker.json |   20 +-
>  qapi/run-state.json  |  215 +--
>  qapi/sockets.json|   50 +-
>  qapi/stats.json  |   83 +-
>  qapi/tpm.json|   20 +-
>  qapi/trace.json  |   34 +-
>  qapi/transaction.json|   87 +-
>  qapi/ui.json |  435 +++---
>  qapi/virtio.json |   84 +-
>  qapi/yank.json   |   42 +-
>  39 files changed, 4322 insertions(+), 3936 deletions(-)
> 
> [...]
> 
> diff --git a/qapi/yank.json b/qapi/yank.json
> index 1639744ada..87ec7cab96 100644
> --- a/qapi/yank.json
> +++ b/qapi/yank.json
> @@ -9,7 +9,7 @@
>  ##
>  # @YankInstanceType:
>  #
> -# An enumeration of yank instance types. See @YankInstance for more
> +# An enumeration of yank instance types.  See @YankInstance for more
>  # information.
>  #
>  # Since: 6.0
> @@ -20,8 +20,8 @@
>  ##
>  # @YankInstanceBlockNode:
>  #
> -# Specifies which block graph node to yank. See @YankInstance for more
> -# information.
> +# Specifies which block graph node to yank.  See @YankInstance for
> +# more information.
>  #
>  # @node-name: the name of the block graph node
>  #
> @@ -33,8 +33,8 @@
>  ##
>  # @YankInstanceChardev:
>  #
> -# Specifies which character device to yank. See @YankInstance for more
> -# information.
> +# Specifies which character device to yank.  See @YankInstance for
> +# more information.
>  #
>  # @id: the chardev's ID
>  #
> @@ -46,21 +46,18 @@
>  ##
>  # @YankInstance:
>  #
> -# A yank instance can be yanked with the @yank qmp command to recover from a
> -# hanging QEMU.
> +# A yank instance can be yanked with the @yank qmp command to recover
> +# from a hanging QEMU.
>  #
>  # Currently implemented yank instances:
>  #
> -# - nbd block device:
> -#   Yanking it will shut down the connection to the nbd server without
> -#   attempting to reconnect.
> -# - socket chardev:
> -#   Yanking it will shut down the connected socket.
> -# - migration:
> -#   Yanking it will shut down all migration connections. Unlike
> -#   @migrate_cancel, it will not notify the migration process, so migration
> -#   will go into @failed state, instead of @cancelled state. @yank should be
> -#   used to recover from hangs.
> +# - nbd block device: Yanking it will shut down the connection to the
> +#   nbd server without attempting to reconnect.
> +# - socket chardev: Yanking it will shut down the connected socket.
> +# - migration: Yanking it will shut down all migration connections.
> +#   Unlike @migrate_cancel, it will n

Re: [PATCH v3 4/4] configure: add --disable-colo-proxy option

2023-04-27 Thread Lukas Straub
On Thu, 27 Apr 2023 23:29:46 +0300
Vladimir Sementsov-Ogievskiy  wrote:

> Add option to not build filter-mirror, filter-rewriter and
> colo-compare when they are not needed.
> 
> There could be more agile configuration, for example add separate
> options for each filter, but that may be done in future on demand. The
> aim of this patch is to make possible to disable the whole COLO Proxy
> subsystem.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  meson_options.txt |  2 ++
>  net/meson.build   | 14 ++
>  scripts/meson-buildoptions.sh |  3 +++
>  stubs/colo-compare.c  |  7 +++
>  stubs/meson.build |  1 +
>  5 files changed, 23 insertions(+), 4 deletions(-)
>  create mode 100644 stubs/colo-compare.c
> 
> diff --git a/meson_options.txt b/meson_options.txt
> index 2471dd02da..b59e7ae342 100644
> --- a/meson_options.txt
> +++ b/meson_options.txt
> @@ -289,6 +289,8 @@ option('live_block_migration', type: 'feature', value: 
> 'auto',
> description: 'block migration in the main migration stream')
>  option('replication', type: 'feature', value: 'auto',
> description: 'replication support')
> +option('colo_proxy', type: 'feature', value: 'auto',
> +   description: 'colo-proxy support')
>  option('bochs', type: 'feature', value: 'auto',
> description: 'bochs image format support')
>  option('cloop', type: 'feature', value: 'auto',
> diff --git a/net/meson.build b/net/meson.build
> index 87afca3e93..4cfc850c69 100644
> --- a/net/meson.build
> +++ b/net/meson.build
> @@ -1,13 +1,9 @@
>  softmmu_ss.add(files(
>'announce.c',
>'checksum.c',
> -  'colo-compare.c',
> -  'colo.c',
>'dump.c',
>'eth.c',
>'filter-buffer.c',
> -  'filter-mirror.c',
> -  'filter-rewriter.c',
>'filter.c',
>'hub.c',
>'net-hmp-cmds.c',
> @@ -19,6 +15,16 @@ softmmu_ss.add(files(
>'util.c',
>  ))
>  
> +if get_option('replication').allowed() or \
> +get_option('colo_proxy').allowed()
> +  softmmu_ss.add(files('colo-compare.c'))
> +  softmmu_ss.add(files('colo.c'))
> +endif
> +
> +if get_option('colo_proxy').allowed()
> +  softmmu_ss.add(files('filter-mirror.c', 'filter-rewriter.c'))
> +endif
> +

The last discussion didn't really come to a conclusion, but I still
think that 'filter-mirror.c' (which also contains filter-redirect)
should be left unchanged.

>  softmmu_ss.add(when: 'CONFIG_TCG', if_true: files('filter-replay.c'))
>  
>  if have_l2tpv3
> diff --git a/scripts/meson-buildoptions.sh b/scripts/meson-buildoptions.sh
> index d4369a3ad8..036047ce6f 100644
> --- a/scripts/meson-buildoptions.sh
> +++ b/scripts/meson-buildoptions.sh
> @@ -83,6 +83,7 @@ meson_options_help() {
>printf "%s\n" '  capstoneWhether and how to find the capstone 
> library'
>printf "%s\n" '  cloop   cloop image format support'
>printf "%s\n" '  cocoa   Cocoa user interface (macOS only)'
> +  printf "%s\n" '  colo-proxy  colo-proxy support'
>printf "%s\n" '  coreaudio   CoreAudio sound support'
>printf "%s\n" '  crypto-afalgLinux AF_ALG crypto backend driver'
>printf "%s\n" '  curlCURL block device driver'
> @@ -236,6 +237,8 @@ _meson_option_parse() {
>  --disable-cloop) printf "%s" -Dcloop=disabled ;;
>  --enable-cocoa) printf "%s" -Dcocoa=enabled ;;
>  --disable-cocoa) printf "%s" -Dcocoa=disabled ;;
> +--enable-colo-proxy) printf "%s" -Dcolo_proxy=enabled ;;
> +--disable-colo-proxy) printf "%s" -Dcolo_proxy=disabled ;;
>  --enable-coreaudio) printf "%s" -Dcoreaudio=enabled ;;
>  --disable-coreaudio) printf "%s" -Dcoreaudio=disabled ;;
>  --enable-coroutine-pool) printf "%s" -Dcoroutine_pool=true ;;
> diff --git a/stubs/colo-compare.c b/stubs/colo-compare.c
> new file mode 100644
> index 00..ec726665be
> --- /dev/null
> +++ b/stubs/colo-compare.c
> @@ -0,0 +1,7 @@
> +#include "qemu/osdep.h"
> +#include "qemu/notify.h"
> +#include "net/colo-compare.h"
> +
> +void colo_compare_cleanup(void)
> +{
> +}
> diff --git a/stubs/meson.build b/stubs/meson.build
> index 8412cad15f..a56645e2f7 100644
> --- a/stubs/meson.build
> +++ b/stubs/meson.build
> @@ -46,6 +46,7 @@ stub_ss.add(files('target-monitor-defs.c'))
>  stub_ss.add(files('trace-control.c'))
>  stub_ss.add(files('uuid.c'))
>  stub_ss.add(files('colo.c'))
> +stub_ss.add(files('colo-compare.c'))
>  stub_ss.add(files('vmstate.c'))
>  stub_ss.add(files('vm-stop.c'))
>  stub_ss.add(files('win32-kbd-hook.c'))



-- 



pgpyQ83V9J3SK.pgp
Description: OpenPGP digital signature


Re: [PATCH v3 3/4] build: move COLO under CONFIG_REPLICATION

2023-04-27 Thread Lukas Straub
On Thu, 27 Apr 2023 23:29:45 +0300
Vladimir Sementsov-Ogievskiy  wrote:

> We don't allow to use x-colo capability when replication is not
> configured. So, no reason to build COLO when replication is disabled,
> it's unusable in this case.
> 
> Note also that the check in migrate_caps_check() is not the only
> restriction: some functions in migration/colo.c will just abort if
> called with not defined CONFIG_REPLICATION, for example:
> 
> migration_iteration_finish()
>case MIGRATION_STATUS_COLO:
>migrate_start_colo_process()
>colo_process_checkpoint()
>abort()
> 
> It could probably make sense to have possibility to enable COLO without
> REPLICATION, but this requires deeper audit of colo & replication code,
> which may be done later if needed.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 

Reviewed-by: Lukas Straub 

> ---
>  hmp-commands.hx|  2 ++
>  migration/colo.c   | 28 -
>  migration/meson.build  |  6 --
>  migration/migration-hmp-cmds.c |  2 ++
>  migration/options.c| 17 
>  qapi/migration.json| 12 +++
>  stubs/colo.c   | 37 ++
>  stubs/meson.build  |  1 +
>  8 files changed, 62 insertions(+), 43 deletions(-)
>  create mode 100644 stubs/colo.c
> 
> diff --git a/hmp-commands.hx b/hmp-commands.hx
> index bb85ee1d26..fbd0932232 100644
> --- a/hmp-commands.hx
> +++ b/hmp-commands.hx
> @@ -1035,6 +1035,7 @@ SRST
>migration (or once already in postcopy).
>  ERST
>  
> +#ifdef CONFIG_REPLICATION
>  {
>  .name   = "x_colo_lost_heartbeat",
>  .args_type  = "",
> @@ -1043,6 +1044,7 @@ ERST
>"a failover or takeover is needed.",
>  .cmd = hmp_x_colo_lost_heartbeat,
>  },
> +#endif
>  
>  SRST
>  ``x_colo_lost_heartbeat``
> diff --git a/migration/colo.c b/migration/colo.c
> index 07bfa21fea..e4af47eeeb 100644
> --- a/migration/colo.c
> +++ b/migration/colo.c
> @@ -26,9 +26,7 @@
>  #include "qemu/rcu.h"
>  #include "migration/failover.h"
>  #include "migration/ram.h"
> -#ifdef CONFIG_REPLICATION
>  #include "block/replication.h"
> -#endif
>  #include "net/colo-compare.h"
>  #include "net/colo.h"
>  #include "block/block.h"
> @@ -68,7 +66,6 @@ static bool colo_runstate_is_stopped(void)
>  static void secondary_vm_do_failover(void)
>  {
>  /* COLO needs enable block-replication */
> -#ifdef CONFIG_REPLICATION
>  int old_state;
>  MigrationIncomingState *mis = migration_incoming_get_current();
>  Error *local_err = NULL;
> @@ -133,14 +130,10 @@ static void secondary_vm_do_failover(void)
>  if (mis->migration_incoming_co) {
>  qemu_coroutine_enter(mis->migration_incoming_co);
>  }
> -#else
> -abort();
> -#endif
>  }
>  
>  static void primary_vm_do_failover(void)
>  {
> -#ifdef CONFIG_REPLICATION
>  MigrationState *s = migrate_get_current();
>  int old_state;
>  Error *local_err = NULL;
> @@ -181,9 +174,6 @@ static void primary_vm_do_failover(void)
>  
>  /* Notify COLO thread that failover work is finished */
>  qemu_sem_post(>colo_exit_sem);
> -#else
> -abort();
> -#endif
>  }
>  
>  COLOMode get_colo_mode(void)
> @@ -217,7 +207,6 @@ void colo_do_failover(void)
>  }
>  }
>  
> -#ifdef CONFIG_REPLICATION
>  void qmp_xen_set_replication(bool enable, bool primary,
>   bool has_failover, bool failover,
>   Error **errp)
> @@ -271,7 +260,6 @@ void qmp_xen_colo_do_checkpoint(Error **errp)
>  /* Notify all filters of all NIC to do checkpoint */
>  colo_notify_filters_event(COLO_EVENT_CHECKPOINT, errp);
>  }
> -#endif
>  
>  COLOStatus *qmp_query_colo_status(Error **errp)
>  {
> @@ -435,15 +423,11 @@ static int 
> colo_do_checkpoint_transaction(MigrationState *s,
>  }
>  qemu_mutex_lock_iothread();
>  
> -#ifdef CONFIG_REPLICATION
>  replication_do_checkpoint_all(_err);
>  if (local_err) {
>  qemu_mutex_unlock_iothread();
>  goto out;
>  }
> -#else
> -abort();
> -#endif
>  
>  colo_send_message(s->to_dst_file, COLO_MESSAGE_VMSTATE_SEND, _err);
>  if (local_err) {
> @@ -561,15 +545,11 @@ static void colo_process_checkpoint(MigrationState *s)
>  object_unref(OBJECT(bioc));
>  
>  qemu_mutex_lock_iothread();
> -#ifdef CONFIG

Re: [PATCH v3 1/4] block/meson.build: prefer positive condition for replication

2023-04-27 Thread Lukas Straub
On Thu, 27 Apr 2023 23:29:43 +0300
Vladimir Sementsov-Ogievskiy  wrote:

> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> Reviewed-by: Juan Quintela 
> Reviewed-by: Philippe Mathieu-Daudé 

Reviewed-by: Lukas Straub 

> ---
>  block/meson.build | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/block/meson.build b/block/meson.build
> index 382bec0e7d..b9a72e219b 100644
> --- a/block/meson.build
> +++ b/block/meson.build
> @@ -84,7 +84,7 @@ block_ss.add(when: 'CONFIG_WIN32', if_true: 
> files('file-win32.c', 'win32-aio.c')
>  block_ss.add(when: 'CONFIG_POSIX', if_true: [files('file-posix.c'), coref, 
> iokit])
>  block_ss.add(when: libiscsi, if_true: files('iscsi-opts.c'))
>  block_ss.add(when: 'CONFIG_LINUX', if_true: files('nvme.c'))
> -if not get_option('replication').disabled()
> +if get_option('replication').allowed()
>block_ss.add(files('replication.c'))
>  endif
>  block_ss.add(when: libaio, if_true: files('linux-aio.c'))



-- 



pgpFssw8I8Ecv.pgp
Description: OpenPGP digital signature


Re: [PATCH v2 4/4] configure: add --disable-colo-filters option

2023-04-20 Thread Lukas Straub
On Thu, 20 Apr 2023 09:09:48 +
"Zhang, Chen"  wrote:

> > -Original Message-
> > From: Vladimir Sementsov-Ogievskiy 
> > Sent: Thursday, April 20, 2023 6:53 AM
> > To: qemu-de...@nongnu.org
> > Cc: qemu-block@nongnu.org; michael.r...@amd.com; arm...@redhat.com;
> > ebl...@redhat.com; jasow...@redhat.com; quint...@redhat.com; Zhang,
> > Hailiang ; phi...@linaro.org;
> > th...@redhat.com; berra...@redhat.com; marcandre.lur...@redhat.com;
> > pbonz...@redhat.com; d...@treblig.org; hre...@redhat.com;
> > kw...@redhat.com; Zhang, Chen ;
> > lizhij...@fujitsu.com; Vladimir Sementsov-Ogievskiy  > team.ru>  
> > Subject: [PATCH v2 4/4] configure: add --disable-colo-filters option
> > 
> > Add option to not build COLO Proxy subsystem if it is not needed.  
> 
> I think no need to add the --disable-colo-filter option.
> Net-filters just general infrastructure. Another example is COLO also
> use the -chardev socket to connect each filters. No need to add the 
> --disable-colo-chardev
> Please drop this patch. 
> But for COLO network part, still something need to do:
> You can add --disable-colo-proxy not to build the net/colo-compare.c  if it 
> is not needed.
> This file just for COLO and not belong to network filters.

And net/filter-rewriter.c is just for COLO too.

So in summary just drop net/filter-mirror.c from this patch?

> 
> Thanks
> Chen
> 
> > 
> > Signed-off-by: Vladimir Sementsov-Ogievskiy 
> > ---
> >  meson.build   |  1 +
> >  meson_options.txt |  2 ++
> >  net/meson.build   | 11 ---
> >  scripts/meson-buildoptions.sh |  3 +++
> >  4 files changed, 14 insertions(+), 3 deletions(-)
> > 
> > diff --git a/meson.build b/meson.build
> > index c44d05a13f..5b2fdfbd3a 100644
> > --- a/meson.build
> > +++ b/meson.build
> > @@ -1962,6 +1962,7 @@ config_host_data.set('CONFIG_GPROF',
> > get_option('gprof'))
> > config_host_data.set('CONFIG_LIVE_BLOCK_MIGRATION',
> > get_option('live_block_migration').allowed())
> >  config_host_data.set('CONFIG_QOM_CAST_DEBUG',
> > get_option('qom_cast_debug'))
> > config_host_data.set('CONFIG_REPLICATION',
> > get_option('replication').allowed())
> > +config_host_data.set('CONFIG_COLO_FILTERS',
> > +get_option('colo_filters').allowed())
> > 
> >  # has_header
> >  config_host_data.set('CONFIG_EPOLL', cc.has_header('sys/epoll.h')) diff 
> > --git
> > a/meson_options.txt b/meson_options.txt index fc9447d267..ffe81317cb
> > 100644
> > --- a/meson_options.txt
> > +++ b/meson_options.txt
> > @@ -289,6 +289,8 @@ option('live_block_migration', type: 'feature', value:
> > 'auto',
> > description: 'block migration in the main migration stream')
> > option('replication', type: 'feature', value: 'auto',
> > description: 'replication support')
> > +option('colo_filters', type: 'feature', value: 'auto',
> > +   description: 'colo_filters support')
> >  option('bochs', type: 'feature', value: 'auto',
> > description: 'bochs image format support')  option('cloop', type: 
> > 'feature',
> > value: 'auto', diff --git a/net/meson.build b/net/meson.build index
> > 38d50b8c96..7e54744aea 100644
> > --- a/net/meson.build
> > +++ b/net/meson.build
> > @@ -1,12 +1,9 @@
> >  softmmu_ss.add(files(
> >'announce.c',
> >'checksum.c',
> > -  'colo.c',
> >'dump.c',
> >'eth.c',
> >'filter-buffer.c',
> > -  'filter-mirror.c',
> > -  'filter-rewriter.c',
> >'filter.c',
> >'hub.c',
> >'net-hmp-cmds.c',
> > @@ -22,6 +19,14 @@ if get_option('replication').allowed()
> >softmmu_ss.add(files('colo-compare.c'))
> >  endif
> > 
> > +if get_option('replication').allowed() or
> > +get_option('colo_filters').allowed()
> > +  softmmu_ss.add(files('colo.c'))
> > +endif
> > +
> > +if get_option('colo_filters').allowed()
> > +  softmmu_ss.add(files('filter-mirror.c', 'filter-rewriter.c')) endif
> > +
> >  softmmu_ss.add(when: 'CONFIG_TCG', if_true: files('filter-replay.c'))
> > 
> >  if have_l2tpv3
> > diff --git a/scripts/meson-buildoptions.sh b/scripts/meson-buildoptions.sh
> > index 009fab1515..cf9d23369f 100644
> > --- a/scripts/meson-buildoptions.sh
> > +++ b/scripts/meson-buildoptions.sh
> > @@ -83,6 +83,7 @@ meson_options_help() {
> >printf "%s\n" '  capstoneWhether and how to find the capstone 
> > library'
> >printf "%s\n" '  cloop   cloop image format support'
> >printf "%s\n" '  cocoa   Cocoa user interface (macOS only)'
> > +  printf "%s\n" '  colo-filterscolo_filters support'
> >printf "%s\n" '  coreaudio   CoreAudio sound support'
> >printf "%s\n" '  crypto-afalgLinux AF_ALG crypto backend driver'
> >printf "%s\n" '  curlCURL block device driver'
> > @@ -236,6 +237,8 @@ _meson_option_parse() {
> >  --disable-cloop) printf "%s" -Dcloop=disabled ;;
> >  --enable-cocoa) printf "%s" -Dcocoa=enabled ;;
> >  --disable-cocoa) printf "%s" -Dcocoa=disabled ;;
> > +--enable-colo-filters) printf "%s" 

Re: [PATCH v2 0/4] COLO: improve build options

2023-04-20 Thread Lukas Straub
On Thu, 20 Apr 2023 01:52:28 +0300
Vladimir Sementsov-Ogievskiy  wrote:

> Hi all!
> 
> COLO substem seems to be useless when CONFIG_REPLICATION is unset, as we
> simply don't allow to set x-colo capability in this case. So, let's not
> compile in unreachable code and interface we cannot use when
> CONFIG_REPLICATION is unset.
> 
> Also, provide personal configure option for COLO Proxy subsystem.
> 
> v1 was 
> [PATCH] replication: compile out some staff when replication is not configured
> Supersedes: <20230411145112.497785-1-vsement...@yandex-team.ru>

Hey,
This series is a good idea, and looks fine to me. Maybe you can remove
the #ifdef CONFIG_REPLICATION/#ifndef CONFIG_REPLICATION from
migration/colo.c too while you are at it.

Regards,
Lukas Straub

> Vladimir Sementsov-Ogievskiy (4):
>   block/meson.build: prefer positive condition for replication
>   scripts/qapi: allow optional experimental enum values
>   build: move COLO under CONFIG_REPLICATION
>   configure: add --disable-colo-filters option
> 
>  block/meson.build  |  2 +-
>  hmp-commands.hx|  2 ++
>  meson.build|  1 +
>  meson_options.txt  |  2 ++
>  migration/colo.c   |  6 +
>  migration/meson.build  |  6 +++--
>  migration/migration-hmp-cmds.c |  2 ++
>  migration/migration.c  | 19 +++---
>  net/meson.build| 16 +---
>  qapi/migration.json| 12 ++---
>  scripts/meson-buildoptions.sh  |  3 +++
>  scripts/qapi/types.py  |  2 ++
>  stubs/colo.c   | 47 ++
>  stubs/meson.build  |  1 +
>  14 files changed, 95 insertions(+), 26 deletions(-)
>  create mode 100644 stubs/colo.c
> 



-- 



pgpREDHE483vj.pgp
Description: OpenPGP digital signature


[PATCH v6 1/4] replication: Remove s->active_disk

2021-07-18 Thread Lukas Straub
s->active_disk is bs->file. Remove it and use local variables instead.

Signed-off-by: Lukas Straub 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
---
 block/replication.c | 34 +-
 1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/block/replication.c b/block/replication.c
index 774e15df16..9ad2dfdc69 100644
--- a/block/replication.c
+++ b/block/replication.c
@@ -35,7 +35,6 @@ typedef enum {
 typedef struct BDRVReplicationState {
 ReplicationMode mode;
 ReplicationStage stage;
-BdrvChild *active_disk;
 BlockJob *commit_job;
 BdrvChild *hidden_disk;
 BdrvChild *secondary_disk;
@@ -307,8 +306,10 @@ out:
 return ret;
 }

-static void secondary_do_checkpoint(BDRVReplicationState *s, Error **errp)
+static void secondary_do_checkpoint(BlockDriverState *bs, Error **errp)
 {
+BDRVReplicationState *s = bs->opaque;
+BdrvChild *active_disk = bs->file;
 Error *local_err = NULL;
 int ret;

@@ -323,13 +324,13 @@ static void secondary_do_checkpoint(BDRVReplicationState 
*s, Error **errp)
 return;
 }

-if (!s->active_disk->bs->drv) {
+if (!active_disk->bs->drv) {
 error_setg(errp, "Active disk %s is ejected",
-   s->active_disk->bs->node_name);
+   active_disk->bs->node_name);
 return;
 }

-ret = bdrv_make_empty(s->active_disk, errp);
+ret = bdrv_make_empty(active_disk, errp);
 if (ret < 0) {
 return;
 }
@@ -458,6 +459,7 @@ static void replication_start(ReplicationState *rs, 
ReplicationMode mode,
 BlockDriverState *bs = rs->opaque;
 BDRVReplicationState *s;
 BlockDriverState *top_bs;
+BdrvChild *active_disk;
 int64_t active_length, hidden_length, disk_length;
 AioContext *aio_context;
 Error *local_err = NULL;
@@ -495,15 +497,14 @@ static void replication_start(ReplicationState *rs, 
ReplicationMode mode,
 case REPLICATION_MODE_PRIMARY:
 break;
 case REPLICATION_MODE_SECONDARY:
-s->active_disk = bs->file;
-if (!s->active_disk || !s->active_disk->bs ||
-!s->active_disk->bs->backing) {
+active_disk = bs->file;
+if (!active_disk || !active_disk->bs || !active_disk->bs->backing) {
 error_setg(errp, "Active disk doesn't have backing file");
 aio_context_release(aio_context);
 return;
 }

-s->hidden_disk = s->active_disk->bs->backing;
+s->hidden_disk = active_disk->bs->backing;
 if (!s->hidden_disk->bs || !s->hidden_disk->bs->backing) {
 error_setg(errp, "Hidden disk doesn't have backing file");
 aio_context_release(aio_context);
@@ -518,7 +519,7 @@ static void replication_start(ReplicationState *rs, 
ReplicationMode mode,
 }

 /* verify the length */
-active_length = bdrv_getlength(s->active_disk->bs);
+active_length = bdrv_getlength(active_disk->bs);
 hidden_length = bdrv_getlength(s->hidden_disk->bs);
 disk_length = bdrv_getlength(s->secondary_disk->bs);
 if (active_length < 0 || hidden_length < 0 || disk_length < 0 ||
@@ -530,9 +531,9 @@ static void replication_start(ReplicationState *rs, 
ReplicationMode mode,
 }

 /* Must be true, or the bdrv_getlength() calls would have failed */
-assert(s->active_disk->bs->drv && s->hidden_disk->bs->drv);
+assert(active_disk->bs->drv && s->hidden_disk->bs->drv);

-if (!s->active_disk->bs->drv->bdrv_make_empty ||
+if (!active_disk->bs->drv->bdrv_make_empty ||
 !s->hidden_disk->bs->drv->bdrv_make_empty) {
 error_setg(errp,
"Active disk or hidden disk doesn't support 
make_empty");
@@ -586,7 +587,7 @@ static void replication_start(ReplicationState *rs, 
ReplicationMode mode,
 s->stage = BLOCK_REPLICATION_RUNNING;

 if (s->mode == REPLICATION_MODE_SECONDARY) {
-secondary_do_checkpoint(s, errp);
+secondary_do_checkpoint(bs, errp);
 }

 s->error = 0;
@@ -615,7 +616,7 @@ static void replication_do_checkpoint(ReplicationState *rs, 
Error **errp)
 }

 if (s->mode == REPLICATION_MODE_SECONDARY) {
-secondary_do_checkpoint(s, errp);
+secondary_do_checkpoint(bs, errp);
 }
 aio_context_release(aio_context);
 }
@@ -652,7 +653,6 @@ static void replication_done(void *opaque, int ret)
 if (ret == 0) {
 s->stage = BLOCK_REPLICATION_DONE;

-s->active_disk = NULL;
 s->secondary_disk = NULL;
 s->hidden_disk = NULL;
 s->error = 0;
@@ -705,7 +705,7 @@ static void replication_st

[PATCH v6 2/4] replication: Reduce usage of s->hidden_disk and s->secondary_disk

2021-07-18 Thread Lukas Straub
In preparation for the next patch, initialize s->hidden_disk and
s->secondary_disk later and replace access to them with local variables
in the places where they aren't initialized yet.

Signed-off-by: Lukas Straub 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
---
 block/replication.c | 45 -
 1 file changed, 28 insertions(+), 17 deletions(-)

diff --git a/block/replication.c b/block/replication.c
index 9ad2dfdc69..25bbdf5d4b 100644
--- a/block/replication.c
+++ b/block/replication.c
@@ -366,27 +366,35 @@ static void reopen_backing_file(BlockDriverState *bs, 
bool writable,
 Error **errp)
 {
 BDRVReplicationState *s = bs->opaque;
+BdrvChild *hidden_disk, *secondary_disk;
 BlockReopenQueue *reopen_queue = NULL;

+/*
+ * s->hidden_disk and s->secondary_disk may not be set yet, as they will
+ * only be set after the children are writable.
+ */
+hidden_disk = bs->file->bs->backing;
+secondary_disk = hidden_disk->bs->backing;
+
 if (writable) {
-s->orig_hidden_read_only = bdrv_is_read_only(s->hidden_disk->bs);
-s->orig_secondary_read_only = bdrv_is_read_only(s->secondary_disk->bs);
+s->orig_hidden_read_only = bdrv_is_read_only(hidden_disk->bs);
+s->orig_secondary_read_only = bdrv_is_read_only(secondary_disk->bs);
 }

-bdrv_subtree_drained_begin(s->hidden_disk->bs);
-bdrv_subtree_drained_begin(s->secondary_disk->bs);
+bdrv_subtree_drained_begin(hidden_disk->bs);
+bdrv_subtree_drained_begin(secondary_disk->bs);

 if (s->orig_hidden_read_only) {
 QDict *opts = qdict_new();
 qdict_put_bool(opts, BDRV_OPT_READ_ONLY, !writable);
-reopen_queue = bdrv_reopen_queue(reopen_queue, s->hidden_disk->bs,
+reopen_queue = bdrv_reopen_queue(reopen_queue, hidden_disk->bs,
  opts, true);
 }

 if (s->orig_secondary_read_only) {
 QDict *opts = qdict_new();
 qdict_put_bool(opts, BDRV_OPT_READ_ONLY, !writable);
-reopen_queue = bdrv_reopen_queue(reopen_queue, s->secondary_disk->bs,
+reopen_queue = bdrv_reopen_queue(reopen_queue, secondary_disk->bs,
  opts, true);
 }

@@ -401,8 +409,8 @@ static void reopen_backing_file(BlockDriverState *bs, bool 
writable,
 }
 }

-bdrv_subtree_drained_end(s->hidden_disk->bs);
-bdrv_subtree_drained_end(s->secondary_disk->bs);
+bdrv_subtree_drained_end(hidden_disk->bs);
+bdrv_subtree_drained_end(secondary_disk->bs);
 }

 static void backup_job_cleanup(BlockDriverState *bs)
@@ -459,7 +467,7 @@ static void replication_start(ReplicationState *rs, 
ReplicationMode mode,
 BlockDriverState *bs = rs->opaque;
 BDRVReplicationState *s;
 BlockDriverState *top_bs;
-BdrvChild *active_disk;
+BdrvChild *active_disk, *hidden_disk, *secondary_disk;
 int64_t active_length, hidden_length, disk_length;
 AioContext *aio_context;
 Error *local_err = NULL;
@@ -504,15 +512,15 @@ static void replication_start(ReplicationState *rs, 
ReplicationMode mode,
 return;
 }

-s->hidden_disk = active_disk->bs->backing;
-if (!s->hidden_disk->bs || !s->hidden_disk->bs->backing) {
+hidden_disk = active_disk->bs->backing;
+if (!hidden_disk->bs || !hidden_disk->bs->backing) {
 error_setg(errp, "Hidden disk doesn't have backing file");
 aio_context_release(aio_context);
 return;
 }

-s->secondary_disk = s->hidden_disk->bs->backing;
-if (!s->secondary_disk->bs || !bdrv_has_blk(s->secondary_disk->bs)) {
+secondary_disk = hidden_disk->bs->backing;
+if (!secondary_disk->bs || !bdrv_has_blk(secondary_disk->bs)) {
 error_setg(errp, "The secondary disk doesn't have block backend");
 aio_context_release(aio_context);
 return;
@@ -520,8 +528,8 @@ static void replication_start(ReplicationState *rs, 
ReplicationMode mode,

 /* verify the length */
 active_length = bdrv_getlength(active_disk->bs);
-hidden_length = bdrv_getlength(s->hidden_disk->bs);
-disk_length = bdrv_getlength(s->secondary_disk->bs);
+hidden_length = bdrv_getlength(hidden_disk->bs);
+disk_length = bdrv_getlength(secondary_disk->bs);
 if (active_length < 0 || hidden_length < 0 || disk_length < 0 ||
 active_length != hidden_length || hidden_length != disk_length) {
 error_setg(errp, "Active disk, hidden disk, secondary disk's 
length"
@@ -531,10 +539,10 @@ static void replication_start(ReplicationState *rs, 
Replicat

[PATCH v6 4/4] replication: Remove workaround

2021-07-18 Thread Lukas Straub
Remove the workaround introduced in commit
6ecbc6c52672db5c13805735ca02784879ce8285
"replication: Avoid blk_make_empty() on read-only child".

It is not needed anymore since s->hidden_disk is guaranteed to be
writable when secondary_do_checkpoint() runs. Because replication_start(),
_do_checkpoint() and _stop() are only called by COLO migration code
and COLO-migration activates all disks via bdrv_invalidate_cache_all()
before it calls these functions.

Signed-off-by: Lukas Straub 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
---
 block/replication.c | 12 +---
 1 file changed, 1 insertion(+), 11 deletions(-)

diff --git a/block/replication.c b/block/replication.c
index b74192f795..32444b9a8f 100644
--- a/block/replication.c
+++ b/block/replication.c
@@ -346,17 +346,7 @@ static void secondary_do_checkpoint(BlockDriverState *bs, 
Error **errp)
 return;
 }

-BlockBackend *blk = blk_new(qemu_get_current_aio_context(),
-BLK_PERM_WRITE, BLK_PERM_ALL);
-blk_insert_bs(blk, s->hidden_disk->bs, _err);
-if (local_err) {
-error_propagate(errp, local_err);
-blk_unref(blk);
-return;
-}
-
-ret = blk_make_empty(blk, errp);
-blk_unref(blk);
+ret = bdrv_make_empty(s->hidden_disk, errp);
 if (ret < 0) {
 return;
 }
--
2.20.1


pgpAQbsrZ72Yx.pgp
Description: OpenPGP digital signature


[PATCH v6 3/4] replication: Properly attach children

2021-07-18 Thread Lukas Straub
The replication driver needs access to the children block-nodes of
it's child so it can issue bdrv_make_empty() and bdrv_co_pwritev()
to manage the replication. However, it does this by directly copying
the BdrvChilds, which is wrong.

Fix this by properly attaching the block-nodes with
bdrv_attach_child() and requesting the required permissions.

This ultimatively fixes a potential crash in replication_co_writev(),
because it may write to s->secondary_disk if it is in state
BLOCK_REPLICATION_FAILOVER_FAILED, without requesting write
permissions first. And now the workaround in
secondary_do_checkpoint() can be removed.

Signed-off-by: Lukas Straub 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
---
 block/replication.c | 30 +++---
 1 file changed, 27 insertions(+), 3 deletions(-)

diff --git a/block/replication.c b/block/replication.c
index 25bbdf5d4b..b74192f795 100644
--- a/block/replication.c
+++ b/block/replication.c
@@ -165,7 +165,12 @@ static void replication_child_perm(BlockDriverState *bs, 
BdrvChild *c,
uint64_t perm, uint64_t shared,
uint64_t *nperm, uint64_t *nshared)
 {
-*nperm = BLK_PERM_CONSISTENT_READ;
+if (role & BDRV_CHILD_PRIMARY) {
+*nperm = BLK_PERM_CONSISTENT_READ;
+} else {
+*nperm = 0;
+}
+
 if ((bs->open_flags & (BDRV_O_INACTIVE | BDRV_O_RDWR)) == BDRV_O_RDWR) {
 *nperm |= BLK_PERM_WRITE;
 }
@@ -557,8 +562,25 @@ static void replication_start(ReplicationState *rs, 
ReplicationMode mode,
 return;
 }

-s->hidden_disk = hidden_disk;
-s->secondary_disk = secondary_disk;
+bdrv_ref(hidden_disk->bs);
+s->hidden_disk = bdrv_attach_child(bs, hidden_disk->bs, "hidden disk",
+   _of_bds, BDRV_CHILD_DATA,
+   _err);
+if (local_err) {
+error_propagate(errp, local_err);
+aio_context_release(aio_context);
+return;
+}
+
+bdrv_ref(secondary_disk->bs);
+s->secondary_disk = bdrv_attach_child(bs, secondary_disk->bs,
+  "secondary disk", _of_bds,
+  BDRV_CHILD_DATA, _err);
+if (local_err) {
+error_propagate(errp, local_err);
+aio_context_release(aio_context);
+return;
+}

 /* start backup job now */
 error_setg(>blocker,
@@ -664,7 +686,9 @@ static void replication_done(void *opaque, int ret)
 if (ret == 0) {
 s->stage = BLOCK_REPLICATION_DONE;

+bdrv_unref_child(bs, s->secondary_disk);
 s->secondary_disk = NULL;
+bdrv_unref_child(bs, s->hidden_disk);
 s->hidden_disk = NULL;
 s->error = 0;
 } else {
--
2.20.1



pgpWgkEWjLXBe.pgp
Description: OpenPGP digital signature


[PATCH v6 0/4] replication: Bugfix and properly attach children

2021-07-18 Thread Lukas Straub
Hello Everyone,
A while ago Kevin noticed that the replication driver doesn't properly attach
the children it wants to use. Instead, it directly copies the BdrvChilds from
it's backing file, which is wrong. Ths Patchset fixes the problem, fixes a
potential crash in replication_co_writev due to missing permissions and removes
a workaround that was put in place back then.

Tested with full COLO-migration setup in my COLO testsuite.

Regards,
Lukas Straub

Changes:

-v6:
-Drop "replication: Assert that children are writable"
-Added Reviewed-by tags

-v5:
-Assert that children are writable where it's needed

-v4:
-minor style fixes
-clarify why children areguaranteed to be writable in
 "replication: Remove workaround"
-Added Reviewed-by tags

-v3:
-Split up into multiple patches
-Remove s->active_disk
-Clarify child permissions in commit message

-v2: Test for BDRV_CHILD_PRIMARY in replication_child_perm, since
 bs->file might not be set yet. (Vladimir)

Lukas Straub (4):
  replication: Remove s->active_disk
  replication: Reduce usage of s->hidden_disk and s->secondary_disk
  replication: Properly attach children
  replication: Remove workaround

 block/replication.c | 111 +++-
 1 file changed, 68 insertions(+), 43 deletions(-)

--
2.20.1


pgpm8dBseNSlI.pgp
Description: OpenPGP digital signature


[PATCH v5 5/5] replication: Remove workaround

2021-07-12 Thread Lukas Straub
Remove the workaround introduced in commit
6ecbc6c52672db5c13805735ca02784879ce8285
"replication: Avoid blk_make_empty() on read-only child".

It is not needed anymore since s->hidden_disk is guaranteed to be
writable when secondary_do_checkpoint() runs. Because replication_start(),
_do_checkpoint() and _stop() are only called by COLO migration code
and COLO-migration activates all disks via bdrv_invalidate_cache_all()
before it calls these functions.

Signed-off-by: Lukas Straub 
---
 block/replication.c | 12 +---
 1 file changed, 1 insertion(+), 11 deletions(-)

diff --git a/block/replication.c b/block/replication.c
index 772bb63374..1e9dc4d309 100644
--- a/block/replication.c
+++ b/block/replication.c
@@ -356,17 +356,7 @@ static void secondary_do_checkpoint(BlockDriverState *bs, 
Error **errp)
 return;
 }

-BlockBackend *blk = blk_new(qemu_get_current_aio_context(),
-BLK_PERM_WRITE, BLK_PERM_ALL);
-blk_insert_bs(blk, s->hidden_disk->bs, _err);
-if (local_err) {
-error_propagate(errp, local_err);
-blk_unref(blk);
-return;
-}
-
-ret = blk_make_empty(blk, errp);
-blk_unref(blk);
+ret = bdrv_make_empty(s->hidden_disk, errp);
 if (ret < 0) {
 return;
 }
--
2.20.1


pgpwIEmxlcSCT.pgp
Description: OpenPGP digital signature


[PATCH v5 0/5] replication: Bugfix and properly attach children

2021-07-12 Thread Lukas Straub
Hello Everyone,
A while ago Kevin noticed that the replication driver doesn't properly attach
the children it wants to use. Instead, it directly copies the BdrvChilds from
it's backing file, which is wrong. Ths Patchset fixes the problem, fixes a
potential crash in replication_co_writev due to missing permissions and removes
a workaround that was put in place back then.

Tested with full COLO-migration setup in my COLO testsuite.

Regards,
Lukas Straub

Changes:

-v5:
-Assert that children are writable where it's needed

-v4:
-minor style fixes
-clarify why children areguaranteed to be writable in
 "replication: Remove workaround"
-Added Reviewed-by tags

-v3:
-Split up into multiple patches
-Remove s->active_disk
-Clarify child permissions in commit message

-v2: Test for BDRV_CHILD_PRIMARY in replication_child_perm, since
 bs->file might not be set yet. (Vladimir)


Lukas Straub (5):
  replication: Remove s->active_disk
  replication: Reduce usage of s->hidden_disk and s->secondary_disk
  replication: Properly attach children
  replication: Assert that children are writable
  replication: Remove workaround

 block/replication.c | 121 
 1 file changed, 78 insertions(+), 43 deletions(-)

--
2.20.1


pgp_th29JAkpO.pgp
Description: OpenPGP digital signature


[PATCH v5 3/5] replication: Properly attach children

2021-07-12 Thread Lukas Straub
The replication driver needs access to the children block-nodes of
it's child so it can issue bdrv_make_empty() and bdrv_co_pwritev()
to manage the replication. However, it does this by directly copying
the BdrvChilds, which is wrong.

Fix this by properly attaching the block-nodes with
bdrv_attach_child() and requesting the required permissions.

This ultimatively fixes a potential crash in replication_co_writev(),
because it may write to s->secondary_disk if it is in state
BLOCK_REPLICATION_FAILOVER_FAILED, without requesting write
permissions first. And now the workaround in
secondary_do_checkpoint() can be removed.

Signed-off-by: Lukas Straub 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
---
 block/replication.c | 30 +++---
 1 file changed, 27 insertions(+), 3 deletions(-)

diff --git a/block/replication.c b/block/replication.c
index 25bbdf5d4b..b74192f795 100644
--- a/block/replication.c
+++ b/block/replication.c
@@ -165,7 +165,12 @@ static void replication_child_perm(BlockDriverState *bs, 
BdrvChild *c,
uint64_t perm, uint64_t shared,
uint64_t *nperm, uint64_t *nshared)
 {
-*nperm = BLK_PERM_CONSISTENT_READ;
+if (role & BDRV_CHILD_PRIMARY) {
+*nperm = BLK_PERM_CONSISTENT_READ;
+} else {
+*nperm = 0;
+}
+
 if ((bs->open_flags & (BDRV_O_INACTIVE | BDRV_O_RDWR)) == BDRV_O_RDWR) {
 *nperm |= BLK_PERM_WRITE;
 }
@@ -557,8 +562,25 @@ static void replication_start(ReplicationState *rs, 
ReplicationMode mode,
 return;
 }

-s->hidden_disk = hidden_disk;
-s->secondary_disk = secondary_disk;
+bdrv_ref(hidden_disk->bs);
+s->hidden_disk = bdrv_attach_child(bs, hidden_disk->bs, "hidden disk",
+   _of_bds, BDRV_CHILD_DATA,
+   _err);
+if (local_err) {
+error_propagate(errp, local_err);
+aio_context_release(aio_context);
+return;
+}
+
+bdrv_ref(secondary_disk->bs);
+s->secondary_disk = bdrv_attach_child(bs, secondary_disk->bs,
+  "secondary disk", _of_bds,
+  BDRV_CHILD_DATA, _err);
+if (local_err) {
+error_propagate(errp, local_err);
+aio_context_release(aio_context);
+return;
+}

 /* start backup job now */
 error_setg(>blocker,
@@ -664,7 +686,9 @@ static void replication_done(void *opaque, int ret)
 if (ret == 0) {
 s->stage = BLOCK_REPLICATION_DONE;

+bdrv_unref_child(bs, s->secondary_disk);
 s->secondary_disk = NULL;
+bdrv_unref_child(bs, s->hidden_disk);
 s->hidden_disk = NULL;
 s->error = 0;
 } else {
--
2.20.1



pgp5VSvqWfVkG.pgp
Description: OpenPGP digital signature


[PATCH v5 1/5] replication: Remove s->active_disk

2021-07-12 Thread Lukas Straub
s->active_disk is bs->file. Remove it and use local variables instead.

Signed-off-by: Lukas Straub 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
---
 block/replication.c | 34 +-
 1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/block/replication.c b/block/replication.c
index 774e15df16..9ad2dfdc69 100644
--- a/block/replication.c
+++ b/block/replication.c
@@ -35,7 +35,6 @@ typedef enum {
 typedef struct BDRVReplicationState {
 ReplicationMode mode;
 ReplicationStage stage;
-BdrvChild *active_disk;
 BlockJob *commit_job;
 BdrvChild *hidden_disk;
 BdrvChild *secondary_disk;
@@ -307,8 +306,10 @@ out:
 return ret;
 }

-static void secondary_do_checkpoint(BDRVReplicationState *s, Error **errp)
+static void secondary_do_checkpoint(BlockDriverState *bs, Error **errp)
 {
+BDRVReplicationState *s = bs->opaque;
+BdrvChild *active_disk = bs->file;
 Error *local_err = NULL;
 int ret;

@@ -323,13 +324,13 @@ static void secondary_do_checkpoint(BDRVReplicationState 
*s, Error **errp)
 return;
 }

-if (!s->active_disk->bs->drv) {
+if (!active_disk->bs->drv) {
 error_setg(errp, "Active disk %s is ejected",
-   s->active_disk->bs->node_name);
+   active_disk->bs->node_name);
 return;
 }

-ret = bdrv_make_empty(s->active_disk, errp);
+ret = bdrv_make_empty(active_disk, errp);
 if (ret < 0) {
 return;
 }
@@ -458,6 +459,7 @@ static void replication_start(ReplicationState *rs, 
ReplicationMode mode,
 BlockDriverState *bs = rs->opaque;
 BDRVReplicationState *s;
 BlockDriverState *top_bs;
+BdrvChild *active_disk;
 int64_t active_length, hidden_length, disk_length;
 AioContext *aio_context;
 Error *local_err = NULL;
@@ -495,15 +497,14 @@ static void replication_start(ReplicationState *rs, 
ReplicationMode mode,
 case REPLICATION_MODE_PRIMARY:
 break;
 case REPLICATION_MODE_SECONDARY:
-s->active_disk = bs->file;
-if (!s->active_disk || !s->active_disk->bs ||
-!s->active_disk->bs->backing) {
+active_disk = bs->file;
+if (!active_disk || !active_disk->bs || !active_disk->bs->backing) {
 error_setg(errp, "Active disk doesn't have backing file");
 aio_context_release(aio_context);
 return;
 }

-s->hidden_disk = s->active_disk->bs->backing;
+s->hidden_disk = active_disk->bs->backing;
 if (!s->hidden_disk->bs || !s->hidden_disk->bs->backing) {
 error_setg(errp, "Hidden disk doesn't have backing file");
 aio_context_release(aio_context);
@@ -518,7 +519,7 @@ static void replication_start(ReplicationState *rs, 
ReplicationMode mode,
 }

 /* verify the length */
-active_length = bdrv_getlength(s->active_disk->bs);
+active_length = bdrv_getlength(active_disk->bs);
 hidden_length = bdrv_getlength(s->hidden_disk->bs);
 disk_length = bdrv_getlength(s->secondary_disk->bs);
 if (active_length < 0 || hidden_length < 0 || disk_length < 0 ||
@@ -530,9 +531,9 @@ static void replication_start(ReplicationState *rs, 
ReplicationMode mode,
 }

 /* Must be true, or the bdrv_getlength() calls would have failed */
-assert(s->active_disk->bs->drv && s->hidden_disk->bs->drv);
+assert(active_disk->bs->drv && s->hidden_disk->bs->drv);

-if (!s->active_disk->bs->drv->bdrv_make_empty ||
+if (!active_disk->bs->drv->bdrv_make_empty ||
 !s->hidden_disk->bs->drv->bdrv_make_empty) {
 error_setg(errp,
"Active disk or hidden disk doesn't support 
make_empty");
@@ -586,7 +587,7 @@ static void replication_start(ReplicationState *rs, 
ReplicationMode mode,
 s->stage = BLOCK_REPLICATION_RUNNING;

 if (s->mode == REPLICATION_MODE_SECONDARY) {
-secondary_do_checkpoint(s, errp);
+secondary_do_checkpoint(bs, errp);
 }

 s->error = 0;
@@ -615,7 +616,7 @@ static void replication_do_checkpoint(ReplicationState *rs, 
Error **errp)
 }

 if (s->mode == REPLICATION_MODE_SECONDARY) {
-secondary_do_checkpoint(s, errp);
+secondary_do_checkpoint(bs, errp);
 }
 aio_context_release(aio_context);
 }
@@ -652,7 +653,6 @@ static void replication_done(void *opaque, int ret)
 if (ret == 0) {
 s->stage = BLOCK_REPLICATION_DONE;

-s->active_disk = NULL;
 s->secondary_disk = NULL;
 s->hidden_disk = NULL;
 s->error = 0;
@@ -705,7 +705,7 @@ static void replication_st

[PATCH v5 4/5] replication: Assert that children are writable

2021-07-12 Thread Lukas Straub
Assert that the children are writable where it's needed.
While there is no test-case for the
BLOCK_REPLICATION_FAILOVER_FAILED state, this at least ensures that
s->secondary_disk is always writable in case replication might go
into that state.

Signed-off-by: Lukas Straub 
---
 block/replication.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/block/replication.c b/block/replication.c
index b74192f795..772bb63374 100644
--- a/block/replication.c
+++ b/block/replication.c
@@ -261,6 +261,13 @@ static coroutine_fn int 
replication_co_writev(BlockDriverState *bs,
 int64_t n;

 assert(!flags);
+assert(top->perm & BLK_PERM_WRITE);
+if (s->mode == REPLICATION_MODE_SECONDARY &&
+s->stage != BLOCK_REPLICATION_NONE &&
+s->stage != BLOCK_REPLICATION_DONE) {
+assert(base->perm & BLK_PERM_WRITE);
+}
+
 ret = replication_get_io_status(s);
 if (ret < 0) {
 goto out;
@@ -318,6 +325,9 @@ static void secondary_do_checkpoint(BlockDriverState *bs, 
Error **errp)
 Error *local_err = NULL;
 int ret;

+assert(active_disk->perm & BLK_PERM_WRITE);
+assert(s->hidden_disk->perm & BLK_PERM_WRITE);
+
 if (!s->backup_job) {
 error_setg(errp, "Backup job was cancelled unexpectedly");
 return;
--
2.20.1



pgpCMs1y2Gkrf.pgp
Description: OpenPGP digital signature


[PATCH v5 2/5] replication: Reduce usage of s->hidden_disk and s->secondary_disk

2021-07-12 Thread Lukas Straub
In preparation for the next patch, initialize s->hidden_disk and
s->secondary_disk later and replace access to them with local variables
in the places where they aren't initialized yet.

Signed-off-by: Lukas Straub 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
---
 block/replication.c | 45 -
 1 file changed, 28 insertions(+), 17 deletions(-)

diff --git a/block/replication.c b/block/replication.c
index 9ad2dfdc69..25bbdf5d4b 100644
--- a/block/replication.c
+++ b/block/replication.c
@@ -366,27 +366,35 @@ static void reopen_backing_file(BlockDriverState *bs, 
bool writable,
 Error **errp)
 {
 BDRVReplicationState *s = bs->opaque;
+BdrvChild *hidden_disk, *secondary_disk;
 BlockReopenQueue *reopen_queue = NULL;

+/*
+ * s->hidden_disk and s->secondary_disk may not be set yet, as they will
+ * only be set after the children are writable.
+ */
+hidden_disk = bs->file->bs->backing;
+secondary_disk = hidden_disk->bs->backing;
+
 if (writable) {
-s->orig_hidden_read_only = bdrv_is_read_only(s->hidden_disk->bs);
-s->orig_secondary_read_only = bdrv_is_read_only(s->secondary_disk->bs);
+s->orig_hidden_read_only = bdrv_is_read_only(hidden_disk->bs);
+s->orig_secondary_read_only = bdrv_is_read_only(secondary_disk->bs);
 }

-bdrv_subtree_drained_begin(s->hidden_disk->bs);
-bdrv_subtree_drained_begin(s->secondary_disk->bs);
+bdrv_subtree_drained_begin(hidden_disk->bs);
+bdrv_subtree_drained_begin(secondary_disk->bs);

 if (s->orig_hidden_read_only) {
 QDict *opts = qdict_new();
 qdict_put_bool(opts, BDRV_OPT_READ_ONLY, !writable);
-reopen_queue = bdrv_reopen_queue(reopen_queue, s->hidden_disk->bs,
+reopen_queue = bdrv_reopen_queue(reopen_queue, hidden_disk->bs,
  opts, true);
 }

 if (s->orig_secondary_read_only) {
 QDict *opts = qdict_new();
 qdict_put_bool(opts, BDRV_OPT_READ_ONLY, !writable);
-reopen_queue = bdrv_reopen_queue(reopen_queue, s->secondary_disk->bs,
+reopen_queue = bdrv_reopen_queue(reopen_queue, secondary_disk->bs,
  opts, true);
 }

@@ -401,8 +409,8 @@ static void reopen_backing_file(BlockDriverState *bs, bool 
writable,
 }
 }

-bdrv_subtree_drained_end(s->hidden_disk->bs);
-bdrv_subtree_drained_end(s->secondary_disk->bs);
+bdrv_subtree_drained_end(hidden_disk->bs);
+bdrv_subtree_drained_end(secondary_disk->bs);
 }

 static void backup_job_cleanup(BlockDriverState *bs)
@@ -459,7 +467,7 @@ static void replication_start(ReplicationState *rs, 
ReplicationMode mode,
 BlockDriverState *bs = rs->opaque;
 BDRVReplicationState *s;
 BlockDriverState *top_bs;
-BdrvChild *active_disk;
+BdrvChild *active_disk, *hidden_disk, *secondary_disk;
 int64_t active_length, hidden_length, disk_length;
 AioContext *aio_context;
 Error *local_err = NULL;
@@ -504,15 +512,15 @@ static void replication_start(ReplicationState *rs, 
ReplicationMode mode,
 return;
 }

-s->hidden_disk = active_disk->bs->backing;
-if (!s->hidden_disk->bs || !s->hidden_disk->bs->backing) {
+hidden_disk = active_disk->bs->backing;
+if (!hidden_disk->bs || !hidden_disk->bs->backing) {
 error_setg(errp, "Hidden disk doesn't have backing file");
 aio_context_release(aio_context);
 return;
 }

-s->secondary_disk = s->hidden_disk->bs->backing;
-if (!s->secondary_disk->bs || !bdrv_has_blk(s->secondary_disk->bs)) {
+secondary_disk = hidden_disk->bs->backing;
+if (!secondary_disk->bs || !bdrv_has_blk(secondary_disk->bs)) {
 error_setg(errp, "The secondary disk doesn't have block backend");
 aio_context_release(aio_context);
 return;
@@ -520,8 +528,8 @@ static void replication_start(ReplicationState *rs, 
ReplicationMode mode,

 /* verify the length */
 active_length = bdrv_getlength(active_disk->bs);
-hidden_length = bdrv_getlength(s->hidden_disk->bs);
-disk_length = bdrv_getlength(s->secondary_disk->bs);
+hidden_length = bdrv_getlength(hidden_disk->bs);
+disk_length = bdrv_getlength(secondary_disk->bs);
 if (active_length < 0 || hidden_length < 0 || disk_length < 0 ||
 active_length != hidden_length || hidden_length != disk_length) {
 error_setg(errp, "Active disk, hidden disk, secondary disk's 
length"
@@ -531,10 +539,10 @@ static void replication_start(ReplicationState *rs, 
Replicat

Re: [PATCH v3 4/4] replication: Remove workaround

2021-07-12 Thread Lukas Straub
On Mon, 12 Jul 2021 13:06:19 +0300
Vladimir Sementsov-Ogievskiy  wrote:

> 11.07.2021 23:33, Lukas Straub wrote:
> > On Fri, 9 Jul 2021 10:49:23 +0300
> > Vladimir Sementsov-Ogievskiy  wrote:
> >   
> >> 07.07.2021 21:15, Lukas Straub wrote:  
> >>> Remove the workaround introduced in commit
> >>> 6ecbc6c52672db5c13805735ca02784879ce8285
> >>> "replication: Avoid blk_make_empty() on read-only child".
> >>>
> >>> It is not needed anymore since s->hidden_disk is guaranteed to be
> >>> writable when secondary_do_checkpoint() runs. Because replication_start(),
> >>> _do_checkpoint() and _stop() are only called by COLO migration code
> >>> and COLO-migration doesn't inactivate disks.  
> >>
> >> If look at replication_child_perm() you should also be sure that it always 
> >> works only with RW disks..
> >>
> >> Actually, I think that it would be correct just require BLK_PERM_WRITE in 
> >> replication_child_perm() unconditionally. Let generic layer care about all 
> >> these RD/WR things. In _child_perm() we can require WRITE and don't care. 
> >> If something goes wrong and we can't get WRITE permission we should see 
> >> clean error-out.
> >>
> >> Opposite, if we don't require WRITE permission in some case and still do 
> >> WRITE request, it may crash.
> >>
> >> Still, this may be considered as a preexisting problem of 
> >> replication_child_perm() and fixed separately.  
> > 
> > Hmm, unconditionally requesting write doesn't work, since qemu on the
> > secondary side is started with "-miration incoming", it goes into
> > runstate RUN_STATE_INMIGRATE from the beginning and then blockdev_init()
> > opens every blockdev with BDRV_O_INACTIVE and then it errors out with
> > -drive driver=replication,...: Block node is read-only.  
> 
> Ah, OK. So we need this check in _child_perm().. Then, maybe, leave check or 
> assertion in secondary_do_checkpoint, that hidden_disk is writable?

Good Idea. I will add assertions to secondary_do_checkpoint and 
replication_co_writev too.

> >   
> >>>
> >>> Signed-off-by: Lukas Straub   
> >>
> >> So, for this one commit (with probably updated commit message accordingly 
> >> to my comments, or even rebased on fixed replication_child_perm()):
> >>
> >> Reviewed-by: Vladimir Sementsov-Ogievskiy 
> >>
> >>  
> >>> ---
> >>>block/replication.c | 12 +---
> >>>1 file changed, 1 insertion(+), 11 deletions(-)
> >>>
> >>> diff --git a/block/replication.c b/block/replication.c
> >>> index c0d4a6c264..68b46d65a8 100644
> >>> --- a/block/replication.c
> >>> +++ b/block/replication.c
> >>> @@ -348,17 +348,7 @@ static void secondary_do_checkpoint(BlockDriverState 
> >>> *bs, Error **errp)
> >>>return;
> >>>}
> >>>
> >>> -BlockBackend *blk = blk_new(qemu_get_current_aio_context(),
> >>> -BLK_PERM_WRITE, BLK_PERM_ALL);
> >>> -blk_insert_bs(blk, s->hidden_disk->bs, _err);
> >>> -if (local_err) {
> >>> -error_propagate(errp, local_err);
> >>> -blk_unref(blk);
> >>> -return;
> >>> -}
> >>> -
> >>> -ret = blk_make_empty(blk, errp);
> >>> -blk_unref(blk);
> >>> +ret = bdrv_make_empty(s->hidden_disk, errp);
> >>>if (ret < 0) {
> >>>return;
> >>>}
> >>> --
> >>> 2.20.1
> >>>  
> >>
> >>  
> > 
> > 
> >   
> 
> 



-- 



pgpVkn4sVUUAO.pgp
Description: OpenPGP digital signature


[PATCH v4 1/4] replication: Remove s->active_disk

2021-07-11 Thread Lukas Straub
s->active_disk is bs->file. Remove it and use local variables instead.

Signed-off-by: Lukas Straub 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
---
 block/replication.c | 34 +-
 1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/block/replication.c b/block/replication.c
index 774e15df16..9ad2dfdc69 100644
--- a/block/replication.c
+++ b/block/replication.c
@@ -35,7 +35,6 @@ typedef enum {
 typedef struct BDRVReplicationState {
 ReplicationMode mode;
 ReplicationStage stage;
-BdrvChild *active_disk;
 BlockJob *commit_job;
 BdrvChild *hidden_disk;
 BdrvChild *secondary_disk;
@@ -307,8 +306,10 @@ out:
 return ret;
 }

-static void secondary_do_checkpoint(BDRVReplicationState *s, Error **errp)
+static void secondary_do_checkpoint(BlockDriverState *bs, Error **errp)
 {
+BDRVReplicationState *s = bs->opaque;
+BdrvChild *active_disk = bs->file;
 Error *local_err = NULL;
 int ret;

@@ -323,13 +324,13 @@ static void secondary_do_checkpoint(BDRVReplicationState 
*s, Error **errp)
 return;
 }

-if (!s->active_disk->bs->drv) {
+if (!active_disk->bs->drv) {
 error_setg(errp, "Active disk %s is ejected",
-   s->active_disk->bs->node_name);
+   active_disk->bs->node_name);
 return;
 }

-ret = bdrv_make_empty(s->active_disk, errp);
+ret = bdrv_make_empty(active_disk, errp);
 if (ret < 0) {
 return;
 }
@@ -458,6 +459,7 @@ static void replication_start(ReplicationState *rs, 
ReplicationMode mode,
 BlockDriverState *bs = rs->opaque;
 BDRVReplicationState *s;
 BlockDriverState *top_bs;
+BdrvChild *active_disk;
 int64_t active_length, hidden_length, disk_length;
 AioContext *aio_context;
 Error *local_err = NULL;
@@ -495,15 +497,14 @@ static void replication_start(ReplicationState *rs, 
ReplicationMode mode,
 case REPLICATION_MODE_PRIMARY:
 break;
 case REPLICATION_MODE_SECONDARY:
-s->active_disk = bs->file;
-if (!s->active_disk || !s->active_disk->bs ||
-!s->active_disk->bs->backing) {
+active_disk = bs->file;
+if (!active_disk || !active_disk->bs || !active_disk->bs->backing) {
 error_setg(errp, "Active disk doesn't have backing file");
 aio_context_release(aio_context);
 return;
 }

-s->hidden_disk = s->active_disk->bs->backing;
+s->hidden_disk = active_disk->bs->backing;
 if (!s->hidden_disk->bs || !s->hidden_disk->bs->backing) {
 error_setg(errp, "Hidden disk doesn't have backing file");
 aio_context_release(aio_context);
@@ -518,7 +519,7 @@ static void replication_start(ReplicationState *rs, 
ReplicationMode mode,
 }

 /* verify the length */
-active_length = bdrv_getlength(s->active_disk->bs);
+active_length = bdrv_getlength(active_disk->bs);
 hidden_length = bdrv_getlength(s->hidden_disk->bs);
 disk_length = bdrv_getlength(s->secondary_disk->bs);
 if (active_length < 0 || hidden_length < 0 || disk_length < 0 ||
@@ -530,9 +531,9 @@ static void replication_start(ReplicationState *rs, 
ReplicationMode mode,
 }

 /* Must be true, or the bdrv_getlength() calls would have failed */
-assert(s->active_disk->bs->drv && s->hidden_disk->bs->drv);
+assert(active_disk->bs->drv && s->hidden_disk->bs->drv);

-if (!s->active_disk->bs->drv->bdrv_make_empty ||
+if (!active_disk->bs->drv->bdrv_make_empty ||
 !s->hidden_disk->bs->drv->bdrv_make_empty) {
 error_setg(errp,
"Active disk or hidden disk doesn't support 
make_empty");
@@ -586,7 +587,7 @@ static void replication_start(ReplicationState *rs, 
ReplicationMode mode,
 s->stage = BLOCK_REPLICATION_RUNNING;

 if (s->mode == REPLICATION_MODE_SECONDARY) {
-secondary_do_checkpoint(s, errp);
+secondary_do_checkpoint(bs, errp);
 }

 s->error = 0;
@@ -615,7 +616,7 @@ static void replication_do_checkpoint(ReplicationState *rs, 
Error **errp)
 }

 if (s->mode == REPLICATION_MODE_SECONDARY) {
-secondary_do_checkpoint(s, errp);
+secondary_do_checkpoint(bs, errp);
 }
 aio_context_release(aio_context);
 }
@@ -652,7 +653,6 @@ static void replication_done(void *opaque, int ret)
 if (ret == 0) {
 s->stage = BLOCK_REPLICATION_DONE;

-s->active_disk = NULL;
 s->secondary_disk = NULL;
 s->hidden_disk = NULL;
 s->error = 0;
@@ -705,7 +705,7 @@ static void replication_st

[PATCH v4 0/4] replication: Bugfix and properly attach children

2021-07-11 Thread Lukas Straub
Hello Everyone,
A while ago Kevin noticed that the replication driver doesn't properly attach
the children it wants to use. Instead, it directly copies the BdrvChilds from
it's backing file, which is wrong. Ths Patchset fixes the problem, fixes a
potential crash in replication_co_writev due to missing permissions and removes
a workaround that was put in place back then.

Regards,
Lukas Straub

Changes:

-v4:
-minor style fixes
-clarify why children areguaranteed to be writable in
 "replication: Remove workaround"
-Added Reviewed-by tags

-v3:
-Split up into multiple patches
-Remove s->active_disk
-Clarify child permissions in commit message

-v2: Test for BDRV_CHILD_PRIMARY in replication_child_perm, since
 bs->file might not be set yet. (Vladimir)


Lukas Straub (4):
  replication: Remove s->active_disk
  replication: Reduce usage of s->hidden_disk and s->secondary_disk
  replication: Properly attach children
  replication: Remove workaround

 block/replication.c | 111 +++-
 1 file changed, 68 insertions(+), 43 deletions(-)

--
2.20.1


pgpLHeo2qb2uq.pgp
Description: OpenPGP digital signature


[PATCH v4 4/4] replication: Remove workaround

2021-07-11 Thread Lukas Straub
Remove the workaround introduced in commit
6ecbc6c52672db5c13805735ca02784879ce8285
"replication: Avoid blk_make_empty() on read-only child".

It is not needed anymore since s->hidden_disk is guaranteed to be
writable when secondary_do_checkpoint() runs. Because replication_start(),
_do_checkpoint() and _stop() are only called by COLO migration code
and COLO-migration activates all disks via bdrv_invalidate_cache_all()
before it calls these functions.

Signed-off-by: Lukas Straub 
---
 block/replication.c | 12 +---
 1 file changed, 1 insertion(+), 11 deletions(-)

diff --git a/block/replication.c b/block/replication.c
index b74192f795..32444b9a8f 100644
--- a/block/replication.c
+++ b/block/replication.c
@@ -346,17 +346,7 @@ static void secondary_do_checkpoint(BlockDriverState *bs, 
Error **errp)
 return;
 }

-BlockBackend *blk = blk_new(qemu_get_current_aio_context(),
-BLK_PERM_WRITE, BLK_PERM_ALL);
-blk_insert_bs(blk, s->hidden_disk->bs, _err);
-if (local_err) {
-error_propagate(errp, local_err);
-blk_unref(blk);
-return;
-}
-
-ret = blk_make_empty(blk, errp);
-blk_unref(blk);
+ret = bdrv_make_empty(s->hidden_disk, errp);
 if (ret < 0) {
 return;
 }
--
2.20.1


pgp7RP2lHvxak.pgp
Description: OpenPGP digital signature


[PATCH v4 3/4] replication: Properly attach children

2021-07-11 Thread Lukas Straub
The replication driver needs access to the children block-nodes of
it's child so it can issue bdrv_make_empty() and bdrv_co_pwritev()
to manage the replication. However, it does this by directly copying
the BdrvChilds, which is wrong.

Fix this by properly attaching the block-nodes with
bdrv_attach_child() and requesting the required permissions.

This ultimatively fixes a potential crash in replication_co_writev(),
because it may write to s->secondary_disk if it is in state
BLOCK_REPLICATION_FAILOVER_FAILED, without requesting write
permissions first. And now the workaround in
secondary_do_checkpoint() can be removed.

Signed-off-by: Lukas Straub 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
---
 block/replication.c | 30 +++---
 1 file changed, 27 insertions(+), 3 deletions(-)

diff --git a/block/replication.c b/block/replication.c
index 25bbdf5d4b..b74192f795 100644
--- a/block/replication.c
+++ b/block/replication.c
@@ -165,7 +165,12 @@ static void replication_child_perm(BlockDriverState *bs, 
BdrvChild *c,
uint64_t perm, uint64_t shared,
uint64_t *nperm, uint64_t *nshared)
 {
-*nperm = BLK_PERM_CONSISTENT_READ;
+if (role & BDRV_CHILD_PRIMARY) {
+*nperm = BLK_PERM_CONSISTENT_READ;
+} else {
+*nperm = 0;
+}
+
 if ((bs->open_flags & (BDRV_O_INACTIVE | BDRV_O_RDWR)) == BDRV_O_RDWR) {
 *nperm |= BLK_PERM_WRITE;
 }
@@ -557,8 +562,25 @@ static void replication_start(ReplicationState *rs, 
ReplicationMode mode,
 return;
 }

-s->hidden_disk = hidden_disk;
-s->secondary_disk = secondary_disk;
+bdrv_ref(hidden_disk->bs);
+s->hidden_disk = bdrv_attach_child(bs, hidden_disk->bs, "hidden disk",
+   _of_bds, BDRV_CHILD_DATA,
+   _err);
+if (local_err) {
+error_propagate(errp, local_err);
+aio_context_release(aio_context);
+return;
+}
+
+bdrv_ref(secondary_disk->bs);
+s->secondary_disk = bdrv_attach_child(bs, secondary_disk->bs,
+  "secondary disk", _of_bds,
+  BDRV_CHILD_DATA, _err);
+if (local_err) {
+error_propagate(errp, local_err);
+aio_context_release(aio_context);
+return;
+}

 /* start backup job now */
 error_setg(>blocker,
@@ -664,7 +686,9 @@ static void replication_done(void *opaque, int ret)
 if (ret == 0) {
 s->stage = BLOCK_REPLICATION_DONE;

+bdrv_unref_child(bs, s->secondary_disk);
 s->secondary_disk = NULL;
+bdrv_unref_child(bs, s->hidden_disk);
 s->hidden_disk = NULL;
 s->error = 0;
 } else {
--
2.20.1



pgpxilLs24yxr.pgp
Description: OpenPGP digital signature


[PATCH v4 2/4] replication: Reduce usage of s->hidden_disk and s->secondary_disk

2021-07-11 Thread Lukas Straub
In preparation for the next patch, initialize s->hidden_disk and
s->secondary_disk later and replace access to them with local variables
in the places where they aren't initialized yet.

Signed-off-by: Lukas Straub 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
---
 block/replication.c | 45 -
 1 file changed, 28 insertions(+), 17 deletions(-)

diff --git a/block/replication.c b/block/replication.c
index 9ad2dfdc69..25bbdf5d4b 100644
--- a/block/replication.c
+++ b/block/replication.c
@@ -366,27 +366,35 @@ static void reopen_backing_file(BlockDriverState *bs, 
bool writable,
 Error **errp)
 {
 BDRVReplicationState *s = bs->opaque;
+BdrvChild *hidden_disk, *secondary_disk;
 BlockReopenQueue *reopen_queue = NULL;

+/*
+ * s->hidden_disk and s->secondary_disk may not be set yet, as they will
+ * only be set after the children are writable.
+ */
+hidden_disk = bs->file->bs->backing;
+secondary_disk = hidden_disk->bs->backing;
+
 if (writable) {
-s->orig_hidden_read_only = bdrv_is_read_only(s->hidden_disk->bs);
-s->orig_secondary_read_only = bdrv_is_read_only(s->secondary_disk->bs);
+s->orig_hidden_read_only = bdrv_is_read_only(hidden_disk->bs);
+s->orig_secondary_read_only = bdrv_is_read_only(secondary_disk->bs);
 }

-bdrv_subtree_drained_begin(s->hidden_disk->bs);
-bdrv_subtree_drained_begin(s->secondary_disk->bs);
+bdrv_subtree_drained_begin(hidden_disk->bs);
+bdrv_subtree_drained_begin(secondary_disk->bs);

 if (s->orig_hidden_read_only) {
 QDict *opts = qdict_new();
 qdict_put_bool(opts, BDRV_OPT_READ_ONLY, !writable);
-reopen_queue = bdrv_reopen_queue(reopen_queue, s->hidden_disk->bs,
+reopen_queue = bdrv_reopen_queue(reopen_queue, hidden_disk->bs,
  opts, true);
 }

 if (s->orig_secondary_read_only) {
 QDict *opts = qdict_new();
 qdict_put_bool(opts, BDRV_OPT_READ_ONLY, !writable);
-reopen_queue = bdrv_reopen_queue(reopen_queue, s->secondary_disk->bs,
+reopen_queue = bdrv_reopen_queue(reopen_queue, secondary_disk->bs,
  opts, true);
 }

@@ -401,8 +409,8 @@ static void reopen_backing_file(BlockDriverState *bs, bool 
writable,
 }
 }

-bdrv_subtree_drained_end(s->hidden_disk->bs);
-bdrv_subtree_drained_end(s->secondary_disk->bs);
+bdrv_subtree_drained_end(hidden_disk->bs);
+bdrv_subtree_drained_end(secondary_disk->bs);
 }

 static void backup_job_cleanup(BlockDriverState *bs)
@@ -459,7 +467,7 @@ static void replication_start(ReplicationState *rs, 
ReplicationMode mode,
 BlockDriverState *bs = rs->opaque;
 BDRVReplicationState *s;
 BlockDriverState *top_bs;
-BdrvChild *active_disk;
+BdrvChild *active_disk, *hidden_disk, *secondary_disk;
 int64_t active_length, hidden_length, disk_length;
 AioContext *aio_context;
 Error *local_err = NULL;
@@ -504,15 +512,15 @@ static void replication_start(ReplicationState *rs, 
ReplicationMode mode,
 return;
 }

-s->hidden_disk = active_disk->bs->backing;
-if (!s->hidden_disk->bs || !s->hidden_disk->bs->backing) {
+hidden_disk = active_disk->bs->backing;
+if (!hidden_disk->bs || !hidden_disk->bs->backing) {
 error_setg(errp, "Hidden disk doesn't have backing file");
 aio_context_release(aio_context);
 return;
 }

-s->secondary_disk = s->hidden_disk->bs->backing;
-if (!s->secondary_disk->bs || !bdrv_has_blk(s->secondary_disk->bs)) {
+secondary_disk = hidden_disk->bs->backing;
+if (!secondary_disk->bs || !bdrv_has_blk(secondary_disk->bs)) {
 error_setg(errp, "The secondary disk doesn't have block backend");
 aio_context_release(aio_context);
 return;
@@ -520,8 +528,8 @@ static void replication_start(ReplicationState *rs, 
ReplicationMode mode,

 /* verify the length */
 active_length = bdrv_getlength(active_disk->bs);
-hidden_length = bdrv_getlength(s->hidden_disk->bs);
-disk_length = bdrv_getlength(s->secondary_disk->bs);
+hidden_length = bdrv_getlength(hidden_disk->bs);
+disk_length = bdrv_getlength(secondary_disk->bs);
 if (active_length < 0 || hidden_length < 0 || disk_length < 0 ||
 active_length != hidden_length || hidden_length != disk_length) {
 error_setg(errp, "Active disk, hidden disk, secondary disk's 
length"
@@ -531,10 +539,10 @@ static void replication_start(ReplicationState *rs, 
Replicat

Re: [PATCH v3 4/4] replication: Remove workaround

2021-07-11 Thread Lukas Straub
On Fri, 9 Jul 2021 10:49:23 +0300
Vladimir Sementsov-Ogievskiy  wrote:

> 07.07.2021 21:15, Lukas Straub wrote:
> > Remove the workaround introduced in commit
> > 6ecbc6c52672db5c13805735ca02784879ce8285
> > "replication: Avoid blk_make_empty() on read-only child".
> > 
> > It is not needed anymore since s->hidden_disk is guaranteed to be
> > writable when secondary_do_checkpoint() runs. Because replication_start(),
> > _do_checkpoint() and _stop() are only called by COLO migration code
> > and COLO-migration doesn't inactivate disks.  
> 
> If look at replication_child_perm() you should also be sure that it always 
> works only with RW disks..
> 
> Actually, I think that it would be correct just require BLK_PERM_WRITE in 
> replication_child_perm() unconditionally. Let generic layer care about all 
> these RD/WR things. In _child_perm() we can require WRITE and don't care. If 
> something goes wrong and we can't get WRITE permission we should see clean 
> error-out.
> 
> Opposite, if we don't require WRITE permission in some case and still do 
> WRITE request, it may crash.
> 
> Still, this may be considered as a preexisting problem of 
> replication_child_perm() and fixed separately.

Hmm, unconditionally requesting write doesn't work, since qemu on the
secondary side is started with "-miration incoming", it goes into
runstate RUN_STATE_INMIGRATE from the beginning and then blockdev_init()
opens every blockdev with BDRV_O_INACTIVE and then it errors out with
-drive driver=replication,...: Block node is read-only.

> > 
> > Signed-off-by: Lukas Straub   
> 
> So, for this one commit (with probably updated commit message accordingly to 
> my comments, or even rebased on fixed replication_child_perm()):
> 
> Reviewed-by: Vladimir Sementsov-Ogievskiy 
> 
> 
> > ---
> >   block/replication.c | 12 +---
> >   1 file changed, 1 insertion(+), 11 deletions(-)
> > 
> > diff --git a/block/replication.c b/block/replication.c
> > index c0d4a6c264..68b46d65a8 100644
> > --- a/block/replication.c
> > +++ b/block/replication.c
> > @@ -348,17 +348,7 @@ static void secondary_do_checkpoint(BlockDriverState 
> > *bs, Error **errp)
> >   return;
> >   }
> > 
> > -BlockBackend *blk = blk_new(qemu_get_current_aio_context(),
> > -BLK_PERM_WRITE, BLK_PERM_ALL);
> > -blk_insert_bs(blk, s->hidden_disk->bs, _err);
> > -if (local_err) {
> > -error_propagate(errp, local_err);
> > -blk_unref(blk);
> > -return;
> > -}
> > -
> > -ret = blk_make_empty(blk, errp);
> > -blk_unref(blk);
> > +ret = bdrv_make_empty(s->hidden_disk, errp);
> >   if (ret < 0) {
> >   return;
> >   }
> > --
> > 2.20.1
> >   
> 
> 



-- 



pgpJHRvFkSClE.pgp
Description: OpenPGP digital signature


Re: [PATCH v3 1/4] replication: Remove s->active_disk

2021-07-09 Thread Lukas Straub
On Fri, 9 Jul 2021 10:11:15 +0300
Vladimir Sementsov-Ogievskiy  wrote:

> 07.07.2021 21:15, Lukas Straub wrote:
> > s->active_disk is bs->file. Remove it and use local variables instead.
> > 
> > Signed-off-by: Lukas Straub 
> > ---
> >   block/replication.c | 38 +-
> >   1 file changed, 21 insertions(+), 17 deletions(-)
> > 
> > diff --git a/block/replication.c b/block/replication.c
> > index 52163f2d1f..50940fbe33 100644
> > --- a/block/replication.c
> > +++ b/block/replication.c
> > @@ -35,7 +35,6 @@ typedef enum {
> >   typedef struct BDRVReplicationState {
> >   ReplicationMode mode;
> >   ReplicationStage stage;
> > -BdrvChild *active_disk;
> >   BlockJob *commit_job;
> >   BdrvChild *hidden_disk;
> >   BdrvChild *secondary_disk;
> > @@ -307,11 +306,15 @@ out:
> >   return ret;
> >   }
> > 
> > -static void secondary_do_checkpoint(BDRVReplicationState *s, Error **errp)
> > +static void secondary_do_checkpoint(BlockDriverState *bs, Error **errp)
> >   {
> > +BDRVReplicationState *s = bs->opaque;
> > +BdrvChild *active_disk;  
> 
> Why not to combine initialization into definition:
> 
> BdrvChild *active_disk = bs->file;

Ok, will fix.

> >   Error *local_err = NULL;
> >   int ret;
> > 
> > +active_disk = bs->file;
> > +
> >   if (!s->backup_job) {
> >   error_setg(errp, "Backup job was cancelled unexpectedly");
> >   return;
> > @@ -323,13 +326,13 @@ static void 
> > secondary_do_checkpoint(BDRVReplicationState *s, Error **errp)
> >   return;
> >   }
> > 
> > -if (!s->active_disk->bs->drv) {
> > +if (!active_disk->bs->drv) {
> >   error_setg(errp, "Active disk %s is ejected",
> > -   s->active_disk->bs->node_name);
> > +   active_disk->bs->node_name);
> >   return;
> >   }
> > 
> > -ret = bdrv_make_empty(s->active_disk, errp);
> > +ret = bdrv_make_empty(active_disk, errp);
> >   if (ret < 0) {
> >   return;
> >   }
> > @@ -451,6 +454,7 @@ static void replication_start(ReplicationState *rs, 
> > ReplicationMode mode,
> >   BlockDriverState *bs = rs->opaque;
> >   BDRVReplicationState *s;
> >   BlockDriverState *top_bs;
> > +BdrvChild *active_disk;
> >   int64_t active_length, hidden_length, disk_length;
> >   AioContext *aio_context;
> >   Error *local_err = NULL;
> > @@ -488,15 +492,14 @@ static void replication_start(ReplicationState *rs, 
> > ReplicationMode mode,
> >   case REPLICATION_MODE_PRIMARY:
> >   break;
> >   case REPLICATION_MODE_SECONDARY:
> > -s->active_disk = bs->file;
> > -if (!s->active_disk || !s->active_disk->bs ||
> > -!s->active_disk->bs->backing) {
> > +active_disk = bs->file;  
> 
> Here initializing active_disk only here makes sense: we consider "active 
> disk" only in secondary mode. Right?

Yes.

> > +if (!active_disk || !active_disk->bs || !active_disk->bs->backing) 
> > {
> >   error_setg(errp, "Active disk doesn't have backing file");
> >   aio_context_release(aio_context);
> >   return;
> >   }
> > 
> > -s->hidden_disk = s->active_disk->bs->backing;
> > +s->hidden_disk = active_disk->bs->backing;
> >   if (!s->hidden_disk->bs || !s->hidden_disk->bs->backing) {
> >   error_setg(errp, "Hidden disk doesn't have backing file");
> >   aio_context_release(aio_context);
> > @@ -511,7 +514,7 @@ static void replication_start(ReplicationState *rs, 
> > ReplicationMode mode,
> >   }
> > 
> >   /* verify the length */
> > -active_length = bdrv_getlength(s->active_disk->bs);
> > +active_length = bdrv_getlength(active_disk->bs);
> >   hidden_length = bdrv_getlength(s->hidden_disk->bs);
> >   disk_length = bdrv_getlength(s->secondary_disk->bs);
> >   if (active_length < 0 || hidden_length < 0 || disk_length < 0 ||
> > @@ -523,9 +526,9 @@ static void replication_start(ReplicationState *rs, 
> > ReplicationMode mode,
> >

[PATCH v3 4/4] replication: Remove workaround

2021-07-07 Thread Lukas Straub
Remove the workaround introduced in commit
6ecbc6c52672db5c13805735ca02784879ce8285
"replication: Avoid blk_make_empty() on read-only child".

It is not needed anymore since s->hidden_disk is guaranteed to be
writable when secondary_do_checkpoint() runs. Because replication_start(),
_do_checkpoint() and _stop() are only called by COLO migration code
and COLO-migration doesn't inactivate disks.

Signed-off-by: Lukas Straub 
---
 block/replication.c | 12 +---
 1 file changed, 1 insertion(+), 11 deletions(-)

diff --git a/block/replication.c b/block/replication.c
index c0d4a6c264..68b46d65a8 100644
--- a/block/replication.c
+++ b/block/replication.c
@@ -348,17 +348,7 @@ static void secondary_do_checkpoint(BlockDriverState *bs, 
Error **errp)
 return;
 }

-BlockBackend *blk = blk_new(qemu_get_current_aio_context(),
-BLK_PERM_WRITE, BLK_PERM_ALL);
-blk_insert_bs(blk, s->hidden_disk->bs, _err);
-if (local_err) {
-error_propagate(errp, local_err);
-blk_unref(blk);
-return;
-}
-
-ret = blk_make_empty(blk, errp);
-blk_unref(blk);
+ret = bdrv_make_empty(s->hidden_disk, errp);
 if (ret < 0) {
 return;
 }
--
2.20.1


pgpKvTw0eVhsi.pgp
Description: OpenPGP digital signature


[PATCH v3 3/4] replication: Properly attach children

2021-07-07 Thread Lukas Straub
The replication driver needs access to the children block-nodes of
it's child so it can issue bdrv_make_empty() and bdrv_co_pwritev()
to manage the replication. However, it does this by directly copying
the BdrvChilds, which is wrong.

Fix this by properly attaching the block-nodes with
bdrv_attach_child() and requesting the required permissions.

Signed-off-by: Lukas Straub 
---
 block/replication.c | 30 +++---
 1 file changed, 27 insertions(+), 3 deletions(-)

diff --git a/block/replication.c b/block/replication.c
index 74adf30f54..c0d4a6c264 100644
--- a/block/replication.c
+++ b/block/replication.c
@@ -165,7 +165,12 @@ static void replication_child_perm(BlockDriverState *bs, 
BdrvChild *c,
uint64_t perm, uint64_t shared,
uint64_t *nperm, uint64_t *nshared)
 {
-*nperm = BLK_PERM_CONSISTENT_READ;
+if (role & BDRV_CHILD_PRIMARY) {
+*nperm = BLK_PERM_CONSISTENT_READ;
+} else {
+*nperm = 0;
+}
+
 if ((bs->open_flags & (BDRV_O_INACTIVE | BDRV_O_RDWR)) == BDRV_O_RDWR) {
 *nperm |= BLK_PERM_WRITE;
 }
@@ -552,8 +557,25 @@ static void replication_start(ReplicationState *rs, 
ReplicationMode mode,
 return;
 }

-s->hidden_disk = hidden_disk;
-s->secondary_disk = secondary_disk;
+bdrv_ref(hidden_disk->bs);
+s->hidden_disk = bdrv_attach_child(bs, hidden_disk->bs, "hidden disk",
+   _of_bds, BDRV_CHILD_DATA,
+   _err);
+if (local_err) {
+error_propagate(errp, local_err);
+aio_context_release(aio_context);
+return;
+}
+
+bdrv_ref(secondary_disk->bs);
+s->secondary_disk = bdrv_attach_child(bs, secondary_disk->bs,
+  "secondary disk", _of_bds,
+  BDRV_CHILD_DATA, _err);
+if (local_err) {
+error_propagate(errp, local_err);
+aio_context_release(aio_context);
+return;
+}

 /* start backup job now */
 error_setg(>blocker,
@@ -659,7 +681,9 @@ static void replication_done(void *opaque, int ret)
 if (ret == 0) {
 s->stage = BLOCK_REPLICATION_DONE;

+bdrv_unref_child(bs, s->secondary_disk);
 s->secondary_disk = NULL;
+bdrv_unref_child(bs, s->hidden_disk);
 s->hidden_disk = NULL;
 s->error = 0;
 } else {
--
2.20.1



pgpCxuOT5oetJ.pgp
Description: OpenPGP digital signature


[PATCH v3 2/4] replication: Reduce usage of s->hidden_disk and s->secondary_disk

2021-07-07 Thread Lukas Straub
In preparation for the next patch, initialize s->hidden_disk and
s->secondary_disk later and replace access to them with local variables
in the places where they aren't initialized yet.

Signed-off-by: Lukas Straub 
---
 block/replication.c | 45 -
 1 file changed, 28 insertions(+), 17 deletions(-)

diff --git a/block/replication.c b/block/replication.c
index 50940fbe33..74adf30f54 100644
--- a/block/replication.c
+++ b/block/replication.c
@@ -368,27 +368,35 @@ static void reopen_backing_file(BlockDriverState *bs, 
bool writable,
 Error **errp)
 {
 BDRVReplicationState *s = bs->opaque;
+BdrvChild *hidden_disk, *secondary_disk;
 BlockReopenQueue *reopen_queue = NULL;

+/*
+ * s->hidden_disk and s->secondary_disk may not be set yet, as they will
+ * only be set after the children are writable.
+ */
+hidden_disk = bs->file->bs->backing;
+secondary_disk = hidden_disk->bs->backing;
+
 if (writable) {
-s->orig_hidden_read_only = bdrv_is_read_only(s->hidden_disk->bs);
-s->orig_secondary_read_only = bdrv_is_read_only(s->secondary_disk->bs);
+s->orig_hidden_read_only = bdrv_is_read_only(hidden_disk->bs);
+s->orig_secondary_read_only = bdrv_is_read_only(secondary_disk->bs);
 }

-bdrv_subtree_drained_begin(s->hidden_disk->bs);
-bdrv_subtree_drained_begin(s->secondary_disk->bs);
+bdrv_subtree_drained_begin(hidden_disk->bs);
+bdrv_subtree_drained_begin(secondary_disk->bs);

 if (s->orig_hidden_read_only) {
 QDict *opts = qdict_new();
 qdict_put_bool(opts, BDRV_OPT_READ_ONLY, !writable);
-reopen_queue = bdrv_reopen_queue(reopen_queue, s->hidden_disk->bs,
+reopen_queue = bdrv_reopen_queue(reopen_queue, hidden_disk->bs,
  opts, true);
 }

 if (s->orig_secondary_read_only) {
 QDict *opts = qdict_new();
 qdict_put_bool(opts, BDRV_OPT_READ_ONLY, !writable);
-reopen_queue = bdrv_reopen_queue(reopen_queue, s->secondary_disk->bs,
+reopen_queue = bdrv_reopen_queue(reopen_queue, secondary_disk->bs,
  opts, true);
 }

@@ -396,8 +404,8 @@ static void reopen_backing_file(BlockDriverState *bs, bool 
writable,
 bdrv_reopen_multiple(reopen_queue, errp);
 }

-bdrv_subtree_drained_end(s->hidden_disk->bs);
-bdrv_subtree_drained_end(s->secondary_disk->bs);
+bdrv_subtree_drained_end(hidden_disk->bs);
+bdrv_subtree_drained_end(secondary_disk->bs);
 }

 static void backup_job_cleanup(BlockDriverState *bs)
@@ -454,7 +462,7 @@ static void replication_start(ReplicationState *rs, 
ReplicationMode mode,
 BlockDriverState *bs = rs->opaque;
 BDRVReplicationState *s;
 BlockDriverState *top_bs;
-BdrvChild *active_disk;
+BdrvChild *active_disk, *hidden_disk, *secondary_disk;
 int64_t active_length, hidden_length, disk_length;
 AioContext *aio_context;
 Error *local_err = NULL;
@@ -499,15 +507,15 @@ static void replication_start(ReplicationState *rs, 
ReplicationMode mode,
 return;
 }

-s->hidden_disk = active_disk->bs->backing;
-if (!s->hidden_disk->bs || !s->hidden_disk->bs->backing) {
+hidden_disk = active_disk->bs->backing;
+if (!hidden_disk->bs || !hidden_disk->bs->backing) {
 error_setg(errp, "Hidden disk doesn't have backing file");
 aio_context_release(aio_context);
 return;
 }

-s->secondary_disk = s->hidden_disk->bs->backing;
-if (!s->secondary_disk->bs || !bdrv_has_blk(s->secondary_disk->bs)) {
+secondary_disk = hidden_disk->bs->backing;
+if (!secondary_disk->bs || !bdrv_has_blk(secondary_disk->bs)) {
 error_setg(errp, "The secondary disk doesn't have block backend");
 aio_context_release(aio_context);
 return;
@@ -515,8 +523,8 @@ static void replication_start(ReplicationState *rs, 
ReplicationMode mode,

 /* verify the length */
 active_length = bdrv_getlength(active_disk->bs);
-hidden_length = bdrv_getlength(s->hidden_disk->bs);
-disk_length = bdrv_getlength(s->secondary_disk->bs);
+hidden_length = bdrv_getlength(hidden_disk->bs);
+disk_length = bdrv_getlength(secondary_disk->bs);
 if (active_length < 0 || hidden_length < 0 || disk_length < 0 ||
 active_length != hidden_length || hidden_length != disk_length) {
 error_setg(errp, "Active disk, hidden disk, secondary disk's 
length"
@@ -526,10 +534,10 @@ static void replication_start(ReplicationState *rs, 
ReplicationM

[PATCH v3 1/4] replication: Remove s->active_disk

2021-07-07 Thread Lukas Straub
s->active_disk is bs->file. Remove it and use local variables instead.

Signed-off-by: Lukas Straub 
---
 block/replication.c | 38 +-
 1 file changed, 21 insertions(+), 17 deletions(-)

diff --git a/block/replication.c b/block/replication.c
index 52163f2d1f..50940fbe33 100644
--- a/block/replication.c
+++ b/block/replication.c
@@ -35,7 +35,6 @@ typedef enum {
 typedef struct BDRVReplicationState {
 ReplicationMode mode;
 ReplicationStage stage;
-BdrvChild *active_disk;
 BlockJob *commit_job;
 BdrvChild *hidden_disk;
 BdrvChild *secondary_disk;
@@ -307,11 +306,15 @@ out:
 return ret;
 }

-static void secondary_do_checkpoint(BDRVReplicationState *s, Error **errp)
+static void secondary_do_checkpoint(BlockDriverState *bs, Error **errp)
 {
+BDRVReplicationState *s = bs->opaque;
+BdrvChild *active_disk;
 Error *local_err = NULL;
 int ret;

+active_disk = bs->file;
+
 if (!s->backup_job) {
 error_setg(errp, "Backup job was cancelled unexpectedly");
 return;
@@ -323,13 +326,13 @@ static void secondary_do_checkpoint(BDRVReplicationState 
*s, Error **errp)
 return;
 }

-if (!s->active_disk->bs->drv) {
+if (!active_disk->bs->drv) {
 error_setg(errp, "Active disk %s is ejected",
-   s->active_disk->bs->node_name);
+   active_disk->bs->node_name);
 return;
 }

-ret = bdrv_make_empty(s->active_disk, errp);
+ret = bdrv_make_empty(active_disk, errp);
 if (ret < 0) {
 return;
 }
@@ -451,6 +454,7 @@ static void replication_start(ReplicationState *rs, 
ReplicationMode mode,
 BlockDriverState *bs = rs->opaque;
 BDRVReplicationState *s;
 BlockDriverState *top_bs;
+BdrvChild *active_disk;
 int64_t active_length, hidden_length, disk_length;
 AioContext *aio_context;
 Error *local_err = NULL;
@@ -488,15 +492,14 @@ static void replication_start(ReplicationState *rs, 
ReplicationMode mode,
 case REPLICATION_MODE_PRIMARY:
 break;
 case REPLICATION_MODE_SECONDARY:
-s->active_disk = bs->file;
-if (!s->active_disk || !s->active_disk->bs ||
-!s->active_disk->bs->backing) {
+active_disk = bs->file;
+if (!active_disk || !active_disk->bs || !active_disk->bs->backing) {
 error_setg(errp, "Active disk doesn't have backing file");
 aio_context_release(aio_context);
 return;
 }

-s->hidden_disk = s->active_disk->bs->backing;
+s->hidden_disk = active_disk->bs->backing;
 if (!s->hidden_disk->bs || !s->hidden_disk->bs->backing) {
 error_setg(errp, "Hidden disk doesn't have backing file");
 aio_context_release(aio_context);
@@ -511,7 +514,7 @@ static void replication_start(ReplicationState *rs, 
ReplicationMode mode,
 }

 /* verify the length */
-active_length = bdrv_getlength(s->active_disk->bs);
+active_length = bdrv_getlength(active_disk->bs);
 hidden_length = bdrv_getlength(s->hidden_disk->bs);
 disk_length = bdrv_getlength(s->secondary_disk->bs);
 if (active_length < 0 || hidden_length < 0 || disk_length < 0 ||
@@ -523,9 +526,9 @@ static void replication_start(ReplicationState *rs, 
ReplicationMode mode,
 }

 /* Must be true, or the bdrv_getlength() calls would have failed */
-assert(s->active_disk->bs->drv && s->hidden_disk->bs->drv);
+assert(active_disk->bs->drv && s->hidden_disk->bs->drv);

-if (!s->active_disk->bs->drv->bdrv_make_empty ||
+if (!active_disk->bs->drv->bdrv_make_empty ||
 !s->hidden_disk->bs->drv->bdrv_make_empty) {
 error_setg(errp,
"Active disk or hidden disk doesn't support 
make_empty");
@@ -579,7 +582,7 @@ static void replication_start(ReplicationState *rs, 
ReplicationMode mode,
 s->stage = BLOCK_REPLICATION_RUNNING;

 if (s->mode == REPLICATION_MODE_SECONDARY) {
-secondary_do_checkpoint(s, errp);
+secondary_do_checkpoint(bs, errp);
 }

 s->error = 0;
@@ -608,7 +611,7 @@ static void replication_do_checkpoint(ReplicationState *rs, 
Error **errp)
 }

 if (s->mode == REPLICATION_MODE_SECONDARY) {
-secondary_do_checkpoint(s, errp);
+secondary_do_checkpoint(bs, errp);
 }
 aio_context_release(aio_context);
 }
@@ -645,7 +648,6 @@ static void replication_done(void *opaque, int ret)
 if (ret == 0) {
 s->stage = BLOCK_REPLICATION_DONE;

-s->active_disk = NULL;
 s->secondary_disk = NULL

[PATCH v3 0/4] replication: Properly attach children

2021-07-07 Thread Lukas Straub
Hello Everyone,
A while ago Kevin noticed that the replication driver doesn't properly attach
the children it wants to use. Instead, it directly copies the BdrvChilds from
it's backing file, which is wrong. Ths Patchset fixes the problem and removes
the workaround that was put in place back then.

Regards,
Lukas Straub

Changes:

-v3:
-Split up into multiple patches
-Remove s->active_disk
-Clarify child permissions in commit message

-v2: Test for BDRV_CHILD_PRIMARY in replication_child_perm, since
 bs->file might not be set yet. (Vladimir)

Lukas Straub (4):
  replication: Remove s->active_disk
  replication: Reduce usage of s->hidden_disk and s->secondary_disk
  replication: Properly attach children
  replication: Remove workaround

 block/replication.c | 115 +++-
 1 file changed, 72 insertions(+), 43 deletions(-)

--
2.20.1


pgphmZr8kEpDA.pgp
Description: OpenPGP digital signature


Re: [PATCH] block/replication.c: Properly attach children

2021-07-07 Thread Lukas Straub
Hi,
Thanks for your review. More below.

Btw: There is a overview of the replication design in
docs/block-replication.txt

On Wed, 7 Jul 2021 16:01:31 +0300
Vladimir Sementsov-Ogievskiy  wrote:

> 06.07.2021 19:11, Lukas Straub wrote:
> > The replication driver needs access to the children block-nodes of
> > it's child so it can issue bdrv_make_empty to manage the replication.
> > However, it does this by directly copying the BdrvChilds, which is
> > wrong.
> > 
> > Fix this by properly attaching the block-nodes with
> > bdrv_attach_child().
> > 
> > Also, remove a workaround introduced in commit
> > 6ecbc6c52672db5c13805735ca02784879ce8285
> > "replication: Avoid blk_make_empty() on read-only child".
> > 
> > Signed-off-by: Lukas Straub 
> > ---
> > 
> > -v2: Test for BDRV_CHILD_PRIMARY in replication_child_perm, since
> >   bs->file might not be set yet. (Vladimir)
> > 
> >   block/replication.c | 94 +
> >   1 file changed, 61 insertions(+), 33 deletions(-)
> > 
> > diff --git a/block/replication.c b/block/replication.c
> > index 52163f2d1f..fd8cb728a3 100644
> > --- a/block/replication.c
> > +++ b/block/replication.c
> > @@ -166,7 +166,12 @@ static void replication_child_perm(BlockDriverState 
> > *bs, BdrvChild *c,
> >  uint64_t perm, uint64_t shared,
> >  uint64_t *nperm, uint64_t *nshared)
> >   {
> > -*nperm = BLK_PERM_CONSISTENT_READ;
> > +if (role & BDRV_CHILD_PRIMARY) {
> > +*nperm = BLK_PERM_CONSISTENT_READ;
> > +} else {
> > +*nperm = 0;
> > +}  
> 
> Why you drop READ access for other children? You don't mention it in 
> commit-msg..
> 
> Upd: ok now I see that we are not going to read from hidden_disk child, and 
> that's the only "other" child that worth to mention.
> Still, we should be sure that hidden_disk child gets WRITE permission in case 
> we are going to call bdrv_make_empty on it.

The code below that in replication_child_perm() should make sure of
that or am i misunderstanding it?

Or do you mean that it should always request WRITE regardless of
bs->open_flags & BDRV_O_INACTIVE?

> > +
> >   if ((bs->open_flags & (BDRV_O_INACTIVE | BDRV_O_RDWR)) == 
> > BDRV_O_RDWR) {
> >   *nperm |= BLK_PERM_WRITE;
> >   }
> > @@ -340,17 +345,7 @@ static void 
> > secondary_do_checkpoint(BDRVReplicationState *s, Error **errp)
> >   return;
> >   }
> >   
> > -BlockBackend *blk = blk_new(qemu_get_current_aio_context(),
> > -BLK_PERM_WRITE, BLK_PERM_ALL);
> > -blk_insert_bs(blk, s->hidden_disk->bs, _err);
> > -if (local_err) {
> > -error_propagate(errp, local_err);
> > -blk_unref(blk);
> > -return;
> > -}
> > -
> > -ret = blk_make_empty(blk, errp);
> > -blk_unref(blk);
> > +ret = bdrv_make_empty(s->hidden_disk, errp);  
> 
> So, here you rely on BLK_PERM_WRITE being set in replication_child_perm().. 
> Probably that's OK, however logic is changed. Shouldn't we always require 
> write permission in replication_child_perm() for hidden_disk ?
> 
> >   if (ret < 0) {
> >   return;
> >   }
> > @@ -365,27 +360,35 @@ static void reopen_backing_file(BlockDriverState *bs, 
> > bool writable,
> >   Error **errp)
> >   {
> >   BDRVReplicationState *s = bs->opaque;
> > +BdrvChild *hidden_disk, *secondary_disk;
> >   BlockReopenQueue *reopen_queue = NULL;
> >   
> > +/*
> > + * s->hidden_disk and s->secondary_disk may not be set yet, as they 
> > will
> > + * only be set after the children are writable.
> > + */
> > +hidden_disk = bs->file->bs->backing;
> > +secondary_disk = hidden_disk->bs->backing;
> > +
> >   if (writable) {
> > -s->orig_hidden_read_only = bdrv_is_read_only(s->hidden_disk->bs);
> > -s->orig_secondary_read_only = 
> > bdrv_is_read_only(s->secondary_disk->bs);
> > +s->orig_hidden_read_only = bdrv_is_read_only(hidden_disk->bs);
> > +s->orig_secondary_read_only = 
> > bdrv_is_read_only(secondary_disk->bs);
> >   }
> >   
> > -bdrv_subtree_drained_begin(s->hidden_disk->bs);
> > -bdrv_subtree_drained_begin(s->secondary_disk->bs)

[PATCH] block/replication.c: Properly attach children

2021-07-06 Thread Lukas Straub
The replication driver needs access to the children block-nodes of
it's child so it can issue bdrv_make_empty to manage the replication.
However, it does this by directly copying the BdrvChilds, which is
wrong.

Fix this by properly attaching the block-nodes with
bdrv_attach_child().

Also, remove a workaround introduced in commit
6ecbc6c52672db5c13805735ca02784879ce8285
"replication: Avoid blk_make_empty() on read-only child".

Signed-off-by: Lukas Straub 
---

-v2: Test for BDRV_CHILD_PRIMARY in replication_child_perm, since
 bs->file might not be set yet. (Vladimir)

 block/replication.c | 94 +
 1 file changed, 61 insertions(+), 33 deletions(-)

diff --git a/block/replication.c b/block/replication.c
index 52163f2d1f..fd8cb728a3 100644
--- a/block/replication.c
+++ b/block/replication.c
@@ -166,7 +166,12 @@ static void replication_child_perm(BlockDriverState *bs, 
BdrvChild *c,
uint64_t perm, uint64_t shared,
uint64_t *nperm, uint64_t *nshared)
 {
-*nperm = BLK_PERM_CONSISTENT_READ;
+if (role & BDRV_CHILD_PRIMARY) {
+*nperm = BLK_PERM_CONSISTENT_READ;
+} else {
+*nperm = 0;
+}
+
 if ((bs->open_flags & (BDRV_O_INACTIVE | BDRV_O_RDWR)) == BDRV_O_RDWR) {
 *nperm |= BLK_PERM_WRITE;
 }
@@ -340,17 +345,7 @@ static void secondary_do_checkpoint(BDRVReplicationState 
*s, Error **errp)
 return;
 }
 
-BlockBackend *blk = blk_new(qemu_get_current_aio_context(),
-BLK_PERM_WRITE, BLK_PERM_ALL);
-blk_insert_bs(blk, s->hidden_disk->bs, _err);
-if (local_err) {
-error_propagate(errp, local_err);
-blk_unref(blk);
-return;
-}
-
-ret = blk_make_empty(blk, errp);
-blk_unref(blk);
+ret = bdrv_make_empty(s->hidden_disk, errp);
 if (ret < 0) {
 return;
 }
@@ -365,27 +360,35 @@ static void reopen_backing_file(BlockDriverState *bs, 
bool writable,
 Error **errp)
 {
 BDRVReplicationState *s = bs->opaque;
+BdrvChild *hidden_disk, *secondary_disk;
 BlockReopenQueue *reopen_queue = NULL;
 
+/*
+ * s->hidden_disk and s->secondary_disk may not be set yet, as they will
+ * only be set after the children are writable.
+ */
+hidden_disk = bs->file->bs->backing;
+secondary_disk = hidden_disk->bs->backing;
+
 if (writable) {
-s->orig_hidden_read_only = bdrv_is_read_only(s->hidden_disk->bs);
-s->orig_secondary_read_only = bdrv_is_read_only(s->secondary_disk->bs);
+s->orig_hidden_read_only = bdrv_is_read_only(hidden_disk->bs);
+s->orig_secondary_read_only = bdrv_is_read_only(secondary_disk->bs);
 }
 
-bdrv_subtree_drained_begin(s->hidden_disk->bs);
-bdrv_subtree_drained_begin(s->secondary_disk->bs);
+bdrv_subtree_drained_begin(hidden_disk->bs);
+bdrv_subtree_drained_begin(secondary_disk->bs);
 
 if (s->orig_hidden_read_only) {
 QDict *opts = qdict_new();
 qdict_put_bool(opts, BDRV_OPT_READ_ONLY, !writable);
-reopen_queue = bdrv_reopen_queue(reopen_queue, s->hidden_disk->bs,
+reopen_queue = bdrv_reopen_queue(reopen_queue, hidden_disk->bs,
  opts, true);
 }
 
 if (s->orig_secondary_read_only) {
 QDict *opts = qdict_new();
 qdict_put_bool(opts, BDRV_OPT_READ_ONLY, !writable);
-reopen_queue = bdrv_reopen_queue(reopen_queue, s->secondary_disk->bs,
+reopen_queue = bdrv_reopen_queue(reopen_queue, secondary_disk->bs,
  opts, true);
 }
 
@@ -393,8 +396,8 @@ static void reopen_backing_file(BlockDriverState *bs, bool 
writable,
 bdrv_reopen_multiple(reopen_queue, errp);
 }
 
-bdrv_subtree_drained_end(s->hidden_disk->bs);
-bdrv_subtree_drained_end(s->secondary_disk->bs);
+bdrv_subtree_drained_end(hidden_disk->bs);
+bdrv_subtree_drained_end(secondary_disk->bs);
 }
 
 static void backup_job_cleanup(BlockDriverState *bs)
@@ -451,6 +454,7 @@ static void replication_start(ReplicationState *rs, 
ReplicationMode mode,
 BlockDriverState *bs = rs->opaque;
 BDRVReplicationState *s;
 BlockDriverState *top_bs;
+BdrvChild *active_disk, *hidden_disk, *secondary_disk;
 int64_t active_length, hidden_length, disk_length;
 AioContext *aio_context;
 Error *local_err = NULL;
@@ -488,32 +492,32 @@ static void replication_start(ReplicationState *rs, 
ReplicationMode mode,
 case REPLICATION_MODE_PRIMARY:
 break;
 case REPLICATION_MODE_SECONDARY:
-s->active_disk = bs->file;
-if (!s->active_disk || !s->active_disk->bs ||
-!s->a

[PATCH resend] block/replication.c: Properly attach children

2021-07-04 Thread Lukas Straub
The replication driver needs access to the children block-nodes of
it's child so it can issue bdrv_make_empty to manage the replication.
However, it does this by directly copying the BdrvChilds, which is
wrong.

Fix this by properly attaching the block-nodes with
bdrv_attach_child().

Also, remove a workaround introduced in commit
6ecbc6c52672db5c13805735ca02784879ce8285
"replication: Avoid blk_make_empty() on read-only child".

Signed-off-by: Lukas Straub 
---

Fix CC: email address so the mailing list doesn't reject it.

 block/replication.c | 94 +
 1 file changed, 61 insertions(+), 33 deletions(-)

diff --git a/block/replication.c b/block/replication.c
index 52163f2d1f..426d2b741a 100644
--- a/block/replication.c
+++ b/block/replication.c
@@ -166,7 +166,12 @@ static void replication_child_perm(BlockDriverState *bs, 
BdrvChild *c,
uint64_t perm, uint64_t shared,
uint64_t *nperm, uint64_t *nshared)
 {
-*nperm = BLK_PERM_CONSISTENT_READ;
+if (c == bs->file) {
+*nperm = BLK_PERM_CONSISTENT_READ;
+} else {
+*nperm = 0;
+}
+
 if ((bs->open_flags & (BDRV_O_INACTIVE | BDRV_O_RDWR)) == BDRV_O_RDWR) {
 *nperm |= BLK_PERM_WRITE;
 }
@@ -340,17 +345,7 @@ static void secondary_do_checkpoint(BDRVReplicationState 
*s, Error **errp)
 return;
 }
 
-BlockBackend *blk = blk_new(qemu_get_current_aio_context(),
-BLK_PERM_WRITE, BLK_PERM_ALL);
-blk_insert_bs(blk, s->hidden_disk->bs, _err);
-if (local_err) {
-error_propagate(errp, local_err);
-blk_unref(blk);
-return;
-}
-
-ret = blk_make_empty(blk, errp);
-blk_unref(blk);
+ret = bdrv_make_empty(s->hidden_disk, errp);
 if (ret < 0) {
 return;
 }
@@ -365,27 +360,35 @@ static void reopen_backing_file(BlockDriverState *bs, 
bool writable,
 Error **errp)
 {
 BDRVReplicationState *s = bs->opaque;
+BdrvChild *hidden_disk, *secondary_disk;
 BlockReopenQueue *reopen_queue = NULL;
 
+/*
+ * s->hidden_disk and s->secondary_disk may not be set yet, as they will
+ * only be set after the children are writable.
+ */
+hidden_disk = bs->file->bs->backing;
+secondary_disk = hidden_disk->bs->backing;
+
 if (writable) {
-s->orig_hidden_read_only = bdrv_is_read_only(s->hidden_disk->bs);
-s->orig_secondary_read_only = bdrv_is_read_only(s->secondary_disk->bs);
+s->orig_hidden_read_only = bdrv_is_read_only(hidden_disk->bs);
+s->orig_secondary_read_only = bdrv_is_read_only(secondary_disk->bs);
 }
 
-bdrv_subtree_drained_begin(s->hidden_disk->bs);
-bdrv_subtree_drained_begin(s->secondary_disk->bs);
+bdrv_subtree_drained_begin(hidden_disk->bs);
+bdrv_subtree_drained_begin(secondary_disk->bs);
 
 if (s->orig_hidden_read_only) {
 QDict *opts = qdict_new();
 qdict_put_bool(opts, BDRV_OPT_READ_ONLY, !writable);
-reopen_queue = bdrv_reopen_queue(reopen_queue, s->hidden_disk->bs,
+reopen_queue = bdrv_reopen_queue(reopen_queue, hidden_disk->bs,
  opts, true);
 }
 
 if (s->orig_secondary_read_only) {
 QDict *opts = qdict_new();
 qdict_put_bool(opts, BDRV_OPT_READ_ONLY, !writable);
-reopen_queue = bdrv_reopen_queue(reopen_queue, s->secondary_disk->bs,
+reopen_queue = bdrv_reopen_queue(reopen_queue, secondary_disk->bs,
  opts, true);
 }
 
@@ -393,8 +396,8 @@ static void reopen_backing_file(BlockDriverState *bs, bool 
writable,
 bdrv_reopen_multiple(reopen_queue, errp);
 }
 
-bdrv_subtree_drained_end(s->hidden_disk->bs);
-bdrv_subtree_drained_end(s->secondary_disk->bs);
+bdrv_subtree_drained_end(hidden_disk->bs);
+bdrv_subtree_drained_end(secondary_disk->bs);
 }
 
 static void backup_job_cleanup(BlockDriverState *bs)
@@ -451,6 +454,7 @@ static void replication_start(ReplicationState *rs, 
ReplicationMode mode,
 BlockDriverState *bs = rs->opaque;
 BDRVReplicationState *s;
 BlockDriverState *top_bs;
+BdrvChild *active_disk, *hidden_disk, *secondary_disk;
 int64_t active_length, hidden_length, disk_length;
 AioContext *aio_context;
 Error *local_err = NULL;
@@ -488,32 +492,32 @@ static void replication_start(ReplicationState *rs, 
ReplicationMode mode,
 case REPLICATION_MODE_PRIMARY:
 break;
 case REPLICATION_MODE_SECONDARY:
-s->active_disk = bs->file;
-if (!s->active_disk || !s->active_disk->bs ||
-!s->active_disk->bs->backing) {
+active_disk = bs->file;
+  

[PATCH resend] nbd: register yank function earlier

2021-07-04 Thread Lukas Straub
Although unlikely, qemu might hang in nbd_send_request().

Allow recovery in this case by registering the yank function before
calling it.

Signed-off-by: Lukas Straub 
---

Fix CC: email address so the mailing list doesn't reject it.

 block/nbd.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/block/nbd.c b/block/nbd.c
index 601fccc5ba..f6ff1c4fb4 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -369,32 +369,34 @@ int coroutine_fn 
nbd_co_do_establish_connection(BlockDriverState *bs,
 s->ioc = nbd_co_establish_connection(s->conn, >info, true, errp);
 if (!s->ioc) {
 return -ECONNREFUSED;
 }
 
+yank_register_function(BLOCKDEV_YANK_INSTANCE(s->bs->node_name), nbd_yank,
+   bs);
+
 ret = nbd_handle_updated_info(s->bs, NULL);
 if (ret < 0) {
 /*
  * We have connected, but must fail for other reasons.
  * Send NBD_CMD_DISC as a courtesy to the server.
  */
 NBDRequest request = { .type = NBD_CMD_DISC };
 
 nbd_send_request(s->ioc, );
 
+yank_unregister_function(BLOCKDEV_YANK_INSTANCE(s->bs->node_name),
+ nbd_yank, bs);
 object_unref(OBJECT(s->ioc));
 s->ioc = NULL;
 
 return ret;
 }
 
 qio_channel_set_blocking(s->ioc, false, NULL);
 qio_channel_attach_aio_context(s->ioc, bdrv_get_aio_context(bs));
 
-yank_register_function(BLOCKDEV_YANK_INSTANCE(s->bs->node_name), nbd_yank,
-   bs);
-
 /* successfully connected */
 s->state = NBD_CLIENT_CONNECTED;
 qemu_co_queue_restart_all(>free_sema);
 
 return 0;
-- 
2.32.0


pgpzd_EVcikxB.pgp
Description: OpenPGP digital signature


Re: [RFC PATCH] block/io.c: Flush parent for quorum in generic code

2021-05-18 Thread Lukas Straub
On Wed, 12 May 2021 15:49:57 +0800
Zhang Chen  wrote:

> Fix the issue from this patch:
> [PATCH] block: Flush all children in generic code
> From 883833e29cb800b4d92b5d4736252f4004885191
> 
> Quorum driver do not have the primary child.
> It will caused guest block flush issue when use quorum and NBD.
> The vm guest flushes failed,and then guest filesystem is shutdown.

Hi,
I think the problem is rather that the quorum driver provides
.bdrv_co_flush_to_disk (which predates .bdrv_co_flush) instead of
.bdrv_co_flush. Can you try with the following patch instead?

diff --git a/block/quorum.c b/block/quorum.c
index cfc1436abb..f2c0805000 100644
--- a/block/quorum.c
+++ b/block/quorum.c
@@ -1279,7 +1279,7 @@ static BlockDriver bdrv_quorum = {
 .bdrv_dirname   = quorum_dirname,
 .bdrv_co_block_status   = quorum_co_block_status,
 
-.bdrv_co_flush_to_disk  = quorum_co_flush,
+.bdrv_co_flush  = quorum_co_flush,
 
 .bdrv_getlength = quorum_getlength,


> Signed-off-by: Zhang Chen 
> Reported-by: Minghao Yuan 
> ---
>  block/io.c | 31 ++-
>  1 file changed, 22 insertions(+), 9 deletions(-)
> 
> diff --git a/block/io.c b/block/io.c
> index 35b6c56efc..4dc1873cb9 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -2849,6 +2849,13 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
>  BdrvChild *child;
>  int current_gen;
>  int ret = 0;
> +bool no_primary_child = false;
> +
> +/* Quorum drivers do not have the primary child. */
> +if (!primary_child) {
> +primary_child = bs->file;
> +no_primary_child = true;
> +}
>  
>  bdrv_inc_in_flight(bs);
>  
> @@ -2886,12 +2893,12 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
>  
>  /* But don't actually force it to the disk with cache=unsafe */
>  if (bs->open_flags & BDRV_O_NO_FLUSH) {
> -goto flush_children;
> +goto flush_data;
>  }
>  
>  /* Check if we really need to flush anything */
>  if (bs->flushed_gen == current_gen) {
> -goto flush_children;
> +goto flush_data;
>  }
>  
>  BLKDBG_EVENT(primary_child, BLKDBG_FLUSH_TO_DISK);
> @@ -2938,13 +2945,19 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
>  /* Now flush the underlying protocol.  It will also have BDRV_O_NO_FLUSH
>   * in the case of cache=unsafe, so there are no useless flushes.
>   */
> -flush_children:
> -ret = 0;
> -QLIST_FOREACH(child, >children, next) {
> -if (child->perm & (BLK_PERM_WRITE | BLK_PERM_WRITE_UNCHANGED)) {
> -int this_child_ret = bdrv_co_flush(child->bs);
> -if (!ret) {
> -ret = this_child_ret;
> +flush_data:
> +if (no_primary_child) {
> +/* Flush parent */
> +ret = bs->file ? bdrv_co_flush(bs->file->bs) : 0;
> +} else {
> +/* Flush childrens */
> +ret = 0;
> +QLIST_FOREACH(child, >children, next) {
> +if (child->perm & (BLK_PERM_WRITE | BLK_PERM_WRITE_UNCHANGED)) {
> +int this_child_ret = bdrv_co_flush(child->bs);
> +if (!ret) {
> +ret = this_child_ret;
> +}
>  }
>  }
>  }



-- 



pgpS2ZYi1f_kf.pgp
Description: OpenPGP digital signature


Re: [PATCH v14 0/7] Introduce 'yank' oob qmp command to recover from hanging qemu

2021-01-13 Thread Lukas Straub
On Tue, 12 Jan 2021 17:20:54 +0100
Markus Armbruster  wrote:

> Queued.  Thanks for persevering!
> 

Great, Thanks!

Regards,
Lukas Straub

-- 



pgp1c_zCQLpLx.pgp
Description: OpenPGP digital signature


[PATCH v14 7/7] tests/test-char.c: Wait for the chardev to connect in char_socket_client_dupid_test

2020-12-28 Thread Lukas Straub
A connecting chardev object has an additional reference by the connecting
thread, so if the chardev is still connecting by the end of the test,
then the chardev object won't be freed. This in turn means that the yank
instance won't be unregistered and when running the next test-case
yank_register_instance will abort, because the yank instance is
already/still registered.

Signed-off-by: Lukas Straub 
Reviewed-by: Daniel P. Berrangé 
---
 tests/test-char.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tests/test-char.c b/tests/test-char.c
index 953e0d1c1f..41a76410d8 100644
--- a/tests/test-char.c
+++ b/tests/test-char.c
@@ -937,6 +937,7 @@ static void char_socket_client_dupid_test(gconstpointer 
opaque)
 g_assert_nonnull(opts);
 chr1 = qemu_chr_new_from_opts(opts, NULL, _abort);
 g_assert_nonnull(chr1);
+qemu_chr_wait_connected(chr1, _abort);

 chr2 = qemu_chr_new_from_opts(opts, NULL, _err);
 g_assert_null(chr2);
--
2.29.2


pgpTmkpz7Nuqd.pgp
Description: OpenPGP digital signature


[PATCH v14 1/7] Introduce yank feature

2020-12-28 Thread Lukas Straub
The yank feature allows to recover from hanging qemu by "yanking"
at various parts. Other qemu systems can register themselves and
multiple yank functions. Then all yank functions for selected
instances can be called by the 'yank' out-of-band qmp command.
Available instances can be queried by a 'query-yank' oob command.

Signed-off-by: Lukas Straub 
Acked-by: Stefan Hajnoczi 
Reviewed-by: Markus Armbruster 
---
 MAINTAINERS   |   7 ++
 include/qemu/yank.h   |  97 
 qapi/meson.build  |   1 +
 qapi/qapi-schema.json |   1 +
 qapi/yank.json| 119 
 util/meson.build  |   1 +
 util/yank.c   | 207 ++
 7 files changed, 433 insertions(+)
 create mode 100644 include/qemu/yank.h
 create mode 100644 qapi/yank.json
 create mode 100644 util/yank.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 1e7c8f0488..f465a4045a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2716,6 +2716,13 @@ F: util/uuid.c
 F: include/qemu/uuid.h
 F: tests/test-uuid.c

+Yank feature
+M: Lukas Straub 
+S: Odd fixes
+F: util/yank.c
+F: include/qemu/yank.h
+F: qapi/yank.json
+
 COLO Framework
 M: zhanghailiang 
 S: Maintained
diff --git a/include/qemu/yank.h b/include/qemu/yank.h
new file mode 100644
index 00..5b93c70cbf
--- /dev/null
+++ b/include/qemu/yank.h
@@ -0,0 +1,97 @@
+/*
+ * QEMU yank feature
+ *
+ * Copyright (c) Lukas Straub 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#ifndef YANK_H
+#define YANK_H
+
+#include "qapi/qapi-types-yank.h"
+
+typedef void (YankFn)(void *opaque);
+
+/**
+ * yank_register_instance: Register a new instance.
+ *
+ * This registers a new instance for yanking. Must be called before any yank
+ * function is registered for this instance.
+ *
+ * This function is thread-safe.
+ *
+ * @instance: The instance.
+ * @errp: Error object.
+ *
+ * Returns true on success or false if an error occured.
+ */
+bool yank_register_instance(const YankInstance *instance, Error **errp);
+
+/**
+ * yank_unregister_instance: Unregister a instance.
+ *
+ * This unregisters a instance. Must be called only after every yank function
+ * of the instance has been unregistered.
+ *
+ * This function is thread-safe.
+ *
+ * @instance: The instance.
+ */
+void yank_unregister_instance(const YankInstance *instance);
+
+/**
+ * yank_register_function: Register a yank function
+ *
+ * This registers a yank function. All limitations of qmp oob commands apply
+ * to the yank function as well. See docs/devel/qapi-code-gen.txt under
+ * "An OOB-capable command handler must satisfy the following conditions".
+ *
+ * This function is thread-safe.
+ *
+ * @instance: The instance.
+ * @func: The yank function.
+ * @opaque: Will be passed to the yank function.
+ */
+void yank_register_function(const YankInstance *instance,
+YankFn *func,
+void *opaque);
+
+/**
+ * yank_unregister_function: Unregister a yank function
+ *
+ * This unregisters a yank function.
+ *
+ * This function is thread-safe.
+ *
+ * @instance: The instance.
+ * @func: func that was passed to yank_register_function.
+ * @opaque: opaque that was passed to yank_register_function.
+ */
+void yank_unregister_function(const YankInstance *instance,
+  YankFn *func,
+  void *opaque);
+
+/**
+ * yank_generic_iochannel: Generic yank function for iochannel
+ *
+ * This is a generic yank function which will call qio_channel_shutdown on the
+ * provided QIOChannel.
+ *
+ * @opaque: QIOChannel to shutdown
+ */
+void yank_generic_iochannel(void *opaque);
+
+#define BLOCKDEV_YANK_INSTANCE(the_node_name) (&(YankInstance) { \
+.type = YANK_INSTANCE_TYPE_BLOCK_NODE, \
+.u.block_node.node_name = (the_node_name) })
+
+#define CHARDEV_YANK_INSTANCE(the_id) (&(YankInstance) { \
+.type = YANK_INSTANCE_TYPE_CHARDEV, \
+.u.chardev.id = (the_id) })
+
+#define MIGRATION_YANK_INSTANCE (&(YankInstance) { \
+.type = YANK_INSTANCE_TYPE_MIGRATION })
+
+#endif
diff --git a/qapi/meson.build b/qapi/meson.build
index 0e98146f1f..ab68e7900e 100644
--- a/qapi/meson.build
+++ b/qapi/meson.build
@@ -47,6 +47,7 @@ qapi_all_modules = [
   'trace',
   'transaction',
   'ui',
+  'yank',
 ]

 qapi_storage_daemon_modules = [
diff --git a/qapi/qapi-schema.json b/qapi/qapi-schema.json
index 0b444b76d2..3441c9a9ae 100644
--- a/qapi/qapi-schema.json
+++ b/qapi/qapi-schema.json
@@ -86,6 +86,7 @@
 { 'include': 'machine.json' }
 { 'include': 'machine-target.json' }
 { 'include': 'replay.json' }
+{ 'include': 'yank.json' }
 { 'include': 'misc.json' }
 { 'include': 'misc-target.json' }
 { 'include': 'audio.json' }
diff --git a/qapi/yank.json b/qapi/yank.json
new file mode 100644
index 00..167a775594
--- /dev/null
+++ b/qa

[PATCH v14 6/7] io: Document qmp oob suitability of qio_channel_shutdown and io_shutdown

2020-12-28 Thread Lukas Straub
Migration and yank code assume that qio_channel_shutdown is thread
-safe and can be called from qmp oob handler. Document this after
checking the code.

Signed-off-by: Lukas Straub 
Acked-by: Stefan Hajnoczi 
Reviewed-by: Daniel P. Berrangé 
---
 include/io/channel.h | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/include/io/channel.h b/include/io/channel.h
index 4d6fe45f63..ab9ea77959 100644
--- a/include/io/channel.h
+++ b/include/io/channel.h
@@ -92,7 +92,8 @@ struct QIOChannel {
  * provide additional optional features.
  *
  * Consult the corresponding public API docs for a description
- * of the semantics of each callback
+ * of the semantics of each callback. io_shutdown in particular
+ * must be thread-safe, terminate quickly and must not block.
  */
 struct QIOChannelClass {
 ObjectClass parent;
@@ -510,6 +511,8 @@ int qio_channel_close(QIOChannel *ioc,
  * QIO_CHANNEL_FEATURE_SHUTDOWN prior to calling
  * this method.
  *
+ * This function is thread-safe, terminates quickly and does not block.
+ *
  * Returns: 0 on success, -1 on error
  */
 int qio_channel_shutdown(QIOChannel *ioc,
--
2.29.2



pgpR98wwisbJU.pgp
Description: OpenPGP digital signature


[PATCH v14 0/7] Introduce 'yank' oob qmp command to recover from hanging qemu

2020-12-28 Thread Lukas Straub

Hello Everyone,
So here is v14.

Changes:

v14:
 -fix checkpatch.pl warning

v13:
 -Address Marc-André Lureau comments:
  -make yank_register_instance return bool
  -rename yank_compare_instances to yank_instance_equal
  -remove breaks
  -use g_str_equal instead of strcmp
  -use g_new0 instead of g_slice_new
  -use QEMU_LOCK_GUARD instead of qemu_mutex_lock/unlock

v12:
 -rebase onto master
  -minor change to migration (removal of "defer" branch in 
qemu_start_incoming_migration)
 -add Reviewed-by tags

v11:
 -squashed MAINTAINERS update into patch 1
 -move qmp doc of yank before misc
 -add title for qmp docs
 -change "Since:" to 6.0
 -add Reviewed-by tags

v10:
 -moved from qapi/misc.json to qapi/yank.json
 -rename 'blockdev' -> 'block-node'
 -document difference betwen migration yank instance and migrate_cancel
 -better document return values of yank command
 -better document yank_lock
 -minor style and spelling fixes

v9:
 -rebase onto master
 -implemented new qmp api as proposed by Markus

v8:
 -add Reviewed-by and Acked-by tags
 -rebase onto master
  -minor change to migration
  -convert to meson
 -change "Since:" to 5.2
 -varios code style fixes (Markus Armbruster)
 -point to oob restrictions in comment to yank_register_function
  (Markus Armbruster)
 -improve qmp documentation (Markus Armbruster)
 -document oob suitability of qio_channel and io_shutdown (Markus Armbruster)

v7:
 -yank_register_instance now returns error via Error **errp instead of aborting
 -dropped "chardev/char.c: Check for duplicate id before  creating chardev"

v6:
 -add Reviewed-by and Acked-by tags
 -rebase on master
 -lots of changes in nbd due to rebase
 -only take maintainership of util/yank.c and include/qemu/yank.h (Daniel P. 
Berrangé)
 -fix a crash discovered by the newly added chardev test
 -fix the test itself

v5:
 -move yank.c to util/
 -move yank.h to include/qemu/
 -add license to yank.h
 -use const char*
 -nbd: use atomic_store_release and atomic_load_aqcuire
 -io-channel: ensure thread-safety and document it
 -add myself as maintainer for yank

v4:
 -fix build errors...

v3:
 -don't touch softmmu/vl.c, use __contructor__ attribute instead (Paolo Bonzini)
 -fix build errors
 -rewrite migration patch so it actually passes all tests

v2:
 -don't touch io/ code anymore
 -always register yank functions
 -'yank' now takes a list of instances to yank
 -'query-yank' returns a list of yankable instances

Overview:
Hello Everyone,
In many cases, if qemu has a network connection (qmp, migration, chardev, etc.)
to some other server and that server dies or hangs, qemu hangs too.
These patches introduce the new 'yank' out-of-band qmp command to recover from
these kinds of hangs. The different subsystems register callbacks which get
executed with the yank command. For example the callback can shutdown() a
socket. This is intended for the colo use-case, but it can be used for other
things too of course.

Regards,
Lukas Straub

Lukas Straub (7):
  Introduce yank feature
  block/nbd.c: Add yank feature
  chardev/char-socket.c: Add yank feature
  migration: Add yank feature
  io/channel-tls.c: make qio_channel_tls_shutdown thread-safe
  io: Document qmp oob suitability of qio_channel_shutdown and
io_shutdown
  tests/test-char.c: Wait for the chardev to connect in
char_socket_client_dupid_test

 MAINTAINERS   |   7 ++
 block/nbd.c   | 153 +++--
 chardev/char-socket.c |  34 ++
 include/io/channel.h  |   5 +-
 include/qemu/yank.h   |  97 
 io/channel-tls.c  |   6 +-
 migration/channel.c   |  13 +++
 migration/migration.c |  22 
 migration/multifd.c   |  10 ++
 migration/qemu-file-channel.c |   7 ++
 migration/savevm.c|   5 +
 qapi/meson.build  |   1 +
 qapi/qapi-schema.json |   1 +
 qapi/yank.json| 119 +++
 tests/test-char.c |   1 +
 util/meson.build  |   1 +
 util/yank.c   | 207 ++
 17 files changed, 625 insertions(+), 64 deletions(-)
 create mode 100644 include/qemu/yank.h
 create mode 100644 qapi/yank.json
 create mode 100644 util/yank.c

--
2.29.2


pgpRGSQfcEZb6.pgp
Description: OpenPGP digital signature


[PATCH v14 5/7] io/channel-tls.c: make qio_channel_tls_shutdown thread-safe

2020-12-28 Thread Lukas Straub
Make qio_channel_tls_shutdown thread-safe by using atomics when
accessing tioc->shutdown.

Signed-off-by: Lukas Straub 
Acked-by: Stefan Hajnoczi 
Reviewed-by: Daniel P. Berrangé 
---
 io/channel-tls.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/io/channel-tls.c b/io/channel-tls.c
index 388f019977..2ae1b92fc0 100644
--- a/io/channel-tls.c
+++ b/io/channel-tls.c
@@ -23,6 +23,7 @@
 #include "qemu/module.h"
 #include "io/channel-tls.h"
 #include "trace.h"
+#include "qemu/atomic.h"


 static ssize_t qio_channel_tls_write_handler(const char *buf,
@@ -277,7 +278,8 @@ static ssize_t qio_channel_tls_readv(QIOChannel *ioc,
 return QIO_CHANNEL_ERR_BLOCK;
 }
 } else if (errno == ECONNABORTED &&
-   (tioc->shutdown & QIO_CHANNEL_SHUTDOWN_READ)) {
+   (qatomic_load_acquire(>shutdown) &
+QIO_CHANNEL_SHUTDOWN_READ)) {
 return 0;
 }

@@ -361,7 +363,7 @@ static int qio_channel_tls_shutdown(QIOChannel *ioc,
 {
 QIOChannelTLS *tioc = QIO_CHANNEL_TLS(ioc);

-tioc->shutdown |= how;
+qatomic_or(>shutdown, how);

 return qio_channel_shutdown(tioc->master, how, errp);
 }
--
2.29.2



pgp5KOsFpXQ5r.pgp
Description: OpenPGP digital signature


[PATCH v14 3/7] chardev/char-socket.c: Add yank feature

2020-12-28 Thread Lukas Straub
Register a yank function to shutdown the socket on yank.

Signed-off-by: Lukas Straub 
Acked-by: Stefan Hajnoczi 
---
 chardev/char-socket.c | 34 ++
 1 file changed, 34 insertions(+)

diff --git a/chardev/char-socket.c b/chardev/char-socket.c
index 213a4c8dd0..8a707d766c 100644
--- a/chardev/char-socket.c
+++ b/chardev/char-socket.c
@@ -34,6 +34,7 @@
 #include "qapi/error.h"
 #include "qapi/clone-visitor.h"
 #include "qapi/qapi-visit-sockets.h"
+#include "qemu/yank.h"

 #include "chardev/char-io.h"
 #include "qom/object.h"
@@ -70,6 +71,7 @@ struct SocketChardev {
 size_t read_msgfds_num;
 int *write_msgfds;
 size_t write_msgfds_num;
+bool registered_yank;

 SocketAddress *addr;
 bool is_listen;
@@ -415,6 +417,12 @@ static void tcp_chr_free_connection(Chardev *chr)

 tcp_set_msgfds(chr, NULL, 0);
 remove_fd_in_watch(chr);
+if (s->state == TCP_CHARDEV_STATE_CONNECTING
+|| s->state == TCP_CHARDEV_STATE_CONNECTED) {
+yank_unregister_function(CHARDEV_YANK_INSTANCE(chr->label),
+ yank_generic_iochannel,
+ QIO_CHANNEL(s->sioc));
+}
 object_unref(OBJECT(s->sioc));
 s->sioc = NULL;
 object_unref(OBJECT(s->ioc));
@@ -932,6 +940,9 @@ static int tcp_chr_add_client(Chardev *chr, int fd)
 }
 tcp_chr_change_state(s, TCP_CHARDEV_STATE_CONNECTING);
 tcp_chr_set_client_ioc_name(chr, sioc);
+yank_register_function(CHARDEV_YANK_INSTANCE(chr->label),
+   yank_generic_iochannel,
+   QIO_CHANNEL(sioc));
 ret = tcp_chr_new_client(chr, sioc);
 object_unref(OBJECT(sioc));
 return ret;
@@ -946,6 +957,9 @@ static void tcp_chr_accept(QIONetListener *listener,

 tcp_chr_change_state(s, TCP_CHARDEV_STATE_CONNECTING);
 tcp_chr_set_client_ioc_name(chr, cioc);
+yank_register_function(CHARDEV_YANK_INSTANCE(chr->label),
+   yank_generic_iochannel,
+   QIO_CHANNEL(cioc));
 tcp_chr_new_client(chr, cioc);
 }

@@ -961,6 +975,9 @@ static int tcp_chr_connect_client_sync(Chardev *chr, Error 
**errp)
 object_unref(OBJECT(sioc));
 return -1;
 }
+yank_register_function(CHARDEV_YANK_INSTANCE(chr->label),
+   yank_generic_iochannel,
+   QIO_CHANNEL(sioc));
 tcp_chr_new_client(chr, sioc);
 object_unref(OBJECT(sioc));
 return 0;
@@ -976,6 +993,9 @@ static void tcp_chr_accept_server_sync(Chardev *chr)
 tcp_chr_change_state(s, TCP_CHARDEV_STATE_CONNECTING);
 sioc = qio_net_listener_wait_client(s->listener);
 tcp_chr_set_client_ioc_name(chr, sioc);
+yank_register_function(CHARDEV_YANK_INSTANCE(chr->label),
+   yank_generic_iochannel,
+   QIO_CHANNEL(sioc));
 tcp_chr_new_client(chr, sioc);
 object_unref(OBJECT(sioc));
 }
@@ -1086,6 +1106,9 @@ static void char_socket_finalize(Object *obj)
 object_unref(OBJECT(s->tls_creds));
 }
 g_free(s->tls_authz);
+if (s->registered_yank) {
+yank_unregister_instance(CHARDEV_YANK_INSTANCE(chr->label));
+}

 qemu_chr_be_event(chr, CHR_EVENT_CLOSED);
 }
@@ -1101,6 +1124,9 @@ static void qemu_chr_socket_connected(QIOTask *task, void 
*opaque)

 if (qio_task_propagate_error(task, )) {
 tcp_chr_change_state(s, TCP_CHARDEV_STATE_DISCONNECTED);
+yank_unregister_function(CHARDEV_YANK_INSTANCE(chr->label),
+ yank_generic_iochannel,
+ QIO_CHANNEL(sioc));
 check_report_connect_error(chr, err);
 goto cleanup;
 }
@@ -1134,6 +1160,9 @@ static void tcp_chr_connect_client_async(Chardev *chr)
 tcp_chr_change_state(s, TCP_CHARDEV_STATE_CONNECTING);
 sioc = qio_channel_socket_new();
 tcp_chr_set_client_ioc_name(chr, sioc);
+yank_register_function(CHARDEV_YANK_INSTANCE(chr->label),
+   yank_generic_iochannel,
+   QIO_CHANNEL(sioc));
 /*
  * Normally code would use the qio_channel_socket_connect_async
  * method which uses a QIOTask + qio_task_set_error internally
@@ -1376,6 +1405,11 @@ static void qmp_chardev_open_socket(Chardev *chr,
 qemu_chr_set_feature(chr, QEMU_CHAR_FEATURE_FD_PASS);
 }

+if (!yank_register_instance(CHARDEV_YANK_INSTANCE(chr->label), errp)) {
+return;
+}
+s->registered_yank = true;
+
 /* be isn't opened until we get a connection */
 *be_opened = false;

--
2.29.2



pgpyamnb7S8ue.pgp
Description: OpenPGP digital signature


[PATCH v14 4/7] migration: Add yank feature

2020-12-28 Thread Lukas Straub
Register yank functions on sockets to shut them down.

Signed-off-by: Lukas Straub 
Acked-by: Stefan Hajnoczi 
Acked-by: Dr. David Alan Gilbert 
---
 migration/channel.c   | 13 +
 migration/migration.c | 22 ++
 migration/multifd.c   | 10 ++
 migration/qemu-file-channel.c |  7 +++
 migration/savevm.c|  5 +
 5 files changed, 57 insertions(+)

diff --git a/migration/channel.c b/migration/channel.c
index 8a783baa0b..35fe234e9c 100644
--- a/migration/channel.c
+++ b/migration/channel.c
@@ -18,6 +18,8 @@
 #include "trace.h"
 #include "qapi/error.h"
 #include "io/channel-tls.h"
+#include "io/channel-socket.h"
+#include "qemu/yank.h"

 /**
  * @migration_channel_process_incoming - Create new incoming migration channel
@@ -35,6 +37,11 @@ void migration_channel_process_incoming(QIOChannel *ioc)
 trace_migration_set_incoming_channel(
 ioc, object_get_typename(OBJECT(ioc)));

+if (object_dynamic_cast(OBJECT(ioc), TYPE_QIO_CHANNEL_SOCKET)) {
+yank_register_function(MIGRATION_YANK_INSTANCE, yank_generic_iochannel,
+   QIO_CHANNEL(ioc));
+}
+
 if (s->parameters.tls_creds &&
 *s->parameters.tls_creds &&
 !object_dynamic_cast(OBJECT(ioc),
@@ -67,6 +74,12 @@ void migration_channel_connect(MigrationState *s,
 ioc, object_get_typename(OBJECT(ioc)), hostname, error);

 if (!error) {
+if (object_dynamic_cast(OBJECT(ioc), TYPE_QIO_CHANNEL_SOCKET)) {
+yank_register_function(MIGRATION_YANK_INSTANCE,
+   yank_generic_iochannel,
+   QIO_CHANNEL(ioc));
+}
+
 if (s->parameters.tls_creds &&
 *s->parameters.tls_creds &&
 !object_dynamic_cast(OBJECT(ioc),
diff --git a/migration/migration.c b/migration/migration.c
index e0dbde4091..92f7cb70b2 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -56,6 +56,7 @@
 #include "net/announce.h"
 #include "qemu/queue.h"
 #include "multifd.h"
+#include "qemu/yank.h"

 #ifdef CONFIG_VFIO
 #include "hw/vfio/vfio-common.h"
@@ -254,6 +255,8 @@ void migration_incoming_state_destroy(void)
 qapi_free_SocketAddressList(mis->socket_address_list);
 mis->socket_address_list = NULL;
 }
+
+yank_unregister_instance(MIGRATION_YANK_INSTANCE);
 }

 static void migrate_generate_event(int new_state)
@@ -418,6 +421,10 @@ static void qemu_start_incoming_migration(const char *uri, 
Error **errp)
 {
 const char *p = NULL;

+if (!yank_register_instance(MIGRATION_YANK_INSTANCE, errp)) {
+return;
+}
+
 qapi_event_send_migration(MIGRATION_STATUS_SETUP);
 if (strstart(uri, "tcp:", ) ||
 strstart(uri, "unix:", NULL) ||
@@ -432,6 +439,7 @@ static void qemu_start_incoming_migration(const char *uri, 
Error **errp)
 } else if (strstart(uri, "fd:", )) {
 fd_start_incoming_migration(p, errp);
 } else {
+yank_unregister_instance(MIGRATION_YANK_INSTANCE);
 error_setg(errp, "unknown migration protocol: %s", uri);
 }
 }
@@ -1737,6 +1745,7 @@ static void migrate_fd_cleanup(MigrationState *s)
 }
 notifier_list_notify(_state_notifiers, s);
 block_cleanup_parameters(s);
+yank_unregister_instance(MIGRATION_YANK_INSTANCE);
 }

 static void migrate_fd_cleanup_schedule(MigrationState *s)
@@ -2011,6 +2020,7 @@ void qmp_migrate_recover(const char *uri, Error **errp)
  * only re-setup the migration stream and poke existing migration
  * to continue using that newly established channel.
  */
+yank_unregister_instance(MIGRATION_YANK_INSTANCE);
 qemu_start_incoming_migration(uri, errp);
 }

@@ -2148,6 +2158,12 @@ void qmp_migrate(const char *uri, bool has_blk, bool blk,
 return;
 }

+if (!(has_resume && resume)) {
+if (!yank_register_instance(MIGRATION_YANK_INSTANCE, errp)) {
+return;
+}
+}
+
 if (strstart(uri, "tcp:", ) ||
 strstart(uri, "unix:", NULL) ||
 strstart(uri, "vsock:", NULL)) {
@@ -2161,6 +2177,9 @@ void qmp_migrate(const char *uri, bool has_blk, bool blk,
 } else if (strstart(uri, "fd:", )) {
 fd_start_outgoing_migration(s, p, _err);
 } else {
+if (!(has_resume && resume)) {
+yank_unregister_instance(MIGRATION_YANK_INSTANCE);
+}
 error_setg(errp, QERR_INVALID_PARAMETER_VALUE, "uri",
"a valid migration protocol");
 migrate_set_state(>state, MIGRATION_STATUS_SETUP,
@@ -2170,6 +2189,9 @@ void qmp_migrate(const char *uri, bool has_blk, bool blk,
 }

 if (local_err) {
+if (!(has

[PATCH v14 2/7] block/nbd.c: Add yank feature

2020-12-28 Thread Lukas Straub
Register a yank function which shuts down the socket and sets
s->state = NBD_CLIENT_QUIT. This is the same behaviour as if an
error occured.

Signed-off-by: Lukas Straub 
Acked-by: Stefan Hajnoczi 
Reviewed-by: Eric Blake 
---
 block/nbd.c | 153 +++-
 1 file changed, 92 insertions(+), 61 deletions(-)

diff --git a/block/nbd.c b/block/nbd.c
index 42536702b6..0f8d17db6a 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -35,6 +35,7 @@
 #include "qemu/option.h"
 #include "qemu/cutils.h"
 #include "qemu/main-loop.h"
+#include "qemu/atomic.h"

 #include "qapi/qapi-visit-sockets.h"
 #include "qapi/qmp/qstring.h"
@@ -44,6 +45,8 @@
 #include "block/nbd.h"
 #include "block/block_int.h"

+#include "qemu/yank.h"
+
 #define EN_OPTSTR ":exportname="
 #define MAX_NBD_REQUESTS16

@@ -141,14 +144,13 @@ typedef struct BDRVNBDState {
 NBDConnectThread *connect_thread;
 } BDRVNBDState;

-static QIOChannelSocket *nbd_establish_connection(SocketAddress *saddr,
-  Error **errp);
-static QIOChannelSocket *nbd_co_establish_connection(BlockDriverState *bs,
- Error **errp);
+static int nbd_establish_connection(BlockDriverState *bs, SocketAddress *saddr,
+Error **errp);
+static int nbd_co_establish_connection(BlockDriverState *bs, Error **errp);
 static void nbd_co_establish_connection_cancel(BlockDriverState *bs,
bool detach);
-static int nbd_client_handshake(BlockDriverState *bs, QIOChannelSocket *sioc,
-Error **errp);
+static int nbd_client_handshake(BlockDriverState *bs, Error **errp);
+static void nbd_yank(void *opaque);

 static void nbd_clear_bdrvstate(BDRVNBDState *s)
 {
@@ -166,12 +168,12 @@ static void nbd_clear_bdrvstate(BDRVNBDState *s)
 static void nbd_channel_error(BDRVNBDState *s, int ret)
 {
 if (ret == -EIO) {
-if (s->state == NBD_CLIENT_CONNECTED) {
+if (qatomic_load_acquire(>state) == NBD_CLIENT_CONNECTED) {
 s->state = s->reconnect_delay ? NBD_CLIENT_CONNECTING_WAIT :
 NBD_CLIENT_CONNECTING_NOWAIT;
 }
 } else {
-if (s->state == NBD_CLIENT_CONNECTED) {
+if (qatomic_load_acquire(>state) == NBD_CLIENT_CONNECTED) {
 qio_channel_shutdown(s->ioc, QIO_CHANNEL_SHUTDOWN_BOTH, NULL);
 }
 s->state = NBD_CLIENT_QUIT;
@@ -204,7 +206,7 @@ static void reconnect_delay_timer_cb(void *opaque)
 {
 BDRVNBDState *s = opaque;

-if (s->state == NBD_CLIENT_CONNECTING_WAIT) {
+if (qatomic_load_acquire(>state) == NBD_CLIENT_CONNECTING_WAIT) {
 s->state = NBD_CLIENT_CONNECTING_NOWAIT;
 while (qemu_co_enter_next(>free_sema, NULL)) {
 /* Resume all queued requests */
@@ -216,7 +218,7 @@ static void reconnect_delay_timer_cb(void *opaque)

 static void reconnect_delay_timer_init(BDRVNBDState *s, uint64_t 
expire_time_ns)
 {
-if (s->state != NBD_CLIENT_CONNECTING_WAIT) {
+if (qatomic_load_acquire(>state) != NBD_CLIENT_CONNECTING_WAIT) {
 return;
 }

@@ -261,7 +263,7 @@ static void nbd_client_attach_aio_context(BlockDriverState 
*bs,
  * s->connection_co is either yielded from nbd_receive_reply or from
  * nbd_co_reconnect_loop()
  */
-if (s->state == NBD_CLIENT_CONNECTED) {
+if (qatomic_load_acquire(>state) == NBD_CLIENT_CONNECTED) {
 qio_channel_attach_aio_context(QIO_CHANNEL(s->ioc), new_context);
 }

@@ -287,7 +289,7 @@ static void coroutine_fn 
nbd_client_co_drain_begin(BlockDriverState *bs)

 reconnect_delay_timer_del(s);

-if (s->state == NBD_CLIENT_CONNECTING_WAIT) {
+if (qatomic_load_acquire(>state) == NBD_CLIENT_CONNECTING_WAIT) {
 s->state = NBD_CLIENT_CONNECTING_NOWAIT;
 qemu_co_queue_restart_all(>free_sema);
 }
@@ -338,13 +340,14 @@ static void nbd_teardown_connection(BlockDriverState *bs)

 static bool nbd_client_connecting(BDRVNBDState *s)
 {
-return s->state == NBD_CLIENT_CONNECTING_WAIT ||
-s->state == NBD_CLIENT_CONNECTING_NOWAIT;
+NBDClientState state = qatomic_load_acquire(>state);
+return state == NBD_CLIENT_CONNECTING_WAIT ||
+state == NBD_CLIENT_CONNECTING_NOWAIT;
 }

 static bool nbd_client_connecting_wait(BDRVNBDState *s)
 {
-return s->state == NBD_CLIENT_CONNECTING_WAIT;
+return qatomic_load_acquire(>state) == NBD_CLIENT_CONNECTING_WAIT;
 }

 static void connect_bh(void *opaque)
@@ -424,12 +427,12 @@ static void *connect_thread_func(void *opaque)
 return NULL;
 }

-static QIOChannelSocket *coroutine_fn
+static int coroutine_fn
 nbd_co_establish_connection(BlockDriverState *bs, Err

[PATCH v13 6/7] io: Document qmp oob suitability of qio_channel_shutdown and io_shutdown

2020-12-28 Thread Lukas Straub
Migration and yank code assume that qio_channel_shutdown is thread
-safe and can be called from qmp oob handler. Document this after
checking the code.

Signed-off-by: Lukas Straub 
Acked-by: Stefan Hajnoczi 
Reviewed-by: Daniel P. Berrangé 
---
 include/io/channel.h | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/include/io/channel.h b/include/io/channel.h
index 4d6fe45f63..ab9ea77959 100644
--- a/include/io/channel.h
+++ b/include/io/channel.h
@@ -92,7 +92,8 @@ struct QIOChannel {
  * provide additional optional features.
  *
  * Consult the corresponding public API docs for a description
- * of the semantics of each callback
+ * of the semantics of each callback. io_shutdown in particular
+ * must be thread-safe, terminate quickly and must not block.
  */
 struct QIOChannelClass {
 ObjectClass parent;
@@ -510,6 +511,8 @@ int qio_channel_close(QIOChannel *ioc,
  * QIO_CHANNEL_FEATURE_SHUTDOWN prior to calling
  * this method.
  *
+ * This function is thread-safe, terminates quickly and does not block.
+ *
  * Returns: 0 on success, -1 on error
  */
 int qio_channel_shutdown(QIOChannel *ioc,
--
2.29.2



pgpk0UhyEYFYF.pgp
Description: OpenPGP digital signature


[PATCH v13 7/7] tests/test-char.c: Wait for the chardev to connect in char_socket_client_dupid_test

2020-12-28 Thread Lukas Straub
A connecting chardev object has an additional reference by the connecting
thread, so if the chardev is still connecting by the end of the test,
then the chardev object won't be freed. This in turn means that the yank
instance won't be unregistered and when running the next test-case
yank_register_instance will abort, because the yank instance is
already/still registered.

Signed-off-by: Lukas Straub 
Reviewed-by: Daniel P. Berrangé 
---
 tests/test-char.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tests/test-char.c b/tests/test-char.c
index 953e0d1c1f..41a76410d8 100644
--- a/tests/test-char.c
+++ b/tests/test-char.c
@@ -937,6 +937,7 @@ static void char_socket_client_dupid_test(gconstpointer 
opaque)
 g_assert_nonnull(opts);
 chr1 = qemu_chr_new_from_opts(opts, NULL, _abort);
 g_assert_nonnull(chr1);
+qemu_chr_wait_connected(chr1, _abort);

 chr2 = qemu_chr_new_from_opts(opts, NULL, _err);
 g_assert_null(chr2);
--
2.29.2


pgpJuVgXKz72n.pgp
Description: OpenPGP digital signature


[PATCH v13 5/7] io/channel-tls.c: make qio_channel_tls_shutdown thread-safe

2020-12-28 Thread Lukas Straub
Make qio_channel_tls_shutdown thread-safe by using atomics when
accessing tioc->shutdown.

Signed-off-by: Lukas Straub 
Acked-by: Stefan Hajnoczi 
Reviewed-by: Daniel P. Berrangé 
---
 io/channel-tls.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/io/channel-tls.c b/io/channel-tls.c
index 388f019977..2ae1b92fc0 100644
--- a/io/channel-tls.c
+++ b/io/channel-tls.c
@@ -23,6 +23,7 @@
 #include "qemu/module.h"
 #include "io/channel-tls.h"
 #include "trace.h"
+#include "qemu/atomic.h"


 static ssize_t qio_channel_tls_write_handler(const char *buf,
@@ -277,7 +278,8 @@ static ssize_t qio_channel_tls_readv(QIOChannel *ioc,
 return QIO_CHANNEL_ERR_BLOCK;
 }
 } else if (errno == ECONNABORTED &&
-   (tioc->shutdown & QIO_CHANNEL_SHUTDOWN_READ)) {
+   (qatomic_load_acquire(>shutdown) &
+QIO_CHANNEL_SHUTDOWN_READ)) {
 return 0;
 }

@@ -361,7 +363,7 @@ static int qio_channel_tls_shutdown(QIOChannel *ioc,
 {
 QIOChannelTLS *tioc = QIO_CHANNEL_TLS(ioc);

-tioc->shutdown |= how;
+qatomic_or(>shutdown, how);

 return qio_channel_shutdown(tioc->master, how, errp);
 }
--
2.29.2



pgpLGWLqBoSl5.pgp
Description: OpenPGP digital signature


[PATCH v13 2/7] block/nbd.c: Add yank feature

2020-12-28 Thread Lukas Straub
Register a yank function which shuts down the socket and sets
s->state = NBD_CLIENT_QUIT. This is the same behaviour as if an
error occured.

Signed-off-by: Lukas Straub 
Acked-by: Stefan Hajnoczi 
Reviewed-by: Eric Blake 
---
 block/nbd.c | 153 +++-
 1 file changed, 92 insertions(+), 61 deletions(-)

diff --git a/block/nbd.c b/block/nbd.c
index 42536702b6..0f8d17db6a 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -35,6 +35,7 @@
 #include "qemu/option.h"
 #include "qemu/cutils.h"
 #include "qemu/main-loop.h"
+#include "qemu/atomic.h"

 #include "qapi/qapi-visit-sockets.h"
 #include "qapi/qmp/qstring.h"
@@ -44,6 +45,8 @@
 #include "block/nbd.h"
 #include "block/block_int.h"

+#include "qemu/yank.h"
+
 #define EN_OPTSTR ":exportname="
 #define MAX_NBD_REQUESTS16

@@ -141,14 +144,13 @@ typedef struct BDRVNBDState {
 NBDConnectThread *connect_thread;
 } BDRVNBDState;

-static QIOChannelSocket *nbd_establish_connection(SocketAddress *saddr,
-  Error **errp);
-static QIOChannelSocket *nbd_co_establish_connection(BlockDriverState *bs,
- Error **errp);
+static int nbd_establish_connection(BlockDriverState *bs, SocketAddress *saddr,
+Error **errp);
+static int nbd_co_establish_connection(BlockDriverState *bs, Error **errp);
 static void nbd_co_establish_connection_cancel(BlockDriverState *bs,
bool detach);
-static int nbd_client_handshake(BlockDriverState *bs, QIOChannelSocket *sioc,
-Error **errp);
+static int nbd_client_handshake(BlockDriverState *bs, Error **errp);
+static void nbd_yank(void *opaque);

 static void nbd_clear_bdrvstate(BDRVNBDState *s)
 {
@@ -166,12 +168,12 @@ static void nbd_clear_bdrvstate(BDRVNBDState *s)
 static void nbd_channel_error(BDRVNBDState *s, int ret)
 {
 if (ret == -EIO) {
-if (s->state == NBD_CLIENT_CONNECTED) {
+if (qatomic_load_acquire(>state) == NBD_CLIENT_CONNECTED) {
 s->state = s->reconnect_delay ? NBD_CLIENT_CONNECTING_WAIT :
 NBD_CLIENT_CONNECTING_NOWAIT;
 }
 } else {
-if (s->state == NBD_CLIENT_CONNECTED) {
+if (qatomic_load_acquire(>state) == NBD_CLIENT_CONNECTED) {
 qio_channel_shutdown(s->ioc, QIO_CHANNEL_SHUTDOWN_BOTH, NULL);
 }
 s->state = NBD_CLIENT_QUIT;
@@ -204,7 +206,7 @@ static void reconnect_delay_timer_cb(void *opaque)
 {
 BDRVNBDState *s = opaque;

-if (s->state == NBD_CLIENT_CONNECTING_WAIT) {
+if (qatomic_load_acquire(>state) == NBD_CLIENT_CONNECTING_WAIT) {
 s->state = NBD_CLIENT_CONNECTING_NOWAIT;
 while (qemu_co_enter_next(>free_sema, NULL)) {
 /* Resume all queued requests */
@@ -216,7 +218,7 @@ static void reconnect_delay_timer_cb(void *opaque)

 static void reconnect_delay_timer_init(BDRVNBDState *s, uint64_t 
expire_time_ns)
 {
-if (s->state != NBD_CLIENT_CONNECTING_WAIT) {
+if (qatomic_load_acquire(>state) != NBD_CLIENT_CONNECTING_WAIT) {
 return;
 }

@@ -261,7 +263,7 @@ static void nbd_client_attach_aio_context(BlockDriverState 
*bs,
  * s->connection_co is either yielded from nbd_receive_reply or from
  * nbd_co_reconnect_loop()
  */
-if (s->state == NBD_CLIENT_CONNECTED) {
+if (qatomic_load_acquire(>state) == NBD_CLIENT_CONNECTED) {
 qio_channel_attach_aio_context(QIO_CHANNEL(s->ioc), new_context);
 }

@@ -287,7 +289,7 @@ static void coroutine_fn 
nbd_client_co_drain_begin(BlockDriverState *bs)

 reconnect_delay_timer_del(s);

-if (s->state == NBD_CLIENT_CONNECTING_WAIT) {
+if (qatomic_load_acquire(>state) == NBD_CLIENT_CONNECTING_WAIT) {
 s->state = NBD_CLIENT_CONNECTING_NOWAIT;
 qemu_co_queue_restart_all(>free_sema);
 }
@@ -338,13 +340,14 @@ static void nbd_teardown_connection(BlockDriverState *bs)

 static bool nbd_client_connecting(BDRVNBDState *s)
 {
-return s->state == NBD_CLIENT_CONNECTING_WAIT ||
-s->state == NBD_CLIENT_CONNECTING_NOWAIT;
+NBDClientState state = qatomic_load_acquire(>state);
+return state == NBD_CLIENT_CONNECTING_WAIT ||
+state == NBD_CLIENT_CONNECTING_NOWAIT;
 }

 static bool nbd_client_connecting_wait(BDRVNBDState *s)
 {
-return s->state == NBD_CLIENT_CONNECTING_WAIT;
+return qatomic_load_acquire(>state) == NBD_CLIENT_CONNECTING_WAIT;
 }

 static void connect_bh(void *opaque)
@@ -424,12 +427,12 @@ static void *connect_thread_func(void *opaque)
 return NULL;
 }

-static QIOChannelSocket *coroutine_fn
+static int coroutine_fn
 nbd_co_establish_connection(BlockDriverState *bs, Err

[PATCH v13 4/7] migration: Add yank feature

2020-12-28 Thread Lukas Straub
Register yank functions on sockets to shut them down.

Signed-off-by: Lukas Straub 
Acked-by: Stefan Hajnoczi 
Acked-by: Dr. David Alan Gilbert 
---
 migration/channel.c   | 13 +
 migration/migration.c | 22 ++
 migration/multifd.c   | 10 ++
 migration/qemu-file-channel.c |  7 +++
 migration/savevm.c|  5 +
 5 files changed, 57 insertions(+)

diff --git a/migration/channel.c b/migration/channel.c
index 8a783baa0b..35fe234e9c 100644
--- a/migration/channel.c
+++ b/migration/channel.c
@@ -18,6 +18,8 @@
 #include "trace.h"
 #include "qapi/error.h"
 #include "io/channel-tls.h"
+#include "io/channel-socket.h"
+#include "qemu/yank.h"

 /**
  * @migration_channel_process_incoming - Create new incoming migration channel
@@ -35,6 +37,11 @@ void migration_channel_process_incoming(QIOChannel *ioc)
 trace_migration_set_incoming_channel(
 ioc, object_get_typename(OBJECT(ioc)));

+if (object_dynamic_cast(OBJECT(ioc), TYPE_QIO_CHANNEL_SOCKET)) {
+yank_register_function(MIGRATION_YANK_INSTANCE, yank_generic_iochannel,
+   QIO_CHANNEL(ioc));
+}
+
 if (s->parameters.tls_creds &&
 *s->parameters.tls_creds &&
 !object_dynamic_cast(OBJECT(ioc),
@@ -67,6 +74,12 @@ void migration_channel_connect(MigrationState *s,
 ioc, object_get_typename(OBJECT(ioc)), hostname, error);

 if (!error) {
+if (object_dynamic_cast(OBJECT(ioc), TYPE_QIO_CHANNEL_SOCKET)) {
+yank_register_function(MIGRATION_YANK_INSTANCE,
+   yank_generic_iochannel,
+   QIO_CHANNEL(ioc));
+}
+
 if (s->parameters.tls_creds &&
 *s->parameters.tls_creds &&
 !object_dynamic_cast(OBJECT(ioc),
diff --git a/migration/migration.c b/migration/migration.c
index e0dbde4091..92f7cb70b2 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -56,6 +56,7 @@
 #include "net/announce.h"
 #include "qemu/queue.h"
 #include "multifd.h"
+#include "qemu/yank.h"

 #ifdef CONFIG_VFIO
 #include "hw/vfio/vfio-common.h"
@@ -254,6 +255,8 @@ void migration_incoming_state_destroy(void)
 qapi_free_SocketAddressList(mis->socket_address_list);
 mis->socket_address_list = NULL;
 }
+
+yank_unregister_instance(MIGRATION_YANK_INSTANCE);
 }

 static void migrate_generate_event(int new_state)
@@ -418,6 +421,10 @@ static void qemu_start_incoming_migration(const char *uri, 
Error **errp)
 {
 const char *p = NULL;

+if (!yank_register_instance(MIGRATION_YANK_INSTANCE, errp)) {
+return;
+}
+
 qapi_event_send_migration(MIGRATION_STATUS_SETUP);
 if (strstart(uri, "tcp:", ) ||
 strstart(uri, "unix:", NULL) ||
@@ -432,6 +439,7 @@ static void qemu_start_incoming_migration(const char *uri, 
Error **errp)
 } else if (strstart(uri, "fd:", )) {
 fd_start_incoming_migration(p, errp);
 } else {
+yank_unregister_instance(MIGRATION_YANK_INSTANCE);
 error_setg(errp, "unknown migration protocol: %s", uri);
 }
 }
@@ -1737,6 +1745,7 @@ static void migrate_fd_cleanup(MigrationState *s)
 }
 notifier_list_notify(_state_notifiers, s);
 block_cleanup_parameters(s);
+yank_unregister_instance(MIGRATION_YANK_INSTANCE);
 }

 static void migrate_fd_cleanup_schedule(MigrationState *s)
@@ -2011,6 +2020,7 @@ void qmp_migrate_recover(const char *uri, Error **errp)
  * only re-setup the migration stream and poke existing migration
  * to continue using that newly established channel.
  */
+yank_unregister_instance(MIGRATION_YANK_INSTANCE);
 qemu_start_incoming_migration(uri, errp);
 }

@@ -2148,6 +2158,12 @@ void qmp_migrate(const char *uri, bool has_blk, bool blk,
 return;
 }

+if (!(has_resume && resume)) {
+if (!yank_register_instance(MIGRATION_YANK_INSTANCE, errp)) {
+return;
+}
+}
+
 if (strstart(uri, "tcp:", ) ||
 strstart(uri, "unix:", NULL) ||
 strstart(uri, "vsock:", NULL)) {
@@ -2161,6 +2177,9 @@ void qmp_migrate(const char *uri, bool has_blk, bool blk,
 } else if (strstart(uri, "fd:", )) {
 fd_start_outgoing_migration(s, p, _err);
 } else {
+if (!(has_resume && resume)) {
+yank_unregister_instance(MIGRATION_YANK_INSTANCE);
+}
 error_setg(errp, QERR_INVALID_PARAMETER_VALUE, "uri",
"a valid migration protocol");
 migrate_set_state(>state, MIGRATION_STATUS_SETUP,
@@ -2170,6 +2189,9 @@ void qmp_migrate(const char *uri, bool has_blk, bool blk,
 }

 if (local_err) {
+if (!(has

[PATCH v13 3/7] chardev/char-socket.c: Add yank feature

2020-12-28 Thread Lukas Straub
Register a yank function to shutdown the socket on yank.

Signed-off-by: Lukas Straub 
Acked-by: Stefan Hajnoczi 
---
 chardev/char-socket.c | 34 ++
 1 file changed, 34 insertions(+)

diff --git a/chardev/char-socket.c b/chardev/char-socket.c
index 213a4c8dd0..8a707d766c 100644
--- a/chardev/char-socket.c
+++ b/chardev/char-socket.c
@@ -34,6 +34,7 @@
 #include "qapi/error.h"
 #include "qapi/clone-visitor.h"
 #include "qapi/qapi-visit-sockets.h"
+#include "qemu/yank.h"

 #include "chardev/char-io.h"
 #include "qom/object.h"
@@ -70,6 +71,7 @@ struct SocketChardev {
 size_t read_msgfds_num;
 int *write_msgfds;
 size_t write_msgfds_num;
+bool registered_yank;

 SocketAddress *addr;
 bool is_listen;
@@ -415,6 +417,12 @@ static void tcp_chr_free_connection(Chardev *chr)

 tcp_set_msgfds(chr, NULL, 0);
 remove_fd_in_watch(chr);
+if (s->state == TCP_CHARDEV_STATE_CONNECTING
+|| s->state == TCP_CHARDEV_STATE_CONNECTED) {
+yank_unregister_function(CHARDEV_YANK_INSTANCE(chr->label),
+ yank_generic_iochannel,
+ QIO_CHANNEL(s->sioc));
+}
 object_unref(OBJECT(s->sioc));
 s->sioc = NULL;
 object_unref(OBJECT(s->ioc));
@@ -932,6 +940,9 @@ static int tcp_chr_add_client(Chardev *chr, int fd)
 }
 tcp_chr_change_state(s, TCP_CHARDEV_STATE_CONNECTING);
 tcp_chr_set_client_ioc_name(chr, sioc);
+yank_register_function(CHARDEV_YANK_INSTANCE(chr->label),
+   yank_generic_iochannel,
+   QIO_CHANNEL(sioc));
 ret = tcp_chr_new_client(chr, sioc);
 object_unref(OBJECT(sioc));
 return ret;
@@ -946,6 +957,9 @@ static void tcp_chr_accept(QIONetListener *listener,

 tcp_chr_change_state(s, TCP_CHARDEV_STATE_CONNECTING);
 tcp_chr_set_client_ioc_name(chr, cioc);
+yank_register_function(CHARDEV_YANK_INSTANCE(chr->label),
+   yank_generic_iochannel,
+   QIO_CHANNEL(cioc));
 tcp_chr_new_client(chr, cioc);
 }

@@ -961,6 +975,9 @@ static int tcp_chr_connect_client_sync(Chardev *chr, Error 
**errp)
 object_unref(OBJECT(sioc));
 return -1;
 }
+yank_register_function(CHARDEV_YANK_INSTANCE(chr->label),
+   yank_generic_iochannel,
+   QIO_CHANNEL(sioc));
 tcp_chr_new_client(chr, sioc);
 object_unref(OBJECT(sioc));
 return 0;
@@ -976,6 +993,9 @@ static void tcp_chr_accept_server_sync(Chardev *chr)
 tcp_chr_change_state(s, TCP_CHARDEV_STATE_CONNECTING);
 sioc = qio_net_listener_wait_client(s->listener);
 tcp_chr_set_client_ioc_name(chr, sioc);
+yank_register_function(CHARDEV_YANK_INSTANCE(chr->label),
+   yank_generic_iochannel,
+   QIO_CHANNEL(sioc));
 tcp_chr_new_client(chr, sioc);
 object_unref(OBJECT(sioc));
 }
@@ -1086,6 +1106,9 @@ static void char_socket_finalize(Object *obj)
 object_unref(OBJECT(s->tls_creds));
 }
 g_free(s->tls_authz);
+if (s->registered_yank) {
+yank_unregister_instance(CHARDEV_YANK_INSTANCE(chr->label));
+}

 qemu_chr_be_event(chr, CHR_EVENT_CLOSED);
 }
@@ -1101,6 +1124,9 @@ static void qemu_chr_socket_connected(QIOTask *task, void 
*opaque)

 if (qio_task_propagate_error(task, )) {
 tcp_chr_change_state(s, TCP_CHARDEV_STATE_DISCONNECTED);
+yank_unregister_function(CHARDEV_YANK_INSTANCE(chr->label),
+ yank_generic_iochannel,
+ QIO_CHANNEL(sioc));
 check_report_connect_error(chr, err);
 goto cleanup;
 }
@@ -1134,6 +1160,9 @@ static void tcp_chr_connect_client_async(Chardev *chr)
 tcp_chr_change_state(s, TCP_CHARDEV_STATE_CONNECTING);
 sioc = qio_channel_socket_new();
 tcp_chr_set_client_ioc_name(chr, sioc);
+yank_register_function(CHARDEV_YANK_INSTANCE(chr->label),
+   yank_generic_iochannel,
+   QIO_CHANNEL(sioc));
 /*
  * Normally code would use the qio_channel_socket_connect_async
  * method which uses a QIOTask + qio_task_set_error internally
@@ -1376,6 +1405,11 @@ static void qmp_chardev_open_socket(Chardev *chr,
 qemu_chr_set_feature(chr, QEMU_CHAR_FEATURE_FD_PASS);
 }

+if (!yank_register_instance(CHARDEV_YANK_INSTANCE(chr->label), errp)) {
+return;
+}
+s->registered_yank = true;
+
 /* be isn't opened until we get a connection */
 *be_opened = false;

--
2.29.2



pgpu0zc2CiSD2.pgp
Description: OpenPGP digital signature


[PATCH v13 1/7] Introduce yank feature

2020-12-28 Thread Lukas Straub
The yank feature allows to recover from hanging qemu by "yanking"
at various parts. Other qemu systems can register themselves and
multiple yank functions. Then all yank functions for selected
instances can be called by the 'yank' out-of-band qmp command.
Available instances can be queried by a 'query-yank' oob command.

Signed-off-by: Lukas Straub 
Acked-by: Stefan Hajnoczi 
Reviewed-by: Markus Armbruster 
---
 MAINTAINERS   |   7 ++
 include/qemu/yank.h   |  97 
 qapi/meson.build  |   1 +
 qapi/qapi-schema.json |   1 +
 qapi/yank.json| 119 
 util/meson.build  |   1 +
 util/yank.c   | 206 ++
 7 files changed, 432 insertions(+)
 create mode 100644 include/qemu/yank.h
 create mode 100644 qapi/yank.json
 create mode 100644 util/yank.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 1e7c8f0488..f465a4045a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2716,6 +2716,13 @@ F: util/uuid.c
 F: include/qemu/uuid.h
 F: tests/test-uuid.c

+Yank feature
+M: Lukas Straub 
+S: Odd fixes
+F: util/yank.c
+F: include/qemu/yank.h
+F: qapi/yank.json
+
 COLO Framework
 M: zhanghailiang 
 S: Maintained
diff --git a/include/qemu/yank.h b/include/qemu/yank.h
new file mode 100644
index 00..5b93c70cbf
--- /dev/null
+++ b/include/qemu/yank.h
@@ -0,0 +1,97 @@
+/*
+ * QEMU yank feature
+ *
+ * Copyright (c) Lukas Straub 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#ifndef YANK_H
+#define YANK_H
+
+#include "qapi/qapi-types-yank.h"
+
+typedef void (YankFn)(void *opaque);
+
+/**
+ * yank_register_instance: Register a new instance.
+ *
+ * This registers a new instance for yanking. Must be called before any yank
+ * function is registered for this instance.
+ *
+ * This function is thread-safe.
+ *
+ * @instance: The instance.
+ * @errp: Error object.
+ *
+ * Returns true on success or false if an error occured.
+ */
+bool yank_register_instance(const YankInstance *instance, Error **errp);
+
+/**
+ * yank_unregister_instance: Unregister a instance.
+ *
+ * This unregisters a instance. Must be called only after every yank function
+ * of the instance has been unregistered.
+ *
+ * This function is thread-safe.
+ *
+ * @instance: The instance.
+ */
+void yank_unregister_instance(const YankInstance *instance);
+
+/**
+ * yank_register_function: Register a yank function
+ *
+ * This registers a yank function. All limitations of qmp oob commands apply
+ * to the yank function as well. See docs/devel/qapi-code-gen.txt under
+ * "An OOB-capable command handler must satisfy the following conditions".
+ *
+ * This function is thread-safe.
+ *
+ * @instance: The instance.
+ * @func: The yank function.
+ * @opaque: Will be passed to the yank function.
+ */
+void yank_register_function(const YankInstance *instance,
+YankFn *func,
+void *opaque);
+
+/**
+ * yank_unregister_function: Unregister a yank function
+ *
+ * This unregisters a yank function.
+ *
+ * This function is thread-safe.
+ *
+ * @instance: The instance.
+ * @func: func that was passed to yank_register_function.
+ * @opaque: opaque that was passed to yank_register_function.
+ */
+void yank_unregister_function(const YankInstance *instance,
+  YankFn *func,
+  void *opaque);
+
+/**
+ * yank_generic_iochannel: Generic yank function for iochannel
+ *
+ * This is a generic yank function which will call qio_channel_shutdown on the
+ * provided QIOChannel.
+ *
+ * @opaque: QIOChannel to shutdown
+ */
+void yank_generic_iochannel(void *opaque);
+
+#define BLOCKDEV_YANK_INSTANCE(the_node_name) (&(YankInstance) { \
+.type = YANK_INSTANCE_TYPE_BLOCK_NODE, \
+.u.block_node.node_name = (the_node_name) })
+
+#define CHARDEV_YANK_INSTANCE(the_id) (&(YankInstance) { \
+.type = YANK_INSTANCE_TYPE_CHARDEV, \
+.u.chardev.id = (the_id) })
+
+#define MIGRATION_YANK_INSTANCE (&(YankInstance) { \
+.type = YANK_INSTANCE_TYPE_MIGRATION })
+
+#endif
diff --git a/qapi/meson.build b/qapi/meson.build
index 0e98146f1f..ab68e7900e 100644
--- a/qapi/meson.build
+++ b/qapi/meson.build
@@ -47,6 +47,7 @@ qapi_all_modules = [
   'trace',
   'transaction',
   'ui',
+  'yank',
 ]

 qapi_storage_daemon_modules = [
diff --git a/qapi/qapi-schema.json b/qapi/qapi-schema.json
index 0b444b76d2..3441c9a9ae 100644
--- a/qapi/qapi-schema.json
+++ b/qapi/qapi-schema.json
@@ -86,6 +86,7 @@
 { 'include': 'machine.json' }
 { 'include': 'machine-target.json' }
 { 'include': 'replay.json' }
+{ 'include': 'yank.json' }
 { 'include': 'misc.json' }
 { 'include': 'misc-target.json' }
 { 'include': 'audio.json' }
diff --git a/qapi/yank.json b/qapi/yank.json
new file mode 100644
index 00..167a775594
--- /dev/null
+++ b/qa

[PATCH v13 0/7] Introduce 'yank' oob qmp command to recover from hanging qemu

2020-12-28 Thread Lukas Straub

Hello Everyone,
So here is v13.

Changes:

v13:
 -Address Marc-André Lureau comments:
  -make yank_register_instance return bool
  -rename yank_compare_instances to yank_instance_equal
  -remove breaks
  -use g_str_equal instead of strcmp
  -use g_new0 instead of g_slice_new
  -use QEMU_LOCK_GUARD instead of qemu_mutex_lock/unlock

v12:
 -rebase onto master
  -minor change to migration (removal of "defer" branch in 
qemu_start_incoming_migration)
 -add Reviewed-by tags

v11:
 -squashed MAINTAINERS update into patch 1
 -move qmp doc of yank before misc
 -add title for qmp docs
 -change "Since:" to 6.0
 -add Reviewed-by tags

v10:
 -moved from qapi/misc.json to qapi/yank.json
 -rename 'blockdev' -> 'block-node'
 -document difference betwen migration yank instance and migrate_cancel
 -better document return values of yank command
 -better document yank_lock
 -minor style and spelling fixes

v9:
 -rebase onto master
 -implemented new qmp api as proposed by Markus

v8:
 -add Reviewed-by and Acked-by tags
 -rebase onto master
  -minor change to migration
  -convert to meson
 -change "Since:" to 5.2
 -varios code style fixes (Markus Armbruster)
 -point to oob restrictions in comment to yank_register_function
  (Markus Armbruster)
 -improve qmp documentation (Markus Armbruster)
 -document oob suitability of qio_channel and io_shutdown (Markus Armbruster)

v7:
 -yank_register_instance now returns error via Error **errp instead of aborting
 -dropped "chardev/char.c: Check for duplicate id before  creating chardev"

v6:
 -add Reviewed-by and Acked-by tags
 -rebase on master
 -lots of changes in nbd due to rebase
 -only take maintainership of util/yank.c and include/qemu/yank.h (Daniel P. 
Berrangé)
 -fix a crash discovered by the newly added chardev test
 -fix the test itself

v5:
 -move yank.c to util/
 -move yank.h to include/qemu/
 -add license to yank.h
 -use const char*
 -nbd: use atomic_store_release and atomic_load_aqcuire
 -io-channel: ensure thread-safety and document it
 -add myself as maintainer for yank

v4:
 -fix build errors...

v3:
 -don't touch softmmu/vl.c, use __contructor__ attribute instead (Paolo Bonzini)
 -fix build errors
 -rewrite migration patch so it actually passes all tests

v2:
 -don't touch io/ code anymore
 -always register yank functions
 -'yank' now takes a list of instances to yank
 -'query-yank' returns a list of yankable instances

Overview:
Hello Everyone,
In many cases, if qemu has a network connection (qmp, migration, chardev, etc.)
to some other server and that server dies or hangs, qemu hangs too.
These patches introduce the new 'yank' out-of-band qmp command to recover from
these kinds of hangs. The different subsystems register callbacks which get
executed with the yank command. For example the callback can shutdown() a
socket. This is intended for the colo use-case, but it can be used for other
things too of course.

Regards,
Lukas Straub


Lukas Straub (7):
  Introduce yank feature
  block/nbd.c: Add yank feature
  chardev/char-socket.c: Add yank feature
  migration: Add yank feature
  io/channel-tls.c: make qio_channel_tls_shutdown thread-safe
  io: Document qmp oob suitability of qio_channel_shutdown and
io_shutdown
  tests/test-char.c: Wait for the chardev to connect in
char_socket_client_dupid_test

 MAINTAINERS   |   7 ++
 block/nbd.c   | 153 +++--
 chardev/char-socket.c |  34 ++
 include/io/channel.h  |   5 +-
 include/qemu/yank.h   |  97 
 io/channel-tls.c  |   6 +-
 migration/channel.c   |  13 +++
 migration/migration.c |  22 
 migration/multifd.c   |  10 ++
 migration/qemu-file-channel.c |   7 ++
 migration/savevm.c|   5 +
 qapi/meson.build  |   1 +
 qapi/qapi-schema.json |   1 +
 qapi/yank.json| 119 
 tests/test-char.c |   1 +
 util/meson.build  |   1 +
 util/yank.c   | 206 ++
 17 files changed, 624 insertions(+), 64 deletions(-)
 create mode 100644 include/qemu/yank.h
 create mode 100644 qapi/yank.json
 create mode 100644 util/yank.c

--
2.29.2


pgp_q0vxlz9_B.pgp
Description: OpenPGP digital signature


Re: [PATCH v12 1/7] Introduce yank feature

2020-12-28 Thread Lukas Straub
On Tue, 22 Dec 2020 12:00:29 +0400
Marc-André Lureau  wrote:

> On Sun, Dec 13, 2020 at 3:48 PM Lukas Straub  wrote:
> 
> > The yank feature allows to recover from hanging qemu by "yanking"
> > at various parts. Other qemu systems can register themselves and
> > multiple yank functions. Then all yank functions for selected
> > instances can be called by the 'yank' out-of-band qmp command.
> > Available instances can be queried by a 'query-yank' oob command.
> >
> > Signed-off-by: Lukas Straub 
> > Acked-by: Stefan Hajnoczi 
> > Reviewed-by: Markus Armbruster 
> > ---
> >  MAINTAINERS   |   7 ++
> >  include/qemu/yank.h   |  95 +++
> >  qapi/meson.build  |   1 +
> >  qapi/qapi-schema.json |   1 +
> >  qapi/yank.json| 119 +++
> >  util/meson.build  |   1 +
> >  util/yank.c   | 216 ++
> >  7 files changed, 440 insertions(+)
> >  create mode 100644 include/qemu/yank.h
> >  create mode 100644 qapi/yank.json
> >  create mode 100644 util/yank.c
> >
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index d48a4e8a8b..5d7e3c0e4b 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -2705,6 +2705,13 @@ F: util/uuid.c
> >  F: include/qemu/uuid.h
> >  F: tests/test-uuid.c
> >
> > +Yank feature
> > +M: Lukas Straub 
> > +S: Odd fixes
> > +F: util/yank.c
> > +F: include/qemu/yank.h
> > +F: qapi/yank.json
> > +
> >  COLO Framework
> >  M: zhanghailiang 
> >  S: Maintained
> > diff --git a/include/qemu/yank.h b/include/qemu/yank.h
> > new file mode 100644
> > index 00..96f5b2626f
> > --- /dev/null
> > +++ b/include/qemu/yank.h
> > @@ -0,0 +1,95 @@
> > +/*
> > + * QEMU yank feature
> > + *
> > + * Copyright (c) Lukas Straub 
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2 or
> > later.
> > + * See the COPYING file in the top-level directory.
> > + */
> > +
> > +#ifndef YANK_H
> > +#define YANK_H
> > +
> > +#include "qapi/qapi-types-yank.h"
> > +
> > +typedef void (YankFn)(void *opaque);
> > +
> > +/**
> > + * yank_register_instance: Register a new instance.
> > + *
> > + * This registers a new instance for yanking. Must be called before any
> > yank
> > + * function is registered for this instance.
> > + *
> > + * This function is thread-safe.
> > + *
> > + * @instance: The instance.
> > + * @errp: Error object.
> > + */
> > +void yank_register_instance(const YankInstance *instance, Error **errp);
> > +
> >  
> 
> It's a good idea to return a success boolean. (see include/qapi/error.h)

Changed for the next version.

> +/**
> > + * yank_unregister_instance: Unregister a instance.
> > + *
> > + * This unregisters a instance. Must be called only after every yank
> > function
> > + * of the instance has been unregistered.
> > + *
> > + * This function is thread-safe.
> > + *
> > + * @instance: The instance.
> > + */
> > +void yank_unregister_instance(const YankInstance *instance);
> > +
> > +/**
> > + * yank_register_function: Register a yank function
> > + *
> > + * This registers a yank function. All limitations of qmp oob commands
> > apply
> > + * to the yank function as well. See docs/devel/qapi-code-gen.txt under
> > + * "An OOB-capable command handler must satisfy the following conditions".
> > + *
> > + * This function is thread-safe.
> > + *
> > + * @instance: The instance.
> > + * @func: The yank function.
> > + * @opaque: Will be passed to the yank function.
> > + */
> > +void yank_register_function(const YankInstance *instance,
> > +YankFn *func,
> > +void *opaque);
> > +
> > +/**
> > + * yank_unregister_function: Unregister a yank function
> > + *
> > + * This unregisters a yank function.
> > + *
> > + * This function is thread-safe.
> > + *
> > + * @instance: The instance.
> > + * @func: func that was passed to yank_register_function.
> > + * @opaque: opaque that was passed to yank_register_function.
> > + */
> > +void yank_unregister_function(const YankInstance *instance,
> > +  YankFn *func,
> > +  void *opaque);
> > +
> > +/**
> > + * yank_generic_iochannel: Generic ya

[PATCH v12 5/7] io/channel-tls.c: make qio_channel_tls_shutdown thread-safe

2020-12-13 Thread Lukas Straub
Make qio_channel_tls_shutdown thread-safe by using atomics when
accessing tioc->shutdown.

Signed-off-by: Lukas Straub 
Acked-by: Stefan Hajnoczi 
Reviewed-by: Daniel P. Berrangé 
---
 io/channel-tls.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/io/channel-tls.c b/io/channel-tls.c
index 388f019977..2ae1b92fc0 100644
--- a/io/channel-tls.c
+++ b/io/channel-tls.c
@@ -23,6 +23,7 @@
 #include "qemu/module.h"
 #include "io/channel-tls.h"
 #include "trace.h"
+#include "qemu/atomic.h"


 static ssize_t qio_channel_tls_write_handler(const char *buf,
@@ -277,7 +278,8 @@ static ssize_t qio_channel_tls_readv(QIOChannel *ioc,
 return QIO_CHANNEL_ERR_BLOCK;
 }
 } else if (errno == ECONNABORTED &&
-   (tioc->shutdown & QIO_CHANNEL_SHUTDOWN_READ)) {
+   (qatomic_load_acquire(>shutdown) &
+QIO_CHANNEL_SHUTDOWN_READ)) {
 return 0;
 }

@@ -361,7 +363,7 @@ static int qio_channel_tls_shutdown(QIOChannel *ioc,
 {
 QIOChannelTLS *tioc = QIO_CHANNEL_TLS(ioc);

-tioc->shutdown |= how;
+qatomic_or(>shutdown, how);

 return qio_channel_shutdown(tioc->master, how, errp);
 }
--
2.20.1



pgp_lZLTBABRj.pgp
Description: OpenPGP digital signature


[PATCH v12 7/7] tests/test-char.c: Wait for the chardev to connect in char_socket_client_dupid_test

2020-12-13 Thread Lukas Straub
A connecting chardev object has an additional reference by the connecting
thread, so if the chardev is still connecting by the end of the test,
then the chardev object won't be freed. This in turn means that the yank
instance won't be unregistered and when running the next test-case
yank_register_instance will abort, because the yank instance is
already/still registered.

Signed-off-by: Lukas Straub 
Reviewed-by: Daniel P. Berrangé 
---
 tests/test-char.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tests/test-char.c b/tests/test-char.c
index 953e0d1c1f..41a76410d8 100644
--- a/tests/test-char.c
+++ b/tests/test-char.c
@@ -937,6 +937,7 @@ static void char_socket_client_dupid_test(gconstpointer 
opaque)
 g_assert_nonnull(opts);
 chr1 = qemu_chr_new_from_opts(opts, NULL, _abort);
 g_assert_nonnull(chr1);
+qemu_chr_wait_connected(chr1, _abort);

 chr2 = qemu_chr_new_from_opts(opts, NULL, _err);
 g_assert_null(chr2);
--
2.20.1


pgpswu6HAepUZ.pgp
Description: OpenPGP digital signature


[PATCH v12 6/7] io: Document qmp oob suitability of qio_channel_shutdown and io_shutdown

2020-12-13 Thread Lukas Straub
Migration and yank code assume that qio_channel_shutdown is thread
-safe and can be called from qmp oob handler. Document this after
checking the code.

Signed-off-by: Lukas Straub 
Acked-by: Stefan Hajnoczi 
Reviewed-by: Daniel P. Berrangé 
---
 include/io/channel.h | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/include/io/channel.h b/include/io/channel.h
index 4d6fe45f63..ab9ea77959 100644
--- a/include/io/channel.h
+++ b/include/io/channel.h
@@ -92,7 +92,8 @@ struct QIOChannel {
  * provide additional optional features.
  *
  * Consult the corresponding public API docs for a description
- * of the semantics of each callback
+ * of the semantics of each callback. io_shutdown in particular
+ * must be thread-safe, terminate quickly and must not block.
  */
 struct QIOChannelClass {
 ObjectClass parent;
@@ -510,6 +511,8 @@ int qio_channel_close(QIOChannel *ioc,
  * QIO_CHANNEL_FEATURE_SHUTDOWN prior to calling
  * this method.
  *
+ * This function is thread-safe, terminates quickly and does not block.
+ *
  * Returns: 0 on success, -1 on error
  */
 int qio_channel_shutdown(QIOChannel *ioc,
--
2.20.1



pgpw1CygrcATs.pgp
Description: OpenPGP digital signature


[PATCH v12 3/7] chardev/char-socket.c: Add yank feature

2020-12-13 Thread Lukas Straub
Register a yank function to shutdown the socket on yank.

Signed-off-by: Lukas Straub 
Acked-by: Stefan Hajnoczi 
---
 chardev/char-socket.c | 35 +++
 1 file changed, 35 insertions(+)

diff --git a/chardev/char-socket.c b/chardev/char-socket.c
index 213a4c8dd0..7f2ee9a338 100644
--- a/chardev/char-socket.c
+++ b/chardev/char-socket.c
@@ -34,6 +34,7 @@
 #include "qapi/error.h"
 #include "qapi/clone-visitor.h"
 #include "qapi/qapi-visit-sockets.h"
+#include "qemu/yank.h"

 #include "chardev/char-io.h"
 #include "qom/object.h"
@@ -70,6 +71,7 @@ struct SocketChardev {
 size_t read_msgfds_num;
 int *write_msgfds;
 size_t write_msgfds_num;
+bool registered_yank;

 SocketAddress *addr;
 bool is_listen;
@@ -415,6 +417,12 @@ static void tcp_chr_free_connection(Chardev *chr)

 tcp_set_msgfds(chr, NULL, 0);
 remove_fd_in_watch(chr);
+if (s->state == TCP_CHARDEV_STATE_CONNECTING
+|| s->state == TCP_CHARDEV_STATE_CONNECTED) {
+yank_unregister_function(CHARDEV_YANK_INSTANCE(chr->label),
+ yank_generic_iochannel,
+ QIO_CHANNEL(s->sioc));
+}
 object_unref(OBJECT(s->sioc));
 s->sioc = NULL;
 object_unref(OBJECT(s->ioc));
@@ -932,6 +940,9 @@ static int tcp_chr_add_client(Chardev *chr, int fd)
 }
 tcp_chr_change_state(s, TCP_CHARDEV_STATE_CONNECTING);
 tcp_chr_set_client_ioc_name(chr, sioc);
+yank_register_function(CHARDEV_YANK_INSTANCE(chr->label),
+   yank_generic_iochannel,
+   QIO_CHANNEL(sioc));
 ret = tcp_chr_new_client(chr, sioc);
 object_unref(OBJECT(sioc));
 return ret;
@@ -946,6 +957,9 @@ static void tcp_chr_accept(QIONetListener *listener,

 tcp_chr_change_state(s, TCP_CHARDEV_STATE_CONNECTING);
 tcp_chr_set_client_ioc_name(chr, cioc);
+yank_register_function(CHARDEV_YANK_INSTANCE(chr->label),
+   yank_generic_iochannel,
+   QIO_CHANNEL(cioc));
 tcp_chr_new_client(chr, cioc);
 }

@@ -961,6 +975,9 @@ static int tcp_chr_connect_client_sync(Chardev *chr, Error 
**errp)
 object_unref(OBJECT(sioc));
 return -1;
 }
+yank_register_function(CHARDEV_YANK_INSTANCE(chr->label),
+   yank_generic_iochannel,
+   QIO_CHANNEL(sioc));
 tcp_chr_new_client(chr, sioc);
 object_unref(OBJECT(sioc));
 return 0;
@@ -976,6 +993,9 @@ static void tcp_chr_accept_server_sync(Chardev *chr)
 tcp_chr_change_state(s, TCP_CHARDEV_STATE_CONNECTING);
 sioc = qio_net_listener_wait_client(s->listener);
 tcp_chr_set_client_ioc_name(chr, sioc);
+yank_register_function(CHARDEV_YANK_INSTANCE(chr->label),
+   yank_generic_iochannel,
+   QIO_CHANNEL(sioc));
 tcp_chr_new_client(chr, sioc);
 object_unref(OBJECT(sioc));
 }
@@ -1086,6 +1106,9 @@ static void char_socket_finalize(Object *obj)
 object_unref(OBJECT(s->tls_creds));
 }
 g_free(s->tls_authz);
+if (s->registered_yank) {
+yank_unregister_instance(CHARDEV_YANK_INSTANCE(chr->label));
+}

 qemu_chr_be_event(chr, CHR_EVENT_CLOSED);
 }
@@ -1101,6 +1124,9 @@ static void qemu_chr_socket_connected(QIOTask *task, void 
*opaque)

 if (qio_task_propagate_error(task, )) {
 tcp_chr_change_state(s, TCP_CHARDEV_STATE_DISCONNECTED);
+yank_unregister_function(CHARDEV_YANK_INSTANCE(chr->label),
+ yank_generic_iochannel,
+ QIO_CHANNEL(sioc));
 check_report_connect_error(chr, err);
 goto cleanup;
 }
@@ -1134,6 +1160,9 @@ static void tcp_chr_connect_client_async(Chardev *chr)
 tcp_chr_change_state(s, TCP_CHARDEV_STATE_CONNECTING);
 sioc = qio_channel_socket_new();
 tcp_chr_set_client_ioc_name(chr, sioc);
+yank_register_function(CHARDEV_YANK_INSTANCE(chr->label),
+   yank_generic_iochannel,
+   QIO_CHANNEL(sioc));
 /*
  * Normally code would use the qio_channel_socket_connect_async
  * method which uses a QIOTask + qio_task_set_error internally
@@ -1376,6 +1405,12 @@ static void qmp_chardev_open_socket(Chardev *chr,
 qemu_chr_set_feature(chr, QEMU_CHAR_FEATURE_FD_PASS);
 }

+yank_register_instance(CHARDEV_YANK_INSTANCE(chr->label), errp);
+if (*errp) {
+return;
+}
+s->registered_yank = true;
+
 /* be isn't opened until we get a connection */
 *be_opened = false;

--
2.20.1



pgpgT3srzXmI9.pgp
Description: OpenPGP digital signature


[PATCH v12 4/7] migration: Add yank feature

2020-12-13 Thread Lukas Straub
Register yank functions on sockets to shut them down.

Signed-off-by: Lukas Straub 
Acked-by: Stefan Hajnoczi 
Acked-by: Dr. David Alan Gilbert 
---
 migration/channel.c   | 13 +
 migration/migration.c | 24 
 migration/multifd.c   | 10 ++
 migration/qemu-file-channel.c |  7 +++
 migration/savevm.c|  6 ++
 5 files changed, 60 insertions(+)

diff --git a/migration/channel.c b/migration/channel.c
index 8a783baa0b..35fe234e9c 100644
--- a/migration/channel.c
+++ b/migration/channel.c
@@ -18,6 +18,8 @@
 #include "trace.h"
 #include "qapi/error.h"
 #include "io/channel-tls.h"
+#include "io/channel-socket.h"
+#include "qemu/yank.h"

 /**
  * @migration_channel_process_incoming - Create new incoming migration channel
@@ -35,6 +37,11 @@ void migration_channel_process_incoming(QIOChannel *ioc)
 trace_migration_set_incoming_channel(
 ioc, object_get_typename(OBJECT(ioc)));

+if (object_dynamic_cast(OBJECT(ioc), TYPE_QIO_CHANNEL_SOCKET)) {
+yank_register_function(MIGRATION_YANK_INSTANCE, yank_generic_iochannel,
+   QIO_CHANNEL(ioc));
+}
+
 if (s->parameters.tls_creds &&
 *s->parameters.tls_creds &&
 !object_dynamic_cast(OBJECT(ioc),
@@ -67,6 +74,12 @@ void migration_channel_connect(MigrationState *s,
 ioc, object_get_typename(OBJECT(ioc)), hostname, error);

 if (!error) {
+if (object_dynamic_cast(OBJECT(ioc), TYPE_QIO_CHANNEL_SOCKET)) {
+yank_register_function(MIGRATION_YANK_INSTANCE,
+   yank_generic_iochannel,
+   QIO_CHANNEL(ioc));
+}
+
 if (s->parameters.tls_creds &&
 *s->parameters.tls_creds &&
 !object_dynamic_cast(OBJECT(ioc),
diff --git a/migration/migration.c b/migration/migration.c
index e0dbde4091..dc520a721b 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -56,6 +56,7 @@
 #include "net/announce.h"
 #include "qemu/queue.h"
 #include "multifd.h"
+#include "qemu/yank.h"

 #ifdef CONFIG_VFIO
 #include "hw/vfio/vfio-common.h"
@@ -254,6 +255,8 @@ void migration_incoming_state_destroy(void)
 qapi_free_SocketAddressList(mis->socket_address_list);
 mis->socket_address_list = NULL;
 }
+
+yank_unregister_instance(MIGRATION_YANK_INSTANCE);
 }

 static void migrate_generate_event(int new_state)
@@ -418,6 +421,11 @@ static void qemu_start_incoming_migration(const char *uri, 
Error **errp)
 {
 const char *p = NULL;

+yank_register_instance(MIGRATION_YANK_INSTANCE, errp);
+if (*errp) {
+return;
+}
+
 qapi_event_send_migration(MIGRATION_STATUS_SETUP);
 if (strstart(uri, "tcp:", ) ||
 strstart(uri, "unix:", NULL) ||
@@ -432,6 +440,7 @@ static void qemu_start_incoming_migration(const char *uri, 
Error **errp)
 } else if (strstart(uri, "fd:", )) {
 fd_start_incoming_migration(p, errp);
 } else {
+yank_unregister_instance(MIGRATION_YANK_INSTANCE);
 error_setg(errp, "unknown migration protocol: %s", uri);
 }
 }
@@ -1737,6 +1746,7 @@ static void migrate_fd_cleanup(MigrationState *s)
 }
 notifier_list_notify(_state_notifiers, s);
 block_cleanup_parameters(s);
+yank_unregister_instance(MIGRATION_YANK_INSTANCE);
 }

 static void migrate_fd_cleanup_schedule(MigrationState *s)
@@ -2011,6 +2021,7 @@ void qmp_migrate_recover(const char *uri, Error **errp)
  * only re-setup the migration stream and poke existing migration
  * to continue using that newly established channel.
  */
+yank_unregister_instance(MIGRATION_YANK_INSTANCE);
 qemu_start_incoming_migration(uri, errp);
 }

@@ -2148,6 +2159,13 @@ void qmp_migrate(const char *uri, bool has_blk, bool blk,
 return;
 }

+if (!(has_resume && resume)) {
+yank_register_instance(MIGRATION_YANK_INSTANCE, errp);
+if (*errp) {
+return;
+}
+}
+
 if (strstart(uri, "tcp:", ) ||
 strstart(uri, "unix:", NULL) ||
 strstart(uri, "vsock:", NULL)) {
@@ -2161,6 +2179,9 @@ void qmp_migrate(const char *uri, bool has_blk, bool blk,
 } else if (strstart(uri, "fd:", )) {
 fd_start_outgoing_migration(s, p, _err);
 } else {
+if (!(has_resume && resume)) {
+yank_unregister_instance(MIGRATION_YANK_INSTANCE);
+}
 error_setg(errp, QERR_INVALID_PARAMETER_VALUE, "uri",
"a valid migration protocol");
 migrate_set_state(>state, MIGRATION_STATUS_SETUP,
@@ -2170,6 +2191,9 @@ void qmp_migrate(const char *uri, bool has_blk, bool blk,
 }

 if (loca

[PATCH v12 1/7] Introduce yank feature

2020-12-13 Thread Lukas Straub
The yank feature allows to recover from hanging qemu by "yanking"
at various parts. Other qemu systems can register themselves and
multiple yank functions. Then all yank functions for selected
instances can be called by the 'yank' out-of-band qmp command.
Available instances can be queried by a 'query-yank' oob command.

Signed-off-by: Lukas Straub 
Acked-by: Stefan Hajnoczi 
Reviewed-by: Markus Armbruster 
---
 MAINTAINERS   |   7 ++
 include/qemu/yank.h   |  95 +++
 qapi/meson.build  |   1 +
 qapi/qapi-schema.json |   1 +
 qapi/yank.json| 119 +++
 util/meson.build  |   1 +
 util/yank.c   | 216 ++
 7 files changed, 440 insertions(+)
 create mode 100644 include/qemu/yank.h
 create mode 100644 qapi/yank.json
 create mode 100644 util/yank.c

diff --git a/MAINTAINERS b/MAINTAINERS
index d48a4e8a8b..5d7e3c0e4b 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2705,6 +2705,13 @@ F: util/uuid.c
 F: include/qemu/uuid.h
 F: tests/test-uuid.c

+Yank feature
+M: Lukas Straub 
+S: Odd fixes
+F: util/yank.c
+F: include/qemu/yank.h
+F: qapi/yank.json
+
 COLO Framework
 M: zhanghailiang 
 S: Maintained
diff --git a/include/qemu/yank.h b/include/qemu/yank.h
new file mode 100644
index 00..96f5b2626f
--- /dev/null
+++ b/include/qemu/yank.h
@@ -0,0 +1,95 @@
+/*
+ * QEMU yank feature
+ *
+ * Copyright (c) Lukas Straub 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#ifndef YANK_H
+#define YANK_H
+
+#include "qapi/qapi-types-yank.h"
+
+typedef void (YankFn)(void *opaque);
+
+/**
+ * yank_register_instance: Register a new instance.
+ *
+ * This registers a new instance for yanking. Must be called before any yank
+ * function is registered for this instance.
+ *
+ * This function is thread-safe.
+ *
+ * @instance: The instance.
+ * @errp: Error object.
+ */
+void yank_register_instance(const YankInstance *instance, Error **errp);
+
+/**
+ * yank_unregister_instance: Unregister a instance.
+ *
+ * This unregisters a instance. Must be called only after every yank function
+ * of the instance has been unregistered.
+ *
+ * This function is thread-safe.
+ *
+ * @instance: The instance.
+ */
+void yank_unregister_instance(const YankInstance *instance);
+
+/**
+ * yank_register_function: Register a yank function
+ *
+ * This registers a yank function. All limitations of qmp oob commands apply
+ * to the yank function as well. See docs/devel/qapi-code-gen.txt under
+ * "An OOB-capable command handler must satisfy the following conditions".
+ *
+ * This function is thread-safe.
+ *
+ * @instance: The instance.
+ * @func: The yank function.
+ * @opaque: Will be passed to the yank function.
+ */
+void yank_register_function(const YankInstance *instance,
+YankFn *func,
+void *opaque);
+
+/**
+ * yank_unregister_function: Unregister a yank function
+ *
+ * This unregisters a yank function.
+ *
+ * This function is thread-safe.
+ *
+ * @instance: The instance.
+ * @func: func that was passed to yank_register_function.
+ * @opaque: opaque that was passed to yank_register_function.
+ */
+void yank_unregister_function(const YankInstance *instance,
+  YankFn *func,
+  void *opaque);
+
+/**
+ * yank_generic_iochannel: Generic yank function for iochannel
+ *
+ * This is a generic yank function which will call qio_channel_shutdown on the
+ * provided QIOChannel.
+ *
+ * @opaque: QIOChannel to shutdown
+ */
+void yank_generic_iochannel(void *opaque);
+
+#define BLOCKDEV_YANK_INSTANCE(the_node_name) (&(YankInstance) { \
+.type = YANK_INSTANCE_TYPE_BLOCK_NODE, \
+.u.block_node.node_name = (the_node_name) })
+
+#define CHARDEV_YANK_INSTANCE(the_id) (&(YankInstance) { \
+.type = YANK_INSTANCE_TYPE_CHARDEV, \
+.u.chardev.id = (the_id) })
+
+#define MIGRATION_YANK_INSTANCE (&(YankInstance) { \
+.type = YANK_INSTANCE_TYPE_MIGRATION })
+
+#endif
diff --git a/qapi/meson.build b/qapi/meson.build
index 0e98146f1f..ab68e7900e 100644
--- a/qapi/meson.build
+++ b/qapi/meson.build
@@ -47,6 +47,7 @@ qapi_all_modules = [
   'trace',
   'transaction',
   'ui',
+  'yank',
 ]

 qapi_storage_daemon_modules = [
diff --git a/qapi/qapi-schema.json b/qapi/qapi-schema.json
index 0b444b76d2..3441c9a9ae 100644
--- a/qapi/qapi-schema.json
+++ b/qapi/qapi-schema.json
@@ -86,6 +86,7 @@
 { 'include': 'machine.json' }
 { 'include': 'machine-target.json' }
 { 'include': 'replay.json' }
+{ 'include': 'yank.json' }
 { 'include': 'misc.json' }
 { 'include': 'misc-target.json' }
 { 'include': 'audio.json' }
diff --git a/qapi/yank.json b/qapi/yank.json
new file mode 100644
index 00..167a775594
--- /dev/null
+++ b/qapi/yank.json
@@ -0,0 +1,119 @@
+# -*- Mode: Python -*-
+# vim: 

[PATCH v12 0/7] Introduce 'yank' oob qmp command to recover from hanging qemu

2020-12-13 Thread Lukas Straub

Hello Everyone,
So here is v12.
@Marc-André Lureau, We still need an ACK for the chardev patch.

Changes:

v12:
 -rebase onto master
  -minor change to migration (removal of "defer" branch in 
qemu_start_incoming_migration)
 -add Reviewed-by tags

v11:
 -squashed MAINTAINERS update into patch 1
 -move qmp doc of yank before misc
 -add title for qmp docs
 -change "Since:" to 6.0
 -add Reviewed-by tags

v10:
 -moved from qapi/misc.json to qapi/yank.json
 -rename 'blockdev' -> 'block-node'
 -document difference betwen migration yank instance and migrate_cancel
 -better document return values of yank command
 -better document yank_lock
 -minor style and spelling fixes

v9:
 -rebase onto master
 -implemented new qmp api as proposed by Markus

v8:
 -add Reviewed-by and Acked-by tags
 -rebase onto master
  -minor change to migration
  -convert to meson
 -change "Since:" to 5.2
 -varios code style fixes (Markus Armbruster)
 -point to oob restrictions in comment to yank_register_function
  (Markus Armbruster)
 -improve qmp documentation (Markus Armbruster)
 -document oob suitability of qio_channel and io_shutdown (Markus Armbruster)

v7:
 -yank_register_instance now returns error via Error **errp instead of aborting
 -dropped "chardev/char.c: Check for duplicate id before  creating chardev"

v6:
 -add Reviewed-by and Acked-by tags
 -rebase on master
 -lots of changes in nbd due to rebase
 -only take maintainership of util/yank.c and include/qemu/yank.h (Daniel P. 
Berrangé)
 -fix a crash discovered by the newly added chardev test
 -fix the test itself

v5:
 -move yank.c to util/
 -move yank.h to include/qemu/
 -add license to yank.h
 -use const char*
 -nbd: use atomic_store_release and atomic_load_aqcuire
 -io-channel: ensure thread-safety and document it
 -add myself as maintainer for yank

v4:
 -fix build errors...

v3:
 -don't touch softmmu/vl.c, use __contructor__ attribute instead (Paolo Bonzini)
 -fix build errors
 -rewrite migration patch so it actually passes all tests

v2:
 -don't touch io/ code anymore
 -always register yank functions
 -'yank' now takes a list of instances to yank
 -'query-yank' returns a list of yankable instances

Overview:
Hello Everyone,
In many cases, if qemu has a network connection (qmp, migration, chardev, etc.)
to some other server and that server dies or hangs, qemu hangs too.
These patches introduce the new 'yank' out-of-band qmp command to recover from
these kinds of hangs. The different subsystems register callbacks which get
executed with the yank command. For example the callback can shutdown() a
socket. This is intended for the colo use-case, but it can be used for other
things too of course.

Regards,
Lukas Straub

Lukas Straub (7):
  Introduce yank feature
  block/nbd.c: Add yank feature
  chardev/char-socket.c: Add yank feature
  migration: Add yank feature
  io/channel-tls.c: make qio_channel_tls_shutdown thread-safe
  io: Document qmp oob suitability of qio_channel_shutdown and
io_shutdown
  tests/test-char.c: Wait for the chardev to connect in
char_socket_client_dupid_test

 MAINTAINERS   |   7 ++
 block/nbd.c   | 154 ++--
 chardev/char-socket.c |  35 ++
 include/io/channel.h  |   5 +-
 include/qemu/yank.h   |  95 +++
 io/channel-tls.c  |   6 +-
 migration/channel.c   |  13 ++
 migration/migration.c |  24 
 migration/multifd.c   |  10 ++
 migration/qemu-file-channel.c |   7 ++
 migration/savevm.c|   6 +
 qapi/meson.build  |   1 +
 qapi/qapi-schema.json |   1 +
 qapi/yank.json| 119 +++
 tests/test-char.c |   1 +
 util/meson.build  |   1 +
 util/yank.c   | 216 ++
 17 files changed, 637 insertions(+), 64 deletions(-)
 create mode 100644 include/qemu/yank.h
 create mode 100644 qapi/yank.json
 create mode 100644 util/yank.c

--
2.20.1


pgpVozt0Ip024.pgp
Description: OpenPGP digital signature


[PATCH v12 2/7] block/nbd.c: Add yank feature

2020-12-13 Thread Lukas Straub
Register a yank function which shuts down the socket and sets
s->state = NBD_CLIENT_QUIT. This is the same behaviour as if an
error occured.

Signed-off-by: Lukas Straub 
Acked-by: Stefan Hajnoczi 
Reviewed-by: Eric Blake 
---
 block/nbd.c | 154 +++-
 1 file changed, 93 insertions(+), 61 deletions(-)

diff --git a/block/nbd.c b/block/nbd.c
index 42536702b6..994d1e7b33 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -35,6 +35,7 @@
 #include "qemu/option.h"
 #include "qemu/cutils.h"
 #include "qemu/main-loop.h"
+#include "qemu/atomic.h"

 #include "qapi/qapi-visit-sockets.h"
 #include "qapi/qmp/qstring.h"
@@ -44,6 +45,8 @@
 #include "block/nbd.h"
 #include "block/block_int.h"

+#include "qemu/yank.h"
+
 #define EN_OPTSTR ":exportname="
 #define MAX_NBD_REQUESTS16

@@ -141,14 +144,13 @@ typedef struct BDRVNBDState {
 NBDConnectThread *connect_thread;
 } BDRVNBDState;

-static QIOChannelSocket *nbd_establish_connection(SocketAddress *saddr,
-  Error **errp);
-static QIOChannelSocket *nbd_co_establish_connection(BlockDriverState *bs,
- Error **errp);
+static int nbd_establish_connection(BlockDriverState *bs, SocketAddress *saddr,
+Error **errp);
+static int nbd_co_establish_connection(BlockDriverState *bs, Error **errp);
 static void nbd_co_establish_connection_cancel(BlockDriverState *bs,
bool detach);
-static int nbd_client_handshake(BlockDriverState *bs, QIOChannelSocket *sioc,
-Error **errp);
+static int nbd_client_handshake(BlockDriverState *bs, Error **errp);
+static void nbd_yank(void *opaque);

 static void nbd_clear_bdrvstate(BDRVNBDState *s)
 {
@@ -166,12 +168,12 @@ static void nbd_clear_bdrvstate(BDRVNBDState *s)
 static void nbd_channel_error(BDRVNBDState *s, int ret)
 {
 if (ret == -EIO) {
-if (s->state == NBD_CLIENT_CONNECTED) {
+if (qatomic_load_acquire(>state) == NBD_CLIENT_CONNECTED) {
 s->state = s->reconnect_delay ? NBD_CLIENT_CONNECTING_WAIT :
 NBD_CLIENT_CONNECTING_NOWAIT;
 }
 } else {
-if (s->state == NBD_CLIENT_CONNECTED) {
+if (qatomic_load_acquire(>state) == NBD_CLIENT_CONNECTED) {
 qio_channel_shutdown(s->ioc, QIO_CHANNEL_SHUTDOWN_BOTH, NULL);
 }
 s->state = NBD_CLIENT_QUIT;
@@ -204,7 +206,7 @@ static void reconnect_delay_timer_cb(void *opaque)
 {
 BDRVNBDState *s = opaque;

-if (s->state == NBD_CLIENT_CONNECTING_WAIT) {
+if (qatomic_load_acquire(>state) == NBD_CLIENT_CONNECTING_WAIT) {
 s->state = NBD_CLIENT_CONNECTING_NOWAIT;
 while (qemu_co_enter_next(>free_sema, NULL)) {
 /* Resume all queued requests */
@@ -216,7 +218,7 @@ static void reconnect_delay_timer_cb(void *opaque)

 static void reconnect_delay_timer_init(BDRVNBDState *s, uint64_t 
expire_time_ns)
 {
-if (s->state != NBD_CLIENT_CONNECTING_WAIT) {
+if (qatomic_load_acquire(>state) != NBD_CLIENT_CONNECTING_WAIT) {
 return;
 }

@@ -261,7 +263,7 @@ static void nbd_client_attach_aio_context(BlockDriverState 
*bs,
  * s->connection_co is either yielded from nbd_receive_reply or from
  * nbd_co_reconnect_loop()
  */
-if (s->state == NBD_CLIENT_CONNECTED) {
+if (qatomic_load_acquire(>state) == NBD_CLIENT_CONNECTED) {
 qio_channel_attach_aio_context(QIO_CHANNEL(s->ioc), new_context);
 }

@@ -287,7 +289,7 @@ static void coroutine_fn 
nbd_client_co_drain_begin(BlockDriverState *bs)

 reconnect_delay_timer_del(s);

-if (s->state == NBD_CLIENT_CONNECTING_WAIT) {
+if (qatomic_load_acquire(>state) == NBD_CLIENT_CONNECTING_WAIT) {
 s->state = NBD_CLIENT_CONNECTING_NOWAIT;
 qemu_co_queue_restart_all(>free_sema);
 }
@@ -338,13 +340,14 @@ static void nbd_teardown_connection(BlockDriverState *bs)

 static bool nbd_client_connecting(BDRVNBDState *s)
 {
-return s->state == NBD_CLIENT_CONNECTING_WAIT ||
-s->state == NBD_CLIENT_CONNECTING_NOWAIT;
+NBDClientState state = qatomic_load_acquire(>state);
+return state == NBD_CLIENT_CONNECTING_WAIT ||
+state == NBD_CLIENT_CONNECTING_NOWAIT;
 }

 static bool nbd_client_connecting_wait(BDRVNBDState *s)
 {
-return s->state == NBD_CLIENT_CONNECTING_WAIT;
+return qatomic_load_acquire(>state) == NBD_CLIENT_CONNECTING_WAIT;
 }

 static void connect_bh(void *opaque)
@@ -424,12 +427,12 @@ static void *connect_thread_func(void *opaque)
 return NULL;
 }

-static QIOChannelSocket *coroutine_fn
+static int coroutine_fn
 nbd_co_establish_connection(BlockDriverState *bs, Err

Re: [PATCH v11 2/7] block/nbd.c: Add yank feature

2020-12-02 Thread Lukas Straub
On Wed, 2 Dec 2020 15:18:48 +0300
Vladimir Sementsov-Ogievskiy  wrote:

> 15.11.2020 14:36, Lukas Straub wrote:
> > Register a yank function which shuts down the socket and sets
> > s->state = NBD_CLIENT_QUIT. This is the same behaviour as if an
> > error occured.
> > 
> > Signed-off-by: Lukas Straub
> > Acked-by: Stefan Hajnoczi  
> 
> Hi! Could I ask, what's the reason for qatomic_load_acquire access to 
> s->state? Is there same bug fixed? Or is it related somehow to new feature?
> 

Hi,
This is for the new feature, as the yank function runs in a separate thread.

Regards,
Lukas Straub

-- 



pgpx90ndOfANX.pgp
Description: OpenPGP digital signature


[PATCH v11 7/7] tests/test-char.c: Wait for the chardev to connect in char_socket_client_dupid_test

2020-11-15 Thread Lukas Straub
A connecting chardev object has an additional reference by the connecting
thread, so if the chardev is still connecting by the end of the test,
then the chardev object won't be freed. This in turn means that the yank
instance won't be unregistered and when running the next test-case
yank_register_instance will abort, because the yank instance is
already/still registered.

Signed-off-by: Lukas Straub 
Reviewed-by: Daniel P. Berrangé 
---
 tests/test-char.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tests/test-char.c b/tests/test-char.c
index 9196e566e9..aedb5c9eda 100644
--- a/tests/test-char.c
+++ b/tests/test-char.c
@@ -937,6 +937,7 @@ static void char_socket_client_dupid_test(gconstpointer 
opaque)
 g_assert_nonnull(opts);
 chr1 = qemu_chr_new_from_opts(opts, NULL, _abort);
 g_assert_nonnull(chr1);
+qemu_chr_wait_connected(chr1, _abort);

 chr2 = qemu_chr_new_from_opts(opts, NULL, _err);
 g_assert_null(chr2);
--
2.20.1


pgp3LYpnjFh8T.pgp
Description: OpenPGP digital signature


[PATCH v11 4/7] migration: Add yank feature

2020-11-15 Thread Lukas Straub
Register yank functions on sockets to shut them down.

Signed-off-by: Lukas Straub 
Acked-by: Stefan Hajnoczi 
Acked-by: Dr. David Alan Gilbert 
---
 migration/channel.c   | 13 +
 migration/migration.c | 25 +
 migration/multifd.c   | 10 ++
 migration/qemu-file-channel.c |  7 +++
 migration/savevm.c|  6 ++
 5 files changed, 61 insertions(+)

diff --git a/migration/channel.c b/migration/channel.c
index 8a783baa0b..35fe234e9c 100644
--- a/migration/channel.c
+++ b/migration/channel.c
@@ -18,6 +18,8 @@
 #include "trace.h"
 #include "qapi/error.h"
 #include "io/channel-tls.h"
+#include "io/channel-socket.h"
+#include "qemu/yank.h"

 /**
  * @migration_channel_process_incoming - Create new incoming migration channel
@@ -35,6 +37,11 @@ void migration_channel_process_incoming(QIOChannel *ioc)
 trace_migration_set_incoming_channel(
 ioc, object_get_typename(OBJECT(ioc)));

+if (object_dynamic_cast(OBJECT(ioc), TYPE_QIO_CHANNEL_SOCKET)) {
+yank_register_function(MIGRATION_YANK_INSTANCE, yank_generic_iochannel,
+   QIO_CHANNEL(ioc));
+}
+
 if (s->parameters.tls_creds &&
 *s->parameters.tls_creds &&
 !object_dynamic_cast(OBJECT(ioc),
@@ -67,6 +74,12 @@ void migration_channel_connect(MigrationState *s,
 ioc, object_get_typename(OBJECT(ioc)), hostname, error);

 if (!error) {
+if (object_dynamic_cast(OBJECT(ioc), TYPE_QIO_CHANNEL_SOCKET)) {
+yank_register_function(MIGRATION_YANK_INSTANCE,
+   yank_generic_iochannel,
+   QIO_CHANNEL(ioc));
+}
+
 if (s->parameters.tls_creds &&
 *s->parameters.tls_creds &&
 !object_dynamic_cast(OBJECT(ioc),
diff --git a/migration/migration.c b/migration/migration.c
index 87a9b59f83..a5add9d17d 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -56,6 +56,7 @@
 #include "net/announce.h"
 #include "qemu/queue.h"
 #include "multifd.h"
+#include "qemu/yank.h"

 #ifdef CONFIG_VFIO
 #include "hw/vfio/vfio-common.h"
@@ -252,6 +253,8 @@ void migration_incoming_state_destroy(void)
 qapi_free_SocketAddressList(mis->socket_address_list);
 mis->socket_address_list = NULL;
 }
+
+yank_unregister_instance(MIGRATION_YANK_INSTANCE);
 }

 static void migrate_generate_event(int new_state)
@@ -429,8 +432,14 @@ void qemu_start_incoming_migration(const char *uri, Error 
**errp)
 {
 const char *p = NULL;

+yank_register_instance(MIGRATION_YANK_INSTANCE, errp);
+if (*errp) {
+return;
+}
+
 qapi_event_send_migration(MIGRATION_STATUS_SETUP);
 if (!strcmp(uri, "defer")) {
+yank_unregister_instance(MIGRATION_YANK_INSTANCE);
 deferred_incoming_migration(errp);
 } else if (strstart(uri, "tcp:", ) ||
strstart(uri, "unix:", NULL) ||
@@ -445,6 +454,7 @@ void qemu_start_incoming_migration(const char *uri, Error 
**errp)
 } else if (strstart(uri, "fd:", )) {
 fd_start_incoming_migration(p, errp);
 } else {
+yank_unregister_instance(MIGRATION_YANK_INSTANCE);
 error_setg(errp, "unknown migration protocol: %s", uri);
 }
 }
@@ -1750,6 +1760,7 @@ static void migrate_fd_cleanup(MigrationState *s)
 }
 notifier_list_notify(_state_notifiers, s);
 block_cleanup_parameters(s);
+yank_unregister_instance(MIGRATION_YANK_INSTANCE);
 }

 static void migrate_fd_cleanup_schedule(MigrationState *s)
@@ -2024,6 +2035,7 @@ void qmp_migrate_recover(const char *uri, Error **errp)
  * only re-setup the migration stream and poke existing migration
  * to continue using that newly established channel.
  */
+yank_unregister_instance(MIGRATION_YANK_INSTANCE);
 qemu_start_incoming_migration(uri, errp);
 }

@@ -2161,6 +2173,13 @@ void qmp_migrate(const char *uri, bool has_blk, bool blk,
 return;
 }

+if (!(has_resume && resume)) {
+yank_register_instance(MIGRATION_YANK_INSTANCE, errp);
+if (*errp) {
+return;
+}
+}
+
 if (strstart(uri, "tcp:", ) ||
 strstart(uri, "unix:", NULL) ||
 strstart(uri, "vsock:", NULL)) {
@@ -2174,6 +2193,9 @@ void qmp_migrate(const char *uri, bool has_blk, bool blk,
 } else if (strstart(uri, "fd:", )) {
 fd_start_outgoing_migration(s, p, _err);
 } else {
+if (!(has_resume && resume)) {
+yank_unregister_instance(MIGRATION_YANK_INSTANCE);
+}
 error_setg(errp, QERR_INVALID_PARAMETER_VALUE, "uri",
"a valid migration protocol&qu

[PATCH v11 3/7] chardev/char-socket.c: Add yank feature

2020-11-15 Thread Lukas Straub
Register a yank function to shutdown the socket on yank.

Signed-off-by: Lukas Straub 
Acked-by: Stefan Hajnoczi 
---
 chardev/char-socket.c | 35 +++
 1 file changed, 35 insertions(+)

diff --git a/chardev/char-socket.c b/chardev/char-socket.c
index 213a4c8dd0..7f2ee9a338 100644
--- a/chardev/char-socket.c
+++ b/chardev/char-socket.c
@@ -34,6 +34,7 @@
 #include "qapi/error.h"
 #include "qapi/clone-visitor.h"
 #include "qapi/qapi-visit-sockets.h"
+#include "qemu/yank.h"

 #include "chardev/char-io.h"
 #include "qom/object.h"
@@ -70,6 +71,7 @@ struct SocketChardev {
 size_t read_msgfds_num;
 int *write_msgfds;
 size_t write_msgfds_num;
+bool registered_yank;

 SocketAddress *addr;
 bool is_listen;
@@ -415,6 +417,12 @@ static void tcp_chr_free_connection(Chardev *chr)

 tcp_set_msgfds(chr, NULL, 0);
 remove_fd_in_watch(chr);
+if (s->state == TCP_CHARDEV_STATE_CONNECTING
+|| s->state == TCP_CHARDEV_STATE_CONNECTED) {
+yank_unregister_function(CHARDEV_YANK_INSTANCE(chr->label),
+ yank_generic_iochannel,
+ QIO_CHANNEL(s->sioc));
+}
 object_unref(OBJECT(s->sioc));
 s->sioc = NULL;
 object_unref(OBJECT(s->ioc));
@@ -932,6 +940,9 @@ static int tcp_chr_add_client(Chardev *chr, int fd)
 }
 tcp_chr_change_state(s, TCP_CHARDEV_STATE_CONNECTING);
 tcp_chr_set_client_ioc_name(chr, sioc);
+yank_register_function(CHARDEV_YANK_INSTANCE(chr->label),
+   yank_generic_iochannel,
+   QIO_CHANNEL(sioc));
 ret = tcp_chr_new_client(chr, sioc);
 object_unref(OBJECT(sioc));
 return ret;
@@ -946,6 +957,9 @@ static void tcp_chr_accept(QIONetListener *listener,

 tcp_chr_change_state(s, TCP_CHARDEV_STATE_CONNECTING);
 tcp_chr_set_client_ioc_name(chr, cioc);
+yank_register_function(CHARDEV_YANK_INSTANCE(chr->label),
+   yank_generic_iochannel,
+   QIO_CHANNEL(cioc));
 tcp_chr_new_client(chr, cioc);
 }

@@ -961,6 +975,9 @@ static int tcp_chr_connect_client_sync(Chardev *chr, Error 
**errp)
 object_unref(OBJECT(sioc));
 return -1;
 }
+yank_register_function(CHARDEV_YANK_INSTANCE(chr->label),
+   yank_generic_iochannel,
+   QIO_CHANNEL(sioc));
 tcp_chr_new_client(chr, sioc);
 object_unref(OBJECT(sioc));
 return 0;
@@ -976,6 +993,9 @@ static void tcp_chr_accept_server_sync(Chardev *chr)
 tcp_chr_change_state(s, TCP_CHARDEV_STATE_CONNECTING);
 sioc = qio_net_listener_wait_client(s->listener);
 tcp_chr_set_client_ioc_name(chr, sioc);
+yank_register_function(CHARDEV_YANK_INSTANCE(chr->label),
+   yank_generic_iochannel,
+   QIO_CHANNEL(sioc));
 tcp_chr_new_client(chr, sioc);
 object_unref(OBJECT(sioc));
 }
@@ -1086,6 +1106,9 @@ static void char_socket_finalize(Object *obj)
 object_unref(OBJECT(s->tls_creds));
 }
 g_free(s->tls_authz);
+if (s->registered_yank) {
+yank_unregister_instance(CHARDEV_YANK_INSTANCE(chr->label));
+}

 qemu_chr_be_event(chr, CHR_EVENT_CLOSED);
 }
@@ -1101,6 +1124,9 @@ static void qemu_chr_socket_connected(QIOTask *task, void 
*opaque)

 if (qio_task_propagate_error(task, )) {
 tcp_chr_change_state(s, TCP_CHARDEV_STATE_DISCONNECTED);
+yank_unregister_function(CHARDEV_YANK_INSTANCE(chr->label),
+ yank_generic_iochannel,
+ QIO_CHANNEL(sioc));
 check_report_connect_error(chr, err);
 goto cleanup;
 }
@@ -1134,6 +1160,9 @@ static void tcp_chr_connect_client_async(Chardev *chr)
 tcp_chr_change_state(s, TCP_CHARDEV_STATE_CONNECTING);
 sioc = qio_channel_socket_new();
 tcp_chr_set_client_ioc_name(chr, sioc);
+yank_register_function(CHARDEV_YANK_INSTANCE(chr->label),
+   yank_generic_iochannel,
+   QIO_CHANNEL(sioc));
 /*
  * Normally code would use the qio_channel_socket_connect_async
  * method which uses a QIOTask + qio_task_set_error internally
@@ -1376,6 +1405,12 @@ static void qmp_chardev_open_socket(Chardev *chr,
 qemu_chr_set_feature(chr, QEMU_CHAR_FEATURE_FD_PASS);
 }

+yank_register_instance(CHARDEV_YANK_INSTANCE(chr->label), errp);
+if (*errp) {
+return;
+}
+s->registered_yank = true;
+
 /* be isn't opened until we get a connection */
 *be_opened = false;

--
2.20.1



pgppj_z8lj0Y7.pgp
Description: OpenPGP digital signature


[PATCH v11 2/7] block/nbd.c: Add yank feature

2020-11-15 Thread Lukas Straub
Register a yank function which shuts down the socket and sets
s->state = NBD_CLIENT_QUIT. This is the same behaviour as if an
error occured.

Signed-off-by: Lukas Straub 
Acked-by: Stefan Hajnoczi 
---
 block/nbd.c | 154 +++-
 1 file changed, 93 insertions(+), 61 deletions(-)

diff --git a/block/nbd.c b/block/nbd.c
index 42536702b6..994d1e7b33 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -35,6 +35,7 @@
 #include "qemu/option.h"
 #include "qemu/cutils.h"
 #include "qemu/main-loop.h"
+#include "qemu/atomic.h"

 #include "qapi/qapi-visit-sockets.h"
 #include "qapi/qmp/qstring.h"
@@ -44,6 +45,8 @@
 #include "block/nbd.h"
 #include "block/block_int.h"

+#include "qemu/yank.h"
+
 #define EN_OPTSTR ":exportname="
 #define MAX_NBD_REQUESTS16

@@ -141,14 +144,13 @@ typedef struct BDRVNBDState {
 NBDConnectThread *connect_thread;
 } BDRVNBDState;

-static QIOChannelSocket *nbd_establish_connection(SocketAddress *saddr,
-  Error **errp);
-static QIOChannelSocket *nbd_co_establish_connection(BlockDriverState *bs,
- Error **errp);
+static int nbd_establish_connection(BlockDriverState *bs, SocketAddress *saddr,
+Error **errp);
+static int nbd_co_establish_connection(BlockDriverState *bs, Error **errp);
 static void nbd_co_establish_connection_cancel(BlockDriverState *bs,
bool detach);
-static int nbd_client_handshake(BlockDriverState *bs, QIOChannelSocket *sioc,
-Error **errp);
+static int nbd_client_handshake(BlockDriverState *bs, Error **errp);
+static void nbd_yank(void *opaque);

 static void nbd_clear_bdrvstate(BDRVNBDState *s)
 {
@@ -166,12 +168,12 @@ static void nbd_clear_bdrvstate(BDRVNBDState *s)
 static void nbd_channel_error(BDRVNBDState *s, int ret)
 {
 if (ret == -EIO) {
-if (s->state == NBD_CLIENT_CONNECTED) {
+if (qatomic_load_acquire(>state) == NBD_CLIENT_CONNECTED) {
 s->state = s->reconnect_delay ? NBD_CLIENT_CONNECTING_WAIT :
 NBD_CLIENT_CONNECTING_NOWAIT;
 }
 } else {
-if (s->state == NBD_CLIENT_CONNECTED) {
+if (qatomic_load_acquire(>state) == NBD_CLIENT_CONNECTED) {
 qio_channel_shutdown(s->ioc, QIO_CHANNEL_SHUTDOWN_BOTH, NULL);
 }
 s->state = NBD_CLIENT_QUIT;
@@ -204,7 +206,7 @@ static void reconnect_delay_timer_cb(void *opaque)
 {
 BDRVNBDState *s = opaque;

-if (s->state == NBD_CLIENT_CONNECTING_WAIT) {
+if (qatomic_load_acquire(>state) == NBD_CLIENT_CONNECTING_WAIT) {
 s->state = NBD_CLIENT_CONNECTING_NOWAIT;
 while (qemu_co_enter_next(>free_sema, NULL)) {
 /* Resume all queued requests */
@@ -216,7 +218,7 @@ static void reconnect_delay_timer_cb(void *opaque)

 static void reconnect_delay_timer_init(BDRVNBDState *s, uint64_t 
expire_time_ns)
 {
-if (s->state != NBD_CLIENT_CONNECTING_WAIT) {
+if (qatomic_load_acquire(>state) != NBD_CLIENT_CONNECTING_WAIT) {
 return;
 }

@@ -261,7 +263,7 @@ static void nbd_client_attach_aio_context(BlockDriverState 
*bs,
  * s->connection_co is either yielded from nbd_receive_reply or from
  * nbd_co_reconnect_loop()
  */
-if (s->state == NBD_CLIENT_CONNECTED) {
+if (qatomic_load_acquire(>state) == NBD_CLIENT_CONNECTED) {
 qio_channel_attach_aio_context(QIO_CHANNEL(s->ioc), new_context);
 }

@@ -287,7 +289,7 @@ static void coroutine_fn 
nbd_client_co_drain_begin(BlockDriverState *bs)

 reconnect_delay_timer_del(s);

-if (s->state == NBD_CLIENT_CONNECTING_WAIT) {
+if (qatomic_load_acquire(>state) == NBD_CLIENT_CONNECTING_WAIT) {
 s->state = NBD_CLIENT_CONNECTING_NOWAIT;
 qemu_co_queue_restart_all(>free_sema);
 }
@@ -338,13 +340,14 @@ static void nbd_teardown_connection(BlockDriverState *bs)

 static bool nbd_client_connecting(BDRVNBDState *s)
 {
-return s->state == NBD_CLIENT_CONNECTING_WAIT ||
-s->state == NBD_CLIENT_CONNECTING_NOWAIT;
+NBDClientState state = qatomic_load_acquire(>state);
+return state == NBD_CLIENT_CONNECTING_WAIT ||
+state == NBD_CLIENT_CONNECTING_NOWAIT;
 }

 static bool nbd_client_connecting_wait(BDRVNBDState *s)
 {
-return s->state == NBD_CLIENT_CONNECTING_WAIT;
+return qatomic_load_acquire(>state) == NBD_CLIENT_CONNECTING_WAIT;
 }

 static void connect_bh(void *opaque)
@@ -424,12 +427,12 @@ static void *connect_thread_func(void *opaque)
 return NULL;
 }

-static QIOChannelSocket *coroutine_fn
+static int coroutine_fn
 nbd_co_establish_connection(BlockDriverState *bs, Error **errp)
 {
+int ret;

[PATCH v11 6/7] io: Document qmp oob suitability of qio_channel_shutdown and io_shutdown

2020-11-15 Thread Lukas Straub
Migration and yank code assume that qio_channel_shutdown is thread
-safe and can be called from qmp oob handler. Document this after
checking the code.

Signed-off-by: Lukas Straub 
Acked-by: Stefan Hajnoczi 
Reviewed-by: Daniel P. Berrangé 
---
 include/io/channel.h | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/include/io/channel.h b/include/io/channel.h
index 4d6fe45f63..ab9ea77959 100644
--- a/include/io/channel.h
+++ b/include/io/channel.h
@@ -92,7 +92,8 @@ struct QIOChannel {
  * provide additional optional features.
  *
  * Consult the corresponding public API docs for a description
- * of the semantics of each callback
+ * of the semantics of each callback. io_shutdown in particular
+ * must be thread-safe, terminate quickly and must not block.
  */
 struct QIOChannelClass {
 ObjectClass parent;
@@ -510,6 +511,8 @@ int qio_channel_close(QIOChannel *ioc,
  * QIO_CHANNEL_FEATURE_SHUTDOWN prior to calling
  * this method.
  *
+ * This function is thread-safe, terminates quickly and does not block.
+ *
  * Returns: 0 on success, -1 on error
  */
 int qio_channel_shutdown(QIOChannel *ioc,
--
2.20.1



pgpgWjyPDuAY5.pgp
Description: OpenPGP digital signature


[PATCH v11 5/7] io/channel-tls.c: make qio_channel_tls_shutdown thread-safe

2020-11-15 Thread Lukas Straub
Make qio_channel_tls_shutdown thread-safe by using atomics when
accessing tioc->shutdown.

Signed-off-by: Lukas Straub 
Acked-by: Stefan Hajnoczi 
Reviewed-by: Daniel P. Berrangé 
---
 io/channel-tls.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/io/channel-tls.c b/io/channel-tls.c
index 388f019977..2ae1b92fc0 100644
--- a/io/channel-tls.c
+++ b/io/channel-tls.c
@@ -23,6 +23,7 @@
 #include "qemu/module.h"
 #include "io/channel-tls.h"
 #include "trace.h"
+#include "qemu/atomic.h"


 static ssize_t qio_channel_tls_write_handler(const char *buf,
@@ -277,7 +278,8 @@ static ssize_t qio_channel_tls_readv(QIOChannel *ioc,
 return QIO_CHANNEL_ERR_BLOCK;
 }
 } else if (errno == ECONNABORTED &&
-   (tioc->shutdown & QIO_CHANNEL_SHUTDOWN_READ)) {
+   (qatomic_load_acquire(>shutdown) &
+QIO_CHANNEL_SHUTDOWN_READ)) {
 return 0;
 }

@@ -361,7 +363,7 @@ static int qio_channel_tls_shutdown(QIOChannel *ioc,
 {
 QIOChannelTLS *tioc = QIO_CHANNEL_TLS(ioc);

-tioc->shutdown |= how;
+qatomic_or(>shutdown, how);

 return qio_channel_shutdown(tioc->master, how, errp);
 }
--
2.20.1



pgp9E3Ifl4t4T.pgp
Description: OpenPGP digital signature


[PATCH v11 0/7] Introduce 'yank' oob qmp command to recover from hanging qemu

2020-11-15 Thread Lukas Straub

Hello Everyone,
So here is v11.
@Eric Blake and @Marc-André Lureau: We still need ACKs for NBD and chardev.

Changes:

v11:
 -squashed MAINTAINERS update into patch 1
 -move qmp doc of yank before misc
 -add title for qmp docs
 -change "Since:" to 6.0
 -add Reviewed-by tags

v10:
 -moved from qapi/misc.json to qapi/yank.json
 -rename 'blockdev' -> 'block-node'
 -document difference betwen migration yank instance and migrate_cancel
 -better document return values of yank command
 -better document yank_lock
 -minor style and spelling fixes

v9:
 -rebase onto master
 -implemented new qmp api as proposed by Markus

v8:
 -add Reviewed-by and Acked-by tags
 -rebase onto master
  -minor change to migration
  -convert to meson
 -change "Since:" to 5.2
 -varios code style fixes (Markus Armbruster)
 -point to oob restrictions in comment to yank_register_function
  (Markus Armbruster)
 -improve qmp documentation (Markus Armbruster)
 -document oob suitability of qio_channel and io_shutdown (Markus Armbruster)

v7:
 -yank_register_instance now returns error via Error **errp instead of aborting
 -dropped "chardev/char.c: Check for duplicate id before  creating chardev"

v6:
 -add Reviewed-by and Acked-by tags
 -rebase on master
 -lots of changes in nbd due to rebase
 -only take maintainership of util/yank.c and include/qemu/yank.h (Daniel P. 
Berrangé)
 -fix a crash discovered by the newly added chardev test
 -fix the test itself

v5:
 -move yank.c to util/
 -move yank.h to include/qemu/
 -add license to yank.h
 -use const char*
 -nbd: use atomic_store_release and atomic_load_aqcuire
 -io-channel: ensure thread-safety and document it
 -add myself as maintainer for yank

v4:
 -fix build errors...

v3:
 -don't touch softmmu/vl.c, use __contructor__ attribute instead (Paolo Bonzini)
 -fix build errors
 -rewrite migration patch so it actually passes all tests

v2:
 -don't touch io/ code anymore
 -always register yank functions
 -'yank' now takes a list of instances to yank
 -'query-yank' returns a list of yankable instances

Overview:
Hello Everyone,
In many cases, if qemu has a network connection (qmp, migration, chardev, etc.)
to some other server and that server dies or hangs, qemu hangs too.
These patches introduce the new 'yank' out-of-band qmp command to recover from
these kinds of hangs. The different subsystems register callbacks which get
executed with the yank command. For example the callback can shutdown() a
socket. This is intended for the colo use-case, but it can be used for other
things too of course.

Regards,
Lukas Straub

Lukas Straub (7):
  Introduce yank feature
  block/nbd.c: Add yank feature
  chardev/char-socket.c: Add yank feature
  migration: Add yank feature
  io/channel-tls.c: make qio_channel_tls_shutdown thread-safe
  io: Document qmp oob suitability of qio_channel_shutdown and
io_shutdown
  tests/test-char.c: Wait for the chardev to connect in
char_socket_client_dupid_test

 MAINTAINERS   |   7 ++
 block/nbd.c   | 154 ++--
 chardev/char-socket.c |  35 ++
 include/io/channel.h  |   5 +-
 include/qemu/yank.h   |  95 +++
 io/channel-tls.c  |   6 +-
 migration/channel.c   |  13 ++
 migration/migration.c |  25 
 migration/multifd.c   |  10 ++
 migration/qemu-file-channel.c |   7 ++
 migration/savevm.c|   6 +
 qapi/meson.build  |   1 +
 qapi/qapi-schema.json |   1 +
 qapi/yank.json| 119 +++
 tests/test-char.c |   1 +
 util/meson.build  |   1 +
 util/yank.c   | 216 ++
 17 files changed, 638 insertions(+), 64 deletions(-)
 create mode 100644 include/qemu/yank.h
 create mode 100644 qapi/yank.json
 create mode 100644 util/yank.c

--
2.20.1


pgpKXlhhXx8qQ.pgp
Description: OpenPGP digital signature


[PATCH v11 1/7] Introduce yank feature

2020-11-15 Thread Lukas Straub
The yank feature allows to recover from hanging qemu by "yanking"
at various parts. Other qemu systems can register themselves and
multiple yank functions. Then all yank functions for selected
instances can be called by the 'yank' out-of-band qmp command.
Available instances can be queried by a 'query-yank' oob command.

Signed-off-by: Lukas Straub 
Acked-by: Stefan Hajnoczi 
Reviewed-by: Markus Armbruster 
---
 MAINTAINERS   |   7 ++
 include/qemu/yank.h   |  95 +++
 qapi/meson.build  |   1 +
 qapi/qapi-schema.json |   1 +
 qapi/yank.json| 119 +++
 util/meson.build  |   1 +
 util/yank.c   | 216 ++
 7 files changed, 440 insertions(+)
 create mode 100644 include/qemu/yank.h
 create mode 100644 qapi/yank.json
 create mode 100644 util/yank.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 2e018a0c1d..46ff468b13 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2688,6 +2688,13 @@ F: util/uuid.c
 F: include/qemu/uuid.h
 F: tests/test-uuid.c

+Yank feature
+M: Lukas Straub 
+S: Odd fixes
+F: util/yank.c
+F: include/qemu/yank.h
+F: qapi/yank.json
+
 COLO Framework
 M: zhanghailiang 
 S: Maintained
diff --git a/include/qemu/yank.h b/include/qemu/yank.h
new file mode 100644
index 00..96f5b2626f
--- /dev/null
+++ b/include/qemu/yank.h
@@ -0,0 +1,95 @@
+/*
+ * QEMU yank feature
+ *
+ * Copyright (c) Lukas Straub 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#ifndef YANK_H
+#define YANK_H
+
+#include "qapi/qapi-types-yank.h"
+
+typedef void (YankFn)(void *opaque);
+
+/**
+ * yank_register_instance: Register a new instance.
+ *
+ * This registers a new instance for yanking. Must be called before any yank
+ * function is registered for this instance.
+ *
+ * This function is thread-safe.
+ *
+ * @instance: The instance.
+ * @errp: Error object.
+ */
+void yank_register_instance(const YankInstance *instance, Error **errp);
+
+/**
+ * yank_unregister_instance: Unregister a instance.
+ *
+ * This unregisters a instance. Must be called only after every yank function
+ * of the instance has been unregistered.
+ *
+ * This function is thread-safe.
+ *
+ * @instance: The instance.
+ */
+void yank_unregister_instance(const YankInstance *instance);
+
+/**
+ * yank_register_function: Register a yank function
+ *
+ * This registers a yank function. All limitations of qmp oob commands apply
+ * to the yank function as well. See docs/devel/qapi-code-gen.txt under
+ * "An OOB-capable command handler must satisfy the following conditions".
+ *
+ * This function is thread-safe.
+ *
+ * @instance: The instance.
+ * @func: The yank function.
+ * @opaque: Will be passed to the yank function.
+ */
+void yank_register_function(const YankInstance *instance,
+YankFn *func,
+void *opaque);
+
+/**
+ * yank_unregister_function: Unregister a yank function
+ *
+ * This unregisters a yank function.
+ *
+ * This function is thread-safe.
+ *
+ * @instance: The instance.
+ * @func: func that was passed to yank_register_function.
+ * @opaque: opaque that was passed to yank_register_function.
+ */
+void yank_unregister_function(const YankInstance *instance,
+  YankFn *func,
+  void *opaque);
+
+/**
+ * yank_generic_iochannel: Generic yank function for iochannel
+ *
+ * This is a generic yank function which will call qio_channel_shutdown on the
+ * provided QIOChannel.
+ *
+ * @opaque: QIOChannel to shutdown
+ */
+void yank_generic_iochannel(void *opaque);
+
+#define BLOCKDEV_YANK_INSTANCE(the_node_name) (&(YankInstance) { \
+.type = YANK_INSTANCE_TYPE_BLOCK_NODE, \
+.u.block_node.node_name = (the_node_name) })
+
+#define CHARDEV_YANK_INSTANCE(the_id) (&(YankInstance) { \
+.type = YANK_INSTANCE_TYPE_CHARDEV, \
+.u.chardev.id = (the_id) })
+
+#define MIGRATION_YANK_INSTANCE (&(YankInstance) { \
+.type = YANK_INSTANCE_TYPE_MIGRATION })
+
+#endif
diff --git a/qapi/meson.build b/qapi/meson.build
index 0e98146f1f..ab68e7900e 100644
--- a/qapi/meson.build
+++ b/qapi/meson.build
@@ -47,6 +47,7 @@ qapi_all_modules = [
   'trace',
   'transaction',
   'ui',
+  'yank',
 ]

 qapi_storage_daemon_modules = [
diff --git a/qapi/qapi-schema.json b/qapi/qapi-schema.json
index 0b444b76d2..3441c9a9ae 100644
--- a/qapi/qapi-schema.json
+++ b/qapi/qapi-schema.json
@@ -86,6 +86,7 @@
 { 'include': 'machine.json' }
 { 'include': 'machine-target.json' }
 { 'include': 'replay.json' }
+{ 'include': 'yank.json' }
 { 'include': 'misc.json' }
 { 'include': 'misc-target.json' }
 { 'include': 'audio.json' }
diff --git a/qapi/yank.json b/qapi/yank.json
new file mode 100644
index 00..167a775594
--- /dev/null
+++ b/qapi/yank.json
@@ -0,0 +1,119 @@
+# -*- Mode: Python -*-
+# vim: 

Re: [PATCH v10 7/8] MAINTAINERS: Add myself as maintainer for yank feature

2020-11-15 Thread Lukas Straub
On Mon, 02 Nov 2020 07:33:54 +0100
Markus Armbruster  wrote:

> Lukas Straub  writes:
> 
> > I'll maintain this for now as the colo usecase is the first user
> > of this functionality.
> >
> > Signed-off-by: Lukas Straub 
> > Acked-by: Stefan Hajnoczi 
> > ---
> >  MAINTAINERS | 7 +++
> >  1 file changed, 7 insertions(+)
> >
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index 8c744a9bdf..81288fd219 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -2676,6 +2676,13 @@ F: util/uuid.c
> >  F: include/qemu/uuid.h
> >  F: tests/test-uuid.c
> >
> > +Yank feature
> > +M: Lukas Straub 
> > +S: Odd fixes
> > +F: util/yank.c
> > +F: include/qemu/yank.h
> > +F: qapi/yank.json
> > +
> >  COLO Framework
> >  M: zhanghailiang 
> >  S: Maintained  
> 
> I'd squash this into PATCH 1 to mollify checkpatch.pl.
> 
> Regardless,
> Reviewed-by: Markus Armbruster 
> 

Changed for the next version.

-- 



pgpwbyQZpUaYM.pgp
Description: OpenPGP digital signature


Re: [PATCH v10 1/8] Introduce yank feature

2020-11-15 Thread Lukas Straub
On Mon, 02 Nov 2020 07:32:55 +0100
Markus Armbruster  wrote:

> Lukas Straub  writes:
> 
> > The yank feature allows to recover from hanging qemu by "yanking"
> > at various parts. Other qemu systems can register themselves and
> > multiple yank functions. Then all yank functions for selected
> > instances can be called by the 'yank' out-of-band qmp command.
> > Available instances can be queried by a 'query-yank' oob command.
> >
> > Signed-off-by: Lukas Straub 
> > Acked-by: Stefan Hajnoczi   
> [...]
> >  qapi_storage_daemon_modules = [
> > diff --git a/qapi/qapi-schema.json b/qapi/qapi-schema.json
> > index 0b444b76d2..79c1705ed7 100644
> > --- a/qapi/qapi-schema.json
> > +++ b/qapi/qapi-schema.json
> > @@ -91,3 +91,4 @@
> >  { 'include': 'audio.json' }
> >  { 'include': 'acpi.json' }
> >  { 'include': 'pci.json' }
> > +{ 'include': 'yank.json' }  
> 
> This adds the documentation at the very end of the reference manual.  Is
> this where you want it to go?  Check generated
> docs/interop/qemu-qmp-ref.html.

I've moved it above misc for the next version.

> > diff --git a/qapi/yank.json b/qapi/yank.json
> > new file mode 100644
> > index 00..1964a2202e
> > --- /dev/null
> > +++ b/qapi/yank.json
> > @@ -0,0 +1,115 @@
> > +# -*- Mode: Python -*-
> > +# vim: filetype=python
> > +#
> > +  
> 
> Please add a suitable heading here.  Headings look like this:
> 
>##
># Text of heading goes here
>##

Changed for the next version.

> Without it, the yank stuff gets squashed into the previous section
> (happens to be PCI).
> 
> If you want to add an introduction or overview, it goes right below the
> heading.  I'm not asking you to do that, I'm only telling you what's
> possible.
> 
> [...]
> 
> Solid work, pleasant to review, thanks!
> 
> Reviewed-by: Markus Armbruster 
> 

Thanks!

-- 



pgpn2W4tkMdVg.pgp
Description: OpenPGP digital signature


[PATCH v10 5/8] io/channel-tls.c: make qio_channel_tls_shutdown thread-safe

2020-10-30 Thread Lukas Straub
Make qio_channel_tls_shutdown thread-safe by using atomics when
accessing tioc->shutdown.

Signed-off-by: Lukas Straub 
Acked-by: Stefan Hajnoczi 
Reviewed-by: Daniel P. Berrangé 
---
 io/channel-tls.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/io/channel-tls.c b/io/channel-tls.c
index 7ec8ceff2f..10d0bf59aa 100644
--- a/io/channel-tls.c
+++ b/io/channel-tls.c
@@ -23,6 +23,7 @@
 #include "qemu/module.h"
 #include "io/channel-tls.h"
 #include "trace.h"
+#include "qemu/atomic.h"


 static ssize_t qio_channel_tls_write_handler(const char *buf,
@@ -277,7 +278,8 @@ static ssize_t qio_channel_tls_readv(QIOChannel *ioc,
 return QIO_CHANNEL_ERR_BLOCK;
 }
 } else if (errno == ECONNABORTED &&
-   (tioc->shutdown & QIO_CHANNEL_SHUTDOWN_READ)) {
+   (qatomic_load_acquire(>shutdown) &
+QIO_CHANNEL_SHUTDOWN_READ)) {
 return 0;
 }

@@ -361,7 +363,7 @@ static int qio_channel_tls_shutdown(QIOChannel *ioc,
 {
 QIOChannelTLS *tioc = QIO_CHANNEL_TLS(ioc);

-tioc->shutdown |= how;
+qatomic_or(>shutdown, how);

 return qio_channel_shutdown(tioc->master, how, errp);
 }
--
2.20.1



pgprghgOqQAAI.pgp
Description: OpenPGP digital signature


[PATCH v10 8/8] tests/test-char.c: Wait for the chardev to connect in char_socket_client_dupid_test

2020-10-30 Thread Lukas Straub
A connecting chardev object has an additional reference by the connecting
thread, so if the chardev is still connecting by the end of the test,
then the chardev object won't be freed. This in turn means that the yank
instance won't be unregistered and when running the next test-case
yank_register_instance will abort, because the yank instance is
already/still registered.

Signed-off-by: Lukas Straub 
Reviewed-by: Daniel P. Berrangé 
---
 tests/test-char.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tests/test-char.c b/tests/test-char.c
index 9196e566e9..aedb5c9eda 100644
--- a/tests/test-char.c
+++ b/tests/test-char.c
@@ -937,6 +937,7 @@ static void char_socket_client_dupid_test(gconstpointer 
opaque)
 g_assert_nonnull(opts);
 chr1 = qemu_chr_new_from_opts(opts, NULL, _abort);
 g_assert_nonnull(chr1);
+qemu_chr_wait_connected(chr1, _abort);

 chr2 = qemu_chr_new_from_opts(opts, NULL, _err);
 g_assert_null(chr2);
--
2.20.1


pgpJtO1z6hH_l.pgp
Description: OpenPGP digital signature


[PATCH v10 4/8] migration: Add yank feature

2020-10-30 Thread Lukas Straub
Register yank functions on sockets to shut them down.

Signed-off-by: Lukas Straub 
Acked-by: Stefan Hajnoczi 
Acked-by: Dr. David Alan Gilbert 
---
 migration/channel.c   | 13 +
 migration/migration.c | 25 +
 migration/multifd.c   | 10 ++
 migration/qemu-file-channel.c |  7 +++
 migration/savevm.c|  6 ++
 5 files changed, 61 insertions(+)

diff --git a/migration/channel.c b/migration/channel.c
index 8a783baa0b..35fe234e9c 100644
--- a/migration/channel.c
+++ b/migration/channel.c
@@ -18,6 +18,8 @@
 #include "trace.h"
 #include "qapi/error.h"
 #include "io/channel-tls.h"
+#include "io/channel-socket.h"
+#include "qemu/yank.h"

 /**
  * @migration_channel_process_incoming - Create new incoming migration channel
@@ -35,6 +37,11 @@ void migration_channel_process_incoming(QIOChannel *ioc)
 trace_migration_set_incoming_channel(
 ioc, object_get_typename(OBJECT(ioc)));

+if (object_dynamic_cast(OBJECT(ioc), TYPE_QIO_CHANNEL_SOCKET)) {
+yank_register_function(MIGRATION_YANK_INSTANCE, yank_generic_iochannel,
+   QIO_CHANNEL(ioc));
+}
+
 if (s->parameters.tls_creds &&
 *s->parameters.tls_creds &&
 !object_dynamic_cast(OBJECT(ioc),
@@ -67,6 +74,12 @@ void migration_channel_connect(MigrationState *s,
 ioc, object_get_typename(OBJECT(ioc)), hostname, error);

 if (!error) {
+if (object_dynamic_cast(OBJECT(ioc), TYPE_QIO_CHANNEL_SOCKET)) {
+yank_register_function(MIGRATION_YANK_INSTANCE,
+   yank_generic_iochannel,
+   QIO_CHANNEL(ioc));
+}
+
 if (s->parameters.tls_creds &&
 *s->parameters.tls_creds &&
 !object_dynamic_cast(OBJECT(ioc),
diff --git a/migration/migration.c b/migration/migration.c
index 9bb4fee5ac..0b0442df37 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -56,6 +56,7 @@
 #include "net/announce.h"
 #include "qemu/queue.h"
 #include "multifd.h"
+#include "qemu/yank.h"

 #define MAX_THROTTLE  (128 << 20)  /* Migration transfer speed throttling 
*/

@@ -248,6 +249,8 @@ void migration_incoming_state_destroy(void)
 qapi_free_SocketAddressList(mis->socket_address_list);
 mis->socket_address_list = NULL;
 }
+
+yank_unregister_instance(MIGRATION_YANK_INSTANCE);
 }

 static void migrate_generate_event(int new_state)
@@ -425,8 +428,14 @@ void qemu_start_incoming_migration(const char *uri, Error 
**errp)
 {
 const char *p = NULL;

+yank_register_instance(MIGRATION_YANK_INSTANCE, errp);
+if (*errp) {
+return;
+}
+
 qapi_event_send_migration(MIGRATION_STATUS_SETUP);
 if (!strcmp(uri, "defer")) {
+yank_unregister_instance(MIGRATION_YANK_INSTANCE);
 deferred_incoming_migration(errp);
 } else if (strstart(uri, "tcp:", ) ||
strstart(uri, "unix:", NULL) ||
@@ -441,6 +450,7 @@ void qemu_start_incoming_migration(const char *uri, Error 
**errp)
 } else if (strstart(uri, "fd:", )) {
 fd_start_incoming_migration(p, errp);
 } else {
+yank_unregister_instance(MIGRATION_YANK_INSTANCE);
 error_setg(errp, "unknown migration protocol: %s", uri);
 }
 }
@@ -1733,6 +1743,7 @@ static void migrate_fd_cleanup(MigrationState *s)
 }
 notifier_list_notify(_state_notifiers, s);
 block_cleanup_parameters(s);
+yank_unregister_instance(MIGRATION_YANK_INSTANCE);
 }

 static void migrate_fd_cleanup_schedule(MigrationState *s)
@@ -2007,6 +2018,7 @@ void qmp_migrate_recover(const char *uri, Error **errp)
  * only re-setup the migration stream and poke existing migration
  * to continue using that newly established channel.
  */
+yank_unregister_instance(MIGRATION_YANK_INSTANCE);
 qemu_start_incoming_migration(uri, errp);
 }

@@ -2144,6 +2156,13 @@ void qmp_migrate(const char *uri, bool has_blk, bool blk,
 return;
 }

+if (!(has_resume && resume)) {
+yank_register_instance(MIGRATION_YANK_INSTANCE, errp);
+if (*errp) {
+return;
+}
+}
+
 if (strstart(uri, "tcp:", ) ||
 strstart(uri, "unix:", NULL) ||
 strstart(uri, "vsock:", NULL)) {
@@ -2157,6 +2176,9 @@ void qmp_migrate(const char *uri, bool has_blk, bool blk,
 } else if (strstart(uri, "fd:", )) {
 fd_start_outgoing_migration(s, p, _err);
 } else {
+if (!(has_resume && resume)) {
+yank_unregister_instance(MIGRATION_YANK_INSTANCE);
+}
 error_setg(errp, QERR_INVALID_PARAMETER_VALUE, "uri",
"a valid m

[PATCH v10 6/8] io: Document qmp oob suitability of qio_channel_shutdown and io_shutdown

2020-10-30 Thread Lukas Straub
Migration and yank code assume that qio_channel_shutdown is thread
-safe and can be called from qmp oob handler. Document this after
checking the code.

Signed-off-by: Lukas Straub 
Acked-by: Stefan Hajnoczi 
Reviewed-by: Daniel P. Berrangé 
---
 include/io/channel.h | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/include/io/channel.h b/include/io/channel.h
index 3c04f0edda..e0b9fc615d 100644
--- a/include/io/channel.h
+++ b/include/io/channel.h
@@ -92,7 +92,8 @@ struct QIOChannel {
  * provide additional optional features.
  *
  * Consult the corresponding public API docs for a description
- * of the semantics of each callback
+ * of the semantics of each callback. io_shutdown in particular
+ * must be thread-safe, terminate quickly and must not block.
  */
 struct QIOChannelClass {
 ObjectClass parent;
@@ -510,6 +511,8 @@ int qio_channel_close(QIOChannel *ioc,
  * QIO_CHANNEL_FEATURE_SHUTDOWN prior to calling
  * this method.
  *
+ * This function is thread-safe, terminates quickly and does not block.
+ *
  * Returns: 0 on success, -1 on error
  */
 int qio_channel_shutdown(QIOChannel *ioc,
--
2.20.1



pgpSPmLHd3Cfu.pgp
Description: OpenPGP digital signature


[PATCH v10 2/8] block/nbd.c: Add yank feature

2020-10-30 Thread Lukas Straub
Register a yank function which shuts down the socket and sets
s->state = NBD_CLIENT_QUIT. This is the same behaviour as if an
error occured.

Signed-off-by: Lukas Straub 
Acked-by: Stefan Hajnoczi 
---
 block/nbd.c | 154 +++-
 1 file changed, 93 insertions(+), 61 deletions(-)

diff --git a/block/nbd.c b/block/nbd.c
index 4548046cd7..d66c84ee40 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -35,6 +35,7 @@
 #include "qemu/option.h"
 #include "qemu/cutils.h"
 #include "qemu/main-loop.h"
+#include "qemu/atomic.h"

 #include "qapi/qapi-visit-sockets.h"
 #include "qapi/qmp/qstring.h"
@@ -44,6 +45,8 @@
 #include "block/nbd.h"
 #include "block/block_int.h"

+#include "qemu/yank.h"
+
 #define EN_OPTSTR ":exportname="
 #define MAX_NBD_REQUESTS16

@@ -140,14 +143,13 @@ typedef struct BDRVNBDState {
 NBDConnectThread *connect_thread;
 } BDRVNBDState;

-static QIOChannelSocket *nbd_establish_connection(SocketAddress *saddr,
-  Error **errp);
-static QIOChannelSocket *nbd_co_establish_connection(BlockDriverState *bs,
- Error **errp);
+static int nbd_establish_connection(BlockDriverState *bs, SocketAddress *saddr,
+Error **errp);
+static int nbd_co_establish_connection(BlockDriverState *bs, Error **errp);
 static void nbd_co_establish_connection_cancel(BlockDriverState *bs,
bool detach);
-static int nbd_client_handshake(BlockDriverState *bs, QIOChannelSocket *sioc,
-Error **errp);
+static int nbd_client_handshake(BlockDriverState *bs, Error **errp);
+static void nbd_yank(void *opaque);

 static void nbd_clear_bdrvstate(BDRVNBDState *s)
 {
@@ -165,12 +167,12 @@ static void nbd_clear_bdrvstate(BDRVNBDState *s)
 static void nbd_channel_error(BDRVNBDState *s, int ret)
 {
 if (ret == -EIO) {
-if (s->state == NBD_CLIENT_CONNECTED) {
+if (qatomic_load_acquire(>state) == NBD_CLIENT_CONNECTED) {
 s->state = s->reconnect_delay ? NBD_CLIENT_CONNECTING_WAIT :
 NBD_CLIENT_CONNECTING_NOWAIT;
 }
 } else {
-if (s->state == NBD_CLIENT_CONNECTED) {
+if (qatomic_load_acquire(>state) == NBD_CLIENT_CONNECTED) {
 qio_channel_shutdown(s->ioc, QIO_CHANNEL_SHUTDOWN_BOTH, NULL);
 }
 s->state = NBD_CLIENT_QUIT;
@@ -203,7 +205,7 @@ static void reconnect_delay_timer_cb(void *opaque)
 {
 BDRVNBDState *s = opaque;

-if (s->state == NBD_CLIENT_CONNECTING_WAIT) {
+if (qatomic_load_acquire(>state) == NBD_CLIENT_CONNECTING_WAIT) {
 s->state = NBD_CLIENT_CONNECTING_NOWAIT;
 while (qemu_co_enter_next(>free_sema, NULL)) {
 /* Resume all queued requests */
@@ -215,7 +217,7 @@ static void reconnect_delay_timer_cb(void *opaque)

 static void reconnect_delay_timer_init(BDRVNBDState *s, uint64_t 
expire_time_ns)
 {
-if (s->state != NBD_CLIENT_CONNECTING_WAIT) {
+if (qatomic_load_acquire(>state) != NBD_CLIENT_CONNECTING_WAIT) {
 return;
 }

@@ -260,7 +262,7 @@ static void nbd_client_attach_aio_context(BlockDriverState 
*bs,
  * s->connection_co is either yielded from nbd_receive_reply or from
  * nbd_co_reconnect_loop()
  */
-if (s->state == NBD_CLIENT_CONNECTED) {
+if (qatomic_load_acquire(>state) == NBD_CLIENT_CONNECTED) {
 qio_channel_attach_aio_context(QIO_CHANNEL(s->ioc), new_context);
 }

@@ -286,7 +288,7 @@ static void coroutine_fn 
nbd_client_co_drain_begin(BlockDriverState *bs)

 reconnect_delay_timer_del(s);

-if (s->state == NBD_CLIENT_CONNECTING_WAIT) {
+if (qatomic_load_acquire(>state) == NBD_CLIENT_CONNECTING_WAIT) {
 s->state = NBD_CLIENT_CONNECTING_NOWAIT;
 qemu_co_queue_restart_all(>free_sema);
 }
@@ -337,13 +339,14 @@ static void nbd_teardown_connection(BlockDriverState *bs)

 static bool nbd_client_connecting(BDRVNBDState *s)
 {
-return s->state == NBD_CLIENT_CONNECTING_WAIT ||
-s->state == NBD_CLIENT_CONNECTING_NOWAIT;
+NBDClientState state = qatomic_load_acquire(>state);
+return state == NBD_CLIENT_CONNECTING_WAIT ||
+state == NBD_CLIENT_CONNECTING_NOWAIT;
 }

 static bool nbd_client_connecting_wait(BDRVNBDState *s)
 {
-return s->state == NBD_CLIENT_CONNECTING_WAIT;
+return qatomic_load_acquire(>state) == NBD_CLIENT_CONNECTING_WAIT;
 }

 static void connect_bh(void *opaque)
@@ -423,12 +426,12 @@ static void *connect_thread_func(void *opaque)
 return NULL;
 }

-static QIOChannelSocket *coroutine_fn
+static int coroutine_fn
 nbd_co_establish_connection(BlockDriverState *bs, Error **errp)
 {
+int ret;

[PATCH v10 7/8] MAINTAINERS: Add myself as maintainer for yank feature

2020-10-30 Thread Lukas Straub
I'll maintain this for now as the colo usecase is the first user
of this functionality.

Signed-off-by: Lukas Straub 
Acked-by: Stefan Hajnoczi 
---
 MAINTAINERS | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 8c744a9bdf..81288fd219 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2676,6 +2676,13 @@ F: util/uuid.c
 F: include/qemu/uuid.h
 F: tests/test-uuid.c

+Yank feature
+M: Lukas Straub 
+S: Odd fixes
+F: util/yank.c
+F: include/qemu/yank.h
+F: qapi/yank.json
+
 COLO Framework
 M: zhanghailiang 
 S: Maintained
--
2.20.1



pgpjxUauG10GX.pgp
Description: OpenPGP digital signature


[PATCH v10 3/8] chardev/char-socket.c: Add yank feature

2020-10-30 Thread Lukas Straub
Register a yank function to shutdown the socket on yank.

Signed-off-by: Lukas Straub 
Acked-by: Stefan Hajnoczi 
---
 chardev/char-socket.c | 35 +++
 1 file changed, 35 insertions(+)

diff --git a/chardev/char-socket.c b/chardev/char-socket.c
index 95e45812d5..5947cbe8bb 100644
--- a/chardev/char-socket.c
+++ b/chardev/char-socket.c
@@ -34,6 +34,7 @@
 #include "qapi/error.h"
 #include "qapi/clone-visitor.h"
 #include "qapi/qapi-visit-sockets.h"
+#include "qemu/yank.h"

 #include "chardev/char-io.h"
 #include "qom/object.h"
@@ -70,6 +71,7 @@ struct SocketChardev {
 size_t read_msgfds_num;
 int *write_msgfds;
 size_t write_msgfds_num;
+bool registered_yank;

 SocketAddress *addr;
 bool is_listen;
@@ -415,6 +417,12 @@ static void tcp_chr_free_connection(Chardev *chr)

 tcp_set_msgfds(chr, NULL, 0);
 remove_fd_in_watch(chr);
+if (s->state == TCP_CHARDEV_STATE_CONNECTING
+|| s->state == TCP_CHARDEV_STATE_CONNECTED) {
+yank_unregister_function(CHARDEV_YANK_INSTANCE(chr->label),
+ yank_generic_iochannel,
+ QIO_CHANNEL(s->sioc));
+}
 object_unref(OBJECT(s->sioc));
 s->sioc = NULL;
 object_unref(OBJECT(s->ioc));
@@ -918,6 +926,9 @@ static int tcp_chr_add_client(Chardev *chr, int fd)
 }
 tcp_chr_change_state(s, TCP_CHARDEV_STATE_CONNECTING);
 tcp_chr_set_client_ioc_name(chr, sioc);
+yank_register_function(CHARDEV_YANK_INSTANCE(chr->label),
+   yank_generic_iochannel,
+   QIO_CHANNEL(sioc));
 ret = tcp_chr_new_client(chr, sioc);
 object_unref(OBJECT(sioc));
 return ret;
@@ -932,6 +943,9 @@ static void tcp_chr_accept(QIONetListener *listener,

 tcp_chr_change_state(s, TCP_CHARDEV_STATE_CONNECTING);
 tcp_chr_set_client_ioc_name(chr, cioc);
+yank_register_function(CHARDEV_YANK_INSTANCE(chr->label),
+   yank_generic_iochannel,
+   QIO_CHANNEL(cioc));
 tcp_chr_new_client(chr, cioc);
 }

@@ -947,6 +961,9 @@ static int tcp_chr_connect_client_sync(Chardev *chr, Error 
**errp)
 object_unref(OBJECT(sioc));
 return -1;
 }
+yank_register_function(CHARDEV_YANK_INSTANCE(chr->label),
+   yank_generic_iochannel,
+   QIO_CHANNEL(sioc));
 tcp_chr_new_client(chr, sioc);
 object_unref(OBJECT(sioc));
 return 0;
@@ -962,6 +979,9 @@ static void tcp_chr_accept_server_sync(Chardev *chr)
 tcp_chr_change_state(s, TCP_CHARDEV_STATE_CONNECTING);
 sioc = qio_net_listener_wait_client(s->listener);
 tcp_chr_set_client_ioc_name(chr, sioc);
+yank_register_function(CHARDEV_YANK_INSTANCE(chr->label),
+   yank_generic_iochannel,
+   QIO_CHANNEL(sioc));
 tcp_chr_new_client(chr, sioc);
 object_unref(OBJECT(sioc));
 }
@@ -1072,6 +1092,9 @@ static void char_socket_finalize(Object *obj)
 object_unref(OBJECT(s->tls_creds));
 }
 g_free(s->tls_authz);
+if (s->registered_yank) {
+yank_unregister_instance(CHARDEV_YANK_INSTANCE(chr->label));
+}

 qemu_chr_be_event(chr, CHR_EVENT_CLOSED);
 }
@@ -1087,6 +1110,9 @@ static void qemu_chr_socket_connected(QIOTask *task, void 
*opaque)

 if (qio_task_propagate_error(task, )) {
 tcp_chr_change_state(s, TCP_CHARDEV_STATE_DISCONNECTED);
+yank_unregister_function(CHARDEV_YANK_INSTANCE(chr->label),
+ yank_generic_iochannel,
+ QIO_CHANNEL(sioc));
 check_report_connect_error(chr, err);
 goto cleanup;
 }
@@ -1120,6 +1146,9 @@ static void tcp_chr_connect_client_async(Chardev *chr)
 tcp_chr_change_state(s, TCP_CHARDEV_STATE_CONNECTING);
 sioc = qio_channel_socket_new();
 tcp_chr_set_client_ioc_name(chr, sioc);
+yank_register_function(CHARDEV_YANK_INSTANCE(chr->label),
+   yank_generic_iochannel,
+   QIO_CHANNEL(sioc));
 /*
  * Normally code would use the qio_channel_socket_connect_async
  * method which uses a QIOTask + qio_task_set_error internally
@@ -1362,6 +1391,12 @@ static void qmp_chardev_open_socket(Chardev *chr,
 qemu_chr_set_feature(chr, QEMU_CHAR_FEATURE_FD_PASS);
 }

+yank_register_instance(CHARDEV_YANK_INSTANCE(chr->label), errp);
+if (*errp) {
+return;
+}
+s->registered_yank = true;
+
 /* be isn't opened until we get a connection */
 *be_opened = false;

--
2.20.1



pgpx12P9M6Wj4.pgp
Description: OpenPGP digital signature


[PATCH v10 1/8] Introduce yank feature

2020-10-30 Thread Lukas Straub
The yank feature allows to recover from hanging qemu by "yanking"
at various parts. Other qemu systems can register themselves and
multiple yank functions. Then all yank functions for selected
instances can be called by the 'yank' out-of-band qmp command.
Available instances can be queried by a 'query-yank' oob command.

Signed-off-by: Lukas Straub 
Acked-by: Stefan Hajnoczi 
---
 include/qemu/yank.h   |  95 +++
 qapi/meson.build  |   1 +
 qapi/qapi-schema.json |   1 +
 qapi/yank.json| 115 ++
 util/meson.build  |   1 +
 util/yank.c   | 216 ++
 6 files changed, 429 insertions(+)
 create mode 100644 include/qemu/yank.h
 create mode 100644 qapi/yank.json
 create mode 100644 util/yank.c

diff --git a/include/qemu/yank.h b/include/qemu/yank.h
new file mode 100644
index 00..96f5b2626f
--- /dev/null
+++ b/include/qemu/yank.h
@@ -0,0 +1,95 @@
+/*
+ * QEMU yank feature
+ *
+ * Copyright (c) Lukas Straub 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#ifndef YANK_H
+#define YANK_H
+
+#include "qapi/qapi-types-yank.h"
+
+typedef void (YankFn)(void *opaque);
+
+/**
+ * yank_register_instance: Register a new instance.
+ *
+ * This registers a new instance for yanking. Must be called before any yank
+ * function is registered for this instance.
+ *
+ * This function is thread-safe.
+ *
+ * @instance: The instance.
+ * @errp: Error object.
+ */
+void yank_register_instance(const YankInstance *instance, Error **errp);
+
+/**
+ * yank_unregister_instance: Unregister a instance.
+ *
+ * This unregisters a instance. Must be called only after every yank function
+ * of the instance has been unregistered.
+ *
+ * This function is thread-safe.
+ *
+ * @instance: The instance.
+ */
+void yank_unregister_instance(const YankInstance *instance);
+
+/**
+ * yank_register_function: Register a yank function
+ *
+ * This registers a yank function. All limitations of qmp oob commands apply
+ * to the yank function as well. See docs/devel/qapi-code-gen.txt under
+ * "An OOB-capable command handler must satisfy the following conditions".
+ *
+ * This function is thread-safe.
+ *
+ * @instance: The instance.
+ * @func: The yank function.
+ * @opaque: Will be passed to the yank function.
+ */
+void yank_register_function(const YankInstance *instance,
+YankFn *func,
+void *opaque);
+
+/**
+ * yank_unregister_function: Unregister a yank function
+ *
+ * This unregisters a yank function.
+ *
+ * This function is thread-safe.
+ *
+ * @instance: The instance.
+ * @func: func that was passed to yank_register_function.
+ * @opaque: opaque that was passed to yank_register_function.
+ */
+void yank_unregister_function(const YankInstance *instance,
+  YankFn *func,
+  void *opaque);
+
+/**
+ * yank_generic_iochannel: Generic yank function for iochannel
+ *
+ * This is a generic yank function which will call qio_channel_shutdown on the
+ * provided QIOChannel.
+ *
+ * @opaque: QIOChannel to shutdown
+ */
+void yank_generic_iochannel(void *opaque);
+
+#define BLOCKDEV_YANK_INSTANCE(the_node_name) (&(YankInstance) { \
+.type = YANK_INSTANCE_TYPE_BLOCK_NODE, \
+.u.block_node.node_name = (the_node_name) })
+
+#define CHARDEV_YANK_INSTANCE(the_id) (&(YankInstance) { \
+.type = YANK_INSTANCE_TYPE_CHARDEV, \
+.u.chardev.id = (the_id) })
+
+#define MIGRATION_YANK_INSTANCE (&(YankInstance) { \
+.type = YANK_INSTANCE_TYPE_MIGRATION })
+
+#endif
diff --git a/qapi/meson.build b/qapi/meson.build
index 0e98146f1f..ab68e7900e 100644
--- a/qapi/meson.build
+++ b/qapi/meson.build
@@ -47,6 +47,7 @@ qapi_all_modules = [
   'trace',
   'transaction',
   'ui',
+  'yank',
 ]

 qapi_storage_daemon_modules = [
diff --git a/qapi/qapi-schema.json b/qapi/qapi-schema.json
index 0b444b76d2..79c1705ed7 100644
--- a/qapi/qapi-schema.json
+++ b/qapi/qapi-schema.json
@@ -91,3 +91,4 @@
 { 'include': 'audio.json' }
 { 'include': 'acpi.json' }
 { 'include': 'pci.json' }
+{ 'include': 'yank.json' }
diff --git a/qapi/yank.json b/qapi/yank.json
new file mode 100644
index 00..1964a2202e
--- /dev/null
+++ b/qapi/yank.json
@@ -0,0 +1,115 @@
+# -*- Mode: Python -*-
+# vim: filetype=python
+#
+
+##
+# @YankInstanceType:
+#
+# An enumeration of yank instance types. See @YankInstance for more
+# information.
+#
+# Since: 5.2
+##
+{ 'enum': 'YankInstanceType',
+  'data': [ 'block-node', 'chardev', 'migration' ] }
+
+##
+# @YankInstanceBlockNode:
+#
+# Specifies which block graph node to yank. See @YankInstance for more
+# information.
+#
+# @node-name: the name of the block graph node
+#
+# Since: 5.2
+##
+{ 'struct': 'YankInstanceBlockNode',
+  'data': { 'node-name': 'str' } }
+
+##
+# @

[PATCH v10 0/8] Introduce 'yank' oob qmp command to recover from hanging qemu

2020-10-30 Thread Lukas Straub
Hello Everyone,
So here is v10.
We still need ACKs from NBD and chardev maintainers.

Changes:

v10:
 -moved from qapi/misc.json to qapi/yank.json
 -rename 'blockdev' -> 'block-node'
 -document difference betwen migration yank instance and migrate_cancel
 -better document return values of yank command
 -beter document yank_lock
 -minor style and spelling fixes

v9:
 -rebase onto master
 -implemented new qmp api as proposed by Markus

v8:
 -add Reviewed-by and Acked-by tags
 -rebase onto master
  -minor change to migration
  -convert to meson
 -change "Since:" to 5.2
 -varios code style fixes (Markus Armbruster)
 -point to oob restrictions in comment to yank_register_function
  (Markus Armbruster)
 -improve qmp documentation (Markus Armbruster)
 -document oob suitability of qio_channel and io_shutdown (Markus Armbruster)

v7:
 -yank_register_instance now returns error via Error **errp instead of aborting
 -dropped "chardev/char.c: Check for duplicate id before  creating chardev"

v6:
 -add Reviewed-by and Acked-by tags
 -rebase on master
 -lots of changes in nbd due to rebase
 -only take maintainership of util/yank.c and include/qemu/yank.h (Daniel P. 
Berrangé)
 -fix a crash discovered by the newly added chardev test
 -fix the test itself

v5:
 -move yank.c to util/
 -move yank.h to include/qemu/
 -add license to yank.h
 -use const char*
 -nbd: use atomic_store_release and atomic_load_aqcuire
 -io-channel: ensure thread-safety and document it
 -add myself as maintainer for yank

v4:
 -fix build errors...

v3:
 -don't touch softmmu/vl.c, use __contructor__ attribute instead (Paolo Bonzini)
 -fix build errors
 -rewrite migration patch so it actually passes all tests

v2:
 -don't touch io/ code anymore
 -always register yank functions
 -'yank' now takes a list of instances to yank
 -'query-yank' returns a list of yankable instances

Overview:
Hello Everyone,
In many cases, if qemu has a network connection (qmp, migration, chardev, etc.)
to some other server and that server dies or hangs, qemu hangs too.
These patches introduce the new 'yank' out-of-band qmp command to recover from
these kinds of hangs. The different subsystems register callbacks which get
executed with the yank command. For example the callback can shutdown() a
socket. This is intended for the colo use-case, but it can be used for other
things too of course.

Regards,
Lukas Straub

Lukas Straub (8):
  Introduce yank feature
  block/nbd.c: Add yank feature
  chardev/char-socket.c: Add yank feature
  migration: Add yank feature
  io/channel-tls.c: make qio_channel_tls_shutdown thread-safe
  io: Document qmp oob suitability of qio_channel_shutdown and
io_shutdown
  MAINTAINERS: Add myself as maintainer for yank feature
  tests/test-char.c: Wait for the chardev to connect in
char_socket_client_dupid_test

 MAINTAINERS   |   7 ++
 block/nbd.c   | 154 ++--
 chardev/char-socket.c |  35 ++
 include/io/channel.h  |   5 +-
 include/qemu/yank.h   |  95 +++
 io/channel-tls.c  |   6 +-
 migration/channel.c   |  13 ++
 migration/migration.c |  25 
 migration/multifd.c   |  10 ++
 migration/qemu-file-channel.c |   7 ++
 migration/savevm.c|   6 +
 qapi/meson.build  |   1 +
 qapi/qapi-schema.json |   1 +
 qapi/yank.json| 115 ++
 tests/test-char.c |   1 +
 util/meson.build  |   1 +
 util/yank.c   | 216 ++
 17 files changed, 634 insertions(+), 64 deletions(-)
 create mode 100644 include/qemu/yank.h
 create mode 100644 qapi/yank.json
 create mode 100644 util/yank.c

--
2.20.1


pgpxQ7haaweP2.pgp
Description: OpenPGP digital signature


Re: [PATCH v9 1/8] Introduce yank feature

2020-10-30 Thread Lukas Straub
On Fri, 30 Oct 2020 15:02:09 +0100
Markus Armbruster  wrote:

> Lukas Straub  writes:
> 
> > On Thu, 29 Oct 2020 17:36:14 +0100
> > Markus Armbruster  wrote:
> >  
> >> Nothing major, looks almost ready to me.
> >> 
> >> Lukas Straub  writes:
> >>   
> >> > The yank feature allows to recover from hanging qemu by "yanking"
> >> > at various parts. Other qemu systems can register themselves and
> >> > multiple yank functions. Then all yank functions for selected
> >> > instances can be called by the 'yank' out-of-band qmp command.
> >> > Available instances can be queried by a 'query-yank' oob command.
> >> >
> >> > Signed-off-by: Lukas Straub 
> >> > Acked-by: Stefan Hajnoczi 
> >> > ---
> >> >  include/qemu/yank.h |  95 
> >> >  qapi/misc.json  | 106 ++
> >> >  util/meson.build|   1 +
> >> >  util/yank.c | 213   
> >> >   
> >> 
> >> checkpatch.pl warns:
> >> 
> >> WARNING: added, moved or deleted file(s), does MAINTAINERS need 
> >> updating?
> >> 
> >> Can we find a maintainer for the two new files?  
> >
> > Yes, I'm maintaining this for now, see patch 7.  
> 
> Thanks!  Would it make sense to add the yank stuff to a new QAPI module
> yank.json instead of misc.jaon, so the new MAINTAINERS stanza can cover
> it?

Yes, makes sense. Changed for the next version.

> [...]
> >> > diff --git a/qapi/misc.json b/qapi/misc.json
> >> > index 40df513856..3b7de02a4d 100644
> >> > --- a/qapi/misc.json
> >> > +++ b/qapi/misc.json  
> [...]
> >> > +##
> >> > +# @YankInstance:
> >> > +#
> >> > +# A yank instance can be yanked with the "yank" qmp command to recover 
> >> > from a
> >> > +# hanging qemu.
> >> 
> >> QEMU
> >>   
> >> > +#
> >> > +# Currently implemented yank instances:
> >> > +#  -nbd block device:
> >> > +#   Yanking it will shutdown the connection to the nbd server without
> >> > +#   attempting to reconnect.
> >> > +#  -socket chardev:
> >> > +#   Yanking it will shutdown the connected socket.
> >> > +#  -migration:
> >> > +#   Yanking it will shutdown all migration connections.
> >> 
> >> To my surprise, this is recognized as bullet list markup.  But please
> >> put a space between the bullet and the text anyway.
> >> 
> >> Also: "shutdown" is a noun, the verb is spelled "shut down".  
> >
> > Both changed for the next version.
> >  
> >> In my review of v8, I asked how yanking migration is related to command
> >> migrate_cancel.  Daniel explained:
> >> 
> >> migrate_cancel will do a shutdown() on the primary migration socket 
> >> only.
> >> In addition it will toggle the migration state.
> >> 
> >> Yanking will do a shutdown on all migration sockets (important for
> >> multifd), but won't touch migration state or any other aspect of QEMU
> >> code.
> >> 
> >> Overall yanking has less potential for things to go wrong than the
> >> migrate_cancel method, as it doesn't try to do any kind of cleanup
> >> or migration.
> >> 
> >> Would it make sense to work this into the documentation?  
> >
> > How about this?
> >
> >   - migration:
> > Yanking it will shut down all migration connections. Unlike
> > @migrate_cancel, it will not notify the migration process,
> > so migration will go into @failed state, instead of @cancelled
> > state.  
> 
> Works for me.  Advice on when to use it rather than migrate_cancel would
> be nice, though.

Ok, Changed for the next version.

> >> > +#
> >> > +# Since: 5.2
> >> > +##
> >> > +{ 'union': 'YankInstance',
> >> > +  'base': { 'type': 'YankInstanceType' },
> >> > +  'discriminator': 'type',
> >> > +  'data': {
> >> > +  'blockdev': 'YankInstanceBlockdev',
> >> > +  'chardev': 'YankInstanceChardev' } }
> >> > +
> >> > +##
> >> > +# @yank:
> >> > +#
> >> > +# Recover from hanging qemu by yanking the specified instances. See
> >> 
> >> QEMU
> >> 

Re: [PATCH v9 1/8] Introduce yank feature

2020-10-30 Thread Lukas Straub
On Thu, 29 Oct 2020 17:36:14 +0100
Markus Armbruster  wrote:

> Nothing major, looks almost ready to me.
> 
> Lukas Straub  writes:
> 
> > The yank feature allows to recover from hanging qemu by "yanking"
> > at various parts. Other qemu systems can register themselves and
> > multiple yank functions. Then all yank functions for selected
> > instances can be called by the 'yank' out-of-band qmp command.
> > Available instances can be queried by a 'query-yank' oob command.
> >
> > Signed-off-by: Lukas Straub 
> > Acked-by: Stefan Hajnoczi 
> > ---
> >  include/qemu/yank.h |  95 
> >  qapi/misc.json  | 106 ++
> >  util/meson.build|   1 +
> >  util/yank.c | 213   
> 
> checkpatch.pl warns:
> 
> WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
> 
> Can we find a maintainer for the two new files?

Yes, I'm maintaining this for now, see patch 7.

> >  4 files changed, 415 insertions(+)
> >  create mode 100644 include/qemu/yank.h
> >  create mode 100644 util/yank.c
> >
> > diff --git a/include/qemu/yank.h b/include/qemu/yank.h
> > new file mode 100644
> > index 000000..89755e62af
> > --- /dev/null
> > +++ b/include/qemu/yank.h
> > @@ -0,0 +1,95 @@
> > +/*
> > + * QEMU yank feature
> > + *
> > + * Copyright (c) Lukas Straub 
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2 or 
> > later.
> > + * See the COPYING file in the top-level directory.
> > + */
> > +
> > +#ifndef YANK_H
> > +#define YANK_H
> > +
> > +#include "qapi/qapi-types-misc.h"
> > +
> > +typedef void (YankFn)(void *opaque);
> > +
> > +/**
> > + * yank_register_instance: Register a new instance.
> > + *
> > + * This registers a new instance for yanking. Must be called before any 
> > yank
> > + * function is registered for this instance.
> > + *
> > + * This function is thread-safe.
> > + *
> > + * @instance: The instance.
> > + * @errp: Error object.
> > + */
> > +void yank_register_instance(const YankInstance *instance, Error **errp);
> > +
> > +/**
> > + * yank_unregister_instance: Unregister a instance.
> > + *
> > + * This unregisters a instance. Must be called only after every yank 
> > function
> > + * of the instance has been unregistered.
> > + *
> > + * This function is thread-safe.
> > + *
> > + * @instance: The instance.
> > + */
> > +void yank_unregister_instance(const YankInstance *instance);
> > +
> > +/**
> > + * yank_register_function: Register a yank function
> > + *
> > + * This registers a yank function. All limitations of qmp oob commands 
> > apply
> > + * to the yank function as well. See docs/devel/qapi-code-gen.txt under
> > + * "An OOB-capable command handler must satisfy the following conditions".
> > + *
> > + * This function is thread-safe.
> > + *
> > + * @instance: The instance.
> > + * @func: The yank function.
> > + * @opaque: Will be passed to the yank function.
> > + */
> > +void yank_register_function(const YankInstance *instance,
> > +YankFn *func,
> > +void *opaque);
> > +
> > +/**
> > + * yank_unregister_function: Unregister a yank function
> > + *
> > + * This unregisters a yank function.
> > + *
> > + * This function is thread-safe.
> > + *
> > + * @instance: The instance.
> > + * @func: func that was passed to yank_register_function.
> > + * @opaque: opaque that was passed to yank_register_function.
> > + */
> > +void yank_unregister_function(const YankInstance *instance,
> > +  YankFn *func,
> > +  void *opaque);
> > +
> > +/**
> > + * yank_generic_iochannel: Generic yank function for iochannel
> > + *
> > + * This is a generic yank function which will call qio_channel_shutdown on 
> > the
> > + * provided QIOChannel.
> > + *
> > + * @opaque: QIOChannel to shutdown
> > + */
> > +void yank_generic_iochannel(void *opaque);
> > +
> > +#define BLOCKDEV_YANK_INSTANCE(the_node_name) (&(YankInstance) { \
> > +.type = YANK_INSTANCE_TYPE_BLOCKDEV, \
> > +.u.blockdev.node_name = (the_node_name) })
> > +
> > +#define CHARDEV_YANK_INSTANCE(the_id) (&(YankInstance) { \
> &g

[PATCH v9 8/8] tests/test-char.c: Wait for the chardev to connect in char_socket_client_dupid_test

2020-10-28 Thread Lukas Straub
A connecting chardev object has an additional reference by the connecting
thread, so if the chardev is still connecting by the end of the test,
then the chardev object won't be freed. This in turn means that the yank
instance won't be unregistered and when running the next test-case
yank_register_instance will abort, because the yank instance is
already/still registered.

Signed-off-by: Lukas Straub 
Reviewed-by: Daniel P. Berrangé 
---
 tests/test-char.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tests/test-char.c b/tests/test-char.c
index 9196e566e9..aedb5c9eda 100644
--- a/tests/test-char.c
+++ b/tests/test-char.c
@@ -937,6 +937,7 @@ static void char_socket_client_dupid_test(gconstpointer 
opaque)
 g_assert_nonnull(opts);
 chr1 = qemu_chr_new_from_opts(opts, NULL, _abort);
 g_assert_nonnull(chr1);
+qemu_chr_wait_connected(chr1, _abort);

 chr2 = qemu_chr_new_from_opts(opts, NULL, _err);
 g_assert_null(chr2);
--
2.20.1


pgpUCTbxmqVgD.pgp
Description: OpenPGP digital signature


[PATCH v9 7/8] MAINTAINERS: Add myself as maintainer for yank feature

2020-10-28 Thread Lukas Straub
I'll maintain this for now as the colo usecase is the first user
of this functionality.

Signed-off-by: Lukas Straub 
Acked-by: Stefan Hajnoczi 
Reviewed-by: Daniel P. Berrangé 
---
 MAINTAINERS | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index ef6f5c7399..5921e565df 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2669,6 +2669,12 @@ F: util/uuid.c
 F: include/qemu/uuid.h
 F: tests/test-uuid.c

+Yank feature
+M: Lukas Straub 
+S: Odd fixes
+F: util/yank.c
+F: include/qemu/yank.h
+
 COLO Framework
 M: zhanghailiang 
 S: Maintained
--
2.20.1



pgpAZlEQHp9DS.pgp
Description: OpenPGP digital signature


[PATCH v9 6/8] io: Document qmp oob suitability of qio_channel_shutdown and io_shutdown

2020-10-28 Thread Lukas Straub
Migration and yank code assume that qio_channel_shutdown is thread
-safe and can be called from qmp oob handler. Document this after
checking the code.

Signed-off-by: Lukas Straub 
Acked-by: Stefan Hajnoczi 
Reviewed-by: Daniel P. Berrangé 
---
 include/io/channel.h | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/include/io/channel.h b/include/io/channel.h
index 3c04f0edda..e0b9fc615d 100644
--- a/include/io/channel.h
+++ b/include/io/channel.h
@@ -92,7 +92,8 @@ struct QIOChannel {
  * provide additional optional features.
  *
  * Consult the corresponding public API docs for a description
- * of the semantics of each callback
+ * of the semantics of each callback. io_shutdown in particular
+ * must be thread-safe, terminate quickly and must not block.
  */
 struct QIOChannelClass {
 ObjectClass parent;
@@ -510,6 +511,8 @@ int qio_channel_close(QIOChannel *ioc,
  * QIO_CHANNEL_FEATURE_SHUTDOWN prior to calling
  * this method.
  *
+ * This function is thread-safe, terminates quickly and does not block.
+ *
  * Returns: 0 on success, -1 on error
  */
 int qio_channel_shutdown(QIOChannel *ioc,
--
2.20.1



pgpGZ5stoTo2y.pgp
Description: OpenPGP digital signature


[PATCH v9 1/8] Introduce yank feature

2020-10-28 Thread Lukas Straub
The yank feature allows to recover from hanging qemu by "yanking"
at various parts. Other qemu systems can register themselves and
multiple yank functions. Then all yank functions for selected
instances can be called by the 'yank' out-of-band qmp command.
Available instances can be queried by a 'query-yank' oob command.

Signed-off-by: Lukas Straub 
Acked-by: Stefan Hajnoczi 
---
 include/qemu/yank.h |  95 
 qapi/misc.json  | 106 ++
 util/meson.build|   1 +
 util/yank.c | 213 
 4 files changed, 415 insertions(+)
 create mode 100644 include/qemu/yank.h
 create mode 100644 util/yank.c

diff --git a/include/qemu/yank.h b/include/qemu/yank.h
new file mode 100644
index 00..89755e62af
--- /dev/null
+++ b/include/qemu/yank.h
@@ -0,0 +1,95 @@
+/*
+ * QEMU yank feature
+ *
+ * Copyright (c) Lukas Straub 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#ifndef YANK_H
+#define YANK_H
+
+#include "qapi/qapi-types-misc.h"
+
+typedef void (YankFn)(void *opaque);
+
+/**
+ * yank_register_instance: Register a new instance.
+ *
+ * This registers a new instance for yanking. Must be called before any yank
+ * function is registered for this instance.
+ *
+ * This function is thread-safe.
+ *
+ * @instance: The instance.
+ * @errp: Error object.
+ */
+void yank_register_instance(const YankInstance *instance, Error **errp);
+
+/**
+ * yank_unregister_instance: Unregister a instance.
+ *
+ * This unregisters a instance. Must be called only after every yank function
+ * of the instance has been unregistered.
+ *
+ * This function is thread-safe.
+ *
+ * @instance: The instance.
+ */
+void yank_unregister_instance(const YankInstance *instance);
+
+/**
+ * yank_register_function: Register a yank function
+ *
+ * This registers a yank function. All limitations of qmp oob commands apply
+ * to the yank function as well. See docs/devel/qapi-code-gen.txt under
+ * "An OOB-capable command handler must satisfy the following conditions".
+ *
+ * This function is thread-safe.
+ *
+ * @instance: The instance.
+ * @func: The yank function.
+ * @opaque: Will be passed to the yank function.
+ */
+void yank_register_function(const YankInstance *instance,
+YankFn *func,
+void *opaque);
+
+/**
+ * yank_unregister_function: Unregister a yank function
+ *
+ * This unregisters a yank function.
+ *
+ * This function is thread-safe.
+ *
+ * @instance: The instance.
+ * @func: func that was passed to yank_register_function.
+ * @opaque: opaque that was passed to yank_register_function.
+ */
+void yank_unregister_function(const YankInstance *instance,
+  YankFn *func,
+  void *opaque);
+
+/**
+ * yank_generic_iochannel: Generic yank function for iochannel
+ *
+ * This is a generic yank function which will call qio_channel_shutdown on the
+ * provided QIOChannel.
+ *
+ * @opaque: QIOChannel to shutdown
+ */
+void yank_generic_iochannel(void *opaque);
+
+#define BLOCKDEV_YANK_INSTANCE(the_node_name) (&(YankInstance) { \
+.type = YANK_INSTANCE_TYPE_BLOCKDEV, \
+.u.blockdev.node_name = (the_node_name) })
+
+#define CHARDEV_YANK_INSTANCE(the_id) (&(YankInstance) { \
+.type = YANK_INSTANCE_TYPE_CHARDEV, \
+.u.chardev.id = (the_id) })
+
+#define MIGRATION_YANK_INSTANCE (&(YankInstance) { \
+.type = YANK_INSTANCE_TYPE_MIGRATION })
+
+#endif
diff --git a/qapi/misc.json b/qapi/misc.json
index 40df513856..3b7de02a4d 100644
--- a/qapi/misc.json
+++ b/qapi/misc.json
@@ -568,3 +568,109 @@
  'data': { '*option': 'str' },
  'returns': ['CommandLineOptionInfo'],
  'allow-preconfig': true }
+
+##
+# @YankInstanceType:
+#
+# An enumeration of yank instance types. See "YankInstance" for more
+# information.
+#
+# Since: 5.2
+##
+{ 'enum': 'YankInstanceType',
+  'data': [ 'blockdev', 'chardev', 'migration' ] }
+
+##
+# @YankInstanceBlockdev:
+#
+# Specifies which blockdev to yank. See "YankInstance" for more information.
+#
+# @node-name: the blockdev's node-name
+#
+# Since: 5.2
+##
+{ 'struct': 'YankInstanceBlockdev',
+  'data': { 'node-name': 'str' } }
+
+##
+# @YankInstanceChardev:
+#
+# Specifies which chardev to yank. See "YankInstance" for more information.
+#
+# @id: the chardev's ID
+#
+# Since: 5.2
+##
+{ 'struct': 'YankInstanceChardev',
+  'data': { 'id': 'str' } }
+
+##
+# @YankInstance:
+#
+# A yank instance can be yanked with the "yank" qmp command to recover from a
+# hanging qemu.
+#
+# Currently implemented yank instances:
+#  -nbd block device:
+#   Yanking it will shutdown the connection to the nbd server without
+#   attempting to reconnect.
+#  -socket chardev:
+#   Yanking it will shutdown the connected socket.
+#  -mi

[PATCH v9 5/8] io/channel-tls.c: make qio_channel_tls_shutdown thread-safe

2020-10-28 Thread Lukas Straub
Make qio_channel_tls_shutdown thread-safe by using atomics when
accessing tioc->shutdown.

Signed-off-by: Lukas Straub 
Acked-by: Stefan Hajnoczi 
Reviewed-by: Daniel P. Berrangé 
---
 io/channel-tls.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/io/channel-tls.c b/io/channel-tls.c
index 7ec8ceff2f..10d0bf59aa 100644
--- a/io/channel-tls.c
+++ b/io/channel-tls.c
@@ -23,6 +23,7 @@
 #include "qemu/module.h"
 #include "io/channel-tls.h"
 #include "trace.h"
+#include "qemu/atomic.h"


 static ssize_t qio_channel_tls_write_handler(const char *buf,
@@ -277,7 +278,8 @@ static ssize_t qio_channel_tls_readv(QIOChannel *ioc,
 return QIO_CHANNEL_ERR_BLOCK;
 }
 } else if (errno == ECONNABORTED &&
-   (tioc->shutdown & QIO_CHANNEL_SHUTDOWN_READ)) {
+   (qatomic_load_acquire(>shutdown) &
+QIO_CHANNEL_SHUTDOWN_READ)) {
 return 0;
 }

@@ -361,7 +363,7 @@ static int qio_channel_tls_shutdown(QIOChannel *ioc,
 {
 QIOChannelTLS *tioc = QIO_CHANNEL_TLS(ioc);

-tioc->shutdown |= how;
+qatomic_or(>shutdown, how);

 return qio_channel_shutdown(tioc->master, how, errp);
 }
--
2.20.1



pgplPla7hd55V.pgp
Description: OpenPGP digital signature


[PATCH v9 2/8] block/nbd.c: Add yank feature

2020-10-28 Thread Lukas Straub
Register a yank function which shuts down the socket and sets
s->state = NBD_CLIENT_QUIT. This is the same behaviour as if an
error occured.

Signed-off-by: Lukas Straub 
Acked-by: Stefan Hajnoczi 
---
 block/nbd.c | 154 +++-
 1 file changed, 93 insertions(+), 61 deletions(-)

diff --git a/block/nbd.c b/block/nbd.c
index 4548046cd7..d66c84ee40 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -35,6 +35,7 @@
 #include "qemu/option.h"
 #include "qemu/cutils.h"
 #include "qemu/main-loop.h"
+#include "qemu/atomic.h"

 #include "qapi/qapi-visit-sockets.h"
 #include "qapi/qmp/qstring.h"
@@ -44,6 +45,8 @@
 #include "block/nbd.h"
 #include "block/block_int.h"

+#include "qemu/yank.h"
+
 #define EN_OPTSTR ":exportname="
 #define MAX_NBD_REQUESTS16

@@ -140,14 +143,13 @@ typedef struct BDRVNBDState {
 NBDConnectThread *connect_thread;
 } BDRVNBDState;

-static QIOChannelSocket *nbd_establish_connection(SocketAddress *saddr,
-  Error **errp);
-static QIOChannelSocket *nbd_co_establish_connection(BlockDriverState *bs,
- Error **errp);
+static int nbd_establish_connection(BlockDriverState *bs, SocketAddress *saddr,
+Error **errp);
+static int nbd_co_establish_connection(BlockDriverState *bs, Error **errp);
 static void nbd_co_establish_connection_cancel(BlockDriverState *bs,
bool detach);
-static int nbd_client_handshake(BlockDriverState *bs, QIOChannelSocket *sioc,
-Error **errp);
+static int nbd_client_handshake(BlockDriverState *bs, Error **errp);
+static void nbd_yank(void *opaque);

 static void nbd_clear_bdrvstate(BDRVNBDState *s)
 {
@@ -165,12 +167,12 @@ static void nbd_clear_bdrvstate(BDRVNBDState *s)
 static void nbd_channel_error(BDRVNBDState *s, int ret)
 {
 if (ret == -EIO) {
-if (s->state == NBD_CLIENT_CONNECTED) {
+if (qatomic_load_acquire(>state) == NBD_CLIENT_CONNECTED) {
 s->state = s->reconnect_delay ? NBD_CLIENT_CONNECTING_WAIT :
 NBD_CLIENT_CONNECTING_NOWAIT;
 }
 } else {
-if (s->state == NBD_CLIENT_CONNECTED) {
+if (qatomic_load_acquire(>state) == NBD_CLIENT_CONNECTED) {
 qio_channel_shutdown(s->ioc, QIO_CHANNEL_SHUTDOWN_BOTH, NULL);
 }
 s->state = NBD_CLIENT_QUIT;
@@ -203,7 +205,7 @@ static void reconnect_delay_timer_cb(void *opaque)
 {
 BDRVNBDState *s = opaque;

-if (s->state == NBD_CLIENT_CONNECTING_WAIT) {
+if (qatomic_load_acquire(>state) == NBD_CLIENT_CONNECTING_WAIT) {
 s->state = NBD_CLIENT_CONNECTING_NOWAIT;
 while (qemu_co_enter_next(>free_sema, NULL)) {
 /* Resume all queued requests */
@@ -215,7 +217,7 @@ static void reconnect_delay_timer_cb(void *opaque)

 static void reconnect_delay_timer_init(BDRVNBDState *s, uint64_t 
expire_time_ns)
 {
-if (s->state != NBD_CLIENT_CONNECTING_WAIT) {
+if (qatomic_load_acquire(>state) != NBD_CLIENT_CONNECTING_WAIT) {
 return;
 }

@@ -260,7 +262,7 @@ static void nbd_client_attach_aio_context(BlockDriverState 
*bs,
  * s->connection_co is either yielded from nbd_receive_reply or from
  * nbd_co_reconnect_loop()
  */
-if (s->state == NBD_CLIENT_CONNECTED) {
+if (qatomic_load_acquire(>state) == NBD_CLIENT_CONNECTED) {
 qio_channel_attach_aio_context(QIO_CHANNEL(s->ioc), new_context);
 }

@@ -286,7 +288,7 @@ static void coroutine_fn 
nbd_client_co_drain_begin(BlockDriverState *bs)

 reconnect_delay_timer_del(s);

-if (s->state == NBD_CLIENT_CONNECTING_WAIT) {
+if (qatomic_load_acquire(>state) == NBD_CLIENT_CONNECTING_WAIT) {
 s->state = NBD_CLIENT_CONNECTING_NOWAIT;
 qemu_co_queue_restart_all(>free_sema);
 }
@@ -337,13 +339,14 @@ static void nbd_teardown_connection(BlockDriverState *bs)

 static bool nbd_client_connecting(BDRVNBDState *s)
 {
-return s->state == NBD_CLIENT_CONNECTING_WAIT ||
-s->state == NBD_CLIENT_CONNECTING_NOWAIT;
+NBDClientState state = qatomic_load_acquire(>state);
+return state == NBD_CLIENT_CONNECTING_WAIT ||
+state == NBD_CLIENT_CONNECTING_NOWAIT;
 }

 static bool nbd_client_connecting_wait(BDRVNBDState *s)
 {
-return s->state == NBD_CLIENT_CONNECTING_WAIT;
+return qatomic_load_acquire(>state) == NBD_CLIENT_CONNECTING_WAIT;
 }

 static void connect_bh(void *opaque)
@@ -423,12 +426,12 @@ static void *connect_thread_func(void *opaque)
 return NULL;
 }

-static QIOChannelSocket *coroutine_fn
+static int coroutine_fn
 nbd_co_establish_connection(BlockDriverState *bs, Error **errp)
 {
+int ret;

[PATCH v9 3/8] chardev/char-socket.c: Add yank feature

2020-10-28 Thread Lukas Straub
Register a yank function to shutdown the socket on yank.

Signed-off-by: Lukas Straub 
Acked-by: Stefan Hajnoczi 
---
 chardev/char-socket.c | 35 +++
 1 file changed, 35 insertions(+)

diff --git a/chardev/char-socket.c b/chardev/char-socket.c
index 95e45812d5..5947cbe8bb 100644
--- a/chardev/char-socket.c
+++ b/chardev/char-socket.c
@@ -34,6 +34,7 @@
 #include "qapi/error.h"
 #include "qapi/clone-visitor.h"
 #include "qapi/qapi-visit-sockets.h"
+#include "qemu/yank.h"

 #include "chardev/char-io.h"
 #include "qom/object.h"
@@ -70,6 +71,7 @@ struct SocketChardev {
 size_t read_msgfds_num;
 int *write_msgfds;
 size_t write_msgfds_num;
+bool registered_yank;

 SocketAddress *addr;
 bool is_listen;
@@ -415,6 +417,12 @@ static void tcp_chr_free_connection(Chardev *chr)

 tcp_set_msgfds(chr, NULL, 0);
 remove_fd_in_watch(chr);
+if (s->state == TCP_CHARDEV_STATE_CONNECTING
+|| s->state == TCP_CHARDEV_STATE_CONNECTED) {
+yank_unregister_function(CHARDEV_YANK_INSTANCE(chr->label),
+ yank_generic_iochannel,
+ QIO_CHANNEL(s->sioc));
+}
 object_unref(OBJECT(s->sioc));
 s->sioc = NULL;
 object_unref(OBJECT(s->ioc));
@@ -918,6 +926,9 @@ static int tcp_chr_add_client(Chardev *chr, int fd)
 }
 tcp_chr_change_state(s, TCP_CHARDEV_STATE_CONNECTING);
 tcp_chr_set_client_ioc_name(chr, sioc);
+yank_register_function(CHARDEV_YANK_INSTANCE(chr->label),
+   yank_generic_iochannel,
+   QIO_CHANNEL(sioc));
 ret = tcp_chr_new_client(chr, sioc);
 object_unref(OBJECT(sioc));
 return ret;
@@ -932,6 +943,9 @@ static void tcp_chr_accept(QIONetListener *listener,

 tcp_chr_change_state(s, TCP_CHARDEV_STATE_CONNECTING);
 tcp_chr_set_client_ioc_name(chr, cioc);
+yank_register_function(CHARDEV_YANK_INSTANCE(chr->label),
+   yank_generic_iochannel,
+   QIO_CHANNEL(cioc));
 tcp_chr_new_client(chr, cioc);
 }

@@ -947,6 +961,9 @@ static int tcp_chr_connect_client_sync(Chardev *chr, Error 
**errp)
 object_unref(OBJECT(sioc));
 return -1;
 }
+yank_register_function(CHARDEV_YANK_INSTANCE(chr->label),
+   yank_generic_iochannel,
+   QIO_CHANNEL(sioc));
 tcp_chr_new_client(chr, sioc);
 object_unref(OBJECT(sioc));
 return 0;
@@ -962,6 +979,9 @@ static void tcp_chr_accept_server_sync(Chardev *chr)
 tcp_chr_change_state(s, TCP_CHARDEV_STATE_CONNECTING);
 sioc = qio_net_listener_wait_client(s->listener);
 tcp_chr_set_client_ioc_name(chr, sioc);
+yank_register_function(CHARDEV_YANK_INSTANCE(chr->label),
+   yank_generic_iochannel,
+   QIO_CHANNEL(sioc));
 tcp_chr_new_client(chr, sioc);
 object_unref(OBJECT(sioc));
 }
@@ -1072,6 +1092,9 @@ static void char_socket_finalize(Object *obj)
 object_unref(OBJECT(s->tls_creds));
 }
 g_free(s->tls_authz);
+if (s->registered_yank) {
+yank_unregister_instance(CHARDEV_YANK_INSTANCE(chr->label));
+}

 qemu_chr_be_event(chr, CHR_EVENT_CLOSED);
 }
@@ -1087,6 +1110,9 @@ static void qemu_chr_socket_connected(QIOTask *task, void 
*opaque)

 if (qio_task_propagate_error(task, )) {
 tcp_chr_change_state(s, TCP_CHARDEV_STATE_DISCONNECTED);
+yank_unregister_function(CHARDEV_YANK_INSTANCE(chr->label),
+ yank_generic_iochannel,
+ QIO_CHANNEL(sioc));
 check_report_connect_error(chr, err);
 goto cleanup;
 }
@@ -1120,6 +1146,9 @@ static void tcp_chr_connect_client_async(Chardev *chr)
 tcp_chr_change_state(s, TCP_CHARDEV_STATE_CONNECTING);
 sioc = qio_channel_socket_new();
 tcp_chr_set_client_ioc_name(chr, sioc);
+yank_register_function(CHARDEV_YANK_INSTANCE(chr->label),
+   yank_generic_iochannel,
+   QIO_CHANNEL(sioc));
 /*
  * Normally code would use the qio_channel_socket_connect_async
  * method which uses a QIOTask + qio_task_set_error internally
@@ -1362,6 +1391,12 @@ static void qmp_chardev_open_socket(Chardev *chr,
 qemu_chr_set_feature(chr, QEMU_CHAR_FEATURE_FD_PASS);
 }

+yank_register_instance(CHARDEV_YANK_INSTANCE(chr->label), errp);
+if (*errp) {
+return;
+}
+s->registered_yank = true;
+
 /* be isn't opened until we get a connection */
 *be_opened = false;

--
2.20.1



pgphWJNBt8ICg.pgp
Description: OpenPGP digital signature


[PATCH v9 4/8] migration: Add yank feature

2020-10-28 Thread Lukas Straub
Register yank functions on sockets to shut them down.

Signed-off-by: Lukas Straub 
Acked-by: Stefan Hajnoczi 
Acked-by: Dr. David Alan Gilbert 
---
 migration/channel.c   | 13 +
 migration/migration.c | 25 +
 migration/multifd.c   | 10 ++
 migration/qemu-file-channel.c |  7 +++
 migration/savevm.c|  6 ++
 5 files changed, 61 insertions(+)

diff --git a/migration/channel.c b/migration/channel.c
index 8a783baa0b..35fe234e9c 100644
--- a/migration/channel.c
+++ b/migration/channel.c
@@ -18,6 +18,8 @@
 #include "trace.h"
 #include "qapi/error.h"
 #include "io/channel-tls.h"
+#include "io/channel-socket.h"
+#include "qemu/yank.h"

 /**
  * @migration_channel_process_incoming - Create new incoming migration channel
@@ -35,6 +37,11 @@ void migration_channel_process_incoming(QIOChannel *ioc)
 trace_migration_set_incoming_channel(
 ioc, object_get_typename(OBJECT(ioc)));

+if (object_dynamic_cast(OBJECT(ioc), TYPE_QIO_CHANNEL_SOCKET)) {
+yank_register_function(MIGRATION_YANK_INSTANCE, yank_generic_iochannel,
+   QIO_CHANNEL(ioc));
+}
+
 if (s->parameters.tls_creds &&
 *s->parameters.tls_creds &&
 !object_dynamic_cast(OBJECT(ioc),
@@ -67,6 +74,12 @@ void migration_channel_connect(MigrationState *s,
 ioc, object_get_typename(OBJECT(ioc)), hostname, error);

 if (!error) {
+if (object_dynamic_cast(OBJECT(ioc), TYPE_QIO_CHANNEL_SOCKET)) {
+yank_register_function(MIGRATION_YANK_INSTANCE,
+   yank_generic_iochannel,
+   QIO_CHANNEL(ioc));
+}
+
 if (s->parameters.tls_creds &&
 *s->parameters.tls_creds &&
 !object_dynamic_cast(OBJECT(ioc),
diff --git a/migration/migration.c b/migration/migration.c
index 0575ecb379..e2c1123a90 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -56,6 +56,7 @@
 #include "net/announce.h"
 #include "qemu/queue.h"
 #include "multifd.h"
+#include "qemu/yank.h"

 #define MAX_THROTTLE  (128 << 20)  /* Migration transfer speed throttling 
*/

@@ -244,6 +245,8 @@ void migration_incoming_state_destroy(void)
 qapi_free_SocketAddressList(mis->socket_address_list);
 mis->socket_address_list = NULL;
 }
+
+yank_unregister_instance(MIGRATION_YANK_INSTANCE);
 }

 static void migrate_generate_event(int new_state)
@@ -390,8 +393,14 @@ void qemu_start_incoming_migration(const char *uri, Error 
**errp)
 {
 const char *p = NULL;

+yank_register_instance(MIGRATION_YANK_INSTANCE, errp);
+if (*errp) {
+return;
+}
+
 qapi_event_send_migration(MIGRATION_STATUS_SETUP);
 if (!strcmp(uri, "defer")) {
+yank_unregister_instance(MIGRATION_YANK_INSTANCE);
 deferred_incoming_migration(errp);
 } else if (strstart(uri, "tcp:", ) ||
strstart(uri, "unix:", NULL) ||
@@ -406,6 +415,7 @@ void qemu_start_incoming_migration(const char *uri, Error 
**errp)
 } else if (strstart(uri, "fd:", )) {
 fd_start_incoming_migration(p, errp);
 } else {
+yank_unregister_instance(MIGRATION_YANK_INSTANCE);
 error_setg(errp, "unknown migration protocol: %s", uri);
 }
 }
@@ -1698,6 +1708,7 @@ static void migrate_fd_cleanup(MigrationState *s)
 }
 notifier_list_notify(_state_notifiers, s);
 block_cleanup_parameters(s);
+yank_unregister_instance(MIGRATION_YANK_INSTANCE);
 }

 static void migrate_fd_cleanup_schedule(MigrationState *s)
@@ -1972,6 +1983,7 @@ void qmp_migrate_recover(const char *uri, Error **errp)
  * only re-setup the migration stream and poke existing migration
  * to continue using that newly established channel.
  */
+yank_unregister_instance(MIGRATION_YANK_INSTANCE);
 qemu_start_incoming_migration(uri, errp);
 }

@@ -2109,6 +2121,13 @@ void qmp_migrate(const char *uri, bool has_blk, bool blk,
 return;
 }

+if (!(has_resume && resume)) {
+yank_register_instance(MIGRATION_YANK_INSTANCE, errp);
+if (*errp) {
+return;
+}
+}
+
 if (strstart(uri, "tcp:", ) ||
 strstart(uri, "unix:", NULL) ||
 strstart(uri, "vsock:", NULL)) {
@@ -2122,6 +2141,9 @@ void qmp_migrate(const char *uri, bool has_blk, bool blk,
 } else if (strstart(uri, "fd:", )) {
 fd_start_outgoing_migration(s, p, _err);
 } else {
+if (!(has_resume && resume)) {
+yank_unregister_instance(MIGRATION_YANK_INSTANCE);
+}
 error_setg(errp, QERR_INVALID_PARAMETER_VALUE, "uri",
"a valid m

[PATCH v9 0/8] Introduce 'yank' oob qmp command to recover from hanging qemu

2020-10-28 Thread Lukas Straub
Hello Everyone,
I finally found time again to work on this, so here is v9 with the new qmp api.
We still need ACKs from NBD and chardev maintainers.

Changes:

v9:
 -rebase onto master
 -implemented new qmp api as proposed by Markus

v8:
 -add Reviewed-by and Acked-by tags
 -rebase onto master
  -minor change to migration
  -convert to meson
 -change "Since:" to 5.2
 -varios code style fixes (Markus Armbruster)
 -point to oob restrictions in comment to yank_register_function
  (Markus Armbruster)
 -improve qmp documentation (Markus Armbruster)
 -document oob suitability of qio_channel and io_shutdown (Markus Armbruster)

v7:
 -yank_register_instance now returns error via Error **errp instead of aborting
 -dropped "chardev/char.c: Check for duplicate id before  creating chardev"

v6:
 -add Reviewed-by and Acked-by tags
 -rebase on master
 -lots of changes in nbd due to rebase
 -only take maintainership of util/yank.c and include/qemu/yank.h (Daniel P. 
Berrangé)
 -fix a crash discovered by the newly added chardev test
 -fix the test itself

v5:
 -move yank.c to util/
 -move yank.h to include/qemu/
 -add license to yank.h
 -use const char*
 -nbd: use atomic_store_release and atomic_load_aqcuire
 -io-channel: ensure thread-safety and document it
 -add myself as maintainer for yank

v4:
 -fix build errors...

v3:
 -don't touch softmmu/vl.c, use __contructor__ attribute instead (Paolo Bonzini)
 -fix build errors
 -rewrite migration patch so it actually passes all tests

v2:
 -don't touch io/ code anymore
 -always register yank functions
 -'yank' now takes a list of instances to yank
 -'query-yank' returns a list of yankable instances

Overview:
Hello Everyone,
In many cases, if qemu has a network connection (qmp, migration, chardev, etc.)
to some other server and that server dies or hangs, qemu hangs too.
These patches introduce the new 'yank' out-of-band qmp command to recover from
these kinds of hangs. The different subsystems register callbacks which get
executed with the yank command. For example the callback can shutdown() a
socket. This is intended for the colo use-case, but it can be used for other
things too of course.

Regards,
Lukas Straub

Lukas Straub (8):
  Introduce yank feature
  block/nbd.c: Add yank feature
  chardev/char-socket.c: Add yank feature
  migration: Add yank feature
  io/channel-tls.c: make qio_channel_tls_shutdown thread-safe
  io: Document qmp oob suitability of qio_channel_shutdown and
io_shutdown
  MAINTAINERS: Add myself as maintainer for yank feature
  tests/test-char.c: Wait for the chardev to connect in
char_socket_client_dupid_test

 MAINTAINERS   |   6 +
 block/nbd.c   | 154 ++--
 chardev/char-socket.c |  35 ++
 include/io/channel.h  |   5 +-
 include/qemu/yank.h   |  95 +++
 io/channel-tls.c  |   6 +-
 migration/channel.c   |  13 +++
 migration/migration.c |  25 
 migration/multifd.c   |  10 ++
 migration/qemu-file-channel.c |   7 ++
 migration/savevm.c|   6 +
 qapi/misc.json| 106 +
 tests/test-char.c |   1 +
 util/meson.build  |   1 +
 util/yank.c   | 213 ++
 15 files changed, 619 insertions(+), 64 deletions(-)
 create mode 100644 include/qemu/yank.h
 create mode 100644 util/yank.c

--
2.20.1


pgp21L4FV4CgL.pgp
Description: OpenPGP digital signature


Re: [PATCH v7 1/8] Introduce yank feature

2020-09-04 Thread Lukas Straub
On Thu, 27 Aug 2020 14:37:00 +0200
Markus Armbruster  wrote:

> I apologize for not reviewing this much earlier.
> 
> Lukas Straub  writes:
> 
> > The yank feature allows to recover from hanging qemu by "yanking"
> > at various parts. Other qemu systems can register themselves and
> > multiple yank functions. Then all yank functions for selected
> > instances can be called by the 'yank' out-of-band qmp command.
> > Available instances can be queried by a 'query-yank' oob command.
> >
> > Signed-off-by: Lukas Straub 
> > Acked-by: Stefan Hajnoczi 
> > ---
> ...
> > diff --git a/qapi/misc.json b/qapi/misc.json
> > index 9d32820dc1..0d6a8f20b7 100644
> > --- a/qapi/misc.json
> > +++ b/qapi/misc.json
> > @@ -1615,3 +1615,48 @@
> >  ##
> >  { 'command': 'query-vm-generation-id', 'returns': 'GuidInfo' }
> >
> > +##
> > +# @YankInstances:
> > +#
> > +# @instances: List of yank instances.
> > +#
> > +# Yank instances are named after the following schema:
> > +# "blockdev:", "chardev:" and "migration"
> > +#
> > +# Since: 5.1
> > +##
> > +{ 'struct': 'YankInstances', 'data': {'instances': ['str'] } }  
> 
> I'm afraid this is a problematic QMP interface.
> 
> By making YankInstances a struct, you keep the door open to adding more
> members, which is good.
> 
> But by making its 'instances' member a ['str'], you close the door to
> using anything but a single string for the individual instances.  Not so
> good.
> 
> The single string encodes information which QMP client will need to
> parse from the string.  We frown on that in QMP.  Use QAPI complex types
> capabilities for structured data.
> 
> Could you use something like this instead?
> 
> { 'enum': 'YankInstanceType',
>   'data': { 'block-node', 'chardev', 'migration' } }
> 
> { 'struct': 'YankInstanceBlockNode',
>   'data': { 'node-name': 'str' } }
> 
> { 'struct': 'YankInstanceChardev',
>   'data' { 'label': 'str' } }
> 
> { 'union': 'YankInstance',
>   'base': { 'type': 'YankInstanceType' },
>   'discriminator': 'type',
>   'data': {
>   'block-node': 'YankInstanceBlockNode',
>   'chardev': 'YankInstanceChardev' } }
> 
> { 'command': 'yank',
>   'data': { 'instances': ['YankInstance'] },
>   'allow-oob': true }

This proposal looks good to me. Does everyone agree?

Regards,
Lukas Straub

> If you're confident nothing will ever be added to YankInstanceBlockNode
> and YankInstanceChardev, you could use str instead.
> 
> > +
> > +##
> > +# @yank:
> > +#
> > +# Recover from hanging qemu by yanking the specified instances.  
> 
> What's an "instance", and what does it mean to "yank" it?
> 
> The documentation of YankInstances above gives a clue on what an
> "instance" is: presumably a block node, a character device or the
> migration job.
> 
> I guess a YankInstance is whatever the code chooses to make one, and the
> current code makes these three kinds.
> 
> Does it make every block node a YankInstance?  If not, which ones?
> 
> Does it make every character device a YankInstance?  If not, which ones?
> 
> Does it make migration always a YankInstance?  If not, when?
> 
> > +#
> > +# Takes @YankInstances as argument.
> > +#
> > +# Returns: nothing.
> > +#
> > +# Example:
> > +#
> > +# -> { "execute": "yank", "arguments": { "instances": ["blockdev:nbd0"] } }
> > +# <- { "return": {} }
> > +#
> > +# Since: 5.1
> > +##
> > +{ 'command': 'yank', 'data': 'YankInstances', 'allow-oob': true }
> > +
> > +##
> > +# @query-yank:
> > +#
> > +# Query yank instances.
> > +#
> > +# Returns: @YankInstances
> > +#
> > +# Example:
> > +#
> > +# -> { "execute": "query-yank" }
> > +# <- { "return": { "instances": ["blockdev:nbd0"] } }
> > +#
> > +# Since: 5.1
> > +##
> > +{ 'command': 'query-yank', 'returns': 'YankInstances', 'allow-oob': true }
> ...


pgpCnC6MLDYNR.pgp
Description: OpenPGP digital signature


[PATCH v8 5/8] io/channel-tls.c: make qio_channel_tls_shutdown thread-safe

2020-09-01 Thread Lukas Straub
Make qio_channel_tls_shutdown thread-safe by using atomics when
accessing tioc->shutdown.

Signed-off-by: Lukas Straub 
Acked-by: Stefan Hajnoczi 
Reviewed-by: Daniel P. Berrangé 
---
 io/channel-tls.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/io/channel-tls.c b/io/channel-tls.c
index 7ec8ceff2f..b350c84640 100644
--- a/io/channel-tls.c
+++ b/io/channel-tls.c
@@ -23,6 +23,7 @@
 #include "qemu/module.h"
 #include "io/channel-tls.h"
 #include "trace.h"
+#include "qemu/atomic.h"


 static ssize_t qio_channel_tls_write_handler(const char *buf,
@@ -277,7 +278,8 @@ static ssize_t qio_channel_tls_readv(QIOChannel *ioc,
 return QIO_CHANNEL_ERR_BLOCK;
 }
 } else if (errno == ECONNABORTED &&
-   (tioc->shutdown & QIO_CHANNEL_SHUTDOWN_READ)) {
+   (atomic_load_acquire(>shutdown) &
+QIO_CHANNEL_SHUTDOWN_READ)) {
 return 0;
 }

@@ -361,7 +363,7 @@ static int qio_channel_tls_shutdown(QIOChannel *ioc,
 {
 QIOChannelTLS *tioc = QIO_CHANNEL_TLS(ioc);

-tioc->shutdown |= how;
+atomic_or(>shutdown, how);

 return qio_channel_shutdown(tioc->master, how, errp);
 }
--
2.20.1



pgp3TnxMdPuJ_.pgp
Description: OpenPGP digital signature


[PATCH v8 7/8] MAINTAINERS: Add myself as maintainer for yank feature

2020-09-01 Thread Lukas Straub
I'll maintain this for now as the colo usecase is the first user
of this functionality.

Signed-off-by: Lukas Straub 
Acked-by: Stefan Hajnoczi 
Reviewed-by: Daniel P. Berrangé 
---
 MAINTAINERS | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 5a22c8be42..c1d450e25a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2615,6 +2615,12 @@ F: util/uuid.c
 F: include/qemu/uuid.h
 F: tests/test-uuid.c

+Yank feature
+M: Lukas Straub 
+S: Odd fixes
+F: util/yank.c
+F: include/qemu/yank.h
+
 COLO Framework
 M: zhanghailiang 
 S: Maintained
--
2.20.1



pgpo5JnRMLSNT.pgp
Description: OpenPGP digital signature


[PATCH v8 3/8] chardev/char-socket.c: Add yank feature

2020-09-01 Thread Lukas Straub
Register a yank function to shutdown the socket on yank.

Signed-off-by: Lukas Straub 
Acked-by: Stefan Hajnoczi 
Reviewed-by: Daniel P. Berrangé 
---
 chardev/char-socket.c | 31 +++
 1 file changed, 31 insertions(+)

diff --git a/chardev/char-socket.c b/chardev/char-socket.c
index ef62dbf3d7..8e2865ca83 100644
--- a/chardev/char-socket.c
+++ b/chardev/char-socket.c
@@ -34,6 +34,7 @@
 #include "qapi/error.h"
 #include "qapi/clone-visitor.h"
 #include "qapi/qapi-visit-sockets.h"
+#include "qemu/yank.h"

 #include "chardev/char-io.h"

@@ -69,6 +70,7 @@ typedef struct {
 size_t read_msgfds_num;
 int *write_msgfds;
 size_t write_msgfds_num;
+char *yank_name;

 SocketAddress *addr;
 bool is_listen;
@@ -413,6 +415,11 @@ static void tcp_chr_free_connection(Chardev *chr)

 tcp_set_msgfds(chr, NULL, 0);
 remove_fd_in_watch(chr);
+if (s->state == TCP_CHARDEV_STATE_CONNECTING
+|| s->state == TCP_CHARDEV_STATE_CONNECTED) {
+yank_unregister_function(s->yank_name, yank_generic_iochannel,
+ QIO_CHANNEL(s->sioc));
+}
 object_unref(OBJECT(s->sioc));
 s->sioc = NULL;
 object_unref(OBJECT(s->ioc));
@@ -916,6 +923,8 @@ static int tcp_chr_add_client(Chardev *chr, int fd)
 }
 tcp_chr_change_state(s, TCP_CHARDEV_STATE_CONNECTING);
 tcp_chr_set_client_ioc_name(chr, sioc);
+yank_register_function(s->yank_name, yank_generic_iochannel,
+   QIO_CHANNEL(sioc));
 ret = tcp_chr_new_client(chr, sioc);
 object_unref(OBJECT(sioc));
 return ret;
@@ -930,6 +939,8 @@ static void tcp_chr_accept(QIONetListener *listener,

 tcp_chr_change_state(s, TCP_CHARDEV_STATE_CONNECTING);
 tcp_chr_set_client_ioc_name(chr, cioc);
+yank_register_function(s->yank_name, yank_generic_iochannel,
+   QIO_CHANNEL(cioc));
 tcp_chr_new_client(chr, cioc);
 }

@@ -945,6 +956,8 @@ static int tcp_chr_connect_client_sync(Chardev *chr, Error 
**errp)
 object_unref(OBJECT(sioc));
 return -1;
 }
+yank_register_function(s->yank_name, yank_generic_iochannel,
+   QIO_CHANNEL(sioc));
 tcp_chr_new_client(chr, sioc);
 object_unref(OBJECT(sioc));
 return 0;
@@ -960,6 +973,8 @@ static void tcp_chr_accept_server_sync(Chardev *chr)
 tcp_chr_change_state(s, TCP_CHARDEV_STATE_CONNECTING);
 sioc = qio_net_listener_wait_client(s->listener);
 tcp_chr_set_client_ioc_name(chr, sioc);
+yank_register_function(s->yank_name, yank_generic_iochannel,
+   QIO_CHANNEL(sioc));
 tcp_chr_new_client(chr, sioc);
 object_unref(OBJECT(sioc));
 }
@@ -1070,6 +1085,10 @@ static void char_socket_finalize(Object *obj)
 object_unref(OBJECT(s->tls_creds));
 }
 g_free(s->tls_authz);
+if (s->yank_name) {
+yank_unregister_instance(s->yank_name);
+g_free(s->yank_name);
+}

 qemu_chr_be_event(chr, CHR_EVENT_CLOSED);
 }
@@ -1085,6 +1104,8 @@ static void qemu_chr_socket_connected(QIOTask *task, void 
*opaque)

 if (qio_task_propagate_error(task, )) {
 tcp_chr_change_state(s, TCP_CHARDEV_STATE_DISCONNECTED);
+yank_unregister_function(s->yank_name, yank_generic_iochannel,
+ QIO_CHANNEL(sioc));
 check_report_connect_error(chr, err);
 goto cleanup;
 }
@@ -1118,6 +1139,8 @@ static void tcp_chr_connect_client_async(Chardev *chr)
 tcp_chr_change_state(s, TCP_CHARDEV_STATE_CONNECTING);
 sioc = qio_channel_socket_new();
 tcp_chr_set_client_ioc_name(chr, sioc);
+yank_register_function(s->yank_name, yank_generic_iochannel,
+   QIO_CHANNEL(sioc));
 /*
  * Normally code would use the qio_channel_socket_connect_async
  * method which uses a QIOTask + qio_task_set_error internally
@@ -1360,6 +1383,14 @@ static void qmp_chardev_open_socket(Chardev *chr,
 qemu_chr_set_feature(chr, QEMU_CHAR_FEATURE_FD_PASS);
 }

+s->yank_name = g_strconcat("chardev:", chr->label, NULL);
+yank_register_instance(s->yank_name, errp);
+if (*errp) {
+g_free(s->yank_name);
+s->yank_name = NULL;
+return;
+}
+
 /* be isn't opened until we get a connection */
 *be_opened = false;

--
2.20.1



pgpMhRYCj1F3A.pgp
Description: OpenPGP digital signature


[PATCH v8 2/8] block/nbd.c: Add yank feature

2020-09-01 Thread Lukas Straub
Register a yank function which shuts down the socket and sets
s->state = NBD_CLIENT_QUIT. This is the same behaviour as if an
error occured.

Signed-off-by: Lukas Straub 
Acked-by: Stefan Hajnoczi 
Reviewed-by: Daniel P. Berrangé 
---
 block/nbd.c | 129 
 1 file changed, 80 insertions(+), 49 deletions(-)

diff --git a/block/nbd.c b/block/nbd.c
index 7bb881fef4..8632cf5340 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -35,6 +35,7 @@
 #include "qemu/option.h"
 #include "qemu/cutils.h"
 #include "qemu/main-loop.h"
+#include "qemu/atomic.h"

 #include "qapi/qapi-visit-sockets.h"
 #include "qapi/qmp/qstring.h"
@@ -43,6 +44,8 @@
 #include "block/nbd.h"
 #include "block/block_int.h"

+#include "qemu/yank.h"
+
 #define EN_OPTSTR ":exportname="
 #define MAX_NBD_REQUESTS16

@@ -84,6 +87,8 @@ typedef struct BDRVNBDState {
 NBDReply reply;
 BlockDriverState *bs;

+char *yank_name;
+
 /* Connection parameters */
 uint32_t reconnect_delay;
 SocketAddress *saddr;
@@ -93,10 +98,10 @@ typedef struct BDRVNBDState {
 char *x_dirty_bitmap;
 } BDRVNBDState;

-static QIOChannelSocket *nbd_establish_connection(SocketAddress *saddr,
-  Error **errp);
-static int nbd_client_handshake(BlockDriverState *bs, QIOChannelSocket *sioc,
-Error **errp);
+static int nbd_establish_connection(BlockDriverState *bs, SocketAddress *saddr,
+Error **errp);
+static int nbd_client_handshake(BlockDriverState *bs, Error **errp);
+static void nbd_yank(void *opaque);

 static void nbd_clear_bdrvstate(BDRVNBDState *s)
 {
@@ -109,17 +114,19 @@ static void nbd_clear_bdrvstate(BDRVNBDState *s)
 s->tlscredsid = NULL;
 g_free(s->x_dirty_bitmap);
 s->x_dirty_bitmap = NULL;
+g_free(s->yank_name);
+s->yank_name = NULL;
 }

 static void nbd_channel_error(BDRVNBDState *s, int ret)
 {
 if (ret == -EIO) {
-if (s->state == NBD_CLIENT_CONNECTED) {
+if (atomic_load_acquire(>state) == NBD_CLIENT_CONNECTED) {
 s->state = s->reconnect_delay ? NBD_CLIENT_CONNECTING_WAIT :
 NBD_CLIENT_CONNECTING_NOWAIT;
 }
 } else {
-if (s->state == NBD_CLIENT_CONNECTED) {
+if (atomic_load_acquire(>state) == NBD_CLIENT_CONNECTED) {
 qio_channel_shutdown(s->ioc, QIO_CHANNEL_SHUTDOWN_BOTH, NULL);
 }
 s->state = NBD_CLIENT_QUIT;
@@ -170,7 +177,7 @@ static void nbd_client_attach_aio_context(BlockDriverState 
*bs,
  * s->connection_co is either yielded from nbd_receive_reply or from
  * nbd_co_reconnect_loop()
  */
-if (s->state == NBD_CLIENT_CONNECTED) {
+if (atomic_load_acquire(>state) == NBD_CLIENT_CONNECTED) {
 qio_channel_attach_aio_context(QIO_CHANNEL(s->ioc), new_context);
 }

@@ -237,20 +244,20 @@ static void nbd_teardown_connection(BlockDriverState *bs)

 static bool nbd_client_connecting(BDRVNBDState *s)
 {
-return s->state == NBD_CLIENT_CONNECTING_WAIT ||
-s->state == NBD_CLIENT_CONNECTING_NOWAIT;
+NBDClientState state = atomic_load_acquire(>state);
+return state == NBD_CLIENT_CONNECTING_WAIT ||
+state == NBD_CLIENT_CONNECTING_NOWAIT;
 }

 static bool nbd_client_connecting_wait(BDRVNBDState *s)
 {
-return s->state == NBD_CLIENT_CONNECTING_WAIT;
+return atomic_load_acquire(>state) == NBD_CLIENT_CONNECTING_WAIT;
 }

 static coroutine_fn void nbd_reconnect_attempt(BDRVNBDState *s)
 {
 int ret;
 Error *local_err = NULL;
-QIOChannelSocket *sioc;

 if (!nbd_client_connecting(s)) {
 return;
@@ -283,21 +290,21 @@ static coroutine_fn void 
nbd_reconnect_attempt(BDRVNBDState *s)
 /* Finalize previous connection if any */
 if (s->ioc) {
 nbd_client_detach_aio_context(s->bs);
+yank_unregister_function(s->yank_name, nbd_yank, s->bs);
 object_unref(OBJECT(s->sioc));
 s->sioc = NULL;
 object_unref(OBJECT(s->ioc));
 s->ioc = NULL;
 }

-sioc = nbd_establish_connection(s->saddr, _err);
-if (!sioc) {
+if (nbd_establish_connection(s->bs, s->saddr, _err) < 0) {
 ret = -ECONNREFUSED;
 goto out;
 }

 bdrv_dec_in_flight(s->bs);

-ret = nbd_client_handshake(s->bs, sioc, _err);
+ret = nbd_client_handshake(s->bs, _err);

 if (s->drained) {
 s->wait_drained_end = true;
@@ -334,7 +341,7 @@ static coroutine_fn void nbd_co_reconnect_loop(BDRVNBDState 
*s)
 nbd_reconnect_attempt(s);

 while (nbd_client_connecting(s)) {
-if (s->state == NBD_CLIENT_CONNECTING_WAIT &&
+if (atomic_load_acquire(>state) == NBD_CLIENT_

[PATCH v8 1/8] Introduce yank feature

2020-09-01 Thread Lukas Straub
The yank feature allows to recover from hanging qemu by "yanking"
at various parts. Other qemu systems can register themselves and
multiple yank functions. Then all yank functions for selected
instances can be called by the 'yank' out-of-band qmp command.
Available instances can be queried by a 'query-yank' oob command.

Signed-off-by: Lukas Straub 
Acked-by: Stefan Hajnoczi 
---
 include/qemu/yank.h |  81 +++
 qapi/misc.json  |  62 +++
 util/meson.build|   1 +
 util/yank.c | 187 
 4 files changed, 331 insertions(+)
 create mode 100644 include/qemu/yank.h
 create mode 100644 util/yank.c

diff --git a/include/qemu/yank.h b/include/qemu/yank.h
new file mode 100644
index 00..c5ab53965a
--- /dev/null
+++ b/include/qemu/yank.h
@@ -0,0 +1,81 @@
+/*
+ * QEMU yank feature
+ *
+ * Copyright (c) Lukas Straub 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#ifndef YANK_H
+#define YANK_H
+
+typedef void (YankFn)(void *opaque);
+
+/**
+ * yank_register_instance: Register a new instance.
+ *
+ * This registers a new instance for yanking. Must be called before any yank
+ * function is registered for this instance.
+ *
+ * This function is thread-safe.
+ *
+ * @instance_name: The globally unique name of the instance.
+ * @errp: Error object.
+ */
+void yank_register_instance(const char *instance_name, Error **errp);
+
+/**
+ * yank_unregister_instance: Unregister a instance.
+ *
+ * This unregisters a instance. Must be called only after every yank function
+ * of the instance has been unregistered.
+ *
+ * This function is thread-safe.
+ *
+ * @instance_name: The name of the instance.
+ */
+void yank_unregister_instance(const char *instance_name);
+
+/**
+ * yank_register_function: Register a yank function
+ *
+ * This registers a yank function. All limitations of qmp oob commands apply
+ * to the yank function as well. See docs/devel/qapi-code-gen.txt under
+ * "An OOB-capable command handler must satisfy the following conditions".
+ *
+ * This function is thread-safe.
+ *
+ * @instance_name: The name of the instance
+ * @func: The yank function
+ * @opaque: Will be passed to the yank function
+ */
+void yank_register_function(const char *instance_name,
+YankFn *func,
+void *opaque);
+
+/**
+ * yank_unregister_function: Unregister a yank function
+ *
+ * This unregisters a yank function.
+ *
+ * This function is thread-safe.
+ *
+ * @instance_name: The name of the instance
+ * @func: func that was passed to yank_register_function
+ * @opaque: opaque that was passed to yank_register_function
+ */
+void yank_unregister_function(const char *instance_name,
+  YankFn *func,
+  void *opaque);
+
+/**
+ * yank_generic_iochannel: Generic yank function for iochannel
+ *
+ * This is a generic yank function which will call qio_channel_shutdown on the
+ * provided QIOChannel.
+ *
+ * @opaque: QIOChannel to shutdown
+ */
+void yank_generic_iochannel(void *opaque);
+#endif
diff --git a/qapi/misc.json b/qapi/misc.json
index 9d32820dc1..7de330416a 100644
--- a/qapi/misc.json
+++ b/qapi/misc.json
@@ -1615,3 +1615,65 @@
 ##
 { 'command': 'query-vm-generation-id', 'returns': 'GuidInfo' }

+##
+# @YankInstances:
+#
+# @instances: List of yank instances.
+#
+# A yank instance can be yanked with the "yank" qmp command to recover from a
+# hanging qemu.
+#
+# Yank instances are named after the following schema:
+# "blockdev:" refers to a block device. Currently only nbd block
+# devices are implemented.
+# "chardev:" refers to a chardev. Currently only socket chardevs
+# are implemented.
+# "migration" refers to the migration currently in progress.
+#
+# Currently implemented yank instances:
+#  -nbd block device:
+#   Yanking it will shutdown the connection to the nbd server without
+#   attempting to reconnect.
+#  -socket chardev:
+#   Yanking it will shutdown the connected socket.
+#  -migration:
+#   Yanking it will shutdown all migration connections.
+#
+# Since: 5.2
+##
+{ 'struct': 'YankInstances', 'data': {'instances': ['str'] } }
+
+##
+# @yank:
+#
+# Recover from hanging qemu by yanking the specified instances. See
+# "YankInstances" for more information.
+#
+# Takes @YankInstances as argument.
+#
+# Returns: nothing.
+#
+# Example:
+#
+# -> { "execute": "yank", "arguments": { "instances": ["blockdev:nbd0"] } }
+# <- { "return": {} }
+#
+# Since: 5.2
+##
+{ 'command': 'yank', 'data': 'YankInstances', 'allow-oob': true }
+
+##
+# @query-yank:
+#
+# Query yank instances. See "YankInstances" for more information.
+#
+# Returns: @YankInstances
+#
+# Example:
+#
+# -> { "execute"

  1   2   >