Re: [PATCH v2 (BROKEN) 0/6] migration: bring improved savevm/loadvm/delvm to QMP

2020-08-27 Thread Markus Armbruster
Kevin Wolf  writes:

> Am 27.08.2020 um 13:06 hat Markus Armbruster geschrieben:
>> Daniel P. Berrangé  writes:
>> 
>> > On Wed, Aug 26, 2020 at 07:28:24PM +0100, Daniel P. Berrangé wrote:
>> >> On Wed, Aug 26, 2020 at 05:52:06PM +0200, Markus Armbruster wrote:
>> >> > Open questions:
>> >> > 
>> >> > * Do we want the QMP command to delete existing snapshots with
>> >> >   conflicting tag / ID, like HMP savevm does?  Or do we want it to fail
>> >> >   the transaction?
>> >> 
>> >> The intent is for the QMP commands to operate exclusively on
>> >> 'tags', and never consider "ID".
>> >
>> > I forgot that even HMP ignores "ID" now and works exclusively in terms
>> > of tags since:
>> >
>> >
>> >   commit 6ca080453ea403959ccde661030ca16264acc181
>> >   Author: Daniel Henrique Barboza 
>> >   Date:   Wed Nov 7 11:09:58 2018 -0200
>> >
>> > block/snapshot.c: eliminate use of ID input in snapshot operations
>> 
>> Almost a year after I sent the memo I quoted.  It's an incompatible
>> change, but nobody complained, and I'm glad we got this issue out of the
>> way.
>
> FWIW, I would have ignored any complaint about incompatible changes in
> HMP. It's not supposed to be a stable API, but UI.

The iffy part is actually the loss of ability to access snapshots that
lack a name.  Complaints about that would have been valid, I think.
Fortunately, there have been none.




Re: [PATCH 4/6] block, migration: add bdrv_finalize_vmstate helper

2020-08-27 Thread Markus Armbruster
Daniel P. Berrangé  writes:

> On Thu, Aug 27, 2020 at 04:02:38PM +0300, Denis V. Lunev wrote:
>> On 8/27/20 3:58 PM, Daniel P. Berrangé wrote:
>> > On Thu, Jul 09, 2020 at 04:26:42PM +0300, Denis V. Lunev wrote:
>> >> Right now bdrv_fclose() is just calling bdrv_flush().
>> >>
>> >> The problem is that the migration code works inefficiently in block
>> >> layer terms: block layer functions are frequently called for very small
>> >> pieces of unaligned data. The block layer is capable of working this
>> >> way, but it is very slow.
>> >>
>> >> This patch is a preparation for the introduction of an intermediate
>> >> buffer at the block driver state. It would be beneficial to separate
>> >> the conventional bdrv_flush() from the closing of the QEMU file in the
>> >> migration code.
>> >>
>> >> The patch also forces the bdrv_finalize_vmstate() operation inside the
>> >> synchronous blk_save_vmstate() operation. This helper is used only
>> >> from qemu-io.
>> >>
>> >> Signed-off-by: Denis V. Lunev 
>> >> Reviewed-by: Vladimir Sementsov-Ogievskiy 
>> >> CC: Kevin Wolf 
>> >> CC: Max Reitz 
>> >> CC: Stefan Hajnoczi 
>> >> CC: Fam Zheng 
>> >> CC: Juan Quintela 
>> >> CC: "Dr. David Alan Gilbert" 
>> >> CC: Denis Plotnikov 
>> >> ---
>> >>  block/block-backend.c |  6 +-
>> >>  block/io.c| 15 +++
>> >>  include/block/block.h |  5 +
>> >>  migration/savevm.c|  4 
>> >>  4 files changed, 29 insertions(+), 1 deletion(-)
>> >> diff --git a/migration/savevm.c b/migration/savevm.c
>> >> index 45c9dd9d8a..d8a94e312c 100644
>> >> --- a/migration/savevm.c
>> >> +++ b/migration/savevm.c
>> >> @@ -150,6 +150,10 @@ static ssize_t block_get_buffer(void *opaque, uint8_t *buf, int64_t pos,
>> >>  
>> >>  static int bdrv_fclose(void *opaque, Error **errp)
>> >>  {
>> >> +int err = bdrv_finalize_vmstate(opaque);
>> >> +if (err < 0) {
>> >> +return err;
>> > This is returning an error without having populated 'errp', which means
>> > the caller will be missing the error diagnosis
>> 
>> but this behaves exactly like the branch below:
>> bdrv_flush() could return an error too, and errp
>> is not filled in the same way.
>
> Doh, it seems the only caller passes NULL for the errp too,
> so it is a redundant parameter. So nothing wrong with your
> patch after all.

Not setting an error on failure is plainly wrong.

If it works because all callers pass NULL, then the obvious fix is to
drop the @errp parameter.

I agree it's not this patch's fault.  It needs fixing anyway.  If you
have to respin for some other reason, including a fix would be nice.
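
A minimal version of that fix might look like this (a sketch only,
assuming we keep @errp rather than dropping it):

    static int bdrv_fclose(void *opaque, Error **errp)
    {
        int err = bdrv_finalize_vmstate(opaque);
        if (err < 0) {
            /* set @errp so callers that pass a real pointer get a message */
            error_setg_errno(errp, -err, "Failed to finalize vmstate");
            return err;
        }
        err = bdrv_flush(opaque);
        if (err < 0) {
            error_setg_errno(errp, -err, "Failed to flush vmstate");
            return err;
        }
        return 0;
    }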




Re: [PATCH v1 0/7] vhost-user-blk: fix the migration issue and enhance qtests

2020-08-27 Thread Raphael Norwitz
On Thu, Aug 27, 2020 at 8:17 AM Michael S. Tsirkin  wrote:
>
> On Tue, Aug 04, 2020 at 01:36:45PM +0300, Dima Stepanov wrote:
> > Reference e-mail threads:
> >   - https://lists.gnu.org/archive/html/qemu-devel/2020-05/msg01509.html
> >   - https://lists.gnu.org/archive/html/qemu-devel/2020-05/msg05241.html
> >
> > If a vhost-user daemon is used as a backend for the vhost device, then we
> > should consider the possibility of a disconnect at any moment. There was
> > a general question here: should we consider it an error or an okay state
> > for the vhost-user devices during the migration process?
> > I think the disconnect event for the vhost-user devices should not break
> > the migration process, because:
> >   - the device will be in the stopped state, so it will not be changed
> > during migration
> >   - if a reconnect happens, the migration log will be reinitialized as
> > part of the reconnect/init process:
> > #0  vhost_log_global_start (listener=0x563989cf7be0)
> > at hw/virtio/vhost.c:920
> > #1  0x56398603d8bc in listener_add_address_space (listener=0x563989cf7be0,
> > as=0x563986ea4340 ) at softmmu/memory.c:2664
> > #2  0x56398603dd30 in memory_listener_register (listener=0x563989cf7be0,
> > as=0x563986ea4340 ) at softmmu/memory.c:2740
> > #3  0x563985fd6956 in vhost_dev_init (hdev=0x563989cf7bd8,
> > opaque=0x563989cf7e30, backend_type=VHOST_BACKEND_TYPE_USER,
> > busyloop_timeout=0) at hw/virtio/vhost.c:1385
> > #4  0x563985f7d0b8 in vhost_user_blk_connect (dev=0x563989cf7990)
> > at hw/block/vhost-user-blk.c:315
> > #5  0x563985f7d3f6 in vhost_user_blk_event (opaque=0x563989cf7990,
> > event=CHR_EVENT_OPENED) at hw/block/vhost-user-blk.c:379
> > The first patch in the patchset fixes this issue by setting the vhost
> > device to the stopped state in the disconnect handler and checking it in
> > the vhost_migration_log() routine before returning from the function.
> > The qtest framework was updated to test vhost-user-blk functionality. The
> > vhost-user-blk/vhost-user-blk-tests/migrate_reconnect test was added to
> > reproduce the original issue found.
>
>
> Raphael any input on this?

Just posted comments on the vhost/vhost-user-blk side. Will look at
the test code next.

>
> > Dima Stepanov (7):
> >   vhost: recheck dev state in the vhost_migration_log routine
> >   vhost: check queue state in the vhost_dev_set_log routine
> >   tests/qtest/vhost-user-test: prepare the tests for adding new dev
> > class
> >   tests/qtest/libqos/virtio-blk: add support for vhost-user-blk
> >   tests/qtest/vhost-user-test: add support for the vhost-user-blk device
> >   tests/qtest/vhost-user-test: add migrate_reconnect test
> >   tests/qtest/vhost-user-test: enable the reconnect tests
> >
> >  hw/block/vhost-user-blk.c  |  13 +-
> >  hw/virtio/vhost.c  |  39 -
> >  include/hw/virtio/vhost-user-blk.h |   1 +
> >  tests/qtest/libqos/virtio-blk.c|  14 ++
> >  tests/qtest/vhost-user-test.c  | 291 +++--
> >  5 files changed, 311 insertions(+), 47 deletions(-)
> >
> > --
> > 2.7.4
>
>



Re: [PATCH v2 2/7] vhost: check queue state in the vhost_dev_set_log routine

2020-08-27 Thread Raphael Norwitz
On Mon, Aug 24, 2020 at 4:41 AM Dima Stepanov  wrote:
>
> If the vhost-user-blk daemon provides only one virtqueue, but the device
> was added with several queues, then QEMU will send more VHOST-USER
> commands than expected by the daemon side. The vhost_virtqueue_start()
> routine handles this case by checking the return value of the
> virtio_queue_get_desc_addr() function call. Add the same check to the
> vhost_dev_set_log() routine.
>
> Signed-off-by: Dima Stepanov 
> ---
>  hw/virtio/vhost.c | 12 
>  1 file changed, 12 insertions(+)
>
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index ffef7ab..a33ffd4 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -825,12 +825,24 @@ static int vhost_dev_set_features(struct vhost_dev *dev,
>  static int vhost_dev_set_log(struct vhost_dev *dev, bool enable_log)
>  {
>  int r, i, idx;
> +hwaddr addr;
> +
>  r = vhost_dev_set_features(dev, enable_log);
>  if (r < 0) {
>  goto err_features;
>  }
>  for (i = 0; i < dev->nvqs; ++i) {
>  idx = dev->vhost_ops->vhost_get_vq_index(dev, dev->vq_index + i);
> +addr = virtio_queue_get_desc_addr(dev->vdev, idx);
> +if (!addr) {
> +/*
> + * The queue might not be ready to start. If this is
> + * the case, there is no reason to continue the process.
> + * Similar logic is used by the vhost_virtqueue_start()
> + * routine.
> + */

Shouldn’t we goto err_vq here to reset the logging state of any vqs
which have already been set?
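
Something along these lines, perhaps (untested sketch against the quoted
hunk; r would also have to be set to a negative value before jumping so
the caller sees the failure):

    addr = virtio_queue_get_desc_addr(dev->vdev, idx);
    if (!addr) {
        /* Queue not ready: unwind the vqs whose log state was already set */
        r = -EINVAL;
        goto err_vq;
    }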

> +break;
> +}
>  r = vhost_virtqueue_set_addr(dev, dev->vqs + i, idx,
>   enable_log);
>  if (r < 0) {
> --
> 2.7.4
>
>



Re: [PATCH v2 1/7] vhost: recheck dev state in the vhost_migration_log routine

2020-08-27 Thread Raphael Norwitz
Generally I’m happy with this. Some comments:

On Mon, Aug 24, 2020 at 4:40 AM Dima Stepanov  wrote:
>
> vhost-user devices can get a disconnect in the middle of the VHOST-USER
> handshake at migration start. If the disconnect event happens right
> before the next VHOST-USER command is sent, then the vhost_dev_set_log()
> call in the vhost_migration_log() function will return an error. This
> error will lead to the assert() and close the QEMU migration source
> process. For the vhost-user devices the disconnect event should not
> break the migration process, because:
>   - the device will be in the stopped state, so it will not be changed
> during migration
>   - if a reconnect happens, the migration log will be reinitialized as
> part of the reconnect/init process:
> #0  vhost_log_global_start (listener=0x563989cf7be0)
> at hw/virtio/vhost.c:920
> #1  0x56398603d8bc in listener_add_address_space (listener=0x563989cf7be0,
> as=0x563986ea4340 ) at softmmu/memory.c:2664
> #2  0x56398603dd30 in memory_listener_register (listener=0x563989cf7be0,
> as=0x563986ea4340 ) at softmmu/memory.c:2740
> #3  0x563985fd6956 in vhost_dev_init (hdev=0x563989cf7bd8,
> opaque=0x563989cf7e30, backend_type=VHOST_BACKEND_TYPE_USER,
> busyloop_timeout=0) at hw/virtio/vhost.c:1385
> #4  0x563985f7d0b8 in vhost_user_blk_connect (dev=0x563989cf7990)
> at hw/block/vhost-user-blk.c:315
> #5  0x563985f7d3f6 in vhost_user_blk_event (opaque=0x563989cf7990,
> event=CHR_EVENT_OPENED) at hw/block/vhost-user-blk.c:379
> Update the vhost-user-blk device with an internal started field which
> will be used for initialization and cleanup. The disconnect event will
> set the overall vhost device to the stopped state, so it can be used by
> the vhost_migration_log routine.
> Such an approach could be propagated to the other vhost-user devices,
> but a better idea is just to use the same connect/disconnect code for
> all the vhost-user devices.
>
> This migration issue was slightly discussed earlier:
>   - https://lists.gnu.org/archive/html/qemu-devel/2020-05/msg01509.html
>   - https://lists.gnu.org/archive/html/qemu-devel/2020-05/msg05241.html
>

What you’re doing here on a structural level is adding an additional
flag “started” to VhostUserBlk to track whether the device really
needs to be stopped and cleaned up on a vhost level on a disconnect.
Can you make that more clear in the commit message as you have in the
comments?


> Signed-off-by: Dima Stepanov 
> ---
>  hw/block/vhost-user-blk.c  | 19 ---
>  hw/virtio/vhost.c  | 27 ---
>  include/hw/virtio/vhost-user-blk.h | 10 ++
>  3 files changed, 50 insertions(+), 6 deletions(-)
>
> diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
> index a00b854..5573e89 100644
> --- a/hw/block/vhost-user-blk.c
> +++ b/hw/block/vhost-user-blk.c
> @@ -150,6 +150,7 @@ static int vhost_user_blk_start(VirtIODevice *vdev)
>  error_report("Error starting vhost: %d", -ret);
>  goto err_guest_notifiers;
>  }
> +s->started = true;
>
>  /* guest_notifier_mask/pending not used yet, so just unmask
>   * everything here. virtio-pci will do the right thing by
> @@ -175,6 +176,11 @@ static void vhost_user_blk_stop(VirtIODevice *vdev)
>  VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
>  int ret;
>
> +if (!s->started) {
> +return;
> +}
> +s->started = false;
> +
>  if (!k->set_guest_notifiers) {
>  return;
>  }
> @@ -341,9 +347,7 @@ static void vhost_user_blk_disconnect(DeviceState *dev)
>  }
>  s->connected = false;
>
> -if (s->dev.started) {
> -vhost_user_blk_stop(vdev);
> -}
> +vhost_user_blk_stop(vdev);
>
>  vhost_dev_cleanup(&s->dev);
>  }
> @@ -399,6 +403,15 @@ static void vhost_user_blk_event(void *opaque, QEMUChrEvent event)
>  NULL, NULL, false);
>  aio_bh_schedule_oneshot(ctx, vhost_user_blk_chr_closed_bh, opaque);
>  }
> +
> +/*
> + * Move the vhost device to the stopped state. The vhost-user device
> + * will be cleaned up and disconnected in a BH. This can be useful in
> + * the vhost migration code: if a disconnect was caught, the general
> + * vhost code can get the device state without knowing its type (in
> + * this case vhost-user).
> + */
> +s->dev.started = false;
>  break;
>  case CHR_EVENT_BREAK:
>  case CHR_EVENT_MUX_IN:
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index 1a1384e..ffef7ab 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -861,21 +861,42 @@ static int vhost_migration_log(MemoryListener *listener, bool enable)
>  dev->log_enabled = enable;
>  return 0;
>  }
> +
> +r = 0;
>  if (!enable) {
>   

Re: [PATCH v5 00/10] preallocate filter

2020-08-27 Thread Vladimir Sementsov-Ogievskiy

21.08.2020 17:11, Vladimir Sementsov-Ogievskiy wrote:

Hi all!

Here is a filter, which does preallocation on write.

In Virtuozzo we have to deal with a custom distributed storage
solution, where allocation is a relatively expensive operation. We have
to work around it in QEMU, so here is a new filter.


I have a problem now with this thing.

We need preallocation. But we don't want to explicitly specify it in all
the management tools, so it should be inserted by default. It's OK for us
to keep this default different from upstream... But there are problems
with the implicitly inserted filter (actually, iotests fail and I have
failed to fix them):

1. I have to set bs->inherits_from for the filter and its child by hand
after bdrv_replace_node(), otherwise bdrv_check_perm doesn't work.

2. I have to set filter_bs->implicit and teach bdrv_refresh_filename()
to ignore implicit filters when it checks for drv->bdrv_file_open, to
avoid JSON filenames appearing in backing file names.

3. And the real design problem, which seems impossible to fix: reopen is
broken, simply because the user is not prepared for the fact that the
file child is a filter rather than a file node, which has different
options and doesn't support the options of file-posix.

And it seems all of this (mostly [3]) shows that implicitly inserting
the filter is nearly impossible..

So, what are possible solutions?

In Virtuozzo 7 we have the preallocation feature implemented inside the
qcow2 driver. This is very uncomfortable: we have to handle each possible
over-EOF write to the underlying node (to keep data_end in sync, so that
the preallocation can be shrunk on close()).. I don't like this way and
don't want to port it..

Another option is implementing preallocation inside the file-posix
driver. Then, instead of the BDRV_REQ_NO_WAIT flag, I'll need to extend
the serialising requests API (bdrv_make_request_serialising() is already
used in file-posix.c) to support no-wait behavior plus expanding the
serialising request bounds. This option seems feasible, so I'll try this
way if there are no other ideas.

The filter is obviously the true way: we use the generic block layer for
native request serialising, don't need to catch every write in the qcow2
driver, don't need to modify any other driver, and get a universal thing.
But how can we insert it implicitly (or at least automatically in some
cases) and avoid all the problems?
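
For comparison, an explicitly configured chain avoids all three problems,
at the cost of every management tool knowing about the filter. Assuming
the driver name is "preallocate", something like:

    -blockdev driver=file,filename=disk.img,node-name=file0 \
    -blockdev driver=preallocate,file=file0,node-name=prealloc0 \
    -blockdev driver=qcow2,file=prealloc0,node-name=fmt0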

--
Best regards,
Vladimir



Re: [PATCH v3 00/74] qom: Automated conversion of type checking boilerplate

2020-08-27 Thread Eduardo Habkost
On Tue, Aug 25, 2020 at 03:19:56PM -0400, Eduardo Habkost wrote:
> git tree for this series:
> https://github.com/ehabkost/qemu-hacks/tree/work/qom-macros-autoconvert
> 
> This is an extension of the series previously submitted by
> Daniel[1], including a script that will convert existing type
> checker macros automatically.
> 
> [1] https://lore.kernel.org/qemu-devel/20200723181410.3145233-1-berrange@redhat.com/

I'm queueing the following patches on machine-next:

[PATCH v3 01/74] e1000: Rename QOM class cast macros
[PATCH v3 02/74] megasas: Rename QOM class cast macros
[PATCH v3 03/74] vmw_pvscsi: Rename QOM class cast macros
[PATCH v3 04/74] pl110: Rename pl110_version enum values
[PATCH v3 05/74] allwinner-h3: Rename memmap enum constants
[PATCH v3 06/74] aspeed_soc: Rename memmap/irqmap enum constants
[PATCH v3 07/74] opentitan: Rename memmap enum constants
[PATCH v3 10/74] aspeed_timer: Fix ASPEED_TIMER macro definition
[PATCH v3 11/74] versatile: Fix typo in PCI_VPB_HOST definition
[PATCH v3 12/74] virtio-ccw: Fix definition of VIRTIO_CCW_BUS_GET_CLASS
[PATCH v3 13/74] hvf: Add missing include
[PATCH v3 14/74] hcd-dwc2: Rename USB_*CLASS macros for consistency
[PATCH v3 15/74] tulip: Move TulipState typedef to header
[PATCH v3 16/74] throttle-groups: Move ThrottleGroup typedef to header
[PATCH v3 17/74] pci: Move PCIBusClass typedef to pci.h
[PATCH v3 18/74] i8254: Move PITCommonState/PITCommonClass typedefs to i8254.h
[PATCH v3 19/74] hvf: Move HVFState typedef to hvf.h
[PATCH v3 20/74] mcf_fec: Move mcf_fec_state typedef to header
[PATCH v3 21/74] s390_flic: Move KVMS390FLICState typedef to header
[PATCH v3 22/74] can_emu: Delete macros for non-existing typedef
[PATCH v3 23/74] nubus: Delete unused NUBUS_BRIDGE macro
[PATCH v3 24/74] platform-bus: Delete macros for non-existing typedef
[PATCH v3 25/74] armsse: Rename QOM macros to avoid conflicts
[PATCH v3 26/74] xen-legacy-backend: Add missing typedef XenLegacyDevice
[PATCH v3 27/74] spapr: Move typedef SpaprMachineState to spapr.h
[PATCH v3 28/74] s390x: Move typedef SCLPEventFacility to event-facility.h
[PATCH v3 29/74] vhost-user-gpu: Move QOM macro to header
[PATCH v3 30/74] ahci: Move QOM macros to header
[PATCH v3 31/74] i8257: Move QOM macro to header
[PATCH v3 32/74] ahci: Move QOM macro to header
[PATCH v3 33/74] pckbd: Move QOM macro to header
[PATCH v3 34/74] vmbus: Move QOM macros to vmbus.h
[PATCH v3 35/74] virtio-serial-bus: Move QOM macros to header
[PATCH v3 36/74] piix: Move QOM macros to header
[PATCH v3 37/74] auxbus: Move QOM macros to header
[PATCH v3 38/74] rocker: Move QOM macros to header
[PATCH v3 39/74] pxa2xx: Move QOM macros to header
[PATCH v3 40/74] mptsas: Move QOM macros to header
[PATCH v3 41/74] kvm: Move QOM macros to kvm.h
[PATCH v3 42/74] vfio/pci: Move QOM macros to header
[PATCH v3 43/74] nubus: Rename class type checking macros
[PATCH v3 48/74] s390-virtio-ccw: Rename S390_MACHINE_CLASS macro
[PATCH v3 49/74] swim: Rename struct SWIM to Swim
[PATCH v3 50/74] migration: Rename class type checking macros

-- 
Eduardo




Re: [PATCH v7 0/8] Introduce 'yank' oob qmp command to recover from hanging qemu

2020-08-27 Thread Dr. David Alan Gilbert
* Daniel P. Berrangé (berra...@redhat.com) wrote:
> On Thu, Aug 27, 2020 at 10:42:46AM +0200, Lukas Straub wrote:
> > On Tue, 18 Aug 2020 14:26:31 +0200
> > Lukas Straub  wrote:
> > 
> > > On Tue, 4 Aug 2020 10:11:22 +0200
> > > Lukas Straub  wrote:
> > > 
> > > > Hello Everyone,
> > > > In many cases, if qemu has a network connection (qmp, migration,
> > > > chardev, etc.) to some other server and that server dies or hangs,
> > > > qemu hangs too. These patches introduce the new 'yank' out-of-band
> > > > qmp command to recover from these kinds of hangs. The different
> > > > subsystems register callbacks which get executed with the yank
> > > > command. For example the callback can shutdown() a socket. This is
> > > > intended for the colo use-case, but it can be used for other things
> > > > too of course.
> > > > 
> > > > Regards,
> > > > Lukas Straub
> > > > 
> > > > v7:
> > > >  -yank_register_instance now returns error via Error **errp
> > > >   instead of aborting
> > > >  -dropped "chardev/char.c: Check for duplicate id before creating
> > > >   chardev"
> > > > 
> > > > v6:
> > > >  -add Reviewed-by and Acked-by tags
> > > >  -rebase on master
> > > >  -lots of changes in nbd due to rebase
> > > >  -only take maintainership of util/yank.c and include/qemu/yank.h
> > > >   (Daniel P. Berrangé)
> > > >  -fix a crash discovered by the newly added chardev test
> > > >  -fix the test itself
> > > > 
> > > > v5:
> > > >  -move yank.c to util/
> > > >  -move yank.h to include/qemu/
> > > >  -add license to yank.h
> > > >  -use const char*
> > > >  -nbd: use atomic_store_release and atomic_load_aqcuire
> > > >  -io-channel: ensure thread-safety and document it
> > > >  -add myself as maintainer for yank
> > > > 
> > > > v4:
> > > >  -fix build errors...
> > > > 
> > > > v3:
> > > >  -don't touch softmmu/vl.c, use __constructor__ attribute instead
> > > >   (Paolo Bonzini)
> > > >  -fix build errors
> > > >  -rewrite migration patch so it actually passes all tests
> > > > 
> > > > v2:
> > > >  -don't touch io/ code anymore
> > > >  -always register yank functions
> > > >  -'yank' now takes a list of instances to yank
> > > >  -'query-yank' returns a list of yankable instances
> > > > 
> > > > Lukas Straub (8):
> > > >   Introduce yank feature
> > > >   block/nbd.c: Add yank feature
> > > >   chardev/char-socket.c: Add yank feature
> > > >   migration: Add yank feature
> > > >   io/channel-tls.c: make qio_channel_tls_shutdown thread-safe
> > > >   io: Document thread-safety of qio_channel_shutdown
> > > >   MAINTAINERS: Add myself as maintainer for yank feature
> > > >   tests/test-char.c: Wait for the chardev to connect in
> > > > char_socket_client_dupid_test
> > > > 
> > > >  MAINTAINERS   |   6 ++
> > > >  block/nbd.c   | 129 +++-
> > > >  chardev/char-socket.c |  31 ++
> > > >  include/io/channel.h  |   2 +
> > > >  include/qemu/yank.h   |  80 +++
> > > >  io/channel-tls.c  |   6 +-
> > > >  migration/channel.c   |  12 +++
> > > >  migration/migration.c |  25 -
> > > >  migration/multifd.c   |  10 ++
> > > >  migration/qemu-file-channel.c |   6 ++
> > > >  migration/savevm.c|   6 ++
> > > >  qapi/misc.json|  45 +
> > > >  tests/Makefile.include|   2 +-
> > > >  tests/test-char.c |   1 +
> > > >  util/Makefile.objs|   1 +
> > > >  util/yank.c   | 184 ++
> > > >  16 files changed, 493 insertions(+), 53 deletions(-)
> > > >  create mode 100644 include/qemu/yank.h
> > > >  create mode 100644 util/yank.c
> > > > 
> > > > --
> > > > 2.20.1  
> > > 
> > > Ping...
> > 
> > Ping 2...
> > 
> > Also, can the different subsystems have a look at this and give their ok?
> 
> We need ACKs from the NBD, migration and chardev maintainers, for the
> respective patches, then I think this series is ready for a pull request.

I'm happy from Migration:

Acked-by: Dr. David Alan Gilbert 

> Once acks arrive, I'm happy to send a PULL unless someone else has a
> desire to do it.

Looks like Markus would like a QMP tweak; but other than that I'd also
be happy to take it via migration; whichever is easiest.

Dave

> 
> Regards,
> Daniel
> -- 
> |: https://berrange.com       -o- https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org        -o- https://fstop138.berrange.com :|
> |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
-- 
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




Re: [PATCH v3 1/7] migration: improve error reporting of block driver state name

2020-08-27 Thread Dr. David Alan Gilbert
* Daniel P. Berrangé (berra...@redhat.com) wrote:
> With blockdev, a BlockDriverState may not have a device name,
> so using a node name is required as an alternative.
> 
> Reviewed-by: Dr. David Alan Gilbert 
> Signed-off-by: Daniel P. Berrangé 

Queuing this one by itself since it's useful anyway.

Dave

> ---
>  migration/savevm.c | 12 ++--
>  tests/qemu-iotests/267.out |  4 ++--
>  2 files changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/migration/savevm.c b/migration/savevm.c
> index a843d202b5..304d98ff78 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -2682,7 +2682,7 @@ int save_snapshot(const char *name, Error **errp)
>  
>  if (!bdrv_all_can_snapshot(&bs)) {
>  error_setg(errp, "Device '%s' is writable but does not support "
> -   "snapshots", bdrv_get_device_name(bs));
> +   "snapshots", bdrv_get_device_or_node_name(bs));
>  return ret;
>  }
>  
> @@ -2691,7 +2691,7 @@ int save_snapshot(const char *name, Error **errp)
>  ret = bdrv_all_delete_snapshot(name, &bs1, errp);
>  if (ret < 0) {
>  error_prepend(errp, "Error while deleting snapshot on device "
> -  "'%s': ", bdrv_get_device_name(bs1));
> +  "'%s': ", bdrv_get_device_or_node_name(bs1));
>  return ret;
>  }
>  }
> @@ -2766,7 +2766,7 @@ int save_snapshot(const char *name, Error **errp)
>  ret = bdrv_all_create_snapshot(sn, bs, vm_state_size, &bs);
>  if (ret < 0) {
>  error_setg(errp, "Error while creating snapshot on '%s'",
> -   bdrv_get_device_name(bs));
> +   bdrv_get_device_or_node_name(bs));
>  goto the_end;
>  }
>  
> @@ -2884,14 +2884,14 @@ int load_snapshot(const char *name, Error **errp)
>  if (!bdrv_all_can_snapshot(&bs)) {
>  error_setg(errp,
> "Device '%s' is writable but does not support snapshots",
> -   bdrv_get_device_name(bs));
> +   bdrv_get_device_or_node_name(bs));
>  return -ENOTSUP;
>  }
>  ret = bdrv_all_find_snapshot(name, &bs);
>  if (ret < 0) {
>  error_setg(errp,
> "Device '%s' does not have the requested snapshot '%s'",
> -   bdrv_get_device_name(bs), name);
> +   bdrv_get_device_or_node_name(bs), name);
>  return ret;
>  }
>  
> @@ -2920,7 +2920,7 @@ int load_snapshot(const char *name, Error **errp)
>  ret = bdrv_all_goto_snapshot(name, &bs, errp);
>  if (ret < 0) {
>  error_prepend(errp, "Could not load snapshot '%s' on '%s': ",
> -  name, bdrv_get_device_name(bs));
> +  name, bdrv_get_device_or_node_name(bs));
>  goto err_drain;
>  }
>  
> diff --git a/tests/qemu-iotests/267.out b/tests/qemu-iotests/267.out
> index d6d80c099f..215902b3ad 100644
> --- a/tests/qemu-iotests/267.out
> +++ b/tests/qemu-iotests/267.out
> @@ -81,11 +81,11 @@ Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728
>  Testing: -blockdev driver=file,filename=TEST_DIR/t.IMGFMT,node-name=file
>  QEMU X.Y.Z monitor - type 'help' for more information
>  (qemu) savevm snap0
> -Error: Device '' is writable but does not support snapshots
> +Error: Device 'file' is writable but does not support snapshots
>  (qemu) info snapshots
>  No available block device supports snapshots
>  (qemu) loadvm snap0
> -Error: Device '' is writable but does not support snapshots
> +Error: Device 'file' is writable but does not support snapshots
>  (qemu) quit
>  
>  Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728
> -- 
> 2.26.2
> 
-- 
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




Re: [PATCH 0/1] qcow2: Skip copy-on-write when allocating a zero cluster

2020-08-27 Thread Brian Foster
On Wed, Aug 26, 2020 at 08:34:32PM +0200, Alberto Garcia wrote:
> On Tue 25 Aug 2020 09:47:24 PM CEST, Brian Foster  wrote:
> > My fio fallocates the entire file by default with this command. Is that
> > the intent of this particular test? I added --fallocate=none to my test
> > runs to incorporate the allocation cost in the I/Os.
> 
> That wasn't intentional, you're right, it should use --fallocate=none (I
> don't see a big difference in my test anyway).
> 
> >> The Linux version is 4.19.132-1 from Debian.
> >
> > Thanks. I don't have LUKS in the mix on my box, but I was running on a
> > more recent kernel (Fedora 5.7.15-100). I threw v4.19 on the box and
> > saw a bit more of a delta between XFS (~14k iops) and ext4 (~24k). The
> > same test shows ~17k iops for XFS and ~19k iops for ext4 on v5.7. If I
> > increase the size of the LVM volume from 126G to >1TB, ext4 runs at
> > roughly the same rate and XFS closes the gap to around ~19k iops as
> > well. I'm not sure what might have changed since v4.19, but care to
> > see if this is still an issue on a more recent kernel?
> 
> Ok, I gave 5.7.10-1 a try but I still get similar numbers.
> 

Strange.

> Perhaps with a larger filesystem there would be a difference? I don't
> know.
> 

Perhaps. I believe Dave mentioned earlier how log size might affect
things.

I created a 125GB lvm volume and see slight deltas in iops going from
testing directly on the block device, to a fully allocated file on
XFS/ext4 and then to a preallocated file on XFS/ext4. In both cases the
numbers are comparable between XFS and ext4. On XFS, I can reproduce a
serious drop in iops if I reduce the default ~64MB log down to 8MB.
Perhaps you could try increasing your log ('-lsize=...' at mkfs time)
and see if that changes anything?
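
For example, something like (assuming mkfs.xfs on a dedicated test
device; the 64m value mirrors the default mentioned above):

    mkfs.xfs -f -l size=64m /dev/test/lv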

Beyond that, I'd probably try to normalize and simplify your storage
stack if you wanted to narrow it down further. E.g., clean format the
same bdev for XFS and ext4 and pull out things like LUKS just to rule
out any poor interactions.

Brian

> Berto
> 




Re: qcow2 overlay performance

2020-08-27 Thread Yoonho Park
Below is the data with the cache disabled ("virsh attach-disk ... --cache
none"). I added the previous data for reference. Overall, random read
performance was not affected significantly. This makes sense because a
cache is probably not going to help random read performance much. BTW,
how big is the cache by default? Random write performance for 4K blocks
seems more "sane" now. Random write performance for 64K blocks is
interesting because base image (0 overlays) performance is 2X slower
than with 1-5 overlays.
We believe this is because the random writes to an overlay actually turn
into sequential writes (appends to the overlay). Does this make sense?


NO CACHE

        --------- 4K blocks ---------    --------- 64K blocks ---------
olays   rd bw   rd iops   wr bw   wr iops    rd bw   rd iops   wr bw    wr iops
0       4478    1119      4684    1171       57001   890       42050    657
1       4490    1122      2503    625        56656   885       93483    1460
2       4385    1096      2425    606        56055   875       94445    1475
3       4334    1083      2307    576        55422   865       95826    1497
4       4356    1089      2168    542        56070   876       95957    1499
5       4234    1058      2308    577        54039   844       92936    1452


DEFAULT CACHE (WRITEBACK)

        --------- 4K blocks ---------    --------- 64K blocks ---------
olays   rd bw   rd iops   wr bw    wr iops    rd bw   rd iops   wr bw    wr iops
0       4510    1127      438028   109507     67854   1060      521808   8153
1       4692    1173      2924     731        66801   1043      104297   1629
2       4524    1131      2781     695        66801   1043      104297   1629
3       4573    1143      3034     758        65500   1023      95627    1494
4       4556    1139      2971     742        67973   1062      108099   1689
5       4471    1117      2937     734        66615   1040      98472    1538

On Wed, Aug 26, 2020 at 9:18 AM Kevin Wolf  wrote:

> Am 26.08.2020 um 02:46 hat Yoonho Park geschrieben:
> > I have been measuring the performance of qcow2 overlays, and I am hoping
> to
> > get some help in understanding the data I collected. In my experiments, I
> > created a VM and attached a 16G qcow2 disk to it using "qemu-img create"
> > and "virsh attach-disk". I use fio to fill it. I create some number of
> > snapshots (overlays) using "virsh snapshot-create-as". To mimic user
> > activity between taking snapshots, I use fio to randomly write to 10% of
> > each overlay right after I create it. After creating the overlays, I use
> > fio to measure random read performance and random write performance with
> 2
> > different blocks sizes, 4K and 64K. 64K is the qcow2 cluster size used by
> > the 16G qcow2 disk and the overlays (verified with "qemu-img info"). fio
> is
> > using the attached disk as a block device to avoid as much file system
> > overhead as possible. The VM, 16G disk, and snapshots (overlays) all
> reside
> > on local disk. Below are the measurements I collected for up to 5
> overlays.
> >
> >
> >         --------- 4K blocks ---------    --------- 64K blocks ---------
> > olays   rd bw   rd iops   wr bw    wr iops    rd bw   rd iops   wr bw    wr iops
> > 0       4510    1127      438028   109507     67854   1060      521808   8153
> > 1       4692    1173      2924     731        66801   1043      104297   1629
> > 2       4524    1131      2781     695        66801   1043      104297   1629
> > 3       4573    1143      3034     758        65500   1023      95627    1494
> > 4       4556    1139      2971     742        67973   1062      108099   1689
> > 5       4471    1117      2937     734        66615   1040      98472    1538
> >
> >
> > Read performance is not affected by overlays. However, write performance
> > drops even with a single overlay. My understanding is that writing 4K
> > blocks requires a read-modify-write because you must fetch a complete
> > cluster from deeper in the overlay chain before writing to the active
> > overlay. However, this does not explain the drop in performance when
> > writing 64K blocks. The performance drop is not as significant, but if
> the
> > write block size matches the cluster size then it seems that there should
> > not be any performance drop because the write can go directly to the
> active
> > overlay.
>
> Can you share the QEMU command line you used?
>
> As you say, it is expected that layer 0 is a bit faster, however not to
> this degree. My guess would be that you use the default cache mode
> (which includes cache.direct=off), so your results are skewed because
> the first requests will only write to memory (the host page cache) and
> only later requests will actually hit the disk.
>
> For benchmarking, you should always use cache.direct=on (or an alias
> that contains it, such as cache=none).
>
> > Another issue I hit is that I cannot set or change the cluster size of
> > overlays. Is this possible with "virsh snapshot-create-as"?
>
> That's a libvirt question. Peter, can you help?
>
> > I am using qemu-system-x86_64 version 4.2.0 and virsh version 6.0.0.
> >
> >
> > Thank you for any insights or advice you have.
>
> Kevin
>
>


[PATCH v2] qcow2: Skip copy-on-write when allocating a zero cluster

2020-08-27 Thread Alberto Garcia
Since commit c8bb23cbdbe32f5c326365e0a82e1b0e68cdcd8a when a write
request results in a new allocation QEMU first tries to see if the
rest of the cluster outside the written area contains only zeroes.

In that case, instead of doing a normal copy-on-write operation and
writing explicit zero buffers to disk, the code zeroes the whole
cluster efficiently using pwrite_zeroes() with BDRV_REQ_NO_FALLBACK.

This improves performance very significantly but it only happens when
we are writing to an area that was completely unallocated before. Zero
clusters (QCOW2_CLUSTER_ZERO_*) are treated like normal clusters and
are therefore slower to allocate.

This happens because the code uses bdrv_is_allocated_above() rather than
bdrv_block_status_above(). The former is not as accurate for this
purpose but it is faster. However in the case of qcow2 the underlying
call does already report zero clusters just fine so there is no reason
why we cannot use that information.

After testing 4KB writes on an image that only contains zero clusters
this patch results in almost five times more IOPS.

Signed-off-by: Alberto Garcia 
---
v2:
- Add new, simpler API: bdrv_is_unallocated_or_zero_above()

 include/block/block.h |  2 ++
 block/io.c| 24 
 block/qcow2.c | 37 +
 3 files changed, 47 insertions(+), 16 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index 6e36154061..477a72b2e9 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -496,6 +496,8 @@ int bdrv_is_allocated(BlockDriverState *bs, int64_t offset, 
int64_t bytes,
 int bdrv_is_allocated_above(BlockDriverState *top, BlockDriverState *base,
 bool include_base, int64_t offset, int64_t bytes,
 int64_t *pnum);
+int bdrv_is_unallocated_or_zero_above(BlockDriverState *bs, int64_t offset,
+  int64_t bytes);
 
 bool bdrv_is_read_only(BlockDriverState *bs);
 int bdrv_can_set_read_only(BlockDriverState *bs, bool read_only,
diff --git a/block/io.c b/block/io.c
index ad3a51ed53..94faa3f9d7 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2557,6 +2557,30 @@ int bdrv_block_status(BlockDriverState *bs, int64_t 
offset, int64_t bytes,
offset, bytes, pnum, map, file);
 }
 
+/*
+ * Check @bs (and its backing chain) to see if the range defined
+ * by @offset and @bytes is unallocated or known to read as zeroes.
+ * Return 1 if that is the case, 0 otherwise and -errno on error.
+ * This test is meant to be fast rather than accurate so returning 0
+ * does not guarantee non-zero data.
+ */
+int bdrv_is_unallocated_or_zero_above(BlockDriverState *bs, int64_t offset,
+  int64_t bytes)
+{
+int ret;
+int64_t pnum = bytes;
+
+ret = bdrv_common_block_status_above(bs, NULL, false, offset,
+ bytes, &pnum, NULL, NULL);
+
+if (ret < 0) {
+return ret;
+}
+
+return (pnum == bytes) &&
+((ret & BDRV_BLOCK_ZERO) || (!(ret & BDRV_BLOCK_ALLOCATED)));
+}
+
 int coroutine_fn bdrv_is_allocated(BlockDriverState *bs, int64_t offset,
int64_t bytes, int64_t *pnum)
 {
diff --git a/block/qcow2.c b/block/qcow2.c
index da56b1a4df..29bea548db 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -2391,26 +2391,28 @@ static bool merge_cow(uint64_t offset, unsigned bytes,
 return false;
 }
 
-static bool is_unallocated(BlockDriverState *bs, int64_t offset, int64_t bytes)
-{
-int64_t nr;
-return !bytes ||
-(!bdrv_is_allocated_above(bs, NULL, false, offset, bytes, &nr) &&
- nr == bytes);
-}
-
-static bool is_zero_cow(BlockDriverState *bs, QCowL2Meta *m)
+/*
+ * Return 1 if the COW regions read as zeroes, 0 if not, < 0 on error.
+ * Note that returning 0 does not guarantee non-zero data.
+ */
+static int is_zero_cow(BlockDriverState *bs, QCowL2Meta *m)
 {
 /*
  * This check is designed for optimization shortcut so it must be
  * efficient.
- * Instead of is_zero(), use is_unallocated() as it is faster (but not
- * as accurate and can result in false negatives).
+ * Instead of is_zero(), use bdrv_is_unallocated_or_zero_above() as
+ * it is faster (but not as accurate and can result in false negatives).
  */
-return is_unallocated(bs, m->offset + m->cow_start.offset,
-  m->cow_start.nb_bytes) &&
-   is_unallocated(bs, m->offset + m->cow_end.offset,
-  m->cow_end.nb_bytes);
+int ret;
+
+    ret = bdrv_is_unallocated_or_zero_above(bs, m->offset + m->cow_start.offset,
+                                            m->cow_start.nb_bytes);
+if (ret <= 0) {
+return ret;
+}
+
+return bdrv_is_unallocated_or_zero_above(bs, m->offset + m->cow_end.offset,
+ m->cow_end.nb_bytes);
 }
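
For reference, the scenario from the commit message can be set up roughly
like this (a sketch; image name and size are arbitrary):

    qemu-img create -f qcow2 test.qcow2 1G
    qemu-io -c 'write -z 0 1G' test.qcow2   # fill the image with zero clusters
    # then run a 4KB random-write benchmark against the attached disk,
    # e.g. fio with ioengine=libaio, direct=1, bs=4k, rw=randwrite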

Re: [PATCH v7 1/8] Introduce yank feature

2020-08-27 Thread Daniel P . Berrangé
On Thu, Aug 27, 2020 at 02:37:00PM +0200, Markus Armbruster wrote:
> I apologize for not reviewing this much earlier.
> 
> Lukas Straub  writes:
> 
> > The yank feature allows recovering from a hanging qemu by "yanking"
> > at various parts. Other qemu subsystems can register themselves and
> > multiple yank functions. Then all yank functions for selected
> > instances can be called by the 'yank' out-of-band qmp command.
> > Available instances can be queried by a 'query-yank' oob command.
> >
> > Signed-off-by: Lukas Straub 
> > Acked-by: Stefan Hajnoczi 
> > ---
> >  include/qemu/yank.h |  80 +++
> >  qapi/misc.json  |  45 +++
> >  util/Makefile.objs  |   1 +
> >  util/yank.c | 184 
> >  4 files changed, 310 insertions(+)
> >  create mode 100644 include/qemu/yank.h
> >  create mode 100644 util/yank.c


> > + *
> > + * This function is thread-safe.
> > + *
> > + * @instance_name: The name of the instance
> > + * @func: The yank function
> > + * @opaque: Will be passed to the yank function
> > + */
> > +void yank_register_function(const char *instance_name,
> > +YankFn *func,
> > +void *opaque);
> 
> Pardon my ignorance... can you explain to me why a yank instance may
> have multiple functions?

multifd migration - there's a single migration "instance", which
has multiple FDs open, each of which has a func registered.
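
So with the API from this patch, the registration looks roughly like
this (a sketch, not the exact code from the migration patch; the
callback name is made up):

    yank_register_instance("migration", errp);
    /* one yank function per multifd channel */
    yank_register_function("migration", migration_yank_iochannel,
                           QIO_CHANNEL(ioc));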


> > diff --git a/qapi/misc.json b/qapi/misc.json
> > index 9d32820dc1..0d6a8f20b7 100644
> > --- a/qapi/misc.json
> > +++ b/qapi/misc.json
> > @@ -1615,3 +1615,48 @@
> >  ##
> >  { 'command': 'query-vm-generation-id', 'returns': 'GuidInfo' }
> >
> > +##
> > +# @YankInstances:
> > +#
> > +# @instances: List of yank instances.
> > +#
> > +# Yank instances are named after the following schema:
> > +# "blockdev:", "chardev:" and "migration"
> > +#
> > +# Since: 5.1
> > +##
> > +{ 'struct': 'YankInstances', 'data': {'instances': ['str'] } }
> 
> I'm afraid this is a problematic QMP interface.
> 
> By making YankInstances a struct, you keep the door open to adding more
> members, which is good.
> 
> But by making its 'instances' member a ['str'], you close the door to
> using anything but a single string for the individual instances.  Not so
> good.
> 
> The single string encodes information which the QMP client will need to
> parse out of the string. We frown on that in QMP. Use QAPI's complex
> type capabilities for structured data.
> 
> Could you use something like this instead?
> 
> { 'enum': 'YankInstanceType',
>   'data': { 'block-node', 'chardev', 'migration' } }
> 
> { 'struct': 'YankInstanceBlockNode',
>   'data': { 'node-name': 'str' } }
> 
> { 'struct': 'YankInstanceChardev',
>   'data' { 'label': 'str' } }
> 
> { 'union': 'YankInstance',
>   'base': { 'type': 'YankInstanceType' },
>   'discriminator': 'type',
>   'data': {
>   'block-node': 'YankInstanceBlockNode',
>   'chardev': 'YankInstanceChardev' } }
> 
> { 'command': 'yank',
>   'data': { 'instances': ['YankInstance'] },
>   'allow-oob': true }
> 
> If you're confident nothing will ever be added to YankInstanceBlockNode
> and YankInstanceChardev, you could use str instead.

I raised this idea back in the v1 posting. There's a thread starting
at this:

  https://lists.gnu.org/archive/html/qemu-devel/2020-05/msg02881.html

which eventually concluded that plain string is best.

I think that's right because the yank feature is providing generic
infrastructure with zero knowledge of backends that are using it.
The only interaction between the yank functionality and its callers
is via an opaque function callback. So there's no conceptual place
at which the yank infra would want to know about the backends nor
pass any backend specific config params to it.
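
For illustration, with the plain-string interface the command just takes
the opaque instance names (a sketch; "node0" is a made-up node name):

    { "execute": "yank",
      "arguments": { "instances": [ "blockdev:node0", "migration" ] } }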

> > +##
> > +# @yank:
> > +#
> > +# Recover from hanging qemu by yanking the specified instances.
> 
> What's an "instance", and what does it mean to "yank" it?
> 
> The documentation of YankInstances above gives a clue on what an
> "instance" is: presumably a block node, a character device or the
> migration job.
> 
> I guess a YankInstance is whatever the code chooses to make one, and the
> current code makes these three kinds.
> 
> Does it make every block node a YankInstance?  If not, which ones?
> 
> Does it make every character device a YankInstance?  If not, which ones?
> 
> Does it make migration always a YankInstance?  If not, when?

From the POV of the yank code, the "instance" is intentionally opaque.
Whatever the instance wants to do with its callback is acceptable, as
long as it isn't violating the rules for the callback by doing stuff
which can block. In practice right now, an instance is anything which
has a network connection associated with it, but it doesn't have to be
restricted to just networking. Anything which is talking to a service
which can get "stuck" is in scope for supporting yanking.

eg I could imagine an instance having s

Re: [PATCH v7 0/8] Introduce 'yank' oob qmp command to recover from hanging qemu

2020-08-27 Thread Markus Armbruster
Daniel P. Berrangé  writes:

> On Thu, Aug 27, 2020 at 10:42:46AM +0200, Lukas Straub wrote:
[...]
>> Also, can the different subsystems have a look at this and give their ok?
>
> We need ACKs from the NBD, migration and chardev maintainers, for the
> respective patches, then I think this series is ready for a pull request.

The QMP interface and its documentation need a bit of work, see my
review of PATCH 1.  I'm hopeful v8 will nail it.

> Once acks arrive, I'm happy to send a PULL unless someone else has a
> desire to do it.

Not yet, please.




[PULL 10/13] virtio-blk-pci: default num_queues to -smp N

2020-08-27 Thread Michael S. Tsirkin
From: Stefan Hajnoczi 

Automatically size the number of virtio-blk-pci request virtqueues to
match the number of vCPUs.  Other transports continue to default to 1
request virtqueue.

A 1:1 virtqueue:vCPU mapping ensures that completion interrupts are
handled on the same vCPU that submitted the request.  No IPI is
necessary to complete an I/O request and performance is improved.  The
maximum number of MSI-X vectors and virtqueues limit are respected.

Performance improves from 78k to 104k IOPS on a 32 vCPU guest with 101
virtio-blk-pci devices (ioengine=libaio, iodepth=1, bs=4k, rw=randread
with NVMe storage).
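
For example, on a 4-vCPU guest a plain "-device virtio-blk-pci,drive=d0"
now gets 4 request virtqueues, while the existing num-queues property can
still override the default (the values here are illustrative):

    -device virtio-blk-pci,drive=d0,num-queues=8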

Signed-off-by: Stefan Hajnoczi 
Reviewed-by: Cornelia Huck 
Reviewed-by: Pankaj Gupta 
Message-Id: <20200818143348.310613-7-stefa...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 include/hw/virtio/virtio-blk.h | 2 ++
 hw/block/virtio-blk.c  | 6 +-
 hw/core/machine.c  | 1 +
 hw/virtio/virtio-blk-pci.c | 7 ++-
 4 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h
index b1334c3904..7539c2b848 100644
--- a/include/hw/virtio/virtio-blk.h
+++ b/include/hw/virtio/virtio-blk.h
@@ -30,6 +30,8 @@ struct virtio_blk_inhdr
 unsigned char status;
 };
 
+#define VIRTIO_BLK_AUTO_NUM_QUEUES UINT16_MAX
+
 struct VirtIOBlkConf
 {
 BlockConf conf;
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 413783693c..2204ba149e 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -1147,6 +1147,9 @@ static void virtio_blk_device_realize(DeviceState *dev, Error **errp)
 error_setg(errp, "Device needs media, but drive is empty");
 return;
 }
+if (conf->num_queues == VIRTIO_BLK_AUTO_NUM_QUEUES) {
+conf->num_queues = 1;
+}
 if (!conf->num_queues) {
 error_setg(errp, "num-queues property must be larger than 0");
 return;
@@ -1281,7 +1284,8 @@ static Property virtio_blk_properties[] = {
 #endif
 DEFINE_PROP_BIT("request-merging", VirtIOBlock, conf.request_merging, 0,
 true),
-DEFINE_PROP_UINT16("num-queues", VirtIOBlock, conf.num_queues, 1),
+DEFINE_PROP_UINT16("num-queues", VirtIOBlock, conf.num_queues,
+   VIRTIO_BLK_AUTO_NUM_QUEUES),
 DEFINE_PROP_UINT16("queue-size", VirtIOBlock, conf.queue_size, 256),
 DEFINE_PROP_BOOL("seg-max-adjust", VirtIOBlock, conf.seg_max_adjust, true),
 DEFINE_PROP_LINK("iothread", VirtIOBlock, conf.iothread, TYPE_IOTHREAD,
diff --git a/hw/core/machine.c b/hw/core/machine.c
index 9ee2aa0f7b..7f65fa8743 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -31,6 +31,7 @@
 GlobalProperty hw_compat_5_1[] = {
 { "vhost-scsi", "num_queues", "1"},
 { "vhost-user-scsi", "num_queues", "1"},
+{ "virtio-blk-device", "num-queues", "1"},
 { "virtio-scsi-device", "num_queues", "1"},
 };
 const size_t hw_compat_5_1_len = G_N_ELEMENTS(hw_compat_5_1);
diff --git a/hw/virtio/virtio-blk-pci.c b/hw/virtio/virtio-blk-pci.c
index 849cc7dfd8..37c6e0aeb4 100644
--- a/hw/virtio/virtio-blk-pci.c
+++ b/hw/virtio/virtio-blk-pci.c
@@ -50,9 +50,14 @@ static void virtio_blk_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp)
 {
 VirtIOBlkPCI *dev = VIRTIO_BLK_PCI(vpci_dev);
 DeviceState *vdev = DEVICE(&dev->vdev);
+VirtIOBlkConf *conf = &dev->vdev.conf;
+
+if (conf->num_queues == VIRTIO_BLK_AUTO_NUM_QUEUES) {
+conf->num_queues = virtio_pci_optimal_num_queues(0);
+}
 
 if (vpci_dev->nvectors == DEV_NVECTORS_UNSPECIFIED) {
-vpci_dev->nvectors = dev->vdev.conf.num_queues + 1;
+vpci_dev->nvectors = conf->num_queues + 1;
 }
 
 qdev_realize(vdev, BUS(&vpci_dev->bus), errp);
-- 
MST




[PULL 11/13] vhost-user-blk-pci: default num_queues to -smp N

2020-08-27 Thread Michael S. Tsirkin
From: Stefan Hajnoczi 

Automatically size the number of request virtqueues to match the number
of vCPUs.  This ensures that completion interrupts are handled on the
same vCPU that submitted the request.  No IPI is necessary to complete
an I/O request and performance is improved.  The maximum number of MSI-X
vectors and virtqueues limit are respected.

Signed-off-by: Stefan Hajnoczi 
Reviewed-by: Cornelia Huck 
Reviewed-by: Raphael Norwitz 
Message-Id: <20200818143348.310613-8-stefa...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 include/hw/virtio/vhost-user-blk.h | 2 ++
 hw/block/vhost-user-blk.c  | 6 +-
 hw/core/machine.c  | 1 +
 hw/virtio/vhost-user-blk-pci.c | 4 
 4 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/include/hw/virtio/vhost-user-blk.h b/include/hw/virtio/vhost-user-blk.h
index 34ad6f0c0e..292d17147c 100644
--- a/include/hw/virtio/vhost-user-blk.h
+++ b/include/hw/virtio/vhost-user-blk.h
@@ -25,6 +25,8 @@
 #define VHOST_USER_BLK(obj) \
 OBJECT_CHECK(VHostUserBlk, (obj), TYPE_VHOST_USER_BLK)
 
+#define VHOST_USER_BLK_AUTO_NUM_QUEUES UINT16_MAX
+
 typedef struct VHostUserBlk {
 VirtIODevice parent_obj;
 CharBackend chardev;
diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
index a00b854736..39aec42dae 100644
--- a/hw/block/vhost-user-blk.c
+++ b/hw/block/vhost-user-blk.c
@@ -420,6 +420,9 @@ static void vhost_user_blk_device_realize(DeviceState *dev, Error **errp)
 return;
 }
 
+if (s->num_queues == VHOST_USER_BLK_AUTO_NUM_QUEUES) {
+s->num_queues = 1;
+}
 if (!s->num_queues || s->num_queues > VIRTIO_QUEUE_MAX) {
 error_setg(errp, "vhost-user-blk: invalid number of IO queues");
 return;
@@ -531,7 +534,8 @@ static const VMStateDescription vmstate_vhost_user_blk = {
 
 static Property vhost_user_blk_properties[] = {
 DEFINE_PROP_CHR("chardev", VHostUserBlk, chardev),
-DEFINE_PROP_UINT16("num-queues", VHostUserBlk, num_queues, 1),
+DEFINE_PROP_UINT16("num-queues", VHostUserBlk, num_queues,
+   VHOST_USER_BLK_AUTO_NUM_QUEUES),
 DEFINE_PROP_UINT32("queue-size", VHostUserBlk, queue_size, 128),
 DEFINE_PROP_BIT("config-wce", VHostUserBlk, config_wce, 0, true),
 DEFINE_PROP_END_OF_LIST(),
diff --git a/hw/core/machine.c b/hw/core/machine.c
index 7f65fa8743..ea26d61237 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -30,6 +30,7 @@
 
 GlobalProperty hw_compat_5_1[] = {
 { "vhost-scsi", "num_queues", "1"},
+{ "vhost-user-blk", "num-queues", "1"},
 { "vhost-user-scsi", "num_queues", "1"},
 { "virtio-blk-device", "num-queues", "1"},
 { "virtio-scsi-device", "num_queues", "1"},
diff --git a/hw/virtio/vhost-user-blk-pci.c b/hw/virtio/vhost-user-blk-pci.c
index 4f5d5cbf44..a62a71e067 100644
--- a/hw/virtio/vhost-user-blk-pci.c
+++ b/hw/virtio/vhost-user-blk-pci.c
@@ -54,6 +54,10 @@ static void vhost_user_blk_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp)
 VHostUserBlkPCI *dev = VHOST_USER_BLK_PCI(vpci_dev);
 DeviceState *vdev = DEVICE(&dev->vdev);
 
+if (dev->vdev.num_queues == VHOST_USER_BLK_AUTO_NUM_QUEUES) {
+dev->vdev.num_queues = virtio_pci_optimal_num_queues(0);
+}
+
 if (vpci_dev->nvectors == DEV_NVECTORS_UNSPECIFIED) {
 vpci_dev->nvectors = dev->vdev.num_queues + 1;
 }
-- 
MST




Re: [PATCH v4 3/6] util: add Error object for qemu_open_internal error reporting

2020-08-27 Thread Daniel P . Berrangé
On Wed, Aug 26, 2020 at 01:03:19PM +0200, Markus Armbruster wrote:
> Daniel P. Berrangé  writes:
> 
> > On Tue, Aug 25, 2020 at 05:14:21PM +0200, Markus Armbruster wrote:
> >> Daniel P. Berrangé  writes:
> >> 
> >> > Instead of relying on the limited information from errno, we can now
> >> > also provide detailed error messages.
> >> 
> >> The more detailed error messages are currently always ignored, but the
> >> next patches will fix that.
> >> 
> >> > Signed-off-by: Daniel P. Berrangé 
> >> > ---
> >> >  util/osdep.c | 21 +++--
> >> >  1 file changed, 19 insertions(+), 2 deletions(-)
> >> >
> >> > diff --git a/util/osdep.c b/util/osdep.c
> >> > index 9ff92551e7..9c7118d3cb 100644
> >> > --- a/util/osdep.c
> >> > +++ b/util/osdep.c
> >> > @@ -284,7 +284,7 @@ int qemu_lock_fd_test(int fd, int64_t start, int64_t len, bool exclusive)
> >> >   * Opens a file with FD_CLOEXEC set
> >> >   */
> >> >  static int
> >> > -qemu_open_internal(const char *name, int flags, mode_t mode)
> >> > +qemu_open_internal(const char *name, int flags, mode_t mode, Error **errp)
> >> >  {
> >> >  int ret;
> >> >  
> >> > @@ -298,24 +298,31 @@ qemu_open_internal(const char *name, int flags, mode_t mode)
> >> >  
> >> >  fdset_id = qemu_parse_fdset(fdset_id_str);
> >> >  if (fdset_id == -1) {
> >> > +error_setg(errp, "Could not parse fdset %s", name);
> >> >  errno = EINVAL;
> >> >  return -1;
> >> >  }
> >> >  
> >> >  fd = monitor_fdset_get_fd(fdset_id, flags);
> >> >  if (fd < 0) {
> >> > +error_setg_errno(errp, -fd, "Could not acquire FD for %s flags %x",
> >> > + name, flags);
> >> >  errno = -fd;
> >> >  return -1;
> >> >  }
> >> >  
> >> >  dupfd = qemu_dup_flags(fd, flags);
> >> >  if (dupfd == -1) {
> >> > +error_setg_errno(errp, errno, "Could not dup FD for %s flags %x",
> >> > + name, flags);
> >> >  return -1;
> >> >  }
> >> >  
> >> >  ret = monitor_fdset_dup_fd_add(fdset_id, dupfd);
> >> >  if (ret == -1) {
> >> >  close(dupfd);
> >> > +error_setg(errp, "Could not save FD for %s flags %x",
> >> > +   name, flags);
> >> 
> >> Can this happen?
> >
> > Well there's code in monitor_fdset_dup_fd_add that can return -1.
> 
> It fails when
> 
> * @fdset_id contains @dupfd
> 
>   @dupfd is a fresh file descriptor.  If @fdset_id already contains it,
>   it's stale there.  That would be a programming error.  Recommend to
>   assert.
> 
> * @fdset_id is not in @mon_fdsets
> 
>   monitor_fdset_get_fd() fails the same way.  monitor_fdset_dup_fd_add()
> >   can fail that way after monitor_fdset_get_fd() succeeds only if the fd
>   set went away between the two.  Could that happen?  Would it be safe?
> 
>   This is the only user of monitor_fdset_dup_fd_add().  Why not remove
>   the awkward failure mode by making monitor_fdset_dup_fd_add() dup the
>   fd and add?

Once we push the qemu_dup call into monitor_fdset_dup_fd_add, we
might as well go the whole way and merge monitor_fdset_get_fd
into it too. So I've done that, turning 3 calls into 1.
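
i.e. the merged helper ends up shaped roughly like this (a sketch; the
exact name and signature in the respin may differ):

    /*
     * Look up @fdset_id, dup an fd compatible with @flags and track the
     * dup in the fd set; returns the dup'd fd or a negative errno.
     */
    int monitor_fdset_dup_fd_add(int64_t fdset_id, int flags);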

Regards,
Daniel
-- 
|: https://berrange.com       -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org        -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|




Re: [PATCH v2 (BROKEN) 0/6] migration: bring improved savevm/loadvm/delvm to QMP

2020-08-27 Thread Kevin Wolf
Am 27.08.2020 um 13:06 hat Markus Armbruster geschrieben:
> Daniel P. Berrangé  writes:
> 
> > On Wed, Aug 26, 2020 at 07:28:24PM +0100, Daniel P. Berrangé wrote:
> >> On Wed, Aug 26, 2020 at 05:52:06PM +0200, Markus Armbruster wrote:
> >> > Open questions:
> >> > 
> >> > * Do we want the QMP command to delete existing snapshots with
> >> >   conflicting tag / ID, like HMP savevm does?  Or do we want it to fail
> >> >   the transaction?
> >> 
> >> The intent is for the QMP commands to operate exclusively on
> >> 'tags', and never consider "ID".
> >
> > I forgot that even HMP ignores "ID" now and works exclusively in terms
> > of tags since:
> >
> >
> >   commit 6ca080453ea403959ccde661030ca16264acc181
> >   Author: Daniel Henrique Barboza 
> >   Date:   Wed Nov 7 11:09:58 2018 -0200
> >
> > block/snapshot.c: eliminate use of ID input in snapshot operations
> 
> Almost a year after I sent the memo I quoted.  It's an incompatible
> change, but nobody complained, and I'm glad we got this issue out of the
> way.

FWIW, I would have ignored any complaint about incompatible changes in
HMP. It's not supposed to be a stable API, but UI.

Kevin




Re: [PATCH 4/6] block, migration: add bdrv_finalize_vmstate helper

2020-08-27 Thread Daniel P . Berrangé
On Thu, Aug 27, 2020 at 04:02:38PM +0300, Denis V. Lunev wrote:
> On 8/27/20 3:58 PM, Daniel P. Berrangé wrote:
> > On Thu, Jul 09, 2020 at 04:26:42PM +0300, Denis V. Lunev wrote:
> >> Right now bdrv_fclose() is just calling bdrv_flush().
> >>
> >> The problem is that the migration code works inefficiently in block
> >> layer terms: block layer functions are frequently called for very
> >> small pieces of unaligned data. The block layer is capable of working
> >> this way, but this is very slow.
> >>
> >> This patch is a preparation for the introduction of an intermediate
> >> buffer at the block driver state. It would be beneficial to separate
> >> the conventional bdrv_flush() from the closing of the QEMU file in the
> >> migration code.
> >>
> >> The patch also forces the bdrv_finalize_vmstate() operation inside the
> >> synchronous blk_save_vmstate() operation. This helper is used only
> >> from qemu-io.
> >>
> >> Signed-off-by: Denis V. Lunev 
> >> Reviewed-by: Vladimir Sementsov-Ogievskiy 
> >> CC: Kevin Wolf 
> >> CC: Max Reitz 
> >> CC: Stefan Hajnoczi 
> >> CC: Fam Zheng 
> >> CC: Juan Quintela 
> >> CC: "Dr. David Alan Gilbert" 
> >> CC: Denis Plotnikov 
> >> ---
> >>  block/block-backend.c |  6 +-
> >>  block/io.c| 15 +++
> >>  include/block/block.h |  5 +
> >>  migration/savevm.c|  4 
> >>  4 files changed, 29 insertions(+), 1 deletion(-)
> >> diff --git a/migration/savevm.c b/migration/savevm.c
> >> index 45c9dd9d8a..d8a94e312c 100644
> >> --- a/migration/savevm.c
> >> +++ b/migration/savevm.c
> >> @@ -150,6 +150,10 @@ static ssize_t block_get_buffer(void *opaque, uint8_t 
> >> *buf, int64_t pos,
> >>  
> >>  static int bdrv_fclose(void *opaque, Error **errp)
> >>  {
> >> +int err = bdrv_finalize_vmstate(opaque);
> >> +if (err < 0) {
> >> +return err;
> > This is returning an error without having populated 'errp', which means
> > the caller will be missing the error diagnosis
> 
> but this behaves exactly like the branch below:
> bdrv_flush() could return an error too, and errp
> is not filled in the same way.

Doh, it seems the only caller passes NULL for the errp too,
so it is a redundant parameter. So nothing wrong with your
patch after all.
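(For reference, had the errp parameter been kept, the usual qemu pattern
would be to populate it at the point of failure. A sketch only, assuming the
qemu tree's qapi/error.h helpers; this is not the code that was merged:

static int bdrv_fclose(void *opaque, Error **errp)
{
    int err = bdrv_finalize_vmstate(opaque);
    if (err < 0) {
        /* translate the negative errno into a populated Error */
        error_setg_errno(errp, -err, "Failed to finalize vmstate");
        return err;
    }

    err = bdrv_flush(opaque);
    if (err < 0) {
        error_setg_errno(errp, -err, "Failed to flush vmstate");
    }
    return err;
}
)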


Regards,
Daniel
-- 
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|




Re: [PATCH 4/6] block, migration: add bdrv_finalize_vmstate helper

2020-08-27 Thread Denis V. Lunev
On 8/27/20 3:58 PM, Daniel P. Berrangé wrote:
> On Thu, Jul 09, 2020 at 04:26:42PM +0300, Denis V. Lunev wrote:
>> Right now bdrv_fclose() is just calling bdrv_flush().
>>
>> The problem is that the migration code works inefficiently in block
>> layer terms and is frequently called for very small pieces of
>> unaligned data. The block layer is capable of working this way, but
>> this is very slow.
>>
>> This patch is a preparation for the introduction of the intermediate
>> buffer at block driver state. It would be beneficial to separate
>> conventional bdrv_flush() from closing QEMU file from migration code.
>>
>> The patch also forces bdrv_finalize_vmstate() operation inside
>> synchronous blk_save_vmstate() operation. This helper is used from
>> qemu-io only.
>>
>> Signed-off-by: Denis V. Lunev 
>> Reviewed-by: Vladimir Sementsov-Ogievskiy 
>> CC: Kevin Wolf 
>> CC: Max Reitz 
>> CC: Stefan Hajnoczi 
>> CC: Fam Zheng 
>> CC: Juan Quintela 
>> CC: "Dr. David Alan Gilbert" 
>> CC: Denis Plotnikov 
>> ---
>>  block/block-backend.c |  6 +-
>>  block/io.c| 15 +++
>>  include/block/block.h |  5 +
>>  migration/savevm.c|  4 
>>  4 files changed, 29 insertions(+), 1 deletion(-)
>> diff --git a/migration/savevm.c b/migration/savevm.c
>> index 45c9dd9d8a..d8a94e312c 100644
>> --- a/migration/savevm.c
>> +++ b/migration/savevm.c
>> @@ -150,6 +150,10 @@ static ssize_t block_get_buffer(void *opaque, uint8_t 
>> *buf, int64_t pos,
>>  
>>  static int bdrv_fclose(void *opaque, Error **errp)
>>  {
>> +int err = bdrv_finalize_vmstate(opaque);
>> +if (err < 0) {
>> +return err;
> This is returning an error without having populated 'errp', which means
> the caller will be missing the error diagnosis

but this behaves exactly like the branch below:
bdrv_flush() could return an error too, and errp
is not filled in the same way.

Den


>> +}
>>  return bdrv_flush(opaque);
>>  }
> Regards,
> Daniel




Re: [PATCH 4/6] block, migration: add bdrv_finalize_vmstate helper

2020-08-27 Thread Daniel P . Berrangé
On Thu, Jul 09, 2020 at 04:26:42PM +0300, Denis V. Lunev wrote:
> Right now bdrv_fclose() is just calling bdrv_flush().
> 
> The problem is that the migration code works inefficiently in block
> layer terms and is frequently called for very small pieces of
> unaligned data. The block layer is capable of working this way, but
> this is very slow.
> 
> This patch is a preparation for the introduction of the intermediate
> buffer at block driver state. It would be beneficial to separate
> conventional bdrv_flush() from closing QEMU file from migration code.
> 
> The patch also forces bdrv_finalize_vmstate() operation inside
> synchronous blk_save_vmstate() operation. This helper is used from
> qemu-io only.
> 
> Signed-off-by: Denis V. Lunev 
> Reviewed-by: Vladimir Sementsov-Ogievskiy 
> CC: Kevin Wolf 
> CC: Max Reitz 
> CC: Stefan Hajnoczi 
> CC: Fam Zheng 
> CC: Juan Quintela 
> CC: "Dr. David Alan Gilbert" 
> CC: Denis Plotnikov 
> ---
>  block/block-backend.c |  6 +-
>  block/io.c| 15 +++
>  include/block/block.h |  5 +
>  migration/savevm.c|  4 
>  4 files changed, 29 insertions(+), 1 deletion(-)

> diff --git a/migration/savevm.c b/migration/savevm.c
> index 45c9dd9d8a..d8a94e312c 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -150,6 +150,10 @@ static ssize_t block_get_buffer(void *opaque, uint8_t 
> *buf, int64_t pos,
>  
>  static int bdrv_fclose(void *opaque, Error **errp)
>  {
> +int err = bdrv_finalize_vmstate(opaque);
> +if (err < 0) {
> +return err;

This is returning an error without having populated 'errp', which means
the caller will be missing the error diagnosis

> +}
>  return bdrv_flush(opaque);
>  }

Regards,
Daniel
-- 
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|




Re: [PATCH v8 0/6] block: seriously improve savevm/loadvm performance

2020-08-27 Thread Denis V. Lunev
On 8/20/20 10:42 AM, Denis V. Lunev wrote:
> On 7/9/20 4:26 PM, Denis V. Lunev wrote:
>> This series does standard basic things:
>> - it creates intermediate buffer for all writes from QEMU migration code
>>   to QCOW2 image,
>> - this buffer is sent to disk asynchronously, allowing several writes to
>>   run in parallel.
>>
>> In general, the migration code is fantastically inefficient (by
>> observation): buffers are not aligned and are sent in arbitrary pieces,
>> often less than 100 bytes at a chunk, which results in read-modify-write
>> cycles with non-cached operations. It should also be noted that all
>> operations are performed into unallocated image blocks, which also suffer
>> due to partial writes to such new clusters.
>>
>> This patch series is an implementation of idea discussed in the RFC
>> posted by Denis Plotnikov
>> https://lists.gnu.org/archive/html/qemu-devel/2020-04/msg01925.html
>> Results with this series over NVMe are better than the original code:
>>
>>              original   rfc     this
>> cached:      1.79s      2.38s   1.27s
>> non-cached:  3.29s      1.31s   0.81s
>>
>> Changes from v7:
>> - dropped lock from LoadVMState
>> - fixed assert in last patch
>> - dropped patch 1 as queued
>>
>> Changes from v6:
>> - blk_load_vmstate kludges added (patchew problem fixed)
>>
>> Changes from v5:
>> - loadvm optimizations added with Vladimir comments included
>>
>> Changes from v4:
>> - added patch 4 with blk_save_vmstate() cleanup
>> - added R-By
>> - bdrv_flush_vmstate -> bdrv_finalize_vmstate
>> - fixed return code of bdrv_co_do_save_vmstate
>> - fixed typos in comments (Eric, thanks!)
>> - fixed patchew warnings
>>
>> Changes from v3:
>> - rebased to master
>> - added patch 3 which removes aio_task_pool_wait_one()
>> - added R-By to patch 1
>> - patch 4 is rewritten via bdrv_run_co
>> - error path in blk_save_vmstate() is rewritten to call bdrv_flush_vmstate
>>   unconditionally
>> - added some comments
>> - fixes initialization in bdrv_co_vmstate_save_task_entry as suggested
>>
>> Changes from v2:
>> - code moved from QCOW2 level to generic block level
>> - created bdrv_flush_vmstate helper to fix 022, 029 tests
>> - added recursive for bs->file in bdrv_co_flush_vmstate (fix 267)
>> - fixed blk_save_vmstate helper
>> - fixed coroutine wait as Vladimir suggested with waiting fixes from me
>>
>> Changes from v1:
>> - patchew warning fixed
>> - fixed validation that only 1 waiter is allowed in patch 1
>>
>> Signed-off-by: Denis V. Lunev 
>> CC: Kevin Wolf 
>> CC: Max Reitz 
>> CC: Stefan Hajnoczi 
>> CC: Fam Zheng 
>> CC: Juan Quintela 
>> CC: "Dr. David Alan Gilbert" 
>> CC: Vladimir Sementsov-Ogievskiy 
>> CC: Denis Plotnikov 
>>
>>
> ping
ping V2
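As background to the numbers above: the heart of the series is write
coalescing, i.e. accumulating the many tiny, unaligned vmstate writes in a
large aligned buffer and submitting it in big chunks. A standalone sketch of
just that idea; the actual series keeps the buffer in the block driver state
and submits filled chunks asynchronously, so several writes are in flight
at once:

#include <stdio.h>
#include <string.h>

#define BUF_SIZE (1024 * 1024)   /* 1 MiB staging buffer */

typedef struct {
    unsigned char buf[BUF_SIZE];
    size_t used;
    FILE *out;
} VmstateWriter;

/* Coalesce small writes; hit the backend only with full buffers.
 * A final flush of the partially filled buffer on close is omitted
 * for brevity. */
static int vmstate_write(VmstateWriter *w, const void *data, size_t len)
{
    while (len > 0) {
        size_t n = BUF_SIZE - w->used;
        if (n > len) {
            n = len;
        }
        memcpy(w->buf + w->used, data, n);
        w->used += n;
        data = (const char *)data + n;
        len -= n;
        if (w->used == BUF_SIZE) {   /* one big aligned write */
            if (fwrite(w->buf, 1, BUF_SIZE, w->out) != BUF_SIZE) {
                return -1;
            }
            w->used = 0;
        }
    }
    return 0;
}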



Re: [PATCH v7 1/8] Introduce yank feature

2020-08-27 Thread Markus Armbruster
I apologize for not reviewing this much earlier.

Lukas Straub  writes:

> The yank feature allows recovering from a hanging qemu by "yanking"
> at various parts. Other qemu systems can register themselves and
> multiple yank functions. Then all yank functions for selected
> instances can be called by the 'yank' out-of-band qmp command.
> Available instances can be queried by a 'query-yank' oob command.
>
> Signed-off-by: Lukas Straub 
> Acked-by: Stefan Hajnoczi 
> ---
>  include/qemu/yank.h |  80 +++
>  qapi/misc.json  |  45 +++
>  util/Makefile.objs  |   1 +
>  util/yank.c | 184 
>  4 files changed, 310 insertions(+)
>  create mode 100644 include/qemu/yank.h
>  create mode 100644 util/yank.c
>
> diff --git a/include/qemu/yank.h b/include/qemu/yank.h
> new file mode 100644
> index 00..cd184fcd05
> --- /dev/null
> +++ b/include/qemu/yank.h
> @@ -0,0 +1,80 @@
> +/*
> + * QEMU yank feature
> + *
> + * Copyright (c) Lukas Straub 
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + */
> +
> +#ifndef YANK_H
> +#define YANK_H
> +
> +typedef void (YankFn) (void *opaque);

No space before parameter lists, please.

> +
> +/**
> + * yank_register_instance: Register a new instance.
> + *
> + * This registers a new instance for yanking. Must be called before any yank
> + * function is registered for this instance.
> + *
> + * This function is thread-safe.
> + *
> + * @instance_name: The globally unique name of the instance.
> + * @errp: ...
> + */
> +void yank_register_instance(const char *instance_name, Error **errp);
> +
> +/**
> + * yank_unregister_instance: Unregister an instance.
> + *
> + * This unregisters an instance. Must be called only after every yank function
> + * of the instance has been unregistered.
> + *
> + * This function is thread-safe.
> + *
> + * @instance_name: The name of the instance.
> + */
> +void yank_unregister_instance(const char *instance_name);
> +
> +/**
> + * yank_register_function: Register a yank function
> + *
> + * This registers a yank function. All limitations of qmp oob commands apply
> + * to the yank function as well.

The restrictions are documented in docs/devel/qapi-code-gen.txt under
"An OOB-capable command handler must satisfy the following conditions".
Let's point there.

> + *
> + * This function is thread-safe.
> + *
> + * @instance_name: The name of the instance
> + * @func: The yank function
> + * @opaque: Will be passed to the yank function
> + */
> +void yank_register_function(const char *instance_name,
> +YankFn *func,
> +void *opaque);

Pardon my ignorance... can you explain to me why a yank instance may
have multiple functions?
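(For illustration, one plausible use of the API as declared above would be a
single instance owning several yank functions, one per recoverable resource.
The instance name and the ioc_in/ioc_out channel pointers here are
hypothetical, not taken from the patch:

    Error *err = NULL;

    yank_register_instance("chardev:serial0", &err);
    /* one yank function per channel the instance owns */
    yank_register_function("chardev:serial0", yank_generic_iochannel, ioc_in);
    yank_register_function("chardev:serial0", yank_generic_iochannel, ioc_out);
)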

> +
> +/**
> + * yank_unregister_function: Unregister a yank function
> + *
> + * This unregisters a yank function.
> + *
> + * This function is thread-safe.
> + *
> + * @instance_name: The name of the instance
> + * @func: func that was passed to yank_register_function
> + * @opaque: opaque that was passed to yank_register_function
> + */
> +void yank_unregister_function(const char *instance_name,
> +  YankFn *func,
> +  void *opaque);
> +
> +/**
> + * yank_unregister_function: Generic yank function for iochannel

Pasto, should be

* yank_generic_iochannel: ...

> + *
> + * This is a generic yank function which will call qio_channel_shutdown on
> + * the provided QIOChannel.
> + *
> + * @opaque: QIOChannel to shutdown
> + */
> +void yank_generic_iochannel(void *opaque);
> +#endif
> diff --git a/qapi/misc.json b/qapi/misc.json
> index 9d32820dc1..0d6a8f20b7 100644
> --- a/qapi/misc.json
> +++ b/qapi/misc.json
> @@ -1615,3 +1615,48 @@
>  ##
>  { 'command': 'query-vm-generation-id', 'returns': 'GuidInfo' }
>
> +##
> +# @YankInstances:
> +#
> +# @instances: List of yank instances.
> +#
> +# Yank instances are named after the following schema:
> +# "blockdev:", "chardev:" and "migration"
> +#
> +# Since: 5.1
> +##
> +{ 'struct': 'YankInstances', 'data': {'instances': ['str'] } }

I'm afraid this is a problematic QMP interface.

By making YankInstances a struct, you keep the door open to adding more
members, which is good.

But by making its 'instances' member a ['str'], you close the door to
using anything but a single string for the individual instances.  Not so
good.

The single string encodes information which QMP client will need to
parse from the string.  We frown on that in QMP.  Use QAPI complex types
capabilities for structured data.

Could you use something like this instead?

{ 'enum': 'YankInstanceType',
  'data': { 'block-node', 'chardev', 'migration' } }

{ 'struct': 'YankInstanceBlockNode',
  'data': { 'node-name': 'str' } }

{ 'struct': 'YankInstanceChardev',
  'data': { 'label': 'str' } }

{ 'union': 'YankInstance',
  'base': { 'type'

Re: [PATCH v1 0/7] vhost-user-blk: fix the migration issue and enhance qtests

2020-08-27 Thread Michael S. Tsirkin
On Tue, Aug 04, 2020 at 01:36:45PM +0300, Dima Stepanov wrote:
> Reference e-mail threads:
>   - https://lists.gnu.org/archive/html/qemu-devel/2020-05/msg01509.html
>   - https://lists.gnu.org/archive/html/qemu-devel/2020-05/msg05241.html
> 
> If a vhost-user daemon is used as a backend for the vhost device, then we
> should consider the possibility of a disconnect at any moment. There was a
> general question here: should we consider it an error or an okay state for
> the vhost-user devices during the migration process?
> I think the disconnect event for the vhost-user devices should not break the
> migration process, because:
>   - the device will be in the stopped state, so it will not be changed
> during migration
>   - if a reconnect happens, the migration log will be reinitialized as
> part of the reconnect/init process:
> #0  vhost_log_global_start (listener=0x563989cf7be0)
> at hw/virtio/vhost.c:920
> #1  0x56398603d8bc in listener_add_address_space 
> (listener=0x563989cf7be0,
> as=0x563986ea4340 )
> at softmmu/memory.c:2664
> #2  0x56398603dd30 in memory_listener_register 
> (listener=0x563989cf7be0,
> as=0x563986ea4340 )
> at softmmu/memory.c:2740
> #3  0x563985fd6956 in vhost_dev_init (hdev=0x563989cf7bd8,
> opaque=0x563989cf7e30, backend_type=VHOST_BACKEND_TYPE_USER,
> busyloop_timeout=0)
> at hw/virtio/vhost.c:1385
> #4  0x563985f7d0b8 in vhost_user_blk_connect (dev=0x563989cf7990)
> at hw/block/vhost-user-blk.c:315
> #5  0x563985f7d3f6 in vhost_user_blk_event (opaque=0x563989cf7990,
> event=CHR_EVENT_OPENED)
> at hw/block/vhost-user-blk.c:379
> The first patch in the patchset fixes this issue by setting the vhost device
> to the stopped state in the disconnect handler and checking it in the
> vhost_migration_log() routine before returning from the function.
> The qtest framework was updated to test vhost-user-blk functionality. The
> vhost-user-blk/vhost-user-blk-tests/migrate_reconnect test was added to
> reproduce the original issue found.
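For illustration, the shape of the described check in vhost_migration_log()
might look like the sketch below; the field and helper names are assumptions
for the sketch, not the actual patch:

static int vhost_migration_log(MemoryListener *listener, bool enable)
{
    struct vhost_dev *dev = container_of(listener, struct vhost_dev,
                                         memory_listener);

    if (!dev->started) {
        /* Backend disconnected: the device was put into the stopped
         * state by the disconnect handler, so there is no dirty log
         * to (re)program right now; do not fail the migration. */
        return 0;
    }

    /* ... existing dirty-log enable/disable handling ... */
    return 0;
}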


Raphael any input on this?

> Dima Stepanov (7):
>   vhost: recheck dev state in the vhost_migration_log routine
>   vhost: check queue state in the vhost_dev_set_log routine
>   tests/qtest/vhost-user-test: prepare the tests for adding new dev
> class
>   tests/qtest/libqos/virtio-blk: add support for vhost-user-blk
>   tests/qtest/vhost-user-test: add support for the vhost-user-blk device
>   tests/qtest/vhost-user-test: add migrate_reconnect test
>   tests/qtest/vhost-user-test: enable the reconnect tests
> 
>  hw/block/vhost-user-blk.c  |  13 +-
>  hw/virtio/vhost.c  |  39 -
>  include/hw/virtio/vhost-user-blk.h |   1 +
>  tests/qtest/libqos/virtio-blk.c|  14 ++
>  tests/qtest/vhost-user-test.c  | 291 +++--
>  5 files changed, 311 insertions(+), 47 deletions(-)
> 
> -- 
> 2.7.4




[PATCH] sd: sdhci: check data_count is within fifo_buffer

2020-08-27 Thread P J P
From: Prasad J Pandit 

While doing multi-block SDMA, the transfer block size may exceed
the 's->fifo_buffer[s->buf_maxsz]' size. It may leave the
current element pointer 's->data_count' pointing out of bounds,
leading the subsequent DMA r/w operation to an OOB access.
Add a check to avoid it.

 -> https://ruhr-uni-bochum.sciebo.de/s/NNWP2GfwzYKeKwE?path=%2Fsdhci_oob_write1
 ==1459837==ERROR: AddressSanitizer: heap-buffer-overflow
 WRITE of size 54722048 at 0x6151e280 thread T3
#0  __interceptor_memcpy (/lib64/libasan.so.6+0x3a71d)
#1  flatview_read_continue ../exec.c:3245
#2  flatview_read ../exec.c:3278
#3  address_space_read_full ../exec.c:3291
#4  address_space_rw ../exec.c:3319
#5  dma_memory_rw_relaxed ../include/sysemu/dma.h:87
#6  dma_memory_rw ../include/sysemu/dma.h:110
#7  dma_memory_read ../include/sysemu/dma.h:116
#8  sdhci_sdma_transfer_multi_blocks ../hw/sd/sdhci.c:629
#9  sdhci_write ../hw/sd/sdhci.c:1097
#10 memory_region_write_accessor ../softmmu/memory.c:483
...

Reported-by: Ruhr-University 
Signed-off-by: Prasad J Pandit 
---
 hw/sd/sdhci.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/hw/sd/sdhci.c b/hw/sd/sdhci.c
index 1785d7e1f7..155e25ceee 100644
--- a/hw/sd/sdhci.c
+++ b/hw/sd/sdhci.c
@@ -604,6 +604,9 @@ static void sdhci_sdma_transfer_multi_blocks(SDHCIState *s)
 s->blkcnt--;
 }
 }
+if (s->data_count <= begin || s->data_count > s->buf_maxsz) {
+break;
+}
 dma_memory_write(s->dma_as, s->sdmasysad,
  &s->fifo_buffer[begin], s->data_count - begin);
 s->sdmasysad += s->data_count - begin;
@@ -626,6 +629,9 @@ static void sdhci_sdma_transfer_multi_blocks(SDHCIState *s)
 s->data_count = block_size;
 boundary_count -= block_size - begin;
 }
+if (s->data_count <= begin || s->data_count > s->buf_maxsz) {
+break;
+}
 dma_memory_read(s->dma_as, s->sdmasysad,
 &s->fifo_buffer[begin], s->data_count - begin);
 s->sdmasysad += s->data_count - begin;
-- 
2.26.2




[PATCH] hw/ide: check null block pointer before blk_drain

2020-08-27 Thread P J P
From: Prasad J Pandit 

While cancelling an I/O operation via ide_cancel_dma_sync(),
check for a null block pointer before calling blk_drain(), to
avoid a null pointer dereference.

 -> https://ruhr-uni-bochum.sciebo.de/s/NNWP2GfwzYKeKwE?path=%2Fide_nullptr1
==1803100==Hint: address points to the zero page.
#0 blk_bs ../block/block-backend.c:714
#1 blk_drain ../block/block-backend.c:1715
#2 ide_cancel_dma_sync ../hw/ide/core.c:723
#3 bmdma_cmd_writeb ../hw/ide/pci.c:298
#4 bmdma_write ../hw/ide/piix.c:75
#5 memory_region_write_accessor ../softmmu/memory.c:483
#6 access_with_adjusted_size ../softmmu/memory.c:544
#7 memory_region_dispatch_write ../softmmu/memory.c:1465
#8 flatview_write_continue ../exec.c:3176
...

Reported-by: Ruhr-University 
Signed-off-by: Prasad J Pandit 
---
 hw/ide/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/ide/core.c b/hw/ide/core.c
index d997a78e47..038af1cd6b 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -718,7 +718,7 @@ void ide_cancel_dma_sync(IDEState *s)
  * whole DMA operation will be submitted to disk with a single
  * aio operation with preadv/pwritev.
  */
-if (s->bus->dma->aiocb) {
+if (s->blk && s->bus->dma->aiocb) {
 trace_ide_cancel_dma_sync_remaining();
 blk_drain(s->blk);
 assert(s->bus->dma->aiocb == NULL);
-- 
2.26.2




[PATCH] fdc: check null block pointer before blk_pwrite

2020-08-27 Thread P J P
From: Prasad J Pandit 

While transferring data via fdctrl_write_data(), check that the
current drive does not have a null block pointer, to avoid a
null pointer dereference.

 -> https://ruhr-uni-bochum.sciebo.de/s/NNWP2GfwzYKeKwE?path=%2Ffdc_nullptr1
==1658854==Hint: address points to the zero page.
#0 blk_inc_in_flight block/block-backend.c:1327
#1 blk_prw block/block-backend.c:1299
#2 blk_pwrite block/block-backend.c:1464
#3 fdctrl_write_data hw/block/fdc.c:2418
#4 fdctrl_write hw/block/fdc.c:962
#5 portio_write ioport.c:205
#6 memory_region_write_accessor memory.c:483
#7 access_with_adjusted_size memory.c:544
#8 memory_region_dispatch_write memory.c:1476

Reported-by: Ruhr-University 
Signed-off-by: Prasad J Pandit 
---
 hw/block/fdc.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/block/fdc.c b/hw/block/fdc.c
index e9ed3eef45..dedadac68a 100644
--- a/hw/block/fdc.c
+++ b/hw/block/fdc.c
@@ -2419,7 +2419,8 @@ static void fdctrl_write_data(FDCtrl *fdctrl, uint32_t 
value)
 if (pos == FD_SECTOR_LEN - 1 ||
 fdctrl->data_pos == fdctrl->data_len) {
 cur_drv = get_cur_drv(fdctrl);
-if (blk_pwrite(cur_drv->blk, fd_offset(cur_drv), fdctrl->fifo,
+if (cur_drv->blk
+&& blk_pwrite(cur_drv->blk, fd_offset(cur_drv), fdctrl->fifo,
BDRV_SECTOR_SIZE, 0) < 0) {
 FLOPPY_DPRINTF("error writing sector %d\n",
fd_sector(cur_drv));
-- 
2.26.2




Re: [PATCH v2 (BROKEN) 0/6] migration: bring improved savevm/loadvm/delvm to QMP

2020-08-27 Thread Daniel P . Berrangé
On Thu, Aug 27, 2020 at 01:04:43PM +0200, Markus Armbruster wrote:
> Daniel P. Berrangé  writes:
> 
> > On Wed, Aug 26, 2020 at 05:52:06PM +0200, Markus Armbruster wrote:
> > From the POV of practicality, making a design that unifies internal
> > and external snapshots is something I'm considering out of scope.
> > It increases the design time burden, as well as implementation burden.
> > On my side, improving internal snapshots is a "spare time" project,
> > not something I can justify spending weeks or months on.
> 
> I'm not demanding a solution that unifies internal and external
> snapshots.  I'm asking for a bit of serious thought on an interface that
> could compatibly evolve there.  Hours, not weeks or months.
> 
> > My goal is to implement something that is achievable in a short
> > amount of time that gets us out of the hole we've been in for 10
> > years. Minimal refactoring of the internal snapshot code aside
> > from fixing the critical limitations we have today around choice
> > of disks to snapshot.
> >
> > If someone later wants to come up with a grand unified design
> > for everything that's fine, we can deprecate the new QMP commands
> > I'm proposing now.
> 
> Failing at coming up with an interface that has a reasonable chance to
> be future-proof is okay.
> 
> Not even trying is not okay.

This was raised in my v1 posting:

  https://lists.gnu.org/archive/html/qemu-devel/2020-07/msg01346.html

but the conclusion was that it was a non-trivial amount of extra
implementation work.


> Specifically, I'd like you to think about monolothic snapshot command
> (internal snapshots only by design) vs. transaction of individual
> snapshot commands (design is not restricted to internal snapshots, but
> we may want to accept implementation restrictions).
> 
> We already have transactionable individual storage snapshot commands.
> What's missing is a transactionable machine state snapshot command.

At a high level I consider what I've proposed as being higher level
syntax sugar vs a more generic future impl based on multiple commands
for snapshotting disk & state. I don't think I'd claim that it will
evolve to become the design you're suggesting here, as they are designs
from different levels in the stack. IOW, I think they would ultimately
just exist in parallel. I don't think that's a real problem from a
maint POV, as the large burden from the monolithic snapshot code is
not the HMP/QMP interface, but rather the guts of the internal
impl in the migration/savevm.c and block/snapshot.c files. That code
will exist for as long as the HMP commands exist, and adding a QMP
interface doesn't make that situation worse unless we were intending
to drop the existing HMP commands. If we did change our minds though,
we can just deprecate the QMP command at any time we like.


> One restriction I'd readily accept at this time is "the machine state
> snapshot must write to a QCOW2 that is also internally snapshot in the
> same transaction".
> 
> Now explain to me why this is impractical.

The issues were described by Kevin here:

  https://lists.gnu.org/archive/html/qemu-devel/2020-07/msg02057.html

Assuming the migration impl is actually possible to solve, there is
still the question of actually writing it. That's a non-trivial
amount of work someone has to find time for.

Regards,
Daniel
-- 
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|




[PATCH v3 6/7] iotests: add support for capturing and matching QMP events

2020-08-27 Thread Daniel P . Berrangé
When using the _launch_qemu and _send_qemu_cmd functions from
common.qemu, any QMP events get mixed in with the output from
the commands and responses.

This makes it difficult to write a test case as the ordering
of events in the output is not stable.

This introduces a variable 'capture_events' which can be set
to a list of event names. Any events listed in this variable
will not be printed, instead collected in the $QEMU_EVENTS
environment variable.

A new '_wait_event' function can be invoked to collect events
at a fixed point in time. The function will first pull events
cached in $QEMU_EVENTS variable, and if none are found, will
then read more from QMP.

Signed-off-by: Daniel P. Berrangé 
---
 tests/qemu-iotests/common.qemu | 107 -
 1 file changed, 106 insertions(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/common.qemu b/tests/qemu-iotests/common.qemu
index de680cf1c7..87d7a54001 100644
--- a/tests/qemu-iotests/common.qemu
+++ b/tests/qemu-iotests/common.qemu
@@ -53,6 +53,15 @@ _in_fd=4
 # If $mismatch_only is set, only non-matching responses will
 # be echoed.
 #
+# If $capture_events is non-empty, then any QMP event names it lists
+# will not be echoed out, but instead collected in the $QEMU_EVENTS
+# variable. The _wait_event function can later be used to receive
+# the cached events.
+#
+# If $only_capture_events is set to anything but an empty string,
+# then an error will be raised if a QMP message is seen which is
+# not an event listed in $capture_events.
+#
 # If $success_or_failure is set, the meaning of the arguments is
 # changed as follows:
 # $2: A string to search for in the response; if found, this indicates
@@ -78,6 +87,32 @@ _timed_wait_for()
 QEMU_STATUS[$h]=0
 while IFS= read -t ${QEMU_COMM_TIMEOUT} resp <&${QEMU_OUT[$h]}
 do
+if [ -n "$capture_events" ]; then
+capture=0
+local evname
+for evname in $capture_events
+do
+grep -q "\"event\": \"${evname}\"" < <(echo "${resp}")
+if [ $? -eq 0 ]; then
+capture=1
+fi
+done
+if [ $capture = 1 ];
+then
+ev=$(echo "${resp}" | tr -d '\r' | tr % .)
+QEMU_EVENTS="${QEMU_EVENTS:+${QEMU_EVENTS}%}${ev}"
+if [ -n "$only_capture_events" ]; then
+return
+else
+continue
+fi
+fi
+fi
+if [ -n "$only_capture_events" ]; then
+echo "Only expected $capture_events but got ${resp}"
+exit 1
+fi
+
 if [ -z "${silent}" ] && [ -z "${mismatch_only}" ]; then
 echo "${resp}" | _filter_testdir | _filter_qemu \
| _filter_qemu_io | _filter_qmp | _filter_hmp
@@ -177,12 +212,82 @@ _send_qemu_cmd()
 let count--;
 done
 if [ ${QEMU_STATUS[$h]} -ne 0 ] && [ -z "${qemu_error_no_exit}" ]; then
-echo "Timeout waiting for ${1} on handle ${h}"
+echo "Timeout waiting for command ${1} response on handle ${h}"
 exit 1 #Timeout means the test failed
 fi
 }
 
 
+# Check event cache for a named QMP event
+#
+# Input parameters:
+# $1:   Name of the QMP event to check for
+#
+# Checks if the named QMP event was previously captured
+# into $QEMU_EVENTS. When matched, the QMP event will be echoed
+# and the $matched variable set to 1.
+#
+# _wait_event is more suitable for test usage in most cases
+_check_cached_events()
+{
+local evname=${1}
+
+local match="\"event\": \"$evname\""
+
+matched=0
+if [ -n "$QEMU_EVENTS" ]; then
+CURRENT_QEMU_EVENTS=$QEMU_EVENTS
+QEMU_EVENTS=
+old_IFS=$IFS
+IFS="%"
+for ev in $CURRENT_QEMU_EVENTS
+do
+grep -q "$match" < <(echo "${ev}")
+if [ $? -eq 0 -a $matched = 0 ]; then
+echo "${ev}" | _filter_testdir | _filter_qemu \
+   | _filter_qemu_io | _filter_qmp | _filter_hmp
+matched=1
+else
+QEMU_EVENTS="${QEMU_EVENTS:+${QEMU_EVENTS}%}${ev}"
+fi
+done
+IFS=$old_IFS
+fi
+}
+
+# Wait for a named QMP event
+#
+# Input parameters:
+# $1:   QEMU handle to use
+# $2:   Name of the QMP event to wait for
+#
+# Checks if the named QMP event was previously captured
+# into $QEMU_EVENTS. If it is not present, then waits for the
+# event to arrive on the QMP channel. When matched, the QMP
+# event will be echoed
+_wait_event()
+{
+local h=${1}
+local evname=${2}
+
+while true
+do
+_check_cached_events $evname
+
+if [ $matched = 1 ];
+then
+return
+fi
+
+only_capture_events=1 qemu_error_no_exit=1 _timed_wait_for ${h}
+
+if [ ${QEMU_STATUS[$h]} -ne 0 ] ; then
+echo "Timeout waiting for e

[PATCH v3 4/7] block: add ability to specify list of blockdevs during snapshot

2020-08-27 Thread Daniel P . Berrangé
When running snapshot operations, there are various rules for which
blockdevs are included/excluded. While this provides reasonable default
behaviour, there are scenarios that are not well handled by the default
logic. Some of the conditions do not have a single correct answer.

Thus there needs to be a way for the mgmt app to provide an explicit
list of blockdevs to perform snapshots across. This can be achieved
by passing a list of node names that should be used.

Signed-off-by: Daniel P. Berrangé 
---
 block/monitor/block-hmp-cmds.c |   4 +-
 block/snapshot.c   | 167 +++--
 include/block/snapshot.h   |  13 +--
 migration/savevm.c |  16 ++--
 monitor/hmp-cmds.c |   2 +-
 5 files changed, 138 insertions(+), 64 deletions(-)

diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c
index 9df11494d6..db76c43cc2 100644
--- a/block/monitor/block-hmp-cmds.c
+++ b/block/monitor/block-hmp-cmds.c
@@ -900,7 +900,7 @@ void hmp_info_snapshots(Monitor *mon, const QDict *qdict)
 SnapshotEntry *snapshot_entry;
 Error *err = NULL;
 
-bs = bdrv_all_find_vmstate_bs(&err);
+bs = bdrv_all_find_vmstate_bs(NULL, &err);
 if (!bs) {
 error_report_err(err);
 return;
@@ -952,7 +952,7 @@ void hmp_info_snapshots(Monitor *mon, const QDict *qdict)
 total = 0;
 for (i = 0; i < nb_sns; i++) {
 SnapshotEntry *next_sn;
-if (bdrv_all_find_snapshot(sn_tab[i].name, NULL) == 0) {
+if (bdrv_all_find_snapshot(sn_tab[i].name, NULL, NULL) == 0) {
 global_snapshots[total] = i;
 total++;
 QTAILQ_FOREACH(image_entry, &image_list, next) {
diff --git a/block/snapshot.c b/block/snapshot.c
index 6839060622..5691cdc6cb 100644
--- a/block/snapshot.c
+++ b/block/snapshot.c
@@ -385,6 +385,36 @@ int bdrv_snapshot_load_tmp_by_id_or_name(BlockDriverState 
*bs,
 return ret;
 }
 
+
+static int bdrv_all_get_snapshot_devices(strList *devices,
+ GList **all_bdrvs,
+ Error **errp)
+{
+g_autoptr(GList) bdrvs = NULL;
+
+if (devices) {
+while (devices) {
+BlockDriverState *bs = bdrv_find_node(devices->value);
+if (!bs) {
+error_setg(errp, "No block device node '%s'", devices->value);
+return -1;
+}
+bdrvs = g_list_append(bdrvs, bs);
+devices = devices->next;
+}
+} else {
+BlockDriverState *bs;
+BdrvNextIterator it;
+for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
+bdrvs = g_list_append(bdrvs, bs);
+}
+}
+
+*all_bdrvs = g_steal_pointer(&bdrvs);
+return 0;
+}
+
+
 static bool bdrv_all_snapshots_includes_bs(BlockDriverState *bs)
 {
 if (!bdrv_is_inserted(bs) || bdrv_is_read_only(bs)) {
@@ -400,43 +430,56 @@ static bool 
bdrv_all_snapshots_includes_bs(BlockDriverState *bs)
  * These functions will properly handle dataplane (take aio_context_acquire
  * when appropriate for appropriate block drivers) */
 
-bool bdrv_all_can_snapshot(Error **errp)
+bool bdrv_all_can_snapshot(strList *devices, Error **errp)
 {
-BlockDriverState *bs;
-BdrvNextIterator it;
+g_autoptr(GList) bdrvs = NULL;
+GList *iterbdrvs;
 
-for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
+if (bdrv_all_get_snapshot_devices(devices, &bdrvs, errp) < 0) {
+return false;
+}
+
+iterbdrvs = bdrvs;
+while (iterbdrvs) {
+BlockDriverState *bs = iterbdrvs->data;
 AioContext *ctx = bdrv_get_aio_context(bs);
 bool ok;
 
 aio_context_acquire(ctx);
-if (bdrv_all_snapshots_includes_bs(bs)) {
+if (devices || bdrv_all_snapshots_includes_bs(bs)) {
 ok = bdrv_can_snapshot(bs);
 }
 aio_context_release(ctx);
 if (!ok) {
 error_setg(errp, "Device '%s' is writable but does not support "
"snapshots", bdrv_get_device_or_node_name(bs));
-bdrv_next_cleanup(&it);
 return false;
 }
+
+iterbdrvs = iterbdrvs->next;
 }
 
 return true;
 }
 
-int bdrv_all_delete_snapshot(const char *name, Error **errp)
+int bdrv_all_delete_snapshot(const char *name, strList *devices, Error **errp)
 {
-BlockDriverState *bs;
-BdrvNextIterator it;
-QEMUSnapshotInfo sn1, *snapshot = &sn1;
+g_autoptr(GList) bdrvs = NULL;
+GList *iterbdrvs;
 
-for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
+if (bdrv_all_get_snapshot_devices(devices, &bdrvs, errp) < 0) {
+return -1;
+}
+
+iterbdrvs = bdrvs;
+while (iterbdrvs) {
+BlockDriverState *bs = iterbdrvs->data;
 AioContext *ctx = bdrv_get_aio_context(bs);
-int ret;
+QEMUSnapshotInfo sn1, *snapshot = &sn1;
+int ret = 0;
 
 aio_context_acquire(ctx);
-if

[PATCH v3 7/7] migration: introduce snapshot-{save, load, delete} QMP commands

2020-08-27 Thread Daniel P . Berrangé
savevm, loadvm and delvm are some of the few HMP commands that have never
been converted to use QMP. The primary reason for this lack of conversion
is that they block execution of the thread for as long as they run.

Despite this downside, however, libvirt and applications using libvirt
have used these commands for as long as QMP has existed, via the
"human-monitor-command" passthrough command. IOW, while it is clearly
desirable to be able to fix the blocking problem, this is not an
immediate obstacle to real world usage.

Meanwhile there is a need for other features which involve adding new
parameters to the commands. This is possible with HMP passthrough, but
it provides no reliable way for apps to introspect features, so using
QAPI modelling is highly desirable.

This patch thus introduces new snapshot-{load,save,delete} commands to
QMP that are intended to replace the old HMP counterparts. The new
commands are given different names, because they will be using the new
QEMU job framework and thus will have diverging behaviour from the HMP
originals. It would thus be misleading to keep the same name.

While this design uses the generic job framework, the current impl is
still blocking. The intention is that the blocking problem will be fixed
later. Nonetheless, applications using these new commands should assume that
they are asynchronous and thus wait for the job status change event to
indicate completion.
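For illustration, a hypothetical in-tree call site of the extended C-level
API, building the devices list by hand; the snapshot tag and node names are
invented for the sketch:

/* Hypothetical caller: store the vmstate in "disk0" and snapshot
 * only that node. */
static void do_snapshot_save_example(void)
{
    Error *err = NULL;
    strList *devices = g_new0(strList, 1);

    devices->value = g_strdup("disk0");

    if (save_snapshot("snap1", "disk0", devices, &err) < 0) {
        error_report_err(err);
    }
    qapi_free_strList(devices);
}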

Signed-off-by: Daniel P. Berrangé 
---
 include/migration/snapshot.h |  10 +-
 migration/savevm.c   | 220 -
 monitor/hmp-cmds.c   |   4 +-
 qapi/job.json|   9 +-
 qapi/migration.json  | 135 +
 replay/replay-snapshot.c |   4 +-
 softmmu/vl.c |   2 +-
 tests/qemu-iotests/310   | 255 
 tests/qemu-iotests/310.out   | 369 +++
 tests/qemu-iotests/group |   1 +
 10 files changed, 991 insertions(+), 18 deletions(-)
 create mode 100755 tests/qemu-iotests/310
 create mode 100644 tests/qemu-iotests/310.out

diff --git a/include/migration/snapshot.h b/include/migration/snapshot.h
index c85b6ec75b..f2ed9d1e43 100644
--- a/include/migration/snapshot.h
+++ b/include/migration/snapshot.h
@@ -15,7 +15,13 @@
 #ifndef QEMU_MIGRATION_SNAPSHOT_H
 #define QEMU_MIGRATION_SNAPSHOT_H
 
-int save_snapshot(const char *name, Error **errp);
-int load_snapshot(const char *name, Error **errp);
+#include "qapi/qapi-builtin-types.h"
+
+int save_snapshot(const char *name,
+  const char *vmstate, strList *devices,
+  Error **errp);
+int load_snapshot(const char *name,
+  const char *vmstate, strList *devices,
+  Error **errp);
 
 #endif
diff --git a/migration/savevm.c b/migration/savevm.c
index 4a52704132..c5d8131d82 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -43,6 +43,8 @@
 #include "qapi/error.h"
 #include "qapi/qapi-commands-migration.h"
 #include "qapi/qapi-commands-misc.h"
+#include "qapi/clone-visitor.h"
+#include "qapi/qapi-builtin-visit.h"
 #include "qapi/qmp/qerror.h"
 #include "qemu/error-report.h"
 #include "sysemu/cpus.h"
@@ -2658,7 +2660,8 @@ int qemu_load_device_state(QEMUFile *f)
 return 0;
 }
 
-int save_snapshot(const char *name, Error **errp)
+int save_snapshot(const char *name, const char *vmstate,
+  strList *devices, Error **errp)
 {
 BlockDriverState *bs;
 QEMUSnapshotInfo sn1, *sn = &sn1, old_sn1, *old_sn = &old_sn1;
@@ -2680,18 +2683,18 @@ int save_snapshot(const char *name, Error **errp)
 return ret;
 }
 
-if (!bdrv_all_can_snapshot(NULL, errp)) {
+if (!bdrv_all_can_snapshot(devices, errp)) {
 return ret;
 }
 
 /* Delete old snapshots of the same name */
 if (name) {
-if (bdrv_all_delete_snapshot(name, NULL, errp) < 0) {
+if (bdrv_all_delete_snapshot(name, devices, errp) < 0) {
 return ret;
 }
 }
 
-bs = bdrv_all_find_vmstate_bs(NULL, NULL, errp);
+bs = bdrv_all_find_vmstate_bs(vmstate, devices, errp);
 if (bs == NULL) {
 return ret;
 }
@@ -2757,7 +2760,7 @@ int save_snapshot(const char *name, Error **errp)
 aio_context_release(aio_context);
 aio_context = NULL;
 
-ret = bdrv_all_create_snapshot(sn, bs, vm_state_size, NULL, errp);
+ret = bdrv_all_create_snapshot(sn, bs, vm_state_size, devices, errp);
 if (ret < 0) {
 goto the_end;
 }
@@ -2858,7 +2861,8 @@ void qmp_xen_load_devices_state(const char *filename, 
Error **errp)
 migration_incoming_state_destroy();
 }
 
-int load_snapshot(const char *name, Error **errp)
+int load_snapshot(const char *name, const char *vmstate,
+  strList *devices, Error **errp)
 {
 BlockDriverState *bs_vm_state;
 QEMUSnapshotInfo sn;
@@ -2873,15 +2877,15 @@ int load_snapshot(const char *name, Error **errp)
 return -1;
 }
 
-if (!bdrv_all_can_snapshot(NUL

[PATCH v3 5/7] block: allow specifying name of block device for vmstate storage

2020-08-27 Thread Daniel P . Berrangé
Currently the vmstate will be stored in the first block device that
supports snapshots. Historically this would have usually been the
root device, but with UEFI it might be the variable store. There
needs to be a way to override the choice of block device to store
the state in.

Signed-off-by: Daniel P. Berrangé 
---
 block/monitor/block-hmp-cmds.c |  2 +-
 block/snapshot.c   | 17 +++--
 include/block/snapshot.h   |  4 +++-
 migration/savevm.c |  4 ++--
 4 files changed, 21 insertions(+), 6 deletions(-)

diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c
index db76c43cc2..81d1b52262 100644
--- a/block/monitor/block-hmp-cmds.c
+++ b/block/monitor/block-hmp-cmds.c
@@ -900,7 +900,7 @@ void hmp_info_snapshots(Monitor *mon, const QDict *qdict)
 SnapshotEntry *snapshot_entry;
 Error *err = NULL;
 
-bs = bdrv_all_find_vmstate_bs(NULL, &err);
+bs = bdrv_all_find_vmstate_bs(NULL, NULL, &err);
 if (!bs) {
 error_report_err(err);
 return;
diff --git a/block/snapshot.c b/block/snapshot.c
index 5691cdc6cb..1f7b9a5146 100644
--- a/block/snapshot.c
+++ b/block/snapshot.c
@@ -604,7 +604,9 @@ int bdrv_all_create_snapshot(QEMUSnapshotInfo *sn,
 return 0;
 }
 
-BlockDriverState *bdrv_all_find_vmstate_bs(strList *devices, Error **errp)
+BlockDriverState *bdrv_all_find_vmstate_bs(const char *vmstate_bs,
+   strList *devices,
+   Error **errp)
 {
 g_autoptr(GList) bdrvs = NULL;
 GList *iterbdrvs;
@@ -624,6 +626,13 @@ BlockDriverState *bdrv_all_find_vmstate_bs(strList 
*devices, Error **errp)
 bdrv_can_snapshot(bs);
 aio_context_release(ctx);
 
+if (vmstate_bs && g_str_equal(vmstate_bs,
+  bdrv_get_node_name(bs))) {
+error_setg(errp, "block device '%s' does not support snapshots",
+   vmstate_bs);
+return NULL;
+}
+
 if (found) {
 return bs;
 }
@@ -631,6 +640,10 @@ BlockDriverState *bdrv_all_find_vmstate_bs(strList 
*devices, Error **errp)
 iterbdrvs = iterbdrvs->next;
 }
 
-error_setg(errp, "No block device supports snapshots");
+if (vmstate_bs) {
+error_setg(errp, "Block device '%s' does not exist", vmstate_bs);
+} else {
+error_setg(errp, "No block device supports snapshots");
+}
 return NULL;
 }
diff --git a/include/block/snapshot.h b/include/block/snapshot.h
index 1c5b0705a9..05550e5da1 100644
--- a/include/block/snapshot.h
+++ b/include/block/snapshot.h
@@ -86,6 +86,8 @@ int bdrv_all_create_snapshot(QEMUSnapshotInfo *sn,
  strList *devices,
  Error **errp);
 
-BlockDriverState *bdrv_all_find_vmstate_bs(strList *devices, Error **errp);
+BlockDriverState *bdrv_all_find_vmstate_bs(const char *vmstate_bs,
+   strList *devices,
+   Error **errp);
 
 #endif
diff --git a/migration/savevm.c b/migration/savevm.c
index ae56de1a85..4a52704132 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2691,7 +2691,7 @@ int save_snapshot(const char *name, Error **errp)
 }
 }
 
-bs = bdrv_all_find_vmstate_bs(NULL, errp);
+bs = bdrv_all_find_vmstate_bs(NULL, NULL, errp);
 if (bs == NULL) {
 return ret;
 }
@@ -2881,7 +2881,7 @@ int load_snapshot(const char *name, Error **errp)
 return -1;
 }
 
-bs_vm_state = bdrv_all_find_vmstate_bs(NULL, errp);
+bs_vm_state = bdrv_all_find_vmstate_bs(NULL, NULL, errp);
 if (!bs_vm_state) {
 return -1;
 }
-- 
2.26.2




[PATCH v3 3/7] migration: stop returning errno from load_snapshot()

2020-08-27 Thread Daniel P . Berrangé
None of the callers care about the errno value since there is a full
Error object populated. This gives consistency with save_snapshot()
which already just returns -1.
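For illustration, the caller-side pattern this enables: test only the sign of
the return value and consume the Error object. A sketch of a hypothetical
call site, not code from the patch:

    Error *err = NULL;

    if (load_snapshot(name, &err) < 0) {
        /* the Error object carries the diagnosis; the exact
         * negative return value no longer matters */
        error_report_err(err);
        return;
    }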

Reviewed-by: Dr. David Alan Gilbert 
Signed-off-by: Daniel P. Berrangé 
---
 migration/savevm.c | 17 -
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index 3826c437cc..711137bcbe 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2870,20 +2870,20 @@ int load_snapshot(const char *name, Error **errp)
 if (!replay_can_snapshot()) {
 error_setg(errp, "Record/replay does not allow loading snapshot "
"right now. Try once more later.");
-return -EINVAL;
+return -1;
 }
 
 if (!bdrv_all_can_snapshot(errp)) {
-return -ENOTSUP;
+return -1;
 }
 ret = bdrv_all_find_snapshot(name, errp);
 if (ret < 0) {
-return ret;
+return -1;
 }
 
 bs_vm_state = bdrv_all_find_vmstate_bs(errp);
 if (!bs_vm_state) {
-return -ENOTSUP;
+return -1;
 }
 aio_context = bdrv_get_aio_context(bs_vm_state);
 
@@ -2892,11 +2892,11 @@ int load_snapshot(const char *name, Error **errp)
 ret = bdrv_snapshot_find(bs_vm_state, &sn, name);
 aio_context_release(aio_context);
 if (ret < 0) {
-return ret;
+return -1;
 } else if (sn.vm_state_size == 0) {
 error_setg(errp, "This is a disk-only snapshot. Revert to it "
" offline using qemu-img");
-return -EINVAL;
+return -1;
 }
 
 /* Flush all IO requests so they don't interfere with the new state.  */
@@ -2911,7 +2911,6 @@ int load_snapshot(const char *name, Error **errp)
 f = qemu_fopen_bdrv(bs_vm_state, 0);
 if (!f) {
 error_setg(errp, "Could not open VM state file");
-ret = -EINVAL;
 goto err_drain;
 }
 
@@ -2927,14 +2926,14 @@ int load_snapshot(const char *name, Error **errp)
 
 if (ret < 0) {
 error_setg(errp, "Error %d while loading VM state", ret);
-return ret;
+return -1;
 }
 
 return 0;
 
 err_drain:
 bdrv_drain_all_end();
-return ret;
+return -1;
 }
 
 void vmstate_register_ram(MemoryRegion *mr, DeviceState *dev)
-- 
2.26.2




[PATCH v3 2/7] block: push error reporting into bdrv_all_*_snapshot functions

2020-08-27 Thread Daniel P . Berrangé
The bdrv_all_*_snapshot functions return a BlockDriverState pointer
for the invalid backend, which the callers then use to report an
error message. In some cases multiple callers are reporting the
same error message, but with slightly different text. In the future
there will be more error scenarios for some of these methods, which
will benefit from fine grained error message reporting. So it is
helpful to push error reporting down a level.

Signed-off-by: Daniel P. Berrangé 
---
 block/monitor/block-hmp-cmds.c |  7 ++--
 block/snapshot.c   | 77 +-
 include/block/snapshot.h   | 14 +++
 migration/savevm.c | 37 +---
 monitor/hmp-cmds.c |  7 +---
 tests/qemu-iotests/267.out | 10 ++---
 6 files changed, 65 insertions(+), 87 deletions(-)

diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c
index 4c8c375172..9df11494d6 100644
--- a/block/monitor/block-hmp-cmds.c
+++ b/block/monitor/block-hmp-cmds.c
@@ -898,10 +898,11 @@ void hmp_info_snapshots(Monitor *mon, const QDict *qdict)
 
 ImageEntry *image_entry, *next_ie;
 SnapshotEntry *snapshot_entry;
+Error *err = NULL;
 
-bs = bdrv_all_find_vmstate_bs();
+bs = bdrv_all_find_vmstate_bs(&err);
 if (!bs) {
-monitor_printf(mon, "No available block device supports snapshots\n");
+error_report_err(err);
 return;
 }
 aio_context = bdrv_get_aio_context(bs);
@@ -951,7 +952,7 @@ void hmp_info_snapshots(Monitor *mon, const QDict *qdict)
 total = 0;
 for (i = 0; i < nb_sns; i++) {
 SnapshotEntry *next_sn;
-if (bdrv_all_find_snapshot(sn_tab[i].name, &bs1) == 0) {
+if (bdrv_all_find_snapshot(sn_tab[i].name, NULL) == 0) {
 global_snapshots[total] = i;
 total++;
 QTAILQ_FOREACH(image_entry, &image_list, next) {
diff --git a/block/snapshot.c b/block/snapshot.c
index bd9fb01817..6839060622 100644
--- a/block/snapshot.c
+++ b/block/snapshot.c
@@ -400,14 +400,14 @@ static bool 
bdrv_all_snapshots_includes_bs(BlockDriverState *bs)
  * These functions will properly handle dataplane (take aio_context_acquire
  * when appropriate for appropriate block drivers) */
 
-bool bdrv_all_can_snapshot(BlockDriverState **first_bad_bs)
+bool bdrv_all_can_snapshot(Error **errp)
 {
-bool ok = true;
 BlockDriverState *bs;
 BdrvNextIterator it;
 
 for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
 AioContext *ctx = bdrv_get_aio_context(bs);
+bool ok;
 
 aio_context_acquire(ctx);
 if (bdrv_all_snapshots_includes_bs(bs)) {
@@ -415,26 +415,25 @@ bool bdrv_all_can_snapshot(BlockDriverState 
**first_bad_bs)
 }
 aio_context_release(ctx);
 if (!ok) {
+error_setg(errp, "Device '%s' is writable but does not support "
+   "snapshots", bdrv_get_device_or_node_name(bs));
 bdrv_next_cleanup(&it);
-goto fail;
+return false;
 }
 }
 
-fail:
-*first_bad_bs = bs;
-return ok;
+return true;
 }
 
-int bdrv_all_delete_snapshot(const char *name, BlockDriverState **first_bad_bs,
- Error **errp)
+int bdrv_all_delete_snapshot(const char *name, Error **errp)
 {
-int ret = 0;
 BlockDriverState *bs;
 BdrvNextIterator it;
 QEMUSnapshotInfo sn1, *snapshot = &sn1;
 
 for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
 AioContext *ctx = bdrv_get_aio_context(bs);
+int ret;
 
 aio_context_acquire(ctx);
 if (bdrv_all_snapshots_includes_bs(bs) &&
@@ -445,26 +444,25 @@ int bdrv_all_delete_snapshot(const char *name, 
BlockDriverState **first_bad_bs,
 }
 aio_context_release(ctx);
 if (ret < 0) {
+error_prepend(errp, "Could not delete snapshot '%s' on '%s': ",
+  name, bdrv_get_device_or_node_name(bs));
 bdrv_next_cleanup(&it);
-goto fail;
+return -1;
 }
 }
 
-fail:
-*first_bad_bs = bs;
-return ret;
+return 0;
 }
 
 
-int bdrv_all_goto_snapshot(const char *name, BlockDriverState **first_bad_bs,
-   Error **errp)
+int bdrv_all_goto_snapshot(const char *name, Error **errp)
 {
-int ret = 0;
 BlockDriverState *bs;
 BdrvNextIterator it;
 
 for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
 AioContext *ctx = bdrv_get_aio_context(bs);
+int ret;
 
 aio_context_acquire(ctx);
 if (bdrv_all_snapshots_includes_bs(bs)) {
@@ -472,75 +470,75 @@ int bdrv_all_goto_snapshot(const char *name, 
BlockDriverState **first_bad_bs,
 }
 aio_context_release(ctx);
 if (ret < 0) {
+error_prepend(errp, "Could not load snapshot '%s' on '%s': ",
+  name, bdrv_get_device_or_node_name(bs));
 bdrv_next_cleanup(&it);
-goto 

[PATCH v3 1/7] migration: improve error reporting of block driver state name

2020-08-27 Thread Daniel P . Berrangé
With blockdev, a BlockDriverState may not have a device name,
so using a node name is required as an alternative.

Reviewed-by: Dr. David Alan Gilbert 
Signed-off-by: Daniel P. Berrangé 
---
 migration/savevm.c | 12 ++--
 tests/qemu-iotests/267.out |  4 ++--
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index a843d202b5..304d98ff78 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2682,7 +2682,7 @@ int save_snapshot(const char *name, Error **errp)
 
 if (!bdrv_all_can_snapshot(&bs)) {
 error_setg(errp, "Device '%s' is writable but does not support "
-   "snapshots", bdrv_get_device_name(bs));
+   "snapshots", bdrv_get_device_or_node_name(bs));
 return ret;
 }
 
@@ -2691,7 +2691,7 @@ int save_snapshot(const char *name, Error **errp)
 ret = bdrv_all_delete_snapshot(name, &bs1, errp);
 if (ret < 0) {
 error_prepend(errp, "Error while deleting snapshot on device "
-  "'%s': ", bdrv_get_device_name(bs1));
+  "'%s': ", bdrv_get_device_or_node_name(bs1));
 return ret;
 }
 }
@@ -2766,7 +2766,7 @@ int save_snapshot(const char *name, Error **errp)
 ret = bdrv_all_create_snapshot(sn, bs, vm_state_size, &bs);
 if (ret < 0) {
 error_setg(errp, "Error while creating snapshot on '%s'",
-   bdrv_get_device_name(bs));
+   bdrv_get_device_or_node_name(bs));
 goto the_end;
 }
 
@@ -2884,14 +2884,14 @@ int load_snapshot(const char *name, Error **errp)
 if (!bdrv_all_can_snapshot(&bs)) {
 error_setg(errp,
"Device '%s' is writable but does not support snapshots",
-   bdrv_get_device_name(bs));
+   bdrv_get_device_or_node_name(bs));
 return -ENOTSUP;
 }
 ret = bdrv_all_find_snapshot(name, &bs);
 if (ret < 0) {
 error_setg(errp,
"Device '%s' does not have the requested snapshot '%s'",
-   bdrv_get_device_name(bs), name);
+   bdrv_get_device_or_node_name(bs), name);
 return ret;
 }
 
@@ -2920,7 +2920,7 @@ int load_snapshot(const char *name, Error **errp)
 ret = bdrv_all_goto_snapshot(name, &bs, errp);
 if (ret < 0) {
 error_prepend(errp, "Could not load snapshot '%s' on '%s': ",
-  name, bdrv_get_device_name(bs));
+  name, bdrv_get_device_or_node_name(bs));
 goto err_drain;
 }
 
diff --git a/tests/qemu-iotests/267.out b/tests/qemu-iotests/267.out
index d6d80c099f..215902b3ad 100644
--- a/tests/qemu-iotests/267.out
+++ b/tests/qemu-iotests/267.out
@@ -81,11 +81,11 @@ Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728
 Testing: -blockdev driver=file,filename=TEST_DIR/t.IMGFMT,node-name=file
 QEMU X.Y.Z monitor - type 'help' for more information
 (qemu) savevm snap0
-Error: Device '' is writable but does not support snapshots
+Error: Device 'file' is writable but does not support snapshots
 (qemu) info snapshots
 No available block device supports snapshots
 (qemu) loadvm snap0
-Error: Device '' is writable but does not support snapshots
+Error: Device 'file' is writable but does not support snapshots
 (qemu) quit
 
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728
-- 
2.26.2




[PATCH v3 0/7] migration: bring improved savevm/loadvm/delvm to QMP

2020-08-27 Thread Daniel P . Berrangé
 v1: https://lists.gnu.org/archive/html/qemu-devel/2020-07/msg00866.html
 v2: https://lists.gnu.org/archive/html/qemu-devel/2020-07/msg07523.html

When QMP was first introduced some 10+ years ago now, the snapshot
related commands (savevm/loadvm/delvm) were not converted. This was
primarily because their implementation causes blocking of the thread
running the monitor commands. This was (and still is) considered
undesirable behaviour both in HMP and QMP.

In theory someone was supposed to fix this flaw at some point in the
past 10 years and bring them into the QMP world. Sadly, thus far it
hasn't happened as people always had more important things to work
on. Enterprise apps were much more interested in external snapshots
than internal snapshots as they have many more features.

Meanwhile users still want to use internal snapshots as there is
a certain simplicity in having everything self-contained in one
image, even though it has limitations. Thus the apps end up
executing savevm/loadvm/delvm via the "human-monitor-command"
QMP command.

IOW, the problematic blocking behaviour that was one of the reasons
for not having savevm/loadvm/delvm in QMP is experienced by applications
regardless. By not porting the commands to QMP due to one design flaw,
we've forced apps and users to suffer from other design flaws of HMP (
bad error reporting, strong type checking of args, no introspection) for
an additional 10 years. This feels rather sub-optimal :-(

In practice users don't appear to care strongly about the fact that these
commands block the VM while they run. I might have seen one bug report
about it, but it certainly isn't something that comes up as a frequent
topic except among us QEMU maintainers. Users do care about having
access to the snapshot feature.

Where I am seeing frequent complaints is wrt the use of OVMF combined
with snapshots which has some serious pain points. This is getting worse
as the push to ditch legacy BIOS in favour of UEFI gains momentum both
across OS vendors and mgmt apps. Solving it requires new parameters to
the commands, but doing this in HMP is super unappealing.

After 10 years, I think it is time for us to be a little pragmatic about
our handling of snapshot commands. My desire is that libvirt should never
use "human-monitor-command" under any circumstances, because of the
inherent flaws in HMP as a protocol for machine consumption.

Thus in this series I'm proposing a fairly direct mapping of the existing
HMP commands for savevm/loadvm/delvm into QMP as a first step. This does
not solve the blocking thread problem, but it does put in a place a
design using the jobs framework which can facilitate solving it later.
It does also solve the error reporting, type checking and introspection
problems inherent to HMP. So we're winning on 3 out of the 4 problems,
and pushed apps to a QMP design that will let us solve the last
remaining problem.

With a QMP variant, we reasonably deal with the problems related to OVMF:

 - The logic to pick which disk to store the vmstate in is not
   satisfactory.

   The first block driver state cannot be assumed to be the root disk
   image, it might be OVMF varstore and we don't want to store vmstate
   in there.

 - The logic to decide which disks must be snapshotted is hardwired
   to all disks which are writable

   Again with OVMF there might be a writable varstore, but this can be
   raw rather than qcow2 format, and thus unable to be snapshotted.
   While users might wish to snapshot their varstore, in some/many/most
   cases it is entirely unnecessary. Users are blocked from snapshotting
   their VM though due to this varstore.

These are solved by adding two parameters to the commands. The first is
a block device node name that identifies the image to store vmstate in,
and the second is a list of node names to include for the snapshots.
If the list of nodes isn't given, it falls back to the historical
behaviour of using all disks matching some undocumented criteria.

In the block code I've only dealt with node names for block devices, as
IIUC, this is all that libvirt should need in the -blockdev world it now
lives in. IOW, I've made no attempt to cope with people wanting to use
these QMP commands in combination with -drive args, as libvirt will
never use -drive with a QEMU new enough to have these new commands.

The main limitations of this current impl

 - The snapshot process runs serialized in the main thread. ie QEMU
   guest execution is blocked for the duration. The job framework
   lets us fix this in future without changing the QMP semantics
   exposed to the apps.

 - Most vmstate loading errors just go to stderr, as they are not
   using Error **errp reporting. Thus the job framework just
   reports a fairly generic message

 "Error -22 while loading VM state"

   Again this can be fixed later without changing the QMP semantics
   exposed to apps.

I've done some minimal work in libvirt to start to make use of th

Re: [PATCH v2 (BROKEN) 0/6] migration: bring improved savevm/loadvm/delvm to QMP

2020-08-27 Thread Markus Armbruster
Daniel P. Berrangé  writes:

> On Wed, Aug 26, 2020 at 07:28:24PM +0100, Daniel P. Berrangé wrote:
>> On Wed, Aug 26, 2020 at 05:52:06PM +0200, Markus Armbruster wrote:
>> > Open questions:
>> > 
>> > * Do we want the QMP command to delete existing snapshots with
>> >   conflicting tag / ID, like HMP savevm does?  Or do we want it to fail
>> >   the transaction?
>> 
>> The intent is for the QMP commands to operate exclusively on
>> 'tags', and never consider "ID".
>
> I forgot that even HMP ignores "ID" now and works exclusively in terms
> of tags since:
>
>
>   commit 6ca080453ea403959ccde661030ca16264acc181
>   Author: Daniel Henrique Barboza 
>   Date:   Wed Nov 7 11:09:58 2018 -0200
>
> block/snapshot.c: eliminate use of ID input in snapshot operations

Almost a year after I sent the memo I quoted.  It's an incompatible
change, but nobody complained, and I'm glad we got this issue out of the
way.




Re: [PATCH v2 (BROKEN) 0/6] migration: bring improved savevm/loadvm/delvm to QMP

2020-08-27 Thread Markus Armbruster
Daniel P. Berrangé  writes:

> On Wed, Aug 26, 2020 at 05:52:06PM +0200, Markus Armbruster wrote:
>> Sorry for taking so long to reply.
>> 
>> Daniel P. Berrangé  writes:
>> 
>> > A followup to:
>> >
>> >  v1: https://lists.gnu.org/archive/html/qemu-devel/2020-07/msg00866.html
>> >
>> > When QMP was first introduced some 10+ years ago now, the snapshot
>> > related commands (savevm/loadvm/delvm) were not converted. This was
>> > primarily because their implementation causes blocking of the thread
>> > running the monitor commands. This was (and still is) considered
>> > undesirable behaviour both in HMP and QMP.
>> 
>> One of several reasons.
>> 
>> > In theory someone was supposed to fix this flaw at some point in the
>> > past 10 years and bring them into the QMP world. Sadly, thus far it
>> > hasn't happened as people always had more important things to work
>> > on. Enterprise apps were much more interested in external snapshots
>> > than internal snapshots as they have many more features.
>> 
>> Several attempts have been made to bring the functionality to QMP.
>> Sadly, they went nowhere.
>> 
>> I posted an analysis of the issues in reply to one of the more serious
>> attempts:
>> 
>> Message-ID: <87lh7l783q@blackfin.pond.sub.org>
>> https://lists.nongnu.org/archive/html/qemu-devel/2016-01/msg03593.html
>> 
>> I'd like to hear your take on it.  I append the relevant part for your
>> convenience.  Perhaps your code is already close to what I describe
>> there.  I'm interested in where it falls short.
>> 
>> > Meanwhile users still want to use internal snapshots as there is
>> > a certainly simplicity in having everything self-contained in one
>> > image, even though it has limitations. Thus the apps that end up
>> > executing the savevm/loadvm/delvm via the "human-monitor-command"
>> > QMP command.
>> >
>> > IOW, the problematic blocking behaviour that was one of the reasons
>> > for not having savevm/loadvm/delvm in QMP is experienced by applications
>> > regardless. By not porting the commands to QMP due to one design flaw,
>> > we've forced apps and users to suffer from other design flaws of HMP (
>> > bad error reporting, no strong type checking of args, no introspection) for
>> > an additional 10 years. This feels rather sub-optimal :-(
>> >
>> > In practice users don't appear to care strongly about the fact that these
>> > commands block the VM while they run. I might have seen one bug report
>> > about it, but it certainly isn't something that comes up as a frequent
>> > topic except among us QEMU maintainers. Users do care about having
>> > access to the snapshot feature.
>> >
>> > Where I am seeing frequent complaints is wrt the use of OVMF combined
>> > with snapshots which has some serious pain points. This is getting worse
>> > as the push to ditch legacy BIOS in favour of UEFI gains momentum both
>> > across OS vendors and mgmt apps. Solving it requires new parameters to
>> > the commands, but doing this in HMP is super unappealing.
>> >
>> > After 10 years, I think it is time for us to be a little pragmatic about
>> > our handling of snapshots commands. My desire is that libvirt should never
>> > use "human-monitor-command" under any circumstances, because of the
>> > inherent flaws in HMP as a protocol for machine consumption.
>> >
>> > Thus in this series I'm proposing a fairly direct mapping of the existing
>> > HMP commands for savevm/loadvm/delvm into QMP as a first step. This does
>> > not solve the blocking thread problem, but it does put in place a
>> > design using the jobs framework which can facilitate solving it later.
>> > It also solves the error reporting, type checking and introspection
>> > problems inherent to HMP. So we're winning on 3 out of the 4 problems,
>> > and pushing apps to a QMP design that will let us solve the last
>> > remaining problem.
>> >
>> > With a QMP variant, we reasonably deal with the problems related to OVMF:
>> >
>> >  - The logic to pick which disk to store the vmstate in is not
>> >    satisfactory.
>> >
>> >    The first block driver state cannot be assumed to be the root disk
>> >    image, it might be OVMF varstore and we don't want to store vmstate
>> >    in there.
>> 
>> Yes, this is one of the issues.  Glad you're addressing it.
>> 
>> >  - The logic to decide which disks must be snapshotted is hardwired
>> >    to all disks which are writable
>> >
>> >    Again with OVMF there might be a writable varstore, but this can be
>> >    raw rather than qcow2 format, and thus unable to be snapshotted.
>> >    While users might wish to snapshot their varstore, in some/many/most
>> >    cases it is entirely unnecessary. Users are nonetheless blocked from
>> >    snapshotting their VM due to this varstore.
>> 
>> Another one.  Glad again.
>> 
>> > These are solved by adding two parameters to the commands. The first is
>> > a block device node name that identifies the image to store vmstate in,
>> > and the second is a list of node names to include for the snapshots.

Re: [PATCH v7 0/8] Introduce 'yank' oob qmp command to recover from hanging qemu

2020-08-27 Thread Daniel P . Berrangé
On Thu, Aug 27, 2020 at 10:42:46AM +0200, Lukas Straub wrote:
> On Tue, 18 Aug 2020 14:26:31 +0200
> Lukas Straub  wrote:
> 
> > On Tue, 4 Aug 2020 10:11:22 +0200
> > Lukas Straub  wrote:
> > 
> > > Hello Everyone,
> > > In many cases, if qemu has a network connection (qmp, migration, chardev, 
> > > etc.)
> > > to some other server and that server dies or hangs, qemu hangs too.
> > > These patches introduce the new 'yank' out-of-band qmp command to recover 
> > > from
> > > these kinds of hangs. The different subsystems register callbacks which 
> > > get
> > > executed with the yank command. For example the callback can shutdown() a
> > > socket. This is intended for the colo use-case, but it can be used for 
> > > other
> > > things too of course.
> > > 
> > > Regards,
> > > Lukas Straub
> > > 
> > > v7:
> > >  -yank_register_instance now returns error via Error **errp instead of 
> > > aborting
> > >  -dropped "chardev/char.c: Check for duplicate id before  creating 
> > > chardev"
> > > 
> > > v6:
> > >  -add Reviewed-by and Acked-by tags
> > >  -rebase on master
> > >  -lots of changes in nbd due to rebase
> > >  -only take maintainership of util/yank.c and include/qemu/yank.h (Daniel 
> > > P. Berrangé)
> > >  -fix a crash discovered by the newly added chardev test
> > >  -fix the test itself
> > > 
> > > v5:
> > >  -move yank.c to util/
> > >  -move yank.h to include/qemu/
> > >  -add license to yank.h
> > >  -use const char*
> > >  -nbd: use atomic_store_release and atomic_load_acquire
> > >  -io-channel: ensure thread-safety and document it
> > >  -add myself as maintainer for yank
> > > 
> > > v4:
> > >  -fix build errors...
> > > 
> > > v3:
> > >  -don't touch softmmu/vl.c, use __constructor__ attribute instead (Paolo 
> > > Bonzini)
> > >  -fix build errors
> > >  -rewrite migration patch so it actually passes all tests
> > > 
> > > v2:
> > >  -don't touch io/ code anymore
> > >  -always register yank functions
> > >  -'yank' now takes a list of instances to yank
> > >  -'query-yank' returns a list of yankable instances
> > > 
> > > Lukas Straub (8):
> > >   Introduce yank feature
> > >   block/nbd.c: Add yank feature
> > >   chardev/char-socket.c: Add yank feature
> > >   migration: Add yank feature
> > >   io/channel-tls.c: make qio_channel_tls_shutdown thread-safe
> > >   io: Document thread-safety of qio_channel_shutdown
> > >   MAINTAINERS: Add myself as maintainer for yank feature
> > >   tests/test-char.c: Wait for the chardev to connect in
> > > char_socket_client_dupid_test
> > > 
> > >  MAINTAINERS   |   6 ++
> > >  block/nbd.c   | 129 +++-
> > >  chardev/char-socket.c |  31 ++
> > >  include/io/channel.h  |   2 +
> > >  include/qemu/yank.h   |  80 +++
> > >  io/channel-tls.c  |   6 +-
> > >  migration/channel.c   |  12 +++
> > >  migration/migration.c |  25 -
> > >  migration/multifd.c   |  10 ++
> > >  migration/qemu-file-channel.c |   6 ++
> > >  migration/savevm.c|   6 ++
> > >  qapi/misc.json|  45 +
> > >  tests/Makefile.include|   2 +-
> > >  tests/test-char.c |   1 +
> > >  util/Makefile.objs|   1 +
> > >  util/yank.c   | 184 ++
> > >  16 files changed, 493 insertions(+), 53 deletions(-)
> > >  create mode 100644 include/qemu/yank.h
> > >  create mode 100644 util/yank.c
> > > 
> > > --
> > > 2.20.1  
> > 
> > Ping...
> 
> Ping 2...
> 
> Also, can the different subsystems have a look at this and give their ok?

We need ACKs from the NBD, migration and chardev maintainers, for the
respective patches, then I think this series is ready for a pull request.

Once acks arrive, I'm happy to send a PULL unless someone else has a
desire to do it.
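
For readers not following the patches themselves, here is a minimal sketch
of how a subsystem hooks into the yank feature, pieced together from the
changelog above. The exact signatures of yank_register_instance() /
yank_register_function(), the instance naming scheme and the helper names
are assumptions for illustration, not copied from the series:

  #include "qemu/yank.h"
  #include "io/channel.h"

  /* Yank callback: runs from the out-of-band monitor context while the
   * main loop may be stuck, so it must only do thread-safe work. */
  static void example_chardev_yank(void *opaque)
  {
      QIOChannel *ioc = QIO_CHANNEL(opaque);
      qio_channel_shutdown(ioc, QIO_CHANNEL_SHUTDOWN_BOTH, NULL);
  }

  static void example_chardev_open(QIOChannel *ioc, Error **errp)
  {
      /* Make the instance known, then attach one or more functions. */
      if (!yank_register_instance("chardev:char0", errp)) {
          return;  /* e.g. the instance name is already registered */
      }
      yank_register_function("chardev:char0", example_chardev_yank, ioc);
  }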


Regards,
Daniel
-- 
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|




Re: [PATCH v7 4/8] migration: Add yank feature

2020-08-27 Thread Daniel P . Berrangé
On Tue, Aug 04, 2020 at 10:11:45AM +0200, Lukas Straub wrote:
> Register yank functions on sockets to shut them down.
> 
> Signed-off-by: Lukas Straub 
> Acked-by: Stefan Hajnoczi 
> ---
>  migration/channel.c   | 12 
>  migration/migration.c | 25 -
>  migration/multifd.c   | 10 ++
>  migration/qemu-file-channel.c |  6 ++
>  migration/savevm.c|  6 ++
>  tests/Makefile.include|  2 +-
>  6 files changed, 59 insertions(+), 2 deletions(-)

Reviewed-by: Daniel P. Berrangé 


Regards,
Daniel
-- 
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|




Re: [PATCH v7 2/8] block/nbd.c: Add yank feature

2020-08-27 Thread Daniel P . Berrangé
On Tue, Aug 04, 2020 at 10:11:37AM +0200, Lukas Straub wrote:
> Register a yank function which shuts down the socket and sets
> s->state = NBD_CLIENT_QUIT. This is the same behaviour as if an
> error occurred.
> 
> Signed-off-by: Lukas Straub 
> Acked-by: Stefan Hajnoczi 
> ---
>  block/nbd.c | 129 
>  1 file changed, 80 insertions(+), 49 deletions(-)
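
In other words, the yank callback mirrors the existing fatal-error path.
A condensed sketch (the field and helper names here are assumptions based
on the commit message and changelogs, not copied from the patch):

  /* Runs from the OOB monitor; both operations below are thread-safe. */
  static void nbd_yank(void *opaque)
  {
      BDRVNBDState *s = opaque;

      /* Stop the client state machine, as the error path would... */
      atomic_store_release(&s->state, NBD_CLIENT_QUIT);
      /* ...and kick any coroutine blocked on the socket. */
      qio_channel_shutdown(QIO_CHANNEL(s->ioc), QIO_CHANNEL_SHUTDOWN_BOTH,
                           NULL);
  }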

Reviewed-by: Daniel P. Berrangé 


Regards,
Daniel
-- 
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|




Re: [PATCH v7 3/8] chardev/char-socket.c: Add yank feature

2020-08-27 Thread Daniel P . Berrangé
On Tue, Aug 04, 2020 at 10:11:41AM +0200, Lukas Straub wrote:
> Register a yank function to shutdown the socket on yank.
> 
> Signed-off-by: Lukas Straub 
> Acked-by: Stefan Hajnoczi 
> ---
>  chardev/char-socket.c | 31 +++
>  1 file changed, 31 insertions(+)

Reviewed-by: Daniel P. Berrangé 


Regards,
Daniel
-- 
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|




Re: [PATCH v7 1/8] Introduce yank feature

2020-08-27 Thread Daniel P . Berrangé
On Tue, Aug 04, 2020 at 10:11:34AM +0200, Lukas Straub wrote:
> The yank feature allows recovering from a hanging qemu by "yanking"
> at various parts. Other qemu subsystems can register themselves and
> multiple yank functions. Then all yank functions for selected
> instances can be called by the 'yank' out-of-band qmp command.
> Available instances can be queried by a 'query-yank' oob command.
> 
> Signed-off-by: Lukas Straub 
> Acked-by: Stefan Hajnoczi 
> ---
>  include/qemu/yank.h |  80 +++
>  qapi/misc.json  |  45 +++
>  util/Makefile.objs  |   1 +
>  util/yank.c | 184 
>  4 files changed, 310 insertions(+)
>  create mode 100644 include/qemu/yank.h
>  create mode 100644 util/yank.c
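
At the QMP level the command pair is deliberately small. A hypothetical
exchange, assuming instances are identified by plain names (the actual
wire format is defined by qapi/misc.json in the patch, so the shapes
below are illustrative only):

  -> { "execute": "query-yank" }
  <- { "return": [ "chardev:char0", "nbd:drive0" ] }

  -> { "execute": "yank",
       "arguments": { "instances": [ "nbd:drive0" ] } }
  <- { "return": {} }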

Reviewed-by: Daniel P. Berrangé 


Regards,
Daniel
-- 
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|




Re: [PATCH v7 8/8] tests/test-char.c: Wait for the chardev to connect in char_socket_client_dupid_test

2020-08-27 Thread Daniel P . Berrangé
On Tue, Aug 04, 2020 at 10:12:01AM +0200, Lukas Straub wrote:
> A connecting chardev object has an additional reference by the connecting
> thread, so if the chardev is still connecting by the end of the test,
> then the chardev object won't be freed. This in turn means that the yank
> instance won't be unregistered and when running the next test-case
> yank_register_instance will abort, because the yank instance is
> already/still registered.
> 
> Signed-off-by: Lukas Straub 
> ---
>  tests/test-char.c | 1 +
>  1 file changed, 1 insertion(+)

Reviewed-by: Daniel P. Berrangé 


Regards,
Daniel
-- 
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|




Re: [PATCH v5 07/10] block: introduce preallocate filter

2020-08-27 Thread Vladimir Sementsov-Ogievskiy

26.08.2020 16:51, David Edmondson wrote:

+  file-systems with slow allocation.
+
+  Supported options:
+
+  .. program:: preallocate
+  .. option:: prealloc-align
+
+On preallocation, align file length to this number, default 1M.

*the file length

As for “number”...  Well, it is a number.  But “value” might fit better.
  Or “length (in bytes)”?

Isn't it really:

"On preallocation, ensure that the file length is aligned to a multiple
of this value, default 1M."



Sounds good, thanks!
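
For context, the preallocate filter is meant to sit between a format node
and its protocol file. A hypothetical -blockdev chain (the option plumbing
is assumed from the documentation snippet above, not taken from the patch):

  -blockdev node-name=proto0,driver=file,filename=disk.qcow2
  -blockdev node-name=prealloc0,driver=preallocate,file=proto0,prealloc-align=1048576
  -blockdev node-name=fmt0,driver=qcow2,file=prealloc0

With such a chain, a write that grows the qcow2 file makes the filter
extend proto0 to the next multiple of prealloc-align, so later growth
avoids paying the file-system allocation cost on every request.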


--
Best regards,
Vladimir



Re: [PATCH for-5.2 v2 3/9] pc-bios/s390-ccw: Introduce ENODEV define and remove guards of others

2020-08-27 Thread Thomas Huth
On 06/08/2020 15.27, Janosch Frank wrote:
> On 8/6/20 12:53 PM, Thomas Huth wrote:
>> Remove the "#ifndef E..." guards from the defines here - the header
>> guard S390_CCW_H at the top of the file should avoid double definition,
>> and if the error code is defined in a different file already, we're in
>> trouble anyway, then it's better to see the error at compile time instead
>> of hunting weird behavior during runtime later.
>> Also define ENODEV - we will use this in a later patch.
>>
>> Signed-off-by: Thomas Huth 
> 
> Would it make sense to use the errno.h numbers for the defines?

Which one? From Linux? From Windows? From BSD? ... I think it's likely
best if we keep them separate to avoid confusion.

 Thomas


> Reviewed-by: Janosch Frank 
> 
>> ---
>>  pc-bios/s390-ccw/s390-ccw.h | 6 ++
>>  1 file changed, 2 insertions(+), 4 deletions(-)
>>
>> diff --git a/pc-bios/s390-ccw/s390-ccw.h b/pc-bios/s390-ccw/s390-ccw.h
>> index 36b884cced..dbc4c64851 100644
>> --- a/pc-bios/s390-ccw/s390-ccw.h
>> +++ b/pc-bios/s390-ccw/s390-ccw.h
>> @@ -27,12 +27,10 @@ typedef unsigned long long __u64;
>>  #define false 0
>>  #define PAGE_SIZE 4096
>>  
>> -#ifndef EIO
>>  #define EIO 1
>> -#endif
>> -#ifndef EBUSY
>>  #define EBUSY   2
>> -#endif
>> +#define ENODEV  3
>> +
>>  #ifndef NULL
>>  #define NULL0
>>  #endif
>>
> 
> 




Re: [PATCH v7 0/8] Introduce 'yank' oob qmp command to recover from hanging qemu

2020-08-27 Thread Lukas Straub
On Tue, 18 Aug 2020 14:26:31 +0200
Lukas Straub  wrote:

> On Tue, 4 Aug 2020 10:11:22 +0200
> Lukas Straub  wrote:
> 
> > Hello Everyone,
> > In many cases, if qemu has a network connection (qmp, migration, chardev, 
> > etc.)
> > to some other server and that server dies or hangs, qemu hangs too.
> > These patches introduce the new 'yank' out-of-band qmp command to recover 
> > from
> > these kinds of hangs. The different subsystems register callbacks which get
> > executed with the yank command. For example the callback can shutdown() a
> > socket. This is intended for the colo use-case, but it can be used for other
> > things too of course.
> > 
> > Regards,
> > Lukas Straub
> > 
> > v7:
> >  -yank_register_instance now returns error via Error **errp instead of 
> > aborting
> >  -dropped "chardev/char.c: Check for duplicate id before  creating chardev"
> > 
> > v6:
> >  -add Reviewed-by and Acked-by tags
> >  -rebase on master
> >  -lots of changes in nbd due to rebase
> >  -only take maintainership of util/yank.c and include/qemu/yank.h (Daniel 
> > P. Berrangé)
> >  -fix a crash discovered by the newly added chardev test
> >  -fix the test itself
> > 
> > v5:
> >  -move yank.c to util/
> >  -move yank.h to include/qemu/
> >  -add license to yank.h
> >  -use const char*
> >  -nbd: use atomic_store_release and atomic_load_acquire
> >  -io-channel: ensure thread-safety and document it
> >  -add myself as maintainer for yank
> > 
> > v4:
> >  -fix build errors...
> > 
> > v3:
> >  -don't touch softmmu/vl.c, use __constructor__ attribute instead (Paolo 
> > Bonzini)
> >  -fix build errors
> >  -rewrite migration patch so it actually passes all tests
> > 
> > v2:
> >  -don't touch io/ code anymore
> >  -always register yank functions
> >  -'yank' now takes a list of instances to yank
> >  -'query-yank' returns a list of yankable instances
> > 
> > Lukas Straub (8):
> >   Introduce yank feature
> >   block/nbd.c: Add yank feature
> >   chardev/char-socket.c: Add yank feature
> >   migration: Add yank feature
> >   io/channel-tls.c: make qio_channel_tls_shutdown thread-safe
> >   io: Document thread-safety of qio_channel_shutdown
> >   MAINTAINERS: Add myself as maintainer for yank feature
> >   tests/test-char.c: Wait for the chardev to connect in
> > char_socket_client_dupid_test
> > 
> >  MAINTAINERS   |   6 ++
> >  block/nbd.c   | 129 +++-
> >  chardev/char-socket.c |  31 ++
> >  include/io/channel.h  |   2 +
> >  include/qemu/yank.h   |  80 +++
> >  io/channel-tls.c  |   6 +-
> >  migration/channel.c   |  12 +++
> >  migration/migration.c |  25 -
> >  migration/multifd.c   |  10 ++
> >  migration/qemu-file-channel.c |   6 ++
> >  migration/savevm.c|   6 ++
> >  qapi/misc.json|  45 +
> >  tests/Makefile.include|   2 +-
> >  tests/test-char.c |   1 +
> >  util/Makefile.objs|   1 +
> >  util/yank.c   | 184 ++
> >  16 files changed, 493 insertions(+), 53 deletions(-)
> >  create mode 100644 include/qemu/yank.h
> >  create mode 100644 util/yank.c
> > 
> > --
> > 2.20.1  
> 
> Ping...

Ping 2...

Also, can the different subsystems have a look at this and give their ok?

Regards,
Lukas Straub

