Re: [Qemu-block] [Qemu-devel] [PATCH] tcmu: Introduce qemu-tcmu utility

2019-03-12 Thread Fam Zheng



> On Mar 12, 2019, at 18:03, Kevin Wolf  wrote:
> 
> Am 12.03.2019 um 03:18 hat Fam Zheng geschrieben:
>>> On Mar 11, 2019, at 19:06, Kevin Wolf  wrote:
>>> Am 09.03.2019 um 02:46 hat Yaowei Bai geschrieben:
 Thanks for explaining the background. It comes to my mind that actually we
 talked about these two cases with Fam a bit long time ago and decided to
 support both these two cases. The reason why we implement case2 first is
 currently we care more about exporting new opened images and it's a bit
 more convenient, exporting from a VM or QMP can be added in the later
 release. Do you think it's reasonable/acceptable that we support both
 cases and use case2 for normal new opened images and case1 for the
 circumstances you mention above?
>>> 
>>> I would like to avoid a second code path because it comes with a
>>> maintenance cost.
>>> 
>>> Experience also tells that adding a new way to parse option strings will
>>> come back at us later because it we must always maintain compatibility
>>> with yet another format.
>> 
>> If the rule is that cfgstr strictly follows -blockdev syntax and the
>> parsing code is mostly shared, it shouldn’t be a problem, right? I
>> suppose this will limit the expressiveness of cfgstr but might be a
>> reasonable tradeoff between implementation complexity, user
>> friendliness and interface consistency.
> 
> Yes, if we could pass the cfgstr directly to an existing parser, that
> would make it less bad at least.
> 
> Of course, if we directly pass cfgstr to the -blockdev parser, it also
> means that we can't have additional options for configuring the
> BlockBackend (which could be part of the -export command line option for
> qemu-tcmu).
> 
> That could be addressed by wrapping another QAPI object around it that
> only contain BlockdevOptions as one field, but then you have to prefix
> every block node option if you use keyval visitor syntax.
> 
>> Another possibility is to use json: format in cfgstr for anything more
>> complex than a single -blockdev.
> 
> Parsing like -blockdev already includes JSON syntax (which is necessary
> to get the full expressiveness).
> 
> If you want to support the equivalent of multiple -blockdevs, you'd need
> to wrap the BlockdevOptions in a QAPI list.
> 
>>> So I would prefer not doing this and just passing command line options
>>> to qemu-tcmu, which can reuse the already existing code paths.
>> 
>> I think the effect of not supporting adding new blockdev from cfgstr
>> is that we have to resort to QMP to allow hot-adding targets. It’s not
>> necessarily bad, though I’m not sure hwo that aligns with Yaowei’s
>> development plan.
> 
> This is true, but we need a way to do this from QMP anyway. So the
> question is whether we want to introduce a second way to achieve the
> same thing a bit more conveniently. But I expect that hot-adding is
> something that is usually done with management software anyway, so does
> the convenience actually buy us much?

The real difference is, are existing management software to adopt qemu-tcmu 
built around targetcli family or QMP? I think the management must understand 
targetcli interface to work, talking QMP is an additional burden.

So IMO cfgstr can ideally be the only channel that the management interacts 
with, if we could reuse existing QMP/QAPI code well enough.

Fam

> 
> Kevin





[Qemu-block] [PULL 07/26] vhost-user: simplify vhost_user_init/vhost_user_cleanup

2019-03-12 Thread Michael S. Tsirkin
From: Marc-André Lureau 

Take a VhostUserState* that can be pre-allocated, and initialize it
with the associated chardev.

Signed-off-by: Marc-André Lureau 
Reviewed-by: Tiwei Bie 
Message-Id: <20190308140454.32437-4-marcandre.lur...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 include/hw/virtio/vhost-user-blk.h  |  2 +-
 include/hw/virtio/vhost-user-scsi.h |  2 +-
 include/hw/virtio/vhost-user.h  |  2 +-
 backends/cryptodev-vhost-user.c | 18 --
 hw/block/vhost-user-blk.c   | 22 --
 hw/scsi/vhost-user-scsi.c   | 20 
 hw/virtio/vhost-stub.c  |  4 ++--
 hw/virtio/vhost-user.c  | 16 
 net/vhost-user.c| 13 -
 9 files changed, 33 insertions(+), 66 deletions(-)

diff --git a/include/hw/virtio/vhost-user-blk.h 
b/include/hw/virtio/vhost-user-blk.h
index d52944aeeb..a8a106eecb 100644
--- a/include/hw/virtio/vhost-user-blk.h
+++ b/include/hw/virtio/vhost-user-blk.h
@@ -36,7 +36,7 @@ typedef struct VHostUserBlk {
 uint32_t queue_size;
 uint32_t config_wce;
 struct vhost_dev dev;
-VhostUserState *vhost_user;
+VhostUserState vhost_user;
 } VHostUserBlk;
 
 #endif
diff --git a/include/hw/virtio/vhost-user-scsi.h 
b/include/hw/virtio/vhost-user-scsi.h
index e429cacd8e..738f9288bd 100644
--- a/include/hw/virtio/vhost-user-scsi.h
+++ b/include/hw/virtio/vhost-user-scsi.h
@@ -30,7 +30,7 @@
 
 typedef struct VHostUserSCSI {
 VHostSCSICommon parent_obj;
-VhostUserState *vhost_user;
+VhostUserState vhost_user;
 } VHostUserSCSI;
 
 #endif /* VHOST_USER_SCSI_H */
diff --git a/include/hw/virtio/vhost-user.h b/include/hw/virtio/vhost-user.h
index fd660393a0..811e325f42 100644
--- a/include/hw/virtio/vhost-user.h
+++ b/include/hw/virtio/vhost-user.h
@@ -22,7 +22,7 @@ typedef struct VhostUserState {
 VhostUserHostNotifier notifier[VIRTIO_QUEUE_MAX];
 } VhostUserState;
 
-VhostUserState *vhost_user_init(void);
+bool vhost_user_init(VhostUserState *user, CharBackend *chr, Error **errp);
 void vhost_user_cleanup(VhostUserState *user);
 
 #endif
diff --git a/backends/cryptodev-vhost-user.c b/backends/cryptodev-vhost-user.c
index d539f14d59..1052a5d0e9 100644
--- a/backends/cryptodev-vhost-user.c
+++ b/backends/cryptodev-vhost-user.c
@@ -47,7 +47,7 @@
 typedef struct CryptoDevBackendVhostUser {
 CryptoDevBackend parent_obj;
 
-VhostUserState *vhost_user;
+VhostUserState vhost_user;
 CharBackend chr;
 char *chr_name;
 bool opened;
@@ -104,7 +104,7 @@ cryptodev_vhost_user_start(int queues,
 continue;
 }
 
-options.opaque = s->vhost_user;
+options.opaque = >vhost_user;
 options.backend_type = VHOST_BACKEND_TYPE_USER;
 options.cc = b->conf.peers.ccs[i];
 s->vhost_crypto[i] = cryptodev_vhost_init();
@@ -182,7 +182,6 @@ static void cryptodev_vhost_user_init(
 size_t i;
 Error *local_err = NULL;
 Chardev *chr;
-VhostUserState *user;
 CryptoDevBackendClient *cc;
 CryptoDevBackendVhostUser *s =
   CRYPTODEV_BACKEND_VHOST_USER(backend);
@@ -213,15 +212,10 @@ static void cryptodev_vhost_user_init(
 }
 }
 
-user = vhost_user_init();
-if (!user) {
-error_setg(errp, "Failed to init vhost_user");
+if (!vhost_user_init(>vhost_user, >chr, errp)) {
 return;
 }
 
-user->chr = >chr;
-s->vhost_user = user;
-
 qemu_chr_fe_set_handlers(>chr, NULL, NULL,
  cryptodev_vhost_user_event, NULL, s, NULL, true);
 
@@ -307,11 +301,7 @@ static void cryptodev_vhost_user_cleanup(
 }
 }
 
-if (s->vhost_user) {
-vhost_user_cleanup(s->vhost_user);
-g_free(s->vhost_user);
-s->vhost_user = NULL;
-}
+vhost_user_cleanup(>vhost_user);
 }
 
 static void cryptodev_vhost_user_set_chardev(Object *obj,
diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
index 44ac814016..767c734a81 100644
--- a/hw/block/vhost-user-blk.c
+++ b/hw/block/vhost-user-blk.c
@@ -253,7 +253,6 @@ static void vhost_user_blk_device_realize(DeviceState *dev, 
Error **errp)
 {
 VirtIODevice *vdev = VIRTIO_DEVICE(dev);
 VHostUserBlk *s = VHOST_USER_BLK(vdev);
-VhostUserState *user;
 struct vhost_virtqueue *vqs = NULL;
 int i, ret;
 
@@ -272,15 +271,10 @@ static void vhost_user_blk_device_realize(DeviceState 
*dev, Error **errp)
 return;
 }
 
-user = vhost_user_init();
-if (!user) {
-error_setg(errp, "vhost-user-blk: failed to init vhost_user");
+if (!vhost_user_init(>vhost_user, >chardev, errp)) {
 return;
 }
 
-user->chr = >chardev;
-s->vhost_user = user;
-
 virtio_init(vdev, "virtio-blk", VIRTIO_ID_BLOCK,
 sizeof(struct virtio_blk_config));
 
@@ -297,7 +291,7 @@ static void vhost_user_blk_device_realize(DeviceState *dev, 
Error 

[Qemu-block] [PULL 23/26] vhost-user-blk: Add support to get/set inflight buffer

2019-03-12 Thread Michael S. Tsirkin
From: Xie Yongji 

This patch adds support for vhost-user-blk device to get/set
inflight buffer from/to backend.

Signed-off-by: Xie Yongji 
Signed-off-by: Zhang Yu 
Message-Id: <20190228085355.9614-6-xieyon...@baidu.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 include/hw/virtio/vhost-user-blk.h |  1 +
 hw/block/vhost-user-blk.c  | 28 
 2 files changed, 29 insertions(+)

diff --git a/include/hw/virtio/vhost-user-blk.h 
b/include/hw/virtio/vhost-user-blk.h
index a8a106eecb..68634bee61 100644
--- a/include/hw/virtio/vhost-user-blk.h
+++ b/include/hw/virtio/vhost-user-blk.h
@@ -36,6 +36,7 @@ typedef struct VHostUserBlk {
 uint32_t queue_size;
 uint32_t config_wce;
 struct vhost_dev dev;
+struct vhost_inflight *inflight;
 VhostUserState vhost_user;
 } VHostUserBlk;
 
diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
index 767c734a81..28b81368f7 100644
--- a/hw/block/vhost-user-blk.c
+++ b/hw/block/vhost-user-blk.c
@@ -128,6 +128,21 @@ static void vhost_user_blk_start(VirtIODevice *vdev)
 }
 
 s->dev.acked_features = vdev->guest_features;
+
+if (!s->inflight->addr) {
+ret = vhost_dev_get_inflight(>dev, s->queue_size, s->inflight);
+if (ret < 0) {
+error_report("Error get inflight: %d", -ret);
+goto err_guest_notifiers;
+}
+}
+
+ret = vhost_dev_set_inflight(>dev, s->inflight);
+if (ret < 0) {
+error_report("Error set inflight: %d", -ret);
+goto err_guest_notifiers;
+}
+
 ret = vhost_dev_start(>dev, vdev);
 if (ret < 0) {
 error_report("Error starting vhost: %d", -ret);
@@ -249,6 +264,13 @@ static void vhost_user_blk_handle_output(VirtIODevice 
*vdev, VirtQueue *vq)
 }
 }
 
+static void vhost_user_blk_reset(VirtIODevice *vdev)
+{
+VHostUserBlk *s = VHOST_USER_BLK(vdev);
+
+vhost_dev_free_inflight(s->inflight);
+}
+
 static void vhost_user_blk_device_realize(DeviceState *dev, Error **errp)
 {
 VirtIODevice *vdev = VIRTIO_DEVICE(dev);
@@ -283,6 +305,8 @@ static void vhost_user_blk_device_realize(DeviceState *dev, 
Error **errp)
  vhost_user_blk_handle_output);
 }
 
+s->inflight = g_new0(struct vhost_inflight, 1);
+
 s->dev.nvqs = s->num_queues;
 s->dev.vqs = g_new(struct vhost_virtqueue, s->dev.nvqs);
 s->dev.vq_index = 0;
@@ -315,6 +339,7 @@ vhost_err:
 vhost_dev_cleanup(>dev);
 virtio_err:
 g_free(vqs);
+g_free(s->inflight);
 virtio_cleanup(vdev);
 vhost_user_cleanup(>vhost_user);
 }
@@ -327,7 +352,9 @@ static void vhost_user_blk_device_unrealize(DeviceState 
*dev, Error **errp)
 
 vhost_user_blk_set_status(vdev, 0);
 vhost_dev_cleanup(>dev);
+vhost_dev_free_inflight(s->inflight);
 g_free(vqs);
+g_free(s->inflight);
 virtio_cleanup(vdev);
 vhost_user_cleanup(>vhost_user);
 }
@@ -372,6 +399,7 @@ static void vhost_user_blk_class_init(ObjectClass *klass, 
void *data)
 vdc->set_config = vhost_user_blk_set_config;
 vdc->get_features = vhost_user_blk_get_features;
 vdc->set_status = vhost_user_blk_set_status;
+vdc->reset = vhost_user_blk_reset;
 }
 
 static const TypeInfo vhost_user_blk_info = {
-- 
MST



Re: [Qemu-block] [Qemu-devel] [PULL 04/28] qapi: drop x- from x-block-latency-histogram-set

2019-03-12 Thread Eric Blake
On 3/12/19 2:31 PM, Kevin Wolf wrote:

>> ...but patch itself is correct. Not worth a v2 pull request, since it is
>> not user-visible.
> 
> I just updated the tag without sending a formal v2, so Peter will pull
> the corrected version.

Only works insofar that we currently only require pull requests to be
signed tags, but not immutable emails (if Peter's workflow ever starts
flagging a mismatch between the commit hash mentioned in a signed pull
email request vs. where the tag actually points in the remote repo, then
we can't blindly reuse pull tags like that...)

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [Qemu-devel] [PULL 04/28] qapi: drop x- from x-block-latency-histogram-set

2019-03-12 Thread Kevin Wolf
Am 12.03.2019 um 20:03 hat Eric Blake geschrieben:
> On 3/12/19 12:30 PM, Kevin Wolf wrote:
> > From: Vladimir Sementsov-Ogievskiy 
> > 
> > Drop x- and x_ prefixes for latency histograms and update version to
> > 3.1
> 
> Commit body is stale...
> 
> > 
> > Signed-off-by: Vladimir Sementsov-Ogievskiy 
> > Signed-off-by: Kevin Wolf 
> > ---
> >  qapi/block-core.json | 20 ++--
> >  block/qapi.c | 12 ++--
> >  blockdev.c   |  2 +-
> >  3 files changed, 17 insertions(+), 17 deletions(-)
> > 
> > diff --git a/qapi/block-core.json b/qapi/block-core.json
> > index 3f0a5cb1e8..7377fecb46 100644
> > --- a/qapi/block-core.json
> > +++ b/qapi/block-core.json
> > @@ -537,13 +537,13 @@
> >  # +--
> >  # 10   50   100
> >  #
> > -# Since: 2.12
> > +# Since: 4.0
> 
> ...but patch itself is correct. Not worth a v2 pull request, since it is
> not user-visible.

I just updated the tag without sending a formal v2, so Peter will pull
the corrected version.

Kevin


signature.asc
Description: PGP signature


Re: [Qemu-block] [Qemu-devel] [PULL 04/28] qapi: drop x- from x-block-latency-histogram-set

2019-03-12 Thread Eric Blake
On 3/12/19 12:30 PM, Kevin Wolf wrote:
> From: Vladimir Sementsov-Ogievskiy 
> 
> Drop x- and x_ prefixes for latency histograms and update version to
> 3.1

Commit body is stale...

> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> Signed-off-by: Kevin Wolf 
> ---
>  qapi/block-core.json | 20 ++--
>  block/qapi.c | 12 ++--
>  blockdev.c   |  2 +-
>  3 files changed, 17 insertions(+), 17 deletions(-)
> 
> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index 3f0a5cb1e8..7377fecb46 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -537,13 +537,13 @@
>  # +--
>  # 10   50   100
>  #
> -# Since: 2.12
> +# Since: 4.0

...but patch itself is correct. Not worth a v2 pull request, since it is
not user-visible.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


[Qemu-block] [PULL 23/28] block: Allow changing the backing file on reopen

2019-03-12 Thread Kevin Wolf
From: Alberto Garcia 

This patch allows the user to change the backing file of an image that
is being reopened. Here's what it does:

 - In bdrv_reopen_prepare(): check that the value of 'backing' points
   to an existing node or is null. If it points to an existing node it
   also needs to make sure that replacing the backing file will not
   create a cycle in the node graph (i.e. you cannot reach the parent
   from the new backing file).

 - In bdrv_reopen_commit(): perform the actual node replacement by
   calling bdrv_set_backing_hd().

There may be temporary implicit nodes between a BDS and its backing
file (e.g. a commit filter node). In these cases bdrv_reopen_prepare()
looks for the real (non-implicit) backing file and requires that the
'backing' option points to it. Replacing or detaching a backing file
is forbidden if there are implicit nodes in the middle.

Although x-blockdev-reopen is meant to be used like blockdev-add,
there's an important thing that must be taken into account: the only
way to set a new backing file is by using a reference to an existing
node (previously added with e.g. blockdev-add).  If 'backing' contains
a dictionary with a new set of options ({"driver": "qcow2", "file": {
... }}) then it is interpreted that the _existing_ backing file must
be reopened with those options.

Signed-off-by: Alberto Garcia 
Signed-off-by: Kevin Wolf 
---
 include/block/block.h |   2 +
 block.c   | 166 ++
 2 files changed, 168 insertions(+)

diff --git a/include/block/block.h b/include/block/block.h
index fcc61a3a19..68a3efbb43 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -188,6 +188,8 @@ typedef struct BDRVReopenState {
 int flags;
 BlockdevDetectZeroesOptions detect_zeroes;
 bool backing_missing;
+bool replace_backing_bs;  /* new_backing_bs is ignored if this is false */
+BlockDriverState *new_backing_bs; /* If NULL then detach the current bs */
 uint64_t perm, shared_perm;
 QDict *options;
 QDict *explicit_options;
diff --git a/block.c b/block.c
index c043f5eadd..c6786cd35b 100644
--- a/block.c
+++ b/block.c
@@ -2983,6 +2983,27 @@ BlockDriverState *bdrv_open(const char *filename, const 
char *reference,
  NULL, errp);
 }
 
+/*
+ * Returns true if @child can be reached recursively from @bs
+ */
+static bool bdrv_recurse_has_child(BlockDriverState *bs,
+   BlockDriverState *child)
+{
+BdrvChild *c;
+
+if (bs == child) {
+return true;
+}
+
+QLIST_FOREACH(c, >children, next) {
+if (bdrv_recurse_has_child(c->bs, child)) {
+return true;
+}
+}
+
+return false;
+}
+
 /*
  * Adds a BlockDriverState to a simple queue for an atomic, transactional
  * reopen of multiple devices.
@@ -3208,6 +3229,19 @@ int bdrv_reopen_multiple(AioContext *ctx, 
BlockReopenQueue *bs_queue, Error **er
 if (ret < 0) {
 goto cleanup_perm;
 }
+/* Check if new_backing_bs would accept the new permissions */
+if (state->replace_backing_bs && state->new_backing_bs) {
+uint64_t nperm, nshared;
+bdrv_child_perm(state->bs, state->new_backing_bs,
+NULL, _backing, bs_queue,
+state->perm, state->shared_perm,
+, );
+ret = bdrv_check_update_perm(state->new_backing_bs, NULL,
+ nperm, nshared, NULL, errp);
+if (ret < 0) {
+goto cleanup_perm;
+}
+}
 bs_entry->perms_checked = true;
 }
 
@@ -3231,6 +3265,9 @@ cleanup_perm:
 bdrv_set_perm(state->bs, state->perm, state->shared_perm);
 } else {
 bdrv_abort_perm_update(state->bs);
+if (state->replace_backing_bs && state->new_backing_bs) {
+bdrv_abort_perm_update(state->new_backing_bs);
+}
 }
 }
 cleanup:
@@ -3242,6 +3279,9 @@ cleanup:
 qobject_unref(bs_entry->state.explicit_options);
 qobject_unref(bs_entry->state.options);
 }
+if (bs_entry->state.new_backing_bs) {
+bdrv_unref(bs_entry->state.new_backing_bs);
+}
 g_free(bs_entry);
 }
 g_free(bs_queue);
@@ -3313,6 +3353,101 @@ static void bdrv_reopen_perm(BlockReopenQueue *q, 
BlockDriverState *bs,
 *shared = cumulative_shared_perms;
 }
 
+/*
+ * Take a BDRVReopenState and check if the value of 'backing' in the
+ * reopen_state->options QDict is valid or not.
+ *
+ * If 'backing' is missing from the QDict then return 0.
+ *
+ * If 'backing' contains the node name of the backing file of
+ * reopen_state->bs then return 0.
+ *
+ * If 'backing' contains a different node name (or is null) then check
+ * whether the current backing file can be replaced with the new one.
+ * If that's the case then 

[Qemu-block] [PULL 22/28] block: Allow omitting the 'backing' option in certain cases

2019-03-12 Thread Kevin Wolf
From: Alberto Garcia 

Of all options of type BlockdevRef used to specify children in
BlockdevOptions, 'backing' is the only one that is optional.

For "x-blockdev-reopen" we want that if an option is omitted then it
must be reset to its default value. The default value of 'backing'
means that QEMU opens the backing file specified in the image
metadata, but this is not something that we want to support for the
reopen operation.

Because of this the 'backing' option has to be specified during
reopen, pointing to the existing backing file if we want to keep it,
or pointing to a different one (or NULL) if we want to replace it (to
be implemented in a subsequent patch).

In order to simplify things a bit and not to require that the user
passes the 'backing' option to every single block device even when
it's clearly not necessary, this patch allows omitting this option if
the block device being reopened doesn't have a backing file attached
_and_ no default backing file is specified in the image metadata.

Signed-off-by: Alberto Garcia 
Signed-off-by: Kevin Wolf 
---
 block.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/block.c b/block.c
index 9698f2ad44..c043f5eadd 100644
--- a/block.c
+++ b/block.c
@@ -3434,7 +3434,13 @@ int bdrv_reopen_prepare(BDRVReopenState *reopen_state, 
BlockReopenQueue *queue,
 
 drv_prepared = true;
 
-if (drv->supports_backing && reopen_state->backing_missing) {
+/*
+ * We must provide the 'backing' option if the BDS has a backing
+ * file or if the image file has a backing file name as part of
+ * its metadata. Otherwise the 'backing' option can be omitted.
+ */
+if (drv->supports_backing && reopen_state->backing_missing &&
+(backing_bs(reopen_state->bs) || reopen_state->bs->backing_file[0])) {
 error_setg(errp, "backing is missing for '%s'",
reopen_state->bs->node_name);
 ret = -EINVAL;
-- 
2.20.1




[Qemu-block] [PULL 13/28] file-posix: Prepare permission code for fd switching

2019-03-12 Thread Kevin Wolf
In order to be able to dynamically reopen the file read-only or
read-write, depending on the users that are attached, we need to be able
to switch to a different file descriptor during the permission change.

This interacts with reopen, which also creates a new file descriptor and
performs permission changes internally. In this case, the permission
change code must reuse the reopen file descriptor instead of creating a
third one.

In turn, reopen can drop its code to copy file locks to the new file
descriptor because that is now done when applying the new permissions.

Signed-off-by: Kevin Wolf 
---
 block/file-posix.c | 96 --
 1 file changed, 85 insertions(+), 11 deletions(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index 932cc8e58c..e41e0779c6 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -144,6 +144,7 @@ typedef struct BDRVRawState {
 uint64_t locked_perm;
 uint64_t locked_shared_perm;
 
+int perm_change_fd;
 BDRVReopenState *reopen_state;
 
 #ifdef CONFIG_XFS
@@ -845,7 +846,8 @@ static int raw_handle_perm_lock(BlockDriverState *bs,
 }
 
 static int raw_reconfigure_getfd(BlockDriverState *bs, int flags,
- int *open_flags, Error **errp)
+ int *open_flags, bool force_dup,
+ Error **errp)
 {
 BDRVRawState *s = bs->opaque;
 int fd = -1;
@@ -871,6 +873,11 @@ static int raw_reconfigure_getfd(BlockDriverState *bs, int 
flags,
 assert((s->open_flags & O_ASYNC) == 0);
 #endif
 
+if (!force_dup && *open_flags == s->open_flags) {
+/* We're lucky, the existing fd is fine */
+return s->fd;
+}
+
 if ((*open_flags & ~fcntl_flags) == (s->open_flags & ~fcntl_flags)) {
 /* dup the original fd */
 fd = qemu_dup(s->fd);
@@ -935,7 +942,7 @@ static int raw_reopen_prepare(BDRVReopenState *state,
 qemu_opts_to_qdict(opts, state->options);
 
 rs->fd = raw_reconfigure_getfd(state->bs, state->flags, >open_flags,
-   _err);
+   true, _err);
 if (local_err) {
 error_propagate(errp, local_err);
 ret = -1;
@@ -951,14 +958,6 @@ static int raw_reopen_prepare(BDRVReopenState *state,
 ret = -EINVAL;
 goto out_fd;
 }
-
-/* Copy locks to the new fd */
-ret = raw_apply_lock_bytes(NULL, rs->fd, s->locked_perm,
-   s->locked_shared_perm, false, errp);
-if (ret < 0) {
-ret = -EINVAL;
-goto out_fd;
-}
 }
 
 s->reopen_state = state;
@@ -2696,12 +2695,78 @@ static QemuOptsList raw_create_opts = {
 static int raw_check_perm(BlockDriverState *bs, uint64_t perm, uint64_t shared,
   Error **errp)
 {
-return raw_handle_perm_lock(bs, RAW_PL_PREPARE, perm, shared, errp);
+BDRVRawState *s = bs->opaque;
+BDRVRawReopenState *rs = NULL;
+int open_flags;
+int ret;
+
+if (s->perm_change_fd) {
+/*
+ * In the context of reopen, this function may be called several times
+ * (directly and recursively while change permissions of the parent).
+ * This is even true for children that don't inherit from the original
+ * reopen node, so s->reopen_state is not set.
+ *
+ * Ignore all but the first call.
+ */
+return 0;
+}
+
+if (s->reopen_state) {
+/* We already have a new file descriptor to set permissions for */
+assert(s->reopen_state->perm == perm);
+assert(s->reopen_state->shared_perm == shared);
+rs = s->reopen_state->opaque;
+s->perm_change_fd = rs->fd;
+} else {
+/* We may need a new fd if auto-read-only switches the mode */
+ret = raw_reconfigure_getfd(bs, bs->open_flags, _flags,
+false, errp);
+if (ret < 0) {
+return ret;
+} else if (ret != s->fd) {
+s->perm_change_fd = ret;
+}
+}
+
+/* Prepare permissions on old fd to avoid conflicts between old and new,
+ * but keep everything locked that new will need. */
+ret = raw_handle_perm_lock(bs, RAW_PL_PREPARE, perm, shared, errp);
+if (ret < 0) {
+goto fail;
+}
+
+/* Copy locks to the new fd */
+if (s->perm_change_fd) {
+ret = raw_apply_lock_bytes(NULL, s->perm_change_fd, perm, ~shared,
+   false, errp);
+if (ret < 0) {
+raw_handle_perm_lock(bs, RAW_PL_ABORT, 0, 0, NULL);
+goto fail;
+}
+}
+return 0;
+
+fail:
+if (s->perm_change_fd && !s->reopen_state) {
+qemu_close(s->perm_change_fd);
+}
+s->perm_change_fd = 0;
+return ret;
 }
 
 static void raw_set_perm(BlockDriverState *bs, uint64_t perm, uint64_t shared)
 {
 BDRVRawState *s = bs->opaque;
+

[Qemu-block] [PULL 28/28] qemu-iotests: Test the x-blockdev-reopen QMP command

2019-03-12 Thread Kevin Wolf
From: Alberto Garcia 

This patch adds several tests for the x-blockdev-reopen QMP command.

Signed-off-by: Alberto Garcia 
Signed-off-by: Kevin Wolf 
---
 tests/qemu-iotests/245 | 991 +
 tests/qemu-iotests/245.out |   5 +
 tests/qemu-iotests/group   |   1 +
 3 files changed, 997 insertions(+)
 create mode 100644 tests/qemu-iotests/245
 create mode 100644 tests/qemu-iotests/245.out

diff --git a/tests/qemu-iotests/245 b/tests/qemu-iotests/245
new file mode 100644
index 00..7891a210c1
--- /dev/null
+++ b/tests/qemu-iotests/245
@@ -0,0 +1,991 @@
+#!/usr/bin/env python
+#
+# Test cases for the QMP 'x-blockdev-reopen' command
+#
+# Copyright (C) 2018-2019 Igalia, S.L.
+# Author: Alberto Garcia 
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see .
+#
+
+import os
+import re
+import iotests
+import copy
+import json
+from iotests import qemu_img, qemu_io
+
+hd_path = [
+os.path.join(iotests.test_dir, 'hd0.img'),
+os.path.join(iotests.test_dir, 'hd1.img'),
+os.path.join(iotests.test_dir, 'hd2.img')
+]
+
+def hd_opts(idx):
+return {'driver': iotests.imgfmt,
+'node-name': 'hd%d' % idx,
+'file': {'driver': 'file',
+ 'node-name': 'hd%d-file' % idx,
+ 'filename':  hd_path[idx] } }
+
+class TestBlockdevReopen(iotests.QMPTestCase):
+total_io_cmds = 0
+
+def setUp(self):
+qemu_img('create', '-f', iotests.imgfmt, hd_path[0], '3M')
+qemu_img('create', '-f', iotests.imgfmt, '-b', hd_path[0], hd_path[1])
+qemu_img('create', '-f', iotests.imgfmt, hd_path[2], '3M')
+qemu_io('-f', iotests.imgfmt, '-c', 'write -P 0xa0  0 1M', hd_path[0])
+qemu_io('-f', iotests.imgfmt, '-c', 'write -P 0xa1 1M 1M', hd_path[1])
+qemu_io('-f', iotests.imgfmt, '-c', 'write -P 0xa2 2M 1M', hd_path[2])
+self.vm = iotests.VM()
+self.vm.launch()
+
+def tearDown(self):
+self.vm.shutdown()
+self.check_qemu_io_errors()
+os.remove(hd_path[0])
+os.remove(hd_path[1])
+os.remove(hd_path[2])
+
+# The output of qemu-io is not returned by vm.hmp_qemu_io() but
+# it's stored in the log and can only be read when the VM has been
+# shut down. This function runs qemu-io and keeps track of the
+# number of times it's been called.
+def run_qemu_io(self, img, cmd):
+result = self.vm.hmp_qemu_io(img, cmd)
+self.assert_qmp(result, 'return', '')
+self.total_io_cmds += 1
+
+# Once the VM is shut down we can parse the log and see if qemu-io
+# ran without errors.
+def check_qemu_io_errors(self):
+self.assertFalse(self.vm.is_running())
+found = 0
+log = self.vm.get_log()
+for line in log.split("\n"):
+if line.startswith("Pattern verification failed"):
+raise Exception("%s (command #%d)" % (line, found))
+if re.match("read .*/.* bytes at offset", line):
+found += 1
+self.assertEqual(found, self.total_io_cmds,
+ "Expected output of %d qemu-io commands, found %d" %
+ (found, self.total_io_cmds))
+
+# Run x-blockdev-reopen with 'opts' but applying 'newopts'
+# on top of it. The original 'opts' dict is unmodified
+def reopen(self, opts, newopts = {}, errmsg = None):
+opts = copy.deepcopy(opts)
+
+# Apply the changes from 'newopts' on top of 'opts'
+for key in newopts:
+value = newopts[key]
+# If key has the form "foo.bar" then we need to do
+# opts["foo"]["bar"] = value, not opts["foo.bar"] = value
+subdict = opts
+while key.find('.') != -1:
+[prefix, key] = key.split('.', 1)
+subdict = opts[prefix]
+subdict[key] = value
+
+result = self.vm.qmp('x-blockdev-reopen', conv_keys = False, **opts)
+if errmsg:
+self.assert_qmp(result, 'error/class', 'GenericError')
+self.assert_qmp(result, 'error/desc', errmsg)
+else:
+self.assert_qmp(result, 'return', {})
+
+
+# Run query-named-block-nodes and return the specified entry
+def get_node(self, node_name):
+result = self.vm.qmp('query-named-block-nodes')
+for node in result['return']:
+if 

[Qemu-block] [PULL 27/28] block: Add an 'x-blockdev-reopen' QMP command

2019-03-12 Thread Kevin Wolf
From: Alberto Garcia 

This command allows reopening an arbitrary BlockDriverState with a
new set of options. Some options (e.g node-name) cannot be changed
and some block drivers don't allow reopening, but otherwise this
command is modelled after 'blockdev-add' and the state of the reopened
BlockDriverState should generally be the same as if it had just been
added by 'blockdev-add' with the same set of options.

One notable exception is the 'backing' option: 'x-blockdev-reopen'
requires that it is always present unless the BlockDriverState in
question doesn't have a current or default backing file.

This command allows reconfiguring the graph by using the appropriate
options to change the children of a node. At the moment it's possible
to change a backing file by setting the 'backing' option to the name
of the new node, but it should also be possible to add a similar
functionality to other block drivers (e.g. Quorum, blkverify).

Although the API is unlikely to change, this command is marked
experimental for the time being so there's room to see if the
semantics need changes.

Signed-off-by: Alberto Garcia 
Signed-off-by: Kevin Wolf 
---
 qapi/block-core.json | 42 +++
 blockdev.c   | 47 
 2 files changed, 89 insertions(+)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 7377fecb46..474e268c4d 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -3997,6 +3997,48 @@
 ##
 { 'command': 'blockdev-add', 'data': 'BlockdevOptions', 'boxed': true }
 
+##
+# @x-blockdev-reopen:
+#
+# Reopens a block device using the given set of options. Any option
+# not specified will be reset to its default value regardless of its
+# previous status. If an option cannot be changed or a particular
+# driver does not support reopening then the command will return an
+# error.
+#
+# The top-level @node-name option (from BlockdevOptions) must be
+# specified and is used to select the block device to be reopened.
+# Other @node-name options must be either omitted or set to the
+# current name of the appropriate node. This command won't change any
+# node name and any attempt to do it will result in an error.
+#
+# In the case of options that refer to child nodes, the behavior of
+# this command depends on the value:
+#
+#  1) A set of options (BlockdevOptions): the child is reopened with
+# the specified set of options.
+#
+#  2) A reference to the current child: the child is reopened using
+# its existing set of options.
+#
+#  3) A reference to a different node: the current child is replaced
+# with the specified one.
+#
+#  4) NULL: the current child (if any) is detached.
+#
+# Options (1) and (2) are supported in all cases, but at the moment
+# only @backing allows replacing or detaching an existing child.
+#
+# Unlike with blockdev-add, the @backing option must always be present
+# unless the node being reopened does not have a backing file and its
+# image does not have a default backing file name as part of its
+# metadata.
+#
+# Since: 4.0
+##
+{ 'command': 'x-blockdev-reopen',
+  'data': 'BlockdevOptions', 'boxed': true }
+
 ##
 # @blockdev-del:
 #
diff --git a/blockdev.c b/blockdev.c
index 84dd678289..3cc51b2ee7 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -4287,6 +4287,53 @@ fail:
 visit_free(v);
 }
 
+void qmp_x_blockdev_reopen(BlockdevOptions *options, Error **errp)
+{
+BlockDriverState *bs;
+AioContext *ctx;
+QObject *obj;
+Visitor *v = qobject_output_visitor_new();
+Error *local_err = NULL;
+BlockReopenQueue *queue;
+QDict *qdict;
+
+/* Check for the selected node name */
+if (!options->has_node_name) {
+error_setg(errp, "Node name not specified");
+goto fail;
+}
+
+bs = bdrv_find_node(options->node_name);
+if (!bs) {
+error_setg(errp, "Cannot find node named '%s'", options->node_name);
+goto fail;
+}
+
+/* Put all options in a QDict and flatten it */
+visit_type_BlockdevOptions(v, NULL, , _err);
+if (local_err) {
+error_propagate(errp, local_err);
+goto fail;
+}
+
+visit_complete(v, );
+qdict = qobject_to(QDict, obj);
+
+qdict_flatten(qdict);
+
+/* Perform the reopen operation */
+ctx = bdrv_get_aio_context(bs);
+aio_context_acquire(ctx);
+bdrv_subtree_drained_begin(bs);
+queue = bdrv_reopen_queue(NULL, bs, qdict, false);
+bdrv_reopen_multiple(queue, errp);
+bdrv_subtree_drained_end(bs);
+aio_context_release(ctx);
+
+fail:
+visit_free(v);
+}
+
 void qmp_blockdev_del(const char *node_name, Error **errp)
 {
 AioContext *aio_context;
-- 
2.20.1




[Qemu-block] [PULL 26/28] block: Remove the AioContext parameter from bdrv_reopen_multiple()

2019-03-12 Thread Kevin Wolf
From: Alberto Garcia 

This parameter has been unused since 1a63a907507fbbcfaee3f622907ec244b

Signed-off-by: Alberto Garcia 
Signed-off-by: Kevin Wolf 
---
 include/block/block.h | 2 +-
 block.c   | 4 ++--
 block/replication.c   | 3 +--
 qemu-io-cmds.c| 2 +-
 4 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index 68a3efbb43..e452988b66 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -304,7 +304,7 @@ BlockDriverState *bdrv_new_open_driver(BlockDriver *drv, 
const char *node_name,
 BlockReopenQueue *bdrv_reopen_queue(BlockReopenQueue *bs_queue,
 BlockDriverState *bs, QDict *options,
 bool keep_old_opts);
-int bdrv_reopen_multiple(AioContext *ctx, BlockReopenQueue *bs_queue, Error 
**errp);
+int bdrv_reopen_multiple(BlockReopenQueue *bs_queue, Error **errp);
 int bdrv_reopen_set_read_only(BlockDriverState *bs, bool read_only,
   Error **errp);
 int bdrv_reopen_prepare(BDRVReopenState *reopen_state,
diff --git a/block.c b/block.c
index a5e69ad81e..ed9253c786 100644
--- a/block.c
+++ b/block.c
@@ -3254,7 +3254,7 @@ BlockReopenQueue *bdrv_reopen_queue(BlockReopenQueue 
*bs_queue,
  * All affected nodes must be drained between bdrv_reopen_queue() and
  * bdrv_reopen_multiple().
  */
-int bdrv_reopen_multiple(AioContext *ctx, BlockReopenQueue *bs_queue, Error 
**errp)
+int bdrv_reopen_multiple(BlockReopenQueue *bs_queue, Error **errp)
 {
 int ret = -1;
 BlockReopenQueueEntry *bs_entry, *next;
@@ -3347,7 +3347,7 @@ int bdrv_reopen_set_read_only(BlockDriverState *bs, bool 
read_only,
 
 bdrv_subtree_drained_begin(bs);
 queue = bdrv_reopen_queue(NULL, bs, opts, true);
-ret = bdrv_reopen_multiple(bdrv_get_aio_context(bs), queue, errp);
+ret = bdrv_reopen_multiple(queue, errp);
 bdrv_subtree_drained_end(bs);
 
 return ret;
diff --git a/block/replication.c b/block/replication.c
index a2f3590310..b95bd28802 100644
--- a/block/replication.c
+++ b/block/replication.c
@@ -385,8 +385,7 @@ static void reopen_backing_file(BlockDriverState *bs, bool 
writable,
 }
 
 if (reopen_queue) {
-bdrv_reopen_multiple(bdrv_get_aio_context(bs),
- reopen_queue, _err);
+bdrv_reopen_multiple(reopen_queue, _err);
 error_propagate(errp, local_err);
 }
 
diff --git a/qemu-io-cmds.c b/qemu-io-cmds.c
index ff9a5cd80f..35dcdcf413 100644
--- a/qemu-io-cmds.c
+++ b/qemu-io-cmds.c
@@ -2081,7 +2081,7 @@ static int reopen_f(BlockBackend *blk, int argc, char 
**argv)
 
 bdrv_subtree_drained_begin(bs);
 brq = bdrv_reopen_queue(NULL, bs, opts, true);
-bdrv_reopen_multiple(bdrv_get_aio_context(bs), brq, _err);
+bdrv_reopen_multiple(brq, _err);
 bdrv_subtree_drained_end(bs);
 
 if (local_err) {
-- 
2.20.1




[Qemu-block] [PULL 10/28] file-posix: Factor out raw_reconfigure_getfd()

2019-03-12 Thread Kevin Wolf
Signed-off-by: Kevin Wolf 
---
 block/file-posix.c | 107 ++---
 1 file changed, 62 insertions(+), 45 deletions(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index ba6ab62a38..ae57ba1fc6 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -842,6 +842,62 @@ static int raw_handle_perm_lock(BlockDriverState *bs,
 return ret;
 }
 
+static int raw_reconfigure_getfd(BlockDriverState *bs, int flags,
+ int *open_flags, Error **errp)
+{
+BDRVRawState *s = bs->opaque;
+int fd = -1;
+int ret;
+int fcntl_flags = O_APPEND | O_NONBLOCK;
+#ifdef O_NOATIME
+fcntl_flags |= O_NOATIME;
+#endif
+
+*open_flags = 0;
+if (s->type == FTYPE_CD) {
+*open_flags |= O_NONBLOCK;
+}
+
+raw_parse_flags(flags, open_flags);
+
+#ifdef O_ASYNC
+/* Not all operating systems have O_ASYNC, and those that don't
+ * will not let us track the state into rs->open_flags (typically
+ * you achieve the same effect with an ioctl, for example I_SETSIG
+ * on Solaris). But we do not use O_ASYNC, so that's fine.
+ */
+assert((s->open_flags & O_ASYNC) == 0);
+#endif
+
+if ((*open_flags & ~fcntl_flags) == (s->open_flags & ~fcntl_flags)) {
+/* dup the original fd */
+fd = qemu_dup(s->fd);
+if (fd >= 0) {
+ret = fcntl_setfl(fd, *open_flags);
+if (ret) {
+qemu_close(fd);
+fd = -1;
+}
+}
+}
+
+/* If we cannot use fcntl, or fcntl failed, fall back to qemu_open() */
+if (fd == -1) {
+const char *normalized_filename = bs->filename;
+ret = raw_normalize_devicepath(_filename, errp);
+if (ret >= 0) {
+assert(!(*open_flags & O_CREAT));
+fd = qemu_open(normalized_filename, *open_flags);
+if (fd == -1) {
+error_setg_errno(errp, errno, "Could not reopen file");
+return -1;
+}
+}
+}
+
+return fd;
+}
+
 static int raw_reopen_prepare(BDRVReopenState *state,
   BlockReopenQueue *queue, Error **errp)
 {
@@ -858,7 +914,6 @@ static int raw_reopen_prepare(BDRVReopenState *state,
 
 state->opaque = g_new0(BDRVRawReopenState, 1);
 rs = state->opaque;
-rs->fd = -1;
 
 /* Handle options changes */
 opts = qemu_opts_create(_runtime_opts, NULL, 0, _abort);
@@ -877,50 +932,12 @@ static int raw_reopen_prepare(BDRVReopenState *state,
  * bdrv_reopen_prepare() will detect changes and complain. */
 qemu_opts_to_qdict(opts, state->options);
 
-if (s->type == FTYPE_CD) {
-rs->open_flags |= O_NONBLOCK;
-}
-
-raw_parse_flags(state->flags, >open_flags);
-
-int fcntl_flags = O_APPEND | O_NONBLOCK;
-#ifdef O_NOATIME
-fcntl_flags |= O_NOATIME;
-#endif
-
-#ifdef O_ASYNC
-/* Not all operating systems have O_ASYNC, and those that don't
- * will not let us track the state into rs->open_flags (typically
- * you achieve the same effect with an ioctl, for example I_SETSIG
- * on Solaris). But we do not use O_ASYNC, so that's fine.
- */
-assert((s->open_flags & O_ASYNC) == 0);
-#endif
-
-if ((rs->open_flags & ~fcntl_flags) == (s->open_flags & ~fcntl_flags)) {
-/* dup the original fd */
-rs->fd = qemu_dup(s->fd);
-if (rs->fd >= 0) {
-ret = fcntl_setfl(rs->fd, rs->open_flags);
-if (ret) {
-qemu_close(rs->fd);
-rs->fd = -1;
-}
-}
-}
-
-/* If we cannot use fcntl, or fcntl failed, fall back to qemu_open() */
-if (rs->fd == -1) {
-const char *normalized_filename = state->bs->filename;
-ret = raw_normalize_devicepath(_filename, errp);
-if (ret >= 0) {
-assert(!(rs->open_flags & O_CREAT));
-rs->fd = qemu_open(normalized_filename, rs->open_flags);
-if (rs->fd == -1) {
-error_setg_errno(errp, errno, "Could not reopen file");
-ret = -1;
-}
-}
+rs->fd = raw_reconfigure_getfd(state->bs, state->flags, >open_flags,
+   _err);
+if (local_err) {
+error_propagate(errp, local_err);
+ret = -1;
+goto out;
 }
 
 /* Fail already reopen_prepare() if we can't get a working O_DIRECT
-- 
2.20.1




[Qemu-block] [PULL 25/28] block: Add bdrv_reset_options_allowed()

2019-03-12 Thread Kevin Wolf
From: Alberto Garcia 

bdrv_reopen_prepare() receives a BDRVReopenState with (among other
things) a new set of options to be applied to that BlockDriverState.

If an option is missing then it means that we want to reset it to its
default value rather than keep the previous one. This way the state
of the block device after being reopened is comparable to that of a
device added with "blockdev-add" using the same set of options.

Not all options from all drivers can be changed this way, however.
If the user attempts to reset an immutable option to its default value
using this method then we must forbid it.

This new function takes a BlockDriverState and a new set of options
and checks if there's any option that was previously set but is
missing from the new set of options.

If the option is present in both sets we don't need to check that they
have the same value. The loop at the end of bdrv_reopen_prepare()
already takes care of that.

Signed-off-by: Alberto Garcia 
Signed-off-by: Kevin Wolf 
---
 block.c | 58 +
 1 file changed, 58 insertions(+)

diff --git a/block.c b/block.c
index c6786cd35b..a5e69ad81e 100644
--- a/block.c
+++ b/block.c
@@ -2983,6 +2983,53 @@ BlockDriverState *bdrv_open(const char *filename, const 
char *reference,
  NULL, errp);
 }
 
+/* Return true if the NULL-terminated @list contains @str */
+static bool is_str_in_list(const char *str, const char *const *list)
+{
+if (str && list) {
+int i;
+for (i = 0; list[i] != NULL; i++) {
+if (!strcmp(str, list[i])) {
+return true;
+}
+}
+}
+return false;
+}
+
+/*
+ * Check that every option set in @bs->options is also set in
+ * @new_opts.
+ *
+ * Options listed in the common_options list and in
+ * @bs->drv->mutable_opts are skipped.
+ *
+ * Return 0 on success, otherwise return -EINVAL and set @errp.
+ */
+static int bdrv_reset_options_allowed(BlockDriverState *bs,
+  const QDict *new_opts, Error **errp)
+{
+const QDictEntry *e;
+/* These options are common to all block drivers and are handled
+ * in bdrv_reopen_prepare() so they can be left out of @new_opts */
+const char *const common_options[] = {
+"node-name", "discard", "cache.direct", "cache.no-flush",
+"read-only", "auto-read-only", "detect-zeroes", NULL
+};
+
+for (e = qdict_first(bs->options); e; e = qdict_next(bs->options, e)) {
+if (!qdict_haskey(new_opts, e->key) &&
+!is_str_in_list(e->key, common_options) &&
+!is_str_in_list(e->key, bs->drv->mutable_opts)) {
+error_setg(errp, "Option '%s' cannot be reset "
+   "to its default value", e->key);
+return -EINVAL;
+}
+}
+
+return 0;
+}
+
 /*
  * Returns true if @child can be reached recursively from @bs
  */
@@ -3546,6 +3593,17 @@ int bdrv_reopen_prepare(BDRVReopenState *reopen_state, 
BlockReopenQueue *queue,
 }
 
 if (drv->bdrv_reopen_prepare) {
+/*
+ * If a driver-specific option is missing, it means that we
+ * should reset it to its default value.
+ * But not all options allow that, so we need to check it first.
+ */
+ret = bdrv_reset_options_allowed(reopen_state->bs,
+ reopen_state->options, errp);
+if (ret) {
+goto error;
+}
+
 ret = drv->bdrv_reopen_prepare(reopen_state, queue, _err);
 if (ret) {
 if (local_err != NULL) {
-- 
2.20.1




[Qemu-block] [PULL 06/28] qemu-iotests: commit to backing file with auto-read-only

2019-03-12 Thread Kevin Wolf
Signed-off-by: Kevin Wolf 
Reviewed-by: Eric Blake 
---
 tests/qemu-iotests/232 | 31 +++
 tests/qemu-iotests/232.out | 20 
 2 files changed, 51 insertions(+)

diff --git a/tests/qemu-iotests/232 b/tests/qemu-iotests/232
index 71fd48eff0..0de097fc88 100755
--- a/tests/qemu-iotests/232
+++ b/tests/qemu-iotests/232
@@ -29,6 +29,7 @@ status=1  # failure is the default!
 _cleanup()
 {
 _cleanup_test_img
+rm -f $TEST_IMG.[01234]
 }
 trap "_cleanup; exit \$status" 0 1 2 3 15
 
@@ -143,6 +144,36 @@ run_qemu_info_block -blockdev 
driver=file,filename="$TEST_IMG",node-name=node0,a
 run_qemu_info_block -blockdev 
driver=file,filename="$TEST_IMG",node-name=node0,auto-read-only=on
 run_qemu_info_block -blockdev driver=file,filename="$TEST_IMG",node-name=node0
 
+echo
+echo "=== Try commit to backing file with auto-read-only ==="
+echo
+
+TEST_IMG="$TEST_IMG.0" _make_test_img $size
+TEST_IMG="$TEST_IMG.1" _make_test_img $size
+TEST_IMG="$TEST_IMG.2" _make_test_img $size
+TEST_IMG="$TEST_IMG.3" _make_test_img $size
+TEST_IMG="$TEST_IMG.4" _make_test_img $size
+
+(cat <

[Qemu-block] [PULL 01/28] gluster: Handle changed glfs_ftruncate signature

2019-03-12 Thread Kevin Wolf
From: Prasanna Kumar Kalever 

New versions of Glusters libgfapi.so have an updated glfs_ftruncate()
function that returns additional 'struct stat' structures to enable
advanced caching of attributes. This is useful for file servers, not so
much for QEMU. Nevertheless, the API has changed and needs to be
adopted.

Signed-off-by: Prasanna Kumar Kalever 
Signed-off-by: Niels de Vos 
Signed-off-by: Kevin Wolf 
---
 configure   | 18 ++
 block/gluster.c |  4 
 2 files changed, 22 insertions(+)

diff --git a/configure b/configure
index cab830a4c9..5354d51930 100755
--- a/configure
+++ b/configure
@@ -456,6 +456,7 @@ glusterfs_xlator_opt="no"
 glusterfs_discard="no"
 glusterfs_fallocate="no"
 glusterfs_zerofill="no"
+glusterfs_ftruncate_has_stat="no"
 gtk=""
 gtk_gl="no"
 tls_priority="NORMAL"
@@ -4091,6 +4092,19 @@ if test "$glusterfs" != "no" ; then
   glusterfs_fallocate="yes"
   glusterfs_zerofill="yes"
 fi
+cat > $TMPC << EOF
+#include 
+
+int
+main(void)
+{
+   /* new glfs_ftruncate() passes two additional args */
+   return glfs_ftruncate(NULL, 0, NULL, NULL);
+}
+EOF
+if compile_prog "$glusterfs_cflags" "$glusterfs_libs" ; then
+  glusterfs_ftruncate_has_stat="yes"
+fi
   else
 if test "$glusterfs" = "yes" ; then
   feature_not_found "GlusterFS backend support" \
@@ -6976,6 +6990,10 @@ if test "$glusterfs_zerofill" = "yes" ; then
   echo "CONFIG_GLUSTERFS_ZEROFILL=y" >> $config_host_mak
 fi
 
+if test "$glusterfs_ftruncate_has_stat" = "yes" ; then
+  echo "CONFIG_GLUSTERFS_FTRUNCATE_HAS_STAT=y" >> $config_host_mak
+fi
+
 if test "$libssh2" = "yes" ; then
   echo "CONFIG_LIBSSH2=m" >> $config_host_mak
   echo "LIBSSH2_CFLAGS=$libssh2_cflags" >> $config_host_mak
diff --git a/block/gluster.c b/block/gluster.c
index af64330211..f853aa87f4 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -20,6 +20,10 @@
 #include "qemu/option.h"
 #include "qemu/cutils.h"
 
+#ifdef CONFIG_GLUSTERFS_FTRUNCATE_HAS_STAT
+# define glfs_ftruncate(fd, offset) glfs_ftruncate(fd, offset, NULL, NULL)
+#endif
+
 #define GLUSTER_OPT_FILENAME"filename"
 #define GLUSTER_OPT_VOLUME  "volume"
 #define GLUSTER_OPT_PATH"path"
-- 
2.20.1




[Qemu-block] [PULL 04/28] qapi: drop x- from x-block-latency-histogram-set

2019-03-12 Thread Kevin Wolf
From: Vladimir Sementsov-Ogievskiy 

Drop x- and x_ prefixes for latency histograms and update version to
3.1

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Signed-off-by: Kevin Wolf 
---
 qapi/block-core.json | 20 ++--
 block/qapi.c | 12 ++--
 blockdev.c   |  2 +-
 3 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 3f0a5cb1e8..7377fecb46 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -537,13 +537,13 @@
 # +--
 # 10   50   100
 #
-# Since: 2.12
+# Since: 4.0
 ##
 { 'struct': 'BlockLatencyHistogramInfo',
   'data': {'boundaries': ['uint64'], 'bins': ['uint64'] } }
 
 ##
-# @x-block-latency-histogram-set:
+# @block-latency-histogram-set:
 #
 # Manage read, write and flush latency histograms for the device.
 #
@@ -573,7 +573,7 @@
 #
 # Returns: error if device is not found or any boundary arrays are invalid.
 #
-# Since: 2.12
+# Since: 4.0
 #
 # Example: set new histograms for all io types with intervals
 # [0, 10), [10, 50), [50, 100), [100, +inf):
@@ -607,7 +607,7 @@
 #  "arguments": { "device": "drive0" } }
 # <- { "return": {} }
 ##
-{ 'command': 'x-block-latency-histogram-set',
+{ 'command': 'block-latency-histogram-set',
   'data': {'id': 'str',
'*boundaries': ['uint64'],
'*boundaries-read': ['uint64'],
@@ -894,11 +894,11 @@
 # @timed_stats: Statistics specific to the set of previously defined
 #   intervals of time (Since 2.5)
 #
-# @x_rd_latency_histogram: @BlockLatencyHistogramInfo. (Since 2.12)
+# @rd_latency_histogram: @BlockLatencyHistogramInfo. (Since 4.0)
 #
-# @x_wr_latency_histogram: @BlockLatencyHistogramInfo. (Since 2.12)
+# @wr_latency_histogram: @BlockLatencyHistogramInfo. (Since 4.0)
 #
-# @x_flush_latency_histogram: @BlockLatencyHistogramInfo. (Since 2.12)
+# @flush_latency_histogram: @BlockLatencyHistogramInfo. (Since 4.0)
 #
 # Since: 0.14.0
 ##
@@ -913,9 +913,9 @@
'invalid_wr_operations': 'int', 'invalid_flush_operations': 'int',
'account_invalid': 'bool', 'account_failed': 'bool',
'timed_stats': ['BlockDeviceTimedStats'],
-   '*x_rd_latency_histogram': 'BlockLatencyHistogramInfo',
-   '*x_wr_latency_histogram': 'BlockLatencyHistogramInfo',
-   '*x_flush_latency_histogram': 'BlockLatencyHistogramInfo' } }
+   '*rd_latency_histogram': 'BlockLatencyHistogramInfo',
+   '*wr_latency_histogram': 'BlockLatencyHistogramInfo',
+   '*flush_latency_histogram': 'BlockLatencyHistogramInfo' } }
 
 ##
 # @BlockStats:
diff --git a/block/qapi.c b/block/qapi.c
index 6002a768f8..21edab34fc 100644
--- a/block/qapi.c
+++ b/block/qapi.c
@@ -493,14 +493,14 @@ static void bdrv_query_blk_stats(BlockDeviceStats *ds, 
BlockBackend *blk)
 }
 
 bdrv_latency_histogram_stats(>latency_histogram[BLOCK_ACCT_READ],
- >has_x_rd_latency_histogram,
- >x_rd_latency_histogram);
+ >has_rd_latency_histogram,
+ >rd_latency_histogram);
 bdrv_latency_histogram_stats(>latency_histogram[BLOCK_ACCT_WRITE],
- >has_x_wr_latency_histogram,
- >x_wr_latency_histogram);
+ >has_wr_latency_histogram,
+ >wr_latency_histogram);
 bdrv_latency_histogram_stats(>latency_histogram[BLOCK_ACCT_FLUSH],
- >has_x_flush_latency_histogram,
- >x_flush_latency_histogram);
+ >has_flush_latency_histogram,
+ >flush_latency_histogram);
 }
 
 static BlockStats *bdrv_query_bds_stats(BlockDriverState *bs,
diff --git a/blockdev.c b/blockdev.c
index 850fb34c52..84dd678289 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -4452,7 +4452,7 @@ void qmp_x_blockdev_set_iothread(const char *node_name, 
StrOrNull *iothread,
 aio_context_release(old_context);
 }
 
-void qmp_x_block_latency_histogram_set(
+void qmp_block_latency_histogram_set(
 const char *id,
 bool has_boundaries, uint64List *boundaries,
 bool has_boundaries_read, uint64List *boundaries_read,
-- 
2.20.1




[Qemu-block] [PULL 24/28] block: Add a 'mutable_opts' field to BlockDriver

2019-03-12 Thread Kevin Wolf
From: Alberto Garcia 

If we reopen a BlockDriverState and there is an option that is present
in bs->options but missing from the new set of options then we have to
return an error unless the driver is able to reset it to its default
value.

This patch adds a new 'mutable_opts' field to BlockDriver. This is
a list of runtime options that can be modified during reopen. If an
option in this list is unspecified on reopen then it must be reset (or
return an error).

Signed-off-by: Alberto Garcia 
Signed-off-by: Kevin Wolf 
---
 include/block/block_int.h |  8 
 block/file-posix.c|  6 ++
 block/qcow2.c | 25 +
 block/raw-format.c|  3 +++
 4 files changed, 42 insertions(+)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index 438feba4e5..01e855a066 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -383,6 +383,14 @@ struct BlockDriver {
 
 /* List of options for creating images, terminated by name == NULL */
 QemuOptsList *create_opts;
+/*
+ * If this driver supports reopening images this contains a
+ * NULL-terminated list of the runtime options that can be
+ * modified. If an option in this list is unspecified during
+ * reopen then it _must_ be reset to its default value or return
+ * an error.
+ */
+const char *const *mutable_opts;
 
 /*
  * Returns 0 for completed check, -errno for internal errors.
diff --git a/block/file-posix.c b/block/file-posix.c
index d41059d139..e9fa6aac48 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -442,6 +442,8 @@ static QemuOptsList raw_runtime_opts = {
 },
 };
 
+static const char *const mutable_opts[] = { "x-check-cache-dropped", NULL };
+
 static int raw_open_common(BlockDriverState *bs, QDict *options,
int bdrv_flags, int open_flags,
bool device, Error **errp)
@@ -2870,6 +2872,7 @@ BlockDriver bdrv_file = {
 .bdrv_set_perm   = raw_set_perm,
 .bdrv_abort_perm_update = raw_abort_perm_update,
 .create_opts = _create_opts,
+.mutable_opts = mutable_opts,
 };
 
 /***/
@@ -3321,6 +3324,7 @@ static BlockDriver bdrv_host_device = {
 .bdrv_reopen_abort   = raw_reopen_abort,
 .bdrv_co_create_opts = hdev_co_create_opts,
 .create_opts = _create_opts,
+.mutable_opts= mutable_opts,
 .bdrv_co_invalidate_cache = raw_co_invalidate_cache,
 .bdrv_co_pwrite_zeroes = hdev_co_pwrite_zeroes,
 
@@ -3447,6 +3451,7 @@ static BlockDriver bdrv_host_cdrom = {
 .bdrv_reopen_abort   = raw_reopen_abort,
 .bdrv_co_create_opts = hdev_co_create_opts,
 .create_opts = _create_opts,
+.mutable_opts= mutable_opts,
 .bdrv_co_invalidate_cache = raw_co_invalidate_cache,
 
 
@@ -3580,6 +3585,7 @@ static BlockDriver bdrv_host_cdrom = {
 .bdrv_reopen_abort   = raw_reopen_abort,
 .bdrv_co_create_opts = hdev_co_create_opts,
 .create_opts= _create_opts,
+.mutable_opts   = mutable_opts,
 
 .bdrv_co_preadv = raw_co_preadv,
 .bdrv_co_pwritev= raw_co_pwritev,
diff --git a/block/qcow2.c b/block/qcow2.c
index c4dd876fb4..0fc9b0561e 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -627,6 +627,30 @@ int qcow2_validate_table(BlockDriverState *bs, uint64_t 
offset,
 return 0;
 }
 
+static const char *const mutable_opts[] = {
+QCOW2_OPT_LAZY_REFCOUNTS,
+QCOW2_OPT_DISCARD_REQUEST,
+QCOW2_OPT_DISCARD_SNAPSHOT,
+QCOW2_OPT_DISCARD_OTHER,
+QCOW2_OPT_OVERLAP,
+QCOW2_OPT_OVERLAP_TEMPLATE,
+QCOW2_OPT_OVERLAP_MAIN_HEADER,
+QCOW2_OPT_OVERLAP_ACTIVE_L1,
+QCOW2_OPT_OVERLAP_ACTIVE_L2,
+QCOW2_OPT_OVERLAP_REFCOUNT_TABLE,
+QCOW2_OPT_OVERLAP_REFCOUNT_BLOCK,
+QCOW2_OPT_OVERLAP_SNAPSHOT_TABLE,
+QCOW2_OPT_OVERLAP_INACTIVE_L1,
+QCOW2_OPT_OVERLAP_INACTIVE_L2,
+QCOW2_OPT_OVERLAP_BITMAP_DIRECTORY,
+QCOW2_OPT_CACHE_SIZE,
+QCOW2_OPT_L2_CACHE_SIZE,
+QCOW2_OPT_L2_CACHE_ENTRY_SIZE,
+QCOW2_OPT_REFCOUNT_CACHE_SIZE,
+QCOW2_OPT_CACHE_CLEAN_INTERVAL,
+NULL
+};
+
 static QemuOptsList qcow2_runtime_opts = {
 .name = "qcow2",
 .head = QTAILQ_HEAD_INITIALIZER(qcow2_runtime_opts.head),
@@ -5275,6 +5299,7 @@ BlockDriver bdrv_qcow2 = {
 
 .create_opts = _create_opts,
 .strong_runtime_opts = qcow2_strong_runtime_opts,
+.mutable_opts= mutable_opts,
 .bdrv_co_check   = qcow2_co_check,
 .bdrv_amend_options  = qcow2_amend_options,
 
diff --git a/block/raw-format.c b/block/raw-format.c
index e3e5ba2c8a..cec29986cc 100644
--- a/block/raw-format.c
+++ b/block/raw-format.c
@@ -37,6 +37,8 @@ typedef struct BDRVRawState {
 bool has_size;
 } BDRVRawState;
 
+static const char *const mutable_opts[] = { "offset", "size", NULL };
+
 static QemuOptsList raw_runtime_opts = {
 .name = "raw",
 .head = 

[Qemu-block] [PULL 15/28] nvme: fix write zeroes offset and count

2019-03-12 Thread Kevin Wolf
From: Keith Busch 

The implementation used blocks units rather than the expected bytes.

Fixes: c03e7ef12a9 ("nvme: Implement Write Zeroes")
Reported-by: Ming Lei 
Signed-off-by: Keith Busch 
Reviewed-by: Christoph Hellwig 
Signed-off-by: Kevin Wolf 
---
 hw/block/nvme.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 8325b5e88a..7caf92532a 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -324,8 +324,8 @@ static uint16_t nvme_write_zeros(NvmeCtrl *n, NvmeNamespace 
*ns, NvmeCmd *cmd,
 const uint8_t data_shift = ns->id_ns.lbaf[lba_index].ds;
 uint64_t slba = le64_to_cpu(rw->slba);
 uint32_t nlb  = le16_to_cpu(rw->nlb) + 1;
-uint64_t aio_slba = slba << (data_shift - BDRV_SECTOR_BITS);
-uint32_t aio_nlb = nlb << (data_shift - BDRV_SECTOR_BITS);
+uint64_t offset = slba << data_shift;
+uint32_t count = nlb << data_shift;
 
 if (unlikely(slba + nlb > ns->id_ns.nsze)) {
 trace_nvme_err_invalid_lba_range(slba, nlb, ns->id_ns.nsze);
@@ -335,7 +335,7 @@ static uint16_t nvme_write_zeros(NvmeCtrl *n, NvmeNamespace 
*ns, NvmeCmd *cmd,
 req->has_sg = false;
 block_acct_start(blk_get_stats(n->conf.blk), >acct, 0,
  BLOCK_ACCT_WRITE);
-req->aiocb = blk_aio_pwrite_zeroes(n->conf.blk, aio_slba, aio_nlb,
+req->aiocb = blk_aio_pwrite_zeroes(n->conf.blk, offset, count,
 BDRV_REQ_MAY_UNMAP, nvme_rw_cb, req);
 return NVME_NO_COMPLETE;
 }
-- 
2.20.1




[Qemu-block] [PULL 11/28] file-posix: Store BDRVRawState.reopen_state during reopen

2019-03-12 Thread Kevin Wolf
We'll want to access the file descriptor in the reopen_state while
processing permission changes in the context of the repoen.

Signed-off-by: Kevin Wolf 
---
 block/file-posix.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/block/file-posix.c b/block/file-posix.c
index ae57ba1fc6..6aaee1df16 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -144,6 +144,8 @@ typedef struct BDRVRawState {
 uint64_t locked_perm;
 uint64_t locked_shared_perm;
 
+BDRVReopenState *reopen_state;
+
 #ifdef CONFIG_XFS
 bool is_xfs:1;
 #endif
@@ -952,6 +954,7 @@ static int raw_reopen_prepare(BDRVReopenState *state,
 }
 }
 
+s->reopen_state = state;
 out:
 qemu_opts_del(opts);
 return ret;
@@ -978,12 +981,16 @@ static void raw_reopen_commit(BDRVReopenState *state)
 
 g_free(state->opaque);
 state->opaque = NULL;
+
+assert(s->reopen_state == state);
+s->reopen_state = NULL;
 }
 
 
 static void raw_reopen_abort(BDRVReopenState *state)
 {
 BDRVRawReopenState *rs = state->opaque;
+BDRVRawState *s = state->bs->opaque;
 
  /* nothing to do if NULL, we didn't get far enough */
 if (rs == NULL) {
@@ -996,6 +1003,9 @@ static void raw_reopen_abort(BDRVReopenState *state)
 }
 g_free(state->opaque);
 state->opaque = NULL;
+
+assert(s->reopen_state == state);
+s->reopen_state = NULL;
 }
 
 static int hdev_get_max_transfer_length(BlockDriverState *bs, int fd)
-- 
2.20.1




[Qemu-block] [PULL 09/28] file-posix: Fix bdrv_open_flags() for snapshot=on

2019-03-12 Thread Kevin Wolf
Using a different read-only setting for bs->open_flags than for the
flags to the driver's open function is just inconsistent and a bad idea.
After this patch, the temporary snapshot keeps being opened read-only if
read-only=on,snapshot=on is passed.

If we wanted to change this behaviour to make only the orginal image
file read-only, but the temporary overlay read-write (as the comment in
the removed code suggests), that change would have to be made in
bdrv_temp_snapshot_options() (where the comment suggests otherwise).

Addressing this inconsistency before introducing dynamic auto-read-only
is important because otherwise we would immediately try to reopen the
temporary overlay even though the file is already unlinked.

Signed-off-by: Kevin Wolf 
---
 block.c   | 7 ---
 tests/qemu-iotests/051| 7 +++
 tests/qemu-iotests/051.out| 9 +
 tests/qemu-iotests/051.pc.out | 9 +
 4 files changed, 25 insertions(+), 7 deletions(-)

diff --git a/block.c b/block.c
index 9b9d25e843..33804cdcaa 100644
--- a/block.c
+++ b/block.c
@@ -1163,13 +1163,6 @@ static int bdrv_open_flags(BlockDriverState *bs, int 
flags)
  */
 open_flags &= ~(BDRV_O_SNAPSHOT | BDRV_O_NO_BACKING | BDRV_O_PROTOCOL);
 
-/*
- * Snapshots should be writable.
- */
-if (flags & BDRV_O_TEMPORARY) {
-open_flags |= BDRV_O_RDWR;
-}
-
 return open_flags;
 }
 
diff --git a/tests/qemu-iotests/051 b/tests/qemu-iotests/051
index 3b50c7f188..151b850a8b 100755
--- a/tests/qemu-iotests/051
+++ b/tests/qemu-iotests/051
@@ -356,6 +356,13 @@ $QEMU_IO -c "read -P 0x33 0 4k" "$TEST_IMG" | 
_filter_qemu_io
 # Using snapshot=on with a non-existent TMPDIR
 TMPDIR=/nonexistent run_qemu -drive driver=null-co,snapshot=on
 
+# Using snapshot=on together with read-only=on
+echo "info block" |
+run_qemu -drive 
file="$TEST_IMG",snapshot=on,read-only=on,if=none,id=$device_id |
+_filter_qemu_io |
+sed -e 's#"/[^"]*/vl\.[A-Za-z]\{6\}"#SNAPSHOT_PATH#g'
+
+
 # success, all done
 echo "*** done"
 rm -f $seq.full
diff --git a/tests/qemu-iotests/051.out b/tests/qemu-iotests/051.out
index b900935fbc..9f1cf22608 100644
--- a/tests/qemu-iotests/051.out
+++ b/tests/qemu-iotests/051.out
@@ -458,4 +458,13 @@ read 4096/4096 bytes at offset 0
 Testing: -drive driver=null-co,snapshot=on
 QEMU_PROG: -drive driver=null-co,snapshot=on: Could not get temporary 
filename: No such file or directory
 
+Testing: -drive 
file=TEST_DIR/t.qcow2,snapshot=on,read-only=on,if=none,id=drive0
+QEMU X.Y.Z monitor - type 'help' for more information
+(qemu) info block
+drive0 (NODE_NAME): json:{"backing": {"driver": "qcow2", "file": {"driver": 
"file", "filename": "TEST_DIR/t.qcow2"}}, "driver": "qcow2", "file": {"driver": 
"file", "filename": SNAPSHOT_PATH}} (qcow2, read-only)
+Removable device: not locked, tray closed
+Cache mode:   writeback, ignore flushes
+Backing file: TEST_DIR/t.qcow2 (chain depth: 1)
+(qemu) quit
+
 *** done
diff --git a/tests/qemu-iotests/051.pc.out b/tests/qemu-iotests/051.pc.out
index 8c5c735dfd..c4743cc31c 100644
--- a/tests/qemu-iotests/051.pc.out
+++ b/tests/qemu-iotests/051.pc.out
@@ -530,4 +530,13 @@ read 4096/4096 bytes at offset 0
 Testing: -drive driver=null-co,snapshot=on
 QEMU_PROG: -drive driver=null-co,snapshot=on: Could not get temporary 
filename: No such file or directory
 
+Testing: -drive 
file=TEST_DIR/t.qcow2,snapshot=on,read-only=on,if=none,id=drive0
+QEMU X.Y.Z monitor - type 'help' for more information
+(qemu) info block
+drive0 (NODE_NAME): json:{"backing": {"driver": "qcow2", "file": {"driver": 
"file", "filename": "TEST_DIR/t.qcow2"}}, "driver": "qcow2", "file": {"driver": 
"file", "filename": SNAPSHOT_PATH}} (qcow2, read-only)
+Removable device: not locked, tray closed
+Cache mode:   writeback, ignore flushes
+Backing file: TEST_DIR/t.qcow2 (chain depth: 1)
+(qemu) quit
+
 *** done
-- 
2.20.1




[Qemu-block] [PULL 03/28] qapi: move to QOM path for x-block-latency-histogram-set

2019-03-12 Thread Kevin Wolf
From: Vladimir Sementsov-Ogievskiy 

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Signed-off-by: Kevin Wolf 
---
 qapi/block-core.json |  4 ++--
 blockdev.c   | 12 ++--
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 919d0530b2..3f0a5cb1e8 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -550,7 +550,7 @@
 # If only @device parameter is specified, remove all present latency histograms
 # for the device. Otherwise, add/reset some of (or all) latency histograms.
 #
-# @device: device name to set latency histogram for.
+# @id: The name or QOM path of the guest device.
 #
 # @boundaries: list of interval boundary values (see description in
 #  BlockLatencyHistogramInfo definition). If specified, all
@@ -608,7 +608,7 @@
 # <- { "return": {} }
 ##
 { 'command': 'x-block-latency-histogram-set',
-  'data': {'device': 'str',
+  'data': {'id': 'str',
'*boundaries': ['uint64'],
'*boundaries-read': ['uint64'],
'*boundaries-write': ['uint64'],
diff --git a/blockdev.c b/blockdev.c
index 871966ca13..850fb34c52 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -4453,21 +4453,21 @@ void qmp_x_blockdev_set_iothread(const char *node_name, 
StrOrNull *iothread,
 }
 
 void qmp_x_block_latency_histogram_set(
-const char *device,
+const char *id,
 bool has_boundaries, uint64List *boundaries,
 bool has_boundaries_read, uint64List *boundaries_read,
 bool has_boundaries_write, uint64List *boundaries_write,
 bool has_boundaries_flush, uint64List *boundaries_flush,
 Error **errp)
 {
-BlockBackend *blk = blk_by_name(device);
+BlockBackend *blk = qmp_get_blk(NULL, id, errp);
 BlockAcctStats *stats;
 int ret;
 
 if (!blk) {
-error_setg(errp, "Device '%s' not found", device);
 return;
 }
+
 stats = blk_get_stats(blk);
 
 if (!has_boundaries && !has_boundaries_read && !has_boundaries_write &&
@@ -4482,7 +4482,7 @@ void qmp_x_block_latency_histogram_set(
 stats, BLOCK_ACCT_READ,
 has_boundaries_read ? boundaries_read : boundaries);
 if (ret) {
-error_setg(errp, "Device '%s' set read boundaries fail", device);
+error_setg(errp, "Device '%s' set read boundaries fail", id);
 return;
 }
 }
@@ -4492,7 +4492,7 @@ void qmp_x_block_latency_histogram_set(
 stats, BLOCK_ACCT_WRITE,
 has_boundaries_write ? boundaries_write : boundaries);
 if (ret) {
-error_setg(errp, "Device '%s' set write boundaries fail", device);
+error_setg(errp, "Device '%s' set write boundaries fail", id);
 return;
 }
 }
@@ -4502,7 +4502,7 @@ void qmp_x_block_latency_histogram_set(
 stats, BLOCK_ACCT_FLUSH,
 has_boundaries_flush ? boundaries_flush : boundaries);
 if (ret) {
-error_setg(errp, "Device '%s' set flush boundaries fail", device);
+error_setg(errp, "Device '%s' set flush boundaries fail", id);
 return;
 }
 }
-- 
2.20.1




[Qemu-block] [PULL 00/28] Block layer patches

2019-03-12 Thread Kevin Wolf
The following changes since commit eda1df0345f5a1e337e30367124dcb0e802bdfde:

  Merge remote-tracking branch 'remotes/armbru/tags/pull-pflash-2019-03-11' 
into staging (2019-03-12 11:12:36 +)

are available in the Git repository at:

  git://repo.or.cz/qemu/kevin.git tags/for-upstream

for you to fetch changes up to c31dfeb02a1d155bdb961edeb61a137a589c174b:

  qemu-iotests: Test the x-blockdev-reopen QMP command (2019-03-12 17:58:37 
+0100)


Block layer patches:

- file-posix: Make auto-read-only dynamic
- Add x-blockdev-reopen QMP command
- Finalize block-latency-histogram QMP command
- gluster: Build fixes for newer lib version


Alberto Garcia (13):
  block: Allow freezing BdrvChild links
  block: Freeze the backing chain for the duration of the commit job
  block: Freeze the backing chain for the duration of the mirror job
  block: Freeze the backing chain for the duration of the stream job
  block: Add 'keep_old_opts' parameter to bdrv_reopen_queue()
  block: Handle child references in bdrv_reopen_queue()
  block: Allow omitting the 'backing' option in certain cases
  block: Allow changing the backing file on reopen
  block: Add a 'mutable_opts' field to BlockDriver
  block: Add bdrv_reset_options_allowed()
  block: Remove the AioContext parameter from bdrv_reopen_multiple()
  block: Add an 'x-blockdev-reopen' QMP command
  qemu-iotests: Test the x-blockdev-reopen QMP command

Keith Busch (1):
  nvme: fix write zeroes offset and count

Kevin Wolf (10):
  tests/virtio-blk-test: Disable auto-read-only
  qemu-iotests: commit to backing file with auto-read-only
  block: Avoid useless local_err
  block: Make permission changes in reopen less wrong
  file-posix: Fix bdrv_open_flags() for snapshot=on
  file-posix: Factor out raw_reconfigure_getfd()
  file-posix: Store BDRVRawState.reopen_state during reopen
  file-posix: Lock new fd in raw_reopen_prepare()
  file-posix: Prepare permission code for fd switching
  file-posix: Make auto-read-only dynamic

Niels de Vos (1):
  gluster: the glfs_io_cbk callback function pointer adds pre/post stat args

Prasanna Kumar Kalever (1):
  gluster: Handle changed glfs_ftruncate signature

Vladimir Sementsov-Ogievskiy (2):
  qapi: move to QOM path for x-block-latency-histogram-set
  qapi: drop x- from x-block-latency-histogram-set

 qapi/block-core.json  |  66 ++-
 configure |  42 ++
 include/block/block.h |  13 +-
 include/block/block_int.h |  14 +
 block.c   | 440 +--
 block/commit.c|  16 +
 block/file-posix.c| 254 ---
 block/gluster.c   |  10 +-
 block/mirror.c|   8 +
 block/qapi.c  |  12 +-
 block/qcow2.c |  25 ++
 block/raw-format.c|   3 +
 block/replication.c   |   7 +-
 block/stream.c|  21 +
 blockdev.c|  61 ++-
 hw/block/nvme.c   |   6 +-
 qemu-io-cmds.c|   4 +-
 tests/virtio-blk-test.c   |   2 +-
 tests/qemu-iotests/051|   7 +
 tests/qemu-iotests/051.out|   9 +
 tests/qemu-iotests/051.pc.out |   9 +
 tests/qemu-iotests/232|  31 ++
 tests/qemu-iotests/232.out|  32 +-
 tests/qemu-iotests/245| 991 ++
 tests/qemu-iotests/245.out|   5 +
 tests/qemu-iotests/group  |   1 +
 26 files changed, 1929 insertions(+), 160 deletions(-)
 create mode 100644 tests/qemu-iotests/245
 create mode 100644 tests/qemu-iotests/245.out



[Qemu-block] [PULL 21/28] block: Handle child references in bdrv_reopen_queue()

2019-03-12 Thread Kevin Wolf
From: Alberto Garcia 

Children in QMP are specified with BlockdevRef / BlockdevRefOrNull,
which can contain a set of child options, a child reference, or
NULL. In optional attributes like "backing" it can also be missing.

Only the first case (set of child options) is being handled properly
by bdrv_reopen_queue(). This patch deals with all the others.

Here's how these cases should be handled when bdrv_reopen_queue() is
deciding what to do with each child of a BlockDriverState:

   1) Set of child options: if the child was implicitly created (i.e
  inherits_from points to the parent) then the options are removed
  from the parent's options QDict and are passed to the child with
  a recursive bdrv_reopen_queue() call. This case was already
  working fine.

   2) Child reference: there's two possibilites here.

  2a) Reference to the current child: if the child was implicitly
  created then it is put in the reopen queue, keeping its
  current set of options (since this was a child reference
  there was no way to specify a different set of options).
  If the child is not implicit then it keeps its current set
  of options but it is not reopened (and therefore does not
  inherit any new option from the parent).

  2b) Reference to a different BDS: the current child is not put
  in the reopen queue at all. Passing a reference to a
  different BDS can be used to replace a child, although at
  the moment no driver implements this, so it results in an
  error. In any case, the current child is not going to be
  reopened (and might in fact disappear if it's replaced)

   3) NULL: This is similar to (2b). Although no driver allows this
  yet it can be used to detach the current child so it should not
  be put in the reopen queue.

   4) Missing option: at the moment "backing" is the only case where
  this can happen. With "blockdev-add", leaving "backing" out
  means that the default backing file is opened. We don't want to
  open a new image during reopen, so we require that "backing" is
  always present. We'll relax this requirement a bit in the next
  patch. If keep_old_opts is true and "backing" is missing then
  this behaves like 2a (the current child is reopened).

Signed-off-by: Alberto Garcia 
Signed-off-by: Kevin Wolf 
---
 include/block/block.h |  1 +
 block.c   | 52 +--
 2 files changed, 46 insertions(+), 7 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index 4b5a1a..fcc61a3a19 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -187,6 +187,7 @@ typedef struct BDRVReopenState {
 BlockDriverState *bs;
 int flags;
 BlockdevDetectZeroesOptions detect_zeroes;
+bool backing_missing;
 uint64_t perm, shared_perm;
 QDict *options;
 QDict *explicit_options;
diff --git a/block.c b/block.c
index 3cd281dbcc..9698f2ad44 100644
--- a/block.c
+++ b/block.c
@@ -3107,9 +3107,21 @@ static BlockReopenQueue 
*bdrv_reopen_queue_child(BlockReopenQueue *bs_queue,
 bs_entry->state.perm = UINT64_MAX;
 bs_entry->state.shared_perm = 0;
 
+/*
+ * If keep_old_opts is false then it means that unspecified
+ * options must be reset to their original value. We don't allow
+ * resetting 'backing' but we need to know if the option is
+ * missing in order to decide if we have to return an error.
+ */
+if (!keep_old_opts) {
+bs_entry->state.backing_missing =
+!qdict_haskey(options, "backing") &&
+!qdict_haskey(options, "backing.driver");
+}
+
 QLIST_FOREACH(child, >children, next) {
-QDict *new_child_options;
-char *child_key_dot;
+QDict *new_child_options = NULL;
+bool child_keep_old = keep_old_opts;
 
 /* reopen can only change the options of block devices that were
  * implicitly created and inherited options. For other (referenced)
@@ -3118,13 +3130,32 @@ static BlockReopenQueue 
*bdrv_reopen_queue_child(BlockReopenQueue *bs_queue,
 continue;
 }
 
-child_key_dot = g_strdup_printf("%s.", child->name);
-qdict_extract_subqdict(explicit_options, NULL, child_key_dot);
-qdict_extract_subqdict(options, _child_options, child_key_dot);
-g_free(child_key_dot);
+/* Check if the options contain a child reference */
+if (qdict_haskey(options, child->name)) {
+const char *childref = qdict_get_try_str(options, child->name);
+/*
+ * The current child must not be reopened if the child
+ * reference is null or points to a different node.
+ */
+if (g_strcmp0(childref, child->bs->node_name)) {
+continue;
+}
+/*
+ * If the child reference points to the current child then
+

[Qemu-block] [PULL 19/28] block: Freeze the backing chain for the duration of the stream job

2019-03-12 Thread Kevin Wolf
From: Alberto Garcia 

Signed-off-by: Alberto Garcia 
Signed-off-by: Kevin Wolf 
---
 block/stream.c | 21 +
 1 file changed, 21 insertions(+)

diff --git a/block/stream.c b/block/stream.c
index e14579ff80..6253c86fae 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -35,6 +35,7 @@ typedef struct StreamBlockJob {
 BlockdevOnError on_error;
 char *backing_file_str;
 bool bs_read_only;
+bool chain_frozen;
 } StreamBlockJob;
 
 static int coroutine_fn stream_populate(BlockBackend *blk,
@@ -49,6 +50,16 @@ static int coroutine_fn stream_populate(BlockBackend *blk,
 return blk_co_preadv(blk, offset, qiov.size, , BDRV_REQ_COPY_ON_READ);
 }
 
+static void stream_abort(Job *job)
+{
+StreamBlockJob *s = container_of(job, StreamBlockJob, common.job);
+
+if (s->chain_frozen) {
+BlockJob *bjob = >common;
+bdrv_unfreeze_backing_chain(blk_bs(bjob->blk), s->base);
+}
+}
+
 static int stream_prepare(Job *job)
 {
 StreamBlockJob *s = container_of(job, StreamBlockJob, common.job);
@@ -58,6 +69,9 @@ static int stream_prepare(Job *job)
 Error *local_err = NULL;
 int ret = 0;
 
+bdrv_unfreeze_backing_chain(bs, base);
+s->chain_frozen = false;
+
 if (bs->backing) {
 const char *base_id = NULL, *base_fmt = NULL;
 if (base) {
@@ -208,6 +222,7 @@ static const BlockJobDriver stream_job_driver = {
 .free  = block_job_free,
 .run   = stream_run,
 .prepare   = stream_prepare,
+.abort = stream_abort,
 .clean = stream_clean,
 .user_resume   = block_job_user_resume,
 .drain = block_job_drain,
@@ -254,9 +269,15 @@ void stream_start(const char *job_id, BlockDriverState *bs,
_abort);
 }
 
+if (bdrv_freeze_backing_chain(bs, base, errp) < 0) {
+job_early_fail(>common.job);
+goto fail;
+}
+
 s->base = base;
 s->backing_file_str = g_strdup(backing_file_str);
 s->bs_read_only = bs_read_only;
+s->chain_frozen = true;
 
 s->on_error = on_error;
 trace_stream_start(bs, base, s);
-- 
2.20.1




[Qemu-block] [PULL 16/28] block: Allow freezing BdrvChild links

2019-03-12 Thread Kevin Wolf
From: Alberto Garcia 

Our permission system is useful to define what operations are allowed
on a certain block node and includes things like BLK_PERM_WRITE or
BLK_PERM_RESIZE among others.

One of the permissions is BLK_PERM_GRAPH_MOD which allows "changing
the node that this BdrvChild points to". The exact meaning of this has
never been very clear, but it can be understood as "change any of the
links connected to the node". This can be used to prevent changing a
backing link, but it's too coarse.

This patch adds a new 'frozen' attribute to BdrvChild, which forbids
detaching the link from the node it points to, and new API to freeze
and unfreeze a backing chain.

After this change a few functions can fail, so they need additional
checks.

Signed-off-by: Alberto Garcia 
Signed-off-by: Kevin Wolf 
---
 include/block/block.h |  5 +++
 include/block/block_int.h |  6 
 block.c   | 76 +++
 3 files changed, 87 insertions(+)

diff --git a/include/block/block.h b/include/block/block.h
index 6a758a76f8..b0c04f16c8 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -353,6 +353,11 @@ int bdrv_drop_intermediate(BlockDriverState *top, 
BlockDriverState *base,
 BlockDriverState *bdrv_find_overlay(BlockDriverState *active,
 BlockDriverState *bs);
 BlockDriverState *bdrv_find_base(BlockDriverState *bs);
+bool bdrv_is_backing_chain_frozen(BlockDriverState *bs, BlockDriverState *base,
+  Error **errp);
+int bdrv_freeze_backing_chain(BlockDriverState *bs, BlockDriverState *base,
+  Error **errp);
+void bdrv_unfreeze_backing_chain(BlockDriverState *bs, BlockDriverState *base);
 
 
 typedef struct BdrvCheckResult {
diff --git a/include/block/block_int.h b/include/block/block_int.h
index a23cabaddd..438feba4e5 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -711,6 +711,12 @@ struct BdrvChild {
 uint64_t backup_perm;
 uint64_t backup_shared_perm;
 
+/*
+ * This link is frozen: the child can neither be replaced nor
+ * detached from the parent.
+ */
+bool frozen;
+
 QLIST_ENTRY(BdrvChild) next;
 QLIST_ENTRY(BdrvChild) next_parent;
 };
diff --git a/block.c b/block.c
index 33804cdcaa..a2bffce38d 100644
--- a/block.c
+++ b/block.c
@@ -2126,6 +2126,8 @@ static void bdrv_replace_child_noperm(BdrvChild *child,
 BlockDriverState *old_bs = child->bs;
 int i;
 
+assert(!child->frozen);
+
 if (old_bs && new_bs) {
 assert(bdrv_get_aio_context(old_bs) == bdrv_get_aio_context(new_bs));
 }
@@ -2342,6 +2344,10 @@ void bdrv_set_backing_hd(BlockDriverState *bs, 
BlockDriverState *backing_hd,
 bool update_inherits_from = bdrv_chain_contains(bs, backing_hd) &&
 bdrv_inherits_from_recursive(backing_hd, bs);
 
+if (bdrv_is_backing_chain_frozen(bs, backing_bs(bs), errp)) {
+return;
+}
+
 if (backing_hd) {
 bdrv_ref(backing_hd);
 }
@@ -3721,6 +3727,11 @@ void bdrv_replace_node(BlockDriverState *from, 
BlockDriverState *to,
 if (!should_update_child(c, to)) {
 continue;
 }
+if (c->frozen) {
+error_setg(errp, "Cannot change '%s' link to '%s'",
+   c->name, from->node_name);
+goto out;
+}
 list = g_slist_prepend(list, c);
 perm |= c->perm;
 shared &= c->shared_perm;
@@ -3932,6 +3943,63 @@ BlockDriverState *bdrv_find_base(BlockDriverState *bs)
 return bdrv_find_overlay(bs, NULL);
 }
 
+/*
+ * Return true if at least one of the backing links between @bs and
+ * @base is frozen. @errp is set if that's the case.
+ */
+bool bdrv_is_backing_chain_frozen(BlockDriverState *bs, BlockDriverState *base,
+  Error **errp)
+{
+BlockDriverState *i;
+
+for (i = bs; i != base && i->backing; i = backing_bs(i)) {
+if (i->backing->frozen) {
+error_setg(errp, "Cannot change '%s' link from '%s' to '%s'",
+   i->backing->name, i->node_name,
+   backing_bs(i)->node_name);
+return true;
+}
+}
+
+return false;
+}
+
+/*
+ * Freeze all backing links between @bs and @base.
+ * If any of the links is already frozen the operation is aborted and
+ * none of the links are modified.
+ * Returns 0 on success. On failure returns < 0 and sets @errp.
+ */
+int bdrv_freeze_backing_chain(BlockDriverState *bs, BlockDriverState *base,
+  Error **errp)
+{
+BlockDriverState *i;
+
+if (bdrv_is_backing_chain_frozen(bs, base, errp)) {
+return -EPERM;
+}
+
+for (i = bs; i != base && i->backing; i = backing_bs(i)) {
+i->backing->frozen = true;
+}
+
+return 0;
+}
+
+/*
+ * Unfreeze all backing links between @bs and @base. The caller must
+ * ensure that all links are frozen before using this 

[Qemu-block] [PULL 20/28] block: Add 'keep_old_opts' parameter to bdrv_reopen_queue()

2019-03-12 Thread Kevin Wolf
From: Alberto Garcia 

The bdrv_reopen_queue() function is used to create a queue with
the BDSs that are going to be reopened and their new options. Once
the queue is ready bdrv_reopen_multiple() is called to perform the
operation.

The original options from each one of the BDSs are kept, with the new
options passed to bdrv_reopen_queue() applied on top of them.

For "x-blockdev-reopen" we want a function that behaves much like
"blockdev-add". We want to ignore the previous set of options so that
only the ones actually specified by the user are applied, with the
rest having their default values.

One of the things that we need is a way to tell bdrv_reopen_queue()
whether we want to keep the old set of options or not, and that's what
this patch does. All current callers are setting this new parameter to
true and x-blockdev-reopen will set it to false.

Signed-off-by: Alberto Garcia 
Signed-off-by: Kevin Wolf 
---
 include/block/block.h |  3 ++-
 block.c   | 34 +++---
 block/replication.c   |  4 ++--
 qemu-io-cmds.c|  2 +-
 4 files changed, 24 insertions(+), 19 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index b0c04f16c8..4b5a1a 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -299,7 +299,8 @@ BlockDriverState *bdrv_open(const char *filename, const 
char *reference,
 BlockDriverState *bdrv_new_open_driver(BlockDriver *drv, const char *node_name,
int flags, Error **errp);
 BlockReopenQueue *bdrv_reopen_queue(BlockReopenQueue *bs_queue,
-BlockDriverState *bs, QDict *options);
+BlockDriverState *bs, QDict *options,
+bool keep_old_opts);
 int bdrv_reopen_multiple(AioContext *ctx, BlockReopenQueue *bs_queue, Error 
**errp);
 int bdrv_reopen_set_read_only(BlockDriverState *bs, bool read_only,
   Error **errp);
diff --git a/block.c b/block.c
index a2bffce38d..3cd281dbcc 100644
--- a/block.c
+++ b/block.c
@@ -3010,7 +3010,8 @@ static BlockReopenQueue 
*bdrv_reopen_queue_child(BlockReopenQueue *bs_queue,
  QDict *options,
  const BdrvChildRole *role,
  QDict *parent_options,
- int parent_flags)
+ int parent_flags,
+ bool keep_old_opts)
 {
 assert(bs != NULL);
 
@@ -3050,13 +3051,13 @@ static BlockReopenQueue 
*bdrv_reopen_queue_child(BlockReopenQueue *bs_queue,
  */
 
 /* Old explicitly set values (don't overwrite by inherited value) */
-if (bs_entry) {
-old_options = qdict_clone_shallow(bs_entry->state.explicit_options);
-} else {
-old_options = qdict_clone_shallow(bs->explicit_options);
+if (bs_entry || keep_old_opts) {
+old_options = qdict_clone_shallow(bs_entry ?
+  bs_entry->state.explicit_options :
+  bs->explicit_options);
+bdrv_join_options(bs, options, old_options);
+qobject_unref(old_options);
 }
-bdrv_join_options(bs, options, old_options);
-qobject_unref(old_options);
 
 explicit_options = qdict_clone_shallow(options);
 
@@ -3068,10 +3069,12 @@ static BlockReopenQueue 
*bdrv_reopen_queue_child(BlockReopenQueue *bs_queue,
 flags = bdrv_get_flags(bs);
 }
 
-/* Old values are used for options that aren't set yet */
-old_options = qdict_clone_shallow(bs->options);
-bdrv_join_options(bs, options, old_options);
-qobject_unref(old_options);
+if (keep_old_opts) {
+/* Old values are used for options that aren't set yet */
+old_options = qdict_clone_shallow(bs->options);
+bdrv_join_options(bs, options, old_options);
+qobject_unref(old_options);
+}
 
 /* We have the final set of options so let's update the flags */
 options_copy = qdict_clone_shallow(options);
@@ -3121,7 +3124,7 @@ static BlockReopenQueue 
*bdrv_reopen_queue_child(BlockReopenQueue *bs_queue,
 g_free(child_key_dot);
 
 bdrv_reopen_queue_child(bs_queue, child->bs, new_child_options,
-child->role, options, flags);
+child->role, options, flags, keep_old_opts);
 }
 
 return bs_queue;
@@ -3129,9 +3132,10 @@ static BlockReopenQueue 
*bdrv_reopen_queue_child(BlockReopenQueue *bs_queue,
 
 BlockReopenQueue *bdrv_reopen_queue(BlockReopenQueue *bs_queue,
 BlockDriverState *bs,
-QDict *options)
+QDict *options, bool keep_old_opts)
 {
-return bdrv_reopen_queue_child(bs_queue, bs, 

[Qemu-block] [PULL 14/28] file-posix: Make auto-read-only dynamic

2019-03-12 Thread Kevin Wolf
Until now, with auto-read-only=on we tried to open the file read-write
first and if that failed, read-only was tried. This is actually not good
enough for libvirt, which gives QEMU SELinux permissions for read-write
only as soon as it actually intends to write to the image. So we need to
be able to switch between read-only and read-write at runtime.

This patch makes auto-read-only dynamic, i.e. the file is opened
read-only as long as no user of the node has requested write
permissions, but it is automatically reopened read-write as soon as the
first writer is attached. Conversely, if the last writer goes away, the
file is reopened read-only again.

bs->read_only is no longer set for auto-read-only=on files even if the
file descriptor is opened read-only because it will be transparently
upgraded as soon as a writer is attached. This changes the output of
qemu-iotests 232.

Signed-off-by: Kevin Wolf 
---
 block/file-posix.c | 36 +---
 tests/qemu-iotests/232.out | 12 ++--
 2 files changed, 23 insertions(+), 25 deletions(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index e41e0779c6..d41059d139 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -376,13 +376,21 @@ static void raw_probe_alignment(BlockDriverState *bs, int 
fd, Error **errp)
 }
 }
 
-static void raw_parse_flags(int bdrv_flags, int *open_flags)
+static void raw_parse_flags(int bdrv_flags, int *open_flags, bool has_writers)
 {
+bool read_write = false;
 assert(open_flags != NULL);
 
 *open_flags |= O_BINARY;
 *open_flags &= ~O_ACCMODE;
-if (bdrv_flags & BDRV_O_RDWR) {
+
+if (bdrv_flags & BDRV_O_AUTO_RDONLY) {
+read_write = has_writers;
+} else if (bdrv_flags & BDRV_O_RDWR) {
+read_write = true;
+}
+
+if (read_write) {
 *open_flags |= O_RDWR;
 } else {
 *open_flags |= O_RDONLY;
@@ -518,24 +526,12 @@ static int raw_open_common(BlockDriverState *bs, QDict 
*options,
false);
 
 s->open_flags = open_flags;
-raw_parse_flags(bdrv_flags, >open_flags);
+raw_parse_flags(bdrv_flags, >open_flags, false);
 
 s->fd = -1;
 fd = qemu_open(filename, s->open_flags, 0644);
 ret = fd < 0 ? -errno : 0;
 
-if (ret == -EACCES || ret == -EROFS) {
-/* Try to degrade to read-only, but if it doesn't work, still use the
- * normal error message. */
-if (bdrv_apply_auto_read_only(bs, NULL, NULL) == 0) {
-bdrv_flags &= ~BDRV_O_RDWR;
-raw_parse_flags(bdrv_flags, >open_flags);
-assert(!(s->open_flags & O_CREAT));
-fd = qemu_open(filename, s->open_flags);
-ret = fd < 0 ? -errno : 0;
-}
-}
-
 if (ret < 0) {
 error_setg_errno(errp, -ret, "Could not open '%s'", filename);
 if (ret == -EROFS) {
@@ -846,12 +842,14 @@ static int raw_handle_perm_lock(BlockDriverState *bs,
 }
 
 static int raw_reconfigure_getfd(BlockDriverState *bs, int flags,
- int *open_flags, bool force_dup,
+ int *open_flags, uint64_t perm, bool 
force_dup,
  Error **errp)
 {
 BDRVRawState *s = bs->opaque;
 int fd = -1;
 int ret;
+bool has_writers = perm &
+(BLK_PERM_WRITE | BLK_PERM_WRITE_UNCHANGED | BLK_PERM_RESIZE);
 int fcntl_flags = O_APPEND | O_NONBLOCK;
 #ifdef O_NOATIME
 fcntl_flags |= O_NOATIME;
@@ -862,7 +860,7 @@ static int raw_reconfigure_getfd(BlockDriverState *bs, int 
flags,
 *open_flags |= O_NONBLOCK;
 }
 
-raw_parse_flags(flags, open_flags);
+raw_parse_flags(flags, open_flags, has_writers);
 
 #ifdef O_ASYNC
 /* Not all operating systems have O_ASYNC, and those that don't
@@ -942,7 +940,7 @@ static int raw_reopen_prepare(BDRVReopenState *state,
 qemu_opts_to_qdict(opts, state->options);
 
 rs->fd = raw_reconfigure_getfd(state->bs, state->flags, >open_flags,
-   true, _err);
+   state->perm, true, _err);
 if (local_err) {
 error_propagate(errp, local_err);
 ret = -1;
@@ -2720,7 +2718,7 @@ static int raw_check_perm(BlockDriverState *bs, uint64_t 
perm, uint64_t shared,
 s->perm_change_fd = rs->fd;
 } else {
 /* We may need a new fd if auto-read-only switches the mode */
-ret = raw_reconfigure_getfd(bs, bs->open_flags, _flags,
+ret = raw_reconfigure_getfd(bs, bs->open_flags, _flags, perm,
 false, errp);
 if (ret < 0) {
 return ret;
diff --git a/tests/qemu-iotests/232.out b/tests/qemu-iotests/232.out
index 5036ef13d4..5bcc44bb62 100644
--- a/tests/qemu-iotests/232.out
+++ b/tests/qemu-iotests/232.out
@@ -22,12 +22,12 @@ NODE_NAME: TEST_DIR/t.IMGFMT (file, read-only)
 NODE_NAME: TEST_DIR/t.IMGFMT (file, read-only)
 
 QEMU_PROG: -drive 

[Qemu-block] [PULL 18/28] block: Freeze the backing chain for the duration of the mirror job

2019-03-12 Thread Kevin Wolf
From: Alberto Garcia 

Signed-off-by: Alberto Garcia 
Signed-off-by: Kevin Wolf 
---
 block/mirror.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/block/mirror.c b/block/mirror.c
index 726d3c27fb..010fdafd79 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -630,6 +630,10 @@ static int mirror_exit_common(Job *job)
 }
 s->prepared = true;
 
+if (bdrv_chain_contains(src, target_bs)) {
+bdrv_unfreeze_backing_chain(mirror_top_bs, target_bs);
+}
+
 bdrv_release_dirty_bitmap(src, s->dirty_bitmap);
 
 /* Make sure that the source BDS doesn't go away during bdrv_replace_node,
@@ -1639,6 +1643,10 @@ static void mirror_start_job(const char *job_id, 
BlockDriverState *bs,
 goto fail;
 }
 }
+
+if (bdrv_freeze_backing_chain(mirror_top_bs, target, errp) < 0) {
+goto fail;
+}
 }
 
 QTAILQ_INIT(>ops_in_flight);
-- 
2.20.1




[Qemu-block] [PULL 17/28] block: Freeze the backing chain for the duration of the commit job

2019-03-12 Thread Kevin Wolf
From: Alberto Garcia 

Signed-off-by: Alberto Garcia 
Signed-off-by: Kevin Wolf 
---
 block/commit.c | 16 
 1 file changed, 16 insertions(+)

diff --git a/block/commit.c b/block/commit.c
index 3b46ca7f97..ba60fef58a 100644
--- a/block/commit.c
+++ b/block/commit.c
@@ -39,6 +39,7 @@ typedef struct CommitBlockJob {
 BlockDriverState *base_bs;
 BlockdevOnError on_error;
 bool base_read_only;
+bool chain_frozen;
 char *backing_file_str;
 } CommitBlockJob;
 
@@ -68,6 +69,9 @@ static int commit_prepare(Job *job)
 {
 CommitBlockJob *s = container_of(job, CommitBlockJob, common.job);
 
+bdrv_unfreeze_backing_chain(s->commit_top_bs, s->base_bs);
+s->chain_frozen = false;
+
 /* Remove base node parent that still uses BLK_PERM_WRITE/RESIZE before
  * the normal backing chain can be restored. */
 blk_unref(s->base);
@@ -84,6 +88,10 @@ static void commit_abort(Job *job)
 CommitBlockJob *s = container_of(job, CommitBlockJob, common.job);
 BlockDriverState *top_bs = blk_bs(s->top);
 
+if (s->chain_frozen) {
+bdrv_unfreeze_backing_chain(s->commit_top_bs, s->base_bs);
+}
+
 /* Make sure commit_top_bs and top stay around until bdrv_replace_node() */
 bdrv_ref(top_bs);
 bdrv_ref(s->commit_top_bs);
@@ -330,6 +338,11 @@ void commit_start(const char *job_id, BlockDriverState *bs,
 }
 }
 
+if (bdrv_freeze_backing_chain(commit_top_bs, base, errp) < 0) {
+goto fail;
+}
+s->chain_frozen = true;
+
 ret = block_job_add_bdrv(>common, "base", base, 0, BLK_PERM_ALL, errp);
 if (ret < 0) {
 goto fail;
@@ -362,6 +375,9 @@ void commit_start(const char *job_id, BlockDriverState *bs,
 return;
 
 fail:
+if (s->chain_frozen) {
+bdrv_unfreeze_backing_chain(commit_top_bs, base);
+}
 if (s->base) {
 blk_unref(s->base);
 }
-- 
2.20.1




[Qemu-block] [PULL 12/28] file-posix: Lock new fd in raw_reopen_prepare()

2019-03-12 Thread Kevin Wolf
There is no reason why we can take locks on the new file descriptor only
in raw_reopen_commit() where error handling isn't possible any more.
Instead, we can already do this in raw_reopen_prepare().

Signed-off-by: Kevin Wolf 
---
 block/file-posix.c | 27 ---
 1 file changed, 16 insertions(+), 11 deletions(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index 6aaee1df16..932cc8e58c 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -906,7 +906,7 @@ static int raw_reopen_prepare(BDRVReopenState *state,
 BDRVRawState *s;
 BDRVRawReopenState *rs;
 QemuOpts *opts;
-int ret = 0;
+int ret;
 Error *local_err = NULL;
 
 assert(state != NULL);
@@ -947,14 +947,27 @@ static int raw_reopen_prepare(BDRVReopenState *state,
 if (rs->fd != -1) {
 raw_probe_alignment(state->bs, rs->fd, _err);
 if (local_err) {
-qemu_close(rs->fd);
-rs->fd = -1;
 error_propagate(errp, local_err);
 ret = -EINVAL;
+goto out_fd;
+}
+
+/* Copy locks to the new fd */
+ret = raw_apply_lock_bytes(NULL, rs->fd, s->locked_perm,
+   s->locked_shared_perm, false, errp);
+if (ret < 0) {
+ret = -EINVAL;
+goto out_fd;
 }
 }
 
 s->reopen_state = state;
+ret = 0;
+out_fd:
+if (ret < 0) {
+qemu_close(rs->fd);
+rs->fd = -1;
+}
 out:
 qemu_opts_del(opts);
 return ret;
@@ -964,18 +977,10 @@ static void raw_reopen_commit(BDRVReopenState *state)
 {
 BDRVRawReopenState *rs = state->opaque;
 BDRVRawState *s = state->bs->opaque;
-Error *local_err = NULL;
 
 s->check_cache_dropped = rs->check_cache_dropped;
 s->open_flags = rs->open_flags;
 
-/* Copy locks to the new fd before closing the old one. */
-raw_apply_lock_bytes(NULL, rs->fd, s->locked_perm,
- s->locked_shared_perm, false, _err);
-if (local_err) {
-/* shouldn't fail in a sane host, but report it just in case. */
-error_report_err(local_err);
-}
 qemu_close(s->fd);
 s->fd = rs->fd;
 
-- 
2.20.1




[Qemu-block] [PULL 08/28] block: Make permission changes in reopen less wrong

2019-03-12 Thread Kevin Wolf
The way that reopen interacts with permission changes has one big
problem: Both operations are recursive, and the permissions are changes
for each node in the reopen queue.

For a simple graph that consists just of parent and child,
.bdrv_check_perm will be called twice for the child, once recursively
when adjusting the permissions of parent, and once again when the child
itself is reopened.

Even worse, the first .bdrv_check_perm call happens before
.bdrv_reopen_prepare was called for the child and the second one is
called afterwards.

Making sure that .bdrv_check_perm (and the other permission callbacks)
are called only once is hard. We can cope with multiple calls right now,
but as soon as file-posix gets a dynamic auto-read-only that may need to
open a new file descriptor, we get the additional requirement that all
of them are after the .bdrv_reopen_prepare call.

So reorder things in bdrv_reopen_multiple() to first call
.bdrv_reopen_prepare for all involved nodes and only then adjust
permissions.

Signed-off-by: Kevin Wolf 
---
 block.c | 35 ---
 1 file changed, 24 insertions(+), 11 deletions(-)

diff --git a/block.c b/block.c
index e18bd5eefd..9b9d25e843 100644
--- a/block.c
+++ b/block.c
@@ -1698,6 +1698,7 @@ static void bdrv_child_set_perm(BdrvChild *c, uint64_t 
perm, uint64_t shared);
 
 typedef struct BlockReopenQueueEntry {
  bool prepared;
+ bool perms_checked;
  BDRVReopenState state;
  QSIMPLEQ_ENTRY(BlockReopenQueueEntry) entry;
 } BlockReopenQueueEntry;
@@ -3166,6 +3167,16 @@ int bdrv_reopen_multiple(AioContext *ctx, 
BlockReopenQueue *bs_queue, Error **er
 bs_entry->prepared = true;
 }
 
+QSIMPLEQ_FOREACH(bs_entry, bs_queue, entry) {
+BDRVReopenState *state = _entry->state;
+ret = bdrv_check_perm(state->bs, bs_queue, state->perm,
+  state->shared_perm, NULL, errp);
+if (ret < 0) {
+goto cleanup_perm;
+}
+bs_entry->perms_checked = true;
+}
+
 /* If we reach this point, we have success and just need to apply the
  * changes
  */
@@ -3174,7 +3185,20 @@ int bdrv_reopen_multiple(AioContext *ctx, 
BlockReopenQueue *bs_queue, Error **er
 }
 
 ret = 0;
+cleanup_perm:
+QSIMPLEQ_FOREACH_SAFE(bs_entry, bs_queue, entry, next) {
+BDRVReopenState *state = _entry->state;
+
+if (!bs_entry->perms_checked) {
+continue;
+}
 
+if (ret == 0) {
+bdrv_set_perm(state->bs, state->perm, state->shared_perm);
+} else {
+bdrv_abort_perm_update(state->bs);
+}
+}
 cleanup:
 QSIMPLEQ_FOREACH_SAFE(bs_entry, bs_queue, entry, next) {
 if (ret) {
@@ -3428,12 +3452,6 @@ int bdrv_reopen_prepare(BDRVReopenState *reopen_state, 
BlockReopenQueue *queue,
 } while ((entry = qdict_next(reopen_state->options, entry)));
 }
 
-ret = bdrv_check_perm(reopen_state->bs, queue, reopen_state->perm,
-  reopen_state->shared_perm, NULL, errp);
-if (ret < 0) {
-goto error;
-}
-
 ret = 0;
 
 /* Restore the original reopen_state->options QDict */
@@ -3500,9 +3518,6 @@ void bdrv_reopen_commit(BDRVReopenState *reopen_state)
 
 bdrv_refresh_limits(bs, NULL);
 
-bdrv_set_perm(reopen_state->bs, reopen_state->perm,
-  reopen_state->shared_perm);
-
 new_can_write =
 !bdrv_is_read_only(bs) && !(bdrv_get_flags(bs) & BDRV_O_INACTIVE);
 if (!old_can_write && new_can_write && drv->bdrv_reopen_bitmaps_rw) {
@@ -3534,8 +3549,6 @@ void bdrv_reopen_abort(BDRVReopenState *reopen_state)
 if (drv->bdrv_reopen_abort) {
 drv->bdrv_reopen_abort(reopen_state);
 }
-
-bdrv_abort_perm_update(reopen_state->bs);
 }
 
 
-- 
2.20.1




[Qemu-block] [PULL 07/28] block: Avoid useless local_err

2019-03-12 Thread Kevin Wolf
Signed-off-by: Kevin Wolf 
Reviewed-by: Alberto Garcia 
Reviewed-by: Eric Blake 
---
 block.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/block.c b/block.c
index ccf008c177..e18bd5eefd 100644
--- a/block.c
+++ b/block.c
@@ -3155,14 +3155,12 @@ int bdrv_reopen_multiple(AioContext *ctx, 
BlockReopenQueue *bs_queue, Error **er
 {
 int ret = -1;
 BlockReopenQueueEntry *bs_entry, *next;
-Error *local_err = NULL;
 
 assert(bs_queue != NULL);
 
 QSIMPLEQ_FOREACH(bs_entry, bs_queue, entry) {
 assert(bs_entry->state.bs->quiesce_counter > 0);
-if (bdrv_reopen_prepare(_entry->state, bs_queue, _err)) {
-error_propagate(errp, local_err);
+if (bdrv_reopen_prepare(_entry->state, bs_queue, errp)) {
 goto cleanup;
 }
 bs_entry->prepared = true;
-- 
2.20.1




[Qemu-block] [PULL 02/28] gluster: the glfs_io_cbk callback function pointer adds pre/post stat args

2019-03-12 Thread Kevin Wolf
From: Niels de Vos 

The glfs_*_async() functions do a callback once finished. This callback
has changed its arguments, pre- and post-stat structures have been
added. This makes it possible to improve caching, which is useful for
Samba and NFS-Ganesha, but not so much for QEMU. Gluster 6 is the first
release that includes these new arguments.

With an additional detection in ./configure, the new arguments can
conditionally get included in the glfs_io_cbk handler.

Signed-off-by: Niels de Vos 
Signed-off-by: Kevin Wolf 
---
 configure   | 24 
 block/gluster.c |  6 +-
 2 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/configure b/configure
index 5354d51930..45a3654c5b 100755
--- a/configure
+++ b/configure
@@ -457,6 +457,7 @@ glusterfs_discard="no"
 glusterfs_fallocate="no"
 glusterfs_zerofill="no"
 glusterfs_ftruncate_has_stat="no"
+glusterfs_iocb_has_stat="no"
 gtk=""
 gtk_gl="no"
 tls_priority="NORMAL"
@@ -4105,6 +4106,25 @@ EOF
 if compile_prog "$glusterfs_cflags" "$glusterfs_libs" ; then
   glusterfs_ftruncate_has_stat="yes"
 fi
+cat > $TMPC << EOF
+#include 
+
+/* new glfs_io_cbk() passes two additional glfs_stat structs */
+static void
+glusterfs_iocb(glfs_fd_t *fd, ssize_t ret, struct glfs_stat *prestat, struct 
glfs_stat *poststat, void *data)
+{}
+
+int
+main(void)
+{
+   glfs_io_cbk iocb = _iocb;
+   iocb(NULL, 0 , NULL, NULL, NULL);
+   return 0;
+}
+EOF
+if compile_prog "$glusterfs_cflags" "$glusterfs_libs" ; then
+  glusterfs_iocb_has_stat="yes"
+fi
   else
 if test "$glusterfs" = "yes" ; then
   feature_not_found "GlusterFS backend support" \
@@ -6994,6 +7014,10 @@ if test "$glusterfs_ftruncate_has_stat" = "yes" ; then
   echo "CONFIG_GLUSTERFS_FTRUNCATE_HAS_STAT=y" >> $config_host_mak
 fi
 
+if test "$glusterfs_iocb_has_stat" = "yes" ; then
+  echo "CONFIG_GLUSTERFS_IOCB_HAS_STAT=y" >> $config_host_mak
+fi
+
 if test "$libssh2" = "yes" ; then
   echo "CONFIG_LIBSSH2=m" >> $config_host_mak
   echo "LIBSSH2_CFLAGS=$libssh2_cflags" >> $config_host_mak
diff --git a/block/gluster.c b/block/gluster.c
index f853aa87f4..51f184cbd8 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -729,7 +729,11 @@ static struct glfs 
*qemu_gluster_init(BlockdevOptionsGluster *gconf,
 /*
  * AIO callback routine called from GlusterFS thread.
  */
-static void gluster_finish_aiocb(struct glfs_fd *fd, ssize_t ret, void *arg)
+static void gluster_finish_aiocb(struct glfs_fd *fd, ssize_t ret,
+#ifdef CONFIG_GLUSTERFS_IOCB_HAS_STAT
+ struct glfs_stat *pre, struct glfs_stat *post,
+#endif
+ void *arg)
 {
 GlusterAIOCB *acb = (GlusterAIOCB *)arg;
 
-- 
2.20.1




[Qemu-block] [PULL 05/28] tests/virtio-blk-test: Disable auto-read-only

2019-03-12 Thread Kevin Wolf
tests/virtio-blk-test uses a temporary image file that it deletes while
QEMU is still running, so it can't be reopened when writers are
attached or detached. Disable auto-read-only to keep it always writable.

Signed-off-by: Kevin Wolf 
Reviewed-by: Eric Blake 
---
 tests/virtio-blk-test.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/virtio-blk-test.c b/tests/virtio-blk-test.c
index b02be0274e..b65365934b 100644
--- a/tests/virtio-blk-test.c
+++ b/tests/virtio-blk-test.c
@@ -751,7 +751,7 @@ static void *virtio_blk_test_setup(GString *cmd_line, void 
*arg)
 char *tmp_path = drive_create();
 
 g_string_append_printf(cmd_line,
-   " -drive if=none,id=drive0,file=%s,format=raw "
+   " -drive 
if=none,id=drive0,file=%s,format=raw,auto-read-only=off "
"-drive 
if=none,id=drive1,file=null-co://,format=raw ",
tmp_path);
 
-- 
2.20.1




Re: [Qemu-block] [PATCH v4 11/12] pc: Support firmware configuration with -blockdev

2019-03-12 Thread Laszlo Ersek
On 03/11/19 18:45, Paolo Bonzini wrote:
> On 11/03/19 18:42, Markus Armbruster wrote:
>> Incremental diff since v3:
> 
> Ack, thanks.

I'm OK with this change too (as it doesn't seem to touch parts of the
code that I could comment on ;) )

Thanks
Laszlo

>> diff --git a/hw/i386/pc_sysfw.c b/hw/i386/pc_sysfw.c
>> index ccf2221acb..c628540774 100644
>> --- a/hw/i386/pc_sysfw.c
>> +++ b/hw/i386/pc_sysfw.c
>> @@ -82,13 +82,19 @@ static void pc_isa_bios_init(MemoryRegion *rom_memory,
>>  memory_region_set_readonly(isa_bios, true);
>>  }
>>  
>> -static PFlashCFI01 *pc_pflash_create(const char *name)
>> +static PFlashCFI01 *pc_pflash_create(PCMachineState *pcms,
>> + const char *name,
>> + const char *alias_prop_name)
>>  {
>>  DeviceState *dev = qdev_create(NULL, TYPE_PFLASH_CFI01);
>>  
>>  qdev_prop_set_uint64(dev, "sector-length", FLASH_SECTOR_SIZE);
>>  qdev_prop_set_uint8(dev, "width", 1);
>>  qdev_prop_set_string(dev, "name", name);
>> +object_property_add_child(OBJECT(pcms), name, OBJECT(dev),
>> +  _abort);
>> +object_property_add_alias(OBJECT(pcms), alias_prop_name,
>> +  OBJECT(dev), "drive", _abort);
>>  return PFLASH_CFI01(dev);
>>  }
>>  
>> @@ -97,14 +103,10 @@ void pc_system_flash_create(PCMachineState *pcms)
>>  PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
>>  
>>  if (pcmc->pci_enabled) {
>> -pcms->flash[0] = pc_pflash_create("system.flash0");
>> -pcms->flash[1] = pc_pflash_create("system.flash1");
>> -object_property_add_alias(OBJECT(pcms), "pflash0",
>> -  OBJECT(pcms->flash[0]), "drive",
>> -  _abort);
>> -object_property_add_alias(OBJECT(pcms), "pflash1",
>> -  OBJECT(pcms->flash[1]), "drive",
>> -  _abort);
>> +pcms->flash[0] = pc_pflash_create(pcms, "system.flash0",
>> +  "pflash0");
>> +pcms->flash[1] = pc_pflash_create(pcms, "system.flash1",
>> +  "pflash1");
>>  }
>>  }
>>  
>> @@ -122,19 +124,7 @@ static void 
>> pc_system_flash_cleanup_unused(PCMachineState *pcms)
>>  prop_name = g_strdup_printf("pflash%d", i);
>>  object_property_del(OBJECT(pcms), prop_name, _abort);
>>  g_free(prop_name);
>> -/*
>> - * Delete @dev_obj.  Straight object_unref() is wrong,
>> - * since it leaves dangling references in the parent bus
>> - * behind.  object_unparent() doesn't work, either: since
>> - * @dev_obj hasn't been realized, dev_obj->parent is null,
>> - * and object_unparent() does nothing.  DeviceClass method
>> - * device_unparent() works, but we have to take a
>> - * temporary reference, or else device_unparent() commits
>> - * a use-after-free error.
>> - */
>> -object_ref(dev_obj);
>> -object_get_class(dev_obj)->unparent(dev_obj);
>> -object_unref(dev_obj);
>> +object_unparent(dev_obj);
>>  pcms->flash[i] = NULL;
>>  }
>>  }
>>
> 




[Qemu-block] [PATCH v3 13/13] qemu-iotests: Test the x-blockdev-reopen QMP command

2019-03-12 Thread Alberto Garcia
This patch adds several tests for the x-blockdev-reopen QMP command.

Signed-off-by: Alberto Garcia 
---
 tests/qemu-iotests/245 | 991 +
 tests/qemu-iotests/245.out |   5 +
 tests/qemu-iotests/group   |   1 +
 3 files changed, 997 insertions(+)
 create mode 100644 tests/qemu-iotests/245
 create mode 100644 tests/qemu-iotests/245.out

diff --git a/tests/qemu-iotests/245 b/tests/qemu-iotests/245
new file mode 100644
index 00..7891a210c1
--- /dev/null
+++ b/tests/qemu-iotests/245
@@ -0,0 +1,991 @@
+#!/usr/bin/env python
+#
+# Test cases for the QMP 'x-blockdev-reopen' command
+#
+# Copyright (C) 2018-2019 Igalia, S.L.
+# Author: Alberto Garcia 
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see .
+#
+
+import os
+import re
+import iotests
+import copy
+import json
+from iotests import qemu_img, qemu_io
+
+hd_path = [
+os.path.join(iotests.test_dir, 'hd0.img'),
+os.path.join(iotests.test_dir, 'hd1.img'),
+os.path.join(iotests.test_dir, 'hd2.img')
+]
+
+def hd_opts(idx):
+return {'driver': iotests.imgfmt,
+'node-name': 'hd%d' % idx,
+'file': {'driver': 'file',
+ 'node-name': 'hd%d-file' % idx,
+ 'filename':  hd_path[idx] } }
+
+class TestBlockdevReopen(iotests.QMPTestCase):
+total_io_cmds = 0
+
+def setUp(self):
+qemu_img('create', '-f', iotests.imgfmt, hd_path[0], '3M')
+qemu_img('create', '-f', iotests.imgfmt, '-b', hd_path[0], hd_path[1])
+qemu_img('create', '-f', iotests.imgfmt, hd_path[2], '3M')
+qemu_io('-f', iotests.imgfmt, '-c', 'write -P 0xa0  0 1M', hd_path[0])
+qemu_io('-f', iotests.imgfmt, '-c', 'write -P 0xa1 1M 1M', hd_path[1])
+qemu_io('-f', iotests.imgfmt, '-c', 'write -P 0xa2 2M 1M', hd_path[2])
+self.vm = iotests.VM()
+self.vm.launch()
+
+def tearDown(self):
+self.vm.shutdown()
+self.check_qemu_io_errors()
+os.remove(hd_path[0])
+os.remove(hd_path[1])
+os.remove(hd_path[2])
+
+# The output of qemu-io is not returned by vm.hmp_qemu_io() but
+# it's stored in the log and can only be read when the VM has been
+# shut down. This function runs qemu-io and keeps track of the
+# number of times it's been called.
+def run_qemu_io(self, img, cmd):
+result = self.vm.hmp_qemu_io(img, cmd)
+self.assert_qmp(result, 'return', '')
+self.total_io_cmds += 1
+
+# Once the VM is shut down we can parse the log and see if qemu-io
+# ran without errors.
+def check_qemu_io_errors(self):
+self.assertFalse(self.vm.is_running())
+found = 0
+log = self.vm.get_log()
+for line in log.split("\n"):
+if line.startswith("Pattern verification failed"):
+raise Exception("%s (command #%d)" % (line, found))
+if re.match("read .*/.* bytes at offset", line):
+found += 1
+self.assertEqual(found, self.total_io_cmds,
+ "Expected output of %d qemu-io commands, found %d" %
+ (found, self.total_io_cmds))
+
+# Run x-blockdev-reopen with 'opts' but applying 'newopts'
+# on top of it. The original 'opts' dict is unmodified
+def reopen(self, opts, newopts = {}, errmsg = None):
+opts = copy.deepcopy(opts)
+
+# Apply the changes from 'newopts' on top of 'opts'
+for key in newopts:
+value = newopts[key]
+# If key has the form "foo.bar" then we need to do
+# opts["foo"]["bar"] = value, not opts["foo.bar"] = value
+subdict = opts
+while key.find('.') != -1:
+[prefix, key] = key.split('.', 1)
+subdict = opts[prefix]
+subdict[key] = value
+
+result = self.vm.qmp('x-blockdev-reopen', conv_keys = False, **opts)
+if errmsg:
+self.assert_qmp(result, 'error/class', 'GenericError')
+self.assert_qmp(result, 'error/desc', errmsg)
+else:
+self.assert_qmp(result, 'return', {})
+
+
+# Run query-named-block-nodes and return the specified entry
+def get_node(self, node_name):
+result = self.vm.qmp('query-named-block-nodes')
+for node in result['return']:
+if node['node-name'] == node_name:
+   

[Qemu-block] [PATCH v3 10/13] block: Add bdrv_reset_options_allowed()

2019-03-12 Thread Alberto Garcia
bdrv_reopen_prepare() receives a BDRVReopenState with (among other
things) a new set of options to be applied to that BlockDriverState.

If an option is missing then it means that we want to reset it to its
default value rather than keep the previous one. This way the state
of the block device after being reopened is comparable to that of a
device added with "blockdev-add" using the same set of options.

Not all options from all drivers can be changed this way, however.
If the user attempts to reset an immutable option to its default value
using this method then we must forbid it.

This new function takes a BlockDriverState and a new set of options
and checks if there's any option that was previously set but is
missing from the new set of options.

If the option is present in both sets we don't need to check that they
have the same value. The loop at the end of bdrv_reopen_prepare()
already takes care of that.

Signed-off-by: Alberto Garcia 
---
 block.c | 58 ++
 1 file changed, 58 insertions(+)

diff --git a/block.c b/block.c
index c6786cd35b..a5e69ad81e 100644
--- a/block.c
+++ b/block.c
@@ -2983,6 +2983,53 @@ BlockDriverState *bdrv_open(const char *filename, const 
char *reference,
  NULL, errp);
 }
 
+/* Return true if the NULL-terminated @list contains @str */
+static bool is_str_in_list(const char *str, const char *const *list)
+{
+if (str && list) {
+int i;
+for (i = 0; list[i] != NULL; i++) {
+if (!strcmp(str, list[i])) {
+return true;
+}
+}
+}
+return false;
+}
+
+/*
+ * Check that every option set in @bs->options is also set in
+ * @new_opts.
+ *
+ * Options listed in the common_options list and in
+ * @bs->drv->mutable_opts are skipped.
+ *
+ * Return 0 on success, otherwise return -EINVAL and set @errp.
+ */
+static int bdrv_reset_options_allowed(BlockDriverState *bs,
+  const QDict *new_opts, Error **errp)
+{
+const QDictEntry *e;
+/* These options are common to all block drivers and are handled
+ * in bdrv_reopen_prepare() so they can be left out of @new_opts */
+const char *const common_options[] = {
+"node-name", "discard", "cache.direct", "cache.no-flush",
+"read-only", "auto-read-only", "detect-zeroes", NULL
+};
+
+for (e = qdict_first(bs->options); e; e = qdict_next(bs->options, e)) {
+if (!qdict_haskey(new_opts, e->key) &&
+!is_str_in_list(e->key, common_options) &&
+!is_str_in_list(e->key, bs->drv->mutable_opts)) {
+error_setg(errp, "Option '%s' cannot be reset "
+   "to its default value", e->key);
+return -EINVAL;
+}
+}
+
+return 0;
+}
+
 /*
  * Returns true if @child can be reached recursively from @bs
  */
@@ -3546,6 +3593,17 @@ int bdrv_reopen_prepare(BDRVReopenState *reopen_state, 
BlockReopenQueue *queue,
 }
 
 if (drv->bdrv_reopen_prepare) {
+/*
+ * If a driver-specific option is missing, it means that we
+ * should reset it to its default value.
+ * But not all options allow that, so we need to check it first.
+ */
+ret = bdrv_reset_options_allowed(reopen_state->bs,
+ reopen_state->options, errp);
+if (ret) {
+goto error;
+}
+
 ret = drv->bdrv_reopen_prepare(reopen_state, queue, _err);
 if (ret) {
 if (local_err != NULL) {
-- 
2.11.0




[Qemu-block] [PATCH v3 06/13] block: Handle child references in bdrv_reopen_queue()

2019-03-12 Thread Alberto Garcia
Children in QMP are specified with BlockdevRef / BlockdevRefOrNull,
which can contain a set of child options, a child reference, or
NULL. In optional attributes like "backing" it can also be missing.

Only the first case (set of child options) is being handled properly
by bdrv_reopen_queue(). This patch deals with all the others.

Here's how these cases should be handled when bdrv_reopen_queue() is
deciding what to do with each child of a BlockDriverState:

   1) Set of child options: if the child was implicitly created (i.e
  inherits_from points to the parent) then the options are removed
  from the parent's options QDict and are passed to the child with
  a recursive bdrv_reopen_queue() call. This case was already
  working fine.

   2) Child reference: there's two possibilites here.

  2a) Reference to the current child: if the child was implicitly
  created then it is put in the reopen queue, keeping its
  current set of options (since this was a child reference
  there was no way to specify a different set of options).
  If the child is not implicit then it keeps its current set
  of options but it is not reopened (and therefore does not
  inherit any new option from the parent).

  2b) Reference to a different BDS: the current child is not put
  in the reopen queue at all. Passing a reference to a
  different BDS can be used to replace a child, although at
  the moment no driver implements this, so it results in an
  error. In any case, the current child is not going to be
  reopened (and might in fact disappear if it's replaced)

   3) NULL: This is similar to (2b). Although no driver allows this
  yet it can be used to detach the current child so it should not
  be put in the reopen queue.

   4) Missing option: at the moment "backing" is the only case where
  this can happen. With "blockdev-add", leaving "backing" out
  means that the default backing file is opened. We don't want to
  open a new image during reopen, so we require that "backing" is
  always present. We'll relax this requirement a bit in the next
  patch. If keep_old_opts is true and "backing" is missing then
  this behaves like 2a (the current child is reopened).

Signed-off-by: Alberto Garcia 
---
 block.c   | 52 ---
 include/block/block.h |  1 +
 2 files changed, 46 insertions(+), 7 deletions(-)

diff --git a/block.c b/block.c
index 3cd281dbcc..9698f2ad44 100644
--- a/block.c
+++ b/block.c
@@ -3107,9 +3107,21 @@ static BlockReopenQueue 
*bdrv_reopen_queue_child(BlockReopenQueue *bs_queue,
 bs_entry->state.perm = UINT64_MAX;
 bs_entry->state.shared_perm = 0;
 
+/*
+ * If keep_old_opts is false then it means that unspecified
+ * options must be reset to their original value. We don't allow
+ * resetting 'backing' but we need to know if the option is
+ * missing in order to decide if we have to return an error.
+ */
+if (!keep_old_opts) {
+bs_entry->state.backing_missing =
+!qdict_haskey(options, "backing") &&
+!qdict_haskey(options, "backing.driver");
+}
+
 QLIST_FOREACH(child, >children, next) {
-QDict *new_child_options;
-char *child_key_dot;
+QDict *new_child_options = NULL;
+bool child_keep_old = keep_old_opts;
 
 /* reopen can only change the options of block devices that were
  * implicitly created and inherited options. For other (referenced)
@@ -3118,13 +3130,32 @@ static BlockReopenQueue 
*bdrv_reopen_queue_child(BlockReopenQueue *bs_queue,
 continue;
 }
 
-child_key_dot = g_strdup_printf("%s.", child->name);
-qdict_extract_subqdict(explicit_options, NULL, child_key_dot);
-qdict_extract_subqdict(options, _child_options, child_key_dot);
-g_free(child_key_dot);
+/* Check if the options contain a child reference */
+if (qdict_haskey(options, child->name)) {
+const char *childref = qdict_get_try_str(options, child->name);
+/*
+ * The current child must not be reopened if the child
+ * reference is null or points to a different node.
+ */
+if (g_strcmp0(childref, child->bs->node_name)) {
+continue;
+}
+/*
+ * If the child reference points to the current child then
+ * reopen it with its existing set of options (note that
+ * it can still inherit new options from the parent).
+ */
+child_keep_old = true;
+} else {
+/* Extract child options ("child-name.*") */
+char *child_key_dot = g_strdup_printf("%s.", child->name);
+qdict_extract_subqdict(explicit_options, NULL, child_key_dot);
+qdict_extract_subqdict(options, 

[Qemu-block] [PATCH v3 08/13] block: Allow changing the backing file on reopen

2019-03-12 Thread Alberto Garcia
This patch allows the user to change the backing file of an image that
is being reopened. Here's what it does:

 - In bdrv_reopen_prepare(): check that the value of 'backing' points
   to an existing node or is null. If it points to an existing node it
   also needs to make sure that replacing the backing file will not
   create a cycle in the node graph (i.e. you cannot reach the parent
   from the new backing file).

 - In bdrv_reopen_commit(): perform the actual node replacement by
   calling bdrv_set_backing_hd().

There may be temporary implicit nodes between a BDS and its backing
file (e.g. a commit filter node). In these cases bdrv_reopen_prepare()
looks for the real (non-implicit) backing file and requires that the
'backing' option points to it. Replacing or detaching a backing file
is forbidden if there are implicit nodes in the middle.

Although x-blockdev-reopen is meant to be used like blockdev-add,
there's an important thing that must be taken into account: the only
way to set a new backing file is by using a reference to an existing
node (previously added with e.g. blockdev-add).  If 'backing' contains
a dictionary with a new set of options ({"driver": "qcow2", "file": {
... }}) then it is interpreted that the _existing_ backing file must
be reopened with those options.

Signed-off-by: Alberto Garcia 
---
 block.c   | 166 ++
 include/block/block.h |   2 +
 2 files changed, 168 insertions(+)

diff --git a/block.c b/block.c
index c043f5eadd..c6786cd35b 100644
--- a/block.c
+++ b/block.c
@@ -2984,6 +2984,27 @@ BlockDriverState *bdrv_open(const char *filename, const 
char *reference,
 }
 
 /*
+ * Returns true if @child can be reached recursively from @bs
+ */
+static bool bdrv_recurse_has_child(BlockDriverState *bs,
+   BlockDriverState *child)
+{
+BdrvChild *c;
+
+if (bs == child) {
+return true;
+}
+
+QLIST_FOREACH(c, >children, next) {
+if (bdrv_recurse_has_child(c->bs, child)) {
+return true;
+}
+}
+
+return false;
+}
+
+/*
  * Adds a BlockDriverState to a simple queue for an atomic, transactional
  * reopen of multiple devices.
  *
@@ -3208,6 +3229,19 @@ int bdrv_reopen_multiple(AioContext *ctx, 
BlockReopenQueue *bs_queue, Error **er
 if (ret < 0) {
 goto cleanup_perm;
 }
+/* Check if new_backing_bs would accept the new permissions */
+if (state->replace_backing_bs && state->new_backing_bs) {
+uint64_t nperm, nshared;
+bdrv_child_perm(state->bs, state->new_backing_bs,
+NULL, _backing, bs_queue,
+state->perm, state->shared_perm,
+, );
+ret = bdrv_check_update_perm(state->new_backing_bs, NULL,
+ nperm, nshared, NULL, errp);
+if (ret < 0) {
+goto cleanup_perm;
+}
+}
 bs_entry->perms_checked = true;
 }
 
@@ -3231,6 +3265,9 @@ cleanup_perm:
 bdrv_set_perm(state->bs, state->perm, state->shared_perm);
 } else {
 bdrv_abort_perm_update(state->bs);
+if (state->replace_backing_bs && state->new_backing_bs) {
+bdrv_abort_perm_update(state->new_backing_bs);
+}
 }
 }
 cleanup:
@@ -3242,6 +3279,9 @@ cleanup:
 qobject_unref(bs_entry->state.explicit_options);
 qobject_unref(bs_entry->state.options);
 }
+if (bs_entry->state.new_backing_bs) {
+bdrv_unref(bs_entry->state.new_backing_bs);
+}
 g_free(bs_entry);
 }
 g_free(bs_queue);
@@ -3314,6 +3354,101 @@ static void bdrv_reopen_perm(BlockReopenQueue *q, 
BlockDriverState *bs,
 }
 
 /*
+ * Take a BDRVReopenState and check if the value of 'backing' in the
+ * reopen_state->options QDict is valid or not.
+ *
+ * If 'backing' is missing from the QDict then return 0.
+ *
+ * If 'backing' contains the node name of the backing file of
+ * reopen_state->bs then return 0.
+ *
+ * If 'backing' contains a different node name (or is null) then check
+ * whether the current backing file can be replaced with the new one.
+ * If that's the case then reopen_state->replace_backing_bs is set to
+ * true and reopen_state->new_backing_bs contains a pointer to the new
+ * backing BlockDriverState (or NULL).
+ *
+ * Return 0 on success, otherwise return < 0 and set @errp.
+ */
+static int bdrv_reopen_parse_backing(BDRVReopenState *reopen_state,
+ Error **errp)
+{
+BlockDriverState *bs = reopen_state->bs;
+BlockDriverState *overlay_bs, *new_backing_bs;
+QObject *value;
+const char *str;
+
+value = qdict_get(reopen_state->options, "backing");
+if (value == NULL) {
+return 0;
+}
+
+switch (qobject_type(value)) {
+case QTYPE_QNULL:
+

[Qemu-block] [PATCH v3 04/13] block: Freeze the backing chain for the duration of the stream job

2019-03-12 Thread Alberto Garcia
Signed-off-by: Alberto Garcia 
---
 block/stream.c | 21 +
 1 file changed, 21 insertions(+)

diff --git a/block/stream.c b/block/stream.c
index e14579ff80..6253c86fae 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -35,6 +35,7 @@ typedef struct StreamBlockJob {
 BlockdevOnError on_error;
 char *backing_file_str;
 bool bs_read_only;
+bool chain_frozen;
 } StreamBlockJob;
 
 static int coroutine_fn stream_populate(BlockBackend *blk,
@@ -49,6 +50,16 @@ static int coroutine_fn stream_populate(BlockBackend *blk,
 return blk_co_preadv(blk, offset, qiov.size, , BDRV_REQ_COPY_ON_READ);
 }
 
+static void stream_abort(Job *job)
+{
+StreamBlockJob *s = container_of(job, StreamBlockJob, common.job);
+
+if (s->chain_frozen) {
+BlockJob *bjob = >common;
+bdrv_unfreeze_backing_chain(blk_bs(bjob->blk), s->base);
+}
+}
+
 static int stream_prepare(Job *job)
 {
 StreamBlockJob *s = container_of(job, StreamBlockJob, common.job);
@@ -58,6 +69,9 @@ static int stream_prepare(Job *job)
 Error *local_err = NULL;
 int ret = 0;
 
+bdrv_unfreeze_backing_chain(bs, base);
+s->chain_frozen = false;
+
 if (bs->backing) {
 const char *base_id = NULL, *base_fmt = NULL;
 if (base) {
@@ -208,6 +222,7 @@ static const BlockJobDriver stream_job_driver = {
 .free  = block_job_free,
 .run   = stream_run,
 .prepare   = stream_prepare,
+.abort = stream_abort,
 .clean = stream_clean,
 .user_resume   = block_job_user_resume,
 .drain = block_job_drain,
@@ -254,9 +269,15 @@ void stream_start(const char *job_id, BlockDriverState *bs,
_abort);
 }
 
+if (bdrv_freeze_backing_chain(bs, base, errp) < 0) {
+job_early_fail(>common.job);
+goto fail;
+}
+
 s->base = base;
 s->backing_file_str = g_strdup(backing_file_str);
 s->bs_read_only = bs_read_only;
+s->chain_frozen = true;
 
 s->on_error = on_error;
 trace_stream_start(bs, base, s);
-- 
2.11.0




[Qemu-block] [PATCH v3 12/13] block: Add an 'x-blockdev-reopen' QMP command

2019-03-12 Thread Alberto Garcia
This command allows reopening an arbitrary BlockDriverState with a
new set of options. Some options (e.g node-name) cannot be changed
and some block drivers don't allow reopening, but otherwise this
command is modelled after 'blockdev-add' and the state of the reopened
BlockDriverState should generally be the same as if it had just been
added by 'blockdev-add' with the same set of options.

One notable exception is the 'backing' option: 'x-blockdev-reopen'
requires that it is always present unless the BlockDriverState in
question doesn't have a current or default backing file.

This command allows reconfiguring the graph by using the appropriate
options to change the children of a node. At the moment it's possible
to change a backing file by setting the 'backing' option to the name
of the new node, but it should also be possible to add a similar
functionality to other block drivers (e.g. Quorum, blkverify).

Although the API is unlikely to change, this command is marked
experimental for the time being so there's room to see if the
semantics need changes.

Signed-off-by: Alberto Garcia 
---
 blockdev.c   | 47 +++
 qapi/block-core.json | 42 ++
 2 files changed, 89 insertions(+)

diff --git a/blockdev.c b/blockdev.c
index 84dd678289..3cc51b2ee7 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -4287,6 +4287,53 @@ fail:
 visit_free(v);
 }
 
+void qmp_x_blockdev_reopen(BlockdevOptions *options, Error **errp)
+{
+BlockDriverState *bs;
+AioContext *ctx;
+QObject *obj;
+Visitor *v = qobject_output_visitor_new();
+Error *local_err = NULL;
+BlockReopenQueue *queue;
+QDict *qdict;
+
+/* Check for the selected node name */
+if (!options->has_node_name) {
+error_setg(errp, "Node name not specified");
+goto fail;
+}
+
+bs = bdrv_find_node(options->node_name);
+if (!bs) {
+error_setg(errp, "Cannot find node named '%s'", options->node_name);
+goto fail;
+}
+
+/* Put all options in a QDict and flatten it */
+visit_type_BlockdevOptions(v, NULL, , _err);
+if (local_err) {
+error_propagate(errp, local_err);
+goto fail;
+}
+
+visit_complete(v, );
+qdict = qobject_to(QDict, obj);
+
+qdict_flatten(qdict);
+
+/* Perform the reopen operation */
+ctx = bdrv_get_aio_context(bs);
+aio_context_acquire(ctx);
+bdrv_subtree_drained_begin(bs);
+queue = bdrv_reopen_queue(NULL, bs, qdict, false);
+bdrv_reopen_multiple(queue, errp);
+bdrv_subtree_drained_end(bs);
+aio_context_release(ctx);
+
+fail:
+visit_free(v);
+}
+
 void qmp_blockdev_del(const char *node_name, Error **errp)
 {
 AioContext *aio_context;
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 7377fecb46..474e268c4d 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -3998,6 +3998,48 @@
 { 'command': 'blockdev-add', 'data': 'BlockdevOptions', 'boxed': true }
 
 ##
+# @x-blockdev-reopen:
+#
+# Reopens a block device using the given set of options. Any option
+# not specified will be reset to its default value regardless of its
+# previous status. If an option cannot be changed or a particular
+# driver does not support reopening then the command will return an
+# error.
+#
+# The top-level @node-name option (from BlockdevOptions) must be
+# specified and is used to select the block device to be reopened.
+# Other @node-name options must be either omitted or set to the
+# current name of the appropriate node. This command won't change any
+# node name and any attempt to do it will result in an error.
+#
+# In the case of options that refer to child nodes, the behavior of
+# this command depends on the value:
+#
+#  1) A set of options (BlockdevOptions): the child is reopened with
+# the specified set of options.
+#
+#  2) A reference to the current child: the child is reopened using
+# its existing set of options.
+#
+#  3) A reference to a different node: the current child is replaced
+# with the specified one.
+#
+#  4) NULL: the current child (if any) is detached.
+#
+# Options (1) and (2) are supported in all cases, but at the moment
+# only @backing allows replacing or detaching an existing child.
+#
+# Unlike with blockdev-add, the @backing option must always be present
+# unless the node being reopened does not have a backing file and its
+# image does not have a default backing file name as part of its
+# metadata.
+#
+# Since: 4.0
+##
+{ 'command': 'x-blockdev-reopen',
+  'data': 'BlockdevOptions', 'boxed': true }
+
+##
 # @blockdev-del:
 #
 # Deletes a block device that has been added using blockdev-add.
-- 
2.11.0




[Qemu-block] [PATCH v3 09/13] block: Add a 'mutable_opts' field to BlockDriver

2019-03-12 Thread Alberto Garcia
If we reopen a BlockDriverState and there is an option that is present
in bs->options but missing from the new set of options then we have to
return an error unless the driver is able to reset it to its default
value.

This patch adds a new 'mutable_opts' field to BlockDriver. This is
a list of runtime options that can be modified during reopen. If an
option in this list is unspecified on reopen then it must be reset (or
return an error).

Signed-off-by: Alberto Garcia 
---
 block/file-posix.c|  6 ++
 block/qcow2.c | 25 +
 block/raw-format.c|  3 +++
 include/block/block_int.h |  8 
 4 files changed, 42 insertions(+)

diff --git a/block/file-posix.c b/block/file-posix.c
index d41059d139..e9fa6aac48 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -442,6 +442,8 @@ static QemuOptsList raw_runtime_opts = {
 },
 };
 
+static const char *const mutable_opts[] = { "x-check-cache-dropped", NULL };
+
 static int raw_open_common(BlockDriverState *bs, QDict *options,
int bdrv_flags, int open_flags,
bool device, Error **errp)
@@ -2870,6 +2872,7 @@ BlockDriver bdrv_file = {
 .bdrv_set_perm   = raw_set_perm,
 .bdrv_abort_perm_update = raw_abort_perm_update,
 .create_opts = _create_opts,
+.mutable_opts = mutable_opts,
 };
 
 /***/
@@ -3321,6 +3324,7 @@ static BlockDriver bdrv_host_device = {
 .bdrv_reopen_abort   = raw_reopen_abort,
 .bdrv_co_create_opts = hdev_co_create_opts,
 .create_opts = _create_opts,
+.mutable_opts= mutable_opts,
 .bdrv_co_invalidate_cache = raw_co_invalidate_cache,
 .bdrv_co_pwrite_zeroes = hdev_co_pwrite_zeroes,
 
@@ -3447,6 +3451,7 @@ static BlockDriver bdrv_host_cdrom = {
 .bdrv_reopen_abort   = raw_reopen_abort,
 .bdrv_co_create_opts = hdev_co_create_opts,
 .create_opts = _create_opts,
+.mutable_opts= mutable_opts,
 .bdrv_co_invalidate_cache = raw_co_invalidate_cache,
 
 
@@ -3580,6 +3585,7 @@ static BlockDriver bdrv_host_cdrom = {
 .bdrv_reopen_abort   = raw_reopen_abort,
 .bdrv_co_create_opts = hdev_co_create_opts,
 .create_opts= _create_opts,
+.mutable_opts   = mutable_opts,
 
 .bdrv_co_preadv = raw_co_preadv,
 .bdrv_co_pwritev= raw_co_pwritev,
diff --git a/block/qcow2.c b/block/qcow2.c
index c4dd876fb4..0fc9b0561e 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -627,6 +627,30 @@ int qcow2_validate_table(BlockDriverState *bs, uint64_t 
offset,
 return 0;
 }
 
+static const char *const mutable_opts[] = {
+QCOW2_OPT_LAZY_REFCOUNTS,
+QCOW2_OPT_DISCARD_REQUEST,
+QCOW2_OPT_DISCARD_SNAPSHOT,
+QCOW2_OPT_DISCARD_OTHER,
+QCOW2_OPT_OVERLAP,
+QCOW2_OPT_OVERLAP_TEMPLATE,
+QCOW2_OPT_OVERLAP_MAIN_HEADER,
+QCOW2_OPT_OVERLAP_ACTIVE_L1,
+QCOW2_OPT_OVERLAP_ACTIVE_L2,
+QCOW2_OPT_OVERLAP_REFCOUNT_TABLE,
+QCOW2_OPT_OVERLAP_REFCOUNT_BLOCK,
+QCOW2_OPT_OVERLAP_SNAPSHOT_TABLE,
+QCOW2_OPT_OVERLAP_INACTIVE_L1,
+QCOW2_OPT_OVERLAP_INACTIVE_L2,
+QCOW2_OPT_OVERLAP_BITMAP_DIRECTORY,
+QCOW2_OPT_CACHE_SIZE,
+QCOW2_OPT_L2_CACHE_SIZE,
+QCOW2_OPT_L2_CACHE_ENTRY_SIZE,
+QCOW2_OPT_REFCOUNT_CACHE_SIZE,
+QCOW2_OPT_CACHE_CLEAN_INTERVAL,
+NULL
+};
+
 static QemuOptsList qcow2_runtime_opts = {
 .name = "qcow2",
 .head = QTAILQ_HEAD_INITIALIZER(qcow2_runtime_opts.head),
@@ -5275,6 +5299,7 @@ BlockDriver bdrv_qcow2 = {
 
 .create_opts = _create_opts,
 .strong_runtime_opts = qcow2_strong_runtime_opts,
+.mutable_opts= mutable_opts,
 .bdrv_co_check   = qcow2_co_check,
 .bdrv_amend_options  = qcow2_amend_options,
 
diff --git a/block/raw-format.c b/block/raw-format.c
index e3e5ba2c8a..cec29986cc 100644
--- a/block/raw-format.c
+++ b/block/raw-format.c
@@ -37,6 +37,8 @@ typedef struct BDRVRawState {
 bool has_size;
 } BDRVRawState;
 
+static const char *const mutable_opts[] = { "offset", "size", NULL };
+
 static QemuOptsList raw_runtime_opts = {
 .name = "raw",
 .head = QTAILQ_HEAD_INITIALIZER(raw_runtime_opts.head),
@@ -570,6 +572,7 @@ BlockDriver bdrv_raw = {
 .create_opts  = _create_opts,
 .bdrv_has_zero_init   = _has_zero_init,
 .strong_runtime_opts  = raw_strong_runtime_opts,
+.mutable_opts = mutable_opts,
 };
 
 static void bdrv_raw_init(void)
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 438feba4e5..01e855a066 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -383,6 +383,14 @@ struct BlockDriver {
 
 /* List of options for creating images, terminated by name == NULL */
 QemuOptsList *create_opts;
+/*
+ * If this driver supports reopening images this contains a
+ * NULL-terminated list of the runtime options that can be
+ * modified. If an option in this list is 

[Qemu-block] [PATCH v3 11/13] block: Remove the AioContext parameter from bdrv_reopen_multiple()

2019-03-12 Thread Alberto Garcia
This parameter has been unused since 1a63a907507fbbcfaee3f622907ec244b

Signed-off-by: Alberto Garcia 
---
 block.c   | 4 ++--
 block/replication.c   | 3 +--
 include/block/block.h | 2 +-
 qemu-io-cmds.c| 2 +-
 4 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/block.c b/block.c
index a5e69ad81e..ed9253c786 100644
--- a/block.c
+++ b/block.c
@@ -3254,7 +3254,7 @@ BlockReopenQueue *bdrv_reopen_queue(BlockReopenQueue 
*bs_queue,
  * All affected nodes must be drained between bdrv_reopen_queue() and
  * bdrv_reopen_multiple().
  */
-int bdrv_reopen_multiple(AioContext *ctx, BlockReopenQueue *bs_queue, Error 
**errp)
+int bdrv_reopen_multiple(BlockReopenQueue *bs_queue, Error **errp)
 {
 int ret = -1;
 BlockReopenQueueEntry *bs_entry, *next;
@@ -3347,7 +3347,7 @@ int bdrv_reopen_set_read_only(BlockDriverState *bs, bool 
read_only,
 
 bdrv_subtree_drained_begin(bs);
 queue = bdrv_reopen_queue(NULL, bs, opts, true);
-ret = bdrv_reopen_multiple(bdrv_get_aio_context(bs), queue, errp);
+ret = bdrv_reopen_multiple(queue, errp);
 bdrv_subtree_drained_end(bs);
 
 return ret;
diff --git a/block/replication.c b/block/replication.c
index a2f3590310..b95bd28802 100644
--- a/block/replication.c
+++ b/block/replication.c
@@ -385,8 +385,7 @@ static void reopen_backing_file(BlockDriverState *bs, bool 
writable,
 }
 
 if (reopen_queue) {
-bdrv_reopen_multiple(bdrv_get_aio_context(bs),
- reopen_queue, _err);
+bdrv_reopen_multiple(reopen_queue, _err);
 error_propagate(errp, local_err);
 }
 
diff --git a/include/block/block.h b/include/block/block.h
index 68a3efbb43..e452988b66 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -304,7 +304,7 @@ BlockDriverState *bdrv_new_open_driver(BlockDriver *drv, 
const char *node_name,
 BlockReopenQueue *bdrv_reopen_queue(BlockReopenQueue *bs_queue,
 BlockDriverState *bs, QDict *options,
 bool keep_old_opts);
-int bdrv_reopen_multiple(AioContext *ctx, BlockReopenQueue *bs_queue, Error 
**errp);
+int bdrv_reopen_multiple(BlockReopenQueue *bs_queue, Error **errp);
 int bdrv_reopen_set_read_only(BlockDriverState *bs, bool read_only,
   Error **errp);
 int bdrv_reopen_prepare(BDRVReopenState *reopen_state,
diff --git a/qemu-io-cmds.c b/qemu-io-cmds.c
index ff9a5cd80f..35dcdcf413 100644
--- a/qemu-io-cmds.c
+++ b/qemu-io-cmds.c
@@ -2081,7 +2081,7 @@ static int reopen_f(BlockBackend *blk, int argc, char 
**argv)
 
 bdrv_subtree_drained_begin(bs);
 brq = bdrv_reopen_queue(NULL, bs, opts, true);
-bdrv_reopen_multiple(bdrv_get_aio_context(bs), brq, _err);
+bdrv_reopen_multiple(brq, _err);
 bdrv_subtree_drained_end(bs);
 
 if (local_err) {
-- 
2.11.0




[Qemu-block] [PATCH v3 02/13] block: Freeze the backing chain for the duration of the commit job

2019-03-12 Thread Alberto Garcia
Signed-off-by: Alberto Garcia 
---
 block/commit.c | 16 
 1 file changed, 16 insertions(+)

diff --git a/block/commit.c b/block/commit.c
index 3b46ca7f97..ba60fef58a 100644
--- a/block/commit.c
+++ b/block/commit.c
@@ -39,6 +39,7 @@ typedef struct CommitBlockJob {
 BlockDriverState *base_bs;
 BlockdevOnError on_error;
 bool base_read_only;
+bool chain_frozen;
 char *backing_file_str;
 } CommitBlockJob;
 
@@ -68,6 +69,9 @@ static int commit_prepare(Job *job)
 {
 CommitBlockJob *s = container_of(job, CommitBlockJob, common.job);
 
+bdrv_unfreeze_backing_chain(s->commit_top_bs, s->base_bs);
+s->chain_frozen = false;
+
 /* Remove base node parent that still uses BLK_PERM_WRITE/RESIZE before
  * the normal backing chain can be restored. */
 blk_unref(s->base);
@@ -84,6 +88,10 @@ static void commit_abort(Job *job)
 CommitBlockJob *s = container_of(job, CommitBlockJob, common.job);
 BlockDriverState *top_bs = blk_bs(s->top);
 
+if (s->chain_frozen) {
+bdrv_unfreeze_backing_chain(s->commit_top_bs, s->base_bs);
+}
+
 /* Make sure commit_top_bs and top stay around until bdrv_replace_node() */
 bdrv_ref(top_bs);
 bdrv_ref(s->commit_top_bs);
@@ -330,6 +338,11 @@ void commit_start(const char *job_id, BlockDriverState *bs,
 }
 }
 
+if (bdrv_freeze_backing_chain(commit_top_bs, base, errp) < 0) {
+goto fail;
+}
+s->chain_frozen = true;
+
 ret = block_job_add_bdrv(>common, "base", base, 0, BLK_PERM_ALL, errp);
 if (ret < 0) {
 goto fail;
@@ -362,6 +375,9 @@ void commit_start(const char *job_id, BlockDriverState *bs,
 return;
 
 fail:
+if (s->chain_frozen) {
+bdrv_unfreeze_backing_chain(commit_top_bs, base);
+}
 if (s->base) {
 blk_unref(s->base);
 }
-- 
2.11.0




[Qemu-block] [PATCH v3 01/13] block: Allow freezing BdrvChild links

2019-03-12 Thread Alberto Garcia
Our permission system is useful to define what operations are allowed
on a certain block node and includes things like BLK_PERM_WRITE or
BLK_PERM_RESIZE among others.

One of the permissions is BLK_PERM_GRAPH_MOD which allows "changing
the node that this BdrvChild points to". The exact meaning of this has
never been very clear, but it can be understood as "change any of the
links connected to the node". This can be used to prevent changing a
backing link, but it's too coarse.

This patch adds a new 'frozen' attribute to BdrvChild, which forbids
detaching the link from the node it points to, and new API to freeze
and unfreeze a backing chain.

After this change a few functions can fail, so they need additional
checks.

Signed-off-by: Alberto Garcia 
---
 block.c   | 76 +++
 include/block/block.h |  5 
 include/block/block_int.h |  6 
 3 files changed, 87 insertions(+)

diff --git a/block.c b/block.c
index 33804cdcaa..a2bffce38d 100644
--- a/block.c
+++ b/block.c
@@ -2126,6 +2126,8 @@ static void bdrv_replace_child_noperm(BdrvChild *child,
 BlockDriverState *old_bs = child->bs;
 int i;
 
+assert(!child->frozen);
+
 if (old_bs && new_bs) {
 assert(bdrv_get_aio_context(old_bs) == bdrv_get_aio_context(new_bs));
 }
@@ -2342,6 +2344,10 @@ void bdrv_set_backing_hd(BlockDriverState *bs, 
BlockDriverState *backing_hd,
 bool update_inherits_from = bdrv_chain_contains(bs, backing_hd) &&
 bdrv_inherits_from_recursive(backing_hd, bs);
 
+if (bdrv_is_backing_chain_frozen(bs, backing_bs(bs), errp)) {
+return;
+}
+
 if (backing_hd) {
 bdrv_ref(backing_hd);
 }
@@ -3721,6 +3727,11 @@ void bdrv_replace_node(BlockDriverState *from, 
BlockDriverState *to,
 if (!should_update_child(c, to)) {
 continue;
 }
+if (c->frozen) {
+error_setg(errp, "Cannot change '%s' link to '%s'",
+   c->name, from->node_name);
+goto out;
+}
 list = g_slist_prepend(list, c);
 perm |= c->perm;
 shared &= c->shared_perm;
@@ -3933,6 +3944,63 @@ BlockDriverState *bdrv_find_base(BlockDriverState *bs)
 }
 
 /*
+ * Return true if at least one of the backing links between @bs and
+ * @base is frozen. @errp is set if that's the case.
+ */
+bool bdrv_is_backing_chain_frozen(BlockDriverState *bs, BlockDriverState *base,
+  Error **errp)
+{
+BlockDriverState *i;
+
+for (i = bs; i != base && i->backing; i = backing_bs(i)) {
+if (i->backing->frozen) {
+error_setg(errp, "Cannot change '%s' link from '%s' to '%s'",
+   i->backing->name, i->node_name,
+   backing_bs(i)->node_name);
+return true;
+}
+}
+
+return false;
+}
+
+/*
+ * Freeze all backing links between @bs and @base.
+ * If any of the links is already frozen the operation is aborted and
+ * none of the links are modified.
+ * Returns 0 on success. On failure returns < 0 and sets @errp.
+ */
+int bdrv_freeze_backing_chain(BlockDriverState *bs, BlockDriverState *base,
+  Error **errp)
+{
+BlockDriverState *i;
+
+if (bdrv_is_backing_chain_frozen(bs, base, errp)) {
+return -EPERM;
+}
+
+for (i = bs; i != base && i->backing; i = backing_bs(i)) {
+i->backing->frozen = true;
+}
+
+return 0;
+}
+
+/*
+ * Unfreeze all backing links between @bs and @base. The caller must
+ * ensure that all links are frozen before using this function.
+ */
+void bdrv_unfreeze_backing_chain(BlockDriverState *bs, BlockDriverState *base)
+{
+BlockDriverState *i;
+
+for (i = bs; i != base && i->backing; i = backing_bs(i)) {
+assert(i->backing->frozen);
+i->backing->frozen = false;
+}
+}
+
+/*
  * Drops images above 'base' up to and including 'top', and sets the image
  * above 'top' to have base as its backing file.
  *
@@ -3981,6 +4049,14 @@ int bdrv_drop_intermediate(BlockDriverState *top, 
BlockDriverState *base,
 goto exit;
 }
 
+/* This function changes all links that point to top and makes
+ * them point to base. Check that none of them is frozen. */
+QLIST_FOREACH(c, >parents, next_parent) {
+if (c->frozen) {
+goto exit;
+}
+}
+
 /* If 'base' recursively inherits from 'top' then we should set
  * base->inherits_from to top->inherits_from after 'top' and all
  * other intermediate nodes have been dropped.
diff --git a/include/block/block.h b/include/block/block.h
index 6a758a76f8..b0c04f16c8 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -353,6 +353,11 @@ int bdrv_drop_intermediate(BlockDriverState *top, 
BlockDriverState *base,
 BlockDriverState *bdrv_find_overlay(BlockDriverState *active,
 BlockDriverState *bs);
 BlockDriverState 

[Qemu-block] [PATCH v3 07/13] block: Allow omitting the 'backing' option in certain cases

2019-03-12 Thread Alberto Garcia
Of all options of type BlockdevRef used to specify children in
BlockdevOptions, 'backing' is the only one that is optional.

For "x-blockdev-reopen" we want that if an option is omitted then it
must be reset to its default value. The default value of 'backing'
means that QEMU opens the backing file specified in the image
metadata, but this is not something that we want to support for the
reopen operation.

Because of this the 'backing' option has to be specified during
reopen, pointing to the existing backing file if we want to keep it,
or pointing to a different one (or NULL) if we want to replace it (to
be implemented in a subsequent patch).

In order to simplify things a bit and not to require that the user
passes the 'backing' option to every single block device even when
it's clearly not necessary, this patch allows omitting this option if
the block device being reopened doesn't have a backing file attached
_and_ no default backing file is specified in the image metadata.

Signed-off-by: Alberto Garcia 
---
 block.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/block.c b/block.c
index 9698f2ad44..c043f5eadd 100644
--- a/block.c
+++ b/block.c
@@ -3434,7 +3434,13 @@ int bdrv_reopen_prepare(BDRVReopenState *reopen_state, 
BlockReopenQueue *queue,
 
 drv_prepared = true;
 
-if (drv->supports_backing && reopen_state->backing_missing) {
+/*
+ * We must provide the 'backing' option if the BDS has a backing
+ * file or if the image file has a backing file name as part of
+ * its metadata. Otherwise the 'backing' option can be omitted.
+ */
+if (drv->supports_backing && reopen_state->backing_missing &&
+(backing_bs(reopen_state->bs) || reopen_state->bs->backing_file[0])) {
 error_setg(errp, "backing is missing for '%s'",
reopen_state->bs->node_name);
 ret = -EINVAL;
-- 
2.11.0




[Qemu-block] [PATCH v3 03/13] block: Freeze the backing chain for the duration of the mirror job

2019-03-12 Thread Alberto Garcia
Signed-off-by: Alberto Garcia 
---
 block/mirror.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/block/mirror.c b/block/mirror.c
index 726d3c27fb..010fdafd79 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -630,6 +630,10 @@ static int mirror_exit_common(Job *job)
 }
 s->prepared = true;
 
+if (bdrv_chain_contains(src, target_bs)) {
+bdrv_unfreeze_backing_chain(mirror_top_bs, target_bs);
+}
+
 bdrv_release_dirty_bitmap(src, s->dirty_bitmap);
 
 /* Make sure that the source BDS doesn't go away during bdrv_replace_node,
@@ -1639,6 +1643,10 @@ static void mirror_start_job(const char *job_id, 
BlockDriverState *bs,
 goto fail;
 }
 }
+
+if (bdrv_freeze_backing_chain(mirror_top_bs, target, errp) < 0) {
+goto fail;
+}
 }
 
 QTAILQ_INIT(>ops_in_flight);
-- 
2.11.0




[Qemu-block] [PATCH v3 00/13] Add a 'x-blockdev-reopen' QMP command

2019-03-12 Thread Alberto Garcia
Hi,

here's a new version of the patches to add a new QMP command for
bdrv_reopen().

Refer to the cover letter of v1 for a complete description of the
feature:

   https://lists.gnu.org/archive/html/qemu-block/2019-01/msg00623.html

Regards,

Berto

v3:
- Rebase on top of Kevin's block branch (e5a4df8ec3e30b37187954b4368).
- Patch 8: move the permission checks to bdrv_reopen_multiple() after
  "block: Make permission changes in reopen less wrong"
- Patch 13: change iotest number and adjust output after the rebase.

v2: https://lists.gnu.org/archive/html/qemu-block/2019-03/msg00195.html
- Patch 1: Update documentation, fix check in bdrv_drop_intermediate()
  and add a new one in bdrv_replace_node() [Kevin]
- Patch 2: Add a missing unfreeze call if commit_start() fails, and
  prevent a double unfreeze in commit_abort() [Kevin]
- Patch 4: Prevent double unfreeze in stream_abort() [Kevin]
- Patch 6: Update documentation and don't complain for a missing
  'backing' option if the driver does not support backing files [Kevin]
- Patch 7: Contextual differences due to changes in previous patches.
- Patch 8: Forbid changing the backing file if there are implicit
  nodes on the way. Simplify code and update the permission checks to
  handle a new backing file [Kevin, Berto]
- Patch 9: Remove BlockDriver.runtime_opts from the previous version
  and use only mutable_opts [Kevin]
- Patch 10: Compare bs->options with reopen_state->options to see what
  options the user is trying to reset [Kevin]
- Patch 13: Update the tests to reflect all the new changes.

v1: https://lists.gnu.org/archive/html/qemu-block/2019-01/msg00623.html
- Patch 10: forbid setting a new backing file with a different
  AioContext.
- Patch 13 (new): Remove unused parameter from bdrv_reopen_multiple.
- Patch 14: Acquire the AioContext before calling
  bdrv_reopen_multiple().
- Patch 15: More test cases.
- Patches 3, 8, 9, 11, 12: scripts/checkpatch.pl is more picky now
  with the format of multi-line comments, so correct them.

RFCv2: https://lists.gnu.org/archive/html/qemu-block/2018-11/msg00901.html

Output of backport-diff against v2:

Key:
[] : patches are identical
[] : number of functional differences between upstream/downstream patch
[down] : patch is downstream-only
The flags [FC] indicate (F)unctional and (C)ontextual differences, respectively

001/13:[] [--] 'block: Allow freezing BdrvChild links'
002/13:[] [--] 'block: Freeze the backing chain for the duration of the 
commit job'
003/13:[] [--] 'block: Freeze the backing chain for the duration of the 
mirror job'
004/13:[] [--] 'block: Freeze the backing chain for the duration of the 
stream job'
005/13:[] [--] 'block: Add 'keep_old_opts' parameter to bdrv_reopen_queue()'
006/13:[] [--] 'block: Handle child references in bdrv_reopen_queue()'
007/13:[] [--] 'block: Allow omitting the 'backing' option in certain cases'
008/13:[0042] [FC] 'block: Allow changing the backing file on reopen'
009/13:[] [--] 'block: Add a 'mutable_opts' field to BlockDriver'
010/13:[] [--] 'block: Add bdrv_reset_options_allowed()'
011/13:[] [--] 'block: Remove the AioContext parameter from 
bdrv_reopen_multiple()'
012/13:[] [--] 'block: Add an 'x-blockdev-reopen' QMP command'
013/13:[0004] [FC] 'qemu-iotests: Test the x-blockdev-reopen QMP command'

Alberto Garcia (13):
  block: Allow freezing BdrvChild links
  block: Freeze the backing chain for the duration of the commit job
  block: Freeze the backing chain for the duration of the mirror job
  block: Freeze the backing chain for the duration of the stream job
  block: Add 'keep_old_opts' parameter to bdrv_reopen_queue()
  block: Handle child references in bdrv_reopen_queue()
  block: Allow omitting the 'backing' option in certain cases
  block: Allow changing the backing file on reopen
  block: Add a 'mutable_opts' field to BlockDriver
  block: Add bdrv_reset_options_allowed()
  block: Remove the AioContext parameter from bdrv_reopen_multiple()
  block: Add an 'x-blockdev-reopen' QMP command
  qemu-iotests: Test the x-blockdev-reopen QMP command

 block.c| 394 --
 block/commit.c |  16 +
 block/file-posix.c |   6 +
 block/mirror.c |   8 +
 block/qcow2.c  |  25 ++
 block/raw-format.c |   3 +
 block/replication.c|   7 +-
 block/stream.c |  21 +
 blockdev.c |  47 +++
 include/block/block.h  |  13 +-
 include/block/block_int.h  |  14 +
 qapi/block-core.json   |  42 ++
 qemu-io-cmds.c |   4 +-
 tests/qemu-iotests/245 | 991 +
 tests/qemu-iotests/245.out |   5 +
 tests/qemu-iotests/group   |   1 +
 16 files changed, 1566 insertions(+), 31 deletions(-)
 create mode 100644 tests/qemu-iotests/245
 create mode 100644 tests/qemu-iotests/245.out

-- 
2.11.0




[Qemu-block] [PATCH v3 05/13] block: Add 'keep_old_opts' parameter to bdrv_reopen_queue()

2019-03-12 Thread Alberto Garcia
The bdrv_reopen_queue() function is used to create a queue with
the BDSs that are going to be reopened and their new options. Once
the queue is ready bdrv_reopen_multiple() is called to perform the
operation.

The original options from each one of the BDSs are kept, with the new
options passed to bdrv_reopen_queue() applied on top of them.

For "x-blockdev-reopen" we want a function that behaves much like
"blockdev-add". We want to ignore the previous set of options so that
only the ones actually specified by the user are applied, with the
rest having their default values.

One of the things that we need is a way to tell bdrv_reopen_queue()
whether we want to keep the old set of options or not, and that's what
this patch does. All current callers are setting this new parameter to
true and x-blockdev-reopen will set it to false.

Signed-off-by: Alberto Garcia 
---
 block.c   | 34 +++---
 block/replication.c   |  4 ++--
 include/block/block.h |  3 ++-
 qemu-io-cmds.c|  2 +-
 4 files changed, 24 insertions(+), 19 deletions(-)

diff --git a/block.c b/block.c
index a2bffce38d..3cd281dbcc 100644
--- a/block.c
+++ b/block.c
@@ -3010,7 +3010,8 @@ static BlockReopenQueue 
*bdrv_reopen_queue_child(BlockReopenQueue *bs_queue,
  QDict *options,
  const BdrvChildRole *role,
  QDict *parent_options,
- int parent_flags)
+ int parent_flags,
+ bool keep_old_opts)
 {
 assert(bs != NULL);
 
@@ -3050,13 +3051,13 @@ static BlockReopenQueue 
*bdrv_reopen_queue_child(BlockReopenQueue *bs_queue,
  */
 
 /* Old explicitly set values (don't overwrite by inherited value) */
-if (bs_entry) {
-old_options = qdict_clone_shallow(bs_entry->state.explicit_options);
-} else {
-old_options = qdict_clone_shallow(bs->explicit_options);
+if (bs_entry || keep_old_opts) {
+old_options = qdict_clone_shallow(bs_entry ?
+  bs_entry->state.explicit_options :
+  bs->explicit_options);
+bdrv_join_options(bs, options, old_options);
+qobject_unref(old_options);
 }
-bdrv_join_options(bs, options, old_options);
-qobject_unref(old_options);
 
 explicit_options = qdict_clone_shallow(options);
 
@@ -3068,10 +3069,12 @@ static BlockReopenQueue 
*bdrv_reopen_queue_child(BlockReopenQueue *bs_queue,
 flags = bdrv_get_flags(bs);
 }
 
-/* Old values are used for options that aren't set yet */
-old_options = qdict_clone_shallow(bs->options);
-bdrv_join_options(bs, options, old_options);
-qobject_unref(old_options);
+if (keep_old_opts) {
+/* Old values are used for options that aren't set yet */
+old_options = qdict_clone_shallow(bs->options);
+bdrv_join_options(bs, options, old_options);
+qobject_unref(old_options);
+}
 
 /* We have the final set of options so let's update the flags */
 options_copy = qdict_clone_shallow(options);
@@ -3121,7 +3124,7 @@ static BlockReopenQueue 
*bdrv_reopen_queue_child(BlockReopenQueue *bs_queue,
 g_free(child_key_dot);
 
 bdrv_reopen_queue_child(bs_queue, child->bs, new_child_options,
-child->role, options, flags);
+child->role, options, flags, keep_old_opts);
 }
 
 return bs_queue;
@@ -3129,9 +3132,10 @@ static BlockReopenQueue 
*bdrv_reopen_queue_child(BlockReopenQueue *bs_queue,
 
 BlockReopenQueue *bdrv_reopen_queue(BlockReopenQueue *bs_queue,
 BlockDriverState *bs,
-QDict *options)
+QDict *options, bool keep_old_opts)
 {
-return bdrv_reopen_queue_child(bs_queue, bs, options, NULL, NULL, 0);
+return bdrv_reopen_queue_child(bs_queue, bs, options, NULL, NULL, 0,
+   keep_old_opts);
 }
 
 /*
@@ -3224,7 +3228,7 @@ int bdrv_reopen_set_read_only(BlockDriverState *bs, bool 
read_only,
 qdict_put_bool(opts, BDRV_OPT_READ_ONLY, read_only);
 
 bdrv_subtree_drained_begin(bs);
-queue = bdrv_reopen_queue(NULL, bs, opts);
+queue = bdrv_reopen_queue(NULL, bs, opts, true);
 ret = bdrv_reopen_multiple(bdrv_get_aio_context(bs), queue, errp);
 bdrv_subtree_drained_end(bs);
 
diff --git a/block/replication.c b/block/replication.c
index 4c80b54daf..a2f3590310 100644
--- a/block/replication.c
+++ b/block/replication.c
@@ -374,14 +374,14 @@ static void reopen_backing_file(BlockDriverState *bs, 
bool writable,
 QDict *opts = qdict_new();
 qdict_put_bool(opts, BDRV_OPT_READ_ONLY, !writable);
 reopen_queue = 

Re: [Qemu-block] [Qemu-devel] [PATCH v2 01/13] block: Allow freezing BdrvChild links

2019-03-12 Thread Vladimir Sementsov-Ogievskiy
06.03.2019 21:11, Alberto Garcia wrote:
> Our permission system is useful to define what operations are allowed
> on a certain block node and includes things like BLK_PERM_WRITE or
> BLK_PERM_RESIZE among others.
> 
> One of the permissions is BLK_PERM_GRAPH_MOD which allows "changing
> the node that this BdrvChild points to". The exact meaning of this has
> never been very clear, but it can be understood as "change any of the
> links connected to the node". This can be used to prevent changing a
> backing link, but it's too coarse.
> 
> This patch adds a new 'frozen' attribute to BdrvChild, which forbids
> detaching the link from the node it points to, and new API to freeze
> and unfreeze a backing chain.
> 
> After this change a few functions can fail, so they need additional
> checks.
> 
> Signed-off-by: Alberto Garcia 
> ---
>   block.c   | 76 
> +++
>   include/block/block.h |  5 
>   include/block/block_int.h |  6 
>   3 files changed, 87 insertions(+)
> 
> diff --git a/block.c b/block.c
> index 35e78e2172..6e9c72f0cd 100644
> --- a/block.c
> +++ b/block.c
> @@ -2127,6 +2127,8 @@ static void bdrv_replace_child_noperm(BdrvChild *child,
>   BlockDriverState *old_bs = child->bs;
>   int i;
>   
> +assert(!child->frozen);
> +
>   if (old_bs && new_bs) {
>   assert(bdrv_get_aio_context(old_bs) == 
> bdrv_get_aio_context(new_bs));
>   }
> @@ -2343,6 +2345,10 @@ void bdrv_set_backing_hd(BlockDriverState *bs, 
> BlockDriverState *backing_hd,
>   bool update_inherits_from = bdrv_chain_contains(bs, backing_hd) &&
>   bdrv_inherits_from_recursive(backing_hd, bs);
>   
> +if (bdrv_is_backing_chain_frozen(bs, backing_bs(bs), errp)) {
> +return;
> +}
> +
>   if (backing_hd) {
>   bdrv_ref(backing_hd);
>   }
> @@ -3712,6 +3718,11 @@ void bdrv_replace_node(BlockDriverState *from, 
> BlockDriverState *to,
>   if (!should_update_child(c, to)) {
>   continue;
>   }
> +if (c->frozen) {
> +error_setg(errp, "Cannot change '%s' link to '%s'",
> +   c->name, from->node_name);
> +goto out;
> +}
>   list = g_slist_prepend(list, c);
>   perm |= c->perm;
>   shared &= c->shared_perm;
> @@ -3924,6 +3935,63 @@ BlockDriverState *bdrv_find_base(BlockDriverState *bs)
>   }
>   
>   /*
> + * Return true if at least one of the backing links between @bs and
> + * @base is frozen. @errp is set if that's the case.
> + */
> +bool bdrv_is_backing_chain_frozen(BlockDriverState *bs, BlockDriverState 
> *base,
> +  Error **errp)
> +{
> +BlockDriverState *i;
> +
> +for (i = bs; i != base && i->backing; i = backing_bs(i)) {
> +if (i->backing->frozen) {
> +error_setg(errp, "Cannot change '%s' link from '%s' to '%s'",
> +   i->backing->name, i->node_name,
> +   backing_bs(i)->node_name);
> +return true;
> +}
> +}
> +
> +return false;
> +}
> +
> +/*
> + * Freeze all backing links between @bs and @base.
> + * If any of the links is already frozen the operation is aborted and
> + * none of the links are modified.
> + * Returns 0 on success. On failure returns < 0 and sets @errp.
> + */
> +int bdrv_freeze_backing_chain(BlockDriverState *bs, BlockDriverState *base,
> +  Error **errp)
> +{
> +BlockDriverState *i;
> +
> +if (bdrv_is_backing_chain_frozen(bs, base, errp)) {
> +return -EPERM;
> +}
> +
> +for (i = bs; i != base && i->backing; i = backing_bs(i)) {
> +i->backing->frozen = true;
> +}
> +
> +return 0;
> +}
> +
> +/*
> + * Unfreeze all backing links between @bs and @base. The caller must
> + * ensure that all links are frozen before using this function.
> + */
> +void bdrv_unfreeze_backing_chain(BlockDriverState *bs, BlockDriverState 
> *base)
> +{
> +BlockDriverState *i;
> +
> +for (i = bs; i != base && i->backing; i = backing_bs(i)) {
> +assert(i->backing->frozen);
> +i->backing->frozen = false;
> +}
> +}
> +
> +/*
>* Drops images above 'base' up to and including 'top', and sets the image
>* above 'top' to have base as its backing file.
>*
> @@ -3972,6 +4040,14 @@ int bdrv_drop_intermediate(BlockDriverState *top, 
> BlockDriverState *base,
>   goto exit;
>   }
>   
> +/* This function changes all links that point to top and makes
> + * them point to base. Check that none of them is frozen. */
> +QLIST_FOREACH(c, >parents, next_parent) {
> +if (c->frozen) {
> +goto exit;
> +}
> +}
> +
>   /* If 'base' recursively inherits from 'top' then we should set
>* base->inherits_from to top->inherits_from after 'top' and all
>* other intermediate nodes have been dropped.

Hmm, looking at this I 

Re: [Qemu-block] [Qemu-devel] [PATCH v2 0/4] block/qcow2-bitmap: Enable resize with persistent bitmaps

2019-03-12 Thread John Snow



On 3/11/19 2:51 PM, Vladimir Sementsov-Ogievskiy wrote:
> Hi!
> 
> This is my counter-propasal on
> "[PATCH 0/5] block/qcow2-bitmap: Enable resize with persistent bitmaps" by 
> John.
> 
> It does bitmap resize, but don't flush bitmaps during resize, postponing this
> controversial thing.
> 
> Based-on: https://github.com/jnsnow/qemu bitmaps
> 
> John Snow (2):
>   block/qcow2-bitmap: Allow resizes with persistent bitmaps
>   tests/qemu-iotests: add bitmap resize test 246
> 
> Vladimir Sementsov-Ogievskiy (2):
>   docs/interop/qcow2: Improve bitmap flag in_use specification
>   block/qcow2-bitmap: Don't check size for IN_USE bitmap
> 
>  docs/interop/qcow2.txt |  18 ++-
>  block/qcow2.h  |   1 +
>  block/qcow2-bitmap.c   |  67 -
>  block/qcow2.c  |   4 +-
>  tests/qemu-iotests/246 | 114 ++
>  tests/qemu-iotests/246.out | 295 +
>  tests/qemu-iotests/group   |   1 +
>  7 files changed, 491 insertions(+), 9 deletions(-)
>  create mode 100755 tests/qemu-iotests/246
>  create mode 100644 tests/qemu-iotests/246.out
> 

Alright, in the interest of moving forward I will stage this, we can
figure out if it's a problem when I am not sick :)

--js



Re: [Qemu-block] [Qemu-devel] [PATCH RESEND v4] drive-mirror: add incremental mode

2019-03-12 Thread Vladimir Sementsov-Ogievskiy
12.03.2019 18:58, John Snow wrote:
> 
> 
> On 3/12/19 9:43 AM, Eric Blake wrote:
>> On 3/12/19 3:19 AM, Vladimir Sementsov-Ogievskiy wrote:
>>
 So one important point about incremental backups is that you can
 actually do them incrementally: The bitmap is effectively cleared at the
 beginning of the backup process (a successor bitmap is installed that is
 cleared and receives all changes; at the end of the backup, it either
 replaces the old bitmap (on success) or is merged into it (on failure)).
Therefore, you can do the next incremental backup by using the same 
 bitmap.
>>>
>>> Hmm, I heard in some other thread that Eric finally decided to always start
>>> backup on copied bitmap in libvirt, so this logic is skipped. Do we need it
>>> for mirror? Sorry if I'm wrong, Eric, am I?
>>
>> You are correct that my current libvirt patches do incremental backup
>> mode with a temporary bitmap (so regardless of whether the temporary
>> bitmap is cleared on success or left alone is irrelevant, the temporary
>> bitmap goes away afterwards). But just because libvirt isn't using that
>> particular feature of qemu's incremental backups does not excuse qemu
>> from not thinking about other clients that might be impacted by doing
>> incremental backup with a live bitmap, where qemu management of the
>> bitmap matters.
>>
> 
> I'd prefer adding SYNC=DIFFERENTIAL to intuit that the bitmap isn't
> modified after use, if there's no reason to add an actual "incremental"
> mode to mirror.
> 
> Then it would be fine for mirror to implement differential and not
> incremental, and still use the bitmap.
> 

Agree, this is better.

But I have an idea: could we just add parameter @x-bitmap, independent of @sync 
?

In this case, we can get NONE mode reduced by bitmap, which may be usable too.
And FULL mode with bitmap would be what you called DIFFERENTIAL. Not sure about
could TOP mode with bitmap be meaningful.


Also it may be useful at some point to share bitmap between jobs, so we could

start backup(sync=none) from active disk to local temp node, and then mirror 
from temp
to target. Which gives backup which (1) don't slow down guest writes if target 
is far and slow
and (2) fast as it is mostly mirror, keeping in mind that mirror is faster than 
backup as it uses
async pattern and block_status.

And to make such a backup incremental we need to share dirty bitmap between 
backup and mirror,
and this bitmap should be cleared when something is copied out from source (it 
may be cleared
both by backup(sync=none) and mirror)... But I'm not sure that shared bitmap 
should be the same
as bitmap which initializes differential/incremental mode. And this is why I 
propose new parameter
to be experimental.

-- 
Best regards,
Vladimir


Re: [Qemu-block] [Qemu-devel] [PATCH v3 11/12] pc: Support firmware configuration with -blockdev

2019-03-12 Thread Laszlo Ersek
On 03/12/19 07:52, Markus Armbruster wrote:
> Philippe Mathieu-Daudé  writes:
> 
>> Le lun. 11 mars 2019 17:02, Markus Armbruster  a écrit :
>>
>>> Paolo Bonzini  writes:
> [...]
 Does it work if you add the device yourself as a child of /machine,
 instead of relying on /machine/unattached?
>>>
>>> I figure you're suggesting something like this incremental patch:
>>>
>>>diff --git a/hw/i386/pc_sysfw.c b/hw/i386/pc_sysfw.c
>>>index ccf2221acb..9d3a80c51a 100644
>>>--- a/hw/i386/pc_sysfw.c
>>>+++ b/hw/i386/pc_sysfw.c
>>>@@ -99,6 +99,10 @@ void pc_system_flash_create(PCMachineState *pcms)
>>> if (pcmc->pci_enabled) {
>>> pcms->flash[0] = pc_pflash_create("system.flash0");
>>> pcms->flash[1] = pc_pflash_create("system.flash1");
>>>+object_property_add_child(OBJECT(pcms), "pfl0",
>>>+  OBJECT(pcms->flash[0]),
>>> _abort);
>>>+object_property_add_child(OBJECT(pcms), "pfl1",
>>>+  OBJECT(pcms->flash[1]),
>>> _abort);
>>> object_property_add_alias(OBJECT(pcms), "pflash0",
>>>   OBJECT(pcms->flash[0]), "drive",
>>>   _abort);
>>>@@ -122,19 +126,7 @@ static void
>>> pc_system_flash_cleanup_unused(PCMachineState *pcms)
>>> prop_name = g_strdup_printf("pflash%d", i);
>>> object_property_del(OBJECT(pcms), prop_name, _abort);
>>> g_free(prop_name);
>>>-/*
>>>- * Delete @dev_obj.  Straight object_unref() is wrong,
>>>- * since it leaves dangling references in the parent bus
>>>- * behind.  object_unparent() doesn't work, either: since
>>>- * @dev_obj hasn't been realized, dev_obj->parent is null,
>>>- * and object_unparent() does nothing.  DeviceClass method
>>>- * device_unparent() works, but we have to take a
>>>- * temporary reference, or else device_unparent() commits
>>>- * a use-after-free error.
>>>- */
>>>-object_ref(dev_obj);
>>>-object_get_class(dev_obj)->unparent(dev_obj);
>>>-object_unref(dev_obj);
>>>+object_unparent(dev_obj);
>>> pcms->flash[i] = NULL;
>>> }
>>> }
>>>
>>> Passes "make check" and "info qtree" looks good both with and without
>>> -machine pflash0,...
>>>
>>> I'm not exactly happy with "pfl0" and "pfl1".  Got a favorite color for
>>> my bikeshed?
>>>
>>
>> ovmf_code & ovmf_vars?
> 
> Tempting, since we use these guys only for OVMF so far.  However, we
> could also use them for SeaBIOS[*], or whatever other firmware comes up.
> 
> [...]
> 
> 
> [*] See also
> Subject: Re: Configuring pflash devices for OVMF firmware
> Message-ID: <9c4e222f-3941-426e-3195-5598b2af1...@redhat.com>
> https://lists.nongnu.org/archive/html/qemu-devel/2019-01/msg08145.html
> 

I think I'd slightly prefer if we didn't use the term "OVMF" in this
context.

Thanks
Laszlo



Re: [Qemu-block] [Qemu-devel] [PATCH RESEND v4] drive-mirror: add incremental mode

2019-03-12 Thread John Snow



On 3/12/19 9:43 AM, Eric Blake wrote:
> On 3/12/19 3:19 AM, Vladimir Sementsov-Ogievskiy wrote:
> 
>>> So one important point about incremental backups is that you can
>>> actually do them incrementally: The bitmap is effectively cleared at the
>>> beginning of the backup process (a successor bitmap is installed that is
>>> cleared and receives all changes; at the end of the backup, it either
>>> replaces the old bitmap (on success) or is merged into it (on failure)).
>>>   Therefore, you can do the next incremental backup by using the same 
>>> bitmap.
>>
>> Hmm, I heard in some other thread that Eric finally decided to always start
>> backup on copied bitmap in libvirt, so this logic is skipped. Do we need it
>> for mirror? Sorry if I'm wrong, Eric, am I?
> 
> You are correct that my current libvirt patches do incremental backup
> mode with a temporary bitmap (so regardless of whether the temporary
> bitmap is cleared on success or left alone is irrelevant, the temporary
> bitmap goes away afterwards). But just because libvirt isn't using that
> particular feature of qemu's incremental backups does not excuse qemu
> from not thinking about other clients that might be impacted by doing
> incremental backup with a live bitmap, where qemu management of the
> bitmap matters.
> 

I'd prefer adding SYNC=DIFFERENTIAL to intuit that the bitmap isn't
modified after use, if there's no reason to add an actual "incremental"
mode to mirror.

Then it would be fine for mirror to implement differential and not
incremental, and still use the bitmap.

--js



Re: [Qemu-block] [PATCH v2 08/13] block: Allow changing the backing file on reopen

2019-03-12 Thread Alberto Garcia
On Tue 12 Mar 2019 01:20:54 PM CET, Kevin Wolf wrote:
>> +reopen_state->replace_backing_bs = true;
>> +reopen_state->new_backing_bs = new_backing_bs;
>
> Do we need to bdrv_ref(new_backing_bs) here in case its only reference
> is dropped in the same reopen transaction?

I'm not sure if it can happen with the current code, but the reopen
state should have an extra reference so I'll add it.

>> +/* Check if new_backing_bs would accept the new permissions */
>> +if (reopen_state->replace_backing_bs && reopen_state->new_backing_bs) {
>> +uint64_t cur_perm, cur_shared;
>> +bdrv_child_perm(reopen_state->bs, reopen_state->new_backing_bs,
>> +NULL, _backing, NULL,
>
> Why do we pass NULL instead of queue here? Shouldn't the required
> permissions be calculated based on the post-reopen state?

You're right, I'll fix it.

Berto



Re: [Qemu-block] [PATCH 0/5] block/qcow2-bitmap: Enable resize with persistent bitmaps

2019-03-12 Thread Vladimir Sementsov-Ogievskiy
12.03.2019 13:20, Kevin Wolf wrote:
> Am 11.03.2019 um 19:35 hat John Snow geschrieben:
>> On 3/11/19 2:26 PM, Vladimir Sementsov-Ogievskiy wrote:
>>> Also, we have no comments from Max and Kevin who are maintainers of the 
>>> only file,
>>> changed in the series. And why this don't go through their trees?
>>
>> I've discussed it with Kevin in the IRC channel; I don't think he has
>> strong feelings on flushing bitmap metadata versus not as much as he
>> cares that we make sure we leave openable images on crash. In general I
>> think bitmaps are "our problem".
> 
> This, pretty much. I haven't been involved much in the bitmap code
> before, and today I have other series to deal with for soft freeze. So
> I'll trust you to do the right thing.
> 
> And the right thing means at least that after the resize operation has
> returned (either success or failure), a crash doesn't lead to an image
> file that can't be opened again.
> 
> Ideally, the same would be true for a crash in the middle the resize
> operation, but as John explained to me that we require an exact match of
> the size,

Not exact, and bitmap size is not stored in the image. But it is checked,
that bitmap table is enough to store bitmap corresponding to current image size.

  this doesn't seem to completely work out without more changes:
> Either a way to update both atomically (journal?) or at least accept
> oversized bitmaps so we can just choose the right order. The latter
> sounds tempting to me, though I haven't put much thought in the
> consequences this would have.

I think correct way is not doing this check for inconsistent bitmaps, 
considering
their size to be possibly outdated. And I've sent a counter proposal which goes
this way.

-- 
Best regards,
Vladimir


Re: [Qemu-block] [PATCH v2 4/4] tests/qemu-iotests: add bitmap resize test 246

2019-03-12 Thread Eric Blake
On 3/11/19 1:51 PM, Vladimir Sementsov-Ogievskiy wrote:
> From: John Snow 
> 
> Test that we can actually resize qcow2 images with persistent bitmaps
> correctly. Throw some other goofy stuff at the test while we're at it,
> like adding bitmaps of different granularities and at different times.
> 
> Signed-off-by: John Snow 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
>[vsmentsov: drop \n from the end of test output,
>   test output changed a bit: some bitmaps goes in other order
>   int the output]

s/int/in/

I'm guessing the output order changed because we are no longer flushing
things at intermediate steps?

I have not reviewed this (or John's original patch) closely, but trust
that John can determine if this is worth including for soft freeze
today. But I can state that I applied all four patches of this series,
to at least check that they build and the new iotests passes, so you can
add:

Tested-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH 2/2] block-stream: include base into job node list

2019-03-12 Thread Andrey Shinkevich


On 08/03/2019 17:00, Alberto Garcia wrote:
> On Thu 21 Feb 2019 04:26:39 PM CET, Andrey Shinkevich wrote:
>> The block-stream job needs to own the base node as the limiter
>> of the copy-on-read operation. So, the base node is included in
>> the job node list (block_job_add_bdrv).
>> Also, the block-stream job would not allow the base node to go
>> away due to the graph modification, e.g. when a filter node is
>> inserted between the bottom node and the base node.
>> For that reason, the flag BLK_PERM_GRAPH_MOD is unset in the
>> shared permission bit mask of the base node.
>>
>> Signed-off-by: Andrey Shinkevich 
>> ---
>>   block/stream.c | 9 +
>>   1 file changed, 9 insertions(+)
>>
>> diff --git a/block/stream.c b/block/stream.c
>> index 7a49ac0..c8f93d4 100644
>> --- a/block/stream.c
>> +++ b/block/stream.c
>> @@ -259,6 +259,15 @@ void stream_start(const char *job_id, BlockDriverState 
>> *bs,
>>  _abort);
>>   }
>>   
>> +if (base) {
>> +/*
>> + * The base node should not disappear during the job.
>> + */
>> +block_job_add_bdrv(>common, "base", base, 0,
>> +   BLK_PERM_ALL & ~BLK_PERM_GRAPH_MOD,
>> +   _abort);
> 
> I'm not sure if I understand the ~BLK_PERM_GRAPH_MOD bit, what's the
> consequence of not removing that permission?
> 
> Berto
> 
Alberto,
Thank you for your question.
There will be neither a bad consequence, nor a good one.
That means the block_job_add_bdrv() does not help to protect the graph
from modification. So, "freezing BdrvChild links" may be a better
solution if the chain includes the 'base' node.
-- 
With the best regards,
Andrey Shinkevich


Re: [Qemu-block] [Qemu-devel] [PATCH RESEND v4] drive-mirror: add incremental mode

2019-03-12 Thread Eric Blake
On 3/12/19 3:19 AM, Vladimir Sementsov-Ogievskiy wrote:

>> So one important point about incremental backups is that you can
>> actually do them incrementally: The bitmap is effectively cleared at the
>> beginning of the backup process (a successor bitmap is installed that is
>> cleared and receives all changes; at the end of the backup, it either
>> replaces the old bitmap (on success) or is merged into it (on failure)).
>>   Therefore, you can do the next incremental backup by using the same bitmap.
> 
> Hmm, I heard in some other thread that Eric finally decided to always start
> backup on copied bitmap in libvirt, so this logic is skipped. Do we need it
> for mirror? Sorry if I'm wrong, Eric, am I?

You are correct that my current libvirt patches do incremental backup
mode with a temporary bitmap (so regardless of whether the temporary
bitmap is cleared on success or left alone is irrelevant, the temporary
bitmap goes away afterwards). But just because libvirt isn't using that
particular feature of qemu's incremental backups does not excuse qemu
from not thinking about other clients that might be impacted by doing
incremental backup with a live bitmap, where qemu management of the
bitmap matters.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [Qemu-devel] [PATCH] tcmu: Introduce qemu-tcmu utility

2019-03-12 Thread Fam Zheng



> On Mar 11, 2019, at 19:06, Kevin Wolf  wrote:
> 
> Am 09.03.2019 um 02:46 hat Yaowei Bai geschrieben:
>> Thanks for explaining the background. It comes to my mind that actually we
>> talked about these two cases with Fam a bit long time ago and decided to
>> support both these two cases. The reason why we implement case2 first is
>> currently we care more about exporting new opened images and it's a bit
>> more convenient, exporting from a VM or QMP can be added in the later
>> release. Do you think it's reasonable/acceptable that we support both
>> cases and use case2 for normal new opened images and case1 for the
>> circumstances you mention above?
> 
> I would like to avoid a second code path because it comes with a
> maintenance cost.
> 
> Experience also tells that adding a new way to parse option strings will
> come back at us later because it we must always maintain compatibility
> with yet another format.

If the rule is that cfgstr strictly follows -blockdev syntax and the parsing 
code is mostly shared, it shouldn’t be a problem, right? I suppose this will 
limit the expressiveness of cfgstr but might be a reasonable tradeoff between 
implementation complexity, user friendliness and interface consistency.

Another possibility is to use json: format in cfgstr for anything more complex 
than a single -blockdev.

> 
> So I would prefer not doing this and just passing command line options
> to qemu-tcmu, which can reuse the already existing code paths.

I think the effect of not supporting adding new blockdev from cfgstr is that we 
have to resort to QMP to allow hot-adding targets. It’s not necessarily bad, 
though I’m not sure hwo that aligns with Yaowei’s development plan.

Fam


> 
> Kevin
> 





Re: [Qemu-block] [PATCH v2 09/13] block: Add a 'mutable_opts' field to BlockDriver

2019-03-12 Thread Kevin Wolf
Am 12.03.2019 um 13:32 hat Kevin Wolf geschrieben:
> Am 06.03.2019 um 19:11 hat Alberto Garcia geschrieben:
> > If we reopen a BlockDriverState and there is an option that is present
> > in bs->options but missing from the new set of options then we have to
> > return an error unless the driver is able to reset it to its default
> > value.
> > 
> > This patch adds a new 'mutable_opts' field to BlockDriver. This is
> > a list of runtime options that can be modified during reopen. If an
> > option in this list is unspecified on reopen then it must be reset (or
> > return an error).
> > 
> > Signed-off-by: Alberto Garcia 
> > ---
> >  block/file-posix.c|  6 ++
> >  block/qcow2.c | 25 +
> >  block/raw-format.c|  3 +++
> >  include/block/block_int.h |  8 
> >  4 files changed, 42 insertions(+)
> 
> Two more drivers seem to be able to change options: gluster (debug and
> logfile) and throttle (throttle-group).

Actually those aren't necessary in this patch: throttle-group isn't even
optional, so it never has to be reset; and I misread the gluster code,
it just re-adds the old values for debug and logfile and doesn't
actually accept changes.

So the patch looks fine, after all.

Kevin



Re: [Qemu-block] [PATCH v2 09/13] block: Add a 'mutable_opts' field to BlockDriver

2019-03-12 Thread Alberto Garcia
On Tue 12 Mar 2019 01:32:00 PM CET, Kevin Wolf wrote:
> Am 06.03.2019 um 19:11 hat Alberto Garcia geschrieben:
>> If we reopen a BlockDriverState and there is an option that is present
>> in bs->options but missing from the new set of options then we have to
>> return an error unless the driver is able to reset it to its default
>> value.
>> 
>> This patch adds a new 'mutable_opts' field to BlockDriver. This is
>> a list of runtime options that can be modified during reopen. If an
>> option in this list is unspecified on reopen then it must be reset (or
>> return an error).
>> 
>> Signed-off-by: Alberto Garcia 
>> ---
>>  block/file-posix.c|  6 ++
>>  block/qcow2.c | 25 +
>>  block/raw-format.c|  3 +++
>>  include/block/block_int.h |  8 
>>  4 files changed, 42 insertions(+)
>
> Two more drivers seem to be able to change options: gluster (debug and
> logfile) and throttle (throttle-group).

Unless I missed something gluster doesn't allow changing any options
during reopen, the _reopen_prepare() function only reads the flags.

The 'throttle-group' option is mandatory so it cannot be left unset. We
can add mutable_opts to that driver but it won't make any difference in
practice.

Berto



Re: [Qemu-block] [PATCH v2 09/13] block: Add a 'mutable_opts' field to BlockDriver

2019-03-12 Thread Kevin Wolf
Am 06.03.2019 um 19:11 hat Alberto Garcia geschrieben:
> If we reopen a BlockDriverState and there is an option that is present
> in bs->options but missing from the new set of options then we have to
> return an error unless the driver is able to reset it to its default
> value.
> 
> This patch adds a new 'mutable_opts' field to BlockDriver. This is
> a list of runtime options that can be modified during reopen. If an
> option in this list is unspecified on reopen then it must be reset (or
> return an error).
> 
> Signed-off-by: Alberto Garcia 
> ---
>  block/file-posix.c|  6 ++
>  block/qcow2.c | 25 +
>  block/raw-format.c|  3 +++
>  include/block/block_int.h |  8 
>  4 files changed, 42 insertions(+)

Two more drivers seem to be able to change options: gluster (debug and
logfile) and throttle (throttle-group).

Kevin



Re: [Qemu-block] [PATCH v2 08/13] block: Allow changing the backing file on reopen

2019-03-12 Thread Kevin Wolf
Am 06.03.2019 um 19:11 hat Alberto Garcia geschrieben:
> This patch allows the user to change the backing file of an image that
> is being reopened. Here's what it does:
> 
>  - In bdrv_reopen_prepare(): check that the value of 'backing' points
>to an existing node or is null. If it points to an existing node it
>also needs to make sure that replacing the backing file will not
>create a cycle in the node graph (i.e. you cannot reach the parent
>from the new backing file).
> 
>  - In bdrv_reopen_commit(): perform the actual node replacement by
>calling bdrv_set_backing_hd().
> 
> There may be temporary implicit nodes between a BDS and its backing
> file (e.g. a commit filter node). In these cases bdrv_reopen_prepare()
> looks for the real (non-implicit) backing file and requires that the
> 'backing' option points to it. Replacing or detaching a backing file
> is forbidden if there are implicit nodes in the middle.
> 
> Although x-blockdev-reopen is meant to be used like blockdev-add,
> there's an important thing that must be taken into account: the only
> way to set a new backing file is by using a reference to an existing
> node (previously added with e.g. blockdev-add).  If 'backing' contains
> a dictionary with a new set of options ({"driver": "qcow2", "file": {
> ... }}) then it is interpreted that the _existing_ backing file must
> be reopened with those options.
> 
> Signed-off-by: Alberto Garcia 

> @@ -3294,6 +3315,98 @@ static void bdrv_reopen_perm(BlockReopenQueue *q, 
> BlockDriverState *bs,
>  }
>  
>  /*
> + * Take a BDRVReopenState and check if the value of 'backing' in the
> + * reopen_state->options QDict is valid or not.
> + *
> + * If 'backing' is missing from the QDict then return 0.
> + *
> + * If 'backing' contains the node name of the backing file of
> + * reopen_state->bs then return 0.
> + *
> + * If 'backing' contains a different node name (or is null) then check
> + * whether the current backing file can be replaced with the new one.
> + * If that's the case then reopen_state->replace_backing_bs is set to
> + * true and reopen_state->new_backing_bs contains a pointer to the new
> + * backing BlockDriverState (or NULL).
> + *
> + * Return 0 on success, otherwise return < 0 and set @errp.
> + */
> +static int bdrv_reopen_parse_backing(BDRVReopenState *reopen_state,
> + Error **errp)
> +{
> +BlockDriverState *bs = reopen_state->bs;
> +BlockDriverState *overlay_bs, *new_backing_bs;
> +QObject *value;
> +const char *str;
> +
> +value = qdict_get(reopen_state->options, "backing");
> +if (value == NULL) {
> +return 0;
> +}
> +
> +switch (qobject_type(value)) {
> +case QTYPE_QNULL:
> +new_backing_bs = NULL;
> +break;
> +case QTYPE_QSTRING:
> +str = qobject_get_try_str(value);
> +new_backing_bs = bdrv_lookup_bs(NULL, str, errp);
> +if (new_backing_bs == NULL) {
> +return -EINVAL;
> +} else if (bdrv_recurse_has_child(new_backing_bs, bs)) {
> +error_setg(errp, "Making '%s' a backing file of '%s' "
> +   "would create a cycle", str, bs->node_name);
> +return -EINVAL;
> +}
> +break;
> +default:
> +/* 'backing' does not allow any other data type */
> +g_assert_not_reached();
> +}
> +
> +/*
> + * TODO: before removing the x- prefix from x-blockdev-reopen we
> + * should move the new backing file into the right AioContext
> + * instead of returning an error.
> + */
> +if (new_backing_bs) {
> +if (bdrv_get_aio_context(new_backing_bs) != 
> bdrv_get_aio_context(bs)) {
> +error_setg(errp, "Cannot use a new backing file "
> +   "with a different AioContext");
> +return -EINVAL;
> +}
> +}
> +
> +/*
> + * Find the "actual" backing file by skipping all links that point
> + * to an implicit node, if any (e.g. a commit filter node).
> + */
> +overlay_bs = bs;
> +while (backing_bs(overlay_bs) && backing_bs(overlay_bs)->implicit) {
> +overlay_bs = backing_bs(overlay_bs);
> +}
> +
> +/* If we want to replace the backing file we need some extra checks */
> +if (new_backing_bs != backing_bs(overlay_bs)) {
> +/* Check for implicit nodes between bs and its backing file */
> +if (bs != overlay_bs) {
> +error_setg(errp, "Cannot change backing link if '%s' has "
> +   "an implicit backing file", bs->node_name);
> +return -EPERM;
> +}
> +/* Check if the backing link that we want to replace is frozen */
> +if (bdrv_is_backing_chain_frozen(overlay_bs, backing_bs(overlay_bs),
> + errp)) {
> +return -EPERM;
> +}
> +reopen_state->replace_backing_bs = true;
> +reopen_state->new_backing_bs = 

Re: [Qemu-block] [PATCH 0/5] block/qcow2-bitmap: Enable resize with persistent bitmaps

2019-03-12 Thread Kevin Wolf
Am 11.03.2019 um 19:35 hat John Snow geschrieben:
> On 3/11/19 2:26 PM, Vladimir Sementsov-Ogievskiy wrote:
> > Also, we have no comments from Max and Kevin who are maintainers of the 
> > only file,
> > changed in the series. And why this don't go through their trees?
> 
> I've discussed it with Kevin in the IRC channel; I don't think he has
> strong feelings on flushing bitmap metadata versus not as much as he
> cares that we make sure we leave openable images on crash. In general I
> think bitmaps are "our problem".

This, pretty much. I haven't been involved much in the bitmap code
before, and today I have other series to deal with for soft freeze. So
I'll trust you to do the right thing.

And the right thing means at least that after the resize operation has
returned (either success or failure), a crash doesn't lead to an image
file that can't be opened again.

Ideally, the same would be true for a crash in the middle the resize
operation, but as John explained to me that we require an exact match of
the size, this doesn't seem to completely work out without more changes:
Either a way to update both atomically (journal?) or at least accept
oversized bitmaps so we can just choose the right order. The latter
sounds tempting to me, though I haven't put much thought in the
consequences this would have.

Kevin



Re: [Qemu-block] [Qemu-devel] [PATCH] tcmu: Introduce qemu-tcmu utility

2019-03-12 Thread Kevin Wolf
Am 12.03.2019 um 03:18 hat Fam Zheng geschrieben:
> > On Mar 11, 2019, at 19:06, Kevin Wolf  wrote:
> > Am 09.03.2019 um 02:46 hat Yaowei Bai geschrieben:
> >> Thanks for explaining the background. It comes to my mind that actually we
> >> talked about these two cases with Fam a bit long time ago and decided to
> >> support both these two cases. The reason why we implement case2 first is
> >> currently we care more about exporting new opened images and it's a bit
> >> more convenient, exporting from a VM or QMP can be added in the later
> >> release. Do you think it's reasonable/acceptable that we support both
> >> cases and use case2 for normal new opened images and case1 for the
> >> circumstances you mention above?
> > 
> > I would like to avoid a second code path because it comes with a
> > maintenance cost.
> > 
> > Experience also tells that adding a new way to parse option strings will
> > come back at us later because it we must always maintain compatibility
> > with yet another format.
> 
> If the rule is that cfgstr strictly follows -blockdev syntax and the
> parsing code is mostly shared, it shouldn’t be a problem, right? I
> suppose this will limit the expressiveness of cfgstr but might be a
> reasonable tradeoff between implementation complexity, user
> friendliness and interface consistency.

Yes, if we could pass the cfgstr directly to an existing parser, that
would make it less bad at least.

Of course, if we directly pass cfgstr to the -blockdev parser, it also
means that we can't have additional options for configuring the
BlockBackend (which could be part of the -export command line option for
qemu-tcmu).

That could be addressed by wrapping another QAPI object around it that
only contain BlockdevOptions as one field, but then you have to prefix
every block node option if you use keyval visitor syntax.

> Another possibility is to use json: format in cfgstr for anything more
> complex than a single -blockdev.

Parsing like -blockdev already includes JSON syntax (which is necessary
to get the full expressiveness).

If you want to support the equivalent of multiple -blockdevs, you'd need
to wrap the BlockdevOptions in a QAPI list.

> > So I would prefer not doing this and just passing command line options
> > to qemu-tcmu, which can reuse the already existing code paths.
> 
> I think the effect of not supporting adding new blockdev from cfgstr
> is that we have to resort to QMP to allow hot-adding targets. It’s not
> necessarily bad, though I’m not sure hwo that aligns with Yaowei’s
> development plan.

This is true, but we need a way to do this from QMP anyway. So the
question is whether we want to introduce a second way to achieve the
same thing a bit more conveniently. But I expect that hot-adding is
something that is usually done with management software anyway, so does
the convenience actually buy us much?

Kevin



Re: [Qemu-block] [PATCH v2 0/2] block: Gluster 6 compatibility

2019-03-12 Thread Niels de Vos
On Mon, Mar 11, 2019 at 12:10:06PM +0100, Kevin Wolf wrote:
> Am 09.03.2019 um 10:40 hat Niels de Vos geschrieben:
> > On Fri, Mar 08, 2019 at 02:11:51PM +0100, Kevin Wolf wrote:
> > > Am 05.03.2019 um 16:46 hat Niels de Vos geschrieben:
> > > > Gluster 6 is currently available as release candidate. There have been a
> > > > few changes to libgfapi.so that need to be adapted by consuming projects
> > > > like QEMU. Fedora Rawhide already contains glusterfs-6.0-RC0, and this
> > > > prevents rebuilds of QEMU there (https://bugzilla.redhat.com/1684298).
> > > > 
> > > > The following two changes should be sufficient to consume Gluster 6 once
> > > > it is released. These have been tested on CentOS-7 with Gluster 5 and
> > > > Gluster 6 (minimal manual qemu-img tests only).
> > > > 
> > > > This v2 post contains changes suggested by Daniel P. Berrangé and Kevin
> > > > Wolf. Thanks!
> > > 
> > > Thanks, applied to the block branch.
> > 
> > Thanks! Stefano Garzarella gave a suggestion for further cleanup. I was
> > planning to address that (no #ifdef for function arguments) next week
> > when I'm back from a trip, Is that something you would also like to see,
> > or do you prefer the change to stay minimal/small as it is now? I'm
> > happy to send a followup if you agree that it is cleaner.
> 
> I don't mind either way.
> 
> I'm going to send a pull request tomorrow for soft freeze, but if you
> tell me that I should wait with this one, I can remove it from my queue
> for now. It's a bug fix, so we can still apply an updated version during
> the freeze.
> 
> A follow-up works for me, too, so whatever you prefer.

In that case, I prefer to keep the current patches as they are. If
further changes make the code much better readable/maintainable I would
have provided a new version. But at the moment, and after a little more
consideration, I do not think there is much benefit in the cleanup
Stefano suggested.

Thanks,
Niels



Re: [Qemu-block] [Qemu-devel] [PATCH RESEND v4] drive-mirror: add incremental mode

2019-03-12 Thread Vladimir Sementsov-Ogievskiy
27.02.2019 18:25, Max Reitz wrote:
> CC-ing John because of the keyword "incremental".
> 
> On 14.02.19 07:43, mahaocong wrote:
>> From: mahaocong 
>>
>> This patch adds possibility to start mirroring with user-created-bitmap.
>> On full mode, mirror create a non-named-bitmap by scanning whole block-chain,
>> and on top mode, mirror create a bitmap by scanning the top block layer. So I
>> think I can copy a user-created-bitmap and use it as the initial state of the
>> mirror, the same as incremental mode drive-backup, and I call this new mode
>> as incremental mode drive-mirror.
>>
>> A possible usage scene of incremental mode mirror is live migration. For 
>> maintain
>> the block data and recover after a malfunction, someone may backup data to 
>> ceph
>> or other distributed storage. On qemu incremental backup, we need to create 
>> a new
>> bitmap and attach to block device before the first backup job. Then the 
>> bitmap
>> records the change after the backup job. If we want to migration this vm, we 
>> can
>> migrate block data between source and destionation by using drive-mirror 
>> directly,
>> or use backup data and backup-bitmap to reduce the data should be 
>> synchronize.
>> To do this, we should first create a new image in destination and set 
>> backing file
>> as backup image, then set backup-bitmap as the initial state of incremental 
>> mode
>> drive-mirror, and synchronize dirty block starting with the state set by the 
>> last
>> incremental backup job. When the mirror complete, we get an active layer on 
>> destination,
>> and its backing file is backup image on ceph. Then we can do live copy data 
>> from
>> backing files into overlay images by using block-stream, or do backup 
>> continually.
>>
>> In this scene, It seems that If the guest os doesn't write too many data 
>> after the
>> last backup, the incremental mode may transmit less data than full mode or 
>> top
>> mode. However, if the write data is too many, there is no advantage on 
>> incremental
>> mode compare with other mode.
>>
>> This scene can be described as following steps:
>> 1.create rbd image in ceph, and map nbd device with rbd image.
>> 2.create a new bitmap and attach to block device.
>> 3.do full mode backup on nbd device and thus sync it to the rbd image.
>> 4.severl times incremental mode backup.
>> 5.create new image in destination and set its backing file as backup image.
>> 6.do live-migration, and migrate block data by incremental mode drive-mirror
>>with bitmap created from step 2.
>>
>> Signed-off-by: Ma Haocong 
>> ---
> 
> So one important point about incremental backups is that you can
> actually do them incrementally: The bitmap is effectively cleared at the
> beginning of the backup process (a successor bitmap is installed that is
> cleared and receives all changes; at the end of the backup, it either
> replaces the old bitmap (on success) or is merged into it (on failure)).
>   Therefore, you can do the next incremental backup by using the same bitmap.

Hmm, I heard in some other thread that Eric finally decided to always start
backup on copied bitmap in libvirt, so this logic is skipped. Do we need it
for mirror? Sorry if I'm wrong, Eric, am I?

> 
> How would this work with mirroring?  Instead of clearing the bitmap at
> the start of the process, it'd need to be cleared at the end (because we
> reach synchronization between source and target then).  But how would
> error handling work?
> 
> I suppose the named bitmap would need to be copied to act as the dirty
> bitmap for the mirror job (at the start of the job).  If a failure
> occurs, the copy is simply discarded.  On success, the named bitmap is
> cleared when the job is completed.  Hm, that seems to make sense.  Did I
> forget anything, John?
> 
> In any case, I don't think this patch implemented anything to this
> regard...?  So it doesn't really implement incremental mirroring.
> However, I think it should, if possible.
> 
> Max
> 


-- 
Best regards,
Vladimir


Re: [Qemu-block] [Qemu-devel] [PATCH v3 11/12] pc: Support firmware configuration with -blockdev

2019-03-12 Thread Markus Armbruster
Philippe Mathieu-Daudé  writes:

> Le lun. 11 mars 2019 17:02, Markus Armbruster  a écrit :
>
>> Paolo Bonzini  writes:
[...]
>> > Does it work if you add the device yourself as a child of /machine,
>> > instead of relying on /machine/unattached?
>>
>> I figure you're suggesting something like this incremental patch:
>>
>>diff --git a/hw/i386/pc_sysfw.c b/hw/i386/pc_sysfw.c
>>index ccf2221acb..9d3a80c51a 100644
>>--- a/hw/i386/pc_sysfw.c
>>+++ b/hw/i386/pc_sysfw.c
>>@@ -99,6 +99,10 @@ void pc_system_flash_create(PCMachineState *pcms)
>> if (pcmc->pci_enabled) {
>> pcms->flash[0] = pc_pflash_create("system.flash0");
>> pcms->flash[1] = pc_pflash_create("system.flash1");
>>+object_property_add_child(OBJECT(pcms), "pfl0",
>>+  OBJECT(pcms->flash[0]),
>> _abort);
>>+object_property_add_child(OBJECT(pcms), "pfl1",
>>+  OBJECT(pcms->flash[1]),
>> _abort);
>> object_property_add_alias(OBJECT(pcms), "pflash0",
>>   OBJECT(pcms->flash[0]), "drive",
>>   _abort);
>>@@ -122,19 +126,7 @@ static void
>> pc_system_flash_cleanup_unused(PCMachineState *pcms)
>> prop_name = g_strdup_printf("pflash%d", i);
>> object_property_del(OBJECT(pcms), prop_name, _abort);
>> g_free(prop_name);
>>-/*
>>- * Delete @dev_obj.  Straight object_unref() is wrong,
>>- * since it leaves dangling references in the parent bus
>>- * behind.  object_unparent() doesn't work, either: since
>>- * @dev_obj hasn't been realized, dev_obj->parent is null,
>>- * and object_unparent() does nothing.  DeviceClass method
>>- * device_unparent() works, but we have to take a
>>- * temporary reference, or else device_unparent() commits
>>- * a use-after-free error.
>>- */
>>-object_ref(dev_obj);
>>-object_get_class(dev_obj)->unparent(dev_obj);
>>-object_unref(dev_obj);
>>+object_unparent(dev_obj);
>> pcms->flash[i] = NULL;
>> }
>> }
>>
>> Passes "make check" and "info qtree" looks good both with and without
>> -machine pflash0,...
>>
>> I'm not exactly happy with "pfl0" and "pfl1".  Got a favorite color for
>> my bikeshed?
>>
>
> ovmf_code & ovmf_vars?

Tempting, since we use these guys only for OVMF so far.  However, we
could also use them for SeaBIOS[*], or whatever other firmware comes up.

[...]


[*] See also
Subject: Re: Configuring pflash devices for OVMF firmware
Message-ID: <9c4e222f-3941-426e-3195-5598b2af1...@redhat.com>
https://lists.nongnu.org/archive/html/qemu-devel/2019-01/msg08145.html



Re: [Qemu-block] [PATCH v2 1/4] docs/interop/qcow2: Improve bitmap flag in_use specification

2019-03-12 Thread Vladimir Sementsov-Ogievskiy


On 12.03.2019 3:00, Eric Blake wrote:
> On 3/11/19 1:51 PM, Vladimir Sementsov-Ogievskiy wrote:
>> We already use (we didn't notice it) IN_USE flag for marking bitmap
>> metadata outdated, such as AUTO flag, which mirrors enabled/disabled
>> bitmaps. No we are going to support bitmap resize, so it's good to
> 
> s/No/Now/
> 
>> write IN_USE meaning with more details.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy 
>> ---
>>   docs/interop/qcow2.txt | 18 +++---
>>   1 file changed, 15 insertions(+), 3 deletions(-)
>>
>> diff --git a/docs/interop/qcow2.txt b/docs/interop/qcow2.txt
>> index fb5cb47245..575a5f25e2 100644
>> --- a/docs/interop/qcow2.txt
>> +++ b/docs/interop/qcow2.txt
>> @@ -589,7 +589,19 @@ Structure of a bitmap directory entry:
>>   Bit
>> 0: in_use
>>The bitmap was not saved correctly and may be
>> - inconsistent.
>> + inconsistent. This inconsitency may touch both 
>> bitmap
> 
> inconsistency
> 
>> + data and metadata, and this mean that bitmap state
>> + (its data and metadata) was changed but not stored
>> + back to the image. This flag doesn't relate to 
>> format
>> + corruption, all fields are still correct from qcow2
>> + point of view, they just may be outdated.
>> +
>> + Note: Currently, Qemu may change (additionally to
>> + bitmap data) @auto flag and size of the bitmap 
>> during
>> + image resize. This mean, that not only bitmap data
>> + may be outdated if @in_use flag set, but also 
>> value of
>> + @auto flag and bitmap size (which is indirectly
>> + referenced by @bitmap_table_size).
> 
> Feels wordy. Maybe drop the second paragraph starting with Note, and
> merely use this for the first paragraph:
> 
> The bitmap was not saved correctly and may be inconsistent. Although the
> bitmap metadata is still well-formed from a qcow2 perspective, the
> metadata (such as the auto flag or bitmap size) or data contents may be
> outdated.

Sounds good (as always), thanks)

> 
>>   
>> 1: auto
>>The bitmap must reflect all changes of the virtual
>> @@ -717,8 +729,8 @@ corresponding range of the virtual disk (see above) was 
>> written to while the
>>   bitmap was 'enabled'. An unset bit means that this range was not written 
>> to.
>>   
>>   The software doesn't have to sync the bitmap in the image file with its
>> -representation in RAM after each write. Flag 'in_use' should be set while 
>> the
>> -bitmap is not synced.
>> +representation in RAM after each write or metadata change. Flag 'in_use'
>> +should be set while the bitmap is not synced.
>>   
>>   In the image file the 'enabled' state is reflected by the 'auto' flag. If 
>> this
>>   flag is set, the software must consider the bitmap as 'enabled' and start
>>
>