Re: [PATCH 5/5] vhost-user-blk: Check that num-queues is supported by backend

2021-04-28 Thread Raphael Norwitz
Reviewed-by: Raphael Norwitz 

On Thu, Apr 22, 2021 at 07:02:21PM +0200, Kevin Wolf wrote:
> Creating a device with a number of queues that isn't supported by the
> backend is pointless, the device won't work properly and the error
> messages are rather confusing.
> 
> Just fail to create the device if num-queues is higher than what the
> backend supports.
> 
> Since the relationship between num-queues and the number of virtqueues
> depends on the specific device, this is an additional value that needs
> to be initialised by the device. For convenience, allow leaving it 0 if
> the check should be skipped. This makes sense for vhost-user-net where
> separate vhost devices are used for the queues and custom initialisation
> code is needed to perform the check.
> 
> Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1935031
> Signed-off-by: Kevin Wolf 
> ---
>  include/hw/virtio/vhost.h | 2 ++
>  hw/block/vhost-user-blk.c | 1 +
>  hw/virtio/vhost-user.c| 5 +
>  3 files changed, 8 insertions(+)
> 
> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
> index 4a8bc75415..21a9a52088 100644
> --- a/include/hw/virtio/vhost.h
> +++ b/include/hw/virtio/vhost.h
> @@ -74,6 +74,8 @@ struct vhost_dev {
>  int nvqs;
>  /* the first virtqueue which would be used by this vhost dev */
>  int vq_index;
> +/* if non-zero, minimum required value for max_queues */
> +int num_queues;
>  uint64_t features;
>  uint64_t acked_features;
>  uint64_t backend_features;
> diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
> index b6f4bb3f6f..ac2193abef 100644
> --- a/hw/block/vhost-user-blk.c
> +++ b/hw/block/vhost-user-blk.c
> @@ -324,6 +324,7 @@ static int vhost_user_blk_connect(DeviceState *dev, Error 
> **errp)
>  }
>  s->connected = true;
>  
> +s->dev.num_queues = s->num_queues;
>  s->dev.nvqs = s->num_queues;
>  s->dev.vqs = s->vhost_vqs;
>  s->dev.vq_index = 0;
> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> index ded0c10453..ee57abe045 100644
> --- a/hw/virtio/vhost-user.c
> +++ b/hw/virtio/vhost-user.c
> @@ -1909,6 +1909,11 @@ static int vhost_user_backend_init(struct vhost_dev 
> *dev, void *opaque)
>  return err;
>  }
>  }
> +if (dev->num_queues && dev->max_queues < dev->num_queues) {
> +error_report("The maximum number of queues supported by the "
> + "backend is %" PRIu64, dev->max_queues);
> +return -EINVAL;
> +}
>  
>  if (virtio_has_feature(features, VIRTIO_F_IOMMU_PLATFORM) &&
>  !(virtio_has_feature(dev->protocol_features,
> -- 
> 2.30.2
> 


Re: [PATCH v4 00/36] block: update graph permissions update

2021-04-28 Thread Vladimir Sementsov-Ogievskiy

28.04.2021 20:03, Kevin Wolf wrote:

Am 28.04.2021 um 17:17 hat Vladimir Sementsov-Ogievskiy geschrieben:

Hi all!

And here is v4.


Thanks, applied to the block branch.


Thanks! And thanks a lot for reviewing!



Though the error message shown by the test in patch 18 does need some
improvement on top. We should probably name both conflicting users and
who blocks whom in a way that is neutral as to which user is the new
one.


I'll look at it next week.



Also, curious that again patch 7 (and only patch 7) got From: mangled by
the mailing list. Seems there is something in it that Mailman doesn't
like?



Very interesting.. Curly braces in subject maybe? But I think, that not a first 
time I use them..


--
Best regards,
Vladimir



Re: [PATCH 1/5] vhost-user-blk: Don't reconnect during initialisation

2021-04-28 Thread Philippe Mathieu-Daudé
On 4/22/21 7:02 PM, Kevin Wolf wrote:
> This is a partial revert of commits 77542d43149 and bc79c87bcde.
> 
> Usually, an error during initialisation means that the configuration was
> wrong. Reconnecting won't make the error go away, but just turn the
> error condition into an endless loop. Avoid this and return errors
> again.
> 
> Additionally, calling vhost_user_blk_disconnect() from the chardev event
> handler could result in use-after-free because none of the
> initialisation code expects that the device could just go away in the

TIL initialisation wording.

> middle. So removing the call fixes crashes in several places.
> 
> For example, using a num-queues setting that is incompatible with the
> backend would result in a crash like this (dereferencing dev->opaque,
> which is already NULL):
> 
>  #0  0x55d0a4bd in vhost_user_read_cb (source=0x568f4690, 
> condition=(G_IO_IN | G_IO_HUP), opaque=0x7fffcbf0) at 
> ../hw/virtio/vhost-user.c:313
>  #1  0x55d950d3 in qio_channel_fd_source_dispatch 
> (source=0x57c3f750, callback=0x55d0a478 , 
> user_data=0x7fffcbf0) at ../io/channel-watch.c:84
>  #2  0x77b32a9f in g_main_context_dispatch () at 
> /lib64/libglib-2.0.so.0
>  #3  0x77b84a98 in g_main_context_iterate.constprop () at 
> /lib64/libglib-2.0.so.0
>  #4  0x77b32163 in g_main_loop_run () at /lib64/libglib-2.0.so.0
>  #5  0x55d0a724 in vhost_user_read (dev=0x57bc62f8, 
> msg=0x7fffcc50) at ../hw/virtio/vhost-user.c:402
>  #6  0x55d0ee6b in vhost_user_get_config (dev=0x57bc62f8, 
> config=0x57bc62ac "", config_len=60) at ../hw/virtio/vhost-user.c:2133
>  #7  0x55d56d46 in vhost_dev_get_config (hdev=0x57bc62f8, 
> config=0x57bc62ac "", config_len=60) at ../hw/virtio/vhost.c:1566
>  #8  0x55cdd150 in vhost_user_blk_device_realize (dev=0x57bc60b0, 
> errp=0x7fffcf90) at ../hw/block/vhost-user-blk.c:510
>  #9  0x55d08f6d in virtio_device_realize (dev=0x57bc60b0, 
> errp=0x7fffcff0) at ../hw/virtio/virtio.c:3660
> 
> Signed-off-by: Kevin Wolf 
> ---
>  hw/block/vhost-user-blk.c | 54 ++-
>  1 file changed, 13 insertions(+), 41 deletions(-)




Re: [PATCH 4/5] virtio: Fail if iommu_platform is requested, but unsupported

2021-04-28 Thread Raphael Norwitz
On Thu, Apr 22, 2021 at 07:02:20PM +0200, Kevin Wolf wrote:
> Commit 2943b53f6 (' virtio: force VIRTIO_F_IOMMU_PLATFORM') made sure
> that vhost can't just reject VIRTIO_F_IOMMU_PLATFORM when it was
> requested. However, just adding it back to the negotiated flags isn't
> right either because it promises support to the guest that the device
> actually doesn't support. One example of a vhost-user device that
> doesn't have support for the flag is the vhost-user-blk export of QEMU.
> 
> Instead of successfully creating a device that doesn't work, just fail
> to plug the device when it doesn't support the feature, but it was
> requested. This results in much clearer error messages.
> 
> Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1935031
> Signed-off-by: Kevin Wolf 
> ---
>  hw/virtio/virtio-bus.c | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c
> index d6332d45c3..859978d248 100644
> --- a/hw/virtio/virtio-bus.c
> +++ b/hw/virtio/virtio-bus.c
> @@ -69,6 +69,11 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error 
> **errp)
>  return;
>  }
>

Can you explain this check a little more?

Above we have:
bool has_iommu = virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);

and then we get the host features from the bckend:
vdev->host_features = vdc->get_features(vdev, vdev->host_features

So as is this is catching the case where vdev->host_features had
VIRTIO_F_IOMMU_PLATFORM set before (by default?), but doesn't now that
the features have been retrieved? 

Why not just:
if (!virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM)) {

> +if (has_iommu && !virtio_host_has_feature(vdev, 
> VIRTIO_F_IOMMU_PLATFORM)) {
> +error_setg(errp, "iommu_platform=true is not supported by the 
> device");
> +return;
> +}
> +
>  if (klass->device_plugged != NULL) {
>  klass->device_plugged(qbus->parent, _err);
>  }
> -- 
> 2.30.2
> 


Re: [PATCH 3/5] vhost-user-blk: Get more feature flags from vhost devicey

2021-04-28 Thread Raphael Norwitz
Acked-by: Raphael Norwitz 

On Thu, Apr 22, 2021 at 07:02:19PM +0200, Kevin Wolf wrote:
> VIRTIO_F_RING_PACKED and VIRTIO_F_IOMMU_PLATFORM need to be supported by
> the vhost device, otherwise advertising it to the guest doesn't result
> in a working configuration. They are currently not supported by the
> vhost-user-blk export in QEMU.
> 
> Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1935020
> Signed-off-by: Kevin Wolf 
> ---
>  hw/block/vhost-user-blk.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
> index 8422a29470..b6f4bb3f6f 100644
> --- a/hw/block/vhost-user-blk.c
> +++ b/hw/block/vhost-user-blk.c
> @@ -47,6 +47,8 @@ static const int user_feature_bits[] = {
>  VIRTIO_RING_F_INDIRECT_DESC,
>  VIRTIO_RING_F_EVENT_IDX,
>  VIRTIO_F_NOTIFY_ON_EMPTY,
> +VIRTIO_F_RING_PACKED,
> +VIRTIO_F_IOMMU_PLATFORM,
>  VHOST_INVALID_FEATURE_BIT
>  };
>  
> -- 
> 2.30.2
> 


Re: [PATCH] hw/ide: Fix crash when plugging a piix3-ide device into the x-remote machine

2021-04-28 Thread Stefan Hajnoczi
On Wed, Apr 28, 2021 at 04:18:17PM +0200, Markus Armbruster wrote:
> Stefan Hajnoczi  writes:
> 
> > On Tue, Apr 27, 2021 at 02:02:27PM -0400, John Snow wrote:
> >> On 4/27/21 1:54 PM, Philippe Mathieu-Daudé wrote:
> >> > On 4/27/21 7:16 PM, John Snow wrote:
> >> > > On 4/27/21 9:54 AM, Stefan Hajnoczi wrote:
> >> > > > I suggest fixing this at the qdev level. Make piix3-ide have a
> >> > > > sub-device that inherits from ISA_DEVICE so it can only be 
> >> > > > instantiated
> >> > > > when there's an ISA bus.
> >> > > > 
> >> > > > Stefan
> >> > > 
> >> > > My qdev knowledge is shaky. Does this imply that you agree with the
> >> > > direction of Thomas's patch, or do you just mean to disagree with Phil
> >> > > on his preferred course of action?
> >> > 
> >> > My understanding is a disagreement to both, with a 3rd direction :)
> >> > 
> >> > I agree with Stefan direction but I'm not sure (yet) that a sub-device
> >> > is the best (long-term) solution. I guess there is a design issue with
> >> > this device, and would like to understanding it first.
> >> > 
> >> > IIUC Stefan says the piix3-ide is both a PCI and IDE device, but QOM
> >> > only allow a single parent. Multiple QOM inheritance is resolved as
> >> > interfaces, but PCI/IDE qdev aren't interfaces, rather abstract objects.
> >> > So he suggests to embed an IDE device within the PCI piix3-ide device.
> >> > 
> >> > My view is the PIIX is a chipset that share stuffs between components,
> >> > and the IDE bus belongs to the chipset PCI root (or eventually the
> >> > PCI-ISA bridge, function #0). The IDE function would use the IDE bus
> >> > from its root parent as a linked property.
> >> > My problem is currently this device is user-creatable as a Frankenstein
> >> > single PCI function, out of its chipset. I'm not sure yet this is a
> >> > dead end or I could work something out.
> >> > 
> >> > Regards,
> >> > 
> >> > Phil.
> >> > 
> >> 
> >> It sounds complicated. In the meantime, I think I am favor of taking
> >> Thomas's patch because it merely adds some error routing to allow us to
> >> avoid a crash. The core organizational issues of the IDE device(s) will
> >> remain and can be solved later as needed.
> >
> > The approach in this patch is okay but we should keep in mind it only
> > solves piix3-ide. ISA provides a non-qdev backdoor API and there may be
> > more instances of this type of bug.
> >
> > A qdev fix would address the root cause and make it possible to drop the
> > backdoor API, but that's probably too much work for little benefit.
> 
> What do you mean by backdoor API?  Global @isabus?

Yes. It's also strange that isa_get_irq(ISADevice *dev, unsigned isairq)
accepts dev = NULL as a valid argument.

Stefan


signature.asc
Description: PGP signature


Re: [PATCH 1/5] vhost-user-blk: Don't reconnect during initialisation

2021-04-28 Thread Raphael Norwitz
On Wed, Apr 28, 2021 at 07:31:13PM +0200, Kevin Wolf wrote:
> Am 28.04.2021 um 18:52 hat Raphael Norwitz geschrieben:
> > Given what you've shown with the use-after-free, I agree the changes
> > need to be reverted. I'm a little uneasy about removing the reconnect
> > logic from the device realization completely though.
> > 
> > On Thu, Apr 22, 2021 at 07:02:17PM +0200, Kevin Wolf wrote:
> > > This is a partial revert of commits 77542d43149 and bc79c87bcde.
> > > 
> > > Usually, an error during initialisation means that the configuration was
> > > wrong. Reconnecting won't make the error go away, but just turn the
> > > error condition into an endless loop. Avoid this and return errors
> > > again.
> > > 
> > 
> > Is that nessesarily true? As I understand it the main usecases for
> > device reconnect are to allow a device backend to be restarted after a
> > failure or to allow the backend to be upgraded without restarting the
> > guest. I agree - misconfiguration could be a common cause of a device
> > backend crashing at realize time, but couldn't there be others? Maybe
> > transient memory pressure?
> > 
> > Especially in the case where one process is connecting to many different
> > vhost-user-blk instances, I could imagine power-ons and incoming
> > migrations racing with backend restarts quite frequently. Should
> > these cases cause failures?
> > 
> > We can still hit the infinite looping case you describe post-realize.
> > Why should we treat pre-realize differently?
> 
> I think there is one main difference between realize() and later
> operation, which is that we can actually deliver an error to the user
> during realize(). When we're just starting QEMU and processing the CLI
> arguments, failure is very obvious, and in the context of QMP
> device-add, the client is actively waiting for a result, too.
> 
> Later on, there is no good way to communicate an error (writes to stderr
> just end up in some logfile at best, QAPI events can be missed), and
> even if there were, we would have to do something with the guest until
> the user/management application actually reacts to the error. The latter
> is not a problem during realize() because the device wasn't even plugged
> in yet.
> 
> So I think there are good reasons why it could make sense to distinguish
> initialisation and operation.
>

Fair enough. I'm happy in this case, provided we remain resiliant
against one-off daemon restarts during realize.

> > > Additionally, calling vhost_user_blk_disconnect() from the chardev event
> > > handler could result in use-after-free because none of the
> > > initialisation code expects that the device could just go away in the
> > > middle. So removing the call fixes crashes in several places.
> > > 
> > > For example, using a num-queues setting that is incompatible with the
> > > backend would result in a crash like this (dereferencing dev->opaque,
> > > which is already NULL):
> > > 
> > >  #0  0x55d0a4bd in vhost_user_read_cb (source=0x568f4690, 
> > > condition=(G_IO_IN | G_IO_HUP), opaque=0x7fffcbf0) at 
> > > ../hw/virtio/vhost-user.c:313
> > >  #1  0x55d950d3 in qio_channel_fd_source_dispatch 
> > > (source=0x57c3f750, callback=0x55d0a478 , 
> > > user_data=0x7fffcbf0) at ../io/channel-watch.c:84
> > >  #2  0x77b32a9f in g_main_context_dispatch () at 
> > > /lib64/libglib-2.0.so.0
> > >  #3  0x77b84a98 in g_main_context_iterate.constprop () at 
> > > /lib64/libglib-2.0.so.0
> > >  #4  0x77b32163 in g_main_loop_run () at /lib64/libglib-2.0.so.0
> > >  #5  0x55d0a724 in vhost_user_read (dev=0x57bc62f8, 
> > > msg=0x7fffcc50) at ../hw/virtio/vhost-user.c:402
> > >  #6  0x55d0ee6b in vhost_user_get_config (dev=0x57bc62f8, 
> > > config=0x57bc62ac "", config_len=60) at ../hw/virtio/vhost-user.c:2133
> > >  #7  0x55d56d46 in vhost_dev_get_config (hdev=0x57bc62f8, 
> > > config=0x57bc62ac "", config_len=60) at ../hw/virtio/vhost.c:1566
> > >  #8  0x55cdd150 in vhost_user_blk_device_realize 
> > > (dev=0x57bc60b0, errp=0x7fffcf90) at 
> > > ../hw/block/vhost-user-blk.c:510
> > >  #9  0x55d08f6d in virtio_device_realize (dev=0x57bc60b0, 
> > > errp=0x7fffcff0) at ../hw/virtio/virtio.c:3660
> > > 
> > > Signed-off-by: Kevin Wolf 
> > > ---
> > >  hw/block/vhost-user-blk.c | 54 ++-
> > >  1 file changed, 13 insertions(+), 41 deletions(-)
> > > 
> > > diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
> > > index f5e9682703..e824b0a759 100644
> > > --- a/hw/block/vhost-user-blk.c
> > > +++ b/hw/block/vhost-user-blk.c
> > > @@ -50,6 +50,8 @@ static const int user_feature_bits[] = {
> > >  VHOST_INVALID_FEATURE_BIT
> > >  };
> > >  
> > > +static void vhost_user_blk_event(void *opaque, QEMUChrEvent event);
> > > +
> > >  static void vhost_user_blk_update_config(VirtIODevice *vdev, uint8_t 
> > > *config)
> > >  {
> > > 

Re: [PATCH 2/5] vhost-user-blk: Use Error more consistently

2021-04-28 Thread Raphael Norwitz
Code looks ok - question about the commit message.

Acked-by: Raphael Norwitz 

On Thu, Apr 22, 2021 at 07:02:18PM +0200, Kevin Wolf wrote:
> Now that vhost_user_blk_connect() is not called from an event handler
> any more, but directly from vhost_user_blk_device_realize(), we don't
> have to resort to error_report() any more, but can use Error again.

vhost_user_blk_connect() is still called by vhost_user_blk_event() which
is registered as an event handler. I don't understand your point around
us having had to use error_report() before, but not anymore. Can you
clarify?

> 
> With Error, the callers are responsible for adding context if necessary
> (such as the "-device" option the error refers to). Additional prefixes
> are redundant and better omitted.
> 
> Signed-off-by: Kevin Wolf 
> ---
>  hw/block/vhost-user-blk.c | 22 +++---
>  1 file changed, 11 insertions(+), 11 deletions(-)
> 
> diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
> index e824b0a759..8422a29470 100644
> --- a/hw/block/vhost-user-blk.c
> +++ b/hw/block/vhost-user-blk.c
> @@ -311,7 +311,7 @@ static void vhost_user_blk_reset(VirtIODevice *vdev)
>  vhost_dev_free_inflight(s->inflight);
>  }
>  
> -static int vhost_user_blk_connect(DeviceState *dev)
> +static int vhost_user_blk_connect(DeviceState *dev, Error **errp)
>  {
>  VirtIODevice *vdev = VIRTIO_DEVICE(dev);
>  VHostUserBlk *s = VHOST_USER_BLK(vdev);
> @@ -331,8 +331,7 @@ static int vhost_user_blk_connect(DeviceState *dev)
>  
>  ret = vhost_dev_init(>dev, >vhost_user, VHOST_BACKEND_TYPE_USER, 
> 0);
>  if (ret < 0) {
> -error_report("vhost-user-blk: vhost initialization failed: %s",
> - strerror(-ret));
> +error_setg_errno(errp, -ret, "vhost initialization failed");
>  return ret;
>  }
>  
> @@ -340,8 +339,7 @@ static int vhost_user_blk_connect(DeviceState *dev)
>  if (virtio_device_started(vdev, vdev->status)) {
>  ret = vhost_user_blk_start(vdev);
>  if (ret < 0) {
> -error_report("vhost-user-blk: vhost start failed: %s",
> - strerror(-ret));
> +error_setg_errno(errp, -ret, "vhost start failed");
>  return ret;
>  }
>  }
> @@ -380,10 +378,12 @@ static void vhost_user_blk_event(void *opaque, 
> QEMUChrEvent event)
>  DeviceState *dev = opaque;
>  VirtIODevice *vdev = VIRTIO_DEVICE(dev);
>  VHostUserBlk *s = VHOST_USER_BLK(vdev);
> +Error *local_err = NULL;
>  
>  switch (event) {
>  case CHR_EVENT_OPENED:
> -if (vhost_user_blk_connect(dev) < 0) {
> +if (vhost_user_blk_connect(dev, _err) < 0) {
> +error_report_err(local_err);
>  qemu_chr_fe_disconnect(>chardev);
>  return;
>  }
> @@ -427,7 +427,7 @@ static void vhost_user_blk_device_realize(DeviceState 
> *dev, Error **errp)
>  int i, ret;
>  
>  if (!s->chardev.chr) {
> -error_setg(errp, "vhost-user-blk: chardev is mandatory");
> +error_setg(errp, "chardev is mandatory");
>  return;
>  }
>  
> @@ -435,16 +435,16 @@ static void vhost_user_blk_device_realize(DeviceState 
> *dev, Error **errp)
>  s->num_queues = 1;
>  }
>  if (!s->num_queues || s->num_queues > VIRTIO_QUEUE_MAX) {
> -error_setg(errp, "vhost-user-blk: invalid number of IO queues");
> +error_setg(errp, "invalid number of IO queues");
>  return;
>  }
>  
>  if (!s->queue_size) {
> -error_setg(errp, "vhost-user-blk: queue size must be non-zero");
> +error_setg(errp, "queue size must be non-zero");
>  return;
>  }
>  if (s->queue_size > VIRTQUEUE_MAX_SIZE) {
> -error_setg(errp, "vhost-user-blk: queue size must not exceed %d",
> +error_setg(errp, "queue size must not exceed %d",
> VIRTQUEUE_MAX_SIZE);
>  return;
>  }
> @@ -471,7 +471,7 @@ static void vhost_user_blk_device_realize(DeviceState 
> *dev, Error **errp)
>  goto virtio_err;
>  }
>  
> -if (vhost_user_blk_connect(dev) < 0) {
> +if (vhost_user_blk_connect(dev, errp) < 0) {
>  qemu_chr_fe_disconnect(>chardev);
>  goto virtio_err;
>  }
> -- 
> 2.30.2
> 


Re: [PATCH 1/5] vhost-user-blk: Don't reconnect during initialisation

2021-04-28 Thread Kevin Wolf
Am 28.04.2021 um 18:52 hat Raphael Norwitz geschrieben:
> Given what you've shown with the use-after-free, I agree the changes
> need to be reverted. I'm a little uneasy about removing the reconnect
> logic from the device realization completely though.
> 
> On Thu, Apr 22, 2021 at 07:02:17PM +0200, Kevin Wolf wrote:
> > This is a partial revert of commits 77542d43149 and bc79c87bcde.
> > 
> > Usually, an error during initialisation means that the configuration was
> > wrong. Reconnecting won't make the error go away, but just turn the
> > error condition into an endless loop. Avoid this and return errors
> > again.
> > 
> 
> Is that nessesarily true? As I understand it the main usecases for
> device reconnect are to allow a device backend to be restarted after a
> failure or to allow the backend to be upgraded without restarting the
> guest. I agree - misconfiguration could be a common cause of a device
> backend crashing at realize time, but couldn't there be others? Maybe
> transient memory pressure?
> 
> Especially in the case where one process is connecting to many different
> vhost-user-blk instances, I could imagine power-ons and incoming
> migrations racing with backend restarts quite frequently. Should
> these cases cause failures?
> 
> We can still hit the infinite looping case you describe post-realize.
> Why should we treat pre-realize differently?

I think there is one main difference between realize() and later
operation, which is that we can actually deliver an error to the user
during realize(). When we're just starting QEMU and processing the CLI
arguments, failure is very obvious, and in the context of QMP
device-add, the client is actively waiting for a result, too.

Later on, there is no good way to communicate an error (writes to stderr
just end up in some logfile at best, QAPI events can be missed), and
even if there were, we would have to do something with the guest until
the user/management application actually reacts to the error. The latter
is not a problem during realize() because the device wasn't even plugged
in yet.

So I think there are good reasons why it could make sense to distinguish
initialisation and operation.

> > Additionally, calling vhost_user_blk_disconnect() from the chardev event
> > handler could result in use-after-free because none of the
> > initialisation code expects that the device could just go away in the
> > middle. So removing the call fixes crashes in several places.
> > 
> > For example, using a num-queues setting that is incompatible with the
> > backend would result in a crash like this (dereferencing dev->opaque,
> > which is already NULL):
> > 
> >  #0  0x55d0a4bd in vhost_user_read_cb (source=0x568f4690, 
> > condition=(G_IO_IN | G_IO_HUP), opaque=0x7fffcbf0) at 
> > ../hw/virtio/vhost-user.c:313
> >  #1  0x55d950d3 in qio_channel_fd_source_dispatch 
> > (source=0x57c3f750, callback=0x55d0a478 , 
> > user_data=0x7fffcbf0) at ../io/channel-watch.c:84
> >  #2  0x77b32a9f in g_main_context_dispatch () at 
> > /lib64/libglib-2.0.so.0
> >  #3  0x77b84a98 in g_main_context_iterate.constprop () at 
> > /lib64/libglib-2.0.so.0
> >  #4  0x77b32163 in g_main_loop_run () at /lib64/libglib-2.0.so.0
> >  #5  0x55d0a724 in vhost_user_read (dev=0x57bc62f8, 
> > msg=0x7fffcc50) at ../hw/virtio/vhost-user.c:402
> >  #6  0x55d0ee6b in vhost_user_get_config (dev=0x57bc62f8, 
> > config=0x57bc62ac "", config_len=60) at ../hw/virtio/vhost-user.c:2133
> >  #7  0x55d56d46 in vhost_dev_get_config (hdev=0x57bc62f8, 
> > config=0x57bc62ac "", config_len=60) at ../hw/virtio/vhost.c:1566
> >  #8  0x55cdd150 in vhost_user_blk_device_realize 
> > (dev=0x57bc60b0, errp=0x7fffcf90) at 
> > ../hw/block/vhost-user-blk.c:510
> >  #9  0x55d08f6d in virtio_device_realize (dev=0x57bc60b0, 
> > errp=0x7fffcff0) at ../hw/virtio/virtio.c:3660
> > 
> > Signed-off-by: Kevin Wolf 
> > ---
> >  hw/block/vhost-user-blk.c | 54 ++-
> >  1 file changed, 13 insertions(+), 41 deletions(-)
> > 
> > diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
> > index f5e9682703..e824b0a759 100644
> > --- a/hw/block/vhost-user-blk.c
> > +++ b/hw/block/vhost-user-blk.c
> > @@ -50,6 +50,8 @@ static const int user_feature_bits[] = {
> >  VHOST_INVALID_FEATURE_BIT
> >  };
> >  
> > +static void vhost_user_blk_event(void *opaque, QEMUChrEvent event);
> > +
> >  static void vhost_user_blk_update_config(VirtIODevice *vdev, uint8_t 
> > *config)
> >  {
> >  VHostUserBlk *s = VHOST_USER_BLK(vdev);
> > @@ -362,19 +364,6 @@ static void vhost_user_blk_disconnect(DeviceState *dev)
> >  vhost_dev_cleanup(>dev);
> >  }
> >  
> > -static void vhost_user_blk_event(void *opaque, QEMUChrEvent event,
> > - bool realized);
> > -
> > -static void vhost_user_blk_event_realize(void *opaque, 

Re: [PATCH 1/5] vhost-user-blk: Don't reconnect during initialisation

2021-04-28 Thread Raphael Norwitz
Given what you've shown with the use-after-free, I agree the changes
need to be reverted. I'm a little uneasy about removing the reconnect
logic from the device realization completely though.

On Thu, Apr 22, 2021 at 07:02:17PM +0200, Kevin Wolf wrote:
> This is a partial revert of commits 77542d43149 and bc79c87bcde.
> 
> Usually, an error during initialisation means that the configuration was
> wrong. Reconnecting won't make the error go away, but just turn the
> error condition into an endless loop. Avoid this and return errors
> again.
> 

Is that nessesarily true? As I understand it the main usecases for
device reconnect are to allow a device backend to be restarted after a
failure or to allow the backend to be upgraded without restarting the
guest. I agree - misconfiguration could be a common cause of a device
backend crashing at realize time, but couldn't there be others? Maybe
transient memory pressure?

Especially in the case where one process is connecting to many different
vhost-user-blk instances, I could imagine power-ons and incoming
migrations racing with backend restarts quite frequently. Should
these cases cause failures?

We can still hit the infinite looping case you describe post-realize.
Why should we treat pre-realize differently?

> Additionally, calling vhost_user_blk_disconnect() from the chardev event
> handler could result in use-after-free because none of the
> initialisation code expects that the device could just go away in the
> middle. So removing the call fixes crashes in several places.
> 
> For example, using a num-queues setting that is incompatible with the
> backend would result in a crash like this (dereferencing dev->opaque,
> which is already NULL):
> 
>  #0  0x55d0a4bd in vhost_user_read_cb (source=0x568f4690, 
> condition=(G_IO_IN | G_IO_HUP), opaque=0x7fffcbf0) at 
> ../hw/virtio/vhost-user.c:313
>  #1  0x55d950d3 in qio_channel_fd_source_dispatch 
> (source=0x57c3f750, callback=0x55d0a478 , 
> user_data=0x7fffcbf0) at ../io/channel-watch.c:84
>  #2  0x77b32a9f in g_main_context_dispatch () at 
> /lib64/libglib-2.0.so.0
>  #3  0x77b84a98 in g_main_context_iterate.constprop () at 
> /lib64/libglib-2.0.so.0
>  #4  0x77b32163 in g_main_loop_run () at /lib64/libglib-2.0.so.0
>  #5  0x55d0a724 in vhost_user_read (dev=0x57bc62f8, 
> msg=0x7fffcc50) at ../hw/virtio/vhost-user.c:402
>  #6  0x55d0ee6b in vhost_user_get_config (dev=0x57bc62f8, 
> config=0x57bc62ac "", config_len=60) at ../hw/virtio/vhost-user.c:2133
>  #7  0x55d56d46 in vhost_dev_get_config (hdev=0x57bc62f8, 
> config=0x57bc62ac "", config_len=60) at ../hw/virtio/vhost.c:1566
>  #8  0x55cdd150 in vhost_user_blk_device_realize (dev=0x57bc60b0, 
> errp=0x7fffcf90) at ../hw/block/vhost-user-blk.c:510
>  #9  0x55d08f6d in virtio_device_realize (dev=0x57bc60b0, 
> errp=0x7fffcff0) at ../hw/virtio/virtio.c:3660
> 
> Signed-off-by: Kevin Wolf 
> ---
>  hw/block/vhost-user-blk.c | 54 ++-
>  1 file changed, 13 insertions(+), 41 deletions(-)
> 
> diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
> index f5e9682703..e824b0a759 100644
> --- a/hw/block/vhost-user-blk.c
> +++ b/hw/block/vhost-user-blk.c
> @@ -50,6 +50,8 @@ static const int user_feature_bits[] = {
>  VHOST_INVALID_FEATURE_BIT
>  };
>  
> +static void vhost_user_blk_event(void *opaque, QEMUChrEvent event);
> +
>  static void vhost_user_blk_update_config(VirtIODevice *vdev, uint8_t *config)
>  {
>  VHostUserBlk *s = VHOST_USER_BLK(vdev);
> @@ -362,19 +364,6 @@ static void vhost_user_blk_disconnect(DeviceState *dev)
>  vhost_dev_cleanup(>dev);
>  }
>  
> -static void vhost_user_blk_event(void *opaque, QEMUChrEvent event,
> - bool realized);
> -
> -static void vhost_user_blk_event_realize(void *opaque, QEMUChrEvent event)
> -{
> -vhost_user_blk_event(opaque, event, false);
> -}
> -
> -static void vhost_user_blk_event_oper(void *opaque, QEMUChrEvent event)
> -{
> -vhost_user_blk_event(opaque, event, true);
> -}
> -
>  static void vhost_user_blk_chr_closed_bh(void *opaque)
>  {
>  DeviceState *dev = opaque;
> @@ -382,12 +371,11 @@ static void vhost_user_blk_chr_closed_bh(void *opaque)
>  VHostUserBlk *s = VHOST_USER_BLK(vdev);
>  
>  vhost_user_blk_disconnect(dev);
> -qemu_chr_fe_set_handlers(>chardev, NULL, NULL,
> -vhost_user_blk_event_oper, NULL, opaque, NULL, true);
> +qemu_chr_fe_set_handlers(>chardev, NULL, NULL, vhost_user_blk_event,
> + NULL, opaque, NULL, true);
>  }
>  
> -static void vhost_user_blk_event(void *opaque, QEMUChrEvent event,
> - bool realized)
> +static void vhost_user_blk_event(void *opaque, QEMUChrEvent event)
>  {
>  DeviceState *dev = opaque;
>  VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> @@ 

Re: [PATCH v4 00/36] block: update graph permissions update

2021-04-28 Thread Kevin Wolf
Am 28.04.2021 um 17:17 hat Vladimir Sementsov-Ogievskiy geschrieben:
> Hi all!
> 
> And here is v4.

Thanks, applied to the block branch.

Though the error message shown by the test in patch 18 does need some
improvement on top. We should probably name both conflicting users and
who blocks whom in a way that is neutral as to which user is the new
one.

Also, curious that again patch 7 (and only patch 7) got From: mangled by
the mailing list. Seems there is something in it that Mailman doesn't
like?

Kevin




Re: [PATCH v4 00/36] block: update graph permissions update

2021-04-28 Thread no-reply
Patchew URL: 
https://patchew.org/QEMU/20210428151804.439460-1-vsement...@virtuozzo.com/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 20210428151804.439460-1-vsement...@virtuozzo.com
Subject: [PATCH v4 00/36] block: update graph permissions update

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 - [tag update]  patchew/20210428141655.387430-1-f4...@amsat.org -> 
patchew/20210428141655.387430-1-f4...@amsat.org
 * [new tag] patchew/20210428151804.439460-1-vsement...@virtuozzo.com 
-> patchew/20210428151804.439460-1-vsement...@virtuozzo.com
Switched to a new branch 'test'
a0ef51a block: refactor bdrv_node_check_perm()
7429fe1 block: rename bdrv_replace_child_safe() to bdrv_replace_child()
370a34f block: refactor bdrv_child_set_perm_safe() transaction action
d1a11c5 block: inline bdrv_replace_child()
ab4725c block: inline bdrv_check_perm_common()
1e87dc2 block: drop unused permission update functions
7f7c03c block: bdrv_reopen_multiple: refresh permissions on updated graph
dae8ccf block: bdrv_reopen_multiple(): move bdrv_flush to separate pre-prepare
3882433 block: add bdrv_set_backing_noperm() transaction action
4dbfab2 block: make bdrv_refresh_limits() to be a transaction action
0cf05c0 block: make bdrv_unset_inherits_from to be a transaction action
1268c27 block: drop ignore_children for permission update functions
3b2ce34 block/backup-top: drop .active
f8d8895 block: introduce bdrv_drop_filter()
5bf1218 block: add bdrv_remove_filter_or_cow transaction action
00b761a block: adapt bdrv_append() for inserting filters
dd27bfe block: split out bdrv_replace_node_noperm()
2b2d8e4 block: add bdrv_attach_child_noperm() transaction action
3c44761 block: add bdrv_attach_child_common() transaction action
3f07f35 block: fix bdrv_replace_node_common
e2a0ea6 block: add bdrv_replace_child_safe() transaction action
92f4628 block: add bdrv_list_* permission update functions
84a955c block: add bdrv_drv_set_perm transaction action
769127d block: use topological sort for permission update
1a3293c block: inline bdrv_child_*() permission functions calls
0eafad8 block: rewrite bdrv_child_try_set_perm() using bdrv_refresh_perms()
02b7d94 block: refactor bdrv_child* permission functions
081d242 block: bdrv_refresh_perms: check for parents permissions conflict
e7d5541 util: add transactions.c
a5835ab block: make bdrv_reopen_{prepare, commit, abort} private
eebe328 block: drop ctx argument from bdrv_root_attach_child
6bfb813 block: BdrvChildClass: add .get_parent_aio_context handler
faf9f5a block: bdrv_append(): don't consume reference
09a714e tests/test-bdrv-graph-mod: add test_append_greedy_filter
a2e5008 tests/test-bdrv-graph-mod: add test_parallel_perm_update
7e9a7a4 tests/test-bdrv-graph-mod: add test_parallel_exclusive_write

=== OUTPUT BEGIN ===
1/36 Checking commit 7e9a7a4d740a (tests/test-bdrv-graph-mod: add 
test_parallel_exclusive_write)
2/36 Checking commit a2e500832942 (tests/test-bdrv-graph-mod: add 
test_parallel_perm_update)
3/36 Checking commit 09a714e2f77d (tests/test-bdrv-graph-mod: add 
test_append_greedy_filter)
4/36 Checking commit faf9f5a35e07 (block: bdrv_append(): don't consume 
reference)
5/36 Checking commit 6bfb813f671d (block: BdrvChildClass: add 
.get_parent_aio_context handler)
6/36 Checking commit eebe328485c7 (block: drop ctx argument from 
bdrv_root_attach_child)
7/36 Checking commit a5835ab099b6 (block: make bdrv_reopen_{prepare, commit, 
abort} private)
ERROR: Author email address is mangled by the mailing list
#2: 
Author: Vladimir Sementsov-Ogievskiy via 

total: 1 errors, 0 warnings, 47 lines checked

Patch 7/36 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

8/36 Checking commit e7d55412e1a6 (util: add transactions.c)
9/36 Checking commit 081d242cc6c7 (block: bdrv_refresh_perms: check for parents 
permissions conflict)
10/36 Checking commit 02b7d94ba496 (block: refactor bdrv_child* permission 
functions)
11/36 Checking commit 0eafad8a0bdb (block: rewrite bdrv_child_try_set_perm() 
using bdrv_refresh_perms())
12/36 Checking commit 1a3293c8af87 (block: inline bdrv_child_*() permission 
functions calls)
13/36 Checking commit 769127da032f (block: use topological sort for permission 
update)
14/36 Checking commit 84a955ce7cdd (block: add bdrv_drv_set_perm transaction 
action)
15/36 Checking commit 92f46287bd3e (block: add bdrv_list_* permission update 
functions)
16/36 Checking commit e2a0ea66e2c7 (block: add bdrv_replace_child_safe() 
transaction action)
17/36 Checking commit 3f07f3545be4 (block: fix 

[PATCH v4 33/36] block: inline bdrv_replace_child()

2021-04-28 Thread Vladimir Sementsov-Ogievskiy
bdrv_replace_child() has only one caller, the second argument is
unused. Inline it now. This triggers deletion of some more unused
interfaces.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
---
 block.c | 101 ++--
 1 file changed, 18 insertions(+), 83 deletions(-)

diff --git a/block.c b/block.c
index 38bd2ea32e..2362c934a4 100644
--- a/block.c
+++ b/block.c
@@ -2401,42 +2401,6 @@ static int bdrv_list_refresh_perms(GSList *list, 
BlockReopenQueue *q,
 return 0;
 }
 
-static void bdrv_node_set_perm(BlockDriverState *bs)
-{
-BlockDriver *drv = bs->drv;
-BdrvChild *c;
-
-if (!drv) {
-return;
-}
-
-bdrv_drv_set_perm_commit(bs);
-
-/* Drivers that never have children can omit .bdrv_child_perm() */
-if (!drv->bdrv_child_perm) {
-assert(QLIST_EMPTY(>children));
-return;
-}
-
-/* Update all children */
-QLIST_FOREACH(c, >children, next) {
-bdrv_child_set_perm_commit(c);
-}
-}
-
-static void bdrv_list_set_perm(GSList *list)
-{
-for ( ; list; list = list->next) {
-bdrv_node_set_perm((BlockDriverState *)list->data);
-}
-}
-
-static void bdrv_set_perm(BlockDriverState *bs)
-{
-g_autoptr(GSList) list = bdrv_topological_dfs(NULL, NULL, bs);
-return bdrv_list_set_perm(list);
-}
-
 void bdrv_get_cumulative_perm(BlockDriverState *bs, uint64_t *perm,
   uint64_t *shared_perm)
 {
@@ -2776,52 +2740,6 @@ static void bdrv_replace_child_noperm(BdrvChild *child,
 }
 }
 
-/*
- * Updates @child to change its reference to point to @new_bs, including
- * checking and applying the necessary permission updates both to the old node
- * and to @new_bs.
- *
- * NULL is passed as @new_bs for removing the reference before freeing @child.
- *
- * If @new_bs is not NULL, bdrv_check_perm() must be called beforehand, as this
- * function uses bdrv_set_perm() to update the permissions according to the new
- * reference that @new_bs gets.
- *
- * Callers must ensure that child->frozen is false.
- */
-static void bdrv_replace_child(BdrvChild *child, BlockDriverState *new_bs)
-{
-BlockDriverState *old_bs = child->bs;
-
-/* Asserts that child->frozen == false */
-bdrv_replace_child_noperm(child, new_bs);
-
-/*
- * Start with the new node's permissions.  If @new_bs is a (direct
- * or indirect) child of @old_bs, we must complete the permission
- * update on @new_bs before we loosen the restrictions on @old_bs.
- * Otherwise, bdrv_check_perm() on @old_bs would re-initiate
- * updating the permissions of @new_bs, and thus not purely loosen
- * restrictions.
- */
-if (new_bs) {
-bdrv_set_perm(new_bs);
-}
-
-if (old_bs) {
-/*
- * Update permissions for old node. We're just taking a parent away, so
- * we're loosening restrictions. Errors of permission update are not
- * fatal in this case, ignore them.
- */
-bdrv_refresh_perms(old_bs, NULL);
-
-/* When the parent requiring a non-default AioContext is removed, the
- * node moves back to the main AioContext */
-bdrv_try_set_aio_context(old_bs, qemu_get_aio_context(), NULL);
-}
-}
-
 static void bdrv_child_free(void *opaque)
 {
 BdrvChild *c = opaque;
@@ -2989,8 +2907,25 @@ static int bdrv_attach_child_noperm(BlockDriverState 
*parent_bs,
 
 static void bdrv_detach_child(BdrvChild *child)
 {
-bdrv_replace_child(child, NULL);
+BlockDriverState *old_bs = child->bs;
+
+bdrv_replace_child_noperm(child, NULL);
 bdrv_remove_empty_child(child);
+
+if (old_bs) {
+/*
+ * Update permissions for old node. We're just taking a parent away, so
+ * we're loosening restrictions. Errors of permission update are not
+ * fatal in this case, ignore them.
+ */
+bdrv_refresh_perms(old_bs, NULL);
+
+/*
+ * When the parent requiring a non-default AioContext is removed, the
+ * node moves back to the main AioContext
+ */
+bdrv_try_set_aio_context(old_bs, qemu_get_aio_context(), NULL);
+}
 }
 
 /*
-- 
2.29.2




[PATCH v4 28/36] block: add bdrv_set_backing_noperm() transaction action

2021-04-28 Thread Vladimir Sementsov-Ogievskiy
Split out no-perm part of bdrv_set_backing_hd() as a separate
transaction action. Note the in case of existing BdrvChild we reuse it,
not recreate, just to do less actions.

We don't need to create extra reference to backing_hd as we don't lose
it in bdrv_attach_child().

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
---
 block.c | 54 +-
 1 file changed, 37 insertions(+), 17 deletions(-)

diff --git a/block.c b/block.c
index 9922943793..bdfe59d94d 100644
--- a/block.c
+++ b/block.c
@@ -92,6 +92,8 @@ static int bdrv_attach_child_noperm(BlockDriverState 
*parent_bs,
 BdrvChild **child,
 Transaction *tran,
 Error **errp);
+static void bdrv_remove_filter_or_cow_child(BlockDriverState *bs,
+Transaction *tran);
 
 static int bdrv_reopen_prepare(BDRVReopenState *reopen_state, BlockReopenQueue
*queue, Error **errp);
@@ -3322,8 +3324,9 @@ static BdrvChildRole bdrv_backing_role(BlockDriverState 
*bs)
  * Sets the bs->backing link of a BDS. A new reference is created; callers
  * which don't need their own reference any more must call bdrv_unref().
  */
-int bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd,
-Error **errp)
+static int bdrv_set_backing_noperm(BlockDriverState *bs,
+   BlockDriverState *backing_hd,
+   Transaction *tran, Error **errp)
 {
 int ret = 0;
 bool update_inherits_from = bdrv_chain_contains(bs, backing_hd) &&
@@ -,36 +3336,53 @@ int bdrv_set_backing_hd(BlockDriverState *bs, 
BlockDriverState *backing_hd,
 return -EPERM;
 }
 
-if (backing_hd) {
-bdrv_ref(backing_hd);
-}
-
 if (bs->backing) {
 /* Cannot be frozen, we checked that above */
-bdrv_unref_child(bs, bs->backing);
-bs->backing = NULL;
+bdrv_unset_inherits_from(bs, bs->backing, tran);
+bdrv_remove_filter_or_cow_child(bs, tran);
 }
 
 if (!backing_hd) {
 goto out;
 }
 
-bs->backing = bdrv_attach_child(bs, backing_hd, "backing", _of_bds,
-bdrv_backing_role(bs), errp);
-if (!bs->backing) {
-ret = -EPERM;
-goto out;
+ret = bdrv_attach_child_noperm(bs, backing_hd, "backing",
+   _of_bds, bdrv_backing_role(bs),
+   >backing, tran, errp);
+if (ret < 0) {
+return ret;
 }
 
-/* If backing_hd was already part of bs's backing chain, and
+
+/*
+ * If backing_hd was already part of bs's backing chain, and
  * inherits_from pointed recursively to bs then let's update it to
- * point directly to bs (else it will become NULL). */
+ * point directly to bs (else it will become NULL).
+ */
 if (update_inherits_from) {
-backing_hd->inherits_from = bs;
+bdrv_set_inherits_from(backing_hd, bs, tran);
 }
 
 out:
-bdrv_refresh_limits(bs, NULL, NULL);
+bdrv_refresh_limits(bs, tran, NULL);
+
+return 0;
+}
+
+int bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd,
+Error **errp)
+{
+int ret;
+Transaction *tran = tran_new();
+
+ret = bdrv_set_backing_noperm(bs, backing_hd, tran, errp);
+if (ret < 0) {
+goto out;
+}
+
+ret = bdrv_refresh_perms(bs, errp);
+out:
+tran_finalize(tran, ret);
 
 return ret;
 }
-- 
2.29.2




[PATCH v4 32/36] block: inline bdrv_check_perm_common()

2021-04-28 Thread Vladimir Sementsov-Ogievskiy
bdrv_check_perm_common() has only one caller, so no more sense in
"common".

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
---
 block.c | 32 +++-
 1 file changed, 3 insertions(+), 29 deletions(-)

diff --git a/block.c b/block.c
index bf7c187435..38bd2ea32e 100644
--- a/block.c
+++ b/block.c
@@ -2373,33 +2373,13 @@ static int bdrv_node_check_perm(BlockDriverState *bs, 
BlockReopenQueue *q,
 return 0;
 }
 
-/*
- * If use_cumulative_perms is true, use cumulative_perms and
- * cumulative_shared_perms for first element of the list. Otherwise just 
refresh
- * all permissions.
- */
-static int bdrv_check_perm_common(GSList *list, BlockReopenQueue *q,
-  bool use_cumulative_perms,
-  uint64_t cumulative_perms,
-  uint64_t cumulative_shared_perms,
-  Transaction *tran, Error **errp)
+static int bdrv_list_refresh_perms(GSList *list, BlockReopenQueue *q,
+   Transaction *tran, Error **errp)
 {
 int ret;
+uint64_t cumulative_perms, cumulative_shared_perms;
 BlockDriverState *bs;
 
-if (use_cumulative_perms) {
-bs = list->data;
-
-ret = bdrv_node_check_perm(bs, q, cumulative_perms,
-   cumulative_shared_perms,
-   tran, errp);
-if (ret < 0) {
-return ret;
-}
-
-list = list->next;
-}
-
 for ( ; list; list = list->next) {
 bs = list->data;
 
@@ -2421,12 +2401,6 @@ static int bdrv_check_perm_common(GSList *list, 
BlockReopenQueue *q,
 return 0;
 }
 
-static int bdrv_list_refresh_perms(GSList *list, BlockReopenQueue *q,
-   Transaction *tran, Error **errp)
-{
-return bdrv_check_perm_common(list, q, false, 0, 0, tran, errp);
-}
-
 static void bdrv_node_set_perm(BlockDriverState *bs)
 {
 BlockDriver *drv = bs->drv;
-- 
2.29.2




[PATCH v4 34/36] block: refactor bdrv_child_set_perm_safe() transaction action

2021-04-28 Thread Vladimir Sementsov-Ogievskiy
Old interfaces dropped, nobody directly calls
bdrv_child_set_perm_abort() and bdrv_child_set_perm_commit(), so we can
use personal state structure for the action and stop exploiting
BdrvChild structure. Also, drop "_safe" suffix which is redundant now.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
---
 include/block/block_int.h |  5 
 block.c   | 63 ++-
 2 files changed, 22 insertions(+), 46 deletions(-)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index dd2de6bd1d..c823f5b1b3 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -813,11 +813,6 @@ struct BdrvChild {
  */
 uint64_t shared_perm;
 
-/* backup of permissions during permission update procedure */
-bool has_backup_perm;
-uint64_t backup_perm;
-uint64_t backup_shared_perm;
-
 /*
  * This link is frozen: the child can neither be replaced nor
  * detached from the parent.
diff --git a/block.c b/block.c
index 2362c934a4..7b2a8844f6 100644
--- a/block.c
+++ b/block.c
@@ -2135,59 +2135,40 @@ static GSList *bdrv_topological_dfs(GSList *list, 
GHashTable *found,
 return g_slist_prepend(list, bs);
 }
 
-static void bdrv_child_set_perm_commit(void *opaque)
-{
-BdrvChild *c = opaque;
-
-c->has_backup_perm = false;
-}
+typedef struct BdrvChildSetPermState {
+BdrvChild *child;
+uint64_t old_perm;
+uint64_t old_shared_perm;
+} BdrvChildSetPermState;
 
 static void bdrv_child_set_perm_abort(void *opaque)
 {
-BdrvChild *c = opaque;
-/*
- * We may have child->has_backup_perm unset at this point, as in case of
- * _check_ stage of permission update failure we may _check_ not the whole
- * subtree.  Still, _abort_ is called on the whole subtree anyway.
- */
-if (c->has_backup_perm) {
-c->perm = c->backup_perm;
-c->shared_perm = c->backup_shared_perm;
-c->has_backup_perm = false;
-}
+BdrvChildSetPermState *s = opaque;
+
+s->child->perm = s->old_perm;
+s->child->shared_perm = s->old_shared_perm;
 }
 
 static TransactionActionDrv bdrv_child_set_pem_drv = {
 .abort = bdrv_child_set_perm_abort,
-.commit = bdrv_child_set_perm_commit,
+.clean = g_free,
 };
 
-/*
- * With tran=NULL needs to be followed by direct call to either
- * bdrv_child_set_perm_commit() or bdrv_child_set_perm_abort().
- *
- * With non-NULL tran needs to be followed by tran_abort() or tran_commit()
- * instead.
- */
-static void bdrv_child_set_perm_safe(BdrvChild *c, uint64_t perm,
- uint64_t shared, Transaction *tran)
+static void bdrv_child_set_perm(BdrvChild *c, uint64_t perm,
+uint64_t shared, Transaction *tran)
 {
-if (!c->has_backup_perm) {
-c->has_backup_perm = true;
-c->backup_perm = c->perm;
-c->backup_shared_perm = c->shared_perm;
-}
-/*
- * Note: it's OK if c->has_backup_perm was already set, as we can find the
- * same c twice during check_perm procedure
- */
+BdrvChildSetPermState *s = g_new(BdrvChildSetPermState, 1);
+
+*s = (BdrvChildSetPermState) {
+.child = c,
+.old_perm = c->perm,
+.old_shared_perm = c->shared_perm,
+};
 
 c->perm = perm;
 c->shared_perm = shared;
 
-if (tran) {
-tran_add(tran, _child_set_pem_drv, c);
-}
+tran_add(tran, _child_set_pem_drv, s);
 }
 
 static void bdrv_drv_set_perm_commit(void *opaque)
@@ -2367,7 +2348,7 @@ static int bdrv_node_check_perm(BlockDriverState *bs, 
BlockReopenQueue *q,
 bdrv_child_perm(bs, c->bs, c, c->role, q,
 cumulative_perms, cumulative_shared_perms,
 _perm, _shared);
-bdrv_child_set_perm_safe(c, cur_perm, cur_shared, tran);
+bdrv_child_set_perm(c, cur_perm, cur_shared, tran);
 }
 
 return 0;
@@ -2466,7 +2447,7 @@ int bdrv_child_try_set_perm(BdrvChild *c, uint64_t perm, 
uint64_t shared,
 Transaction *tran = tran_new();
 int ret;
 
-bdrv_child_set_perm_safe(c, perm, shared, tran);
+bdrv_child_set_perm(c, perm, shared, tran);
 
 ret = bdrv_refresh_perms(c->bs, _err);
 
-- 
2.29.2




[PATCH v4 24/36] block/backup-top: drop .active

2021-04-28 Thread Vladimir Sementsov-Ogievskiy
We don't need this workaround anymore: bdrv_append is already smart
enough and we can use new bdrv_drop_filter().

This commit efficiently reverts also recent 705dde27c6c53b73, which
checked .active on io path. Still it said that the problem should be
theoretical. And the logic of filter removement is changed anyway.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
---
 block/backup-top.c | 47 +-
 tests/qemu-iotests/283.out |  2 +-
 2 files changed, 2 insertions(+), 47 deletions(-)

diff --git a/block/backup-top.c b/block/backup-top.c
index 62d09f213e..425e3778be 100644
--- a/block/backup-top.c
+++ b/block/backup-top.c
@@ -37,7 +37,6 @@
 typedef struct BDRVBackupTopState {
 BlockCopyState *bcs;
 BdrvChild *target;
-bool active;
 int64_t cluster_size;
 } BDRVBackupTopState;
 
@@ -45,12 +44,6 @@ static coroutine_fn int backup_top_co_preadv(
 BlockDriverState *bs, uint64_t offset, uint64_t bytes,
 QEMUIOVector *qiov, int flags)
 {
-BDRVBackupTopState *s = bs->opaque;
-
-if (!s->active) {
-return -EIO;
-}
-
 return bdrv_co_preadv(bs->backing, offset, bytes, qiov, flags);
 }
 
@@ -60,10 +53,6 @@ static coroutine_fn int backup_top_cbw(BlockDriverState *bs, 
uint64_t offset,
 BDRVBackupTopState *s = bs->opaque;
 uint64_t off, end;
 
-if (!s->active) {
-return -EIO;
-}
-
 if (flags & BDRV_REQ_WRITE_UNCHANGED) {
 return 0;
 }
@@ -137,21 +126,6 @@ static void backup_top_child_perm(BlockDriverState *bs, 
BdrvChild *c,
   uint64_t perm, uint64_t shared,
   uint64_t *nperm, uint64_t *nshared)
 {
-BDRVBackupTopState *s = bs->opaque;
-
-if (!s->active) {
-/*
- * The filter node may be in process of bdrv_append(), which firstly do
- * bdrv_set_backing_hd() and then bdrv_replace_node(). This means that
- * we can't unshare BLK_PERM_WRITE during bdrv_append() operation. So,
- * let's require nothing during bdrv_append() and refresh permissions
- * after it (see bdrv_backup_top_append()).
- */
-*nperm = 0;
-*nshared = BLK_PERM_ALL;
-return;
-}
-
 if (!(role & BDRV_CHILD_FILTERED)) {
 /*
  * Target child
@@ -241,17 +215,6 @@ BlockDriverState *bdrv_backup_top_append(BlockDriverState 
*source,
 }
 appended = true;
 
-/*
- * bdrv_append() finished successfully, now we can require permissions
- * we want.
- */
-state->active = true;
-ret = bdrv_child_refresh_perms(top, top->backing, errp);
-if (ret < 0) {
-error_prepend(errp, "Cannot set permissions for backup-top filter: ");
-goto fail;
-}
-
 state->cluster_size = cluster_size;
 state->bcs = block_copy_state_new(top->backing, state->target,
   cluster_size, perf->use_copy_range,
@@ -268,7 +231,6 @@ BlockDriverState *bdrv_backup_top_append(BlockDriverState 
*source,
 
 fail:
 if (appended) {
-state->active = false;
 bdrv_backup_top_drop(top);
 } else {
 bdrv_unref(top);
@@ -283,16 +245,9 @@ void bdrv_backup_top_drop(BlockDriverState *bs)
 {
 BDRVBackupTopState *s = bs->opaque;
 
-bdrv_drained_begin(bs);
+bdrv_drop_filter(bs, _abort);
 
 block_copy_state_free(s->bcs);
 
-s->active = false;
-bdrv_child_refresh_perms(bs, bs->backing, _abort);
-bdrv_replace_node(bs, bs->backing->bs, _abort);
-bdrv_set_backing_hd(bs, NULL, _abort);
-
-bdrv_drained_end(bs);
-
 bdrv_unref(bs);
 }
diff --git a/tests/qemu-iotests/283.out b/tests/qemu-iotests/283.out
index 73eb75102f..97e62a4c94 100644
--- a/tests/qemu-iotests/283.out
+++ b/tests/qemu-iotests/283.out
@@ -5,7 +5,7 @@
 {"execute": "blockdev-add", "arguments": {"driver": "blkdebug", "image": 
"base", "node-name": "other", "take-child-perms": ["write"]}}
 {"return": {}}
 {"execute": "blockdev-backup", "arguments": {"device": "source", "sync": 
"full", "target": "target"}}
-{"error": {"class": "GenericError", "desc": "Cannot set permissions for 
backup-top filter: Conflicts with use by source as 'image', which does not 
allow 'write' on base"}}
+{"error": {"class": "GenericError", "desc": "Cannot append backup-top filter: 
Conflicts with use by source as 'image', which does not allow 'write' on base"}}
 
 === backup-top should be gone after job-finalize ===
 
-- 
2.29.2




[PATCH v4 22/36] block: add bdrv_remove_filter_or_cow transaction action

2021-04-28 Thread Vladimir Sementsov-Ogievskiy
Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
---
 block.c | 84 +++--
 1 file changed, 82 insertions(+), 2 deletions(-)

diff --git a/block.c b/block.c
index 5bb6a2bef9..2ea9cc110d 100644
--- a/block.c
+++ b/block.c
@@ -2963,12 +2963,19 @@ static void bdrv_replace_child(BdrvChild *child, 
BlockDriverState *new_bs)
 }
 }
 
+static void bdrv_child_free(void *opaque)
+{
+BdrvChild *c = opaque;
+
+g_free(c->name);
+g_free(c);
+}
+
 static void bdrv_remove_empty_child(BdrvChild *child)
 {
 assert(!child->bs);
 QLIST_SAFE_REMOVE(child, next);
-g_free(child->name);
-g_free(child);
+bdrv_child_free(child);
 }
 
 typedef struct BdrvAttachChildCommonState {
@@ -4991,6 +4998,79 @@ static bool should_update_child(BdrvChild *c, 
BlockDriverState *to)
 return ret;
 }
 
+typedef struct BdrvRemoveFilterOrCowChild {
+BdrvChild *child;
+bool is_backing;
+} BdrvRemoveFilterOrCowChild;
+
+static void bdrv_remove_filter_or_cow_child_abort(void *opaque)
+{
+BdrvRemoveFilterOrCowChild *s = opaque;
+BlockDriverState *parent_bs = s->child->opaque;
+
+QLIST_INSERT_HEAD(_bs->children, s->child, next);
+if (s->is_backing) {
+parent_bs->backing = s->child;
+} else {
+parent_bs->file = s->child;
+}
+
+/*
+ * We don't have to restore child->bs here to undo bdrv_replace_child()
+ * because that function is transactionable and it registered own 
completion
+ * entries in @tran, so .abort() for bdrv_replace_child_safe() will be
+ * called automatically.
+ */
+}
+
+static void bdrv_remove_filter_or_cow_child_commit(void *opaque)
+{
+BdrvRemoveFilterOrCowChild *s = opaque;
+
+bdrv_child_free(s->child);
+}
+
+static TransactionActionDrv bdrv_remove_filter_or_cow_child_drv = {
+.abort = bdrv_remove_filter_or_cow_child_abort,
+.commit = bdrv_remove_filter_or_cow_child_commit,
+.clean = g_free,
+};
+
+/*
+ * A function to remove backing-chain child of @bs if exists: cow child for
+ * format nodes (always .backing) and filter child for filters (may be .file or
+ * .backing)
+ */
+__attribute__((unused))
+static void bdrv_remove_filter_or_cow_child(BlockDriverState *bs,
+Transaction *tran)
+{
+BdrvRemoveFilterOrCowChild *s;
+BdrvChild *child = bdrv_filter_or_cow_child(bs);
+
+if (!child) {
+return;
+}
+
+if (child->bs) {
+bdrv_replace_child_safe(child, NULL, tran);
+}
+
+s = g_new(BdrvRemoveFilterOrCowChild, 1);
+*s = (BdrvRemoveFilterOrCowChild) {
+.child = child,
+.is_backing = (child == bs->backing),
+};
+tran_add(tran, _remove_filter_or_cow_child_drv, s);
+
+QLIST_SAFE_REMOVE(child, next);
+if (s->is_backing) {
+bs->backing = NULL;
+} else {
+bs->file = NULL;
+}
+}
+
 static int bdrv_replace_node_noperm(BlockDriverState *from,
 BlockDriverState *to,
 bool auto_skip, Transaction *tran,
-- 
2.29.2




[PATCH v4 21/36] block: adapt bdrv_append() for inserting filters

2021-04-28 Thread Vladimir Sementsov-Ogievskiy
bdrv_append is not very good for inserting filters: it does extra
permission update as part of bdrv_set_backing_hd(). During this update
filter may conflict with other parents of top_bs.

Instead, let's first do all graph modifications and after it update
permissions.

append-greedy-filter test-case in test-bdrv-graph-mod is now works, so
move it out of debug option.

Note: bdrv_append() is still only works for backing-child based
filters. It's something to improve later.

Note2: we use the fact that bdrv_append() is used to append new nodes,
without backing child, so we don't need frozen check and inherits_from
logic from bdrv_set_backing_hd().

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
---
 block.c  | 27 ---
 tests/unit/test-bdrv-graph-mod.c | 17 ++---
 2 files changed, 22 insertions(+), 22 deletions(-)

diff --git a/block.c b/block.c
index 9283c8d97b..5bb6a2bef9 100644
--- a/block.c
+++ b/block.c
@@ -5088,25 +5088,38 @@ int bdrv_replace_node(BlockDriverState *from, 
BlockDriverState *to,
  * This will modify the BlockDriverState fields, and swap contents
  * between bs_new and bs_top. Both bs_new and bs_top are modified.
  *
- * bs_new must not be attached to a BlockBackend.
+ * bs_new must not be attached to a BlockBackend and must not have backing
+ * child.
  *
  * This function does not create any image files.
  */
 int bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top,
 Error **errp)
 {
-int ret = bdrv_set_backing_hd(bs_new, bs_top, errp);
+int ret;
+Transaction *tran = tran_new();
+
+assert(!bs_new->backing);
+
+ret = bdrv_attach_child_noperm(bs_new, bs_top, "backing",
+   _of_bds, bdrv_backing_role(bs_new),
+   _new->backing, tran, errp);
 if (ret < 0) {
-return ret;
+goto out;
 }
 
-ret = bdrv_replace_node(bs_top, bs_new, errp);
+ret = bdrv_replace_node_noperm(bs_top, bs_new, true, tran, errp);
 if (ret < 0) {
-bdrv_set_backing_hd(bs_new, NULL, _abort);
-return ret;
+goto out;
 }
 
-return 0;
+ret = bdrv_refresh_perms(bs_new, errp);
+out:
+tran_finalize(tran, ret);
+
+bdrv_refresh_limits(bs_top, NULL);
+
+return ret;
 }
 
 static void bdrv_delete(BlockDriverState *bs)
diff --git a/tests/unit/test-bdrv-graph-mod.c b/tests/unit/test-bdrv-graph-mod.c
index 7b3c8b437a..88f25c0cdb 100644
--- a/tests/unit/test-bdrv-graph-mod.c
+++ b/tests/unit/test-bdrv-graph-mod.c
@@ -388,16 +388,6 @@ static void test_append_greedy_filter(void)
 
 int main(int argc, char *argv[])
 {
-int i;
-bool debug = false;
-
-for (i = 1; i < argc; i++) {
-if (!strcmp(argv[i], "-d")) {
-debug = true;
-break;
-}
-}
-
 bdrv_init();
 qemu_init_main_loop(_abort);
 
@@ -410,11 +400,8 @@ int main(int argc, char *argv[])
 test_parallel_perm_update);
 g_test_add_func("/bdrv-graph-mod/parallel-exclusive-write",
 test_parallel_exclusive_write);
-
-if (debug) {
-g_test_add_func("/bdrv-graph-mod/append-greedy-filter",
-test_append_greedy_filter);
-}
+g_test_add_func("/bdrv-graph-mod/append-greedy-filter",
+test_append_greedy_filter);
 
 return g_test_run();
 }
-- 
2.29.2




[PATCH v4 36/36] block: refactor bdrv_node_check_perm()

2021-04-28 Thread Vladimir Sementsov-Ogievskiy
Now, bdrv_node_check_perm() is called only with fresh cumulative
permissions, so its actually "refresh_perm".

Move permission calculation to the function. Also, drop unreachable
error message and rewrite the remaining one to be more generic (as now
we don't know which node is added and which was already here).

Add also Virtuozzo copyright, as big work is done at this point.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
---
 block.c| 38 +++---
 tests/qemu-iotests/245 |  2 +-
 2 files changed, 12 insertions(+), 28 deletions(-)

diff --git a/block.c b/block.c
index df8fa6003c..874c22c43e 100644
--- a/block.c
+++ b/block.c
@@ -2,6 +2,7 @@
  * QEMU System Emulator block driver
  *
  * Copyright (c) 2003 Fabrice Bellard
+ * Copyright (c) 2020 Virtuozzo International GmbH.
  *
  * Permission is hereby granted, free of charge, to any person obtaining a copy
  * of this software and associated documentation files (the "Software"), to 
deal
@@ -2270,22 +2271,18 @@ static void bdrv_replace_child(BdrvChild *child, 
BlockDriverState *new_bs,
 }
 
 /*
- * Check whether permissions on this node can be changed in a way that
- * @cumulative_perms and @cumulative_shared_perms are the new cumulative
- * permissions of all its parents. This involves checking whether all necessary
- * permission changes to child nodes can be performed.
- *
- * A call to this function must always be followed by a call to bdrv_set_perm()
- * or bdrv_abort_perm_update().
+ * Refresh permissions in @bs subtree. The function is intended to be called
+ * after some graph modification that was done without permission update.
  */
-static int bdrv_node_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
-uint64_t cumulative_perms,
-uint64_t cumulative_shared_perms,
-Transaction *tran, Error **errp)
+static int bdrv_node_refresh_perm(BlockDriverState *bs, BlockReopenQueue *q,
+  Transaction *tran, Error **errp)
 {
 BlockDriver *drv = bs->drv;
 BdrvChild *c;
 int ret;
+uint64_t cumulative_perms, cumulative_shared_perms;
+
+bdrv_get_cumulative_perm(bs, _perms, _shared_perms);
 
 /* Write permissions never work with read-only images */
 if ((cumulative_perms & (BLK_PERM_WRITE | BLK_PERM_WRITE_UNCHANGED)) &&
@@ -2294,15 +2291,8 @@ static int bdrv_node_check_perm(BlockDriverState *bs, 
BlockReopenQueue *q,
 if (!bdrv_is_writable_after_reopen(bs, NULL)) {
 error_setg(errp, "Block node is read-only");
 } else {
-uint64_t current_perms, current_shared;
-bdrv_get_cumulative_perm(bs, _perms, _shared);
-if (current_perms & (BLK_PERM_WRITE | BLK_PERM_WRITE_UNCHANGED)) {
-error_setg(errp, "Cannot make block node read-only, there is "
-   "a writer on it");
-} else {
-error_setg(errp, "Cannot make block node read-only and create "
-   "a writer on it");
-}
+error_setg(errp, "Read-only block node '%s' cannot support "
+   "read-write users", bdrv_get_node_name(bs));
 }
 
 return -EPERM;
@@ -2358,7 +2348,6 @@ static int bdrv_list_refresh_perms(GSList *list, 
BlockReopenQueue *q,
Transaction *tran, Error **errp)
 {
 int ret;
-uint64_t cumulative_perms, cumulative_shared_perms;
 BlockDriverState *bs;
 
 for ( ; list; list = list->next) {
@@ -2368,12 +2357,7 @@ static int bdrv_list_refresh_perms(GSList *list, 
BlockReopenQueue *q,
 return -EINVAL;
 }
 
-bdrv_get_cumulative_perm(bs, _perms,
- _shared_perms);
-
-ret = bdrv_node_check_perm(bs, q, cumulative_perms,
-   cumulative_shared_perms,
-   tran, errp);
+ret = bdrv_node_refresh_perm(bs, q, tran, errp);
 if (ret < 0) {
 return ret;
 }
diff --git a/tests/qemu-iotests/245 b/tests/qemu-iotests/245
index 11104b9208..fc5297e268 100755
--- a/tests/qemu-iotests/245
+++ b/tests/qemu-iotests/245
@@ -905,7 +905,7 @@ class TestBlockdevReopen(iotests.QMPTestCase):
 # We can't reopen hd1 to read-only, as block-stream requires it to be
 # read-write
 self.reopen(opts['backing'], {'read-only': True},
-"Cannot make block node read-only, there is a writer on 
it")
+"Read-only block node 'hd1' cannot support read-write 
users")
 
 # We can't remove hd2 while the stream job is ongoing
 opts['backing']['backing'] = None
-- 
2.29.2




[PATCH v4 26/36] block: make bdrv_unset_inherits_from to be a transaction action

2021-04-28 Thread Vladimir Sementsov-Ogievskiy
To be used in the further commit.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
---
 block.c | 46 ++
 1 file changed, 42 insertions(+), 4 deletions(-)

diff --git a/block.c b/block.c
index 46af5852ab..1dc14908ac 100644
--- a/block.c
+++ b/block.c
@@ -3205,11 +3205,49 @@ void bdrv_root_unref_child(BdrvChild *child)
 bdrv_unref(child_bs);
 }
 
+typedef struct BdrvSetInheritsFrom {
+BlockDriverState *bs;
+BlockDriverState *old_inherits_from;
+} BdrvSetInheritsFrom;
+
+static void bdrv_set_inherits_from_abort(void *opaque)
+{
+BdrvSetInheritsFrom *s = opaque;
+
+s->bs->inherits_from = s->old_inherits_from;
+}
+
+static TransactionActionDrv bdrv_set_inherits_from_drv = {
+.abort = bdrv_set_inherits_from_abort,
+.clean = g_free,
+};
+
+/* @tran is allowed to be NULL. In this case no rollback is possible */
+static void bdrv_set_inherits_from(BlockDriverState *bs,
+   BlockDriverState *new_inherits_from,
+   Transaction *tran)
+{
+if (tran) {
+BdrvSetInheritsFrom *s = g_new(BdrvSetInheritsFrom, 1);
+
+*s = (BdrvSetInheritsFrom) {
+.bs = bs,
+.old_inherits_from = bs->inherits_from,
+};
+
+tran_add(tran, _set_inherits_from_drv, s);
+}
+
+bs->inherits_from = new_inherits_from;
+}
+
 /**
  * Clear all inherits_from pointers from children and grandchildren of
  * @root that point to @root, where necessary.
+ * @tran is allowed to be NULL. In this case no rollback is possible
  */
-static void bdrv_unset_inherits_from(BlockDriverState *root, BdrvChild *child)
+static void bdrv_unset_inherits_from(BlockDriverState *root, BdrvChild *child,
+ Transaction *tran)
 {
 BdrvChild *c;
 
@@ -3224,12 +3262,12 @@ static void bdrv_unset_inherits_from(BlockDriverState 
*root, BdrvChild *child)
 }
 }
 if (c == NULL) {
-child->bs->inherits_from = NULL;
+bdrv_set_inherits_from(child->bs, NULL, tran);
 }
 }
 
 QLIST_FOREACH(c, >bs->children, next) {
-bdrv_unset_inherits_from(root, c);
+bdrv_unset_inherits_from(root, c, tran);
 }
 }
 
@@ -3240,7 +3278,7 @@ void bdrv_unref_child(BlockDriverState *parent, BdrvChild 
*child)
 return;
 }
 
-bdrv_unset_inherits_from(parent, child);
+bdrv_unset_inherits_from(parent, child, NULL);
 bdrv_root_unref_child(child);
 }
 
-- 
2.29.2




[PATCH v4 30/36] block: bdrv_reopen_multiple: refresh permissions on updated graph

2021-04-28 Thread Vladimir Sementsov-Ogievskiy
Move bdrv_reopen_multiple to new paradigm of permission update:
first update graph relations, then do refresh the permissions.

We have to modify reopen process in file-posix driver: with new scheme
we don't have prepared permissions in raw_reopen_prepare(), so we
should reconfigure fd in raw_check_perm(). Still this seems more native
and simple anyway.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
---
 include/block/block.h |   3 +-
 block.c   | 187 --
 block/file-posix.c|  91 +++-
 3 files changed, 84 insertions(+), 197 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index ad38259c91..8d5b3ecebd 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -208,8 +208,7 @@ typedef struct BDRVReopenState {
 BlockdevDetectZeroesOptions detect_zeroes;
 bool backing_missing;
 bool replace_backing_bs;  /* new_backing_bs is ignored if this is false */
-BlockDriverState *new_backing_bs; /* If NULL then detach the current bs */
-uint64_t perm, shared_perm;
+BlockDriverState *old_backing_bs; /* keep pointer for permissions update */
 QDict *options;
 QDict *explicit_options;
 void *opaque;
diff --git a/block.c b/block.c
index 357ec1be2c..ffaa367a1b 100644
--- a/block.c
+++ b/block.c
@@ -95,8 +95,9 @@ static int bdrv_attach_child_noperm(BlockDriverState 
*parent_bs,
 static void bdrv_remove_filter_or_cow_child(BlockDriverState *bs,
 Transaction *tran);
 
-static int bdrv_reopen_prepare(BDRVReopenState *reopen_state, BlockReopenQueue
-   *queue, Error **errp);
+static int bdrv_reopen_prepare(BDRVReopenState *reopen_state,
+   BlockReopenQueue *queue,
+   Transaction *set_backings_tran, Error **errp);
 static void bdrv_reopen_commit(BDRVReopenState *reopen_state);
 static void bdrv_reopen_abort(BDRVReopenState *reopen_state);
 
@@ -2468,6 +2469,7 @@ static void bdrv_list_abort_perm_update(GSList *list)
 }
 }
 
+__attribute__((unused))
 static void bdrv_abort_perm_update(BlockDriverState *bs)
 {
 g_autoptr(GSList) list = bdrv_topological_dfs(NULL, NULL, bs);
@@ -2563,6 +2565,7 @@ char *bdrv_perm_names(uint64_t perm)
  *
  * Needs to be followed by a call to either bdrv_set_perm() or
  * bdrv_abort_perm_update(). */
+__attribute__((unused))
 static int bdrv_check_update_perm(BlockDriverState *bs, BlockReopenQueue *q,
   uint64_t new_used_perm,
   uint64_t new_shared_perm,
@@ -4183,10 +4186,6 @@ static BlockReopenQueue 
*bdrv_reopen_queue_child(BlockReopenQueue *bs_queue,
 bs_entry->state.explicit_options = explicit_options;
 bs_entry->state.flags = flags;
 
-/* This needs to be overwritten in bdrv_reopen_prepare() */
-bs_entry->state.perm = UINT64_MAX;
-bs_entry->state.shared_perm = 0;
-
 /*
  * If keep_old_opts is false then it means that unspecified
  * options must be reset to their original value. We don't allow
@@ -4271,6 +4270,9 @@ int bdrv_reopen_multiple(BlockReopenQueue *bs_queue, 
Error **errp)
 {
 int ret = -1;
 BlockReopenQueueEntry *bs_entry, *next;
+Transaction *tran = tran_new();
+g_autoptr(GHashTable) found = NULL;
+g_autoptr(GSList) refresh_list = NULL;
 
 assert(bs_queue != NULL);
 
@@ -4284,33 +4286,33 @@ int bdrv_reopen_multiple(BlockReopenQueue *bs_queue, 
Error **errp)
 
 QTAILQ_FOREACH(bs_entry, bs_queue, entry) {
 assert(bs_entry->state.bs->quiesce_counter > 0);
-if (bdrv_reopen_prepare(_entry->state, bs_queue, errp)) {
-goto cleanup;
+ret = bdrv_reopen_prepare(_entry->state, bs_queue, tran, errp);
+if (ret < 0) {
+goto abort;
 }
 bs_entry->prepared = true;
 }
 
+found = g_hash_table_new(NULL, NULL);
 QTAILQ_FOREACH(bs_entry, bs_queue, entry) {
 BDRVReopenState *state = _entry->state;
-ret = bdrv_check_perm(state->bs, bs_queue, state->perm,
-  state->shared_perm, errp);
-if (ret < 0) {
-goto cleanup_perm;
-}
-/* Check if new_backing_bs would accept the new permissions */
-if (state->replace_backing_bs && state->new_backing_bs) {
-uint64_t nperm, nshared;
-bdrv_child_perm(state->bs, state->new_backing_bs,
-NULL, bdrv_backing_role(state->bs),
-bs_queue, state->perm, state->shared_perm,
-, );
-ret = bdrv_check_update_perm(state->new_backing_bs, NULL,
- nperm, nshared, errp);
-if (ret < 0) {
-goto cleanup_perm;
-}
+
+refresh_list = bdrv_topological_dfs(refresh_list, found, state->bs);
+if 

[PATCH v4 35/36] block: rename bdrv_replace_child_safe() to bdrv_replace_child()

2021-04-28 Thread Vladimir Sementsov-Ogievskiy
We don't have bdrv_replace_child(), so it's time for
bdrv_replace_child_safe() to take its place.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
---
 block.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/block.c b/block.c
index 7b2a8844f6..df8fa6003c 100644
--- a/block.c
+++ b/block.c
@@ -2248,12 +2248,12 @@ static TransactionActionDrv bdrv_replace_child_drv = {
 };
 
 /*
- * bdrv_replace_child_safe
+ * bdrv_replace_child
  *
  * Note: real unref of old_bs is done only on commit.
  */
-static void bdrv_replace_child_safe(BdrvChild *child, BlockDriverState *new_bs,
-Transaction *tran)
+static void bdrv_replace_child(BdrvChild *child, BlockDriverState *new_bs,
+   Transaction *tran)
 {
 BdrvReplaceChildState *s = g_new(BdrvReplaceChildState, 1);
 *s = (BdrvReplaceChildState) {
@@ -4803,7 +4803,7 @@ static void 
bdrv_remove_filter_or_cow_child(BlockDriverState *bs,
 }
 
 if (child->bs) {
-bdrv_replace_child_safe(child, NULL, tran);
+bdrv_replace_child(child, NULL, tran);
 }
 
 s = g_new(BdrvRemoveFilterOrCowChild, 1);
@@ -4843,7 +4843,7 @@ static int bdrv_replace_node_noperm(BlockDriverState 
*from,
c->name, from->node_name);
 return -EPERM;
 }
-bdrv_replace_child_safe(c, to, tran);
+bdrv_replace_child(c, to, tran);
 }
 
 return 0;
-- 
2.29.2




[PATCH v4 29/36] block: bdrv_reopen_multiple(): move bdrv_flush to separate pre-prepare

2021-04-28 Thread Vladimir Sementsov-Ogievskiy
During reopen we may add backing bs from other aio context, which may
lead to changing original context of top bs.

We are going to move graph modification to prepare stage. So, it will
be possible that bdrv_flush() in bdrv_reopen_prepare called on bs in
non-original aio context, which we didn't aquire which leads to crash.

To avoid this problem move bdrv_flush() to be a separate reopen stage
before bdrv_reopen_prepare().

This doesn't seem correct to acquire only one aio context and not all
contexts participating in reopen. But it's not obvious how to do it
correctly, keeping in mind:

 1. rules of bdrv_set_aio_context_ignore() that requires new_context
lock not being held

 2. possible deadlocks because of holding all (or several?) AioContext
locks

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
---
 block.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/block.c b/block.c
index bdfe59d94d..357ec1be2c 100644
--- a/block.c
+++ b/block.c
@@ -4274,6 +4274,14 @@ int bdrv_reopen_multiple(BlockReopenQueue *bs_queue, 
Error **errp)
 
 assert(bs_queue != NULL);
 
+QTAILQ_FOREACH(bs_entry, bs_queue, entry) {
+ret = bdrv_flush(bs_entry->state.bs);
+if (ret < 0) {
+error_setg_errno(errp, -ret, "Error flushing drive");
+goto cleanup;
+}
+}
+
 QTAILQ_FOREACH(bs_entry, bs_queue, entry) {
 assert(bs_entry->state.bs->quiesce_counter > 0);
 if (bdrv_reopen_prepare(_entry->state, bs_queue, errp)) {
@@ -4669,12 +4677,6 @@ static int bdrv_reopen_prepare(BDRVReopenState 
*reopen_state,
 bdrv_reopen_perm(queue, reopen_state->bs,
  _state->perm, _state->shared_perm);
 
-ret = bdrv_flush(reopen_state->bs);
-if (ret) {
-error_setg_errno(errp, -ret, "Error flushing drive");
-goto error;
-}
-
 if (drv->bdrv_reopen_prepare) {
 /*
  * If a driver-specific option is missing, it means that we
-- 
2.29.2




[PATCH v4 25/36] block: drop ignore_children for permission update functions

2021-04-28 Thread Vladimir Sementsov-Ogievskiy
This argument is always NULL. Drop it.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
---
 block.c | 38 +++---
 1 file changed, 11 insertions(+), 27 deletions(-)

diff --git a/block.c b/block.c
index 68dfd822dd..46af5852ab 100644
--- a/block.c
+++ b/block.c
@@ -1988,7 +1988,6 @@ static int bdrv_fill_options(QDict **options, const char 
*filename,
 static int bdrv_check_update_perm(BlockDriverState *bs, BlockReopenQueue *q,
   uint64_t new_used_perm,
   uint64_t new_shared_perm,
-  GSList *ignore_children,
   Error **errp);
 
 typedef struct BlockReopenQueueEntry {
@@ -2065,9 +2064,7 @@ static bool bdrv_a_allow_b(BdrvChild *a, BdrvChild *b, 
Error **errp)
 return false;
 }
 
-static bool bdrv_parent_perms_conflict(BlockDriverState *bs,
-   GSList *ignore_children,
-   Error **errp)
+static bool bdrv_parent_perms_conflict(BlockDriverState *bs, Error **errp)
 {
 BdrvChild *a, *b;
 
@@ -2077,12 +2074,8 @@ static bool bdrv_parent_perms_conflict(BlockDriverState 
*bs,
  * directions.
  */
 QLIST_FOREACH(a, >parents, next_parent) {
-if (g_slist_find(ignore_children, a)) {
-continue;
-}
-
 QLIST_FOREACH(b, >parents, next_parent) {
-if (a == b || g_slist_find(ignore_children, b)) {
+if (a == b) {
 continue;
 }
 
@@ -2310,7 +2303,6 @@ static void bdrv_replace_child_safe(BdrvChild *child, 
BlockDriverState *new_bs,
 static int bdrv_node_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
 uint64_t cumulative_perms,
 uint64_t cumulative_shared_perms,
-GSList *ignore_children,
 Transaction *tran, Error **errp)
 {
 BlockDriver *drv = bs->drv;
@@ -2393,7 +2385,6 @@ static int bdrv_check_perm_common(GSList *list, 
BlockReopenQueue *q,
   bool use_cumulative_perms,
   uint64_t cumulative_perms,
   uint64_t cumulative_shared_perms,
-  GSList *ignore_children,
   Transaction *tran, Error **errp)
 {
 int ret;
@@ -2404,7 +2395,7 @@ static int bdrv_check_perm_common(GSList *list, 
BlockReopenQueue *q,
 
 ret = bdrv_node_check_perm(bs, q, cumulative_perms,
cumulative_shared_perms,
-   ignore_children, tran, errp);
+   tran, errp);
 if (ret < 0) {
 return ret;
 }
@@ -2415,7 +2406,7 @@ static int bdrv_check_perm_common(GSList *list, 
BlockReopenQueue *q,
 for ( ; list; list = list->next) {
 bs = list->data;
 
-if (bdrv_parent_perms_conflict(bs, ignore_children, errp)) {
+if (bdrv_parent_perms_conflict(bs, errp)) {
 return -EINVAL;
 }
 
@@ -2424,7 +2415,7 @@ static int bdrv_check_perm_common(GSList *list, 
BlockReopenQueue *q,
 
 ret = bdrv_node_check_perm(bs, q, cumulative_perms,
cumulative_shared_perms,
-   ignore_children, tran, errp);
+   tran, errp);
 if (ret < 0) {
 return ret;
 }
@@ -2435,19 +2426,17 @@ static int bdrv_check_perm_common(GSList *list, 
BlockReopenQueue *q,
 
 static int bdrv_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
uint64_t cumulative_perms,
-   uint64_t cumulative_shared_perms,
-   GSList *ignore_children, Error **errp)
+   uint64_t cumulative_shared_perms, Error **errp)
 {
 g_autoptr(GSList) list = bdrv_topological_dfs(NULL, NULL, bs);
 return bdrv_check_perm_common(list, q, true, cumulative_perms,
-  cumulative_shared_perms, ignore_children,
-  NULL, errp);
+  cumulative_shared_perms, NULL, errp);
 }
 
 static int bdrv_list_refresh_perms(GSList *list, BlockReopenQueue *q,
Transaction *tran, Error **errp)
 {
-return bdrv_check_perm_common(list, q, false, 0, 0, NULL, tran, errp);
+return bdrv_check_perm_common(list, q, false, 0, 0, tran, errp);
 }
 
 /*
@@ -2576,7 +2565,6 @@ char *bdrv_perm_names(uint64_t perm)
 static int bdrv_check_update_perm(BlockDriverState *bs, BlockReopenQueue *q,
   uint64_t new_used_perm,
   uint64_t new_shared_perm,
-  GSList *ignore_children,
 

[PATCH v4 27/36] block: make bdrv_refresh_limits() to be a transaction action

2021-04-28 Thread Vladimir Sementsov-Ogievskiy
To be used in further commit.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
---
 include/block/block.h |  3 ++-
 block.c   |  9 -
 block/io.c| 31 +--
 3 files changed, 35 insertions(+), 8 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index 85481a05c6..ad38259c91 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -9,6 +9,7 @@
 #include "block/dirty-bitmap.h"
 #include "block/blockjob.h"
 #include "qemu/hbitmap.h"
+#include "qemu/transactions.h"
 
 /*
  * generated_co_wrapper
@@ -421,7 +422,7 @@ int64_t bdrv_get_allocated_file_size(BlockDriverState *bs);
 BlockMeasureInfo *bdrv_measure(BlockDriver *drv, QemuOpts *opts,
BlockDriverState *in_bs, Error **errp);
 void bdrv_get_geometry(BlockDriverState *bs, uint64_t *nb_sectors_ptr);
-void bdrv_refresh_limits(BlockDriverState *bs, Error **errp);
+void bdrv_refresh_limits(BlockDriverState *bs, Transaction *tran, Error 
**errp);
 int bdrv_commit(BlockDriverState *bs);
 int bdrv_make_empty(BdrvChild *c, Error **errp);
 int bdrv_change_backing_file(BlockDriverState *bs, const char *backing_file,
diff --git a/block.c b/block.c
index 1dc14908ac..9922943793 100644
--- a/block.c
+++ b/block.c
@@ -49,7 +49,6 @@
 #include "qemu/timer.h"
 #include "qemu/cutils.h"
 #include "qemu/id.h"
-#include "qemu/transactions.h"
 #include "block/coroutines.h"
 
 #ifdef CONFIG_BSD
@@ -1577,7 +1576,7 @@ static int bdrv_open_driver(BlockDriverState *bs, 
BlockDriver *drv,
 return ret;
 }
 
-bdrv_refresh_limits(bs, _err);
+bdrv_refresh_limits(bs, NULL, _err);
 if (local_err) {
 error_propagate(errp, local_err);
 return -EINVAL;
@@ -3363,7 +3362,7 @@ int bdrv_set_backing_hd(BlockDriverState *bs, 
BlockDriverState *backing_hd,
 }
 
 out:
-bdrv_refresh_limits(bs, NULL);
+bdrv_refresh_limits(bs, NULL, NULL);
 
 return ret;
 }
@@ -4847,7 +4846,7 @@ static void bdrv_reopen_commit(BDRVReopenState 
*reopen_state)
 bdrv_set_backing_hd(bs, reopen_state->new_backing_bs, _abort);
 }
 
-bdrv_refresh_limits(bs, NULL);
+bdrv_refresh_limits(bs, NULL, NULL);
 }
 
 /*
@@ -5244,7 +5243,7 @@ int bdrv_append(BlockDriverState *bs_new, 
BlockDriverState *bs_top,
 out:
 tran_finalize(tran, ret);
 
-bdrv_refresh_limits(bs_top, NULL);
+bdrv_refresh_limits(bs_top, NULL, NULL);
 
 return ret;
 }
diff --git a/block/io.c b/block/io.c
index ca2dca3007..35b6c56efc 100644
--- a/block/io.c
+++ b/block/io.c
@@ -133,13 +133,40 @@ static void bdrv_merge_limits(BlockLimits *dst, const 
BlockLimits *src)
 dst->max_iov = MIN_NON_ZERO(dst->max_iov, src->max_iov);
 }
 
-void bdrv_refresh_limits(BlockDriverState *bs, Error **errp)
+typedef struct BdrvRefreshLimitsState {
+BlockDriverState *bs;
+BlockLimits old_bl;
+} BdrvRefreshLimitsState;
+
+static void bdrv_refresh_limits_abort(void *opaque)
+{
+BdrvRefreshLimitsState *s = opaque;
+
+s->bs->bl = s->old_bl;
+}
+
+static TransactionActionDrv bdrv_refresh_limits_drv = {
+.abort = bdrv_refresh_limits_abort,
+.clean = g_free,
+};
+
+/* @tran is allowed to be NULL, in this case no rollback is possible. */
+void bdrv_refresh_limits(BlockDriverState *bs, Transaction *tran, Error **errp)
 {
 ERRP_GUARD();
 BlockDriver *drv = bs->drv;
 BdrvChild *c;
 bool have_limits;
 
+if (tran) {
+BdrvRefreshLimitsState *s = g_new(BdrvRefreshLimitsState, 1);
+*s = (BdrvRefreshLimitsState) {
+.bs = bs,
+.old_bl = bs->bl,
+};
+tran_add(tran, _refresh_limits_drv, s);
+}
+
 memset(>bl, 0, sizeof(bs->bl));
 
 if (!drv) {
@@ -156,7 +183,7 @@ void bdrv_refresh_limits(BlockDriverState *bs, Error **errp)
 QLIST_FOREACH(c, >children, next) {
 if (c->role & (BDRV_CHILD_DATA | BDRV_CHILD_FILTERED | BDRV_CHILD_COW))
 {
-bdrv_refresh_limits(c->bs, errp);
+bdrv_refresh_limits(c->bs, tran, errp);
 if (*errp) {
 return;
 }
-- 
2.29.2




[PATCH v4 11/36] block: rewrite bdrv_child_try_set_perm() using bdrv_refresh_perms()

2021-04-28 Thread Vladimir Sementsov-Ogievskiy
We are going to drop recursive bdrv_child_* functions, so stop use them
in bdrv_child_try_set_perm() as a first step.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
---
 block.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/block.c b/block.c
index 0ee0c2f29a..4511766825 100644
--- a/block.c
+++ b/block.c
@@ -2454,11 +2454,16 @@ int bdrv_child_try_set_perm(BdrvChild *c, uint64_t 
perm, uint64_t shared,
 Error **errp)
 {
 Error *local_err = NULL;
+Transaction *tran = tran_new();
 int ret;
 
-ret = bdrv_child_check_perm(c, NULL, perm, shared, NULL, _err);
+bdrv_child_set_perm_safe(c, perm, shared, tran);
+
+ret = bdrv_refresh_perms(c->bs, _err);
+
+tran_finalize(tran, ret);
+
 if (ret < 0) {
-bdrv_child_abort_perm_update(c);
 if ((perm & ~c->perm) || (c->shared_perm & ~shared)) {
 /* tighten permissions */
 error_propagate(errp, local_err);
@@ -2472,12 +2477,9 @@ int bdrv_child_try_set_perm(BdrvChild *c, uint64_t perm, 
uint64_t shared,
 error_free(local_err);
 ret = 0;
 }
-return ret;
 }
 
-bdrv_child_set_perm(c);
-
-return 0;
+return ret;
 }
 
 int bdrv_child_refresh_perms(BlockDriverState *bs, BdrvChild *c, Error **errp)
-- 
2.29.2




[PATCH v4 09/36] block: bdrv_refresh_perms: check for parents permissions conflict

2021-04-28 Thread Vladimir Sementsov-Ogievskiy
Add additional check that node parents do not interfere with each
other. This should not hurt existing callers and allows in further
patch use bdrv_refresh_perms() to update a subtree of changed
BdrvChild (check that change is correct).

New check will substitute bdrv_check_update_perm() in following
permissions refactoring, so keep error messages the same to avoid
unit test result changes.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Alberto Garcia 
Reviewed-by: Kevin Wolf 
---
 block.c | 63 -
 1 file changed, 54 insertions(+), 9 deletions(-)

diff --git a/block.c b/block.c
index 34c728d7e4..fd621f0403 100644
--- a/block.c
+++ b/block.c
@@ -2026,6 +2026,57 @@ bool bdrv_is_writable(BlockDriverState *bs)
 return bdrv_is_writable_after_reopen(bs, NULL);
 }
 
+static char *bdrv_child_user_desc(BdrvChild *c)
+{
+if (c->klass->get_parent_desc) {
+return c->klass->get_parent_desc(c);
+}
+
+return g_strdup("another user");
+}
+
+static bool bdrv_a_allow_b(BdrvChild *a, BdrvChild *b, Error **errp)
+{
+g_autofree char *user = NULL;
+g_autofree char *perm_names = NULL;
+
+if ((b->perm & a->shared_perm) == b->perm) {
+return true;
+}
+
+perm_names = bdrv_perm_names(b->perm & ~a->shared_perm);
+user = bdrv_child_user_desc(a);
+error_setg(errp, "Conflicts with use by %s as '%s', which does not "
+   "allow '%s' on %s",
+   user, a->name, perm_names, bdrv_get_node_name(b->bs));
+
+return false;
+}
+
+static bool bdrv_parent_perms_conflict(BlockDriverState *bs, Error **errp)
+{
+BdrvChild *a, *b;
+
+/*
+ * During the loop we'll look at each pair twice. That's correct because
+ * bdrv_a_allow_b() is asymmetric and we should check each pair in both
+ * directions.
+ */
+QLIST_FOREACH(a, >parents, next_parent) {
+QLIST_FOREACH(b, >parents, next_parent) {
+if (a == b) {
+continue;
+}
+
+if (!bdrv_a_allow_b(a, b, errp)) {
+return true;
+}
+}
+}
+
+return false;
+}
+
 static void bdrv_child_perm(BlockDriverState *bs, BlockDriverState *child_bs,
 BdrvChild *c, BdrvChildRole role,
 BlockReopenQueue *reopen_queue,
@@ -2203,15 +2254,6 @@ void bdrv_get_cumulative_perm(BlockDriverState *bs, 
uint64_t *perm,
 *shared_perm = cumulative_shared_perms;
 }
 
-static char *bdrv_child_user_desc(BdrvChild *c)
-{
-if (c->klass->get_parent_desc) {
-return c->klass->get_parent_desc(c);
-}
-
-return g_strdup("another user");
-}
-
 char *bdrv_perm_names(uint64_t perm)
 {
 struct perm_name {
@@ -2355,6 +2397,9 @@ static int bdrv_refresh_perms(BlockDriverState *bs, Error 
**errp)
 int ret;
 uint64_t perm, shared_perm;
 
+if (bdrv_parent_perms_conflict(bs, errp)) {
+return -EPERM;
+}
 bdrv_get_cumulative_perm(bs, , _perm);
 ret = bdrv_check_perm(bs, NULL, perm, shared_perm, NULL, errp);
 if (ret < 0) {
-- 
2.29.2




[PATCH v4 16/36] block: add bdrv_replace_child_safe() transaction action

2021-04-28 Thread Vladimir Sementsov-Ogievskiy
To be used in the following commit.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
---
 block.c | 54 ++
 1 file changed, 54 insertions(+)

diff --git a/block.c b/block.c
index 98b7bc179c..a5305662dc 100644
--- a/block.c
+++ b/block.c
@@ -83,6 +83,9 @@ static BlockDriverState *bdrv_open_inherit(const char 
*filename,
BdrvChildRole child_role,
Error **errp);
 
+static void bdrv_replace_child_noperm(BdrvChild *child,
+  BlockDriverState *new_bs);
+
 static int bdrv_reopen_prepare(BDRVReopenState *reopen_state, BlockReopenQueue
*queue, Error **errp);
 static void bdrv_reopen_commit(BDRVReopenState *reopen_state);
@@ -2237,6 +2240,57 @@ static int bdrv_drv_set_perm(BlockDriverState *bs, 
uint64_t perm,
 return 0;
 }
 
+typedef struct BdrvReplaceChildState {
+BdrvChild *child;
+BlockDriverState *old_bs;
+} BdrvReplaceChildState;
+
+static void bdrv_replace_child_commit(void *opaque)
+{
+BdrvReplaceChildState *s = opaque;
+
+bdrv_unref(s->old_bs);
+}
+
+static void bdrv_replace_child_abort(void *opaque)
+{
+BdrvReplaceChildState *s = opaque;
+BlockDriverState *new_bs = s->child->bs;
+
+/* old_bs reference is transparently moved from @s to @s->child */
+bdrv_replace_child_noperm(s->child, s->old_bs);
+bdrv_unref(new_bs);
+}
+
+static TransactionActionDrv bdrv_replace_child_drv = {
+.commit = bdrv_replace_child_commit,
+.abort = bdrv_replace_child_abort,
+.clean = g_free,
+};
+
+/*
+ * bdrv_replace_child_safe
+ *
+ * Note: real unref of old_bs is done only on commit.
+ */
+__attribute__((unused))
+static void bdrv_replace_child_safe(BdrvChild *child, BlockDriverState *new_bs,
+Transaction *tran)
+{
+BdrvReplaceChildState *s = g_new(BdrvReplaceChildState, 1);
+*s = (BdrvReplaceChildState) {
+.child = child,
+.old_bs = child->bs,
+};
+tran_add(tran, _replace_child_drv, s);
+
+if (new_bs) {
+bdrv_ref(new_bs);
+}
+bdrv_replace_child_noperm(child, new_bs);
+/* old_bs reference is transparently moved from @child to @s */
+}
+
 /*
  * Check whether permissions on this node can be changed in a way that
  * @cumulative_perms and @cumulative_shared_perms are the new cumulative
-- 
2.29.2




[PATCH v4 15/36] block: add bdrv_list_* permission update functions

2021-04-28 Thread Vladimir Sementsov-Ogievskiy
Add new interface, allowing use of existing node list. It will be used
to fix bdrv_replace_node() in the further commit.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
---
 block.c | 106 +---
 1 file changed, 71 insertions(+), 35 deletions(-)

diff --git a/block.c b/block.c
index ab7a4d13d8..98b7bc179c 100644
--- a/block.c
+++ b/block.c
@@ -2249,7 +2249,8 @@ static int bdrv_drv_set_perm(BlockDriverState *bs, 
uint64_t perm,
 static int bdrv_node_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
 uint64_t cumulative_perms,
 uint64_t cumulative_shared_perms,
-GSList *ignore_children, Error **errp)
+GSList *ignore_children,
+Transaction *tran, Error **errp)
 {
 BlockDriver *drv = bs->drv;
 BdrvChild *c;
@@ -2297,7 +2298,7 @@ static int bdrv_node_check_perm(BlockDriverState *bs, 
BlockReopenQueue *q,
 return 0;
 }
 
-ret = bdrv_drv_set_perm(bs, cumulative_perms, cumulative_shared_perms, 
NULL,
+ret = bdrv_drv_set_perm(bs, cumulative_perms, cumulative_shared_perms, 
tran,
 errp);
 if (ret < 0) {
 return ret;
@@ -2316,36 +2317,53 @@ static int bdrv_node_check_perm(BlockDriverState *bs, 
BlockReopenQueue *q,
 bdrv_child_perm(bs, c->bs, c, c->role, q,
 cumulative_perms, cumulative_shared_perms,
 _perm, _shared);
-bdrv_child_set_perm_safe(c, cur_perm, cur_shared, NULL);
+bdrv_child_set_perm_safe(c, cur_perm, cur_shared, tran);
 }
 
 return 0;
 }
 
-static int bdrv_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
-   uint64_t cumulative_perms,
-   uint64_t cumulative_shared_perms,
-   GSList *ignore_children, Error **errp)
+/*
+ * If use_cumulative_perms is true, use cumulative_perms and
+ * cumulative_shared_perms for first element of the list. Otherwise just 
refresh
+ * all permissions.
+ */
+static int bdrv_check_perm_common(GSList *list, BlockReopenQueue *q,
+  bool use_cumulative_perms,
+  uint64_t cumulative_perms,
+  uint64_t cumulative_shared_perms,
+  GSList *ignore_children,
+  Transaction *tran, Error **errp)
 {
 int ret;
-BlockDriverState *root = bs;
-g_autoptr(GSList) list = bdrv_topological_dfs(NULL, NULL, root);
+BlockDriverState *bs;
 
-for ( ; list; list = list->next) {
+if (use_cumulative_perms) {
 bs = list->data;
 
-if (bs != root) {
-if (bdrv_parent_perms_conflict(bs, ignore_children, errp)) {
-return -EINVAL;
-}
+ret = bdrv_node_check_perm(bs, q, cumulative_perms,
+   cumulative_shared_perms,
+   ignore_children, tran, errp);
+if (ret < 0) {
+return ret;
+}
 
-bdrv_get_cumulative_perm(bs, _perms,
- _shared_perms);
+list = list->next;
+}
+
+for ( ; list; list = list->next) {
+bs = list->data;
+
+if (bdrv_parent_perms_conflict(bs, ignore_children, errp)) {
+return -EINVAL;
 }
 
+bdrv_get_cumulative_perm(bs, _perms,
+ _shared_perms);
+
 ret = bdrv_node_check_perm(bs, q, cumulative_perms,
cumulative_shared_perms,
-   ignore_children, errp);
+   ignore_children, tran, errp);
 if (ret < 0) {
 return ret;
 }
@@ -2354,6 +2372,23 @@ static int bdrv_check_perm(BlockDriverState *bs, 
BlockReopenQueue *q,
 return 0;
 }
 
+static int bdrv_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
+   uint64_t cumulative_perms,
+   uint64_t cumulative_shared_perms,
+   GSList *ignore_children, Error **errp)
+{
+g_autoptr(GSList) list = bdrv_topological_dfs(NULL, NULL, bs);
+return bdrv_check_perm_common(list, q, true, cumulative_perms,
+  cumulative_shared_perms, ignore_children,
+  NULL, errp);
+}
+
+static int bdrv_list_refresh_perms(GSList *list, BlockReopenQueue *q,
+   Transaction *tran, Error **errp)
+{
+return bdrv_check_perm_common(list, q, false, 0, 0, NULL, tran, errp);
+}
+
 /*
  * Notifies drivers that after a previous bdrv_check_perm() call, the
  * permission update is not performed and any preparations made for it (e.g.
@@ -2375,15 +2410,19 @@ static void 

[PATCH v4 05/36] block: BdrvChildClass: add .get_parent_aio_context handler

2021-04-28 Thread Vladimir Sementsov-Ogievskiy
Add new handler to get aio context and implement it in all child
classes. Add corresponding public interface to be used soon.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Alberto Garcia 
Reviewed-by: Kevin Wolf 
---
 include/block/block.h |  2 ++
 include/block/block_int.h |  2 ++
 block.c   | 13 +
 block/block-backend.c |  9 +
 blockjob.c|  8 
 5 files changed, 34 insertions(+)

diff --git a/include/block/block.h b/include/block/block.h
index b3f6e509d4..54279baa95 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -702,6 +702,8 @@ bool bdrv_child_can_set_aio_context(BdrvChild *c, 
AioContext *ctx,
 GSList **ignore, Error **errp);
 bool bdrv_can_set_aio_context(BlockDriverState *bs, AioContext *ctx,
   GSList **ignore, Error **errp);
+AioContext *bdrv_child_get_parent_aio_context(BdrvChild *c);
+
 int bdrv_probe_blocksizes(BlockDriverState *bs, BlockSizes *bsz);
 int bdrv_probe_geometry(BlockDriverState *bs, HDGeometry *geo);
 
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 88e4111939..737ec632c4 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -789,6 +789,8 @@ struct BdrvChildClass {
 bool (*can_set_aio_ctx)(BdrvChild *child, AioContext *ctx,
 GSList **ignore, Error **errp);
 void (*set_aio_ctx)(BdrvChild *child, AioContext *ctx, GSList **ignore);
+
+AioContext *(*get_parent_aio_context)(BdrvChild *child);
 };
 
 extern const BdrvChildClass child_of_bds;
diff --git a/block.c b/block.c
index 1e7e8907e4..2833912436 100644
--- a/block.c
+++ b/block.c
@@ -1394,6 +1394,13 @@ static int bdrv_child_cb_update_filename(BdrvChild *c, 
BlockDriverState *base,
 return 0;
 }
 
+static AioContext *bdrv_child_cb_get_parent_aio_context(BdrvChild *c)
+{
+BlockDriverState *bs = c->opaque;
+
+return bdrv_get_aio_context(bs);
+}
+
 const BdrvChildClass child_of_bds = {
 .parent_is_bds   = true,
 .get_parent_desc = bdrv_child_get_parent_desc,
@@ -1407,8 +1414,14 @@ const BdrvChildClass child_of_bds = {
 .can_set_aio_ctx = bdrv_child_cb_can_set_aio_ctx,
 .set_aio_ctx = bdrv_child_cb_set_aio_ctx,
 .update_filename = bdrv_child_cb_update_filename,
+.get_parent_aio_context = bdrv_child_cb_get_parent_aio_context,
 };
 
+AioContext *bdrv_child_get_parent_aio_context(BdrvChild *c)
+{
+return c->klass->get_parent_aio_context(c);
+}
+
 static int bdrv_open_flags(BlockDriverState *bs, int flags)
 {
 int open_flags = flags;
diff --git a/block/block-backend.c b/block/block-backend.c
index 413af51f3b..3f656ef361 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -298,6 +298,13 @@ static void blk_root_detach(BdrvChild *child)
 }
 }
 
+static AioContext *blk_root_get_parent_aio_context(BdrvChild *c)
+{
+BlockBackend *blk = c->opaque;
+
+return blk_get_aio_context(blk);
+}
+
 static const BdrvChildClass child_root = {
 .inherit_options= blk_root_inherit_options,
 
@@ -318,6 +325,8 @@ static const BdrvChildClass child_root = {
 
 .can_set_aio_ctx= blk_root_can_set_aio_ctx,
 .set_aio_ctx= blk_root_set_aio_ctx,
+
+.get_parent_aio_context = blk_root_get_parent_aio_context,
 };
 
 /*
diff --git a/blockjob.c b/blockjob.c
index 207e8c7fd9..160bf38b19 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -163,6 +163,13 @@ static void child_job_set_aio_ctx(BdrvChild *c, AioContext 
*ctx,
 job->job.aio_context = ctx;
 }
 
+static AioContext *child_job_get_parent_aio_context(BdrvChild *c)
+{
+BlockJob *job = c->opaque;
+
+return job->job.aio_context;
+}
+
 static const BdrvChildClass child_job = {
 .get_parent_desc= child_job_get_parent_desc,
 .drained_begin  = child_job_drained_begin,
@@ -171,6 +178,7 @@ static const BdrvChildClass child_job = {
 .can_set_aio_ctx= child_job_can_set_aio_ctx,
 .set_aio_ctx= child_job_set_aio_ctx,
 .stay_at_node   = true,
+.get_parent_aio_context = child_job_get_parent_aio_context,
 };
 
 void block_job_remove_all_bdrv(BlockJob *job)
-- 
2.29.2




[PATCH v4 14/36] block: add bdrv_drv_set_perm transaction action

2021-04-28 Thread Vladimir Sementsov-Ogievskiy
Refactor calling driver callbacks to a separate transaction action to
be used later.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
---
 block.c | 70 -
 1 file changed, 54 insertions(+), 16 deletions(-)

diff --git a/block.c b/block.c
index cbcc4c15a0..ab7a4d13d8 100644
--- a/block.c
+++ b/block.c
@@ -2189,6 +2189,54 @@ static void bdrv_child_set_perm_safe(BdrvChild *c, 
uint64_t perm,
 }
 }
 
+static void bdrv_drv_set_perm_commit(void *opaque)
+{
+BlockDriverState *bs = opaque;
+uint64_t cumulative_perms, cumulative_shared_perms;
+
+if (bs->drv->bdrv_set_perm) {
+bdrv_get_cumulative_perm(bs, _perms,
+ _shared_perms);
+bs->drv->bdrv_set_perm(bs, cumulative_perms, cumulative_shared_perms);
+}
+}
+
+static void bdrv_drv_set_perm_abort(void *opaque)
+{
+BlockDriverState *bs = opaque;
+
+if (bs->drv->bdrv_abort_perm_update) {
+bs->drv->bdrv_abort_perm_update(bs);
+}
+}
+
+TransactionActionDrv bdrv_drv_set_perm_drv = {
+.abort = bdrv_drv_set_perm_abort,
+.commit = bdrv_drv_set_perm_commit,
+};
+
+static int bdrv_drv_set_perm(BlockDriverState *bs, uint64_t perm,
+ uint64_t shared_perm, Transaction *tran,
+ Error **errp)
+{
+if (!bs->drv) {
+return 0;
+}
+
+if (bs->drv->bdrv_check_perm) {
+int ret = bs->drv->bdrv_check_perm(bs, perm, shared_perm, errp);
+if (ret < 0) {
+return ret;
+}
+}
+
+if (tran) {
+tran_add(tran, _drv_set_perm_drv, bs);
+}
+
+return 0;
+}
+
 /*
  * Check whether permissions on this node can be changed in a way that
  * @cumulative_perms and @cumulative_shared_perms are the new cumulative
@@ -2249,12 +2297,10 @@ static int bdrv_node_check_perm(BlockDriverState *bs, 
BlockReopenQueue *q,
 return 0;
 }
 
-if (drv->bdrv_check_perm) {
-ret = drv->bdrv_check_perm(bs, cumulative_perms,
-   cumulative_shared_perms, errp);
-if (ret < 0) {
-return ret;
-}
+ret = bdrv_drv_set_perm(bs, cumulative_perms, cumulative_shared_perms, 
NULL,
+errp);
+if (ret < 0) {
+return ret;
 }
 
 /* Drivers that never have children can omit .bdrv_child_perm() */
@@ -2322,9 +2368,7 @@ static void bdrv_node_abort_perm_update(BlockDriverState 
*bs)
 return;
 }
 
-if (drv->bdrv_abort_perm_update) {
-drv->bdrv_abort_perm_update(bs);
-}
+bdrv_drv_set_perm_abort(bs);
 
 QLIST_FOREACH(c, >children, next) {
 bdrv_child_set_perm_abort(c);
@@ -2342,7 +2386,6 @@ static void bdrv_abort_perm_update(BlockDriverState *bs)
 
 static void bdrv_node_set_perm(BlockDriverState *bs)
 {
-uint64_t cumulative_perms, cumulative_shared_perms;
 BlockDriver *drv = bs->drv;
 BdrvChild *c;
 
@@ -2350,12 +2393,7 @@ static void bdrv_node_set_perm(BlockDriverState *bs)
 return;
 }
 
-bdrv_get_cumulative_perm(bs, _perms, _shared_perms);
-
-/* Update this node */
-if (drv->bdrv_set_perm) {
-drv->bdrv_set_perm(bs, cumulative_perms, cumulative_shared_perms);
-}
+bdrv_drv_set_perm_commit(bs);
 
 /* Drivers that never have children can omit .bdrv_child_perm() */
 if (!drv->bdrv_child_perm) {
-- 
2.29.2




[PATCH v4 23/36] block: introduce bdrv_drop_filter()

2021-04-28 Thread Vladimir Sementsov-Ogievskiy
Using bdrv_replace_node() for removing filter is not good enough: it
keeps child reference of the filter, which may conflict with original
top node during permission update.

Instead let's create new interface, which will do all graph
modifications first and then update permissions.

Let's modify bdrv_replace_node_common(), allowing it additionally drop
backing chain child link pointing to new node. This is quite
appropriate for bdrv_drop_intermediate() and makes possible to add
new bdrv_drop_filter() as a simple wrapper.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
---
 include/block/block.h |  1 +
 block.c   | 43 +++
 2 files changed, 40 insertions(+), 4 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index 16e496a5c4..85481a05c6 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -362,6 +362,7 @@ int bdrv_replace_node(BlockDriverState *from, 
BlockDriverState *to,
   Error **errp);
 BlockDriverState *bdrv_insert_node(BlockDriverState *bs, QDict *node_options,
int flags, Error **errp);
+int bdrv_drop_filter(BlockDriverState *bs, Error **errp);
 
 int bdrv_parse_aio(const char *mode, int *flags);
 int bdrv_parse_cache_mode(const char *mode, int *flags, bool *writethrough);
diff --git a/block.c b/block.c
index 2ea9cc110d..68dfd822dd 100644
--- a/block.c
+++ b/block.c
@@ -5041,7 +5041,6 @@ static TransactionActionDrv 
bdrv_remove_filter_or_cow_child_drv = {
  * format nodes (always .backing) and filter child for filters (may be .file or
  * .backing)
  */
-__attribute__((unused))
 static void bdrv_remove_filter_or_cow_child(BlockDriverState *bs,
 Transaction *tran)
 {
@@ -5105,16 +5104,32 @@ static int bdrv_replace_node_noperm(BlockDriverState 
*from,
  *
  * With auto_skip=false the error is returned if from has a parent which should
  * not be updated.
+ *
+ * With @detach_subchain=true @to must be in a backing chain of @from. In this
+ * case backing link of the cow-parent of @to is removed.
  */
 static int bdrv_replace_node_common(BlockDriverState *from,
 BlockDriverState *to,
-bool auto_skip, Error **errp)
+bool auto_skip, bool detach_subchain,
+Error **errp)
 {
 Transaction *tran = tran_new();
 g_autoptr(GHashTable) found = NULL;
 g_autoptr(GSList) refresh_list = NULL;
+BlockDriverState *to_cow_parent;
 int ret;
 
+if (detach_subchain) {
+assert(bdrv_chain_contains(from, to));
+assert(from != to);
+for (to_cow_parent = from;
+ bdrv_filter_or_cow_bs(to_cow_parent) != to;
+ to_cow_parent = bdrv_filter_or_cow_bs(to_cow_parent))
+{
+;
+}
+}
+
 /* Make sure that @from doesn't go away until we have successfully attached
  * all of its parents to @to. */
 bdrv_ref(from);
@@ -5134,6 +5149,10 @@ static int bdrv_replace_node_common(BlockDriverState 
*from,
 goto out;
 }
 
+if (detach_subchain) {
+bdrv_remove_filter_or_cow_child(to_cow_parent, tran);
+}
+
 found = g_hash_table_new(NULL, NULL);
 
 refresh_list = bdrv_topological_dfs(refresh_list, found, to);
@@ -5158,7 +5177,13 @@ out:
 int bdrv_replace_node(BlockDriverState *from, BlockDriverState *to,
   Error **errp)
 {
-return bdrv_replace_node_common(from, to, true, errp);
+return bdrv_replace_node_common(from, to, true, false, errp);
+}
+
+int bdrv_drop_filter(BlockDriverState *bs, Error **errp)
+{
+return bdrv_replace_node_common(bs, bdrv_filter_or_cow_bs(bs), true, true,
+errp);
 }
 
 /*
@@ -5493,7 +5518,17 @@ int bdrv_drop_intermediate(BlockDriverState *top, 
BlockDriverState *base,
 updated_children = g_slist_prepend(updated_children, c);
 }
 
-bdrv_replace_node_common(top, base, false, _err);
+/*
+ * It seems correct to pass detach_subchain=true here, but it triggers
+ * one more yet not fixed bug, when due to nested aio_poll loop we switch 
to
+ * another drained section, which modify the graph (for example, removing
+ * the child, which we keep in updated_children list). So, it's a TODO.
+ *
+ * Note, bug triggered if pass detach_subchain=true here and run
+ * test-bdrv-drain. test_drop_intermediate_poll() test-case will crash.
+ * That's a FIXME.
+ */
+bdrv_replace_node_common(top, base, false, false, _err);
 if (local_err) {
 error_report_err(local_err);
 goto exit;
-- 
2.29.2




[PATCH v4 31/36] block: drop unused permission update functions

2021-04-28 Thread Vladimir Sementsov-Ogievskiy
Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
---
 block.c | 103 
 1 file changed, 103 deletions(-)

diff --git a/block.c b/block.c
index ffaa367a1b..bf7c187435 100644
--- a/block.c
+++ b/block.c
@@ -1987,11 +1987,6 @@ static int bdrv_fill_options(QDict **options, const char 
*filename,
 return 0;
 }
 
-static int bdrv_check_update_perm(BlockDriverState *bs, BlockReopenQueue *q,
-  uint64_t new_used_perm,
-  uint64_t new_shared_perm,
-  Error **errp);
-
 typedef struct BlockReopenQueueEntry {
  bool prepared;
  bool perms_checked;
@@ -2426,56 +2421,12 @@ static int bdrv_check_perm_common(GSList *list, 
BlockReopenQueue *q,
 return 0;
 }
 
-static int bdrv_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
-   uint64_t cumulative_perms,
-   uint64_t cumulative_shared_perms, Error **errp)
-{
-g_autoptr(GSList) list = bdrv_topological_dfs(NULL, NULL, bs);
-return bdrv_check_perm_common(list, q, true, cumulative_perms,
-  cumulative_shared_perms, NULL, errp);
-}
-
 static int bdrv_list_refresh_perms(GSList *list, BlockReopenQueue *q,
Transaction *tran, Error **errp)
 {
 return bdrv_check_perm_common(list, q, false, 0, 0, tran, errp);
 }
 
-/*
- * Notifies drivers that after a previous bdrv_check_perm() call, the
- * permission update is not performed and any preparations made for it (e.g.
- * taken file locks) need to be undone.
- */
-static void bdrv_node_abort_perm_update(BlockDriverState *bs)
-{
-BlockDriver *drv = bs->drv;
-BdrvChild *c;
-
-if (!drv) {
-return;
-}
-
-bdrv_drv_set_perm_abort(bs);
-
-QLIST_FOREACH(c, >children, next) {
-bdrv_child_set_perm_abort(c);
-}
-}
-
-static void bdrv_list_abort_perm_update(GSList *list)
-{
-for ( ; list; list = list->next) {
-bdrv_node_abort_perm_update((BlockDriverState *)list->data);
-}
-}
-
-__attribute__((unused))
-static void bdrv_abort_perm_update(BlockDriverState *bs)
-{
-g_autoptr(GSList) list = bdrv_topological_dfs(NULL, NULL, bs);
-return bdrv_list_abort_perm_update(list);
-}
-
 static void bdrv_node_set_perm(BlockDriverState *bs)
 {
 BlockDriver *drv = bs->drv;
@@ -2557,60 +2508,6 @@ char *bdrv_perm_names(uint64_t perm)
 return g_string_free(result, FALSE);
 }
 
-/*
- * Checks whether a new reference to @bs can be added if the new user requires
- * @new_used_perm/@new_shared_perm as its permissions. If @ignore_children is
- * set, the BdrvChild objects in this list are ignored in the calculations;
- * this allows checking permission updates for an existing reference.
- *
- * Needs to be followed by a call to either bdrv_set_perm() or
- * bdrv_abort_perm_update(). */
-__attribute__((unused))
-static int bdrv_check_update_perm(BlockDriverState *bs, BlockReopenQueue *q,
-  uint64_t new_used_perm,
-  uint64_t new_shared_perm,
-  Error **errp)
-{
-BdrvChild *c;
-uint64_t cumulative_perms = new_used_perm;
-uint64_t cumulative_shared_perms = new_shared_perm;
-
-
-/* There is no reason why anyone couldn't tolerate write_unchanged */
-assert(new_shared_perm & BLK_PERM_WRITE_UNCHANGED);
-
-QLIST_FOREACH(c, >parents, next_parent) {
-if ((new_used_perm & c->shared_perm) != new_used_perm) {
-char *user = bdrv_child_user_desc(c);
-char *perm_names = bdrv_perm_names(new_used_perm & 
~c->shared_perm);
-
-error_setg(errp, "Conflicts with use by %s as '%s', which does not 
"
- "allow '%s' on %s",
-   user, c->name, perm_names, bdrv_get_node_name(c->bs));
-g_free(user);
-g_free(perm_names);
-return -EPERM;
-}
-
-if ((c->perm & new_shared_perm) != c->perm) {
-char *user = bdrv_child_user_desc(c);
-char *perm_names = bdrv_perm_names(c->perm & ~new_shared_perm);
-
-error_setg(errp, "Conflicts with use by %s as '%s', which uses "
- "'%s' on %s",
-   user, c->name, perm_names, bdrv_get_node_name(c->bs));
-g_free(user);
-g_free(perm_names);
-return -EPERM;
-}
-
-cumulative_perms |= c->perm;
-cumulative_shared_perms &= c->shared_perm;
-}
-
-return bdrv_check_perm(bs, q, cumulative_perms, cumulative_shared_perms,
-   errp);
-}
 
 static int bdrv_refresh_perms(BlockDriverState *bs, Error **errp)
 {
-- 
2.29.2




[PATCH v4 20/36] block: split out bdrv_replace_node_noperm()

2021-04-28 Thread Vladimir Sementsov-Ogievskiy
Split part of bdrv_replace_node_common() to be used separately.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
---
 block.c | 50 +++---
 1 file changed, 31 insertions(+), 19 deletions(-)

diff --git a/block.c b/block.c
index c74e6e7cc1..9283c8d97b 100644
--- a/block.c
+++ b/block.c
@@ -4991,6 +4991,34 @@ static bool should_update_child(BdrvChild *c, 
BlockDriverState *to)
 return ret;
 }
 
+static int bdrv_replace_node_noperm(BlockDriverState *from,
+BlockDriverState *to,
+bool auto_skip, Transaction *tran,
+Error **errp)
+{
+BdrvChild *c, *next;
+
+QLIST_FOREACH_SAFE(c, >parents, next_parent, next) {
+assert(c->bs == from);
+if (!should_update_child(c, to)) {
+if (auto_skip) {
+continue;
+}
+error_setg(errp, "Should not change '%s' link to '%s'",
+   c->name, from->node_name);
+return -EINVAL;
+}
+if (c->frozen) {
+error_setg(errp, "Cannot change '%s' link to '%s'",
+   c->name, from->node_name);
+return -EPERM;
+}
+bdrv_replace_child_safe(c, to, tran);
+}
+
+return 0;
+}
+
 /*
  * With auto_skip=true bdrv_replace_node_common skips updating from parents
  * if it creates a parent-child relation loop or if parent is block-job.
@@ -5002,7 +5030,6 @@ static int bdrv_replace_node_common(BlockDriverState 
*from,
 BlockDriverState *to,
 bool auto_skip, Error **errp)
 {
-BdrvChild *c, *next;
 Transaction *tran = tran_new();
 g_autoptr(GHashTable) found = NULL;
 g_autoptr(GSList) refresh_list = NULL;
@@ -5022,24 +5049,9 @@ static int bdrv_replace_node_common(BlockDriverState 
*from,
  * permissions based on new graph. If we fail, we'll roll-back the
  * replacement.
  */
-QLIST_FOREACH_SAFE(c, >parents, next_parent, next) {
-assert(c->bs == from);
-if (!should_update_child(c, to)) {
-if (auto_skip) {
-continue;
-}
-ret = -EINVAL;
-error_setg(errp, "Should not change '%s' link to '%s'",
-   c->name, from->node_name);
-goto out;
-}
-if (c->frozen) {
-ret = -EPERM;
-error_setg(errp, "Cannot change '%s' link to '%s'",
-   c->name, from->node_name);
-goto out;
-}
-bdrv_replace_child_safe(c, to, tran);
+ret = bdrv_replace_node_noperm(from, to, auto_skip, tran, errp);
+if (ret < 0) {
+goto out;
 }
 
 found = g_hash_table_new(NULL, NULL);
-- 
2.29.2




[PATCH v4 08/36] util: add transactions.c

2021-04-28 Thread Vladimir Sementsov-Ogievskiy
Add simple transaction API to use in further update of block graph
operations.

Supposed usage is:

- "prepare" is main function of the action and it should make the main
  effect of the action to be visible for the following actions, keeping
  possibility of roll-back, saving necessary things in action state,
  which is prepended to the action list (to do that, prepare func
  should call tran_add()). So, driver struct doesn't include "prepare"
  field, as it is supposed to be called directly.

- commit/rollback is supposed to be called for the list of action
  states, to commit/rollback all the actions in reverse order

- When possible "commit" should not make visible effect for other
  actions, which make possible transparent logical interaction between
  actions.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
---
 include/qemu/transactions.h | 63 
 util/transactions.c | 96 +
 MAINTAINERS |  6 +++
 util/meson.build|  1 +
 4 files changed, 166 insertions(+)
 create mode 100644 include/qemu/transactions.h
 create mode 100644 util/transactions.c

diff --git a/include/qemu/transactions.h b/include/qemu/transactions.h
new file mode 100644
index 00..92c5965235
--- /dev/null
+++ b/include/qemu/transactions.h
@@ -0,0 +1,63 @@
+/*
+ * Simple transactions API
+ *
+ * Copyright (c) 2021 Virtuozzo International GmbH.
+ *
+ * Author:
+ *  Vladimir Sementsov-Ogievskiy 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see .
+ *
+ *
+ * = Generic transaction API =
+ *
+ * The intended usage is the following: you create "prepare" functions, which
+ * represents the actions. They will usually have Transaction* argument, and
+ * call tran_add() to register finalization callbacks. For finalization
+ * callbacks, prepare corresponding TransactionActionDrv structures.
+ *
+ * Then, when you need to make a transaction, create an empty Transaction by
+ * tran_create(), call your "prepare" functions on it, and finally call
+ * tran_abort() or tran_commit() to finalize the transaction by corresponding
+ * finalization actions in reverse order.
+ */
+
+#ifndef QEMU_TRANSACTIONS_H
+#define QEMU_TRANSACTIONS_H
+
+#include 
+
+typedef struct TransactionActionDrv {
+void (*abort)(void *opaque);
+void (*commit)(void *opaque);
+void (*clean)(void *opaque);
+} TransactionActionDrv;
+
+typedef struct Transaction Transaction;
+
+Transaction *tran_new(void);
+void tran_add(Transaction *tran, TransactionActionDrv *drv, void *opaque);
+void tran_abort(Transaction *tran);
+void tran_commit(Transaction *tran);
+
+static inline void tran_finalize(Transaction *tran, int ret)
+{
+if (ret < 0) {
+tran_abort(tran);
+} else {
+tran_commit(tran);
+}
+}
+
+#endif /* QEMU_TRANSACTIONS_H */
diff --git a/util/transactions.c b/util/transactions.c
new file mode 100644
index 00..d0bc9a3e73
--- /dev/null
+++ b/util/transactions.c
@@ -0,0 +1,96 @@
+/*
+ * Simple transactions API
+ *
+ * Copyright (c) 2021 Virtuozzo International GmbH.
+ *
+ * Author:
+ *  Sementsov-Ogievskiy Vladimir 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see .
+ */
+
+#include "qemu/osdep.h"
+
+#include "qemu/transactions.h"
+#include "qemu/queue.h"
+
+typedef struct TransactionAction {
+TransactionActionDrv *drv;
+void *opaque;
+QSLIST_ENTRY(TransactionAction) entry;
+} TransactionAction;
+
+struct Transaction {
+QSLIST_HEAD(, TransactionAction) actions;
+};
+
+Transaction *tran_new(void)
+{
+Transaction *tran = g_new(Transaction, 1);
+
+QSLIST_INIT(>actions);
+
+return tran;
+}
+
+void tran_add(Transaction *tran, TransactionActionDrv *drv, void *opaque)
+{
+TransactionAction *act;
+
+   

[PATCH v4 18/36] block: add bdrv_attach_child_common() transaction action

2021-04-28 Thread Vladimir Sementsov-Ogievskiy
Split out no-perm part of bdrv_root_attach_child() into separate
transaction action. bdrv_root_attach_child() now moves to new
permission update paradigm: first update graph relations then update
permissions.

qsd-jobs test output updated. Seems now permission update goes in
another order. Still, the test comment say that we only want to check
that command doesn't crash, and it's still so.

Error message is a bit misleading as it looks like job was added first.
But actually in new paradigm of graph update we can't distinguish such
things. We should update the error message, but let's not do it now.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 block.c   | 190 ++
 tests/qemu-iotests/tests/qsd-jobs.out |   2 +-
 2 files changed, 137 insertions(+), 55 deletions(-)

diff --git a/block.c b/block.c
index 6040b9dea0..4d5c60f2ae 100644
--- a/block.c
+++ b/block.c
@@ -2955,37 +2955,74 @@ static void bdrv_replace_child(BdrvChild *child, 
BlockDriverState *new_bs)
 }
 }
 
-/*
- * This function steals the reference to child_bs from the caller.
- * That reference is later dropped by bdrv_root_unref_child().
- *
- * On failure NULL is returned, errp is set and the reference to
- * child_bs is also dropped.
- *
- * The caller must hold the AioContext lock @child_bs, but not that of @ctx
- * (unless @child_bs is already in @ctx).
- */
-BdrvChild *bdrv_root_attach_child(BlockDriverState *child_bs,
-  const char *child_name,
-  const BdrvChildClass *child_class,
-  BdrvChildRole child_role,
-  uint64_t perm, uint64_t shared_perm,
-  void *opaque, Error **errp)
+static void bdrv_remove_empty_child(BdrvChild *child)
 {
-BdrvChild *child;
-Error *local_err = NULL;
-int ret;
-AioContext *ctx;
+assert(!child->bs);
+QLIST_SAFE_REMOVE(child, next);
+g_free(child->name);
+g_free(child);
+}
 
-ret = bdrv_check_update_perm(child_bs, NULL, perm, shared_perm, NULL, 
errp);
-if (ret < 0) {
-bdrv_abort_perm_update(child_bs);
-bdrv_unref(child_bs);
-return NULL;
+typedef struct BdrvAttachChildCommonState {
+BdrvChild **child;
+AioContext *old_parent_ctx;
+AioContext *old_child_ctx;
+} BdrvAttachChildCommonState;
+
+static void bdrv_attach_child_common_abort(void *opaque)
+{
+BdrvAttachChildCommonState *s = opaque;
+BdrvChild *child = *s->child;
+BlockDriverState *bs = child->bs;
+
+bdrv_replace_child_noperm(child, NULL);
+
+if (bdrv_get_aio_context(bs) != s->old_child_ctx) {
+bdrv_try_set_aio_context(bs, s->old_child_ctx, _abort);
 }
 
-child = g_new(BdrvChild, 1);
-*child = (BdrvChild) {
+if (bdrv_child_get_parent_aio_context(child) != s->old_parent_ctx) {
+GSList *ignore = g_slist_prepend(NULL, child);
+
+child->klass->can_set_aio_ctx(child, s->old_parent_ctx, ,
+  _abort);
+g_slist_free(ignore);
+ignore = g_slist_prepend(NULL, child);
+child->klass->set_aio_ctx(child, s->old_parent_ctx, );
+
+g_slist_free(ignore);
+}
+
+bdrv_unref(bs);
+bdrv_remove_empty_child(child);
+*s->child = NULL;
+}
+
+static TransactionActionDrv bdrv_attach_child_common_drv = {
+.abort = bdrv_attach_child_common_abort,
+.clean = g_free,
+};
+
+/*
+ * Common part of attaching bdrv child to bs or to blk or to job
+ */
+static int bdrv_attach_child_common(BlockDriverState *child_bs,
+const char *child_name,
+const BdrvChildClass *child_class,
+BdrvChildRole child_role,
+uint64_t perm, uint64_t shared_perm,
+void *opaque, BdrvChild **child,
+Transaction *tran, Error **errp)
+{
+BdrvChild *new_child;
+AioContext *parent_ctx;
+AioContext *child_ctx = bdrv_get_aio_context(child_bs);
+
+assert(child);
+assert(*child == NULL);
+
+new_child = g_new(BdrvChild, 1);
+*new_child = (BdrvChild) {
 .bs = NULL,
 .name   = g_strdup(child_name),
 .klass  = child_class,
@@ -2995,37 +3032,92 @@ BdrvChild *bdrv_root_attach_child(BlockDriverState 
*child_bs,
 .opaque = opaque,
 };
 
-ctx = bdrv_child_get_parent_aio_context(child);
-
-/* If the AioContexts don't match, first try to move the subtree of
+/*
+ * If the AioContexts don't match, first try to move the subtree of
  * child_bs into the AioContext of the new parent. If this doesn't work,
- * try moving the parent into the AioContext of child_bs instead. */
-if (bdrv_get_aio_context(child_bs) != ctx) {
-ret = 

[PATCH v4 10/36] block: refactor bdrv_child* permission functions

2021-04-28 Thread Vladimir Sementsov-Ogievskiy
Split out non-recursive parts, and refactor as block graph transaction
action.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
---
 block.c | 79 ++---
 1 file changed, 59 insertions(+), 20 deletions(-)

diff --git a/block.c b/block.c
index fd621f0403..0ee0c2f29a 100644
--- a/block.c
+++ b/block.c
@@ -49,6 +49,7 @@
 #include "qemu/timer.h"
 #include "qemu/cutils.h"
 #include "qemu/id.h"
+#include "qemu/transactions.h"
 #include "block/coroutines.h"
 
 #ifdef CONFIG_BSD
@@ -2093,6 +2094,61 @@ static void bdrv_child_perm(BlockDriverState *bs, 
BlockDriverState *child_bs,
 }
 }
 
+static void bdrv_child_set_perm_commit(void *opaque)
+{
+BdrvChild *c = opaque;
+
+c->has_backup_perm = false;
+}
+
+static void bdrv_child_set_perm_abort(void *opaque)
+{
+BdrvChild *c = opaque;
+/*
+ * We may have child->has_backup_perm unset at this point, as in case of
+ * _check_ stage of permission update failure we may _check_ not the whole
+ * subtree.  Still, _abort_ is called on the whole subtree anyway.
+ */
+if (c->has_backup_perm) {
+c->perm = c->backup_perm;
+c->shared_perm = c->backup_shared_perm;
+c->has_backup_perm = false;
+}
+}
+
+static TransactionActionDrv bdrv_child_set_pem_drv = {
+.abort = bdrv_child_set_perm_abort,
+.commit = bdrv_child_set_perm_commit,
+};
+
+/*
+ * With tran=NULL needs to be followed by direct call to either
+ * bdrv_child_set_perm_commit() or bdrv_child_set_perm_abort().
+ *
+ * With non-NULL tran needs to be followed by tran_abort() or tran_commit()
+ * instead.
+ */
+static void bdrv_child_set_perm_safe(BdrvChild *c, uint64_t perm,
+ uint64_t shared, Transaction *tran)
+{
+if (!c->has_backup_perm) {
+c->has_backup_perm = true;
+c->backup_perm = c->perm;
+c->backup_shared_perm = c->shared_perm;
+}
+/*
+ * Note: it's OK if c->has_backup_perm was already set, as we can find the
+ * same c twice during check_perm procedure
+ */
+
+c->perm = perm;
+c->shared_perm = shared;
+
+if (tran) {
+tran_add(tran, _child_set_pem_drv, c);
+}
+}
+
 /*
  * Check whether permissions on this node can be changed in a way that
  * @cumulative_perms and @cumulative_shared_perms are the new cumulative
@@ -2358,37 +2414,20 @@ static int bdrv_child_check_perm(BdrvChild *c, 
BlockReopenQueue *q,
 return ret;
 }
 
-if (!c->has_backup_perm) {
-c->has_backup_perm = true;
-c->backup_perm = c->perm;
-c->backup_shared_perm = c->shared_perm;
-}
-/*
- * Note: it's OK if c->has_backup_perm was already set, as we can find the
- * same child twice during check_perm procedure
- */
-
-c->perm = perm;
-c->shared_perm = shared;
+bdrv_child_set_perm_safe(c, perm, shared, NULL);
 
 return 0;
 }
 
 static void bdrv_child_set_perm(BdrvChild *c)
 {
-c->has_backup_perm = false;
-
+bdrv_child_set_perm_commit(c);
 bdrv_set_perm(c->bs);
 }
 
 static void bdrv_child_abort_perm_update(BdrvChild *c)
 {
-if (c->has_backup_perm) {
-c->perm = c->backup_perm;
-c->shared_perm = c->backup_shared_perm;
-c->has_backup_perm = false;
-}
-
+bdrv_child_set_perm_abort(c);
 bdrv_abort_perm_update(c->bs);
 }
 
-- 
2.29.2




[PATCH v4 19/36] block: add bdrv_attach_child_noperm() transaction action

2021-04-28 Thread Vladimir Sementsov-Ogievskiy
Split no-perm part of bdrv_attach_child as separate transaction action.
It will be used in later commits.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
---
 block.c | 71 ++---
 1 file changed, 58 insertions(+), 13 deletions(-)

diff --git a/block.c b/block.c
index 4d5c60f2ae..c74e6e7cc1 100644
--- a/block.c
+++ b/block.c
@@ -85,6 +85,14 @@ static BlockDriverState *bdrv_open_inherit(const char 
*filename,
 
 static void bdrv_replace_child_noperm(BdrvChild *child,
   BlockDriverState *new_bs);
+static int bdrv_attach_child_noperm(BlockDriverState *parent_bs,
+BlockDriverState *child_bs,
+const char *child_name,
+const BdrvChildClass *child_class,
+BdrvChildRole child_role,
+BdrvChild **child,
+Transaction *tran,
+Error **errp);
 
 static int bdrv_reopen_prepare(BDRVReopenState *reopen_state, BlockReopenQueue
*queue, Error **errp);
@@ -3079,6 +3087,40 @@ static int bdrv_attach_child_common(BlockDriverState 
*child_bs,
 return 0;
 }
 
+static int bdrv_attach_child_noperm(BlockDriverState *parent_bs,
+BlockDriverState *child_bs,
+const char *child_name,
+const BdrvChildClass *child_class,
+BdrvChildRole child_role,
+BdrvChild **child,
+Transaction *tran,
+Error **errp)
+{
+int ret;
+uint64_t perm, shared_perm;
+
+assert(parent_bs->drv);
+
+bdrv_get_cumulative_perm(parent_bs, , _perm);
+bdrv_child_perm(parent_bs, child_bs, NULL, child_role, NULL,
+perm, shared_perm, , _perm);
+
+ret = bdrv_attach_child_common(child_bs, child_name, child_class,
+   child_role, perm, shared_perm, parent_bs,
+   child, tran, errp);
+if (ret < 0) {
+return ret;
+}
+
+QLIST_INSERT_HEAD(_bs->children, *child, next);
+/*
+ * child is removed in bdrv_attach_child_common_abort(), so don't care to
+ * abort this change separately.
+ */
+
+return 0;
+}
+
 static void bdrv_detach_child(BdrvChild *child)
 {
 bdrv_replace_child(child, NULL);
@@ -3139,23 +3181,26 @@ BdrvChild *bdrv_attach_child(BlockDriverState 
*parent_bs,
  BdrvChildRole child_role,
  Error **errp)
 {
-BdrvChild *child;
-uint64_t perm, shared_perm;
-
-bdrv_get_cumulative_perm(parent_bs, , _perm);
+int ret;
+BdrvChild *child = NULL;
+Transaction *tran = tran_new();
 
-assert(parent_bs->drv);
-bdrv_child_perm(parent_bs, child_bs, NULL, child_role, NULL,
-perm, shared_perm, , _perm);
+ret = bdrv_attach_child_noperm(parent_bs, child_bs, child_name, 
child_class,
+   child_role, , tran, errp);
+if (ret < 0) {
+goto out;
+}
 
-child = bdrv_root_attach_child(child_bs, child_name, child_class,
-   child_role, perm, shared_perm, parent_bs,
-   errp);
-if (child == NULL) {
-return NULL;
+ret = bdrv_refresh_perms(parent_bs, errp);
+if (ret < 0) {
+goto out;
 }
 
-QLIST_INSERT_HEAD(_bs->children, child, next);
+out:
+tran_finalize(tran, ret);
+
+bdrv_unref(child_bs);
+
 return child;
 }
 
-- 
2.29.2




[PATCH v4 17/36] block: fix bdrv_replace_node_common

2021-04-28 Thread Vladimir Sementsov-Ogievskiy
inore_children thing doesn't help to track all propagated permissions
of children we want to ignore. The simplest way to correctly update
permissions is update graph first and then do permission update. In
this case we just referesh permissions for the whole subgraph (in
topological-sort defined order) and everything is correctly calculated
automatically without any ignore_children.

So, refactor bdrv_replace_node_common to first do graph update and then
refresh the permissions.

Test test_parallel_exclusive_write() now pass, so move it out of
debugging "if".

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
---
 block.c  | 43 +---
 tests/unit/test-bdrv-graph-mod.c |  4 +--
 2 files changed, 20 insertions(+), 27 deletions(-)

diff --git a/block.c b/block.c
index a5305662dc..6040b9dea0 100644
--- a/block.c
+++ b/block.c
@@ -2273,7 +2273,6 @@ static TransactionActionDrv bdrv_replace_child_drv = {
  *
  * Note: real unref of old_bs is done only on commit.
  */
-__attribute__((unused))
 static void bdrv_replace_child_safe(BdrvChild *child, BlockDriverState *new_bs,
 Transaction *tran)
 {
@@ -4877,8 +4876,9 @@ static int bdrv_replace_node_common(BlockDriverState 
*from,
 bool auto_skip, Error **errp)
 {
 BdrvChild *c, *next;
-GSList *list = NULL, *p;
-uint64_t perm = 0, shared = BLK_PERM_ALL;
+Transaction *tran = tran_new();
+g_autoptr(GHashTable) found = NULL;
+g_autoptr(GSList) refresh_list = NULL;
 int ret;
 
 /* Make sure that @from doesn't go away until we have successfully attached
@@ -4889,7 +4889,12 @@ static int bdrv_replace_node_common(BlockDriverState 
*from,
 assert(bdrv_get_aio_context(from) == bdrv_get_aio_context(to));
 bdrv_drained_begin(from);
 
-/* Put all parents into @list and calculate their cumulative permissions */
+/*
+ * Do the replacement without permission update.
+ * Replacement may influence the permissions, we should calculate new
+ * permissions based on new graph. If we fail, we'll roll-back the
+ * replacement.
+ */
 QLIST_FOREACH_SAFE(c, >parents, next_parent, next) {
 assert(c->bs == from);
 if (!should_update_child(c, to)) {
@@ -4907,36 +4912,24 @@ static int bdrv_replace_node_common(BlockDriverState 
*from,
c->name, from->node_name);
 goto out;
 }
-list = g_slist_prepend(list, c);
-perm |= c->perm;
-shared &= c->shared_perm;
+bdrv_replace_child_safe(c, to, tran);
 }
 
-/* Check whether the required permissions can be granted on @to, ignoring
- * all BdrvChild in @list so that they can't block themselves. */
-ret = bdrv_check_update_perm(to, NULL, perm, shared, list, errp);
-if (ret < 0) {
-bdrv_abort_perm_update(to);
-goto out;
-}
+found = g_hash_table_new(NULL, NULL);
 
-/* Now actually perform the change. We performed the permission check for
- * all elements of @list at once, so set the permissions all at once at the
- * very end. */
-for (p = list; p != NULL; p = p->next) {
-c = p->data;
+refresh_list = bdrv_topological_dfs(refresh_list, found, to);
+refresh_list = bdrv_topological_dfs(refresh_list, found, from);
 
-bdrv_ref(to);
-bdrv_replace_child_noperm(c, to);
-bdrv_unref(from);
+ret = bdrv_list_refresh_perms(refresh_list, NULL, tran, errp);
+if (ret < 0) {
+goto out;
 }
 
-bdrv_set_perm(to);
-
 ret = 0;
 
 out:
-g_slist_free(list);
+tran_finalize(tran, ret);
+
 bdrv_drained_end(from);
 bdrv_unref(from);
 
diff --git a/tests/unit/test-bdrv-graph-mod.c b/tests/unit/test-bdrv-graph-mod.c
index e64e81a07c..7b3c8b437a 100644
--- a/tests/unit/test-bdrv-graph-mod.c
+++ b/tests/unit/test-bdrv-graph-mod.c
@@ -408,10 +408,10 @@ int main(int argc, char *argv[])
 test_should_update_child);
 g_test_add_func("/bdrv-graph-mod/parallel-perm-update",
 test_parallel_perm_update);
+g_test_add_func("/bdrv-graph-mod/parallel-exclusive-write",
+test_parallel_exclusive_write);
 
 if (debug) {
-g_test_add_func("/bdrv-graph-mod/parallel-exclusive-write",
-test_parallel_exclusive_write);
 g_test_add_func("/bdrv-graph-mod/append-greedy-filter",
 test_append_greedy_filter);
 }
-- 
2.29.2




[PATCH v4 06/36] block: drop ctx argument from bdrv_root_attach_child

2021-04-28 Thread Vladimir Sementsov-Ogievskiy
Passing parent aio context is redundant, as child_class and parent
opaque pointer are enough to retrieve it. Drop the argument and use new
bdrv_child_get_parent_aio_context() interface.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Alberto Garcia 
Reviewed-by: Kevin Wolf 
---
 include/block/block_int.h | 1 -
 block.c   | 8 +---
 block/block-backend.c | 4 ++--
 blockjob.c| 3 +--
 4 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index 737ec632c4..dd2de6bd1d 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -1308,7 +1308,6 @@ BdrvChild *bdrv_root_attach_child(BlockDriverState 
*child_bs,
   const char *child_name,
   const BdrvChildClass *child_class,
   BdrvChildRole child_role,
-  AioContext *ctx,
   uint64_t perm, uint64_t shared_perm,
   void *opaque, Error **errp);
 void bdrv_root_unref_child(BdrvChild *child);
diff --git a/block.c b/block.c
index 2833912436..54436c951e 100644
--- a/block.c
+++ b/block.c
@@ -2700,13 +2700,13 @@ BdrvChild *bdrv_root_attach_child(BlockDriverState 
*child_bs,
   const char *child_name,
   const BdrvChildClass *child_class,
   BdrvChildRole child_role,
-  AioContext *ctx,
   uint64_t perm, uint64_t shared_perm,
   void *opaque, Error **errp)
 {
 BdrvChild *child;
 Error *local_err = NULL;
 int ret;
+AioContext *ctx;
 
 ret = bdrv_check_update_perm(child_bs, NULL, perm, shared_perm, NULL, 
errp);
 if (ret < 0) {
@@ -2726,6 +2726,8 @@ BdrvChild *bdrv_root_attach_child(BlockDriverState 
*child_bs,
 .opaque = opaque,
 };
 
+ctx = bdrv_child_get_parent_aio_context(child);
+
 /* If the AioContexts don't match, first try to move the subtree of
  * child_bs into the AioContext of the new parent. If this doesn't work,
  * try moving the parent into the AioContext of child_bs instead. */
@@ -2786,8 +2788,8 @@ BdrvChild *bdrv_attach_child(BlockDriverState *parent_bs,
 perm, shared_perm, , _perm);
 
 child = bdrv_root_attach_child(child_bs, child_name, child_class,
-   child_role, bdrv_get_aio_context(parent_bs),
-   perm, shared_perm, parent_bs, errp);
+   child_role, perm, shared_perm, parent_bs,
+   errp);
 if (child == NULL) {
 return NULL;
 }
diff --git a/block/block-backend.c b/block/block-backend.c
index 3f656ef361..e4892fd6a5 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -435,7 +435,7 @@ BlockBackend *blk_new_open(const char *filename, const char 
*reference,
 
 blk->root = bdrv_root_attach_child(bs, "root", _root,
BDRV_CHILD_FILTERED | 
BDRV_CHILD_PRIMARY,
-   blk->ctx, perm, BLK_PERM_ALL, blk, 
errp);
+   perm, BLK_PERM_ALL, blk, errp);
 if (!blk->root) {
 blk_unref(blk);
 return NULL;
@@ -849,7 +849,7 @@ int blk_insert_bs(BlockBackend *blk, BlockDriverState *bs, 
Error **errp)
 bdrv_ref(bs);
 blk->root = bdrv_root_attach_child(bs, "root", _root,
BDRV_CHILD_FILTERED | 
BDRV_CHILD_PRIMARY,
-   blk->ctx, blk->perm, blk->shared_perm,
+   blk->perm, blk->shared_perm,
blk, errp);
 if (blk->root == NULL) {
 return -EPERM;
diff --git a/blockjob.c b/blockjob.c
index 160bf38b19..2fe1d788ba 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -229,8 +229,7 @@ int block_job_add_bdrv(BlockJob *job, const char *name, 
BlockDriverState *bs,
 if (need_context_ops && job->job.aio_context != qemu_get_aio_context()) {
 aio_context_release(job->job.aio_context);
 }
-c = bdrv_root_attach_child(bs, name, _job, 0,
-   job->job.aio_context, perm, shared_perm, job,
+c = bdrv_root_attach_child(bs, name, _job, 0, perm, shared_perm, job,
errp);
 if (need_context_ops && job->job.aio_context != qemu_get_aio_context()) {
 aio_context_acquire(job->job.aio_context);
-- 
2.29.2




[PATCH v4 01/36] tests/test-bdrv-graph-mod: add test_parallel_exclusive_write

2021-04-28 Thread Vladimir Sementsov-Ogievskiy
Add the test that shows that concept of ignore_children is incomplete.
Actually, when we want to update something, ignoring permission of some
existing BdrvChild, we should ignore also the propagated effect of this
child to the other children. But that's not done. Better approach
(update permissions on already updated graph) will be implemented
later.

Now the test fails, so it's added with -d argument to not break make
check.

Test fails with

 "Conflicts with use by fl1 as 'backing', which does not allow 'write' on base"

because when updating permissions we can ignore original top->fl1
BdrvChild. But we don't ignore exclusive write permission in fl1->base
BdrvChild, which is propagated. Correct thing to do is make graph
change first and then do permission update from the top node.

To run test do

  ./test-bdrv-graph-mod -d -p /bdrv-graph-mod/parallel-exclusive-write

from /tests.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
---
 tests/unit/test-bdrv-graph-mod.c | 70 +++-
 1 file changed, 69 insertions(+), 1 deletion(-)

diff --git a/tests/unit/test-bdrv-graph-mod.c b/tests/unit/test-bdrv-graph-mod.c
index c4f7d16039..80a9a20066 100644
--- a/tests/unit/test-bdrv-graph-mod.c
+++ b/tests/unit/test-bdrv-graph-mod.c
@@ -1,7 +1,7 @@
 /*
  * Block node graph modifications tests
  *
- * Copyright (c) 2019 Virtuozzo International GmbH. All rights reserved.
+ * Copyright (c) 2019-2021 Virtuozzo International GmbH. All rights reserved.
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License as published by
@@ -44,6 +44,21 @@ static BlockDriver bdrv_no_perm = {
 .bdrv_child_perm = no_perm_default_perms,
 };
 
+static void exclusive_write_perms(BlockDriverState *bs, BdrvChild *c,
+  BdrvChildRole role,
+  BlockReopenQueue *reopen_queue,
+  uint64_t perm, uint64_t shared,
+  uint64_t *nperm, uint64_t *nshared)
+{
+*nperm = BLK_PERM_WRITE;
+*nshared = BLK_PERM_ALL & ~BLK_PERM_WRITE;
+}
+
+static BlockDriver bdrv_exclusive_writer = {
+.format_name = "exclusive-writer",
+.bdrv_child_perm = exclusive_write_perms,
+};
+
 static BlockDriverState *no_perm_node(const char *name)
 {
 return bdrv_new_open_driver(_no_perm, name, BDRV_O_RDWR, 
_abort);
@@ -55,6 +70,12 @@ static BlockDriverState *pass_through_node(const char *name)
 BDRV_O_RDWR, _abort);
 }
 
+static BlockDriverState *exclusive_writer_node(const char *name)
+{
+return bdrv_new_open_driver(_exclusive_writer, name,
+BDRV_O_RDWR, _abort);
+}
+
 /*
  * test_update_perm_tree
  *
@@ -185,8 +206,50 @@ static void test_should_update_child(void)
 blk_unref(root);
 }
 
+/*
+ * test_parallel_exclusive_write
+ *
+ * Check that when we replace node, old permissions of the node being removed
+ * doesn't break the replacement.
+ */
+static void test_parallel_exclusive_write(void)
+{
+BlockDriverState *top = exclusive_writer_node("top");
+BlockDriverState *base = no_perm_node("base");
+BlockDriverState *fl1 = pass_through_node("fl1");
+BlockDriverState *fl2 = pass_through_node("fl2");
+
+/*
+ * bdrv_attach_child() eats child bs reference, so we need two @base
+ * references for two filters:
+ */
+bdrv_ref(base);
+
+bdrv_attach_child(top, fl1, "backing", _of_bds, BDRV_CHILD_DATA,
+  _abort);
+bdrv_attach_child(fl1, base, "backing", _of_bds, BDRV_CHILD_FILTERED,
+  _abort);
+bdrv_attach_child(fl2, base, "backing", _of_bds, BDRV_CHILD_FILTERED,
+  _abort);
+
+bdrv_replace_node(fl1, fl2, _abort);
+
+bdrv_unref(fl2);
+bdrv_unref(top);
+}
+
 int main(int argc, char *argv[])
 {
+int i;
+bool debug = false;
+
+for (i = 1; i < argc; i++) {
+if (!strcmp(argv[i], "-d")) {
+debug = true;
+break;
+}
+}
+
 bdrv_init();
 qemu_init_main_loop(_abort);
 
@@ -196,5 +259,10 @@ int main(int argc, char *argv[])
 g_test_add_func("/bdrv-graph-mod/should-update-child",
 test_should_update_child);
 
+if (debug) {
+g_test_add_func("/bdrv-graph-mod/parallel-exclusive-write",
+test_parallel_exclusive_write);
+}
+
 return g_test_run();
 }
-- 
2.29.2




[PATCH v4 04/36] block: bdrv_append(): don't consume reference

2021-04-28 Thread Vladimir Sementsov-Ogievskiy
We have too much comments for this feature. It seems better just don't
do it. Most of real users (tests don't count) have to create additional
reference.

Drop also comment in external_snapshot_prepare:
 - bdrv_append doesn't "remove" old bs in common sense, it sounds
   strange
 - the fact that bdrv_append can fail is obvious from the context
 - the fact that we must rollback all changes in transaction abort is
   known (it's the direct role of abort)

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
---
 block.c  | 25 +++--
 block/backup-top.c   |  1 -
 block/commit.c   |  1 +
 block/mirror.c   |  3 ---
 blockdev.c   |  4 
 tests/unit/test-bdrv-drain.c |  2 +-
 tests/unit/test-bdrv-graph-mod.c |  3 +++
 7 files changed, 8 insertions(+), 31 deletions(-)

diff --git a/block.c b/block.c
index c5b887cec1..1e7e8907e4 100644
--- a/block.c
+++ b/block.c
@@ -3213,11 +3213,6 @@ static BlockDriverState 
*bdrv_append_temp_snapshot(BlockDriverState *bs,
 goto out;
 }
 
-/* bdrv_append() consumes a strong reference to bs_snapshot
- * (i.e. it will call bdrv_unref() on it) even on error, so in
- * order to be able to return one, we have to increase
- * bs_snapshot's refcount here */
-bdrv_ref(bs_snapshot);
 ret = bdrv_append(bs_snapshot, bs, errp);
 if (ret < 0) {
 bs_snapshot = NULL;
@@ -4679,36 +4674,22 @@ int bdrv_replace_node(BlockDriverState *from, 
BlockDriverState *to,
  * bs_new must not be attached to a BlockBackend.
  *
  * This function does not create any image files.
- *
- * bdrv_append() takes ownership of a bs_new reference and unrefs it because
- * that's what the callers commonly need. bs_new will be referenced by the old
- * parents of bs_top after bdrv_append() returns. If the caller needs to keep a
- * reference of its own, it must call bdrv_ref().
  */
 int bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top,
 Error **errp)
 {
 int ret = bdrv_set_backing_hd(bs_new, bs_top, errp);
 if (ret < 0) {
-goto out;
+return ret;
 }
 
 ret = bdrv_replace_node(bs_top, bs_new, errp);
 if (ret < 0) {
 bdrv_set_backing_hd(bs_new, NULL, _abort);
-goto out;
+return ret;
 }
 
-ret = 0;
-
-out:
-/*
- * bs_new is now referenced by its new parents, we don't need the
- * additional reference any more.
- */
-bdrv_unref(bs_new);
-
-return ret;
+return 0;
 }
 
 static void bdrv_delete(BlockDriverState *bs)
diff --git a/block/backup-top.c b/block/backup-top.c
index 589e8b651d..62d09f213e 100644
--- a/block/backup-top.c
+++ b/block/backup-top.c
@@ -234,7 +234,6 @@ BlockDriverState *bdrv_backup_top_append(BlockDriverState 
*source,
 
 bdrv_drained_begin(source);
 
-bdrv_ref(top);
 ret = bdrv_append(top, source, errp);
 if (ret < 0) {
 error_prepend(errp, "Cannot append backup-top filter: ");
diff --git a/block/commit.c b/block/commit.c
index dd9ba87349..b89bb20b75 100644
--- a/block/commit.c
+++ b/block/commit.c
@@ -312,6 +312,7 @@ void commit_start(const char *job_id, BlockDriverState *bs,
 commit_top_bs->total_sectors = top->total_sectors;
 
 ret = bdrv_append(commit_top_bs, top, errp);
+bdrv_unref(commit_top_bs); /* referenced by new parents or failed */
 if (ret < 0) {
 commit_top_bs = NULL;
 goto fail;
diff --git a/block/mirror.c b/block/mirror.c
index 5a71bd8bbc..840b8e8c15 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -1630,9 +1630,6 @@ static BlockJob *mirror_start_job(
 
 bs_opaque->is_commit = target_is_backing;
 
-/* bdrv_append takes ownership of the mirror_top_bs reference, need to keep
- * it alive until block_job_create() succeeds even if bs has no parent. */
-bdrv_ref(mirror_top_bs);
 bdrv_drained_begin(bs);
 ret = bdrv_append(mirror_top_bs, bs, errp);
 bdrv_drained_end(bs);
diff --git a/blockdev.c b/blockdev.c
index a57590aae4..834c2304a1 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1576,10 +1576,6 @@ static void external_snapshot_prepare(BlkActionState 
*common,
 goto out;
 }
 
-/* This removes our old bs and adds the new bs. This is an operation that
- * can fail, so we need to do it in .prepare; undoing it for abort is
- * always possible. */
-bdrv_ref(state->new_bs);
 ret = bdrv_append(state->new_bs, state->old_bs, errp);
 if (ret < 0) {
 goto out;
diff --git a/tests/unit/test-bdrv-drain.c b/tests/unit/test-bdrv-drain.c
index 8a29e33e00..892f7f47d8 100644
--- a/tests/unit/test-bdrv-drain.c
+++ b/tests/unit/test-bdrv-drain.c
@@ -1478,7 +1478,6 @@ static void test_append_to_drained(void)
 g_assert_cmpint(base_s->drain_count, ==, 1);
 g_assert_cmpint(base->in_flight, ==, 0);
 
-/* Takes ownership of overlay, so we don't have to unref it later */
 

[PATCH v4 07/36] block: make bdrv_reopen_{prepare, commit, abort} private

2021-04-28 Thread Vladimir Sementsov-Ogievskiy via
These functions are called only from bdrv_reopen_multiple() in block.c.
No reason to publish them.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Alberto Garcia 
Reviewed-by: Kevin Wolf 
---
 include/block/block.h |  4 
 block.c   | 13 +
 2 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index 54279baa95..16e496a5c4 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -387,10 +387,6 @@ BlockReopenQueue *bdrv_reopen_queue(BlockReopenQueue 
*bs_queue,
 int bdrv_reopen_multiple(BlockReopenQueue *bs_queue, Error **errp);
 int bdrv_reopen_set_read_only(BlockDriverState *bs, bool read_only,
   Error **errp);
-int bdrv_reopen_prepare(BDRVReopenState *reopen_state,
-BlockReopenQueue *queue, Error **errp);
-void bdrv_reopen_commit(BDRVReopenState *reopen_state);
-void bdrv_reopen_abort(BDRVReopenState *reopen_state);
 int bdrv_pwrite_zeroes(BdrvChild *child, int64_t offset,
int64_t bytes, BdrvRequestFlags flags);
 int bdrv_make_zero(BdrvChild *child, BdrvRequestFlags flags);
diff --git a/block.c b/block.c
index 54436c951e..34c728d7e4 100644
--- a/block.c
+++ b/block.c
@@ -82,6 +82,11 @@ static BlockDriverState *bdrv_open_inherit(const char 
*filename,
BdrvChildRole child_role,
Error **errp);
 
+static int bdrv_reopen_prepare(BDRVReopenState *reopen_state, BlockReopenQueue
+   *queue, Error **errp);
+static void bdrv_reopen_commit(BDRVReopenState *reopen_state);
+static void bdrv_reopen_abort(BDRVReopenState *reopen_state);
+
 /* If non-zero, use only whitelisted block drivers */
 static int use_bdrv_whitelist;
 
@@ -4153,8 +4158,8 @@ static int bdrv_reopen_parse_backing(BDRVReopenState 
*reopen_state,
  * commit() for any other BDS that have been left in a prepare() state
  *
  */
-int bdrv_reopen_prepare(BDRVReopenState *reopen_state, BlockReopenQueue *queue,
-Error **errp)
+static int bdrv_reopen_prepare(BDRVReopenState *reopen_state,
+   BlockReopenQueue *queue, Error **errp)
 {
 int ret = -1;
 int old_flags;
@@ -4369,7 +4374,7 @@ error:
  * makes them final by swapping the staging BlockDriverState contents into
  * the active BlockDriverState contents.
  */
-void bdrv_reopen_commit(BDRVReopenState *reopen_state)
+static void bdrv_reopen_commit(BDRVReopenState *reopen_state)
 {
 BlockDriver *drv;
 BlockDriverState *bs;
@@ -4429,7 +4434,7 @@ void bdrv_reopen_commit(BDRVReopenState *reopen_state)
  * Abort the reopen, and delete and free the staged changes in
  * reopen_state
  */
-void bdrv_reopen_abort(BDRVReopenState *reopen_state)
+static void bdrv_reopen_abort(BDRVReopenState *reopen_state)
 {
 BlockDriver *drv;
 
-- 
2.29.2




[PATCH v4 03/36] tests/test-bdrv-graph-mod: add test_append_greedy_filter

2021-04-28 Thread Vladimir Sementsov-Ogievskiy
bdrv_append() is not quite good for inserting filters: it does extra
permission update in intermediate state, where filter get it filtered
child but is not yet replace it in a backing chain.

Some filters (for example backup-top) may want permissions even when
have no parents. And described intermediate state becomes invalid.

That's (half a) reason, why we need "inactive" state for backup-top
filter.

bdrv_append() will be improved later, now let's add a unit test.

Now test fails, so it runs only with -d flag. To run do

  ./test-bdrv-graph-mod -d -p /bdrv-graph-mod/append-greedy-filter

from /tests.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
---
 tests/unit/test-bdrv-graph-mod.c | 33 
 1 file changed, 33 insertions(+)

diff --git a/tests/unit/test-bdrv-graph-mod.c b/tests/unit/test-bdrv-graph-mod.c
index a8219b131e..5b6934e68b 100644
--- a/tests/unit/test-bdrv-graph-mod.c
+++ b/tests/unit/test-bdrv-graph-mod.c
@@ -352,6 +352,37 @@ static void test_parallel_perm_update(void)
 bdrv_unref(top);
 }
 
+/*
+ * It's possible that filter required permissions allows to insert it to 
backing
+ * chain, like:
+ *
+ *  1.  [top] -> [filter] -> [base]
+ *
+ * but doesn't allow to add it as a branch:
+ *
+ *  2.  [filter] --\
+ * v
+ *  [top] -> [base]
+ *
+ * So, inserting such filter should do all graph modifications and only then
+ * update permissions. If we try to go through intermediate state [2] and 
update
+ * permissions on it we'll fail.
+ *
+ * Let's check that bdrv_append() can append such a filter.
+ */
+static void test_append_greedy_filter(void)
+{
+BlockDriverState *top = exclusive_writer_node("top");
+BlockDriverState *base = no_perm_node("base");
+BlockDriverState *fl = exclusive_writer_node("fl1");
+
+bdrv_attach_child(top, base, "backing", _of_bds, BDRV_CHILD_COW,
+  _abort);
+
+bdrv_append(fl, base, _abort);
+bdrv_unref(top);
+}
+
 int main(int argc, char *argv[])
 {
 int i;
@@ -378,6 +409,8 @@ int main(int argc, char *argv[])
 test_parallel_exclusive_write);
 g_test_add_func("/bdrv-graph-mod/parallel-perm-update",
 test_parallel_perm_update);
+g_test_add_func("/bdrv-graph-mod/append-greedy-filter",
+test_append_greedy_filter);
 }
 
 return g_test_run();
-- 
2.29.2




[PATCH v4 02/36] tests/test-bdrv-graph-mod: add test_parallel_perm_update

2021-04-28 Thread Vladimir Sementsov-Ogievskiy
Add test to show that simple DFS recursion order is not correct for
permission update. Correct order is topological-sort order, which will
be introduced later.

Consider the block driver which has two filter children: one active
with exclusive write access and one inactive with no specific
permissions.

And, these two children has a common base child, like this:

┌─┐ ┌──┐
│ fl2 │ ◀── │ top  │
└─┘ └──┘
  │   │
  │   │ w
  │   ▼
  │ ┌──┐
  │ │ fl1  │
  │ └──┘
  │   │
  │   │ w
  │   ▼
  │ ┌──┐
  └───▶ │ base │
└──┘

So, exclusive write is propagated.

Assume, we want to make fl2 active instead of fl1.
So, we set some option for top driver and do permission update.

If permission update (remember, it's DFS) goes first through
top->fl1->base branch it will succeed: it firstly drop exclusive write
permissions and than apply them for another BdrvChildren.
But if permission update goes first through top->fl2->base branch it
will fail, as when we try to update fl2->base child, old not yet
updated fl1->base child will be in conflict.

Now test fails, so it runs only with -d flag. To run do

  ./test-bdrv-graph-mod -d -p /bdrv-graph-mod/parallel-perm-update

from /tests.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
---
 tests/unit/test-bdrv-graph-mod.c | 116 +++
 1 file changed, 116 insertions(+)

diff --git a/tests/unit/test-bdrv-graph-mod.c b/tests/unit/test-bdrv-graph-mod.c
index 80a9a20066..a8219b131e 100644
--- a/tests/unit/test-bdrv-graph-mod.c
+++ b/tests/unit/test-bdrv-graph-mod.c
@@ -238,6 +238,120 @@ static void test_parallel_exclusive_write(void)
 bdrv_unref(top);
 }
 
+static void write_to_file_perms(BlockDriverState *bs, BdrvChild *c,
+ BdrvChildRole role,
+ BlockReopenQueue *reopen_queue,
+ uint64_t perm, uint64_t shared,
+ uint64_t *nperm, uint64_t *nshared)
+{
+if (bs->file && c == bs->file) {
+*nperm = BLK_PERM_WRITE;
+*nshared = BLK_PERM_ALL & ~BLK_PERM_WRITE;
+} else {
+*nperm = 0;
+*nshared = BLK_PERM_ALL;
+}
+}
+
+static BlockDriver bdrv_write_to_file = {
+.format_name = "tricky-perm",
+.bdrv_child_perm = write_to_file_perms,
+};
+
+
+/*
+ * The following test shows that topological-sort order is required for
+ * permission update, simple DFS is not enough.
+ *
+ * Consider the block driver which has two filter children: one active
+ * with exclusive write access and one inactive with no specific
+ * permissions.
+ *
+ * And, these two children has a common base child, like this:
+ *
+ * ┌─┐ ┌──┐
+ * │ fl2 │ ◀── │ top  │
+ * └─┘ └──┘
+ *   │   │
+ *   │   │ w
+ *   │   ▼
+ *   │ ┌──┐
+ *   │ │ fl1  │
+ *   │ └──┘
+ *   │   │
+ *   │   │ w
+ *   │   ▼
+ *   │ ┌──┐
+ *   └───▶ │ base │
+ * └──┘
+ *
+ * So, exclusive write is propagated.
+ *
+ * Assume, we want to make fl2 active instead of fl1.
+ * So, we set some option for top driver and do permission update.
+ *
+ * With simple DFS, if permission update goes first through
+ * top->fl1->base branch it will succeed: it firstly drop exclusive write
+ * permissions and than apply them for another BdrvChildren.
+ * But if permission update goes first through top->fl2->base branch it
+ * will fail, as when we try to update fl2->base child, old not yet
+ * updated fl1->base child will be in conflict.
+ *
+ * With topological-sort order we always update parents before children, so fl1
+ * and fl2 are both updated when we update base and there is no conflict.
+ */
+static void test_parallel_perm_update(void)
+{
+BlockDriverState *top = no_perm_node("top");
+BlockDriverState *tricky =
+bdrv_new_open_driver(_write_to_file, "tricky", BDRV_O_RDWR,
+ _abort);
+BlockDriverState *base = no_perm_node("base");
+BlockDriverState *fl1 = pass_through_node("fl1");
+BlockDriverState *fl2 = pass_through_node("fl2");
+BdrvChild *c_fl1, *c_fl2;
+
+/*
+ * bdrv_attach_child() eats child bs reference, so we need two @base
+ * references for two filters:
+ */
+bdrv_ref(base);
+
+bdrv_attach_child(top, tricky, "file", _of_bds, BDRV_CHILD_DATA,
+  _abort);
+c_fl1 = bdrv_attach_child(tricky, fl1, "first", _of_bds,
+  BDRV_CHILD_FILTERED, _abort);
+c_fl2 = bdrv_attach_child(tricky, fl2, "second", _of_bds,
+  BDRV_CHILD_FILTERED, _abort);
+bdrv_attach_child(fl1, base, "backing", _of_bds, BDRV_CHILD_FILTERED,
+  _abort);
+bdrv_attach_child(fl2, base, "backing", 

[PATCH v4 12/36] block: inline bdrv_child_*() permission functions calls

2021-04-28 Thread Vladimir Sementsov-Ogievskiy
Each of them has only one caller. Open-coding simplifies further
pemission-update system changes.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Alberto Garcia 
Reviewed-by: Kevin Wolf 
---
 block.c | 59 +
 1 file changed, 17 insertions(+), 42 deletions(-)

diff --git a/block.c b/block.c
index 4511766825..e5af4cdae9 100644
--- a/block.c
+++ b/block.c
@@ -1974,11 +1974,11 @@ static int bdrv_fill_options(QDict **options, const 
char *filename,
 return 0;
 }
 
-static int bdrv_child_check_perm(BdrvChild *c, BlockReopenQueue *q,
- uint64_t perm, uint64_t shared,
- GSList *ignore_children, Error **errp);
-static void bdrv_child_abort_perm_update(BdrvChild *c);
-static void bdrv_child_set_perm(BdrvChild *c);
+static int bdrv_check_update_perm(BlockDriverState *bs, BlockReopenQueue *q,
+  uint64_t new_used_perm,
+  uint64_t new_shared_perm,
+  GSList *ignore_children,
+  Error **errp);
 
 typedef struct BlockReopenQueueEntry {
  bool prepared;
@@ -2226,15 +2226,21 @@ static int bdrv_check_perm(BlockDriverState *bs, 
BlockReopenQueue *q,
 /* Check all children */
 QLIST_FOREACH(c, >children, next) {
 uint64_t cur_perm, cur_shared;
+GSList *cur_ignore_children;
 
 bdrv_child_perm(bs, c->bs, c, c->role, q,
 cumulative_perms, cumulative_shared_perms,
 _perm, _shared);
-ret = bdrv_child_check_perm(c, q, cur_perm, cur_shared, 
ignore_children,
-errp);
+
+cur_ignore_children = g_slist_prepend(g_slist_copy(ignore_children), 
c);
+ret = bdrv_check_update_perm(c->bs, q, cur_perm, cur_shared,
+ cur_ignore_children, errp);
+g_slist_free(cur_ignore_children);
 if (ret < 0) {
 return ret;
 }
+
+bdrv_child_set_perm_safe(c, cur_perm, cur_shared, NULL);
 }
 
 return 0;
@@ -2261,7 +2267,8 @@ static void bdrv_abort_perm_update(BlockDriverState *bs)
 }
 
 QLIST_FOREACH(c, >children, next) {
-bdrv_child_abort_perm_update(c);
+bdrv_child_set_perm_abort(c);
+bdrv_abort_perm_update(c->bs);
 }
 }
 
@@ -2290,7 +2297,8 @@ static void bdrv_set_perm(BlockDriverState *bs)
 
 /* Update all children */
 QLIST_FOREACH(c, >children, next) {
-bdrv_child_set_perm(c);
+bdrv_child_set_perm_commit(c);
+bdrv_set_perm(c->bs);
 }
 }
 
@@ -2398,39 +2406,6 @@ static int bdrv_check_update_perm(BlockDriverState *bs, 
BlockReopenQueue *q,
ignore_children, errp);
 }
 
-/* Needs to be followed by a call to either bdrv_child_set_perm() or
- * bdrv_child_abort_perm_update(). */
-static int bdrv_child_check_perm(BdrvChild *c, BlockReopenQueue *q,
- uint64_t perm, uint64_t shared,
- GSList *ignore_children, Error **errp)
-{
-int ret;
-
-ignore_children = g_slist_prepend(g_slist_copy(ignore_children), c);
-ret = bdrv_check_update_perm(c->bs, q, perm, shared, ignore_children, 
errp);
-g_slist_free(ignore_children);
-
-if (ret < 0) {
-return ret;
-}
-
-bdrv_child_set_perm_safe(c, perm, shared, NULL);
-
-return 0;
-}
-
-static void bdrv_child_set_perm(BdrvChild *c)
-{
-bdrv_child_set_perm_commit(c);
-bdrv_set_perm(c->bs);
-}
-
-static void bdrv_child_abort_perm_update(BdrvChild *c)
-{
-bdrv_child_set_perm_abort(c);
-bdrv_abort_perm_update(c->bs);
-}
-
 static int bdrv_refresh_perms(BlockDriverState *bs, Error **errp)
 {
 int ret;
-- 
2.29.2




[PATCH v4 13/36] block: use topological sort for permission update

2021-04-28 Thread Vladimir Sementsov-Ogievskiy
Rewrite bdrv_check_perm(), bdrv_abort_perm_update() and bdrv_set_perm()
to update nodes in topological sort order instead of simple DFS. With
topologically sorted nodes, we update a node only when all its parents
already updated. With DFS it's not so.

Consider the following example:

A -+
|  |
|  v
|  B
|  |
v  |
C<-+

A is parent for B and C, B is parent for C.

Obviously, to update permissions, we should go in order A B C, so, when
we update C, all parent permissions already updated. But with current
approach (simple recursion) we can update in sequence A C B C (C is
updated twice). On first update of C, we consider old B permissions, so
doing wrong thing. If it succeed, all is OK, on second C update we will
finish with correct graph. But if the wrong thing failed, we break the
whole process for no reason (it's possible that updated B permission
will be less strict, but we will never check it).

Also new approach gives a way to simultaneously and correctly update
several nodes, we just need to run bdrv_topological_dfs() several times
to add all nodes and their subtrees into one topologically sorted list
(next patch will update bdrv_replace_node() in this manner).

Test test_parallel_perm_update() is now passing, so move it out of
debugging "if".

We also need to support ignore_children in
bdrv_parent_perms_conflict()

For test 283 order of conflicting parents check is changed.

Note also that in bdrv_check_perm() we don't check for parents conflict
at root bs, as we may be in the middle of permission update in
bdrv_reopen_multiple(). bdrv_reopen_multiple() will be updated soon.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
---
 block.c  | 116 +--
 tests/unit/test-bdrv-graph-mod.c |   4 +-
 tests/qemu-iotests/283.out   |   2 +-
 3 files changed, 99 insertions(+), 23 deletions(-)

diff --git a/block.c b/block.c
index e5af4cdae9..cbcc4c15a0 100644
--- a/block.c
+++ b/block.c
@@ -2054,7 +2054,9 @@ static bool bdrv_a_allow_b(BdrvChild *a, BdrvChild *b, 
Error **errp)
 return false;
 }
 
-static bool bdrv_parent_perms_conflict(BlockDriverState *bs, Error **errp)
+static bool bdrv_parent_perms_conflict(BlockDriverState *bs,
+   GSList *ignore_children,
+   Error **errp)
 {
 BdrvChild *a, *b;
 
@@ -2064,8 +2066,12 @@ static bool bdrv_parent_perms_conflict(BlockDriverState 
*bs, Error **errp)
  * directions.
  */
 QLIST_FOREACH(a, >parents, next_parent) {
+if (g_slist_find(ignore_children, a)) {
+continue;
+}
+
 QLIST_FOREACH(b, >parents, next_parent) {
-if (a == b) {
+if (a == b || g_slist_find(ignore_children, b)) {
 continue;
 }
 
@@ -2094,6 +2100,40 @@ static void bdrv_child_perm(BlockDriverState *bs, 
BlockDriverState *child_bs,
 }
 }
 
+/*
+ * Adds the whole subtree of @bs (including @bs itself) to the @list (except 
for
+ * nodes that are already in the @list, of course) so that final list is
+ * topologically sorted. Return the result (GSList @list object is updated, so
+ * don't use old reference after function call).
+ *
+ * On function start @list must be already topologically sorted and for any 
node
+ * in the @list the whole subtree of the node must be in the @list as well. The
+ * simplest way to satisfy this criteria: use only result of
+ * bdrv_topological_dfs() or NULL as @list parameter.
+ */
+static GSList *bdrv_topological_dfs(GSList *list, GHashTable *found,
+BlockDriverState *bs)
+{
+BdrvChild *child;
+g_autoptr(GHashTable) local_found = NULL;
+
+if (!found) {
+assert(!list);
+found = local_found = g_hash_table_new(NULL, NULL);
+}
+
+if (g_hash_table_contains(found, bs)) {
+return list;
+}
+g_hash_table_add(found, bs);
+
+QLIST_FOREACH(child, >children, next) {
+list = bdrv_topological_dfs(list, found, child->bs);
+}
+
+return g_slist_prepend(list, bs);
+}
+
 static void bdrv_child_set_perm_commit(void *opaque)
 {
 BdrvChild *c = opaque;
@@ -2158,10 +2198,10 @@ static void bdrv_child_set_perm_safe(BdrvChild *c, 
uint64_t perm,
  * A call to this function must always be followed by a call to bdrv_set_perm()
  * or bdrv_abort_perm_update().
  */
-static int bdrv_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
-   uint64_t cumulative_perms,
-   uint64_t cumulative_shared_perms,
-   GSList *ignore_children, Error **errp)
+static int bdrv_node_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
+uint64_t cumulative_perms,
+uint64_t cumulative_shared_perms,
+GSList *ignore_children, Error **errp)
 {
 BlockDriver *drv = 

[PATCH v4 00/36] block: update graph permissions update

2021-04-28 Thread Vladimir Sementsov-Ogievskiy
Hi all!

And here is v4.

v4:
git: https://src.openvz.org/scm/~vsementsov/qemu.git
tag: up-block-topologic-perm-v4

01: drop misleading comment
04: add missed bdrv_unref to new bdrv_append() caller
08: s/Than/Then/ in comment 
18: add missed ".clean = g_free" to bdrv_attach_child_common_drv
fix typo in comment
qsd-jobs test output updated 
22: rewrite comment. I've changed your last wording, be free to return
it back if you like it more. 

05,06,09: Add Alberto's r-b

all patches except for 18, for new test output: Add Kevin's r-b

===

The main point of the series is fixing some permission update problems
(see patches 01-03 as examples), that in turn makes possible more clean
inserting and removing of filters (see patch 26 where .active field is
dropped finally from backup-top filter, we don't need a workaround
anymore).

The series brings util/transactions.c (patch 10) and use of it in
block.c, which allows clean block graph change transactions, with
possibility of reverting all modifications (movement and removement of
children, changing aio context, changing permissions) in reverse order
on failure path.

The series also helps Alberto's "Allow changing bs->file on reopen"
which we want to merge prior dropping x- prefix from blockdev-reopen
command.

Vladimir Sementsov-Ogievskiy (36):
  tests/test-bdrv-graph-mod: add test_parallel_exclusive_write
  tests/test-bdrv-graph-mod: add test_parallel_perm_update
  tests/test-bdrv-graph-mod: add test_append_greedy_filter
  block: bdrv_append(): don't consume reference
  block: BdrvChildClass: add .get_parent_aio_context handler
  block: drop ctx argument from bdrv_root_attach_child
  block: make bdrv_reopen_{prepare,commit,abort} private
  util: add transactions.c
  block: bdrv_refresh_perms: check for parents permissions conflict
  block: refactor bdrv_child* permission functions
  block: rewrite bdrv_child_try_set_perm() using bdrv_refresh_perms()
  block: inline bdrv_child_*() permission functions calls
  block: use topological sort for permission update
  block: add bdrv_drv_set_perm transaction action
  block: add bdrv_list_* permission update functions
  block: add bdrv_replace_child_safe() transaction action
  block: fix bdrv_replace_node_common
  block: add bdrv_attach_child_common() transaction action
  block: add bdrv_attach_child_noperm() transaction action
  block: split out bdrv_replace_node_noperm()
  block: adapt bdrv_append() for inserting filters
  block: add bdrv_remove_filter_or_cow transaction action
  block: introduce bdrv_drop_filter()
  block/backup-top: drop .active
  block: drop ignore_children for permission update functions
  block: make bdrv_unset_inherits_from to be a transaction action
  block: make bdrv_refresh_limits() to be a transaction action
  block: add bdrv_set_backing_noperm() transaction action
  block: bdrv_reopen_multiple(): move bdrv_flush to separate pre-prepare
  block: bdrv_reopen_multiple: refresh permissions on updated graph
  block: drop unused permission update functions
  block: inline bdrv_check_perm_common()
  block: inline bdrv_replace_child()
  block: refactor bdrv_child_set_perm_safe() transaction action
  block: rename bdrv_replace_child_safe() to bdrv_replace_child()
  block: refactor bdrv_node_check_perm()

 include/block/block.h |   13 +-
 include/block/block_int.h |8 +-
 include/qemu/transactions.h   |   63 ++
 block.c   | 1329 +++--
 block/backup-top.c|   48 +-
 block/block-backend.c |   13 +-
 block/commit.c|1 +
 block/file-posix.c|   91 +-
 block/io.c|   31 +-
 block/mirror.c|3 -
 blockdev.c|4 -
 blockjob.c|   11 +-
 tests/unit/test-bdrv-drain.c  |2 +-
 tests/unit/test-bdrv-graph-mod.c  |  209 +++-
 util/transactions.c   |   96 ++
 MAINTAINERS   |6 +
 tests/qemu-iotests/245|2 +-
 tests/qemu-iotests/283.out|2 +-
 tests/qemu-iotests/tests/qsd-jobs.out |2 +-
 util/meson.build  |1 +
 20 files changed, 1261 insertions(+), 674 deletions(-)
 create mode 100644 include/qemu/transactions.h
 create mode 100644 util/transactions.c

-- 
2.29.2




Re: [PATCH] hw/ide: Fix crash when plugging a piix3-ide device into the x-remote machine

2021-04-28 Thread Markus Armbruster
Stefan Hajnoczi  writes:

> On Tue, Apr 27, 2021 at 07:54:21PM +0200, Philippe Mathieu-Daudé wrote:
>> On 4/27/21 7:16 PM, John Snow wrote:
>> > On 4/27/21 9:54 AM, Stefan Hajnoczi wrote:
>> >> I suggest fixing this at the qdev level. Make piix3-ide have a
>> >> sub-device that inherits from ISA_DEVICE so it can only be instantiated
>> >> when there's an ISA bus.
>> >>
>> >> Stefan
>> > 
>> > My qdev knowledge is shaky. Does this imply that you agree with the
>> > direction of Thomas's patch, or do you just mean to disagree with Phil
>> > on his preferred course of action?
>> 
>> My understanding is a disagreement to both, with a 3rd direction :)
>> 
>> I agree with Stefan direction but I'm not sure (yet) that a sub-device
>> is the best (long-term) solution. I guess there is a design issue with
>> this device, and would like to understanding it first.
>> 
>> IIUC Stefan says the piix3-ide is both a PCI and IDE device, but QOM
>> only allow a single parent. Multiple QOM inheritance is resolved as
>> interfaces, but PCI/IDE qdev aren't interfaces, rather abstract objects.
>> So he suggests to embed an IDE device within the PCI piix3-ide device.
>> 
>> My view is the PIIX is a chipset that share stuffs between components,
>> and the IDE bus belongs to the chipset PCI root (or eventually the
>> PCI-ISA bridge, function #0). The IDE function would use the IDE bus
>> from its root parent as a linked property.
>> My problem is currently this device is user-creatable as a Frankenstein
>> single PCI function, out of its chipset. I'm not sure yet this is a
>> dead end or I could work something out.
>
> Kevin and Paolo previously pointed out that piix3-ide is sometimes used
> with the Q35 machine type. The user-creatable piix3-ide device needs to
> be deprecated before it can be dropped. That's a long process that
> cannot fix the current crash any time soon.
>
> I do support deprecating the user-creatable piix3-ide device in favor of
> a proper Q35 Legacy IDE implementation. The main problem is this
> involves a bunch of work and I'm not sure who would do it (the payoff is
> not very high).

In my opinion, letting users plug device models for PCI *functions* as
if they were *devices* was a mistake.  Compounding the mistake of not
modelling the difference between PCI function and PCI device.

The more of them we can deprecate, the better.




Re: [PATCH] hw/ide: Fix crash when plugging a piix3-ide device into the x-remote machine

2021-04-28 Thread Markus Armbruster
Stefan Hajnoczi  writes:

> On Tue, Apr 27, 2021 at 02:02:27PM -0400, John Snow wrote:
>> On 4/27/21 1:54 PM, Philippe Mathieu-Daudé wrote:
>> > On 4/27/21 7:16 PM, John Snow wrote:
>> > > On 4/27/21 9:54 AM, Stefan Hajnoczi wrote:
>> > > > I suggest fixing this at the qdev level. Make piix3-ide have a
>> > > > sub-device that inherits from ISA_DEVICE so it can only be instantiated
>> > > > when there's an ISA bus.
>> > > > 
>> > > > Stefan
>> > > 
>> > > My qdev knowledge is shaky. Does this imply that you agree with the
>> > > direction of Thomas's patch, or do you just mean to disagree with Phil
>> > > on his preferred course of action?
>> > 
>> > My understanding is a disagreement to both, with a 3rd direction :)
>> > 
>> > I agree with Stefan direction but I'm not sure (yet) that a sub-device
>> > is the best (long-term) solution. I guess there is a design issue with
>> > this device, and would like to understanding it first.
>> > 
>> > IIUC Stefan says the piix3-ide is both a PCI and IDE device, but QOM
>> > only allow a single parent. Multiple QOM inheritance is resolved as
>> > interfaces, but PCI/IDE qdev aren't interfaces, rather abstract objects.
>> > So he suggests to embed an IDE device within the PCI piix3-ide device.
>> > 
>> > My view is the PIIX is a chipset that share stuffs between components,
>> > and the IDE bus belongs to the chipset PCI root (or eventually the
>> > PCI-ISA bridge, function #0). The IDE function would use the IDE bus
>> > from its root parent as a linked property.
>> > My problem is currently this device is user-creatable as a Frankenstein
>> > single PCI function, out of its chipset. I'm not sure yet this is a
>> > dead end or I could work something out.
>> > 
>> > Regards,
>> > 
>> > Phil.
>> > 
>> 
>> It sounds complicated. In the meantime, I think I am favor of taking
>> Thomas's patch because it merely adds some error routing to allow us to
>> avoid a crash. The core organizational issues of the IDE device(s) will
>> remain and can be solved later as needed.
>
> The approach in this patch is okay but we should keep in mind it only
> solves piix3-ide. ISA provides a non-qdev backdoor API and there may be
> more instances of this type of bug.
>
> A qdev fix would address the root cause and make it possible to drop the
> backdoor API, but that's probably too much work for little benefit.

What do you mean by backdoor API?  Global @isabus?




Re: [PATCH 0/9] hw/block: m25p80: Fix the mess of dummy bytes needed for fast read commands

2021-04-28 Thread Cédric Le Goater
On 4/28/21 3:12 PM, Bin Meng wrote:
> Hi Cédric,
> 
> On Tue, Apr 27, 2021 at 10:32 PM Cédric Le Goater  wrote:
>>
>> Hello,
>>
>> On 4/27/21 10:54 AM, Francisco Iglesias wrote:
>>> On [2021 Apr 27] Tue 15:56:10, Alistair Francis wrote:
 On Fri, Apr 23, 2021 at 4:46 PM Bin Meng  wrote:
>
> On Mon, Feb 8, 2021 at 10:41 PM Bin Meng  wrote:
>>
>> On Thu, Jan 21, 2021 at 10:18 PM Francisco Iglesias
>>  wrote:
>>>
>>> Hi Bin,
>>>
>>> On [2021 Jan 21] Thu 16:59:51, Bin Meng wrote:
 Hi Francisco,

 On Thu, Jan 21, 2021 at 4:50 PM Francisco Iglesias
  wrote:
>
> Dear Bin,
>
> On [2021 Jan 20] Wed 22:20:25, Bin Meng wrote:
>> Hi Francisco,
>>
>> On Tue, Jan 19, 2021 at 9:01 PM Francisco Iglesias
>>  wrote:
>>>
>>> Hi Bin,
>>>
>>> On [2021 Jan 18] Mon 20:32:19, Bin Meng wrote:
 Hi Francisco,

 On Mon, Jan 18, 2021 at 6:06 PM Francisco Iglesias
  wrote:
>
> Hi Bin,
>
> On [2021 Jan 15] Fri 22:38:18, Bin Meng wrote:
>> Hi Francisco,
>>
>> On Fri, Jan 15, 2021 at 8:26 PM Francisco Iglesias
>>  wrote:
>>>
>>> Hi Bin,
>>>
>>> On [2021 Jan 15] Fri 10:07:52, Bin Meng wrote:
 Hi Francisco,

 On Fri, Jan 15, 2021 at 2:13 AM Francisco Iglesias
  wrote:
>
> Hi Bin,
>
> On [2021 Jan 14] Thu 23:08:53, Bin Meng wrote:
>> From: Bin Meng 
>>
>> The m25p80 model uses s->needed_bytes to indicate how many 
>> follow-up
>> bytes are expected to be received after it receives a 
>> command. For
>> example, depending on the address mode, either 3-byte 
>> address or
>> 4-byte address is needed.
>>
>> For fast read family commands, some dummy cycles are 
>> required after
>> sending the address bytes, and the dummy cycles need to be 
>> counted
>> in s->needed_bytes. This is where the mess began.
>>
>> As the variable name (needed_bytes) indicates, the unit is 
>> in byte.
>> It is not in bit, or cycle. However for some reason the 
>> model has
>> been using the number of dummy cycles for s->needed_bytes. 
>> The right
>> approach is to convert the number of dummy cycles to bytes 
>> based on
>> the SPI protocol, for example, 6 dummy cycles for the Fast 
>> Read Quad
>> I/O (EBh) should be converted to 3 bytes per the formula (6 
>> * 4 / 8).
>
> While not being the original implementor I must assume that 
> above solution was
> considered but not chosen by the developers due to it is 
> inaccuracy (it
> wouldn't be possible to model exacly 6 dummy cycles, only a 
> multiple of 8,
> meaning that if the controller is wrongly programmed to 
> generate 7 the error
> wouldn't be caught and the controller will still be 
> considered "correct"). Now
> that we have this detail in the implementation I'm in favor 
> of keeping it, this
> also because the detail is already in use for catching 
> exactly above error.
>

 I found no clue from the commit message that my proposed 
 solution here
 was ever considered, otherwise all SPI controller models 
 supporting
 software generation should have been found out seriously 
 broken long
 time ago!
>>>
>>>
>>> The controllers you are referring to might lack support for 
>>> commands requiring
>>> dummy clock cycles but I really hope they work with the other 
>>> commands? If so I
>>
>> I am not sure why you view dummy clock cycles as something 
>> special
>> that needs some special support from the SPI controller. For the 
>> case
>> 1 controller, it's nothing special from the controller 
>> perspective,
>> just like sending out a command, or address bytes, or data. The
>> controller just shifts data bit by bit from its tx fifo and 

Re: [PATCH 0/9] hw/block: m25p80: Fix the mess of dummy bytes needed for fast read commands

2021-04-28 Thread Bin Meng
Hi Cédric,

On Tue, Apr 27, 2021 at 10:32 PM Cédric Le Goater  wrote:
>
> Hello,
>
> On 4/27/21 10:54 AM, Francisco Iglesias wrote:
> > On [2021 Apr 27] Tue 15:56:10, Alistair Francis wrote:
> >> On Fri, Apr 23, 2021 at 4:46 PM Bin Meng  wrote:
> >>>
> >>> On Mon, Feb 8, 2021 at 10:41 PM Bin Meng  wrote:
> 
>  On Thu, Jan 21, 2021 at 10:18 PM Francisco Iglesias
>   wrote:
> >
> > Hi Bin,
> >
> > On [2021 Jan 21] Thu 16:59:51, Bin Meng wrote:
> >> Hi Francisco,
> >>
> >> On Thu, Jan 21, 2021 at 4:50 PM Francisco Iglesias
> >>  wrote:
> >>>
> >>> Dear Bin,
> >>>
> >>> On [2021 Jan 20] Wed 22:20:25, Bin Meng wrote:
>  Hi Francisco,
> 
>  On Tue, Jan 19, 2021 at 9:01 PM Francisco Iglesias
>   wrote:
> >
> > Hi Bin,
> >
> > On [2021 Jan 18] Mon 20:32:19, Bin Meng wrote:
> >> Hi Francisco,
> >>
> >> On Mon, Jan 18, 2021 at 6:06 PM Francisco Iglesias
> >>  wrote:
> >>>
> >>> Hi Bin,
> >>>
> >>> On [2021 Jan 15] Fri 22:38:18, Bin Meng wrote:
>  Hi Francisco,
> 
>  On Fri, Jan 15, 2021 at 8:26 PM Francisco Iglesias
>   wrote:
> >
> > Hi Bin,
> >
> > On [2021 Jan 15] Fri 10:07:52, Bin Meng wrote:
> >> Hi Francisco,
> >>
> >> On Fri, Jan 15, 2021 at 2:13 AM Francisco Iglesias
> >>  wrote:
> >>>
> >>> Hi Bin,
> >>>
> >>> On [2021 Jan 14] Thu 23:08:53, Bin Meng wrote:
>  From: Bin Meng 
> 
>  The m25p80 model uses s->needed_bytes to indicate how many 
>  follow-up
>  bytes are expected to be received after it receives a 
>  command. For
>  example, depending on the address mode, either 3-byte 
>  address or
>  4-byte address is needed.
> 
>  For fast read family commands, some dummy cycles are 
>  required after
>  sending the address bytes, and the dummy cycles need to be 
>  counted
>  in s->needed_bytes. This is where the mess began.
> 
>  As the variable name (needed_bytes) indicates, the unit is 
>  in byte.
>  It is not in bit, or cycle. However for some reason the 
>  model has
>  been using the number of dummy cycles for s->needed_bytes. 
>  The right
>  approach is to convert the number of dummy cycles to bytes 
>  based on
>  the SPI protocol, for example, 6 dummy cycles for the Fast 
>  Read Quad
>  I/O (EBh) should be converted to 3 bytes per the formula (6 
>  * 4 / 8).
> >>>
> >>> While not being the original implementor I must assume that 
> >>> above solution was
> >>> considered but not chosen by the developers due to it is 
> >>> inaccuracy (it
> >>> wouldn't be possible to model exacly 6 dummy cycles, only a 
> >>> multiple of 8,
> >>> meaning that if the controller is wrongly programmed to 
> >>> generate 7 the error
> >>> wouldn't be caught and the controller will still be 
> >>> considered "correct"). Now
> >>> that we have this detail in the implementation I'm in favor 
> >>> of keeping it, this
> >>> also because the detail is already in use for catching 
> >>> exactly above error.
> >>>
> >>
> >> I found no clue from the commit message that my proposed 
> >> solution here
> >> was ever considered, otherwise all SPI controller models 
> >> supporting
> >> software generation should have been found out seriously 
> >> broken long
> >> time ago!
> >
> >
> > The controllers you are referring to might lack support for 
> > commands requiring
> > dummy clock cycles but I really hope they work with the other 
> > commands? If so I
> 
>  I am not sure why you view dummy clock cycles as something 
>  special
>  that needs some special support from the SPI controller. For the 
>  case
>  1 controller, it's nothing special from the controller 
>  perspective,
>  just like sending out a command, or address bytes, or data. The
>  controller just shifts data bit by bit from its tx fifo and 
>  that's it.
>  In 

[PATCH v2 7/8] hw/block/fdc-sysbus: Add 'dma-channel' property

2021-04-28 Thread Philippe Mathieu-Daudé
QDev properties to be set before the device is realized should
be exposed as a Property with a DEFINE_PROP_XXX() macro, then
accessed with the equivalent qdev_prop_set_xxx() API.

Do this with the FDCtrlSysBus 'dma-channel' property: convert
it to int32_t, default-initialize with DEFINE_PROP_INT32() and
use qdev_prop_set_int32() to set its value in fdctrl_init_sysbus().

Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/block/fdc-internal.h | 2 +-
 hw/block/fdc-sysbus.c   | 9 ++---
 2 files changed, 3 insertions(+), 8 deletions(-)

diff --git a/hw/block/fdc-internal.h b/hw/block/fdc-internal.h
index 278de725e69..29b318f7525 100644
--- a/hw/block/fdc-internal.h
+++ b/hw/block/fdc-internal.h
@@ -96,7 +96,7 @@ struct FDCtrl {
 qemu_irq irq;
 /* Controller state */
 QEMUTimer *result_timer;
-int dma_chann;
+int32_t dma_chann;
 uint8_t phase;
 IsaDma *dma;
 /* Controller's identification */
diff --git a/hw/block/fdc-sysbus.c b/hw/block/fdc-sysbus.c
index 8f94c2efb63..74c7c8f2e01 100644
--- a/hw/block/fdc-sysbus.c
+++ b/hw/block/fdc-sysbus.c
@@ -106,15 +106,11 @@ void sysbus_fdc_init_drives(SysBusDevice *dev, DriveInfo 
**fds)
 void fdctrl_init_sysbus(qemu_irq irq, int dma_chann,
 hwaddr mmio_base, DriveInfo **fds)
 {
-FDCtrl *fdctrl;
 DeviceState *dev;
 SysBusDevice *sbd;
-FDCtrlSysBus *sys;
 
 dev = qdev_new("sysbus-fdc");
-sys = SYSBUS_FDC(dev);
-fdctrl = >state;
-fdctrl->dma_chann = dma_chann; /* FIXME */
+qdev_prop_set_int32(dev, "dma-channel", dma_chann);
 sbd = SYS_BUS_DEVICE(dev);
 sysbus_realize_and_unref(sbd, _fatal);
 sysbus_connect_irq(sbd, 0, irq);
@@ -131,8 +127,6 @@ static void sysbus_fdc_common_initfn(Object *obj)
 FDCtrlSysBus *sys = SYSBUS_FDC(obj);
 FDCtrl *fdctrl = >state;
 
-fdctrl->dma_chann = -1;
-
 qdev_set_legacy_instance_id(dev, 0 /* io */, 2); /* FIXME */
 
 memory_region_init_io(>iomem, obj,
@@ -173,6 +167,7 @@ static Property sysbus_fdc_properties[] = {
 DEFINE_PROP_SIGNED("fallback", FDCtrlSysBus, state.fallback,
 FLOPPY_DRIVE_TYPE_144, qdev_prop_fdc_drive_type,
 FloppyDriveType),
+DEFINE_PROP_INT32("dma-channel", FDCtrlSysBus, state.dma_chann, -1),
 DEFINE_PROP_END_OF_LIST(),
 };
 
-- 
2.26.3




[PATCH v2 6/8] hw/sparc/sun4m: Inline sun4m_fdctrl_init()

2021-04-28 Thread Philippe Mathieu-Daudé
There is only one call site for sun4m_fdctrl_init(), and this
function is specific to the sun4m machines, not part of the
SYSBUS_FDC API. Move it locally with the machine code, and
remove its declaration in "hw/block/fdc.h".

Suggested-by: Mark Cave-Ayland 
Signed-off-by: Philippe Mathieu-Daudé 
---
 include/hw/block/fdc.h |  2 --
 hw/block/fdc-sysbus.c  | 16 
 hw/sparc/sun4m.c   | 16 
 3 files changed, 16 insertions(+), 18 deletions(-)

diff --git a/include/hw/block/fdc.h b/include/hw/block/fdc.h
index 52e45c53078..06612218630 100644
--- a/include/hw/block/fdc.h
+++ b/include/hw/block/fdc.h
@@ -14,8 +14,6 @@ void isa_fdc_init_drives(ISADevice *fdc, DriveInfo **fds);
 void sysbus_fdc_init_drives(SysBusDevice *dev, DriveInfo **fds);
 void fdctrl_init_sysbus(qemu_irq irq, int dma_chann,
 hwaddr mmio_base, DriveInfo **fds);
-void sun4m_fdctrl_init(qemu_irq irq, hwaddr io_base,
-   DriveInfo **fds, qemu_irq *fdc_tc);
 
 FloppyDriveType isa_fdc_get_drive_type(ISADevice *fdc, int i);
 int cmos_get_fd_drive_type(FloppyDriveType fd0);
diff --git a/hw/block/fdc-sysbus.c b/hw/block/fdc-sysbus.c
index 1163e53165d..8f94c2efb63 100644
--- a/hw/block/fdc-sysbus.c
+++ b/hw/block/fdc-sysbus.c
@@ -123,22 +123,6 @@ void fdctrl_init_sysbus(qemu_irq irq, int dma_chann,
 sysbus_fdc_init_drives(sbd, fds);
 }
 
-void sun4m_fdctrl_init(qemu_irq irq, hwaddr io_base,
-   DriveInfo **fds, qemu_irq *fdc_tc)
-{
-DeviceState *dev;
-SysBusDevice *sbd;
-
-dev = qdev_new("sun-fdtwo");
-sbd = SYS_BUS_DEVICE(dev);
-sysbus_realize_and_unref(sbd, _fatal);
-sysbus_connect_irq(sbd, 0, irq);
-sysbus_mmio_map(sbd, 0, io_base);
-*fdc_tc = qdev_get_gpio_in(dev, 0);
-
-sysbus_fdc_init_drives(sbd, fds);
-}
-
 static void sysbus_fdc_common_initfn(Object *obj)
 {
 DeviceState *dev = DEVICE(obj);
diff --git a/hw/sparc/sun4m.c b/hw/sparc/sun4m.c
index 1a00816d9a8..27bcea07157 100644
--- a/hw/sparc/sun4m.c
+++ b/hw/sparc/sun4m.c
@@ -837,6 +837,22 @@ static void dummy_fdc_tc(void *opaque, int irq, int level)
 {
 }
 
+static void sun4m_fdctrl_init(qemu_irq irq, hwaddr io_base,
+  DriveInfo **fds, qemu_irq *fdc_tc)
+{
+DeviceState *dev;
+SysBusDevice *sbd;
+
+dev = qdev_new("sun-fdtwo");
+sbd = SYS_BUS_DEVICE(dev);
+sysbus_realize_and_unref(sbd, _fatal);
+sysbus_connect_irq(sbd, 0, irq);
+sysbus_mmio_map(sbd, 0, io_base);
+*fdc_tc = qdev_get_gpio_in(dev, 0);
+
+sysbus_fdc_init_drives(sbd, fds);
+}
+
 static void sun4m_hw_init(const struct sun4m_hwdef *hwdef,
   MachineState *machine)
 {
-- 
2.26.3




[PATCH v2 5/8] hw/block/fdc: Add sysbus_fdc_init_drives() method

2021-04-28 Thread Philippe Mathieu-Daudé
FDCtrlSysBus's FDCtrl state is a private field. However it is
accessed by the public fdctrl_init_sysbus() and sun4m_fdctrl_init()
methods. To be able to move them out of fdc-sysbus.c, first add
the sysbus_fdc_init_drives() method and use it in these 2 functions.

Signed-off-by: Philippe Mathieu-Daudé 
---
 include/hw/block/fdc.h |  2 ++
 hw/block/fdc-sysbus.c  | 23 ---
 2 files changed, 18 insertions(+), 7 deletions(-)

diff --git a/include/hw/block/fdc.h b/include/hw/block/fdc.h
index 1ecca7cac7f..52e45c53078 100644
--- a/include/hw/block/fdc.h
+++ b/include/hw/block/fdc.h
@@ -3,6 +3,7 @@
 
 #include "exec/hwaddr.h"
 #include "qapi/qapi-types-block.h"
+#include "hw/sysbus.h"
 
 /* fdc.c */
 #define MAX_FD 2
@@ -10,6 +11,7 @@
 #define TYPE_ISA_FDC "isa-fdc"
 
 void isa_fdc_init_drives(ISADevice *fdc, DriveInfo **fds);
+void sysbus_fdc_init_drives(SysBusDevice *dev, DriveInfo **fds);
 void fdctrl_init_sysbus(qemu_irq irq, int dma_chann,
 hwaddr mmio_base, DriveInfo **fds);
 void sun4m_fdctrl_init(qemu_irq irq, hwaddr io_base,
diff --git a/hw/block/fdc-sysbus.c b/hw/block/fdc-sysbus.c
index 71755fd6ae4..1163e53165d 100644
--- a/hw/block/fdc-sysbus.c
+++ b/hw/block/fdc-sysbus.c
@@ -94,6 +94,15 @@ static void fdctrl_handle_tc(void *opaque, int irq, int 
level)
 trace_fdctrl_tc_pulse(level);
 }
 
+void sysbus_fdc_init_drives(SysBusDevice *dev, DriveInfo **fds)
+{
+FDCtrlSysBus *fdc;
+
+fdc = SYSBUS_FDC(dev);
+
+fdctrl_init_drives(>state.bus, fds);
+}
+
 void fdctrl_init_sysbus(qemu_irq irq, int dma_chann,
 hwaddr mmio_base, DriveInfo **fds)
 {
@@ -111,23 +120,23 @@ void fdctrl_init_sysbus(qemu_irq irq, int dma_chann,
 sysbus_connect_irq(sbd, 0, irq);
 sysbus_mmio_map(sbd, 0, mmio_base);
 
-fdctrl_init_drives(>state.bus, fds);
+sysbus_fdc_init_drives(sbd, fds);
 }
 
 void sun4m_fdctrl_init(qemu_irq irq, hwaddr io_base,
DriveInfo **fds, qemu_irq *fdc_tc)
 {
 DeviceState *dev;
-FDCtrlSysBus *sys;
+SysBusDevice *sbd;
 
 dev = qdev_new("sun-fdtwo");
-sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), _fatal);
-sys = SYSBUS_FDC(dev);
-sysbus_connect_irq(SYS_BUS_DEVICE(sys), 0, irq);
-sysbus_mmio_map(SYS_BUS_DEVICE(sys), 0, io_base);
+sbd = SYS_BUS_DEVICE(dev);
+sysbus_realize_and_unref(sbd, _fatal);
+sysbus_connect_irq(sbd, 0, irq);
+sysbus_mmio_map(sbd, 0, io_base);
 *fdc_tc = qdev_get_gpio_in(dev, 0);
 
-fdctrl_init_drives(>state.bus, fds);
+sysbus_fdc_init_drives(sbd, fds);
 }
 
 static void sysbus_fdc_common_initfn(Object *obj)
-- 
2.26.3




[PATCH v2 1/8] hw/block/fdc: Replace disabled fprintf() by trace event

2021-04-28 Thread Philippe Mathieu-Daudé
Reviewed-by: John Snow 
Acked-by: Mark Cave-Ayland 
Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/block/fdc.c| 7 +--
 hw/block/trace-events | 1 +
 2 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/hw/block/fdc.c b/hw/block/fdc.c
index a825c2acbae..1d3a0473678 100644
--- a/hw/block/fdc.c
+++ b/hw/block/fdc.c
@@ -1242,12 +1242,7 @@ static void fdctrl_external_reset_isa(DeviceState *d)
 
 static void fdctrl_handle_tc(void *opaque, int irq, int level)
 {
-//FDCtrl *s = opaque;
-
-if (level) {
-// XXX
-FLOPPY_DPRINTF("TC pulsed\n");
-}
+trace_fdctrl_tc_pulse(level);
 }
 
 /* Change IRQ state */
diff --git a/hw/block/trace-events b/hw/block/trace-events
index fa12e3a67a7..306989c193c 100644
--- a/hw/block/trace-events
+++ b/hw/block/trace-events
@@ -3,6 +3,7 @@
 # fdc.c
 fdc_ioport_read(uint8_t reg, uint8_t value) "read reg 0x%02x val 0x%02x"
 fdc_ioport_write(uint8_t reg, uint8_t value) "write reg 0x%02x val 0x%02x"
+fdctrl_tc_pulse(int level) "TC pulse: %u"
 
 # pflash_cfi01.c
 # pflash_cfi02.c
-- 
2.26.3




[PATCH v2 8/8] hw/mips/jazz: Inline fdctrl_init_sysbus()

2021-04-28 Thread Philippe Mathieu-Daudé
There is only one call site for fdctrl_init_sysbus(), and this
function is specific to the jazz machines, not part of the
SYSBUS_FDC API. Move it locally with the machine code, and
remove its declaration in "hw/block/fdc.h".

Suggested-by: Mark Cave-Ayland 
Signed-off-by: Philippe Mathieu-Daudé 
---
 include/hw/block/fdc.h |  3 ---
 hw/block/fdc-sysbus.c  | 16 
 hw/mips/jazz.c | 16 
 3 files changed, 16 insertions(+), 19 deletions(-)

diff --git a/include/hw/block/fdc.h b/include/hw/block/fdc.h
index 06612218630..ac99d6bcaa0 100644
--- a/include/hw/block/fdc.h
+++ b/include/hw/block/fdc.h
@@ -1,7 +1,6 @@
 #ifndef HW_FDC_H
 #define HW_FDC_H
 
-#include "exec/hwaddr.h"
 #include "qapi/qapi-types-block.h"
 #include "hw/sysbus.h"
 
@@ -12,8 +11,6 @@
 
 void isa_fdc_init_drives(ISADevice *fdc, DriveInfo **fds);
 void sysbus_fdc_init_drives(SysBusDevice *dev, DriveInfo **fds);
-void fdctrl_init_sysbus(qemu_irq irq, int dma_chann,
-hwaddr mmio_base, DriveInfo **fds);
 
 FloppyDriveType isa_fdc_get_drive_type(ISADevice *fdc, int i);
 int cmos_get_fd_drive_type(FloppyDriveType fd0);
diff --git a/hw/block/fdc-sysbus.c b/hw/block/fdc-sysbus.c
index 74c7c8f2e01..5c7e49bcc3f 100644
--- a/hw/block/fdc-sysbus.c
+++ b/hw/block/fdc-sysbus.c
@@ -103,22 +103,6 @@ void sysbus_fdc_init_drives(SysBusDevice *dev, DriveInfo 
**fds)
 fdctrl_init_drives(>state.bus, fds);
 }
 
-void fdctrl_init_sysbus(qemu_irq irq, int dma_chann,
-hwaddr mmio_base, DriveInfo **fds)
-{
-DeviceState *dev;
-SysBusDevice *sbd;
-
-dev = qdev_new("sysbus-fdc");
-qdev_prop_set_int32(dev, "dma-channel", dma_chann);
-sbd = SYS_BUS_DEVICE(dev);
-sysbus_realize_and_unref(sbd, _fatal);
-sysbus_connect_irq(sbd, 0, irq);
-sysbus_mmio_map(sbd, 0, mmio_base);
-
-sysbus_fdc_init_drives(sbd, fds);
-}
-
 static void sysbus_fdc_common_initfn(Object *obj)
 {
 DeviceState *dev = DEVICE(obj);
diff --git a/hw/mips/jazz.c b/hw/mips/jazz.c
index 1a0888a0fd5..34fb6a3de9b 100644
--- a/hw/mips/jazz.c
+++ b/hw/mips/jazz.c
@@ -144,6 +144,22 @@ static void mips_jazz_do_transaction_failed(CPUState *cs, 
hwaddr physaddr,
 }
 #endif /* CONFIG_TCG && !CONFIG_USER_ONLY */
 
+static void fdctrl_init_sysbus(qemu_irq irq, int dma_chann,
+   hwaddr mmio_base, DriveInfo **fds)
+{
+DeviceState *dev;
+SysBusDevice *sbd;
+
+dev = qdev_new("sysbus-fdc");
+qdev_prop_set_int32(dev, "dma-channel", dma_chann);
+sbd = SYS_BUS_DEVICE(dev);
+sysbus_realize_and_unref(sbd, _fatal);
+sysbus_connect_irq(sbd, 0, irq);
+sysbus_mmio_map(sbd, 0, mmio_base);
+
+sysbus_fdc_init_drives(sbd, fds);
+}
+
 static void mips_jazz_init(MachineState *machine,
enum jazz_model_e jazz_model)
 {
-- 
2.26.3




[PATCH v2 4/8] hw/block/fdc: Extract SysBus floppy controllers to fdc-sysbus.c

2021-04-28 Thread Philippe Mathieu-Daudé
Some machines use floppy controllers via the SysBus interface,
and don't need to pull in all the SysBus code.
Extract the SysBus specific code to a new unit: fdc-sysbus.c,
and add a new Kconfig symbol: "FDC_SYSBUS".

Reviewed-by: John Snow 
Acked-by: Mark Cave-Ayland 
Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/block/fdc-sysbus.c | 252 ++
 hw/block/fdc.c| 220 
 MAINTAINERS   |   1 +
 hw/block/Kconfig  |   4 +
 hw/block/meson.build  |   1 +
 hw/block/trace-events |   2 +
 hw/mips/Kconfig   |   2 +-
 hw/sparc/Kconfig  |   2 +-
 8 files changed, 262 insertions(+), 222 deletions(-)
 create mode 100644 hw/block/fdc-sysbus.c

diff --git a/hw/block/fdc-sysbus.c b/hw/block/fdc-sysbus.c
new file mode 100644
index 000..71755fd6ae4
--- /dev/null
+++ b/hw/block/fdc-sysbus.c
@@ -0,0 +1,252 @@
+/*
+ * QEMU Floppy disk emulator (Intel 82078)
+ *
+ * Copyright (c) 2003, 2007 Jocelyn Mayer
+ * Copyright (c) 2008 Hervé Poussineau
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to 
deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "qom/object.h"
+#include "hw/sysbus.h"
+#include "hw/block/fdc.h"
+#include "migration/vmstate.h"
+#include "fdc-internal.h"
+#include "trace.h"
+
+#define TYPE_SYSBUS_FDC "base-sysbus-fdc"
+typedef struct FDCtrlSysBusClass FDCtrlSysBusClass;
+typedef struct FDCtrlSysBus FDCtrlSysBus;
+DECLARE_OBJ_CHECKERS(FDCtrlSysBus, FDCtrlSysBusClass,
+ SYSBUS_FDC, TYPE_SYSBUS_FDC)
+
+struct FDCtrlSysBusClass {
+/*< private >*/
+SysBusDeviceClass parent_class;
+/*< public >*/
+
+bool use_strict_io;
+};
+
+struct FDCtrlSysBus {
+/*< private >*/
+SysBusDevice parent_obj;
+/*< public >*/
+
+struct FDCtrl state;
+};
+
+static uint64_t fdctrl_read_mem(void *opaque, hwaddr reg, unsigned ize)
+{
+return fdctrl_read(opaque, (uint32_t)reg);
+}
+
+static void fdctrl_write_mem(void *opaque, hwaddr reg,
+ uint64_t value, unsigned size)
+{
+fdctrl_write(opaque, (uint32_t)reg, value);
+}
+
+static const MemoryRegionOps fdctrl_mem_ops = {
+.read = fdctrl_read_mem,
+.write = fdctrl_write_mem,
+.endianness = DEVICE_NATIVE_ENDIAN,
+};
+
+static const MemoryRegionOps fdctrl_mem_strict_ops = {
+.read = fdctrl_read_mem,
+.write = fdctrl_write_mem,
+.endianness = DEVICE_NATIVE_ENDIAN,
+.valid = {
+.min_access_size = 1,
+.max_access_size = 1,
+},
+};
+
+static void fdctrl_external_reset_sysbus(DeviceState *d)
+{
+FDCtrlSysBus *sys = SYSBUS_FDC(d);
+FDCtrl *s = >state;
+
+fdctrl_reset(s, 0);
+}
+
+static void fdctrl_handle_tc(void *opaque, int irq, int level)
+{
+trace_fdctrl_tc_pulse(level);
+}
+
+void fdctrl_init_sysbus(qemu_irq irq, int dma_chann,
+hwaddr mmio_base, DriveInfo **fds)
+{
+FDCtrl *fdctrl;
+DeviceState *dev;
+SysBusDevice *sbd;
+FDCtrlSysBus *sys;
+
+dev = qdev_new("sysbus-fdc");
+sys = SYSBUS_FDC(dev);
+fdctrl = >state;
+fdctrl->dma_chann = dma_chann; /* FIXME */
+sbd = SYS_BUS_DEVICE(dev);
+sysbus_realize_and_unref(sbd, _fatal);
+sysbus_connect_irq(sbd, 0, irq);
+sysbus_mmio_map(sbd, 0, mmio_base);
+
+fdctrl_init_drives(>state.bus, fds);
+}
+
+void sun4m_fdctrl_init(qemu_irq irq, hwaddr io_base,
+   DriveInfo **fds, qemu_irq *fdc_tc)
+{
+DeviceState *dev;
+FDCtrlSysBus *sys;
+
+dev = qdev_new("sun-fdtwo");
+sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), _fatal);
+sys = SYSBUS_FDC(dev);
+sysbus_connect_irq(SYS_BUS_DEVICE(sys), 0, irq);
+sysbus_mmio_map(SYS_BUS_DEVICE(sys), 0, io_base);
+*fdc_tc = qdev_get_gpio_in(dev, 0);
+
+fdctrl_init_drives(>state.bus, fds);
+}
+
+static void sysbus_fdc_common_initfn(Object *obj)
+{
+DeviceState *dev = DEVICE(obj);
+FDCtrlSysBusClass *sbdc = 

[PATCH v2 3/8] hw/block/fdc: Extract ISA floppy controllers to fdc-isa.c

2021-04-28 Thread Philippe Mathieu-Daudé
Some machines use floppy controllers via the SysBus interface,
and don't need to pull in all the ISA code.
Extract the ISA specific code to a new unit: fdc-isa.c, and
add a new Kconfig symbol: "FDC_ISA".

Reviewed-by: John Snow 
Acked-by: Mark Cave-Ayland 
Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/block/fdc-isa.c   | 313 +++
 hw/block/fdc.c   | 257 ---
 MAINTAINERS  |   1 +
 hw/block/Kconfig |   4 +
 hw/block/meson.build |   1 +
 hw/i386/Kconfig  |   2 +-
 hw/isa/Kconfig   |   6 +-
 hw/sparc64/Kconfig   |   2 +-
 8 files changed, 324 insertions(+), 262 deletions(-)
 create mode 100644 hw/block/fdc-isa.c

diff --git a/hw/block/fdc-isa.c b/hw/block/fdc-isa.c
new file mode 100644
index 000..97f3f9e5c0a
--- /dev/null
+++ b/hw/block/fdc-isa.c
@@ -0,0 +1,313 @@
+/*
+ * QEMU Floppy disk emulator (Intel 82078)
+ *
+ * Copyright (c) 2003, 2007 Jocelyn Mayer
+ * Copyright (c) 2008 Hervé Poussineau
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to 
deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+/*
+ * The controller is used in Sun4m systems in a slightly different
+ * way. There are changes in DOR register and DMA is not available.
+ */
+
+#include "qemu/osdep.h"
+#include "hw/block/fdc.h"
+#include "qapi/error.h"
+#include "qemu/error-report.h"
+#include "qemu/timer.h"
+#include "hw/acpi/aml-build.h"
+#include "hw/irq.h"
+#include "hw/isa/isa.h"
+#include "hw/qdev-properties.h"
+#include "hw/qdev-properties-system.h"
+#include "migration/vmstate.h"
+#include "hw/block/block.h"
+#include "sysemu/block-backend.h"
+#include "sysemu/blockdev.h"
+#include "sysemu/sysemu.h"
+#include "qemu/log.h"
+#include "qemu/main-loop.h"
+#include "qemu/module.h"
+#include "trace.h"
+#include "qom/object.h"
+#include "fdc-internal.h"
+
+OBJECT_DECLARE_SIMPLE_TYPE(FDCtrlISABus, ISA_FDC)
+
+struct FDCtrlISABus {
+ISADevice parent_obj;
+
+uint32_t iobase;
+uint32_t irq;
+uint32_t dma;
+struct FDCtrl state;
+int32_t bootindexA;
+int32_t bootindexB;
+};
+
+static void fdctrl_external_reset_isa(DeviceState *d)
+{
+FDCtrlISABus *isa = ISA_FDC(d);
+FDCtrl *s = >state;
+
+fdctrl_reset(s, 0);
+}
+
+void isa_fdc_init_drives(ISADevice *fdc, DriveInfo **fds)
+{
+fdctrl_init_drives(_FDC(fdc)->state.bus, fds);
+}
+
+static const MemoryRegionPortio fdc_portio_list[] = {
+{ 1, 5, 1, .read = fdctrl_read, .write = fdctrl_write },
+{ 7, 1, 1, .read = fdctrl_read, .write = fdctrl_write },
+PORTIO_END_OF_LIST(),
+};
+
+static void isabus_fdc_realize(DeviceState *dev, Error **errp)
+{
+ISADevice *isadev = ISA_DEVICE(dev);
+FDCtrlISABus *isa = ISA_FDC(dev);
+FDCtrl *fdctrl = >state;
+Error *err = NULL;
+
+isa_register_portio_list(isadev, >portio_list,
+ isa->iobase, fdc_portio_list, fdctrl,
+ "fdc");
+
+isa_init_irq(isadev, >irq, isa->irq);
+fdctrl->dma_chann = isa->dma;
+if (fdctrl->dma_chann != -1) {
+fdctrl->dma = isa_get_dma(isa_bus_from_device(isadev), isa->dma);
+if (!fdctrl->dma) {
+error_setg(errp, "ISA controller does not support DMA");
+return;
+}
+}
+
+qdev_set_legacy_instance_id(dev, isa->iobase, 2);
+
+fdctrl_realize_common(dev, fdctrl, );
+if (err != NULL) {
+error_propagate(errp, err);
+return;
+}
+}
+
+FloppyDriveType isa_fdc_get_drive_type(ISADevice *fdc, int i)
+{
+FDCtrlISABus *isa = ISA_FDC(fdc);
+
+return isa->state.drives[i].drive;
+}
+
+static void isa_fdc_get_drive_max_chs(FloppyDriveType type, uint8_t *maxc,
+  uint8_t *maxh, uint8_t *maxs)
+{
+const FDFormat *fdf;
+
+*maxc = *maxh = *maxs = 0;
+for (fdf = fd_formats; fdf->drive != FLOPPY_DRIVE_TYPE_NONE; fdf++) {
+if (fdf->drive != type) {
+continue;
+}
+if (*maxc < 

[PATCH v2 2/8] hw/block/fdc: Declare shared prototypes in fdc-internal.h

2021-04-28 Thread Philippe Mathieu-Daudé
We want to extract ISA/SysBus code from the generic fdc.c file.
First, declare the prototypes we will access from the new units
into a new local header: "fdc-internal.h".

Acked-by: Mark Cave-Ayland 
Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/block/fdc-internal.h | 156 
 hw/block/fdc.c  | 126 +++-
 MAINTAINERS |   1 +
 3 files changed, 165 insertions(+), 118 deletions(-)
 create mode 100644 hw/block/fdc-internal.h

diff --git a/hw/block/fdc-internal.h b/hw/block/fdc-internal.h
new file mode 100644
index 000..278de725e69
--- /dev/null
+++ b/hw/block/fdc-internal.h
@@ -0,0 +1,156 @@
+/*
+ * QEMU Floppy disk emulator (Intel 82078)
+ *
+ * Copyright (c) 2003, 2007 Jocelyn Mayer
+ * Copyright (c) 2008 Hervé Poussineau
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to 
deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+#ifndef HW_BLOCK_FDC_INTERNAL_H
+#define HW_BLOCK_FDC_INTERNAL_H
+
+#include "exec/memory.h"
+#include "exec/ioport.h"
+#include "hw/block/block.h"
+#include "hw/block/fdc.h"
+#include "qapi/qapi-types-block.h"
+
+typedef struct FDCtrl FDCtrl;
+
+/* Floppy bus emulation */
+
+typedef struct FloppyBus {
+BusState bus;
+FDCtrl *fdc;
+} FloppyBus;
+
+/* Floppy disk drive emulation */
+
+typedef enum FDriveRate {
+FDRIVE_RATE_500K = 0x00,  /* 500 Kbps */
+FDRIVE_RATE_300K = 0x01,  /* 300 Kbps */
+FDRIVE_RATE_250K = 0x02,  /* 250 Kbps */
+FDRIVE_RATE_1M   = 0x03,  /*   1 Mbps */
+} FDriveRate;
+
+typedef enum FDriveSize {
+FDRIVE_SIZE_UNKNOWN,
+FDRIVE_SIZE_350,
+FDRIVE_SIZE_525,
+} FDriveSize;
+
+typedef struct FDFormat {
+FloppyDriveType drive;
+uint8_t last_sect;
+uint8_t max_track;
+uint8_t max_head;
+FDriveRate rate;
+} FDFormat;
+
+typedef enum FDiskFlags {
+FDISK_DBL_SIDES  = 0x01,
+} FDiskFlags;
+
+typedef struct FDrive {
+FDCtrl *fdctrl;
+BlockBackend *blk;
+BlockConf *conf;
+/* Drive status */
+FloppyDriveType drive;/* CMOS drive type*/
+uint8_t perpendicular;/* 2.88 MB access mode*/
+/* Position */
+uint8_t head;
+uint8_t track;
+uint8_t sect;
+/* Media */
+FloppyDriveType disk; /* Current disk type  */
+FDiskFlags flags;
+uint8_t last_sect;/* Nb sector per track*/
+uint8_t max_track;/* Nb of tracks   */
+uint16_t bps; /* Bytes per sector   */
+uint8_t ro;   /* Is read-only   */
+uint8_t media_changed;/* Is media changed   */
+uint8_t media_rate;   /* Data rate of medium*/
+
+bool media_validated; /* Have we validated the media? */
+} FDrive;
+
+struct FDCtrl {
+MemoryRegion iomem;
+qemu_irq irq;
+/* Controller state */
+QEMUTimer *result_timer;
+int dma_chann;
+uint8_t phase;
+IsaDma *dma;
+/* Controller's identification */
+uint8_t version;
+/* HW */
+uint8_t sra;
+uint8_t srb;
+uint8_t dor;
+uint8_t dor_vmstate; /* only used as temp during vmstate */
+uint8_t tdr;
+uint8_t dsr;
+uint8_t msr;
+uint8_t cur_drv;
+uint8_t status0;
+uint8_t status1;
+uint8_t status2;
+/* Command FIFO */
+uint8_t *fifo;
+int32_t fifo_size;
+uint32_t data_pos;
+uint32_t data_len;
+uint8_t data_state;
+uint8_t data_dir;
+uint8_t eot; /* last wanted sector */
+/* States kept only to be returned back */
+/* precompensation */
+uint8_t precomp_trk;
+uint8_t config;
+uint8_t lock;
+/* Power down config (also with status regB access mode */
+uint8_t pwrd;
+/* Floppy drives */
+FloppyBus bus;
+uint8_t num_floppies;
+FDrive drives[MAX_FD];
+struct {
+FloppyDriveType type;
+} qdev_for_drives[MAX_FD];
+int reset_sensei;
+FloppyDriveType fallback; /* type=auto failure fallback */
+/* Timers state */
+uint8_t 

[PATCH v2 0/8] hw/block/fdc: Allow Kconfig-selecting ISA bus/SysBus floppy controllers

2021-04-28 Thread Philippe Mathieu-Daudé
Hi,

The floppy disc controllers pulls in irrelevant devices (sysbus in
an ISA-only machine, ISA bus + isa devices on a sysbus-only machine).

This series clean that by extracting each device in its own file,
adding the corresponding Kconfig symbols: FDC_ISA and FDC_SYSBUS.

Since v1:
- added missing "hw/block/block.h" header (jsnow)
- inlined hardware specific calls (Mark)
- added R-b/A-b tags

Missing review: 5-8 (1-4 can be queued meanwhile)

Regards,

Phil.

Philippe Mathieu-Daudé (8):
  hw/block/fdc: Replace disabled fprintf() by trace event
  hw/block/fdc: Declare shared prototypes in fdc-internal.h
  hw/block/fdc: Extract ISA floppy controllers to fdc-isa.c
  hw/block/fdc: Extract SysBus floppy controllers to fdc-sysbus.c
  hw/block/fdc: Add sysbus_fdc_init_drives() method
  hw/sparc/sun4m: Inline sun4m_fdctrl_init()
  hw/block/fdc-sysbus: Add 'dma-channel' property
  hw/mips/jazz: Inline fdctrl_init_sysbus()

 hw/block/fdc-internal.h | 156 +++
 include/hw/block/fdc.h  |   7 +-
 hw/block/fdc-isa.c  | 313 +
 hw/block/fdc-sysbus.c   | 224 +++
 hw/block/fdc.c  | 608 +---
 hw/mips/jazz.c  |  16 ++
 hw/sparc/sun4m.c|  16 ++
 MAINTAINERS |   3 +
 hw/block/Kconfig|   8 +
 hw/block/meson.build|   2 +
 hw/block/trace-events   |   3 +
 hw/i386/Kconfig |   2 +-
 hw/isa/Kconfig  |   6 +-
 hw/mips/Kconfig |   2 +-
 hw/sparc/Kconfig|   2 +-
 hw/sparc64/Kconfig  |   2 +-
 16 files changed, 758 insertions(+), 612 deletions(-)
 create mode 100644 hw/block/fdc-internal.h
 create mode 100644 hw/block/fdc-isa.c
 create mode 100644 hw/block/fdc-sysbus.c

-- 
2.26.3





Re: [PATCH 0/4] hw/block/fdc: Allow Kconfig-selecting ISA bus/SysBus floppy controllers

2021-04-28 Thread Philippe Mathieu-Daudé
On 4/28/21 1:25 PM, Philippe Mathieu-Daudé wrote:
> On 4/28/21 9:57 AM, Mark Cave-Ayland wrote:

>> This basically looks good to me. You may be able to simplify this
>> further by removing the global legacy init functions
>> fdctrl_init_sysbus() and sun4m_fdctrl_init(): from what I can see
>> fdctrl_init_sysbus() is only used in hw/mips/jazz.c and
>> sun4m_fdctrl_init() is only used in hw/sparc/sun4m.c so you might as
>> well inline them or move the functions to the relevant files.
> 
> But both use FDCtrlSysBus which this series makes local to
> hw/block/fdc-sysbus.c, and use fdctrl_init_drives() which is declared in
> hw/block/fdc-internal.h, and use FloppyBus (also declared there).
> 
> Apparently they do that to initialize the fdctrl->dma_chann field...
> Which should be a property, but FDCtrl isn't QOM... So not that
> easy, it requires more work.

Hmm FDCtrl doesn't have to be a QOM object, it is a simple embedded
structure. It should be doable then.




Re: [PATCH 4/4] hw/block/fdc: Extract SysBus floppy controllers to fdc-sysbus.c

2021-04-28 Thread Philippe Mathieu-Daudé
On 4/27/21 7:34 PM, John Snow wrote:
> On 4/15/21 6:23 AM, Philippe Mathieu-Daudé wrote:
>> Some machines use floppy controllers via the SysBus interface,
>> and don't need to pull in all the SysBus code.
>> Extract the SysBus specific code to a new unit: fdc-sysbus.c,
>> and add a new Kconfig symbol: "FDC_SYSBUS".
>>
>> Signed-off-by: Philippe Mathieu-Daudé 
> 
> LGTM, but again you'll want someone to review the Kconfig changes. It
> looks reasonable to me at a glance, but I just don't know what I don't
> know there.
> 
> I'm trusting you somewhat that you've audited the proper dependencies
> for each subsystem and that these have been accurately described via
> Kconfig -- my knowledge of non-x86 platforms is a bit meager, so I am
> relying on CI to tell me if this breaks anything I care about.

The way I test these Kconfig changes is configuring with
--without-default-devices, and build/test for each machine doing:

$ echo $MACHINE=y > default-configs/devices/$arch-softmmu.mak

(I overwrite because currently we don't support having a base
config different than the default-configs one).

For example for the SPARCbook machine:

$ echo CONFIG_SUN4M=y > default-configs/devices/sparc-softmmu.mak
$ ninja qemu-system-sparc
$ qemu-system-sparc -M SPARCbook -S

Or for the Jazz machine:

$ echo CONFIG_JAZZ=y > default-configs/devices/mips64el-softmmu.mak
$ echo CONFIG_SEMIHOSTING=y >> default-configs/devices/mips64el-softmmu.mak
$ ninja qemu-system-mips64el
$ ./qemu-system-mips64el -M pica61 -S

(The CONFIG_SEMIHOSTING is a particular kludge pending another series)

But unfortunately there are predating bugs breaking this testing.

I'll add this information in the respin's cover.

> Would love to get an ACK from Mark Cave-Ayland and Hervé Poussineau if
> possible, but if they're not available to take a quick peek, we'll try
> to get this in closer to the beginning of the dev window to maximize
> problem-finding time.

The sun4m maintainer is Artyom.

> Reviewed-by: John Snow 

Thanks!

> 
>> ---
>>   hw/block/fdc-sysbus.c | 252 ++
>>   hw/block/fdc.c    | 220 
>>   MAINTAINERS   |   1 +
>>   hw/block/Kconfig  |   4 +
>>   hw/block/meson.build  |   1 +
>>   hw/block/trace-events |   2 +
>>   hw/mips/Kconfig   |   2 +-
>>   hw/sparc/Kconfig  |   2 +-
>>   8 files changed, 262 insertions(+), 222 deletions(-)
>>   create mode 100644 hw/block/fdc-sysbus.c




Re: [PATCH v3 00/36] block: update graph permissions update

2021-04-28 Thread Kevin Wolf
Am 17.03.2021 um 15:34 hat Vladimir Sementsov-Ogievskiy geschrieben:
> Hi all!
> 
> Finally, I finished v3. Phew.
> 
> Missed a soft-freeze. Should we consider it a bugfix? There are bugfixes
> here but they are mostly theoretical. So, up to Kevin, should it go to
> current release or to the next..
> 
> The main point of the series is fixing some permission update problems
> (see patches 01-03 as examples), that in turn makes possible more clean
> inserting and removing of filters (see patch 26 where .active field is
> dropped finally from backup-top filter, we don't need a workaround
> anymore).
> 
> The series brings util/transactions.c (patch 10) and use of it in
> block.c, which allows clean block graph change transactions, with
> possibility of reverting all modifications (movement and removement of
> children, changing aio context, changing permissions) in reverse order
> on failure path.
> 
> The series also helps Alberto's "Allow changing bs->file on reopen"
> which we want to merge prior dropping x- prefix from blockdev-reopen
> command.

Ok, I made it through the whole series. Looking quite good, I just had
some minor comments, but maybe not quite trival enough to just fix them
up while applying.

I had comments on patches 1, 4, 8, 18 and 22. With these addressed, you
can add to all patches in the whole series:
Reviewed-by: Kevin Wolf 

v4 should then be the final version. :-)

Kevin




Re: [PATCH 0/4] hw/block/fdc: Allow Kconfig-selecting ISA bus/SysBus floppy controllers

2021-04-28 Thread Philippe Mathieu-Daudé
On 4/28/21 9:57 AM, Mark Cave-Ayland wrote:
> On 15/04/2021 11:23, Philippe Mathieu-Daudé wrote:
> 
>> Hi,
>>
>> The floppy disc controllers pulls in irrelevant devices (sysbus in
>> an ISA-only machine, ISA bus + isa devices on a sysbus-only machine).
>>
>> This series clean that by extracting each device in its own file,
>> adding the corresponding Kconfig symbols: FDC_ISA and FDC_SYSBUS.
>>
>> Regards,
>>
>> Phil.
>>
>> Philippe Mathieu-Daudé (4):
>>    hw/block/fdc: Replace disabled fprintf() by trace event
>>    hw/block/fdc: Declare shared prototypes in fdc-internal.h
>>    hw/block/fdc: Extract ISA floppy controllers to fdc-isa.c
>>    hw/block/fdc: Extract SysBus floppy controllers to fdc-sysbus.c
>>
>>   hw/block/fdc-internal.h | 155 ++
>>   hw/block/fdc-isa.c  | 313 +
>>   hw/block/fdc-sysbus.c   | 252 +
>>   hw/block/fdc.c  | 608 +---
>>   MAINTAINERS |   3 +
>>   hw/block/Kconfig    |   8 +
>>   hw/block/meson.build    |   2 +
>>   hw/block/trace-events   |   3 +
>>   hw/i386/Kconfig |   2 +-
>>   hw/isa/Kconfig  |   6 +-
>>   hw/mips/Kconfig |   2 +-
>>   hw/sparc/Kconfig    |   2 +-
>>   hw/sparc64/Kconfig  |   2 +-
>>   13 files changed, 751 insertions(+), 607 deletions(-)
>>   create mode 100644 hw/block/fdc-internal.h
>>   create mode 100644 hw/block/fdc-isa.c
>>   create mode 100644 hw/block/fdc-sysbus.c
> 
> This basically looks good to me. You may be able to simplify this
> further by removing the global legacy init functions
> fdctrl_init_sysbus() and sun4m_fdctrl_init(): from what I can see
> fdctrl_init_sysbus() is only used in hw/mips/jazz.c and
> sun4m_fdctrl_init() is only used in hw/sparc/sun4m.c so you might as
> well inline them or move the functions to the relevant files.

But both use FDCtrlSysBus which this series makes local to
hw/block/fdc-sysbus.c, and use fdctrl_init_drives() which is declared in
hw/block/fdc-internal.h, and use FloppyBus (also declared there).

Apparently they do that to initialize the fdctrl->dma_chann field...
Which should be a property, but FDCtrl isn't QOM... So not that
easy, it requires more work.

> Acked-by: Mark Cave-Ayland 

Thanks!

Phil.




Re: [PATCH] hw/ide: Fix crash when plugging a piix3-ide device into the x-remote machine

2021-04-28 Thread Philippe Mathieu-Daudé
On 4/28/21 11:32 AM, Thomas Huth wrote:
> On 28/04/2021 11.24, Stefan Hajnoczi wrote:
>> On Fri, Apr 16, 2021 at 02:52:56PM +0200, Thomas Huth wrote:
>>> @@ -158,7 +166,11 @@ static void pci_piix_ide_realize(PCIDevice *dev,
>>> Error **errp)
>>>     vmstate_register(VMSTATE_IF(dev), 0, _ide_pci, d);
>>>   -    pci_piix_init_ports(d);
>>> +    rc = pci_piix_init_ports(d);
>>> +    if (rc) {
>>> +    error_setg_errno(errp, -rc, "Failed to realize %s",
>>> + object_get_typename(OBJECT(dev)));
>>> +    }
>>
>> Is there an error message explaining the reason for the failure
>> somewhere. I can't see one in the patch and imagine users will be
>> puzzled/frustrated by a generic "Failed to realize" error message. Can
>> you make it more meaningful?
> 
> Do you have a suggestion for a better message?

At this point it is hard to do better... Else we'd need to propagate
a meaningful Error* from ide_init_ioport(), and emit something like
"Failed to initialize the ISA bus"?




Re: [PATCH 0/2] block/export: Fix crash on error after iothread conflict

2021-04-28 Thread Stefan Hajnoczi
On Thu, Apr 22, 2021 at 04:53:33PM +0200, Max Reitz wrote:
> Hi,
> 
> By passing the @iothread option to block-export-add, the new export can
> be moved to the given iothread.  This may conflict with an existing
> parent of the node in question.  How this conflict is resolved, depends
> on @fixed-iothread: If that option is true, the error is fatal and
> block-export-add fails.  If it is false, the error is ignored and the
> node stays in its original iothread.
> 
> However, in the implementation, the ignored error is still in *errp, and
> so if a second error occurs afterwards and tries to put something into
> *errp, that will fail an assertion.
> 
> To really ignore the error, we have to free it and clear *errp (with an
> ERRP_GUARD()).
> 
> Patch 1 is the fix, patch 2 a regression test.
> 
> 
> Max Reitz (2):
>   block/export: Free ignored Error
>   iotests/307: Test iothread conflict for exports
> 
>  block/export/export.c  |  4 
>  tests/qemu-iotests/307 | 15 +++
>  tests/qemu-iotests/307.out |  8 
>  3 files changed, 27 insertions(+)
> 
> -- 
> 2.30.2
> 

Thanks for fixing this!

Reviewed-by: Stefan Hajnoczi 


signature.asc
Description: PGP signature


Re: [PATCH] hw/ide: Fix crash when plugging a piix3-ide device into the x-remote machine

2021-04-28 Thread Thomas Huth

On 28/04/2021 11.24, Stefan Hajnoczi wrote:

On Fri, Apr 16, 2021 at 02:52:56PM +0200, Thomas Huth wrote:

@@ -158,7 +166,11 @@ static void pci_piix_ide_realize(PCIDevice *dev, Error 
**errp)
  
  vmstate_register(VMSTATE_IF(dev), 0, _ide_pci, d);
  
-pci_piix_init_ports(d);

+rc = pci_piix_init_ports(d);
+if (rc) {
+error_setg_errno(errp, -rc, "Failed to realize %s",
+ object_get_typename(OBJECT(dev)));
+}


Is there an error message explaining the reason for the failure
somewhere. I can't see one in the patch and imagine users will be
puzzled/frustrated by a generic "Failed to realize" error message. Can
you make it more meaningful?


Do you have a suggestion for a better message?

 Thomas




Re: [PATCH] hw/ide: Fix crash when plugging a piix3-ide device into the x-remote machine

2021-04-28 Thread Stefan Hajnoczi
On Fri, Apr 16, 2021 at 02:52:56PM +0200, Thomas Huth wrote:
> @@ -158,7 +166,11 @@ static void pci_piix_ide_realize(PCIDevice *dev, Error 
> **errp)
>  
>  vmstate_register(VMSTATE_IF(dev), 0, _ide_pci, d);
>  
> -pci_piix_init_ports(d);
> +rc = pci_piix_init_ports(d);
> +if (rc) {
> +error_setg_errno(errp, -rc, "Failed to realize %s",
> + object_get_typename(OBJECT(dev)));
> +}

Is there an error message explaining the reason for the failure
somewhere. I can't see one in the patch and imagine users will be
puzzled/frustrated by a generic "Failed to realize" error message. Can
you make it more meaningful?

Thanks,
Stefan


signature.asc
Description: PGP signature


Re: [PATCH] hw/ide: Fix crash when plugging a piix3-ide device into the x-remote machine

2021-04-28 Thread Stefan Hajnoczi
On Tue, Apr 27, 2021 at 02:02:27PM -0400, John Snow wrote:
> On 4/27/21 1:54 PM, Philippe Mathieu-Daudé wrote:
> > On 4/27/21 7:16 PM, John Snow wrote:
> > > On 4/27/21 9:54 AM, Stefan Hajnoczi wrote:
> > > > I suggest fixing this at the qdev level. Make piix3-ide have a
> > > > sub-device that inherits from ISA_DEVICE so it can only be instantiated
> > > > when there's an ISA bus.
> > > > 
> > > > Stefan
> > > 
> > > My qdev knowledge is shaky. Does this imply that you agree with the
> > > direction of Thomas's patch, or do you just mean to disagree with Phil
> > > on his preferred course of action?
> > 
> > My understanding is a disagreement to both, with a 3rd direction :)
> > 
> > I agree with Stefan direction but I'm not sure (yet) that a sub-device
> > is the best (long-term) solution. I guess there is a design issue with
> > this device, and would like to understanding it first.
> > 
> > IIUC Stefan says the piix3-ide is both a PCI and IDE device, but QOM
> > only allow a single parent. Multiple QOM inheritance is resolved as
> > interfaces, but PCI/IDE qdev aren't interfaces, rather abstract objects.
> > So he suggests to embed an IDE device within the PCI piix3-ide device.
> > 
> > My view is the PIIX is a chipset that share stuffs between components,
> > and the IDE bus belongs to the chipset PCI root (or eventually the
> > PCI-ISA bridge, function #0). The IDE function would use the IDE bus
> > from its root parent as a linked property.
> > My problem is currently this device is user-creatable as a Frankenstein
> > single PCI function, out of its chipset. I'm not sure yet this is a
> > dead end or I could work something out.
> > 
> > Regards,
> > 
> > Phil.
> > 
> 
> It sounds complicated. In the meantime, I think I am favor of taking
> Thomas's patch because it merely adds some error routing to allow us to
> avoid a crash. The core organizational issues of the IDE device(s) will
> remain and can be solved later as needed.

The approach in this patch is okay but we should keep in mind it only
solves piix3-ide. ISA provides a non-qdev backdoor API and there may be
more instances of this type of bug.

A qdev fix would address the root cause and make it possible to drop the
backdoor API, but that's probably too much work for little benefit.

Stefan


signature.asc
Description: PGP signature


Re: [PATCH] hw/ide: Fix crash when plugging a piix3-ide device into the x-remote machine

2021-04-28 Thread Stefan Hajnoczi
On Tue, Apr 27, 2021 at 07:54:21PM +0200, Philippe Mathieu-Daudé wrote:
> On 4/27/21 7:16 PM, John Snow wrote:
> > On 4/27/21 9:54 AM, Stefan Hajnoczi wrote:
> >> I suggest fixing this at the qdev level. Make piix3-ide have a
> >> sub-device that inherits from ISA_DEVICE so it can only be instantiated
> >> when there's an ISA bus.
> >>
> >> Stefan
> > 
> > My qdev knowledge is shaky. Does this imply that you agree with the
> > direction of Thomas's patch, or do you just mean to disagree with Phil
> > on his preferred course of action?
> 
> My understanding is a disagreement to both, with a 3rd direction :)
> 
> I agree with Stefan direction but I'm not sure (yet) that a sub-device
> is the best (long-term) solution. I guess there is a design issue with
> this device, and would like to understanding it first.
> 
> IIUC Stefan says the piix3-ide is both a PCI and IDE device, but QOM
> only allow a single parent. Multiple QOM inheritance is resolved as
> interfaces, but PCI/IDE qdev aren't interfaces, rather abstract objects.
> So he suggests to embed an IDE device within the PCI piix3-ide device.
> 
> My view is the PIIX is a chipset that share stuffs between components,
> and the IDE bus belongs to the chipset PCI root (or eventually the
> PCI-ISA bridge, function #0). The IDE function would use the IDE bus
> from its root parent as a linked property.
> My problem is currently this device is user-creatable as a Frankenstein
> single PCI function, out of its chipset. I'm not sure yet this is a
> dead end or I could work something out.

Kevin and Paolo previously pointed out that piix3-ide is sometimes used
with the Q35 machine type. The user-creatable piix3-ide device needs to
be deprecated before it can be dropped. That's a long process that
cannot fix the current crash any time soon.

I do support deprecating the user-creatable piix3-ide device in favor of
a proper Q35 Legacy IDE implementation. The main problem is this
involves a bunch of work and I'm not sure who would do it (the payoff is
not very high).

Stefan


signature.asc
Description: PGP signature


Re: [PATCH v3 15/33] nbd/client-connection: use QEMU_LOCK_GUARD

2021-04-28 Thread Vladimir Sementsov-Ogievskiy

28.04.2021 09:08, Roman Kagan wrote:

On Fri, Apr 16, 2021 at 11:08:53AM +0300, Vladimir Sementsov-Ogievskiy wrote:

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  nbd/client-connection.c | 94 ++---
  1 file changed, 42 insertions(+), 52 deletions(-)

diff --git a/nbd/client-connection.c b/nbd/client-connection.c
index 4e39a5b1af..b45a0bd5f6 100644
--- a/nbd/client-connection.c
+++ b/nbd/client-connection.c
@@ -87,17 +87,16 @@ static void *connect_thread_func(void *opaque)
  conn->sioc = NULL;
  }
  
-qemu_mutex_lock(>mutex);

-
-assert(conn->running);
-conn->running = false;
-if (conn->wait_co) {
-aio_co_schedule(NULL, conn->wait_co);
-conn->wait_co = NULL;
+WITH_QEMU_LOCK_GUARD(>mutex) {
+assert(conn->running);
+conn->running = false;
+if (conn->wait_co) {
+aio_co_schedule(NULL, conn->wait_co);
+conn->wait_co = NULL;
+}
  }
  do_free = conn->detached;


->detached is now accessed outside the mutex


Oops. Will fix.



  
-qemu_mutex_unlock(>mutex);
  
  if (do_free) {

  nbd_client_connection_do_free(conn);
@@ -136,61 +135,54 @@ void nbd_client_connection_release(NBDClientConnection 
*conn)
  QIOChannelSocket *coroutine_fn
  nbd_co_establish_connection(NBDClientConnection *conn, Error **errp)
  {
-QIOChannelSocket *sioc = NULL;
  QemuThread thread;
  
-qemu_mutex_lock(>mutex);

-
-/*
- * Don't call nbd_co_establish_connection() in several coroutines in
- * parallel. Only one call at once is supported.
- */
-assert(!conn->wait_co);
-
-if (!conn->running) {
-if (conn->sioc) {
-/* Previous attempt finally succeeded in background */
-sioc = g_steal_pointer(>sioc);
-qemu_mutex_unlock(>mutex);
-
-return sioc;
+WITH_QEMU_LOCK_GUARD(>mutex) {
+/*
+ * Don't call nbd_co_establish_connection() in several coroutines in
+ * parallel. Only one call at once is supported.
+ */
+assert(!conn->wait_co);
+
+if (!conn->running) {
+if (conn->sioc) {
+/* Previous attempt finally succeeded in background */
+return g_steal_pointer(>sioc);
+}
+
+conn->running = true;
+error_free(conn->err);
+conn->err = NULL;
+qemu_thread_create(, "nbd-connect",
+   connect_thread_func, conn, 
QEMU_THREAD_DETACHED);
  }
  
-conn->running = true;

-error_free(conn->err);
-conn->err = NULL;
-qemu_thread_create(, "nbd-connect",
-   connect_thread_func, conn, QEMU_THREAD_DETACHED);
+conn->wait_co = qemu_coroutine_self();
  }
  
-conn->wait_co = qemu_coroutine_self();

-
-qemu_mutex_unlock(>mutex);
-
  /*
   * We are going to wait for connect-thread finish, but
   * nbd_co_establish_connection_cancel() can interrupt.
   */
  qemu_coroutine_yield();
  
-qemu_mutex_lock(>mutex);

-
-if (conn->running) {
-/*
- * Obviously, drained section wants to start. Report the attempt as
- * failed. Still connect thread is executing in background, and its
- * result may be used for next connection attempt.
- */
-error_setg(errp, "Connection attempt cancelled by other operation");
-} else {
-error_propagate(errp, conn->err);
-conn->err = NULL;
-sioc = g_steal_pointer(>sioc);
+WITH_QEMU_LOCK_GUARD(>mutex) {
+if (conn->running) {
+/*
+ * Obviously, drained section wants to start. Report the attempt as
+ * failed. Still connect thread is executing in background, and its
+ * result may be used for next connection attempt.
+ */
+error_setg(errp, "Connection attempt cancelled by other 
operation");
+return NULL;
+} else {
+error_propagate(errp, conn->err);
+conn->err = NULL;
+return g_steal_pointer(>sioc);
+}
  }
  
-qemu_mutex_unlock(>mutex);

-
-return sioc;
+abort(); /* unreachable */
  }
  
  /*

@@ -201,12 +193,10 @@ nbd_co_establish_connection(NBDClientConnection *conn, 
Error **errp)
   */
  void coroutine_fn nbd_co_establish_connection_cancel(NBDClientConnection 
*conn)
  {
-qemu_mutex_lock(>mutex);
+QEMU_LOCK_GUARD(>mutex);
  
  if (conn->wait_co) {

  aio_co_schedule(NULL, conn->wait_co);
  conn->wait_co = NULL;
  }
-
-qemu_mutex_unlock(>mutex);
  }
--
2.29.2




--
Best regards,
Vladimir



Re: [PATCH v3 14/33] nbd: move connection code from block/nbd to nbd/client-connection

2021-04-28 Thread Vladimir Sementsov-Ogievskiy

28.04.2021 01:45, Roman Kagan wrote:

On Fri, Apr 16, 2021 at 11:08:52AM +0300, Vladimir Sementsov-Ogievskiy wrote:

We now have bs-independent connection API, which consists of four
functions:

   nbd_client_connection_new()
   nbd_client_connection_unref()
   nbd_co_establish_connection()
   nbd_co_establish_connection_cancel()

Move them to a separate file together with NBDClientConnection
structure which becomes private to the new API.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  include/block/nbd.h |  11 +++
  block/nbd.c | 187 ---
  nbd/client-connection.c | 212 
  nbd/meson.build |   1 +
  4 files changed, 224 insertions(+), 187 deletions(-)
  create mode 100644 nbd/client-connection.c

diff --git a/include/block/nbd.h b/include/block/nbd.h
index 5f34d23bb0..57381be76f 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -406,4 +406,15 @@ const char *nbd_info_lookup(uint16_t info);
  const char *nbd_cmd_lookup(uint16_t info);
  const char *nbd_err_lookup(int err);
  
+/* nbd/client-connection.c */

+typedef struct NBDClientConnection NBDClientConnection;
+
+NBDClientConnection *nbd_client_connection_new(const SocketAddress *saddr);
+void nbd_client_connection_release(NBDClientConnection *conn);
+
+QIOChannelSocket *coroutine_fn
+nbd_co_establish_connection(NBDClientConnection *conn, Error **errp);
+
+void coroutine_fn nbd_co_establish_connection_cancel(NBDClientConnection 
*conn);
+
  #endif
diff --git a/block/nbd.c b/block/nbd.c
index 8531d019b2..9bd68dcf10 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -66,24 +66,6 @@ typedef enum NBDClientState {
  NBD_CLIENT_QUIT
  } NBDClientState;
  
-typedef struct NBDClientConnection {

-/* Initialization constants */
-SocketAddress *saddr; /* address to connect to */
-
-/*
- * Result of last attempt. Valid in FAIL and SUCCESS states.
- * If you want to steal error, don't forget to set pointer to NULL.
- */
-QIOChannelSocket *sioc;
-Error *err;
-
-QemuMutex mutex;
-/* All further fields are protected by mutex */
-bool running; /* thread is running now */
-bool detached; /* thread is detached and should cleanup the state */
-Coroutine *wait_co; /* nbd_co_establish_connection() wait in yield() */
-} NBDClientConnection;
-
  typedef struct BDRVNBDState {
  QIOChannelSocket *sioc; /* The master data channel */
  QIOChannel *ioc; /* The current I/O channel which may differ (eg TLS) */
@@ -118,12 +100,8 @@ typedef struct BDRVNBDState {
  NBDClientConnection *conn;
  } BDRVNBDState;
  
-static void nbd_client_connection_release(NBDClientConnection *conn);

  static int nbd_establish_connection(BlockDriverState *bs, SocketAddress 
*saddr,
  Error **errp);
-static coroutine_fn QIOChannelSocket *
-nbd_co_establish_connection(NBDClientConnection *conn, Error **errp);
-static void nbd_co_establish_connection_cancel(NBDClientConnection *conn);
  static int nbd_client_handshake(BlockDriverState *bs, Error **errp);
  static void nbd_yank(void *opaque);
  
@@ -340,171 +318,6 @@ static bool nbd_client_connecting_wait(BDRVNBDState *s)

  return qatomic_load_acquire(>state) == NBD_CLIENT_CONNECTING_WAIT;
  }
  
-static NBDClientConnection *

-nbd_client_connection_new(const SocketAddress *saddr)
-{
-NBDClientConnection *conn = g_new(NBDClientConnection, 1);
-
-*conn = (NBDClientConnection) {
-.saddr = QAPI_CLONE(SocketAddress, saddr),
-};
-
-qemu_mutex_init(>mutex);
-
-return conn;
-}
-
-static void nbd_client_connection_do_free(NBDClientConnection *conn)
-{
-if (conn->sioc) {
-qio_channel_close(QIO_CHANNEL(conn->sioc), NULL);
-object_unref(OBJECT(conn->sioc));
-}
-error_free(conn->err);
-qapi_free_SocketAddress(conn->saddr);
-g_free(conn);
-}
-
-static void *connect_thread_func(void *opaque)
-{
-NBDClientConnection *conn = opaque;
-bool do_free;
-int ret;
-
-conn->sioc = qio_channel_socket_new();
-
-error_free(conn->err);
-conn->err = NULL;
-ret = qio_channel_socket_connect_sync(conn->sioc, conn->saddr, >err);
-if (ret < 0) {
-object_unref(OBJECT(conn->sioc));
-conn->sioc = NULL;
-}
-
-qemu_mutex_lock(>mutex);
-
-assert(conn->running);
-conn->running = false;
-if (conn->wait_co) {
-aio_co_schedule(NULL, conn->wait_co);
-conn->wait_co = NULL;
-}
-do_free = conn->detached;
-
-qemu_mutex_unlock(>mutex);
-
-if (do_free) {
-nbd_client_connection_do_free(conn);
-}
-
-return NULL;
-}
-
-static void nbd_client_connection_release(NBDClientConnection *conn)
-{
-bool do_free;
-
-if (!conn) {
-return;
-}
-
-qemu_mutex_lock(>mutex);
-if (conn->running) {
-conn->detached = true;
-}
-do_free = !conn->running && !conn->detached;
-

Re: [PATCH v3 13/33] block/nbd: introduce nbd_client_connection_release()

2021-04-28 Thread Vladimir Sementsov-Ogievskiy

28.04.2021 01:35, Roman Kagan wrote:

On Fri, Apr 16, 2021 at 11:08:51AM +0300, Vladimir Sementsov-Ogievskiy wrote:

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  block/nbd.c | 43 ++-
  1 file changed, 26 insertions(+), 17 deletions(-)

diff --git a/block/nbd.c b/block/nbd.c
index 21a4039359..8531d019b2 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -118,7 +118,7 @@ typedef struct BDRVNBDState {
  NBDClientConnection *conn;
  } BDRVNBDState;
  
-static void nbd_free_connect_thread(NBDClientConnection *conn);

+static void nbd_client_connection_release(NBDClientConnection *conn);
  static int nbd_establish_connection(BlockDriverState *bs, SocketAddress 
*saddr,
  Error **errp);
  static coroutine_fn QIOChannelSocket *
@@ -130,20 +130,9 @@ static void nbd_yank(void *opaque);
  static void nbd_clear_bdrvstate(BlockDriverState *bs)
  {
  BDRVNBDState *s = (BDRVNBDState *)bs->opaque;
-NBDClientConnection *conn = s->conn;
-bool do_free;
-
-qemu_mutex_lock(>mutex);
-if (conn->running) {
-conn->detached = true;
-}
-do_free = !conn->running && !conn->detached;
-qemu_mutex_unlock(>mutex);
  
-/* the runaway thread will clean it up itself */

-if (do_free) {
-nbd_free_connect_thread(conn);
-}
+nbd_client_connection_release(s->conn);
+s->conn = NULL;
  
  yank_unregister_instance(BLOCKDEV_YANK_INSTANCE(bs->node_name));
  
@@ -365,7 +354,7 @@ nbd_client_connection_new(const SocketAddress *saddr)

  return conn;
  }
  
-static void nbd_free_connect_thread(NBDClientConnection *conn)

+static void nbd_client_connection_do_free(NBDClientConnection *conn)
  {
  if (conn->sioc) {
  qio_channel_close(QIO_CHANNEL(conn->sioc), NULL);
@@ -379,8 +368,8 @@ static void nbd_free_connect_thread(NBDClientConnection 
*conn)
  static void *connect_thread_func(void *opaque)
  {
  NBDClientConnection *conn = opaque;
+bool do_free;
  int ret;
-bool do_free = false;
  


This hunk belongs in patch 8.



Agree




  conn->sioc = qio_channel_socket_new();
  
@@ -405,12 +394,32 @@ static void *connect_thread_func(void *opaque)

  qemu_mutex_unlock(>mutex);
  
  if (do_free) {

-nbd_free_connect_thread(conn);
+nbd_client_connection_do_free(conn);
  }
  
  return NULL;

  }
  
+static void nbd_client_connection_release(NBDClientConnection *conn)

+{
+bool do_free;
+
+if (!conn) {
+return;
+}
+
+qemu_mutex_lock(>mutex);
+if (conn->running) {
+conn->detached = true;
+}
+do_free = !conn->running && !conn->detached;
+qemu_mutex_unlock(>mutex);
+
+if (do_free) {
+nbd_client_connection_do_free(conn);
+}
+}
+
  /*
   * Get a new connection in context of @conn:
   *   if thread is running, wait for completion
--
2.29.2




--
Best regards,
Vladimir



Re: [PATCH v3 08/33] block/nbd: drop thr->state

2021-04-28 Thread Vladimir Sementsov-Ogievskiy

28.04.2021 01:23, Roman Kagan wrote:

On Fri, Apr 16, 2021 at 11:08:46AM +0300, Vladimir Sementsov-Ogievskiy wrote:

We don't need all these states. The code refactored to use two boolean
variables looks simpler.


Indeed.



Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  block/nbd.c | 125 ++--
  1 file changed, 34 insertions(+), 91 deletions(-)

diff --git a/block/nbd.c b/block/nbd.c
index e1f39eda6c..2b26a033a4 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -66,24 +66,6 @@ typedef enum NBDClientState {
  NBD_CLIENT_QUIT
  } NBDClientState;
  
-typedef enum NBDConnectThreadState {

-/* No thread, no pending results */
-CONNECT_THREAD_NONE,
-
-/* Thread is running, no results for now */
-CONNECT_THREAD_RUNNING,
-
-/*
- * Thread is running, but requestor exited. Thread should close
- * the new socket and free the connect state on exit.
- */
-CONNECT_THREAD_RUNNING_DETACHED,
-
-/* Thread finished, results are stored in a state */
-CONNECT_THREAD_FAIL,
-CONNECT_THREAD_SUCCESS
-} NBDConnectThreadState;
-
  typedef struct NBDConnectThread {
  /* Initialization constants */
  SocketAddress *saddr; /* address to connect to */
@@ -97,7 +79,8 @@ typedef struct NBDConnectThread {
  
  QemuMutex mutex;

  /* All further fields are protected by mutex */
-NBDConnectThreadState state; /* current state of the thread */
+bool running; /* thread is running now */
+bool detached; /* thread is detached and should cleanup the state */
  Coroutine *wait_co; /* nbd_co_establish_connection() wait in yield() */
  } NBDConnectThread;
  
@@ -147,17 +130,17 @@ static void nbd_clear_bdrvstate(BlockDriverState *bs)

  {
  BDRVNBDState *s = (BDRVNBDState *)bs->opaque;
  NBDConnectThread *thr = s->connect_thread;
-bool thr_running;
+bool do_free;
  
  qemu_mutex_lock(>mutex);

-thr_running = thr->state == CONNECT_THREAD_RUNNING;
-if (thr_running) {
-thr->state = CONNECT_THREAD_RUNNING_DETACHED;
+if (thr->running) {
+thr->detached = true;
  }
+do_free = !thr->running && !thr->detached;


This is redundant.  You can unconditionally set ->detached and only
depend on ->running for the rest of this function.


Still, I don't want to set ->detached unconditionally, just to not break its 
meaning (even before freeing). And that fact that detached is set only once worth 
assertion. So, I think:

assert(!thr->detached);
if (thr->running) {
  thr->detached = true;
} else {
  do_free = true;
}




  qemu_mutex_unlock(>mutex);
  
  /* the runaway thread will clean it up itself */

-if (!thr_running) {
+if (do_free) {
  nbd_free_connect_thread(thr);
  }
  
@@ -373,7 +356,6 @@ static void nbd_init_connect_thread(BDRVNBDState *s)
  
  *s->connect_thread = (NBDConnectThread) {

  .saddr = QAPI_CLONE(SocketAddress, s->saddr),
-.state = CONNECT_THREAD_NONE,
  };
  
  qemu_mutex_init(>connect_thread->mutex);

@@ -408,20 +390,13 @@ static void *connect_thread_func(void *opaque)
  
  qemu_mutex_lock(>mutex);
  
-switch (thr->state) {

-case CONNECT_THREAD_RUNNING:
-thr->state = ret < 0 ? CONNECT_THREAD_FAIL : CONNECT_THREAD_SUCCESS;
-if (thr->wait_co) {
-aio_co_schedule(NULL, thr->wait_co);
-thr->wait_co = NULL;
-}
-break;
-case CONNECT_THREAD_RUNNING_DETACHED:
-do_free = true;
-break;
-default:
-abort();
+assert(thr->running);
+thr->running = false;
+if (thr->wait_co) {
+aio_co_schedule(NULL, thr->wait_co);
+thr->wait_co = NULL;
  }
+do_free = thr->detached;
  
  qemu_mutex_unlock(>mutex);
  
@@ -435,36 +410,24 @@ static void *connect_thread_func(void *opaque)

  static int coroutine_fn
  nbd_co_establish_connection(BlockDriverState *bs, Error **errp)
  {
-int ret;
  QemuThread thread;
  BDRVNBDState *s = bs->opaque;
  NBDConnectThread *thr = s->connect_thread;
  
+assert(!s->sioc);

+
  qemu_mutex_lock(>mutex);
  
-switch (thr->state) {

-case CONNECT_THREAD_FAIL:
-case CONNECT_THREAD_NONE:
+if (!thr->running) {
+if (thr->sioc) {
+/* Previous attempt finally succeeded in background */
+goto out;
+}
+thr->running = true;
  error_free(thr->err);
  thr->err = NULL;
-thr->state = CONNECT_THREAD_RUNNING;
  qemu_thread_create(, "nbd-connect",
 connect_thread_func, thr, QEMU_THREAD_DETACHED);
-break;
-case CONNECT_THREAD_SUCCESS:
-/* Previous attempt finally succeeded in background */
-thr->state = CONNECT_THREAD_NONE;
-s->sioc = thr->sioc;
-thr->sioc = NULL;
-yank_register_function(BLOCKDEV_YANK_INSTANCE(bs->node_name),
-   nbd_yank, bs);
-

Re: [PATCH 0/4] hw/block/fdc: Allow Kconfig-selecting ISA bus/SysBus floppy controllers

2021-04-28 Thread Mark Cave-Ayland

On 15/04/2021 11:23, Philippe Mathieu-Daudé wrote:


Hi,

The floppy disc controllers pulls in irrelevant devices (sysbus in
an ISA-only machine, ISA bus + isa devices on a sysbus-only machine).

This series clean that by extracting each device in its own file,
adding the corresponding Kconfig symbols: FDC_ISA and FDC_SYSBUS.

Regards,

Phil.

Philippe Mathieu-Daudé (4):
   hw/block/fdc: Replace disabled fprintf() by trace event
   hw/block/fdc: Declare shared prototypes in fdc-internal.h
   hw/block/fdc: Extract ISA floppy controllers to fdc-isa.c
   hw/block/fdc: Extract SysBus floppy controllers to fdc-sysbus.c

  hw/block/fdc-internal.h | 155 ++
  hw/block/fdc-isa.c  | 313 +
  hw/block/fdc-sysbus.c   | 252 +
  hw/block/fdc.c  | 608 +---
  MAINTAINERS |   3 +
  hw/block/Kconfig|   8 +
  hw/block/meson.build|   2 +
  hw/block/trace-events   |   3 +
  hw/i386/Kconfig |   2 +-
  hw/isa/Kconfig  |   6 +-
  hw/mips/Kconfig |   2 +-
  hw/sparc/Kconfig|   2 +-
  hw/sparc64/Kconfig  |   2 +-
  13 files changed, 751 insertions(+), 607 deletions(-)
  create mode 100644 hw/block/fdc-internal.h
  create mode 100644 hw/block/fdc-isa.c
  create mode 100644 hw/block/fdc-sysbus.c


This basically looks good to me. You may be able to simplify this further by removing 
the global legacy init functions fdctrl_init_sysbus() and sun4m_fdctrl_init(): from 
what I can see fdctrl_init_sysbus() is only used in hw/mips/jazz.c and 
sun4m_fdctrl_init() is only used in hw/sparc/sun4m.c so you might as well inline them 
or move the functions to the relevant files.


Acked-by: Mark Cave-Ayland 


ATB,

Mark.



Re: [PATCH v3 15/33] nbd/client-connection: use QEMU_LOCK_GUARD

2021-04-28 Thread Roman Kagan
On Fri, Apr 16, 2021 at 11:08:53AM +0300, Vladimir Sementsov-Ogievskiy wrote:
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  nbd/client-connection.c | 94 ++---
>  1 file changed, 42 insertions(+), 52 deletions(-)
> 
> diff --git a/nbd/client-connection.c b/nbd/client-connection.c
> index 4e39a5b1af..b45a0bd5f6 100644
> --- a/nbd/client-connection.c
> +++ b/nbd/client-connection.c
> @@ -87,17 +87,16 @@ static void *connect_thread_func(void *opaque)
>  conn->sioc = NULL;
>  }
>  
> -qemu_mutex_lock(>mutex);
> -
> -assert(conn->running);
> -conn->running = false;
> -if (conn->wait_co) {
> -aio_co_schedule(NULL, conn->wait_co);
> -conn->wait_co = NULL;
> +WITH_QEMU_LOCK_GUARD(>mutex) {
> +assert(conn->running);
> +conn->running = false;
> +if (conn->wait_co) {
> +aio_co_schedule(NULL, conn->wait_co);
> +conn->wait_co = NULL;
> +}
>  }
>  do_free = conn->detached;

->detached is now accessed outside the mutex

>  
> -qemu_mutex_unlock(>mutex);
>  
>  if (do_free) {
>  nbd_client_connection_do_free(conn);
> @@ -136,61 +135,54 @@ void nbd_client_connection_release(NBDClientConnection 
> *conn)
>  QIOChannelSocket *coroutine_fn
>  nbd_co_establish_connection(NBDClientConnection *conn, Error **errp)
>  {
> -QIOChannelSocket *sioc = NULL;
>  QemuThread thread;
>  
> -qemu_mutex_lock(>mutex);
> -
> -/*
> - * Don't call nbd_co_establish_connection() in several coroutines in
> - * parallel. Only one call at once is supported.
> - */
> -assert(!conn->wait_co);
> -
> -if (!conn->running) {
> -if (conn->sioc) {
> -/* Previous attempt finally succeeded in background */
> -sioc = g_steal_pointer(>sioc);
> -qemu_mutex_unlock(>mutex);
> -
> -return sioc;
> +WITH_QEMU_LOCK_GUARD(>mutex) {
> +/*
> + * Don't call nbd_co_establish_connection() in several coroutines in
> + * parallel. Only one call at once is supported.
> + */
> +assert(!conn->wait_co);
> +
> +if (!conn->running) {
> +if (conn->sioc) {
> +/* Previous attempt finally succeeded in background */
> +return g_steal_pointer(>sioc);
> +}
> +
> +conn->running = true;
> +error_free(conn->err);
> +conn->err = NULL;
> +qemu_thread_create(, "nbd-connect",
> +   connect_thread_func, conn, 
> QEMU_THREAD_DETACHED);
>  }
>  
> -conn->running = true;
> -error_free(conn->err);
> -conn->err = NULL;
> -qemu_thread_create(, "nbd-connect",
> -   connect_thread_func, conn, QEMU_THREAD_DETACHED);
> +conn->wait_co = qemu_coroutine_self();
>  }
>  
> -conn->wait_co = qemu_coroutine_self();
> -
> -qemu_mutex_unlock(>mutex);
> -
>  /*
>   * We are going to wait for connect-thread finish, but
>   * nbd_co_establish_connection_cancel() can interrupt.
>   */
>  qemu_coroutine_yield();
>  
> -qemu_mutex_lock(>mutex);
> -
> -if (conn->running) {
> -/*
> - * Obviously, drained section wants to start. Report the attempt as
> - * failed. Still connect thread is executing in background, and its
> - * result may be used for next connection attempt.
> - */
> -error_setg(errp, "Connection attempt cancelled by other operation");
> -} else {
> -error_propagate(errp, conn->err);
> -conn->err = NULL;
> -sioc = g_steal_pointer(>sioc);
> +WITH_QEMU_LOCK_GUARD(>mutex) {
> +if (conn->running) {
> +/*
> + * Obviously, drained section wants to start. Report the attempt 
> as
> + * failed. Still connect thread is executing in background, and 
> its
> + * result may be used for next connection attempt.
> + */
> +error_setg(errp, "Connection attempt cancelled by other 
> operation");
> +return NULL;
> +} else {
> +error_propagate(errp, conn->err);
> +conn->err = NULL;
> +return g_steal_pointer(>sioc);
> +}
>  }
>  
> -qemu_mutex_unlock(>mutex);
> -
> -return sioc;
> +abort(); /* unreachable */
>  }
>  
>  /*
> @@ -201,12 +193,10 @@ nbd_co_establish_connection(NBDClientConnection *conn, 
> Error **errp)
>   */
>  void coroutine_fn nbd_co_establish_connection_cancel(NBDClientConnection 
> *conn)
>  {
> -qemu_mutex_lock(>mutex);
> +QEMU_LOCK_GUARD(>mutex);
>  
>  if (conn->wait_co) {
>  aio_co_schedule(NULL, conn->wait_co);
>  conn->wait_co = NULL;
>  }
> -
> -qemu_mutex_unlock(>mutex);
>  }
> -- 
> 2.29.2
>