Re: [PATCH] virtio-blk: drop deprecated scsi=on|off property

2021-04-29 Thread Michal Prívozník
On 4/29/21 5:52 PM, Stefan Hajnoczi wrote:
> The scsi=on|off property was deprecated in QEMU 5.0 and can be removed
> completely at this point.
> 
> Drop the scsi=on|off option. It was only available on Legacy virtio-blk
> devices. Linux v5.6 already dropped support for it.
> 
> Remove the hw_compat_2_4[] property assignment since scsi=on|off no
> longer exists. Old guests with Legacy virtio-blk devices no longer see
> the SCSI host features bit.
> 
> Live migrating old guests from an old QEMU with the SCSI feature bit
> enabled will fail with "Features 0x... unsupported. Allowed features:
> 0x...". We've followed the QEMU deprecation policy so users have been
> warned...
> 
> I have tested that libvirt still works when the property is absent. It
> no longer adds scsi=on|off to the command-line.
> 
> Cc: Markus Armbruster 
> Cc: Christoph Hellwig 
> Cc: Peter Krempa 
> Cc: Dr. David Alan Gilbert 
> Signed-off-by: Stefan Hajnoczi 
> ---
>  docs/specs/tpm.rst   |   2 +-
>  docs/system/deprecated.rst   |  13 ---
>  docs/pci_expander_bridge.txt |   2 +-
>  hw/block/virtio-blk.c| 192 +--
>  hw/core/machine.c|   2 -
>  5 files changed, 3 insertions(+), 208 deletions(-)

Reviewed-by: Michal Privoznik 

Michal




Re: [PATCH] virtio-blk: drop deprecated scsi=on|off property

2021-04-29 Thread Eduardo Habkost
On Thu, Apr 29, 2021 at 04:52:21PM +0100, Stefan Hajnoczi wrote:
> The scsi=on|off property was deprecated in QEMU 5.0 and can be removed
> completely at this point.
> 
> Drop the scsi=on|off option. It was only available on Legacy virtio-blk
> devices. Linux v5.6 already dropped support for it.
> 
> Remove the hw_compat_2_4[] property assignment since scsi=on|off no
> longer exists. Old guests with Legacy virtio-blk devices no longer see
> the SCSI host features bit.
> 

This means pc-2.4 will now break guest ABI if using virtio-blk
devices, correct?

This looks like a sign we should have deprecated pc-2.4 a long
time ago.

> Live migrating old guests from an old QEMU with the SCSI feature bit
> enabled will fail with "Features 0x... unsupported. Allowed features:
> 0x...". We've followed the QEMU deprecation policy so users have been
> warned...
> 

Were they really warned, though?  People running
"-machine pc-i440fx-2.4" might be completely unaware that it was
silently enabling a deprecated feature.

Can we have this documented in a more explicit way?  Maybe just a
comment at hw_compat_2_4 would be enough, to warn people doing
backports and rebases downstream.

Can we make QEMU refuse to start if using pc-2.4 + virtio-blk
together, just to be sure?

An alternative would be keeping the property (and the
hw_compat_2_4 entry) just to keep pc-2.4 working (until pc-2.4 is
deprecated and removed), but refusing to realize the device if
the feature is enabled.


> I have tested that libvirt still works when the property is absent. It
> no longer adds scsi=on|off to the command-line.
> 
> Cc: Markus Armbruster 
> Cc: Christoph Hellwig 
> Cc: Peter Krempa 
> Cc: Dr. David Alan Gilbert 
> Signed-off-by: Stefan Hajnoczi 
> ---
[...]
> diff --git a/hw/core/machine.c b/hw/core/machine.c
> index 40def78183..286f18ec6d 100644
> --- a/hw/core/machine.c
> +++ b/hw/core/machine.c
> @@ -194,8 +194,6 @@ GlobalProperty hw_compat_2_5[] = {
>  const size_t hw_compat_2_5_len = G_N_ELEMENTS(hw_compat_2_5);
>  
>  GlobalProperty hw_compat_2_4[] = {
> -/* Optional because the 'scsi' property is Linux-only */
> -{ "virtio-blk-device", "scsi", "true", .optional = true },
>  { "e1000", "extra_mac_registers", "off" },
>  { "virtio-pci", "x-disable-pcie", "on" },
>  { "virtio-pci", "migrate-extra", "off" },


-- 
Eduardo




libnbd thread-local errors and dlclose (was: Re: [ANNOUNCE] libblkio v0.1.0 preview release)

2021-04-29 Thread Richard W.M. Jones
On Thu, Apr 29, 2021 at 04:08:22PM +0100, Daniel P. Berrangé wrote:
> On Thu, Apr 29, 2021 at 04:00:38PM +0100, Richard W.M. Jones wrote:
> > On Thu, Apr 29, 2021 at 03:41:29PM +0100, Stefan Hajnoczi wrote:
> > > On Thu, Apr 29, 2021 at 03:22:59PM +0100, Richard W.M. Jones wrote:
> > > > libvirt originally, and now libnbd, keep a per-thread error message
> > > > (stored in thread-local storage).  It's a lot nicer than having to
> > > > pass  to every function.  You can just write:
> > > > 
> > > >  if (nbd_connect_tcp (nbd, "remote", "nbd") == -1) {
> > > >fprintf (stderr,
> > > > "failed to connect to remote server: %s (errno = %d)\n",
> > > > nbd_get_error (), nbd_get_errno ());
> > > >exit (EXIT_FAILURE);
> > > >  }
> > > > 
> > > > (https://libguestfs.org/libnbd.3.html#ERROR-HANDLING)
> > > > 
> > > > It means you can extend the range of error information available in
> > > > future.  Also you can return a 'const char *' and the application
> > > > doesn't have to worry about lifetimes, at least in the common case.
> > > 
> > > Thanks for sharing the idea, I think it would work well for libblkio
> > > too.
> > > 
> > > Do you ignore the dlclose(3) memory leak?
> > 
> > I believe this mechanism in libnbd ensures there is no leak in the
> > ordinary shared library (not dlopen/dlclose) case:
> > 
> > https://gitlab.com/nbdkit/libnbd/-/blob/f9257a9fdc68706a4079deb4ced61e1d468f28d6/lib/errors.c#L35
> > 
> > However I'm not sure what happens in the dlopen case, so I'm going to
> > test that out now ...
> 
> pthread_key destructors are a disaster waiting to happen with
> dlclose, if the dlclose happens while non-main threads are
> still running. When those threads exit, they'll run the
> destructor which points to a function that is no longer
> resident in memory.

While libnbd had a memory leak there, now fixed by:

https://gitlab.com/nbdkit/libnbd/-/commit/026d281c57dd95485cc9bf829918b5efd9e32ddb

I don't actually think we have the bug you're describing.  We have a
destructor (errors_free()) which runs on dlclose, which deletes the
thread-local storage key, so the destructor will not run after the
library has been unloaded.

I added a test for this which works fine for me:

https://gitlab.com/nbdkit/libnbd/-/commit/831d142787aba4f5b638418e02cf7e0f2a051565

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
libguestfs lets you edit virtual machines.  Supports shell scripting,
bindings from many languages.  http://libguestfs.org




[PATCH v2 5/6] virtio: Fail if iommu_platform is requested, but unsupported

2021-04-29 Thread Kevin Wolf
Commit 2943b53f6 (' virtio: force VIRTIO_F_IOMMU_PLATFORM') made sure
that vhost can't just reject VIRTIO_F_IOMMU_PLATFORM when it was
requested. However, just adding it back to the negotiated flags isn't
right either because it promises support to the guest that the device
actually doesn't support. One example of a vhost-user device that
doesn't have support for the flag is the vhost-user-blk export of QEMU.

Instead of successfully creating a device that doesn't work, just fail
to plug the device when it doesn't support the feature, but it was
requested. This results in much clearer error messages.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1935019
Signed-off-by: Kevin Wolf 
Reviewed-by: Raphael Norwitz 
---
 hw/virtio/virtio-bus.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c
index d6332d45c3..859978d248 100644
--- a/hw/virtio/virtio-bus.c
+++ b/hw/virtio/virtio-bus.c
@@ -69,6 +69,11 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error 
**errp)
 return;
 }
 
+if (has_iommu && !virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM)) {
+error_setg(errp, "iommu_platform=true is not supported by the device");
+return;
+}
+
 if (klass->device_plugged != NULL) {
 klass->device_plugged(qbus->parent, _err);
 }
-- 
2.30.2




[PATCH v2 6/6] vhost-user-blk: Check that num-queues is supported by backend

2021-04-29 Thread Kevin Wolf
Creating a device with a number of queues that isn't supported by the
backend is pointless, the device won't work properly and the error
messages are rather confusing.

Just fail to create the device if num-queues is higher than what the
backend supports.

Since the relationship between num-queues and the number of virtqueues
depends on the specific device, this is an additional value that needs
to be initialised by the device. For convenience, allow leaving it 0 if
the check should be skipped. This makes sense for vhost-user-net where
separate vhost devices are used for the queues and custom initialisation
code is needed to perform the check.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1935031
Signed-off-by: Kevin Wolf 
Reviewed-by: Raphael Norwitz 
---
 include/hw/virtio/vhost.h | 2 ++
 hw/block/vhost-user-blk.c | 1 +
 hw/virtio/vhost-user.c| 5 +
 3 files changed, 8 insertions(+)

diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
index 4a8bc75415..21a9a52088 100644
--- a/include/hw/virtio/vhost.h
+++ b/include/hw/virtio/vhost.h
@@ -74,6 +74,8 @@ struct vhost_dev {
 int nvqs;
 /* the first virtqueue which would be used by this vhost dev */
 int vq_index;
+/* if non-zero, minimum required value for max_queues */
+int num_queues;
 uint64_t features;
 uint64_t acked_features;
 uint64_t backend_features;
diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
index c7e502f4c7..c6210fad0c 100644
--- a/hw/block/vhost-user-blk.c
+++ b/hw/block/vhost-user-blk.c
@@ -324,6 +324,7 @@ static int vhost_user_blk_connect(DeviceState *dev, Error 
**errp)
 }
 s->connected = true;
 
+s->dev.num_queues = s->num_queues;
 s->dev.nvqs = s->num_queues;
 s->dev.vqs = s->vhost_vqs;
 s->dev.vq_index = 0;
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index ded0c10453..ee57abe045 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -1909,6 +1909,11 @@ static int vhost_user_backend_init(struct vhost_dev 
*dev, void *opaque)
 return err;
 }
 }
+if (dev->num_queues && dev->max_queues < dev->num_queues) {
+error_report("The maximum number of queues supported by the "
+ "backend is %" PRIu64, dev->max_queues);
+return -EINVAL;
+}
 
 if (virtio_has_feature(features, VIRTIO_F_IOMMU_PLATFORM) &&
 !(virtio_has_feature(dev->protocol_features,
-- 
2.30.2




[PATCH v2 3/6] vhost-user-blk: Improve error reporting in realize

2021-04-29 Thread Kevin Wolf
Now that vhost_user_blk_connect() is not called from an event handler
any more, but directly from vhost_user_blk_device_realize(), we can
actually make use of Error again instead of calling error_report() in
the inner function and setting a more generic and therefore less useful
error message in realize() itself.

With Error, the callers are responsible for adding context if necessary
(such as the "-device" option the error refers to). Additional prefixes
are redundant and better omitted.

Signed-off-by: Kevin Wolf 
Acked-by: Raphael Norwitz 
---
 hw/block/vhost-user-blk.c | 23 +++
 1 file changed, 11 insertions(+), 12 deletions(-)

diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
index c0b9958da1..f3a45af97c 100644
--- a/hw/block/vhost-user-blk.c
+++ b/hw/block/vhost-user-blk.c
@@ -311,7 +311,7 @@ static void vhost_user_blk_reset(VirtIODevice *vdev)
 vhost_dev_free_inflight(s->inflight);
 }
 
-static int vhost_user_blk_connect(DeviceState *dev)
+static int vhost_user_blk_connect(DeviceState *dev, Error **errp)
 {
 VirtIODevice *vdev = VIRTIO_DEVICE(dev);
 VHostUserBlk *s = VHOST_USER_BLK(vdev);
@@ -331,8 +331,7 @@ static int vhost_user_blk_connect(DeviceState *dev)
 
 ret = vhost_dev_init(>dev, >vhost_user, VHOST_BACKEND_TYPE_USER, 0);
 if (ret < 0) {
-error_report("vhost-user-blk: vhost initialization failed: %s",
- strerror(-ret));
+error_setg_errno(errp, -ret, "vhost initialization failed");
 return ret;
 }
 
@@ -340,8 +339,7 @@ static int vhost_user_blk_connect(DeviceState *dev)
 if (virtio_device_started(vdev, vdev->status)) {
 ret = vhost_user_blk_start(vdev);
 if (ret < 0) {
-error_report("vhost-user-blk: vhost start failed: %s",
- strerror(-ret));
+error_setg_errno(errp, -ret, "vhost start failed");
 return ret;
 }
 }
@@ -380,10 +378,12 @@ static void vhost_user_blk_event(void *opaque, 
QEMUChrEvent event)
 DeviceState *dev = opaque;
 VirtIODevice *vdev = VIRTIO_DEVICE(dev);
 VHostUserBlk *s = VHOST_USER_BLK(vdev);
+Error *local_err = NULL;
 
 switch (event) {
 case CHR_EVENT_OPENED:
-if (vhost_user_blk_connect(dev) < 0) {
+if (vhost_user_blk_connect(dev, _err) < 0) {
+error_report_err(local_err);
 qemu_chr_fe_disconnect(>chardev);
 return;
 }
@@ -426,7 +426,7 @@ static void vhost_user_blk_device_realize(DeviceState *dev, 
Error **errp)
 int i, ret;
 
 if (!s->chardev.chr) {
-error_setg(errp, "vhost-user-blk: chardev is mandatory");
+error_setg(errp, "chardev is mandatory");
 return;
 }
 
@@ -434,16 +434,16 @@ static void vhost_user_blk_device_realize(DeviceState 
*dev, Error **errp)
 s->num_queues = 1;
 }
 if (!s->num_queues || s->num_queues > VIRTIO_QUEUE_MAX) {
-error_setg(errp, "vhost-user-blk: invalid number of IO queues");
+error_setg(errp, "invalid number of IO queues");
 return;
 }
 
 if (!s->queue_size) {
-error_setg(errp, "vhost-user-blk: queue size must be non-zero");
+error_setg(errp, "queue size must be non-zero");
 return;
 }
 if (s->queue_size > VIRTQUEUE_MAX_SIZE) {
-error_setg(errp, "vhost-user-blk: queue size must not exceed %d",
+error_setg(errp, "queue size must not exceed %d",
VIRTQUEUE_MAX_SIZE);
 return;
 }
@@ -469,8 +469,7 @@ static void vhost_user_blk_device_realize(DeviceState *dev, 
Error **errp)
 goto virtio_err;
 }
 
-if (vhost_user_blk_connect(dev) < 0) {
-error_setg(errp, "vhost-user-blk: could not connect");
+if (vhost_user_blk_connect(dev, errp) < 0) {
 qemu_chr_fe_disconnect(>chardev);
 goto virtio_err;
 }
-- 
2.30.2




Re: libnbd thread-local errors and dlclose (was: Re: [ANNOUNCE] libblkio v0.1.0 preview release)

2021-04-29 Thread Daniel P . Berrangé
On Thu, Apr 29, 2021 at 06:17:32PM +0100, Richard W.M. Jones wrote:
> On Thu, Apr 29, 2021 at 04:08:22PM +0100, Daniel P. Berrangé wrote:
> > On Thu, Apr 29, 2021 at 04:00:38PM +0100, Richard W.M. Jones wrote:
> > > On Thu, Apr 29, 2021 at 03:41:29PM +0100, Stefan Hajnoczi wrote:
> > > > On Thu, Apr 29, 2021 at 03:22:59PM +0100, Richard W.M. Jones wrote:
> > > > > libvirt originally, and now libnbd, keep a per-thread error message
> > > > > (stored in thread-local storage).  It's a lot nicer than having to
> > > > > pass  to every function.  You can just write:
> > > > > 
> > > > >  if (nbd_connect_tcp (nbd, "remote", "nbd") == -1) {
> > > > >fprintf (stderr,
> > > > > "failed to connect to remote server: %s (errno = %d)\n",
> > > > > nbd_get_error (), nbd_get_errno ());
> > > > >exit (EXIT_FAILURE);
> > > > >  }
> > > > > 
> > > > > (https://libguestfs.org/libnbd.3.html#ERROR-HANDLING)
> > > > > 
> > > > > It means you can extend the range of error information available in
> > > > > future.  Also you can return a 'const char *' and the application
> > > > > doesn't have to worry about lifetimes, at least in the common case.
> > > > 
> > > > Thanks for sharing the idea, I think it would work well for libblkio
> > > > too.
> > > > 
> > > > Do you ignore the dlclose(3) memory leak?
> > > 
> > > I believe this mechanism in libnbd ensures there is no leak in the
> > > ordinary shared library (not dlopen/dlclose) case:
> > > 
> > > https://gitlab.com/nbdkit/libnbd/-/blob/f9257a9fdc68706a4079deb4ced61e1d468f28d6/lib/errors.c#L35
> > > 
> > > However I'm not sure what happens in the dlopen case, so I'm going to
> > > test that out now ...
> > 
> > pthread_key destructors are a disaster waiting to happen with
> > dlclose, if the dlclose happens while non-main threads are
> > still running. When those threads exit, they'll run the
> > destructor which points to a function that is no longer
> > resident in memory.
> 
> While libnbd had a memory leak there, now fixed by:
> 
> https://gitlab.com/nbdkit/libnbd/-/commit/026d281c57dd95485cc9bf829918b5efd9e32ddb

So on dlclose(), the errors_free() method will be run by the linker.

The pthread_key_delete method doesn't run the cleanup callbacks, so
your fix here frees the thread local error string in the thread that
called dlclose explicitly.

If any other thread had an error string associated with the thread
local, that will still be leaked.  I don't see any attractive way
to avoid that leak.

IOW, I think the fix is incomplete, but its better than nothing,
and I don't have suggestion to improve it.

> I don't actually think we have the bug you're describing.  We have a
> destructor (errors_free()) which runs on dlclose, which deletes the
> thread-local storage key, so the destructor will not run after the
> library has been unloaded.
> 
> I added a test for this which works fine for me:
> 
> https://gitlab.com/nbdkit/libnbd/-/commit/831d142787aba4f5b638418e02cf7e0f2a051565

Yes, I think your using of ELF destructor function is enough in this
scenario.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




[PATCH v2 4/6] vhost-user-blk: Get more feature flags from vhost device

2021-04-29 Thread Kevin Wolf
VIRTIO_F_RING_PACKED and VIRTIO_F_IOMMU_PLATFORM need to be supported by
the vhost device, otherwise advertising it to the guest doesn't result
in a working configuration. They are currently not supported by the
vhost-user-blk export in QEMU.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1935020
Signed-off-by: Kevin Wolf 
Acked-by: Raphael Norwitz 
---
 hw/block/vhost-user-blk.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
index f3a45af97c..c7e502f4c7 100644
--- a/hw/block/vhost-user-blk.c
+++ b/hw/block/vhost-user-blk.c
@@ -47,6 +47,8 @@ static const int user_feature_bits[] = {
 VIRTIO_RING_F_INDIRECT_DESC,
 VIRTIO_RING_F_EVENT_IDX,
 VIRTIO_F_NOTIFY_ON_EMPTY,
+VIRTIO_F_RING_PACKED,
+VIRTIO_F_IOMMU_PLATFORM,
 VHOST_INVALID_FEATURE_BIT
 };
 
-- 
2.30.2




[PATCH v2 1/6] vhost-user-blk: Make sure to set Error on realize failure

2021-04-29 Thread Kevin Wolf
We have to set errp before jumping to virtio_err, otherwise the caller
(virtio_device_realize()) will take this as success and crash when it
later tries to access things that we've already freed in the error path.

Fixes: 77542d431491788d1e8e79d93ce10172ef207775
Signed-off-by: Kevin Wolf 
---
 hw/block/vhost-user-blk.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
index f5e9682703..7c85248a7b 100644
--- a/hw/block/vhost-user-blk.c
+++ b/hw/block/vhost-user-blk.c
@@ -447,7 +447,6 @@ static void vhost_user_blk_device_realize(DeviceState *dev, 
Error **errp)
 {
 VirtIODevice *vdev = VIRTIO_DEVICE(dev);
 VHostUserBlk *s = VHOST_USER_BLK(vdev);
-Error *err = NULL;
 int i, ret;
 
 if (!s->chardev.chr) {
@@ -495,8 +494,7 @@ static void vhost_user_blk_device_realize(DeviceState *dev, 
Error **errp)
  NULL, true);
 
 reconnect:
-if (qemu_chr_fe_wait_connected(>chardev, ) < 0) {
-error_report_err(err);
+if (qemu_chr_fe_wait_connected(>chardev, errp) < 0) {
 goto virtio_err;
 }
 
-- 
2.30.2




[PATCH v2 2/6] vhost-user-blk: Don't reconnect during initialisation

2021-04-29 Thread Kevin Wolf
This is a partial revert of commits 77542d43149 and bc79c87bcde.

Usually, an error during initialisation means that the configuration was
wrong. Reconnecting won't make the error go away, but just turn the
error condition into an endless loop. Avoid this and return errors
again.

Additionally, calling vhost_user_blk_disconnect() from the chardev event
handler could result in use-after-free because none of the
initialisation code expects that the device could just go away in the
middle. So removing the call fixes crashes in several places.

For example, using a num-queues setting that is incompatible with the
backend would result in a crash like this (dereferencing dev->opaque,
which is already NULL):

 #0  0x55d0a4bd in vhost_user_read_cb (source=0x568f4690, 
condition=(G_IO_IN | G_IO_HUP), opaque=0x7fffcbf0) at 
../hw/virtio/vhost-user.c:313
 #1  0x55d950d3 in qio_channel_fd_source_dispatch 
(source=0x57c3f750, callback=0x55d0a478 , 
user_data=0x7fffcbf0) at ../io/channel-watch.c:84
 #2  0x77b32a9f in g_main_context_dispatch () at /lib64/libglib-2.0.so.0
 #3  0x77b84a98 in g_main_context_iterate.constprop () at 
/lib64/libglib-2.0.so.0
 #4  0x77b32163 in g_main_loop_run () at /lib64/libglib-2.0.so.0
 #5  0x55d0a724 in vhost_user_read (dev=0x57bc62f8, 
msg=0x7fffcc50) at ../hw/virtio/vhost-user.c:402
 #6  0x55d0ee6b in vhost_user_get_config (dev=0x57bc62f8, 
config=0x57bc62ac "", config_len=60) at ../hw/virtio/vhost-user.c:2133
 #7  0x55d56d46 in vhost_dev_get_config (hdev=0x57bc62f8, 
config=0x57bc62ac "", config_len=60) at ../hw/virtio/vhost.c:1566
 #8  0x55cdd150 in vhost_user_blk_device_realize (dev=0x57bc60b0, 
errp=0x7fffcf90) at ../hw/block/vhost-user-blk.c:510
 #9  0x55d08f6d in virtio_device_realize (dev=0x57bc60b0, 
errp=0x7fffcff0) at ../hw/virtio/virtio.c:3660

Signed-off-by: Kevin Wolf 
---
 hw/block/vhost-user-blk.c | 59 +++
 1 file changed, 17 insertions(+), 42 deletions(-)

diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
index 7c85248a7b..c0b9958da1 100644
--- a/hw/block/vhost-user-blk.c
+++ b/hw/block/vhost-user-blk.c
@@ -50,6 +50,8 @@ static const int user_feature_bits[] = {
 VHOST_INVALID_FEATURE_BIT
 };
 
+static void vhost_user_blk_event(void *opaque, QEMUChrEvent event);
+
 static void vhost_user_blk_update_config(VirtIODevice *vdev, uint8_t *config)
 {
 VHostUserBlk *s = VHOST_USER_BLK(vdev);
@@ -362,19 +364,6 @@ static void vhost_user_blk_disconnect(DeviceState *dev)
 vhost_dev_cleanup(>dev);
 }
 
-static void vhost_user_blk_event(void *opaque, QEMUChrEvent event,
- bool realized);
-
-static void vhost_user_blk_event_realize(void *opaque, QEMUChrEvent event)
-{
-vhost_user_blk_event(opaque, event, false);
-}
-
-static void vhost_user_blk_event_oper(void *opaque, QEMUChrEvent event)
-{
-vhost_user_blk_event(opaque, event, true);
-}
-
 static void vhost_user_blk_chr_closed_bh(void *opaque)
 {
 DeviceState *dev = opaque;
@@ -382,12 +371,11 @@ static void vhost_user_blk_chr_closed_bh(void *opaque)
 VHostUserBlk *s = VHOST_USER_BLK(vdev);
 
 vhost_user_blk_disconnect(dev);
-qemu_chr_fe_set_handlers(>chardev, NULL, NULL,
-vhost_user_blk_event_oper, NULL, opaque, NULL, true);
+qemu_chr_fe_set_handlers(>chardev, NULL, NULL, vhost_user_blk_event,
+ NULL, opaque, NULL, true);
 }
 
-static void vhost_user_blk_event(void *opaque, QEMUChrEvent event,
- bool realized)
+static void vhost_user_blk_event(void *opaque, QEMUChrEvent event)
 {
 DeviceState *dev = opaque;
 VirtIODevice *vdev = VIRTIO_DEVICE(dev);
@@ -401,17 +389,7 @@ static void vhost_user_blk_event(void *opaque, 
QEMUChrEvent event,
 }
 break;
 case CHR_EVENT_CLOSED:
-/*
- * Closing the connection should happen differently on device
- * initialization and operation stages.
- * On initalization, we want to re-start vhost_dev initialization
- * from the very beginning right away when the connection is closed,
- * so we clean up vhost_dev on each connection closing.
- * On operation, we want to postpone vhost_dev cleanup to let the
- * other code perform its own cleanup sequence using vhost_dev data
- * (e.g. vhost_dev_set_log).
- */
-if (realized && !runstate_check(RUN_STATE_SHUTDOWN)) {
+if (!runstate_check(RUN_STATE_SHUTDOWN)) {
 /*
  * A close event may happen during a read/write, but vhost
  * code assumes the vhost_dev remains setup, so delay the
@@ -431,8 +409,6 @@ static void vhost_user_blk_event(void *opaque, QEMUChrEvent 
event,
  * knowing its type (in this case vhost-user).
  */
 s->dev.started = 

[PATCH v2 0/6] vhost-user-blk: Error handling fixes during initialistion

2021-04-29 Thread Kevin Wolf
vhost-user-blk neglects for several properties to check whether the
configured value is even compatible with the backend. This results
sometimes in crashes because of buggy error handling code, and sometimes
in devices that are presented differently to the guest than the backend
would expect and that don't work properly therefore.

This series fixes some of these bugs.

v2:
- Fix error paths in realize() that didn't set errp
- Added vhost_dev_cleanup() back in the error path (more faithful revert
  of 77542d43149)

Kevin Wolf (6):
  vhost-user-blk: Make sure to set Error on realize failure
  vhost-user-blk: Don't reconnect during initialisation
  vhost-user-blk: Improve error reporting in realize
  vhost-user-blk: Get more feature flags from vhost device
  virtio: Fail if iommu_platform is requested, but unsupported
  vhost-user-blk: Check that num-queues is supported by backend

 include/hw/virtio/vhost.h |  2 +
 hw/block/vhost-user-blk.c | 85 ++-
 hw/virtio/vhost-user.c|  5 +++
 hw/virtio/virtio-bus.c|  5 +++
 4 files changed, 42 insertions(+), 55 deletions(-)

-- 
2.30.2




Re: [PATCH 1/5] vhost-user-blk: Don't reconnect during initialisation

2021-04-29 Thread Raphael Norwitz
On Thu, Apr 29, 2021 at 02:48:52PM +0200, Kevin Wolf wrote:
> So maybe patch 2 should come first and also fix the preexisting bug, and
> of course this patch needs a v2 that doesn't introduce the new instances
> of the bug.

Sounds good to me.

> 
> Kevin
> 


Re: [PATCH] virtio-blk: drop deprecated scsi=on|off property

2021-04-29 Thread Peter Krempa
On Thu, Apr 29, 2021 at 16:52:21 +0100, Stefan Hajnoczi wrote:
> The scsi=on|off property was deprecated in QEMU 5.0 and can be removed
> completely at this point.
> 
> Drop the scsi=on|off option. It was only available on Legacy virtio-blk
> devices. Linux v5.6 already dropped support for it.
> 
> Remove the hw_compat_2_4[] property assignment since scsi=on|off no
> longer exists. Old guests with Legacy virtio-blk devices no longer see
> the SCSI host features bit.

Does this mean that qemu rejects it if it's explicitly enabled on the
commandline? 

> Live migrating old guests from an old QEMU with the SCSI feature bit
> enabled will fail with "Features 0x... unsupported. Allowed features:
> 0x...". We've followed the QEMU deprecation policy so users have been
> warned...
> 
> I have tested that libvirt still works when the property is absent. It
> no longer adds scsi=on|off to the command-line.

Yup, we deliberately don't format it unless the user requested it since:

https://gitlab.com/libvirt/libvirt/-/commit/ec69f0190be731d12faeac08dbf63325836509a9

Depending on your answer above I might need to dig through the code
again to see whether we do the correct thing if it's no longer
available.

> 
> Cc: Markus Armbruster 
> Cc: Christoph Hellwig 
> Cc: Peter Krempa 
> Cc: Dr. David Alan Gilbert 
> Signed-off-by: Stefan Hajnoczi 
> ---
>  docs/specs/tpm.rst   |   2 +-
>  docs/system/deprecated.rst   |  13 ---
>  docs/pci_expander_bridge.txt |   2 +-
>  hw/block/virtio-blk.c| 192 +--
>  hw/core/machine.c|   2 -
>  5 files changed, 3 insertions(+), 208 deletions(-)




[PATCH] virtio-blk: drop deprecated scsi=on|off property

2021-04-29 Thread Stefan Hajnoczi
The scsi=on|off property was deprecated in QEMU 5.0 and can be removed
completely at this point.

Drop the scsi=on|off option. It was only available on Legacy virtio-blk
devices. Linux v5.6 already dropped support for it.

Remove the hw_compat_2_4[] property assignment since scsi=on|off no
longer exists. Old guests with Legacy virtio-blk devices no longer see
the SCSI host features bit.

Live migrating old guests from an old QEMU with the SCSI feature bit
enabled will fail with "Features 0x... unsupported. Allowed features:
0x...". We've followed the QEMU deprecation policy so users have been
warned...

I have tested that libvirt still works when the property is absent. It
no longer adds scsi=on|off to the command-line.

Cc: Markus Armbruster 
Cc: Christoph Hellwig 
Cc: Peter Krempa 
Cc: Dr. David Alan Gilbert 
Signed-off-by: Stefan Hajnoczi 
---
 docs/specs/tpm.rst   |   2 +-
 docs/system/deprecated.rst   |  13 ---
 docs/pci_expander_bridge.txt |   2 +-
 hw/block/virtio-blk.c| 192 +--
 hw/core/machine.c|   2 -
 5 files changed, 3 insertions(+), 208 deletions(-)

diff --git a/docs/specs/tpm.rst b/docs/specs/tpm.rst
index 3be190343a..0ec017ab95 100644
--- a/docs/specs/tpm.rst
+++ b/docs/specs/tpm.rst
@@ -328,7 +328,7 @@ In case a pSeries machine is emulated, use the following 
command line:
 -tpmdev emulator,id=tpm0,chardev=chrtpm \
 -device tpm-spapr,tpmdev=tpm0 \
 -device spapr-vscsi,id=scsi0,reg=0x2000 \
--device 
virtio-blk-pci,scsi=off,bus=pci.0,addr=0x3,drive=drive-virtio-disk0,id=virtio-disk0
 \
+-device 
virtio-blk-pci,bus=pci.0,addr=0x3,drive=drive-virtio-disk0,id=virtio-disk0 \
 -drive file=test.img,format=raw,if=none,id=drive-virtio-disk0
 
 In case an Arm virt machine is emulated, use the following command line:
diff --git a/docs/system/deprecated.rst b/docs/system/deprecated.rst
index 80cae86252..1abb64b669 100644
--- a/docs/system/deprecated.rst
+++ b/docs/system/deprecated.rst
@@ -248,19 +248,6 @@ machines have been renamed ``raspi2b`` and ``raspi3b``.
 Device options
 --
 
-Emulated device options
-'''
-
-``-device virtio-blk,scsi=on|off`` (since 5.0.0)
-
-
-The virtio-blk SCSI passthrough feature is a legacy VIRTIO feature.  VIRTIO 1.0
-and later do not support it because the virtio-scsi device was introduced for
-full SCSI support.  Use virtio-scsi instead when SCSI passthrough is required.
-
-Note this also applies to ``-device virtio-blk-pci,scsi=on|off``, which is an
-alias.
-
 Block device options
 
 
diff --git a/docs/pci_expander_bridge.txt b/docs/pci_expander_bridge.txt
index 36750273bb..540191f5e0 100644
--- a/docs/pci_expander_bridge.txt
+++ b/docs/pci_expander_bridge.txt
@@ -25,7 +25,7 @@ A detailed command line would be:
 -object memory-backend-ram,size=1024M,policy=bind,host-nodes=1,id=ram-node1 
-numa node,nodeid=1,cpus=1,memdev=ram-node1
 -device pxb,id=bridge1,bus=pci.0,numa_node=1,bus_nr=4 -netdev user,id=nd 
-device e1000,bus=bridge1,addr=0x4,netdev=nd
 -device pxb,id=bridge2,bus=pci.0,numa_node=0,bus_nr=8 -device 
e1000,bus=bridge2,addr=0x3
--device pxb,id=bridge3,bus=pci.0,bus_nr=40 -drive if=none,id=drive0,file=[img] 
-device virtio-blk-pci,drive=drive0,scsi=off,bus=bridge3,addr=1
+-device pxb,id=bridge3,bus=pci.0,bus_nr=40 -drive if=none,id=drive0,file=[img] 
-device virtio-blk-pci,drive=drive0,bus=bridge3,addr=1
 
 Here you have:
  - 2 NUMA nodes for the guest, 0 and 1. (both mapped to the same NUMA node in 
host, but you can and should put it in different host NUMA nodes)
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index d28979efb8..5023e046fc 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -25,10 +25,6 @@
 #include "sysemu/runstate.h"
 #include "hw/virtio/virtio-blk.h"
 #include "dataplane/virtio-blk.h"
-#include "scsi/constants.h"
-#ifdef __linux__
-# include 
-#endif
 #include "hw/virtio/virtio-bus.h"
 #include "migration/qemu-file-types.h"
 #include "hw/virtio/virtio-access.h"
@@ -200,59 +196,6 @@ out:
 aio_context_release(blk_get_aio_context(s->conf.conf.blk));
 }
 
-#ifdef __linux__
-
-typedef struct {
-VirtIOBlockReq *req;
-struct sg_io_hdr hdr;
-} VirtIOBlockIoctlReq;
-
-static void virtio_blk_ioctl_complete(void *opaque, int status)
-{
-VirtIOBlockIoctlReq *ioctl_req = opaque;
-VirtIOBlockReq *req = ioctl_req->req;
-VirtIOBlock *s = req->dev;
-VirtIODevice *vdev = VIRTIO_DEVICE(s);
-struct virtio_scsi_inhdr *scsi;
-struct sg_io_hdr *hdr;
-
-scsi = (void *)req->elem.in_sg[req->elem.in_num - 2].iov_base;
-
-if (status) {
-status = VIRTIO_BLK_S_UNSUPP;
-virtio_stl_p(vdev, >errors, 255);
-goto out;
-}
-
-hdr = _req->hdr;
-/*
- * From SCSI-Generic-HOWTO: "Some lower level drivers (e.g. ide-scsi)
- * clear the masked_status field [hence status gets cleared 

Re: [ANNOUNCE] libblkio v0.1.0 preview release

2021-04-29 Thread Kevin Wolf
Am 29.04.2021 um 16:05 hat Stefan Hajnoczi geschrieben:
> Hi,
> A preview release of libblkio, a library for high-performance block I/O,
> is now available:
> 
>   https://gitlab.com/libblkio/libblkio/-/tree/v0.1.0
> 
> Applications are increasingly integrating high-performance I/O
> interfaces such as Linux io_uring, userspace device drivers, and
> vhost-user device support. The effort required to add each of these
> low-level interfaces into an application is relatively high. libblkio
> provides a single API for efficiently accessing block devices and
> eliminates the need to write custom code for each one.
> 
> The library is not yet ready to use and currently lacks many features.
> In fact, I hope this news doesn't spread too far yet because libblkio is
> at a very early stage, but feedback from the QEMU community is necessary
> at this time.

I'm a bit worried about the configuration interface. This looks an awful
lot like plain QOM properties, which have proven to result in both very
verbose (not to say messy) and rather error prone implementations.

You have to write setters/getters for every property, even if it usually
ends up doing the same thing, storing the value somewhere. Worse, these
getters and setters have to work in very different circumstances, namely
initialisation where you usually have to store the value away so that it
can be checked for consistency with other properties in .realize() or
.complete(), and property updates during runtime. Often enough, we
forget the latter and create bugs. If we don't create bugs, we usually
start with 'if (realized)' and have two completely different code paths.
Another common bug in QOM objects is forgetting to check if mandatory
properties were actually set.

Did you already consider these problems and decided to go this way
anyway, or is this more of an accidental design? And if the former, what
were the reasons that made it appealing?

Alternatives in QEMU are qdev properties (which are internally QOM
properties, but provide default implementations and are at least
automatically read-only after realize, avoiding that whole class of
bugs) and QAPI.
If this was QEMU code, I would of course go for QAPI, but a library is
something different and adding the code generator would probably be a
bit too much anyway. But the idea in the resulting code would be dealing
with native structs instead of a bunch of function calls. This would
probably be the least error prone way for the implementation, but of
course, it would make binary compatibility a bit harder when adding new
properties.

Thinking a bit further, we'll likely get the same problems as with QOM
in other places, too, e.g. how do you introspect which properties exist
in a given build?

Kevin


signature.asc
Description: PGP signature


Re: [ANNOUNCE] libblkio v0.1.0 preview release

2021-04-29 Thread Daniel P . Berrangé
On Thu, Apr 29, 2021 at 04:31:41PM +0100, Richard W.M. Jones wrote:
> On Thu, Apr 29, 2021 at 04:08:22PM +0100, Daniel P. Berrangé wrote:
> > On Thu, Apr 29, 2021 at 04:00:38PM +0100, Richard W.M. Jones wrote:
> > > On Thu, Apr 29, 2021 at 03:41:29PM +0100, Stefan Hajnoczi wrote:
> > > > On Thu, Apr 29, 2021 at 03:22:59PM +0100, Richard W.M. Jones wrote:
> > > > > On Thu, Apr 29, 2021 at 03:05:50PM +0100, Stefan Hajnoczi wrote:
> > > > > > The purpose of this preview release is to discuss both the API 
> > > > > > design
> > > > > > and general direction of the project. API documentation is available
> > > > > > here:
> > > > > > 
> > > > > >   https://gitlab.com/libblkio/libblkio/-/blob/v0.1.0/docs/blkio.rst
> > > > > 
> > > > > libvirt originally, and now libnbd, keep a per-thread error message
> > > > > (stored in thread-local storage).  It's a lot nicer than having to
> > > > > pass  to every function.  You can just write:
> > > > > 
> > > > >  if (nbd_connect_tcp (nbd, "remote", "nbd") == -1) {
> > > > >fprintf (stderr,
> > > > > "failed to connect to remote server: %s (errno = %d)\n",
> > > > > nbd_get_error (), nbd_get_errno ());
> > > > >exit (EXIT_FAILURE);
> > > > >  }
> > > > > 
> > > > > (https://libguestfs.org/libnbd.3.html#ERROR-HANDLING)
> > > > > 
> > > > > It means you can extend the range of error information available in
> > > > > future.  Also you can return a 'const char *' and the application
> > > > > doesn't have to worry about lifetimes, at least in the common case.
> > > > 
> > > > Thanks for sharing the idea, I think it would work well for libblkio
> > > > too.
> > > > 
> > > > Do you ignore the dlclose(3) memory leak?
> > > 
> > > I believe this mechanism in libnbd ensures there is no leak in the
> > > ordinary shared library (not dlopen/dlclose) case:
> > > 
> > > https://gitlab.com/nbdkit/libnbd/-/blob/f9257a9fdc68706a4079deb4ced61e1d468f28d6/lib/errors.c#L35
> > > 
> > > However I'm not sure what happens in the dlopen case, so I'm going to
> > > test that out now ...
> > 
> > pthread_key destructors are a disaster waiting to happen with
> > dlclose, if the dlclose happens while non-main threads are
> > still running. When those threads exit, they'll run the
> > destructor which points to a function that is no longer
> > resident in memory.
> > 
> > IOW if you have destrutors, then you need to make sure your
> > library uses "-z nodelete" when linking, to turn dlclose()
> > into a no-op.
> 
> I suspect letting the string leak is a better idea for libnbd.

Is dlclose() really that important ? libnbd is such a small
thing that i doubt anyone will even notice the space being
consumed if you mark it nodelete, and if an app is repeatedly
doing an op that triggers dlopen+dlclose many times, then
dlclose is especially useless.

Personally I think removing memory leaks on thread exit is
more important than honouring dlclose, as some apps can do
pathelogical things creating *many* very short lived
threads.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [ANNOUNCE] libblkio v0.1.0 preview release

2021-04-29 Thread Richard W.M. Jones
On Thu, Apr 29, 2021 at 04:08:22PM +0100, Daniel P. Berrangé wrote:
> On Thu, Apr 29, 2021 at 04:00:38PM +0100, Richard W.M. Jones wrote:
> > On Thu, Apr 29, 2021 at 03:41:29PM +0100, Stefan Hajnoczi wrote:
> > > On Thu, Apr 29, 2021 at 03:22:59PM +0100, Richard W.M. Jones wrote:
> > > > On Thu, Apr 29, 2021 at 03:05:50PM +0100, Stefan Hajnoczi wrote:
> > > > > The purpose of this preview release is to discuss both the API design
> > > > > and general direction of the project. API documentation is available
> > > > > here:
> > > > > 
> > > > >   https://gitlab.com/libblkio/libblkio/-/blob/v0.1.0/docs/blkio.rst
> > > > 
> > > > libvirt originally, and now libnbd, keep a per-thread error message
> > > > (stored in thread-local storage).  It's a lot nicer than having to
> > > > pass  to every function.  You can just write:
> > > > 
> > > >  if (nbd_connect_tcp (nbd, "remote", "nbd") == -1) {
> > > >fprintf (stderr,
> > > > "failed to connect to remote server: %s (errno = %d)\n",
> > > > nbd_get_error (), nbd_get_errno ());
> > > >exit (EXIT_FAILURE);
> > > >  }
> > > > 
> > > > (https://libguestfs.org/libnbd.3.html#ERROR-HANDLING)
> > > > 
> > > > It means you can extend the range of error information available in
> > > > future.  Also you can return a 'const char *' and the application
> > > > doesn't have to worry about lifetimes, at least in the common case.
> > > 
> > > Thanks for sharing the idea, I think it would work well for libblkio
> > > too.
> > > 
> > > Do you ignore the dlclose(3) memory leak?
> > 
> > I believe this mechanism in libnbd ensures there is no leak in the
> > ordinary shared library (not dlopen/dlclose) case:
> > 
> > https://gitlab.com/nbdkit/libnbd/-/blob/f9257a9fdc68706a4079deb4ced61e1d468f28d6/lib/errors.c#L35
> > 
> > However I'm not sure what happens in the dlopen case, so I'm going to
> > test that out now ...
> 
> pthread_key destructors are a disaster waiting to happen with
> dlclose, if the dlclose happens while non-main threads are
> still running. When those threads exit, they'll run the
> destructor which points to a function that is no longer
> resident in memory.
> 
> IOW if you have destrutors, then you need to make sure your
> library uses "-z nodelete" when linking, to turn dlclose()
> into a no-op.

I suspect letting the string leak is a better idea for libnbd.

Still trying to write a nice reproducer ..

Rich.

>   commit 8e44e5593eb9b89fbc0b54fde15f130707a0d81e
>   Author: Daniel P. Berrangé 
>   Date:   Thu Sep 1 17:57:06 2011 +0100
> 
> Prevent crash from dlclose() of libvirt.so
> 
> When libvirt calls virInitialize it creates a thread local
> for the virErrorPtr storage, and registers a callback to
> cleanup memory when a thread exits. When libvirt is dlclose()d
> or otherwise made non-resident, the callback function is
> removed from memory, but the thread local may still exist
> and if a thread later exists, it will invoke the callback
> and SEGV. There may also be other thread locals with callbacks
> pointing to libvirt code, so it is in general never safe to
> unload libvirt.so from memory once initialized.
> 
> To allow dlclose() to succeed, but keep libvirt.so resident
> in memory, link with '-z nodelete'. This issue was first
> found with the libvirt CIM provider, but can potentially
> hit many of the dynamic language bindings which all ultimately
> involve dlopen() in some way, either on libvirt.so itself,
> or on the glue code for the binding which in turns links
> to libvirt
> 
> 
> 
> Regards,
> Daniel
> -- 
> |: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org -o-https://fstop138.berrange.com :|
> |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-p2v converts physical machines to virtual machines.  Boot with a
live CD or over the network (PXE) and turn machines into KVM guests.
http://libguestfs.org/virt-v2v




Re: [ANNOUNCE] libblkio v0.1.0 preview release

2021-04-29 Thread Daniel P . Berrangé
On Thu, Apr 29, 2021 at 04:00:38PM +0100, Richard W.M. Jones wrote:
> On Thu, Apr 29, 2021 at 03:41:29PM +0100, Stefan Hajnoczi wrote:
> > On Thu, Apr 29, 2021 at 03:22:59PM +0100, Richard W.M. Jones wrote:
> > > On Thu, Apr 29, 2021 at 03:05:50PM +0100, Stefan Hajnoczi wrote:
> > > > The purpose of this preview release is to discuss both the API design
> > > > and general direction of the project. API documentation is available
> > > > here:
> > > > 
> > > >   https://gitlab.com/libblkio/libblkio/-/blob/v0.1.0/docs/blkio.rst
> > > 
> > > libvirt originally, and now libnbd, keep a per-thread error message
> > > (stored in thread-local storage).  It's a lot nicer than having to
> > > pass  to every function.  You can just write:
> > > 
> > >  if (nbd_connect_tcp (nbd, "remote", "nbd") == -1) {
> > >fprintf (stderr,
> > > "failed to connect to remote server: %s (errno = %d)\n",
> > > nbd_get_error (), nbd_get_errno ());
> > >exit (EXIT_FAILURE);
> > >  }
> > > 
> > > (https://libguestfs.org/libnbd.3.html#ERROR-HANDLING)
> > > 
> > > It means you can extend the range of error information available in
> > > future.  Also you can return a 'const char *' and the application
> > > doesn't have to worry about lifetimes, at least in the common case.
> > 
> > Thanks for sharing the idea, I think it would work well for libblkio
> > too.
> > 
> > Do you ignore the dlclose(3) memory leak?
> 
> I believe this mechanism in libnbd ensures there is no leak in the
> ordinary shared library (not dlopen/dlclose) case:
> 
> https://gitlab.com/nbdkit/libnbd/-/blob/f9257a9fdc68706a4079deb4ced61e1d468f28d6/lib/errors.c#L35
> 
> However I'm not sure what happens in the dlopen case, so I'm going to
> test that out now ...

pthread_key destructors are a disaster waiting to happen with
dlclose, if the dlclose happens while non-main threads are
still running. When those threads exit, they'll run the
destructor which points to a function that is no longer
resident in memory.

IOW if you have destrutors, then you need to make sure your
library uses "-z nodelete" when linking, to turn dlclose()
into a no-op.

  commit 8e44e5593eb9b89fbc0b54fde15f130707a0d81e
  Author: Daniel P. Berrangé 
  Date:   Thu Sep 1 17:57:06 2011 +0100

Prevent crash from dlclose() of libvirt.so

When libvirt calls virInitialize it creates a thread local
for the virErrorPtr storage, and registers a callback to
cleanup memory when a thread exits. When libvirt is dlclose()d
or otherwise made non-resident, the callback function is
removed from memory, but the thread local may still exist
and if a thread later exists, it will invoke the callback
and SEGV. There may also be other thread locals with callbacks
pointing to libvirt code, so it is in general never safe to
unload libvirt.so from memory once initialized.

To allow dlclose() to succeed, but keep libvirt.so resident
in memory, link with '-z nodelete'. This issue was first
found with the libvirt CIM provider, but can potentially
hit many of the dynamic language bindings which all ultimately
involve dlopen() in some way, either on libvirt.so itself,
or on the glue code for the binding which in turns links
to libvirt



Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [ANNOUNCE] libblkio v0.1.0 preview release

2021-04-29 Thread Richard W.M. Jones
On Thu, Apr 29, 2021 at 03:41:29PM +0100, Stefan Hajnoczi wrote:
> On Thu, Apr 29, 2021 at 03:22:59PM +0100, Richard W.M. Jones wrote:
> > On Thu, Apr 29, 2021 at 03:05:50PM +0100, Stefan Hajnoczi wrote:
> > > The purpose of this preview release is to discuss both the API design
> > > and general direction of the project. API documentation is available
> > > here:
> > > 
> > >   https://gitlab.com/libblkio/libblkio/-/blob/v0.1.0/docs/blkio.rst
> > 
> > libvirt originally, and now libnbd, keep a per-thread error message
> > (stored in thread-local storage).  It's a lot nicer than having to
> > pass  to every function.  You can just write:
> > 
> >  if (nbd_connect_tcp (nbd, "remote", "nbd") == -1) {
> >fprintf (stderr,
> > "failed to connect to remote server: %s (errno = %d)\n",
> > nbd_get_error (), nbd_get_errno ());
> >exit (EXIT_FAILURE);
> >  }
> > 
> > (https://libguestfs.org/libnbd.3.html#ERROR-HANDLING)
> > 
> > It means you can extend the range of error information available in
> > future.  Also you can return a 'const char *' and the application
> > doesn't have to worry about lifetimes, at least in the common case.
> 
> Thanks for sharing the idea, I think it would work well for libblkio
> too.
> 
> Do you ignore the dlclose(3) memory leak?

I believe this mechanism in libnbd ensures there is no leak in the
ordinary shared library (not dlopen/dlclose) case:

https://gitlab.com/nbdkit/libnbd/-/blob/f9257a9fdc68706a4079deb4ced61e1d468f28d6/lib/errors.c#L35

However I'm not sure what happens in the dlopen case, so I'm going to
test that out now ...

> > > Examples are available here:
> > > 
> > >   https://gitlab.com/libblkio/libblkio/-/tree/v0.1.0/examples
> > > 
> > > The goal is to eventually include the following drivers:
> > > - Linux io_uring
> > > - NVMe (VFIO and vfio-user)
> > > - virtio-blk (VFIO, vfio-user, vhost-user-blk, and vhost-vdpa-blk)
> > >
> > > There are a few reasons why libblkio is needed:
> > > 
> > > 1. QEMU, Ceph, GlusterFS, MariaDB, and other programs have been adding
> > >more low-level block I/O code, most of it duplicated. Usually only
> > >one or two of Linux AIO, io_uring, userspace drivers, vhost-user
> > >drivers, etc are implemented. This makes it difficult to benefit from
> > >the latest advances in high-performance block I/O.
> > > 
> > > 2. Coding to a standard API makes it possible to introduce new
> > >optimizations or hardware interfaces without costly changes to the
> > >software stack.
> > > 
> > > 3. A client library is needed so applications can take advantage of
> > >qemu-storage-daemon's vhost-user-blk exports.
> > > 
> > > 4. Implementing block I/O as a library allows QEMU to use Rust for new
> > >code without messy QEMU internal API bindings. Note that libblkio
> > >currently does not provide a Rust crate, it only offers a C API.
> > 
> > This is where I get confused about what this library actually does.
> > It's not just a nicer wrapper around io_uring, but what is it actually
> > doing?
> 
> It's a library for what QEMU calls protocol drivers (POSIX files,
> userspace NVMe driver, etc). In particular, anything offering
> multi-queue block I/O fits into libblkio.
> 
> It is not intended to replace libnbd or other network storage libraries.
> libblkio's properties API is synchronous to keep things simple for
> applications. Attaching to network storage needs to be asynchronous,
> although the libblkio API could be extended if people want to support
> network storage.

I think what confuses me is why is NVMe any different from io_uring?
ie would this work?

$ blkio-info --output=json io_uring path=/dev/nvme0

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/




Re: [ANNOUNCE] libblkio v0.1.0 preview release

2021-04-29 Thread Stefan Hajnoczi
On Thu, Apr 29, 2021 at 03:22:59PM +0100, Richard W.M. Jones wrote:
> On Thu, Apr 29, 2021 at 03:05:50PM +0100, Stefan Hajnoczi wrote:
> > The purpose of this preview release is to discuss both the API design
> > and general direction of the project. API documentation is available
> > here:
> > 
> >   https://gitlab.com/libblkio/libblkio/-/blob/v0.1.0/docs/blkio.rst
> 
> libvirt originally, and now libnbd, keep a per-thread error message
> (stored in thread-local storage).  It's a lot nicer than having to
> pass  to every function.  You can just write:
> 
>  if (nbd_connect_tcp (nbd, "remote", "nbd") == -1) {
>fprintf (stderr,
> "failed to connect to remote server: %s (errno = %d)\n",
> nbd_get_error (), nbd_get_errno ());
>exit (EXIT_FAILURE);
>  }
> 
> (https://libguestfs.org/libnbd.3.html#ERROR-HANDLING)
> 
> It means you can extend the range of error information available in
> future.  Also you can return a 'const char *' and the application
> doesn't have to worry about lifetimes, at least in the common case.

Thanks for sharing the idea, I think it would work well for libblkio
too.

Do you ignore the dlclose(3) memory leak?

> > Examples are available here:
> > 
> >   https://gitlab.com/libblkio/libblkio/-/tree/v0.1.0/examples
> > 
> > The goal is to eventually include the following drivers:
> > - Linux io_uring
> > - NVMe (VFIO and vfio-user)
> > - virtio-blk (VFIO, vfio-user, vhost-user-blk, and vhost-vdpa-blk)
> >
> > There are a few reasons why libblkio is needed:
> > 
> > 1. QEMU, Ceph, GlusterFS, MariaDB, and other programs have been adding
> >more low-level block I/O code, most of it duplicated. Usually only
> >one or two of Linux AIO, io_uring, userspace drivers, vhost-user
> >drivers, etc are implemented. This makes it difficult to benefit from
> >the latest advances in high-performance block I/O.
> > 
> > 2. Coding to a standard API makes it possible to introduce new
> >optimizations or hardware interfaces without costly changes to the
> >software stack.
> > 
> > 3. A client library is needed so applications can take advantage of
> >qemu-storage-daemon's vhost-user-blk exports.
> > 
> > 4. Implementing block I/O as a library allows QEMU to use Rust for new
> >code without messy QEMU internal API bindings. Note that libblkio
> >currently does not provide a Rust crate, it only offers a C API.
> 
> This is where I get confused about what this library actually does.
> It's not just a nicer wrapper around io_uring, but what is it actually
> doing?

It's a library for what QEMU calls protocol drivers (POSIX files,
userspace NVMe driver, etc). In particular, anything offering
multi-queue block I/O fits into libblkio.

It is not intended to replace libnbd or other network storage libraries.
libblkio's properties API is synchronous to keep things simple for
applications. Attaching to network storage needs to be asynchronous,
although the libblkio API could be extended if people want to support
network storage.

Stefan


signature.asc
Description: PGP signature


Re: [ANNOUNCE] libblkio v0.1.0 preview release

2021-04-29 Thread Richard W.M. Jones
On Thu, Apr 29, 2021 at 03:05:50PM +0100, Stefan Hajnoczi wrote:
> The purpose of this preview release is to discuss both the API design
> and general direction of the project. API documentation is available
> here:
> 
>   https://gitlab.com/libblkio/libblkio/-/blob/v0.1.0/docs/blkio.rst

libvirt originally, and now libnbd, keep a per-thread error message
(stored in thread-local storage).  It's a lot nicer than having to
pass  to every function.  You can just write:

 if (nbd_connect_tcp (nbd, "remote", "nbd") == -1) {
   fprintf (stderr,
"failed to connect to remote server: %s (errno = %d)\n",
nbd_get_error (), nbd_get_errno ());
   exit (EXIT_FAILURE);
 }

(https://libguestfs.org/libnbd.3.html#ERROR-HANDLING)

It means you can extend the range of error information available in
future.  Also you can return a 'const char *' and the application
doesn't have to worry about lifetimes, at least in the common case.

> Examples are available here:
> 
>   https://gitlab.com/libblkio/libblkio/-/tree/v0.1.0/examples
> 
> The goal is to eventually include the following drivers:
> - Linux io_uring
> - NVMe (VFIO and vfio-user)
> - virtio-blk (VFIO, vfio-user, vhost-user-blk, and vhost-vdpa-blk)
>
> There are a few reasons why libblkio is needed:
> 
> 1. QEMU, Ceph, GlusterFS, MariaDB, and other programs have been adding
>more low-level block I/O code, most of it duplicated. Usually only
>one or two of Linux AIO, io_uring, userspace drivers, vhost-user
>drivers, etc are implemented. This makes it difficult to benefit from
>the latest advances in high-performance block I/O.
> 
> 2. Coding to a standard API makes it possible to introduce new
>optimizations or hardware interfaces without costly changes to the
>software stack.
> 
> 3. A client library is needed so applications can take advantage of
>qemu-storage-daemon's vhost-user-blk exports.
> 
> 4. Implementing block I/O as a library allows QEMU to use Rust for new
>code without messy QEMU internal API bindings. Note that libblkio
>currently does not provide a Rust crate, it only offers a C API.

This is where I get confused about what this library actually does.
It's not just a nicer wrapper around io_uring, but what is it actually
doing?

Rich.

> For QEMU integration the next step is a libblkio BlockDriver. This new
> BlockDriver will provide libblkio functionality to QEMU. Eventually
> libblkio will provide block/file-posix.c, block/nvme.c, and the upcoming
> vhost-vdpa-blk functionality and the QEMU code can be simplified or
> removed.
> 
> libblkio does not contain an event loop implementation or disk image
> format drivers and that functionality will not be part of libblkio.
> 
> How to participate:
> 1. Share your thoughts on the direction and your requirements.
> 2. Review the API docs and give feedback.
> 3. Write a libblkio driver for NVMe VFIO/vfio-user, virtio-blk
>VFIO/vfio-user, vhost-user-blk. Or just a null driver for
>benchmarking/testing :).
> 4. Integrate libblkio into Ceph, GlusterFS, MariaDB, etc.
> 
> I am now beginning QEMU and fio integration. Stefano Garzarella is
> looking at adding a vhost-vdpa-blk driver to libblkio.
> 
> Stefan



-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html




[ANNOUNCE] libblkio v0.1.0 preview release

2021-04-29 Thread Stefan Hajnoczi
Hi,
A preview release of libblkio, a library for high-performance block I/O,
is now available:

  https://gitlab.com/libblkio/libblkio/-/tree/v0.1.0

Applications are increasingly integrating high-performance I/O
interfaces such as Linux io_uring, userspace device drivers, and
vhost-user device support. The effort required to add each of these
low-level interfaces into an application is relatively high. libblkio
provides a single API for efficiently accessing block devices and
eliminates the need to write custom code for each one.

The library is not yet ready to use and currently lacks many features.
In fact, I hope this news doesn't spread too far yet because libblkio is
at a very early stage, but feedback from the QEMU community is necessary
at this time.

The purpose of this preview release is to discuss both the API design
and general direction of the project. API documentation is available
here:

  https://gitlab.com/libblkio/libblkio/-/blob/v0.1.0/docs/blkio.rst

Examples are available here:

  https://gitlab.com/libblkio/libblkio/-/tree/v0.1.0/examples

The goal is to eventually include the following drivers:
- Linux io_uring
- NVMe (VFIO and vfio-user)
- virtio-blk (VFIO, vfio-user, vhost-user-blk, and vhost-vdpa-blk)

There are a few reasons why libblkio is needed:

1. QEMU, Ceph, GlusterFS, MariaDB, and other programs have been adding
   more low-level block I/O code, most of it duplicated. Usually only
   one or two of Linux AIO, io_uring, userspace drivers, vhost-user
   drivers, etc are implemented. This makes it difficult to benefit from
   the latest advances in high-performance block I/O.

2. Coding to a standard API makes it possible to introduce new
   optimizations or hardware interfaces without costly changes to the
   software stack.

3. A client library is needed so applications can take advantage of
   qemu-storage-daemon's vhost-user-blk exports.

4. Implementing block I/O as a library allows QEMU to use Rust for new
   code without messy QEMU internal API bindings. Note that libblkio
   currently does not provide a Rust crate, it only offers a C API.

For QEMU integration the next step is a libblkio BlockDriver. This new
BlockDriver will provide libblkio functionality to QEMU. Eventually
libblkio will provide block/file-posix.c, block/nvme.c, and the upcoming
vhost-vdpa-blk functionality and the QEMU code can be simplified or
removed.

libblkio does not contain an event loop implementation or disk image
format drivers and that functionality will not be part of libblkio.

How to participate:
1. Share your thoughts on the direction and your requirements.
2. Review the API docs and give feedback.
3. Write a libblkio driver for NVMe VFIO/vfio-user, virtio-blk
   VFIO/vfio-user, vhost-user-blk. Or just a null driver for
   benchmarking/testing :).
4. Integrate libblkio into Ceph, GlusterFS, MariaDB, etc.

I am now beginning QEMU and fio integration. Stefano Garzarella is
looking at adding a vhost-vdpa-blk driver to libblkio.

Stefan


signature.asc
Description: PGP signature


Re: [PATCH] hw/ide: Fix crash when plugging a piix3-ide device into the x-remote machine

2021-04-29 Thread Markus Armbruster
Stefan Hajnoczi  writes:

> On Wed, Apr 28, 2021 at 04:18:17PM +0200, Markus Armbruster wrote:
>> Stefan Hajnoczi  writes:

[...]

>> > The approach in this patch is okay but we should keep in mind it only
>> > solves piix3-ide. ISA provides a non-qdev backdoor API and there may be
>> > more instances of this type of bug.
>> >
>> > A qdev fix would address the root cause and make it possible to drop the
>> > backdoor API, but that's probably too much work for little benefit.
>> 
>> What do you mean by backdoor API?  Global @isabus?
>
> Yes. It's also strange that isa_get_irq(ISADevice *dev, unsigned isairq)
> accepts dev = NULL as a valid argument.

@isabus is static in hw/isa/isa-bus.c.  Uses:

* Limit isa_bus_new() to one ISA bus.  Arbitrary restriction; multiple
  ISA buses could work with suitable memory mapping and IRQ wiring.
  "Single ISA bus" assumptions could of course hide elsewhere in the
  code.

* Implied argument to isa_get_irq(), isa_register_ioport(),
  isa_register_portio_list(), isa_address_space(),
  isa_address_space_io().

  isa_get_irq() asserts that a non-null @dev is a child of @isabus.
  This means we don't actually need @isabus, except when @dev is null.
  I suspect two separate functions would be cleaner: one taking an
  ISABus * argument, and a wrapper taking an ISADevice * argument.

  isa_address_space() and isa_address_space_io() work the same, less the
  assertion.

  isa_register_ioport() and isa_register_portio_list() take a non-null
  @dev argument.  They don't actually need @isabus.

To eliminate global @isabus, we need to fix up the callers passing null
@dev.  Clean solution: plumb the ISABus returned by isa_bus_new() to the
call sites.  Where that's impractical, we can also get it from QOM, like
build_dsdt_microvm() does.




Re: [PATCH 2/5] vhost-user-blk: Use Error more consistently

2021-04-29 Thread Raphael Norwitz
Makes sense - I see why it makes reporting better at realize time.

Thanks for the clarification.

On Thu, Apr 29, 2021 at 11:26:29AM +0200, Kevin Wolf wrote:
> Am 28.04.2021 um 20:08 hat Raphael Norwitz geschrieben:
> > Code looks ok - question about the commit message.
> > 
> > Acked-by: Raphael Norwitz 
> > 
> > On Thu, Apr 22, 2021 at 07:02:18PM +0200, Kevin Wolf wrote:
> > > Now that vhost_user_blk_connect() is not called from an event handler
> > > any more, but directly from vhost_user_blk_device_realize(), we don't
> > > have to resort to error_report() any more, but can use Error again.
> > 
> > vhost_user_blk_connect() is still called by vhost_user_blk_event() which
> > is registered as an event handler. I don't understand your point around
> > us having had to use error_report() before, but not anymore. Can you
> > clarify?
> 
> What I meant is that vhost_user_blk_event() can't really make use of an
> Error object other than passing it to error_report_err(), which has the
> same result as directly using error_report().
> 
> With the new code where vhost_user_blk_device_realize() calls the
> function directly, we can actually return the error to its caller so
> that it ends up in the QMP result or the command line error message.
> 
> The result is still not great because vhost_user_blk_connect() doesn't
> know the original error message. We'd have to add Error to
> vhost_dev_init() and the functions that it calls to get the real error
> messages, but at least it's a first step in the right direction.
> 
> We already figured that we need to change error reporting so we can know
> whether we should retry, so I guess this can be solved at the same time.
> 
> Kevin
> 


Re: [PATCH 1/5] vhost-user-blk: Don't reconnect during initialisation

2021-04-29 Thread Kevin Wolf
Am 22.04.2021 um 19:02 hat Kevin Wolf geschrieben:
> This is a partial revert of commits 77542d43149 and bc79c87bcde.
> 
> Usually, an error during initialisation means that the configuration was
> wrong. Reconnecting won't make the error go away, but just turn the
> error condition into an endless loop. Avoid this and return errors
> again.
> 
> Additionally, calling vhost_user_blk_disconnect() from the chardev event
> handler could result in use-after-free because none of the
> initialisation code expects that the device could just go away in the
> middle. So removing the call fixes crashes in several places.
> 
> For example, using a num-queues setting that is incompatible with the
> backend would result in a crash like this (dereferencing dev->opaque,
> which is already NULL):
> 
>  #0  0x55d0a4bd in vhost_user_read_cb (source=0x568f4690, 
> condition=(G_IO_IN | G_IO_HUP), opaque=0x7fffcbf0) at 
> ../hw/virtio/vhost-user.c:313
>  #1  0x55d950d3 in qio_channel_fd_source_dispatch 
> (source=0x57c3f750, callback=0x55d0a478 , 
> user_data=0x7fffcbf0) at ../io/channel-watch.c:84
>  #2  0x77b32a9f in g_main_context_dispatch () at 
> /lib64/libglib-2.0.so.0
>  #3  0x77b84a98 in g_main_context_iterate.constprop () at 
> /lib64/libglib-2.0.so.0
>  #4  0x77b32163 in g_main_loop_run () at /lib64/libglib-2.0.so.0
>  #5  0x55d0a724 in vhost_user_read (dev=0x57bc62f8, 
> msg=0x7fffcc50) at ../hw/virtio/vhost-user.c:402
>  #6  0x55d0ee6b in vhost_user_get_config (dev=0x57bc62f8, 
> config=0x57bc62ac "", config_len=60) at ../hw/virtio/vhost-user.c:2133
>  #7  0x55d56d46 in vhost_dev_get_config (hdev=0x57bc62f8, 
> config=0x57bc62ac "", config_len=60) at ../hw/virtio/vhost.c:1566
>  #8  0x55cdd150 in vhost_user_blk_device_realize (dev=0x57bc60b0, 
> errp=0x7fffcf90) at ../hw/block/vhost-user-blk.c:510
>  #9  0x55d08f6d in virtio_device_realize (dev=0x57bc60b0, 
> errp=0x7fffcff0) at ../hw/virtio/virtio.c:3660
> 
> Signed-off-by: Kevin Wolf 
> ---
>  hw/block/vhost-user-blk.c | 54 ++-
>  1 file changed, 13 insertions(+), 41 deletions(-)
> 
> diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
> index f5e9682703..e824b0a759 100644
> --- a/hw/block/vhost-user-blk.c
> +++ b/hw/block/vhost-user-blk.c
> @@ -50,6 +50,8 @@ static const int user_feature_bits[] = {
>  VHOST_INVALID_FEATURE_BIT
>  };
>  
> +static void vhost_user_blk_event(void *opaque, QEMUChrEvent event);
> +
>  static void vhost_user_blk_update_config(VirtIODevice *vdev, uint8_t *config)
>  {
>  VHostUserBlk *s = VHOST_USER_BLK(vdev);
> @@ -362,19 +364,6 @@ static void vhost_user_blk_disconnect(DeviceState *dev)
>  vhost_dev_cleanup(>dev);
>  }
>  
> -static void vhost_user_blk_event(void *opaque, QEMUChrEvent event,
> - bool realized);
> -
> -static void vhost_user_blk_event_realize(void *opaque, QEMUChrEvent event)
> -{
> -vhost_user_blk_event(opaque, event, false);
> -}
> -
> -static void vhost_user_blk_event_oper(void *opaque, QEMUChrEvent event)
> -{
> -vhost_user_blk_event(opaque, event, true);
> -}
> -
>  static void vhost_user_blk_chr_closed_bh(void *opaque)
>  {
>  DeviceState *dev = opaque;
> @@ -382,12 +371,11 @@ static void vhost_user_blk_chr_closed_bh(void *opaque)
>  VHostUserBlk *s = VHOST_USER_BLK(vdev);
>  
>  vhost_user_blk_disconnect(dev);
> -qemu_chr_fe_set_handlers(>chardev, NULL, NULL,
> -vhost_user_blk_event_oper, NULL, opaque, NULL, true);
> +qemu_chr_fe_set_handlers(>chardev, NULL, NULL, vhost_user_blk_event,
> + NULL, opaque, NULL, true);
>  }
>  
> -static void vhost_user_blk_event(void *opaque, QEMUChrEvent event,
> - bool realized)
> +static void vhost_user_blk_event(void *opaque, QEMUChrEvent event)
>  {
>  DeviceState *dev = opaque;
>  VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> @@ -401,17 +389,7 @@ static void vhost_user_blk_event(void *opaque, 
> QEMUChrEvent event,
>  }
>  break;
>  case CHR_EVENT_CLOSED:
> -/*
> - * Closing the connection should happen differently on device
> - * initialization and operation stages.
> - * On initalization, we want to re-start vhost_dev initialization
> - * from the very beginning right away when the connection is closed,
> - * so we clean up vhost_dev on each connection closing.
> - * On operation, we want to postpone vhost_dev cleanup to let the
> - * other code perform its own cleanup sequence using vhost_dev data
> - * (e.g. vhost_dev_set_log).
> - */
> -if (realized && !runstate_check(RUN_STATE_SHUTDOWN)) {
> +if (!runstate_check(RUN_STATE_SHUTDOWN)) {
>  /*
>   * A close event may happen during a read/write, but 

Re: [PATCH 4/5] virtio: Fail if iommu_platform is requested, but unsupported

2021-04-29 Thread Raphael Norwitz
Got it - thanks for the clarification.

Reviewed-by: Raphael Norwitz 

On Thu, Apr 29, 2021 at 11:34:11AM +0200, Kevin Wolf wrote:
> Am 28.04.2021 um 21:24 hat Raphael Norwitz geschrieben:
> > On Thu, Apr 22, 2021 at 07:02:20PM +0200, Kevin Wolf wrote:
> > > Commit 2943b53f6 (' virtio: force VIRTIO_F_IOMMU_PLATFORM') made sure
> > > that vhost can't just reject VIRTIO_F_IOMMU_PLATFORM when it was
> > > requested. However, just adding it back to the negotiated flags isn't
> > > right either because it promises support to the guest that the device
> > > actually doesn't support. One example of a vhost-user device that
> > > doesn't have support for the flag is the vhost-user-blk export of QEMU.
> > > 
> > > Instead of successfully creating a device that doesn't work, just fail
> > > to plug the device when it doesn't support the feature, but it was
> > > requested. This results in much clearer error messages.
> > > 
> > > Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1935019
> > > Signed-off-by: Kevin Wolf 
> > > ---
> > >  hw/virtio/virtio-bus.c | 5 +
> > >  1 file changed, 5 insertions(+)
> > > 
> > > diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c
> > > index d6332d45c3..859978d248 100644
> > > --- a/hw/virtio/virtio-bus.c
> > > +++ b/hw/virtio/virtio-bus.c
> > > @@ -69,6 +69,11 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, 
> > > Error **errp)
> > >  return;
> > >  }
> > >
> > 
> > Can you explain this check a little more?
> > 
> > Above we have:
> > bool has_iommu = virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
> 
> If I underdstand the code correctly, at this point this is still the
> unchanged value of the iommu_platform=on|off qdev property as given by
> the user.
> 
> > and then we get the host features from the bckend:
> > vdev->host_features = vdc->get_features(vdev, vdev->host_features
> 
> Yes, and now a flag is only set if the user had requested it and the
> backend also supports it.
> 
> > So as is this is catching the case where vdev->host_features had
> > VIRTIO_F_IOMMU_PLATFORM set before (by default?), but doesn't now that
> > the features have been retrieved? 
> > 
> > Why not just:
> > if (!virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM)) {
> 
> We don't want to fail if the user hadn't even requested the feature, but
> just if it was requested, but could not be provided.
>
> > > +if (has_iommu && !virtio_host_has_feature(vdev, 
> > > VIRTIO_F_IOMMU_PLATFORM)) {
> > > +error_setg(errp, "iommu_platform=true is not supported by the 
> > > device");
> > > +return;
> > > +}
> > > +
> > >  if (klass->device_plugged != NULL) {
> > >  klass->device_plugged(qbus->parent, _err);
> > >  }
> 
> Kevin
> 


Re: [PATCH 4/5] virtio: Fail if iommu_platform is requested, but unsupported

2021-04-29 Thread Kevin Wolf
Am 28.04.2021 um 21:24 hat Raphael Norwitz geschrieben:
> On Thu, Apr 22, 2021 at 07:02:20PM +0200, Kevin Wolf wrote:
> > Commit 2943b53f6 (' virtio: force VIRTIO_F_IOMMU_PLATFORM') made sure
> > that vhost can't just reject VIRTIO_F_IOMMU_PLATFORM when it was
> > requested. However, just adding it back to the negotiated flags isn't
> > right either because it promises support to the guest that the device
> > actually doesn't support. One example of a vhost-user device that
> > doesn't have support for the flag is the vhost-user-blk export of QEMU.
> > 
> > Instead of successfully creating a device that doesn't work, just fail
> > to plug the device when it doesn't support the feature, but it was
> > requested. This results in much clearer error messages.
> > 
> > Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1935031
> > Signed-off-by: Kevin Wolf 
> > ---
> >  hw/virtio/virtio-bus.c | 5 +
> >  1 file changed, 5 insertions(+)
> > 
> > diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c
> > index d6332d45c3..859978d248 100644
> > --- a/hw/virtio/virtio-bus.c
> > +++ b/hw/virtio/virtio-bus.c
> > @@ -69,6 +69,11 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error 
> > **errp)
> >  return;
> >  }
> >
> 
> Can you explain this check a little more?
> 
> Above we have:
> bool has_iommu = virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);

If I underdstand the code correctly, at this point this is still the
unchanged value of the iommu_platform=on|off qdev property as given by
the user.

> and then we get the host features from the bckend:
> vdev->host_features = vdc->get_features(vdev, vdev->host_features

Yes, and now a flag is only set if the user had requested it and the
backend also supports it.

> So as is this is catching the case where vdev->host_features had
> VIRTIO_F_IOMMU_PLATFORM set before (by default?), but doesn't now that
> the features have been retrieved? 
> 
> Why not just:
> if (!virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM)) {

We don't want to fail if the user hadn't even requested the feature, but
just if it was requested, but could not be provided.

> > +if (has_iommu && !virtio_host_has_feature(vdev, 
> > VIRTIO_F_IOMMU_PLATFORM)) {
> > +error_setg(errp, "iommu_platform=true is not supported by the 
> > device");
> > +return;
> > +}
> > +
> >  if (klass->device_plugged != NULL) {
> >  klass->device_plugged(qbus->parent, _err);
> >  }

Kevin




Re: [PATCH 2/5] vhost-user-blk: Use Error more consistently

2021-04-29 Thread Kevin Wolf
Am 28.04.2021 um 20:08 hat Raphael Norwitz geschrieben:
> Code looks ok - question about the commit message.
> 
> Acked-by: Raphael Norwitz 
> 
> On Thu, Apr 22, 2021 at 07:02:18PM +0200, Kevin Wolf wrote:
> > Now that vhost_user_blk_connect() is not called from an event handler
> > any more, but directly from vhost_user_blk_device_realize(), we don't
> > have to resort to error_report() any more, but can use Error again.
> 
> vhost_user_blk_connect() is still called by vhost_user_blk_event() which
> is registered as an event handler. I don't understand your point around
> us having had to use error_report() before, but not anymore. Can you
> clarify?

What I meant is that vhost_user_blk_event() can't really make use of an
Error object other than passing it to error_report_err(), which has the
same result as directly using error_report().

With the new code where vhost_user_blk_device_realize() calls the
function directly, we can actually return the error to its caller so
that it ends up in the QMP result or the command line error message.

The result is still not great because vhost_user_blk_connect() doesn't
know the original error message. We'd have to add Error to
vhost_dev_init() and the functions that it calls to get the real error
messages, but at least it's a first step in the right direction.

We already figured that we need to change error reporting so we can know
whether we should retry, so I guess this can be solved at the same time.

Kevin