Re: [libvirt PATCH 2/2] hostdev: mdev: Lookup mdevs by sysfs path rather than mdev struct

2021-01-07 Thread Erik Skultety
On Thu, Jan 07, 2021 at 09:32:43PM +0100, Ján Tomko wrote:
> On a Thursday in 2021, Erik Skultety wrote:
> > The lookup didn't do anything apart from comparing the sysfs paths
> > anyway since that's what makes each mdev unique.
> > The most ridiculous usage of the old logic was in
> > virHostdevReAttachMediatedDevices where in order to drop an mdev
> > hostdev from the list of active devices we first had to create a new
> > mdev and use it in the lookup call. Why couldn't we have used the
> > hostdev directly? Because the hostdev and mdev structures are
> > incompatible.
> > 
> > The way mdevs are currently removed is via a write to a specific sysfs
> > attribute. If you do it while the machine which has the mdev assigned
> > is running, the write call may block (this should return a write error
> > instead and it is likely a bug in the vendor driver) until the device

I'll reword ^this bit slightly as it turned out that this behavior conforms
to the kernel device model (which it previously didn't) and a device "remove"
apparently cannot return an error (I have this via a conversation on a private
list, so I can't link it as a source of truth).

> > is no longer bound to vfio which is when the QEMU process exits.
> > 
> > The interesting part here comes afterwards when we're cleaning up and
> > call virHostdevReAttachMediatedDevices. The domain doesn't exist
> > anymore, so the list of active hostdevs needs to be updated and the
> > respective hostdevs removed from the list, but remember we had to
> > create an mdev object in the memory in order to find it in the list
> > first which will fail because the write to sysfs had already removed
> > the mdev instance from the host system.
> > And so the next time you try to start the same domain you'll get:
> > 
> > "Requested operation is not valid: mediated device  is in use by
> > driver QEMU, domain "
> > 
> 
> Is there a public bug/issue you could link here?

I just created one (gitlab issue) for bookkeeping and I'll link it here.

> 
> > Signed-off-by: Erik Skultety 
> > ---
> > src/hypervisor/virhostdev.c | 10 --
> > src/util/virmdev.c  | 16 
> > src/util/virmdev.h  |  4 ++--
> > 3 files changed, 14 insertions(+), 16 deletions(-)
> > 
> > diff --git a/src/hypervisor/virhostdev.c b/src/hypervisor/virhostdev.c
> > index aa3fc8738f..ea04382d0d 100644
> > --- a/src/hypervisor/virhostdev.c
> > +++ b/src/hypervisor/virhostdev.c
> > @@ -1975,7 +1975,7 @@ 
> > virHostdevReAttachMediatedDevices(virHostdevManagerPtr mgr,
> > 
> > virObjectLock(mgr->activeMediatedHostdevs);
> > for (i = 0; i < nhostdevs; i++) {
> > -g_autoptr(virMediatedDevice) mdev = NULL;
> > +g_autofree char * sysfspath = NULL;
> 
> extra space
> 
> > virMediatedDevicePtr tmp;
> > virDomainHostdevSubsysMediatedDevPtr mdevsrc;
> > virDomainHostdevDefPtr hostdev = hostdevs[i];
> 
> Reviewed-by: Ján Tomko 
> 

Thanks.

Erik



Re: [libvirt PATCH 2/2] hostdev: mdev: Lookup mdevs by sysfs path rather than mdev struct

2021-01-07 Thread Ján Tomko

On a Thursday in 2021, Erik Skultety wrote:

The lookup didn't do anything apart from comparing the sysfs paths
anyway since that's what makes each mdev unique.
The most ridiculous usage of the old logic was in
virHostdevReAttachMediatedDevices where in order to drop an mdev
hostdev from the list of active devices we first had to create a new
mdev and use it in the lookup call. Why couldn't we have used the
hostdev directly? Because the hostdev and mdev structures are
incompatible.

The way mdevs are currently removed is via a write to a specific sysfs
attribute. If you do it while the machine which has the mdev assigned
is running, the write call may block (this should return a write error
instead and it is likely a bug in the vendor driver) until the device
is no longer bound to vfio which is when the QEMU process exits.

The interesting part here comes afterwards when we're cleaning up and
call virHostdevReAttachMediatedDevices. The domain doesn't exist
anymore, so the list of active hostdevs needs to be updated and the
respective hostdevs removed from the list, but remember we had to
create an mdev object in the memory in order to find it in the list
first which will fail because the write to sysfs had already removed
the mdev instance from the host system.
And so the next time you try to start the same domain you'll get:

"Requested operation is not valid: mediated device  is in use by
driver QEMU, domain "



Is there a public bug/issue you could link here?


Signed-off-by: Erik Skultety 
---
src/hypervisor/virhostdev.c | 10 --
src/util/virmdev.c  | 16 
src/util/virmdev.h  |  4 ++--
3 files changed, 14 insertions(+), 16 deletions(-)

diff --git a/src/hypervisor/virhostdev.c b/src/hypervisor/virhostdev.c
index aa3fc8738f..ea04382d0d 100644
--- a/src/hypervisor/virhostdev.c
+++ b/src/hypervisor/virhostdev.c
@@ -1975,7 +1975,7 @@ virHostdevReAttachMediatedDevices(virHostdevManagerPtr 
mgr,

virObjectLock(mgr->activeMediatedHostdevs);
for (i = 0; i < nhostdevs; i++) {
-g_autoptr(virMediatedDevice) mdev = NULL;
+g_autofree char * sysfspath = NULL;


extra space


virMediatedDevicePtr tmp;
virDomainHostdevSubsysMediatedDevPtr mdevsrc;
virDomainHostdevDefPtr hostdev = hostdevs[i];


Reviewed-by: Ján Tomko 

Jano


signature.asc
Description: PGP signature


[libvirt PATCH 2/2] hostdev: mdev: Lookup mdevs by sysfs path rather than mdev struct

2021-01-07 Thread Erik Skultety
The lookup didn't do anything apart from comparing the sysfs paths
anyway since that's what makes each mdev unique.
The most ridiculous usage of the old logic was in
virHostdevReAttachMediatedDevices where in order to drop an mdev
hostdev from the list of active devices we first had to create a new
mdev and use it in the lookup call. Why couldn't we have used the
hostdev directly? Because the hostdev and mdev structures are
incompatible.

The way mdevs are currently removed is via a write to a specific sysfs
attribute. If you do it while the machine which has the mdev assigned
is running, the write call may block (this should return a write error
instead and it is likely a bug in the vendor driver) until the device
is no longer bound to vfio which is when the QEMU process exits.

The interesting part here comes afterwards when we're cleaning up and
call virHostdevReAttachMediatedDevices. The domain doesn't exist
anymore, so the list of active hostdevs needs to be updated and the
respective hostdevs removed from the list, but remember we had to
create an mdev object in the memory in order to find it in the list
first which will fail because the write to sysfs had already removed
the mdev instance from the host system.
And so the next time you try to start the same domain you'll get:

"Requested operation is not valid: mediated device  is in use by
driver QEMU, domain "

Signed-off-by: Erik Skultety 
---
 src/hypervisor/virhostdev.c | 10 --
 src/util/virmdev.c  | 16 
 src/util/virmdev.h  |  4 ++--
 3 files changed, 14 insertions(+), 16 deletions(-)

diff --git a/src/hypervisor/virhostdev.c b/src/hypervisor/virhostdev.c
index aa3fc8738f..ea04382d0d 100644
--- a/src/hypervisor/virhostdev.c
+++ b/src/hypervisor/virhostdev.c
@@ -1975,7 +1975,7 @@ virHostdevReAttachMediatedDevices(virHostdevManagerPtr 
mgr,
 
 virObjectLock(mgr->activeMediatedHostdevs);
 for (i = 0; i < nhostdevs; i++) {
-g_autoptr(virMediatedDevice) mdev = NULL;
+g_autofree char * sysfspath = NULL;
 virMediatedDevicePtr tmp;
 virDomainHostdevSubsysMediatedDevPtr mdevsrc;
 virDomainHostdevDefPtr hostdev = hostdevs[i];
@@ -1984,14 +1984,12 @@ virHostdevReAttachMediatedDevices(virHostdevManagerPtr 
mgr,
 continue;
 
 mdevsrc = >source.subsys.u.mdev;
-
-if (!(mdev = virMediatedDeviceNew(mdevsrc->uuidstr,
-  mdevsrc->model)))
-continue;
+sysfspath = virMediatedDeviceGetSysfsPath(mdevsrc->uuidstr);
 
 /* Remove from the list only mdevs assigned to @drv_name/@dom_name */
 
-tmp = virMediatedDeviceListFind(mgr->activeMediatedHostdevs, mdev);
+tmp = virMediatedDeviceListFind(mgr->activeMediatedHostdevs,
+sysfspath);
 
 /* skip inactive devices */
 if (!tmp)
diff --git a/src/util/virmdev.c b/src/util/virmdev.c
index b6df353d56..fc27e9e45d 100644
--- a/src/util/virmdev.c
+++ b/src/util/virmdev.c
@@ -308,7 +308,7 @@ int
 virMediatedDeviceListAdd(virMediatedDeviceListPtr list,
  virMediatedDevicePtr *dev)
 {
-if (virMediatedDeviceListFind(list, *dev)) {
+if (virMediatedDeviceListFind(list, (*dev)->path)) {
 virReportError(VIR_ERR_INTERNAL_ERROR,
_("device %s is already in use"), (*dev)->path);
 return -1;
@@ -354,7 +354,7 @@ virMediatedDevicePtr
 virMediatedDeviceListSteal(virMediatedDeviceListPtr list,
virMediatedDevicePtr dev)
 {
-int idx = virMediatedDeviceListFindIndex(list, dev);
+int idx = virMediatedDeviceListFindIndex(list, dev->path);
 
 return virMediatedDeviceListStealIndex(list, idx);
 }
@@ -370,13 +370,13 @@ virMediatedDeviceListDel(virMediatedDeviceListPtr list,
 
 int
 virMediatedDeviceListFindIndex(virMediatedDeviceListPtr list,
-   virMediatedDevicePtr dev)
+   const char *sysfspath)
 {
 size_t i;
 
 for (i = 0; i < list->count; i++) {
-virMediatedDevicePtr other = list->devs[i];
-if (STREQ(other->path, dev->path))
+virMediatedDevicePtr dev = list->devs[i];
+if (STREQ(sysfspath, dev->path))
 return i;
 }
 return -1;
@@ -385,11 +385,11 @@ virMediatedDeviceListFindIndex(virMediatedDeviceListPtr 
list,
 
 virMediatedDevicePtr
 virMediatedDeviceListFind(virMediatedDeviceListPtr list,
-  virMediatedDevicePtr dev)
+  const char *sysfspath)
 {
 int idx;
 
-if ((idx = virMediatedDeviceListFindIndex(list, dev)) >= 0)
+if ((idx = virMediatedDeviceListFindIndex(list, sysfspath)) >= 0)
 return list->devs[idx];
 else
 return NULL;
@@ -403,7 +403,7 @@ virMediatedDeviceIsUsed(virMediatedDevicePtr dev,
 const char *drvname, *domname;
 virMediatedDevicePtr tmp = NULL;
 
-if ((tmp =