** Description changed:

+ [Impact]
+ 
+ Attempting to hotplug devices shared with userspace (qemu) via vfio
+ triggers a deadlock in the kernel.  A reboot is required to recover.
+ 
+ [Test Case]
+ 
+ Set up a KVM instance with attached devices, then attempt to hotplug
+ those devices using ipmitool.
+ 
+ [Regression Potential]
+ 
+ The change is to an uncommonly used driver.  There are also common code
+ changes, but these are a no-op in the normal case, and basic operation
+ should be easy to confirm.
+ 
+ [Other Info]
+  
+ This fix has been verified by the reporter as fixing the deadlock.
+ 
+ ===
+ 
  We are seeing deadlocks during hotplug of devices under vfio.
- 
  
  As per the Linux kernel source code, there is a deadlock situation
  between vfio_pci_remove() and vfio_pci_release() on PCIe hotplug events.
  This issue can be avoided either by skipping the PCIe reset
  functionality or by calling device_unlock() in vfio_pci_remove() before
  calling vfio_del_group_dev().
  
  Code flow on PCIe hotplug event:
  
  Execution flow 1:
-   device_release_driver() ( 
https://elixir.bootlin.com/linux/latest/source/drivers/base/dd.c#L935 )
-    device_release_driver_internal() ( 
https://elixir.bootlin.com/linux/latest/source/drivers/base/dd.c#L908 )
-    device_lock(dev); ( 
https://elixir.bootlin.com/linux/latest/source/drivers/base/dd.c#L915 )
-    vfio_pci_remove() ( 
https://elixir.bootlin.com/linux/v4.18.5/source/drivers/vfio/pci/vfio_pci.c#L392
 )
-      vfio_del_group_dev() 
https://elixir.bootlin.com/linux/v4.18.5/source/drivers/vfio/vfio.c#L923
-        send event request to user and wait for VFIO_PCI_DEVICE release in 
vfio_pci_release() ( 
https://elixir.bootlin.com/linux/v4.18.5/source/drivers/vfio/vfio.c#L967 )
+   device_release_driver() ( https://elixir.bootlin.com/linux/latest/source/drivers/base/dd.c#L935 )
+    device_release_driver_internal() ( https://elixir.bootlin.com/linux/latest/source/drivers/base/dd.c#L908 )
+    device_lock(dev); ( https://elixir.bootlin.com/linux/latest/source/drivers/base/dd.c#L915 )
+    vfio_pci_remove() ( https://elixir.bootlin.com/linux/v4.18.5/source/drivers/vfio/pci/vfio_pci.c#L392 )
+      vfio_del_group_dev() ( https://elixir.bootlin.com/linux/v4.18.5/source/drivers/vfio/vfio.c#L923 )
+        send event request to user and wait for VFIO_PCI_DEVICE release in vfio_pci_release() ( https://elixir.bootlin.com/linux/v4.18.5/source/drivers/vfio/vfio.c#L967 )
  
  Execution flow 2 triggered by above step "send event request to user":
-   vfio_pci_release() ( 
https://elixir.bootlin.com/linux/v4.18.5/source/drivers/vfio/pci/vfio_pci.c#L392
 )
-     vfio_pci_disable() ( 
https://elixir.bootlin.com/linux/v4.18.5/source/drivers/vfio/pci/vfio_pci.c#L302
 )
-       vfio_pci_try_bus_reset() ( 
https://elixir.bootlin.com/linux/v4.18.5/source/drivers/vfio/pci/vfio_pci.c#L1346
 )
-         pci_try_reset_bus() ( 
https://elixir.bootlin.com/linux/v4.18.5/source/drivers/pci/pci.c#L4981 )
-           pci_bus_save_and_disable() ( 
https://elixir.bootlin.com/linux/v4.18.5/source/drivers/pci/pci.c#L4760 )
-             pci_dev_lock(dev); ( 
https://elixir.bootlin.com/linux/v4.18.5/source/drivers/pci/pci.c#L4765 )
+   vfio_pci_release() ( https://elixir.bootlin.com/linux/v4.18.5/source/drivers/vfio/pci/vfio_pci.c#L392 )
+     vfio_pci_disable() ( https://elixir.bootlin.com/linux/v4.18.5/source/drivers/vfio/pci/vfio_pci.c#L302 )
+       vfio_pci_try_bus_reset() ( https://elixir.bootlin.com/linux/v4.18.5/source/drivers/vfio/pci/vfio_pci.c#L1346 )
+         pci_try_reset_bus() ( https://elixir.bootlin.com/linux/v4.18.5/source/drivers/pci/pci.c#L4981 )
+           pci_bus_save_and_disable() ( https://elixir.bootlin.com/linux/v4.18.5/source/drivers/pci/pci.c#L4760 )
+             pci_dev_lock(dev); ( https://elixir.bootlin.com/linux/v4.18.5/source/drivers/pci/pci.c#L4765 )
  
-              DEADLOCK here since PCI_DEVICE_LOCK is held by PCI_DEVICE
+              DEADLOCK here since PCI_DEVICE_LOCK is held by PCI_DEVICE
  remove code path in dd.c
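
The two execution flows above can be modelled in user space. The following
is a minimal sketch only: it is not kernel code and not necessarily the fix
that was applied. It uses POSIX threads, and every name in it (device_lock,
remove_path, release_path, reset_skipped) is a hypothetical stand-in for the
kernel objects named in the report. It illustrates the first avoidance
option mentioned in the description, skipping the reset when the device
lock cannot be taken, by using a trylock in the release path.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-device lock taken in dd.c. */
static pthread_mutex_t device_lock = PTHREAD_MUTEX_INITIALIZER;

/* Protects 'released', which the remove path waits on. */
static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t released_cond = PTHREAD_COND_INITIALIZER;
static bool released = false;
static bool reset_skipped = false;

/* Flow 2 analogue: user space releases the device and wants a reset. */
static void *release_path(void *arg)
{
    (void)arg;

    /*
     * A blocking pthread_mutex_lock() here would reproduce the deadlock:
     * the remove path holds device_lock and is waiting for this thread.
     * A trylock lets the release path skip the reset instead.
     */
    if (pthread_mutex_trylock(&device_lock) == 0) {
        /* The reset would be performed here. */
        pthread_mutex_unlock(&device_lock);
    } else {
        reset_skipped = true;
    }

    /* Tell the remove path that the device has been released. */
    pthread_mutex_lock(&state_lock);
    released = true;
    pthread_cond_signal(&released_cond);
    pthread_mutex_unlock(&state_lock);
    return NULL;
}

/*
 * Flow 1 analogue: take the device lock, ask "user space" to release the
 * device, and wait for that release while still holding the lock.
 * Returns true if the release path had to skip the reset.
 * Intended to be called once per process, since the state is static.
 */
static bool remove_path(void)
{
    pthread_t user;
    int rc;

    pthread_mutex_lock(&device_lock);

    /* "send event request to user" */
    rc = pthread_create(&user, NULL, release_path, NULL);
    assert(rc == 0);

    /* Wait for the release while device_lock is still held. */
    pthread_mutex_lock(&state_lock);
    while (!released)
        pthread_cond_wait(&released_cond, &state_lock);
    pthread_mutex_unlock(&state_lock);

    pthread_mutex_unlock(&device_lock);
    pthread_join(user, NULL);
    return reset_skipped;
}
```

Calling remove_path() from a main() function (built with -pthread) returns
true: the remove path holds device_lock for the whole time the release path
runs, so the trylock fails and the reset is skipped rather than blocking.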

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1792099

Title:
  device hotplug of vfio devices can lead to deadlock in
  vfio_pci_release

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1792099/+subscriptions
