** Changed in: linux (Ubuntu)
Status: Fix Committed => Fix Released
--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1792099
Title:
device hotplug of vfio devices can lead to deadlock in
vfio_pci_release
Status in linux package in Ubuntu:
Fix Released
Status in linux source package in Bionic:
Fix Released
Bug description:
[Impact]
Attempts to hotplug devices shared with userspace (qemu) via vfio
trigger a deadlock in the kernel. A reboot is required to recover.
[Test Case]
Set up a KVM instance with attached devices, then attempt to hotplug
those devices using ipmitool.
[Regression Potential]
The change is to an uncommonly used driver. There are common code
changes, but these are a no-op in the normal case, and basic operation
should be easy to confirm.
[Other Info]
This fix has been verified by the reporter as fixing the deadlock.
===
We are seeing deadlocks during hotplug of devices under vfio.
As per the Linux kernel source code, there is a deadlock between
vfio_pci_remove() and vfio_pci_release() on PCIe hotplug events. This
issue can be avoided either by skipping the PCIe reset or by calling
device_unlock() in vfio_pci_remove() before calling
vfio_del_group_dev().
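The two avoidance options above can be illustrated with a small user-space model. This is only a sketch: every name in it (model_lock, model_hotplug_completes(), the MODE_* values) is hypothetical and not part of the kernel, and a failed try-lock stands in for a lock attempt that would block forever in the real code.

```c
#include <stdbool.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
enum model_ctx { CTX_NONE = 0, CTX_REMOVE, CTX_RELEASE };

enum model_mode {
    MODE_ORIGINAL,            /* remove path keeps the lock while it waits */
    MODE_UNLOCK_BEFORE_WAIT,  /* remove path drops the lock before waiting */
    MODE_SKIP_RESET           /* release path never asks for the lock */
};

/* Stand-in for the per-device lock: records which context owns it. */
struct model_lock { enum model_ctx owner; };

bool model_trylock(struct model_lock *l, enum model_ctx who)
{
    if (l->owner != CTX_NONE)
        return false;         /* in the real code this caller would block */
    l->owner = who;
    return true;
}

void model_unlock(struct model_lock *l, enum model_ctx who)
{
    if (l->owner == who)
        l->owner = CTX_NONE;
}

/*
 * Returns true if the release path can finish, which is the event the
 * remove path is waiting for. false models the reported deadlock.
 */
bool model_hotplug_completes(enum model_mode mode)
{
    struct model_lock dev_lock = { CTX_NONE };
    bool release_done;

    /* Flow 1: the driver core takes the device lock, then calls remove. */
    model_trylock(&dev_lock, CTX_REMOVE);
    if (mode == MODE_UNLOCK_BEFORE_WAIT)
        model_unlock(&dev_lock, CTX_REMOVE);

    /* Flow 2: user space closes the device; release wants to reset it. */
    if (mode == MODE_SKIP_RESET) {
        release_done = true;  /* no reset, so no lock is needed */
    } else {
        release_done = model_trylock(&dev_lock, CTX_RELEASE);
        if (release_done)
            model_unlock(&dev_lock, CTX_RELEASE);
    }
    return release_done;
}
```

In this model, model_hotplug_completes(MODE_ORIGINAL) reports that the release path cannot finish, while both alternative modes let it complete. It says nothing about which approach the actual fix took.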
Code flow on PCIe hotplug event:
Execution flow 1:
device_release_driver() (
https://elixir.bootlin.com/linux/latest/source/drivers/base/dd.c#L935 )
device_release_driver_internal() (
https://elixir.bootlin.com/linux/latest/source/drivers/base/dd.c#L908 )
device_lock(dev); (
https://elixir.bootlin.com/linux/latest/source/drivers/base/dd.c#L915 )
vfio_pci_remove() (
https://elixir.bootlin.com/linux/v4.18.5/source/drivers/vfio/pci/vfio_pci.c#L392
)
vfio_del_group_dev() (
https://elixir.bootlin.com/linux/v4.18.5/source/drivers/vfio/vfio.c#L923 )
send event request to user and wait for VFIO_PCI_DEVICE release in
vfio_pci_release() (
https://elixir.bootlin.com/linux/v4.18.5/source/drivers/vfio/vfio.c#L967 )
Execution flow 2 triggered by above step "send event request to user":
vfio_pci_release() (
https://elixir.bootlin.com/linux/v4.18.5/source/drivers/vfio/pci/vfio_pci.c#L392
)
vfio_pci_disable() (
https://elixir.bootlin.com/linux/v4.18.5/source/drivers/vfio/pci/vfio_pci.c#L302
)
vfio_pci_try_bus_reset() (
https://elixir.bootlin.com/linux/v4.18.5/source/drivers/vfio/pci/vfio_pci.c#L1346
)
pci_try_reset_bus() (
https://elixir.bootlin.com/linux/v4.18.5/source/drivers/pci/pci.c#L4981 )
pci_bus_save_and_disable() (
https://elixir.bootlin.com/linux/v4.18.5/source/drivers/pci/pci.c#L4760 )
pci_dev_lock(dev); (
https://elixir.bootlin.com/linux/v4.18.5/source/drivers/pci/pci.c#L4765 )
DEADLOCK here, since the PCI device lock is already held by the PCI
device remove code path in dd.c
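The two execution flows above form a circular wait: flow 1 holds the device lock and waits for flow 2 to finish the release, while flow 2 waits for the device lock. As a rough illustration, the check below looks for such a cycle in a "waits for" table. The function name and the table encoding are assumptions made for this sketch and are not taken from the kernel.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: waits_for[i] is the index of the context that
 * context i is blocked on, or -1 if context i is not blocked.
 * Every non-negative entry is assumed to be smaller than n.
 * Returns true if some context ends up waiting on itself.
 */
bool model_has_wait_cycle(const int *waits_for, size_t n)
{
    for (size_t start = 0; start < n; start++) {
        int cur = waits_for[start];

        /* Follow at most n edges; arriving back at start is a cycle. */
        for (size_t steps = 0; steps < n && cur >= 0; steps++) {
            if ((size_t)cur == start)
                return true;
            cur = waits_for[cur];
        }
    }
    return false;
}
```

With index 0 standing for the remove path and index 1 for the release path, the table {1, 0} encodes the situation described in this report and is flagged as a cycle. The table {1, -1}, where the release path is not blocked on anything, is not.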
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1792099/+subscriptions