Public bug reported:

When starting a VM with a passthrough PCIe device, the vfio_pci driver blocks while its fault handler pre-faults the entire mapped area. For PCIe devices with large BAR regions this takes a very long time to complete and causes soft lockup warnings on the host. With multiple passthrough devices that have large BAR regions, this process can take hours.

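To give a sense of scale, the amount of pre-fault work can be estimated from the BAR size alone. The following is a rough sketch, not a measurement: it assumes 4 KiB host pages and one insertion per page, and the BAR sizes and the helper name pages_to_prefault are hypothetical, chosen only for illustration.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB host pages
GIB = 1024 ** 3


def pages_to_prefault(bar_size_bytes: int, page_size: int = PAGE_SIZE) -> int:
    """Return how many page-sized insertions are needed to cover one mmap'd BAR."""
    # Ceiling division, so a partially used trailing page is still counted.
    return -(-bar_size_bytes // page_size)


# Hypothetical BAR sizes, for illustration only.
for size_gib in (1, 32, 128):
    pages = pages_to_prefault(size_gib * GIB)
    print(f"{size_gib:>4} GiB BAR -> {pages:,} pages")
```

Under these assumptions a single 128 GiB BAR needs over 33 million page insertions in one fault, and the count grows with each additional passthrough device.
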
This issue was introduced in kernel version 6.8.0-48-generic by two patches: "vfio/pci: Use unmap_mapping_range()" and "vfio/pci: Insert full vma on mmap'd MMIO fault".

The patch "vfio/pci: Use unmap_mapping_range()" rewrote the way VFIO tracks mapped regions to use the "vmf_insert_pfn" function instead of tracking them itself and using "io_remap_pfn_range". The implementation using "vmf_insert_pfn" is significantly slower.

The patch "vfio/pci: Insert full vma on mmap'd MMIO fault" introduced the pre-faulting behavior, which causes soft lockup warnings on the host while the VM launches. Without "vfio/pci: Insert full vma on mmap'd MMIO fault", a guest OS experiences significantly longer boot times because faults are generated while the passthrough PCIe devices are being configured, but the host does not see soft lockup warnings.

Both of these performance issues are resolved upstream by patchset [1], but this would be a complex backport to 6.8, with significant changes to core parts of the kernel.

The "vfio/pci: Use unmap_mapping_range()" patch was introduced as part of patchset [2], and is intended to resolve a WARN_ON splat introduced by the upstream patch ba168b52bf8e ("mm: use rwsem assertion macros for mmap_lock"). However, this mmap_lock patch is not present in noble:linux, and hence noble:linux was never impacted by the WARN_ON issue.

Thus, we can safely revert the following patches to resolve this VFIO slowdown:
- "vfio/pci: Insert full vma on mmap'd MMIO fault"
- "vfio/pci: Use unmap_mapping_range()"

[1] https://patchwork.kernel.org/project/linux-mm/list/?series=883517
[2] https://lore.kernel.org/all/20240530045236.1005864-3-alex.william...@redhat.com/

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: Invalid

** Affects: linux-nvidia (Ubuntu)
     Importance: Undecided
         Status: Invalid

** Affects: linux (Ubuntu Noble)
     Importance: Undecided
     Assignee: Jacob Martin (jacobmartin)
         Status: In Progress

** Affects: linux-nvidia (Ubuntu Noble)
     Importance: Undecided
     Assignee: Jacob Martin (jacobmartin)
         Status: In Progress

** Also affects: linux (Ubuntu Noble)
   Importance: Undecided
       Status: New

** Changed in: linux (Ubuntu)
       Status: New => Invalid

** Changed in: linux (Ubuntu Noble)
     Assignee: (unassigned) => Jacob Martin (jacobmartin)

** Also affects: linux-nvidia (Ubuntu)
   Importance: Undecided
       Status: New

** Changed in: linux-nvidia (Ubuntu)
       Status: New => Invalid

** Changed in: linux-nvidia (Ubuntu Noble)
     Assignee: (unassigned) => Jacob Martin (jacobmartin)

** Changed in: linux (Ubuntu Noble)
       Status: New => In Progress

** Changed in: linux-nvidia (Ubuntu Noble)
       Status: New => In Progress

-- 
https://bugs.launchpad.net/bugs/2089306

Title:
  vfio_pci soft lockup on VM start while using PCIe passthrough