Public bug reported:

When starting a VM with a passthrough PCIe device, the vfio_pci driver
blocks while its fault handler pre-faults the entire mapped area. For
PCIe devices with large BAR regions this takes long enough to trigger
soft lockup warnings on the host. With multiple passthrough devices that
have large BAR regions, the process can take hours.
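
To judge whether a given device has BARs large enough to matter here, the
region sizes can be read from the device's sysfs "resource" file, where
each line holds a start address, an end address and flags in hex. The
following is a minimal sketch, not part of the original report: the helper
name bar_sizes and the sample addresses and flags are made up for
illustration.

```python
def bar_sizes(resource_text):
    """Return {bar_index: size_in_bytes} for the six standard BAR entries.

    resource_text is the content of /sys/bus/pci/devices/<BDF>/resource,
    where each line is "start end flags" in hexadecimal.
    """
    sizes = {}
    for index, line in enumerate(resource_text.splitlines()[:6]):
        fields = line.split()
        if len(fields) != 3:
            continue  # skip malformed or empty lines
        start, end, _flags = (int(field, 16) for field in fields)
        if start == 0 and end == 0:
            continue  # unused entry
        sizes[index] = end - start + 1  # the end address is inclusive
    return sizes


# Hypothetical sample: a 16 MiB BAR0 and a 64 GiB BAR1 (values invented).
SAMPLE = """\
0x00000000f6000000 0x00000000f6ffffff 0x0000000000040200
0x0000380000000000 0x0000380fffffffff 0x000000000014220c
0x0000000000000000 0x0000000000000000 0x0000000000000000
"""

for bar, size in bar_sizes(SAMPLE).items():
    print(f"BAR{bar}: {size / (1 << 20):.0f} MiB")
```

On a real host the text would come from reading the resource file of the
passthrough device instead of SAMPLE.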

This issue was introduced in kernel version 6.8.0-48-generic, with the
addition of patches "vfio/pci: Use unmap_mapping_range()" and "vfio/pci:
Insert full vma on mmap'd MMIO fault".
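
As a rough aid, the sketch below checks whether a kernel release string is
a 6.8.0 generic build at or after the ABI named above. It is an assumption
of this sketch that comparing the ABI number is enough. It says nothing
about which later ABI carries a fix, and it deliberately returns None for
other flavours (such as linux-nvidia), whose ABI numbering differs. The
function names are hypothetical.

```python
import re

# From the report: the regression first appears in 6.8.0-48-generic.
FIRST_AFFECTED = (6, 8, 0, 48)


def parse_generic_release(release):
    """Parse e.g. '6.8.0-48-generic' into (6, 8, 0, 48), else None."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)-(\d+)-generic", release)
    if match is None:
        return None
    return tuple(int(group) for group in match.groups())


def at_or_after_regression(release):
    """True/False for 6.8.0 generic kernels, None if it cannot be judged."""
    parsed = parse_generic_release(release)
    if parsed is None or parsed[:3] != FIRST_AFFECTED[:3]:
        return None  # other flavour or other base version: not covered
    return parsed >= FIRST_AFFECTED


print(at_or_after_regression("6.8.0-48-generic"))
```

On a live host the argument would typically be platform.release().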

The patch "vfio/pci: Use unmap_mapping_range()" rewrote the way VFIO
tracks mapped regions to use the "vmf_insert_pfn" function instead of
tracking them itself and using "io_remap_pfn_range". The implementation
using "vmf_insert_pfn" is significantly slower.

The patch "vfio/pci: Insert full vma on mmap'd MMIO fault" introduced
this pre-faulting behavior, causing soft lockup warnings on the host
while the VM launches.
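
To give a sense of scale: "vmf_insert_pfn" maps a single page per call, so
pre-faulting a whole mapping costs one insertion per page. The sketch
below only does that arithmetic. The 4 KiB page size and the BAR sizes are
assumptions chosen for illustration, not figures from this report.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB base pages
GIB = 1 << 30


def pages_to_insert(bar_bytes, page_size=PAGE_SIZE):
    """Number of single-page insertions needed to cover bar_bytes."""
    return -(-bar_bytes // page_size)  # ceiling division


# Hypothetical BAR sizes.
for size_gib in (16, 64, 128):
    count = pages_to_insert(size_gib * GIB)
    print(f"{size_gib} GiB BAR: {count} page insertions")
```

The count grows linearly with BAR size and is paid once per mapped BAR,
which is consistent with the launch time growing when several large-BAR
devices are passed through.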

Without "vfio/pci: Insert full vma on mmap'd MMIO fault", a guest OS
experiences significantly longer boot times as faults are generated
while configuring the passthrough PCIe devices, but the host does not
see soft lockup warnings.

Both of these performance issues are resolved upstream by patchset [1],
but this would be a complex backport to 6.8, with significant changes to
core parts of the kernel.

The "vfio/pci: Use unmap_mapping_range()" patch was introduced as part
of patchset [2], and was intended to resolve a WARN_ON splat introduced
by the upstream patch ba168b52bf8e ("mm: use rwsem assertion macros for
mmap_lock"). However, this mmap_lock patch is not present in
noble:linux, and hence noble:linux was never impacted by the WARN_ON
issue.

Thus, we can safely revert the following patches to resolve this VFIO slowdown:
- "vfio/pci: Insert full vma on mmap'd MMIO fault"
- "vfio/pci: Use unmap_mapping_range()"

[1] https://patchwork.kernel.org/project/linux-mm/list/?series=883517
[2] https://lore.kernel.org/all/20240530045236.1005864-3-alex.william...@redhat.com/

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: Invalid

** Affects: linux-nvidia (Ubuntu)
     Importance: Undecided
         Status: Invalid

** Affects: linux (Ubuntu Noble)
     Importance: Undecided
     Assignee: Jacob Martin (jacobmartin)
         Status: In Progress

** Affects: linux-nvidia (Ubuntu Noble)
     Importance: Undecided
     Assignee: Jacob Martin (jacobmartin)
         Status: In Progress

** Also affects: linux (Ubuntu Noble)
   Importance: Undecided
       Status: New

** Changed in: linux (Ubuntu)
       Status: New => Invalid

** Changed in: linux (Ubuntu Noble)
     Assignee: (unassigned) => Jacob Martin (jacobmartin)

** Also affects: linux-nvidia (Ubuntu)
   Importance: Undecided
       Status: New

** Changed in: linux-nvidia (Ubuntu)
       Status: New => Invalid

** Changed in: linux-nvidia (Ubuntu Noble)
     Assignee: (unassigned) => Jacob Martin (jacobmartin)

** Changed in: linux (Ubuntu Noble)
       Status: New => In Progress

** Changed in: linux-nvidia (Ubuntu Noble)
       Status: New => In Progress

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2089306

Title:
  vfio_pci soft lockup on VM start while using PCIe passthrough

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2089306/+subscriptions


-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
