Public bug reported:

This upstream (v6.16) fix resolves an issue when trying to pin a memfd
folio before it has been faulted in, which can lead to a crash when
CONFIG_DEBUG_VM is enabled or an accounting issue with resv_huge_pages
when that kconfig is not present. Contiguous memory is required for the
vCMDQ feature on Grace and one way of achieving that is by using huge
pages to back the VM memory.

While testing PR 179 with the 4k host kernel and a QEMU branch with the
pluggable SMMUv3 interface, I found that the VM would exhibit symptoms of the
vCMDQ not being backed by contiguous memory:
[    0.377799] acpi NVDA200C:00: tegra241_cmdqv: unexpected error reported. vintf_map: 0000000000000001, vcmdq_map 00000000:00000000:00000000:00000002
[    0.379174] arm-smmu-v3 arm-smmu-v3.0.auto: CMDQ error (cons 0x04000000): Unknown
[    0.379954] arm-smmu-v3 arm-smmu-v3.0.auto: skipping command in error state:
[    0.380632] arm-smmu-v3 arm-smmu-v3.0.auto:  0x0001000000000011
[    0.381147] arm-smmu-v3 arm-smmu-v3.0.auto:  0x0000000000000000

When this occurred, I noticed that the huge page metadata did not match 
expectations. Notably, it showed that an extra 16G of hugepages was being used 
and also reflected a negative “in reserve” count, indicating an underflow 
condition.
# grep -i hugep /proc/meminfo 
AnonHugePages:     69632 kB
ShmemHugePages:        0 kB
FileHugePages:         0 kB
HugePages_Total:      64
HugePages_Free:       32
HugePages_Rsvd:    18446744073709551600
HugePages_Surp:        0
Hugepagesize:    1048576 kB
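
The reserve count is easier to read as a signed number. The following is a minimal Python sketch that uses only the figures from the output above; the helper names parse_meminfo and as_signed64 are illustrative, and it assumes the counter is an unsigned 64-bit value that wrapped below zero.

```python
import re

# Relevant lines copied from the /proc/meminfo output in this report.
MEMINFO = """\
HugePages_Total:      64
HugePages_Free:       32
HugePages_Rsvd:    18446744073709551600
HugePages_Surp:        0
Hugepagesize:    1048576 kB
"""

def parse_meminfo(text):
    """Return {field: integer value} for each 'Name: value [kB]' line."""
    fields = {}
    for line in text.splitlines():
        m = re.match(r"(\w+):\s+(\d+)", line)
        if m:
            fields[m.group(1)] = int(m.group(2))
    return fields

def as_signed64(value):
    """Reinterpret an unsigned 64-bit value as two's-complement signed."""
    return value - (1 << 64) if value >= (1 << 63) else value

info = parse_meminfo(MEMINFO)

# Assumption: the reserve counter wrapped around as an unsigned 64-bit value.
rsvd = as_signed64(info["HugePages_Rsvd"])

# Pages currently taken out of the pool, converted from kB to GiB.
in_use = info["HugePages_Total"] - info["HugePages_Free"]
in_use_gib = in_use * info["Hugepagesize"] // (1024 * 1024)

print(rsvd)        # -16
print(in_use_gib)  # 32
```

Read this way, the reserve count is -16 pages, and 32 1G pages are in use for a 16G VM, which appears consistent with the extra 16G described above.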

After instrumenting the kernel, I was able to prove the underflow and
then noticed this upstream fix. The data also showed that the newer QEMU
branch makes more calls to memfd_pin_folios() during GPU VFIO setup,
which triggered the bug in the kernel - I never saw this bug with the
older QEMU branch we’ve been using for quite some time for Grace
virtualization. After applying the fix, I no longer see the bad huge
page metadata and the vCMDQ feature works properly with the 4k host
kernel.

Lore discussion: 
https://lkml.kernel.org/r/[email protected]
Upstream SHA: eb920662230f ("mm/hugetlb: don't crash when allocating a folio if there are no resv")

This commit picked cleanly to 24.04_linux-nvidia-6.14-next.

Testing:
GPU PT on 4k host with more huge pages than the VM requires (e.g. 32 1G 
hugepages for a 16G VM)
QEMU: https://github.com/nvmochs/QEMU/tree/smmuv3-accel-07212025_egm

qemu-system-aarch64 \
        -object iommufd,id=iommufd0 \
        -machine hmat=on -machine virt,accel=kvm,gic-version=3,ras=on,highmem-mmio-size=512G \
        -cpu host -smp cpus=4 -m size=16G,slots=2,maxmem=66G -nographic \
        -object memory-backend-file,size=8G,id=m0,mem-path=/hugepages/,prealloc=on,share=off \
        -object memory-backend-file,size=8G,id=m1,mem-path=/hugepages/,prealloc=on,share=off \
        -numa node,memdev=m0,cpus=0-3,nodeid=0 -numa node,memdev=m1,nodeid=1 \
        -numa node,nodeid=2 -numa node,nodeid=3 -numa node,nodeid=4 -numa node,nodeid=5 \
        -numa node,nodeid=6 -numa node,nodeid=7 -numa node,nodeid=8 -numa node,nodeid=9 \
        -device pxb-pcie,id=pcie.1,bus_nr=1,bus=pcie.0 \
        -device arm-smmuv3,primary-bus=pcie.1,id=smmuv3.1,accel=on,cmdqv=on \
        -device pcie-root-port,id=pcie.port1,bus=pcie.1,chassis=1,io-reserve=0 \
        -device vfio-pci-nohotplug,host=0009:01:00.0,bus=pcie.port1,rombar=0,id=dev0,iommufd=iommufd0 \
        -object acpi-generic-initiator,id=gi0,pci-dev=dev0,node=2 \
        -object acpi-generic-initiator,id=gi1,pci-dev=dev0,node=3 \
        -object acpi-generic-initiator,id=gi2,pci-dev=dev0,node=4 \
        -object acpi-generic-initiator,id=gi3,pci-dev=dev0,node=5 \
        -object acpi-generic-initiator,id=gi4,pci-dev=dev0,node=6 \
        -object acpi-generic-initiator,id=gi5,pci-dev=dev0,node=7 \
        -object acpi-generic-initiator,id=gi6,pci-dev=dev0,node=8 \
        -object acpi-generic-initiator,id=gi7,pci-dev=dev0,node=9 \
        -bios /usr/share/AAVMF/AAVMF_CODE.fd \
        -device nvme,drive=nvme0,serial=deadbeaf1,bus=pcie.0 \
        -drive file=guest.qcow2,index=0,media=disk,format=qcow2,if=none,id=nvme0 \
        -device e1000,romfile=/usr/local/share/qemu/efi-e1000.rom,netdev=net0,bus=pcie.0 \
        -netdev user,id=net0,hostfwd=tcp::5558-:22,hostfwd=tcp::5586-:5586

** Affects: linux-nvidia-6.14 (Ubuntu)
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2119577

Title:
  Backport: mm/hugetlb: don't crash when allocating a folio if there are
  no resv

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-nvidia-6.14/+bug/2119577/+subscriptions


-- 
ubuntu-bugs mailing list
[email protected]
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
