Re: [vfio-users] NVIDIA error: Failed to initialize DMA. Failed to allocate push buffer
Yes, this was it! Before making any more changes, I looked in /proc/iomem and found this:

1802100-20020ff : PCI Bus :80 200-2000fff
  200-200120f : PCI Bus :81
    200-2000fff : :81:00.0
      200-2257fff : efifb   <<< HERE, wrong module!
    2001000-20011ff : :81:00.0
      2001000-20011ff : vfio-pci
    2001200-2001203 : :81:00.2
      2001200-2001203 : vfio-pci
    2001204-2001204 : :81:00.2
      2001204-2001204 : vfio-pci

I added "video=efifb:off" to the kernel options, also added "blacklist efifb" to modprobe.d, and restarted. After the restart (but before starting the virtual machine), /proc/iomem for this memory range was:

1802100-20020ff : PCI Bus :80
  200-200120f : PCI Bus :81
    200-2000fff : :81:00.0
    2001000-20011ff : :81:00.0
    2001200-2001203 : :81:00.2
    2001204-2001204 : :81:00.2

i.e. no modules listed. After starting the virtual machine (with the GPU working - finally!), it is:

1802100-20020ff : PCI Bus :80
  200-200120f : PCI Bus :81
    200-2000fff : :81:00.0
      200-2000fff : vfio-pci   <<< HERE, correct module!
    2001000-20011ff : :81:00.0
      2001000-20011ff : vfio-pci
    2001200-2001203 : :81:00.2
      2001200-2001203 : vfio-pci
    2001204-2001204 : :81:00.2
      2001204-2001204 : vfio-pci

Thanks and Happy New Year to you!

B.

On Thu, 30 Dec 2021, at 11:04 AM, Arjen wrote:
> On Thursday, December 30th, 2021 at 11:48, Bronek Kozicki wrote:
>
>> I think here is the strongest hint; the host dmesg is flooded with messages
>> "BAR 1: can't reserve"
>
> This sounds like you need to add video=efifb:off to the kernel
> parameters.
> (Don't use a combined video=efifb:off,vesafb:off, but separate them if
> you need them both.)
> Or release the EFI framebuffer before doing passthrough.
> Or make sure the system does not use this GPU during POST and boot.
> Or maybe some other work-around that is needed for passthrough of the
> single GPU of the system?

--
Bronek Kozicki
b...@incorrekt.com

___
vfio-users mailing list
vfio-users@redhat.com
https://listman.redhat.com/mailman/listinfo/vfio-users
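For anyone who wants to repeat the /proc/iomem check described in this message, the following is a minimal shell sketch, not a definitive tool. It assumes the usual /proc/iomem layout, in which nesting is shown by indentation and the driver that has claimed a region appears as a child of the device entry. The function name iomem_owners and its arguments are hypothetical, made up for illustration. On most systems the file has to be read as root, because unprivileged readers see zeroed addresses.

```shell
#!/bin/sh
# Hypothetical helper: list whatever is nested under each memory region of
# one PCI device in /proc/iomem-style text.
# Usage: iomem_owners DEVICE [FILE]
#   DEVICE  substring of the PCI address, e.g. 81:00.0
#   FILE    input file, defaults to /proc/iomem
iomem_owners() {
    dev=$1
    file=${2:-/proc/iomem}
    awk -v dev="$dev" '
        {
            # Nesting depth is the number of leading spaces.
            depth = match($0, /[^ ]/) - 1
            sep = index($0, " : ")
            if (sep == 0) next
            range = substr($0, depth + 1, sep - depth - 1)
            name = substr($0, sep + 3)
            # Anything indented deeper than the device entry is a claimant.
            if (inside && depth > dev_depth) {
                print range, name
                next
            }
            inside = 0
            if (index(name, dev) > 0) {
                inside = 1
                dev_depth = depth
            }
        }
    ' "$file"
}
```

Run as root, iomem_owners 81:00.0 prints one "range owner" line per claimant. In the situation described above, the first region would be expected to show efifb before the fix, nothing between the reboot and VM start, and vfio-pci once the VM is running.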
Re: [vfio-users] NVIDIA error: Failed to initialize DMA. Failed to allocate push buffer
I think this is the strongest hint: the host dmesg is flooded with "BAR 1: can't reserve" messages.

2021-12-30T10:01:59.456992+ gauss.lan.incorrekt.net kernel: vfio-pci :81:00.0: vfio_ecap_init: hiding ecap 0x1e@0x258
2021-12-30T10:01:59.457413+ gauss.lan.incorrekt.net kernel: vfio-pci :81:00.0: vfio_ecap_init: hiding ecap 0x19@0x900
2021-12-30T10:01:59.457675+ gauss.lan.incorrekt.net kernel: vfio-pci :81:00.0: BAR 1: can't reserve [mem 0x200-0x2000fff 64bit pref]
2021-12-30T10:01:59.486586+ gauss.lan.incorrekt.net kernel: vfio-pci :81:00.1: enabling device ( -> 0002)
2021-12-30T10:01:59.546592+ gauss.lan.incorrekt.net kernel: vfio-pci :81:00.3: enabling device ( -> 0002)
. . .
2021-12-30T10:09:58.164738+ gauss.lan.incorrekt.net kernel: vfio-pci :81:00.0: BAR 1: can't reserve [mem 0x200-0x2000fff 64bit pref]
2021-12-30T10:09:58.164811+ gauss.lan.incorrekt.net kernel: vfio-pci :81:00.0: BAR 1: can't reserve [mem 0x200-0x2000fff 64bit pref]

I will try adjusting host BIOS options.

B.

On Thu, 30 Dec 2021, at 10:25 AM, Bronek Kozicki wrote:
> Some more information:
> [...]
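When the host log is flooded like this, it can help to summarise which device and BAR the "can't reserve" messages refer to. Below is a rough shell sketch under stated assumptions: the function name count_bar_failures is hypothetical, and the matching relies only on the message wording visible in the log lines quoted above ("<device>: BAR <n>: can't reserve"), so other kernel versions may word it differently.

```shell
#!/bin/sh
# Hypothetical helper: count "BAR <n>: can't reserve" kernel messages per
# device and BAR. Reads kernel log text on stdin.
# Usage: dmesg | count_bar_failures
count_bar_failures() {
    awk '
        {
            for (i = 2; i + 2 <= NF; i++) {
                # Look for the token sequence: BAR, a number with a colon,
                # then the word "can" + apostrophe + "t".
                if ($i == "BAR" && $(i + 1) ~ /^[0-9]+:$/ && $(i + 2) ~ /^can.t$/) {
                    dev = $(i - 1)
                    bar = $(i + 1)
                    sub(/:$/, "", dev)   # strip trailing colon from device
                    sub(/:$/, "", bar)   # strip trailing colon from BAR number
                    count[dev " BAR " bar]++
                    break
                }
            }
        }
        END { for (k in count) print count[k], k }
    ' | sort -k2
}
```

Each output line is a count followed by the device address and BAR number. A single device and BAR dominating the output, as in the log above, points at one region being held by something else on the host.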
Re: [vfio-users] NVIDIA error: Failed to initialize DMA. Failed to allocate push buffer
Some more information:

1. The driver seems to be loading fine in the guest:

bronekk@euclid:~$ sudo dmesg | grep -E "nvidia|0d:00"
[0.810066] pci :0d:00.0: [10de:1eb1] type 00 class 0x03
[0.814518] pci :0d:00.0: reg 0x10: [mem 0xc000-0xc0ff]
[0.818518] pci :0d:00.0: reg 0x14: [mem 0x10-0x100fff 64bit pref]
[0.825110] pci :0d:00.0: reg 0x1c: [mem 0x101000-0x1011ff 64bit pref]
[0.829048] pci :0d:00.0: reg 0x24: [io 0x9000-0x907f]
[0.834899] pci :0d:00.0: PME# supported from D0 D3hot D3cold
[0.836042] pci :0d:00.1: [10de:10f8] type 00 class 0x040300
[0.837841] pci :0d:00.1: reg 0x10: [mem 0xc100-0xc1003fff]
[0.845020] pci :0d:00.2: [10de:1ad8] type 00 class 0x0c0330
[0.847351] pci :0d:00.2: reg 0x10: [mem 0x101200-0x101203 64bit pref]
[0.854518] pci :0d:00.2: reg 0x1c: [mem 0x101204-0x101204 64bit pref]
[0.858820] pci :0d:00.2: PME# supported from D0 D3hot D3cold
[0.862836] pci :0d:00.3: [10de:1ad9] type 00 class 0x0c8000
[0.864838] pci :0d:00.3: reg 0x10: [mem 0xc1004000-0xc1004fff]
[0.873964] pci :0d:00.3: PME# supported from D0 D3hot D3cold
[0.932598] pci :0d:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
[0.934523] pci :0d:00.0: vgaarb: bridge control possible
[0.936134] pci :0d:00.0: vgaarb: setting as boot device (VGA legacy resources not available)
[1.440190] pci :0d:00.1: D0 power state depends on :0d:00.0
[1.441170] pci :0d:00.2: D0 power state depends on :0d:00.0
[1.443582] pci :0d:00.3: D0 power state depends on :0d:00.0
[2.619525] xhci_hcd :0d:00.2: xHCI Host Controller
[2.620624] xhci_hcd :0d:00.2: new USB bus registered, assigned bus number 11
[2.622792] xhci_hcd :0d:00.2: hcc params 0x0180ff05 hci version 0x110 quirks 0x0010
[2.672211] usb usb11: SerialNumber: :0d:00.2
[2.676422] xhci_hcd :0d:00.2: xHCI Host Controller
[2.677944] xhci_hcd :0d:00.2: new USB bus registered, assigned bus number 12
[2.681209] xhci_hcd :0d:00.2: Host supports USB 3.1 Enhanced SuperSpeed
[2.705956] usb usb12: SerialNumber: :0d:00.2
[3.926249] nvidia: loading out-of-tree module taints kernel.
[3.927118] nvidia: module license 'NVIDIA' taints kernel.
[3.938804] nvidia: module verification failed: signature and/or required key missing - tainting kernel
[3.966693] nvidia-nvlink: Nvlink Core is being initialized, major device number 249
[3.971181] nvidia :0d:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
[4.070078] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms 460.91.03 Fri Jul 2 05:43:38 UTC 2021
[4.349705] [drm] [nvidia-drm] [GPU ID 0x0d00] Loading driver
[4.352647] [drm] Initialized nvidia-drm 0.0.0 20160202 for :0d:00.0 on minor 0
[4.527067] audit: type=1400 audit(1640858541.112:5): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe" pid=650 comm="apparmor_parser"
[4.527073] audit: type=1400 audit(1640858541.112:6): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe//kmod" pid=650 comm="apparmor_parser"
[4.915963] snd_hda_intel :0d:00.1: Disabling MSI
[4.954737] snd_hda_intel :0d:00.1: Handle vga_switcheroo audio client
[5.244486] input: HDA NVidia HDMI/DP,pcm=3 as /devices/pci:00/:00:03.4/:0d:00.1/sound/card0/input6
[5.247732] input: HDA NVidia HDMI/DP,pcm=7 as /devices/pci:00/:00:03.4/:0d:00.1/sound/card0/input7
[5.250636] input: HDA NVidia HDMI/DP,pcm=8 as /devices/pci:00/:00:03.4/:0d:00.1/sound/card0/input8
[5.253520] input: HDA NVidia HDMI/DP,pcm=9 as /devices/pci:00/:00:03.4/:0d:00.1/sound/card0/input9
[5.256445] input: HDA NVidia HDMI/DP,pcm=10 as /devices/pci:00/:00:03.4/:0d:00.1/sound/card0/input10
[5.259401] input: HDA NVidia HDMI/DP,pcm=11 as /devices/pci:00/:00:03.4/:0d:00.1/sound/card0/input11
[5.262271] input: HDA NVidia HDMI/DP,pcm=12 as /devices/pci:00/:00:03.4/:0d:00.1/sound/card0/input12

bronekk@euclid:~$ sudo nvidia-smi
Thu Dec 30 10:04:48 2021
+-+
| NVIDIA-SMI 460.91.03 Driver Version: