Re: [vfio-users] NVIDIA error: Failed to initialize DMA. Failed to allocate push buffer

2021-12-30 Thread Bronek Kozicki
Yes, this was it!

Before making any more changes, I looked in /proc/iomem and found this:

1802100-20020ff : PCI Bus :80
  200-2000fff
  200-200120f : PCI Bus :81
    200-2000fff : :81:00.0
      200-2257fff : efifb <<< HERE, wrong module!
    2001000-20011ff : :81:00.0
      2001000-20011ff : vfio-pci
    2001200-2001203 : :81:00.2
      2001200-2001203 : vfio-pci
    2001204-2001204 : :81:00.2
      2001204-2001204 : vfio-pci
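
For anyone repeating this check, here is a minimal Python sketch of the same inspection: it parses /proc/iomem-style text by indentation and lists whatever is nested under a given PCI device's ranges. The function names, the sample addresses and the device address 0000:81:00.0 are illustrative assumptions, not taken from the listing above, and the sketch assumes the usual two-space nesting.

```python
import re

# One /proc/iomem line: leading indent, "start-end", " : ", owner name.
LINE_RE = re.compile(r"^(\s*)([0-9a-fA-F]+)-([0-9a-fA-F]+) : (.*)$")


def parse_iomem(text):
    """Return (depth, start, end, name) tuples; depth is indent // 2."""
    entries = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m is None:
            continue  # skip blank or malformed lines
        indent, start, end, name = m.groups()
        entries.append((len(indent) // 2, int(start, 16), int(end, 16), name.strip()))
    return entries


def claimants(entries, device):
    """Names nested below any range that belongs to `device`."""
    found = []
    device_depth = None
    for depth, _start, _end, name in entries:
        if name == device:
            device_depth = depth          # entering a range owned by the device
        elif device_depth is not None and depth > device_depth:
            found.append(name)            # nested entry, e.g. efifb or vfio-pci
        else:
            device_depth = None           # left the device's subtree
    return found


# Hypothetical sample, shaped like the listing above but with made-up addresses.
SAMPLE = """\
1000-1fff : PCI Bus 0000:80
  1000-1fff : PCI Bus 0000:81
    1000-17ff : 0000:81:00.0
      1000-13ff : efifb
    1800-1fff : 0000:81:00.0
      1800-1fff : vfio-pci
"""

print(claimants(parse_iomem(SAMPLE), "0000:81:00.0"))
```

On a real host the text would come from reading /proc/iomem as root. Any name other than vfio-pci under the GPU's ranges is a candidate for the kind of conflict described here.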

I added "video=efifb:off" to the kernel options, also added "blacklist efifb"
to modprobe.d, and restarted.
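
A small sketch, assuming Python is available on the host, for confirming after the reboot that the option actually reached the kernel. It also flags the combined form that the advice quoted further down in this message warns against. The function names and the sample command lines are hypothetical; on a real system the input would be the contents of /proc/cmdline.

```python
def efifb_disabled(cmdline):
    """True if a standalone video=efifb:off token is on the kernel command line."""
    return "video=efifb:off" in cmdline.split()


def combined_video_options(cmdline):
    """Return video= tokens that switch off more than one framebuffer at once."""
    return [
        tok
        for tok in cmdline.split()
        if tok.startswith("video=") and tok.count(":off") > 1
    ]


# Hypothetical command lines for illustration only.
good = "root=/dev/sda1 ro video=efifb:off"
combined = "root=/dev/sda1 ro video=efifb:off,vesafb:off"

print(efifb_disabled(good), combined_video_options(good))
print(efifb_disabled(combined), combined_video_options(combined))
```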

After the restart (but before starting the virtual machine), /proc/iomem for
this memory range was:

1802100-20020ff : PCI Bus :80
  200-200120f : PCI Bus :81
    200-2000fff : :81:00.0
    2001000-20011ff : :81:00.0
    2001200-2001203 : :81:00.2
    2001204-2001204 : :81:00.2

i.e. no modules listed.

After starting the virtual machine (with the GPU working - finally!), it is:

1802100-20020ff : PCI Bus :80
  200-200120f : PCI Bus :81
    200-2000fff : :81:00.0
      200-2000fff : vfio-pci <<< HERE, correct module!
    2001000-20011ff : :81:00.0
      2001000-20011ff : vfio-pci
    2001200-2001203 : :81:00.2
      2001200-2001203 : vfio-pci
    2001204-2001204 : :81:00.2
      2001204-2001204 : vfio-pci

Thanks and Happy New Year to you!


B.


On Thu, 30 Dec 2021, at 11:04 AM, Arjen wrote:
> On Thursday, December 30th, 2021 at 11:48, Bronek Kozicki wrote:
>
>> I think here is the strongest hint; the host dmesg is flooded with messages
>> "BAR 1: can't reserve"
>
> This sounds like you need to add video=efifb:off to the kernel 
> parameters.
> (Don't use a combined video=efifb:off,vesafb:off, but separate them if 
> you need them both.)
> Or release the EFI framebuffer before doing passthrough.
> Or make sure the system does not use this GPU during POST and boot.
> Or maybe some other work-around that is needed for passthrough of the 
> single GPU of the system?

-- 
  Bronek Kozicki
  b...@incorrekt.com

___
vfio-users mailing list
vfio-users@redhat.com
https://listman.redhat.com/mailman/listinfo/vfio-users



Re: [vfio-users] NVIDIA error: Failed to initialize DMA. Failed to allocate push buffer

2021-12-30 Thread Bronek Kozicki
I think here is the strongest hint; the host dmesg is flooded with messages
"BAR 1: can't reserve"


2021-12-30T10:01:59.456992+ gauss.lan.incorrekt.net kernel: vfio-pci 
:81:00.0: vfio_ecap_init: hiding ecap 0x1e@0x258
2021-12-30T10:01:59.457413+ gauss.lan.incorrekt.net kernel: vfio-pci 
:81:00.0: vfio_ecap_init: hiding ecap 0x19@0x900
2021-12-30T10:01:59.457675+ gauss.lan.incorrekt.net kernel: vfio-pci 
:81:00.0: BAR 1: can't reserve [mem 0x200-0x2000fff 64bit pref]
2021-12-30T10:01:59.486586+ gauss.lan.incorrekt.net kernel: vfio-pci 
:81:00.1: enabling device ( -> 0002)
2021-12-30T10:01:59.546592+ gauss.lan.incorrekt.net kernel: vfio-pci 
:81:00.3: enabling device ( -> 0002)

. . .

2021-12-30T10:09:58.164738+ gauss.lan.incorrekt.net kernel: vfio-pci 
:81:00.0: BAR 1: can't reserve [mem 0x200-0x2000fff 64bit pref]
2021-12-30T10:09:58.164811+ gauss.lan.incorrekt.net kernel: vfio-pci 
:81:00.0: BAR 1: can't reserve [mem 0x200-0x2000fff 64bit pref]
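
To gauge how widespread these messages are, a rough Python sketch like the following can count them per device and BAR in a saved log. The regular expression is written against the message shape shown above; the function name, the sample device address 0000:81:00.0 and the sample memory range are assumptions for illustration.

```python
import re
from collections import Counter

# Matches e.g. "vfio-pci <device>: BAR 1: can't reserve [mem ...]".
BAR_RE = re.compile(r"vfio-pci\s+(\S+): BAR (\d+): can't reserve \[mem ([^\]]+)\]")


def count_bar_failures(log_text):
    """Count "can't reserve" messages per (device, BAR index)."""
    # Mail clients and pagers may wrap long log lines, so collapse all
    # whitespace before matching.
    flat = " ".join(log_text.split())
    return Counter((dev, int(bar)) for dev, bar, _rng in BAR_RE.findall(flat))


# Hypothetical log excerpt; the second entry is wrapped on purpose.
SAMPLE_LOG = """\
kernel: vfio-pci 0000:81:00.0: vfio_ecap_init: hiding ecap 0x1e@0x258
kernel: vfio-pci
0000:81:00.0: BAR 1: can't reserve [mem 0x1000-0x1fff 64bit pref]
kernel: vfio-pci 0000:81:00.0: BAR 1: can't reserve [mem 0x1000-0x1fff 64bit pref]
"""

for (dev, bar), n in count_bar_failures(SAMPLE_LOG).items():
    print(f"{dev} BAR {bar}: {n} failed reservations")
```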


I will try adjusting host BIOS options.


B.



Re: [vfio-users] NVIDIA error: Failed to initialize DMA. Failed to allocate push buffer

2021-12-30 Thread Bronek Kozicki
Some more information:


1. The driver seems to load fine in the guest:
bronekk@euclid:~$ sudo dmesg | grep -E "nvidia|0d:00"
[0.810066] pci :0d:00.0: [10de:1eb1] type 00 class 0x03
[0.814518] pci :0d:00.0: reg 0x10: [mem 0xc000-0xc0ff]
[0.818518] pci :0d:00.0: reg 0x14: [mem 0x10-0x100fff 64bit 
pref]
[0.825110] pci :0d:00.0: reg 0x1c: [mem 0x101000-0x1011ff 64bit 
pref]
[0.829048] pci :0d:00.0: reg 0x24: [io  0x9000-0x907f]
[0.834899] pci :0d:00.0: PME# supported from D0 D3hot D3cold
[0.836042] pci :0d:00.1: [10de:10f8] type 00 class 0x040300
[0.837841] pci :0d:00.1: reg 0x10: [mem 0xc100-0xc1003fff]
[0.845020] pci :0d:00.2: [10de:1ad8] type 00 class 0x0c0330
[0.847351] pci :0d:00.2: reg 0x10: [mem 0x101200-0x101203 64bit 
pref]
[0.854518] pci :0d:00.2: reg 0x1c: [mem 0x101204-0x101204 64bit 
pref]
[0.858820] pci :0d:00.2: PME# supported from D0 D3hot D3cold
[0.862836] pci :0d:00.3: [10de:1ad9] type 00 class 0x0c8000
[0.864838] pci :0d:00.3: reg 0x10: [mem 0xc1004000-0xc1004fff]
[0.873964] pci :0d:00.3: PME# supported from D0 D3hot D3cold
[0.932598] pci :0d:00.0: vgaarb: VGA device added: 
decodes=io+mem,owns=none,locks=none
[0.934523] pci :0d:00.0: vgaarb: bridge control possible
[0.936134] pci :0d:00.0: vgaarb: setting as boot device (VGA legacy resources 
not available)
[1.440190] pci :0d:00.1: D0 power state depends on :0d:00.0
[1.441170] pci :0d:00.2: D0 power state depends on :0d:00.0
[1.443582] pci :0d:00.3: D0 power state depends on :0d:00.0
[2.619525] xhci_hcd :0d:00.2: xHCI Host Controller
[2.620624] xhci_hcd :0d:00.2: new USB bus registered, assigned bus 
number 11
[2.622792] xhci_hcd :0d:00.2: hcc params 0x0180ff05 hci version 0x110 
quirks 0x0010
[2.672211] usb usb11: SerialNumber: :0d:00.2
[2.676422] xhci_hcd :0d:00.2: xHCI Host Controller
[2.677944] xhci_hcd :0d:00.2: new USB bus registered, assigned bus 
number 12
[2.681209] xhci_hcd :0d:00.2: Host supports USB 3.1 Enhanced SuperSpeed
[2.705956] usb usb12: SerialNumber: :0d:00.2
[3.926249] nvidia: loading out-of-tree module taints kernel.
[3.927118] nvidia: module license 'NVIDIA' taints kernel.
[3.938804] nvidia: module verification failed: signature and/or required 
key missing - tainting kernel
[3.966693] nvidia-nvlink: Nvlink Core is being initialized, major device 
number 249
[3.971181] nvidia :0d:00.0: vgaarb: changed VGA decodes: 
olddecodes=io+mem,decodes=none:owns=none
[4.070078] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for 
UNIX platforms  460.91.03  Fri Jul  2 05:43:38 UTC 2021
[4.349705] [drm] [nvidia-drm] [GPU ID 0x0d00] Loading driver
[4.352647] [drm] Initialized nvidia-drm 0.0.0 20160202 for :0d:00.0 on 
minor 0
[4.527067] audit: type=1400 audit(1640858541.112:5): apparmor="STATUS" 
operation="profile_load" profile="unconfined" name="nvidia_modprobe" pid=650 
comm="apparmor_parser"
[4.527073] audit: type=1400 audit(1640858541.112:6): apparmor="STATUS" 
operation="profile_load" profile="unconfined" name="nvidia_modprobe//kmod" 
pid=650 comm="apparmor_parser"
[4.915963] snd_hda_intel :0d:00.1: Disabling MSI
[4.954737] snd_hda_intel :0d:00.1: Handle vga_switcheroo audio client
[5.244486] input: HDA NVidia HDMI/DP,pcm=3 as 
/devices/pci:00/:00:03.4/:0d:00.1/sound/card0/input6
[5.247732] input: HDA NVidia HDMI/DP,pcm=7 as 
/devices/pci:00/:00:03.4/:0d:00.1/sound/card0/input7
[5.250636] input: HDA NVidia HDMI/DP,pcm=8 as 
/devices/pci:00/:00:03.4/:0d:00.1/sound/card0/input8
[5.253520] input: HDA NVidia HDMI/DP,pcm=9 as 
/devices/pci:00/:00:03.4/:0d:00.1/sound/card0/input9
[5.256445] input: HDA NVidia HDMI/DP,pcm=10 as 
/devices/pci:00/:00:03.4/:0d:00.1/sound/card0/input10
[5.259401] input: HDA NVidia HDMI/DP,pcm=11 as 
/devices/pci:00/:00:03.4/:0d:00.1/sound/card0/input11
[5.262271] input: HDA NVidia HDMI/DP,pcm=12 as 
/devices/pci:00/:00:03.4/:0d:00.1/sound/card0/input12


bronekk@euclid:~$ sudo nvidia-smi
Thu Dec 30 10:04:48 2021
+-+
| NVIDIA-SMI 460.91.03Driver Version: