Hi,

I'm trying to use PCI passthrough to give an NVIDIA GPU to a VM with
qemu / KVM. I've summarized my environment below and the error I get is
near the bottom. Any help would be appreciated.

There are a few guides I've been referring to already:

https://wiki.debian.org/VGAPassthrough
https://www.pugetsystems.com/labs/articles/Multiheaded-NVIDIA-Gaming-using-Ubuntu-14-04-KVM-585/
https://wiki.archlinux.org/index.php/PCI_passthrough_via_OVMF
https://bbs.archlinux.org/viewtopic.php?id=162768
http://www.linux-kvm.org/page/VGA_device_assignment
   (this mentions NVIDIA is troublesome)

- is there any other guide that is up to date and useful for NVIDIA
  users?

The host is a HP Z800 with two GPUs:
- K2200 for the host
- K420 for the VM (also tried Quadro 2000 instead of K420)

The system runs Debian jessie, using a kernel from jessie-backports:

    $ uname -a
    Linux test1 4.3.0-0.bpo.1-amd64 #1 SMP Debian 4.3.3-7~bpo8+1 (2016-01-19) x86_64 GNU/Linux

    $ qemu-system-x86_64 -version
    QEMU emulator version 2.1.2 (Debian 1:2.1+dfsg-12+deb8u5a), Copyright (c) 2003-2008 Fabrice Bellard

    $ dpkg -l | egrep -i 'ovmf|seabios'
    ii  ovmf     0~20131112.2590861a-3
    ii  seabios  1.7.5-1

The first problem I encountered was that the Xorg NVIDIA driver
initializes both GPUs. I stopped it from doing that by adding this to
the display config in /etc/X11/xorg.conf:

    Option "ProbeAllGpus" "false"

and verified that the K420 is no longer mentioned in the Xorg logs.
While testing, I've completely disabled X on the host anyway.

I extracted the ROM using:

    echo 1 | tee /sys/devices/pci0000\:40/0000\:40\:03.0/0000\:42\:00.0/rom
    cat /sys/devices/pci0000\:40/0000\:40\:03.0/0000\:42\:00.0/rom > nvidia-k420.rom

- is this expected to get a valid ROM, or could it be retrieving a
  modified image? Should I try another method to extract it?

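One quick sanity check on the dump, as a minimal sketch: a PCI expansion
ROM image starts with the bytes 0x55 0xAA, so the extracted file can at
least be tested for that signature. The function name rom_has_signature
is hypothetical, and a matching signature only shows that the file looks
like a ROM image; it does not show that the image is unmodified or that
the guest can use it.

```shell
# Hypothetical helper: succeed if FILE starts with the 0x55 0xAA
# PCI expansion ROM signature, fail otherwise.
rom_has_signature() {
    # Dump the first two bytes as hex and strip whitespace, e.g. "55aa"
    sig=$(od -An -tx1 -N2 "$1" | tr -d ' \n')
    [ "$sig" = "55aa" ]
}

# Example use against the dump taken above:
#   rom_has_signature nvidia-k420.rom && echo "signature present"
```

If the check fails, the dump is probably empty or truncated, which would
be worth ruling out before testing the romfile option any further.
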
In /etc/initramfs-tools/modules I've added:

    # for older kernel
    #pci_stub ids=10de:0ff3,10de:0e1b
    # for newer kernel
    vfio_pci ids=10de:0ff3,10de:0e1b

I also checked with "lspci -vv" to make sure the GPU doesn't have any
driver associated with it except vfio-pci.

In my /etc/modules:

    pci_stub
    vfio
    vfio_iommu_type1
    vfio_pci
    kvm
    kvm_intel

In /etc/default/grub:

    GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on vfio_iommu_type1.allow_unsafe_interrupts=1"

In /etc/modprobe.d/vfio.conf:

    options vfio-pci ids=10de:0ff3,10de:0e1b

and /etc/modprobe.d/kvm_iommu.conf:

    options kvm allow_unsafe_assigned_interrupts=1

I've tried booting the VM with:

    qemu-system-x86_64 -enable-kvm -M q35 -m 1024 -cpu host,kvm=off \
      -smp 4,sockets=1,cores=4,threads=1 \
      -bios /usr/share/seabios/bios.bin -nographic \
      -device ioh3420,bus=pcie.0,addr=1c.0,multifunction=on,port=1,chassis=1,id=root.1 \
      -device piix4-ide,bus=pcie.0,id=piix4-ide \
      -device vfio-pci,host=42:00.0,bus=root.1,addr=00.0,multifunction=on,x-vga=on,romfile=/root/nvidia-k420.rom \
      -device vfio-pci,host=42:00.1,bus=root.1,addr=00.1 \
      -soundhw ac97 \
      -drive file=/dev/mapper/vg00-win7_kvm_test1,id=disk,format=raw -device ide-hd,bus=piix4-ide.0,drive=disk \
      -drive file=${BOOT_CD},id=isocd -device ide-cd,bus=piix4-ide.1,drive=isocd \
      -usb -usbdevice host:009.004 -usbdevice host:009.005 \
      -boot order=d

and the screen remains blank (in power save mode).

If I remove "-nographic" and replace it with "-vga qxl -vnc 0:15" then
it boots correctly. The screen is still blank, but I can connect to it
with VNC.

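The driver binding check done with "lspci -vv" can also be sketched as a
small script that walks the IOMMU groups in sysfs and prints the driver
bound to each device. The function name list_iommu_groups and its
optional root argument are hypothetical (the argument only exists so the
function can be pointed at a test directory); /sys/kernel/iommu_groups
is the usual sysfs location when the IOMMU is enabled.

```shell
# Hypothetical helper: print "group <n>: <pci address> driver=<name>"
# for every device found under an iommu_groups tree.
list_iommu_groups() {
    root=${1:-/sys/kernel/iommu_groups}
    for dev in "$root"/*/devices/*; do
        # Skip the unexpanded glob when no groups exist
        [ -e "$dev" ] || continue
        # Group number is the directory name two levels above the device
        group=${dev%/devices/*}
        group=${group##*/}
        # In sysfs, "driver" is a symlink whose target is named after the driver
        if [ -L "$dev/driver" ]; then
            drv=$(basename "$(readlink "$dev/driver")")
        else
            drv=none
        fi
        echo "group $group: ${dev##*/} driver=$drv"
    done
}

# Example use on the host:
#   list_iommu_groups | grep '0000:42:'
```

If the setup described here is working as intended, 0000:42:00.0 and
0000:42:00.1 should both be listed with driver=vfio-pci, and any other
device sharing their group would show up in the same output.
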
I installed Win7 in the VM and then looked at the NVIDIA device in the
Device Manager and it had this error:

    "This device cannot start (Code 10)"

then I installed the NVIDIA driver in the VM and rebooted it and checked
again and it had this error:

    "This device cannot find enough free resources that it can use (code 12)"

I've tried with and without the "romfile" option, I've tried using the
OVMF BIOS (from the ovmf Debian package) instead of the seabios ROM and
I've tried various permutations of the PCI config (e.g. with and without
ioh3420) but so far it doesn't display anything on the second monitor.

Here are some details from dmesg:

# dmesg | egrep '42:|vfio'
[    0.000000] Command line: BOOT_IMAGE=/vmlinuz-4.3.0-0.bpo.1-amd64 root=/dev/mapper/vg00-root ro quiet intel_iommu=on vfio_iommu_type1.allow_unsafe_interrupts=1
[    0.000000] Kernel command line: BOOT_IMAGE=/vmlinuz-4.3.0-0.bpo.1-amd64 root=/dev/mapper/vg00-root ro quiet intel_iommu=on vfio_iommu_type1.allow_unsafe_interrupts=1
[    0.337426] pci 0000:42:00.0: [10de:0ff3] type 00 class 0x030000
[    0.337469] pci 0000:42:00.0: reg 0x10: [mem 0xc2000000-0xc2ffffff]
[    0.337479] pci 0000:42:00.0: reg 0x14: [mem 0xb0000000-0xbfffffff 64bit pref]
[    0.337490] pci 0000:42:00.0: reg 0x1c: [mem 0xc0000000-0xc1ffffff 64bit pref]
[    0.337498] pci 0000:42:00.0: reg 0x24: [io 0xe000-0xe07f]
[    0.337505] pci 0000:42:00.0: reg 0x30: [mem 0x00000000-0x0007ffff pref]
[    0.337597] pci 0000:42:00.1: [10de:0e1b] type 00 class 0x040300
[    0.337620] pci 0000:42:00.1: reg 0x10: [mem 0xc3000000-0xc3003fff]
[    0.352317] vgaarb: device added: PCI:0000:42:00.0,decodes=io+mem,owns=none,locks=none
[    0.352321] vgaarb: bridge control possible 0000:42:00.0
[    0.368673] pci 0000:42:00.0: BAR 6: assigned [mem 0xc3080000-0xc30fffff pref]
[    0.368715] pci_bus 0000:42: resource 0 [io 0xe000-0xefff]
[    0.368716] pci_bus 0000:42: resource 1 [mem 0xc2000000-0xc30fffff]
[    0.368717] pci_bus 0000:42: resource 2 [mem 0xb0000000-0xc1ffffff 64bit pref]
[    0.632208] iommu: Adding device 0000:42:00.0 to group 23
[    0.632227] iommu: Adding device 0000:42:00.1 to group 23
[   13.994137] vgaarb: device changed decodes: PCI:0000:42:00.0,olddecodes=io+mem,decodes=io+mem:owns=none
[   14.014338] vfio_pci: add [10de:0ff3[ffff:ffff]] class 0x000000/00000000
[   14.030225] vfio_pci: add [10de:0e1b[ffff:ffff]] class 0x000000/00000000
[  254.694080] vgaarb: device changed decodes: PCI:0000:42:00.0,olddecodes=io+mem,decodes=io+mem:owns=none
[  254.708815] vgaarb: device changed decodes: PCI:0000:42:00.0,olddecodes=io+mem,decodes=io+mem:owns=none
[  255.932544] vfio-pci 0000:42:00.0: enabling device (0102 -> 0103)

and I also get an Oops when trying to use OVMF.fd instead of
seabios/bios.bin:

[  200.896541] BUG: unable to handle kernel paging request at 0000000200000068
[  200.896676] IP: [<ffffffffa041ccac>] __mtrr_lookup_var_next+0xc/0xb0 [kvm]
[  200.896764] PGD 0
[  200.896830] Oops: 0000 [#1] SMP
...
[  200.902841] Call Trace:
[  200.902897]  [<ffffffffa041d605>] ? kvm_mtrr_check_gfn_range_consistency+0xc5/0x120 [kvm]
[  200.902971]  [<ffffffffa0408fdc>] ? tdp_page_fault+0xac/0x2d0 [kvm]
[  200.903031]  [<ffffffffa040023f>] ? kvm_mmu_page_fault+0x1f/0x110 [kvm]

Can anybody comment on what I should try? Should I just try another
GPU, maybe another brand? The host needs the powerful GPU but the VM
doesn't necessarily need a lot of GPU capability.

Regards,

Daniel

_______________________________________________
vfio-users mailing list
[email protected]
https://www.redhat.com/mailman/listinfo/vfio-users
