Re: Issues with GTX960 on CentOS7 using bhyve PCI passthru (FreeBSD 11-RC2)
> Found my original attempt by modifying /usr/src/sys/amd64/vmm/x86.c
> Unified diff follows, but this didn't work for me.
> ("bhyve_id[]" commented out to prevent compiler complaints)

Who knows what sort of trickery nVidia's driver is up to, besides CPUID, when determining the presence of virtualization.

Regardless of that, VGA PCIe passthrough does not work in bhyve even with a Quadro 2000 (with which the Xen people have had success). The problem appears to be in how bhyve assigns memory-mapped I/O ranges for the VGA card: it places them outside the CPU's physically addressable space. That is, bhyve does not check the value CPUID leaf 0x80000008 returns in EAX[7:0] (0x27 for my CPU, i.e. 39 bits), while it assigns 0xd000000000 and above for the large Prefetchable Memory chunks, which requires 40 address bits. At least this is my understanding of why VGA passthrough does not work, and it seems easy to fix. Could someone who knows better have a look?

Unlike Linux, a FreeBSD guest has no problem assigning a BAR range outside the addressable range -- and then panics when it tries to write to those virtual memory addresses. See [0] below.

> There doesn't seem to be support for CPUID 0x40000001 in bhyve either.

What is it supposed to do?
[0] Linux dmesg:

[0.204799] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0xe000-0xefff] (base 0xe000)
[0.205474] PCI: MMCONFIG at [mem 0xe000-0xefff] reserved in ACPI motherboard resources
[0.206080] PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug
[0.207306] ACPI: PCI Root Bridge [PC00] (domain 0000 [bus 00])
[0.207724] acpi PNP0A03:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI]
[0.208291] acpi PNP0A03:00: _OSC failed (AE_NOT_FOUND); disabling ASPM
[0.208759] acpi PNP0A03:00: host bridge window [mem 0xd0-0xd00c0f window] (ignored, not CPU addressable)
[0.209517] PCI host bridge to bus 0000:00
[0.209808] pci_bus 0000:00: root bus resource [io 0x0000-0x0cf7 window]
[0.210281] pci_bus 0000:00: root bus resource [io 0x0d00-0x1fff window]
[0.210752] pci_bus 0000:00: root bus resource [io 0x2000-0x211f window]
[0.211224] pci_bus 0000:00: root bus resource [mem 0xc000-0xc40f window]
[0.211743] pci_bus 0000:00: root bus resource [bus 00]
[...]
[0.223902] PCI: Using ACPI for IRQ routing
[0.265987] pci 0000:00:03.0: can't claim BAR 1 [mem 0xd0-0xd007ff 64bit pref]: no compatible bridge window
[0.266735] pci 0000:00:03.0: can't claim BAR 3 [mem 0xd00800-0xd00bff 64bit pref]: no compatible bridge window
[0.284717] pci 0000:00:03.0: can't claim BAR 6 [mem 0xf600-0xf607 pref]: no compatible bridge window
[...]
[0.285407] pci 0000:00:03.0: BAR 1: no space for [mem size 0x0800 64bit pref]
[0.285933] pci 0000:00:03.0: BAR 1: trying firmware assignment [mem 0xd0-0xd007ff 64bit pref]
[0.286599] pci 0000:00:03.0: BAR 1: [mem 0xd0-0xd007ff 64bit pref] conflicts with PCI mem [mem 0x-0x7f]
[0.287419] pci 0000:00:03.0: BAR 1: failed to assign [mem size 0x0800 64bit pref]
[0.287968] pci 0000:00:03.0: BAR 3: no space for [mem size 0x0400 64bit pref]
[0.288506] pci 0000:00:03.0: BAR 3: trying firmware assignment [mem 0xd00800-0xd00bff 64bit pref]
[0.289173] pci 0000:00:03.0: BAR 3: [mem 0xd00800-0xd00bff 64bit pref] conflicts with PCI mem [mem 0x-0x7f]
[0.289992] pci 0000:00:03.0: BAR 3: failed to assign [mem size 0x0400 64bit pref]
[0.290539] pci 0000:00:03.0: BAR 6: assigned [mem 0xc008-0xc00f pref]
[0.291039] pci 0000:00:01.0: BAR 6: assigned [mem 0xc0002000-0xc00027ff pref]
[0.291540] pci 0000:00:02.0: BAR 6: assigned [mem 0xc0002800-0xc0002fff pref]

Cannot get output from Linux's `lspci -vvn` booted with the "pci=nocrs" kernel option, as it now hangs close to the end of the boot process (not sure why; it was able to finish booting before).

Another machine:

vgapci0@pci0:1:0:0:	class=0x030000 card=0x083510de chip=0x0df810de rev=0xa1 hdr=0x00
    vendor     = 'nVidia Corporation'
    device     = 'GF108 [Quadro 600]'
    class      = display
    subclass   = VGA
    bar   [10] = type Memory, range 32, base 0xfa000000, size 16777216, enabled
    bar   [14] = type Prefetchable Memory, range 64, base 0xe8000000, size 134217728, enabled
    bar   [1c] = type Prefetchable Memory, range 64, base 0xf0000000, size 33554432, enabled
    bar   [24] = type I/O Port, range 32, base 0xe000, size 128, enabled
hdac0@pci0:1:0:1:	class=0x040300 card=0x083510de chip=0x0bea10de rev=0xa1 hdr=0x00
    vendor     = 'nVidia Corporation'
    device     = 'GF108 High Definition Audio Controller'
    class      = multimedia
    subclass   = HDA
    bar   [10] = type Memory, range 32, base 0xfb080000, size 16384, enabled

Host:

ppt0@pci0:1:0:0:	class=0x030000 card=0x084a10de chip=0x0dd810de rev=0xa
Re: Issues with GTX960 on CentOS7 using bhyve PCI passthru (FreeBSD 11-RC2)
Found my original attempt by modifying /usr/src/sys/amd64/vmm/x86.c
Unified diff follows, but this didn't work for me.
("bhyve_id[]" commented out to prevent compiler complaints)
There doesn't seem to be support for CPUID 0x40000001 in bhyve either.

--- x86.c.orig	2016-09-11 14:40:22.410462000 +0100
+++ x86.c	2016-09-11 15:53:14.182186000 +0100
@@ -52,7 +52,7 @@
 #define	CPUID_VM_HIGH		0x40000000
 
-static const char bhyve_id[12] = "bhyve bhyve ";
+/* static const char bhyve_id[12] = "bhyve bhyve "; */
 
 static uint64_t bhyve_xcpuids;
 SYSCTL_ULONG(_hw_vmm, OID_AUTO, bhyve_xcpuids, CTLFLAG_RW, &bhyve_xcpuids, 0,
@@ -236,7 +236,7 @@
 			regs[2] &= ~(CPUID2_VMX | CPUID2_EST | CPUID2_TM2);
 			regs[2] &= ~(CPUID2_SMX);
 
-			regs[2] |= CPUID2_HV;
+			/* regs[2] |= CPUID2_HV; */
 
 			if (x2apic_state != X2APIC_DISABLED)
 				regs[2] |= CPUID2_X2APIC;
@@ -463,12 +463,15 @@
 			}
 			break;
 
+		/*
+		 * Don't expose KVM to guest
 		case 0x40000000:
 			regs[0] = CPUID_VM_HIGH;
 			bcopy(bhyve_id, &regs[1], 4);
 			bcopy(bhyve_id + 4, &regs[2], 4);
 			bcopy(bhyve_id + 8, &regs[3], 4);
 			break;
+		*/
 
 		default:

___
freebsd-virtualization@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-virtualization
To unsubscribe, send any mail to "freebsd-virtualization-unsubscr...@freebsd.org"
Re: Issues with GTX960 on CentOS7 using bhyve PCI passthru (FreeBSD 11-RC2)
With QEMU, they have the "kvm=off" option which hides hypervisor info from the guest. See:

https://www.redhat.com/archives/libvir-list/2014-August/msg00512.html

I did try to replicate this a while back but didn't have much success -- maybe I missed a flag? The QEMU diff seems relatively small, see:

http://lists.gnu.org/archive/html/qemu-devel/2014-06/msg00302.html

Having another go at doing this is on my to-do list, but not very near the top!
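For reference, the QEMU-side switch is a `-cpu` property. A minimal invocation might look like the following (the BDF 01:00.0, memory size, and disk image path are placeholders for your own setup):

```shell
# Hide the KVM hypervisor signature (CPUID.1:ECX bit 31 and the
# 0x40000000 vendor leaf) from the guest, so the nVidia driver's
# virtualization check comes up empty.
# "01:00.0" and guest-disk.img are placeholders.
qemu-system-x86_64 \
    -enable-kvm \
    -cpu host,kvm=off \
    -m 4096 \
    -device vfio-pci,host=01:00.0 \
    guest-disk.img
```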
Re: Issues with GTX960 on CentOS7 using bhyve PCI passthru (FreeBSD 11-RC2)
Howdy, virtualization zealots!

This is in reply to maillist thread [0].

It so happens that I have to get GPU-accelerated OpenCL working on my machine, so I had a play with bhyve & PCI-e passthrough for VGA. I was using an nVidia Quadro 600 (GF108) for testing (planning to use AMD/ATI for OpenCL, of course).

I tried a Linux guest with the proprietary nVidia driver, and the result was that the driver couldn't init the VGA during boot:

[1.394726] nvidia: module license 'NVIDIA' taints kernel.
[1.395140] Disabling lock debugging due to kernel taint
[1.412132] nvidia: module verification failed: signature and/or required key missing - tainting kernel
[1.419359] nvidia 0000:00:04.0: can't derive routing for PCI INT A
[1.419807] nvidia 0000:00:04.0: PCI INT A: no GSI
[1.420157] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
[1.420157] NVRM: BAR1 is 0M @ 0x0 (PCI:0000:00:04.0)
[1.421023] NVRM: The system BIOS may have misconfigured your GPU.
[1.421476] nvidia: probe of 0000:00:04.0 failed with error -1
[1.437301] nvidia-nvlink: Nvlink Core is being initialized, major device number 247
[1.440094] NVRM: The NVIDIA probe routine failed for 1 device(s).
[1.440530] NVRM: None of the NVIDIA graphics adapters were initialized!

After adding the "pci=nocrs" Linux boot option (which, from what I understand, magically helps to [partially] work around bhyve assigning addresses beyond the host CPU's physically addressable space for PCIe memory-mapped registers), the guest couldn't finish booting, because bhyve would segfault.

Turns out that which peripherals are used, and their order on the command line, are important. Edit: actually, it looks like it's the number of CPUs (the '-c' flag's argument) that makes the difference; the machine has a CPU with 4 cores, no multithreading.
This didn't work (segfault):

bhyve -A -H -P -s 0:0,hostbridge -s 1:0,lpc -s 2:0,virtio-net,tap0 \
    -s 3:0,virtio-blk,./bhyve_lunix.img \
    -s 4:0,ahci-cd,./ubuntu-16.04.1-server-amd64.iso \
    -s 5:0,passthru,1/0/0 -l com1,stdio -c 4 -m 1024M -S lunixguest

[...]
[ OK ] Listening on Load/Save RF Kill Switch Status /dev/rfkill Watch.
[ OK ] Reached target Swap.
Assertion failed: (pi->pi_bar[baridx].type == PCIBAR_IO), function passthru_write, file /usr/src/usr.sbin/bhyve/pci_passthru.c, line 850.
Abort (core dumped)

But this worked, finally:

bhyve -c 1 -m 1024M -S -A -H -P -s 0:0,hostbridge -s 1:0,lpc \
    -s 2:0,virtio-net,tap0 -s 3:0,virtio-blk,./bhyve_lunix.img \
    -s 4:0,passthru,1/0/0 -l com1,stdio lunixguest

So the guest booted, and didn't complain about non-addressable-by-CPU BARs anymore. However, the same fate befell me as Dom earlier in this thread -- the driver loaded:

[1.691216] nvidia: module verification failed: signature and/or required key missing - tainting kernel
[1.696641] nvidia 0000:00:04.0: can't derive routing for PCI INT A
[1.698093] nvidia 0000:00:04.0: PCI INT A: no GSI
[1.699277] vgaarb: device changed decodes: PCI:0000:00:04.0,olddecodes=io+mem,decodes=none:owns=io+mem
[1.701461] nvidia-nvlink: Nvlink Core is being initialized, major device number 247
[1.702649] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 375.26 Thu Dec 8 18:36:43 PST 2016 (using threaded interrupts)
[1.705481] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms 375.26 Thu Dec 8 18:04:14 PST 2016
[1.708941] [drm] [nvidia-drm] [GPU ID 0x0004] Loading driver

but couldn't talk to the card: [lost the log, but it was the same as Dom's: "NVRM: rm_init_adapter failed"].

So I decided to try the test in a FreeBSD 10.3-STABLE guest. With an older driver, or just loading 'nvidia' without modesetting, I got guest kernel panics [1].
When I loaded 'nvidia-modeset', there was more success:

Linux ELF exec handler installed
Linux x86-64 ELF exec handler installed
nvidia0: on vgapci0
vgapci0: child nvidia0 requested pci_enable_io
vgapci0: attempting to allocate 1 MSI vectors (1 supported)
msi: routing MSI IRQ 269 to local APIC 2 vector 51
vgapci0: using IRQ 269 for MSI
vgapci0: child nvidia0 requested pci_enable_io
nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms 367.44 Wed Aug 17 22:05:09 PDT 2016

But:

# nvidia-smi
NVRM: Xid (PCI:0000:00:04): 62, !2369()
NVRM: RmInitAdapter failed! (0x26:0x65:1072)
nvidia0: NVRM: rm_init_adapter() failed!
No devices were found

It also panicked after starting Xorg.

After stumbling upon some Xen forums, I found the explanation: nVidia crippled the driver so that it detects a virtualized environment and refuses to attach to anything but high-end pro cards! Those bastards [if the speculation is true]!

GTX960 didn't work. Quadro 600 didn't work. So I tried with a Quadro 2000:

root@fbsd12tst:~