Hi folks, this is my first post to the group. Apologies for length.
I've been experimenting with GPU passthrough on bhyve. For background, the host
system is FreeBSD 12.0-RELEASE on an AMD Ryzen 1700 CPU @ 3.8 GHz with 32 GB of
ECC RAM and two nVidia GPUs. I'm working with a Debian 9 Linux guest and a
Windows Server 2019 guest (with the desktop experience installed). I also have
a USB controller passed through for Bluetooth and a keyboard.
With some unpleasant hacks I have succeeded in starting X on the Linux guest,
passing through an nVidia GT 710 under the nouveau driver. I can run the MATE
desktop and glxgears, both of which are smooth at 4K. However, the Unigine
Heaven benchmark runs at an embarrassing 0.1 fps, and 2160p x264 video in VLC
plays at about 5 fps. Neither appears to be CPU-bound in the host or the guest.
The hack I had to make: I found that many instructions that access memory-mapped
PCI BARs are not being executed by the CPU in guest mode but are instead being
passed back to the hypervisor for emulation. This trips an assertion inside
passthru_write() in pci_passthru.c ["pi->pi_bar[baridx].type == PCIBAR_IO"],
which does not expect to perform memory-mapped I/O on behalf of the guest.
Examining the to-be-emulated instructions in vmexit_inst_emul() {e.g., movl
(%rdi), %eax}, they look benign to me, and I have no explanation for why the
CPU refused to execute them in guest mode.
As an amateur work-around, I removed the assertion; instead, I obtain the
desired offset into the guest's BAR, calculate what that guest address
translates to in the host's address space, open(2) /dev/mem, mmap(2) that
address, and perform the write directly. I do the same trick in
passthru_read(). Ugly and slow, but functional.
This code path is hit continuously whether or not X is running, with an
increase in activity when running anything GPU-heavy. The accesses always go to
BAR 1, mostly around the same offsets. I added some logging of these events; it
produces about 100 lines per second while playing video. An excerpt:
...
Unexpected out-of-vm passthrough write #492036 to bar 1 at offset 41100.
Unexpected out-of-vm passthrough write #492037 to bar 1 at offset 41100.
Unexpected out-of-vm passthrough read #276162 to bar 1 at offset 561280.
Unexpected out-of-vm passthrough write #492038 to bar 1 at offset 38028.
Unexpected out-of-vm passthrough write #492039 to bar 1 at offset 38028.
Unexpected out-of-vm passthrough read #276163 to bar 1 at offset 561184.
Unexpected out-of-vm passthrough read #276164 to bar 1 at offset 561184.
Unexpected out-of-vm passthrough read #276165 to bar 1 at offset 561184.
Unexpected out-of-vm passthrough read #276166 to bar 1 at offset 561184.
...
So my main question is:
1. How do I diagnose why these instructions are not being executed in guest
mode?
Some other problems:
2. Once the virtual machine is shut down, the passed-through GPU never gets
turned off: whatever message was on the screen in the final throes of Linux's
shutdown stays there. Perhaps there is a specific detach sequence that bhyve or
nouveau hasn't yet implemented? Alternatively, maybe I could exploit some
power-management feature to reset the card when bhyve exits.
3. It is not possible to reboot the guest and then start X again without an
intervening host reboot. The text console works fine, but Xorg.0.log shows
messages like
(EE) [drm] Failed to open DRM device for pci:0000:00:06.0: -19
(EE) open /dev/dri/card0: No such file or directory
dmesg is not very helpful either.[0] I suspect that this is related to problem
(2).
4. There is a known bug in the version of the Xorg server that ships with
Debian 9: if the GPU takes too long to respond to the driver, the switch from
an animated mouse cursor back to a static cursor causes the X server to sit in
a busy loop of gradually increasing stack depth.[1] For me, this consistently
happens after I type my password into the Debian login dialog, and eventually
(~ 120 minutes) it locks up the host by eating all the swap. A work-around is
to replace the guest's animated cursors with static ones. The bug is fixed in
newer versions of X, but I haven't yet tested whether their fix works for me.
5. The GPU doesn't come to life until the nouveau driver kicks in. What is
special about the driver? Why doesn't the UEFI initialize the GPU and send it
output before boot? Any idea whether the problem is on the UEFI side or the
hypervisor side?
6. The way Windows probes multi-BAR devices seems to be inconsistent with
bhyve's model for storing I/O memory mappings. Specifically, I believe Windows
writes the 0xffffffff sizing sentinel to all BARs on a device in one shot, then
reads them back and assigns the true addresses afterwards. bhyve, however,
treats the multiple 0xffffffff assignments to different BARs as a clash and
errors out on the second BAR probe. I removed most of the mmio_rb_tree error
handling in mem.c, and that is sufficient for Windows to boot and to detect and
correctly identify the GPU. (A better solution might be to handle the initial
0xffffffff write as a special case.) I can then install the official nVidia
drivers without problem over Remote Desktop. However, the GPU never springs
into life: I am stuck with a "Windows has stopped this device because it has
reported problems. (Code 43)" error in Device Manager, a blank screen, and not
much else to go on.
Is it worth my continuing to hack away at these problems (of course I'm happy
to share anything I come up with), or is there an official solution for GPU
support in the pipeline that is about to make my efforts redundant? :)
Thanks,
Robert Crowston.
---
Footnotes
[0] Diff'ing dmesg after successful GPU initialization (+) and after failure
(-), and cutting out some lines that aren't relevant:
nouveau 0000:00:06.0: bios: version 80.28.a6.00.10
+nouveau 0000:00:06.0: priv: HUB0: 085014 ffffffff (1f70820b)
nouveau 0000:00:06.0: fb: 1024 MiB DDR3
@@ -466,24 +467,17 @@
nouveau 0000:00:06.0: DRM: DCB conn 00: 00001031
nouveau 0000:00:06.0: DRM: DCB conn 01: 00002161
nouveau 0000:00:06.0: DRM: DCB conn 02: 00000200
-nouveau 0000:00:06.0: disp: chid 0 mthd 0000 data 00000400 00001000 00000002
-nouveau 0000:00:06.0: timeout at
/build/linux-UEAD6s/linux-4.9.144/drivers/gpu/drm/nouveau/nvkm/engine/disp/dmacgf119.c:88/gf119_disp_dmac_init()!
-nouveau 0000:00:06.0: disp: ch 1 init: c207009b
-nouveau: DRM:00000000:0000927c: init failed with -16
-nouveau 0000:00:06.0: timeout at
/build/linux-UEAD6s/linux-4.9.144/drivers/gpu/drm/nouveau/nvkm/engine/disp/dmacgf119.c:54/gf119_disp_dmac_fini()!
-nouveau 0000:00:06.0: disp: ch 1 fini: c2071088
-nouveau 0000:00:06.0: timeout at
/build/linux-UEAD6s/linux-4.9.144/drivers/gpu/drm/nouveau/nvkm/engine/disp/dmacgf119.c:54/gf119_disp_dmac_fini()!
-nouveau 0000:00:06.0: disp: ch 1 fini: c2071088
+[drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
+[drm] Driver supports precise vblank timestamp query.
+nouveau 0000:00:06.0: DRM: MM: using COPY for buffer copies
+nouveau 0000:00:06.0: DRM: allocated 1920x1080 fb: 0x60000, bo ffff96fdb39a1800
+fbcon: nouveaufb (fb0) is primary device
-nouveau 0000:00:06.0: timeout at
/build/linux-UEAD6s/linux-4.9.144/drivers/gpu/drm/nouveau/nvkm/engine/disp/coregf119.c:187/gf119_disp_core_fini()
-nouveau 0000:00:06.0: disp: core fini: 8d0f0088
-[TTM] Finalizing pool allocator
-[TTM] Finalizing DMA pool allocator
-[TTM] Zone kernel: Used memory at exit: 0 kiB
-[TTM] Zone dma32: Used memory at exit: 0 kiB
-nouveau: probe of 0000:00:06.0 failed with error -16
+Console: switching to colour frame buffer device 240x67
+nouveau 0000:00:06.0: fb0: nouveaufb frame buffer device
+[drm] Initialized nouveau 1.3.1 20120801 for 0000:00:06.0 on minor 0
[1]
https://devtalk.nvidia.com/default/topic/1028172/linux/titan-v-ubuntu-16-04lts-and-387-34-driver-crashes-badly/post/5230898/#5230898
_______________________________________________
[email protected] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-virtualization
To unsubscribe, send any mail to
"[email protected]"