Alright all. So a few updates:

After the 22+ hour uptime two days ago, I did a manual reboot of the host and 
guest to continue testing the "deterministic stability" of the system (as I 
mentioned yesterday). Within a few minutes, the VM exited with a new error 
code, 0x88 (previously it was 0x60). Since the PCI passthrough adjustments, I 
haven't seen 0x60 again and the VM has become a lot more stable. For whatever 
reason I do still get the occasional 0x88, but it hasn't happened many times. 
After that initial 0x88 exit, I rebooted the host/VM again, and the VM ran for 
about 3-4 hours before exiting with another 0x88. After one more host/VM 
reboot, the VM became stable and ran for another 19 hours (up to this 
morning). I then turned it off since I wanted to experiment with the pkill 
situation we were discussing recently. So aside from figuring out what 
occasionally causes a 0x88, I think we are close to a stable setup. I've 
noticed that in the runs that stay stable, I usually have Task Manager open so 
I can quickly see resource utilization, temperatures, and system uptime while 
I monitor it. Could Task Manager's constant polling of hardware state be what 
keeps the VM alive, versus the VM dying when the machine just idles without 
Task Manager running? There is no sleep support in the VM, so we don't need 
to worry about the guest going to sleep or hibernating.
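
One way I might test whether any periodic activity keeps it alive (rather 
than Task Manager specifically) is to idle the guest with Task Manager closed 
while pinging it from the host and logging timestamps, so I can see roughly 
when it dies. The guest IP and log path here are just examples:

# Probe the guest every 5 seconds and timestamp the result, so the log
# shows roughly when the VM stopped responding.
while :; do
        if ping -c 1 -t 2 192.168.0.50 > /dev/null 2>&1; then
                echo "$(date '+%F %T') guest up"
        else
                echo "$(date '+%F %T') guest DOWN"
        fi
        sleep 5
done >> /var/log/vm_gaming_ping.log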

I've also worked on start/stop scripts to integrate this into the rc.d system 
a little better. Before, I was using /etc/crontab with @reboot to start the 
VM at boot time after everything else had come up, which works fine, but I 
couldn't find a @shutdown equivalent. So I wrote a basic rc.d script. The 
script works, but the problem is that for whatever reason it blocks the rest 
of the services on my machine from starting. I investigated it a bit and it 
has to do with rcorder. I analyzed the ordering with "rcorder /etc/rc.d/* 
/usr/local/etc/rc.d/*", which revealed where in the chain my script falls. 
Originally there was no "REQUIRE:" line, so it was thrown in at the beginning 
of the chain. I then switched to "REQUIRE: LOGIN", which placed it close to 
the end of the chain (which is good), but some services still follow it. And 
even though sshd is ordered before LOGIN and I can ping the box after a 
reboot, for whatever reason I still couldn't ssh into the box until I shut 
down the Windows VM, which lets rc execution continue. The start script I 
have for bhyve is a blocking call, since it runs "bhyve ..." in the 
foreground and ends up capturing the terminal. When I start it through 
/etc/crontab @reboot this doesn't seem to be an issue, but it is when going 
through an rc.d script. Anyone have any recommendations? I'll post the three 
scripts I'm using at the end of this message.
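
One thing I'm planning to try (untested, so treat it as a sketch) is having 
the rc.d start method hand the start script to daemon(8) instead of running 
bhyve in the foreground, so rc can continue:

vm_gaming_start()
{
        # daemon(8) detaches the blocking bhyve call; -f points stdio
        # at /dev/null and -p records the child's pid.
        /usr/sbin/daemon -f -p /var/run/vm_gaming.pid \
            /atlantis/vms/gaming/start.sh
}

Since the start script runs bhyve with "-l com1,stdio", the console would 
just go to /dev/null, which should be fine for headless runs.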

Also, I'm more open now to writing a new section for the handbook covering 
everything I've learned about GPU passthrough for gaming purposes. I have to 
preface that my experience is limited and optimized only for this flow, so if 
I write it, I would only be able to cover this particular path and the 
caveats around it. The handbook usually takes a more general-knowledge 
approach as well, which I'm not qualified to write at the moment. Would it be 
fine for me to start writing this section and get it into the handbook, so 
that people more experienced than me can expand it with more general 
knowledge about bhyve/passthrough (and of course Intel/NVIDIA material can 
come after)? I can start with GPU passthrough / gaming / AMD host + AMD 
dedicated card, and I can also add the start/stop scripts + rc.d script.


The paths, names, and values will need to be adjusted for each person's 
individual setup:




VM START SCRIPT:



#!/bin/sh

VM_DIR="$(dirname "$(realpath "$0")")"
VM_NAME="$(basename "$VM_DIR")"

cd "$VM_DIR" || exit 1

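# -A: generate ACPI tables, -H: yield the vCPU on HLT, -P: exit the
# vCPU on PAUSE, -S: wire guest memory (required for PCI passthrough),
# -w: ignore accesses to unimplemented MSRs.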
bhyve -AHPSw -c sockets=1,cores=16,threads=1 -m 32G \
-s 0,hostbridge \
-s 1,nvme,disk0.img \
-s 3:0,passthru,3/0/0 \
-s 3:1,passthru,3/0/1 \
-s 13:0,passthru,13/0/0 \
-s 30,xhci,tablet \
-s 31,lpc \
-l bootrom,/usr/local/share/uefi-firmware/BHYVE_UEFI.fd,fwcfg=qemu \
-l com1,stdio \
$VM_NAME

# Exit the script here and leave some options for debugging more
# ergonomically after the exit line.
exit

# Only use these when installing the VM for the first time:

# VNC. Once you install Windows and enable RDP, you can turn this off.
-s 29,fbuf,tcp=0.0.0.0:5900,w=1024,h=768,wait \

# The Windows ISO and the VirtIO drivers ISO.
# You can download the latest stable virtio drivers at:
# https://fedorapeople.org/groups/virt/virtio-win/direct-downloads/stable-virtio/virtio-win.iso
-s 4,ahci-cd,../files/Win10_22H2_English_x64v1_2023.iso \
-s 5,ahci-cd,../files/virtio-win-0.1.271.iso \

# If you want a network device without using virtio, you can use e1000.
# You should install the virtio drivers and use virtio-net instead for
# better performance. Pick one of the two; both use slot 2.
-s 2,e1000,tap0 \
-s 2,virtio-net,tap0 \
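
For completeness, the start script assumes the host side is already set up. 
On my machine that means tap0 exists and the passthrough devices are reserved 
at boot; roughly the following, with the PCI addresses and the NIC name 
adjusted to your hardware (igb0 is just an example):

# /boot/loader.conf: load vmm and reserve the passthrough devices
# (bus/slot/function must match the passthru slots in the start script).
vmm_load="YES"
pptdevs="3/0/0 3/0/1 13/0/0"

# /etc/rc.conf: create the tap interface and bridge it with the NIC.
cloned_interfaces="bridge0 tap0"
ifconfig_bridge0="addm igb0 addm tap0 up"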



VM STOP SCRIPT:


#!/bin/sh

# bhyve translates SIGTERM into an ACPI power-button press; send it
# twice to make sure Windows listens to the shutdown request.
pkill bhyve
pkill bhyve

# Wait a bit for the guest to shutdown properly before
# we continue shutting down the host.
sleep 10

bhyvectl --vm=gaming --destroy
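
The fixed 10-second sleep is a guess. If it ever turns out to be too short, a 
variant I might try (untested) is to poll until the bhyve process is actually 
gone, with a cap, before doing the destroy:

# Wait up to 60 seconds for the guest's ACPI shutdown to finish,
# polling instead of sleeping a fixed amount.
timeout=60
while pgrep -q bhyve && [ "$timeout" -gt 0 ]; do
        sleep 1
        timeout=$((timeout - 1))
done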




RC.D SCRIPT:


#!/bin/sh

# PROVIDE: vm_gaming
# REQUIRE: LOGIN
# KEYWORD: nojail shutdown

. /etc/rc.subr

name=vm_gaming
rcvar=vm_gaming_enable

start_cmd="${name}_start"
stop_cmd="${name}_stop"

vm_gaming_start()
{
        /atlantis/vms/gaming/start.sh
}

vm_gaming_stop()
{
        /atlantis/vms/gaming/stop.sh
}

load_rc_config $name

: ${vm_gaming_enable:="NO"}

run_rc_command "$1"
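
For anyone trying these: the rc.d script goes in /usr/local/etc/rc.d/ (and 
must be executable), and then it's the usual:

# Enable the service and start it. The "shutdown" KEYWORD above also
# makes the stop script run when the host shuts down.
sysrc vm_gaming_enable=YES
service vm_gaming start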

