On 8/27/25 05:51, Petru Garstea wrote:
Greetings,
I’m running a *Debian Linux 12.11 VM on FreeBSD 14.3* using *bhyve*.
Inside the VM, I’ve deployed the *Docker engine* with *Ollama
configured for ROCm support*.
However, when executing an LLM, the *GPU fails to initialize
correctly*, causing the process to fail.
Please note that on bare metal this setup works fine.
The full log of this behavior is included below.
---
kernel: [drm] PCIE GART of 512M enabled (table at 0x0000008000000000).
kernel: [drm] PSP is resuming...
kernel: [drm] reserve 0xa00000 from 0x82fd000000 for PSP TMR
kernel: amdgpu 0000:00:01.0: amdgpu: RAS: optional ras ta ucode is
not available
kernel: amdgpu 0000:00:01.0: amdgpu: SECUREDISPLAY: securedisplay ta
ucode is not available
kernel: amdgpu 0000:00:01.0: amdgpu: SMU is resuming...
kernel: amdgpu 0000:00:01.0: amdgpu: smu driver if version =
0x0000000e, smu fw if version = 0x00000012, smu fw program = 0,
version = 0x00413900 (65.57.0)
kernel: amdgpu 0000:00:01.0: amdgpu: SMU driver if version not matched
kernel: amdgpu 0000:00:01.0: amdgpu: use vbios provided pptable
kernel: amdgpu 0000:00:01.0: amdgpu: SMU is resumed successfully!
kernel: [drm] DMUB hardware initialized: version=0x02020017
kernel: [drm] kiq ring mec 2 pipe 1 q 0
kernel: [drm] VCN decode and encode initialized successfully(under
DPG Mode).
kernel: [drm] JPEG decode initialized successfully.
kernel: amdgpu 0000:00:01.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0
on hub 0
kernel: amdgpu 0000:00:01.0: amdgpu: ring comp_1.0.0 uses VM inv eng
1 on hub 0
kernel: amdgpu 0000:00:01.0: amdgpu: ring comp_1.1.0 uses VM inv eng
4 on hub 0
kernel: amdgpu 0000:00:01.0: amdgpu: ring comp_1.2.0 uses VM inv eng
5 on hub 0
kernel: amdgpu 0000:00:01.0: amdgpu: ring comp_1.3.0 uses VM inv eng
6 on hub 0
kernel: amdgpu 0000:00:01.0: amdgpu: ring comp_1.0.1 uses VM inv eng
7 on hub 0
kernel: amdgpu 0000:00:01.0: amdgpu: ring comp_1.1.1 uses VM inv eng
8 on hub 0
kernel: amdgpu 0000:00:01.0: amdgpu: ring comp_1.2.1 uses VM inv eng
9 on hub 0
kernel: amdgpu 0000:00:01.0: amdgpu: ring comp_1.3.1 uses VM inv eng
10 on hub 0
kernel: amdgpu 0000:00:01.0: amdgpu: ring kiq_2.1.0 uses VM inv eng
11 on hub 0
kernel: amdgpu 0000:00:01.0: amdgpu: ring sdma0 uses VM inv eng 12 on
hub 0
kernel: amdgpu 0000:00:01.0: amdgpu: ring sdma1 uses VM inv eng 13 on
hub 0
kernel: amdgpu 0000:00:01.0: amdgpu: ring vcn_dec_0 uses VM inv eng 0
on hub 1
kernel: amdgpu 0000:00:01.0: amdgpu: ring vcn_enc_0.0 uses VM inv eng
1 on hub 1
kernel: amdgpu 0000:00:01.0: amdgpu: ring vcn_enc_0.1 uses VM inv eng
4 on hub 1
kernel: amdgpu 0000:00:01.0: amdgpu: ring jpeg_dec uses VM inv eng 5
on hub 1
kernel: amdgpu 0000:00:01.0: [drm] Cannot find any crtc or sizes
kernel: amdgpu: qcm fence wait loop timeout expired
kernel: amdgpu: The cp might be in an unrecoverable state due to an
unsuccessful queues preemption
kernel: amdgpu: Pasid 0x8002 DQM create queue type 0 failed. ret -62
kernel: amdgpu 0000:00:01.0: amdgpu: GPU reset begin!
kernel: amdgpu: Failed to suspend process 0x8002
kernel: amdgpu: Failed to suspend process 0x8001
kernel: amdgpu 0000:00:01.0: amdgpu: free PSP TMR buffer
kernel: amdgpu 0000:00:01.0: amdgpu: MODE1 reset
kernel: amdgpu 0000:00:01.0: amdgpu: GPU mode1 reset
kernel: amdgpu 0000:00:01.0: amdgpu: GPU smu mode1 reset
kernel: amdgpu 0000:00:01.0: amdgpu: GPU reset succeeded, trying to
resume
kernel: clocksource: Long readout interval, skipping watchdog check:
cs_nsec: 12622536057 wd_nsec: 12613480925
kernel: [drm] PCIE GART of 512M enabled (table at 0x0000008000000000).
kernel: [drm] VRAM is lost due to GPU reset!
kernel: [drm] PSP is resuming...
kernel: [drm:psp_hw_start [amdgpu]] *ERROR* PSP create ring failed!
kernel: [drm:psp_resume [amdgpu]] *ERROR* PSP resume failed
kernel: [drm:amdgpu_device_fw_loading [amdgpu]] *ERROR* resume of IP
block <psp> failed -62
kernel: amdgpu 0000:00:01.0: amdgpu: GPU reset(1) failed
kernel: amdgpu: qcm fence wait loop timeout expired
kernel: amdgpu: The cp might be in an unrecoverable state due to an
unsuccessful queues preemption
kernel: amdgpu 0000:00:01.0: amdgpu: GPU reset end with ret = -62
kernel: amdgpu 0000:00:01.0: amdgpu: GPU reset begin!
kernel: amdgpu 0000:00:01.0: amdgpu: Failed to disallow df cstate
Regards,
Petru
Hello!
Before you start Docker, are you able to verify that the GPU is actually
working in the VM?
How did you verify it? (I don't know the tooling for AMD.)
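
One rough way to check this from inside the VM, before Docker is involved,
is sketched below in Python. It assumes the usual Linux layout, where the
ROCm compute interface shows up as /dev/kfd and the GPU render nodes as
/dev/dri/renderD*. The function names and the list of failure patterns are
only illustrative; the patterns are taken from the log quoted above and are
not an exhaustive list of amdgpu errors.

```python
import glob
import os
import re

# Substrings seen in the failing log above (assumed, not exhaustive).
FAILURE_PATTERNS = re.compile(
    r"\*ERROR\*|timeout expired|GPU reset|failed", re.IGNORECASE
)


def gpu_device_nodes(dev_root="/dev"):
    """Return the ROCm-related device nodes that exist under dev_root."""
    nodes = []
    kfd = os.path.join(dev_root, "kfd")
    if os.path.exists(kfd):
        nodes.append(kfd)
    # Render nodes are normally named renderD128, renderD129, ...
    nodes.extend(sorted(glob.glob(os.path.join(dev_root, "dri", "renderD*"))))
    return nodes


def amdgpu_failures(log_lines):
    """Return amdgpu/drm kernel log lines that look like failures."""
    return [
        line.rstrip("\n")
        for line in log_lines
        if ("amdgpu" in line or "[drm" in line)
        and FAILURE_PATTERNS.search(line)
    ]
```

Calling gpu_device_nodes() right after boot shows whether the driver
exposed the device at all, and feeding the lines of the kernel log (for
example the output of dmesg) to amdgpu_failures() shows whether the resets
already happen before any container is started.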
Regards,
Stephan