Hi,
I was working on an issue [1] triggered by MDEV passthrough to a KVM Guest.
It was recommended that I try intel_iommu=igfx_off, and it worked like a
charm, allowing me to run OpenCL workloads in the guest on the device.
[2] still states for intel_iommu=igfx_off: "If this fixes anything,
please ensure you file a bug reporting the problem." While I'm not
sure how important that still is to you, I thought I'd post a
summary.
Maybe you are collecting a list of affected HW or anything like that?
If you want me to file a bugzilla or provide extra data, let me know.
Here are all the snippets I think you might find helpful.
## Kernel Version ##
5.0.0-8-generic (Ubuntu)
The same kernel is active in Host and Guest.
## Crash ##
(host log before applying the intel_iommu=igfx_off workaround)
[ 245.284511] DMAR: DRHD: handling fault status reg 2
[ 245.284530] DMAR: [DMA Write] Request device [00:02.0] fault addr 8c9128000 [fault reason 23] Unknown
[ 245.284560] DMAR: DRHD: handling fault status reg 2
[ 245.284579] DMAR: [DMA Write] Request device [00:02.0] fault addr 8c914a000 [fault reason 23] Unknown
[ 245.284610] DMAR: DRHD: handling fault status reg 2
[ 250.106273] [drm] GPU HANG: ecode 8:0:0xe757fefe, reason: no progress on rcs0, action: reset
[ 250.106274] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[ 250.106275] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[ 250.106276] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[ 250.106276] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[ 250.106277] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[ 250.106299] i915 0000:00:02.0: Resetting rcs0 for no progress on rcs0
[ 251.900704] i915 0000:00:02.0: Resetting chip for no progress on rcs0
[ 251.900718] i915 0000:00:02.0: GPU recovery failed
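In case it helps to compare against other hosts, here is a minimal sketch for pulling the DMAR fault addresses for one device out of the kernel log. The function name dmar_fault_addrs is made up for illustration, and it assumes each log entry sits on a single line in the "fault addr <hex>" format shown above (other kernel versions may format the address differently).

```shell
# Hypothetical helper (not part of any tool): reads kernel log text on stdin
# and prints the fault address of every DMAR fault reported for one device.
# $1: PCI device in bus:dev.fn form, e.g. 00:02.0
dmar_fault_addrs() {
    dev="$1"
    # Keep only the fault lines for this device, then cut out the hex address.
    grep -F "Request device [$dev]" |
        sed -n 's/.*fault addr \([0-9a-fA-F]*\).*/\1/p'
}

# Example use on a live host (needs permission to read the kernel log):
#   sudo dmesg | dmar_fault_addrs 00:02.0
```

An empty result after rebooting with the workaround would suggest the faults are gone, although it does not prove the guest workload was actually exercised.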
## HW Information ##
CPU: Intel(R) Core(TM) i5-5250U CPU @ 1.60GHz
$ lspci -v -s 00:02.0
00:02.0 VGA compatible controller: Intel Corporation HD Graphics 6000 (rev 09) (prog-if 00 [VGA controller])
        Subsystem: Intel Corporation HD Graphics 6000
        Flags: bus master, fast devsel, latency 0, IRQ 48
        Memory at f6000000 (64-bit, non-prefetchable) [size=16M]
        Memory at e0000000 (64-bit, prefetchable) [size=256M]
        I/O ports at f000 [size=64]
        [virtual] Expansion ROM at 000c0000 [disabled] [size=128K]
        Capabilities: <access denied>
        Kernel driver in use: i915
        Kernel modules: i915
## Mediated Devices passthrough setup ##
Enabled on the kernel command line via /etc/default/grub:
i915.enable_gvt=1 intel_iommu=on drm.debug=0
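The exact form used for the igfx_off workaround is not shown in this mail. The sketch below assumes it is combined with the existing option as intel_iommu=on,igfx_off (the parameter takes comma-separated values as far as I know) and checks whether a given command line carries it. has_igfx_off is a hypothetical helper name.

```shell
# Hypothetical helper: reads a kernel command line on stdin and succeeds
# (exit status 0) if an intel_iommu= parameter lists igfx_off among its
# comma-separated options.
has_igfx_off() {
    # One parameter per line, then match igfx_off anywhere in the option list.
    tr ' ' '\n' | grep -Eq '^intel_iommu=([^,]+,)*igfx_off(,.*)?$'
}

# Example use against the running kernel:
#   has_igfx_off < /proc/cmdline && echo "igfx_off is active"
```

Checking /proc/cmdline rather than /etc/default/grub confirms what the running kernel was actually booted with, which catches a forgotten update-grub.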
And loading the modules via the initramfs:
$ printf "kvmgt\nvfio-iommu-type1\nvfio-mdev" | sudo tee /etc/initramfs-tools/modules
Update the initramfs and the GRUB configuration, then reboot:
$ sudo update-initramfs -u
$ sudo update-grub
Then I created an MDEV instance with a freshly generated UUID:
$ cd /sys/bus/pci/devices/0000:00:02.0/mdev_supported_types/i915-GVTg_V4_4
$ echo 4dd50f26-ec08-11e8-b838-4bc3356865b6 | sudo tee create
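Before creating the device it can be useful to see which MDEV types the GPU offers and how many instances are still free. A minimal sketch follows; list_mdev_types is a made-up helper name, and the default path simply reuses the device address from above.

```shell
# Hypothetical helper: prints "<type> <available_instances>" for every
# mediated device type found below the given mdev_supported_types directory.
# $1: directory to scan (defaults to the GPU used in this report)
list_mdev_types() {
    base="${1:-/sys/bus/pci/devices/0000:00:02.0/mdev_supported_types}"
    for t in "$base"/*/; do
        # Skip anything that does not look like an mdev type directory.
        [ -r "${t}available_instances" ] || continue
        printf '%s %s\n' "$(basename "$t")" "$(cat "${t}available_instances")"
    done
}

# Example use on the host:
#   list_mdev_types
# A created instance should then show up under /sys/bus/mdev/devices/<uuid>.
```

A type reporting 0 available instances cannot be used for another create, so this is a quick first check if writing to create fails.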
Finally I told libvirt to use it by modifying my guest XML like this:
<graphics type='spice'>
  <listen type='none'/>
  <gl enable='yes'/>
</graphics>
<hostdev mode='subsystem' type='mdev' managed='no' model='vfio-pci'>
  <source>
    <address uuid='4dd50f26-ec08-11e8-b838-4bc3356865b6'/>
  </source>
</hostdev>
## Guest workload to force-trigger the bug ##
Note: the hang can also happen without this workload, but running it
proved to be faster than waiting for the crash.
Install the OpenCL bits in the guest and try to use them:
$ sudo apt install ocl-icd-libopencl1
$ sudo apt install opencl-headers
$ sudo apt install clinfo
$ sudo apt install beignet
And then run clinfo
$ clinfo
[1]: https://bugs.freedesktop.org/show_bug.cgi?id=110238
[2]: https://www.kernel.org/doc/Documentation/Intel-IOMMU.txt
P.S. The list won't send me the subscription confirmation (three
tries, each followed by an hour of waiting), so I expect this post to
be at least moderated, if not rejected :-/
--
Christian Ehrhardt
Software Engineer, Ubuntu Server
Canonical Ltd
_______________________________________________
iommu mailing list
[email protected]
https://lists.linuxfoundation.org/mailman/listinfo/iommu