Hello Jason,
Hope you are doing well. I am Chaitanya from the Linux graphics team at
Intel.
This mail is regarding a regression we are seeing in our CI runs [1] on
the linux-next repository.
Since next-20251106 [2], our tests have been timing out, presumably
because of a GPU hang.
`````````````````````````````````````````````````````````````````````````````````
<6> [490.872058] i915 0000:00:02.0: [drm] Got hung context on vcs0 with
active request 939:2 [0x1004] not yet started
<6> [490.875244] i915 0000:00:02.0: [drm] GPU HANG: ecode 12:4:baffffff
<7> [496.424189] i915 0000:00:02.0:
[drm:intel_guc_context_reset_process_msg [i915]] GT1: GUC: Got context
reset notification: 0x1004 on vcs0, exiting = no, banned = no
<6> [496.921551] i915 0000:00:02.0: [drm] Got hung context on vcs0 with
active request 939:2 [0x1004] not yet started
<6> [496.924799] i915 0000:00:02.0: [drm] GPU HANG: ecode 12:4:baffffff
<4> [499.946641] [IGT] Per-test timeout exceeded. Killing the current
test with SIGQUIT.
`````````````````````````````````````````````````````````````````````````````````
The detailed log can be found in [3].
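In case it helps when going through the full log, below is a minimal
sketch of how the hang reports can be pulled out of a saved dmesg file.
It is not part of our CI tooling: the function name and the regular
expression are illustrative assumptions, and it assumes each kernel
message sits on a single line in the raw log (unlike the mail-wrapped
excerpt above).

```python
import re

# Illustrative pattern (an assumption, not taken from any i915 tooling).
# It matches hang reports of the form seen in the excerpt, e.g.:
#   <6> [490.875244] i915 0000:00:02.0: [drm] GPU HANG: ecode 12:4:baffffff
HANG_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+i915\s+(?P<dev>\S+):\s+\[drm\]\s+"
    r"GPU HANG:\s+ecode\s+(?P<ecode>\S+)"
)


def find_gpu_hangs(dmesg_text):
    """Return a (timestamp, device, ecode) tuple for every GPU HANG line."""
    hangs = []
    for line in dmesg_text.splitlines():
        match = HANG_RE.search(line)
        if match:
            hangs.append(
                (float(match.group("ts")), match.group("dev"), match.group("ecode"))
            )
    return hangs
```

Calling find_gpu_hangs() on the contents of the log in [3] should list
each hang with its timestamp, which makes it easier to line the hangs up
with the per-test timeout.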
After bisecting the tree, the following patch [4] appears to be the
first "bad" commit:
`````````````````````````````````````````````````````````````````````````````````````````````````````````
commit d373449d8e97891434db0c64afca79d903c1194e
Author: Jason Gunthorpe [email protected]
Date: Thu Oct 23 15:22:36 2025 -0300
iommu/vt-d: Use the generic iommu page table
`````````````````````````````````````````````````````````````````````````````````````````````````````````
We could not revert the patch cleanly because of merge conflicts, but
resetting to the parent [5] of the commit seems to fix the issue.
Could you please check why the patch causes this regression and provide
a fix if necessary?
Thank you.
Regards
Chaitanya
[1]
https://intel-gfx-ci.01.org/tree/linux-next/combined-alt.html?
[2]
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?h=next-20251106
[3]
https://intel-gfx-ci.01.org/tree/linux-next/next-20251106/bat-arlh-2/dmesg0.txt
[4]
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?h=next-20251106&id=d373449d8e97891434db0c64afca79d903c1194e
[5]
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?h=next-20251106&id=ef7bfe5bbffdcfa033beeeb068c6317f71730679