On Thu, 21 May 2026, Marco Nenciarini <[email protected]> wrote:
> Cross-reference: Aaron Esau (Cc'd) posted a 3-patch series targeting
> this on intel-gfx@ on 2026-05-09 [1]. The series received pushback
> from Imre, Jani N, and Mika arguing for catching the failure pre-commit
> so the atomic_commit can fail cleanly at check time rather than
> mid-commit. The series is currently stalled. With Jakub's report,
> Aaron's report, and mine, the bug reproduces on at least three
> independent setups across i915 and xe, ARL-H and MTL, with and without
> an active NVIDIA driver.
>
> On the NVIDIA framing: Aaron's cover letter attributed the MSGBUS
> unresponsiveness to the NVIDIA dGPU not participating in S0ix
> (NVreg_EnableS0ixPowerManagement). That framing has two cracks. My
> reproduction has S0ix participation enabled AND NVIDIA runtime PM
> fully disabled (NVreg_DynamicPowerManagement=0x00, dGPU stays in D0
> since boot, never enters D3), yet the bug still fires. Jakub's setup
> has xe forcing the iGPU and no active NVIDIA driver in dmesg. So
> whatever platform-side condition causes the PHY to wedge, the NVIDIA
> module parameters are not the lever, and the bug occurs without an
> active NVIDIA driver. The fix has to be on the i915/xe side.

Looks like Jakub sent the same message twice. I replied to the other one
[1], with references to existing gitlab issues.

I'll reiterate that 1) any combo with an out-of-tree module loaded gets
no priority, 2) MTL/ARL with the xe driver doesn't get much priority,
and 3) the proposed series from Aaron is not viable.

That said, it does look like a regression with MTL/ARL and the i915
driver and without out-of-tree modules loaded, and we're looking into
it.

BR,
Jani.

[1] https://lore.kernel.org/r/[email protected]

-- 
Jani Nikula, Intel
