Followup-For: Bug #1125155
X-Debbugs-Cc: [email protected]

Olivier:

amdgpu 0000:03:00.0: [drm] *ERROR* [CRTC:91:crtc-0] flip_done timed out
[drm:amdgpu_dm_atomic_check [amdgpu]] *ERROR* [CRTC:91:crtc-0] hw_done or flip_done timed out

This is almost always a symptom of a failure of the GPU's firmware.
These GPUs consist of many "IP blocks" (Intellectual Property blocks),
many of which run their own firmware, one block for each of the
different functions the GPU provides.

The error is telling us the driver is waiting for an acknowledgement 
from the GPU firmware but it never arrives.
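
For anyone wanting to see how often this happens on their own system,
here is a minimal sketch that counts these timeouts per CRTC in saved
kernel log text. The regular expression is based only on the two
messages quoted above, so it is an assumption that other kernel
versions word them the same way; the names FLIP_TIMEOUT,
count_flip_timeouts and SAMPLE are just illustrative.

```python
import re
from collections import Counter

# Assumption: pattern derived only from the two messages quoted in this
# report; other kernel versions may word them differently.
FLIP_TIMEOUT = re.compile(
    r"\[CRTC:(?P<id>\d+):(?P<name>[^\]]+)\]\s+"
    r"(?:hw_done or )?flip_done timed out"
)


def count_flip_timeouts(log_text):
    """Return a Counter of timeouts keyed by (crtc_id, crtc_name)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = FLIP_TIMEOUT.search(line)
        if match:
            counts[(int(match.group("id")), match.group("name"))] += 1
    return counts


# The two log lines from this report, used as sample input.
SAMPLE = """\
amdgpu 0000:03:00.0: [drm] *ERROR* [CRTC:91:crtc-0] flip_done timed out
[drm:amdgpu_dm_atomic_check [amdgpu]] *ERROR* [CRTC:91:crtc-0] hw_done or flip_done timed out
"""

if __name__ == "__main__":
    for (crtc_id, name), n in count_flip_timeouts(SAMPLE).items():
        print(f"CRTC {crtc_id} ({name}): {n} timeout(s)")
```

In practice the input would be the saved output of "journalctl -k" or
"dmesg" rather than the sample string.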

In most cases it would require the firmware/block to be reset and 
restarted to regain functionality.

The causes of these errors across a vast number of AMD GPUs are not
understood even by the AMD Linux GPU developers!

Recently though there's an initiative to try to deal with the symptom
by identifying where the problem is and restarting the IP block.

We'll have to wait and hope they follow through and we get a commit that 
can also be back-ported.

See https://lkml.org/lkml/2026/1/22/2079

I build and run the latest mainline (currently on 6.19.0-rc7+debian+tj)
and see the flip_done timeout occasionally, so if the thread leads to a
usable patch I'll be testing it.

Tj.
