Hi Thomas,

On 09/03/2026 09:43, Thomas Gleixner wrote:
> On Sun, Mar 08 2026 at 18:23, Matthieu Baerts wrote:
>> 08 Mar 2026 17:58:26 Thomas Gleixner <[email protected]>:
>>> So I'm back to square one. I go and do what I should have done in the
>>> first place. Write a debug patch with trace_printks and let the people
>>> who can actually trigger the problem run with it.
>>
>> Happy to test such debug patches!
>
> See below.
>
> Enable the tracepoints either on the kernel command line:
>
>    trace_event=sched_switch,mmcid:*
>
> or before starting the test case:
>
>    echo 1 >/sys/kernel/tracing/events/sched/sched_switch/enable
>    echo 1 >/sys/kernel/tracing/events/mmcid/enable
>
> I added a 50ms timeout into mm_cid_get() which freezes the trace and
> emits a warning. If you enable panic_on_warn and ftrace_dump_on_oops,
> then it dumps the trace buffer once it hits the warning.
>
> Either kernel command line:
>
>    panic_on_warn ftrace_dump_on_oops
>
> or
>
>    echo 1 >/proc/sys/kernel/panic_on_warn
>    echo 1 >/proc/sys/kernel/ftrace_dump_on_oops
>
> That should provide enough information to decode this mystery.

Thank you for the debug patch and the clear instructions. I managed to
reproduce the issue with the extra debug. The output is available here:

  https://github.com/user-attachments/files/25841808/issue-617-debug.txt.gz

Just in case, the kernel config file that was used:

  https://github.com/user-attachments/files/25841873/issue-617-debug.config.gz

Please tell me if it is an issue to download these files from GitHub.

The output file has 10k+ lines.

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.

