On Tue, Feb 17, 2026 at 03:44:06PM +0100, Danilo Krummrich wrote:
> On Tue Feb 17, 2026 at 3:28 PM CET, Philipp Stanner wrote:
> > OK, maybe I'm lost, but what delayed_work?
> >
> > The jobqueue's delayed work item gets either created on JQ::new() or in
> > jq.submit_job(). Why would anyone – that is: any driver – implement a
> > delayed work in its timeout callback?
> >
> > That doesn't make sense.
> >
> > JQ notifies the driver from its delayed_work through
> > timeout_callback(), and in that callback the driver closes the
> > associated firmware ring.
> >
> > And it drops the JQ. So it is gone. A new JQ will get a new timeout
> > work item.
> >
> > That's basically all the driver must ever do. Maybe some logging and
> > stuff.
> >
> > With firmware scheduling it should really be that simple.
> >
> > And signalling / notifying userspace gets done by jobqueue.
> >
> > Right?
> 
> Well, the timeout path is part of the fence signaling critical section
> until all fences have been signaled.
> 
> But, if I, for instance, just kick off another work from the timeout handler
> and subsequently signal all fences by dropping the JQ, this other work no
> longer has to play by DMA fence signaling rules and is free to do whatever
> (maybe even take a device coredump without needing GFP_NOWAIT).
> 
> Xe does this with xe_devcoredump_deferred_snap_work for instance.
> 

Yes.

> >> You also potentially want device core dumps. Those usually use GFP_NOWAIT
> >> so that they can't cycle back and wait for some fence. The down side is
> >> that they can trivially fail under even light memory pressure.
> >
> > Simply logging into dmesg should do the trick, shouldn't it?
> 

The trick is to make devcoredump a multi-step process. In the TDR,
allocate as little memory as possible using GFP_NOWAIT: for example,
record the parts of objects that might disappear, or take references and
store them in the allocated “snap” object so they remain stable. This is
the snap step.

Next, kick a worker that looks at the snap and allocates the memory
needed for the capture step. Here you can safely save off BO contents,
for example.

After that comes the print step, which converts the captured data into
human-readable output for the devcoredump.

This is actually a simplified view: the capture step can exceed the
kvmalloc size limit (default 2GB), so your print step may need to
trigger additional capture phases. Multiple capture phases also mean you
must hold onto the snap for a longer period.
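
To make the snap, capture, and print steps concrete, here is a minimal
userspace sketch in plain C. It is not the Xe code. Every name in it
(fake_bo, snap, snap_step, capture_step, print_step, snap_release) is
made up for illustration, malloc() stands in for both the GFP_NOWAIT and
the blocking allocations, and it only models a single capture phase.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a reference-counted buffer object (BO). */
struct fake_bo {
	int refcount;
	size_t size;
	unsigned char *data;
};

/* The "snap": a small record taken in the timeout path. */
struct snap {
	struct fake_bo *bo;      /* reference held so the BO stays stable */
	unsigned int ring_head;  /* volatile state copied by value */
	unsigned char *copy;     /* filled in later by the capture step */
};

/*
 * Snap step (timeout path): one small allocation that is allowed to fail
 * (assumed to be GFP_NOWAIT in a real driver), plus taking references.
 */
static struct snap *snap_step(struct fake_bo *bo, unsigned int ring_head)
{
	struct snap *s = malloc(sizeof(*s));

	if (!s)
		return NULL;
	bo->refcount++;
	s->bo = bo;
	s->ring_head = ring_head;
	s->copy = NULL;
	return s;
}

/* Capture step (deferred worker): larger, blocking allocations are fine. */
static int capture_step(struct snap *s)
{
	s->copy = malloc(s->bo->size);
	if (!s->copy)
		return -1;
	memcpy(s->copy, s->bo->data, s->bo->size);
	return 0;
}

/* Print step: turn captured data into text. Returns length or -1. */
static int print_step(const struct snap *s, char *out, size_t len)
{
	size_t used, i;
	int n;

	n = snprintf(out, len, "ring_head=%u bo_size=%zu data=",
		     s->ring_head, s->bo->size);
	if (n < 0 || (size_t)n >= len)
		return -1;
	used = (size_t)n;
	for (i = 0; i < s->bo->size; i++) {
		n = snprintf(out + used, len - used, "%02x", s->copy[i]);
		if (n < 0 || (size_t)n >= len - used)
			return -1;
		used += (size_t)n;
	}
	return (int)used;
}

/* Drop the captured copy, the BO reference, and the snap itself. */
static void snap_release(struct snap *s)
{
	assert(s != NULL);
	free(s->copy);
	s->bo->refcount--;
	free(s);
}
```

The point of the split is that only snap_step() has to run before the
fences are signaled. capture_step() and print_step() run later, from the
worker, and the snap keeps its reference until snap_release().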

There is also a time-complexity bug in the capture printer. I added
support for offset-based reads of the print data to prevent it from
becoming insanely expensive.
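
As a rough illustration of what an offset-based read can look like, the
sketch below serves each read from text that has already been rendered,
instead of re-rendering from the start on every call. This is an
assumption about the general technique, not a copy of the Xe
implementation. dump_text and dump_read are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical cache of dump text that has already been rendered. */
struct dump_text {
	const char *buf;
	size_t len;
};

/*
 * Copy up to count bytes starting at offset into dst.
 * Returns the number of bytes copied, or 0 once offset is at or past
 * the end. Each call costs O(count), independent of offset.
 */
static size_t dump_read(const struct dump_text *d, char *dst,
			size_t offset, size_t count)
{
	size_t avail;

	assert(d != NULL && dst != NULL);
	if (offset >= d->len)
		return 0;
	avail = d->len - offset;
	if (count > avail)
		count = avail;
	memcpy(dst, d->buf + offset, count);
	return count;
}
```

A reader that pulls the dump in fixed-size chunks then does work
proportional to the total size, rather than to the square of it.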

If you have any questions, feel free to ping me, or refer to
xe_devcoredump.c for the implementation.

Matt

> You can't "log" a device coredump into dmesg. :)
