On Wed, Dec 24, 2025 at 7:28 PM Matthew Brost <[email protected]> wrote:
> [...]
> >  drivers/gpu/drm/xe/xe_tlb_inval.c | 6 +++---
> >  1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_tlb_inval.c b/drivers/gpu/drm/xe/xe_tlb_inval.c
> > index 918a59e686ea..b2cf6e17fbc5 100644
> > --- a/drivers/gpu/drm/xe/xe_tlb_inval.c
> > +++ b/drivers/gpu/drm/xe/xe_tlb_inval.c
> > @@ -94,7 +94,7 @@ static void xe_tlb_inval_fence_timeout(struct work_struct *work)
> >               xe_tlb_inval_fence_signal(fence);
> >       }
> >       if (!list_empty(&tlb_inval->pending_fences))
> > -             queue_delayed_work(system_wq, &tlb_inval->fence_tdr,
> > +             queue_delayed_work(system_percpu_wq, &tlb_inval->fence_tdr,
>
> Actually system_wq or system_percpu_wq doesn't work here as this is the
> fence signaling path. We should use one of Xe's ordered work queues,
> which is properly set up to be reclaim safe.

Hi,

So, for this specific workqueue only, we should use something like this instead:

	/** @ordered_wq: used to serialize compute mode resume */
	struct workqueue_struct *ordered_wq;
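As a side note on the reclaim-safe requirement Matthew mentioned: as far as
I understand, a workqueue used on a fence signaling / reclaim path needs
WQ_MEM_RECLAIM, so that a rescuer thread is always available under memory
pressure. A minimal illustration (the name here is made up; this is not a
claim about how Xe sets this up):

	/* Illustration only: a wq that must make forward progress under
	 * memory pressure needs WQ_MEM_RECLAIM to get a rescuer thread.
	 */
	struct workqueue_struct *wq =
		alloc_ordered_workqueue("my-reclaim-safe-wq", WQ_MEM_RECLAIM);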

I noticed this has been allocated using:

	xe->ordered_wq = alloc_ordered_workqueue("xe-ordered-wq", 0);

Using alloc_ordered_workqueue() makes this workqueue unbound:

#define alloc_ordered_workqueue(fmt, flags, args...)            \
	alloc_workqueue(fmt, WQ_UNBOUND | __WQ_ORDERED | (flags), 1, ##args)
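
Concretely, the change in xe_tlb_inval_fence_timeout() would then look
roughly like this (just a sketch: "xe" stands for whatever pointer to the
struct xe device is already reachable from tlb_inval, and "delay" for the
timeout value the call already passes today):

	/* Sketch: queue the fence TDR on Xe's ordered wq instead of the
	 * system wq.
	 */
	if (!list_empty(&tlb_inval->pending_fences))
		queue_delayed_work(xe->ordered_wq, &tlb_inval->fence_tdr,
				   delay);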

So this patch should be split in two:
- one patch switching to Xe's ordered workqueue, which "implicitly"
changes the behavior to unbound (see the sketch above);
- one patch replacing system_wq with the new per-cpu wq (system_percpu_wq).

Alternatively, if we want to keep this workqueue per-cpu, we could use
xe->unordered_wq, which is allocated with alloc_workqueue() without
specifying any flags (e.g. WQ_UNBOUND or the new WQ_PERCPU) and is
therefore per-cpu.
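
With the new flag, the per-cpu intent could also be made explicit instead
of relying on the absence of WQ_UNBOUND; a sketch, assuming WQ_PERCPU is
available in the tree and keeping the existing allocation shape:

	/* Sketch: state the per-cpu intent explicitly with WQ_PERCPU. */
	xe->unordered_wq = alloc_workqueue("xe-unordered-wq", WQ_PERCPU, 0);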

Does this sound reasonable to you?

Thanks!
-- 

Marco Crivellari
L3 Support Engineer
