On Wed, Dec 24, 2025 at 7:28 PM Matthew Brost <[email protected]> wrote:
> [...]
> > drivers/gpu/drm/xe/xe_tlb_inval.c | 6 +++---
> > 1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_tlb_inval.c b/drivers/gpu/drm/xe/xe_tlb_inval.c
> > index 918a59e686ea..b2cf6e17fbc5 100644
> > --- a/drivers/gpu/drm/xe/xe_tlb_inval.c
> > +++ b/drivers/gpu/drm/xe/xe_tlb_inval.c
> > @@ -94,7 +94,7 @@ static void xe_tlb_inval_fence_timeout(struct work_struct *work)
> >  		xe_tlb_inval_fence_signal(fence);
> >  	}
> >  	if (!list_empty(&tlb_inval->pending_fences))
> > -		queue_delayed_work(system_wq, &tlb_inval->fence_tdr,
> > +		queue_delayed_work(system_percpu_wq, &tlb_inval->fence_tdr,
>
> Actually system_wq or system_percpu_wq doesn't work here as this is the
> fence signaling path. We should use one of Xe's ordered work queues,
> which is properly set up to be reclaim safe.
Hi,

So, for this specific workqueue only, we should use something like this instead:

	/** @ordered_wq: used to serialize compute mode resume */
	struct workqueue_struct *ordered_wq;
I noticed it is allocated with:

	xe->ordered_wq = alloc_ordered_workqueue("xe-ordered-wq", 0);

Using alloc_ordered_workqueue() makes this workqueue unbound:

	#define alloc_ordered_workqueue(fmt, flags, args...) \
		alloc_workqueue(fmt, WQ_UNBOUND | __WQ_ORDERED | (flags), 1, ##args)
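
As a side note, for a fence signaling path I would also expect WQ_MEM_RECLAIM
on the queue, since that is the flag that gives a workqueue a rescuer thread
and guarantees forward progress under memory pressure. A minimal sketch of
such an allocation (illustration only, not the current Xe code):

	/* Ordered (one work item in flight) and safe to use during reclaim. */
	wq = alloc_ordered_workqueue("xe-ordered-wq", WQ_MEM_RECLAIM);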
So this patch should be split in two:
- one patch switching to Xe's ordered workqueue, "implicitly" changing the
  behavior to unbound (sketched below)
- one patch replacing system_wq with the new per-cpu workqueue
  (system_percpu_wq)
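
For the first patch, the hunk quoted above would then become something like
this (just a sketch: "xe" stands for the struct xe_device reachable from
tlb_inval, and "delay" stands for the delay argument that is cut off in the
quote):

	if (!list_empty(&tlb_inval->pending_fences))
		queue_delayed_work(xe->ordered_wq, &tlb_inval->fence_tdr,
				   delay);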
To keep this workqueue per-cpu, we can use xe->unordered_wq, which is
allocated with alloc_workqueue() without specifying any flags (e.g.
WQ_UNBOUND or the new WQ_PERCPU) and is therefore per-cpu (see the sketch
below).
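
To make that explicit with the new flag, the allocation could become
something like this (a sketch; I am assuming the current call in
xe_device.c is the no-flags one below):

	/* Today: no flags, bound (per-cpu) behavior by default. */
	xe->unordered_wq = alloc_workqueue("xe-unordered-wq", 0, 0);

	/* Same behavior, spelled out with the new WQ_PERCPU flag. */
	xe->unordered_wq = alloc_workqueue("xe-unordered-wq", WQ_PERCPU, 0);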
Does this sound reasonable to you?
Thanks!
--
Marco Crivellari
L3 Support Engineer