On Tue, 2026-05-05 at 20:33 -0700, Matthew Brost wrote:
> Xe/TTM backup reclaim can be extremely expensive under fragmentation
> pressure as reclaim may migrate or destroy actively used GPU working
> sets despite the system still having substantial free memory
> available.
>
> Under high-order opportunistic reclaim, repeatedly backing up GPU
> memory can lead to reclaim/rebind ping-pong behavior where active GPU
> working sets are continuously torn down and reconstructed without
> materially improving allocation success.
>
> Use the new shrink_control::opportunistic_compaction hint to avoid Xe
> backup reclaim during fragmentation-driven high-order reclaim
> attempts.
> In this mode the shrinker skips advertising backup-backed reclaimable
> memory and avoids initiating backup operations entirely.
>
> Order-0 and non-opportunistic reclaim behavior remain unchanged, so
> Xe backup reclaim still participates normally during genuine memory
> pressure.
>
> Cc: Andrew Morton <[email protected]>
> Cc: Dave Chinner <[email protected]>
> Cc: Qi Zheng <[email protected]>
> Cc: Roman Gushchin <[email protected]>
> Cc: Muchun Song <[email protected]>
> Cc: David Hildenbrand <[email protected]>
> Cc: Lorenzo Stoakes <[email protected]>
> Cc: "Liam R. Howlett" <[email protected]>
> Cc: Vlastimil Babka <[email protected]>
> Cc: Mike Rapoport <[email protected]>
> Cc: Suren Baghdasaryan <[email protected]>
> Cc: Michal Hocko <[email protected]>
> Cc: Johannes Weiner <[email protected]>
> Cc: Shakeel Butt <[email protected]>
> Cc: Kairui Song <[email protected]>
> Cc: Barry Song <[email protected]>
> Cc: Axel Rasmussen <[email protected]>
> Cc: Yuanchu Xie <[email protected]>
> Cc: Wei Xu <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Assisted-by: Claude:claude-opus-4.6
> Signed-off-by: Matthew Brost <[email protected]>

Reviewed-by: Thomas Hellström <[email protected]>

> ---
>  drivers/gpu/drm/xe/xe_shrinker.c | 20 +++++++++++++++++---
>  1 file changed, 17 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_shrinker.c b/drivers/gpu/drm/xe/xe_shrinker.c
> index 83374cd57660..4646b0f5b82b 100644
> --- a/drivers/gpu/drm/xe/xe_shrinker.c
> +++ b/drivers/gpu/drm/xe/xe_shrinker.c
> @@ -139,10 +139,17 @@ static unsigned long
>  xe_shrinker_count(struct shrinker *shrink, struct shrink_control *sc)
>  {
>  	struct xe_shrinker *shrinker = to_xe_shrinker(shrink);
> -	unsigned long num_pages;
> +	unsigned long num_pages = 0;
>  	bool can_backup = !!(sc->gfp_mask & __GFP_FS);
>
> -	num_pages = ttm_backup_bytes_avail() >> PAGE_SHIFT;
> +	/*
> +	 * Skip accounting backup-able pages when this is an opportunistic
> +	 * high-order pass: TTM backup work shrinks at native page granularity
> +	 * and is unlikely to produce the contiguous block the caller wants,
> +	 * so don't advertise it as reclaimable for this hint.
> +	 */
> +	if (!sc->order || !sc->opportunistic_compaction)
> +		num_pages = ttm_backup_bytes_avail() >> PAGE_SHIFT;
>  	read_lock(&shrinker->lock);
>
>  	if (can_backup)
> @@ -233,7 +240,14 @@ static unsigned long xe_shrinker_scan(struct shrinker *shrink, struct shrink_con
>  	}
>
>  	sc->nr_scanned = nr_scanned;
> -	if (nr_scanned >= nr_to_scan || !can_backup)
> +	/*
> +	 * Stop after the purge pass for opportunistic high-order reclaim:
> +	 * the subsequent backup/writeback pass works at native page order
> +	 * and is unlikely to free a contiguous high-order block, so doing
> +	 * it here would just churn working sets for no compaction benefit.
> +	 */
> +	if (nr_scanned >= nr_to_scan || !can_backup ||
> +	    (sc->order && sc->opportunistic_compaction))
>  		goto out;
>
>  	/* If we didn't wake before, try to do it now if needed. */
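
For readers following along, the two guards in the quoted hunks can be modeled in userspace to check that they are complements of each other, so the count and scan paths agree on when backup is skipped. This is only a rough sketch: struct model_shrink_control and the model_* helpers are hypothetical stand-ins invented here, not kernel API, and the model carries only the two fields the patch reads (order and opportunistic_compaction).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two shrink_control fields the patch reads. */
struct model_shrink_control {
	int order;                     /* allocation order driving this reclaim pass */
	bool opportunistic_compaction; /* hint named in the quoted patch */
};

/* Scan-side condition: true when the backup/writeback pass is skipped. */
static bool model_skip_backup(const struct model_shrink_control *sc)
{
	return sc->order && sc->opportunistic_compaction;
}

/*
 * Count-side behavior: advertise backup-able pages only when the
 * backup pass would actually run; otherwise report none.
 */
static unsigned long model_count_backup(const struct model_shrink_control *sc,
					unsigned long backup_pages_avail)
{
	return model_skip_backup(sc) ? 0 : backup_pages_avail;
}

/*
 * The count hunk tests (!order || !hint) while the scan hunk tests
 * (order && hint). Walk all four combinations and confirm one is
 * exactly the negation of the other.
 */
static void model_check_guards_agree(void)
{
	for (int order = 0; order <= 1; order++) {
		for (int hint = 0; hint <= 1; hint++) {
			struct model_shrink_control sc = { order, hint != 0 };
			bool count_side = !sc.order || !sc.opportunistic_compaction;
			bool scan_side = sc.order && sc.opportunistic_compaction;

			assert(count_side == !scan_side);
		}
	}
}
```

In this model only the combination of a non-zero order and the hint suppresses backup. Order-0 passes and high-order passes without the hint still report the full backup-able page count, which matches the "remain unchanged" statement in the commit message.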
