On Tue, 2026-05-05 at 20:33 -0700, Matthew Brost wrote:
> Xe/TTM backup reclaim can be extremely expensive under fragmentation
> pressure as reclaim may migrate or destroy actively used GPU working
> sets despite the system still having substantial free memory
> available.
> 
> Under high-order opportunistic reclaim, repeatedly backing up GPU
> memory can lead to reclaim/rebind ping-pong behavior where active GPU
> working sets are continuously torn down and reconstructed without
> materially improving allocation success.
> 
> Use the new shrink_control::opportunistic_compaction hint to avoid Xe
> backup reclaim during fragmentation-driven high-order reclaim
> attempts. In this mode the shrinker skips advertising backup-backed
> reclaimable memory and avoids initiating backup operations entirely.
> 
> Order-0 and non-opportunistic reclaim behavior remains unchanged, so
> Xe backup reclaim still participates normally during genuine memory
> pressure.
> 
> Cc: Andrew Morton <[email protected]>
> Cc: Dave Chinner <[email protected]>
> Cc: Qi Zheng <[email protected]>
> Cc: Roman Gushchin <[email protected]>
> Cc: Muchun Song <[email protected]>
> Cc: David Hildenbrand <[email protected]>
> Cc: Lorenzo Stoakes <[email protected]>
> Cc: "Liam R. Howlett" <[email protected]>
> Cc: Vlastimil Babka <[email protected]>
> Cc: Mike Rapoport <[email protected]>
> Cc: Suren Baghdasaryan <[email protected]>
> Cc: Michal Hocko <[email protected]>
> Cc: Johannes Weiner <[email protected]>
> Cc: Shakeel Butt <[email protected]>
> Cc: Kairui Song <[email protected]>
> Cc: Barry Song <[email protected]>
> Cc: Axel Rasmussen <[email protected]>
> Cc: Yuanchu Xie <[email protected]>
> Cc: Wei Xu <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Assisted-by: Claude:claude-opus-4.6
> Signed-off-by: Matthew Brost <[email protected]>

Reviewed-by: Thomas Hellström <[email protected]>

> ---
>  drivers/gpu/drm/xe/xe_shrinker.c | 20 +++++++++++++++++---
>  1 file changed, 17 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_shrinker.c b/drivers/gpu/drm/xe/xe_shrinker.c
> index 83374cd57660..4646b0f5b82b 100644
> --- a/drivers/gpu/drm/xe/xe_shrinker.c
> +++ b/drivers/gpu/drm/xe/xe_shrinker.c
> @@ -139,10 +139,17 @@ static unsigned long
>  xe_shrinker_count(struct shrinker *shrink, struct shrink_control *sc)
>  {
>       struct xe_shrinker *shrinker = to_xe_shrinker(shrink);
> -     unsigned long num_pages;
> +     unsigned long num_pages = 0;
>       bool can_backup = !!(sc->gfp_mask & __GFP_FS);
>  
> -     num_pages = ttm_backup_bytes_avail() >> PAGE_SHIFT;
> +     /*
> +      * Skip accounting backup-able pages when this is an opportunistic
> +      * high-order pass: TTM backup work shrinks at native page granularity
> +      * and is unlikely to produce the contiguous block the caller wants,
> +      * so don't advertise it as reclaimable for this hint.
> +      */
> +     if (!sc->order || !sc->opportunistic_compaction)
> +             num_pages = ttm_backup_bytes_avail() >> PAGE_SHIFT;
>       read_lock(&shrinker->lock);
>  
>       if (can_backup)
> @@ -233,7 +240,14 @@ static unsigned long xe_shrinker_scan(struct shrinker *shrink, struct shrink_con
>       }
>  
>       sc->nr_scanned = nr_scanned;
> -     if (nr_scanned >= nr_to_scan || !can_backup)
> +     /*
> +      * Stop after the purge pass for opportunistic high-order reclaim:
> +      * the subsequent backup/writeback pass works at native page order
> +      * and is unlikely to free a contiguous high-order block, so doing
> +      * it here would just churn working sets for no compaction benefit.
> +      */
> +     if (nr_scanned >= nr_to_scan || !can_backup ||
> +         (sc->order && sc->opportunistic_compaction))
>               goto out;
>  
>       /* If we didn't wake before, try to do it now if needed. */
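
For readers following the thread, the gating condition that both hunks
share can be sketched in isolation as below. This is only an
illustration, not code from the patch: struct sc_sketch, skip_backup()
and backup_pages_to_count() are hypothetical names, and the struct merely
stands in for the two shrink_control fields the patch tests (order and
opportunistic_compaction).

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two shrink_control fields the patch tests. */
struct sc_sketch {
	int order;                     /* order of the allocation driving reclaim */
	bool opportunistic_compaction; /* hint added earlier in this series */
};

/*
 * True only for an opportunistic high-order pass, i.e. the one case in
 * which the patch skips backup work. Order-0 reclaim and
 * non-opportunistic reclaim both return false.
 */
static bool skip_backup(const struct sc_sketch *sc)
{
	return sc->order && sc->opportunistic_compaction;
}

/*
 * Number of backup-able pages to advertise from the count callback.
 * backup_pages is an assumed input standing in for
 * ttm_backup_bytes_avail() >> PAGE_SHIFT.
 */
static unsigned long backup_pages_to_count(const struct sc_sketch *sc,
					   unsigned long backup_pages)
{
	return skip_backup(sc) ? 0 : backup_pages;
}
```

With these assumptions, an order-0 pass with the hint set still counts
backup-able pages, as does a high-order pass without the hint; only the
combination of both suppresses them.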
