Thank you for your contribution! Sashiko AI review found 3 potential issue(s) 
to consider:
- [High] Concurrent order-0 wakers cause opportunistic high-order kswapd 
reclaims to become destructive, defeating the patch's purpose.
- [High] The opportunistic compaction hint decays to UNSET on retries and 
premature wakeups, silently converting opportunistic reclaim into destructive 
reclaim.
- [Medium] The `order` field referenced in `struct shrink_control`'s 
documentation is missing from the struct, rendering the API incomplete.
--

commit 458389aa250d63553aa4e2e1f9e0c39d7fbae43e
Author: Matthew Brost <[email protected]>
Subject: mm: Introduce opportunistic_compaction concept to vmscan and shrinkers

This commit introduces an opportunistic compaction hint for kswapd and
shrinkers. It aims to prevent aggressive reclaim of working sets during
fragmentation-sensitive high-order allocations. Shrinkers can use this
hint to skip destructive work when reclaim is unlikely to produce a
contiguous high-order block.

> diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h
> index 1a00be90d93a9..5f3e8dc981296 100644
> --- a/include/linux/shrinker.h
> +++ b/include/linux/shrinker.h
> @@ -37,6 +37,26 @@ struct shrink_control {
>       /* current node being shrunk (for NUMA aware shrinkers) */
>       int nid;
>  
> +     /*
> +      * Opportunistic compaction hint.
> +      *
> +      * Set by the reclaim path to tell shrinkers that this pass is
> +      * driven by an order > 0 allocation that the caller is willing to
> +      * have fail (e.g., __GFP_NORETRY / __GFP_RETRY_MAYFAIL without
> +      * __GFP_NOFAIL). Such allocations only really benefit from
> +      * shrinking when doing so frees up a contiguous, high-order block;
> +      * thrashing working sets in the hope of producing one is typically
> +      * counter-productive.
> +      *
> +      * Shrinkers that can produce naturally-aligned high-order folios
> +      * (see shrink_control::order) should treat this as a hint to skip
> +      * costly work that is unlikely to help compaction (for example,
> +      * evicting hot/working-set pages just to free single pages).
> +      *
> +      * Only meaningful when @order > 0; ignored otherwise.
> +      */
> +     bool opportunistic_compaction;

[Severity: Medium]
The comment references a shrink_control::order field, but does this field
actually exist in struct shrink_control?

Without passing the target allocation order from scan_control through
shrink_slab(), it seems shrinker implementations won't be able to determine
the requested block size to conditionally skip work.

> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index bd1b1aa125819..c40bc3f9ddd43 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c

[ ... ]

> @@ -7385,6 +7428,11 @@ static void kswapd_try_to_sleep(pg_data_t *pgdat, int 
> alloc_order, int reclaim_o
>  
>                       if (READ_ONCE(pgdat->kswapd_order) < reclaim_order)
>                               WRITE_ONCE(pgdat->kswapd_order, reclaim_order);
> +
> +                     if (kswapd_opportunistic_compaction ==
> +                         KSWAPD_NO_OPPORTUNISTIC_COMPACTION)
> +                             
> atomic_set(&pgdat->kswapd_opportunistic_compaction,
> +                                        KSWAPD_NO_OPPORTUNISTIC_COMPACTION);
>               }
>  
>               finish_wait(&pgdat->kswapd_wait, &wait);

[Severity: High]
Does this conditional block inadvertently allow the opportunistic hint to
decay on retries?

In kswapd(), the hint is extracted and reset to
KSWAPD_UNSET_OPPORTUNISTIC_COMPACTION before calling balance_pgdat().

If the first reclaim pass fails and kswapd goes back to sleep, this code
only restores the hint if it was KSWAPD_NO_OPPORTUNISTIC_COMPACTION. 
If it was KSWAPD_OPPORTUNISTIC_COMPACTION, the state decays to UNSET.

On the subsequent retry, sc_opportunistic_compaction() will evaluate
UNSET as false, meaning a multi-pass reclaim or premature wakeup will
cause kswapd to resume full destructive reclaim for the high-order
allocation.

[ ... ]

> @@ -7537,6 +7594,28 @@ void wakeup_kswapd(struct zone *zone, gfp_t gfp_flags, 
> int order,
>       if (READ_ONCE(pgdat->kswapd_order) < order)
>               WRITE_ONCE(pgdat->kswapd_order, order);
>  
> +     /*
> +      * Fold this waker into the per-pgdat opportunistic-compaction hint
> +      * that kswapd will pick up at the start of its next run.
> +      *
> +      * The state is sticky in the "NO" direction: once any waker in this
> +      * batch is order-0 or a non-failable high-order allocation, the hint
> +      * stays cleared until kswapd consumes it. Only when every waker so
> +      * far is a failable high-order allocation do we set
> +      * KSWAPD_OPPORTUNISTIC_COMPACTION, asking shrinkers to skip work
> +      * that won't realistically help compaction.
> +      */
> +     if (atomic_read(&pgdat->kswapd_opportunistic_compaction) !=
> +         KSWAPD_NO_OPPORTUNISTIC_COMPACTION) {
> +             if (!order || !gfp_opportunistic_compaction(gfp_flags))
> +                     atomic_set(&pgdat->kswapd_opportunistic_compaction,
> +                                KSWAPD_NO_OPPORTUNISTIC_COMPACTION);
> +             else if (order && gfp_opportunistic_compaction(gfp_flags))
> +                     atomic_cmpxchg(&pgdat->kswapd_opportunistic_compaction,
> +                                    KSWAPD_UNSET_OPPORTUNISTIC_COMPACTION,
> +                                    KSWAPD_OPPORTUNISTIC_COMPACTION);
> +     }

[Severity: High]
Can concurrent order-0 wakers cause opportunistic high-order kswapd
reclaims to become destructive here?

Since pgdat->kswapd_order tracks the maximum requested allocation order
while the hint independently aggregates with a sticky-NO approach, what
happens if an opportunistic high-order allocation (e.g., order 9) and a
normal order-0 allocation concurrently wake kswapd?

It appears the aggregated state becomes order=9 with
KSWAPD_NO_OPPORTUNISTIC_COMPACTION. This would cause kswapd to execute
balance_pgdat() for order 9 with the non-opportunistic hint, meaning
shrinkers receive opportunistic_compaction = false and aggressively destroy
the working set to satisfy the order-9 allocation.

Because order-0 allocations (like page cache faults) are common, does this
guarantee the opportunistic compaction heuristic will be routinely bypassed
under normal memory pressure?

-- 
Sashiko AI review ยท 
https://sashiko.dev/#/patchset/[email protected]?part=1

Reply via email to