Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider: - [High] Concurrent order-0 wakers cause opportunistic high-order kswapd reclaims to become destructive, defeating the patch's purpose. - [High] The opportunistic compaction hint decays to UNSET on retries and premature wakeups, silently converting opportunistic reclaim into destructive reclaim. - [Medium] The `order` field referenced in `struct shrink_control`'s documentation is missing from the struct, rendering the API incomplete. --
commit 458389aa250d63553aa4e2e1f9e0c39d7fbae43e Author: Matthew Brost <[email protected]> Subject: mm: Introduce opportunistic_compaction concept to vmscan and shrinkers This commit introduces an opportunistic compaction hint for kswapd and shrinkers. It aims to prevent aggressive reclaim of working sets during fragmentation-sensitive high-order allocations. Shrinkers can use this hint to skip destructive work when reclaim is unlikely to produce a contiguous high-order block. > diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h > index 1a00be90d93a9..5f3e8dc981296 100644 > --- a/include/linux/shrinker.h > +++ b/include/linux/shrinker.h > @@ -37,6 +37,26 @@ struct shrink_control { > /* current node being shrunk (for NUMA aware shrinkers) */ > int nid; > > + /* > + * Opportunistic compaction hint. > + * > + * Set by the reclaim path to tell shrinkers that this pass is > + * driven by an order > 0 allocation that the caller is willing to > + * have fail (e.g., __GFP_NORETRY / __GFP_RETRY_MAYFAIL without > + * __GFP_NOFAIL). Such allocations only really benefit from > + * shrinking when doing so frees up a contiguous, high-order block; > + * thrashing working sets in the hope of producing one is typically > + * counter-productive. > + * > + * Shrinkers that can produce naturally-aligned high-order folios > + * (see shrink_control::order) should treat this as a hint to skip > + * costly work that is unlikely to help compaction (for example, > + * evicting hot/working-set pages just to free single pages). > + * > + * Only meaningful when @order > 0; ignored otherwise. > + */ > + bool opportunistic_compaction; [Severity: Medium] The comment references a shrink_control::order field, but does this field actually exist in struct shrink_control? Without passing the target allocation order from scan_control through shrink_slab(), it seems shrinker implementations won't be able to determine the requested block size to conditionally skip work. > diff --git a/mm/vmscan.c b/mm/vmscan.c > index bd1b1aa125819..c40bc3f9ddd43 100644 > --- a/mm/vmscan.c > +++ b/mm/vmscan.c [ ... ] > @@ -7385,6 +7428,11 @@ static void kswapd_try_to_sleep(pg_data_t *pgdat, int > alloc_order, int reclaim_o > > if (READ_ONCE(pgdat->kswapd_order) < reclaim_order) > WRITE_ONCE(pgdat->kswapd_order, reclaim_order); > + > + if (kswapd_opportunistic_compaction == > + KSWAPD_NO_OPPORTUNISTIC_COMPACTION) > + > atomic_set(&pgdat->kswapd_opportunistic_compaction, > + KSWAPD_NO_OPPORTUNISTIC_COMPACTION); > } > > finish_wait(&pgdat->kswapd_wait, &wait); [Severity: High] Does this conditional block inadvertently allow the opportunistic hint to decay on retries? In kswapd(), the hint is extracted and reset to KSWAPD_UNSET_OPPORTUNISTIC_COMPACTION before calling balance_pgdat(). If the first reclaim pass fails and kswapd goes back to sleep, this code only restores the hint if it was KSWAPD_NO_OPPORTUNISTIC_COMPACTION. If it was KSWAPD_OPPORTUNISTIC_COMPACTION, the state decays to UNSET. On the subsequent retry, sc_opportunistic_compaction() will evaluate UNSET as false, meaning a multi-pass reclaim or premature wakeup will cause kswapd to resume full destructive reclaim for the high-order allocation. [ ... ] > @@ -7537,6 +7594,28 @@ void wakeup_kswapd(struct zone *zone, gfp_t gfp_flags, > int order, > if (READ_ONCE(pgdat->kswapd_order) < order) > WRITE_ONCE(pgdat->kswapd_order, order); > > + /* > + * Fold this waker into the per-pgdat opportunistic-compaction hint > + * that kswapd will pick up at the start of its next run. > + * > + * The state is sticky in the "NO" direction: once any waker in this > + * batch is order-0 or a non-failable high-order allocation, the hint > + * stays cleared until kswapd consumes it. Only when every waker so > + * far is a failable high-order allocation do we set > + * KSWAPD_OPPORTUNISTIC_COMPACTION, asking shrinkers to skip work > + * that won't realistically help compaction. > + */ > + if (atomic_read(&pgdat->kswapd_opportunistic_compaction) != > + KSWAPD_NO_OPPORTUNISTIC_COMPACTION) { > + if (!order || !gfp_opportunistic_compaction(gfp_flags)) > + atomic_set(&pgdat->kswapd_opportunistic_compaction, > + KSWAPD_NO_OPPORTUNISTIC_COMPACTION); > + else if (order && gfp_opportunistic_compaction(gfp_flags)) > + atomic_cmpxchg(&pgdat->kswapd_opportunistic_compaction, > + KSWAPD_UNSET_OPPORTUNISTIC_COMPACTION, > + KSWAPD_OPPORTUNISTIC_COMPACTION); > + } [Severity: High] Can concurrent order-0 wakers cause opportunistic high-order kswapd reclaims to become destructive here? Since pgdat->kswapd_order tracks the maximum requested allocation order while the hint independently aggregates with a sticky-NO approach, what happens if an opportunistic high-order allocation (e.g., order 9) and a normal order-0 allocation concurrently wake kswapd? It appears the aggregated state becomes order=9 with KSWAPD_NO_OPPORTUNISTIC_COMPACTION. This would cause kswapd to execute balance_pgdat() for order 9 with the non-opportunistic hint, meaning shrinkers receive opportunistic_compaction = false and aggressively destroy the working set to satisfy the order-9 allocation. Because order-0 allocations (like page cache faults) are common, does this guarantee the opportunistic compaction heuristic will be routinely bypassed under normal memory pressure? -- Sashiko AI review ยท https://sashiko.dev/#/patchset/[email protected]?part=1
