> diff --git a/mm/bpf_memcontrol.c b/mm/bpf_memcontrol.c
> index 1f726a7b22e3..0353c8736aa5 100644
> --- a/mm/bpf_memcontrol.c
> +++ b/mm/bpf_memcontrol.c

[ ... ]

> @@ -162,6 +163,60 @@ __bpf_kfunc void bpf_mem_cgroup_flush_stats(struct 
> mem_cgroup *memcg)
>       mem_cgroup_flush_stats(memcg);
>  }
>
> +/**
> + * bpf_try_to_free_mem_cgroup_pages - attempt to reclaim pages from
> + *                                    a memory cgroup
> + * @memcg:           the target memory cgroup to reclaim from
> + * @nr_pages:        the number of pages to reclaim
> + * @gfp_mask:        GFP flags controlling the reclaim behavior
> + * @reclaim_options: bitmask of MEMCG_RECLAIM_* flags to tune
> + *                   reclaim strategy
> + * @swappiness:      swappiness override value, or a sentinel to use
> + *                   the default
> + *
> + * BPF-facing wrapper around try_to_free_mem_cgroup_pages() that
> + * validates and translates the @swappiness argument before
> + * delegating to the core reclaim path.
> + *
> + * The @swappiness parameter follows these semantics:
> + *   - Values in [MIN_SWAPPINESS, SWAPPINESS_ANON_ONLY] are passed
> + *     through as an explicit swappiness override.
> + *   - Values below MIN_SWAPPINESS are treated as "use the system
> + *     default"; the override pointer is set to NULL and the cgroup's
> + *     own swappiness setting takes effect.
> + *   - Values above SWAPPINESS_ANON_ONLY are rejected as invalid.
> + *   - If @reclaim_options does not include MEMCG_RECLAIM_PROACTIVE,
> + *     the @swappiness override is ignored entirely by the core
> + *     reclaim path and the system default is used regardless.
> + *
> + * Swap usage during reclaim is gated on @reclaim_options: swap is
> + * considered only when MEMCG_RECLAIM_MAY_SWAP is set.  Without this
> + * flag, reclaim is restricted to file-backed pages regardless of the
> + * @swappiness value or the cgroup's swappiness setting.
> + *
> + * Return:
> + *   The number of pages actually reclaimed on success, or -%EINVAL
> + *   if @swappiness exceeds SWAPPINESS_ANON_ONLY.
> + */
> +unsigned long bpf_try_to_free_mem_cgroup_pages(struct mem_cgroup *memcg,

Should this function have the __bpf_kfunc annotation?  Every other kfunc
in this file (bpf_get_root_mem_cgroup, bpf_get_mem_cgroup,
bpf_put_mem_cgroup, bpf_mem_cgroup_vm_events, bpf_mem_cgroup_usage,
bpf_mem_cgroup_memory_events, bpf_mem_cgroup_page_state,
bpf_mem_cgroup_flush_stats) has it.

Documentation/bpf/kfuncs.rst section 2.5 states that kfunc definitions
should always be annotated with the __bpf_kfunc macro to prevent the
compiler from inlining the kfunc or the function being elided in an LTO
build as it's not used elsewhere in the kernel.

> +                                            unsigned long nr_pages,
> +                                            gfp_t gfp_mask,
> +                                            unsigned int reclaim_options,
> +                                            int swappiness)
> +{
> +     int *swapiness_ptr;
> +
> +     if (swappiness > SWAPPINESS_ANON_ONLY)
> +             return -EINVAL;

How can BPF callers detect this error?  The function returns
'unsigned long' (matching try_to_free_mem_cgroup_pages()'s page count),
but here it returns -EINVAL.

Implicit conversion of -EINVAL (int -22) to unsigned long produces
0xFFFFFFFFFFFFFFEA -- a very large positive value, not a negative error
code.  The kernel-doc above explicitly states "or -%EINVAL if
@swappiness exceeds SWAPPINESS_ANON_ONLY", which is impossible for an
unsigned return.

A BPF caller cannot distinguish the error from a real reclaim count via
'if (ret < 0)' because ret is unsigned.  The sample BPF program later
in the series (samples/bpf/memcg.bpf.c in commit 8dcfd8aa3316) checks
'bpf_try_to_free_mem_cgroup_pages(...) <= 0' which would only match 0
and would silently treat the giant 'error' value as 'many pages
reclaimed'.

Should this either return 'long' instead (so callers can branch on
negative), or use a clearly defined sentinel like (unsigned long)-1
(as the other kfuncs in this file do for errors)?

> +     else if (swappiness < MIN_SWAPPINESS)
> +             swapiness_ptr = NULL;
> +     else
> +             swapiness_ptr = &swappiness;
> +
> +     return try_to_free_mem_cgroup_pages(memcg, nr_pages, gfp_mask,
> +                                         reclaim_options, swapiness_ptr);
> +}

[ ... ]

> @@ -175,6 +230,8 @@ BTF_ID_FLAGS(func, bpf_mem_cgroup_usage)
>  BTF_ID_FLAGS(func, bpf_mem_cgroup_page_state)
>  BTF_ID_FLAGS(func, bpf_mem_cgroup_flush_stats, KF_SLEEPABLE)
>
> +BTF_ID_FLAGS(func, bpf_try_to_free_mem_cgroup_pages, KF_SLEEPABLE)
> +
>  BTF_KFUNCS_END(bpf_memcontrol_kfuncs)


---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md

CI run summary: https://github.com/kernel-patches/bpf/actions/runs/26429228214

Reply via email to