> diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> index 94fcd4ab39e9..8abb99712043 100644
> --- a/kernel/bpf/helpers.c
> +++ b/kernel/bpf/helpers.c
> @@ -2247,6 +2247,9 @@ void bpf_list_head_free(const struct btf_field *field, void *list_head,
>       if (!head->next || list_empty(head))
>               goto unlock;
>       head = head->next;
> +     /* Clear owner under spinlock to ensure the owner is always valid */
> +     for (pos = head; pos != orig_head; pos = pos->next)
> +             WRITE_ONCE(container_of(pos, struct bpf_list_node_kern, list_head)->owner, NULL);
>  unlock:
>       INIT_LIST_HEAD(orig_head);
>       __bpf_spin_unlock_irqrestore(spin_lock);

Clearing owner to NULL under lock A is correct for concurrent readers
(bpf_list_front/back), but it may open a window for concurrent
writers.

After the lock is released, the drain loop below processes each node
without holding any lock. Meanwhile, if a BPF program on another CPU
holds a refcounted owning reference to one of these nodes (obtained
earlier via bpf_refcount_acquire), it can attempt __bpf_list_add to
a different list under a different lock B.

In __bpf_list_add():

    if (cmpxchg(&node->owner, NULL, BPF_PTR_POISON)) {
        ...
    }
    list_add(n, prev);

Before this patch, owner pointed to the old list head (non-NULL), so
the cmpxchg would fail and the add would be rejected. With owner
cleared to NULL, the cmpxchg succeeds, and list_add() modifies the
node's next/prev to point into list B.

> @@ -2255,7 +2258,9 @@ void bpf_list_head_free(const struct btf_field *field, void *list_head,
>               void *obj = head;
>
>               obj -= field->graph_root.node_offset;
> +             pos = head;
>               head = head->next;
> +             list_del_init(pos);

If the concurrent list_add above has already modified head->next, does
this drain loop then follow into list B's chain? The termination
condition (head != orig_head) would never be satisfied by list B's
nodes, and the loop would call __bpf_obj_drop_impl on list B's nodes
without holding lock B.

Would it be safer to also call list_del_init() under the spinlock
(e.g., splicing into a local list), so that nodes are fully detached
from the chain before the lock is released? The existing comment notes
that draining outside the lock avoids deadlocks from tracing programs,
but the list_del_init operations themselves should be inexpensive
compared to __bpf_obj_drop_impl.

This commit also appears to be a bug fix (addressing stale owner
pointers after drain) but does not include a Fixes: tag. The concurrent
reader issue was introduced when bpf_list_front/back was added.

Suggested tag:

    Fixes: fb5b480205ba ("bpf: Add bpf_list_{front,back} kfunc")


---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md

CI run summary: https://github.com/kernel-patches/bpf/actions/runs/23710972607
