On Tue, Mar 10, 2026 at 9:17 PM Kuniyuki Iwashima <[email protected]> wrote:
>
> On Fri, Mar 6, 2026 at 6:05 AM Michal Luczaj <[email protected]> wrote:
> >
> > On 3/6/26 06:44, Kuniyuki Iwashima wrote:
> > > On Thu, Mar 5, 2026 at 3:32 PM Michal Luczaj <[email protected]> wrote:
> > >>
> > >> Instead of repeating the same (un)locking pattern, reuse
> > >> sock_map_sk_{acquire,release}(). This centralizes the code and makes it
> > >> easier to adapt sockmap to af_unix-specific locking.
> > >>
> > >> Signed-off-by: Michal Luczaj <[email protected]>
> > >> ---
> > >>  net/core/sock_map.c | 21 +++++++--------------
> > >>  1 file changed, 7 insertions(+), 14 deletions(-)
> > >>
> > >> diff --git a/net/core/sock_map.c b/net/core/sock_map.c
> > >> index 02a68be3002a..7ba6a7f24ccd 100644
> > >> --- a/net/core/sock_map.c
> > >> +++ b/net/core/sock_map.c
> > >> @@ -353,11 +353,9 @@ static void sock_map_free(struct bpf_map *map)
> > >>                 sk = xchg(psk, NULL);
> > >>                 if (sk) {
> > >>                         sock_hold(sk);
> > >> -                       lock_sock(sk);
> > >> -                       rcu_read_lock();
> > >> +                       sock_map_sk_acquire(sk);
> > >>                         sock_map_unref(sk, psk);
> > >> -                       rcu_read_unlock();
> > >> -                       release_sock(sk);
> > >> +                       sock_map_sk_release(sk);
> > >>                         sock_put(sk);
> > >>                 }
> > >>         }
> > >> @@ -1176,11 +1174,9 @@ static void sock_hash_free(struct bpf_map *map)
> > >>                  */
> > >>                 hlist_for_each_entry_safe(elem, node, &unlink_list, node) {
> > >>                         hlist_del(&elem->node);
> > >> -                       lock_sock(elem->sk);
> > >> -                       rcu_read_lock();
> > >> +                       sock_map_sk_acquire(elem->sk);
> > >>                         sock_map_unref(elem->sk, elem);
> > >> -                       rcu_read_unlock();
> > >> -                       release_sock(elem->sk);
> > >> +                       sock_map_sk_release(elem->sk);
> > >>                         sock_put(elem->sk);
> > >>                         sock_hash_free_elem(htab, elem);
> > >>                 }
> > >> @@ -1676,8 +1672,7 @@ void sock_map_close(struct sock *sk, long timeout)
> > >>         void (*saved_close)(struct sock *sk, long timeout);
> > >>         struct sk_psock *psock;
> > >>
> > >> -       lock_sock(sk);
> > >> -       rcu_read_lock();
> > >> +       sock_map_sk_acquire(sk);
> > >>         psock = sk_psock(sk);
> > >>         if (likely(psock)) {
> > >>                 saved_close = psock->saved_close;
> > >> @@ -1685,16 +1680,14 @@ void sock_map_close(struct sock *sk, long timeout)
> > >>                 psock = sk_psock_get(sk);
> > >>                 if (unlikely(!psock))
> > >>                         goto no_psock;
> > >> -               rcu_read_unlock();
> > >>                 sk_psock_stop(psock);
> > >> -               release_sock(sk);
> > >> +               sock_map_sk_release(sk);
> > >
> > > I think sk_psock_stop() was intentionally put outside
> > > of rcu_read_lock() to not extend the grace period
> > > unnecessarily. e.g. while + __sk_msg_free().
> > >
> > > Maybe add __sock_map_sk_release() without
> > > rcu_read_unlock() ?
> >
> > How about dropping this patch completely? The more I stare at it, I see no
> > reason why af_unix state lock would matter in any of these places.
>
> I agree.  Actually, once it's held, it can be released right away.
> The lock is only to ensure that peer is set after checking
> TCP_ESTABLISHED, but it continues holding unix_state_lock()
> unnecessarily long.
>
> Honestly I prefer Martin's idea, using unix_peer_get() in
> unix_stream_bpf_update_proto().

Pondering again, I'm leaning towards my initial approach:
just a NULL check for the peer, which allows bpf_iter to acquire
unix_state_lock() and make sure the socket is alive.
(lock_sock() is still needed for bpf_setsockopt())
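
For illustration, the check in unix_stream_bpf_update_proto()
would be roughly like below (just a sketch, not an actual patch;
the error code is a placeholder):

	/* Lightweight: read the peer pointer without taking
	 * unix_state_lock(), leaving the lock free for bpf_iter.
	 */
	if (!unix_peer(sk))
		return -EOPNOTSUPP;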

The check is lightweight, SOCKMAP does not need to hold the
lock unnecessarily, and we can provide a stable result to
bpf_iter.

IOW, if we hold unix_state_lock() in the SOCKMAP path
(even via unix_peer_get()), we cannot use unix_state_lock()
for bpf_iter and we lose stability, since it would trigger a deadlock.
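
For reference, unix_peer_get() itself takes unix_state_lock()
around the peer lookup, which is why using it from the SOCKMAP
path would conflict with bpf_iter taking the same lock
(existing af_unix logic, quoted from memory):

	struct sock *unix_peer_get(struct sock *s)
	{
		struct sock *peer;

		unix_state_lock(s);
		peer = unix_peer(s);
		if (peer)
			sock_hold(peer);
		unix_state_unlock(s);
		return peer;
	}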
