March 6, 2026 at 14:04, "Jiayuan Chen" <jiayuan.chen@linux.dev> wrote:
>
> On 3/6/26 7:30 AM, Michal Luczaj wrote:
>
[...]
> > diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
> > index 3756a93dc63a..3d2cfb4ecbcd 100644
> > --- a/net/unix/af_unix.c
> > +++ b/net/unix/af_unix.c
> > @@ -3729,15 +3729,14 @@ static int bpf_iter_unix_seq_show(struct seq_file *seq, void *v)
> >  	struct bpf_prog *prog;
> >  	struct sock *sk = v;
> >  	uid_t uid;
> > -	bool slow;
> >  	int ret;
> >
> >  	if (v == SEQ_START_TOKEN)
> >  		return 0;
> >
> > -	slow = lock_sock_fast(sk);
> > +	lock_sock(sk);
> >
> > -	if (unlikely(sk_unhashed(sk))) {
> > +	if (unlikely(sock_flag(sk, SOCK_DEAD))) {
> >  		ret = SEQ_SKIP;
> >  		goto unlock;
> >  	}
> >
> Switching to lock_sock() fixes the deadlock, but it does not provide mutual
> exclusion with unix_release_sock(), which uses unix_state_lock() exclusively
> and does not touch lock_sock() at all. So a dying socket can still reach the
> BPF prog concurrently with unix_release_sock() running on another CPU.
>
> Both SOCK_DEAD and the clearing of unix_peer(sk) happen under
> unix_state_lock() in unix_release_sock(). Without taking unix_state_lock()
> before the SOCK_DEAD check, there is a window:
>
> iter unix_release_sock()
> --- lock_sock(sk)
> SOCK_DEAD == 0(check passes)
> unix_state_lock(sk)
> unix_peer(sk) = NULL
> sock_set_flag(sk, SOCK_DEAD)
> unix_state_unlock(sk)
> BPF prog runs
> → accesses unix_peer(sk) == NULL → crash
Sorry for the malformed message.
Here is the corrected diagram:
iter                              unix_release_sock()
---
lock_sock(sk)
SOCK_DEAD == 0 (check passes)
                                  unix_state_lock(sk)
                                  unix_peer(sk) = NULL
                                  sock_set_flag(sk, SOCK_DEAD)
                                  unix_state_unlock(sk)
BPF prog runs
→ accesses unix_peer(sk) == NULL → crash
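
To make the window concrete, here is a rough userspace model of the interleaving above. It is only a sketch, not kernel code and not a proposed patch: model_sock, model_release() and model_replay() are hypothetical names, the two locks are plain C11 atomic_flag try-locks standing in for lock_sock() and unix_state_lock(), and the "other CPU" is replayed inline in a single thread so the ordering is deterministic.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only. Every name here is hypothetical, not kernel API. */
struct model_sock {
	atomic_flag owner_lock;	/* stands in for lock_sock() */
	atomic_flag state_lock;	/* stands in for unix_state_lock() */
	bool dead;		/* stands in for SOCK_DEAD */
	void *peer;		/* stands in for unix_peer(sk) */
};

struct model_result {
	bool check_passed;	/* iterator saw dead == false */
	bool release_ran;	/* release path got its lock inside the window */
	bool peer_null_at_prog;	/* what the "BPF prog" would observe */
};

static bool model_trylock(atomic_flag *lock)
{
	return !atomic_flag_test_and_set(lock);
}

static void model_unlock(atomic_flag *lock)
{
	atomic_flag_clear(lock);
}

/* Models unix_release_sock(): it only ever takes state_lock. */
static bool model_release(struct model_sock *sk)
{
	if (!model_trylock(&sk->state_lock))
		return false;	/* excluded by whoever holds state_lock */
	sk->peer = NULL;
	sk->dead = true;
	model_unlock(&sk->state_lock);
	return true;
}

/* Replays the interleaving from the diagram in a single thread. */
static struct model_result model_replay(bool hold_state_lock)
{
	int dummy_peer = 0;
	struct model_sock sk = {
		.owner_lock = ATOMIC_FLAG_INIT,
		.state_lock = ATOMIC_FLAG_INIT,
		.dead = false,
		.peer = &dummy_peer,
	};
	struct model_result res;
	bool locked;

	locked = model_trylock(&sk.owner_lock);	/* iter: lock_sock(sk) */
	assert(locked);
	if (hold_state_lock) {
		locked = model_trylock(&sk.state_lock);
		assert(locked);
	}

	res.check_passed = !sk.dead;			/* iter: SOCK_DEAD check */
	res.release_ran = model_release(&sk);		/* other CPU: release path */
	res.peer_null_at_prog = (sk.peer == NULL);	/* iter: prog would run here */

	if (hold_state_lock)
		model_unlock(&sk.state_lock);
	model_unlock(&sk.owner_lock);
	return res;
}
```

In this model, model_replay(false) passes the dead check, lets the release path run, and then sees a NULL peer, which is the window in the diagram. model_replay(true) keeps the release path out because both sides contend on the same lock. It says nothing about whether holding unix_state_lock() across the BPF prog is acceptable in the kernel; it only shows that two paths taking different locks do not exclude each other.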