On Fri, Nov 14, 2025 at 10:33:42AM +0100, Stefano Garzarella wrote: > On Thu, Nov 13, 2025 at 10:26:04AM -0800, Bobby Eshleman wrote: > > On Thu, Nov 13, 2025 at 04:24:44PM +0100, Stefano Garzarella wrote: > > > On Wed, Nov 12, 2025 at 10:27:18AM -0800, Bobby Eshleman wrote: > > > > On Wed, Nov 12, 2025 at 03:19:47PM +0100, Stefano Garzarella wrote: > > > > > On Tue, Nov 11, 2025 at 10:54:48PM -0800, Bobby Eshleman wrote: > > > > > > From: Bobby Eshleman <[email protected]> > > > > > > > > > > > > Add NS support to vsock loopback. Sockets in a global mode netns > > > > > > communicate with each other, regardless of namespace. Sockets in a > > > > > > local > > > > > > mode netns may only communicate with other sockets within the same > > > > > > namespace. > > > > > > > > > > > > Signed-off-by: Bobby Eshleman <[email protected]> > > > > [...] > > > > > > > > @@ -131,7 +136,41 @@ static void vsock_loopback_work(struct > > > > > > work_struct *work) > > > > > > */ > > > > > > virtio_transport_consume_skb_sent(skb, false); > > > > > > virtio_transport_deliver_tap_pkt(skb); > > > > > > - virtio_transport_recv_pkt(&loopback_transport, skb, > > > > > > NULL, 0); > > > > > > + > > > > > > + /* In the case of virtio_transport_reset_no_sock(), the > > > > > > skb > > > > > > + * does not hold a reference on the socket, and so does > > > > > > not > > > > > > + * transitively hold a reference on the net. > > > > > > + * > > > > > > + * There is an ABA race condition in this sequence: > > > > > > + * 1. the sender sends a packet > > > > > > + * 2. worker calls virtio_transport_recv_pkt(), using > > > > > > the > > > > > > + * sender's net > > > > > > + * 3. virtio_transport_recv_pkt() uses t->send_pkt() > > > > > > passing the > > > > > > + * sender's net > > > > > > + * 4. virtio_transport_recv_pkt() free's the skb, > > > > > > dropping the > > > > > > + * reference to the socket > > > > > > + * 5. the socket closes, frees its reference to the net > > > > > > + * 6. Finally, the worker for the second t->send_pkt() > > > > > > call > > > > > > + * processes the skb, and uses the now stale net > > > > > > pointer for > > > > > > + * socket lookups. > > > > > > + * > > > > > > + * To prevent this, we acquire a net reference in > > > > > > vsock_loopback_send_pkt() > > > > > > + * and hold it until virtio_transport_recv_pkt() > > > > > > completes. > > > > > > + * > > > > > > + * Additionally, we must grab a reference on the skb > > > > > > before > > > > > > + * calling virtio_transport_recv_pkt() to prevent it > > > > > > from > > > > > > + * freeing the skb before we have a chance to release > > > > > > the net. > > > > > > + */ > > > > > > + net_mode = virtio_vsock_skb_net_mode(skb); > > > > > > + net = virtio_vsock_skb_net(skb); > > > > > > > > > > Wait, we are adding those just for loopback (in theory used only for > > > > > testing/debugging)? And only to support > > > > > virtio_transport_reset_no_sock() use > > > > > case? > > > > > > > > Yes, exactly, only loopback + reset_no_sock(). The issue doesn't exist > > > > for vhost-vsock because vhost_vsock holds a net reference, and it > > > > doesn't exist for non-reset_no_sock calls because after looking up the > > > > socket we transfer skb ownership to it, which holds down the skb -> sk > > > > -> > > > > net reference chain. > > > > > > > > > > > > > > Honestly I don't like this, do we have any alternative? > > > > > > > > > > I'll also try to think something else. > > > > > > > > > > Stefano > > > > > > > > > > > > I've been thinking about this all morning... maybe > > > > we can do something like this: > > > > > > > > ``` > > > > > > > > virtio_transport_recv_pkt(..., struct sock *reply_sk) {... } > > > > > > > > virtio_transport_reset_no_sock(..., reply_sk) > > > > { > > > > if (reply_sk) > > > > skb_set_owner_sk_safe(reply, reply_sk) > > > > > > Interesting, but what about if we call skb_set_owner_sk_safe() in > > > vsock_loopback.c just before calling virtio_transport_recv_pkt() for every > > > skb? > > > > I think the issue with this is that at the time vsock_loopback calls > > virtio_transport_recv_pkt() the reply skb hasn't yet been allocated by > > virtio_transport_reset_no_sock() and we can't wait for it to return > > because the original skb may be freed by then. > > Right! > > > > > We might be able to keep it all in vsock_loopback if we removed the need > > to use the original skb or sk by just using the net. But to do that we > > would need to add a netns_tracker per net somewhere. I guess that would > > end up in a list or hashmap in struct vsock_loopback. > > > > Another option that does simplify a little, but unfortunately still doesn't > > keep > > everything in loopback: > > > > @@ -1205,7 +1205,7 @@ static int virtio_transport_reset_no_sock(const > > struct virtio_transport *t, > > if (!reply) > > return -ENOMEM; > > > > - return t->send_pkt(reply, net, net_mode); > > + return t->send_pkt(reply, net, net_mode, skb->sk); > > } > > > > @@ -27,11 +27,16 @@ static u32 vsock_loopback_get_local_cid(void) > > } > > > > static int vsock_loopback_send_pkt(struct sk_buff *skb, struct net *net, > > - enum vsock_net_mode net_mode) > > + enum vsock_net_mode net_mode, > > + struct sock *rst_owner) > > { > > struct vsock_loopback *vsock = &the_vsock_loopback; > > int len = skb->len; > > > > + if (!skb->sk && rst_owner) > > + WARN_ONCE(!skb_set_owner_sk_safe(skb, rst_owner), > > + "loopback socket has sk_refcnt == 0\n"); > > + > > This doesn't seem too bad IMO, but at this point, why we can't do that > in virtio_transport_reset_no_sock() for any kind of transport? > > I mean, in any case the RST packet should be handled by the same net of the > "sender", no? > > At this point, can we just put the `vsk` of the sender in the `info` and > virtio_transport_alloc_skb() will already do that. > > WDYT? > Am I missing something?
This is the right answer... I'm pretty sure this works out-of-the-box for all transports. I'll implement it and report back with a new rev if all good or come back to this thread to discuss if any issues arise. Have a good weekend! Best, Bobby
