On 05/06/2026 18:08, David Laight wrote: > On Fri, 5 Jun 2026 14:53:14 +0300 > Arseniy Krasnov <[email protected]> wrote: > >> Logically it was based on TCP implementation, so to make further >> support easier, rewrite it in the TCP way. >> >> Signed-off-by: Arseniy Krasnov <[email protected]> >> --- >> net/vmw_vsock/virtio_transport_common.c | 64 ++++++++++++------------- >> 1 file changed, 32 insertions(+), 32 deletions(-) >> >> diff --git a/net/vmw_vsock/virtio_transport_common.c >> b/net/vmw_vsock/virtio_transport_common.c >> index 2fd9eaaf5ca6..00caeeaa5590 100644 >> --- a/net/vmw_vsock/virtio_transport_common.c >> +++ b/net/vmw_vsock/virtio_transport_common.c >> @@ -73,10 +73,13 @@ static bool virtio_transport_can_zcopy(const struct >> virtio_transport *t_ops, >> static int virtio_transport_fill_skb(struct sk_buff *skb, >> struct virtio_vsock_pkt_info *info, >> size_t len, >> - bool zcopy) >> + bool zcopy, struct ubuf_info *uarg) >> { >> struct msghdr *msg = info->msg; >> >> + /* We have completion - attach it to 'skb'. */ >> + skb_zcopy_set(skb, uarg, NULL); >> + >> if (zcopy) >> return __zerocopy_sg_from_iter(msg, NULL, skb, >> &msg->msg_iter, len, NULL); >> @@ -208,7 +211,8 @@ static struct sk_buff *virtio_transport_alloc_skb(struct >> virtio_vsock_pkt_info * >> u32 src_cid, >> u32 src_port, >> u32 dst_cid, >> - u32 dst_port) >> + u32 dst_port, >> + struct ubuf_info *uarg) >> { >> struct vsock_sock *vsk; >> struct sk_buff *skb; >> @@ -245,7 +249,7 @@ static struct sk_buff *virtio_transport_alloc_skb(struct >> virtio_vsock_pkt_info * >> if (info->msg && payload_len > 0) { >> int err; >> >> - err = virtio_transport_fill_skb(skb, info, payload_len, zcopy); >> + err = virtio_transport_fill_skb(skb, info, payload_len, zcopy, >> uarg); >> if (err) >> goto out; >> >> @@ -321,38 +325,36 @@ static int virtio_transport_send_pkt_info(struct >> vsock_sock *vsk, >> if (pkt_len == 0 && info->op == VIRTIO_VSOCK_OP_RW) >> return pkt_len; >> >> - if (info->msg) { >> - /* If zerocopy is not enabled by 'setsockopt()', we behave as >> - * there is no MSG_ZEROCOPY flag set. >> + if (info->msg && (info->msg->msg_flags & MSG_ZEROCOPY)) { >> + /* If 'info->msg' is not NULL, this is only VIRTIO_VSOCK_OP_RW. >> + * 'MSG_ZEROCOPY' flag handling here is based on the same flag >> + * handling from 'tcp_sendmsg_locked()'. >> */ >> - if (!sock_flag(sk_vsock(vsk), SOCK_ZEROCOPY)) >> - info->msg->msg_flags &= ~MSG_ZEROCOPY; >> + if (info->msg->msg_ubuf) { >> + uarg = info->msg->msg_ubuf; >> + can_zcopy = virtio_transport_can_zcopy(t_ops, info, >> pkt_len); >> + } else if (sock_flag(sk_vsock(vsk), SOCK_ZEROCOPY)) { >> + uarg = msg_zerocopy_realloc(sk_vsock(vsk), pkt_len, >> + NULL, false); >> + if (!uarg) { >> + virtio_transport_put_credit(vvs, pkt_len); >> + return -ENOMEM; >> + } >> >> - if (info->msg->msg_flags & MSG_ZEROCOPY) >> can_zcopy = virtio_transport_can_zcopy(t_ops, info, >> pkt_len); >> >> + if (!can_zcopy) >> + uarg_to_msgzc(uarg)->zerocopy = 0; >> + >> + have_uref = true; >> + } >> + >> + /* 'can_zcopy' means that this transmission will be >> + * in zerocopy way (e.g. using 'frags' array). >> + */ > I've not looked at the tcp code, but the above doesn't look right. > I don't see why msg->msg_ubuf might be non-NULL without SOCK_ZEROCOPY set. > That would give the outer code a callback when the last skb is freed but > still copy the data.
Hi, I guess case when 'msg->msg_ubuf' is non-NULL is special case today for io_uring MSG_ZEROCOPY implementation. It was added here https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=eb315a7d1396b1139fc7daea55f2d3191e8e7092 As I see implementation of its tests in tools/testing/selftests/net/io_uring_zerocopy_tx.c , it doesn't require setting SOCK_ZEROCOPY option for socket, so for virtio vsock case I just copied same logic to maintain compatibility, because there is MSG_ZEROCOPY io_uring test for virtio/vsock. > > I also don't see the point of calling msg_zerocopy_realloc() to get a > callback when the last skb is freed and then setting > uarg_to_msgzc(uarg)->zerocopy = 0; > so that the callback doesn't actually do anything. > It isn't as though you 'find out' later on that you can't actually do > zerocopy. Sorry, what do You mean "last skb" ? In this code we first allocate uarg (allocate, because third arg is always NULL). Then in loop we allocate sk_buffs, fill it with data and send. I mean first/last skb will be freed after uarg is already allocated and we don't touch it. I think i didn't understand Your question here. > >> if (can_zcopy) >> max_skb_len = min_t(u32, VIRTIO_VSOCK_MAX_PKT_BUF_SIZE, >> (MAX_SKB_FRAGS * PAGE_SIZE)); >> - >> - if (info->msg->msg_flags & MSG_ZEROCOPY && >> - info->op == VIRTIO_VSOCK_OP_RW) { >> - uarg = info->msg->msg_ubuf; >> - >> - if (!uarg) { >> - uarg = msg_zerocopy_realloc(sk_vsock(vsk), >> - pkt_len, NULL, >> false); >> - if (!uarg) { >> - virtio_transport_put_credit(vvs, >> pkt_len); >> - return -ENOMEM; >> - } >> - >> - if (!can_zcopy) >> - uarg_to_msgzc(uarg)->zerocopy = 0; >> - >> - have_uref = true; >> - } >> - } >> } >> >> rest_len = pkt_len; >> @@ -365,14 +367,12 @@ static int virtio_transport_send_pkt_info(struct >> vsock_sock *vsk, >> >> skb = virtio_transport_alloc_skb(info, skb_len, can_zcopy, >> src_cid, src_port, >> - dst_cid, dst_port); >> + dst_cid, dst_port, uarg); >> if (!skb) { >> ret = -ENOMEM; >> break; >> } >> >> - skb_zcopy_set(skb, uarg, NULL); > Aren't you passing uarg through two function calls instead of doing it here. > Doesn't even make it clearer what is going on. Agree, to simplify patch, uarg could be set earlier (without passing it to functions) I guess. Thanks > > -- David > >> - >> virtio_transport_inc_tx_pkt(vvs, skb); >> >> ret = t_ops->send_pkt(skb, info->net); >> @@ -1178,7 +1178,7 @@ static int virtio_transport_reset_no_sock(const struct >> virtio_transport *t, >> le64_to_cpu(hdr->dst_cid), >> le32_to_cpu(hdr->dst_port), >> le64_to_cpu(hdr->src_cid), >> - le32_to_cpu(hdr->src_port)); >> + le32_to_cpu(hdr->src_port), NULL); >> if (!reply) >> return -ENOMEM; >>

