Re: [RFC PATCH v4 07/17] af_vsock: rest of SEQPACKET support

2021-02-11 Thread Stefano Garzarella

On Sun, Feb 07, 2021 at 06:16:12PM +0300, Arseny Krasnov wrote:

This adds the rest of SOCK_SEQPACKET support:
1) Adds socket ops for SEQPACKET type.
2) Allows creating a socket with SEQPACKET type.

Signed-off-by: Arseny Krasnov 
---
net/vmw_vsock/af_vsock.c | 37 -
1 file changed, 36 insertions(+), 1 deletion(-)

diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index a033d3340ac4..c77998a14018 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -452,6 +452,7 @@ int vsock_assign_transport(struct vsock_sock *vsk, struct 
vsock_sock *psk)
new_transport = transport_dgram;
break;
case SOCK_STREAM:
+   case SOCK_SEQPACKET:
if (vsock_use_local_transport(remote_cid))
new_transport = transport_local;
else if (remote_cid <= VMADDR_CID_HOST || !transport_h2g ||
@@ -459,6 +460,15 @@ int vsock_assign_transport(struct vsock_sock *vsk, struct 
vsock_sock *psk)
new_transport = transport_g2h;
else
new_transport = transport_h2g;
+
+   if (sk->sk_type == SOCK_SEQPACKET) {
+   if (!new_transport ||
+   !new_transport->seqpacket_seq_send_len ||
+   !new_transport->seqpacket_seq_send_eor ||
+   !new_transport->seqpacket_seq_get_len ||
+   !new_transport->seqpacket_dequeue)
+   return -ESOCKTNOSUPPORT;
+   }


Maybe we should move this check after the try_module_get() call, since 
the memory pointed to by the 'new_transport' pointer can be deallocated 
in the meantime.


Also, if the socket had a transport before, we should deassign it before 
returning an error.
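The ordering suggested in the two comments above can be sketched as a
small userspace model. This is only an illustration: 'model_transport',
'model_sock' and the model_*() helpers are hypothetical stand-ins, with a
plain counter in place of the module reference and a single flag in place
of the four seqpacket_* callbacks.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real API. */
struct model_transport {
	int refcount;           /* models the module reference */
	bool has_seqpacket_ops; /* models the four seqpacket_* callbacks */
};

struct model_sock {
	bool is_seqpacket;
	struct model_transport *transport; /* currently assigned, or NULL */
};

static bool model_get(struct model_transport *t)
{
	t->refcount++;
	return true;
}

static void model_put(struct model_transport *t)
{
	assert(t->refcount > 0);
	t->refcount--;
}

static void model_deassign(struct model_sock *s)
{
	if (!s->transport)
		return;
	model_put(s->transport);
	s->transport = NULL;
}

int model_assign_transport(struct model_sock *s, struct model_transport *new_t)
{
	/* Drop any previous transport first, so that no error path below
	 * leaves the socket attached to a stale one.
	 */
	model_deassign(s);

	if (!new_t || !model_get(new_t))
		return -ENODEV;

	/* The callbacks are inspected only while the reference is held. */
	if (s->is_seqpacket && !new_t->has_seqpacket_ops) {
		model_put(new_t);
		return -ESOCKTNOSUPPORT;
	}

	s->transport = new_t;
	return 0;
}
```

In this model a failed assignment leaves the socket without a transport
and with no reference leaked, which is the property both comments ask for.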



break;
default:
return -ESOCKTNOSUPPORT;
@@ -684,6 +694,7 @@ static int __vsock_bind(struct sock *sk, struct sockaddr_vm 
*addr)

switch (sk->sk_socket->type) {
case SOCK_STREAM:
+   case SOCK_SEQPACKET:
spin_lock_bh(&vsock_table_lock);
retval = __vsock_bind_connectible(vsk, addr);
spin_unlock_bh(&vsock_table_lock);
@@ -769,7 +780,7 @@ static struct sock *__vsock_create(struct net *net,

static bool sock_type_connectible(u16 type)
{
-   return type == SOCK_STREAM;
+   return (type == SOCK_STREAM) || (type == SOCK_SEQPACKET);
}

static void __vsock_release(struct sock *sk, int level)
@@ -2199,6 +2210,27 @@ static const struct proto_ops vsock_stream_ops = {
.sendpage = sock_no_sendpage,
};

+static const struct proto_ops vsock_seqpacket_ops = {
+   .family = PF_VSOCK,
+   .owner = THIS_MODULE,
+   .release = vsock_release,
+   .bind = vsock_bind,
+   .connect = vsock_connect,
+   .socketpair = sock_no_socketpair,
+   .accept = vsock_accept,
+   .getname = vsock_getname,
+   .poll = vsock_poll,
+   .ioctl = sock_no_ioctl,
+   .listen = vsock_listen,
+   .shutdown = vsock_shutdown,
+   .setsockopt = vsock_connectible_setsockopt,
+   .getsockopt = vsock_connectible_getsockopt,
+   .sendmsg = vsock_connectible_sendmsg,
+   .recvmsg = vsock_connectible_recvmsg,
+   .mmap = sock_no_mmap,
+   .sendpage = sock_no_sendpage,
+};
+
static int vsock_create(struct net *net, struct socket *sock,
int protocol, int kern)
{
@@ -2219,6 +2251,9 @@ static int vsock_create(struct net *net, struct socket 
*sock,
case SOCK_STREAM:
sock->ops = &vsock_stream_ops;
break;
+   case SOCK_SEQPACKET:
+   sock->ops = &vsock_seqpacket_ops;
+   break;
default:
return -ESOCKTNOSUPPORT;
}
--
2.25.1



___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [RFC PATCH v4 08/17] af_vsock: update comments for stream sockets

2021-02-11 Thread Stefano Garzarella

On Sun, Feb 07, 2021 at 06:16:29PM +0300, Arseny Krasnov wrote:

This replaces 'stream' with 'connect oriented' in comments as SEQPACKET is
also connect oriented.


I'm not a native speaker, but maybe 'connection oriented' is better; 
looking at the socket(2) man page, 'connection-based' is also fine.


Thanks,
Stefano



Signed-off-by: Arseny Krasnov 
---
net/vmw_vsock/af_vsock.c | 31 +--
1 file changed, 17 insertions(+), 14 deletions(-)

diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index c77998a14018..6e5e192cb703 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -415,8 +415,8 @@ static void vsock_deassign_transport(struct vsock_sock *vsk)

/* Assign a transport to a socket and call the .init transport callback.
 *
- * Note: for stream socket this must be called when vsk->remote_addr is set
- * (e.g. during the connect() or when a connection request on a listener
+ * Note: for connect oriented socket this must be called when vsk->remote_addr
+ * is set (e.g. during the connect() or when a connection request on a listener
 * socket is received).
 * The vsk->remote_addr is used to decide which transport to use:
 *  - remote CID == VMADDR_CID_LOCAL or g2h->local_cid or VMADDR_CID_HOST if
@@ -479,10 +479,10 @@ int vsock_assign_transport(struct vsock_sock *vsk, struct 
vsock_sock *psk)
return 0;

/* transport->release() must be called with sock lock acquired.
-* This path can only be taken during vsock_stream_connect(),
-* where we have already held the sock lock.
-* In the other cases, this function is called on a new socket
-* which is not assigned to any transport.
+* This path can only be taken during vsock_connect(), where we
+* have already held the sock lock. In the other cases, this
+* function is called on a new socket which is not assigned to
+* any transport.
 */
vsk->transport->release(vsk);
vsock_deassign_transport(vsk);
@@ -659,9 +659,10 @@ static int __vsock_bind_connectible(struct vsock_sock *vsk,

vsock_addr_init(&vsk->local_addr, new_addr.svm_cid, new_addr.svm_port);

-   /* Remove stream sockets from the unbound list and add them to the hash
-* table for easy lookup by its address.  The unbound list is simply an
-* extra entry at the end of the hash table, a trick used by AF_UNIX.
+   /* Remove connect oriented sockets from the unbound list and add them
+* to the hash table for easy lookup by its address.  The unbound list
+* is simply an extra entry at the end of the hash table, a trick used
+* by AF_UNIX.
 */
__vsock_remove_bound(vsk);
__vsock_insert_bound(vsock_bound_sockets(&vsk->local_addr), vsk);
@@ -952,10 +953,10 @@ static int vsock_shutdown(struct socket *sock, int mode)
if ((mode & ~SHUTDOWN_MASK) || !mode)
return -EINVAL;

-   /* If this is a STREAM socket and it is not connected then bail out
-* immediately.  If it is a DGRAM socket then we must first kick the
-* socket so that it wakes up from any sleeping calls, for example
-* recv(), and then afterwards return the error.
+   /* If this is a connect oriented socket and it is not connected then
+* bail out immediately.  If it is a DGRAM socket then we must first
+* kick the socket so that it wakes up from any sleeping calls, for
+* example recv(), and then afterwards return the error.
 */

sk = sock->sk;
@@ -1786,7 +1787,9 @@ static int vsock_connectible_sendmsg(struct socket *sock, 
struct msghdr *msg,

transport = vsk->transport;

-   /* Callers should not provide a destination with stream sockets. */
+   /* Callers should not provide a destination with connect oriented
+* sockets.
+*/
if (msg->msg_namelen) {
err = sk->sk_state == TCP_ESTABLISHED ? -EISCONN : -EOPNOTSUPP;
goto out;
--
2.25.1





Re: [RFC PATCH v4 09/17] virtio/vsock: dequeue callback for SOCK_SEQPACKET

2021-02-11 Thread Stefano Garzarella

On Sun, Feb 07, 2021 at 06:16:46PM +0300, Arseny Krasnov wrote:

This adds a transport callback and its logic for SEQPACKET dequeue.
The callback fetches RW packets from the rx queue of the socket until the
whole record is copied (if the user's buffer is full, the user is not woken
up). This is done to not stall the sender: if we wake up the user and it
leaves the syscall, nobody will send a credit update for the rest of the
record, and the sender will wait for the next entry into the read syscall
on the receiver's side. So if the user buffer is full, we just send a
credit update and drop the data. If SEQ_BEGIN is found during the copy
(and not all data was copied), copying is restarted by resetting the
user's iov iterator (the previous unfinished data is dropped).
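The "copy what fits, drop the rest, but consume everything" behaviour
described above can be sketched in userspace C. This is a simplified model
and not the patch's code: 'model_result' and 'model_dequeue_record' are
hypothetical names, and locking, credit updates and the SEQ_BEGIN restart
are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type: bytes copied to the user and bytes discarded. */
struct model_result {
	size_t copied;
	size_t dropped;
};

/* Consume every chunk of one record. Bytes that fit in the user buffer are
 * copied; the remainder is counted as dropped but still consumed, so that
 * the receiver could return credit for the whole record.
 */
struct model_result model_dequeue_record(const char *const *chunks,
					 const size_t *chunk_lens,
					 size_t nchunks,
					 char *user_buf, size_t user_len)
{
	struct model_result r = { 0, 0 };
	size_t i;

	assert(chunks && chunk_lens);

	for (i = 0; i < nchunks; i++) {
		size_t room = user_len - r.copied;
		size_t take = chunk_lens[i] < room ? chunk_lens[i] : room;

		if (take)
			memcpy(user_buf + r.copied, chunks[i], take);

		r.copied += take;
		r.dropped += chunk_lens[i] - take;
	}

	return r;
}
```

For a 10-byte record split into chunks of 4, 4 and 2 bytes and a 6-byte
user buffer, the model copies 6 bytes and drops 4, yet walks all three
chunks.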

Signed-off-by: Arseny Krasnov 
---
include/linux/virtio_vsock.h|   5 +
include/uapi/linux/virtio_vsock.h   |  16 
net/vmw_vsock/virtio_transport_common.c | 120 
3 files changed, 141 insertions(+)

diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
index dc636b727179..4d0de3dee9a4 100644
--- a/include/linux/virtio_vsock.h
+++ b/include/linux/virtio_vsock.h
@@ -36,6 +36,11 @@ struct virtio_vsock_sock {
u32 rx_bytes;
u32 buf_alloc;
struct list_head rx_queue;
+
+   /* For SOCK_SEQPACKET */
+   u32 user_read_seq_len;
+   u32 user_read_copied;
+   u32 curr_rx_msg_cnt;
};

struct virtio_vsock_pkt {
diff --git a/include/uapi/linux/virtio_vsock.h 
b/include/uapi/linux/virtio_vsock.h
index 1d57ed3d84d2..cf9c165e5cca 100644
--- a/include/uapi/linux/virtio_vsock.h
+++ b/include/uapi/linux/virtio_vsock.h
@@ -63,8 +63,14 @@ struct virtio_vsock_hdr {
__le32  fwd_cnt;
} __attribute__((packed));

+struct virtio_vsock_seq_hdr {
+   __le32  msg_cnt;
+   __le32  msg_len;
+} __attribute__((packed));
+
enum virtio_vsock_type {
VIRTIO_VSOCK_TYPE_STREAM = 1,
+   VIRTIO_VSOCK_TYPE_SEQPACKET = 2,
};

enum virtio_vsock_op {
@@ -83,6 +89,11 @@ enum virtio_vsock_op {
VIRTIO_VSOCK_OP_CREDIT_UPDATE = 6,
/* Request the peer to send the credit info to us */
VIRTIO_VSOCK_OP_CREDIT_REQUEST = 7,
+
+   /* Record begin for SOCK_SEQPACKET */
+   VIRTIO_VSOCK_OP_SEQ_BEGIN = 8,
+   /* Record end for SOCK_SEQPACKET */
+   VIRTIO_VSOCK_OP_SEQ_END = 9,
};

/* VIRTIO_VSOCK_OP_SHUTDOWN flags values */
@@ -91,4 +102,9 @@ enum virtio_vsock_shutdown {
VIRTIO_VSOCK_SHUTDOWN_SEND = 2,
};

+/* VIRTIO_VSOCK_OP_RW flags values */
+enum virtio_vsock_rw {
+   VIRTIO_VSOCK_RW_EOR = 1,
+};
+
#endif /* _UAPI_LINUX_VIRTIO_VSOCK_H */
diff --git a/net/vmw_vsock/virtio_transport_common.c 
b/net/vmw_vsock/virtio_transport_common.c
index 5956939eebb7..4572d01c8ea5 100644
--- a/net/vmw_vsock/virtio_transport_common.c
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -397,6 +397,126 @@ virtio_transport_stream_do_dequeue(struct vsock_sock *vsk,
return err;
}

+static inline void virtio_transport_remove_pkt(struct virtio_vsock_pkt *pkt)
+{
+   list_del(&pkt->list);
+   virtio_transport_free_pkt(pkt);
+}
+
+static size_t virtio_transport_drop_until_seq_begin(struct virtio_vsock_sock 
*vvs)
+{


This function is not used here, but in the next patch, so I'd add this 
with the next patch.



+   struct virtio_vsock_pkt *pkt, *n;
+   size_t bytes_dropped = 0;
+
+   list_for_each_entry_safe(pkt, n, &vvs->rx_queue, list) {
+   if (le16_to_cpu(pkt->hdr.op) == VIRTIO_VSOCK_OP_SEQ_BEGIN)
+   break;
+
+   bytes_dropped += le32_to_cpu(pkt->hdr.len);
+   virtio_transport_dec_rx_pkt(vvs, pkt);
+   virtio_transport_remove_pkt(pkt);
+   }
+
+   return bytes_dropped;
+}
+
+static int virtio_transport_seqpacket_do_dequeue(struct vsock_sock *vsk,
+struct msghdr *msg,
+bool *msg_ready)
+{


Also this function is not used, maybe you can add in this patch the 
virtio_transport_seqpacket_dequeue() implementation.



+   struct virtio_vsock_sock *vvs = vsk->trans;
+   struct virtio_vsock_pkt *pkt;
+   int err = 0;
+   size_t user_buf_len = msg->msg_iter.count;
+
+   *msg_ready = false;
+   spin_lock_bh(&vvs->rx_lock);
+
+   while (!*msg_ready && !list_empty(&vvs->rx_queue) && !err) {
+   pkt = list_first_entry(&vvs->rx_queue, struct virtio_vsock_pkt, list);
+
+   switch (le16_to_cpu(pkt->hdr.op)) {
+   case VIRTIO_VSOCK_OP_SEQ_BEGIN: {
+   /* Unexpected 'SEQ_BEGIN' during record copy:
+* Leave receive loop, 'EAGAIN' will restart it from
+* outer receive loop, packet is still in queue and
+* counters are cleared. So in next loop enter,
+* 'SEQ_BEGIN' will be dequeued first. User's iov
+* iterator will be reset in outer loop. Also
+* send 

Re: [RFC PATCH v4 10/17] virtio/vsock: fetch length for SEQPACKET record

2021-02-11 Thread Stefano Garzarella

On Sun, Feb 07, 2021 at 06:17:08PM +0300, Arseny Krasnov wrote:

This adds transport callback which tries to fetch record begin marker
from socket's rx queue. It is called from af_vsock.c before reading data
packets of record.

Signed-off-by: Arseny Krasnov 
---
include/linux/virtio_vsock.h|  1 +
net/vmw_vsock/virtio_transport_common.c | 40 +
2 files changed, 41 insertions(+)

diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
index 4d0de3dee9a4..a5e8681bfc6a 100644
--- a/include/linux/virtio_vsock.h
+++ b/include/linux/virtio_vsock.h
@@ -85,6 +85,7 @@ virtio_transport_dgram_dequeue(struct vsock_sock *vsk,
   struct msghdr *msg,
   size_t len, int flags);

+size_t virtio_transport_seqpacket_seq_get_len(struct vsock_sock *vsk);
s64 virtio_transport_stream_has_data(struct vsock_sock *vsk);
s64 virtio_transport_stream_has_space(struct vsock_sock *vsk);

diff --git a/net/vmw_vsock/virtio_transport_common.c 
b/net/vmw_vsock/virtio_transport_common.c
index 4572d01c8ea5..7ac552bfd90b 100644
--- a/net/vmw_vsock/virtio_transport_common.c
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -420,6 +420,46 @@ static size_t virtio_transport_drop_until_seq_begin(struct 
virtio_vsock_sock *vv
return bytes_dropped;
}

+size_t virtio_transport_seqpacket_seq_get_len(struct vsock_sock *vsk)
+{
+   struct virtio_vsock_seq_hdr *seq_hdr;
+   struct virtio_vsock_sock *vvs;
+   struct virtio_vsock_pkt *pkt;
+   size_t bytes_dropped;
+
+   vvs = vsk->trans;
+
+   spin_lock_bh(&vvs->rx_lock);
+
+   /* Fetch all orphaned 'RW', packets, and
+* send credit update.


Single line?


+*/
+   bytes_dropped = virtio_transport_drop_until_seq_begin(vvs);
+
+   if (list_empty(&vvs->rx_queue))
+   goto out;
+
+   pkt = list_first_entry(&vvs->rx_queue, struct virtio_vsock_pkt, list);
+
+   vvs->user_read_copied = 0;
+
+   seq_hdr = (struct virtio_vsock_seq_hdr *)pkt->buf;
+   vvs->user_read_seq_len = le32_to_cpu(seq_hdr->msg_len);
+   vvs->curr_rx_msg_cnt = le32_to_cpu(seq_hdr->msg_cnt);
+   virtio_transport_dec_rx_pkt(vvs, pkt);
+   virtio_transport_remove_pkt(pkt);
+out:
+   spin_unlock_bh(&vvs->rx_lock);
+
+   if (bytes_dropped)
+   virtio_transport_send_credit_update(vsk,
+   VIRTIO_VSOCK_TYPE_SEQPACKET,
+   NULL);
+
+   return vvs->user_read_seq_len;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_seqpacket_seq_get_len);
+
static int virtio_transport_seqpacket_do_dequeue(struct vsock_sock *vsk,
 struct msghdr *msg,
 bool *msg_ready)
--
2.25.1





Re: [RFC PATCH v4 09/17] virtio/vsock: dequeue callback for SOCK_SEQPACKET

2021-02-11 Thread Stefano Garzarella

On Thu, Feb 11, 2021 at 02:54:28PM +0100, Stefano Garzarella wrote:

On Sun, Feb 07, 2021 at 06:16:46PM +0300, Arseny Krasnov wrote:

This adds a transport callback and its logic for SEQPACKET dequeue.
The callback fetches RW packets from the rx queue of the socket until the
whole record is copied (if the user's buffer is full, the user is not woken
up). This is done to not stall the sender: if we wake up the user and it
leaves the syscall, nobody will send a credit update for the rest of the
record, and the sender will wait for the next entry into the read syscall
on the receiver's side. So if the user buffer is full, we just send a
credit update and drop the data. If SEQ_BEGIN is found during the copy
(and not all data was copied), copying is restarted by resetting the
user's iov iterator (the previous unfinished data is dropped).

Signed-off-by: Arseny Krasnov 
---
include/linux/virtio_vsock.h|   5 +
include/uapi/linux/virtio_vsock.h   |  16 
net/vmw_vsock/virtio_transport_common.c | 120 
3 files changed, 141 insertions(+)

diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
index dc636b727179..4d0de3dee9a4 100644
--- a/include/linux/virtio_vsock.h
+++ b/include/linux/virtio_vsock.h
@@ -36,6 +36,11 @@ struct virtio_vsock_sock {
u32 rx_bytes;
u32 buf_alloc;
struct list_head rx_queue;
+
+   /* For SOCK_SEQPACKET */
+   u32 user_read_seq_len;
+   u32 user_read_copied;
+   u32 curr_rx_msg_cnt;
};

struct virtio_vsock_pkt {
diff --git a/include/uapi/linux/virtio_vsock.h 
b/include/uapi/linux/virtio_vsock.h
index 1d57ed3d84d2..cf9c165e5cca 100644
--- a/include/uapi/linux/virtio_vsock.h
+++ b/include/uapi/linux/virtio_vsock.h
@@ -63,8 +63,14 @@ struct virtio_vsock_hdr {
__le32  fwd_cnt;
} __attribute__((packed));

+struct virtio_vsock_seq_hdr {
+   __le32  msg_cnt;


Maybe 'msg_id' is a better name for this field, since we use it to 
identify a message. Whether we then use a counter or a random number is, 
I think, just an implementation detail.


As Michael said, perhaps this detail should be discussed in the proposal 
for VIRTIO spec changes.



+   __le32  msg_len;
+} __attribute__((packed));
+
enum virtio_vsock_type {
VIRTIO_VSOCK_TYPE_STREAM = 1,
+   VIRTIO_VSOCK_TYPE_SEQPACKET = 2,
};

enum virtio_vsock_op {
@@ -83,6 +89,11 @@ enum virtio_vsock_op {
VIRTIO_VSOCK_OP_CREDIT_UPDATE = 6,
/* Request the peer to send the credit info to us */
VIRTIO_VSOCK_OP_CREDIT_REQUEST = 7,
+
+   /* Record begin for SOCK_SEQPACKET */
+   VIRTIO_VSOCK_OP_SEQ_BEGIN = 8,
+   /* Record end for SOCK_SEQPACKET */
+   VIRTIO_VSOCK_OP_SEQ_END = 9,
};

/* VIRTIO_VSOCK_OP_SHUTDOWN flags values */
@@ -91,4 +102,9 @@ enum virtio_vsock_shutdown {
VIRTIO_VSOCK_SHUTDOWN_SEND = 2,
};

+/* VIRTIO_VSOCK_OP_RW flags values */
+enum virtio_vsock_rw {
+   VIRTIO_VSOCK_RW_EOR = 1,
+};
+
#endif /* _UAPI_LINUX_VIRTIO_VSOCK_H */
diff --git a/net/vmw_vsock/virtio_transport_common.c 
b/net/vmw_vsock/virtio_transport_common.c
index 5956939eebb7..4572d01c8ea5 100644
--- a/net/vmw_vsock/virtio_transport_common.c
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -397,6 +397,126 @@ virtio_transport_stream_do_dequeue(struct vsock_sock *vsk,
return err;
}

+static inline void virtio_transport_remove_pkt(struct virtio_vsock_pkt *pkt)
+{
+   list_del(&pkt->list);
+   virtio_transport_free_pkt(pkt);
+}
+
+static size_t virtio_transport_drop_until_seq_begin(struct virtio_vsock_sock 
*vvs)
+{


This function is not used here, but in the next patch, so I'd add this 
with the next patch.



+   struct virtio_vsock_pkt *pkt, *n;
+   size_t bytes_dropped = 0;
+
+   list_for_each_entry_safe(pkt, n, &vvs->rx_queue, list) {
+   if (le16_to_cpu(pkt->hdr.op) == VIRTIO_VSOCK_OP_SEQ_BEGIN)
+   break;
+
+   bytes_dropped += le32_to_cpu(pkt->hdr.len);
+   virtio_transport_dec_rx_pkt(vvs, pkt);
+   virtio_transport_remove_pkt(pkt);
+   }
+
+   return bytes_dropped;
+}
+
+static int virtio_transport_seqpacket_do_dequeue(struct vsock_sock *vsk,
+struct msghdr *msg,
+bool *msg_ready)
+{


Also this function is not used, maybe you can add in this patch the 
virtio_transport_seqpacket_dequeue() implementation.



+   struct virtio_vsock_sock *vvs = vsk->trans;
+   struct virtio_vsock_pkt *pkt;
+   int err = 0;
+   size_t user_buf_len = msg->msg_iter.count;
+
+   *msg_ready = false;
+   spin_lock_bh(&vvs->rx_lock);
+
+   while (!*msg_ready && !list_empty(&vvs->rx_queue) && !err) {
+   pkt = list_first_entry(&vvs->rx_queue, struct virtio_vsock_pkt, list);
+
+   switch (le16_to_cpu(pkt->hdr.op)) {
+   case VIRTIO_VSOCK_OP_SEQ_BEGIN: {
+   /* Unexpected 'SEQ_BEGIN' during record copy:
+* Leave receive loop, 

Re: [RFC PATCH v4 12/17] virtio/vsock: rest of SOCK_SEQPACKET support

2021-02-11 Thread Stefano Garzarella

On Sun, Feb 07, 2021 at 06:17:44PM +0300, Arseny Krasnov wrote:

This adds rest of logic for SEQPACKET:
1) Packet's type is now set in 'virtio_send_pkt_info()' using
  type of socket.
2) SEQPACKET specific functions which send SEQ_BEGIN/SEQ_END.
  Note that both functions may sleep to wait enough space for
  SEQPACKET header.
3) SEQ_BEGIN/SEQ_END to TAP packet capture.
4) Send SHUTDOWN on socket close for SEQPACKET type.

Signed-off-by: Arseny Krasnov 
---
include/linux/virtio_vsock.h|  9 +++
net/vmw_vsock/virtio_transport_common.c | 99 +
2 files changed, 95 insertions(+), 13 deletions(-)

diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
index a5e8681bfc6a..c4a39424686d 100644
--- a/include/linux/virtio_vsock.h
+++ b/include/linux/virtio_vsock.h
@@ -41,6 +41,7 @@ struct virtio_vsock_sock {
u32 user_read_seq_len;
u32 user_read_copied;
u32 curr_rx_msg_cnt;
+   u32 next_tx_msg_cnt;
};

struct virtio_vsock_pkt {
@@ -85,7 +86,15 @@ virtio_transport_dgram_dequeue(struct vsock_sock *vsk,
   struct msghdr *msg,
   size_t len, int flags);

+int virtio_transport_seqpacket_seq_send_len(struct vsock_sock *vsk, size_t 
len, int flags);
+int virtio_transport_seqpacket_seq_send_eor(struct vsock_sock *vsk, int flags);
size_t virtio_transport_seqpacket_seq_get_len(struct vsock_sock *vsk);
+int
+virtio_transport_seqpacket_dequeue(struct vsock_sock *vsk,
+  struct msghdr *msg,
+  int flags,
+  bool *msg_ready);
+
s64 virtio_transport_stream_has_data(struct vsock_sock *vsk);
s64 virtio_transport_stream_has_space(struct vsock_sock *vsk);

diff --git a/net/vmw_vsock/virtio_transport_common.c 
b/net/vmw_vsock/virtio_transport_common.c
index 51b66f8dd7c7..0aa0fd33e9d6 100644
--- a/net/vmw_vsock/virtio_transport_common.c
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -139,6 +139,8 @@ static struct sk_buff *virtio_transport_build_skb(void 
*opaque)
break;
case VIRTIO_VSOCK_OP_CREDIT_UPDATE:
case VIRTIO_VSOCK_OP_CREDIT_REQUEST:
+   case VIRTIO_VSOCK_OP_SEQ_BEGIN:
+   case VIRTIO_VSOCK_OP_SEQ_END:
hdr->op = cpu_to_le16(AF_VSOCK_OP_CONTROL);
break;
default:
@@ -165,6 +167,14 @@ void virtio_transport_deliver_tap_pkt(struct 
virtio_vsock_pkt *pkt)
}
EXPORT_SYMBOL_GPL(virtio_transport_deliver_tap_pkt);

+static u16 virtio_transport_get_type(struct sock *sk)
+{
+   if (sk->sk_type == SOCK_STREAM)
+   return VIRTIO_VSOCK_TYPE_STREAM;
+   else
+   return VIRTIO_VSOCK_TYPE_SEQPACKET;
+}
+


Maybe add this function in this part of the file from the first patch, 
so you don't need to move it in this series.



/* This function can only be used on connecting/connected sockets,
 * since a socket assigned to a transport is required.
 *
@@ -179,6 +189,13 @@ static int virtio_transport_send_pkt_info(struct 
vsock_sock *vsk,
struct virtio_vsock_pkt *pkt;
u32 pkt_len = info->pkt_len;

+   info->type = virtio_transport_get_type(sk_vsock(vsk));


I'd make this change in another patch before this one, since it also 
touches the stream part.



+
+   if (info->type == VIRTIO_VSOCK_TYPE_SEQPACKET &&
+   info->msg &&
+   info->msg->msg_flags & MSG_EOR)
+   info->flags |= VIRTIO_VSOCK_RW_EOR;
+
t_ops = virtio_transport_get_ops(vsk);
if (unlikely(!t_ops))
return -EFAULT;
@@ -397,13 +414,61 @@ virtio_transport_stream_do_dequeue(struct vsock_sock *vsk,
return err;
}

-static u16 virtio_transport_get_type(struct sock *sk)
+static int virtio_transport_seqpacket_send_ctrl(struct vsock_sock *vsk,
+   int type,
+   size_t len,
+   int flags)
{
-   if (sk->sk_type == SOCK_STREAM)
-   return VIRTIO_VSOCK_TYPE_STREAM;
-   else
-   return VIRTIO_VSOCK_TYPE_SEQPACKET;
+   struct virtio_vsock_sock *vvs = vsk->trans;
+   struct virtio_vsock_pkt_info info = {
+   .op = type,
+   .vsk = vsk,
+   .pkt_len = sizeof(struct virtio_vsock_seq_hdr)
+   };
+
+   struct virtio_vsock_seq_hdr seq_hdr = {
+   .msg_cnt = vvs->next_tx_msg_cnt,
+   .msg_len = len
+   };
+
+   struct kvec seq_hdr_kiov = {
+   .iov_base = (void *)&seq_hdr,
+   .iov_len = sizeof(struct virtio_vsock_seq_hdr)
+   };
+
+   struct msghdr msg = {0};
+
+   //XXX: do we need 'vsock_transport_send_notify_data' pointer?
+   if (vsock_wait_space(sk_vsock(vsk),
+sizeof(struct virtio_vsock_seq_hdr),
+flags, NULL))
+   return -1;
+
+   

Re: [RFC PATCH v4 16/17] loopback/vsock: setup SEQPACKET ops for transport

2021-02-11 Thread Stefano Garzarella
Please move this patch before the test, and I'd change the prefix to 
"vsock_loopback" or "vsock/loopback".


Thanks,
Stefano

On Sun, Feb 07, 2021 at 06:18:48PM +0300, Arseny Krasnov wrote:

This adds SEQPACKET ops for loopback transport

Signed-off-by: Arseny Krasnov 
---
net/vmw_vsock/vsock_loopback.c | 5 +
1 file changed, 5 insertions(+)

diff --git a/net/vmw_vsock/vsock_loopback.c b/net/vmw_vsock/vsock_loopback.c
index a45f7ffca8c5..c0da94119f74 100644
--- a/net/vmw_vsock/vsock_loopback.c
+++ b/net/vmw_vsock/vsock_loopback.c
@@ -89,6 +89,11 @@ static struct virtio_transport loopback_transport = {
.stream_is_active = virtio_transport_stream_is_active,
.stream_allow = virtio_transport_stream_allow,

+   .seqpacket_seq_send_len   = 
virtio_transport_seqpacket_seq_send_len,
+   .seqpacket_seq_send_eor   = 
virtio_transport_seqpacket_seq_send_eor,
+   .seqpacket_seq_get_len= 
virtio_transport_seqpacket_seq_get_len,
+   .seqpacket_dequeue= virtio_transport_seqpacket_dequeue,
+
.notify_poll_in   = virtio_transport_notify_poll_in,
.notify_poll_out  = virtio_transport_notify_poll_out,
.notify_recv_init = virtio_transport_notify_recv_init,
--
2.25.1





Re: [RFC PATCH v4 17/17] virtio/vsock: simplify credit update function API

2021-02-11 Thread Stefano Garzarella

On Sun, Feb 07, 2021 at 06:19:03PM +0300, Arseny Krasnov wrote:

'virtio_transport_send_credit_update()' has some extra args:
1) 'type' may be set in 'virtio_transport_send_pkt_info()' using type
  of socket.
2) This function is static and 'hdr' arg was always NULL.



Okay, I saw this patch after my previous comment.

I think this looks good, but please move this before your changes (e.g.  
before patch 'virtio/vsock: dequeue callback for SOCK_SEQPACKET').


In this way you don't need to modify 
virtio_transport_notify_buffer_size(), calling 
virtio_transport_get_type() and then remove these changes.


It's generally not a good idea to make changes in a patch and then 
remove them a few patches later in the same series. This should ring a 
bell about moving these changes before others.


Thanks,
Stefano


Signed-off-by: Arseny Krasnov 
---
net/vmw_vsock/virtio_transport_common.c | 20 +---
1 file changed, 5 insertions(+), 15 deletions(-)

diff --git a/net/vmw_vsock/virtio_transport_common.c 
b/net/vmw_vsock/virtio_transport_common.c

index 0aa0fd33e9d6..46308679c8a4 100644
--- a/net/vmw_vsock/virtio_transport_common.c
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -286,13 +286,10 @@ void virtio_transport_put_credit(struct virtio_vsock_sock 
*vvs, u32 credit)
}
EXPORT_SYMBOL_GPL(virtio_transport_put_credit);

-static int virtio_transport_send_credit_update(struct vsock_sock *vsk,
-  int type,
-  struct virtio_vsock_hdr *hdr)
+static int virtio_transport_send_credit_update(struct vsock_sock *vsk)
{
struct virtio_vsock_pkt_info info = {
.op = VIRTIO_VSOCK_OP_CREDIT_UPDATE,
-   .type = type,
.vsk = vsk,
};

@@ -401,9 +398,7 @@ virtio_transport_stream_do_dequeue(struct vsock_sock *vsk,
 * with different values.
 */
if (free_space < VIRTIO_VSOCK_MAX_PKT_BUF_SIZE) {
-   virtio_transport_send_credit_update(vsk,
-   VIRTIO_VSOCK_TYPE_STREAM,
-   NULL);
+   virtio_transport_send_credit_update(vsk);
}

return total;
@@ -525,9 +520,7 @@ size_t virtio_transport_seqpacket_seq_get_len(struct 
vsock_sock *vsk)
spin_unlock_bh(&vvs->rx_lock);

if (bytes_dropped)
-   virtio_transport_send_credit_update(vsk,
-   VIRTIO_VSOCK_TYPE_SEQPACKET,
-   NULL);
+   virtio_transport_send_credit_update(vsk);

return vvs->user_read_seq_len;
}
@@ -624,8 +617,7 @@ static int virtio_transport_seqpacket_do_dequeue(struct 
vsock_sock *vsk,

spin_unlock_bh(&vvs->rx_lock);

-   virtio_transport_send_credit_update(vsk, VIRTIO_VSOCK_TYPE_SEQPACKET,
-   NULL);
+   virtio_transport_send_credit_update(vsk);

return err;
}
@@ -735,15 +727,13 @@ EXPORT_SYMBOL_GPL(virtio_transport_do_socket_init);
void virtio_transport_notify_buffer_size(struct vsock_sock *vsk, u64 *val)
{
struct virtio_vsock_sock *vvs = vsk->trans;
-   int type;

if (*val > VIRTIO_VSOCK_MAX_BUF_SIZE)
*val = VIRTIO_VSOCK_MAX_BUF_SIZE;

vvs->buf_alloc = *val;

-   type = virtio_transport_get_type(sk_vsock(vsk));
-   virtio_transport_send_credit_update(vsk, type, NULL);
+   virtio_transport_send_credit_update(vsk);
}
EXPORT_SYMBOL_GPL(virtio_transport_notify_buffer_size);

--
2.25.1





Re: [RFC PATCH v4 00/17] virtio/vsock: introduce SOCK_SEQPACKET support

2021-02-11 Thread Stefano Garzarella

Hi Arseny,

On Mon, Feb 08, 2021 at 09:32:59AM +0300, Arseny Krasnov wrote:


On 07.02.2021 19:20, Michael S. Tsirkin wrote:

On Sun, Feb 07, 2021 at 06:12:56PM +0300, Arseny Krasnov wrote:

This patchset implements support of SOCK_SEQPACKET for virtio
transport.
As SOCK_SEQPACKET guarantees to preserve record boundaries, two new
packet operations were added: the first marks the start of a record and
the second marks its end (SEQ_BEGIN and SEQ_END below). Both operations
also carry metadata to maintain boundaries and payload integrity. The
metadata is introduced by adding a special header with two fields,
message count and message length:

struct virtio_vsock_seq_hdr {
__le32  msg_cnt;
__le32  msg_len;
} __attribute__((packed));

This header is transmitted as the payload of SEQ_BEGIN and SEQ_END
packets (the buffer of the second virtio descriptor in the chain), in
the same way as data is transmitted in RW packets. The payload was
chosen as the buffer for this header to avoid touching the first virtio
buffer, which carries the packet header, because someone could check
that the size of this buffer is equal to the size of the packet header.
To send a record, a packet with the start marker is sent first (its
header contains the length of the record and the counter), then the
counter is incremented and all data is sent as usual 'RW' packets, and
finally SEQ_END is sent (it also carries the message counter, which is
the counter of SEQ_BEGIN + 1); after sending SEQ_END the counter is
incremented again. On the receiver's side, the length of the record is
known from the packet with the start-of-record marker. To check that no
packets were dropped by the transport, the counters of two sequential
SEQ_BEGIN and SEQ_END are checked (the counter of SEQ_END must be bigger
than the counter of SEQ_BEGIN by 1) and the length of data between the
two markers is compared to the length in the SEQ_BEGIN header.
As packets of one socket are not reordered on either the vsock or
the vhost transport layer, such markers allow the original record to be
restored on the receiver's side. If the user's buffer is smaller than
the record length, then all data that does not fit is dropped.
The maximum length of a datagram is not limited, as in a stream socket,
because the same credit logic is used. The difference from a stream
socket is that the user is not woken up until the whole record is
received or an error occurs. The implementation also supports the
'MSG_EOR' and 'MSG_TRUNC' flags.
Tests are also implemented.
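The framing described above can be sketched as a small userspace model:
a sender that wraps one record in SEQ_BEGIN/SEQ_END and a receiver check
on the counters and the length. This is only an illustration under stated
assumptions: 'model_pkt', 'model_send_record' and 'model_record_is_intact'
are hypothetical names, packets are plain structs rather than virtio
buffers, and the msg_len carried by SEQ_END is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace model only; these names are illustrative, not kernel API. */
enum model_op { MODEL_OP_RW, MODEL_OP_SEQ_BEGIN, MODEL_OP_SEQ_END };

struct model_pkt {
	enum model_op op;
	uint32_t msg_cnt; /* used by SEQ_BEGIN and SEQ_END */
	uint32_t msg_len; /* record length, carried by SEQ_BEGIN */
	uint32_t len;     /* payload length of an RW packet */
};

/* Sender: emit SEQ_BEGIN, the RW packets and SEQ_END for one record.
 * *counter is incremented after SEQ_BEGIN and again after SEQ_END.
 * Returns the number of packets written, or 0 if out[] is too small.
 */
size_t model_send_record(struct model_pkt *out, size_t out_cap,
			 uint32_t *counter, uint32_t record_len,
			 uint32_t max_pkt_len)
{
	uint32_t left = record_len;
	size_t needed, n = 0;

	assert(out && counter);
	if (max_pkt_len == 0)
		return 0;

	needed = 2 + (size_t)(((uint64_t)record_len + max_pkt_len - 1) /
			      max_pkt_len);
	if (out_cap < needed)
		return 0;

	out[n++] = (struct model_pkt){ .op = MODEL_OP_SEQ_BEGIN,
				       .msg_cnt = (*counter)++,
				       .msg_len = record_len };
	while (left > 0) {
		uint32_t chunk = left < max_pkt_len ? left : max_pkt_len;

		out[n++] = (struct model_pkt){ .op = MODEL_OP_RW, .len = chunk };
		left -= chunk;
	}
	out[n++] = (struct model_pkt){ .op = MODEL_OP_SEQ_END,
				       .msg_cnt = (*counter)++ };
	return n;
}

/* Receiver: true if pkts[0..n) is exactly one intact record. */
bool model_record_is_intact(const struct model_pkt *pkts, size_t n)
{
	uint64_t data_len = 0;
	size_t i;

	if (n < 2 || pkts[0].op != MODEL_OP_SEQ_BEGIN ||
	    pkts[n - 1].op != MODEL_OP_SEQ_END)
		return false;

	for (i = 1; i + 1 < n; i++) {
		if (pkts[i].op != MODEL_OP_RW)
			return false;
		data_len += pkts[i].len;
	}

	/* 32-bit arithmetic, so a counter wrap-around still compares equal. */
	if ((uint32_t)(pkts[0].msg_cnt + 1) != pkts[n - 1].msg_cnt)
		return false;

	return data_len == pkts[0].msg_len;
}
```

With a starting counter of 5, a 10-byte record and 4-byte packets, the
sender model emits five packets (SEQ_BEGIN with counter 5, three RW
packets, SEQ_END with counter 6) and leaves the counter at 7. Shortening
any RW packet makes the receiver check fail.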

 Arseny Krasnov (17):
  af_vsock: update functions for connectible socket
  af_vsock: separate wait data loop
  af_vsock: separate receive data loop
  af_vsock: implement SEQPACKET receive loop
  af_vsock: separate wait space loop
  af_vsock: implement send logic for SEQPACKET
  af_vsock: rest of SEQPACKET support
  af_vsock: update comments for stream sockets
  virtio/vsock: dequeue callback for SOCK_SEQPACKET
  virtio/vsock: fetch length for SEQPACKET record
  virtio/vsock: add SEQPACKET receive logic
  virtio/vsock: rest of SOCK_SEQPACKET support
  virtio/vsock: setup SEQPACKET ops for transport
  vhost/vsock: setup SEQPACKET ops for transport
  vsock_test: add SOCK_SEQPACKET tests
  loopback/vsock: setup SEQPACKET ops for transport
  virtio/vsock: simplify credit update function API

 drivers/vhost/vsock.c   |   8 +-
 include/linux/virtio_vsock.h|  15 +
 include/net/af_vsock.h  |   9 +
 include/uapi/linux/virtio_vsock.h   |  16 +
 net/vmw_vsock/af_vsock.c| 588 +++---
 net/vmw_vsock/virtio_transport.c|   5 +
 net/vmw_vsock/virtio_transport_common.c | 316 ++--
 net/vmw_vsock/vsock_loopback.c  |   5 +
 tools/testing/vsock/util.c  |  32 +-
 tools/testing/vsock/util.h  |   3 +
 tools/testing/vsock/vsock_test.c| 126 +
 11 files changed, 895 insertions(+), 228 deletions(-)

 TODO:
 - What to do when the server doesn't support SOCK_SEQPACKET. In the current
   implementation RST is replied in the same way as when the listening port
   is not found. I think that the current RST is enough, because the case
   when the server doesn't support SEQ_PACKET is the same as when the
   listener is missing (e.g. no listener in both cases).


I think it is fine.


   - virtio spec patch

Ok


Yes, please prepare a patch to discuss the VIRTIO spec changes.

For example, for 'virtio_vsock_seq_hdr', I left a comment about the
'msg_cnt' naming, which would be better discussed with the virtio guys.


Anyway, I reviewed this series and I left some comments.
I think we are in good shape :-)

Thanks,
Stefano

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH netdev] virtio-net: support XDP_TX when not more queues

2021-02-11 Thread Jesper Dangaard Brouer
On Wed, 10 Feb 2021 16:40:41 -0500
"Michael S. Tsirkin"  wrote:

> On Wed, Jan 13, 2021 at 04:08:57PM +0800, Xuan Zhuo wrote:
> > The number of queues implemented by many virtio backends is limited,
> > especially some machines have a large number of CPUs. In this case, it
> > is often impossible to allocate a separate queue for XDP_TX.
> > 
> > This patch allows XDP_TX to run by reuse the existing SQ with
> > __netif_tx_lock() hold when there are not enough queues.

I'm a little puzzled about the choice of using the netdevice TXQ
lock __netif_tx_lock() / __netif_tx_unlock().
Can you explain more about this choice?

> > 
> > Signed-off-by: Xuan Zhuo 
> > Reviewed-by: Dust Li   
> 
> I'd like to get some advice on whether this is ok from some
> XDP experts - previously my understanding was that it is
> preferable to disable XDP for such devices than
> use locks on XDP fast path.

I think it is acceptable, because the ndo_xdp_xmit / virtnet_xdp_xmit
takes a bulk of packets (currently 16).

Some drivers already do this.

It would have been nice if we could set a feature flag that allows
users to see that this driver uses locking in the XDP transmit
(ndo_xdp_xmit) function call... but it seems like a pipe dream :-P

Code related to the locking

> > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > index ba8e637..7a3b2a7 100644
> > --- a/drivers/net/virtio_net.c
> > +++ b/drivers/net/virtio_net.c
[...]
> > @@ -481,14 +484,34 @@ static int __virtnet_xdp_xmit_one(struct virtnet_info 
> > *vi,
> > return 0;
> >  }
> >  
> > -static struct send_queue *virtnet_xdp_sq(struct virtnet_info *vi)
> > +static struct send_queue *virtnet_get_xdp_sq(struct virtnet_info *vi)
> >  {
> > unsigned int qp;
> > +   struct netdev_queue *txq;
> > +
> > +   if (vi->curr_queue_pairs > nr_cpu_ids) {
> > +   qp = vi->curr_queue_pairs - vi->xdp_queue_pairs + 
> > smp_processor_id();
> > +   } else {
> > +   qp = smp_processor_id() % vi->curr_queue_pairs;
> > +   txq = netdev_get_tx_queue(vi->dev, qp);
> > +   __netif_tx_lock(txq, raw_smp_processor_id());
> > +   }
> >  
> > -   qp = vi->curr_queue_pairs - vi->xdp_queue_pairs + smp_processor_id();
> > return &vi->sq[qp];
> >  }
> >  
> > +static void virtnet_put_xdp_sq(struct virtnet_info *vi)
> > +{
> > +   unsigned int qp;
> > +   struct netdev_queue *txq;
> > +
> > +   if (vi->curr_queue_pairs <= nr_cpu_ids) {
> > +   qp = smp_processor_id() % vi->curr_queue_pairs;
> > +   txq = netdev_get_tx_queue(vi->dev, qp);
> > +   __netif_tx_unlock(txq);
> > +   }
> > +}
> > +
> >  static int virtnet_xdp_xmit(struct net_device *dev,
> > int n, struct xdp_frame **frames, u32 flags)
> >  {
> > @@ -512,7 +535,7 @@ static int virtnet_xdp_xmit(struct net_device *dev,
> > if (!xdp_prog)
> > return -ENXIO;
> >  
> > -   sq = virtnet_xdp_sq(vi);
> > +   sq = virtnet_get_xdp_sq(vi);
> >  
> > if (unlikely(flags & ~XDP_XMIT_FLAGS_MASK)) {
> > ret = -EINVAL;
> > @@ -560,12 +583,13 @@ static int virtnet_xdp_xmit(struct net_device *dev,
> > sq->stats.kicks += kicks;
> > u64_stats_update_end(&sq->stats.syncp);
> >  
> > +   virtnet_put_xdp_sq(vi);
> > return ret;
> >  }
> >  
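
The queue selection in the quoted hunk can be modelled in user space as
follows. This is only a sketch of the decision, not driver code: the
struct xdp_sq_choice type and the pick_xdp_sq() name are made up for
illustration, and curr_queue_pairs is assumed to be at least 1.

```c
#include <stdbool.h>

/* Hypothetical result type: which send queue to use and whether the
 * caller has to take the TXQ lock around the transmit.
 */
struct xdp_sq_choice {
	unsigned int qp;	/* index of the send queue */
	bool need_lock;		/* true when the queue is shared */
};

/* Mirrors the branch in the quoted virtnet_get_xdp_sq(): with more
 * queue pairs than CPUs each CPU gets a dedicated XDP queue, otherwise
 * the CPU falls back to a regular queue that must be locked.
 */
static struct xdp_sq_choice pick_xdp_sq(unsigned int curr_queue_pairs,
					unsigned int xdp_queue_pairs,
					unsigned int nr_cpu_ids,
					unsigned int cpu)
{
	struct xdp_sq_choice c;

	if (curr_queue_pairs > nr_cpu_ids) {
		c.qp = curr_queue_pairs - xdp_queue_pairs + cpu;
		c.need_lock = false;
	} else {
		c.qp = cpu % curr_queue_pairs;
		c.need_lock = true;
	}

	return c;
}
```

For example, with 8 queue pairs (4 of them for XDP) and 4 CPUs, CPU 2
ends up on dedicated queue 6 without a lock; with 2 queue pairs on 4
CPUs, CPU 3 shares queue 1 and has to take the lock.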



-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer



Re: [RFC PATCH v4 03/17] af_vsock: separate receive data loop

2021-02-11 Thread Stefano Garzarella

On Sun, Feb 07, 2021 at 06:15:05PM +0300, Arseny Krasnov wrote:

This moves STREAM specific data receive logic to dedicated function:
'__vsock_stream_recvmsg()', while checks that will be same for both
types of socket are in shared function: 'vsock_connectible_recvmsg()'.

Signed-off-by: Arseny Krasnov 
---
net/vmw_vsock/af_vsock.c | 117 +++
1 file changed, 68 insertions(+), 49 deletions(-)

diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 38927695786f..66c8a932f49b 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -1898,65 +1898,22 @@ static int vsock_wait_data(struct sock *sk, struct 
wait_queue_entry *wait,
return err;
}

-static int
-vsock_connectible_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
- int flags)
+static int __vsock_stream_recvmsg(struct sock *sk, struct msghdr *msg,
+ size_t len, int flags)
{
-   struct sock *sk;
-   struct vsock_sock *vsk;
+   struct vsock_transport_recv_notify_data recv_data;
const struct vsock_transport *transport;
-   int err;
-   size_t target;
+   struct vsock_sock *vsk;
ssize_t copied;
+   size_t target;
long timeout;
-   struct vsock_transport_recv_notify_data recv_data;
+   int err;

DEFINE_WAIT(wait);

-   sk = sock->sk;
vsk = vsock_sk(sk);
-   err = 0;
-
-   lock_sock(sk);
-
transport = vsk->transport;

-   if (!transport || sk->sk_state != TCP_ESTABLISHED) {
-   /* Recvmsg is supposed to return 0 if a peer performs an
-* orderly shutdown. Differentiate between that case and when a
-* peer has not connected or a local shutdown occured with the
-* SOCK_DONE flag.
-*/
-   if (sock_flag(sk, SOCK_DONE))
-   err = 0;
-   else
-   err = -ENOTCONN;
-
-   goto out;
-   }
-
-   if (flags & MSG_OOB) {
-   err = -EOPNOTSUPP;
-   goto out;
-   }
-
-   /* We don't check peer_shutdown flag here since peer may actually shut
-* down, but there can be data in the queue that a local socket can
-* receive.
-*/
-   if (sk->sk_shutdown & RCV_SHUTDOWN) {
-   err = 0;
-   goto out;
-   }
-
-   /* It is valid on Linux to pass in a zero-length receive buffer.  This
-* is not an error.  We may as well bail out now.
-*/
-   if (!len) {
-   err = 0;
-   goto out;
-   }
-
/* We must not copy less than target bytes into the user's buffer
 * before returning successfully, so we wait for the consume queue to
 * have that much data to consume before dequeueing.  Note that this


At the end of __vsock_stream_recvmsg() you are calling release_sock(sk) 
and it's wrong since we are releasing it in vsock_connectible_recvmsg().


Please fix it.
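
To illustrate the lock ownership rule behind this comment, here is a
toy user-space model. It assumes nothing about the real socket lock: a
plain counter stands in for lock_sock()/release_sock(), and all toy_*
names are made up for this sketch.

```c
/* Toy stand-in for struct sock: counts how often the lock is held. */
struct toy_sock {
	int lock_depth;
};

static void toy_lock_sock(struct toy_sock *sk)
{
	sk->lock_depth++;
}

static void toy_release_sock(struct toy_sock *sk)
{
	sk->lock_depth--;
}

/* Helper: runs with the lock already held and must not release it. */
static int toy_stream_recvmsg(struct toy_sock *sk, int nbytes)
{
	(void)sk;
	return nbytes;
}

/* Entry point: the only place that takes and drops the lock. */
static int toy_connectible_recvmsg(struct toy_sock *sk, int nbytes)
{
	int err = 0;

	toy_lock_sock(sk);

	/* Zero-length read: nothing to do, but still unlock on exit. */
	if (nbytes == 0)
		goto out;

	err = toy_stream_recvmsg(sk, nbytes);

out:
	toy_release_sock(sk);
	return err;
}
```

If the helper also called toy_release_sock(), lock_depth would drop
below zero after the call, which is the double release pointed out
above.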

@@ -2020,6 +1977,68 @@ vsock_connectible_recvmsg(struct socket *sock, 
struct msghdr *msg, size_t len,

return err;
}

+static int
+vsock_connectible_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
+ int flags)
+{
+   struct sock *sk;
+   struct vsock_sock *vsk;
+   const struct vsock_transport *transport;
+   int err;
+
+   DEFINE_WAIT(wait);
+
+   sk = sock->sk;
+   vsk = vsock_sk(sk);
+   err = 0;
+
+   lock_sock(sk);
+
+   transport = vsk->transport;
+
+   if (!transport || sk->sk_state != TCP_ESTABLISHED) {
+   /* Recvmsg is supposed to return 0 if a peer performs an
+* orderly shutdown. Differentiate between that case and when a
+* peer has not connected or a local shutdown occurred with the
+* SOCK_DONE flag.
+*/
+   if (sock_flag(sk, SOCK_DONE))
+   err = 0;
+   else
+   err = -ENOTCONN;
+
+   goto out;
+   }
+
+   if (flags & MSG_OOB) {
+   err = -EOPNOTSUPP;
+   goto out;
+   }
+
+   /* We don't check peer_shutdown flag here since peer may actually shut
+* down, but there can be data in the queue that a local socket can
+* receive.
+*/
+   if (sk->sk_shutdown & RCV_SHUTDOWN) {
+   err = 0;
+   goto out;
+   }
+
+   /* It is valid on Linux to pass in a zero-length receive buffer.  This
+* is not an error.  We may as well bail out now.
+*/
+   if (!len) {
+   err = 0;
+   goto out;
+   }
+
+   err = __vsock_stream_recvmsg(sk, msg, len, flags);
+
+out:
+   release_sock(sk);
+   return err;
+}
+


The rest of the patch LGTM.

Stefano


Re: [RFC PATCH v4 06/17] af_vsock: implement send logic for SEQPACKET

2021-02-11 Thread Stefano Garzarella

On Sun, Feb 07, 2021 at 06:15:57PM +0300, Arseny Krasnov wrote:

This adds some logic to current stream enqueue function for SEQPACKET
support:
1) Send record's begin/end marker.
2) Return value from enqueue function is whole record length or error
  for SOCK_SEQPACKET.

Signed-off-by: Arseny Krasnov 
---
include/net/af_vsock.h   |  2 ++
net/vmw_vsock/af_vsock.c | 22 --
2 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h
index 19f6f22821ec..198d58c4c7ee 100644
--- a/include/net/af_vsock.h
+++ b/include/net/af_vsock.h
@@ -136,6 +136,8 @@ struct vsock_transport {
bool (*stream_allow)(u32 cid, u32 port);

/* SEQ_PACKET. */
+   int (*seqpacket_seq_send_len)(struct vsock_sock *, size_t len, int 
flags);
+   int (*seqpacket_seq_send_eor)(struct vsock_sock *, int flags);


As before, we could add the identifier of the parameters.

Other than that, the patch LGTM.

Stefano


size_t (*seqpacket_seq_get_len)(struct vsock_sock *);
int (*seqpacket_dequeue)(struct vsock_sock *, struct msghdr *,
 int flags, bool *msg_ready);
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index ea99261e88ac..a033d3340ac4 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -1806,6 +1806,12 @@ static int vsock_connectible_sendmsg(struct socket 
*sock, struct msghdr *msg,
if (err < 0)
goto out;

+   if (sk->sk_type == SOCK_SEQPACKET) {
+   err = transport->seqpacket_seq_send_len(vsk, len, 
msg->msg_flags);
+   if (err < 0)
+   goto out;
+   }
+
while (total_written < len) {
ssize_t written;

@@ -1852,9 +1858,21 @@ static int vsock_connectible_sendmsg(struct socket 
*sock, struct msghdr *msg,

}

+   if (sk->sk_type == SOCK_SEQPACKET) {
+   err = transport->seqpacket_seq_send_eor(vsk, msg->msg_flags);
+   if (err < 0)
+   goto out;
+   }
+
out_err:
-   if (total_written > 0)
-   err = total_written;
+   if (total_written > 0) {
+   /* Return number of written bytes only if:
+* 1) SOCK_STREAM socket.
+* 2) SOCK_SEQPACKET socket when whole buffer is sent.
+*/
+   if (sk->sk_type == SOCK_STREAM || total_written == len)
+   err = total_written;
+   }
out:
release_sock(sk);
return err;
--
2.25.1
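
The return value rule from the last hunk can be summarised in a small
user-space model. This is a sketch under the assumption that only the
socket type, the number of bytes written and the pending error matter;
sendmsg_result() is a made-up name, not a kernel function.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical model of the 'out_err' logic in the quoted hunk:
 * a STREAM socket reports any partial write, a SEQPACKET socket
 * reports success only when the whole record was sent.
 */
static ssize_t sendmsg_result(bool is_seqpacket, ssize_t total_written,
			      size_t len, int err)
{
	if (total_written > 0) {
		if (!is_seqpacket || (size_t)total_written == len)
			return total_written;
	}

	/* Nothing written, or a partially sent record: report the error. */
	return err;
}
```

So a SEQPACKET sender that managed to queue only 10 of 100 bytes sees
the error, while a STREAM sender in the same situation sees 10.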





Re: [RFC PATCH v4 02/17] af_vsock: separate wait data loop

2021-02-11 Thread Stefano Garzarella

On Sun, Feb 07, 2021 at 06:14:48PM +0300, Arseny Krasnov wrote:

This moves wait loop for data to dedicated function, because later
it will be used by SEQPACKET data receive loop.

Signed-off-by: Arseny Krasnov 
---
net/vmw_vsock/af_vsock.c | 158 +--
1 file changed, 86 insertions(+), 72 deletions(-)

diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index f4fabec50650..38927695786f 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -1833,6 +1833,71 @@ static int vsock_connectible_sendmsg(struct socket 
*sock, struct msghdr *msg,
return err;
}

+static int vsock_wait_data(struct sock *sk, struct wait_queue_entry *wait,
+  long timeout,
+  struct vsock_transport_recv_notify_data *recv_data,
+  size_t target)
+{
+   const struct vsock_transport *transport;
+   struct vsock_sock *vsk;
+   s64 data;
+   int err;
+
+   vsk = vsock_sk(sk);
+   err = 0;
+   transport = vsk->transport;
+   prepare_to_wait(sk_sleep(sk), wait, TASK_INTERRUPTIBLE);
+
+   while ((data = vsock_stream_has_data(vsk)) == 0) {
+   if (sk->sk_err != 0 ||
+   (sk->sk_shutdown & RCV_SHUTDOWN) ||
+   (vsk->peer_shutdown & SEND_SHUTDOWN)) {
+   goto out;
+   }
+
+   /* Don't wait for non-blocking sockets. */
+   if (timeout == 0) {
+   err = -EAGAIN;
+   goto out;
+   }
+
+   if (recv_data) {
+   err = transport->notify_recv_pre_block(vsk, target, 
recv_data);
+   if (err < 0)
+   goto out;
+   }
+
+   release_sock(sk);
+   timeout = schedule_timeout(timeout);
+   lock_sock(sk);
+
+   if (signal_pending(current)) {
+   err = sock_intr_errno(timeout);
+   goto out;
+   } else if (timeout == 0) {
+   err = -EAGAIN;
+   goto out;
+   }
+   }
+
+   finish_wait(sk_sleep(sk), wait);
+
+   /* Invalid queue pair content. XXX This should
+* be changed to a connection reset in a later
+* change.
+*/
+   if (data < 0)
+   return -ENOMEM;
+
+   /* Have some data, return. */
+   if (data)
+   return data;


IIUC here data must be != 0 so you can simply return data in any case.

Or cleaner, you can do 'break' instead of 'goto out' in the error paths 
and after the while loop you can do something like this:


finish_wait(sk_sleep(sk), wait);

if (err)
return err;

if (data < 0)
return -ENOMEM;

return data;
}
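
The exit path suggested above can be checked in isolation with a small
user-space model. It assumes the loop leaves 'err' at 0 unless it
recorded an error before breaking; wait_data_result() is a made-up
name for this sketch.

```c
#include <errno.h>

/* Hypothetical model of the tail of vsock_wait_data() after the loop:
 * 'data' is the last value returned by the "has data" check (it may be
 * negative), 'err' is the error recorded when the loop was left early.
 */
static long wait_data_result(long data, int err)
{
	/* finish_wait() would be called exactly once at this point. */

	if (err)
		return err;

	if (data < 0)
		return -ENOMEM;

	return data;
}
```

A shutdown or pending socket error leaves the loop with data == 0 and
err == 0, so the model returns 0 in that case, as the 'out' label did
in the posted version.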


+
+out:
+   finish_wait(sk_sleep(sk), wait);
+   return err;
+}
+
static int
vsock_connectible_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
  int flags)
@@ -1912,85 +1977,34 @@ vsock_connectible_recvmsg(struct socket *sock, struct 
msghdr *msg, size_t len,


while (1) {
-   s64 ready;
+   ssize_t read;

-   prepare_to_wait(sk_sleep(sk), &wait, TASK_INTERRUPTIBLE);
-   ready = vsock_stream_has_data(vsk);
-
-   if (ready == 0) {
-   if (sk->sk_err != 0 ||
-   (sk->sk_shutdown & RCV_SHUTDOWN) ||
-   (vsk->peer_shutdown & SEND_SHUTDOWN)) {
-   finish_wait(sk_sleep(sk), &wait);
-   break;
-   }
-   /* Don't wait for non-blocking sockets. */
-   if (timeout == 0) {
-   err = -EAGAIN;
-   finish_wait(sk_sleep(sk), &wait);
-   break;
-   }
-
-   err = transport->notify_recv_pre_block(
-   vsk, target, &recv_data);
-   if (err < 0) {
-   finish_wait(sk_sleep(sk), &wait);
-   break;
-   }
-   release_sock(sk);
-   timeout = schedule_timeout(timeout);
-   lock_sock(sk);
-
-   if (signal_pending(current)) {
-   err = sock_intr_errno(timeout);
-   finish_wait(sk_sleep(sk), &wait);
-   break;
-   } else if (timeout == 0) {
-   err = -EAGAIN;
-   finish_wait(sk_sleep(sk), &wait);
-   break;
-   }
-   } else {
-   ssize_t read;
+

Re: [RFC PATCH v4 01/17] af_vsock: update functions for connectible socket

2021-02-11 Thread Stefano Garzarella

On Sun, Feb 07, 2021 at 06:14:23PM +0300, Arseny Krasnov wrote:

This prepares af_vsock.c for SEQPACKET support: some functions such
as setsockopt(), getsockopt(), connect(), recvmsg(), sendmsg() are
shared between both types of sockets, so rename them in general
manner.

Signed-off-by: Arseny Krasnov 
---
net/vmw_vsock/af_vsock.c | 64 +---
1 file changed, 34 insertions(+), 30 deletions(-)


This patch LGTM:

Reviewed-by: Stefano Garzarella 

Thanks,
Stefano



diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 6894f21dc147..f4fabec50650 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -604,8 +604,8 @@ static void vsock_pending_work(struct work_struct *work)

/**** SOCKET OPERATIONS ****/

-static int __vsock_bind_stream(struct vsock_sock *vsk,
-  struct sockaddr_vm *addr)
+static int __vsock_bind_connectible(struct vsock_sock *vsk,
+   struct sockaddr_vm *addr)
{
static u32 port;
struct sockaddr_vm new_addr;
@@ -685,7 +685,7 @@ static int __vsock_bind(struct sock *sk, struct sockaddr_vm 
*addr)
switch (sk->sk_socket->type) {
case SOCK_STREAM:
spin_lock_bh(&vsock_table_lock);
-   retval = __vsock_bind_stream(vsk, addr);
+   retval = __vsock_bind_connectible(vsk, addr);
spin_unlock_bh(&vsock_table_lock);
break;

@@ -767,6 +767,11 @@ static struct sock *__vsock_create(struct net *net,
return sk;
}

+static bool sock_type_connectible(u16 type)
+{
+   return type == SOCK_STREAM;
+}
+
static void __vsock_release(struct sock *sk, int level)
{
if (sk) {
@@ -785,7 +790,7 @@ static void __vsock_release(struct sock *sk, int level)

if (vsk->transport)
vsk->transport->release(vsk);
-   else if (sk->sk_type == SOCK_STREAM)
+   else if (sock_type_connectible(sk->sk_type))
vsock_remove_sock(vsk);

sock_orphan(sk);
@@ -945,7 +950,7 @@ static int vsock_shutdown(struct socket *sock, int mode)
sk = sock->sk;
if (sock->state == SS_UNCONNECTED) {
err = -ENOTCONN;
-   if (sk->sk_type == SOCK_STREAM)
+   if (sock_type_connectible(sk->sk_type))
return err;
} else {
sock->state = SS_DISCONNECTING;
@@ -960,7 +965,7 @@ static int vsock_shutdown(struct socket *sock, int mode)
sk->sk_state_change(sk);
release_sock(sk);

-   if (sk->sk_type == SOCK_STREAM) {
+   if (sock_type_connectible(sk->sk_type)) {
sock_reset_flag(sk, SOCK_DONE);
vsock_send_shutdown(sk, mode);
}
@@ -1013,7 +1018,7 @@ static __poll_t vsock_poll(struct file *file, struct 
socket *sock,
if (!(sk->sk_shutdown & SEND_SHUTDOWN))
mask |= EPOLLOUT | EPOLLWRNORM | EPOLLWRBAND;

-   } else if (sock->type == SOCK_STREAM) {
+   } else if (sock_type_connectible(sk->sk_type)) {
const struct vsock_transport *transport;

lock_sock(sk);
@@ -1263,8 +1268,8 @@ static void vsock_connect_timeout(struct work_struct 
*work)
sock_put(sk);
}

-static int vsock_stream_connect(struct socket *sock, struct sockaddr *addr,
-   int addr_len, int flags)
+static int vsock_connect(struct socket *sock, struct sockaddr *addr,
+int addr_len, int flags)
{
int err;
struct sock *sk;
@@ -1414,7 +1419,7 @@ static int vsock_accept(struct socket *sock, struct 
socket *newsock, int flags,

lock_sock(listener);

-   if (sock->type != SOCK_STREAM) {
+   if (!sock_type_connectible(sock->type)) {
err = -EOPNOTSUPP;
goto out;
}
@@ -1491,7 +1496,7 @@ static int vsock_listen(struct socket *sock, int backlog)

lock_sock(sk);

-   if (sock->type != SOCK_STREAM) {
+   if (!sock_type_connectible(sk->sk_type)) {
err = -EOPNOTSUPP;
goto out;
}
@@ -1535,11 +1540,11 @@ static void vsock_update_buffer_size(struct vsock_sock 
*vsk,
vsk->buffer_size = val;
}

-static int vsock_stream_setsockopt(struct socket *sock,
-  int level,
-  int optname,
-  sockptr_t optval,
-  unsigned int optlen)
+static int vsock_connectible_setsockopt(struct socket *sock,
+   int level,
+   int optname,
+   sockptr_t optval,
+   unsigned int optlen)
{
int err;
struct sock *sk;
@@ -1617,10 +1622,10 @@ static int 

Re: [RFC PATCH v4 04/17] af_vsock: implement SEQPACKET receive loop

2021-02-11 Thread Stefano Garzarella

On Sun, Feb 07, 2021 at 06:15:22PM +0300, Arseny Krasnov wrote:

This adds receive loop for SEQPACKET. It looks like receive loop for
STREAM, but there is a little bit difference:
1) It doesn't call notify callbacks.
2) It doesn't care about 'SO_SNDLOWAT' and 'SO_RCVLOWAT' values, because
  there is no sense for these values in SEQPACKET case.
3) It waits until whole record is received or error is found during
  receiving.
4) It processes and sets 'MSG_TRUNC' flag.

So to avoid extra conditions for two types of socket inside one loop, two
independent functions were created.

Signed-off-by: Arseny Krasnov 
---
include/net/af_vsock.h   |  5 +++
net/vmw_vsock/af_vsock.c | 96 +++-
2 files changed, 100 insertions(+), 1 deletion(-)

diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h
index b1c717286993..bb6a0e52be86 100644
--- a/include/net/af_vsock.h
+++ b/include/net/af_vsock.h
@@ -135,6 +135,11 @@ struct vsock_transport {
bool (*stream_is_active)(struct vsock_sock *);
bool (*stream_allow)(u32 cid, u32 port);

+   /* SEQ_PACKET. */
+   size_t (*seqpacket_seq_get_len)(struct vsock_sock *);
+   int (*seqpacket_dequeue)(struct vsock_sock *, struct msghdr *,
+int flags, bool *msg_ready);


CHECK: Alignment should match open parenthesis
#35: FILE: include/net/af_vsock.h:141:
+   int (*seqpacket_dequeue)(struct vsock_sock *, struct msghdr *,
+int flags, bool *msg_ready);

And to make checkpatch.pl happy, please also use identifier names for
the other parameters. I know we haven't done this before, but for new
code I think we can do it.
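
As an illustration of the two points above (named parameters and
continuation lines aligned with the open parenthesis), here is a
self-contained sketch. The toy_* types and names are made up for this
example and only mimic the shape of the callbacks in the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Opaque stand-ins for the real kernel types. */
struct toy_vsock;
struct toy_msg;

struct toy_transport {
	/* Unnamed parameter: the reader has to guess what it means. */
	size_t (*get_len_unnamed)(struct toy_vsock *);

	/* Named parameters, continuation aligned with the open parenthesis. */
	int (*dequeue)(struct toy_vsock *vsk, struct toy_msg *msg,
		       int flags, bool *msg_ready);
};

/* Trivial implementation that just reports a complete record. */
static int toy_dequeue(struct toy_vsock *vsk, struct toy_msg *msg,
		       int flags, bool *msg_ready)
{
	(void)vsk;
	(void)msg;
	(void)flags;

	*msg_ready = true;
	return 0;
}
```

The callback is then wired up with a designated initializer, for
example 'struct toy_transport t = { .dequeue = toy_dequeue };'.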



+
/* Notification. */
int (*notify_poll_in)(struct vsock_sock *, size_t, bool *);
int (*notify_poll_out)(struct vsock_sock *, size_t, bool *);
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 66c8a932f49b..3d8af987216a 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -1977,6 +1977,97 @@ static int __vsock_stream_recvmsg(struct sock *sk, 
struct msghdr *msg,
return err;
}

+static int __vsock_seqpacket_recvmsg(struct sock *sk, struct msghdr *msg,
+size_t len, int flags)
+{
+   const struct vsock_transport *transport;
+   const struct iovec *orig_iov;
+   unsigned long orig_nr_segs;
+   bool msg_ready;
+   struct vsock_sock *vsk;
+   size_t record_len;
+   long timeout;
+   int err = 0;
+   DEFINE_WAIT(wait);
+
+   vsk = vsock_sk(sk);
+   transport = vsk->transport;
+
+   timeout = sock_rcvtimeo(sk, flags & MSG_DONTWAIT);
+   orig_nr_segs = msg->msg_iter.nr_segs;
+   orig_iov = msg->msg_iter.iov;
+   msg_ready = false;
+   record_len = 0;
+
+   while (1) {
+   err = vsock_wait_data(sk, &wait, timeout, NULL, 0);
+
+   if (err <= 0) {
+   /* In case of any loop break(timeout, signal
+* interrupt or shutdown), we report user that
+* nothing was copied.
+*/
+   err = 0;
+   break;
+   }
+
+   if (record_len == 0) {
+   record_len =
+   transport->seqpacket_seq_get_len(vsk);
+
+   if (record_len == 0)
+   continue;
+   }
+
+   err = transport->seqpacket_dequeue(vsk, msg,
+   flags, &msg_ready);


A single line here should be okay.


+   if (err < 0) {
+   if (err == -EAGAIN) {
+   iov_iter_init(&msg->msg_iter, READ,
+ orig_iov, orig_nr_segs,
+ len);
+   /* Clear 'MSG_EOR' here, because dequeue
+* callback above set it again if it was
+* set by sender. This 'MSG_EOR' is from
+* dropped record.
+*/
+   msg->msg_flags &= ~MSG_EOR;
+   record_len = 0;
+   continue;
+   }
+
+   err = -ENOMEM;
+   break;
+   }
+
+   if (msg_ready)
+   break;
+   }
+
+   if (sk->sk_err)
+   err = -sk->sk_err;
+   else if (sk->sk_shutdown & RCV_SHUTDOWN)
+   err = 0;
+
+   if (msg_ready) {
+   /* User sets MSG_TRUNC, so return real length of
+* packet.
+*/
+   if (flags & MSG_TRUNC)
+   err = record_len;
+   else
+   err = len - 

Re: [RFC PATCH v4 05/17] af_vsock: separate wait space loop

2021-02-11 Thread Stefano Garzarella

On Sun, Feb 07, 2021 at 06:15:41PM +0300, Arseny Krasnov wrote:

This moves loop that waits for space on send to separate function,
because it will be used for SEQ_BEGIN/SEQ_END sending before and
after data transmission. Waiting for SEQ_BEGIN/SEQ_END is needed
because such packets carries SEQPACKET header that couldn't be
fragmented by credit mechanism, so to avoid it, sender waits until
enough space will be ready.

Signed-off-by: Arseny Krasnov 
---
include/net/af_vsock.h   |  2 +
net/vmw_vsock/af_vsock.c | 93 ++--
2 files changed, 62 insertions(+), 33 deletions(-)

diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h
index bb6a0e52be86..19f6f22821ec 100644
--- a/include/net/af_vsock.h
+++ b/include/net/af_vsock.h
@@ -205,6 +205,8 @@ void vsock_remove_sock(struct vsock_sock *vsk);
void vsock_for_each_connected_socket(void (*fn)(struct sock *sk));
int vsock_assign_transport(struct vsock_sock *vsk, struct vsock_sock *psk);
bool vsock_find_cid(unsigned int cid);
+int vsock_wait_space(struct sock *sk, size_t space, int flags,
+struct vsock_transport_send_notify_data *send_data);

/**** TAP ****/

diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 3d8af987216a..ea99261e88ac 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -1693,6 +1693,64 @@ static int vsock_connectible_getsockopt(struct socket 
*sock,
return 0;
}

+int vsock_wait_space(struct sock *sk, size_t space, int flags,
+struct vsock_transport_send_notify_data *send_data)
+{
+   const struct vsock_transport *transport;
+   struct vsock_sock *vsk;
+   long timeout;
+   int err;
+
+   DEFINE_WAIT_FUNC(wait, woken_wake_function);
+
+   vsk = vsock_sk(sk);
+   transport = vsk->transport;
+   timeout = sock_sndtimeo(sk, flags & MSG_DONTWAIT);
+   err = 0;
+
+   add_wait_queue(sk_sleep(sk), &wait);
+
+   while (vsock_stream_has_space(vsk) < space &&
+  sk->sk_err == 0 &&
+  !(sk->sk_shutdown & SEND_SHUTDOWN) &&
+  !(vsk->peer_shutdown & RCV_SHUTDOWN)) {


Maybe a new line here, like in the original code, would help the 
readability.



+   /* Don't wait for non-blocking sockets. */
+   if (timeout == 0) {
+   err = -EAGAIN;
+   goto out_err;
+   }
+
+   if (send_data) {
+   err = transport->notify_send_pre_block(vsk, send_data);
+   if (err < 0)
+   goto out_err;
+   }
+
+   release_sock(sk);
+   timeout = wait_woken(&wait, TASK_INTERRUPTIBLE, timeout);
+   lock_sock(sk);
+   if (signal_pending(current)) {
+   err = sock_intr_errno(timeout);
+   goto out_err;
+   } else if (timeout == 0) {
+   err = -EAGAIN;
+   goto out_err;
+   }
+   }
+
+   if (sk->sk_err) {
+   err = -sk->sk_err;
+   } else if ((sk->sk_shutdown & SEND_SHUTDOWN) ||
+  (vsk->peer_shutdown & RCV_SHUTDOWN)) {
+   err = -EPIPE;
+   }
+
+out_err:
+   remove_wait_queue(sk_sleep(sk), &wait);
+   return err;
+}
+EXPORT_SYMBOL_GPL(vsock_wait_space);
+
static int vsock_connectible_sendmsg(struct socket *sock, struct msghdr *msg,
 size_t len)
{


After removing the wait loop in vsock_connectible_sendmsg(), we should 
remove the 'timeout' variable because it is no longer used.



@@ -1751,39 +1809,8 @@ static int vsock_connectible_sendmsg(struct socket 
*sock, struct msghdr *msg,
while (total_written < len) {
ssize_t written;

-   add_wait_queue(sk_sleep(sk), &wait);
-   while (vsock_stream_has_space(vsk) == 0 &&
-  sk->sk_err == 0 &&
-  !(sk->sk_shutdown & SEND_SHUTDOWN) &&
-  !(vsk->peer_shutdown & RCV_SHUTDOWN)) {
-
-   /* Don't wait for non-blocking sockets. */
-   if (timeout == 0) {
-   err = -EAGAIN;
-   remove_wait_queue(sk_sleep(sk), &wait);
-   goto out_err;
-   }
-
-   err = transport->notify_send_pre_block(vsk, &send_data);
-   if (err < 0) {
-   remove_wait_queue(sk_sleep(sk), &wait);
-   goto out_err;
-   }
-
-   release_sock(sk);
-   timeout = wait_woken(&wait, TASK_INTERRUPTIBLE, timeout);
-   lock_sock(sk);
-   if (signal_pending(current)) {
-   err = sock_intr_errno(timeout);
-   

Re: [RFC PATCH v4 02/17] af_vsock: separate wait data loop

2021-02-11 Thread Jorgen Hansen


> On 7 Feb 2021, at 16:14, Arseny Krasnov  wrote:
> 
> This moves wait loop for data to dedicated function, because later
> it will be used by SEQPACKET data receive loop.
> 
> Signed-off-by: Arseny Krasnov 
> ---
> net/vmw_vsock/af_vsock.c | 158 +--
> 1 file changed, 86 insertions(+), 72 deletions(-)
> 
> diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
> index f4fabec50650..38927695786f 100644
> --- a/net/vmw_vsock/af_vsock.c
> +++ b/net/vmw_vsock/af_vsock.c
> @@ -1833,6 +1833,71 @@ static int vsock_connectible_sendmsg(struct socket 
> *sock, struct msghdr *msg,
>   return err;
> }
> 
> +static int vsock_wait_data(struct sock *sk, struct wait_queue_entry *wait,
> +long timeout,
> +struct vsock_transport_recv_notify_data *recv_data,
> +size_t target)
> +{
> + const struct vsock_transport *transport;
> + struct vsock_sock *vsk;
> + s64 data;
> + int err;
> +
> + vsk = vsock_sk(sk);
> + err = 0;
> + transport = vsk->transport;
> + prepare_to_wait(sk_sleep(sk), wait, TASK_INTERRUPTIBLE);
> +
> + while ((data = vsock_stream_has_data(vsk)) == 0) {
> + if (sk->sk_err != 0 ||
> + (sk->sk_shutdown & RCV_SHUTDOWN) ||
> + (vsk->peer_shutdown & SEND_SHUTDOWN)) {
> + goto out;
> + }
> +
> + /* Don't wait for non-blocking sockets. */
> + if (timeout == 0) {
> + err = -EAGAIN;
> + goto out;
> + }
> +
> + if (recv_data) {
> + err = transport->notify_recv_pre_block(vsk, target, 
> recv_data);
> + if (err < 0)
> + goto out;
> + }
> +
> + release_sock(sk);
> + timeout = schedule_timeout(timeout);
> + lock_sock(sk);
> +
> + if (signal_pending(current)) {
> + err = sock_intr_errno(timeout);
> + goto out;
> + } else if (timeout == 0) {
> + err = -EAGAIN;
> + goto out;
> + }
> + }
> +
> + finish_wait(sk_sleep(sk), wait);
> +
> + /* Invalid queue pair content. XXX This should
> +  * be changed to a connection reset in a later
> +  * change.
> +  */

Since you are here, could you update this comment to something like:

/* Internal transport error when checking for available
 * data. XXX This should be changed to a connection
 * reset in a later change.
 */

> + if (data < 0)
> + return -ENOMEM;
> +
> + /* Have some data, return. */
> + if (data)
> + return data;
> +
> +out:
> + finish_wait(sk_sleep(sk), wait);
> + return err;
> +}

I agree with Stefano's suggestion to get rid of the out: part and just have a
single finish_wait().

> +
> static int
> vsock_connectible_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
> int flags)
> @@ -1912,85 +1977,34 @@ vsock_connectible_recvmsg(struct socket *sock, struct 
> msghdr *msg, size_t len,
> 
> 
>   while (1) {
> - s64 ready;
> + ssize_t read;
> 
> - prepare_to_wait(sk_sleep(sk), &wait, TASK_INTERRUPTIBLE);
> - ready = vsock_stream_has_data(vsk);
> -
> - if (ready == 0) {
> - if (sk->sk_err != 0 ||
> - (sk->sk_shutdown & RCV_SHUTDOWN) ||
> - (vsk->peer_shutdown & SEND_SHUTDOWN)) {
> - finish_wait(sk_sleep(sk), &wait);
> - break;
> - }
> - /* Don't wait for non-blocking sockets. */
> - if (timeout == 0) {
> - err = -EAGAIN;
> - finish_wait(sk_sleep(sk), &wait);
> - break;
> - }
> -
> - err = transport->notify_recv_pre_block(
> - vsk, target, &recv_data);
> - if (err < 0) {
> - finish_wait(sk_sleep(sk), &wait);
> - break;
> - }
> - release_sock(sk);
> - timeout = schedule_timeout(timeout);
> - lock_sock(sk);
> -
> - if (signal_pending(current)) {
> - err = sock_intr_errno(timeout);
> - finish_wait(sk_sleep(sk), &wait);
> - break;
> - } else if (timeout == 0) {
> - err = -EAGAIN;
> - finish_wait(sk_sleep(sk), &wait);
> - break;
> - }
> - } else {
> - 

Re: [PATCH] vdpa/mlx5: fix param validation in mlx5_vdpa_get_config()

2021-02-11 Thread Stefano Garzarella

On Wed, Feb 10, 2021 at 07:12:31AM -0500, Michael S. Tsirkin wrote:

On Wed, Feb 10, 2021 at 12:17:19PM +0800, Jason Wang wrote:


On 2021/2/9 5:00 PM, Stefano Garzarella wrote:
> On Tue, Feb 09, 2021 at 07:43:02AM +0200, Eli Cohen wrote:
> > On Mon, Feb 08, 2021 at 05:17:41PM +0100, Stefano Garzarella wrote:
> > > It's legal to have 'offset + len' equal to
> > > sizeof(struct virtio_net_config), since 'ndev->config' is a
> > > 'struct virtio_net_config', so we can safely copy its content under
> > > this condition.
> > >
> > > Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for supported
> > > mlx5 devices")
> > > Cc: sta...@vger.kernel.org
> > > Signed-off-by: Stefano Garzarella 
> >
> > Acked-by: Eli Cohen 
> >
> > BTW, same error in vdpa_sim you may want to fix.
> >
>
> Commit 65b709586e22 ("vdpa_sim: add get_config callback in
> vdpasim_dev_attr") unintentionally solved it.
>
> Since it's a simulator, maybe we can avoid solving it in the stable
> branches. Or does it matter?


I think not, since the module depends on RUNTIME_TESTING_MENU.

Thanks



Well people use the simulator for development...
I'm not going to block this patch on it, but if someone
has the cycles to post a stable branch patch, that would be
great.



Okay, I'll do it.

Thanks,
Stefano


Re: [PATCH iproute2-next v5 0/5] Add vdpa device management tool

2021-02-11 Thread David Ahern
On 2/10/21 11:34 AM, Parav Pandit wrote:
> Linux vdpa interface allows vdpa device management functionality.
> This includes adding, removing, querying vdpa devices.
> 
> vdpa interface also includes showing supported management devices
> which support such operations.
> 
> This patchset includes kernel uapi headers and a vdpa tool.
> 

applied to iproute2-next.

I am expecting a followup converting devlink to use the indent and mnl
helpers.

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH for 5.10] vdpa_sim: fix param validation in vdpasim_get_config()

2021-02-11 Thread Stefano Garzarella
Commit 65b709586e222fa6ffd4166ac7fdb5d5dad113ee upstream.

Before this patch, if 'offset + len' was equal to
sizeof(struct virtio_net_config), the entire buffer wasn't filled,
returning incorrect values to the caller.

Since 'vdpasim->config' type is 'struct virtio_net_config', we can
safely copy its content under this condition.

Commit 65b709586e22 ("vdpa_sim: add get_config callback in
vdpasim_dev_attr") unintentionally solved it upstream while
refactoring vdpa_sim.c to support multiple devices. But we don't want
to backport it to stable branches as it contains many changes.

Fixes: 2c53d0f64c06 ("vdpasim: vDPA device simulator")
Cc:  # 5.10.x
Signed-off-by: Stefano Garzarella 
---
 drivers/vdpa/vdpa_sim/vdpa_sim.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
index 6a90fdb9cbfc..8ca178d7b02f 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
@@ -572,7 +572,7 @@ static void vdpasim_get_config(struct vdpa_device *vdpa, unsigned int offset,
 {
 	struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
 
-	if (offset + len < sizeof(struct virtio_net_config))
+	if (offset + len <= sizeof(struct virtio_net_config))
 		memcpy(buf, (u8 *)&vdpasim->config + offset, len);
 }
 
-- 
2.29.2
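
The boundary condition fixed by this patch can be reproduced in a small
userspace sketch. Everything named demo_* below is hypothetical and only
stands in for the driver code: 'struct demo_config' is a simplified
substitute for 'struct virtio_net_config', and the 'inclusive' flag
switches between the old '<' check and the fixed '<=' check. The sum is
widened to 64 bits in this sketch only, to keep the example free of
wrap-around concerns.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct virtio_net_config (12 bytes, no padding). */
struct demo_config {
	uint8_t mac[6];
	uint16_t status;
	uint16_t max_virtqueue_pairs;
	uint16_t mtu;
};

/*
 * Copy 'len' bytes of the config starting at 'offset' into 'buf'.
 * inclusive == 0 models the old check (offset + len <  sizeof),
 * inclusive != 0 models the fixed check (offset + len <= sizeof).
 * Returns 1 if the copy happened, 0 if the request was rejected.
 */
static int demo_get_config(const struct demo_config *cfg, unsigned int offset,
			   void *buf, unsigned int len, int inclusive)
{
	uint64_t end = (uint64_t)offset + len;
	int ok = inclusive ? end <= sizeof(*cfg) : end < sizeof(*cfg);

	assert(cfg != NULL && buf != NULL);
	if (!ok)
		return 0;

	memcpy(buf, (const uint8_t *)cfg + offset, len);
	return 1;
}
```

Reading the whole structure (offset 0, len sizeof(struct demo_config))
is rejected by the old check and leaves the caller's buffer untouched,
while the fixed check copies it. A request that ends one byte past the
structure is rejected by both.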



RE: [PATCH iproute2-next v5 0/5] Add vdpa device management tool

2021-02-11 Thread Parav Pandit



> From: David Ahern 
> Sent: Thursday, February 11, 2021 9:50 PM
> 
> On 2/10/21 11:34 AM, Parav Pandit wrote:
> > Linux vdpa interface allows vdpa device management functionality.
> > This includes adding, removing, querying vdpa devices.
> >
> > vdpa interface also includes showing supported management devices
> > which support such operations.
> >
> > This patchset includes kernel uapi headers and a vdpa tool.
> >
> 
> applied to iproute2-next.
> 
> I am expecting a followup converting devlink to use the indent and mnl
> helpers.
Yes. Thanks, David.
I will finish the internal review and more testing next week, then post it here.


Re: [RFC v2 1/7] vhost: Delete trailing dot in error_setg argument

2021-02-11 Thread Stefano Garzarella

On Tue, Feb 09, 2021 at 07:11:41PM +0100, Eugenio Perez Martin wrote:

On Tue, Feb 9, 2021 at 5:25 PM Eric Blake  wrote:


On 2/9/21 9:37 AM, Eugenio Pérez wrote:
> As error_setg points

Incomplete sentence?

Missing Signed-off-by.



Sorry, I should have paid more attention.

Maybe it is better to send this through qemu-trivial, so it does not
mess with this series?


Yes, I agree that it can go regardless of this series.

Thanks,
Stefano



vsock virtio: questions about supporting DGRAM type

2021-02-11 Thread Jiang Wang .
Hi guys,

I am working on supporting the DGRAM type for virtio/vhost vsock. I
have already done some work, and the draft code is here (it passes my
tests, but still needs some cleanup and only works from host to guest as
of now, will add host to guest soon):
https://github.com/Jiang1155/linux/commit/4e89736e0bce15496460ff411cb4694b143d1c3d
qemu changes are here:
https://github.com/Jiang1155/qemu/commit/7ab778801e3e8969ab98e44539943810a2fb03eb

Today, I noticed that Asias had an old version of virtio
which had both dgram and stream support, see this link:
https://kvm.vger.kernel.narkive.com/BMvH9eEr/rfc-v2-0-7-introduce-vm-sockets-virtio-transport#post1

But somehow, the dgram part seems to have never been merged into
upstream Linux (the stream part was merged). If so, does anyone know
the reason? Did we drop dgram support for some specific reason, or does
the code need some improvement?

My current code differs from Asias' code in some ways. It does not use
credit and does not support fragmentation. It basically adds two
virtqueues and re-uses the existing functions for tx and rx (there is
some duplicated code for now, but I will try to make common functions
to reduce it). If we still want to support dgram in upstream Linux,
which way do you guys recommend? If necessary, I can try to base my
work on Asias' old code and continue from there. If anything is
unclear, just let me know. Thanks.

Regards,

Jiang
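
For readers following the thread, the two design points above (no
credit, no fragmentation) can be modelled with a small userspace sketch.
This is not taken from the draft code linked above. All demo_* names,
the 64 KiB payload limit, and the choice to drop silently when no
receive buffer is free are assumptions made only to illustrate what
those two properties could imply for a sender.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Assumed per-packet payload limit; a real value would depend on the rx buffers. */
#define DEMO_DGRAM_MAX_PAYLOAD (64 * 1024)

/* Hypothetical model of the receiver side of a dgram virtqueue. */
struct demo_rxq {
	unsigned int free_bufs;	/* rx buffers currently posted */
	unsigned int delivered;	/* datagrams that found a buffer */
	unsigned int dropped;	/* datagrams lost because none was free */
};

/*
 * Model sending one datagram of 'len' bytes.
 * No fragmentation: a datagram must fit in a single packet, else -EMSGSIZE.
 * No credit: the sender is not throttled, so a full receiver loses the
 * datagram and the sender still sees success (datagram semantics).
 */
static int demo_dgram_send(struct demo_rxq *rxq, size_t len)
{
	assert(rxq != NULL);

	if (len > DEMO_DGRAM_MAX_PAYLOAD)
		return -EMSGSIZE;

	if (rxq->free_bufs == 0) {
		rxq->dropped++;
		return 0;
	}

	rxq->free_bufs--;
	rxq->delivered++;
	return 0;
}
```

With one free buffer, the first send is delivered and the second is
counted as dropped, while an oversized datagram fails up front without
touching the queue. A credit-based design would instead make the sender
wait until the receiver advertises space.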