Re: [RFC] VSOCK: The performance problem of vhost_vsock.
On 2018/10/18 9:22 AM, jiangyiwen wrote:
> On 2018/10/17 20:31, Jason Wang wrote:
>> Want to post patches for this then :) ?
>>
>> Thanks
>
> I may not post patches at the moment because there are other tasks.
> After a period of time, I will consider implementing the feature.
>
> Thanks.

That's fine.

Thanks
Re: [RFC] VSOCK: The performance problem of vhost_vsock.
On 2018/10/17 20:31, Jason Wang wrote:
> On 2018/10/17 7:41 PM, jiangyiwen wrote:
>> Why is vhost_vsock still behind vhost_net?
>> Because I used sendfile() to test performance at first, and then found
>> that vsock does not implement sendpage(), so the bandwidth could not be
>> increased. I therefore replaced sendfile() with read() and send(), which
>> adds switches between kernel and user mode, whereas sendfile() supports
>> zero copy. I think this is the main reason.
>
> Want to post patches for this then :) ?
>
> Thanks

I may not post patches at the moment because there are other tasks.
After a period of time, I will consider implementing the feature.

Thanks.
Re: [RFC] VSOCK: The performance problem of vhost_vsock.
On 2018/10/17 7:41 PM, jiangyiwen wrote:
> On 2018/10/17 17:51, Jason Wang wrote:
>> Btw, if you're using vsock for transferring large files, maybe it's more
>> efficient to implement sendpage() for vsock to allow sendfile()/splice()
>> to work.
>>
>> Thanks
>
> I can't agree more.
>
> Why is vhost_vsock still behind vhost_net?
> Because I used sendfile() to test performance at first, and then found
> that vsock does not implement sendpage(), so the bandwidth could not be
> increased. I therefore replaced sendfile() with read() and send(), which
> adds switches between kernel and user mode, whereas sendfile() supports
> zero copy. I think this is the main reason.
>
> Thanks.

Want to post patches for this then :) ?

Thanks
Re: [RFC] VSOCK: The performance problem of vhost_vsock.
On 2018/10/17 17:51, Jason Wang wrote:
> On 2018/10/17 5:39 PM, Jason Wang wrote:
>>> I found pkt_len is limited to VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE (4K),
>>> which limits the bandwidth to 500~600MB/s. Once I increase it to 64K,
>>> throughput improves about 3x (~1500MB/s).
>>
>> Looks like the value was chosen as a balance between rx buffer size and
>> performance. Always allocating 64K, even for small packets, is wasteful
>> and stresses guest memory. Virtio-net avoids this with mergeable rx
>> buffers, which allow a big packet to be scattered into different buffers.
>> We could reuse this idea or revisit the idea of using virtio-net/vhost-net
>> as a transport for vsock.
>>
>> What is interesting is that the performance is still behind vhost-net.
>>
>> Thanks
>
> Btw, if you're using vsock for transferring large files, maybe it's more
> efficient to implement sendpage() for vsock to allow sendfile()/splice()
> to work.
>
> Thanks

I can't agree more.

Why is vhost_vsock still behind vhost_net?
Because I used sendfile() to test performance at first, and then found
that vsock does not implement sendpage(), so the bandwidth could not be
increased. I therefore replaced sendfile() with read() and send(), which
adds switches between kernel and user mode, whereas sendfile() supports
zero copy. I think this is the main reason.

Thanks.
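[Editorial note: a minimal sketch of the two client-side send paths being compared here, sendfile() versus a read()+send() loop. This is an illustration, not the original test tool; the function names and the 64K chunk size are assumptions.]

#include <sys/types.h>
#include <sys/sendfile.h>
#include <sys/socket.h>
#include <unistd.h>

#define CHUNK (64 * 1024)

/* Zero-copy path: the kernel moves data from the page cache straight to
 * the socket; this only pays off if the socket implements sendpage(). */
static ssize_t send_with_sendfile(int sock_fd, int file_fd, size_t len)
{
	off_t off = 0;
	ssize_t n, total = 0;

	while ((size_t)total < len) {
		n = sendfile(sock_fd, file_fd, &off, len - total);
		if (n <= 0)
			return n < 0 ? -1 : total;
		total += n;
	}
	return total;
}

/* Fallback used in the test: every chunk is copied into user space with
 * read() and back into the kernel with send(), adding one copy and two
 * user/kernel mode switches per chunk. */
static ssize_t send_with_read_send(int sock_fd, int file_fd, size_t len)
{
	static char buf[CHUNK];
	ssize_t n, total = 0;

	while ((size_t)total < len) {
		n = read(file_fd, buf, CHUNK);
		if (n <= 0)
			return n < 0 ? -1 : total;
		if (send(sock_fd, buf, n, 0) != n)
			return -1;
		total += n;
	}
	return total;
}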
Re: [RFC] VSOCK: The performance problem of vhost_vsock.
On 2018/10/17 17:39, Jason Wang wrote:
> On 2018/10/17 5:27 PM, jiangyiwen wrote:
>> Hi Jason and Stefan,
>>
>> I may have found the reason for the bad performance.
>>
>> I found pkt_len is limited to VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE (4K),
>> which limits the bandwidth to 500~600MB/s. Once I increase it to 64K,
>> throughput improves about 3x (~1500MB/s).
>
> Looks like the value was chosen as a balance between rx buffer size and
> performance. Always allocating 64K, even for small packets, is wasteful
> and stresses guest memory. Virtio-net avoids this with mergeable rx
> buffers, which allow a big packet to be scattered into different buffers.
> We could reuse this idea or revisit the idea of using virtio-net/vhost-net
> as a transport for vsock.
>
> What is interesting is that the performance is still behind vhost-net.
>
> Thanks

Actually I don't understand why pkt_len is limited to
VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE in virtio_transport_send_pkt_info();
I think it should use VIRTIO_VSOCK_MAX_PKT_BUF_SIZE instead.

Thanks.
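[Editorial note: a small standalone illustration of why this per-packet limit matters. The constant values are as recalled from include/linux/virtio_vsock.h, and the clamp only paraphrases the logic in virtio_transport_send_pkt_info(); it is not a copy of the kernel source.]

#include <stdio.h>

#define VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE	(1024 * 4)
#define VIRTIO_VSOCK_MAX_PKT_BUF_SIZE		(1024 * 64)

/* Each virtio-vsock packet carries at most 'limit' payload bytes, so one
 * application write of send_len bytes is split into this many packets. */
static unsigned int packets_needed(unsigned int send_len, unsigned int limit)
{
	return (send_len + limit - 1) / limit;
}

int main(void)
{
	unsigned int write_len = 64 * 1024;	/* one 64K application write */

	printf("4K limit:  %u packets per 64K write\n",
	       packets_needed(write_len, VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE));
	printf("64K limit: %u packets per 64K write\n",
	       packets_needed(write_len, VIRTIO_VSOCK_MAX_PKT_BUF_SIZE));
	return 0;
}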
Re: [RFC] VSOCK: The performance problem of vhost_vsock.
On 2018/10/17 5:39 PM, Jason Wang wrote:
>> I may have found the reason for the bad performance.
>>
>> I found pkt_len is limited to VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE (4K),
>> which limits the bandwidth to 500~600MB/s. Once I increase it to 64K,
>> throughput improves about 3x (~1500MB/s).
>
> Looks like the value was chosen as a balance between rx buffer size and
> performance. Always allocating 64K, even for small packets, is wasteful
> and stresses guest memory. Virtio-net avoids this with mergeable rx
> buffers, which allow a big packet to be scattered into different buffers.
> We could reuse this idea or revisit the idea of using virtio-net/vhost-net
> as a transport for vsock.
>
> What is interesting is that the performance is still behind vhost-net.
>
> Thanks

Btw, if you're using vsock for transferring large files, maybe it's more
efficient to implement sendpage() for vsock to allow sendfile()/splice()
to work.

Thanks
Re: [RFC] VSOCK: The performance problem of vhost_vsock.
On 2018/10/17 5:27 PM, jiangyiwen wrote:
> Hi Jason and Stefan,
>
> I may have found the reason for the bad performance.
>
> I found pkt_len is limited to VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE (4K),
> which limits the bandwidth to 500~600MB/s. Once I increase it to 64K,
> throughput improves about 3x (~1500MB/s).

Looks like the value was chosen as a balance between rx buffer size and
performance. Always allocating 64K, even for small packets, is wasteful
and stresses guest memory. Virtio-net avoids this with mergeable rx
buffers, which allow a big packet to be scattered into different buffers.
We could reuse this idea or revisit the idea of using virtio-net/vhost-net
as a transport for vsock.

What is interesting is that the performance is still behind vhost-net.

Thanks

> By the way, I send 64K at a time in the application, and I don't use
> sg_init_one; I rewrote the function that packs the sg list, because
> pkt_len spans multiple pages.
>
> Thanks,
> Yiwen.
Re: [RFC] VSOCK: The performance problem of vhost_vsock.
On 2018/10/15 14:12, jiangyiwen wrote:
> On 2018/10/15 10:33, Jason Wang wrote:
>> TCP/IP is not a must for vhost-net.
>>
>> How do you test and compare the performance?
>
> I tested the performance with my test tool.
>
> In this test, vhost-vsock reaches about 500MB/s, and vhost-net about
> 2500MB/s.
>
> By the way, vhost-net uses a single queue.

Hi Jason and Stefan,

I may have found the reason for the bad performance.

I found pkt_len is limited to VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE (4K),
which limits the bandwidth to 500~600MB/s. Once I increase it to 64K,
throughput improves about 3x (~1500MB/s).

By the way, I send 64K at a time in the application, and I don't use
sg_init_one; I rewrote the function that packs the sg list, because
pkt_len spans multiple pages.

Thanks,
Yiwen.
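[Editorial note: a rough kernel-style sketch of what packing a multi-page payload into a scatterlist, instead of calling sg_init_one(), could look like. The helper name vsock_pkt_to_sg() and its arguments are hypothetical; this is not the actual vsock code nor the change discussed above, and it assumes a linearly mapped (kmalloc'd) buffer.]

#include <linux/scatterlist.h>
#include <linux/kernel.h>
#include <linux/errno.h>
#include <linux/mm.h>

static int vsock_pkt_to_sg(struct scatterlist *sg, int max_ents,
			   void *buf, size_t len)
{
	int nents = 0;

	sg_init_table(sg, max_ents);

	while (len && nents < max_ents) {
		unsigned int off = offset_in_page(buf);
		unsigned int chunk = min_t(size_t, len, PAGE_SIZE - off);

		/* One scatterlist entry per page-sized chunk of the buffer. */
		sg_set_page(&sg[nents], virt_to_page(buf), chunk, off);
		buf += chunk;
		len -= chunk;
		nents++;
	}

	/* Fail if the buffer did not fit into max_ents entries. */
	return len ? -EINVAL : nents;
}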
Re: [RFC] VSOCK: The performance problem of vhost_vsock.
On 2018/10/15 10:33, Jason Wang wrote:
> On 2018/10/15 09:43, jiangyiwen wrote:
>> First, I think vhost-vsock should be faster than vhost-net because
>> there is no TCP/IP stack, but in my tests vhost-net is 5~10 times
>> faster than vhost-vsock; currently I am looking for the reason.
>
> TCP/IP is not a must for vhost-net.
>
> How do you test and compare the performance?
>
> Thanks

I tested the performance with my test tool, as follows:

    Server                          Client
    socket()
    bind()
    listen()
                                    socket(AF_VSOCK) or socket(AF_INET)
    accept()        <------->       connect()
                                    *== Start Record Time ==*
    recv()                          call sendfile()
    receive end                     send end
    send(file_size)
                                    recv(file_size)
                                    *== End Record Time ==*

In this test, vhost-vsock reaches about 500MB/s, and vhost-net about
2500MB/s.

By the way, vhost-net uses a single queue.

Thanks.
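[Editorial note: a minimal client-side sketch of the flow above (AF_VSOCK variant), not the original test tool. The host CID, port number, test file path, and the assumption that the 8-byte size reply arrives in one recv() are all illustrative; the AF_INET variant only swaps the socket family and sockaddr.]

#include <stdio.h>
#include <stdint.h>
#include <fcntl.h>
#include <time.h>
#include <sys/sendfile.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <linux/vm_sockets.h>

int main(void)
{
	struct sockaddr_vm addr = {
		.svm_family = AF_VSOCK,
		.svm_cid = 2,		/* assumed host CID */
		.svm_port = 1234,	/* assumed test port */
	};
	struct stat st;
	struct timespec t0, t1;
	uint64_t echoed_size;
	off_t off = 0;
	int sock, fd;

	sock = socket(AF_VSOCK, SOCK_STREAM, 0);
	fd = open("/tmp/testfile", O_RDONLY);	/* assumed test file */
	if (sock < 0 || fd < 0 || fstat(fd, &st) < 0)
		return 1;

	if (connect(sock, (struct sockaddr *)&addr, sizeof(addr)) < 0)
		return 1;

	clock_gettime(CLOCK_MONOTONIC, &t0);	/* == Start Record Time == */

	while (off < st.st_size)		/* push the whole file */
		if (sendfile(sock, fd, &off, st.st_size - off) <= 0)
			return 1;

	/* Wait for the server to report how many bytes it received. */
	if (recv(sock, &echoed_size, sizeof(echoed_size), 0) !=
	    sizeof(echoed_size))
		return 1;

	clock_gettime(CLOCK_MONOTONIC, &t1);	/* == End Record Time == */

	printf("%llu bytes in %.3f s\n", (unsigned long long)echoed_size,
	       (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9);
	return 0;
}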
Re: [RFC] VSOCK: The performance problem of vhost_vsock.
On 2018/10/15 09:43, jiangyiwen wrote:
> Hi Stefan & All:
>
> I have found that vhost-vsock has two performance problems, even though
> it is not designed for performance.
>
> First, I think vhost-vsock should be faster than vhost-net because
> there is no TCP/IP stack, but in my tests vhost-net is 5~10 times
> faster than vhost-vsock; currently I am looking for the reason.

TCP/IP is not a must for vhost-net.

How do you test and compare the performance?

Thanks

> Second, vhost-vsock only supports two vqs (tx and rx), which means
> multiple sockets in the guest share the same vq to transmit messages
> and get responses. So if there are multiple applications in the guest,
> we should support a "multiqueue" feature for virtio-vsock.
>
> Stefan, have you encountered these problems?
>
> Thanks,
> Yiwen.