Re: [RFC] VSOCK: The performance problem of vhost_vsock.

2018-10-17 Thread Jason Wang


On 2018/10/18 9:22 AM, jiangyiwen wrote:

On 2018/10/17 20:31, Jason Wang wrote:

On 2018/10/17 7:41 PM, jiangyiwen wrote:

On 2018/10/17 17:51, Jason Wang wrote:

On 2018/10/17 5:39 PM, Jason Wang wrote:

Hi Jason and Stefan,

I may have found the reason for the bad performance.

I found that pkt_len is limited to VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE (4K),
which limits the bandwidth to 500~600MB/s. Once I increase it to 64K,
throughput improves by about 3x (~1500MB/s).

Looks like the value was chosen as a balance between rx buffer size and
performance. Always allocating 64K even for small packets is wasteful and
stresses guest memory. Virtio-net avoids this with mergeable rx buffers,
which allow a big packet to be scattered across different buffers. We can
reuse this idea or revisit the idea of using virtio-net/vhost-net as a
transport for vsock.

What is interesting is that the performance is still behind vhost-net.

Thanks


By the way, I send 64K at a time in the application, and I don't use
sg_init_one; I rewrote the function to pack an sg list, because pkt_len
spans multiple pages.

Thanks,
Yiwen.

Btw, if you're using vsock for transferring large files, maybe it's more
efficient to implement sendpage() for vsock to allow sendfile()/splice()
to work.

Thanks


I can't agree more.

Why is vhost_vsock still behind vhost_net?
At first I used sendfile() to test performance, and then I found that vsock
doesn't implement sendpage(), so the bandwidth couldn't be increased. I then
replaced sendfile() with read() and send(), which adds extra switches between
kernel and user mode, whereas sendfile() supports zero copy. I think this is
the main reason.

Thanks.


Want to post patches for this then :) ?

Thanks


I may not post patches at the moment because there are other tasks.

After a while, I will consider implementing the feature.

Thanks.



That's fine.

Thanks


Re: [RFC] VSOCK: The performance problem of vhost_vsock.

2018-10-17 Thread jiangyiwen
On 2018/10/17 20:31, Jason Wang wrote:
> 
> On 2018/10/17 7:41 PM, jiangyiwen wrote:
>> On 2018/10/17 17:51, Jason Wang wrote:
>>> On 2018/10/17 5:39 PM, Jason Wang wrote:
> Hi Jason and Stefan,
>
> I may have found the reason for the bad performance.
>
> I found that pkt_len is limited to VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE (4K),
> which limits the bandwidth to 500~600MB/s. Once I increase it to 64K,
> throughput improves by about 3x (~1500MB/s).

 Looks like the value was chosen as a balance between rx buffer size and
 performance. Always allocating 64K even for small packets is wasteful and
 stresses guest memory. Virtio-net avoids this with mergeable rx buffers,
 which allow a big packet to be scattered across different buffers. We can
 reuse this idea or revisit the idea of using virtio-net/vhost-net as a
 transport for vsock.

 What is interesting is that the performance is still behind vhost-net.

 Thanks

> By the way, I send 64K at a time in the application, and I don't use
> sg_init_one; I rewrote the function to pack an sg list, because pkt_len
> spans multiple pages.
>
> Thanks,
> Yiwen.
>>>
>>> Btw, if you're using vsock for transferring large files, maybe it's more
>>> efficient to implement sendpage() for vsock to allow sendfile()/splice()
>>> to work.
>>>
>>> Thanks
>>>
>> I can't agree more.
>>
>> Why is vhost_vsock still behind vhost_net?
>> At first I used sendfile() to test performance, and then I found that vsock
>> doesn't implement sendpage(), so the bandwidth couldn't be increased. I then
>> replaced sendfile() with read() and send(), which adds extra switches between
>> kernel and user mode, whereas sendfile() supports zero copy. I think this is
>> the main reason.
>>
>> Thanks.
> 
> 
> Want to post patches for this then :) ?
> 
> Thanks
> 

I may not post patches at the moment because there are other tasks.

After a while, I will consider implementing the feature.

Thanks.




Re: [RFC] VSOCK: The performance problem of vhost_vsock.

2018-10-17 Thread Jason Wang


On 2018/10/17 7:41 PM, jiangyiwen wrote:

On 2018/10/17 17:51, Jason Wang wrote:

On 2018/10/17 5:39 PM, Jason Wang wrote:

Hi Jason and Stefan,

I may have found the reason for the bad performance.

I found that pkt_len is limited to VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE (4K),
which limits the bandwidth to 500~600MB/s. Once I increase it to 64K,
throughput improves by about 3x (~1500MB/s).


Looks like the value was chosen as a balance between rx buffer size and
performance. Always allocating 64K even for small packets is wasteful and
stresses guest memory. Virtio-net avoids this with mergeable rx buffers,
which allow a big packet to be scattered across different buffers. We can
reuse this idea or revisit the idea of using virtio-net/vhost-net as a
transport for vsock.

What is interesting is that the performance is still behind vhost-net.

Thanks


By the way, I send 64K at a time in the application, and I don't use
sg_init_one; I rewrote the function to pack an sg list, because pkt_len
spans multiple pages.

Thanks,
Yiwen.


Btw, if you're using vsock for transferring large files, maybe it's more
efficient to implement sendpage() for vsock to allow sendfile()/splice()
to work.

Thanks


I can't agree more.

Why is vhost_vsock still behind vhost_net?
At first I used sendfile() to test performance, and then I found that vsock
doesn't implement sendpage(), so the bandwidth couldn't be increased. I then
replaced sendfile() with read() and send(), which adds extra switches between
kernel and user mode, whereas sendfile() supports zero copy. I think this is
the main reason.

Thanks.



Want to post patches for this then :) ?

Thanks










Re: [RFC] VSOCK: The performance problem of vhost_vsock.

2018-10-17 Thread jiangyiwen
On 2018/10/17 17:51, Jason Wang wrote:
> 
> On 2018/10/17 5:39 PM, Jason Wang wrote:

>>> Hi Jason and Stefan,
>>>
>>> I may have found the reason for the bad performance.
>>>
>>> I found that pkt_len is limited to VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE (4K),
>>> which limits the bandwidth to 500~600MB/s. Once I increase it to 64K,
>>> throughput improves by about 3x (~1500MB/s).
>>
>>
>> Looks like the value was chosen as a balance between rx buffer size and
>> performance. Always allocating 64K even for small packets is wasteful and
>> stresses guest memory. Virtio-net avoids this with mergeable rx buffers,
>> which allow a big packet to be scattered across different buffers. We can
>> reuse this idea or revisit the idea of using virtio-net/vhost-net as a
>> transport for vsock.
>>
>> What is interesting is that the performance is still behind vhost-net.
>>
>> Thanks
>>
>>>
>>> By the way, I send 64K at a time in the application, and I don't use
>>> sg_init_one; I rewrote the function to pack an sg list, because pkt_len
>>> spans multiple pages.
>>>
>>> Thanks,
>>> Yiwen. 
> 
> 
> Btw, if you're using vsock for transferring large files, maybe it's more
> efficient to implement sendpage() for vsock to allow sendfile()/splice()
> to work.
> 
> Thanks
>

I can't agree more.

Why is vhost_vsock still behind vhost_net?
At first I used sendfile() to test performance, and then I found that vsock
doesn't implement sendpage(), so the bandwidth couldn't be increased. I then
replaced sendfile() with read() and send(), which adds extra switches between
kernel and user mode, whereas sendfile() supports zero copy. I think this is
the main reason.
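
For reference, a minimal userspace sketch of the two paths being compared
(placeholder descriptors, not the actual test tool): read()+send() copies
every byte through a user buffer, while sendfile() keeps the data in the
kernel but needs the socket to implement sendpage().

/* Sketch only: 'sock' is a connected stream socket, 'fd' an open file. */
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/sendfile.h>

static ssize_t send_with_copy(int sock, int fd)
{
	char buf[65536];		/* extra user<->kernel copies and syscalls */
	ssize_t n, total = 0;

	while ((n = read(fd, buf, sizeof(buf))) > 0) {
		if (send(sock, buf, n, 0) != n)
			return -1;
		total += n;
	}
	return total;
}

static ssize_t send_zero_copy(int sock, int fd, size_t len)
{
	off_t off = 0;			/* pages move inside the kernel; needs sendpage() */
	ssize_t n, total = 0;

	while ((size_t)total < len) {
		n = sendfile(sock, fd, &off, len - total);
		if (n <= 0)
			return -1;
		total += n;
	}
	return total;
}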

Thanks.




Re: [RFC] VSOCK: The performance problem of vhost_vsock.

2018-10-17 Thread jiangyiwen
On 2018/10/17 17:39, Jason Wang wrote:
> 
> On 2018/10/17 5:27 PM, jiangyiwen wrote:
>> On 2018/10/15 14:12, jiangyiwen wrote:
>>> On 2018/10/15 10:33, Jason Wang wrote:

 On 2018/10/15 09:43, jiangyiwen wrote:
> Hi Stefan & All:
>
> I have found that vhost-vsock has two performance problems, even though it
> is not designed for performance.
>
> First, I think vhost-vsock should be faster than vhost-net because there is
> no TCP/IP stack, but in real tests vhost-net is 5~10 times faster than
> vhost-vsock; I am currently looking for the reason.
 TCP/IP is not a must for vhost-net.

 How do you test and compare the performance?

 Thanks

>>> I test the performance with my own test tool, as follows:
>>>
>>> Server   Client
>>> socket()
>>> bind()
>>> listen()
>>>
>>>   socket(AF_VSOCK) or socket(AF_INET)
>>> Accept() <-->connect()
>>>   *==Start Record Time==*
>>>   Call syscall sendfile()
>>> Recv()
>>>   Send end
>>> Receive end
>>> Send(file_size)
>>>   Recv(file_size)
>>>   *==End Record Time==*
>>>
>>> The test result: vhost-vsock is about 500MB/s, and vhost-net is about 2500MB/s.
>>>
>>> By the way, vhost-net uses a single queue.
>>>
>>> Thanks.
>>>
> Second, vhost-vsock only supports two vqs (tx and rx), which means multiple
> sockets in the guest share the same vq to transmit messages and get responses.
> So if there are multiple applications in the guest, we should support a
> "Multiqueue" feature for virtio-vsock.
>
> Stefan, have you encountered these problems?
>
> Thanks,
> Yiwen.
>


>>>
>> Hi Jason and Stefan,
>>
>> I may have found the reason for the bad performance.
>>
>> I found that pkt_len is limited to VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE (4K),
>> which limits the bandwidth to 500~600MB/s. Once I increase it to 64K,
>> throughput improves by about 3x (~1500MB/s).
> 
> 
> Looks like the value was chosen as a balance between rx buffer size and
> performance. Always allocating 64K even for small packets is wasteful and
> stresses guest memory. Virtio-net avoids this with mergeable rx buffers,
> which allow a big packet to be scattered across different buffers. We can
> reuse this idea or revisit the idea of using virtio-net/vhost-net as a
> transport for vsock.
> 
> What is interesting is that the performance is still behind vhost-net.
> 
> Thanks
> 

Actually, I don't understand why pkt_len is limited to
VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE in virtio_transport_send_pkt_info();
I think it should use VIRTIO_VSOCK_MAX_PKT_BUF_SIZE instead.
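
Roughly, the clamp in question looks like this (a simplified sketch of
virtio_transport_send_pkt_info(), not the exact kernel code):

	/* Current behaviour: every send is cut down to the default rx
	 * buffer size (4K), so a 64K write becomes 16 separate packets.
	 */
	if (pkt_len > VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE)
		pkt_len = VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE;

	/* Suggested change: clamp only to the per-packet maximum, so a
	 * single packet can carry up to 64K.
	 */
	if (pkt_len > VIRTIO_VSOCK_MAX_PKT_BUF_SIZE)
		pkt_len = VIRTIO_VSOCK_MAX_PKT_BUF_SIZE;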

Thanks.

>>
>> By the way, I send 64K at a time in the application, and I don't use
>> sg_init_one; I rewrote the function to pack an sg list, because pkt_len
>> spans multiple pages.
>>
>> Thanks,
>> Yiwen.
>>



Re: [RFC] VSOCK: The performance problem of vhost_vsock.

2018-10-17 Thread Jason Wang


On 2018/10/17 5:39 PM, Jason Wang wrote:



Hi Jason and Stefan,

I may have found the reason for the bad performance.

I found that pkt_len is limited to VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE (4K),
which limits the bandwidth to 500~600MB/s. Once I increase it to 64K,
throughput improves by about 3x (~1500MB/s).



Looks like the value was chosen as a balance between rx buffer size and
performance. Always allocating 64K even for small packets is wasteful and
stresses guest memory. Virtio-net avoids this with mergeable rx buffers,
which allow a big packet to be scattered across different buffers. We can
reuse this idea or revisit the idea of using virtio-net/vhost-net as a
transport for vsock.


What is interesting is that the performance is still behind vhost-net.

Thanks



By the way, I send 64K at a time in the application, and I don't use
sg_init_one; I rewrote the function to pack an sg list, because pkt_len
spans multiple pages.

Thanks,
Yiwen. 



Btw, if you're using vsock for transferring large files, maybe it's more
efficient to implement sendpage() for vsock to allow sendfile()/splice()
to work.
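
For illustration only, a naive (non-zero-copy) way to wire that up could
look roughly like the sketch below; vsock_stream_sendpage and its hookup
are hypothetical, and a real zero-copy implementation would hand the page
reference down to the transport instead of copying it.

/* Hypothetical sketch: map the page and fall back to the normal sendmsg
 * path.  Not zero copy, but it lets sendfile()/splice() succeed instead
 * of failing.
 */
static ssize_t vsock_stream_sendpage(struct socket *sock, struct page *page,
				     int offset, size_t size, int flags)
{
	struct msghdr msg = { .msg_flags = flags };
	struct kvec iov = {
		.iov_base = (char *)kmap(page) + offset,
		.iov_len  = size,
	};
	ssize_t ret;

	ret = kernel_sendmsg(sock, &msg, &iov, 1, size);
	kunmap(page);
	return ret;
}

/* ...and in the stream proto_ops (hypothetical):
 *	.sendpage = vsock_stream_sendpage,
 */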


Thanks


Re: [RFC] VSOCK: The performance problem of vhost_vsock.

2018-10-17 Thread Jason Wang


On 2018/10/17 5:27 PM, jiangyiwen wrote:

On 2018/10/15 14:12, jiangyiwen wrote:

On 2018/10/15 10:33, Jason Wang wrote:


On 2018/10/15 09:43, jiangyiwen wrote:

Hi Stefan & All:

I have found that vhost-vsock has two performance problems, even though it
is not designed for performance.

First, I think vhost-vsock should be faster than vhost-net because there is
no TCP/IP stack, but in real tests vhost-net is 5~10 times faster than
vhost-vsock; I am currently looking for the reason.

TCP/IP is not a must for vhost-net.

How do you test and compare the performance?

Thanks


I test the performance with my own test tool, as follows:

Server   Client
socket()
bind()
listen()

  socket(AF_VSOCK) or socket(AF_INET)
Accept() <-->connect()
  *==Start Record Time==*
  Call syscall sendfile()
Recv()
  Send end
Receive end
Send(file_size)
  Recv(file_size)
  *==End Record Time==*

The test result: vhost-vsock is about 500MB/s, and vhost-net is about 2500MB/s.

By the way, vhost-net uses a single queue.

Thanks.


Second, vhost-vsock only supports two vqs (tx and rx), which means multiple
sockets in the guest share the same vq to transmit messages and get responses.
So if there are multiple applications in the guest, we should support a
"Multiqueue" feature for virtio-vsock.

Stefan, have you encountered these problems?

Thanks,
Yiwen.







Hi Jason and Stefan,

I may have found the reason for the bad performance.

I found that pkt_len is limited to VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE (4K),
which limits the bandwidth to 500~600MB/s. Once I increase it to 64K,
throughput improves by about 3x (~1500MB/s).



Looks like the value was chosen as a balance between rx buffer size and
performance. Always allocating 64K even for small packets is wasteful and
stresses guest memory. Virtio-net avoids this with mergeable rx buffers,
which allow a big packet to be scattered across different buffers. We can
reuse this idea or revisit the idea of using virtio-net/vhost-net as a
transport for vsock.
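
Conceptually, the mergeable-rx-buffer trick looks like the sketch below
(hypothetical helpers, modeled on virtio-net's VIRTIO_NET_F_MRG_RXBUF, not
real vhost/virtio code): the guest keeps posting small buffers, and the
host records in the header how many of them one large packet consumed.

struct mrg_hdr {
	/* ...usual per-packet fields... */
	uint16_t num_buffers;	/* rx buffers consumed by this packet */
};

/* Host side: fill as many posted small buffers as the payload needs. */
static void push_large_pkt(struct vq *rxq, const char *data, size_t len)
{
	struct mrg_hdr *hdr = NULL;
	uint16_t used = 0;

	while (len) {
		struct buf *b = vq_pop_avail(rxq);	/* small, e.g. 4K, guest rx buffer */
		size_t room = b->size;
		size_t chunk;

		if (!used) {
			hdr = buf_reserve_hdr(b);	/* header lives in the first buffer */
			room -= sizeof(*hdr);
		}
		chunk = len < room ? len : room;
		buf_copy_in(b, data, chunk);		/* appends after the reserved header */
		vq_push_used(rxq, b, buf_len(b));	/* header + payload actually written */
		data += chunk;
		len -= chunk;
		used++;
	}
	hdr->num_buffers = used;	/* guest stitches 'used' buffers back into one packet */
}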


What is interesting is that the performance is still behind vhost-net.

Thanks



By the way, I send 64K at a time in the application, and I don't use
sg_init_one; I rewrote the function to pack an sg list, because pkt_len
spans multiple pages.

Thanks,
Yiwen.







Re: [RFC] VSOCK: The performance problem of vhost_vsock.

2018-10-17 Thread jiangyiwen
On 2018/10/15 14:12, jiangyiwen wrote:
> On 2018/10/15 10:33, Jason Wang wrote:
>>
>>
>> On 2018/10/15 09:43, jiangyiwen wrote:
>>> Hi Stefan & All:
>>>
>>> I have found that vhost-vsock has two performance problems, even though it
>>> is not designed for performance.
>>>
>>> First, I think vhost-vsock should be faster than vhost-net because there is
>>> no TCP/IP stack, but in real tests vhost-net is 5~10 times faster than
>>> vhost-vsock; I am currently looking for the reason.
>>
>> TCP/IP is not a must for vhost-net.
>>
>> How do you test and compare the performance?
>>
>> Thanks
>>
> 
> I test the performance with my own test tool, as follows:
> 
> Server   Client
> socket()
> bind()
> listen()
> 
>  socket(AF_VSOCK) or socket(AF_INET)
> Accept() <-->connect()
>  *==Start Record Time==*
>  Call syscall sendfile()
> Recv()
>  Send end
> Receive end
> Send(file_size)
>  Recv(file_size)
>  *==End Record Time==*
> 
> The test result: vhost-vsock is about 500MB/s, and vhost-net is about 2500MB/s.
> 
> By the way, vhost-net uses a single queue.
> 
> Thanks.
> 
>>> Second, vhost-vsock only supports two vqs (tx and rx), which means multiple
>>> sockets in the guest share the same vq to transmit messages and get responses.
>>> So if there are multiple applications in the guest, we should support a
>>> "Multiqueue" feature for virtio-vsock.
>>>
>>> Stefan, have you encountered these problems?
>>>
>>> Thanks,
>>> Yiwen.
>>>
>>
>>
>>
> 
> 

Hi Jason and Stefan,

I may have found the reason for the bad performance.

I found that pkt_len is limited to VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE (4K),
which limits the bandwidth to 500~600MB/s. Once I increase it to 64K,
throughput improves by about 3x (~1500MB/s).

By the way, I send 64K at a time in the application, and I don't use
sg_init_one; I rewrote the function to pack an sg list, because pkt_len
spans multiple pages.

Thanks,
Yiwen.




Re: [RFC] VSOCK: The performance problem of vhost_vsock.

2018-10-15 Thread jiangyiwen
On 2018/10/15 10:33, Jason Wang wrote:
> 
> 
> On 2018/10/15 09:43, jiangyiwen wrote:
>> Hi Stefan & All:
>>
>> I have found that vhost-vsock has two performance problems, even though it
>> is not designed for performance.
>>
>> First, I think vhost-vsock should be faster than vhost-net because there is
>> no TCP/IP stack, but in real tests vhost-net is 5~10 times faster than
>> vhost-vsock; I am currently looking for the reason.
> 
> TCP/IP is not a must for vhost-net.
> 
> How do you test and compare the performance?
> 
> Thanks
> 

I test the performance with my own test tool, as follows:

Server   Client
socket()
bind()
listen()

 socket(AF_VSOCK) or socket(AF_INET)
Accept() <-->connect()
 *==Start Record Time==*
 Call syscall sendfile()
Recv()
 Send end
Receive end
Send(file_size)
 Recv(file_size)
 *==End Record Time==*

The test result: vhost-vsock is about 500MB/s, and vhost-net is about 2500MB/s.

By the way, vhost-net uses a single queue.
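
For reference, the client side of such a test could look roughly like the
sketch below (placeholder CID, port, and file name, not the actual tool):

/* Sketch of a vsock client driving the sendfile() test above. */
#include <fcntl.h>
#include <unistd.h>
#include <sys/stat.h>
#include <sys/socket.h>
#include <sys/sendfile.h>
#include <linux/vm_sockets.h>

int main(void)
{
	struct sockaddr_vm addr = {
		.svm_family = AF_VSOCK,
		.svm_cid    = VMADDR_CID_HOST,	/* host CID, i.e. 2 */
		.svm_port   = 1234,
	};
	int fd = open("bigfile", O_RDONLY);
	int sock = socket(AF_VSOCK, SOCK_STREAM, 0);
	struct stat st;
	off_t off = 0;

	if (fd < 0 || sock < 0 || fstat(fd, &st) < 0)
		return 1;
	if (connect(sock, (struct sockaddr *)&addr, sizeof(addr)) < 0)
		return 1;
	/* sendfile() only helps once vsock implements sendpage();
	 * otherwise a read()/send() loop is needed instead.
	 */
	while (off < st.st_size)
		if (sendfile(sock, fd, &off, st.st_size - off) <= 0)
			break;
	close(sock);
	close(fd);
	return 0;
}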

Thanks.

>> Second, vhost-vsock only supports two vqs (tx and rx), which means multiple
>> sockets in the guest share the same vq to transmit messages and get responses.
>> So if there are multiple applications in the guest, we should support a
>> "Multiqueue" feature for virtio-vsock.
>>
>> Stefan, have you encountered these problems?
>>
>> Thanks,
>> Yiwen.
>>



Re: [RFC] VSOCK: The performance problem of vhost_vsock.

2018-10-14 Thread Jason Wang



On 2018/10/15 09:43, jiangyiwen wrote:

Hi Stefan & All:

I have found that vhost-vsock has two performance problems, even though it
is not designed for performance.

First, I think vhost-vsock should be faster than vhost-net because there is
no TCP/IP stack, but in real tests vhost-net is 5~10 times faster than
vhost-vsock; I am currently looking for the reason.


TCP/IP is not a must for vhost-net.

How do you test and compare the performance?

Thanks


Second, vhost-vsock only supports two vqs (tx and rx), which means multiple
sockets in the guest share the same vq to transmit messages and get responses.
So if there are multiple applications in the guest, we should support a
"Multiqueue" feature for virtio-vsock.

Stefan, have you encountered these problems?

Thanks,
Yiwen.


