Re: [Qemu-devel] [ RFC Patch v6 0/2] Support Receive-Segment-Offload(RSC) for WHQL

2016-05-29 Thread Wei Xu


On 2016-05-30 12:22, Jason Wang wrote:



On 2016-05-29 00:37, w...@redhat.com wrote:

From: Wei Xu 

Changes in V6:
- Sync upstream code
- Split new fields in 'virtio_net_hdr' to a separate patch
- Remove feature bit code, replace it with a command line parameter
  'guest_rsc', which is turned off by default.

Changes in V5:
- Passed all IPv4/6 test cases
- Add new fields in 'virtio_net_hdr'
- Set 'gso_type' & 'coalesced packets' in new field.
- Bypass all 'tcp option' packets
- Bypass all 'pure ack' packets
- Bypass all 'duplicate ack' packets
- Change 'guest_rsc' feature bit to 'false' by default
- Feedback from v4, typos, etc.


A change log is very important for making review easier and faster. More
details are more than welcome. But I see some changes that are not
documented here. Please give a more complete one in the next iteration.

OK.




Note:
There are still a few pending issues with the feature bit that need to be
discussed with the Windows driver maintainer, so Linux guests do not
currently work with this patch. I haven't figured out why yet, but I'm
guessing it is caused by 'gso_type' being set to
'VIRTIO_NET_HDR_GSO_TCPV4/6'. I will fix it once the final solution is
settled. The test steps and performance data below are based on v4.


This is probably because you've increased the vnet header length.










Re: [Qemu-devel] [ RFC Patch v6 0/2] Support Receive-Segment-Offload(RSC) for WHQL

2016-05-29 Thread Jason Wang



On 2016-05-29 00:37, w...@redhat.com wrote:

From: Wei Xu 

Changes in V6:
- Sync upstream code
- Split new fields in 'virtio_net_hdr' to a separate patch
- Remove feature bit code, replace it with a command line parameter
  'guest_rsc', which is turned off by default.

Changes in V5:
- Passed all IPv4/6 test cases
- Add new fields in 'virtio_net_hdr'
- Set 'gso_type' & 'coalesced packets' in new field.
- Bypass all 'tcp option' packets
- Bypass all 'pure ack' packets
- Bypass all 'duplicate ack' packets
- Change 'guest_rsc' feature bit to 'false' by default
- Feedback from v4, typos, etc.


A change log is very important for making review easier and faster. More
details are more than welcome. But I see some changes that are not
documented here. Please give a more complete one in the next iteration.




Note:
There are still a few pending issues with the feature bit that need to be
discussed with the Windows driver maintainer, so Linux guests do not
currently work with this patch. I haven't figured out why yet, but I'm
guessing it is caused by 'gso_type' being set to
'VIRTIO_NET_HDR_GSO_TCPV4/6'. I will fix it once the final solution is
settled. The test steps and performance data below are based on v4.


This is probably because you've increased the vnet header length.









[Qemu-devel] [ RFC Patch v6 0/2] Support Receive-Segment-Offload(RSC) for WHQL

2016-05-28 Thread wexu
From: Wei Xu 

Changes in V6:
- Sync upstream code
- Split new fields in 'virtio_net_hdr' to a separate patch
- Remove feature bit code, replace it with a command line parameter
  'guest_rsc', which is turned off by default.

Changes in V5:
- Passed all IPv4/6 test cases
- Add new fields in 'virtio_net_hdr'
- Set 'gso_type' & 'coalesced packets' in new field.
- Bypass all 'tcp option' packets
- Bypass all 'pure ack' packets
- Bypass all 'duplicate ack' packets
- Change 'guest_rsc' feature bit to 'false' by default
- Feedback from v4, typos, etc.
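
To illustrate the three bypass rules listed for V5, here is a minimal
sketch of how a received segment might be classified before coalescing.
It is not the code from the patch: 'struct seg_summary',
'enum bypass_reason' and 'classify_segment()' are hypothetical names made
up for this example, and "duplicate ack" is simplified to "no payload and
the same ack number as the previous segment of the flow".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified summary of one received TCP segment. */
struct seg_summary {
    uint8_t  tcp_hdr_len;  /* TCP header length in bytes (data offset * 4) */
    uint16_t payload_len;  /* TCP payload length in bytes */
    uint32_t ack;          /* acknowledgment number */
};

/* Why a segment is passed straight to the guest instead of being coalesced. */
enum bypass_reason {
    COALESCE_CANDIDATE = 0, /* none of the bypass rules matched */
    BYPASS_TCP_OPTION,      /* header is longer than the 20-byte base header */
    BYPASS_DUP_ACK,         /* no payload, same ack number as previous segment */
    BYPASS_PURE_ACK,        /* no payload */
};

#define TCP_BASE_HDR_LEN 20

/*
 * Classify 'seg' against the previous segment of the same flow.
 * 'prev' may be NULL when the flow has no earlier segment.
 * Assumption: this ordering of the checks is for illustration only.
 */
static enum bypass_reason classify_segment(const struct seg_summary *seg,
                                           const struct seg_summary *prev)
{
    assert(seg != NULL);

    if (seg->tcp_hdr_len > TCP_BASE_HDR_LEN) {
        return BYPASS_TCP_OPTION;
    }
    if (seg->payload_len == 0) {
        if (prev != NULL && seg->ack == prev->ack) {
            return BYPASS_DUP_ACK;
        }
        return BYPASS_PURE_ACK;
    }
    return COALESCE_CANDIDATE;
}
```

Only segments that come back as COALESCE_CANDIDATE would be considered
for merging; everything else is delivered unchanged.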

Note:
There are still a few pending issues with the feature bit that need to be
discussed with the Windows driver maintainer, so Linux guests do not
currently work with this patch. I haven't figured out why yet, but I'm
guessing it is caused by 'gso_type' being set to
'VIRTIO_NET_HDR_GSO_TCPV4/6'. I will fix it once the final solution is
settled. The test steps and performance data below are based on v4.

Another suggestion from Jason is to adjust part of the code to make it
more readable. Since the flowchart may still change a little in the
future (for example around timestamps and duplicate acks), I'd like to
postpone that for now.

Changes in V4:
- Add new host feature bit
- Replace the fixed header length with a dynamic header length in VirtIONet
- Change ip/ip6 header union in NetRscUnit to void* pointer
- Add macro prefix, adjust code indent, etc.

Changes in V3:
- Removed big param list, replaced it with 'NetRscUnit'
- Different virtio header size
- Modify callback function to direct call.
- No need to check for failure of g_malloc()
- Other code format adjustment, macro naming, etc.

Changes in V2:
- Add detailed commit log

This patch is to support the WHQL test for Windows guests. The feature
also benefits other guests, working as a kernel 'gro'-like feature with
a userspace implementation.
Feature information:
  http://msdn.microsoft.com/en-us/library/windows/hardware/jj853324

Both IPv4 and IPv6 are supported. Although userspace virtio is slower
than vhost-net, this feature gives about a 1.5x to 2x performance
improvement for userspace virtio. That figure was measured with this
feature turned on and 'tso/gso/gro' disabled on the corresponding tap
interface and guest interface; the improvement is smaller with all of
those features on.

Linux guest performance data (Netperf):
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.2.101 () port 0 AF_INET : nodelay
Size    Size    Size     Time    Throughput
bytes   bytes   bytes    secs.   10^6bits/sec

 87380  16384      64    6.00    1221.20
 87380  16384      64    6.00    1260.30

 87380  16384     128    6.00    1978.51
 87380  16384     128    6.00    2286.05

 87380  16384     256    6.00    2677.94
 87380  16384     256    6.00    4615.42

 87380  16384     512    6.00    2956.54
 87380  16384     512    6.00    5356.39

 87380  16384    1024    6.00    2798.17
 87380  16384    1024    6.00    4943.30

 87380  16384    2048    6.00    2681.09
 87380  16384    2048    6.00    4835.81

 87380  16384    4096    6.00    3390.14
 87380  16384    4096    6.00    5391.54

 87380  16384    8092    6.00    3008.27
 87380  16384    8092    6.00    5381.68

 87380  16384   10240    6.00    2999.89
 87380  16384   10240    6.00    5393.11
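
As a rough cross-check of the "1.5x to 2x" figure against the numbers
above, the following sketch computes the ratio between the two rows
reported for each message size. The output does not label the rows, so
treating the first row of each pair as the baseline and the second as
the run with coalescing enabled is an assumption of this example, and
the names 'tput_pair', 'speedup()' and 'count_in_range()' are made up
for it.

```c
#include <assert.h>
#include <stddef.h>

/* One message size with its two reported throughputs (10^6 bits/sec). */
struct tput_pair {
    int    msg_size;  /* send message size in bytes */
    double first;     /* first row of the pair (assumed: baseline) */
    double second;    /* second row of the pair (assumed: coalescing on) */
};

/* Values copied from the netperf output above. */
static const struct tput_pair netperf_pairs[] = {
    {    64, 1221.20, 1260.30 },
    {   128, 1978.51, 2286.05 },
    {   256, 2677.94, 4615.42 },
    {   512, 2956.54, 5356.39 },
    {  1024, 2798.17, 4943.30 },
    {  2048, 2681.09, 4835.81 },
    {  4096, 3390.14, 5391.54 },
    {  8092, 3008.27, 5381.68 },
    { 10240, 2999.89, 5393.11 },
};

#define NUM_PAIRS (sizeof(netperf_pairs) / sizeof(netperf_pairs[0]))

/* Ratio of the second row to the first row of one pair. */
static double speedup(const struct tput_pair *p)
{
    assert(p != NULL && p->first > 0.0);
    return p->second / p->first;
}

/* Count how many pairs have a ratio within [lo, hi]. */
static size_t count_in_range(double lo, double hi)
{
    size_t n = 0;
    for (size_t i = 0; i < NUM_PAIRS; i++) {
        double s = speedup(&netperf_pairs[i]);
        if (s >= lo && s <= hi) {
            n++;
        }
    }
    return n;
}
```

Under that assumption, the pairs from 256 bytes upward fall inside the
1.5x to 2x range, while the 64 and 128 byte pairs show a much smaller
gain.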

Test steps:
Although this feature is mainly intended for Windows guests, I used Linux
guests to help test it. To keep things simple, I tested the patch in 3
steps as I went along.

1. A TCP socket client/server pair running on 2 Linux guests, so that I
   can control the traffic and debug the code as I want.
2. Netperf on Linux guests to test the throughput.
3. WHQL test with 2 Windows guests.

Wei Xu (3):
  virtio-net rsc: support coalescing ipv4 tcp traffic
  virtio-net rsc: support coalescing ipv6 tcp traffic
  virtio-net rsc: add 2 new rsc information fields to 'virtio_net_hdr'

 hw/net/virtio-net.c                         | 642 +++-
 include/hw/virtio/virtio-net.h              |   2 +
 include/hw/virtio/virtio.h                  |  75
 include/standard-headers/linux/virtio_net.h |   3 +
 4 files changed, 721 insertions(+), 1 deletion(-)

-- 
2.7.1