Hi Sugesh,
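I did more performance testing on it.
In an ovs-dpdk + VM environment, I ran qperf on the VM side and got these
performance numbers (the left column uses hardware checksum, the right column
uses software checksum; one row per message size, from 1 byte to 64 KB,
doubling each time).
In the TCP throughput results, HW checksum shows a big improvement for the
larger message sizes (for example 317 MB/sec vs 253 MB/sec, and 1.11 GB/sec
vs 729 MB/sec). I would like to enable HW-TCP-CKSUM by default in the next
patch.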
[root@localhost ~]# qperf -t 60 -oo msg_size:1:64K:*2 10.100.85.247 tcp_bw tcp_lat

tcp_bw:         HW-CKSUM        SW-CKSUM (in VM)
    bw      =   1.91 MB/sec     1.93 MB/sec
    bw      =   4 MB/sec        3.97 MB/sec
    bw      =   7.74 MB/sec     7.76 MB/sec
    bw      =   14.7 MB/sec     14.7 MB/sec
    bw      =   27.8 MB/sec     27.4 MB/sec
    bw      =   51.3 MB/sec     49.1 MB/sec
    bw      =   87.5 MB/sec     83.1 MB/sec
    bw      =   144 MB/sec      129 MB/sec
    bw      =   203 MB/sec      189 MB/sec
    bw      =   261 MB/sec      252 MB/sec
    bw      =   317 MB/sec      253 MB/sec
    bw      =   400 MB/sec      307 MB/sec
    bw      =   611 MB/sec      491 MB/sec
    bw      =   912 MB/sec      662 MB/sec
    bw      =   1.11 GB/sec     729 MB/sec
    bw      =   1.17 GB/sec     861 MB/sec
    bw      =   1.17 GB/sec     1.08 GB/sec

tcp_lat:        HW-CKSUM        SW-CKSUM (in VM)
    latency =   29.1 us         29.4 us
    latency =   28.8 us         29.1 us
    latency =   29 us           28.9 us
    latency =   28.7 us         29.2 us
    latency =   29.2 us         28.9 us
    latency =   28.9 us         29.1 us
    latency =   29.4 us         29.4 us
    latency =   29.6 us         29.9 us
    latency =   30.5 us         30.4 us
    latency =   47.1 us         39.8 us
    latency =   53.6 us         45.2 us
    latency =   43.5 us         44.4 us
    latency =   53.8 us         49.1 us
    latency =   81.8 us         78.5 us
    latency =   82.3 us         83.3 us
    latency =   93.1 us         97.2 us
    latency =   237 us          211 us
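
To put a number on the improvement, the relative gain can be computed from any
tcp_bw row above. A minimal sketch in C; the function name is made up for
illustration, and both inputs are assumed to be in the same unit (MB/sec as
printed by qperf):

```c
#include <assert.h>

/* Relative gain of the hardware-checksum result over the software-checksum
 * result, in percent.  Hypothetical helper for illustration only; both
 * arguments must be in the same unit. */
double
cksum_gain_pct(double hw, double sw)
{
    assert(sw > 0.0);   /* A zero baseline would make the ratio meaningless. */
    return (hw - sw) / sw * 100.0;
}
```

For the 317 MB/sec vs 253 MB/sec row this gives about 25%, and for the
912 MB/sec vs 662 MB/sec row about 38%.
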
2017-06-23 15:58 GMT+08:00 Chandran, Sugesh <[email protected]>:
>
>
>
>
> *Regards*
>
> *_Sugesh*
>
>
>
> *From:* Gao Zhenyu [mailto:[email protected]]
> *Sent:* Wednesday, June 21, 2017 9:32 AM
> *To:* Chandran, Sugesh <[email protected]>
> *Cc:* [email protected]; [email protected]; [email protected]; Kavanagh,
> Mark B <[email protected]>; [email protected]
> *Subject:* Re: [ovs-dev] [RFC PATCH v1] net-dpdk: Introducing TX tcp HW
> checksum offload support for DPDK pnic
>
>
>
> I get it. Maybe calculating it on the OVS side is doable as well.
>
> So, how about adding more options to let people choose HW-tcp-cksum (reduces
> CPU cycles) or SW-tcp-cksum (may give better performance)?
>
> Then we have NO-TCP-CKSUM, SW-TCP-CKSUM, HW-TCP-CKSUM.
>
> *[Sugesh] In OVS-DPDK, I am not sure about the advantage of having HW
> checksum, because even if you save CPU cycles, they will be spent on the
> non-vector Tx path.*
>
> *So I would prefer to keep these options only if there is really a need
> for them.*
>
> BTW, when will DPDK support tx checksum offload with vectorization?
>
> *[Sugesh] I don’t see any plan to do that in the near future. It could be
> worth asking on the DPDK mailing list as well.*
>
>
>
> Thanks
>
> Zhenyu Gao
>
>
>
>
>
> 2017-06-21 16:03 GMT+08:00 Chandran, Sugesh <[email protected]>:
>
>
>
>
>
> *Regards*
>
> *_Sugesh*
>
>
>
> *From:* Gao Zhenyu [mailto:[email protected]]
> *Sent:* Monday, June 19, 2017 1:23 PM
> *To:* Chandran, Sugesh <[email protected]>
> *Cc:* [email protected]; [email protected]; [email protected]; Kavanagh,
> Mark B <[email protected]>; [email protected]
> *Subject:* Re: [ovs-dev] [RFC PATCH v1] net-dpdk: Introducing TX tcp HW
> checksum offload support for DPDK pnic
>
>
>
> Thanks for the comments.
>
> [Sugesh] Any reason, why this patch does only the TCP checksum offload??
> The command line option says tx_checksum offload (it could be mistakenly
> considered for full checksum offload).
>
> [Zhenyu Gao] DPDK NICs support many HW offload features, such as IPv4,
> IPv6, TCP, UDP, VXLAN and GRE. I would like to make them work step by step.
> A huge patch may introduce more potential issues.
>
> TCP offload is a basic and essential feature, so I prefer to implement it
> first.
>
> *[Sugesh] Ok, Fine!*
>
>
>
> [Sugesh] What is the performance improvement offered with this feature? Do
> you have any numbers to share?
> I think DPDK uses non-vector functions when Tx checksum offload is
> enabled. Will it give enough performance improvement to mitigate that cost?
>
> [Zhenyu Gao] It is a draft patch to collect advice and suggestions. In my
> draft testing, it shows neither improvement nor regression.
>
> In an ovs-dpdk + veth environment, veth enables TCP checksum offload by
> default. This causes TCP connection issues: veth assumes the checksum will
> be offloaded and leaves it to OVS, but the DPDK side does not perform the
> offload.
>
> So with the original ovs-dpdk I have to run "ethtool -K eth1 tx off" to
> disable all Tx offloading, which means we cannot use TSO either.
>
> *[Sugesh] This is a concern. We have to consider other use cases as well.
> Most high-performance ovs-dpdk applications don’t use any
> kernel/veth pair interfaces in the OVS-DPDK datapath.*
>
>
>
>
>
> It is an ovs-dpdk + veth environment, so sendmsg/recvmsg are used for
> TX/RX on the ovs-dpdk side. netperf was run on the ovs-dpdk + veth side.
> The veth side had tx-tcp HW cksum enabled and TSO disabled. The bottleneck
> was not in the checksum, so testing in a vhost VM would be more reasonable.
>
> *[Sugesh] I agree with you. But it's worthwhile to know what the
> performance delta is. Also, if the cost of giving up vectorization is high,
> we may consider doing the checksum calculation in software itself. I feel
> x86 instructions can do the checksum calculation pretty efficiently. Have
> you considered that option?*
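
For reference, the software calculation discussed here is the 16-bit
one's-complement Internet checksum (RFC 1071) that TCP uses. The following is
a minimal, unoptimized sketch in portable C, not OVS or DPDK code: the function
name is hypothetical, the TCP pseudo-header is left out, and no vector
instructions are used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not an OVS or DPDK function.
 * Computes the RFC 1071 Internet checksum over 'len' bytes.  Bytes are read
 * as big-endian 16-bit words; the result is returned as a host-order number
 * (0xb861 means the bytes b8 61 on the wire). */
uint16_t
inet_csum_sketch(const uint8_t *data, size_t len)
{
    uint64_t sum = 0;
    size_t i;

    assert(data != NULL || len == 0);

    /* Add up all complete 16-bit words. */
    for (i = 0; i + 1 < len; i += 2) {
        sum += ((uint32_t) data[i] << 8) | data[i + 1];
    }
    /* A trailing odd byte is padded with a zero byte on the right. */
    if (len & 1) {
        sum += (uint32_t) data[len - 1] << 8;
    }
    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16) {
        sum = (sum & 0xffff) + (sum >> 16);
    }
    /* The checksum is the one's complement of the folded sum. */
    return (uint16_t) ~sum;
}
```

A real TCP checksum would also cover the pseudo-header (source and destination
addresses, protocol and TCP length) before the final fold.
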
>
>
> [root@16ee46e4b793 ~]# netperf -H 10.100.85.247 -t TCP_RR -l 10
> MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.100.85.247 () port 0 AF_INET : first burst 0
> Local /Remote
> Socket Size   Request  Resp.   Elapsed  Trans.
> Send   Recv   Size     Size    Time     Rate
> bytes  Bytes  bytes    bytes   secs.    per sec
>
> 16384  87380  1        1       10.00    15001.87 (HW tcp-cksum)
>                                         15062.72 (No HW tcp-cksum)
> 16384  87380
>
>
> [root@16ee46e4b793 ~]# netperf -H 10.100.85.247 -t TCP_STREAM -l 10
> MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.100.85.247 () port 0 AF_INET
> Recv   Send    Send
> Socket Socket  Message  Elapsed
> Size   Size    Size     Time     Throughput
> bytes  bytes   bytes    secs.    10^6bits/sec
>
> 87380  16384   16384    10.02    263.41 (HW tcp-cksum)
>                                  265.31 (No HW tcp-cksum)
>
>
>
> I would like to keep it disabled by default unless we implement more Tx
> offloads such as TSO. (Do you have any concerns about that?) BTW, I think I
> can rename NETDEV_TX_CHECKSUM_OFFLOAD to NETDEV_TX_TCP_CHECKSUM_OFFLOAD.
>
> Please let me know if you have any questions. :)
>
> *[Sugesh] In the Rx checksum offload case, it works with vector
> instructions. The latest DPDK supports Rx checksum offload with
> vectorization.*
>
> Thanks
>
>
>
> 2017-06-19 17:26 GMT+08:00 Chandran, Sugesh <[email protected]>:
>
> Hi Zhenyu,
>
> Thank you for working on this.
> I have a couple of questions on this patch.
>
> Regards
> _Sugesh
>
>
> > -----Original Message-----
> > From: [email protected] [mailto:ovs-dev-
> > [email protected]] On Behalf Of Zhenyu Gao
> > Sent: Friday, June 16, 2017 1:54 PM
> > To: [email protected]; [email protected]; [email protected]; Kavanagh,
> > Mark B <[email protected]>; [email protected]
> > Subject: [ovs-dev] [RFC PATCH v1] net-dpdk: Introducing TX tcp HW
> > checksum offload support for DPDK pnic
> >
> > This patch introduces TX tcp-checksum offload support for DPDK pnic.
> > The feature is disabled by default and can be enabled by setting
> > tx-checksum-offload, like:
> > ovs-vsctl set Interface dpdk-eth3 \
> > options:tx-checksum-offload=true
> > ---
> > lib/netdev-dpdk.c    | 112 +++++++++++++++++++++++++++++++++++++++++++++++----
> > vswitchd/vswitch.xml |  13 ++++--
> > 2 files changed, 115 insertions(+), 10 deletions(-)
> >
> > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> > index bba4de3..5a68a48 100644
> > --- a/lib/netdev-dpdk.c
> > +++ b/lib/netdev-dpdk.c
> > @@ -32,6 +32,7 @@
> > #include <rte_mbuf.h>
> > #include <rte_meter.h>
> > #include <rte_virtio_net.h>
> > +#include <rte_ip.h>
> >
> > #include "dirs.h"
> > #include "dp-packet.h"
> > @@ -328,6 +329,7 @@ struct ingress_policer {
> >
> > enum dpdk_hw_ol_features {
> > NETDEV_RX_CHECKSUM_OFFLOAD = 1 << 0,
> > + NETDEV_TX_CHECKSUM_OFFLOAD = 1 << 1,
> > };
> >
> > struct netdev_dpdk {
> > @@ -649,6 +651,8 @@ dpdk_eth_dev_queue_setup(struct netdev_dpdk *dev, int n_rxq, int n_txq)
> > int diag = 0;
> > int i;
> > struct rte_eth_conf conf = port_conf;
> > + struct rte_eth_txconf *txconf;
> > + struct rte_eth_dev_info dev_info;
> >
> > if (dev->mtu > ETHER_MTU) {
> > conf.rxmode.jumbo_frame = 1;
> > @@ -676,9 +680,16 @@ dpdk_eth_dev_queue_setup(struct netdev_dpdk *dev, int n_rxq, int n_txq)
> > break;
> > }
> >
> > + rte_eth_dev_info_get(dev->port_id, &dev_info);
> > + txconf = &dev_info.default_txconf;
> > + if (dev->hw_ol_features & NETDEV_TX_CHECKSUM_OFFLOAD) {
> > + /*Enable tx offload feature on pnic*/
> > + txconf->txq_flags = 0;
> > + }
> > +
> > for (i = 0; i < n_txq; i++) {
> > diag = rte_eth_tx_queue_setup(dev->port_id, i, dev->txq_size,
> > - dev->socket_id, NULL);
> > + dev->socket_id, txconf);
> > if (diag) {
> > VLOG_INFO("Interface %s txq(%d) setup error: %s",
> > dev->up.name, i, rte_strerror(-diag));
> > @@ -724,11 +735,15 @@ dpdk_eth_checksum_offload_configure(struct netdev_dpdk *dev)
> > {
> > struct rte_eth_dev_info info;
> > bool rx_csum_ol_flag = false;
> > + bool tx_csum_ol_flag = false;
> > uint32_t rx_chksm_offload_capa = DEV_RX_OFFLOAD_UDP_CKSUM |
> > DEV_RX_OFFLOAD_TCP_CKSUM |
> > DEV_RX_OFFLOAD_IPV4_CKSUM;
> > + uint32_t tx_chksm_offload_capa = DEV_TX_OFFLOAD_TCP_CKSUM;
>
> [Sugesh] Any reason, why this patch does only the TCP checksum offload??
> The command line option says tx_checksum offload (it could be mistakenly
> considered for full checksum offload).
>
> > +
> > rte_eth_dev_info_get(dev->port_id, &info);
> > rx_csum_ol_flag = (dev->hw_ol_features &
> > NETDEV_RX_CHECKSUM_OFFLOAD) != 0;
> > + tx_csum_ol_flag = (dev->hw_ol_features &
> > + NETDEV_TX_CHECKSUM_OFFLOAD) != 0;
> >
> > if (rx_csum_ol_flag &&
> > (info.rx_offload_capa & rx_chksm_offload_capa) !=
> > @@ -736,9 +751,15 @@ dpdk_eth_checksum_offload_configure(struct netdev_dpdk *dev)
> > VLOG_WARN_ONCE("Rx checksum offload is not supported on device %"PRIu8,
> > dev->port_id);
> > dev->hw_ol_features &= ~NETDEV_RX_CHECKSUM_OFFLOAD;
> > - return;
> > + } else if (tx_csum_ol_flag &&
> > + (info.tx_offload_capa & tx_chksm_offload_capa) !=
> > + tx_chksm_offload_capa) {
> > + VLOG_WARN_ONCE("Tx checksum offload is not supported on device %"PRIu8,
> > + dev->port_id);
> > + dev->hw_ol_features &= ~NETDEV_TX_CHECKSUM_OFFLOAD;
> > + } else {
> > + netdev_request_reconfigure(&dev->up);
> > }
> > - netdev_request_reconfigure(&dev->up);
> > }
> >
> > --
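
The capability test in the last hunk above is a bitmask subset check: the
offload is treated as supported only when every requested bit is present in
the device's capability mask. A standalone sketch in C, with made-up flag
values standing in for the real DEV_TX_OFFLOAD_* bits and a hypothetical
function name:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Made-up capability bits for this sketch only; they are NOT the values of
 * the DPDK DEV_TX_OFFLOAD_* macros. */
enum {
    SKETCH_TX_IPV4_CKSUM = 1u << 0,
    SKETCH_TX_UDP_CKSUM  = 1u << 1,
    SKETCH_TX_TCP_CKSUM  = 1u << 2,
};

/* Hypothetical helper: returns true only if every bit set in 'wanted' is
 * also set in 'capa'. */
bool
offload_supported(uint32_t capa, uint32_t wanted)
{
    assert(wanted != 0);    /* An empty request is treated as a caller bug. */
    return (capa & wanted) == wanted;
}
```

With this shape, a device that advertises only part of the requested mask is
reported as unsupported, which matches the "!=" comparison used in the patch.
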
>
> [Sugesh] What is the performance improvement offered with this feature? Do
> you have any numbers to share?
> I think DPDK uses non-vector functions when Tx checksum offload is
> enabled. Will it give enough performance improvement to mitigate that cost?
>
> Finally, Rx checksum offload is going to be a default option (there won't
> be any configuration option to enable/disable it; Kevin's patch for the
> support is already acked and waiting to be merged). Similarly, can't we
> enable Tx checksum offload by default when it is supported?
>
>
>
> > 1.8.3.1
>
> >
> > _______________________________________________
> > dev mailing list
> > [email protected]
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
>
>
>
>