I get it. Maybe calculating it on the OVS side is doable as well. So, how about adding more options to let people choose HW-tcp-cksum (fewer CPU cycles) or SW-tcp-cksum (possibly better performance)? Then we would have NO-TCP-CKSUM, SW-TCP-CKSUM, and HW-TCP-CKSUM.
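To make the SW-TCP-CKSUM idea concrete, here is a rough sketch of what a software TCP checksum over IPv4 could look like. It is plain C following the RFC 1071 one's-complement sum, not code from the patch or from OVS. The function names csum_rfc1071 and tcp_csum_ipv4 are made up for illustration, and addresses are assumed to be passed in host byte order.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: RFC 1071 one's-complement checksum over 'len' bytes,
 * read as big-endian 16-bit words, starting from the partial sum 'sum'.
 * Assumes packet-sized input (len < 64 KiB), so 'sum' cannot overflow. */
static uint16_t
csum_rfc1071(const uint8_t *data, size_t len, uint32_t sum)
{
    while (len > 1) {
        sum += (uint32_t) data[0] << 8 | data[1];
        data += 2;
        len -= 2;
    }
    if (len) {
        /* Odd trailing byte is padded with a zero low byte. */
        sum += (uint32_t) data[0] << 8;
    }
    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16) {
        sum = (sum & 0xffff) + (sum >> 16);
    }
    return (uint16_t) ~sum;
}

/* Hypothetical helper: TCP checksum for IPv4. 'src' and 'dst' are host byte
 * order addresses; 'seg' is the TCP header plus payload with the checksum
 * field (bytes 16..17) set to zero by the caller. */
static uint16_t
tcp_csum_ipv4(uint32_t src, uint32_t dst, const uint8_t *seg, size_t len)
{
    uint32_t sum = 0;

    /* Pseudo header: source, destination, protocol (6 = TCP), TCP length. */
    sum += src >> 16;
    sum += src & 0xffff;
    sum += dst >> 16;
    sum += dst & 0xffff;
    sum += 6;
    sum += (uint32_t) len;

    return csum_rfc1071(seg, len, sum);
}
```

A caller would zero the checksum field, call tcp_csum_ipv4(), and store the result big-endian at offset 16 of the TCP header. Recomputing over the filled-in segment then returns 0, which is the usual receive-side check.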
BTW, when will DPDK support tx checksum offload with vectorization?

Thanks
Zhenyu Gao

2017-06-21 16:03 GMT+08:00 Chandran, Sugesh <sugesh.chand...@intel.com>:

> *Regards*
> *_Sugesh*
>
> *From:* Gao Zhenyu [mailto:sysugaozhe...@gmail.com]
> *Sent:* Monday, June 19, 2017 1:23 PM
> *To:* Chandran, Sugesh <sugesh.chand...@intel.com>
> *Cc:* b...@ovn.org; u9012...@gmail.com; ktray...@redhat.com; Kavanagh,
> Mark B <mark.b.kavan...@intel.com>; d...@openvswitch.org
> *Subject:* Re: [ovs-dev] [RFC PATCH v1] net-dpdk: Introducing TX tcp HW
> checksum offload support for DPDK pnic
>
> Thanks for the comments.
>
> [Sugesh] Any reason, why this patch does only the TCP checksum offload??
> The command line option says tx_checksum offload (it could be mistakenly
> considered for full checksum offload).
>
> [Zhenyu Gao] DPDK nic supports many hw offload features like IPv4, IPv6,
> TCP, UDP, VXLAN, GRE. I would like to make them work step by step. A huge
> patch may introduce more potential issues.
>
> TCP offload is a basic and essential feature so I prefer to implement it
> first.
>
> *[Sugesh] Ok, Fine!*
>
> [Sugesh] What is the performance improvement offered with this feature? Do
> you have any numbers to share?
> [Zhenyu Gao] I think DPDK uses non-vector functions when Tx checksum
> offload is enabled. Will it give enough performance improvement to mitigate
> that cost?
>
> It is a draft patch to collect advice and suggestions. In my draft
> testing, it doesn't show improvement or regression.
>
> In ovs-dpdk + veth environment, veth supports tcp cksum offload by default,
> but it introduces tcp connection issue because veth believes it supports
> cksum and offloads to ovs, but dpdk side doesn't do the offloading.
>
> So I have to use ethtool -K eth1 tx off to disable all tx offloading if
> using original ovs-dpdk. That means we cannot consume TSO as well.
>
> *[Sugesh] This is a concern. We have to consider other usecases as well.
> Most of the high performance ovs-dpdk applications don't use any
> kernel/veth pair interfaces in OVS-DPDK datapath.*
>
> It is an ovs-dpdk + veth environment. So it consumes sendmsg/recvmsg on
> RX/TX in ovs-dpdk side. The netperf was executed on ovs-dpdk + veth side.
> The veth side enabled tx-tcp hw cksum, disabled tso. Bottleneck was not
> in cksum, and running testing in a vhost VM is more reasonable.
>
> *[Sugesh] I agree with you. But it's worthwhile to know what is the
> performance delta. Also if the cost of vectorization is high, we may
> consider to do the checksum calculation in software itself. I feel x86
> instructions can do checksum calculation pretty efficiently. Have you
> considered that option?*
>
> [root@16ee46e4b793 ~]# netperf -H 10.100.85.247 -t TCP_RR -l 10
> MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET
> to 10.100.85.247 () port 0 AF_INET : first burst 0
> Local /Remote
> Socket Size   Request  Resp.   Elapsed  Trans.
> Send   Recv   Size     Size    Time     Rate
> bytes  Bytes  bytes    bytes   secs.    per sec
>
> 16384  87380  1        1       10.00    15001.87(HW tcp-cksum) 15062.72(No HW tcp-cksum)
> 16384  87380
>
> [root@16ee46e4b793 ~]# netperf -H 10.100.85.247 -t TCP_STREAM -l 10
> MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
> 10.100.85.247 () port 0 AF_INET
> Recv   Send   Send
> Socket Socket Message  Elapsed
> Size   Size   Size     Time     Throughput
> bytes  bytes  bytes    secs.    10^6bits/sec
>
> 87380  16384  16384    10.02    263.41(HW tcp-cksum) 265.31(No HW tcp-cksum)
>
> I would like to keep it disabled in default setting unless we implement
> more tx offloading like TSO. (Do you have concern on it?) BTW, I think I
> can rename NETDEV_TX_CHECKSUM_OFFLOAD into NETDEV_TX_TCP_CHECKSUM_OFFLOAD.
>
> Please let me know if you get any questions. :)
>
> *[Sugesh] On Rx checksum offload case, it works with vector instructions.
> The latest DPDK supports rx checksum offload with vectorization.*
>
> Thanks
>
> 2017-06-19 17:26 GMT+08:00 Chandran, Sugesh <sugesh.chand...@intel.com>:
>
> Hi Zhenyu,
>
> Thank you for working on this,
> I have a couple of questions on this patch.
>
> Regards
> _Sugesh
>
> > -----Original Message-----
> > From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-
> > boun...@openvswitch.org] On Behalf Of Zhenyu Gao
> > Sent: Friday, June 16, 2017 1:54 PM
> > To: b...@ovn.org; u9012...@gmail.com; ktray...@redhat.com; Kavanagh,
> > Mark B <mark.b.kavan...@intel.com>; d...@openvswitch.org
> > Subject: [ovs-dev] [RFC PATCH v1] net-dpdk: Introducing TX tcp HW
> > checksum offload support for DPDK pnic
> >
> > This patch introduces TX tcp-checksum offload support for DPDK pnic.
> > The feature is disabled by default and can be enabled by setting tx-
> > checksum-offload, like:
> > ovs-vsctl set Interface dpdk-eth3 \
> >     options:tx-checksum-offload=true
> > ---
> >  lib/netdev-dpdk.c    | 112 +++++++++++++++++++++++++++++++++++++++++++++++----
> >  vswitchd/vswitch.xml |  13 ++++--
> >  2 files changed, 115 insertions(+), 10 deletions(-)
> >
> > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> > index bba4de3..5a68a48 100644
> > --- a/lib/netdev-dpdk.c
> > +++ b/lib/netdev-dpdk.c
> > @@ -32,6 +32,7 @@
> >  #include <rte_mbuf.h>
> >  #include <rte_meter.h>
> >  #include <rte_virtio_net.h>
> > +#include <rte_ip.h>
> >
> >  #include "dirs.h"
> >  #include "dp-packet.h"
> > @@ -328,6 +329,7 @@ struct ingress_policer {
> >
> >  enum dpdk_hw_ol_features {
> >      NETDEV_RX_CHECKSUM_OFFLOAD = 1 << 0,
> > +    NETDEV_TX_CHECKSUM_OFFLOAD = 1 << 1,
> >  };
> >
> >  struct netdev_dpdk {
> > @@ -649,6 +651,8 @@ dpdk_eth_dev_queue_setup(struct netdev_dpdk *dev, int n_rxq, int n_txq)
> >      int diag = 0;
> >      int i;
> >      struct rte_eth_conf conf = port_conf;
> > +    struct rte_eth_txconf *txconf;
> > +    struct rte_eth_dev_info dev_info;
> >
> >      if (dev->mtu > ETHER_MTU) {
> >          conf.rxmode.jumbo_frame = 1;
> > @@ -676,9 +680,16 @@ dpdk_eth_dev_queue_setup(struct netdev_dpdk *dev, int n_rxq, int n_txq)
> >              break;
> >          }
> >
> > +        rte_eth_dev_info_get(dev->port_id, &dev_info);
> > +        txconf = &dev_info.default_txconf;
> > +        if (dev->hw_ol_features & NETDEV_TX_CHECKSUM_OFFLOAD) {
> > +            /*Enable tx offload feature on pnic*/
> > +            txconf->txq_flags = 0;
> > +        }
> > +
> >          for (i = 0; i < n_txq; i++) {
> >              diag = rte_eth_tx_queue_setup(dev->port_id, i, dev->txq_size,
> > -                                          dev->socket_id, NULL);
> > +                                          dev->socket_id, txconf);
> >              if (diag) {
> >                  VLOG_INFO("Interface %s txq(%d) setup error: %s",
> >                            dev->up.name, i, rte_strerror(-diag));
> > @@ -724,11 +735,15 @@ dpdk_eth_checksum_offload_configure(struct netdev_dpdk *dev)
> >  {
> >      struct rte_eth_dev_info info;
> >      bool rx_csum_ol_flag = false;
> > +    bool tx_csum_ol_flag = false;
> >      uint32_t rx_chksm_offload_capa = DEV_RX_OFFLOAD_UDP_CKSUM |
> >                                       DEV_RX_OFFLOAD_TCP_CKSUM |
> >                                       DEV_RX_OFFLOAD_IPV4_CKSUM;
> > +    uint32_t tx_chksm_offload_capa = DEV_TX_OFFLOAD_TCP_CKSUM;
>
> [Sugesh] Any reason, why this patch does only the TCP checksum offload??
> The command line option says tx_checksum offload (it could be mistakenly
> considered for full checksum offload).
>
> > +
> >      rte_eth_dev_info_get(dev->port_id, &info);
> >      rx_csum_ol_flag = (dev->hw_ol_features & NETDEV_RX_CHECKSUM_OFFLOAD) != 0;
> > +    tx_csum_ol_flag = (dev->hw_ol_features &
> > +                       NETDEV_TX_CHECKSUM_OFFLOAD) != 0;
> >
> >      if (rx_csum_ol_flag &&
> >          (info.rx_offload_capa & rx_chksm_offload_capa) !=
> > @@ -736,9 +751,15 @@ dpdk_eth_checksum_offload_configure(struct netdev_dpdk *dev)
> >          VLOG_WARN_ONCE("Rx checksum offload is not supported on device %"PRIu8,
> >                         dev->port_id);
> >          dev->hw_ol_features &= ~NETDEV_RX_CHECKSUM_OFFLOAD;
> > -        return;
> > +    } else if (tx_csum_ol_flag &&
> > +               (info.tx_offload_capa & tx_chksm_offload_capa) !=
> > +                tx_chksm_offload_capa) {
> > +        VLOG_WARN_ONCE("Tx checksum offload is not supported on device %"PRIu8,
> > +                       dev->port_id);
> > +        dev->hw_ol_features &= ~NETDEV_TX_CHECKSUM_OFFLOAD;
> > +    } else {
> > +        netdev_request_reconfigure(&dev->up);
> >      }
> > -    netdev_request_reconfigure(&dev->up);
> >  }
> >
> > --
>
> [Sugesh] What is the performance improvement offered with this feature? Do
> you have any numbers to share?
> I think DPDK uses non-vector functions when Tx checksum offload is
> enabled. Will it give enough performance improvement to mitigate that cost?
>
> Finally Rx checksum offload is going to be a default option (there won't be
> any configuration option to enable/disable, Kevin's patch for the support
> is already acked and waiting to merge). Similarly can't we enable it by
> default when it is supported?
>
> > 1.8.3.1
> >
> > _______________________________________________
> > dev mailing list
> > d...@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev