Re: [ovs-dev] [PATCH v2] netdev-dpdk: Refactor the DPDK transmit path.
Hello Sunil, Marko and Ian,

Mike worked to identify the reason for the performance issue you reported
a while ago. He summarized it below. I wonder if you could give his patch
a try too and tell us if we are on the right track.

Thanks,
fbl

On Wed, Jan 05, 2022 at 03:01:47PM -0500, Mike Pattrick wrote:
> Hello Flavio,
> 
> Great patch, I think you really did a lot to improve the code here and
> I think that's borne out by the consistent performance improvements
> across multiple tests.
> 
> Regarding the 4% regression that Intel detected, I found the following
> white paper describing the "scatter" test:
> 
> https://builders.intel.com/docs/networkbuilders/open-vswitch-optimized-deployment-benchmark-technology-guide.pdf
> 
> This document calls out the following key points:
> 
> The original test was summarized as:
> - 32 VMs with one million flows.
> - Test runs on four physical cores for OVS and 10 hyper-threaded cores
>   for TestPMD.
> - An Ixia pitches traffic at a sub-0.1% loss rate.
> - The server catches traffic with an E810-C 100G NIC.
> - The traffic's profile is: Ether()/IP()/UDP()/VXLAN()/Ether()/IP()
> - On the outer IP, the source address changes incrementally across the
>   32 instances.
> - The destination address remains the same on the outer IP.
> - The inner source IP remains the same.
> - The inner destination address increments to create the one million
>   flows for the test.
> - EMC and SMC were disabled.
> 
> I could not reproduce this test exactly because I don't have access to
> the same hardware - notably the Intel NIC and an Ixia - and I didn't
> want to create an environment that wouldn't be reproduced in real-world
> scenarios. I did pin the VM and TXQ/RXQ threads to cores, but I didn't
> optimize the setup nearly to the extent that the white paper described.
> My test setup consisted of two Fedora 35 servers directly connected
> across Mellanox5E cards, with Trex pitching traffic and TestPMD
> reflecting it.
> 
> In my test I was still able to reproduce a similar performance penalty.
> I found that the key factor was the combination of VXLAN and a large
> number of flows. So once I had a setup that could reproduce close to
> the 4% penalty, I stopped modifying my test framework and started
> searching for the slow code.
> 
> I didn't see any obvious issues in the code that should cause a
> significant slowdown; in fact, most of the code is identical or
> slightly improved. So to help my analysis, I created several variations
> of your patch reverting small aspects of the change and benchmarked
> each variation.
> 
> Because the difference in performance across each variation was so
> minor, I took a lot of samples. I pitched traffic over one million
> flows for 240 seconds and averaged out the throughput; I then repeated
> this process a total of five times for each patch. Finally, I repeated
> the whole process three times to produce 15 data points per patch.
> 
> The best results came from the patch enclosed below, with the code from
> netdev_dpdk_common_send() protected by the spinlock, as it is in the
> pre-patch code. This yielded a 2.7% +/- 0.64 performance boost over the
> master branch.
> 
> 
> Cheers,
> Michael
> 
> 
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index bc1633663..5db5d7e2a 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -2777,13 +2777,13 @@ netdev_dpdk_vhost_send(struct netdev *netdev, int qid,
>          return 0;
>      }
> 
> -    cnt = netdev_dpdk_common_send(netdev, batch, &stats);
> -
>      if (OVS_UNLIKELY(!rte_spinlock_trylock(&dev->tx_q[qid].tx_lock))) {
>          COVERAGE_INC(vhost_tx_contention);
>          rte_spinlock_lock(&dev->tx_q[qid].tx_lock);
>      }
> 
> +    cnt = netdev_dpdk_common_send(netdev, batch, &stats);
> +
>      pkts = (struct rte_mbuf **) batch->packets;
>      vhost_batch_cnt = cnt;
>      retries = 0;
> @@ -2843,13 +2843,15 @@ netdev_dpdk_eth_send(struct netdev *netdev, int qid,
>          return 0;
>      }
> 
> -    cnt = netdev_dpdk_common_send(netdev, batch, &stats);
> -    dropped = batch_cnt - cnt;
>      if (OVS_UNLIKELY(concurrent_txq)) {
>          qid = qid % dev->up.n_txq;
>          rte_spinlock_lock(&dev->tx_q[qid].tx_lock);
>      }
> 
> +    cnt = netdev_dpdk_common_send(netdev, batch, &stats);
> +
> +    dropped = batch_cnt - cnt;
> +
>      dropped += netdev_dpdk_eth_tx_burst(dev, qid, pkts, cnt);
>      if (OVS_UNLIKELY(dropped)) {
>          struct netdev_dpdk_sw_stats *sw_stats = dev->sw_stats;
> 
> 
> On Sun, 2021-01-10 at 00:05 -0300, Flavio Leitner wrote:
> > This patch split out the common code between vhost and
> > dpdk transmit paths to shared functions to simplify the
> > code and fix an issue.
> >
> > The issue is that the packet coming from non-DPDK device
> > and egressing on a DPDK device currently skips the hwol
> > preparation.
> >
> > This also have the side effect of leaving only the dpdk
> > transmit code under the txq lock.
> >
> > Signed-off-by: Flavio Leitner
> > Reviewed-by: David Marchand
> > ---
> >
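To make the change easier to follow, this is roughly what the ethdev
transmit path looks like with the variation above applied, pieced together
from the hunks shown in this thread. It is a simplified sketch only: the
locals, the drop accounting and the early-return checks are assumptions,
not the literal upstream function.

static int
netdev_dpdk_eth_send(struct netdev *netdev, int qid,
                     struct dp_packet_batch *batch, bool concurrent_txq)
{
    struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
    struct rte_mbuf **pkts = (struct rte_mbuf **) batch->packets;
    int batch_cnt = dp_packet_batch_size(batch);
    struct netdev_dpdk_sw_stats stats;
    int cnt, dropped;

    if (OVS_UNLIKELY(concurrent_txq)) {
        qid = qid % dev->up.n_txq;
        rte_spinlock_lock(&dev->tx_q[qid].tx_lock);
    }

    /* HWOL preparation, MTU filtering and QoS (the shared
     * netdev_dpdk_common_send() steps) now run with the txq lock held,
     * matching the pre-refactor ordering. */
    cnt = netdev_dpdk_common_send(netdev, batch, &stats);

    dropped = batch_cnt - cnt;
    dropped += netdev_dpdk_eth_tx_burst(dev, qid, pkts, cnt);

    if (OVS_UNLIKELY(dropped)) {
        /* Fold 'dropped' and the per-step counters in 'stats' into
         * dev->sw_stats under dev->stats_lock (details elided). */
    }

    if (OVS_UNLIKELY(concurrent_txq)) {
        rte_spinlock_unlock(&dev->tx_q[qid].tx_lock);
    }

    return 0;
}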
Re: [ovs-dev] [PATCH v2] netdev-dpdk: Refactor the DPDK transmit path.
Hello Flavio,

Great patch, I think you really did a lot to improve the code here and
I think that's borne out by the consistent performance improvements
across multiple tests.

Regarding the 4% regression that Intel detected, I found the following
white paper describing the "scatter" test:

https://builders.intel.com/docs/networkbuilders/open-vswitch-optimized-deployment-benchmark-technology-guide.pdf

This document calls out the following key points:

The original test was summarized as:
- 32 VMs with one million flows.
- Test runs on four physical cores for OVS and 10 hyper-threaded cores
  for TestPMD.
- An Ixia pitches traffic at a sub-0.1% loss rate.
- The server catches traffic with an E810-C 100G NIC.
- The traffic's profile is: Ether()/IP()/UDP()/VXLAN()/Ether()/IP()
- On the outer IP, the source address changes incrementally across the
  32 instances.
- The destination address remains the same on the outer IP.
- The inner source IP remains the same.
- The inner destination address increments to create the one million
  flows for the test.
- EMC and SMC were disabled.

I could not reproduce this test exactly because I don't have access to
the same hardware - notably the Intel NIC and an Ixia - and I didn't
want to create an environment that wouldn't be reproduced in real-world
scenarios. I did pin the VM and TXQ/RXQ threads to cores, but I didn't
optimize the setup nearly to the extent that the white paper described.
My test setup consisted of two Fedora 35 servers directly connected
across Mellanox5E cards, with Trex pitching traffic and TestPMD
reflecting it.

In my test I was still able to reproduce a similar performance penalty.
I found that the key factor was the combination of VXLAN and a large
number of flows. So once I had a setup that could reproduce close to
the 4% penalty, I stopped modifying my test framework and started
searching for the slow code.

I didn't see any obvious issues in the code that should cause a
significant slowdown; in fact, most of the code is identical or
slightly improved. So to help my analysis, I created several variations
of your patch reverting small aspects of the change and benchmarked
each variation.

Because the difference in performance across each variation was so
minor, I took a lot of samples. I pitched traffic over one million
flows for 240 seconds and averaged out the throughput; I then repeated
this process a total of five times for each patch. Finally, I repeated
the whole process three times to produce 15 data points per patch.

The best results came from the patch enclosed below, with the code from
netdev_dpdk_common_send() protected by the spinlock, as it is in the
pre-patch code. This yielded a 2.7% +/- 0.64 performance boost over the
master branch.
Cheers,
Michael


diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index bc1633663..5db5d7e2a 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -2777,13 +2777,13 @@ netdev_dpdk_vhost_send(struct netdev *netdev, int qid,
         return 0;
     }
 
-    cnt = netdev_dpdk_common_send(netdev, batch, &stats);
-
     if (OVS_UNLIKELY(!rte_spinlock_trylock(&dev->tx_q[qid].tx_lock))) {
         COVERAGE_INC(vhost_tx_contention);
         rte_spinlock_lock(&dev->tx_q[qid].tx_lock);
     }
 
+    cnt = netdev_dpdk_common_send(netdev, batch, &stats);
+
     pkts = (struct rte_mbuf **) batch->packets;
     vhost_batch_cnt = cnt;
     retries = 0;
@@ -2843,13 +2843,15 @@ netdev_dpdk_eth_send(struct netdev *netdev, int qid,
         return 0;
     }
 
-    cnt = netdev_dpdk_common_send(netdev, batch, &stats);
-    dropped = batch_cnt - cnt;
     if (OVS_UNLIKELY(concurrent_txq)) {
         qid = qid % dev->up.n_txq;
         rte_spinlock_lock(&dev->tx_q[qid].tx_lock);
     }
 
+    cnt = netdev_dpdk_common_send(netdev, batch, &stats);
+
+    dropped = batch_cnt - cnt;
+
     dropped += netdev_dpdk_eth_tx_burst(dev, qid, pkts, cnt);
     if (OVS_UNLIKELY(dropped)) {
         struct netdev_dpdk_sw_stats *sw_stats = dev->sw_stats;


On Sun, 2021-01-10 at 00:05 -0300, Flavio Leitner wrote:
> This patch split out the common code between vhost and
> dpdk transmit paths to shared functions to simplify the
> code and fix an issue.
>
> The issue is that the packet coming from non-DPDK device
> and egressing on a DPDK device currently skips the hwol
> preparation.
>
> This also have the side effect of leaving only the dpdk
> transmit code under the txq lock.
>
> Signed-off-by: Flavio Leitner
> ---
>
> V2:
>    - mentioned the tx lock change in the commit message.
>    - fixed packet leak when copy fails.
>    - moved pkt_cnt = cnt two lines up.
>
> I tested the following scenarios with iperf and iperf -R
> and noticed no performance regression:
> IPERF    VM-Bridge          1.02x
> IPERF    VM-VM              1.04x
> IPERF    VM-ExtHost         1.00x
> IPERF    VM-NS              1.01x
> IPERF    VM-VLAN-Bridge     1.03x
> IPERF    VM-VLAN-VM         1.03x
> IPERF    VM-VLAN-ExtHost    1.01x
>
Re: [ovs-dev] [PATCH v2] netdev-dpdk: Refactor the DPDK transmit path.
Hi Marko,

On Wed, Jan 13, 2021 at 10:51:03AM +0000, Kovacevic, Marko wrote:
> Hi Flavio,
> 
> > So, I see a small performance gain. Can you provide a more detailed
> > test description? I wonder if we are testing the same thing.
> 
> VSPERF phy2phy_tput uses 2 dpdk pmds and performs a zero packet loss
> test (RFC2544) with bidirectional traffic.

Ok.

> > > and PVP test results are in line with your observations.
> > > NIC: Fortville 10G X710
> >
> > Great.
> >
> > > Performance with different traffic profiles when deployed with 32 VMs
> > > + 1M flows + vxlan enabled:
> > > With burst mode: ~1% increase in performance
> >
> > That is aligned with the above.
> >
> > > as compared to
> > > scatter mode: ~4% decrease in performance.
> >
> > Hm, that is unexpected. Is this reproducible?
> 
> If you mean are the results reproducible then yes,
> I did a re-run of v1 and v2 again.

Thanks for checking.

> I ran my test against these commits to get baseline results and then ran v1
> and v2 and compared results:
> 
> DPDK: b1d36cf82 (HEAD, tag: v20.11, origin/releases) version: 20.11.0
> OVS:  7f79ae2fb (HEAD -> master, origin/master, origin/HEAD) Documentation:
>       Simplify the website main page.
> 
> V1:
> DPDK: b1d36cf82 (HEAD, tag: v20.11, origin/releases) version: 20.11.0
> OVS:  957132b3a (HEAD -> master) netdev-dpdk: Refactor the DPDK transmit path.
>       7f79ae2fb (origin/master, origin/HEAD) Documentation: Simplify the website
>       main page.
> 
> V1: Burst    +1%
> V1: Scatter  -4%
> 
> As Sunil already reported, it shows the same results again.

Alright. I will try to reproduce it in my lab to understand the root cause.

> As for V2, it didn't show any increase or decrease for both burst and scatter,
> and even between the v1 and v2 results from my second run it didn't show much
> difference on my test anyway.
> 
> > How are you switching between burst and scatter mode?
> 
> As for how we're switching between burst and scatter, I just use two different
> traffic profiles and gather results.
> So I run the test using the burst profile, restart the test, then use scatter.
> 
> Traffic @ Phy NIC Rx:
> Ether()/IP()/UDP()/VXLAN()/Ether()/IP()
> 
> Burst: on the outer IP we do a burst of 32 packets with the same IP, then
> switch to the next IP for the next 32, and so on.
> Scatter: we increment the outer IP on every packet across the 32.

Glad that I asked, because I thought you were talking about the NIC mode
with scatter enabled or not.

> And we do this for 1M flows.
> 
> I hope this answers your question.

They are helpful. I will try to reproduce the -4% results in my lab as a
next step. It may take some time.

Thanks for testing and providing feedback!

-- 
fbl
Re: [ovs-dev] [PATCH v2] netdev-dpdk: Refactor the DPDK transmit path.
Hi Flavio,

> So, I see a small performance gain. Can you provide a more detailed
> test description? I wonder if we are testing the same thing.

VSPERF phy2phy_tput uses 2 dpdk pmds and performs a zero packet loss
test (RFC2544) with bidirectional traffic.

> > and PVP test results are in line with your observations.
> > NIC: Fortville 10G X710
>
> Great.
>
> > Performance with different traffic profiles when deployed with 32 VMs
> > + 1M flows + vxlan enabled:
> > With burst mode: ~1% increase in performance
>
> That is aligned with the above.
>
> > as compared to
> > scatter mode: ~4% decrease in performance.
>
> Hm, that is unexpected. Is this reproducible?

If you mean are the results reproducible then yes,
I did a re-run of v1 and v2 again.

I ran my test against these commits to get baseline results and then ran v1
and v2 and compared results:

DPDK: b1d36cf82 (HEAD, tag: v20.11, origin/releases) version: 20.11.0
OVS:  7f79ae2fb (HEAD -> master, origin/master, origin/HEAD) Documentation:
      Simplify the website main page.

V1:
DPDK: b1d36cf82 (HEAD, tag: v20.11, origin/releases) version: 20.11.0
OVS:  957132b3a (HEAD -> master) netdev-dpdk: Refactor the DPDK transmit path.
      7f79ae2fb (origin/master, origin/HEAD) Documentation: Simplify the website
      main page.

V1: Burst    +1%
V1: Scatter  -4%

As Sunil already reported, it shows the same results again.

As for V2, it didn't show any increase or decrease for both burst and scatter,
and even between the v1 and v2 results from my second run it didn't show much
difference on my test anyway.

> How are you switching between burst and scatter mode?

As for how we're switching between burst and scatter, I just use two different
traffic profiles and gather results.
So I run the test using the burst profile, restart the test, then use scatter.

Traffic @ Phy NIC Rx:
Ether()/IP()/UDP()/VXLAN()/Ether()/IP()

Burst: on the outer IP we do a burst of 32 packets with the same IP, then
switch to the next IP for the next 32, and so on.
Scatter: we increment the outer IP on every packet across the 32.

And we do this for 1M flows.

I hope this answers your question.

>
> Thanks,
> fbl

Thanks,
Marko K
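A minimal standalone sketch of the difference between the two outer-IP
patterns described above; the base address and the packet counts are
illustrative placeholders, not the actual VSPERF traffic profile.

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define NUM_ADDRS  32   /* one outer source IP per VM instance */
#define BURST_LEN  32   /* packets sent per IP in the burst profile */
#define TOTAL_PKTS (NUM_ADDRS * BURST_LEN)

int
main(void)
{
    uint32_t base = 0x0a000001;   /* 10.0.0.1, arbitrary starting point */

    /* Burst: 32 packets share the same outer source IP, then move on to
     * the next IP, so consecutive packets belong to the same outer flow. */
    for (int i = 0; i < TOTAL_PKTS; i++) {
        uint32_t src = base + (uint32_t) (i / BURST_LEN);
        printf("burst   pkt %4d outer-src %#010" PRIx32 "\n", i, src);
    }

    /* Scatter: the outer source IP increments on every packet, cycling
     * through the 32 addresses, so consecutive packets belong to
     * different outer flows. */
    for (int i = 0; i < TOTAL_PKTS; i++) {
        uint32_t src = base + (uint32_t) (i % NUM_ADDRS);
        printf("scatter pkt %4d outer-src %#010" PRIx32 "\n", i, src);
    }

    return 0;
}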
Re: [ovs-dev] [PATCH v2] netdev-dpdk: Refactor the DPDK transmit path.
On Sun, Jan 10, 2021 at 4:05 AM Flavio Leitner wrote:
>
> This patch split out the common code between vhost and
> dpdk transmit paths to shared functions to simplify the
> code and fix an issue.
>
> The issue is that the packet coming from non-DPDK device
> and egressing on a DPDK device currently skips the hwol
> preparation.
>
> This also have the side effect of leaving only the dpdk

nit: has*

> transmit code under the txq lock.
>
> Signed-off-by: Flavio Leitner

Reviewed-by: David Marchand

Thanks.

-- 
David Marchand
Re: [ovs-dev] [PATCH v2] netdev-dpdk: Refactor the DPDK transmit path.
Hi Sunil,

On Mon, Jan 11, 2021 at 03:49:56PM +0000, Pai G, Sunil wrote:
> Hi Flavio,
> 
> > -----Original Message-----
> > From: dev On Behalf Of Flavio Leitner
> > Sent: Sunday, January 10, 2021 8:35 AM
> > To: d...@openvswitch.org
> > Cc: David Marchand ; Flavio Leitner
> > Subject: [ovs-dev] [PATCH v2] netdev-dpdk: Refactor the DPDK transmit
> > path.
> >
> > This patch split out the common code between vhost and dpdk transmit
> > paths to shared functions to simplify the code and fix an issue.
> >
> > The issue is that the packet coming from non-DPDK device and egressing on a
> > DPDK device currently skips the hwol preparation.
> >
> > This also have the side effect of leaving only the dpdk transmit code under
> > the txq lock.
> >
> > Signed-off-by: Flavio Leitner
> > ---
> >
> > V2:
> >    - mentioned the tx lock change in the commit message.
> >    - fixed packet leak when copy fails.
> >    - moved pkt_cnt = cnt two lines up.
> >
> > I tested the following scenarios with iperf and iperf -R and noticed no
> > performance regression:
> > IPERF    VM-Bridge          1.02x
> > IPERF    VM-VM              1.04x
> > IPERF    VM-ExtHost         1.00x
> > IPERF    VM-NS              1.01x
> > IPERF    VM-VLAN-Bridge     1.03x
> > IPERF    VM-VLAN-VM         1.03x
> > IPERF    VM-VLAN-ExtHost    1.01x
> > IPERF    VM-VLAN-NS         1.02x
> > IPERF    VM-V6-VM           1.03x
> > IPERF    VM-V6-ExtHost      1.00x
> > IPERF    VM-V6-NS           1.01x
> > IPERF-R  VM-Bridge          1.01x
> > IPERF-R  VM-VM              1.04x
> > IPERF-R  VM-ExtHost         1.10x
> > IPERF-R  VM-NS              1.01x
> > IPERF-R  VM-VLAN-Bridge     1.03x
> > IPERF-R  VM-VLAN-VM         1.02x
> > IPERF-R  VM-VLAN-ExtHost    1.08x
> > IPERF-R  VM-VLAN-NS         1.02x
> > IPERF-R  VM-V6-VM           1.00x
> > IPERF-R  VM-V6-ExtHost      1.11x
> > IPERF-R  VM-V6-NS           1.00x
> >
> > Now using trex, 64byte packet, PVP:
> > Original: 3.6Mpps
> >     avg. packets per output batch: 32.00
> >     idle cycles: 0 (0.00%)
> >     avg cycles per packet: 304.92 (8331383020/27323150)
> >     avg processing cycles per packet: 304.92 (8331383020/27323150)
> >
> > Patched: 3.6Mpps
> >     avg. packets per output batch: 32.00
> >     idle cycles: 0 (0.00%)
> >     avg cycles per packet: 304.08 (21875784116/71941516)
> >     avg processing cycles per packet: 304.08 (21875784116/71941516)
> 
> Ran the following tests in VSPERF for QoS and they all pass:

Great, thanks for testing the patch.

> ovsdpdk_qos_create_phy_port
> ovsdpdk_qos_delete_phy_port
> ovsdpdk_qos_create_vport
> ovsdpdk_qos_delete_vport
> ovsdpdk_qos_create_no_cir
> ovsdpdk_qos_create_no_cbs
> 
> For performance tests:
> We see a slight dip in performance for physical ports (~1%) in the phy2phy_tput
> test in VSPERF with the patch (v1) for 64byte packets.

I tried P2P here (v2) with trex, single flow, single queue, single PMD
and I got this:

original: 12.88Mpps/6.6Gbps
    idle cycles: 0 (0.00%)
    processing cycles: 14466174204 (100.00%)
    avg cycles per packet: 170.73 (14466174204/84730796)
    avg processing cycles per packet: 170.73 (14466174204/84730796)

patched: 13.00Mpps/6.65Gbps
    idle cycles: 0 (0.00%)
    processing cycles: 3783269716 (100.00%)
    avg cycles per packet: 169.49 (3783269716/22321968)
    avg processing cycles per packet: 169.49 (3783269716/22321968)

So, I see a small performance gain. Can you provide a more detailed
test description? I wonder if we are testing the same thing.

> and PVP test results are in line with your observations.
> NIC: Fortville 10G X710

Great.

> Performance with different traffic profiles when deployed with 32 VMs + 1M
> flows + vxlan enabled:
> With burst mode: ~1% increase in performance

That is aligned with the above.

> as compared to
> scatter mode: ~4% decrease in performance.

Hm, that is unexpected. Is this reproducible? How are you switching
between burst and scatter mode?

Thanks,
fbl
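As a side note, the Mpps and Gbps figures quoted above line up if the
Gbit/s value counts only the 64-byte frames themselves (excluding preamble
and inter-frame gap). A small standalone check, written purely for
illustration:

#include <stdio.h>

int
main(void)
{
    /* Throughput figures taken from the measurements quoted above. */
    double original_mpps = 12.88;
    double patched_mpps = 13.00;
    int frame_bytes = 64;

    /* Gbit/s of frame data = Mpps * 1e6 * bytes per frame * 8 / 1e9. */
    printf("original: %.2f Gbit/s\n",
           original_mpps * 1e6 * frame_bytes * 8 / 1e9);   /* ~6.59 */
    printf("patched:  %.2f Gbit/s\n",
           patched_mpps * 1e6 * frame_bytes * 8 / 1e9);    /* ~6.66 */
    return 0;
}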
Re: [ovs-dev] [PATCH v2] netdev-dpdk: Refactor the DPDK transmit path.
Hi Flavio,

> -----Original Message-----
> From: dev On Behalf Of Flavio Leitner
> Sent: Sunday, January 10, 2021 8:35 AM
> To: d...@openvswitch.org
> Cc: David Marchand ; Flavio Leitner
> Subject: [ovs-dev] [PATCH v2] netdev-dpdk: Refactor the DPDK transmit
> path.
>
> This patch split out the common code between vhost and dpdk transmit
> paths to shared functions to simplify the code and fix an issue.
>
> The issue is that the packet coming from non-DPDK device and egressing on a
> DPDK device currently skips the hwol preparation.
>
> This also have the side effect of leaving only the dpdk transmit code under
> the txq lock.
>
> Signed-off-by: Flavio Leitner
> ---
>
> V2:
>    - mentioned the tx lock change in the commit message.
>    - fixed packet leak when copy fails.
>    - moved pkt_cnt = cnt two lines up.
>
> I tested the following scenarios with iperf and iperf -R and noticed no
> performance regression:
> IPERF    VM-Bridge          1.02x
> IPERF    VM-VM              1.04x
> IPERF    VM-ExtHost         1.00x
> IPERF    VM-NS              1.01x
> IPERF    VM-VLAN-Bridge     1.03x
> IPERF    VM-VLAN-VM         1.03x
> IPERF    VM-VLAN-ExtHost    1.01x
> IPERF    VM-VLAN-NS         1.02x
> IPERF    VM-V6-VM           1.03x
> IPERF    VM-V6-ExtHost      1.00x
> IPERF    VM-V6-NS           1.01x
> IPERF-R  VM-Bridge          1.01x
> IPERF-R  VM-VM              1.04x
> IPERF-R  VM-ExtHost         1.10x
> IPERF-R  VM-NS              1.01x
> IPERF-R  VM-VLAN-Bridge     1.03x
> IPERF-R  VM-VLAN-VM         1.02x
> IPERF-R  VM-VLAN-ExtHost    1.08x
> IPERF-R  VM-VLAN-NS         1.02x
> IPERF-R  VM-V6-VM           1.00x
> IPERF-R  VM-V6-ExtHost      1.11x
> IPERF-R  VM-V6-NS           1.00x
>
> Now using trex, 64byte packet, PVP:
> Original: 3.6Mpps
>     avg. packets per output batch: 32.00
>     idle cycles: 0 (0.00%)
>     avg cycles per packet: 304.92 (8331383020/27323150)
>     avg processing cycles per packet: 304.92 (8331383020/27323150)
>
> Patched: 3.6Mpps
>     avg. packets per output batch: 32.00
>     idle cycles: 0 (0.00%)
>     avg cycles per packet: 304.08 (21875784116/71941516)
>     avg processing cycles per packet: 304.08 (21875784116/71941516)

Ran the following tests in VSPERF for QoS and they all pass:
ovsdpdk_qos_create_phy_port
ovsdpdk_qos_delete_phy_port
ovsdpdk_qos_create_vport
ovsdpdk_qos_delete_vport
ovsdpdk_qos_create_no_cir
ovsdpdk_qos_create_no_cbs

For performance tests:
We see a slight dip in performance for physical ports (~1%) in the phy2phy_tput
test in VSPERF with the patch (v1) for 64byte packets, and PVP test results
are in line with your observations.
NIC: Fortville 10G X710

Performance with different traffic profiles when deployed with 32 VMs + 1M
flows + vxlan enabled:
With burst mode: ~1% increase in performance
as compared to
scatter mode: ~4% decrease in performance.

[snipped]

Thanks,
Sunil
Intel
[ovs-dev] [PATCH v2] netdev-dpdk: Refactor the DPDK transmit path.
This patch split out the common code between vhost and
dpdk transmit paths to shared functions to simplify the
code and fix an issue.

The issue is that the packet coming from non-DPDK device
and egressing on a DPDK device currently skips the hwol
preparation.

This also have the side effect of leaving only the dpdk
transmit code under the txq lock.

Signed-off-by: Flavio Leitner
---

V2:
   - mentioned the tx lock change in the commit message.
   - fixed packet leak when copy fails.
   - moved pkt_cnt = cnt two lines up.

I tested the following scenarios with iperf and iperf -R
and noticed no performance regression:
IPERF    VM-Bridge          1.02x
IPERF    VM-VM              1.04x
IPERF    VM-ExtHost         1.00x
IPERF    VM-NS              1.01x
IPERF    VM-VLAN-Bridge     1.03x
IPERF    VM-VLAN-VM         1.03x
IPERF    VM-VLAN-ExtHost    1.01x
IPERF    VM-VLAN-NS         1.02x
IPERF    VM-V6-VM           1.03x
IPERF    VM-V6-ExtHost      1.00x
IPERF    VM-V6-NS           1.01x
IPERF-R  VM-Bridge          1.01x
IPERF-R  VM-VM              1.04x
IPERF-R  VM-ExtHost         1.10x
IPERF-R  VM-NS              1.01x
IPERF-R  VM-VLAN-Bridge     1.03x
IPERF-R  VM-VLAN-VM         1.02x
IPERF-R  VM-VLAN-ExtHost    1.08x
IPERF-R  VM-VLAN-NS         1.02x
IPERF-R  VM-V6-VM           1.00x
IPERF-R  VM-V6-ExtHost      1.11x
IPERF-R  VM-V6-NS           1.00x

Now using trex, 64byte packet, PVP:
Original: 3.6Mpps
    avg. packets per output batch: 32.00
    idle cycles: 0 (0.00%)
    avg cycles per packet: 304.92 (8331383020/27323150)
    avg processing cycles per packet: 304.92 (8331383020/27323150)

Patched: 3.6Mpps
    avg. packets per output batch: 32.00
    idle cycles: 0 (0.00%)
    avg cycles per packet: 304.08 (21875784116/71941516)
    avg processing cycles per packet: 304.08 (21875784116/71941516)

 lib/netdev-dpdk.c | 335 +++---
 1 file changed, 139 insertions(+), 196 deletions(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 2640a421a..a1437db4d 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -2585,90 +2585,6 @@ netdev_dpdk_vhost_update_tx_counters(struct netdev_dpdk *dev,
     }
 }
 
-static void
-__netdev_dpdk_vhost_send(struct netdev *netdev, int qid,
-                         struct dp_packet **pkts, int cnt)
-{
-    struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
-    struct rte_mbuf **cur_pkts = (struct rte_mbuf **) pkts;
-    struct netdev_dpdk_sw_stats sw_stats_add;
-    unsigned int n_packets_to_free = cnt;
-    unsigned int total_packets = cnt;
-    int i, retries = 0;
-    int max_retries = VHOST_ENQ_RETRY_MIN;
-    int vid = netdev_dpdk_get_vid(dev);
-
-    qid = dev->tx_q[qid % netdev->n_txq].map;
-
-    if (OVS_UNLIKELY(vid < 0 || !dev->vhost_reconfigured || qid < 0
-                     || !(dev->flags & NETDEV_UP))) {
-        rte_spinlock_lock(&dev->stats_lock);
-        dev->stats.tx_dropped+= cnt;
-        rte_spinlock_unlock(&dev->stats_lock);
-        goto out;
-    }
-
-    if (OVS_UNLIKELY(!rte_spinlock_trylock(&dev->tx_q[qid].tx_lock))) {
-        COVERAGE_INC(vhost_tx_contention);
-        rte_spinlock_lock(&dev->tx_q[qid].tx_lock);
-    }
-
-    sw_stats_add.tx_invalid_hwol_drops = cnt;
-    if (userspace_tso_enabled()) {
-        cnt = netdev_dpdk_prep_hwol_batch(dev, cur_pkts, cnt);
-    }
-
-    sw_stats_add.tx_invalid_hwol_drops -= cnt;
-    sw_stats_add.tx_mtu_exceeded_drops = cnt;
-    cnt = netdev_dpdk_filter_packet_len(dev, cur_pkts, cnt);
-    sw_stats_add.tx_mtu_exceeded_drops -= cnt;
-
-    /* Check has QoS has been configured for the netdev */
-    sw_stats_add.tx_qos_drops = cnt;
-    cnt = netdev_dpdk_qos_run(dev, cur_pkts, cnt, true);
-    sw_stats_add.tx_qos_drops -= cnt;
-
-    n_packets_to_free = cnt;
-
-    do {
-        int vhost_qid = qid * VIRTIO_QNUM + VIRTIO_RXQ;
-        unsigned int tx_pkts;
-
-        tx_pkts = rte_vhost_enqueue_burst(vid, vhost_qid, cur_pkts, cnt);
-        if (OVS_LIKELY(tx_pkts)) {
-            /* Packets have been sent.*/
-            cnt -= tx_pkts;
-            /* Prepare for possible retry.*/
-            cur_pkts = &cur_pkts[tx_pkts];
-            if (OVS_UNLIKELY(cnt && !retries)) {
-                /*
-                 * Read max retries as there are packets not sent
-                 * and no retries have already occurred.
-                 */
-                atomic_read_relaxed(&dev->vhost_tx_retries_max, &max_retries);
-            }
-        } else {
-            /* No packets sent - do not retry.*/
-            break;
-        }
-    } while (cnt && (retries++ < max_retries));
-
-    rte_spinlock_unlock(&dev->tx_q[qid].tx_lock);
-
-    sw_stats_add.tx_failure_drops = cnt;
-    sw_stats_add.tx_retries = MIN(retries, max_retries);
-
-    rte_spinlock_lock(&dev->stats_lock);
-    netdev_dpdk_vhost_update_tx_counters(dev, pkts, total_packets,
-                                         &sw_stats_add);
-
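The archived copy of the patch is truncated here, before the new shared
helper appears. For context, below is a rough sketch of what a
netdev_dpdk_common_send() helper consolidating the steps removed above
might contain; the signature and the stats handling are assumptions based
on the calls visible in this thread, not the literal function from the
patch.

static int
netdev_dpdk_common_send(struct netdev *netdev, struct dp_packet_batch *batch,
                        struct netdev_dpdk_sw_stats *stats)
{
    struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
    struct rte_mbuf **pkts = (struct rte_mbuf **) batch->packets;
    int cnt = dp_packet_batch_size(batch);

    /* Prepare packets for HW offloading; invalid packets are dropped. */
    stats->tx_invalid_hwol_drops = cnt;
    if (userspace_tso_enabled()) {
        cnt = netdev_dpdk_prep_hwol_batch(dev, pkts, cnt);
    }
    stats->tx_invalid_hwol_drops -= cnt;

    /* Drop packets that exceed the device MTU. */
    stats->tx_mtu_exceeded_drops = cnt;
    cnt = netdev_dpdk_filter_packet_len(dev, pkts, cnt);
    stats->tx_mtu_exceeded_drops -= cnt;

    /* Apply QoS, if configured on the netdev. */
    stats->tx_qos_drops = cnt;
    cnt = netdev_dpdk_qos_run(dev, pkts, cnt, true);
    stats->tx_qos_drops -= cnt;

    /* Surviving packets are left in 'batch' for the caller (vhost or
     * DPDK ethdev) to actually transmit. */
    return cnt;
}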