Re: [ovs-dev] [PATCH v2] netdev-dpdk: Refactor the DPDK transmit path.
Hello Sunil, Marko and Ian,

Mike worked to identify the reason for the performance issue you reported
a while ago. He summarized it below. I wonder if you could give his patch
a try too and tell us if we are on the right track.

Thanks,
fbl

On Wed, Jan 05, 2022 at 03:01:47PM -0500, Mike Pattrick wrote:
> Hello Flavio,
> 
> Great patch, I think you really did a lot to improve the code here and
> I think that's borne out by the consistent performance improvements
> across multiple tests.
> 
> Regarding the 4% regression that Intel detected, I found the following
> white paper describing the "scatter" test:
> 
> https://builders.intel.com/docs/networkbuilders/open-vswitch-optimized-deployment-benchmark-technology-guide.pdf
> 
> This document calls out the following key points:
> 
> The original test was summarized as:
> - 32 VMs with one million flows.
> - Test runs on four physical cores for OVS and 10 hyper-threaded cores
>   for TestPMD.
> - An Ixia pitches traffic at a sub-0.1% loss rate.
> - The server catches traffic with an E810-C 100G NIC.
> - The traffic's profile is: Ether()/IP()/UDP()/VXLAN()/Ether()/IP()
> - On the outer IP, the source address changes incrementally across the
>   32 instances.
> - The destination address remains the same on the outer IP.
> - The inner source IP remains the same.
> - The inner destination address increments to create the one million
>   flows for the test.
> - EMC and SMC were disabled.
> 
> I could not reproduce this test exactly because I don't have access to
> the same hardware - notably the Intel NIC and an Ixia - and I didn't
> want to create an environment that wouldn't be reproduced in real-world
> scenarios. I did pin the VM and TXQ/RXQ threads to cores, but I didn't
> optimize the setup nearly to the extent that the white paper described.
> My test setup consisted of two Fedora 35 servers directly connected
> across Mellanox5E cards, with Trex pitching traffic and TestPMD
> reflecting it.
> 
> In my test I was still able to reproduce a similar performance penalty.
> I found that the key factor was the combination of VXLAN and a large
> number of flows. So once I had a setup that could reproduce close to
> the 4% penalty, I stopped modifying my test framework and started
> searching for the slow code.
> 
> I didn't see any obvious issues in the code that should cause a
> significant slowdown; in fact, most of the code is identical or
> slightly improved. So to help my analysis, I created several variations
> of your patch reverting small aspects of the change and benchmarked
> each variation.
> 
> Because the difference in performance across each variation was so
> minor, I took a lot of samples. I pitched traffic over one million
> flows for 240 seconds and averaged out the throughput; I then repeated
> this process a total of five times for each patch. Finally, I repeated
> the whole process three times to produce 15 data points per patch.
> 
> The best results came from the patch enclosed below, with the code from
> netdev_dpdk_common_send() protected by the spinlock, as it is in the
> pre-patch code. This yielded a 2.7% +/- 0.64 performance boost over the
> master branch.
> 
> 
> Cheers,
> Michael
> 
> 
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index bc1633663..5db5d7e2a 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -2777,13 +2777,13 @@ netdev_dpdk_vhost_send(struct netdev *netdev, int qid,
>          return 0;
>      }
> 
> -    cnt = netdev_dpdk_common_send(netdev, batch, &stats);
> -
>      if (OVS_UNLIKELY(!rte_spinlock_trylock(&dev->tx_q[qid].tx_lock))) {
>          COVERAGE_INC(vhost_tx_contention);
>          rte_spinlock_lock(&dev->tx_q[qid].tx_lock);
>      }
> 
> +    cnt = netdev_dpdk_common_send(netdev, batch, &stats);
> +
>      pkts = (struct rte_mbuf **) batch->packets;
>      vhost_batch_cnt = cnt;
>      retries = 0;
> @@ -2843,13 +2843,15 @@ netdev_dpdk_eth_send(struct netdev *netdev, int qid,
>          return 0;
>      }
> 
> -    cnt = netdev_dpdk_common_send(netdev, batch, &stats);
> -    dropped = batch_cnt - cnt;
>      if (OVS_UNLIKELY(concurrent_txq)) {
>          qid = qid % dev->up.n_txq;
>          rte_spinlock_lock(&dev->tx_q[qid].tx_lock);
>      }
> 
> +    cnt = netdev_dpdk_common_send(netdev, batch, &stats);
> +
> +    dropped = batch_cnt - cnt;
> +
>      dropped += netdev_dpdk_eth_tx_burst(dev, qid, pkts, cnt);
>      if (OVS_UNLIKELY(dropped)) {
>          struct netdev_dpdk_sw_stats *sw_stats = dev->sw_stats;
> 
> 
> On Sun, 2021-01-10 at 00:05 -0300, Flavio Leitner wrote:
> > This patch split out the common code between vhost and
> > dpdk transmit paths to shared functions to simplify the
> > code and fix an issue.
> >
> > The issue is that the packet coming from non-DPDK device
> > and egressing on a DPDK device currently skips the hwol
> > preparation.
> >
> > This also have the side effect of leaving only the dpdk
> > transmit code under the txq lock.
> >
> > Signed-off-by: Flavio Leitner
> > Reviewed-by: David Marchand
> > ---
> >
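To make the change easier to follow, this is roughly what the ethdev
transmit path looks like with the variation above applied, pieced together
from the hunks shown in this thread. It is a simplified sketch only: the
locals, the drop accounting and the early-return checks are assumptions,
not the literal upstream function.

static int
netdev_dpdk_eth_send(struct netdev *netdev, int qid,
                     struct dp_packet_batch *batch, bool concurrent_txq)
{
    struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
    struct rte_mbuf **pkts = (struct rte_mbuf **) batch->packets;
    int batch_cnt = dp_packet_batch_size(batch);
    struct netdev_dpdk_sw_stats stats;
    int cnt, dropped;

    if (OVS_UNLIKELY(concurrent_txq)) {
        qid = qid % dev->up.n_txq;
        rte_spinlock_lock(&dev->tx_q[qid].tx_lock);
    }

    /* HWOL preparation, MTU filtering and QoS (the shared
     * netdev_dpdk_common_send() steps) now run with the txq lock held,
     * matching the pre-refactor ordering. */
    cnt = netdev_dpdk_common_send(netdev, batch, &stats);

    dropped = batch_cnt - cnt;
    dropped += netdev_dpdk_eth_tx_burst(dev, qid, pkts, cnt);

    if (OVS_UNLIKELY(dropped)) {
        /* Fold 'dropped' and the per-step counters in 'stats' into
         * dev->sw_stats under dev->stats_lock (details elided). */
    }

    if (OVS_UNLIKELY(concurrent_txq)) {
        rte_spinlock_unlock(&dev->tx_q[qid].tx_lock);
    }

    return 0;
}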
Re: [ovs-dev] [PATCH v2] netdev-dpdk: Refactor the DPDK transmit path.
Hello Flavio,

Great patch, I think you really did a lot to improve the code here and
I think that's borne out by the consistent performance improvements
across multiple tests.

Regarding the 4% regression that Intel detected, I found the following
white paper describing the "scatter" test:

https://builders.intel.com/docs/networkbuilders/open-vswitch-optimized-deployment-benchmark-technology-guide.pdf

This document calls out the following key points:

The original test was summarized as:
- 32 VMs with one million flows.
- Test runs on four physical cores for OVS and 10 hyper-threaded cores
  for TestPMD.
- An Ixia pitches traffic at a sub-0.1% loss rate.
- The server catches traffic with an E810-C 100G NIC.
- The traffic's profile is: Ether()/IP()/UDP()/VXLAN()/Ether()/IP()
- On the outer IP, the source address changes incrementally across the
  32 instances.
- The destination address remains the same on the outer IP.
- The inner source IP remains the same.
- The inner destination address increments to create the one million
  flows for the test.
- EMC and SMC were disabled.

I could not reproduce this test exactly because I don't have access to
the same hardware - notably the Intel NIC and an Ixia - and I didn't
want to create an environment that wouldn't be reproduced in real-world
scenarios. I did pin the VM and TXQ/RXQ threads to cores, but I didn't
optimize the setup nearly to the extent that the white paper described.
My test setup consisted of two Fedora 35 servers directly connected
across Mellanox5E cards, with Trex pitching traffic and TestPMD
reflecting it.

In my test I was still able to reproduce a similar performance penalty.
I found that the key factor was the combination of VXLAN and a large
number of flows. So once I had a setup that could reproduce close to
the 4% penalty, I stopped modifying my test framework and started
searching for the slow code.

I didn't see any obvious issues in the code that should cause a
significant slowdown; in fact, most of the code is identical or
slightly improved. So to help my analysis, I created several variations
of your patch reverting small aspects of the change and benchmarked
each variation.

Because the difference in performance across each variation was so
minor, I took a lot of samples. I pitched traffic over one million
flows for 240 seconds and averaged out the throughput; I then repeated
this process a total of five times for each patch. Finally, I repeated
the whole process three times to produce 15 data points per patch.

The best results came from the patch enclosed below, with the code from
netdev_dpdk_common_send() protected by the spinlock, as it is in the
pre-patch code. This yielded a 2.7% +/- 0.64 performance boost over the
master branch.
Cheers,
Michael


diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index bc1633663..5db5d7e2a 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -2777,13 +2777,13 @@ netdev_dpdk_vhost_send(struct netdev *netdev, int qid,
         return 0;
     }
 
-    cnt = netdev_dpdk_common_send(netdev, batch, &stats);
-
     if (OVS_UNLIKELY(!rte_spinlock_trylock(&dev->tx_q[qid].tx_lock))) {
         COVERAGE_INC(vhost_tx_contention);
         rte_spinlock_lock(&dev->tx_q[qid].tx_lock);
     }
 
+    cnt = netdev_dpdk_common_send(netdev, batch, &stats);
+
     pkts = (struct rte_mbuf **) batch->packets;
     vhost_batch_cnt = cnt;
     retries = 0;
@@ -2843,13 +2843,15 @@ netdev_dpdk_eth_send(struct netdev *netdev, int qid,
         return 0;
     }
 
-    cnt = netdev_dpdk_common_send(netdev, batch, &stats);
-    dropped = batch_cnt - cnt;
     if (OVS_UNLIKELY(concurrent_txq)) {
         qid = qid % dev->up.n_txq;
         rte_spinlock_lock(&dev->tx_q[qid].tx_lock);
     }
 
+    cnt = netdev_dpdk_common_send(netdev, batch, &stats);
+
+    dropped = batch_cnt - cnt;
+
     dropped += netdev_dpdk_eth_tx_burst(dev, qid, pkts, cnt);
     if (OVS_UNLIKELY(dropped)) {
         struct netdev_dpdk_sw_stats *sw_stats = dev->sw_stats;


On Sun, 2021-01-10 at 00:05 -0300, Flavio Leitner wrote:
> This patch split out the common code between vhost and
> dpdk transmit paths to shared functions to simplify the
> code and fix an issue.
>
> The issue is that the packet coming from non-DPDK device
> and egressing on a DPDK device currently skips the hwol
> preparation.
>
> This also have the side effect of leaving only the dpdk
> transmit code under the txq lock.
>
> Signed-off-by: Flavio Leitner
> ---
>
> V2:
>    - mentioned the tx lock change in the commit message.
>    - fixed packet leak when copy fails.
>    - moved pkt_cnt = cnt two lines up.
>
> I tested the following scenarios with iperf and iperf -R
> and noticed no performance regression:
> IPERF    VM-Bridge          1.02x
> IPERF    VM-VM              1.04x
> IPERF    VM-ExtHost         1.00x
> IPERF    VM-NS              1.01x
> IPERF    VM-VLAN-Bridge     1.03x
> IPERF    VM-VLAN-VM         1.03x
> IPERF    VM-VLAN-ExtHost    1.01x
>
Re: [ovs-dev] [PATCH v2] netdev-dpdk: Refactor the DPDK transmit path.
Hi Marko,

On Wed, Jan 13, 2021 at 10:51:03AM +0000, Kovacevic, Marko wrote:
> Hi Flavio,
> 
> > So, I see a small performance gain. Can you provide a more detailed
> > test description? I wonder if we are testing the same thing.
> 
> VSPERF phy2phy_tput uses 2 dpdk pmds and performs a zero packet loss
> test (RFC2544) with bidirectional traffic.

Ok.

> > > and PVP test results are in line with your observations.
> > > NIC: Fortville 10G X710
> >
> > Great.
> >
> > > Performance with different traffic profiles when deployed with 32 VMs
> > > + 1M flows + vxlan enabled:
> > > With burst mode: ~1% increase in performance
> >
> > That is aligned with the above.
> >
> > > as compared to
> > > scatter mode: ~4% decrease in performance.
> >
> > Hm, that is unexpected. Is this reproducible?
> 
> If you mean are the results reproducible then yes,
> I did a re-run of v1 and v2 again.

Thanks for checking.

> I ran my test against these commits to get baseline results and then ran v1
> and v2 and compared results:
> 
> DPDK: b1d36cf82 (HEAD, tag: v20.11, origin/releases) version: 20.11.0
> OVS:  7f79ae2fb (HEAD -> master, origin/master, origin/HEAD) Documentation:
>       Simplify the website main page.
> 
> V1:
> DPDK: b1d36cf82 (HEAD, tag: v20.11, origin/releases) version: 20.11.0
> OVS:  957132b3a (HEAD -> master) netdev-dpdk: Refactor the DPDK transmit path.
>       7f79ae2fb (origin/master, origin/HEAD) Documentation: Simplify the website
>       main page.
> 
> V1: Burst    +1%
> V1: Scatter  -4%
> 
> As Sunil already reported, it shows the same results again.

Alright. I will try to reproduce it in my lab to understand the root cause.

> As for V2, it didn't show any increase or decrease for both burst and scatter,
> and even between the v1 and v2 results from my second run it didn't show much
> difference on my test anyway.
> 
> > How are you switching between burst and scatter mode?
> 
> As for how we're switching between burst and scatter, I just use two different
> traffic profiles and gather results.
> So I run the test using the burst profile, restart the test, then use scatter.
> 
> Traffic @ Phy NIC Rx:
> Ether()/IP()/UDP()/VXLAN()/Ether()/IP()
> 
> Burst: on the outer IP we do a burst of 32 packets with the same IP, then
> switch to the next IP for the next 32, and so on.
> Scatter: we increment the outer IP on every packet across the 32.

Glad that I asked, because I thought you were talking about the NIC mode
with scatter enabled or not.

> And we do this for 1M flows.
> 
> I hope this answers your question.

They are helpful. I will try to reproduce the -4% results in my lab as a
next step. It may take some time.

Thanks for testing and providing feedback!

-- 
fbl
Re: [ovs-dev] [PATCH v2] netdev-dpdk: Refactor the DPDK transmit path.
Hi Flavio,

> So, I see a small performance gain. Can you provide a more detailed
> test description? I wonder if we are testing the same thing.

VSPERF phy2phy_tput uses 2 dpdk pmds and performs a zero packet loss
test (RFC2544) with bidirectional traffic.

> > and PVP test results are in line with your observations.
> > NIC: Fortville 10G X710
>
> Great.
>
> > Performance with different traffic profiles when deployed with 32 VMs
> > + 1M flows + vxlan enabled:
> > With burst mode: ~1% increase in performance
>
> That is aligned with the above.
>
> > as compared to
> > scatter mode: ~4% decrease in performance.
>
> Hm, that is unexpected. Is this reproducible?

If you mean are the results reproducible then yes,
I did a re-run of v1 and v2 again.

I ran my test against these commits to get baseline results and then ran v1
and v2 and compared results:

DPDK: b1d36cf82 (HEAD, tag: v20.11, origin/releases) version: 20.11.0
OVS:  7f79ae2fb (HEAD -> master, origin/master, origin/HEAD) Documentation:
      Simplify the website main page.

V1:
DPDK: b1d36cf82 (HEAD, tag: v20.11, origin/releases) version: 20.11.0
OVS:  957132b3a (HEAD -> master) netdev-dpdk: Refactor the DPDK transmit path.
      7f79ae2fb (origin/master, origin/HEAD) Documentation: Simplify the website
      main page.

V1: Burst    +1%
V1: Scatter  -4%

As Sunil already reported, it shows the same results again.

As for V2, it didn't show any increase or decrease for both burst and scatter,
and even between the v1 and v2 results from my second run it didn't show much
difference on my test anyway.

> How are you switching between burst and scatter mode?

As for how we're switching between burst and scatter, I just use two different
traffic profiles and gather results.
So I run the test using the burst profile, restart the test, then use scatter.

Traffic @ Phy NIC Rx:
Ether()/IP()/UDP()/VXLAN()/Ether()/IP()

Burst: on the outer IP we do a burst of 32 packets with the same IP, then
switch to the next IP for the next 32, and so on.
Scatter: we increment the outer IP on every packet across the 32.

And we do this for 1M flows.

I hope this answers your question.

>
> Thanks,
> fbl

Thanks,
Marko K
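A minimal standalone sketch of the difference between the two outer-IP
patterns described above; the base address and the packet counts are
illustrative placeholders, not the actual VSPERF traffic profile.

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define NUM_ADDRS  32   /* one outer source IP per VM instance */
#define BURST_LEN  32   /* packets sent per IP in the burst profile */
#define TOTAL_PKTS (NUM_ADDRS * BURST_LEN)

int
main(void)
{
    uint32_t base = 0x0a000001;   /* 10.0.0.1, arbitrary starting point */

    /* Burst: 32 packets share the same outer source IP, then move on to
     * the next IP, so consecutive packets belong to the same outer flow. */
    for (int i = 0; i < TOTAL_PKTS; i++) {
        uint32_t src = base + (uint32_t) (i / BURST_LEN);
        printf("burst   pkt %4d outer-src %#010" PRIx32 "\n", i, src);
    }

    /* Scatter: the outer source IP increments on every packet, cycling
     * through the 32 addresses, so consecutive packets belong to
     * different outer flows. */
    for (int i = 0; i < TOTAL_PKTS; i++) {
        uint32_t src = base + (uint32_t) (i % NUM_ADDRS);
        printf("scatter pkt %4d outer-src %#010" PRIx32 "\n", i, src);
    }

    return 0;
}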
Re: [ovs-dev] [PATCH v2] netdev-dpdk: Refactor the DPDK transmit path.
On Sun, Jan 10, 2021 at 4:05 AM Flavio Leitner wrote:
>
> This patch split out the common code between vhost and
> dpdk transmit paths to shared functions to simplify the
> code and fix an issue.
>
> The issue is that the packet coming from non-DPDK device
> and egressing on a DPDK device currently skips the hwol
> preparation.
>
> This also have the side effect of leaving only the dpdk

nit: has*

> transmit code under the txq lock.
>
> Signed-off-by: Flavio Leitner

Reviewed-by: David Marchand

Thanks.

-- 
David Marchand
Re: [ovs-dev] [PATCH v2] netdev-dpdk: Refactor the DPDK transmit path.
Hi Sunil,

On Mon, Jan 11, 2021 at 03:49:56PM +0000, Pai G, Sunil wrote:
> Hi Flavio,
> 
> > -----Original Message-----
> > From: dev On Behalf Of Flavio Leitner
> > Sent: Sunday, January 10, 2021 8:35 AM
> > To: d...@openvswitch.org
> > Cc: David Marchand ; Flavio Leitner
> > Subject: [ovs-dev] [PATCH v2] netdev-dpdk: Refactor the DPDK transmit
> > path.
> >
> > This patch split out the common code between vhost and dpdk transmit
> > paths to shared functions to simplify the code and fix an issue.
> >
> > The issue is that the packet coming from non-DPDK device and egressing on a
> > DPDK device currently skips the hwol preparation.
> >
> > This also have the side effect of leaving only the dpdk transmit code under
> > the txq lock.
> >
> > Signed-off-by: Flavio Leitner
> > ---
> >
> > V2:
> >    - mentioned the tx lock change in the commit message.
> >    - fixed packet leak when copy fails.
> >    - moved pkt_cnt = cnt two lines up.
> >
> > I tested the following scenarios with iperf and iperf -R and noticed no
> > performance regression:
> > IPERF    VM-Bridge          1.02x
> > IPERF    VM-VM              1.04x
> > IPERF    VM-ExtHost         1.00x
> > IPERF    VM-NS              1.01x
> > IPERF    VM-VLAN-Bridge     1.03x
> > IPERF    VM-VLAN-VM         1.03x
> > IPERF    VM-VLAN-ExtHost    1.01x
> > IPERF    VM-VLAN-NS         1.02x
> > IPERF    VM-V6-VM           1.03x
> > IPERF    VM-V6-ExtHost      1.00x
> > IPERF    VM-V6-NS           1.01x
> > IPERF-R  VM-Bridge          1.01x
> > IPERF-R  VM-VM              1.04x
> > IPERF-R  VM-ExtHost         1.10x
> > IPERF-R  VM-NS              1.01x
> > IPERF-R  VM-VLAN-Bridge     1.03x
> > IPERF-R  VM-VLAN-VM         1.02x
> > IPERF-R  VM-VLAN-ExtHost    1.08x
> > IPERF-R  VM-VLAN-NS         1.02x
> > IPERF-R  VM-V6-VM           1.00x
> > IPERF-R  VM-V6-ExtHost      1.11x
> > IPERF-R  VM-V6-NS           1.00x
> >
> > Now using trex, 64byte packet, PVP:
> > Original: 3.6Mpps
> >     avg. packets per output batch: 32.00
> >     idle cycles: 0 (0.00%)
> >     avg cycles per packet: 304.92 (8331383020/27323150)
> >     avg processing cycles per packet: 304.92 (8331383020/27323150)
> >
> > Patched: 3.6Mpps
> >     avg. packets per output batch: 32.00
> >     idle cycles: 0 (0.00%)
> >     avg cycles per packet: 304.08 (21875784116/71941516)
> >     avg processing cycles per packet: 304.08 (21875784116/71941516)
> 
> Ran the following tests in VSPERF for QoS and they all pass:

Great, thanks for testing the patch.

> ovsdpdk_qos_create_phy_port
> ovsdpdk_qos_delete_phy_port
> ovsdpdk_qos_create_vport
> ovsdpdk_qos_delete_vport
> ovsdpdk_qos_create_no_cir
> ovsdpdk_qos_create_no_cbs
> 
> For performance tests:
> We see a slight dip in performance for physical ports (~1%) in the phy2phy_tput
> test in VSPERF with the patch (v1) for 64byte packets.

I tried P2P here (v2) with trex, single flow, single queue, single PMD
and I got this:

original: 12.88Mpps/6.6Gbps
    idle cycles: 0 (0.00%)
    processing cycles: 14466174204 (100.00%)
    avg cycles per packet: 170.73 (14466174204/84730796)
    avg processing cycles per packet: 170.73 (14466174204/84730796)

patched: 13.00Mpps/6.65Gbps
    idle cycles: 0 (0.00%)
    processing cycles: 3783269716 (100.00%)
    avg cycles per packet: 169.49 (3783269716/22321968)
    avg processing cycles per packet: 169.49 (3783269716/22321968)

So, I see a small performance gain. Can you provide a more detailed
test description? I wonder if we are testing the same thing.

> and PVP test results are in line with your observations.
> NIC: Fortville 10G X710

Great.

> Performance with different traffic profiles when deployed with 32 VMs + 1M
> flows + vxlan enabled:
> With burst mode: ~1% increase in performance

That is aligned with the above.

> as compared to
> scatter mode: ~4% decrease in performance.

Hm, that is unexpected. Is this reproducible? How are you switching
between burst and scatter mode?

Thanks,
fbl
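As a side note, the Mpps and Gbps figures quoted above line up if the
Gbit/s value counts only the 64-byte frames themselves (excluding preamble
and inter-frame gap). A small standalone check, written purely for
illustration:

#include <stdio.h>

int
main(void)
{
    /* Throughput figures taken from the measurements quoted above. */
    double original_mpps = 12.88;
    double patched_mpps = 13.00;
    int frame_bytes = 64;

    /* Gbit/s of frame data = Mpps * 1e6 * bytes per frame * 8 / 1e9. */
    printf("original: %.2f Gbit/s\n",
           original_mpps * 1e6 * frame_bytes * 8 / 1e9);   /* ~6.59 */
    printf("patched:  %.2f Gbit/s\n",
           patched_mpps * 1e6 * frame_bytes * 8 / 1e9);    /* ~6.66 */
    return 0;
}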
Re: [ovs-dev] [PATCH v2] netdev-dpdk: Refactor the DPDK transmit path.
Hi Flavio,

> -----Original Message-----
> From: dev On Behalf Of Flavio Leitner
> Sent: Sunday, January 10, 2021 8:35 AM
> To: d...@openvswitch.org
> Cc: David Marchand ; Flavio Leitner
> Subject: [ovs-dev] [PATCH v2] netdev-dpdk: Refactor the DPDK transmit
> path.
>
> This patch split out the common code between vhost and dpdk transmit
> paths to shared functions to simplify the code and fix an issue.
>
> The issue is that the packet coming from non-DPDK device and egressing on a
> DPDK device currently skips the hwol preparation.
>
> This also have the side effect of leaving only the dpdk transmit code under
> the txq lock.
>
> Signed-off-by: Flavio Leitner
> ---
>
> V2:
>    - mentioned the tx lock change in the commit message.
>    - fixed packet leak when copy fails.
>    - moved pkt_cnt = cnt two lines up.
>
> I tested the following scenarios with iperf and iperf -R and noticed no
> performance regression:
> IPERF    VM-Bridge          1.02x
> IPERF    VM-VM              1.04x
> IPERF    VM-ExtHost         1.00x
> IPERF    VM-NS              1.01x
> IPERF    VM-VLAN-Bridge     1.03x
> IPERF    VM-VLAN-VM         1.03x
> IPERF    VM-VLAN-ExtHost    1.01x
> IPERF    VM-VLAN-NS         1.02x
> IPERF    VM-V6-VM           1.03x
> IPERF    VM-V6-ExtHost      1.00x
> IPERF    VM-V6-NS           1.01x
> IPERF-R  VM-Bridge          1.01x
> IPERF-R  VM-VM              1.04x
> IPERF-R  VM-ExtHost         1.10x
> IPERF-R  VM-NS              1.01x
> IPERF-R  VM-VLAN-Bridge     1.03x
> IPERF-R  VM-VLAN-VM         1.02x
> IPERF-R  VM-VLAN-ExtHost    1.08x
> IPERF-R  VM-VLAN-NS         1.02x
> IPERF-R  VM-V6-VM           1.00x
> IPERF-R  VM-V6-ExtHost      1.11x
> IPERF-R  VM-V6-NS           1.00x
>
> Now using trex, 64byte packet, PVP:
> Original: 3.6Mpps
>     avg. packets per output batch: 32.00
>     idle cycles: 0 (0.00%)
>     avg cycles per packet: 304.92 (8331383020/27323150)
>     avg processing cycles per packet: 304.92 (8331383020/27323150)
>
> Patched: 3.6Mpps
>     avg. packets per output batch: 32.00
>     idle cycles: 0 (0.00%)
>     avg cycles per packet: 304.08 (21875784116/71941516)
>     avg processing cycles per packet: 304.08 (21875784116/71941516)

Ran the following tests in VSPERF for QoS and they all pass:
ovsdpdk_qos_create_phy_port
ovsdpdk_qos_delete_phy_port
ovsdpdk_qos_create_vport
ovsdpdk_qos_delete_vport
ovsdpdk_qos_create_no_cir
ovsdpdk_qos_create_no_cbs

For performance tests:
We see a slight dip in performance for physical ports (~1%) in the phy2phy_tput
test in VSPERF with the patch (v1) for 64byte packets, and PVP test results
are in line with your observations.
NIC: Fortville 10G X710

Performance with different traffic profiles when deployed with 32 VMs + 1M
flows + vxlan enabled:
With burst mode: ~1% increase in performance
as compared to
scatter mode: ~4% decrease in performance.

[snipped]

Thanks,
Sunil
Intel
[ovs-dev] [PATCH v2] netdev-dpdk: Refactor the DPDK transmit path.
This patch split out the common code between vhost and
dpdk transmit paths to shared functions to simplify the
code and fix an issue.

The issue is that the packet coming from non-DPDK device
and egressing on a DPDK device currently skips the hwol
preparation.

This also have the side effect of leaving only the dpdk
transmit code under the txq lock.

Signed-off-by: Flavio Leitner
---

V2:
   - mentioned the tx lock change in the commit message.
   - fixed packet leak when copy fails.
   - moved pkt_cnt = cnt two lines up.

I tested the following scenarios with iperf and iperf -R
and noticed no performance regression:
IPERF    VM-Bridge          1.02x
IPERF    VM-VM              1.04x
IPERF    VM-ExtHost         1.00x
IPERF    VM-NS              1.01x
IPERF    VM-VLAN-Bridge     1.03x
IPERF    VM-VLAN-VM         1.03x
IPERF    VM-VLAN-ExtHost    1.01x
IPERF    VM-VLAN-NS         1.02x
IPERF    VM-V6-VM           1.03x
IPERF    VM-V6-ExtHost      1.00x
IPERF    VM-V6-NS           1.01x
IPERF-R  VM-Bridge          1.01x
IPERF-R  VM-VM              1.04x
IPERF-R  VM-ExtHost         1.10x
IPERF-R  VM-NS              1.01x
IPERF-R  VM-VLAN-Bridge     1.03x
IPERF-R  VM-VLAN-VM         1.02x
IPERF-R  VM-VLAN-ExtHost    1.08x
IPERF-R  VM-VLAN-NS         1.02x
IPERF-R  VM-V6-VM           1.00x
IPERF-R  VM-V6-ExtHost      1.11x
IPERF-R  VM-V6-NS           1.00x

Now using trex, 64byte packet, PVP:
Original: 3.6Mpps
    avg. packets per output batch: 32.00
    idle cycles: 0 (0.00%)
    avg cycles per packet: 304.92 (8331383020/27323150)
    avg processing cycles per packet: 304.92 (8331383020/27323150)

Patched: 3.6Mpps
    avg. packets per output batch: 32.00
    idle cycles: 0 (0.00%)
    avg cycles per packet: 304.08 (21875784116/71941516)
    avg processing cycles per packet: 304.08 (21875784116/71941516)

 lib/netdev-dpdk.c | 335 +++---
 1 file changed, 139 insertions(+), 196 deletions(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 2640a421a..a1437db4d 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -2585,90 +2585,6 @@ netdev_dpdk_vhost_update_tx_counters(struct netdev_dpdk *dev,
     }
 }
 
-static void
-__netdev_dpdk_vhost_send(struct netdev *netdev, int qid,
-                         struct dp_packet **pkts, int cnt)
-{
-    struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
-    struct rte_mbuf **cur_pkts = (struct rte_mbuf **) pkts;
-    struct netdev_dpdk_sw_stats sw_stats_add;
-    unsigned int n_packets_to_free = cnt;
-    unsigned int total_packets = cnt;
-    int i, retries = 0;
-    int max_retries = VHOST_ENQ_RETRY_MIN;
-    int vid = netdev_dpdk_get_vid(dev);
-
-    qid = dev->tx_q[qid % netdev->n_txq].map;
-
-    if (OVS_UNLIKELY(vid < 0 || !dev->vhost_reconfigured || qid < 0
-                     || !(dev->flags & NETDEV_UP))) {
-        rte_spinlock_lock(&dev->stats_lock);
-        dev->stats.tx_dropped+= cnt;
-        rte_spinlock_unlock(&dev->stats_lock);
-        goto out;
-    }
-
-    if (OVS_UNLIKELY(!rte_spinlock_trylock(&dev->tx_q[qid].tx_lock))) {
-        COVERAGE_INC(vhost_tx_contention);
-        rte_spinlock_lock(&dev->tx_q[qid].tx_lock);
-    }
-
-    sw_stats_add.tx_invalid_hwol_drops = cnt;
-    if (userspace_tso_enabled()) {
-        cnt = netdev_dpdk_prep_hwol_batch(dev, cur_pkts, cnt);
-    }
-
-    sw_stats_add.tx_invalid_hwol_drops -= cnt;
-    sw_stats_add.tx_mtu_exceeded_drops = cnt;
-    cnt = netdev_dpdk_filter_packet_len(dev, cur_pkts, cnt);
-    sw_stats_add.tx_mtu_exceeded_drops -= cnt;
-
-    /* Check has QoS has been configured for the netdev */
-    sw_stats_add.tx_qos_drops = cnt;
-    cnt = netdev_dpdk_qos_run(dev, cur_pkts, cnt, true);
-    sw_stats_add.tx_qos_drops -= cnt;
-
-    n_packets_to_free = cnt;
-
-    do {
-        int vhost_qid = qid * VIRTIO_QNUM + VIRTIO_RXQ;
-        unsigned int tx_pkts;
-
-        tx_pkts = rte_vhost_enqueue_burst(vid, vhost_qid, cur_pkts, cnt);
-        if (OVS_LIKELY(tx_pkts)) {
-            /* Packets have been sent.*/
-            cnt -= tx_pkts;
-            /* Prepare for possible retry.*/
-            cur_pkts = &cur_pkts[tx_pkts];
-            if (OVS_UNLIKELY(cnt && !retries)) {
-                /*
-                 * Read max retries as there are packets not sent
-                 * and no retries have already occurred.
-                 */
-                atomic_read_relaxed(&dev->vhost_tx_retries_max, &max_retries);
-            }
-        } else {
-            /* No packets sent - do not retry.*/
-            break;
-        }
-    } while (cnt && (retries++ < max_retries));
-
-    rte_spinlock_unlock(&dev->tx_q[qid].tx_lock);
-
-    sw_stats_add.tx_failure_drops = cnt;
-    sw_stats_add.tx_retries = MIN(retries, max_retries);
-
-    rte_spinlock_lock(&dev->stats_lock);
-    netdev_dpdk_vhost_update_tx_counters(dev, pkts, total_packets,
-                                         &sw_stats_add);
-
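The archived copy of the patch is truncated here, before the new shared
helper appears. For context, below is a rough sketch of what a
netdev_dpdk_common_send() helper consolidating the steps removed above
might contain; the signature and the stats handling are assumptions based
on the calls visible in this thread, not the literal function from the
patch.

static int
netdev_dpdk_common_send(struct netdev *netdev, struct dp_packet_batch *batch,
                        struct netdev_dpdk_sw_stats *stats)
{
    struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
    struct rte_mbuf **pkts = (struct rte_mbuf **) batch->packets;
    int cnt = dp_packet_batch_size(batch);

    /* Prepare packets for HW offloading; invalid packets are dropped. */
    stats->tx_invalid_hwol_drops = cnt;
    if (userspace_tso_enabled()) {
        cnt = netdev_dpdk_prep_hwol_batch(dev, pkts, cnt);
    }
    stats->tx_invalid_hwol_drops -= cnt;

    /* Drop packets that exceed the device MTU. */
    stats->tx_mtu_exceeded_drops = cnt;
    cnt = netdev_dpdk_filter_packet_len(dev, pkts, cnt);
    stats->tx_mtu_exceeded_drops -= cnt;

    /* Apply QoS, if configured on the netdev. */
    stats->tx_qos_drops = cnt;
    cnt = netdev_dpdk_qos_run(dev, pkts, cnt, true);
    stats->tx_qos_drops -= cnt;

    /* Surviving packets are left in 'batch' for the caller (vhost or
     * DPDK ethdev) to actually transmit. */
    return cnt;
}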