Looking at your perf stats, I see the following:

OVS 2.7:

    ??.??% - dp_netdev_process_rxq_port
       |-- 93.36% - dp_netdev_input
       |-- ??.??% - netdev_rxq_recv

OVS 2.9:

    99.69% - dp_netdev_process_rxq_port
       |-- 79.45% - dp_netdev_input
       |-- 11.26% - dp_netdev_pmd_flush_output_packets
       |-- ??.??% - netdev_rxq_recv

Could you please fill in the missing (??.??) values? I got this data from
the picture attached to your previous mail, but pictures are still not
allowed on the mailing list (i.e. they get stripped). It would be good if
you could upload your raw data to some external resource and post the
link here.
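For example, the call graphs can be captured and dumped to plain text
roughly like this (a sketch, assuming a single ovs-vswitchd process;
adjust the thread id and the duration):

    # find the PMD thread ids (the threads are named pmd<N>)
    ps -T -p $(pidof ovs-vswitchd) | grep pmd
    # record call graphs for one PMD thread for ~30 seconds
    perf record -g -t <pmd-tid> -- sleep 30
    # dump to text that can be pasted or linked
    perf report --stdio > perf-report.txt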
Anyway, from the data I have, I can see that the total time spent in
"dp_netdev_input" and "dp_netdev_pmd_flush_output_packets" for 2.9 is
90.71%, which is less than the 93.36% spent for 2.7. This means that
processing + sending became even faster, or stayed at approximately the
same performance. We definitely need all the missing values to be sure,
but it seems that "netdev_rxq_recv()" could be the issue.

To check whether DPDK itself causes the performance regression, I'd ask
you to run a pure PHY-PHY test with the testpmd app from DPDK 16.11 and
from DPDK 17.11. Maybe it's a performance issue with the bnxt driver
that you're using. There were too many changes in that driver:

    30 files changed, 17189 insertions(+), 3358 deletions(-)
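A minimal run like this should be enough for the comparison (a sketch;
the core list and PCI addresses are placeholders for your two bnxt
ports, and testpmd's default 'io' forwarding mode is fine here):

    # same command for DPDK 16.11 and DPDK 17.11
    ./testpmd -l 0,1,2 -n 4 -w 0000:01:00.0 -w 0000:01:00.1 -- \
              -i --nb-cores=2 --rxq=1 --txq=1
    testpmd> start
    testpmd> show port stats all

If the 64-byte rate differs between the two DPDK versions with the same
command, the regression is in DPDK/bnxt rather than in OVS.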
Best regards, Ilya Maximets.

On 20.06.2018 01:18, Shahaji Bhosle wrote:
> Hi Ilya,
> This issue is a release blocker for us, so I just wanted to check
> whether you need more details from us. We can help with anything needed
> to expedite or root-cause the problem.
> Please let us know.
> Thanks, Shahaji
>
> On Mon, Jun 18, 2018 at 10:20 AM Shahaji Bhosle <[email protected]> wrote:
>
> Thanks Ilya, I will look at the commit, but I am not sure now how to
> tell how much real work is being done. I would have liked polling
> cycles to be treated as before and not counted towards packet
> processing. That does explain why, as long as there are packets on the
> wire, we are always at 100%; basically we cannot tell how efficiently
> the CPUs are being used.
> Thanks, Shahaji
>
> On Mon, Jun 18, 2018 at 10:07 AM, Ilya Maximets <[email protected]> wrote:
>
> Thanks for the data.
>
> I have to note additionally that the meaning of "processing cycles"
> changed significantly with the following commit:
>
>     commit a2ac666d5265c01661e189caac321d962f54649f
>     Author: Ciara Loftus <[email protected]>
>     Date:   Mon Feb 20 12:53:00 2017 +0000
>
>         dpif-netdev: Change definitions of 'idle' & 'processing' cycles
>
>         Instead of counting all polling cycles as processing cycles,
>         only count the cycles where packets were received from the
>         polling.
>
> This could explain the difference in the "PMD Processing Cycles"
> column, because successful "POLLING" cycles are now included in
> "PROCESSING".
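> You can check how the cycles are split on your setup with the following
> commands (a sketch; the counter values will of course vary):
>
>     ovs-appctl dpif-netdev/pmd-stats-clear
>     # let the traffic run for a while, then:
>     ovs-appctl dpif-netdev/pmd-stats-show
>
> With the new definition, every successful poll is counted as
> "processing", so the counter will stay near 100% as long as packets
> keep arriving.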
> Best regards, Ilya Maximets.
>
> On 18.06.2018 16:31, Shahaji Bhosle wrote:
> > Hi Ilya,
> > Thanks for the quick reply. Please find the numbers for our PHY-PHY
> > test below. Note that with OVS 2.9.1 + DPDK 17.11, even 10% of the
> > load below makes the processing cycles hit 100%, while 2.7 on our
> > setup never goes above 75% processing cycles. I am also attaching the
> > perf report comparing the two code bases, and I think the
> > "11.26%--dp_netdev_pmd_flush_output_packets" is causing us to take
> > the performance hit. Our testing is also SR-IOV, and the CPUs are ARM
> > A72 cores. We are happy to run more tests; it is not easy for us to
> > move back to OVS 2.8, but we would be happy to try more experiments
> > if it helps narrow this down further. Please note we have also tried
> > increasing the tx-flush-interval; it helps a little, but still not
> > significantly enough. Let us know.
> >
> > Thanks, Shahaji
> >
> > *Setup:*
> > IXIA <---SFP28---> Port 0 {(PF0)==[OVS+DPDK]==(PF1)} Port 1 <---SFP28---> IXIA
> >
> > release/version       config                           Test direction  MPPS         Ixia Line rate (%)  PMD Processing Cycles (%)
> > OVS 2.9 + DPDK 17.11  OVS on Maia (PF0--PF1), no drop  port 1 to 2     31.3         85                  99.9
> >                                                        port 2 to 1     31.3         85                  99.9
> >                                                        bi              15.5 + 15.5  42                  99.9
> >
> > OVS 2.7 + DPDK 16.11  OVS on Maia (PF0--PF1), no drop  port 1 to 2     33.8         90                  71
> >                                                        port 2 to 1     32.7         88                  70
> >                                                        bi              17.4 + 17.4  47                  74
> >
> > On Mon, Jun 18, 2018 at 4:25 AM, Nitin Katiyar <[email protected]> wrote:
> >
> > Hi,
> > We also experienced degradation from OVS 2.6/2.7 to OVS 2.8.2 (with
> > DPDK 17.05.02). The drop is larger for 64-byte packets (~8-10%), even
> > with a higher number of flows. I tried OVS 2.8 with DPDK 17.11 and it
> > improved for larger packet sizes, but 64 bytes is still the concern.
> >
> > Regards,
> > Nitin
> >
> > -----Original Message-----
> > From: Ilya Maximets [mailto:[email protected]]
> > Sent: Monday, June 18, 2018 1:32 PM
> > To: [email protected]; [email protected]
> > Subject: Re: [ovs-dev] 64Byte packet performance regression on 2.9 from 2.7
> >
> > CC: Shahaji Bhosle
> >
> > Sorry, missed you in the CC list.
> >
> > Best regards, Ilya Maximets.
> >
> > On 15.06.2018 10:44, Ilya Maximets wrote:
> > >> Hi,
> > >> I just upgraded from OVS 2.7 + DPDK 16.11 to OVS 2.9 + DPDK 17.11
> > >> and ran into a performance issue with the 64-byte packet rate. One
> > >> interesting thing I noticed is that even at very light load from
> > >> IXIA, the processing cycles on all the PMD threads run close to
> > >> 100% of the CPU cycles on OVS 2.9, while on OVS 2.7, even under
> > >> full load, the processing cycles remain at 75% of the CPU cycles.
> > >>
> > >> Attaching the FlameGraphs of both versions. The only thing that
> > >> pops out to me is that on 2.9 netdev_send() is now invoked via
> > >> dp_netdev_pmd_flush_output_packets(), which seems to be adding
> > >> another ~11% to the whole rx-to-tx path.
> > >>
> > >> I also tried setting tx-flush-interval to 50 and more; it does
> > >> seem to help, but not significantly enough to match the 2.7
> > >> performance.
> > >>
> > >> Any help or ideas would be really great. Thanks, Shahaji
> > >
> > > Hello, Shahaji.
> > > Could you please describe your testing scenario in more detail?
> > > Also, the mail-list filters attachments, so they are not available.
> > > You need to publish them somewhere else or include them as text
> > > inside the letter.
> > >
> > > About the performance itself: some performance degradation because
> > > of output batching is expected for tests with a low number of flows
> > > or simple PHY-PHY tests. It was mainly targeted at cases with a
> > > relatively large number of flows, to amortize vhost-user penalties
> > > (PHY-VM-PHY, VM-VM cases), and at OVS bonding cases.
> > >
> > > If your test involves vhost-user ports, then you should also
> > > consider the vhost-user performance regression in stable DPDK 17.11
> > > caused by the fixes for CVE-2018-1059. Related bug:
> > > https://dpdk.org/tracker/show_bug.cgi?id=48
> > >
> > > It'll be good if you're able to test OVS 2.8 + DPDK 17.05. There
> > > were too many changes since 2.7, so it'll be hard to track down the
> > > root cause otherwise.
> > >
> > > Best regards, Ilya Maximets.

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
